Category Archives: Python

Configuring virtualenv to run Google appengine samples

In my experiments with Google AppEngine I wanted to configure a virtualenv using python 2.5 for running Google’s samples. Using the existing python install for OSX I was running into a few errors such as:

ImportError: No module named django

and

ImportError: No module named cgi

A bit of Googling turned up this post which details all of the necessary steps. In step 2 the path I used for google_appengine was:

/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine

Pretty printing a Python dictionary to HTML

Here’s a routine I wrote to pretty print a Python dict into an HTML table and thought I’d share.

    def prettyTable(dictionary, cssClass=''):
        ''' pretty prints a dictionary into an HTML table(s) '''
        if isinstance(dictionary, str):
            return '<td>' + dictionary + '</td>'
        s = ['<table ']
        if cssClass != '':
            s.append('class="%s"' % (cssClass))
        s.append('>\n')
        for key, value in dictionary.iteritems():
            s.append('<tr>\n  <td valign="top"><strong>%s</strong></td>\n' % str(key))
            if isinstance(value, dict):
                # recurse into nested dictionaries
                s.append('  <td valign="top">%s</td>\n' % prettyTable(value, cssClass))
            elif isinstance(value, list):
                # render lists as a nested single-column table
                s.append('<td><table>')
                for i in value:
                    s.append('<tr><td valign="top">%s</td></tr>\n' % prettyTable(i, cssClass))
                s.append('</table></td>')
            else:
                if key == 'picture' or key == 'icon':
                    s.append('  <td valign="top"><img src="%s"></td>\n' % value)
                else:
                    s.append('  <td valign="top">%s</td>\n' % value)
            s.append('</tr>\n')
        s.append('</table>')
        return '\n'.join(s)
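A cleaner variant of the same idea can be sketched in Python 3 with HTML escaping added. The name pretty_table and the escaping are my additions, not part of the original routine:

```python
from html import escape

def pretty_table(value, css_class=''):
    # Recursively render dicts/lists as nested HTML tables, escaping text values.
    if isinstance(value, dict):
        cls = ' class="%s"' % css_class if css_class else ''
        rows = ''.join('<tr><td><strong>%s</strong></td><td>%s</td></tr>'
                       % (escape(str(k)), pretty_table(v, css_class))
                       for k, v in value.items())
        return '<table%s>%s</table>' % (cls, rows)
    if isinstance(value, list):
        rows = ''.join('<tr><td>%s</td></tr>' % pretty_table(v, css_class)
                       for v in value)
        return '<table>%s</table>' % rows
    return escape(str(value))

print(pretty_table({'name': 'Ada', 'tags': ['a', 'b']}))
```

Escaping values here avoids broken markup (or worse) when a dict value happens to contain `<` or `&`.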

Setting up debugging for Google AppEngine projects in Eclipse

When I made the move from Windows to OSX and Python development, one of the things I wanted to experiment with was Google’s AppEngine. I installed the SDK and set up the plugin for Eclipse but ran into a few issues I wanted to make note of, since I think others could probably benefit from them as well. I’ll mention I’m on OSX 10.6.8 using Eclipse for Java Version Helios SR 2.

Creating a new AppEngine project in Eclipse

With the plugin installed you get an AppEngine project template listed under PyDev.

Eclipse New Project dialog

Clicking Next displays the standard Eclipse dialog for a new project where you enter the name and optionally a few other project settings. However, it’s the dialog after that where I had my first question: you’re prompted for the Google App Engine Directory, which in my case corresponds to:

/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/

Google’s instructions for development in Eclipse didn’t answer this question and it wasn’t immediately obvious to me. At any rate, once this path is set correctly you’ll immediately see a list of AppEngine modules to be included on the PYTHONPATH:

AppEngine Module selection dialog

Configuring the Project’s Python Interpreter

After I successfully created the AppEngine project I wanted to look at the appengine-boilerplate for HTML5 which is based on HTML5 boilerplate which is well worth checking out. Next, I used the Google AppEngine Console to create a new application called “helloworld954” then copied the boilerplate files into my project and edited the app.yaml file to set the application name:

application: helloworld954

I then ran the project and everything looked ok but when I hit the page from the browser (http://localhost:8080) I got:

ImportError: No module named cgi

Which I subsequently discovered more information about here.

Now, IIRC my MacBook Pro came with Python 2.6 as the default version, though v2.5 was installed as well. Given AppEngine is, as of this writing, based on v2.5, rather than tweaking dev_appserver.py as mentioned in that prior link I figured I’d first try the proper Python version. I right-clicked my project in the package explorer, selected Properties, clicked the PyDev – Interpreter/Grammar option, then clicked the “Click here to configure an interpreter not listed” link (blue link below).

PyDev Package properties dialog

Next, on the preferences dialog (which I won’t screenshot here as it’s too big) I clicked New… and added a new interpreter option for “Python v2.5” with a path of:

/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python

Btw, I think I tried the symlink but that wasn’t allowed (again, I think).

At any rate, once I saved those changes, voilà, the boilerplate project started working. Ah, one step I overlooked was setting up a debug configuration for this project:

Eclipse Debug Configuration

The main module for the application has to be Google’s dev_appserver.py file, which is located here on my machine:

/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/dev_appserver.py

Also, under the (x)= Arguments tab I set the Program Arguments to “.” for the current working directory, and for “Working Directory” I selected “Other” with a value of “${workspace_loc:starterkit}”. Note, starterkit is the name of my project.

The final result:

Google AppEngine HTML5 boilerplate

Hopefully, this helps someone else. It will certainly help me months from now when I’m trying to do the same thing again, so I won’t have to try and recall all these steps.

Publishing Python unit test results in Jenkins

When I switched to developing on an OS stack, one of the first things I looked for was a Continuous Integration server. I settled on Hudson which, after some tumult surrounding Oracle’s acquisition of Sun, was forked into Jenkins. Getting Jenkins set up couldn’t be easier and the web UI is comprehensive and full of options.

My day-to-day development is in Python and I’ve written a bunch of tests based on the core unittest module, though it doesn’t natively produce results that can be consumed by Jenkins. To that end, I searched around and found the necessary pieces, which I wanted to capture here.

First, you need to install the unittest-xml-reporting package which is described as:

PyUnit-based test runner with JUnit like XML reporting.

sudo easy_install unittest-xml-reporting

Once installed you need to add the following to your unittests so they will produce the necessary XML result output:

import unittest
import xmlrunner

...

if __name__ == '__main__':
    unittest.main(testRunner=xmlrunner.XMLTestRunner(output='test-reports'))
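For reference, the XML these runners emit follows the JUnit report shape that Jenkins parses. A rough stdlib sketch of a report with one passing test (the suite and test names here are made up for illustration):

```python
import xml.etree.ElementTree as ET

# Build a minimal JUnit-style report: one suite containing one passing test.
suite = ET.Element('testsuite', name='MyTests', tests='1', failures='0', errors='0')
ET.SubElement(suite, 'testcase', classname='MyTests', name='test_ok', time='0.001')
print(ET.tostring(suite).decode())
```

A failing test would add a `<failure>` child element under its `<testcase>`; the runner handles all of this for you.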

Next, in Jenkins click the configure link for your project and check the Publish JUnit test result report and set the path to the output location for the unit tests. In my case the full path to the XML output is /.hudson/jobs/publishing/workspace/trunk/test/test-reports. In Jenkins the path to use for publishing is **/trunk/test/test-reports/*.xml

Jenkins JUnit publisher settings

This will also add a chart to the project page in Jenkins:

Jenkins test result trend chart

Improving the Facebook Python SDK GraphAPI.request method

While working on Facebook functionality using Python I ran into a few cases where the GraphAPI.request method caused Facebook to choke on parameters with a value of None. Thus, here’s a simple override that allows None parameters, which subsequently get stripped out.

def request(self, path, args=None, post_args=None):
    ''' Improves handling of post_args: any arg with a None value is removed
        to avoid FB API errors. Allows methods to accept all possible Facebook
        parameters while only passing along those that are specified.
    '''
    if post_args:
        post_args = dict((k, v) for k, v in post_args.iteritems() if v is not None)
    return GraphAPI.request(self, path, args, post_args)

I’ve created a descendant class I call GraphAPIEx where this above method appears.
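The filtering step itself is just “drop None-valued keys”; in isolation (Python 3 syntax, and strip_none is a name I made up) it reduces to a dict comprehension:

```python
def strip_none(d):
    # Remove keys whose value is None so they never reach the Graph API call.
    return {k: v for k, v in d.items() if v is not None}

print(strip_none({'message': 'hi', 'link': None}))  # → {'message': 'hi'}
```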

Using flot to chart data using a date range

jQuery flot chart

I’m working on developing Bloglines these days and one of the features I wanted as an admin was the ability to see various pieces of data related to the site over time. Ideally what I wanted were some charts like those on Google Analytics for things like:

  • users joining/day
  • votes cast/day
  • blogs submitted per day

I’ve previously experimented with the Google Chart Tools and that’s where my search started, but that led to a bit of a dead end. I found a related post on Tom Fotherby’s blog but, as Greg Fitzgerald pointed out, there are still a few more issues to be worked out. Since we use jQuery on the site I started searching along that vein, which led me to flot and more specifically this example (caution: the examples site seems really slow) which fit perfectly.

Looking at the code and markup I quickly found this was something I could have working right away. The data in JavaScript looks like this:

var d = [[1196463600000, 0], [1196550000000, 0], [1196636400000, 0], ...];
 

Where the first value is a JavaScript timestamp and the second is the actual data to be plotted. There are a couple of notes regarding the timestamp mentioned here:

The timestamps must be specified as Javascript timestamps, as milliseconds since January 1, 1970 00:00. This is like Unix timestamps, but in milliseconds instead of seconds (remember to multiply with 1000!).

As an extra caveat, the timestamps are interpreted according to UTC to avoid having the graph shift with each visitor’s local time zone. So you might have to add your local time zone offset to the timestamps or simply pretend that the data was produced in UTC instead of your local time zone.
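Producing that UTC millisecond value from a Python datetime can be sketched with the stdlib (js_timestamp is my name for the helper, not from flot or the original code):

```python
import calendar
from datetime import datetime

def js_timestamp(dt):
    # Milliseconds since the Unix epoch, treating dt as UTC (what flot expects).
    return calendar.timegm(dt.timetuple()) * 1000

print(js_timestamp(datetime(2007, 12, 1)))  # → 1196467200000
```

Using calendar.timegm rather than time.mktime is what keeps the result in UTC instead of the server’s local time zone.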

We’re using Postgres and the tables I need to query all have a date_created field of type timestamp without time zone. Here’s the SQL to fetch the data:

SELECT extract(epoch from date_trunc('day', date_created)) * 1000, count(*)
FROM blog
GROUP BY extract(epoch from date_trunc('day', date_created)) * 1000
ORDER BY extract(epoch from date_trunc('day', date_created)) * 1000 DESC

From the results I use the following python method to create the above data structure which is ready to feed into the flot chart:

    @staticmethod
    def statsByDay(query):
        data = PubBase.sqlQuery(query)
        dl = []
        for d in data:
            if d[0] is not None and d[1] is not None:
                dl.append([int(d[0]), int(d[1])])
        return dl

I’ve been working with a lot of JavaScript frameworks/libraries lately and this is one that integrated into the project pretty flawlessly, so kudos to the flot folks!

Using the Dealmap API from Python

Update Aug 1, 2011: With Google’s acquisition of Dealmap one would have to assume this will become a Google API. The deal raises a number of interesting questions related to how deals would be served to specific clients. Should be interesting to watch.

Update Jun 11, 2011: I made one more tweak to the Dealmap python API that I neglected to mention, and which I highly recommend if you intend to use it in a production environment: modify the call to urllib2.urlopen to add the timeout parameter. In fact, in my revision I’ve modified all of the calls into the API to accept a timeout parameter which is passed along to the urllib2.urlopen call.
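The idea, sketched in Python 3 terms (urllib2.urlopen became urllib.request.urlopen; fetch is a hypothetical wrapper of mine, and the data: URL just keeps the example offline):

```python
import urllib.request

def fetch(url, timeout=10):
    # Forward a timeout (in seconds) so a slow API endpoint can't hang the caller.
    return urllib.request.urlopen(url, timeout=timeout).read()

print(fetch('data:text/plain,hello'))  # → b'hello'
```

Without a timeout, urlopen can block indefinitely on an unresponsive host, which is exactly what you don’t want in a production request path.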

I’ve been looking into the DealMap API and found what appears to be a semi/partially/maybe/sorta-official python implementation, though there appears to be a missing module reference called “Util” containing an ordered dictionary and some XML serialization bits. I posted a message to the Google group for the project and even followed up with Dealmap directly but unfortunately haven’t gotten any response. Not exactly a great sign, but seeing as this is a pretty straightforward REST API, let’s move on shall we…

There’s a fairly simple workaround for this unknown (at least to me) module using Beautiful Soup (gotta love that domain name) and a few simple wrapper classes that make it easier to work with the API. Beautiful Soup is described thusly:

Beautiful Soup is an HTML/XML parser for Python that can turn even invalid markup into a parse tree. It provides simple, idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work.

In dealmap.py I removed all of the Util module references and changed the deserialize calls to return an instance of a BeautifulStoneSoup wrapper class called Deals:

class Deals(BeautifulStoneSoup):
    def __init__(self, dealmarkup=""):
        BeautifulStoneSoup.__init__(self, dealmarkup, convertEntities=BeautifulSoup.XML_ENTITIES)
        self._deals = None

    def getDeal(self, index):
        # getDeals() already returns Deal instances, so no need to wrap again
        return self.getDeals()[index]

    def getDeals(self):
        if self._deals is None:
            self._deals = [Deal(d) for d in self.findAll('deal')]
        return self._deals

Here’s a method from the Service class where I replaced the call to deserialize(…) with my new Deals class:

def search_deals(self, activities, capabilities, expirationDate, location, query="*", distance=5.0, startIndex=0, pageSize=20):
    searchDeals = self.__build_get_url(self.__dealmapUrls["search_deals"],
                                     l=location,
                                     q=query,
                                     d=distance,
                                     si=startIndex,
                                     ps=pageSize,
                                     a=activities,
                                     c=capabilities,
                                     ed=expirationDate,
                                     key=self.__apikey
                                     )
      
    result = self.__dealmap_get_request(searchDeals)
    obj = Deals(result)
    return obj

In the Deals class above you might have noticed I reference a Deal (singular) class which is a thin wrapper for accessing the properties of the Deal Tag object returned via BeautifulSoup using a __getattr__ override:

class Deal(object):
    def __init__(self, dealtag):
        self._dealtag = dealtag

    def __getattr__(self, name):
        # Look up the named child tag on the underlying BeautifulSoup Tag
        # and return its text content
        tag = self._dealtag.find(name)
        if tag is None:
            raise AttributeError(name)
        return tag.string

Accessing information from a deal now looks like this:

dealmap = Service("dealmap_api_key")
deals = dealmap.search_deals(None, None, None, location="+37.0491490732-122.025146484")
d = deals.getDeal(0)
print d.moreinfolink

You can specify any of the XML child property names for a Deal and they’ll be returned (moreinfolink is one such example).
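Stripped of BeautifulSoup, the same __getattr__ proxy pattern looks like this (AttrDict and its dict backing store are stand-ins I made up for Deal and its Tag):

```python
class AttrDict(object):
    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so it can
        # proxy unknown names to the underlying dict.
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name)

d = AttrDict({'moreinfolink': 'http://example.com/deal'})
print(d.moreinfolink)  # → http://example.com/deal
```

Raising AttributeError for missing keys keeps the object well-behaved with hasattr and getattr defaults.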

Fair warning: what’s posted here has essentially no error checking.

Btw, having looked at the API I’ve found the performance to be much better if you provide a lat/lng location rather than city/state or zip; the latter seems to be particularly slow.

If you’ve worked with this API I’d be interested to get your impressions.

Using the Python Cheetah Template compiler

The web application I’m working on uses Cheetah – “The Python Powered Template Engine” – for web page generation, and while the updated documentation is a great improvement over the old docs, there remain some gaps that weren’t very clear to me when I first got started…


Of course, Cheetah’s got a lot of features so I’m certain there are things I’m missing, which is part of the point of this post: to start a conversation about how I’m using Cheetah, elicit feedback, and exchange ideas/best practices.

The first piece I’ll toss out is simply a routine to call the Cheetah compiler on a list of templates. Running this routine alone on the CI server has caught a bunch of syntax errors that wouldn’t have been caught using runtime-compiled templates.

def compileTemplates(templateFiles, outpath="ui/compiled/"):
    ''' Compiles templateFiles if they're out of date '''
    if templateFiles == []:
        raise Exception("No template files specified")

    for t in templateFiles:
        outfile = os.path.splitext(os.path.split(t)[1])[0] + ".py"
        target = outpath + outfile
        source = config.templatePath(t)
        # recompile when the compiled module is missing or older than the template
        if not os.path.exists(target) or os.path.getmtime(target) < os.path.getmtime(source):
            print "Template: " + source
            module = Cheetah.Compiler.Compiler(file=source, moduleName=os.path.splitext(outfile)[0]).moduleDef()
            f = open(target, "w+t")
            f.write(module)
            f.close()

The routine checks whether each template file (.tmpl) has been modified since the last compilation and recompiles it if so.
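The staleness check at the heart of the routine can be isolated into a small predicate (out_of_date is my name for it, not part of the original code):

```python
import os

def out_of_date(src, target):
    # A target needs recompiling when it's missing or older than its source.
    return (not os.path.exists(target)
            or os.path.getmtime(target) < os.path.getmtime(src))

print(out_of_date(__file__, __file__ + '.compiled'))  # → True (no compiled output yet)
```

This is the same make-style mtime comparison used by most build tools.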

Compiling Cheetah templates in code

In the application I’m working on we’re using Cheetah as the template infrastructure, and I’ve been working with compiled templates built from the command line. A colleague suggested compiling them dynamically:

source = Cheetah.Compiler.Compiler(file=config.templatePath(t), moduleName=os.path.splitext(outfile)[0]).moduleDef()
f = open(outpath + outfile, "w+t")
f.write(source)
f.close()