Edgewall Software
Version 37 (modified by cmlenz, 7 years ago)

Genshi Tutorial

This tutorial is intended to give an introduction on how to use Genshi in your web application, and present common patterns and best practices. It is aimed at developers new to Genshi as well as those who've already used Genshi, but are looking for advice or inspiration on how to improve that usage.

Introduction

In this tutorial we'll create a simple Python web application based on  CherryPy 3. CherryPy was chosen because it provides a convenient level of abstraction over raw CGI or  WSGI development, but is less ambitious than full-stack web frameworks such as  Pylons or  Django, which tend to come with a preferred templating language, and often show significant bias towards that language.

The application is a stripped-down version of sites such as reddit or digg: it lets users submit links to online articles they find interesting, and then lets other users vote on those stories and post comments. Just for kicks, we'll call that application Geddit.

The project is kept as simple as possible, while still showing many of Genshi's features and how to best use them:

  • For persistence, we'll use native Python object serialization (via the pickle module), instead of an SQL database and an ORM.
  • There's no authentication of any kind. Anyone can submit links, anyone can comment.

Content

  1. Introduction
  2. Prerequisites
  3. Getting Started
  4. Basic Template Rendering
  5. Data Model
  6. Extending the Template
  7. Adding a Submission Form
  8. Adding Form Validation
  9. Factoring out the Templating
  10. Adding a Layout Template
  11. Implementing Comments
  12. Allowing Markup in Comments
  13. Ajaxified Commenting
  14. Adding an Atom Feed
  15. Summary

Prerequisites

First, make sure you have CherryPy 3.0.x installed, as well as recent versions of FormEncode and, obviously, Genshi. You can download and install those manually, or just use easy_install:

$ easy_install CherryPy
$ easy_install FormEncode
$ easy_install Genshi

Getting Started

Next, set up the basic CherryPy application.

  1. Create a directory that should contain the application
  2. Inside that directory create a Python package named geddit by doing the following:
    • Create a geddit directory
    • Create an empty file called __init__.py inside the geddit directory
  3. Inside that package, create a file called controller.py with the following content:
#!/usr/bin/env python

import operator, os, pickle, sys

import cherrypy


class Root(object):

    def __init__(self, data):
        self.data = data

    @cherrypy.expose
    def index(self):
        return 'Geddit'


def main(filename):
    # Placeholder data store; later replaced by data loaded from the pickle file
    data = {}

    # Some global configuration; note that this could be moved into a
    # configuration file
    cherrypy.config.update({
        'request.throw_errors': True,
        'tools.encode.on': True, 'tools.encode.encoding': 'utf-8',
        'tools.decode.on': True,
        'tools.trailing_slash.on': True,
        'tools.staticdir.root': os.path.abspath(os.path.dirname(__file__)),
    })

    cherrypy.quickstart(Root(data), '/', {
        '/media': {
            'tools.staticdir.on': True,
            'tools.staticdir.dir': 'static'
        }
    })

if __name__ == '__main__':
    main(sys.argv[1])

Enter the tutorial directory in the terminal, and run:

$ PYTHONPATH=. python geddit/controller.py geddit.db

You should see a log message pointing you to the URL where the application is being served, which is usually http://localhost:8080/. Visiting that page shows just the string “Geddit”, as that's what the index() method of the Root object returns.

Note that we've configured CherryPy to serve static files from the geddit/static directory. CherryPy will complain that that directory does not exist, so create it, but leave it empty for now. We'll add static resources later on in the tutorial.

Basic Template Rendering

So far the code doesn't actually use Genshi, or even any kind of templating. Let's change that.

Inside of the geddit directory, create a directory called templates, and inside that directory create a file called index.html, with the following content:

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>$title</title>
  </head>
  <body>
    <div id="header">
      <h1>$title</h1>
    </div>

    <p>Welcome!</p>

    <div id="footer">
      <hr />
      <p class="legalese">© 2007 Edgewall Software</p>
    </div>
  </body>
</html>

This is basically an almost static XHTML file with some simple variable substitution: the string $title will be replaced by a variable of that name that we pass into the template from the controller.

There are a couple of important things to point out here:

  • Variables substituted into templates, such as $title in our example, can be of any Python data type. Genshi will convert the value to a string and insert the result into the generated output stream.
  • You generally do not need to worry about XML-escaping such variables. Genshi will automatically take care of that when the template is serialized. We'll look into the details of this process later.
  • The template will be parsed by Genshi using an XML parser, which means that it needs to be well-formed XML. If you know HTML but are unfamiliar with XML/XHTML, you will need to read up on the topic. Here are a couple of good references:
  • That the template uses XHTML does not mean that your web application has to generate XHTML. You can just as well generate good old HTML 4.01, because despite all the hype, that's still the format that works best in most browsers (see this blog post over at Surfin' Safari for some background).
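
To make the first two points concrete, the following stdlib-only sketch does by hand what Genshi does automatically: it converts a value of any type to a string and XML-escapes it before substitution. This is not Genshi code. The substitute() helper is hypothetical, string.Template merely stands in for a real template engine, and Python 3 is assumed.

```python
from string import Template
from xml.sax.saxutils import escape


def substitute(source, **variables):
    """Convert each value to a string, XML-escape it, then substitute it."""
    escaped = {name: escape(str(value)) for name, value in variables.items()}
    return Template(source).substitute(escaped)


# A value containing markup characters cannot break the surrounding XML.
html = substitute('<h1>$title</h1>', title='Geddit <beta> & friends')
print(html)

# Non-string values are converted to strings first.
count = substitute('<p>$n links</p>', n=42)
print(count)
```

With Genshi you never write this step yourself; the sketch only shows why a title containing < or & does not corrupt the generated page.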

We now need to change the controller code so that this template is used. First, add the Genshi TemplateLoader to the imports at the top of the geddit/controller.py file, and instantiate a loader for the geddit/templates directory:

import cherrypy
from genshi.template import TemplateLoader

loader = TemplateLoader(
    os.path.join(os.path.dirname(__file__), 'templates'),
    auto_reload=True
)

Next, change the implementation of the index() method of the Root class to look like this:

    @cherrypy.expose
    def index(self):
        tmpl = loader.load('index.html')
        return tmpl.generate(title='Geddit').render('html', doctype='html')

This asks the template loader for a template named index.html, generates the output stream, and finally serializes the output to HTML. When you now reload the page in your browser, you should get back the following HTML response:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Geddit</title>
  </head>
  <body>
    <div id="header">
      <h1>Geddit</h1>
    </div>

    <p>Welcome!</p>

    <div id="footer">
      <hr>
      <p class="legalese">© 2007 Edgewall Software</p>
    </div>
  </body>
</html>

Data Model

To continue, we'll first need to add some Python classes to define the data model the application will use. As mentioned above, we're using a simple pickle file for persistence, so all we need to do here is create a couple of very simple Python classes.

Model diagram

Inside the geddit directory, create a file named model.py, with the following content:

from datetime import datetime


class Link(object):

    def __init__(self, username, url, title):
        self.username = username
        self.url = url
        self.title = title
        self.time = datetime.utcnow()
        self.id = hex(hash(tuple([username, url, title, self.time])))[2:]
        self.comments = []

    def __repr__(self):
        return '<%s %r>' % (type(self).__name__, self.title)

    def add_comment(self, username, content):
        self.comments.append(Comment(username, content))


class Comment(object):

    def __init__(self, username, content):
        self.username = username
        self.content = content
        self.time = datetime.utcnow()

    def __repr__(self):
        return '<%s>' % (type(self).__name__)

You'll need to import those classes in geddit/controller.py, just below the other imports:

from geddit.model import Link, Comment

And in the main() function, let's add some code to read our data from the pickle file, and write it back:

def main(filename):
    # load data from the pickle file, or initialize it to an empty dict
    if os.path.exists(filename):
        fileobj = open(filename, 'rb')
        try:
            data = pickle.load(fileobj)
        finally:
            fileobj.close()
    else:
        data = {}

    def _save_data():
        # save data back to the pickle file
        fileobj = open(filename, 'wb')
        try:
            pickle.dump(data, fileobj)
        finally:
            fileobj.close()
    cherrypy.engine.on_stop_engine_list.append(_save_data)
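
The load-or-initialize and save steps can also be tried in isolation, without starting the server. The sketch below is a stdlib-only approximation that uses with blocks and a temporary file. load_data() and save_data() are hypothetical helper names, not part of the tutorial's code, and Python 3 is assumed.

```python
import os
import pickle
import tempfile


def load_data(filename):
    """Return the pickled dict stored in filename, or an empty dict."""
    if os.path.exists(filename):
        with open(filename, 'rb') as fileobj:
            return pickle.load(fileobj)
    return {}


def save_data(data, filename):
    """Write the data dict back to the pickle file."""
    with open(filename, 'wb') as fileobj:
        pickle.dump(data, fileobj)


path = os.path.join(tempfile.mkdtemp(), 'geddit.db')
first_run = load_data(path)            # the file does not exist yet
save_data({'abc123': 'An example'}, path)
second_run = load_data(path)           # round-trips the saved dict
```

In the application itself, the save step runs only when the CherryPy engine stops, so data added while the server is running is lost if the process is killed abruptly.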

Now let's add some initial content to our “database”.

Note: You'll need to stop the CherryPy server to do the following, otherwise your changes will get overwritten.

In the terminal, from the tutorial directory, launch the interactive Python shell by executing PYTHONPATH=. python, and enter the following code:

>>> from geddit.model import *
>>> link1 = Link(username='joe', url='http://example.org/', title='An example')
>>> link1.add_comment(username='jack', content='Bla bla bla')
>>> link1.add_comment(username='joe', content='Bla bla bla, bla bla.')
>>> link2 = Link(username='annie', url='http://reddit.com/', title='The real thing')
>>> import pickle
>>> pickle.dump({link1.id: link1, link2.id: link2}, open('geddit.db', 'wb'))

You should now have two links in the pickle file, with the first link having a comment, as well as a reply to that comment. Restart the CherryPy server by running:

$ PYTHONPATH=. python geddit/controller.py geddit.db

Extending the Template

Now let's change the Root.index() method in geddit/controller.py to pass the links list to the template:

    @cherrypy.expose
    def index(self):
        links = sorted(self.data.values(), key=operator.attrgetter('time'))

        tmpl = loader.load('index.html')
        stream = tmpl.generate(links=links)
        return stream.render('html', doctype='html')

And finally, we'll modify the index.html template so that it displays the links in a simple ordered list. While we're at it, let's add a link to submit new items:

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/">
  <head>
    <title>Geddit</title>
  </head>
  <body>
    <div id="header">
      <h1>Geddit</h1>
    </div>

    <p><a href="/submit/">Submit new link</a></p>

    <ol py:if="links">
      <li py:for="link in reversed(links)">
        <a href="${link.url}">${link.title}</a>
        posted by ${link.username}
        at ${link.time.strftime('%m/%d/%Y %H:%M')}
      </li>
    </ol>

    <div id="footer">
      <hr />
      <p class="legalese">© 2007 Edgewall Software</p>
    </div>
  </body>
</html>

This template demonstrates some aspects of Genshi that we've not seen so far:

  • We declare the py: namespace prefix on the <html> element, which is required to be able to add directives to the template.
  • There's a py:if condition on the <ol> element. That means that the <ol> and all nested content will only be included in the output stream if the expression links evaluates to a truth value. In this case we know that links is a list (assembled by the Root.index() method), so if the list is empty, the <ol> will be skipped.
  • Next up, we've attached a py:for loop to the <li> element: py:for="link in reversed(links)". As a result, the <li> element is repeated for every item in the links list, in reverse order. The link variable is bound to the current item on every step.
  • As you can tell, expressions can be more complex than simple variable substitutions: directives such as py:if and py:for take Python expressions of any complexity, and you can include complex expressions in other places by putting them inside curly braces prefixed with a dollar sign (${...}).
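
For readers who think in Python first, the two directives correspond roughly to the plain Python below. This is only an analogy, not how Genshi is implemented: it builds strings by hand and skips escaping, and the Link namedtuple and render_links() are hypothetical stand-ins for the tutorial's model and template. Python 3 is assumed.

```python
from collections import namedtuple
from datetime import datetime
import operator

Link = namedtuple('Link', 'username url title time')


def render_links(links):
    """Rough equivalent of the <ol py:if> / <li py:for> block."""
    if not links:                      # py:if="links": skip the whole <ol>
        return ''
    items = []
    for link in reversed(links):       # py:for="link in reversed(links)"
        items.append('<li><a href="%s">%s</a> posted by %s at %s</li>' % (
            link.url, link.title, link.username,
            link.time.strftime('%m/%d/%Y %H:%M')))  # month/day/year hour:minute
    return '<ol>%s</ol>' % ''.join(items)


# Sorted oldest first, as in Root.index(); the loop then reverses the order.
links = sorted([
    Link('annie', 'http://reddit.com/', 'The real thing',
         datetime(2007, 8, 2, 9, 5)),
    Link('joe', 'http://example.org/', 'An example',
         datetime(2007, 8, 1, 17, 30)),
], key=operator.attrgetter('time'))

print(render_links(links))        # newest link first
print(repr(render_links([])))     # empty list: no <ol> at all
```

Note that in strftime() patterns %m is the month and %M is the minute, which is easy to mix up.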

When you reload the page in the browser, you should see a page similar to this:

Browser screenshot 1

Adding a Submission Form

In the previous step, we've already added a link to a submission form to the template, but we haven't implemented the logic to handle requests to that link yet.

To do that, we need to add a method to the Root class in geddit/controller.py:

    @cherrypy.expose
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            # TODO: validate the input data!
            link = Link(**data)
            self.data[link.id] = link
            raise cherrypy.HTTPRedirect('/')

        tmpl = loader.load('submit.html')
        stream = tmpl.generate()
        return stream.render('html', doctype='html')

And of course we'll need to add a template to display the submission form. In geddit/templates, create a file named submit.html, with the following content:

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/">
  <head>
    <title>Geddit: Submit new link</title>
  </head>
  <body>
    <div id="header">
      <h1>Geddit</h1>
    </div>

    <h2>Submit new link</h2>
    <form action="" method="post">
      <table summary=""><tr>
        <th><label for="username">Your name:</label></th>
        <td><input type="text" id="username" name="username" /></td>
      </tr><tr>
        <th><label for="url">Link URL:</label></th>
        <td><input type="text" id="url" name="url" /></td>
      </tr>
      <tr>
        <th><label for="title">Title:</label></th>
        <td><input type="text" id="title" name="title" /></td>
      </tr></table>
      <div>
        <input type="submit" value="Submit" />
        <input type="submit" name="cancel" value="Cancel" />
      </div>
    </form>

    <div id="footer">
      <hr />
      <p class="legalese">© 2007 Edgewall Software</p>
    </div>
  </body>
</html>

Now, if you click on the “Submit new link” link on the start page, you should see the submission form. Filling out the form and clicking “Submit” will post a new link and take you to the start page. Clicking the “Cancel” button will take you back to the start page without adding a link.

Please note though that we're not performing any kind of validation on the input, and that's of course a bad thing. So let's add validation next.

Adding Form Validation

We'll use  FormEncode to do the validation, but we'll keep it all fairly basic. Let's declare our form in a separate file, namely geddit/form.py, which will have the following content:

from formencode import Schema, validators


class LinkForm(Schema):
    username = validators.UnicodeString(not_empty=True)
    url = validators.URL(not_empty=True, add_http=True, check_exists=False)
    title = validators.UnicodeString(not_empty=True)
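
To get a feel for the contract such a schema provides (clean values on success, a dictionary of per-field messages on failure), here is a rough stdlib-only approximation. It is not FormEncode: validate_link() is a hypothetical function, its error messages are made up, and its URL check is far cruder than validators.URL. Python 3 is assumed.

```python
from urllib.parse import urlparse


def validate_link(data):
    """Return (clean, errors) for a dict of submitted form fields."""
    clean, errors = {}, {}
    for field in ('username', 'url', 'title'):
        value = (data.get(field) or '').strip()
        if not value:
            errors[field] = 'Please enter a value'
        else:
            clean[field] = value

    if 'url' in clean:
        url = clean['url']
        if '://' not in url:
            url = 'http://' + url          # loosely mimics add_http=True
        parts = urlparse(url)
        if parts.scheme in ('http', 'https') and '.' in parts.netloc:
            clean['url'] = url
        else:
            del clean['url']
            errors['url'] = 'That is not a valid URL'
    return clean, errors


ok, no_errors = validate_link(
    {'username': 'joe', 'url': 'example.org', 'title': 'An example'})
bad, bad_errors = validate_link(
    {'username': 'joe', 'url': 'nonsense', 'title': ''})
```

The errors dictionary is keyed by field name, which is the same shape the tutorial later obtains from FormEncode via unpack_errors() and passes to the template.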

Now let's use that class in the Root.submit() method. First add the following to the top of geddit/controller.py:

from formencode import Invalid
from geddit.form import LinkForm

Then, update the submit() method to match the following:

    @cherrypy.expose
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            form = LinkForm()
            try:
                data = form.to_python(data)
                link = Link(**data)
                self.data[link.id] = link
                raise cherrypy.HTTPRedirect('/')
            except Invalid, e:
                errors = e.unpack_errors()
        else:
            errors = {}

        tmpl = loader.load('submit.html')
        stream = tmpl.generate(errors=errors)
        return stream.render('html', doctype='html')

As you can tell, we now only add the submitted link to our database when validation is successful: all fields need to be filled out, and the url field needs to contain a valid URL. If the submission is valid, we proceed as before. If it is not valid, we render the submission form template again, passing it the dictionary of validation errors. Let's modify the submit.html template so that it displays those error messages:

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/">
  <head>
    <title>Geddit: Submit new link</title>
  </head>
  <body>
    <div id="header">
      <h1>Geddit</h1>
    </div>

    <h2>Submit new link</h2>
    <form action="" method="post">
      <table summary=""><tr>
        <th><label for="username">Your name:</label></th>
        <td>
          <input type="text" id="username" name="username" />
          <span py:if="'username' in errors" class="error">${errors.username}</span>
        </td>
      </tr><tr>
        <th><label for="url">Link URL:</label></th>
        <td>
          <input type="text" id="url" name="url" />
          <span py:if="'url' in errors" class="error">${errors.url}</span>
        </td>
      </tr>
      <tr>
        <th><label for="title">Title:</label></th>
        <td>
          <input type="text" id="title" name="title" />
          <span py:if="'title' in errors" class="error">${errors.title}</span>
        </td>
      </tr></table>
      <div>
        <input type="submit" value="Submit" />
        <input type="submit" name="cancel" value="Cancel" />
      </div>
    </form>

    <div id="footer">
      <hr />
      <p class="legalese">© 2007 Edgewall Software</p>
    </div>
  </body>
</html>

So now, if you submit the form without entering a title, and having entered an invalid URL, you'd see something like the following:

Browser screenshot 2

But there's a problem here: note how the input values have vanished from the form! We'd have to repopulate the form manually from the data submitted so far. We could do that by adding the required value="" attributes to the text fields in the template, but Genshi provides a more elegant way: the HTMLFormFiller stream filter. Given a dictionary of values, it can automatically populate HTML forms in the template output stream.

To enable this functionality, first add the import from genshi.filters import HTMLFormFiller to the geddit/controller.py file. Next, update the bottom lines of the Root.submit() method implementation so that they look as follows:

        tmpl = loader.load('submit.html')
        stream = tmpl.generate(errors=errors) | HTMLFormFiller(data=data)
        return stream.render('html', doctype='html')

Now, all entered values are preserved when validation errors occur. Note that the form is populated as the template is being generated; there is no reparsing and reserialization of the output.
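
The effect of the filter can be imitated with the standard library, which may help when reasoning about what ends up in the output. The sketch below is not HTMLFormFiller: fill_form() is a hypothetical helper that parses a complete, well-formed form with xml.etree.ElementTree, handles text inputs only, and re-serializes the result. That is precisely the extra parse and serialize round trip the Genshi filter avoids by working on the stream. Python 3 is assumed.

```python
import xml.etree.ElementTree as ET

FORM = """<form action="" method="post">
  <input type="text" id="username" name="username" />
  <input type="text" id="url" name="url" />
  <input type="submit" value="Submit" />
</form>"""


def fill_form(markup, data):
    """Set value="..." on every text input whose name appears in data."""
    root = ET.fromstring(markup)
    for field in root.iter('input'):
        name = field.get('name')
        if field.get('type') == 'text' and name in data:
            field.set('value', data[name])
    return ET.tostring(root, encoding='unicode')


filled = fill_form(FORM, {'username': 'joe', 'url': 'not a url'})
print(filled)
```

Note that the invalid URL is put back into the form unchanged, so the user can correct it rather than retype it.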

Factoring out the Templating

By now, we already have some repetitive code when it comes to rendering templates: the Root.index() and Root.submit() methods both load a specific template, call its generate() method passing it some data, and then call the render() method of the resulting stream. As we're going to be adding more controller methods, let's factor out those things into a library module.

There's a special challenge here, though: we still want to be able to add the HTMLFormFiller or other stream filters to the template output stream, which needs to be done before that output stream is serialized. We'll use a combination of a decorator and a regular function to achieve that, which collaborate using the CherryPy thread-local context.

Create a directory called lib inside the geddit directory, and inside the lib directory create two files, named __init__.py and template.py, respectively. Leave the first one empty, and in the second one, insert the following code:

import os

import cherrypy
from genshi.core import Stream
from genshi.output import encode, get_serializer
from genshi.template import Context, TemplateLoader

loader = TemplateLoader(
    os.path.join(os.path.dirname(__file__), '..', 'templates'),
    auto_reload=True
)

def output(filename, method='html', encoding='utf-8', **options):
    """Decorator for exposed methods to specify what template they should use
    for rendering, and which serialization method and options should be
    applied.
    """
    def decorate(func):
        def wrapper(*args, **kwargs):
            cherrypy.thread_data.template = loader.load(filename)
            if method == 'html':
                options.setdefault('doctype', 'html')
            serializer = get_serializer(method, **options)
            stream = func(*args, **kwargs)
            if not isinstance(stream, Stream):
                return stream
            return encode(serializer(stream), method=serializer,
                          encoding=encoding)
        return wrapper
    return decorate

def render(*args, **kwargs):
    """Function to render the given data to the template specified via the
    ``@output`` decorator.
    """
    if args:
        assert len(args) == 1, \
            'Expected exactly one argument, but got %r' % (args,)
        template = loader.load(args[0])
    else:
        template = cherrypy.thread_data.template
    ctxt = Context(url=cherrypy.url)
    ctxt.push(kwargs)
    return template.generate(ctxt)

In the geddit/controller.py file, you can now remove the from genshi.template import TemplateLoader line, and also the instantiation of the TemplateLoader, as that is now done in our new library module. Of course, you'll have to import that library module instead:

from geddit.lib import template

Now, we can change the Root class to match the following:

class Root(object):

    def __init__(self, data):
        self.data = data

    @cherrypy.expose
    @template.output('index.html')
    def index(self):
        links = sorted(self.data.values(), key=operator.attrgetter('time'))
        return template.render(links=links)

    @cherrypy.expose
    @template.output('submit.html')
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            form = LinkForm()
            try:
                data = form.to_python(data)
                link = Link(**data)
                self.data[link.id] = link
                raise cherrypy.HTTPRedirect('/')
            except Invalid, e:
                errors = e.unpack_errors()
        else:
            errors = {}

        return template.render(errors=errors) | HTMLFormFiller(data=data)

As you can see here, the code is now less repetitive: there's a simple decorator to define which template should be used, and the render() function produces the template output stream, which can then be further processed if necessary.
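
The way the decorator and render() cooperate is easier to see without CherryPy and Genshi in the picture. The following is a minimal sketch of the same pattern under stated assumptions: threading.local() stands in for cherrypy.thread_data, string.Template stands in for Genshi templates, and the TEMPLATES dict is a hypothetical replacement for the template loader. Python 3 is assumed.

```python
import functools
import threading
from string import Template

thread_data = threading.local()     # stand-in for cherrypy.thread_data
TEMPLATES = {'index.html': Template('<h1>$title</h1>')}  # stand-in loader


def output(filename):
    """Decorator: remember which template the wrapped function renders."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Stash the template where render() can find it for this thread.
            thread_data.template = TEMPLATES[filename]
            result = func(*args, **kwargs)
            # The real decorator serializes a Genshi stream at this point;
            # here we only encode the rendered string.
            return result.encode('utf-8')
        return wrapper
    return decorate


def render(**kwargs):
    """Render the template chosen by the enclosing @output decorator."""
    return thread_data.template.substitute(kwargs)


@output('index.html')
def index():
    # The handler only supplies data; it never names the template again.
    return render(title='Geddit')


page = index()
```

Because the template reference is stored per thread, concurrent requests handled by different threads do not overwrite each other's choice of template.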

Adding a Layout Template

But there's also duplication in the template files themselves: each template has to redefine the complete header and footer, and any other “decoration” markup that we may want to apply to the complete site. Now, we could simply put those commonly used markup snippets into separate HTML files and include them in the templates where they are needed. But Genshi provides a more elegant way to apply a common structure to different templates: match templates.

Most template languages provide an inheritance mechanism to allow different templates to share some kind of common structure, such as a common header, navigation, and footer. Using this mechanism, you create a “master template” in which you declare slots that “derived templates” can fill in. The problem with this approach is that it is fairly rigid: the master needs to know which content the templates will produce, and what kind of slots need to be provided for them to stuff their content in. Also, a derived template is itself not a valid or even well-formed HTML file, and cannot be easily previewed or edited in a WYSIWYG authoring tool.

Match templates in Genshi turn this upside down. They are conceptually similar to running an XSLT transformation over your template output: you create rules that match elements in the template output stream based on XPath patterns. Whenever there is a match, the matched content is replaced by what the match template produces. This sounds complicated in theory, but is fairly intuitive in practice, so let's look at a concrete example.

Let's create a layout template first: in the geddit/templates/ directory, add a file named layout.html, with the following content:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/" py:strip="">

  <py:match path="head" once="true">
    <head py:attrs="select('@*')">
      <title py:with="title = list(select('title/text()'))">
        geddit<py:if test="title">: ${title}</py:if>
      </title>
      <link rel="stylesheet" href="${url('/media/layout.css')}" type="text/css" />
      <script type="text/javascript" src="${url('/media/jquery.js')}"></script>
      ${select('*[local-name()!="title"]')}
    </head>
  </py:match>

  <py:match path="body" once="true">
    <body py:attrs="select('@*')"><div id="wrap">
      <div id="header">
        <a href="/"><img src="${url('/media/logo.gif')}" width="201" height="79" alt="geddit?" /></a>
      </div>
      <div id="content">
        ${select('*|text()')}
      </div>
      <div id="footer">
        <hr />
        <p class="legalese">© 2007 Edgewall Software</p>
      </div>
    </div></body>
  </py:match>

</html>

That contains a whole lot of things, so let's break it up into smaller pieces and go through the various aspects to clarify them.

  1. The Document Element
 <html xmlns="http://www.w3.org/1999/xhtml"
       xmlns:py="http://genshi.edgewall.org/" py:strip="">

First, note that the root element of the template is an <html> tag. This is needed because markup templates are XML documents, and XML documents require a single root element (we also use it to attach our namespace declarations, but we could just as well do that on the nested <py:match> elements). However, because the page templates that include this file will also have <html> root elements, we add the py:strip="" directive so that this second <html> tag doesn't make it through into the output stream.

  2. Match Template Definition
   <py:match path="head" once="true">

Here we define the first match template. The path attribute contains an XPath pattern specifying which elements this match template should be applied to. In this case, the XPath is very simple: it matches any element with the tag name “head”, so it will be applied to the <head>...</head> element. We also add the once="true" attribute to tell Genshi that we only expect a single occurrence of the <head> element in the stream. Genshi can perform some optimizations based on this information.

  3. Selecting Matched Content
    <head py:attrs="select('@*')">

Inside match templates, you can use the special function select(path) to access the element that matched the pattern. Here we use that function in the py:attrs directive, which basically translates to “get all attributes on the matched element, and add them to this element”. So for example if your page template contained <head id="foo">, the element produced by this match template would also have the same id="foo" attribute.

      <title py:with="title = list(select('title/text()'))">
        geddit<py:if test="title">: ${title}</py:if>
      </title>

This is a more complex example for selecting matched content: it fetches the text contained in the <title> element of the original <head> and prefixes it with the string “geddit: ”. But as page templates may not even contain a <title> element, we first check whether it exists, and only add the colon if it does. Thus, if the page has no title of its own, the result will be “geddit”.

      ${select('*[local-name()!="title"]')}

Finally, this is an example for using a more complex XPath pattern. This select() incantation here returns a stream that contains all child elements of the original <head>, except for those elements with the tag name “title”. If we didn't add that predicate, the output stream would contain two <title> tags.

If you've done a bit of XSLT, match templates should look familiar. Otherwise, you may want to familiarize yourself with the basics of  XPath 1—but note that Genshi only implements a subset of the full spec as explained in Using XPath in Genshi. Just play around with match templates a bit; at the core, the concept is actually pretty simple and consistent.
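
If XPath is new to you, it may help to see what the three select() calls in the head match template pick out, expressed with the standard library. The sketch below is not Genshi and does not operate on a stream: layout_head() is a hypothetical function that uses xml.etree.ElementTree on an in-memory tree, and the sample <head> content is made up. Python 3 is assumed.

```python
import xml.etree.ElementTree as ET

PAGE_HEAD = ET.fromstring(
    '<head id="foo"><title>News</title>'
    '<meta name="robots" content="noindex" /></head>')


def layout_head(matched):
    """Build the replacement <head> from the matched page <head>."""
    # py:attrs="select('@*')": copy all attributes of the matched element.
    head = ET.Element('head', dict(matched.attrib))
    # select('title/text()'): text of the page's own <title>, if there is one.
    page_title = matched.findtext('title')
    title = ET.SubElement(head, 'title')
    title.text = 'geddit' + (': ' + page_title if page_title else '')
    # select('*[local-name()!="title"]'): every child element except <title>.
    for child in matched:
        if child.tag != 'title':
            head.append(child)
    return head


new_head = layout_head(PAGE_HEAD)
print(ET.tostring(new_head, encoding='unicode'))

# A page without a <title> of its own just gets the site name.
untitled = layout_head(ET.fromstring('<head />'))
print(ET.tostring(untitled, encoding='unicode'))
```

The stylesheet and script references from the layout are left out here to keep the sketch focused on the selection logic.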

Now we need to update the page templates: they no longer need the header and footer, and we'll have to include the layout.html file so that the match templates are applied. For the inclusion, we add the namespace prefix for XInclude, and an xi:include element.

Let's see how the template should look now for index.html:

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:py="http://genshi.edgewall.org/">
  <xi:include href="layout.html" />
  <head>
    <title>News</title>
  </head>
  <body>
    <p><a href="/submit/">Submit new link</a></p>

    <ol py:if="links">
      <li py:for="link in reversed(links)">
        <a href="${link.url}">${link.title}</a>
        posted by ${link.username}
        at ${link.time.strftime('%m/%d/%Y %H:%M')}
      </li>
    </ol>
  </body>
</html>

Also change the submit.html template analogously, by adding the namespace prefix, the <xi:include> element, and by removing the header and footer <div>s.

Speaking of “layout”, you can see that we've added references to some static resources in the layout template: there's an embedded image as well as a linked stylesheet and JavaScript file. Download those files and put them in your geddit/static/ directory.

When you reload the front page in your browser, you should now see something similar to the following:

Browser screenshot 3

Implementing Comments

TODO

Allowing Markup in Comments

TODO

Ajaxified Commenting

TODO

Adding an Atom Feed

TODO

Summary

TODO
