Genshi Tutorial
This tutorial is intended to give an introduction on how to use Genshi in your web application, and present common patterns and best practices. It is aimed at developers new to Genshi as well as those who've already used Genshi, but are looking for advice or inspiration on how to improve that usage.
Introduction
In this tutorial we'll create a simple Python web application based on CherryPy 3. CherryPy was chosen because it provides a convenient level of abstraction over raw CGI or WSGI development, but is less ambitious than full-stack web frameworks such as Pylons or Django, which tend to come with a preferred templating language, and often show significant bias towards that language.
The application we'll build here is a stripped-down version of sites such as reddit or digg: it lets users submit links to online articles they find interesting, and then lets other users comment on those stories. Just for kicks, we'll call that application Geddit.
We'll keep the project as simple as possible, while still showing many of Genshi's features and how to best use them:
- For persistence, we'll use native Python object serialization (via the pickle module), instead of an SQL database and an ORM.
- There's no authentication of any kind. Anyone can submit links, anyone can comment.
- We'll start with the basics (rendering templates, handling forms, etc), and then continue by adding features such as Atom feeds and an AJAX interface.
Content
- Introduction
- Getting Started
- Making the Application “Do Stuff”
- Improving the Application
- Advanced Topics
- Summary
Getting Started
Prerequisites
First, make sure you have CherryPy 3.0.x installed, as well as recent versions of FormEncode and obviously Genshi. You can download and install those manually, or just use easy_install:
$ easy_install CherryPy
$ easy_install FormEncode
$ easy_install Genshi
The CherryPy Application
Next, set up the basic CherryPy application.
- Create a directory that should contain the application
- Inside that directory create a Python package named geddit by doing the following:
  - Create a geddit directory
  - Create an empty file called __init__.py inside the geddit directory
- Inside the geddit package directory, create a file called controller.py with the following content:
#!/usr/bin/env python

import operator, os, pickle, sys

import cherrypy


class Root(object):

    def __init__(self, data):
        self.data = data

    @cherrypy.expose
    def index(self):
        return 'Geddit'


def main(filename):
    data = {} # We'll replace this later

    # Some global configuration; note that this could be moved into a
    # configuration file
    cherrypy.config.update({
        'tools.encode.on': True, 'tools.encode.encoding': 'utf-8',
        'tools.decode.on': True,
        'tools.trailing_slash.on': True,
        'tools.staticdir.root': os.path.abspath(os.path.dirname(__file__)),
    })

    cherrypy.quickstart(Root(data), '/', {
        '/media': {
            'tools.staticdir.on': True,
            'tools.staticdir.dir': 'static'
        }
    })

if __name__ == '__main__':
    main(sys.argv[1])
Enter the tutorial directory in the terminal, and run:
$ PYTHONPATH=. python geddit/controller.py geddit.db
Note: On some Windows systems you may have to enter two lines:
SET PYTHONPATH=.
python geddit/controller.py geddit.db
You should see a log message pointing you to the URL where the application is being served, which is usually http://localhost:8080/. Visiting that page will respond with just the string “Geddit”, as that's what the index() method of the Root object returns.
Note that we've configured CherryPy to serve static files from the geddit/static directory. CherryPy will complain that that directory does not exist, so create it, but leave it empty for now. We'll add static resources later on in the tutorial.
Basic Template Rendering
So far the code doesn't actually use Genshi, or even any kind of templating. Let's change that.
Inside of the geddit directory, create a directory called templates, and inside that directory create a file called index.html, with the following content:
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>$title</title>
  </head>
  <body class="index">
    <div id="header">
      <h1>$title</h1>
    </div>

    <p>Welcome!</p>

    <div id="footer">
      <hr />
      <p class="legalese">© 2007 Edgewall Software</p>
    </div>
  </body>
</html>
Note: make sure you're saving this file using UTF-8 encoding with your editor. If for some reason you can't use UTF-8 (maybe you're using an outdated editor), replace the “©” character in the template with the corresponding character entity “&copy;”. The same is true for all other template files in this tutorial.
This is basically an almost static XHTML file with some simple variable substitution: the string $title will be replaced by a variable of that name that we'll pass into the template from the controller.
There are a couple of important things to point out here:
- Variables substituted into templates, such as $title in our example, can be of any Python data type. Genshi will convert the value to a string and insert the result into the generated output stream.
- You generally do not need to worry about XML-escaping such variables. Genshi will automatically take care of that when the template is serialized. We'll look into the details of this process later.
- The template will be parsed by Genshi using an XML parser, which means that it needs to be well-formed XML. If you know HTML but are unfamiliar with XML/XHTML, you will need to read up on the topic. Here are a couple of good references:
  - Differences Between XHTML And HTML at W3Schools
  - XHTML - An Introduction at SitePoint
- Just because the template uses XHTML does not mean that our web-application should generate XHTML! While that is possible, you can also choose to generate good old HTML 4.01, because, despite all the hype, that's still the format that works best in most browsers (see this blog post over at Surfin' Safari for some background).
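To make the first two points above more concrete, here is a minimal sketch of "convert to a string, escape, then substitute" using only the standard library. It is not how Genshi is implemented; the `render_escaped` helper and the `page` template are hypothetical names made up for this illustration.

```python
from string import Template
from xml.sax.saxutils import escape

# Stand-in for a template with a $title placeholder (NOT a Genshi template).
page = Template("<h1>$title</h1>")


def render_escaped(template, **values):
    """Convert each value to a string, XML-escape it, then substitute it."""
    safe = {name: escape(str(value)) for name, value in values.items()}
    return template.substitute(safe)


# Markup characters in the value are escaped rather than interpreted.
print(render_escaped(page, title="Tips & <tricks>"))
# prints: <h1>Tips &amp; &lt;tricks&gt;</h1>

# Non-string values are converted to strings first.
print(render_escaped(page, title=42))
# prints: <h1>42</h1>
```

Genshi does the equivalent work for you at serialization time, so the controller can pass plain Python values without escaping them first.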
We now need to change the controller code so that this template is used. First, add the Genshi TemplateLoader to the imports at the top of the geddit/controller.py file, and instantiate a loader for the geddit/templates directory:
import cherrypy
from genshi.template import TemplateLoader

loader = TemplateLoader(
    os.path.join(os.path.dirname(__file__), 'templates'),
    auto_reload=True
)
Next, change the implementation of the index() method of the Root class to look like this:
    @cherrypy.expose
    def index(self):
        tmpl = loader.load('index.html')
        return tmpl.generate(title='Geddit').render('html', doctype='html')
This asks the template loader for a template named index.html, generates the output stream, and finally serializes the output to HTML. When you now reload the page in your browser, you should get back the following HTML response:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Geddit</title>
  </head>
  <body class="index">
    <div id="header">
      <h1>Geddit</h1>
    </div>
    <p>Welcome!</p>
    <div id="footer">
      <hr>
      <p class="legalese">© 2007 Edgewall Software</p>
    </div>
  </body>
</html>
Note that the output has some subtle differences compared to the template, even beyond the variable substitution that has taken place. Genshi has added a proper HTML DOCTYPE (important for getting browsers to render in standards mode, through a commonly employed mechanism called “doctype switching”). It has also removed the XHTML namespace declaration, because we're rendering to HTML, and HTML doesn't support XML namespaces. And the <hr> element in the footer is missing the trailing slash that can be used in XML markup for empty elements: HTML user agents know that <hr> is always an empty element, and including the trailing slash, or adding an explicit </hr> end tag, would be invalid and might even confuse some browsers.
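The difference between XML and HTML serialization of empty elements is not specific to Genshi. As a rough illustration, the standard library's ElementTree module also offers both output methods; the sketch below uses it purely as an analogy and says nothing about how Genshi's serializers work internally.

```python
import xml.etree.ElementTree as ET

# A small well-formed fragment, similar to the footer in our template.
footer = ET.fromstring(
    '<div id="footer"><hr /><p class="legalese">Example</p></div>'
)

# XML output keeps the self-closing form of the empty element ...
as_xml = ET.tostring(footer, encoding="unicode", method="xml")
# ... while HTML output writes a bare start tag with no end tag.
as_html = ET.tostring(footer, encoding="unicode", method="html")

print(as_xml)
print(as_html)
```

The first line printed contains `<hr />`, the second a plain `<hr>`, which mirrors the change described above.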
The Data Model
To continue, we'll need to first add some Python classes to define the data model the application will use. As mentioned above, we're using a simple pickle file for persistence, so all we need to do here is create a couple of very simple Python classes.
Inside the geddit directory, create a file named model.py, with the following content:
from datetime import datetime


class Link(object):

    def __init__(self, username, url, title):
        self.username = username
        self.url = url
        self.title = title
        self.time = datetime.utcnow()
        self.id = hex(hash(tuple([username, url, title, self.time])))[2:]
        self.comments = []

    def __repr__(self):
        return '<%s %r>' % (type(self).__name__, self.title)

    def add_comment(self, username, content):
        comment = Comment(username, content)
        self.comments.append(comment)
        return comment


class Comment(object):

    def __init__(self, username, content):
        self.username = username
        self.content = content
        self.time = datetime.utcnow()

    def __repr__(self):
        return '<%s by %r>' % (type(self).__name__, self.username)
You'll need to import those classes in geddit/controller.py, just below the other imports:
from geddit.model import Link, Comment
And in the main() function, let's replace the placeholder data = {} with some code to read our data from the pickle file, and write it back:
def main(filename):
    # load data from the pickle file, or initialize it to an empty dict
    if os.path.exists(filename):
        fileobj = open(filename, 'rb')
        try:
            data = pickle.load(fileobj)
        finally:
            fileobj.close()
    else:
        data = {}

    def _save_data():
        # save data back to the pickle file
        fileobj = open(filename, 'wb')
        try:
            pickle.dump(data, fileobj)
        finally:
            fileobj.close()
    if hasattr(cherrypy.engine, 'subscribe'): # CherryPy >= 3.1
        cherrypy.engine.subscribe('stop', _save_data)
    else:
        cherrypy.engine.on_stop_engine_list.append(_save_data)
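On Python versions that support the with statement, the same load-or-initialize and save logic can be written more compactly. The following is only a sketch of that pattern in isolation: the `load_data` and `save_data` helpers are hypothetical names, and a temporary directory stands in for the real geddit.db location.

```python
import os
import pickle
import tempfile


def load_data(filename):
    """Return the pickled dict, or an empty dict if the file does not exist."""
    if not os.path.exists(filename):
        return {}
    with open(filename, "rb") as fileobj:
        return pickle.load(fileobj)


def save_data(data, filename):
    """Write the dict back to the pickle file."""
    with open(filename, "wb") as fileobj:
        pickle.dump(data, fileobj)


# Simulate two runs of the application against the same file.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "geddit.db")

    first_run = load_data(path)           # no file yet, so we get {}
    first_run["abc123"] = {"title": "An example"}
    save_data(first_run, path)            # what the 'stop' hook would do

    second_run = load_data(path)          # the data survives the "restart"
```

The with statement closes the file even if pickling raises an exception, which is what the try/finally blocks in main() achieve by hand.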
Now let's add some initial content to our “database”.
Note: You'll need to stop the CherryPy server to do the following, otherwise your changes will get overwritten.
In the terminal, from the tutorial directory, launch the interactive Python shell by executing PYTHONPATH=. python, and enter the following code:
>>> from geddit.model import *
>>> link1 = Link(username='joe', url='http://example.org/', title='An example')
>>> link1.add_comment(username='jack', content='Bla bla bla')
<Comment by 'jack'>
>>> link1.add_comment(username='joe', content='Bla bla bla, bla bla.')
<Comment by 'joe'>
>>> link2 = Link(username='annie', url='http://reddit.com/', title='The real thing')
>>> import pickle
>>> pickle.dump({link1.id: link1, link2.id: link2}, open('geddit.db', 'wb'))
>>> ^D  # Control-Z (^Z) on Windows
You should now have two links in the pickle file, with the first link having two comments. Start the CherryPy server again by running:
$ PYTHONPATH=. python geddit/controller.py geddit.db
Making the Application “Do Stuff”
Extending the Template
Now let's change the Root.index() method in geddit/controller.py to pass the links list to the template:
    @cherrypy.expose
    def index(self):
        links = sorted(self.data.values(), key=operator.attrgetter('time'))

        tmpl = loader.load('index.html')
        stream = tmpl.generate(links=links)
        return stream.render('html', doctype='html')
And finally, we'll modify the index.html template so that it displays the links in a simple ordered list. While we're at it, let's add a link to submit new items:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/">
  <head>
    <title>Geddit: News</title>
  </head>
  <body class="index">
    <div id="header">
      <h1>News</h1>
    </div>

    <ol py:if="links">
      <li py:for="link in reversed(links)">
        <a href="${link.url}">${link.title}</a>
        posted by ${link.username} at ${link.time.strftime('%x %X')}
      </li>
    </ol>

    <p><a class="action" href="/submit/">Submit new link</a></p>

    <div id="footer">
      <hr />
      <p class="legalese">© 2007 Edgewall Software</p>
    </div>
  </body>
</html>
This template demonstrates some aspects of Genshi that we've not seen so far:
- We declare the py: namespace prefix on the <html> element, which is required to be able to add directives to the template.
- There's a py:if condition on the <ol> element. That means that the <ol> and everything it contains will only be included in the output stream if the expression links evaluates to a truth value. In this case we know that links is a list (assembled by the Root.index() method), so if the list is empty, the <ol> will be skipped.
- Next up, we've attached a py:for loop to the <li> element. What this does is that the <li> element will be repeated for every item in the links list. The link variable is bound to the current item in the list on every step.
- You can tell that we can also use more complex expressions than just simple variable substitutions: the directives such as py:if and py:for take Python expressions of any complexity, and you can include Python expressions in other places by putting them inside curly braces prefixed with a dollar sign (${...}).
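If directives are new to you, it may help to see what py:if and py:for correspond to in ordinary Python. The sketch below builds roughly the same list markup by hand; the `render_links` function and the dictionaries standing in for Link objects are assumptions made for this illustration, and Genshi does not generate code like this.

```python
from xml.sax.saxutils import escape, quoteattr


def render_links(links):
    """Plain-Python analogue of the <ol py:if> / <li py:for> markup."""
    # py:if="links": an empty list is false, so the whole <ol> is skipped.
    if not links:
        return ""
    items = []
    # py:for="link in reversed(links)": one <li> per item, and "link" is
    # bound to the current item on every step.
    for link in reversed(links):
        items.append("<li><a href=%s>%s</a> posted by %s</li>" % (
            quoteattr(link["url"]),
            escape(link["title"]),
            escape(link["username"]),
        ))
    return "<ol>%s</ol>" % "".join(items)


sample = [
    {"url": "http://example.org/", "title": "An example", "username": "joe"},
    {"url": "http://reddit.com/", "title": "The real thing", "username": "annie"},
]
print(render_links(sample))
print(repr(render_links([])))
```

Compared to this hand-written version, the template keeps the markup readable and leaves the escaping to Genshi.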
When you reload the page in the browser, you should see the submitted links in a numbered list, newest first, followed by the “Submit new link” link.
Adding a Submission Form
In the previous step, we've already added a link to a submission form to the template, but we haven't implemented the logic to handle requests to that link yet.
To do that, we need to add a method to the Root class in geddit/controller.py:
    @cherrypy.expose
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            # TODO: validate the input data!
            link = Link(**data)
            self.data[link.id] = link
            raise cherrypy.HTTPRedirect('/')

        tmpl = loader.load('submit.html')
        stream = tmpl.generate()
        return stream.render('html', doctype='html')
Note: we explicitly check for the HTTP request method here. Only if it's a “POST” request do we actually look into the submitted data and add it to the database. That's because “GET” requests in HTTP are supposed to be safe, that is, they should not have side effects. If we didn't make this check, we'd also be accepting requests that would change the database via “GET” or “HEAD”, thereby violating the rules.
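Stripped of CherryPy and templating, the rule in the note above boils down to the following sketch. The `submit` function, the `store` list, and the returned strings are placeholders invented for this example.

```python
def submit(method, store, data):
    """Only POST requests may change `store`; other methods just show the form."""
    if method == "POST":
        store.append(data)
        return "redirect to /"
    return "render form"


store = []
# A GET request carrying data must not modify anything.
get_result = submit("GET", store, {"title": "ignored"})
# A POST request is allowed to add the link.
post_result = submit("POST", store, {"title": "An example"})
print(store)
```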
And of course we'll need to add a template to display the submission form. In geddit/templates, create a file named submit.html, with the following content:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/">
  <head>
    <title>Geddit: Submit new link</title>
  </head>
  <body class="submit">
    <div id="header">
      <h1>Submit new link</h1>
    </div>

    <form action="" method="post">
      <table summary=""><tbody><tr>
        <th><label for="username">Your name:</label></th>
        <td><input type="text" id="username" name="username" /></td>
      </tr><tr>
        <th><label for="url">Link URL:</label></th>
        <td><input type="text" id="url" name="url" /></td>
      </tr><tr>
        <th><label for="title">Title:</label></th>
        <td><input type="text" name="title" /></td>
      </tr><tr>
        <td></td>
        <td>
          <input type="submit" value="Submit" />
          <input type="submit" name="cancel" value="Cancel" />
        </td>
      </tr></tbody></table>
    </form>

    <div id="footer">
      <hr />
      <p class="legalese">© 2007 Edgewall Software</p>
    </div>
  </body>
</html>
Now, if you click on the “Submit new link” link on the start page, you should see the submission form. Filling out the form and clicking “Submit” will post a new link and take you to the start page. Clicking on the “Cancel” button will take you back to the start page, but not add a link.
Please note though that we're not performing any kind of validation on the input, and that's of course a bad thing. So let's add validation next.
Adding Form Validation
We'll use FormEncode to do the validation, but we'll keep it all fairly basic. Let's declare our form in a separate file, namely geddit/form.py, which will have the following content:
from formencode import Schema, validators


class LinkForm(Schema):
    username = validators.UnicodeString(not_empty=True)
    url = validators.URL(not_empty=True, add_http=True, check_exists=False)
    title = validators.UnicodeString(not_empty=True)


class CommentForm(Schema):
    username = validators.UnicodeString(not_empty=True)
    content = validators.UnicodeString(not_empty=True)
Now let's use those in the Root.submit() method. First add the form classes, as well as the Invalid exception type used by FormEncode, to the imports at the top of geddit/controller.py, which should then look something like this:
import cherrypy
from formencode import Invalid
from genshi.template import TemplateLoader

from geddit.form import LinkForm, CommentForm
from geddit.model import Link, Comment
Then, update the submit() method to match the following:
    @cherrypy.expose
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            form = LinkForm()
            try:
                data = form.to_python(data)
                link = Link(**data)
                self.data[link.id] = link
                raise cherrypy.HTTPRedirect('/')
            except Invalid as e:
                errors = e.unpack_errors()
        else:
            errors = {}

        tmpl = loader.load('submit.html')
        stream = tmpl.generate(errors=errors)
        return stream.render('html', doctype='html')
As you can tell, we now only add the submitted link to our database when validation is successful: all fields need to be filled out, and the url field needs to contain a valid URL. If the submission is valid, we proceed as before. If it is not valid, we render the submission form template again, passing it the dictionary of validation errors. Let's modify the submit.html template so that it displays those error messages:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/">
  <head>
    <title>Geddit: Submit new link</title>
  </head>
  <body class="submit">
    <div id="header">
      <h1>Submit new link</h1>
    </div>

    <form action="" method="post">
      <table summary=""><tbody><tr>
        <th><label for="username">Your name:</label></th>
        <td>
          <input type="text" id="username" name="username" />
          <span py:if="'username' in errors" class="error">${errors.username}</span>
        </td>
      </tr><tr>
        <th><label for="url">Link URL:</label></th>
        <td>
          <input type="text" id="url" name="url" />
          <span py:if="'url' in errors" class="error">${errors.url}</span>
        </td>
      </tr><tr>
        <th><label for="title">Title:</label></th>
        <td>
          <input type="text" name="title" />
          <span py:if="'title' in errors" class="error">${errors.title}</span>
        </td>
      </tr><tr>
        <td></td>
        <td>
          <input type="submit" value="Submit" />
          <input type="submit" name="cancel" value="Cancel" />
        </td>
      </tr></tbody></table>
    </form>

    <div id="footer">
      <hr />
      <p class="legalese">© 2007 Edgewall Software</p>
    </div>
  </body>
</html>
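The template only relies on errors being a dictionary keyed by field name. To illustrate that shape without requiring FormEncode, here is a deliberately simplified, hypothetical stand-in for the validation step. The `validate_link` function and its messages are made up, and it does not reproduce FormEncode's behaviour (for example, it does not add a missing http:// prefix).

```python
from urllib.parse import urlparse


def validate_link(data):
    """Hypothetical, simplified stand-in for LinkForm().to_python(data).

    Returns a (cleaned, errors) pair; errors is empty for valid input.
    """
    cleaned, errors = {}, {}
    # Every field must be filled out.
    for field in ("username", "url", "title"):
        value = (data.get(field) or "").strip()
        if value:
            cleaned[field] = value
        else:
            errors[field] = "Please enter a value"
    # The url field must additionally look like an HTTP(S) URL.
    if "url" in cleaned:
        parts = urlparse(cleaned["url"])
        if parts.scheme not in ("http", "https") or not parts.netloc:
            errors["url"] = "That is not a valid URL"
            del cleaned["url"]
    return cleaned, errors


cleaned, errors = validate_link(
    {"username": "joe", "url": "not a url", "title": ""}
)
print(sorted(errors))   # the fields the template would flag
```

With a dictionary like this, an expression such as `'url' in errors` in the template decides whether the corresponding error span is rendered.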
So now, if you submit the form without entering a title, and having entered an invalid URL, you'd see an error message next to each of those two fields.
But there's a problem here: Note how the input values have vanished from the form! We'd have to repopulate the form manually from the data submitted so far. We could do that by adding the required value="" attributes to the text fields in the template, but Genshi provides a more elegant way: the HTMLFormFiller stream filter. Given a dictionary of values, it can automatically populate HTML forms in the template output stream.
To enable this functionality, first you'll need to add the following import to the geddit/controller.py file:
from genshi.filters import HTMLFormFiller
Next, update the bottom lines of the Root.submit() method implementation so that they look as follows:
        tmpl = loader.load('submit.html')
        stream = tmpl.generate(errors=errors) | HTMLFormFiller(data=data)
        return stream.render('html', doctype='html')
Now, all entered values are preserved when validation errors occur. Note that the form is populated as the template is being generated; there is no reparsing and reserialization of the output.
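To get a feeling for the effect of form filling, consider the following sketch based on the standard library's ElementTree. Unlike the Genshi filter, which works on the stream while the template is generated, this hypothetical `fill_form` helper parses and reserializes a string, so it illustrates the result only, not the mechanism.

```python
import xml.etree.ElementTree as ET


def fill_form(form_markup, data):
    """Set value attributes on text inputs whose name appears in `data`."""
    form = ET.fromstring(form_markup)
    for field in form.iter("input"):
        name = field.get("name")
        if field.get("type") == "text" and name in data:
            field.set("value", data[name])
    return ET.tostring(form, encoding="unicode", method="html")


markup = (
    '<form action="" method="post">'
    '<input type="text" name="username" />'
    '<input type="text" name="url" />'
    '<input type="submit" value="Submit" />'
    '</form>'
)

# Only the username was submitted, so only that field gets a value.
filled = fill_form(markup, {"username": "joe"})
print(filled)
```

The username field now carries value="joe", while the url field and the submit button are left as they were.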
Improving the Application
Factoring out the Templating
By now, we already have some repetitive code when it comes to rendering templates: both the Root.index() and the Root.submit() methods look very similar in that regard: they load a specific template, call its generate() method passing it some data, and then call the render() method of the resulting stream. As we're going to be adding more controller methods, let's factor out those things into a library module.
There's a special challenge here, though: we still want to be able to add the HTMLFormFiller or other stream filters to the template output stream, which needs to be done before that output stream is serialized. We'll use a combination of a decorator and a regular function to achieve that, which collaborate using the CherryPy thread-local context.
Create a directory called lib inside the geddit directory, and inside the lib directory create two files, named __init__.py and template.py, respectively. Leave the first one empty, and in the second one, insert the following code:
import os

import cherrypy
from genshi.core import Stream
from genshi.output import encode, get_serializer
from genshi.template import Context, TemplateLoader

loader = TemplateLoader(
    os.path.join(os.path.dirname(__file__), '..', 'templates'),
    auto_reload=True
)

def output(filename, method='html', encoding='utf-8', **options):
    """Decorator for exposed methods to specify what template they should use
    for rendering, and which serialization method and options should be
    applied.
    """
    def decorate(func):
        def wrapper(*args, **kwargs):
            cherrypy.thread_data.template = loader.load(filename)
            opt = options.copy()
            if method == 'html':
                opt.setdefault('doctype', 'html')
            serializer = get_serializer(method, **opt)
            stream = func(*args, **kwargs)
            if not isinstance(stream, Stream):
                return stream
            return encode(serializer(stream), method=serializer,
                          encoding=encoding)
        return wrapper
    return decorate

def render(*args, **kwargs):
    """Function to render the given data to the template specified via the
    ``@output`` decorator.
    """
    if args:
        assert len(args) == 1, \
            'Expected exactly one argument, but got %r' % (args,)
        template = loader.load(args[0])
    else:
        template = cherrypy.thread_data.template
    ctxt = Context(url=cherrypy.url)
    ctxt.push(kwargs)
    return template.generate(ctxt)
In the geddit/controller.py file, you can now remove the from genshi.template import TemplateLoader line, and also the instantiation of the TemplateLoader, as that is now done in our new library module. Of course, you'll have to import that library module instead:
from geddit.lib import template
Now, we can change the Root class to match the following:
class Root(object):

    def __init__(self, data):
        self.data = data

    @cherrypy.expose
    @template.output('index.html')
    def index(self):
        links = sorted(self.data.values(), key=operator.attrgetter('time'))
        return template.render(links=links)

    @cherrypy.expose
    @template.output('submit.html')
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            form = LinkForm()
            try:
                data = form.to_python(data)
                link = Link(**data)
                self.data[link.id] = link
                raise cherrypy.HTTPRedirect('/')
            except Invalid as e:
                errors = e.unpack_errors()
        else:
            errors = {}

        return template.render(errors=errors) | HTMLFormFiller(data=data)
As you can see here, the code is now less repetitive: there's a simple decorator to define which template should be used, and the render() function produces the template output stream which can then be further processed if necessary.
Adding a Layout Template
But there's also duplication in the template files themselves: each template has to redefine the complete header and footer, and any other “decoration” markup that we may want to apply to the complete site. Now, we could simply put those commonly used markup snippets into separate HTML files and include them in the templates where they are needed. But Genshi provides a more elegant way to apply a common structure to different templates: match templates.
Most template languages provide an inheritance mechanism to allow different templates to share some kind of common structure, such as a common header, navigation, and footer. Using this mechanism, you create a “master template” in which you declare slots that “derived templates” can fill in. The problem with this approach is that it is fairly rigid: the master needs to know which content the templates will produce, and what kind of slots need to be provided for them to stuff their content in. Also, a derived template is itself not a valid or even well-formed HTML file, and can not be easily previewed or edited in a WYSIWYG authoring tool.
Match templates in Genshi turn this upside down. They are conceptually similar to running an XSLT transformation over your template output: you create rules that match elements in the template output stream based on XPath patterns. Whenever there is a match, the matched content is replaced by what the match template produces. This sounds complicated in theory, but is fairly intuitive in practice, so let's look at a concrete example.
In the geddit/templates/ directory, add a file named layout.html, with the following content:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/" py:strip="">

  <py:match path="head" once="true">
    <head py:attrs="select('@*')">
      <title py:with="title = list(select('title/text()'))">
        Geddit<py:if test="title">: ${title}</py:if>
      </title>
      <link rel="stylesheet" href="${url('/media/layout.css')}" type="text/css" />
      <script type="text/javascript" src="${url('/media/jquery.js')}"></script>
      ${select('*[local-name()!="title"]')}
    </head>
  </py:match>

  <py:match path="body" once="true">
    <body py:attrs="select('@*')"><div id="wrap">
      <div id="header">
        <a href="/"><img src="${url('/media/logo.gif')}" width="201" height="79" alt="geddit?" /></a>
      </div>
      <div id="content">
        ${select('*|text()')}
      </div>
      <div id="footer">
        <hr />
        <p class="legalese">© 2007 Edgewall Software</p>
      </div>
    </div></body>
  </py:match>

</html>
That contains a whole lot of things, so let's break it up into smaller pieces and go through the various aspects to clarify them.
- The Document Element
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:py="http://genshi.edgewall.org/" py:strip="">
First, note that the root element of the template is an <html> tag. This is needed because markup templates are XML documents, and XML documents require a single root element (we also use it to attach our namespace declarations, but we could just as well do that on the nested <py:match> elements). However, because the page templates that include this file will also have <html> root elements, we add the py:strip="" directive so that this second <html> tag doesn't make it through into the output stream.
- Match Template Definition
<py:match path="head" once="true">
Here we define the first match template. The path attribute contains an XPath pattern specifying which elements this match template should be applied to. In this case, the XPath is very simple: it matches any element with the tag name “head”, so it will be applied to the <head>...</head> element. We also add the once="true" attribute to tell Genshi that we only expect a single occurrence of the <head> element in the stream. Genshi can perform some optimizations based on this information.
- Selecting Matched Content
<head py:attrs="select('@*')">
Inside match templates, you can use the special function select(path) to access the element that matched the pattern. Here we use that function in the py:attrs directive, which basically translates to “get all attributes on the matched element, and add them to this element”. So for example if your page template contained <head id="foo">, the element produced by this match template would also have the same id="foo" attribute.
<title py:with="title = list(select('title/text()'))">
  Geddit<py:if test="title">: ${title}</py:if>
</title>
This is a more complex example for selecting matched content: it fetches the text contained in the <title> element of the original <head> and prefixes it with the string “Geddit: ”. But as page templates may not even contain a <title> element, we first check whether it exists, and only add the colon if it does. Thus, if the page has no title of its own, the result will be “Geddit”.
${select('*[local-name()!="title"]')}
Finally, this is an example for using a more complex XPath pattern. This select() incantation here returns a stream that contains all child elements of the original <head>, except for those elements with the tag name “title”. If we didn't add that predicate, the output stream would contain two <title> tags.
If you've done a bit of XSLT, match templates should look familiar. Otherwise, you may want to familiarize yourself with the basics of XPath 1—but note that Genshi only implements a subset of the full spec as explained in Using XPath in Genshi. Just play around with match templates a bit; at the core, the concept is actually pretty simple and consistent.
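As a rough analogy for what the body match template does, the sketch below performs a similar rewrite by hand on an ElementTree document: it finds the matched element, copies its attributes, and wraps its original content in new markup. The `apply_body_layout` function is hypothetical, and Genshi applies match templates to the event stream rather than to a parsed tree like this.

```python
import xml.etree.ElementTree as ET


def apply_body_layout(page):
    """Replace <body> with a wrapped version, loosely like the match template."""
    old_body = page.find("body")                          # path="body"
    # Roughly py:attrs="select('@*')": carry over the matched attributes.
    new_body = ET.Element("body", dict(old_body.attrib))
    # Roughly ${select('*|text()')}: reuse the matched element's content.
    content = ET.SubElement(new_body, "div", {"id": "content"})
    content.text = old_body.text
    content.extend(list(old_body))
    # Markup contributed by the layout itself.
    footer = ET.SubElement(new_body, "div", {"id": "footer"})
    ET.SubElement(footer, "hr")
    # Swap the old element for the new one, in place.
    index = list(page).index(old_body)
    page.remove(old_body)
    page.insert(index, new_body)
    return page


page = ET.fromstring(
    '<html><head><title>News</title></head>'
    '<body class="index"><h1>News</h1><p>Welcome!</p></body></html>'
)
result = ET.tostring(apply_body_layout(page), encoding="unicode")
print(result)
```

The page keeps its class="index" attribute on the body element, its own content ends up inside the content div, and the footer comes from the layout.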
Now we need to update the page templates: they no longer need the header and footer, and we'll have to include the layout.html file so that the match templates are applied. For the inclusion, we add the namespace prefix for XInclude, and an xi:include element.
Let's see how the template should look now for index.html:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:py="http://genshi.edgewall.org/">
  <xi:include href="layout.html" />
  <head>
    <title>News</title>
  </head>
  <body class="index">
    <h1>News</h1>

    <ol py:if="links">
      <li py:for="link in links">
        <a href="${link.url}">${link.title}</a>
        posted by ${link.username} at ${link.time.strftime('%x %X')}
      </li>
    </ol>

    <p><a class="action" href="/submit/">Submit new link</a></p>
  </body>
</html>
Also change the submit.html template analogously, by adding the namespace prefix, the <xi:include> element, and by removing the header and footer <div>s:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:py="http://genshi.edgewall.org/">
  <xi:include href="layout.html" />
  <head>
    <title>Submit new link</title>
  </head>
  <body class="submit">
    <h1>Submit new link</h1>

    <form action="" method="post">
      <table summary=""><tbody><tr>
        <th><label for="username">Your name:</label></th>
        <td>
          <input type="text" id="username" name="username" />
          <span py:if="'username' in errors" class="error">${errors.username}</span>
        </td>
      </tr><tr>
        <th><label for="url">Link URL:</label></th>
        <td>
          <input type="text" id="url" name="url" />
          <span py:if="'url' in errors" class="error">${errors.url}</span>
        </td>
      </tr><tr>
        <th><label for="title">Title:</label></th>
        <td>
          <input type="text" name="title" />
          <span py:if="'title' in errors" class="error">${errors.title}</span>
        </td>
      </tr><tr>
        <td></td>
        <td>
          <input type="submit" value="Submit" />
          <input type="submit" name="cancel" value="Cancel" />
        </td>
      </tr></tbody></table>
    </form>
  </body>
</html>
And speaking of “layout”, you can see that we've added references to some static resources in the layout template: there's an embedded image as well as a linked stylesheet and javascript file. Download those files and put them in your geddit/static/ directory.
When you reload the front page in your browser, you should now see the list of links rendered inside the new layout, with the logo in the header and the stylesheet applied.
Implementing Comments
We're still missing an important bit of functionality: people should be able to comment on submitted links. Three things are needed to implement that:
- a detail view of a link, showing all comments made so far,
- a way to get to that page from the list of links, and,
- a form to add new comments.
Note that on the model side we're covered: there's already a Comment class in geddit.model, and we even have two comments in our database already. We also already have the form that'll be used to validate comment submissions, in the form of the CommentForm class in geddit.form.
So let's add the rest by extending the index.html template to show how many comments there are so far, and make that a link to the detail page. Change your geddit/templates/index.html file to match the following:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:py="http://genshi.edgewall.org/">
  <xi:include href="layout.html" />
  <head>
    <title>News</title>
  </head>
  <body class="index">
    <h1>News</h1>

    <ol py:if="links" class="links">
      <li py:for="link in links">
        <a href="${link.url}">${link.title}</a>
        posted by ${link.username} at ${link.time.strftime('%x %X')}
        <div class="info">
          <a href="${url('/info/%s/' % link.id)}">
            ${len(link.comments)} comments
          </a>
        </div>
      </li>
    </ol>

    <p><a class="action" href="${url('/submit/')}">Submit new link</a></p>
  </body>
</html>
We've added a <div class="info"> for every link in the list, each containing the number of comments, and linking to the detail page.
Of course, if you click on those links, you'll get an error page: we haven't implemented the info() view yet!
Let's do that now. Add the following method to the Root class:
@cherrypy.expose
@template.output('info.html')
def info(self, id):
    # Look up the link by its id; unknown ids produce a 404 response
    link = self.data.get(id)
    if not link:
        raise cherrypy.NotFound()
    return template.render(link=link)
And then add the needed template geddit/templates/info.html with the following content:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:py="http://genshi.edgewall.org/">
  <xi:include href="layout.html" />
  <head>
    <title>${link.title}</title>
  </head>
  <body class="info">
    <h1>${link.title}</h1>
    <a href="${link.url}">${link.url}</a><br />
    posted by ${link.username} at ${link.time.strftime('%x %X')}<br />
    <ul py:if="link.comments" class="comments">
      <li py:for="idx, comment in enumerate(link.comments)" id="comment$idx">
        <strong>${comment.username}</strong> at ${comment.time.strftime('%x %X')}
        <blockquote>${comment.content}</blockquote>
      </li>
    </ul>
    <p><a class="action" href="${url('/comment/%s/' % link.id)}">Add comment</a></p>
  </body>
</html>
At this point you should be able to see the number of comments on the start page, click on that link to get to the details page, where you should see all comments listed for the corresponding link submission. That page also contains a link for submitting additional comments, and that's what we'll need to set up next.
We need to add the method for handling comment submissions to our Root object. It should look like this:
@cherrypy.expose
@template.output('comment.html')
def comment(self, id, cancel=False, **data):
    link = self.data.get(id)
    if not link:
        raise cherrypy.NotFound()
    if cherrypy.request.method == 'POST':
        if cancel:
            raise cherrypy.HTTPRedirect('/info/%s' % link.id)
        form = CommentForm()
        try:
            # Validate the submitted fields, then store the comment
            data = form.to_python(data)
            comment = link.add_comment(**data)
            raise cherrypy.HTTPRedirect('/info/%s' % link.id)
        except Invalid as e:  # "as" requires Python 2.6 or later
            errors = e.unpack_errors()
    else:
        errors = {}
    # Render the form, refilled with whatever the user had entered
    return template.render(link=link, comment=None, errors=errors) | HTMLFormFiller(data=data)
Last but not least, we need the template that renders the comment submission form. Inside geddit/templates, add a file named comment.html, and insert the following content:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:py="http://genshi.edgewall.org/">
  <xi:include href="layout.html" />
  <head>
    <title>Comment on “${link.title}”</title>
  </head>
  <body class="comment">
    <h1>Comment on “${link.title}”</h1>
    <p py:if="comment">
      In reply to <strong>${comment.username}</strong>
      at ${comment.time.strftime('%x %X')}:
      <blockquote>${comment.content}</blockquote>
    </p>
    <form action="" method="post">
      <table summary=""><tbody><tr>
        <th><label for="username">Your name:</label></th>
        <td>
          <input type="text" id="username" name="username" />
          <span py:if="'username' in errors" class="error">${errors.username}</span>
        </td>
      </tr><tr>
        <th><label for="comment">Comment:</label></th>
        <td>
          <textarea id="comment" name="content" rows="6" cols="50"></textarea>
          <span py:if="'content' in errors" class="error"><br />${errors.content}</span>
        </td>
      </tr></tbody></table>
      <div>
        <input type="submit" value="Submit" />
        <input type="submit" name="cancel" value="Cancel" />
      </div>
    </form>
  </body>
</html>
Phew! We should be done with the commenting now. Play around with the application a bit to get a feel for what we've achieved so far. The next section will look into various things that can be done to further improve the application.
Advanced Topics
Adding an Atom Feed
Every web site needs an RSS or Atom feed these days. So we shall provide one too.
Adding Atom feeds to Geddit is fairly straightforward. First, we'll need to add auto-discovery links to the index and detail pages.
Inside the <head> element of geddit/templates/index.html, add:
<link rel="alternate" type="application/atom+xml" title="Geddit" href="${url('/feed/')}" />
And inside the <head> element of geddit/templates/info.html, add:
<link rel="alternate" type="application/atom+xml" title="Geddit: ${link.title}" href="${url('/feed/%s/' % link.id)}" />
Now we need to add the feed() method to our Root class in geddit/controller.py:
@cherrypy.expose
@template.output('index.xml', method='xml')
def feed(self, id=None):
    if id:
        link = self.data.get(id)
        if not link:
            raise cherrypy.NotFound()
        return template.render('info.xml', link=link)
    else:
        links = sorted(self.data.values(), key=operator.attrgetter('time'))
        return template.render(links=links)
Note that this method dispatches to different templates depending on whether the id parameter was provided. So, for the URL /feed/, we'll render the list of links using the template index.xml, and for the URL /feed/{link_id}/, we'll render a link and the list of related comments using the template info.xml.
The templates for this are also pretty simple. First, geddit/templates/index.xml:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:py="http://genshi.edgewall.org/">
  <title>Geddit News</title>
  <!-- Atom expects the feed id as element content, not as an attribute -->
  <id>${url('/')}</id>
  <link rel="alternate" href="${url('/')}" type="text/html"/>
  <link rel="self" href="${url('/feed/')}" type="application/atom+xml"/>
  <!-- links is sorted oldest-first, so the most recent one is links[-1] -->
  <updated>${links[-1].time.isoformat()}</updated>
  <entry py:for="link in reversed(links)">
    <title>${link.url}</title>
    <link rel="alternate" href="${link.url}" type="text/html"/>
    <link rel="via" href="${url('/info/%s/' % link.id)}" type="text/html"/>
    <id>${url('/info/%s/' % link.id)}</id>
    <author>
      <name>${link.username}</name>
    </author>
    <updated>${link.time.isoformat()}</updated>
    <summary>${link.title}</summary>
  </entry>
</feed>
And now, geddit/templates/info.xml:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:py="http://genshi.edgewall.org/">
  <title>Geddit: ${link.title}</title>
  <!-- Atom expects the feed id as element content, not as an attribute -->
  <id>${url('/info/%s/' % link.id)}</id>
  <link rel="alternate" href="${url('/info/%s/' % link.id)}" type="text/html"/>
  <link rel="self" href="${url('/feed/%s/' % link.id)}" type="application/atom+xml"/>
  <updated py:with="time=link.comments and link.comments[-1].time or link.time">
    ${time.isoformat()}
  </updated>
  <entry py:for="idx, comment in enumerate(reversed(link.comments))">
    <title>Comment ${len(link.comments) - idx} on “${link.title}”</title>
    <link rel="alternate" href="${url('/info/%s/' % link.id)}#comment${idx}"
          type="text/html"/>
    <id>${url('/info/%s/' % link.id)}#comment${idx}</id>
    <author>
      <name>${comment.username}</name>
    </author>
    <updated>${comment.time.isoformat()}</updated>
    <content>${comment.content}</content>
  </entry>
</feed>
Voila! We now provide Atom feeds for all our content.
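To see the shape of the feed without running the application, here is a minimal sketch that builds a comparable document with the standard library's xml.etree.ElementTree instead of a Genshi template. The Link tuple, the build_feed() helper, and the base URL are assumptions made up for this illustration; Geddit itself keeps using the templates above. The sketch assumes at least one link, sorted oldest-first as in feed().

```python
import xml.etree.ElementTree as ET
from collections import namedtuple
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical stand-in for geddit.model.Link; only the fields used here.
Link = namedtuple("Link", "id username url title time")


def build_feed(links, base_url):
    """Build an Atom <feed> element from links sorted oldest-first."""
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element("{%s}feed" % ATOM_NS)

    def add(parent, tag, text=None, **attrs):
        # Create a namespaced child element with optional text and attributes.
        elem = ET.SubElement(parent, "{%s}%s" % (ATOM_NS, tag), attrs)
        elem.text = text
        return elem

    add(feed, "title", "Geddit News")
    add(feed, "id", base_url + "/")                 # id is element content
    add(feed, "link", rel="self", href=base_url + "/feed/")
    add(feed, "updated", links[-1].time.isoformat())  # newest link
    for link in reversed(links):                    # newest entry first
        entry = add(feed, "entry")
        add(entry, "title", link.title)
        add(entry, "id", "%s/info/%s/" % (base_url, link.id))
        add(entry, "link", rel="alternate", href=link.url)
        author = add(entry, "author")
        add(author, "name", link.username)
        add(entry, "updated", link.time.isoformat())
    return feed


links = [
    Link("a", "joe", "http://example.org/1", "First",
         datetime(2007, 1, 1, tzinfo=timezone.utc)),
    Link("b", "jane", "http://example.org/2", "Second",
         datetime(2007, 1, 2, tzinfo=timezone.utc)),
]
feed = build_feed(links, "http://localhost:8080")
print(ET.tostring(feed, encoding="unicode"))
```

Note that the timestamps in this sketch carry a timezone, so isoformat() includes a UTC offset; a naive datetime would produce a timestamp without one.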
Ajaxified Commenting
AJAX (Asynchronous Javascript And XML) is all the rage today, and in many ways, it is indeed a helpful technique for improving the usability and responsiveness of web-based applications.
To demonstrate how you'd use Genshi in a project that uses AJAX, let's enhance the Geddit commenting feature to use AJAX. We'll implement this in such a way that the current way comments work remains available, to serve those who don't have Javascript available, and also just to be good web citizens. That approach to using funky new techniques is often referred to as “unobtrusive Javascript”, and what it provides is “graceful degradation.”
Note: technically, what we'll be doing here isn't AJAX in the literal sense, because we'll not be transmitting XML. Instead, we'll respond with simple HTML fragments, a technique that is sometimes referred to as “AJAH” (“H” as in HTML).
We'll go about this in the following way: on a link submission detail page, the “Add comment” button will now load the comment form into the current page, instead of going to the dedicated comment submission page. When the user clicks the “Cancel” button, we simply remove the form from the page. On the other hand, if the user clicks the “Submit” button, we validate the entry, and if it's okay, we remove the form and load the new comment into the list on the page. That means that the user never leaves or reloads the link submission detail page in the process!
The first thing we need to do is to make the comment form available as a fragment, outside of the normally needed HTML skeleton. To do that, we create a new template file, in geddit/templates/_form.html, with the following content:
<form xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/"
      class="comment" action="${url('/comment/%s/' % link.id)}" method="post">
  <table summary=""><tbody><tr>
    <th><label for="username">Your name:</label></th>
    <td>
      <input type="text" id="username" name="username" />
      <span py:if="'username' in errors" class="error">${errors.username}</span>
    </td>
  </tr><tr>
    <th><label for="comment">Comment:</label></th>
    <td>
      <textarea id="comment" name="content" rows="6" cols="50"></textarea>
      <span py:if="'content' in errors" class="error"><br />${errors.content}</span>
    </td>
  </tr><tr>
    <td></td>
    <td>
      <input type="submit" value="Submit" />
      <input type="submit" name="cancel" value="Cancel" />
    </td>
  </tr></tbody></table>
</form>
And as that is the same form as the one used in the geddit/templates/comment.html template, let's replace the markup in that template with an include:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:py="http://genshi.edgewall.org/">
  <xi:include href="layout.html" />
  <head>
    <title>Comment on “${link.title}”</title>
  </head>
  <body class="comment">
    <h1>Comment on “${link.title}”</h1>
    <p py:if="comment">
      In reply to <strong>${comment.username}</strong>
      at ${comment.time.strftime('%x %X')}:
      <blockquote>${comment.content}</blockquote>
    </p>
    <xi:include href="_form.html" />
  </body>
</html>
We'll also need to make the display of an individual comment available as an HTML fragment, so let's factor it out into a separate template file as well.
Add a template called _comment.html to the geddit/templates directory, and insert the following lines:
<li id="comment$num">
  <strong>${comment.username}</strong> at ${comment.time.strftime('%x %X')}
  <blockquote>${comment.content}</blockquote>
</li>
And in geddit/templates/info.html replace the <ul> element rendering the comments with the following:
<ul py:if="link.comments" class="comments">
  <xi:include href="_comment.html"
      py:for="num, comment in enumerate(link.comments)" />
</ul>
Now we'll need to look into modifying the Root.comment() method so that it correctly deals with the AJAX requests we'll be adding.
For convenience, let's add a new small module to our lib package. Inside the geddit/lib directory, create a file named ajax.py, and add the following code to it:
import cherrypy

def is_xhr():
    requested_with = cherrypy.request.headers.get('X-Requested-With')
    return requested_with and requested_with.lower() == 'xmlhttprequest'
This checks whether the current request originates from usage of AJAX (technically, the XMLHttpRequest Javascript object), based on a convention commonly used in Javascript libraries to add the special HTTP header “X-Requested-With: XMLHttpRequest” to all requests.
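To try the header check outside of a running CherryPy request, the following sketch applies the same convention to a plain dictionary. The headers parameter is an assumption introduced for this illustration (the real helper reads cherrypy.request.headers), and unlike the helper above, which returns None when the header is absent, this variant always returns a boolean.

```python
def is_xhr(headers):
    """Return True if the headers mark the request as an XMLHttpRequest."""
    # HTTP header names are case-insensitive, so normalize the keys first.
    normalized = {name.lower(): value for name, value in headers.items()}
    requested_with = normalized.get("x-requested-with", "")
    return requested_with.strip().lower() == "xmlhttprequest"


print(is_xhr({"X-Requested-With": "XMLHttpRequest"}))
print(is_xhr({"Accept": "text/html"}))
```

Keep in mind that this header is only a convention: any client can send or omit it, so it should decide how a response is presented, never whether a request is trusted.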
Add an import of that module to the top of the geddit/controller.py file, replacing:
from geddit.lib import template
with:
from geddit.lib import ajax, template
Then, replace the Root.comment() method in geddit/controller.py with the following code:
@cherrypy.expose
@template.output('comment.html')
def comment(self, id, cancel=False, **data):
    link = self.data.get(id)
    if not link:
        raise cherrypy.NotFound()
    if cherrypy.request.method == 'POST':
        if cancel:
            raise cherrypy.HTTPRedirect('/info/%s' % link.id)
        form = CommentForm()
        try:
            data = form.to_python(data)
            comment = link.add_comment(**data)
            if not ajax.is_xhr():
                raise cherrypy.HTTPRedirect('/info/%s' % link.id)
            # AJAX request: respond with just the new comment as a fragment
            return template.render('_comment.html', comment=comment,
                                   num=len(link.comments))
        except Invalid as e:  # "as" requires Python 2.6 or later
            errors = e.unpack_errors()
    else:
        errors = {}
    if ajax.is_xhr():
        # AJAX request: respond with the bare form, without the page skeleton
        stream = template.render('_form.html', link=link, errors=errors)
    else:
        stream = template.render(link=link, comment=None, errors=errors)
    return stream | HTMLFormFiller(data=data)
There's another small detail we'll need to take care of: in our @template.output decorator, we're automatically adding a <!DOCTYPE> declaration to any template output stream that is being serialized to HTML. For AJAX responses containing HTML fragments, we don't really want to add any kind of DOCTYPE, so we'll need to adjust the implementation of the decorator.
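The branching in comment() can be summarized as a small decision table. The sketch below is only an illustration of that logic: plan_response() and its string labels are hypothetical names, and nothing in Geddit calls it.

```python
def plan_response(method, cancel, valid, xhr):
    """Return a label for what comment() sends back in each situation."""
    if method == "POST":
        if cancel:
            return "redirect"  # back to the detail page
        if valid:
            # A saved comment: fragment for AJAX, redirect otherwise.
            return "comment fragment" if xhr else "redirect"
    # GET, or a POST that failed validation: show the form (again).
    return "form fragment" if xhr else "full page"


for case in [("GET", False, False, True), ("POST", False, True, True),
             ("POST", False, False, False), ("POST", True, False, True)]:
    print(case, "->", plan_response(*case))
```

The non-AJAX column of this table is exactly the behavior of the earlier version of the method, which is what keeps the page usable without Javascript.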
To do that, first add an import of our geddit/lib/ajax.py file to the geddit/lib/template.py file:
from geddit.lib import ajax
Then, replace the implementation of the output() function with the following:
def output(filename, method='html', encoding='utf-8', **options):
    """Decorator for exposed methods to specify what template they should use
    for rendering, and which serialization method and options should be
    applied.
    """
    def decorate(func):
        def wrapper(*args, **kwargs):
            cherrypy.thread_data.template = loader.load(filename)
            opt = options.copy()
            # Only full HTML pages get a DOCTYPE; AJAX fragments do not
            if not ajax.is_xhr() and method == 'html':
                opt.setdefault('doctype', 'html')
            serializer = get_serializer(method, **opt)
            stream = func(*args, **kwargs)
            if not isinstance(stream, Stream):
                return stream
            return encode(serializer(stream), method=serializer,
                          encoding=encoding)
        return wrapper
    return decorate
Note how we're now only adding the doctype='html' serialization option when we're not handling an AJAX request.
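The option handling can be looked at in isolation. In the following sketch, serializer_options() is a hypothetical helper that mirrors the three relevant lines of wrapper(); it takes the AJAX flag as a parameter instead of calling ajax.is_xhr().

```python
def serializer_options(method, xhr, **options):
    """Return the serializer options output() would use for one request."""
    opt = dict(options)  # copy, so the decorator's own options stay untouched
    if method == "html" and not xhr:
        # setdefault leaves an explicitly passed doctype alone
        opt.setdefault("doctype", "html")
    return opt


print(serializer_options("html", xhr=False))
print(serializer_options("html", xhr=True))
print(serializer_options("html", xhr=False, doctype="xhtml"))
```

Copying the options on every call matters because the options dictionary is shared by all requests that go through the same decorated method.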
Finally, we need to add the actual Javascript logic needed to orchestrate all this. Add the following code at the bottom of the <head> element in the geddit/templates/info.html template:
<script type="text/javascript">
  function loadCommentForm(a) {
    $.get("${url('/comment/%s/' % link.id)}", {}, function(html) {
      var form = a.hide().parent().after(html).next();
      function closeForm() {
        form.slideUp("fast", function() { a.fadeIn(); form.remove() });
        return false;
      }
      function initForm() {
        form.find("input[@name='cancel']").click(closeForm);
        form.submit(function() {
          var data = form.find("input[@type='text'], textarea").serialize();
          $.post("${url('/comment/%s/' % link.id)}", data, function(html) {
            var elem = $(html).get(0);
            if (/form/i.test(elem.tagName)) {
              form.after(elem).remove();
              form = $(elem);
              initForm();
            } else {
              if ($("ul.comments").length == 0) {
                a.parent().before('<ul class="comments"></ul>');
              }
              $("ul.comments").append(elem);
              closeForm();
            }
          });
          return false;
        });
      }
      initForm();
    });
  }
  $(document).ready(function() {
    $("a.action").click(function() {
      loadCommentForm($(this));
      return false;
    });
  });
</script>
This Javascript snippet uses jQuery (via the jquery.js file you've already added to your geddit/static directory). We won't go into the details of the script here; suffice it to say that it implements our goals in a fairly lightweight manner. For a nice introduction to jQuery, see Simon Willison's blog post jQuery for JavaScript programmers.
Now, when you click on the “Add comment” link on the link submission detail page, with Javascript enabled, you should see the comment form appear on the same page:
Allowing Markup in Comments
At this point we allow users to post plain text comments, but those comments can't include niceties such as hyperlinks or HTML inline formatting (emphasis, etc). A very naive application would simply accept HTML tags in the input, and pass those tags through to the output. That is generally a bad thing, however, as it opens up your site to cross-site scripting (XSS) attacks, which can undermine any security measures you try to put into effect (including SSL). And because this is generally not the behavior you want, Genshi XML-escapes everything by default, which makes it safe to include in (X)HTML output.
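To make concrete what XML-escaping does to a comment, here is a small sketch using the standard library's html.escape() as a stand-in. Genshi performs its own escaping during serialization; this only illustrates the effect on the text.

```python
from html import escape

comment = '<script>alert("xss")</script> & more'

# Replace the characters that have special meaning in markup with entities.
escaped = escape(comment, quote=False)
print(escaped)
```

The browser then displays the tags as literal text instead of interpreting them, which is why the script never runs.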
(Note that as Geddit allows anyone to do anything, we don't actually have any valuable assets to protect, so this exercise is somewhat theoretical. For the rest of this section, just imagine we required users to register and login to submit links or post comments.)
So what we want to do in this section is to allow users to include HTML tags in their comments, but do so in a safe manner. We do not want to enable malicious users to include Javascript code, or CSS styles that turn the whole page black, or other things that may be considered harmful. In other words, we need to “sanitize” the markup in the comments.
But let's ignore that aspect for now, and start by making Genshi not escape HTML tags in comments. We'll start by editing geddit/templates/_comment.html:
<?python
  from genshi import HTML
?>
<li id="comment$num">
  <strong>${comment.username}</strong> at ${comment.time.strftime('%x %X')}
  <blockquote>${HTML(comment.content)}</blockquote>
</li>
Here, we've added an import for the Genshi HTML() function. This is done using a Python code block via the <?python ?> processing instruction. We've already seen that we can use complex Python expressions in templates. By using the <?python ?> processing instruction, we can embed any Python statements directly in the template, for example to define classes or functions. In this case we simply import a function that we need to use.
The HTML() function parses a snippet of HTML and returns a Genshi markup stream. It tries to do this in a way that invalid HTML is corrected (for example by fixing the nesting of tags). We then use that function to render the content of the comment. So what does this do, exactly? Well, the comment text is parsed using an HTML parser, fixed up if necessary (and possible), and injected into the template as a markup stream. A template expression that evaluates to a markup stream is treated differently than other data types: it is injected directly into the template output stream, effectively resulting in tags not getting escaped.
Note: Genshi also provides the genshi.core.Markup class, which is just a special string class that flags its content as safe for being included in HTML/XML output for Genshi. So instead of wrapping the comment text inside a call to the HTML() function, you could also use Markup(comment.content). That would avoid the reparsing of the content, but at the cost of that content not being subject to stream filters and different serialization methods. In a nutshell, using Markup is not recommended unless you really know what you're doing.
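The idea of a string type that is flagged as safe can be illustrated in a few lines. This is not Genshi's implementation: SafeText and render_value() are hypothetical names, and html.escape() stands in for Genshi's own escaping.

```python
from html import escape


class SafeText(str):
    """Hypothetical marker type: a string declared safe for HTML output."""


def render_value(value):
    """Escape ordinary strings, but pass SafeText through unchanged."""
    if isinstance(value, SafeText):
        return str(value)
    return escape(str(value), quote=False)


print(render_value("<b>hi</b>"))
print(render_value(SafeText("<b>hi</b>")))
```

The sketch also shows why such a marker is risky: whoever wraps a string in it is vouching for content that nothing else will check.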
So at this point our users can include HTML tags in their comments, and the comments will be rendered as HTML. But as noted above, that approach is very dangerous for most real-world applications, so we've got more work to do: we need to sanitize the markup in the comment so that only markup that can be considered safe is let through. Genshi provides a stream filter to help us here: HTMLSanitizer.
To add sanitization, first add the imports for the HTML function and the HTMLSanitizer filter to geddit/controller.py, so that the imports at the top of that file look something like this:
import cherrypy
from formencode import Invalid
from genshi.input import HTML
from genshi.filters import HTMLFormFiller, HTMLSanitizer
Then we'll update the Root.comment() method so that it sanitizes comments as they are submitted:
@cherrypy.expose
@template.output('comment.html')
def comment(self, id, cancel=False, **data):
    link = self.data.get(id)
    if not link:
        raise cherrypy.NotFound()
    if cherrypy.request.method == 'POST':
        if cancel:
            raise cherrypy.HTTPRedirect('/info/%s' % link.id)
        form = CommentForm()
        try:
            data = form.to_python(data)
            # Parse the comment, strip unsafe markup, and store it as XHTML
            markup = HTML(data['content']) | HTMLSanitizer()
            data['content'] = markup.render('xhtml')
            comment = link.add_comment(**data)
            if not ajax.is_xhr():
                raise cherrypy.HTTPRedirect('/info/%s' % link.id)
            return template.render('_comment.html', comment=comment,
                                   num=len(link.comments))
        except Invalid as e:  # "as" requires Python 2.6 or later
            errors = e.unpack_errors()
    else:
        errors = {}
    if ajax.is_xhr():
        stream = template.render('_form.html', link=link, errors=errors)
    else:
        stream = template.render(link=link, comment=None, errors=errors)
    return stream | HTMLFormFiller(data=data)
We've just added two lines here, namely:
markup = HTML(data['content']) | HTMLSanitizer()
data['content'] = markup.render('xhtml')
This parses the comment text, runs it through the sanitizer, and serializes it to XHTML. And the result of the transformation is what we'll save to our “database”. We use XHTML here just because that can be processed by a wider variety of tools. For the purposes of this tutorial we could just as well be storing the content using HTML serialization, because Genshi can handle both.
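As a small illustration of the "wider variety of tools" point, the sketch below feeds the same snippet, once serialized as XHTML and once as HTML, to the standard library's XML parser. The is_well_formed() helper and the sample strings are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

xhtml = "<p>first line<br />second line</p>"
html = "<p>first line<br>second line</p>"


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup as a document."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(xhtml))
print(is_well_formed(html))
```

A stored comment is a fragment that may consist of bare text or several top-level elements, so a real tool would need to wrap it in a container element before parsing it this way.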
Note: this is just one way to add sanitization. Another equally valid approach would be to store comment submissions exactly how they were entered, and sanitize them when they are displayed. Or you could have two fields in the model: one to store the text as originally submitted, and the other to store the sanitized content ready for display. Or, if you were really paranoid, you'd sanitize both the input and the output. Which method you choose depends on the needs of your particular application.
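The two-field variant mentioned in this note could look roughly like the following sketch. StoredComment and placeholder_sanitize() are hypothetical names, not part of geddit.model, and the placeholder simply escapes everything; in Geddit the sanitize callable would wrap the HTML() and HTMLSanitizer pipeline shown earlier.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from html import escape


def placeholder_sanitize(text):
    """Stand-in for a real sanitizer: escapes everything (assumption)."""
    return escape(text, quote=False)


@dataclass
class StoredComment:
    username: str
    raw: str             # exactly what the user typed
    sanitized: str = ""  # display version, always derived from raw
    time: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def resanitize(self, sanitize=placeholder_sanitize):
        """Recompute the display text, e.g. after the sanitizer improved."""
        self.sanitized = sanitize(self.raw)
        return self.sanitized


comment = StoredComment("joe", "<b>hi</b>")
comment.resanitize()
print(comment.raw, "->", comment.sanitized)
```

Keeping the original text around means a stricter or more permissive sanitizer can be applied to old comments later, at the cost of storing every comment twice.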
You may want to try performing some XSS attacks by including malicious HTML markup in comments. Try some of the methods shown on the XSS Cheat Sheet. You should not be able to get past the sanitizer; if you are, please let us know.
We're almost done—the only remaining task is to update the Atom feed so that it, too, includes the user-submitted HTML tags as markup, instead of as escaped text. Open geddit/templates/info.xml, and update it to look as follows:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:py="http://genshi.edgewall.org/">
  <title>Geddit: ${link.title}</title>
  <!-- Atom expects the feed id as element content, not as an attribute -->
  <id>${url('/info/%s/' % link.id)}</id>
  <link rel="alternate" href="${url('/info/%s/' % link.id)}" type="text/html"/>
  <link rel="self" href="${url('/feed/%s/' % link.id)}" type="application/atom+xml"/>
  <updated py:with="time=link.comments and link.comments[-1].time or link.time">
    ${time.isoformat()}
  </updated>
  <?python from genshi import HTML ?>
  <entry py:for="idx, comment in enumerate(reversed(link.comments))">
    <title>Comment ${len(link.comments) - idx} on “${link.title}”</title>
    <link rel="alternate" href="${url('/info/%s/' % link.id)}#comment${idx}"
          type="text/html"/>
    <id>${url('/info/%s/' % link.id)}#comment${idx}</id>
    <author>
      <name>${comment.username}</name>
    </author>
    <updated>${comment.time.isoformat()}</updated>
    <content type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">
      ${HTML(comment.content)}
    </div></content>
  </entry>
</feed>
Just like above, we've added the import of the Genshi HTML() function. On the <content> element we've added the type="xhtml" attribute, and we've added a wrapper <div> inside that element to declare the XHTML namespace. Finally, inside that <div>, we inject the comment text as an HTML-parsed stream, analogous to what we've done in the HTML template.
Summary
This brings the tutorial to a close. We've demonstrated how you would generally use Genshi in a small Python web application. We've shown some best practices and recipes for making effective use of the features Genshi provides.
You can check out the complete code for this tutorial here:
http://svn.edgewall.org/repos/genshi/trunk/examples/tutorial
If you like the application we've built here and would like to experiment with further enhancements, feel free to do so. Here are a couple of ideas:
- Add authentication, preferably based on OpenID (Python libraries for OpenID are available).
- Internationalize the application, using Genshi's builtin I18n support and Babel. DONE
- Add protection against cross-site request forgery (CSRF) attacks, using the Transformer filter to inject form tokens in HTML forms.
- Add voting on links. See the Vote to Promote pattern in Yahoo's “Design Pattern Library” for some inspiration.
- Add comment threading, so that people can reply to comments, and comments and replies are displayed in a hierarchical manner.
- Add support for the Atom Publishing Protocol. See http://bitworking.org/projects/atom/
- (your idea here)
Thanks for reading, we hope the tutorial has been useful!