= Genshi Tutorial =

This tutorial introduces the use of Genshi in a web application, and presents common patterns and best practices. It is aimed at developers new to Genshi as well as those who've already used Genshi, but are looking for advice or inspiration on how to improve that usage.

== Introduction ==

In this tutorial we'll create a simple Python web application based on [http://cherrpy.org/ CherryPy 3]. !CherryPy was chosen because it provides a convenient level of abstraction over raw CGI or [http://wsgi.org/wsgi WSGI] development, but is less ambitious than full-stack web frameworks such as [http://pylonshq.com/ Pylons] or [http://www.djangoproject.com/ Django], which tend to come with a preferred templating language, and often show significant bias towards that language.

The application we'll build here is a stripped-down version of sites such as [http://reddit.com/ reddit] or [http://digg.com/ digg]: it lets users submit links to online articles they find interesting, and then lets other users comment on those stories. Just for kicks, we'll call that application '''Geddit?'''.

We'll keep the project as simple as possible, while still showing many of Genshi's features and how to best use them:
 * For persistence, we'll use native Python object serialization (via the `pickle` module), instead of an SQL database and an ORM.
 * There's no authentication of any kind. Anyone can submit links, anyone can comment.
 * We'll start with the basics (rendering templates, handling forms, etc.), and then continue by adding features such as Atom feeds and an AJAX interface.

[[PageOutline(2-3, Content, inline)]]

== Getting Started ==

=== Prerequisites ===

First, make sure you have !CherryPy 3.0.x installed, as well as recent versions of [http://formencode.org/ FormEncode] and, of course, Genshi.
You can download and install those manually, or just use [http://peak.telecommunity.com/DevCenter/EasyInstall easy_install]:

{{{
$ easy_install CherryPy
$ easy_install FormEncode
$ easy_install Genshi
}}}

=== The !CherryPy Application ===

Next, set up the basic !CherryPy application.

 1. Create a directory that will contain the application.
 2. Inside that directory, create a Python package named `geddit` by doing the following:
   * Create a `geddit` directory
   * Create an empty file called `__init__.py` inside the `geddit` directory
 3. Inside the `geddit` package directory, create a file called `controller.py` with the following content:

{{{
#!python
#!/usr/bin/env python

import operator, os, pickle, sys

import cherrypy


class Root(object):

    def __init__(self, data):
        self.data = data

    @cherrypy.expose
    def index(self):
        return 'Geddit'


def main(filename):
    data = {} # We'll replace this later

    # Some global configuration; note that this could be moved into a
    # configuration file
    cherrypy.config.update({
        'request.throw_errors': True,
        'tools.encode.on': True,
        'tools.encode.encoding': 'utf-8',
        'tools.decode.on': True,
        'tools.trailing_slash.on': True,
        'tools.staticdir.root': os.path.abspath(os.path.dirname(__file__)),
    })

    cherrypy.quickstart(Root(data), '/', {
        '/media': {
            'tools.staticdir.on': True,
            'tools.staticdir.dir': 'static'
        }
    })

if __name__ == '__main__':
    main(sys.argv[1])
}}}

Enter the tutorial directory in the terminal, and run:

{{{
$ PYTHONPATH=. python geddit/controller.py geddit.db
}}}

You should see a log message pointing you to the URL where the application is being served, which is usually http://localhost:8080/. Visiting that page returns just the string “Geddit”, as that's what the `index()` method of the `Root` object returns.

Note that we've configured !CherryPy to serve static files from the `geddit/static` directory. !CherryPy will complain that that directory does not exist, so create it, but leave it empty for now.
We'll add static resources later on in the tutorial. === Basic Template Rendering === So far the code doesn't actually use Genshi, or even any kind of templating. Let's change that. Inside of the `geddit` directory, create a directory called `templates`, and inside that directory create a file called `index.html`, with the following content: {{{ #!genshi $title

Welcome!

}}}

This is basically an almost static XHTML file with some simple variable substitution: the string `$title` will be replaced by a variable of that name that we'll pass into the template from the controller.

There are a couple of important things to point out here:
 * Variables substituted into templates, such as `$title` in our example, can be of any Python data type. Genshi will convert the value to a string and insert the result into the generated output stream.
 * You generally do not need to worry about XML-escaping such variables. Genshi will automatically take care of that when the template is serialized. We'll look into the details of this process later.
 * The template will be parsed by Genshi using an XML parser, which means that '''it needs to be well-formed XML'''. If you know HTML but are unfamiliar with XML/XHTML, you will need to read up on the topic. Here are a couple of good references:
   * [http://www.w3schools.com/xhtml/xhtml_html.asp Differences Between XHTML And HTML] at W3Schools
   * [http://www.sitepoint.com/article/xhtml-introduction XHTML - An Introduction] at !SitePoint
   * [http://www.webmonkey.com/00/50/index2a.html XHTML Overview] at Webmonkey
 * Just because the template uses XHTML does not mean that our web application should generate XHTML! While that is possible, you can also choose to generate good old HTML 4.01, because, despite all the hype, that's still the format that works best in most browsers (see [http://webkit.org/blog/?p=68 this blog post] over at Surfin' Safari for some background).

We now need to change the controller code so that this template is used.
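Before making that change, a brief aside on the escaping point above. The following is a minimal stdlib-only sketch, not Genshi code: it uses `string.Template` and `xml.sax.saxutils.escape` as stand-ins to show what substituting an escaped value into markup amounts to. The `PAGE` template and the `render_title` helper are hypothetical names used only for illustration; Genshi does the equivalent for you when the template is serialized.

```python
from string import Template
from xml.sax.saxutils import escape

# Hypothetical stand-in for a template with a single $title placeholder.
PAGE = Template('<h1>$title</h1>')


def render_title(title):
    # Convert any Python value to a string, then escape markup characters
    # (&, <, >) so the value cannot break out of the surrounding element.
    return PAGE.substitute(title=escape(str(title)))


print(render_title('Geddit'))
print(render_title('<b>Tom & Jerry</b>'))
print(render_title(42))
```

The second call is the interesting one: the angle brackets and the ampersand end up as character entities in the output instead of being interpreted as markup.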
First, add the Genshi `TemplateLoader` to the imports at the top of the `geddit/controller.py` file, and instantiate a loader for the `geddit/templates` directory:

{{{
#!python
import cherrypy
from genshi.template import TemplateLoader

loader = TemplateLoader(
    os.path.join(os.path.dirname(__file__), 'templates'),
    auto_reload=True
)
}}}

Next, change the implementation of the `index()` method of the `Root` class to look like this:

{{{
#!python
    @cherrypy.expose
    def index(self):
        tmpl = loader.load('index.html')
        return tmpl.generate(title='Geddit').render('html', doctype='html')
}}}

This asks the template loader for a template named `index.html`, generates the output stream, and finally serializes the output to HTML. When you now reload the page in your browser, you should get back the following HTML response:

{{{
#!xml
Geddit

Welcome!

}}}

Note that the output has some subtle differences compared to the template, even beyond the variable substitution that has taken place: Genshi has added a proper HTML `DOCTYPE` (important to get the browser to render using standards mode, through a commonly employed mechanism in web browsers called “[http://www.ericmeyeroncss.com/bonus/render-mode.html doctype switching]”). It has also removed the XHTML namespace declaration, because we're rendering to HTML, and HTML doesn't support XML namespaces. And the `<hr>` element in the footer is missing the trailing slash that can be used in XML markup for empty elements; HTML user agents know that `<hr>` is always an empty element, and including either the trailing slash, or even adding an explicit `</hr>` end tag would be invalid or potentially confuse some browsers.

=== The Data Model ===

To continue, we'll need to first add some Python classes to define the data model the application will use. As mentioned above, we're using a simple pickle file for persistence, so all we need to do here is create a couple of very simple Python classes.

[[Image(model.png)]]

Inside the `geddit` directory, create a file named `model.py`, with the following content:

{{{
#!python
from datetime import datetime


class Link(object):

    def __init__(self, username, url, title):
        self.username = username
        self.url = url
        self.title = title
        self.time = datetime.utcnow()
        self.id = hex(hash(tuple([username, url, title, self.time])))[2:]
        self.comments = []

    def __repr__(self):
        return '<%s %r>' % (type(self).__name__, self.title)

    def add_comment(self, username, content):
        comment = Comment(username, content)
        self.comments.append(comment)
        return comment


class Comment(object):

    def __init__(self, username, content):
        self.username = username
        self.content = content
        self.time = datetime.utcnow()

    def __repr__(self):
        return '<%s by %r>' % (type(self).__name__, self.username)
}}}

You'll need to import those classes in `geddit/controller.py`, just below the other imports:

{{{
#!python
from geddit.model import Link, Comment
}}}

And in the `main()` function, let's replace the placeholder `data = {}` code with some code to read our data from the pickle file, and write it back:

{{{
#!python
def main(filename):
    # load data from the pickle file, or initialize it to an empty dict
    if os.path.exists(filename):
        fileobj = open(filename, 'rb')
        try:
            data = pickle.load(fileobj)
        finally:
            fileobj.close()
    else:
        data = {}

    def _save_data():
        # save data back to the pickle file
        fileobj = open(filename, 'wb')
        try:
            pickle.dump(data, fileobj)
        finally:
            fileobj.close()
    cherrypy.engine.on_stop_engine_list.append(_save_data)
}}}

Now let's
add some initial content to our “database”.

'''Note: You'll need to stop the !CherryPy server to do the following, otherwise your changes will get overwritten'''.

In the terminal, from the tutorial directory, launch the interactive Python shell by executing `PYTHONPATH=. python`, and enter the following code:

{{{
#!pycon
>>> from geddit.model import *
>>> link1 = Link(username='joe', url='http://example.org/', title='An example')
>>> link1.add_comment(username='jack', content='Bla bla bla')
>>> link1.add_comment(username='joe', content='Bla bla bla, bla bla.')
>>> link2 = Link(username='annie', url='http://reddit.com/', title='The real thing')
>>> import pickle
>>> pickle.dump({link1.id: link1, link2.id: link2}, open('geddit.db', 'wb'))
>>> ^D  # Control-Z (^Z) on Windows
}}}

You should now have two links in the pickle file, with the first link having two comments. Start the !CherryPy server again by running:

{{{
$ PYTHONPATH=. python geddit/controller.py geddit.db
}}}

== Making the Application “Do Stuff” ==

=== Extending the Template ===

Now let's change the `Root.index()` method in `geddit/controller.py` to pass the links list to the template:

{{{
#!python
    @cherrypy.expose
    def index(self):
        links = sorted(self.data.values(), key=operator.attrgetter('time'))

        tmpl = loader.load('index.html')
        stream = tmpl.generate(links=links)
        return stream.render('html', doctype='html')
}}}

And finally, we'll modify the `index.html` template so that it displays the links in a simple ordered list. While we're at it, let's add a link to submit new items:

{{{
#!genshi
Geddit: News
  1. ${link.title} posted by ${link.username} at ${link.time.strftime('%x %X')}

Submit new link

}}}

This template demonstrates some aspects of Genshi that we've not seen so far:
 * We declare the `py:` namespace prefix on the `<html>` element, which is required to be able to add [wiki:Documentation/xml-templates.html#template-directives directives] to the template.
 * There's a `py:if` [wiki:Documentation/xml-templates.html#conditional-sections condition] on the `<ol>` element. That means that the `<ol>` and everything it contains will only be included in the output stream if the expression `links` evaluates to a truth value. In this case we know that `links` is a list (assembled by the `Root.index()` method), so if the list is empty, the `<ol>` will be skipped.
 * Next up, we've attached a `py:for` [wiki:Documentation/xml-templates.html#looping loop] to the `<li>` element. As a result, the `<li>` element is repeated for every item in the `links` list. The `link` variable is bound to the current item in the list on every step.
 * We can also use more complex expressions than just simple variable substitutions: directives such as `py:if` and `py:for` take Python expressions of any complexity, and you can include Python expressions in other places by putting them inside curly braces prefixed with a dollar sign (`${...}`).

When you reload the page in the browser, you should get something like this:

[[Image(tutorial01.png)]]

=== Adding a Submission Form ===

In the previous step, we've already added a link to a submission form to the template, but we haven't implemented the logic to handle requests to that link yet. To do that, we need to add a method to the `Root` class in `geddit/controller.py`:

{{{
#!python
    @cherrypy.expose
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            # TODO: validate the input data!
            link = Link(**data)
            self.data[link.id] = link
            raise cherrypy.HTTPRedirect('/')

        tmpl = loader.load('submit.html')
        stream = tmpl.generate()
        return stream.render('html', doctype='html')
}}}

And of course we'll need to add a template to display the submission form. In `geddit/templates`, create a file named `submit.html`, with the following content:

{{{
#!genshi
Geddit: Submit new link
}}}

Now, if you click on the “Submit new link” link on the start page, you should see the submission form. Filling out the form and clicking “Submit” will post a new link and take you to the start page. Clicking on the “Cancel” button will take you back to the start page, but not add a link.

Please note though that we're not performing ''any'' kind of validation on the input, and that's of course a bad thing. So let's add validation next.

=== Adding Form Validation ===

We'll use [http://formencode.org/ FormEncode] to do the validation, but we'll keep it all fairly basic. Let's declare our form in a separate file, namely `geddit/form.py`, which will have the following content:

{{{
#!python
from formencode import Schema, validators


class LinkForm(Schema):
    username = validators.UnicodeString(not_empty=True)
    url = validators.URL(not_empty=True, add_http=True, check_exists=False)
    title = validators.UnicodeString(not_empty=True)


class CommentForm(Schema):
    username = validators.UnicodeString(not_empty=True)
    content = validators.UnicodeString(not_empty=True)
}}}

Now let's use those in the `Root.submit()` method.
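Before doing so, it may help to see the contract the controller will rely on: either the submitted values come back converted, or an exception is raised that carries one message per invalid field. The stdlib-only sketch below imitates that contract. It is not !FormEncode's implementation, and `SketchInvalid`, `validate_link`, and the error message text are hypothetical names chosen for illustration.

```python
class SketchInvalid(Exception):
    """Hypothetical stand-in for a validation error with per-field messages."""

    def __init__(self, errors):
        Exception.__init__(self, 'invalid input')
        self.errors = errors


def validate_link(data):
    """Return cleaned values, or raise SketchInvalid listing every bad field."""
    cleaned, errors = {}, {}
    for field in ('username', 'url', 'title'):
        value = (data.get(field) or '').strip()
        if not value:
            # Mirrors the idea of a "not empty" rule on every field.
            errors[field] = 'Please enter a value'
        else:
            cleaned[field] = value
    url = cleaned.get('url')
    if url and '://' not in url:
        # Assumption: a missing scheme is filled in with http://, which is
        # roughly what an "add_http" style option suggests.
        cleaned['url'] = 'http://' + url
    if errors:
        raise SketchInvalid(errors)
    return cleaned


try:
    validate_link({'username': 'joe', 'url': 'example.org', 'title': ''})
except SketchInvalid as e:
    # Only the fields that failed show up in the error dictionary.
    print(sorted(e.errors))
```

A controller built on this contract only needs a `try`/`except` around the conversion and a dictionary of errors to hand to the template.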
First add the form classes, as well as the `Invalid` exception type used by !FormEncode, to the imports at the top of `geddit/controller.py`, which should then look something like this:

{{{
#!python
import cherrypy
from formencode import Invalid
from genshi.template import TemplateLoader

from geddit.form import LinkForm, CommentForm
from geddit.model import Link, Comment
}}}

Then, update the `submit()` method to match the following:

{{{
#!python
    @cherrypy.expose
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            form = LinkForm()
            try:
                data = form.to_python(data)
                link = Link(**data)
                self.data[link.id] = link
                raise cherrypy.HTTPRedirect('/')
            except Invalid as e:
                errors = e.unpack_errors()
        else:
            errors = {}

        tmpl = loader.load('submit.html')
        stream = tmpl.generate(errors=errors)
        return stream.render('html', doctype='html')
}}}

As you can tell, we now only add the submitted link to our database when validation is successful: all fields need to be filled out, and the `url` field needs to contain a valid URL. If the submission is valid, we proceed as before. If it is not valid, we render the submission form template again, passing it the dictionary of validation errors. Let's modify the `submit.html` template so that it displays those error messages:

{{{
#!genshi
Geddit: Submit new link
        ${errors.username}
        ${errors.url}
        ${errors.title}
}}}

So now, if you submit the form without entering a title, and having entered an invalid URL, you'd see something like the following:

[[Image(tutorial02.png)]]

But there's a problem here: note how the input values have vanished from the form! We'd have to repopulate the form manually from the data submitted so far. We could do that by adding the required `value=""` attributes to the text fields in the template, but Genshi provides a more elegant way: the [wiki:Documentation/filters.html#html-form-filler HTMLFormFiller] stream filter. Given a dictionary of values, it can automatically populate HTML forms in the template output stream.

To enable this functionality, first you'll need to add the following import to the `geddit/controller.py` file:

{{{
#!python
from genshi.filters import HTMLFormFiller
}}}

Next, update the bottom lines of the `Root.submit()` method implementation so that they look as follows:

{{{
#!python
        tmpl = loader.load('submit.html')
        stream = tmpl.generate(errors=errors) | HTMLFormFiller(data=data)
        return stream.render('html', doctype='html')
}}}

Now, all entered values are preserved when validation errors occur. Note that the form is populated as the template is being generated; there is no reparsing and reserialization of the output.

== Improving the Application ==

=== Factoring out the Templating ===

By now, we already have some repetitive code when it comes to rendering templates: both the `Root.index()` and the `Root.submit()` methods look very similar in that regard: they load a specific template, call its `generate()` method passing it some data, and then call the `render()` method of the resulting stream. As we're going to be adding more controller methods, let's factor out those things into a library module.

There's a special challenge here, though: we still want to be able to add the `HTMLFormFiller` or other stream filters to the template output stream, which needs to be done before that output stream is serialized.
We'll use a combination of a decorator and a regular function to achieve that; the two collaborate through the !CherryPy thread-local context.

Create a directory called `lib` inside the `geddit` directory, and inside the `lib` directory create two files, named `__init__.py` and `template.py`, respectively. Leave the first one empty, and in the second one, insert the following code:

{{{
#!python
import os

import cherrypy
from genshi.core import Stream
from genshi.output import encode, get_serializer
from genshi.template import Context, TemplateLoader

loader = TemplateLoader(
    os.path.join(os.path.dirname(__file__), '..', 'templates'),
    auto_reload=True
)

def output(filename, method='html', encoding='utf-8', **options):
    """Decorator for exposed methods to specify what template they should use
    for rendering, and which serialization method and options should be
    applied.
    """
    def decorate(func):
        def wrapper(*args, **kwargs):
            cherrypy.thread_data.template = loader.load(filename)
            opt = options.copy()
            if method == 'html':
                opt.setdefault('doctype', 'html')
            serializer = get_serializer(method, **opt)
            stream = func(*args, **kwargs)
            if not isinstance(stream, Stream):
                return stream
            return encode(serializer(stream), method=serializer,
                          encoding=encoding)
        return wrapper
    return decorate

def render(*args, **kwargs):
    """Function to render the given data to the template specified via the
    ``@output`` decorator.
    """
    if args:
        assert len(args) == 1, \
            'Expected exactly one argument, but got %r' % (args,)
        template = loader.load(args[0])
    else:
        template = cherrypy.thread_data.template
    ctxt = Context(url=cherrypy.url)
    ctxt.push(kwargs)
    return template.generate(ctxt)
}}}

In the `geddit/controller.py` file, you can now remove the `from genshi.template import TemplateLoader` line, and also the instantiation of the `TemplateLoader`, as that is now done in our new library module.
In `geddit/controller.py`, import the new library module in place of the removed `TemplateLoader` import:

{{{
#!python
from geddit.lib import template
}}}

Now, we can change the `Root` class to match the following:

{{{
#!python
class Root(object):

    def __init__(self, data):
        self.data = data

    @cherrypy.expose
    @template.output('index.html')
    def index(self):
        links = sorted(self.data.values(), key=operator.attrgetter('time'))
        return template.render(links=links)

    @cherrypy.expose
    @template.output('submit.html')
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            form = LinkForm()
            try:
                data = form.to_python(data)
                link = Link(**data)
                self.data[link.id] = link
                raise cherrypy.HTTPRedirect('/')
            except Invalid as e:
                errors = e.unpack_errors()
        else:
            errors = {}

        return template.render(errors=errors) | HTMLFormFiller(data=data)
}}}

As you can see here, the code is now less repetitive: there's a simple decorator to define which template should be used, and the `render()` function produces the template output stream which can then be further processed if necessary.

=== Adding a Layout Template ===

But there's also duplication in the template files themselves: each template has to redefine the complete header and footer, and any other “decoration” markup that we may want to apply to the complete site. Now, we could simply put those commonly used markup snippets into separate HTML files and [wiki:Documentation/xml-templates.html#includes include] them in the templates where they are needed. But Genshi provides a more elegant way to apply a common structure to different templates: [wiki:Documentation/xml-templates.html#match-templates match templates].

Most template languages provide an inheritance mechanism to allow different templates to share some kind of common structure, such as a common header, navigation, and footer. Using this mechanism, you create a “master template” in which you declare slots that “derived templates” can fill in.
The problem with this approach is that it is fairly rigid: the master needs to know which content the templates will produce, and what kind of slots need to be provided for them to stuff their content in. Also, a derived template is itself not a valid or even well-formed HTML file, and cannot be easily previewed or edited in a WYSIWYG authoring tool.

Match templates in Genshi turn this upside down. They are conceptually similar to running an XSLT transformation over your template output: you create rules that match elements in the template output stream based on XPath patterns. Whenever there is a match, the matched content is replaced by what the match template produces. This sounds complicated in theory, but is fairly intuitive in practice, so let's look at a concrete example.

In the `geddit/templates/` directory, add a file named `layout.html`, with the following content:

{{{
#!genshi
Geddit<py:if test="title">: ${title}</py:if> ${select('*[local-name()!="title"]')}
        ${select('*|text()')}
}}}

That contains a whole lot of things, so let's break it up into smaller pieces and go through the various aspects to clarify them.

 1. '''The Document Element'''

{{{
#!genshi
}}}

 First, note that the root element of the template is an `<html>` tag. This is needed because markup templates are XML documents, and XML documents require a single root element (we also use it to attach our namespace declarations, but we could just as well do that on the nested `<py:match>` elements). However, because the page templates that include this file will also have `<html>` root elements, we add the `py:strip=""` directive so that this second `<html>` tag doesn't make it through into the output stream.

 2. '''Match Template Definition'''

{{{
#!genshi
}}}

 Here we define the first match template. The `path` attribute contains an XPath pattern specifying which elements this match template should be applied to. In this case, the XPath is very simple: it matches any element with the tag name “head”, so it will be applied to the `<head>...</head>` element. We also add the `once="true"` attribute to tell Genshi that we only expect a single occurrence of the `<head>` element in the stream. Genshi can perform some optimizations based on this information.

 3. '''Selecting Matched Content'''

{{{
#!genshi
}}}

 Inside match templates, you can use the special function `select(path)` to access the element that matched the pattern. Here we use that function in the `py:attrs` directive, which basically translates to “''get all attributes on the matched element, and add them to this element''”. So for example if your page template contained `<head id="foo">`, the element produced by this match template would also have the same `id="foo"` attribute.

{{{
#!genshi
geddit<py:if test="title">: ${title}</py:if>
}}}

 This is a more complex example of selecting matched content: it fetches the text contained in the `<title>` element of the original `<head>` and prefixes it with the string “geddit: ”.
But as page templates may not even contain a `<title>` element, we first check whether it exists, and only add the colon if it does. Thus, if the page has no title of its own, the result will be “geddit”.

{{{
#!genshi
${select('*[local-name()!="title"]')}
}}}

 Finally, this is an example of using a more complex XPath pattern. This `select()` incantation returns a stream that contains all child elements of the original `<head>`, except for those elements with the tag name “title”. If we didn't add that predicate, the output stream would contain two `<title>` tags.

If you've done a bit of XSLT, match templates should look familiar. Otherwise, you may want to familiarize yourself with the basics of [http://en.wikipedia.org/wiki/XPath XPath 1], but note that Genshi only implements a subset of the full spec, as explained in [wiki:Documentation/xpath.html Using XPath in Genshi]. Just play around with match templates a bit; at the core, the concept is actually pretty simple and consistent.

Now we need to update the page templates: they no longer need the header and footer, and we'll have to include the `layout.html` file so that the match templates are applied. For the inclusion, we add the namespace prefix for XInclude, and an `xi:include` element.

Let's see how the template should look now for `index.html`:

{{{
#!genshi
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:py="http://genshi.edgewall.org/">
  <xi:include href="layout.html" />
  <head>
    <title>News

        News

        1. ${link.title} posted by ${link.username} at ${link.time.strftime('%x %X')}

        Submit new link

}}}

Also change the `submit.html` template analogously, by adding the namespace prefix and the `xi:include` element, and by removing the header and footer markup:

{{{
#!genshi
Submit new link

        Submit new link

        ${errors.username}
        ${errors.url}
        ${errors.title}
}}}

And speaking of “layout”, you can see that we've added references to some static resources in the layout template: there's an embedded image as well as a linked stylesheet and javascript file. [http://svn.edgewall.org/repos/genshi/trunk/examples/tutorial/geddit/static Download] those files and put them in your `geddit/static/` directory.

When you reload the front page in your browser, you should now see something similar to the following:

[[Image(tutorial03.png)]]

=== Implementing Comments ===

We're still missing an important bit of functionality: people should be able to comment on submitted links. Three things are needed to implement that:
 * a detail view of a link, showing all comments made so far,
 * a way to get to that page from the list of links, and
 * a form to add new comments.

Note that on the model side we're covered: there's already a `Comment` class in `geddit.model`, and we even have two comments in our database already. And we already have the form that'll be used to validate comment submissions, in the form of the class `CommentForm` in `geddit.form`.

So let's add the rest by extending the `index.html` template to show how many comments there are so far, and make that a link to the detail page. Change your `geddit/templates/index.html` file to match the following:

{{{
#!genshi
News

        News

        Submit new link

}}}

We've added an extra element for every link in the list, each containing the number of comments, and linking to the detail page. Of course, if you click on those links, you'll get an error page: we haven't implemented the `info()` view yet! Let's do that now. Add the following method to the `Root` class:

{{{
#!python
    @cherrypy.expose
    @template.output('info.html')
    def info(self, id):
        link = self.data.get(id)
        if not link:
            raise cherrypy.NotFound()
        return template.render(link=link)
}}}

And then add the needed template `geddit/templates/info.html` with the following content:

{{{
#!genshi
${link.title}

        ${link.title}

        ${link.url}
        posted by ${link.username} at ${link.time.strftime('%x %X')}
        • ${comment.username} at ${comment.time.strftime('%x %X')}
          ${comment.content}

        Add comment

}}}

At this point you should be able to see the number of comments on the start page, and click on that link to get to the detail page, where all comments for the corresponding link submission are listed. That page also contains a link for submitting additional comments, and that's what we'll need to set up next.

We need to add the method for handling comment submissions to our `Root` object. It should look like this:

{{{
#!python
    @cherrypy.expose
    @template.output('comment.html')
    def comment(self, id, cancel=False, **data):
        link = self.data.get(id)
        if not link:
            raise cherrypy.NotFound()
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/info/%s' % link.id)
            form = CommentForm()
            try:
                data = form.to_python(data)
                comment = link.add_comment(**data)
                raise cherrypy.HTTPRedirect('/info/%s' % link.id)
            except Invalid as e:
                errors = e.unpack_errors()
        else:
            errors = {}

        return template.render(link=link, comment=None, errors=errors) | HTMLFormFiller(data=data)
}}}

Last but not least, we need the template that renders the comment submission form. Inside `geddit/templates`, add a file named `comment.html`, and insert the following content:

{{{
#!genshi
Comment on “${link.title}”

        Comment on “${link.title}”

        In reply to ${comment.username} at ${comment.time.strftime('%x %X')}:

        ${comment.content}

        ${errors.username}

        ${errors.content}
}}}

Phew! We should be done with the commenting now. Play around with the application a bit to get a feel for what we've achieved so far. The next section will look into various things that can be done to further improve the application.

[[Image(tutorial04.png)]]

== Advanced Topics ==

=== Adding an Atom Feed ===

Every web site needs an RSS or [http://www.atomenabled.org/ Atom] feed these days. So we shall provide one too.

Adding Atom feeds to Geddit is fairly straightforward. First, we'll need to add auto-discovery links to the index and detail pages. Inside the `<head>` element of `geddit/templates/index.html`, add:

{{{
#!genshi
}}}

And inside the `<head>` element of `geddit/templates/info.html`, add:

{{{
#!genshi
}}}

Now we need to add the `feed()` method to our `Root` class in `geddit/controller.py`:

{{{
#!python
    @cherrypy.expose
    @template.output('index.xml', method='xml')
    def feed(self, id=None):
        if id:
            link = self.data.get(id)
            if not link:
                raise cherrypy.NotFound()
            return template.render('info.xml', link=link)
        else:
            links = sorted(self.data.values(), key=operator.attrgetter('time'))
            return template.render(links=links)
}}}

Note that this method dispatches to different templates depending on whether the `id` parameter was provided. So, for the URL `/feed/`, we'll render the list of links using the template `index.xml`, and for the URL `/feed/{link_id}/`, we'll render a link and the list of related comments using the template `info.xml`.

The templates for this are also pretty simple. First, `geddit/templates/index.xml`:

{{{
#!genshi
Geddit News ${links[0].time.isoformat()} ${link.url} ${url('/info/%s/' % link.id)} ${link.username} ${link.time.isoformat()} ${link.title}
}}}

And now, `geddit/templates/info.xml`:

{{{
#!genshi
Geddit: ${link.title} ${time.isoformat()} Comment ${len(link.comments) - idx} on “${link.title}” ${url('/info/%s/' % link.id)}#comment${idx} ${comment.username} ${comment.time.isoformat()} ${comment.content}
}}}

Voila!
We now provide Atom feeds for all our content.

[[Image(tutorial05.png)]]

=== Ajaxified Commenting ===

[http://www.adaptivepath.com/publications/essays/archives/000385.php AJAX] (Asynchronous Javascript And XML) is all the rage today, and in many ways, it is indeed a helpful technique for improving the usability and responsiveness of web-based applications.

To demonstrate how you'd use Genshi in a project that uses AJAX, let's enhance the Geddit commenting feature to use AJAX. We'll implement this in such a way that the current way comments work remains available, to serve those who don't have Javascript available, and also just to be good web citizens. That approach to using funky new techniques is often referred to as “unobtrusive Javascript”, and what it provides is “graceful degradation.”

'''Note''': technically, what we'll be doing here isn't AJAX in the literal sense, because we won't be transmitting XML. Instead, we'll respond with simple HTML fragments, a technique that is sometimes referred to as “AJAH” (“H” as in HTML).

We'll go about this in the following way: on a link submission detail page, the “Add comment” button will now load the comment form into the current page, instead of going to the dedicated comment submission page. When the user clicks the “Cancel” button, we simply remove the form from the page. On the other hand, if the user clicks the “Submit” button, we validate the entry, and if it's okay, we remove the form and load the new comment into the list on the page. That means that the user never leaves or reloads the link submission detail page in the process!

The first thing we need to do is to make the comment form available as a fragment, outside of the normally needed HTML skeleton. To do that, we create a new template file, in `geddit/templates/_form.html`, with the following content:

{{{
#!genshi
        ${errors.username}

        ${errors.content}
}}}

And as that is the same form as the one used in the `geddit/templates/comment.html` template, let's replace the markup in that template with an include:

{{{
#!genshi
Comment on “${link.title}”

        Comment on “${link.title}”

        In reply to ${comment.username} at ${comment.time.strftime('%x %X')}:

        ${comment.content}

}}}

We'll also need to make the display of an individual comment available as an HTML fragment, so let's factor it out into a separate template file as well. Add a template called `_comment.html` to the `geddit/templates` directory, and insert the following lines:

{{{
#!genshi
      3. ${comment.username} at ${comment.time.strftime('%x %X')}
        ${comment.content}
      4.
}}}

And in `geddit/templates/info.html` replace the `` element rendering the comments with the following:

{{{
#!genshi
}}}

Now we'll need to look into modifying the `Root.comment()` method so that it correctly deals with the AJAX requests we'll be adding. For convenience, let's add a new small module to our `lib` package. Inside the `geddit/lib` directory, create a file named `ajax.py`, and add the following code to it:

{{{
#!python
import cherrypy

def is_xhr():
    # Falsy when the header is missing or has any other value
    requested_with = cherrypy.request.headers.get('X-Requested-With')
    return requested_with and requested_with.lower() == 'xmlhttprequest'
}}}

This checks whether the current request originates from usage of AJAX (technically, the `XMLHttpRequest` Javascript object), based on a convention commonly used in Javascript libraries to add the special HTTP header “`X-Requested-With: XMLHttpRequest`” to all requests.

Add an import of that module to the top of the `geddit/controller.py` file, replacing:

{{{
#!python
from geddit.lib import template
}}}

with:

{{{
#!python
from geddit.lib import ajax, template
}}}

Then, replace the `Root.comment()` method in `geddit/controller.py` with the following code:

{{{
#!python
    @cherrypy.expose
    @template.output('comment.html')
    def comment(self, id, cancel=False, **data):
        link = self.data.get(id)
        if not link:
            raise cherrypy.NotFound()
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/info/%s' % link.id)
            form = CommentForm()
            try:
                data = form.to_python(data)
                comment = link.add_comment(**data)
                if not ajax.is_xhr():
                    raise cherrypy.HTTPRedirect('/info/%s' % link.id)
                # AJAX request: respond with just the new comment fragment
                return template.render('_comment.html', comment=comment,
                                       num=len(link.comments))
            except Invalid, e:
                errors = e.unpack_errors()
        else:
            errors = {}

        if ajax.is_xhr():
            # AJAX request: respond with the bare form fragment
            stream = template.render('_form.html', link=link, errors=errors)
        else:
            stream = template.render(link=link, comment=None, errors=errors)
        return stream | HTMLFormFiller(data=data)
}}}

There's another small detail we'll need to take care of: in our `@template.output` decorator, we're automatically adding a `<!DOCTYPE>` declaration to any template output stream that is
being serialized to HTML. For AJAX responses containing HTML fragments, we don't really want to add any kind of DOCTYPE, so we'll need to adjust the implementation of the decorator.

To do that, first add an import of our `geddit/lib/ajax.py` file to the `geddit/lib/template.py` file:

{{{
#!python
from geddit.lib import ajax
}}}

Then, replace the implementation of the `output()` function with the following:

{{{
#!python
def output(filename, method='html', encoding='utf-8', **options):
    """Decorator for exposed methods to specify what template they should use
    for rendering, and which serialization method and options should be
    applied.
    """
    def decorate(func):
        def wrapper(*args, **kwargs):
            cherrypy.thread_data.template = loader.load(filename)
            opt = options.copy()
            # Only full HTML pages get a DOCTYPE, not AJAX fragments
            if not ajax.is_xhr() and method == 'html':
                opt.setdefault('doctype', 'html')
            serializer = get_serializer(method, **opt)
            stream = func(*args, **kwargs)
            if not isinstance(stream, Stream):
                return stream
            return encode(serializer(stream), method=serializer,
                          encoding=encoding)
        return wrapper
    return decorate
}}}

Note how we're now only adding the `doctype='html'` serialization option when we're not handling an AJAX request.

Finally, we need to add the actual Javascript logic needed to orchestrate all this. Add the following code at the bottom of the `` element in the `geddit/templates/info.html` template:

{{{
#!genshi
}}}

This Javascript snippet uses [http://jquery.com/ jQuery] (via the `jquery.js` file you've already added to your `geddit/static` directory). We won't go into the details of the script here; suffice it to say that it implements our goals in a fairly lightweight manner. For a nice introduction to jQuery, see [http://simonwillison.net/ Simon Willison]'s blog post [http://simonwillison.net/2007/Aug/15/jquery/ jQuery for JavaScript programmers].
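The DOCTYPE rule that `output()` applies can also be looked at in isolation. The following is only a minimal, framework-free sketch: `serializer_options()` is a hypothetical helper that mirrors the option handling in the decorator, and the plain `headers` dict is an assumed stand-in for `cherrypy.request.headers`. Neither is part of Geddit, CherryPy or Genshi.

```python
def is_xhr(headers):
    """Return True if the headers carry the XMLHttpRequest marker.

    `headers` is a plain dict here (an assumption for illustration); unlike
    real HTTP header containers, its keys are case-sensitive.
    """
    requested_with = headers.get('X-Requested-With', '')
    return requested_with.lower() == 'xmlhttprequest'


def serializer_options(headers, method='html', **options):
    """Hypothetical helper mirroring the option handling in output()."""
    opt = dict(options)
    # Only full HTML pages get a DOCTYPE; AJAX fragments and XML output don't.
    if method == 'html' and not is_xhr(headers):
        opt.setdefault('doctype', 'html')
    return opt


print(serializer_options({}))                                      # regular page request
print(serializer_options({'X-Requested-With': 'XMLHttpRequest'}))  # AJAX request
print(serializer_options({}, method='xml'))                        # Atom feed
```

Because `setdefault()` is used, a doctype passed explicitly through the decorator options still takes precedence for regular page requests.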
Now, when you click on the “Add comment” link on the link submission detail page, with Javascript enabled, you should see the comment form appear on the same page:

[[Image(tutorial06.png)]]

=== Allowing Markup in Comments ===

At this point we allow users to post plain text comments, but those comments can't include niceties such as hyperlinks or HTML inline formatting (emphasis, etc). A very naive application would simply accept HTML tags in the input, and pass those tags through to the output. That is generally a bad thing, however, as it [http://neomeme.net/2007/05/26/reddit-hacked/ opens up] your site to [http://ha.ckers.org/cross-site-scripting.html cross-site scripting] (XSS) attacks, which can undermine any security measures you try to put into effect (including SSL). And because this is generally not the behavior you want, Genshi XML-escapes everything by default, which makes it safe to include in (X)HTML output.

(''Note that as Geddit allows anyone to do anything, we don't actually have any valuable assets to protect, so this exercise is somewhat theoretical. For the rest of this section, just imagine we required users to register and login to submit links or post comments.'')

So what we want to do in this section is to allow users to include HTML tags in their comments, but do so in a safe manner. We do not want to enable malicious users to include Javascript code, or CSS styles that turn the whole page black, or other things that may be considered harmful. In other words, we need to “sanitize” the markup in the comments.

But let's ignore that aspect for now, and start by making Genshi not escape HTML tags in comments. We'll start by editing `geddit/templates/_comment.html`:

{{{
#!genshi
        • ${comment.username} at ${comment.time.strftime('%x %X')}
          ${HTML(comment.content)}
        •
}}}

Here, we've added an import for the Genshi `HTML()` function. This is done using a [wiki:Documentation/templates.html#code-blocks Python code block] via the `<?python ?>` processing instruction. We've already seen that we can use complex Python expressions in templates. By using the `<?python ?>` processing instruction, we can embed any Python statements directly in the template, for example to define classes or functions. In this case we simply import a function that we need to use.

The `HTML()` function parses a snippet of HTML and returns a Genshi markup stream. It tries to do this in a way that invalid HTML is corrected (for example by fixing the nesting of tags). We then use that function to render the content of the comment.

So what does this do, exactly? Well, the comment text is parsed using an HTML parser, fixed up if necessary (and possible), and injected into the template as a markup stream. A template expression that evaluates to a markup stream is treated differently than other data types: it is injected directly into the template output stream, effectively resulting in tags not getting escaped.

'''Note:''' Genshi also provides the `genshi.core.Markup` class, which is just a special string class that flags its content as safe for being included in HTML/XML output for Genshi. So instead of wrapping the comment text inside a call to the `HTML()` function, you could also use `Markup(comment.content)`, which would avoid the reparsing of the content, but at the cost of that content not being subject to stream filters and different serialization methods.

So at this point our users can include HTML tags in their comments, and the comments will be rendered as HTML. But as noted above, that approach is very dangerous for most real-world applications, so we've got more work to do: we need to sanitize the markup in the comment so that only markup that can be considered safe is let through.
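The difference between escaped text and trusted markup can be illustrated without Genshi. The sketch below is a deliberately simplified model, not Genshi's implementation: `TrustedMarkup` and `render_value()` are hypothetical names (the former plays a role loosely similar to `Markup`), and escaping is done with the Python 3 standard library function `html.escape()`.

```python
from html import escape


class TrustedMarkup(str):
    """Hypothetical marker type: a string flagged as safe to emit as-is."""


def render_value(value):
    """Render a template value the way an auto-escaping engine might."""
    if isinstance(value, TrustedMarkup):
        # Flagged as markup: passed through untouched, tags and all
        return str(value)
    # Anything else is treated as plain text and escaped
    return escape(str(value))


comment = '<em>nice</em> <script>alert(1)</script>'
print(render_value(comment))                 # tags show up as literal text
print(render_value(TrustedMarkup(comment)))  # tags, including <script>, survive
```

The second call shows why flagging unsanitized user input as markup is dangerous: the `<script>` element reaches the output unchanged.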
Genshi provides a stream filter to help with this sanitization: [wiki:Documentation/filters.html#html-sanitizer HTMLSanitizer]. To add sanitization, first add the imports for the `HTML` function and the `HTMLSanitizer` filter to `geddit/controller.py`, so that the imports at the top of that file look something like this:

{{{
#!python
import cherrypy
from formencode import Invalid
from genshi.input import HTML
from genshi.filters import HTMLFormFiller, HTMLSanitizer
}}}

Then we'll update the `Root.comment()` method so that it sanitizes comments as they are submitted:

{{{
#!python
    @cherrypy.expose
    @template.output('comment.html')
    def comment(self, id, cancel=False, **data):
        link = self.data.get(id)
        if not link:
            raise cherrypy.NotFound()
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/info/%s' % link.id)
            form = CommentForm()
            try:
                data = form.to_python(data)
                # Parse, sanitize and re-serialize the submitted comment text
                markup = HTML(data['content']) | HTMLSanitizer()
                data['content'] = markup.render('xhtml')
                comment = link.add_comment(**data)
                if not ajax.is_xhr():
                    raise cherrypy.HTTPRedirect('/info/%s' % link.id)
                return template.render('_comment.html', comment=comment,
                                       num=len(link.comments))
            except Invalid, e:
                errors = e.unpack_errors()
        else:
            errors = {}

        if ajax.is_xhr():
            stream = template.render('_form.html', link=link, errors=errors)
        else:
            stream = template.render(link=link, comment=None, errors=errors)
        return stream | HTMLFormFiller(data=data)
}}}

We've just added two lines here, namely:

{{{
#!python
                markup = HTML(data['content']) | HTMLSanitizer()
                data['content'] = markup.render('xhtml')
}}}

This parses the comment text, runs it through the sanitizer, and serializes it to XHTML. And the result of the transformation is what we'll save to our “database”. Why are we using XHTML here, when we actually use HTML almost everywhere else? Well, we want to be able to include the comment text in Atom feeds, too, and for that they'll need to be well-formed XML.

'''Note:''' this is just one way to add sanitization.
Another equally valid approach would be to store comment submissions exactly how they were entered, and sanitize them when they are displayed. Or you could have two fields in the model: one to store the text as originally submitted, and the other to store the sanitized content ready for display. Which method you choose depends on the needs of your particular application. Or, if you were really paranoid, you'd sanitize both the input and the output.

You may want to try performing some XSS attacks by including malicious HTML markup in comments. Try some of the methods shown on the [http://ha.ckers.org/xss.html XSS Cheat Sheet]. You should not be able to get past the sanitizer; if you are, please [/newticket let us know].

Speaking of the Atom feed, let's update the corresponding template so that it, too, includes the user-submitted HTML tags as markup, instead of as escaped text. Open `geddit/templates/info.xml`, and update it to look as follows:

{{{
#!genshi
Geddit: ${link.title} ${time.isoformat()} Comment ${len(link.comments) - idx} on “${link.title}” ${url('/info/%s/' % link.id)}#comment${idx} ${comment.username} ${comment.time.isoformat()}
          ${HTML(comment.content)}
}}}

As above, we've added the import of the Genshi `HTML()` function. On the `` element we've added the `type="xhtml"` attribute, and we've added a wrapper `<div>` inside it to declare the XHTML namespace. Finally, inside that `<div>`, we inject the comment text as an HTML-parsed stream, analogous to what we've done in the HTML template.

=== Protecting against Cross-Site Request Forgery ===

'''TODO''':
 * Use Transformer filter to inject form token
 * Check token against cookie before accepting POST requests

== Summary ==

TODO