Version 17 (modified by cmlenz, 11 years ago)
Frequently Asked Questions
Here you can find answers to frequently asked questions about Markup.
- What is Markup?
- Why yet another template engine?
- Why XML-based?
- So then why not just use Kid?
- What are the main differences between Kid and Markup?
- What other features does the toolkit provide?
- Why use includes instead of inheritance?
- What license governs the use of Markup?
- What do I need to use Markup?
- How can I include literal XML in template output?
What is Markup?
We like to call it a “toolkit for stream-based generation of markup for the web”. The largest feature provided by Markup is an XML-based template engine that is heavily inspired by Kid.
Why yet another template engine?
We'll let Ryan Tomayko, the author of Kid, answer this one:
“There's at least four billion and nine text based template languages for Python but there aren't a lot of options that fit nicely into the XML tool-chain. Or, if they do fit nicely into the XML tool-chain, they don't fit nicely with Python.”
See his article “In search of a Pythonic, XML-based Templating Language” for the details.
Most template engines for web applications are character-stream based: they know nothing about the format of the response body that is being generated. They simply substitute variable expressions, and provide some directives for looping, conditionals, etc. Thus they can be used to generate any kind of textual output, be it HTML, plain text emails, program code, or really anything else.
However, 99% of the templates used by web applications generate some kind of XML/HTML-based markup. We believe that web applications can benefit from a template engine that “knows what it's doing” when it comes to markup. You don't need to worry about generating output that is not well-formed, nor do you need to worry about accidentally not escaping some data, which greatly reduces the risk of introducing XSS attack vectors. Furthermore, your templates look a lot more like the targeted output format: an HTML template looks like HTML, a template for an RSS feed looks like RSS. Directives in text-based template languages often result in rather messy templates, or produce excessive amounts of unnecessary white space.
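To make the escaping argument concrete, here is a minimal sketch that uses only the Python standard library, not Markup itself. The names `render_text` and `render_escaped` are hypothetical; the point is only to contrast verbatim substitution with escape-by-default substitution.

```python
from html import escape
from string import Template

# Hypothetical template; a text-based engine sees it as plain characters.
TEMPLATE = Template("<p>Hello, $name!</p>")


def render_text(name):
    """Substitute the value verbatim, as a character-stream engine would."""
    return TEMPLATE.substitute(name=name)


def render_escaped(name):
    """Escape the value first, as a markup-aware engine does by default."""
    return TEMPLATE.substitute(name=escape(name))


user_input = "<script>alert(1)</script>"
unsafe = render_text(user_input)     # the script element reaches the page
safe = render_escaped(user_input)    # the angle brackets become entities
```

In the first case the template author has to remember to escape every value by hand; in the second, forgetting is no longer possible.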
In addition, text-based templates don't even work all that well for many text formats. Imagine you want to generate a plain text email or an iCalendar file. How do you deal with important concerns such as line-wrapping and white-space in your templates? You may be better off using specialized formatters.
So then why not just use Kid?
We think that Kid represents a huge step forward for XML-based templating in Python. Match templates and the generator-based processing model are extremely powerful concepts.
But arguably Kid also has some basic design problems. For example, Kid generates Python code from templates, which adds a lot of complexity to the code and can make the process of locating and fixing template errors a true nightmare. A syntax error in a template expression will cause an exception that points somewhere in the generated code. In addition, as Kid is based on ElementTree, and the ElementTree API doesn't provide location information for parse events, exceptions reported by Kid often don't include information about what part of the template caused the error. (To be fair, this kind of location tracking wasn't even available in the Python bindings for Expat before Python 2.4.)
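The kind of location information mentioned above can be seen in the Expat bindings of current Python versions. The following is a small sketch, not code from Markup or Kid; the helper name `locate_error` is made up for illustration.

```python
from xml.parsers import expat


def locate_error(source):
    """Return (line, column) of the first well-formedness error, or None."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(source, True)
    except expat.ExpatError as exc:
        # ExpatError carries the position at which parsing failed.
        return exc.lineno, exc.offset
    return None


# The closing tag on the third line does not match the opening <p>.
broken = "<html>\n  <body>\n    <p>unclosed</b>\n  </body>\n</html>"
position = locate_error(broken)
```

A template engine that keeps such positions attached to its parse events can report them again later, when an expression in the template fails at render time.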
We felt these problems would best be addressed by developing a new engine from scratch, as opposed to trying to “fix” Kid.
What are the main differences between Kid and Markup?
Markup executes templates directly; there's no code generation phase. Expressions are evaluated in a more forgiving way using AST transformation. Template variables are stored on a stack, which means that a variable set in a loop deep in the template won't leak into the rest of the template. And even though Markup doesn't generate Python code for templates, it generally performs slightly better than Kid (even up to 2x in some of our tests, but the exact differences depend on a lot of factors).
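The idea of keeping template variables on a stack can be sketched as follows. This is an illustration under our own assumptions, not Markup's actual implementation; the `Context` class and its methods are hypothetical.

```python
class Context:
    """Hypothetical scope stack; not Markup's actual implementation."""

    def __init__(self, **data):
        self.frames = [data]          # outermost frame first

    def push(self, **data):
        """Enter a nested scope, e.g. the body of a loop."""
        self.frames.append(data)

    def pop(self):
        """Leave the innermost scope, discarding its variables."""
        self.frames.pop()

    def get(self, name, default=None):
        """Look a name up from the innermost frame outward."""
        for frame in reversed(self.frames):
            if name in frame:
                return frame[name]
        return default


ctx = Context(title="Home")
seen = []
for item in ["a", "b"]:
    ctx.push(item=item)               # loop variable lives in its own frame
    seen.append((ctx.get("title"), ctx.get("item")))
    ctx.pop()
leaked = ctx.get("item")              # None: the loop variable did not leak
```

Inside the loop both `title` and `item` are visible; once the frame is popped, `item` is gone again.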
Markup does not depend on ElementTree. It uses Expat for parsing XML, and is based on streaming slightly abstracted parse events through the processing pipeline. It uses XInclude – instead of Kid's py:extends – to allow template authors to factor out common bits. For match templates, it uses XPath expressions instead of the ElementTree API.
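To give a rough idea of what “streaming parse events through a pipeline” means, here is a sketch built on the standard library's Expat bindings. The event kinds, the tuple layout, and the names `parse_events` and `text_only` are assumptions made for this example and are not Markup's API. For brevity the sketch parses the whole input before yielding, so it is not truly incremental.

```python
from xml.parsers import expat

START, END, TEXT = "START", "END", "TEXT"


def parse_events(source):
    """Yield (kind, data, (line, column)) tuples for an XML string."""
    events = []
    parser = expat.ParserCreate()

    def pos():
        return parser.CurrentLineNumber, parser.CurrentColumnNumber

    def on_start(name, attrs):
        events.append((START, (name, attrs), pos()))

    def on_end(name):
        events.append((END, name, pos()))

    def on_text(text):
        # Expat may deliver one run of text in several chunks.
        events.append((TEXT, text, pos()))

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(source, True)
    yield from events


def text_only(stream):
    """A pipeline stage: pass through only the character data."""
    for kind, data, _pos in stream:
        if kind == TEXT:
            yield data


doc = "<p>Hi <b>there</b></p>"
kinds = [kind for kind, _data, _pos in parse_events(doc)]
plain = "".join(text_only(parse_events(doc)))
```

Each stage consumes one stream and produces another, so filters can be chained without building a tree in memory.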
For more details about what's different see MarkupVsKid.
What other features does the toolkit provide?
Beyond the template engine, Markup provides:
- A unified stream-based processing model for markup, where:
  - streams can come from XML or HTML text, or be generated programmatically using a very simple syntax;
  - XPath can be used to query any stream, not just in templates.
- Different serialization methods (XML and HTML) for streams.
- An HTML “sanitizing” filter to strip potentially dangerous elements or attributes from user-submitted HTML markup.
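As an illustration of what a “sanitizing” filter does, the following sketch strips elements and attributes that are not on an allowlist, using the standard library's `html.parser`. It is not Markup's filter and is not meant for production use; the allowlists and the names `Sanitizer` and `sanitize` are assumptions chosen for the example.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlists, for illustration only.
SAFE_TAGS = {"a", "b", "em", "i", "p", "strong"}
SAFE_ATTRS = {"href", "title"}
DROP_WITH_CONTENT = {"script", "style"}


class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skipping = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skipping += 1
        elif tag in SAFE_TAGS and not self.skipping:
            kept = "".join(
                ' %s="%s"' % (name, escape(value))
                for name, value in attrs
                if name in SAFE_ATTRS and value is not None
                and not value.strip().lower().startswith("javascript:")
            )
            self.out.append("<%s%s>" % (tag, kept))

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skipping = max(0, self.skipping - 1)
        elif tag in SAFE_TAGS and not self.skipping:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text outside dropped elements is kept, but re-escaped.
        if not self.skipping:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = Sanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


dirty = '<p onclick="steal()">Hi <a href="/home">there</a><script>alert(1)</script></p>'
clean = sanitize(dirty)
```

The event handler attribute and the script element are removed, while the harmless paragraph and link survive.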
Why use includes instead of inheritance?
We think that includes are both simpler and more natural for templating.
Template inheritance is a concept that fits well with template languages where a master template provides “slots” that are “filled” by the inheriting templates. However, Markup has no such feature, and instead uses the more powerful and flexible concept of match templates.
What license governs the use of Markup?
What do I need to use Markup?
Python 2.3 or later. Python 2.4 is recommended: it gives better performance, and error messages will include template line numbers and column offsets. Setuptools is optional; it is used for installation only if it's available.
The template engine plugin (for http://www.turbogears.org/docs/plugins/template.html), which enables usage of Markup in frameworks such as TurboGears, depends on Setuptools at runtime and installation time. Use of the plugin implementation is optional, though: Setuptools is not required for using Markup directly.
How can I include literal XML in template output?
Unless explicitly told otherwise, Markup escapes any data you substitute into template output so that it can safely be parsed and displayed by web browsers and other tools. This saves you from having to tediously escape every variable by hand, and greatly reduces the risk of introducing vectors for cross-site scripting (XSS) attacks.
However, sometimes what you want is to include text in the template output that should not get escaped. For example, if you allow users to enter HTML verbatim (or provide a rich-text editor of sorts), you want that HTML to appear as actual markup in the output, not as escaped text.
Markup provides a number of ways to do that:
- The Markup class in the markup.core module can be used to flag strings that should not be escaped. Strings wrapped in a Markup instance get copied to the output unchanged.
- The XML and HTML functions in the markup.input module parse XML and HTML strings, respectively, and produce a markup stream. Note that this option can be rather expensive, as the text needs to be parsed just to be serialized again. Also, this method fails on bad markup that cannot be parsed by either HTMLParser or Expat.
- If you are generating the snippets in question yourself, you may want to use markup.builder to generate markup streams programmatically. Just like the results of the XML and HTML functions discussed above, the stream produced using markup.builder will not be escaped in the template output.
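The first option, flagging a string so that it is not escaped, can be sketched with a marker type. The example below is a standard-library illustration of the idea and not the actual markup.core.Markup class; `Raw` and `to_output` are hypothetical names.

```python
from html import escape


class Raw(str):
    """Hypothetical marker type: text that is already safe markup."""


def to_output(value):
    """Escape ordinary values; copy flagged strings through unchanged."""
    if isinstance(value, Raw):
        return str(value)
    return escape(str(value))


comment = "<em>nice</em> post"
escaped = to_output(comment)        # shown to the reader as literal text
verbatim = to_output(Raw(comment))  # rendered by the browser as markup
```

Only wrap strings you trust, or that have already been sanitized: anything flagged this way bypasses the escaping that protects against XSS.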