Edgewall Software
Version 11 (modified by mkurczych, 6 years ago)


Google Summer of Code 2008

The plan

  • Reimplement XPath
  • Add some kind of pre-rendering of parts that don't use variables
  • Try compiling templates to Python bytecode
  • Try adding some minor optimizations

What has been done

XPath reimplementation

Current XPath implementation in Genshi is rather buggy (for example http://genshi.edgewall.org/ticket/185). I've rewritten it. Implemented algorithm works in O(qn) time, where q is length of XPath expression, n is number of stream events and O(qh) memory complexity, where h is height of document XML tree. It computes for every node which places of XPath expression does it match. O(qn) is pessimistic complexity, I think algorithm will work like O(n) in most cases (the worst case is when nearly every node matches to nearly every place of expression, which is quite rare).

Added also other implementations for simpler paths called strategies. Performance of new implementation in comparision with previous one can be checked here.

Simple pre-rendering

py:optimize directive was introduced. It marks subtree of this tag as optimizable - which means that rendering code inside can be omited and instead replaced by inserting cached one. In particular it cannot have any Python code that has side effects and cannot depend on variables.

Fragment of stream will be replaced by one event saying it represents this code. It will not be unpacked untill it is needed. Following cases "need" unpacking of fragment:

  • matching tag inside optimized fragment; it also applies to first tag in fragment so instead of
    <py:match path="foo">                                                     
    <foo py:optimize="">                                                      
        <p>Some text inside with <i>tags</i> which processing could be costful,
        but <b>could be avoided</b> by caching.</p>                            
        <p>Next paragraph.</p>                                                 

one should use

<py:match path="foo">                                                      
    <py:optimize vars="">                                                  
    <p>Some text inside with <i>tags</i> which processing could be costful,
    but <b>could be avoided</b> by caching.</p>
    <p>Next paragraph.</p>
  • using Transformer filter; yeah it's really cool, but too cool to be cached - it is able to do nearly anything to the stream
  • using form HTMLFormFiller; but here it's better than in Transformer case, because if filter sees there were no forms in fragment it returns just one event for fragment - does not return unpacked version

Using filters:

  • When filtering stream with filters defined in genshi.output filter object you have to use the same filter object (don't construct it every time), if you don't optimization engine won't treat result fragment as the same as generated with other filter object
  • HTMLFormFiller can be different object

Currently working at:

Alternative way of code compiling

Instead of using module to compile Abstract Syntax Trees to bytecode translate it to new Python code, and then compile it. This will allow using Python compiler optimizations and will make Genshi work at Google AppEngine.


Template pre-processing

  • make py:optimize take argument -- expression that calculates some "digest" of context variables which this fragment depends on; it can be any hashable python object with assumption, that if results are the same, then the rendered fragments will be the same
    <div py:optimize="var%6">
        <py:choice ="var%3">
            <p py:when="0">It is divisible by three!</p>
            <p py:default="">It's remainder when dividing by three is
        <p py:if="var%2 == 0">It is odd.</p>
        <p py:if="var%2 == 1">It is even.</p>