Edgewall Software
Version 4 (modified by mkurczych, 6 years ago)

--

Google Summer of Code 2008

The plan

* Reimplement XPath

* Add some kind of pre-rendering of parts that don't use variables

* Try compiling templates to Python bytecode

* Try adding some minor optimizations

What has been done

Nothing, but the coding has just started. ;-)

Currently working at:

XPath reimplementation

XPath implementation in Genshi is rather buggy (for example http://genshi.edgewall.org/ticket/185). I want to rewrite it. I'm going to implement ideas from docs from the Internet like  http://www.cs.uni-paderborn.de/fileadmin/Informatik/AG-Boettcher/Lehre/WS_07_08/dbis1/dbis1k8b-xpath_queries_on_xml_streams.pdf and  http://www.cs.uni-paderborn.de/fileadmin/Informatik/AG-Boettcher/Lehre/WS_07_08/dbis1/dbis1k8a-xml-streams.pdf. It appears algorithms described in these docs used buffering, which we don't want. It would allow to support more XPath expressions, but could make matching slow. Maybe later I will write code that chooses XPath algorithm according to expression complexity, but for now I'll focus on non-buffering one. Implemented algorithm will work in O(qn) time, where q is length of XPath expression, n is number of stream events and O(qh) memory complexity, where h is height of document XML tree. It will compute for every node which places of XPath expression does it match. O(qn) is pessimistic complexity, I think algorithm will work like O(n) in most cases (the worst case is when nearly every node matches to nearly every place of expression, which is quite rare).