Version 4 (modified by mkurczych, 16 years ago) (diff) |
---|
Google Summer of Code 2008
The plan
- Reimplement XPath
- Add some kind of pre-rendering of parts that don't use variables
- Try compiling templates to Python bytecode
- Try adding some minor optimizations
What has been done
Nothing, but the coding has just started. ;-)
Currently working at:
XPath reimplementation
XPath implementation in Genshi is rather buggy (for example http://genshi.edgewall.org/ticket/185). I want to rewrite it. I'm going to implement ideas from docs from the Internet like http://www.cs.uni-paderborn.de/fileadmin/Informatik/AG-Boettcher/Lehre/WS_07_08/dbis1/dbis1k8b-xpath_queries_on_xml_streams.pdf and http://www.cs.uni-paderborn.de/fileadmin/Informatik/AG-Boettcher/Lehre/WS_07_08/dbis1/dbis1k8a-xml-streams.pdf.
It appears algorithms described in these docs used buffering, which we don't want. It would allow to support more XPath expressions, but could make matching slow. Maybe later I will write code that chooses XPath algorithm according to expression complexity, but for now I'll focus on non-buffering one.
Implemented algorithm will work in O(qn) time, where q is length of XPath expression, n is number of stream events and O(qh) memory complexity, where h is height of document XML tree. It will compute for every node which places of XPath expression does it match. O(qn) is pessimistic complexity, I think algorithm will work like O(n) in most cases (the worst case is when nearly every node matches to nearly every place of expression, which is quite rare).