Edgewall Software

Changes between Version 2 and Version 3 of GSoC2008


Ignore:
Timestamp:
May 26, 2008, 11:04:01 PM (16 years ago)
Author:
mkurczych
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • GSoC2008

    v2 v3  
    33== The plan ==
    44* Reimplement XPath
     5
    56* Add some kind of pre-rendering of parts that don't use variables
     7
    68* Try compiling templates to Python bytecode
     9
    710* Try adding some minor optimizations
    811
     
    1215== Currently working at: ==
    1316=== XPath reimplementation ===
    14 XPath implementation in Genshi is rather buggy (for example [http://genshi.edgewall.org/ticket/185]). I want to rewrite it. I'm going to implement ideas from docs from the internet like [http://www.cs.uni-paderborn.de/fileadmin/Informatik/AG-Boettcher/Lehre/WS_07_08/dbis1/dbis1k8b-xpath_queries_on_xml_streams.pdf] and [http://www.cs.uni-paderborn.de/fileadmin/Informatik/AG-Boettcher/Lehre/WS_07_08/dbis1/dbis1k8a-xml-streams.pdf].
     17XPath implementation in Genshi is rather buggy (for example [http://genshi.edgewall.org/ticket/185]). I want to rewrite it. ~~I'm going to implement ideas from docs from the Internet like [http://www.cs.uni-paderborn.de/fileadmin/Informatik/AG-Boettcher/Lehre/WS_07_08/dbis1/dbis1k8b-xpath_queries_on_xml_streams.pdf] and [http://www.cs.uni-paderborn.de/fileadmin/Informatik/AG-Boettcher/Lehre/WS_07_08/dbis1/dbis1k8a-xml-streams.pdf].~~
     18It appears algorithms described in these docs used buffering, which we don't want. It would allow to support more XPath expressions, but could make matching slow. Maybe later I will write code that chooses XPath algorithm according to expression complexity, but for now I'll focus on non-buffering one.
     19Implemented algorithm will work in O(qn) time, where q is length of XPath expression, n is number of stream events and O(1) memory complexity. It will compute for every node which places of XPath expression does it match. O(qn) is pessimistic complexity, I think algorithm will work like O(n) in most cases (the worst case is when nearly every node matches to nearly every place of expression, which is quite rare).