Edgewall Software

Changes between Version 4 and Version 5 of MarkupStream


Timestamp:
Aug 28, 2006, 11:45:12 AM (18 years ago)
Author:
cmlenz
Comment:

Added section on stream filtering

  • MarkupStream

    v4 v5  
    3939}}}
    4040
     41== Filtering ==
     42
     43One important feature of markup streams is that you can apply ''filters'' to the stream, either filters that come with Markup, or your own custom filters.
     44
     45A filter is simply a callable that accepts the stream as a parameter and returns the filtered stream:
     46
     47{{{
     48#!python
     49def noop(stream):
     50    """A filter that doesn't actually do anything with the stream."""
     51    for kind, data, pos in stream:
     52        yield kind, data, pos
     53}}}
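
A filter that actually changes the stream follows the same shape. The sketch below is only an illustration: it runs on a hand-built list of `(kind, data, pos)` tuples instead of a real Markup stream, and the event kind strings (`"START"`, `"TEXT"`, `"END"`) and the `upper_text` name are assumptions made for the example.

```python
# Stand-in event kinds; assumed for this sketch, not imported from Markup.
START, TEXT, END = "START", "TEXT", "END"

def upper_text(stream):
    """A filter that upper-cases text events and passes all others through."""
    for kind, data, pos in stream:
        if kind == TEXT:
            data = data.upper()
        yield kind, data, pos

# A hand-built event list standing in for a parsed stream.
events = [
    (START, ("p", []), (None, 1, 0)),
    (TEXT, "Some text", (None, 1, 3)),
    (END, "p", (None, 1, 12)),
]
filtered = list(upper_text(events))
```

Because the filter is a generator, nothing is processed until the result is iterated, which keeps a chain of filters lazy.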
     54
     55Filters can be applied in a number of ways. The simplest is to just call the filter directly:
     56
     57{{{
     58#!python
     59stream = noop(stream)
     60}}}
     61
     62The `Stream` class also provides a `filter()` method, which takes an arbitrary number of filter callables and applies them all:
     63
     64{{{
     65#!python
     66stream = stream.filter(noop)
     67}}}
     68
     69Finally, filters can also be applied using the ''right shift'' operator (`>>`):
     70
     71{{{
     72#!python
     73stream = stream >> noop
     74}}}
     75
     76 ''Note: this is only available in the current development version (0.3)''
     77
     78One example of a filter included with Markup is the `HTMLSanitizer` in `markup.filters`. It processes a stream of HTML markup, and strips out any potentially dangerous constructs, such as JavaScript event handlers. `HTMLSanitizer` is not a function, but rather a class that implements `__call__`, which means instances of the class are callable.
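
The callable-class pattern can be sketched as follows. This is not how `HTMLSanitizer` is implemented; `StripEventHandlers` is a hypothetical class, and the layout of start-tag data as `(tag, attrs)` with `attrs` a list of name/value pairs is an assumption made for the example.

```python
class StripEventHandlers(object):
    """Hypothetical filter: drops attributes whose names start with a prefix."""

    def __init__(self, prefix="on"):
        # Configuration lives on the instance, which is the main reason
        # to use a class instead of a plain function.
        self.prefix = prefix

    def __call__(self, stream):
        for kind, data, pos in stream:
            if kind == "START":
                tag, attrs = data
                attrs = [(name, value) for name, value in attrs
                         if not name.lower().startswith(self.prefix)]
                data = (tag, attrs)
            yield kind, data, pos

# A hand-built event list standing in for a parsed stream.
events = [
    ("START", ("a", [("href", "/"), ("onclick", "evil()")]), (None, 1, 0)),
    ("TEXT", "a link", (None, 1, 30)),
    ("END", "a", (None, 1, 36)),
]
cleaned = list(StripEventHandlers()(events))
```

The instance is created once with its settings and can then be applied to any number of streams.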
     79
     80Both the `filter()` method and the right-shift operator allow easy chaining of filters:
     81
     82{{{
     83#!python
     84from markup.filters import HTMLSanitizer
     85stream = stream.filter(noop, HTMLSanitizer())
     86}}}
     87
     88That is equivalent to:
     89
     90{{{
     91#!python
     92stream = stream >> noop >> HTMLSanitizer()
     93}}}
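
To see why the two spellings are equivalent, it may help to look at a toy version of the mechanism. `MiniStream` below is a hypothetical stand-in written for this example, not the real `Stream` class, and `drop_text` is an illustrative filter.

```python
from functools import reduce

class MiniStream(object):
    """Hypothetical stand-in for `Stream`, wrapping any iterable of events."""

    def __init__(self, events):
        self.events = events

    def __iter__(self):
        return iter(self.events)

    def __rshift__(self, function):
        # `stream >> f` calls f on the stream and wraps the result again,
        # so that another `>>` can follow.
        return MiniStream(function(self))

    def filter(self, *filters):
        # Apply each filter in order, left to right, by reusing `>>`.
        return reduce(lambda stream, f: stream >> f, filters, self)

def noop(stream):
    for event in stream:
        yield event

def drop_text(stream):
    for kind, data, pos in stream:
        if kind != "TEXT":
            yield kind, data, pos

events = [("START", ("p", []), None), ("TEXT", "hi", None), ("END", "p", None)]
via_method = list(MiniStream(events).filter(noop, drop_text))
via_operator = list(MiniStream(events) >> noop >> drop_text)
```

In this sketch `filter()` is defined in terms of `>>`, so both forms apply the filters in the same left-to-right order.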
     94
    4195== Serialization ==
    4296
     
    66120}}}
    67121
    68 Both methods can be passed a `method` parameter that determines how exactly the events are serialized to text. This parameter can be either “xml” (the default), “xhtml”, “html”, or a subclass of the `markup.output.Serializer` class:
     122Both methods can be passed a `method` parameter that determines how exactly the events are serialized to text. This parameter can be either “xml” (the default), “xhtml”, “html”, “text”, or a custom serializer class:
    69123
    70124{{{
     
    76130
    77131In addition, the `render()` method takes an `encoding` parameter, which defaults to “UTF-8”. If set to `None`, the result will be a unicode string.
     132
     133The different serializer classes in `markup.output` can also be used directly:
     134
     135{{{
     136>>> from markup.filters import HTMLSanitizer
     137>>> from markup.output import TextSerializer
     138>>> print TextSerializer()(HTMLSanitizer()(stream))
     139Some text and a link.
     140}}}
     141
     142The right-shift operator (added in 0.3) allows a nicer syntax:
     143
     144{{{
     145>>> print stream >> HTMLSanitizer() >> TextSerializer()
     146Some text and a link.
     147}}}
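
Seen this way, a serializer is just the last callable in the chain: it consumes events and produces text instead of more events. The following is a rough sketch of that idea on a hand-built event list; `text_only` is a hypothetical function and makes no claim about how `TextSerializer` works internally.

```python
def text_only(stream):
    """Hypothetical serializer: yields the text of TEXT events, drops the rest."""
    for kind, data, pos in stream:
        if kind == "TEXT":
            yield data

# A hand-built event list standing in for a parsed stream.
events = [
    ("TEXT", "Some text and ", None),
    ("START", ("a", [("href", "/")]), None),
    ("TEXT", "a link", None),
    ("END", "a", None),
    ("TEXT", ".", None),
]
output = "".join(text_only(events))
```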
    78148
    79149== Using XPath ==