= Markup Streams =

A [wiki:ApiDocs/MarkupCore#markup.core:Stream stream] is the common representation of markup as a ''stream of events''.

A stream can be obtained in several ways. It can be:
 * the result of parsing XML or HTML text, or
 * [wiki:MarkupBuilder programmatically generated], or
 * the result of selecting a subset of another stream with an XPath expression.

For example, the functions `XML()` and `HTML()` can be used to convert literal XML or HTML text to a markup stream:

{{{
>>> from markup import XML
>>> stream = XML('<p class="intro">Some text and '
...              '<a href="http://example.org/">a link</a>.'
...              '<br/></p>')
>>> stream
<markup.core.Stream object at 0x6bef0>
}}}

The stream is the result of parsing the text into events. Each event is a tuple of the form `(kind, data, pos)`, where:
 * `kind` defines what kind of event it is (such as the start of an element, text, a comment, etc.).
 * `data` is the actual data associated with the event. Its structure depends on the event kind.
 * `pos` is a `(filename, lineno, column)` tuple that describes where the event “comes from”.

{{{
>>> for kind, data, pos in stream:
...     print kind, `data`, pos
...
START (u'p', [(u'class', u'intro')]) ('<string>', 1, 0)
TEXT u'Some text and ' ('<string>', 1, 31)
START (u'a', [(u'href', u'http://example.org/')]) ('<string>', 1, 31)
TEXT u'a link' ('<string>', 1, 67)
END u'a' ('<string>', 1, 67)
TEXT u'.' ('<string>', 1, 72)
START (u'br', []) ('<string>', 1, 72)
END u'br' ('<string>', 1, 77)
END u'p' ('<string>', 1, 77)
}}}
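
To make the `(kind, data, pos)` shape concrete, here is a minimal sketch that builds comparable event tuples with Python 3's standard `xml.parsers.expat` module. It is a stand-in for illustration, not the `markup` parser itself: the name `parse_events` is hypothetical, only three event kinds are handled, and the positions are simply whatever expat reports.

```python
from xml.parsers import expat


def parse_events(text, filename="<string>"):
    """Parse well-formed XML text into a list of (kind, data, pos) tuples."""
    events = []
    parser = expat.ParserCreate()
    parser.buffer_text = True  # deliver each run of text as one TEXT event

    def pos():
        # Position as reported by expat at the moment the handler fires.
        return (filename, parser.CurrentLineNumber, parser.CurrentColumnNumber)

    def on_start(name, attrs):
        # data for START is a (tag name, attribute list) pair
        events.append(("START", (name, list(attrs.items())), pos()))

    def on_end(name):
        # data for END is just the tag name
        events.append(("END", name, pos()))

    def on_text(data):
        # data for TEXT is the character data itself
        events.append(("TEXT", data, pos()))

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(text, True)
    return events


events = parse_events('<p class="intro">Some text and '
                      '<a href="http://example.org/">a link</a>.'
                      '<br/></p>')
for kind, data, pos in events:
    print(kind, repr(data), pos)
```

Because every event is a plain tuple, code that consumes the sequence can unpack it directly in a `for` loop, as the listing above does with a real stream.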

== Serialization ==

The `Stream` class provides two methods for serializing this stream of events: [wiki:ApiDocs/MarkupCore#markup.core:Stream:serialize serialize()] and [wiki:ApiDocs/MarkupCore#markup.core:Stream:render render()]. The former is a generator that yields the output in chunks, as `Markup` objects (which are essentially unicode strings). The latter returns a single string, UTF-8 encoded by default.

Here's the output from `serialize()`:

{{{
>>> for output in stream.serialize():
...     print `output`
...
<Markup u'<p class="intro">'>
<Markup u'Some text and '>
<Markup u'<a href="http://example.org/">'>
<Markup u'a link'>
<Markup u'</a>'>
<Markup u'.'>
<Markup u'<br/>'>
<Markup u'</p>'>
}}}

And here's the output from `render()`:

{{{
>>> print stream.render()
<p class="intro">Some text and <a href="http://example.org/">a link</a>.<br/></p>
}}}

Both methods accept a `method` parameter that determines how the events are serialized to text. This parameter can be either `"xml"` (the default) or `"html"`, or a subclass of the `markup.output.Serializer` class:

{{{
>>> print stream.render('html')
<p class="intro">Some text and <a href="http://example.org/">a link</a>.<br></p>
}}}

''(Note how the `<br>` element isn't closed, which is the right thing to do for HTML.)''

In addition, the `render()` method takes an `encoding` parameter, which defaults to “UTF-8”. If set to `None`, the result will be a unicode string.
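
The relationship between the two methods can be sketched in plain Python 3: a generator turns events into text chunks, and a small wrapper joins the chunks and optionally encodes them. This is an illustrative approximation, not the library's serializer. The function names, the `VOID_ELEMENTS` set, and the rule used for detecting empty elements are all assumptions made for the example.

```python
from html import escape

# A short, illustrative subset of the elements HTML treats as void.
VOID_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}


def serialize(events, method="xml"):
    """Yield text chunks for a sequence of (kind, data, pos) events."""
    events = list(events)
    skip_end = False  # True when the next END was already emitted or is omitted
    for i, (kind, data, pos) in enumerate(events):
        if kind == "START":
            tag, attrs = data
            attr_text = "".join(' %s="%s"' % (k, escape(v, quote=True))
                                for k, v in attrs)
            # An element is empty when its END event follows immediately.
            is_empty = i + 1 < len(events) and events[i + 1][0] == "END"
            if is_empty and method == "xml":
                yield "<%s%s/>" % (tag, attr_text)
                skip_end = True
            elif is_empty and method == "html" and tag in VOID_ELEMENTS:
                yield "<%s%s>" % (tag, attr_text)
                skip_end = True
            else:
                yield "<%s%s>" % (tag, attr_text)
        elif kind == "END":
            if skip_end:
                skip_end = False
            else:
                yield "</%s>" % data
        elif kind == "TEXT":
            yield escape(data, quote=False)


def render(events, method="xml", encoding="utf-8"):
    """Join the serialized chunks; return bytes unless encoding is None."""
    text = "".join(serialize(events, method))
    return text if encoding is None else text.encode(encoding)


# Hand-written events mirroring the example stream; positions are omitted.
events = [
    ("START", ("p", [("class", "intro")]), None),
    ("TEXT", "Some text and ", None),
    ("START", ("a", [("href", "http://example.org/")]), None),
    ("TEXT", "a link", None),
    ("END", "a", None),
    ("TEXT", ".", None),
    ("START", ("br", []), None),
    ("END", "br", None),
    ("END", "p", None),
]
print(render(events, encoding=None))
print(render(events, "html", encoding=None))
```

Keeping the chunk generator separate from the joining step means a caller can stream output piece by piece, or ask for one finished string when that is more convenient.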

== Using XPath ==

XPath can be used to extract a specific subset of the stream via the `select()` method:

{{{
>>> substream = stream.select('a')
>>> substream
<markup.core.Stream object at 0x7118b0>
>>> print substream
<a href="http://example.org/">a link</a>
}}}
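
Selecting a substream amounts to filtering the events. The sketch below is a deliberately simplified illustration in Python 3: it matches elements by tag name at any depth and does not implement XPath or the actual semantics of `select()`. The name `select_elements` is hypothetical.

```python
def select_elements(events, name):
    """Yield the events of every element called `name`, descendants included."""
    depth = 0  # greater than zero while inside a matching element
    for kind, data, pos in events:
        if kind == "START" and (depth or data[0] == name):
            depth += 1
        if depth:
            yield kind, data, pos
        if kind == "END" and depth:
            depth -= 1


# Hand-written events mirroring the example stream; positions are omitted.
events = [
    ("START", ("p", [("class", "intro")]), None),
    ("TEXT", "Some text and ", None),
    ("START", ("a", [("href", "http://example.org/")]), None),
    ("TEXT", "a link", None),
    ("END", "a", None),
    ("TEXT", ".", None),
    ("START", ("br", []), None),
    ("END", "br", None),
    ("END", "p", None),
]
substream = list(select_elements(events, "a"))
for kind, data, pos in substream:
    print(kind, repr(data))
```

The result has the same event shape as the input, which is why a selected substream can be iterated, filtered again, or serialized like any other stream.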