genshi.output
This module provides different kinds of serialization methods for XML event streams.
encode(iterator, method='xml', encoding='utf-8', out=None)
Encode serializer output into a string.
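As a rough illustration of how the method, encoding and out parameters documented for this function could interact, here is a minimal Python 3 sketch. The name encode_chunks is hypothetical and this is not the library's implementation; in particular, the choice of character references for markup methods and "?" replacement for the text method is an assumption about what "treated" means here.

```python
import io


def encode_chunks(chunks, method="xml", encoding="utf-8", out=None):
    """Hypothetical stand-in that mirrors the documented encode() contract."""
    if encoding is None:
        # No encoding requested: pass text through unchanged.
        convert = lambda text: text
    else:
        # Assumption: markup output can fall back to numeric character
        # references, while plain text can only substitute "?".
        errors = "replace" if method == "text" else "xmlcharrefreplace"
        convert = lambda text: text.encode(encoding, errors)
    if out is None:
        # Return everything as one big string.
        return convert("".join(chunks))
    # Otherwise write chunk by chunk and return nothing.
    for chunk in chunks:
        out.write(convert(chunk))
    return None


chunks = ["<p>", "caf\u00e9", "</p>"]
as_markup = encode_chunks(chunks, method="xml", encoding="ascii")
as_text = encode_chunks(chunks, method="text", encoding="ascii")
buffer = io.BytesIO()
result = encode_chunks(chunks, out=buffer)
```

Writing to a binary file-like object only works when an encoding is given, which matches the note on the out parameter.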
param iterator: the iterator returned from serializing a stream (basically any iterator that yields unicode objects)
param method: the serialization method; determines how characters not representable in the specified encoding are treated
param encoding: how the output string should be encoded; if set to None, this method returns a unicode object
param out: a file-like object that the output should be written to instead of being returned as one big string; note that if this is a file or socket (or similar), the encoding must not be None (that is, the output must be encoded)
return: a str or unicode object (depending on the encoding parameter), or None if the out parameter is provided
since: version 0.4.1
note: Changed in 0.5: added the out parameter
get_serializer(method='xml', **kwargs)
Return a serializer object for the given method.
param method: the serialization method; can be either "xml", "xhtml", "html", "text", or a custom serializer class
Any additional keyword arguments are passed to the serializer, and thus depend on the method parameter value.
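The lookup described above (a method name or a custom class, plus keyword arguments forwarded to the serializer) can be sketched in plain Python 3 as follows. XMLLike, TextLike and pick_serializer are hypothetical stand-ins used only for illustration; they are not the real serializer classes.

```python
class XMLLike:
    """Hypothetical stand-in for a serializer class."""

    def __init__(self, **options):
        # Keep whatever keyword arguments were forwarded.
        self.options = options


class TextLike(XMLLike):
    """Second hypothetical stand-in, selected by the name "text"."""


# Illustrative registry; the real module maps names to its own classes.
_REGISTRY = {"xml": XMLLike, "text": TextLike}


def pick_serializer(method="xml", **kwargs):
    # A string is looked up by name; anything else is used as the class.
    cls = _REGISTRY[method] if isinstance(method, str) else method
    return cls(**kwargs)


by_name = pick_serializer("text")
by_class = pick_serializer(XMLLike, doctype="html")
```

Because the keyword arguments go straight to the chosen class, an option that one serializer accepts may be rejected by another.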
see: XMLSerializer, XHTMLSerializer, HTMLSerializer, TextSerializer
since: version 0.4.1
DocType
Defines a number of commonly used DOCTYPE declarations as constants.
get(cls, name)
Return the (name, pubid, sysid) tuple of the DOCTYPE declaration for the specified name.
The following names are recognized in this version:
- "html" or "html-strict" for the HTML 4.01 strict DTD
- "html-transitional" for the HTML 4.01 transitional DTD
- "html-frameset" for the HTML 4.01 frameset DTD
- "html5" for the DOCTYPE proposed for HTML5
- "xhtml" or "xhtml-strict" for the XHTML 1.0 strict DTD
- "xhtml-transitional" for the XHTML 1.0 transitional DTD
- "xhtml-frameset" for the XHTML 1.0 frameset DTD
- "xhtml11" for the XHTML 1.1 DTD
- "svg" or "svg-full" for the SVG 1.1 DTD
- "svg-basic" for the SVG Basic 1.1 DTD
- "svg-tiny" for the SVG Tiny 1.1 DTD
param name: the name of the DOCTYPE
return: the (name, pubid, sysid) tuple for the requested DOCTYPE, or None if the name is not recognized
since: version 0.4.1
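A name-to-tuple lookup of this kind can be sketched in Python 3 as below. The public and system identifiers are the ones published by the W3C for these document types; the table covers only a few of the names listed above, and lookup_doctype and _BY_NAME are hypothetical names, not part of the module.

```python
# (name, pubid, sysid) tuples for a handful of document types.
HTML_STRICT = (
    "html",
    "-//W3C//DTD HTML 4.01//EN",
    "http://www.w3.org/TR/html4/strict.dtd",
)
XHTML_STRICT = (
    "html",
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
)
HTML5 = ("html", None, None)

# Several names may alias the same declaration.
_BY_NAME = {
    "html": HTML_STRICT,
    "html-strict": HTML_STRICT,
    "html5": HTML5,
    "xhtml": XHTML_STRICT,
    "xhtml-strict": XHTML_STRICT,
}


def lookup_doctype(name):
    # Unrecognized names yield None rather than raising an error.
    return _BY_NAME.get(name)
```

Returning None for unknown names lets a caller fall back to a default declaration without catching an exception.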
XMLSerializer
Produces XML text from an event stream.
>>> from genshi.builder import tag
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print ''.join(XMLSerializer()(elem.generate()))
<div><a href="foo"/><br/><hr noshade="True"/></div>
XHTMLSerializer
Produces XHTML text from an event stream.
>>> from genshi.builder import tag
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print ''.join(XHTMLSerializer()(elem.generate()))
<div><a href="foo"></a><br /><hr noshade="noshade" /></div>
HTMLSerializer
Produces HTML text from an event stream.
>>> from genshi.builder import tag
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print ''.join(HTMLSerializer()(elem.generate()))
<div><a href="foo"></a><br><hr noshade></div>
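The three markup serializers above differ mainly in how they write elements without content and boolean attributes. The following Python 3 sketch reproduces just those differences for a single element. render_empty is a hypothetical helper, the element and attribute sets are small illustrative subsets, and attribute values are not escaped, so it should not be read as the serializers' actual logic.

```python
# Illustrative subsets only; the real serializers know more names.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}
BOOLEAN_ATTRS = {"checked", "disabled", "noshade", "selected"}


def render_empty(tag, attrs, method):
    """Render an element that has no content, per serialization method."""
    parts = [tag]
    for name, value in attrs:
        if method == "html" and name in BOOLEAN_ATTRS:
            # HTML: a true boolean attribute is written as a bare name.
            if value:
                parts.append(name)
        elif method == "xhtml" and name in BOOLEAN_ATTRS:
            # XHTML: a true boolean attribute repeats its own name.
            if value:
                parts.append('%s="%s"' % (name, name))
        else:
            # XML (and ordinary attributes): write the value as given.
            parts.append('%s="%s"' % (name, value))
    head = "<" + " ".join(parts)
    if method == "xml":
        return head + "/>"
    if tag in VOID_ELEMENTS:
        return head + (" />" if method == "xhtml" else ">")
    # Non-void elements keep an explicit end tag in XHTML and HTML.
    return head + "></%s>" % tag
```

For example, render_empty("hr", [("noshade", True)], "html") gives `<hr noshade>`, while the same call with "xhtml" gives `<hr noshade="noshade" />`.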
TextSerializer
Produces plain text from an event stream.
Only text events are included in the output. Unlike the other serializers, special XML characters are not escaped:
>>> from genshi.builder import tag
>>> elem = tag.div(tag.a('<Hello!>', href='foo'), tag.br)
>>> print elem
<div><a href="foo">&lt;Hello!&gt;</a><br/></div>
>>> print ''.join(TextSerializer()(elem.generate()))
<Hello!>
If text events contain literal markup (instances of the Markup class), that markup is by default passed through unchanged:
>>> from genshi.core import Markup
>>> elem = tag.div(Markup('<a href="foo">Hello &amp; Bye!</a><br/>'))
>>> print elem.generate().render(TextSerializer)
<a href="foo">Hello &amp; Bye!</a><br/>
You can use the strip_markup option to change this behavior, so that tags are stripped from the output and entities are replaced with the equivalent character:
>>> print elem.generate().render(TextSerializer, strip_markup=True)
Hello & Bye!
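The effect of stripping markup (dropping tags and replacing entities with their characters) can be approximated with the Python 3 standard library alone. This is only a sketch under the assumption that a simple tag pattern is sufficient; strip_markup_text is a hypothetical helper and not the routine the serializer uses.

```python
import html
import re

# Naive tag matcher: anything between "<" and the next ">".
_TAG = re.compile(r"<[^>]*>")


def strip_markup_text(text):
    # Remove tags first, then turn entities such as &amp; into characters.
    return html.unescape(_TAG.sub("", text))


plain = strip_markup_text('<a href="foo">Hello &amp; Bye!</a><br/>')
```

The order matters: unescaping first could turn `&lt;` into a literal `<` that the tag pattern would then remove by mistake.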
EmptyTagFilter
Combines START and END events into EMPTY events for elements that have no contents.
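One way to picture this filter is as a generator that holds back each START event until it sees what follows. The Python 3 sketch below works on simplified (kind, data) pairs with string kinds; the constants and the function name combine_empty are hypothetical, and real events carry more information than this.

```python
START, END, EMPTY, TEXT = "START", "END", "EMPTY", "TEXT"


def combine_empty(events):
    """Merge a START directly followed by an END into one EMPTY event."""
    pending = None  # a START event we have not emitted yet
    for kind, data in events:
        if pending is not None:
            if kind == END:
                # Nothing came between start and end: element is empty.
                yield (EMPTY, pending[1])
                pending = None
                continue
            # Something else followed, so the START stands on its own.
            yield pending
            pending = None
        if kind == START:
            pending = (kind, data)
        else:
            yield (kind, data)
    if pending is not None:
        # Stream ended right after a START; pass it through unchanged.
        yield pending


stream = [(START, "div"), (START, "br"), (END, "br"), (TEXT, "hi"), (END, "div")]
combined = list(combine_empty(stream))
```

Downstream serializers can then decide per output method how an EMPTY event should be written.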
NamespaceFlattener
Output stream filter that removes namespace information from the stream, instead adding namespace attributes and prefixes as needed.
param prefixes: optional mapping of namespace URIs to prefixes
>>> from genshi.input import XML
>>> xml = XML('''<doc xmlns="NS1" xmlns:two="NS2">
...  <two:item/>
... </doc>''')
>>> for kind, data, pos in NamespaceFlattener()(xml):
...     print kind, repr(data)
START (u'doc', Attrs([(u'xmlns', u'NS1'), (u'xmlns:two', u'NS2')]))
TEXT u'\n '
START (u'two:item', Attrs())
END u'two:item'
TEXT u'\n'
END u'doc'
WhitespaceFilter
A filter that removes extraneous ignorable white space from the stream.
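What counts as extraneous is not spelled out here. As one plausible reading, the Python 3 sketch below removes trailing spaces and tabs before line breaks and collapses runs of blank lines. The helper squeeze_whitespace is hypothetical, and the actual filter may apply different or additional rules, for example for elements whose whitespace is significant.

```python
import re

# Spaces or tabs that sit directly before a newline.
_TRAILING = re.compile(r"[ \t]+(?=\n)")
# Two or more consecutive newlines.
_BLANK_RUNS = re.compile(r"\n{2,}")


def squeeze_whitespace(text):
    # Drop trailing whitespace first so blank-looking lines become empty.
    text = _TRAILING.sub("", text)
    # Then collapse each run of newlines to a single newline.
    return _BLANK_RUNS.sub("\n", text)


squeezed = squeeze_whitespace("a  \n\n\n  b\n")
```

Leading indentation is left alone in this sketch, since it may be meaningful to a reader of the generated markup.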
DocTypeInserter
A filter that inserts the DOCTYPE declaration in the correct location, after the XML declaration.
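The placement rule (after the XML declaration if there is one, otherwise first) can be sketched as a small Python 3 generator over simplified (kind, data) pairs. The kind constants and the function insert_doctype are hypothetical, and emitting the DOCTYPE for an empty stream is an assumption of this sketch.

```python
XML_DECL, DOCTYPE, START, END = "XML_DECL", "DOCTYPE", "START", "END"


def insert_doctype(events, doctype):
    """Yield events with a DOCTYPE event placed at the correct position."""
    inserted = False
    for kind, data in events:
        if not inserted:
            inserted = True
            if kind == XML_DECL:
                # Keep the XML declaration first, then add the DOCTYPE.
                yield (kind, data)
                yield (DOCTYPE, doctype)
                continue
            # No XML declaration: the DOCTYPE goes before everything.
            yield (DOCTYPE, doctype)
        yield (kind, data)
    if not inserted:
        # Assumption: an empty stream still receives the DOCTYPE.
        yield (DOCTYPE, doctype)


html5 = ("html", None, None)
with_decl = list(insert_doctype([(XML_DECL, "1.0"), (START, "html")], html5))
without_decl = list(insert_doctype([(START, "html")], html5))
```

Only the first event needs inspecting, so the rest of the stream passes through without buffering.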