genshi.output
This module provides different kinds of serialization methods for XML event streams.
encode(iterator, method='xml', encoding='utf-8')
Encode serializer output into a string.
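As a rough illustration of how the method and encoding arguments interact, the following standard-library sketch joins serializer output and encodes it. It is not Genshi's implementation: the name encode_sketch is hypothetical, and the choice of error handling (numeric character references for markup, a "?" substitute for plain text) is an assumption about how unrepresentable characters might be treated. In Python 3 terms, the "unicode object" is a str and the "string" is a bytes object.

```python
def encode_sketch(chunks, method="xml", encoding="utf-8"):
    """Join text chunks and optionally encode them (illustrative only)."""
    text = "".join(chunks)
    if encoding is None:
        # No encoding requested: hand the text back unencoded.
        return text
    # Assumption: markup output can express an unencodable character as a
    # numeric character reference, while plain text can only substitute "?".
    errors = "replace" if method == "text" else "xmlcharrefreplace"
    return text.encode(encoding, errors)


markup = encode_sketch(["<p>caf", "\u00e9</p>"], method="xml", encoding="ascii")
plain = encode_sketch(["caf", "\u00e9"], method="text", encoding="ascii")
print(markup)  # b'<p>caf&#233;</p>'
print(plain)   # b'caf?'
```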
param iterator: the iterator returned from serializing a stream (basically any iterator that yields unicode objects)
param method: the serialization method; determines how characters not representable in the specified encoding are treated
param encoding: how the output string should be encoded; if set to None, this method returns a unicode object
return: a string or unicode object (depending on the encoding parameter)
since: version 0.4.1
get_serializer(method='xml', **kwargs)
Return a serializer object for the given method.
param method: the serialization method; can be either "xml", "xhtml", "html", "text", or a custom serializer class
Any additional keyword arguments are passed to the serializer, and thus depend on the method parameter value.
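The dispatch described here, a name or a class plus pass-through keyword arguments, can be sketched as follows. The two stand-in classes, their keyword arguments, and the function name get_serializer_sketch are all hypothetical placeholders, used only so the example runs without Genshi installed.

```python
class XMLStandIn:
    """Hypothetical placeholder for a markup serializer class."""

    def __init__(self, doctype=None):
        self.doctype = doctype


class TextStandIn:
    """Hypothetical placeholder for a plain-text serializer class."""

    def __init__(self):
        pass


# Assumed lookup table from method names to serializer classes.
_BY_NAME = {"xml": XMLStandIn, "text": TextStandIn}


def get_serializer_sketch(method="xml", **kwargs):
    """Resolve a method name to a class, then instantiate it with kwargs."""
    if isinstance(method, str):
        method = _BY_NAME[method]
    # Anything that is not a string is treated as a custom serializer class.
    return method(**kwargs)


by_name = get_serializer_sketch("xml", doctype="html5")
by_class = get_serializer_sketch(TextStandIn)
print(type(by_name).__name__, by_name.doctype)
print(type(by_class).__name__)
```

Because the keyword arguments are forwarded unchanged, an argument that one serializer accepts may raise a TypeError when passed with a different method.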
see: XMLSerializer, XHTMLSerializer, HTMLSerializer, TextSerializer
since: version 0.4.1
DocType
Defines a number of commonly used DOCTYPE declarations as constants.
get(cls, name)
Return the (name, pubid, sysid) tuple of the DOCTYPE declaration for the specified name.
The following names are recognized in this version:
- "html" or "html-strict" for the HTML 4.01 strict DTD
- "html-transitional" for the HTML 4.01 transitional DTD
- "html-frameset" for the HTML 4.01 frameset DTD
- "html5" for the DOCTYPE proposed for HTML5
- "xhtml" or "xhtml-strict" for the XHTML 1.0 strict DTD
- "xhtml-transitional" for the XHTML 1.0 transitional DTD
- "xhtml-frameset" for the XHTML 1.0 frameset DTD
param name: the name of the DOCTYPE
return: the (name, pubid, sysid) tuple for the requested DOCTYPE, or None if the name is not recognized
since: version 0.4.1
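A name-to-tuple lookup of this kind can be sketched with a plain dictionary. The snippet below covers only a few of the names listed above; the helper names get_doctype and format_doctype are hypothetical, and the public and system identifiers are the standard W3C ones for those DTDs rather than values read from Genshi.

```python
HTML_STRICT = ("html", "-//W3C//DTD HTML 4.01//EN",
               "http://www.w3.org/TR/html4/strict.dtd")
XHTML_STRICT = ("html", "-//W3C//DTD XHTML 1.0 Strict//EN",
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd")
HTML5 = ("html", None, None)

# Several names may map to the same (name, pubid, sysid) tuple.
_NAMES = {
    "html": HTML_STRICT,
    "html-strict": HTML_STRICT,
    "html5": HTML5,
    "xhtml": XHTML_STRICT,
    "xhtml-strict": XHTML_STRICT,
}


def get_doctype(name):
    """Return the tuple for a known name, or None if it is not recognized."""
    return _NAMES.get(name)


def format_doctype(doctype):
    """Render a (name, pubid, sysid) tuple as a DOCTYPE declaration."""
    name, pubid, sysid = doctype
    parts = ["<!DOCTYPE " + name]
    if pubid:
        parts.append('PUBLIC "%s"' % pubid)
    elif sysid:
        parts.append("SYSTEM")
    if sysid:
        parts.append('"%s"' % sysid)
    return " ".join(parts) + ">"


print(format_doctype(get_doctype("html5")))   # <!DOCTYPE html>
print(get_doctype("no-such-doctype"))         # None
```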
XMLSerializer
Produces XML text from an event stream.
>>> from genshi.builder import tag
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print ''.join(XMLSerializer()(elem.generate()))
<div><a href="foo"/><br/><hr noshade="True"/></div>
XHTMLSerializer
Produces XHTML text from an event stream.
>>> from genshi.builder import tag
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print ''.join(XHTMLSerializer()(elem.generate()))
<div><a href="foo"></a><br /><hr noshade="noshade" /></div>
HTMLSerializer
Produces HTML text from an event stream.
>>> from genshi.builder import tag
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print ''.join(HTMLSerializer()(elem.generate()))
<div><a href="foo"></a><br><hr noshade></div>
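The three markup serializers differ, in these examples, only in how they write elements without content and boolean attributes. The sketch below reproduces those differences for a single empty element. It is a simplified model and not Genshi code: the function name render_empty, the abbreviated VOID and BOOLEAN sets, and the omission of attribute escaping are all assumptions made for brevity.

```python
# Abbreviated sets; real HTML defines more void elements and boolean attributes.
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta"}
BOOLEAN = {"checked", "disabled", "noshade", "readonly", "selected"}


def render_empty(tag, attrs, style):
    """Render an element with no content in 'xml', 'xhtml' or 'html' style."""
    parts = [tag]
    for name, value in attrs:
        if style != "xml" and name in BOOLEAN:
            if not value:
                continue  # a false boolean attribute is simply left out
            # HTML minimizes the attribute; XHTML repeats its name as the value.
            parts.append(name if style == "html" else '%s="%s"' % (name, name))
        else:
            parts.append('%s="%s"' % (name, value))  # escaping omitted
    head = " ".join(parts)
    if style == "xml":
        return "<%s/>" % head            # XML: always self-closing
    if tag in VOID:
        return "<%s>" % head if style == "html" else "<%s />" % head
    return "<%s></%s>" % (head, tag)     # non-void: explicit end tag


for style in ("xml", "xhtml", "html"):
    print(style, render_empty("hr", [("noshade", True)], style))
```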
TextSerializer
Produces plain text from an event stream.
Only text events are included in the output. Unlike the other serializers, special XML characters are not escaped:
>>> from genshi.builder import tag
>>> elem = tag.div(tag.a('<Hello!>', href='foo'), tag.br)
>>> print elem
<div><a href="foo">&lt;Hello!&gt;</a><br/></div>
>>> print ''.join(TextSerializer()(elem.generate()))
<Hello!>
If text events contain literal markup (instances of the Markup class), tags or entities are stripped from the output:
>>> from genshi.core import Markup
>>> elem = tag.div(Markup('<a href="foo">Hello!</a><br/>'))
>>> print elem
<div><a href="foo">Hello!</a><br/></div>
>>> print ''.join(TextSerializer()(elem.generate()))
Hello!
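The effect of stripping tags and entities from literal markup can be approximated with the standard library. This is a naive sketch and not the routine Genshi uses: the name strip_markup is hypothetical, the regular expression ignores comments and CDATA sections, and it assumes that "stripping" an entity means replacing it with the character it denotes.

```python
import html
import re

# Matches anything that looks like a start, end, or empty-element tag.
_TAG = re.compile(r"<[^>]*>")


def strip_markup(text):
    """Drop tags, then resolve character and entity references."""
    return html.unescape(_TAG.sub("", text))


print(strip_markup('<a href="foo">Hello!</a><br/>'))  # Hello!
print(strip_markup("Fish &amp; chips"))               # Fish & chips
```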
EmptyTagFilter
Combines START and END events into EMPTY events for elements that have no contents.
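One way to picture this filter is a generator that holds back each opening event until it sees what follows. The sketch below works on simplified (kind, data) pairs rather than Genshi's real event tuples; the event constants are plain strings and the function name combine_empty is hypothetical. It assumes a well-formed stream, so a closing event that directly follows an opening event belongs to the same element.

```python
START, END, TEXT, EMPTY = "START", "END", "TEXT", "EMPTY"


def combine_empty(stream):
    """Merge an opening event and its immediate closing event into EMPTY."""
    pending = None  # an opening event held back until the next event arrives
    for kind, data in stream:
        if pending is not None:
            if kind == END:
                # Nothing came between open and close: emit one EMPTY event.
                yield (EMPTY, pending[1])
                pending = None
                continue
            yield pending  # the element has content, so release the opener
            pending = None
        if kind == START:
            pending = (kind, data)
        else:
            yield (kind, data)
    if pending is not None:
        yield pending  # truncated stream: pass the opener through unchanged


events = [(START, "div"), (START, "br"), (END, "br"),
          (TEXT, "x"), (END, "div")]
for event in combine_empty(events):
    print(event)
```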
NamespaceFlattener
Output stream filter that removes namespace information from the stream, instead adding namespace attributes and prefixes as needed.
param prefixes: optional mapping of namespace URIs to prefixes
>>> from genshi.input import XML
>>> xml = XML('''<doc xmlns="NS1" xmlns:two="NS2">
...  <two:item/>
... </doc>''')
>>> for kind, data, pos in NamespaceFlattener()(xml):
...     print kind, repr(data)
START (u'doc', Attrs([(u'xmlns', u'NS1'), (u'xmlns:two', u'NS2')]))
TEXT u'\n '
START (u'two:item', Attrs())
END u'two:item'
TEXT u'\n'
END u'doc'
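The flattening idea can be sketched on a simplified event model in which namespace declarations arrive as separate START_NS events and element names are (uri, local) pairs. Everything here is an assumption made for illustration: the event shapes, the function name flatten, and the decision to ignore prefix shadowing and scope ends. It is not Genshi's implementation.

```python
def flatten(events, prefixes=None):
    """Turn (uri, local) names into prefixed names plus xmlns attributes."""
    prefix_of = dict(prefixes or {})  # namespace URI -> prefix
    pending = []                      # xmlns attributes for the next START

    def flat(uri, local):
        prefix = prefix_of.get(uri) if uri else None
        return prefix + ":" + local if prefix else local

    for kind, data in events:
        if kind == "START_NS":
            prefix, uri = data
            prefix_of[uri] = prefix
            pending.append(("xmlns:" + prefix if prefix else "xmlns", uri))
        elif kind == "END_NS":
            continue  # scope tracking is left out of this sketch
        elif kind == "START":
            (uri, local), attrs = data
            # Attach any declarations collected since the previous element.
            yield kind, (flat(uri, local), pending + list(attrs))
            pending = []
        elif kind == "END":
            uri, local = data
            yield kind, flat(uri, local)
        else:
            yield kind, data


events = [
    ("START_NS", ("", "NS1")), ("START_NS", ("two", "NS2")),
    ("START", (("NS1", "doc"), [])), ("TEXT", "\n "),
    ("START", (("NS2", "item"), [])), ("END", ("NS2", "item")),
    ("TEXT", "\n"), ("END", ("NS1", "doc")),
    ("END_NS", "two"), ("END_NS", ""),
]
for kind, data in flatten(events):
    print(kind, repr(data))
```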
NamespaceStripper
Stream filter that removes all namespace information from a stream, and optionally strips out all tags not in a given namespace.
param namespace: the URI of the namespace that should not be stripped. If not set, only elements with no namespace are included in the output.
>>> from genshi.core import Namespace
>>> from genshi.input import XML
>>> xml = XML('''<doc xmlns="NS1" xmlns:two="NS2">
...  <two:item/>
... </doc>''')
>>> for kind, data, pos in NamespaceStripper(Namespace('NS1'))(xml):
...     print kind, repr(data)
START (u'doc', Attrs())
TEXT u'\n '
TEXT u'\n'
END u'doc'
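The same behaviour can be modelled on a simplified event stream in which element names are (uri, local) pairs. The sketch below drops namespace declarations, keeps only elements in the chosen namespace, and reduces their names to the local part. The event shapes and the function name strip_namespaces are assumptions, and attribute filtering is omitted.

```python
def strip_namespaces(events, namespace=None):
    """Keep elements in `namespace` only, with namespace info removed."""
    for kind, data in events:
        if kind in ("START_NS", "END_NS"):
            continue  # declarations never reach the output
        if kind == "START":
            (uri, local), attrs = data
            if uri == namespace:
                yield kind, (local, list(attrs))
        elif kind == "END":
            uri, local = data
            if uri == namespace:
                yield kind, local
        else:
            yield kind, data  # text passes through untouched


events = [
    ("START_NS", ("", "NS1")), ("START_NS", ("two", "NS2")),
    ("START", (("NS1", "doc"), [])), ("TEXT", "\n "),
    ("START", (("NS2", "item"), [])), ("END", ("NS2", "item")),
    ("TEXT", "\n"), ("END", ("NS1", "doc")),
    ("END_NS", "two"), ("END_NS", ""),
]
for kind, data in strip_namespaces(events, namespace="NS1"):
    print(kind, repr(data))
```

With namespace left as None, only elements whose URI is None (no namespace) survive, which mirrors the default described above.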
WhitespaceFilter
A filter that removes extraneous ignorable white space from the stream.
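What counts as "extraneous" is not spelled out here, so the following sketch makes its own assumption: trailing spaces and tabs at the end of a line are dropped, and runs of consecutive newlines collapse to one. The function name collapse_whitespace is hypothetical, and the sketch operates on a single text string rather than on a stream of events.

```python
import re

_TRAILING = re.compile(r"[ \t]+(?=\n)")  # spaces or tabs just before a newline
_BLANK_LINES = re.compile(r"\n{2,}")     # two or more newlines in a row


def collapse_whitespace(text):
    """Remove trailing line whitespace and squeeze blank lines."""
    text = _TRAILING.sub("", text)
    return _BLANK_LINES.sub("\n", text)


print(repr(collapse_whitespace("a  \n\n\n  b")))  # 'a\n  b'
```

A real filter would also need to leave whitespace-sensitive content such as preformatted text alone; that case is not handled in this sketch.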