Edgewall Software

genshi.output

This module provides different kinds of serialization methods for XML event streams.

encode(iterator, method='xml', encoding='utf-8')

Encode serializer output into a string.

param iterator: the iterator returned from serializing a stream (basically any iterator that yields unicode objects)
param method: the serialization method; determines how characters not representable in the specified encoding are treated
param encoding: how the output string should be encoded; if set to None, this method returns a unicode object
return: a string or unicode object (depending on the encoding parameter)
since: version 0.4.1
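
The role of the method parameter can be sketched in plain Python, without importing Genshi. The helper name `encode_chunks` is hypothetical, and the choice of error handlers (numeric character references for markup methods, a placeholder for "text") is an assumption made for illustration rather than a statement of how Genshi implements `encode`.

```python
def encode_chunks(chunks, method='xml', encoding='utf-8'):
    """Join text chunks and encode them (hypothetical helper, not Genshi's API)."""
    text = u''.join(chunks)
    if encoding is None:
        # No encoding requested: hand back the unencoded text.
        return text
    # Assumption: markup output can fall back to numeric character
    # references, while plain text can only substitute a placeholder.
    errors = 'replace' if method == 'text' else 'xmlcharrefreplace'
    return text.encode(encoding, errors)


markup = encode_chunks([u'<p>caf', u'\xe9</p>'], method='xml', encoding='ascii')
plain = encode_chunks([u'caf', u'\xe9'], method='text', encoding='ascii')
unencoded = encode_chunks([u'caf', u'\xe9'], encoding=None)
```

Here `markup` keeps the accented character as a character reference, `plain` loses it to a placeholder, and `unencoded` is returned as text because no encoding was given.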

get_serializer(method='xml', **kwargs)

Return a serializer object for the given method.

param method: the serialization method; can be either "xml", "xhtml", "html", "text", or a custom serializer class

Any additional keyword arguments are passed to the serializer, and thus depend on the method parameter value.

see: XMLSerializer, XHTMLSerializer, HTMLSerializer, TextSerializer
since: version 0.4.1
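
The dispatch described above (a method name or a serializer class, with keyword arguments forwarded) might look roughly like the following. This is a sketch in plain Python: the stub classes, the `_SERIALIZERS` registry, and `pick_serializer` are all hypothetical stand-ins, not Genshi code.

```python
class StubXMLSerializer(object):
    """Stand-in for a serializer class; it only records its options."""
    def __init__(self, **options):
        self.options = options


class StubTextSerializer(object):
    """Second stand-in, used to show passing a class directly."""
    def __init__(self, **options):
        self.options = options


# Hypothetical registry; only two of the four method names are stubbed here.
_SERIALIZERS = {'xml': StubXMLSerializer, 'text': StubTextSerializer}


def pick_serializer(method='xml', **kwargs):
    """Resolve a method name or a class into a configured instance."""
    if isinstance(method, str):
        method = _SERIALIZERS[method]
    # Whatever keyword arguments were given go straight to the serializer.
    return method(**kwargs)


by_name = pick_serializer('xml', strip_whitespace=False)
by_class = pick_serializer(StubTextSerializer)
```

Because the keyword arguments are forwarded unchanged, which options are valid depends entirely on the class that ends up being instantiated.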

DocType

Defines a number of commonly used DOCTYPE declarations as constants.

get(cls, name)

Return the (name, pubid, sysid) tuple of the DOCTYPE declaration for the specified name.

The following names are recognized in this version:
  • "html" or "html-strict" for the HTML 4.01 strict DTD
  • "html-transitional" for the HTML 4.01 transitional DTD
  • "html-frameset" for the HTML 4.01 frameset DTD
  • "html5" for the DOCTYPE proposed for HTML5
  • "xhtml" or "xhtml-strict" for the XHTML 1.0 strict DTD
  • "xhtml-transitional" for the XHTML 1.0 transitional DTD
  • "xhtml-frameset" for the XHTML 1.0 frameset DTD
param name: the name of the DOCTYPE
return: the (name, pubid, sysid) tuple for the requested DOCTYPE, or None if the name is not recognized
since: version 0.4.1
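
To make the (name, pubid, sysid) tuple concrete, here is a small sketch in plain Python. The lookup table, `lookup_doctype`, and `render_doctype` are hypothetical names used only for illustration; the public and system identifiers are the standard W3C ones for the strict DTDs, and only a few of the recognized names are included.

```python
_HTML_STRICT = ('html', '-//W3C//DTD HTML 4.01//EN',
                'http://www.w3.org/TR/html4/strict.dtd')
_XHTML_STRICT = ('html', '-//W3C//DTD XHTML 1.0 Strict//EN',
                 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd')

# Hypothetical table covering a subset of the names listed above.
_DOCTYPES = {
    'html': _HTML_STRICT, 'html-strict': _HTML_STRICT,
    'html5': ('html', None, None),
    'xhtml': _XHTML_STRICT, 'xhtml-strict': _XHTML_STRICT,
}


def lookup_doctype(name):
    """Return the (name, pubid, sysid) tuple, or None for unknown names."""
    return _DOCTYPES.get(name)


def render_doctype(doctype):
    """Turn a (name, pubid, sysid) tuple into a DOCTYPE declaration."""
    name, pubid, sysid = doctype
    parts = ['<!DOCTYPE %s' % name]
    if pubid:
        parts.append(' PUBLIC "%s"' % pubid)
    elif sysid:
        parts.append(' SYSTEM')
    if sysid:
        parts.append(' "%s"' % sysid)
    parts.append('>')
    return ''.join(parts)


html5_decl = render_doctype(lookup_doctype('html5'))
strict_decl = render_doctype(lookup_doctype('html-strict'))
unknown = lookup_doctype('no-such-doctype')
```

Returning None for an unrecognized name, rather than raising, lets callers decide whether a missing DOCTYPE is an error.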

XMLSerializer

Produces XML text from an event stream.

>>> from genshi.builder import tag
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print ''.join(XMLSerializer()(elem.generate()))
<div><a href="foo"/><br/><hr noshade="True"/></div>

XHTMLSerializer

Produces XHTML text from an event stream.

>>> from genshi.builder import tag
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print ''.join(XHTMLSerializer()(elem.generate()))
<div><a href="foo"></a><br /><hr noshade="noshade" /></div>

HTMLSerializer

Produces HTML text from an event stream.

>>> from genshi.builder import tag
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print ''.join(HTMLSerializer()(elem.generate()))
<div><a href="foo"></a><br><hr noshade></div>

TextSerializer

Produces plain text from an event stream.

Only text events are included in the output. Unlike the other serializers, special XML characters are not escaped:

>>> from genshi.builder import tag
>>> elem = tag.div(tag.a('<Hello!>', href='foo'), tag.br)
>>> print elem
<div><a href="foo">&lt;Hello!&gt;</a><br/></div>
>>> print ''.join(TextSerializer()(elem.generate()))
<Hello!>

If text events contain literal markup (instances of the Markup class), tags or entities are stripped from the output:

>>> from genshi.core import Markup
>>> elem = tag.div(Markup('<a href="foo">Hello!</a><br/>'))
>>> print elem
<div><a href="foo">Hello!</a><br/></div>
>>> print ''.join(TextSerializer()(elem.generate()))
Hello!

EmptyTagFilter

Combines START and END events into EMPTY events for elements that have no contents.
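
The combining step can be sketched as a generator over simplified events. This is an illustration in plain Python, not Genshi's implementation: events are reduced to hypothetical (kind, data) pairs without position information, the kinds are plain strings, and the closing kind is spelled END as in the NamespaceFlattener example output.

```python
START, END, EMPTY, TEXT = 'START', 'END', 'EMPTY', 'TEXT'


def combine_empty(events):
    """Yield events, merging a START directly followed by END into EMPTY."""
    pending = None  # a START event held back until we see what follows it
    for kind, data in events:
        if pending is not None:
            if kind == END:
                # Nothing came between START and END: emit one EMPTY event.
                yield (EMPTY, pending[1])
                pending = None
                continue
            # The element has contents, so release the held START as-is.
            yield pending
            pending = None
        if kind == START:
            pending = (kind, data)
        else:
            yield (kind, data)
    if pending is not None:
        yield pending  # stream ended right after a START


events = [(START, 'div'), (START, 'br'), (END, 'br'),
          (TEXT, 'hi'), (END, 'div')]
combined = list(combine_empty(events))
```

Only the `br` element collapses; `div` keeps its START and END events because other events occur between them.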

NamespaceFlattener

Output stream filter that removes namespace information from the stream, instead adding namespace attributes and prefixes as needed.

param prefixes: optional mapping of namespace URIs to prefixes

>>> from genshi.input import XML
>>> xml = XML('''<doc xmlns="NS1" xmlns:two="NS2">
...   <two:item/>
... </doc>''')
>>> for kind, data, pos in NamespaceFlattener()(xml):
...     print kind, repr(data)
START (u'doc', Attrs([(u'xmlns', u'NS1'), (u'xmlns:two', u'NS2')]))
TEXT u'\n  '
START (u'two:item', Attrs())
END u'two:item'
TEXT u'\n'
END u'doc'

NamespaceStripper

Stream filter that removes all namespace information from a stream, and optionally strips out all tags not in a given namespace.

param namespace: the URI of the namespace that should not be stripped. If not set, only elements with no namespace are included in the output.

>>> from genshi.input import XML
>>> xml = XML('''<doc xmlns="NS1" xmlns:two="NS2">
...   <two:item/>
... </doc>''')
>>> from genshi.core import Namespace
>>> for kind, data, pos in NamespaceStripper(Namespace('NS1'))(xml):
...     print kind, repr(data)
START (u'doc', Attrs())
TEXT u'\n  '
TEXT u'\n'
END u'doc'

WhitespaceFilter

A filter that removes extraneous ignorable white space from the stream.


See ApiDocs/0.4.x, Documentation

Last modified on Dec 10, 2015, 6:15:05 AM