ApiDocs/0.5.x/genshi.input – Genshi

Support for constructing markup streams from files, strings, or other sources.

ET(element)

Convert a given ElementTree element to a markup stream.

param element:	an ElementTree element
return:	a markup stream
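
As a rough illustration of what such a conversion involves, the sketch below walks an ElementTree element depth first and emits simplified `(kind, data)` tuples. It is not Genshi's implementation: `element_to_events` is a hypothetical helper, the events omit the position field and use plain strings instead of `QName`/`Attrs`, and the code assumes Python 3 with only the standard library.

```python
import xml.etree.ElementTree as etree


def element_to_events(element):
    """Yield simplified (kind, data) events for an element and its subtree."""
    yield ("START", (element.tag, dict(element.attrib)))
    if element.text:
        # Text that appears before the first child element.
        yield ("TEXT", element.text)
    for child in element:
        for event in element_to_events(child):
            yield event
        if child.tail:
            # Text that follows the child's closing tag.
            yield ("TEXT", child.tail)
    yield ("END", element.tag)


root = etree.fromstring('<root id="2"><child>Foo</child>bar</root>')
events = list(element_to_events(root))
```

The tail text of each child is emitted by the parent's loop, so text that follows a closing tag keeps its place in document order.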

ParseError

Exception raised when fatal syntax errors are found in the input being parsed.

XMLParser

Generator-based XML parser based on roughly equivalent code in Kid/ElementTree.

The parsing is initiated by iterating over the parser object:

>>> from StringIO import StringIO
>>> from genshi.input import XMLParser
>>> parser = XMLParser(StringIO('<root id="2"><child>Foo</child></root>'))
>>> for kind, data, pos in parser:
...     print kind, data
START (QName(u'root'), Attrs([(QName(u'id'), u'2')]))
START (QName(u'child'), Attrs())
TEXT Foo
END child
END root

parse(self)

Generator that parses the XML source, yielding markup events.

return:	a markup event stream
raises ParseError:	if the XML text is not well-formed
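
One consequence of a generator-based parser is that nothing is parsed until the caller iterates, so a syntax error surfaces during iteration rather than when the parser is created. The sketch below shows that pattern with the standard library's `xml.etree.ElementTree.XMLPullParser` in place of Genshi. `iter_events` and `SketchParseError` are hypothetical names standing in for the parser and for `ParseError`, the events carry only the tag name, and Python 3 is assumed.

```python
import io
import xml.etree.ElementTree as etree


class SketchParseError(Exception):
    """Hypothetical stand-in for a parser-specific error class."""


def iter_events(source, bufsize=4096):
    """Lazily yield (kind, tag) pairs from a file-like XML source."""
    parser = etree.XMLPullParser(events=("start", "end"))
    try:
        while True:
            chunk = source.read(bufsize)
            if not chunk:
                break
            parser.feed(chunk)
            for kind, elem in parser.read_events():
                yield kind.upper(), elem.tag
        # Closing the parser reports truncated input; events that were
        # still pending can be read afterwards.
        parser.close()
        for kind, elem in parser.read_events():
            yield kind.upper(), elem.tag
    except etree.ParseError as exc:
        raise SketchParseError(str(exc)) from exc


good = list(iter_events(io.StringIO('<root id="2"><child>Foo</child></root>')))

# Creating the generator does not parse anything yet.
broken = iter_events(io.StringIO('<root><child></root>'))
try:
    list(broken)
    error_seen = False
except SketchParseError:
    error_seen = True
```

Wrapping the iteration, not the construction, in `try`/`except` is the part that carries over to any generator-based parser.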

XML(text)

Parse the given XML source and return a markup stream.

Unlike with XMLParser, the returned stream is reusable, meaning it can be iterated over multiple times:

>>> from genshi.input import XML
>>> xml = XML('<doc><elem>Foo</elem><elem>Bar</elem></doc>')
>>> print xml
<doc><elem>Foo</elem><elem>Bar</elem></doc>
>>> print xml.select('elem')
<elem>Foo</elem><elem>Bar</elem>
>>> print xml.select('elem/text()')
FooBar

param text:	the XML source
return:	the parsed XML event stream
raises ParseError:	if the XML text is not well-formed
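
The difference between a single-pass and a reusable stream can be shown without any parser. In the sketch below a plain generator is exhausted after one iteration, while a wrapper that buffers the events in a list can be replayed. `ReusableStream` and `event_generator` are hypothetical names used only for illustration, and buffering is just one way to obtain a reusable stream; the text above does not say how `XML` achieves it.

```python
def event_generator():
    """A single-pass source of simplified (kind, data) events."""
    yield ("START", "doc")
    yield ("TEXT", "Foo")
    yield ("END", "doc")


class ReusableStream(object):
    """Hypothetical wrapper that buffers events so they can be replayed."""

    def __init__(self, events):
        # Consume the source once and keep the events in memory.
        self.events = list(events)

    def __iter__(self):
        # Each call returns a fresh iterator over the buffered events.
        return iter(self.events)


single_pass = event_generator()
first = list(single_pass)
second = list(single_pass)  # the generator is already exhausted

reusable = ReusableStream(event_generator())
replay_1 = list(reusable)
replay_2 = list(reusable)
```

The trade-off is memory: a buffered stream holds every event at once, whereas iterating a parser directly only needs the current event.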

HTMLParser

Parser for HTML input based on the Python HTMLParser module.

This class provides the same interface for generating stream events as XMLParser, and attempts to automatically balance tags.

The parsing is initiated by iterating over the parser object:

>>> from StringIO import StringIO
>>> from genshi.input import HTMLParser
>>> parser = HTMLParser(StringIO('<UL compact><LI>Foo</UL>'))
>>> for kind, data, pos in parser:
...     print kind, data
START (QName(u'ul'), Attrs([(QName(u'compact'), u'compact')]))
START (QName(u'li'), Attrs())
TEXT Foo
END li
END ul
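
To make the idea of tag balancing concrete, the sketch below builds a small parser on Python 3's standard `html.parser.HTMLParser` that tracks open tags on a stack and closes any that are still open when an enclosing end tag arrives. It is an illustration under simplifying assumptions, not Genshi's code: `BalancingParser` is a hypothetical class, events are plain tuples without positions, and void elements such as `<br>` get no special treatment.

```python
from html.parser import HTMLParser as StdHTMLParser


class BalancingParser(StdHTMLParser):
    """Collect simplified events and close tags the input left open."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        # A valueless attribute such as "compact" arrives with value None;
        # repeat its name as the value.
        fixed = [(name, name if value is None else value)
                 for name, value in attrs]
        self.events.append(("START", (tag, fixed)))
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag with no matching start tag: ignore it
        # Close every tag opened after the matching start tag, then the tag itself.
        while self.open_tags:
            open_tag = self.open_tags.pop()
            self.events.append(("END", open_tag))
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.events.append(("TEXT", data))

    def close(self):
        super().close()
        # Close whatever is still open at the end of the input.
        while self.open_tags:
            self.events.append(("END", self.open_tags.pop()))


parser = BalancingParser()
parser.feed('<UL compact><LI>Foo</UL>')
parser.close()
balanced_events = parser.events
```

For the input used in the example above, the sketch emits an `END` event for `li` before the one for `ul`, even though the source never closes the list item.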

parse(self)

Generator that parses the HTML source, yielding markup events.

return:	a markup event stream
raises ParseError:	if the HTML text is not well-formed

handle_starttag(self, tag, attrib)

(Not documented)

handle_endtag(self, tag)

(Not documented)

handle_data(self, text)

(Not documented)

handle_charref(self, name)

(Not documented)

handle_entityref(self, name)

(Not documented)

handle_pi(self, data)

(Not documented)

handle_comment(self, text)

(Not documented)

HTML(text, encoding='utf-8')

Parse the given HTML source and return a markup stream.

Unlike with HTMLParser, the returned stream is reusable, meaning it can be iterated over multiple times:

>>> from genshi.input import HTML
>>> html = HTML('<body><h1>Foo</h1></body>')
>>> print html
<body><h1>Foo</h1></body>
>>> print html.select('h1')
<h1>Foo</h1>
>>> print html.select('h1/text()')
Foo

param text:	the HTML source
return:	the parsed XML event stream
raises ParseError:	if the HTML text is not well-formed, and error recovery fails

See ApiDocs/0.5.x, Documentation

Last modified on Dec 10, 2015, 6:15:05 AM
