﻿id,summary,reporter,owner,description,type,status,priority,milestone,component,version,resolution,keywords,cc
384,HTMLParser does not work with comments that include non-ascii characters,robert.hoelzl@…,cmlenz,"Hello,

When parsing an HTML file that contains a comment with a non-ASCII character (such as ""<!-- \xF6 -->""), the HTMLParser() object raises a UnicodeDecodeError.

The cause of this bug is in module genshi.input (genshi/input.py), class HTMLParser, method handle_comment, which enqueues the comment text without first decoding it to unicode:

Current implementation:

    def handle_comment(self, text):
        self._enqueue(COMMENT, text)

Corrected implementation:

    def handle_comment(self, text):
        if not isinstance(text, unicode):
            text = text.decode(self.encoding, 'replace')
        self._enqueue(COMMENT, text)
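
For illustration, the decoding step of the corrected implementation can be sketched in isolation. This is only a minimal sketch and not Genshi code: ensure_text is a hypothetical helper, the sample bytes are made up, and bytes is tested instead of unicode so that the snippet also runs on Python 3, where the unicode type no longer exists.

```python
# Hypothetical stand-in for the decoding step in the proposed fix;
# ensure_text is NOT part of Genshi.
def ensure_text(value, encoding='utf-8'):
    # Decode byte strings; with the 'replace' error handler, undecodable
    # bytes become U+FFFD instead of raising UnicodeDecodeError.
    if isinstance(value, bytes):
        return value.decode(encoding, 'replace')
    return value

# Comment body containing the Latin-1 byte for o-umlaut (invalid as UTF-8).
raw_comment = b' \xf6 '

# Wrong encoding assumed: the byte is replaced, but no exception is raised.
decoded = ensure_text(raw_comment, 'utf-8')

# Matching encoding: the character is recovered intact.
latin1 = ensure_text(raw_comment, 'iso-8859-1')
```

With the 'replace' error handler a wrongly guessed encoding degrades to a replacement character rather than aborting the parse, which seems an acceptable trade-off for comment text.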
",defect,new,major,0.9,Parsing,0.5.1,,,
