Edgewall Software

Ticket #538 (closed defect: fixed)

Opened 3 years ago

Last modified 3 years ago

HTMLParser fails if a multi-byte character falls on a 4K boundary

Reported by: hodgestar
Owned by: hodgestar
Priority: major
Milestone: 0.7
Component: Parsing
Version: devel
Keywords:
Cc:


If one does:

from io import BytesIO
from genshi.input import HTMLParser
text = u'a' * ((4 * 1024) - 1) + u'\xe6'
events = list(HTMLParser(BytesIO(text.encode('utf-8')), encoding='utf-8'))

it produces a truncated-input error because the multi-byte character crosses the boundary of a read from the input file.
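
The failure mode can be illustrated with the standard library alone. The following is a minimal sketch, not the change made in Genshi: it assumes a reader that pulls 4K chunks from the input and decodes each chunk on its own, and contrasts that with an incremental decoder that carries a partial character over to the next chunk. The names CHUNK_SIZE, decode_naively and decode_incrementally are hypothetical and exist only for this illustration.

```python
import codecs
from io import BytesIO

CHUNK_SIZE = 4 * 1024  # assumed read size, matching the 4K boundary in the report

# 4095 ASCII bytes followed by u'\xe6', which UTF-8 encodes as 0xC3 0xA6.
# The first 4K read therefore ends in the middle of that character.
text = u'a' * (CHUNK_SIZE - 1) + u'\xe6'
data = text.encode('utf-8')


def decode_naively(stream, chunk_size=CHUNK_SIZE):
    """Decode every chunk independently (hypothetical helper)."""
    parts = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Raises UnicodeDecodeError if the chunk ends inside a character.
        parts.append(chunk.decode('utf-8'))
    return u''.join(parts)


def decode_incrementally(stream, chunk_size=CHUNK_SIZE):
    """Decode chunk by chunk, buffering incomplete characters (hypothetical helper)."""
    decoder = codecs.getincrementaldecoder('utf-8')()
    parts = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # final=True makes the decoder report any truly truncated input.
            parts.append(decoder.decode(b'', final=True))
            break
        parts.append(decoder.decode(chunk))
    return u''.join(parts)


try:
    decode_naively(BytesIO(data))
    naive_failed = False
except UnicodeDecodeError:
    naive_failed = True

round_tripped = decode_incrementally(BytesIO(data))
```

With this input the per-chunk decode raises UnicodeDecodeError on the first chunk, while the incremental decoder returns the original text unchanged.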


Change History

Changed 3 years ago by hodgestar

  • status changed from new to closed
  • resolution set to fixed

Fixed in r1189.
