Edgewall Software

Opened 16 years ago

Last modified 7 years ago

#184 new defect

str encoding in input — at Initial Version

Reported by: brickenstein@… Owned by: cmlenz
Priority: major Milestone: 0.9
Component: General Version: 0.4.4
Keywords: encoding Cc:

Description

Hi!

I am experiencing problems with byte strings (str) containing non-ASCII characters in the template input data.

--> parse stage: 20.0000 ms

Traceback (most recent call last):
  File "run.py", line 46, in <module>
    test()
  File "run.py", line 22, in test
    print tmpl.generate(data).render(method='html')
  File "/home/michael/Genshi-0.4.4/genshi/core.py", line 154, in render
    return encode(generator, method=method, encoding=encoding)
  File "/home/michael/Genshi-0.4.4/genshi/output.py", line 45, in encode
    output = u''.join(list(iterator))
  File "/home/michael/Genshi-0.4.4/genshi/output.py", line 369, in __call__
    for kind, data, pos in stream:
  File "/home/michael/Genshi-0.4.4/genshi/output.py", line 618, in __call__
    for kind, data, pos in stream:
  File "/home/michael/Genshi-0.4.4/genshi/output.py", line 688, in __call__
    text = mjoin(textbuf, escape_quotes=False)
  File "/home/michael/Genshi-0.4.4/genshi/core.py", line 379, in join
    for item in seq]))
  File "/home/michael/Genshi-0.4.4/genshi/core.py", line 405, in escape
    text = unicode(text).replace('&', '&amp;') \
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

I have attached a patch that solves the problem for me, but it hard-codes the assumed encoding to 'utf-8'. A better solution would be a configurable variable such as assume_encoding, as in Kid.
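A minimal sketch of the assume_encoding idea follows. It is not the attached patch and not Genshi's actual code. It is written for Python 3, where bytes plays the role of the Python 2 str in the traceback. The names to_text and escape_text are hypothetical, and the escaping shown is only a stand-in for the step at the bottom of the traceback.

```python
def to_text(value, assume_encoding="utf-8"):
    """Return value as text, decoding byte strings with assume_encoding.

    Hypothetical helper: the name and the default are assumptions made for
    illustration, not part of Genshi.
    """
    if isinstance(value, bytes):
        # Decode explicitly instead of relying on an implicit ASCII default.
        return value.decode(assume_encoding)
    return str(value)


def escape_text(value, assume_encoding="utf-8"):
    """Hypothetical stand-in for the escaping step seen in the traceback."""
    text = to_text(value, assume_encoding)
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))


if __name__ == "__main__":
    # UTF-8 bytes decode with the default; other encodings can be passed in.
    print(escape_text("W\u00f6rld & co".encode("utf-8")))
    print(to_text(b"W\xf6rld", assume_encoding="latin-1"))
```

Making the encoding a parameter rather than a constant lets callers whose data is not UTF-8 choose a different codec, which is the limitation of the hard-coded patch.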

As an example, I have attached a modified run.py from examples/basic, in which I replaced <world> with Wörld.
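The failure can be reproduced outside Genshi. The sketch below uses Python 3 and assumes the input string was UTF-8 encoded, which is consistent with the byte 0xc3 at position 1 reported in the error. Decoding the bytes as ASCII mirrors what Python 2's unicode(text) does implicitly for a str argument.

```python
# "Wörld" encoded as UTF-8; the "ö" becomes the two bytes 0xc3 0xb6.
data = "W\u00f6rld".encode("utf-8")

error = None
try:
    # Equivalent to the implicit ASCII decode performed by unicode(text)
    # in Python 2 when text is a byte string.
    data.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)

# Decoding with the encoding the bytes were actually written in succeeds.
print(data.decode("utf-8"))
```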

This ticket is related to http://code.google.com/p/dbsprockets/issues/detail?id=54

Thank you very much in advance. Best regards, Michael

Change History (2)

Changed 16 years ago by brickenstein@…

Changed 16 years ago by brickenstein@…
