#124 closed defect (wontfix)
Problem with replace() on unicode string
Reported by: | anonymous | Owned by: | cmlenz |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | Template processing | Version: | 0.4 |
Keywords: | needinfo | Cc: |
Description (last modified by cmlenz)
I am using Genshi 0.4.1 with TurboGears 1.0.2.2 and I'm getting a problem with replace() on a unicode string:
Traceback (most recent call last):
  File "/var/lib/python-support/python2.4/cherrypy/_cphttptools.py", line 105, in _run
    self.main()
  File "/var/lib/python-support/python2.4/cherrypy/_cphttptools.py", line 254, in main
    body = page_handler(*virtual_path, **self.params)
  File "<string>", line 3, in default
  File "/var/lib/python-support/python2.4/turbogears/controllers.py", line 334, in expose
    output = database.run_with_transaction(
  File "<string>", line 5, in run_with_transaction
  File "/var/lib/python-support/python2.4/turbogears/database.py", line 260, in so_rwt
    retval = func(*args, **kw)
  File "<string>", line 5, in _expose
  File "/var/lib/python-support/python2.4/turbogears/controllers.py", line 351, in <lambda>
    mapping, fragment, args, kw)))
  File "/var/lib/python-support/python2.4/turbogears/controllers.py", line 391, in _execute_func
    return _process_output(output, template, format, content_type, mapping, fragment)
  File "/var/lib/python-support/python2.4/turbogears/controllers.py", line 82, in _process_output
    fragment=fragment)
  File "/var/lib/python-support/python2.4/turbogears/view/base.py", line 131, in render
    return engine.render(**kw)
  File "/var/lib/python-support/python2.4/genshi/plugin.py", line 78, in render
    return self.transform(info, template).render(method=format)
  File "/var/lib/python-support/python2.4/genshi/core.py", line 141, in render
    output = u''.join(list(generator))
  File "/var/lib/python-support/python2.4/genshi/output.py", line 332, in __call__
    for kind, data, pos in stream:
  File "/var/lib/python-support/python2.4/genshi/output.py", line 499, in __call__
    text = escape(pop_text(), quotes=False)
  File "/var/lib/python-support/python2.4/genshi/core.py", line 420, in escape
    text = unicode(text).replace('&', '&amp;') \
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
Change History (6)
comment:1 Changed 18 years ago by cmlenz
- Description modified (diff)
comment:2 Changed 18 years ago by cmlenz
- Component changed from General to Template processing
- Keywords needinfo added
comment:3 Changed 18 years ago by cmlenz
Also: you're probably passing the template some non-ASCII string that is actually not a unicode object, but a bytestring using some encoding, which is unknown to Genshi. If you want to be dealing with non-ASCII strings, you absolutely need to be using true unicode objects everywhere.
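To illustrate the point above, here is a minimal sketch, not taken from Genshi itself. In Python 2, unicode(s) on a bytestring decodes it implicitly with the ASCII codec; the sketch spells that step out as .decode('ascii') so that it also runs on newer Python versions. The sample bytes are the ones shown in the report, and the variable names are only illustrative.

```python
# UTF-8 encoded bytes, like the value that reached the template in this report.
raw = b'<p>Despu\xc3\xa9s de varias semanas ...'

# Python 2's unicode(raw) decodes with the ASCII codec behind the scenes.
# The explicit call below performs the same decode and fails the same way.
try:
    raw.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the encoding the data was actually written in succeeds
# and yields a true unicode object.
text = raw.decode('utf-8')
```

The failing byte is the first byte of the UTF-8 sequence for "é", which is why the error points at 0xc3.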
comment:4 in reply to: ↑ description Changed 18 years ago by anonymous
That was quick! Thanks :-)
Well, I'm just using a plain TG object (the backend is PostgreSQL/SQLAlchemy):
$ tg-admin shell
Python 2.4.4 (#2, Apr 5 2007, 20:11:18)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(CustomShell)
>>> import sqlalchemy as sa
>>> from betta.model import session
>>> a = session.get(Foo, 2)
>>> unicode(a.content).replace('&', '&amp;')
Traceback (most recent call last):
  File "<console>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
>>> a.content.replace('&', '&amp;')
'<p>Despu\xc3\xa9s de varias semanas ...'
The first replace is what Genshi does, and the second is how I would expect it to be done. How should this really be handled? Should it be reported to TurboGears instead?
The database was filled from a text file that should be encoded in UTF-8.
Thanks for your help.
comment:5 follow-up: ↓ 6 Changed 18 years ago by cmlenz
- Milestone 0.4.2 deleted
- Resolution set to wontfix
- Status changed from new to closed
You'll need to make sure the database module and/or SQLAlchemy returns unicode objects for strings.
SQLAlchemy provides two ways to do this AFAICT:
- the convert_unicode flag on the create_engine() function (see Database Engine Options). I'm not sure how you set that up with TurboGears.
- using Unicode as the type for all string columns that should support non-ASCII values
I'm closing this ticket because I don't intend to change Genshi to allow bytestrings using non-ASCII encodings. Using unicode is the right thing to do anyway, so you can consider Genshi's strict behavior in that area as a hint/reminder :-)
comment:6 in reply to: ↑ 5 Changed 18 years ago by anonymous
Oh! I understand now, I was confused. It should be something more like this:
unicode(a.content.decode('utf-8')).replace('&', '&amp;')
The a.content.decode('utf-8') part is irrelevant to Genshi and should be done up front. Makes more sense now.
Thanks for your help.
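The "decode up front" approach from this comment could be sketched as follows. This is only an illustration under stated assumptions: to_text and escape_amp are hypothetical helper names, not part of Genshi, TurboGears or SQLAlchemy, and escape_amp is a simplified stand-in for the first replace() step seen in the traceback. UTF-8 is assumed because that is how the data was said to be stored.

```python
def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return value as a unicode string.

    Bytestrings are decoded with the given encoding (UTF-8 is an
    assumption); values that are already unicode pass through unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def escape_amp(text):
    # Simplified stand-in for the first replace() in Genshi's escape().
    return text.replace(u'&', u'&amp;')


# Decode once, where the data enters the application ...
content = to_text(b'<p>Despu\xc3\xa9s & m\xc3\xa1s</p>')
# ... so everything downstream only ever sees unicode objects.
escaped = escape_amp(content)
```

Doing the conversion once at the boundary keeps the template layer free of any guessing about encodings.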
Need more information here. What does the template look like? What's in the data you're passing the template?