#66 closed defect (duplicate)
Setting input encoding from Turbogears
Reported by: | cito@… | Owned by: | cmlenz |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | General | Version: | 0.3.3 |
Keywords: | input encoding | Cc: | |
Description
I'd like to be able to set an input encoding for my HTML templates in Turbogears, since my HTML templates are often written in latin-1 with German umlauts, and some editors still support latin-1 better than UTF-8. Currently, it seems only UTF-8 is supported (hard-coded in XMLParser). Since you can already set the output encoding in Turbogears, it would be nice if you could set the input encoding as well. I think for Kid this is possible with the assume_encoding parameter.
Another nice feature would be to auto-determine the input encoding from XML declarations and HTML meta content-type tags.
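To illustrate the kind of auto-detection meant here, a minimal sketch follows. It is not Genshi or Kid code; `sniff_encoding` and both regular expressions are hypothetical names made up for this example, and a real implementation would need to handle more cases (byte order marks, unusual attribute order, and so on).

```python
import re

# Hypothetical helper, not part of Genshi or Kid: look for an encoding name
# in an XML declaration or an HTML <meta> tag near the start of the file.
_XML_DECL = re.compile(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._\-]+)["\']')
_META = re.compile(rb'<meta[^>]+charset=["\']?([A-Za-z0-9._\-]+)', re.IGNORECASE)


def sniff_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the declared encoding of a template, or `default` if none is found."""
    head = data[:1024]  # declarations are expected close to the top
    for pattern in (_XML_DECL, _META):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii").lower()
    return default
```

The template bytes could then be decoded with `data.decode(sniff_encoding(data))` before being handed to the parser.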
Change History (4)
comment:1 in reply to: ↑ description Changed 18 years ago by cmlenz
comment:2 Changed 18 years ago by cito@…
Oops, sorry, I thought my first ticket had not been stored since somehow it did not show up.
I think you're right about Kid's assume_encoding. It is not an assumed input encoding, but something to override Python's default_encoding. That will probably be useful when I use strings with umlauts in controller files and have these Python files in latin-1 encoding, which is also not unusual.
Concerning input encoding from the XML declaration, I think both Kid and Genshi are using the Expat parser which will do that already if no encoding is given. Kid does not check for the meta tag. I think that's something for Genshi's HTML parser. Kid cannot parse HTML anyway.
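The Expat behaviour can be checked with Python's standard `xml.parsers.expat` module. This is only a sketch of the parser's behaviour, not of how Kid or Genshi call it; `parse_text` is a hypothetical helper. No encoding is passed to `ParserCreate`, so Expat takes it from the XML declaration.

```python
from xml.parsers import expat


def parse_text(data: bytes) -> str:
    """Collect all character data from an XML document given as bytes."""
    chunks = []
    parser = expat.ParserCreate()  # no encoding override given
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)  # True marks this as the final chunk
    return "".join(chunks)


# A latin-1 encoded document that declares its own encoding.
latin1_doc = b'<?xml version="1.0" encoding="iso-8859-1"?><p>Gr\xfc\xdfe</p>'
text = parse_text(latin1_doc)
```

Here `text` comes back as the unicode string "Grüße", although the input bytes were latin-1.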
comment:3 Changed 18 years ago by cmlenz
- Resolution set to duplicate
- Status changed from new to closed
I'm closing this as a dupe of #65 then. If you look at that ticket, you'll see most of this has already been implemented earlier today (with the exception of <meta> content-type detection).
About assume_encoding, Genshi expects all its input data (the context, not template files) to be unicode. This is both for performance, and because it keeps the code simple.
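In practice this means decoding any bytestrings before they go into the template context. A minimal sketch, assuming the controller data is latin-1; `to_text` and the `context` dict are hypothetical and not part of Genshi:

```python
def to_text(value, encoding="latin-1"):
    """Decode bytestrings to unicode; leave every other value untouched."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical context data, as it might come from a latin-1 controller file.
context = {"greeting": b"Gr\xfc\xdfe", "count": 3}
decoded = {key: to_text(value) for key, value in context.items()}
```

Only the `decoded` dict, which holds unicode strings, would then be passed to the template.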
comment:4 Changed 18 years ago by cmlenz
- Milestone 0.4 deleted
Isn't this pretty much the same as #65?
Replying to cito@online.de:
Hmm, I thought "assume_encoding" told Kid which encoding the bytestrings in variables use, not the encoding of the templates.
Is there code in Kid that does this?