Edgewall Software

Opened 18 years ago

Closed 18 years ago

Last modified 18 years ago

#66 closed defect (duplicate)

Setting input encoding from Turbogears

Reported by: cito@…
Owned by: cmlenz
Priority: major
Milestone:
Component: General
Version: 0.3.3
Keywords: input encoding
Cc:

Description

I'd like to be able to set an input encoding for my HTML templates in Turbogears, since my HTML templates are often written in latin-1 with German umlauts, and some editors still support latin-1 better than UTF-8. Currently, it seems only UTF-8 is supported (hard coded in XMLParser). Since you can already set the output encoding in Turbogears, it would be nice if you could set the input encoding as well. I think for Kid this is possible with the assume_encoding parameter.

Another nice feature would be to auto-determine the input encoding from XML declarations and HTML meta content-type tags.
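To make the auto-detection idea concrete, here is a minimal sketch in Python. It is not Genshi's or Kid's code: `sniff_encoding` and the two regular expressions are hypothetical names made up for illustration, and the function only looks at the first kilobyte of the template for an XML declaration or a `<meta>` content-type tag.

```python
import re

# Hypothetical helper for illustration only; not part of Genshi or Kid.
# Matches: <?xml version="1.0" encoding="iso-8859-1"?>
_XML_DECL = re.compile(
    rb'^\s*<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)
# Matches: <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
_META_CHARSET = re.compile(
    rb'<meta[^>]+content\s*=\s*["\'][^"\']*charset\s*=\s*([A-Za-z0-9._-]+)',
    re.IGNORECASE,
)


def sniff_encoding(data: bytes, default: str = "utf-8") -> str:
    """Guess the encoding of raw template bytes, falling back to `default`."""
    head = data[:1024]  # declarations are expected near the top of the file
    match = _XML_DECL.search(head) or _META_CHARSET.search(head)
    if match:
        return match.group(1).decode("ascii").lower()
    return default


xml_template = b'<?xml version="1.0" encoding="iso-8859-1"?><p>Gr\xfc\xdfe</p>'
html_template = (
    b'<html><head><meta http-equiv="Content-Type" '
    b'content="text/html; charset=iso-8859-1"></head></html>'
)

# Decode the raw bytes with whatever encoding was detected.
decoded = xml_template.decode(sniff_encoding(xml_template))
```

A regex scan like this is only a rough heuristic. It assumes an ASCII-compatible encoding and ignores byte order marks, so it would not recognise UTF-16 templates.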

Change History (4)

comment:1 in reply to: ↑ description Changed 18 years ago by cmlenz

Isn't this pretty much the same as #65?

Replying to cito@online.de:

> Since you can already set the output encoding in Turbogears, it would be nice if you could set the input encoding as well. I think for Kid this is possible with the assume_encoding parameter.

Hmm, I thought "assume_encoding" told kid what encoding bytestrings in variables used, not the encoding of templates.

> Another nice feature would be to auto-determine the input encoding from XML declarations and HTML meta content-type tags.

Is there code in Kid that does this?

comment:2 Changed 18 years ago by cito@…

Oops, sorry, I thought my first ticket had not been stored since somehow it did not show up.

I think you're right about Kid's assume_encoding. It is not an assumed input encoding for templates, but something to override Python's default encoding. That will probably be useful when I use strings with umlauts in controller files and have these Python files in latin-1 encoding, which is not unusual either.

Concerning input encoding from the XML declaration, I think both Kid and Genshi are using the Expat parser which will do that already if no encoding is given. Kid does not check for the meta tag. I think that's something for Genshi's HTML parser. Kid cannot parse HTML anyway.
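The Expat behaviour can be checked with Python's standard `xml.parsers.expat` module. The following is a small sketch, not code from Kid or Genshi; `extract_text` is a hypothetical helper name. No encoding is passed to `ParserCreate()`, so the parser relies on the XML declaration in the document.

```python
from xml.parsers import expat


def extract_text(xml_bytes: bytes) -> str:
    """Parse raw XML bytes and return the concatenated character data."""
    chunks = []
    parser = expat.ParserCreate()  # no encoding override given
    parser.buffer_text = True      # merge adjacent character data callbacks
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
    return "".join(chunks)


# Latin-1 encoded document: \xfc is "ü" and \xdf is "ß" in ISO-8859-1.
latin1_doc = b'<?xml version="1.0" encoding="iso-8859-1"?><p>Gr\xfc\xdfe</p>'
text = extract_text(latin1_doc)
```

The handler receives already-decoded text, so the umlauts arrive as proper Unicode characters without any explicit decoding step.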

comment:3 Changed 18 years ago by cmlenz

  • Resolution set to duplicate
  • Status changed from new to closed

I'm closing this as a dupe of #65 then. If you look at that ticket, you'll see most of this has already been implemented earlier today (with the exception of <meta> content-type detection).

About assume_encoding, Genshi expects all its input data (the context, not template files) to be unicode. This is both for performance, and because it keeps the code simple.
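In practice this means any bytestrings should be decoded in the controller before they reach the template. A minimal sketch of that step follows, written for Python 3, where `str` is the unicode type. `to_unicode` and `decode_context` are hypothetical helper names, not Genshi or Turbogears APIs, and the latin-1 default is only an assumption matching the setup described in this ticket.

```python
def to_unicode(value, encoding="iso-8859-1"):
    """Decode bytestrings to text; leave every other value untouched."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def decode_context(data, encoding="iso-8859-1"):
    """Return a copy of a template context dict with bytestring values decoded."""
    return {key: to_unicode(val, encoding) for key, val in data.items()}


# Example: a controller result that still contains a latin-1 bytestring.
raw_context = {"title": b"Gr\xfc\xdfe", "count": 3}
context = decode_context(raw_context)
```

This sketch only decodes top-level values. Nested lists or dicts would need a recursive variant.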

comment:4 Changed 18 years ago by cmlenz

  • Milestone 0.4 deleted