Opened 14 years ago
Last modified 8 years ago
#393 new defect
The ignore_tags setting does not work with Genshi templates
Reported by: | Viktor Ferenczi <python@…> | Owned by: | cmlenz |
---|---|---|---|
Priority: | major | Milestone: | 0.9 |
Component: | Internationalization | Version: | 0.6 |
Keywords: | i18n filter translator xml namespace ignore ignore_tags skip script style | Cc: |
Description
We have a simple Genshi template file containing a <script> tag with some JavaScript code in it.
The script's contents are copied into the pot file by Babel's extraction tool, even though both script and style tags should be ignored by default according to Genshi's documentation.
I've debugged this and traced the problem to Genshi's translator, which is the component that extracts the translatable messages.
A patch file has been attached which can be applied to the stable Genshi 0.6 release.
Attachments (2)
Change History (10)
Changed 14 years ago by Viktor Ferenczi <python@…>
comment:1 Changed 14 years ago by Viktor Ferenczi <python@…>
The installed Python version is 2.5.4, standard MSI installer. I don't know whether it counts or not, but lxml-2.2.4-py2.5 is also installed. I'm not sure which XML library is used internally by Genshi.
comment:2 Changed 14 years ago by Viktor Ferenczi <python@…>
Genshi template to reproduce (a bit obfuscated, but still produces the issue):
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      py:strip="True">
  <link href="/vfs/here/some.css" />
  <script src="/vfs/there/some.js" />
  <div id="thing_${thing.index}" class="thing ${'thing_stopped' if thing.visible else 'thing_hidden'}">
    <div id="thing_${thing.index}_value" class="thing-value">ABCDEF</div>
    <div id="thing_${thing.index}_text" class="thing-text" py:content="thing.text" />
    <!-- <div id="thing_${thing.index}_debug">debug info</div> -->
  </div>
  <script language="JavaScript">
    thing_controller_${thing.index} = new ThingController('thing_${thing.index}');
    thing_controller_${thing.index}.visible = ${F.json(thing.visible)};
    thing_controller_${thing.index}.set(${F.json(thing.value)});
    thing_controller_${thing.index}.set_text(${F.json(thing.text)});
  </script>
</html>
comment:3 Changed 14 years ago by Viktor Ferenczi <python@…>
Debug info at one of the affected conditions:
>>> tag
QName('http://www.w3.org/1999/xhtml}script')
>>> self.ignore_tags
[QName('script'), QName('style')]
>>> tag in self.ignore_tags
False
comment:4 Changed 14 years ago by Viktor Ferenczi <python@…>
Extractor configuration used:
# Extraction from Genshi HTML templates
[genshi: **.html]
ignore_tags = script style
include_attrs = alt title summary
comment:5 Changed 14 years ago by Clicky
This happens because you're comparing a namespaced tag with a non-namespaced one ("{http://www.w3.org/1999/xhtml}script" vs. regular "script"). Of course the patch provides an easy fix: by ignoring namespaces, you completely avoid the issue. But this also has ill side-effects: what if I really wanted to ignore "raw" (non-namespaced) <script> tags but still have HTML <script> tags parsed during extraction?
I think the right way to tackle this is to change your extractor configuration like this:
# Extraction from Genshi HTML templates
[genshi: **.html]
ignore_tags = script style {http://www.w3.org/1999/xhtml}script {http://www.w3.org/1999/xhtml}style
include_attrs = alt title summary
This way you end up with a configuration similar to what is done by default in the trunk (see source:trunk/genshi/filters/i18n.py#L605).
Maybe something could be added to the documentation to put emphasis on how to declare ignore_tags when using namespaces.
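The intent behind the suggested configuration can be illustrated with plain strings standing in for Genshi's QName objects. This is only a minimal sketch of exact-match lookup, not Genshi code; `XHTML_NS` and `is_ignored` are hypothetical names introduced for the example.

```python
# Plain strings stand in for QName objects; this is not Genshi's API.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_ignored(tag, ignore_tags):
    """Return True only if the tag is listed exactly, namespace included."""
    return tag in ignore_tags


namespaced_script = "{%s}script" % XHTML_NS

# Listing only the bare names misses the namespaced XHTML tag.
bare_only = {"script", "style"}

# Listing both forms covers raw and XHTML tags alike.
both_forms = bare_only | {"{%s}script" % XHTML_NS, "{%s}style" % XHTML_NS}

print(is_ignored(namespaced_script, bare_only))
print(is_ignored(namespaced_script, both_forms))
```

With exact matching, a bare "script" entry and a namespaced one stay distinct, which is what lets a configuration ignore one and not the other.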
comment:6 Changed 13 years ago by Guillaume Pratte <guillaume@…>
I am having some issues with this also.
I tried the configuration as documented in the previous comment:
[genshi: site/**.html]
ignore_tags = script style {http://www.w3.org/1999/xhtml}script {http://www.w3.org/1999/xhtml}style
include_attrs = alt title summary href
However the "script" tag is still included in the .po file.
I did some debugging. In genshi/filters/i18n.py, class Translator, method __call__, I printed some debugging output:
# ...
# handle different events that can be localized
if kind is START:
    tag, attrs = data
    print tag in self.ignore_tags, repr(tag), self.ignore_tags
    if tag in self.ignore_tags or \
            isinstance(attrs.get(xml_lang), basestring):
        skip += 1
        yield kind, data, pos
        continue
I get this (line split for easier reading):
False
QName('http://www.w3.org/1999/xhtml}script')
[QName('script'), QName('style'),
 QName('http://www.w3.org/1999/xhtml}script'),
 QName('http://www.w3.org/1999/xhtml}style')]
So, the QName('http://www.w3.org/1999/xhtml}script') in variable 'tag' is not the same as the QName('http://www.w3.org/1999/xhtml}script') in self.ignore_tags, and the tag does not get ignored...
comment:7 Changed 13 years ago by asuffield@…
QName does not have __eq__ defined, so it falls back on unicode comparison of the underlying string.
(Pdb) str(tag)
'{http://www.w3.org/1999/xhtml}script'
(Pdb) str(itag)
'{{http://www.w3.org/1999/xhtml}script'
Since the underlying string was never normalised (unlike repr), they are not equal.
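The behaviour described above can be reproduced with a simplified stand-in class. This is a sketch written for Python 3 and is not Genshi's actual QName; the class name `SketchQName` is hypothetical, and the assumption that the parser supplies the name without a leading brace is inferred from the pdb output rather than confirmed here.

```python
class SketchQName(str):
    """Simplified stand-in mimicking the behaviour reported in this ticket."""

    def __new__(cls, qname):
        parts = qname.lstrip("{").split("}", 1)
        if len(parts) > 1:
            # A brace is always prepended, even if one is already present.
            self = str.__new__(cls, "{%s" % qname)
            self.namespace, self.localname = parts
        else:
            self = str.__new__(cls, qname)
            self.namespace, self.localname = None, qname
        return self

    def __repr__(self):
        # repr strips leading braces, which hides the difference.
        return "%s(%r)" % (type(self).__name__, str.lstrip(self, "{"))


# Assumed parser form (no leading brace) versus the form written in the config.
from_parser = SketchQName("http://www.w3.org/1999/xhtml}script")
from_config = SketchQName("{http://www.w3.org/1999/xhtml}script")

print(repr(from_parser) == repr(from_config))  # identical repr
print(str(from_parser), str(from_config))      # different underlying strings
print(from_parser in [from_config])            # membership test fails
```

The two objects print identically through repr, yet the inherited string comparison sees one leading brace on one side and two on the other.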
This proposed extractor config hence will not work:
ignore_tags = script style {http://www.w3.org/1999/xhtml}script
This one, however, works around the bug by specifying the normalised form:
ignore_tags = script style http://www.w3.org/1999/xhtml}script
Please fix this properly by adding the missing methods to QName.
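One possible shape of such a fix, sketched for Python 3 under stated assumptions: instead of adding comparison methods, this variant normalises the input in the constructor so that both spellings produce the same underlying string, and the inherited string equality and hashing then behave as expected. `NormalisedQName` is a hypothetical name, and this is not the patch attached to the ticket nor Genshi's actual code.

```python
class NormalisedQName(str):
    """Hypothetical QName variant that normalises leading braces on input."""

    def __new__(cls, qname):
        if type(qname) is cls:
            return qname
        # Strip any leading braces first so both spellings converge.
        qname = qname.lstrip("{")
        parts = qname.split("}", 1)
        if len(parts) > 1:
            self = str.__new__(cls, "{%s" % qname)
            self.namespace, self.localname = parts
        else:
            self = str.__new__(cls, qname)
            self.namespace, self.localname = None, qname
        return self


with_brace = NormalisedQName("{http://www.w3.org/1999/xhtml}script")
without_brace = NormalisedQName("http://www.w3.org/1999/xhtml}script")

print(with_brace == without_brace)
print(with_brace in [without_brace, NormalisedQName("style")])
```

Normalising once at construction keeps equality, hashing and membership tests consistent without overriding any comparison methods; adding an explicit __eq__ and __hash__, as the comment requests, would be another route.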
comment:8 Changed 8 years ago by hodgestar
- Milestone changed from 0.6.1 to 0.9
Move to milestone 0.9.
Patch to fix the ignore_tags configuration option