Edgewall Software

Ticket #393 (new defect)

Opened 4 years ago

Last modified 2 years ago

The ignore_tags setting does not work with Genshi templates

Reported by: Viktor Ferenczi <python@…>
Owned by: cmlenz
Priority: major
Milestone: 0.6.1
Component: Internationalization
Version: 0.6
Keywords: i18n filter translator xml namespace ignore ignore_tags skip script style
Cc:

Description

We have a simple Genshi template file containing a <script> tag with some JavaScript code in it.

The contents of that tag are copied into the .pot file by Babel's extraction tool, even though both script and style tags should be ignored by default according to Genshi's documentation.

I debugged this and traced the problem to Genshi's translator, which is the component used to extract translatable messages.

A patch file has been attached which can be applied to the stable Genshi 0.6 release.

Attachments

genshi-0.6-filter.i18n.patch (0.9 KB) - added by Viktor Ferenczi <python@…> 4 years ago.
Patch to fix the ignore_tags configuration option
genshi-0.6-filter.i18n.2.patch (0.9 KB) - added by Guillaume Pratte <guillaume@…> 3 years ago.
This crude patch makes it work for me.

Change History

Changed 4 years ago by Viktor Ferenczi <python@…>

Patch to fix the ignore_tags configuration option

Changed 4 years ago by Viktor Ferenczi <python@…>

The installed Python version is 2.5.4, from the standard MSI installer. I don't know whether it matters, but lxml-2.2.4-py2.5 is also installed. I'm not sure which XML library Genshi uses internally.

Changed 4 years ago by Viktor Ferenczi <python@…>

Genshi template to reproduce (a bit obfuscated, but still produces the issue):

<html xmlns="http://www.w3.org/1999/xhtml"
  xmlns:py="http://genshi.edgewall.org/"
  xmlns:xi="http://www.w3.org/2001/XInclude"      
  py:strip="True">

<link href="/vfs/here/some.css" />
<script src="/vfs/there/some.js" />

<div id="thing_${thing.index}"
     class="thing ${'thing_stopped' if thing.visible else 'thing_hidden'}">
  <div id="thing_${thing.index}_value" class="thing-value">ABCDEF</div>
  <div id="thing_${thing.index}_text" class="thing-text" py:content="thing.text" />
  <!-- <div id="thing_${thing.index}_debug">debug info</div> -->
</div>

<script language="JavaScript">
  thing_controller_${thing.index} = new ThingController('thing_${thing.index}');
  thing_controller_${thing.index}.visible = ${F.json(thing.visible)};
  thing_controller_${thing.index}.set(${F.json(thing.value)});
  thing_controller_${thing.index}.set_text(${F.json(thing.text)});
</script>

</html>

Changed 4 years ago by Viktor Ferenczi <python@…>

Debug info at one of the affected conditions:

>>> tag
QName('http://www.w3.org/1999/xhtml}script')

>>> self.ignore_tags
[QName('script'), QName('style')]

>>> tag in self.ignore_tags
False

Changed 4 years ago by Viktor Ferenczi <python@…>

Extractor configuration used:

# Extraction from Genshi HTML templates
[genshi: **.html]
ignore_tags = script style
include_attrs = alt title summary

Changed 4 years ago by Clicky

This happens because you're comparing a namespaced tag with a non-namespaced one ("{http://www.w3.org/1999/xhtml}script" vs. regular "script"). Of course the patch provides an easy fix: by ignoring namespaces, you completely avoid the issue. But this also has ill side-effects: what if I really wanted to ignore "raw" (non-namespaced) <script> tags but still have HTML <script> tags parsed during extraction?

I think the right way to tackle this is to change your extractor configuration like this:

# Extraction from Genshi HTML templates
[genshi: **.html]
ignore_tags = script style {http://www.w3.org/1999/xhtml}script {http://www.w3.org/1999/xhtml}style
include_attrs = alt title summary

This way you end up with a configuration similar to what is done by default in the trunk (see source:trunk/genshi/filters/i18n.py#L605).

Maybe something could be added to the documentation to emphasize how to declare ignore_tags when using namespaces.
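The trade-off described in this comment can be sketched in a few lines. This is only an illustration using plain strings in the "{namespace}local" notation; it does not use Genshi, and the helper names (split_qname, is_ignored_exact, is_ignored_by_localname) are hypothetical.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"


def split_qname(name):
    """Split '{namespace}local' into (namespace, local).

    The namespace is None for a bare (non-namespaced) name.
    """
    if name.startswith("{"):
        namespace, _, local = name[1:].partition("}")
        return namespace, local
    return None, name


def is_ignored_exact(tag, ignore_tags):
    """Exact match: namespace and local name must both agree."""
    return tag in ignore_tags


def is_ignored_by_localname(tag, ignore_tags):
    """Namespace-insensitive match: only the local name is compared."""
    local = split_qname(tag)[1]
    return local in {split_qname(name)[1] for name in ignore_tags}


tag = "{%s}script" % XHTML_NS
ignore_tags = ["script", "style"]

print(is_ignored_exact(tag, ignore_tags))         # False
print(is_ignored_by_localname(tag, ignore_tags))  # True
```

Matching on the local name alone makes the short configuration work, but it can no longer tell a bare <script> from an XHTML <script>, which is the side effect raised above.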

Changed 3 years ago by Guillaume Pratte <guillaume@…>

I am having some issues with this also.

I tried the configuration suggested in the previous comment:

[genshi: site/**.html]
ignore_tags = script style {http://www.w3.org/1999/xhtml}script {http://www.w3.org/1999/xhtml}style
include_attrs = alt title summary href

However, the "script" tag is still included in the .po file.

I did some debugging. In genshi/filters/i18n.py, class Translator, method __call__, I printed some debugging output:

# ...
            # handle different events that can be localized
            if kind is START:
                tag, attrs = data
                print tag in self.ignore_tags, repr(tag), self.ignore_tags
                if tag in self.ignore_tags or \
                        isinstance(attrs.get(xml_lang), basestring):
                    skip += 1
                    yield kind, data, pos
                    continue

I get this (line split for easier reading):

False
QName('http://www.w3.org/1999/xhtml}script')
[QName('script'), QName('style'), QName('http://www.w3.org/1999/xhtml}script'), QName('http://www.w3.org/1999/xhtml}style')]

So the QName('http://www.w3.org/1999/xhtml}script') in the variable 'tag' does not compare equal to the QName('http://www.w3.org/1999/xhtml}script') in self.ignore_tags, even though both print identically, and the tag does not get ignored...

Changed 3 years ago by Guillaume Pratte <guillaume@…>

This crude patch makes it work for me.

Changed 2 years ago by asuffield@…

QName does not have __eq__ defined, so it falls back on unicode comparison of the underlying string.

(Pdb) str(tag)
'{http://www.w3.org/1999/xhtml}script'
(Pdb) str(itag)
'{{http://www.w3.org/1999/xhtml}script'

Since the underlying string was never normalised (unlike repr), they are not equal.

This proposed extractor config hence will not work:

ignore_tags = script style {http://www.w3.org/1999/xhtml}script

This one, however, works around the bug by specifying the normalised form:

ignore_tags = script style http://www.w3.org/1999/xhtml}script

Please fix this properly by adding the missing methods to QName
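A minimal sketch of the behaviour described in this comment, written for Python 3 with a plain str subclass. BuggyQName and NormalisedQName are hypothetical stand-ins, not Genshi's actual QName code; they only assume what the pdb output above shows, namely that a leading '{' is always prepended to a namespaced name while repr strips it again. The fix shown normalises the input in the constructor, which is one possible approach and not necessarily the one Genshi adopted.

```python
class BuggyQName(str):
    """Hypothetical stand-in: prepends '{' without normalising the input."""

    def __new__(cls, qname):
        parts = qname.lstrip("{").split("}", 1)
        if len(parts) > 1:
            # The original qname is reused, so an input that already
            # starts with '{' ends up stored as '{{namespace}local'.
            self = str.__new__(cls, "{%s" % qname)
            self.namespace, self.localname = parts
        else:
            self = str.__new__(cls, qname)
            self.namespace, self.localname = None, qname
        return self

    def __repr__(self):
        # repr hides the difference by stripping every leading brace.
        return "%s(%r)" % (type(self).__name__, self.lstrip("{"))


class NormalisedQName(BuggyQName):
    """Same stand-in, but leading braces are stripped before storing."""

    def __new__(cls, qname):
        return super().__new__(cls, qname.lstrip("{"))


NAME = "http://www.w3.org/1999/xhtml}script"

parsed = BuggyQName(NAME)            # form seen on the parsed tag
configured = BuggyQName("{" + NAME)  # form produced from the config entry

print(repr(parsed) == repr(configured))  # True
print(parsed == configured)              # False
print(parsed in [configured])            # False

fixed = NormalisedQName("{" + NAME)
print(NormalisedQName(NAME) == fixed)    # True
```

With the underlying string normalised, both spellings of the name compare equal, so either form of the ignore_tags entry would match in this sketch.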
