Edgewall Software

Ticket #580: msgctxt.2.patch

File msgctxt.2.patch, 41.6 KB (added by eric.oconnell@…, 10 years ago)

Updated patch to include docs and changes from review

  • doc/i18n.txt

     
    99localizable strings from templates, as well as a template filter and special
    1010directives that can apply translations to templates as they get rendered.
    1111
    12 This support is based on `gettext`_ message catalogs and the `gettext Python 
     12This support is based on `gettext`_ message catalogs and the `gettext Python
    1313module`_. The extraction process can be used from the API level, or through
    1414the front-ends implemented by the `Babel`_ project, for which Genshi provides
    1515a plugin.
     
    3939However, this approach results in significant “character noise” in templates,
    4040making them harder to read and preview.
    4141
    42 The ``genshi.filters.Translator`` filter allows you to get rid of the 
     42The ``genshi.filters.Translator`` filter allows you to get rid of the
    4343explicit `gettext`_ function calls, so you can (often) just continue to write:
    4444
    4545.. code-block:: genshi
     
    5454          corresponding ``gettext`` function in embedded Python expressions.
    5555
    5656You can control which tags should be ignored by this process; for example, it
    57 doesn't really make sense to translate the content of the HTML 
     57doesn't really make sense to translate the content of the HTML
    5858``<script></script>`` element. Both ``<script>`` and ``<style>`` are excluded
    5959by default.
    6060
    61 Attribute values can also be automatically translated. The default is to 
     61Attribute values can also be automatically translated. The default is to
    6262consider the attributes ``abbr``, ``alt``, ``label``, ``prompt``, ``standby``,
    6363``summary``, and ``title``, which is a list that makes sense for HTML
    6464documents.  Of course, you can tell the translator to use a different set of
     
    7777  <p xml:lang="en">Hello, world!</p>
    7878
    7979On the other hand, if the value of the ``xml:lang`` attribute contains a Python
    80 expression, the element contents and attributes are still considered for 
     80expression, the element contents and attributes are still considered for
    8181automatic translation:
    8282
    8383.. code-block:: genshi
     
    337337  </div>
    338338
    339339
     340``i18n:ctxt``
     341-------------
     342
     343Sometimes a source string can have two different meanings. Without resorting to
     344splitting these two occurrences into different domains, gettext provides a
     345means to specify a *context* for each translatable string. For instance, the
     346word "volunteer" can either mean the noun, one who volunteers, or the verb,
     347to volunteer.
     348
     349The ``i18n:ctxt`` directive allows you to mark a scope with a particular
     350context. Here is a rather contrived example:
     351
     352.. code-block:: genshi
     353
     354  <p>A <span i18n:ctxt="noun">volunteer</span> can really help their community.
     355    Why don't you <span i18n:ctxt="verb">volunteer</span> some time today?
     356  </p>
     357
     358
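As a rough illustration of why a context matters, the sketch below models a
message catalog as a plain dictionary keyed by ``(context, message)`` pairs.
The ``SketchCatalog`` class and the German strings are hypothetical and chosen
only for illustration; this is not how Genshi or gettext store catalogs
internally.

```python
class SketchCatalog(object):
    """Hypothetical dictionary-backed catalog, for illustration only."""

    def __init__(self, messages):
        self._messages = messages

    def gettext(self, message):
        # Context-free lookup; fall back to the source string.
        return self._messages.get(message, message)

    def pgettext(self, context, message):
        # Context-aware lookup keyed by (context, message).
        return self._messages.get((context, message), message)


catalog = SketchCatalog({
    ('noun', 'volunteer'): 'Freiwilliger',
    ('verb', 'volunteer'): 'sich freiwillig melden',
})

noun = catalog.pgettext('noun', 'volunteer')
verb = catalog.pgettext('verb', 'volunteer')
untranslated = catalog.gettext('volunteer')  # no context-free entry exists
```

The same source string resolves to two different translations depending on
the context, and a lookup without a context falls back to the source string.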
    340359Extraction
    341360==========
    342361
    343362The ``Translator`` class provides a class method called ``extract``, which is
    344 a generator yielding all localizable strings found in a template or markup 
     363a generator yielding all localizable strings found in a template or markup
    345364stream. This includes both literal strings in text nodes and attribute values,
    346365as well as strings in ``gettext()`` calls in embedded Python code. See the API
    347366documentation for details on how to use this method directly.
     
    351370-----------------
    352371
    353372This functionality is integrated with the message extraction framework provided
    354 by the `Babel`_ project. Babel provides a command-line interface as well as 
    355 commands that can be used from ``setup.py`` scripts using `Setuptools`_ or 
     373by the `Babel`_ project. Babel provides a command-line interface as well as
     374commands that can be used from ``setup.py`` scripts using `Setuptools`_ or
    356375`Distutils`_.
    357376
    358377.. _`setuptools`: http://peak.telecommunity.com/DevCenter/setuptools
    359378.. _`distutils`: http://docs.python.org/dist/dist.html
    360379
    361 The first thing you need to do to make Babel extract messages from Genshi 
     380The first thing you need to do to make Babel extract messages from Genshi
    362381templates is to let Babel know which files are Genshi templates. This is done
    363382using a “mapping configuration”, which can be stored in a configuration file,
    364383or specified directly in your ``setup.py``.
     
    407426
    408427``include_attrs``
    409428-----------------
    410 Comma-separated list of attribute names that should be considered to have 
     429Comma-separated list of attribute names that should be considered to have
    411430localizable values. Only used for markup templates.
    412431
    413432``ignore_tags``
    414433---------------
    415 Comma-separated list of tag names that should be ignored. Only used for markup 
     434Comma-separated list of tag names that should be ignored. Only used for markup
    416435templates.
    417436
    418437``extract_text``
    419438----------------
    420439Whether text outside explicit ``gettext`` function calls should be extracted.
    421440By default, any text nodes not inside ignored tags, and values of attributes in
    422 the ``include_attrs`` list are extracted. If this option is disabled, only 
     441the ``include_attrs`` list are extracted. If this option is disabled, only
    423442strings in ``gettext`` function calls are extracted.
    424443
    425444.. note:: If you disable this option, and do not make use of the
     
    446465
    447466  from genshi.filters import Translator
    448467  from genshi.template import MarkupTemplate
    449  
     468
    450469  template = MarkupTemplate("...")
    451470  template.filters.insert(0, Translator(translations.ugettext))
    452471
     
    457476
    458477  from genshi.filters import Translator
    459478  from genshi.template import MarkupTemplate
    460  
     479
    461480  template = MarkupTemplate("...")
    462481  translator = Translator(translations.ugettext)
    463482  translator.setup(template)
     
    473492Related Considerations
    474493======================
    475494
    476 If you intend to produce an application that is fully prepared for an 
     495If you intend to produce an application that is fully prepared for an
    477496international audience, there are a couple of other things to keep in mind:
    478497
    479498-------
     
    482501
    483502Use ``unicode`` internally, not encoded bytestrings. Only encode/decode where
    484503data enters or exits the system. This means that your code works with characters
    485 and not just with bytes, which is an important distinction for example when 
     504and not just with bytes, which is an important distinction for example when
    486505calculating the length of a piece of text. When you need to decode/encode, it's
    487506probably a good idea to use UTF-8.
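
A small sketch of the character/byte distinction, written for Python 3, where
``str`` is the Unicode type (the text above uses the Python 2 name
``unicode``). The sample string is arbitrary.

```python
text = u"Gr\u00fc\u00dfe"          # "Grüße": five characters
encoded = text.encode("utf-8")     # encode only where data leaves the system

char_count = len(text)             # counts characters
byte_count = len(encoded)          # counts bytes; ü and ß take two bytes each

decoded = encoded.decode("utf-8")  # decode where data enters the system
```

Here ``char_count`` is 5 while ``byte_count`` is 7, which is why length
calculations should be done on the decoded text.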
    488507
     
    490509Date and Time
    491510-------------
    492511
    493 If your application uses datetime information that should be displayed to users 
    494 in different timezones, you should try to work with UTC (universal time) 
    495 internally. Do the conversion from and to "local time" when the data enters or 
    496 exits the system. Make use the Python `datetime`_ module and the third-party 
     512If your application uses datetime information that should be displayed to users
     513in different timezones, you should try to work with UTC (universal time)
     514internally. Do the conversion from and to "local time" when the data enters or
     515exits the system. Make use of the Python `datetime`_ module and the third-party
    497516`pytz`_ package.
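
A minimal sketch of that pattern, written for Python 3 and using only the
standard ``datetime`` module. The fixed +02:00 offset is an assumption standing
in for a user's zone; a real application would more likely look up named zones
(for example with the pytz package) so that daylight-saving rules are applied.

```python
from datetime import datetime, timedelta, timezone

# Assumed user offset for illustration (no daylight-saving rules).
user_tz = timezone(timedelta(hours=2))

# Data enters the system as local time: convert to UTC immediately.
local_input = datetime(2010, 6, 1, 14, 30, tzinfo=user_tz)
stored = local_input.astimezone(timezone.utc)

# Data leaves the system: convert back to the user's local time.
displayed = stored.astimezone(user_tz)
```

Only ``stored`` would be persisted or compared internally; ``displayed`` is
derived at the point of output.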
    498517
    499518--------------------------
    500519Formatting and Locale Data
    501520--------------------------
    502521
    503 Make sure you check out the functionality provided by the `Babel`_ project for 
     522Make sure you check out the functionality provided by the `Babel`_ project for
    504523things like number and date formatting, locale display strings, etc.
    505524
    506525.. _`datetime`: http://docs.python.org/lib/module-datetime.html
  • genshi/filters/tests/i18n.py

     
    6262            else:
    6363                return msgid2
    6464
    65     def dungettext(self, domain, singular, plural, numeral):
    66         return self._domain_call('ungettext', domain, singular, plural, numeral)
     65    def dungettext(self, domain, msgid1, msgid2, n):
     66        return self._domain_call('ungettext', domain, msgid1, msgid2, n)
    6767
     68    def upgettext(self, context, message):
     69        try:
     70            return self._catalog[(context, message)]
     71        except KeyError:
     72            if self._fallback:
     73                return self._fallback.upgettext(context, message)
     74            return unicode(message)
    6875
     76    def dupgettext(self, domain, context, message):
     77        return self._domain_call('upgettext', domain, context, message)
     78
     79    def unpgettext(self, context, msgid1, msgid2, n):
     80        try:
     81            return self._catalog[(context, msgid1, self.plural(n))]
     82        except KeyError:
     83            if self._fallback:
     84                return self._fallback.unpgettext(context, msgid1, msgid2, n)
     85            if n == 1:
     86                return msgid1
     87            else:
     88                return msgid2
     89
     90    def dunpgettext(self, domain, context, msgid1, msgid2, n):
     91        return self._domain_call('unpgettext', domain, context, msgid1, msgid2, n)
     92
     93
    6994class TranslatorTestCase(unittest.TestCase):
    7095
    7196    def test_translate_included_attribute_text(self):
     
    14171442            <p>Vohs John Doe</p>
    14181443          </div>
    14191444        </html>""", tmpl.generate(two=2, fname='John', lname='Doe').render())
    1420        
     1445
    14211446    def test_translate_i18n_choose_and_singular_with_py_strip(self):
    14221447        tmpl = MarkupTemplate("""<html xmlns:py="http://genshi.edgewall.org/"
    14231448            xmlns:i18n="http://genshi.edgewall.org/i18n">
     
    14471472          </div>
    14481473        </html>""", tmpl.generate(
    14491474            one=1, two=2, fname='John',lname='Doe').render())
    1450        
     1475
    14511476    def test_translate_i18n_choose_and_plural_with_py_strip(self):
    14521477        tmpl = MarkupTemplate("""<html xmlns:py="http://genshi.edgewall.org/"
    14531478            xmlns:i18n="http://genshi.edgewall.org/i18n">
     
    19651990            (34, '_', 'Update', [])], messages)
    19661991
    19671992
     1993class ContextDirectiveTestCase(unittest.TestCase):
     1994    def test_extract_msgcontext(self):
     1995        buf = StringIO("""<html xmlns:py="http://genshi.edgewall.org/"
     1996                                xmlns:i18n="http://genshi.edgewall.org/i18n">
     1997          <p i18n:ctxt="foo">Foo, bar.</p>
     1998          <p>Foo, bar.</p>
     1999        </html>""")
     2000        results = list(extract(buf, ['_'], [], {}))
     2001        self.assertEqual((3, 'pgettext', ('foo', 'Foo, bar.'), []), results[0])
     2002        self.assertEqual((4, None, 'Foo, bar.', []), results[1])
     2003
     2004    def test_translate_msgcontext(self):
     2005        tmpl = MarkupTemplate("""<html xmlns:py="http://genshi.edgewall.org/"
     2006            xmlns:i18n="http://genshi.edgewall.org/i18n">
     2007          <p i18n:ctxt="foo">Foo, bar.</p>
     2008          <p>Foo, bar.</p>
     2009        </html>""")
     2010        translations = {
     2011            ('foo', 'Foo, bar.'): 'Fooo! Barrr!',
     2012            'Foo, bar.': 'Foo --- bar.'
     2013        }
     2014        translator = Translator(DummyTranslations(translations))
     2015        translator.setup(tmpl)
     2016        self.assertEqual("""<html>
     2017          <p>Fooo! Barrr!</p>
     2018          <p>Foo --- bar.</p>
     2019        </html>""", tmpl.generate().render())
     2020
     2021    def test_translate_msgcontext_with_domain(self):
     2022        tmpl = MarkupTemplate("""<html xmlns:py="http://genshi.edgewall.org/"
     2023            xmlns:i18n="http://genshi.edgewall.org/i18n">
     2024          <p i18n:domain="bar" i18n:ctxt="foo">Foo, bar. <span>foo</span></p>
     2025          <p>Foo, bar.</p>
     2026        </html>""")
     2027        translations = DummyTranslations({
     2028            ('foo', 'Foo, bar.'): 'Fooo! Barrr!',
     2029            'Foo, bar.': 'Foo --- bar.'
     2030        })
     2031        translations.add_domain('bar', {
     2032            ('foo', 'foo'): 'BARRR',
     2033            ('foo', 'Foo, bar.'): 'Bar, bar.'
     2034        })
     2035
     2036        translator = Translator(translations)
     2037        translator.setup(tmpl)
     2038        self.assertEqual("""<html>
     2039          <p>Bar, bar. <span>BARRR</span></p>
     2040          <p>Foo --- bar.</p>
     2041        </html>""", tmpl.generate().render())
     2042
     2043    def test_translate_msgcontext_with_plurals(self):
     2044        tmpl = MarkupTemplate("""<html xmlns:py="http://genshi.edgewall.org/"
     2045            xmlns:i18n="http://genshi.edgewall.org/i18n">
     2046        <i18n:ctxt name="foo">
     2047          <p i18n:choose="num; num">
     2048            <span i18n:singular="">There is ${num} bar</span>
     2049            <span i18n:plural="">There are ${num} bars</span>
     2050          </p>
     2051        </i18n:ctxt>
     2052        </html>""")
     2053        translations = DummyTranslations({
     2054            ('foo', 'There is %(num)s bar', 0): 'Hay %(num)s barre',
     2055            ('foo', 'There is %(num)s bar', 1): 'Hay %(num)s barres'
     2056        })
     2057
     2058        translator = Translator(translations)
     2059        translator.setup(tmpl)
     2060        self.assertEqual("""<html>
     2061          <p>
     2062            <span>Hay 1 barre</span>
     2063          </p>
     2064        </html>""", tmpl.generate(num=1).render())
     2065        self.assertEqual("""<html>
     2066          <p>
     2067            <span>Hay 2 barres</span>
     2068          </p>
     2069        </html>""", tmpl.generate(num=2).render())
     2070
     2071    def test_translate_context_with_msg(self):
     2072        tmpl = MarkupTemplate("""<html xmlns:py="http://genshi.edgewall.org/"
     2073            xmlns:i18n="http://genshi.edgewall.org/i18n">
     2074        <p i18n:ctxt="foo" i18n:msg="num">
     2075          Foo <span>There is ${num} bar</span> Bar
     2076        </p>
     2077        </html>""")
     2078        translations = DummyTranslations({
     2079            ('foo', 'Foo [1:There is %(num)s bar] Bar'):
     2080            'Voh [1:Hay %(num)s barre] Barre'
     2081        })
     2082        translator = Translator(translations)
     2083        translator.setup(tmpl)
     2084        self.assertEqual("""<html>
     2085        <p>Voh <span>Hay 1 barre</span> Barre</p>
     2086        </html>""", tmpl.generate(num=1).render())
     2087
     2088
    19682089def suite():
    19692090    suite = unittest.TestSuite()
    19702091    suite.addTest(doctest.DocTestSuite(Translator.__module__))
     
    19732094    suite.addTest(unittest.makeSuite(ChooseDirectiveTestCase, 'test'))
    19742095    suite.addTest(unittest.makeSuite(DomainDirectiveTestCase, 'test'))
    19752096    suite.addTest(unittest.makeSuite(ExtractTestCase, 'test'))
     2097    suite.addTest(unittest.makeSuite(ContextDirectiveTestCase, 'test'))
    19762098    return suite
    19772099
    19782100if __name__ == '__main__':
  • genshi/filters/i18n.py

     
    2222    any
    2323except NameError:
    2424    from genshi.util import any
     25from functools import partial
    2526from gettext import NullTranslations
    2627import os
    2728import re
     
    5960    """Simple interface for directives to support messages extraction."""
    6061
    6162    def extract(self, translator, stream, gettext_functions=GETTEXT_FUNCTIONS,
    62                 search_text=True, comment_stack=None):
     63                search_text=True, comment_stack=None, context_stack=None):
    6364        raise NotImplementedError
    6465
    6566
     67contexted = {
     68    None: 'pgettext',
     69    'gettext': 'pgettext',
     70    'ngettext': 'npgettext',
     71    'dgettext': 'dpgettext',
     72    'dngettext': 'dnpgettext'
     73}
     74
     75
     76def contextify(line, func, msg, comment, context):
     77    if context:
     78        context = context[0]
     79        func = contexted.get(func)
     80        if func is None:
     81            raise Exception("failure, bogus extraction method")
     82        if isinstance(msg, tuple):
     83            msg = (context, msg[0], msg[1])
     84        else:
     85            msg = (context, msg)
     86    return line, func, msg, comment
     87
     88
    6689class CommentDirective(I18NDirective):
    6790    """Implementation of the ``i18n:comment`` template directive which adds
    6891    translation comments.
    69    
     92
    7093    >>> tmpl = MarkupTemplate('''<html xmlns:i18n="http://genshi.edgewall.org/i18n">
    7194    ...   <p i18n:comment="As in Foo Bar">Foo</p>
    7295    ... </html>''')
     
    86109class MsgDirective(ExtractableI18NDirective):
    87110    r"""Implementation of the ``i18n:msg`` directive which marks inner content
    88111    as translatable. Consider the following examples:
    89    
     112
    90113    >>> tmpl = MarkupTemplate('''<html xmlns:i18n="http://genshi.edgewall.org/i18n">
    91114    ...   <div i18n:msg="">
    92115    ...     <p>Foo</p>
     
    94117    ...   </div>
    95118    ...   <p i18n:msg="">Foo <em>bar</em>!</p>
    96119    ... </html>''')
    97    
     120
    98121    >>> translator = Translator()
    99122    >>> translator.setup(tmpl)
    100123    >>> list(translator.extract(tmpl.stream))
     
    154177
    155178    def __call__(self, stream, directives, ctxt, **vars):
    156179        gettext = ctxt.get('_i18n.gettext')
    157         if ctxt.get('_i18n.domain'):
     180        if ctxt.get('_i18n.domain') and ctxt.get('_i18n.context'):
     181            dpgettext = ctxt.get('_i18n.dpgettext')
     182            assert hasattr(dpgettext, '__call__'), \
     183                'No domain/context gettext function passed'
     184            gettext = lambda msg: dpgettext(ctxt.get('_i18n.domain'),
     185                                            ctxt.get('_i18n.context'),
     186                                            msg)
     187        elif ctxt.get('_i18n.domain'):
    158188            dgettext = ctxt.get('_i18n.dgettext')
    159189            assert hasattr(dgettext, '__call__'), \
    160190                'No domain gettext function passed'
    161191            gettext = lambda msg: dgettext(ctxt.get('_i18n.domain'), msg)
     192        elif ctxt.get('_i18n.context'):
     193            pgettext = ctxt.get('_i18n.pgettext')
     194            assert hasattr(pgettext, '__call__'), \
     195                'No context gettext function passed'
     196            gettext = lambda msg: pgettext(ctxt.get('_i18n.context'), msg)
    162197
    163198        def _generate():
    164199            msgbuf = MessageBuffer(self)
     
    182217        return _apply_directives(_generate(), directives, ctxt, vars)
    183218
    184219    def extract(self, translator, stream, gettext_functions=GETTEXT_FUNCTIONS,
    185                 search_text=True, comment_stack=None):
     220                search_text=True, comment_stack=None, context_stack=None):
    186221        msgbuf = MessageBuffer(self)
    187222        strip = False
    188223
     
    206241        if not strip:
    207242            msgbuf.append(*previous)
    208243
    209         yield self.lineno, None, msgbuf.format(), comment_stack[-1:]
     244        yield contextify(
     245            self.lineno, None, msgbuf.format(), comment_stack[-1:], context_stack[-1:])
    210246
    211247
    212248class ChooseBranchDirective(I18NDirective):
     
    243279        ctxt['_i18n.choose.%s' % self.tagname] = msgbuf
    244280
    245281    def extract(self, translator, stream, gettext_functions=GETTEXT_FUNCTIONS,
    246                 search_text=True, comment_stack=None, msgbuf=None):
     282                search_text=True, comment_stack=None, context_stack=None,
     283                msgbuf=None):
    247284        stream = iter(stream)
    248285        previous = stream.next()
    249286
     
    281318class ChooseDirective(ExtractableI18NDirective):
    282319    """Implementation of the ``i18n:choose`` directive which provides plural
    283320    internationalisation of strings.
    284    
     321
    285322    This directive requires at least one parameter, the one which evaluates to
    286323    an integer used to choose between the singular and plural forms. If you also
    287324    have expressions inside the singular and plural version of the string you
    288325    also need to pass a name for those parameters. Consider the following
    289326    examples:
    290    
     327
    291328    >>> tmpl = MarkupTemplate('''\
    292329        <html xmlns:i18n="http://genshi.edgewall.org/i18n">
    293330    ...   <div i18n:choose="num; num">
     
    364401
    365402        ngettext = ctxt.get('_i18n.ngettext')
    366403        assert hasattr(ngettext, '__call__'), 'No ngettext function available'
     404        npgettext = ctxt.get('_i18n.npgettext')
     405        if not npgettext:
     406            npgettext = lambda c, s, p, n: ngettext(s, p, n)
    367407        dngettext = ctxt.get('_i18n.dngettext')
    368408        if not dngettext:
    369409            dngettext = lambda d, s, p, n: ngettext(s, p, n)
     410        dnpgettext = ctxt.get('_i18n.dnpgettext')
     411        if not dnpgettext:
     412            dnpgettext = lambda d, c, s, p, n: dngettext(d, s, p, n)
    370413
    371414        new_stream = []
    372415        singular_stream = None
     
    397440            else:
    398441                new_stream.append(event)
    399442
    400         if ctxt.get('_i18n.domain'):
     443        if ctxt.get('_i18n.context') and ctxt.get('_i18n.domain'):
     444            ngettext = lambda s, p, n: dnpgettext(ctxt.get('_i18n.domain'),
     445                                                  ctxt.get('_i18n.context'),
     446                                                  s, p, n)
     447        elif ctxt.get('_i18n.context'):
     448            ngettext = lambda s, p, n: npgettext(ctxt.get('_i18n.context'),
     449                                                 s, p, n)
     450        elif ctxt.get('_i18n.domain'):
    401451            ngettext = lambda s, p, n: dngettext(ctxt.get('_i18n.domain'),
    402452                                                 s, p, n)
    403453
     
    426476        ctxt.pop()
    427477
    428478    def extract(self, translator, stream, gettext_functions=GETTEXT_FUNCTIONS,
    429                 search_text=True, comment_stack=None):
     479                search_text=True, comment_stack=None, context_stack=None):
    430480        strip = False
    431481        stream = iter(stream)
    432482        previous = stream.next()
     
    450500                    if isinstance(directive, SingularDirective):
    451501                        for message in directive.extract(translator,
    452502                                substream, gettext_functions, search_text,
    453                                 comment_stack, msgbuf=singular_msgbuf):
     503                                comment_stack, context_stack, msgbuf=singular_msgbuf):
    454504                            yield message
    455505                    elif isinstance(directive, PluralDirective):
    456506                        for message in directive.extract(translator,
    457507                                substream, gettext_functions, search_text,
    458                                 comment_stack, msgbuf=plural_msgbuf):
     508                                comment_stack, context_stack, msgbuf=plural_msgbuf):
    459509                            yield message
    460510                    elif not isinstance(directive, StripDirective):
    461511                        singular_msgbuf.append(*previous)
     
    474524            singular_msgbuf.append(*previous)
    475525            plural_msgbuf.append(*previous)
    476526
    477         yield self.lineno, 'ngettext', \
     527        yield contextify(self.lineno, 'ngettext', \
    478528            (singular_msgbuf.format(), plural_msgbuf.format()), \
    479             comment_stack[-1:]
     529                         comment_stack[-1:], context_stack[-1:])
    480530
    481531    def _is_plural(self, numeral, ngettext):
    482532        # XXX: should we test which form was chosen like this!?!?!?
     
    490540class DomainDirective(I18NDirective):
    491541    """Implementation of the ``i18n:domain`` directive which allows choosing
    492542    another i18n domain (catalog) to translate from.
    493    
     543
    494544    >>> from genshi.filters.tests.i18n import DummyTranslations
    495545    >>> tmpl = MarkupTemplate('''\
    496546        <html xmlns:i18n="http://genshi.edgewall.org/i18n">
     
    543593        ctxt.pop()
    544594
    545595
     596class ContextDirective(I18NDirective):
     597    __slots__ = ['context']
     598
     599    def __init__(self, value, template=None, namespaces=None, lineno=-1,
     600                 offset=-1):
     601        Directive.__init__(self, None, template, namespaces, lineno, offset)
     602        self.context = value
     603
     604    @classmethod
     605    def attach(cls, template, stream, value, namespaces, pos):
     606        if type(value) is dict:
     607            value = value.get('name')
     608        return super(ContextDirective, cls).attach(template, stream, value,
     609                                                   namespaces, pos)
     610
     611    def __call__(self, stream, directives, ctxt, **vars):
     612        ctxt.push({'_i18n.context': self.context})
     613        for event in _apply_directives(stream, directives, ctxt, vars):
     614            yield event
     615        ctxt.pop()
     616
     617
    546618class Translator(DirectiveFactory):
    547619    """Can extract and translate localizable strings from markup streams and
    548620    templates.
    549    
     621
    550622    For example, assume the following template:
    551    
     623
    552624    >>> tmpl = MarkupTemplate('''<html xmlns:py="http://genshi.edgewall.org/">
    553625    ...   <head>
    554626    ...     <title>Example</title>
     
    558630    ...     <p>${_("Hello, %(name)s") % dict(name=username)}</p>
    559631    ...   </body>
    560632    ... </html>''', filename='example.html')
    561    
     633
    562634    For demonstration, we define a dummy ``gettext``-style function with a
    563635    hard-coded translation table, and pass that to the `Translator` initializer:
    564    
     636
    565637    >>> def pseudo_gettext(string):
    566638    ...     return {
    567639    ...         'Example': 'Beispiel',
    568640    ...         'Hello, %(name)s': 'Hallo, %(name)s'
    569641    ...     }[string]
    570642    >>> translator = Translator(pseudo_gettext)
    571    
     643
    572644    Next, the translator needs to be prepended to any already defined filters
    573645    on the template:
    574    
     646
    575647    >>> tmpl.filters.insert(0, translator)
    576    
     648
    577649    When generating the template output, our hard-coded translations should be
    578650    applied as expected:
    579    
     651
    580652    >>> print(tmpl.generate(username='Hans', _=pseudo_gettext))
    581653    <html>
    582654      <head>
     
    587659        <p>Hallo, Hans</p>
    588660      </body>
    589661    </html>
    590    
     662
    591663    Note that elements defining ``xml:lang`` attributes that do not contain
    592664    variable expressions are ignored by this filter. That can be used to
    593665    exclude specific parts of a template from being extracted and translated.
     
    596668    directives = [
    597669        ('domain', DomainDirective),
    598670        ('comment', CommentDirective),
     671        ('ctxt', ContextDirective),
    599672        ('msg', MsgDirective),
    600673        ('choose', ChooseDirective),
    601674        ('singular', SingularDirective),
     
    614687    def __init__(self, translate=NullTranslations(), ignore_tags=IGNORE_TAGS,
    615688                 include_attrs=INCLUDE_ATTRS, extract_text=True):
    616689        """Initialize the translator.
    617        
     690
    618691        :param translate: the translation function, for example ``gettext`` or
    619692                          ``ugettext``.
    620693        :param ignore_tags: a set of tag names that should not be localized
     
    622695        :param extract_text: whether the content of text nodes should be
    623696                             extracted, or only text in explicit ``gettext``
    624697                             function calls
    625        
     698
    626699        :note: Changed in 0.6: the `translate` parameter can now be either
    627700               a ``gettext``-style function, or an object compatible with the
    628701               ``NullTranslations`` or ``GNUTranslations`` interface
     
    635708    def __call__(self, stream, ctxt=None, translate_text=True,
    636709                 translate_attrs=True):
    637710        """Translate any localizable strings in the given stream.
    638        
     711
    639712        This function shouldn't be called directly. Instead, an instance of
    640713        the `Translator` class should be registered as a filter with the
    641714        `Template` or the `TemplateLoader`, or applied as a regular stream
    642715        filter. If used as a template filter, it should be inserted in front of
    643716        all the default filters.
    644        
     717
    645718        :param stream: the markup event stream
    646719        :param ctxt: the template context (not used)
    647720        :param translate_text: whether text nodes should be translated (used
     
    671744            except AttributeError:
    672745                dgettext = lambda _, y: gettext(y)
    673746                dngettext = lambda _, s, p, n: ngettext(s, p, n)
     747            try:
     748                pgettext = self.translate.upgettext
     749                dpgettext = self.translate.dupgettext
     750                npgettext = self.translate.unpgettext
     751                dnpgettext = self.translate.dunpgettext
     752            except AttributeError:
     753                pgettext = lambda _, y: gettext(y)
     754                dpgettext = lambda d, _, y: dgettext(d, y)
     755                npgettext = lambda _, s, p, n: ngettext(s, p, n)
     756                dnpgettext = lambda d, _, s, p, n: dngettext(d, s, p, n)
     757
    674758            if ctxt:
    675759                ctxt['_i18n.gettext'] = gettext
    676760                ctxt['_i18n.ngettext'] = ngettext
    677761                ctxt['_i18n.dgettext'] = dgettext
    678762                ctxt['_i18n.dngettext'] = dngettext
     763                ctxt['_i18n.pgettext'] = pgettext
     764                ctxt['_i18n.npgettext'] = npgettext
     765                ctxt['_i18n.dpgettext'] = dpgettext
     766                ctxt['_i18n.dnpgettext'] = dnpgettext
    679767
    680768        if ctxt and ctxt.get('_i18n.domain'):
    681             gettext = lambda msg: dgettext(ctxt.get('_i18n.domain'), msg)
     769            gettext = partial(dgettext, ctxt.get('_i18n.domain'))
    682770
     771        if ctxt and ctxt.get('_i18n.context'):
     772            if getattr(gettext, 'func', None):
     773                gettext = partial(dpgettext,
     774                                  ctxt['_i18n.domain'],
     775                                  ctxt['_i18n.context'])
     776            else:
     777                gettext = partial(pgettext, ctxt['_i18n.context'])
     778
    683779        for kind, data, pos in stream:
    684780
    685781            # skip chunks that should not be localized
     
    730826            elif kind is SUB:
    731827                directives, substream = data
    732828                current_domain = None
     829                current_context = None
    733830                for idx, directive in enumerate(directives):
    734831                    # Organize directives to make everything work
    735832                    # FIXME: There's got to be a better way to do this!
     
    740837                        # Put domain directive as the first one in order to
    741838                        # update context before any other directives evaluation
    742839                        directives.insert(0, directives.pop(idx))
     840                    if isinstance(directive, ContextDirective):
     841                        # Grab current (msg)context and update context
     842                        current_context = directive.context
     843                        ctxt.push({'_i18n.context': current_context})
     844                        # Put context directive either first in the case of
     845                        # no domain, or 2nd in the case there is a domain, to
     846                        # update context before any other directives evaluation
     847                        directives.insert(1 if current_domain else 0,
     848                                          directives.pop(idx))
    743849
    744850                # If this is an i18n directive, no need to translate text
    745851                # nodes here
     
    747853                    isinstance(d, ExtractableI18NDirective)
    748854                    for d in directives
    749855                ])
     856
    750857                substream = list(self(substream, ctxt,
    751858                                      translate_text=not is_i18n_directive,
    752859                                      translate_attrs=translate_attrs))
     
    754861
    755862                if current_domain:
    756863                    ctxt.pop()
     864                if current_context:
     865                    ctxt.pop()
    757866            else:
    758867                yield kind, data, pos
    759868
    760869    def extract(self, stream, gettext_functions=GETTEXT_FUNCTIONS,
    761                 search_text=True, comment_stack=None):
     870                search_text=True, comment_stack=None, context_stack=None):
    762871        """Extract localizable strings from the given template stream.
    763        
     872
    764873        For every string found, this function yields a ``(lineno, function,
    765874        message, comments)`` tuple, where:
    766        
     875
    767876        * ``lineno`` is the number of the line on which the string was found,
    768877        * ``function`` is the name of the ``gettext`` function used (if the
    769878          string was extracted from embedded Python code), and
     
    772881           arguments).
    773882        *  ``comments`` is a list of comments related to the message, extracted
    774883           from ``i18n:comment`` attributes found in the markup
    775        
     884
    776885        >>> tmpl = MarkupTemplate('''<html xmlns:py="http://genshi.edgewall.org/">
    777886        ...   <head>
    778887        ...     <title>Example</title>
     
    789898        6, None, u'Example'
    790899        7, '_', u'Hello, %(name)s'
    791900        8, 'ngettext', (u'You have %d item', u'You have %d items', None)
    792        
     901
    793902        :param stream: the event stream to extract strings from; can be a
    794903                       regular stream or a template stream
    795904        :param gettext_functions: a sequence of function names that should be
     
    797906                                  functions
    798907        :param search_text: whether the content of text nodes should be
    799908                            extracted (used internally)
    800        
     909
    801910        :note: Changed in 0.4.1: For a function with multiple string arguments
    802911               (such as ``ngettext``), a single item with a tuple of strings is
    803912               yielded, instead of an item for each string argument.
     
    808917            search_text = False
    809918        if comment_stack is None:
    810919            comment_stack = []
     920        if context_stack is None:
     921            context_stack = []
    811922        skip = 0
    812923
    813924        xml_lang = XML_NAMESPACE['lang']
     
    834945            elif not skip and search_text and kind is TEXT:
    835946                text = data.strip()
    836947                if text and [ch for ch in text if ch.isalpha()]:
    837                     yield pos[1], None, text, comment_stack[-1:]
     948                    yield contextify(pos[1], None, text, comment_stack[-1:],
     949                                     context_stack[-1:])
    838950
    839951            elif kind is EXPR or kind is EXEC:
    840952                for funcname, strings in extract_from_code(data,
     
    845957            elif kind is SUB:
    846958                directives, substream = data
    847959                in_comment = False
     960                in_context = False
    848961
    849962                for idx, directive in enumerate(directives):
    850963                    # Do a first loop to see if there's a comment directive
     
    858971                            for message in self.extract(
    859972                                    substream, gettext_functions,
    860973                                    search_text=search_text and not skip,
    861                                     comment_stack=comment_stack):
     974                                    comment_stack=comment_stack,
     975                                    context_stack=context_stack):
    862976                                yield message
    863977                        directives.pop(idx)
     978                    elif isinstance(directive, ContextDirective):
     979                        in_context = True
     980                        context_stack.append(directive.context)
     981                        if len(directives) == 1:
     982                            for message in self.extract(
     983                                    substream, gettext_functions,
     984                                    search_text=search_text and not skip,
     985                                    comment_stack=comment_stack,
     986                                    context_stack=context_stack):
     987                                yield message
     988                        directives.pop(idx)
    864989                    elif not isinstance(directive, I18NDirective):
    865990                        # Remove all other non i18n directives from the process
    866991                        directives.pop(idx)
    867992
    868                 if not directives and not in_comment:
     993                if not directives and not in_comment and not in_context:
    869994                    # Extract content if there's no directives because
    870995                    # strip was pop'ed and not because comment was pop'ed.
    871996                    # Extraction in this case has been taken care of.
     
    8791004                        for message in directive.extract(self,
    8801005                                substream, gettext_functions,
    8811006                                search_text=search_text and not skip,
    882                                 comment_stack=comment_stack):
     1007                                comment_stack=comment_stack,
     1008                                context_stack=context_stack):
    8831009                            yield message
    8841010                    else:
    8851011                        for message in self.extract(
    8861012                                substream, gettext_functions,
    8871013                                search_text=search_text and not skip,
    888                                 comment_stack=comment_stack):
     1014                                comment_stack=comment_stack,
     1015                                context_stack=context_stack):
    8891016                            yield message
    8901017
    8911018                if in_comment:
    8921019                    comment_stack.pop()
    8931020
     1021                if in_context:
     1022                    context_stack.pop()
     1023
    8941024    def get_directive_index(self, dir_cls):
    8951025        total = len(self._dir_order)
    8961026        if dir_cls in self._dir_order:
     
    9001030    def setup(self, template):
    9011031        """Convenience function to register the `Translator` filter and the
    9021032        related directives with the given template.
    903        
     1033
    9041034        :param template: a `Template` instance
    9051035        """
    9061036        template.filters.insert(0, self)
     
    9221052
    9231053class MessageBuffer(object):
    9241054    """Helper class for managing internationalized mixed content.
    925    
     1055
    9261056    :since: version 0.5
    9271057    """
    9281058
    9291059    def __init__(self, directive=None):
    9301060        """Initialize the message buffer.
    931        
     1061
    9321062        :param directive: the directive owning the buffer
    9331063        :type directive: I18NDirective
    9341064        """
     
    9551085
    9561086    def append(self, kind, data, pos):
    9571087        """Append a stream event to the buffer.
    958        
     1088
    9591089        :param kind: the stream event kind
    9601090        :param data: the event data
    9611091        :param pos: the position of the event in the source
     
    9871117                    params = "(%s)" % params
    9881118                raise IndexError("%d parameters%s given to 'i18n:%s' but "
    9891119                                 "%d or more expressions used in '%s', line %s"
    990                                  % (len(self.orig_params), params, 
     1120                                 % (len(self.orig_params), params,
    9911121                                    self.directive.tagname,
    9921122                                    len(self.orig_params) + 1,
    9931123                                    os.path.basename(pos[0] or
     
    9971127            self._add_event(self.stack[-1], (kind, data, pos))
    9981128            self.values[param] = (kind, data, pos)
    9991129        else:
    1000             if kind is START: 
     1130            if kind is START:
    10011131                self.string.append('[%d:' % self.order)
    10021132                self.stack.append(self.order)
    10031133                self._add_event(self.stack[-1], (kind, data, pos))
     
    10191149    def translate(self, string, regex=re.compile(r'%\((\w+)\)s')):
    10201150        """Interpolate the given message translation with the events in the
    10211151        buffer and return the translated stream.
    1022        
     1152
    10231153        :param string: the translated message string
    10241154        """
    10251155        substream = None
     
    11081238def parse_msg(string, regex=re.compile(r'(?:\[(\d+)\:)|(?<!\\)\]')):
    11091239    """Parse a translated message using Genshi mixed content message
    11101240    formatting.
    1111    
     1241
    11121242    >>> parse_msg("See [1:Help].")
    11131243    [(0, 'See '), (1, 'Help'), (0, '.')]
    1114    
     1244
    11151245    >>> parse_msg("See [1:our [2:Help] page] for details.")
    11161246    [(0, 'See '), (1, 'our '), (2, 'Help'), (1, ' page'), (0, ' for details.')]
    1117    
     1247
    11181248    >>> parse_msg("[2:Details] finden Sie in [1:Hilfe].")
    11191249    [(2, 'Details'), (0, ' finden Sie in '), (1, 'Hilfe'), (0, '.')]
    1120    
     1250
    11211251    >>> parse_msg("[1:] Bilder pro Seite anzeigen.")
    11221252    [(1, ''), (0, ' Bilder pro Seite anzeigen.')]
    1123    
     1253
    11241254    :param string: the translated message string
    11251255    :return: a list of ``(order, string)`` tuples
    11261256    :rtype: `list`
     
    11521282
    11531283def extract_from_code(code, gettext_functions):
    11541284    """Extract strings from Python bytecode.
    1155    
     1285
    11561286    >>> from genshi.template.eval import Expression
    11571287    >>> expr = Expression('_("Hello")')
    11581288    >>> list(extract_from_code(expr, GETTEXT_FUNCTIONS))
    11591289    [('_', u'Hello')]
    1160    
     1290
    11611291    >>> expr = Expression('ngettext("You have %(num)s item", '
    11621292    ...                            '"You have %(num)s items", num)')
    11631293    >>> list(extract_from_code(expr, GETTEXT_FUNCTIONS))
    11641294    [('ngettext', (u'You have %(num)s item', u'You have %(num)s items', None))]
    1165    
     1295
    11661296    :param code: the `Code` object
    11671297    :type code: `genshi.template.eval.Code`
    11681298    :param gettext_functions: a sequence of function names
     
    12021332
    12031333def extract(fileobj, keywords, comment_tags, options):
    12041334    """Babel extraction method for Genshi templates.
    1205    
     1335
    12061336    :param fileobj: the file-like object the messages should be extracted from
    12071337    :param keywords: a list of keywords (i.e. function names) that should be
    12081338                     recognized as translation functions
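
Review note on the `__call__` hunk above: the following is a minimal, standalone sketch of how the patched filter appears to choose the callable used for text nodes once `_i18n.domain` and/or `_i18n.context` are set. `PlainTranslations`, `ContextTranslations` and `select_gettext` are hypothetical names made up for illustration; they are not part of the patch or of Genshi. The stand-in "translations" just upper-case the message so the chosen code path is visible.

```python
from functools import partial


class PlainTranslations(object):
    """Hypothetical translations object without context-aware methods."""

    def ugettext(self, message):
        return message.upper()

    def dugettext(self, domain, message):
        return '%s:%s' % (domain, message.upper())


class ContextTranslations(PlainTranslations):
    """Hypothetical translations object that also understands msgctxt."""

    def upgettext(self, context, message):
        return '%s|%s' % (context, message.upper())

    def dupgettext(self, domain, context, message):
        return '%s:%s|%s' % (domain, context, message.upper())


def select_gettext(translations, ctxt):
    """Return the callable to apply to text nodes (illustrative only)."""
    # Keep the plain lookup under its own name so that rebinding
    # ``gettext`` below cannot change what the fallbacks call.
    base_gettext = translations.ugettext
    dgettext = translations.dugettext
    try:
        pgettext = translations.upgettext
        dpgettext = translations.dupgettext
    except AttributeError:
        # No context support: ignore the context argument.
        pgettext = lambda _, msg: base_gettext(msg)
        dpgettext = lambda domain, _, msg: dgettext(domain, msg)

    gettext = base_gettext
    if ctxt.get('_i18n.domain'):
        gettext = partial(dgettext, ctxt['_i18n.domain'])
    if ctxt.get('_i18n.context'):
        if getattr(gettext, 'func', None):
            # ``gettext`` is a partial, so a domain is already active.
            gettext = partial(dpgettext, ctxt['_i18n.domain'],
                              ctxt['_i18n.context'])
        else:
            gettext = partial(pgettext, ctxt['_i18n.context'])
    return gettext


both = {'_i18n.domain': 'foo', '_i18n.context': 'menu'}
print(select_gettext(ContextTranslations(), both)('Open'))
print(select_gettext(PlainTranslations(), both)('Open'))
```

One design point that may be worth checking in the patch itself: the sketch binds the fallbacks to `base_gettext` rather than to `gettext`, because a lambda that closes over a name which is rebound later will call the rebound value. Whether that matters for the fallback lambdas in the hunk depends on how `gettext` is bound in the lines not shown here.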
  • examples/bench/bigtable.py

     
    1010import timeit
    1111from StringIO import StringIO
    1212from genshi.builder import tag
     13from genshi.filters.i18n import Translator
     14from genshi.filters.tests.i18n import DummyTranslations
    1315from genshi.template import MarkupTemplate, NewTextTemplate
    1416
    1517try:
     
    5658</table>
    5759""")
    5860
     61genshi_tmpl_i18n = MarkupTemplate("""
     62<table xmlns:py="http://genshi.edgewall.org/"
     63       xmlns:i18n="http://genshi.edgewall.org/i18n">
     64<tr py:for="row in table">
     65<td py:for="c in row.values()">${c}</td>
     66</tr>
     67</table>
     68""")
     69t = Translator(DummyTranslations())
     70t.setup(genshi_tmpl_i18n)
     71
    5972genshi_tmpl2 = MarkupTemplate("""
    6073<table xmlns:py="http://genshi.edgewall.org/">$table</table>
    6174""")
     
    103116    stream = genshi_tmpl.generate(table=table)
    104117    stream.render('html', strip_whitespace=False)
    105118
     119def test_genshi_i18n():
     120    """Genshi template w/ i18n"""
     121    stream = genshi_tmpl_i18n.generate(table=table)
     122    stream.render('html', strip_whitespace=False)
     123
    106124def test_genshi_text():
    107125    """Genshi text template"""
    108126    stream = genshi_text_tmpl.generate(table=table)
     
    167185        et.tostring(_table)
    168186
    169187if cet:
    170     def test_cet(): 
     188    def test_cet():
    171189        """cElementTree"""
    172190        _table = cet.Element('table')
    173191        for row in table:
     
    196214
    197215
    198216def run(which=None, number=10):
    199     tests = ['test_builder', 'test_genshi', 'test_genshi_text',
     217    tests = ['test_builder', 'test_genshi', 'test_genshi_i18n', 'test_genshi_text',
    200218             'test_genshi_builder', 'test_mako', 'test_kid', 'test_kid_et',
    201219             'test_et', 'test_cet', 'test_clearsilver', 'test_django']
    202220