Edgewall Software

Opened 16 years ago

Closed 16 years ago

#251 closed defect (fixed)

KeyError when using [ or ] in the translatable string

Reported by: palgarvio Owned by: palgarvio
Priority: critical Milestone: 0.6
Component: Internationalization Version: 0.5.1
Keywords: i18n:msg Cc: cboos@…

Description

While using Genshi's i18n filter on some "complicated" markup I got a KeyError.

More info with the traceback and the html snippet.

Change History (13)

comment:1 Changed 16 years ago by palgarvio

The cause appears to be the [ and ] in the text.
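A minimal sketch of the collision, using the tokenizing regex from the parse_msg signature quoted in comment:2. The helper find_tokens and the sample strings are assumptions made up for illustration, not Genshi code. It only shows that a literal ] in the text looks exactly like a closing token to the regex.

```python
import re

# Regex copied from the parse_msg signature quoted in this ticket.
TOKEN_RE = re.compile(r'(?:\[(\d+)\:)|\]')


def find_tokens(message):
    """Return (kind, text) pairs for every delimiter the regex sees.

    Hypothetical helper, written only for this illustration.
    """
    tokens = []
    for mo in TOKEN_RE.finditer(message):
        kind = 'open' if mo.group(1) is not None else 'close'
        tokens.append((kind, mo.group(0)))
    return tokens


# Intended markup only: one open token, one close token.
print(find_tokens(u'See [1:Help].'))

# Literal brackets in the text add a close token with no matching open.
print(find_tokens(u'Press [Enter] or see [1:Help].'))
```

In the second call the ] after "Enter" is reported as a close token even though no token was opened, so any stack-based parser built on this regex gets out of step.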

comment:2 Changed 16 years ago by palgarvio

Possible fix:

  • genshi/filters/i18n.py

    @@ -378,7 +378,7 @@
                 self.values[param] = (kind, data, pos)
             else:
                 if kind is START:
    -                self.string.append(u'[%d:' % self.order)
    +                self.string.append(u'<[%d:' % self.order)
                     self.events.setdefault(self.order, []).append((kind, data, pos))
                     self.stack.append(self.order)
                     self.depth += 1
    @@ -387,7 +387,7 @@
                     self.depth -= 1
                     if self.depth:
                         self.events[self.stack[-1]].append((kind, data, pos))
    -                    self.string.append(u']')
    +                    self.string.append(u']>')
                         self.stack.pop()

         def format(self):
    @@ -421,7 +421,7 @@
                             break


    -def parse_msg(string, regex=re.compile(r'(?:\[(\d+)\:)|\]')):
    +def parse_msg(string, regex=re.compile(r'(?:<\[(\d+)\:)|\]>')):
         """Parse a translated message using Genshi mixed content message
         formatting.

    427427

comment:3 Changed 16 years ago by palgarvio

  • Summary changed from KeyError with some complicated markup to KeyError when using [ or ] in the translatable string

comment:4 Changed 16 years ago by cmlenz

  • Milestone changed from 0.6 to 0.5.2
  • Priority changed from blocker to critical
  • Status changed from new to assigned
  • Version changed from devel to 0.5.1

Thank you!

Regardless of whether we switch to different tokens (<[…]> instead of just […]) to reduce the potential for conflicts with actual symbols in the message, those tokens will need to be escaped in some form if they do occur in the message itself.

comment:5 Changed 16 years ago by palgarvio

Escaping as in \[...\]?
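One possible reading of the backslash escaping asked about here, as a sketch only. The ticket does not say how escaping was eventually implemented, so every name below (escape_brackets, unescape_brackets, find_unescaped_tokens) is hypothetical. Backslashes are escaped too, so that the round trip stays unambiguous.

```python
import re


def escape_brackets(text):
    """Backslash-escape backslashes and literal square brackets."""
    return re.sub(r'([\\\[\]])', r'\\\1', text)


def unescape_brackets(text):
    """Reverse escape_brackets."""
    return re.sub(r'\\([\\\[\]])', r'\1', text)


# An escaped character is matched first, so it is never seen as a token.
TOKEN_RE = re.compile(r'\\(.)|\[(\d+):|\]')


def find_unescaped_tokens(escaped):
    """Return (kind, text) pairs for delimiters that are not escaped."""
    tokens = []
    for mo in TOKEN_RE.finditer(escaped):
        if mo.group(1) is not None:
            continue  # escaped character, not a delimiter
        kind = 'open' if mo.group(2) is not None else 'close'
        tokens.append((kind, mo.group(0)))
    return tokens


message = escape_brackets(u'Press [Enter]') + u' or see [1:Help].'
print(message)
print(find_unescaped_tokens(message))
print(unescape_brackets(escape_brackets(u'a\\[b]')))
```

Matching the escape sequence inside the same regex, rather than using a lookbehind, avoids misreading a literal backslash that happens to sit directly before a real token.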

comment:6 Changed 16 years ago by cmlenz

  • Milestone changed from 0.5.2 to 0.6

comment:7 Changed 16 years ago by palgarvio

  • Keywords i18n:msg added
  • Owner changed from cmlenz to palgarvio
  • Status changed from assigned to new

comment:8 Changed 16 years ago by palgarvio

  • Status changed from new to assigned

comment:9 Changed 16 years ago by palgarvio

Damn, can't get this ticket to the point where I can add people to CC :\

I'll see if I can get cboos and asmodai to follow this ticket too, because we need opinions and a decision on which format to use. We need to get advanced-i18n into trunk.

comment:10 follow-up: ↓ 11 Changed 16 years ago by cboos

  • Cc cboos@… added

I might be wrong, but if the token is indeed <[ as you originally suggested, then it can't appear in the original text (because the < character itself can't appear in the text extracted from the markup, no?). So in this case, no need for the escaping mentioned in comment:4.

comment:11 in reply to: ↑ 10 Changed 16 years ago by palgarvio

Replying to cboos:

I might be wrong, but if the token is indeed <[ as you originally suggested, then it can't appear in the original text (because the < character itself can't appear in the text extracted from the markup, no?). So in this case, no need for the escaping mentioned in comment:4.

Well, at first glance you're right: a < in the template text must be written as an HTML entity (&lt;), or else Genshi will complain, so the opening token needs no escaping. However, > can appear in the markup, so the closing token ]> can still occur in the text and will need to be escaped.
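The asymmetry between < and > can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree as a stand-in, on the assumption that Genshi's XML template parser follows the same well-formedness rules for these cases. The helper name parses is made up for the example.

```python
import xml.etree.ElementTree as ET


def parses(snippet):
    """Return True if the snippet is well-formed XML."""
    try:
        ET.fromstring(snippet)
    except ET.ParseError:
        return False
    return True


print(parses('<p>a > b</p>'))      # raw > in text
print(parses('<p>x[0]>y</p>'))     # literal "]>" in text
print(parses('<p>a &lt; b</p>'))   # < written as an entity
print(parses('<p>a < b</p>'))      # raw < in text
```

Only the last snippet is rejected, which matches the point made above: <[ cannot occur in extracted text, while ]> can.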

So, let's have a vote, which markup should we adopt:

  • <[ ... ]>
  • <{ ... }>
  • <( ... )>
  • {( ... )}
  • {> ... <}
  • (} ... {)
  • {[ ... ]}
  • ([ ... ])

Got any more suggestions?

comment:12 Changed 16 years ago by palgarvio

How about ⌈ ... ⌋? We could also make Genshi output the corresponding HTML entities, i.e. &lceil; ... &rfloor;.

Or even ⌈⌋ ... ⌈⌋ to avoid potential conflicts...

Or 〉 ... 〈

But with these, either the user copies and pastes the characters, or we have to accept both each character and its corresponding HTML entity.
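Accepting both forms could be as simple as normalizing the entity spelling to the character before parsing. A small sketch under that assumption; normalize_delimiters is a hypothetical helper, and only the two entities named in this comment are handled so that other entities such as &lt; are left untouched.

```python
def normalize_delimiters(message):
    """Replace &lceil; and &rfloor; with the characters they name."""
    return (message.replace(u'&lceil;', u'\u2308')    # LEFT CEILING
                   .replace(u'&rfloor;', u'\u230b'))  # RIGHT FLOOR


entity_form = u'See &lceil;1:Help&rfloor; &lt;now&gt;.'
char_form = normalize_delimiters(entity_form)
print(char_form)

# Already normalized text passes through unchanged.
print(normalize_delimiters(char_form) == char_form)
```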

comment:13 Changed 16 years ago by palgarvio

  • Resolution set to fixed
  • Status changed from assigned to closed

Fixed in [1058]. Genshi automatically escapes [ and ] now.
