Edgewall Software

Opened 17 years ago

Closed 13 years ago

#217 closed defect (wontfix)

select() in a py:match yields nothing in template output

Reported by: Matt Chaput <matt@…>
Owned by: cmlenz
Priority: major
Milestone: 0.6.1
Component: Template processing
Version: devel
Keywords:
Cc:

Description (last modified by cmlenz)

I have a template with a lot of py:match elements that then use select() inside (mostly select('node()') ), to transform XML similarly to XSLT.

I was using r801 and it worked fine. I recently updated to the fast-match branch, and then to mainline r848, and in both cases the template output is mostly blank. It seems the py:match elements are working, but the streams returned by select() are not being added to the output.

(If I do repr(list(select(...))), I can see that the select is indeed returning a stream of events, but for some reason it's not showing up in the output.)

Attachments (2)

markup.py.patch (1.4 KB) - added by matt@… 17 years ago.
Edited markup.py to apply all matches to the output at once instead of in two steps
test2.html (1.7 KB) - added by matt@… 16 years ago.
Markup template demonstrating the problem


Change History (21)

comment:1 Changed 17 years ago by cboos

  • Component changed from General to Template processing

Have you looked at the trunk/ChangeLog? There were some recent changes related to the order of application of py:match (see r810), you're probably affected by this change of behavior.

comment:2 Changed 17 years ago by Matt Chaput <matt@…>

  • Version changed from 0.4.4 to devel

py:match is working. It's select() that's not working for some reason. Unfortunately I don't have time right now to figure out the problem or step back through revisions, so I'm going back to r801 until after my current crunch time.

comment:3 Changed 17 years ago by Matt Chaput <matt@…>

OK, it really was r810 that did it... but I wasn't relying on match application order (instead of having ambiguous matches, I switched to non-ambiguous matches with a py:choose inside), and like I said py:match is working, so I think r810 might have just introduced a bug. When I have time again I'll try to track it down.

comment:4 Changed 17 years ago by cmlenz

  • Keywords needinfo added
  • Milestone changed from 0.5 to 0.5.1

I haven't seen this problem. It would be really helpful if you could provide a test case that reproduces the issues.

comment:5 Changed 17 years ago by anonymous

  • Priority changed from major to critical

The issue is the strange two-step way in which the matched content is streamed into the template (templates/markup.py lines ~255 - 300): it transforms the matched content using the rules that precede the current matching rule, then transforms the output using the subsequent rules. This means that select() functions inside the match template see a partially transformed stream. In my case, this means the select() doesn't match anything, because the XML element I'm trying to select has been transformed away.

I'll try to come up with a test case and patch when I can, but hopefully you can see the problem.

Changed 17 years ago by matt@…

Edited markup.py to apply all matches to the output at once instead of in two steps

comment:6 Changed 17 years ago by matt@…

The above attachment works for me, but you might want to double check the application of the hints (like match_once).

comment:7 Changed 17 years ago by anonymous

  • Keywords needinfo removed

Oops, I think I got it wrong... there's something wrong with the content returned by select().

If I have

<b>
  <bc>Hello</bc>
  <indent>
    <b>
      <bc>there</bc>
    </b>
  </indent>
</b>

...and I have a match template for <b>, I can put this inside the match template:

${repr(list(select('indent')))}

...and see what you'd expect -- the events corresponding to <indent>...</indent>. But if I do this:

${select('indent')}

...the output is blank. Maybe I'm not recursively applying the matchers to the output of select()? I'm still finding this code hard to understand.

comment:8 Changed 17 years ago by matt@…

Ignore the above; I had accidentally deleted a line in my template.

What's the ASCII smiley for sheepishly stupid?

comment:9 Changed 17 years ago by cmlenz

  • Keywords needinfo added

That patch fails a number of tests and also results in an infinite loop/recursion in one test.

I'd really appreciate it if you could provide a minimal test case that demonstrates the problem, as so far I've not been able to reproduce it.

comment:10 Changed 16 years ago by cmlenz

  • Milestone changed from 0.5.1 to 0.5.2

Changed 16 years ago by matt@…

Markup template demonstrating the problem

comment:11 Changed 16 years ago by matt@…

  • Keywords needinfo removed

Anything else I can do to try to help resolve this?

comment:12 Changed 16 years ago by cmlenz

  • Description modified (diff)

Sorry, I didn't notice the attachment you added (unfortunately there is no email notification on adding attachments).

Anyway, in the example you correctly analyze the behavior:

If you add this line inside <py:match path="w:b">

${repr(list(select('node()')))}

...you can see that that matcher is getting transformed
input, so it never sees <w:bc> or <w:indent>, it sees
<p class="bc"> and <div class="indent">.

The current code seems to do matching strangely, where
matchers previous to the current one are run on the
select()-ed stream before it's recursively matched.

And that's indeed new behavior in the 0.5 release, and is described in the upgrade notes:

There has also been a subtle change to how py:match templates are processed: in previous versions, all match templates would be applied to the content generated by the matching template, and only the matching template itself was applied recursively to the original content. This behavior resulted in problems with many kinds of recursive matching, and hence was changed for 0.5: now, all match templates declared before the matching template are applied to the original content, and match templates declared after the matching template are applied to the generated content. This change should not have any effect on most applications, but you may want to check your use of match templates to make sure.

The downside is that I really underestimated the impact of this change (see also #244), but the upside is that the application of match templates now happens in a more controlled and deterministic fashion (see #186).

In your example, moving <py:match path="w:b"> before the other two match templates seems to fix the issue.

As noted in #244 (which I'm going to flag as duplicate), I've not decided whether this confusion needs a change in an upcoming Genshi release (which would likely just add more confusion into the mix), or whether this is going to stay as is. I guess it really depends on whether this new behavior can deal with all the cases, or whether it falls down in some cases.
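
The ordering rule quoted from the upgrade notes can be illustrated with a toy model. The sketch below is plain Python, not Genshi code: `apply_rules`, the tuple-based tree, and the two lambda templates are hypothetical names invented for this illustration. It assumes, as a simplification of what this ticket describes, that the content seen by select() is first run through the matching rule and every rule declared before it, while the generated output is run only through the rules declared after it.

```python
# Toy model only: a node is either a str (text) or a (tag, children) tuple.
# A rule is (tag, template); template(content, select) returns a list of nodes.

def apply_rules(node, rules, start=0, end=None):
    """Apply rules[start:end] to one node and return a list of output nodes."""
    if isinstance(node, str):
        return [node]
    tag, children = node
    end = len(rules) if end is None else end
    for idx in range(start, end):
        rule_tag, template = rules[idx]
        if rule_tag != tag:
            continue
        # Step 1: the matched element's content is processed by this rule and
        # all rules declared before it, so select() sees that result.
        content = []
        for child in children:
            content.extend(apply_rules(child, rules, 0, idx + 1))

        def select(name, content=content):
            return [c for c in content
                    if not isinstance(c, str) and c[0] == name]

        # Step 2: the generated output is processed by the later rules only.
        output = []
        for generated in template(content, select):
            output.extend(apply_rules(generated, rules, idx + 1))
        return output
    # No rule matched this element: keep it and process its children.
    new_children = []
    for child in children:
        new_children.extend(apply_rules(child, rules, start, end))
    return [(tag, new_children)]


# Hypothetical templates standing in for the match templates in the attachment.
bc_rule = ("bc", lambda content, select: [("p", content)])
b_rule = ("b", lambda content, select: select("bc"))

doc = ("b", [("bc", ["Hello"])])

# "bc" declared first: by the time the "b" template runs, <bc> is already <p>.
result_bc_first = apply_rules(doc, [bc_rule, b_rule])   # -> []

# "b" declared first: select("bc") still sees the original <bc>.
result_b_first = apply_rules(doc, [b_rule, bc_rule])    # -> [("p", ["Hello"])]
```

With the `bc` rule declared first, select("bc") inside the `b` template finds nothing and the output is empty; declaring `b` first restores the output, which mirrors the reordering suggested for the attached example.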

comment:13 follow-up: Changed 16 years ago by matt@…

I haven't quite worked through all the implications of this, but having the content returned by select() depend on the order in which match templates were defined seems like a total nightmare. It also seems like it will make it impossible to get right in recursive situations, where you might need A before B in one situation but B before A in another.

I'll have to take a look at my templates and figure out if there's any way I can make the matches work by reordering them, but in general tying select() to match template ordering is IMHO inelegant and confusing. IMHO, it should work more like XSLT (to the extent possible in a streaming model), where the template only ever sees the input, not the transformed output.

comment:14 Changed 16 years ago by anonymous

For example, in test.html, w:b contains w:indent which contains w:b's. I don't see how I can order matches for w:b and w:indent to make that work properly.
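
For contrast, the XSLT-like behavior requested above can be sketched in a few lines. This is again a toy model in plain Python and says nothing about how Genshi is or could be implemented: `transform`, `apply`, and the templates dictionary are hypothetical names. The assumption is that every template selects from the untransformed source children, and only the selected nodes are then transformed recursively, so declaration order plays no role.

```python
# Toy model only: a node is either a str (text) or a (tag, children) tuple.

def transform(node, templates):
    """Transform one source node; templates only ever see source children."""
    if isinstance(node, str):
        return [node]
    tag, children = node

    def apply(name=None):
        # Select source children (all of them, or elements with the given
        # tag) and transform each selected child recursively.
        out = []
        for child in children:
            is_named = not isinstance(child, str) and child[0] == name
            if name is None or is_named:
                out.extend(transform(child, templates))
        return out

    template = templates.get(tag)
    if template is None:
        # No template for this tag: copy the element, transform its content.
        return [(tag, apply())]
    return template(apply)


# Hypothetical templates; a dict is used because order is irrelevant here.
templates = {
    "b": lambda apply: apply("bc") + apply("indent"),
    "bc": lambda apply: [("p", apply())],
    "indent": lambda apply: [("div", apply())],
}

# <b> contains <indent>, which contains another <b>.
doc = ("b", [
    ("bc", ["Hello"]),
    ("indent", [("b", [("bc", ["there"])])]),
])

result = transform(doc, templates)
# -> [("p", ["Hello"]), ("div", [("p", ["there"])])]
```

In this model the nested `b` inside `indent` is handled by the same `b` template as the outer one, because each template selects from source nodes rather than from already-transformed output.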

comment:15 in reply to: ↑ 13 Changed 16 years ago by cmlenz

Replying to matt@…:

IMHO, it should work more like XSLT (to the extent possible in a streaming model), where the template only ever sees the input, not the transformed output.

But having one match template get the output of a previous match template as its own input is an important feature of match templates. It's extremely useful for defining layered transformations, for example to enable site-specific customizations of a general layout template. I use this all the time for my projects.

But I realize your use of match templates is different, and am thinking about maybe adding a flag/attribute to <py:match> to enable that kind of behavior. Basically it would switch the match template into a mode where the input is not processed by any other match templates (except maybe for itself). Not sure what that flag would be called, and need to think about it a bit more anyway.

comment:16 Changed 16 years ago by cmlenz

  • Milestone changed from 0.5.2 to 0.6
  • Priority changed from critical to major

comment:17 Changed 15 years ago by cmlenz

  • Milestone changed from 0.6 to 0.6.1

comment:18 Changed 15 years ago by Carsten Klein <carsten.klein@…>

Perhaps making this more clear in the documentation would suffice?

Such as...

All Genshi directives are evaluated before the stream is transformed. The order of directives, especially match directives, is important: directives encountered first while parsing the template are executed first.

Besides that, the OP could also simply use this after having transformed the inner elements of the w:b element:

<py:match path="w:b">
   ${select( '*|comment()|text()' )}
</py:match>

comment:19 Changed 13 years ago by hodgestar

  • Resolution set to wontfix
  • Status changed from new to closed

I think this behaviour is intended and already fairly well documented in wiki:Documentation/xml-templates.html#id5:

Match templates are applied both to the original markup as well as to the generated markup. The order in which they are applied depends on the order they are declared in the template source: a match template defined after another match template is applied to the output generated by the first match template. The match templates basically form a pipeline.
