AbstractMarkupFilter subfiltering produces spurious segments

Created by: Anonymous

Original issue 303 created by @ysavourel on 2013-01-02T06:22:06.000Z:

See https://groups.google.com/d/topic/okapi-devel/SdXigj5Uiu4/discussion

Currently subfiltering in the AbstractMarkupFilter produces additional textunits which consist only of a single placeholder. These TUs seem to correspond to the original, pre-subfiltered content, which is then replaced by some inline resource/tag.

This behavior is sub-optimal. In the subfiltering case, we should be producing a START_GROUP event, then the event stream from the subfilter, then an END_GROUP event. No textunit should correspond to the pre-subfiltered content.
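
To make the expected ordering concrete, here is a rough sketch of the event stream described above. It is not Okapi code: the Event class and wrap_subfilter_events helper are hypothetical names made up for illustration, and only the event kinds (START_GROUP, END_GROUP) and the tu_ssf1 group name come from this report.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a filter event (not the Okapi class)."""
    kind: str          # e.g. "START_GROUP", "TEXT_UNIT", "END_GROUP"
    payload: str = ""


def wrap_subfilter_events(group_id: str, sub_events: Iterable[Event]) -> Iterator[Event]:
    """Emit START_GROUP, then the subfilter's own events, then END_GROUP.

    No extra text unit is emitted for the pre-subfiltered content.
    """
    yield Event("START_GROUP", group_id)
    yield from sub_events
    yield Event("END_GROUP", group_id)


# Events a subfilter might produce for the example content (assumed shape).
sub_events = [
    Event("TEXT_UNIT", "This is the title"),
    Event("TEXT_UNIT", "This is the body."),
]
stream = list(wrap_subfilter_events("tu_ssf1", sub_events))
```

With this shape, the only text units in the stream are the ones produced by the subfilter itself; nothing placeholder-only follows the group.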

Cutting and pasting the example from the above thread:

This XML:

<html><head><title>This is the title</title></head><body><p>This is the body.</p></body></html>

Produces this XLIFF:

This is the title This is the body.

So, there are a couple of things going on here. The subfiltered TUs appear in the tu_ssf1 group. This is followed by
the tu2 TU, which consists only of a placeholder -- presumably representing the subfiltered content.

There's then another group+TU pair, except in this case the group is also empty. This corresponds to
subfiltering the whitespace between the and elements.
