XMLStreamFilter while handling HTML CDATA produces spurious segments.
Created by: Anonymous
Original issue 332 created by 143.ravik... on 2013-04-26T09:38:08.000Z:
With Okapi M20, the XMLStream filter, when parsing a CDATA section with the HTML sub-filter, creates an extra spurious placeholder text unit for each CDATA section in the file:
[\#$tu1\_ssf1] [\#$tu1\_ssf1] [\#$tu1\_ssf1]
This issue was discussed in the following ticket:
http://code.google.com/p/okapi/issues/detail?id=320
Further, the fix was marked as dependent on ticket #303, which was similar but related to PCDATA parsing instead of CDATA:
http://code.google.com/p/okapi/issues/detail?id=303
The fix for ticket #303 doesn't seem to work for ticket #320. I am unable to reopen ticket #320, hence opening a new one.
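
To help confirm the symptom on an extracted file, the following is a minimal sketch that flags text units whose content is nothing but reference placeholders of the form reported above. It does not use the Okapi API: the `(id, text)` pairs, the unit ids, and the sample data are hypothetical, and the marker pattern is only inferred from the `[#$tu1_ssf1]` example in this report.

```python
import re

# Assumed shape of a sub-filter reference marker, inferred from the
# "[#$tu1_ssf1]" example in this report (not taken from Okapi itself).
REF_MARKER = re.compile(r"\[#\$[^\]\s]+\]")


def is_placeholder_only(text):
    """Return True if text has at least one marker and nothing else but whitespace."""
    if not REF_MARKER.search(text):
        return False
    # Remove every marker; a spurious unit has no content left afterwards.
    return REF_MARKER.sub("", text).strip() == ""


def find_spurious_units(units):
    """Return the (id, text) pairs whose text is placeholder-only."""
    return [(uid, text) for uid, text in units if is_placeholder_only(text)]


# Hypothetical dump of extracted text units as (id, text) pairs.
sample_units = [
    ("tu1", "[#$tu1_ssf1]"),                # placeholder-only: the reported symptom
    ("tu1_ssf1_tu1", "Hello <b>world</b>"), # real content from the CDATA section
    ("tu2", "See [#$tu2_ssf1] here"),       # marker embedded in real text: not flagged
]

if __name__ == "__main__":
    for uid, text in find_spurious_units(sample_units):
        print(uid, text)
```

A unit that mixes a marker with translatable text is deliberately not flagged, since only the placeholder-only units are reported as spurious here.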