Description
Woodstox 3.2.6 (current stable)'s StaxEventWriter implementation automatically writes end tags and end document tags that it detects as still open on close. When StaxEventItemWriter wraps Woodstox with a NoStartEndDocumentStreamWriter for the chunk writer (eventWriter), and another Woodstox instance for the document writer (delegateEventWriter), the result is two end document tags being written. This is because even though the NoStartEndDocumentStreamWriter prevents the end document event from being written to the chunk writer, it writes the end document tag on close() anyway, on top of the one being written by StaxEventItemWriter.endDocument(delegateEventWriter) itself.
Here's the relevant stack trace:
Thread [main] (Suspended)
com.ctc.wstx.sw.SimpleNsStreamWriter(com.ctc.wstx.sw.BaseStreamWriter).finishDocument() line: 1672
com.ctc.wstx.sw.SimpleNsStreamWriter(com.ctc.wstx.sw.BaseStreamWriter).close() line: 288
com.ctc.wstx.evt.WstxEventWriter.close() line: 237
org.springframework.batch.item.xml.stax.NoStartEndDocumentStreamWriter(org.springframework.batch.item.xml.stax.AbstractEventWriterWrapper).close() line: 32
org.springframework.batch.item.xml.StaxEventItemWriter.close(org.springframework.batch.item.ExecutionContext) line: 376
This was captured with Spring Batch 1.1.0, but I diffed StaxEventItemWriter and NoStartEndDocumentStreamWriter between 1.1.0 and 1.1.1 in FishEye and don't see anything that would change the behavior.
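To make the wrapping described above concrete, here is a minimal sketch of an XMLEventWriter wrapper that drops start and end document events but passes close() straight through to its delegate. The class name FragmentEventWriter is hypothetical and this is not the Spring Batch source; it only illustrates why a delegate that finishes the document on close() can still emit closing output even though the end document event itself was filtered.

```java
import javax.xml.namespace.NamespaceContext;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

/** Hypothetical wrapper: filters document start/end events, delegates everything else. */
class FragmentEventWriter implements XMLEventWriter {

    private final XMLEventWriter delegate;

    FragmentEventWriter(XMLEventWriter delegate) {
        this.delegate = delegate;
    }

    @Override
    public void add(XMLEvent event) throws XMLStreamException {
        // Only the events are suppressed here; nothing stops the delegate
        // from finishing the document on its own later.
        if (!event.isStartDocument() && !event.isEndDocument()) {
            delegate.add(event);
        }
    }

    @Override
    public void add(XMLEventReader reader) throws XMLStreamException {
        while (reader.hasNext()) {
            add(reader.nextEvent());
        }
    }

    @Override
    public void close() throws XMLStreamException {
        // Pass-through: an implementation that auto-closes open elements
        // on close() will do so at this point.
        delegate.close();
    }

    @Override
    public void flush() throws XMLStreamException {
        delegate.flush();
    }

    @Override
    public String getPrefix(String uri) throws XMLStreamException {
        return delegate.getPrefix(uri);
    }

    @Override
    public void setPrefix(String prefix, String uri) throws XMLStreamException {
        delegate.setPrefix(prefix, uri);
    }

    @Override
    public void setDefaultNamespace(String uri) throws XMLStreamException {
        delegate.setDefaultNamespace(uri);
    }

    @Override
    public void setNamespaceContext(NamespaceContext context) throws XMLStreamException {
        delegate.setNamespaceContext(context);
    }

    @Override
    public NamespaceContext getNamespaceContext() {
        return delegate.getNamespaceContext();
    }
}
```

Because the wrapper and the document writer share one underlying writer, calling close() through the wrapper closes that shared writer, and whatever the implementation does on close happens in addition to anything written by hand.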
Ian Brandt added a comment - 05/Aug/08 01:52 PM
Corrections to my initial description:
1) "Woodstox 3.2.6 (current stable)'s *XMLEventWriter* implementation..."
2) There is only one Woodstox WstxEventWriter instance. The NoStartEndDocumentStreamWriter eventWriter is a second handle wrapping the XMLEventWriter delegateEventWriter's WstxEventWriter instance.
Ian Brandt added a comment - 05/Aug/08 05:09 PM
Quick and dirty workaround patch against 1.1.1. Works for me, YMMV, etc. Sorry for no test case goodness, up against a deadline.
Ian Brandt added a comment - 06/Aug/08 02:38 PM
Forgot to mention that the 3.9.x (a.k.a. pre-4.0) releases of Woodstox contain internal refactoring such that "com.ctc.wstx.evt.WstxEventWriter" no longer exists, hence my attached patch won't trap this bug for the post 3.2 series. I didn't look into whether it was just repackaged, or replaced with something else entirely. I'm also not sure whether they've changed their automatic tag closing behavior in the 3.9 series, such that the workaround would even still be necessary.
Robert Kasanicky added a comment - 07/Aug/08 09:06 AM
Making the private endDocument(XMLWriter) protected should allow users to fix the issue cleanly themselves by subclassing.
Ian Brandt added a comment - 07/Aug/08 03:44 PM
Interesting, though wouldn't making endDocument protected create a bit of a coupling/versioning problem, and ultimately just "pass the buck" for this issue to Spring Batch users? I'll explain...
As far as I know there are only three Java StAX implementations in common usage: the BEA RI, Sun's SJSXP, and Woodstox. I have no idea what the adoption ratio of each is, but Woodstox has been a formidable implementation for some time. Sub-classing would mean that anyone and everyone using StaxEventItemWriter with Woodstox would have to extend it with a WstxStaxEventItemWriter of sorts that overrides endDocument. The downsides are: 1) they'll need to know to do this in the first place, 2) more code for them to maintain, 3) perhaps worst of all is that all that code in the wild will be coupled to an implementation detail of StaxEventItemWriter in a manner that is outside of Spring Batch's control. My solution may not be pretty, but the ugliness is contained rather than propagated.
Really I think the best solution is that Woodstox should make this auto-closing behavior an optional feature of the parser configurable by runtime properties. (See: http://woodstox.codehaus.org/ConfiguringStreamWriters). Then my patch would look more like:
    XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
    if (outputFactory.isPropertySupported("com.ctc.wstx.autoCloseElements")) {
        outputFactory.setProperty("com.ctc.wstx.autoCloseElements", false);
    }
A bit cleaner than getClass().getName().equals, and it can be done outside Spring Batch or in (though I still think inside is more user friendly). If you want to hold off for a bit on opening up endDocument I can go lobby for such a parser option in Woodstox 3.2.7?
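The guard pattern in the snippet above can be factored into a small helper. This is a minimal sketch using only the standard javax.xml.stream API; the class and method names (OutputFactoryConfig, setIfSupported) are hypothetical, and the "com.ctc.wstx.autoCloseElements" property name from the comment is a proposal in this thread, not a confirmed Woodstox property.

```java
import javax.xml.stream.XMLOutputFactory;

/** Hypothetical helper: apply an implementation-specific property only when supported. */
final class OutputFactoryConfig {

    private OutputFactoryConfig() {
    }

    /**
     * Sets the property if the factory reports support for it.
     * Returns true when the property was set, false when it was skipped.
     */
    static boolean setIfSupported(XMLOutputFactory factory, String name, Object value) {
        if (!factory.isPropertySupported(name)) {
            // Other StAX implementations ignore the setting without failing.
            return false;
        }
        factory.setProperty(name, value);
        return true;
    }
}
```

A caller would then write something like OutputFactoryConfig.setIfSupported(outputFactory, "com.ctc.wstx.autoCloseElements", Boolean.FALSE), which stays a no-op on implementations that do not recognize the name.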
2039 by robokaso (1 file), 11/Aug/08 03:52 AM: RESOLVED - BATCH-761: StaxEventItemWriter writes extra end document tag with Woodstox 3.2.6 made endDocument(..) protected
RESOLVED - BATCH-761: StaxEventItemWriter writes extra end document tag with Woodstox 3.2.6
Robert Kasanicky added a comment - 11/Aug/08 04:15 AM
I've made endDocument protected. How the end tag is written is kind of a hack anyway, so it makes sense to allow users to tweak the behavior.
@Ian: I don't like the idea of putting StAX-implementation-specific code into the SB codebase, but if this is standard Woodstox behavior (consistent among versions) I guess we could do that. The "isPropertySupported" approach seems sensible if it would fix the Woodstox trouble in general.
Robert Kasanicky added a comment - 11/Aug/08 07:15 AM
For now I'm resolving the issue with the advice to override the protected endDocument method when used with Woodstox. If you can suggest a generic Woodstox fix, please drop a comment and we can consider doing more.
Tatu Saloranta added a comment - 04/Sep/08 08:03 PM
Just noticed this entry, and thought I might as well add the Woodstox Jira entry:
http://jira.codehaus.org/browse/WSTX-165
(Too bad I didn't notice it a few days ago, before the 3.2.7 release, but it'll get into 3.2.8 :) )