|
|
|
To filter the item "so it never gets to the write phase" it would need to be filtered in the ItemReader (wrapper that calls read on the delegate until the item is accepted). That way the item would not appear in the item count - a higher level analogy of filtering a line in the file from the forum thread.
In case of filtering in the writer you can achieve the filtering effect within the current API (assuming you want to do some transformations first) by composing ItemTransformerItemWriter -> FilteringItemWriter -> SomeItemWriter. Maybe we can provide the FilteringItemWriter and Filter interface if the usecase is common. I am not a big fan of the idea "filtering = transformation to null" I see in the patch - it requires ItemTransformers to handle null input gracefully and in general I think transforming and filtering are two different concerns. Especially if filtering is a common and important concern we should make it a first-class citizen, hiding filtering under transformations is not a good fit IMO. I thought about it that way at first but ultimately tipped in this direction because:
1) API bloat (darn it, Greg...) and 2) null values have no meaning currently so their use has minimal impact I don't see why you say item transformers would have to handle null input gracefully - even after the patch they could still never possibly get a null from an item reader in an ItemOrientedStep - that invariant is exactly why a transformation TO null provides a unique method of communication with the surrounding step - although it would create complications when using the CompositeItemTransformer, that's really not a deal-breaker for me - either the user ensures the filter is the last transformer OR they write their transformers such that if( item == null ) return null; -- we don't provide any stock item transformers so that's really the only wrinkle -- I don't think it's such a big deal to require adding a check for null in the specific case of using composite item transformer with one or more filters The other problem is that the way the step is currently designed, it uses the delegating item writer, in which it would be cumbersome to add a new ItemFilter functionality - at first glance it seems like would require a lot of refactoring - and to say the least if the logic stays where it is it would be really counter-intuitive to configure.
I see it this way:
case 1.) filters are just nice-to-have, not really important: we shouldn't alter the DelegatingItemWriter, it's easy to write custom FilteringItemWriters and use composition. 2.) filters are important: they deserve their own interface and the null checks discussed are a big deal then - it's a tradeoff between boilerplate null checks in unbounded number of filter/transformer implementations vs. writing a single writer than encapsulates how filtering works. regarding 2.)
here's the thing - there's more than one use case for the null checks in the delegating writer take for instance this use case that I implemented at a previous client: input is a file with records with a particular string at the start of each record (i.e. a record begin marker) - you only want to write when you have read a whole record, since your writer is designed to receive a whole record as its "item" with the proposed logic in delegating item writer, you would use an item transformer that returned null and buffered the record until the whole record was read in and then returned it as a single object for writing (Note that this is much different than a simple commit interval since this is based on file content and not on length). of course, this raises important questions about maintaining logical transaction boundaries, but you see my point. The usecase above reminds of the multilineOrderJob sample which uses a composite item reader to read input until the logical item is formed and only then passes the item to writer - I like this more than composing items in the transformers.
As a rule of thumb I would use the ItemReader to put together what I think of as item unless I had a strong reason to push the logic towards transformers/writers (I can imagine that in a complex scenario with non-trivial business logic the "read item - write item" metaphor becomes simplistic as there is no clear line between the two). |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1. ItemFilterTransformer abstracts the idea of filtering to an abstract boolean accept(Object) method
2. DelegatingItemWriter does not call write if an item transformer's result is null - instead calls a handler method which is overridable, delegatingItemWriter#doOnFilter(Object)
Here's a breakdown of the concerns involved:
ItemTransformerItemWriter back-compatibility - returning null from item transformers will have a different effect since this class subclasses DelegatingItemWriter
-- We don't provide any item transformer implementations except Composite, which wouldn't be affected by this. Ideally, no item transformers should be returning null in the first place. However, any such custom logic can be re-worked into the doOnFilter(Object) method.
CONCLUSION: no problem
DelegatingItemWriter explicit use back-compatibility
-- BatchListenerFactoryHelper uses DelegatingItemWriter but only to incorporate read and write listeners - does not perform a call to this logic since it explictly overrides write ---- (side thought: why does this class use the Delegating versions of ItemReader/Writer at all instead of anonymous implementations of the interfaces?)
CONCLUSION: no problem
API change - adds a method to DelegatingItemWriter that is concrete, so subclasses won't break since they won't need to implement anything, and calls to the new method will have no effect so status quo for existing code
CONCLUSION: no problem
Only issue is any user expecting a null value to an item writer for any other purpose than to skip writing - if their purpose is just to skip writing, then there is no change for them, but if they do anything else with it then they'll have to rework their code
CONCLUSION: minimal impact