|
bah. typo again.
80% of the spring-batch classes are subclassable on top of the 62 public interfaces. I kind of agree. Ben and I have been discussing, and it is a topic that has arisen many times (as you can imagine) in the past. There is no time to make any significant changes for 1.0, so I'm marking this for future work. It is something we want to get right, but there are some Java language issues as well - all interfaces are likely to be publicly visible, even if we only use them internally.
For 1.0, this is really just a documentation issue. Most developers only need to know ItemWriter. Maybe Tasklet, and maybe ItemReader. Advanced usage would include the ExceptionHandler, CompletionPolicy, RepeatInterceptor. There are probably a few more. In the future we need a better story for OSGi (1.0 will go out as an OSGi bundle, but there won't be any point trying to be strict about it just yet). Yes, time is certainly a factor. It's probably my fault for bringing this up so late in development, but it almost certainly needs to be looked at before 1.1 just because all 228 classes and 63 interfaces need to be effectively frozen until spring-batch 2.0. How much scaling back can you really do to an api _after_ releasing 1.0?
A quick example is JobRepositorySupport. You don't even use it to implement JobRepository, but if 1.0 gets released with that class, you can't really make it disappear in 1.1. If anything, I would think that you'd be better off identifying the things that need to be exposed (just like you did) and hide the rest of it. Then, if 1.1 comes around and someone objects to not having access to something you can always add it to the public-facing part of the framework. On the other hand, I agree that documentation would go a _long_ way towards fixing a lot of these problems. As I mentioned to Lucas, to understand how the DefaultFlatFileReader is implemented and invoked, there's quite an extensive list of interfaces that need to be understood. A little documentation pointing developers towards the highlights would certainly clear up a lot. Then again, the smaller the API, the less there is to document... :-) My mistake on the example. JobRepositorySupport is in the test sources. Substitute that with Converter for an equally valid example.
Converter was scratched last week in favour of ItemTransformer (so one less interface there anyway), and there is only one implementation (apart from inner classes, mostly in test cases). You might have to think of a better example again.
Actually, that was my point... Converter isn't even being used, but it's still present in the source and if it doesn't get removed it's a part of the API. Just some simple cleanup, reevaluating whether or not some of the *Adaptor classes, *Support classes, etc. and consolidating similar interfaces (there seem to be a few that define a single Object->Object or Object->void method) would be a quick, pre-1.0 task that could save some headaches in the long run.
I think that you're missing the forest for the trees... I wouldn't think that there would be a poster child for an api-level issue. Perhaps it was my mistake for including an example at all. All I was trying to do was point out that things get missed when you get focused on solving individual problems. IMHO, the issue is that the API is huge. Very, very huge. There is a new abstraction for every possible detail and it makes spring-batch as unnavigable as it is extensible.
My recommendation remains that there is some consideration made prior to the 1.0 release about how many classes are exposed and whether or not some of the abstractions can be cut down. This might be a better way to look at the issue. This is a class diagram of just the classes related to reading a flat file. (there may be one or two stray ones in there related to writing, but i tried to filter them out) In order for someone picking up spring batch to get a good idea of how flat file reading works, and what the various options are, he/she would have to understand the majority of the classes in this diagram. That's a lot.
I don't disagree with the basic point you are trying to make, but there are still some Java language constraints that we can't get around even if we wanted to. And (apart from the fact that it gratuitously includes StreamManager, and at least one private inner class) with that in mind the diagram is a good starting point. But you are not going to be able to convince me that FieldlSetMapper, FieldSet, LineTokenizer etc. should not be a) interfaces, b) public interfaces.
Thanks for the effort and all the feedback, by the way - it is very helpful, and you are right on the money. If we can find any more superfluous / wrong extension points we can definitely remove them. But we don't have the bandwidth for a re-write. Here's another interesting example of how a large API leaves behind some strange artifacts as it evolves. I was making a class diagram for the inheritance relationships for DefaultFlatFileItemReader and came across AbstractItemReader. It is an abstract class that implements ItemReader and only provides an empty implementation of close(). Unfortunately, ItemReader doesn't define a close method. So, everything that extends AbstractItemReader has a close method that will never get called (unless an implementation also implements ItemStream) and we have a layer of abstraction that adds no value.
Well the AbstractItemReader seems to be a hastily converted AbstractItemProvider, which provided the close and recover methods. Item recovery was moved to the ItemRecoverer interface in M4, and the close method is now encapsulated in ItemStream, so this is a good example of a deprecated class that should be removed, and a JIRA issue should be created specifically to remove this, but I don't think that's the crux of Greg's argument.
Don't get me wrong, I don't buy what Greg is saying 100% -- I don't think that every framework necessarily should declare final anything that they don't expect to be an extension point. Leaving something as extendable is not the same as API bloat IMO -- however, I do think that documentation should establish best practices for which classes are INTENDED to be extension points, leaving the rest as advanced customization for users who won't get the same level of support. Aside from the heart of the topic --- about the FieldSet issue, I suggested to Lucas several months ago that this be made a public interface, for a much different reason than the guys in the forum - theirs was for symmetry with ResultSet -- mine was for the use case of "as-you-read" data tweaking, and Lucas said "you could just as easily put that logic into one of our supported extension points" - that was the end of the issue for me. I think that it should not have been made public since, as far as I could gather between my ideas and talking to Lucas, there is no widespread use case to support it that is not already addressed by other features. I think this is a fair and reasonable standard for determining whether or not a feature is helpful or just bloat. We don't need several ways to do the same things, but we also don't need to lock down the framework to only the "best practice" extension points. Created an issue for removal of the AbstractItemReader --- http://jira.springframework.org/browse/BATCH-348
We had the API bloat discussion for 2.0, and made quite a few changes to public APIs. The main conclusion though is that a framework has too many extension points to be really literal about only exporting a small set of interfaces (JLS restrictions also relevant).
Postponing this again so we can talk about it for a later release. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
80% of the spring-batch classes are subclassable on top of the 26 public interfaces.
should be
80% of the spring-batch classes are subclassable on top of the 61 public interfaces.