
Key: BATCH-636
Type: New Feature
Status: Open
Priority: Major
Assignee: Unassigned
Reporter: Douglas C. Kaminsky
Votes: 0
Watchers: 0
Spring Batch

Include recommended schema / stored procs for archiving

Created: 20/May/08 04:06 PM   Updated: 20/May/08 04:40 PM
Component/s: Samples, Core
Affects Version/s: None
Fix Version/s: None

Time Tracking:
Not Specified


Description
Provide a sample schema / queries / stored procs for archiving old data, i.e. moving that data in a stable manner from production tables to archive tables, potentially across datasources.

Might even design this in such a way as to include a sample batch job that uses Spring Batch to accomplish this.

Over time, any reasonably active production system using Spring Batch is going to accumulate tables holding hundreds of thousands, if not millions, of execution context variables, job parameters, job executions and step executions (not to mention job instances).

Suppose you have 2000 job instances that run per day - this is a very reasonable number of job instances for an enterprise system - that's:

2000 instances
+ (2000 * jp(0)) job parameters
+ 2000 * (1 + p(job fails once) + p(job fails twice) + ...) job executions
+ (2000 * (1 + p(job fails once) + p(job fails twice) + ...) * ecj(0)) job execution context entries
+ ... okay, I haven't even gotten into entries for steps yet

jp(0) = average job parameters per instance
ecj(0) = average execution context entries per job execution
p(...) = probability functions

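Even assuming ecj(0) of 1, jp(0) of 1 and a low probability of job failure, you're looking at roughly 8000 table rows per day just for job information (2000 instances + 2000 job parameters + 2000 job executions + 2000 execution context entries), and that doesn't even take into account step executions or step execution context data.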
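To make the arithmetic above easy to replay with other numbers, here is a minimal sketch in Java. The class and method names (BatchRowEstimate, jobRowsPerDay) are hypothetical and are not part of Spring Batch; the parameter avgExecutionsPerInstance stands in for the (1 + p(job fails once) + p(job fails twice) + ...) term and is assumed to be about 1 when failures are rare. Step-level rows are deliberately left out, as in the estimate above.

```java
public final class BatchRowEstimate {

    private BatchRowEstimate() {
        // utility class, not meant to be instantiated
    }

    /**
     * Rough estimate of job-level metadata rows written per day.
     *
     * @param instancesPerDay               job instances started per day
     * @param avgParamsPerInstance          jp(0): average job parameters per instance
     * @param avgExecutionsPerInstance      1 + p(fails once) + p(fails twice) + ...
     * @param avgContextEntriesPerExecution ecj(0): average execution context entries per job execution
     */
    public static long jobRowsPerDay(long instancesPerDay,
                                     double avgParamsPerInstance,
                                     double avgExecutionsPerInstance,
                                     double avgContextEntriesPerExecution) {
        double instances = instancesPerDay;
        double params = instancesPerDay * avgParamsPerInstance;
        double executions = instancesPerDay * avgExecutionsPerInstance;
        double contextEntries = executions * avgContextEntriesPerExecution;
        return Math.round(instances + params + executions + contextEntries);
    }

    public static void main(String[] args) {
        // The figures from the description: 2000 instances per day,
        // one parameter and one context entry each, failures assumed negligible.
        long perDay = jobRowsPerDay(2000, 1.0, 1.0, 1.0);
        System.out.println("job-level rows per day:  " + perDay);
        System.out.println("job-level rows per year: " + perDay * 365);
    }
}
```

With the figures from the description this gives 8000 rows per day, or close to three million job-level rows per year before any step data is counted.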

Also keep in mind that even though an execution context or job parameter row only stores a small amount of data, many DBMSes will allocate space for every one of the fields even if they are never used (e.g. long_value, date_value).

Eventually, EVERY user will need a way to manage old records - to delete them or archive them. Deletion can be handled by an end-user tool (e.g. a management console) at some point. For archiving, however, perhaps we should be proactive and recommend / provide a solution for those who don't have other professional tools available to them. This will make the product more reliable and reduce problems stemming from community members introducing custom archiving code and then coming to the forums because they made a mistake.

Comments
Lucas Ward - 20/May/08 04:40 PM
This isn't a bad idea, but it's very low priority compared to everything else going on between now and 2.0. Given the current bandwidth the committers have, there would be no way this could be addressed until well after 2.0. Quite honestly, I don't think it could be touched until 2009.

That being said, if someone creates a decent sample job that does the archiving, I wouldn't have a problem putting it in, especially if it's flexible enough.