Loudwater Systems Management


Notes on an Interesting Failure

Today the CTC failed, bringing S2 down. One of our purge jobs was running at the time. As DB2D started coming back up, it automatically rolled back all changes that were uncommitted at the moment of failure. This included our purge.

Clearly we will have to rerun the purge -- no worries. Pity about DB2D diligently rebuilding data we don't need. Ah BUT! You see, the time DB2D takes to restart depends on how much there is to roll back. Unfortunately our purge job was using the SQL DELETE command to purge MVS_ADDRSPACE_T, with no commits along the way, so everything it had deleted up to that point was still uncommitted.

The restart has been rolling back for 2 hours now and isn't yet complete.

MORAL OF THE STORY

The key to improving this is knowing that it is UNCOMMITTED changes that get rolled back. So I've put SQL COMMIT commands into the stream of DELETE commands that the job executes. If this ever happens again, at least the rollback won't go all the way back to the beginning of the job, and DB2D won't take quite so long to restart.
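
For anyone who wants to see the shape of the idea, here is a minimal sketch. It is not our actual purge job: it uses Python's built-in sqlite3 module as a stand-in for DB2D so that it runs anywhere. The column name sample_date, the helper names, and the batch size are all made up for illustration, and the row-limiting syntax is SQLite's and would differ in DB2. The point is only the pattern: delete a bounded chunk, COMMIT, repeat, so that at most one chunk is ever uncommitted.

```python
import sqlite3


def make_demo_db(n_old, n_new):
    """Build an in-memory stand-in table (hypothetical schema, not the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MVS_ADDRSPACE_T (sample_date TEXT, jobname TEXT)")
    rows = [("1999-01-01", f"OLD{i}") for i in range(n_old)]
    rows += [("2024-01-01", f"NEW{i}") for i in range(n_new)]
    conn.executemany("INSERT INTO MVS_ADDRSPACE_T VALUES (?, ?)", rows)
    conn.commit()
    return conn


def purge_in_batches(conn, cutoff, batch_size=1000):
    """Delete rows older than `cutoff`, committing after every batch.

    Returns the total number of rows deleted. Because each batch is
    committed before the next one starts, a failure mid-purge leaves at
    most `batch_size` deleted rows uncommitted.
    """
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM MVS_ADDRSPACE_T WHERE rowid IN ("
            " SELECT rowid FROM MVS_ADDRSPACE_T"
            " WHERE sample_date < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        deleted = cur.rowcount
        conn.commit()  # the whole point: bound the uncommitted work
        if deleted == 0:
            break
        total += deleted
    return total
```

Calling purge_in_batches(conn, "2000-01-01") on a connection from make_demo_db removes the old rows in committed chunks and leaves the newer ones alone. The batch size is a trade-off: smaller batches mean less to roll back after a failure, at the cost of more commits during a normal run.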