Loudwater
Systems
Management


Failures in Collect -- More experience

An SMF collect got part way through, did a commit because of buffer full, then failed because of running out of space on the volume before end-of-log.

I did a recover to the last previous backup, then a reorg to a new empty pack (much small detail to set up). When I reran collect, EPDM remembered that the collect had been partial and restarted at the part way point instead of from the beginning. So about 4 hours of data were not applied. Clearly I did not need to do the recovery -- the tablespace was not damaged (the update that caused the disk failure was rolled back).

The correct action would have been to reorg the dataset to get it to another pack where there was some space. Then simply drop the collect JCL in again unchanged.