This evening I was looking through some PDFs of a Mithras reference volume, which a correspondent very kindly scanned for me some time back. I keep a copy on my travelling laptop, so that when I am working away from home, I can work on the site in the evenings in the hotel. I was, in fact, looking for information on the Nesce Mithraeum, in Latium; and, rather to my surprise, that page was missing.

So I decided to go through the PDF (which I received in parts of a few pages each) and check whether any other pages were missing. A few were, but I can obtain photocopies from a library and patch the PDFs.

But I came to the end of the directory, and double-clicked on a file and … it wouldn’t open.  Adobe informed me that it was corrupt.

This was a surprise.  I knew the file must have been OK once.  All the files in that directory were emailed to me, and I certainly opened them all at least once, and often many more times.  How could it be corrupt?

Now I carry around with me a back-up of my hard disk, on an external hard disk. It’s kept up to date every weekend. So I went to that and tried to open the same file. And … it wouldn’t open.

Somehow the file that I had downloaded to my PC at home had become corrupt, at some point in the past.

In this case there was a happy ending.  I never got around to deleting the email(s) that sent me this book, and so I could just download the piece again.  And, sure enough, that was fine.

But that PDF file has never been anywhere except on my hard disk.  How could it have become corrupt, without any other intervention?

More seriously … I have gigabytes of PDFs of books.  How many of these, I wonder, have silently rotted?

Nor am I the only one.

Today I accessed a website discussing an obscure technical subject.  The article was less than a year old, but the links to samples and bitmaps no longer worked.

It’s not so long ago that I found that the zip files on the Electronic Journal of Mithraic Studies website – which seems pretty much abandoned – no longer unzip.  Somehow, at some point, in their state of neglect, they have rotted.  But how?

We need a way to check the integrity of our collections of electronic books.  There is no manner of use in having them, if they are not there when we need them.

I don’t know how it might be done; but done it needs to be.
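One possible approach, offered as a rough sketch rather than a tested solution, is to record a checksum for every file while the collection is known to be good, and then recompute the checksums from time to time and compare. The Python below assumes the books live under a single directory; the function names (`sha256_of`, `build_manifest`, `verify`) are my own invention for illustration, not part of any existing tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read 1 MB at a time so large PDFs need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (by relative path) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return {relative path: "missing" or "changed"} for files that differ."""
    root = Path(root)
    problems = {}
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file():
            problems[name] = "missing"
        elif sha256_of(path) != expected:
            problems[name] = "changed"
    return problems
```

The manifest would need to be saved somewhere (as a JSON file, say) and kept with the back-up. Note the limits: this only detects that a file's bytes have changed since the manifest was built. It cannot repair anything, it cannot tell rot from a deliberate edit, and it is no help for a file that was already corrupt when first recorded.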

Gentlemen … check your files!

  1. What a frightening thought! I shall back up the 550-page file of my translation of Michael the Syrian, of which no hard copy presently exists, as soon as I get home.

    I suppose, philosophically speaking, it is good for us to be reminded of the inexorable operation of entropy. Change and decay in all around I see …

  2. Very good idea! And a printout kept in a file would not be a bad idea. At least it could be scanned in, if the worst came to the worst.

    It is a reminder of entropy. The fire-giants will one day set the world on fire. Nothing is forever.

  3. And a printout kept on paper would not be a worse idea… 550 paper pages are cheaper than nothing.

    No, Roger Pearse, no one can say « if the worst came to the worst », because everybody and his uncle knows that in our demiurgic wild world « the worst comes to the worst ».

  4. This is a major worry I have as I amass my electronic library. There has to be software out there that checks this and somehow guards against it. Does anyone know of anything?

