Once I got interested in Arabic Christian literature, I quickly found that the only book of use was Georg Graf’s five-volume Geschichte der arabischen christlichen Literatur, published 50 years ago by the Vatican Library. I was able to buy volumes 2-5 online, but not volume 1. The first two volumes deal with literature up to 1500, so they are really the only part that would interest readers of this blog.
My first step was to borrow the book from the library, and run it through a scanner to create a directory of images, one per page. This took quite a while, because it’s 700-odd pages! I used FineReader 8.0 OCR software, not to do OCR but simply to manage the scanning. I used an OpticBook 3600 book scanner (very cheap and very fast) to scan each page.
In FineReader you can crop the pages to the same size, and erase dots etc. I did this, producing images with only small margins. You can also export all the pages to create an image-only PDF, and so I did, getting a 50 MB PDF.
At this point I got rather ahead of myself, and omitted a crucial step, but I found this out later.
I opened an account on lulu.com (which is free), and started to create a book. To do this, you choose a paper size and binding. In my case this was 7.44″ x 9.68″, perfect binding. The site prompts you to upload a PDF, but the web upload is pretty awkward and fails a lot. I found that I had to follow the alternative path given on the site ‘for large files’ and upload my PDF using FTP.
When I had uploaded it, the site warned me that my PDF pages were smaller than the paper size. This meant that it would resize them. Foolish chap that I was, I presumed they would add white space. But this was wrong… they stretched the pages. They were still readable, but looked a bit odd.
You’re also asked whether your book should be made available to the public for sale (with whatever markup on cost you choose); only available on a private URL; or only available to you. I chose the last of these, in case there were copyright issues.
The site allows you to design your own cover — I did this in a basic way. You then get to see the PDF that results from all of this, which they send to a printer. You save, and that’s it. A link appears, offering you the chance to buy a copy yourself, which I did. For this volume the cost price was about $22, and the postage was extra of course. Manufacture of the book takes 3-5 days, and then the post office do their thing for however long they like.
In my case it was three weeks before it arrived. It looked perfectly acceptable, except for the slightly stretched letters.
What I should have done, after scanning the images and cleaning and cropping them, was to pad them with whitespace myself before making the PDF. This is something that FineReader doesn’t let you do. But it stores the images in .tif format, so you can use other tools on them.
Since there were 700-odd files, I wasn’t going to do this by hand! I used a free command-line tool called ImageMagick. I don’t know it well, but it did the trick. I found that I needed an up-to-date version.
Now the TIF files from FineReader all include a thumbnail. This makes them hard to work with. What I did was write a little batch file containing a series of commands:
convert 0001.tif 0001.png
convert 0002.tif 0002.png
convert 0003.tif 0003.png
...
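This gave errors, but converted all the pages to PNG format. Because each TIF holds both the page and its thumbnail, ImageMagick wrote more than one PNG per page, numbered 0001-0.png, 0001-1.png and so on, which is why the next step reads the files ending in -0. I had to convert first, because the next step wouldn’t work if I ran it on the TIF files directly.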
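With 700-odd pages, a list of commands like this is easier to generate than to type. Here is a minimal sketch of one way to do that in Python. It is not the method used in this post; the function name and the page range are made up for illustration, and the output would still need saving into a batch file.

```python
def conversion_commands(first_page, last_page):
    """Return one ImageMagick 'convert' command line per page.

    Assumes the scans are named 0001.tif, 0002.tif, ... (four digits),
    as in the commands shown above.
    """
    return [
        f"convert {page:04d}.tif {page:04d}.png"
        for page in range(first_page, last_page + 1)
    ]


if __name__ == "__main__":
    # Print the first three lines only; a real run would cover every page
    # and redirect the output into a batch file.
    for line in conversion_commands(1, 3):
        print(line)
```

The `{page:04d}` format pads the page number with leading zeros, so the generated names line up with the files that the scanning software wrote.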
I then wrote another batch file:
convert 0001-0.png -background white -gravity center -extent 2978x3872 0001-ok.png
convert 0002-0.png -background white -gravity center -extent 2978x3872 0002-ok.png
convert 0003-0.png -background white -gravity center -extent 2978x3872 0003-ok.png
...
This took all the pages and plonked each of them in the middle of a white background sized 2978 by 3872 pixels. I knew that this was the size of the pages in the ‘print ready’ PDF that lulu.com had generated (because I downloaded it, opened it in FineReader, and got the size of the image of page 1 in pixels).
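As a rough cross-check on that size, the paper dimensions can be multiplied by the scan resolution. The sketch below assumes the print-ready PDF is at the same 400 dpi as the scans, which the post does not confirm, and its function names are made up for illustration. It also builds the padding command lines in the same form as above.

```python
def page_pixels(width_in, height_in, dpi):
    """Convert a paper size in inches to whole pixels at a given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def pad_command(page, width_px, height_px):
    """Build the ImageMagick command that centres one page on a white canvas.

    Assumes input files named 0001-0.png, 0002-0.png, ... as produced above.
    """
    return (
        f"convert {page:04d}-0.png -background white -gravity center "
        f"-extent {width_px}x{height_px} {page:04d}-ok.png"
    )


if __name__ == "__main__":
    # 7.44 x 9.68 inches is the paper size chosen on lulu.com;
    # 400 dpi is an assumption about the print-ready PDF.
    print(page_pixels(7.44, 9.68, 400))
    # The measured size from the print-ready PDF is used for the command.
    print(pad_command(1, 2978, 3872))
```

Under that assumption the height comes out at 3872, matching the measured value, while the width comes out at 2976 rather than the measured 2978. The measurement taken from the print-ready PDF is therefore the safer figure to pass to -extent.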
Then I created a new FineReader project, read in all those PNGs at one go, saved them as a PDF, and this time had a PDF which was of the correct dimensions.
I’ve just finished uploading that, and bought a new copy of it. It ought to be perfect.
The PDFs that we find on archive.org and the like are generally of low resolution, so I don’t know if they could be used for this. I scanned Graf at 400 dpi; the PDF of Agapius that I have been looking at on archive.org was 200 dpi. So we may all have to scan our own books.
But this clearly works. If you need a copy of an out-of-print and unobtainable book for private research purposes, you don’t have to rely on a pile of photocopies. We all have piles and piles of those, I know! But no; scan them instead, save your floor space, and print them at lulu.com. You could even produce compilations in this way. You could print extracts, ring bound, with blank pages between each opening. All sorts of things are possible.
Of course, if you made them available to anyone else, you would need to be sure that they were out of copyright. If a book is in print, buy a proper copy. But if it’s a 19th century library catalogue, this is probably a nice way to get your own copy.
8th August 2008: the printed copy arrived, and it’s perfect!