Taking my machine translator to Agapius

The Kitab al-Unwan (World History) of the 10th century Arabic Christian writer Agapius runs from creation down to his own times, divided into two halves by the birth of Christ.  It was published a century ago in the Patrologia Orientalis, in 4 chunks, and three of those are online at Archive.org.  They were published by a Russian, with a French translation.

In my hotel room in the evenings, I’ve been translating the French into English.  It’s very simple French, as might be expected.

Last weekend I scanned the first half of the second part (PO7) into my PC, ran Finereader 8 optical character recognition (OCR) software, and proofed the results (which took very little work).  I did find that the online PDFs are at 200dpi or less (almost unusable for OCR), so I had to buy a copy and scan it at 400dpi.

I then ran the French text through a little utility to split it into sentences, one per line, and fed the result to my elderly desktop copy of Systran 3.0.  The quality of translation was really very good indeed!  Finally I ran both the input and the output through another little utility to interleave the sentences of French and English, which makes it much easier for me to produce the final version.
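For anyone wanting to try the same workflow, the two little utilities might look something like the Python sketch below.  This is not the code I actually ran; the function names are made up for illustration, the sample sentences are invented rather than taken from Agapius, and the splitting rule (break after a full stop, exclamation mark or question mark followed by white space) is a deliberately crude assumption.

```python
import re


def split_sentences(text):
    """Split text into sentences, breaking after '.', '!' or '?' plus whitespace.

    Deliberately naive: abbreviations such as 'av. J.-C.' will be split
    wrongly, so the output still needs a quick check by eye.
    """
    flat = " ".join(text.split())  # collapse line breaks left over from OCR
    return [s for s in re.split(r"(?<=[.!?])\s+", flat) if s]


def interleave(source, translation):
    """Pair each source sentence with its translation, blank line between pairs."""
    if len(source) != len(translation):
        raise ValueError("sentence counts differ; the files are out of step")
    lines = []
    for src, dst in zip(source, translation):
        lines.extend([src, dst, ""])
    return lines


if __name__ == "__main__":
    # Invented sample text, standing in for the OCR output and the Systran output.
    french = split_sentences("Il régna dix ans.  Puis il mourut.\nSon fils lui succéda.")
    english = ["He reigned ten years.", "Then he died.", "His son succeeded him."]
    print("\n".join(interleave(french, english)))
```

The length check in interleave matters more than it looks: if the translator merges or splits a sentence, every later pair is misaligned, so it seems better to stop than to produce a plausible-looking but wrong file.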

This week I’ve been working on the output file on a little hand-held personal digital assistant.  The device is pretty much useless, even though I bought a keyboard for it.  But I’ve been able to work, and make quite a bit of progress.  The result will appear online eventually (I already posted the French to a French-language newsgroup, in case it might encourage them).

I suggest that we need to consider whether some of the older Patrologia Orientalis translations may merely be awaiting someone with a minimal level of knowledge of French to be made more widely available.

I’ve also been trying to get hold of a copy of the Italian translation of the Annals of Agapius’ contemporary historian, Eutychius.  No copy of that book exists in any UK library!  I’ve found a bookseller in Jerusalem who says he has one (isn’t the web wonderful!).  It will be interesting to see if there are any good machine translators of Italian!

1,000 Arabic Christian Manuscripts destroyed in WW2? Nonsense!

In the preface to volume 2 of the catalogue of the Mingana manuscripts in Birmingham, Alphonse Mingana states (p. v) that the main collections of Arabic Christian manuscripts in the East are the library of Mt. Sinai; the library of the Catholic University of Saint-Joseph in Beirut; the Coptic Patriarchal museum and library in Cairo; and the library of Paul Sbath in Aleppo.

Searching for information on the last, often referenced in Graf’s history of Arabic literature, I found this link to the Schoyen collection.  On it, there was this statement: “Paul Sbath had one of the most important collections of Arabic MSS ever formed, ca. 3000 MSS. 2000 MSS are in the Vatican Library, 1000 MSS were destroyed during the war, 2 MSS including the present one came to England.”  Yet I find that the HMML expect to photograph some of the Sbath mss in Aleppo.

Fortunately this turns out to be nonsense.  An enquiry on the Hugoye list brings the following information:

Sbath’s catalogue of his manuscripts (P. Sbath, Bibliotheque de manuscrits Paul Sbath, pretre syrien d’Alep: catalogue, 3 vols. Cairo, 1928-34) lists 1349 manuscripts. 

Of these, nos. 1-338, 340-776 are in the Vatican (I don’t know what happened to no. 339, and I can’t remember now why I know it’s missing). 

Most of nos. 777-1349 are in Aleppo, in the possession of Fondation Georges et Mathilde Salem. The manuscripts are (or were in 2001) in their office in Aziziyeh. Some of the manuscripts have gone missing; there are also a number of additional manuscripts not listed in Sbath’s catalogue. I gather from the Internet that a new catalogue of this collection is about to be published: Francisco del Rio Sanchez, Catalogue des manuscrits de la Fondation Georges et Mathilde Salem (Alep,  Syrie) (Sprachen und Kulturen des christlichen Orients), Stuttgart: Reichert, 2008.  — Hidemi Takahashi.

That’s more likely.  I wonder how the mss ended up in the Vatican, tho. Another email from John C. Lamoreaux tells us:

Sbath himself collected around 1300 MSS — though he claimed to have more, perhaps as many as 1500.  About half of these ended up in the Vatican Library (fonds Sbath).  These are well preserved, and copies are easily had.  Apparently, there were legal troubles getting the remaining mss out of Syria.  Most of the rest of the mss, but not all, passed to his brother, and are now in the Foundation Sbath, near the Jesuit Residence in Aleppo.  Hill is now said to be digitizing the mss remaining in Aleppo.  For a list of the mss still in Aleppo, see the entry on the foundation in Takahashi’s bibliography on Barhebraeus (2005).

Sbath also published in the 1930s a three-volume catalogue of mss in private holdings, mostly in Aleppo.  It lists about 3000 mss, most otherwise unknown.  To my knowledge, none of these mss has yet to be found.  I am about finished with an article arguing that Sbath was being less than honest, that he never actually saw many of these mss.

This all makes sense and gives us a little more.

Unreliable English translation of “History of the Patriarchs of the Coptic Church”?

The massive Arabic Christian history begun by Severus ibn Mukaffa in the 10th century and running down to our own times is a gem.  But I was looking at Google books today, and found a statement here in vol. 1 p. 211 of the Cambridge History of Egypt that the English translation published in Cairo in the 1950s is unreliable.  The first four chunks were published by B. Evetts in the Patrologia Orientalis, are presumably sound, and are here.  5 chunks of the Cairo publication are at this site, and 3 more exist.  It’s a very hard book to get hold of, as I can testify!

This is the sort of thing that makes me wish that I was a rich man.  I’d just hire someone and fix the translation. 

Shlomo Pines, Agapius and Elmacin (Al-Makin)

Shlomo Pines published a curious version of the Testimonium Flavianum of Josephus, taken from the Arabic Christian writer Agapius.  But rereading his article, and comparing this text with the Patrologia Orientalis version of Agapius, we quickly find that there is a problem.

Pines’ text is not that given by the Florence manuscript, which alone preserves Agapius.  However the CSCO text also gives quotations from the later Arabic Christian historian, Al-Makin or Elmacin.  These Pines has used to supplement the text, and thereby produce his version.

Now in a way this is rather dubious.  After all, we know that texts expand in transmission.  The Testimonium is perhaps more prone to this than any other bit of Josephus, as the reference in Photius shows, which gives a bit about Jesus otherwise quite unknown.  Glosses on this text were always going to occur, and be incorporated.  So treating the manuscript as epitomised is unusual.

The real question is whether Al-Makin generally expands on his authorities.  If he does, then the extra material must be worthless, and Pines’ version with it.

But there is no complete edition of Al-Makin at all; none that contains this passage at all; no critical text of any of it; no real translation of any value in any language (unless we include Ethiopic).  The text is pretty much inaccessible.

I believe that the Agapius Testimonium is not as we have been led to believe.  I suggest that Agapius merely gave a rough summary of the contents, rather than a quotation; the text rather reads like that anyway.  Until we have a real understanding of Al-Makin’s text and its sources and handling of them, I think we ought to place Pines’ version on the shelf marked ‘to be verified’.

Wit and wisdom in the ancient world

Last night I read a truly splendid article by R. Van Den Broek, Four Coptic Fragments of a Greek Theosophy, Vigiliae Christianae, 32 (1978), 118-42.  It’s on JSTOR here.  If you have JSTOR access, don’t try to read it on-screen, because it will make your eyes hurt; print it on paper and read it that way.

What’s great about it, I hear you ask?  Well, the first three pages provide a really good overview in English of a subset of gnomologia: ancient collections of pagan prophecies predicting the coming of Christ.  Most of these have never been translated into English, and all are hard to access and understand.

It seems that in late antiquity, as the temples were being demolished, the Christians of the period justified this to educated pagans by appealing to quotations from the philosophers predicting that the temples would fail and become unnecessary. 

This gives us a date for the origin of this kind of literature: the 5th century, when paganism was far from dead among the aristocracy, and such arguments could be useful.  The ‘quotes’ themselves tend to be a bit bogus; dodgy people like Hermes Trismegistus are invoked.  Oracles of the gods themselves are included.

There are a few of these sayings in Cyril of Alexandria’s Contra Iulianum.  But the big 5th century collection is an anonymous Greek “Theosophy”.  This is lost, except for a longish chunk of book 11, containing quotes from the Sibylline oracles.  But a long abstract has been preserved, known as the Tübingen Theosophy and published by H. Erbse in Theosophorum Graecorum Fragmenta.  (This is not one of those monster tomes, but a smallish book.)  This tells us about the content.  The first few books were dedicated to describing the true faith, and the next few to predictions of Christ of this kind.

The fragment of the “Theosophy” tells us that the quotes come partly from Lactantius.  As might be expected, manuscripts of the fathers are the main source, and probably even glosses on those manuscripts were used as if by pagan authors — after all, without quotation marks, who could be sure?

Later collections play down oracles by the gods — now relegated to history — but instead start using pagan predictions to parallel those from the Old Testament.  An example of this is John ibn Saba’s Precious Pearl.

The actual research in the article is four more bits of ‘prophecy’, this time from Coptic sources.  Sebastian Brock published some from Syriac.  My own site contains text and translation of a few from Arabic. 

Sayings literature was a popular genre.  Consequently maxims and sayings spread all over the literate world.  It would be interesting to learn whether any made their way into Persian or Indian!

Looking for ancient texts in Arabic

It’s a bit of a treasure hunt, nosing around in some of the minor language groups of the ancient world. You’re always looking for some text that will tell you a bit more about antiquity, give a bit more primary data than you get from the standard texts.

And what does every treasure-hunter need?  A treasure map!!!  Ideally you get a list of ancient authors and what they wrote, written in antiquity before lots of it was lost.

According to Georg Graf’s modern route-map, the Geschichte der arabischen christlichen Literatur, such a thing does exist for Arabic Christian literature.  It’s by a chap called Abu’l-Barakat, and is a list of Arabic Christian literature, names and works.

It was published with a German translation by Wilhelm Riedel, Der Katalog der christlichen Schriften in arabischer Sprache von Abu’l-Barakat, in Nachrichten der K. Gesellschaft der Wissenschaften zu Göttingen. Philologisch-Hist. Klasse, 1902 (Heft 5), pp. 636-706.

Interestingly volumes of this journal are online at Archive.org.  But not, of course, this one.

Wouldn’t it be nice if this was online, in English?

How useful are the scanned books at Archive.org?

I’ve downloaded the Patrologia Orientalis volume 7 from Archive.org, and started to translate the French text of Agapius into English.  This is very easy French, as it was written by a Russian, so not his first language.

A real scholar would probably throw up his hands in horror.  The very idea of making a translation from a translation, rather than from the original text, is something that scholars would try not to do.

But hardly anyone knows 10th century Christian Arabic.  Quite a lot of people know French; quite a lot don’t.  I don’t know how much of the text I will translate.  But whatever I do translate should help to make the work better known, so it seems like a worthwhile task to me.

What I’ve been doing is printing off the pages and scribbling a translation in the margin.  Today I typed up a fortnight’s scribblings, which was tedious but necessary.  But…

I can’t help noticing that the 200dpi resolution of the pages isn’t really high enough.  The text is quite faint, even when printed in colour.  The footnotes are hard to read.  Was that Daniel chapter 9, or chapter 4?  Even in the text there can be problems.  Was that 5,500 years, or 3,300 years?

A couple of weeks ago I decided to buy a printed copy of that fascicle of PO 7.  My thinking was that the French was just so easy, that the machine translators might do it perfectly (which was untrue, but never mind).  It arrived yesterday.  I scanned part of it today.  But I couldn’t avoid noticing that letters that I had great difficulty reading when they were part of the PDF were perfectly clear now.

This is worrying.  The last thing I want to do is to discourage the digitisation of these volumes.  But at the same time, shouldn’t we ask for higher resolution?

Publishers will be pleased, tho.  Consultation for the odd bit may be OK, but for serious work, I had to go and buy a copy!

Using Lulu.com to get copies of books

Once I got interested in Arabic Christian Literature, I quickly found that the only book of use was Georg Graf’s 5 volume Geschichte der arabischen christlichen Literatur, published 50 years ago by the Vatican library.  I was able to buy volumes 2-5 online, but not volume 1.  The first two volumes deal with literature up to 1500, so are really the only part that would interest readers of this blog.

In this post, I mentioned that I intended to try using the print-on-demand service, lulu.com, to make a personal copy of volume 1.  Indeed I did so, and perhaps my experience will be of use to others.

My first step was to borrow the book from the library, and run it through a scanner to create a directory of images, one per page.  This took quite a while, because it’s 700-odd pages!  I used Finereader 8.0 OCR software, not to do OCR but simply to manage the scanning.  I used an OpticBook 3600 book scanner (very cheap and very fast) to scan each page. 

In FineReader you can crop the pages to the same size, and erase dots etc.  I did this, producing images with only small margins.  You can also export all the pages to create an image-only PDF, and so I did, getting a 50 MB PDF.

At this point I got rather ahead of myself, and omitted a crucial step, but I found this out later. 

I opened an account on lulu.com (which is free), and started to create a book.  To do this, you choose a paper size and binding.  In my case this was 7.44″ x 9.68″, perfect binding.  The site prompts you to upload a PDF, which is pretty awkward and fails a lot.  I found that I had to follow the alternative path given on the site ‘for large files’ and upload my PDF using FTP.

When I had uploaded it, the site warned me that my PDF pages were smaller than the paper size.  This meant that it would resize them.  Foolish chap that I was, I presumed they would add white space.  But this was wrong… they stretched the pages.  They were still readable, but looked a bit odd.

You’re also asked whether your book should be made available to the public for sale (with whatever markup on cost you choose); only available on a private URL; or only available to you.  I chose the last, in case there were copyright issues.

The site allows you to design your own cover — I did this in a basic way.  You then get to see the PDF that results from all of this, which they send to a printer.  You save, and that’s it.  A link appears, offering you the chance to buy a copy yourself, which I did.  For this volume the cost price was about $22, and the postage was extra of course.  Manufacture of the book takes 3-5 days, and then the post office do their thing for however long they like.

In my case it was three weeks before it arrived.  It looked perfectly acceptable, except for the slightly stretched letters.

What I should have done, after scanning the images and cleaning and cropping them, was to pad them with whitespace myself before making the PDF.  This is something that Finereader doesn’t let you do.  But it stores the images in .tif format, so you can use other tools on them. 

Since there were 700-odd files, I wasn’t going to do this by hand!  I used a free command-line tool called ImageMagick.  I don’t know it well, but it did the trick.  I found that I needed an up-to-date version.

Now the TIF files from Finereader all include a thumbnail.  This makes them hard to work with.  What I did was write a little batch file containing a series of commands:

convert 0001.tif 0001.png
convert 0002.tif 0002.png
convert 0003.tif 0003.png


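This gave errors, but converted all the pages to png format.  Because each TIF holds two images (the page and its thumbnail), ImageMagick writes a numbered file for each, such as 0001-0.png and 0001-1.png; the next step uses the -0 file, which is the page.  I had to do this, because the next step wouldn’t work if I did it on the TIF files directly.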
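With 700-odd pages, nobody wants to type the batch file by hand either.  A few lines of Python can generate it; this is only a sketch, assuming the page images are numbered consecutively with four digits as in the examples above, and the function name is my own invention.

```python
def conversion_commands(first_page, last_page):
    """Return one 'convert NNNN.tif NNNN.png' line per page number."""
    return [
        f"convert {page:04d}.tif {page:04d}.png"
        for page in range(first_page, last_page + 1)
    ]


if __name__ == "__main__":
    # Print the first three lines; redirect the output into a batch file to use it.
    print("\n".join(conversion_commands(1, 3)))
```

Printing the commands rather than running them keeps the sketch independent of whichever ImageMagick version is installed, and lets you read the batch file before you trust it with the whole book.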

I then wrote another batch file:

convert 0001-0.png -background white -gravity center -extent 2978x3872 0001-ok.png
convert 0002-0.png -background white -gravity center -extent 2978x3872 0002-ok.png
convert 0003-0.png -background white -gravity center -extent 2978x3872 0003-ok.png


This took all the pages and plonked each of them in the middle of a white background sized 2978 by 3872 pixels.  I knew that this was the size of the pages in the ‘print ready’ PDF that lulu.com had generated (because I downloaded it, opened it in Finereader, and got the size of the image of page 1 in pixels).
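As a sanity check on those numbers, the canvas size can also be worked out from the paper size and the scanning resolution.  The sketch below assumes the 7.44 x 9.68 inch page and the 400 dpi mentioned in this post; the function names are hypothetical.  The arithmetic gives 2976 by 3872 pixels, which matches the measured height exactly and is two pixels narrower than the 2978 found in the lulu.com PDF, so it seems safest to use the measured figure in the commands, as I did.

```python
def page_pixels(width_in, height_in, dpi):
    """Convert a paper size in inches to whole pixels at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def pad_command(page, width_px, height_px):
    """Build the ImageMagick line that centres one page on a white canvas."""
    return (
        f"convert {page:04d}-0.png -background white -gravity center "
        f"-extent {width_px}x{height_px} {page:04d}-ok.png"
    )


if __name__ == "__main__":
    print(page_pixels(7.44, 9.68, 400))  # calculated canvas size
    print(pad_command(1, 2978, 3872))    # measured canvas size, as used above
```

One thing to watch: -extent crops as well as pads, so a scan that is larger than the canvas in either direction will lose its edges rather than being shrunk to fit.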

Then I created a new Finereader project, read in all those PNGs at one go, saved them as a PDF, and this time had a PDF which was of the correct dimensions.

I’ve just finished uploading that, and bought a new copy of it.  It ought to be perfect.

The PDFs that we find on archive.org and the like are generally of low resolution, so I don’t know if they could be used for this.  I scanned Graf at 400 dpi; the PDF of Agapius that I have been looking at on archive.org was 200 dpi.  So we may all have to scan our own books.

But this clearly works.  If you need a copy of an out-of-print and unobtainable book for private research purposes, you don’t have to rely on a pile of photocopies.  We all have piles and piles of those, I know!  But no; scan them instead, save your floor space, and print them at lulu.com.  You could even produce compilations in this way.  You could print extracts, ring bound, with blank pages between each opening.  All sorts of things are possible.

Of course if you made them available to anyone else, you would need to be sure that they were out of copyright.  If it is in print, buy a proper copy.  But if it’s a 19th century library catalogue, this is probably a nice way to get your own copy.

8th August 2008: the printed copy arrived, and it’s perfect!

Agapius and Archive.org scanned book quality

I was interested to find many volumes of the Patrologia Orientalis online at Archive.org.  Three of the four volumes that contain Agapius are among these.  So I downloaded PO7, which contains the section of Agapius from the birth of Christ (part 3 of 4), and printed a few pages. 

Now I’ve been doing some business trips lately. There isn’t a lot to do in a hotel during the evening, so I found myself scribbling an English translation in the margins.  I’ve decided to buy a PDA, in fact, to save myself the trouble of retyping.

However I began to get concerned at the quality of these (colour) prints.  In some cases the letters were not too clear.  At a couple of points, Agapius starts quoting Greek; and I couldn’t make out the letters!  The actual resolution seems to be 120 dpi at best.  This is way below the 400 dpi at which I scan everything myself, and isn’t really enough.

Perhaps I am missing something here, but if not, we have a problem, especially with texts in exotic alphabets.

We all know Agapius as containing an odd version of the Testimonium Flavianum.  This became widely known from an article by Shlomo Pines.  The version contained in the PO did not agree with my memory, so I went and looked up the Pines article.  It seems that Pines supplemented the PO text with quotations of Agapius in the later Arabic Christian historian, Al-Makin.  The version in Al-Makin is longer than that in the Florence ms, which alone contains this part of Agapius, and contains extra sentences.  Strictly we should refer to this as the Al-Makin version of Agapius, perhaps.

Ancient sayings literature

I collect joke books.  Most evenings I get home, tired, and I’m not really in the mood to read something heavy.  Instead I pick up a joke book, open it anywhere, read a few lines and always find something to make me smile.

Anyone who has bought joke books will be familiar with the way that the exact wording can change.  The contents of any book will vary, depending on what the author had access to.  Some jokes are attributed to famous people in one book, and are anonymous in others. 

Collections of wit and wisdom are not modern inventions.  Someone has invented the horrible term ‘gnomologia’ – literally ‘words of wisdom’ – to describe these things.  That’s enough to put anyone off!  But it means the same.  These are ancient collections of wit and wisdom.

I’ve been reading Denis Searby’s edition of the Corpus Parisinum (although the library have seen fit to only send me volume 1, the Greek text!).  I am struck by the way in which the contents of this monstrous 9th century collection of sayings, anecdotes, apophthegms (a long word for ‘bits of sage wisdom’) follow these rules also.

Joke books are a low-brow form of literature in our day, but a very popular one.  Likewise collections of sayings and wit were a popular form of literature, and occur all over the place in the manuscripts.  It is worth considering that one of Caxton’s first publications in English was a translation of an Arabic collection of wit and wisdom.  Doubtless he printed it primarily because he believed that he could sell it readily.

Some versions of the collection omit some or all of the names of the authors to whom each saying or story is attributed (the jargon for this is the ‘lemma’).  But clearly it is the wit of the saying which is important, not the specific person as a rule.  We would never criticise a joke book author for changing attribution, if it made the joke funnier, after all.

As the Greek language changed, sayings had to be rewritten.  An archaic word might dull the point of some saying; it would have to be rephrased.  Translations into Syriac and Arabic were initially very literal.  But quickly they would be rephrased or rewritten in order to work in their new context.  Impact is everything with a joke or anecdote; without it, it loses its point.  Unfunny jokes are not repeated.

Modern jokes are usually delivered orally.  There is thus an oral stage to transmission, particularly with the Arabic material, where the culture favours quotations of sententious wisdom and so is favourable to exactly this form of literature.

Other volumes are collections of anecdotes.  After-dinner stories can be bought in most bookshops.  Again, Bar Hebraeus compiled a volume of anecdotes, published by E. Wallis Budge as “The laughable stories.”  These follow the same sorts of rules.  Many a modern story is attributed to Churchill, or Oscar Wilde.  Arabic ones tended to end up attributed to Aristotle.

Dr Searby makes a couple of interesting points about the transmission of these works.  For one thing, if we are trying to produce a critical edition, precisely what is the autograph?  In what sense is there an original?

Secondly he suggests that, within the limits given above, the transmission of the content of sayings is quite faithful. 

It’s clearly a mistake to treat these sayings collections as if they were literary works like a poem or a history.  Their nature means that they must be transmitted differently: the text is expected to be altered, and to have additional material added.  There is no fraud or dishonesty in this; it is merely the nature of the genre.

PS: After writing this I began to read the “Laughable stories”.  Saying 56: “A rich man wrote above the door of his house, ‘No evil thing may enter.’  Diogenes said, ‘Fine; but how is your wife to come in, then?’”