Playing with the Google Greek->English translator

Ekaterini Tsalampouni linked to this blog from her Greek language website.  I wanted to know what she said, so I copied it and pasted it into Google language tools.  The result was really very good:

Κατάλογος ψηφιοποιημένων χειρογράφων.

Από το ιστολόγιο του Roger Pearse πληροφορούμαστε για την ύπαρξη στο διαδίκτυο καταλόγου ψηφιοποιημένων χειρογράφων του Μεσαίωνα (μεταξύ των οποίων και αρκετών της Αγίας Γραφής. Για να βρεθείτε στη βάση δεδομένων, πατήστε εδώ. Για να διαβάσετε τη σχετική ανάρτηση του Roger Pearse, πατήστε εδώ.

became

List of digitized manuscripts

From the blog of Roger Pearse information on the existence of online digitized catalog of medieval manuscripts (among them several of the Holy Scripture. To get to the database, click here. To read the suspension of Roger Pearse, click here.

What more could you reasonably want?

How would it deal with patristic Greek, I wondered?  There used to be a website at aegean.gr that had PDFs of Greek texts from the Patrologia Graeca, but it has since vanished.  However, I did have a PDF or two, so I grabbed a bit of Constantine Porphyrogenitus, and pasted it in.  Well, from

Κωνσταντίνου ἐν αὐτῷ τῷ Χριστῷ, τῷ αἰωνίῳ βασιλεῖ, βασιλέως, υἱοῦ Λέοντος τοῦ σοφωτάτου καὶ ἀειμνήστου βασιλέως, λόγος, ἡνίκα τὸ τοῦ σοφοῦ Χρυσοστόμου ἱερὸν καὶ ἅγιον σκῆνος ἐκ τῆς ὑπερορίας ἀνακομισθὲν ὥσπερ τις πολύολβος καὶ πολυέραστος ἐναπετέθη θησαυρὸς τῇ βασιλίδι ταύτῃ καὶ ὑπερλάμπρῳ τῶν πόλεων. Εὐλόγησον πάτερ.

you get

Κωνσταντίνου ἐν αὐτῷ τῷ Χριστῷ, τῷ αἰωνίῳ King βασιλέως, son Λέοντος of σοφωτάτου he ἀειμνήστου βασιλέως reason, the Wise ἡνίκα his sacred Chrysostom he scenes from the Holy ὑπερορίας anakomisthen osper the πολύολβος he πολυέραστος ἐναπετέθη treasure τῇ βασιλίδι ταύτῃ he ὑπερλάμπρῳ cities. Πάτερ blessed.

No good, in other words.  But… then I thought, is this to do with accentuation?  What happens if I remove accents?  If I turn Πάτερ into Πατερ?  Sure enough “Πάτερ blessed” became “Blessed father”!

I’m going to experiment a bit further, and see if stripping off the accents does the trick.  What do we need to do, to make this work, I wonder?  Without any accents, we get:

Κωνσταντινου εν αυτω τω Χριστω, τω αιωνιω βασιλει, βασιλεως, υιου Λεοντος του σοφωτατου και αειμνηστου βασιλεως, λογος, ηνικα το του σοφου Χρυσοστομου ιερον και αγιον σκηνος εκ της υπεροριας ανακομισθεν ωσπερ τις πολυολβος και πολυεραστος εναπετεθη θησαυρος τη βασιλιδι ταυτη και υπερλαμπρω των πολεων. Εὐλογησον πατερ.

Which becomes:

Constantine in Christ afto meantime, meanwhile eternal king, king, son of Leon and sofotatou late king, why, inika the Chrysostom of the wise and sacred AGION scenes from the yperorias anakomisthen osper the polyolvos polyerastos enapetethi treasure and the identity and vasilidi yperlampro cities. Blessed father.

Not quite there, is it?  Interestingly, logos comes out as “reason” in its accented form, and as “why” in its unaccented form.  What am I doing wrong?
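For anyone wanting to repeat the experiment, the accent-stripping step can be done mechanically rather than by hand. What follows is only a minimal sketch in Python, using the standard `unicodedata` module; the function name `strip_greek_diacritics` is my own invention for illustration, and I am assuming that the pasted text is ordinary Unicode polytonic Greek. It decomposes each letter into a base letter plus combining marks, and then drops the marks, so accents, breathings and iota subscripts all go at once.

```python
import unicodedata


def strip_greek_diacritics(text: str) -> str:
    """Return text with accents, breathings and iota subscripts removed.

    Hypothetical helper for illustration only; it is not part of any
    translation tool.
    """
    # NFD splits each precomposed letter into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose anything that can still be recomposed.
    return unicodedata.normalize("NFC", stripped)


if __name__ == "__main__":
    print(strip_greek_diacritics("Εὐλόγησον πάτερ."))
```

One advantage of doing it this way is that nothing gets missed: in my hand-stripped version the first word of the last sentence still carries its breathing mark, which a pass like this would have removed.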

The lost libraries of Timbuktu

One evening last week I happened to see part of a BBC4 TV programme, The lost libraries of Timbuktu:

Aminatta Forna tells the story of legendary Timbuktu and its long hidden legacy of hundreds of thousands of ancient manuscripts. With its university founded around the same time as Oxford, Timbuktu is proof that the reading and writing of books have long been as important to Africans as to Europeans.

I couldn’t watch this programme for long — too much left-wing or “blacks are wonderful” propaganda, and not much hard information at all.

However I did learn from it that there is a trove of hand-written books in Timbuktu.  They all stem from the Moslem invasion of West Africa in the middle ages.   The oldest are 13th century.  The older books were in Arabic; the more recent ones in tribal languages, written in Arabic script.  The latter were naturally preferred by the modern holders of the books.  During the French period — the only period of civilised rule it has ever known — an unspecified number were rescued and carried off to an unspecified destination (we are invited to consider this as an “indignity”!).  Doubtless they are in the French National Library, and probably properly catalogued too, although this was not said.  Wild estimates of the number of such books were tossed around; anything up to 700,000 was mentioned, although this seems unlikely.  We saw a desktop scanner being used to digitise a page.

There was lots of talk about “riches” of books.  But… what precisely do these texts contain?  How many are of what age?  This I could not learn.

I found online a Moslem Timbuktu Educational Foundation, based in California, as it seems the “riches of African culture” don’t extend to adequate internet connections.  They claim to own the manuscripts.  The site solicits a donation of $100 to preserve and translate each manuscript, although the contact form doesn’t work, and the one and only newsletter is dated 2003.  The site is also infuriatingly vague, but gives a little more:

The manuscripts cover diverse subjects: mathematics, chemistry, physics, optics, astronomy, medicine, history, geography, Islamic sciences and traditions of the Prophet Muhammad (peace be upon him), government legislation and treaties, jurisprudence and much more.

Yes?  So, which authors?  Which texts?  Is there a catalogue?  And… can’t they get some money off the oil-rich states, being good Moslems and all?  (I certainly would, in their shoes).

The BBC is to be commended for commissioning a programme on manuscripts.  Someone there should be shot for making a piece of political agitprop instead.  A wasted opportunity, then; but still good to see manuscripts on the box.  More please.

PS: The Washington Post has a much better article on all this here.  Manuscripts are 16-18th century.  Some of the mss are online at the Library of Congress here.  See also this article.

More lust for the CPG – works of Eusebius in Armenian and Georgian

I’ve been unable to stop thinking about the object of my obsession.  Yes, this is another “why the Clavis Patrum Graecorum is like Paris Hilton” post.  Both might make you go blind, for instance, although probably for different reasons.  How many people realise just how wonderful this object is?

What brought this on, I hear you say?  Well, thinking about Eusebius of Caesarea, and his “Tough questions about the Gospels” (Quaestiones ad Stephanum/Marinum — and if I owned a copy of the CPG, I’d give the work’s CPG reference number).  As everyone knows, this work is lost but a large chunk survives, plus some fragments in Medieval Greek bible commentaries which were made up purely of chains of quotations from the Fathers of the Church. I commissioned David Miller to translate the Greek fragments; someone else is doing fragments extant in Syriac.

But I’m a sad person.  (Sorry Paris).  I started wondering what other languages Eusebius’ work might have been translated into in late antiquity.  Coptic is an obvious choice, and there are fragments in that language. 

But what about Armenian?  The Armenians were converted to Christianity around the time of Eusebius.  They set up a monastery in Jerusalem, to copy Greek books, translate them into Armenian, and send them back to the old country.  We know that at least two works by Eusebius were indeed translated into Armenian.  His famous Church History exists in Armenian.  Better still, his Chronicle exists; book 1 of that work only exists in Armenian, in a single copy.  That copy was found in the 18th century by a traveller staying in a rural district of Armenia, who got up in the night for a glass of water and found the book being used as the water-pot cover!

Anyhow, I started asking around.  Maxime Yevadian mentioned that the Canon and the letter to Carpianus also existed in Armenian 1.  The excellent Dominique Gonnet of the CNRS in France then pointed me to the CPG!  To my astonishment, this lists information about Georgian works by Eusebius (please forgive rough OCR):

3465. Epistula ad Carpianum. Canones euangeliorum. Versio georgica. B. UT’IE, Evsevis ep’ist’elisa … Udzvelesi kartuli versiebi, in Mravalthavi 17 (1992), p. 117-123.
3467. Commentarii in psalmos. (1) in ps. 37. Versio georgica (introductio in psalmos). M. SANIDZE, Psalmunis dzveli kartuli redakciebi, 1 (Anciennes rédactions géorgiennes des Psaumes), Tbilisi, 1960, p. 470-475.
3495. Historia ecclesiastica. Versio georgica (fragmentum de S. Iacobo fratre Domini: H.E., II,23). Cf. M. VAN ESBROECK, Les homéliaires, p. 123, 189, 213.

Of course the most exciting bit of that is the portion of the unpublished and untranslated monster-work, the Commentary on the Psalms.  Nothing on the Quaestiones, but what a book, that contains stuff like this!

<swoon>

1 Thomson, Bibliography of Armenian Literature, Brepols, 1995, pp. 51-2. 

The Clavis Patrum Graecorum – what about the workers?!

I lust after the Clavis Patrum Graecorum, Geerard’s multi-volume list in Latin of the Greek and Oriental fathers and their works.  I feel about it like some people must feel about Paris Hilton; something incredibly expensive which one could never afford to run.

You know, this is an essential reference tool, for anyone working with the Fathers.  But who has a personal copy?  Who can afford one?  I don’t live within 60 miles of a copy.

Does anyone know of a way of obtaining copies of this which doesn’t involve hundreds and hundreds of dollars?  Some very expensive and essential texts are bootlegged, I know, in PDF form.  Suggestions very welcome!

No free speech online in Australia? – blame the Christians!

In Slashdot today there is an article which tells me that “Christian groups” in Australia are campaigning to get the government to filter all internet traffic there.  This puts in place the tools to censor the web in Australia.  Looking around, I find the Australian Christian Lobby seems to be the group in question.  They want to block internet porn.

I don’t know the background to this, and internet porn is certainly an evil.  But there are several questions that jump out at us.  Leaving aside whether the ACL represents anyone but itself, we might ask whether the Australian government is a pro-Christian one.  Because if not, then anti-porn is not the agenda.

As I understand it, the government is currently trying to erect its own “Human Rights Commission”.  The very name will send a chill through anyone who has followed the evil bodies of that name in Canada.  This is about “banning hate”, which has become the code-word for censoring disagreement.  It wants to make it possible for favoured groups like gays and Moslems to drag into court people who they don’t like.  At least one Christian pastor has already been hauled into court after talking about Islam, without these new laws and bodies.  So this is not a government which favours Christianity, unless making legal harassment possible is a novel form of favour.

So why is it backing the ACL?  It looks a lot to me as if the ACL is a convenient patsy.  The government wants to end free speech in Australia.  As part of that, it wants mechanisms to censor the internet.  But since this is unpopular, it has to pretend that this is to “protect our kiddies”, and blame any negative effects on some group that it doesn’t actually like that much. 

This way they evade the blame for their censorship, while setting up the Christians to be blamed.  After all, when the censors block Christian sites, they can point to the ACL and say “well, you proposed it!”

All of us must oppose these measures to censor the web, whatever guise they appear in.  They are purely about removing freedom, whatever the pretext.

Manuscripts online from Corpus Christi College, Cambridge

One of the Cambridge colleges has put its manuscripts online; or rather, has allowed an American university to do it for them.  Thanks to the catalogue in the last item, I find that the Parker library at CCC is online here.

The website is a bit useless.  What you want is a list of manuscripts and a bunch of PDFs to download.  What you get is one of these airy-fairy force-the-user-to-do-a-junk-registration sites, with badly categorised materials and no search by author, as far as I could see.  The most useful access seems to be the browse by title.  This gives a single page, from which the alert can pick out the stuff they want.  The actual stuff underneath that, for each manuscript, seems normal, if fussy.

But I can’t avoid saying this: how these people love to obtrude themselves between the user and the actual page images!  You have to click repeatedly to get an image of a page large enough to read; then the same for the next page, etc.  Come on, guys; think of the user for once!

Most of the collection is medieval, lots of it concerned with Old English, some of it stuff by Parker himself.  But there’s a copy of part of Orosius there, some stuff by Isidore of Seville, Augustine, Jerome on Ecclesiastes, a sermon by Chrysostom, some Bede, Nennius, Origen on Numbers, letters of Symmachus, and bits of Sulpicius Severus.

Great to have it online, anyway.

Catalogue of digitised medieval manuscripts online

A new catalogue of medieval mss online has appeared.  It’s here.

The man responsible, Matthew Fisher, in this article in Science Daily makes exactly the right points.

A member of a new generation of scholars who cut their teeth in the San Francisco Bay Area during the dot-com era, the Los Angeles native is motivated by a commitment to democratize access to some of the world’s most exclusive repositories.

“The price of admission shouldn’t be a plane ticket to a library in Europe or even Australia,” he said. “These documents are part of the world’s cultural patrimony. Everybody should have access.”

After all, we’re paying for them.  Most mss are owned by state-funded libraries, or are state-owned.

Recipient names in Isidore of Pelusium

The recipient names in the letters of Isidore of Pelusium have a different textual history to the body of the text.  These names appear at the top of each letter, sometimes followed by a one-line summary.

I learn from Pierre Evieux’s excellent study that in the manuscripts, these items were not copied at the same time as the rest of the text.  This is because they are written in red, and are therefore done by a rubricator.  The copyist had to leave a space for them, and then someone — himself or another — would come back and fill in the gaps in red ink.

The same approach was taken in medieval texts to decorated initials.  Quite a large number of those initials were never done, and there are many manuscripts which still have a gap at the appropriate place.  So we can see immediately that the names can be lost in transmission far more easily than the rest of the text.

Nor is this all.  In a modern edition the names would be on a separate line.  But in a manuscript, saving parchment is all — especially if you had to kill the sheep necessary to make that parchment!  So the names would be inline, and distinguished by the colour to indicate the start of a new letter.  If you didn’t leave enough space, what then?

The only possible answer would be to abbreviate the words.  But names are hard to abbreviate.  Consequently the result could well be obscure symbols, also leading to loss.  This would be hard to fix next time the text was copied, especially as the next copyist would be liable to leave the same amount of space, thereby preventing the abbreviation from being expanded.

Finally the red ink tended to fade more than the black ink, leaving portions illegible.  A few scattered letters would be all that could be copied.

A further factor is the nature of the manuscript.  When it contained a copy of the letters of Isidore of Pelusium, the names of the recipients were important to the reader, and are generally included.  But there are also manuscripts which only contain a selection by subject of his letters, e.g. on some point of scripture.  In these manuscripts the name of the author was important, as an indicator of authority, but the recipient names hardly so.  In this type of manuscript the recipient names suffer much more damage.

All these features are found in the manuscripts. 

These interesting comments by Pierre Evieux would seem to have wide application to many other sorts of texts.  They explain how letters can easily be combined in transmission; how the names can easily be corrupted or mistaken.  All these little details help us to understand what we see in any text that has reached us.

The EThOS of the electronic age

An interesting statistic from Owen Stephens, who is project director for the EThOS project to make British PhD theses available online (and who picked up and commented on my post about the project; clearly a man on top of his game).  Making theses available online has quite an impact:

To give some indication of the difference this can make, the most popular thesis from the British Library over the entire lifetime of the previous ‘Microfilm’ service was requested 58 times. The most popular electronic thesis at West Virginia University (a single US University) in the same period was downloaded over 37,000 times.

I rather think the EThOS project will be a howling success.  More details on Owen’s blog.

Problems with the CSCO edition of Jacob of Edessa’s Chronicle

The Chronicle of Eusebius may not be his best known work, but it is still fairly widely known.  The second half of this consisted of tables of dates, rulers, and events, in a form which has now been imitated and continued for some fifteen centuries (Jerome’s version here).

Among the continuators was the 7th century Syriac scholar, Jacob of Edessa.  His main claim to fame is that he realised that Syriac needed vowels, and was able to induce his Syrian Orthodox co-religionists to adopt Greek vowels, albeit written as tiny letters above the line.  Their rivals in the Church of the East doggedly stuck with swarms of dots above and below the line to indicate vowels; a practice disastrously followed by Arabic.  Indeed Jacob even tried to get the vowels written on the line with the consonants, but here he failed.

His chronicle starts where Eusebius ends, in the 20th year of Constantine.  He begins with several pages discussing an error of calculation in Eusebius, and then a table of kings of Rome and Persia, years of their reign, “total years” (from the start of his chronicle) and events against each year makes up the rest of the Chronicle.  A badly damaged manuscript from the Nitrian desert in Egypt now in the British Library contains what survives of the text.  The work is of importance as one of the earliest mentions of Mohammed — as king of the Arabs — in a non-Moslem text.

The tabular portion of the work was printed in the ZDMG 1 at the end of the 19th century by E. W. Brooks, who appended a non-tabular translation in English of the events.  He revisited the text for the Corpus Scriptorum Christianorum Orientalium series; vol. 5 consists of the Syriac text in a volume of Chronica minora, while vol. 6 contains a Latin translation in tabular form, including the introduction.

I have scanned the English translation from the ZDMG, with the intention of placing it online.  I obtained the CSCO volumes, and intended to format the text in tabular form, and simply replace the Latin translation with Brooks’ earlier English translation.  Simple?

I am encountering several problems doing so, which seem interesting in themselves.

Firstly, it is by no means as clear as it might be what the layout on the manuscript page actually is, when these seem to start in the middle of a page in the printed edition.  Are those running headers “PERSIANS: Sapor” really present half-way down the manuscript page, as Brooks suggests by printing them at the top of the page of the printed edition?  How is it that alternate pages seem to be non-tabular; is that a feature of the original; table and facing text?  Are any of those headings colour coded, as Eusebius coded his original text?   The only way to find out is to consult the original manuscript.

In addition, Brooks was unable to read the text in many places.  In some places he resorted to patching it from Michael the Syrian, who quotes extensively from Jacob, it is true.  But this is a risky thing to do.  We want Jacob’s text, as it exists.  We don’t want Michael here, except in a footnote.

As for the unreadable text, I wonder whether it would become readable under UV light?

Comparing the English translation with his Latin translation, the latter is longer, and words that were uncertain the first time are not so the second.  His use of Michael is probably the reason for this new certainty.  But there are worrying differences.  I have already come across one event which is labelled as one year in the English, and the following year in the Latin.  There is no indication of why the event is supposed to happen a line later in the text.  Which is right?  Did the printers do this?

Clearly we need a new edition of this work.  It’s not a long text, perhaps 20 pages.  We need an English translation of the discussion of Eusebius.  We need good pictures of the text, not the partial ones that Brooks had – perhaps using Multi-Spectral Imaging.  None of this should be beyond the skills of any Syriacist. 

Is anyone interested? 

1. Zeitschrift der Deutschen Morgenländischen Gesellschaft, 53 (1899) 261-327
