From my diary

The legends of St Nicholas of Myra, or Santa Claus, became known in the West through a Life composed by a certain John the Deacon, probably in the 9th century.  It was based on Greek models, especially – as it says in the prologue – on the letter of Methodius to Theodorus which has given translators so much pain lately.

A few days ago I wrote a post about the Latin text of John the Deacon.  I can find no sign of a modern edition of this text.  The text listed in the Bibliotheca Hagiographica Latina is the 18th century edition of Falconius.  But the BHL makes clear that Latin manuscripts contain any number of recensions and reworkings of this.

It seems to me that John the Deacon should exist in English.  Being medieval Latin, it should be possible for me to translate it.  In order to use my various electronic tools, I need an electronic Latin text; so today I have been at work, OCRing the 15 pages of Falconius.

I haven’t tried to OCR a Latin text for years.  It’s been a nostalgic experience, in a way.

It was always awful to OCR Latin, because none of the OCR programmes supported Latin.  So you ended up making corrections on every line.

This is no longer the case.  Abbyy Finereader 12 does support Latin.  It is making a very fair fist of the page images of Falconius.  These were downloaded from Google Books and are by no means speckle-free or perfect.

On the other hand, I am still correcting pretty much every line.  Why is this?

Well, Falconius is an 18th century writer.  This means that he uses the “long s”, which is a bit like “f” – “God ſave the king!” – and also that “ct” is ligatured.  Neither is recognised by Finereader.

This is rather disappointing.  Back in the early 2000s, Abbyy was given quite a bit of German taxpayers’ money to develop OCR for “Fraktur”, the “gothic” typeface much used in Germany until Hitler banned it.  That software also handled both of these features of older printed texts.  But … the resulting product was not added to Finereader!  Instead a separate product was created, unaffordable by normal people.  And so, even today, the public cannot do Fraktur OCR.  One can only wonder at the imbecility of German politicians in allowing this.

So … it’s back to manual correction.
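Some of that correction could perhaps be semi-automated.  What follows is only a minimal sketch, and it rests on assumptions: that the OCR output renders the long s as a plain “f” (or as the Unicode long-s characters), and that a Latin word list is to hand.  The tiny `KNOWN_WORDS` set and the function names are hypothetical, made up for illustration.  The “ct” ligature is not handled at all, since what the OCR emits for it is unpredictable.

```python
import re

# Hypothetical stand-in for a real Latin word list (an assumption).
KNOWN_WORDS = {"sanctus", "est", "episcopus", "fuit"}

# Unicode characters an OCR engine might emit for old typography.
CHAR_MAP = {
    "\u017f": "s",   # long s
    "\ufb05": "st",  # long s + t ligature
    "\ufb06": "st",  # s + t ligature
}


def normalise_chars(text):
    """Replace long-s characters and st ligatures with plain letters."""
    for old, new in CHAR_MAP.items():
        text = text.replace(old, new)
    return text


def candidates(word):
    """Yield every spelling made by turning one or more 'f' into 's'."""
    positions = [i for i, ch in enumerate(word) if ch == "f"]
    for mask in range(1, 2 ** len(positions)):
        chars = list(word)
        for bit, pos in enumerate(positions):
            if (mask >> bit) & 1:
                chars[pos] = "s"
        yield "".join(chars)


def fix_word(word, known=KNOWN_WORDS):
    """Correct a word only when exactly one f-to-s respelling is known."""
    if word.lower() in known:
        return word  # already a real word, e.g. "fuit"
    matches = [c for c in candidates(word) if c.lower() in known]
    return matches[0] if len(matches) == 1 else word


def fix_text(text, known=KNOWN_WORDS):
    """Normalise characters, then try to repair each alphabetic word."""
    text = normalise_chars(text)
    return re.sub(r"[^\W\d_]+", lambda m: fix_word(m.group(), known), text)


if __name__ == "__main__":
    print(fix_text("fanctus eft epifcopus, qui fuit"))
```

Anything ambiguous or unknown is left untouched, so every remaining oddity still has to be checked by eye against the page image.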

All the same, it’s still far, far better than it ever was.  I would have killed for OCR of this quality in 1997!  On the other hand, I wish I had the eyesight that I did back then.

I also need to work out where I might find a dictionary of medieval Latin.

6 thoughts on “From my diary”

  1. The Dictionary of Medieval Latin from British Sources (DMLBS) from the University of Oxford is a good one. It seems to use modern lexicographical conventions and aims to give definitions, not glosses as the older dictionaries do. But given there was a degree of regionalisation in Medieval Latin it may not be eminently suitable for your project, so do check out some of the other ‘national’ dictionaries listed here:
    DMLBS’s availability on Logeion, however, is brilliant. Well done, Oxford.
    A surprisingly useful small and inexpensive dictionary for biblical and liturgical (and often theological) texts is Stelten’s Dictionary of Ecclesiastical Latin.

  2. I have a copy of Stelten for which I have no further use and also Alexander Souter’s Glossary of Later Latin to 600 A.D. (OUP 1949, 1996 reprint). Also R E Latham’s Revised Medieval Latin Word-List from British and Irish Sources (with supplement) (OUP for The British Academy, 1965) if anyone is interested in that. Free to anyone willing to foot the cost of postage from Australia.
