Let’s all agree that Amelineau needs to be flogged!

Today I received an email asking me about a letter of St Shenouda of Atripe, the 4th century Coptic abbot.  The letter in question was apparently written to a nun, saying “I knew you long ago.”  The email asked if I knew the source.

Well, I have almost no familiarity with the works of Shenouda (or Shenoute), so I thought that I would poke around a bit.  The email author was French, so I wondered what existed in French.

Interestingly Emile Amelineau translated some material by Shenouda into French as long ago as 1907, in his “Oeuvres de Schenoudi”.  Vol. 1 is here; vol. 2 here.  But… neither has a table of contents!  Cunningly, the editor has also ensured that none of the names of the works appears in the running titles.  This would not be great in printed volumes; in a PDF it is impossible to gain an overview of the volume and its contents without significant work.  Flinders Petrie had hard words for Amelineau’s destructive ineptitude as an archaeologist at Abydos; a century later, I feel ever so slightly vexed at such carelessness.

On reading his interminable introduction, in search of information, I find that he is not really editing anything.  What he is doing is printing chunks of fragments from papyri.  So, for him, each section is just whatever was found in such a manuscript.  That may be better than nothing, but it does not help us here.

It seems that the reconstruction of Shenouda’s works had to wait until modern times, and the labours of Stephen Emmel.[1]  But that will have to wait for another day.

  1. See https://bmcr.brynmawr.edu/2023/2023.08.22/

Garbage in… Greek out? Experiments with Deepseek using OCR’d Italian containing embedded Greek.

The letters of the 6th century sophist Aeneas of Gaza have been sitting in a folder on my desktop for a month or two now, and I want to make some progress with making a translation into English.

It’s not a big text.  Each letter is only a short paragraph, and there are only twenty-five letters.  So the whole text would fill less than a dozen pages perhaps.  I have the 1962 edition with Italian translation by Lidia Massa Positano, which is more than a hundred pages.  There seems to be a rule that editions of tiny texts can be obese!  I have never forgotten the Gerlo edition of Tertullian’s “De Pallio” – a very short text of a page or two – which filled two lengthy volumes.  But Gerlo published in 1940 in the Netherlands, and it may have been expedient for him to be engaged in such a project at that time.  Anyway the Positano edition is not that daft, and consists of an introduction, the Greek text, the Italian translation of each letter with commentary and footnotes.

At some earlier point I seem to have run the Positano book through ABBYY FineReader 15 to create a Word document, which is 77k in size.  So today I extracted the portion to do with the letters.  The OCR language was set to Italian, so wherever the commentary quoted Greek, the output was gibberish.

Anche Procopio scrive lettere riguardanti prestiti di libri. Per esempio, dall’ep. LXIII, diretta a Pizio, risulta che Procopio si rammarica di non possedere un libro chiestogli in prestito da co­stui. Nell’ep. CHI, diretta a Stefano, Procopio vivamente lo rim­provera di un grave ritardo nella restituzione di un libro (p. 572,46 : èpuO-piàv cpiQaec^ w; napa^à; tVjv òrtoo/eaiv xaì xà? auv^xaj uirepi- Swv xal tò pi^Xfov i/wv Tpkov v) ré-capTov toutI ó p^SèrpiTOV xaftéÉEiv è7raYYeiXàp,Evo<;) e cita come esempio di solerzia e di zelo proprio un Giovanni (p. 573,12 àXX’ où/ 3 y£ aocpuiTaxo; ’IwàvvTj;
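Garbled stretches like the one above can be located mechanically before any checking begins.  What follows is only a rough Python sketch.  It assumes that mis-read Greek tends to betray itself through a stray symbol inside a word, or a capital letter straight after a lower-case one.  That rule of thumb, the list of symbols, and the function names are all assumptions made for illustration; none of it comes from the OCR software.

```python
# Characters that rarely occur inside an Italian word (an assumed, hand-picked list).
ODD_SYMBOLS = set("^<>/?")
# Punctuation that may legitimately surround a word.
EDGE_PUNCTUATION = ".,;:()[]\"'’"


def looks_garbled(token: str) -> bool:
    """Heuristically decide whether a token is probably mis-read Greek."""
    core = token.strip(EDGE_PUNCTUATION)
    if not core:
        return False
    has_odd_symbol = any(ch in ODD_SYMBOLS for ch in core)
    # A capital straight after a lower-case letter, as in "tVjv".
    has_inner_capital = any(
        a.islower() and b.isupper() for a, b in zip(core, core[1:])
    )
    return has_odd_symbol or has_inner_capital


def suspect_tokens(text: str) -> list[str]:
    """Return the whitespace-separated tokens that look garbled, in order."""
    return [token for token in text.split() if looks_garbled(token)]
```

On the sample above this picks out tokens such as “tVjv” and “òrtoo/eaiv”, and leaves “Procopio” and “LXIII” alone.  It will miss garbage that happens to look like an ordinary lower-case word, such as “xaì”, so it is a way to find the neighbourhood of the Greek rather than every letter of it.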

The printed text is as follows:

I thought, as a first step, that I would load the Word document into ChatGPT and get an AI translation of the lot, commentary, footnotes etc.  The idea was only to allow me to skim the material, and decide what to focus on.

But ChatGPT started complaining that it was a long file, a very very long file, etc etc:

The document you uploaded is very large (roughly 130,000 characters / 60+ pages). Translating the entire file accurately would be too long for a single response.

Then it offered various ways to make my life difficult if I carried on.

Well, I didn’t.  I popped it into Deepseek instead.  My prompt was “Translate this Italian text into English” – nothing exotic.   Deepseek made no objections and speedily output an English version of the whole thing.  I did have to copy and paste the output to Word, but that was not too burdensome.

But as I copied and pasted, I noticed something strange.  My eye was drawn to the gibberish sections of the Greek.  Here is the same passage, converted from exactly that gibberish above:

Procopius also writes letters concerning loans of books. For example, from ep. LXIII, addressed to Pyzius, it appears that Procopius regrets not possessing a book requested from him by the latter. In ep. CIII, addressed to Stephen, Procopius severely reproaches him for a serious delay in returning a book (p. 572,46: ἐρυθριᾷν φῂς ὡς παρελθὼν τὴν ὑπόσχεσιν καὶ τὰς συνθήκας ὑπερβὰς καὶ τὸ ἐμὸν χρέος τεταρτεῦσαι τοῦθ᾽ ὁ μηδὲ τρίτον καθέξειν ἐπαγγειλάμενος) and cites as an example of diligence and zeal a certain John (p. 573,12 ἀλλ᾽ οὐχ ὁ γε σοφώτατος Ἰωάννης…

It has recognised that the text is Greek – I did not tell it so – and it produced that accented Greek output.  The Greek is not actually completely correct.  But it’s very close!

In computing, there is no magic.  If it looks like magic, it means only that you don’t understand what is going on.  The input that I gave it was not enough to produce that Greek output.  It was just the attempts of an OCR engine to make Italian out of Greek.  So the Greek was retrieved from elsewhere, and the garbage string used to search for it.

I then tried the following prompt with a page of the garbled Greek:

This comes from a book in Italian, displaying Greek. Correct the Greek. '''….'''

This it proceeded to do, with an interesting commentary underneath:

So the garbled text is being used for a look-up of some sort.

We know that the databases in these “AI” engines – the Large Language Models (LLMs) – are essentially search databases made by pirating vast numbers of books and everything else.  In this case it is perhaps using the garbled strings to look up stuff in the Thesaurus Linguae Graecae database.

But, as ever with AI, you just cannot trust what it gives you.  You have to check, and checking can take longer than doing it yourself.
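One way to shorten the checking is to let the computer point at the places where the AI’s Greek differs from a text that you trust.  Here is a minimal Python sketch using only the standard library.  The function names are hypothetical, and the “reference” string in the example is an invented stand-in, not the real reading of Procopius; that would have to be typed in from the printed edition.

```python
import difflib
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop accents and breathings so that only the base letters remain."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def word_differences(ai_text: str, reference: str) -> list[tuple[str, str]]:
    """Return (ai_words, reference_words) pairs wherever the two texts disagree."""
    ai_words = ai_text.split()
    ref_words = reference.split()
    matcher = difflib.SequenceMatcher(a=ai_words, b=ref_words, autojunk=False)
    differences = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (" ".join(ai_words[a_start:a_end]), " ".join(ref_words[b_start:b_end]))
            )
    return differences


if __name__ == "__main__":
    # The end of the Deepseek output quoted earlier.
    ai_output = "ἀλλ᾽ οὐχ ὁ γε σοφώτατος Ἰωάννης"
    # Hypothetical stand-in for a trusted transcription (NOT checked against the edition).
    reference = "ἀλλ᾽ οὐχ ὅ γε σοφώτατος Ἰωάννης"

    # Every disagreement, accents included.
    print(word_differences(ai_output, reference))
    # Only disagreements in the letters themselves.
    print(word_differences(strip_diacritics(ai_output), strip_diacritics(reference)))
```

Running the comparison twice is deliberate.  A difference that disappears once the diacritics are stripped is only a slip of accent or breathing; one that survives is a wrong word, and those are the ones to look at first.  None of this removes the need for a trusted text to compare against.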

Interesting, and frustrating.  As ever!

Another drawing of the serpent column in Constantinople

Easily the most important monument in Istanbul is one that few visitors look at.  Located today in the Hippodrome is an ancient bronze column missing its head.  This is, in fact, the monument erected by the Greek states to commemorate the victory over the Persians at Plataea, and moved here later.  It is extraordinary that it still exists.  Originally it had a golden disk at the top, supported by three serpent-headed brackets, but the latter were broken off during the Ottoman period.

However there is a drawing of the column before this happened, in a portrait of the procession of the Sultan Suleiman the Magnificent.

The Procession of Suleiman the Great through the Hippodrome, fol. 7 from the series ‘Ces Moeurs et fachons de faire de Turcz’, Pieter Coecke van Aelst (1502–1550) – made in 1533/53

The drawing forms part of a series of woodcuts made by Pieter Coecke van Aelst, who arrived in Istanbul with his wife in 1533; the series was originally published in Antwerp.  The complete set forms a massive panorama of the city.  This section is on the extreme right.

There are various copies online, but this one is screen-grabbed from that at the Princely Collections of Liechtenstein, online here.  Another at lower resolution is at the British Museum here. The Metropolitan Museum of Art has all ten blocks online here.

I’m not sure where I downloaded this one from, but it shows the context of our screen grab:

The set of woodcuts is placed within a frame of caryatids by the publisher.  The circle of columns to the right once stood on top of the sphendone, which supports the end of the hippodrome even today.

H/t from Twitter here.

A quick postscript: another account, Barış Yaralı, did some interesting AI-colourisation on the image.  As AI always does, it distorts: somehow losing the serpent column and much else in the process, but bringing up the figures quite nicely.  Note the soldier staring at the artist.

AI colourisation of excerpt.