Looking at the “Life” of St Mewan

St Mewan, as he is known in Cornwall – known as Saint Méen in Brittany, and Sanctus Mevennus in Latin – was a Breton saint.  The Bibliotheca Hagiographica Latina states that he died in “Britannia Armonica” in the 7th century, and is commemorated on 21 June.  His “Life” (BHL 5944) is said to be 11th century, according to Nicholas Orme, The Saints of Cornwall, Oxford (2000), p. 67, as I posted earlier:

The 11th-century Life of Mewan, written in Brittany, claims that Austell was a priest and godson of Mewan who lived with him in his monastery at Saint-Meen (I.-et-V.), attended his deathbed, and died on 28 June (his subsequent feast-day), exactly one week after his master (Plaine 1884: 155-6; Doble 1939c: 4-11). Both saints were honoured at Saint-Meen. In Cornwall the parishes of St Austell and St Mewan adjoin one another, and have probably done so since at least the 10th century when the two saints occur together in the early list of saints (Olson and Padel: 34, 59). Austell’s Cornish parish, however, was much larger than Mewan’s, reversing their status in Brittany.

The “Life” was printed by Fr. B. Plaine in Analecta Bollandiana 3, p.142-56, from a single manuscript in the BNF in Paris, which he says is 15th century but which the BNF says is 16th (!?).  Unfortunately the MS is not online.

I have started to translate Plaine’s edition into English, by running each chapter in turn through Google Translate and ChatGPT 3.5, interleaving the Latin sentences with the output from these.  Then I use my QuickLatin parser to look up words for part of speech etc.  I use the Logeion website to access the Dictionary of Medieval Latin from British Sources, to look up post-classical meanings.  The Latin is simple enough, and it is just a case of turning the handle.  I have a couple of weeks of domestic business to attend to, but then I will get to it.


Import Turnpike Emails into Thunderbird – for free

When I first came onto the web in 1997, I used Demon Internet, and their “Turnpike” software on Windows.  All my emails were done that way, safely offline, until about 2012, when I moved to Gmail.  I still have my Turnpike directory on my PC, and, even on Windows 11, Turnpike.exe opens, and all my old emails are still in there.

But it’s pretty hard to search through those for some .doc file from long ago.  How do I import all those emails into somewhere that I can actually use?

If you do an internet search, Google will show you page after page of results from sites, all ending in “.com”, offering a “solution” – to buy some tool.  Thank you, Google.  All that money-grabbing drowns out the real results.  Luckily I found one in an old forum here.

The answer, it seems, is to use Mozilla’s Thunderbird as an intermediary.

I detest these scammers, and you do not need to buy anything.  Turnpike can export to “MBOX” format, a text file; and local email clients like Thunderbird – which is free – can import it.

Here’s how.

Export all your emails and attachments from Turnpike to MBox.

Go into your Turnpike directory, and find turnpike.exe.  In my case this is Turnpike 5.01.

Open it up.  On the menu, choose Window | Mailroom View.  That will show all your emails.  The first one is highlighted.

Select the lot.  For me, I had to click on the first email, hold the shift key, and hit Ctrl-End.

Then do File | Export, and save the mail_001.txt to some directory.  It took a few seconds, but it worked.  In my case the .txt file was almost a gigabyte.  This DOES include all the attachments, all UU encoded as text.

I then copied the mail_001.txt file and called the new copy “00 Turnpike”, because I wanted all my emails in a folder of that name.  You can use any name that is not already a folder in your email.  Put 00 on the front to make it appear at the top of the folder list, for reasons we will see.

I would strongly suggest that you find an email with an attachment, and just export that on its own.  Try to import that under some name, as below, and check the attachment is imported OK.
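If you would like to check an export before importing it, here is a minimal Python sketch using the standard library “mailbox” module.  It assumes that the Turnpike export really is a standard MBOX file, and that attachments appear as uuencoded blocks starting with a line like “begin 644 filename”.  The function name summarise_mbox is just something I made up for illustration.

```python
import mailbox
import re

# A uuencoded attachment starts with a line such as "begin 644 report.doc".
UU_BEGIN = re.compile(rb"^begin [0-7]{3,4} (.+?)\r?$", re.MULTILINE)


def summarise_mbox(path):
    """Return (message_count, names_of_uuencoded_attachments) for an MBOX file."""
    box = mailbox.mbox(path, create=False)  # create=False: fail if the file is missing
    try:
        count = 0
        names = []
        for key in box.iterkeys():
            count += 1
            raw = box.get_bytes(key)  # raw bytes, so odd encodings cannot trip us up
            names.extend(m.decode("latin-1") for m in UU_BEGIN.findall(raw))
        return count, names
    finally:
        box.close()
```

Calling summarise_mbox on your one-message test export should report one message and the name of its attachment.  If the attachment name is missing, the export did not include it.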

Find out where the Thunderbird “Local Folder” is on your disk

Then open Thunderbird.  Scroll down the left panel until you find the Local Folders area (I have a couple of online email accounts connected to Thunderbird so I can read offline, which you see at the top).

As you can see, I already have a local folder named “00 Roger” which I use to back up my emails locally.  But you don’t need that.  I called it “00 Roger” because the local folder is full of junk files, which you mustn’t touch.  So by using the “00”, my folder sorted to the top!  Makes it easier to find.

Right click on “Local Folders” and choose “Settings”.  Select “Local Folders” on the left panel.  This will show you where your local folders are actually held on your hard disk.

As  you can see, I changed the “local directory” from whatever garbage it usually is to somewhere under d:\roger, where I keep all my user files.  It doesn’t matter where it is.

Now take a note of where the local directory is.

Then close down Thunderbird.

Import the Mbox file into the Thunderbird Local Folders directory

Then open that local directory in Windows Explorer.

Copy your small file with the attachment into this directory, right next to the “cert8.db” and all the other files.  Or copy your big “00 Turnpike” file in.
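The copy can also be scripted.  This is only a sketch: the function name install_mbox and the paths in the usage note are hypothetical, and it should only be run while Thunderbird is closed.  It refuses to overwrite anything already in the directory, since those other files must not be touched.

```python
import shutil
from pathlib import Path


def install_mbox(export_file, local_folders_dir, folder_name):
    """Copy an MBOX export into the Thunderbird local directory.

    The copy has no file extension; its file name becomes the folder name.
    Nothing that already exists in the directory is overwritten.
    """
    target = Path(local_folders_dir) / folder_name
    if target.exists():
        raise FileExistsError(f"{target} already exists; pick another name")
    shutil.copyfile(export_file, target)
    return target
```

For example, install_mbox("mail_001.txt", r"d:\roger\thunderbird-local", "00 Turnpike"), where the second argument is whatever local directory you noted down earlier.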

Then restart Thunderbird.

You will now have a new folder in Local Folders.  But … if it’s the biggie, “00 Turnpike”, do wait before expanding the folder.  Allow Thunderbird time to process all those attachments.  For a small file, this won’t take all that long.

Once you feel sure, expand it.  Your emails will be inside, marked unread.

If you go back to the local folder in Windows Explorer, you will see your “00 Turnpike” file as you left it, but with a new “.msf” file, which indexes it.

And you’re done.  You have your emails out of Turnpike.

Troubleshooting?  “Where are my attachments?!”  Well, delete the folder in Thunderbird, and try again with a single message.  See if that works.  If it does, then probably you just need to leave Thunderbird open and let it process stuff.

If it all worked OK, then you’re good.

Getting the emails into GMail

Maybe you want to copy/upload some/all of them into a Gmail account?  Then there are links online that will tell you, like this one.  Basically you just create a connection in Thunderbird to your online email, using IMAP.  This will download your emails to your PC, and create folders etc.  You then just drag the emails from “Local Folders/00 Turnpike” into the folder under your online email account.  But the link will give you a blow-by-blow account of that.  (I didn’t do it myself, tho, because I am increasingly suspicious that anybody who uses Google’s “free” services is about to get a rude awakening, in the shape of unavoidable “low” charges which somehow become very high charges.  See “Monopoly”.)

Likewise if you want a  local copy of your online emails, in Thunderbird, just copy/drag them from the folder for your Gmail account to a folder under “Local Folders”.

But the point here is that you now can work with your Turnpike emails.

Good luck.


An index of available translations on this site to download

This blog is getting large.  A lot of patristic and other texts have been translated and placed here.  I thought that a list, linking to the posts, and also directly to the PDF and the Word file, might be helpful.  So I have compiled one, and placed it in the side-bar under Translations Available For Download.

I made this list by searching the “uploads” directory for all the .pdf and .doc* files, and then searching for these in the “Search” box on the blog.  So it’s probably not complete.  There will be translations that were made as purely blog posts, without a download.  There are a few of these in the Additional Fathers section of the website, such as Agapius, and I really need to revisit those and create some downloads.

There are also 70+ short files by Anthony Alcock which contain translations of Coptic texts.  These will have to wait another day.


From My Diary

Working as a computer programmer meant working on a series of “projects” to deliver some software system, or, more often, a package of enhancements to some existing system.  Once you finished the project, there was often a lull while the code was released to production.  In that time, you would tidy up; do various little tasks.  Often your computer was set up in a particular way, you’d have piles of project-related material lying around, links on your desktop, and so forth.  You would breathe.  Draw breath.  Catch up on sorting out stuff, readying for the next one, as yet unknown.

Now that the St Nicholas material has been completed, it’s the same feeling again here.  It’s time to destress, to potter, tidy things up.

I’ve been going through the icons on my Windows desktop, getting rid of shortcuts to directories on which I will no longer be working.  Looking at Word documents left on the desktop, and merging those with others, getting rid of this and that.  And, of course, backing everything up.

One text file, with a reminder, drew my attention.

This blog has been running for a long time now.  I’ve been uploading PDFs and Word documents for years and years.  But there is no master list of everything that I have done.

Wouldn’t it be nice if there was?

So today I have created a new page on the blog – only visible to me, at the moment – and started to list the items.

WordPress stores all the uploads – in a directory called “uploads”, funnily enough – and part of my backup routine is to download this at intervals.  So I opened up a Git Bash window and did a unix-style “find” for all the .pdf and all the .doc* files in that directory, and piped the results to a text file.  There’s quite a few!
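The same search can be done without a unix-style shell.  Here is a minimal Python sketch, not the command that I actually ran.  The function names and the “uploads” path in the usage note are assumptions for illustration.

```python
from pathlib import Path


def list_downloads(uploads_dir):
    """Return every .pdf and .doc* file under uploads_dir, sorted by path."""
    hits = []
    for path in Path(uploads_dir).rglob("*"):
        suffix = path.suffix.lower()
        # ".doc*" covers .doc, .docx and similar
        if path.is_file() and (suffix == ".pdf" or suffix.startswith(".doc")):
            hits.append(path)
    return sorted(hits)


def save_list(paths, out_file):
    """Write one path per line to a text file."""
    with open(out_file, "w", encoding="utf-8") as f:
        for path in paths:
            f.write(f"{path}\n")
```

Something like save_list(list_downloads("uploads"), "downloads.txt") then gives the text file to work through.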

Nor will this list be complete.  Before I uploaded stuff to the blog, I posted it in HTML in the Additional Fathers collection.  Those files need to be rescued.  It would be good to have these in .docx and .pdf as well.   Then there are translations which were just blog posts, not done as downloads.

It would also be good to add the CPL, CPG etc reference to every text.  And I don’t just want to link to the downloads.  I want to link to the post which included them as well.

So there is quite a bit of work here.  But anyway, I’ve started.  I’ll work through the list of files first, and then we’ll see.

I noticed today that the WordPress theme didn’t seem to have a “next post” “previous post” link.  This is important if you are looking at a post in 2014, with no idea of what came before or after!  Actually I have found that the theme does have this – at the bottom of the post above the comments.  But before I noticed this, I installed a plugin to add arrows!  I will leave it there for now.

Likewise I have added archives by month at the bottom right.  I generally know which month an upload was made in.  So at least I can browse the month now, and see if I can find the post!

It’s a bit of a tedious job, but it will be useful, not least because it’s nice to see what I was doing 10 years ago, and to add a bit of extra value.  I need to make sure that we have PDF and .docx for everything – in the early days I didn’t always do this.

In the aftermath of the St Nicholas project, I also feel like a bit of holiday somewhere.  A week away somewhere would be nice.  A change of scene.  Leaving the laptop behind!  It is really important to take your holidays, as I always say.

I had actually thought about a trip out to somewhere in the Near East.  Unfortunately all the yelling and shouting and shooting and bombing at the moment makes that area seem less than inviting right now.  Also it is many years now since I grew sick of getting an upset stomach every time I went to Egypt.  So I won’t go far.

But yes, a bit of holiday and time away beckons.  In the meantime, I shall catalogue the projects of yesterday.


A new project: “translating key pieces of patristic pseudepigrapha into English” by Nathan Porter

A post on Bluesky by Nathan Porter:

Now online, and coming soon to an airport near you, is the first English translation of the Pseudo-Athanasian work, De Incarnatione et contra Arianos. academia.edu/114648612/Ps… So begins my long-term project of translating key pieces of patristic pseudepigrapha into English.

Coming soon: Ps-Basil, Against Eunomius IV and V Ps-Athanasius, Dialogues on the Trinity Ps-Epiphanius, Homily on the Resurrection Anonymous, Life of Amphilochius.

On the Academia page he adds:

This is the first English translation of the Pseudo-Athanasian work De Incarnatione et contra Arianos (PG 26: 984-1028). Though it has received little scholarly attention, it is a work of considerable interest for its novel exegesis of biblical texts and unusual theological formulations. Some have attributed it to Marcellus of Ancyra, though probably erroneously.

The work is CPG 2806.  The edition is that of Montfaucon.  Interestingly there is a Latin version in Florence BML 584, of the 9th-10th century; an Armenian version; and a Syriac version in the CSCO series!


Three more miracle stories of St Nicholas: BHL 6177, 6178 and 6209

Last year I created a file with the Latin text of 47 of the medieval miracle stories of St Nicholas, and a draft English translation for each.  Three more stories were left unfinished, containing BHL 6177 (the miracles at Angers), BHL 6178 (the miracles at Brauweiler), and BHL 6209 (a musical miracle at the Cluniac priory of Sainte-Croix in Burgundy).  I have now finished these up, and here they are:

They are also at Archive.org here.

As usual, these files and their contents are public domain.  Use them in any way you like, personal, educational, or commercial.

The list of stories about St Nicholas in the Bollandists’ catalogue, the BHL, is longer still.  But most of the remaining pieces are quite long, and, in all honesty, late medieval.  They are way too late for this blog, and out of scope.

I did think about BHL 6210 also, which is not too long, and I went so far as to scan the Latin text.  But in all honesty I feel no urge whatever to do any more St Nicholas material just now.  So let’s stop here.


From My Diary

The other evening I realised with a shock that the project with the St Nicholas material is actually done.  My original intention was to make the oldest hagiographical material available in English translation, and this I have achieved.  With the translation of the “Life of St Nicholas” by Methodius (ad Theodorum), which originally drew me into this, the project is complete.  All that remains is to tidy up.

What remains?  Well, I have a couple more fragments of Latin miracle stories that I did.  But the original reason for doing these was to help with the translation of John the Deacon’s “Life of St Nicholas”, which often is interspersed with Latin miracle stories.  But all those are done.  The remainder are all later stuff, and really are out of scope.  So I will just release the last handful that I have done, and stop there.  That will be it.

Something that I did long ago was the first recension of the “Praxis de Stratelatis”, the story of the three generals.  This a kind colleague translated from the text printed by Anrich in “Agios Nikolaos”.  A couple of days ago, I started to OCR the second recension from Anrich, so that I could put this into an AI Translator.  I did the first page, and the results were very nice indeed.  The AI translators do a fine job.  The OCR wasn’t too bad either, except that Anrich used a strange version of “theta” (θ) where the loop is not closed, so Finereader OCR thinks that is an ampersand (&).  Likewise sigma was sometimes handled as beta.  The high-point was always recognised as an asterisk.  And so on.  The accentuation was a mess, of course; but the machine translators do not seem to care.  My new unicode Greek SPIonic-layout keyboard for Windows 11 worked fine.  But … correcting the OCR became tiresome.  And I found myself wondering why I was bothering.  I never intended to translate everything between the covers of Anrich’s two fat volumes.
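The two consistent OCR errors, at least, could be fixed mechanically.  This is a minimal sketch, assuming that the page contains no genuine ampersands or asterisks, and using the middle dot (U+00B7) for the high-point.  The sigma/beta confusion was only occasional, so it still needs a human eye.

```python
# Map the two consistent misreadings back to the intended Greek characters.
OCR_FIXES = str.maketrans({
    "&": "\u03b8",  # open-looped theta, misread as an ampersand
    "*": "\u00b7",  # high-point, misread as an asterisk
})


def fix_ocr(text):
    """Replace the known OCR misreadings in a piece of Greek text."""
    return text.translate(OCR_FIXES)
```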

Thankfully an academic team has now come along and will do professional work on all the St Nicholas texts.  That is as it should be, and I wish them all the best.  My own humble efforts have made the texts more accessible to everyman, and they never had any purpose beyond that.  If they have spurred renewed interest from scholars, then that is better still.

So… what now?

I was quite impressed with how well the modern Greek translations of St Nicholas material were handled by the AI translators, with a bit of sanity-checking from Google Translate.  I really have almost no translations of patristic material into modern Greek.  Indeed I wonder… now that we can work with modern Greek, it might be interesting to see just what already exists in translation in that language!

The only other text that I have in modern Greek translation is the mass of hardly-edited texts under the name of “Ephraem Graecus”.  I have the Phrantzolas edition of these, thanks to a correspondent.  In fact I find that the ancient/medieval Greek of these is in the elderly TLG disks, which most of us have, so I have access to that too.

I fired up Diogenes, which I use to work with that disk, and picked a text at random.  (In fact it was “Sermo unde magi in Hierosolymam ineunt.”)  I copied some of the text, and ran it through Bard AI.  Here’s the text:

Λόγος ὅτε οἱ μάγοι παρεγένοντο εἰς Ἱεροσόλυμα

 Ὅταν ἀγαθοῦ τινος ὁδοιπόρος τύχῃ συνόδου, χαίρει τὸν πόνον τῆς μακρᾶς ὁδοιπορίας κλεπτόμενος ὁμιλίᾳ· ὡς ῥάβδῳ γὰρ ἐρειδόμενος λόγῳ ἀκονιστικῷ γλώττῃ, συμβαδίζειν κεκονισμένῳ δοκεῖ τῷ ποδὶ καὶ τῷ στόματι ἀκαμάτῳ· μεριζόμενος γόνασι κόπον, κουφίζει χείλεσι πολυβάδιστον βῆμα.  Οὕτω δὴ καὶ τοῦ Χριστοῦ γεννηθέντος, οἱ μάγοι τὸν ἀστέρα ἰδόντες καὶ τοῦτον λαβόντες συνοδοιπόρον, τὸν πολυπόρευτον κόπον ἔκλεπτον τῆς ὁδείας ἐρωτήσει κοπούμενοι, ποῦ ἐστιν ὁ βασιλεὺς πυθόμενοι· ὡς κλέπτας τοῦ τεχθέντος ἠρεύνουν φωνῇ τοὺς Ἑβραίους.  Τοῖς δὲ ἐρωτῶσιν εἰκὸς Ἰουδαῖοι, τί δή, ξένοι, τολμᾶτε, τί φατε, ἄνδρες, φασί; Τί φέροντες ἐπικίνδυνον ἥκατε φήμην; Τί βασιλέα καινὸν βασιλευομένῃ σαλπίζετε πόλει; Τί πρὸς ἄωρον κυβιστεύετε τέλος; Τί κατ’ οἰκείων μαχαιροῦτε γλῶτταν τραχήλων; Τί τάφον ἐπιφέρεσθε στόματι, καθεύδοντα καθ’ ἑαυτῶν διυπνίζετε θάνατον; Ἠπόρει μνημάτων Περσίς, ἵνα ἔτι ζῶντος Ἡρῴδου βασιλέως ἄλλου πυνθάνεσθε; Πολλὴν ἀκούσας ὁμολογήσει χάριν ὑμῖν καὶ μεγάλοις ὑμᾶς ἀμείψειε δώροις.  Ἀλλ’ ἡ πρὸς ταῦτα τοῖς μάγοις ἀπόκρισις σύντομος· εἴδομεν αὐτοῦ, φασί, τὸν ἀστέρα ἐν τῇ ἀνατολῇ καὶ ἤλθομεν προσκυνῆσαι αὐτῷ. Οὐκ ἠρκέσθησαν ἐρωτῆσαι, ἀλλὰ καὶ προσκύνησιν φῆσαι· διὰ τῆσδε τῆς λέξεως Θεὸν εἶναι τὸν τεχθέντα δηλοῦντες.  Ὡς δὲ ἦλθε τῶν τοιούτων πρὸς Ἡρῴδην ἡ φήμη, τοὺς μάγους τέως ἀνείς, τοὺς σοφοὺς Ἰουδαίων καλεῖ λέγων· ποῦ ὁ Χριστὸς γεννᾶται; Οἱ δέ φασιν, ἐν Βηθλεὲμ τῆς Ἰουδαίας.  Ὢ τοῦ θαύματος! Τὸν τόπον ἐπίστανται, καὶ τὸν τόκον βδελύττονται· τὴν Βηθλεὲμ βοῶσι, καὶ τὴν οἰκονομίαν σιγῶσι· τὴν Ἰουδαίαν σημειοῦνται, καὶ τὴν δεσποτείαν ἀρνοῦνται. Εἰ οὕτω γέγραπται, διατί μὴ συντίθεσθε; Εἰ ἀνέγνωτε, διατί μὴ πιστεύετε; Εἰ ἐν Βηθλεὲμ τεχθήσεται, διατί μὴ παρ’ ὑμῶν προσκυνηθήσεται; Ὅθεν συνιεὶς ὁ Ἡρῴδης ὡς οὐχ ἁπλῶς ἥκεσαν ἐρωτῶντες οἱ μάγοι, λάθρα τούτους καλέσας ἐπυνθάνετο παρ’ αὐτῶν τὸν χρόνον τοῦ φαινομένου ἀστέρος, ἀκριβῶς, λέγων, τὸν τρόπον ὃν ἥκατε φάσκοντες, εἴπατε τὸ τῆς ὑμετέρας <ὁδείας>

And here is the output from Bard AI, unedited:

The Story of the Magi

When a traveler meets a good companion, he rejoices, stealing away the pain of the long journey with conversation. For as if leaning on a staff with a sharp-pointed tongue, he seems to walk with ease, his foot and his mouth tireless. Dividing the fatigue on his knees, he lightens his weary step with his lips.

So too, when Christ was born, the Magi, seeing the star and taking it as their companion, stole away the long journey of the road, inquiring and asking, “Where is the king?” They sought out the voice of the Hebrews, as thieves of the newborn.

It is likely that the Jews asked the questioners, “What do you mean, strangers? What are you saying, men? What dangerous rumor do you bring? Why do you trumpet a new king in a city that is already ruled? Why do you stir up trouble at an inopportune time? Why do you sharpen your tongues against your own necks? Why do you bring a tomb to your lips, awakening death against yourselves? Persia is full of tombs. Why do you ask about another king while Herod is still alive? He would hear your confession with great gratitude and reward you with great gifts.”

But the Magi’s answer to this was short: “We saw his star in the east, and we have come to worship him.” They were not satisfied with asking, but they also said that they would worship him. By this word they declared that the one who was born was God.

When word of this came to Herod, he dismissed the Magi for a time and called the wise men of the Jews, saying, “Where is the Christ born?” And they said, “In Bethlehem of Judea.”

Oh, the wonder! They know the place, and they abhor the birth. They cry out Bethlehem, and they are silent about the economy. They point to Judea, and they deny the lordship. If it is so written, why do you not agree? If you have read, why do you not believe? If he is to be born in Bethlehem, why will he not be worshiped by you? Therefore, Herod, realizing that the Magi had not come simply to ask questions, called them secretly and inquired of them the time of the appearance of the star, saying, “Tell me exactly the way you came, and tell me the time of your journey.”

And here, for comparison, is the Google Translate text, again unedited:

Reason when the magicians were brought to Jerusalem

It was a fellowship of the Synod, he enjoys the pain of the long -standing manner of the mackerel; ITON VIMA.   Even so, when Christ was born, the magicians, seeing the star and receiving this as a companion, asked the rich laborer, the thief of the covenant, when they were cut down, where the king was; they sought out the Jews by voice as the thief of what had been done.   And they will ask them if they are Jews, what do you say, foreigners, you dare, what do you eat, men, fasi?  What dangerous fame have you come? What city do you trumpet the king’s new reign? What do you end up turning against the hour? Why do you stab the throats of necks? Persis, the treasurer of monuments, why should you marry Herod’s living king? He heard many confessions for your sake and rewarded you with great gifts.  But the answer to this to the magicians was short: we saw the star in the east, and we came to worship him. It was not enough to ask, but also to worship; through this word God is declaring what has been done.   And when the rumor of these came to Herod, he called the magicians of old, the wise men of the Jews, saying: Where is Christ born? They said, in Bethlehem of Judea.   Ὢ of the miracle! The place they visit, and the interest they detest; Bethlehem loudly, and the economy quietly; Judea they mark, and despotia they deny. If it is written like this, why do you not compose it? If you read it, why do you not believe? If it is made in Bethlehem, why does it not worship before you? How did Herod know that the magicians did not simply come inquiring, and you secretly called them a priest for them at the time of the vision? stay star, exactly, saying, the way you came, you said that of the mother <Odeia>

Google Translate is now out-dated, but because it uses a different technology – NMT – it acts as a useful check on AI.  For instance the first sentence is paraphrased by AI, rather than translated.  At least one can count the sentences and get an idea if it’s all there!

Likewise Diogenes allows you to click on individual words and get the L&S result for each, which  helps in checking.
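Counting the sentences can be done roughly by machine.  The sketch below is only an approximation: it counts runs of sentence-ending punctuation, treating “;” as the Greek question mark and ignoring the high-point, which marks a clause rather than a sentence.  A mismatch does not prove that anything is missing; it only tells you where to look.

```python
import re

GREEK_TERMINATORS = ".;\u037e!"  # full stop, Greek question mark (both forms), exclamation
ENGLISH_TERMINATORS = ".?!"


def count_sentences(text, terminators):
    """Count runs of sentence-ending punctuation in text."""
    pattern = "[" + re.escape(terminators) + "]+"
    return len(re.findall(pattern, text))
```

Compare count_sentences(greek, GREEK_TERMINATORS) with count_sentences(english, ENGLISH_TERMINATORS) for each paragraph in turn.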

But all the same, the AI translation looks wonderful.  Basically we can now make use of it for ancient and medieval Greek.  So long as we proceed with caution!

I’m not sure whether I want to start working on Ephraem Graecus tho.  What is there in this mass of texts that is going to be interesting?  At the moment I don’t know.

There is another issue with the Ephraem Graecus material.  The edition was made by Assemani, in the 18th century.  He just printed in a heap whatever he found in the manuscripts.  What this means is that some short “texts” are actually just abstracts of other texts.

But which ones?  There ought to be a list, but if so, it has not reached me.

I wonder whether we could get AI to work out the relationships?  After all, the task is basically one of text comparison.  We have all the Greek in electronic form, thanks to the wonder that is the TLG.  So… can we get AI to look through it and tell us?

I think it might be possible.  But there’s only one way to find out, which is to try.  When I get a break, I might experiment a bit!
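Before reaching for AI, it may be worth trying plain text comparison.  The sketch below is not AI at all: it strips the accents, breaks each text into overlapping three-word sequences, and measures what fraction of the shorter text’s sequences also occur in the longer one.  The function names and the choice of three words are my assumptions; an abstract that reuses the wording of its source should score high, while a paraphrase would not.

```python
import unicodedata


def normalise(text):
    """Lower-case, strip accents and punctuation, and return a list of words."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    kept = "".join(
        c if c.isalpha() or c.isspace() else " "
        for c in decomposed
        if not unicodedata.combining(c)  # drop accents and breathings
    )
    return kept.split()


def shingles(words, n=3):
    """Return the set of overlapping n-word sequences."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(short_text, long_text, n=3):
    """Fraction of the short text's n-word sequences found in the long text."""
    small = shingles(normalise(short_text), n)
    if not small:
        return 0.0
    return len(small & shingles(normalise(long_text), n)) / len(small)
```

Running containment for every pair of texts, and listing the pairs that score highly, would give a first list of candidate abstracts to check by eye.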


Methodius ad Theodorum (BHG 1352y) – now online in English

Here is the final version of the “Life of St Nicholas” by Methodius “ad Theodorum” – to Theodore.

The files are also on Archive.org here.  As usual, this material is public domain.  Make whatever use of it you like, personal, educational or commercial.


How does “AI translation” work? Some high-level thoughts

The computer world is a high-bullshit industry.   Every computer system consists of nothing more than silicon chips running streams of ones (1) and zeros (0), however grandly this may be dressed-up.  The unwary blindly accept and repeat the words and pictures offered by salesmen with something to sell.  These are repeated by journalists who need something to write about.  Indeed the IT industry is the victim of repeated fads.  These are always hugely oversold, and they come, reach a crescendo, and then wither away.  But anybody doing serious work needs to understand what is going on under the hood.  If you cannot express it in your own words, you don’t understand it, and you will make bad decisions.

“AI” is the latest nonsense term being pumped by the media.  “Are the machines going to take over?!” scream the journalists.  “Your system needs AI,” murmur the salesmen.  It’s all bunk, marketing fluff for the less majestic-sounding “large language models (LLM) with a chatbot on the front.”

This area is the preserve of computer science people, who are often a bit strange, and are always rather mathematical.  But it would seem useful to share my current understanding as to what is going on, culled from a number of articles online.   I guarantee none of this; this is just what I have read.

Ever since Google Translate, machine translation is done by having a large volume of texts in, say, Latin, a similarly large volume in English, and a large amount of human-written translations of Latin into English.  The “translator” takes a Latin sentence input by a human, searches for a text containing those words in the mass of Latin texts, looks up the existing English translation of the same text, and spits back the corresponding English sentence.  Of course they don’t just have sentences; they have words, and clauses, all indexed in the same way.  There is much more to this, particularly in how material from one language is mapped to material in the other, but that’s the basic principle.  This was known as – jargon alert – “Neural Machine Translation” (NMT).

This process, using existing translations, is why the English translations produced by Google Translate would sometimes drop into Jacobean English for a sentence, or part of it.

The “AI translation” done using an LLM is a further step along the same road, but with added bullshit at each stage.  The jargon word for this technology seems to be “Generative AI”.

A “large language model” (LLM) is a file.  You can download them from GitHub.  It is a file containing numbers, one after another.  Each number represents a word, or part of a word.  The numbers are not random either – they are carefully crafted and generated to tell you how that word fits into the language.  Words relating to similar subjects have numbers which are “closer together”.  So in a sentence “John went skiing in the snow,” both “snow” and “skiing” relate to the same subject, and will have numbers closer to each other than either is to the number for “John.”
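To make “closer together” concrete, here is a toy sketch.  In practice each word or part-word is represented by a list of numbers rather than a single one, and closeness is often measured by the angle between two lists (“cosine similarity”).  The three lists below are invented by me for illustration; real models have hundreds or thousands of numbers per word, and they are generated from text, not written by hand.

```python
import math

# Invented toy values, NOT taken from any real model.
TOY_VECTORS = {
    "snow": (0.9, 0.8, 0.1),
    "skiing": (0.8, 0.9, 0.2),
    "John": (0.1, 0.2, 0.9),
}


def cosine(a, b):
    """Cosine similarity: near 1.0 means 'pointing the same way'."""
    dot = sum(x * y for x, y in zip(a, b))
    length_a = math.sqrt(sum(x * x for x in a))
    length_b = math.sqrt(sum(y * y for y in b))
    return dot / (length_a * length_b)
```

With these made-up values, cosine(TOY_VECTORS["snow"], TOY_VECTORS["skiing"]) comes out much higher than cosine(TOY_VECTORS["snow"], TOY_VECTORS["John"]).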

Again you need a very large amount of text in that language on both sides.  For each language, these texts are then processed into this mass of numbers.  The numbers tell you whether the word is a verb or a noun, or is a name, or is often found with these words, or never found with those.  The mass of numbers is a “language model”, because it contains vast amounts of information about how the language actually works.  The same English word may have more than one number; “right” in “that’s right” is a different concept to the one in “the politicians of the right.”  The more text you have, the more you can analyse, and the better your model of the language will be.  How many sentences contain both “ski” and “snow”?  And so on.  The model of how words, sentences, and so on are actually used, in real language texts, becomes better, the more data you put in.  The analysis of the texts starts with human-written code that generates connections; but as you continue to process the data, the process will generate yet more connections.

The end result is these models, which describe the use of the language.  You also end up with a mass of data connecting the two together.  The same number in one side of the language pair will also appear in the other model, pointing to the equivalent word or concept.  So 11050 may mean “love” in English but “am-” in Latin.

As before, there are a lot of steps to this process, which I have jumped over.  Nor is it just a matter of individual words; far from it.

The term used by the AI salesmen for this process is “training the model.”  They use this word to mislead, because it gives to the reader the false impression of a man being trained.  I prefer to say “populating” the model, because it’s just storing numbers in a file.

When we enter a piece of Latin text in an AI Translator, this is encoded in the same way.  The AI system works out what the appropriate number for each token – word or part-word – in our text is.  This takes quite a bit of time, which is why AI systems hesitate on-screen.  The resulting stream of encoded numbers is then fed into the LLM, which sends back the corresponding English text for those numbers, or numbers which are mathematically “similar”.  Plus a lot of tweaking, no doubt.
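The encoding step can be illustrated with a toy.  The vocabulary below is invented, apart from reusing 11050 for “am” from the example above, and the method (always take the longest piece that is in the vocabulary) is a simplification of my own; real systems learn their vocabulary from the texts, using schemes such as byte-pair encoding.

```python
# Invented toy vocabulary of word-pieces and their numbers.
TOY_VOCAB = {"am": 11050, "at": 7, "or": 8, "em": 9, " ": 1}


def encode(text, vocab):
    """Turn text into a list of token numbers, longest piece first."""
    longest = max(len(piece) for piece in vocab)
    ids = []
    i = 0
    while i < len(text):
        # Try the longest possible piece at this position, then shorter ones.
        for size in range(min(longest, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                ids.append(vocab[piece])
                i += size
                break
        else:
            raise ValueError(f"no token for {text[i]!r}")
    return ids
```

So encode("amat amorem", TOY_VOCAB) splits the text into “am”, “at”, a space, “am”, “or”, “em”, and returns the number for each.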

But here’s the interesting bit.  The piece of Latin that we put in, and the analysis of it, is not necessarily discarded.  It is more raw data, and the provider may keep it and use it when the model is next populated.

This has two interesting consequences.

The first consequence is that running the same piece of text through the LLM on different occasions may give different results, and not necessarily better ones.  The model behind the website may have been repopulated in the meantime, perhaps with your text included.  (In fact even two runs one after the other can differ, because these systems deliberately pick the next word with an element of randomness.)

The second consequence is even more interesting: you can poison a model by feeding it malicious data, designed to make it give wrong results.  It’s all data, at the end of the day.  The model is just a file.  It doesn’t know anything.  All it is doing is generating the next word, dumbly.  And what happens if the input is itself AI-generated, but is wrong?

In order to create a model of the language and how it is used, you need as much data as possible.  Long ago Google digitised all the books in the world, and turned them into searchable text, even though 80% of them are in copyright.  Google Books merely gives a window on this database.

AI providers need lots of data.  But one reason why they have tried to conceal what they are doing is, in part, because the data input is nearly all in copyright.  One incautious AI provider did list the sources for its data in an article, and these included a massive pirate archive of books.   But they had to get their data from somewhere.  Similarly this is why there are free tiers to all the AI websites – they want your input.

So… there is no magic.  There is no sinister machine intelligence sitting there.  There is a file full of numbers, and processes.

The output is not perfect.  Even Google Translate could do some odd things.  But AI Translate can produce random results – “hallucinations”.



Translations of St Nicholas of Myra material on this website

I’ve just created a page on this blog with links to every post that contains a translation of one or the other of the medieval texts containing St Nicholas material.  It’s here.

Looking back, I started taking an interest in 2013.  The first translations of the legends appeared in 2015.  The most recent was earlier today.

That’s a long, long time.  And how things have changed.  Back in 2015, I was commissioning translations from the Greek of various short pieces.  In 2020, Google Translate suddenly became usable, at least for Latin.  And this year, we have the new AI Translators.  It’s possible to do stuff, even if you don’t have much knowledge of the languages.  It’s rather marvellous really!
