Getting a manuscript offline from the Forschungsbibliothek Gotha

The Gotha collection of manuscripts is less well-known than it should be, except to specialists.  But anybody doing anything with English and Cornish and Welsh saints’ lives is aware of a semi-mythical manuscript in that collection, with the shelfmark “Gotha Forschungsbibliothek Membr. I 81”.  These lives are mainly accessed in an abbreviated recension made by John of Tynemouth and printed as “Nova Legenda Anglie”.  What makes the Gotha manuscript special is that it contains unabbreviated versions of some of this same material.

We live in a period of transition, where archives know that manuscript material ought to be accessible online.  But at the moment most archives have limited IT resources, both of infrastructure and people skills.  It’s important for extremely online people to remember this.  There may well be just one person at the other end.

A lot of Gotha manuscripts are online.  Unfortunately the website was clearly designed by a non-manuscript person (not at all uncommon, this!), and it makes it hard to find what is online.  You can’t search by shelfmark.  If they would just put up a single page listing all the manuscripts by shelfmark, with a link to each ms, that would solve it.

Last Tuesday, a mere 6 days ago, I decided to write to the library and ask.  From the list of contacts I selected a certain Dr Henrikje Carius, and enquired.  I didn’t get a reply from her, but the following day I had an email instead from Dr Monika Müller:

Memb. I 81 has been digitized, however, the digital copy has not yet been put online due to the lack of a sufficient catalogue entry. It is provided to put the digital copy online in a project planned for next year. In general, the Research library sells already existing digital scans which not are accessible online for 8 Euro. Please, inform me about how you would like to proceed.

Here we see evidence of a library that is in the transitional period; because it’s hard to see why you would do all the hard work of photography and then not put it on the web, just because of cataloguing.  That’s an old trap that librarians sometimes fall into, because cataloguing is never finished.  All the same this was a very helpful reply.  But clearly we were going to get a version of the old-fashioned labour-intensive manual process that used to happen.

I was wary of the 8 euro charge, trivial as it was.  Accounting for money takes loads of manual labour, more than such a charge would justify.  Anyway I agreed to it, mainly out of curiosity.  The next step was that I was sent a long form in PDF format which was an “estimate”, and asked to complete it.  But also:

My apologies, that I have overlooked one aspect: As the manuscript has 230 folios and therefore the scan 460 images, it takes a lot of time to upload the scan. The library charges fees for this service, i.e. 25 Euro for the scans of Memb. I 81.

I didn’t know it then, but the zip file in question was 10 GB, so it did take a while.  I don’t think I’ve ever been charged for this before, however.  On the other hand, it was not so long ago that a CD would be sent out by post.

The paperwork duly caused problems.  Thankfully this was emailed to me – once, this would have been by post.  That is a step forward.  Unfortunately I was away from home and reading the PDF form on a phone.  I could see no way to enter text.  Emails to and fro.  When I returned home, two days later, I found that the PDF was indeed read-only!   So I printed it off, hand-scribbled my agreement, and scanned it back in and sent it in.  I would guess that I should have been sent a Word .docx file instead.  All transitional stuff.  They need a form online that you can enter the data into.

Once I had emailed the PDF in, things moved swiftly.  Another document in PDF appeared, which luckily I did not have to do anything with.  Then I had to find out just how to send money.  International bank transfer was the sole option.  This is common in the EU, but rarely done outside.  Banks tend to charge 10 euros just for the trouble.  But I was fortunate: since the last time I did this, the banks have introduced easier ways to do it, and the money went over swiftly.  This morning I received a link to the download – the monster 10 GB file!  This I shall stash on 3 external drives.

Inside the zip were all the pages in TIFF format, each about 30 MB.  I was relieved to find that they were all excellent quality colour photographs.  I opened one in MS Paint and saved it as PNG, and the size dropped to 20 MB.  I then saved it as JPG and the size dropped to… 3 MB.  That’s about the size I would expect.

What I want, of course, is a PDF.  I have the tools to create it, and then I can add bookmarks for the various sections of the manuscript.  So the PDF needs to be a reasonable size.

There are about 460 images in the folder, so I’m not doing that conversion manually.  Instead I used ImageMagick.  Looking at my collection of installers, I’ve not done this since about 2011!  But it all worked fine.  I right-clicked on the folder and opened it in Terminal, and then ran:

mogrify -format jpeg *.tif

This ran extremely fast and, in less than a minute, it had merrily converted every .tif image into a brand new .jpeg file in the same directory.  Whatever the image conversion defaults were (some loss of quality, of course), each .jpeg file came out at about 3 MB, and the images looked just as readable for my purposes.  I then fired up Adobe Acrobat Pro 9 – very elderly now, but still working – and combined all the .jpeg files (ignoring endleaves etc) into a PDF.  This itself is a mighty 1.18 GB, but it will serve my purposes very well.
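As a rough sanity check on those file sizes, here is a minimal Python sketch.  It assumes only the round figures quoted in this post (460 images, about 30 MB per TIFF, about 3 MB per JPEG); the function name is my own invention.

```python
def total_size_gb(image_count: int, mb_per_image: float) -> float:
    """Estimated total size in GB, taking 1 GB as 1000 MB."""
    return image_count * mb_per_image / 1000


# Round figures from the post; these are estimates, not measurements.
tiff_total = total_size_gb(460, 30)  # the TIFFs, before zipping
jpeg_total = total_size_gb(460, 3)   # the converted JPEGs

print(f"TIFFs: about {tiff_total:.1f} GB; JPEGs: about {jpeg_total:.2f} GB")
```

The JPEG estimate is of the same order as the finished PDF, which is what you would expect given that the endleaves were left out.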

The next step is to find a list of contents online, and use it to create bookmarks.

Thank you, Dr Müller, and the Forschungsbibliothek staff, for what was a far more efficient process than in the past.


Searching for BHL 6173 (part 2)

In my last post, I started searching online for a manuscript copy of BHL 6173, a miracle story about St Nicholas, which has never been printed.  Two French manuscripts were supposed to contain a copy; neither did.  But two Austrian manuscripts were also listed by the Bollandists in their BHLms database:

  • Heiligenkreuz SB 14
  • Melk SB C.12

Both of these abbeys are in Austria, which has a union site for its manuscripts.  That is a good idea.  All the fully digitised manuscripts they have can be located here, and then you drill down.  So far, so good.

There are 93 fully digitised mss of Melk online!  That’s great news.  I find that “C 12” is the old shelfmark – the site in fact lists a concordance of Melk shelfmarks here, but it is useless unless you know which catalogue your source was working from – unlikely with an old reference.  But it’s a fine idea in principle.

In fact “Melk C 12” is now Melk 546, online here.  It’s a 15th century manuscript, so very late.  But we don’t care about that.

Unfortunately the site has been changed since I last looked at it.  It was frankly rather clunky, but it was entirely usable.  It is now rather quicker to find the actual digitised manuscript.  But otherwise the changes are a disaster.  No researcher can work with this.  Negative changes include:

  • Disabled downloads – at least for the public – and instead tried to force you to use their online browser.
  • Set up that browser menu so that Google Translate can’t translate their pop-up menus.  Non-German speakers are not welcome.
  • Made sure the menu options cannot even be copied, in case you tried to use Google Translate that way.
  • Clicking on “fol. 40 r” instead displays f.36r.
  • There’s no way to download the page that I want.  Links point to the wrong pages.

Somebody has really set out to make the researcher’s job impossible.  There are good, solid reasons why researchers hate librarians. Stuff like this, that makes your life harder, is the reason why.  This has cost me an hour of pain, and in reality the manuscript will now be omitted from my list of witnesses.

The only part of all this that is actually an improvement is that the “Scroll” option in the browser – which, weirdly, is horizontal – is quick.  You can skim through the pages.  On fol. 40r I do find “Quidam praepotens vir“.  Not that I can download the page, of course.

Luckily for me the amount of text that I want is small, and can be screen grabbed.  Here’s the text of BHL 6173.

It’s not hugely readable, to a layman.  I’ll try transcribing it another time.

Blessedly the manuscript also contains BHL 6175, which I am also looking for.  This is only found in the Melk and Heiligenkreuz manuscripts, plus one in Belgium, KBR 07487-07491 (3182), somewhere between fol. 170v-185v, a 13th century manuscript.  But that isn’t online.

What about the Heiligenkreuz 14 manuscript?  Sadly not.  Some of the Heiligenkreuz manuscripts are indeed online, but not this one.  [Update, March 21: Heiligenkreuz 14 is indeed now online].

That’s our four manuscripts, and we have a single hit, which luckily contains both unpublished texts.

But although the Bollandists with their BHL, and BHLms database, are the essential reference, they are not the sole source of all truth.  Google searches can reveal things unknown to the excellent fathers.

Doing so led me to a massive monograph online here, by Sarah Staats, “Le catalogue médiéval de l’abbaye cistercienne de Clairmarais et les manuscrits conservés” (2016).  And on page 64, we learn of a 12th century manuscript, now Saint-Omer 701, which contains part of the Speculum Ecclesiae of Honorius Augustodunensis (who?).  This contains on fol. 121v-122r a “Sermo de sancto Nicolao” (BHL 6173 and 6175).  That manuscript is online and accessible through Mirador.  Here is part of the opening in question!

Which is a nice bonus.  I think we can get a text together using those two witnesses, don’t you?

Have a good weekend, everyone.


Searching for BHL 6173

I’ve gathered nearly 50 miracle stories of St Nicholas, using the wonderful Bibliotheca Hagiographica Latina (BHL) index.  BHL 6173 (beginning “Quidam praepotens vir, accersito aurifice…”; “A certain powerful man, having summoned a goldsmith…”) is an epitome of BHL 6172, so the Bollandists did not trouble to print the text.  So I need to look at the manuscripts of BHL 6173.

Fortunately the online database, BHLms, does give four results for this text:

  • Paris BNF lat. 11570
  • Paris BNF lat. 11576
  • Heiligenkreuz SB 14
  • Melk SB C.12

The last two institutions mean nothing to me as yet, but the BNF in Paris has loads of its mss online.  Indeed I already have a download of 11570 in PDF form on my disk.  I quickly acquire 11576.  But… as I look at it, it becomes clear that the entry in BHLms for this manuscript is garbage.  Something has gone wrong, although I have no idea who to report this to.  For this is an unlisted copy of John the Deacon, BHL 6104-8, not BHL 6173.

BNF 11570 lists 4 miracle texts right at the end, on folios 253r-260r.  The last of these is BHL 6173.  But when I look at folio 253r, I find instead a copy of the Transitus of St Nicholas.  It’s supposed to be BHL 6151, “Rursus autem alio tempore altera mulier de vico Neapoleos…”.

Paris BNF lat. 11570, fol. 253r.

Instead that text appears under a numeral “II” on the last line of the page.  The “transitus” appears to be a version of BHL 6154.  So the two texts, as listed in BHLms, are reversed.  This is not good news – it means that the catalogue is not reliable.

A casual search reveals that the numerals disappear, and the text becomes continuous.  Thankfully the start of each sentence is capitalised.  At fol. 260r there is nothing starting “Quidam”.  Working back, no sentence starts thus.  The text is simply not there.

The other two institutions are Austrian abbeys.  I can’t recall how to locate these for the moment. I will go off and find out!


The Alcobaca manuscripts – catalogue located, and lots online at Lisbon!

In my last post, I referred to a manuscript of the Alcobaca monastery in Portugal, number 113.  Afterwards I started to search for information.  I discovered that the modern catalogue in three volumes by Thomas L. Amos, The Fundo Alcobaça of the Biblioteca Nacional, Lisbon (1988), is online!

  • Vol. 1 –
  • Vol. 2 –
  • Vol. 3 –

This was excellent news, and I naturally looked for manuscript 113 in volume 1.  But it wasn’t there!  The manuscripts had two numbers – a Roman number, and a modern Arabic number. The monastery was suppressed long ago and its holdings ended up in the Biblioteca Nacional, and the Arabic number is what they use.

Doing Ctrl-F in the file – ah, the excellence of searchable PDFs – revealed that my manuscript was CXIII, now 414.  So off I went to volume 3, and there it was, on page 178.  It was volume 3 of a set of homilies.

But this did not refer to St Nicholas!  So I did another Ctrl-F, and found an incipit at the back, on – by coincidence – page 414:

[Text] Nicholaus itaque ex illustri prosapia ortus …. 414: 141a-.

Fortunate for me that I had forgotten to search for “Nicolaus”, and had used “Nicho”!  And this told me that the folio was 141a.  (By now all these 1’s and 4’s were starting to get confusing!)

I then wondered whether any of these Alcobaca manuscripts were online.  There is a website:

But it is very hard to use.  It doesn’t allow you to enter shelfmark, only author, title, etc.  I searched for “V”, for Vita, and got nothing.  Then for H, for Homilies; and on the third click, I found MS 414 here.  Blessedly it has a 108 MB download and a monster 2.4 GB download!  So useful!!!  Thank you!  It is supposed to have an IIIF manifest too, but the link took me elsewhere.

[Homiliário / copiado por João Pecador]. – [Alcobaça, 1201-1300]. – [1] f., [2] f. papel, [254], [1] f. (2 colunas, 27 linhas) : pergaminho, il. color. ; 412×290 mm

But a link here was provided to the full catalogue.  Not that this said anything about Nicholas!

I downloaded the PDF and went to page 282 (i.e. folio 141 x 2), and then down a bit, and there it was!  BHL 6104, the start of John the Deacon.
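The “folio x 2” rule of thumb can be written down as a small Python sketch.  It assumes one page image per side of a leaf, starting at folio 1 recto; the `leading_pages` parameter is a hypothetical allowance for covers and flyleaves, which is why in practice you still have to go “down a bit”.

```python
def folio_to_page(folio: int, side: str = "r", leading_pages: int = 0) -> int:
    """Approximate PDF page for a folio side, one page image per side.

    leading_pages is a hypothetical offset for covers, flyleaves and the like.
    """
    if side not in ("r", "v"):
        raise ValueError("side must be 'r' (recto) or 'v' (verso)")
    page = 2 * folio - 1 if side == "r" else 2 * folio
    return page + leading_pages


print(folio_to_page(141, "r"))  # recto of folio 141
print(folio_to_page(141, "v"))  # verso of folio 141
```

With no offset this puts folio 141 on pages 281 (recto) and 282 (verso), so jumping to page 282 and scrolling from there is a sensible first guess.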

Alcobaca 414, start of John the Deacon

Even better, it started Sic omnis materia, rather than the much more common materies, and someone had written in a correction.

The PDF also comes with some useful bookmarks.

But then disaster!  I tried to add a bookmark of my own and I could not!  The blessed PDF was “secured”, drat it.  Even though everything was marked “public domain”.  I can’t mark it up.

I shall poke around some more.  I might write them a polite note.  I shall try to find the IIIF manifest.

Update: The Biblissima page here gave the IIIF manifest, thankfully, even though the website did not.


It’s starting to work! – Recensio part 4

This afternoon I went to my draft text and translation, and, as per my last post, starting from the top, looked for a place in the text where the editions differed in meaning.  I did not have to go far before I found this place, on “in vocem” or “in clamationem“.

Latin and English text, working notes

To those wondering how I got this, remember that I started my task by creating an electronic text of the Falconius edition, and then translating the whole thing, one sentence at a time.  But when I had finished, I decided that the Mombritius edition text was better.  So I created an electronic text of that, and then I compared the two texts electronically (using dwdiff – but it could have been several tools).  This got me a list of differences.  Then I revised my translation to follow Mombritius.  As I went through the difference list, in order to do so, I noted down the differences that seemed significant to me.  That is, I ignored typos, spelling differences, etc., and only took those where a difference of meaning was apparent.  I noted the meaning as well!  The result was this document, which I am using to work on the text.

Verum, quia scio me penes literatissimos magistros inefficacis esse sermonis, ideo deprecor omnes, qui hujus operis studiosi lectores accesserint, ut non facillimam prorumpant inclamationem,** et me indoctum meque** judicare inertem incipiant.

Until today, I had the Mombritius text, “in vocem” here.

So, just as I did earlier, I opened up my directory of manuscripts, and I started to work my way down the files.

Screen grab of directory in Windows Explorer

Note that I’ve found it endlessly useful to include the century in the file name.

Of course each time I go looking for a passage in the PDF of the manuscripts, I add bookmarks and sticky notes to where I found it.  This does make navigation easier.  I have not attempted to mark up everything in one pass in advance.  Rather I am doing what I need to do as I need it.  After all, I can always come back!

Here is the current state of BNF lat. 2627:

Bookmarks and sticky notes in a manuscript PDF

Apologies for the size.

I placed bookmarks by just picking up on red initials.  So in that picture, I didn’t bother to bookmark Mane itaque – because it’s not one of the main divisions in the text like Pontificalis or Praeterea.  But I could have done.

On my first pass, I added a sticky note for where I was looking at Nacta / Nactus / Notata.  Three lines down, there is “O novi iacob stropha”, from this morning.  I only add a sticky for that where there is an omission, because I always know that it’s just below the Nacta text.

Notice that the sentences in this 11th century manuscript all begin with a small capital.  The big red capitals allow you to find big places in the text.  Once you’re on the right page, the small capitals allow you to find the sentence you want.  When I was looking for these two places, I found myself looking for “Rumpe”!  Because that was a line or two above.

These little tricks all allow you to speed things up.

But back to what I am doing right now.  Well, I clicked on every one of those manuscripts.  And I noted down the reading.

I started, of course, with:

** Mom. “in vocem”; Fal. “in clamationem”, crying out against; Corsi: “in cachinnationem”, in immoderate laughter.

Initially I added the manuscripts after the editions.  But actually it’s better to turn it around, and give the text, with the edition against it, and then add manuscripts on the end.

So I ended up with this:

  • “in vocem” – Mom., Lipp. Means nothing.
  • “in clamationem” or “inclamationem”, crying out against, criticism, abuse – Fal., Angers BM 802 (11th), Balliol 216 (13th), Harley 3097 (1124), BNF lat. 196 (12th), BNF lat. 5284 (13th), BNF lat. 5308 (12th), BNF lat. 5346 (13th), BNF lat. 5624 (13th), BNF lat. 989 (10th), BNF lat. 1864 (14th), BNF lat. 2627 (11th), BNF lat. 18303 (before 968), Bruges BP 402 (13th), Cambridge CCC 9 (11th), Durham B.IV.14 (12th), Fribourg L 5 (13th), Milan P113supp (10th), Munich Clm 3711 (11th early), Orleans BM 342 (10th), Vat. Arch.Cap.S.Pietro A.5 (11th), Vat. lat. 1197 (11th), Vat. lat. 9668 (12th), Vat. reg. lat. 477 (12th), Vat. reg. lat. 496 (11th), Wien ONB 12831 (15th).
  • “in cachinnationem”, jeering, immoderate laughter – Corsi, Berlin theol. lat. qu.140 (11th), Linz 473 (13th), Munich Clm 12642 (14th).

Because I did this immediately after the last post, some of the manuscripts started to sound familiar!  That group at the bottom had an eccentric reading for the “O novi Iacob stropha” search too.  It’s a group, a family of manuscripts that share common errors.  This is precisely what we are looking for: a way to group manuscripts in order to get a stemma if we can.
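The idea of spotting a family can be sketched in a few lines of Python.  The data below is only a small subset of the readings listed in these posts, and the rule used (count the places where two witnesses share a reading other than the most widely attested one) is my own simplification, not a proper stemmatic method; the function name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Subset of the readings recorded in these posts: place -> reading -> witnesses.
collation = {
    "in vocem": {
        "inclamationem": ["Angers BM 802", "Balliol 216",
                          "BNF lat. 2627", "Durham B.IV.14"],
        "in cachinnationem": ["Berlin theol. lat. qu. 140",
                              "Linz 473", "Munich Clm 12642"],
    },
    "O novi Iacob stropha": {
        "O novi iacob stropha": ["Angers BM 802", "Balliol 216",
                                 "BNF lat. 2627", "Durham B.IV.14"],
        "O nova Jacob stropha": ["Berlin theol. lat. qu. 140"],
        "omitted, resuming at Hic est magister bone": ["Linz 473",
                                                       "Munich Clm 12642"],
    },
}


def shared_minority_readings(collation):
    """Count, per pair of witnesses, the places where both carry the same
    reading and that reading is not the most widely attested one there."""
    counts = defaultdict(int)
    for readings in collation.values():
        majority = max(readings, key=lambda r: len(readings[r]))
        for reading, witnesses in readings.items():
            if reading == majority:
                continue  # agreement in the common reading tells us little
            for pair in combinations(sorted(witnesses), 2):
                counts[pair] += 1
    return dict(counts)


for pair, n in sorted(shared_minority_readings(collation).items()):
    print(pair, n)
```

On this subset the Linz and Munich manuscripts agree against the majority at both places, which is the sort of pattern that suggests a group.  The more places that are collated, the more such a count means.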

The first collation I did took quite a while.  The one this afternoon was quicker.  This one took very little time.  Why?  Because I’m getting used to it, and developing my way of working.

Of course I am lucky to have four different early editions.  If I did not have this, if I only had one, then I would have to manually read through a manuscript PDF and manually compare it with my electronic text.  If I didn’t have an electronic text at all, I would have to transcribe one manuscript, and use that as my framework electronic text – not my final text – to translate, and on which to hang readings, in order to analyse the text.

I am rather enjoying this!  Maybe I’ll look for another passage next!


O novam Jacob stropham! – Recensio part 3

The earliest printed editions of a text are often merely a printed version of some manuscript that the editor had to hand; or are based on a prior edition, plus readings from such a manuscript.  In some cases all the manuscripts were destroyed afterwards, and we only have the printed edition.  This is the case with  Velleius Paterculus, and also with Tertullian’s De ieiunio.  So these editions are a “manuscript witness”.

I’ve scanned four such editions of John the Deacon to Microsoft Word, and carried out a machine comparison.  There are quite a few differences.  But in order to establish a “family tree” of manuscripts, which differences are significant?

At the moment I have two tentative guidelines.  They may be wrong, but it’s what I have.

  1.  The scribes do not care all that much whether they put down “at”, “et”, “ac” or “atque” – all of which mean “and” – regardless of which was actually in the text in front of them.  So “variants” which mean the same thing are not really useful to us.  What we need is a difference in the text which has a real difference in meaning.
  2.   Because the endings of so many words are abbreviated in medieval copies – “ū” for “um”, etc – these variants may not be significant either.  Let’s not spend a lot of time over “explicare” vs “explicarem”.
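These two guidelines could be turned into a crude filter over a list of differences.  The following Python sketch is only one narrow reading of them: it treats the four conjunctions named above as interchangeable, and it treats two words as the same if they differ only by a final “m”.  The names are my own, and a real filter would need many more rules.

```python
# Guideline 1: words the scribes treated as interchangeable (list from the post).
INTERCHANGEABLE = {"at", "et", "ac", "atque"}


def _without_final_m(word: str) -> str:
    """Drop one trailing 'm', the letter most often hidden by an abbreviation mark."""
    return word[:-1] if word.endswith("m") else word


def is_significant(first: str, second: str) -> bool:
    """Return False for variants that the two guidelines tell us to ignore."""
    a, b = first.lower(), second.lower()
    if a == b:
        return False
    if a in INTERCHANGEABLE and b in INTERCHANGEABLE:
        return False  # guideline 1: same meaning
    if _without_final_m(a) == _without_final_m(b):
        return False  # guideline 2: probably just an expanded abbreviation
    return True


print(is_significant("et", "atque"))                      # False
print(is_significant("explicare", "explicarem"))          # False
print(is_significant("inclamationem", "cachinnationem"))  # True
```

Anything the filter lets through still has to be judged by eye; it only thins out the list.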

The next real variant is not much further down the text from the last one.  At the dead of night, St Nicholas has secretly visited the house of the poor man, tossed a bag of gold through the window, and secretly disappeared.  So now, time for a quick comparison with a biblical figure! The text continues:

O novi Jacob stropha!**  Ille commentatus est, qualiter Laban, mercedem non amitteret; hic autem, ut coelestibus non privaretur commodis.

O the trick of the new Jacob!**  The former devised it, with Laban, to avoid losing his wages; but the latter, to avoid being deprived of heavenly rewards.

The reference is to Genesis 30:32-3, where Laban agrees to pay Jacob for looking after his sheep by allowing him to keep any offspring that are striped; but, trickily, Laban gives him only monochrome sheep.  Jacob gets round this by putting branches of various colours in the drinking troughs, which cause the sheep to produce vari-coloured offspring.  By his trickery, Jacob gets the wages that he was promised.  St Nicholas, by his own stratagem, gets the heavenly reward promised to those who do good in secret.  It’s not a great comparison, but there’s no doubt that this is what John is attempting to say.

The first three words of the text, however, vary in some interesting ways.  I only have 46 manuscripts at the moment, but here are the readings:

  • O novam Jacob stropham. – Mombritius (1477), Lippomano (1553)
  • O pueri Jacob stropham.  (what?!) – Falconius (1751)
  • O nova Jacob stropha. – Corsi, based on Berlin theol. lat. qu. 140 (11th c.), BNF lat. 5284 (13th c.), BNF lat. 5308 (12th), BNF lat. 5345 (13th), Vat. lat. 1271 (12th c.), Bruges BP 402.
  • O novi iacob stropha. – BNF lat. 2627 (11th c.), BNF lat. 18303 (early 10th c.), Angers BM 802 (11th c.), Balliol 216 (13th c.), BNF lat. 196 (12th c.), BNF lat. 1864 (14th c.), BNF lat. 3791 (12th c.), BNF lat. 3809A (15th c.), BNF lat. 5572 (11th c.), BNF lat. 5573 (12th c.), Durham B.IV.14 (12th), Fribourg L 5 (13th c.), Milan P113 supp. (10th c.), Munich Clm 3711 (11th), Orleans BM 342 (10th c.), Vatican Arch.Cap.S.Pietro A.5 (11th c.), Vat. lat. 9668 (12th), Vat. reg. lat. 477 (12th), Vat. reg. lat. 496 (11th c.), Vienna ONB 12831 (15th c.)
  • O novi iacob tropha – BNF lat. 1765 (13th c.)
  • Omitted: clamque discessit is followed directly by Mane itaque, omitting the whole digression. Vat. lat. 5696 (11th c.), Vat. Arch.Cap.S.Pietro A.3 (12th c.)
  • Omitted: clamque discessit is followed directly by Hic est magister bone, omitting two sentences, but retaining some of the digression, then Mane itaque.  Vienna ONB 416 (12th c.), Klosterneuburg 701 (14th), Linz 473 (13th c.), Munich Clm 12642 (14th).  The Linz manuscript is a contaminated text, however, containing material from BHL 6118.

A couple of oddities:

  • BNF lat. 989 (10th c.) is impossible to read, but the last word is stropha.
  • Cambridge, Corpus Christi College 9: O novi iacob stropha, but, above the “i” in novi there also appears an “a”.

Now by chance I got some help from a Google search.  I wasn’t familiar with the word “stropha”, a stratagem or trick.  Googling the words above produced a passage about this in Jerome’s “Hebrew Questions on Genesis” (Quaest.Heb Ad Gen.30.32-3):

Itaque Iacob novam stropham commentus est, et contra naturam albi et nigri pecoris, naturali arte pugnavit.

Jacob therefore invented a new trick, and by natural art fought against the nature of the white and black cattle.

There’s an awful lot of the same words in there, isn’t there?  Although they’re doing different things.   This perhaps explains why we find all those accusatives like stropham in our text.  Quite possibly they are the result of the copyist being more familiar with Jerome than with John the Deacon. On seeing the unfamiliar text, the copyist “normalised” it.  Jerome has “Jacob” as a nominative, the subject of the verb in his sentence.  But it can’t be the same in John the Deacon.

“Iacob” is usually indeclinable, although the genitive is sometimes written “Iacobi”, “of Jacob”; so we can read it here as a genitive.  The sense is that Nicholas is the new Jacob.  So “novi” and “Jacobi” would agree.  We end up with (in English word order) stropha novi Jacobi, “the stratagem of the new Jacob”.

Of course I only have a selection of manuscripts.  But all the same, it’s clearly necessary to look at them.


From my diary

Another couple of manuscripts were located today, and the relevant portions downloaded.

Today I worked out how to download manuscripts from the Austrian National Library in Vienna, and indeed wrote a little post on how.  One of these is listed on the Bollandists’ website; the other is not, and contains only one part of the text, BHL 6105.  Indeed it seems as if the text has more or less dissolved into the mass of Nicholas material in circulation by that date.

The other two were both from Britain.  Curiously I found the details that I needed using Google searches, and located the IIIF manifests.  The first one I had seen before, but not known how to access the page images.  The second one I knew nothing of until today.

It’s really a case of hitting Google, hitting the websites, trying out different forms of the author’s name – today I tried “Jean le diacre” rather than “John the Deacon”, which gave me the second hit.  It’s really very random a lot of the time.



How to download a manuscript at the Austrian National Library (Österreichische Nationalbibliothek)

This is for all you non-German speakers out there.  Yes, it is indeed possible to download a PDF of manuscripts at the ONB in Vienna!

All the fully-digitised manuscripts for the Österreichische Nationalbibliothek are listed on a page here.  (The link doesn’t look very permanent, so you might have to search for it.)  [Update: drill down from]

But it is very useful to have them all on one page!  Ctrl-F to find the manuscript you want by number.

Here’s a screen-grab of the top of the page:

Click on the manuscript you want.  I’ve highlighted ONB 6.  Here’s the next page:

What you want is the “Volldigitalisat” – “Fully digitised”.  Click on the “Quicksearch” link:

I had to use Chrome’s automatic translate facility to work out which, if any of this, was relevant.  It’s “Online-Zugriff”.  Click there.  That will take you down the page, to somewhere seemingly random:

Clicking on “Digitales Objekt” will, at last, take you to the online manuscript.

Right-click on the image, and the menu above will appear.  This contains the exciting words “Objekt herunterladen” – “Download Object”.  If you click on this, you will be prompted to download a PDF of the “Gesamtes Objekt” – the whole thing.

Marvellous!  Well done the ONB.  Now I can mark up the PDF and do some work on the manuscript.

Update, 11 March 2023: the website has now changed, and none of this now works.  There is a download link on the manuscript itself, but this does not seem to work.  If anybody knows where they have hidden the link, please add a comment.


A way to compare two early-modern editions of a Latin text

There are three early modern editions of John the Deacon’s Life of St Nicholas.  These are the Mombritius (1477), Falconius (1751) and Mai (1830-ish) editions.  I have already used ABBYY FineReader 15 to create a Word document for each containing the electronic text.

But how to compare these?  I took a look at Juxta but did not like it, and this anyway is ceasing to be available.  For CollateX I have only been able to use the online version, and I find the output tiring.  But CollateX does allow you to compare more than two witnesses.

The basic problem is that most comparison tools operate on a line-by-line basis.  But in a printed edition the line-breaks are arbitrary.  We just don’t care about them.  I have not found a way to get the Unix diff utility to ignore line breaks.
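To show what a comparison that ignores line breaks looks like, here is a minimal Python sketch using the standard difflib module.  It splits each text into words, so line breaks vanish along with all other whitespace, and then compares the two word lists.  The function name and the bracket markers are my own choice for illustration; this is a toy, not a substitute for a proper tool.

```python
import difflib


def word_diff(first_text: str, second_text: str) -> str:
    """Compare two texts word by word, ignoring line breaks and spacing.

    Words only in the first text are wrapped in [-...-], words only in
    the second text in {+...+}; common words are left bare.
    """
    first_words = first_text.split()
    second_words = second_text.split()
    matcher = difflib.SequenceMatcher(a=first_words, b=second_words,
                                      autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(" ".join(first_words[i1:i2]))
            continue
        if i1 < i2:  # words present only in the first text
            parts.append("[-" + " ".join(first_words[i1:i2]) + "-]")
        if j1 < j2:  # words present only in the second text
            parts.append("{+" + " ".join(second_words[j1:j2]) + "+}")
    return " ".join(parts)


print(word_diff("O novam Jacob\nstropham", "O novi Jacob stropha"))
```

That prints “O [-novam-] {+novi+} Jacob [-stropham-] {+stropha+}”: the line break in the first text makes no difference to the result.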

Today I discovered the existence of dwdiff, available here.  This can do this quite effectively, as this article makes clear.  The downside is that dwdiff is not available for Windows; only for MacOS X, and for Ubuntu Linux.

Fortunately I installed the Windows Subsystem for Linux (WSL) on my Windows 10 PC some time back, with Ubuntu as the Linux variant.    So all I had to do was hit the Start key, and type Ubuntu, then click the App that appeared.  Lo and behold, a Linux Bash-shell command line box appeared.

First, I needed to update Ubuntu; and then install dwdiff.  Finally I ran the man command for dwdiff, to check the installation had worked:

sudo apt-get update -y
sudo apt-get install -y dwdiff
man dwdiff

I then tested it out.  I created the text files in the article linked earlier.  Then I needed to copy them into the WSL area.  Because I have never really used the WSL, I was a bit unsure how to find the “home” directory.  But at the Bash shell, you just type this to get Windows Explorer, and then you can copy files using Windows drag and drop:

explorer.exe .

The space and dot are essential.  This opened an explorer window on “\\wsl$\Ubuntu-20.04\home\roger” (??), and I could get on.  I ran the command:

dwdiff draft1.txt draft2.txt

And got the output, which was a bit of tech gobbledegook:

[-To start with, you-]{+You+} may need to install Tomboy, since it's not yet part of the
stable GNOME release. Most recent distros should have Tomboy packages
available, though they may not be installed by default. On Ubuntu,
run apt-get install tomboy, which should pull down all the necessary [-dependencies ---]
{+dependencies,+} including Mono, if you don't have it installed already.

The [-…-] stuff is the text found only in the first file; the {+…+} is the different text in the second file.  Other text is common.
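If you want to pull the differences out of that output rather than read them by eye, a couple of regular expressions will do it.  This Python sketch assumes the default markers shown in the sample above and assumes that markers are never nested; the function name is my own.

```python
import re

# Default dwdiff markers as seen in the sample output above.
DELETED = re.compile(r"\[-(.*?)-\]", re.DOTALL)
INSERTED = re.compile(r"\{\+(.*?)\+\}", re.DOTALL)


def extract_changes(dwdiff_output: str):
    """Return (deleted, inserted): the text found only in the first file
    and the text found only in the second, with whitespace tidied."""
    deleted = [" ".join(m.split()) for m in DELETED.findall(dwdiff_output)]
    inserted = [" ".join(m.split()) for m in INSERTED.findall(dwdiff_output)]
    return deleted, inserted


sample = ("[-To start with, you-]{+You+} may need to install Tomboy, "
          "which should pull down all the necessary [-dependencies ---]\n"
          "{+dependencies,+} including Mono")
print(extract_changes(sample))
```

For the sample this gives “To start with, you” and “dependencies --” as the first-file text, and “You” and “dependencies,” as the second-file text.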

There were also some useful options:

  • dwdiff -c draft1.txt draft2.txt added colours to the output.
  • dwdiff --ignore-case file1 file2 made it treat both files as lower case.
  • dwdiff --no-common file1 file2 caused it to omit the common text.

So I thought I’d have a go.

First I went into Word and saved each file as a .txt file.  I didn’t fiddle with any options.  This gave me a mombritius.txt, a falconius.txt and a mai.txt.

I copied these to the WSL “home”, and I ran dwdiff on the two of them like this:

dwdiff falconius.txt mombritius.txt --no-common -i > op.txt

The files are fairly big, so the output was redirected to a new file, op.txt.  This I opened, in Windows, using the free programmer tool, Notepad++.

The results were interesting, but I found that there were too many useless matches.  A lot of these were punctuation.  In other cases it was as simple as “cujus” versus “cuius”.

So I opened my falconius.txt in Notepad++ and using Ctrl-H globally replaced the punctuation by a space: the full-stop (.), the colon (:), semi-colon(;), question-mark (?), and two different sorts of brackets – () and [].  Then I saved.

I also changed all the text to lower case (Edit | Convert Case to | lowercase).

I then changed all the “v” to a “u” and all the “j” to an “i”.

And then, most importantly, I saved the file!  I did the same with the mombritius.txt file.
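The same clean-up could probably be done at the Bash prompt instead of by hand in Notepad++.  What follows is only a sketch, not the procedure described above: the function name “normalise” and the “.norm.txt” file names are made up for illustration, and it assumes the GNU version of tr that comes with Ubuntu.  It writes a cleaned-up copy and leaves the original file alone, which makes it easier to go back to the real text afterwards.

```shell
#!/usr/bin/env bash
# Sketch only: "normalise" is a made-up helper name, not part of dwdiff.
# It reads text on standard input and writes a comparison-friendly copy
# to standard output, so the original file is never modified.
normalise() {
    # 1. Replace . : ; ? ( ) [ ] with a space (GNU tr pads the second
    #    set with its last character, so a single space is enough).
    # 2. Fold ASCII capitals to lower case.
    # 3. Map v -> u and j -> i.
    tr '.:;?()[]' ' ' | tr 'A-Z' 'a-z' | tr 'vj' 'ui'
}

# Usage (the .norm.txt names are hypothetical):
#   normalise < falconius.txt  > falconius.norm.txt
#   normalise < mombritius.txt > mombritius.norm.txt
#   dwdiff falconius.norm.txt mombritius.norm.txt > myop2.txt
```

Note that the lower-casing here only handles the plain letters A to Z; any accented capitals or ligatures in the text would pass through unchanged.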

Then I ran the command again, and redirected the results to a text file.  (I found that if I included the common text, it was far easier to work with.)

dwdiff falconius.txt mombritius.txt > myop2.txt

Then I opened myop2.txt in Notepad++.

This produced excellent results.  The only problem was that the result, in myop2.txt, was on very long lines.  But this could easily be fixed in Notepad++ with View | Word Wrap.
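If the file is going to be read somewhere without a word-wrap option, the long lines could also be broken up before leaving the Bash shell.  This is a sketch using the standard fold utility; the function name “wrap_output” and the output file name are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: "wrap_output" is a made-up helper name.
# fold is a standard utility; -s breaks at spaces rather than mid-word,
# and -w sets the maximum line width (80 columns unless told otherwise).
wrap_output() {
    fold -s -w "${1:-80}"
}

# Usage (myop2.wrapped.txt is a hypothetical file name):
#   wrap_output 100 < myop2.txt > myop2.wrapped.txt
```

A marker pair may end up split across two lines by this, but the text between the markers is unchanged.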

The result looked as follows:

[Screenshot: output from dwdiff, Falconius edition vs Mombritius edition]

The “[-…-]” stuff was Falconius only, the “{+…+}” was Mombritius.  (I have no idea why chapter 2 is indented.)

That, I think, is rather useful.  It’s not desperately easy to read – it really needs a GUI that colours the two kinds of text.  But that would be fairly easy to knock up in Visual Basic, I think.  I might try doing that.

Something not visible in the screenshot was in chapter 13, where the text really gets different.  Also not visible in the screenshot – but very visible in the file – is the end, where there is a long chunk of additional (but spurious) text at the end of the Mombritius.

Here, by the way, is the “no-common” output from the same exercise (with my note on lines 1-2):

[Screenshot: dwdiff no-common output]

This is quite useful as far as it goes.  There are some things about this which are less than ideal:

  • Using Linux.  Nobody but geeks has Linux.
  • Using an oddball command like dwdiff, instead of a standard utility.  What happens if this ceases to be supported?
  • The output does not display the input.  Rather it displays the text, all lower case, no “j” and “v”, no punctuation.  This makes it harder to relate to the original text.
  • It’s all very techy stuff.  No normal person uses command-line tools and Notepad++.
  • The output is still hard to read – a GUI is needed.
  • Because it relies on both Linux and Windows tools, it’s rather ugly.

Surely a Windows tool with a GUI that does it all could be produced?

The source code for dwdiff is available, but my urge to attempt to port a Linux C++ command line utility to Windows is zero.  If there were a Windows version, that would help a lot.

Maybe this afternoon I will have a play with Visual Basic and see if I can get that output file to display in colour?
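
Short of writing a GUI, one stopgap might be to turn the dwdiff output into a small web page, which any browser on Windows can display in colour.  This is a sketch only, and not something tested on the real files: the function name “dwdiff_to_html” and the file name myop2.html are made up, and it assumes the text itself never contains the marker sequences [- -] {+ +} (which holds once the brackets have been stripped out as described above).

```shell
#!/usr/bin/env bash
# Sketch only: "dwdiff_to_html" is a made-up helper name, not part of dwdiff.
# It reads default dwdiff output on standard input and writes a small HTML
# page in which [-...-] text is red and {+...+} text is green.
dwdiff_to_html() {
    printf '%s\n' '<html><body><pre style="white-space: pre-wrap">'
    # Escape &, < and > first, then turn the four markers into span tags.
    sed -e 's|&|\&amp;|g' -e 's|<|\&lt;|g' -e 's|>|\&gt;|g' \
        -e 's|\[-|<span style="color:red">|g' \
        -e 's|-]|</span>|g' \
        -e 's|{+|<span style="color:green">|g' \
        -e 's|+}|</span>|g'
    printf '%s\n' '</pre></body></html>'
}

# Usage (myop2.html is a hypothetical file name):
#   dwdiff falconius.txt mombritius.txt | dwdiff_to_html > myop2.html
```

The resulting file can then be copied out of the WSL area and opened in a browser.  Because the page is wrapped in a pre element with wrapping switched on, the long lines should also fold to the width of the window.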


Fragment of unknown work by Apuleius discovered in Verona

Via Ugo Mondini on Twitter I learn of an exciting find yesterday (May 9) at the Biblioteca Capitolare – the Chapter Library – in Verona.  It seems that an American team – the “Lazarus Project” – using Multi-Spectral Imaging have discovered a lost text by Apuleius.


Quarantasette scatti per ciascuna pagina effettuati con una fotocamera da 150 megapixel. Diversi filtri di luce, dall’infrarosso all’ultravioletto. Poi il computer elabora le immagini. È nata così la scoperta fatta il 9 maggio. In un palinsesto, un testo antico riscritto più volte, lo strato più basso nascondeva un frammento di un testo di Apuleio andato perduto: un commento alla Repubblica di Platone.

Il merito è degli studiosi del “Lazarus Project”, una squadra internazionale dell’università americana di Rochester, per la prima volta alla Biblioteca Capitolare, alla ricerca di segreti tra le pagine.

Google Translate:

Forty-seven shots per page taken with a 150 megapixel camera. Different light filters, from infrared to ultraviolet. Then the computer processes the images. Thus was born the discovery made on May 9th. In a palimpsest, an ancient text rewritten several times, the lower layer hid a fragment of a lost text by Apuleius: a commentary on Plato’s Republic.

The credit goes to the scholars of the “Lazarus Project”, an international team from the American University of Rochester, for the first time at the Chapter Library, in search of secrets among the pages.

There is a video in Italian at the news site – if anyone has the spoken Italian, perhaps they could advise whether it has extra details?

There is stuff out there, people!  It really is worth going and looking!