Thinking about ways to display Latin syntax information in a translation tool

Most of us probably learned Latin at school.  Those lessons focused on grammar – amo, amas, amat – and also on rote learning of vocabulary.  All of this is essential, and I really wish that I could remember more of it than I can today.

But this focus means that questions of Latin syntax are often dealt with only superficially, or not at all.  I saw evidence of this back in 2006, when I was running the project to translate Jerome’s Chronicle.  Anybody could contribute by doing an entry.  Often I would see people stumble on something like an ablative absolute, through sheer ignorance.

It occurs to me that some people reading this won’t know what that is, so I’d better try to explain as simply as I can.  Let’s look at this Latin sentence.

Urbe capta, cives fugerunt.

???, the citizens fled.

Urbe is the ablative of the noun urbs, urbis = “city”, so by itself it would ordinarily mean “by/with/from the city”.  The gender is feminine.  It’s singular.

capta is also in the ablative, but is a perfect passive participle of the verb capio, capere, etc = “capture, seize”.  By itself it would mean “having been captured”.  It too is in the feminine gender, and also singular, so it agrees with Urbe in case, number and gender.

The combination is an ablative absolute – the word “absolute” is just noise – meaning “the city having been captured”, or, in better English “after the city had been captured”, and indicates time.  A noun and a participle in the ablative and agreeing with each other … start thinking “ablative absolute”.

This is a Latin construction.  The term “ablative absolute” is just a label for this Latin construction, where they put the words together to indicate something not found in the bare words individually.  It’s just one of the bits of know-how that you need for Latin, and it’s really really common.

There are many other such bits of trickery.  Students are taught how to recognise them.  This stuff is what you memorise.

Now we have quite a few tools on the web for handling grammar.  There are my own QuickLatin, Whitaker’s Words, and probably many more that I haven’t come across.  A “lexical parser” is not that uncommon.

But none of these signal these kinds of structures.

For the last week or so I’ve taken Morwood’s Oxford Latin Grammar to bed with me, and I’ve been reading through the descriptions of Latin clauses and structures which make up the second half of the book.  It is very clear, to be sure.  But tired brains do not absorb this sort of thing very well, and most readers of this blog will have jobs and other tasks to attend to.  And … do we need to rote learn these things?  Truly?

It’s a UI or UX problem, in a way – User Interface or User Experience.  How could this information be presented to somebody with a line of Latin text in front of them?  If we hover over the individual words, we can have the grammar laid out for us alright, like this:

But how do we signal to the reader that “urbe capta” is an ablative absolute, and pop up some kind of info about how to handle them?

There are two problems here.  The first is how to detect the presence of such a construction.  I suspect that those familiar with algorithms will have ideas in mind already, perhaps about “fuzzy logic” or “AI” or whatever.
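To make the detection problem concrete, the rule of thumb above – a noun and a participle, adjacent, agreeing in ablative case, number and gender – could at least be tried as a crude first filter over the analyses that a morphologising tool already produces.  Something like the following sketch, in VB.NET; the Analysis class and its fields are invented for illustration and are not the real QuickLatin data structures.

    Imports System.Collections.Generic

    ' A crude first pass: flag an adjacent word pair where one word can be
    ' an ablative noun and the next an ablative participle agreeing with it
    ' in number and gender.
    Public Class Analysis
        Public PartOfSpeech As String   ' "noun", "participle", ...
        Public CaseName As String       ' "ablative", "genitive", ...
        Public Number As String         ' "singular" or "plural"
        Public Gender As String         ' "m", "f" or "n"
    End Class

    Public Module AblativeAbsoluteDetector
        Public Function LooksLikeAblativeAbsolute(ByVal word1 As List(Of Analysis), _
                                                  ByVal word2 As List(Of Analysis)) As Boolean
            For Each a As Analysis In word1
                If a.PartOfSpeech = "noun" AndAlso a.CaseName = "ablative" Then
                    For Each b As Analysis In word2
                        If b.PartOfSpeech = "participle" AndAlso b.CaseName = "ablative" _
                           AndAlso b.Number = a.Number AndAlso b.Gender = a.Gender Then
                            Return True   ' a candidate, not a certainty
                        End If
                    Next
                End If
            Next
            Return False
        End Function
    End Module

Real Latin allows words to intervene, and the ablative has plenty of other uses, so something like this would only ever flag candidates for the reader to judge – which is exactly where the “fuzzy” part comes in.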

Then, once we recognise that this is, or might be, such a construction, how do we signal it to the user?

I’m not sure of the answer to either of these questions, to be honest.  But I’m thinking about it.  This information could be, and should be, captured and condensed.  It needs to be indexed in a way that allows you to find it from the sentence, rather than in the way that grammars tend to present it.
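One way to picture that indexing is to treat each construction as a little card: a trigger pattern on one side, a condensed note on the other, so that the tool looks things up by what it sees in the sentence rather than by chapter and verse.  Purely a sketch, with invented names:

    ' Hypothetical "rule card", keyed by the surface pattern that triggers it
    ' rather than by where it sits in a grammar book.
    Public Class SyntaxNote
        Public Name As String        ' e.g. "Ablative absolute"
        Public Trigger As String     ' e.g. "noun + participle, both ablative, agreeing"
        Public PopupText As String   ' the condensed "how to handle this" note
    End Class

A tool would hold a collection of these, and show the PopupText whenever a detector of the kind sketched above fires.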

Ideas are welcome!


From my diary

A couple of things have held my attention in the last few weeks.  Firstly I have been working on the QuickLatin codebase.  The migration to dotNet is complete, and it is now a question of firing stuff at it and finding out why it breaks!  I’ve also updated the dictionaries to the latest version.

Basically I can now enhance it as I like, which was the purpose behind doing all this in the first place.  It might be an idea to merge into it some translation tools that I have created over the years.  The main user of this will undoubtedly be me, so I may as well make myself comfortable.

The other piece of work is the ongoing translation of the very ancient Life of St George.  This is in 21 chapters.  The translator has done a draft of chapters 1-12, which I have revised and made ready for release.  I have in turn prepared a draft of chapters 18-21, which I have sent to the translator for comment.  I am now working on chapter 17, and using bits from it to test out QuickLatin.  The completed translation will of course be released online as public domain once it is done.

Easter is now behind us.  I had meant to do an Easter post, but somehow I got distracted.  I spent quite a bit of my downtime on Twitter fighting the “Easter is pagan” jeer that is circulated every year by the malicious and their innocent dupes.  This year the fight really got some traction behind it, and a number of people were patrolling and posting corrective links.  Alas it is probably an unwinnable battle, at least while the false story is agreeable to a certain sort of influential person; but it is something to have tried.

I have enquired about access to the Ipswich Museum files in Suffolk Record Office, in order to locate the survey of the Roman fort by the 1969 sub-aqua expedition.  The archivist has now looked at these, and found nothing.  It looks very much as if the report has been mislaid in the last 20 years.  However I can’t even go and look at the files; I’m told that permission to view the items must be obtained from Ipswich Museum, and their response time is six weeks (!).  I have written, of course.  But it is a forlorn hope.

We must always be grateful for the internet and the ready availability of research materials on the web.  I certainly am!


From my diary

I’ve had a busy week, ending with a rather splendid college reunion.  But of course everything else has gone out of the window, and I also have rather a large sleep debt to pay off.

Today brings another chunk of translation of an early Latin Vita of St George.  Chapters 9 and 11 are in my inbox now.  The version is a very rough draft.  The only difficulty is that the translator doesn’t read my emails with feedback, and so makes the same mistakes every time.  This means that I shall have to correct and finish it myself.  I hope to do the job on these chunks this week.  The translation is going forward nicely, tho; some 8 chapters still to do.

Today also brought a welcome email from the Colchester and Ipswich Museum Service with unwelcome news.  In 1969 a team of divers surveyed the ruins of a Roman fort in the sea off Felixstowe, known locally as Walton Castle.  A report was filed with the museum, and was accessible a decade ago.  The email today tells me that they cannot locate it now.  I have written therefore to the sub-aqua club, who may have it in their files.  Another email went to the Suffolk Institute of Archaeology, who published the article mentioning the survey, to see if I can get in contact with the author in case he has a copy.  We tend to think of museums and archives as safe repositories.  But the truth is that history is vanishing before our eyes.  So it has always been.

Last week I was working industriously on the new QuickLatin.  This is going well, and crude errors are disappearing.  I must get a version released online, as a base version for further work.

My backlog of interesting topics to blog about continues to increase.  So much to do!


From my diary

It is Saturday evening here.  I’m just starting to wind down, in preparation for Sunday and a complete day away from the computer, from all the chores and all my hobbies and interests.  I shall go and walk along the seafront instead, and rest and relax and recharge.

Sometimes it is very hard to do these things.  But this custom of always keeping Sunday free from everything has been a lifesaver over the last twenty years.  Most of my interests are quite compelling.  Without this boundary, I would have burned out.

Phase 2 of the QuickLatin conversion from VB6 to VB.Net is complete.  Phase 1 was the process of getting the code converted, so that it compiled.  With Phase 2, I now have some simple phrases being recognised correctly and all the obvious broken bits fixed.  The only exception to this is the copy protection, which I will leave until later.

Phase 3 now lies ahead.  This will consist of creating automated tests for all the combinations of test words and phrases that I have used in the past.  Code like QuickLatin has any number of special cases, which I have yet to exercise.  No doubt some will fail, and I will need to do some fixes.  But when this is done, the stability of the code will be much more certain.  Meanwhile I am trying to resist the insidious temptation to rewrite bits of the code.  That isn’t the objective here.
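To give an idea of what Phase 3 looks like in practice – and this is only an illustration, with an invented Analyser class rather than the real QuickLatin internals – each of those old hand-run checks becomes a small automated test, e.g. with MSTest:

    Imports Microsoft.VisualStudio.TestTools.UnitTesting

    ' Sketch of a Phase 3 regression test: feed in a word that used to be
    ' checked by hand and assert on the analysis that comes back.  The
    ' Analyser class and its Lookup method are placeholders, not real code.
    <TestClass()> Public Class ParserRegressionTests

        <TestMethod()> Public Sub Amas_IsSecondPersonSingularPresent()
            Dim analyser As New Analyser()
            Dim result As String = analyser.Lookup("amas")
            Assert.IsTrue(result.Contains("amo, amare"), "dictionary entry not found")
            Assert.IsTrue(result.Contains("2nd person singular"), "parse not recognised")
        End Sub

    End Class

Once a few dozen of these exist, any regression shows up at the press of a button, instead of by retyping test words by hand.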

I began to do a little of this testing over the last few hours.  Something that I missed is code coverage – a tool that tells me visually how much of the code is covered by the tests.  It’s an excellent way to spot edge-cases that you haven’t thought about.

It is quite revealing that Microsoft only include their coverage tool in the Enterprise, maximum-price editions of Visual Studio.  For Microsoft, plainly, it’s a luxury.  But to Java developers like myself, it’s something you use every day.

Of course I can’t afford the expensive corporate editions.  But I think there is a relatively cheap tool that I could use.  I will look.

Once the code is working, then I can set about adding the syntactical stuff that caused me to undertake this in the first place!  I have a small pile of grammars on the floor by my desk which have sat there for a fortnight!

I’m still thinking a bit about the ruins of the Roman fort which lies under the waves at Felixstowe in Suffolk.  This evening I found that another article exists, estimating how far the coast extended and how big the fort was.[1]  It’s not online, but I think a nearby (25 miles away) university will have it.  I’ve sent them a message on twitter, and we’ll see.*

I’ve also continued to monitor archaeological feeds on twitter for items of interest.  I’m starting to build up quite a backlog of things to post about!  I’ll get to them sometime.

* They did not respond.

[1] J. Hagar, “A new plan for Walton Castle Suffolk”, Archaeology Today vol. 8.1 (1987), pp. 22-25.  It seems to be a popular publication, once known as Minerva, but there’s little enough in the literature that it’s worth tracking down.

From my diary

I’ve been continuing to work on QuickLatin.  The conversion from VB6 to VB.Net is horrible, but I am making real progress.

The key to it is to change the VB6 project so that it will convert better.  For instance, there are various places at which I make a raw Win32 API call, because VB6 simply doesn’t offer the feature.  These must mostly go.  I replace them with slower equivalents using mainstream VB6 features.  In some cases I shall simply have to rewrite the functionality; but this is mainly front-end stuff.

All the same, the key point is to ensure that the VB6 project continues to work.  It is essential not to allow this to fail, or develop bugs.  This is one area where automated unit tests would be invaluable; but of course that concept did not arise until VB6 was long dead.  So I have to run the program manually and do a few simple tests.  This has worked, as far as I can tell.

The objective is to have a VB6 project that converts cleanly, and works out of the box.  It may be slower, it may have reduced functionality in peripheral areas.  But the business logic remains intact – all those hand-crafted thousands of lines of code still work.

It’s going fairly well.  I’ve been working through known problems – arrays that need to be base 0 rather than base 1, fixed strings inside user-defined types that have to go.  There is a list on the Microsoft site of the likely problems.
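For anyone staring at the same Microsoft list, the two items above come out roughly like this on the far side of the conversion.  This is only a sketch of the VB.NET end state, with the old VB6 declarations shown in the comments and the names invented:

    Public Module MigrationSketch

        ' VB6: Dim endings(1 To 50) As String      -- a base-1 array.
        ' VB.NET arrays are always base 0, and the number in brackets is the
        ' highest index, so this gives 50 slots, numbered 0 to 49.
        Public endings(49) As String

        ' VB6: Type DictEntry
        '          Stem As String * 20             -- fixed string inside a UDT.
        '      End Type
        ' The user-defined type becomes a Structure holding an ordinary String;
        ' the fixed length disappears, which is acceptable here because the
        ' file format that relied on it is changing anyway.
        Public Structure DictEntry
            Public Stem As String
        End Structure

    End Module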

Today I had my first attempt at running the VB.Net 2008 Upgrade Wizard.  It failed, as I expected it to do.  The purpose was to identify areas in VB6 that needed work.  But the converted code only had 37 errors.  Only 3 of these were in the business logic, rather than the front-end, and all were easily fixed in VB6.  There were also a large number of warnings, nearly all of them about uninitialised structures.  Those can wait.

So my next stage is to do something about the 34 front-end errors.  Probably I shall simply have to comment out functionality.  Splitters are done differently in VB.NET.  The CommonDialog of VB6 no longer exists to handle file opening.  That’s OK… I can cope with rewriting those.
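Neither is very painful, as far as I can see: the splitter becomes a SplitContainer control, and the old CommonDialog “ShowOpen” becomes the OpenFileDialog class.  A minimal sketch of the latter – the filter string is just an example:

    Imports System.Windows.Forms

    Public Module FileOpenSketch
        ' Replacement for the VB6 CommonDialog file-open pattern.
        Public Function AskForFile() As String
            Using dlg As New OpenFileDialog()
                dlg.Filter = "Text files (*.txt)|*.txt|All files (*.*)|*.*"
                If dlg.ShowDialog() = DialogResult.OK Then
                    Return dlg.FileName
                End If
            End Using
            Return Nothing
        End Function
    End Module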

It has reminded me how much I like programming tho.

In the middle of this enormous task, of course, there is no lack of people who decide to email me about some concern of their own.  So … polite refusals to be distracted are now necessary.  I hate writing those.  But a big project like this can’t get done any other way.


From my diary

It’s been an interesting couple of days.

I was working on the Passio of St Valentine, and I really felt that I could do with some help.  So I started browsing grammars.

This caused me to realise that many of the “rules” embedded in them were things that you’d like to have pop up, sort of as an informational message, when you were looking at the sentence in a translation tool.

This in turn reminded me that my own morphologising tool, QuickLatin, was available and a natural candidate for such a thing.

This is written in Visual Basic 6.  I wrote most of it, actually, in Visual Basic for Applications, inside a MS Access database, during 1999.  (The language choice was dictated by the machine that I had available at the time, which had no development tools on it).  I then ported it to Visual Basic 6.  Microsoft then kindly abandoned VB6, without even a migration path, some time in the early 2000s.  This left me, and many others, stuck.  It is not a trivial task to rewrite 24,000 lines of code.

So where was my development environment?  I pulled out the last four laptops that I have used; I have them, because I keep all my old machines.  I found it on my Windows XP machine.  The machine started up OK!  In fact the batteries on the Dell laptops all started to charge, unlike a Sony Vaio which had Windows 7 on it.

The Windows XP machine had a tiny screen and was very old.  Could I perhaps install VB6 on Windows 10 instead?  The answer swiftly proved to be a resounding “no”.  But I gathered a large number of tips from the web while doing so.

Then I tried installing VB onto my travelling laptop, which has Windows 7 on it, using all the info that I had.  The installation failed; but the software seemed to be installed anyway!

Then I tried doing it again on Windows 10.  This time I had a sneaky extra bit of information – to set the SETUP.EXE to run in Windows XP compatibility mode.  And … again it failed; but as with Windows 7, I could in fact still run it!

The process was so fraught that I knew that I’d never remember all the fixes and tips.  So I compiled all the bits together, hastily, into a reference guide on How to Install Visual Basic 6 on Windows 10, for my own use in days to come.

After two days of constant pain, I was at last in a position to work on the code!

But I wasn’t done yet.  I really would rather not work with VB6 any more.  Not that I dislike it; but it is emphatically a dead toolset.  My attempts to convert my code to VB.Net all failed.

But since I last looked, more tools have become available.  My eye was drawn to a commercial product which Microsoft themselves recommended: VBUC, by a firm called Mobilize.net.  You could get a free version which would convert 10,000 lines.  Surely, I naively thought, that would be enough for me?

Anyway I downloaded VBUC, and ran it, and discovered to my horror that I had nearly 30,000 lines of code!  But I set up a tiny test project, with half-a-dozen files borrowed from my main source project, and converted that.  The process of extracting a few files drew my attention to what spaghetti the codebase had become.  It was not trivial to just take a few.  This in turn made me alter the extracted VB code a bit, so that I could use it.

Converting the extract required some manual fixing, but it did work in the end.

I was quite impressed with some of the conversions.  One of the StackOverflow pages had indicated that the firm were charging a couple of hundred dollars for the tool, back in 2010.  So I emailed to ask what they were charging now.

Mobilize.net then got a bit funny on me.  Instead of telling me, they asked me to tell them what I wanted it for.  I replied, briefly.  Then they wanted me to run an analyser tool on my code and send it in.  I did.  Then they wanted more details of what it did.  Quite a few emails to and fro.

By this stage I was getting fed up, and I pushed a bit.  They finally came back with a price, based on lines of code, of around $4,500!  That was ridiculous, and our exchange naturally went no further.

However I had not wasted my time, for the most part.  I could now see what the tool might do.  My code may be elderly, but the changes it needs are basically the same few patterns, repeated throughout.  It is quite possible that I could write my own tool to do the limited subset of changes that I need.

One problem was not handled well: QuickLatin loads its dictionaries as binaries, created by another tool of my own.  I found that VB.Net would not handle these, whatever I did.  The dictionaries would need to be regenerated in some other format.

So I spent some time experimenting with an XML format.  I quickly found how slow the VB6 file i/o was.  Reading a 20 MB file using VB native methods took 4 seconds.  Using MSXML to load the file and parse it into a linked list took 1.7 seconds!  I didn’t want the linked-list method; but it was clear that the VB native methods were hideously inefficient.

I soon discovered complaints online that the VB.Net i/o did not support the methods used by VB6 and was even slower!  I’ve encountered problems of this sort before, which I got around by dropping into C++ and accessing the files through bare metal.  Clearly I would have to do so again.

Another problem that VBUC showed me was that VB6 fixed-length strings were not really supported by VB.Net.  There was some sort of path, but it was horrible.  However there was, in fact, no reason to go that way; the file i/o, for which they were used, would have to change anyway.

I placed my code base under source control, using Git.  Then I started cautiously making changes, checking that “amas” was giving sensible results – for unit tests were unknown in the days of VB6 – and committing regularly.  This proved wise; several times I had to go back to the last commit.

I spent quite a bit of time removing superfluous fixed strings from the code.  This was not trivial, but I made headway.

Something else I did, once I realised that coding lay ahead, was to rig up an external monitor, keyboard and mouse to my laptop.  I would have rigged up two, but there was no way to turn off the laptop screen – when you close the lid, the machine goes to sleep and that’s that.  On a commercial laptop, I’d set it to turn off the laptop screen and stay running.  Most graphics cards will support two monitors; the home laptops won’t support three.  Oh well.  But it was still better for serious work than using the laptop screen and keyboard alone.

Finally I started creating dictionary loading routines that would convert to VB.NET.  They are much slower; but I can optimise them when I get the code into VB.NET.  They have to change, come what may.  The key thing is to keep the program running and working at all times.  Take it slow, little by little.  If I take it apart into a million pieces, it will never get back together again.  Indeed I have made this mistake before.
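The “slow but safe” approach is nothing cleverer than a plain text format read line by line, which both the old code and the new can cope with.  On the VB.NET side it would come out something like the sketch below; the tab-separated layout is invented for illustration and is not the real dictionary format:

    Imports System.IO
    Imports System.Collections.Generic

    Public Module DictionaryLoader
        ' Slow-but-safe loader: one entry per line, headword and analysis
        ' separated by a tab.  Slower than reading a binary image, but
        ' portable, debuggable, and easy to optimise later.
        Public Function LoadEntries(ByVal path As String) As Dictionary(Of String, String)
            Dim entries As New Dictionary(Of String, String)
            For Each line As String In File.ReadAllLines(path)
                Dim parts() As String = line.Split(ControlChars.Tab)
                If parts.Length >= 2 Then
                    entries(parts(0)) = parts(1)
                End If
            Next
            Return entries
        End Function
    End Module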

Back in the 90s, automated unit tests, continuous integration, test-driven development and dependency injection were all unheard of.  I have really missed having a set of tests that I can run to check that the code has not broken in some subtle way.  This again is a reason to migrate to VB.Net, where such is possible.  I did write test stubs in the original VBA, but there was no way to run them within VB6.  At least I have them still, and they can form the basis for unit tests.

So … it’s been a very busy few days indeed.  Nothing to show for it, to many eyes; but I feel optimistic.

The next challenges will be to change the other dictionaries over to the slow-but-safe method, and then remove all the stuff that supported the other approach.  This should simplify the code mightily.  Once this is done, then it will be time to attempt to convert the code.  Somehow.  All I need is time, and with luck I shall have some of that this week.

It is remarkable how far down the rabbit-hole one must go, just to get a bit of online help!
