[Original review, Jul 24 2018]
This month, three plotlines in my life collided. I know Swedish and Norwegian well, and I'd thought vaguely from time to time that I'd like to learn Icelandic too; I've always been a great admirer of Tolkien, and I knew he had been interested in Icelandic; and I have a couple of Icelandic friends. But none of this had ever come to anything. Last week, however, Jupiter aligned with Mars and I entered the Age of Aquarius. I'd just finished reading Tolkien: Maker of Middle-Earth, which has many striking passages in Icelandic, Old Norse and Old English, and our friend K happened to be in Iceland. Fired with enthusiasm by Tolkien's love of these obscure but wonderfully poetic languages, I asked K if she could possibly get me one or two Icelandic children's books. I just don't know how to thank her: she turned up with not one or two but half a dozen books, including my favorite, Le petit prince. I spent the next few days carrying it with me everywhere, snatching all opportunities to try to make sense out of it.
For people who don't know anything about Icelandic, it has the same ancestor as Swedish, Danish and Norwegian. A thousand years ago they were the same language. But the mainland languages have evolved at a normal rate, while Icelandic, on its faraway island, has changed relatively little; so if you speak Swedish or Norwegian, it's like trying to read a language which for an English-speaker would be somewhere between Chaucer and Beowulf. You recognise a few of the words at once, others are more or less mangled, and still others are completely unfamiliar. The first impression is that it makes no sense at all. But I know Le petit prince, and I started trying to guess what word was what, just reading without looking anything up.
It was amazing to see how well this worked. For example, let me show you the following sentence:
Þar sem ég hafði aldrei teiknað kind dró ég upp fyrir hann aðra af þeim tveimur myndum sem ég var fær að gera: myndina af kyrkislöngunni utanverði.
The first time I saw this, there were only a couple of words I felt at all sure about. Upp and var must be the same words as in Swedish ("up" and "was"). I soon figured out that ég was "I" (it is the same word in some Norwegian dialects), að was att ("that"), and hann was han ("he/him"). The words mynd and kind weren't like anything I recognised, but they were common, and having already come across them I realised they must be "drawing" and "sheep". As I read the book for the second time, the other words gradually fell into place too, and after a while I could read it as sort-of-Swedish:
Då som jag hadde aldrig tecknad får drog jag upp för honom den-andra av dem två teckningarna som jag var för att göra: teckningen av pytonormen utifrån.
which I might render into sort-of-English as:
Then as I had never drawn sheep pulled I up for him the-second of the two drawings which I was able-to make: the-drawing of the-python from-outside.
I recalled that there was a sentence something like this near the beginning of the story: it all made sense.
How does it work? I've been reading deep learning theory, and it's tempting to conceptualise it in terms of strengthening of neural pathways. I see a word I don't know, and I think of some words it could be: aðra to a Swedish-speaker first looks like ådra, "vein", and you only later think of andra, "second". This word occurs quite often. "Vein" never makes any sense, but "second" often makes good sense. So the pathway for ådra never gets strengthened but the one for andra does, and after a while my eyes just start seeing it as andra. The same thing happened with numerous other words. As I'm sure many language geeks will attest, it is such a weird and interesting feeling to find the sense emerging from words which initially looked like gibberish! I'm sorry if I've gone into too much detail here, but I wanted to explain what I mean when I say it's like doing drugs. You actually feel the text changing your state of consciousness.
Well, I'm hooked. Though so far, I've just barely started: the grammar is still a mystery to me. All the same, on my latest read-through I notice that the endings of nouns and verbs, which at first looked quite random, now seem to be displaying some recurrent patterns...
_________________________
[Update, Aug 6 2018]
I have been making efforts to understand in more quantitative terms what I've been doing here. First, I thought it would be a good exercise to try copying out the text of Litli prinsinn: this would force me to look carefully at every letter, and also give me a machine-readable version that I could analyse. I'm now about three-quarters of the way through (he has just said goodbye to the fox). I tried running my incomplete corpus, which contains about ten thousand words, through a script that Not and I developed last year.
The script is simple but quite useful. It counts frequencies for all the words in the corpus, then builds a hyperlinked concordance which shows me up to ten examples for each word. Every word is clickable, so I can take a word I'm unsure of in a sentence and see examples of that word in other contexts. There is a master index which lists all the words in descending frequency order. Here are the first 50 lines. The 'Freq' column gives the number of times the word occurs, and the 'Cumul' column gives the cumulative frequency:
All of these 50 words (to be exact, some of them are punctuation marks) are now very familiar to me, and as you can see they make up more than 50% of the text. I tried walking down the list to see when I stopped feeling confident. I can go as far as words with four or five occurrences, and I think I know what nearly all of them mean; that brings me up to about 400 words, and 75% of the total. When I look at words occurring two or three times, I start to feel uncertain, but I still think I know the majority of them. That gets me to 900 words and 86%. The 1600 words which only occur once are of course the hardest; but even here I feel I can guess a lot, perhaps a third to a half of them.
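For anyone who wants to try the same exercise, here is a minimal Python sketch of a master index of this kind. It is not the actual script: the function names (`tokenize`, `master_index`, `coverage`), the lowercasing, and the choice to count each punctuation mark as its own token are all assumptions made for illustration. The sample text is just two sentences from the book, so the numbers it produces say nothing about the real corpus.

```python
import re
from collections import Counter

# Assumption: a token is either a run of letters/digits or a single
# punctuation mark. In Python 3, \w matches Icelandic letters such as
# þ, ð and æ as well as ASCII.
TOKEN = re.compile(r"\w+|[^\w\s]")

def tokenize(text):
    """Split text into lowercased word and punctuation tokens."""
    return [t.lower() for t in TOKEN.findall(text)]

def master_index(tokens):
    """Return rows (word, freq, cumul, cumul_percent) in descending frequency order."""
    counts = Counter(tokens)
    total = len(tokens)
    rows = []
    cumul = 0
    # Ties in frequency are broken by plain string order so the output is stable.
    for word, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        cumul += freq
        rows.append((word, freq, cumul, 100.0 * cumul / total))
    return rows

def coverage(rows, min_freq):
    """How many distinct words occur at least min_freq times, and what
    percentage of the running text they account for."""
    frequent = [row for row in rows if row[1] >= min_freq]
    if not frequent:
        return 0, 0.0
    # Rows are sorted by descending frequency, so the frequent words form
    # a prefix and the last one carries the cumulative percentage.
    return len(frequent), frequent[-1][3]

sample = "Hann hefir aldrei horft á stjörnu. En þú ert hreinn og þú kemur frá stjörnu."
rows = master_index(tokenize(sample))
for word, freq, cumul, percent in rows[:5]:
    print(f"{word:10} Freq={freq:3} Cumul={cumul:4} ({percent:.1f}%)")
print(coverage(rows, 2))
```

Calling `coverage(rows, 4)` on a real corpus would give the kind of figure quoted above for "words with four or more occurrences", under the same tokenisation assumptions.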
Copying out the text has sharpened my understanding of the grammar a good deal, and now I recognise quite a few endings, though I'm still pretty hazy about the nouns. With multiple genders, multiple cases and marking for definiteness, there are many combinations, and I only know the most common ones.
It's surprising that one can extract so much information from a tiny sample of just ten thousand words. I'll see if I have the patience to finish this and then do Ævintýri Lísu í Undralandi as well...
___________________________________
[Update, Aug 8 2018]
I have finished copying out the text of Litli prinsinn; the file now contains about 14,200 words and about 3,050 unique words. I made a small improvement to our script, so that it now creates an alphabetical index as well. This is very useful for finding copying errors: if I see two words close together which are almost the same, that often means that one of them is an error. Tidying up my copied text is not as tedious as I thought it would be. It's forcing me to look very carefully at everything and consolidate my extremely sketchy vocabulary.
I am sure there are still many errors left, but after this initial cleaning up pass I can look at my alphabetical index and get further on trying to understand the grammar. Here's a section showing forms of the word stjarna, "star", which occurs often in Litli prinsinn.
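The "two words close together which are almost the same" check can also be partly automated. The sketch below is one possible way to do it, not something the script described here is known to do: it sorts the vocabulary, compares each word with its next few neighbours, and flags pairs that differ by a single letter. The names `edit_distance` and `suspicious_pairs`, the window size and the word counts in the example are all hypothetical.

```python
from collections import Counter

def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-letter
    insertions, deletions or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]

def suspicious_pairs(counts, window=3, max_dist=1):
    """Flag word pairs that sit near each other in the sorted index
    and differ by at most max_dist letters."""
    words = sorted(counts)  # code-point order, not true Icelandic collation
    pairs = []
    for i, word in enumerate(words):
        for other in words[i + 1:i + 1 + window]:
            if edit_distance(word, other) <= max_dist:
                pairs.append((word, other))
    return pairs

# Made-up counts, for illustration only.
counts = Counter({"aldrei": 5, "adrei": 1, "af": 3, "kind": 2, "mynd": 4})
print(suspicious_pairs(counts))
```

A flagged pair is only a candidate: in a heavily inflected language, many genuine forms of the same word also differ by one letter, so each pair still has to be checked against the book by eye.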
Some of these are compound nouns: for example, stjörnufræðingur, literally "star-ologist", is "astronomer", and stjarnfræðiþingi, "star-ology-thing", is "astronomical congress". But what are all the others, most of which look like inflected forms? I can click on any of them and get a hyperlinked page of examples. For example, let's look at the page for stjörnu, which occurs 15 times:
I see that occurrences of stjörnu usually come after a preposition. For example, we have Hann hefir aldrei horft á stjörnu, "He has never looked at stjörnu", or En þú ert hreinn og þú kemur frá stjörnu, "But you are pure and you come from stjörnu". Most of the others are similar. Hm, looks like this is a dative singular? My suspicions are reinforced by the fact that Swedish used to have a dative; it disappeared long ago, but still survives in a couple of fixed expressions like till salu, "for sale", which has this -u ending.
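Looking at what comes immediately before a word is easy to mechanise. Here is a rough Python sketch, assuming a simple sentence splitter and hypothetical function names (`sentences`, `concordance`, `preceding_words`); it is not the script described above, and the sample text consists only of the two sentences just quoted. It collects up to ten example sentences for a word and tallies the word that precedes it.

```python
import re
from collections import Counter

def sentences(text):
    """Crude sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def words_of(sentence):
    """Lowercased words of one sentence, punctuation dropped."""
    return [w.lower() for w in re.findall(r"\w+", sentence)]

def concordance(text, target, limit=10):
    """Up to `limit` sentences that contain the word `target`."""
    hits = []
    for sentence in sentences(text):
        if target in words_of(sentence):
            hits.append(sentence)
            if len(hits) == limit:
                break
    return hits

def preceding_words(text, target):
    """Count which word comes immediately before `target`, within a sentence."""
    tally = Counter()
    for sentence in sentences(text):
        words = words_of(sentence)
        for i in range(1, len(words)):
            if words[i] == target:
                tally[words[i - 1]] += 1
    return tally

sample = "Hann hefir aldrei horft á stjörnu. En þú ert hreinn og þú kemur frá stjörnu."
for example in concordance(sample, "stjörnu"):
    print(example)
print(preceding_words(sample, "stjörnu").most_common())
```

If most of the preceding words turn out to be prepositions, that supports the guess that the form is governed by them, though it cannot by itself say which case it is.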
Still a great deal more grammar to figure out! There are some improvements to the script that I hope to add soon, and which might help...
___________________________________
[Update, Aug 12 2018]
I have added another little improvement to our script. It now creates a hyperlinked version of the original text, with the words colour-marked to show how frequently they occurred in the text you've read so far. The initial version uses four colours. Words are in black if they occur more than five times, blue if they occur four or five times, green if they occur two or three times, and red if they occur once. Here's an example, the start of the visit to the Drunkard:
The colours let you see at a glance approximately how well I now understand the text. Look at the first paragraph:
Á þriðja hnettinum bjó drykkjumaður. Heimsóknin þangað var mjög stutt, en hún fyllti litla prinsinn miklu þunglyndi.
(At the-third planet lived drunkard. The-visit there was very ?short, but it filled the-little prince much ?depression)
Black words like hnettinum ("planet", I think in the dative) and mjög ("very") are quite familiar, and I am reasonably confident that I've guessed the green and blue ones correctly. Only two words, stutt ("short"?) and þunglyndi ("depression"?) are in red, and these are indeed the ones I feel least certain about. I'm pretty much guessing stutt from context. I'm more confident about þunglyndi, since I know from other examples that þung, cognate to Swedish tung, is "heavy", lyndi is probably something related to Swedish lynne, "spirit", and there is a Swedish word tungsint, "heavy-spirited/depressed".
This was an easier passage than average, and usually there is more red. But it feels motivating to think that, as I copy out more text and process it through the script, the red tide should start to recede...
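The four-colour scheme described above is simple to reproduce. The following Python sketch shows one way it might be done; the HTML `span` output, the function names `colour_for` and `colourise`, and the word counts in the example are assumptions for illustration, not details of the actual script.

```python
import html
import re

def colour_for(freq):
    """Map a word's frequency to the four bands described in the text."""
    if freq > 5:
        return "black"   # more than five occurrences
    if freq >= 4:
        return "blue"    # four or five
    if freq >= 2:
        return "green"   # two or three
    return "red"         # once (or, in this sketch, never seen)

def colourise(text, counts):
    """Wrap every word in a coloured span; punctuation and spaces pass through escaped."""
    parts = []
    # The capturing group makes re.split keep the words as separate pieces.
    for piece in re.split(r"(\w+)", text):
        if re.fullmatch(r"\w+", piece):
            colour = colour_for(counts.get(piece.lower(), 0))
            parts.append(f'<span style="color:{colour}">{html.escape(piece)}</span>')
        else:
            parts.append(html.escape(piece))
    return "".join(parts)

# Illustrative counts only; the real frequencies come from the whole corpus.
counts = {"mjög": 9, "stutt": 1}
print(colourise("mjög stutt", counts))
```

Because the colour depends only on the counts passed in, rerunning it after copying out more text would turn some red words green or blue without any other change.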
[To Ævintýri Lísu í Undralandi]