Archive for the ‘Linguistics’ Category

Google Translate: Swahili… bado kidogo

Thursday 27 August 2009

In the last couple of days Google has added Swahili to the list of languages supported by its Translate service. On the one hand I’m very happy to see this addition, as I think it has the potential to be a big step forward for development in East Africa. On the other hand, from first impressions the service still has a long way to go.

One of the main problems for Google is that Swahili is an agglutinative language, meaning that it joins morphemes (the smallest parts of a word that carry meaning or grammatical function) together to form longer words. So it can be difficult for a machine to know where each morpheme begins and ends.

Here are some very simple examples that I tried putting into Google:

Swahili | Morphemes | English | Google Translate
kupika | ku-pika | to cook | cooking
ninapika | ni-na-pika | I am cooking | I cooked
nilipika | ni-li-pika | I cooked | I cooked
nitapika | ni-ta-pika | I will cook | I cooked
sijapika | si-ja-pika | I have not cooked | I cooked
apike | a-pik-e | let him cook | apike
umepika | u-me-pika | you have cooked | has cooked
tutapika | tu-ta-pika | we will cook | we cooked
watakapopika | wa-taka-po-pika | when they will cook | will kakopika
mlipokuwa mnapika | m-li-po-kuwa m-na-pika | when you (pl) were cooking | as they were cooked
ikipikwa nasi | i-ki-pik(w)a na-si | if it is cooked by us | it be boiled us
bado kidogo | bado ki-dogo | not quite yet | still little
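
As a rough illustration of the splitting problem, here is a minimal Python sketch that recovers the morphemes of the simplest verbs in the table. Everything in it is an assumption made for illustration: the function name `segment`, the short prefix lists (taken only from the rows above) and the fixed stem `pika`. It is not a real Swahili analyser and says nothing about how Google approaches the task.

```python
# Toy morpheme splitter for a few simple Swahili verb forms.
# The prefix lists cover only the examples in the table above;
# they are illustrative assumptions, not a full grammar.
SUBJECT_PREFIXES = ["ni", "tu", "wa", "u", "a", "m", "si"]
TENSE_MARKERS = {
    "na": "present",
    "li": "past",
    "ta": "future",
    "me": "perfect",
    "ja": "negative perfect",
}


def segment(word, stem="pika"):
    """Split word into [subject, tense, stem], or return None if it does not fit."""
    if not word.endswith(stem):
        return None
    prefix = word[: -len(stem)]
    if prefix == "ku":  # infinitive marker
        return ["ku", stem]
    for subject in SUBJECT_PREFIXES:
        rest = prefix[len(subject):]
        if prefix.startswith(subject) and rest in TENSE_MARKERS:
            return [subject, rest, stem]
    return None  # anything more complex is out of scope for this sketch


for verb in ["ninapika", "nitapika", "sijapika", "watakapopika"]:
    print(verb, segment(verb))
```

The first three verbs split cleanly, while `watakapopika` returns `None`, which hints at why longer forms are harder: every extra slot in the verb needs its own rule.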

To be fair, from what I’ve seen the translations of single words isn’t bad at all. Where it falls down is in the grammar – translating Swahili past, present, future and negative-perfect tenses all to English past!

Going the other way, here are a few English examples I tried:

English | Swahili | Google Translate | English back-translation
many people | watu wengi | watu wengi | many people
many trees | miti mingi | miti mingi | many trees
many elephants | tembo wengi | wengi tembo | many elephants
many cars | magari mengi | wengi magari | many cars
I am cooking | ninapika | I am kupikia | “I am” to cook with
I cooked | nilipika | mimi kupikwa | I to be cooked
To be fair, from what I’ve seen the translations of single words isn’t bad at all. | Kwa kweli, kutokana na yale ambayo nimeyaona, utafsiri wa maneno ya pekee siyo mbaya | Kuwa na haki, kutokana na yale I’ve amemwona zote maneno ya wimbo sio mbaya wakati wote. | In truth, coming from what “I’ve” he has seen all words of song not bad all the time

At this point it looks to be a decent dictionary (although with nothing like the depth of the excellent Kamusi Project), and it actually does OK with set phrases. However, once you get past the set phrases that it knows, it seems unable to understand the relatively simple grammar and come up with a meaningful translation.

This is obviously a work in progress, as the “Contribute a better translation” option shows. It would be interesting to know whether Google takes these user-contributed translations and tries to work out how the grammars and structures of the languages compare, or whether it simply remembers the set translation in case anyone enters the exact same phrase again. The first would be fascinating to investigate, whereas I fear the second would be like trying to empty the ocean with a teaspoon.

Google still has a long way to go

Monday 25 August 2008

Google now has a homepage for its search engine in Swahili: http://www.google.co.tz/

According to this page Google has translated at least 1% of its main site in 152 languages. Not bad, especially considering that these languages are spoken by several billion people worldwide.

According to Wycliffe Bible Translators, the most translated book of all time, the Bible, has been translated into 438 languages. Another 2,016 languages have at least some of the Bible translated into them.

But that leaves over 2,200 with a need for Bible translation and no project yet started. Many of these languages don’t have a written form, so in order for the Bible, Google or any other text to be translated and written down, an alphabet and writing system must first be developed.

The efforts of Google and others (like Ubuntu, who are currently translating into 189 languages) are to be applauded, and will make their products accessible to the vast majority of people worldwide. But for the Bible, a message from God’s heart to man’s heart, it’s not enough to translate into the 150 or 200 most widely spoken languages in the world.

Rather, the message of God’s good news to all nations must be made available to each and every person in the language of their heart, however uneconomical it may seem. No businessman would ever translate his product into a language spoken by 100 people in a village in Papua New Guinea – it just doesn’t make business sense. But then not many shepherds would leave 99 sheep on their own in order to search for the one sheep that wandered astray.

Which is why the Bible will always be the most translated book. God has created each and every person uniquely and loves them just as they are. He will go to any lengths to draw each person to himself. If we are to reflect God’s character as we join in with his mission to the world, we must make the Bible available to every person in their own heart language.

The language of church

Sunday 20 April 2008

Our church at the moment is an Anglican church, which can mean a variety of things, but in our case means it’s quite traditional. In an average morning service, there are probably five words whose meaning I can only guess from the context, and dozens of others that I wouldn’t hear during the rest of the week.

And the idea of using different vocabulary in church to the rest of life isn’t confined to traditional churches. We were listening to a sermon online today from a church that is very alive and fruitful in many ways, but some of the words used probably hadn’t been used in regular English conversations for well over 100 years.

Why do we do this? Why do we use special old words when we’re talking to God that we would never use if we were talking to our next door neighbour? I’m not sure, but here are some possible ideas:

  1. We think that God understands old words better. God is old. He’s been around for thousands of years – maybe he’s like our great-grandparents and longs for the good old days. Maybe if we use old words we’ll get his attention and he’ll really understand us.
  2. We’re used to using a Bible with old words. Since God speaks to us in old English, it’s only fair to reply in the same language.
  3. We want to impress other people. If God speaks old words and we do too, maybe people will be impressed that we’re close to God and know his “lingo”…
  4. We’re scared to use the same language in church as we do in the rest of our lives. If we do, that will mean that the rest of our lives are actually connected to what we do in church and we’ll have to give our whole lives to God, not just Sunday mornings.

Are there other (more genuine) answers I’ve missed? I’d love to know, because there are people who are much more godly than me, who I really respect as Christians, who use old English words. Am I missing out on something because I only use simple words…?

An out of context visit

Saturday 29 March 2008

Yesterday, Mark and I were very happy to have some American friends passing through whom we know from Tanzania. It was great to see them in England and it really reminded us of Tanzania… Hot weather, Swahili, other friends from the Uganda-Tanzania SIL Branch. It makes us miss many things about Africa. This morning we woke up thinking about Tanzania and I had Mark teach me a few more words in Swahili… Mti (tree)… Mguu (leg)… Mkono (arm)… Mji (town)… Mlango (door)… Mfereji (trench)… And if you add m- + refu after any of these, you have a ‘long/tall’ noun. Like Mti mrefu = ‘tall tree.’ It did my heart good to connect with Africa this morning, even if through saying simply ‘Mifereji mitano mirefu’ – ‘five long trenches.’
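
For anyone who likes to see a rule written down, the m- + refu pattern can be sketched in a few lines of Python. This is only a toy under stated assumptions: the function names `pluralize` and `describe` are made up for illustration, and the rule covers just the m-/mi- nouns listed above with stems that begin with a consonant (such as refu and tano). Other noun classes and vowel-initial stems behave differently.

```python
def pluralize(noun):
    """Swap the singular prefix m- for the plural prefix mi- (toy rule)."""
    if not noun.startswith("m"):
        raise ValueError("this sketch only handles nouns beginning with m-")
    return "mi" + noun[1:]


def describe(noun, *stems, plural=False):
    """Attach the matching m- or mi- prefix to each modifier stem after the noun."""
    prefix = "mi" if plural else "m"
    head = pluralize(noun) if plural else noun
    return " ".join([head] + [prefix + stem for stem in stems])


print(describe("mti", "refu"))                           # mti mrefu
print(describe("mfereji", "tano", "refu", plural=True))  # mifereji mitano mirefu
```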


verb – to twig

Sunday 23 March 2008

My vocabulary has been enlarged today as Mark’s dad asked whether someone had twigged who we were. At which point, Mark’s mum wondered where the root of that word might lie… What a fun Easter morning.

twig·ging. British

–verb (used with object)

1. to look at; observe: Now, twig the man climbing there, will you?
2. to see; perceive: Do you twig the difference in colors?
3. to understand.

–verb (used without object)

4. to understand.

[Origin: 1755–65; < Ir tuigim I understand, with E w reflecting the offglide before i of the velarized Ir t typical of southern Ireland; cf. dig²]

Dictionary.com Unabridged (v 1.1)