Wednesday, September 14, 2011

Wow, Prijevod Softvera Stvarno Poboljšao!

A few years ago, I tried computerized translation software and soon gave up on it. It was simply too prone to producing ungrammatical nonsense, so aside from a fleeting amusement factor, it was mostly worthless.

A few weeks ago, I wanted to translate some sentences and went reluctantly to Google Translate for help. I was amazed by the quality of the results. At the time I just thought "Wow, translation software has really improved!" (or "Wow, prijevod softvera stvarno poboljšao!" in Croatian), and thought no more of it.

Then today I saw this article in The Independent that explained how Google Translate works and why it's so much better than its predecessors.

The article explains that, unlike previous translation software, GT "isn't an algorithm designed only to extract the meaning of an expression from its syntax and vocabulary. In fact, at bottom, it doesn't deal with meaning at all. Instead of taking a linguistic expression as something that requires decoding, Google Translate (GT) takes it as something that has probably been said before."

"It uses vast computing power to scour the internet in the blink of an eye, looking for the expression in some text that exists alongside its paired translation. The corpus it can scan includes all the paper put out since 1957 by the EU in two dozen languages, everything the UN and its agencies have ever done in writing in six official languages, and huge amounts of other material, from the records of international tribunals to company reports and all the articles and books in bilingual form that have been put up on the web by individuals, libraries, booksellers, authors and academic departments. Drawing on the already established patterns of matches between these millions of paired documents, Google Translate uses statistical methods to pick out the most probable acceptable version of what's been submitted to it."

Of course, the quality of the translation varies depending on the source material available. For example, there aren't as many translations from Icelandic to Urdu as there are from Icelandic to English, so if you need something translated from Icelandic to Urdu, the software will most often use English as a "pivot language," translating from Icelandic to English and then from English to Urdu. Inevitably, such extra steps take a toll on accuracy, but Google uses user feedback and the ever-growing body of multilingual information on the internet to constantly improve its output.
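
For the curious, here is a toy sketch of the two ideas described above: picking the translation most often paired with a phrase, and falling back to English as a pivot when no direct pairing exists. This is not Google's code. The tiny CORPUS dictionary, its counts, the function names, and the choice to multiply the two probabilities for the pivot case are all my own assumptions for illustration.

```python
from collections import Counter

# Toy stand-in for a parallel corpus (assumption: invented data and counts).
# Keys are (source language, target language); values are phrase pairs
# as they might have been observed in paired documents.
CORPUS = {
    ("is", "en"): [
        ("góðan daginn", "good morning"),
        ("góðan daginn", "good morning"),
        ("góðan daginn", "good day"),
    ],
    ("en", "ur"): [
        ("good morning", "subah bakhair"),
        ("good morning", "subah bakhair"),
    ],
}


def best_translation(phrase, src, dst):
    """Return (translation, relative frequency) for the most common pairing, or None."""
    counts = Counter(t for s, t in CORPUS.get((src, dst), []) if s == phrase)
    if not counts:
        return None
    translation, n = counts.most_common(1)[0]
    return translation, n / sum(counts.values())


def translate(phrase, src, dst, pivot="en"):
    """Try a direct lookup first; otherwise go src -> pivot -> dst."""
    direct = best_translation(phrase, src, dst)
    if direct is not None:
        return direct
    first = best_translation(phrase, src, pivot)
    if first is None:
        return None
    second = best_translation(first[0], pivot, dst)
    if second is None:
        return None
    # Assumption: combine the two steps by multiplying their frequencies,
    # which shows how each extra hop can only lower the confidence.
    return second[0], first[1] * second[1]


if __name__ == "__main__":
    print(translate("góðan daginn", "is", "en"))  # direct lookup
    print(translate("góðan daginn", "is", "ur"))  # via the English pivot
```

In this sketch the Icelandic-to-Urdu result can never score higher than the Icelandic-to-English step it depends on, which is a simplified picture of why pivoting costs accuracy.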

Try it out. It's free, and it's fun.

P.

1 comment:

Al Christensen said...

Sometime, when you have time to waste, you can play a little game with an older translator, like Babelfish. Translate something from one language to another and another, back and forth, and see what you get when you translate back to the original language.