Translate the Web with a Screen Captcha
26 June 2012 - Digital
A free website, Duolingo, has been launched with the intention of translating the entire world wide web with the help of people who are learning a new language. The project has been conceived out of a sense of guilt by the man who introduced one of the most exasperating features on the internet – the screen Captcha.
The website is hoping to convince millions of users to work for free and help to translate all web content in a matter of a few years.
The ambitious scheme has been cooked up by Luis von Ahn and his colleagues at Carnegie Mellon University in Pittsburgh, USA, who have enlisted a worldwide workforce prepared to work for nothing.
At the age of 22, von Ahn invented the Captcha as a graduate student. Captchas are distorted images of words and numbers that are used to sign into security enabled websites such as social media and ticketing sites where visitors have to prove that they are human.
It is believed that the software is used by more than 350,000 websites to identify computer programs and prevent them from besieging the sites with spam. In 2007, von Ahn conceded that 200 million Captchas were being entered by people all around the globe each day.
“At first I felt really good about that because I thought, ‘Look at the impact that I’ve had’,” he said. “But then I started feeling bad.”
Von Ahn estimated that each Captcha took around 10 seconds to fill in. Multiplying that by 200 million, he figured that humanity as a whole was spending roughly 500,000 hours every day typing in security codes.
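The estimate is easy to check. The short Python sketch below redoes the sum using only the two figures quoted above; the function and constant names are illustrative, not von Ahn's. The exact result is a little over 555,000 hours, so the 500,000 figure is a round-down.

```python
# Figures quoted in the article; the names are illustrative only.
SECONDS_PER_CAPTCHA = 10
CAPTCHAS_PER_DAY = 200_000_000
SECONDS_PER_HOUR = 3600


def daily_hours(captchas_per_day, seconds_each):
    """Total hours spent per day on Captchas, given a count and a time per Captcha."""
    return captchas_per_day * seconds_each / SECONDS_PER_HOUR


hours = daily_hours(CAPTCHAS_PER_DAY, SECONDS_PER_CAPTCHA)
print(f"{hours:,.0f} hours per day")  # prints "555,556 hours per day"
```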
In an attempt to make good use of these hours he created ReCaptcha, a process which uses the filled out response as both the intended spam deterrent and as a means to digitise books one word at a time.
This development coincided with the New York Times embarking on a project to digitise its 130-year archive by employing a team of typists. Over ten years, the team had managed to transcribe 27 years' worth of newspapers. The Times decided to use von Ahn’s software solution and within 2 years had completed the remaining 103 years of archived material.
In 2009 Google bought ReCaptcha, and it is still extensively used to distinguish between spamming software and human beings all around the globe. Its digitising features, however, are exclusively available to Google’s Books initiative, which was set up to transcribe every book in the world.
Despite this, the majority of people see Captchas as a complete and frustrating waste of time, and web surfers who suffer from dyslexia or visual impairments find them a major barrier to online use.
Dr Sue Fowler, at the Dyslexia Research Trust, suggests that the Captchas only add to the problems dyslexics have when filling out web forms. “Even looking at it closely, I wouldn’t know what to do with it,” she said.
There is an audio option, but it can be even more unsatisfactory, as the recordings tend to sound muffled and are difficult to understand.
It has been noted that the automated codes are becoming progressively more unintelligible, with some of the latest offerings appearing so jumbled and blurred that they are almost impossible to decipher.
“As of a few months ago, if we showed someone a ReCaptcha they were successful at it about 93% of the time,” von Ahn remarked, adding that as soon as that drops to 75%, visitors give up trying to gain access to the website.
Von Ahn has embarked on a project to create software that rewards a user for their time and effort when filling out Captchas and in partnership with one of his former graduate students, Severin Hacker, has developed Duolingo.
Duolingo is a website that offers free language tutorials and in return asks its learners to translate sentences from the web.
At the moment, the site only supports English speakers interested in French, Spanish or German, and Spanish speakers looking to learn English. The students begin with very simple sentences and gradually work up to more complicated ones, so their contributions become more valuable as they grow more competent.
Although computers can translate individual words, it is important to have human input in order to put the words into context and construct sentences that make sense.
“The computer always knows what each word can translate to, all the possibilities – that’s just a bilingual dictionary. But the computer doesn’t know that in this case, a word means girl, and in that case, it means daughter,” von Ahn said.
Duolingo serves a user with a complete sentence and offers all the possible translations for each of the words. The user then has to build the sentence with the aid of their understanding of their language.
To root out the bad translations, the website invites the users to rate the individual answers and chooses the top ranked explanations.
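The rating step can be pictured with a small Python sketch. The article does not say how Duolingo combines the ratings, so the averaging used here is an assumption, and the function name, the 1 to 5 scale and the sample sentences are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def top_translation(ratings):
    """Return the candidate translation with the highest average rating.

    `ratings` is a list of (translation, score) pairs, one per user vote.
    Averaging is an assumption; the article does not describe the real method.
    """
    scores = defaultdict(list)
    for translation, score in ratings:
        scores[translation].append(score)
    # On a tie, max() keeps the candidate that was rated first.
    return max(scores, key=lambda t: mean(scores[t]))


# Hypothetical votes on two candidate translations of the same sentence.
votes = [
    ("The girl eats an apple", 5),
    ("The daughter eats an apple", 2),
    ("The girl eats an apple", 4),
    ("The daughter eats an apple", 3),
]
best = top_translation(votes)
print(best)  # prints "The girl eats an apple"
```

A real system would probably also weight each vote by the rater's own skill level, but nothing in the article confirms that.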
The users then begin to work on real sentences sourced from creative commons licensed websites.
Duolingo has a computer game element to it: points are awarded for each translation attempt, and a completed round earns the participant a shiny gold medal. The budding linguists can also track each other’s progress, which adds a competitive edge.
The system does have its detractors though, with some experts doubting the method actually allows the user to reach a decent level of fluency.
Mickael Pointecourteau, an experienced language lecturer who has used the software, said: “There are some mistakes in their translation from the very first level, which worries me for when users will get to a higher level. Four main skills must be taken into account when learning a new language – speak, listen, write, read. I doubt this kind of software prepares for that.”
Von Ahn, unsurprisingly, does not agree. “We’ve been doing a lot of tests and we can get you to the point where you are an intermediate speaker of a language, you can go to a country that speaks that language and you can get around,” he said.
“Of course in order to become bilingual you probably need to go to a particular country and live there for a few months, it takes that level of practice.”
For many people, Duolingo will be no more than a distraction from their work, or at best, a game, but von Ahn is adamant the software’s potential easily surpasses that.
“In the US and in the UK too, learning a language is more of a hobby. In South America you learn a language, particularly English, to make more money and to climb the social ladder.”
Von Ahn is hopeful that his software will give a leg-up in life to people who could not otherwise have afforded one.