3 reasons why AI won’t replace human translators... yet
Witty repartee with robots is typically constrained to a
narrow set of contexts and conditions.
11 Oct 2018 | Jonathan Rechtman, Co-founder and Chief Interpreter, Cadence Translate
I've been a proud simultaneous interpreter for the better
part of a decade, but I agree that it's not the flashiest of jobs. We speak in
hushed voices, tucked away in small booths in the corners of conference rooms,
the polyglot wallflowers of the global economy.
Just don't let the robots take credit for our work. Then
we get noisy.
Grumbles grew into a roar recently when an interpreter in
Shanghai took to social media to protest the misleading marketing of
“AI-powered translation” at an international conference. The translation was in
fact a voice-to-text transcription of the human interpreter's work. The post
went viral on Chinese social media and created a scandal around iFlyTek, the
promoter of the mislabelled technology and one of China's leading AI and
natural language processing (NLP) companies.
The public response revealed the extent to which machine
superiority in the field is already taken for granted. People seem genuinely
shocked that in this day and age, interpreting still requires human
professionals to perform actual knowledge work.
Didn't Google Translate solve this problem years ago? Or
Skype Translator? Or any of a dozen wearable translation devices on the market
claiming to be the next “Babel Fish”?
No, they didn't.
AI consistently outperforms humans at driving cars,
diagnosing cancer, shooting free-throws and predicting crop yields (not to
mention chess, Go, poker, and Jeopardy). But when it comes to translation and
interpreting, the most sophisticated technology on earth is still by far the
human brain.
How come? There are three reasons.
1. Language is subjective
Artificial intelligence typically excels at tasks that
are rooted in objective reality. Whether identifying elusive signal patterns in
data sets or navigating complex road conditions, machines function best when
confronted with clear mathematical or physical rules that govern their
decision-making.
Natural languages, by contrast, are subjective constructs
invented by groups of humans to communicate with each other. They often exhibit
rule-like behaviour (grammar and conjugation, for example), but these rules are
grounded only in convention, not objective reality, and they are constantly
evolving.
We humans may have forfeited our lead in recognizing tumours
or judging credit risk, but we still have, and may always have, the final
authority over what is or isn't “natural” in a natural language. This authority
is reflected in the metric of choice for evaluating machine translation
algorithms - the BLEU (bilingual evaluation understudy) - which scores
candidate translations on their similarity to a human professional's
work. “The closer a machine translation is to a professional human translation,
the better it is”, concede the framework's inventors.
Human translation doesn't just set the standard; it
necessarily is the standard.
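To make that metric concrete, here is a minimal sketch of a BLEU comparison using the NLTK library, assumed here purely for illustration; the example sentences are invented, and real evaluations aggregate scores over whole test corpora rather than single sentence pairs.

```python
# A minimal BLEU sketch: score a candidate translation against a human reference.
# Assumes the NLTK library is installed; sentences are invented for illustration.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "the interpreter rendered the speech faithfully".split()    # human translation
candidate = "the interpreter translated the speech faithfully".split()  # machine output

# Smoothing avoids a zero score when a higher-order n-gram never matches.
smooth = SmoothingFunction().method1
score = sentence_bleu([reference], candidate, smoothing_function=smooth)

print(f"BLEU: {score:.3f}")  # closer to 1.0 means closer to the human reference
```

Notice that the score is defined entirely by overlap with the human reference: the metric has no notion of translation quality independent of human output.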
2. Big data doesn't have a big sense of humour
Any translator will tell you that jokes, puns and sly
innuendo (as well as nuanced cultural references) are among the hardest bits to
get over the language barrier. Yet without them, our quality of expression
becomes much poorer. From an interpreter's standpoint, tone of voice and body
language also directly inform a speaker's intent and have to be accurately
analysed and conveyed in the target language as well.
This is challenging for humans, but it's currently
impossible for machines.
The move from statistical, phrase-based machine
translation to neural networks has yielded significant improvements in overall
quality. But neural machine translation is even more dependent on huge sets of
training data than its predecessor models. And since the biggest bilingual
datasets available are from official translations of government documents and
religious texts, these algorithms have a pitifully low exposure to humour,
wordplay and non-verbal expression.
Most disturbingly, neural machine translation often
doesn't confess its mistakes. Rather, like an ill-prepared schoolchild, it
tries to fudge through them. When Google Translate started offering biblical
prophecies in exchange for junk input, experts attributed the errors to neural
networks' preference for fluency over accuracy.
These “false positives” are far more insidious than
clumsier and more obvious mistakes, as audiences in the target language might
never realize a glitch has occurred and might attribute the outlandishness of
the renegade translation to the original text itself.
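To see why fluency-first decoding produces such confident nonsense, here is a deliberately oversimplified sketch: a greedy decoder driven only by a toy target-side language model, which never consults the source text at all. Real neural translation systems do condition on the source; the toy exaggerates the failure mode to make it visible, and the probabilities are invented.

```python
# Toy illustration of "fluency over accuracy": a decoder guided only by
# target-language probabilities emits its most probable well-formed sentence,
# no matter what the input says.
bigram_probs = {
    "<s>": {"in": 0.6, "the": 0.4},
    "in": {"the": 0.9, "beginning": 0.1},
    "the": {"beginning": 0.7, "end": 0.3},
    "beginning": {"</s>": 1.0},
    "end": {"</s>": 1.0},
}

def fluent_decode(source_text: str) -> str:
    """Greedy decoding that ignores source_text entirely (the failure mode)."""
    word, output = "<s>", []
    while word != "</s>":
        word = max(bigram_probs[word], key=bigram_probs[word].get)
        if word != "</s>":
            output.append(word)
    return " ".join(output)

print(fluent_decode("ag ag ag ag"))  # -> "in the beginning" (fluent, unrelated to input)
```

However garbled the input, the decoder confidently produces a grammatical sentence, which is precisely what makes the error hard for a target-language audience to detect.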
3. Listen up, bots
The challenges above make it difficult enough to perform
machine translation on a piece of static text. Asking a computer to interpret
live speech simultaneously adds several layers of complexity, the most obvious
being automatic speech recognition (ASR).
Yes, Siri, Alexa and their ilk seem to be pretty
competent conversationalists these days. But your witty repartee with robots is
typically constrained to a narrow set of contexts and conditions: short,
command-based interactions involving a finite vocabulary in a controlled
environment. Most live conferences and business discussions, on the other hand,
feature speech that is spontaneous, continuous and highly context-dependent -
traits that send the error rate of most ASR programs through the roof.
Hilarious and humiliating examples abound. Giving a
speech in Beijing earlier this year, hedge fund guru Ray Dalio reflected on his
mis-forecasts as a young trader.
“How arrogant!” he thundered to the crowd. “How
could I be so arrogant?”
The real-time subtitling program valiantly struggled to
render his rhetorical device.
"How?" the subtitles asked. "Aragon, I
looked at myself and i".
Recent advances in the field are promising, and many
experts predict that the word error rate of ASR software will reach parity with
human transcribers in the near future. Not all word errors are the same,
though. Fudging “alright” into “all right” might be an inconsequential
mistake, while confusing “today” with “Tuesday” would likely cause a
substantial mix-up. Even with fewer word errors, machines remain far more
likely than humans to commit semantic errors that misrepresent the intended
meaning of a speech.
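For the curious, that word error rate is conventionally computed as a Levenshtein edit distance over words, normalized by the length of a human reference transcript. A minimal sketch, with the function name and example transcripts my own:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """(substitutions + insertions + deletions) / reference length,
    via the standard Levenshtein dynamic program over words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# A consequential error and an inconsequential one:
print(word_error_rate("the meeting is today", "the meeting is Tuesday"))  # 0.25
print(word_error_rate("alright let us begin", "all right let us begin"))  # 0.5
```

The metric is blind to the difference between the harmless slip and the meeting-wrecking one: exactly the gap between word errors and semantic errors described above.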
Not a human, not yet a robot
Humans have long made a pastime of reflecting on our
perceived superiority - over animals, over each other and more recently over
machines. It's a dark pastime, to be sure, and an inevitably foolish one.
I don't doubt that the day may come when computers
develop a human-like command of our natural languages. I don't doubt that one
day interpreters and translators, along with copywriters, editors, radio hosts
and other professionals in the language economy, may find their jobs on the
robot's chopping block.
But that day is further away than most people think.
Language work - always part art, part science - is surprisingly defensible
against these early iterations of AI.
Like workers in so many other industries, we language
professionals should focus our attention on using AI/NLP technologies to increase the
efficiency, quality and cost-competitiveness of our labour. Computer-assisted
translation tools are already widely used among text translators, and while
many bristle at the suggestion, no doubt simultaneous interpreters could
benefit from some combination of speech recognition and translation memory
technology. For the foreseeable future, at least, these tools would serve as a
complement, not an alternative, to human output.
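As a hypothetical sketch of what such a complement could look like, here is the core of a translation-memory lookup: fuzzy-match an incoming sentence against previously translated segments and surface the closest past translation for the human to reuse or adapt. The memory entries and threshold below are invented, and production CAT tools use far more sophisticated matching.

```python
import difflib

# A toy translation memory: source segments mapped to past human translations.
translation_memory = {
    "Thank you all for coming today.": "Merci à tous d'être venus aujourd'hui.",
    "Let's review the quarterly results.": "Passons en revue les résultats trimestriels.",
}

def suggest(source, memory, threshold=0.6):
    """Return the stored translation of the most similar past segment, if any."""
    best_target, best_ratio = None, threshold
    for past_source, past_target in memory.items():
        ratio = difflib.SequenceMatcher(None, source.lower(), past_source.lower()).ratio()
        if ratio >= best_ratio:
            best_target, best_ratio = past_target, ratio
    return best_target

print(suggest("Thank you all for coming!", translation_memory))
# -> "Merci à tous d'être venus aujourd'hui."
```

The tool only retrieves; the human still judges whether the suggestion fits the moment.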
And as long as there are humans in the interpreter's
booth, let's have the decency to give them the credit they deserve.