Robot translators decipher mountains of enemy messages | McClatchy Washington Bureau

Robot translators decipher mountains of enemy messages

Robert S. Boyd - Knight Ridder Newspapers

March 16, 2005 03:00 AM

WASHINGTON—Somewhere in a vast jumble of documents in a Baghdad warehouse or in the constant buzz of electronic signals in the sky, a few ominous words or phrases may be hidden:

"Explosives." "Nerve gas." "Convoy." "Airport arrival." "The president."

The words, however, are in Arabic, Farsi, Pashto or some other language that few Americans understand. The messages urgently need to be translated, but there aren't enough expert linguists to handle the flood.

The time for robot translators has arrived, according to a panel of language specialists at a meeting of the American Association for the Advancement of Science in Washington last month.

"The Defense Department doesn't have enough human translators," said Melissa Holland, an expert at the Army Research Laboratory in Arlington, Va.

"The backlog of untranslated documents is a hindrance to the war on international terrorism," said Mohammad Shihadah, the founder of Applications Technology, a small firm in suburban McLean, Va., that sells Arabic-to-English translation software to the government.

Since Sept. 11, 2001, the Defense Department, the CIA and other intelligence agencies have been pouring money and effort into what's known as "machine translation," or MT for short.

MT uses computers to translate messages from one language to another—such as turning "Good Morning" into "Buenos Dias" or "Auf Wiedersehen" into "Au Revoir" with little or no human intervention.

Computer scientists have labored to perfect machine translation since the 1950s with only modest success. But the terrorist attacks and the wars in Afghanistan and Iraq have given the technology a boost.

Today's robot-linguists are far from perfect, but they can give soldiers in the field the gist of a document, a poster or a possible threat scrawled on a wall.

"Soldiers can get a sense of what a document is about—not a perfect translation," Holland said. Accuracy is still less than 50 percent, Clare Voss, another Army researcher, acknowledged.

Equipped with a handheld PDA (a Personal Digital Assistant, such as the popular BlackBerry), a digital camera and a laptop computer in the back of a Hummer, a GI can quickly decide if a message needs human attention.

"Expectations for speed and accuracy are not always met—it's not the Queen's English," admitted William McClellan, a machine translation systems manager at Booz-Allen Hamilton, a technology consulting firm in McLean. "But it's a way to find the needle in the haystack without translating every straw."

The elimination process is called "triage."

"Knowing what to translate first out of thousands of documents is a problem faced daily by our military and intelligence officers," McClellan said. "Thousands of documents can be automatically screened, and those meeting certain criteria can be ... automatically routed to linguists and domain specialists."
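The screening step McClellan describes can be pictured with a minimal Python sketch. It is only an illustration, not any agency's actual system: it assumes each document already has a rough machine-translated English gist, and the watch list, the document IDs and the function name `triage` are all hypothetical.

```python
# Hypothetical watch list, borrowed from the example words earlier in this story.
WATCH_TERMS = {"explosives", "nerve gas", "convoy", "airport arrival", "the president"}


def triage(gists, terms=WATCH_TERMS):
    """Split rough machine-translated gists into flagged and set-aside piles.

    gists maps a document ID to its rough English gist.
    Returns (flagged, set_aside), where flagged is a list of
    (doc_id, matched_terms) pairs meant for human linguists.
    """
    flagged, set_aside = [], []
    for doc_id, text in gists.items():
        lowered = text.lower()
        # Simple substring match; a real system would need far more than this.
        hits = sorted(term for term in terms if term in lowered)
        if hits:
            flagged.append((doc_id, hits))
        else:
            set_aside.append(doc_id)
    return flagged, set_aside


# Made-up gists, for illustration only.
sample = {
    "doc-1": "meeting about the convoy near airport arrival gate",
    "doc-2": "list of grocery prices",
    "doc-3": "Explosives hidden in warehouse",
}
flagged, set_aside = triage(sample)
print(flagged)
print(set_aside)
```

Only the flagged pile goes to a human linguist. The rest waits, which is the point of triage: the rough gist does not have to be accurate, only good enough to sort.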

The volumes of material to be translated are "enormous," said Mark Turner, an MT expert at CACI, an information technology organization in Lanham, Md.

In Baghdad, "we found warehouses with billions of documents in bags, boxes, binders and books," he said. "There are tons of paper and terabytes (trillions of bytes or letters) of electronic media."

People who use machine translation often find it frustrating, quirky and unreliable. "MT is a useful tool for triage, but it doesn't replace human linguists," Turner said.

For decades, machine translation systems labored to make computers understand traditional rules of grammar—subjects, verbs, objects and so on. Progress was slow, thanks to the tremendous ambiguity and complexity of human language.

The word "get," for example, has 24 possible meanings listed in Webster's New College dictionary. One of them is "kill"—as in "I'll get you for this."

In the 1990s, however, a new technique came along, applying statistical analysis to huge databases of previously translated texts. By comparing a new, unknown message to millions of stored sentences, phrases and words, researchers could quickly find the most likely translation.

This method, also known as "data-driven machine translation," works like this: The computer scans a sentence, lists each possible meaning of each word and arranges them in every possible order, most of them nonsensical, until it finds one that most nearly matches a good translation.

For example: "bites man dog," "dog man bites," "man bites dog," and, finally, "dog bites man." A long sentence can produce millions of variations.
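The word-order search can be mimicked in a few lines of Python. This is a toy sketch, not the method used by any system named in this story: it assumes a tiny made-up corpus of good English sentences and scores each candidate ordering by how often its adjacent word pairs appear in that corpus. The corpus and the function names are hypothetical.

```python
from itertools import permutations

# Hypothetical stand-in for the huge database of previously translated text.
CORPUS = [
    "dog bites man",
    "the dog bites the man",
    "a dog bites a boy",
]


def bigram_counts(sentences):
    """Count adjacent word pairs, including sentence start and end markers."""
    counts = {}
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for pair in zip(words, words[1:]):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


def score(order, counts):
    """Add up how often each adjacent pair in this ordering was seen."""
    words = ["<s>"] + list(order) + ["</s>"]
    return sum(counts.get(pair, 0) for pair in zip(words, words[1:]))


def best_order(words, counts):
    """Try every ordering of the words and keep the highest-scoring one."""
    return max(permutations(words), key=lambda order: score(order, counts))


counts = bigram_counts(CORPUS)
print(" ".join(best_order(["bites", "man", "dog"], counts)))  # dog bites man
```

Three words give only six orderings. A 10-word sentence gives more than 3.6 million, so trying every ordering, as this sketch does, quickly becomes impractical for long sentences.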

Statistical machine translation "was a huge leap in the state of the art—very high accuracy, very fast," said Daniel Marcu, a co-founder of Language Weaver Inc., a commercial MT company in Marina del Rey, Calif.

Marcu claimed that his company's system can translate 5,000 words per minute, 24 hours a day, seven days a week. Five years ago, he said, the best that could be done was one 1,000-word document a day.

According to Marcu, the system can record a broadcast from al Jazeera, the Arabic-language network that carries Osama bin Laden's taped messages, and translate it automatically.

"With a one-minute delay, you can see what al Jazeera reported," he said.

Machine translation is also gaining ground in international commerce, according to Stephen Richardson, a former IBM researcher who now heads the Machine Translation Project at Microsoft in Redmond, Wash.

"Companies are facing increasingly difficult and costly challenges of localizing their products and services in the global marketplace," Richardson said.

Human translation is very expensive—20 to 50 cents per word, he said. Older, rule-based machine translation systems cost as much as a million dollars to create and maintain.

Microsoft has used the new, data-driven method to translate its customer support database into four foreign languages "at a substantial cost savings," Richardson said. The machine translations still need a final polishing by a human editor, but the total cost is 35 percent less than it used to be.

Nevertheless, the prime mover for machine translation is the war on terror, and the urgent need to understand what potential enemies are saying.

"You can't expect the president to speak Pashto," an Afghan language, said Benson Margulies, the chief technical officer at Basis Technology, a language processing provider in Cambridge, Mass.

———

For more information on machine translation, go to www.essex.ac.uk/linguistics/clmt/Mtbook/HTML/book.html

To try out a free, rudimentary commercial translation system, go to www.freetranslation.com

———

(c) 2005, Knight Ridder/Tribune Information Services.
