An ancient language with nearly a million untranslated texts just got a translator that does the job in seconds: A.I.

Dead languages are famously hard to decipher. It took 23 years to crack the Egyptian hieroglyphics on the Rosetta Stone. It took nearly two centuries to understand Mayan glyphs. And Linear B, a script recording the earliest known form of Greek, went unread for more than 3,000 years. When techno-optimists talk about the game-changing potential of A.I., they cite difficult problems like this, and even for languages that have already been deciphered, challenges remain. Consider Akkadian, one of the world’s oldest written languages, recorded in cuneiform. So few people can read the extinct language that nearly a million Akkadian texts have yet to be translated. Now an A.I. tool can translate them within seconds.

An interdisciplinary group of computer science and history researchers published a journal article in May describing how they had created an A.I. model to instantly translate the ancient glyphs. The team, led by a Google software engineer and an Assyriologist from Ariel University, trained the model on existing cuneiform translations using the same technology that powers Google Translate.

A beacon to weary translation travelers

In translating dead languages, especially those with no descendant languages, piecing together meaning without a wealth of cultural context can be like traveling without a North Star. Akkadian is just such a language. The tongue of the Akkadian Empire, located in present-day Iraq during the 24th to 22nd centuries BCE, Akkadian existed as both a spoken and written language. Its cuneiform writing system was not an alphabet but a set of signs built from sharp, intersecting triangular figures. Akkadians typically wrote by marking a clay tablet with the wedge-shaped end of a reed (the word cuneiform comes from the Latin for “wedge shaped”). Thanks to the durability of clay, hundreds of thousands of these tablets have weathered the centuries and now populate the halls of various universities and museums.

Translation is often misunderstood as a one-to-one decryption of a foreign word or phrase. But many times, a statement in one language doesn’t have an exact or easy equivalent in another, owing to cultural nuance and differences in how the languages are constructed. High-quality translation requires a deep knowledge of both languages’ structures, their surrounding cultures, and the histories that anchor those cultures. Translating a text while preserving its original tone, cadence, and even humor is a delicate craft, and an incredibly difficult one when the language’s culture is largely unknown.

For decades, computer-generated translations were brittle and unreliable, said Tom McCoy, a computational linguist at Princeton University. Translation programs built on hand-coded grammatical rules consistently missed the richness of meaning in idioms and nonliteral language, which slip through the cracks of formal grammar. But recently, A.I. programs like the cuneiform translator have been able to get at the “fuzzier” areas of language. That shift heralds an exciting new period of A.I.-propelled computational linguistics.

“In recent A.I., the big new thing is statistical processing, which is another type of math but not the sort of rigid rules that people were working with before,” McCoy said. “Statistics got us kind of over the hump of previous methods. We’re now working with machine learning and deep learning. Machines are able to learn all these idiosyncrasies, idioms, and exceptions to rules, which is what was missing in the previous generation of A.I.”
