Powerful AI Algorithm That Cracks Encrypted Messages Could Help Facebook and Google Translate Human Language

Cipher-cracking AI isn't just for decoding encrypted data: new research suggests it can also make quick work of translating human language.

A team of researchers from the University of Toronto and Google recently created an algorithm that cracked two well-established codes: the Caesar cipher, which shifts every letter by the same fixed amount and is comparatively easy to crack, and the Vigenère cipher, which uses a secret keyword to vary the shift from letter to letter and takes more sophistication to crack.
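For readers unfamiliar with the two ciphers, here is a minimal Python sketch of how each one scrambles a message. It is an illustration of the classical ciphers only, not code from the research; the function names (`shift_char`, `caesar`, `vigenere`) are made up for this example, and it assumes lowercase English letters and a lowercase keyword.

```python
import string

ALPHABET = string.ascii_lowercase


def shift_char(ch, shift):
    """Shift one lowercase letter by `shift` places, wrapping around the alphabet.

    Characters that are not lowercase letters (spaces, punctuation) pass through.
    """
    if ch not in ALPHABET:
        return ch
    return ALPHABET[(ALPHABET.index(ch) + shift) % 26]


def caesar(text, shift):
    """Caesar cipher: every letter moves by the same amount. A negative shift decrypts."""
    return "".join(shift_char(ch, shift) for ch in text.lower())


def vigenere(text, key, decrypt=False):
    """Vigenere cipher: the shift for each letter comes from the next letter of `key`."""
    key = key.lower()
    sign = -1 if decrypt else 1
    out = []
    i = 0  # position in the key; advances only when a letter is enciphered
    for ch in text.lower():
        if ch in ALPHABET:
            k = ALPHABET.index(key[i % len(key)])
            out.append(shift_char(ch, sign * k))
            i += 1
        else:
            out.append(ch)
    return "".join(out)


print(caesar("attack at dawn", 3))
print(vigenere("attack at dawn", "lemon"))
```

A Caesar cipher has only 26 possible shifts, so it can be broken by trying them all. The Vigenère keyword multiplies the possibilities, which is why it takes more sophistication to crack.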

Judging by the algorithm's success against those codes, it should prove just as handy in translating human languages, according to New Scientist. A paper describing the research is among the top few finalists being considered for discussion at the 2018 International Conference on Learning Representations, which will take place in Vancouver this spring.

"Turns out, cipher cracking is a pretty good analog to unsupervised translation," co-author Aidan Gomez, a machine learning researcher at the University of Toronto, told Newsweek.

In the context of AI, unsupervised learning refers to machines finding patterns in data that humans haven't labeled or paired up for them. It has been a much-discussed topic in efforts to improve translation software for platforms like Facebook and Google. Gomez, who said he hopes such companies pick up his research and develop it into an applied form, set out from the start to improve unsupervised translation of human language.

"The algorithm we proposed is really general to just any two pairs of text," Gomez said. "It doesn't need to be plain text and a cipher; it could be English and French."

The type of algorithm the researchers created is known as a generative adversarial network, or GAN. It can learn to map back and forth between two bodies of text that are not matched up line by line, whether they are written in a human language like English or in cipher code.
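The researchers' GAN is too involved for a short snippet, but the core idea of cracking a code without any matched plaintext and ciphertext can be illustrated with a much older and simpler technique: frequency analysis. The sketch below is not the researchers' method. It tries every Caesar shift and keeps the one whose letters look most like English, using only general knowledge of the target language. The frequency table is approximate, and the names `english_score` and `crack_caesar` are invented for this example.

```python
import math
import string

ALPHABET = string.ascii_lowercase

# Approximate letter frequencies in English text, in percent, for a through z.
# These are rough reference values, assumed here for illustration.
ENGLISH_FREQ = [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03, 2.41,
    6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07,
]


def caesar(text, shift):
    """Shift every lowercase letter by `shift` places; other characters pass through."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26] if ch in ALPHABET else ch
        for ch in text.lower()
    )


def english_score(text):
    """Sum of log-probabilities of the letters in `text` under the English table.

    Higher (closer to zero) means the letter mix looks more like English.
    """
    return sum(
        math.log(ENGLISH_FREQ[ALPHABET.index(ch)] / 100)
        for ch in text
        if ch in ALPHABET
    )


def crack_caesar(ciphertext):
    """Try all 26 shifts and return (shift, plaintext) for the best-scoring guess."""
    best = max(range(26), key=lambda s: english_score(caesar(ciphertext, -s)))
    return best, caesar(ciphertext, -best)


message = (
    "the reason unsupervised translation is so important is that "
    "there are so many languages with very little data"
)
secret = caesar(message, 11)
shift, recovered = crack_caesar(secret)
print(shift, recovered)
```

The cracker never sees a matched pair of plaintext and ciphertext. It relies only on what English tends to look like, which is loosely the situation an unsupervised translator is in.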

Most language translation software currently works from paired texts: a computer learns what words mean in another language by using existing translations as a reference point. That adds the extra step of using the existing translation as a sort of middleman, and not all languages have accurate, pre-existing translations into every other language.

"The reason unsupervised translation is so important is that there are so many languages with very little data, very few paired text examples," Gomez said. "So in cases where you don't have input-output pairs between something like English and some small language, you don't have any training data. Unsupervised translation gets around that."