Father of the Internet Worries Our Digital History Is Disappearing

Vint Cerf, far left, receiving the inaugural Queen Elizabeth Prize for Engineering, at Buckingham Palace in 2013. Lewis Whyld/Pool/Reuters

Vint Cerf worries the digital history of our age is writ on water. Cerf built the internet from its roots with the United States' Defense Advanced Research Projects Agency (DARPA) and stayed with it, working for the past 10 years for Google as “chief internet evangelist.” Today he’s deeply concerned about the record we will leave to future generations about our time.

“If 100 years from now the digital picture of our society is not accessible, we will be an enigma to the 22nd century,” Cerf warned from the Decentralized Web Summit, a conference dedicated to dreaming up the next version of the web. “I’m very concerned that digital content will be less and less accessible, not because we can't find the bits, but because we don't know what the bits mean.”

Cerf is known for his visions, for his creation and for his three-piece suits, which stand out in a crowd of khaki-and-jeans programmers. Dressing up, he says, is a habit he picked up in Washington, where he was once complimented as the “best dressed DARPA guy” after delivering congressional testimony in seersucker. After that, the suits stayed.

If the web doesn’t feel like a fragile thing, try to access a website from 10 years ago, or even five. Not only do links stop working as servers get upgraded and moved, but whole domains change hands. The other side of the fluidity and speed of moving information on the web is that it doesn’t last.

So where will that leave the historians of the future? Or our scientists? It’s not just that our CDs fall apart with age and our hard drives stop spinning. Information is tied, in many ways, to the software you use to open it. Cerf puts the problem this way:

“We have to find a technical means for preserving the physical bits, so we have to have media that will hold on to the bits. We need to be ready for that to deteriorate and copy those bits over, over time. And we need to make sure we can describe what those bits are, and if we succeed in reading them 100 years from now we need some idea of what we read, the so-called metadata. And we may need to preserve the software, which means we need to archive the software. And that software ran on some OS and some hardware so we need to preserve a description of the OS and hardware. That’s the tech side. That’s easy.”
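To make the layers Cerf lists concrete, here is a minimal sketch in Python of a record that keeps a checksum of the bits alongside a description of what they are and what is needed to read them. The function names, field names and sample values are illustrative assumptions, not an established archival format or anything Cerf described in detail.

```python
import hashlib
import json


def build_manifest(data: bytes, description: str, media_type: str,
                   software: str, operating_system: str, hardware: str) -> dict:
    """Bundle a checksum of the bits with a description of how to read them.

    All field names are hypothetical, chosen only to mirror the layers in
    the quote: bits, metadata, software, OS and hardware.
    """
    return {
        "sha256": hashlib.sha256(data).hexdigest(),  # fingerprint of the bits
        "size_bytes": len(data),
        "description": description,                  # what the bits mean
        "media_type": media_type,
        "software": software,                        # program that opens them
        "operating_system": operating_system,        # what that program ran on
        "hardware": hardware,
    }


def bits_intact(data: bytes, manifest: dict) -> bool:
    """Return True if a copy still matches the checksum recorded earlier."""
    return hashlib.sha256(data).hexdigest() == manifest["sha256"]


# Placeholder bytes standing in for a real file; every value here is made up.
slides = b"example bytes standing in for a presentation file"
manifest = build_manifest(
    slides,
    description="Slides for a talk on digital preservation",
    media_type="application/octet-stream",
    software="hypothetical slide viewer 1.0",
    operating_system="hypothetical OS 10",
    hardware="generic 64-bit desktop",
)
print(json.dumps(manifest, indent=2))
```

Each time the bits are copied to fresh media, the copy can be checked against the stored checksum with `bits_intact`, and the remaining fields tell a future reader what they are looking at.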

The hard part, according to Cerf, lies in the legal ramifications. Not only is copyright on the media and information an issue, but so is the right to use old software to view it. “And there’s the question of how do you pay for content and software over long periods of time.”

“We are depending right now on a medium whose longevity is uncertain, and there are consequences to that. The sad story is that we don't always anticipate those consequences until the bad thing happens and then we have no backup,” Cerf says.

When it comes to keeping our history and our scientific legacy, Cerf is passionate. He says that many of our greatest historical treasures, like the 2,600-year-old cuneiform clay tablets from the Library of Ashurbanipal, once located in what is now Iraq, were preserved by accident, when a fire baked them into more durable pottery. The same fire, presumably, destroyed everything kept on wood, paper or papyrus.

He brings up the Archimedes Palimpsest as another example of accidental historical preservation. In that case a church scribe scraped clean a priceless, one-of-a-kind manuscript of the ancient Greek mathematician's work so he could write a liturgical guide over it. “Someone found the document around 1900 that there was this barely discernible Greek in there, then it was lost again 1988 and found in a Frenchman's attic. Then a friend bought it,” Cerf says without elaborating. (The buyer, who lent the document to the Walters Art Museum in Baltimore for study, has never been publicly identified.) Archimedes' text was eventually recovered with the help of modern science. But Cerf sees this all as a cautionary tale.

“We shouldn't set about preserving human history by accident,” he says.

But the problems aren’t insurmountable. A lot of things work in favor of us being able to save our age forever.

Storage is getting cheaper at an exponential rate. Cerf says he did the math on a 3-terabyte hard drive he bought this year for $150 and estimated it would have cost him $300 million in 1979, when he had a 10-megabyte drive for his Apple II.
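Cerf's estimate can be unpacked with a few lines of arithmetic. The sketch below, in Python, uses only the figures quoted above and assumes decimal units (1 megabyte = 10^6 bytes, 1 terabyte = 10^12 bytes). The per-drive price it derives for 1979 is implied by those figures, not a number Cerf gave.

```python
MEGABYTE = 10**6   # assumption: decimal units
TERABYTE = 10**12

new_drive_bytes = 3 * TERABYTE    # the drive bought this year
new_drive_price = 150             # dollars
old_drive_bytes = 10 * MEGABYTE   # the 1979 drive
quoted_1979_cost = 300_000_000    # Cerf's estimate, in dollars

# How many 10-megabyte drives it takes to hold 3 terabytes.
drives_needed = new_drive_bytes // old_drive_bytes

# Price per 1979 drive implied by the $300 million estimate.
implied_old_price = quoted_1979_cost / drives_needed

# How many times more the same capacity would have cost in 1979.
cost_ratio = quoted_1979_cost / new_drive_price

print(f"{drives_needed:,} drives of 10 MB each")
print(f"${implied_old_price:,.0f} implied price per 1979 drive")
print(f"{cost_ratio:,.0f} times the price for the same capacity")
```

On these numbers the estimate works out to 300,000 old drives at an implied $1,000 apiece, or two million times the price of the modern drive for the same capacity.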

And a lot of our storage is now distributed, so losing a single copy doesn’t mean the information is gone.

Another part of the solution is creating something Cerf calls the “self-archiving web,” in which websites would be published along with accessible versions of their pages as they stood at different times, so that links and content could live forever. He sees some hope in Digital Object Identifiers, which help create links that last for academic papers and scientific journals.

Cerf has been studying institutions that have been good at preserving information, and points to the Catholic Church as one keeper of information through the Dark Ages. Cerf says the Islamic world was another.

All of this talk may seem to concern a time frame beyond human worry, but consider that Cerf had trouble making a “tiff” file work with his PowerPoint for his speech on preservation. The digital world is already falling apart. Enjoy this article while you can.