Like a Superhero
Clad in a comfy black T-shirt and jeans, Blaise Aguera y Arcas stood onstage at the Technology, Entertainment, Design Conference in Monterey, California, early last year in front of a projection screen that ran the height of an entire wall. On the screen was a tapestry of hundreds of high-quality digital documents and photos; one image, a scan of an ancient map of the world, had more data than most hard drives. Ordinarily, interacting with, and making sense of, such huge amounts of data would be tedious, if not impossible. Anyone who's tried to work with a 10-megapixel photo on a laptop screen knows this. But the new program Aguera y Arcas was demoing, dubbed Seadragon, made every transition seamless and lightning fast. The audience saw the patchwork quilt of images become a single shot of the façade of Notre Dame cathedral and, in another instant, a close-up of a gargoyle's tooth.
Seadragon, a technology acquired by Microsoft in 2006, is a baby step toward addressing a problem that many techies have pondered for decades: computing, for all its transformative effects on economy and culture, is in a very fundamental way stuck in neutral. Anybody who's sat in front of a computer screen searching and scrolling through reams of information has had an inkling that something is amiss. Google may give us 2 million hits, but we rarely look past the first 10. Most Web sites are designed for a typical PC, though we also view them on cell phones and television sets, and the transfer is far from perfect. If the Internet is such a boon, why is sifting through all the information it brings to our fingertips so laborious?
The Internet, it seems, doesn't take advantage of how humans best process information. Evolution granted Homo sapiens a high degree of visual acuity—all the better to pick out camouflaged predators on the savanna—and despite the progress of civilization, we're still highly visual creatures. "Humans are best at scanning over a fixed field and finding what they want," says design guru Edward Tufte, whose books on visual display have influenced generations of designers. Finding a jar of honey in the kitchen cupboard is a simple task—you have an intuitive sense of where it rests in your house and how to access it. Translated to the computing world, the process becomes more deliberate: click the "house" folder, navigate to the "kitchen" subdirectory, find another folder dubbed "cupboards"—and so on. "If you just think about everything in your house, and all the places you know in the world," says Aguera y Arcas, "you have a much richer mental map of all those things than you have of where your files are in your computer." Scrolling and linking are inferior modes of taking in information. "Humans are incredibly good at spatial navigation and incredibly bad at navigating through a list of generic icons or generic text," says John San Giovanni, a former Microsoft researcher now developing zoom technology for a phone interface at start-up ZenZui.
These limitations are not lost on the technology giants and forward-thinking entrepreneurs working to commercialize a new way to take in information visually: the zoom interface. In its simplest form, it displays information all at once—all the photos in an album, say, or all the files on a PC, or all the entries in a database, or all the items retrieved in a search—and when you spot something of interest, you zoom down into it. In this way, zooming represents an upgrade from the second- and third-best methods for accessing information (scrolling and linking) to the best option: displaying information like a landscape, and giving people the chance to zoom down to the details.
The basic idea of zooming has tantalized techies since the early days of the Internet, and still inspires those who believe it has the potential to take human-computer interaction to a new, more productive level. Only recently have engineers had the advances in display technology, broadband connections and video processors capable of coping with a zoom interface. As a result, prototype zoom interfaces are now up and running in labs around the world. Microsoft is looking to zoom technologies to catapult it once again into a position of technology leadership, but Google is also developing zoom technology in a big way. Google already employs a form of zoom in Google Earth, and Apple does the same in the iPhone. A few start-ups like ZenZui and Hillcrest Labs are also bringing zoom into new markets, including a broader swathe of mobile devices and even your home television.
Google Earth displays the globe as a landscape ready to be explored—click the mouse and you zoom, like a superhero, into and out of the Aegean Sea, Illinois or Sudan. "You don't necessarily need to go to an index or keyword search to find something about a place," says John Hanke, director of Google Maps and Earth. "There's real joy in being able to experience the world that way."
Aside from being pleasurable, it's also more efficient. In most other (non-zoom) map programs, such as Yahoo or Google Maps, the screen refreshes itself as you enlarge your target, meaning you must reorient yourself at each step—the move from country- to city- to building-level isn't fluid. It happens quickly, but the brain loses its bearings over and over again, and has to readjust. On the other hand, "seamless zoom allows you to easily comprehend the spatial scale and relationship," Hanke says. Immediately and intuitively, you comprehend the distance between Mumbai and Hong Kong, or the size of the Eiffel Tower relative to a skyscraper.
Although the idea sounds uniquely suited to larger screens that can display many items at once, a zooming interface would mean that any Web page or document could fit comfortably on even the smallest screens. Today, most mobile devices strip down a Web page into its simplest elements, mostly text and a smattering of photos—there's no attempt to replicate the real page. A zoom system could display the whole shebang, even on a Nokia, to give you a sense of the full page before you magnify the text to a readable level. On traditional screens it would make information more accessible. Instead of remembering a complicated file path to find your virtual jar of honey, you could easily scan your screen for a graphical representation, then zoom down into it. Organizing information spatially makes it easier to remember. (You don't have to memorize every street name on your morning commute.) Search results, mapping, Web browsing—all these commonplace and essential tasks could potentially be done better by making them more visual and adding a bit of zoom to them.
Apple was the first to discover mobile devices as a venue for zooming interfaces. When you open a page on the iPhone, you see it in its entirety, which gives you a sense of how all the elements fit together. By sliding open two fingers on the screen, you zoom into any element, magnifying text to a readable level or looking more closely at a picture. As a result, iPhone users aren't scrolling through a seemingly endless column of links and text bits, as users of other mobile devices have to do.
The power of this idea isn't lost on other cell-phone developers. Microsoft, for instance, is developing its own zooming Web browser for mobile devices, called Deepfish. It operates a lot like the iPhone, displaying sites exactly as they would appear on a larger screen, and allowing you to drill down into the details, albeit with a joystick instead of your fingers. (A prototype was released in the first half of 2007.) ZenZui, the phone interface being developed by San Giovanni, the former Microsoft researcher, uses zooming to navigate between colorful tiles that display the weather, stock quotes and other data.
On larger screens, which hold more information, a zoom interface would boost productivity. By making better use of human spatial memory, it plays to our strengths, bringing "the full power of your visual system to bear on processing information," says Aguera y Arcas. Users can handle more tasks at once. With zoom, instead of struggling to manage a half-dozen open browser windows, our brains might be able to keep track of 50 to 100 items, roughly a tenfold improvement.
Aguera y Arcas started working on the zoom idea as a graduate student at Princeton, and in 2003 founded a company, then named Sand Codex (later changed to Seadragon). Microsoft, well aware of the potential of zoom interfaces, acquired the company in 2006. Seadragon is an application for interacting with images, graphics and documents. Most images are encoded in reading order, from left to right, meaning you have to download a picture in its entirety before you can make sense of it, even if you want just a small piece. Seadragon, on the other hand, begins drawing the whole image at once, but it fetches only the data needed for the part, and the level of detail, that you're viewing at any given moment. (If you're staring at a picnic scene, for instance, there's no reason to download the data that reveal the ants crawling across the blanket unless you zoom in close enough to see them.) This makes for speedy downloads, even with huge images, and allows for seamless zooming. It's a digital librarian's dream.
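The article does not describe Seadragon's internals, but one common way to get this "download only what you're looking at" behavior is a multi-resolution tile pyramid: the image is stored at many scales, each cut into small tiles, and the viewer requests just the tiles that cover the current view at the coarsest scale that still fills the screen. The Python sketch below illustrates that general idea only. The 256-pixel tile size and the function names `choose_level` and `visible_tiles` are assumptions made for illustration, not anything taken from Seadragon.

```python
import math

TILE = 256  # assumed tile edge in pixels (illustrative choice)


def choose_level(region_w, screen_w, max_level):
    """Pick the coarsest pyramid level that still gives at least one
    image pixel per screen pixel for the region being viewed."""
    level = max_level + math.ceil(math.log2(screen_w / region_w))
    return max(0, min(max_level, level))


def visible_tiles(region, screen_w, img_w, img_h):
    """Return (level, [(col, row), ...]) for the tiles covering `region`.

    `region` is (x0, y0, x1, y1) in full-resolution image pixels and is
    assumed to lie inside the image with x1 > x0 and y1 > y0.
    """
    x0, y0, x1, y1 = region
    # Top level holds the full-resolution image; each level below halves it.
    max_level = math.ceil(math.log2(max(img_w, img_h)))
    level = choose_level(x1 - x0, screen_w, max_level)
    scale = 2.0 ** (level - max_level)

    # Convert the region to this level's pixel grid, then to tile indices.
    c0 = int(x0 * scale) // TILE
    r0 = int(y0 * scale) // TILE
    c1 = (math.ceil(x1 * scale) - 1) // TILE
    r1 = (math.ceil(y1 * scale) - 1) // TILE
    tiles = [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
    return level, tiles


if __name__ == "__main__":
    # A hypothetical 65,536 x 65,536 pixel scan shown on a 1,024-pixel-wide screen.
    whole = visible_tiles((0, 0, 65536, 65536), 1024, 65536, 65536)
    close = visible_tiles((0, 0, 1024, 1024), 1024, 65536, 65536)
    print(whole[0], len(whole[1]))
    print(close[0], len(close[1]))
```

In this hypothetical case, both the full view and the extreme close-up need only 16 tiles, even though the full-resolution image contains 65,536 of them. That is why the cost of a view depends on the size of the screen rather than the size of the image.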
More broadly, Seadragon could become the basis for a beautiful Web browser. Cheap digital cameras now take eight- to 10-megapixel pictures, but such big images don't fit on most screens and file sizes are too large to download quickly. On a Seadragon-based browser, search results could show a small replica of each Web page, for instantaneous judgment. Instead of bouncing back and forth between tabs, you would have all your active windows visible at once, putting your speedy spatial memory to work on organizing the information. Surfing the Internet would become a more visual experience.
No one outside Microsoft knows for sure how zoom fits into the firm's future; its zoom-related plans are locked up tighter than Guantánamo. When asked about the next steps for Seadragon, Aguera y Arcas replied, "That I'm not really prepared to answer." Yet Microsoft's embrace of zooming seems to be more ambitious than that of its competitors. Although companies don't reveal their investments in this area, Microsoft has almost quadrupled the number of employees working on Seadragon since the acquisition, to nearly 40.
Television companies, too, are looking at zoom to help viewers navigate the thousands of programs now on offer. Netflix and premium television have ushered in the slow death of the neighborhood video store, but many people aren't happy about that: they miss browsing the walls for new releases and old comedy favorites. Dan Simpkins believes there's value in that experience, where "you walk the aisles of the new releases, you see a visual directory of all the covers, and when you find something that strikes your fancy, you pick up the box and turn it over and read." Simpkins's company, Hillcrest Labs, is developing a zooming TV interface that makes TV browsing more visual.
Ian Sobieski, a managing director of the venture-capital firm Band of Angels, an early investor in Seadragon, cautions that zooming technologies are still in their earliest stages. "The initial applications probably are not going to strike anyone as immediate productivity enhancers," he says. "But where we are with zoom interactions is where palm computing was in 1982. An interface that incorporates this kind of zoom capability will be commonplace in the not-too-distant future."
Not everybody thinks that zoom technology will be revolutionary. For accessing the hundreds of billions of Web sites out there, text-based searching and keyword tagging might still prove best. Also, applications that are very text-heavy, like a Word document or some Web pages, become unintelligible when rendered small. Zooming "scales up the amount of information that a person can work with at a time by maybe a factor of a hundred," says Ben Bederson, a computer scientist at the University of Maryland, "but I have a hundred thousand documents on my computer"—and the Web has billions more. Even our refined visual memories can't manage such a flood. Instead, Bederson thinks that zooming will be a niche application, primarily for photos and cell phones.
Aguera y Arcas and other zoom believers disagree. "I feel a bit like this is a necessity, in the same sense that a color TV after black-and-white was a necessity," he says. "You see it and it's obvious that it has to go that way." In a sense, the Internet as we now know it may come to be seen as a brief step sideways in the evolution of information technology. When a new technology arrives on the scene, people try to apply old standards; the first automakers, for instance, used boat-style tillers instead of wheels to steer cars. The idea of scrolling information across our screens is a bit like that: it's essentially a throwback to "the pre-codex way of representing text in physical scrolls," says Aguera y Arcas, and hypertext linking is similarly dated. With zoom interfaces poised for a breakthrough, the Internet might finally get a steering wheel.