Sunday, November 30, 2008

The Future of Human-Computer Interfaces

Recently, I have been giving some thought to what the future of human-computer interfaces will look like. This article is a compilation of ideas that I have either read about or thought up over the past few weeks. I think Moore’s Law will continue to play a pivotal role in dictating the direction of interface design: smaller and faster circuitry, allowing ever greater processing power, storage capacity, and design complexity, will give interface designers and manufacturers an increasing degree of flexibility.

Let us first consider the ways in which a human can physically interact with a computer. (I have not yet thought about how accessibility technologies will develop for users with disabilities, and I have not yet researched brain-computer interfaces, so I leave both aside here.)

To provide input to a computer, a human can use touch, gestures, and speech. To receive output from a computer, a human can use sight, hearing, and touch. Note that the order in which I list these functions is important; this should become apparent as the article progresses.

The goal is to make human-computer interaction as efficient as possible, potentially even more efficient than human-human interaction. I think ‘text’ is going to stay around for a long time. The average human (let us call her Jane) cannot always communicate complex ideas consistently using only speech and gestures on the sending side and hearing and sight on the receiving side. Of course, the reverse is also true: text is not always the most efficient means of communication either. Therefore, the input and output of textual information needs to be as efficient as possible and balanced effectively with sensory interaction. The challenge is to communicate the maximum amount of information, as accurately as possible, in the least amount of time, using the minimum amount of energy.
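
One way to make this trade-off concrete is to treat it as a single score to maximize. The following toy sketch is entirely my own framing; the function, its inputs, and the numbers are invented purely to illustrate the balance described above:

    # A toy "efficiency" score for an interface: more accurate
    # information per unit of time and effort is better. All inputs
    # here are hypothetical placeholders, not measured values.
    def interface_efficiency(bits, accuracy, seconds, joules):
        return (bits * accuracy) / (seconds * joules)

    # e.g., entering a 280-bit message with 99% accuracy
    # in 15 seconds at 2 joules of effort:
    print(interface_efficiency(280, 0.99, 15, 2))  # ~9.24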

The conventional keyboard is a fantastic input interface for text. It allows Jane to input textual information more quickly, cleanly, and consistently than handwriting or speech, mainly thanks to the remarkable dexterity of her fingers. I therefore suspect that no matter how advanced handwriting and speech recognition technologies become, the keyboard will remain the more efficient input interface. Advances in keyboard design will probably amount to improved ergonomics; for example, touch-sensitive surfaces will probably replace the traditional key-press design. Someone might even popularize a layout more ergonomic than the standard QWERTY layout on English keyboards (alternatives such as Dvorak already exist, though none has displaced QWERTY).

The keyboard is also good at certain forms of non-textual input. For example, keyboard shortcuts let you perform common tasks without moving your fingers away from the keyboard while typing. Therefore, as long as the keyboard remains in favor, keyboard shortcuts will tag along.
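
Under the hood, a shortcut system is little more than a lookup from key chords to actions. Here is a minimal sketch; the chords and handlers below are hypothetical and not taken from any particular toolkit:

    # Map (modifier, key) chords to actions; purely illustrative.
    def save_document():
        print("Saving...")

    def undo_last_edit():
        print("Undoing...")

    SHORTCUTS = {
        ("ctrl", "s"): save_document,
        ("ctrl", "z"): undo_last_edit,
    }

    def handle_chord(modifier, key):
        # Run the bound action, if any; otherwise ignore the chord.
        action = SHORTCUTS.get((modifier, key))
        if action:
            action()

    handle_chord("ctrl", "s")  # -> Saving...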

Even though textual input can be used to command and control a computer (the command-line interface), it is obviously impractical for Jane. A graphical user interface (GUI) has the potential to make life easy for Jane, but it can also do just the opposite. GUI design is still a young field, and we will continue to see unconventional designs that improve human-computer interaction. The mouse (including the touchpad, pointing stick, and trackball) will become obsolete and give way to more efficient interfaces, like the touchscreen (already available on phones and tablets). In this context, touch is still a much more powerful input method than gestures or speech; it allows Jane to interact more accurately with her computer.

Imagine trying to issue speech commands to a computer in an office environment. Even if speech recognition technology attains perfection, the impracticality of speaking aloud in such settings outweighs the benefits. Gestures are much more convenient and discreet, and will probably be a common input technique in the future. However, gestures are not as accurate as touch. Still, we might commonly use interfaces that sit halfway between the two. For example, a projection keyboard (an existing technology) optically projects a keyboard layout onto a surface; finger movement within the projected area is captured and translated into key presses.
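
The translation step is conceptually simple: once the camera reports where a fingertip landed within the projected area, the position just needs to be mapped onto the key grid. A rough sketch follows; the layout, key dimensions, and coordinates are all made up for illustration:

    # Map a fingertip position inside the projected area to a key.
    # Two toy rows of keys, each key 40 mm wide and 20 mm tall.
    KEY_WIDTH, KEY_HEIGHT = 40, 20
    LAYOUT = [
        ["q", "w", "e", "r", "t"],
        ["a", "s", "d", "f", "g"],
    ]

    def coordinate_to_key(x, y):
        # (x, y) is measured in mm from the projection's top-left corner.
        col, row = int(x // KEY_WIDTH), int(y // KEY_HEIGHT)
        if 0 <= row < len(LAYOUT) and 0 <= col < len(LAYOUT[row]):
            return LAYOUT[row][col]
        return None  # fingertip landed outside the projected keys

    print(coordinate_to_key(85, 25))  # third key of the second row: 'd'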

Of course, input only makes up one side of the equation. Of the five human senses (sight, hearing, touch, smell, taste), sight is the most powerful in terms of processing capacity. Therefore, visual output is the most efficient way a computer can communicate information to a human. Hearing is required for at least a basic multimedia experience and casual UI feedback. Touch, or tactile feedback, will probably be used for nothing more than minor sensory feedback enhancement, if at all, in mainstream products.

Today, a computer is not just a business machine; it is also an entertainment hub, so display technologies will continue to improve. With the recent introduction of affordable pocket-sized multimedia projectors, it is easy to imagine solid displays giving way to projected displays, with holographic (3D) projectors eventually taking over from their 2D ancestors. Because non-surface displays push input towards gestures, while touch remains the more accurate input method, at some point the two needs will have to be served by a split interface: a touchscreen for operating the computer and a holographic projector for playing multimedia like videos, games, etc.

I have tried to imagine what a future device with such interfaces could look like. A few years back, I saw a couple of images that showed what looked like something straight out of a Bond movie: a set of pens that, together, formed the most portable computer imaginable. A quick Google search refreshed my memory: http://www.todaysgizmos.com/computer/pen-size-computers/. What my imagination came up with today takes this a step further. Of course, I feel silly trying to predict how long it will take technology to get to this point, but 10-15 years sounds plausible to me.

Imagine a device the size of today's smartphones that fits easily in your palm. It is a computer, cell phone, camera, and TV all in one, running a full-fledged desktop operating system. It has a projector-camera pair on the front face: the projector casts the display onto your desk, and the camera picks up your finger movements on the display, so it works as if you were using a touchscreen. The display can extend into a virtual keyboard when you need to type, and the computer recognizes hand gestures for use with applications, games, etc. On the back face there is a holographic projector, along with stereo speakers on the side faces; together these create a completely immersive 3D environment and multimedia experience for movies and games. The top face is a touchscreen for minimal operation of the computer (for example, to use the phone) without having to use the projector.

The device connects seamlessly to local wireless access points and cellular networks for unlimited Internet access. It can receive digital radio and TV broadcasts from local stations and satellites, and it is always location-aware, using GPS or similar technology. Processing power and storage capacity are virtually unlimited for common usage scenarios, and battery life is measured in days, if not weeks or months.

However far-fetched such technology might sound today, it is worth imagining. It is as if, in 1990, someone had described to me, in detail, a future gadget called the ‘iPhone’.