The Revolution Will Not Be Handheld

Invited Essay

pp. 92-94


For those working in UX through the past several years, the shift from desktop to mobile has seemed a major event. No longer are our devices clearly situated. Instead they travel with us. “Technology is now an appendage—always available in every moment of time, anywhere” (Holtzblatt & Beyer, 2017, p. 7).

The shift has forced changes to the way we design. We must cater for shallower engagement, support tasks across multiple devices, pare down UIs for smaller screens, and support touch-based manipulation.

We have also needed different research methods. Because we cannot necessarily attend the location at which an interaction takes place, we must use a raft of techniques to explore the contexts in which our products or services will be used. We have to be more literally “in the field.”

Taking a longer view, however, one can regard these changes as part of the same half-century trajectory from the visual display terminal to the handheld device, with the common factor being the undisputed primacy of the screen.

The portable screen provides us with an immersive and seductive world, and it’s perhaps hardly surprising that we UX practitioners have been equally seduced. And of course there is a bottom-line driver in that many of us work for organizations that deliver apps, services, and various capabilities primarily or exclusively within the screen-based ecosphere.

However, changes are occurring that seem likely to challenge the hegemony of the handheld device. Perhaps it’s a good time to look up from our screens and take in our surroundings.


One clear trend is the rapid growth of voice-based systems (after many years of “coming soon” promises). Increasingly, people are conversing in real, spoken language with Alexa or Siri or Google, and giving spoken directions to their Xboxes and smart TVs. And in the process, they may be developing deeper or, at least, alternative relationships with their devices and digital avatars. Few would think to type (or swipe) “thank you” on their phone or tablet, but observations indicate that in spoken exchanges this is a common and desired (and currently ignored) “end of conversation” signal. (For example, see the Reddit conversation on thanking Alexa.) This is no surprise. Reeves and Nass (1996) pointed out that people have a natural tendency to treat machines as if they were “real” people, and to converse with them accordingly.

This is not to suggest that voice will supplant the screen. Indeed, it is difficult to imagine any voice-based system providing the degree of immersion that a video or video game can provide. In James Joyce’s Ulysses, Stephen Dedalus walks on a beach considering the “ineluctable modality of the visible,” which he conceptualizes as “thought through my eyes. Signature of all things I am here to read.” The screen is undoubtedly for many people the best, or at least most efficient, way to tell a story or quickly explain a concept. Whether the screen is part of a mobile phone, a virtual reality headset, or a more esoteric device is immaterial.

However, using a mobile screen-based device to carry out a moderately complex task (let us say rescheduling an appointment from 10 a.m. today to 1 p.m. tomorrow with an associated message to attendees) may be downright clumsy when compared with a voice-based alternative: “Change my 10 a.m. to 1 p.m. tomorrow, and tell everyone it’s because John can’t make it today.” This may be marginally beyond today’s capabilities, but surely not beyond tomorrow’s.

Coupling speech systems with a moderately capable intelligent agent (as in the previous example) opens up the potential to shift many tasks that are currently screen-based to an alternative medium. Not all tasks can be shifted; after all, speech is sequential, whereas a screen provides literally a snapshot or a gestalt. When Stephen Dedalus closes his eyes, he begins to count his strides: “Five, six: the nacheinander. Exactly: and that is the ineluctable modality of the audible.” The German word nacheinander (translation: successively) neatly sums up the essential consecutiveness of a speech-based UI.

Speech recognition may also be particularly well-suited to short simple queries. Saying “OK Google, What’s the population of Finland?” is far less effortful than typing the same query.

Those of us currently in the business of designing for the screen might be well advised to start putting some thought and effort into learning about how speech UIs might work. The book Practical Speech User Interface Design by James Lewis (2010) is a good resource.

Internet of Things

The much-touted Internet of Things (IoT) presents another interesting opportunity to look beyond the screen.

While an app or portal may be a necessary gateway to set up such devices, it may not be a primary (or even a significant) part of the device-in-use. IoT is not clearly defined, but it includes wearables and multiple “smart home” devices. The list of connected things seems set to grow; even my Icon bicycle lights are connected and offer the potential to be part of the “smart city” by reporting pollution and traffic monitoring data to city authorities.

Designing for such connected devices means moving well beyond the screen. What are appropriate light levels for a wall-mounted thermometer? How long after movement detection should a light become dim or switch off? If there is a conflict between adjacent devices how should it be resolved? How can we design failure modes to prevent frustrating and confusing users? If a device fails, how important is that event, and what sort of notification should (and should not) be sent? Should the operations of the device be apparent or hidden? How are we to deal with privacy and security?

The Robots Are Coming

No matter how diligent we are in keeping abreast of new and emerging technologies and the techniques that help us design them, there is another, existential threat that may blindside us. We have no reason to suppose that at least some of the work traditionally done by UX practitioners could not be carried out by artificial intelligence (AI). In particular, any practices that use patterns must of necessity be susceptible. One can certainly envisage describing a required outcome to an AI that would then generate a design based on “traditional” patterns. This is no distant vision; a recent article in New Scientist reported that researchers from Microsoft and the University of Cambridge created a system called DeepCoder that can solve problems by writing programs based on pre-existing (human-written) code (Reynolds, 2017).

Combine a similar capability with automated remote unmoderated testing, and one can readily envisage an AI that not only designs user experiences, but can improve them through testing and by applying evolutionary algorithms to approach an optimal solution. Indeed, an AI could tweak a UI or user experience to best meet the needs of an individual user, even as that user’s abilities and preferences changed over time—providing a truly personalized UX.
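To make the idea of test-driven evolutionary improvement concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any existing system: the two design parameters (a button size and a font size), the scoring function that stands in for an unmoderated test, and the target values inside it are all hypothetical. The loop is a simple (1+1) evolution strategy: mutate the current design, score the mutant, and keep it only if it does at least as well.

```python
import random


def simulated_test_score(button_px: float, font_px: float) -> float:
    """Hypothetical stand-in for an automated unmoderated test.

    Higher is better. The peak at button_px=48, font_px=16 is an
    arbitrary assumption made for this sketch, not a design guideline.
    """
    return -((button_px - 48.0) ** 2) - 4.0 * ((font_px - 16.0) ** 2)


def evolve(start, score, generations=500, step=2.0, seed=0):
    """(1+1) evolution strategy over a tuple of numeric design parameters."""
    rng = random.Random(seed)
    best = tuple(start)
    best_score = score(*best)
    for _ in range(generations):
        # Mutate every parameter with a small Gaussian perturbation.
        candidate = tuple(v + rng.gauss(0.0, step) for v in best)
        candidate_score = score(*candidate)
        # Keep the mutant only if it scores at least as well as the incumbent.
        if candidate_score >= best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


start_design = (30.0, 10.0)
best_design, best_score = evolve(start_design, simulated_test_score)
print(best_design, best_score)
```

In a real setting the scoring function would be replaced by measurements from actual test sessions, and a per-user version of the same loop is one way the personalization described above might be approached.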

If AIs can be seen as a threat to those working in design and evaluation, does this imply that those practitioners who conduct ethnographic or formative research will be unaffected? Perhaps in the near term this is true. However, as AIs become more sophisticated, it is conceivable that the need for human intervention in general could diminish, as AIs take on more and more of the workload we currently see as being primarily in the human domain, and we are “all watched over / by machines of loving grace” (Brautigan, 1967).

In the meantime, let’s look up from our phones, embrace the larger ecosystem that is opening up to us, and keep questioning not only the finer details of the products we work on, but also the rationale for their—and our—existence.


Brautigan, R. (1967). All Watched Over by Machines of Loving Grace [poem]. In All watched over by machines of loving grace. San Francisco, CA: The Communication Company. Retrieved from

Holtzblatt, K., & Beyer, H. (2017). Contextual design: Design for life (2nd ed.). Cambridge, MA: Morgan Kaufmann.

Joyce, J. (2008). Ulysses. The Project Gutenberg e-book of Ulysses. Retrieved from

Lewis, J. R. (2010). Practical speech user interface design. Boca Raton, FL: CRC Press.

Reeves, B., & Nass, C. (1996). The media equation: How people treat computers, television and new media like real people and places. Stanford, CA: CSLI Publications.

Reynolds, M. (2017, February 25). AI learns to write its own code by stealing from other programs. New Scientist (3114).