
HCII Ph.D. Thesis Proposal: Julia Cambre

Speaker
Julia Cambre

Where
GHC 4405

Description

Thesis Proposal: Julia Cambre
"Designing for Voice in Context"

Thesis Committee:
Chinmay Kulkarni (Chair), Emory University
Jeff Bigham, Carnegie Mellon University
Nik Martelaro, Carnegie Mellon University
Amy Ogan, Carnegie Mellon University
Andrés Monroy-Hernández, Princeton University

Abstract:
Voice interfaces are all around us. While they hold the potential to offer hands-free convenience and a natural way of interacting with our devices, the voice interfaces that are most common today---voice assistants such as Alexa, Siri, and the Google Assistant---fall short of that promise. Users make use of only a small fraction of the functionality these assistants offer, and many struggle with more fundamental challenges such as speech recognition failures and unmet expectations of the assistants' capabilities, expectations raised by their human-like personalities and voices. I claim that these shortcomings are in part a consequence of the largely "one size fits all" approach seen in many voice-enabled technologies today, where the same voice assistant is deployed across a wide range of contexts with minimal tailoring. In practice, embedding a single assistant in a variety of devices and interaction scenarios can leave it without a solid understanding of the complex physical and social environments in which it is used.

As an alternative to this "one size fits all" approach, my thesis work advocates for designing voice interfaces that are more deeply situated in the context of interaction. I approach this through a range of explorations in the voice space: systems-based research and user evaluations in which I prototype and test voice interfaces in novel contexts, including a biology wet lab and a large-scale deployment within the Firefox web browser, as well as design-based research in which I explore possible futures for voice interfaces across domains such as the home, workplace, and transit. I also introduce a preliminary framework for designing a voice for a smart device based on user, device, and context factors, which will serve as the basis for a more complete framework for contextual voice design as I complete my thesis work. Together, the goals of and findings from these studies point to the need for a tighter coupling between users' mental models of what a voice interface is capable of (e.g., based on the task, physical surroundings, and data sources) and its actual functionality; contextually designed assistants can scope users' expectations of the interface's capabilities more appropriately and thereby yield more successful interactions.

As proposed work, I aim to build upon the systems and studies I have completed thus far by considering how a voice interface might adapt as the context of interaction changes. To accomplish this, I have built a prototype voice assistant, Luca, implemented as an iOS app with GPT as its core source of knowledge. Motivated by a hypothesis about what context information users believe a voice assistant already leverages to answer their queries, Luca draws on context signals from the user's device (e.g., location, time of day) to contextualize the assistant's responses. Through a comparative evaluation study, a field deployment of Luca, and reflection on the process of iteratively developing an assistant that combines "live" real-world context with a large language model, I plan to refine my framework for contextual voice design and contribute best practices for what voice interfaces should know about their context to provide more accurate, relevant results and a better user experience.