The ACM Conference on Human Factors in Computing Systems is the premier international conference on Human-Computer Interaction. The CHI Conference brings together researchers and practitioners who share the goal of making the world a better place with interactive digital technologies.

CMU work at CHI 2023

We are proud to share that Carnegie Mellon University authors from a variety of schools and disciplines contributed to more than 40 papers accepted to CHI 2023, including one Best Paper and six Honorable Mention awards, as detailed in this CMU at CHI 2023 overview.

A list of papers with CMU contributing authors is available below. (Note: this list is still in progress)

Award designations: [Best Paper Award] marks a paper that received a CHI 2023 Best Paper Award; [Honorable Mention] marks a paper that received a Best Paper Honorable Mention Award.

 

Accepted Papers

[Honorable Mention] "I Would Like to Design": Black Girls Analyzing and Ideating Fair and Accountable AI

Jaemarie Solyst, Shixian Xie, Ellia Yang, Angela E.B. Stewart, Motahhare Eslami, Jessica Hammer, Amy Ogan

Artificial intelligence (AI) literacy is especially important for those who may not be well-represented in technology design. We worked with ten Black girls in fifth and sixth grade from a predominantly Black school to understand their perceptions around fair and accountable AI and how they can have an empowered role in the creation of AI. Thematic analysis of discussions and activity artifacts from a summer camp and after-school session revealed a number of findings around how Black girls: perceive AI, primarily consider fairness as niceness and equality (but may need support considering other notions, such as equity), consider accountability, and envision a just future. We also discuss how the learners can be positioned as decision-making designers in creating AI technology, as well as how AI literacy learning experiences can be empowering.

 

“I Want to Be Unique From Other Robots”: Positioning Girls as Co-creators of Social Robots in Culturally-Responsive Computing Education   

Yinmiao Li, Jennifer Nwogu, Amanda Buddemeyer, Jaemarie Solyst, Jina Lee, Erin Walker, Amy Ogan, Angela E.B. Stewart

Robot technologies have been introduced to computing education to engage learners. This study introduces the concept of co-creation with a robot agent into culturally-responsive computing (CRC). Co-creation with computer agents has previously focused on creating external artifacts. Our work differs by making the robot agent itself the co-created product. Through participatory design activities, we positioned adolescent girls and an agentic social robot as co-creators of the robot’s identity. Taking a thematic analysis approach, we examined how girls embody the role of creator and co-creator in this space. We identified themes surrounding who has the power to make decisions, what decisions are made, and how to maintain social relationships. Our findings suggest that co-creation with robot technology is a promising implementation vehicle for realizing CRC.


A Field Test of Bandit Algorithms for Recommendations: Understanding the Validity of Assumptions on Human Preferences in Multi-armed Bandits 

Liu Leqi (CMU MLD), Giulio Zhou (CMU CSD), Fatma Kılınç-Karzan (CMU Tepper), Zachary C. Lipton (CMU MLD), Alan L. Montgomery (CMU Tepper) [Leqi and Giulio contributed to the paper equally]

Personalized recommender systems suffuse modern life, shaping what media we read and what products we consume. Algorithms powering such systems tend to consist of supervised-learning-based heuristics, such as latent factor models with a variety of heuristically chosen prediction targets. Meanwhile, theoretical treatments of recommendation frequently address the decision-theoretic nature of the problem, including the need to balance exploration and exploitation, via the multi-armed bandits (MABs) framework. However, MAB-based approaches rely heavily on assumptions about human preferences. These preference assumptions are seldom tested using human subject studies, partly due to the lack of publicly available toolkits to conduct such studies. In this work, we conduct a study with crowdworkers in a comics recommendation MABs setting. Each arm represents a comic category, and users provide feedback after each recommendation. We check the validity of core MABs assumptions—that human preferences (reward distributions) are fixed over time—and find that they do not hold. This finding suggests that any MAB algorithm used for recommender systems should account for human preference dynamics. While answering these questions, we provide a flexible experimental framework for understanding human preference dynamics and testing MABs algorithms with human users. The code for our experimental framework and the collected data can be found at https://github.com/HumainLab/human-bandit-evaluation.
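To make the multi-armed bandit setup above concrete, the sketch below simulates a minimal epsilon-greedy bandit whose arms stand in for comic categories and whose reward is a user rating collected after each recommendation. It illustrates the MAB framing only, not the authors' released toolkit (see the linked repository for that); the category names and the get_user_rating stub are hypothetical.

    import random

    # Hypothetical arm labels standing in for comic categories; the real study's
    # categories and feedback interface live in the repository linked above.
    ARMS = ["superhero", "slice-of-life", "sci-fi", "humor"]

    def get_user_rating(arm):
        """Stub for the human feedback given after each recommendation.
        In the study this comes from a crowdworker, not a random number."""
        return random.uniform(0, 1)

    def epsilon_greedy(n_rounds=100, epsilon=0.1):
        counts = {arm: 0 for arm in ARMS}
        means = {arm: 0.0 for arm in ARMS}
        for _ in range(n_rounds):
            if random.random() < epsilon:
                arm = random.choice(ARMS)                 # explore
            else:
                arm = max(ARMS, key=lambda a: means[a])   # exploit
            reward = get_user_rating(arm)
            counts[arm] += 1
            # Running-average update: it tracks preferences correctly only if
            # the reward distribution is stationary, which is the assumption
            # the paper tests with human users and finds violated.
            means[arm] += (reward - means[arm]) / counts[arm]
        return means

    print(epsilon_greedy())

The running-average update is exactly where the stationarity assumption enters: if a user's preferences drift over time, the estimate no longer reflects the current reward distribution, which is the dynamic the paper's framework is designed to surface.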


A US-UK Usability Evaluation of Consent Management Platform Cookie Consent Interface Design on Desktop and Mobile  

Elijah Bouma-Sims, Megan Li, Yanzi Lin, Adia Sakura-Lemessy, Alexandra Nisenoff, Ellie Young, Eleanor Birrell, Lorrie Faith Cranor, and Hana Habib

Websites implement cookie consent interfaces to obtain users’ permission to use non-essential cookies, as required by privacy regulations. We extend prior research evaluating the impact of interface design on cookie consent through an online behavioral experiment (n = 1359) in which we prompted mobile and desktop users from the UK and US to make cookie consent decisions using one of 14 interfaces implemented with the OneTrust consent management platform (CMP). We found significant effects on user behavior and sentiment for multiple explanatory variables, including more negative sentiment towards the consent process among UK participants and lower comprehension of interface information among mobile users. The design factor that had the largest effect on user behavior was the initial set of options displayed in the cookie banner. In addition to providing more evidence of the inadequacy of current cookie consent processes, our results have implications for website operators and CMPs.


An Augmented Knitting Machine for Operational Assistance and Guided Improvisation    

Lea Albaugh, Scott Hudson, and Lining Yao

Computational mediation can unlock access to existing creative fabrication tools. By outfitting an otherwise purely mechanical hand-operated knitting machine with lightweight sensing capabilities, we produced a system which provides immediate feedback about the state and affordances of the underlying knitting machine. We describe our technical implementation, show modular interface applications which center the particular patterning capabilities of this kind of machine knitting, and discuss user experiences with interactive hybrid computational/mechanical systems.


Augmenting scientific creativity with an analogical search engine    

Hyeonsu Kang, Xin Qian, Tom Hope, Dafna Shahaf, Joel Chan, Aniket Kittur

Analogies have been central to creative problem-solving throughout the history of science and technology. As the number of scientific articles continues to increase exponentially, there is a growing opportunity for finding diverse solutions to existing problems. However, realizing this potential requires the development of a means for searching through a large corpus that goes beyond surface matches and simple keywords. Here we contribute the first end-to-end system for analogical search on scientific articles and evaluate its effectiveness with scientists' own problems. Using a human-in-the-loop AI system as a probe we find that our system facilitates creative ideation, and that ideation success is mediated by an intermediate level of matching on the problem abstraction (i.e., high versus low). We also demonstrate a fully automated AI search engine that achieves a similar accuracy with the human-in-the-loop system. We conclude with design implications for enabling automated analogical inspiration engines to accelerate scientific innovation.


Breaking the “Inescapable” Cycle of Pain: Supporting Wheelchair Users' Upper Extremity Health Awareness and Management with Tracking Technologies

Yunzhi Li, Franklin Mingzhe Li, Patrick Carrington

Upper extremity (UE) health issues are a common concern among wheelchair users and have a large impact on their independence, social participation, and quality of life. However, despite the well-documented prevalence and negative impacts, these issues remain unresolved. Existing solutions (e.g. surgical repair, conservative treatments) often fail to promote sustained UE health improvement in wheelchair users' day-to-day lives. Recent HCI research has shown the effectiveness of health tracking technologies in supporting patients' self-care for different health conditions (e.g. chronic diseases, mental health). In this work, we explore how health tracking technologies could support wheelchair users' UE health self-care. We conducted semi-structured interviews with 12 wheelchair users and 5 therapists to understand their practices and challenges in UE health management, as well as the potential benefits of integrating health tracking technologies into self-care routines. We discuss design implications for UE health tracking technologies and outline opportunities for future investigation.


Can Voice Assistants Be Microaggressors? Cross-Race Psychological Responses to Failures of Automatic Speech Recognition   

Kimi Wenzel, Nitya Devireddy, Cam Davison, Geoff Kaufman

Language technologies have a racial bias, committing greater errors for Black users than for white users. However, little work has evaluated what effect these disparate error rates have on users themselves. The present study aims to understand if speech recognition errors in human-computer interactions may mirror the same effects as misunderstandings in interpersonal cross-race communication. In a controlled experiment (N=108), we randomly assigned Black and white participants to interact with a voice assistant pre-programmed to exhibit a high versus low error rate. Results revealed that Black participants in the high error rate condition, compared to Black participants in the low error rate condition, exhibited significantly higher levels of self-consciousness, lower levels of self-esteem and positive affect, and less favorable ratings of the technology. White participants did not exhibit this disparate pattern. We discuss design implications and the diverse research directions to which this initial study aims to contribute.


CatAlyst: Domain-Extensible Intervention for Preventing Task Procrastination Using Large Generative Models    

Riku Arakawa, Hiromu Yakura, Masataka Goto

CatAlyst uses generative models to help workers’ progress by influencing their task engagement instead of directly contributing to their task outputs. It prompts distracted workers to resume their tasks by generating a continuation of their work and presenting it as an intervention that is more context-aware than conventional (predetermined) feedback. The prompt can function by drawing their interest and lowering the hurdle for resumption even when the generated continuation is insufficient to substitute their work, while recent human-AI collaboration research aiming at work substitution depends on a stable high accuracy. This frees CatAlyst from domain-specific model-tuning and makes it applicable to various tasks. Our studies involving writing and slide-editing tasks demonstrated CatAlyst’s effectiveness in helping workers swiftly resume tasks with a lowered cognitive load. The results suggest a new form of human-AI collaboration where large generative models publicly available but imperfect for each individual domain can contribute to workers’ digital well-being.

 

Climate Coach: A Dashboard for Open-Source Maintainers to Overview Community Dynamics

Huilian Sophie Qiu, Anna Lieb, Jennifer Chou, Megan Carneal, Jasmine Mok, Emily Amspoker, Bogdan Vasilescu (CMU S3D), Laura Dabbish

Open-source software projects have become an integral part of our daily life, supporting virtually all of the software we use today. Because open-source software forms our digital infrastructure, maintaining it is of utmost importance. We present Climate Coach, a dashboard that helps open-source project maintainers monitor the health of their community in terms of team climate and inclusion. Through a literature review and an exploratory survey (N=18), we identified important signals that can reflect a project’s health and displayed them on a dashboard. We evaluated and refined our dashboard through two rounds of think-aloud studies (N=19). We then conducted a two-week longitudinal diary study (N=10) to test the usefulness of our dashboard. We found that displaying signals related to a project’s inclusion helps improve maintainers’ management strategies.


ComLittee: Literature Discovery with Personal Elected Author Committees    

Hyeonsu Kang, Nouran Soliman, Matt Latzke, Joseph Chee Chang, Jonathan Bragg

In order to help scholars understand and follow a research topic, significant research has been devoted to creating systems that help scholars discover relevant papers and authors. Recent approaches have shown the usefulness of highlighting relevant authors while scholars engage in paper discovery. However, these systems do not capture and utilize users’ evolving knowledge of authors. We reflect on the design space and introduce ComLittee, a literature discovery system that supports author-centric exploration. In contrast to paper-centric interaction in prior systems, ComLittee’s author-centric interaction supports curating research threads from individual authors, finding new authors and papers using combined signals from a paper recommender and the curated authors’ authorship graphs, and understanding them in the context of those signals. In a within-subjects experiment that compares to a paper-centric discovery system with author-highlighting, we demonstrate how ComLittee improves author and paper discovery.


EpoMemory: Multi-state Shape Memory for Programmable Morphing Interfaces    

Ke Zhong, Adriane Fernandes Minori, Di Wu, Humphrey Yang, Mohammad Islam, Lining Yao

Smart shape-changing materials can be adapted to different usages, which have been leveraged for dynamic affordances and on-demand haptic feedback in HCI. However, the applicability of these materials is often bottlenecked by their complex fabrication and the challenge of programming localized and individually addressable responses. In this work, we propose a toolkit for designing and fabricating programmable morphing objects using off-the-shelf epoxies. Our method involves varying the crosslinker to epoxy resin ratio to control morphing temperatures from 40 ℃ to 90 ℃, either across different regions of a shape memory device or across devices. Functional components (e.g., conductive fabric, magnetic particles) are also incorporated with the epoxy for sensing and active reconfiguration. A toolbox of fabrication methods and a primitive design library are introduced to support design ideation and programmable morphing. Finally, we demonstrate application examples, including morphing toys, a shape-changing input device, and an active window shutter.

 

[Honorable Mention] Exploring Challenges and Opportunities to Support Designers in Learning to Co-create with AI-based Manufacturing Design Tools

Frederic Gmeiner, Humphrey Yang, Lining Yao, Kenneth Holstein, Nikolas Martelaro

AI-based design tools are proliferating in professional software to assist engineering and industrial designers in complex manufacturing and design tasks. These tools take on more agentic roles than traditional computer-aided design tools and are often portrayed as “co-creators.” Yet, working effectively with such systems requires different skills than working with complex CAD tools alone. To date, we know little about how engineering designers learn to work with AI-based design tools. In this study, we observed trained designers as they learned to work with two AI-based tools on a realistic design task. We find that designers face many challenges in learning to effectively co-create with current systems, including challenges in understanding and adjusting AI outputs and in communicating their design goals. Based on our findings, we highlight several design opportunities to better support designer-AI co-creation.

 

Facilitating Counselor Reflective Learning with a Real-time Annotation Tool

Tianying Chen, Michael Xieyang Liu, Emily Ding, Emma O'Neil, Mansi Agarwal, Robert E Kraut, Laura Dabbish

Experiential training, where mental health professionals practice their learned skills, remains the most costly component of therapeutic training. We introduce Pin-MI, a video-call-based tool that supports experiential learning of counseling skills used in motivational interviewing (MI) through interactive role-play as client and counselor. In Pin-MI, counselors annotate, or “pin,” the important moments in their role-play sessions in real time. The pins are then used post-session to facilitate a reflective learning process, in which both client and counselor can provide feedback about what went well or poorly during each pinned moment. We discuss the design of Pin-MI and a qualitative evaluation with a set of healthcare professionals learning MI. Our evaluation suggests that Pin-MI helped users develop empathy, be more aware of their skill usage, receive immediate and targeted feedback, and correct misconceptions about their performance. We discuss implications for the design of experiential training tools for learning counseling skills.


Flat Panel Haptics: Embedded Electroosmotic Pumps for Scalable Shape Displays    

Craig Shultz, Chris Harrison

Flat touch interfaces, with or without screens, pervade the modern world. However, their haptic feedback is minimal, prompting much research into haptic and shape-changing display technologies which are self-contained, fast-acting, and offer millimeters of displacement while being only millimeters thick. We present a new, miniaturizable type of shape-changing display using embedded electroosmotic pumps (EEOPs). Our pumps, controlled and powered directly by applied voltage, are 1.5mm in thickness, and allow complete stackups under 5mm. Nonetheless, they can move their entire volume's worth of fluid in 1 second, and generate pressures of +/-50kPa, enough to create dynamic, millimeter-scale tactile features on a surface that can withstand typical interaction forces (<1N). These are the requisite technical ingredients to enable, for example, a pop-up keyboard on a flat smartphone. We experimentally quantify the mechanical and psychophysical performance of our displays and conclude with a set of example interfaces.
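As a rough plausibility check on the figures quoted above (our arithmetic, using a hypothetical feature size, not a number from the paper): a 50 kPa pump acting over a key-sized area of roughly 1 cm² can exert about 5 N, comfortably above the <1 N typical interaction force mentioned.

    # Back-of-the-envelope force estimate; the 1 cm^2 feature area is assumed.
    pressure_pa = 50_000           # +/-50 kPa, from the abstract
    area_m2 = 1e-4                 # about 1 cm^2, hypothetical tactile feature
    force_n = pressure_pa * area_m2
    print(force_n)                 # 5.0 N, versus <1 N typical touch forces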


Fluidic Computation Kit: Towards Electronic-free Shape-changing Interfaces    

Qiuyu Lu, Haiqing Xu, Yijie Guo, Joey Yu Wang, Lining Yao

Although fluidic computation has been utilized to develop interactive devices in the field of Human-Computer Interaction (HCI), the limited computation complexity of previous work hinders the exploration of richer interaction modalities. Based on the Fluidic Computation Kit we developed, this paper explores how unconventional mechanical computing can be leveraged to design shape-changing interfaces that integrate input sensing, output, and complex computation. After introducing the design space enabled by the Kit, we explain how to design four types of elementary computational components and six categories of operators. We end by providing several application scenarios which illustrate the Fluidic Computation Kit’s potential to build sophisticated circuits (e.g., a parallel processor) for use in the field of HCI.


HandAvatar: Embodying Non-Humanoid Virtual Avatars through Hands

Yu Jiang, Zhipeng Li, Mufei He, David Lindlbauer, Yukang Yan

We propose HandAvatar to enable users to embody non-humanoid avatars using their hands. HandAvatar leverages the high dexterity and coordination of users' hands to control virtual avatars, enabled through our novel approach for automatically-generated joint-to-joint mappings. We contribute an observation study to understand users’ preferences on hand-to-avatar mappings on eight avatars. Leveraging insights from the study, we present an automated approach that generates mappings between users' hands and arbitrary virtual avatars by jointly optimizing control precision, structural similarity, and comfort. We evaluated HandAvatar on static posing, dynamic animation, and creative exploration tasks. Results indicate that HandAvatar enables more precise control, requires less physical effort, and brings comparable embodiment compared to a state-of-the-art body-to-avatar control method. We demonstrate HandAvatar's potential with applications including non-humanoid avatar based social interaction in VR, 3D animation composition, and VR scene design with physical proxies. We believe that HandAvatar unlocks new interaction opportunities, especially for usage in Virtual Reality, by letting users become the avatar in applications including virtual social interaction, animation, gaming, or education.
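As a toy illustration of the kind of joint optimization described above (not the authors' actual mapping algorithm), one could enumerate candidate joint-to-joint assignments and score each with a weighted sum of precision, similarity, and comfort terms. The joint names and scoring rules below are invented placeholders; in the paper these criteria are derived from hand motion and user studies rather than hand-written rules.

    from itertools import permutations

    HAND_JOINTS = ["thumb_tip", "index_tip", "middle_tip"]   # hypothetical subset
    AVATAR_JOINTS = ["head", "left_wing", "right_wing"]      # hypothetical avatar

    # Placeholder scoring terms in [0, 1], purely for illustration.
    def control_precision(mapping):
        # Pretend the index finger is the most precisely controlled digit.
        return 1.0 if mapping["index_tip"] == "head" else 0.6

    def structural_similarity(mapping):
        # Reward mapping the two outer digits to the two mirrored wings.
        wings = {mapping["thumb_tip"], mapping["middle_tip"]}
        return 1.0 if wings == {"left_wing", "right_wing"} else 0.4

    def comfort(mapping):
        # Pretend crossing the thumb over to the right wing is less comfortable.
        return 0.7 if mapping["thumb_tip"] == "right_wing" else 1.0

    def best_mapping(w_prec=0.4, w_sim=0.4, w_comf=0.2):
        best, best_score = None, float("-inf")
        for assignment in permutations(AVATAR_JOINTS):
            mapping = dict(zip(HAND_JOINTS, assignment))
            score = (w_prec * control_precision(mapping)
                     + w_sim * structural_similarity(mapping)
                     + w_comf * comfort(mapping))
            if score > best_score:
                best, best_score = mapping, score
        return best, best_score

    print(best_mapping())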


Ignore, Trust, or Negotiate: Understanding Clinician Acceptance of AI-Based Treatment Recommendations in Health Care    

Venkatesh Sivaraman, Leigh A. Bukowski, Joel Levin, Jeremy M. Kahn, Adam Perer

Artificial intelligence (AI) in healthcare has the potential to improve patient outcomes, but clinician acceptance remains a critical barrier. We developed a novel decision support interface that provides interpretable treatment recommendations for sepsis, a life-threatening condition in which decisional uncertainty is common, treatment practices vary widely, and poor outcomes can occur even with optimal decisions. This system formed the basis of a mixed-methods study in which 24 intensive care clinicians made AI-assisted decisions on real patient cases. We found that explanations generally increased confidence in the AI, but concordance with specific recommendations varied beyond the binary acceptance or rejection described in prior work. Although clinicians sometimes ignored or trusted the AI, they also often prioritized aspects of the recommendations to follow, reject, or delay in a process we term "negotiation." These results reveal novel barriers to adoption of treatment-focused AI tools and suggest ways to better support differing clinician perspectives.
 

[Honorable Mention] IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and Earbuds

Vimal Mollyn, Riku Arakawa, Mayank Goel, Chris Harrison, Karan Ahuja

Tracking body pose on-the-go could have powerful uses in fitness, mobile gaming, context-aware virtual assistants, and rehabilitation. However, users are unlikely to buy and wear special suits or sensor arrays to achieve this end. Instead, in this work, we explore the feasibility of estimating body pose using IMUs already in devices that many users own, namely smartphones, smartwatches, and earbuds. This approach has several challenges, including noisy data from low-cost commodity IMUs, and the fact that the number of instrumentation points on a user's body is both sparse and in flux. Our pipeline receives whatever subset of IMU data is available, potentially from just a single device, and produces a best-guess pose. To evaluate our model, we created the IMUPoser Dataset, collected from 10 participants wearing or holding off-the-shelf consumer devices and across a variety of activity contexts. We provide a comprehensive evaluation of our system, benchmarking it on both our own and existing IMU datasets.
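A minimal sketch of the "whatever subset is available" idea, under assumed device names and feature sizes (this is not the authors' pipeline): missing devices are zero-filled and flagged with an availability mask, so a single pose model can consume any combination of phone, watch, and earbud IMU streams.

    import numpy as np

    # Assumed device order and per-device feature size (e.g., orientation plus
    # acceleration); the real IMUPoser pipeline and dataset are more involved.
    DEVICES = ["phone", "watch", "earbuds"]
    FEATURES_PER_DEVICE = 12

    def build_input(available):
        """Assemble a fixed-size model input from whatever IMU streams exist.
        Missing devices are zero-filled and flagged in a 0/1 availability mask,
        so one model can handle any subset of devices, even a single one."""
        chunks, mask = [], []
        for device in DEVICES:
            if device in available:
                chunks.append(np.asarray(available[device], dtype=np.float32))
                mask.append(1.0)
            else:
                chunks.append(np.zeros(FEATURES_PER_DEVICE, dtype=np.float32))
                mask.append(0.0)
        return np.concatenate(chunks + [np.asarray(mask, dtype=np.float32)])

    # Example frame where only a smartwatch is reporting.
    frame = build_input({"watch": np.random.randn(FEATURES_PER_DEVICE)})
    print(frame.shape)  # (3 * 12 + 3,) -> (39,)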

 

[Honorable Mention] Investigating How Practitioners Use Human-AI Guidelines: A Case Study on the People + AI Guidebook

Nur Yildirim, Mahima Pushkarna, Nitesh Goyal, Martin Wattenberg, Fernanda Viégas

Artificial intelligence (AI) presents new challenges for the user experience (UX) of products and services. Recently, practitioner-facing resources and design guidelines have become available to ease some of these challenges. However, little research has investigated if and how these guidelines are used, and how they impact practice. In this paper, we investigated how industry practitioners use the People + AI Guidebook. We conducted interviews with 31 practitioners (i.e., designers, product managers) to understand how they use human-AI guidelines when designing AI-enabled products. Our findings revealed that practitioners use the guidebook not only for addressing AI's design challenges, but also for education, cross-functional communication, and for developing internal resources. We uncovered that practitioners desire more support for early phase ideation and problem formulation to avoid AI product failures. We discuss the implications for future resources aiming to help practitioners in designing AI products.


Measuring the Stigmatizing Effects of a Highly Publicized Event on Online Mental Health Discourse   

Anna Fang, Haiyi Zhu

Media coverage has historically played an influential and often stigmatizing role in the public's understanding of mental illness through harmful language and inaccurate portrayals of those with mental health issues. However, it is unknown how and to what extent media events may affect stigma in online discourse regarding mental health. In this study, we examine a highly publicized event -- the celebrity defamation trial between Johnny Depp and Amber Heard -- to uncover how stigmatizing and destigmatizing language on Twitter changed during and after the course of the trial. Using causal impact and language analysis methods, we provided a first look at how external events can lead to significantly greater levels of stigmatization and lower levels of destigmatization on Twitter towards not only particular disorders targeted in the coverage of external events but also general mental health discourse.


Metrics for peer counseling: Triangulating success outcomes for online therapy platforms

Tony Wang, Haard K. Shah, Raj Sanjay Shah, Yi-Chia Wang, Robert Kraut, Diyi Yang

Extensive research has been published on the conversational factors of effective volunteer peer counseling on online mental health platforms (OMHPs). However, studies differ in how they define and measure success outcomes, with most prior work examining only a single success metric. In this work, we model the relationship between previously reported linguistic predictors of effective counseling with four outcomes following a peer-to-peer session on a single OMHP: retention in the community, following up on a previous session with a counselor, users' evaluation of a counselor, and changes in users' mood. Results show that predictors correlate negatively with community retention but positively with users following up with and giving higher evaluations to individual counselors. We suggest actionable insights for therapy platform design and outcome measurement based on findings that the relationship between predictors and outcomes of successful conversations depends on differences in measurement construct and operationalization.

 

[Honorable Mention] Nooks: Social Spaces to Lower Hesitations in Interacting with New People at Work

Shreya Bali, Pranav Khadpe, Geoff Kaufman, Chinmay Kulkarni

Initiating conversations with new people at work is often intimidating because of uncertainty about their interests. People worry others may reject their attempts to initiate conversation or that others may not enjoy the conversation. We introduce a new system, Nooks, built on Slack, that reduces fear of social evaluation by enabling individuals to initiate any conversation as a nook—a conversation room that identifies its topic, but not its creator. Automatically convening others interested in the nook, Nooks further reduces fears of social evaluation by guaranteeing individuals in advance that others they are about to interact with are interested in the conversation. In a multi-month deployment with participants in a summer research program, Nooks provided participants with non-threatening and inclusive interaction opportunities, and ambient awareness, leading to new interactions online and offline. Our results demonstrate how intentionally designed social spaces can reduce fears of social evaluation and catalyze new workplace connections.


OmniSense: Exploring Novel Input Sensing and Interaction Techniques on Mobile Device with an Omni-Directional Camera    

Hui-Shyong Yeo, Erwin Wu, Daehwa Kim, Juyoung Lee, Hyung-il Kim, Seo Young Oh, Luna Takagi, Woontack Woo, Hideki Koike, Aaron J Quigley

An omni-directional (360°) camera captures the entire viewing sphere surrounding its optical center. Such cameras are growing in use to create highly immersive content and viewing experiences. When such a camera is held by a user, the view includes the user's hand grip, fingers, body pose, face, and the surrounding environment, providing a complete understanding of the visual world and context around it. This capability opens up numerous possibilities for rich mobile input sensing. In OmniSense, we explore the broad input design space for mobile devices with a built-in omni-directional camera and broadly categorize it into three sensing pillars: i) near device, ii) around device, and iii) surrounding device. In addition, we explore potential use cases and applications that leverage these sensing capabilities to address user needs. Following this, we develop a working system that puts these concepts into action. We studied the system in a technical evaluation and a preliminary user study to gain initial feedback and insights. Collectively these techniques illustrate how a single, omni-purpose sensor on a mobile device affords many compelling ways to enable expressive input, while also affording a broad range of novel applications that improve user experience during mobile interaction.


ONYX: Assisting Users in Teaching Natural Language Interfaces Through Multi-Modal Interactive Task Learning    

Marcel Ruoff, Brad A. Myers, Alexander Maedche

Users are increasingly empowered to personalize natural language interfaces (NLIs) by teaching how to handle new natural language (NL) inputs. However, our formative study found that when teaching new NL inputs, users require assistance in clarifying ambiguities that arise and want insight into which parts of the input the NLI understands. In this paper we introduce ONYX, an intelligent agent that interactively learns new NL inputs by combining NL programming and programming-by-demonstration, also known as multi-modal interactive task learning. To address the aforementioned challenges, ONYX provides suggestions on how ONYX could handle new NL inputs based on previously learned concepts or user-defined procedures, and poses follow-up questions to clarify ambiguities in user demonstrations, using visual and textual aids to clarify the connections. Our evaluation shows that users provided with ONYX’s new features achieved significantly higher accuracy in teaching new NL inputs (median: 93.3%) in contrast to those without (median: 73.3%).


OPTIMISM: Enabling Collaborative Implementation of Domain-Specific Metaheuristic Optimization    

Megan Hofmann, Nayha Auradkar, Jessica Birchfield, Jerry Cao, Autumn G Hughes, Gene S-H Kim, Shriya Kurpad, Kathryn J Lum, Kelly Mack, Anisha Nilakantan (CMU ECE & HCII), Margaret Ellen Seehorn, Emily Warnock, Jennifer Mankoff, and Scott E Hudson

For non-technical domain experts and designers, it can be a substantial challenge to create designs that meet domain-specific goals. This presents an opportunity to create specialized tools that produce optimized designs in the domain. However, implementing domain-specific optimization methods requires a rare combination of programming and domain expertise. Creating flexible design tools with re-configurable optimizers that can tackle a variety of problems in a domain requires even more domain and programming expertise. We present OPTIMISM, a toolkit which enables programmers and domain experts to collaboratively implement an optimization component of design tools. OPTIMISM supports the implementation of metaheuristic optimization methods by factoring them into easy-to-implement and reusable components: objectives that measure desirable qualities in the domain, modifiers which make useful changes to designs, design and modifier selectors which determine how the optimizer steps through the search space, and stopping criteria that determine when to return results. Implementing optimizers with OPTIMISM shifts the burden of domain expertise from programmers to domain experts.
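The factoring described above maps naturally onto four small callables. The sketch below wires them into a toy hill climber over a one-dimensional "design" to show how the pieces compose; it is an illustration of the idea, not the OPTIMISM API, and the toy objective and modifier are invented.

    import random

    # Toy domain: a one-dimensional "design" whose ideal value is 42.
    def objective(design):                     # measures a desirable quality
        return -abs(design - 42)

    def modifier(design):                      # makes a useful change to a design
        return design + random.choice([-3, -1, 1, 3])

    def design_selector(current, candidate):   # how the optimizer steps
        return candidate if objective(candidate) >= objective(current) else current

    def stopping_criterion(step, design):      # when to return results
        return step >= 200 or objective(design) == 0

    def optimize(initial_design):
        design, step = initial_design, 0
        while not stopping_criterion(step, design):
            design = design_selector(design, modifier(design))
            step += 1
        return design

    print(optimize(initial_design=0))

In this factoring, the domain-specific knowledge lives in the objective and modifier, while the stepping and stopping logic can be reused across problems, which is the division of labor between domain experts and programmers that the abstract describes.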


Overcoming Algorithm Aversion: A Comparison between Process and Outcome Control    

Lingwei Cheng, Alex Chouldechova (CMU Heinz College)

Algorithm aversion occurs when humans are reluctant to use algorithms despite their superior performance. Studies show that giving users outcome control by providing agency over how models’ predictions are incorporated into decision-making mitigates algorithm aversion. We study whether algorithm aversion is mitigated by process control, wherein users can decide what input factors and algorithms to use in model training. We conduct a replication study of outcome control, and test novel process control study conditions on Amazon Mechanical Turk (MTurk) and Prolific. Our results partly confirm prior findings on the mitigating effects of outcome control, while also forefronting reproducibility challenges. We find that process control in the form of choosing the training algorithm mitigates algorithm aversion, but changing inputs does not. Furthermore, giving users both outcome and process control does not reduce algorithm aversion more than outcome or process control alone. This study contributes to design considerations around mitigating algorithm aversion.


Pair-Up: Prototyping Human-AI Co-orchestration of Dynamic Transitions between Individual and Collaborative Learning in the Classroom    

Kexin "Bella" Yang, Zijing Lu (CMU METALS), Hongyu Mao (CMU Computation Design), Vanessa Echeverria, Kenneth Holstein, Nikol Rummel, Vincent Aleven

Enabling students to dynamically transition between individual and collaborative learning activities has great potential to support better learning. We explore how technology can support teachers in orchestrating dynamic transitions during class. Working with five teachers and 199 students over 22 class sessions, we conducted classroom-based prototyping of a co-orchestration technology ecosystem that supports the dynamic pairing of students working with intelligent tutoring systems. Using mixed-methods data analysis, we study the resulting observed classroom dynamics, and how teachers and students perceived and experienced dynamic transitions as supported by our technology. We discover a potential tension between teachers' and students' preferred level of control: students prefer a degree of control over the dynamic transitions that teachers are hesitant to grant. Our study reveals design implications and challenges for future human-AI co-orchestration in classroom use, bringing us closer to realizing the vision of highly-personalized smart classrooms that address the unique needs of each student.


Participation and Division of Labor in User-Driven Algorithm Audits: How Do Everyday Users Work together to Surface Algorithmic Harms?    

Rena Li, Sara Kingsley, Chelsea Fan, Proteeti Sinha, Nora Wai, Jaimie Lee, Hong Shen, Motahhare Eslami, Jason I. Hong

Recent years have witnessed an interesting phenomenon in which users come together to interrogate potentially harmful algorithmic behaviors they encounter in their everyday lives. Researchers have started to develop theoretical and empirical understandings of these user-driven audits, with a hope to harness the power of users in detecting harmful machine behaviors. However, little is known about users’ participation and their division of labor in these audits, which are essential to support these collective efforts in the future. Through collecting and analyzing 17,984 tweets from four recent cases of user-driven audits, we shed light on patterns of users’ participation and engagement, especially with the top contributors in each case. We also identified the various roles users’ generated content played in these audits, including hypothesizing, data collection, amplification, contextualization, and escalation. We discuss implications for designing tools to support user-driven audits and users who labor to raise awareness of algorithm bias.


Physically Situated Tools for Exploring a Grain Space in Computational Machine Knitting    

Lea Albaugh, Scott Hudson, Lining Yao

We propose an approach to enabling exploratory creativity in digital fabrication through the use of grain spaces. In material processes, "grain" describes underlying physical properties like the orientation of cellulose fibers in wood that, in aggregate, affect fabrication concerns (such as directional cutting) and outcomes (such as axes of strength and visual effects). Extending this into the realm of computational fabrication, grain spaces define a curated set of mid-level material properties as well as the underlying low-level fabrication processes needed to produce them. We specify a grain space for computational brioche knitting, use it to guide our production of a set of hybrid digital/physical tools to support quick and playful exploration of this space's unique design affordances, and reflect on the role of such tools in creative practice.


Rapid Convergence: The Outcomes of Making PPE during a Healthcare Crisis    

Kelly Mack, Megan Hofmann, Udaya Lakshmi, Jerry Cao, Nayha Auradkar, Rosa I. Arriaga, Scott Hudson, Jennifer Mankoff

The U.S. National Institute of Health (NIH) 3D Print Exchange is a public, open-source repository for 3D printable medical device designs with contributions from clinicians, expert-amateur makers, and people from industry and academia. In response to the COVID-19 pandemic, the NIH formed a collection to foster submissions of low-cost, locally-manufacturable personal protective equipment (PPE). We evaluated the 623 submissions in this collection to understand: what makers contributed, how they were made, who made them, and key characteristics of their designs. We found an immediate design convergence to manufacturing-focused remixes of a few initial designs affiliated with NIH partners and major for-profit groups. The NIH worked to review safe, effective designs but was overloaded by manufacturing-focused design adaptations. Our work contributes insights into: the outcomes of distributed, community-based medical making; the features that the community accepted as "safe" making; and how platforms can support regulated maker activities in high-risk domains.


Slide Gestalt: Automatic Structure Extraction in Slide Decks for Non-Visual Access    

Yi-Hao Peng, Peggy Chi, Anjuli Kannan, Meredith Ringel Morris, Irfan Essa

Presentation slides commonly use visual patterns for structural navigation, such as titles, dividers, and build slides. However, screen readers do not capture such intention, making it time-consuming and less accessible for blind and visually impaired (BVI) users to linearly consume slides with repeated content. We present Slide Gestalt, an automatic approach that identifies the hierarchical structure in a slide deck. Slide Gestalt computes the visual and textual correspondences between slides to generate hierarchical groupings. Readers can navigate the slide deck from the higher-level section overview to the lower-level description of a slide group or individual elements interactively with our UI. We derived slide consumption and authoring practices from interviews with BVI readers and sighted creators and an analysis of 100 decks. We ran our pipeline on 50 real-world slide decks and a large dataset. Feedback from eight BVI participants showed that Slide Gestalt helped navigate a slide deck by anchoring content more efficiently, compared to using accessible slides.
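To give a flavor of correspondence-based grouping (a much-simplified stand-in for Slide Gestalt's visual and textual analysis), the sketch below greedily merges consecutive slides whose titles largely repeat, as happens in build-slide sequences; the similarity measure and threshold are assumptions, not the paper's method.

    def title_similarity(a, b):
        """Jaccard overlap between title word sets, a stand-in for the richer
        visual and textual correspondences Slide Gestalt computes."""
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

    def group_slides(titles, threshold=0.5):
        """Greedily merge consecutive slides whose titles largely repeat,
        approximating a build-slide sequence as one hierarchical group."""
        groups = [[0]]
        for i in range(1, len(titles)):
            if title_similarity(titles[i - 1], titles[i]) >= threshold:
                groups[-1].append(i)
            else:
                groups.append([i])
        return groups

    deck = ["Results", "Results (continued)", "Results (continued)", "Conclusion"]
    print(group_slides(deck))  # [[0, 1, 2], [3]]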


Supporting Piggybacked Co-Located Leisure Activities via Augmented Reality    

Samantha Reig, Erica Cruz, Melissa Powers, Jennifer He, Timothy Chong, Yu Jiang Tham, Sven Kratz, Ava Robinson, Brian Smith, Rajan Vaish, Andrés Monroy-Hernández

Technology, especially the smartphone, is villainized for taking meaning and time away from in-person interactions and secluding people into “digital bubbles”. We believe this is not an intrinsic property of digital gadgets, but evidence of a lack of imagination in technology design. Leveraging augmented reality (AR) toward this end allows us to create experiences for multiple people, their pets, and their environments. In this work, we explore the design of AR technology that “piggybacks” on everyday leisure to foster co-located interactions among close ties (with other people and pets). We designed, developed, and deployed three such AR applications, and evaluated them through a 41-participant and 19-pet user study. We gained key insights about the ability of AR to spur and enrich interaction in new channels, the importance of customization, and the challenges of designing for the physical aspects of AR devices (e.g., holding smartphones). These insights guide design implications for the novel research space of co-located AR.


Surface I/O: Creating Devices with Functional Surface Geometry for Haptics and User Input    

Yuran Ding, Craig Shultz, Chris Harrison

Surface I/O is a novel interface approach that functionalizes the exterior surface of devices to provide haptic and touch sensing without dedicated mechanical components. Achieving this requires a unique combination of surface features spanning the macro-scale (5cm~1mm), meso-scale (1mm~200μm), and micro-scale (<200μm). This approach simplifies interface creation, allowing designers to iterate on form geometry, haptic feeling, and sensing functionality without the limitations of mechanical mechanisms. We believe this can contribute to the concept of "invisible ubiquitous interactivity at scale", where the simplicity and easy implementation of the technique allows it to blend with objects around us. While we prototyped our designs using 3D printers and laser cutters, our technique is applicable to mass production methods, including injection molding and stamping, enabling passive goods with new levels of interactivity.


Translation as (Re)mediation: How Ethnic Community-Based Organizations Negotiate Legitimacy    

Cella Sum, Anh-Ton Tran, Jessica Lin, Rachel Kuo, Cynthia Bennett, Christina Harrington, Sarah Fox

Ethnic community-based organizations (CBOs) play an essential role in supporting the wellbeing of immigrants and refugees. CBO workers often act as linguistic and cultural translators between communities, government, and health and social service systems. However, resource constraints, technological barriers, and pressures to be data-driven require workers to perform additional forms of translation to ensure their organizations' survival. Drawing on 16 interviews with members of 7 Asian American and Pacific Islander CBOs, we examine opportunities and barriers concerning their technology-mediated work practices. We identify two circumstances where CBO workers perform translation: (1) as legitimacy work to build trust with funders and communities, and (2) as (re)mediation in attending to technological barriers and resisting hegemonic systems that treat their communities as “other.” By unpacking the politics of translation work across these sites, we position CBO workers as a critical source for HCI research and practice as it seeks to support community wellbeing.


Trust, Comfort, and Relatability: Understanding Black Older Adults’ Perceptions of Chatbot Design for Health Information Seeking    

Christina Harrington and Lisa Egede

Conversational agents such as chatbots have emerged as a useful resource to access real-time health information online. Perceptions of trust and credibility in chatbots have been attributed to the anthropomorphism and humanness of the chatbot design, with gender and race influencing their reception. Few existing studies have looked specifically at the diversity of chatbot avatar design related to race, age, and gender, which may have particular significance for racially minoritized users like Black older adults. In this paper, we explored perceptions of chatbots with varying identities for health information seeking in a diary and interview study with 30 Black older adults. Our findings suggest that while racial and age likeness influence feelings of trust and comfort with chatbots, constructs such as professionalism, likeability, and overall familiarity also influence reception. Based on these findings, we provide implications for designing text-based chatbots that consider Black older adults.


uKnit: A Position-Aware Reconfigurable Machine-Knitted Wearable for Gestural Interaction and Passive Sensing using Electrical Impedance Tomography    

Tianhong Catherine Yu, Riku Arakawa, James McCann, Mayank Goel

A scarf is inherently reconfigurable: wearers often use it as a neck wrap, a shawl, a headband, a wristband, and more. We developed uKnit, a scarf-like soft sensor with scarf-like reconfigurability, built with machine knitting and electrical impedance tomography sensing. Soft wearable devices are comfortable and thus attractive for many human-computer interaction scenarios. While prior work has demonstrated various soft wearable capabilities, each capability is device- and location-specific, being incapable of meeting users' various needs with a single device. In contrast, uKnit explores the possibility of one-soft-wearable-for-all. We describe the fabrication and sensing principles behind uKnit, demonstrate several example applications, and evaluate it with 10-participant user studies and a washability test. uKnit achieves 88.0%/78.2% accuracy for 5-class worn-location detection and 80.4%/75.4% accuracy for 7-class gesture recognition with a per-user/universal model. Moreover, it identifies respiratory rate with an error rate of 1.25 bpm and detects binary sitting postures with an average accuracy of 86.2%.

 

[Best Paper Award] Understanding Frontline Workers' and Unhoused Individuals' Perspectives on AI Used in Homeless Services

Tzu-Sheng Kuo, Hong Shen, Jisoo Geum, Nev Jones, Jason I. Hong, Haiyi Zhu, Kenneth Holstein

Recent years have seen growing adoption of AI-based decision-support systems (ADS) in homeless services, yet we know little about stakeholder desires and concerns surrounding their use. In this work, we aim to understand impacted stakeholders’ perspectives on a deployed ADS that prioritizes scarce housing resources. We employed AI lifecycle comicboarding, an adapted version of the comicboarding method, to elicit stakeholder feedback and design ideas across various components of an AI system’s design. We elicited feedback from county workers who operate the ADS daily, service providers whose work is directly impacted by the ADS, and unhoused individuals in the region. Our participants shared concerns and design suggestions around the AI system’s overall objective, specific model design choices, dataset selection, and use in deployment. Our findings demonstrate that stakeholders, even without AI knowledge, can provide specific and critical feedback on an AI system’s design and deployment, if empowered to do so.


Understanding Practices, Challenges, and Opportunities for User-Engaged Algorithm Auditing in Industry Practice    

Wesley Hanwen Deng, Bill Boyuan Guo, Alicia DeVrio, Hong Shen, Motahhare Eslami, Kenneth Holstein

Recent years have seen growing interest among both researchers and practitioners in user-engaged approaches to algorithm auditing, which directly engage users in detecting problematic behaviors in algorithmic systems. However, we know little about industry practitioners' current practices and challenges around user-engaged auditing, nor what opportunities exist for them to better leverage such approaches in practice. To investigate, we conducted a series of interviews and iterative co-design activities with practitioners who employ user-engaged auditing approaches in their work. Our findings reveal several challenges practitioners face in appropriately recruiting and incentivizing user auditors, scaffolding user audits, and deriving actionable insights from user-engaged audit reports. Furthermore, practitioners shared organizational obstacles to user-engaged auditing, surfacing a complex relationship between practitioners and user auditors. Based on these findings, we discuss opportunities for future HCI research to help realize the potential (and mitigate risks) of user-engaged auditing in industry practice.

 

Understanding Visual Arts Experiences of Blind People    

Franklin Mingzhe Li, Lotus Zhang, Maryam Bandukda, Abigale Stangl, Kristen Shinohara, Leah Findlater, Patrick Carrington

Visual arts play an important role in cultural life and provide access to social heritage and self-enrichment, but most visual arts are inaccessible to blind people. Researchers have explored different ways to enhance blind people's access to visual arts (e.g., audio descriptions, tactile graphics). However, how blind people adopt these methods remains unknown. We conducted semi-structured interviews with 15 blind visual arts patrons to understand how they engage with visual artwork and the factors that influence their adoption of visual arts access methods. We further examined interview insights in a follow-up survey (N=220). We present: 1) current practices and challenges of accessing visual artwork in-person and online (e.g., Zoom tour), 2) motivation and cognition of perceiving visual arts (e.g., imagination), and 3) implications for designing visual arts access methods. Overall, our findings provide a roadmap for technology-based support for blind people's visual arts experiences.

 

[Honorable Mention] WebUI: A Dataset for Enhancing Visual UI Understanding with Web Semantics

Jason Wu, Siyan Wang, Siman Shen, Yi-Hao Peng, Jeffrey Nichols, Jeffrey Bigham

Modeling user interfaces (UIs) from visual information allows systems to make inferences about the functionality and semantics needed to support use cases in accessibility, app automation, and testing. Current datasets for training machine learning models are limited in size due to the costly and time-consuming process of manually collecting and annotating UIs. We crawled the web to construct WebUI, a large dataset of 400,000 rendered web pages associated with automatically extracted metadata. We analyze the composition of WebUI and show that while automatically extracted data is noisy, most examples meet basic criteria for visual UI modeling. We applied several strategies for incorporating semantics found in web pages to increase the performance of visual UI understanding models in the mobile domain, where less labeled data is available: (i) element detection, (ii) screen classification and (iii) screen similarity.


Zeno: An Interactive Framework for Behavioral Evaluation of Machine Learning    

Ángel Alexander Cabrera, Erica Fu (CMU IS), Donald Bertucci, Kenneth Holstein, Ameet Talwalkar (CMU MLD), Jason I. Hong, Adam Perer

Machine learning models with high accuracy on test data can still produce systematic failures, such as harmful biases and safety issues, when deployed in the real world. To detect and mitigate such failures, practitioners run behavioral evaluation of their models, checking model outputs for specific types of inputs. Behavioral evaluation is important but challenging, requiring that practitioners discover real-world patterns and validate systematic failures. We conducted 18 semi-structured interviews with ML practitioners to better understand the challenges of behavioral evaluation and found that it is a collaborative, use-case-first process that is not adequately supported by existing task- and domain-specific tools. Using these findings, we designed Zeno, a general-purpose framework for visualizing and testing AI systems across diverse use cases. In four case studies with participants using Zeno on real-world models, we found that practitioners were able to reproduce previous manual analyses and discover new systematic failures.   
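A minimal sketch of slice-based behavioral evaluation in the spirit described above (not Zeno's actual API): slices are predicates over inputs, and any slice whose accuracy falls well below the overall accuracy is flagged as a candidate systematic failure. The toy sentiment model, data, and slices are invented for illustration.

    def accuracy(examples, model):
        return sum(model(x) == y for x, y in examples) / len(examples)

    def evaluate_slices(examples, model, slices, gap=0.10):
        """Flag slices whose accuracy trails overall accuracy by more than `gap`."""
        overall = accuracy(examples, model)
        report = {}
        for name, predicate in slices.items():
            subset = [(x, y) for x, y in examples if predicate(x)]
            if subset:
                acc = accuracy(subset, model)
                report[name] = {"accuracy": acc, "flagged": acc < overall - gap}
        return overall, report

    # Toy sentiment "model", data, and slices, purely for illustration.
    model = lambda text: "positive" if "good" in text else "negative"
    data = [("good movie", "positive"), ("not good at all", "negative"),
            ("great movie", "positive"), ("bad movie", "negative")]
    slices = {"contains negation": lambda t: "not" in t,
              "short inputs": lambda t: len(t.split()) <= 2}

    print(evaluate_slices(data, model, slices))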

 
