HCII Ph.D. Thesis Proposal: Amber Horvath

Speaker
Amber Horvath

When
Wednesday, December 6, 2023, 12:30 pm EST

Where
NSH 3305

Description

Meta-Information to Support Sensemaking by Developers
Amber Horvath
HCII Ph.D. Thesis Proposal

Time and Location:
Wednesday, December 6, 2023 @ 12:30 pm EST
NSH 3305 + Zoom

Committee:
Brad Myers (Chair), Carnegie Mellon University
Niki Kittur, Carnegie Mellon University
Laura Dabbish, Carnegie Mellon University
Andrew Macvean, Google
Elena Glassman, Harvard

Abstract:
Software development requires developers to juggle many information-seeking and understanding tasks. From determining how a bug was introduced, to choosing which API method to use to resolve it, to working out how to properly integrate the change, even the smallest implementation tasks can raise many questions. These questions range from hard-to-answer questions about the rationale behind the original code to common questions such as how to use an API. Once this challenging sensemaking is done, the rich thought history behind it is often lost because externalizing these details is costly, even though it could be useful to future developers.

In this thesis, I explore different systems and methods for authoring and using this rich meta-information. Specifically, I have developed systems for annotating to support developers’ natural sensemaking when understanding information-dense sources such as software documentation and source code. I then demonstrated how this meta-information can be harnessed and used in new ways, including for assessing the trustworthiness of documentation and for capturing design rationale and provenance data of code.

To begin exploring to what extent meta-information may be utilized to support sensemaking during software development tasks, I developed Adamite, a browser extension that enables developers to annotate and organize their questions, open tasks, issues, and other thoughts about API documentation. Adamite was inspired by the fact that developers often face barriers brought on by questions and issues with API documentation, and by the insight that other developers have likely already experienced and overcome those barriers. Indeed, when using developer-annotated documentation, participants were able to complete significantly more of a challenging programming task compared to the baseline. Given this success, I generalized Adamite’s annotating approach to the IDE with the Catseye plugin for Visual Studio Code, and focused on helping the original developer better keep track of information to answer their own questions. The following project, Sodalite, expanded the Catseye approach to support long-form content by using the relationship between the source document and annotation anchor points as a signal for the “health” of the document. To lower the information authoring cost and account for the issue of scale, I then created the Meta-Manager, which automatically versions and extracts meta-information about code and its provenance. Through all of these projects, this thesis has explored the ways in which meta-information may be presented and how the unique properties of code and the rich contextualized information about it may be harnessed to support developers’ information-seeking and understanding tasks.

While these systems have worked well in isolation, each tackles only a subset of the types of information developers need and utilizes only some of the available meta-information about code. Further, given the shifting nature of software engineering tasks with the rise of AI code-generation systems, we have shown that our system can track and collect some of this information. To conclude this thesis, I propose extending the Meta-Manager to work with my other systems to explore to what extent combining these forms of meta-information can answer otherwise unanswerable questions, while extending the systems to capture other significant types of meta-information, such as Copilot prompts and GitHub commit messages. As in my previous work, I plan to evaluate the integrated system through experiments to understand to what extent these new classes of meta-information can actually help developers answer their real questions about code.

Link to proposal document:
https://www.amberhorvath.com/resources/horvath_thesisproposal.pdf 

Speaker's Website
https://www.amberhorvath.com/resources/horvath_thesisproposal.pdf