PhD Thesis Proposal: Haojian Jin

Haojian Jin

Monday, November 1, 2021 - 4:00pm
Remote via Zoom (see announcement for location)

Mitigating Overaccess by Apps through Modular Privacy Flows




Dr. Jason Hong (HCII), co-Chair

Dr. Swarun Kumar (ECE & HCII), co-Chair

Dr. Yuvraj Agarwal (ISR)

Dr. Laura Dabbish (HCII)

Dr. Ben Zhao (CS, University of Chicago)



Many user-facing applications adopt cloud-based software architectures, where the "brains" of these systems are typically in the cloud. A key concern with cloud-based software architectures is data privacy: users have little assurance of privacy once their sensitive data has been sent to developers' servers. However, it is challenging for companies to legitimately avoid collecting unnecessary user data while also reassuring users and independent auditors that this is indeed the case.


This dissertation proposes a simple and generic software pattern, called Modular Privacy Flows (MPF), for designing privacy-sensitive software architectures that allow developers to access users' data. MPF's adversary is an app developer, similar to today's mobile app developers, who might access more data than is necessary. For example, a smart meter developer who collects raw data to generate a weekly power consumption report might also use that data to infer users' behavior patterns (i.e., data overaccess). MPF's goal is to limit data egress by such developers, while also making it easier for users and auditors to verify and interact with the intended data collection behaviors. Applicable scenarios include but are not limited to smart home app stores, mobile app stores, smart city data aggregation, third-party calendar APIs, and privacy-sensitive ad-targeting.


MPF has three key ideas. First, developers must declare all intended data collection behaviors in a text-based manifest. Second, to specify the data collection, developers choose from a small and fixed set of open-source operators with well-defined data transformation semantics, authoring a stream-oriented pipeline similar to Unix pipes. Third, a trusted runtime enforces the declared behaviors by running the preloaded operators specified in the manifest. Together, these ideas ensure that developers can collect only the data declared in the manifests, and that users and auditors can inspect data collection behaviors by analyzing these manifests. Further, since the fixed set of operators has clearly defined semantics, our approach also facilitates a number of built-in privacy features that allow users to control their data across devices and services in a centralized manner, without additional effort from developers.
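
The three ideas above can be illustrated with a minimal Python sketch. It is not MPF's actual manifest format or operator set: the operator names (`sum`, `round`), the pipe-separated manifest syntax, and the functions `parse_manifest` and `run_manifest` are all hypothetical, chosen only to show how a runtime might refuse anything outside a fixed operator set.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical fixed operator set. These names are illustrative
# assumptions, not operators defined by MPF.
Operator = Callable[[List[float]], List[float]]


def op_sum(values: List[float]) -> List[float]:
    """Collapse a stream of readings into a single total."""
    return [sum(values)]


def op_round(values: List[float]) -> List[float]:
    """Round every value to the nearest integer."""
    return [round(v) for v in values]


OPERATORS: Dict[str, Operator] = {"sum": op_sum, "round": op_round}


def parse_manifest(text: str) -> List[str]:
    """Split a pipe-separated manifest into operator names.

    Raises ValueError if the manifest names an operator outside the
    fixed set, so undeclared behavior cannot be smuggled in.
    """
    names = [part.strip() for part in text.split("|")]
    unknown = [name for name in names if name not in OPERATORS]
    if unknown:
        raise ValueError(f"operators not in the fixed set: {unknown}")
    return names


def run_manifest(text: str, raw: Iterable[float]) -> List[float]:
    """Toy 'trusted runtime': apply only the declared operators, in order."""
    data = list(raw)
    for name in parse_manifest(text):
        data = OPERATORS[name](data)
    return data


# A smart meter app that declares it only needs a rounded total.
weekly_report = run_manifest("sum | round", [1.2, 3.4, 2.1])
```

In this sketch, the developer receives only the output of the declared pipeline, and an auditor can read the manifest string `"sum | round"` to see what leaves the device. A manifest naming an operator outside the fixed set is rejected before any data is processed.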


I plan to demonstrate the benefits of MPF in two privacy-sensitive smart home architectures. Peekaboo factors non-proprietary data pre-processing functions (i.e., operators) out to an in-home hub (i.e., runtime). MapAggregate factors non-proprietary data aggregation functions (i.e., operators) out to an auditable serverless infrastructure (i.e., runtime).



Queenie Kravitz
