Computer Science Thesis Proposal

Speaker
DANIEL ANDERSON
Ph.D. Student
Computer Science Department
Carnegie Mellon University

When
April 25, 2022 1:00pm - April 25, 2022 1:00pm

Where
In Person and Virtual - ET

Description
The defining feature of many modern large-scale computer systems is the sheer amount of data that they generate and process. In 2008, it was reported that Google’s MapReduce clusters process over twenty petabytes of data per day. Moderns systems for processing this kind of data make extensive use of parallel and distributed computation to achieve the necessary level of throughput. An important property of the datasets involved in such computations is that they are not static. Rather, they are frequently evolving. Social networks, for example, undergo rapid changes; new users are added, and links between uses are created and deleted. The first part of this thesis explores a relatively new area of algorithm design intended to tackle these kinds of problems: parallel batch-dynamic algorithms. Classic algorithms for handling dynamic data support a single update at a time, but this model is insufficient for handling the volume of updates present in modern datasets, and furthermore yields little room to exploit parallelism within the updates. A batch-dynamic algorithm consumes batches of multiple updates at a time, which allows for increased throughput and more opportunities for parallelism. In this thesis, we will design and analyze parallel batch-dynamic algorithms for dynamic graph connectivity, dynamic trees, and dynamic minimum spanning trees. We will also show that batch-dynamic algorithms can be used as ingredients in designing highly- efficient parallel static algorithms. Using a parallel batch-dynamic data structure for dynamic tree queries, we will obtain the first ever work-efficient parallel algorithm for minimum cuts. The second part of this thesis will explore the implementation of systems for processing batch-dynamic data. One factor that hampers the adoption of efficient dynamic algorithms in practice is that they are often extremely complicated and difficult to implement. To address this issue in the sequential setting, self-adjusting computation is a technique for automatically dynamizing static algorithms to create dynamic ones. This lowers the implementation burden on programmers. In this thesis, we will show that self-adjusting computation can be generalized to automatically dynamize parallel algorithms, allowing programmers to reap the benefits of parallel batch-dynamic algorithms without the burden of implementing them from scratch. Thesis Committee: Guy Blelloch (Chair) Phillip Gibbons Daniel Sleator Julian Shun, MIT Valerie King (University of Victoria, Canada) In Person and Zoom Participation. See announcement.