Interactive Data Exploration with Diamond
Speaker
Mahadev Satyanarayanan
Carnegie Group Professor of Computer Science, Carnegie Mellon University
When
-
Where
Newell-Simon Hall 1305 (Michael Mauldin Auditorium)
Description
How does a domain expert discover something relevant to an urgent task in a large distributed repository of complex and loosely-structured data? For example, how does a military intelligence analyst identify suspicious events from recent satellite images and surveillance videos? The term “suspicious” refers to a vague concept. It is highly context-dependent and spans an enormous space of possibilities. While the analyst may have some pre-conceived notions of what he is searching for, the precise definition of “suspicious” can only be given after examining the data in some depth. In other words, hypothesis-formation and hypothesis-validation proceed hand-in-hand in a tightly-coupled and iterative sequence. We refer to this inherently human-centric activity as “interactive data exploration.”
Diamond is an open-source software system for interactive data exploration that has been jointly developed by Intel Research and Carnegie Mellon. It implements the concept of “early discard,” which makes brute-force interactive search practical by eliminating irrelevant data as cheaply as possible. Diamond also embodies the concept of “self-tuning,” which allows it to adapt dynamically to different hardware configurations, workloads, and data content in a manner that is completely transparent to users and applications.
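As a rough illustration of the early-discard idea only, the sketch below runs a chain of filters over each object, cheapest first, and drops an object as soon as any filter rejects it, so that costly filters see only the survivors. This is not Diamond's actual API: the function early_discard, the (name, cost, predicate) filter tuples, the fixed cost estimates, and the sample image records are all hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical filter description: (name, estimated cost, predicate).
# The cost values are assumed, fixed estimates, not measurements.
Filter = Tuple[str, float, Callable[[Dict], bool]]


def early_discard(objects: Iterable[Dict], filters: List[Filter]) -> Iterator[Dict]:
    """Yield only the objects that pass every filter, trying cheap filters first."""
    ordered = sorted(filters, key=lambda f: f[1])
    for obj in objects:
        # all() stops at the first failing predicate, so a rejected object
        # never reaches the more expensive filters later in the chain.
        if all(predicate(obj) for _name, _cost, predicate in ordered):
            yield obj


# Count how often each predicate runs to make the saving visible.
calls = {"cheap": 0, "costly": 0}


def is_large(obj: Dict) -> bool:
    calls["cheap"] += 1
    return obj["width"] >= 1000


def looks_relevant(obj: Dict) -> bool:
    calls["costly"] += 1
    return "vehicle" in obj["tags"]


# Made-up sample records standing in for image metadata.
images = [
    {"id": 1, "width": 200, "tags": ["vehicle"]},
    {"id": 2, "width": 2000, "tags": ["field"]},
    {"id": 3, "width": 4000, "tags": ["vehicle", "road"]},
]

filters = [("looks_relevant", 50.0, looks_relevant), ("is_large", 1.0, is_large)]
survivors = list(early_discard(images, filters))
print([obj["id"] for obj in survivors], calls)
```

In this toy run the cheap size check rejects the first image, so the costly check runs on only two of the three images. The self-tuning described above would correspond to adjusting such choices at run time rather than relying on fixed cost estimates; that part is not modeled here.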
Medical and pharmaceutical researchers at UPMC, UPitt and Merck are collaborating with Diamond researchers to apply Diamond to their domain-specific tasks. This may open the door to research and diagnostic strategies that were not considered feasible until now. The use of Diamond exposes many research questions that may be of interest to the HCI community. We welcome collaboration with HCI researchers to help answer these questions.
Speaker's Bio
Satya is an experimental computer scientist who has pioneered research in mobile and pervasive computing. One outcome is the open-source Coda File System, which supports distributed file access in low-bandwidth and intermittent wireless networks through disconnected and bandwidth-adaptive operation. The Coda concepts of hoarding, reintegration and application-specific conflict resolution can be found in the hotsync capability of PDAs today. Key ideas from Coda have been incorporated by Microsoft into the IntelliMirror component of Windows 2000 and the Cached Exchange Mode of Outlook 2003. Another outcome of Satya’s work is Odyssey, a set of open-source operating system extensions that enable mobile applications to adapt to variation in critical resources such as network bandwidth and energy. Coda and Odyssey are building blocks in Project Aura, a research initiative at Carnegie Mellon to explore distraction-free ubiquitous computing. His most recent work in this space is Internet Suspend/Resume, a hands-free approach to mobile computing that exploits virtual machine technology to liberate personal computing state from hardware.
Satya is a co-inventor of many supporting technologies relevant to mobile and pervasive computing, such as data staging, lookaside caching, translucent caching and application-aware adaptation. He is also a co-inventor of the Diamond approach to interactive, non-indexed search of complex and loosely-organized data such as digital photographs and medical images.
Early in his career, Satya was a principal architect and implementer of the Andrew File System (AFS) which pioneered the use of scalable file caching, ACL-based security, and volume-based system administration for enterprise-scale information sharing. AFS was commercialized by IBM, is in widespread use today as OpenAFS, and has heavily influenced the NFS v4 network file system protocol standard that was published in April 2003.
Satya is the Carnegie Group Professor of Computer Science at Carnegie Mellon University. From May 2001 to May 2004 he served as the founding director of Intel Research Pittsburgh, one of four university-affiliated research labs established worldwide by Intel to create disruptive information technologies through its Open Collaborative Research model. Satya received the PhD in Computer Science from Carnegie Mellon, after Bachelor’s and Master’s degrees from the Indian Institute of Technology, Madras. He is a Fellow of the ACM and the IEEE, and was the founding Editor-in-Chief of IEEE Pervasive Computing.
Speaker's Website
http://www.cs.cmu.edu/~satya/DOWNLOAD/diamond-FAST04.pdf