I have broad research interests in computer systems, including cloud computing, storage/file systems, operating systems, and distributed systems. I am particularly interested in developing new ways of structuring computer systems to address technology changes and enable new applications. I am involved in many ongoing projects in such areas as data-intensive computing, distributed system diagnosis, cloud computing, home storage, and exploitation of new storage technologies.
Cloud Computing Infrastructure
We are exploring the many systems challenges hidden behind the hype surrounding cloud computing, such as elastic storage, resource allocation/scheduling, and automation (e.g., performance problem diagnosis). In addition, to gain first-hand experience, we maintain and measure several deployed cloud resources with real users, in collaboration with industry partners.
Data-intensive Computing (DISC, a.k.a. Big Data)
DISC refers to the rapidly growing style of computing characterized by extraction of information from huge and often dynamically growing datasets. We are exploring new distributed storage and computing system designs for achieving robust, efficient data-intensive computing. We are particularly interested in new frameworks for supporting advanced machine learning on large datasets, shifting the work of coordinating parallel threads and data away from the programmer without losing efficiency.
Parallel Data Lab (PDL)
As Director of the Parallel Data Lab, I lead a number of storage-related projects in areas such as storage system architecture, survivable storage, file systems, storage security, and automated storage management. As one example, we are exploring how system software should change to accommodate new storage technologies like non-volatile RAM (e.g., PCM) and shingled magnetic recording (in disks). As another, we are exploring new approaches to sharing and access control in distributed home storage.