Projects and Groups
- Computer Science
- Artificial Intelligence
- Computer Architecture
- Data Management
- Distributed and Fault-Tolerant Computing
- Knowledge Discovery and Data Mining
- Operating Systems
- Services Computing
- Software Platform
I currently work on big data and distributed systems, specifically on accelerating large-scale machine learning algorithms using scale-out (e.g., Spark) and scale-up (e.g., GPU) approaches. I also work on NoSQL and services computing.
My work and code have been incorporated into IBM's patent portfolio and software products such as BigInsights and Cognos. I am an adjunct professor at the Department of Automation, Tsinghua University, China, and an associate editor of IEEE Transactions on Automation Science and Engineering.
On a single machine with four Nvidia GPU cards, cuMF can be 6-10 times as fast and 33-100 times as cost-efficient as state-of-the-art distributed CPU solutions. Moreover, cuMF can solve the largest matrix factorization problem reported in the literature to date. [HPDC 16 Paper] [PPT] [video] [GitHub]
From 2008 to 2010, I worked at the Computation Institute, University of Chicago, and Argonne National Laboratory on the caGrid Workflow Toolkit, a web-service-based scientific workflow platform for the cancer Biomedical Informatics Grid (caBIG). It was funded by the US National Cancer Institute and adopted by many major US bioinformatics projects.