Projects and Groups
- Computer Science
- Artificial Intelligence
- Cloud Software
- Computer Architecture
- Data Management
- Distributed & Fault-Tolerant Computing
- Knowledge Discovery and Data Mining
- Operating Systems
- Services Computing
I currently work on big data and distributed systems. Specifically, I accelerate large-scale machine learning algorithms using scale-out (e.g., Spark) and scale-up (e.g., GPU) approaches. I also work on NoSQL and services computing.
My work and code have been incorporated into IBM's patent portfolio and software products such as BigInsights and Cognos. I am an adjunct professor at the Department of Automation, Tsinghua University, China, and an associate editor of IEEE Transactions on Automation Science and Engineering.
On a single machine with four Nvidia GPU cards, cuMF can be 6-10 times as fast and 33-100 times as cost-efficient as state-of-the-art distributed CPU solutions. Moreover, cuMF can solve the largest matrix factorization problem reported in the current literature.
From 2008 to 2010, I worked at the Computation Institute, University of Chicago, and Argonne National Laboratory on the caGrid Workflow Toolkit, a web-service-based scientific workflow platform for the cancer Biomedical Informatics Grid (caBIG). It was funded by the US National Cancer Institute and adopted by many major US bioinformatics projects.