Meta-omics - overview
Accurate identification of metagenomic sample content
One of the key questions about a metagenomic sample is how to identify all the Operational Taxonomic Units (OTUs, e.g., species or genera) present in the mixture while avoiding false positive calls. We approach this question by using all of the sequencing reads' mappings to reference genomes, where a single read often maps to multiple genomes.
Our approach is based on the promiscuity of reads, i.e., reads mapping to multiple OTUs, in contrast to current approaches that rely on read abundance. After ranking the potential OTU matches for each read, we demonstrate through simulations that the rank frequency distribution of reads from true positive OTUs peaks at rank 1. To further enrich for true positives, we define a normalized per-OTU score based on promiscuity. When OTUs are sorted by this score, the false positive OTUs sink to the bottom. Our preliminary experiments demonstrate that false positive OTUs can be substantially reduced without losing any true positives.
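The ranking and scoring idea can be illustrated with a minimal Python sketch. It is not the method used in this work: the input layout (`read_hits`, a mapping from read ID to per-OTU alignment scores), the function names, and in particular the score formula (the mean of 1/k over an OTU's reads, where k is the number of OTUs a read maps to) are assumptions made purely for illustration, since the actual normalized score is not specified here.

```python
from collections import Counter, defaultdict


def rank_hits(read_hits):
    """Rank each read's OTU hits by descending alignment score (rank 1 = best).

    Ties are broken alphabetically by OTU name so the output is deterministic.
    """
    ranked = {}
    for read_id, hits in read_hits.items():
        ordered = sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
        ranked[read_id] = [otu for otu, _ in ordered]
    return ranked


def rank_frequencies(ranked):
    """Count, for every OTU, how often it appears at each rank across reads."""
    freq = defaultdict(Counter)
    for otus in ranked.values():
        for rank, otu in enumerate(otus, start=1):
            freq[otu][rank] += 1
    return freq


def promiscuity_score(ranked):
    """Hypothetical normalized score (an assumption, not the published one).

    Each read contributes 1/k to every OTU it maps to, where k is the number
    of OTUs that read maps to; the sum is divided by the OTU's read count.
    OTUs supported mostly by promiscuous reads therefore score low.
    """
    total = defaultdict(float)
    count = Counter()
    for otus in ranked.values():
        k = len(otus)
        for otu in otus:
            total[otu] += 1.0 / k
            count[otu] += 1
    return {otu: total[otu] / count[otu] for otu in count}


# Toy input (made up for illustration): read ID -> {OTU: alignment score}.
read_hits = {
    "r1": {"A": 60, "B": 40},
    "r2": {"A": 55},
    "r3": {"A": 50, "B": 45, "C": 30},
    "r4": {"C": 58},
}

ranked = rank_hits(read_hits)
freq = rank_frequencies(ranked)
scores = promiscuity_score(ranked)

# Sort OTUs by score, highest first; low-scoring OTUs sink to the bottom.
ordering = sorted(scores, key=lambda otu: -scores[otu])
for otu in ordering:
    print(otu, round(scores[otu], 3), dict(freq[otu]))
```

In this toy input, OTU `B` is only ever hit by reads that also map elsewhere and never ranks first, so it ends up last in the ordering. A real analysis would need a score calibrated against simulations such as those described above.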
Research on this topic will be presented as a talk at the International Association for Food Protection (IAFP) 2016 annual meeting.
Characterization and comparison of metagenomes
We are exploring the use of RoDEO, our method for differential gene expression, for sample comparisons and OTU abundance comparisons. First results from this ongoing work are presented at the 13th International Conference on Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB).
Sequencing the Food Supply Chain
The Consortium for Sequencing the Food Supply Chain (SFSC), run by IBM Research and Mars, Inc., will examine the global food chain, from farms, transport, processing facilities and distribution channels to restaurants and grocery stores, and apply genomics and analytics techniques to mitigate foodborne illness and other risks in food management.
Our research on meta-omics is closely linked to the consortium's efforts to understand and characterize the microbiomes of food samples. For more information on the consortium, see Consortium for Sequencing the Food Supply Chain.
 Understanding False Positives in Mapping of Microbiome Sequence Data Using In-Silico Simulations, IAFP Annual Meeting, St. Louis, Missouri, Aug 2016.
 Dimension reduction of metagenome data using RoDEO improves phenotype prediction, CIBB, Stirling, UK, Sept 2016.