Functional Genomics Platform


James H. Kaufman, Harsha V. Krishnareddy, Ed Seabolt, Ignacio G Terrizzano

Functional Genomics Platform - overview

The IBM Functional Genomics Platform (formerly named OMXWare) is a relational database linking genotype to phenotype for over 300M biological sequences extracted from microbial genomes. This cloud-based platform is continuously updated with hundreds of thousands of genomes from NCBI GenBank, the Sequence Read Archive (SRA), and other sources. Raw bacterial sequence datasets are self-consistently assembled and curated for quality, yielding whole bacterial genomes. The complete collection of assembled genomes is then annotated to identify every gene and protein it contains. Another set of cloud processes discovers all of the domains within each protein.

Protein domains are fundamental objects of biochemistry that deliver the biological activity of the microorganism. They can evolve, function, and exist independently of the larger protein chain in which they are found. They are also assigned standardized codes representing the molecular function, cellular component, and biological process with which they are associated. All of these biological entities are then linked in the Functional Genomics Platform. Linking genomes to protein domains is key to understanding phenotype in biology.

The IBM Functional Genomics Platform provides a developer toolkit consisting of REST services, a Python SDK, and a Docker container to help researchers analyze this vast data repository at scale. This toolkit also allows developers to interact with the platform in their own compute environment and integrate it with their existing workflows. Important applications can be built on top of the Functional Genomics Platform, including services to annotate biological function in the microbiome, predict antimicrobial resistance (AMR), develop molecular targets for health interventions, or expand our fundamental knowledge of microbial life.
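To make the linked entity model described above more concrete, here is a minimal sketch of how genomes, genes, proteins, and protein domains might be connected in a relational database and queried in both directions. It uses Python's standard sqlite3 module with an in-memory database. Every table name, column name, entity name, and function code in it is a hypothetical placeholder chosen for illustration; it does not reflect the platform's actual schema, REST services, or Python SDK.

```python
import sqlite3

# Hypothetical toy schema: all table and column names are illustrative
# assumptions, not the platform's real data model.
SCHEMA = """
CREATE TABLE genome  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE gene    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE protein (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE domain  (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                      function_code TEXT);
CREATE TABLE genome_gene    (genome_id  INTEGER REFERENCES genome(id),
                             gene_id    INTEGER REFERENCES gene(id));
CREATE TABLE gene_protein   (gene_id    INTEGER REFERENCES gene(id),
                             protein_id INTEGER REFERENCES protein(id));
CREATE TABLE protein_domain (protein_id INTEGER REFERENCES protein(id),
                             domain_id  INTEGER REFERENCES domain(id));
"""

# Shared join path: genome -> gene -> protein -> domain.
JOIN_CHAIN = """
FROM genome g
JOIN genome_gene gg    ON gg.genome_id = g.id
JOIN gene_protein gp   ON gp.gene_id = gg.gene_id
JOIN protein_domain pd ON pd.protein_id = gp.protein_id
JOIN domain d          ON d.id = pd.domain_id
"""


def build_demo_db():
    """Create an in-memory database populated with made-up example rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO genome VALUES (?, ?)",
                     [(1, "genome_A"), (2, "genome_B")])
    conn.executemany("INSERT INTO gene VALUES (?, ?)",
                     [(1, "gene_1"), (2, "gene_2")])
    conn.executemany("INSERT INTO protein VALUES (?, ?)",
                     [(1, "protein_1"), (2, "protein_2")])
    # The function codes below are placeholders, not real annotation codes.
    conn.executemany("INSERT INTO domain VALUES (?, ?, ?)",
                     [(1, "domain_X", "CODE:0001"),
                      (2, "domain_Y", "CODE:0002")])
    # genome_A carries both genes; genome_B carries only gene_2.
    conn.executemany("INSERT INTO genome_gene VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 2)])
    conn.executemany("INSERT INTO gene_protein VALUES (?, ?)",
                     [(1, 1), (2, 2)])
    # protein_1 contains both domains; protein_2 contains only domain_Y.
    conn.executemany("INSERT INTO protein_domain VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 2)])
    conn.commit()
    return conn


def domains_for_genome(conn, genome_name):
    """Return the sorted, de-duplicated domain names found in one genome."""
    rows = conn.execute(
        "SELECT DISTINCT d.name " + JOIN_CHAIN +
        " WHERE g.name = ? ORDER BY d.name", (genome_name,))
    return [row[0] for row in rows]


def genomes_with_domain(conn, domain_name):
    """Return the sorted, de-duplicated genome names containing a domain."""
    rows = conn.execute(
        "SELECT DISTINCT g.name " + JOIN_CHAIN +
        " WHERE d.name = ? ORDER BY g.name", (domain_name,))
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(domains_for_genome(db, "genome_A"))
    print(genomes_with_domain(db, "domain_Y"))
```

The same join path serves both directions: starting from a genome gives the functional domains it encodes, and starting from a domain gives every genome that carries it. That second query is the kind of lookup that linking genomes to protein domains makes possible.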

Today the IBM Functional Genomics Platform has approximately 220,000 high-quality bacterial and viral genomes, 64 million unique genes, 50 million unique proteins, and over 220 million unique protein domains. The repository also includes public metadata such as geography, food source (for foodborne isolates), and AMR assay data. The data keeps growing as new genomes are automatically detected and downloaded from public sources.

In response to the COVID-19 global pandemic, we extended this platform to include all public virus genomes including over 3M SARS-CoV-2 genomes, genes, proteins, and functional domains. If you're interested in learning more, please access our homepage at

Please note: If you are still using the old url ( for the IBM Functional Genomics Platform, the url has changed to If you are using for your bookmarks, no change is necessary.

Presentation to the Presidential Advisory Council on Combating Antibiotic-Resistant Bacteria (PACCARB)

Video of presentation (starting at 23:32)
PDF of presentation