OMXWare - overview
OMXWare is a relational database linking genotype to phenotype across more than 1,000 genera of bacterial life. Built on the IBM Cloud, OMXWare started with hundreds of thousands of bacterial genomes from GenBank and the NCBI Sequence Read Archive (SRA). These datasets are continuously updated, self-consistently assembled, and curated for quality, yielding whole bacterial genomes. The whole genomes are then annotated to identify every gene and protein they contain. Another set of cloud processes then discovers all of the domains within each protein. Protein domains are the fundamental objects of biochemistry: they can evolve, function, and exist independently of the larger protein chain in which they are found. They are also assigned standardized codes representing the molecular functions, biological pathways, and biological processes with which they are associated. All of these biological objects are then linked in the relational database. Linking genomes to protein domains is key to understanding phenotype in biology. OMXWare also provides APIs and an SDK. Important applications can be built on top of OMXWare, including services to predict antimicrobial resistance (AMR) and services to support the identification of food safety hazards or biothreats. Protein domains are also the targets of drugs, so identifying the particular domains within a genome or metagenome can inform the choice of prescribed medication and flag potential side effects for that specific genome (or metagenome).
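The genome-to-protein-to-domain linking described above can be illustrated with a small relational sketch. This is a toy model only: the table names, column names, function codes, and sample rows below are hypothetical placeholders, not OMXWare's actual schema, data, or API. It uses Python's built-in sqlite3 module to show how a question such as "which genomes contain a given protein domain?" becomes a chain of joins.

```python
import sqlite3

# Hypothetical toy schema: every table and column name here is an
# illustrative assumption, not OMXWare's real schema.
SCHEMA = """
CREATE TABLE genome  (id INTEGER PRIMARY KEY, name TEXT, genus TEXT);
CREATE TABLE protein (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE domain  (id INTEGER PRIMARY KEY, name TEXT, function_code TEXT);
CREATE TABLE genome_protein (
    genome_id  INTEGER REFERENCES genome(id),
    protein_id INTEGER REFERENCES protein(id)
);
CREATE TABLE protein_domain (
    protein_id INTEGER REFERENCES protein(id),
    domain_id  INTEGER REFERENCES domain(id)
);
"""


def build_toy_db():
    """Create an in-memory database filled with made-up placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO genome VALUES (?, ?, ?)",
        [(1, "genome_A", "genus_1"),
         (2, "genome_B", "genus_2"),
         (3, "genome_C", "genus_3")],
    )
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?)",
        [(1, "protein_X"), (2, "protein_Y")],
    )
    conn.executemany(
        "INSERT INTO domain VALUES (?, ?, ?)",
        [(1, "domain_1", "CODE:0001"), (2, "domain_2", "CODE:0002")],
    )
    # Which proteins each genome encodes.
    conn.executemany(
        "INSERT INTO genome_protein VALUES (?, ?)",
        [(1, 1), (2, 1), (2, 2), (3, 2)],
    )
    # Which domains each protein contains.
    conn.executemany(
        "INSERT INTO protein_domain VALUES (?, ?)",
        [(1, 1), (2, 2)],
    )
    return conn


def genomes_with_domain(conn, domain_name):
    """Return the sorted names of genomes that contain the named domain."""
    rows = conn.execute(
        """
        SELECT DISTINCT g.name
        FROM genome AS g
        JOIN genome_protein AS gp ON gp.genome_id = g.id
        JOIN protein_domain AS pd ON pd.protein_id = gp.protein_id
        JOIN domain AS d ON d.id = pd.domain_id
        WHERE d.name = ?
        ORDER BY g.name
        """,
        (domain_name,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_toy_db()
    print(genomes_with_domain(conn, "domain_1"))
```

Keeping the genome-protein and protein-domain relationships in separate link tables is one common way to model many-to-many relationships; a production system at this scale would likely differ in structure and indexing.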
Today OMXWare has approximately 200,000 high-quality bacterial genomes, 64 million unique genes, 50 million unique proteins, and over 220 million unique protein domains. The genomes in OMXWare also carry public metadata such as geography, food source (for foodborne isolates), and AMR assay data. The dataset continues to grow as new genomes are automatically detected and downloaded from public sources.