RoDEO: Robust Differential Gene Expression - overview
RoDEO (Robust DE Operator) is a framework for detecting differentially expressed genes and stable genes between RNA-seq experiments.
For detecting differentially expressed genes, we use a normalization that is based not on the relative values of gene expression but on the relative order of expression within a sample. The expression values of all genes in an experiment are used in a re-sampling approach that teases out robust relative ranks of the genes across several re-generated instances of the RNA-seq experiment.
Robust scale-free measure of expression
Instead of working directly with the expression value of a gene g, we define a character function φ for each gene g. The two most desirable properties of this function are that (i) it depends on the expression values of all the other genes in the assay and (ii) it is scale invariant.
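The two properties can be illustrated with a toy rank-based function. This is a minimal sketch, not RoDEO's actual character function: the helper name `rank_bins`, the tie-breaking rule, and the example counts are all assumptions made for illustration.

```python
def rank_bins(values, P=4):
    """Assign each gene a bin in 1..P from its rank among all genes.

    Illustrative stand-in for a rank-based character function;
    ties are broken by gene index (an arbitrary choice).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: (values[i], i))
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * P // n + 1
    return bins


# Hypothetical read counts for 8 genes in one sample.
counts = [5, 120, 33, 0, 870, 46, 12, 300]

# (ii) Scale invariance: multiplying every count by a constant
# (e.g. a 10x deeper sequencing run) leaves the bins unchanged.
scaled = [10 * c for c in counts]
print(rank_bins(counts))   # [1, 3, 2, 1, 4, 3, 2, 4]
print(rank_bins(scaled))   # same bins as above

# (i) Dependence on the other genes: gene 2 keeps its count of 33,
# but its bin changes because the genes around it changed.
others_changed = [50, 120, 33, 40, 870, 46, 35, 300]
print(rank_bins(counts)[2], rank_bins(others_changed)[2])  # 2 1
```

Because only the ordering enters the result, any strictly increasing rescaling of a sample gives the same bins, which is what makes samples of different depth comparable in this toy setting.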
The character function values are estimated by re-sampling and binning the genes into a robust set of P ranks per sample (e.g. P=20). The character function values of a gene g in experiments A and B are then compared to determine whether the gene is differentially expressed. A method for adjusting the parameter P across samples of vastly different sequencing depths has been described.
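The re-sample, bin, and compare steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the published RoDEO procedure: the multinomial re-sampling scheme, the random tie-breaking, the use of the mean bin as a comparison score, and all function names and example counts are hypothetical.

```python
import random
from collections import Counter


def resample_counts(counts, depth, rng):
    """Draw `depth` reads with probability proportional to the observed counts."""
    genes = range(len(counts))
    drawn = Counter(rng.choices(genes, weights=counts, k=depth))
    return [drawn.get(g, 0) for g in genes]


def rank_bins(values, P, rng):
    """Bin genes into 1..P by rank; equal values are ordered at random."""
    n = len(values)
    order = sorted(range(n), key=lambda i: (values[i], rng.random()))
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * P // n + 1
    return bins


def character_function(counts, P=20, rounds=100, seed=0):
    """Estimate, per gene, the fraction of re-sampled experiments in each bin."""
    rng = random.Random(seed)
    depth = sum(counts)
    hist = [[0] * P for _ in counts]
    for _ in range(rounds):
        sample = resample_counts(counts, depth, rng)
        for g, b in enumerate(rank_bins(sample, P, rng)):
            hist[g][b - 1] += 1
    return [[c / rounds for c in row] for row in hist]


def mean_bin(dist):
    """Expected bin (1..P) under a per-gene bin distribution."""
    return sum((b + 1) * p for b, p in enumerate(dist))


def bin_shift(phi_a, phi_b):
    """Per-gene change in mean bin from experiment A to experiment B."""
    return [mean_bin(b) - mean_bin(a) for a, b in zip(phi_a, phi_b)]


# Hypothetical counts: B is sequenced about 3x deeper than A,
# and gene 0 is strongly up-regulated in B.
sample_a = [10, 200, 35, 800, 60, 5, 400, 120]
sample_b = [3000, 600, 105, 2400, 180, 15, 1200, 360]

phi_a = character_function(sample_a, P=4, rounds=50, seed=1)
phi_b = character_function(sample_b, P=4, rounds=50, seed=2)
shift = bin_shift(phi_a, phi_b)
top = max(range(len(shift)), key=lambda g: abs(shift[g]))
print("largest shift: gene", top)
```

In this sketch a gene whose mean bin barely moves between A and B would be a candidate stable gene, and a gene with a large shift a candidate differentially expressed gene. Any threshold on the shift, and the choice of P relative to the number of genes and the sequencing depth, would need to be set with care and is not addressed here.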
Performance and Application
RoDEO outperforms existing differential expression detectors on benchmark datasets.
RoDEO has been applied to RNA-seq data from grass cultivars to detect differentially expressed and stable genes during salt stress [1, 3], to cacao gene expression studies for disease resistance, and to processing gene expression data for Parkinson's disease prediction.
RoDEO has also been applied to metagenomics datasets to detect differentially abundant micro-organisms and microbiome functions between samples [2, 5].
- Niina Haiminen, Manfred Klaas, Zeyu Zhou, Filippo Utro, Paul Cormican, Thomas Didion, Christian Sig Jensen, Chris Mason, Susanne Barth, Laxmi Parida: Comparative Exomics of Phalaris cultivars under salt stress. BMC Genomics 15(Suppl 6):S18, 2014.
- Anna Paola Carrieri, Niina Haiminen, Laxmi Parida: Host phenotype prediction from differentially abundant microbes using RoDEO. Lecture Notes in Computer Science 10477, pp. 27-41, Springer, 2017.
- Manfred Klaas, Niina Haiminen, Jim Grant, Paul Cormican, John Finnan, Sai Arojju Krishna, Filippo Utro, Tia Vellani, Laxmi Parida, Susanne Barth: Transcriptome characterization and differentially expressed genes under flooding and drought stress in the biomass grasses Phalaris arundinacea and Dactylis glomerata. Annals of Botany 124(13), 717-730, 2019.
- J. Alberto Romero Navarro, Wilbert Phillips-Mora, Adriana Arciniegas-Leal, Allan Mata-Quiros, Niina Haiminen, Guiliana Mustiga, Donald Livingstone III, Harm van Bakel, David Kuhn, Laxmi Parida, Andrew Kasarskis, Juan Carlos Motamayor: Application of genome wide association and genomic prediction for improvement of cacao productivity and resistance to black and frosty pod diseases. Frontiers in Plant Science, 2017.
- Filippo Utro, Niina Haiminen, Enrico Siragusa, Laura-Jayne Gardiner, Edward Seabolt, Ritesh Krishna, James Kaufman, Laxmi Parida: Hierarchically labeled database indexing allows scalable characterization of microbiomes. iScience, 2020.
- Sayan Mandal, Aldo Guzman-Saenz, Niina Haiminen, Saugata Basu, Laxmi Parida: A Topological Data Analysis Approach on Predicting Phenotypes from Gene Expression Data. Proc. 7th International Conference on Algorithms for Computational Biology (AlCoB), Lecture Notes in Bioinformatics, pp. 178-187, Springer, 2020.