Human Population Genomics - overview
There is broad consensus among researchers that the human species originated in Africa. How, then, did we come to populate every corner of the planet? It turns out that the genomes of individuals carry enough history to reconstruct some, if not all, of these movements. Doing so requires careful sampling of extant populations from different geographic regions and even more meticulous analysis to unravel these histories. This work is part of the Genographic Project.
Ancestral Recombinations Graphs at Genomic Scales
Identification of Recombinations in Sequences (IRiS)
Genetic recombinations play a key role in shaping chromosomal landscapes. The structure that captures these genetic events as the common evolutionary history of a set of samples is called an ancestral recombinations graph (ARG) in the population genetics literature. The statistical and combinatorial tools for identification of recombinations in sequences (IRiS) presented here are part of the RecoProject (a pilot in the Genographic Project), initiated by Laxmi Parida and Jaume Bertranpetit. The reconstructed ARG of a collection of samples is necessarily a subgraph of the true ARG, hence we call it a subARG.
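To make the ARG and subARG terminology concrete, here is a minimal sketch that stores an ARG as a directed acyclic graph with edges pointing from child to parent. The class name `ToyARG`, its methods, and the node labels are hypothetical and are not taken from IRiS. Treating a subARG as a plain subset of edges is a simplifying assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyARG:
    """Toy ARG (hypothetical): maps each node to the list of its parents."""
    parents: dict = field(default_factory=dict)

    def add_edge(self, child, parent):
        # Register both endpoints so that roots also appear as nodes.
        self.parents.setdefault(child, []).append(parent)
        self.parents.setdefault(parent, [])

    def edges(self):
        return {(c, p) for c, ps in self.parents.items() for p in ps}

    def recombination_nodes(self):
        # A node with two parents inherits material from two lineages.
        return {n for n, ps in self.parents.items() if len(ps) == 2}

    def is_sub_arg_of(self, other):
        # Simplifying assumption: a subARG is a subset of the edges.
        return self.edges() <= other.edges()


# "True" ARG on samples a, b, c with one recombination node r.
true_arg = ToyARG()
for child, parent in [("a", "x"), ("b", "x"), ("c", "r"),
                      ("r", "x"), ("r", "y"),
                      ("x", "root"), ("y", "root")]:
    true_arg.add_edge(child, parent)

# A reconstruction that recovers only the recombination event.
sub_arg = ToyARG()
for child, parent in [("c", "r"), ("r", "x"), ("r", "y")]:
    sub_arg.add_edge(child, parent)

found = true_arg.recombination_nodes()
contained = sub_arg.is_sub_arg_of(true_arg)
```

In this toy example `found` contains only the node `r`, and `contained` is true because every reconstructed edge also appears in the true graph.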
Given a collection of haplotypes, IRiS produces a subARG in two phases. A combinatorial algorithm called DSR is a model-based approach to detecting recombinations in haplotypes, with a guaranteed approximation factor. The algorithm is based on iteratively classifying sets of lineages as dominant, subdominant or recombinant (DSR). In the first phase, DSR is run multiple times with different sets of parameters, and a statistical consensus is derived from the runs to produce a matrix of recombination information called the recomatrix. The recomatrix encodes the local topology information of only the high-confidence recombination events detected in the first phase. In the second phase, the subARG is constructed from the recomatrix.
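The consensus step of the first phase can be illustrated with a simple voting scheme. The sketch below is not the IRiS implementation. It assumes that each DSR run reports a set of (haplotype index, breakpoint position) calls, that a call is kept when a minimum fraction of runs agree on it, and that the result is a binary haplotype-by-position matrix. The function name `consensus_recomatrix`, the `min_support` parameter, and the matrix layout are all hypothetical.

```python
from collections import Counter


def consensus_recomatrix(runs, n_haplotypes, n_positions, min_support=0.6):
    """Keep recombination calls supported by at least min_support of the runs.

    runs: list of sets of (haplotype_index, position) calls, one set per run
    (an assumed output format, not the actual DSR output).
    """
    counts = Counter(call for run in runs for call in set(run))
    needed = min_support * len(runs)
    matrix = [[0] * n_positions for _ in range(n_haplotypes)]
    for (hap, pos), support in counts.items():
        if support >= needed:
            matrix[hap][pos] = 1  # high-confidence event
    return matrix


# Three hypothetical runs with different parameter settings.
runs = [
    {(0, 2), (1, 4)},
    {(0, 2)},
    {(0, 2), (2, 1)},
]
recomatrix = consensus_recomatrix(runs, n_haplotypes=3, n_positions=5)
```

Only the call `(0, 2)` is reported by all three runs, so it is the only entry set to 1. The calls seen in a single run fall below the threshold and are dropped.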
The construction of the ARG at a genomic scale naturally raises the question of reconstructability in general. To understand this aspect of the problem, we have modeled the ARG as a random graph. This led to the identification of a small structure termed the minimal descriptor of an ARG [13-14].
Accuracy: The reconstructed ARG is superimposed on the true ARG of a Cosi simulation.
Left: the red nodes show the extracted recombination nodes after Phase 1. Right: The red nodes and branches superimposed on the true ARG. Notice the increase in density of the reconstructed nodes (and edges) after Phase 2.
(Images generated by Marc Pybus & Asif Javed using Pajek)
Based on IRiS Analysis.
Performance: Running time and memory requirements are shown here. Chr denotes the number of chromosome samples. Top: Phase 1. Middle: Phase 2. Bottom: Both phases. In each plot, the dark line shows the average value and the dotted lines span the 90th percentile of the values.
Generation of Ancestral Recombinations Graphs
Simulation based on Random graph Algorithms (SimRA)
Simulating complex evolution scenarios of multiple populations is an important task for answering many basic questions in population genomics. We present an algorithm, SimRA, that simulates a generic multiple-population evolution model with admixture. It is based on random graphs and improves dramatically on the time and space requirements of the classical algorithm for single populations.
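For orientation, the kind of process such simulators generate can be sketched with the textbook backward-in-time model for a single population, in which lineages either coalesce or recombine. This is not the SimRA algorithm and does not model multiple populations or admixture. It assumes time measured in coalescent units, a population-scaled recombination rate `rho`, and it tracks only the number of lineages rather than the ancestral material. The function name `simulate_arg_events` is hypothetical.

```python
import random


def simulate_arg_events(n_samples, rho, seed=0):
    """Return a list of (time, event) pairs until one lineage remains.

    With k lineages, coalescence occurs at rate k*(k-1)/2 and
    recombination at rate k*rho/2 (standard single-population model).
    """
    rng = random.Random(seed)
    k = n_samples
    t = 0.0
    events = []
    while k > 1:
        coal_rate = k * (k - 1) / 2
        rec_rate = k * rho / 2
        total = coal_rate + rec_rate
        t += rng.expovariate(total)  # waiting time to the next event
        if rng.random() < coal_rate / total:
            events.append((t, "coalescence"))
            k -= 1  # two lineages merge into one
        else:
            events.append((t, "recombination"))
            k += 1  # one lineage splits into two parental lineages
    return events


events = simulate_arg_events(n_samples=5, rho=1.0, seed=1)
n_coal = sum(1 for _, kind in events if kind == "coalescence")
n_rec = sum(1 for _, kind in events if kind == "recombination")
```

Because every coalescence removes one lineage and every recombination adds one, any run that starts with 5 samples and ends with a single lineage satisfies `n_coal - n_rec == 4`, which gives a quick sanity check on the output.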
Availability and implementation
SimRA (Simulation based on Random graph Algorithms) source, executable, user manual and sample input-output sets are available for downloading at: https://github.com/ComputationalGenomics/SimRA
Application (on population studies)
Download data used in these papers here.
Javed, A., Melé, M., Pybus, M., Zalloua, P., Haber, M., Comas, D., Netea, M., Balanovsky, O., Balanovska, E., Jin, L., Yang, Y., Arunkumar, G., Pitchappan, R.M., Bertranpetit, J., Calafell, F., Parida, L., and The Genographic Consortium, Recombination networks as genetic markers: a human variation study of the Old World, Human Genetics, 2011
Melé, M., Javed, A., Pybus, M., Zalloua, P., Haber, M., Comas, D., Netea, M., Balanovsky, O., Balanovska, E., Jin, L., Yang, Y., Pitchappan, R.M., Arunkumar, G., Parida, L., Calafell, F., Bertranpetit, J., and The Genographic Consortium, Recombination gives a new insight in the effective population size and the history of the Old World human populations, Molecular Biology and Evolution, 2011
Melé, M., Incorporating recombination into the study of Recent Human Evolutionary History, Evolutionary Biology Institute, Pompeu Fabra University, PhD Thesis, March 2011
Melé, M., Javed, A., Pybus, M., Calafell, F., Parida, L., Bertranpetit, J., and The Genographic Consortium, A New Method to Reconstruct Recombination Events at a Genomic Scale, PLoS Computational Biology, vol 6, No 11, pp e1001010, 2010
Parida, L., Javed, A., Melé, M., Calafell, F., Bertranpetit, J., and The Genographic Consortium, Minimizing recombinations in consensus networks for phylogeographic studies, BMC Bioinformatics, APBC, Beijing, 2009
Parida, L., Javed, A., Melé, M., and Bertranpetit, J., A case for Recombinomics, IBM Technical Report RC24677, August, 2008
Parida, L., Melé, M., Calafell, F., Bertranpetit, J., and The Genographic Consortium, Estimating the Ancestral Recombinations Graph (ARG) as compatible networks of SNP Patterns, Journal of Computational Biology, vol 15, No 9, pp 1--22, 2008
Platt, D.E., Utro, F., Parida, L., Effect of sampling on the extent and accuracy of the inferred genetic history of recombining genome, Computational Biology and Chemistry, 2014
Parida, L., Ancestral Recombinations Graph: A Reconstructability Perspective using Random-Graphs Framework, Journal of Computational Biology, vol 17, No 10, pp 1345--1370, 2010
Parida, L., Non-redundant Representation of Ancestral Recombinations Graphs, Evolutionary genomics: statistical and computational methods, Editor: Maria Anisimova, Methods in Molecular Biology (Springer) 2011
Javed, A., and Parida, L., Recombinomics: Population Genomics from a Recombination Perspective, Proceedings of C3S2E, Montreal, No 9, pp 129--137, 2010
Parida, L., Graph Model of Coalescence with Recombinations, The Problem Solving Handbook for Computational Biology and Bioinformatics (Lecture notes in mathematics) (Springer Verlag) 2010