Human Population Genomics       


Human Population Genomics - overview

Most researchers agree that the human species originated in Africa. How, then, did we come to populate every corner of the planet? It turns out that the genomes of individuals carry enough history to reconstruct some, if not all, of these movements. Doing so requires careful sampling of extant populations from the different geographic regions and even more meticulous analysis to unravel these histories. This work is part of the Genographic Project.

Ancestral Recombinations Graphs at Genomic Scales

Identification of Recombinations in Sequences (IRiS)

Genetic recombinations play a key role in shaping chromosomal landscapes. In the population genetics literature, the structure that captures these genetic events as the common evolutionary history of a set of samples is called an ancestral recombinations graph (ARG). The statistical and combinatorial tools for identification of recombinations in sequences (IRiS) presented here are part of the RecoProject (a pilot in the Genographic Project), initiated by Laxmi Parida and Jaume Bertranpetit. The reconstructed ARG of a collection of samples is necessarily a subgraph of the true ARG; hence we call it a subARG.
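To make the idea of detecting recombination from sequence data concrete, here is a minimal sketch of the classical four-gamete test. It is not the IRiS or DSR method; it is a textbook check that, under the infinite-sites assumption (each site mutates at most once), flags a pair of biallelic sites as incompatible with a single tree when all four allele combinations 00, 01, 10 and 11 are observed. The function name and the toy haplotypes are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def four_gamete_conflicts(haplotypes):
    """Return site pairs (i, j), i < j, at which all four gametes occur.

    haplotypes: list of equal-length strings over {'0', '1'}, one per sample.
    Assumes biallelic sites; this is an illustration, not the IRiS algorithm.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    conflicts = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct allele pairs seen at sites i and j.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            conflicts.append((i, j))
    return conflicts


# Toy data (hypothetical): rows are haplotypes, columns are SNPs.
toy_haplotypes = ["0000", "0011", "1100", "1111", "0110"]
conflicts = four_gamete_conflicts(toy_haplotypes)
print(conflicts)
```

Each reported pair indicates that, if no site mutated more than once, at least one recombination occurred somewhere between the two sites in the history of the samples. The test says nothing about which lineages recombined, which is the kind of information an ARG is meant to capture.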

Brief Description

Given a collection of haplotypes, IRiS produces a subARG in two phases. The first phase relies on DSR [10], a combinatorial, model-based algorithm for detecting recombinations in haplotypes (with a guaranteed approximation factor [7]) that iteratively classifies sets of lineages as dominant, subdominant or recombinant (DSR). DSR is run multiple times with different sets of parameters, and a statistical consensus [7] is derived from the runs to produce a matrix of recombination information called the recomatrix. The recomatrix encodes the local topology information of only the high-confidence recombination events detected in the first phase. In the second phase, the subARG is constructed from the recomatrix [2].
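The consensus step of the first phase can be pictured with a small sketch. Everything here is an assumption made for illustration: each run is modelled as a set of (sample index, interval index) recombination calls, a call is kept when it appears in at least `min_runs` runs, and the result is a 0/1 matrix. The actual recomatrix also encodes local topology information and uses the statistical consensus described in [7], neither of which is reproduced here.

```python
from collections import Counter


def consensus_matrix(runs, n_samples, n_intervals, min_runs):
    """Build a 0/1 samples-by-intervals matrix from repeated detection runs.

    runs: list of sets of (sample, interval) calls, one set per run
          (a hypothetical encoding, not the IRiS file format).
    min_runs: minimum number of runs that must agree on a call.
    """
    # Count in how many runs each call appears.
    support = Counter(call for run in runs for call in set(run))
    matrix = [[0] * n_intervals for _ in range(n_samples)]
    for (sample, interval), count in support.items():
        if count >= min_runs:
            matrix[sample][interval] = 1
    return matrix


# Three hypothetical runs with different parameter settings.
runs = [
    {(0, 1), (2, 3)},
    {(0, 1), (1, 0)},
    {(0, 1), (2, 3)},
]
recomatrix = consensus_matrix(runs, n_samples=3, n_intervals=4, min_runs=2)
for row in recomatrix:
    print(row)
```

Raising `min_runs` keeps only the calls that every parameter setting agrees on, which mirrors the idea of retaining high-confidence events at the cost of missing some true ones.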

Constructing the ARG at a genomic scale naturally raises the question of reconstructability in general. To understand this aspect of the problem, we have modeled the ARG as a random graph [15]. This led to the identification of a small structure termed the minimal descriptor of an ARG [13-14].

Accuracy: The reconstructed ARG is superimposed on the true ARG of a Cosi simulation.
Left: the red nodes show the extracted recombination nodes after Phase 1. Right: The red nodes and branches superimposed on the true ARG. Notice the increase in density of the reconstructed nodes (and edges) after Phase 2.
(Images generated by Marc Pybus & Asif Javed using Pajek)

Based on IRiS Analysis.

Performance: Running time and memory requirements are shown here. Chr denotes the number of chromosome samples. Top: Phase 1. Middle: Phase 2. Bottom: Both phases. In each plot, the dark line shows the average value and the dotted lines span the 90th percentile of the values.


The Windows (version) and Linux (version) releases are downloadable here, along with the User Manual.


Generation of Ancestral Recombinations Graphs

Simulation based on Random graph Algorithms (SimRA)

Simulating complex evolution scenarios of multiple populations is an important task for answering many basic questions in population genomics. We present SimRA [5], an algorithm that simulates a generic multiple-population evolution model with admixture. It is based on random graphs, which dramatically improve on the time and space requirements of the classical single-population algorithm.
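For orientation, the classical single-population process that such simulators start from can be sketched in a few lines. This is not SimRA: it is a minimal, assumed illustration of the standard backward-in-time coalescent with recombination, tracking only the number of ancestral lineages. With k lineages, coalescence occurs at rate k(k-1)/2 and recombination at rate k*rho/2, with time in coalescent units and rho the population-scaled recombination rate. The function and field names are hypothetical, and the sketch does not track which genomic segments each lineage carries.

```python
import random


def simulate_lineage_counts(n_samples, rho, seed=None):
    """Simulate the lineage-count process of the classical ARG model.

    Returns the number of coalescence and recombination events and the
    time at which a single ancestral lineage remains.
    """
    rng = random.Random(seed)
    k = n_samples
    height = 0.0
    n_coal = 0
    n_rec = 0
    while k > 1:
        coal_rate = k * (k - 1) / 2.0  # any pair of lineages may merge
        rec_rate = k * rho / 2.0       # any lineage may split in two
        total = coal_rate + rec_rate
        height += rng.expovariate(total)  # waiting time to the next event
        if rng.random() < coal_rate / total:
            k -= 1
            n_coal += 1
        else:
            k += 1
            n_rec += 1
    return {"coalescences": n_coal, "recombinations": n_rec, "height": height}


result = simulate_lineage_counts(8, rho=1.0, seed=42)
print(result)
```

Because every coalescence removes one lineage and every recombination adds one, the number of coalescences always exceeds the number of recombinations by exactly n_samples - 1. The number of events grows quickly with rho, which is one reason the cost of the classical approach becomes a concern at genomic scales.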

Availability and implementation

SimRA (Simulation based on Random graph Algorithms) source, executable, user manual and sample input-output sets are available for downloading at:

Related Publications

Application (on population studies)

Download data used in these papers here.

  1. Utro, F., Cornejo, O.E., Livingstone, D., Motamayor, J.C., Parida, L., ARG-based genome-wide analysis of cacao cultivars, BMC Bioinformatics, 2012

  2. Javed, A., Melé, M., Pybus, M., Zalloua, P., Haber, M., Comas, D., Netea, M., Balanovsky, O., Balanovska, E., Jin, L., Yang, Y., Arunkumar G., Pitchappan, R.M., Bertranpetit, J., Calafell, F., Parida, L., and The Genographic Consortium, Recombination networks as genetic markers: a human variation study of the Old World, Human Genetics, 2011

  3. Melé, M., Javed, A., Pybus, M., Zalloua, P., Haber, M., Comas, D., Netea, M., Balanovsky, O., Balanovska, E., Jin, L., Yang, Y., Pitchappan, R.M., Arunkumar G., Parida, L., Calafell, F., Bertranpetit J., and The Genographic Consortium, Recombination gives a new insight in the effective population size and the history of the Old World human populations, Molecular Biology and Evolution, 2011

  4. Melé, M., Incorporating recombination into the study of Recent Human Evolutionary History, Evolutionary Biology Institute, Pompeu Fabra University, PhD Thesis, March 2011


  5. Carrieri, A.P., Utro, F., and Parida, L., Sampling ARG of multiple populations under complex configurations of subdivision and admixture, Bioinformatics, 2015

  6. Javed, A., Pybus, M., Melé, M., Utro, F., Bertranpetit, J., Calafell, F., and Parida, L., IRiS: Construction of ARG network at genomic scales, Bioinformatics, 2011

  7. Melé, M., Javed, A., Pybus, M., Calafell, F., Parida, L., Bertranpetit, J., and The Genographic Consortium, A New Method to Reconstruct Recombination Events at a Genomic Scale, PLoS Computational Biology, vol 6, No 11, pp e1001010, 2010

  8. Parida, L., Javed, A., Melé, M., Calafell, F., Bertranpetit, J., and The Genographic Consortium, Minimizing recombinations in consensus networks for phylogeographic studies, BMC Bioinformatics, APBC, Beijing, 2009

  9. Parida, L., Javed, A., Melé, M., and Bertranpetit, J., A case for Recombinomics, IBM Technical Report RC24677, August, 2008

  10. Parida, L., Melé, M., Calafell, F., Bertranpetit, J., and The Genographic Consortium, Estimating the Ancestral Recombinations Graph (ARG) as compatible networks of SNP Patterns, Journal of Computational Biology, vol 15, No 9, pp 1--22, 2008


  11. Platt, D.E., Utro, F., Parida, L., Effect of sampling on the extent and accuracy of the inferred genetic history of recombining genome, Computational Biology and Chemistry, 2014

  12. Platt, D.E., Utro, F., Pybus, M., Parida, L., Genetic History of Populations: Limits to Inference, Models and Algorithms for Genome Evolution, 2013

  13. Utro, F., Pybus, M., and Parida, L., Sum of parts is greater than the whole: inference of common genetic history of populations, BMC Genomics, APBC, Vancouver, Canada, 2013

  14. Parida, L., Palamara, P.F., and Javed, A., A Minimal Descriptor of an Ancestral Recombinations Graph, BMC Bioinformatics, APBC, Inchon, Korea, 2011

  15. Parida, L., Ancestral Recombinations Graph: A Reconstructability Perspective using Random-Graphs Framework, Journal of Computational Biology, vol 17, No 10, pp 1345--1370, 2010


  16. Parida, L., Non-redundant Representation of Ancestral Recombinations Graphs, Evolutionary genomics: statistical and computational methods, Editor: Maria Anisimova, Methods in Molecular Biology (Springer) 2011

  17. Javed, A., and Parida, L., Recombinomics: Population Genomics from a Recombination Perspective, Proceedings of C3S2E, Montreal, No 9, pp 129--137, 2010

  18. Parida, L., Graph Model of Coalescence with Recombinations, The Problem Solving Handbook for Computational Biology and Bioinformatics (Lecture notes in mathematics) (Springer Verlag) 2010