Human Population Genomics       

Human Population Genomics - RECO project data


Haplotype data is provided for 33 worldwide populations analyzed for five regions on the X chromosome. Please see the manuscripts for a detailed description of these regions and population abbreviations.

Data

The data is released in two formats: a generic text format and the IRiS format.

Generic format

The dataset is split by population and region, giving 165 files (33 populations x 5 regions) in 'Generic_format.zip'. Each file is named after the population and region it contains. For example, 'ADI_1.txt' contains haplotypes for the first region for Adi samples. All files share the format described below.
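
As a rough illustration of the naming scheme, the Python sketch below splits a file name into its population and region parts. It assumes every name follows the '<population>_<region>.txt' pattern seen in 'ADI_1.txt'; the function name parse_generic_filename is hypothetical and not part of the release.

```python
from pathlib import Path


def parse_generic_filename(name):
    """Split a name such as 'ADI_1.txt' into (population, region).

    Assumption: the region number is the part after the last underscore.
    """
    stem = Path(name).stem  # 'ADI_1.txt' -> 'ADI_1'
    population, _, region = stem.rpartition("_")
    if not population or not region.isdigit():
        raise ValueError(f"unexpected file name: {name!r}")
    return population, int(region)
```

Calling parse_generic_filename('ADI_1.txt') yields the population abbreviation and the region number, which can then be used to group the files by population or by region.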

The first line contains the population name, chromosome, reference genome build and sample names. Each subsequent line describes one marker: the marker name, chromosome and position, followed by the allele carried by each sample. In the example below, the first marker is 'rs5970600' on the X chromosome at position 22509816 with respect to human genome build 36. The first sample, ADI1, has allele A at this position. All markers included in the study are biallelic, and they are listed in each file in order of chromosomal position.

ADI chr NCBI36 ADI1 ...
rs5970600 X 22509816 A ...
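
A minimal Python sketch of a reader for this layout follows. It assumes the fields are separated by whitespace and that each marker line carries one allele per sample, in the order given on the header line; neither detail is confirmed beyond the example above, and read_generic is a hypothetical helper name.

```python
def read_generic(handle):
    """Parse one Generic-format file from an open text handle.

    Assumptions: whitespace-separated fields, and one allele column
    per sample name listed on the header line.
    """
    # Header: population, chromosome label, genome build, then sample names.
    population, chrom_label, build, *samples = next(handle).split()

    markers = []
    for line in handle:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        # Marker line: marker name, chromosome, position, then alleles.
        name, chrom, position, *alleles = fields
        if len(alleles) != len(samples):
            raise ValueError(f"{name}: expected {len(samples)} alleles, "
                             f"found {len(alleles)}")
        markers.append({
            "name": name,
            "chrom": chrom,
            "position": int(position),
            "alleles": dict(zip(samples, alleles)),
        })

    return {
        "population": population,
        "chrom_label": chrom_label,
        "build": build,
        "samples": samples,
        "markers": markers,
    }
```

With a file opened as, for instance, open('ADI_1.txt'), the returned dictionary gives the sample names under "samples" and, for each marker, a mapping from sample name to allele.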

IRiS format

The dataset is also provided in a format compatible with the IRiS software and can be used directly with the IRiS pipeline. In this format, the data for all 33 populations is concatenated for each region. Please see the IRiS user manual for details of the format.