We used HapFABIA to extract short
IBD segments from the 1000 Genomes
Project genotyping data (2), more specifically,
the phase 1 integrated variant call set (version 1)
containing phased genotype calls
for SNVs, short indels, and large deletions.
This data set consists of 1,092 individuals
(246 Africans, 181 Admixed Americans, 286 East
Asians, and 379 Europeans),
36.6M SNVs, 3.8M short indels, and 14k large deletions.
Chromosome 1 contains 3,201,157 SNVs
that are on average 78 bp apart and
have an average minor allele frequency (MAF) of 0.06.
1,920,833 (60%) SNVs are rare (MAF ≤ 0.05),
684,171 (21.4%) are
private (minor allele is observed only once),
15,124 (0.47%) have an MAF of zero, and 581,029 (18.2%) are
common (MAF > 0.05).
We kept only the rare SNVs for IBD detection and excluded private ones.
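
The SNV classes above can be illustrated with a minimal Python sketch. It assumes that the MAF is computed as the minor allele count divided by the number of phased chromosomes (2 x 1,092 = 2,184 for an autosome) and that an SNV counts as rare when its MAF is at most 0.05; the function names `classify_snv` and `keep_for_ibd` are hypothetical and not part of HapFABIA.

```python
N_INDIVIDUALS = 1092
N_CHROMOSOMES = 2 * N_INDIVIDUALS  # assumption: diploid autosome, phased haplotypes


def classify_snv(minor_allele_count, n_chromosomes=N_CHROMOSOMES, rare_threshold=0.05):
    """Assign an SNV to one of the four classes described in the text."""
    if minor_allele_count == 0:
        return "monomorphic"  # MAF of zero
    if minor_allele_count == 1:
        return "private"  # minor allele observed only once
    maf = minor_allele_count / n_chromosomes
    return "rare" if maf <= rare_threshold else "common"


def keep_for_ibd(minor_allele_count):
    """Keep only rare SNVs; private, monomorphic, and common SNVs are dropped."""
    return classify_snv(minor_allele_count) == "rare"
```

With 2,184 chromosomes, the threshold of 0.05 corresponds to a minor allele count of 109.2, so under these assumptions SNVs with 2 to 109 copies of the minor allele would be retained.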
Chromosome 1 was divided into intervals of 10,000 SNVs with
adjacent intervals overlapping by 5,000 SNVs.
After removing common and private SNVs,
we applied HapFABIA to these intervals.
We ran HapFABIA for 40 iterations and estimated the parameter
from the 1000 Genomes Project data.
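
The division into overlapping intervals can be sketched as follows. This is an illustration only, not the HapFABIA implementation: the helper `overlapping_intervals` is hypothetical, intervals are returned as half-open index ranges over the SNV positions, and the handling of the final, possibly shorter interval is an assumption that the text does not specify.

```python
def overlapping_intervals(n_snvs, size=10_000, shift=5_000):
    """Return half-open (start, end) SNV index ranges of length `size`.

    Consecutive intervals start `shift` SNVs apart, so adjacent
    intervals overlap by `size - shift` SNVs. The last interval is
    truncated at `n_snvs` (an assumption of this sketch).
    """
    intervals = []
    start = 0
    while start < n_snvs:
        end = min(start + size, n_snvs)
        intervals.append((start, end))
        if end == n_snvs:
            break  # the chromosome end has been reached
        start += shift
    return intervals


# Example: 25,000 SNVs are covered by four intervals.
for start, end in overlapping_intervals(25_000):
    print(start, end)
```

Because each SNV away from the chromosome ends falls into two intervals, an IBD segment that straddles one interval boundary lies well inside the neighboring interval.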