
HapFABIA

We recently developed HapFABIA (1) to identify very short segments of identity by descent (IBD) in large sequencing data. HapFABIA identifies IBD segments that are 100 times shorter than those found by current state-of-the-art methods: 10 kbp for HapFABIA vs. 1 Mbp for state-of-the-art methods. In experiments with artificial, simulated, and real genotyping data, HapFABIA outperformed its competitors in detecting short IBD segments (1). HapFABIA is based on biclustering (4), which in turn uses machine learning techniques derived from maximizing the posterior in a Bayes framework (11,5,7,8,10,12,6,9).

HapFABIA is designed to detect short IBD segments in genotype data obtained from next generation sequencing (NGS), but it can also be applied to DNA microarray data. In NGS data in particular, HapFABIA exploits rare variants for IBD detection. Rare variants convey more information on IBD than common variants, because random minor allele sharing is less likely for rare variants than for common variants (13). To detect short IBD segments, both the information supplied by rare variants and the information from IBD segments that are shared by more than two individuals should be utilized (13). HapFABIA uses both. The probability of randomly sharing a segment depends on the allele frequencies within the segment and on the number of individuals that share it.

The shorter the IBD segments, the higher the likelihood that they are shared by more individuals (see Section 6). Therefore, we focus on short IBD segments. There is a trade-off between low minor allele frequency (MAF) and many individuals sharing a segment (see Section 7). Consequently, a segment that contains rare variants and is shared by more individuals has a higher probability of representing IBD (15,14). These two characteristics are our basis for detecting short IBD segments with HapFABIA.
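
The argument that rare variants and multiple carriers both lower the chance of random sharing can be illustrated with a small Python sketch. It is not HapFABIA's model: it assumes, purely for illustration, that haplotypes are independent and that the variants in a segment are independent of each other (no linkage disequilibrium), so that the probability of a fixed set of haplotypes all carrying the minor allele at every variant is a simple product of MAF powers. The function name and the example MAF values are hypothetical.

```python
import math


def random_sharing_probability(mafs, n_carriers):
    """Probability that n_carriers given haplotypes all carry the minor
    allele at every variant in a segment purely by chance.

    Illustrative assumption (not HapFABIA's model): haplotypes and
    variants are mutually independent, so the probability is the
    product over variants of maf ** n_carriers.
    """
    if n_carriers < 1:
        raise ValueError("n_carriers must be at least 1")
    log_prob = 0.0
    for maf in mafs:
        if not 0.0 < maf <= 0.5:
            raise ValueError("each MAF must lie in (0, 0.5]")
        # Sum in log space to avoid underflow for long segments.
        log_prob += n_carriers * math.log(maf)
    return math.exp(log_prob)


# Hypothetical segments with three tagging variants each.
rare_segment = [0.01, 0.01, 0.01]
common_segment = [0.30, 0.30, 0.30]

p_rare_pair = random_sharing_probability(rare_segment, 2)
p_common_pair = random_sharing_probability(common_segment, 2)
p_rare_four = random_sharing_probability(rare_segment, 4)
```

Under these simplifying assumptions, the rare-variant segment is far less likely to be shared by chance than the common-variant segment, and requiring four carriers instead of two lowers the probability further. This mirrors the two characteristics named above.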


Sepp Hochreiter 2013-11-13