We tested whether IBD segments that largely match a particular archaic genome are found more often in certain populations than expected by chance. For each IBD segment, we computed two values. The first value was the proportion of tagSNVs that match a particular archaic genome, which we call the ``genome proportion'' of an IBD segment (e.g. ``Denisova proportion''). The second value was the proportion of individuals from a certain population among all individuals that possess this IBD segment. We call this value the ``population proportion'' of an IBD segment (e.g. ``Asian proportion''). Consider the following illustrative examples. If an IBD segment has 20 tagSNVs of which 10 match Denisova bases with their minor allele, then we obtain 10/20=0.5=50% as the Denisova proportion. If an IBD segment is observed in 6 individuals of which 4 are Africans and 2 are Europeans, then the African proportion is 4/6=0.67=67% and the European proportion is 2/6=0.33=33%. A correlation between a genome proportion and a population proportion would indicate that this genome is overrepresented in this specific population. Pearson's product moment correlation test and Spearman's rank correlation test both showed highly significant correlations between Denisova genome and Asians, Denisova genome and Europeans, Neandertal genome and Asians, and Neandertal genome and Europeans. Fig. 1 shows Pearson's correlation coefficients for the correlation between population proportion and Denisova genome proportion. Asians have a significantly larger correlation to the Denisova genome than other populations. Many IBD segments that match the Denisova genome are found exclusively in Asians, which has a large effect on the correlation coefficient. Europeans have a significantly larger correlation to the Denisova genome than the average.
HapMapMexicans have a surprisingly high correlation to the Denisova genome, while Iberians have a low correlation compared to other Europeans. Fig. 2 shows Pearson's correlation coefficients for the correlation between population proportion and Neandertal genome proportion. Europeans and Asians have a significantly larger correlation to the Neandertal genome than other populations. Asians have slightly higher correlation coefficients than Europeans. Again, HapMapMexicans have a surprisingly high correlation to the Neandertal genome, while Iberians have a low correlation compared to other Europeans. However, correlation tests are sensitive to accumulations of minor effects. Therefore, we subsequently focused on strong effects, i.e. large values of genome proportions and large values of population proportions.
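The two proportions and their correlation can be sketched in a few lines of Python. This is only an illustration of the definitions above, not the analysis code used for the figures: the function names are hypothetical, the per-segment toy values are invented for the example, and Pearson's coefficient is computed by hand with the standard library rather than with a statistics package.

```python
from math import sqrt

def genome_proportion(n_matching_tagsnvs, n_tagsnvs):
    """Fraction of an IBD segment's tagSNVs that match the archaic genome."""
    return n_matching_tagsnvs / n_tagsnvs

def population_proportion(carriers, population):
    """Fraction of the individuals carrying the segment that belong to `population`.

    `carriers` is a list with one population label per carrying individual.
    """
    return carriers.count(population) / len(carriers)

def pearson(x, y):
    """Pearson's product moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Worked examples from the text.
denisova_prop = genome_proportion(10, 20)                 # 10/20
carriers = ["African"] * 4 + ["European"] * 2
african_prop = population_proportion(carriers, "African")    # 4/6
european_prop = population_proportion(carriers, "European")  # 2/6

# Hypothetical toy data: one (genome proportion, population proportion)
# pair per IBD segment, invented purely for illustration.
toy_genome_props = [0.05, 0.10, 0.30, 0.50, 0.60]
toy_asian_props = [0.00, 0.10, 0.40, 0.80, 1.00]
r = pearson(toy_genome_props, toy_asian_props)
```

A positive `r` for a given population corresponds to the situation described above, in which segments with a high genome proportion tend to be carried predominantly by that population.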
We define an IBD segment to match a particular archaic genome if the genome proportion is 30% or higher. On average, only 10% of the Denisova and 6% of the Neandertal bases (about 10% of the called bases) match the minor allele of the human genome. Therefore, with a threshold of 30% against a background of about 10%, we require an odds ratio of 3 to call an IBD segment a match to an archaic genome. We found many more IBD segments that match the Neandertal or the Denisova genome than expected by chance. This again supports the statement that the detected short IBD segments are old and that some of them date back to the time of the ancestors of humans, Neandertals, and Denisovans. IBD segments that match the Denisova genome often match the Neandertal genome, too; thus, these segments cannot be attributed to either one of these genomes. Therefore, we introduce the ``Archaic genome'' (genome of archaic hominids ancestral to Denisovan and Neandertal), to which IBD segments are attributed if they match both the Denisova and the Neandertal genome.
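The matching rule can be written down as a small classifier. The sketch below is a hypothetical illustration: the names are invented, and it assumes that a segment matching both genomes is attributed only to the ``Archaic genome'' rather than being counted for Denisova or Neandertal as well, which is one possible reading of the definition above.

```python
MATCH_THRESHOLD = 0.30  # genome proportion of 30% or higher counts as a match

def matches(genome_prop, threshold=MATCH_THRESHOLD):
    """True if the segment's genome proportion reaches the matching threshold."""
    return genome_prop >= threshold

def assign_genome(denisova_prop, neandertal_prop):
    """Attribute an IBD segment to a genome, or return None if it matches neither.

    Assumption: segments matching both archaic genomes go to "Archaic" only.
    """
    is_denisova = matches(denisova_prop)
    is_neandertal = matches(neandertal_prop)
    if is_denisova and is_neandertal:
        return "Archaic"
    if is_denisova:
        return "Denisova"
    if is_neandertal:
        return "Neandertal"
    return None

# The segment from the earlier example (Denisova proportion 0.5) with a
# hypothetical Neandertal proportion of 0.1 is attributed to Denisova.
label = assign_genome(0.5, 0.1)
```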
Next we investigated which population has the maximum proportion for an IBD segment that matches a particular genome, that is, the population that contributes the largest share of the individuals possessing this segment. Figure 3 shows the population with maximum proportion for each IBD segment. The IBD segments are presented for each genome, where the colors show the population with maximum proportion for the corresponding IBD segment. For almost half of the Neandertal matching IBD segments, Asians or Europeans are the population with maximum proportion. For the Archaic genome (intersection of Neandertal and Denisovan matching IBD segments), IBD segments dominated by Asians or Europeans are also enriched if compared to all IBD segments found in chromosome 1 of the 1000 Genomes Project data (we call the set of these segments ``human genome''). The enrichment of Asian or European IBD segments is lower for the Denisovan genome, but still significant (see tests in next paragraph). Next we asked which populations contain an IBD segment that matches a particular genome, that is, we asked whether this IBD segment is found in this population or not. Figure 4 shows for each genome (human and archaic) and each IBD segment whether a population contains this IBD segment or not. IBD segments that match the Neandertal or the Archaic genome are found more often in Asians and Europeans than all IBD segments (human genome). This effect is not as prominent for IBD segments that match the Denisovan genome, but still significant (see tests in next paragraph).
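Both per-segment summaries, the population with maximum proportion and the presence or absence of a segment in each population, follow directly from the list of carriers. The following sketch uses hypothetical function names and an invented carrier list; ties for the maximum are resolved arbitrarily here (first label encountered), which is an assumption and not something specified in the text.

```python
from collections import Counter

def max_population(carriers):
    """Population contributing the largest share of the segment's carriers.

    `carriers` holds one population label per carrying individual.
    Ties are resolved in favor of the label that appears first.
    """
    counts = Counter(carriers)
    return max(counts, key=counts.get)

def populations_present(carriers):
    """Set of populations in which the segment is found at least once."""
    return set(carriers)

# Hypothetical segment carried by three Asians, one European, and one African.
carriers = ["Asian", "Asian", "European", "Asian", "African"]
dominant = max_population(carriers)
present = populations_present(carriers)
```

The first summary assigns each segment to exactly one population, while the second can flag several populations for the same segment.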
We consider strong effects in terms of population proportions, where a considerable population proportion is 20% or higher. Hence, a population has a considerable proportion of an IBD segment if at least 20% of the individuals that possess the IBD segment belong to this population. IBD segments were classified into (i) those that match or do not match a particular archaic genome and (ii) those that have or do not have a considerable proportion of a certain population. We tested whether these classes are related using Fisher's exact test for count data. IBD segments matching the Denisova genome are enriched in the Asian (odds ratio of 4.7 and p-value 1e-308) and the European population (odds ratio of 2.3 and p-value 2.7e-152). Fig. 5 shows the odds ratios of Fisher's exact test for the association of Denisova matching IBD segments with single populations. Asians have clearly higher odds ratios because many IBD segments that match the Denisova genome are found exclusively in Asians. Enrichment of the Denisova genome is also found in Europeans. Other thresholds lead to similar odds ratios and p-values. This confirms previous findings that European and Asian genomes are enriched by the Denisova genome if compared to Africans (29,26). IBD segments that match the Neandertal genome are enriched in Asians (odds ratio of 14.0 and p-value 1e-308) and in Europeans (odds ratio of 7.5 and p-value 1e-308). Fig. 6 shows the odds ratios of Fisher's exact test for the association of Neandertal matching IBD segments with single populations. Asians and Europeans have clearly higher odds ratios. Asians have slightly higher odds ratios than Europeans. Again, our results are in accordance with previous findings (30,28). In particular, Wall et al. (30) report that more Neandertal DNA is found in modern East Asians than in modern Europeans.
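The classification and test described above can be illustrated with a 2x2 contingency table. The sketch below is a simplified stand-in, not the procedure that produced the reported numbers: all names and toy segments are hypothetical, the odds ratio is the plain sample odds ratio (statistical packages may instead report a conditional maximum likelihood estimate), and the p-value is a one-sided hypergeometric tail, which assumes an enrichment test since the text does not state the alternative used.

```python
from math import comb

def contingency_table(segments, population,
                      genome_threshold=0.30, pop_threshold=0.20):
    """Count segments by (matches genome?, considerable population proportion?).

    `segments` is an iterable of (genome_prop, {population: proportion}) pairs.
    Returns (a, b, c, d): a = match and considerable, b = match only,
    c = considerable only, d = neither.
    """
    a = b = c = d = 0
    for genome_prop, pop_props in segments:
        is_match = genome_prop >= genome_threshold
        is_considerable = pop_props.get(population, 0.0) >= pop_threshold
        if is_match and is_considerable:
            a += 1
        elif is_match:
            b += 1
        elif is_considerable:
            c += 1
        else:
            d += 1
    return a, b, c, d

def sample_odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c); undefined (raises) if b*c == 0."""
    return (a * d) / (b * c)

def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null with all margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)

# Hypothetical toy segments: (Denisova proportion, population proportions).
toy_segments = [
    (0.50, {"Asian": 1.0}),
    (0.40, {"Asian": 0.5, "European": 0.5}),
    (0.10, {"African": 1.0}),
    (0.05, {"Asian": 0.1, "African": 0.9}),
    (0.35, {"African": 1.0}),
]
table = contingency_table(toy_segments, "Asian")
p_value = fisher_one_sided(3, 1, 1, 3)  # small textbook-style table
```

An odds ratio above 1 together with a small p-value indicates that matching segments have a considerable proportion of the given population more often than non-matching segments do.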
IBD segments that match an ancestral genome are enriched in Asians (odds ratio of 1.3 and p-value 2.2e-08) and Europeans (odds ratio of 1.5 and p-value 2.1e-29). However, the ancestral (primate) genomes exhibit a considerable overlap with archaic hominid genomes, potentially confounding matches with ancestral genomes. Thus, the results on matches with the ancestral genome must be considered with care.