Secure discovery of genetic relatives across large-scale and distributed genomic data sets

Table 1.

SF-Relate achieves near-perfect accuracy for identifying close relatives in the UK Biobank and All of Us data sets

| Data set | Recall, zero degree | Recall, first degree | Recall, second degree | Recall, third degree | Recall, overall | Precision | % of comparisons w.r.t. all-pairwise |
|---|---|---|---|---|---|---|---|
| UKB-200K | 100.0% (16/16) | 100.0% (4702/4702) | 99.8% (1709/1711) | 94.9% (8475/8925) | 97.0% (14,902/15,354) | 98.5% (14,902/15,129) | 0.13% |
| UKB-100K | 100.0% (6/6) | 100.0% (1243/1243) | 100.0% (404/404) | 95.1% (2169/2279) | 97.2% (3822/3932) | 98.7% (3822/3872) | 0.26% |
| AoU-20K | 100.0% (14/14) | 100.0% (209/209) | 100.0% (93/93) | 94.1% (145/154) | 98.0% (461/470) | 100.0% (461/461) | 1.28% |
  • Ground-truth relatedness degrees for the recall and precision metrics are obtained using the KING method, assigning each sample to the lowest degree of relatedness observed. SF-Relate obtains accurate results while performing only a small fraction of the comparisons that an all-pairwise comparison between data sets would require. (w.r.t.) With respect to.
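As a rough illustration of the footnote, the sketch below shows one way to assign each sample its lowest observed relatedness degree and to recompute the overall recall and precision from the UKB-200K counts in the table. The function names and the toy pairs are hypothetical, and this is not the authors' evaluation code; only the counts are taken from Table 1. The percentages printed in the table may differ from freshly computed values in the last digit, depending on the rounding convention.

```python
def lowest_degree_per_sample(pairs):
    """Map each sample to the lowest (closest) degree seen among its pairs.

    `pairs` is an iterable of (sample_a, sample_b, degree) tuples, where
    degree is an integer (0 = zero degree, 1 = first degree, ...).
    """
    best = {}
    for a, b, degree in pairs:
        for sample in (a, b):
            if sample not in best or degree < best[sample]:
                best[sample] = degree
    return best


def percent(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


# Toy, made-up pairs used only to illustrate the assignment rule.
toy_pairs = [("s1", "s2", 1), ("s1", "s3", 3), ("s4", "s3", 2)]
toy_truth = lowest_degree_per_sample(toy_pairs)

# UKB-200K recall counts (found, total) per degree, copied from Table 1.
ukb200k_recall = {
    "zero": (16, 16),
    "first": (4702, 4702),
    "second": (1709, 1711),
    "third": (8475, 8925),
}
found = sum(hit for hit, _ in ukb200k_recall.values())
total = sum(n for _, n in ukb200k_recall.values())

overall_recall = percent(found, total)
# Precision numerator and denominator as listed in Table 1 for UKB-200K.
overall_precision = percent(14902, 15129)

print(toy_truth)
print(f"overall recall: {overall_recall:.2f}% ({found}/{total})")
print(f"precision: {overall_precision:.2f}%")
```

The per-degree counts sum to the overall recall counts given in the table (14,902 of 15,354 for UKB-200K), which is a quick consistency check on the reconstructed rows.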

Genome Res. 34: 1312-1323
