Overview of TX-Phase. We illustrate the workflow of TX-Phase (A) and our key algorithmic techniques (B). The workflow proceeds as follows. (1) The user and the trusted execution environment (TEE; also called a secure enclave) of the service provider (SP) engage in a remote attestation protocol to verify the integrity of the TEE and of the TX-Phase program binary. (2) The user establishes a secure channel with the TEE and uploads one or more unphased genotype samples through it. (3) The SP runs TX-Phase inside the secure enclave, using a locally held haplotype reference panel to phase the user's genotype samples. (4) The user downloads the phased haplotypes from the TEE via the secure channel. Throughout this process, TX-Phase keeps the user's genomic data confidential from both the SP and other users of the system. We introduce two key techniques, compressed haplotype selection and dynamic fixed-point arithmetic, which enable a phasing algorithm for TEEs that is efficient, accurate, and protected against side-channel data leakage. Compressed haplotype selection refers to our novel use of compressed reference panels throughout the algorithm to minimize the overhead of memory-oblivious computation. Dynamic fixed-point arithmetic improves the precision of the fixed-point operations required for constant-time computation by dynamically adjusting the scales of numbers in matrices and vectors. Our results show that both techniques are essential for the secure and practical deployment of TEE-based haplotype phasing.
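
To make the idea of dynamically adjusting scales concrete, the following is a minimal Python sketch, not the TX-Phase implementation. It assumes a vector of integer mantissas that share a single exponent, renormalized after each multiplication so that the largest entry keeps its leading bits. The names `FRAC_BITS`, `normalize`, and `repeated_scale` are hypothetical, and the code makes no constant-time guarantee: in this sketch the shift amount depends on the data, which an enclave implementation would have to handle obliviously.

```python
import math

FRAC_BITS = 20  # hypothetical number of fractional bits per mantissa


def to_fixed(x: float) -> int:
    """Encode a real number as an integer mantissa with FRAC_BITS fractional bits."""
    return round(x * (1 << FRAC_BITS))


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point mantissas, truncating the extra fractional bits."""
    return (a * b) >> FRAC_BITS


def normalize(mantissas, exponent):
    """Rescale the whole vector so its largest mantissa lies in [2^FRAC_BITS, 2^(FRAC_BITS+1)).

    The represented values are unchanged: the shift applied to the mantissas
    is compensated in the shared exponent.
    """
    top = max(mantissas)
    if top == 0:
        return list(mantissas), exponent
    shift = FRAC_BITS + 1 - top.bit_length()
    if shift >= 0:
        scaled = [m << shift for m in mantissas]
    else:
        scaled = [m >> -shift for m in mantissas]
    return scaled, exponent - shift


def decode(mantissa: int, exponent: int) -> float:
    """Convert a mantissa and shared exponent back to a float."""
    return math.ldexp(mantissa, exponent - FRAC_BITS)


def repeated_scale(values, factor, steps, dynamic):
    """Multiply every entry by `factor` `steps` times, optionally renormalizing each step."""
    mantissas = [to_fixed(v) for v in values]
    exponent = 0
    f = to_fixed(factor)
    for _ in range(steps):
        mantissas = [fixed_mul(m, f) for m in mantissas]
        if dynamic:
            mantissas, exponent = normalize(mantissas, exponent)
    return [decode(m, exponent) for m in mantissas]


# Small probabilities shrink quickly, as in repeated HMM-style updates.
static = repeated_scale([0.5, 0.25], 0.01, 10, dynamic=False)
dynamic = repeated_scale([0.5, 0.25], 0.01, 10, dynamic=True)
print("static scale: ", static)
print("dynamic scale:", dynamic)
```

With a fixed scale, the entries in this example underflow to zero after a few multiplications, whereas the shared-exponent variant stays close to the true values of roughly 5e-21 and 2.5e-21.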
