Weiwen Zhang*, Long Wang, Theint Theint Aye, Juniarto Samsudin and Yongqing Zhu. Pages 1-9 (9)
Background: Genotype imputation as a service has been developed to enable researchers to estimate genotypes on haplotyped data without performing whole-genome sequencing. However, genotype imputation is computationally intensive, so meeting the high performance requirements of genome-wide association studies (GWAS) remains a challenge.
Objective: In this paper, we propose a high performance computing solution for genotype imputation on supercomputers to enhance its execution performance.
Method: We design and implement a multi-level parallelization comprising job-level, process-level and thread-level parallelization, enabled by job scheduling management, the Message Passing Interface (MPI) and OpenMP, respectively. It involves job distribution, chunk partitioning and execution, parallelized iteration for imputation, and data concatenation. This multi-level design lets us exploit the multi-machine, multi-core architecture of supercomputers to improve the performance of genotype imputation.
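The flow described in the Method (chunk partitioning, per-chunk execution, a parallelized inner loop, and ordered concatenation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses only the standard library, simulates MPI ranks with a plain loop, and uses a thread pool in place of OpenMP. The round-robin assignment, the per-sample inner loop, and every function name here (`partition_chunks`, `assign_to_processes`, `impute_sample`, and so on) are assumptions made for illustration, and the job-level scheduling step is left out.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_chunks(start, end, chunk_size):
    """Split the half-open region [start, end) into consecutive chunks."""
    return [(s, min(s + chunk_size, end)) for s in range(start, end, chunk_size)]


def assign_to_processes(chunks, num_processes):
    """Round-robin chunk indices over ranks (an assumed policy, standing in for MPI)."""
    return {rank: list(range(rank, len(chunks), num_processes))
            for rank in range(num_processes)}


def impute_sample(chunk, sample_id):
    """Hypothetical placeholder for the real imputation of one sample in one chunk."""
    s, e = chunk
    return f"sample{sample_id}:[{s},{e})"


def impute_chunk(chunk, num_samples, num_threads):
    """Thread-level step: run the inner loop over samples in parallel.

    A thread pool stands in for an OpenMP parallel loop; pool.map keeps
    the results in sample order.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(lambda sid: impute_sample(chunk, sid),
                             range(num_samples)))


def run_job(start, end, chunk_size, num_processes, num_samples, num_threads):
    """Process-level step: each simulated rank handles its chunks, then results
    are gathered and concatenated in chunk order."""
    chunks = partition_chunks(start, end, chunk_size)
    assignment = assign_to_processes(chunks, num_processes)
    gathered = []
    for rank, indices in assignment.items():  # ranks run one after another here
        for i in indices:
            gathered.append((i, impute_chunk(chunks[i], num_samples, num_threads)))
    gathered.sort(key=lambda pair: pair[0])  # restore chunk order before concatenation
    return [line for _, lines in gathered for line in lines]


if __name__ == "__main__":
    for line in run_job(0, 10, 4, num_processes=2, num_samples=2, num_threads=2):
        print(line)
```

Tagging each result with its chunk index before gathering is what makes the final concatenation independent of which rank finishes first; a real MPI version would need the same ordering guarantee.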
Results: Experimental results show that the proposed method outperforms the Hadoop-based implementation of genotype imputation. We also evaluate the proposed method on supercomputers; the evaluation shows that it significantly shortens the execution time of genotype imputation.
Conclusion: The proposed multi-level parallelization, when deployed as imputation as a service, will help bioinformatics researchers in Singapore conduct genotype imputation and enhance their association studies.
Keywords: Genotype imputation, parallelization, high performance computing, supercomputers, bioinformatics, performance evaluation.
Affiliations: Guangdong Provincial Key Laboratory of Cyber-Physical System, School of Computers, Guangdong University of Technology, Guangzhou; Computing Science Department, Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR); Computing Science Department, Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR); Computing Science Department, Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR); School of Science and Technology, Singapore University of Social Sciences