Xin Wang*, Huan Zhang and Xiaojing Liu. Pages 130-138 (9)
Background: Accurate and exhaustive identification of genomic deletion events is the basis for understanding their roles in phenotype variation. Developing effective algorithms to identify deletions using next generation sequencing (NGS) data remains a challenge.
Objective: We present Defind, a new approach that detects deletions using NGS data from a single sample mapped to the reference genome sequence.
Method: Defind is implemented in Perl and R and runs on Linux. It detects medium- and large-sized deletions by simultaneously inspecting the depth of coverage, GC content, mapping quality, and paired-end information of NGS data. We compared Defind in detail with other deletion detection methods using both simulated and real data.
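To illustrate how depth of coverage and GC content can be combined when looking for deletions, the following is a minimal Python sketch. It is not Defind's implementation (Defind is written in Perl and R, and its algorithm is not described here), and it ignores mapping quality and paired-end information entirely. The function names, the fixed-size windows, the GC binning, and the thresholds `max_ratio` and `min_windows` are all assumptions made for illustration.

```python
from statistics import median


def gc_normalize(depths, gc_fractions, n_bins=10):
    """Divide each window's depth by the median depth of windows in the same GC bin.

    Hypothetical helper: the binning scheme is an illustrative assumption.
    """
    def gc_bin(gc):
        return min(int(gc * n_bins), n_bins - 1)

    # Collect depths per GC bin, then take the median of each bin.
    per_bin = {}
    for depth, gc in zip(depths, gc_fractions):
        per_bin.setdefault(gc_bin(gc), []).append(depth)
    bin_median = {b: median(values) for b, values in per_bin.items()}

    normalized = []
    for depth, gc in zip(depths, gc_fractions):
        m = bin_median[gc_bin(gc)]
        normalized.append(depth / m if m > 0 else 0.0)
    return normalized


def call_low_depth_runs(normalized, window_size, max_ratio=0.25, min_windows=2):
    """Return half-open (start, end) coordinates of runs of low-depth windows.

    Thresholds are illustrative assumptions, not values taken from Defind.
    """
    calls = []
    run_start = None
    # A sentinel value above any threshold closes a run that reaches the end.
    for i, ratio in enumerate(list(normalized) + [float("inf")]):
        if ratio <= max_ratio:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_windows:
                calls.append((run_start * window_size, i * window_size))
            run_start = None
    return calls
```

For example, with 100 bp windows, depths `[10, 11, 9, 0, 1, 0, 10, 12]`, and GC fractions all between 0.40 and 0.45, `call_low_depth_runs(gc_normalize(depths, gc), 100)` returns `[(300, 600)]`. A practical caller would also need to check such candidates against mapping quality and discordant read pairs, which this sketch leaves out.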
Results: In simulation studies, Defind could retrieve more deletions than other methods at low to medium sequencing coverage (e.g., 5 to 10×) with no false positives. Using real data, 94% of deletions commonly detected by at least two other methods were also detected by Defind. In addition, 90% of the deletions detected by Defind using the real data were positively supported by comparative genomic hybridization results, demonstrating the efficiency of Defind.
Conclusion: In the simulation study, Defind performed robustly across different sequencing coverages and read lengths. Our study also provides practical guidance for selecting appropriate methods to detect genomic deletions using NGS data.
Keywords: Defind, genomic deletions, NGS data, phenotype, algorithms, hybridization.
College of Life Science, Nanchang University, Nanchang 330031