Dan Tulpan*, Roberto Montemanni and Derek H. Smith. Pages 296-302 (7)
Background: The hybridization stability of single- and double-stranded DNA sequences has been studied extensively, and it has a significant impact on bio-computing, bio-sensing and bio-quantification technologies such as microarrays, real-time PCR and DNA sequencing. In many bioinformatics applications, DNA duplex hybridization is traditionally estimated using GC-content and melting temperature calculations based on the base composition of the sequence.
Objective: In this study we explore the equivalence of the two approaches when estimating DNA sequence hybridization, and we show that GC-content is a far from perfect predictor of DNA strand hybridization strength when compared with experimentally determined melting temperatures.
Method: To test the assumption that the GC-content of a DNA strand is a good indicator of its melting temperature, we formulate a research hypothesis and apply the Pearson product-moment correlation to measure the strength of the linear association between GC-content and melting temperature.
Results: We built a manually curated set of 373 experimental data points collected from 21 publications. Each point represents a DNA strand between 4 and 35 nucleotides long, together with its experimentally determined melting temperature measured under specific sequence and salt concentrations. For each data point we calculated the corresponding GC-content, and we separated the set into 12 subsets to minimize the variability of experimental conditions.
Conclusion: Based on the calculated Pearson product-moment correlation coefficients, we conclude that GC-content only seldom correlates well with experimentally determined melting temperatures, and thus it is not a strictly necessary constraint when used to control the uniformity of DNA strands.
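The calculation described in the Method can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `gc_content` and `pearson_r` are hypothetical, and the sequences and melting temperatures in the usage example are made-up placeholders, not values from the curated data set.

```python
from math import sqrt


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def pearson_r(xs, ys) -> float:
    """Return the Pearson product-moment correlation coefficient of xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder (sequence, melting temperature in deg C) pairs, invented
    # purely for illustration; they are NOT taken from the paper's data set.
    placeholder_points = [
        ("ATATATAT", 20.0),
        ("ATGCATGC", 33.0),
        ("GGATCCAA", 31.0),
        ("GCGCGCGC", 45.0),
    ]
    gc_values = [gc_content(seq) for seq, _ in placeholder_points]
    tm_values = [tm for _, tm in placeholder_points]
    print(f"Pearson r = {pearson_r(gc_values, tm_values):.3f}")
```

In the study's setting, a coefficient of this kind would be computed separately for each of the 12 subsets, so that strands measured under different experimental conditions are not pooled into a single correlation.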
Keywords: DNA sequence, GC-content, hybridization, melting temperature, oligonucleotides, Pearson correlation.
Affiliation: Information and Communication Technologies, National Research Council Canada, Moncton, N.B. E1A 7R1; Dalle Molle Institute for Artificial Intelligence (IDSIA), University of Applied Sciences of Southern Switzerland (SUPSI), Galleria 2, CH-6928 Manno; Division of Mathematics and Statistics, University of South Wales, Pontypridd, CF37 1DL