Li-Ping Tian, Lizhi Liu and Fang-Xiang Wu. Pages 259-266 (8)
With advances in biotechnology, a huge amount of high-throughput biological data has been produced, and more will continue to be produced. The information contained in such data is very useful for understanding the biological processes from which the data is collected. High-throughput biological data, such as gene expression data, is generally presented as a data matrix. Matrix decomposition methods can often extract useful information from such a matrix. In bioinformatics, principal component analysis (PCA), independent component analysis (ICA), nonnegative matrix factorization (NMF) and network component analysis (NCA) are widely used to help understand and utilize high-throughput data. All four are matrix decomposition methods, but each is subject to different constraints. In this paper, each of these methods is introduced and its applications to high-throughput biological data are discussed. We also compare the methods and discuss their pros and cons.
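As a rough illustration of how these methods share one factorization form but differ in their constraints, the following minimal sketch decomposes a toy genes-by-samples matrix X into A @ S in two ways: PCA (orthonormal columns in A, computed by SVD) and NMF (nonnegative A and S, computed by the standard multiplicative updates). The random data and the function names `pca_factors` and `nmf_factors` are assumptions made for illustration and are not taken from the paper. ICA and NCA are left out of the sketch.

```python
import numpy as np

def pca_factors(X, k):
    """Rank-k PCA of a genes x samples matrix via SVD (illustrative helper)."""
    Xc = X - X.mean(axis=1, keepdims=True)       # center each gene across samples
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    A = U[:, :k]                                 # orthonormal loading vectors
    S = s[:k, None] * Vt[:k]                     # component scores per sample
    return A, S

def nmf_factors(X, k, n_iter=200, seed=0, eps=1e-9):
    """Rank-k NMF by multiplicative updates for the Frobenius objective."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    A = rng.random((m, k)) + eps                 # nonnegative random start
    S = rng.random((k, n)) + eps
    for _ in range(n_iter):
        S *= (A.T @ X) / (A.T @ A @ S + eps)     # update keeps S nonnegative
        A *= (X @ S.T) / (A @ S @ S.T + eps)     # update keeps A nonnegative
    return A, S

# Toy nonnegative "expression" matrix: 50 genes x 8 samples (made-up data).
rng = np.random.default_rng(42)
X = rng.random((50, 8))

A_pca, S_pca = pca_factors(X, k=3)
A_nmf, S_nmf = nmf_factors(X, k=3)
```

In this sketch, `A_pca.T @ A_pca` is the identity matrix (the orthogonality constraint), whereas `A_nmf` and `S_nmf` contain no negative entries (the nonnegativity constraint). Note that the PCA factors approximate the row-centered matrix, while the NMF factors approximate X itself.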
Bioinformatics, independent component analysis, matrix decomposition, network component analysis, nonnegative matrix factorization, principal component analysis, FastICA algorithm, transcription factor activity
Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada.