Xiaoshu Zhu, Hong-Dong Li, Lilu Guo, Fang-Xiang Wu and Jianxin Wang*
Pages 1-9 (9)
Single-cell RNA sequencing (scRNA-seq), a recently developed technology, has attracted considerable attention because it can interrogate gene expression in individual cells, whereas traditional bulk sequencing measures only the mean gene expression of a population of cells. scRNA-seq has been successfully applied to identify new cell subtypes, but the analysis of scRNA-seq data poses new computational challenges. This work gives an overview of the features of different similarity calculation and clustering methods to help users select methods suited to their scRNA-seq data. We also aim to show that feature selection methods are important for improving clustering performance. We first describe similarity measurement methods and then review several recent clustering methods along with their algorithmic details. This analysis raises several open questions, including how to automatically estimate the number of clusters, how to discover novel subpopulations, and how to search for new marker genes using feature selection methods. When the number of cell types is not known in advance, clustering or semi-supervised learning methods are important tools for exploratory analysis of scRNA-seq data.
single-cell RNA-seq, similarity measurement, clustering of cell types, unsupervised learning methods
School of Information Science and Engineering, Central South University, 410083, Changsha, Hunan; School of Information Science and Engineering, Central South University, 410083, Changsha, Hunan; School of Computer Science and Engineering, Yulin Normal University, 537000, Yulin, Guangxi; Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9; School of Information Science and Engineering, Central South University, 410083, Changsha, Hunan