Lailil Muflikhah*, Nashi Widodo, Wayan Firdaus Mahmudy and Solimun. Pages 1-9 (9)
Background: Hepatocellular carcinoma (HCC) is a serious disease and the third leading cause of cancer death worldwide. Hepatitis B virus (HBV) infection can lead to HCC: the virus introduces its genetic material into the host, damages DNA, and interferes with apoptotic and tumor-suppressor activity, which can trigger oncogene formation. However, most cases are discovered only after the cancer has reached stage three or four.
Objective: To detect HCC early using a machine learning approach applied to a data set of HBx DNA sequences from the hepatitis B virus.
Method: A Support Vector Machine (SVM) classifier was developed for carcinoma detection. Because a large data volume and an imbalanced class distribution can reduce accuracy and sensitivity, this paper proposes a hybrid of Hierarchical k-Means clustering and SVM to detect HCC from HBx DNA sequences. In this method, the data are first partitioned with Hierarchical k-Means, and an SVM is then applied within each cluster.
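The cluster-then-classify idea can be illustrated with a minimal sketch. It is not the authors' implementation: plain k-means with farthest-point initialisation stands in for Hierarchical k-Means, a linear SVM trained by batch subgradient descent stands in for whatever SVM variant the paper used, and the 2-D toy points stand in for encoded HBx sequences (the abstract does not say how sequences are turned into feature vectors). All function names and parameters (`kmeans`, `train_linear_svm`, `fit`, `predict`, `lam`, `lr`, `epochs`) are hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))

def kmeans(points, k, iters=20):
    """Plain k-means with farthest-point initialisation (assumed stand-in)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else c
                     for g, c in zip(groups, centroids)]
    return centroids

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss; labels are -1/+1."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1:
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def fit(X, y, k):
    """Cluster the data, then train one local SVM per cluster."""
    centroids = kmeans(X, k)
    models = []
    for idx, c in enumerate(centroids):
        members = [(xi, yi) for xi, yi in zip(X, y) if nearest(xi, centroids) == idx]
        labels = {yi for _, yi in members}
        if len(labels) < 2:
            # Single-class (or empty) cluster: predict a constant label.
            # The +1 fallback for an empty cluster is an arbitrary choice.
            models.append(("const", labels.pop() if labels else 1))
        else:
            # Centre features on the cluster centroid before training.
            Xc = [[a - m for a, m in zip(xi, c)] for xi, _ in members]
            models.append(("svm", train_linear_svm(Xc, [yi for _, yi in members])))
    return centroids, models

def predict(x, centroids, models):
    """Route x to its nearest cluster and apply that cluster's model."""
    i = nearest(x, centroids)
    kind, model = models[i]
    if kind == "const":
        return model
    w, b = model
    score = sum(wj * (a - m) for wj, a, m in zip(w, x, centroids[i])) + b
    return 1 if score >= 0 else -1

# Toy data: two well-separated groups, each with its own decision boundary.
X = [[0, 0], [1, 0], [0.5, 0.2], [0, 2], [1, 2], [0.5, 1.8],
     [9, 10], [9, 11], [9.2, 10.5], [11, 10], [11, 11], [10.8, 10.5]]
y = [-1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1, -1]
centroids, models = fit(X, y, k=2)
train_preds = [predict(x, centroids, models) for x in X]
```

Training one model per cluster means each SVM sees a smaller and more homogeneous subset, which is one plausible reason the hybrid could help with large or imbalanced data; how clusters containing a single class are handled in the paper is not stated in the abstract.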
Results: The experiments showed an accuracy of 97.18%, a sensitivity of 98.9%, and an AUC of 0.918. Compared with the conventional SVM method, these are improvements of 9.52%, 95.3%, and 0.4, respectively.
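For readers unfamiliar with the three reported measures, the sketch below shows how accuracy, sensitivity, and AUC are commonly computed for a binary classifier. The function names and the toy labels and scores are illustrative assumptions; they are not the paper's data or evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def sensitivity(y_true, y_pred, positive=1):
    """True-positive rate: TP / (TP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)

def auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise ranking: the probability that a
    random positive scores higher than a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels, predictions and decision scores for five samples.
y_true = [1, 1, 1, -1, -1]
y_pred = [1, 1, -1, -1, 1]
scores = [0.9, 0.8, 0.3, 0.2, 0.6]

acc = accuracy(y_true, y_pred)
sens = sensitivity(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity matters particularly for imbalanced data, because a classifier can reach high accuracy while missing most positive (HCC) cases.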
Conclusion: HCC can be detected using a clustering-based SVM algorithm. The proposed hybrid of Hierarchical k-Means and SVM improved classification performance for HCC detection.
Keywords: Hepatocellular carcinoma; HBx; DNA sequence; clustering; Hierarchical k-Means; SVM
Biology Department, Faculty of Mathematics and Natural Science, Brawijaya University, Malang; Biology Department, Faculty of Mathematics and Natural Science, Brawijaya University, Malang; Informatics Engineering, Faculty of Computer Science, Brawijaya University, Malang; Statistics Department, Faculty of Mathematics and Natural Science, Brawijaya University, Malang