Early Prediction of Malignant Mesothelioma: An Approach Towards Non-invasive Method

[ Vol. 16 , Issue. 10 ]

Author(s):

Shakir Shabbir, Muhammad Shahzad Asif, Talha Mahboob Alam and Zeeshan Ramzan*

Pages 1257 - 1277 (21)

Abstract:


Background: Malignant Mesothelioma (MM) is a rare but aggressive tumor that arises in the lungs. Diagnosis of MM commonly relies on costly imaging and laboratory resources, such as X-ray imaging, Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET) scans, biopsies, and blood tests. These diagnostic measures are expensive and unavailable in remote areas, and some of them, such as biopsy and cytology of pleural fluid, are also very painful for the patient.

Objective: In this study, we propose a diagnostic model for the early identification of MM using machine learning techniques. We explored the health records of 324 Turkish patients who showed symptoms related to MM. The patient data include socio-economic, geographical, and clinical features.

Methods: Different feature selection methods have been employed for the selection of significant features. To overcome the data imbalance problem, various data-level resampling techniques have been utilized to obtain efficient results. The Gradient Boosted Decision Tree (GBDT) method has been used to develop the diagnostic model. The performance of the GBDT model is also compared with traditional machine learning algorithms.
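
As an illustration of the feature selection step, the following is a minimal sketch of OneR-style feature ranking, one of the methods named in the Results. It is not the authors' implementation: the function names, the feature names (`asbestos_exposure`, `sex`), and the toy records are hypothetical, and the sketch assumes categorical features only (numeric features would first need to be discretized).

```python
from collections import Counter, defaultdict


def oner_accuracy(values, labels):
    """Training accuracy of a one-rule classifier built on a single feature.

    Each distinct feature value predicts the majority class seen with it.
    """
    by_value = defaultdict(Counter)
    for value, label in zip(values, labels):
        by_value[value][label] += 1
    # Count the samples that the majority-class rule classifies correctly.
    correct = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return correct / len(labels)


def rank_features(rows, labels, names):
    """Rank features by the accuracy of their one-rule classifier, best first."""
    scores = {
        name: oner_accuracy([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records, made up for illustration (not study data).
names = ["asbestos_exposure", "sex"]
rows = [("yes", "m"), ("yes", "f"), ("no", "m"),
        ("no", "f"), ("yes", "m"), ("no", "f")]
labels = [1, 1, 0, 0, 1, 0]

ranking = rank_features(rows, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Features whose single rule already separates the classes well rise to the top of the ranking, and a cut-off on the ranked list would then decide which features are kept.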

Results: Our model outperformed the other models on both balanced and imbalanced data. The results clearly show that, based on accuracy and Receiver Operating Characteristic (ROC) value, undersampling techniques were outperformed by the imbalanced data without any resampling. Conversely, oversampling techniques outperformed both undersampling and the imbalanced data based on accuracy and ROC. All classifiers employed in this study achieved good results with three of the feature selection methods (OneR, information gain, and Relief-F), but the results with the other two methods (gain ratio and correlation) were not entirely promising. Finally, the combination of Synthetic Minority Oversampling Technique (SMOTE) and OneR with GBDT gave the most favorable results based on accuracy, F-measure, and ROC.
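
To make the oversampling idea concrete, here is a minimal sketch of the core SMOTE step: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. This is a simplified illustration, not the configuration used in the study; the function name `smote`, its parameters, and the toy data points are assumptions.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy two-feature minority class; values are made up for illustration.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_new=6, k=2)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point is an interpolation of two real minority samples, it always lies within the region spanned by the minority class, which is what distinguishes SMOTE from simple duplication of existing records.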

Conclusion: The diagnosis model has also been deployed to assist doctors, patients, medical practitioners, and other healthcare professionals for early diagnosis and better treatment of MM.

Keywords:

Data reduction, mesothelioma, gradient boosted decision tree, machine learning, malignant, biopsy.

Affiliation:

Department of Computer Science & Engineering, Faculty of Electrical Engineering, The University of Engineering and Technology, Lahore, Department of Computer Science & Engineering, Faculty of Electrical Engineering, The University of Engineering and Technology, Lahore, Department of Computer Science and Information Technology, Virtual University of Pakistan, Lahore, Department of Computer Science & Engineering, Faculty of Electrical Engineering, The University of Engineering and Technology, Lahore
