Pratik Joshi*, Masilamani Vedhanayagam and Raj Ramesh. Pages 422-432 (11)
Background: Preventing adverse drug reactions (ADRs) is imperative for patient safety. Under-reporting of ADRs is prevalent worldwide, which makes it difficult to develop unbiased prediction models. As a result, most models are skewed toward the negative samples, leading to high accuracy but poor performance on other metrics such as precision, recall, F1 score, and AUROC.
Objective: In this work, we propose a novel way of predicting ADRs by balancing the dataset.
Methods: The whole dataset is partitioned into smaller balanced datasets. An SVM with an optimal kernel is trained on each balanced dataset, and the prediction of a given ADR for a given drug is obtained by voting across the resulting ensemble of SVMs.
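The partition-and-vote scheme described in Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`balanced_partitions`, `train_ensemble`, `vote`), the strided splitting rule, and the tie-breaking rule are all assumptions made for illustration. To keep the sketch dependency-free, the base learner is a nearest-centroid stand-in; in the setting described here, each base learner would instead be an SVM with a tuned kernel.

```python
import random


def balanced_partitions(X, y, seed=0):
    """Pair all positives (label 1) with disjoint chunks of the negatives (label 0).

    Assumption: positives are the minority class, so each chunk of negatives
    is roughly the size of the positive set.
    """
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    rng.shuffle(neg)
    n_parts = max(1, len(neg) // len(pos))
    parts = []
    for k in range(n_parts):
        chunk = neg[k::n_parts]  # strided slice: each negative is used exactly once
        idx = pos + chunk
        parts.append(([X[i] for i in idx], [y[i] for i in idx]))
    return parts


def fit_nearest_centroid(X, y):
    """Stand-in base learner (NOT an SVM): predict the class of the closer centroid."""
    def centroid(label):
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    c0, c1 = centroid(0), centroid(1)

    def predict(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        return 1 if d1 < d0 else 0

    return predict


def train_ensemble(X, y, fit=fit_nearest_centroid, seed=0):
    """Train one base learner per balanced partition."""
    return [fit(Xp, yp) for Xp, yp in balanced_partitions(X, y, seed)]


def vote(models, x):
    """Majority vote; a tie is resolved as negative (an assumption of this sketch)."""
    preds = [m(x) for m in models]
    return 1 if 2 * sum(preds) > len(preds) else 0
```

Any callable that takes `(X, y)` and returns a per-sample predictor can be passed as `fit`, so an SVM-based learner could be swapped in without changing the partitioning or voting logic.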
Results: The results are encouraging and comparable with competing methods in the literature, with an average sensitivity of 0.97 across all ADRs. The model is interpreted and explained using SHAP values, visualized with various plots.
Conclusion: A novel way of predicting ADRs by balancing the dataset has been proposed, thereby reducing the effect of unbalanced datasets.
Keywords: Adverse drug reaction, machine learning, SVM, ensembling, SIDER, randomized search.
Department of Computer Science and Engineering, IIITDM Kancheepuram, Chennai; Department of Computer Science and Engineering, IIITDM Kancheepuram, Chennai; Data Foundry, Bangalore