Yang Lu, Xiaolei Ma, Yinan Lu* and Zhili Pei. Pages 359-370 (12)
Background: Biomolecular-level event extraction is one of the most important branches of information extraction. With the rapid growth of biomedical literature, it is difficult for researchers to manually obtain information of interest, e.g. previously unknown information about diseases that threaten human health or about particular biological processes. Researchers are therefore interested in acquiring information about biomolecular-level events automatically. However, annotated biomolecular-level event corpora are limited and highly imbalanced, which degrades the performance of classification algorithms and can even lead to over-fitting.
Method: This paper introduces a new approach to biomolecular-level event extraction that combines the Pairwise model with a convolutional neural network. The method identifies positive instances in unlabeled data more accurately and uses them to enlarge the labeled data. First, unlabeled samples are categorized using the Pairwise model. Then, the shortest dependency path, augmented with additional information, is generated. Next, two input forms with a new representation are presented for the convolutional neural network model: the dependency word sequence and the dependency relation sequence. Finally, under the sample selection strategy, newly labeled samples drawn from the unlabeled domain corpus incrementally enlarge the training data to improve the performance of the classifier.
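As an illustration of the shortest-dependency-path step, the following minimal sketch shows one way a dependency word sequence and a dependency relation sequence could be derived from a parsed sentence. It is not the authors' implementation: the function name `shortest_dependency_path`, the `(head, relation, dependent)` edge format, and the toy sentence are assumptions made for this example, and the "additional information" mentioned above is not modeled.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return (word_sequence, relation_sequence) along the shortest path.

    edges: list of (head, relation, dependent) triples (assumed format).
    The dependency tree is treated as an undirected graph.
    Returns None if source and target are not connected.
    """
    # Build an undirected adjacency map that keeps the relation label per edge.
    graph = {}
    for head, rel, dep in edges:
        graph.setdefault(head, []).append((dep, rel))
        graph.setdefault(dep, []).append((head, rel))

    # Breadth-first search from source, recording how each node was reached.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor, rel in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = (node, rel)
                queue.append(neighbor)

    if target not in parent:
        return None

    # Walk back from target to source, then reverse both sequences.
    words, relations = [target], []
    node = target
    while parent[node] is not None:
        node, rel = parent[node]
        words.append(node)
        relations.append(rel)
    words.reverse()
    relations.reverse()
    return words, relations


# Hypothetical parse of the toy sentence "ProteinA binds to ProteinB".
toy_edges = [
    ("binds", "nsubj", "ProteinA"),
    ("binds", "nmod", "ProteinB"),
    ("ProteinB", "case", "to"),
]
word_seq, rel_seq = shortest_dependency_path(toy_edges, "ProteinA", "ProteinB")
print(word_seq)  # ['ProteinA', 'binds', 'ProteinB']
print(rel_seq)   # ['nsubj', 'nmod']
```

In this sketch the two returned lists play the role of the two input forms: the words on the path and the relation labels on the edges between them. In practice each sequence would still need to be mapped to embeddings before being passed to a convolutional network.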
Result & Conclusion: The proposed method achieved better performance than other high-performing systems. This is attributed to the new representation of the generated short sentences and to the proposed sample selection strategy, which greatly improved classification accuracy. Extensive experimental results indicate that the new method can effectively incorporate unlabeled data to improve the performance of the classifier for biomolecular-level event extraction.
Keywords: Biomolecular-level event, protein complex event, short sentence generation, short sentence representation, sample selection strategy, word embedding.
College of the Computer Science and Technology, Jilin University, Changchun, Jilin, College of the Computer Science and Technology, Jilin University, Changchun, Jilin, College of the Computer Science and Technology, Jilin University, Changchun, Jilin, College of the Computer Science and Technology, Inner Mongolia University for Nationalities, Tongliao, Inner Mongolia