CHINESE-LANGUAGE ENTITY RELATION EXTRACTION FOR WHEAT DISEASES AND PESTS BASED ON REMOTE SUPERVISION AND BIDIRECTIONAL TRANSFORMERS
Authors: Yinchao Che, Shuping Xiong, Shiyu Xi, Demeng Zhang, Shufeng Xiong, Xinming Ma, Lei Xi
Journal: Journal of Animal and Plant Sciences (JAPS)
ISSN: 1018-7081 (Print), 2309-8694 (Online)
Volume: 36
Issue: 1
Pages: 213-226
Year: 2026
DOI: https://doi.org/10.36899/JAPS.2026.1.0018
URL: https://doi.org/10.36899/JAPS.2026.1.0018
Publisher: Pakistan Agricultural Scientists Forum
Abstract:

Extracting domain-specific knowledge for wheat disease and pest management is difficult. The difficulty is compounded by the lack of publicly available Chinese entity relation extraction datasets and by the high cost of manual annotation in such a specialized domain. To address this, we used remote supervision to match relevant triplets from the CN-DBpedia and Ownthink knowledge bases against unstructured texts, then corrected the matches manually to construct WheatCRE, a Chinese entity relation extraction dataset for wheat diseases and pests. The WheatCRE dataset comprises 1,681 labeled samples covering six relationship categories: Distribution Range, Alias, Damage Parts, Damage Crops, Genus Orders, and Genus Families. We then proposed a novel model called BE-CRE (BERT-Entity Chinese Relation Extraction), which combines Bidirectional Encoder Representations from Transformers (BERT) with entity representations. The model uses BERT to obtain dynamic character representations and target entity representations, and fuses these features by concatenation. By making full use of the implicit meanings of entities, the model can obtain more accurate features. Comprehensive experiments were conducted on the WheatCRE dataset, comparing different optimization algorithms, training parameters, relation extraction models, and pre-training models. Our proposed BE-CRE model outperformed the baseline models, including BiLSTM-Attention, BiGRU-Attention, and BERT-Softmax, with Precision-Macro (P-M), Recall-Macro (R-M), and F1-Macro (F1-M) values of 89.37%, 89.4%, and 89.29%, respectively. Furthermore, we conducted comparative experiments on a public Chinese entity relation extraction dataset (CharacterCRE) to evaluate the generalization ability of our model, where it achieved the best F1-Macro value of 78.31%.
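
The macro-averaged scores reported above weight every relation category equally, regardless of how many samples it has. The following is a minimal sketch of that computation, assuming the common definition in which precision, recall, and F1 are computed per class and then averaged. Under that definition F1-M is the mean of the per-class F1 values rather than the harmonic mean of P-M and R-M, which would be consistent with the reported F1-M lying slightly below both P-M and R-M. The function name `macro_prf` and the toy labels are hypothetical and are not taken from the paper's evaluation code.

```python
from collections import Counter


def macro_prf(gold, pred):
    """Return (macro precision, macro recall, macro F1) over all labels seen."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy labels using three of the six WheatCRE relation names.
gold = ["Alias", "Alias", "Damage Crops", "Damage Crops",
        "Damage Parts", "Damage Parts"]
pred = ["Alias", "Damage Crops", "Damage Crops", "Damage Crops",
        "Damage Parts", "Alias"]

p_m, r_m, f1_m = macro_prf(gold, pred)
print(f"P-M={p_m:.4f} R-M={r_m:.4f} F1-M={f1_m:.4f}")
```

In this toy case F1-M also falls below both P-M and R-M, which cannot happen when F1 is taken as the harmonic mean of the two macro values.
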
These results demonstrate the effectiveness and applicability of our proposed model in wheat disease and pest relation extraction. BE-CRE distinguishes itself from existing models by integrating BERT with entity representations and using remote supervision to construct a specialized dataset, enabling more accurate and context-aware entity relation extraction for agricultural applications.
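
The remote-supervision step described in the abstract can be illustrated with a minimal sketch: a sentence that mentions both the head and the tail entity of a knowledge-base triplet is tentatively labeled with that triplet's relation, and the resulting candidates are then handed to manual correction. This is an assumption about the general technique, not the authors' pipeline. The triplets, the sentences (English stand-ins for Chinese text), and the names `Triplet`, `Candidate`, and `distant_label` are all hypothetical and are not drawn from CN-DBpedia, Ownthink, or WheatCRE.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    head: str
    relation: str
    tail: str


class Candidate(NamedTuple):
    sentence: str
    head: str
    tail: str
    relation: str


def distant_label(sentences, triplets):
    """Label a sentence with a triplet's relation if it contains both entities."""
    candidates = []
    for sentence in sentences:
        for t in triplets:
            # Plain substring matching; real text would need entity
            # normalization, and every match is only a noisy candidate.
            if t.head in sentence and t.tail in sentence:
                candidates.append(
                    Candidate(sentence, t.head, t.tail, t.relation))
    return candidates


# Hypothetical knowledge-base triplets.
triplets = [
    Triplet("wheat stripe rust", "Alias", "yellow rust"),
    Triplet("wheat stripe rust", "Damage Parts", "leaves"),
]

# Hypothetical unstructured sentences.
sentences = [
    "In wet springs, wheat stripe rust (also called yellow rust) mainly damages leaves.",
    "Samples of wheat stripe rust were collected before the leaves emerged.",
    "Aphids were reported in several provinces.",
]

candidates = distant_label(sentences, triplets)
for c in candidates:
    print(c.relation, "|", c.head, "->", c.tail, "|", c.sentence)
```

The second sentence is matched to Damage Parts even though it does not express that relation. This kind of false positive is why the matched samples still require manual correction.
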

Keywords: Remote supervision, Chinese entity relation extraction, Wheat diseases and pests, BERT