Abstract: In named entity recognition (NER) datasets, boundary samples of different categories are interleaved along the class boundaries, which degrades the performance of NER models. To address this problem, a method based on local adversarial training and the BiLSTM-CRF model is proposed. The method selects hard examples, which contain many boundary samples, for crafting adversarial samples. It exploits the property that boundary samples are easily perturbed away from their correct category, and generates adversarial samples through a targeted attack guided by the error probability distribution of the confusion matrix. Finally, a dataset that mixes the original data with the adversarial samples is used for adversarial training to enhance the model’s recognition ability. To verify the superiority of this method, global and local adversarial training based on a non-targeted attack, and local adversarial training based on a targeted attack, are designed as comparative experiments. Experimental results show that the proposed method improves the quality of adversarial samples while retaining the advantages of adversarial training. The F1 scores on the JNLPBA, MalwareTextDB, and Drugbank datasets increase by 1.34%, 6.03%, and 3.65%, respectively.
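As an illustration of how a target category might be drawn from the error probability distribution of the confusion matrix, a minimal sketch is given below. The abstract does not specify this procedure, so the row normalization, the uniform fallback, the function names `error_distribution` and `sample_target`, and the 3-class matrix are all assumptions made for illustration only, not the authors' implementation.

```python
import random


def error_distribution(confusion, true_label):
    """Normalize the off-diagonal counts of one confusion-matrix row.

    Assumed interpretation: rows are true categories, columns are predicted
    categories, and the result is the probability of each wrong category
    given that the sample was misclassified.
    """
    row = confusion[true_label]
    errors = [0 if j == true_label else count for j, count in enumerate(row)]
    total = sum(errors)
    if total == 0:
        # No observed errors: fall back to uniform over the other categories
        # (an assumption; requires at least two categories).
        n_other = len(row) - 1
        return [0.0 if j == true_label else 1.0 / n_other
                for j in range(len(row))]
    return [e / total for e in errors]


def sample_target(confusion, true_label, rng):
    """Draw a target category for a targeted attack from the error distribution."""
    probs = error_distribution(confusion, true_label)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical confusion matrix (rows: true category, columns: predicted).
confusion = [
    [90, 8, 2],
    [5, 80, 15],
    [0, 0, 100],
]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
probs = error_distribution(confusion, 0)
target = sample_target(confusion, 0, rng)
```

Under these assumptions, categories that the model confuses more often with the true category are chosen more often as attack targets, which is one plausible reading of an attack guided by the confusion matrix.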