In a paper published in the journal Scientific Reports, researchers explored the intersection of artificial intelligence and aerospace development. Focusing on named entity recognition (NER) as a pivotal tool in natural language processing (NLP), the study introduced a novel model, the multi-feature fusion transformer (MFT). Utilizing a dataset of 30,000 Chinese sentences in the aerospace domain, MFT integrated features like words and radicals to enhance semantic understanding.
The model, augmented with a double feed-forward network (FFN), demonstrated strong entity recognition performance on the aerospace dataset. This research underscored the potential of advanced NLP techniques for extracting critical information from large aerospace datasets, contributing to the evolving landscape of AI in aerospace applications.
AI in Aerospace
In aerospace, advancements in human-crewed spaceflight, exemplified by SpaceX's Dragon spacecraft, are increasingly intertwined with artificial intelligence (AI). NER is vital for extracting meaningful information from extensive aerospace text. In Chinese entity recognition, approaches such as Lattice LSTM and the Flat-Lattice Transformer (FLAT) demonstrate the importance of lexicons, contextual semantics, and character radicals. The evolution from long short-term memory (LSTM) to transformer architectures reflects ongoing efforts to improve the accuracy of knowledge acquisition from aerospace texts.
Aerospace NER with MFT
Human-crewed spaceflight technologies, such as SpaceX's Dragon spacecraft, mark significant progress in aerospace. This progress is increasingly intertwined with AI, which has the potential to simplify complex operational processes and enhance autonomy. NER is a critical tool for extracting meaningful information from extensive aerospace text.
In addressing the challenges of Chinese entity recognition, the paper introduces the MFT, highlighting its ability to fuse word and radical information for more effective named entity recognition. The MFT model incorporates a unique network structure, utilizing a one-dimensional convolutional neural network (1D-CNN) to extract radical embeddings and a double feedforward multi-head self-attention (DFMS) encoder module to encode information for improved performance in NER.
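The radical branch described above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the paper's implementation: the three-entry `RADICALS` table, the random embeddings and kernels, the kernel width of 2, the ReLU, and the max-pooling step are all assumptions made for the example.

```python
import random

# Toy radical table; hypothetical and far smaller than a real radical lexicon.
RADICALS = {"江": ["氵", "工"], "河": ["氵", "可"], "箭": ["竹", "前"]}
PAD = "<pad>"

def make_radical_embeddings(dim, seed=0):
    """Assign each radical a random vector; padding maps to zeros."""
    rng = random.Random(seed)
    vocab = sorted({r for parts in RADICALS.values() for r in parts})
    table = {r: [rng.uniform(-1, 1) for _ in range(dim)] for r in vocab}
    table[PAD] = [0.0] * dim
    return table

def make_kernels(out_channels, width, dim, seed=1):
    """Random convolution kernels with shape [out_channels][width][dim]."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
            for _ in range(out_channels)]

def conv1d_max_pool(seq, kernels):
    """Slide each kernel along the radical sequence, apply ReLU, then max-pool."""
    width = len(kernels[0])
    features = []
    for kernel in kernels:
        responses = []
        for start in range(len(seq) - width + 1):
            total = sum(kernel[k][d] * seq[start + k][d]
                        for k in range(width) for d in range(len(seq[0])))
            responses.append(max(total, 0.0))
        features.append(max(responses))
    return features

def radical_feature(char, table, kernels, max_len=3):
    """Look up a character's radicals, pad to max_len, and encode with the 1D-CNN."""
    parts = RADICALS.get(char, [])[:max_len]
    parts = parts + [PAD] * (max_len - len(parts))
    return conv1d_max_pool([table[p] for p in parts], kernels)

table = make_radical_embeddings(dim=4)
kernels = make_kernels(out_channels=5, width=2, dim=4)
vec = radical_feature("江", table, kernels)
print(len(vec))  # one feature per output channel
```

Characters that share a radical, such as 江 and 河, pass the same radical vector through the same kernels, which is one way a model can pick up shared semantic cues.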
The flat lattice module, modeled on FLAT, matches the input sentence against a lexicon and constructs a lattice that encodes the positions of candidate words. The radical feature module breaks Chinese characters down into radicals and encodes them with the 1D-CNN. The paper details how the lattice is flattened and how the radical and lattice embeddings are extracted. The researchers then concatenate the two embeddings to obtain a comprehensive representation for subsequent processing.
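The lexicon-matching step can be illustrated with a short sketch. It assumes the flat-lattice convention of giving every token a head and tail character index, with single characters having head equal to tail; the function name `build_flat_lattice`, the toy lexicon, and the `max_word_len` limit are hypothetical choices for this example.

```python
def build_flat_lattice(sentence, lexicon, max_word_len=4):
    """Return (token, head, tail) triples: all characters, then matched words."""
    # Every character is a token whose head and tail positions coincide.
    tokens = [(ch, i, i) for i, ch in enumerate(sentence)]
    # Every multi-character substring found in the lexicon becomes an extra token.
    for start in range(len(sentence)):
        last_end = min(start + max_word_len, len(sentence))
        for end in range(start + 2, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                tokens.append((word, start, end - 1))
    return tokens

lexicon = {"神舟", "飞船", "发射", "神舟飞船"}  # toy lexicon for illustration
lattice = build_flat_lattice("神舟飞船发射", lexicon)
for token, head, tail in lattice:
    print(token, head, tail)
```

For this six-character sentence, the flattened sequence holds ten tokens: the six characters plus the four matched words. The head and tail indices preserve where each word sat in the original lattice, so a transformer can process the sequence without losing positional structure.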
The DFMS encoder module, a crucial component of the MFT model, combines a double-FFN structure with a multi-head self-attention network. As in Conformer, this structure uses residual connections and normalization between layers to improve the model's ability to capture long-range dependencies. The sequence embedding, which combines lattice and radical features, passes through multiple layers of computation to produce the final encoding.
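One possible reading of this block structure is sketched below in pure Python. Several details are assumptions rather than facts from the paper: the sublayer order (FFN, self-attention, FFN), layer normalization after each residual connection, a single attention head, and the omission of any position encoding. The `step` argument scales the FFN residual: 1.0 corresponds to a full step, while 0.5 corresponds to the half-step weighting used in Conformer.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def residual(x, update, step=1.0):
    """Add a scaled sublayer output back onto its input."""
    return [[v + step * u for v, u in zip(r1, r2)] for r1, r2 in zip(x, update)]

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learned scale or bias)."""
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def ffn(x, w_in, w_out):
    """Position-wise feed-forward network with a ReLU in the middle."""
    hidden = [[max(v, 0.0) for v in row] for row in matmul(x, w_in)]
    return matmul(hidden, w_out)

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention (a simplification)."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    return matmul([softmax(r) for r in scores], v)

def dfms_block(x, p, step=1.0):
    """FFN -> self-attention -> FFN, each followed by a residual and layer norm."""
    x = layer_norm(residual(x, ffn(x, p["ffn1_in"], p["ffn1_out"]), step))
    x = layer_norm(residual(x, self_attention(x, p["wq"], p["wk"], p["wv"])))
    x = layer_norm(residual(x, ffn(x, p["ffn2_in"], p["ffn2_out"]), step))
    return x

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_model, d_hidden = 4, 8
params = {
    "ffn1_in": rand_matrix(d_model, d_hidden, rng),
    "ffn1_out": rand_matrix(d_hidden, d_model, rng),
    "wq": rand_matrix(d_model, d_model, rng),
    "wk": rand_matrix(d_model, d_model, rng),
    "wv": rand_matrix(d_model, d_model, rng),
    "ffn2_in": rand_matrix(d_model, d_hidden, rng),
    "ffn2_out": rand_matrix(d_hidden, d_model, rng),
}
x = rand_matrix(3, d_model, rng)      # three tokens, four features each
y = dfms_block(x, params)             # full-step variant
y_half = dfms_block(x, params, 0.5)   # half-step variant for comparison
```

The output keeps the input's shape, so blocks of this kind can be stacked to form the multiple layers the article mentions.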
The researchers introduce the conditional random fields (CRF) decoder module as an effective tool for labeled sequence prediction. Utilizing CRF, the model decodes label sequences to obtain the final NER results. The paper details the mathematical formulation of CRF and its application in the context of the NER task.
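CRF decoding is commonly performed with the Viterbi algorithm, which finds the label sequence with the highest combined emission and transition score. The sketch below is a generic illustration rather than the paper's code; the label names and all scores are made up for the example.

```python
def viterbi_decode(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions[t][j]   : score of label j at position t
    transitions[i][j] : score of moving from label i to label j
    """
    n = len(labels)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n):
            # Best previous label for arriving at label j.
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(range(n), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [labels[i] for i in path]

labels = ["O", "B-ENT", "I-ENT"]   # hypothetical BIO labels
emissions = [
    [0.1, 2.0, 0.5],
    [0.3, 0.2, 1.5],
    [2.0, 0.1, 0.4],
]
transitions = [
    [0.5, 0.2, -5.0],   # from O: moving straight to I-ENT is heavily penalized
    [0.0, -1.0, 1.0],   # from B-ENT
    [0.3, -1.0, 0.5],   # from I-ENT
]
decoded = viterbi_decode(emissions, transitions, labels)
print(decoded)  # ['B-ENT', 'I-ENT', 'O']
```

The transition scores let the decoder rule out invalid sequences, such as an inside tag that follows an outside tag, which per-token classification alone cannot guarantee.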
Comprehensive Aerospace NER Model Analysis
The paper conducts comprehensive experiments to evaluate the performance of the MFT model, comparing it against mainstream NER models on both the constructed aerospace NER dataset and publicly available datasets like Weibo and Resume. Evaluation metrics include precision, recall, and F1 score. The effectiveness of the MFT model's structure is systematically studied to validate its robustness and applicability beyond the specific aerospace dataset.
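For NER, these metrics are usually computed over whole entities rather than individual tokens: a predicted entity counts as correct only if its span and type both match the gold annotation. The sketch below assumes a BIO tagging scheme and uses placeholder entity types; the paper's exact scheme and scoring script are not described here.

```python
def extract_entities(tags):
    """Collect (start, end, type) spans from a BIO tag sequence."""
    entities, start, etype = set(), None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == etype
        if start is not None and not continues:
            entities.add((start, i - 1, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities

def precision_recall_f1(gold_tags, pred_tags):
    """Entity-level precision, recall, and F1 under exact span and type match."""
    gold, pred = extract_entities(gold_tags), extract_entities(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

gold = ["B-X", "I-X", "O", "B-Y"]   # two gold entities
pred = ["B-X", "I-X", "O", "O"]     # one predicted entity, which is correct
p, r, f1 = precision_recall_f1(gold, pred)
print(p, r, round(f1, 4))  # 1.0 0.5 0.6667
```

Here the model finds one of two gold entities and makes no false predictions, so precision is perfect while recall is halved. F1 balances the two.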
The aerospace NER dataset was built from text collected from Wikipedia and the China Aerospace News website. The dataset is split into training, development, and test sets and is annotated with seven predefined entity types. Labeling involved six annotators, with verification by a manager, and took approximately one month. Mainstream Chinese NER datasets, such as Weibo and Resume, are also included for comparative analysis.
Model Experiment and Enhancements
In the experiments, the model used the same word lexicon and pre-trained character and word embeddings as in Lattice LSTM, along with the radical lexicon. Code for all comparison models came from their original authors, and the model was trained on an Ubuntu system with an NVIDIA GeForce RTX 3060 GPU. Hyperparameters were set separately for each dataset, with MFT consisting of 9,765,018 trainable parameters on the aerospace dataset and 9,319,506 trainable parameters on the Resume dataset.
MFT showed a significant performance improvement on the aerospace dataset, with a 0.97% increase in F1 score over the baseline model, FLAT. The lexicon rethinking CNN (LR-CNN) and the lexicon-based graph neural network (LGN) performed worse on the aerospace dataset, and Lattice LSTM achieved an F1 score 9.88% lower than MFT. Pairing MFT with the pre-trained model bidirectional encoder representations from transformers (BERT) yielded a substantial overall performance improvement. MFT + BERT did not match FLAT + BERT in recall but achieved better precision and F1.
MFT also performed well on the public datasets, achieving an F1 score of 64.38% on the Weibo dataset. While LR-CNN achieved higher precision, its recall was lower than MFT's, which points to MFT's more balanced improvement, particularly when BERT is used for pre-training. Across the datasets, experiments on fusion methods showed that concatenation worked best, with MFT consistently outperforming FLAT, even on the aerospace dataset, where precision decreased. Additionally, experiments confirmed that the double full-step FFN in MFT is more effective for the NER task than the double half-step FFN. The ablation results affirmed the practical advantages of MFT's two enhancements: incorporating radical information and employing a double-FFN structure.
To summarize, this study presents an aerospace NER method based on MFT. The corpus was gathered from Wikipedia and China Aerospace News using crawlers and manually labeled to create the aerospace dataset. MFT's strong performance is credited to integrating the radical features of Chinese characters and implementing a double-FFN structure, both of which improve recognition rates. Future work may explore additional Chinese character features in a multimodal approach, which will require care to filter noise and maintain model efficiency.