A Systematic Review of Attention Models in Natural Language Processing
DOI:
https://doi.org/10.52700/scir.v6i1.157
Keywords:
natural language processing, recurrent neural networks, attention models
Abstract
Attention models in neural networks have drawn considerable scholarly interest in recent years, mainly because of their efficiency and versatility. Attention models are important for many Natural Language Processing (NLP) applications, such as question answering, semantic analysis, sentiment analysis, and machine translation, where they perform significantly better than conventional approaches. By concentrating on the most relevant inputs and their context, these mechanisms improve the interpretive capability of neural networks and address issues such as the performance decay that affects Recurrent Neural Networks. A stream of research has also explored the integration of attention mechanisms into other fields, notably computer vision and graph analysis, improving tasks such as object detection, image captioning, and node representation learning. However, a more thorough survey-based investigation is still needed to cover recent advances across domains and model architectures. This article reviews previous work on attention-based models in NLP, outlines open problems, and presents solutions proposed by different methodologies. Future research directions for improving attention models in NLP are also provided, discussing potential issues in model performance, capacity, and interpretability. This review is intended to guide further research toward more universal attention-based solutions in the domain of NLP.
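To make the core weighting idea concrete, the sketch below illustrates scaled dot-product attention in Python; it is not drawn from any specific work surveyed here, and the function name, array shapes, and toy data are assumptions for illustration only. Each input position is scored against a query, the scores are normalized into attention weights, and the output is the resulting weighted combination of values, which is how attention "concentrates" on the most relevant inputs.

import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    # Illustrative sketch (assumed shapes): queries (n_q, d), keys (n_k, d), values (n_k, d_v)
    d = queries.shape[-1]
    # Similarity scores between every query and every key, scaled by sqrt(d)
    scores = queries @ keys.T / np.sqrt(d)
    # Softmax turns scores into attention weights that sum to 1 per query
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted mix of the values, focused on relevant inputs
    return weights @ values, weights

# Toy usage: one query attends over three input positions (random data, for illustration)
rng = np.random.default_rng(0)
q = rng.normal(size=(1, 4))
k = rng.normal(size=(3, 4))
v = rng.normal(size=(3, 4))
context, attn = scaled_dot_product_attention(q, k, v)
print(attn)  # attention weights show which inputs the query focuses on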