BiLSTM-CRF: Improving Cybersecurity Text Analysis through Bidirectional LSTM-CRF (BiLSTM-CRF) Based Named Entity Recognition

Authors

  • Muhammad Shoaib Rasheed, Department of Computer Science, National College of Business Administration & Economics Lahore, Multan Sub Campus, 60000, Pakistan.
  • Malik Asim Rajwana, Department of Computer Science, National College of Business Administration & Economics Lahore, Multan Sub Campus, 60000, Pakistan.
  • M. Ismail Kashif, Department of Computer Science, National College of Business Administration & Economics Lahore, Multan Sub Campus, 60000, Pakistan.
  • Hassaan Malik, Department of Computer Science, National College of Business Administration & Economics Lahore, Multan Sub Campus, 60000, Pakistan.
  • Wajahat Anwar Bukhari, Department of Computer Science, National College of Business Administration & Economics Lahore, Multan Sub Campus, 60000, Pakistan.
  • Sadia Tariq, Department of Computer Science, National College of Business Administration & Economics Lahore, Multan Sub Campus, 60000, Pakistan.
  • Nazir Ahmad, Department of Computer Science, National College of Business Administration & Economics Lahore, Multan Sub Campus, 60000, Pakistan.
  • Ehsan ul Haq, Department of Computer Science, National College of Business Administration & Economics Lahore, Multan Sub Campus, 60000, Pakistan.

DOI:

https://doi.org/10.52700/scir.v7i2.214

Keywords:

Cybersecurity text; Named Entity Recognition; Bidirectional LSTM; Conditional Random Field.

Abstract

Cybersecurity documents such as vulnerability reports, bulletins and blogs provide key information about system weaknesses, threats and attacks. Organizing this unstructured text into structured data is important for faster threat detection and better decision-making. Most existing Named Entity Recognition (NER) systems are trained on general text and do not perform well on cybersecurity tasks. This study presents a Bidirectional LSTM-CRF model built for recognizing entities in cybersecurity documents. The model combines a word embedding layer, a Bidirectional Long Short-Term Memory (BiLSTM) network and a Conditional Random Field (CRF) layer to capture both the contextual meaning of words and the dependencies between labels. It is tested on an open-source benchmark cybersecurity dataset that contains blogs, security bulletins and Common Vulnerabilities and Exposures (CVE) records. The results show that the proposed model achieves higher accuracy than the baseline systems in identifying and classifying cybersecurity entities.
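
To illustrate why a CRF layer on top of per-token scores helps, the following minimal sketch runs Viterbi decoding, the standard inference step for a linear-chain CRF. It is not the authors' implementation: the tag set, the example tokens, and every score are hypothetical, and the hand-written emission scores merely stand in for what a BiLSTM would output.

```python
from typing import Dict, List, Tuple

# Hypothetical BIO tag set; the label set used in the paper is not stated here.
TAGS = ["O", "B-VULN", "I-VULN"]


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag]   : per-token score (stand-in for the BiLSTM output)
    transitions[(a, b)] : score for moving from tag a to tag b (the CRF part);
                          missing pairs default to 0.0
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in TAGS}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in TAGS:
            # Pick the previous tag that maximizes path score + transition score.
            prev = max(
                TAGS,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    tag = max(TAGS, key=lambda k: best[-1][k])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]


# Made-up scores for the tokens "buffer", "overflow", "in".
emissions = [
    {"O": 1.0, "B-VULN": 0.8, "I-VULN": 0.1},
    {"O": 0.2, "B-VULN": 0.1, "I-VULN": 2.0},
    {"O": 1.5, "B-VULN": 0.0, "I-VULN": 0.3},
]
# Made-up transition scores: penalize I-VULN directly after O, reward B -> I.
transitions = {("O", "I-VULN"): -10.0, ("B-VULN", "I-VULN"): 1.0}

# Per-token argmax ignores label dependencies and yields an invalid sequence.
greedy = [max(TAGS, key=lambda k: e[k]) for e in emissions]
decoded = viterbi_decode(emissions, transitions)
print(greedy)   # ['O', 'I-VULN', 'O']
print(decoded)  # ['B-VULN', 'I-VULN', 'O']
```

With these illustrative numbers, choosing the best tag per token independently places I-VULN right after O, which is not a valid BIO sequence, while decoding with transition scores recovers a well-formed entity span. In a trained model both the emission and the transition scores would be learned rather than written by hand.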

Published

2025-12-03

How to Cite

Rasheed, M. S., Rajwana, M. A., Kashif, M. I., Malik, H., Bukhari, W. A., Tariq, S., Ahmad, N., & Haq, E. ul. (2025). BiLSTM-CRF: Improving Cybersecurity Text Analysis through Bidirectional LSTM-CRF (BiLSTM-CRF) Based Named Entity Recognition. STATISTICS, COMPUTING AND INTERDISCIPLINARY RESEARCH, 7(2), 359-376. https://doi.org/10.52700/scir.v7i2.214