Enhanced social media phishing detection model using LSTM and BERT

Authors

  • Wenni Syafitri, Department of Informatic Engineering, Universitas Lancang Kuning, Pekanbaru 28266, Indonesia
  • Eddisyah Putra Pane, Department of Information System, Universitas Lancang Kuning, Pekanbaru 28266, Indonesia
  • Edi Purwanto, Department of Information System, Universitas Lancang Kuning, Pekanbaru 28266, Indonesia

DOI:

https://doi.org/10.59190/stc.v6i2.360

Keywords:

BERT, Cybersecurity, Deep Learning, Phishing Detection, Transfer Learning

Abstract

Phishing attacks are a major cyber threat, with more than 30% of incidents occurring via social media platforms, especially short message services. This study evaluates deep learning approaches for automated phishing detection using BERT and hybrid (BERT-LSTM) architectures fine-tuned on 15,950 annotated SMS messages. The BERT-only model achieved the best performance (F1 0.9928, recall 0.9952, AUC 0.999), and adding BiLSTM layers yielded no statistically significant improvement (0.0006). K-fold cross-validation demonstrated robust generalisation (coefficient of variation 0.10%). Dataset saturation analysis indicated that 15,950 SMS messages are sufficient for effective transfer learning. Mild overfitting (6.3x loss ratio) remained within acceptable bounds and did not affect validation metrics. The 1.77% false positive rate and 99.52% recall make the model practical to deploy for production phishing defence. The results demonstrate that transfer learning with BERT achieves production-grade performance and challenge the conventional assumption that added architectural complexity improves detection.
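The quantities reported in the abstract (F1, recall, false positive rate, coefficient of variation across folds, and loss ratio) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function names are hypothetical, and the confusion-matrix counts, fold scores, and loss values are placeholders rather than the study's data. The sketch also assumes that the coefficient of variation uses the sample standard deviation and that the loss ratio is validation loss divided by training loss; the abstract does not state either.

```python
from statistics import mean, stdev


def binary_metrics(tp, fp, tn, fn):
    """Compute precision, recall, F1, and false positive rate
    from the four cells of a binary confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


def coefficient_of_variation(scores):
    """Spread of per-fold scores as a percentage of their mean.
    Assumption: sample standard deviation (n - 1 denominator)."""
    return stdev(scores) / mean(scores) * 100.0


def loss_ratio(validation_loss, training_loss):
    """Assumption: the ratio is validation loss over training loss."""
    return validation_loss / training_loss


if __name__ == "__main__":
    # Placeholder counts, NOT taken from the paper.
    metrics = binary_metrics(tp=90, fp=10, tn=890, fn=10)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")

    # Placeholder per-fold F1 scores, NOT taken from the paper.
    fold_f1 = [0.991, 0.993, 0.992, 0.994, 0.992]
    print(f"CV: {coefficient_of_variation(fold_f1):.2f}%")

    # Placeholder loss values, NOT taken from the paper.
    print(f"loss ratio: {loss_ratio(0.05, 0.01):.1f}x")
```

Under these definitions, a low false positive rate alongside high recall means that few legitimate messages are flagged and few phishing messages are missed, which is the trade-off the abstract highlights for deployment.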

Published

2026-02-28

How to Cite

Syafitri, W., Pane, E. P., & Purwanto, E. (2026). Enhanced social media phishing detection model using LSTM and BERT. Science, Technology, and Communication Journal, 6(2), 161-170. https://doi.org/10.59190/stc.v6i2.360