A robust feature engineering approach for Arabic extremist content detection in social media

Authors

  • Ahmed Salman Ibraheem, Department of Cyber Security, Imam Alkadhum College (IKC), Iraq

DOI:

https://doi.org/10.37868/dss.v7.id316

Abstract

In this paper, we propose an end-to-end machine learning system for analyzing sentiment polarity in extremist (terrorism-related) Arabic content, using newly designed features that focus on the linguistic discourse, properties, and changes of such content. We constructed three corpora (V1, V2, and V3) from Arabic tweets, which were pre-processed using several linguistic techniques: a normal stemmer, root patterns, and light stemming. We employed several machine learning algorithms, including support vector machines (SVMs), Naive Bayes (NB), and k-nearest neighbors (KNN), with bag-of-words (BOW) and n-gram models for feature extraction. A large-scale comparative analysis on a real-dataset benchmark identified a linear SVM with a unigram model and term frequency-inverse document frequency (TF-IDF) weighting as the preferred configuration. With this setup, our approach achieved better accuracy for extremist sentiment detection and higher recall on V1 (81.097%) and V2 (81.707%). These results were superior to those of the other SVM kernels and of the KNN algorithm, which was nevertheless very competitive. Our findings also outperformed the established approach of Kanan & Fox for classifying extremist Arabic texts: our BEA achieved, on average, an accuracy higher than the 78.00% they obtained using P-Stemmer and SVM. The precision-recall and ROC AUC values for the SVM settings further supported this performance, and the high scores reflected its ability to handle complex features of Arabic such as syllabic lengthening and diacritics. This study demonstrates the potential of the approach to support extremism detection in Arabic textual data, and may offer a clearer perspective for those concerned with security, education, and policy making.
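
As a rough illustration of the unigram TF-IDF weighting named in the abstract, the sketch below computes per-document term weights using only the Python standard library. It is not the authors' implementation: the function name `tfidf_unigrams`, the whitespace tokenizer, the placeholder documents, and the smoothed IDF formula are all assumptions made for this example. In a full pipeline, vectors like these would be passed to a linear SVM, which is not shown here.

```python
import math
from collections import Counter


def tfidf_unigrams(docs):
    """Return one {token: weight} dict per document (unigram TF-IDF).

    Hypothetical helper for illustration only; the paper does not
    specify its tokenizer or its exact IDF variant.
    """
    # Assumption: tokens are whitespace-separated (already pre-processed).
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        vec = {}
        for tok, count in counts.items():
            tf = count / total  # relative term frequency in this document
            # Assumption: smoothed IDF, ln((1 + N) / (1 + df)) + 1.
            idf = math.log((1 + n_docs) / (1 + df[tok])) + 1
            vec[tok] = tf * idf
        vectors.append(vec)
    return vectors


# Placeholder tokens standing in for stemmed Arabic words.
docs = ["a b a", "b c", "c c d"]
vectors = tfidf_unigrams(docs)
```

A token that is frequent in one document but rare across the corpus (here `a` in the first document) receives a larger weight than a token shared by several documents (here `b`), which is the property that makes TF-IDF useful as input to a classifier.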

Published

2026-03-12

How to Cite

[1]
A. S. Ibraheem, “A robust feature engineering approach for Arabic extremist content detection in social media”, Defense and Security Studies, vol. 7, no. 1, pp. 106–122, Mar. 2026.

Section

Articles