A Hybrid Framework for Detecting Automated Spammers on Twitter: Integrating Machine Learning and Heuristic Approaches


K. Suresh
K. Thapan
K. Vamshi Reddy
K. Polaiah
T. Abhinav Surya

Abstract

Twitter's open platform has become a hotspot for automated spammers, who exploit its large user base to spread malicious and misleading content. This paper proposes a hybrid approach to detecting automated spammers that integrates machine learning models with heuristic rules to produce a robust, adaptive detection framework. The methodology draws on behavioral features, such as account age and retweet frequency, and content-based features, such as hashtag density and sentiment polarity. The hybrid model pairs a Gradient Boosting Classifier with manually defined heuristic rules so that it can keep pace with changing spam tactics. The system was evaluated on a dataset of 50,000 Twitter accounts, evenly split between spam and legitimate users. The hybrid approach outperformed traditional models, achieving a Precision of 91.2%, Recall of 88.9%, F1-Score of 90.0%, and Accuracy of 91.5%, whereas standalone models such as Logistic Regression and Support Vector Machines scored significantly lower. These findings indicate that the hybrid approach classifies spam accounts more accurately while keeping false positives low. The framework's scalability for real-time applications and its generalization across platforms remain areas for future work. Integrating graph-based features and exploring unsupervised techniques are promising directions for detecting previously unseen spam patterns. Overall, this research provides a robust solution for mitigating the growing threat of automated spammers on social media platforms.
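The abstract does not say how the classifier output and the heuristic rules are combined, so the following is only a minimal sketch under stated assumptions. The `Account` fields, the rule thresholds, and the OR-style combination are hypothetical illustrations, not the authors' method; `model_prob` stands in for the spam probability produced by any trained classifier, such as a gradient boosting model. The last function checks that the reported F1-Score is consistent with the reported Precision and Recall.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical feature record; field names are illustrative, not from the paper."""
    account_age_days: float
    retweet_ratio: float      # retweets divided by total tweets
    hashtag_density: float    # average hashtags per tweet


def heuristic_flag(acct: Account) -> bool:
    """Illustrative hand-written rule; both thresholds are assumptions."""
    return acct.account_age_days < 7 and acct.hashtag_density > 5.0


def hybrid_is_spam(model_prob: float, acct: Account, threshold: float = 0.5) -> bool:
    """Flag an account if the classifier score or the heuristic rule fires.

    An OR combination is one assumed design; a real system might instead
    weight the two signals or feed rule outputs to the model as features.
    """
    return model_prob >= threshold or heuristic_flag(acct)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    new_noisy = Account(account_age_days=2, retweet_ratio=0.9, hashtag_density=8.0)
    old_quiet = Account(account_age_days=400, retweet_ratio=0.1, hashtag_density=0.5)
    print(hybrid_is_spam(0.1, new_noisy))   # rule fires despite a low model score
    print(hybrid_is_spam(0.1, old_quiet))   # neither signal fires
    print(round(f1_score(0.912, 0.889), 3))
```

With the reported Precision of 0.912 and Recall of 0.889, the harmonic mean is about 0.900, which agrees with the stated F1-Score of 90.0%.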

Article Details

How to Cite
K. Suresh, K. Thapan, K. Vamshi Reddy, K. Polaiah, and T. Abhinav Surya, “A Hybrid Framework for Detecting Automated Spammers on Twitter: Integrating Machine Learning and Heuristic Approaches”, Int. J. Comput. Eng. Res. Trends, vol. 11, no. 1s, pp. 53–60, Dec. 2024.
Section
Research Articles
