An Optimized KNN Model for Signature-Based Malware Detection
Abstract
Malware is a computer program developed with the intent of disrupting a computer system, stealing data from it, or otherwise compromising it. With recent advances in technology and growing internet use, malware has become a major problem in computing. This research proposes an optimized K-nearest Neighbor (KNN) model for malware detection and classification. The model analyzes and classifies application programming interface (API) call sequences. The dataset, collected from an online Kaggle data repository, consists of 42,797 malicious API call sequences and 1,079 non-malicious API call sequences. The KNN algorithm is applied to this dataset to build a model that detects malware. Finally, the accuracy of the proposed model is evaluated, and the results show that it detects malware with an accuracy of 98.17%. The proposed model can support real-time intrusion detection on computer systems.
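To make the classification step concrete, the following is a minimal sketch of KNN over API call sequences. It is not the authors' implementation: the integer encoding of API calls, the fixed sequence length, the Hamming distance, the toy data, and the helper names (`hamming_distance`, `knn_predict`, `accuracy`) are all assumptions chosen for illustration. The abstract does not state which distance metric or value of K the paper uses.

```python
from collections import Counter


def hamming_distance(a, b):
    """Count positions where two equal-length API call sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training sequences."""
    # Rank training indices by distance to the query (stable sort keeps
    # the original order among equally distant sequences).
    ranked = sorted(
        range(len(train_X)),
        key=lambda i: hamming_distance(train_X[i], query),
    )
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test sequences whose predicted label matches the true one."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y
        for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


# Hypothetical toy data: each row is a sequence of integer API call IDs.
train_X = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 6],
    [1, 2, 3, 7, 5],
    [9, 8, 7, 6, 5],
    [9, 8, 7, 6, 4],
]
train_y = [1, 1, 1, 0, 0]  # assumed labels: 1 = malicious, 0 = non-malicious

test_X = [[1, 2, 3, 4, 4], [9, 8, 7, 5, 5]]
test_y = [1, 0]

if __name__ == "__main__":
    print(accuracy(train_X, train_y, test_X, test_y, k=3))
```

In this sketch, K is the main parameter to tune. One common way to pick it is to compute `accuracy` on a held-out split for several values of K and keep the best one. Whether the paper optimizes its model this way is not stated in the abstract.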
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
IJCERT Policy:
The published work presented in this paper is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. This means that the content of this paper can be shared, copied, and redistributed in any medium or format, as long as the original author is properly attributed. Additionally, any derivative works based on this paper must also be licensed under the same terms. This licensing agreement allows for broad dissemination and use of the work while maintaining the author's rights and recognition.
By submitting this paper to IJCERT, the author(s) agree to these licensing terms and confirm that the work is original and does not infringe on any third-party copyright or intellectual property rights.