
Spam Detection Using Machine Learning: A Logistic Regression Approach

Y. Sri Navya, K. Pranathi, G. Srija, Syeda Hifsa Naaz

Abstract


Spam emails pose a significant challenge to digital communication, creating security threats and reducing productivity. This paper proposes a machine learning approach to spam detection that combines Logistic Regression with TF-IDF vectorization. The dataset is prepared by handling missing values and normalizing labels. Features are extracted with a TF-IDF model that includes bigrams, and a balanced Logistic Regression classifier is then trained to address class imbalance. Experimental results indicate promising accuracy, suggesting that the proposed method is effective. Future enhancements include deep learning models and ensemble techniques for improved spam classification.
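
The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name a library, so scikit-learn is an assumption, as are the toy rows, the `prepare` helper, the choice to drop (rather than impute) rows with missing values, and the reading of "balanced" as `class_weight="balanced"`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy rows of (text, label); NOT the dataset used in the paper.
raw_rows = [
    ("Win a FREE prize now, click here", "Spam"),
    ("Meeting moved to 3pm tomorrow", "ham"),
    (None, "spam"),                      # missing text
    ("Cheap loans, limited offer, click here", " SPAM "),
    ("Can you review my draft report?", "Ham"),
    ("Lunch on Friday?", None),          # missing label
    ("Free prize waiting, claim your free prize now", "spam"),
    ("Notes from today's project meeting attached", "ham"),
]


def prepare(rows):
    """Drop rows with missing fields and normalize labels to 0 (ham) / 1 (spam).

    Dropping is an assumption; the abstract only says missing values are handled.
    """
    label_map = {"ham": 0, "spam": 1}
    texts, labels = [], []
    for text, label in rows:
        if text is None or label is None:
            continue
        key = label.strip().lower()
        if key not in label_map:
            continue
        texts.append(text)
        labels.append(label_map[key])
    return texts, labels


texts, labels = prepare(raw_rows)

# TF-IDF over unigrams and bigrams, followed by logistic regression whose
# class weights are inversely proportional to class frequencies.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(texts, labels)

# Each prediction is 0 (ham) or 1 (spam).
predictions = model.predict(["Claim your free prize now", "Project meeting tomorrow"])
```

With `class_weight="balanced"`, errors on the minority class contribute more to the loss, which is one common way to handle the class imbalance typical of spam corpora. A real evaluation would also need a held-out test split and metrics beyond accuracy, such as precision and recall for the spam class.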



