
Employing Machine Learning Methods to Identify Spam Emails

Puli Vinay Kumar, S. Aswani, S. Sudheer Reddy, K. Subba Shankar, M. Rajashekar, V. Vidyasagar, B. Mohan

Abstract


The problem of email spam has worsened significantly in recent years with the rapid rise in internet users. Millions of unsolicited emails are sent worldwide every day, creating problems for both individuals and businesses. These emails are frequently used for unethical and unlawful purposes, such as financial fraud, phishing, and the distribution of malware. Spammers exploit the ease with which phony online profiles and email accounts can be created, making it difficult for recipients to identify the true sender. Spam emails frequently include dangerous files or links that can install malicious software, steal personal information, or grant unauthorized access to a user's computer.

Older filtering techniques based on simple rules are insufficient to stop these threats because spammers' methods are constantly evolving. This growing problem demonstrates the need for more intelligent, automated systems that can accurately detect and block spam emails. To address this issue, our project uses machine learning to develop a robust method for identifying spam emails.

To distinguish between spam and legitimate emails, the system considers both the content and additional information. We tested the system on a popular spam email dataset and found that the Random Forest algorithm performed best. Random Forest is a machine learning technique that combines the output of numerous decision trees to produce more reliable and accurate predictions. Compared with a single decision tree, it is less error-prone and handles complex data better.
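
The idea of combining many trees can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tiny made-up corpus, uses depth-1 trees (single-word rules) in place of full decision trees, and every name in it (`TRAIN`, `train_stump`, `train_forest`, `predict_forest`) is hypothetical. Each tree is trained on a bootstrap sample with a random subset of candidate words, and the final label is the majority vote.

```python
import random
from collections import Counter

# Toy, made-up messages for illustration only (NOT the dataset used in the paper).
# Label 1 = spam, 0 = legitimate.
TRAIN = [
    ("win a free prize now", 1),
    ("free money click this link", 1),
    ("claim your free reward now", 1),
    ("urgent offer click to win", 1),
    ("meeting agenda for monday", 0),
    ("please review the attached report", 0),
    ("lunch tomorrow with the team", 0),
    ("project report due monday", 0),
]


def tokens(text):
    """Split a message into a set of lowercase words."""
    return set(text.lower().split())


def predict_stump(stump, text):
    """A stump is (word, spam_if_present); return 1 for spam, 0 otherwise."""
    word, spam_if_present = stump
    return int((word in tokens(text)) == spam_if_present)


def train_stump(sample, vocab, rng):
    """Choose the best one-word rule from a random subset of the vocabulary."""
    candidates = rng.sample(vocab, k=max(1, int(len(vocab) ** 0.5)))
    best = None
    for word in candidates:
        for spam_if_present in (True, False):
            stump = (word, spam_if_present)
            correct = sum(predict_stump(stump, t) == y for t, y in sample)
            if best is None or correct > best[0]:
                best = (correct, stump)
    return best[1]


def train_forest(data, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    vocab = sorted({w for text, _ in data for w in tokens(text)})
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]
        forest.append(train_stump(sample, vocab, rng))
    return forest


def predict_forest(forest, text):
    """Return the label that receives the most votes across all stumps."""
    votes = Counter(predict_stump(s, text) for s in forest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    forest = train_forest(TRAIN)
    print(predict_forest(forest, "click now to claim a free prize"))
```

An odd number of trees avoids tied votes in this two-class setting. A practical system would use full-depth trees and a real labelled dataset, typically through an established library rather than hand-written code.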



