Spam Email Checker

This is a machine-learning web application that classifies email text as SPAM or NOT SPAM using natural language processing (NLP). It is built with Flask and provides a simple web interface where users enter email text and receive a prediction.

File Structure

project-folder/
├── app.py                # Main Flask application
├── dataset/              # Folder containing the dataset
│   └── email.csv         # Dataset file (spam email data)
├── spam_model.pkl        # Trained ML model
├── tfidf_vectorizer.pkl  # Trained TF-IDF vectorizer
├── templates/            # Folder containing HTML templates
│   └── index.html        # Web interface template
└── README.md             # Project documentation

Getting Started

Prerequisites

Python 3.7+
Required Python packages:
- Flask
- scikit-learn
- pandas
- joblib

Install the required packages:

pip install flask scikit-learn pandas joblib

Setup Instructions

Clone the repository or download the project files.

git clone https://github.com/yourusername/spam-email-checker.git
cd spam-email-checker

Place your dataset (email.csv) in the dataset/ folder.
Train the model. Skip this step if spam_model.pkl and tfidf_vectorizer.pkl already exist in the project root directory.
- Open a Python script or notebook.
- Load dataset/email.csv for training.
- Save the trained model as spam_model.pkl and the TF-IDF vectorizer as tfidf_vectorizer.pkl in the project root directory.
Run the Flask application:
```
python app.py
```
Open your web browser and navigate to http://127.0.0.1:5000.
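
If the server fails to start, a quick check that the expected files are in place can save some debugging. The sketch below is not part of the project; it assumes app.py needs the model, the vectorizer, and the template listed under File Structure, and the helper name `missing_files` is hypothetical.

```python
from pathlib import Path

# File names taken from the File Structure section; whether app.py needs
# all of them at startup is an assumption.
REQUIRED_FILES = (
    "app.py",
    "spam_model.pkl",
    "tfidf_vectorizer.pkl",
    "templates/index.html",
)


def missing_files(project_root="."):
    """Return the required files that are absent under project_root."""
    root = Path(project_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_files()` from the project root returns an empty list when everything is present; any names it returns point to a setup step that still needs to be done.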

Using the Application

Webpage Screenshot

[Screenshot of the web interface]

Enter email text in the input field on the web interface.
Click the Check button.
View the result: the app displays whether the email is classified as SPAM or NOT SPAM.
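
For readers curious about what happens when Check is clicked, here is a minimal sketch of how such a Flask route could be wired. It is an illustration under assumptions, not the project's actual app.py: the form field name `email_text`, the `create_app` factory, and the inline template (standing in for templates/index.html) are all hypothetical, and the label convention 1 = SPAM follows the example in the Dataset section.

```python
from flask import Flask, render_template_string, request

# Inline stand-in for templates/index.html so the sketch is self-contained.
PAGE = """
<form method="post">
  <textarea name="email_text"></textarea>
  <button type="submit">Check</button>
</form>
{% if result %}<p>Result: {{ result }}</p>{% endif %}
"""


def create_app(model, vectorizer):
    """Build a Flask app around an already-loaded model and vectorizer."""
    app = Flask(__name__)

    @app.route("/", methods=["GET", "POST"])
    def index():
        result = None
        if request.method == "POST":
            text = request.form.get("email_text", "")
            # Turn the raw text into TF-IDF features, then classify.
            features = vectorizer.transform([text])
            label = model.predict(features)[0]
            # Assumes labels are encoded as 1 = SPAM, 0 = NOT SPAM.
            result = "SPAM" if label == 1 else "NOT SPAM"
        return render_template_string(PAGE, result=result)

    return app
```

To serve it, the two pickled objects would be loaded with `joblib.load("spam_model.pkl")` and `joblib.load("tfidf_vectorizer.pkl")`, passed to `create_app`, and the returned app started with `app.run()`. Taking the model and vectorizer as arguments keeps the route easy to test with stand-in objects.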

Dataset

The dataset (email.csv) should contain a column of email text and a column of labels (e.g., 0 for NOT SPAM and 1 for SPAM). If you're using your own dataset, preprocess it the same way as described under Training Steps so that the model and vectorizer see consistent input.
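
A small loader can catch the most common dataset problems before training. The sketch below assumes the CSV has columns named `text` and `label`; these names are hypothetical, since the column names in email.csv are not documented here, so adjust them to match your file.

```python
import pandas as pd


def load_dataset(source, text_col="text", label_col="label"):
    """Read the CSV and return a cleaned DataFrame with text and 0/1 labels.

    text_col and label_col are assumed names, not taken from email.csv.
    """
    df = pd.read_csv(source)
    missing = {text_col, label_col} - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")

    # Drop rows with no text or no label, then remove duplicate emails.
    df = df.dropna(subset=[text_col, label_col])
    df = df.drop_duplicates(subset=[text_col]).copy()

    # Labels are expected to be numeric 0/1; string labels would need mapping first.
    df[label_col] = df[label_col].astype(int)
    bad = sorted(int(v) for v in set(df[label_col].unique()) - {0, 1})
    if bad:
        raise ValueError(f"Unexpected label values: {bad}")
    return df.reset_index(drop=True)
```

For example, `load_dataset("dataset/email.csv")` would return only rows with non-empty text and a 0 or 1 label.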

Model Training

The model is trained using:

TF-IDF Vectorizer: Converts email text into numerical features.
Logistic Regression: Used for binary classification.

Training Steps:

Preprocess the text by removing special characters, converting to lowercase, and tokenizing.
Vectorize the text using TF-IDF.
Train the Logistic Regression model.
Save the model and vectorizer using joblib.
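
The steps above can be sketched as a short script. This is one possible implementation under stated assumptions, and the project's own training code may differ: the column names `text` and `label`, the 80/20 split, and `max_iter=1000` are choices made for illustration. Instead of a separate cleaning function, the sketch relies on TfidfVectorizer's built-in lowercasing and default token pattern, which drops punctuation, so the saved vectorizer can be applied directly to raw text at prediction time.

```python
from pathlib import Path

import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train(texts, labels):
    """Fit a TF-IDF vectorizer and a Logistic Regression classifier."""
    # lowercase=True plus the default token pattern handle lowercasing,
    # punctuation removal, and tokenization in one place.
    vectorizer = TfidfVectorizer(lowercase=True)
    features = vectorizer.fit_transform(texts)
    model = LogisticRegression(max_iter=1000)
    model.fit(features, labels)
    return model, vectorizer


def train_and_save(csv_path, text_col="text", label_col="label", out_dir="."):
    """Train on the CSV, save both artifacts, and return held-out accuracy.

    text_col and label_col are assumed column names.
    """
    df = pd.read_csv(csv_path).dropna(subset=[text_col, label_col])
    X_train, X_test, y_train, y_test = train_test_split(
        df[text_col],
        df[label_col],
        test_size=0.2,
        random_state=42,
        stratify=df[label_col],
    )
    model, vectorizer = train(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(vectorizer.transform(X_test)))

    # File names match the ones the File Structure section expects.
    out = Path(out_dir)
    joblib.dump(model, out / "spam_model.pkl")
    joblib.dump(vectorizer, out / "tfidf_vectorizer.pkl")
    return accuracy
```

Running `train_and_save("dataset/email.csv")` from the project root would write both .pkl files next to app.py. The vectorizer must be saved along with the model because new text has to be mapped to the same vocabulary that the model was trained on.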

Technologies Used

Python
Flask: For building the web application.
scikit-learn: For machine learning and text processing.
HTML/CSS: For the front-end web interface.

Future Enhancements

Improve model accuracy by exploring advanced NLP techniques (e.g., word embeddings or deep learning).
Add more user-friendly features to the web interface.
Deploy the application to a cloud platform (e.g., Heroku, AWS, or Google Cloud).

License

This project is open-source and available under the MIT License.

Author

Virendra Ghule

Virendra108/Spam-email_WebApplication
