AdClassifier Pro is an enterprise-grade Machine Learning application designed to intelligently categorize digital advertisement text. Built with Streamlit and Scikit-learn, it offers a modern SaaS-like experience for both single-text analysis and bulk data processing.
- Real-time Metrics: View total ads, unique categories, model accuracy, and versioning.
- Data Visualization: Interactive charts showing the distribution of ad categories in the training dataset.
- Real-time Analysis: Paste any job description or ad text to get an instant classification.
- Smart Logic:
  - High Confidence: Displays the predicted category with a visual confidence bar.
  - Low Confidence Handling: If the model's confidence is below 60%, the system flags the input as "Unclear/Random" rather than making a potentially incorrect guess. This keeps low-confidence guesses from being presented to end users as predictions.
- Bulk Operations: Upload .csv or .xlsx files containing thousands of ad texts.
- Automated Labeling: The processing engine classifies every row.
- Uncertainty Detection: Rows with low confidence scores are explicitly marked as "Uncertain/Random".
- Analytics: Auto-generated reports on the distribution of categories within your batch.
- Export: Download the fully labeled dataset in one click.
- Dark Mode: Sleek, high-contrast design for reduced eye strain.
- Responsive: Fully functional on desktop and tablet sizes.
- Interactive Outcomes: Clear, card-based result displays.
- Frontend: Streamlit
- Machine Learning: Scikit-learn
- Data Processing: Pandas & NumPy
- Visualization: Streamlit native charts
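The batch features listed above (labeling every row, marking uncertain rows, summarizing the category distribution, and exporting the result) can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: it uses Python's standard csv module instead of Pandas so that it runs without dependencies, it handles .csv input only, and the names label_rows, to_csv, text_column, predicted_category, and confidence are all hypothetical. The classify argument stands in for whatever function returns a label and a confidence percentage for one ad text.

```python
import csv
import io
from collections import Counter


def label_rows(csv_text, classify, text_column="text"):
    """Label every row of a CSV string and summarize the categories.

    `classify` is any callable mapping an ad text to (label, confidence).
    `text_column` is a hypothetical column name; adjust it to your file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    labeled = []
    for row in reader:
        label, confidence = classify(row[text_column])
        # Output column names are assumptions for this sketch.
        row["predicted_category"] = label
        row["confidence"] = round(confidence, 1)
        labeled.append(row)
    # Count how many rows landed in each category (the "Analytics" step).
    distribution = Counter(row["predicted_category"] for row in labeled)
    return labeled, distribution


def to_csv(rows):
    """Serialize labeled rows back to CSV text (the "Export" step)."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

In the real app, .xlsx uploads would need a spreadsheet reader such as pandas.read_excel, which this sketch does not cover.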
Follow these steps to set up the project locally.
- Python 3.8 or higher
git clone <repository-url>
cd Digital-Advertisement-Classification
It's recommended to use a virtual environment to manage dependencies.
Windows:
python -m venv .venv
.venv\Scripts\activate
Mac/Linux:
python3 -m venv .venv
source .venv/bin/activate
On either platform, install the dependencies and start the app:
pip install -r requirements.txt
streamlit run streamlit_app.py
The app will open automatically in your browser at http://localhost:8501.
Some behavior can be configured directly in the code.
In streamlit_app.py, you can adjust the sensitivity of the "Smart Logic" by modifying the CONFIDENCE_THRESHOLD constant at the top of the file.
# Default is 60.0%
CONFIDENCE_THRESHOLD = 60.0
├── data/
│ └── ConcatenatedDigitalAdData.xlsx # Training dataset
├── model/
│ └── adv_model.sav # Pre-trained ML model
├── notebook/
│ └── ... # Research notebooks
├── streamlit_app.py # Main Application
├── requirements.txt # Python dependencies
└── README.md # Documentation
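To illustrate how a setting like CONFIDENCE_THRESHOLD might drive the low-confidence handling, here is a minimal sketch. It assumes the saved model follows the scikit-learn classifier interface, with a predict_proba method and a classes_ attribute; the README does not confirm which estimator is stored in adv_model.sav, so treat that as an assumption. The function name classify_with_threshold is hypothetical, and the fallback label is copied from the feature description rather than from the app's code.

```python
CONFIDENCE_THRESHOLD = 60.0  # percent, mirroring the default described above


def classify_with_threshold(model, text, threshold=CONFIDENCE_THRESHOLD):
    """Return (label, confidence_percent) for a single ad text.

    `model` is assumed to expose the scikit-learn classifier interface:
    `predict_proba` returns one row of class probabilities per input, and
    `classes_` lists the labels in the same column order.
    """
    probabilities = model.predict_proba([text])[0]
    # Pick the index of the most probable class.
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    confidence = float(probabilities[best_index]) * 100.0
    if confidence < threshold:
        # Assumed fallback label; the app's exact wording may differ.
        return "Unclear/Random", confidence
    return model.classes_[best_index], confidence
```

Raising the threshold makes the classifier more conservative (more inputs flagged as unclear), while lowering it accepts more predictions at the cost of more potential misclassifications.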
Q: The Classifier keeps saying "Input unclear".
A: This happens when the text provided is too short, gibberish, or completely unrelated to known ad categories, so the model's confidence falls below the 60% threshold. Try providing more descriptive text.
Q: "FileNotFoundError" on startup.
A: Ensure you are running the streamlit run command from the root directory of the project, so it can find model/adv_model.sav and data/.
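One way to make this error less dependent on the working directory is to resolve the model path from an explicit project root and fail with a clearer message. The sketch below is only an illustration of that idea: resolve_model_path is a hypothetical helper, not something the app is known to contain, and it deliberately stops short of loading the file because the README does not say how the .sav file was serialized.

```python
from pathlib import Path


def resolve_model_path(project_root):
    """Return the absolute path to model/adv_model.sav under `project_root`.

    Raises FileNotFoundError with a hint when the file is missing.
    """
    model_path = Path(project_root).resolve() / "model" / "adv_model.sav"
    if not model_path.is_file():
        raise FileNotFoundError(
            f"{model_path} not found; run the app from the project root "
            "or pass the correct root directory."
        )
    return model_path


# In streamlit_app.py the root could be derived from the script location
# instead of the current working directory (an assumption, not current code):
# MODEL_PATH = resolve_model_path(Path(__file__).parent)
```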
Q: Can I train my own model?
A: Yes, check the notebook/ directory for the training scripts used to generate the .sav file.