Comment Moderation: Text Classification Model • Data Fukuro

Project details:

This project builds a machine-learning model that classifies user comments as toxic or non-toxic. After cleaning the text and converting it into numerical features with TF-IDF, several algorithms were tested and tuned. The final model helps online platforms flag harmful content automatically and maintain a safer communication environment.

Date: June 2022

Link: GitHub Repository

Tags: Classification Modeling, Feature Engineering, Grid Search, Machine Learning, Model Evaluation, Natural Language Processing, NumPy, Pandas, Scikit-learn, Text Preprocessing, TF-IDF Vectorization

Description

Business Context & Problem

Online platforms hosting user-generated content face challenges with offensive, abusive or harmful messages. Manual moderation is expensive and slow, especially at scale. A reliable classification model helps filter toxic comments early, reduce moderator workload and improve the user experience. The project focuses on building such a model using historical labelled comments.

Data & Analytical Approach

The dataset contained large volumes of text comments with binary toxicity labels. Text preprocessing included lowercasing, removing symbols and normalising whitespace. Stopwords were handled appropriately, and the cleaned text was transformed using TF-IDF to capture meaningful vocabulary patterns. The data was split into training, validation and test sets to evaluate generalisation properly.
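The preprocessing, split and vectorisation steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `clean_text` helper, the toy comments, the 60/20/20 split ratios and the use of scikit-learn's built-in English stopword list are all assumptions made for the example.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split


def clean_text(text: str) -> str:
    """Lowercase, replace non-letter symbols with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # drop digits, punctuation, symbols
    return re.sub(r"\s+", " ", text).strip()


# Toy data invented for illustration (1 = toxic, 0 = non-toxic).
comments = [
    "You are an IDIOT!!!",
    "Thanks for the   helpful answer :)",
    "Nobody wants you here, loser.",
    "Great write-up, learned a lot.",
    "Shut up, you clown!",
    "Could you share the source?",
    "What a stupid, worthless post.",
    "I agree with this point.",
    "You're pathetic and dumb.",
    "Nice photo, where was it taken?",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

cleaned = [clean_text(c) for c in comments]

# Assumed 60/20/20 split, stratified so each subset keeps the class ratio.
X_train, X_rest, y_train, y_rest = train_test_split(
    cleaned, labels, test_size=0.4, stratify=labels, random_state=42
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=42
)

# Fit the vocabulary and IDF weights on the training set only,
# then reuse them for validation and test to avoid leakage.
vectorizer = TfidfVectorizer(stop_words="english")
X_train_tfidf = vectorizer.fit_transform(X_train)
X_val_tfidf = vectorizer.transform(X_val)
X_test_tfidf = vectorizer.transform(X_test)
```

Fitting the vectoriser on the training split alone keeps the validation and test scores honest, since no vocabulary statistics from held-out comments reach the model.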

Statistical / ML Analysis

Several classification models were trained: logistic regression, linear SVM and random forest. Hyperparameters were tuned with grid search to maximise F1 score, a metric better suited than accuracy to imbalanced datasets because it balances precision and recall on the positive class. Linear models performed particularly well with the sparse, high-dimensional TF-IDF features. The final model produced strong accuracy and balanced performance across both classes.
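One way to compare the three model families with an F1-driven grid search is sketched below. The toy data, the parameter grids, the 3-fold cross-validation and the choice to keep TF-IDF inside a `Pipeline` are assumptions for illustration; the project's actual grids and validation scheme are not stated here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy data invented for illustration (1 = toxic, 0 = non-toxic).
texts = [
    "you are an idiot", "thanks for the helpful answer",
    "nobody wants you here loser", "great write up learned a lot",
    "shut up you clown", "could you share the source",
    "what a stupid worthless post", "i agree with this point",
    "you are pathetic and dumb", "nice photo where was it taken",
    "get lost you moron", "this tutorial was very clear",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# Keeping TF-IDF in the pipeline means it is refitted on each
# training fold, so held-out folds never influence the vocabulary.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression()),  # placeholder, replaced by the grid below
])

# One sub-grid per model family; values are illustrative only.
param_grid = [
    {"clf": [LogisticRegression(max_iter=1000)], "clf__C": [0.1, 1.0, 10.0]},
    {"clf": [LinearSVC()], "clf__C": [0.1, 1.0, 10.0]},
    {"clf": [RandomForestClassifier(random_state=0)],
     "clf__n_estimators": [50, 100]},
]

# scoring="f1" ranks candidates by F1 on the positive (toxic) class.
search = GridSearchCV(pipeline, param_grid, scoring="f1", cv=3)
search.fit(texts, labels)

best_model = type(search.best_estimator_.named_steps["clf"]).__name__
print(best_model, round(search.best_score_, 3))
```

On a dataset this small the winning model is essentially arbitrary; the point is only the mechanics of selecting on F1 rather than accuracy.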

Key Insights & Final Recommendations

The analysis confirmed that TF-IDF features combined with a linear classifier provide an effective baseline for comment moderation tasks. The model can be integrated into a moderation pipeline to pre-screen comments and prioritise human review where needed. Overall, the project demonstrates a practical NLP solution for identifying harmful content and improving communication safety.
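As a rough sketch of how such pre-screening might work, the snippet below routes comments by a toxicity score and orders the human review queue so the most suspicious items come first. Everything here is hypothetical: the `ScoredComment` type, the `route` and `review_queue` helpers, and both thresholds are invented for illustration, and the scores are assumed to come from a model that outputs probabilities (for example logistic regression's `predict_proba`).

```python
from dataclasses import dataclass


@dataclass
class ScoredComment:
    text: str
    p_toxic: float  # model's estimated probability that the comment is toxic


def route(comment: ScoredComment,
          review_threshold: float = 0.4,
          flag_threshold: float = 0.9) -> str:
    """Return 'flag', 'review' or 'allow' based on hypothetical thresholds."""
    if comment.p_toxic >= flag_threshold:
        return "flag"    # confident enough to flag automatically
    if comment.p_toxic >= review_threshold:
        return "review"  # uncertain: send to a human moderator
    return "allow"


def review_queue(comments: list[ScoredComment]) -> list[ScoredComment]:
    """Comments needing human review, highest toxicity score first."""
    pending = [c for c in comments if route(c) == "review"]
    return sorted(pending, key=lambda c: c.p_toxic, reverse=True)


scored = [
    ScoredComment("you are an idiot", 0.95),
    ScoredComment("that take is kind of dumb", 0.55),
    ScoredComment("thanks, very helpful", 0.03),
    ScoredComment("nobody asked you", 0.70),
]
queue = review_queue(scored)
```

In practice the two thresholds would be chosen on the validation set, trading moderator workload against the risk of missing harmful comments.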