rishavafk/Cyber_Bullying-Detection

Cyber Bullying Detection

Overview

This project is a machine learning-based web application for detecting cyberbullying in tweets. It uses natural language processing (NLP) and supervised learning to classify text as bullying or not. The app is built with Streamlit for an interactive user interface.

Features

  • Input text/tweet and get instant cyberbullying prediction
  • Uses a trained machine learning model (e.g., Decision Tree, Random Forest, or similar)
  • Text preprocessing with NLTK (stopwords, tokenization)
  • TF-IDF vectorization for feature extraction
  • Reports model accuracy and other evaluation metrics

Dataset

  • cyberbullying_tweets.csv: Contains labeled tweets for training and testing
  • Classes: Bullying, Not Bullying (binary classification)

Machine Learning Pipeline

  1. Data Preprocessing
    • Convert text to lowercase
    • Tokenize and clean text
    • Remove stopwords
  2. Feature Extraction
    • TF-IDF Vectorizer transforms text into numerical features
    • Vectorizer is saved as tfidf_vectorizer.pkl
  3. Model Training
    • Model (e.g., DecisionTreeClassifier) is trained on the vectorized data
    • Model is saved as bullying_model.pkl
    • Model accuracy is evaluated and reported
  4. Prediction
    • User input is preprocessed and vectorized
    • Model predicts if the input is bullying or not
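
The four steps above can be sketched end to end as follows. This is a minimal illustration, not the notebook's actual code: the `preprocess` helper, the small `STOPWORDS` set (a stand-in for NLTK's English stopword list), the toy tweets, and the 1/0 label encoding are all assumptions made for the example. Only `TfidfVectorizer`, `DecisionTreeClassifier`, and the two `.pkl` file names come from the pipeline description.

```python
import pickle
import re
import tempfile
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier

# Tiny stand-in for NLTK's English stopword list (assumption: the project
# itself would use nltk.corpus.stopwords.words("english")).
STOPWORDS = {"a", "an", "and", "are", "for", "i", "is", "it",
             "of", "so", "the", "to", "you", "your"}


def preprocess(text: str) -> str:
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(t for t in tokens if t not in STOPWORDS)


# Made-up examples for illustration only; 1 = bullying, 0 = not bullying.
texts = [
    "You are so stupid and ugly",
    "Nobody likes you, loser",
    "Thanks for the help today",
    "Great game last night",
]
labels = [1, 1, 0, 0]

# 1. Data preprocessing
cleaned = [preprocess(t) for t in texts]

# 2. Feature extraction: fit TF-IDF on the cleaned training text
vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform(cleaned)

# 3. Model training
model = DecisionTreeClassifier(random_state=0)
model.fit(features, labels)

# Persist both artifacts under the file names used by the project
# (written to a temporary directory here to avoid touching the repo).
out_dir = Path(tempfile.mkdtemp())
with open(out_dir / "tfidf_vectorizer.pkl", "wb") as f:
    pickle.dump(vectorizer, f)
with open(out_dir / "bullying_model.pkl", "wb") as f:
    pickle.dump(model, f)

# 4. Prediction: new text goes through the same preprocessing and the
# already-fitted vectorizer (transform, not fit_transform).
new_features = vectorizer.transform([preprocess("you are a loser")])
prediction = model.predict(new_features)[0]
print("Predicted label:", prediction)
```

The point to carry over is that the vectorizer fitted during training is the one reused at prediction time, which is why it is saved alongside the model.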

Streamlit App

  • Main file: app.py
  • Loads the trained model and vectorizer
  • Provides a simple UI for text input and displays prediction
  • Can be run locally or deployed online
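
A rough sketch of how app.py might be organised is shown below. It is an assumption-laden outline rather than the repository's actual file: the helper names (`clean_text`, `load_artifacts`, `is_bullying`, `main`), the `BULLYING_LABEL = 1` encoding, and the simplified cleaning step are hypothetical. The pickle file names are the ones listed in this README, and the Streamlit calls used are standard ones (`st.title`, `st.text_area`, `st.button`, `st.error`, `st.success`).

```python
import pickle
import re
from pathlib import Path

MODEL_PATH = Path("bullying_model.pkl")
VECTORIZER_PATH = Path("tfidf_vectorizer.pkl")
BULLYING_LABEL = 1  # assumption: the saved model encodes "bullying" as 1


def clean_text(text: str) -> str:
    """Simplified cleaning; a real app should mirror the training preprocessing."""
    return " ".join(re.findall(r"[a-z]+", text.lower()))


def load_artifacts(model_path: Path = MODEL_PATH,
                   vectorizer_path: Path = VECTORIZER_PATH):
    """Unpickle the trained model and the fitted TF-IDF vectorizer."""
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    with open(vectorizer_path, "rb") as f:
        vectorizer = pickle.load(f)
    return model, vectorizer


def is_bullying(text: str, model, vectorizer) -> bool:
    """Vectorize the cleaned text and compare the prediction to BULLYING_LABEL."""
    features = vectorizer.transform([clean_text(text)])
    return bool(model.predict(features)[0] == BULLYING_LABEL)


def main() -> None:
    # Imported here so the helpers above stay usable without Streamlit installed.
    import streamlit as st

    st.title("Cyber Bullying Detection")
    model, vectorizer = load_artifacts()
    text = st.text_area("Enter a tweet or message")
    if st.button("Predict") and text.strip():
        if is_bullying(text, model, vectorizer):
            st.error("This text looks like bullying.")
        else:
            st.success("This text does not look like bullying.")
```

Calling `main()` at the bottom of app.py and launching it with `streamlit run app.py` would render the form. Keeping `is_bullying` separate from the UI code makes the prediction logic easy to test on its own.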

How to Run Locally

  1. Install dependencies:
    pip install -r requirements.txt
  2. Start the app:
    streamlit run app.py
  3. The app will open in your browser (Streamlit serves it at http://localhost:8501 by default).

How to Deploy Online

  1. Push your project to GitHub
  2. Go to Streamlit Community Cloud
  3. Link your GitHub repo and select app.py as the main file
  4. Deploy and share your app

Files

  • app.py: Streamlit web app
  • bullying_model.pkl: Trained ML model
  • tfidf_vectorizer.pkl: TF-IDF vectorizer
  • cyberbullying_tweets.csv: Dataset
  • bullying-classification-accuracy-80.ipynb: Training notebook
  • requirements.txt: Python dependencies

Requirements

  • Python 3.8+
  • scikit-learn
  • nltk
  • streamlit

License

MIT License

Author

  • Rishav Shah
