🎬 Analysis of Indian Movie Dataset using EDA, Classification, and Regression


A data science project focused on exploring, visualizing, and modeling an Indian movies dataset. This project applies Exploratory Data Analysis (EDA), classification, and regression techniques to uncover patterns and predict movie success metrics.


📂 Project Structure

├── data/                  # Dataset files (CSV)
├── eda/                   # EDA notebooks and visualizations
├── models/                # Classification & regression models
├── utils/                 # Utility functions
├── outputs/               # Graphs, plots, and predictions
├── requirements.txt       # Python dependencies
└── main.ipynb             # Main notebook (EDA + Modeling)

🧠 Key Features

  • ✅ Data cleaning and preprocessing
  • 📊 Exploratory Data Analysis (EDA) with Matplotlib & Seaborn
  • 🤖 Machine Learning models:
    • Classification (Decision Trees, Random Forests, etc.)
    • Regression (Linear Regression, Random Forest Regressor, etc.)
  • 📈 Evaluation metrics: Accuracy, MAE, RMSE, R²
  • 🔍 Insightful visualizations and trend analysis
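The evaluation metrics listed above can be sketched with NumPy as follows. This is a minimal illustration of the standard definitions, not the project's actual evaluation code; the function names `accuracy` and `regression_metrics` are hypothetical. Scikit-learn offers equivalents in `sklearn.metrics` (`accuracy_score`, `mean_absolute_error`, `mean_squared_error`, `r2_score`).

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels (classification)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R² for a set of regression predictions.

    Hypothetical helper for illustration; assumes y_true is not constant,
    otherwise the R² denominator is zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    mae = float(np.mean(np.abs(errors)))            # mean absolute error
    rmse = float(np.sqrt(np.mean(errors ** 2)))     # root mean squared error
    ss_res = float(np.sum(errors ** 2))             # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination
    return {"MAE": mae, "RMSE": rmse, "R2": r2}
```

MAE and RMSE are in the same units as the target (for example, rating points), while R² is unitless and compares the model against always predicting the mean.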

📌 Dataset Overview

The dataset contains information about Indian movies such as:

  • Title, Genre, Language
  • IMDb rating
  • Number of votes
  • Release date
  • Budget and box office performance

📁 Source of dataset: link not yet provided
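A cleaning step for fields like these might look roughly like the sketch below. The column names (`title`, `imdb_rating`, `votes`, `release_date`) and the helper `clean_movies` are assumptions for illustration; the real CSV in data/ may use different names and need different handling. The sample rows are made up.

```python
import pandas as pd

def clean_movies(df):
    """Hypothetical cleaning helper: dedupe, coerce types, drop unusable rows."""
    out = df.drop_duplicates(subset="title").copy()
    # Vote counts are assumed to arrive as strings such as "1,200".
    out["votes"] = pd.to_numeric(
        out["votes"].astype(str).str.replace(",", "", regex=False),
        errors="coerce",
    )
    # Unparseable ratings and dates become NaN / NaT instead of raising.
    out["imdb_rating"] = pd.to_numeric(out["imdb_rating"], errors="coerce")
    out["release_date"] = pd.to_datetime(out["release_date"], errors="coerce")
    # Keep only rows that have both a rating and a vote count.
    out = out.dropna(subset=["imdb_rating", "votes"])
    return out.reset_index(drop=True)

# Made-up rows, only to show the behaviour of the helper.
raw = pd.DataFrame({
    "title": ["A", "B", "B", "C"],
    "imdb_rating": ["7.5", "6.1", "6.1", "n/a"],
    "votes": ["1,200", "350", "350", "90"],
    "release_date": ["2019-01-25", "2020-11-13", "2020-11-13", "unknown"],
})
clean = clean_movies(raw)
```

Here the duplicate row for "B" is dropped, and "C" is removed because its rating cannot be parsed.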


🚀 Getting Started

1. Clone the repo

git clone https://github.com/adityaranjan08/Analysis-of-Movie-Dataset-Using-Exploratory-Data-Analysis-Classification-and-Regression.git
cd Analysis-of-Movie-Dataset-Using-Exploratory-Data-Analysis-Classification-and-Regression

2. Install dependencies

pip install -r requirements.txt

3. Run the notebook

Use Jupyter Notebook or Jupyter Lab:

jupyter notebook main.ipynb

📈 Results & Insights

Some interesting findings:

  • Certain genres and languages are more likely to succeed.
  • IMDb ratings are strongly correlated with vote counts.
  • Regression models can reasonably predict movie popularity.

Detailed results and charts are available in the outputs/ folder.
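Findings of this kind can be checked with a few lines of pandas. The sketch below uses made-up rows and assumed column names (`genre`, `imdb_rating`, `votes`), so the numbers it produces are not results from this project; it only shows how a rating/vote correlation and a per-genre comparison could be computed.

```python
import pandas as pd

# Toy data invented for illustration; not taken from the project's dataset.
movies = pd.DataFrame({
    "genre": ["Drama", "Drama", "Action", "Action", "Comedy"],
    "imdb_rating": [8.0, 7.0, 6.0, 5.0, 6.5],
    "votes": [4100, 2900, 2000, 1000, 2500],
})

# Pearson correlation between rating and vote count.
rating_votes_corr = movies["imdb_rating"].corr(movies["votes"])

# Average rating per genre, highest first.
genre_mean = (
    movies.groupby("genre")["imdb_rating"]
    .mean()
    .sort_values(ascending=False)
)

print(f"rating vs. votes correlation: {rating_votes_corr:.3f}")
print(genre_mean)
```

Vote counts are often heavily skewed, so on real data it may be worth comparing against a rank-based correlation or log-transformed votes before calling a relationship strong.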


🛠 Tools & Technologies

  • Python (Pandas, NumPy, Scikit-learn)
  • Seaborn & Matplotlib for Visualization
  • Jupyter Notebook
  • Git & GitHub

🤝 Contributing

Pull requests are welcome! For major changes, please open an issue first to discuss what you would like to change.


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


👨‍💻 Author

Aditya Ranjan
📧 Email
🌐 GitHub


⭐️ If you found this repo helpful, feel free to give it a star!
