
End-to-end deep learning system for classifying AI-generated vs. human-created images. Built with Python, TensorFlow, and Keras, leveraging state-of-the-art computer vision models. Deployed using FastAPI, Docker, and AWS (S3, ECR, EC2) with GitHub Actions for CI/CD.


AI-Generated Images Detection

This repository is part of the AWS Machine Learning Engineer Track in the Digital Egypt Pioneers Initiative (DEPI), Batch 2 - April 2025

📖 Project Report

Project Demo 🚀:

AI-Generated-Images-Detection-Demo.mp4

🔍 Introduction

  • The line between real and AI-generated content is blurring fast. With tools like DALL·E and Midjourney now accessible to everyone, malicious use cases — from deepfake propaganda to fake historical imagery — are on the rise.

  • There’s a critical need for automated, scalable systems that can reliably detect such synthetic content. Manual verification doesn’t scale, and conventional tools fail to keep up with the realism of new AI models.

  • In this project, we aim to respond to that need by developing a robust AI-powered image detection system that classifies content as either AI-generated or human-created.


🎯 Objectives

  • Data Pipeline and Augmentation

    • Preprocess images at multiple resolutions (e.g., 224, 384).
    • Apply robust augmentation strategies such as resizing, cropping, flipping, rotation, zooming, shearing, and brightness and color adjustments to help the model generalize.
  • Model Development

    • Develop and compare state-of-the-art deep learning models for image classification, including variants from the EfficientNet and ConvNeXt families, and deploy the one with the best trade-off between accuracy and inference cost.
    • Evaluate the models using a custom loss function that blends binary cross-entropy with a fairness-oriented MSE penalty to reduce bias and enforce a target prediction ratio.
    • Convert the best-performing model to ONNX format for optimized deployment and faster inference in a production environment.
  • Web Backend & Deployment

    • Build a FastAPI backend to handle image uploads and inference requests.
    • Use Docker and Docker Compose to containerize the entire application.
    • Store models on AWS S3, push containers to AWS ECR, and run them on AWS EC2.
    • Automate deployments using GitHub Actions for CI/CD.

🔬 Methodology

1️⃣ Data Collection & Preprocessing

  • Dataset: The dataset is sourced from the AI-Generated vs. Human-Created Images Competition. It was provided by Shutterstock and DeepMedia and combines authentic and AI-generated images, creating a robust foundation for training and evaluation.

    AI-Generated Images Samples:

    AI-Images-Samples

    Human-Created Images Samples:

    Human-Images-Samples

  • Preprocessing & Augmentation: Images are resized to resolutions such as 224 or 384 and normalized with the pretrained model-specific preprocessing functions. The augmentation pipeline uses Keras layers to apply transformations such as random cropping, flipping, rotation, translation, zoom, subtle blurring, and sharpness, brightness, and color shifts (a minimal pipeline sketch appears at the end of this section).

    AI-Generated Augmented Images Samples:

    AI-Generated-Augmented-Images

    Human-Created Augmented Images Samples:

    Human-Generated-Augmented-Images

  • Data Splitting:

    • The dataset is split into 90% training (about 72,000 images) and 10% validation (about 8,000 images), with the Kaggle competition test set (about 5,500 images) used for final testing.

    • To prevent data leakage, we used GroupShuffleSplit during data splitting (see the splitting sketch at the end of this section). Since each human-created image in the dataset had a corresponding AI-generated counterpart, random splitting could easily place related pairs in both the training and validation sets. Group-based splitting ensured these pairs remained within the same split, preserving the integrity of our evaluation.

      Competition Test Set Samples:

      Test-Images
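Below is a minimal sketch of how such an augmentation pipeline can be expressed with standard Keras preprocessing layers. The layer choices, parameter ranges, and the IMG_SIZE constant are illustrative assumptions, not the exact configuration used in this project.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

IMG_SIZE = 384  # one of the resolutions mentioned above (224 or 384)

# Illustrative augmentation pipeline; the exact transformations and ranges
# used in the project may differ.
augmentation = keras.Sequential(
    [
        layers.RandomCrop(IMG_SIZE, IMG_SIZE),
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.05),
        layers.RandomTranslation(0.1, 0.1),
        layers.RandomZoom(0.1),
        layers.RandomBrightness(0.1),
        layers.RandomContrast(0.1),
    ],
    name="augmentation",
)

def preprocess(image, label, training=False):
    """Resize and (optionally) augment a single image before batching."""
    if training:
        # Resize slightly larger so the random crop still covers most of the image.
        image = tf.image.resize(image, (IMG_SIZE + 32, IMG_SIZE + 32))
        image = augmentation(tf.expand_dims(image, 0), training=True)[0]
    else:
        image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    # Model-specific normalization (e.g., the backbone's preprocess_input) would follow here.
    return image, label
```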
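The group-aware split described above can be sketched as follows, assuming a metadata DataFrame with one row per image and a column linking each human-created image to its AI-generated counterpart. The file and column names ("train.csv", "pair_id") are hypothetical.

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical metadata file: one row per image, with a "pair_id" column linking
# each human-created image to its AI-generated counterpart.
df = pd.read_csv("train.csv")

splitter = GroupShuffleSplit(n_splits=1, test_size=0.10, random_state=42)
train_idx, val_idx = next(splitter.split(df, groups=df["pair_id"]))
train_df, val_df = df.iloc[train_idx], df.iloc[val_idx]

# Sanity check: no image pair should appear in both splits.
assert set(train_df["pair_id"]).isdisjoint(val_df["pair_id"])
```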

2️⃣ Model Development & Training

We experimented with multiple high-performing models:

  • EfficientNetV2S

    • Validation Score: 98%
    • Kaggle Score: 77.5%
    • Model Architecture:
      • Native version from Keras Pre-trained models.

        EfficientNetV2S_Model

  • ConvNeXtTiny

    • Validation Score: 99%
    • Kaggle Score: 79%
    • Model Architecture:
      • Native version from Keras Pre-trained models.

        ConvNeXt_Model

  • EfficientNetB5-Swin

    • Validation Score: 99%
    • Kaggle Score: 81%
    • Model Architecture:
      • Keras Hub EfficientNet-B5 model pre-trained on the ImageNet-12k dataset and fine-tuned on ImageNet-1k by Ross Wightman, based on the Swin Transformer training / pre-training recipe with modifications (related to both the DeiT and ConvNeXt recipes).

        EfficientNetB5Swin_Model

EfficientNetB5-Swin was ultimately chosen for deployment due to its superior Kaggle score and its balance of accuracy, model size, and inference speed, a trade-off that is essential for scalable, cloud-based deployment.
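As a rough illustration of how these classifiers can be assembled, the sketch below wraps a pretrained EfficientNetV2S backbone from keras.applications (one of the compared models) with a binary classification head. The head layout, dropout rate, and learning rate are assumptions and may differ from the project's actual configuration.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_classifier(img_size=384):
    # Pretrained backbone without the ImageNet classification head.
    backbone = keras.applications.EfficientNetV2S(
        include_top=False, weights="imagenet", input_shape=(img_size, img_size, 3)
    )
    inputs = keras.Input(shape=(img_size, img_size, 3))
    x = backbone(inputs)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.3)(x)                          # illustrative dropout rate
    outputs = layers.Dense(1, activation="sigmoid")(x)  # P(image is AI-generated)
    return keras.Model(inputs, outputs)

model = build_classifier()
model.compile(
    optimizer=keras.optimizers.AdamW(learning_rate=1e-4),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```

Note that the actual training used the custom loss described under Training Configuration below rather than plain binary cross-entropy.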

Comparing Models' Accuracy and Scores:

Models_Comparison_Accuracy_Scores

Comparing Models' Accuracy and Inference Time:

Models_Comparison_Size_Infrence

Our Kaggle competition score would place us in the top 20 among more than 550 teams, confirming the robustness of our approach and pipeline.

Training Configuration:

  • Optimizer: AdamW

  • Training Duration: 3–5 epochs

  • Loss Function: A custom loss that addresses the models' bias toward a particular class during training.

    • Explanation: We used a custom loss combining binary cross-entropy with an MSE fairness penalty, which enforces alignment with a target class distribution by penalizing deviation from a predefined ratio of AI-generated predictions, thereby mitigating bias and encouraging balanced predictions (a code sketch follows this list).
      • $\text{Loss}_1$: Standard binary cross-entropy loss for training sample predictions.

      • $\text{Loss}_2$: Mean squared error (MSE) loss to enforce a target ratio $\beta$ of predicted class 1 (AI-generated) to class 0 (human-created) samples in the test set.

        $$\text{MSE} = (\text{mean}(y_{\text{pred}}) - \beta)^2$$

      • The total loss is computed as:

        $$\text{Total Loss} = \text{Loss}_1 + \alpha \times \text{Loss}_2$$

        where:

        • $\alpha$ is a hyperparameter controlling the weight of the fairness constraint.
        • $\beta$ is the target proportion of AI-generated images in predictions.
  • Evaluation Metrics: Accuracy and F1-score

    EfficientNetB5-Swin Metrics:

    metrics
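The custom loss above can be sketched as a Keras Loss subclass that directly implements $\text{Total Loss} = \text{Loss}_1 + \alpha \times \text{Loss}_2$; the class name and the default $\alpha$ and $\beta$ values are illustrative, not the ones used in training.

```python
import tensorflow as tf
from tensorflow import keras

class FairBinaryCrossentropy(keras.losses.Loss):
    """Binary cross-entropy plus an MSE penalty that pushes the mean predicted
    probability of class 1 (AI-generated) towards a target ratio beta."""

    def __init__(self, alpha=1.0, beta=0.5, name="fair_bce"):
        super().__init__(name=name)
        self.alpha = alpha  # weight of the fairness penalty (illustrative default)
        self.beta = beta    # target proportion of AI-generated predictions (illustrative default)
        self.bce = keras.losses.BinaryCrossentropy()

    def call(self, y_true, y_pred):
        loss_1 = self.bce(y_true, y_pred)                       # standard BCE
        loss_2 = tf.square(tf.reduce_mean(y_pred) - self.beta)  # (mean(y_pred) - beta)^2
        return loss_1 + self.alpha * loss_2

# Usage (values are illustrative):
# model.compile(optimizer="adamw", loss=FairBinaryCrossentropy(alpha=1.0, beta=0.5), metrics=["accuracy"])
```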

Inference Configuration:

  • ONNX Conversion: The best model was converted to ONNX format for optimized deployment and faster inference (a conversion sketch follows this list).

  • Model Size Optimization: The exported model was further optimized, reducing its file size to 115 MB.

  • Deployment-Ready: These adjustments ensure efficient resource usage and faster execution in production environments.

  • Visualizing some model predictions (from the validation set):

    Visualizing-Predictions
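A minimal sketch of the ONNX export and inference path, assuming tf2onnx for conversion and ONNX Runtime for serving; the file name, input resolution, and tensor name are illustrative.

```python
import numpy as np
import tensorflow as tf
import tf2onnx
import onnxruntime as ort

# Export the trained Keras model to ONNX (file name and input size are illustrative).
spec = (tf.TensorSpec((None, 384, 384, 3), tf.float32, name="image"),)
tf2onnx.convert.from_keras(model, input_signature=spec, output_path="detector.onnx")

# Inference with ONNX Runtime.
session = ort.InferenceSession("detector.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

def predict(batch: np.ndarray) -> np.ndarray:
    """batch: float32 array of shape (N, 384, 384, 3), already preprocessed."""
    return session.run(None, {input_name: batch})[0]
```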

3️⃣ Web Backend & Deployment

  • Frontend:
    Developed using HTML, CSS, and JavaScript, the interface is clean, interactive, and user-friendly. Users can download their images annotated with the predicted class and confidence for their records. A dedicated history section lets users review all the predictions they have saved.

    Frontend Interface Screenshots

  • Backend:
    Powered by FastAPI, the backend handles all image-upload routes and prediction requests efficiently. It performs the required image preprocessing and model inference, and includes comprehensive error handling to ensure reliable operation in production. It also integrates with AWS services for model storage and containerized deployment (a minimal endpoint sketch appears after this section).

  • Deployment Stack:

    • Docker & Docker Compose: Used to containerize both the backend and frontend.
    • AWS S3: Stores trained model files (see the model-download sketch after this section).
    • AWS ECR: Hosts container images.
    • AWS EC2: Pulls and runs the latest containers and models from ECR and S3.
    • IAM Roles: Employed to securely grant EC2 least-privilege access to S3 and ECR.
    • GitHub Actions: Automates building and pushing of containers to ECR, triggering container updates on EC2.
  • System Architecture Design:

    System Design
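A minimal sketch of an upload-and-predict endpoint in the spirit of the backend described above; the route name, input resolution, preprocessing, and model path are illustrative, and the real service includes more thorough validation and error handling.

```python
import io

import numpy as np
import onnxruntime as ort
from fastapi import FastAPI, File, HTTPException, UploadFile
from PIL import Image

app = FastAPI(title="AI-Generated Images Detection")

# In the deployed service the ONNX model is fetched from S3 at startup; here it is
# simply loaded from a local path (illustrative name).
session = ort.InferenceSession("detector.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

def preprocess(image: Image.Image) -> np.ndarray:
    image = image.convert("RGB").resize((384, 384))
    return np.expand_dims(np.asarray(image, dtype=np.float32), axis=0)

@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    try:
        image = Image.open(io.BytesIO(await file.read()))
    except Exception:
        raise HTTPException(status_code=400, detail="Invalid image file")
    prob = float(session.run(None, {input_name: preprocess(image)})[0].squeeze())
    label = "AI-generated" if prob >= 0.5 else "Human-created"
    return {"label": label, "confidence": round(prob if prob >= 0.5 else 1 - prob, 4)}
```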
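The model files stored on S3 (see the deployment stack above) can be pulled at container startup with boto3; the bucket and key names below are hypothetical.

```python
import boto3

def download_model(bucket: str = "ai-image-detector-models",  # hypothetical bucket name
                   key: str = "detector.onnx",                # hypothetical object key
                   local_path: str = "detector.onnx") -> str:
    """Download the ONNX model from S3 so the API can load it at startup."""
    s3 = boto3.client("s3")
    s3.download_file(bucket, key, local_path)
    return local_path
```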


🔮 Future Work & Limitations

🚧 Limitations

  • Data & Domain Drift:
    The model may underperform in production as generative models evolve rapidly, causing a drift between the training data and newly generated images.

  • Static Model:
    Without updates, the model risks becoming outdated, especially as image realism from AI tools continues to improve.

  • Lack of Interpretability:
    Currently, predictions are made as black-box outputs, with no real built-in explainability for users or developers.

Samples of wrong predictions:

image

🔄 Future Work

  • Continuous Retraining:
    Develop pipelines to frequently retrain the model with the latest AI-generated image data to keep pace with generative trends.

  • User Feedback Loop:
    Incorporate misclassified samples into the training set to enhance the system's robustness over time.

  • Explainable AI Integration:
    Add tools like Grad-CAM to help visualize model decisions and improve trust in predictions.


📌 Summary

This project presents an end-to-end full-stack AI solution to classify AI-generated vs. human-created images using state-of-the-art deep learning models. Our pipeline includes robust data preprocessing and augmentation, training with a custom loss function that addresses model bias, and deployment using AWS services with containerized applications orchestrated via Docker Compose and updated via GitHub Actions.

🏆 Our Kaggle score, which would place us in the top 20 out of more than 550 teams, demonstrates that our approach is both robust and applicable in practical settings.

