This repository contains a training and inference pipeline built around RetinaFace (for face detection) and a MobileNetV2-based Mask Classifier trained from scratch on curated datasets (RMFD, MAFA, CMFD, and custom images).
It powers a real-time masked face detection system.
- RetinaFace for robust face detection (supports ResNet50 & MobileNet backbones)
- MobileNetV2 classifier trained for Mask vs. No Mask, with 99.3% accuracy on the curated test set
- Supports real-time webcam inference
- Seamlessly integrates with facial recognition backend
- Modular & extensible for future fine-tuning
# Create & activate virtual environment
python -m venv venv_train
venv_train\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt

Use the CUDA-enabled PyTorch build matching your GPU.
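To confirm that the installed PyTorch build can actually see your GPU, a quick check along these lines may help. This is a minimal sketch: the helper name `describe_torch_device` is hypothetical and not part of this repository, and it only relies on the standard `torch.cuda` calls.

```python
def describe_torch_device():
    """Summarize whether the installed PyTorch build can use a CUDA GPU."""
    try:
        import torch  # imported lazily so the check also works before installation
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA version the wheel was built against
        return f"CUDA build {torch.version.cuda}, GPU: {torch.cuda.get_device_name(0)}"
    # torch.version.cuda is None for CPU-only builds
    return f"No usable GPU found (torch {torch.__version__}, CUDA build: {torch.version.cuda})"


if __name__ == "__main__":
    print(describe_torch_device())
```

If the message reports a CPU-only build, reinstall PyTorch with the wheel that matches your CUDA version.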
The dataset is prepared and filtered from several real-world datasets (RMFD, CMFD, MAFA) plus custom real masked-face images. Structure your dataset as follows:
data/Dataset/
├── train/
│ ├── Mask/
│ └── No_mask/
├── val/
│ ├── Mask/
│ └── No_mask/
└── test/
  ├── Mask/
  └── No_mask/
Ensure a balanced dataset (~9k Mask / 9k No Mask).
Supports custom datasets for real-world adaptation.
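Before training, it can be useful to verify the folder layout and class balance described above. The following is a minimal sketch using only the standard library; the function names, the accepted file extensions, and the balance metric are illustrative assumptions, not part of the repository's scripts.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of accepted formats
SPLITS = ("train", "val", "test")
CLASSES = ("Mask", "No_mask")


def count_images(root):
    """Return {split: {class_name: image_count}} for the layout shown above."""
    root = Path(root)
    counts = {}
    for split in SPLITS:
        counts[split] = {}
        for class_name in CLASSES:
            class_dir = root / split / class_name
            if class_dir.is_dir():
                n = sum(
                    1
                    for p in class_dir.iterdir()
                    if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
                )
            else:
                n = 0  # a missing folder is reported as empty
            counts[split][class_name] = n
    return counts


def imbalance_ratio(class_counts):
    """Largest class count divided by the smallest; 1.0 means perfectly balanced."""
    smallest = min(class_counts.values())
    largest = max(class_counts.values())
    return float("inf") if smallest == 0 else largest / smallest


if __name__ == "__main__":
    for split, class_counts in count_images("data/Dataset").items():
        print(split, class_counts, f"ratio={imbalance_ratio(class_counts):.2f}")
```

A ratio close to 1.0 for each split matches the roughly 9k / 9k balance recommended above.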
python train.py

train.py uses:
- MobileNetV2 base network
- Adam optimizer
- CrossEntropy loss
- Early stopping based on validation accuracy
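Early stopping on validation accuracy can be sketched in plain Python as follows. This is not the code from `train.py`: the class name, the `patience` and `min_delta` parameters, and their default values are assumptions chosen for illustration.

```python
class EarlyStopping:
    """Stop training when validation accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # assumed value; not taken from train.py
        self.min_delta = min_delta    # smallest gain that counts as an improvement
        self.best_accuracy = float("-inf")
        self.epochs_without_improvement = 0

    def step(self, val_accuracy):
        """Record one epoch. Returns (is_best, should_stop)."""
        if val_accuracy > self.best_accuracy + self.min_delta:
            self.best_accuracy = val_accuracy
            self.epochs_without_improvement = 0
            return True, False
        self.epochs_without_improvement += 1
        return False, self.epochs_without_improvement >= self.patience


if __name__ == "__main__":
    stopper = EarlyStopping(patience=2)
    for epoch, acc in enumerate([0.91, 0.95, 0.94, 0.95, 0.96], start=1):
        is_best, should_stop = stopper.step(acc)
        print(f"epoch {epoch}: val_acc={acc:.2f} best={is_best} stop={should_stop}")
        if should_stop:
            break
```

In a training loop, the `is_best` flag is a natural point to write a "best" checkpoint, while a "last" checkpoint would be written every epoch regardless.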
Checkpoints are saved in checkpoints/ as:
- mobilenet_mask_best.pth.tar
- mobilenet_mask_last.pth.tar
All trained weights are available here.
Users must have Git LFS installed to download the large weight files.

Evaluate your trained model:
python test_script.py

Example output:
====== TEST REPORT ======
Total images: 1849
Mask: 924 (49.97%)
No Mask: 925 (50.03%)
Skipped: 0 (0.00%)
==========================
Overall Accuracy: 99.3%
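The percentages in a report like this follow directly from the raw counts. The sketch below shows one way they could be computed; it is not the code from `test_script.py`, the function names are hypothetical, and it assumes skipped images are included in the total.

```python
def format_report(mask, no_mask, skipped=0):
    """Build the class-breakdown part of the report from raw image counts."""
    total = mask + no_mask + skipped  # assumption: skipped images count toward the total

    def pct(n):
        return 100.0 * n / total if total else 0.0

    return "\n".join([
        "====== TEST REPORT ======",
        f"Total images: {total}",
        f"Mask: {mask} ({pct(mask):.2f}%)",
        f"No Mask: {no_mask} ({pct(no_mask):.2f}%)",
        f"Skipped: {skipped} ({pct(skipped):.2f}%)",
        "==========================",
    ])


def accuracy_percent(predictions, labels):
    """Percentage of predictions that equal their ground-truth label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(p == t for p, t in zip(predictions, labels))
    return 100.0 * correct / len(labels)


if __name__ == "__main__":
    print(format_report(mask=924, no_mask=925, skipped=0))
```

Calling `format_report(924, 925, 0)` reproduces the breakdown lines of the example output above.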
Run real-time webcam detection:
python mask_detector_webcam_test.py

- Detects multiple faces in real time
- Classifies mask status using your trained MobileNet
- Overlays bounding boxes with confidence scores
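The confidence score drawn next to each box can be derived from the classifier output. The sketch below assumes the classifier returns two raw logits per face and that index 0 corresponds to "Mask"; both the class order and the helper names are assumptions, so check them against your own training labels.

```python
import math

CLASS_NAMES = ("Mask", "No Mask")  # assumed class order; verify against your dataset


def softmax(logits):
    """Convert raw logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def overlay_label(logits):
    """Return (class_name, confidence, text) for one detected face."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    name, confidence = CLASS_NAMES[best], probs[best]
    return name, confidence, f"{name}: {confidence:.2f}"


if __name__ == "__main__":
    print(overlay_label([2.0, 0.0])[2])
```

The returned text is what would be passed to a drawing call next to the bounding box.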
The trained model integrates seamlessly with the Face Recognition System, where:
- RetinaFace detects faces in real-time video or CCTV frames.
- The MobileNet classifier determines if the face is masked.
- Embeddings are generated.
- Matches are retrieved from the database.
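The final matching step can be illustrated with a small, dependency-free sketch. The README does not state which similarity measure or threshold the recognition backend uses, so cosine similarity, the `threshold` value, and the function names here are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, database, threshold=0.5):
    """Return (identity, score); identity is None when no entry reaches the threshold."""
    if not database:
        return None, 0.0
    scores = {name: cosine_similarity(query, emb) for name, emb in database.items()}
    name = max(scores, key=scores.get)
    score = scores[name]
    return (name, score) if score >= threshold else (None, score)


if __name__ == "__main__":
    # Toy 3-dimensional embeddings; real face embeddings are much longer.
    database = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
    print(best_match([0.9, 0.1, 0.0], database))
```

Returning `None` below the threshold lets the caller treat the face as unknown instead of forcing a match.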
| Component | Backbone | Accuracy | FPS | Notes |
|---|---|---|---|---|
| RetinaFace Detector | MobileNet0.25 | 97.8% | 30 | Fast & lightweight |
| Mask Classifier (ours) | MobileNetV2 | 99.3% | 28 | Custom trained |
| Combined (End-to-End) | RetinaFace + MobileNet | 98.9% | 25 | Optimized for real-time |
While public repos like [Face-Mask-Detection] exist, this model was custom-trained because:
- It uses real forensic and surveillance-grade images.
- It integrates mask detection into the recognition flow.
- It is optimized to balance speed and accuracy with MobileNet.
- It gives full control over architecture, logging, and checkpoints.
- Licensed under the MIT License.
- Based on the open-source RetinaFace project.
- Datasets used: RMFD, MAFA, CMFD, Custom Surveillance Dataset.
- All images, datasets, and models used here belong to their respective owners.
- MIT © Joel Thomas
You can find our Code of Conduct here.
👨‍💻 Joel Thomas
- 💻 GitHub
- 📧 Email: [email protected]
Pull requests and issues are welcome.
You can contribute by improving dataset balance scripts, fine-tuning on other backbones, or optimizing for embedded systems.
“Built with purpose — precision and performance for real-world recognition.”



