DS RPC 2.0 : ADEGuard – AI-Powered Adverse Drug Event Detection and Severity Mapping

This is the starter repository for Codebasics's Resume Project Challenge 2.0.

This project focuses on building an AI-powered pipeline to detect Adverse Drug Events (ADEs) from symptom text, group them into symptom-based and age-specific clusters, and classify each event by severity.

Please fork this repository to get started.

📂 Data Access Instructions

Contestants will use the VAERS dataset provided by the U.S. Vaccine Adverse Event Reporting System.

1. Visit the official VAERS Data page:
   👉 https://vaers.hhs.gov/data/datasets.html
2. Scroll to the table listing data by year.
3. Download the ZIP file for each target year from the "Zip File" column.
   - Example: For 2025, click the link in the Zip File column (the link is labeled with the file size, e.g., 4.95 MB).
   - The ZIP will contain three CSV files:
     - VAERSDATA.csv → Main case and patient data
     - VAERSSYMPTOMS.csv → Coded adverse event terms using the MedDRA (Medical Dictionary for Regulatory Activities) terminology.
       - Each row holds up to five coded symptoms (SYMPTOM1–SYMPTOM5), each a standardized MedDRA Preferred Term.
     - VAERSVAX.csv → Vaccine/product details
4. Extract the ZIP files for all target years, and move all three CSV files from each ZIP into the data/raw folder of this repository.
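
Once the CSV files are in data/raw, a quick sanity check is to collect the coded symptoms per report. The snippet below is a minimal sketch, not part of the starter code. It assumes the symptoms files carry a `VAERS_ID` column (per the public VAERS schema), that file names may be prefixed with the year (hence the `*` in the glob pattern), and that the files decode as Latin-1. The helper names `read_rows` and `symptoms_by_report` are hypothetical.

```python
import csv
from collections import defaultdict
from pathlib import Path

RAW_DIR = Path("data/raw")  # folder named in the instructions above


def read_rows(pattern, raw_dir=RAW_DIR, encoding="latin-1"):
    """Yield dict rows from every CSV in raw_dir whose name matches pattern.

    The Latin-1 encoding is an assumption; change it if decoding looks wrong.
    """
    for path in sorted(Path(raw_dir).glob(pattern)):
        with open(path, newline="", encoding=encoding) as fh:
            yield from csv.DictReader(fh)


def symptoms_by_report(raw_dir=RAW_DIR):
    """Map each VAERS_ID to its non-empty SYMPTOM1..SYMPTOM5 terms."""
    terms = defaultdict(list)
    # The leading * also matches year-prefixed names such as 2025VAERSSYMPTOMS.csv.
    for row in read_rows("*VAERSSYMPTOMS.csv", raw_dir):
        for i in range(1, 6):
            term = (row.get(f"SYMPTOM{i}") or "").strip()
            if term:
                # A report may span several rows, so keep appending.
                terms[row["VAERS_ID"]].append(term)
    return dict(terms)
```

Calling `symptoms_by_report()` with no arguments reads from data/raw. The same `read_rows` helper can be pointed at `*VAERSDATA.csv` or `*VAERSVAX.csv` to load the other two files.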

📝 Annotation Guidelines

Before starting annotation or model training, review the Annotation Guidelines in the docs/ folder.
They explain in detail:

- ADE annotation rules – how to identify Adverse Drug Events in text, including what to include and what to skip.
- DRUG annotation rules – how to label vaccine or drug mentions exactly as reported, and how to handle brand names, code names, and generic terms.
- Special cases – rules for compound symptoms, repeated mentions, death/hospitalization references, and COVID-19 mentions.
- Span formatting – keeping the longest medically accurate terms, excluding durations, and labeling each occurrence separately.
- Quick checklist – a step-by-step reminder to ensure annotations are consistent and compliant.

📌 Tip: Following these rules strictly ensures the labels are high quality and consistent, which is critical for training the NER model effectively.
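
To show how annotated spans might feed an NER model, here is a minimal sketch that converts character-offset spans into token-level tags. Only the ADE and DRUG label names come from this README. The BIO tagging scheme, the whitespace tokenization, the example sentence and its spans, and the helper names `find_span` and `spans_to_bio` are assumptions for illustration. The guidelines in docs/ remain the authority on what to label.

```python
import re


def find_span(text, phrase, label):
    """Return (start, end, label) for the first occurrence of phrase in text."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


def spans_to_bio(text, spans):
    """Convert character spans (end exclusive) to BIO tags over whitespace tokens."""
    tokens, tags = [], []
    for m in re.finditer(r"\S+", text):
        tag = "O"
        for start, end, label in spans:
            # A token belongs to a span if the two character ranges overlap.
            if m.start() < end and m.end() > start:
                # The token covering the span's first character opens the entity.
                prefix = "B-" if m.start() <= start else "I-"
                tag = prefix + label
                break
        tokens.append(m.group())
        tags.append(tag)
    return tokens, tags


# Made-up example sentence and spans, not taken from the guidelines.
text = "Patient received Pfizer vaccine and developed severe headache and fever."
spans = [
    find_span(text, "Pfizer", "DRUG"),
    find_span(text, "severe headache", "ADE"),
    find_span(text, "fever", "ADE"),
]
tokens, tags = spans_to_bio(text, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Because each occurrence is passed as its own span, repeated mentions are tagged separately, which matches the "labeling each occurrence separately" rule. Trailing punctuation stays attached to the token under this simple tokenizer; a model-specific tokenizer would need its own alignment step.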

📌 Learn More

Visit the challenge page to learn more: DS RPC-2.0
