Data science project using Stack Overflow Developer Survey 2022 dataset. The goal of this project is to uncover what Data Scientists do, the languages they use, how education links to salaries, which languages are most sought after, and the issue of gender-based pay gaps.

sadiaTab/stackoverflow_survey_analysis

Stack Overflow Annual Developer Survey Data Analysis

A blog post about this analysis can be found here

Project Overview

Founded in 2008, Stack Overflow has evolved into a vital resource for developers worldwide, providing a platform for learning, knowledge sharing, collaboration, and career development. Every year, Stack Overflow conducts a large global Developer Survey, collecting insights from tens of thousands of developers. The survey data is openly available and is a valuable resource for in-depth analysis of real-world questions and challenges.

In this project, we focus on analyzing the 2022 Stack Overflow Developer Survey dataset. The 2022 survey gathered responses from over 70,000 developers, shedding light on how developers learn, the tools they use, and their preferences and demographics.

Data Source

You can access the survey data in CSV format for each annual developer survey conducted since 2011 from the following link:

Stack Overflow Developer Surveys
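As a minimal sketch of how the downloaded CSV might be loaded, the snippet below reads only the columns relevant to the research questions. The column names (`DevType`, `LanguageHaveWorkedWith`, `LanguageWantToWorkWith`, `EdLevel`, `Gender`, `ConvertedCompYearly`) are assumptions based on the 2022 public results file and should be checked against the schema file shipped with the download. The helper `load_survey` and the inline sample rows are hypothetical and stand in for the real file.

```python
import io

import pandas as pd

# Assumed column names for the 2022 survey; verify against the published schema.
COLUMNS = [
    "DevType",
    "LanguageHaveWorkedWith",
    "LanguageWantToWorkWith",
    "EdLevel",
    "Gender",
    "ConvertedCompYearly",
]


def load_survey(source, columns=COLUMNS):
    """Read only the columns of interest from a survey CSV (path or file-like)."""
    return pd.read_csv(source, usecols=columns)


# Hypothetical three-row sample standing in for the real download.
sample_csv = io.StringIO(
    "ResponseId,DevType,LanguageHaveWorkedWith,LanguageWantToWorkWith,"
    "EdLevel,Gender,ConvertedCompYearly\n"
    "1,Data scientist or machine learning specialist;Academic researcher,"
    "Python;SQL,Python;Rust,Masters,Woman,90000\n"
    "2,Data scientist or machine learning specialist,Python;R,Python,"
    "Bachelors,Man,80000\n"
    "3,Academic researcher,C++,Rust,Bachelors,Man,\n"
)

df = load_survey(sample_csv)
print(df.shape)
print(df.dtypes)
```

With the real data, pass the path to the extracted CSV instead of `sample_csv`. Restricting `usecols` keeps memory use low, since the full file has many more columns than this analysis needs.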

Additional insights provided by Stack Overflow for the 2022 survey can be found here:

Stack Overflow Survey 2022 Insights

Research Questions

In this data science project, we aim to address several research questions:

  1. What additional responsibilities do Data Scientists commonly take on in their current positions?

  2. Which programming languages are most frequently utilised by Data Scientists?

  3. Which programming languages do Data Scientists want to work with over the next year?

  4. Does holding a higher degree correlate with earning a higher salary?

  5. Is there a gender-based salary disparity among Data Scientists, with male Data Scientists earning higher salaries than their female counterparts?
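The questions above can be approached with a few small pandas helpers. The following is an illustrative sketch, not necessarily the method used in the notebook. It assumes that multi-select answers are stored as semicolon-delimited strings, that the role label is `Data scientist or machine learning specialist`, and that yearly salary is in `ConvertedCompYearly`. All function names and the toy data are hypothetical.

```python
import pandas as pd

# Assumed label for the role of interest; check the values in DevType.
DS_LABEL = "Data scientist or machine learning specialist"


def data_scientists(df):
    """Keep respondents whose DevType list includes the data scientist role."""
    roles = df["DevType"].fillna("").str.split(";")
    mask = roles.apply(lambda r: DS_LABEL in r)
    return df[mask]


def count_choices(series):
    """Count individual options in a semicolon-delimited multi-select column."""
    return series.dropna().str.split(";").explode().str.strip().value_counts()


def median_salary_by(df, group_col, salary_col="ConvertedCompYearly"):
    """Median salary per group, ignoring respondents without a salary."""
    known = df.dropna(subset=[salary_col])
    return known.groupby(group_col)[salary_col].median().sort_values(ascending=False)


# Toy data for illustration only; values are made up.
toy = pd.DataFrame(
    {
        "DevType": [
            DS_LABEL + ";Academic researcher",
            DS_LABEL,
            "Academic researcher",
            None,
        ],
        "LanguageHaveWorkedWith": ["Python;SQL", "Python;R", "C++", "Go"],
        "EdLevel": ["Masters", "Bachelors", "Bachelors", "Masters"],
        "ConvertedCompYearly": [90000.0, 80000.0, 60000.0, None],
    }
)

ds = data_scientists(toy)
other_roles = count_choices(ds["DevType"]).drop(DS_LABEL)  # question 1
languages = count_choices(ds["LanguageHaveWorkedWith"])    # question 2
salary_by_degree = median_salary_by(ds, "EdLevel")         # question 4

print(other_roles)
print(languages)
print(salary_by_degree)
```

The same `count_choices` call on `LanguageWantToWorkWith` would cover question 3, and `median_salary_by(ds, "Gender")` would give a first look at question 5. These are descriptive comparisons of medians and do not by themselves establish a causal link between degree or gender and salary.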

Getting Started

  1. Clone this repository to your local machine:

    git clone https://github.com/sadiaTab/stackoverflow_survey_analysis.git
  2. Navigate to the project directory:

    cd stackoverflow_survey_analysis
  3. Explore the Jupyter notebook stackoverflow-survey-analysis.ipynb to follow the analysis process.

  4. Review the final reports and visualizations in the "Reports" directory for the project's key findings.

Programming Language

Python 3.11.5

Dependencies

To run the Jupyter notebooks and scripts in this project, you may need the following Python libraries:

  • pandas
  • matplotlib
  • seaborn
  • numpy

You can install these dependencies using pip:

pip install pandas matplotlib seaborn numpy 
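To confirm that the dependencies are available before opening the notebook, a short check like the one below may help. It is a sketch using only the standard library, and the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata

# Packages listed in the Dependencies section.
REQUIRED = ["pandas", "matplotlib", "seaborn", "numpy"]


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions()
for name, version in report.items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as not installed can be added with the pip command above.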
