
Analysis of Stack Overflow 2018 Developer Survey data to explore the differences between data scientists and non-data scientists.


gkhayes/ds_survey_analysis


Data Scientist Survey Analysis - README

Installation

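The code was written in Python 3 and requires the following third-party packages: pandas, NumPy, Matplotlib, Seaborn and SciPy. It also uses the collections and warnings modules, which are part of the Python standard library and need no separate installation.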
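Before opening the notebook, it can help to confirm that these packages are importable. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and the import names listed are the usual ones for each package.

```python
from importlib.util import find_spec

# Import names for the packages listed above (the usual lowercase module names).
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "scipy",
            "collections", "warnings"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

Anything reported as missing can then be installed with the package manager of your choice.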

Project Motivation

This analysis explores how data scientists compare with other software developers ("non-data scientists") with regard to demographics, programming languages used, coding experience and job satisfaction. Using data collected by Stack Overflow as part of its 2018 Annual Developer Survey, I set out to answer the following questions:

  1. How does the demographic profile of data scientists differ from that of non-data scientists?
  2. What programming languages do data scientists favour and how do they differ from those used by non-data scientists?
  3. How much coding experience do data scientists have compared to non-data scientists?
  4. Are data scientists more satisfied with their jobs/careers than non-data scientists?

File Descriptions

All analysis is contained in the Jupyter notebook DS Survey Analysis.ipynb.

To run this code, first download the 2018 Stack Overflow Developer Survey dataset from https://insights.stackoverflow.com/survey. Then place the downloaded folder (developer_survey_2018) inside a folder named "Data" in the current working directory, so that the data sits at Data/developer_survey_2018.
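The expected folder layout can be checked with a short script such as the one below. This is a sketch under stated assumptions rather than the notebook's own loading code: the results file name `survey_results_public.csv` is assumed and should be verified against the downloaded folder, and the helpers `survey_csv_path` and `load_survey` are hypothetical names introduced for illustration.

```python
from pathlib import Path

DATA_DIR = "Data"                           # folder named in this README
SURVEY_DIR = "developer_survey_2018"        # folder name from the download
RESULTS_FILE = "survey_results_public.csv"  # ASSUMED file name; verify locally


def survey_csv_path(base=None):
    """Build the expected path to the survey results file.

    `base` defaults to the current working directory.
    """
    base = Path.cwd() if base is None else Path(base)
    return base / DATA_DIR / SURVEY_DIR / RESULTS_FILE


def load_survey(base=None):
    """Load the survey results into a DataFrame, failing early if absent."""
    path = survey_csv_path(base)
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected survey data at {path}; see the download steps above."
        )
    import pandas as pd  # imported here so the path check works without pandas
    # low_memory=False reads the file in one pass to avoid mixed-type warnings.
    return pd.read_csv(path, low_memory=False)
```

Calling `load_survey()` from the directory that contains the notebook either returns the data or raises an error naming the path it looked for.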

Results

The main findings of this analysis are summarised in a blog post available here.

Licensing, Authors, Acknowledgements

The dataset used in this analysis was created by Stack Overflow and made available for download under the Open Database License (ODbL).

The code contained in this repository may be used freely with acknowledgement.
