---
title: "HR Analytics - Case study"
author: "Amit Choudhary"
date: "8 August 2018"
output: html_document
---
```{r setup, include=FALSE}
library(ggplot2)
library(tidyr)
library(dplyr)
library(purrr)
library(e1071)
library(miscset)
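# stringsAsFactors = TRUE keeps text columns as factors on R >= 4.0,
# where the read.csv default changed to FALSE
hr = read.csv("E://Training data//PGDS- EDA//HR Analytics.csv", stringsAsFactors = TRUE)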
```
# Summary of the data
Let's check whether the data is clean.
```{r}
summary(hr)
```
The summary shows that there are no NAs. The structure below shows that some
variables, such as Education and JobInvolvement, are categorical but stored as
integers, so we convert these variables to factors.
```{r}
str(hr)
```
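Reading NA counts off a long `summary()` printout is easy to get wrong, so a direct count per column can serve as a cross-check. The following is a minimal sketch on a small made-up data frame (`toy_na` and its values are assumptions for illustration, not the HR data); the same two calls could be applied to `hr`.

```{r}
# Toy data frame standing in for `hr`; names and values are illustrative only
toy_na <- data.frame(
  Age = c(41, 49, NA, 33),
  Education = c(2L, 1L, 4L, 3L)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy_na))
na_per_column

# TRUE only when the whole data frame contains no missing values
no_missing <- sum(is.na(toy_na)) == 0
no_missing
```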
Converting these variables to factors:
```{r}
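# Columns that hold categories but should be treated as factors
# (named factor_cols to avoid masking base::names)
factor_cols = c('Attrition','Education','EnvironmentSatisfaction','JobSatisfaction',
                'RelationshipSatisfaction','WorkLifeBalance','JobInvolvement')
# Convert each listed column to a factor, then confirm the new types
hr[,factor_cols] = lapply(hr[,factor_cols], factor)
str(hr)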
```
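The same conversion can also be written with dplyr. The sketch below assumes dplyr 1.0.0 or later (for `across()`) and uses a made-up data frame `toy_fct`; it is an alternative spelling of the step above, not a change to it.

```{r}
library(dplyr)

# Toy data frame; names and values are illustrative only
toy_fct <- data.frame(
  Attrition = c("Yes", "No", "No"),
  Education = c(1L, 3L, 2L),
  Age = c(41, 49, 37)
)

# Columns to treat as categorical
cat_cols <- c("Attrition", "Education")

# Convert only the listed columns to factors; other columns are left untouched
toy_converted <- toy_fct %>%
  mutate(across(all_of(cat_cols), factor))

sapply(toy_converted, class)
```

Using `all_of()` makes the call fail loudly if one of the listed column names is missing from the data, which helps catch typos in the name vector.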
## Univariate analysis: visualizing each numeric variable
```{r}
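# Take the first 25 columns, keep the numeric ones, reshape to long
# format (key = column name, value = observation) and draw one histogram per column
hr[1:25] %>%
  keep(is.numeric) %>%
  gather() %>%
  ggplot(aes(x = value)) +
  geom_histogram(fill = "Red") +
  facet_wrap(~ key, scales = "free")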
```
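`gather()` still works but has been superseded in tidyr by `pivot_longer()`. As a sketch of the same reshape-then-facet idea, assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later, and using a made-up data frame `toy_num`:

```{r}
library(ggplot2)
library(tidyr)
library(dplyr)

# Toy data frame; names and values are illustrative only
toy_num <- data.frame(
  Age = c(41, 49, 37, 33, 27, 32),
  MonthlyIncome = c(5993, 5130, 2090, 2909, 3468, 3068),
  Department = c("Sales", "R&D", "R&D", "R&D", "R&D", "Sales")
)

# Keep the numeric columns and stack them into key/value pairs
toy_long <- toy_num %>%
  select(where(is.numeric)) %>%
  pivot_longer(everything(), names_to = "key", values_to = "value")

# One histogram panel per original column, each with its own axis scales
p_hist <- ggplot(toy_long, aes(x = value)) +
  geom_histogram(bins = 30, fill = "red") +
  facet_wrap(~ key, scales = "free")
p_hist
```

Setting `bins` explicitly also avoids the message ggplot2 prints when the bin count is left at its default.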
## Visualizing each categorical variable
```{r}
# Names of all factor columns and of all numeric columns
hrcat = names(which(sapply(hr,is.factor)))
hrnum = names(which(sapply(hr,is.numeric)))
```
Let's visualize the categorical variables, starting with columns 1 to 5:
```{r}
ggplotGrid(ncol = 3, lapply(hrcat[1:5], function(x) {
  ggplot(hr, aes_string(x)) +
    geom_bar(aes(fill = Attrition)) +
    theme_bw() +
    theme(axis.text.x = element_text(size = 10, angle = 90, hjust = 1, vjust = 1))
}))
```
Columns 6 to 10:
```{r}
ggplotGrid(ncol = 3, lapply(hrcat[6:10], function(x) {
  ggplot(hr, aes_string(x)) +
    geom_bar(aes(fill = Attrition)) +
    theme_bw() +
    theme(axis.text.x = element_text(size = 10, angle = 45, hjust = 1, vjust = 1))
}))
```
Columns 11 to 13:
```{r}
ggplotGrid(ncol = 3, lapply(hrcat[11:13], function(x) {
  ggplot(hr, aes_string(x)) +
    geom_bar(aes(fill = Attrition)) +
    theme_bw() +
    theme(axis.text.x = element_text(size = 10, angle = 45, hjust = 1, vjust = 1))
}))
```
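The three chunks above differ only in which columns they plot and in the label angle, so the plotting step could be wrapped in a small function. The sketch below is one possible way to do that: `plot_categorical()` is a hypothetical helper, `toy_cat` is made-up data, and the `.data` pronoun (ggplot2 3.0.0 or later) is swapped in for `aes_string()`.

```{r}
library(ggplot2)

# Hypothetical helper: returns one stacked bar chart per column name in `cols`,
# with bars filled by the column named in `fill_col`
plot_categorical <- function(data, cols, fill_col = "Attrition", angle = 45) {
  lapply(cols, function(col) {
    ggplot(data, aes(x = .data[[col]], fill = .data[[fill_col]])) +
      geom_bar() +
      labs(x = col, fill = fill_col) +
      theme_bw() +
      theme(axis.text.x = element_text(size = 10, angle = angle, hjust = 1, vjust = 1))
  })
}

# Toy data frame; names and values are illustrative only
toy_cat <- data.frame(
  Attrition = factor(c("Yes", "No", "No", "Yes")),
  Gender = factor(c("Female", "Male", "Male", "Female")),
  OverTime = factor(c("Yes", "No", "Yes", "No"))
)

toy_plots <- plot_categorical(toy_cat, c("Gender", "OverTime"))
length(toy_plots)
```

The returned list has the same shape as the `lapply()` results used above, so it could be passed to `ggplotGrid()` in the same way.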