berkayalan/data-science-tutorials


Data Science Basics, Tutorials and Functions

Python Basics

Introduction to Python

  - Data Types
  - Built-in Functions
  - Type Conversion
  - Getting Input from Users

Data Structures

  - Lists
  - Tuples
  - Dictionaries
  - Sets

Conditional Statements

  - Boolean Expressions
  - Logical Operators
  - If-Else
  - Grade System User Interaction Example
  - Nested If
  - Odd or Even Example

Loops

  - range()
  - In Operator
  - For Loop
  - Iterating in Strings
  - Iterating over Two-Dimensional Lists
  - continue
  - break
  - zip()
  - Iteration in a Dictionary
  - Iterating pair values
  - While Loop
  - While True

Functions

  - Intro to Functions
  - return Statement
  - Number of Arguments
  - Arbitrary Arguments, *args
  - Arbitrary Keyword Arguments, **kwargs
  - Giving output with Information
  - Functions that have 2 parameters
  - Predefined Parameters in Functions
  - Local and Global Variables
  - Changing global variables in local area
  - Pass Statement

Nested Functions

Object Oriented Programming

  - What is object oriented programming?
  - Defining Classes
  - Instantiation - Creating objects
  - Class and Instance Attributes
  - Instance(Object) Methods
  - Inheritance
  - Overriding - Extending the Functionality of a Parent Class
  - super() function

NumPy

  - What is NumPy?
  - Importing NumPy
  - NumPy Arrays and Dimensions
  - Creating NumPy Arrays
      - Zero arrays
      - Ones arrays
      - Full arrays
      - Identity Matrices
      - Linear Series
      - Random Distribution Arrays
  - Array Indexing
  - Subsets
  - reshape() function
  - Flattening the Arrays
  - Concatenation
  - Splitting
  - Sorting
  - Broadcasting
  - Array Math
  - Dot(Scalar) Product

Pandas

   - What is Pandas?
   - Importing Pandas Library
   - Pandas Series
   - Pandas Dataframes
   - Filtering
   - Adding/Removing rows and columns
   - Merging Dataframes
   - Sorting
   - Aggregation Functions
   - Grouping
   - Apply
   - Pivot Tables
   - Missing values(NaN)
   - Working with External Files in Pandas (CSV, Excel)
   - Exploring Netflix Dataset(basic)

Data Preprocessing-Cleaning

  - Data Cleaning / Cleansing
        - Noisy Data
        - Missing Data Analysis
        - Outlier Detection
  - Data Standardization / Feature Scaling
        - Normalization(0-1 Scaling)
        - Standardization(Z Score Scaling)
        - Min-Max Scaling
        - Binary Transformation
  - Variable Transformation
        - Label Encoding
        - One Hot Encoding

Data Visualization

   - Main Libraries for Data Visualization
   - What is Exploratory Data Analysis (EDA)?
   - Importing Libraries
   - Matplotlib
       - Pyplot
       - Line Plot
       - Bar Plot
       - Pie Chart
       - Stack Plot
       - Histograms
       - Scatter Plot
       - Time Series Plotting
       - Box Plot 
       - Heatmap
   
   - Seaborn
       - Pyplot
       - Line Plot
       - Bar Plot
       - Cat Plot
       - Histograms
       - Density Plots
       - Pair Plot
       - Scatter Plot
       - Time Series Plotting
       - Box Plot
       - Heatmap
       - Multi-plot Grids
      
   - Pandas
       - Basic Plots
       - Bar Plots
       - Histograms
       - Box Plots
       - Area Plots
       - Scatter Plots
       - Hexagonal Bin Plots
       - Pie Plots
       - Plotting Tools
   
   - Plotnine - ggplot
       - Line Plot
       - Bar Plot
       - Scatter Plot
       - Histograms
       - Density Plot
       - Box Plot
       - Violin Plot
   
   - Plotly
       - Line Plot
       - Bar Plot
       - Pie Charts
       - Bubble Charts
       - Scatter Plots
       - Filled area Plots
       - Gantt Charts
       - Sunburst Charts
       - Tables

Linear Methods for Regression

  - What is Linear Regression?
  - Simple Linear Regression (Theory - Model- Tuning)
  - Multiple Linear Regression (Theory - Model- Tuning)
  - Least-Squares Regression(Ordinary Least Squares) (Theory - Model- Tuning)
  - Principal Component Analysis (PCA) 
  - Principal component regression(PCR) (Theory - Model- Tuning)
  - Shrinkage(Regularization) Methods
      - Partial Least Squares (Theory - Model- Tuning)
      - Ridge Regression(L2 Regularization) (Theory - Model- Tuning)
      - Lasso Regression(L1 Regularization) (Theory - Model- Tuning)
      - Elastic Net Regression (Theory - Model- Tuning)

Non-Linear Models for Regression

  - K - Nearest Neighbors(KNN) (Theory - Model- Tuning)
  - Support Vector Regression(SVR) (Theory - Model- Tuning)
  - Non-Linear Support Vector Regression(SVR) (Theory - Model- Tuning)
  - Regression(Decision) Trees (CART) (Theory - Model- Tuning)
  - Ensemble Learning - Bagged Trees(Bagging) (Theory - Model- Tuning)
  - Ensemble Learning - Random Forests (Theory - Model- Tuning)
  - Gradient Boosting Machines(GBM)  (Theory - Model- Tuning)
  - Light Gradient Boosting Machines(LGBM)  (Theory - Model- Tuning)
  - XGBoost(Extreme Gradient Boosting)  (Theory - Model- Tuning)
  - CatBoost  (Theory - Model- Tuning)

Unsupervised Learning - Clustering - Principal Components Analysis(PCA)

  - Clustering
  - K-Means Clustering (Theory - Exploratory Data Analysis - Preprocessing - Model- Tuning)
  - Color - Image Quantization
  - Hierarchical Clustering (Theory - Model)
  - DBSCAN (Density-based spatial clustering) (Theory - Model- Tuning)
  - Principal Components Analysis(PCA) (Theory - Manual Implementation of PCA - Model)   

Classification

  - Classification and Evaluation Metrics
  - Logistic Regression (Theory - Model- Tuning)
  - K - Nearest Neighbors(KNN) (Theory - Model- Tuning)
  - Support Vector Machines(SVC) - Linear Kernel (Theory - Model- Tuning)
  - Support Vector Machines(SVC) - Radial Basis Kernel (Theory - Model- Tuning)
  - Decision Tree Classification (Theory - Model- Tuning)
  - Ensemble Learning - Random Forests Classification (Theory - Model- Tuning)
  - Naive Bayes Classification (Theory - Model)
  - GBM(Gradient Boosting Machines) Classification (Model- Tuning)
  - XGBoost(Extreme Gradient Boosting) Classification (Theory - Model- Tuning)
  - LGBM(Light Gradient Boosting Machines) Classification (Theory - Model- Tuning)

Deep Learning with PyTorch

  - What is PyTorch?
  - Importing Libraries
  - Basics of PyTorch
  - Tensors
  - Math Operations
  - Common Functions
  - Variables - Autograd
  - Datasets & DataLoaders
  - Common Modules: Optim - nn
  - Extra - Useful Resources

Model Deployment

  - What is Joblib Library?
  - Artificial Neural Networks(ANN) Model
  - Prediction
  - Model Tuning & Validation
  - Saving Model as pickle file
  - Loading Model

Natural Language Processing

  - NLP Intuition
  - String Essentials : Creating String
  - String Essentials : Querying Types
  - String Essentials : Accessing by Index
  - String Essentials : First and last characters
  - String Essentials : Splitting Characters
  - String Essentials : Case Conversions in String
  - String Essentials : Capitalizing and titles
  - String Essentials : Cropping Characters
  - String Essentials : Joining Strings
  - String Essentials : Replacing Characters
  - String Essentials : contains
  - Text Preprocessing : Converting string to other data types
  - Text Preprocessing : Case Conversion
  - Text Preprocessing : Handling Punctuation
  - Text Preprocessing : Handling Numbers
  - Text Preprocessing : Handling Stopwords
  - Text Preprocessing : Handling Frequencies
  - Text Preprocessing : Tokenization
  - Text Preprocessing : Stemming
  - Text Preprocessing : Lemmatization
  - Object Standardization
  - Linguistic Features : N-Gram
  - Linguistic Features : Part of speech tagging (POS)
  - Linguistic Features : Chunking(Shallow Parsing)
  - Linguistic Features : Noun Chunks
  - Linguistic Features : Named Entity Recognition(NER)
  - Linguistic Features : Visualization in spaCy
  - Text Feature Engineering 
  - Bag of Words
  - Text Visualisation : Bar Plot
  - Text Visualisation : Frequency Visualisation
  - Text Visualisation : WordCloud
  - Transformers, Encoders and Decoders
  - Different Models : BERT, HuggingFace, StanfordNLP, NLTK, LSTM, etc.
  - Sentiment Analysis with Logistic Regression
  - Sentiment Analysis with Naive Bayes
  - Vector Space Models
  - Neural Machine Translation
  - Text Summarization
  - Classification with BERT

Spark

  - Spark Basics
  - MLlib