A collection of data analysis and machine learning projects across various datasets. Explore predictive modeling, data visualization, and insights from real-world data. Projects include sales predictions, disease detection, customer segmentation, and more.

saksham-jain177/Data-Analysis

Data Analysis Projects

Welcome to my repository of Data Analysis projects! This repository contains a series of notebooks demonstrating different data analysis and machine learning tasks. Each project focuses on a unique dataset and problem statement, showcasing various analytical and predictive techniques.

Table of Contents

  1. Swiggy Restaurants Data Analysis
  2. GeeksforGeeks Data Analysis
  3. CarDekho Used Car Price Analysis
  4. Sonar Mine Prediction
  5. Big Mart Sales Prediction
  6. California House Price Prediction
  7. CarDekho Car Price EDA
  8. Credit Card Fraud Detection
  9. Customer Segmentation Using K-Means
  10. Fake News Prediction
  11. Gold Price Prediction
  12. Heart Disease Prediction
  13. House Prices: Advanced Regression Techniques
  14. Loan Eligibility Prediction
  15. Parkinson's Disease Detection
  16. Spam Mail Prediction
  17. Used Medical Insurance Prediction

Swiggy Restaurants Data Analysis

Description: This project involves analyzing restaurant data from the Swiggy food delivery platform. Key aspects include:

  • Data Collection: Access data on restaurant names, cuisines, ratings, reviews, delivery times, and locations.
  • Data Cleansing and Preparation: Clean and preprocess the data for analysis.
  • Restaurant Performance Analysis: Calculate average ratings, review counts, and identify high-performing restaurants.
  • Cuisine and Menu Analysis: Analyze cuisine distribution and popular menu items.
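
The performance analysis above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's code: the field names (`name`, `cuisine`, `rating`, `reviews`), the sample rows, and the thresholds for a "high-performing" restaurant are all assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the real dataset's column names may differ.
restaurants = [
    {"name": "A", "cuisine": "North Indian", "rating": 4.4, "reviews": 1200},
    {"name": "B", "cuisine": "Chinese", "rating": 3.8, "reviews": 300},
    {"name": "C", "cuisine": "North Indian", "rating": 4.0, "reviews": 80},
    {"name": "D", "cuisine": "Chinese", "rating": 4.5, "reviews": 950},
]


def average_rating_by_cuisine(rows):
    """Group ratings by cuisine and return the mean rating per cuisine."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["cuisine"]].append(row["rating"])
    return {cuisine: mean(ratings) for cuisine, ratings in grouped.items()}


def high_performers(rows, min_rating=4.2, min_reviews=500):
    """Return names of restaurants meeting both (assumed) thresholds."""
    return [
        row["name"]
        for row in rows
        if row["rating"] >= min_rating and row["reviews"] >= min_reviews
    ]


averages = average_rating_by_cuisine(restaurants)
top = high_performers(restaurants)
```

Requiring a minimum review count alongside the rating keeps restaurants with only a handful of reviews from ranking as top performers.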

GeeksforGeeks Data Analysis

Description: This project involves collecting and analyzing video data from the GeeksforGeeks YouTube channel.

  • Data Gathering: Use the YouTube Data API to fetch video details such as titles, views, upload dates, and lengths.
  • Data Processing and Analysis: Calculate total views and lengths, identify popular topics, and analyze correlations.
  • Visualization: Use libraries like matplotlib to create visualizations of trends and patterns.
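
The YouTube Data API reports video lengths as ISO 8601 durations (for example `PT15M33S`) and view counts as strings, so both need converting before totals can be computed. The sketch below assumes each video has already been flattened into a dict with hypothetical keys `viewCount` and `duration`; the sample records are made up.

```python
import re

# Matches durations such as "PT15M33S", "PT1H2M3S", or "P1DT2H".
_DURATION = re.compile(
    r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$"
)


def duration_to_seconds(iso_duration):
    """Convert an ISO 8601 duration (days/hours/minutes/seconds) to seconds."""
    match = _DURATION.match(iso_duration)
    if not match:
        raise ValueError(f"Unsupported duration: {iso_duration!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def summarize(videos):
    """Return (total views, total length in seconds) for a list of videos."""
    total_views = sum(int(v["viewCount"]) for v in videos)
    total_seconds = sum(duration_to_seconds(v["duration"]) for v in videos)
    return total_views, total_seconds


# Hypothetical records, shaped like flattened API results.
videos = [
    {"title": "Intro to Graphs", "viewCount": "1500", "duration": "PT15M33S"},
    {"title": "Dynamic Programming", "viewCount": "4200", "duration": "PT1H2M3S"},
]
total_views, total_seconds = summarize(videos)
```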

CarDekho Used Car Price Analysis

Description: Analyze the used car dataset from CarDekho to uncover insights about factors influencing car prices.

  • Data Gathering: The dataset includes features like selling price, vehicle age, KM driven, engine size, fuel type, seller type, and transmission type.
  • Data Cleaning and Preprocessing: Handle missing values, remove duplicates, standardize text columns, and remove outliers.
  • Exploratory Data Analysis (EDA): Perform univariate, bivariate, and categorical analyses to identify key trends and insights.
  • Visualization: Use libraries like matplotlib and seaborn to create distribution plots, scatter plots, and correlation heatmaps.
  • Insights and Findings: Analyze the impact of various factors on car prices and provide recommendations based on the analysis.
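
The description does not say how outliers are removed. One common choice is the interquartile range (IQR) rule, sketched here as an assumption rather than as the notebook's actual method; the sample prices and the multiplier `k=1.5` are illustrative.

```python
from statistics import quantiles


def remove_iqr_outliers(values, k=1.5):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical selling prices (in lakhs); 45.0 is an obvious outlier.
prices = [2.5, 3.0, 3.2, 3.5, 4.0, 4.1, 4.5, 5.0, 45.0]
cleaned = remove_iqr_outliers(prices)
```

The rule depends only on the middle half of the data, so a few extreme prices do not shift the cut-offs the way a mean-and-standard-deviation rule would.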

Sonar Mine Prediction

Description: Build a machine learning model to classify sonar signals as either mines (M) or rocks (R).

  • Data Gathering: The dataset includes sonar readings for mines and rocks.
  • Data Cleaning and Preprocessing: Verify and handle missing values and outliers.
  • Exploratory Data Analysis (EDA): Analyze summary statistics and class distribution.
  • Model Building: Create feature matrices, split data, and evaluate models such as Logistic Regression, SVC, Decision Tree, and Random Forest.
  • Model Comparison: Compare models based on accuracy and performance metrics.
  • Insights and Findings: Determine the best model for sonar signal classification based on accuracy.
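
Before models can be compared on accuracy, the data has to be split so that both classes appear in the training and test sets. The following is a dependency-free sketch of a stratified split and an accuracy function; the 80/20 ratio, the seed, and the toy labels are assumptions, and the classifiers named above are not reimplemented here.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # At least one test example per class.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels: 10 mines (M) and 10 rocks (R).
labels = ["M"] * 10 + ["R"] * 10
train_idx, test_idx = stratified_split(labels)
```

Each candidate model would be fit on the rows at `train_idx` and scored with `accuracy` on the rows at `test_idx`, so that every model is judged on the same held-out data.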

Big Mart Sales Prediction

Description: Predict sales for Big Mart using historical sales data.

  • Data Gathering: Use sales data from Big Mart to create predictive models.
  • Data Cleaning and Preprocessing: Handle missing values and preprocess data for modeling.
  • Model Building: Build and evaluate regression models to predict sales.
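
Evaluating a regression model usually comes down to a couple of error metrics. The description does not say which ones the notebook reports; root mean squared error (RMSE) and the coefficient of determination (R²) are shown here as an assumed, typical pair, with made-up sales figures.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error: typical size of a prediction error."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical actual vs. predicted sales.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 8.0]
error = rmse(actual, predicted)
score = r_squared(actual, predicted)
```

RMSE is expressed in the same units as sales, while R² is unitless, so the two together give both an absolute and a relative view of model quality.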

California House Price Prediction

Description: Predict house prices in California using historical data.

  • Data Gathering: Use historical housing data from California.
  • Data Cleaning and Preprocessing: Clean and preprocess data for analysis.
  • Model Building: Develop regression models to predict house prices.
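
As a minimal sketch of what "developing a regression model" means, the snippet below fits a one-feature least-squares line by hand. The feature (median income) and the numbers are hypothetical; the notebook presumably uses more features and a library implementation.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a single x using the fitted line."""
    return slope * x + intercept


# Hypothetical data: median income vs. median house value (arbitrary units).
income = [1.0, 2.0, 3.0, 4.0]
value = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(income, value)
estimate = predict(5.0, slope, intercept)
```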

CarDekho Car Price EDA

Description: Perform exploratory data analysis on CarDekho's car price dataset.

  • Data Gathering: Analyze features such as car price, model, and mileage.
  • Exploratory Data Analysis (EDA): Identify key trends and patterns in the dataset.

Credit Card Fraud Detection

Description: Build a model to detect fraudulent credit card transactions.

  • Data Gathering: Use historical credit card transaction data.
  • Model Building: Develop and evaluate classification models to detect fraud.
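
Fraudulent transactions are usually a small minority of all transactions, so accuracy alone can be misleading when evaluating a fraud classifier. The sketch below computes precision, recall, and F1 for the fraud class; the choice of metrics and the toy labels (1 = fraud) are assumptions, not taken from the notebook.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Toy data: 2 fraud cases among 10 transactions.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_legit = [0] * 10                    # 80% accurate, catches no fraud
some_model = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

baseline = precision_recall_f1(y_true, always_legit)
model = precision_recall_f1(y_true, some_model)
```

A model that never flags fraud scores 80% accuracy on this toy data yet has zero recall, which is why the fraud-class metrics are the more informative ones.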

Customer Segmentation Using K-Means

Description: Segment customers into different groups using K-Means clustering.

  • Data Gathering: Use customer data for clustering.
  • Model Building: Apply K-Means clustering to segment customers.
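
To show what K-Means does under the hood, here is a small dependency-free sketch of the standard assign-and-update loop (Lloyd's algorithm). The two features (annual income and spending score), the sample customers, and `k=2` are hypothetical; a library implementation would normally be used in practice.

```python
import random
from math import dist


def kmeans(points, k, iterations=100, seed=0):
    """Cluster points (tuples of numbers) into k groups.

    Returns (centroids, labels), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster
            else centroids[i]  # keep an empty cluster's centroid in place
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids

    labels = [
        min(range(k), key=lambda i: dist(point, centroids[i]))
        for point in points
    ]
    return centroids, labels


# Hypothetical customers: (annual income, spending score).
customers = [(15, 39), (16, 41), (17, 40), (85, 80), (88, 82), (90, 79)]
centroids, labels = kmeans(customers, k=2)
```

The cluster numbers themselves are arbitrary; what matters is which customers share a label.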

Fake News Prediction

Description: Predict whether a news article is fake or real.

  • Data Gathering: Use a dataset of news articles.
  • Model Building: Develop and evaluate classification models for fake news detection.

Gold Price Prediction

Description: Predict gold prices using historical data.

  • Data Gathering: Use historical gold price data.
  • Model Building: Develop regression models to predict future gold prices.

Heart Disease Prediction

Description: Predict the likelihood of heart disease based on patient data.

  • Data Gathering: Use health data related to heart disease.
  • Model Building: Develop classification models to predict heart disease risk.

House Prices: Advanced Regression Techniques

Description: Use advanced regression techniques to predict house prices.

  • Data Gathering: Use historical housing data.
  • Model Building: Apply advanced regression techniques to improve predictions.

Loan Eligibility Prediction

Description: Predict loan eligibility based on applicant data.

  • Data Gathering: Use applicant data to determine loan eligibility.
  • Model Building: Develop classification models to predict loan approval.

Parkinson's Disease Detection

Description: Build a model to detect Parkinson's disease from patient data.

  • Data Gathering: Use health data related to Parkinson's disease.
  • Model Building: Develop and evaluate classification models for disease detection.

Spam Mail Prediction

Description: Predict whether an email is spam or not.

  • Data Gathering: Use email data to classify spam and non-spam emails.
  • Model Building: Develop classification models to detect spam emails.
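
A classifier cannot consume raw email text, so the messages first have to be turned into numeric features. The description does not name the feature extraction used; a bag-of-words count vector is sketched here as one simple, assumed option, with made-up messages.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(documents):
    """Map every distinct token in the corpus to a column index."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: index for index, tok in enumerate(tokens)}


def vectorize(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocabulary)
    for token, count in Counter(tokenize(text)).items():
        if token in vocabulary:
            vector[vocabulary[token]] = count
    return vector


# Hypothetical training messages.
emails = ["Win a free prize now", "Meeting moved to noon"]
vocabulary = build_vocabulary(emails)
features = vectorize("Free free PRIZE inside", vocabulary)
```

Each email becomes a fixed-length vector of word counts, which is the form a classification model expects as input.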

Used Medical Insurance Prediction

Description: Predict the likelihood of medical insurance usage based on patient data.

  • Data Gathering: Use patient data to predict insurance usage.
  • Model Building: Develop classification models to predict medical insurance needs.

License

This project is licensed under the MIT License.
