Twitter Sentiment Analysis using NLP #385

Merged 3 commits on Dec 16, 2023
2 changes: 2 additions & 0 deletions Twitter Sentiment Analysis NLP/Dataset/Readme.md
@@ -0,0 +1,2 @@
https://www.kaggle.com/datasets/kazanova/sentiment140
Dataset
1 change: 1 addition & 0 deletions Twitter Sentiment Analysis NLP/Images/Readme.md
@@ -0,0 +1 @@
EDA done through line plots, word clouds, and a confusion matrix
57 changes: 57 additions & 0 deletions Twitter Sentiment Analysis NLP/Models/Readme.Md
@@ -0,0 +1,57 @@
# Twitter Sentiment Analysis NLP

## PROJECT TITLE

Twitter Sentiment Analysis NLP

## GOAL

The main goal of this project is to analyse the sentiment of people's tweets using an LSTM and a Keras Sequential model.

## DATASET

https://www.kaggle.com/datasets/kazanova/sentiment140

## DESCRIPTION

This project performs sentiment analysis on tweets posted by various people and classifies them as positive or negative.
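The Sentiment140 file is distributed as a headerless, latin-1 encoded CSV whose `target` column uses 0 for negative and 4 for positive. A minimal loading sketch; `load_sentiment140` is a hypothetical helper, not code from the notebooks:

```python
import io

import pandas as pd

# Column names as documented on the Kaggle dataset page.
COLS = ["target", "ids", "date", "flag", "user", "text"]

def load_sentiment140(path_or_buf) -> pd.DataFrame:
    """Load the raw CSV and map the 0/4 target to a 0/1 binary label."""
    df = pd.read_csv(path_or_buf, encoding="latin-1", names=COLS)
    df["label"] = (df["target"] == 4).astype(int)
    return df[["text", "label"]]

# Tiny in-memory demo in the raw Sentiment140 column order.
sample = io.StringIO('0,1,"d","f","u","bad day"\n4,2,"d","f","u","great day"\n')
print(load_sentiment140(sample))
```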

## WHAT I HAD DONE

1. Used NLTK to preprocess and clean the text: stemming, lemmatization, symbol removal, etc.
2. Created a Sequential model using Keras, with weight initializers and regularizers
3. Used GloVe embeddings in another notebook
4. Created an LSTM model with Conv1D, SpatialDropout1D, Dense, and other layers
5. Evaluated the models with a confusion matrix
6. Used BERT for classification
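Step 1 above can be sketched as follows. This is a dependency-free approximation that lowercases the text and strips URLs, mentions, and symbols with regular expressions; the notebooks additionally apply NLTK's stemmer and lemmatizer. `clean_tweet` is a hypothetical helper, not code from the notebooks:

```python
import re

def clean_tweet(text: str) -> str:
    """Minimal tweet cleaning: lowercase, drop URLs, @mentions, and symbols."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"@\w+", " ", text)                   # @mentions
    text = re.sub(r"[^a-z\s]", " ", text)               # symbols and digits
    return " ".join(text.split())                       # collapse whitespace

print(clean_tweet("Loving the new #NLP course!! @prof https://example.com"))
# loving the new nlp course
```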

## MODELS USED

1. GloVe embeddings with LSTM
2. Keras Sequential model
3. BERT
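A sketch of the Conv1D + SpatialDropout1D + LSTM architecture described above. The hyperparameters (`VOCAB_SIZE`, `MAX_LEN`, `EMBED_DIM`) are illustrative assumptions, and the Embedding layer is randomly initialised here to keep the sketch self-contained; in the GloVe notebook it would be initialised from the pre-trained embedding matrix:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000  # assumed tokenizer vocabulary size
MAX_LEN = 50        # assumed padded tweet length
EMBED_DIM = 100     # GloVe 100-d vectors are a common choice

def build_lstm_model() -> tf.keras.Model:
    model = models.Sequential([
        layers.Input(shape=(MAX_LEN,)),
        layers.Embedding(VOCAB_SIZE, EMBED_DIM),  # would load GloVe weights
        layers.SpatialDropout1D(0.2),
        layers.Conv1D(64, 5, activation="relu"),
        layers.MaxPooling1D(4),
        layers.LSTM(64),
        layers.Dense(1, activation="sigmoid"),    # positive vs negative
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```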

## LIBRARIES NEEDED
- numpy
- pandas
- scikit-learn
- tensorflow
- keras
- scipy

## VISUALIZATION

![For Sequential Model](<../Images/Screenshot (277).png>) - Keras Sequential model
![For LSTM](<../Images/Screenshot (279).png>) - LSTM model

## EVALUATION METRICS

A confusion matrix was created, and precision, recall, and F1 score were used as evaluation metrics.
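These metrics can be computed with scikit-learn; the labels below are toy data for illustration, not results from the project:

```python
import numpy as np
from sklearn.metrics import (confusion_matrix, precision_score,
                             recall_score, f1_score)

# Hypothetical labels: 1 = positive, 0 = negative.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])

cm = confusion_matrix(y_true, y_pred)   # rows: true class, cols: predicted
print(cm)                               # [[3 1]
                                        #  [1 3]]
print(precision_score(y_true, y_pred))  # 0.75
print(recall_score(y_true, y_pred))     # 0.75
print(f1_score(y_true, y_pred))         # 0.75
```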

## RESULTS

The LSTM achieves higher accuracy (about 78%) than the Keras Sequential model (72%). BERT offers the highest accuracy, at 87%.

## CONCLUSION

Long Short-Term Memory (LSTM) networks are better suited to tweet sentiment analysis than a plain Keras Sequential model because they capture contextual dependencies and handle sequential data effectively. Tweets are short and informal, which makes it difficult for traditional models to discern sentiment accurately. With their memory cells, LSTMs capture nuances in the temporal structure of a tweet, modelling dependencies between words and phrases, and so grasp sentiment context better than a simple feed-forward model. A plain Sequential model, in contrast, struggles to capture this sequential structure, leading to weaker performance on sentiment analysis.

BERT goes further still: its Transformer encoders process the entire sentence simultaneously, so the context for each word is built from all of the surrounding inputs, not only the ones that precede it as in an LSTM or RNN. This richer context understanding is what gives BERT the best accuracy of the three models.

1 change: 1 addition & 0 deletions Twitter Sentiment Analysis NLP/Models/tweet_lstm.ipynb

1 change: 1 addition & 0 deletions Twitter Sentiment Analysis NLP/Models/tweetbert.ipynb

6 changes: 6 additions & 0 deletions Twitter Sentiment Analysis NLP/requirements.txt
@@ -0,0 +1,6 @@
numpy
pandas
scikit-learn
tensorflow
keras
scipy