Classifying Bitcoin Tweet Sentiment

Project Overview & Objective

This project seeks to build a model that accurately classifies tweets about Bitcoin as having either positive or negative sentiment. Unlabeled tweets classified by this model could ultimately be used to analyze trends in Bitcoin sentiment over time and to assess how well Twitter sentiment predicts future price movements of the cryptocurrency.

The Data

This project uses Twitter data sourced from Kaggle. It consists of 1 million Tweets referencing Bitcoin between February and August 2021. The sentiment is pre-labeled.

Modeling

Target

The target in this analysis is tweet sentiment: positive or negative.
Sentiment is fairly balanced, with around 53 percent of tweets labeled as negative and 47 percent positive.
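
A quick way to check a class split like this is to count the labels and divide by the total. The snippet below is a minimal sketch using made-up labels that mirror the 53/47 split; the function name `class_balance` and the label strings are assumptions for illustration, not names taken from the project notebook.

```python
from collections import Counter


def class_balance(labels):
    """Return each label's share of the dataset as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Toy labels (hypothetical) chosen to mirror the roughly 53/47 split.
toy_labels = ["negative"] * 53 + ["positive"] * 47
balance = class_balance(toy_labels)
print(balance)
```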

Evaluation Metrics

As the data is fairly balanced and we weigh false positives and false negatives equally, we focus on F1 score and accuracy.
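
For reference, both metrics can be computed directly from true and predicted labels. This is a small sketch on invented labels, treating "positive" as the positive class; the helper `accuracy_and_f1` is a hypothetical name, and the project itself may well have used a library implementation instead.

```python
def accuracy_and_f1(y_true, y_pred, positive_label="positive"):
    """Compute accuracy and the F1 score for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return accuracy, f1


# Invented labels, for illustration only.
y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(accuracy, f1)
```

Because F1 balances precision and recall, it penalizes both kinds of error, which fits the stated goal of treating false positives and false negatives equally.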

Final Model:

Achieved 97 percent F1 score, 97 percent accuracy
Most important features in positive tweets include:
Most important features in negative tweets include:
Just 3 percent of validation data categorized as negative when it was actually positive
Just 1.8 percent of validation data categorized as positive when it was actually negative

Conclusion

A Logistic Regression model was the best-performing classifier, with Count Vectorization used to process the annotated tweets
Final model can classify unlabeled Tweets as positive or negative with ~97 percent accuracy, 97 percent F1 score
Words important to the model included 'best', 'awesome', 'successful', 'insane', 'worst', 'worthless'
Positive tweets had more hashtags on average, while negative tweets more frequently contained a price

Next Steps / Recommendations

Pull more recent Tweets on Bitcoin via Twitter API and run final model on real-time data
Use model-labeled Tweets to conduct Time Series Analysis, with the aim of understanding the predictive power of Tweet sentiment on the price of BTC
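
One possible starting point for that time series analysis is to aggregate model-labeled tweets into a daily positive share and correlate it with the next day's price return. The sketch below uses only invented dates, labels, and returns; the function names and the one-day lag are assumptions, and a correlation on its own would not establish predictive power.

```python
from collections import defaultdict
from math import sqrt


def daily_positive_share(labeled_tweets):
    """Map each date to the fraction of that day's tweets labeled positive."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for date, label in labeled_tweets:
        totals[date] += 1
        if label == "positive":
            positives[date] += 1
    return {date: positives[date] / totals[date] for date in sorted(totals)}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(sentiment, returns, lag=1):
    """Correlate sentiment on day t with the return on day t + lag."""
    if lag > 0:
        return pearson(sentiment[:-lag], returns[lag:])
    return pearson(sentiment, returns)


# Hypothetical model-labeled tweets as (date, label) pairs.
toy_tweets = (
    [("2021-02-01", "positive")] * 2 + [("2021-02-01", "negative")] * 3
    + [("2021-02-02", "positive")] * 3 + [("2021-02-02", "negative")] * 2
    + [("2021-02-03", "positive")] * 1 + [("2021-02-03", "negative")] * 1
    + [("2021-02-04", "positive")] * 7 + [("2021-02-04", "negative")] * 3
)
# Hypothetical daily BTC returns for the same four dates.
toy_returns = [0.0, -0.01, 0.01, 0.0]

shares = daily_positive_share(toy_tweets)
sentiment = list(shares.values())
corr = lagged_correlation(sentiment, toy_returns, lag=1)
print(shares, corr)
```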

ntdoris/bitcoin-twitter-sentiment
