Twitter sentiment analysis via machine learning
Date
2021
Publisher
Kadir Has Üniversitesi
Abstract
People use social media platforms to comment on topics ranging from world events to the products and services they use, to share their feelings and thoughts, and to communicate with one another. Twitter is one of the most popular of these platforms today. Tweets created by its users can be a valuable data source for data scientists working in text mining, and in sentiment analysis in particular. In this thesis, tweet data was preprocessed in Python using the JupyterLab editor on the Anaconda platform, and sentiment analysis was then performed; the text was labeled as Negative or Positive through binary classification. The tweet text was converted into vectors using feature extraction methods such as Bag of Words and TF-IDF, and the classification accuracy of Support Vector Machine, Logistic Regression, Naïve Bayes, Random Forest, and Extreme Gradient Boosting machine learning algorithms was compared.
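The preprocessing and feature extraction steps named in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the function names (`preprocess`, `bag_of_words`, `tf_idf`), the cleaning rules, the English-only tokenizer, the sample tweets, and the unsmoothed idf formula ln(N / df) are all assumptions chosen for illustration. Library implementations typically use smoothed or normalized TF-IDF variants, so their numbers would differ.

```python
import math
import re
from collections import Counter


def preprocess(tweet):
    """Lowercase a tweet, drop URLs and @mentions, and split it into word tokens.

    Assumed cleaning rules for illustration; the tokenizer only keeps a-z and
    apostrophes, so it suits English text only.
    """
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop the sign
    return re.findall(r"[a-z']+", text)


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of token lists."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)
        for tok, n in Counter(doc).items():
            vec[index[tok]] = n
        vectors.append(vec)
    return vocab, vectors


def tf_idf(count_vectors):
    """Weight raw counts by idf = ln(N / df), an unsmoothed textbook variant."""
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0]) if count_vectors else 0
    # Document frequency: number of documents in which each term occurs.
    df = [sum(1 for vec in count_vectors if vec[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) for d in df]
    return [[c * w for c, w in zip(vec, idf)] for vec in count_vectors]


# Hypothetical sample tweets, not taken from the thesis data set.
tweets = [
    "I love this phone! https://t.co/x",
    "@shop I hate this phone",
    "love love #happy",
]
tokens = [preprocess(t) for t in tweets]
vocab, counts = bag_of_words(tokens)
weights = tf_idf(counts)
```

Each row of `counts` or `weights` is a fixed-length feature vector for one tweet, which is the form of input that classifiers such as those listed in the abstract would be trained on together with the Negative/Positive labels.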
Start Page
1
End Page
102