Sentiment Analysis of Universitas Jember’s Sister for Student Application Using Gaussian Naive Bayes and N-Gram

Keywords: Gaussian Naive Bayes, Cross Validation, SFS, Application Review Analysis, N-Gram

Abstract

This research classifies the sentiment of user reviews of Universitas Jember's Sister for Student application on the Google Play Store, a vital platform for students. The main challenge addressed is the automated identification of positive and negative user sentiment. The study uses Gaussian Naive Bayes for classification, with TF-IDF vectorization for feature extraction, and N-Gram analysis to examine the terms that occur most often in each sentiment class. The dataset consists of 1,097 reviews (673 negative and 424 positive) that remain after removing 86 neutral spam reviews. The data is split into 80% training data (877 reviews) and 20% test data (220 reviews). On the test data, the Gaussian Naive Bayes model achieves an accuracy of 68%, a precision of 68%, and a recall of 71%. N-Gram analysis shows that the words "bisa" (can), "bagus" (good), and "aplikasi" (application) occur frequently in positive reviews, while "bisa", "hp" (mobile phone), and "absen" (attendance) are prevalent in negative reviews. The study concludes that the Gaussian Naive Bayes model effectively classifies sentiment in application reviews, with room for further performance improvement.
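The pipeline summarized in the abstract (N-Gram counting, TF-IDF vectorization, Gaussian Naive Bayes, and accuracy/precision/recall) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the six toy reviews are invented for illustration, and the helper names (`ngrams`, `top_ngrams`, `fit_tfidf`, `tfidf_vector`, `TinyGaussianNB`, `evaluate`), the whitespace tokenizer, the smoothed IDF formula, and the variance floor `eps` are all assumptions.

```python
import math
from collections import Counter

def ngrams(text, n=1):
    """Return the word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def top_ngrams(texts, n=1, k=3):
    """Count n-grams across all texts and return the k most frequent."""
    return Counter(g for t in texts for g in ngrams(t, n)).most_common(k)

def fit_tfidf(texts):
    """Learn a vocabulary and smoothed IDF weights (an assumed variant)."""
    docs = [ngrams(t) for t in texts]
    vocab = sorted({w for d in docs for w in d})
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log((1 + len(docs)) / (1 + df[w])) + 1 for w in vocab}
    return vocab, idf

def tfidf_vector(text, vocab, idf):
    """Dense TF-IDF vector for one text; out-of-vocabulary words are ignored."""
    tf = Counter(ngrams(text))
    return [tf[w] * idf[w] for w in vocab]

class TinyGaussianNB:
    """Gaussian Naive Bayes: one normal distribution per feature and class."""

    def __init__(self, eps=1e-3):
        self.eps = eps  # assumed variance floor, avoids division by zero

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.stats = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            cols = list(zip(*rows))
            means = [sum(col) / len(rows) for col in cols]
            variances = [sum((v - m) ** 2 for v in col) / len(rows) + self.eps
                         for col, m in zip(cols, means)]
            self.stats[c] = (math.log(len(rows) / len(X)), means, variances)
        return self

    def _log_posterior(self, x, c):
        log_prior, means, variances = self.stats[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))

    def predict(self, X):
        return [max(self.classes, key=lambda c: self._log_posterior(x, c))
                for x in X]

def evaluate(y_true, y_pred, positive="positive"):
    """Accuracy, precision, and recall with respect to the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Invented toy reviews, NOT taken from the study's dataset.
train_texts = ["aplikasi bagus", "bagus sekali",
               "tidak bisa absen", "hp error terus"]
train_labels = ["positive", "positive", "negative", "negative"]
test_texts = ["aplikasi bagus sekali", "absen error"]
test_labels = ["positive", "negative"]

vocab, idf = fit_tfidf(train_texts)
X_train = [tfidf_vector(t, vocab, idf) for t in train_texts]
X_test = [tfidf_vector(t, vocab, idf) for t in test_texts]

model = TinyGaussianNB().fit(X_train, train_labels)
predictions = model.predict(X_test)
scores = evaluate(test_labels, predictions)

positive_texts = [t for t, l in zip(train_texts, train_labels)
                  if l == "positive"]
top_positive = top_ngrams(positive_texts, n=1, k=1)

# An 80/20 split of 1,097 reviews, rounding the test share up.
n_test = math.ceil(0.20 * 1097)
n_train = 1097 - n_test

print(predictions, scores, top_positive, (n_train, n_test))
```

Rounding the 20% test share up gives 220 test and 877 training reviews, which agrees with the split sizes reported in the abstract. The scores on the toy data say nothing about the study's results; they only show how the three reported metrics are computed.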

Published
2024-12-30