Implementation of SVM and DT for Sentiment Classification: Tempel Hamlet Content Reviews


Authors

  • Yerik Afrianto Singgalen, Universitas Katolik Indonesia Atma Jaya, Jakarta, Indonesia

DOI:

https://doi.org/10.30865/klik.v4i5.1826

Keywords:

Classification; Content Analysis; Decision Tree; Sentiment; Support Vector Machine

Abstract

This study investigates the effectiveness of two sentiment classification algorithms, Support Vector Machine (SVM) and Decision Tree (DT), combined with the Synthetic Minority Over-sampling Technique (SMOTE) to mitigate class imbalance. The research follows the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework and its six stages: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. The process begins by defining the business objectives of the sentiment analysis, then explores and prepares the dataset, and finally applies SVM and DT with SMOTE to classify sentiment. With SMOTE, SVM achieves an accuracy of 99.21% and DT an accuracy of 98.33%. The corresponding Area Under the Curve (AUC) scores are 1.000 for SVM and 0.996 for DT, indicating that both models separate positive from negative instances almost perfectly. These findings suggest that SVM and DT, when combined with SMOTE, classify sentiment in text data accurately while addressing class imbalance.
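The abstract names SMOTE as the remedy for class imbalance but does not describe the tooling used. As a rough illustration only, the core idea of SMOTE (creating synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours) can be sketched in plain Python. The function name `smote_oversample`, the toy feature vectors, and the parameter values below are hypothetical and are not taken from the study.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class
    samples and their k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy feature vectors (e.g. two term weights per review); not study data.
majority = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.7, 0.1], [0.95, 0.05], [0.75, 0.25]]
minority = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.7]]

# Generate just enough synthetic points to match the majority class size.
new_points = smote_oversample(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be used instead of a hand-written version, and oversampling is usually applied to the training split only so that synthetic points do not leak into the evaluation data.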

ARTICLE HISTORY


Published: 2024-04-30

Section

Articles
