Improving Helpdesk Chatbot Performance with Term Frequency-Inverse Document Frequency (TF-IDF) and Cosine Similarity Models

  • Gede Herdian Setiawan Institut Teknologi dan Bisnis STIKOM Bali
  • I Made Budi Adnyana Institut Teknologi dan Bisnis STIKOM Bali
Keywords: Chatbot, TF-IDF, Cosine Similarity, NLP

Abstract

Helpdesk chatbots are growing in popularity because they can answer user questions quickly and effectively. Developing them poses several challenges, including understanding user queries more accurately, returning relevant responses, and resolving problems more efficiently. In this research, we aim to improve the accuracy and efficiency of a helpdesk chatbot by implementing the Term Frequency-Inverse Document Frequency (TF-IDF) model together with the Cosine Similarity algorithm. The TF-IDF model weights a word by how often it occurs in a document relative to how common it is across the entire document collection, while the Cosine Similarity algorithm measures the similarity between two documents. After implementing and testing the TF-IDF and Cosine Similarity models in the helpdesk chatbot, we achieved a 75% question recognition rate. To increase accuracy and precision, the knowledge dataset needs to be expanded and pre-processing improved, especially in recognizing and correcting misspelled words.
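The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sample questions, the function names, the smoothed IDF formula, and the 0.2 score threshold are all assumptions chosen for illustration. The sketch builds TF-IDF vectors for a small set of stored questions and returns the one with the highest cosine similarity to the user query.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(docs_tokens):
    """Compute a smoothed inverse document frequency for every term.

    The smoothing (adding 1 to both counts and to the result) is an
    assumption; the paper does not state which IDF variant it uses.
    """
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}


def tfidf_vector(tokens, idf):
    """Return a sparse TF-IDF vector (dict) using relative term frequency."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, questions, threshold=0.2):
    """Return (index, score) of the closest stored question.

    The index is None when no question reaches the (hypothetical) threshold.
    """
    if not questions:
        return None, 0.0
    docs = [tokenize(q) for q in questions]
    idf = build_idf(docs)
    vectors = [tfidf_vector(d, idf) for d in docs]
    query_vec = tfidf_vector(tokenize(query), idf)
    scores = [cosine_similarity(query_vec, v) for v in vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]


# Hypothetical knowledge base; not taken from the paper's dataset.
faq = [
    "How do I reset my password?",
    "How do I connect to the campus wifi?",
    "Where can I download my transcript?",
]

index, score = best_match("I forgot my password, how to reset it", faq)
print(index, round(score, 3))
```

In this sketch the query about a forgotten password is matched to the first stored question, while a query that shares no vocabulary with the knowledge base scores 0.0 and is left unmatched. A misspelled word such as "pasword" would simply be dropped as out of vocabulary, which is consistent with the abstract's point that spelling correction in pre-processing affects recognition.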

Published
2023-12-05
How to Cite
G. Setiawan and I. M. Adnyana, “Improving Helpdesk Chatbot Performance with Term Frequency-Inverse Document Frequency (TF-IDF) and Cosine Similarity Models”, JAIC, vol. 7, no. 2, pp. 252-257, Dec. 2023.