Mental Health Classification Using Naïve Bayes and Random Forest Algorithms
DOI: https://doi.org/10.30871/jaic.v9i4.10144
Keywords: Mental Health, Machine Learning, Naïve Bayes, Random Forest, Text Classification
Abstract
Mental health is a crucial issue affecting individual and societal well-being. This study compares the performance of two machine learning algorithms, Naïve Bayes and Random Forest, for text-based mental health classification. The dataset is the Mental Health Corpus from Kaggle, consisting of 27,977 English text messages from online forums with binary labels (0: no indication of mental disorder, 1: indication of mental disorder) pre-annotated by the dataset creators. Text preprocessing involved lowercasing, negation handling, stopword removal, slang normalization, tokenization, and stemming, after which the text was transformed into features using TF-IDF. The models were evaluated with accuracy, precision, recall, and F1-score on the test data and with 5-fold cross-validation. Both algorithms performed well. On the test data, Naïve Bayes achieved 88.7% accuracy, 84.2% precision, 95.2% recall, and 89.3% F1-score, while Random Forest was more balanced, with 89.3% accuracy, 88.1% precision, 90.5% recall, and 89.3% F1-score. In 5-fold cross-validation, Naïve Bayes averaged 88.8% accuracy, 84.4% precision, 94.9% recall, and 89.3% F1-score, and Random Forest averaged 89.2% accuracy, 88.8% precision, 89.5% recall, and 89.3% F1-score. Although Naïve Bayes had higher recall, Random Forest showed the better overall performance when accuracy, precision, and stable generalization are considered together, making it the more effective of the two for mental health text classification.
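The four evaluation metrics named in the abstract, and their averaging over five folds, can be illustrated with a minimal Python sketch. The function name `binary_metrics` and the per-fold confusion counts are hypothetical and are not taken from the paper. The counts are made up to mimic a classifier with high recall and lower precision, the pattern the abstract reports for Naïve Bayes.

```python
from statistics import mean


def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for the positive class (label 1)."""
    total = tp + fp + fn + tn
    # Guard against empty denominators so the function never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical (tp, fp, fn, tn) counts for five validation folds.
# These are illustrative numbers only, NOT results from the study.
fold_counts = [
    (95, 18, 5, 82),
    (94, 17, 6, 83),
    (96, 19, 4, 81),
    (95, 18, 5, 82),
    (94, 18, 6, 82),
]

# Score each fold separately, then average each metric across folds,
# which is how cross-validated averages are usually reported.
fold_scores = [binary_metrics(*counts) for counts in fold_counts]
cv_average = {name: mean(s[name] for s in fold_scores) for name in fold_scores[0]}

for name, value in cv_average.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, few positives are missed (low `fn`), so recall is high, while the larger number of false alarms (`fp`) pulls precision down. This is the trade-off the abstract describes between the two classifiers.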
Copyright (c) 2025 Muhammad Jazum Faisti, R. Hadapiningradja Kusumodestoni, Gentur Wahyu Nyipto Wibowo

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.