Mental Health Classification Using Naïve Bayes and Random Forest Algorithms

Authors

  • Muhammad Jazum Faisti, Universitas Islam Nahdlatul Ulama Jepara
  • R. Hadapiningradja Kusumodestoni, Universitas Islam Nahdlatul Ulama Jepara
  • Gentur Wahyu Nyipto Wibowo, Universitas Islam Nahdlatul Ulama Jepara

DOI:

https://doi.org/10.30871/jaic.v9i4.10144

Keywords:

Mental Health, Machine Learning, Naïve Bayes, Random Forest, Text Classification

Abstract

Mental health is a crucial issue affecting individual and societal well-being. This study compares the performance of two machine learning algorithms, Naïve Bayes and Random Forest, for text-based mental health classification. The dataset is the Mental Health Corpus from Kaggle, which consists of 27,977 English text messages from online forums with binary labels (0: no indication of mental disorder, 1: indication of mental disorder) pre-annotated by the dataset creators. Text preprocessing involved lowercasing, negation handling, stopword removal, slang normalization, tokenization, and stemming, and the text was then transformed into features using TF-IDF. The models were evaluated with accuracy, precision, recall, and F1-score, along with 5-fold cross-validation. Both algorithms performed well. On the test data, Naïve Bayes achieved 88.7% accuracy, 84.2% precision, 95.2% recall, and 89.3% F1-score, while Random Forest was more balanced, with 89.3% accuracy, 88.1% precision, 90.5% recall, and 89.3% F1-score. In 5-fold cross-validation, Naïve Bayes averaged 88.8% accuracy, 84.4% precision, 94.9% recall, and 89.3% F1-score, and Random Forest averaged 89.2% accuracy, 88.8% precision, 89.5% recall, and 89.3% F1-score. Although Naïve Bayes had higher recall, Random Forest showed the better overall performance when accuracy, precision, and stable generalization are considered together, making it the more effective of the two for mental health text classification.
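The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `binary_metrics` and `f1_from` are hypothetical, the toy labels are invented for illustration, and only the test-set precision and recall values passed to `f1_from` at the end are taken from the abstract. Label 1 (indication of mental disorder) is assumed to be the positive class.

```python
from typing import Dict, Sequence


def f1_from(precision: float, recall: float) -> float:
    """F1-score as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Toy labels, invented purely for illustration (not from the dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
toy = binary_metrics(y_true, y_pred)
print(toy)

# Test-set precision and recall as reported in the abstract.
nb_f1 = f1_from(0.842, 0.952)  # Naive Bayes
rf_f1 = f1_from(0.881, 0.905)  # Random Forest
print(f"Naive Bayes F1 from reported P/R:   {nb_f1:.3f}")
print(f"Random Forest F1 from reported P/R: {rf_f1:.3f}")
```

Recomputing F1 from the rounded precision and recall lands close to the reported 89.3% for both models, which shows how two classifiers with different precision/recall trade-offs can end up with nearly the same F1-score.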

Published

2025-08-08

How to Cite

[1]
M. J. Faisti, R. H. Kusumodestoni, and G. W. N. Wibowo, “Mental Health Classification Using Naïve Bayes and Random Forest Algorithms”, JAIC, vol. 9, no. 4, pp. 1740–1750, Aug. 2025.

Issue

Section

Articles
