A Hybrid Approach to Music Recommendations Based on Audio Similarity Using Autoencoder and LightGBM

Authors

  • Winda Ardelia Aristawidya, Universitas Amikom Yogyakarta
  • Majid Rahardi, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.30871/jaic.v9i6.10516

Keywords:

Music Recommendation System, Audio Feature, Autoencoder, PCA, LightGBM

Abstract

Music recommendation systems help users navigate large music collections by suggesting songs that match their preferences. Conventional methods, however, often make little use of the audio content itself, which limits personalization and accuracy. This study proposes a hybrid approach that uses PCA and an Autoencoder to extract audio embeddings. K-Nearest Neighbors is applied to these embeddings to find similar tracks, and the candidates are then reranked with LightGBM according to predicted relevance. The system achieved 98% accuracy, with precision, recall, and F1-score of 0.96 for the Similar class and precision and recall of 0.99 for the Not Similar class. Cross-validation confirmed the model's robustness, with an average accuracy of 97.99%, precision of 0.9577, recall of 0.9624, and F1-score of 0.9600, all with low standard deviations. These outcomes indicate that combining deep audio features with machine learning ranking improves recommendation quality. Future work may incorporate metadata and genre-based visualizations for more diverse and interpretable results.
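
The retrieve-then-rerank flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the audio features are synthetic, PCA alone stands in for the PCA and Autoencoder embedding stage, and `placeholder_relevance` is a hypothetical scorer used where the study applies LightGBM. All function names and parameter values are assumptions chosen for illustration.

```python
import numpy as np


def pca_embed(features, n_components):
    """Project standardized features onto the top principal components."""
    centered = features - features.mean(axis=0)
    scale = centered.std(axis=0)
    scale[scale == 0] = 1.0  # avoid division by zero for constant columns
    standardized = centered / scale
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(standardized, full_matrices=False)
    return standardized @ vt[:n_components].T


def knn_candidates(embeddings, query_index, k):
    """Return indices of the k tracks closest to the query by cosine similarity."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = embeddings / norms
    sims = unit @ unit[query_index]
    sims[query_index] = -np.inf  # never recommend the query track itself
    return np.argsort(-sims)[:k]


def placeholder_relevance(query_embedding, candidate_embeddings):
    """Hypothetical stand-in for a trained relevance model (LightGBM in the study).

    Scores fall in (0, 1] and decrease with Euclidean distance to the query.
    """
    dist = np.linalg.norm(candidate_embeddings - query_embedding, axis=1)
    return 1.0 / (1.0 + dist)


def rerank(candidates, relevance_scores):
    """Order candidate indices by predicted relevance, highest first."""
    order = np.argsort(-relevance_scores, kind="stable")
    return candidates[order]


# Synthetic data: 200 tracks with 12 audio features each (assumed sizes).
rng = np.random.default_rng(0)
audio_features = rng.normal(size=(200, 12))

embeddings = pca_embed(audio_features, n_components=5)
candidates = knn_candidates(embeddings, query_index=0, k=10)
scores = placeholder_relevance(embeddings[0], embeddings[candidates])
recommended = rerank(candidates, scores)
```

Keeping the neighbor search and the reranker as separate functions mirrors the two stages named in the abstract, so the placeholder scorer could in principle be replaced by a trained model's predicted probabilities without changing the retrieval step.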

Published

2025-12-06

How to Cite

W. A. Aristawidya and M. Rahardi, “A Hybrid Approach to Music Recommendations Based on Audio Similarity Using Autoencoder and LightGBM”, JAIC, vol. 9, no. 6, pp. 3191–3197, Dec. 2025.
