Musical Instrument Classification using Audio Features and Convolutional Neural Network
Abstract
This research classifies acoustic instruments using a Convolutional Neural Network (CNN). We use a Kaggle dataset containing audio recordings of piano, violin, drums, and guitar. The training set consists of 700 samples each of guitar, drums (percussion), and violin, and 528 piano samples; the test set contains 80 samples of each instrument. Mel spectrograms, MFCCs, and other spectral and non-spectral features are extracted using the Librosa package. Three feature sets (spectral-only, non-spectral-only, and combined) are used to evaluate the CNN models. Several CNN configurations are tested by varying the number of convolutional filters, the learning rate, and the number of epochs. The combined feature set performs best, with a validation accuracy of 71.8% and a training accuracy of 76.9%. By comparison, spectral-only features reach a validation accuracy of 69.3% and non-spectral-only features 68.4%. These findings suggest that combining spectral and non-spectral features improves classification accuracy.
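The split into spectral-only, non-spectral-only, and combined feature sets can be illustrated with a minimal sketch. The abstract does not list which features belong to each set, so the grouping here (spectral centroid and bandwidth as spectral; zero-crossing rate and RMS energy as non-spectral) is an assumption, as are the function names. The study used Librosa; this sketch substitutes plain NumPy on a synthetic 440 Hz tone to stay dependency-light. Librosa offers comparable functions such as `librosa.feature.spectral_centroid` and `librosa.feature.zero_crossing_rate`.

```python
import numpy as np


def spectral_features(y, sr):
    """Frequency-domain descriptors computed from the clip's magnitude spectrum."""
    mag = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)
    total = np.sum(mag)
    # Spectral centroid: magnitude-weighted mean frequency.
    centroid = np.sum(freqs * mag) / total
    # Spectral bandwidth: magnitude-weighted spread around the centroid.
    bandwidth = np.sqrt(np.sum(((freqs - centroid) ** 2) * mag) / total)
    return np.array([centroid, bandwidth])


def non_spectral_features(y):
    """Time-domain descriptors: zero-crossing rate and RMS energy."""
    # Fraction of adjacent sample pairs whose signs differ.
    zcr = np.mean(np.signbit(y[1:]) != np.signbit(y[:-1]))
    rms = np.sqrt(np.mean(y ** 2))
    return np.array([zcr, rms])


def feature_sets(y, sr):
    """Build the three feature sets (hypothetical grouping, see lead-in)."""
    spec = spectral_features(y, sr)
    non_spec = non_spectral_features(y)
    return {
        "spectral": spec,
        "non_spectral": non_spec,
        "combined": np.concatenate([spec, non_spec]),
    }


# Synthetic one-second test tone standing in for a real recording.
sr = 22050
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 440.0 * t)

sets = feature_sets(y, sr)
for name, vec in sets.items():
    print(name, vec.shape, np.round(vec, 4))
```

In a pipeline like the one described, each feature set would be computed per recording and passed to a separate CNN, so that the three validation accuracies can be compared on equal terms.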
Copyright (c) 2024 Gst. Ayu Vida Mastrika Giri, Made Leo Radhitya
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.