Comparative Evaluation of MFCC and Mel-spectrogram Features for CNN-Based Respiratory Abnormality Detection

Authors

  • Simboni Simboni Tege, Department of Management Information Systems, Higher Pedagogical Institute of Isiro, Isiro, D.R. Congo
  • Kafunda Katalay Pierre, Department of Mathematics, Statistics and Computer Science, University of Kinshasa, Kinshasa, D.R. Congo
  • Oshasha Oshasha Fiston, General Commissariat for Atomic Energy, Regional Center for Nuclear Studies of Kinshasa, Kinshasa, D.R. Congo
  • Sylvestre Frey, Department of Mathematics, Statistics and Computer Science, University of Kinshasa, Kinshasa, D.R. Congo
  • Albert Ntumba Nkongolo, Department of Mathematics, Statistics and Computer Science, University of Kinshasa, Kinshasa, D.R. Congo
  • Biaba Kuya Jirince, International School, Vietnam National University, Hanoi, Vietnam

DOI:

https://doi.org/10.30871/jaic.v10i2.12355

Keywords:

Convolutional Neural Networks (CNNs), Respiratory sounds, MFCC, Mel-spectrogram, Medical diagnosis

Abstract

Automated respiratory sound analysis addresses critical limitations in traditional clinical auscultation, particularly high inter-observer variability and limited specialist access in resource-constrained settings. This study rigorously compares Mel-Frequency Cepstral Coefficients (MFCC) and Mel-spectrogram representations for classifying respiratory abnormalities using convolutional neural networks. Using the ICBHI 2017 dataset (920 recordings, 6,898 cycles from 126 patients), we implemented identical CNN architectures differing only in input features. Class imbalance was addressed through Synthetic Minority Over-sampling Technique applied exclusively to training data. The MFCC model achieved 83% accuracy with superior sensitivity for normal sounds (97% recall), while Mel-spectrograms reached 82% accuracy with higher precision (95%). MFCC demonstrated better crackle detection (76% vs 73% recall) and wheeze precision (75% vs 71%), attributed to enhanced transient spectral capture through discrete cosine transformation. Both models showed strong discrimination (AUC > 0.90). MFCC offers computational efficiency advantages for screening applications, while Mel-spectrograms provide interpretability for diagnostic contexts. This controlled comparison provides evidence-based guidance for computer-aided respiratory diagnostic system design, particularly relevant for resource-limited healthcare environments.
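The two feature representations compared in the abstract differ only in a final decorrelation step: a Mel-spectrogram applies a triangular mel filterbank to a short-time power spectrogram, while MFCCs additionally apply a discrete cosine transform to the log-mel energies. The sketch below illustrates this in plain NumPy; the frame length, hop size, 40 mel bands, and 13 coefficients are illustrative assumptions, not the configuration reported in the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with centers spaced evenly on the mel scale
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):
            fb[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i, k] = (right - k) / max(right - center, 1)
    return fb

def dct_ii(x, n_out):
    # DCT-II along axis 0: decorrelates log-mel energies into cepstral coefficients
    n = x.shape[0]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return basis @ x

def features(signal, sr=4000, n_fft=256, hop=128, n_mels=40, n_mfcc=13):
    # Short-time power spectrogram from Hann-windowed frames
    frames = np.array([signal[s:s + n_fft] * np.hanning(n_fft)
                       for s in range(0, len(signal) - n_fft + 1, hop)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2        # (T, n_fft//2 + 1)
    mel_spec = mel_filterbank(sr, n_fft, n_mels) @ power.T  # (n_mels, T)
    log_mel = np.log(mel_spec + 1e-10)                      # Mel-spectrogram input
    mfcc = dct_ii(log_mel, n_mfcc)                          # MFCC input
    return log_mel, mfcc
```

Either output can be fed to a CNN as a 2-D "image"; the DCT step compacts each frame into a handful of coefficients, which is the source of the computational-efficiency advantage the abstract attributes to MFCC.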

References

[1] A. Bohadana, G. Izbicki, and S. S. Kraman, "Fundamentals of lung auscultation," N. Engl. J. Med., vol. 370, no. 8, pp. 744–751, Feb. 2014.

[2] M. L. Aviles-Solis, J. C. Storvoll, S. E. Vanbelle, and H. Melbye, "Prevalence and clinical associations of wheezes and crackles in the general population: the Tromsø study," BMC Pulm. Med., vol. 19, no. 1, pp. 1–8, 2019.

[3] M. Sarkar, I. Madabhavi, N. Niranjan, and M. Dogra, "Auscultation of the respiratory system," Ann. Thorac. Med., vol. 10, no. 3, pp. 158–168, Jul. 2015.

[4] R. X. A. Pramono, S. Bowyer, and E. Rodriguez-Villegas, "Automatic adventitious respiratory sound analysis: A systematic review," PLoS ONE, vol. 12, no. 5, p. e0177926, May 2017.

[5] S. Reichert, R. Gass, C. Brandt, and E. Andrès, "Analysis of respiratory sounds: State of the art," Clin. Med. Circ. Respirat. Pulm. Med., vol. 2, pp. 45–58, Jan. 2008.

[6] H. Pasterkamp, S. S. Kraman, and G. R. Wodicka, "Respiratory sounds: Advances beyond the stethoscope," Am. J. Respir. Crit. Care Med., vol. 156, no. 3, pp. 974–987, Sep. 1997.

[7] A. R. A. Sovijärvi et al., "Definition of terms for applications of respiratory sounds," Eur. Respir. Rev., vol. 10, no. 77, pp. 597–610, 2000.

[8] H. J. Schreur et al., "Lung sounds during allergen-induced asthmatic responses in patients with asthma," Am. J. Respir. Crit. Care Med., vol. 153, no. 5, pp. 1510–1517, May 1996.

[9] R. L. H. Murphy Jr, "Computerized multichannel lung sound analysis," IEEE Eng. Med. Biol. Mag., vol. 26, no. 1, pp. 16–19, Jan. 2007.

[10] R. Palaniappan, K. Sundaraj, and N. U. Ahmed, "Lung sound classification using cepstral-based statistical features," Comput. Biol. Med., vol. 43, no. 3, pp. 181–191, Mar. 2013.

[11] L. R. Rabiner and B. H. Juang, Fundamentals of Speech Recognition. Englewood Cliffs, NJ: Prentice Hall, 1993.

[12] S. Perna and A. Tagarelli, "Deep auscultation: Predicting respiratory anomalies and diseases via recurrent neural networks," in Proc. IEEE 32nd Int. Symp. Comput.-Based Med. Syst. (CBMS), Jun. 2019, pp. 50–55.

[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017.

[14] M. Aykanat, Ö. Kılıç, B. Kurt, and S. Saryal, "Classification of lung sounds using convolutional neural networks," EURASIP J. Image Video Process., vol. 2017, no. 1, p. 65, Dec. 2017.

[15] J. Acharya and A. Basu, "Deep neural network for respiratory sound classification in wearable devices enabled by patient specific model tuning," IEEE Trans. Biomed. Circuits Syst., vol. 14, no. 3, pp. 535–544, Jun. 2020.

[16] B. M. Rocha et al., "An open access database for the evaluation of respiratory sound classification algorithms," Physiol. Meas., vol. 40, no. 3, p. 035001, Mar. 2019.

[17] B. McFee et al., "librosa: Audio and music signal analysis in Python," in Proc. 14th Python Sci. Conf., vol. 8, 2015, pp. 18–25.

[18] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "SMOTE: Synthetic minority over-sampling technique," J. Artif. Intell. Res., vol. 16, pp. 321–357, Jun. 2002.

[19] M. Abadi et al., "TensorFlow: Large-scale machine learning on heterogeneous systems," 2015. [Online]. Available: https://www.tensorflow.org/

[20] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929–1958, Jan. 2014.

[21] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proc. 27th Int. Conf. Mach. Learn., 2010, pp. 807–814.

[22] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," in Proc. Int. Conf. Learn. Represent., 2015.

[23] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, May 2015.

[24] K. Kochetov, E. Putin, M. Balashov, A. Filchenkov, and A. Shalyto, "Wheeze detection using convolutional neural networks," in Proc. Eur. Symp. Artif. Neural Netw., Comput. Intell. Mach. Learn., 2018, pp. 105–110.

[25] S. Chakraborty, G. Pal, and P. S. Bhattacharya, "Detection of respiratory disorder using mel-frequency cepstral coefficients and convolutional neural network," IEEE Sens. Lett., vol. 4, no. 6, pp. 1–4, Jun. 2020.

[26] T. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 2, pp. 318–327, Feb. 2020.

[27] Y. Ma, X. Xu, and Y. Li, "LungAttn: Advanced lung sound classification using attention mechanism with dual-stream CNN and Transformer," Physiol. Meas., vol. 42, no. 10, p. 105006, Nov. 2021.

[28] S. Alqudaihi et al., "Cough sound detection and diagnosis using artificial intelligence techniques: Challenges and opportunities," IEEE Access, vol. 9, pp. 102327–102344, 2021.

[29] H. Pham et al., "Robust detection of COVID-19 in cough sounds using recurrence plot-based feature extraction and deep learning," Biomed. Signal Process. Control, vol. 78, p. 103963, Sep. 2022.

[30] W. Xia, D. Togneri, F. Sohel, M. Bennamoun, D. Khoo, and B. Murray, "Respiratory sound classification using long short-term memory," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), 2023, pp. 1–5.

Published

2026-04-16

How to Cite

[1] S. S. Tege, K. K. Pierre, O. O. Fiston, S. Frey, A. Ntumba Nkongolo, and B. K. Jirince, "Comparative Evaluation of MFCC and Mel-spectrogram Features for CNN-Based Respiratory Abnormality Detection", JAIC, vol. 10, no. 2, pp. 1142–1150, Apr. 2026.
