A Deep Neural Network Approach for Classifying Pulmonary Diseases from Respiratory Sounds
DOI: https://doi.org/10.63665/tsdtby95

Keywords: Deep Learning, Respiratory Sounds, Pulmonary Diseases, CNN, Audio Classification

Abstract
This study presents an approach to classifying lung auscultation sounds using Mel-frequency Cepstral Coefficients (MFCC), Chroma features, and neural networks. Lung auscultation, a key diagnostic tool for identifying respiratory conditions, typically relies on the expertise of medical professionals to interpret subtle sound patterns; automated systems that accurately classify these sounds can therefore greatly assist early diagnosis and treatment. To this end, we employed MFCCs, which summarize the power spectrum of a sound in a way that models human auditory perception and emphasizes the frequency ranges most informative for lung sounds. Chroma features, which represent the tonal content of an audio signal, were used in addition to capture harmonic characteristics that may be indicative of specific lung conditions. These features were fed into a neural network that classifies lung sounds into diagnostic categories such as normal breathing, wheezing, crackles, and other abnormal respiratory sounds. Trained on a comprehensive dataset of lung sounds, the network learned complex patterns and correlations within the MFCC and Chroma features, achieving high classification accuracy. This automated approach offers a powerful tool for improving the precision of lung sound diagnosis, potentially enabling earlier detection of respiratory conditions and better patient outcomes.
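To make the feature-extraction step concrete, the sketch below computes MFCC-style coefficients for a single audio frame using only NumPy. This is an illustrative reconstruction of the standard MFCC pipeline (windowed power spectrum → mel filterbank → log → DCT-II), not the authors' implementation; the sample rate, filter counts, and the synthetic test tone are all assumptions. In practice the resulting per-frame vectors, concatenated with chroma features, would be the input to the classifier network.

```python
import numpy as np

SR = 22050      # assumed sample rate
N_FFT = 2048    # frame / FFT length
N_MELS = 26     # number of mel filters
N_MFCC = 13     # coefficients kept per frame

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale."""
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):           # rising slope
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):          # falling slope
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(frame, sr=SR, n_mfcc=N_MFCC):
    # windowed power spectrum of one frame
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n=N_FFT)) ** 2
    mel_energy = mel_filterbank(sr, N_FFT, N_MELS) @ spec
    log_mel = np.log(mel_energy + 1e-10)
    # DCT-II to decorrelate the log-mel energies
    n = np.arange(N_MELS)
    basis = np.cos(np.pi * np.outer(np.arange(n_mfcc), n + 0.5) / N_MELS)
    return basis @ log_mel

# toy frame: a 100 Hz tone standing in for one auscultation window
t = np.arange(N_FFT) / SR
frame = np.sin(2 * np.pi * 100.0 * t)
feat = mfcc(frame)
print(feat.shape)  # (13,)
```

A full pipeline would apply this frame-by-frame across each recording, average or stack the frames, and pass the result to the neural network described above.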
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
