Ambulance Siren Audio Classification Using Convolutional Neural Network for Medical Emergency Detection

Authors

  • Ramadhan Paninggalih, Institut Teknologi Kalimantan
  • Bima Prihasto, Institut Teknologi Kalimantan
  • Maryo Inri Pratama, Institut Teknologi Kalimantan
  • Rizky Irswanda Ramadhana, Institut Teknologi Kalimantan
  • Misbahuddin Misbahuddin, Universitas Mataram
  • Buan Anshari, Universitas Mat
  • Lalu Ahmad Syamsul Irfan Akbar, Universitas Mataram
  • Giri Wahyu Wiriasto, Universitas Mataram

DOI:

https://doi.org/10.33394/j-ps.v14i2.20099

Keywords:

Ambulance siren, Convolutional neural network, Mel-frequency cepstral coefficients, Sound classification

Abstract

The rapid detection of emergency vehicle sirens is critical for road safety and traffic management. This study proposes an automated ambulance siren classification system based on a Convolutional Neural Network (CNN). The method uses Mel-Frequency Cepstral Coefficients (MFCC) to transform audio signals into 2D feature maps, allowing the model to capture distinct spectral and temporal patterns. The dataset was divided using a stratified split to preserve the class distribution across subsets and to prevent data leakage. Experimental results show that the CNN model achieves an accuracy of 0.95, outperforming the baseline Multi-Layer Perceptron (MLP) and XGBoost models. Evaluation with a confusion matrix yields precision, recall, and F1-score of 0.95 each, indicating that the model is robust in distinguishing sirens from complex urban noise. Training with the Adam optimizer and an early stopping mechanism gave stable convergence and limited overfitting. These findings suggest that the proposed CNN-MFCC framework is a reliable approach for real-time emergency signal detection and a useful contribution to intelligent transportation systems.
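Two of the evaluation steps named in the abstract, the stratified split and the metrics read off a confusion matrix, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names `stratified_split` and `binary_metrics`, the two-class labels, the 80/20 ratio, and the confusion-matrix counts are all hypothetical, and the abstract does not state whether the task is binary.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps the same train/test proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical, perfectly balanced label list (not the study's dataset).
labels = ["siren"] * 50 + ["noise"] * 50
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)
test_labels = [labels[i] for i in test_idx]

# Made-up confusion-matrix counts, used only to exercise the formulas.
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
print(len(train_idx), len(test_idx), metrics)
```

Because the split is done per class, the test subset keeps the same class ratio as the full set. Note that this only controls class balance; avoiding leakage between clips cut from the same recording would additionally require grouping by recording, which the sketch does not attempt.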

Published

2026-04-29

How to Cite

Paninggalih, R., Prihasto, B., Pratama, M. I., Ramadhana, R. I., Misbahuddin, M., Anshari, B., … Wiriasto, G. W. (2026). Ambulance Siren Audio Classification Using Convolutional Neural Network for Medical Emergency Detection. Prisma Sains : Jurnal Pengkajian Ilmu Dan Pembelajaran Matematika Dan IPA IKIP Mataram, 14(2), 521–534. https://doi.org/10.33394/j-ps.v14i2.20099

Section

Research Articles