Ambulance Siren Audio Classification Using Convolutional Neural Network for Medical Emergency Detection
DOI: https://doi.org/10.33394/j-ps.v14i2.20099
Keywords: Ambulance siren, Convolutional neural network, Mel-frequency cepstral coefficients, Sound classification
Abstract
Rapid detection of emergency vehicle sirens is critical for road safety and traffic management. This study proposes an automated ambulance siren classification system based on a Convolutional Neural Network (CNN). The method uses Mel-Frequency Cepstral Coefficients (MFCC) to transform audio signals into 2D feature maps, from which the model learns distinct spectral and temporal patterns. The dataset was divided with a stratified split to preserve the class distribution across subsets and to avoid data leakage. In the experiments, the CNN model reaches an accuracy of 0.95 and outperforms the baseline models, a Multi-Layer Perceptron (MLP) and XGBoost. Evaluation with a confusion matrix gives precision, recall, and F1-score of 0.95 each, indicating that the model is robust in distinguishing sirens from complex urban noise. Training with the Adam optimizer and early stopping gave stable convergence and limited overfitting. These findings suggest that the proposed CNN-MFCC framework is a reliable approach to real-time emergency signal detection and a useful contribution to intelligent transportation systems.
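The stratified split mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the split ratio, the tooling, or the class labels the authors used, so the 80/20 ratio, the function name `stratified_split`, and the toy labels below are assumptions made only for illustration. The idea is to shuffle and split each class separately, so that both subsets keep roughly the same class proportions and no sample appears in both.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split separately.

    Hypothetical helper for illustration; not the authors' code.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction from every class (at least one sample).
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    return sorted(train_idx), sorted(test_idx)


# Toy labels (assumed): 50 siren clips and 50 background-noise clips.
labels = ["siren"] * 50 + ["noise"] * 50
train_idx, test_idx = stratified_split(labels)

train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print("train:", dict(train_counts))
print("test:", dict(test_counts))
```

Because every index lands in exactly one subset, the two sets are disjoint. Note that this only rules out leakage at the clip level: if several clips were cut from the same recording, the split would need to be done per recording instead.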
License
Copyright (c) 2026 Ramadhan Paninggalih, Bima Prihasto, Maryo Inri Pratama, Rizky Irswanda Ramadhana, Misbahuddin Misbahuddin, Buan Anshari, Lalu Ahmad Syamsul Irfan Akbar, Giri Wahyu Wiriasto

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).


