
A Systematic Review of Audio Deepfake Detection Using Hybrid Deep Models and Feature Fusion

Priyadarshini Prasad Yadav, Shobha B. Patil

Abstract


Deepfake audio refers to artificial or synthetic human-like voice generated by AI algorithms. This technology can lead to privacy violations and data security breaches in digital communication. Because most existing detection methods can no longer keep pace with new audio generation capabilities, a race to detect deepfake audio has begun. This article aims to establish a robust system for recognizing deepfake audio using Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks. The model classifies real and fake audio using two audio feature representations: spectrograms and Mel-Frequency Cepstral Coefficients (MFCCs). The proposed RNN- and LSTM-based architectures are tested on various datasets containing deepfake and real audio samples to assess their effectiveness in real-life situations. The paper also highlights the importance of deepfake audio detection for safeguarding privacy, electronic communications, and audio evidence in court. Its findings suggest that deep learning techniques can be used to counter the threat of deepfake audio and advance the state of the art.
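The feature-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The sampling rate, frame size, hop, filter count, coefficient count, hidden size, and all function names below are assumptions chosen for illustration. The LSTM uses random, untrained weights, so its output only shows how MFCC frames flow through a recurrent cell to a single real/fake score.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):
            fb[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_filters=26, n_coeffs=13):
    # Slice the signal into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    # Power spectrogram, shape (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / n_fft
    # Log mel energies; the small constant avoids log(0).
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II over the filter axis yields the cepstral coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_mel @ basis.T  # shape (n_frames, n_coeffs)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_score(features, hidden=16, seed=0):
    # Hypothetical, untrained LSTM: random weights, forward pass only.
    rng = np.random.default_rng(seed)
    d = features.shape[1]
    W = rng.normal(0.0, 0.1, (4 * hidden, d + hidden))
    b = np.zeros(4 * hidden)
    w_out = rng.normal(0.0, 0.1, hidden)
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in features:
        z = W @ np.concatenate([x, h]) + b
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    # Final hidden state mapped to a score in (0, 1).
    return float(sigmoid(w_out @ h))

# Usage: one second of a synthetic 220 Hz tone stands in for an audio clip.
sr = 16000
t = np.arange(sr) / sr
clip = np.sin(2 * np.pi * 220 * t)
feats = mfcc(clip, sr)
score = lstm_score(feats)
print(feats.shape, score)
```

In a real detector the weights would be learned from labeled real and fake clips, and the score would be thresholded to produce a decision. The sketch only fixes the shapes: one 13-dimensional MFCC vector per 10 ms hop, consumed sequentially by the recurrent cell.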




