Comparative analysis of speech recognition on deep fake audio using deep learning algorithms

Suhairi, Nurul Sakinah (2024) Comparative analysis of speech recognition on deep fake audio using deep learning algorithms. Project Report. Melaka, Malaysia, Universiti Teknikal Malaysia Melaka. (Submitted)

Abstract

This study presents a comparative analysis of speech recognition on deep fake audio using multiple deep learning algorithms. As deep fake technology becomes more common, reliable techniques are needed to identify and analyze deep fake audio, which raises serious security, privacy and authenticity verification concerns. To distinguish accurately between real and fake audio, this research explores the effectiveness of several deep learning models: Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Generative Adversarial Network (GAN) and Gated Recurrent Unit (GRU). Each model is evaluated on the dataset using performance metrics such as accuracy, F1-score and recall. In addition, feature selection and dimensionality reduction methods, namely Principal Component Analysis (PCA), Recursive Feature Elimination (RFE) and Mutual Information (MI), are applied in an effort to improve the models' generalization and classification performance. To support a thorough and objective evaluation, the study uses the Synthetic Minority Over-sampling Technique (SMOTE) to handle data imbalance, together with Stratified K-fold Cross-Validation. The objectives of this research are to analyze the processing of a deep fake audio dataset using various deep learning algorithms, to identify which model produces the best performance according to the evaluation metrics, and to propose an enhanced deep learning method for deep fake audio datasets.
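
To make the evaluation metrics named in the abstract concrete, the following minimal Python sketch computes accuracy, recall and F1-score for a binary real/fake labelling. The function name `binary_metrics`, the label encoding (1 = fake, 0 = real) and the sample labels are assumptions made purely for illustration; they are not taken from the report and do not reproduce its results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 with `positive` as the target class.

    Illustrative helper (hypothetical name), not the report's implementation.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # fake audio correctly flagged as fake
        elif pred == positive:
            fp += 1  # real audio wrongly flagged as fake
        elif truth == positive:
            fn += 1  # fake audio missed
        else:
            tn += 1  # real audio correctly passed

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Assumed toy labels: 1 = fake, 0 = real (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a cross-validated setting such as the one the abstract describes, a function like this would typically be applied to each held-out fold and the per-fold scores averaged; that usage is an assumption about common practice rather than a description of the report's procedure.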

Item Type:	Final Year Project (Project Report)
Uncontrolled Keywords:	Speech recognition, Deep fake audio, Recurrent neural network, Convolutional neural network, Generative adversarial network, Long short-term memory, Gated recurrent unit
Subjects:	T Technology > T Technology (General); T Technology > TK Electrical engineering. Electronics. Nuclear engineering
Divisions:	Library > Final Year Project > FTMK
Depositing User:	Norfaradilla Idayu Ab. Ghafar
Date Deposited:	03 Jan 2025 07:57
Last Modified:	03 Jan 2025 07:57
URI:	http://digitalcollection.utem.edu.my/id/eprint/34462
