
My Role
Team lead -- Architecture Design, Data Preparation, Model Training
Team
-
Tech Stacks
Python | TensorFlow | Pandas | CNN | LSTM
Overview
Develop a speech emotion recognition model using two approaches, a convolutional neural network (CNN) and a Long Short-Term Memory (LSTM) network, and compare the results of these two deep learning methods.
Architecture
The system classifies emotions in speech by combining data augmentation, preprocessing with MFCC (Mel-Frequency Cepstral Coefficient) feature extraction, and two classifiers: an LSTM and a CNN. Data collection, processing, and training are integrated into a single pipeline, which keeps development fast and scalable while maintaining accuracy on real-world audio.

Diagram of the Speech Emotion Recognition system model showcasing data flow and model components.
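As a rough sketch of the MFCC preprocessing stage described above, the pipeline can be reduced to framing, windowing, a power spectrum, a mel filterbank, a log, and a DCT. The frame sizes and filter counts below are assumptions, not values from this project, which may well use a library such as librosa instead:

```python
import numpy as np

def mfcc_features(signal, sr=16000, frame_len=400, hop=160,
                  n_mels=26, n_mfcc=13):
    """Simplified MFCC extraction: frame -> FFT -> mel filterbank -> log -> DCT."""
    # Split the waveform into overlapping frames and window them
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    frames *= np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # power spectrum

    # Triangular mel filterbank (mel scale spaces filters like human hearing)
    n_bins = power.shape[1]
    mel_max = 2595 * np.log10(1 + (sr / 2) / 700)
    hz_pts = 700 * (10 ** (np.linspace(0, mel_max, n_mels + 2) / 2595) - 1)
    bins = np.floor(frame_len * hz_pts / sr).astype(int)
    fb = np.zeros((n_mels, n_bins))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        for k in range(l, c):
            fb[m - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[m - 1, k] = (r - k) / max(r - c, 1)

    log_mel = np.log(power @ fb.T + 1e-10)

    # DCT-II decorrelates the log-mel energies; keep the first n_mfcc coefficients
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), 2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T

# Example: one second of a synthetic 440 Hz tone
sr = 16000
t = np.arange(sr) / sr
feats = mfcc_features(np.sin(2 * np.pi * 440 * t), sr=sr)
```

The result is a (frames, coefficients) matrix that both the LSTM (as a sequence over frames) and the CNN (as a 2-D feature map) can consume.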
Model Performance: LSTM
The LSTM model achieves an accuracy of 61.02%, demonstrating its ability to classify emotions in speech data. Its reported precision, recall, and F1 score are 0.6806, 0.5533, and 0.6806 respectively, with precision noticeably higher than recall. The training and testing loss and accuracy curves indicate consistent learning, though there is room to improve generalization.

Graphs displaying training/testing loss and accuracy over epochs, along with performance metrics table.
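The precision, recall, and F1 figures quoted for both models can in principle be recomputed from the predictions. A minimal sketch, assuming macro-averaging over the emotion classes (the page does not state which averaging the project used):

```python
import numpy as np

def macro_metrics(y_true, y_pred, n_classes):
    """Macro-averaged precision, recall, and F1 over all emotion classes."""
    precisions, recalls = [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))   # true positives for class c
        fp = np.sum((y_pred == c) & (y_true != c))   # false positives
        fn = np.sum((y_pred != c) & (y_true == c))   # false negatives
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    p, r = np.mean(precisions), np.mean(recalls)
    f1 = 2 * p * r / (p + r) if p + r else 0.0       # harmonic mean of p and r
    return p, r, f1

# Toy labels for three hypothetical emotion classes
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
p, r, f1 = macro_metrics(y_true, y_pred, n_classes=3)
```

Note that F1 is the harmonic mean of precision and recall, so it always lies between the two.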
Model Performance: CNN
The CNN model achieves an accuracy of 96.52%, substantially outperforming the LSTM on this task. With precision, recall, and F1 scores of 0.9664, 0.9643, and 0.9653 respectively, it performs strongly and consistently across the key metrics. The training and testing loss curves show smooth convergence, and the accuracy curves indicate robust generalization.

Graphs displaying training/testing loss and accuracy over epochs, along with performance metrics table.