Multimodal emotion recognition from audio-visual information on two benchmark datasets, RAVDESS and SAVEE, using CNNs (Convolutional Neural Networks)
Updated Sep 11, 2021 - Jupyter Notebook
deep_video_extraction is a repository for extracting deep feature representations from video inputs using pre-trained models. It supports both visual and audio features from videos, and audio can also be processed separately by converting it into spectrograms.
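As a rough illustration of the audio-to-spectrogram step mentioned above, here is a minimal sketch using only NumPy. It is not taken from deep_video_extraction; the function name `log_spectrogram`, the frame length, the hop size, and the synthetic test tone are all assumptions chosen for the example.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram of shape (freq_bins, frames).

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")

    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop

    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])

    # Magnitude of the one-sided FFT of every frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))

    # Log compression; the small constant avoids taking log(0).
    return np.log(magnitude + 1e-10).T


# Assumed input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
spec = log_spectrogram(np.sin(2 * np.pi * 440 * t))
```

With these assumed settings `spec` has shape `(129, 61)`: 129 frequency bins by 61 time frames. A 2-D array like this can be treated as a single-channel image and passed to a pre-trained image model, which is one common way spectrograms are used for audio feature extraction.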
Official Codebase of "Audio-visual Generalized Zero-shot Learning the Easy Way" (ECCV 2024)