Audio-Visual Emotion Recognition in Video Clips

dc.contributor.author: Anbarjafari, Gholamreza
dc.contributor.author: Noroozi, Fatemeh
dc.contributor.author: Marjanovic, Marina
dc.contributor.author: Njegus, Angelina
dc.contributor.author: Escalera, Sergio
dc.date.accessioned: 2019-11-05T12:14:35Z
dc.date.available: 2019-11-05T12:14:35Z
dc.date.issued: 2019-01
dc.department: HKÜ, Faculty of Engineering, Department of Electrical and Electronics Engineering
dc.description.abstract: This paper presents a multimodal emotion recognition system based on the analysis of audio and visual cues. From the audio channel, Mel-Frequency Cepstral Coefficients, Filter Bank Energies, and prosodic features are extracted. For the visual part, two strategies are considered. First, geometric relations between facial landmarks, i.e., distances and angles, are computed. Second, each emotional video is summarized into a reduced set of key-frames, and a convolutional neural network is applied to these key-frames to learn to discriminate visually between the emotions. Finally, the confidence outputs of all the classifiers from all the modalities are used to define a new feature space, which is learned for the final emotion label prediction in a late fusion/stacking fashion. Experiments conducted on the SAVEE, eNTERFACE'05, and RML databases show significant performance improvements by the proposed system in comparison to current alternatives, defining the current state of the art on all three databases.
dc.identifier.citation: Noroozi, F., Marjanovic, M., Njegus, A., Escalera, S., & Anbarjafari, G. (2019). Audio-Visual Emotion Recognition in Video Clips. IEEE Transactions on Affective Computing, 10(1), 60-75.
dc.identifier.doi: 10.1109/TAFFC.2017.2713783
dc.identifier.endpage: 75
dc.identifier.issn: 1949-3045
dc.identifier.issue: 1
dc.identifier.scopus: 2-s2.0-85050265218
dc.identifier.scopusquality: Q1
dc.identifier.startpage: 60
dc.identifier.uri: https://doi.org/10.1109/TAFFC.2017.2713783
dc.identifier.uri: https://hdl.handle.net/20.500.11782/574
dc.identifier.volume: 10
dc.identifier.wos: WOS:000461333200008
dc.identifier.wosquality: Q1
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.relation.ispartof: IEEE TRANSACTIONS ON AFFECTIVE COMPUTING
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/embargoedAccess
dc.subject: Multimodal emotion recognition; classifier fusion; data fusion; convolutional neural networks
dc.title: Audio-Visual Emotion Recognition in Video Clips
dc.type: Article
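
The late fusion/stacking step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each modality classifier already outputs one confidence value per emotion; those vectors are concatenated into a new feature vector, and a second-stage learner is trained on that fused space. Everything specific here is an assumption made for illustration: the function names are hypothetical, the confidence values are made-up toy numbers, and the nearest-centroid rule is only a stand-in for whichever meta-classifier the paper actually uses.

```python
from math import dist

def stack_confidences(modality_scores):
    """Concatenate per-classifier confidence vectors into one fused vector."""
    fused = []
    for scores in modality_scores:
        fused.extend(scores)
    return fused

def fit_centroids(fused_vectors, labels):
    """Stand-in meta-learner: average the fused vectors of each label."""
    sums, counts = {}, {}
    for vec, label in zip(fused_vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def predict(centroids, fused_vector):
    """Return the label whose centroid is closest in the fused space."""
    return min(centroids, key=lambda lab: dist(centroids[lab], fused_vector))

# Toy training data (invented): confidences over (happy, sad) from a
# hypothetical audio classifier and a hypothetical visual classifier.
train = [
    ([0.8, 0.2], [0.7, 0.3], "happy"),
    ([0.6, 0.4], [0.9, 0.1], "happy"),
    ([0.3, 0.7], [0.2, 0.8], "sad"),
    ([0.4, 0.6], [0.1, 0.9], "sad"),
]
fused_train = [stack_confidences([audio, visual]) for audio, visual, _ in train]
train_labels = [label for _, _, label in train]
centroids = fit_centroids(fused_train, train_labels)

# The audio classifier alone leans slightly towards "sad" here, while the
# visual classifier is confident about "happy"; the fused decision uses both.
sample = stack_confidences([[0.45, 0.55], [0.8, 0.2]])
prediction = predict(centroids, sample)
```

With these toy numbers, `prediction` is `"happy"`: the fused vector lies much closer to the "happy" centroid than to the "sad" one, even though the audio scores alone would favour "sad". The point of the sketch is only the data flow, in which the second stage sees classifier confidences rather than raw audio or image features.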

Files

Original bundle

Name: 000461333200008.pdf
Size: 800.97 KB
Format: Adobe Portable Document Format
Description: Article file
