News Presenter skills evaluation using multi-modality and machine learning

Date

2023-07

Type

Article

Publisher

IEEE

Series Info

1st International Conference of Intelligent Methods, Systems and Applications, IMSA 2023; Pages 124-129, 2023

Abstract

Assessing television presenters is a challenging yet essential task because the evaluation must account for many characteristics. A multi-modal approach is employed, drawing on data sources such as eye gaze, gestures, and facial expressions. Automating this process is crucial because presenter evaluation is exhaustive: assessors must judge the presenter on all of these features. This paper proposes a system that assesses the presenter on four key features: posture, eye contact, facial expression, and voice. Each feature is assigned a weight, and the presenter receives a grade based on their performance on each feature. The present study focuses on facial emotion, eye tracking, and physical posture. The presenter's elbow, shoulder, and nose joints were extracted and used as inputs to posture classifiers from three categories: machine learning algorithms, template-based algorithms, and deep learning algorithms. Eye gaze was analyzed with distance measures such as Euclidean distance and Manhattan distance, while facial expression analysis was conducted using the DeepFace library. The proposed system achieved an accuracy of 92% using SVM among the machine learning algorithms, 75% using dollarpy as the distance algorithm, and 79% using BiLSTM as the deep learning model. The data set used in this study was collected from the Faculty of Mass Communication, MSA University.
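As a rough illustration of two ideas named in the abstract, the sketch below computes Euclidean and Manhattan distances between two 2-D points and combines per-feature scores into a weighted grade. It is not the authors' implementation: the abstract does not say which points the gaze distances are measured between, what the feature weights are, or how scores are scaled, so the function names, the example coordinates, and the weights are all hypothetical.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def manhattan(p, q):
    """Sum of absolute coordinate differences between two 2-D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def weighted_grade(scores, weights):
    """Weighted average of per-feature scores.

    `scores` and `weights` are dicts keyed by feature name. The weights
    are normalised by their sum, so they need not add up to 1.
    """
    total = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total


# Hypothetical values, chosen only to make the example concrete.
gaze_point = (3.0, 4.0)      # assumed estimated gaze position
camera_point = (0.0, 0.0)    # assumed reference point
scores = {"posture": 0.8, "eye_contact": 0.6}
weights = {"posture": 1.0, "eye_contact": 1.0}

gaze_euclidean = euclidean(gaze_point, camera_point)
gaze_manhattan = manhattan(gaze_point, camera_point)
grade = weighted_grade(scores, weights)
```

Normalising by the sum of the weights keeps the grade on the same scale as the individual scores, whatever weights an assessor chooses.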

Keywords

Eye gaze; facial emotions; posture
