Online dynamic gesture recognition for multicamera interactive environments

Clebeson Canuto; Luiz Carlos Cosmi; Alexandre Pereira; Jorge Samatelo; José Santos-Victor; Raquel Vassallo

doi:10.20906/sbai.v1i1.2820

Clebeson Canuto, Universidade Federal do Espírito Santo, Vitória - ES
Luiz Carlos Cosmi, Universidade Federal do Espírito Santo, Vitória - ES
Alexandre Pereira, Instituto Federal do Espírito Santo, Guarapari - ES
Jorge Samatelo, Universidade Federal do Espírito Santo, Vitória - ES
José Santos-Victor, Instituto Superior Técnico, Universidade de Lisboa, Lisboa
Raquel Vassallo, Universidade Federal do Espírito Santo, Vitória - ES

Keywords: Dynamic Gesture Recognition, Interactive Environment, Intelligent Space, Multicamera Environment, Deep Learning

Abstract

This work proposes an online dynamic gesture recognizer for interactive multicamera environments. The approach reconstructs the three-dimensional skeleton of a user in the scene, applies a temporal segmentation (spotting) model to isolate the skeleton sequences that contain gestures, and classifies each segmented sequence into one of the possible gesture classes. The main contributions are threefold: (i) a model that segments a temporal stream of 3D skeletons into gestures and non-gestures; (ii) a model that classifies a sequence of 3D skeleton movements as belonging to one of the possible gesture classes; and (iii) a system that combines the segmentation and classification models to enable online gesture recognition in a multicamera interactive environment. To evaluate the proposal, two gesture datasets were acquired in two distinct interactive spaces. In the experiments, the proposed spotting model obtained an average accuracy of 82% on the test dataset, and the classifier model obtained 72.40%. Together, the two models reached a Jaccard index of 0.76. In terms of processing time, each observation required 70 ms on average when executed on a GPU. The proposed solution is therefore considered effective for online dynamic gesture recognition in multicamera environments.
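
The Jaccard index reported above measures the overlap between predicted and annotated gesture frames. The abstract does not state the exact evaluation protocol, so the following is only a minimal sketch of one common frame-level definition (per-class intersection over union, averaged across gesture classes). The function name `jaccard_index`, the use of label `0` for non-gesture frames, and the example sequences are assumptions made for illustration, not details taken from the paper.

```python
from typing import Sequence


def jaccard_index(pred: Sequence[int], truth: Sequence[int], background: int = 0) -> float:
    """Mean per-class Jaccard index over frame-level labels.

    Assumption: each frame carries one integer label, and `background`
    marks non-gesture frames, which are excluded from the average.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of frames")

    # Gesture classes that appear in either sequence.
    classes = (set(pred) | set(truth)) - {background}
    if not classes:
        # No gesture predicted and none annotated: treat as perfect agreement.
        return 1.0

    scores = []
    for c in sorted(classes):
        # Frames where both sequences agree on class c.
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        # Frames where at least one sequence has class c (never zero here).
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        scores.append(inter / union)
    return sum(scores) / len(scores)


# Hypothetical 12-frame stream with two gestures (classes 1 and 2).
truth = [0, 0, 1, 1, 1, 1, 0, 0, 2, 2, 2, 0]
pred = [0, 1, 1, 1, 1, 0, 0, 0, 2, 2, 0, 0]
score = jaccard_index(pred, truth)
```

In this hypothetical stream, class 1 overlaps on 3 of 5 frames and class 2 on 2 of 3 frames, so the averaged score is (3/5 + 2/3) / 2. Because the metric penalizes both missed and spurious gesture frames, it reflects the combined quality of segmentation and classification.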