Virtual agents as a scalable tool for diverse, robust gesture recognition.
Original Publication: Austin, Tex. : Psychonomic Society, c2005-
Gesture recognition technology is a popular area of research, with applications in many fields, including behaviour research, human-computer interaction (HCI), medical research, and surveillance. However, the large quantity of data needed to train a recognition algorithm is not always available, and differences between the training set and one's own research data in factors such as recording conditions and participant characteristics may hinder transferability. To address these issues, we propose training and testing recognition algorithms on virtual agents, a tool that has not yet been used for this purpose in multimodal communication research. We provide an example use case with step-by-step instructions, using motion-capture (mocap) data to animate a virtual agent and customising lighting conditions, backgrounds, and camera angles to build a virtual agent-only dataset for training and testing a gesture recognition algorithm. This approach also allows us to assess the impact of particular features, such as background and lighting. Our best-performing model achieved an accuracy of 85.9% in optimal background and lighting conditions; when background clutter and reduced lighting were introduced, accuracy dropped to 71.6%. When the virtual agent-trained model was tested on images of humans, the accuracy of target handshape classification ranged from 72% to 95%. The results suggest that training an algorithm on artificial data (1) is a resourceful, convenient, and effective way to customise algorithms, (2) potentially addresses issues of data sparsity, and (3) can be used to assess the impact of many contextual and environmental factors that would not be feasible to systematically assess using human data.
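The per-condition evaluation described in the abstract (comparing classification accuracy under optimal versus cluttered, dimly lit rendering conditions) can be sketched in plain Python. Everything below is illustrative only: the handshape labels, condition names, and success probabilities are invented stand-ins, not the paper's actual data, classes, or code.

```python
import random

# Hypothetical evaluation records: each rendered frame of the virtual agent
# carries its ground-truth handshape, the model's prediction, and the
# rendering conditions under which the frame was generated.
random.seed(0)
HANDSHAPES = ["flat", "fist", "point", "open"]
CONDITIONS = [("plain", "bright"), ("plain", "dim"),
              ("cluttered", "bright"), ("cluttered", "dim")]

def make_record(background, lighting):
    true = random.choice(HANDSHAPES)
    # Assumption for the sketch: harder conditions lower the chance
    # that the (simulated) classifier predicts the correct handshape.
    p_correct = 0.86 if (background, lighting) == ("plain", "bright") else 0.72
    pred = true if random.random() < p_correct else random.choice(HANDSHAPES)
    return {"true": true, "pred": pred,
            "background": background, "lighting": lighting}

records = [make_record(bg, lt) for bg, lt in CONDITIONS for _ in range(500)]

def accuracy_by_condition(records):
    """Classification accuracy grouped by (background, lighting)."""
    totals, hits = {}, {}
    for r in records:
        key = (r["background"], r["lighting"])
        totals[key] = totals.get(key, 0) + 1
        hits[key] = hits.get(key, 0) + (r["true"] == r["pred"])
    return {k: hits[k] / totals[k] for k in totals}

for cond, acc in sorted(accuracy_by_condition(records).items()):
    print(cond, round(acc, 3))
```

The point of grouping by rendering condition rather than reporting a single overall accuracy is that it isolates which environmental factor degrades performance; with virtual agents, each condition can be re-rendered from the same mocap takes, so condition is the only variable that changes.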
(© 2026. The Author(s).)
Declarations. Conflicts of interest/Competing interests: The authors declare no conflicts of interest or competing interests. Ethics approval: As no data from persons besides the authors were used in this study, no ethics approval was required. Consent to participate: All participants gave their informed consent to be part of this study. Consent for publication: Only data from the authors were used in this study, and all authors gave their consent to publish these data.