From Controllers to Multimodal Input: A Chronological Review of XR Interaction Across Device Generations.
This study provides a chronological analysis of how Extended Reality (XR) interaction techniques have evolved from early controller-centered interfaces to natural hand- and gaze-based input and, more recently, to multimodal input, with a particular focus on the role of XR devices. We collected 46 user-study-based XR interaction papers published between 2016 and 2024, including only studies that explicitly defined their interaction techniques and reported quantitative and/or qualitative evaluation results. For each study, we documented the XR hardware and software development kits (SDKs) used, as well as the input modalities applied (e.g., controller, hand tracking, eye tracking, wrist rotation, multimodal input). These data were analyzed in relation to a device and SDK timeline spanning major platforms from the HTC Vive and Oculus Rift to the Meta Quest Pro and Apple Vision Pro. Using frequency summaries, heatmaps, correspondence analysis, and chi-square tests, we quantitatively compared input modality distributions across device generations. The results reveal three distinct stages of XR interaction development: (1) an early controller-dominant phase centered on the Vive/Rift (2016-2018); (2) a transitional phase marked by the widespread introduction of hand- and gaze-based input through the Oculus Quest, HoloLens 2, and the Hand Tracking SDK (2019-2021); and (3) an expansion phase in which multisensor and multimodal input became central, driven by MR-capable devices such as the Meta Quest Pro (2022-2024). These findings demonstrate that the choice of input modalities in XR research has been structurally shaped not only by researcher preference or task design but also by the sensing configurations, tracking performance, and SDK support of the devices available at each point in time. By reframing XR interaction research within the technological context of device and SDK generations, rather than purely functional taxonomies, this study offers a structured analytical framework for informing future multimodal and context-adaptive XR interface design and for guiding user studies involving next-generation XR devices.
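The modality-by-generation comparison the abstract describes lends itself to a standard chi-square test of independence on a contingency table of study counts. The following is a minimal Python sketch of that analysis step; the table below uses invented placeholder counts for illustration only, not the study's actual data, and the category labels are assumptions based on the abstract.

    # Hypothetical illustration of a chi-square test of independence between
    # input modality and device generation. The counts are placeholders,
    # NOT the data reported in the study.
    import numpy as np
    from scipy.stats import chi2_contingency

    # Rows: input modality; columns: device generation.
    modalities = ["controller", "hand tracking", "eye tracking", "multimodal"]
    generations = ["2016-2018", "2019-2021", "2022-2024"]
    observed = np.array([
        [10, 6, 2],   # controller-based studies
        [ 1, 7, 5],   # hand-tracking studies
        [ 0, 3, 4],   # eye-tracking studies
        [ 0, 2, 6],   # multimodal studies
    ])

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
    # A small p-value indicates that the modality distribution is not
    # independent of device generation, consistent with the staged shift
    # the review describes. With counts this sparse, expected cell
    # frequencies should be checked before trusting the asymptotic test.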