Multi-Modal Multi-View 3D Hand Pose Estimation.
With the rapid progress of artificial intelligence (AI) technology and the mobile internet, 3D hand pose estimation has become critical to various intelligent applications, e.g., human-computer interaction. To avoid both the low accuracy of single-modal estimation and the high complexity of traditional multi-modal 3D estimation, this paper proposes a novel multi-modal multi-view (MMV) 3D hand pose estimation system. The system uses a conditional generative adversarial network (cGAN) that jointly combines registration-before-translation (RT) and translation-before-registration (TR) to train a multi-modal registration network, and then employs multi-modal feature fusion to achieve high-quality estimation with low hardware and software costs in both data acquisition and processing. Experimental results demonstrate that the MMV system is effective and feasible in various scenarios, making it promising for a broad range of intelligent applications. [ABSTRACT FROM AUTHOR]
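With the rapid development of artificial intelligence technology and the mobile internet, 3D hand pose estimation has become one of the key technologies in intelligent application areas such as human-computer interaction. To avoid the low accuracy of single-modal estimation and the high complexity of traditional multi-modal 3D estimation, this paper proposes a novel multi-modal multi-view (MMV) 3D hand pose estimation system: it introduces an RT-TR joint conditional generative adversarial network (cGAN) to train a multi-modal registration network and uses multi-modal feature fusion to achieve high-quality estimation, with low hardware and software costs in data acquisition and processing. Experimental results show that the MMV system is effective and feasible in various scenarios and is expected to be widely applied in intelligent fields. [ABSTRACT FROM AUTHOR]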
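The abstract does not say how the multi-modal features are fused, so the following is only a minimal sketch of one common option: late fusion by concatenation, followed by a single linear layer that regresses joint coordinates. The modality names ("depth", "rgb"), the view names, the 8-dimensional features, the 21-joint hand model, and the functions `fuse_features` and `regress_joints` are all hypothetical and not taken from the paper.

```python
import random

NUM_JOINTS = 21  # assumption: a common 21-keypoint hand model, not stated in the abstract


def fuse_features(features_by_source):
    """Concatenate per-source feature vectors into one fused vector.

    features_by_source maps a hypothetical (modality, view) key to a list of floats.
    Keys are sorted so the fused layout is deterministic.
    """
    fused = []
    for key in sorted(features_by_source):
        fused.extend(float(v) for v in features_by_source[key])
    return fused


def regress_joints(fused, weights, bias):
    """Apply one linear layer and reshape the output into [x, y, z] triples."""
    flat = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return [flat[i:i + 3] for i in range(0, len(flat), 3)]


random.seed(0)

# Placeholder features; a real system would take these from per-modality encoders.
features = {
    ("depth", "view0"): [random.gauss(0, 1) for _ in range(8)],
    ("rgb", "view0"): [random.gauss(0, 1) for _ in range(8)],
    ("rgb", "view1"): [random.gauss(0, 1) for _ in range(8)],
}

fused = fuse_features(features)

# Random, untrained weights: only the shapes are meaningful here.
weights = [[random.gauss(0, 1) for _ in fused] for _ in range(NUM_JOINTS * 3)]
bias = [0.0] * (NUM_JOINTS * 3)

pose = regress_joints(fused, weights, bias)
print(len(fused), len(pose), len(pose[0]))
```

Concatenation keeps each source's features intact and leaves the weighting of modalities and views to the regression layer. The registration step described in the abstract would run before this stage so that features from different modalities refer to aligned inputs.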