IEEE Trans Vis Comput Graph. 2022 Dec.
Mathias Parger, Chengcheng Tang, Yuanlu Xu, Christopher D Twigg, Lingling Tao, Yijing Li, Robert Wang, Markus Steinberger
Tracking body and hand motions in 3D space is essential for social and self-presence in augmented and virtual environments. Unlike the popular 3D pose estimation setting, the problem is often formulated as egocentric tracking based on embodied perception (e.g., egocentric cameras, handheld sensors). In this article, we propose a new data-driven framework for egocentric body tracking, targeting the challenge of omnipresent occlusions in optimization-based methods (e.g., inverse kinematics solvers). We first collect a large-scale motion capture dataset with both body and finger motions using optical markers and inertial sensors. This dataset focuses on social scenarios and captures ground-truth poses under self-occlusions and body-hand interactions. We then simulate the occlusion patterns of head-mounted camera views on the captured ground truth using a ray casting algorithm and train a deep neural network to infer the occluded body parts. Our experiments show that the proposed method generates high-fidelity embodied poses when applied to real-time egocentric body tracking, finger motion synthesis, and 3-point inverse kinematics.
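The occlusion simulation mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation: it approximates an occluding body part with a single sphere and marks a joint as occluded if the ray from a head-mounted camera to the joint passes through that sphere. All names and values here are illustrative assumptions.

```python
import numpy as np

def ray_hits_sphere(origin, target, center, radius):
    """Return True if the segment origin->target passes through the sphere."""
    d = target - origin
    length = np.linalg.norm(d)
    d = d / length
    oc = center - origin
    t = np.dot(oc, d)                  # projection of the center onto the ray
    if t < 0 or t > length:
        return False                   # closest point lies outside the segment
    closest = origin + t * d
    return np.linalg.norm(center - closest) < radius

camera = np.array([0.0, 1.7, 0.0])         # head-mounted camera position
wrist = np.array([0.0, 1.0, 0.6])          # joint to test
torso = (np.array([0.0, 1.3, 0.3]), 0.15)  # (center, radius) sphere occluder

occluded = ray_hits_sphere(camera, wrist, *torso)
```

In a full pipeline, each joint of the ground-truth skeleton would be tested against a set of such occluders per frame, producing the visibility labels used to train the network.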
Please download the UNOC dataset and extract the BVH files.
Install Python and PyTorch. Please follow the PyTorch installation instructions from their website.
Install dependencies by running pip install -r requirements.txt
Before running any scripts, please set the environment variables
- PATH_UNOC to the root directory of the UNOC dataset (containing the 13 participants).
- PATH_DATA to a directory that can be used for temporary data, such as preprocessed feature sets and model weights.
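Before launching the scripts, it can help to verify that both variables are actually visible to Python. This small helper is our suggestion and not part of the repo:

```python
import os

def check_env(required=("PATH_UNOC", "PATH_DATA")):
    """Return the names of required environment variables that are not set."""
    return [var for var in required if var not in os.environ]

missing = check_env()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

If the list is non-empty, export the missing variables in your shell (or set them in your IDE's run configuration) before running train.py or eval.py.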
Once the environment variables are set, you can run the training script train.py. On the first run, this will process the BVH files and save the pose information as NumPy files in PATH_UNOC. After the conversion is done, the input and output feature sets are created and saved to PATH_DATA to speed up training. Training starts once all feature sets are converted and should finish after a few minutes.
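The cache-on-first-run behavior described above follows a common pattern: expensive preprocessing runs once, and later runs load the saved .npy file instead. A minimal sketch (function and file names are hypothetical, not the repo's API):

```python
import numpy as np
from pathlib import Path

def load_features(cache_dir, name, compute):
    """Load cache_dir/name.npy, computing and saving the array on a cache miss."""
    cache = Path(cache_dir) / f"{name}.npy"
    if cache.exists():
        return np.load(cache)          # fast path: reuse preprocessed features
    features = compute()               # slow path: run preprocessing once
    cache.parent.mkdir(parents=True, exist_ok=True)
    np.save(cache, features)
    return features
```

Deleting the cached files in PATH_DATA is the usual way to force preprocessing to run again, e.g. after changing the feature definition.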
train.py can be run in two modes:
- Predicting the full body pose from incomplete tracking information
python train.py body
- Predicting finger pose from body pose
python train.py finger
Like train.py, eval.py can be run in two modes:
- Predicting the full body pose from incomplete tracking information
python eval.py body
- Predicting finger pose from body pose
python eval.py finger
Both modes accept plot as an additional argument, which animates the predicted motions in an interactive 3D view.
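The command-line interface above can be sketched as follows. This is a hedged guess at the parsing; the actual argument handling in train.py and eval.py may differ:

```python
import sys

def parse_args(argv):
    """Return (mode, plot) from an argv like ['eval.py', 'body', 'plot']."""
    mode = argv[1] if len(argv) > 1 else "body"
    if mode not in ("body", "finger"):
        raise ValueError(f"unknown mode: {mode}")
    plot = "plot" in argv[2:]          # optional flag enables the 3D view
    return mode, plot

if __name__ == "__main__":
    print(parse_args(sys.argv))
```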
Try out our interactive web player to preview some examples of UNOC motion.
The dataset is available in different file formats and at different stages of processing:
- Solved and calibrated body and hand pose (BVH): This data is used for training and evaluating the neural network.
- Solved body without hand motion (FBX)
- Solved hand motion (BVH): Motion of hands attached to a static avatar. Beware that the wrist rotation is not accurate; only the finger joint motion is used in the merged animation.
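Because the hand-only clips have unreliable wrist rotations, merging them with a body animation means taking finger joint rotations from the hand clip while keeping the body clip's wrist. A toy sketch of that per-frame merge (joint names and the dict-of-rotations representation are hypothetical):

```python
def merge_frame(body_rots, hand_rots):
    """Overlay finger rotations from the hand clip onto the body frame,
    keeping the body clip's wrist rotation (the hand clip's is inaccurate)."""
    merged = dict(body_rots)
    for joint, rot in hand_rots.items():
        if "Wrist" not in joint:       # skip the unreliable wrist channels
            merged[joint] = rot
    return merged
```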
- Cleaned body markers (C3D): We used the Optitrack Biomech 57 markerset.
UNOC is released under the MIT license. See LICENSE for details.
See also our Terms of Use and Privacy Policy.