Method Overview. Given multiple sparse video views and 3D poses as inputs, TAVA creates a virtual actor consisting of implicit shape, appearance, and skinning weights in the canonical space, which is ready to be animated and rendered even with out-of-distribution poses. Dense correspondences across views and poses can also be established, which enables content editing during rendering. Without requiring a body template, our method can be directly applied to creatures beyond humans, as long as a skeleton can be defined.
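To make the role of the canonical skinning weights concrete, the sketch below shows standard linear blend skinning, which warps canonical-space points into a posed space given per-point bone weights and per-bone rigid transforms. This is a minimal illustration of the general technique, not TAVA's actual implementation; all function and variable names are ours.

```python
import numpy as np

def blend_skinning(x_canonical, weights, bone_transforms):
    """Warp canonical points into posed space with linear blend skinning.

    x_canonical:     (N, 3) points in the canonical (rest) space
    weights:         (N, B) per-point skinning weights over B bones (rows sum to 1)
    bone_transforms: (B, 4, 4) rigid transform of each bone for the target pose
    """
    n = x_canonical.shape[0]
    # Homogeneous coordinates: (N, 4)
    x_h = np.concatenate([x_canonical, np.ones((n, 1))], axis=1)
    # Blend the per-bone transforms by the skinning weights: (N, 4, 4)
    blended = np.einsum('nb,bij->nij', weights, bone_transforms)
    # Apply each point's blended transform and drop the homogeneous coordinate.
    return np.einsum('nij,nj->ni', blended, x_h)[:, :3]

# Toy example: one point influenced equally by two bones; the second bone
# translates by +1 along x, so the point moves halfway.
pts = np.array([[0.0, 0.0, 0.0]])
w = np.array([[0.5, 0.5]])
T = np.stack([np.eye(4), np.eye(4)])
T[1, 0, 3] = 1.0
print(blend_skinning(pts, w, T))  # -> [[0.5 0.  0. ]]
```

In TAVA the skinning weights are learned as part of the canonical representation rather than taken from a template, which is what removes the need for a body model such as SMPL.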
In the video below, we show a demo of reposing multiple learned actors (left) using the same motion sequence from the AIST++ Dance Motion Dataset (right), in which the poses are novel and out of the distribution of the training poses.
* The bent back in our rendering results comes from inaccurate AIST++ motion estimation.
In the following videos, we show our results for dense correspondences. The colors indicate the canonical correspondence of each pixel.
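A common way to render such correspondence visualizations is to color each pixel by the canonical 3D coordinate it maps to, so that the same surface point keeps the same color across views and poses. The sketch below shows one such scheme (normalizing by a canonical-space bounding box); the exact color mapping used in our videos may differ, and the names here are illustrative.

```python
import numpy as np

def canonical_to_color(x_canonical, bbox_min, bbox_max):
    """Map canonical-space 3D coordinates to RGB colors in [0, 1].

    Pixels whose rays hit the same canonical location receive the same
    color regardless of the view or the pose, which makes dense
    correspondences visible as consistent coloring.
    """
    bbox_min = np.asarray(bbox_min, dtype=float)
    bbox_max = np.asarray(bbox_max, dtype=float)
    # Normalize each coordinate into [0, 1] over the canonical bounding box,
    # then read (x, y, z) directly as (r, g, b).
    t = (np.asarray(x_canonical, dtype=float) - bbox_min) / (bbox_max - bbox_min)
    return np.clip(t, 0.0, 1.0)

# A point at the center of a unit bounding box maps to mid-gray.
print(canonical_to_color([[0.5, 0.5, 0.5]], [0, 0, 0], [1, 1, 1]))  # -> [[0.5 0.5 0.5]]
```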
In the following videos, we show comparisons between our method and different baselines on the task of novel-pose synthesis using a sequence of poses never seen during training.
* "Ours (robust)" denotes our results with shading factorized out, thus is more robust to novel pose rendering.
* Both "Animatable-NeRF" and "NeuralBody" require SMPL body templates while other approaches don't.