(3D-HPE)Learning to Estimate 3D Human Pose and Shape from a Single Color Image

Learning to Estimate 3D Human Pose and Shape from a Single Color Image (CVPR 2018)


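This paper addresses the problem of estimating the full-body 3D pose and shape of a person from a single color image.

The authors incorporate the SMPL model into an end-to-end network, so that a detailed 3D mesh can be obtained from only a small number of parameters. The paper also shows that these parameters can be predicted from 2D keypoints and masks, which removes the need for large amounts of 3D shape ground truth. During training, a 3D mesh is generated from the estimated parameters, and a 3D per-vertex loss is used to optimize the surface. Finally, a differentiable renderer projects the 3D mesh onto the image, which allows the network to be refined further by optimizing the consistency of the projection with the 2D annotations (2D keypoints and human masks).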




3D per-vertex loss

Since the SMPL function $M(\beta, \theta; \Phi)$, which maps the shape parameters $\beta$ and pose parameters $\theta$ to mesh vertices, is differentiable, we can backpropagate through it and treat the Mesh Generator as a layer of the network that has no trainable parameters. Given the predicted mesh vertices $\hat{P}_i$ and the corresponding ground truth $P_i$, the network can be supervised with a 3D per-vertex loss:

$$L_M = \sum_{i=1}^{N} \left\| \hat{P}_i - P_i \right\|_2^2$$
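A minimal sketch of a per-vertex loss, assuming it is the sum of squared Euclidean distances between predicted and ground-truth vertices. The function name `per_vertex_loss` and the toy three-vertex mesh are illustrative assumptions, not code from the paper; a real training setup would compute this inside an autodiff framework so that gradients flow back through the mesh generator.

```python
def per_vertex_loss(pred_vertices, gt_vertices):
    """Sum over vertices of the squared Euclidean distance (hypothetical helper)."""
    if len(pred_vertices) != len(gt_vertices):
        raise ValueError("predicted and ground-truth vertex counts differ")
    total = 0.0
    for p, g in zip(pred_vertices, gt_vertices):
        # Squared L2 distance between one predicted vertex and its ground truth.
        total += sum((pc - gc) ** 2 for pc, gc in zip(p, g))
    return total


# Toy mesh with three vertices (an SMPL mesh has 6890).
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.0, 3.0, 0.0)]

loss = per_vertex_loss(pred, gt)
print(loss)  # 0 + 1 + 4 = 5.0
```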

If the focus is on pose estimation, the network can instead be supervised on the 3D joints $\hat{J}_i$ alone, which are obtained as a linear combination of the mesh vertices. The corresponding loss is:

$$L_J = \sum_{i=1}^{M} \left\| \hat{J}_i - J_i \right\|_2^2$$
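The idea that joints are a linear combination of mesh vertices can be sketched as follows. The 2 x 4 `regressor` matrix, the helper names, and the toy vertices are hypothetical; the actual SMPL joint regressor is a much larger sparse matrix shipped with the model. The loss is assumed to be a sum of squared Euclidean distances over joints.

```python
def regress_joints(regressor, vertices):
    """Compute each joint as a weighted sum of mesh vertices (J = R P)."""
    joints = []
    for weights in regressor:
        joint = [0.0, 0.0, 0.0]
        for w, v in zip(weights, vertices):
            if w == 0.0:
                continue  # the regressor is sparse: most weights are zero
            for k in range(3):
                joint[k] += w * v[k]
        joints.append(tuple(joint))
    return joints


def per_joint_loss(pred_joints, gt_joints):
    """Sum over joints of the squared Euclidean distance."""
    return sum(
        sum((pc - gc) ** 2 for pc, gc in zip(p, g))
        for p, g in zip(pred_joints, gt_joints)
    )


# Hypothetical regressor: joint 0 averages vertices 0 and 1,
# joint 1 averages vertices 2 and 3.
regressor = [[0.5, 0.5, 0.0, 0.0],
             [0.0, 0.0, 0.5, 0.5]]
vertices = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 4.0, 0.0)]

pred_joints = regress_joints(regressor, vertices)
gt_joints = [(1.0, 0.0, 0.0), (0.0, 3.0, 1.0)]
joint_loss = per_joint_loss(pred_joints, gt_joints)
print(pred_joints)  # [(1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
print(joint_loss)   # 1.0
```

Because the joints depend linearly on the vertices, a loss on the joints still sends gradients back to the mesh and therefore to the predicted parameters.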

Differentiable renderer

Projecting the mesh and the 3D joints onto the image gives $\Pi(\hat{P}) = \hat{S}$ and $\Pi(\hat{J}) = \hat{W}$, which correspond to the silhouette and the 2D joints, respectively.
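A rough sketch of how consistency with 2D annotations might be scored once the mesh has been projected. Everything here is an assumption made for illustration: the pinhole camera with a single focal length, the helper names, the weight `mu`, and the squared-error form of both terms. The rasterization that produces the predicted mask is not shown; in practice it comes from a differentiable renderer so that the loss can be backpropagated.

```python
def project_points(points_3d, focal, center):
    """Pinhole projection of camera-space points (z > 0) to pixel coordinates."""
    cx, cy = center
    return [(focal * x / z + cx, focal * y / z + cy) for x, y, z in points_3d]


def keypoint_loss(pred_2d, gt_2d):
    """Sum of squared pixel distances between projected and annotated joints."""
    return sum(
        (pu - gu) ** 2 + (pv - gv) ** 2
        for (pu, pv), (gu, gv) in zip(pred_2d, gt_2d)
    )


def silhouette_loss(pred_mask, gt_mask):
    """Sum of squared differences between two same-sized masks in [0, 1]."""
    return sum(
        (p - g) ** 2
        for pred_row, gt_row in zip(pred_mask, gt_mask)
        for p, g in zip(pred_row, gt_row)
    )


def reprojection_loss(pred_2d, gt_2d, pred_mask, gt_mask, mu=1.0):
    """Weighted sum of the keypoint and silhouette terms (mu is hypothetical)."""
    return mu * keypoint_loss(pred_2d, gt_2d) + silhouette_loss(pred_mask, gt_mask)


# Two 3D joints in front of a toy camera.
joints_3d = [(0.0, 0.0, 2.0), (1.0, -1.0, 2.0)]
joints_2d = project_points(joints_3d, focal=100.0, center=(50.0, 50.0))
gt_2d = [(50.0, 50.0), (97.0, 4.0)]

# 2 x 2 toy masks; the predicted one would come from the renderer.
pred_mask = [[0.0, 1.0], [1.0, 1.0]]
gt_mask = [[0.0, 1.0], [0.0, 1.0]]

total = reprojection_loss(joints_2d, gt_2d, pred_mask, gt_mask, mu=1.0)
print(joints_2d)  # [(50.0, 50.0), (100.0, 0.0)]
print(total)      # 25.0 (keypoints) + 1.0 (silhouette) = 26.0
```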