3.4 Pair Interaction Feature The interaction pattern between two individuals is encoded by a spatial descriptor with view-invariant relative pose encoding. Given the 3D locations of two individual detections z_i, z_j and two pose features p_i, p_j, we represent the pairwise relationship using view normalization, pose co-occurrence encoding, semantic compression and a spatial histogram (see Fig. 5 for illustration). The view normalization is performed by rotating the two people in 3D space by θ with respect to their midpoint, making their connecting line perpendicular to the camera view point. In this step, the pose features are also shifted accordingly (e.g. if θ = 45°, shift by 1 dimension cyclically). Then, the co-occurrence feature is obtained by building a 2-dimensional matrix in which each element (r, c) corresponds to min(p_i(r), p_j(c)). Although the feature is view invariant, there are still elements in the matrix that deliver the same semantic concepts (e.g. left-left and right-right). To reduce such unnecessary variance and obtain a compact representation, we perform another transformation by multiplying the vector form of the co-occurrence feature by a semantic compression matrix S_c. The matrix S_c is learned offline by enumerating all possible configurations of view points and grouping the pairs that are equivalent when rotated by 180 degrees. Finally, we obtain the pair interaction descriptor by building a spatial histogram based on the 3D distance between the two (bin centers at 0.2, 0.6, 2.0 and 6.5 m). Here, we use linear interpolation similarly to the contextual feature in Sec. 3.3. Given the interaction descriptor for each pair, we represent the interaction feature φ_xx(x_i, x_j) using the confidence value from an SVM classifier trained on a dictionary of interaction labels Y.
What does this mean?
This passage describes a method for encoding the interaction pattern between two individuals using a spatial descriptor with view-invariant relative pose encoding. Given the 3D locations z_i, z_j of the two individual detections and their two pose features p_i, p_j, the pairwise relationship is represented through four steps: view normalization, pose co-occurrence encoding, semantic compression, and a spatial histogram (illustrated in Fig. 5 of the paper).
View normalization is performed by rotating the two people in 3D space by θ about their midpoint so that the line connecting them is perpendicular to the camera viewing direction. In this step the pose features are shifted accordingly (e.g. if θ = 45°, the feature is cyclically shifted by one dimension). The co-occurrence feature is then obtained by building a 2D matrix whose element (r, c) equals min(p_i(r), p_j(c)). Although this feature is view-invariant, some elements of the matrix still convey the same semantic concept (e.g. left-left and right-right). To remove this unnecessary variance and obtain a compact representation, the vectorized co-occurrence feature is multiplied by a semantic compression matrix S_c, which is learned offline by enumerating all possible viewpoint configurations and grouping the pairs that become equivalent under a 180° rotation. Finally, the pair interaction descriptor is obtained by building a spatial histogram over the 3D distance between the two people (bin centers at 0.2, 0.6, 2.0 and 6.5 m), using linear interpolation as for the contextual feature in Sec. 3.3. Given the interaction descriptor for each pair, the interaction feature φ_xx(x_i, x_j) is represented by the confidence values of an SVM classifier trained on a dictionary of interaction labels Y.
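To make the first three steps concrete, here is a minimal NumPy sketch. It assumes the pose feature is a histogram over 8 discrete view orientations (N_BINS), that the rotation maps to a cyclic shift of one bin per 45°, and an x-z ground-plane convention; the names view_normalize, cooccurrence and semantic_compression_matrix are illustrative, not from the paper.

```python
import numpy as np

N_BINS = 8  # assumption: pose feature = histogram over 8 discrete view orientations


def view_normalize(z_i, z_j, p_i, p_j):
    """Rotate the pair about their midpoint so the line connecting them is
    perpendicular to the camera viewing direction; apply the same rotation
    to the pose histograms as a cyclic shift (one bin per 360/N_BINS deg)."""
    d = z_j - z_i
    theta = np.arctan2(d[2], d[0])  # assumption: x-z is the ground plane
    shift = int(round(theta / (2 * np.pi / N_BINS))) % N_BINS
    return np.roll(p_i, -shift), np.roll(p_j, -shift)


def cooccurrence(p_i, p_j):
    """2-D co-occurrence matrix with element (r, c) = min(p_i(r), p_j(c))."""
    return np.minimum.outer(p_i, p_j)


def semantic_compression_matrix(n=N_BINS):
    """Build S_c offline: sum the co-occurrence entries that are equivalent
    when the pair is rotated by 180 deg, which swaps the two people and
    flips both orientations by n/2 bins (e.g. left-left vs right-right)."""
    groups = {}
    for r in range(n):
        for c in range(n):
            r2, c2 = (c + n // 2) % n, (r + n // 2) % n  # 180-deg counterpart
            key = min((r, c), (r2, c2))
            groups.setdefault(key, []).append(r * n + c)
    Sc = np.zeros((len(groups), n * n))
    for k, idx in enumerate(groups.values()):
        Sc[k, idx] = 1.0
    return Sc


# Usage on dummy data: normalize the view, encode co-occurrence, compress.
rng = np.random.default_rng(0)
z_i, z_j = np.array([0.0, 0.0, 3.0]), np.array([1.0, 0.0, 3.5])
p_i, p_j = rng.random(N_BINS), rng.random(N_BINS)
p_i, p_j = view_normalize(z_i, z_j, p_i, p_j)
compressed = semantic_compression_matrix() @ cooccurrence(p_i, p_j).ravel()  # 36-D
```

With 8 orientation bins the 64 co-occurrence entries collapse into 36 equivalence classes, so the compressed feature is 36-dimensional under these assumptions.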
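The last two steps, the distance-binned spatial histogram and the SVM-based interaction feature, could look like the sketch below. The bin centers are taken from the paper; the choice of scikit-learn's SVC, the one-vs-rest confidence scores, the 36-dimensional compressed feature (what the 8-bin grouping above yields), and the random training data are all assumptions for illustration.

```python
import numpy as np
from sklearn.svm import SVC  # assumption: any SVM exposing confidence scores works

BIN_CENTERS = np.array([0.2, 0.6, 2.0, 6.5])  # metres (from the paper)


def spatial_histogram(feat, dist):
    """Distribute the compressed co-occurrence feature over distance bins,
    linearly interpolating between the two nearest bin centers (cf. Sec. 3.3)."""
    w = np.zeros(len(BIN_CENTERS))
    if dist <= BIN_CENTERS[0]:
        w[0] = 1.0
    elif dist >= BIN_CENTERS[-1]:
        w[-1] = 1.0
    else:
        k = int(np.searchsorted(BIN_CENTERS, dist)) - 1
        t = (dist - BIN_CENTERS[k]) / (BIN_CENTERS[k + 1] - BIN_CENTERS[k])
        w[k], w[k + 1] = 1.0 - t, t
    return np.outer(w, feat).ravel()  # one weighted copy of feat per bin


# phi_xx(x_i, x_j): confidence values of an SVM trained on interaction labels Y.
# In the paper this is trained offline on labelled pairs; random data stands in here.
rng = np.random.default_rng(0)
X_train = rng.random((40, 4 * 36))   # 36 = compressed feature dim (assumed above)
y_train = rng.integers(0, 3, 40)     # 3 interaction classes (assumed)
svm = SVC(kernel="linear", decision_function_shape="ovr").fit(X_train, y_train)

descriptor = spatial_histogram(rng.random(36), dist=1.3)  # 3D distance between the pair
phi = svm.decision_function(descriptor[None, :])          # per-label confidence scores
```

A pair at 1.3 m falls halfway between the 0.6 m and 2.0 m bin centers, so its feature is split equally between those two bins, which keeps the descriptor smooth as the distance varies.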