Given an image, we first detect individual people using the Poselet detector [2]. We represent each detection hypothesis bounding box using a combination of the Poselet activation vector, MDP activation vector [33] and HOG descriptor [7]. Instead of using such a high-dimensional vector directly to encode individual properties, we train SVM classifiers [3] equipped with a histogram intersection kernel for individual pose classes and assign the confidence vector (probabilistic estimation) to the individual feature p_i (e.g. we train person vs. no person, standing vs. sitting on an object vs. sitting on the ground, and 8 viewpoints × 3 poses classifiers). These individual pose vectors are used to represent the unary and interaction features. What does this mean?
Posted: 2024-04-18 11:24:19 | Views: 169
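Given an image, the method first detects individual people with the Poselet detector [2]. The bounding box of each detection hypothesis is described by a combination of three descriptors: the Poselet activation vector, the MDP activation vector [33], and the HOG descriptor [7]. Rather than using this high-dimensional vector directly to encode a person's properties, the authors train SVM classifiers [3] with a histogram intersection kernel on individual pose classes, and assign the resulting confidence vector (a probabilistic estimate) to the individual feature p_i. For example, they train classifiers for person vs. no person, for standing vs. sitting on an object vs. sitting on the ground, and for 8 viewpoints × 3 poses. These per-person pose vectors are then used to represent both the unary features and the interaction features.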
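To make the two key ideas concrete, here is a minimal sketch in plain Python of (1) the histogram intersection kernel and (2) how per-classifier confidences could be concatenated into one pose vector p_i. The function names and the 2 + 3 + 24 layout are assumptions made for illustration; the excerpt does not state the exact dimensionality or ordering of p_i, and this is not the authors' code.

```python
from typing import List, Sequence


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Histogram intersection kernel: sum of element-wise minima."""
    if len(a) != len(b):
        raise ValueError("histograms must have the same length")
    return float(sum(min(x, y) for x, y in zip(a, b)))


def gram_matrix(xs: Sequence[Sequence[float]],
                ys: Sequence[Sequence[float]]) -> List[List[float]]:
    """Kernel values between every row of xs and every row of ys."""
    return [[histogram_intersection(x, y) for y in ys] for x in xs]


def build_pose_vector(person_probs: Sequence[float],
                      pose_probs: Sequence[float],
                      view_pose_probs: Sequence[float]) -> List[float]:
    """Concatenate classifier confidences into one low-dimensional vector.

    Assumed layout (hypothetical, not confirmed by the excerpt):
      person_probs    -> 2 values  (person, no person)
      pose_probs      -> 3 values  (standing, sitting on object, sitting on ground)
      view_pose_probs -> 24 values (8 viewpoints x 3 poses)
    """
    if len(person_probs) != 2 or len(pose_probs) != 3 or len(view_pose_probs) != 24:
        raise ValueError("expected 2, 3 and 24 confidence values")
    return list(person_probs) + list(pose_probs) + list(view_pose_probs)


if __name__ == "__main__":
    # Two toy descriptors standing in for the high-dimensional features.
    h1 = [0.2, 0.5, 0.3]
    h2 = [0.1, 0.6, 0.3]
    print(histogram_intersection(h1, h2))

    # Toy confidences standing in for the SVM probability outputs.
    p_i = build_pose_vector([0.9, 0.1], [0.7, 0.2, 0.1], [1.0 / 24] * 24)
    print(len(p_i))
```

A Gram matrix computed this way can be handed to a kernel SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed", probability=True)`, whose `predict_proba` output would play the role of the confidence values here. The point of the design is that the classifier outputs form a much shorter vector than the raw concatenated descriptors.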