This paper studies image alignment, the problem of learning a shape and appearance model from labeled data and efficiently fitting the model to a non-rigid object with large variations.
Given a set of images with manually labeled landmarks, our model representation consists of a shape component, represented by a point distribution model, and an appearance component, represented by a collection of local features trained discriminatively as a two-class classifier using boosting. Images with ground-truth landmarks serve as positive training samples, while images with perturbed landmarks serve as negatives. Enabled by piece-wise affine warping, corresponding local feature positions across all training samples form a hypothesis space for boosting. Image alignment is performed by maximizing the boosted classifier score, which serves as our distance measure, through iteratively mapping the feature positions to the image and computing the gradient direction of the score with respect to the shape parameters. The authors apply this approach to human body alignment from surveillance-type images. They conduct experiments on the MIT pedestrian database, where the body size is approximately 110 × 46 pixels, and demonstrate real-time alignment capability. (Published abstract provided)
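The alignment procedure described above, ascending the boosted classifier score with respect to the shape parameters, can be sketched in miniature. Everything here is illustrative: `boosted_score` is a smooth stand-in for the learned classifier (not the paper's actual weak learners), and the gradient is taken numerically rather than analytically through the warp.

```python
import numpy as np

def boosted_score(p):
    # Stand-in for the boosted classifier score F(p); in the paper this is
    # a sum of weak learners evaluated on warped local features. Here it is
    # a toy smooth function peaking at hypothetical "true" shape parameters.
    p_true = np.array([0.5, -0.3, 0.1])
    return -np.sum((p - p_true) ** 2)

def numerical_gradient(f, p, eps=1e-5):
    # Central-difference gradient of the score w.r.t. shape parameters.
    g = np.zeros_like(p)
    for i in range(p.size):
        d = np.zeros_like(p)
        d[i] = eps
        g[i] = (f(p + d) - f(p - d)) / (2 * eps)
    return g

def align(p0, step=0.1, iters=200):
    # Gradient-ascent alignment: repeatedly move the shape parameters p
    # in the direction that increases the classifier score.
    p = p0.astype(float).copy()
    for _ in range(iters):
        p += step * numerical_gradient(boosted_score, p)
    return p

p_hat = align(np.zeros(3))  # converges toward the score's maximizer
```

The design choice the abstract highlights is that the classifier score doubles as the distance measure, so fitting the model reduces to this kind of score ascent rather than minimizing a separate reconstruction error.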