Disclosed are computer-readable devices, systems, and methods for generating a model of a clothed body. The method includes generating a model of an unclothed human body, the model capturing a shape or a pose of the unclothed human body, determining two-dimensional contours associated with the model, and computing deformations by aligning a contour of a clothed human body with a contour of the unclothed human body. Based on the two-dimensional contours and the deformations, the method includes generating a first two-dimensional model of the unclothed human body, the first two-dimensional model factoring the deformations of the unclothed human body into one or more of a shape variation component, a viewpoint change, and a pose variation, and learning an eigen-clothing model using principal component analysis applied to the deformations, wherein the eigen-clothing model classifies different types of clothing, to yield a second two-dimensional model of a clothed human body.
In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 1378-1385, Columbus, Ohio, USA, IEEE International Conference on Computer Vision and Pattern Recognition, June 2014 (inproceedings)
We consider the intersection of two research fields: transfer learning and statistics on manifolds. In particular, we consider, for manifold-valued data, transfer learning of tangent-space models such as Gaussian distributions, PCA, regression, or classifiers. Though one would hope to simply use ordinary R^n transfer-learning ideas, the manifold structure prevents it. We overcome this by basing our method on inner-product-preserving parallel transport, a well-known tool widely used in other problems of statistics on manifolds in computer vision. At first, this straightforward idea seems to suffer from an obvious shortcoming: transporting large datasets is prohibitively expensive, hindering scalability. Fortunately, with our approach, we never transport data. Rather, we show how the statistical models themselves can be transported, and prove that for the tangent-space models above, the transport “commutes” with learning. Consequently, our compact framework, applicable to a large class of manifolds, is not restricted by the size of either the training or test sets. We demonstrate the approach by transferring PCA and logistic-regression models of real-world data involving 3D shapes and image descriptors.
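The claim that transport "commutes" with learning can be illustrated on a toy case. The following is a minimal sketch, assuming the unit sphere as the manifold and synthetic tangent vectors as data; the function names `transport_matrix` and `tangent_pca` are hypothetical and not taken from the paper. It checks that running PCA on transported tangent vectors gives the same result as transporting the PCA basis learned at the source point.

```python
import numpy as np

def transport_matrix(p, q):
    """Linear map that parallel-transports tangent vectors from T_p to T_q
    along the shortest geodesic of the unit sphere (requires p != -q)."""
    return np.eye(3) - np.outer(p + q, q) / (1.0 + p @ q)

def tangent_pca(V):
    """PCA of tangent vectors stored as rows of V.
    Returns eigenvalues in descending order and eigenvectors as columns."""
    Vc = V - V.mean(axis=0)
    C = Vc.T @ Vc / len(Vc)
    w, U = np.linalg.eigh(C)
    order = np.argsort(w)[::-1]
    return w[order], U[:, order]

# Source and target base points on the sphere (illustrative choice).
p = np.array([0.0, 0.0, 1.0])
q = np.array([1.0, 0.0, 0.0])

# Synthetic tangent vectors at p: anisotropic noise in the plane orthogonal to p.
rng = np.random.default_rng(0)
xy = rng.normal(size=(200, 2)) * np.array([2.0, 0.5])
V = np.column_stack([xy, np.zeros(len(xy))])

T = transport_matrix(p, q)

# Route 1: transport every data vector, then learn PCA at q.
w_q, U_q = tangent_pca(V @ T.T)

# Route 2: learn PCA at p, then transport only the model (the basis vectors).
w_p, U_p = tangent_pca(V)
U_transported = T @ U_p[:, :2]

# The leading directions agree up to sign, and the variances are identical.
agreement = abs(U_q[:, 0] @ U_transported[:, 0])
```

Route 2 moves only two basis vectors regardless of how many data points there are, which is the scalability argument made in the abstract.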
This technical report is complementary to "Model Transport: Towards Scalable Transfer Learning on Manifolds" and contains proofs, explanation of the attached video (visualization of bases from the body shape experiments), and high-resolution images of select results of individual reconstructions from the shape experiments. It is identical to the supplemental material submitted to the Conference on Computer Vision and Pattern Recognition (CVPR 2014) in November 2013.
Foster, J., Nuyujukian, P., Freifeld, O., Gao, H., Walker, R., Ryu, S., Meng, T., Murmann, B., Black, M., Shenoy, K.
J. of Neural Engineering, 11(4):046020, 2014 (article)
Objective: Motor neuroscience and brain-machine interface (BMI) design is based on examining how the brain controls voluntary movement, typically by recording neural activity and behavior from animal models. Recording technologies used with these animal models have traditionally limited the range of behaviors that can be studied, and thus the generality of science and engineering research. We aim to design a freely-moving animal model using neural and behavioral recording technologies that do not constrain movement.
Approach: We have established a freely-moving rhesus monkey model employing technology that transmits neural activity from an intracortical array using a head-mounted device and records behavior through computer vision using markerless motion capture. We demonstrate the excitability and utility of this new monkey model, including the first recordings from motor cortex while rhesus monkeys walk quadrupedally on a treadmill.
Main results: Using this monkey model, we show that multi-unit threshold-crossing neural activity encodes the phase of walking and that the average firing rate of the threshold crossings covaries with the speed of individual steps. On a population level, we find that neural state-space trajectories of walking at different speeds have similar rotational dynamics in some dimensions that evolve at the step rate of walking, yet robustly separate by speed in other state-space dimensions.
Significance: Freely-moving animal models may allow neuroscientists to examine a wider range of behaviors and can provide a flexible experimental paradigm for examining the neural mechanisms that underlie movement generation across behaviors and environments. For BMIs, freely-moving animal models have the potential to aid prosthetic design by examining how neural encoding changes with posture, environment, and other real-world context changes. Understanding this new realm of behavior in more naturalistic settings is essential for overall progress of basic motor neuroscience and for the successful translation of BMIs to people with paralysis.
Statistical models of non-rigid deformable shape have wide application in many fields, including computer vision, computer graphics, and biometry. We show that shape deformations are well represented through nonlinear manifolds that are also matrix Lie groups. These pattern-theoretic representations lead to several advantages over other alternatives, including a principled measure of shape dissimilarity and a natural way to compose deformations. Moreover, they enable building models using statistics on manifolds. Consequently, such models are superior to those based on Euclidean representations. We demonstrate this by modeling 2D and 3D human body shape. Shape deformations are only one example of manifold-valued data. More generally, in many computer-vision and machine-learning problems, nonlinear manifold representations arise naturally and provide a powerful alternative to Euclidean representations. Statistics is traditionally concerned with data in a Euclidean space, relying on the linear structure and the distances associated with such a space; this renders it inappropriate for nonlinear spaces. Statistics can, however, be generalized to nonlinear manifolds. Moreover, by respecting the underlying geometry, the statistical models result in not only more effective analysis but also consistent synthesis. We go beyond previous work on statistics on manifolds by showing how, even on these curved spaces, problems related to modeling a class from scarce data can be dealt with by leveraging information from related classes residing in different regions of the space. We show the usefulness of our approach with 3D shape deformations. To summarize our main contributions: 1) We define a new 2D articulated model -- more expressive than traditional ones -- of deformable human shape that factors body-shape, pose, and camera variations. Its high realism is obtained from training data generated from a detailed 3D model. 2) We define a new manifold-based representation of 3D shape deformations that yields statistical deformable-template models that are better than the current state-of-the-art. 3) We generalize a transfer learning idea from Euclidean spaces to Riemannian manifolds. This work demonstrates the value of modeling manifold-valued data and their statistics explicitly on the manifold. Specifically, the methods here provide new tools for
In European Conf. on Computer Vision (ECCV), pages: 1-14, Part I, LNCS 7572, (Editors: A. Fitzgibbon et al.), Springer-Verlag, October 2012 (inproceedings)
Three-dimensional object shape is commonly represented in terms of deformations of a triangular mesh from an exemplar shape. Existing models, however, are based on a Euclidean representation of shape deformations. In contrast, we argue that shape has a manifold structure: For example, summing the shape deformations for two people does not necessarily yield a deformation corresponding to a valid human shape, nor does the Euclidean difference of these two deformations provide a meaningful measure of shape dissimilarity. Consequently, we define a novel manifold for shape representation, with emphasis on body shapes, using a new Lie group of deformations. This has several advantages. First, we define triangle deformations exactly, removing non-physical deformations and redundant degrees of freedom common to previous methods. Second, the Riemannian structure of Lie Bodies enables a more meaningful definition of body shape similarity by measuring distance between bodies on the manifold of body shape deformations. Third, the group structure allows the valid composition of deformations. This is important for models that factor body shape deformations into multiple causes or represent shape as a linear combination of basis shapes. Finally, body shape variation is modeled using statistics on manifolds. Instead of modeling Euclidean shape variation with Principal Component Analysis, we capture shape variation on the manifold using Principal Geodesic Analysis. Our experiments show consistent visual and quantitative advantages of Lie Bodies over traditional Euclidean models of shape deformation, and our representation can be easily incorporated into existing methods.
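The argument that Euclidean sums of deformations can leave the space of valid deformations can be seen on a much smaller group. The sketch below is an illustration only: it uses 2D scaled rotations as a stand-in matrix Lie group, not the Lie Bodies deformation group, and the names `exp_map` and `log_map` are hypothetical. It compares the Euclidean average of two group elements with the average taken in the Lie algebra and mapped back to the group.

```python
import numpy as np

def exp_map(a, theta):
    """Lie algebra coordinates (log-scale a, angle theta) -> matrix exp(a) * R(theta)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.exp(a) * np.array([[c, -s], [s, c]])

def log_map(M):
    """Inverse of exp_map for a scaled rotation M; the angle lies in [-pi, pi]."""
    scale = np.sqrt(np.linalg.det(M))
    return np.log(scale), np.arctan2(M[1, 0], M[0, 0])

A = exp_map(0.0, 0.0)          # identity
B = exp_map(0.0, np.pi / 2)    # 90-degree rotation with unit scale

# Euclidean average of the matrices: shrinks the shape, so it is not a rotation.
euclid_mean = 0.5 * (A + B)

# Average in the Lie algebra, then map back: a 45-degree rotation with unit scale.
log_A = np.array(log_map(A))
log_B = np.array(log_map(B))
group_mean = exp_map(*(0.5 * (log_A + log_B)))

# Composition by matrix product always stays inside the group.
composed = A @ B

det_euclid = np.linalg.det(euclid_mean)
det_group = np.linalg.det(group_mean)
```

Here `det_euclid` is 0.5, so the Euclidean average halves the area even though both inputs preserve it, while `det_group` stays at 1. Averaging in the Lie algebra is the basic operation that manifold statistics such as Principal Geodesic Analysis build on.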
In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 3546-3553, IEEE, June 2012 (inproceedings)
Pictorial Structures (PS) define a probabilistic model of 2D articulated objects in images. Typical PS models assume an object can be represented by a set of rigid parts connected with pairwise constraints that define the prior probability of part configurations. These models are widely used to represent non-rigid articulated objects such as humans and animals despite the fact that such objects have parts that deform non-rigidly. Here we define a new Deformable Structures (DS) model that is a natural extension of previous PS models and that captures the non-rigid shape deformation of the parts. Each part in a DS model is represented by a low-dimensional shape deformation space and pairwise potentials between parts capture how the shape varies with pose and the shape of neighboring parts. A key advantage of such a model is that it more accurately models object boundaries. This enables image likelihood models that are more discriminative than previous PS likelihoods. This likelihood is learned using training imagery annotated using a DS “puppet.” We focus on a human DS model learned from 2D projections of a realistic 3D human body model and use it to infer human poses in images using a form of non-parametric belief propagation.
In Advances in Neural Information Processing Systems (NIPS) 25, pages: 2033-2041, (Editors: P. Bartlett and F.C.N. Pereira and C.J.C. Burges and L. Bottou and K.Q. Weinberger), MIT Press, 2012 (inproceedings)
Multi-metric learning techniques learn local metric tensors in different parts of a feature space. With such an approach, even simple classifiers can be competitive with the state-of-the-art because the distance measure locally adapts to the structure of the data. The learned distance measure is, however, non-metric, which has prevented multi-metric learning from generalizing to tasks such as dimensionality reduction and regression in a principled way. We prove that, with appropriate changes, multi-metric learning corresponds to learning the structure of a Riemannian manifold. We then show that this structure gives us a principled way to perform dimensionality reduction and regression according to the learned metrics. Algorithmically, we provide the first practical algorithm for computing geodesics according to the learned metrics, as well as algorithms for computing exponential and logarithmic maps on the Riemannian manifold. Together, these tools let many Euclidean algorithms take advantage of multi-metric learning. We illustrate the approach on regression and dimensionality reduction tasks that involve predicting measurements of the human body from shape data.
In European Conf. on Computer Vision (ECCV), pages: 285-298, Springer-Verlag, September 2010 (inproceedings)
Detection, tracking, segmentation and pose estimation of people in monocular images are widely studied. Two-dimensional models of the human body are extensively used; however, they are typically fairly crude, representing the body either as a rough outline or in terms of articulated geometric primitives. We describe a new 2D model of the human body contour that combines an underlying naked body with a low-dimensional clothing model. The naked body is represented as a Contour Person that can take on a wide variety of poses and body shapes. Clothing is represented as a deformation from the underlying body contour. This deformation is learned from training examples using principal component analysis to produce eigen clothing. We find that the statistics of clothing deformations are skewed and we model the a priori probability of these deformations using a Beta distribution. The resulting generative model captures realistic human forms in monocular images and is used to infer 2D body shape and pose under clothing. We also use the coefficients of the eigen clothing to recognize different categories of clothing on dressed people. The method is evaluated quantitatively on synthetic and real images and achieves better accuracy than previous methods for estimating body shape under clothing.
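The step of learning eigen clothing by applying principal component analysis to contour deformations can be sketched as follows. This is a minimal illustration under stated assumptions: the deformation vectors are synthetic placeholders (the offsets between clothed and naked contour points, flattened to one row per training example), the sizes and variable names are hypothetical, and the Beta prior on the coefficients described in the abstract is not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_points, n_modes = 50, 40, 3   # hypothetical sizes

# Synthetic stand-in for training deformations: each row holds the (x, y)
# offsets of n_points contour points, generated from a few hidden modes.
true_basis = rng.normal(size=(n_modes, 2 * n_points))
coeffs = rng.normal(size=(n_samples, n_modes)) * np.array([3.0, 2.0, 1.0])
D = 0.5 + coeffs @ true_basis

# PCA via SVD of the mean-centered deformations.
mean = D.mean(axis=0)
U, S, Vt = np.linalg.svd(D - mean, full_matrices=False)
eigen_clothing = Vt[:n_modes]              # principal deformation modes (rows)

# Low-dimensional clothing coefficients for each example, and the reconstruction.
weights = (D - mean) @ eigen_clothing.T
recon = mean + weights @ eigen_clothing

# Fraction of deformation variance captured by the retained modes.
explained = (S ** 2)[:n_modes].sum() / (S ** 2).sum()
```

In a setup like this, the rows of `weights` are the kind of low-dimensional coefficients that could be passed to a classifier to recognize clothing categories, as the abstract describes.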
In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 639-646, IEEE, June 2010 (inproceedings)
We define a new “contour person” model of the human body that has the expressive power of a detailed 3D model and the computational benefits of a simple 2D part-based model. The contour person (CP) model is learned from a 3D SCAPE model of the human body that captures natural shape and pose variations; the projected contours of this model, along with their segmentation into parts, form the training set. The CP model factors deformations of the body into three components: shape variation, viewpoint change and part rotation. This latter model also incorporates a learned non-rigid deformation model. The result is a 2D articulated model that is compact to represent, simple to compute with and more expressive than previous models. We demonstrate the value of such a model in 2D pose estimation and segmentation. Given an initial pose from a standard pictorial-structures method, we refine the pose and shape using an objective function that segments the scene into foreground and background regions. The result is a parametric, human-specific image segmentation.