2011


High-quality reflection separation using polarized images

Kong, N., Tai, Y., Shin, S. Y.

IEEE Transactions on Image Processing, 20(12):3393-3405, IEEE Signal Processing Society, December 2011 (article)

Abstract
In this paper, we deal with a problem of separating the effect of reflection from images captured behind glass. The input consists of multiple polarized images captured from the same view point but with different polarizer angles. The output is the high quality separation of the reflection layer and the background layer from the images. We formulate this problem as a constrained optimization problem and propose a framework that allows us to fully exploit the mutually exclusive image information in our input data. We test our approach on various images and demonstrate that our approach can generate good reflection separation results.

ps

Publisher site [BibTex]

Outdoor Human Motion Capture using Inverse Kinematics and von Mises-Fisher Sampling

Pons-Moll, G., Baak, A., Gall, J., Leal-Taixe, L., Mueller, M., Seidel, H., Rosenhahn, B.

In IEEE International Conference on Computer Vision (ICCV), pages: 1243-1250, November 2011 (inproceedings)

ps

project page pdf supplemental [BibTex]

Home 3D body scans from noisy image and range data

Weiss, A., Hirshberg, D., Black, M.

In Int. Conf. on Computer Vision (ICCV), pages: 1951-1958, IEEE, Barcelona, November 2011 (inproceedings)

Abstract
The 3D shape of the human body is useful for applications in fitness, games and apparel. Accurate body scanners, however, are expensive, limiting the availability of 3D body models. We present a method for human shape reconstruction from noisy monocular image and range data using a single inexpensive commodity sensor. The approach combines low-resolution image silhouettes with coarse range data to estimate a parametric model of the body. Accurate 3D shape estimates are obtained by combining multiple monocular views of a person moving in front of the sensor. To cope with varying body pose, we use a SCAPE body model which factors 3D body shape and pose variations. This enables the estimation of a single consistent shape while allowing pose to vary. Additionally, we describe a novel method to minimize the distance between the projected 3D body contour and the image silhouette that uses analytic derivatives of the objective function. We propose a simple method to estimate standard body measurements from the recovered SCAPE model and show that the accuracy of our method is competitive with commercial body scanning systems costing orders of magnitude more.

ps

pdf YouTube poster Project Page Project Page [BibTex]

Means in spaces of tree-like shapes

Aasa Feragen, Soren Hauberg, Mads Nielsen, Francois Lauze

In Computer Vision (ICCV), 2011 IEEE International Conference on, pages: 736-746, IEEE, November 2011 (inproceedings)

ps

Publishers site PDF Suppl. material [BibTex]

Everybody needs somebody: modeling social and grouping behavior on a linear programming multiple people tracker

Leal-Taixé, L., Pons-Moll, G., Rosenhahn, B.

In IEEE International Conference on Computer Vision Workshops (ICCVW), November 2011 (inproceedings)

ps

project page pdf [BibTex]

Evaluating the Automated Alignment of 3D Human Body Scans

Hirshberg, D. A., Loper, M., Rachlin, E., Tsoli, A., Weiss, A., Corner, B., Black, M. J.

In 2nd International Conference on 3D Body Scanning Technologies, pages: 76-86, (Editors: D’Apuzzo, Nicola), Hometrica Consulting, Lugano, Switzerland, October 2011 (inproceedings)

Abstract
The statistical analysis of large corpora of human body scans requires that these scans be in alignment, either for a small set of key landmarks or densely for all the vertices in the scan. Existing techniques tend to rely on hand-placed landmarks or algorithms that extract landmarks from scans. The former is time consuming and subjective while the latter is error prone. Here we show that a model-based approach can align meshes automatically, producing alignment accuracy similar to that of previous methods that rely on many landmarks. Specifically, we align a low-resolution, artist-created template body mesh to many high-resolution laser scans. Our alignment procedure employs a robust iterative closest point method with a regularization that promotes smooth and locally rigid deformation of the template mesh. We evaluate our approach on 50 female body models from the CAESAR dataset that vary significantly in body shape. To make the method fully automatic, we define simple feature detectors for the head and ankles, which provide initial landmark locations. We find that, if body poses are fairly similar, as in CAESAR, the fully automated method provides dense alignments that enable statistical analysis and anthropometric measurement.

ps

pdf slides DOI Project Page [BibTex]

Branch&Rank: Non-Linear Object Detection

(Best Impact Paper Prize)

Lehmann, A., Gehler, P., Van Gool, L.

In Proceedings of the British Machine Vision Conference (BMVC), pages: 8.1-8.11, (Editors: Jesse Hoey and Stephen McKenna and Emanuele Trucco), BMVA Press, September 2011, http://dx.doi.org/10.5244/C.25.8 (inproceedings)

ps

video of talk pdf slides supplementary [BibTex]

A human inspired gaze estimation system

Wulff, J., Sinha, P.

Journal of Vision, 11(11):507-507, ARVO, September 2011 (article)

Abstract
Estimating another person's gaze is a crucial skill in human social interactions. The social component is most apparent in dyadic gaze situations, in which the looker seems to look into the eyes of the observer, thereby signaling interest or a turn to speak. In a triadic situation, on the other hand, the looker's gaze is averted from the observer and directed towards another, specific target. This is mostly interpreted as a cue for joint attention, creating awareness of a predator or another point of interest. In keeping with the task's social significance, humans are very proficient at gaze estimation. Our accuracy ranges from less than one degree for dyadic settings to approximately 2.5 degrees for triadic ones. Our goal in this work is to draw inspiration from human gaze estimation mechanisms in order to create an artificial system that can approach the former's accuracy levels. Since human performance is severely impaired by both image-based degradations (Ando, 2004) and a change of facial configurations (Jenkins & Langton, 2003), the underlying principles are believed to be based both on simple image cues such as contrast/brightness distribution and on more complex geometric processing to reconstruct the actual shape of the head. By incorporating both kinds of cues in our system's design, we are able to surpass the accuracy of existing eye-tracking systems, which rely exclusively on either image-based or geometry-based cues (Yamazoe et al., 2008). A side-benefit of this combined approach is that it allows for gaze estimation despite moderate view-point changes. This is important for settings where subjects, say young children or certain kinds of patients, might not be fully cooperative to allow a careful calibration. Our model and implementation of gaze estimation opens up new experimental questions about human mechanisms while also providing a useful tool for general calibration-free, non-intrusive remote eye-tracking.

ps

link (url) DOI [BibTex]

Detecting synchrony in degraded audio-visual streams

Dhandhania, K., Wulff, J., Sinha, P.

Journal of Vision, 11(11):800-800, ARVO, September 2011 (article)

Abstract
Even 8–10 week old infants, when presented with two dynamic faces and a speech stream, look significantly longer at the ‘correct’ talking person (Patterson & Werker, 2003). This is true even though their reduced visual acuity prevents them from utilizing high spatial frequencies. Computational analyses in the field of audio/video synchrony and automatic speaker detection (e.g. Hershey & Movellan, 2000), in contrast, usually depend on high-resolution images. Therefore, the correlation mechanisms found in these computational studies are not directly applicable to the processes through which we learn to integrate the modalities of speech and vision. In this work, we investigated the correlation between speech signals and degraded video signals. We found a high correlation persisting even with high image degradation, resembling the low visual acuity of young infants. Additionally (in a fashion similar to Graf et al., 2002) we explored which parts of the face correlate with the audio in the degraded video sequences. Perfect synchrony and small offsets in the audio were used while finding the correlation, thereby detecting visual events preceding and following audio events. In order to achieve a sufficiently high temporal resolution, high-speed video sequences (500 frames per second) of talking people were used. This is a temporal resolution unachieved in previous studies and has allowed us to capture very subtle and short visual events. We believe that the results of this study might be interesting not only to vision researchers, but, by revealing subtle effects on a very fine timescale, also to people working in computer graphics and the generation and animation of artificial faces.

ps

link (url) DOI [BibTex]

Efficient and Robust Shape Matching for Model Based Human Motion Capture

Pons-Moll, G., Leal-Taixé, L., Truong, T., Rosenhahn, B.

In German Conference on Pattern Recognition (GCPR), pages: 416-425, September 2011 (inproceedings)

ps

project page pdf [BibTex]

BrainGate pilot clinical trials: Progress in translating neural engineering principles to clinical testing

Hochberg, L., Simeral, J., Black, M., Bacher, D., Barefoot, L., Berhanu, E., Borton, D., Cash, S., Feldman, J., Gallivan, E., Homer, M., Jarosiewicz, B., King, B., Liu, J., Malik, W., Masse, N., Perge, J., Rosler, D., Schmansky, N., Travers, B., Truccolo, W., Nurmikko, A., Donoghue, J.

33rd Annual International IEEE EMBS Conference of the IEEE Engineering in Medicine and Biology Society, Boston, MA, August 2011 (conference)

ps

[BibTex]

ISocRob-MSL 2011 Team Description Paper for Middle Sized League

Messias, J., Ahmad, A., Reis, J., Sousa, J., Lima, P.

15th Annual RoboCup International Symposium 2011, July 2011 (techreport)

Abstract
This paper describes the status of the ISocRob MSL robotic soccer team as required by the RoboCup 2011 qualification procedures. The most relevant technical and scientific developments carried out by the team since its last participation in the RoboCup MSL competitions are detailed here. These include cooperative localization, cooperative object tracking, planning under uncertainty, obstacle detection and improvements to self-localization.

ps

link (url) [BibTex]

Trajectory Space: A Dual Representation for Nonrigid Structure from Motion

Akhter, I., Sheikh, Y., Khan, S., Kanade, T.

Pattern Analysis and Machine Intelligence, IEEE Transactions on, 33(7):1442-1456, IEEE, July 2011 (article)

Abstract
Existing approaches to nonrigid structure from motion assume that the instantaneous 3D shape of a deforming object is a linear combination of basis shapes. These bases are object dependent and therefore have to be estimated anew for each video sequence. In contrast, we propose a dual approach to describe the evolving 3D structure in trajectory space by a linear combination of basis trajectories. We describe the dual relationship between the two approaches, showing that they both have equal power for representing 3D structure. We further show that the temporal smoothness in 3D trajectories alone can be used for recovering nonrigid structure from a moving camera. The principal advantage of expressing deforming 3D structure in trajectory space is that we can define an object independent basis. This results in a significant reduction in unknowns, and corresponding stability in estimation. We propose the use of the Discrete Cosine Transform (DCT) as the object independent basis and empirically demonstrate that it approaches Principal Component Analysis (PCA) for natural motions. We report the performance of the proposed method, quantitatively using motion capture data, and qualitatively on several video sequences exhibiting nonrigid motions including piecewise rigid motion, partially nonrigid motion (such as facial expressions), and highly nonrigid motion (such as a person walking or dancing).

ps

pdf project page [BibTex]

Learning Output Kernels with Block Coordinate Descent

Dinuzzo, F., Ong, C. S., Gehler, P., Pillonetto, G.

In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pages: 49-56, ICML ’11, (Editors: Getoor, Lise and Scheffer, Tobias), ACM, New York, NY, USA, ICML, June 2011 (inproceedings)

ei ps

data+code pdf [BibTex]

Loose-limbed People: Estimating 3D Human Pose and Motion Using Non-parametric Belief Propagation

Sigal, L., Isard, M., Haussecker, H., Black, M. J.

International Journal of Computer Vision, 98(1):15-48, Springer Netherlands, May 2011 (article)

Abstract
We formulate the problem of 3D human pose estimation and tracking as one of inference in a graphical model. Unlike traditional kinematic tree representations, our model of the body is a collection of loosely-connected body-parts. In particular, we model the body using an undirected graphical model in which nodes correspond to parts and edges to kinematic, penetration, and temporal constraints imposed by the joints and the world. These constraints are encoded using pair-wise statistical distributions, that are learned from motion-capture training data. Human pose and motion estimation is formulated as inference in this graphical model and is solved using Particle Message Passing (PaMPas). PaMPas is a form of non-parametric belief propagation that uses a variation of particle filtering that can be applied over a general graphical model with loops. The loose-limbed model and decentralized graph structure allow us to incorporate information from "bottom-up" visual cues, such as limb and head detectors, into the inference process. These detectors enable automatic initialization and aid recovery from transient tracking failures. We illustrate the method by automatically tracking people in multi-view imagery using a set of calibrated cameras and present quantitative evaluation using the HumanEva dataset.

ps

pdf publisher's site link (url) Project Page Project Page [BibTex]

Point-and-Click Cursor Control With an Intracortical Neural Interface System by Humans With Tetraplegia

Kim, S., Simeral, J. D., Hochberg, L. R., Donoghue, J. P., Friehs, G. M., Black, M. J.

IEEE Transactions on Neural Systems and Rehabilitation Engineering, 19(2):193-203, April 2011 (article)

Abstract
We present a point-and-click intracortical neural interface system (NIS) that enables humans with tetraplegia to volitionally move a 2D computer cursor in any desired direction on a computer screen, hold it still and click on the area of interest. This direct brain-computer interface extracts both discrete (click) and continuous (cursor velocity) signals from a single small population of neurons in human motor cortex. A key component of this system is a multi-state probabilistic decoding algorithm that simultaneously decodes neural spiking activity and outputs either a click signal or the velocity of the cursor. The algorithm combines a linear classifier, which determines whether the user is intending to click or move the cursor, with a Kalman filter that translates the neural population activity into cursor velocity. We present a paradigm for training the multi-state decoding algorithm using neural activity observed during imagined actions. Two human participants with tetraplegia (paralysis of the four limbs) performed a closed-loop radial target acquisition task using the point-and-click NIS over multiple sessions. We quantified point-and-click performance using various human-computer interaction measurements for pointing devices. We found that participants were able to control the cursor motion accurately and click on specified targets with a small error rate (< 3% in one participant). This study suggests that signals from a small ensemble of motor cortical neurons (~40) can be used for natural point-and-click 2D cursor control of a personal computer.

ps

pdf publisher's site pub med link (url) Project Page [BibTex]

A Database and Evaluation Methodology for Optical Flow

Baker, S., Scharstein, D., Lewis, J. P., Roth, S., Black, M. J., Szeliski, R.

International Journal of Computer Vision, 92(1):1-31, March 2011 (article)

Abstract
The quantitative evaluation of optical flow algorithms by Barron et al. (1994) led to significant advances in performance. The challenges for optical flow algorithms today go beyond the datasets and evaluation methods proposed in that paper. Instead, they center on problems associated with complex natural scenes, including nonrigid motion, real sensor noise, and motion discontinuities. We propose a new set of benchmarks and evaluation methods for the next generation of optical flow algorithms. To that end, we contribute four types of data to test different aspects of optical flow algorithms: (1) sequences with nonrigid motion where the ground-truth flow is determined by tracking hidden fluorescent texture, (2) realistic synthetic sequences, (3) high frame-rate video used to study interpolation error, and (4) modified stereo sequences of static scenes. In addition to the average angular error used by Barron et al., we compute the absolute flow endpoint error, measures for frame interpolation error, improved statistics, and results at motion discontinuities and in textureless regions. In October 2007, we published the performance of several well-known methods on a preliminary version of our data to establish the current state of the art. We also made the data freely available on the web at http://vision.middlebury.edu/flow/ . Subsequently a number of researchers have uploaded their results to our website and published papers using the data. A significant improvement in performance has already been achieved. In this paper we analyze the results obtained to date and draw a large number of conclusions from them.

ps

pdf pdf from publisher Middlebury Flow Evaluation Website [BibTex]

Role of expertise and contralateral symmetry in the diagnosis of pneumoconiosis: an experimental study

Jampani, V., Vaidya, V., Sivaswamy, J., Tourani, K. L.

In Proc. SPIE 7966, Medical Imaging: Image Perception, Observer Performance, and Technology Assessment, 2011, Florida, March 2011 (inproceedings)

Abstract
Pneumoconiosis, a lung disease caused by the inhalation of dust, is mainly diagnosed using chest radiographs. The effects of using contralateral symmetric (CS) information present in chest radiographs in the diagnosis of pneumoconiosis are studied using an eye tracking experimental study. The role of expertise and the influence of CS information on the performance of readers with different expertise level are also of interest. Experimental subjects ranging from novices & medical students to staff radiologists were presented with 17 double and 16 single lung images, and were asked to give profusion ratings for each lung zone. Eye movements and the time for their diagnosis were also recorded. Kruskal-Wallis test (χ2(6) = 13.38, p = .038) showed that the observer error (average sum of absolute differences) in double lung images differed significantly across the different expertise categories when considering all the participants. Wilcoxon-signed rank test indicated that the observer error was significantly higher for single-lung images (Z = 3.13, p < .001) than for the double-lung images for all the participants. Mann-Whitney test (U = 28, p = .038) showed that the differential error between single and double lung images is significantly higher in doctors [staff & residents] than in non-doctors [others]. Thus, expertise and CS information play a significant role in the diagnosis of pneumoconiosis. CS information helps in diagnosing pneumoconiosis by reducing the general tendency of giving less profusion ratings. Training and experience appear to play important roles in learning to use the CS information present in the chest radiographs.

ps

url link (url) [BibTex]

Recovering Intrinsic Images with a Global Sparsity Prior on Reflectance

Gehler, P., Rother, C., Kiefel, M., Zhang, L., Schölkopf, B.

In Advances in Neural Information Processing Systems 24, pages: 765-773, (Editors: Shawe-Taylor, John and Zemel, Richard S. and Bartlett, Peter L. and Pereira, Fernando C. N. and Weinberger, Kilian Q.), Curran Associates, Inc., Red Hook, NY, USA, Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We address the challenging task of decoupling material properties from lighting properties given a single image. In the last two decades virtually all works have concentrated on exploiting edge information to address this problem. We take a different route by introducing a new prior on reflectance, that models reflectance values as being drawn from a sparse set of basis colors. This results in a Random Field model with global, latent variables (basis colors) and pixel-accurate output reflectance values. We show that without edge information high-quality results can be achieved, that are on par with methods exploiting this source of information. Finally, we are able to improve on state-of-the-art results by integrating edge information into our model. We believe that our new approach is an excellent starting point for future developments in this field.

ei ps

website + code pdf poster Project Page Project Page [BibTex]

OpenBioSafetyLab: A virtual world based biosafety training application for medical students

Nakasone, A., Tang, S., Shigematsu, M., Heinecke, B., Fujimoto, S., Prendinger, H.

In International Conference on Information Technology: New Generations (ITNG), IEEE CPS, 2011 (inproceedings)

ps

PDF [BibTex]

Combining wireless neural recording and video capture for the analysis of natural gait

Foster, J., Freifeld, O., Nuyujukian, P., Ryu, S., Black, M. J., Shenoy, K.

In Proc. 5th Int. IEEE EMBS Conf. on Neural Engineering, pages: 613-616, IEEE, 2011 (inproceedings)

ps

pdf Project Page [BibTex]

Tagged Cardiac MR Image Segmentation Using Boundary & Regional-Support and Graph-based Deformable Priors

Xiang, B., Wang, C., Deux, J., Rahmouni, A., Paragios, N.

In IEEE International Symposium on Biomedical Imaging (ISBI), 2011 (inproceedings)

ps

pdf [BibTex]

Multiview Structure from Motion in Trajectory Space

Zaheer, A., Akhter, I., Mohammad, H. B., Marzban, S., Khan, S.

In Computer Vision (ICCV), 2011 IEEE International Conference on, pages: 2447-2453, 2011 (inproceedings)

Abstract
Most nonrigid objects exhibit temporal regularities in their deformations. Recently it was proposed that these regularities can be parameterized by assuming that the nonrigid structure lies in a small dimensional trajectory space. In this paper, we propose a factorization approach for 3D reconstruction from multiple static cameras under the compact trajectory subspace representation. Proposed factorization is analogous to rank-3 factorization of rigid structure from motion problem, in transformed space. The benefit of our approach is that the 3D trajectory basis can be directly learned from the image observations. This also allows us to impute missing observations and denoise tracking errors without explicit estimation of the 3D structure. In contrast to standard triangulation based methods which require points to be visible in at least two cameras, our approach can reconstruct points, which remain occluded even in all the cameras for quite a long time. This makes our solution especially suitable for occlusion handling in motion capture systems. We demonstrate robustness of our method on challenging real and synthetic scenarios.

ps

pdf project page [BibTex]

Neural control of cursor trajectory and click by a human with tetraplegia 1000 days after implant of an intracortical microelectrode array

(J. Neural Engineering Highlights of 2011 Collection. JNE top 10 cited papers of 2010-2011.)

Simeral, J. D., Kim, S., Black, M. J., Donoghue, J. P., Hochberg, L. R.

J. of Neural Engineering, 8(2):025027, 2011 (article)

Abstract
The ongoing pilot clinical trial of the BrainGate neural interface system aims in part to assess the feasibility of using neural activity obtained from a small-scale, chronically implanted, intracortical microelectrode array to provide control signals for a neural prosthesis system. Critical questions include how long implanted microelectrodes will record useful neural signals, how reliably those signals can be acquired and decoded, and how effectively they can be used to control various assistive technologies such as computers and robotic assistive devices, or to enable functional electrical stimulation of paralyzed muscles. Here we examined these questions by assessing neural cursor control and BrainGate system characteristics on five consecutive days 1000 days after implant of a 4 × 4 mm array of 100 microelectrodes in the motor cortex of a human with longstanding tetraplegia subsequent to a brainstem stroke. On each of five prospectively-selected days we performed time-amplitude sorting of neuronal spiking activity, trained a population-based Kalman velocity decoding filter combined with a linear discriminant click state classifier, and then assessed closed-loop point-and-click cursor control. The participant performed both an eight-target center-out task and a random target Fitts metric task which was adapted from a human-computer interaction ISO standard used to quantify performance of computer input devices. The neural interface system was further characterized by daily measurement of electrode impedances, unit waveforms and local field potentials. Across the five days, spiking signals were obtained from 41 of 96 electrodes and were successfully decoded to provide neural cursor point-and-click control with a mean task performance of 91.3% ± 0.1% (mean ± s.d.) correct target acquisition. 
Results across five consecutive days demonstrate that a neural interface system based on an intracortical microelectrode array can provide repeatable, accurate point-and-click control of a computer interface to an individual with tetraplegia 1000 days after implantation of this sensor.

ps

pdf pdf from publisher link (url) Project Page [BibTex]


Unscented Kalman Filtering for Articulated Human Tracking

Anders Boesen Lindbo Larsen, Soren Hauberg, Kim S. Pedersen

In Image Analysis, 6688, pages: 228-237, Lecture Notes in Computer Science, (Editors: Heyden, Anders and Kahl, Fredrik), Springer Berlin Heidelberg, 2011 (inproceedings)

ps

Publishers site PDF [BibTex]

Adaptation for perception of the human body: Investigations of transfer across viewpoint and pose

Sekunova, A., Black, M. J., Parkinson, L., Barton, J. S.

Vision Sciences Society, 2011 (conference)

ps

[BibTex]

Level Set Segmentation with Robust Image Gradient Energy and Statistical Shape Prior

Si Yong Yeo, Xianghua Xie, Igor Sazonov, Perumal Nithiarasu

In IEEE International Conference on Image Processing, pages: 3397-3400, 2011 (inproceedings)

Abstract
We propose a new level set segmentation method with statistical shape prior using a variational approach. The image energy is derived from a robust image gradient feature. This gives the active contour a global representation of the geometric configuration, making it more robust to image noise, weak edges and initial configurations. Statistical shape information is incorporated using nonparametric shape density distribution, which allows the model to handle relatively large shape variations. Comparative examples using both synthetic and real images show the robustness and efficiency of the proposed method.

ps

link (url) [BibTex]

Variational Level Set Segmentation Using Shape Prior

Si Yong Yeo, Xianghua Xie, Igor Sazonov, Perumal Nithiarasu

In International Conference on Mathematical and Computational Biomedical Engineering, 2011 (inproceedings)

ps

[BibTex]

Toward simple control for complex, autonomous robotic applications: combining discrete and rhythmic motor primitives

Degallier, S., Righetti, L., Gay, S., Ijspeert, A.

Autonomous Robots, 31(2-3):155-181, October 2011 (article)

Abstract
Vertebrates are able to quickly adapt to new environments in a very robust, seemingly effortless way. To explain both this adaptivity and robustness, a very promising perspective in neurosciences is the modular approach to movement generation: Movements result from combinations of a finite set of stable motor primitives organized at the spinal level. In this article we apply this concept of modular generation of movements to the control of robots with a high number of degrees of freedom, an issue that is challenging notably because planning complex, multidimensional trajectories in time-varying environments is a laborious and costly process. We thus propose to decrease the complexity of the planning phase through the use of a combination of discrete and rhythmic motor primitives, leading to the decoupling of the planning phase (i.e. the choice of behavior) and the actual trajectory generation. Such an implementation eases the control of, and the switch between, different behaviors by reducing the dimensionality of the high-level commands. Moreover, since the motor primitives are generated by dynamical systems, the trajectories can be smoothly modulated, either by high-level commands to change the current behavior or by sensory feedback information to adapt to environmental constraints. In order to show the generality of our approach, we apply the framework to interactive drumming and infant crawling in a humanoid robot. These experiments illustrate the simplicity of the control architecture in terms of planning, the integration of different types of feedback (vision and contact) and the capacity of autonomously switching between different behaviors (crawling and simple reaching).

mg

link (url) DOI [BibTex]

Learning Force Control Policies for Compliant Manipulation

Kalakrishnan, M., Righetti, L., Pastor, P., Schaal, S.

In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 4639-4644, IEEE, San Francisco, USA, sep 2011 (inproceedings)

Abstract
Developing robots capable of fine manipulation skills is of major importance in order to build truly assistive robots. These robots need to be compliant in their actuation and control in order to operate safely in human environments. Manipulation tasks imply complex contact interactions with the external world, and involve reasoning about the forces and torques to be applied. Planning under contact conditions is usually impractical due to computational complexity, and a lack of precise dynamics models of the environment. We present an approach to acquiring manipulation skills on compliant robots through reinforcement learning. The initial position control policy for manipulation is initialized through kinesthetic demonstration. We augment this policy with a force/torque profile to be controlled in combination with the position trajectories. We use the Policy Improvement with Path Integrals (PI2) algorithm to learn these force/torque profiles by optimizing a cost function that measures task success. We demonstrate our approach on the Barrett WAM robot arm equipped with a 6-DOF force/torque sensor on two different manipulation tasks: opening a door with a lever door handle, and picking up a pen off the table. We show that the learnt force control policies allow successful, robust execution of the tasks.

am mg

link (url) DOI [BibTex]

Control of legged robots with optimal distribution of contact forces

Righetti, L., Buchli, J., Mistry, M., Schaal, S.

In 2011 11th IEEE-RAS International Conference on Humanoid Robots, pages: 318-324, IEEE, Bled, Slovenia, 2011 (inproceedings)

Abstract
The development of agile and safe humanoid robots requires controllers that guarantee both high tracking performance and compliance with the environment. More specifically, the control of contact interaction is of crucial importance for robots that will actively interact with their environment. Model-based controllers such as inverse dynamics or operational space control are very appealing as they offer both high tracking performance and compliance. However, while widely used for fully actuated systems such as manipulators, they are not yet standard controllers for legged robots such as humanoids. Indeed, such robots are fundamentally different from manipulators as they are underactuated due to their floating base and subject to switching contact constraints. In this paper we present an inverse dynamics controller for legged robots that uses torque redundancy to create an optimal distribution of contact constraints. The resulting controller is able to minimize, given a desired motion, any quadratic cost of the contact constraints at each instant of time. In particular we show how this can be used to minimize tangential forces during locomotion, therefore significantly improving the locomotion of legged robots on difficult terrains. In addition to the theoretical result, we present simulations of a humanoid and a quadruped robot, as well as experiments on a real quadruped robot that demonstrate the advantages of the controller.

am mg

link (url) DOI [BibTex]

Learning Motion Primitive Goals for Robust Manipulation

Stulp, F., Theodorou, E., Kalakrishnan, M., Pastor, P., Righetti, L., Schaal, S.

In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 325-331, IEEE, San Francisco, USA, sep 2011 (inproceedings)

Abstract
Applying model-free reinforcement learning to manipulation remains challenging for several reasons. First, manipulation involves physical contact, which causes discontinuous cost functions. Second, in manipulation, the end-point of the movement must be chosen carefully, as it represents a grasp which must be adapted to the pose and shape of the object. Finally, there is uncertainty in the object pose, and even the most carefully planned movement may fail if the object is not at the expected position. To address these challenges we 1) present a simplified, computationally more efficient version of our model-free reinforcement learning algorithm PI2; 2) extend PI2 so that it simultaneously learns shape parameters and goal parameters of motion primitives; 3) use shape and goal learning to acquire motion primitives that are robust to object pose uncertainty. We evaluate these contributions on a manipulation platform consisting of a 7-DOF arm with a 4-DOF hand.

am mg

link (url) DOI [BibTex]

Inverse Dynamics Control of Floating-Base Robots with External Constraints: a Unified View

Righetti, L., Buchli, J., Mistry, M., Schaal, S.

In 2011 IEEE International Conference on Robotics and Automation, pages: 1085-1090, IEEE, Shanghai, China, 2011 (inproceedings)

Abstract
Inverse dynamics controllers and operational space controllers have proved to be very efficient for compliant control of fully actuated robots such as fixed base manipulators. However, legged robots such as humanoids are inherently different as they are underactuated and subject to switching external contact constraints. Recently, several methods have been proposed to create inverse dynamics controllers and operational space controllers for these robots. In an attempt to compare these different approaches, we develop a general framework for inverse dynamics control and show that these methods lead to very similar controllers. We are then able to greatly simplify recent whole-body controllers based on operational space approaches using kinematic projections, bringing them closer to efficient practical implementations. We also generalize these controllers such that they can be optimal under an arbitrary quadratic cost in the commands.

am mg

link (url) DOI [BibTex]

Benchmark datasets for pose estimation and tracking

Andriluka, M., Sigal, L., Black, M. J.

In Visual Analysis of Humans: Looking at People, pages: 253-274, (Editors: Moeslund and Hilton and Krüger and Sigal), Springer-Verlag, London, 2011 (incollection)

ps

publisher's site Project Page [BibTex]

Fields of experts

Roth, S., Black, M. J.

In Markov Random Fields for Vision and Image Processing, pages: 297-310, (Editors: Blake, A. and Kohli, P. and Rother, C.), MIT Press, 2011 (incollection)

Abstract
Fields of Experts are high-order Markov random field (MRF) models with potential functions that extend over large pixel neighborhoods. The clique potentials are modeled as a Product of Experts using nonlinear functions of many linear filter responses. In contrast to previous MRF approaches, all parameters, including the linear filters themselves, are learned from training data. A Field of Experts (FoE) provides a generic, expressive image prior that can capture the statistics of natural scenes, and can be used for a variety of machine vision tasks. The capabilities of FoEs are demonstrated with two example applications, image denoising and image inpainting, which are implemented using a simple, approximate inference scheme. While the FoE model is trained on a generic image database and is not tuned toward a specific application, the results compete with specialized techniques.

ps

publisher site [BibTex]

HMDB: A Large Video Database for Human Motion Recognition

Kuhne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.

In IEEE International Conference on Computer Vision (ICCV), 2011 (inproceedings)

ps

code, webpage, dataset pdf [BibTex]

Dorsal Stream: From Algorithm to Neuroscience

Jhuang, H.

PhD Thesis, MIT, 2011 (techreport)

ps

pdf [BibTex]


Context dependent changes in grip selectivity in primate ventral premotor cortex

Franquemont, L., Vargas-Irwin, C., Black, M., Donoghue, J.

2011 Abstract Viewer and Itinerary Planner, Society for Neuroscience, 2011, Online (conference)

ps

[BibTex]

Towards a freely moving animal model: Combining markerless multi-camera video capture and wirelessly transmitted neural recording for the analysis of walking

Foster, J., Freifeld, O., Nuyujukian, P., Ryu, S., Black, M., Shenoy, K.

2011 Abstract Viewer and Itinerary Planner, Society for Neuroscience, 2011, Online (conference)

ps

Project Page [BibTex]

Modelling pipeline for subject-specific arterial blood flow—A review

Igor Sazonov, Si Yong Yeo, Rhodri Bevan, Xianghua Xie, Raoul van Loon, Perumal Nithiarasu

International Journal for Numerical Methods in Biomedical Engineering, 27(12):1868–1910, 2011 (article)

Abstract
In this paper, a robust and semi-automatic modelling pipeline for blood flow through subject-specific arterial geometries is presented. The framework developed consists of image segmentation, domain discretization (meshing) and fluid dynamics. All three subtopics of the pipeline are explained using an example of flow through a severely stenosed human carotid artery. In the Introduction, the state of the art of both image segmentation and meshing is presented in some detail, and wherever possible the advantages and disadvantages of the existing methods are analysed. Following this, the deformable model used for image segmentation is presented. This model is based upon a geometrical potential force (GPF), which is a function of the image. Both the GPF calculation and level set determination are explained. Following the image segmentation method, the semi-automatic meshing method used in the present study is explained in full detail. All the relevant techniques required to generate a valid domain discretization are presented. These techniques include generating a valid surface mesh, skeletonization, mesh cropping, boundary layer mesh construction and various mesh cosmetic methods that are essential for generating a high-quality domain discretization. After presenting the mesh generation procedure, how to generate flow boundary conditions for both the inlets and outlets of a geometry is explained in detail. This is followed by a brief note on the flow solver, before studying the blood flow through the carotid artery with a severe stenosis.

ps

[BibTex]

Geometrically Induced Force Interaction for Three-Dimensional Deformable Models

Si Yong Yeo, Xianghua Xie, Igor Sazonov, Perumal Nithiarasu

IEEE Transactions on Image Processing, 20(5):1373-1387, 2011 (article)

Abstract
In this paper, we propose a novel 3-D deformable model that is based upon a geometrically induced external force field which can be conveniently generalized to arbitrary dimensions. This external force field is based upon hypothesized interactions between the relative geometries of the deformable model and the object boundary characterized by image gradient. The evolution of the deformable model is solved using the level set method so that topological changes are handled automatically. The relative geometrical configurations between the deformable model and the object boundaries contribute to a dynamic vector force field that changes accordingly as the deformable model evolves. The geometrically induced dynamic interaction force has been shown to greatly improve the deformable model performance in acquiring complex geometries and highly concave boundaries, and it gives the deformable model a high invariance to initialization configurations. The voxel interactions across the whole image domain provide a global view of the object boundary representation, giving the external force a long attraction range. The bidirectionality of the external force field allows the new deformable model to deal with arbitrary cross-boundary initializations, and facilitates the handling of weak edges and broken boundaries. In addition, we show that by enhancing the geometrical interaction field with a nonlocal edge-preserving algorithm, the new deformable model can effectively overcome image noise. We provide a comparative study on the segmentation of various geometries with different topologies from both synthetic and real images, and show that the proposed method achieves significant improvements over existing image gradient techniques.

ps

[BibTex]

Shape and pose-invariant correspondences using probabilistic geodesic surface embedding

Tsoli, A., Black, M. J.

In 33rd Annual Symposium of the German Association for Pattern Recognition (DAGM), 6835, pages: 256-265, Lecture Notes in Computer Science, (Editors: Mester, Rudolf and Felsberg, Michael), Springer, 2011 (inproceedings)

Abstract
Correspondence between non-rigid deformable 3D objects provides a foundation for object matching and retrieval, recognition, and 3D alignment. Establishing 3D correspondence is challenging when there are non-rigid deformations or articulations between instances of a class. We present a method for automatically finding such correspondences that deals with significant variations in pose, shape and resolution between pairs of objects. We represent objects as triangular meshes and consider normalized geodesic distances as representing their intrinsic characteristics. Geodesic distances are invariant to pose variations and nearly invariant to shape variations when properly normalized. The proposed method registers two objects by optimizing a joint probabilistic model over a subset of vertex pairs between the objects. The model enforces preservation of geodesic distances between corresponding vertex pairs and inference is performed using loopy belief propagation in a hierarchical scheme. Additionally, our method prefers solutions in which local shape information is consistent at matching vertices. We quantitatively evaluate our method and show that it is more accurate than a state-of-the-art method.

ps

pdf talk Project Page [BibTex]

Steerable random fields for image restoration and inpainting

Roth, S., Black, M. J.

In Markov Random Fields for Vision and Image Processing, pages: 377-387, (Editors: Blake, A. and Kohli, P. and Rother, C.), MIT Press, 2011 (incollection)

Abstract
This chapter introduces the concept of a Steerable Random Field (SRF). In contrast to traditional Markov random field (MRF) models in low-level vision, the random field potentials of a SRF are defined in terms of filter responses that are steered to the local image structure. This steering uses the structure tensor to obtain derivative responses that are either aligned with, or orthogonal to, the predominant local image structure. Analysis of the statistics of these steered filter responses in natural images leads to the model proposed here. Clique potentials are defined over steered filter responses using a Gaussian scale mixture model and are learned from training data. The SRF model connects random fields with anisotropic regularization and provides a statistical motivation for the latter. Steering the random field to the local image structure improves image denoising and inpainting performance compared with traditional pairwise MRFs.

ps

publisher site [BibTex]

Spatial Models of Human Motion

Soren Hauberg

University of Copenhagen, 2011 (phdthesis)

ps

PDF [BibTex]

Visual orientation and direction selectivity through thalamic synchrony

Kelly, S., Stanley, G., Jin, J., Wang, Y., Desbordes, G., Wang, Q., Black, M., Alonso, J.

2011 Abstract Viewer and Itinerary Planner, Society for Neuroscience, 2011, Online (conference)

ps

[BibTex]

Use of the BrainGate neural interface system for more than five years by a woman with tetraplegia

Hochberg, L., Bacher, D., Barefoot, L., Berhanu, E., Black, M., Cash, S., Feldman, J., Gallivan, E., Homer, M., Jarosiewicz, B., King, B., Liu, J., Malik, W., Masse, N., Berge, J., Rosler, D., Schmansky, N., Simeral, J., Travers, B., Truccolo, W., Donoghue, J.

2011 Abstract Viewer and Itinerary Planner, Society for Neuroscience, 2011, Online (conference)

ps

[BibTex]

Extracting 3D Structures from Biomedical Data

Xianghua Xie, Si Yong Yeo, Igor Sazonov, Perumal Nithiarasu

Proceedings of the 5th International Conference on Bioinformatics and Biomedical Engineering, 2011 (conference)

ps

[BibTex]

Operational Space Control of Constrained and Underactuated Systems

Mistry, M., Righetti, L.

In Proceedings of Robotics: Science and Systems, Los Angeles, CA, USA, June 2011 (inproceedings)

Abstract
The operational space formulation (Khatib, 1987), applied to rigid-body manipulators, describes how to decouple task-space and null-space dynamics, and write control equations that correspond only to forces at the end-effector or, alternatively, only to motion within the null-space. We would like to apply this useful theory to modern humanoids and other legged systems, for manipulation or similar tasks; however, these systems present additional challenges due to their underactuated floating bases and contact states that can dynamically change. In recent work, Sentis et al. derived controllers for such systems by implementing a task Jacobian projected into a space consistent with the supporting constraints and underactuation (the so-called "support consistent reduced Jacobian"). Here, we take a new approach to derive operational space controllers for constrained underactuated systems, by first considering the operational space dynamics within "projected inverse-dynamics" (Aghili, 2005), and subsequently resolving underactuation through the addition of dynamically consistent control torques. Doing so results in a simplified control solution compared with previous results, and importantly yields several new insights into the underlying problem of operational space control in constrained environments: 1) Underactuated systems, such as humanoid robots, cannot in general completely decouple task and null-space dynamics. However, 2) there may exist an infinite number of control solutions to realize desired task-space dynamics, and 3) these solutions involve the addition of dynamically consistent null-space motion or constraint forces (or combinations of both). In light of these findings, we present several possible control solutions, with varying optimization criteria, and highlight some of their practical consequences.

mg

link (url) DOI [BibTex]

Model-Based Pose Estimation

Pons-Moll, G., Rosenhahn, B.

In Visual Analysis of Humans: Looking at People, pages: 139-170, 9, (Editors: T. Moeslund, A. Hilton, V. Krueger, L. Sigal), Springer, 2011 (inbook)

ps

book page pdf [BibTex]

Illumination Estimation and Cast Shadow Detection through a Higher-order Graphical Model

Panagopoulos, A., Wang, C., Samaras, D., Paragios, N.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011 (inproceedings)

ps

pdf [BibTex]
