Publications | Max Planck Institute for Intelligent Systems

2422 results

2024

Reflectance Outperforms Force and Position in Model-Free Needle Puncture Detection

L’Orsa, R., Bisht, A., Yu, L., Murari, K., Westwick, D. T., Sutherland, G. R., Kuchenbecker, K. J.

In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, USA, July 2024 (inproceedings) Accepted

Abstract

The surgical procedure of needle thoracostomy temporarily corrects accidental over-pressurization of the space between the chest wall and the lungs. However, failure rates of up to 94.1% have been reported, likely because this procedure is done blind: operators estimate by feel when the needle has reached its target. We believe instrumented needles could help operators discern entry into the target space, but limited success has been achieved using force and/or position to try to discriminate needle puncture events during simulated surgical procedures. We thus augmented our needle insertion system with a novel in-bore double-fiber optical setup. Tissue reflectance measurements as well as 3D force, torque, position, and orientation were recorded while two experimenters repeatedly inserted a bevel-tipped percutaneous needle into ex vivo porcine ribs. We applied model-free puncture detection to various filtered time derivatives of each sensor data stream offline. In the held-out test set of insertions, puncture-detection precision improved substantially using reflectance measurements compared to needle insertion force alone (3.3-fold increase) or position alone (11.6-fold increase).

hi

Project Page [BibTex]

HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video

(Accepted as Highlight: Top 11.9%)

Fan, Z., Parelli, M., Kadoglou, M. E., Kocabas, M., Chen, X., Black, M. J., Hilliges, O.

Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024 (conference) Accepted

ps

Paper Project Code [BibTex]

GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs

Gao, G., Liu, W., Chen, A., Geiger, A., Schölkopf, B.

The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024 (conference) Accepted

ei

[BibTex]

AMUSE: Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion

Chhatre, K., Daněček, R., Athanasiou, N., Becherini, G., Peters, C., Black, M. J., Bolkart, T.

Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024 (conference) To be published

Abstract

Existing methods for synthesizing 3D human gestures from speech have shown promising results, but they do not explicitly model the impact of emotions on the generated gestures. Instead, these methods directly output animations from speech without control over the expressed emotion. To address this limitation, we present AMUSE, an emotional speech-driven body animation model based on latent diffusion. Our observation is that content (i.e., gestures related to speech rhythm and word utterances), emotion, and personal style are separable. To account for this, AMUSE maps the driving audio to three disentangled latent vectors: one for content, one for emotion, and one for personal style. A latent diffusion model, trained to generate gesture motion sequences, is then conditioned on these latent vectors. Once trained, AMUSE synthesizes 3D human gestures directly from speech with control over the expressed emotions and style by combining the content from the driving speech with the emotion and style of another speech sequence. Randomly sampling the noise of the diffusion model further generates variations of the gesture with the same emotional expressivity. Qualitative, quantitative, and perceptual evaluations demonstrate that AMUSE outputs realistic gesture sequences. Compared to the state of the art, the generated gestures are better synchronized with the speech content and better represent the emotion expressed by the input speech.

ps

Project Paper Code link (url) [BibTex]

Out-of-Variable Generalization for Discriminative Models

Guo, S., Wildberger, J., Schölkopf, B.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

ei

arXiv [BibTex]

Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding

Pace, A., Yèche, H., Schölkopf, B., Rätsch, G., Tennenholtz, G.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

ei

arXiv [BibTex]

Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion

Meterez*, A., Joudaki*, A., Orabona, F., Immer, A., Rätsch, G., Daneshmand, H.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal contribution (conference) Accepted

ei

arXiv [BibTex]

The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks

Spieler, A., Rahaman, N., Martius, G., Schölkopf, B., Levina, A.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

ei al

arXiv [BibTex]

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Open X-Embodiment Collaboration (incl. Guist, S., Schneider, J., Schölkopf, B., Büchler, D.)

IEEE International Conference on Robotics and Automation (ICRA), May 2024 (conference) Accepted

ei

[BibTex]

Can Large Language Models Infer Causation from Correlation?

Jin, Z., Liu, J., Lyu, Z., Poff, S., Sachan, M., Mihalcea, R., Diab*, M., Schölkopf*, B.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal supervision (conference) Accepted

ei

arXiv [BibTex]

Certified private data release for sparse Lipschitz functions

Donhauser, K., Lokna, J., Sanyal, A., Boedihardjo, M., Hönig, R., Yang, F.

27th International Conference on Artificial Intelligence and Statistics (AISTATS), May 2024 (conference) Accepted

ei

[BibTex]

Ghost on the Shell: An Expressive Representation of General 3D Shapes

(Oral)

Liu, Z., Feng, Y., Xiu, Y., Liu, W., Paull, L., Black, M. J., Schölkopf, B.

In Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (inproceedings) Accepted

Abstract

The creation of photorealistic virtual worlds requires the accurate modeling of 3D surface geometry for a wide range of objects. For this, meshes are appealing since they 1) enable fast physics-based rendering with realistic material and lighting, 2) support physical simulation, and 3) are memory-efficient for modern graphics pipelines. Recent work on reconstructing and statistically modeling 3D shape, however, has critiqued meshes as being topologically inflexible. To capture a wide range of object shapes, any 3D representation must be able to model solid, watertight, shapes as well as thin, open, surfaces. Recent work has focused on the former, and methods for reconstructing open surfaces do not support fast reconstruction with material and lighting or unconditional generative modelling. Inspired by the observation that open surfaces can be seen as islands floating on watertight surfaces, we parameterize open surfaces by defining a manifold signed distance field on watertight templates. With this parameterization, we further develop a grid-based and differentiable representation that parameterizes both watertight and non-watertight meshes of arbitrary topology. Our new representation, called Ghost-on-the-Shell (G-Shell), enables two important applications: differentiable rasterization-based reconstruction from multiview images and generative modelling of non-watertight meshes. We empirically demonstrate that G-Shell achieves state-of-the-art performance on non-watertight mesh reconstruction and generation tasks, while also performing effectively for watertight meshes.

ei ps

Home Code Video Project [BibTex]
