Publications | Max Planck Institute for Intelligent Systems

7527 results (View BibTeX file of all listed publications)

2024

Reflectance Outperforms Force and Position in Model-Free Needle Puncture Detection

L’Orsa, R., Bisht, A., Yu, L., Murari, K., Westwick, D. T., Sutherland, G. R., Kuchenbecker, K. J.

In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, USA, July 2024 (inproceedings) Accepted

Abstract

The surgical procedure of needle thoracostomy temporarily corrects accidental over-pressurization of the space between the chest wall and the lungs. However, failure rates of up to 94.1% have been reported, likely because this procedure is done blind: operators estimate by feel when the needle has reached its target. We believe instrumented needles could help operators discern entry into the target space, but limited success has been achieved using force and/or position to try to discriminate needle puncture events during simulated surgical procedures. We thus augmented our needle insertion system with a novel in-bore double-fiber optical setup. Tissue reflectance measurements as well as 3D force, torque, position, and orientation were recorded while two experimenters repeatedly inserted a bevel-tipped percutaneous needle into ex vivo porcine ribs. We applied model-free puncture detection to various filtered time derivatives of each sensor data stream offline. In the held-out test set of insertions, puncture-detection precision improved substantially using reflectance measurements compared to needle insertion force alone (3.3-fold increase) or position alone (11.6-fold increase).

Project Page [BibTex]

2024

hi L’Orsa, R., Bisht, A., Yu, L., Murari, K., Westwick, D. T., Sutherland, G. R., Kuchenbecker, K. J. Reflectance Outperforms Force and Position in Model-Free Needle Puncture Detection In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, USA, July 2024 (inproceedings) Accepted

Project Page [BibTex]

HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video

(Accepted as Highlight: Top 11.9%)

Fan, Z., Parelli, M., Kadoglou, M. E., Kocabas, M., Chen, X., Black, M. J., Hilliges, O.

Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024 (conference) Accepted

Paper Project Code [BibTex]

ps Fan, Z., Parelli, M., Kadoglou, M. E., Kocabas, M., Chen, X., Black, M. J., Hilliges, O. HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024 (conference) Accepted

Paper Project Code [BibTex]

GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs

Gao, G., Liu, W., Chen, A., Geiger, A., Schölkopf, B.

The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024 (conference) Accepted

[BibTex]

ei Gao, G., Liu, W., Chen, A., Geiger, A., Schölkopf, B. GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024 (conference) Accepted

[BibTex]

AMUSE: Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion

Chhatre, K., Daněček, R., Athanasiou, N., Becherini, G., Peters, C., Black, M. J., Bolkart, T.

Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024 (conference) To be published

Abstract

Existing methods for synthesizing 3D human gestures from speech have shown promising results, but they do not explicitly model the impact of emotions on the generated gestures. Instead, these methods directly output animations from speech without control over the expressed emotion. To address this limitation, we present AMUSE, an emotional speech-driven body animation model based on latent diffusion. Our observation is that content (i.e., gestures related to speech rhythm and word utterances), emotion, and personal style are separable. To account for this, AMUSE maps the driving audio to three disentangled latent vectors: one for content, one for emotion, and one for personal style. A latent diffusion model, trained to generate gesture motion sequences, is then conditioned on these latent vectors. Once trained, AMUSE synthesizes 3D human gestures directly from speech with control over the expressed emotions and style by combining the content from the driving speech with the emotion and style of another speech sequence. Randomly sampling the noise of the diffusion model further generates variations of the gesture with the same emotional expressivity. Qualitative, quantitative, and perceptual evaluations demonstrate that AMUSE outputs realistic gesture sequences. Compared to the state of the art, the generated gestures are better synchronized with the speech content and better represent the emotion expressed by the input speech.

Project Paper Code link (url) [BibTex]

ps Chhatre, K., Daněček, R., Athanasiou, N., Becherini, G., Peters, C., Black, M. J., Bolkart, T. AMUSE: Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024 (conference) To be published

Project Paper Code link (url) [BibTex]

Out-of-Variable Generalization for Discriminative Models

Guo, S., Wildberger, J., Schölkopf, B.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

arXiv [BibTex]

ei Guo, S., Wildberger, J., Schölkopf, B. Out-of-Variable Generalization for Discriminative Models Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

arXiv [BibTex]

Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding

Pace, A., Yèche, H., Schölkopf, B., Rätsch, G., Tennenholtz, G.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

arXiv [BibTex]

ei Pace, A., Yèche, H., Schölkopf, B., Rätsch, G., Tennenholtz, G. Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

arXiv [BibTex]

Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion

Meterez*, A., Joudaki*, A., Orabona, F., Immer, A., Rätsch, G., Daneshmand, H.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal contribution (conference) Accepted

arXiv [BibTex]

ei Meterez*, A., Joudaki*, A., Orabona, F., Immer, A., Rätsch, G., Daneshmand, H. Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal contribution (conference) Accepted

arXiv [BibTex]

GaitGuide: A Wearable Device for Vibrotactile Motion Guidance

Rokhmanova, N., Martus, J., Faulkner, R., Fiene, J., Kuchenbecker, K. J.

Workshop paper (3 pages) presented at the ICRA Workshop on Advancing Wearable Devices and Applications Through Novel Design, Sensing, Actuation, and AI, Yokohama, Japan, May 2024 (misc) Accepted

Abstract

Wearable vibrotactile devices can provide salient sensations that attract the user's attention or guide them to change. The future integration of such feedback into medical or consumer devices would benefit from understanding how vibrotactile cues vary in amplitude and perceived strength across the heterogeneity of human skin. Here, we developed an adhesive vibrotactile device (the GaitGuide) that uses two individually mounted linear resonant actuators to deliver directional motion guidance. By measuring the mechanical vibrations of the actuators via small on-board accelerometers, we compared vibration amplitudes and perceived signal strength across 20 subjects at five signal voltages and four sites around the shank. Vibrations were consistently smallest in amplitude—but perceived to be strongest—at the site located over the tibia. We created a fourth-order linear dynamic model to capture differences in tissue properties across subjects and sites via optimized stiffness and damping parameters. The anterior site had significantly higher skin stiffness and damping; these values also correlate with subject-specific body-fat percentages. Surprisingly, our study shows that the perception of vibrotactile stimuli does not solely depend on the vibration magnitude delivered to the skin. These findings also help to explain the clinical practice of evaluating vibrotactile sensitivity over a bony prominence.

Project Page [BibTex]

hi Rokhmanova, N., Martus, J., Faulkner, R., Fiene, J., Kuchenbecker, K. J. GaitGuide: A Wearable Device for Vibrotactile Motion Guidance Workshop paper (3 pages) presented at the ICRA Workshop on Advancing Wearable Devices and Applications Through Novel Design, Sensing, Actuation, and AI, Yokohama, Japan, May 2024 (misc) Accepted

Project Page [BibTex]

The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks

Spieler, A., Rahaman, N., Martius, G., Schölkopf, B., Levina, A.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

ei al

arXiv [BibTex]

ei al Spieler, A., Rahaman, N., Martius, G., Schölkopf, B., Levina, A. The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

arXiv [BibTex]

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Open X-Embodiment Collaboration ( incl. Guist, S., Schneider, J., Schölkopf, B., Büchler, D. ).

IEEE International Conference on Robotics and Automation (ICRA), May 2024 (conference) Accepted

[BibTex]

ei Open X-Embodiment Collaboration ( incl. Guist, S., Schneider, J., Schölkopf, B., Büchler, D. ). Open X-Embodiment: Robotic Learning Datasets and RT-X Models IEEE International Conference on Robotics and Automation (ICRA), May 2024 (conference) Accepted

[BibTex]

Can Large Language Models Infer Causation from Correlation?

Jin, Z., Liu, J., Lyu, Z., Poff, S., Sachan, M., Mihalcea, R., Diab*, M., Schölkopf*, B.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal supervision (conference) Accepted

arXiv [BibTex]

ei Jin, Z., Liu, J., Lyu, Z., Poff, S., Sachan, M., Mihalcea, R., Diab*, M., Schölkopf*, B. Can Large Language Models Infer Causation from Correlation? Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal supervision (conference) Accepted

arXiv [BibTex]

Certified private data release for sparse Lipschitz functions

Donhauser, K., Lokna, J., Sanyal, A., Boedihardjo, M., Hönig, R., Yang, F.

27th International Conference on Artificial Intelligence and Statistics (AISTATS), May 2024 (conference) Accepted

[BibTex]

ei Donhauser, K., Lokna, J., Sanyal, A., Boedihardjo, M., Hönig, R., Yang, F. Certified private data release for sparse Lipschitz functions 27th International Conference on Artificial Intelligence and Statistics (AISTATS), May 2024 (conference) Accepted

[BibTex]

Ghost on the Shell: An Expressive Representation of General 3D Shapes

(Oral)

Liu, Z., Feng, Y., Xiu, Y., Liu, W., Paull, L., Black, M. J., Schölkopf, B.

In Proceedings of the Twelfth International Conference on Learning Representations, The Twelfth International Conference on Learning Representations, May 2024 (inproceedings) Accepted

Abstract

The creation of photorealistic virtual worlds requires the accurate modeling of 3D surface geometry for a wide range of objects. For this, meshes are appealing since they 1) enable fast physics-based rendering with realistic material and lighting, 2) support physical simulation, and 3) are memory-efficient for modern graphics pipelines. Recent work on reconstructing and statistically modeling 3D shape, however, has critiqued meshes as being topologically inflexible. To capture a wide range of object shapes, any 3D representation must be able to model solid, watertight, shapes as well as thin, open, surfaces. Recent work has focused on the former, and methods for reconstructing open surfaces do not support fast reconstruction with material and lighting or unconditional generative modelling. Inspired by the observation that open surfaces can be seen as islands floating on watertight surfaces, we parameterize open surfaces by defining a manifold signed distance field on watertight templates. With this parameterization, we further develop a grid-based and differentiable representation that parameterizes both watertight and non-watertight meshes of arbitrary topology. Our new representation, called Ghost-on-the-Shell (G-Shell), enables two important applications: differentiable rasterization-based reconstruction from multiview images and generative modelling of non-watertight meshes. We empirically demonstrate that G-Shell achieves state-of-the-art performance on non-watertight mesh reconstruction and generation tasks, while also performing effectively for watertight meshes.

ei ps

Home Code Video Project [BibTex]

ei ps Liu, Z., Feng, Y., Xiu, Y., Liu, W., Paull, L., Black, M. J., Schölkopf, B. Ghost on the Shell: An Expressive Representation of General 3D Shapes In Proceedings of the Twelfth International Conference on Learning Representations, The Twelfth International Conference on Learning Representations, May 2024 (inproceedings) Accepted

Home Code Video Project [BibTex]

Identifying Policy Gradient Subspaces

Schneider, J., Schumacher, P., Guist, S., Chen, L., Häufle, D., Schölkopf, B., Büchler, D.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

arXiv [BibTex]

ei Schneider, J., Schumacher, P., Guist, S., Chen, L., Häufle, D., Schölkopf, B., Büchler, D. Identifying Policy Gradient Subspaces Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

arXiv [BibTex]

Emergent mechanisms for long timescales depend on training curriculum and affect performance in memory tasks

Khajehabdollahi, S., Zeraati, R., Giannakakis, E., Schäfer, T. J., Martius, G., Levina, A.

In The Twelfth International Conference on Learning Representations, ICLR 2024, May 2024 (inproceedings)

link (url) [BibTex]

al Khajehabdollahi, S., Zeraati, R., Giannakakis, E., Schäfer, T. J., Martius, G., Levina, A. Emergent mechanisms for long timescales depend on training curriculum and affect performance in memory tasks In The Twelfth International Conference on Learning Representations, ICLR 2024, May 2024 (inproceedings)

link (url) [BibTex]

Some Intriguing Aspects about Lipschitz Continuity of Neural Networks

Khromov*, G., Singh*, S. P.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal contribution (conference) Accepted

arXiv [BibTex]

ei Khromov*, G., Singh*, S. P. Some Intriguing Aspects about Lipschitz Continuity of Neural Networks Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal contribution (conference) Accepted

arXiv [BibTex]

Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Liu, W., Qiu, Z., Feng, Y., Xiu, Y., Xue, Y., Yu, L., Feng, H., Liu, Z., Heo, J., Peng, S., Wen, Y., Black, M. J., Weller, A., Schölkopf, B.

In Proceedings of the Twelfth International Conference on Learning Representations, The Twelfth International Conference on Learning Representations, May 2024 (inproceedings) Accepted

Abstract

Large foundation models are becoming ubiquitous, but training them from scratch is prohibitively expensive. Thus, efficiently adapting these powerful models to downstream tasks is increasingly important. In this paper, we study a principled finetuning paradigm -- Orthogonal Finetuning (OFT) -- for downstream task adaptation. Despite demonstrating good generalizability, OFT still uses a fairly large number of trainable parameters due to the high dimensionality of orthogonal matrices. To address this, we start by examining OFT from an information transmission perspective, and then identify a few key desiderata that enable better parameter-efficiency. Inspired by how the Cooley-Tukey fast Fourier transform algorithm enables efficient information transmission, we propose an efficient orthogonal parameterization using butterfly structures. We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method, called Orthogonal Butterfly (BOFT). By subsuming OFT as a special case, BOFT introduces a generalized orthogonal finetuning framework. Finally, we conduct an extensive empirical study of adapting large vision transformers, large language models, and text-to-image diffusion models to various downstream tasks in vision and language.

ei ps

Home Code HuggingFace project [BibTex]

ei ps Liu, W., Qiu, Z., Feng, Y., Xiu, Y., Xue, Y., Yu, L., Feng, H., Liu, Z., Heo, J., Peng, S., Wen, Y., Black, M. J., Weller, A., Schölkopf, B. Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization In Proceedings of the Twelfth International Conference on Learning Representations, The Twelfth International Conference on Learning Representations, May 2024 (inproceedings) Accepted

Home Code HuggingFace project [BibTex]

Skill or Luck? Return Decomposition via Advantage Functions

Pan, H., Schölkopf, B.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

[BibTex]

ei Pan, H., Schölkopf, B. Skill or Luck? Return Decomposition via Advantage Functions Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

[BibTex]

Transformer Fusion with Optimal Transport

Imfeld*, M., Graldi*, J., Giordano*, M., Hofmann, T., Anagnostidis, S., Singh, S. P.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal contribution (conference) Accepted

arXiv [BibTex]

ei Imfeld*, M., Graldi*, J., Giordano*, M., Hofmann, T., Anagnostidis, S., Singh, S. P. Transformer Fusion with Optimal Transport Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal contribution (conference) Accepted

arXiv [BibTex]

Learning Hierarchical World Models with Adaptive Temporal Abstractions from Discrete Latent Dynamics

Gumbsch, C., Sajid, N., Martius, G., Butz, M. V.

In The Twelfth International Conference on Learning Representations, ICLR 2024, May 2024 (inproceedings)

link (url) [BibTex]

al Gumbsch, C., Sajid, N., Martius, G., Butz, M. V. Learning Hierarchical World Models with Adaptive Temporal Abstractions from Discrete Latent Dynamics In The Twelfth International Conference on Learning Representations, ICLR 2024, May 2024 (inproceedings)

link (url) [BibTex]

Causal Modeling with Stationary Diffusions

Lorch, L., Krause*, A., Schölkopf*, B.

27th International Conference on Artificial Intelligence and Statistics (AISTATS), May 2024, *equal supervision (conference) Accepted

[BibTex]

ei Lorch, L., Krause*, A., Schölkopf*, B. Causal Modeling with Stationary Diffusions 27th International Conference on Artificial Intelligence and Statistics (AISTATS), May 2024, *equal supervision (conference) Accepted

[BibTex]

Multi-View Causal Representation Learning with Partial Observability

Yao, D., Xu, D., Lachapelle, S., Magliacane, S., Taslakian, P., Martius, G., von Kügelgen, J., Locatello, F.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

ei al

arXiv [BibTex]

ei al Yao, D., Xu, D., Lachapelle, S., Magliacane, S., Taslakian, P., Martius, G., von Kügelgen, J., Locatello, F. Multi-View Causal Representation Learning with Partial Observability Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

arXiv [BibTex]

Towards Meta-Pruning via Optimal Transport

Theus, A., Geimer, O., Wicke, F., Hofmann, T., Anagnostidis, S., Singh, S. P.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

[BibTex]

ei Theus, A., Geimer, O., Wicke, F., Hofmann, T., Anagnostidis, S., Singh, S. P. Towards Meta-Pruning via Optimal Transport Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

[BibTex]

Stochastic Gradient Descent for Gaussian Processes Done Right

Lin*, J. A., Padhy*, S., Antorán*, J., Tripp, A., Terenin, A., Szepesvari, C., Hernández-Lobato, J. M., Janz, D.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal contribution (conference) Accepted

arXiv [BibTex]

ei Lin*, J. A., Padhy*, S., Antorán*, J., Tripp, A., Terenin, A., Szepesvari, C., Hernández-Lobato, J. M., Janz, D. Stochastic Gradient Descent for Gaussian Processes Done Right Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal contribution (conference) Accepted

arXiv [BibTex]

CAPT Motor: A Strong Direct-Drive Rotary Haptic Interface

Javot, B., Nguyen, V. H., Ballardini, G., Kuchenbecker, K. J.

Hands-on demonstration presented at the IEEE Haptics Symposium, Long Beach, USA, April 2024 (misc)

Abstract

We have designed and built a new motor named CAPT Motor that delivers continuous and precise torque. It is a brushless ironless motor using a Halbach-magnet ring and a planar axial Lorentz-coil array. This motor is unique as we use a two-phase design allowing for higher fill factor and geometrical accuracy of the coils, as they can all be made separately. This motor outperforms existing Halbach ring and cylinder motors with a torque constant per magnet volume of 9.94 (Nm/A)/dm3, a record in the field. The angular position of the rotor is measured by a high-resolution incremental optical encoder and tracked by a multimodal data acquisition device. The system's control firmware uses this angle measurement to calculate the two-phase motor currents needed to produce the torque commanded by the virtual environment at the rotor's position. The strength and precision of the CAPT Motor's torque and the lack of any mechanical transmission enable unusually high haptic rendering quality, indicating the promise of this new motor design.

link (url) Project Page [BibTex]

hi Javot, B., Nguyen, V. H., Ballardini, G., Kuchenbecker, K. J. CAPT Motor: A Strong Direct-Drive Rotary Haptic Interface Hands-on demonstration presented at the IEEE Haptics Symposium, Long Beach, USA, April 2024 (misc)

link (url) Project Page [BibTex]

Cutaneous Electrohydraulic (CUTE) Wearable Devices for Multimodal Haptic Feedback

Sanchez-Tamayo, N., Yoder, Z., Ballardini, G., Rothemund, P., Keplinger, C., Kuchenbecker, K. J.

Extended abstract (1 page) presented at the IEEE RoboSoft Workshop on Multimodal Soft Robots for Multifunctional Manipulation, Locomotion, and Human-Machine Interaction, San Diego, USA, April 2024 (misc)

hi rm

[BibTex]

hi rm Sanchez-Tamayo, N., Yoder, Z., Ballardini, G., Rothemund, P., Keplinger, C., Kuchenbecker, K. J. Cutaneous Electrohydraulic (CUTE) Wearable Devices for Multimodal Haptic Feedback Extended abstract (1 page) presented at the IEEE RoboSoft Workshop on Multimodal Soft Robots for Multifunctional Manipulation, Locomotion, and Human-Machine Interaction, San Diego, USA, April 2024 (misc)

[BibTex]

Quantifying Haptic Quality: External Measurements Match Expert Assessments of Stiffness Rendering Across Devices

Fazlollahi, F., Seifi, H., Ballardini, G., Taghizadeh, Z., Schulz, A., MacLean, K. E., Kuchenbecker, K. J.

Work-in-progress paper (2 pages) presented at the IEEE Haptics Symposium, Long Beach, USA, April 2024 (misc)

Project Page [BibTex]

hi Fazlollahi, F., Seifi, H., Ballardini, G., Taghizadeh, Z., Schulz, A., MacLean, K. E., Kuchenbecker, K. J. Quantifying Haptic Quality: External Measurements Match Expert Assessments of Stiffness Rendering Across Devices Work-in-progress paper (2 pages) presented at the IEEE Haptics Symposium, Long Beach, USA, April 2024 (misc)

Project Page [BibTex]

Cutaneous Electrohydraulic Wearable Devices for Expressive and Salient Haptic Feedback

Sanchez-Tamayo, N., Yoder, Z., Ballardini, G., Rothemund, P., Keplinger, C., Kuchenbecker, K. J.

Hands-on demonstration presented at the IEEE Haptics Symposium, Long Beach, USA, April 2024 (misc)

hi rm

[BibTex]

hi rm Sanchez-Tamayo, N., Yoder, Z., Ballardini, G., Rothemund, P., Keplinger, C., Kuchenbecker, K. J. Cutaneous Electrohydraulic Wearable Devices for Expressive and Salient Haptic Feedback Hands-on demonstration presented at the IEEE Haptics Symposium, Long Beach, USA, April 2024 (misc)

[BibTex]

PILLAR: How to make semi-private learning more effective

Hu, Y., Pinto, F., Yang, F., Sanyal, A.

2nd IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), April 2024 (conference) Accepted

[BibTex]

ei Hu, Y., Pinto, F., Yang, F., Sanyal, A. PILLAR: How to make semi-private learning more effective 2nd IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), April 2024 (conference) Accepted

[BibTex]

Fingertip Dynamic Response Simulated Across Excitation Points and Frequencies

Serhat, G., Kuchenbecker, K. J.

Biomechanics and Modeling in Mechanobiology, April 2024 (article) Accepted

Abstract

Predicting how the fingertip will mechanically respond to different stimuli can help explain human haptic perception and enable improvements to actuation approaches such as ultrasonic mid-air haptics. This study addresses this goal using high-fidelity 3D finite element analyses. We compute the deformation profiles and amplitudes caused by harmonic forces applied in the normal direction at four locations: the center of the finger pad, the side of the finger, the tip of the finger, and the oblique midpoint of these three sites. The excitation frequency is swept from 2.5 to 260 Hz. The simulated frequency response functions (FRFs) obtained for displacement demonstrate that the relative magnitudes of the deformations elicited by stimulating at each of these four locations greatly depends on whether only the excitation point or the entire finger is considered. The point force that induces the smallest local deformation can even cause the largest overall deformation at certain frequency intervals. Above 225 Hz, oblique excitation produces larger mean displacement amplitudes than the other three forces due to excitation of multiple modes involving diagonal deformation. These simulation results give novel insights into the combined influence of excitation location and frequency on the fingertip dynamic response, potentially facilitating the design of future vibration feedback devices.

Project Page [BibTex]

hi Serhat, G., Kuchenbecker, K. J. Fingertip Dynamic Response Simulated Across Excitation Points and Frequencies Biomechanics and Modeling in Mechanobiology, April 2024 (article) Accepted

Project Page [BibTex]

Demonstration: Cutaneous Electrohydraulic (CUTE) Wearable Devices for Expressive and Salient Haptic Feedback

Sanchez-Tamayo, N., Yoder, Z., Ballardini, G., Rothemund, P., Keplinger, C., Kuchenbecker, K. J.

Hands-on demonstration presented at the IEEE RoboSoft Conference, San Diego, USA, April 2024 (misc)

hi rm

[BibTex]

hi rm Sanchez-Tamayo, N., Yoder, Z., Ballardini, G., Rothemund, P., Keplinger, C., Kuchenbecker, K. J. Demonstration: Cutaneous Electrohydraulic (CUTE) Wearable Devices for Expressive and Salient Haptic Feedback Hands-on demonstration presented at the IEEE RoboSoft Conference, San Diego, USA, April 2024 (misc)

[BibTex]

Expert Perception of Teleoperated Social Exercise Robots

Mohan, M., Mat Husin, H., Kuchenbecker, K. J.

In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), pages: 769-773, Boulder, USA, March 2024, Late-Breaking Report (LBR) (5 pages) presented at the IEEE/ACM International Conference on Human-Robot Interaction (HRI) (inproceedings)

Abstract

Social robots could help address the growing issue of physical inactivity by inspiring users to engage in interactive exercise. Nevertheless, the practical implementation of social exercise robots poses substantial challenges, particularly in terms of personalizing their activities to individuals. We propose that motion-capture-based teleoperation could serve as a viable solution to address these needs by enabling experts to record custom motions that could later be played back without their real-time involvement. To gather feedback about this idea, we conducted semi-structured interviews with eight exercise-therapy professionals. Our findings indicate that experts' attitudes toward social exercise robots become more positive when considering the prospect of teleoperation to record and customize robot behaviors.

DOI Project Page [BibTex]

hi Mohan, M., Mat Husin, H., Kuchenbecker, K. J. Expert Perception of Teleoperated Social Exercise Robots In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), pages: 769-773, Boulder, USA, March 2024, Late-Breaking Report (LBR) (5 pages) presented at the IEEE/ACM International Conference on Human-Robot Interaction (HRI) (inproceedings)

DOI Project Page [BibTex]

TADA! Text to Animatable Digital Avatars

Liao, T., Yi, H., Xiu, Y., Tang, J., Huang, Y., Thies, J., Black, M. J.

In International Conference on 3D Vision (3DV 2024), 3DV 2024, March 2024 (inproceedings) Accepted

Abstract

We introduce TADA, a simple-yet-effective approach that takes textual descriptions and produces expressive 3D avatars with high-quality geometry and lifelike textures, that can be animated and rendered with traditional graphics pipelines. Existing text-based character generation methods are limited in terms of geometry and texture quality, and cannot be realistically animated due to inconsistent align-007 ment between the geometry and the texture, particularly in the face region. To overcome these limitations, TADA leverages the synergy of a 2D diffusion model and an animatable parametric body model. Specifically, we derive an optimizable high-resolution body model from SMPL-X with 3D displacements and a texture map, and use hierarchical rendering with score distillation sampling (SDS) to create high-quality, detailed, holistic 3D avatars from text. To ensure alignment between the geometry and texture, we render normals and RGB images of the generated character and exploit their latent embeddings in the SDS training process. We further introduce various expression parameters to deform the generated character during training, ensuring that the semantics of our generated character remain consistent with the original SMPL-X model, resulting in an animatable character. Comprehensive evaluations demonstrate that TADA significantly surpasses existing approaches on both qualitative and quantitative measures. TADA enables creation of large-scale digital character assets that are ready for animation and rendering, while also being easily editable through natural language. The code will be public for research purposes.

Home Code Video [BibTex]

ps Liao, T., Yi, H., Xiu, Y., Tang, J., Huang, Y., Thies, J., Black, M. J. TADA! Text to Animatable Digital Avatars In International Conference on 3D Vision (3DV 2024), 3DV 2024, March 2024 (inproceedings) Accepted

Home Code Video [BibTex]

POCO: 3D Pose and Shape Estimation using Confidence

Dwivedi, S. K., Schmid, C., Yi, H., Black, M. J., Tzionas, D.

In International Conference on 3D Vision (3DV 2024), 3DV 2024, March 2024 (inproceedings)

Abstract

The regression of 3D Human Pose and Shape HPS from an image is becoming increasingly accurate. This makes the results useful for downstream tasks like human action recognition or 3D graphics. Yet, no regressor is perfect, and accuracy can be affected by ambiguous image evidence or by poses and appearance that are unseen during training. Most current HPS regressors, however, do not report the confidence of their outputs, meaning that downstream tasks cannot differentiate accurate estimates from inaccurate ones. To address this, we develop POCO, a novel framework for training HPS regressors to estimate not only a 3D human body, but also their confidence, in a single feed-forward pass. Specifically, POCO estimates both the 3D body pose and a per-sample variance. The key idea is to introduce a Dual Conditioning Strategy (DCS) for regressing uncertainty that is highly correlated to pose reconstruction quality. The POCO framework can be applied to any HPS regressor and here we evaluate it by modifying HMR, PARE, and CLIFF. In all cases, training the network to reason about uncertainty helps it learn to more accurately estimate 3D pose. While this was not our goal, the improvement is modest but consistent. Our main motivation is to provide uncertainty estimates for downstream tasks; we demonstrate this in two ways: (1) We use the confidence estimates to bootstrap HPS training. Given unlabelled image data, we take the confident estimates of a POCO-trained regressor as pseudo ground truth. Retraining with this automatically-curated data improves accuracy. (2) We exploit uncertainty in video pose estimation by automatically identifying uncertain frames (e.g. due to occlusion) and inpainting these from confident frames.

Paper SupMat Poster link (url) [BibTex]

ps Dwivedi, S. K., Schmid, C., Yi, H., Black, M. J., Tzionas, D. POCO: 3D Pose and Shape Estimation using Confidence In International Conference on 3D Vision (3DV 2024), 3DV 2024, March 2024 (inproceedings)

Paper SupMat Poster link (url) [BibTex]

TECA: Text-Guided Generation and Editing of Compositional 3D Avatars

Zhang, H., Feng, Y., Kulits, P., Wen, Y., Thies, J., Black, M. J.

In International Conference on 3D Vision (3DV 2024), 3DV 2024, March 2024 (inproceedings) To be published

Abstract

Our goal is to create a realistic 3D facial avatar with hair and accessories using only a text description. While this challenge has attracted significant recent interest, existing methods either lack realism, produce unrealistic shapes, or do not support editing, such as modifications to the hairstyle. We argue that existing methods are limited because they employ a monolithic modeling approach, using a single representation for the head, face, hair, and accessories. Our observation is that the hair and face, for example, have very different structural qualities that benefit from different representations. Building on this insight, we generate avatars with a compositional model, in which the head, face, and upper body are represented with traditional 3D meshes, and the hair, clothing, and accessories with neural radiance fields (NeRF). The model-based mesh representation provides a strong geometric prior for the face region, improving realism while enabling editing of the person's appearance. By using NeRFs to represent the remaining components, our method is able to model and synthesize parts with complex geometry and appearance, such as curly hair and fluffy scarves. Our novel system synthesizes these high-quality compositional avatars from text descriptions. The experimental results demonstrate that our method, Text-guided generation and Editing of Compositional Avatars (TECA), produces avatars that are more realistic than those of recent methods while being editable because of their compositional nature. For example, our TECA enables the seamless transfer of compositional features like hairstyles, scarves, and other accessories between avatars. This capability supports applications such as virtual try-on.

ncs ps

arXiv project link (url) [BibTex]

ncs ps Zhang, H., Feng, Y., Kulits, P., Wen, Y., Thies, J., Black, M. J. TECA: Text-Guided Generation and Editing of Compositional 3D Avatars In International Conference on 3D Vision (3DV 2024), 3DV 2024, March 2024 (inproceedings) To be published

arXiv project link (url) [BibTex]

GAN-Avatar: Controllable Personalized GAN-based Human Head Avatar

Kabadayi, B., Zielonka, W., Bhatnagar, B. L., Pons-Moll, G., Thies, J.

In International Conference on 3D Vision (3DV), March 2024 (inproceedings)

Abstract

Digital humans and, especially, 3D facial avatars have raised a lot of attention in the past years, as they are the backbone of several applications like immersive telepresence in AR or VR. Despite the progress, facial avatars reconstructed from commodity hardware are incomplete and miss out on parts of the side and back of the head, severely limiting the usability of the avatar. This limitation in prior work stems from their requirement of face tracking, which fails for profile and back views. To address this issue, we propose to learn person-specific animatable avatars from images without assuming to have access to precise facial expression tracking. At the core of our method, we leverage a 3D-aware generative model that is trained to reproduce the distribution of facial expressions from the training data. To train this appearance model, we only assume to have a collection of 2D images with the corresponding camera parameters. For controlling the model, we learn a mapping from 3DMM facial expression parameters to the latent space of the generative model. This mapping can be learned by sampling the latent space of the appearance model and reconstructing the facial parameters from a normalized frontal view, where facial expression estimation performs well. With this scheme, we decouple 3D appearance reconstruction and animation control to achieve high fidelity in image synthesis. In a series of experiments, we compare our proposed technique to state-of-the-art monocular methods and show superior quality while not requiring expression tracking of the training data.

ncs

Video Webpage Code Arxiv [BibTex]

ncs Kabadayi, B., Zielonka, W., Bhatnagar, B. L., Pons-Moll, G., Thies, J. GAN-Avatar: Controllable Personalized GAN-based Human Head Avatar In International Conference on 3D Vision (3DV), March 2024 (inproceedings)

Video Webpage Code Arxiv [BibTex]

TeCH: Text-guided Reconstruction of Lifelike Clothed Humans

Huang, Y., Yi, H., Xiu, Y., Liao, T., Tang, J., Cai, D., Thies, J.

In International Conference on 3D Vision (3DV 2024), 3DV 2024, March 2024 (inproceedings) Accepted

Abstract

Despite recent research advancements in reconstructing clothed humans from a single image, accurately restoring the "unseen regions" with high-level details remains an unsolved challenge that lacks attention. Existing methods often generate overly smooth back-side surfaces with a blurry texture. But how to effectively capture all visual attributes of an individual from a single image, which are sufficient to reconstruct unseen areas (e.g., the back view)? Motivated by the power of foundation models, TeCH reconstructs the 3D human by leveraging 1) descriptive text prompts (e.g., garments, colors, hairstyles) which are automatically generated via a garment parsing model and Visual Question Answering (VQA), 2) a personalized fine-tuned Text-to-Image diffusion model (T2I) which learns the "indescribable" appearance. To represent high-resolution 3D clothed humans at an affordable cost, we propose a hybrid 3D representation based on DMTet, which consists of an explicit body shape grid and an implicit distance field. Guided by the descriptive prompts + personalized T2I diffusion model, the geometry and texture of the 3D humans are optimized through multi-view Score Distillation Sampling (SDS) and reconstruction losses based on the original observation. TeCH produces high-fidelity 3D clothed humans with consistent & delicate texture, and detailed full-body geometry. Quantitative and qualitative experiments demonstrate that TeCH outperforms the state-of-the-art methods in terms of reconstruction accuracy and rendering quality.

Code Home Video arXiv [BibTex]

ps Huang, Y., Yi, H., Xiu, Y., Liao, T., Tang, J., Cai, D., Thies, J. TeCH: Text-guided Reconstruction of Lifelike Clothed Humans In International Conference on 3D Vision (3DV 2024), 3DV 2024, March 2024 (inproceedings) Accepted

Code Home Video arXiv [BibTex]

ArtiGrasp: Physically Plausible Synthesis of Bi-Manual Dexterous Grasping and Articulation

Zhang, H., Christen, S., Fan, Z., Zheng, L., Hwangbo, J., Song, J., Hilliges, O.

In International Conference on 3D Vision (3DV 2024), 3DV 2024, March 2024 (inproceedings) Accepted

Abstract

We present ArtiGrasp, a novel method to synthesize bi-manual hand-object interactions that include grasping and articulation. This task is challenging due to the diversity of the global wrist motions and the precise finger control that are necessary to articulate objects. ArtiGrasp leverages reinforcement learning and physics simulations to train a policy that controls the global and local hand pose. Our framework unifies grasping and articulation within a single policy guided by a single hand pose reference. Moreover, to facilitate the training of the precise finger control required for articulation, we present a learning curriculum with increasing difficulty. It starts with single-hand manipulation of stationary objects and continues with multi-agent training including both hands and non-stationary objects. To evaluate our method, we introduce Dynamic Object Grasping and Articulation, a task that involves bringing an object into a target articulated pose. This task requires grasping, relocation, and articulation. We show our method's efficacy towards this task. We further demonstrate that our method can generate motions with noisy hand-object pose estimates from an off-the-shelf image-based regressor.

pdf project code [BibTex]

ps Zhang, H., Christen, S., Fan, Z., Zheng, L., Hwangbo, J., Song, J., Hilliges, O. ArtiGrasp: Physically Plausible Synthesis of Bi-Manual Dexterous Grasping and Articulation In International Conference on 3D Vision (3DV 2024), 3DV 2024, March 2024 (inproceedings) Accepted

pdf project code [BibTex]

Identifiable Causal Representation Learning

von Kügelgen, J.

University of Cambridge, UK, Cambridge, February 2024, (Cambridge-Tübingen-Fellowship) (phdthesis)

[BibTex]

ei von Kügelgen, J. Identifiable Causal Representation Learning University of Cambridge, UK, Cambridge, February 2024, (Cambridge-Tübingen-Fellowship) (phdthesis)

[BibTex]

IMU-Based Kinematics Estimation Accuracy Affects Gait Retraining Using Vibrotactile Cues

Rokhmanova, N., Pearl, O., Kuchenbecker, K. J., Halilaj, E.

IEEE Transactions on Neural Systems and Rehabilitation Engineering, 32, pages: 1005-1012, February 2024 (article)

Abstract

Wearable sensing using inertial measurement units (IMUs) is enabling portable and customized gait retraining for knee osteoarthritis. However, the vibrotactile feedback that users receive directly depends on the accuracy of IMU-based kinematics. This study investigated how kinematic errors impact an individual's ability to learn a therapeutic gait using vibrotactile cues. Sensor accuracy was computed by comparing the IMU-based foot progression angle to marker-based motion capture, which was used as ground truth. Thirty subjects were randomized into three groups to learn a toe-in gait: one group received vibrotactile feedback during gait retraining in the laboratory, another received feedback outdoors, and the control group received only verbal instruction and proceeded directly to the evaluation condition. All subjects were evaluated on their ability to maintain the learned gait in a new outdoor environment. We found that subjects with high tracking errors exhibited more incorrect responses to vibrotactile cues and slower learning rates than subjects with low tracking errors. Subjects with low tracking errors outperformed the control group in the evaluation condition, whereas those with higher error did not. Errors were correlated with foot size and angle magnitude, which may indicate a non-random algorithmic bias. The accuracy of IMU-based kinematics has a cascading effect on feedback; ignoring this effect could lead researchers or clinicians to erroneously classify a patient as a non-responder if they did not improve after retraining. To use patient and clinician time effectively, future implementation of portable gait retraining will require assessment across a diverse range of patients.

DOI Project Page [BibTex]

hi Rokhmanova, N., Pearl, O., Kuchenbecker, K. J., Halilaj, E. IMU-Based Kinematics Estimation Accuracy Affects Gait Retraining Using Vibrotactile Cues IEEE Transactions on Neural Systems and Rehabilitation Engineering, 32, pages: 1005-1012, February 2024 (article)

DOI Project Page [BibTex]

Creating a Haptic Empathetic Robot Animal That Feels Touch and Emotion

Burns, R.

University of Tübingen, Tübingen, Germany, February 2024, Department of Computer Science (phdthesis)

Abstract

Social touch, such as a hug or a poke on the shoulder, is an essential aspect of everyday interaction. Humans use social touch to gain attention, communicate needs, express emotions, and build social bonds. Despite its importance, touch sensing is very limited in most commercially available robots. By endowing robots with social-touch perception, one can unlock a myriad of new interaction possibilities. In this thesis, I present my work on creating a Haptic Empathetic Robot Animal (HERA), a koala-like robot for children with autism. I demonstrate the importance of establishing design guidelines based on one's target audience, which we investigated through interviews with autism specialists. I share our work on creating full-body tactile sensing for the NAO robot using low-cost, do-it-yourself (DIY) methods, and I introduce an approach to model long-term robot emotions using second-order dynamics.

Project Page [BibTex]

hi Burns, R. Creating a Haptic Empathetic Robot Animal That Feels Touch and Emotion University of Tübingen, Tübingen, Germany, February 2024, Department of Computer Science (phdthesis)

Project Page [BibTex]

Adapting a High-Fidelity Simulation of Human Skin for Comparative Touch Sensing in the Elephant Trunk

Schulz, A., Serhat, G., Kuchenbecker, K. J.

Abstract presented at the Society for Integrative and Comparative Biology Annual Meeting (SICB), Seattle, USA, January 2024 (misc)

Abstract

Skin is a complex biological composite consisting of layers with distinct mechanical properties, morphologies, and mechanosensory capabilities. This work seeks to expand the comparative biomechanics field to comparative haptics, analyzing elephant trunk touch by redesigning a previously published human finger-pad model with morphological parameters measured from an elephant trunk. The dorsal surface of the elephant trunk has a thick, wrinkled epidermis covered with whiskers at the distal tip and deep folds at the proximal base. We hypothesize that this thick dorsal skin protects the trunk from mechanical damage but significantly dulls its tactile sensing ability. To facilitate safe and dexterous motion, the distributed dorsal whiskers might serve as pre-touch antennae, transmitting an amplified version of impending contact to the mechanoreceptors beneath the elephant's armor. We tested these hypotheses by simulating soft tissue deformation through high-fidelity finite element analyses involving representative skin layers and whiskers, modeled based on frozen African elephant trunk (Loxodonta africana) morphology. For a typical contact force, quintupling the stratum corneum thickness to match dorsal trunk skin reduces the von Mises stress communicated to the dermis by 18%. However, adding a whisker offsets this dulled sensing, as hypothesized, amplifying the stress by more than 15 at the same location. We hope this work will motivate further investigations of mammalian touch using approaches and models from the ample literature on human touch.

[BibTex]

hi Schulz, A., Serhat, G., Kuchenbecker, K. J. Adapting a High-Fidelity Simulation of Human Skin for Comparative Touch Sensing in the Elephant Trunk Abstract presented at the Society for Integrative and Comparative Biology Annual Meeting (SICB), Seattle, USA, January 2024 (misc)

[BibTex]

Adversarial Likelihood Estimation With One-Way Flows

Ben-Dov, O., Gupta, P. S., Abrevaya, V., Black, M. J., Ghosh, P.

In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages: 3779-3788, January 2024 (inproceedings)

Abstract

Generative Adversarial Networks (GANs) can produce high-quality samples, but do not provide an estimate of the probability density around the samples. However, it has been noted that maximizing the log-likelihood within an energy-based setting can lead to an adversarial framework where the discriminator provides unnormalized density (often called energy). We further develop this perspective, incorporate importance sampling, and show that 1) Wasserstein GAN performs a biased estimate of the partition function, and we propose instead to use an unbiased estimator; and 2) when optimizing for likelihood, one must maximize generator entropy. This is hypothesized to provide a better mode coverage. Different from previous works, we explicitly compute the density of the generated samples. This is the key enabler to designing an unbiased estimator of the partition function and computation of the generator entropy term. The generator density is obtained via a new type of flow network, called one-way flow network, that is less constrained in terms of architecture, as it does not require a tractable inverse function. Our experimental results show that our method converges faster, produces comparable sample quality to GANs with similar architecture, successfully avoids over-fitting to commonly used datasets and produces smooth low-dimensional latent representations of the training data.

pdf arXiv [BibTex]

ps Ben-Dov, O., Gupta, P. S., Abrevaya, V., Black, M. J., Ghosh, P. Adversarial Likelihood Estimation With One-Way Flows In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages: 3779-3788, January 2024 (inproceedings)

pdf arXiv [BibTex]

Polarization-based non-linear deep diffractive neural networks

Kottapalli, S. N. M., Schlieder, L., Song, A., Volchkov, V., Schölkopf, B., Fischer, P.

AI and Optical Data Sciences V, PC12903, pages: PC129030B, (Editors: Ken-ichi Kitayama and Volker J. Sorger), SPIE, January 2024 (conference)

ei OS Lab

DOI [BibTex]

ei OS Lab Kottapalli, S. N. M., Schlieder, L., Song, A., Volchkov, V., Schölkopf, B., Fischer, P. Polarization-based non-linear deep diffractive neural networks AI and Optical Data Sciences V, PC12903, pages: PC129030B, (Editors: Ken-ichi Kitayama and Volker J. Sorger), SPIE, January 2024 (conference)

DOI [BibTex]

Multi-channel free space optical convolutions with incoherent light

Song, A., Kottapalli, S. N. M., Schölkopf, B., Fischer, P.

AI and Optical Data Sciences V, PC12903, pages: PC129030I, (Editors: Ken-ichi Kitayama and Volker J. Sorger), SPIE, January 2024 (conference)

DOI [BibTex]

ei Song, A., Kottapalli, S. N. M., Schölkopf, B., Fischer, P. Multi-channel free space optical convolutions with incoherent light AI and Optical Data Sciences V, PC12903, pages: PC129030I, (Editors: Ken-ichi Kitayama and Volker J. Sorger), SPIE, January 2024 (conference)

DOI [BibTex]

MPI-10: Haptic-Auditory Measurements from Tool-Surface Interactions

Khojasteh, B., Shao, Y., Kuchenbecker, K. J.

Dataset published as a companion to the journal article "Robust Surface Recognition with the Maximum Mean Discrepancy: Degrading Haptic-Auditory Signals through Bandwidth and Noise" in IEEE Transactions on Haptics, January 2024 (misc)

DOI Project Page [BibTex]

hi Khojasteh, B., Shao, Y., Kuchenbecker, K. J. MPI-10: Haptic-Auditory Measurements from Tool-Surface Interactions Dataset published as a companion to the journal article "Robust Surface Recognition with the Maximum Mean Discrepancy: Degrading Haptic-Auditory Signals through Bandwidth and Noise" in IEEE Transactions on Haptics, January 2024 (misc)

DOI Project Page [BibTex]

How Should Robots Exercise with People? Robot-Mediated Exergames Win with Music, Social Analogues, and Gameplay Clarity

Fitter, N. T., Mohan, M., Preston, R. C., Johnson, M. J., Kuchenbecker, K. J.

Frontiers in Robotics and AI, 10(1155837):1-18, January 2024 (article)

Abstract

The modern worldwide trend toward sedentary behavior comes with significant health risks. An accompanying wave of health technologies has tried to encourage physical activity, but these approaches often yield limited use and retention. Due to their unique ability to serve as both a health-promoting technology and a social peer, we propose robots as a game-changing solution for encouraging physical activity. This article analyzes the eight exergames we previously created for the Rethink Baxter Research Robot in terms of four key components that are grounded in the video-game literature: repetition, pattern matching, music, and social design. We use these four game facets to assess gameplay data from 40 adult users who each experienced the games in balanced random order. In agreement with prior research, our results show that relevant musical cultural references, recognizable social analogues, and gameplay clarity are good strategies for taking an otherwise highly repetitive physical activity and making it engaging and popular among users. Others who study socially assistive robots and rehabilitation robotics can benefit from this work by considering the presented design attributes to generate future hypotheses and by using our eight open-source games to pursue follow-up work on social-physical exercise with robots.

DOI Project Page [BibTex]

hi Fitter, N. T., Mohan, M., Preston, R. C., Johnson, M. J., Kuchenbecker, K. J. How Should Robots Exercise with People? Robot-Mediated Exergames Win with Music, Social Analogues, and Gameplay Clarity Frontiers in Robotics and AI, 10(1155837):1-18, January 2024 (article)

DOI Project Page [BibTex]

Whiskers That Don’t Whisk: Unique Structure From the Absence of Actuation in Elephant Whiskers

Schulz, A., Kaufmann, L., Brecht, M., Richter, G., Kuchenbecker, K. J.

Abstract presented at the Society for Integrative and Comparative Biology Annual Meeting (SICB), Seattle, USA, January 2024 (misc)

Abstract

Whiskers are so named because these hairs often actuate circularly, whisking, via collagen wrapping at the root of the hair follicle to increase their sensing volumes. Elephant trunks are a unique case study for whiskers, as the dorsal and lateral sections of the elephant proboscis have scattered sensory hairs that lack individual actuation. We hypothesize that the actuation limitations of these non-whisking whiskers led to anisotropic morphology and non-homogeneous composition to meet the animal's sensory needs. To test these hypotheses, we examined trunk whiskers from a 35-year-old female African savannah elephant (Loxodonta africana). Whisker morphology was evaluated through micro-CT and polarized light microscopy. The whiskers from the distal tip of the trunk were found to be axially asymmetric, with an ovular cross-section at the root, shifting to a near-square cross-section at the point. Nanoindentation and additional microscopy revealed that elephant whiskers have a composition unlike any other mammalian hair ever studied: we recorded an elastic modulus of 3 GPa at the root and 0.05 GPa at the point of a single 4-cm-long whisker. This work challenges the assumption that hairs have circular cross-sections and isotropic mechanical properties. With such striking differences compared to other mammals, including the mouse (Mus musculus), rat (Rattus norvegicus), and cat (Felis catus), we conclude that whisker morphology and composition play distinct and complementary roles in elephant trunk mechanosensing.

zwe-csfm hi

[BibTex]

zwe-csfm hi Schulz, A., Kaufmann, L., Brecht, M., Richter, G., Kuchenbecker, K. J. Whiskers That Don’t Whisk: Unique Structure From the Absence of Actuation in Elephant Whiskers Abstract presented at the Society for Integrative and Comparative Biology Annual Meeting (SICB), Seattle, USA, January 2024 (misc)

[BibTex]

Robust Surface Recognition with the Maximum Mean Discrepancy: Degrading Haptic-Auditory Signals through Bandwidth and Noise

(Best ToH Short Paper Award at the IEEE Haptics Symposium Conference 2024)

Khojasteh, B., Shao, Y., Kuchenbecker, K. J.

IEEE Transactions on Haptics, 17(1):58-65, January 2024, Presented at the IEEE Haptics Symposium (article)

Abstract

Sliding a tool across a surface generates rich sensations that can be analyzed to recognize what is being touched. However, the optimal configuration for capturing these signals is yet unclear. To bridge this gap, we consider haptic-auditory data as a human explores surfaces with different steel tools, including accelerations of the tool and finger, force and torque applied to the surface, and contact sounds. Our classification pipeline uses the maximum mean discrepancy (MMD) to quantify differences in data distributions in a high-dimensional space for inference. With recordings from three hemispherical tool diameters and ten diverse surfaces, we conducted two degradation studies by decreasing sensing bandwidth and increasing added noise. We evaluate the haptic-auditory recognition performance achieved with the MMD to compare newly gathered data to each surface in our known library. The results indicate that acceleration signals alone have great potential for high-accuracy surface recognition and are robust against noise contamination. The optimal accelerometer bandwidth exceeds 1000 Hz, suggesting that useful vibrotactile information extends beyond human perception range. Finally, smaller tool tips generate contact vibrations with better noise robustness. The provided sensing guidelines may enable superhuman performance in portable surface recognition, which could benefit quality control, material documentation, and robotics.

DOI Project Page [BibTex]

hi Khojasteh, B., Shao, Y., Kuchenbecker, K. J. Robust Surface Recognition with the Maximum Mean Discrepancy: Degrading Haptic-Auditory Signals through Bandwidth and Noise IEEE Transactions on Haptics, 17(1):58-65, January 2024, Presented at the IEEE Haptics Symposium (article)

DOI Project Page [BibTex]

Physics-Based Rigid Body Object Tracking and Friction Filtering From RGB-D Videos

Kandukuri, R. K., Strecke, M., Stueckler, J.

In International Conference on 3D Vision (3DV), 2024, accepted, preprint arXiv: 2309.15703 (inproceedings) Accepted

Abstract

Physics-based understanding of object interactions from sensory observations is an essential capability in augmented reality and robotics. It enables capturing the properties of a scene for simulation and control. In this paper, we propose a novel approach for real-to-sim which tracks rigid objects in 3D from RGB-D images and infers physical properties of the objects. We use a differentiable physics simulation as state-transition model in an Extended Kalman Filter which can model contact and friction for arbitrary mesh-based shapes and in this way estimate physically plausible trajectories. We demonstrate that our approach can filter position, orientation, velocities, and concurrently can estimate the coefficient of friction of the objects. We analyse our approach on various sliding scenarios in synthetic image sequences of single objects and colliding objects. We also demonstrate and evaluate our approach on a real-world dataset. We will make our novel benchmark datasets publicly available to foster future research in this novel problem setting and comparison with our method.

preprint supplemental video dataset [BibTex]

ev Kandukuri, R. K., Strecke, M., Stueckler, J. Physics-Based Rigid Body Object Tracking and Friction Filtering From RGB-D Videos In International Conference on 3D Vision (3DV), 2024, accepted, preprint arXiv: 2309.15703 (inproceedings) Accepted

preprint supplemental video dataset [BibTex]

← Previous
1
2
3
4
5
6
7
8
9
…
150
151
Next →

MPI Papers

Abteilungen

Forschungsgruppen

Publikationen

Jahr

2024

2024