SCOOP'D: Learning Mixed-Liquid-Solid Scooping via Sim2Real Generative Policy

Under Review

Abstract

Scooping items with tools such as spoons and ladles is common in daily life, with applications ranging from assistive feeding to retrieving items from environmental disaster sites. However, developing a general and autonomous robotic scooping policy is challenging because it requires reasoning about complex tool-object interactions. Furthermore, scooping often involves manipulating deformable objects, such as granular media or liquids, whose infinite-dimensional configuration spaces and complex dynamics make them difficult to handle. We propose SCOOP'D, a method that uses the OmniGibson simulator (built on NVIDIA Omniverse) to collect scooping demonstrations with algorithmic procedures that rely on privileged state information. We then train a diffusion-based generative policy to imitate these demonstrations from observational input. We apply the learned policy directly in diverse real-world scenarios, testing its performance on unseen item quantities, item characteristics, and container types. In zero-shot deployment, our method demonstrates promising results across 465 trials in diverse scenarios, including objects of different difficulty levels that we categorize as "Level 1" and "Level 2." SCOOP'D outperforms all baselines and ablations, suggesting that this is a promising approach to acquiring robotic scooping skills. We will post code, data, and videos online after acceptance.

Our SCOOP'D Method

The first row shows heuristic demonstration collection: in OmniGibson simulation, an algorithmic demonstrator collects the SimScoop dataset. The second row shows deployment. The left part shows how we obtain the state of the target item from text ("meatball"), detection, live video stream segmentation, and regression on the partial point cloud. The middle part shows the pipeline of our method. The right part shows execution, demonstrated in both the top and bottom containers, with the states explicitly marked in the bottom one for clarity.
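As a rough illustration of the state-estimation step described above, the sketch below back-projects the depth pixels inside a segmentation mask into a partial point cloud and summarizes it. It is a minimal sketch under stated assumptions, not the method used in SCOOP'D: the function names, the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, and the use of a plain centroid and axis-aligned extent in place of the regression step are all hypothetical choices made for illustration.

```python
import numpy as np


def masked_depth_to_points(depth, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels into camera-frame 3D points.

    Assumes a pinhole camera model; depth is an (H, W) array in meters and
    mask is an (H, W) array that is nonzero on the target item.
    """
    valid = mask.astype(bool) & (depth > 0)  # ignore missing depth readings
    v, u = np.nonzero(valid)                 # pixel rows (v) and columns (u)
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)       # shape (N, 3)


def estimate_item_state(points):
    """Hypothetical stand-in for the regression step.

    Returns the centroid and axis-aligned extent of the partial point cloud,
    or None when no valid points are available.
    """
    if len(points) == 0:
        return None
    centroid = points.mean(axis=0)
    extent = points.max(axis=0) - points.min(axis=0)
    return centroid, extent
```

In a live pipeline, the mask would come from the detection and segmentation stages, and the resulting estimate would be one of the inputs handed to the policy. Because the point cloud is partial, a centroid of visible points is biased toward the camera, which is one reason a regression step may be preferable in practice.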

Severe Occlusion

Yellow Cube 1

Yellow Cube 2

Yellow PingPongBall 1

Yellow PingPongBall 2

Moving Objects

Moving Toy Duck 1

Moving Toy Duck 2

Moving PingPongBall

Multi-object Scooping

Multi-Cube

Multi-Mushroom

Multi-PingPongBall

Viewpoint Generalization

3 Different Views

More Generalization and Robustness Tests

Level 2 Objects

Deformable Cream 1

Deformable Cream 2

Large Cookie