- A compositional world model. A principled design that combines a controllable multi-view dynamics model with a progress value model, yielding informative advantages for robust policy improvement.
- RL in imagination. A scalable self-improving framework that bootstraps robot policies through imaginary rollouts, avoiding the hardware cost and laborious reset of real-world interactions.
- Real-world manipulation gains. Non-trivial performance improvements on challenging dexterous tasks, including +35% on dynamic brick sorting, +45% on backpack packing, and +35% on box closing.
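As a rough illustration of how a dynamics model and a progress value model could combine to produce advantages in imagination, here is a minimal toy sketch. It is not the RISE implementation: `toy_dynamics`, `toy_progress`, `imagine_rollout`, and `imagined_advantages` are hypothetical names, the state is a plain 2-D vector rather than multi-view video, and the mean-centered baseline is an assumption chosen for simplicity.

```python
from typing import List

State = List[float]
Action = List[float]


def toy_dynamics(state: State, action: Action) -> State:
    """Hypothetical stand-in for a learned action-conditioned dynamics model."""
    return [s + a for s, a in zip(state, action)]


def toy_progress(state: State, goal: State) -> float:
    """Hypothetical progress value in (0, 1]; 1.0 means the goal is reached."""
    dist = sum((s - g) ** 2 for s, g in zip(state, goal)) ** 0.5
    return 1.0 / (1.0 + dist)


def imagine_rollout(state: State, actions: List[Action]) -> State:
    """Roll an action chunk through the dynamics model; no real robot is involved."""
    for action in actions:
        state = toy_dynamics(state, action)
    return state


def imagined_advantages(
    start: State, candidates: List[List[Action]], goal: State
) -> List[float]:
    """Score each candidate by imagined progress gain, centered on the group mean."""
    start_value = toy_progress(start, goal)
    gains = [
        toy_progress(imagine_rollout(start, chunk), goal) - start_value
        for chunk in candidates
    ]
    baseline = sum(gains) / len(gains)  # assumed baseline: mean gain over candidates
    return [gain - baseline for gain in gains]


start = [0.0, 0.0]
goal = [1.0, 1.0]
candidates = [
    [[0.5, 0.5], [0.5, 0.5]],      # moves toward the goal
    [[0.0, 0.0], [0.0, 0.0]],      # stays in place
    [[-0.5, -0.5], [-0.5, -0.5]],  # moves away from the goal
]

advantages = imagined_advantages(start, candidates, goal)
best = max(range(len(candidates)), key=lambda i: advantages[i])
print("advantages:", advantages)
print("best candidate index:", best)
```

In this toy setting, the action chunk that makes the most imagined progress receives the largest advantage, which is the kind of signal a policy update could then weight more heavily.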
The RISE repository is organized into three major parts:
- policy_and_value/policy_offline_and_value: OpenPI-based offline policy and value training. The two are grouped together because they share a similar model architecture and training pipeline.
- dynamics/dynamics_model: Action-conditioned dynamics model.
- policy_and_value/policy_online: Online RL for policy improvement.
- 💻 Installation
- 🏆 Offline Policy Training & Value Model
- 🔮 Dynamics Model
- 🦾 Online Policy Training
- 🛠️ Deployment on Piper
We thank the following projects for their open-source contributions: OpenPI Pi0 / Pi05, Genie Envisioner, RLinf, LTX-Video.
- [2026/04/22] Training code and pre-trained dynamics model are released.
- [2026/02/11] Paper released on arXiv.
All assets and code in this repository are under the Apache 2.0 license unless specified otherwise. The data and checkpoints are released under CC BY-NC-SA 4.0. Other modules inherit their own distribution licenses.
@article{rise2026,
title={RISE: Self-Improving Robot Policy with Compositional World Model},
author={Yang, Jiazhi and Lin, Kunyang and Li, Jinwei and Zhang, Wencong and Lin, Tianwei and Wu, Longyan and Su, Zhizhong and Zhao, Hao and Zhang, Ya-Qin and Chen, Li and Luo, Ping and Yue, Xiangyu and Li, Hongyang},
journal={arXiv preprint arXiv:2602.11075},
year={2026}
}