- RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control, [HomePage], [Paper]
- VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models, [HomePage], [Paper]
- PaLM-E: An Embodied Multimodal Language Model, [HomePage], [Paper]
- Code as Policies: Language Model Programs for Embodied Control, [HomePage]