WeTalkRobots: The AI and Robotics Research Archive

WeTalkRobots curates and summarizes the latest advances in AI, robotics, humanoid control, reinforcement learning, and vision-language models. Explore research papers, podcasts, and videos focused on robot learning, manipulation, and autonomous systems. The archive holds a curated list of research papers with summaries, images, and audio podcasts.

Browse some posts: F1VLA, InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions, ASAP: Aligning Simulation and Real-world Physics, DeepMimic, DreamVLA, Embodied COT, Figure Helix VLA, Hitter: Humanoid Table Tennis Robot, OmniRetarget, Open-source Robot VLAs and VLMs, RDT-1: Robotic Diffusion Transformer, ResMimic, StarVLA: VLA Model Codebase, TWIST: Teleoperated Whole-Body Imitation System, SoftMimic: Learning Compliant Whole-body Control from Examples, Igniting VLMs toward the Embodied Space, Dexbotic: Open-Source Vision-Language-Action Toolbox, EgoHumanoid: Unlocking In-the-Wild Loco-Manipulation with Robot-Free Egocentric Demonstration, Vision-Language Agent (VLA) from Scratch, BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion, VLAb: A Modular and Extensible Research Platform for Vision-Language Models, RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation, Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision–Language–Action Models via Latent Iterative Reasoning, Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera, Reconstructing Hands in 3D with Transformers, LARGE VIDEO PLANNER ENABLES GENERALIZABLE ROBOT CONTROL, ReMEmbR: Building and Reasoning Over Long-Horizon Spatio-Temporal Memory for Robot Navigation, COMPASS: Cross-embOdiment Mobility Policy via ResiduAl RL and Skill Synthesis, SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control, DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos, DREAMGEN: Unlocking Generalization in Robot Learning through Video World Models

Categories: VLA, Humanoid Robots, Open-source VLA, VLA Loco-manipulation, Learning from Humans, Navigation, Cross-embodiment Learning, Video Generative Models