Recovering Policy-Induced Errors: Benchmarking and Trajectory Synthesis for Robust GUI Agents Paper • 2605.29447 • Published 29 days ago • 21
PlatonicNav: Unveiling Semantic Correspondence in Navigation with Platonic Topological Maps Paper • 2606.01788 • Published 25 days ago • 9
One-Forcing: Towards Stable One-Step Autoregressive Video Generation Paper • 2605.23458 • Published May 22 • 7
ijinyu1113/ft_mr7_410m_seed42_lr3e-5_wd0.05_oldcfg300ep_modarith_subtract_max500_evalevery100_purenum Updated 22 days ago • 1
The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement Paper • 2605.30888 • Published 28 days ago • 10
CoRL2026-CSI/IsaacLab-SO101-Phase1-pick_place-80episode-10fps Updated 23 days ago • 25.3k • 56 • 1
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 30 days ago • 431
Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints Paper • 2605.21085 • Published May 20 • 5
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation Paper • 2605.23271 • Published May 22 • 81
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published May 12 • 196