Ling and Ring 2.6 Technical Report: Efficient and Instant Agentic Intelligence at Trillion-Parameter Scale Paper • 2606.15079 • Published 8 days ago • 76
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published Apr 6 • 116
TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders Paper • 2604.07340 • Published Apr 8 • 18
OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering Paper • 2604.08209 • Published Apr 9 • 27
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training Paper • 2503.06486 • Published Mar 9, 2025
WorldOlympiad: Can Your World Model Survive a Triathlon? Paper • 2606.11129 • Published 12 days ago • 31
MemDreamer: Decoupling Perception and Reasoning for Long Video Understanding via Hierarchical Graph Memory and Agentic Retrieval Mechanism Paper • 2606.07512 • Published 16 days ago • 39
Eliciting Complex Spatial Reasoning in MLLMs through Wide-Baseline Matching Paper • 2606.03577 • Published 19 days ago • 16
Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration? Paper • 2606.01247 • Published 21 days ago • 31
TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction Paper • 2605.26115 • Published 27 days ago • 52
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer Paper • 2510.06590 • Published Oct 8, 2025 • 78
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling Paper • 2509.12201 • Published Sep 15, 2025 • 107
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Paper • 2502.17157 • Published Feb 24, 2025 • 52