-
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration
Paper • 2602.05400 • Published • 355 -
PaperBanana: Automating Academic Illustration for AI Scientists
Paper • 2601.23265 • Published • 228 -
FASA: Frequency-aware Sparse Attention
Paper • 2602.03152 • Published • 154 -
mHC: Manifold-Constrained Hyper-Connections
Paper • 2512.24880 • Published • 328
Collections
Discover the best community collections!
Collections including paper arxiv:2602.03152
-
LongCat-Flash-Thinking-2601 Technical Report
Paper • 2601.16725 • Published • 181 -
DeepSeek-OCR 2: Visual Causal Flow
Paper • 2601.20552 • Published • 70 -
Linear representations in language models can change dramatically over a conversation
Paper • 2601.20834 • Published • 21 -
BMAM: Brain-inspired Multi-Agent Memory Framework
Paper • 2601.20465 • Published • 5
-
LTX-2: Efficient Joint Audio-Visual Foundation Model
Paper • 2601.03233 • Published • 181 -
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head
Paper • 2601.07832 • Published • 53 -
Motion Attribution for Video Generation
Paper • 2601.08828 • Published • 72 -
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep
Paper • 2601.19895 • Published • 27
-
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation
Paper • 2601.15369 • Published • 22 -
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model
Paper • 2601.15892 • Published • 55 -
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders
Paper • 2601.16208 • Published • 55 -
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems
Paper • 2601.11004 • Published • 30
-
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration
Paper • 2602.05400 • Published • 355 -
PaperBanana: Automating Academic Illustration for AI Scientists
Paper • 2601.23265 • Published • 228 -
FASA: Frequency-aware Sparse Attention
Paper • 2602.03152 • Published • 154 -
mHC: Manifold-Constrained Hyper-Connections
Paper • 2512.24880 • Published • 328
-
LongCat-Flash-Thinking-2601 Technical Report
Paper • 2601.16725 • Published • 181 -
DeepSeek-OCR 2: Visual Causal Flow
Paper • 2601.20552 • Published • 70 -
Linear representations in language models can change dramatically over a conversation
Paper • 2601.20834 • Published • 21 -
BMAM: Brain-inspired Multi-Agent Memory Framework
Paper • 2601.20465 • Published • 5
-
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation
Paper • 2601.15369 • Published • 22 -
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model
Paper • 2601.15892 • Published • 55 -
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders
Paper • 2601.16208 • Published • 55 -
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems
Paper • 2601.11004 • Published • 30
-
LTX-2: Efficient Joint Audio-Visual Foundation Model
Paper • 2601.03233 • Published • 181 -
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head
Paper • 2601.07832 • Published • 53 -
Motion Attribution for Video Generation
Paper • 2601.08828 • Published • 72 -
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep
Paper • 2601.19895 • Published • 27