arxiv:2507.05197
Shihan Dou
Ablustrund
AI & ML interests
Natural Language Processing, Large Language Models
Recent Activity
upvoted a paper about 1 month ago
LLMEval-Logic: A Solver-Verified Chinese Benchmark for Logical Reasoning of LLMs with Adversarial Hardening
published a dataset about 2 months ago
tencent/CL-bench-Life
updated a dataset about 2 months ago
tencent/CL-bench-Life