VeriBench: End-to-End Formal Verification Benchmark for AI Coding Agents in Lean 4
Under review at NeurIPS 2026
Third-author paper; under review at NeurIPS 2026. Led the accompanying technical blog post.
Recommended citation: Brando Miranda, Srivatsava Daruru, Ethan S Hersch, Zhanke Zhou, Allen Nie, Daneshvar Amrollahi, Leni Aniva, Iddah Mlauzi, Kirill Acharya, Elyas Obbad, Dilara Soylu, Weston Kirk, Zixiao Jolene Wang, Kai Fronsdal, Ying Li, Donald Poindexter Jr, Rakshit Kaushik, Shurui Liu, Yegor Denisov-Blanch, Steven Dillmann, Simon Obstbaum, Santiago Cuellar, John Sarracino, Rylan Schaeffer, Mo Tiwari, Donghyun Lee, Bo Han, Sanmi Koyejo. "VeriBench: End-to-End Formal Verification Benchmark for AI Coding Agents in Lean 4." Under review at NeurIPS 2026.
