Posts by Collection

portfolio

publications

[2] VeriBench: End-to-End Formal Verification Benchmark for AI Coding Agents in Lean 4

Accepted at ICML 2026 Workshop on Deep Learning for Code (DL4C); ICML 2026 AI for Math Workshop (AI4Math), 2026

Third-author paper on an end-to-end formal verification benchmark for AI coding agents in Lean 4.

Technical Blog Post

[1] Certifying the Judge: Falsifiable Properties for LLM-Based Evaluation of Formal Code

Accepted at ICML 2026 Workshop on Deep Learning for Code (DL4C); ICML 2026 AI for Math Workshop (AI4Math), 2026

First-author paper on falsifiable properties for LLM-based evaluation of formal code.

talks

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014


Teaching experience 2

Workshop, University 1, Department, 2015
