leandojo.org - LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Example domain paragraphs

Top right: LeanDojo extracts proofs in Lean into datasets for training machine learning models. It also enables the trained model to prove theorems by interacting with Lean's proof environment.

Bottom: Our ReProver model. Given a state, it retrieves premises from the math library, which are concatenated with the state and fed into an encoder-decoder Transformer to generate the next tactic.
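
The retrieve-then-concatenate step in the caption above can be illustrated with a minimal Python sketch. This is not ReProver's implementation: the bag-of-words cosine similarity here is only a stand-in for whatever retriever the model actually uses, and the names tokenize, cosine, retrieve_premises, and build_generator_input are hypothetical. The resulting string is what would be passed to the encoder-decoder Transformer; the Transformer itself is omitted.

```python
import math
from collections import Counter


def tokenize(text):
    """Whitespace tokenizer; a toy stand-in for a real tokenizer."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_premises(state, premises, k=2):
    """Return the k premises most similar to the proof state (assumed scoring)."""
    query = Counter(tokenize(state))
    ranked = sorted(
        premises,
        key=lambda p: cosine(query, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_generator_input(state, premises, k=2):
    """Concatenate the retrieved premises with the state, one item per line."""
    return "\n".join(retrieve_premises(state, premises, k) + [state])
```

For example, with a hypothetical three-premise library and the state "⊢ gcd n n = n", calling build_generator_input(state, library, k=1) would place the premise sharing the most tokens with the state on the first line and the state itself on the last.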

Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute requirements. This has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks. LeanDojo extracts data from Lea
