Exploring if LLMs can truly model complex systems in TLA+. Dive into formal verification, AI's role, and my real-world tech insights. Is it worth the hype? Read more!

Recife Guide: Learn TLA+ for system specification, with Clojure
Paulo Feodrippe has created Recife to make the TLA+ formal specification language and its checker more accessible to Clojure devs. He has now also started Recife Guide to teach it. Check out the short example from its quick start: a small model where TLA+/TLC found a design error.
Someone recently told me a project isn’t real until you do a retrospective, so I think it’s time to do one for Let’s Prove Leftpad. Short explanation: it’s a repository of proofs of leftpad, in different proof systems.
Long explanation: the rest of this post.
Background
I’m into formal methods, the discipline of proving software correct. Consider the following contrived code:
def add(x: int, y: int): int {
  if(x == 12976 && y == 14867) {
    return x - y;
  }
  return x + y;
}
This typechecks, and any black-box unit test would find that add(x, y) == x + y. It would also pass property testing: if you randomly picked two 16-bit integers, you’d have to run something like 500 million tests to have a 10% chance of triggering the bug. A formal verification tool would fail the code 100% of the time, and pass it if and only if it was always correct.
The tradeoff for this kind of power is difficulty: proving code correct is very, very hard to do. This is pretty widely accepted in the FM discipline. Outsiders, though, often assume that functional programs are easier to prove correct than imperative programs. While this might be true for informal reasoning, it’s not that true for formal proof. To show this, in 2018 I made an online challenge to people:
Lots of people say "FP is easier to analyze than imperative code because of purity" but whenever I ask for evidence people look at me like I'm crazy. So I'd like to make a challenge: I'll provide three imperative functions, and your job is to convert them into pure functions.
Here's the catch: I formally proved all three functions are correct. You have to do the same. And by "formally prove", I mean "if there are any bugs _it will not compile_". Informal arguments don't count. Quickcheck doesn't count. Partial proofs ("it typechecks") don't count.
I provided two warmups, leftpad and unique, and a main challenge, called “fulcrum”. The full details and results are documented here.
Two things came out of the theorem showdown. First, it put me in touch with Lars Hupel, and we’ve been fast friends ever since. Second, this tweet:
And that’s when I realized that leftpad is a great proving exercise.
(16 November 2022)
This is an “intro packet” you can use to argue for the benefits of formal methods (FM) to your boss. It’s a short explanation, a list of ben
A short, sweet argumentation for why formal methods / TLA+ might be useful for you. Based on a number of short use cases.
Key point: Use TLA+ or a similar tool to quickly build a high-level model of your system (actors, operations, state) and its invariants, then let the checker verify the invariants through an exhaustive search.
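To make "exhaustive search over a model" concrete, here is a minimal sketch in Python. It is not TLA+ or TLC, only an illustration of the idea. The model is an assumption made up for this example: two processes each increment a shared counter in two separate steps (read, then write). The invariant says that once every process is done, the counter equals the number of processes. All names (next_states, invariant, check, INITIAL) are hypothetical.

```python
from collections import deque

# A state is (counter, processes); each process is (program_counter, local_copy).
INITIAL = (0, (("read", 0), ("read", 0)))


def next_states(state):
    """Yield every state reachable in one step by any single process."""
    counter, procs = state
    for i, (pc, tmp) in enumerate(procs):
        if pc == "read":
            # Process i copies the shared counter into its local variable.
            yield (counter, procs[:i] + (("write", counter),) + procs[i + 1:])
        elif pc == "write":
            # Process i writes back its (possibly stale) local copy plus one.
            yield (tmp + 1, procs[:i] + (("done", tmp),) + procs[i + 1:])


def invariant(state):
    """When all processes are done, no increment may have been lost."""
    counter, procs = state
    all_done = all(pc == "done" for pc, _ in procs)
    return (not all_done) or counter == len(procs)


def check(initial, next_states, invariant):
    """Breadth-first search of all reachable states.

    Returns a trace (tuple of states) ending in a violating state,
    or None if the invariant holds everywhere.
    """
    seen = {initial}
    queue = deque([(initial, (initial,))])
    while queue:
        state, trace = queue.popleft()
        if not invariant(state):
            return trace
        for successor in next_states(state):
            if successor not in seen:
                seen.add(successor)
                queue.append((successor, trace + (successor,)))
    return None


if __name__ == "__main__":
    trace = check(INITIAL, next_states, invariant)
    for step in trace or ():
        print(step)
```

The search finds an interleaving in which both processes read the counter before either writes, so one increment is lost. This is the kind of design error a checker surfaces without anyone having to think of the bad schedule in advance.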
xldenis/creusot: deductive verification of Rust code. (semi) automatically prove your code satisfies your specifications!
Creusot is a tool for deductive verification of Rust code. It allows you to annotate your code with specifications, invariants and assertions and then check them formally, returning a proof your code satisfies its specification.
Creusot works by translating Rust code to WhyML, the verification and specification language of Why3. Users can then leverage the full power of Why3 to (semi-)automatically discharge the verification conditions! A PhD thesis.
I’ve been using Vim for eight years and am still discovering new things. This is usually seen as a Good Thing About Vim. In my head, though, it’s a failing of discoverability: I keep discovering new things because Vim makes it so hard to know what’s available. While people often talk about the beauty of modal editing or text objects, I don’t think that gets at the essence of Vim.
Specifying UIs and transitions using statecharts and verifying them using Alloy. Neat.