Rust: beyond the typechecker

Applying deductive verification to Rust

This post will be different from the previous ones, since I’m going to present some of the early results of my work as a PhD student in the Prosecco team at Inria Paris, under the supervision of Karthikeyan Bhargavan and Jonathan Protzenko. My goal is to bring the power of deductive verification techniques to Rust, to improve the level of assurance programmers can have in the correctness of their programs. So far, I can present two compelling case studies from Servo that showcase what using verification techniques can look like.

When do we know programs are correct?

First, we have to understand what it means for a program to be correct. Usually, programs are tested on a set of selected inputs against expected outputs. If the output of a test case matches the expected one, we can be a little more confident that our program is correct. But how can we be sure it will work in all situations? What if our test cases missed a corner case in our algorithm?

One way to increase confidence is to test the program in more systematic ways, for instance by doing unit testing or measuring the code coverage and making sure it’s 100%. Let us say that we are testing the function:

fn foo(x: i32, y:i32) -> i32 {
  ...
}

Unit testing means writing tests for every sub-function used in foo. Having 100% code coverage on foo means that your test cases explore all the branches in the code of foo: if foo contains a test x + y > 1, we should have at least one test case where x + y > 1, and one where x + y <= 1. But even 100% code coverage does not guarantee that your tests reach all the program states that could hide bugs; overflow bugs, for example, are not always detected by such testing.
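
As a hypothetical illustration (this body is not the original foo, which was left unspecified), suppose foo were implemented as follows. The two test cases foo(2, 1) and foo(0, 0) cover both branches, so branch coverage is 100%, yet neither of them exercises the overflow lurking in the condition.

// Hypothetical body for foo, only to illustrate the limits of branch coverage.
fn foo(x: i32, y: i32) -> i32 {
  if x + y > 1 {
    x + y // covered by foo(2, 1)
  } else {
    0 // covered by foo(0, 0)
  }
}
// foo(i32::MAX, 1) still overflows in `x + y` even though both branches are covered:
// it panics in debug builds and silently wraps around in release builds.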

Testing a program in such a systematic way is time-consuming, and requires the programmer to build a mental model of how the code should work in order to create test cases that are as relevant as possible. When all the tests are written with this methodology, there are two correlated outcomes:

  1. The programmer has thought about the behavior of their code for all combinations of the input variables;
  2. The level of assurance about the correctness of the code is high enough for a production release.

However, these methods cannot tell you whether you missed something; even if you carefully test your code to prevent a certain category of bugs, there will always be a possibility that a bug of this category remains hidden somewhere.

Types as verification tools

At POPL 2019, Mark Harman presented Sapiens, a very sophisticated analysis tool that detects and fixes bugs across the Facebook codebase. However, he admits that all the static analysis and machine learning techniques used in Sapiens have so far been devoted to fixing null-pointer-related bugs.

Why doesn’t this category of bugs exist for Rust programs? Simply because you cannot write a Rust program that contains a null-pointer error. Indeed, the Option type forces you to declare, when writing the program, which values can be None and which cannot. The Rust compiler then guides you into handling the None case everywhere it can occur, for all combinations of input values. Thanks to this mechanism, you no longer have to worry about null pointer errors when testing the correctness of your program.
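
As a small sketch of my own (not taken from any real codebase), here is what this looks like in practice: the compiler refuses the function until every arm of the match, including the None case, produces a value.

// The type of `denominator` advertises that the value may be absent,
// and the match forces us to decide what to return in the None case.
fn describe(denominator: Option<i32>) -> String {
  match denominator {
    Some(0) => String::from("division by zero"),
    Some(d) => format!("10 / {} = {}", d, 10 / d),
    None => String::from("no denominator provided"),
  }
}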

What we see here is that the type system, by rejecting at compilation time programs that could be buggy, can remove whole categories of bugs out of the way, freeing resources for testing more advanced correctness properties of the program. The specificity of Rust’s type system is that it is capable of guaranteeing that your program is memory safe and data-race free, two properties associated with categories of bugs that are both dreaded and notoriously hard to debug in C or C++.
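
For instance, here is a minimal sketch of my own (plain safe Rust, not from the post) of a program the borrow checker rejects because it would create two simultaneous mutable references to the same vector, exactly the kind of aliasing that leads to memory corruption or data races in C or C++:

// Rejected at compilation time: `first` is still live when `second` is created,
// so the compiler reports error[E0499]: cannot borrow `v` as mutable more than once.
fn main() {
  let mut v = vec![1, 2, 3];
  let first = &mut v;
  let second = &mut v;
  first.push(4);
  second.push(5);
}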

But how can the typechecking algorithm be so sure that what it claims is true for all combinations of inputs? Let’s take another example that builds on the Option type:

fn bar(x: Option<i32>, y:Option<i32>) -> i32 {
  match (x,y) {
      (Some(x), None) => x,
      (Some(x), Some(y)) => x + y,
      (None, None) => 0
  }
}

You may have already noticed that this program is incorrect; indeed, the Rust compiler reports:

error[E0004]: non-exhaustive patterns: `(None, Some(_))` not covered
 --> src/main.rs:2:9
  |
2 |   match (x,y) {
  |         ^^^^^ pattern `(None, Some(_))` not covered

How can the typechecker know that you missed this pattern? Actually, what the typechecker does is try to prove that for all x and y of type Option<i32>, the function will return a value of type i32. And how would you prove such a property in a more mathematical setting? Simply by considering all the cases associated with Option: either Some or None. Two cases for each of the two values, so 4 cases in total; since the match statement only has 3, one case is missing.
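
For completeness, adding the arm reported as missing gives a version of bar that the typechecker accepts, since all four cases are now covered:

// All four combinations of Some and None are handled, so the typechecker
// can prove that bar returns an i32 for every possible input.
fn bar(x: Option<i32>, y: Option<i32>) -> i32 {
  match (x, y) {
      (Some(x), None) => x,
      (None, Some(y)) => y,
      (Some(x), Some(y)) => x + y,
      (None, None) => 0
  }
}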

This example shows a correspondence between the code and mathematical concepts; this idea, known as the Curry-Howard correspondence, is the foundation of modern programming language design and research. The goal of this field of Computer Science is to get rid of more and more categories of bugs by proving their absence from programs, and to allow programmers to specify the correctness of their programs thanks to more expressive mathematical theories.

A plan for bringing verification to Rust developers

There are multiple ways to produce software that has been partially or totally verified. A lot of very successful techniques involve writing the program in a very high-level specification language for which we know exactly how every language construct behaves. However, Rust was not designed with this goal in mind and lacks such a complete and official language formalisation. The Rustbelt project has brilliantly started to tackle this problem by proving that the unsafe parts of the standard library do not jeopardize the properties that safe code using the standard library should enjoy.

More recently, the team developing the Viper verification framework built a frontend for a subset of Rust that allows for the automatic verification of memory safety and some arithmetic properties. Internally, Viper uses a verification technique called symbolic execution, which builds proofs without requiring any indications from the user. Viper also uses a different verification technique, on which my PhD will be focused: deductive verification. In this setting, every function of the program has a precondition (a set of things that need to be true for the function to be called) and a postcondition (a set of things that need to be true at the end of the function). The verification consists in proving that the body of the function makes the postcondition true, assuming the precondition is true.

How could these preconditions and postconditions be specified in Rust? The syntax and details are still being discussed, but it could look like this:

struct Point {
  x: i32,
  y: i32
}

fn is_diagonal(p: &Point) -> bool {
  p.x == p.y
}

#[requires="is_diagonal(p)"]
#[ensures="is_diagonal(new_p)"]
fn double(p: Point) -> new_p: Point {
    Point {
      x: p.x + p.x,
      y: 2 * p.y
    }
}

The function is_diagonal is the specification of the function double, and it is not intended to be executed. Rather, it serves as a mathematical description of the function. But how do we prove this postcondition? Here, we just need to prove that 2 * p.y == p.x + p.x under the assumption that p.x == p.y. Rather than building an ad hoc verification framework for Rust, I am planning to use F*, a semi-automatic proof assistant based on deductive verification that uses the Z3 theorem prover to alleviate the proof burden for the user.
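
To make the proof obligation more concrete, here is a purely illustrative, executable counterpart of the contract (this is not how F* verification works): the pre- and postcondition of double can be checked at runtime with assertions, and deductive verification amounts to proving, once and for all, that these assertions can never fail.

// Illustration only: runtime assertions standing in for the static contract.
fn double_checked(p: Point) -> Point {
  debug_assert!(is_diagonal(&p));       // requires: is_diagonal(p)
  let new_p = Point { x: p.x + p.x, y: 2 * p.y };
  debug_assert!(is_diagonal(&new_p));   // ensures: is_diagonal(new_p)
  new_p
}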

Since the Rust typechecker is very powerful and already partially verified thanks to projects like Rustbelt, I will focus on verifying properties not captured by Rust’s type system: full functional correctness, with properties like arithmetic correctness, in-bounds array accesses, correctness of state machines, functional correctness with respect to an abstract specification, or security properties. My main PhD work will be to produce a tool, Rox*, most likely integrated in some form into the Rust compiler, that will translate Rust programs into F* code that can later be verified automatically or with additional F* annotations from the user.
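
To give a flavour of what an in-bounds array access property could look like, here is a sketch of my own using the same hypothetical attribute syntax as above (neither the attribute nor the function comes from an existing tool):

// Hypothetical specification: the caller has to prove that `i` is in bounds,
// so the indexing below can never panic at runtime.
#[requires="i < v.len()"]
fn get(v: &[i32], i: usize) -> i32 {
  v[i]
}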

Rust is very expressive and sophisticated, so I will restrict myself to a reasonable subset that is still expressive enough to write programs for which verification is relevant. This subset will probably be built upon Oxide, a new formalisation that focuses on single-threaded safe code using selected primitives from the standard library. So far, I have produced two case studies that showcase how my work could improve the assurance of correctness of real-world Rust programs.

Case studies: TextInput and bloom filter

The two case studies are taken from the Servo project, a new parallel browser engine written in Rust and sponsored by Mozilla. They are available here. In each case, I identified a piece of code that could benefit from some verification; then I manually translated the Rust code into functionally equivalent pure F* code, and proved some properties on the F* code that increase our confidence in the Rust code. I plan to automate this translation later, but these case studies help me evaluate how this approach could work before diving into compiler development.

The two case studies are presented in two separate posts here and there.

Conclusion and future work

These two case studies helped me identify how verification could benefit real-world pieces of Rust code. Some properties can be specified lightly inside Rust and then proven automatically, while others require more knowledge of a proof assistant like F*. Not all code needs a confidence level that requires verification, and most of the time systematic testing is enough. However, verification can sometimes be faster than systematic testing, and it offers a peace of mind that testing will never provide. The advantage of this method is that it is incremental: you can start from a codebase that just typechecks, and then go on proving one property or invariant at a time.

Different verification techniques (symbolic execution, deductive verification) each have their advantages and drawbacks, depending on what you want to prove, so it is important that a future Rust plugin for verification allows targeting different verification frameworks. I am planning to use Oxide as a semantically rich intermediate language obtained from Rust’s HIR, making it a platform for all the later translations.

The goal of my work will not be foundational, in the sense that I am not trying to verify any properties related to the type system of Rust. I do not intend to tackle the difficult problem of the interaction between unsafe and safe Rust code; Rustbelt is the right framework for that. Rather, the focus of my PhD will be to go beyond the typechecker and enable gradual verification of higher-level properties of Rust code, some examples of which can be seen in the case studies.

Getting formal methods into Rust has drawn the attention of a number of researchers and companies, who have gathered in the Rust formal methods group. I hope my future work will help the Rust community become more familiar with verification techniques and with how they can be used efficiently to increase the level of confidence in the correctness of various Rust programs.