Formal Verification With Lean

Introduction

Lean is a functional programming language whose type system is powerful enough to encode mathematical propositions and proofs. An important implication is that it is possible to use lean both to write programs and to verify mathematical proofs about their properties.

The goal of this post is to introduce lean, use it to implement insertion sort and prove that the output of our implementation is always sorted.

As a warm up, we’ll define the LessThan relation on natural numbers and prove basic properties such as transitivity.

Next we’ll define the IsLowerBound and IsSorted list properties which are needed to state and prove the correctness of insertion sort.

Finally, we’ll implement insertion sort and prove that its output is always sorted.

Note that a full correctness proof of would include a proof that the output of insertion sort is a permutation of the input. This permutation property was omitted in the interest of space and can be proved using similar techniques to the ones in this post.

Less Than

Lean bridges the gap between programs and proofs via dependent types.

To see how this works, we’ll use dependent types to define a LessThan relation on natural numbers and prove some of its properties.

Programs And Proofs

In this section we’ll introduce the LessThan type and give a high level overview of the relationship between programs and proofs.

The specifics of the lean syntax will be provided in the following sections.

The LessThan type represents the “less than” relation between pairs of natural numbers. It is defined as follows:

inductive LessThan (n : Nat) : Nat  Prop where
  | base : LessThan n (n + 1)
  | step : {m : Nat}  LessThan n m  LessThan n (m + 1)

At a high level, this means that LessThan defines a family of types. For each pair of natural numbers n m : Nat there is a corresponding type LessThan n m. Instances of types in the family can be built inductively via the base and step constructors.

For example, we can use the base constructor to build objects of type LessThan n (n + 1) for some n : Nat:

(.base : LessThan 3 4)

We can then use the step constructor to increment the m index:

(.step .base : LessThan 3 5)

By recursively calling .step we can construct objects in LessThan n m for all n m : Nat satisfying n < m. For example:

(.step (.step .base) : LessThan 3 6)

We call LessThan a dependent type because each pair of natural numbers n m : Nat defines a different type LessThan n m.

The Prop in the definition means that types in this family (such as LessThan 3 4) belong to special class of types called propositions. In lean a proposition type is defined to be true if an instance of that type exists and false if not. An instance of a Prop type is called a proof of the proposition.

For example, LessThan 3 4 is a proposition and we can write this explicitly as

LessThan 3 4 : Prop

Furthermore, this proposition is true and the proof is the object we constructed above with the base constructor:

(.base : LessThan 3 4)

We can declare the relationship between the proposition and the proof more explicitly using the theorem keyword:

theorem three_lt_four : LessThan 3 4 := .base

As an example of a false proposition, consider

LessThan 0 0 : Prop

Indeed, it is easy to see that neither of the two constructors of LessThan can output an object of type LessThan 0 0. For example the LessThan.base constructor always outputs objects of type LessThan n n + 1 for some natural number n : Nat. We can verify this using lean via:

theorem not_zero_lt_zero : LessThan 0 0  False
| h => nomatch h

Here the nomatch keyword asserts that there are no constructors matching LessThan 0 0. We’ll discuss lean’s matching syntax in detail below.

Inductive Types

A common pattern in lean is to define objects inductively. Indeed, the natural number type Nat is an inductive type with two constructors:

inductive Nat where
  | zero : Nat
  | succ (n : Nat) : Nat

This type has two constructors:

  • zero : Constructs 0 : Nat
  • succ n: Given a natural number n : Nat, constructs n + 1 : Nat.

Numerals such as 2 are syntax sugar for the corresponding iterated application of Nat constructors. For example:

example : 0 = .zero := by rfl
example : 1 = .succ .zero := by rfl
example : 2 = .succ (.succ .zero) := by rfl

In these examples, we are stating a proposition that the two sides of the = sign are equivalent. We prove the propositions using the rfl tactic that applies when the two sides are identical by definition. We’ll go into more detail on tactics in the following sections. The important take away for now is that instances of Nat are defined as the outputs of iterated sequences of Nat constructors.

Functions

In this section we’ll introduce some lean syntax related to function definitions.

Pattern Matching

Here is an example of a function definition:

def add (n m : Nat) : Nat := n + m

This function takes two natural numbers n m : Nat and outputs the natural numbern + m.

Functions are called like this:

add 3 4

We can evaluate a function in the lean interpreter via:

#eval add 3 4
7

Types in lean are typically defined inductively. Indeed, in the previous section we saw the inductive definition of Nat. Therefore, it is convenient to define functions inductively as well. For this purpose lean provides a pattern matching syntax which we can use to rewrite our addition function as:

def add (n : Nat) : Nat  Nat
  | .zero => n
  | .succ k => .succ (add n k)

As before, n : Nat is a function parameter and available in the entire scope of the definition. However, rather than explicitly setting m, this implementation is defined via induction on m. Specifically, the implementation depends on which Nat constructor was used to build m and is defined recursively:

  • zero: If m := Nat.zero then add n m = n.
  • succ: If m := Nat.succ k then add n m = succ (add n k)

lean provides syntax sugar which lets us write the zero constructor as simply 0 and the succ k constructor as k + 1. We can use this to rewrite our function as:

def add (n : Nat) : Nat  Nat
  | 0 => n
  | k + 1 => (add n k) + 1

Currying

In lean, if S : Type and T : Type are types then the S → T : Type is the type representing functions from S to T.

The arrow is associative to the right. For example:

Nat  Nat  Nat = Nat  (Nat  Nat)

In other words, Nat → Nat → Nat represents functions f that take a natural number n : Nat and return a function

f(n) : Nat  Nat

This means that we can use f to map a pair of natural numbers n m : Nat to an output in Nat via:

f(n)(m) : Nat

The process of chaining together functions of a single variable to represent multi-variable functions is called currying.

The upshot is that we can think of types such as

S  T  U : Type

As representing functions that take two arguments with types S and T and output an object of type U.

For example, the type of the add function defined above is Nat → Nat → Nat:

add : Nat  Nat  Nat

Less Than Revisited

We’ll now explain our LessThan definition in more detail. Recall that LessThan was defined by:

inductive LessThan (n : Nat) : Nat  Prop where
  | base : LessThan n (n + 1)
  | step : {m : Nat}  LessThan n m  LessThan n (m + 1)

The argument n : Nat is a parameter for the type. This means that it is fixed for the entire body of the type definition. Indeed, note that all of the LessThan types in the definition body use the same parameter n.

The Nat following the colon specifies that the type depends on another natural number and that this dependence will be defined inductively. This kind of argument is called an index.

The next two rows define the constructors.

First consider the base constructor:

base : LessThan n (n + 1)

This means that for a given parameter n : Nat, the base constructor returns an object of type LessThan n (n + 1). In the language of proofs, we can think of this as an axiom saying that for all natural numbers n : Nat, n is less than n + 1.

Now consider the step constructor:

step : {m : Nat}  LessThan n m  LessThan n (m + 1)

This constructor takes two arguments as input - a natural number m : Nat and an object of type LessThan n m. The curly brackets around m imply that m is an implicit parameter. This means that lean can deduce the value of m from the type of the LessThan n m argument and so we do not have to pass m explicitly.

The constructor returns an object of type LessThan n (m + 1). In the language of proofs, this axiom states that if n is less than m, then n is less than m + 1.

The point of these definitions is that by combining the base and step constructors we can construct objects of type LessThan n m for all n m : Nat where n is less than m.

For example:

(.base : LessThan 1 2)
(.step .base : LessThan 1 3)
(.step (.step .base) : LessThan 1 4)

Note that the m and n arguments in the constructors on the left are automatically deduced from the types on the right which is why we do not have to write them explicitly.

For example, consider the second row

(.step .base : LessThan 1 3)

Since the output type is LessThan 1 3, this fixes the parameter n to be n=1. This in turn implies that .base must output the type LessThan 1 2. Since the non-implicit argument to step is of type LessThan n m, this means that the step constructor is run with n=1 and m=2 and, as expected, outputs an object of type LessThan 1 3.

Zero Is Less Than Successors

We’ve already proved some properties of LessThan that hold for specific numbers. For example:

theorem one_lt_two : LessThan 1 2 := .base

This theorem states the property LessThan 1 2 and proves it by constructing an instance of LessThan 1 2 via the base constructor.

Now let’s prove a more general proposition:

For all natural numbers n : Nat, 0 < n + 1.

What should be the type corresponding to this proposition? One way of phrasing the proposition is that for every natural number n : Nat, we must be able to produce a proof that 0 is less than n + 1. In other words, for every natural number n : Nat we must be able to construct an instance of LessThan 0 (n + 1). A mapping from natural numbers n to instances of LessThan 0 (n + 1) is simply a function with type:

(n : Nat)  LessThan 0 (n + 1)

Proving the proposition is equivalent to constructing a function with this type. Here is how we can define such a function this using pattern matching on n : Nat similarly to the add example in section Pattern Matching:

theorem LessThan.zero_lt_succ : {n : Nat}  LessThan 0 (n + 1)
  -- n := .zero
  -- need to construct: LessThan 0 (0 + 1)
  | 0 => .base
  -- n := .succ n_prev
  -- in this case n = n_prev + 1.
  -- need to construct: LessThan 0 (n_prev + 1 + 1)
  -- proof: recursively call zero_lt_succ n_prev to obtain:
  --        LessThan 0 (n_prev + 1).
  --        then perform one step to obtain: LessThan 0 ((n_prev + 1) + 1).
  | n_prev + 1 => .step .zero_lt_succ

The name LessThan.zero_lt_succ simply puts zero_lt_succ in the LessThan namespace.

As we saw in section Pattern Matching, the curly brackets around n : Nat mean that n is an implicit argument and can be deduced by the return type.

Here is an application of this general theorem to a specific case:

theorem zero_lt_ten : LessThan 0 10 := .zero_lt_succ

Rather than defining LessThan.zero_lt_succ recursively, lean also provides a tactics syntax that we can use to write the same proof in a style that resembles a traditional math proof:

theorem LessThan.zero_lt_succ (n : Nat) : LessThan 0 (n + 1) := by
  induction n with
  -- base case: n = 0. need to construct: LessThan 0 (0 + 1)
  | zero => exact .base
  -- inductive step:
  -- n = n_prev + 1.
  -- inductive hypothesis ih : LessThan 0 (n_prev + 1).
  -- need to construct an instance of LessThan 0 ((n_prev + 1) + 1)
  | succ n_prev ih => exact .step ih

The first difference is that now the n : Nat argument is before the colon. This turns n : Nat into a parameter that is available in the entire definition body. Then we use the by keyword to enter into tactic mode.

In tactic mode we can use the induction tactic to split the goal of constructing an element of LessThan 0 (n + 1) into two cases - the base case where n=0 and an inductive step where n=n_prev+1 and we are given an inductive hypothesis

ih : LessThan 0 (n_prev + 1)

In both cases we construct an instance of the desired type LessThan 0 (n + 1) using one of the LessThan constructors and close the goal via the exact tactic.

Addition And Transitivity

We’ll conclude our LessThan example by proving two more properties.

First we’ll prove that adding a positive number to n results in a larger number:

For all natural numbers n m : Nat, if 0 < m then n < n + m.

Similarly to what we saw in the previous section, the corresponding type is:

LessThan 0 m   LessThan n (n + m)

We can construct an function of this type by inductive pattern matching on the LessThan 0 m argument:

-- For all n,m: if 0 < m  then n + m < m
theorem LessThan.add_right {n m : Nat} : LessThan 0 m   LessThan n (n + m)
  -- base case: LessThan 0 m := .base
  -- this means that: m = 0 + 1
  -- need to construct: LessThan n (n + 1)
  | base => .base
  -- step case: LessThan 0 m := .step hk
  -- where:
  -- m = k + 1 and
  -- hk : LessThan 0 k
  -- need to construct: LessThan n (n + k + 1)
  -- proof: Recursively call .add_right hk to get an instance of LessThan n (n + k)
  --        then apply LessThan.step.
  | step hk => .step (.add_right hk)

The function LessThan.add_right has two implicit parameters n m : Nat and an object of type LessThan 0 m. It outputs an object of type LessThan n (n + m).

Similarly to what we saw in the add example in Pattern Matching, since n m : Nat are implicit lean deduces them from the type signature and so we do not have to pass them explicitly. For example:

theorem zero_lt_three : LessThan 0 3 := .zero_lt_succ 2
theorem two_lt_five : LessThan 2 5 := .add_right zero_lt_three

Finally, let’s prove that our LessThan property is transitive. This time we’ll use the induction tactic on the hypothesis hmk : LessThan m k :

-- For all n,m,k: If n < m and m < k then n < k.
theorem LessThan.trans {n m k : Nat}
  (hnm : LessThan n m) (hmk : LessThan m k) : LessThan n k := by
  induction hmk with
  -- base case: hmk : LessThan m (m + 1):= .base
  -- this means that: k = m + 1
  -- need to construct an object in LessThan n (m + 1)
  | base => exact .step hnm
  -- step case: hmk = .step hmk_prev
  -- where
  -- k = k_prev + 1
  -- hmk_prev : LessThan m k_prev
  -- induction hypothesis ih : LessThan n k_prev
  -- need to construct: LessThan n (k_prev + 1)
  | step _ ih => exact .step ih

Here is an example of applying the theorems above to deduce some inequalities.:

theorem one_lt_two : LessThan 1 2 := .base
theorem zero_lt_three : LessThan 0 3 := .zero_lt_succ 2
theorem two_lt_five : LessThan 2 5 := .add_right zero_lt_three
theorem one_lt_five : LessThan 1 5 := .trans one_lt_two two_lt_five

Insertion Sort

The goal of this section is to implement insertion sort in lean and to prove that the output of insertion sort is always sorted.

Lists

For a given type α : Type, List α : Type is the type of lists with elements in α. Similarly to Nat, it is an inductive type that has two constructors:

  • The empty constructor constructs the empty list in List α.
  • The cons constructor takes an element x : α and a list xs : List α and concatenates x with xs to obtain a new list denoted x :: xs. It is common to refer to x as the head and to xs as the tail.

lean also provides syntax sugar for constructing lists using the common bracket notation. For example, here are some equivalent list constructions:

example : [1] = .cons 1 .nil := by rfl
example : [1, 2] = .cons 1 (.cons 2 .nil) := by rfl

Lower Bound

We’ll now define the notion of an element being a lower bound on a list using an inductive dependant type similar to LessThan.

From now on we’ll require the type α to implement a less-than-or-equal relation that is defined for each pair of elements:

variable {α : Type} [LE α] [DecidableRel (·  · : α  α  Prop)]

Note that we are now forgetting our custom LessThan type and using the built in LE. Similarly to our custom LessThan type, our assumption on α implies that for all x y : α there is a corresponding proposition type

x ≤ y : Prop

Instances of x ≤ y can be built by iteratively applying the constructors which are defined similarly to our LessThan constructors.

We can now define the IsLowerBound property:

inductive IsLowerBound (x : α) : List α  Prop where
  | nil : IsLowerBound x []
  | step : {y : α}  {ys : List α}  (x  y)  IsLowerBound x ys 
            IsLowerBound x (y :: ys)

This type is in the Prop universe and has a parameter x : α and an index of type List α.

For a given parameter x : α, the constructors are defined via:

  • nil: Constructs an object in IsLowerBound x []. In other words, x is trivially a lower bound on the empty set.
  • step: This constructor has an implicit parameters y : α and ys : List α. It takes an instance of type x ≤ y and an instance of IsLowerBound x ys. It outputs an instance of IsLowerBound x (y :: ys). This corresponds to an axiom that if x ≤ y and x is a lower bound on ys then x is a lower bound on y :: ys.

For example:

theorem one_lb_empty : IsLowerBound 1 [] := .nil
theorem one_lb_two : IsLowerBound 1 [2] := .step (by omega) one_lb_empty
theorem one_lb_three_two : IsLowerBound 1 [3, 2] := .step (by omega) one_lb_two

In the one_lb_two example, lean infers from the type IsLowerBound 1 [2] that the type of the first explicit argument to step must be the proposition 1 ≤ 2 : Prop. To construct this object (i.e proof that 1 is less than or equal to 2) we enter tactic mode with by and then use the omega tactic which automatically constructs proofs of simple statements about integers. Similarly, in the one_lb_three_two example the omega tactic is used to construct an instance of 1 ≤ 3 : Prop.

Here are some useful properties of IsLowerBound:

-- If x is a lower bound of y :: ys then it is a lower bound of ys.
theorem IsLowerBound.tail{x y : α} {ys : List α} : IsLowerBound x (y :: ys) 
  IsLowerBound x ys
  -- only the step constructor can match IsLowerBound x (y :: ys). In this case
  -- IsLowerBound x (y :: ys) := .step hxy hx_ys
  -- where
  -- hxy : x ≤ y
  -- hx_ys : IsLowerBound x ys
  | .step _ hx_ys => hx_ys

-- If x is a lower bound of y :: ys then x ≤ y
theorem IsLowerBound.head_le {x y : α} {ys : List α} :
  IsLowerBound x (y :: ys)  x  y
  | step hxy _ =>  hxy

For example, we can use IsLowerBound.tail to provide an alternative proof for one_lb_two:

theorem one_lb_two : IsLowerBound 1 [2] := .tail one_lb_three_two

Sorted

We’ll now define the property of a list being sorted.

inductive IsSorted : List α  Prop where
  | nil : IsSorted []
  | single : {x : α}  IsSorted [x]
  | step : {x : α}  {y : α}  {ys : List α}  (x  y)  IsSorted (y :: ys) 
      IsSorted (x :: y :: ys)

Similarly to LessThan and IsLowerBound, this is an inductive dependent Prop type. It has three constructors which inductively can be used to construct an instance of IsSorted l for a list l : List α if and only if l is sorted.

For example:

theorem empty_sorted : IsSorted ([] : List Int) := .nil
theorem one_sorted : IsSorted [1] := .single
theorem one_two_sorted : IsSorted [1, 2] := .step (by omega) .single
theorem one_two_three_sorted : IsSorted [1, 2, 3] := .step (by omega) (.step (by omega) .single)

As we saw in section Inductive Types, lean deduces the values of the implicit constructor arguments from the output type. For example, in one_sorted, it calls .single with x := 1 because we are trying to construct and object of type IsSorted [1].

And, similarly to section Lower Bound, the by omega calls use the omega tactic to construct instances of the necessary x ≤ y : Prop types. For example, in one_two_sorted it builds an instance of the proposition 1 ≤ 2 : Prop.

Here are some useful properties of IsSorted.

-- If x :: xs is sorted then xs is sorted
theorem IsSorted.tail {x : α} {xs : List α} : IsSorted (x :: xs)   IsSorted xs
  | .single => .nil
  | .step _ hs_y_ys => hs_y_ys

-- If x is a lower bound of xs and xs is sorted then x :: xs is sorted
theorem IsSorted.lb {x : α} {xs : List α} :
  IsLowerBound x xs  IsSorted xs  IsSorted (x :: xs)
  | .nil, _ => .single
  | .step hxy _, hs_y_ys => .step hxy hs_y_ys

For example:

theorem zero_lb_one_two : IsLowerBound 0 [1, 2] := .step (by omega) (.step (by omega) .nil)
theorem zero_one_two_sorted : IsSorted [0, 1, 2] := .lb zero_lb_one_two one_two_sorted

Total Order

In order to prove that insertionSort outputs sorted lists, we will need to strengthen our assumptions on the less-than-or-equal relation . Specifically, we’ll need to assume that it is transitive and total. Relations that satisfy both properties are call total orders. We formally encode the definition of a total order in the following type class

class TotalOrder (α : Type) [LE α] where
  total : (x y : α)   x  y  y  x
  trans : (x y z : α)  x  y  y  z  x  z

In lean, a type class is essentially an interface that can be implemented for a given type. The TotalOrder interface extends the LE interface with two functions:

  • total: This function takes as input two elements x y : α and outputs an element of type x ≤ y ∨ y ≤ x. In the language of proofs, implementing this function is equivalent to proving that for all elements x y : α, x ≤ y or y ≤ x.

  • trans: This function takes as input three elements x y z: α, an object of type x ≤ y and an object of type y ≤ z. It outputs an object of type x ≤ z. In the world of proofs, implementing this method is equivalent to proving that for all x y z : α, if x ≤ y and y ≤ z then x ≤ z.

Note that the OR relation is an inductive dependant type. For each pair of propositions P Q : Prop it defines a new proposition P ∨ Q : Prop. The type P ∨ Q has two constructors, one with an argument of type P and one with an argument of type Q. Therefore, to construct an instance of P ∨ Q it is necessary to construct an instance of P OR and instance of Q.

From now on we will assume that relation defined on our element type α is a total order:

variable [TotalOrder α]

Here are two useful properties of total orders:

-- if x ≤ y is false then y ≤ x
theorem le_of_not_le (x y : α) (h : ¬ x  y) : y  x :=
  -- Match the output of TotalOrder.total x y. By definition there are two
  -- possible constructors, one for each side of the OR.
  match TotalOrder.total x y with
  -- The left hand side ctor accepts an object hxy : x ≤ y. This contradicts the assumption
  -- h : ¬ x ≤ y and so we can apply the absurd tactic.
  | Or.inl hxy => absurd hxy h
  -- The right hand side ctor accepts an object hyz : y ≤ x. We can return it
  -- directly.
  | Or.inr hyx => hyx

-- ≤ is reflexive
theorem le_refl (x : α) : (x  x) := by
  -- Match the output of TotalOrder.total x x. In this case both the left hand and
  -- right hand constructors take an element hxx : x ≤ x.
  -- Since we can directly return hxx in both cases, we use the map operator <;>
  -- to pipe both cases to the assumption tactic.
  cases TotalOrder.total x x <;> assumption

For example, the natural numbers are a total order and we can implement the TotalOrder type class for Nat via:

instance : TotalOrder Nat where
  total := Nat.le_total
  trans := @Nat.le_trans

We can now apply le_refl:

theorem one_le_one : 1  1 := le_refl 1

We can use the transitivity of to prove that IsLowerBound is transitive as well.

-- If x ≤ y and y is a lower bound for the list l then x is a lower bound for l.
theorem IsLowerBound.trans {x y : α} {l : List α}
  (hxy : x  y) (hyl : IsLowerBound y l) : IsLowerBound x l := by
  -- Use induction on hyl : IsLowerBound y l
  induction hyl with
  -- base case: hyl : IsLowerBound y l := .base
  -- This means that l = []
  | nil => exact .nil
  -- step case: hyl : IsLowerBound y l := .step hyz hy_zs
  -- where:
  -- l = z :: zs and we have objects
  -- hyz : y ≤ z and
  -- hy_zs : IsLowerBound y zs
  -- induction hypothesis: 
  -- hx_zs : IsLowerBound x zs
  | step hyz hy_zs hx_zs =>
    -- By transitivity we can construct hxz : x ≤ z
    have hxz := TotalOrder.trans _ _ _ hxy hyz
    -- Apply IsLowerBound.step with hxz and hx_zs to construct IsLowerBound x (z :: zs)
    exact .step hxz hx_zs

For example:

theorem zero_lb_three_two : IsLowerBound 0 [3, 2] := .trans (by omega) one_lb_three_two

Finally, we can use the transitivity of IsLowerBound to prove the following relationship between the IsLowerBound and IsSorted properties:

-- If x :: xs is sorted then x is a lower bound of xs.
theorem IsLowerBound.sorted {x : α} {xs : List α} (hs : IsSorted (x :: xs)) :
  IsLowerBound x xs := by
  -- Use induction on the tail xs for an arbitrary head x.
  revert x
  induction xs with
  -- xs = []
  | nil => intro x hs; exact .nil
  -- xs = y :: ys and we have the induction hypothesis
  -- ih : {z : α} → IsSorted (z :: ys) → IsLowerBound z ys
  -- I.e the inductive hypothesis states that for all z, if z :: ys is sorted then
  -- z is a lower bound on ys.
  | cons y ys ih =>
    -- the intro tactic introduces objects corresponding to our assumptions:
    -- x : α
    -- hs : IsSorted (x :: y :: ys)
    intro x hs
    -- Pattern match the constructor of hs : IsSorted (x :: y :: ys).
    cases hs with
    -- Only the step constructor matches. In this case
    -- hs : IsSorted (x :: y :: ys) := .step hxy h_sorted_y_ys
    -- where:
    -- hxy : x ≤ y
    -- h_sorted_y_ys : IsSorted (y :: ys)
    | step hxy h_sorted_y_ys =>
      -- Apply the inductive hypothesis ih to y :: ys to prove that
      -- y is a lower bound on ys.
      have hy_lb_ys : IsLowerBound y ys := ih h_sorted_y_ys
      -- Since y ≤ y, we can apply IsLowerBound.step to conclude that
      -- y is also a lower bound on y :: ys
      have hy_lb_y_ys : IsLowerBound y (y :: ys) := .step (le_refl y) hy_lb_ys
      -- The result now follows from x ≤ y and transitivity of IsLowerBound
      exact .trans hxy hy_lb_y_ys

For example:

theorem one_lb_two_three : IsLowerBound 1 [2, 3]:= .sorted one_two_three_sorted

Algorithm

Recall that insertion sort sorts a collection of elements by inserting them one by one into a list. The insertion is done via an insertSorted procedure which inserts an element into a sorted list at a position chosen such that that the new list is still sorted.

We’ll start by implementing insertSorted in lean.

def insertSorted (x : α) : List α  List α
  | [] => [x]
  | y :: ys => if x  y then x :: y :: ys else y :: insertSorted x ys

This function has a parameter x : α of type α and an additional argument of type List α. The output is also of type List α.

The function is defined via pattern matching on the List α argument similarly to the add example in Pattern Matching. As we saw in section Lists, the List α type has two constructors that we must match:

  • empty: In this case the input list is []. Inserting x into [] is just [x].
  • cons: In this case the input list is of the form y :: ys where y : α and ys : List α. To insert x into y :: ys we compare x with the head y. If x ≤ y then x should be inserted in the front of the list and so we return x :: y :: ys. Otherwise, x should come after y and should be inserted somewhere in the tail ys. Therefore, we return y :: insertSorted x ys.

For example:

#eval insertSorted 2 [1, 3]
[1, 2, 3]

It is now easy to implement insertion sort:

def insertionSort : List α  List α
  | [] => []
  | x :: xs => insertSorted x (insertionSort xs)

insertionSort is also defined by matching the constructor of the input list:

  • empty: The result of sorting the empty list is [].
  • cons: In this case the input list is of the form x :: xs. First we sort the tail by evaluating insertionSort xs. Then we insert the head into the sorted tail via insertSorted x (insertionSort xs).

For example:

#eval insertionSort [3, 4, 2, 1]
[1, 2, 3, 4]

Sorted Output

In this section we’ll prove that for every list l : List α, the list insertionSort l : List α is sorted.

Since by definition insertionSort iteratively inserts elements into a sorted list, the key is to prove that for all x : α and l : List α, if l : List α is sorted then insertSorted x l is also sorted.

We’ll start with the weaker claim that insertSorted preserves lower bounds:

-- If x is a lower bound of ys and x ≤ y then x is a lower bound of insertSorted y ys.
theorem insertSorted_lb {x y : α} {ys : List α} (h_lb : IsLowerBound x ys) (hxy : x  y) :
  IsLowerBound x (insertSorted y ys) := by
  -- Proof by induction on the definition of insertSorted.
  fun_induction insertSorted with
  -- case1: 
  -- ys = []
  -- By definition: insertSorted y ys = [y]
  -- The claim follows from hxy and IsLowerBound.step
  | case1 => exact .step hxy .nil
  -- case2:
  -- ys = z :: zs
  -- hyz : y ≤ z
  -- By definition: insertSorted y ys = y :: z :: zs
  -- The claim follows from hxy and IsLowerBound.step
  | case2 z zs hyz => exact .step hxy h_lb
  -- case3:
  -- ys = z :: zs
  -- h_not_yz: not y ≤ z
  -- By assumption we have: h_lb : IsLowerBound x (z :: zs)
  -- By definition: insertSorted y ys = z :: insertSorted y zs
  -- Inductive hypothesis:
  -- ih : IsLowerBound x zs → IsLowerBound x (insertSorted y zs)
  -- We need to show: IsLowerBound x (z :: insertSorted (y :: zs))
  -- Proof: 1. Since x is a lower bound on z :: zs it follows from
  --           IsLowerBound.head that x ≤ z and from IsLowerBound.tail that
  --           x is a lower bound on zs.
  --        2. x is a lower bound on insertSorted y zs:
  --           This follows from IsLowerBound x zs and the induction hypothesis.
  --        3. The claim follows by applying IsLowerBound.step to x ≤ z and
  --           IsLowerBound x zs.
  | case3 z zs h_not_yz ih =>
    have hxz : x  z := IsLowerBound.head_le h_lb
    have h_x_zs : IsLowerBound x zs := .tail h_lb
    have hx_ins_y_zs : IsLowerBound x (insertSorted y zs) := ih h_x_zs
    exact .step hxz hx_ins_y_zs

We’ll now use insertSorted_lb to prove that insertSorted preserves the IsSorted property:

-- If xs is sorted then insertSorted x xs is sorted.
theorem insertSorted_sorted {x : α} {xs : List α} (hs : IsSorted xs) :
  IsSorted (insertSorted x xs) := by
  -- Proof by induction on the definition of insertSorted.
  fun_induction insertSorted with
  -- case1:
  -- xs = []
  -- By definition: insertSorted x xs = [x]
  -- In this case the claim is trivial.
  | case1 => exact .single
  -- case2:
  -- xs = y :: ys
  -- hxy : x ≤ y.
  -- By definition: insertSorted x xs = x :: y :: ys
  -- In this case x ≤ y and so we can apply IsSorted.step to deduce that 
  -- x :: y :: ys is sorted.
  | case2 y ys h_xy => exact .step h_xy hs
  -- case3:
  -- xs = y :: ys
  -- h_not_le: not x ≤ y
  -- Induction hypothesis ih: For all z, if ys is sorted then insertSorted z ys is sorted.
  -- By definition: insertSorted x xs = y :: insertSorted x ys
  -- Therefore, we need to show: y :: (insertSorted x :: ys) is sorted
  -- Proof: 1. y is a lower bound on insertSorted x :: ys:
  --           Since y :: ys is sorted, y is a lower bound on ys. Since y ≤ x
  --           insertSorted_lb implies that y is a lower bound on insertSorted x ys.
  --        2. insertSorted x ys is sorted:
  --           Since y :: ys is sorted, ys is sorted. Therefore, by the induction
  --           hypothesis, insertSorted x ys is sorted.
  --        3. Since y is a lower bound on insertSorted x :: ys and
  --           insertSorted x :: ys is sorted and x ≤ y it follows from
  --           IsSorted.lb that y :: insertSorted x :: ys is sorted.
  | case3 y ys h_not_le ih =>
    have h_le : y  x := le_of_not_le x y h_not_le
    have hy_ys : IsLowerBound y ys := IsLowerBound.sorted hs
    have hy_ins_x_ys : IsLowerBound y (insertSorted x ys) := insertSorted_lb hy_ys h_le
    have ys_sorted := IsSorted.tail hs
    have ins_x_ys_sorted : IsSorted (insertSorted x ys) := ih ys_sorted
    exact IsSorted.lb hy_ins_x_ys ins_x_ys_sorted

Finally, we can prove that the output of insertionSort is sorted by induction on the input list where the induction step uses insertSorted_sorted:

theorem insertionSort_sorted (l : List α) : IsSorted (insertionSort l) := by
  induction l with
  -- l := []
  | nil => exact .nil
  -- l := x :: xs
  -- Induction hypothesis ih : IsSorted insertionSort xs
  -- By definition: insertionSort l := insertSorted x (insertionSort xs)
  -- The claim follows immediately by applying insertSorted_sorted to ih.
  | cons x xs ih => exact insertSorted_sorted ih

Comments

The comments are powered by the utterences Github app. If you do not want the app to post on your behalf, you can comment directly on this Github issue.