\section{The proof}
In this section, we will present the correctness proof of the algorithm in more
detail.
Since our computation is a least fixed point computation, we must prove
termination in order to prove correctness: if the algorithm is halted after
a number of steps without reaching a fixed point, the solution is not
guaranteed to be correct. More specifically, jumps might be encoded whose
displacement is too great for the instruction chosen.
Proof of termination rests on the fact that jumps can only increase, which
means that we must reach a fixed point after at most $2n$ iterations, with
$2n$ the number of jumps in the program. This worst case is reached if at every
iteration, we change the encoding of exactly one jump; since a jump can change
from {\tt short} to {\tt medium} and from {\tt medium} to {\tt long}, there
can be at most $2n$ changes.
This proof has been executed in the ``Russell'' style from~\cite{Sozeau2006}.
We have proven some invariants of the {\sc f} function from the previous
section; these invariants are then used to prove properties that hold for every
iteration of the fixed point computation; and finally, we can prove some
properties of the fixed point.
The invariants for the fold function. Note that during the fixed point
computation, the $sigma$ functions are implemented as tries, so in order to
compute $sigma(i)$, we lookup the value for $i$ in the corresponding trie.
\begin{lstlisting}
(out_of_program_none prefix policy) $\wedge$
(jump_not_in_policy prefix policy) $\wedge$
(policy_increase prefix old_sigma policy) $\wedge$
(policy_compact_unsafe prefix labels policy) $\wedge$
(policy_safe prefix labels added old_sigma policy) $\wedge$
(\fst (bvt_lookup $\ldots$
(bitvector_of_nat ? 0) (\snd policy) $\langle$0,short_jump$\rangle$) = 0) $\wedge$
(\fst policy = \fst (bvt_lookup $\ldots$
(bitvector_of_nat ? (|prefix|)) (\snd policy) $\langle$0,short_jump$\rangle$)) $\wedge$
(added = 0 → policy_pc_equal prefix old_sigma policy) $\wedge$
(policy_jump_equal prefix old_sigma policy → added = 0)
\end{lstlisting}
These invariants have the following meanings:
\begin{itemize}
\item {\tt out\_of\_program\_none} shows that if we try to lookup a value
beyond the program, we will get an empty result;
\item {\tt jump\_not\_in\_policy} shows that an instruction that is not a jump
will have a {\tt short\_jump} as its associated jump length;
\item {\tt policy\_increase} shows that between iterations, jumps either increase (i.e. from {\tt short\_jump} to {\tt medium\_jump} or from {\tt medium\_jump} to {\tt long\_jump}) or remain equal;
\item {\tt policy\_compact\_unsafe} shows that the policy is compact (instrucftions directly follow each other and do not overlap);
\item {\tt policy\_safe} shows that jumps selected are appropriate for the
distance between their position and their target, and the jumps selected are
the smallest possible;
\item Then there are two properties that fix the values of $\sigma(0)$ and
$\sigma(n)$ (with $n$ the size of the program);
\item And finally two predicates that link the value of $added$ to reaching
of a fixed point.
\end{itemize}
\begin{lstlisting}
($\Sigma$x:bool × (option ppc_pc_map).
let $\langle$no_ch,y$\rangle$ := x in
match y with
[ None ⇒ nec_plus_ultra program old_policy
| Some p ⇒ (out_of_program_none program p) $\wedge$
(jump_not_in_policy program p) $\wedge$
(policy_increase program old_policy p) $\wedge$
(no_ch = true → policy_compact program labels p) $\wedge$
(\fst (bvt_lookup $\ldots$
(bitvector_of_nat ? 0) (\snd p) $\langle$0,short_jump$\rangle$) = 0) $\wedge$
(\fst p = \fst (bvt_lookup $\ldots$
(bitvector_of_nat ? (|program|)) (\snd p) $\langle$0,short_jump$\rangle$)) $\wedge$
(no_ch = true $\rightarrow$ policy_pc_equal program old_policy p) $\wedge$
(policy_jump_equal program old_policy p $\rightarrow$ no_ch = true) $\wedge$
(\fst p < 2^16)
])
\end{lstlisting}
The main correctness statement, then, is as follows:
\begin{lstlisting}
definition jump_expansion':
$\forall$program:preamble $\times$ ($\Sigma$l:list labelled_instruction.S (|l|) < 2^16).
option ($\Sigma$sigma:Word → Word $\times$ bool.
$\forall$ppc: Word.$\forall$ppc_ok.
let pc := \fst (sigma ppc) in
let labels := \fst (create_label_cost_map (\snd program)) in
let lookup_labels :=
$\lambda$x.bitvector_of_nat ? (lookup_def ?? labels x 0) in
let instruction :=
\fst (fetch_pseudo_instruction (\snd program) ppc ppc_ok) in
let next_pc := \fst (sigma (add ? ppc (bitvector_of_nat ? 1))) in
And (nat_of_bitvector $\ldots$ ppc $\leq$ |\snd program| →
next_pc = add ? pc (bitvector_of_nat $\ldots$
(instruction_size lookup_labels ($\lambda$x.\fst (sigma x))
($\lambda$x.\snd (sigma x)) ppc instruction))
)
(Or (nat_of_bitvector $\ldots$ ppc < |\snd program| →
nat_of_bitvector $\ldots$ pc < nat_of_bitvector $\ldots$ next_pc)
(nat_of_bitvector $\ldots$ ppc = |\snd program| → next_pc = (zero $\ldots$))))
\end{lstlisting}
Informally, this means that when fetching a pseudo-instruction at $ppc$, the
translation by $\sigma$ of $ppc+1$ is the same as $\sigma(ppc)$ plus the size
of the instruction at $ppc$; i.e. an instruction is placed immediately
after the previous one, and there are no overlaps.
The other condition enforced is the fact that instructions are stocked in
order: the memory address of the instruction at $ppc$ should be smaller than
the memory address of the instruction at $ppc+1$. There is one exeception to
this rule: the instruction at the very end of the program, whose successor
address can be zero (this is the case where the program size is exactly equal
to the amount of memory).