# Changeset 3520

Ignore:
Timestamp:
Nov 20, 2014, 12:42:03 PM (5 years ago)
Message:

 r3519 \newcommand{\todo}[1]{\ensuremath{\texttt{[#1]}}} \makeatletter \def\blfootnote{\xdef\@thefnmark{}\@footnotetext} \makeatother \bibliographystyle{alpha} \author{Jaap Boender \and Dominic P. Mulligan \and Claudio Sacerdoti Coen} \title{On the proof of correctness of a verified optimising assembler \thanks{Research supported by the CerCo project, within the Future and Emerging Technologies (FET) programme of the Seventh Framework Programme for Research of the European Commission, under FET-Open grant number 243881}} \date{\today} \title{On the proof of correctness of a verified optimising assembler} \begin{document} \begin{abstract} A typical microprocessor's instruction set provides several forms of conditional and unconditional jumps, procedure calls, and data movement instructions. Unconditional jump and call instructions vary depending on the \emph{span} of the branch---the distance between the location of the instruction and the branch target---with short jump and call instructions taking less space in code memory to encode than their long jump and call counterparts. Assembly languages are a thin veneer over machine code, with a few light abstractions provided for the benefit of the programmer. Two such abstractions provided in assembly languages are uniform jump' and call' pseudoinstructions that branch to a labelled point in program source rather than a concrete machine address. These pseudoinstructions remove the burden of selecting between long and short jump instructions, and computing concrete machine addresses, from the programmer. Pseudoinstructions must be assembled away' by the assembler, and a jump pseudoinstruction can always be replaced by a long jump machine code instruction with no semantic change. Yet, such a na\"ive replacement is inefficient and wasteful in terms of code memory usage---a prime concern when targetting embedded devices---as some long jumps could have been replaced with short jumps, reducing the size of the program in code memory. Therefore any assembler should replace as many long jumps with short jumps as possible, optimising for program size. We present a formalisation and proof of correctness of an optimising assembler targetting a space-constrained embedded microcontroller: the 8-bit Intel MCS-51 series. All proofs are checked by the Matita theorem prover, and the work forms a major component of the Certified Complexity (CerCo') project, aiming to produce a verified concrete complexity-preserving compiler for a subset of the C programming language. Our optimisations complicate the proof of correctness of the assembler proper and therefore we employ the use of policies', separating out decisions of how a pseudoinstruction should be expanded from the expansion process itself, simplifying our proofs. Our proof strategy also tracks good' addresses, and only programs that use good addresses have their semantics preserved under assembly. This tracking offers increased flexibility over the traditional approach---where addresses are kept abstract and immutable---as we may experiment with some manipulations of machine addresses. \end{abstract} \tableofcontents \blfootnote{Research supported by the CerCo project, within the Future and Emerging Technologies (FET) programme of the Seventh Framework Programme for Research of the European Commission, under FET-Open grant number 243881} \newpage % DPM and CSC \begin{lstlisting} definition BitVector := $\lambda$n: nat. Vector bool n. \end{lstlisting} \begin{lstlisting} definition Nibble := BitVector 4. definition Byte7 := BitVector 7. definition Byte := BitVector 8. definition Word := BitVector 16. definition Word11 := BitVector 11. \end{lstlisting} \begin{lstlisting} inductive BitVectorTrie (A: Type[0]): nat $\rightarrow$ Type[0] := Leaf: A -> BitVectorTrie A O | Node: $\forall$n: nat. BitVectorTrie A n $\rightarrow$ BitVectorTrie A n $\rightarrow$ BitVectorTrie A (S n) | Stub: $\forall$n: nat. BitVectorTrie A n. \end{lstlisting} % Trie data structure and use of dependent types % Operational semantics of MCS-51