[917] | 1 | \documentclass{llncs} |
---|
| 2 | |
---|
[960] | 3 | \usepackage{amsfonts} |
---|
[952] | 4 | \usepackage{amsmath} |
---|
[960] | 5 | \usepackage{amssymb} |
---|
[917] | 6 | \usepackage[english]{babel} |
---|
[960] | 7 | \usepackage{color} |
---|
| 8 | \usepackage{fancybox} |
---|
| 9 | \usepackage{graphicx} |
---|
[917] | 10 | \usepackage[colorlinks]{hyperref} |
---|
[969] | 11 | \usepackage{hyphenat} |
---|
[957] | 12 | \usepackage[utf8x]{inputenc} |
---|
| 13 | \usepackage{listings} |
---|
[960] | 14 | \usepackage{mdwlist} |
---|
| 15 | \usepackage{microtype} |
---|
| 16 | \usepackage{stmaryrd} |
---|
| 17 | \usepackage{url} |
---|
[917] | 18 | |
---|
[960] | 19 | \renewcommand{\verb}{\lstinline} |
---|
| 20 | \def\lstlanguagefiles{lst-grafite.tex} |
---|
| 21 | \lstset{language=Grafite} |
---|
| 22 | |
---|
[957] | 23 | \newlength{\mylength} |
---|
| 24 | \newenvironment{frametxt}% |
---|
| 25 | {\setlength{\fboxsep}{5pt} |
---|
| 26 | \setlength{\mylength}{\linewidth}% |
---|
| 27 | \addtolength{\mylength}{-2\fboxsep}% |
---|
| 28 | \addtolength{\mylength}{-2\fboxrule}% |
---|
| 29 | \Sbox |
---|
| 30 | \minipage{\mylength}% |
---|
| 31 | \setlength{\abovedisplayskip}{0pt}% |
---|
| 32 | \setlength{\belowdisplayskip}{0pt}% |
---|
| 33 | }% |
---|
| 34 | {\endminipage\endSbox |
---|
| 35 | \[\fbox{\TheSbox}\]} |
---|
| 36 | |
---|
[952] | 37 | \title{On the correctness of an assembler for the Intel MCS-51 microprocessor} |
---|
[927] | 38 | \author{Jaap Boender \and Dominic P. Mulligan \and Claudio Sacerdoti Coen} |
---|
| 39 | \institute{Dipartimento di Scienze dell'Informazione, Universit\'a di Bologna} |
---|
[918] | 40 | |
---|
[927] | 41 | \bibliographystyle{splncs03} |
---|
| 42 | |
---|
[917] | 43 | \begin{document} |
---|
| 44 | |
---|
| 45 | \maketitle |
---|
| 46 | |
---|
| 47 | \begin{abstract} |
---|
[952] | 48 | We consider the formalisation of an assembler for the Intel MCS-51 8-bit microprocessor in the Matita proof assistant. |
---|
| 49 | This formalisation forms a major component of the EU-funded CerCo project, concering the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language. |
---|
| 50 | ... |
---|
[917] | 51 | \end{abstract} |
---|
| 52 | |
---|
[918] | 53 | % ---------------------------------------------------------------------------- % |
---|
| 54 | % SECTION % |
---|
| 55 | % ---------------------------------------------------------------------------- % |
---|
| 56 | \section{Introduction} |
---|
| 57 | \label{sect.introduction} |
---|
| 58 | |
---|
[953] | 59 | We consider the formalisation of an assembler for the Intel MCS-51 8-bit microprocessor in the Matita proof assistant~\cite{asperti:user:2007}. |
---|
| 60 | This formalisation forms a major component of the EU-funded CerCo project~\cite{cerco:2011}, concering the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language. |
---|
[947] | 61 | |
---|
[953] | 62 | The MCS-51 dates from the early 1980s and commonly called the 8051/8052. |
---|
| 63 | Despite the microprocessor's age, derivatives are still widely manufactured by a number of semiconductor foundries. |
---|
| 64 | As a result the processor is widely used, especially in embedded systems development, where well-tested, cheap, predictable microprocessors find their niche. |
---|
[927] | 65 | |
---|
[952] | 66 | The MCS-51 has a relative paucity of features compared to its more modern brethren. |
---|
| 67 | In particular, the MCS-51 does not possess a cache or any instruction pipelining that would make predicting the concrete cost of executing a single instruction an involved process. |
---|
| 68 | Instead, each semiconductor foundry that produces an MCS-51 derivative is able to provide accurate timing information in clock cycles for each instruction in their derivative's instruction set. |
---|
| 69 | It is important to stress that this timing information, unlike in more sophisticated processors, is not an estimate, it is a definition. |
---|
[953] | 70 | For the MCS-51, if a manufacturer states that a particular opcode takes three clock cycles to execute, then that opcode \emph{always} takes three clock cycles to execute. |
---|
[927] | 71 | |
---|
[952] | 72 | This predicability of timing information is especially attractive to the CerCo consortium. |
---|
[953] | 73 | We are in the process of constructing a certified, concrete complexity compiler for a realistic processor, and not for building and formalising the worst case execution time (WCET) tools that would be necessary to achieve the same result with, for example, a modern ARM or PowerPC microprocessor. |
---|
| 74 | |
---|
[952] | 75 | However, the MCS-51's paucity of features is a double edged sword. |
---|
[953] | 76 | In particular, the MCS-51 features relatively miniscule memory spaces (including read-only code memory, stack and internal/external random access memory) by modern standards. |
---|
| 77 | As a result our compiler, to have any sort of hope of successfully compiling realistic C programs, ought to produce `tight' machine code. |
---|
[952] | 78 | This is not simple. |
---|
[927] | 79 | |
---|
[953] | 80 | We here focus on a single issue in the MCS-51's instruction set: unconditional jumps. |
---|
| 81 | The MCS-51 features three conditional jump instructions: \texttt{LJMP} and \texttt{SJMP}---`long jump' and `short jump' respectively---and an 11-bit oddity of the MCS-51, \texttt{AJMP}, that the prototype CerCo compiler~\cite{cerco-report-code:2011} ignores for simplicity's sake.\footnote{Ignoring \texttt{AJMP} and its analogue \texttt{ACALL} is not idiosyncratic. The Small Device C Compiler (SDCC)~\cite{sdcc:2011}, the leading open source C compiler for the MCS-51, also seemingly does not produce \texttt{AJMP} and \texttt{ACALL} instructions. Their utility in a modern context remains unclear.} |
---|
[952] | 82 | Each of these three instructions expects arguments in different sizes and behaves in different ways. |
---|
[953] | 83 | \texttt{SJMP} may only perform a `local jump' whereas \texttt{LJMP} may jump to any address in the MCS-51's memory space. |
---|
| 84 | Consequently, the size of each opcode is different, and to squeeze as much code as possible into the MCS-51's limited code memory, the smallest possible opcode should be selected. |
---|
[927] | 85 | |
---|
[953] | 86 | The prototype CerCo C compiler does not attempt to select the smallest jump opcode in this manner, as this was thought to unneccessarily complicate the compilation chain. |
---|
| 87 | Instead, the compiler targets an assembly language, complete with pseudoinstructions including bespoke \texttt{Jmp} and \texttt{Call} instructions. |
---|
| 88 | Labels, conditional jumps to labels, a program preamble containing global data and a \texttt{MOV} instruction for moving this global data into the MCS-51's one 16-bit register also feature. |
---|
[952] | 89 | This latter feature will ease any later consideration of separate compilation in the CerCo compiler. |
---|
[953] | 90 | An assembler is used to expand pseudoinstructions into MCS-51 machine code. |
---|
[927] | 91 | |
---|
[953] | 92 | However, this assembly process is not trivial, for numerous reasons. |
---|
| 93 | For example, our conditional jumps to labels behave differently from their machine code counterparts. |
---|
| 94 | At the machine code level, conditional jumps may only jump to a relative offset, expressed in a byte, of the current program counter, limiting their range. |
---|
| 95 | However, at the assembly level, conditional jumps may jump to a label that appears anywhere in the program, significantly liberalising the use of conditional jumps and further simplifying the design of the CerCo compiler. |
---|
[952] | 96 | |
---|
[953] | 97 | Further, trying to na\"ively relate assembly programs with their machine code counterparts simply does not work. |
---|
[952] | 98 | Machine code programs that fetch from code memory and programs that combine the program counter with constant shifts do not make sense at the assembly level. |
---|
| 99 | More generally, memory addresses can only be compared with other memory addresses. |
---|
| 100 | However, checking that memory addresses are only compared against each other at the assembly level is in fact undecidable. |
---|
| 101 | In short, the full preservation of the semantics of the two languages is impossible. |
---|
| 102 | |
---|
[953] | 103 | Yet more complications are added by the peculiarities of the CerCo project itself. |
---|
[952] | 104 | As mentioned, the CerCo consortium is in the business of constructing a verified compiler for the C programming language. |
---|
[953] | 105 | However, unlike CompCert~\cite{compcert:2011,leroy:formal:2009,leroy:formally:2009}---which currently represents the state of the art for `industrial grade' verified compilers---CerCo considers not just the \emph{intensional correctness} of the compiler, but also its \emph{extensional correctness}. |
---|
[952] | 106 | That is, CompCert focusses solely on the preservation of the \emph{meaning} of a program during the compilation process, guaranteeing that the program's meaning does not change as it is gradually transformed into assembly code. |
---|
| 107 | However in any realistic compiler (even the CompCert compiler!) there is no guarantee that the program's time properties are preserved during the compilation process; a compiler's `optimisations' could, in theory, even conspire to degrade the concrete complexity of certain classes of programs. |
---|
| 108 | CerCo aims to expand the current state of the art by producing a compiler where this temporal degradation is guaranteed not to happen. |
---|
| 109 | |
---|
| 110 | In order to achieve this CerCo imposes a cost model on programs, or more specifically, on simple blocks of instructions. |
---|
[954] | 111 | This cost model is induced by the compilation process itself, and its non-compositional nature allows us to assign different costs to identical blocks of instructions depending on how they are compiled. |
---|
| 112 | In short, we aim to obtain a very precise costing for a program by embracing the compilation process, not ignoring it. |
---|
| 113 | This, however, complicates the proof of correctness for the compiler proper: for every translation pass from intermediate language to intermediate language, we must prove that not only has the meaning of a program been preserved, but also its complexity characteristics. |
---|
[952] | 114 | This also applies for the translation from assembly language to machine code. |
---|
[954] | 115 | |
---|
[952] | 116 | How do we assign a cost to a pseudoinstruction? |
---|
| 117 | As mentioned, conditional jumps at the assembly level can jump to a label appearing anywhere in the program. |
---|
[954] | 118 | However, at the machine code level, conditional jumps are limited to jumping `locally', using a measly byte offset. |
---|
| 119 | To translate a jump to a label, a single conditional jump pseudoinstruction may be translated into a block of three real instructions, as follows (here, \texttt{JZ} is `jump if accumulator is zero'): |
---|
[952] | 120 | \begin{displaymath} |
---|
| 121 | \begin{array}{r@{\quad}l@{\;\;}l@{\qquad}c@{\qquad}l@{\;\;}l} |
---|
| 122 | & \mathtt{JZ} & label & & \mathtt{JZ} & \text{size of \texttt{SJMP} instruction} \\ |
---|
| 123 | & \ldots & & \text{translates to} & \mathtt{SJMP} & \text{size of \texttt{LJMP} instruction} \\ |
---|
| 124 | label: & \mathtt{MOV} & \mathtt{A}\;\;\mathtt{B} & \Longrightarrow & \mathtt{LJMP} & \text{address of \textit{label}} \\ |
---|
| 125 | & & & & \ldots & \\ |
---|
| 126 | & & & & \mathtt{MOV} & \mathtt{A}\;\;\mathtt{B} |
---|
| 127 | \end{array} |
---|
| 128 | \end{displaymath} |
---|
| 129 | In the translation, if \texttt{JZ} fails, we fall through to the \texttt{SJMP} which jumps over the \texttt{LJMP}. |
---|
[954] | 130 | Naturally, if \textit{label} is close enough, a conditional jump pseudoinstruction is mapped directly to a conditional jump machine instruction; the above translation only applies if \textit{label} is not sufficiently local. |
---|
[952] | 131 | This leaves the problem, addressed below, of calculating whether a label is indeed `close enough' for the simpler translation to be used. |
---|
| 132 | |
---|
| 133 | Crucially, the above translation demonstrates the difficulty in predicting how many clock cycles a pseudoinstruction will take to execute. |
---|
| 134 | A conditional jump may be mapped to a single machine instruction or a block of three. |
---|
| 135 | Perhaps more insidious, the number of cycles needed to execute the instructions in the two branches of a translated conditional jump may be different. |
---|
[954] | 136 | Depending on the particular MCS-51 derivative at hand, an \texttt{SJMP} could in theory take a different number of clock cycles to execute than an \texttt{LJMP}. |
---|
[952] | 137 | These issues must also be dealt with in order to prove that the translation pass preserves the concrete complexity of the code. |
---|
| 138 | |
---|
| 139 | The question remains: how do we decide whether to expand a jump into an \texttt{SJMP} or an \texttt{LJMP}? |
---|
[954] | 140 | To understand why this problem is not trivial, consider the following snippet of assembly code: |
---|
[952] | 141 | \begin{displaymath} |
---|
| 142 | \text{dpm: finish me} |
---|
| 143 | \end{displaymath} |
---|
[954] | 144 | |
---|
[952] | 145 | As our example shows, given an occurence $l$ of an \texttt{LJMP} instruction, it may be possible to shrink $l$ to an occurence of an \texttt{SJMP} providing we can shrink any \texttt{LJMP}s that exist between $l$ and its target location. |
---|
| 146 | However, shrinking these \texttt{LJMP}s may in turn depend on shrinking $l$ to an \texttt{SJMP}, as it is perfectly possible to jump backwards. |
---|
[954] | 147 | In short, unless we can somehow break this loop of circularity, and similar knotty configurations of jumps, we are stuck with a suboptimal solution to the expanding jumps problem. |
---|
[952] | 148 | |
---|
[954] | 149 | How we went about resolving this problem affected the shape of our proof of correctness for the whole assembler in a rather profound way. |
---|
| 150 | We first attempted to synthesize a solution bottom up: starting with no solution, we gradually refine a solution using the same functions that implement the jump expansion. |
---|
[952] | 151 | Using this technique, solutions can fail to exist, and the proof quickly descends into a diabolical quagmire. |
---|
| 152 | |
---|
[954] | 153 | Abandoning this attempt, we instead split the `policy'---the decision over how any particular jump should be expanded---from the implementation that actually expands assembly programs into machine code. |
---|
[952] | 154 | Assuming the existence of a correct policy, we proved the implementation of the assembler correct. |
---|
[954] | 155 | Further, we proved that the assembler fails to assemble an assembly program if and only if a correct policy does not exist. |
---|
[952] | 156 | Policies do not exist in only a limited number of circumstances: namely, if a pseudoinstruction attempts to jump to a label that does not exist, or the program is too large to fit in code memory. |
---|
[954] | 157 | The first case would constitute a serious compiler error, and hopefully certifying the rest of the compiler would rule this possibility out. |
---|
| 158 | The second case is unavoidable---certified compiler or not, trying to load a huge program into a small code memory will break \emph{something}. |
---|
[952] | 159 | |
---|
[954] | 160 | The rest of this paper is a detailed description of this proof. |
---|
| 161 | |
---|
[927] | 162 | % ---------------------------------------------------------------------------- % |
---|
| 163 | % SECTION % |
---|
| 164 | % ---------------------------------------------------------------------------- % |
---|
| 165 | \subsection{Overview of the paper} |
---|
| 166 | \label{subsect.overview.of.the.paper} |
---|
[952] | 167 | In Section~\ref{sect.matita} we provide a brief overview of the Matita proof assistant for the unfamiliar reader. |
---|
| 168 | In Section~\ref{sect.the.proof} we discuss the design and implementation of the proof proper. |
---|
| 169 | In Section~\ref{sect.conclusions} we conclude. |
---|
[927] | 170 | |
---|
| 171 | % ---------------------------------------------------------------------------- % |
---|
| 172 | % SECTION % |
---|
| 173 | % ---------------------------------------------------------------------------- % |
---|
[952] | 174 | \section{Matita} |
---|
| 175 | \label{sect.matita} |
---|
[927] | 176 | |
---|
[960] | 177 | Matita is a proof assistant based on the (Co)inductive Calculus of Constructions~\cite{asperti:user:2007}. |
---|
| 178 | For those familiar with Coq, Matita's syntax and mode of operation should be entirely familiar. |
---|
| 179 | We take time here to explain one of Matita's syntactic idiosyncracies, however. |
---|
[976] | 180 | The use of `$\mathtt{?}$' or `$\mathtt{\ldots}$' in an argument position denotes a type or types to be inferred automatically by unification respectively. |
---|
[960] | 181 | The use of `$\mathtt{?}$' in the body of a definition, lemma or axiom denotes an incomplete term that is to be closed, by hand, using tactics. |
---|
[952] | 182 | |
---|
[927] | 183 | % ---------------------------------------------------------------------------- % |
---|
| 184 | % SECTION % |
---|
| 185 | % ---------------------------------------------------------------------------- % |
---|
[952] | 186 | \section{The proof} |
---|
| 187 | \label{sect.the.proof} |
---|
[927] | 188 | |
---|
[960] | 189 | \subsection{The assembler and semantics of machine code} |
---|
| 190 | \label{subsect.the.assembler.and.semantics.of.machine.code} |
---|
[952] | 191 | |
---|
[960] | 192 | The formalisation in Matita of the semantics of MCS-51 machine code is described in~\cite{mulligan:executable:2011}. |
---|
| 193 | We merely describe enough here to understand the rest of the proof. |
---|
| 194 | |
---|
| 195 | At heart, the MCS-51 emulator centres around a \texttt{Status} record, describing the current state of the microprocessor. |
---|
| 196 | This record contains fields corresponding to the microprocessor's program counter, special function registers, and so on. |
---|
| 197 | At the machine code level, code memory is implemented as a trie of bytes, addressed by the program counter. |
---|
| 198 | Machine code programs are loaded into \texttt{Status} using the \texttt{load\_code\_memory} function. |
---|
| 199 | |
---|
| 200 | We may execut a single step of a machine code program using the \texttt{execute\_1} function, which returns an updated \texttt{Status}: |
---|
[957] | 201 | \begin{lstlisting} |
---|
[960] | 202 | definition execute_1: Status $\rightarrow$ Status := $\ldots$ |
---|
[957] | 203 | \end{lstlisting} |
---|
[960] | 204 | The function \texttt{execute} allows one to execute an arbitrary, but fixed (due to Matita's normalisation requirement!) number of steps of a program. |
---|
[957] | 205 | |
---|
[960] | 206 | Naturally, assembly programs have analogues. |
---|
| 207 | The counterpart of the \texttt{Status} record is \texttt{PseudoStatus}. |
---|
| 208 | Instead of code memory being implemented as tries of bytes, code memory is here implemented as lists of pseudoinstructions, and program counters are merely indices into this list. |
---|
| 209 | Our analogue of \texttt{execute\_1} is \texttt{execute\_1\_pseudo\_instruction}: |
---|
[957] | 210 | \begin{lstlisting} |
---|
[960] | 211 | definition execute_1_pseudo_instruction: (Word $\rightarrow$ nat $\times$ nat) $\rightarrow$ |
---|
| 212 | PseudoStatus $\rightarrow$ PseudoStatus := $\ldots$ |
---|
[957] | 213 | \end{lstlisting} |
---|
[960] | 214 | Notice, here, that the emulation function for pseudoprograms takes an additional argument. |
---|
| 215 | This is a function that maps program counters (for the pseudoprogram) to pairs of natural numbers representing the number of clock ticks that the pseudoinstruction needs to execute, post expansion. |
---|
[968] | 216 | We call this function a \emph{costing}, and note that the costing is induced by the particular strategy we use to expand pseudoinstructions. |
---|
| 217 | If we change how we expand conditional jumps to labels, for instance, then the costing needs to change, hence \texttt{execute\_1\_pseudo\_instruction}'s parametricity in the costing. |
---|
| 218 | |
---|
| 219 | The costing returns \emph{pairs} of natural numbers because, in the case of expanding conditional jumps to labels, the expansion of the `true branch' and `false branch' may differ in the number of clock ticks needed for execution. |
---|
[960] | 220 | This timing information is used inside \texttt{execute\_1\_pseudo\_instruction} to update the clock of the \texttt{PseudoStatus}. |
---|
| 221 | During the proof of correctness of the assembler we relate the clocks of \texttt{Status}es and \texttt{PseudoStatus}es. |
---|
[957] | 222 | |
---|
[960] | 223 | The assembler, mapping programs consisting of lists of pseudoinstructions to lists of bytes, operates in a mostly straightforward manner. |
---|
| 224 | To a degree of approximation, the assembler on an assembly program, consisting of $n$ pseudoinstructions $\mathtt{P_i}$ for $1 \leq i \leq n$, works as in the following diagram: |
---|
| 225 | \begin{displaymath} |
---|
[974] | 226 | [\mathtt{P_1}, \ldots \mathtt{P_n}] \xrightarrow{\mathtt{flatten}\left(\mathtt{P_i} \xrightarrow{\mbox{\fontsize{7}{9}\selectfont$\mathtt{expand}$}} \mathtt{[I_1^i, \ldots I^q_i]} \xrightarrow{\mbox{\fontsize{7}{9}\selectfont$\mathtt{assembly1}^*$}} \mathtt{[0110]}\right)^{*}} \mathtt{[010101]} |
---|
[960] | 227 | \end{displaymath} |
---|
| 228 | Here $\mathtt{I^i_j}$ for $1 \leq j \leq q$ are the $q$ machine code instructions obtained by expanding, with \texttt{expand\_pseudo\_instruction}, a single pseudoinstruction. |
---|
| 229 | Each machine code instruction $\mathtt{I^i_j}$ is then assembled, using the \texttt{assembly1} function, into a list of bytes. |
---|
| 230 | This process is iterated for each pseudoinstruction, before the lists are flattened into a single bit list representation of the original assembly program. |
---|
| 231 | |
---|
[974] | 232 | By inspecting the above diagram, it would appear that the best way to proceed with a proof that the assembly process does not change the semantics of an assembly program is via a decomposition of the problem into two subproblems. |
---|
| 233 | Namely, we first expand any and all pseudoinstructions into lists of machine instructions, and provide a proof that this process does not change our program's semantics. |
---|
| 234 | Finally, we assemble all machine code instructions into machine code---lists of bytes---and prove once more that this process does not have an adverse effect on a program's semantics. |
---|
| 235 | By composition, we then have that the whole assembly process is semantics preserving. |
---|
| 236 | |
---|
| 237 | This is a tempting approach to the proof, but ultimately the wrong approach. |
---|
| 238 | In particular, it is important that we track how the program counter indexing into the assembly program, and the machine's program counter evolve, so that we can relate them. |
---|
| 239 | Expanding pseudoinstructions requires that the machine's program counter be incremented by $n$ steps, for $1 \leq n$, for every increment of the assembly program's program counter. |
---|
| 240 | Keeping track of the correspondence between the two program counters quickly becomes unfeasible using a compositional approach, and hence the proof must be monolithic. |
---|
| 241 | |
---|
[960] | 242 | % ---------------------------------------------------------------------------- % |
---|
| 243 | % SECTION % |
---|
| 244 | % ---------------------------------------------------------------------------- % |
---|
| 245 | \subsection{Policies} |
---|
| 246 | \label{subsect.policies} |
---|
| 247 | |
---|
[968] | 248 | Policies exist to dictate how conditional and unconditional jumps at the assembly level should be expanded into machine code instructions. |
---|
| 249 | Using policies, we are able to completely decouple the decision over how jumps are expanded with the act of expansion, simplifying our proofs. |
---|
| 250 | As mentioned, the MCS-51 instruction set includes three different jump instructions: \texttt{SJMP}, \texttt{AJMP} and \texttt{LJMP}; call these `short', `medium' and `long' jumps, respectively: |
---|
[957] | 251 | \begin{lstlisting} |
---|
[968] | 252 | inductive jump_length: Type[0] := |
---|
[957] | 253 | | short_jump: jump_length |
---|
| 254 | | medium_jump: jump_length |
---|
| 255 | | long_jump: jump_length. |
---|
[960] | 256 | \end{lstlisting} |
---|
[968] | 257 | A \texttt{jump\_expansion\_policy} is a map from \texttt{Word}s to \texttt{jump\_length}s, implemented as a trie. |
---|
| 258 | Intuitively, a policy maps positions in a program (indexed using program counters implemented as \texttt{Word}s) to a particular variety of jump. |
---|
[960] | 259 | \begin{lstlisting} |
---|
[968] | 260 | definition jump_expansion_policy := BitVectorTrie jump_length 16. |
---|
[957] | 261 | \end{lstlisting} |
---|
[968] | 262 | Next, we require a series of `sigma' functions. |
---|
| 263 | These functions map assembly program counters to their machine code counterparts, establishing the correspondence between `positions' in an assembly program and `positions' in a machine code program. |
---|
| 264 | At the heart of this process is \texttt{sigma0} which traverses an assembly program building maps from program counter to program counter. |
---|
| 265 | This function fails if and only if an internal call to \texttt{assembly\_1\_pseudoinstruction} fails: |
---|
[957] | 266 | \begin{lstlisting} |
---|
[974] | 267 | definition sigma0: pseudo_assembly_program |
---|
| 268 | $\rightarrow$ option (nat $\times$ (nat $\times$ (BitVectorTrie Word 16))) := $\ldots$ |
---|
[957] | 269 | \end{lstlisting} |
---|
[968] | 270 | We eventually lift this to functions from program counters to program counters: |
---|
[957] | 271 | \begin{lstlisting} |
---|
[969] | 272 | definition sigma_safe: |
---|
| 273 | pseudo_assembly_program $\rightarrow$ option (Word $\rightarrow$ Word) := $\ldots$ |
---|
[968] | 274 | \end{lstlisting} |
---|
| 275 | Now, it's possible to define what a `good policy' is i.e. one that does not cause \texttt{sigma\_safe} to fail. |
---|
| 276 | As mentioned, \texttt{sigma\_safe} can only fail if an assembly program fails to be assembled: |
---|
| 277 | \begin{lstlisting} |
---|
[969] | 278 | definition policy_ok := $\lambda$p. sigma_safe p $\neq$ None $\ldots$. |
---|
[968] | 279 | \end{lstlisting} |
---|
[969] | 280 | Finally, we obtain \texttt{sigma}, a map from program counters to program counters, which is guranteed not to fail as we internally provide a that |
---|
[968] | 281 | \begin{lstlisting} |
---|
[969] | 282 | definition sigma: pseudo_assembly_program $\rightarrow$ Word $\rightarrow$ Word := $\ldots$ |
---|
[957] | 283 | \end{lstlisting} |
---|
[974] | 284 | |
---|
| 285 | % ---------------------------------------------------------------------------- % |
---|
| 286 | % SECTION % |
---|
| 287 | % ---------------------------------------------------------------------------- % |
---|
| 288 | \subsection{Total correctness of the assembler} |
---|
| 289 | \label{subsect.total.correctness.of.the.assembler} |
---|
| 290 | |
---|
[976] | 291 | Using our policies, we now work toward proving the total correctness of the assembler. |
---|
| 292 | By total correctness, we mean that the assembly process does not change the semantics of an assembly program. |
---|
| 293 | Naturally, this necessitates keeping some sort of correspondence between program counters at the assembly level, and program counters at the machine code level. |
---|
| 294 | For this, we use the \texttt{sigma} machinery defined at the end of Subsection~\ref{subsect.policies}. |
---|
| 295 | |
---|
[974] | 296 | CSC: no options using policy |
---|
| 297 | \begin{lstlisting} |
---|
| 298 | lemma fetch_assembly_pseudo2: $\forall$program, assembled, costs, labels. |
---|
| 299 | Some $\ldots$ $\langle$labels,costs$\rangle$ = build_maps program $\rightarrow$ |
---|
| 300 | Some $\ldots$ $\langle$assembled,costs$\rangle$ = assembly program $\rightarrow$ $\forall$ppc. |
---|
| 301 | let code_memory := load_code_memory assembled in |
---|
| 302 | let preamble := $\pi_1$ program in |
---|
| 303 | let data_labels := construct_datalabels preamble in |
---|
| 304 | let lk_labels := |
---|
| 305 | λx. sigma program (address_of_word_labels_code_mem ($\pi_2$ program) x) in |
---|
| 306 | let lk_dlabels := λx. lookup ? ? x data_labels (zero ?) in |
---|
| 307 | let expansion := jump_expansion ppc program in |
---|
| 308 | let $\langle$pi,newppc$\rangle$ := fetch_pseudo_instruction ($\pi_2$ program) ppc in |
---|
| 309 | let ppc' := sigma program ppc in |
---|
| 310 | let newppc' := sigma program newppc in |
---|
| 311 | let instructions' := |
---|
| 312 | expand_pseudo_instruction lk_labels lk_dlabels ppc' expansion pi in |
---|
| 313 | let fetched := fetch_many code_memory newppc' ppc' instructions in |
---|
| 314 | $\exists$instructions. Some ? instructions = instructions' $\wedge$ fetched. |
---|
| 315 | \end{lstlisting} |
---|
| 316 | |
---|
[976] | 317 | We expand pseudoinstructions using the function \texttt{expand\_pseudo\_instruction}. |
---|
| 318 | This function accepts a `policy decision'---an element of type \texttt{jump\_length}---that is used when expanding a \texttt{Call}, \texttt{Jmp} or conditional jump to a label into the correct machine instruction. |
---|
| 319 | This \texttt{policy\_decision} is asssumed to originate from a policy as defined in Subsection~\ref{subsect.policies}. |
---|
[974] | 320 | \begin{lstlisting} |
---|
| 321 | definition expand_pseudo_instruction: |
---|
| 322 | ∀lookup_labels, lookup_datalabels, pc, policy_decision. |
---|
[976] | 323 | pseudo_instruction $\rightarrow$ option list instruction := $\ldots$ |
---|
[974] | 324 | \end{lstlisting} |
---|
[976] | 325 | Under the assumption that a correct policy exists, \texttt{expand\_pseudo\_instruction} should never fail, and therefore the option type may be dispensed with. |
---|
| 326 | This is because the only failure conditions for \texttt{expand\_pseudo\_instruction} result from trying to expand a pseudoinstruction into an `impossible' combination of machine code instructions. |
---|
| 327 | For instance, if the policy decision dictates that we should expand a \texttt{Call} pseudoinstruction into a `short jump', then we fail, as the MCS-51's instruction set only features instructions \texttt{ACALL} and \texttt{LCALL}. |
---|
[974] | 328 | |
---|
| 329 | \begin{lstlisting} |
---|
| 330 | axiom assembly_ok: ∀program,assembled,costs,labels. |
---|
| 331 | Some $\ldots$ $\langle$labels, costs$\rangle$ = build_maps program $\rightarrow$ |
---|
| 332 | Some $\ldots$ $\langle$assembled, costs$\rangle$ = assembly program $\rightarrow$ |
---|
| 333 | let code_memory := load_code_memory assembled in |
---|
| 334 | let preamble := $\pi_1$ program in |
---|
| 335 | let datalabels := construct_datalabels preamble in |
---|
| 336 | let lk_labels := |
---|
| 337 | $\lambda$x. sigma program (address_of_word_labels_code_mem ($\pi_2$ program) x) in |
---|
| 338 | let lk_dlabels := $\lambda$x. lookup ? ? x datalabels (zero ?) in |
---|
| 339 | ∀ppc,len,assembledi. |
---|
| 340 | let $\langle$pi, newppc$\rangle$ := fetch_pseudo_instruction ($\pi_2$ program) ppc in |
---|
| 341 | let assembly' := assembly_1_pseudoinstruction program ppc |
---|
| 342 | (sigma program ppc) lk_labels lk_dlabels pi in |
---|
| 343 | let newpc := (sigma program ppc) + len in |
---|
| 344 | let echeck := |
---|
| 345 | encoding_check code_memory (sigma program ppc) slen assembledi in |
---|
| 346 | Some $\ldots$ $\langle$len, assembledi$\rangle$ = assembly' $\rightarrow$ |
---|
| 347 | echeck $\wedge$ sigma program newppc = newpc. |
---|
| 348 | \end{lstlisting} |
---|
| 349 | |
---|
[976] | 350 | Lemma \texttt{fetch\_assembly\_pseudo} establishes the correctness of expanding and then assembling pseudoinstructions: |
---|
[974] | 351 | \begin{lstlisting} |
---|
| 352 | lemma fetch_assembly_pseudo: $\forall$program, ppc, lk_labels, lk_dlabels. |
---|
| 353 | $\forall$pi, code_memory, len, assembled, instructions, pc. |
---|
| 354 | let jexp := jump_expansion ppc program in |
---|
| 355 | let exp := |
---|
[976] | 356 | expand_pseudo_instruction lk_labels lk_dlabels pc jexp pi |
---|
[974] | 357 | let ass := |
---|
| 358 | assembly_1_pseudoinstruction program ppc pc lk_labels lk_dlabels pi in |
---|
| 359 | Some ? instructions = exp $\rightarrow$ |
---|
| 360 | Some $\ldots$ $\langle$len, assembled$\rangle$ = ass $\rightarrow$ |
---|
| 361 | encoding_check code_memory pc (pc + len) assembled $\rightarrow$ |
---|
| 362 | fetch_many code_memory (pc + len) pc instructions. |
---|
| 363 | \end{lstlisting} |
---|
[976] | 364 | Here, \texttt{len} is the number of machine code instructions the pseudoinstruction at hand has been expanded into, \texttt{encoding\_check} is a recursive function that checks for any possible corruption of the code memory, resulting from expanding the pseudoinstruction. |
---|
[981] | 365 | We assemble a single pseudoinstruction with \texttt{assembly\_1\_pseudoinstruction}, which internally calls \texttt{jump\_expansion} and \texttt{expand\_pseudo\_instruction}. |
---|
[976] | 366 | The function \texttt{fetch\_many} fetches multiple machine code instructions from code memory and performs some routine checks. |
---|
[974] | 367 | |
---|
[976] | 368 | Intuitively, lemma \texttt{fetch\_assembly\_pseudo} can be read as follows. |
---|
| 369 | Suppose our policy \texttt{jump\_expansion} dictates that the pseudoinstruction indexed by the pseudo program counter \texttt{ppc} in assembly program \texttt{program} gives us the policy decision \texttt{jexp}. |
---|
| 370 | Further, suppose we expand the pseudoinstruction at \texttt{ppc} with the policy decision \texttt{jexp}, obtaining an (optional) list of machine code instructions \texttt{exp}. |
---|
| 371 | Suppose we also |
---|
| 372 | |
---|
[974] | 373 | \begin{lstlisting} |
---|
| 374 | theorem fetch_assembly: $\forall$pc, i, cmem, assembled. |
---|
| 375 | assembled = assembly1 i $\rightarrow$ |
---|
[976] | 376 | let len := length $\ldots$ assembled in |
---|
[974] | 377 | encoding_check cmem pc (pc + len) assembled $\rightarrow$ |
---|
| 378 | let fetched := fetch code_memory (bitvector_of_nat $\ldots$ pc) in |
---|
| 379 | let $\langle$instr_pc, ticks$\rangle$ := fetched in |
---|
[976] | 380 | let $\langle$instr, pc'$\rangle$ := instr_pc in |
---|
[974] | 381 | (eq_instruction instr i $\wedge$ |
---|
| 382 | eqb ticks (ticks_of_instruction instr) $\wedge$ |
---|
| 383 | eq_bv $\ldots$ pc' (pc + len)) = true. |
---|
| 384 | \end{lstlisting} |
---|
| 385 | |
---|
[969] | 386 | An \texttt{internal\_pseudo\_address\_map} positions in the memory of a \texttt{PseudoStatus} with a physical memory address. |
---|
[957] | 387 | \begin{lstlisting} |
---|
[969] | 388 | definition internal_pseudo_address_map := list (BitVector 8). |
---|
| 389 | \end{lstlisting} |
---|
| 390 | We use \texttt{internal\_pseudo\_address\_map}s to convert the lower internal RAM of a \texttt{PseudoStatus} into the lower internal RAM of a \texttt{Status}. |
---|
| 391 | Notice, the MCS-51's internal RAM is addressed with a 7-bit `byte'. |
---|
| 392 | % dpm: ugly English, fix |
---|
| 393 | The whole of the internal RAM space is addressed with bytes: the first bit is used to distinguish between the programmer addressing low and high internal memory. |
---|
| 394 | \begin{lstlisting} |
---|
[957] | 395 | axiom low_internal_ram_of_pseudo_low_internal_ram: |
---|
[969] | 396 | internal_pseudo_address_map $\rightarrow$ BitVectorTrie Byte 7 $\rightarrow$ BitVectorTrie Byte 7. |
---|
[957] | 397 | \end{lstlisting} |
---|
[969] | 398 | A similar axiom exists for high internal RAM. |
---|
[957] | 399 | |
---|
[969] | 400 | Next, we are able to translate \texttt{PseudoStatus} records into \texttt{Status} records using \texttt{status\_of\_pseudo\_status}. |
---|
| 401 | Translating a \texttt{PseudoStatus}'s code memory requires we expand pseudoinstructions and then assemble to obtain a trie of bytes. |
---|
| 402 | This can fail, as mentioned, in a limited number of situations, related to improper use of labels in an assembly program. |
---|
| 403 | However, it is possible to `tighten' the type of \texttt{status\_of\_pseudo\_status}, removing the option type, by using the fact that if any `good policy' exists, assembly will never fail. |
---|
[957] | 404 | \begin{lstlisting} |
---|
| 405 | definition status_of_pseudo_status: |
---|
| 406 | internal_pseudo_address_map → PseudoStatus → option Status |
---|
| 407 | \end{lstlisting} |
---|
[969] | 408 | After fetching an assembly instruction we must update any \texttt{internal\_pseudo\hyp{}\_address\_map}s that may be laying around. |
---|
| 409 | This is done with the following function: |
---|
[957] | 410 | \begin{lstlisting} |
---|
[974] | 411 | definition next_internal_pseudo_address_map: internal_pseudo_address_map |
---|
| 412 | $\rightarrow$ PseudoStatus $\rightarrow$ option internal_pseudo_address_map |
---|
[957] | 413 | \end{lstlisting} |
---|
[974] | 414 | Finally, we are able to state and prove our main theorem. |
---|
| 415 | This relates the execution of a single assembly instruction and the execution of (possibly) many machine code instructions. |
---|
| 416 | That is, the assembly process preserves the semantics of an assembly program, as it is translated into machine code: |
---|
[957] | 417 | \begin{lstlisting} |
---|
[974] | 418 | theorem main_thm: |
---|
[969] | 419 | ∀M,M',ps,s,s''. |
---|
[976] | 420 | next_internal_pseudo_address_map M ps = Some $\ldots$ M' $\rightarrow$ |
---|
| 421 | status_of_pseudo_status M ps = Some $\ldots$ s $\rightarrow$ |
---|
[969] | 422 | status_of_pseudo_status M' |
---|
| 423 | (execute_1_pseudo_instruction |
---|
[976] | 424 | (ticks_of (code_memory $\ldots$ ps)) ps) = Some $\ldots$ s'' $\rightarrow$ |
---|
[969] | 425 | $\exists$n. execute n s = s''. |
---|
[957] | 426 | \end{lstlisting} |
---|
[970] | 427 | The statement can be given an intuitive reading as follows. |
---|
[973] | 428 | Suppose our \texttt{PseudoStatus} $ps$ can be successfully converted into a \texttt{Status} $s$. |
---|
| 429 | Suppose further that, after executing a single assembly instruction and converting the resulting \texttt{PseudoStatus} into a \texttt{Status}, we obtain $s''$, being careful to track the number of ticks executed. |
---|
| 430 | Then, there exists some number $n$, so that executing $n$ machine code instructions in \texttt{Status} $s$ gives us \texttt{Status} $s''$. |
---|
[974] | 431 | Theorem \texttt{main\_thm} establishes the correctness of the assembly process. |
---|
[957] | 432 | |
---|
[969] | 433 | % ---------------------------------------------------------------------------- % |
---|
| 434 | % SECTION % |
---|
| 435 | % ---------------------------------------------------------------------------- % |
---|
[927] | 436 | \section{Conclusions} |
---|
| 437 | \label{sect.conclusions} |
---|
| 438 | |
---|
[976] | 439 | We have proved the total correctness of an assembler for MCS-51 assembly language. |
---|
| 440 | In particular, our assembly language featured labels, arbitrary conditional and unconditional jumps to labels, global data and instructions for moving this data into the MCS-51's single 16-bit register. |
---|
[981] | 441 | Expanding these pseudoinstructions into machine code instructions is not trivial, and the proof that the assembly process is `correct', in that the semantics of an assembly program are not changed is complex. |
---|
[960] | 442 | |
---|
[981] | 443 | Aside from their application in verified compiler projects such as CerCo, verified assemblers could also be applied to the verification of operating system kernels. |
---|
| 444 | Of particular note is the verified seL4 kernel~\cite{klein:sel4:2009,klein:sel4:2010}. |
---|
| 445 | This verification explicitly assumes the existence of, amongst other things, a trustworthy assembler. |
---|
[952] | 446 | |
---|
[981] | 447 | \paragraph{Use of dependent types and Russell} |
---|
[955] | 448 | Our formalisation makes sparing use of dependent types. |
---|
| 449 | In certain datastructures, such as tries and vectors, they are used to guarantee invariants. |
---|
| 450 | However, we have currently shyed away from making extensive use of dependent types and inductive predicates in the proof of correctness for the assembler itself. |
---|
| 451 | This is because complex dependent types and inductive predicates tend not to co\"operate particularly well with tactics such as inversion. |
---|
[952] | 452 | |
---|
[956] | 453 | However, there are certain cases where the use of dependent types is unavoidable. |
---|
| 454 | For instance, when proving that the \texttt{build\_maps} function is correct, a function that collates the cost and data labels of an assembly program into map datastructures. |
---|
[981] | 455 | In cases such as these we make use of Matita's implementation of Russell~\cite{sozeau:subset:2006}. |
---|
| 456 | In Matita, Russell may be implemented with two coercions and some notational sugaring. |
---|
[955] | 457 | |
---|
[952] | 458 | \subsection{Related work} |
---|
| 459 | \label{subsect.related.work} |
---|
| 460 | |
---|
[981] | 461 | % piton |
---|
| 462 | We are not the first to consider the total correctness of an assembler for a non-trivial assembly language. |
---|
| 463 | Perhaps the most impressive piece of work in this domain is the Piton stack~\cite{moore:piton:1996,moore:grand:2005}. |
---|
| 464 | This was a stack of verified components, written and verified in ACL2, ranging from a proprietary FM9001 microprocessor verified at the gate level, to assemblers and compilers for two high-level languages---a dialect of Lisp and $\mu$Gypsy~\cite{moore:grand:2005}. |
---|
| 465 | %dpm more: weirich paper? |
---|
| 466 | |
---|
| 467 | % jinja |
---|
| 468 | Klein and Nipkow consider a Java-like programming language, Jinja~\cite{klein:machine:2006,klein:machine:2010}. |
---|
| 469 | They provide a compiler, virtual machine and operational semantics for the programming language and virtual machine, and prove that their compiler is semantics and type preserving. |
---|
| 470 | |
---|
| 471 | Finally, mention should be made of CompCert~\cite{compcert:2011,blazy:formal:2006,leroy:formal:2009,leroy:formally:2009}, another verified compiler project related to CerCo. |
---|
| 472 | As previously mentioned, CompCert considers only extensional correctness of the compiler, and not intensional correctness, which CerCo focusses on. |
---|
| 473 | However, CerCo also extends CompCert in other ways. |
---|
| 474 | Namely, the CompCert verified compilation chain terminates at the PowerPC or ARM assembly level, and takes for granted the existence of a trustworthy assembler. |
---|
| 475 | CerCo chooses to go further, by considering a verified compilation chain all the way down to the machine code level. |
---|
| 476 | In essence, the work presented in this publication is one part of CerCo's extension over CompCert. |
---|
| 477 | |
---|
[927] | 478 | \bibliography{cpp-2011.bib} |
---|
| 479 | |
---|
[960] | 480 | \end{document}\renewcommand{\verb}{\lstinline} |
---|
| 481 | \def\lstlanguagefiles{lst-grafite.tex} |
---|
| 482 | \lstset{language=Grafite} |
---|