\documentclass{llncs}

\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage[english]{babel}
\usepackage{color}
\usepackage{fancybox}
\usepackage{graphicx}
\usepackage[colorlinks]{hyperref}
\usepackage{hyphenat}
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage{mdwlist}
\usepackage{microtype}
\usepackage{stmaryrd}
\usepackage{url}

\renewcommand{\verb}{\lstinline}
\def\lstlanguagefiles{lst-grafite.tex}
\lstset{language=Grafite}

\newlength{\mylength}
\newenvironment{frametxt}%
{\setlength{\fboxsep}{5pt}
\setlength{\mylength}{\linewidth}%
\addtolength{\mylength}{-2\fboxsep}%
\addtolength{\mylength}{-2\fboxrule}%
\Sbox
\minipage{\mylength}%
\setlength{\abovedisplayskip}{0pt}%
\setlength{\belowdisplayskip}{0pt}%
}%
{\endminipage\endSbox
\[\fbox{\TheSbox}\]}

\title{On the correctness of an assembler for the Intel MCS-51 microprocessor}
\author{Jaap Boender \and Dominic P. Mulligan \and Claudio Sacerdoti Coen}
\institute{Dipartimento di Scienze dell'Informazione, Universit\`a di Bologna}

\bibliographystyle{splncs03}

\begin{document}

\maketitle

\begin{abstract}
We consider the formalisation of an assembler for the Intel MCS-51 8-bit microprocessor in the Matita proof assistant.
This formalisation forms a major component of the EU-funded CerCo project, concerning the construction and formalisation of a concrete complexity-preserving compiler for a large subset of the C programming language.
...
\end{abstract}

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %
\section{Introduction}
\label{sect.introduction}

We consider the formalisation of an assembler for the Intel MCS-51 8-bit microprocessor in the Matita proof assistant~\cite{asperti:user:2007}.
This formalisation forms a major component of the EU-funded CerCo project~\cite{cerco:2011}, concerning the construction and formalisation of a concrete complexity-preserving compiler for a large subset of the C programming language.

The MCS-51 dates from the early 1980s and is commonly called the 8051/8052.
Despite the microprocessor's age, derivatives are still widely manufactured by a number of semiconductor foundries.
As a result, the processor is widely used, especially in embedded systems development, where well-tested, cheap, predictable microprocessors find their niche.

The MCS-51 has a relative paucity of features compared to its more modern brethren.
In particular, the MCS-51 does not possess a cache or any instruction pipelining that would make predicting the concrete cost of executing a single instruction an involved process.
Instead, each semiconductor foundry that produces an MCS-51 derivative is able to provide accurate timing information in clock cycles for each instruction in their derivative's instruction set.
It is important to stress that this timing information, unlike in more sophisticated processors, is not an estimate: it is a definition.
For the MCS-51, if a manufacturer states that a particular opcode takes three clock cycles to execute, then that opcode \emph{always} takes three clock cycles to execute.

This predictability of timing information is especially attractive to the CerCo consortium.
We are in the process of constructing a certified, concrete complexity compiler for a realistic processor, rather than building and formalising the worst case execution time (WCET) tools that would be necessary to achieve the same result with, for example, a modern ARM or PowerPC microprocessor.

However, the MCS-51's paucity of features is a double-edged sword.
In particular, the MCS-51 features relatively minuscule memory spaces (including read-only code memory, stack and internal/external random access memory) by modern standards.
As a result our compiler, to have any sort of hope of successfully compiling realistic C programs, ought to produce `tight' machine code.
This is not simple.

We here focus on a single issue in the MCS-51's instruction set: unconditional jumps.
The MCS-51 features three unconditional jump instructions: \texttt{LJMP} and \texttt{SJMP}---`long jump' and `short jump' respectively---and an 11-bit oddity of the MCS-51, \texttt{AJMP}, that the prototype CerCo compiler~\cite{cerco-report-code:2011} ignores for simplicity's sake.\footnote{Ignoring \texttt{AJMP} and its analogue \texttt{ACALL} is not idiosyncratic. The Small Device C Compiler (SDCC)~\cite{sdcc:2011}, the leading open source C compiler for the MCS-51, also seemingly does not produce \texttt{AJMP} and \texttt{ACALL} instructions. Their utility in a modern context remains unclear.}
Each of these three instructions expects arguments of different sizes and behaves in different ways.
\texttt{SJMP} may only perform a `local jump' whereas \texttt{LJMP} may jump to any address in the MCS-51's memory space.
Consequently, the size of each opcode is different, and to squeeze as much code as possible into the MCS-51's limited code memory, the smallest possible opcode should be selected.

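To make the size trade-off concrete, the following sketch (ours, not part of the formalisation) records the encoded size, in bytes, of each unconditional jump; the byte counts are those of the MCS-51 instruction set.

```python
# Illustrative sketch, not part of the Matita development: encoded sizes,
# in bytes, of the MCS-51's three unconditional jump instructions.
def size_of_jump(kind: str) -> int:
    return {
        "SJMP": 2,  # opcode byte + 8-bit signed relative offset
        "AJMP": 2,  # opcode embeds the top 3 bits of an 11-bit address
        "LJMP": 3,  # opcode byte + 16-bit absolute address
    }[kind]
```

Selecting \texttt{SJMP} over \texttt{LJMP} thus saves a byte per jump, which adds up quickly in a few kilobytes of code memory.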
The prototype CerCo C compiler does not attempt to select the smallest jump opcode in this manner, as this was thought to unnecessarily complicate the compilation chain.
Instead, the compiler targets an assembly language, complete with pseudoinstructions including bespoke \texttt{Jmp} and \texttt{Call} instructions.
Labels, conditional jumps to labels, a program preamble containing global data and a \texttt{MOV} instruction for moving this global data into the MCS-51's one 16-bit register also feature.
This latter feature will ease any later consideration of separate compilation in the CerCo compiler.
An assembler is used to expand pseudoinstructions into MCS-51 machine code.

However, this assembly process is not trivial, for numerous reasons.
For example, our conditional jumps to labels behave differently from their machine code counterparts.
At the machine code level, conditional jumps may only jump to a relative offset, expressed in a byte, of the current program counter, limiting their range.
However, at the assembly level, conditional jumps may jump to a label that appears anywhere in the program, significantly liberalising the use of conditional jumps and further simplifying the design of the CerCo compiler.

Further, trying to na\"ively relate assembly programs with their machine code counterparts simply does not work.
Machine code programs that fetch from code memory and programs that combine the program counter with constant shifts do not make sense at the assembly level.
More generally, memory addresses can only be compared with other memory addresses.
However, checking that memory addresses are only compared against each other at the assembly level is in fact undecidable.
In short, the full preservation of the semantics of the two languages is impossible.

Yet more complications are added by the peculiarities of the CerCo project itself.
As mentioned, the CerCo consortium is in the business of constructing a verified compiler for the C programming language.
However, unlike CompCert~\cite{compcert:2011,leroy:formal:2009,leroy:formally:2009}---which currently represents the state of the art for `industrial grade' verified compilers---CerCo considers not just the \emph{extensional correctness} of the compiler, but also its \emph{intensional correctness}.
That is, CompCert focusses solely on the preservation of the \emph{meaning} of a program during the compilation process, guaranteeing that the program's meaning does not change as it is gradually transformed into assembly code.
However, in any realistic compiler (even the CompCert compiler!) there is no guarantee that the program's timing properties are preserved during the compilation process; a compiler's `optimisations' could, in theory, even conspire to degrade the concrete complexity of certain classes of programs.
CerCo aims to expand the current state of the art by producing a compiler where this temporal degradation is guaranteed not to happen.

In order to achieve this, CerCo imposes a cost model on programs, or more specifically, on simple blocks of instructions.
This cost model is induced by the compilation process itself, and its non-compositional nature allows us to assign different costs to identical blocks of instructions depending on how they are compiled.
In short, we aim to obtain a very precise costing for a program by embracing the compilation process, not ignoring it.
This, however, complicates the proof of correctness for the compiler proper: for every translation pass from intermediate language to intermediate language, we must prove that not only has the meaning of a program been preserved, but also its complexity characteristics.
This also applies to the translation from assembly language to machine code.

How do we assign a cost to a pseudoinstruction?
As mentioned, conditional jumps at the assembly level can jump to a label appearing anywhere in the program.
However, at the machine code level, conditional jumps are limited to jumping `locally', using a measly byte offset.
To translate a jump to a label, a single conditional jump pseudoinstruction may be translated into a block of three real instructions, as follows (here, \texttt{JZ} is `jump if accumulator is zero'):
\begin{displaymath}
\begin{array}{r@{\quad}l@{\;\;}l@{\qquad}c@{\qquad}l@{\;\;}l}
       & \mathtt{JZ}  & label                    &                      & \mathtt{JZ}   & \text{size of \texttt{SJMP} instruction} \\
       & \ldots       &                          & \text{translates to} & \mathtt{SJMP} & \text{size of \texttt{LJMP} instruction} \\
label: & \mathtt{MOV} & \mathtt{A}\;\;\mathtt{B} & \Longrightarrow      & \mathtt{LJMP} & \text{address of \textit{label}} \\
       &              &                          &                      & \ldots        & \\
       &              &                          &                      & \mathtt{MOV}  & \mathtt{A}\;\;\mathtt{B}
\end{array}
\end{displaymath}
In the translation, if \texttt{JZ} fails, we fall through to the \texttt{SJMP} which jumps over the \texttt{LJMP}.
Naturally, if \textit{label} is close enough, a conditional jump pseudoinstruction is mapped directly to a conditional jump machine instruction; the above translation only applies if \textit{label} is not sufficiently local.
This leaves the problem, addressed below, of calculating whether a label is indeed `close enough' for the simpler translation to be used.

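The translation above can be sketched concretely. The following fragment (an illustration of ours, with simplified byte sizes; not the Matita code) chooses between the one-instruction and three-instruction forms:

```python
# A sketch, in Python rather than Matita, of the translation shown above.
# A conditional jump to a label becomes a single machine-level JZ when the
# target lies within a signed byte of the next program counter; otherwise
# it becomes the three-instruction block in which JZ hops over the SJMP
# (reaching the LJMP to the label) and the SJMP hops over the LJMP on
# fall-through.  Names and byte sizes are our own simplification.
JZ_SIZE, SJMP_SIZE, LJMP_SIZE = 2, 2, 3

def expand_jz(pc: int, target: int) -> list[tuple[str, int]]:
    offset = target - (pc + JZ_SIZE)      # offset from the next pc
    if -128 <= offset <= 127:
        return [("JZ", offset)]           # label close enough
    return [("JZ", SJMP_SIZE),            # accumulator zero: skip the SJMP
            ("SJMP", LJMP_SIZE),          # otherwise: skip the LJMP
            ("LJMP", target)]             # long jump to the label
```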
Crucially, the above translation demonstrates the difficulty in predicting how many clock cycles a pseudoinstruction will take to execute.
A conditional jump may be mapped to a single machine instruction or a block of three.
Perhaps more insidiously, the number of cycles needed to execute the instructions in the two branches of a translated conditional jump may be different.
Depending on the particular MCS-51 derivative at hand, an \texttt{SJMP} could in theory take a different number of clock cycles to execute than an \texttt{LJMP}.
These issues must also be dealt with in order to prove that the translation pass preserves the concrete complexity of the code.

The question remains: how do we decide whether to expand a jump into an \texttt{SJMP} or an \texttt{LJMP}?
To understand why this problem is not trivial, consider the following snippet of assembly code:
\begin{displaymath}
\text{dpm: finish me}
\end{displaymath}

As our example shows, given an occurrence $l$ of an \texttt{LJMP} instruction, it may be possible to shrink $l$ to an occurrence of an \texttt{SJMP} provided we can shrink any \texttt{LJMP}s that exist between $l$ and its target location.
However, shrinking these \texttt{LJMP}s may in turn depend on shrinking $l$ to an \texttt{SJMP}, as it is perfectly possible to jump backwards.
In short, unless we can somehow break this circularity, and similar knotty configurations of jumps, we are stuck with a suboptimal solution to the jump expansion problem.

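One standard way to break this circularity, sketched below under simplifying assumptions of ours (two jump sizes only, no \texttt{AJMP}), is a fixed-point iteration: optimistically assume every jump is short, then repeatedly promote out-of-range jumps to long until the assignment stabilises. Since promotions only ever grow sizes, the iteration terminates. This illustrates the idea only; it is not CerCo's policy computation.

```python
# Toy fixed-point computation for the jump expansion problem.
SHORT, LONG = 2, 3  # bytes occupied by a short/long jump (our simplification)

def solve(program: list[tuple[str, int]]) -> list[bool]:
    """program[i] is ('plain', size_in_bytes) or ('jump', target_index).
    Returns, for each instruction, whether it must be a long jump."""
    long = [False] * len(program)

    def size(i: int) -> int:
        kind, arg = program[i]
        return arg if kind == "plain" else (LONG if long[i] else SHORT)

    changed = True
    while changed:
        changed = False
        addr = [0]                           # addresses under current sizes
        for i in range(len(program)):
            addr.append(addr[-1] + size(i))
        for i, (kind, target) in enumerate(program):
            if kind == "jump" and not long[i]:
                offset = addr[target] - addr[i + 1]   # from the next pc
                if not -128 <= offset <= 127:
                    long[i] = True           # promote: never demoted again
                    changed = True
    return long
```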
How we went about resolving this problem affected the shape of our proof of correctness for the whole assembler in a rather profound way.
We first attempted to synthesize a solution bottom up: starting with no solution, we gradually refine one using the same functions that implement the jump expansion.
Using this technique, solutions can fail to exist, and the proof quickly descends into a diabolical quagmire.

Abandoning this attempt, we instead split the `policy'---the decision over how any particular jump should be expanded---from the implementation that actually expands assembly programs into machine code.
Assuming the existence of a correct policy, we proved the implementation of the assembler correct.
Further, we proved that the assembler fails to assemble an assembly program if and only if a correct policy does not exist.
Policies fail to exist in only a limited number of circumstances: namely, if a pseudoinstruction attempts to jump to a label that does not exist, or the program is too large to fit in code memory.
The first case would constitute a serious compiler error, and hopefully certifying the rest of the compiler would rule this possibility out.
The second case is unavoidable---certified compiler or not, trying to load a huge program into a small code memory will break \emph{something}.

The rest of this paper is a detailed description of this proof.

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %
\subsection{Overview of the paper}
\label{subsect.overview.of.the.paper}
In Section~\ref{sect.matita} we provide a brief overview of the Matita proof assistant for the unfamiliar reader.
In Section~\ref{sect.the.proof} we discuss the design and implementation of the proof proper.
In Section~\ref{sect.conclusions} we conclude.

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %
\section{Matita}
\label{sect.matita}

Matita is a proof assistant based on the (Co)inductive Calculus of Constructions~\cite{asperti:user:2007}.
For those familiar with Coq, Matita's syntax and mode of operation should be entirely familiar.
We take time here to explain one of Matita's syntactic idiosyncrasies, however.
The use of `$\mathtt{?}$' or `$\mathtt{\ldots}$' in an argument position denotes, respectively, a type or types to be inferred automatically by unification.
The use of `$\mathtt{?}$' in the body of a definition, lemma or axiom denotes an incomplete term that is to be closed, by hand, using tactics.

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %
\section{The proof}
\label{sect.the.proof}

\subsection{The assembler and semantics of machine code}
\label{subsect.the.assembler.and.semantics.of.machine.code}

The formalisation in Matita of the semantics of MCS-51 machine code is described in~\cite{mulligan:executable:2011}.
We merely describe enough here to understand the rest of the proof.

At heart, the MCS-51 emulator centres around a \texttt{Status} record, describing the current state of the microprocessor.
This record contains fields corresponding to the microprocessor's program counter, special function registers, and so on.
At the machine code level, code memory is implemented as a trie of bytes, addressed by the program counter.
Machine code programs are loaded into \texttt{Status} using the \texttt{load\_code\_memory} function.

We may execute a single step of a machine code program using the \texttt{execute\_1} function, which returns an updated \texttt{Status}:
\begin{lstlisting}
definition execute_1: Status $\rightarrow$ Status := $\ldots$
\end{lstlisting}
The function \texttt{execute} allows one to execute an arbitrary, but fixed (due to Matita's normalisation requirement!) number of steps of a program.

Naturally, assembly programs have analogues.
The counterpart of the \texttt{Status} record is \texttt{PseudoStatus}.
Instead of code memory being implemented as a trie of bytes, code memory is here implemented as a list of pseudoinstructions, and program counters are merely indices into this list.
Our analogue of \texttt{execute\_1} is \texttt{execute\_1\_pseudo\_instruction}:
\begin{lstlisting}
definition execute_1_pseudo_instruction: (Word $\rightarrow$ nat $\times$ nat) $\rightarrow$
  PseudoStatus $\rightarrow$ PseudoStatus := $\ldots$
\end{lstlisting}
Notice, here, that the emulation function for pseudoprograms takes an additional argument.
This is a function that maps program counters (for the pseudoprogram) to pairs of natural numbers representing the number of clock ticks that the pseudoinstruction needs to execute, post expansion.
We call this function a \emph{costing}, and note that the costing is induced by the particular strategy we use to expand pseudoinstructions.
If we change how we expand conditional jumps to labels, for instance, then the costing needs to change, hence \texttt{execute\_1\_pseudo\_instruction}'s parametricity in the costing.

The costing returns \emph{pairs} of natural numbers because, in the case of expanding conditional jumps to labels, the expansion of the `true branch' and `false branch' may differ in the number of clock ticks needed for execution.
This timing information is used inside \texttt{execute\_1\_pseudo\_instruction} to update the clock of the \texttt{PseudoStatus}.
During the proof of correctness of the assembler we relate the clocks of \texttt{Status}es and \texttt{PseudoStatus}es.

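How such a pair of tick counts is consumed can be sketched as follows (the names are ours, not the formalisation's): at each step, the clock advances by the count for the branch actually taken.

```python
# Illustrative sketch: a costing maps a pseudo program counter to a pair
# (true-branch ticks, false-branch ticks) for the expanded instruction;
# stepping the emulator adds the ticks of the branch actually taken.
def step_clock(costing, pc: int, condition_holds: bool, clock: int) -> int:
    ticks_true, ticks_false = costing(pc)
    return clock + (ticks_true if condition_holds else ticks_false)
```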
The assembler, mapping programs consisting of lists of pseudoinstructions to lists of bytes, operates in a mostly straightforward manner.
To a first approximation, the assembler on an assembly program, consisting of $n$ pseudoinstructions $\mathtt{P_i}$ for $1 \leq i \leq n$, works as in the following diagram:
\begin{displaymath}
[\mathtt{P_1}, \ldots, \mathtt{P_n}] \xrightarrow{\mathtt{flatten}\left(\mathtt{P_i} \xrightarrow{\mbox{\fontsize{7}{9}\selectfont$\mathtt{expand}$}} \mathtt{[I^i_1, \ldots, I^i_q]} \xrightarrow{\mbox{\fontsize{7}{9}\selectfont$\mathtt{assembly1}^*$}} \mathtt{[0110]}\right)^{*}} \mathtt{[010101]}
\end{displaymath}
Here $\mathtt{I^i_j}$ for $1 \leq j \leq q$ are the $q$ machine code instructions obtained by expanding, with \texttt{expand\_pseudo\_instruction}, a single pseudoinstruction.
Each machine code instruction $\mathtt{I^i_j}$ is then assembled, using the \texttt{assembly1} function, into a list of bytes.
This process is iterated for each pseudoinstruction, before the lists are flattened into a single bit list representation of the original assembly program.

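The diagram can be read as a two-stage map-and-flatten. In this sketch of ours, \texttt{expand} stands in for \texttt{expand\_pseudo\_instruction} and \texttt{assembly1} for the single-instruction encoder; neither is the actual Matita code:

```python
# The assembler as a two-stage map-and-flatten: expand each
# pseudoinstruction to machine instructions, encode each machine
# instruction to bytes, and flatten the results into one byte list.
def assemble(program, expand, assembly1):
    return [byte
            for pseudo in program
            for instr in expand(pseudo)
            for byte in assembly1(instr)]
```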
By inspecting the above diagram, it would appear that the best way to proceed with a proof that the assembly process does not change the semantics of an assembly program is via a decomposition of the problem into two subproblems.
Namely, we first expand any and all pseudoinstructions into lists of machine instructions, and provide a proof that this process does not change our program's semantics.
Then, we assemble all machine code instructions into machine code---lists of bytes---and prove once more that this process does not have an adverse effect on a program's semantics.
By composition, we then have that the whole assembly process is semantics preserving.

This is a tempting approach to the proof, but ultimately the wrong approach.
In particular, it is important that we track how the program counter indexing into the assembly program and the machine's program counter evolve, so that we can relate them.
Expanding pseudoinstructions requires that the machine's program counter be incremented by $n$ steps, for some $n \geq 1$, for every increment of the assembly program's program counter.
Keeping track of the correspondence between the two program counters quickly becomes infeasible using a compositional approach, and hence the proof must be monolithic.

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %
\subsection{Policies}
\label{subsect.policies}

Policies exist to dictate how conditional and unconditional jumps at the assembly level should be expanded into machine code instructions.
Using policies, we are able to completely decouple the decision over how jumps are expanded from the act of expansion, simplifying our proofs.
As mentioned, the MCS-51 instruction set includes three different jump instructions: \texttt{SJMP}, \texttt{AJMP} and \texttt{LJMP}; call these `short', `medium' and `long' jumps, respectively:
\begin{lstlisting}
inductive jump_length: Type[0] :=
  | short_jump: jump_length
  | medium_jump: jump_length
  | long_jump: jump_length.
\end{lstlisting}
A \texttt{jump\_expansion\_policy} is a map from \texttt{Word}s to \texttt{jump\_length}s, implemented as a trie.
Intuitively, a policy maps positions in a program (indexed using program counters implemented as \texttt{Word}s) to a particular variety of jump.
\begin{lstlisting}
definition jump_expansion_policy := BitVectorTrie jump_length 16.
\end{lstlisting}
Next, we require a series of `sigma' functions.
These functions map assembly program counters to their machine code counterparts, establishing the correspondence between `positions' in an assembly program and `positions' in a machine code program.
At the heart of this process is \texttt{sigma0}, which traverses an assembly program building a map from program counters to program counters.
This function fails if and only if an internal call to \texttt{assembly\_1\_pseudoinstruction} fails:
\begin{lstlisting}
definition sigma0: pseudo_assembly_program
  $\rightarrow$ option (nat $\times$ (nat $\times$ (BitVectorTrie Word 16))) := $\ldots$
\end{lstlisting}
We eventually lift this to functions from program counters to program counters:
\begin{lstlisting}
definition sigma_safe:
  pseudo_assembly_program $\rightarrow$ option (Word $\rightarrow$ Word) := $\ldots$
\end{lstlisting}
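The essence of \texttt{sigma0} can be illustrated as a left-to-right scan that accumulates the machine-level program counter. In this simplification of ours, a pseudoinstruction is represented only by its assembled size in bytes:

```python
# Sketch of sigma0's essence: scan the pseudoprogram, accumulating the
# machine-level program counter, and record for each assembly-level
# position the machine address at which its expansion starts.
def sigma_of_sizes(sizes: list[int]) -> list[int]:
    addresses, pc = [], 0
    for size in sizes:
        addresses.append(pc)   # machine address of this pseudoinstruction
        pc += size
    return addresses
```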
Now it is possible to define what a `good policy' is, i.e.\ one that does not cause \texttt{sigma\_safe} to fail.
As mentioned, \texttt{sigma\_safe} can only fail if an assembly program fails to be assembled:
\begin{lstlisting}
definition policy_ok := $\lambda$p. sigma_safe p $\neq$ None $\ldots$.
\end{lstlisting}
Finally, we obtain \texttt{sigma}, a map from program counters to program counters, which is guaranteed not to fail, as we internally provide a proof that the policy is correct:
\begin{lstlisting}
definition sigma: pseudo_assembly_program $\rightarrow$ Word $\rightarrow$ Word := $\ldots$
\end{lstlisting}

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %
\subsection{Total correctness of the assembler}
\label{subsect.total.correctness.of.the.assembler}

Using our policies, we now work toward proving the total correctness of the assembler.
By total correctness, we mean that the assembly process does not change the semantics of an assembly program.
Naturally, this necessitates keeping some sort of correspondence between program counters at the assembly level, and program counters at the machine code level.
For this, we use the \texttt{sigma} machinery defined at the end of Subsection~\ref{subsect.policies}.

CSC: no options using policy
\begin{lstlisting}
lemma fetch_assembly_pseudo2: $\forall$program, assembled, costs, labels.
  Some $\ldots$ $\langle$labels,costs$\rangle$ = build_maps program $\rightarrow$
  Some $\ldots$ $\langle$assembled,costs$\rangle$ = assembly program $\rightarrow$ $\forall$ppc.
  let code_memory := load_code_memory assembled in
  let preamble := $\pi_1$ program in
  let data_labels := construct_datalabels preamble in
  let lk_labels :=
    $\lambda$x. sigma program (address_of_word_labels_code_mem ($\pi_2$ program) x) in
  let lk_dlabels := $\lambda$x. lookup ? ? x data_labels (zero ?) in
  let expansion := jump_expansion ppc program in
  let $\langle$pi,newppc$\rangle$ := fetch_pseudo_instruction ($\pi_2$ program) ppc in
  let ppc' := sigma program ppc in
  let newppc' := sigma program newppc in
  let instructions' :=
    expand_pseudo_instruction lk_labels lk_dlabels ppc' expansion pi in
  let fetched := fetch_many code_memory newppc' ppc' instructions in
  $\exists$instructions. Some ? instructions = instructions' $\wedge$ fetched.
\end{lstlisting}


We expand pseudoinstructions using the function \texttt{expand\_pseudo\_instruction}.
This function accepts a `policy decision'---an element of type \texttt{jump\_length}---that is used when expanding a \texttt{Call}, \texttt{Jmp} or conditional jump to a label into the correct machine instruction.
This \texttt{policy\_decision} is assumed to originate from a policy as defined in Subsection~\ref{subsect.policies}.
\begin{lstlisting}
definition expand_pseudo_instruction:
  $\forall$lookup_labels, lookup_datalabels, pc, policy_decision.
  pseudo_instruction $\rightarrow$ option (list instruction) := $\ldots$
\end{lstlisting}
Under the assumption that a correct policy exists, \texttt{expand\_pseudo\_instruction} should never fail, and therefore the option type may be dispensed with.
This is because the only failure conditions for \texttt{expand\_pseudo\_instruction} result from trying to expand a pseudoinstruction into an `impossible' combination of machine code instructions.
For instance, if the policy decision dictates that we should expand a \texttt{Call} pseudoinstruction into a `short jump', then we fail, as the MCS-51's instruction set only features the instructions \texttt{ACALL} and \texttt{LCALL}.
\begin{lstlisting}
axiom assembly_ok: $\forall$program,assembled,costs,labels.
  Some $\ldots$ $\langle$labels, costs$\rangle$ = build_maps program $\rightarrow$
  Some $\ldots$ $\langle$assembled, costs$\rangle$ = assembly program $\rightarrow$
  let code_memory := load_code_memory assembled in
  let preamble := $\pi_1$ program in
  let datalabels := construct_datalabels preamble in
  let lk_labels :=
    $\lambda$x. sigma program (address_of_word_labels_code_mem ($\pi_2$ program) x) in
  let lk_dlabels := $\lambda$x. lookup ? ? x datalabels (zero ?) in
  $\forall$ppc,len,assembledi.
  let $\langle$pi, newppc$\rangle$ := fetch_pseudo_instruction ($\pi_2$ program) ppc in
  let assembly' := assembly_1_pseudoinstruction program ppc
    (sigma program ppc) lk_labels lk_dlabels pi in
  let newpc := (sigma program ppc) + len in
  let echeck :=
    encoding_check code_memory (sigma program ppc) newpc assembledi in
  Some $\ldots$ $\langle$len, assembledi$\rangle$ = assembly' $\rightarrow$
  echeck $\wedge$ sigma program newppc = newpc.
\end{lstlisting}
349 | |
---|
350 | Lemma \texttt{fetch\_assembly\_pseudo} establishes the correctness of expanding and then assembling pseudoinstructions: |
---|
351 | \begin{lstlisting} |
---|
352 | lemma fetch_assembly_pseudo: $\forall$program, ppc, lk_labels, lk_dlabels. |
---|
353 | $\forall$pi, code_memory, len, assembled, instructions, pc. |
---|
354 | let jexp := jump_expansion ppc program in |
---|
355 | let exp := |
---|
356 | expand_pseudo_instruction lk_labels lk_dlabels pc jexp pi |
---|
357 | let ass := |
---|
358 | assembly_1_pseudoinstruction program ppc pc lk_labels lk_dlabels pi in |
---|
359 | Some ? instructions = exp $\rightarrow$ |
---|
360 | Some $\ldots$ $\langle$len, assembled$\rangle$ = ass $\rightarrow$ |
---|
361 | encoding_check code_memory pc (pc + len) assembled $\rightarrow$ |
---|
362 | fetch_many code_memory (pc + len) pc instructions. |
---|
363 | \end{lstlisting} |
---|
Here, \texttt{len} is the number of machine code instructions that the pseudoinstruction at hand has been expanded into, and \texttt{encoding\_check} is a recursive function that checks for any possible corruption of the code memory resulting from expanding the pseudoinstruction.
We assemble a single pseudoinstruction with \texttt{assembly\_1\_pseudoinstruction}, which internally calls \texttt{jump\_expansion} and \texttt{expand\_pseudo\_instruction}.
The function \texttt{fetch\_many} fetches multiple machine code instructions from code memory and performs some routine checks.

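To give a concrete feel for \texttt{encoding\_check}, the following Python sketch models code memory as a dictionary from addresses to bytes and recursively compares it against the assembled output. The names and the representation are ours, not the Matita ones.

```python
# Sketch of the idea behind encoding_check (our names, not Matita's):
# walk the assembled bytes, checking each one against code memory.
def encoding_check(cmem, pc, assembled):
    """True iff the bytes `assembled` sit uncorrupted at address `pc`."""
    if not assembled:
        return True
    return (cmem.get(pc) == assembled[0]
            and encoding_check(cmem, pc + 1, assembled[1:]))
```

For example, with code memory $\{0 \mapsto \mathtt{0x74}, 1 \mapsto \mathtt{0x2A}\}$, checking the list \texttt{[0x74, 0x2A]} at address 0 succeeds, while any mismatched byte fails the check.
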
Intuitively, lemma \texttt{fetch\_assembly\_pseudo} can be read as follows.
Suppose our policy \texttt{jump\_expansion} dictates that the pseudoinstruction indexed by the pseudo program counter \texttt{ppc} in assembly program \texttt{program} gives us the policy decision \texttt{jexp}.
Further, suppose we expand the pseudoinstruction at \texttt{ppc} with the policy decision \texttt{jexp}, obtaining an (optional) list of machine code instructions \texttt{exp}.
Suppose we also assemble the pseudoinstruction at \texttt{ppc} with \texttt{assembly\_1\_pseudoinstruction}, obtaining \texttt{len} and the list of bytes \texttt{assembled}, and that these bytes are found uncorrupted in code memory starting at address \texttt{pc}.
Then fetching \texttt{len} machine code instructions from address \texttt{pc} retrieves precisely the instructions of \texttt{exp}.

Theorem \texttt{fetch\_assembly} establishes the correctness of assembling and then fetching a single machine code instruction:
\begin{lstlisting}
theorem fetch_assembly: $\forall$pc, i, cmem, assembled.
assembled = assembly1 i $\rightarrow$
let len := length $\ldots$ assembled in
encoding_check cmem pc (pc + len) assembled $\rightarrow$
let fetched := fetch cmem (bitvector_of_nat $\ldots$ pc) in
let $\langle$instr_pc, ticks$\rangle$ := fetched in
let $\langle$instr, pc'$\rangle$ := instr_pc in
 (eq_instruction instr i $\wedge$
  eqb ticks (ticks_of_instruction instr) $\wedge$
  eq_bv $\ldots$ pc' (pc + len)) = true.
\end{lstlisting}
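The shape of this assemble-then-fetch round trip can be illustrated with a toy two-instruction machine, sketched below in Python. The instruction representation and the \texttt{fetch} logic are invented for illustration (though \texttt{0x74} is indeed the MCS-51 opcode for \texttt{MOV A, \#imm}) and are far simpler than the real formalisation, which also tracks clock ticks.

```python
# Toy round-trip in the spirit of fetch_assembly: a two-instruction set
# with a one- and a two-byte encoding (illustrative only).
def assembly1(instr):
    op, *args = instr
    if op == "NOP":
        return [0x00]
    if op == "MOV_A_IMM":            # MOV A, #imm
        return [0x74, args[0] & 0xFF]
    raise ValueError(op)

def fetch(cmem, pc):
    """Decode one instruction at pc, returning it with the next pc."""
    op = cmem[pc]
    if op == 0x00:
        return ("NOP",), pc + 1
    if op == 0x74:
        return ("MOV_A_IMM", cmem[pc + 1]), pc + 2
    raise ValueError(op)
```

In this toy setting the analogue of \texttt{fetch\_assembly} is that placing \texttt{assembly1 i} in code memory at \texttt{pc} and fetching from \texttt{pc} returns \texttt{i} together with \texttt{pc} advanced by the encoding length.
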

An \texttt{internal\_pseudo\_address\_map} associates positions in the memory of a \texttt{PseudoStatus} with physical memory addresses.
\begin{lstlisting}
definition internal_pseudo_address_map := list (BitVector 8).
\end{lstlisting}
We use \texttt{internal\_pseudo\_address\_map}s to convert the lower internal RAM of a \texttt{PseudoStatus} into the lower internal RAM of a \texttt{Status}.
Note that the MCS-51's lower internal RAM is addressed with a 7-bit `byte'.
The internal RAM space as a whole is addressed with full bytes: the first bit distinguishes whether the programmer is addressing low or high internal memory.
\begin{lstlisting}
axiom low_internal_ram_of_pseudo_low_internal_ram:
 internal_pseudo_address_map $\rightarrow$ BitVectorTrie Byte 7 $\rightarrow$ BitVectorTrie Byte 7.
\end{lstlisting}
A similar axiom exists for high internal RAM.

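The address-space split described above can be made concrete with a small helper (ours, purely illustrative): the most significant bit of a byte-sized internal RAM address selects low or high memory, and the remaining seven bits address within the selected half, matching the 7-bit tries above.

```python
# Illustrative helper (not part of the formalisation): split a byte-sized
# internal RAM address into (high-memory flag, 7-bit offset).
def split_internal_address(b):
    return bool(b & 0x80), b & 0x7F
```
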
Next, we are able to translate \texttt{PseudoStatus} records into \texttt{Status} records using \texttt{status\_of\_pseudo\_status}.
Translating a \texttt{PseudoStatus}'s code memory requires that we expand pseudoinstructions and then assemble them to obtain a trie of bytes.
As mentioned, this can fail in a limited number of situations related to improper use of labels in an assembly program.
However, it is possible to `tighten' the type of \texttt{status\_of\_pseudo\_status}, removing the option type, by using the fact that if any `good policy' exists, assembly will never fail.
\begin{lstlisting}
definition status_of_pseudo_status:
 internal_pseudo_address_map $\rightarrow$ PseudoStatus $\rightarrow$ option Status
\end{lstlisting}
After fetching an assembly instruction we must update any \texttt{internal\_pseudo\hyp{}\_address\_map}s that may be lying around.
This is done with the following function:
\begin{lstlisting}
definition next_internal_pseudo_address_map: internal_pseudo_address_map
 $\rightarrow$ PseudoStatus $\rightarrow$ option internal_pseudo_address_map
\end{lstlisting}
Finally, we are able to state and prove our main theorem.
This relates the execution of a single assembly instruction to the execution of (possibly) many machine code instructions.
That is, the assembly process preserves the semantics of an assembly program as it is translated into machine code:
\begin{lstlisting}
theorem main_thm:
 ∀M,M',ps,s,s''.
  next_internal_pseudo_address_map M ps = Some $\ldots$ M' $\rightarrow$
  status_of_pseudo_status M ps = Some $\ldots$ s $\rightarrow$
  status_of_pseudo_status M'
   (execute_1_pseudo_instruction
    (ticks_of (code_memory $\ldots$ ps)) ps) = Some $\ldots$ s'' $\rightarrow$
  $\exists$n. execute n s = s''.
\end{lstlisting}
The statement can be given an intuitive reading as follows.
Suppose our \texttt{PseudoStatus} $ps$ can be successfully converted into a \texttt{Status} $s$.
Suppose further that, after executing a single assembly instruction and converting the resulting \texttt{PseudoStatus} into a \texttt{Status}, we obtain $s''$, being careful to track the number of ticks executed.
Then there exists some number $n$ such that executing $n$ machine code instructions in \texttt{Status} $s$ gives us \texttt{Status} $s''$.
Theorem \texttt{main\_thm} establishes the correctness of the assembly process.
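The conclusion $\exists n.\ \mathtt{execute}\ n\ s = s''$ asserts that machine execution, iterated some number of times, reaches the converted status. The iteration itself is simply $n$-fold application of a one-step semantics, as the following Python sketch shows; the state here is a toy value and \texttt{step} is a stand-in for the real single-instruction semantics over \texttt{Status} records.

```python
# n-fold iteration of a one-step semantics, mirroring `execute n s` in
# main_thm; `step` stands in for executing one machine code instruction.
def execute(n, step, s):
    for _ in range(n):
        s = step(s)
    return s
```
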

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %
\section{Conclusions}
\label{sect.conclusions}

We have proved the total correctness of an assembler for the MCS-51 assembly language.
In particular, our assembly language featured labels, arbitrary conditional and unconditional jumps to labels, global data, and instructions for moving this data into the MCS-51's single 16-bit register.
Expanding these pseudoinstructions into machine code instructions is not trivial, and the proof that the assembly process is `correct', in that the semantics of an assembly program are not changed, is complex.

Aside from their application in verified compiler projects such as CerCo, verified assemblers could also be applied to the verification of operating system kernels.
Of particular note is the verified seL4 kernel~\cite{klein:sel4:2009,klein:sel4:2010}.
This verification explicitly assumes the existence of, amongst other things, a trustworthy assembler.

\paragraph{Use of dependent types and Russell}
Our formalisation makes sparing use of dependent types.
In certain data structures, such as tries and vectors, they are used to guarantee invariants.
However, we have so far shied away from making extensive use of dependent types and inductive predicates in the proof of correctness of the assembler itself.
This is because complex dependent types and inductive predicates tend not to cooperate particularly well with tactics such as inversion.

However, there are certain cases where the use of dependent types is unavoidable.
One such case is the proof of correctness of \texttt{build\_maps}, a function that collates the cost and data labels of an assembly program into map data structures.
In cases such as these we make use of Matita's implementation of Russell~\cite{sozeau:subset:2006}.
In Matita, Russell may be implemented with two coercions and some notational sugaring.
\subsection{Related work}
\label{subsect.related.work}

% piton
We are not the first to consider the total correctness of an assembler for a non-trivial assembly language.
Perhaps the most impressive piece of work in this domain is the Piton stack~\cite{moore:piton:1996,moore:grand:2005}.
This was a stack of verified components, written and verified in ACL2, ranging from a proprietary FM9001 microprocessor, verified at the gate level, to assemblers and compilers for two high-level languages---a dialect of Lisp and $\mu$Gypsy~\cite{moore:grand:2005}.
%dpm more: weirich paper?

% jinja
Klein and Nipkow consider a Java-like programming language, Jinja~\cite{klein:machine:2006,klein:machine:2010}.
They provide a compiler, a virtual machine and operational semantics for both the programming language and the virtual machine, and prove that their compiler is semantics- and type-preserving.

Finally, mention should be made of CompCert~\cite{compcert:2011,blazy:formal:2006,leroy:formal:2009,leroy:formally:2009}, another verified compiler project related to CerCo.
As previously mentioned, CompCert considers only the extensional correctness of the compiler, and not the intensional correctness on which CerCo focusses.
However, CerCo also extends CompCert in other ways.
Namely, the CompCert verified compilation chain terminates at the PowerPC or ARM assembly level, taking for granted the existence of a trustworthy assembler.
CerCo chooses to go further, considering a verified compilation chain all the way down to the machine code level.
In essence, the work presented in this publication is one part of CerCo's extension over CompCert.

\bibliography{cpp-2011.bib}

\end{document}
---|