\documentclass[a4paper]{article}

\usepackage{a4wide}
\usepackage{algpseudocode}
\usepackage{alltt}
\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage[british]{babel}
\usepackage{caption}
\usepackage[colorlinks]{hyperref}
\usepackage[utf8]{inputenc}
\usepackage{listings}
\usepackage{microtype}
\usepackage{subcaption}

%\usepackage{kpfonts}
\usepackage[T1]{fontenc}
%\usepackage[ttdefault=true]{AnonymousPro}

\newcommand{\All}[1]{\forall{#1}.\;}
\newcommand{\deffont}[1]{\textbf{#1}}
\newcommand{\lam}[1]{\lambda{#1}.\;}
\newcommand{\todo}[1]{\ensuremath{\texttt{[#1]}}}

\makeatletter
\def\blfootnote{\xdef\@thefnmark{}\@footnotetext}
\makeatother

\bibliographystyle{alpha}

%\renewcommand{\verb}{\lstinline}
\def\lstlanguagefiles{lst-grafite.tex}
\lstset{language=Grafite}

\author{Jaap Boender \and Dominic P. Mulligan \and Claudio Sacerdoti Coen}
\title{On the proof of correctness of a verified optimising assembler}

\begin{document}

\maketitle

\begin{abstract}
A typical microprocessor's instruction set provides several forms of conditional and unconditional jumps, procedure calls, and data movement instructions.
Unconditional jump and call instructions vary depending on the \emph{span} of the branch---the distance between the location of the instruction and the branch target---with short jump and call instructions taking less space in code memory to encode than their long jump and call counterparts.

Assembly languages are a thin veneer over machine code, with a few light abstractions provided for the benefit of the programmer.
Two such abstractions provided in assembly languages are uniform `jump' and `call' pseudoinstructions that branch to a labelled point in program source rather than a concrete machine address.
These pseudoinstructions remove the burden of selecting between long and short jump instructions, and computing concrete machine addresses, from the programmer.

Pseudoinstructions must be `assembled away' by the assembler, and a jump pseudoinstruction can always be replaced by a long jump machine code instruction with no semantic change.
Yet, such a na\"ive replacement is inefficient and wasteful in terms of code memory usage---a prime concern when targeting embedded devices---as some long jumps could have been replaced with short jumps, reducing the size of the program in code memory.
Therefore any assembler should replace as many long jumps as possible with short jumps, optimising for program size.

We present a formalisation and proof of correctness of an optimising assembler targeting a space-constrained embedded microcontroller: the 8-bit Intel MCS-51 series.
All proofs are checked by the Matita theorem prover, and the work forms a major component of the Certified Complexity (`CerCo') project, which aims to produce a verified concrete-complexity-preserving compiler for a subset of the C programming language.

Our optimisations complicate the proof of correctness of the assembler proper, and we therefore employ `policies', separating the decision of how a pseudoinstruction should be expanded from the expansion process itself, simplifying our proofs.
Our proof strategy also tracks `good' addresses, and only programs that use good addresses have their semantics preserved under assembly.
This tracking offers increased flexibility over the traditional approach---where addresses are kept abstract and immutable---as we may experiment with some manipulations of machine addresses.
\end{abstract}

\newpage

\tableofcontents

\blfootnote{Research supported by the CerCo project, within the Future and Emerging Technologies (FET) programme of the Seventh Framework Programme for Research of the European Commission, under FET-Open grant number 243881.}

\newpage

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\section{Introduction}
\label{sect.introduction}

We present an overview of work formalising and proving correct an optimising assembler for a typical embedded microcontroller.
Though the work described herein is specialised to the 8-bit Intel MCS-51 series, we believe the design decisions and techniques developed for our proof of correctness are of general interest.

% Everybody

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\subsection{Matita}
\label{subsect.matita}

Matita is a theorem prover based on a variant of the Calculus of (Co)inductive Constructions~\cite{asperti:user:2007}.
The system features a full spectrum of dependent types and (co)inductive families, a system of coercions, a tactic-driven proof construction engine~\cite{sacerdoti-coen:tinycals:2007}, and paramodulation-based automation~\cite{asperti:higher-order:2007}, all of which we exploit in the formalisation described herein.

Matita's syntax is similar to that of mainstream functional programming languages such as OCaml or Standard ML.
The type theory that Matita implements is broadly akin to that of Coq~\cite{coq:2004} and Agda~\cite{bove:brief:2009}.
Nevertheless, we provide a brief explanation of the main syntactic and type-theoretic features of Matita that will be needed to follow the body of the paper:
\begin{itemize}
\item
Non-recursive functions and definitions are introduced via the \texttt{definition} keyword.
Recursive functions are introduced with \texttt{let rec}.
Mutually recursive functions are separated via the \texttt{and} keyword.
Matita's termination checker ensures that all recursive functions terminate before they are admitted, maintaining the soundness of the system's logic.
\item
Matita has an infinite hierarchy of type universes.
A single impredicative universe of types, \texttt{Prop}, exists at the base of this hierarchy.
An infinite series of predicative universes, \texttt{Type[0]} : \texttt{Type[1]} : \texttt{Type[2]}, and so on, sits atop \texttt{Prop}.
Matita, unlike Coq or Agda, implements no form of typical ambiguity or universe polymorphism, preferring explicit, concrete universe levels instead.
\item
Matita's type theory plays host to a rich and expressive higher-order logic.
Constants \texttt{True} and \texttt{False} represent truth and falsity in \texttt{Prop} respectively.
Two inductive families in \texttt{Prop} encode conjunction and disjunction---$\mathtt{P \wedge Q}$ and $\mathtt{P \vee Q}$ respectively.

As is usual, implication and universal quantification are identified with the dependent function space ($\Pi$ types), whereas (constructive) existential quantification is encoded as a dependent sum (a $\Sigma$-type).
We write $\All{x : \phi}\psi$ for the dependent function space, and abbreviate this as $\phi \rightarrow \psi$ when $x \not\in fv(\psi)$ as usual.
We use $\langle M,N \rangle$ for the pairing of $M$ and $N$.
\item
Inductive and coinductive families are introduced via the \texttt{inductive} and \texttt{coinductive} keywords respectively, with named constructor declarations separated by a bar.
Mutually inductive data family declarations are separated via \texttt{with}.
In the following declaration:
\begin{lstlisting}
inductive I ($P_1$ : $\tau_1$) $\ldots$ ($P_n$ : $\tau_n$) : $\phi_1 \rightarrow \ldots \rightarrow \phi_m \rightarrow \phi$ := $\ldots$
\end{lstlisting}
We call $P_i$ for $1 \leq i \leq n$ the \deffont{parameters} of \texttt{I} and $\phi_j$ for $1 \leq j \leq m$ the \deffont{indices} of \texttt{I}.
Matita's positivity checker ensures that constructors have strictly-positive types before an inductive family is admitted, maintaining the soundness of the system's logic.
\item
Records are introduced with the \texttt{record} keyword.
A Matita record
\begin{lstlisting}
record R : Type[0] := { F1 : nat }.
\end{lstlisting}
may be thought of as syntactic sugar for a single-constructor inductive data type of the same name:
\begin{lstlisting}
inductive R : Type[0] :=
  | mk_R : nat -> R.
\end{lstlisting}
A record field's type may depend on fields declared earlier in the record.

Records may be decomposed with projections.
Projections, one for each field of a record, are registered in the global context.
In the example record above, \texttt{F1} of type $R \rightarrow nat$ is registered as a field projection and \texttt{mk\_R} of type $nat \rightarrow R$ is registered as a constructor.

Record fields may also be marked as coercions.
In the following example
\begin{lstlisting}
record S : Type[1] :=
{
  Carrier :> Type[0];
  op : Carrier -> Carrier -> Carrier
}
\end{lstlisting}
the field \texttt{Carrier} is declared to be a coercion with `\texttt{:>}', with the operational effect that the field projection \texttt{Carrier} may be omitted wherever it can successfully be inferred by Matita.
Field coercions facilitate the informal but common mathematical practice of intentionally confusing a structure with its underlying carrier set.
\item
Terms may be freely omitted, allowing the user to write down partial types and terms.
A question mark, \texttt{?}, denotes a single term that has been omitted by the user.
Some omitted terms can be deduced by Matita's refinement system.
Other, more complex goals arising from omitted terms may require user input to solve, in which case a proof obligation is opened for each term that cannot be deduced automatically.
Three consecutive dots, \texttt{$\ldots$}, denote multiple terms or types that have been omitted.
\item
Data may be decomposed by pattern matching with a \texttt{match} expression.
We may fully annotate a \texttt{match} expression with its return type.
This is especially useful when working with indexed families of types or with invariants, expressed as types, on functions.
In the following
\begin{lstlisting}
match t return $\lam{x}x = 0 \rightarrow bool$ with
[ 0    $\Rightarrow$ $\lam{prf_1}P_1$
| S m $\Rightarrow$ $\lam{prf_2}P_2$
] (refl $\ldots$ t)
\end{lstlisting}
the \texttt{0} branch of the \texttt{match} expression returns a function from $0 = 0$ to \texttt{bool}, whereas the \texttt{S m} branch of the \texttt{match} expression returns a function from \texttt{S m = 0} to \texttt{bool}.
In both cases the annotated return type $\lam{x}x = 0 \rightarrow bool$ has been specialised given new information about \texttt{t} revealed by the act of pattern matching.
The entire term, with \texttt{match} expression applied to \texttt{refl $\ldots$ t}, has type \texttt{bool}.
\item
Matita features a liberal system of coercions (distinct from the previously mentioned record field coercions).
It is possible to define a uniform coercion $\lam{x}\langle x, ?\rangle$ from every type $T$ to the dependent sum $\Sigma{x : T}. P x$.
The coercion opens a proof obligation that asks the user to prove that $P$ holds for $x$.
When a coercion is to be applied to a complex term (for example, a $\lambda$-abstraction, a local definition, or a case analysis), the system automatically propagates the coercion to the sub-terms.
For instance, to apply a coercion to force $\lam{x}M : A \rightarrow B$ to
have type $\All{x : A}\Sigma{y : B}. P x y$, the system looks for a coercion from $M : B$ to $\Sigma{y : B}. P x y$ in a context augmented with $x : A$.
This is significant when the coercion opens a proof obligation, as the user will be presented with multiple, but simpler proof obligations in the correct context.
In this way, Matita supports the `Russell' proof methodology developed by Sozeau in~\cite{sozeau:subset:2007}, in a lightweight but tightly-integrated manner.
\end{itemize}
Throughout, for reasons of clarity, conciseness, and readability, we may choose to simplify or omit parts of Matita code.
We will always ensure that these omissions do not mislead the reader.

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\subsection{Map of paper}
\label{subsect.map.of.paper}

% Anybody

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\section{The proof}
\label{sect.the.proof}

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\subsection{Structure of the assembler}
\label{subsect.structure.of.the.assembler}

% Everybody

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\subsection{Encoding the instruction set}
\label{subsect.encoding.the.instruction.set}

% DPM and CSC

% Polymorphic variants
% Non-regularity of instruction set (CISC)
% Use of dependent types in get_arg_XXX and set_arg_XXX
% Assembly code features (what pseudo-instructions are present)

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\subsection{Operational semantics}
\label{subsect.operational.semantics}

% DPM and CSC
One of the central data types in our formalisation is the \texttt{BitVector}.
\texttt{BitVector}s are fixed-length lists of bits (implemented as booleans) used to encode words and bytes:
\begin{lstlisting}
definition BitVector := $\lambda$n: nat. Vector bool n.
\end{lstlisting}
Arithmetic in our formalisation is carried out on \texttt{BitVector} values.

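To give a concrete picture of what fixed-width arithmetic over \texttt{BitVector}s involves, the following sketch models ripple-carry addition over little-endian lists of booleans.
It is written in OCaml purely as an illustration; it is not the Matita code of the formalisation, and the names \texttt{full\_add} and \texttt{add\_bv} are ours:
\begin{lstlisting}
(* Illustrative model only: bit vectors as little-endian lists of booleans. *)
let full_add (a : bool) (b : bool) (c : bool) : bool * bool =
  (* returns (sum bit, carry out) *)
  ((a <> b) <> c, (a && b) || (c && (a <> b)))

let add_bv (xs : bool list) (ys : bool list) : bool list =
  (* inputs of equal length; the final carry is discarded, so the
     result is addition modulo 2^n, as on the MCS-51 *)
  let rec go xs ys carry =
    match xs, ys with
    | [], [] -> []
    | x :: xs', y :: ys' ->
        let s, c = full_add x y carry in
        s :: go xs' ys' c
    | _, _ -> invalid_arg "add_bv: length mismatch"
  in
  go xs ys false
\end{lstlisting}
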
An MCS-51 byte is eight bits wide and an MCS-51 word is a double byte, or sixteen bits wide.
Some MCS-51 instructions also operate on sub-bytes and sub-words that are four, seven, and eleven bits wide.
We capture these notions as definitions:
\begin{lstlisting}
definition Nibble := BitVector 4.
definition Byte7 := BitVector 7.
definition Byte := BitVector 8.
definition Word := BitVector 16.
definition Word11 := BitVector 11.
\end{lstlisting}
A central design decision in our formalisation is the choice of how to encode memory.
The MCS-51 has various memory spaces and registers that can be addressed with bytes and words.
Using a complete associative map to capture an entire memory space, including large swathes of uninitialised data, would be extremely inefficient and would make any proof difficult due to the strain placed on Matita.
However, using a na\"ive data structure such as an association list to encode memory would lead to inefficiencies when stepping through the semantics of a machine code program inside Matita.
We therefore use an indexed type, \texttt{BitVectorTrie}, to efficiently represent a memory space that may have `holes' of uninitialised or unspecified data:
\begin{lstlisting}
inductive BitVectorTrie (A: Type[0]): $\mathbb{N}$ $\rightarrow$ Type[0] :=
  Leaf: A $\rightarrow$ BitVectorTrie A O
| Node: $\forall$n. BitVectorTrie A n $\rightarrow$ BitVectorTrie A n $\rightarrow$ BitVectorTrie A (S n)
| Stub: $\forall$n. BitVectorTrie A n.
\end{lstlisting}
The $\mathbb{N}$ index of the \texttt{BitVectorTrie} family records the maximum length of any path to stored data through the tree.
For a dependently-typed encoding of a trie data structure, the \texttt{Leaf} and \texttt{Node} constructors are standard.
The novelty of \texttt{BitVectorTrie} lies in the \texttt{Stub} constructor, which intuitively represents a `hole' in the tree and can appear at any depth, standing for an arbitrarily large region of uninitialised or unspecified data.
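
The following sketch shows how lookup through such a trie can treat \texttt{Stub} as a `hole': as soon as a \texttt{Stub} is reached, a caller-supplied default value is returned, so uninitialised memory never needs to be materialised.
This is an illustration in OCaml without the dependent index, not the Matita code of the formalisation, and the name \texttt{lookup} is ours:
\begin{lstlisting}
(* Illustrative, non-dependent model of BitVectorTrie and its lookup. *)
type 'a trie =
  | Leaf of 'a
  | Node of 'a trie * 'a trie
  | Stub                                      (* a `hole' in the tree *)

let rec lookup (path : bool list) (t : 'a trie) (default : 'a) : 'a =
  match path, t with
  | [], Leaf x             -> x
  | b :: rest, Node (l, r) -> lookup rest (if b then r else l) default
  | _, Stub                -> default         (* uninitialised: return default *)
  | _, _                   -> default         (* path length and depth disagree *)
\end{lstlisting}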

% Trie data structure and use of dependent types
% Operational semantics of MCS-51
% Operational semantics of assembly code

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\subsection{Policies and the correctness proof}
\label{subsect.policies.and.the.correctness.proof}

% DPM and CSC

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\subsection{Correctness of branch displacement}
\label{subsect.correctness.of.branch.displacement}

% JB and CSC
One of the consequences of using the MCS-51 instruction set is that we have to take its specific branch instructions into account. The MCS-51 has three different types of branch instruction---which one should be used depends on the distance between the instruction and its target (the {\em span}). This applies to unconditional branches, conditional branches and procedure calls.

The three different types of branch instructions are:
\begin{itemize}
  \item The {\em short} branch: a 2-byte instruction that can branch over a 1-byte displacement, i.e.\ 127 bytes forwards or 128 bytes backwards.
  \item The {\em absolute} branch: a 2-byte instruction that can change the last 11 bits of the program counter. Thus, it can branch to any destination within the same 2048-byte segment.
  \item The {\em long} branch: a 3-byte instruction that can branch to any destination in memory.
\end{itemize}

Not all combinations of branch form and branch kind are provided: for example, there is no short call, nor is there a long conditional branch.

Apart from the difference in size, the instructions also take different amounts of time to execute: 2 cycles for the short and absolute branches, and 3 cycles for the long branch.

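The sizes and execution times above can be summarised as follows (an illustrative OCaml fragment, with names of our own choosing):
\begin{lstlisting}
(* The three MCS-51 branch forms with their sizes (bytes) and costs (cycles). *)
type branch_form = Short | Absolute | Long

let size_in_bytes  = function Short -> 2 | Absolute -> 2 | Long -> 3
let cost_in_cycles = function Short -> 2 | Absolute -> 2 | Long -> 3
\end{lstlisting}
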
The {\em branch displacement problem}, also known as the branch optimisation problem, is a well-known problem in assembler design: find the smallest encoding of the branch instructions in a given assembly program. It is complicated by the fact that branch instructions can influence each other: a branch instruction's span depends on the sizes of any branch instructions that lie between it and its target.

The branch displacement problem has been found to be NP-complete for several
similar architectures~\cite{Robertson1979,Szymanski1978}, so finding an optimal
solution may be prohibitively time-consuming. The standard
solution~\cite{Szymanski1978,Dickson2008} is to use a fixed point algorithm.

Two flavours of solution are possible: either a greatest fixed point algorithm
(encoding all branches as long and then replacing them by shorter branches
where possible), or a least fixed point algorithm (encoding all branches as
short and then replacing them by longer branches as necessary). The advantage
of a greatest fixed point algorithm is that it can be stopped after any
iteration, as all the intermediate solutions are safe, if not necessarily
optimal. This is not the case for a least fixed point algorithm, but if that
algorithm terminates, it will result in a smaller (or at least equally sized)
solution.

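To fix intuitions, the following OCaml sketch implements the least fixed point flavour for a simplified setting with only short (2-byte) and long (3-byte) unconditional branches.
It is our own illustration, not the CerCo code, and it deliberately ignores the absolute branch discussed below:
\begin{lstlisting}
(* Simplified model of the least fixed point iteration (illustration only). *)
type instr =
  | Jump of int        (* index of the target instruction *)
  | Other of int       (* any non-branch instruction, with its size in bytes *)

type jump_size = Short | Long

let bytes = function Short -> 2 | Long -> 3

(* Start with every branch short and lengthen any branch whose displacement
   no longer fits, until the assignment of sizes stabilises. *)
let solve (prog : instr array) : jump_size array =
  let n = Array.length prog in
  let sizes = Array.make n Short in
  let size_of i = match prog.(i) with Jump _ -> bytes sizes.(i) | Other s -> s in
  let addr i =                       (* address of instruction i *)
    let a = ref 0 in
    for j = 0 to i - 1 do a := !a + size_of j done;
    !a
  in
  let changed = ref true in
  while !changed do
    changed := false;
    Array.iteri
      (fun i ins ->
         match ins, sizes.(i) with
         | Jump target, Short ->
             (* displacement is relative to the address of the next instruction *)
             let disp = addr target - (addr i + 2) in
             if disp < -128 || disp > 127 then begin
               sizes.(i) <- Long;    (* lengthen; a size is never shrunk *)
               changed := true
             end
         | _, _ -> ())
      prog
  done;
  sizes
\end{lstlisting}
The sketch terminates because a branch, once lengthened, is never shortened again; the following paragraphs explain why the absolute branch invalidates exactly this argument.
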
In the particular case of the MCS-51 instruction set, matters are complicated
by the existence of the absolute branch. With this instruction, the encoding of
a branch does not depend solely on its span, but also on its location within
the program. This means that instead of being determined solely by the sizes of
the instructions within its span, the encoding of a branch is also determined
by all of the branches that precede it, since lengthening or shortening a
preceding instruction can move a branch and its destination into the same
segment (or out of it).

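Concretely, the destination of an absolute branch must lie in the same 2048-byte segment as the address of the instruction that follows the branch, so whether the absolute form is usable amounts to comparing the upper five bits of two 16-bit addresses.
A minimal sketch (again in OCaml, with addresses as plain integers rather than \texttt{BitVector}s and a name of our own choosing):
\begin{lstlisting}
(* Can an absolute (11-bit) branch located at [pc], of [size] bytes, reach
   [target]?  The address of the following instruction and the target must
   share their upper five bits, i.e. lie in the same 2048-byte segment. *)
let absolute_reachable ~(pc : int) ~(size : int) ~(target : int) : bool =
  ((pc + size) lsr 11) = (target lsr 11)
\end{lstlisting}
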
This also means that the standard termination argument for the least fixed
point algorithm no longer holds. With only short and long jumps, once a short
jump has been replaced by a long jump it will never be replaced by a short jump
again (since the span of a jump only ever increases), so the number of jumps
gives an upper bound on the number of iterations. With the absolute jump,
however, it is entirely possible for a long jump to be replaced by an absolute
jump, and this argument breaks down.

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\subsection{Fitting together}
\label{subsect.fitting.together}

% Everybody

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\section{Conclusion}
\label{sect.conclusion}

% Everybody

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\subsection{Related work}
\label{subsect.related.work}

% Everybody

%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
% Section                                 %
%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
\subsection{Formalisation}
\label{subsect.formalisation}

% Everybody

% Man-hours of effort
% Lines of code, etc.
% Pointer to web address for formalisation

\newpage

\bibliography{boender-jar-2015}

\end{document}