source: Papers/itp-2011/itp-2011.tex @ 3662

Last change on this file since 3662 was 1205, checked in by mulligan, 8 years ago

typographical changes

File size: 55.3 KB
Line 
1\documentclass[10pt, a4paper]{llncs}
2
3\usepackage{amsfonts}
4\usepackage{amsmath}
5\usepackage{amssymb} 
6\usepackage[english]{babel}
7\usepackage{color}
8\usepackage{fancybox}
9\usepackage{graphicx}
10\usepackage[colorlinks]{hyperref}
11\usepackage[utf8x]{inputenc}
12\usepackage{listings}
13\usepackage{mdwlist}
14\usepackage{microtype}
15\usepackage{stmaryrd}
16\usepackage{url}
17
18\newlength{\mylength} 
19\newenvironment{frametxt}%
20        {\setlength{\fboxsep}{5pt}
21                \setlength{\mylength}{\linewidth}%
22                \addtolength{\mylength}{-2\fboxsep}%
23                \addtolength{\mylength}{-2\fboxrule}%
24                \Sbox
25                \minipage{\mylength}%
26                        \setlength{\abovedisplayskip}{0pt}%
27                        \setlength{\belowdisplayskip}{0pt}%
28                }%
29                {\endminipage\endSbox
30                        \[\fbox{\TheSbox}\]}
31
32\lstdefinelanguage{matita-ocaml}
33  {keywords={definition,coercion,lemma,theorem,remark,inductive,record,qed,let,in,rec,match,return,with,Type,try,on,to},
34   morekeywords={[2]whd,normalize,elim,cases,destruct},
35   morekeywords={[3]type,of,val,assert,let,function},
36   mathescape=true,
37  }
38\lstset{language=matita-ocaml,basicstyle=\scriptsize\tt,columns=flexible,breaklines=false,
39        keywordstyle=\bfseries, %\color{red}\bfseries,
40        keywordstyle=[2]\bfseries, %\color{blue},
41        keywordstyle=[3]\bfseries, %\color{blue}\bfseries,
42%        commentstyle=\color{green},
43%        stringstyle=\color{blue},
44        showspaces=false,showstringspaces=false}
45\DeclareUnicodeCharacter{8797}{:=}
46\DeclareUnicodeCharacter{10746}{++}
47\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
48\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}
49
50\lstset{
51  extendedchars=false,
52  inputencoding=utf8x,
53  tabsize=1
54}
55
56\setlength{\belowcaptionskip}{0pt}
57\setlength{\textfloatsep}{1em}
58
59\begin{document}
60
61\author{Dominic P. Mulligan \and Claudio Sacerdoti Coen}
62\authorrunning{D. P. Mulligan \and C. Sacerdoti Coen}
63\institute{Dipartimento di Scienze dell'Informazione,\\ Universit\`a di Bologna}
64
65\bibliographystyle{plain}
66
67\title{An executable formalisation of the MCS-51 microprocessor in Matita
68 \thanks{The project CerCo acknowledges the financial support of the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under FET-Open grant number: 243881}}
69
70\maketitle
71
72\begin{abstract}
73We summarise the formalisation of emulators for the MCS-51 microprocessor in O'Caml and the Matita proof assistant.
74The MCS-51 is a widely used microprocessor, especially popular in embedded devices.
75
76The O'Caml emulator is intended to be `feature complete' with respect to the MCS-51 device.
77However, the Matita emulator is intended to be used as a target for a certified, complexity preserving C compiler, as part of the EU-funded CerCo project.
78As a result, not all features of the MCS-51 are formalised in the Matita emulator.
79Both the O'Caml and Matita emulators are `executable'.
80Assembly programs may be animated within Matita, producing a trace of instructions executed.
81\end{abstract}
82
83%\begin{keywords}
84%Hardware formalisation, Matita, dependent types, CerCo
85%\end{keywords}
86
87%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
88% SECTION                                                                      %
89%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
90\section{Introduction}
91\label{sect.introduction}
92
93Formal methods aim to increase our confidence in the design and implementation of software.
94Ideally, all software should come equipped with a formal specification and a proof of correctness for the corresponding implementation.
95Most programs are written in high level languages and then compiled into low level ones.
96Specifications are also given at a high level and correctness can be proved by reasoning on the program's source code.
97The code that is actually run is not the high level source code that we reason on, but low level code generated by the compiler.
98
99Questions now arise:
100\begin{itemize*}
101\item
102What properties are preserved during compilation?
103\item
104What properties are affected by the compilation strategy?
105\item
106To what extent can you trust your compiler in preserving those properties?
107\end{itemize*}
108These and other questions motivate a current `hot topic' in computer science research: \emph{compiler verification} (e.g.~\cite{chlipala:verified:2010,leroy:formal:2009}, and so on).
109So far, the field has only been focused on the first and last questions.
110Much attention has been placed on verifying compiler correctness with respect to extensional properties of programs.
111These are `easily' preserved during compilation.
112
113If we consider intentional properties of programs---space, time, and so forth---the situation is more complex.
114To express these properties, and reason about them, we must adopt a cost model that assigns a cost to single, or blocks, of instructions.
115A compositional cost model, assigning the same cost to all occurrences of one instruction, would be ideal.
116However, compiler optimisations are inherently non-compositional: each occurrence of a high level instruction may be compiled in a different way depending on its context.
117Therefore both the cost model and intentional specifications are affected by the compilation process.
118
119In the CerCo project (`Certified Complexity')~\cite{cerco:2011} we approach the problem of reasoning about intentional properties of programs as follows.
120We are currently developing a compiler that induces a cost model on high level source code.
121Costs are assigned to each block of high level instructions by considering the costs of the corresponding blocks of compiled code.
122The cost model is therefore inherently non-compositional, but has the potential to be extremely \emph{precise}, capturing a program's \emph{realistic} cost.
123That is, the compilation process is taken into account, not ignored.
124A prototype compiler, where no approximation of the cost is provided, has been developed.
125(The technical details of the cost model is explained in~\cite{amadio:certifying:2010}.)
126
127We believe that our approach is applicable to certifying real time programs.
128A user can certify that `deadlines' are met whilst wringing as many clock cycles from the processor---using a cost model that does not over-estimate---as possible.
129
130Our approach is also relevant to compiler verification and construction.
131\emph{An optimisation specified only extensionally is only half specified}; the optimisation may preserve the denotational semantics of a program, but there is no guarantee that intentional properties of the program improve.
132
133Another potential application is the completeness and correctness of the compilation process in the presence of space constraints.
134A compiler could reject a source program targetting an embedded system when the size of the compiled code exceeds available ROM size.
135Preservation of semantics may only be required for those programs that do not exhaust the stack or heap.
136The statement of completeness of the compiler must take into account a realistic cost model.
137
138With the CerCo methodology, we assume we can assign to object code exact and realistic costs for sequential blocks of instructions.
139The WCET community has developed complex tools for bounding the worst-case execution times of sequential blocks on modern processors.
140WCET analysis takes place at the object code level.
141However, it is more convenient to reason about programs at a much higher-level of abstraction.
142Therefore, the analysis must be reflected back onto the original source code.
143This reflection process is completely `untrusted' and makes strong assumptions about the internal design and correctness of the compiler.
144For example, some WCET analysis tools, to maximise precision, require a programmer-provided strict upper bound on the number of loop iterations.
145Compiler optimizations could rearrange code in such a manner that the upper bound is no longer strict.
146The certified CerCo C compiler validates such strong assumptions, and a certified analysis tool could be obtained by combining the CerCo compiler with any certified WCET tool.
147
148We are interested in building a fully certified tool.
149However we are not able to build a certified WCET tool \emph{and} certified C compiler within the confines of the CerCo project.
150We therefore focus on certifying the compiler by targetting a microprocessor where complex WCET analyses are not required.
151
152Caching, memory effects, and advanced features such as branch prediction all have an effect on the complexity of WCET analyses (see~\cite{bate:wcet:2011,yan:wcet:2008}, and so on).
153CerCo therefore decided to focus on 8-bit microprocessors, which are still used in embedded systems.
154These have a predictable, precise cost model due to their relative paucity of features.
155Manufacturer timesheets provide \emph{exact guarantees} for the number of processor cycles each instruction will take to execute.
156
157We have fully formalised an executable formal semantics of a family of 8-bit Freescale microprocessors~\cite{oliboni:matita:2008}, and provided a similar executable formal semantics for the MCS-51 microprocessor.
158The latter is what we describe in this paper.
159The focus of the formalisation has been on capturing the intentional behaviour of the processor; this is not novel.
160However, the design of the MCS-51 itself has caused problems in the formalisation.
161For example, the MCS-51 has a highly unorthogonal instruction set.
162To cope with this unorthogonality, and to produce an executable specification, we rely on the dependent types of Matita, an interactive proof assistant~\cite{asperti:user:2007}.
163The manner in which we combined dependent types and coercions to handle this problem is novel.
164
165\paragraph*{The MCS-51}\quad
166The MCS-51 is an 8-bit microprocessor.
167Commonly called the 8051, since its introduction by Intel the processor has become a popular component of embedded systems.
168Despite being manufactured \emph{en masse}, there is not yet a formal model of the MCS-51.
169
170The 8051 is a well documented processor, with very few underspecified behaviours (almost all of these correspond to erroneous usage of the processor).
171The processor also has the support of numerous open source and commercial tools, such as compilers and emulators.
172For instance, the open source Small Device C Compiler (SDCC) recognises a dialect of C~\cite{sdcc:2010}, and other compilers for BASIC, Forth and Modula-2 are also extant.
173An open source emulator for the processor, MCU 8051 IDE, is also available~\cite{mcu8051ide:2010}.
174Both MCU 8051 IDE and SDCC were used in for validating the formalisation.
175
176\begin{figure*}
177\centering
178\setlength{\unitlength}{0.87pt}
179\small{
180\begin{picture}(410,250)(-50,200)
181%\put(-50,200){\framebox(410,250){}}
182\put(12,410){\makebox(80,0)[b]{Internal (256B)}}
183\put(13,242){\line(0,1){165}}
184\put(93,242){\line(0,1){165}}
185\put(13,407){\line(1,0){80}}
186\put(12,400){\makebox(0,0)[r]{0h}}  \put(14,400){\makebox(0,0)[l]{Register bank 0}}
187\put(13,393){\line(1,0){80}}
188\put(12,386){\makebox(0,0)[r]{8h}}  \put(14,386){\makebox(0,0)[l]{Register bank 1}}
189\put(13,379){\line(1,0){80}}
190\put(12,372){\makebox(0,0)[r]{10h}}  \put(14,372){\makebox(0,0)[l]{Register bank 2}}
191\put(13,365){\line(1,0){80}}
192\put(12,358){\makebox(0,0)[r]{18h}} \put(14,358){\makebox(0,0)[l]{Register bank 3}}
193\put(13,351){\line(1,0){80}}
194\put(12,344){\makebox(0,0)[r]{20h}} \put(14,344){\makebox(0,0)[l]{Bit addressable}}
195\put(13,323){\line(1,0){80}}
196\put(12,316){\makebox(0,0)[r]{30h}}
197  \put(14,309){\makebox(0,0)[l]{\quad \vdots}}
198\put(13,291){\line(1,0){80}}
199\put(12,284){\makebox(0,0)[r]{80h}}
200  \put(14,263){\makebox(0,0)[l]{\quad \vdots}}
201\put(12,249){\makebox(0,0)[r]{ffh}}
202\put(13,242){\line(1,0){80}}
203
204\qbezier(-2,407)(-6,407)(-6,393)
205\qbezier(-6,393)(-6,324)(-10,324)
206\put(-12,324){\makebox(0,0)[r]{Indirect/stack}}
207\qbezier(-6,256)(-6,324)(-10,324)
208\qbezier(-2,242)(-6,242)(-6,256)
209
210\qbezier(94,407)(98,407)(98,393)
211\qbezier(98,393)(98,349)(102,349)
212\put(104,349){\makebox(0,0)[l]{Direct}}
213\qbezier(98,305)(98,349)(102,349)
214\qbezier(94,291)(98,291)(98,305)
215
216\put(102,242){\framebox(20,49){SFR}}
217% bit access to sfrs?
218
219\qbezier(124,291)(128,291)(128,277)
220\qbezier(128,277)(128,266)(132,266)
221\put(134,266){\makebox(0,0)[l]{Direct}}
222\qbezier(128,257)(128,266)(132,266)
223\qbezier(124,242)(128,242)(128,256)
224
225\put(164,410){\makebox(80,0)[b]{External (64kB)}}
226\put(164,220){\line(0,1){187}}
227\put(164,407){\line(1,0){80}}
228\put(244,220){\line(0,1){187}}
229\put(164,242){\line(1,0){80}}
230\put(163,400){\makebox(0,0)[r]{0h}}
231\put(164,324){\makebox(80,0){Paged access}}
232  \put(164,310){\makebox(80,0){Direct/indirect}}
233\put(163,235){\makebox(0,0)[r]{80h}}
234  \put(164,228){\makebox(80,0){\vdots}}
235  \put(164,210){\makebox(80,0){Direct/indirect}}
236
237\put(264,410){\makebox(80,0)[b]{Code (64kB)}}
238\put(264,220){\line(0,1){187}}
239\put(264,407){\line(1,0){80}}
240\put(344,220){\line(0,1){187}}
241\put(263,400){\makebox(0,0)[r]{0h}}
242  \put(264,228){\makebox(80,0){\vdots}}
243  \put(264,324){\makebox(80,0){Direct}}
244  \put(264,310){\makebox(80,0){PC relative}}
245\end{picture}}
246\caption{The 8051 memory model}
247\label{fig.memory.layout}
248\end{figure*}
249
250The 8051 has a relatively straightforward architecture.
251A high-level overview of the processor's memory layout, along with the ways in which different memory spaces may be addressed, is provided in Figure~\ref{fig.memory.layout}.
252
253Processor RAM is divided into numerous segments, with the most prominent division being between internal and (optional) external memory.
254Internal memory, commonly provided on the die itself with fast access, is composed of 256 bytes, but, in direct addressing mode, half of them are overloaded with 128 bytes of memory-mapped Special Function Registers (SFRs).
255SFRs control the operation of the processor.
256Internal RAM (IRAM) is divided again into 8 general purpose bit-addressable registers (R0--R7).
257These sit in the first 8 bytes of IRAM, though can be programmatically `shifted up' as needed.
258Bit memory, followed by a small amount of stack space, resides in the memory space immediately following the register banks.
259What remains of IRAM may be treated as general purpose memory.
260A schematic view of IRAM is also provided in Figure~\ref{fig.memory.layout}.
261
262External RAM (XRAM), limited to a maximum size of 64 kilobytes, is optional, and may be provided on or off chip, depending on the vendor.
263XRAM is accessed using a dedicated instruction, and requires 16 bits to address fully.
264External code memory (XCODE) is often stored as an EPROM, and limited to 64 kilobytes in size.
265However, depending on the particular processor model, a dedicated on-die read-only memory area for program code (ICODE) may be supplied.
266
267Memory may be addressed in numerous ways: immediate, direct, indirect, external direct and code indirect.
268As the latter two addressing modes hint, there are some restrictions enforced by the 8051, and its derivatives, on which addressing modes may be used with specific types of memory.
269For instance, the extra 128 bytes of IRAM of the 8052 cannot be addressed using indirect addressing; rather, external (in)direct addressing must be used.
270Some memory segments are also addressed using 8-bit pointers while others require 16-bits.
271
272The 8051 possesses an 8-bit Arithmetic and Logic Unit (ALU), with a variety of instructions for performing arithmetic and logical operations on bits and integers.
273Two 8-bit general purpose accumulators, A and B, are provided.
274
275Communication with the device is handled by a UART serial port and controller.
276This can operate in numerous modes.
277Serial baud rate is determined by one of two 16-bit timers included with the 8051, which can be set to multiple modes of operation.
278(The 8052 provides an additional 16-bit timer.)
279The 8051 also provides a 4 byte bit-addressable I/O port.
280
281The programmer may take advantage of an interrupt mechanism.
282This is especially useful when dealing with I/O involving the serial device, as an interrupt can be set when a whole character is sent or received via the UART.
283
284Interrupts immediately halt the flow of execution of the processor, and cause the program counter to jump to a fixed address, where the requisite interrupt handler is stored.
285However, interrupts may be set to one of two priorities: low and high.
286The interrupt handler of an interrupt with high priority is executed ahead of the interrupt handler of an interrupt of lower priority, interrupting a currently executing handler of lower priority, if necessary.
287
288The 8051 has interrupts disabled by default.
289The programmer is free to handle serial input and output manually, by poking serial flags in the SFRs.
290`Exceptional circumstances' that would otherwise trigger an interrupt on more modern processors, (e.g. division by zero) are also signalled by setting flags.
291
292%\begin{figure}[t]
293%\begin{center}
294%\includegraphics[scale=0.5]{iramlayout.png}
295%\end{center}
296%\caption{Schematic view of 8051 IRAM layout}
297%\label{fig.iram.layout}
298%\end{figure}
299
300\paragraph*{Overview of paper}\quad
301In Section~\ref{sect.design.issues.formalisation} we discuss design issues in the development of the formalisation.
302In Section~\ref{sect.correctness.pseudoinstruction.expansion} we briefly discuss some difficulties present in the ongoing proof of correctness for the assembly process.
303In Section~\ref{sect.validation} we discuss validation of the emulator to ensure that what we formalised was an accurate model of an MCS-51 series microprocessor.
304In Section~\ref{sect.related.work} we describe previous, related work.
305Section~\ref{sect.conclusions} concludes.
306
307%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
308% SECTION                                                                      %
309%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
310\section{Design issues in the formalisation}
311\label{sect.design.issues.formalisation}
312
313We implemented two emulators, one in O'Caml and one in Matita.
314The O'Caml emulator is intended to be `feature complete' with respect to the MCS-51 device.
315However, the Matita emulator is intended to be used as a target for a certified, complexity preserving C compiler.
316As a result, not all features of the MCS-51 are formalised in the Matita emulator.
317
318We designed the O'Caml emulator to be as efficient as possible, under the constraint that it would be eventually translated into Matita.
319One performance drain in the O'Caml emulator is the use of purely functional map datastructures to represent memory spaces, used to maintain the close correspondence between the Matita and O'Caml emulators.
320
321Matita~\cite{asperti:user:2007} is a proof assistant based on the Calculus of Coinductive constructions, similar to Coq.
322As a programming language, Matita corresponds to the functional fragment of O'Caml extended with dependent types.
323Matita also features a rich higher-order logic for reasoning about programs, including a universe hierarchy with a single impredicative universe, \texttt{Prop}, and potentially infinitely many predicative universes \texttt{Type[i]} for $0 \geq i$.
324All recursive functions admitted by the Matita typechecker must be structurally recursive and total.
325
326We box Matita code to distinguish it from O'Caml code.
327In Matita `\texttt{$?$}' and  `\texttt{$\ldots$}' are arguments to be inferred automatically.
328
329A full account of the formalisation can be found in~\cite{cerco-report-code:2011}.
330All source code is available from the CerCo project website~\cite{cerco-report-code:2011}.
331
332%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
333% SECTION                                                                      %
334%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
335\subsection{Representation of bytes, words, etc.}
336\label{subsect.representation.integers}
337
338\begin{figure}[t]
339\begin{minipage}[t]{0.45\textwidth}
340\vspace{0pt}
341\begin{lstlisting}
342type 'a vect = bit list
343type nibble = [`Sixteen] vect
344type byte = [`Eight] vect
345let split_word w = split_nth 4 w
346let split_byte b = split_nth 2 b
347\end{lstlisting}
348\end{minipage}
349%
350\begin{minipage}[t]{0.55\textwidth}
351\vspace{0pt}
352\begin{lstlisting}
353type 'a vect
354type word = [`Sixteen] vect
355type byte = [`Eight] vect
356val split_word: word -> byte * word
357val split_byte: byte -> nibble * nibble
358\end{lstlisting}
359\end{minipage}
360\caption{Sample of O'Caml implementation and interface for bitvectors module}
361\label{fig.ocaml.implementation.bitvectors}
362\end{figure}
363
364The formalization of MCS-51 must deal with bytes (8-bits), words (16-bits), and also more esoteric quantities (7, 3 and 9-bits).
365To avoid difficult-to-trace size mismatch bugs, we represented all quantities using bitvectors, i.e. fixed length vectors of booleans.
366In the O'Caml emulator, we `faked' bitvectors using phantom types~\cite{leijen:domain:1999} implemented with polymorphic variants~\cite{garrigue:programming:1998}, as in Figure~\ref{fig.ocaml.implementation.bitvectors}.
367From within the bitvector module (top) bitvectors are just lists of bits and no guarantee is provided on sizes.
368However, the module's interface (bottom) enforces size invariants in the rest of the code.
369
370In Matita, we are able to use the full power of dependent types to always work with vectors of a known size:
371\begin{lstlisting}[frame=single]
372inductive Vector (A: Type[0]): nat $\rightarrow$ Type[0] ≝
373  VEmpty: Vector A O
374| VCons: $\forall$n: nat. A $\rightarrow$ Vector A n $\rightarrow$ Vector A (S n).
375\end{lstlisting}
376\texttt{BitVector} is a specialization of \texttt{Vector} to \texttt{bool}.
377We may use Matita's type system to provide precise typings for functions that are polymorphic in the size without code duplication:
378\begin{lstlisting}[frame=single]
379let rec split (A: Type[0]) (m,n: nat) on m: Vector A (m+n) $\rightarrow$ (Vector A m)$\times$(Vector A n) := ...
380\end{lstlisting}
381
382%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
383% SECTION                                                                      %
384%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
385\subsection{Representing memory}
386\label{subsect.representing.memory}
387
388The MCS-51 has numerous disjoint memory spaces addressed by differently sized pointers.
389In the O'Caml implementation, we use a map data structure (from the standard library) for each space.
390In Matita, we exploited dependent types to design a data structure which enforced the correspondence between the size of pointer and the size of the memory space.
391Further, we assumed that large swathes of memory would often be uninitialized (an assumption on the behaviour of the programmer, not the processor!)
392
393We use a modified form of trie of fixed height $h$.
394Paths are represented by bitvectors (used also for addresses and registers) of length $h$:
395\begin{lstlisting}[frame=single]
396inductive BitVectorTrie(A: Type[0]):nat $\rightarrow$ Type[0] :=
397  Leaf: A $\rightarrow$ BitVectorTrie A 0
398| Node: ∀n. BitVectorTrie A n $\rightarrow$ BitVectorTrie A n $\rightarrow$ BitVectorTrie A (S n)
399| Stub: ∀n. BitVectorTrie A n.
400\end{lstlisting}
401\texttt{Stub} is a constructor that can appear at any point in a trie.
402It represents `uninitialized data'.
403Performing a lookup in memory is now straight-forward.
404The only subtlety over normal trie lookup is how we handle \texttt{Stub}.
405We traverse a path, and upon encountering \texttt{Stub}, we return a default value.
406All manufacturer data sheets that we consulted were silent on the subject of what should be returned if we attempt to access uninitialized memory.
407We defaulted to simply returning zero, though our \texttt{lookup} function is parametric in this choice.
408This is reasonable, as SDCC for instance generates code which first `zeroes' memory in a preamble before executing the program proper.
409This is in line with the C standard, which guarantees that all global variables will be zero initialized piecewise.
410
411\texttt{BitVectorTrie} and \texttt{Vector}, and related functions, can be used in the formalisation of other microprocessors.
412
413%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
414% SECTION                                                                      %
415%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
416\subsection{Labels and pseudoinstructions}
417\label{subsect.labels.pseudoinstructions}
418
419Aside from implementing the core MCS-51 instruction set, we also provided \emph{pseudoinstructions}, \emph{labels} and \emph{cost labels}.
420The purpose of \emph{cost labels} will be explained in Subsection~\ref{subsect.computation.cost.traces}.
421
422Introducing pseudoinstructions had the effect of simplifying a C compiler---another component of the CerCo project---that was being implemented in parallel with our implementation.
423To see why, consider the fact that the MCS-51's instruction set has numerous instructions for unconditional and conditional jumps to memory locations.
424For instance, the instructions \texttt{AJMP}, \texttt{JMP} and \texttt{LJMP} all perform unconditional jumps.
425However, these instructions differ in how large the maximum size of the offset of the jump to be performed can be.
426Further, all jump instructions require a concrete memory address---to jump to---to be specified.
427Compilers that support separate compilation cannot directly compute these offsets and select the appropriate jump instructions.
428These operations are also burdensome for compilers that do not do separate compilation and are handled by assemblers.
429We followed suit.
430
431While introducing pseudoinstructions, we also introduced labels for locations to jump to, and for global data.
432To specify global data via labels, we introduced a preamble before the program where labels and the size of reserved space for data is stored.
433A pseudoinstruction \texttt{Mov} moves (16-bit) data stored at a label into the MCS-51's 16-bit register, \texttt{DPTR}.
434(This register is used for indirect addressing of data stored in external memory.)
435
436The pseudoinstructions and labels induce an assembly language similar to that of SDCC's.
437All pseudoinstructions and labels are `assembled away' prior to program execution.
438Jumps are computed in two stages.
439A map associating memory addresses to labels is built, before replacing pseudojumps with concrete jumps to the correct address.
440The algorithm currently implemented does not try to minimize object code size by picking the shortest possible jump instruction.
441A better algorithm is left for future work.
442
443%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
444% SECTION                                                                      %
445%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
446\subsection{Anatomy of the (Matita) emulator}
447\label{subsect.anatomy.matita.emulator}
448
449The Matita emulator's internal state is a record:
450\begin{lstlisting}[frame=single]
451record Status: Type[0] := {
452  code_memory: BitVectorTrie Byte 16;
453  low_internal_ram: BitVectorTrie Byte 7;
454  high_internal_ram: BitVectorTrie Byte 7;
455  external_ram: BitVectorTrie Byte 16;
456  program_counter: Word;
457  special_function_registers_8051: Vector Byte 19;
458  ... }.
459\end{lstlisting}
460This record encapsulates the current memory contents, the program counter, the state of the current SFRs, and so on.
461
462Here the MCS-51's memory model is implemented using four disjoint memory spaces, plus SFRs.
463From the programmer's point of view, what \emph{really} matters are the addressing modes that are in a many-to-many relationship with the spaces.
464\texttt{DIRECT} addressing can be used to address either lower IRAM (if the first bit is 0) or the SFRs (if the first bit is 1), for instance.
465That's why DIRECT uses 8-bit addresses but pointers to lower IRAM only use 7 bits.
466The complexity of the memory model is captured in a pair of functions, \texttt{get\_arg\_XX} and \texttt{set\_arg\_XX}, that `get' and `set' data of size \texttt{XX} from memory.
467
468%Overlapping, and checking which addressing modes can be used to address particular memory spaces, is handled through numerous \texttt{get\_arg\_XX} and \texttt{set\_arg\_XX} (for 1, 8 and 16 bits) functions.
469
470Both the Matita and O'Caml emulators follow the classic `fetch-decode-execute' model of processor operation.
471The next instruction to be processed, indexed by the program counter, is fetched from code memory with \texttt{fetch}.
472An updated program counter, along with its concrete cost in processor cycles, is also returned.
473These costs are taken from a Siemens Semiconductor Group data sheet for the MCS-51~\cite{siemens:2011}, and will likely vary between particular implementations.
474\begin{lstlisting}[frame=single]
475definition fetch: BitVectorTrie Byte 16 $\rightarrow$ Word $\rightarrow$ instruction $\times$ Word $\times$ nat
476\end{lstlisting}
477Instruction are assembled to bit encodings by \texttt{assembly1}:
478\begin{lstlisting}[frame=single]
479definition assembly1: instruction $\rightarrow$ list Byte
480\end{lstlisting}
481An assembly program---comprising a preamble containing global data and a list of pseudoinstructions---is assembled using \texttt{assembly}.
482Pseudoinstructions and labels are eliminated in favour of instructions from the MCS-51 instruction set.
483A map associating memory locations and cost labels (see Subsection~\ref{subsect.computation.cost.traces}) is produced.
484\begin{lstlisting}[frame=single]
485definition assembly: assembly_program $\rightarrow$ option (list Byte $\times$ (BitVectorTrie String 16))
486\end{lstlisting}
487A single fetch-decode-execute cycle is performed by \texttt{execute\_1}:
488\begin{lstlisting}[frame=single]
489definition execute_1: Status $\rightarrow$ Status
490\end{lstlisting}
491The \texttt{execute} functions performs a fixed number of cycles by iterating
492\texttt{execute\_1}:
493\begin{lstlisting}[frame=single]
494let rec execute (n: nat) (s: Status): Status := ...
495\end{lstlisting}
496This differs from the O'Caml emulator, which executed a program indefinitely.
497A callback function was also accepted as an argument, which `witnessed' the execution as it happened.
498Due to Matita's termination requirement, \texttt{execute} cannot execute a program indefinitely.
499An alternative would be to produce an infinite stream of statuses representing an execution trace using Matita's co-inductive types.
500
501%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
502% SECTION                                                                      %
503%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
504\subsection{Instruction set unorthogonality}
505\label{subsect.instruction.set.unorthogonality}
506
507A peculiarity of the MCS-51 is its unorthogonal instruction set; \texttt{MOV} can be invoked using one of 16 combinations of addressing modes out of a total of 361, for instance.
508
509% Show example of pattern matching with polymorphic variants
510
511Such unorthogonality in the instruction set was handled with the use of polymorphic variants in O'Caml~\cite{garrigue:programming:1998}.
512For instance, we introduced types corresponding to each addressing mode:
513\begin{lstlisting}
514type direct = [ `DIRECT of byte ]
515type indirect = [ `INDIRECT of bit ]
516...
517\end{lstlisting}
518Which were then combined in the inductive datatype for assembly preinstructions using the union operator `$|$':
519\begin{lstlisting}
520type 'addr preinstruction =
521[ `ADD of acc * [ reg | direct | indirect | data ]
522...
523| `MOV of (acc * [ reg| direct | indirect | data ],
524   [ reg | indirect ] * [ acc | direct | data ],
525   direct * [ acc | reg | direct | indirect | data ],
526   dptr * data16,
527   carry * bit,
528   bit * carry
529   ) union6
530...
531\end{lstlisting}
532Here, \texttt{union6} is a disjoint union type, defined as follows:
533\begin{lstlisting}
534type ('a,'b,'c,'d,'e,'f) union6 = [ `U1 of 'a | ... | `U6 of 'f ]
535\end{lstlisting}
536The types \texttt{union2}, \texttt{union3} and \texttt{union6} sufficed.
537
538This polymorphic variant machinery worked well: it introduced a certain level of type safety (the type of \texttt{MOV} above guarantees it cannot be invoked with arguments in the \texttt{carry} and \texttt{data16} addressing modes), and also allowed us to pattern match against instructions, when necessary.
539However, this polymorphic variant machinery is \emph{not} present in Matita.
540We needed some way to produce the same effect, which Matita supported.
541We used dependent types.
542
543We provided an inductive data type representing all addressing modes, a type that functions will pattern match against:
544\begin{lstlisting}[frame=single]
545inductive addressing_mode: Type[0] :=
546  DIRECT: Byte $\rightarrow$ addressing_mode
547| INDIRECT: Bit $\rightarrow$ addressing_mode
548...
549\end{lstlisting}
550We also wished to express in the type of functions the \emph{impossibility} of pattern matching against certain constructors.
551In order to do this, we introduced an inductive type of addressing mode `tags'.
552The constructors of \texttt{addressing\_mode\_tag} are in one-to-one correspondence with the constructors of \texttt{addressing\_mode}:
553\begin{lstlisting}[frame=single]
554inductive addressing_mode_tag : Type[0] :=
555  direct: addressing_mode_tag
556| indirect: addressing_mode_tag
557...
558\end{lstlisting}
559The \texttt{is\_a} function checks if an \texttt{addressing\_mode} matches an \texttt{addressing\_mode\_tag}:
560\begin{lstlisting}[frame=single]
561definition is_a :=
562  $\lambda$d: addressing_mode_tag. $\lambda$A: addressing_mode.
563match d with
564[ direct $\Rightarrow$ match A with [ DIRECT _ $\Rightarrow$ true | _ $\Rightarrow$ false ]
565| indirect $\Rightarrow$ match A with [ INDIRECT _ $\Rightarrow$ true | _ $\Rightarrow$ false ]
566...
567\end{lstlisting}
568The \texttt{is\_in} function checks if an \texttt{addressing\_mode} matches a set of tags represented as a vector.
569It simply extends the \texttt{is\_a} function in the obvious manner.
570
571A \texttt{subaddressing\_mode} is an \emph{ad hoc} non-empty $\Sigma$-type of \texttt{addressing\_mode}s in a set of tags:
572\begin{lstlisting}[frame=single]
573record subaddressing_mode (n: nat) (l: Vector addressing_mode_tag (S n)): Type[0] := {
574  subaddressing_modeel :> addressing_mode;
575  subaddressing_modein: bool_to_Prop (is_in $\ldots$ l subaddressing_modeel)
576}.
577\end{lstlisting}
578An implicit coercion~\cite{luo:coercive:1999} is provided to promote vectors of tags (denoted with $\llbracket - \rrbracket$) to the corresponding \texttt{subaddressing\_mode} so that we can use a syntax close to that of O'Caml to specify \texttt{preinstruction}s:
579\begin{lstlisting}[frame=single]
580inductive preinstruction (A: Type[0]): Type[0] ≝
581  ADD: $\llbracket$acc_a$\rrbracket$ $\rightarrow$ $\llbracket$register;direct;indirect;data$\rrbracket$ $\rightarrow$ preinstruction A
582| ADDC: $\llbracket$acc_a$\rrbracket$ $\rightarrow$ $\llbracket$register;direct;indirect;data$\rrbracket$ $\rightarrow$ preinstruction A
583...
584\end{lstlisting}
585The constructor \texttt{ADD} expects two parameters, the first being the accumulator A (\texttt{acc\_a}), the second being a register, direct, indirect or data addressing mode.
586
587% One of these coercions opens up a proof obligation which needs discussing
588% Have lemmas proving that if an element is a member of a sub, then it is a member of a superlist, and so on
589Finally, type coercions from \texttt{addressing\_mode} to \texttt{subaddressing\_mode} and from \texttt{Vector addressing\_mode\_tag} to \texttt{Type$\lbrack0\rbrack$}, are required.
590The first opens a proof obligation wherein we must prove that the provided value is in the admissible set, simulating PVS subset types~\cite{shankar:principles:1999}.
591%PVS introduced subset types, and these later featured in Coq as part of Russell~\cite{sozeau:subset:2006}.
592%All coercions in Matita can open proof obligations.
593
594Proof obligations require us to state and prove a few auxilliary lemmas related to the transitivity of subtyping.
595For instance, an \texttt{addressing\_mode} that belongs to an allowed set also belongs to any one of its supersets.
596At the moment, Matita's automation exploits these lemmas to completely solve all the proof obligations opened in the formalisation.
597The \texttt{execute\_1} function, for instance, opens over 200 proof obligations during type checking.
598
599The machinery just described allows us to restrict the set of \texttt{addressing\_mode}s expected by a function and use this information during pattern matching.
600This allows us to skip impossible cases.
601For instance, consider \texttt{set\_arg\_16}, which expects only a \texttt{DPTR}:
602\begin{lstlisting}[frame=single]
603definition set_arg_16: Status $\rightarrow$ Word $\rightarrow$ $\llbracket$dptr$\rrbracket$ $\rightarrow$ Status := $~\lambda$s, v, a.
604match a return $\lambda$x. bool_to_Prop (is_in ? $\llbracket$dptr$\rrbracket$ x) $\rightarrow$ ? with
605  [ DPTR $\Rightarrow$ $\lambda$_: True.
606    let $\langle$bu, bl$\rangle$ := split $\ldots$ eight eight v in
607    let status := set_8051_sfr s SFR_DPH bu in
608    let status := set_8051_sfr status SFR_DPL bl in
609      status
610  | _ $\Rightarrow$ $\lambda$_: False. $\bot$
611  ] $~$(subaddressing_modein $\ldots$ a).
612\end{lstlisting}
613We give a proof (the expression \texttt{(subaddressing\_modein} $\ldots$ \texttt{a)}) that the argument $a$ is in the set $\llbracket$ \texttt{dptr} $\rrbracket$ to the \texttt{match} expression.
614In every case but \texttt{DPTR}, the proof is a proof of \texttt{False}, and the system opens a proof obligation $\bot$ that can be discarded using \emph{ex falso}.
615Attempting to match against a disallowed addressing mode (replacing \texttt{False} with \texttt{True} in the branch) produces a type error.
616
617We tried other dependently and non-dependently typed solutions before settling on this approach.
618As we need a large number of different combinations of addressing modes to describe the instruction set, it is infeasible to declare a datatype for each one of these combinations.
619The current solution is closest to the corresponding O'Caml code, to the point that the translation from O'Caml to Matita is almost syntactical.
620% We would like to investigate the possibility of changing the code extraction procedure of Matita so that it recognises this programming pattern and outputs O'Caml code using polymorphic variants.
621
622% Talk about extraction to O'Caml code, which hopefully will allow us to extract back to using polymorphic variants, or when extracting vectors we could extract using phantom types
623% Discuss alternative approaches, i.e. Sigma types to piece together smaller types into larger ones, as opposed to using a predicate to `cut out' pieces of a larger type, which is what we did
624
625%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
626% SECTION                                                                      %
627%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
628\subsection{I/O and timers}
629\label{subsect.i/o.timers}
630
631% `Real clock' for I/O and timers
632The O'Caml emulator has code for handling timers, asynchronous I/O and interrupts (these are not in the Matita emulator as they are not relevant to CerCo).
633All three of these features interact with each other in subtle ways.
634Interrupts can `fire' when an input is detected on the processor's UART port, and, in certain modes, timers reset when a high signal is detected on one of the MCS-51's communication pins.
635
636To accurately model timers and I/O, we add an unbounded integral field \texttt{clock} to the central \texttt{status} record.
637This field is only logical, since it does not represent any quantity stored in the physical processor, and is used to keep track of the current `processor time'.
638Before every execution step, \texttt{clock} is incremented by the number of processor cycles that the instruction just fetched will take to execute.
639The emulator executes the instruction then the code implementing timers and I/O (it isn't specified by the data sheets if I/O is handled at the beginning or the end of each cycle.)
640To model I/O, we store in \texttt{status} a \emph{continuation} which is a description of the behaviour of the environment:
641\begin{lstlisting}
642type line =
643[ `P1 of byte | `P3 of byte
644| `SerialBuff of [ `Eight of byte | `Nine of BitVectors.bit * byte ]  ]
645
646type continuation =
647[`In of time * line * epsilon * continuation] option *
648[`Out of (time -> line -> time * continuation)]
649\end{lstlisting}
650The second projection of the continuation $k$ describes how the environment will react to an output event performed in the future by the processor.
651Suppose $\pi_2(k)(\tau,o) = \langle \tau',k' \rangle$.
652If the emulator at time $\tau$ starts an asynchronous output $o$ either on the P1 or P3 output lines, or on the UART, then the environment will receive the output at time $\tau'$.
653Moreover \texttt{status} is immediately updated with the continuation $k'$.
654
655Further, if $\pi_1(k) = \mathtt{Some}~\langle \tau',i,\epsilon,k'\rangle$, then at time $\tau'$ the environment will send the asynchronous input $i$ to the emulator and \texttt{status} is updated with the continuation $k'$.
656This input is visible to the emulator only at time $\tau' + \epsilon$.
657
658The time required to perform an I/O operation is partially specified in the data sheets of the UART module.
659This computation is complex so we prefer to abstract over it.
660We leave the computation of the delay time to the environment.
661
662We use only the P1 and P3 lines.
663This is because the other lines become inoperable if XRAM is present (we assume it is).
664
665The UART port can work in several modes, depending on the how the SFRs are set.
666In an asyncrhonous mode, the UART transmits 8 bits at a time, using a ninth line for synchronisation.
667In a synchronous mode the ninth line is used to transmit an additional bit.
668All UART modes are formalised.
669
670%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
671% SECTION                                                                      %
672%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
673\subsection{Computation of cost traces}
674\label{subsect.computation.cost.traces}
675
676As mentioned in Subsection~\ref{subsect.labels.pseudoinstructions} we introduced a notion of \emph{cost label}.
677Cost labels are inserted by the prototype C compiler at specific locations in the object code.
678
679Cost labels are used to calculate a precise costing for a program by marking the start of basic blocks.
680During the assembly phase, where labels and pseudoinstructions are eliminated, a map is generated associating cost labels with memory locations.
681This map is later used in a separate analysis which computes the cost of a program by traversing through a program, fetching one instruction at a time, and computing the cost of blocks.
682When targetting more complex processors, this simple analysis could be replaced by a sophisticated WCET analysis.
683Block costings are stored in another map, and will be passed back to the compiler.
684
685%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
686% SECTION                                                                      %
687%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
688\section{Correctness of pseudoinstruction expansion}
689\label{sect.correctness.pseudoinstruction.expansion}
690
691With the addition of pseudoinstructions and labels, an assembly language is induced on top of the machine language.
692Pseudoinstructions in an assembly program must be expanded into machine instructions prior to execution.
693As we are interested in constructing a verified compiler, this expansion process must be shown to be `correct', avoiding any semantic changes to the assembly program.
694
695Demonstrating such `correctness'is subtle due to jump pseudoinstructions.
696We would like to expand jump pseudoinstructions to the smallest possible jump machine instruction.
697Doing this is non-trivial, as it is easy to construct programs with configurations of jumps where larger jump instructions can only be `shrunk' if and only if others are also shrunk.
698This is a well-known problem in the assembly programming community, but causes difficulties for the correctness proof of an assembler that purports to produce small executables.
699
700Another problem is computing the cost of pseudoinstructions, a necessary step for lifting our cost model to the C-source level in a cost-preserving manner.
701Conditional jumps can be expanded in complex ways, where the `true' and `false branches' of the jump instruction get expanded into pieces of machine code with differing costs.
702What, then, is the cost of a conditional jump pseudoinstruction?
703
704How we solve these problems, and a full description of the assembler's correctness proof, will be described in a following publication.
705
706%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
707% SECTION                                                                      %
708%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
709\section{Validation}
710\label{sect.validation}
711
712%\begin{figure}[t]
713%\begin{scriptsize}
714%\begin{verbatim}
715%08: mov 81 #07
716%
717% Processor status:                               
718%
719%   ACC: 0   B: 0   PSW: 0
720%    with flags set as:
721%     CY: false    AC: false   FO: false   RS1: false
722%     RS0: false   OV: false   UD: false   P: false
723%   SP: 7   IP: 0   PC: 8   DPL: 0   DPH: 0   SCON: 0
724%   SBUF: 0   TMOD: 0   TCON: 0
725%   Registers:                                   
726%    R0: 0   R1: 0   R2: 0   R3: 0
727%    R4: 0   R5: 0   R6: 0   R7: 0
728%\end{verbatim}
729%\end{scriptsize}
730%\caption{An example snippet from an emulator execution trace}
731%\label{fig.execution.trace}
732%\end{figure}
733
734We attempted to ensure that what we have formalised is an accurate model of the MCS-51 microprocessor.
735
736We made use of multiple data sheets, each from a different manufacturer.
737This helped us triangulate errors in the specification of the processor's instruction set, and its behaviour, for instance, in a data sheet from Philips Semiconductor.
738
739The O'Caml emulator was especially useful for validation purposes.
740We wrote a module for parsing and loading Intel HEX format files.
741Intel HEX is a standard format that all compilers targetting the MCS-51, and similar processors, produce.
742It is essentially a snapshot of the processor's code memory in compressed form.
743Using this we were able to compile C programs with SDCC and load the resulting program directly into the emulator's code memory, ready for execution.
744Further, we can produce a HEX file from the emulator's code memory for loading into third party tools.
745After each step of execution, we can print out both the instruction that had been executed and a snapshot of the processor's state, including all flags and register contents.
746These traces were useful in spotting anything that was `obviously' wrong with the execution of the program.
747
748We further used MCU 8051 IDE as a reference, which allows a user to step through an assembly program one instruction at a time.
749With these execution traces, we could step through a compiled program in MCU 8051 IDE and compare the resulting execution trace with the trace produced by our emulator.
750
751We partially validated the assembler by proving in Matita that on all defined opcodes the \texttt{assembly\_1} and \texttt{fetch} functions are inverse.
752
753The Matita formalisation was largely copied from the O'Caml source code apart from the changes already mentioned.
754As the Matita emulator is executable we could perform further validation by comparing the trace of a program's execution in the Matita and O'Caml emulators.
755
756%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
757% SECTION                                                                      %
758%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
759\section{Related work}
760\label{sect.related.work}
761A large body of literature on the formalisation of microprocessors exists.
762The majority of it deals with proving correctness of implementations of microprocessors at the microcode or gate level, with many considering `cycle accurate' models.
763We are interested in providing a precise specification of the behaviour of the microprocessor in order to prove the correctness of a compiler which will target the processor.
764In particular, we are interested in intentional properties of the processor; precise timings of instruction execution in clock cycles.
765We have also built a complete MCS-51 ecosystem: UART, I/O lines, and hardware timers, complete with an assembler.
766
767Fox \emph{et al} formalised, in HOL4, the ARMv7 instruction set architecture, along with various related tools, such as assemblers~~\cite{fox:trustworthy:2010}.
768The formalisation is executable both within (using rewriting) and outwith the theorem prover.
769They further point to an excellent list of references to related work in the literature.
770In further related work~\cite{myreen:extensible:2009}, a verified compiler for a Lisp-like language is presented, targetting this formalised microprocessor.
771
772Further, in~\cite{fox:trustworthy:2010} the authors validated their formalisation by using development boards and random testing.
773We currently rely on non-exhaustive testing against a third party emulator.
774We recognise the importance of this exhaustive testing, but currently leave it for future work.
775Both models---ours and that of~\cite{fox:trustworthy:2010}---are `executable'.
776
777Our main difficulties resided in the non-uniformity of an old 8-bit architecture, in terms of the instruction set, addressing modes and memory models.
778In contrast, the various ARM instruction sets and memory models are relatively uniform.
779
780Other related projects CompCert~\cite{leroy:formally:2009} and the CLI Stack.
781CompCert concerns the certification of a C compiler and includes a formalisation in Coq of a subset of PowerPC.
782The CLI Stack consists of the design and verification of a whole chain of artefacts including a 32-bit microprocessor (FM9001), the Piton assembler and two compilers.
783Like CerCo, the CLI Stack compilers gave the cost of high-level instructions in processor cycles.
784The CLI Stack high-level languages ($\mu$Gypsy and Nqthm Lisp) and FM9001 microprocessor were not commercial products, but bespoke designs for the purpose of verification (see~\cite{moore:grand:2005}).
785This work has, however, been continued by Hunt and Swords, verifying the Centaur Technologies Via Nano and Isaiah micoprocessors in ACL2~\cite{hunt:verifying:2010}.
786
787The CompCert C compiler is extracted to O'Caml using Coq's code extraction facility.
788Many other formalised emulators/compilers have also been extracted from proof assistants using similar technology (e.g. see~\cite{blanqui:designing:2010}).
789We aim to make use of a similar code extraction facility in Matita, but only if the extracted code exhibits the same degree of type safety, provided by polymorphic variants, and human readability that the O'Caml emulator posseses.
790This is because we aim to use the emulator as a library for non-certified software written directly in O'Caml.
791Matita's dependent typesm, used to handle the instruction set (Subsection~\ref{subsect.instruction.set.unorthogonality}), could enable code extraction to make use of polymorphic variants.
792Using Coq's current code extraction algorithm we could write assembly programs that would generate runtime errors when emulated.
793We leave this for future work.
794
795Despite the apparent similarity between CerCo and CompCert, the two formalisations do not have much in common.
796First, CompCert provides a formalisation at the assembly level (no instruction decoding).
797This impels them to trust an unformalised assembler and linker, whereas we provide our own.
798Our formalisation is \emph{directly} executable, while the one in CompCert only provides a relation that describes execution.
799In CompCert I/O is only described as a synchronous external function call and there is no I/O at the processor level.
800CompCert assumes an idealized abstract and uniform memory model; we take into account the complicated overlapping memory model of the MCS-51 architecture.
801Finally, 82 instructions of the more than 200 offered by the processor are formalised in CompCert, and the assembly language is augmented with macro instructions that are turned into `real' instructions only during communication with the external assembler.
802Even from a technical level the two formalisations differ: we use dependent types whilst CompCert largely sticks to non-dependent types.
803
804Atkey~\cite{atkey:coqjvm:2007} provides an executable specification of the JVM.
805Dependent types are used to remove spurious partiality from the model, and lower the need for over-specifying the behaviour of the processor in impossible cases.
806However, dependent types in the two formalisations are used differently.
807Further, our use of dependent types will also serve to enforce invariants in the CerCo compiler.
808
809Finally~\cite{sarkar:semantics:2009} provides an executable semantics for x86-CC multiprocessor machine code.
810This machine code exhibits a degree of non-uniformity similar to the MCS-51.
811Only a small subset of the instruction set is considered, and they over-approximate the possibilities of unorthogonality of the instruction set, largely dodging the problems we had to face.
812
813Further, it seems that the definition of the decode function is potentially error prone.
814A domain specific language of patterns is formalised in HOL4, similar to the specification language of the x86 instruction set found in manufacturer's data sheets.
815A decode function is implemented by copying lines from data sheets into the proof script, which are then partially evaluated to obtain a compiler.
816We are currently considering implementing a similar domain specific language in Matita.
817
818%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
819% SECTION                                                                      %
820%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
821\section{Conclusions}
822\label{sect.conclusions}
823
824In CerCo, we are interested in the certification of a compiler for C that induces a precise cost model on the source code.
825Our cost model assigns costs to blocks of instructions by tracing the way that blocks are compiled, and by computing exact costs on generated machine language.
826To perform this accurately, we have provided an executable semantics for the MCS-51 family of processors.
827An O'Caml and Matita formalisation was provided, and both capture the exact timings of the MCS-51 (according to a Siemen's data sheet).
828The O'Caml formalisation also considers timers and I/O.
829Adding support for I/O and timers in Matita is on-going work that will not present any problem, and was delayed only because the addition is not immediately useful for the formalisation of the CerCo compiler.
830
831The formalisation is done at machine level and not at assembly level; we also formalise fetching and decoding.
832We provide an assembly language, enhanched with labels and pseudoinstructions, and an assembler from this language to machine code.
833This assembly language is similar to those found in `industrial strength' compilers, such as SDCC.
834A proof of correctness for the assembly process is ongoing.
835We introduce cost labels in the machine language to relate the data flow of the assembly program to that of the C source language, in order to associate costs to the C program.
836
837Our main difficulty in formalising the MCS-51 was the unorthogonality of its memory model and instruction set.
838These problems are handled in O'Caml by using language features like polymorphic variants and phantom types, simulating Generalized Abstract Data Types~\cite{xi:guarded:2003}.
839Importantly, we searched for a manner of using dependent types to recover the same flexibility, reduce spurious partiality, and grant invariants that will be later useful in other formalisations in CerCo.
840
841The formalisation has been partially verified by computing execution traces on selected programs and comparing them with an existing emulator.
842All instructions have been tested at least once, but we have not yet pushed testing further, for example with random testing or by using development boards.
843I/O in particular has not been tested yet, and it is currently unclear how to provide exhaustive testing in the presence of I/O.
844Finally, we are aware of having over-specified the processor in several places, by fixing a behaviour hopefully consistent with the real machine, where manufacturer data sheets are ambiguous or under-specified.\footnote{Submitted paper 16 pages by permission of program committee.}
845
846\bibliography{itp-2011}
847
848\end{document}
Note: See TracBrowser for help on using the repository browser.