source: Deliverables/D4.1/ITP-Paper/itp-2011.tex @ 569

Last change on this file since 569 was 569, checked in by mulligan, 9 years ago

Tightened up English in the Introduction

File size: 56.5 KB
Line 
1\documentclass{llncs}
2
3\usepackage{amsfonts}
4\usepackage{amsmath}
5\usepackage{amssymb} 
6\usepackage[english]{babel}
7\usepackage{color}
8\usepackage{fancybox}
9\usepackage{graphicx}
10\usepackage[colorlinks]{hyperref}
11\usepackage[utf8x]{inputenc}
12\usepackage{listings}
13\usepackage{mdwlist}
14\usepackage{microtype}
15\usepackage{stmaryrd}
16\usepackage{url}
17
18\newlength{\mylength} 
19\newenvironment{frametxt}%
20        {\setlength{\fboxsep}{5pt}
21                \setlength{\mylength}{\linewidth}%
22                \addtolength{\mylength}{-2\fboxsep}%
23                \addtolength{\mylength}{-2\fboxrule}%
24                \Sbox
25                \minipage{\mylength}%
26                        \setlength{\abovedisplayskip}{0pt}%
27                        \setlength{\belowdisplayskip}{0pt}%
28                }%
29                {\endminipage\endSbox
30                        \[\fbox{\TheSbox}\]}
31
32\lstdefinelanguage{matita-ocaml}
33  {keywords={definition,coercion,lemma,theorem,remark,inductive,record,qed,let,in,rec,match,return,with,Type,try,on,to},
34   morekeywords={[2]whd,normalize,elim,cases,destruct},
35   morekeywords={[3]type,of,val,assert,let,function},
36   mathescape=true,
37  }
38\lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false,
39        keywordstyle=\color{red}\bfseries,
40        keywordstyle=[2]\color{blue},
41        keywordstyle=[3]\color{blue}\bfseries,
42        commentstyle=\color{green},
43        stringstyle=\color{blue},
44        showspaces=false,showstringspaces=false}
45\lstset{extendedchars=false}
46\lstset{inputencoding=utf8x}
47\DeclareUnicodeCharacter{8797}{:=}
48\DeclareUnicodeCharacter{10746}{++}
49\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
50\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}
51
52\author{Dominic P. Mulligan\thanks{The project CerCo acknowledges the financial support of the Future and
53Emerging Technologies (FET) programme within the Seventh Framework
54Programme for Research of the European Commission, under FET-Open grant
55number: 243881} \and Claudio Sacerdoti Coen$^\star$}
56\authorrunning{D. P. Mulligan and C. Sacerdoti Coen}
57\title{An executable formalisation of the MCS-51 microprocessor in Matita}
58\titlerunning{An executable formalisation of the MCS-51}
59\institute{Dipartimento di Scienze dell'Informazione, Universit\`a di Bologna}
60
61\bibliographystyle{plain}
62
63\begin{document}
64
65\maketitle
66
67\begin{abstract}
68We summarise our formalisation of an emulator for the MCS-51 microprocessor in the Matita proof assistant.
69The MCS-51 is a widely used 8-bit microprocessor, especially popular in embedded devices.
70
71The formalisation proceeded in two stages, first implementing an O'Caml prototype for quickly `ironing out' bugs.
72We then ported the O'Caml emulator to Matita.
73Though mostly straight-forward, this porting presented multiple problems.
74Of particular interest is how we handled the non-orthoganality of the MSC-51's instruction set.
75In O'Caml, this was handled with polymorphic variants.
76In Matita, we achieved the same effect with a non-standard use of dependent types.
77
78Both the O'Caml and Matita emulators are `executable'.
79Assembly programs may be animated within Matita, producing a trace of instructions executed.
80The formalisation is a major component of the ongoing EU-funded CerCo project.
81\end{abstract}
82
83%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
84% SECTION                                                                      %
85%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
86\section{Introduction}
87\label{sect.introduction}
88
89Formal methods aim to increase our confidence in the design and implementation of software.
90Ideally, all software should come equipped with a formal specification and a proof of correctness for the corresponding implementation.
91The majority of programs are written in high level languages and then compiled into low level ones.
92Specifications are therefore also given at a high level and correctness can be proved by reasoning on the program's source code.
93The code that is actually run, however, is not the high level source code that we reason on, but low level code generated by the compiler.
94A few questions now arise:
95\begin{itemize*}
96\item
97What properties are preserved during compilation?
98\item
99What properties are affected by the compilation strategy?
100\item
101To what extent can you trust your compiler in preserving those properties?
102\end{itemize*}
103These questions, and others like them, motivate a current `hot topic' in computer science research: \emph{compiler verification} (for instance~\cite{leroy:formal:2009,chlipala:verified:2010}, and many others).
104So far, the field has only been focused on the first and last questions.
105Much attention has been placed on verifying compiler correctness with respect to extensional properties of programs.
106These are `easily' preserved during compilation.
107
108If we consider intensional properties of programs---space, time, and so forth---the situation is more complex.
109To express these properties, and reason about them, we must adopt a cost model that assigns a cost to single, or blocks, of instructions.
110A compositional cost model, assigning the same cost to all occurrences of one instruction, would be ideal.
111However, compiler optimisations are inherently non-compositional: each occurrence of a high level instruction may be compiled in a different way depending on its context.
112Therefore both the cost model and intensional specifications are affected by the compilation process.
113
114In the CerCo project (`Certified Complexity')~\cite{cerco:2011} we approach the problem of reasoning about intensional properties of programs as follows.
115We are currently developing a compiler that induces a cost model on high level source code.
116Costs are assigned to each block of high level instructions by considering the costs of the corresponding blocks of compiled code.
117The cost model is therefore inherently non-compositional, but has the potential to be extremely \emph{precise}, capturing a program's \emph{realistic} cost.
118That is, the compilation process is taken into account, not ignored.
119A prototype compiler, where no approximation of the cost is provided, has been developed.
120(The technical details of the cost model is explained in~\cite{amadio:certifying:2010}.)
121
122We believe that our approach is applicable to certifying real time programs.
123A user can certify that `deadlines' are met whilst wringing as many clock cycles from the processor---using a cost model that does not over-estimate---as possible.
124
125We also see our approach as being relevant to compiler verification (and construction) itself.
126\emph{An optimisation specified only extensionally is only half specified}.
127Though the optimisation may preserve the denotational semantics of a program, there is no guarantee that any intensional properties of the program will be improved.
128
129Another potential application is toward completeness and correctness of the compilation process in the presence of space constraints.
130A compiler could reject a source program targetting an embedded system when the size of the compiled code exceeds the available ROM size.
131Preservation of a program's semantics may only be required for those programs that do not exhaust the stack or heap.
132The statement of completeness of the compiler must therefore take in to account a realistic cost model.
133
134With the CerCo methodology, we assume we can assign to object code exact and realistic costs for sequential blocks of instructions.
135This is possible with modern processors (see~\cite{bate:wcet:2011,yan:wcet:2008} for instance) but difficult, as the structure and execution of a program itself has an influence on the speed of processing.
136Caching, memory effects, and advanced features such as branch prediction all have an effect on execution speed.
137For this reason CerCo decided to focus on 8-bit microprocessors.
138These are still used in embedded systems, with the advantage of a predictable cost model due to their relative paucity of features.
139
140We have fully formalised an executable formal semantics of a family of 8-bit Freescale microprocessors~\cite{oliboni:matita:2008}, and provided a similar executable formal semantics for the MCS-51 microprocessor.
141The latter is what we describe in this paper.
142The focus of the formalisation has been on capturing the intensional behaviour of the processor.
143However, the design of the MCS-51 itself has caused problems in our formalisation.
144For example, the MCS-51 has a highly unorthogonal instruction set.
145To cope with this unorthogonality, and to produce an executable specification, we rely on Matita's dependent types.
146
147\paragraph{The MCS-51}\quad
148The MCS-51 is an 8-bit microprocessor introduced by Intel in the late 1970s.
149Commonly called the 8051, in the decades since its introduction the processor has become a popular component of embedded systems.
150The processor, its successor the 8052, and derivatives are still manufactured \emph{en masse} by a host of vendors.
151
152The 8051 is a well documented processor, and has the support of numerous open source and commercial tools, such as compilers and emulators.
153For instance, the open source Small Device C Compiler (SDCC) recognises a dialect of C~\cite{sdcc:2010}, and other compilers for BASIC, Forth and Modula-2 are also extant.
154An open source emulator for the processor, MCU 8051 IDE, is also available~\cite{mcu8051ide:2010}.
155Both MCU 8051 IDE and SDCC were used in for validating our formalisation.
156
157\begin{figure}[t]
158\setlength{\unitlength}{0.87pt}
159\begin{picture}(410,250)(-50,200)
160%\put(-50,200){\framebox(410,250){}}
161\put(12,410){\makebox(80,0)[b]{Internal (256B)}}
162\put(13,242){\line(0,1){165}}
163\put(93,242){\line(0,1){165}}
164\put(13,407){\line(1,0){80}}
165\put(12,400){\makebox(0,0)[r]{0h}}  \put(14,400){\makebox(0,0)[l]{Register bank 0}}
166\put(13,393){\line(1,0){80}}
167\put(12,386){\makebox(0,0)[r]{8h}}  \put(14,386){\makebox(0,0)[l]{Register bank 1}}
168\put(13,379){\line(1,0){80}}
169\put(12,372){\makebox(0,0)[r]{10h}}  \put(14,372){\makebox(0,0)[l]{Register bank 2}}
170\put(13,365){\line(1,0){80}}
171\put(12,358){\makebox(0,0)[r]{18h}} \put(14,358){\makebox(0,0)[l]{Register bank 3}}
172\put(13,351){\line(1,0){80}}
173\put(12,344){\makebox(0,0)[r]{20h}} \put(14,344){\makebox(0,0)[l]{Bit addressable}}
174\put(13,323){\line(1,0){80}}
175\put(12,316){\makebox(0,0)[r]{30h}}
176  \put(14,309){\makebox(0,0)[l]{\quad \vdots}}
177\put(13,291){\line(1,0){80}}
178\put(12,284){\makebox(0,0)[r]{80h}}
179  \put(14,263){\makebox(0,0)[l]{\quad \vdots}}
180\put(12,249){\makebox(0,0)[r]{ffh}}
181\put(13,242){\line(1,0){80}}
182
183\qbezier(-2,407)(-6,407)(-6,393)
184\qbezier(-6,393)(-6,324)(-10,324)
185\put(-12,324){\makebox(0,0)[r]{Indirect/stack}}
186\qbezier(-6,256)(-6,324)(-10,324)
187\qbezier(-2,242)(-6,242)(-6,256)
188
189\qbezier(94,407)(98,407)(98,393)
190\qbezier(98,393)(98,349)(102,349)
191\put(104,349){\makebox(0,0)[l]{Direct}}
192\qbezier(98,305)(98,349)(102,349)
193\qbezier(94,291)(98,291)(98,305)
194
195\put(102,242){\framebox(20,49){SFR}}
196% bit access to sfrs?
197
198\qbezier(124,291)(128,291)(128,277)
199\qbezier(128,277)(128,266)(132,266)
200\put(134,266){\makebox(0,0)[l]{Direct}}
201\qbezier(128,257)(128,266)(132,266)
202\qbezier(124,242)(128,242)(128,256)
203
204\put(164,410){\makebox(80,0)[b]{External (64kB)}}
205\put(164,220){\line(0,1){187}}
206\put(164,407){\line(1,0){80}}
207\put(244,220){\line(0,1){187}}
208\put(164,242){\line(1,0){80}}
209\put(163,400){\makebox(0,0)[r]{0h}}
210\put(164,324){\makebox(80,0){Paged access}}
211  \put(164,310){\makebox(80,0){Direct/indirect}}
212\put(163,235){\makebox(0,0)[r]{80h}}
213  \put(164,228){\makebox(80,0){\vdots}}
214  \put(164,210){\makebox(80,0){Direct/indirect}}
215
216\put(264,410){\makebox(80,0)[b]{Code (64kB)}}
217\put(264,220){\line(0,1){187}}
218\put(264,407){\line(1,0){80}}
219\put(344,220){\line(0,1){187}}
220\put(263,400){\makebox(0,0)[r]{0h}}
221  \put(264,228){\makebox(80,0){\vdots}}
222  \put(264,324){\makebox(80,0){Direct}}
223  \put(264,310){\makebox(80,0){PC relative}}
224\end{picture}
225\caption{The 8051 memory model}
226\label{fig.memory.layout}
227\end{figure}
228
229The 8051 has a relatively straightforward architecture, unencumbered by advanced features of modern processors, making it an ideal target for formalisation.
230A high-level overview of the processor's memory layout, along with the ways in which different memory spaces may be addressed, is provided in Figure~\ref{fig.memory.layout}.
231
232Processor RAM is divided into numerous segments, with the most prominent division being between internal and (optional) external memory.
233Internal memory, commonly provided on the die itself with fast access, is composed of 256 bytes, but, in direct addressing mode, half of them are overloaded with 128 bytes of memory-mapped Special Function Registers (SFRs).
234SFRs control the operation of the processor.
235Internal RAM (IRAM) is divided again into 8 general purpose bit-addressable registers (R0--R7).
236These sit in the first 8 bytes of IRAM, though can be programmatically `shifted up' as needed.
237Bit memory, followed by a small amount of stack space, resides in the memory space immediately following the register banks.
238What remains of IRAM may be treated as general purpose memory.
239A schematic view of IRAM layout is also provided in Figure~\ref{fig.memory.layout}.
240
241External RAM (XRAM), limited to a maximum size of 64 kilobytes, is optional, and may be provided on or off chip, depending on the vendor.
242XRAM is accessed using a dedicated instruction, and requires 16 bits to address fully.
243External code memory (XCODE) is often stored as an EPROM, and limited to 64 kilobytes in size.
244However, depending on the particular processor model, a dedicated on-die read-only memory area for program code (ICODE) may be supplied.
245
246Memory may be addressed in numerous ways: immediate, direct, indirect, external direct and code indirect.
247As the latter two addressing modes hint, there are some restrictions enforced by the 8051, and its derivatives, on which addressing modes may be used with specific types of memory.
248For instance, the extra 128 bytes of IRAM of the 8052 cannot be addressed using indirect addressing; rather, external (in)direct addressing must be used. Moreover, some memory segments are addressed using 8-bit pointers while others require 16-bits.
249
250The 8051 possesses an 8-bit Arithmetic and Logic Unit (ALU), with a variety of instructions for performing arithmetic and logical operations on bits and integers.
251Two 8-bit general purpose accumulators, A and B, are provided.
252
253Communication with the device is handled by an inbuilt UART serial port and controller.
254This can operate in numerous modes.
255Serial baud rate is determined by one of two 16-bit timers included with the 8051, which can be set to multiple modes of operation.
256(The 8052 provides an additional 16-bit timer.)
257The 8051 also provides a 4 byte bit-addressable I/O port.
258
259The programmer may take advantage of an interrupt mechanism.
260This is especially useful when dealing with I/O involving the serial device, as an interrupt can be set when a whole character is sent or received via the UART.
261
262Interrupts immediately halt the flow of execution of the processor, and cause the program counter to jump to a fixed address, where the requisite interrupt handler is stored.
263However, interrupts may be set to one of two priorities: low and high.
264The interrupt handler of an interrupt with high priority is executed ahead of the interrupt handler of an interrupt of lower priority, interrupting a currently executing handler of lower priority, if necessary.
265
266The 8051 has interrupts disabled by default.
267The programmer is free to handle serial input and output manually, by poking serial flags in the SFRs.
268`Exceptional circumstances' that would otherwise trigger an interrupt on more modern processors, (e.g. division by zero) are also signalled by setting flags.
269
270%\begin{figure}[t]
271%\begin{center}
272%\includegraphics[scale=0.5]{iramlayout.png}
273%\end{center}
274%\caption{Schematic view of 8051 IRAM layout}
275%\label{fig.iram.layout}
276%\end{figure}
277
278\paragraph{Overview of paper}\quad
279In Section~\ref{sect.design.issues.formalisation} we discuss design issues in the development of the formalisation.
280In Section~\ref{sect.validation} we discuss how we validated the design and implementation of our emulator to ensure that what we formalised was an accurate model of an MCS-51 series microprocessor.
281In Section~\ref{sect.related.work} we describe previous work, with an eye toward describing its relation with the work described herein.
282In Section~\ref{sect.conclusions} we conclude.
283
284%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
285% SECTION                                                                      %
286%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
287\section{Design issues in the formalisation}
288\label{sect.design.issues.formalisation}
289
290From hereonin, we typeset O'Caml source with \texttt{\color{blue}{blue}} and Matita source with \texttt{\color{red}{red}} to distinguish the two syntaxes.
291Matita's syntax is largely straightforward to those familiar with Coq or O'Caml.
292The only subtlety is the use of `\texttt{?}' in an argument position denoting an argument that should be inferred automatically.
293
294A full account of the formalisation can be found in~\cite{cerco-report:2011}.
295
296\subsection{Development strategy}
297\label{subsect.development.strategy}
298
299Our implementation progressed in two stages.
300We began with an emulator written in O'Caml to `iron out' any bugs in our design and implementation.
301O'Caml's ability to perform file I/O also eased debugging and validation.
302Once we were happy with the design of the O'Caml emulator, we moved Matita.
303
304Matita's syntax is lexically similar to O'Caml's.
305This eased the translation, as code was merely copied with minor modifications.
306However, several major issues had to be addresses when moving from to Matita.
307These are now discussed.
308
309%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
310% SECTION                                                                      %
311%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
312\subsection{Representation of bytes, words, etc.}
313\label{subsect.representation.integers}
314
315\begin{figure}[t]
316\begin{minipage}[t]{0.45\textwidth}
317\vspace{0pt}
318\begin{lstlisting}
319type 'a vect = bit list
320type nibble = [`Sixteen] vect
321type byte = [`Eight] vect
322$\color{blue}{\mathtt{let}}$ split_word w = split_nth 4 w
323$\color{blue}{\mathtt{let}}$ split_byte b = split_nth 2 b
324\end{lstlisting}
325\end{minipage}
326%
327\begin{minipage}[t]{0.55\textwidth}
328\vspace{0pt}
329\begin{lstlisting}
330type 'a vect
331type word = [`Sixteen] vect
332type byte = [`Eight] vect
333val split_word: word -> byte * word
334val split_byte: byte -> nibble * nibble
335\end{lstlisting}
336\end{minipage}
337\caption{Sample of O'Caml implementation and interface for bitvectors module}
338\label{fig.ocaml.implementation.bitvectors}
339\end{figure}
340
341The formalization of MCS-51 must deal with bytes (8-bits), words (16-bits) but also with more exoteric quantities (7-bits, 3-bits, 9-bits).
342To avoid size mismatch bugs difficult to spot, we represent all of these quantities using bitvectors, i.e. fixed length vectors of booleans.
343In our O'Caml emulator, we `faked' bitvectors using phantom types~\cite{leijen:domain:1999} implemented with polymorphic variants~\cite{garrigue:programming:1998}, as in Figure~\ref{fig.ocaml.implementation.bitvectors}.
344From within the bitvector module (left column) bitvectors are just lists of bits and no guarantee is provided on sizes.
345However, the module's interface (right column) enforces the size invariants in the rest of the code.
346
347In Matita, we are able to use the full power of dependent types to always work with vectors of a known size:
348\begin{lstlisting}
349inductive Vector (A: Type[0]): nat $\rightarrow$ Type[0] ≝
350  VEmpty: Vector A O
351| VCons: $\forall$n: nat. A $\rightarrow$ Vector A n $\rightarrow$ Vector A (S n).
352\end{lstlisting}
353We define \texttt{BitVector} as a specialization of \texttt{Vector} to \texttt{bool}.
354We may use Matita's type system to provide precise typing for functions that are
355polymorphic in the size without having to duplicate the code as we did in O'Caml:
356\begin{lstlisting}
357let rec split (A: Type[0]) (m,n: nat) on m:
358   Vector A (plus m n) $\rightarrow$ (Vector A m) $\times$ (Vector A n) := ...
359\end{lstlisting}
360
361%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
362% SECTION                                                                      %
363%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
364\subsection{Representing memory}
365\label{subsect.representing.memory}
366
367The MCS-51 has numerous disjoint memory spaces addressed by pointers of different sizes.
368In our prototype implementation, we use a map data structure (from O'Caml's standard library) for each space.
369Matita's standard library is small, and does not contain a generic map data structure.
370We had the opportunity of crafting a dependently typed special-purpose data structure for the job to enforce the correspondence between the size of pointer and the size of the memory space.
371We assumed that large swathes of memory would often be uninitialized.
372
373We picked a modified form of trie of fixed height $h$.
374Paths are represented by bitvectors (already used in our implementation for addresses and registers) of length $h$:
375\begin{lstlisting}
376inductive BitVectorTrie (A: Type[0]): nat $\rightarrow$ Type[0] ≝
377  Leaf: A $\rightarrow$ BitVectorTrie A 0
378| Node: ∀n. BitVectorTrie A n $\rightarrow$ BitVectorTrie A n $\rightarrow$ BitVectorTrie A (S n)
379| Stub: ∀n. BitVectorTrie A n.
380\end{lstlisting}
381\texttt{Stub} is a constructor that can appear at any point in a trie.
382It represents `uninitialized data'.
383Performing a lookup in memory is now straight-forward.
384We traverse a path, and if we encounter \texttt{Stub}, we return a default value\footnote{All manufacturer data sheets that we consulted were silent on the subject of what should be returned if we attempt to access uninitialized memory.  We defaulted to simply returning zero, though our \texttt{lookup} function is parametric in this choice.  We do not believe that this is an outrageous decision, as SDCC for instance generates code which first `zeroes out' all memory in a preamble before executing the program proper.  This is in line with the C standard, which guarantees that all global variables will be zero initialized piecewise.}.
385
386%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
387% SECTION                                                                      %
388%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
389\subsection{Labels and pseudoinstructions}
390\label{subsect.labels.pseudoinstructions}
391
392Aside from implementing the core MCS-51 instruction set, we also provided \emph{pseudoinstructions}, \emph{labels} and \emph{cost labels}.
393The purpose of \emph{cost labels} will be explained in Subsection~\ref{subsect.computation.cost.traces}.
394
395Introducing pseudoinstructions had the effect of simplifying a C compiler---another component of the CerCo project---that was being implemented in parallel with our implementation.
396To see why, consider the fact that the MCS-51's instruction set has numerous instructions for unconditional and conditional jumps to memory locations.
397For instance, the instructions \texttt{AJMP}, \texttt{JMP} and \texttt{LJMP} all perform unconditional jumps.
398However, these instructions differ in how large the maximum size of the offset of the jump to be performed can be.
399Further, all jump instructions require a concrete memory address---to jump to---to be specified.
400Compilers that support separate compilation cannot directly compute these offsets and select the appropriate jump instructions.
401These operations are also burdensome for compilers that do not do separate compilation and are handled by the assemblers, as we decided to do.
402
403While introducing pseudoinstructions, we also introduced labels for locations to jump to, and for global data.
404To specify global data via labels, we introduced a preamble before the program where labels and the size of reserved space for data is associated.
405A pseudoinstruction \texttt{Mov} moves (16-bit) data stored at a label into the (16-bit) register \texttt{DPTR}.
406
407Our pseudoinstructions and labels induce an assembly language similar to that of SDCC.
408All pseudoinstructions and labels are `assembled away' prior to program execution.
409Jumps are computed in two stages.
410A map associating memory addresses to labels is built, before removing pseudojumps with concrete jumps to the correct address.
411The algorithm currently implemented does not try to minimize the object code size by always picking the shortest possible jump instruction.
412A better algorithm is currently left for future work.
413
414%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
415% SECTION                                                                      %
416%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
417\subsection{Anatomy of the (Matita) emulator}
418\label{subsect.anatomy.matita.emulator}
419
420The internal state of our Matita emulator is represented as a record:
421\begin{lstlisting}
422record Status: Type[0] ≝ {
423  code_memory: BitVectorTrie Byte 16;
424  low_internal_ram: BitVectorTrie Byte 7;
425  high_internal_ram: BitVectorTrie Byte 7;
426  external_ram: BitVectorTrie Byte 16;
427  program_counter: Word;
428  special_function_registers_8051: Vector Byte 19;
429  special_function_registers_8052: Vector Byte 5;
430  ...  }.
431\end{lstlisting}
432This record neatly encapsulates the current memory contents, the program counter, the state of the current SFRs, and so on.
433
434Here the MCS-51 memory model is implemented using four disjoint memory spaces plus the SFRs.
435From the programmer's point of view, what matters are addressing modes that are in a many-to-many relation with the spaces.
436\texttt{DIRECT} addressing can be used to address either low internal RAM (if the first bit is 0) or the SFRs (if the first bit is 1), for instance.
437That's why DIRECT uses 8-bit addresses but pointers to the low internal RAM only use 7 bits.
438The complexity of the memory model is captured in the \texttt{get\_arg\_XX} and \texttt{set\_arg\_XX} functions that get and set data of size \texttt{XX} from memory, considering all addressing modes
439
440%Overlapping, and checking which addressing modes can be used to address particular memory spaces, is handled through numerous \texttt{get\_arg\_XX} and \texttt{set\_arg\_XX} (for 1, 8 and 16 bits) functions.
441
442Both the Matita and O'Caml emulators follows the classic `fetch-decode-execute' model of processor operation.
443The next instruction to be processed, indexed by the program counter, is fetched from code memory with \texttt{fetch}.
444An updated program counter, along with the concrete cost, in processor cycles for executing this instruction, is also returned.
445These costs are taken from a Siemens Semiconductor Group data sheet for the MCS-51~\cite{siemens:2011}, and will likely vary across manufacturers and particular derivatives of the processor.
446\begin{lstlisting}
447definition fetch: BitVectorTrie Byte 16 $\rightarrow$ Word $\rightarrow$ instruction $\times$ Word $\times$ nat
448\end{lstlisting}
449Instruction are assembled to bit encodings by \texttt{assembly1}:
450\begin{lstlisting}
451definition assembly1: instruction $\rightarrow$ list Byte
452\end{lstlisting}
453An assembly program---comprising a preamble containing global data and a list of pseudoinstructions---is assembled using \texttt{assembly}.
454Pseudoinstructions and labels are eliminated in favour of instructions from the MCS-51 instruction set.
455A map associating memory locations and cost labels (see Subsection~\ref{subsect.computation.cost.traces}) is produced.
456\begin{lstlisting}
457definition assembly:
458  assembly_program $\rightarrow$ option (list Byte $\times$ (BitVectorTrie String 16))
459\end{lstlisting}
460A single fetch-decode-execute cycle is performed by \texttt{execute\_1}:
461\begin{lstlisting}
462definition execute_1: Status $\rightarrow$ Status
463\end{lstlisting}
464The \texttt{execute} functions performs a fixed number of cycles by iterating
465\texttt{execute\_1}:
466\begin{lstlisting}
467let rec execute (n: nat) (s: Status) on n: Status := ...
468\end{lstlisting}
469This differs from the O'Caml emulator, which executed a program indefinitely.
470A callback function was also accepted as an argument, which could `witness' the execution as it happened, providing a print-out of the processor state.
471Due to Matita's termination requirement, \texttt{execute} cannot execute a program indefinitely.
472An alternative would be to produce an infinite stream of statuses representing an execution trace.
473Matita supports infinite streams through co-inductive types.
474
475%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
476% SECTION                                                                      %
477%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
478\subsection{Instruction set unorthogonality}
479\label{subsect.instruction.set.unorthogonality}
480
481A peculiarity of the MCS-51 is the non-orthogonality of its instruction set.
482For instance, the \texttt{MOV} instruction, can be invoked using one of 16 combinations of addressing modes out of a possible 361.
483
484% Show example of pattern matching with polymorphic variants
485
486Such non-orthogonality in the instruction set was handled with the use of polymorphic variants in the O'Caml emulator.
487For instance, we introduced types corresponding to each addressing mode:
488\begin{lstlisting}
489type direct = [ `DIRECT of byte ]
490type indirect = [ `INDIRECT of bit ]
491...
492\end{lstlisting}
493Which were then combined in our inductive datatype for assembly instructions using the union operator `$|$':
494\begin{lstlisting}
495type 'addr preinstruction =
496 [ `ADD of acc * [ reg | direct | indirect | data ]
497...
498 | `MOV of
499    (acc * [ reg | direct | indirect | data ],
500     [ reg | indirect ] * [ acc | direct | data ],
501     direct * [ acc | reg | direct | indirect | data ],
502     dptr * data16,
503     carry * bit,
504     bit * carry
505     ) union6
506...
507\end{lstlisting}
508Here, \texttt{union6} is a disjoint union type, defined as follows:
509\begin{lstlisting}
510type ('a,'b,'c,'d,'e,'f) union6 = [ `U1 of 'a | ... | `U6 of 'f ]
511\end{lstlisting}
512For our purposes, the types \texttt{union2}, \texttt{union3} and \texttt{union6} sufficed.
513
514This polymorphic variant machinery worked well: it introduced a certain level of type safety (for instance, the type of our \texttt{MOV} instruction above guarantees it cannot be invoked with arguments in the \texttt{carry} and \texttt{data16} addressing modes, respectively), and also allowed us to pattern match against instructions, when necessary.
515However, this polymorphic variant machinery is \emph{not} present in Matita.
516We needed some way to produce the same effect, which Matita supported.
517For this task, we used dependent types.
518
519We first provided an inductive data type representing all possible addressing modes, a type that functions will pattern match against:
520\begin{lstlisting}
521inductive addressing_mode: Type[0] ≝
522  DIRECT: Byte $\rightarrow$ addressing_mode
523| INDIRECT: Bit $\rightarrow$ addressing_mode
524...
525\end{lstlisting}
526We also wished to express in the type of functions the \emph{impossibility} of pattern matching against certain constructors.
527In order to do this, we introduced an inductive type of addressing mode `tags'.
528The constructors of \texttt{addressing\_mode\_tag} are in one-to-one correspondence with the constructors of \texttt{addressing\_mode}:
529\begin{lstlisting}
530inductive addressing_mode_tag : Type[0] ≝
531  direct: addressing_mode_tag
532| indirect: addressing_mode_tag
533...
534\end{lstlisting}
535The \texttt{is\_a} function checks if an \texttt{addressing\_mode} matches an \texttt{addressing\_mode\_tag}:
536\begin{lstlisting}
537let rec is_a (d: addressing_mode_tag) (A: addressing_mode) on d :=
538  match d with
539   [ direct $\Rightarrow$ match A with [ DIRECT _ $\Rightarrow$ true | _ $\Rightarrow$ false ]
540   | indirect $\Rightarrow$ match A with [ INDIRECT _ $\Rightarrow$ true | _ $\Rightarrow$ false ]
541...
542\end{lstlisting}
543The \texttt{is\_in} function checks if an \texttt{addressing\_mode} matches a set of tags represented as a vector. It simply extends the \texttt{is\_a} function in the obvious manner.
544
545Finally, a \texttt{subaddressing\_mode} is an ad-hoc non empty $\Sigma$-type of addressing
546modes constrained to be in a given set of tags:
547\begin{lstlisting}
548record subaddressing_mode n (l: Vector addressing_mode_tag (S n)): Type[0] :=
549 { subaddressing_modeel :> addressing_mode;
550   subaddressing_modein: bool_to_Prop (is_in ? l subaddressing_modeel) }.
551\end{lstlisting}
552An implicit coercion is provided to promote vectors of tags (denoted with
553$\llbracket - \rrbracket$)
554to the
555corresponding \texttt{subaddressing\_mode} so that we can use a syntax
556close to the O'Caml one to specify preinstructions:
557\begin{lstlisting}
558inductive preinstruction (A: Type[0]): Type[0] ≝
559   ADD: $\llbracket$ acc_a $\rrbracket$ $\rightarrow$ $\llbracket$ register; direct; indirect; data $\rrbracket$ $\rightarrow$ preinstruction A
560 | ADDC: $\llbracket$ acc_a $\rrbracket$ $\rightarrow$ $\llbracket$ register; direct; indirect; data $\rrbracket$ $\rightarrow$ preinstruction A
561...
562\end{lstlisting}
563The constructor \texttt{ADD} expects two parameters, the first being the accumulator A (\texttt{acc\_a}), the second being a register, direct, indirect or data addressing mode.
564
565% One of these coercions opens up a proof obligation which needs discussing
566% Have lemmas proving that if an element is a member of a sub, then it is a member of a superlist, and so on
567The final, missing component is a pair of type coercions from \texttt{addressing\_mode} to \texttt{subaddressing\_mode} and from \texttt{subaddressing\_mode} to \texttt{Type$\lbrack0\rbrack$}, respectively.
568The first is a forgetful coercion, while the second opens a proof obligation wherein we must prove that the provided value is in the admissible set.
569These coercions were first introduced by PVS to implement subset types~\cite{shankar:principles:1999}, and later in Coq as an addition~\cite{sozeau:subset:2006}.
570In Matita all coercions can open proof obligations.
571
572Proof obligations impels us to state and prove a few auxilliary lemmas related
573to transitivity of subtyping. For instance, an addressing mode that belongs
574to an allowed set also belongs to any one of its super-set. At the moment,
575Matita's automation exploits these lemmas to completely solve all the proof
576obligations opened in our formalization, comprising the 200 proof obligations
577related to the main \texttt{execute\_1} function.
578
579The machinery just described allows us to restrict the set of addressing
580modes expected by a function and use this information during pattern matching
581to skip impossible cases.
582For instance, consider \texttt{set\_arg\_16}, which expects only a \texttt{DPTR}:
583\begin{lstlisting}
584definition set_arg_16: Status $\rightarrow$ Word $\rightarrow$ $\llbracket$ dptr $\rrbracket$ $\rightarrow$ Status ≝ $~\lambda$s, v, a.
585   match a return $\lambda$x. bool_to_Prop (is_in ? $\llbracket$ dptr $\rrbracket$ x) $\rightarrow$ ? with
586     [ DPTR $\Rightarrow$ $\lambda$_: True.
587       let 〈 bu, bl 〉 := split $\ldots$ eight eight v in
588       let status := set_8051_sfr s SFR_DPH bu in
589       let status := set_8051_sfr status SFR_DPL bl in
590         status
591     | _ $\Rightarrow$ $\lambda$_: False. $\bot$ ] $~$(subaddressing_modein $\ldots$ a).
592\end{lstlisting}
593We give a proof (the expression \texttt{(subaddressing\_modein} $\ldots$ \texttt{a)}) that the argument $a$ is in the set $\llbracket$ \texttt{dptr} $\rrbracket$ to the match expression.
594In every case but \texttt{DPTR}, the proof is a proof of \texttt{False}, and the system opens a proof obligation $\bot$ that can be discarded using \emph{ex falso}.
595Attempting to match against a disallowed addressing mode (replacing \texttt{False} with \texttt{True} in the branch) produces a type-error.
596
597Other dependently and non-dependently typed solutions we tried were clumsy in practice.
598As we need a large number of different combinations of addressing modes to describe the whole instruction set, it is unfeasible to declare a data type for each one of these combinations.
599The current solution is the one that best matches the corresponding O'Caml code, to the point that the translation from O'Caml to Matita is almost syntactical.
600We would like to investigate the possibility of changing the code extraction procedure of Matita to recognise this programming pattern and output O'Caml code using polymorphic variants.
601
602% Talk about extraction to O'Caml code, which hopefully will allow us to extract back to using polymorphic variants, or when extracting vectors we could extract using phantom types
603% Discuss alternative approaches, i.e. Sigma types to piece together smaller types into larger ones, as opposed to using a predicate to `cut out' pieces of a larger type, which is what we did
604
605%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
606% SECTION                                                                      %
607%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
608\subsection{I/O and timers}
609\label{subsect.i/o.timers}
610
611% `Real clock' for I/O and timers
612The O'Caml emulator has code for handling timers, asynchronous I/O and interrupts (these are not yet ported to the Matita emulator).
613All three of these features interact with each other in subtle ways.
614Interrupts can `fire' when an input is detected on the processor's UART port, and, in certain modes, timers reset when a high signal is detected on one of the MCS-51's communication pins.
615
616To accurately model timers and I/O, we add an unbounded integral field \texttt{clock} to the central \texttt{status} record.
617This field is only logical, since it does not represent any quantity stored in the actual processor, and is used to keep track of the current processor time.
618Before every execution step, \texttt{clock} is incremented by the number of processor cycles that the instruction just fetched will take to execute.
619The processor then executes the instruction, followed by the code implementing the timers and I/O\footnote{Though it isn't fully specified by the manufacturer's data sheets if I/O is handled at the beginning or the end of each cycle.}. In order to model I/O, we also store in the status a
620\emph{continuation} which is a description of the behaviour of the environment:
621\begin{lstlisting}
622type line =
623  [ `P1 of byte | `P3 of byte
624  | `SerialBuff of [ `Eight of byte | `Nine of BitVectors.bit * byte ]]
625type continuation =
626  [`In of time * line * epsilon * continuation] option *
627  [`Out of (time -> line -> time * continuation)]
628\end{lstlisting}
629At each moment, the second projection of the continuation $k$ describes how the environment will react to an output event performed in the future by the processor. Let $\pi_2(k)(\tau,o) = \langle \tau',k' \rangle$.
630If the processor at time $\tau$ starts an asynchronous output $o$ either on the P1 or P3 output lines, or on the UART, then the environment will receive the output at time $\tau'$.
631Moreover the status is immediately updated with the continuation $k'$.
632
633Further, if $\pi_1(k) = \mathtt{Some}~\langle \tau',i,\epsilon,k'\rangle$, then at time $\tau'$ the environment will send the asynchronous input $i$ to the processor and the status will be updated with the continuation $k'$.
634This input is visible to the processor only at time $\tau' + \epsilon$.
635
636The time required to perform an I/O operation is partially specified in the data sheets of the UART module.
637This computation is complex so we prefer to abstract over it.
638We leave the computation of the delay time to the environment.
639
640We use only the P1 and P3 lines despite the MCS-51 having four output lines, P0--P3.
641This is because P0 and P2 become inoperable if the processor is equipped with XRAM (which we assume it is).
642
643The UART port can work in several modes, depending on the how the SFRs are set.
644In an asyncrhonous mode, the UART transmits 8 bits at a time, using a ninth line for syncrhonization.
645In a syncrhonous mode the ninth line is used to transmit an additional bit.
646
647%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
648% SECTION                                                                      %
649%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
650\subsection{Computation of cost traces}
651\label{subsect.computation.cost.traces}
652
653As mentioned in Subsection~\ref{subsect.labels.pseudoinstructions} we introduced a notion of \emph{cost label}.
654Cost labels are inserted by the prototype C compiler in specific locations in the object code.
655Roughly, for those familiar with control flow graphs, they are inserted at the start of every basic block.
656
657Cost labels are used to calculate a precise costing for a program by marking the location of basic blocks.
658During the assembly phase, where labels and pseudoinstructions are eliminated, a map is generated associating cost labels with memory locations.
659This map is later used in a separate analysis which computes the cost of a program by traversing through a program, fetching one instruction at a time, and computing the cost of blocks.
660These block costings are stored in another map, and will later be passed back to the prototype compiler.
661
662%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
663% SECTION                                                                      %
664%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
665\section{Validation}
666\label{sect.validation}
667
668We spent considerable effort attempting to ensure that what we have formalised is an accurate model of the MCS-51 microprocessor.
669
670First, we made use of multiple data sheets, each from a different semiconductor manufacturer.  This helped us spot errors in the specification of the processor's instruction set, and its behaviour, for instance, in a datasheet from Philips.
671
672The O'Caml prototype was especially useful for validation purposes.
673This is because we wrote a module for parsing and loading the Intel HEX file format.
674HEX is a standard format that all compilers targetting the MCS-51, and similar processors, produce.
675It is essentially a snapshot of the processor's code memory in compressed form.
676Using this, we were able to compile C programs with SDCC, an open source compiler, and load the resulting program directly into our emulator's code memory, ready for execution.
677Further, we are able to produce a HEX file from our emulator's code memory, for loading into third party tools.
678After each step of execution, we can print out both the instruction that had been executed, along with its arguments, and a snapshot of the processor's state, including all flags and register contents.
679For example:
680\begin{frametxt}
681\begin{small}
682\begin{verbatim}
683...
68408: mov 81 #07
685
686 Processor status:                               
687
688   ACC: 0   B: 0   PSW: 0
689    with flags set as:
690     CY: false  AC: false  FO: false  RS1: false
691     RS0: false  OV: false UD: false  P: false
692   SP: 7  IP: 0  PC: 8  DPL: 0  DPH: 0  SCON: 0
693   SBUF: 0  TMOD: 0  TCON: 0
694   Registers:                                   
695    R0: 0  R1: 0  R2: 0  R3: 0  R4: 0  R5: 0  R6: 0  R7: 0
696...
697\end{verbatim}
698\end{small}
699\end{frametxt}
700Here, the traces indicates that the instruction \texttt{mov 81 \#07} has just been executed by the processor, which is now in the state indicated.
701These traces were useful in spotting anything that was `obviously' wrong with the execution of the program.
702
703Further, we used MCU 8051 IDE as a reference.
704Using our execution traces, we were able to step through a compiled program, one instruction at a time, in MCU 8051 IDE, and compare the resulting execution trace with the trace produced by our emulator.
705
706Our Matita formalisation was largely copied from the O'Caml source code, apart from changes related to addressing modes already mentioned.
707However, as the Matita emulator is executable, we could perform further validation by comparing the trace of a program's execution in the Matita emulator with the trace of the same program in the O'Caml emulator.
708
709%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
710% SECTION                                                                      %
711%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
712\section{Related work}
713\label{sect.related.work}
714There exists a large body of literature on the formalisation of microprocessors.
715The majority of it aims to prove correctness of the implementation of the microprocessor at the microcode or gate level.
716 We are interested in providing a precise specification of the behaviour of the microprocessor in order to prove the correctness of a compiler which will target the processor.
717In particular, we are interested in intensional properties of the processor; precise timings of instruction execution in clock cycles.
718Moreover, in addition to formalising the interface of an MCS-51 processor, we have also built a complete MCS-51 ecosystem: the UART, the I/O lines, and hardware timers, along with an assembler.
719
720Similar work to ours can be found in~\cite{fox:trustworthy:2010}.
721Here, the authors describe the formalisation, in HOL4, of the ARMv7 instruction set architecture, and point to a good list of references to related work in the literature.
722This formalisation also considers the machine code level, as opposed to only considering an abstract assembly language.
723In particular, instruction decoding is explicitly modelled inside HOL4's logic.
724We go further in also providing an assembly language, complete with assembler, to translate instructions and pseudoinstruction to machine code.
725
726Further, in~\cite{fox:trustworthy:2010} the authors validated their formalisation by using development boards and random testing.
727We currently rely on non-exhaustive testing against a third party emulator.
728We leave exhaustive testing for future work.
729
730Executability is another key difference between our work and~\cite{fox:trustworthy:2010}.
731Our formalisation is executable: applying the emulation function to an input state eventually reduces to an output state that already satisfies the appropriate conditions.
732This is because Matita is based on a logic that internalizes conversion.
733In~\cite{fox:trustworthy:2010} the authors provide an automation layer to derive single step theorems: if the processor is in a particular state that satisfies some preconditions, then after execution of an instruction it will reside in another state satisfying some postconditions.
734We do not need single step theorems of this form.
735
736Our main difficulties resided in the non-uniformity of an old 8-bit architecture, in terms of the instruction set, addressing modes and memory models.
737In contrast, the ARM instruction set and memory model is relatively uniform, simplifying any formalisation considerably.
738
739Perhaps the closest project to CerCo is CompCert~\cite{leroy:formally:2009}.
740CompCert concerns the certification of a C compiler and includes a formalisation in Coq of a subset of PowerPC.
741Coq and Matita essentially share the same logic.
742
743Despite this similarity, the two formalisations do not have much in common.
744First, CompCert provides a formalisation at the assembly level (no instruction decoding), and this impels them to trust an unformalised assembler and linker, whereas we provide our own. Our formalization is directly executable, while the
745one in CompCert only provides a relation that describes execution.
746I/O is also not considered at all in CompCert.
747Moreover an idealized abstract and uniform memory model is assumed, while we take into account the complicated overlapping memory model of the MCS-51 architecture.
748Finally, 82 instructions of the 200+ offered by the processor are formalised in CompCert, and the assembly language is augmented with macro instructions that are turned into `real' instructions only during communication with the external assembler.
749Even from a technical level the two formalisations differ: while we tried to exploit dependent types as often as possible, CompCert largely sticks to the non-dependent fragment of Coq.
750
751In~\cite{atkey:coqjvm:2007} Atkey presents an executable specification of the Java virtual machine which uses dependent types.
752As we do, dependent types are used to remove spurious partiality from the model, and to lower the need for over-specifying the behaviour of the processor in impossible cases.
753Our use of dependent types will also help to maintain invariants when we prove the correctness of the CerCo prototype compiler.
754
755Finally, Sarkar et al~\cite{sarkar:semantics:2009} provide an executable semantics for x86-CC multiprocessor machine code.
756This machine code exhibits a high degree of non-uniformity similar to the MCS-51.
757However, only a small subset of the instruction set is considered, and they over-approximate the possibilities of unorthogonality of the instruction set, largely dodging the problems we had to face.
758
759Further, it seems that the definition of the decode function is potentially error prone.
760A small domain specific language of patterns is formalised in HOL4.
761This is similar to the specification language of the x86 instruction set found in manufacturer's data sheets.
762A decode function is implemented by copying lines from data sheets into the proof script.
763
764We are currently considering implementing a similar domain specific language in Matita.
765However, we would prefer to certify in Matita the compiler for this language.
766Data sheets could then be compiled down to the efficient code that we currently provide, instead of inefficiently interpreting the data sheets every time an instruction is executed.
767
768%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
769% SECTION                                                                      %
770%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
771\section{Conclusions}
772\label{sect.conclusions}
773
774The CerCo project is interested in the certification of a compiler for C that induces a precise cost model on the source code.
775Our cost model assigns costs to blocks of instructions by tracing the way that blocks are compiled, and by computing exact costs on generated assembly code.
776To perform this accurately, we have provided an executable semantics for the MCS-51 family of processors, better known as 8051/8052.
777The formalisation was done twice, first in O'Caml and then in Matita, and captures the exact timings of the processor.
778Moreover, the O'Caml formalisation also considers timers and I/O.
779Adding support for I/O and timers in Matita is an on-going work that will not present any major problem, and was delayed only because the addition is not immediately useful for the formalisation of the CerCo compiler.
780
781The formalisation is done at machine level and not at assembly level; we also formalise fetching and decoding.
782We separately provide an assembly language, enhanched with labels and pseudoinstructions, and an assembler from this language to machine code.
783We introduce cost labels in the machine language to relate the data flow of the assembly program to that of the C source language, in order to associate costs to the C program.
784For the O'Caml version, we provide a parser and pretty printer from code memory to Intel HEX format.
785Hence we can perform testing on programs compiled using any free or commercial compiler.
786
787Our main difficulty in formalising the MCS-51 was the unorthogonality of its memory model and instruction set.
788These problems are easily handled in O'Caml by using advanced language features like polymorphic variants and phantom types, simulating Generalized Abstract Data Types.
789In Matita, we use dependent types to recover the same flexibility, to reduce spurious partiality, and to grant invariants that will be useful in the formalization of the CerCo compiler.
790
791The formalisation has been partially verified by computing execution traces on selected programs and comparing them with an existing emulator.
792All instructions have been tested at least once, but we have not yet pushed testing further, for example with random testing or by using development boards.
793I/O in particular has not been tested yet, and it is currently unclear how to provide exhaustive testing in the presence of I/O.
794Finally, we are aware of having over-specified the processor in several places, by fixing a behaviour hopefully consistent with the real machine, where manufacturer data sheets are ambiguous or under specified.
795
796\bibliography{itp-2011.bib}
797
798\end{document}
799
800\newpage
801
802\appendix
803
804\section{Listing of main O'Caml functions}
805\label{sect.listing.main.ocaml.functions}
806
807\subsubsection{From \texttt{ASMInterpret.ml(i)}}
808
809\begin{center}
810\begin{tabular*}{\textwidth}{p{3cm}@{\quad}p{9cm}}
811Name & Description \\
812\hline
813\texttt{assembly} & Assembles an abstract syntax tree representing an 8051 assembly program into a list of bytes, its compiled form. \\
814\texttt{initialize} & Initializes the emulator status. \\
815\texttt{load} & Loads an assembled program into the emulator's code memory. \\
816\texttt{fetch} & Fetches the next instruction, and automatically increments the program counter. \\
817\texttt{execute} & Emulates the processor.  Accepts as input a function that pretty prints the emulator status after every emulation loop. \\
818\end{tabular*}
819\end{center}
820
821\subsubsection{From \texttt{ASMCosts.ml(i)}}
822
823\begin{center}
824\begin{tabular*}{\textwidth}{p{3cm}@{\quad}p{9cm}}
825Name & Description \\
826\hline
827\texttt{compute} & Computes a map associating costings to basic blocks in the program.
828\end{tabular*}
829\end{center}
830
831\subsubsection{From \texttt{IntelHex.ml(i)}}
832
833\begin{center}
834\begin{tabular*}{\textwidth}{p{3cm}@{\quad}p{9cm}}
835Name & Description \\
836\hline
837\texttt{intel\_hex\_of\_file} & Reads in a file and parses it if in Intel IHX format, otherwise raises an exception. \\
838\texttt{process\_intel\_hex} & Accepts a parsed Intel IHX file and populates a hashmap (of the same type as code memory) with the contents.
839\end{tabular*}
840\end{center}
841
842\subsubsection{From \texttt{Physical.ml(i)}}
843
844\begin{center}
845\begin{tabular*}{\textwidth}{p{3cm}@{\quad}p{9cm}}
846Name & Description \\
847\hline
848\texttt{subb8\_with\_c} & Performs an eight bit subtraction on bitvectors.  The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\
849\texttt{add8\_with\_c} & Performs an eight bit addition on bitvectors.  The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\
850\texttt{dec} & Decrements an eight bit bitvector with underflow, if necessary. \\
851\texttt{inc} & Increments an eight bit bitvector with overflow, if necessary.
852\end{tabular*}
853\end{center}
854
855\newpage
856
857\section{Listing of main Matita functions}
858\label{sect.listing.main.matita.functions}
859
860\subsubsection{From \texttt{Arithmetic.ma}}
861
862\begin{center}
863\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
864Title & Description \\
865\hline
866\texttt{add\_n\_with\_carry} & Performs an $n$ bit addition on bitvectors.  The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\
867\texttt{sub\_8\_with\_carry} & Performs an eight bit subtraction on bitvectors. The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\
868\texttt{half\_add} & Performs a standard half addition on bitvectors, returning the result and carry bit. \\
869\texttt{full\_add} & Performs a standard full addition on bitvectors and a carry bit, returning the result and a carry bit.
870\end{tabular*}
871\end{center}
872
873\subsubsection{From \texttt{Assembly.ma}}
874
875\begin{center}
876\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
877Title & Description \\
878\hline
879\texttt{assemble1} & Assembles a single 8051 assembly instruction into its memory representation. \\
880\texttt{assemble} & Assembles an 8051 assembly program into its memory representation.\\
881\texttt{assemble\_unlabelled\_program} &\\& Assembles a list of (unlabelled) 8051 assembly instructions into its memory representation.
882\end{tabular*}
883\end{center}
884
885\subsubsection{From \texttt{BitVectorTrie.ma}}
886
887\begin{center}
888\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
889Title & Description \\
890\hline
891\texttt{lookup} & Returns the data stored at the end of a particular path (a bitvector) from the trie.  If no data exists, returns a default value. \\
892\texttt{insert} & Inserts data into a tree at the end of the path (a bitvector) indicated.  Automatically expands the tree (by filling in stubs) if necessary.
893\end{tabular*}
894\end{center}
895
896\subsubsection{From \texttt{DoTest.ma}}
897
898\begin{center}
899\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
900Title & Description \\
901\hline
902\texttt{execute\_trace} & Executes an assembly program for a fixed number of steps, recording in a trace which instructions were executed.
903\end{tabular*}
904\end{center}
905
906\subsubsection{From \texttt{Fetch.ma}}
907
908\begin{center}
909\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
910Title & Description \\
911\hline
912\texttt{fetch} & Decodes and returns the instruction currently pointed to by the program counter and automatically increments the program counter the required amount to point to the next instruction. \\
913\end{tabular*}
914\end{center}
915
916\subsubsection{From \texttt{Interpret.ma}}
917
918\begin{center}
919\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
920Title & Description \\
921\hline
922\texttt{execute\_1} & Executes a single step of an 8051 assembly program. \\
923\texttt{execute} & Executes a fixed number of steps of an 8051 assembly program.
924\end{tabular*}
925\end{center}
926
927\subsubsection{From \texttt{Status.ma}}
928
929\begin{center}
930\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
931Title & Description \\
932\hline
933\texttt{load} & Loads an assembled 8051 assembly program into code memory.
934\end{tabular*}
935\end{center}
936\end{document}
Note: See TracBrowser for help on using the repository browser.