\documentclass{llncs}

\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage[english]{babel}
\usepackage{color}
\usepackage{fancybox}
\usepackage{graphicx}
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage{mdwlist}
\usepackage{microtype}
\usepackage{stmaryrd}
\usepackage{url}

\newlength{\mylength}
\newenvironment{frametxt}%
        {\setlength{\fboxsep}{5pt}
                \setlength{\mylength}{\linewidth}%
                \Sbox
                \minipage{\mylength}%
                        \setlength{\abovedisplayskip}{0pt}%
                        \setlength{\belowdisplayskip}{0pt}%
                }%
                {\endminipage\endSbox
                        $\fbox{\TheSbox}$}

\lstdefinelanguage{matita-ocaml}
  {keywords={definition,coercion,lemma,theorem,remark,inductive,record,qed,let,in,rec,match,return,with,Type,try,on,to},
   morekeywords={[2]whd,normalize,elim,cases,destruct},
   morekeywords={[3]type,of,val,assert,let,function},
   mathescape=true,
  }
\lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false,
        keywordstyle=\color{red}\bfseries,
        keywordstyle=[2]\color{blue},
        keywordstyle=[3]\color{blue}\bfseries,
        stringstyle=\color{blue},
        showspaces=false,showstringspaces=false}
\lstset{extendedchars=false}
\lstset{inputencoding=utf8x}
\DeclareUnicodeCharacter{8797}{:=}
\DeclareUnicodeCharacter{10746}{++}
\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}

\author{Dominic P. Mulligan\thanks{The project CerCo acknowledges the financial support of the Future and
Emerging Technologies (FET) programme within the Seventh Framework
Programme for Research of the European Commission, under FET-Open grant
number: 243881} \and Claudio Sacerdoti Coen$^\star$}
\authorrunning{D. P. Mulligan and C. Sacerdoti Coen}
\title{An executable formalisation of the MCS-51 microprocessor in Matita}
\titlerunning{An executable formalisation of the MCS-51}
\institute{Dipartimento di Scienze dell'Informazione, Universit\`a di Bologna}

\bibliographystyle{plain}

\begin{document}

\maketitle

\begin{abstract}
We summarise our formalisation of an emulator for the MCS-51 microprocessor in the Matita proof assistant.
The MCS-51 is a widely used 8-bit microprocessor, especially popular in embedded devices.

We proceeded in two stages, first implementing a prototype emulator in O'Caml, where bugs could be `ironed out' quickly.
We then ported our O'Caml emulator to Matita's internal language.
Though mostly straightforward, this porting presented multiple problems.
Of particular interest is how we handle the extreme non-orthogonality of the MCS-51's instruction set.
In O'Caml, this was handled through heavy use of polymorphic variants.
In Matita, we achieve the same effect through a non-standard use of dependent types.

Both the O'Caml and Matita emulators are `executable'.
Assembly programs may be animated within Matita, producing a trace of instructions executed.

Our formalisation is a major component of the ongoing EU-funded CerCo project.
\end{abstract}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% SECTION                                                                      %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Background}
\label{sect.introduction}

Formal methods are designed to increase our confidence in the design and implementation of software (and hardware).
Ideally, we would like all software to come equipped with a formal specification, along with a proof of correctness that the software meets this specification.
Today the majority of programs are written in high level languages and then compiled into low level ones.
Specifications are therefore also given at a high level and correctness can be proved by reasoning automatically or interactively on the program's source code.
The code that is actually run, however, is not the high level source code that we reason on, but the object code that is generated by the compiler.
A few simple questions now arise:
\begin{itemize*}
\item
What properties are preserved during compilation?
\item
What properties are affected by the compilation strategy?
\item
To what extent can you trust your compiler in preserving those properties?
\end{itemize*}
These questions, and others like them, motivate a current `hot topic' in computer science research: \emph{compiler verification} (for instance~\cite{leroy:formal:2009,chlipala:verified:2010}, and many others).
So far, the field has been focused on the first and last questions only.
In particular, much attention has been placed on verifying compiler correctness with respect to extensional properties of programs, which are easily preserved during compilation; it is sufficient to completely preserve the denotational semantics of the input program.

However, if we consider intensional properties of programs, such as space, time or energy spent on computation and transmission of data, the situation is more complex.
To even be able to express these properties, and to be able to reason about them, we are forced to adopt a cost model that assigns a cost to single instructions, or to blocks of instructions.
Ideally, we would like to have a compositional cost model that assigns the same cost to all occurrences of one instruction.
However, compiler optimisations are inherently non-compositional: each occurrence of a high level instruction is usually compiled in a different way according to the context it finds itself in.
Therefore both the cost model and intensional specifications are affected by the compilation process.

In the current EU project CerCo (`Certified Complexity')~\cite{cerco:2011} we approach the problem of reasoning about intensional properties of programs as follows.
We are currently developing a compiler that induces a cost model on the high level source code.
Costs are assigned to each block of high level instructions by considering the costs of the corresponding blocks of compiled object code.
The cost model is therefore inherently non-compositional.
However, the model has the potential to be extremely \emph{precise}, capturing a program's \emph{realistic} cost, by taking into account, not ignoring, the compilation process.
A prototype compiler, where no approximation of the cost is provided, has been developed.
(The full technical details of the CerCo cost model are explained in~\cite{amadio:certifying:2010}.)

We believe that our approach is especially applicable to certifying real time programs.
Here, a user can certify that all `deadlines' are met whilst wringing as many clock cycles as possible from the processor, using a cost model that does not over-estimate.

Further, we see our approach as being relevant to the field of compiler verification (and construction) itself.
For instance, an optimisation specified only extensionally is only half specified; though the optimisation may preserve the denotational semantics of a program, there is no guarantee that any intensional properties of the program, such as space or time usage, will be improved.
Another potential application is toward completeness and correctness of the compilation process in the presence of space constraints.
Here, a compiler could potentially reject a source program targeting an embedded system when the size of the compiled code exceeds the available ROM size.
Moreover, preservation of a program's semantics may only be required for those programs that do not exhaust the stack or heap.
Hence the statement of completeness of the compiler must take into account a realistic cost model.

In the methodology proposed in CerCo we assume we are able to compute on the object code exact and realistic costs for sequential blocks of instructions.
With modern processors, though possible (see~\cite{bate:wcet:2011,yan:wcet:2008} for instance), it is difficult to compute exact costs or to reasonably approximate them.
This is because the execution of a program itself has an influence on the speed of processing.
For instance, caching, memory effects and other advanced features such as branch prediction all have a profound effect on execution speeds.
For this reason CerCo decided to focus on 8-bit microprocessors.
These are still widely used in embedded systems, and have the advantage of an easily predictable cost model due to the relative sparsity of features that they possess.

In particular, we have fully formalised an executable formal semantics of a family of 8-bit Freescale microprocessors~\cite{oliboni:matita:2008}, and provided a similar executable formal semantics for the MCS-51 microprocessor.
The latter work is what we describe in this paper.
The main focus of the formalisation has been on capturing the intensional behaviour of the processor.
However, the design of the MCS-51 itself has caused problems in our formalisation.
For example, the MCS-51 has a highly unorthogonal instruction set.
To cope with this unorthogonality, and to produce an executable specification, we have exploited the dependent type system of Matita, an interactive proof assistant.

\subsection{The 8051/8052}
\label{subsect.8051-8052}

The MCS-51 is an eight bit microprocessor introduced by Intel in the late 1970s.
Commonly called the 8051, in the three decades since its introduction the processor has become a highly popular target for embedded systems engineers.
Further, the processor, its immediate successor the 8052, and many derivatives are still manufactured \emph{en masse} by a host of semiconductor suppliers.

The 8051 is a well documented processor, and has the additional support of numerous open source and commercial tools, such as compilers for high-level languages and emulators.
For instance, the open source Small Device C Compiler (SDCC) recognises a dialect of C~\cite{sdcc:2010}, and other compilers targeting the 8051 for BASIC, Forth and Modula-2 are also extant.
An open source emulator for the processor, MCU-8051 IDE, is also available~\cite{mcu8051ide:2010}.
Both MCU-8051 IDE and SDCC were used profitably in the implementation of our formalisation.

\begin{figure}[t]
\setlength{\unitlength}{0.87pt}
\begin{picture}(410,250)(-50,200)
%\put(-50,200){\framebox(410,250){}}
\put(12,410){\makebox(80,0)[b]{Internal (256B)}}
\put(13,242){\line(0,1){165}}
\put(93,242){\line(0,1){165}}
\put(13,407){\line(1,0){80}}
\put(12,400){\makebox(0,0)[r]{0h}}  \put(14,400){\makebox(0,0)[l]{Register bank 0}}
\put(13,393){\line(1,0){80}}
\put(12,386){\makebox(0,0)[r]{8h}}  \put(14,386){\makebox(0,0)[l]{Register bank 1}}
\put(13,379){\line(1,0){80}}
\put(12,372){\makebox(0,0)[r]{10h}}  \put(14,372){\makebox(0,0)[l]{Register bank 2}}
\put(13,365){\line(1,0){80}}
\put(12,358){\makebox(0,0)[r]{18h}} \put(14,358){\makebox(0,0)[l]{Register bank 3}}
\put(13,351){\line(1,0){80}}
\put(13,323){\line(1,0){80}}
\put(12,316){\makebox(0,0)[r]{30h}}
\put(13,291){\line(1,0){80}}
\put(12,284){\makebox(0,0)[r]{80h}}
\put(12,249){\makebox(0,0)[r]{ffh}}
\put(13,242){\line(1,0){80}}

\qbezier(-2,407)(-6,407)(-6,393)
\qbezier(-6,393)(-6,324)(-10,324)
\put(-12,324){\makebox(0,0)[r]{Indirect/stack}}
\qbezier(-6,256)(-6,324)(-10,324)
\qbezier(-2,242)(-6,242)(-6,256)

\qbezier(94,407)(98,407)(98,393)
\qbezier(98,393)(98,349)(102,349)
\put(104,349){\makebox(0,0)[l]{Direct}}
\qbezier(98,305)(98,349)(102,349)
\qbezier(94,291)(98,291)(98,305)

\put(102,242){\framebox(20,49){SFR}}

\qbezier(124,291)(128,291)(128,277)
\qbezier(128,277)(128,266)(132,266)
\put(134,266){\makebox(0,0)[l]{Direct}}
\qbezier(128,257)(128,266)(132,266)
\qbezier(124,242)(128,242)(128,256)

\put(164,410){\makebox(80,0)[b]{External (64kB)}}
\put(164,220){\line(0,1){187}}
\put(164,407){\line(1,0){80}}
\put(244,220){\line(0,1){187}}
\put(164,242){\line(1,0){80}}
\put(163,400){\makebox(0,0)[r]{0h}}
\put(164,324){\makebox(80,0){Paged access}}
  \put(164,310){\makebox(80,0){Direct/indirect}}
\put(163,235){\makebox(0,0)[r]{80h}}
  \put(164,228){\makebox(80,0){\vdots}}
  \put(164,210){\makebox(80,0){Direct/indirect}}

\put(264,410){\makebox(80,0)[b]{Code (64kB)}}
\put(264,220){\line(0,1){187}}
\put(264,407){\line(1,0){80}}
\put(344,220){\line(0,1){187}}
\put(263,400){\makebox(0,0)[r]{0h}}
  \put(264,228){\makebox(80,0){\vdots}}
  \put(264,324){\makebox(80,0){Direct}}
  \put(264,310){\makebox(80,0){PC relative}}
\end{picture}
\caption{The 8051 memory model}
\label{fig.memory.layout}
\end{figure}

The 8051 has a relatively straightforward architecture, unencumbered by advanced features of modern processors, making it an ideal target for formalisation.
A high-level overview of the processor's memory layout, along with the ways in which different memory spaces may be addressed, is provided in Figure~\ref{fig.memory.layout}.

Processor RAM is divided into numerous segments, with the most prominent division being between internal and (optional) external memory.
Internal memory, commonly provided on the die itself with fast access, is composed of 256 bytes, but, in direct addressing mode, half of them are overloaded with 128 bytes of memory-mapped Special Function Registers (SFRs), which control the operation of the processor.
Internal RAM (IRAM) is further divided into eight general purpose bit-addressable registers (R0--R7).
These sit in the first eight bytes of IRAM, though they can be programmatically `shifted up' as needed.
Bit memory, followed by a small amount of stack space, resides in the memory space immediately after the register banks.
What remains of the IRAM may be treated as general purpose memory.
A schematic view of IRAM layout is also provided in Figure~\ref{fig.memory.layout}.

External RAM (XRAM), limited to a maximum size of 64 kilobytes, is optional, and may be provided on or off chip, depending on the manufacturer.
XRAM is accessed using a dedicated instruction, and requires sixteen bits to address fully.
External code memory (XCODE) is often stored in the form of an EPROM, and limited to 64 kilobytes in size.
However, depending on the particular manufacturer and processor model, a dedicated on-die read-only memory area for program code (ICODE) may also be supplied.

Memory may be addressed in numerous ways: immediate, direct, indirect, external direct and code indirect.
As the latter two addressing modes hint, there are some restrictions enforced by the 8051 and its derivatives on which addressing modes may be used with specific types of memory.
For instance, the 128 bytes of extra internal RAM that the 8052 features cannot be addressed using direct addressing, which reaches the SFRs at those addresses; rather, indirect addressing must be used. Moreover, some memory segments are addressed using 8-bit pointers while others require 16 bits.

The 8051 series possesses an eight bit Arithmetic and Logic Unit (ALU), with a wide variety of instructions for performing arithmetic and logical operations on bits and integers.
Further, the processor possesses two eight bit general purpose accumulators, A and B.

Communication with the device is facilitated by an onboard UART serial port, and associated serial controller, which can operate in numerous modes.
Serial baud rate is determined by one of two sixteen bit timers included with the 8051, which can be set to multiple modes of operation.
(The 8052 provides an additional sixteen bit timer.)
As an additional method of communication, the 8051 also provides a four byte bit-addressable input-output port.

The programmer may take advantage of the interrupt mechanism that the processor provides.
This is especially useful when dealing with input or output involving the serial device, as an interrupt can be set when a whole character is sent or received via the serial port.

Interrupts immediately halt the flow of execution of the processor, and cause the program counter to jump to a fixed address, where the requisite interrupt handler is stored.
However, interrupts may be set to one of two priorities: low and high.
The interrupt handler of an interrupt with high priority is executed ahead of the interrupt handler of an interrupt of lower priority, interrupting a currently executing handler of lower priority, if necessary.

The 8051 has interrupts disabled by default.
The programmer is free to handle serial input and output manually, by poking serial flags in the SFRs.
Similarly, `exceptional circumstances' that would otherwise trigger an interrupt on more modern processors, for example, division by zero, are also signalled by setting flags.

%\begin{figure}[t]
%\begin{center}
%\includegraphics[scale=0.5]{iramlayout.png}
%\end{center}
%\caption{Schematic view of 8051 IRAM layout}
%\label{fig.iram.layout}
%\end{figure}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% SECTION                                                                      %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Overview of paper}
\label{subsect.overview.paper}

In Section~\ref{sect.design.issues.formalisation} we discuss design issues in the development of the formalisation.
In Section~\ref{sect.validation} we discuss how we validated the design and implementation of our emulator to ensure that what we formalised was an accurate model of an MCS-51 series microprocessor.
In Section~\ref{sect.related.work} we describe previous work, with an eye toward describing its relation with the work described herein.
In Section~\ref{sect.conclusions} we conclude the paper.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% SECTION                                                                      %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Design issues in the formalisation}
\label{sect.design.issues.formalisation}

From here on, we typeset O'Caml source in \texttt{\color{blue}{blue}} and Matita source in \texttt{\color{red}{red}} to distinguish the two syntaxes.
Matita's syntax is largely straightforward to those familiar with Coq or O'Caml.
The only subtlety is the use of `\texttt{?}' in an argument position, denoting an argument that should be inferred automatically.

A full account of the formalisation can be found in~\cite{cerco-report:2011}.

\subsection{Development strategy}
\label{subsect.development.strategy}

Our implementation progressed in two stages.
We began with an emulator written in O'Caml.
We used this to `iron out' any bugs in our design and implementation within O'Caml's more permissive type system.
O'Caml's ability to perform file input-output also eased debugging and validation.
Once we were happy with the performance and design of the O'Caml emulator, we moved to the Matita formalisation.

Matita's syntax is lexically similar to O'Caml's.
This eased the translation, as large swathes of code were merely copy-pasted with minor modifications.
However, several major issues had to be addressed when moving from O'Caml to Matita.
These are now discussed.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% SECTION                                                                      %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Representation of bytes, words, etc.}
\label{subsect.representation.integers}

\begin{figure}[t]
\begin{minipage}[t]{0.45\textwidth}
\vspace{0pt}
\begin{lstlisting}
type 'a vect = bit list
type word = [`Sixteen] vect
type byte = [`Eight] vect
$\color{blue}{\mathtt{let}}$ split_word w = split_nth 8 w
$\color{blue}{\mathtt{let}}$ split_byte b = split_nth 4 b
\end{lstlisting}
\end{minipage}
%
\begin{minipage}[t]{0.55\textwidth}
\vspace{0pt}
\begin{lstlisting}
type 'a vect
type word = [`Sixteen] vect
type byte = [`Eight] vect
val split_word: word -> byte * byte
val split_byte: byte -> nibble * nibble
\end{lstlisting}
\end{minipage}
\caption{Sample of O'Caml implementation and interface for bitvectors module}
\label{fig.ocaml.implementation.bitvectors}
\end{figure}

The formalisation of the MCS-51 must deal with bytes (8 bits) and words (16 bits), but also with more esoteric quantities (7 bits, 3 bits, 9 bits).
To avoid hard-to-spot size mismatch bugs, we represent all of these quantities using bitvectors, i.e. fixed length vectors of booleans.
In our O'Caml emulator, we `faked' bitvectors using phantom types~\cite{leijen:domain:1999} implemented with polymorphic variants~\cite{garrigue:programming:1998}, as in Figure~\ref{fig.ocaml.implementation.bitvectors}.
From within the bitvector module (left column) bitvectors are just lists of bits and no guarantee is provided on sizes.
However, the module's interface (right column) enforces the size invariants in the rest of the code.
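
The phantom-type technique can be illustrated by the following minimal, self-contained sketch.
It is not taken from the emulator's sources: the module name \texttt{Bitvec} and the helpers \texttt{byte\_of\_int}, \texttt{int\_of\_nibble}, \texttt{bits\_of\_int} and \texttt{int\_of\_bits} are hypothetical, and bits are modelled as plain booleans for brevity.
\begin{lstlisting}
(* Hypothetical sketch: phantom-typed bitvectors behind an abstract interface. *)
module Bitvec : sig
  type 'a vect                      (* abstract: clients cannot see the list *)
  type nibble = [ `Four ] vect
  type byte = [ `Eight ] vect
  val byte_of_int : int -> byte
  val split_byte : byte -> nibble * nibble
  val int_of_nibble : nibble -> int
end = struct
  type 'a vect = bool list          (* inside the module, sizes are not tracked *)
  type nibble = [ `Four ] vect
  type byte = [ `Eight ] vect

  (* Most significant bit first; only the low n bits of i are kept. *)
  let bits_of_int n i =
    List.init n (fun k -> (i lsr (n - 1 - k)) land 1 = 1)

  let int_of_bits bs =
    List.fold_left (fun acc b -> (2 * acc) + (if b then 1 else 0)) 0 bs

  let byte_of_int i = bits_of_int 8 i

  (* Split a list after its first n elements. *)
  let rec split_nth n l =
    match n, l with
    | 0, _ -> ([], l)
    | _, [] -> invalid_arg "split_nth"
    | _, x :: tl -> let (a, b) = split_nth (n - 1) tl in (x :: a, b)

  let split_byte b = split_nth 4 b
  let int_of_nibble n = int_of_bits n
end

let hi, lo = Bitvec.split_byte (Bitvec.byte_of_int 0xA5)
\end{lstlisting}
Outside \texttt{Bitvec}, an expression such as \texttt{Bitvec.split\_byte hi} is rejected by the type checker, because \texttt{hi} is a \texttt{nibble} rather than a \texttt{byte}, even though both are lists of booleans inside the module.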

In Matita, we are able to use the full power of dependent types to always work with vectors of a known size:
\begin{lstlisting}
inductive Vector (A: Type[0]): nat → Type[0] ≝
  VEmpty: Vector A O
| VCons: ∀n: nat. A → Vector A n → Vector A (S n).
\end{lstlisting}
We define \texttt{BitVector} as a specialization of \texttt{Vector} to \texttt{bool}.
We may use Matita's type system to provide precise typing for functions that are
polymorphic in the size without having to duplicate the code as we did in O'Caml:
\begin{lstlisting}
let rec split (A: Type[0]) (m,n: nat) on m:
   Vector A (plus m n) $\rightarrow$ (Vector A m) $\times$ (Vector A n) := ...
\end{lstlisting}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% SECTION                                                                      %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Representing memory}
\label{subsect.representing.memory}

The MCS-51 has numerous disjoint memory segments addressed by pointers of
different sizes.
In our prototype implementation, we simply used a map datastructure (from the O'Caml standard library) for each segment.
Matita's standard library is relatively small, and does not contain a generic map datastructure. Therefore, we had the opportunity of crafting a dependently typed special-purpose datastructure for the job to enforce the correspondence between the size of pointers and the size of the segment.
We also worked under the assumption that large swathes of memory would often be uninitialized, trying to represent them concisely using stubs.

We picked a modified form of trie of fixed height $h$ where paths are
represented by bitvectors of length $h$, that are already used in our
\begin{lstlisting}
inductive BitVectorTrie (A: Type[0]): nat $\rightarrow$ Type[0] ≝
  Leaf: A $\rightarrow$ BitVectorTrie A 0
| Node: ∀n. BitVectorTrie A n $\rightarrow$ BitVectorTrie A n $\rightarrow$ BitVectorTrie A (S n)
| Stub: ∀n. BitVectorTrie A n.
\end{lstlisting}
Here, \texttt{Stub} is a constructor that can appear at any point in our tries.
It internalises the notion of `uninitialized data'.
Performing a lookup in memory is now straightforward.
We merely traverse a path, and if at any point we encounter a \texttt{Stub}, we return a default value\footnote{All manufacturer data sheets that we consulted were silent on the subject of what should be returned if we attempt to access uninitialized memory.  We defaulted to simply returning zero, though our \texttt{lookup} function is parametric in this choice.  We do not believe that this is an outrageous decision, as SDCC for instance generates code which first `zeroes out' all memory in a preamble before executing the program proper.  This is in line with the C standard, which guarantees that all global variables will be zero initialized piecewise.}.
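
To make the role of \texttt{Stub} concrete, the following O'Caml sketch mirrors the trie without tracking heights in types.
It is an illustration under that simplifying assumption, not a translation of the Matita code; the function \texttt{insert} and the argument order of \texttt{lookup} are hypothetical.
\begin{lstlisting}
(* Hypothetical OCaml analogue of the trie; heights are not tracked by types. *)
type 'a trie =
  | Leaf of 'a
  | Node of 'a trie * 'a trie   (* left: next bit is 0, right: next bit is 1 *)
  | Stub                        (* uninitialised subtree *)

(* Follow path (most significant bit first), returning default at a stub. *)
let rec lookup path t default =
  match path, t with
  | _, Stub -> default
  | [], Leaf v -> v
  | b :: rest, Node (l, r) -> lookup rest (if b then r else l) default
  | _ -> invalid_arg "lookup: path length does not match trie height"

(* Store v at path, expanding stubs only along that path. *)
let rec insert path v t =
  match path, t with
  | [], (Leaf _ | Stub) -> Leaf v
  | b :: rest, Stub ->
      if b then Node (Stub, insert rest v Stub)
      else Node (insert rest v Stub, Stub)
  | b :: rest, Node (l, r) ->
      if b then Node (l, insert rest v r) else Node (insert rest v l, r)
  | _ -> invalid_arg "insert: path length does not match trie height"

let mem = insert [true; false; true] 42 Stub
\end{lstlisting}
Here \texttt{lookup [true; false; true] mem 0} returns the stored value, whereas any other path of length three reaches a \texttt{Stub} and yields the default \texttt{0}.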

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% SECTION                                                                      %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Labels and pseudoinstructions}
\label{subsect.labels.pseudoinstructions}

Aside from implementing the core MCS-51 instruction set, we also provided \emph{pseudoinstructions}, \emph{labels} and \emph{cost labels}.
The purpose of \emph{cost labels} will be explained in Subsection~\ref{subsect.computation.cost.traces}.

Introducing pseudoinstructions had the effect of simplifying a C compiler (another component of the CerCo project) that was being implemented in parallel with our implementation.
To understand why this is so, consider the fact that the MCS-51's instruction set has numerous instructions for unconditional and conditional jumps to memory locations.
For instance, the instructions \texttt{AJMP}, \texttt{JMP} and \texttt{LJMP} all perform unconditional jumps.
However, these instructions differ in how large the maximum size of the offset of the jump to be performed can be.
Hence compilers that support separate compilation cannot directly compute these offsets and select the appropriate jump instructions. These operations are
also needlessly burdensome for compilers that do not perform separate compilation,
and are thus handled by the assembler, as we decided to do.

While introducing pseudoinstructions we also introduced labels for jump
locations and for global data.
To specify global data via labels, we have introduced the notion of a preamble
before the program to hold the association of labels to sizes of reserved space.
A pseudoinstruction \texttt{Mov} moves (16-bit) data stored at a label into the (16-bit) register \texttt{DPTR}.

Our pseudoinstructions and labels induce an assembly language similar to that of SDCC. All pseudoinstructions and labels are `assembled away', prior to program execution, using a preprocessing stage. Jumps are computed in two stages.
The first stage builds a map associating memory addresses to labels, with the second stage replacing pseudojumps with concrete jumps to the correct address. The algorithm currently implemented does not try to minimise the object code size by always picking the shortest possible jump instruction. The choice of an optimal algorithm is currently left as future work.
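
The two stages can be sketched as follows.
This is a simplified illustration rather than the assembler itself: the types \texttt{pseudo} and \texttt{concrete}, the three-byte size given to every jump, and the use of an association list as the map are all assumptions made for the example.
\begin{lstlisting}
(* Hypothetical two-pass resolution of labels; sizes are invented for the sketch. *)
type pseudo =
  | Label of string     (* marks an address, occupies no space *)
  | Jump of string      (* pseudojump to a label *)
  | Instr of int        (* a concrete instruction of the given size in bytes *)

type concrete =
  | LJMP of int         (* long jump to an absolute address *)
  | Other of int

let size = function Label _ -> 0 | Jump _ -> 3 | Instr n -> n

(* Stage 1: map every label to the address of whatever follows it. *)
let label_map prog =
  let step (addr, map) i =
    let map = match i with Label l -> (l, addr) :: map | _ -> map in
    (addr + size i, map)
  in
  snd (List.fold_left step (0, []) prog)

(* Stage 2: drop labels and replace each pseudojump by a long jump. *)
let resolve prog =
  let map = label_map prog in
  List.filter_map
    (function
      | Label _ -> None
      | Jump l -> Some (LJMP (List.assoc l map))
      | Instr n -> Some (Other n))
    prog

let example = resolve [Jump "end"; Instr 2; Label "end"; Instr 1]
\end{lstlisting}
In this sketch an undefined label makes \texttt{List.assoc} raise \texttt{Not\_found}; a total version would return an option instead.
Always emitting the long form keeps the first stage independent of the second, at the price of larger object code.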

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% SECTION                                                                      %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Anatomy of the (Matita) emulator}
\label{subsect.anatomy.matita.emulator}

The internal state of our Matita emulator is represented as a record:
\begin{lstlisting}
record Status: Type[0] ≝ {
  code_memory: BitVectorTrie Byte 16;
  low_internal_ram: BitVectorTrie Byte 7;
  high_internal_ram: BitVectorTrie Byte 7;
  external_ram: BitVectorTrie Byte 16;
  program_counter: Word;
  special_function_registers_8051: Vector Byte 19;
  special_function_registers_8052: Vector Byte 5;
  ...  }.
\end{lstlisting}
This record neatly encapsulates the current memory contents, the program counter, the state of the current SFRs, and so on.

Here we chose to represent the MCS-51 memory model using four disjoint
segments plus the SFRs. From the programmer's point of view, however, what
matters are the addressing modes, which are in a many-to-many relation with the
segments. For instance, the \texttt{DIRECT} addressing mode can be used to
address either low internal RAM (if the first bit is 0) or the SFRs (if the first bit is 1). This is why \texttt{DIRECT} uses an 8-bit address, whereas pointers to the low
internal RAM only have 7 bits. Hence the complexity of the memory model
is encapsulated in the \texttt{get\_arg\_XX} and \texttt{set\_arg\_XX}
functions, which get and set data of size \texttt{XX} from
memory by considering all possible addressing modes.
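
As an illustration of this dispatch, the following sketch decodes an 8-bit \texttt{DIRECT} address.
The type \texttt{location} and the function \texttt{decode\_direct} are hypothetical names, and addresses are modelled as O'Caml integers rather than bitvectors.
\begin{lstlisting}
(* Hypothetical decoding of an 8-bit DIRECT address into a segment and offset. *)
type location =
  | Low_internal_ram of int   (* 7-bit offset, 0..127 *)
  | Sfr of int                (* full 8-bit address, 128..255 *)

let decode_direct addr =
  if addr < 0 || addr > 0xFF then invalid_arg "decode_direct"
  else if addr land 0x80 = 0 then Low_internal_ram (addr land 0x7F)
  else Sfr addr
\end{lstlisting}
A getter in the style of \texttt{get\_arg\_XX} would first perform such a case analysis and only then consult the corresponding segment of the status record.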

%Overlapping, and checking which addressing modes can be used to address particular memory spaces, is handled through numerous \texttt{get\_arg\_XX} and \texttt{set\_arg\_XX} (for 1, 8 and 16 bits) functions.

Both the Matita and O'Caml emulators follow the classic `fetch-decode-execute' model of processor operation.
The next instruction to be processed, indexed by the program counter, is fetched from code memory with \texttt{fetch}.
An updated program counter, along with the concrete cost, in processor cycles, of executing this instruction, is also returned.
These costs are taken from a Siemens Semiconductor Group data sheet for the MCS-51~\cite{siemens:2011}, and will likely vary across manufacturers and particular derivatives of the processor.
\begin{lstlisting}
definition fetch: BitVectorTrie Byte 16 $\rightarrow$ Word $\rightarrow$ instruction $\times$ Word $\times$ nat
\end{lstlisting}
Instructions are assembled to bit encodings by \texttt{assembly1}:
\begin{lstlisting}
definition assembly1: instruction $\rightarrow$ list Byte
\end{lstlisting}
An assembly program, consisting of a preamble containing global data, and a list of (pseudo)instructions, is assembled using \texttt{assembly}.
Pseudoinstructions and labels are eliminated in favour of concrete instructions from the MCS-51 instruction set.
A map associating memory locations and cost labels (see Subsection~\ref{subsect.computation.cost.traces}) is also produced.
\begin{lstlisting}
definition assembly:
  assembly_program $\rightarrow$ option (list Byte $\times$ (BitVectorTrie String 16))
\end{lstlisting}
A single fetch-decode-execute cycle is performed by \texttt{execute\_1}:
\begin{lstlisting}
definition execute_1: Status $\rightarrow$ Status
\end{lstlisting}
The \texttt{execute} function performs a fixed number of cycles by iterating
\texttt{execute\_1}:
\begin{lstlisting}
let rec execute (n: nat) (s: Status) on n: Status := ...
\end{lstlisting}
This differs slightly from the design of the O'Caml emulator, which executed a program indefinitely, and also accepted a callback function as an argument, which could `witness' the execution as it happened, and provide a print-out of the processor state, and other debugging information.
Due to Matita's requirement that all functions be strongly normalizing, \texttt{execute} cannot execute a program indefinitely. An alternative is to
produce an infinite stream of statuses representing the execution trace.
Infinite streams are encodable in Matita as co-inductive types.
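
The contrast between the two designs can be sketched in O'Caml itself.
The code below is illustrative only: \texttt{step} stands in for \texttt{execute\_1} on a toy state consisting of just a program counter, and O'Caml's lazy \texttt{Seq.t} type plays the role that a co-inductive stream would play in Matita.
\begin{lstlisting}
(* Hypothetical sketch: bounded iteration versus a lazy, unbounded trace. *)

(* Apply step exactly n times; terminates because n decreases. *)
let rec execute step n s =
  if n <= 0 then s else execute step (n - 1) (step s)

(* The infinite trace s, step s, step (step s), ... as a lazy sequence. *)
let rec trace step s : 'a Seq.t =
  fun () -> Seq.Cons (s, trace step (step s))

(* Observe the first n states of a trace. *)
let rec take n (seq : 'a Seq.t) =
  if n <= 0 then []
  else
    match seq () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest

(* A toy status: a program counter that advances by two each cycle. *)
let step pc = pc + 2
\end{lstlisting}
Only as much of \texttt{trace step 0} is computed as an observer such as \texttt{take} demands, which is what allows an unbounded execution to be described by terminating definitions.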
482
483%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
484% SECTION                                                                      %
485%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
486\subsection{Instruction set unorthogonality}
487\label{subsect.instruction.set.unorthogonality}
488
489A peculiarity of the MCS-51 is the non-orthogonality of its instruction set.
For instance, the \texttt{MOV} instruction can be invoked with only sixteen of the 361 possible combinations of addressing modes.
491
492% Show example of pattern matching with polymorphic variants
493
494Such non-orthogonality in the instruction set was handled with the use of polymorphic variants in the O'Caml emulator.
495For instance, we introduced types corresponding to each addressing mode:
496\begin{lstlisting}
type direct = [ `DIRECT of byte ]
type indirect = [ `INDIRECT of bit ]
...
500\end{lstlisting}
These were then combined in our inductive datatype for assembly instructions using the union operator `$|$':
502\begin{lstlisting}
 [ `ADD of acc * [ reg | direct | indirect | data ]
...
 | `MOV of
    (acc * [ reg | direct | indirect | data ],
     [ reg | indirect ] * [ acc | direct | data ],
     direct * [ acc | reg | direct | indirect | data ],
     dptr * data16,
     carry * bit,
     bit * carry
     ) union6
...
515\end{lstlisting}
516Here, \texttt{union6} is a disjoint union type, defined as follows:
517\begin{lstlisting}
type ('a,'b,'c,'d,'e,'f) union6 = [ `U1 of 'a | ... | `U6 of 'f ]
519\end{lstlisting}
520For our purposes, the types \texttt{union2}, \texttt{union3} and \texttt{union6} sufficed.
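
To make the role of these types concrete, here is a small self-contained sketch of how polymorphic variants restrict the operands a function accepts.
The operand types are simplified to carry plain integers, and \texttt{describe\_dest} and \texttt{describe\_src} are hypothetical helpers that do not appear in the emulator.
\begin{lstlisting}
(* Hypothetical operand types, simplified to carry plain integers. *)
type reg = [ `REG of int ]
type direct = [ `DIRECT of int ]
type data = [ `DATA of int ]

(* The argument type lists exactly the admissible addressing modes. *)
let describe_dest (d : [ reg | direct ]) : string =
  match d with
  | `REG n -> "R" ^ string_of_int n
  | `DIRECT a -> "direct " ^ string_of_int a

(* A wider operand set reuses the narrower function on a sub-pattern. *)
let describe_src (s : [ reg | direct | data ]) : string =
  match s with
  | `DATA v -> "#" ^ string_of_int v
  | (#reg | #direct) as d -> describe_dest d
\end{lstlisting}
Passing a \texttt{DATA} operand to \texttt{describe\_dest} is rejected by the type checker rather than at run time, which is the kind of safety the union types above provide.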
521
This polymorphic variant machinery worked well: it introduced a certain level of type safety (for instance, the type of our \texttt{MOV} instruction above guarantees it cannot be invoked with arguments in the \texttt{carry} and \texttt{data16} addressing modes, respectively), and also allowed us to pattern match against instructions when necessary.
However, this polymorphic variant machinery is \emph{not} present in Matita.
We needed a way to produce the same effect using features that Matita does support.
For this task, we used dependent types.
526
527We first provided an inductive data type representing all possible addressing modes, a type that functions will pattern match against:
528\begin{lstlisting}
inductive addressing_mode: Type[0] ≝
  DIRECT: Byte $\rightarrow$ addressing_mode
| INDIRECT: Bit $\rightarrow$ addressing_mode
...
533\end{lstlisting}
534We also wished to express in the type of functions the \emph{impossibility} of pattern matching against certain constructors.
In order to do this, we introduced an inductive type of addressing mode `tags'.
536The constructors of \texttt{addressing\_mode\_tag} are in one-to-one correspondence with the constructors of \texttt{addressing\_mode}:
537\begin{lstlisting}
...
542\end{lstlisting}
544\begin{lstlisting}
  match d with
   [ direct $\Rightarrow$ match A with [ DIRECT _ $\Rightarrow$ true | _ $\Rightarrow$ false ]
   | indirect $\Rightarrow$ match A with [ INDIRECT _ $\Rightarrow$ true | _ $\Rightarrow$ false ]
...
550\end{lstlisting}
551The \texttt{is\_in} function checks if an \texttt{addressing\_mode} matches a set of tags represented as a vector. It simply extends the \texttt{is\_a} function in the obvious manner.
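
The relationship between the two functions can be sketched in O'Caml as follows.
This is only an approximation under stated assumptions: both types are cut down to four hypothetical constructors, and the set of tags is an ordinary list rather than the length-indexed vector used in the Matita development.
\begin{lstlisting}
(* Simplified, hypothetical versions of the two types. *)
type addressing_mode = DIRECT of int | INDIRECT of bool | DPTR | ACC_A
type addressing_mode_tag = Direct | Indirect | Dptr | Acc_a

(* Does addressing mode a carry the constructor named by tag d? *)
let is_a (d : addressing_mode_tag) (a : addressing_mode) : bool =
  match d, a with
  | Direct, DIRECT _ | Indirect, INDIRECT _ -> true
  | Dptr, DPTR | Acc_a, ACC_A -> true
  | _ -> false

(* Does a match at least one tag in the (list-encoded) set? *)
let is_in (tags : addressing_mode_tag list) (a : addressing_mode) : bool =
  List.exists (fun d -> is_a d a) tags
\end{lstlisting}
In this rendering membership is merely a run-time boolean; the dependent-type encoding goes further by turning the boolean into a proposition that is carried alongside the value.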
552
Finally, a \texttt{subaddressing\_mode} is an ad-hoc non-empty $\Sigma$-type of addressing
modes constrained to be in a given set of tags:
555\begin{lstlisting}
559\end{lstlisting}
560An implicit coercion is provided to promote vectors of tags (denoted with
561$\llbracket - \rrbracket$)
562to the
563corresponding \texttt{subaddressing\_mode} so that we can use a syntax
564close to the O'Caml one to specify preinstructions:
565\begin{lstlisting}
inductive preinstruction (A: Type[0]): Type[0] ≝
   ADD: $\llbracket$ acc_a $\rrbracket$ $\rightarrow$ $\llbracket$ register; direct; indirect; data $\rrbracket$ $\rightarrow$ preinstruction A
 | ADDC: $\llbracket$ acc_a $\rrbracket$ $\rightarrow$ $\llbracket$ register; direct; indirect; data $\rrbracket$ $\rightarrow$ preinstruction A
...
570\end{lstlisting}
571We see that the constructor \texttt{ADD} expects two parameters, the first being the accumulator A (\texttt{acc\_a}), and the second being one of a register, direct, indirect or data addressing mode.
572
573% One of these coercions opens up a proof obligation which needs discussing
574% Have lemmas proving that if an element is a member of a sub, then it is a member of a superlist, and so on
The final, missing component is a pair of type coercions from \texttt{addressing\_mode} to \texttt{subaddressing\_mode} and from \texttt{subaddressing\_mode} to \texttt{Type$\lbrack0\rbrack$}, respectively.
The second is simply a forgetful coercion, while the first opens
a proof obligation wherein we must prove that the provided value is in the
admissible set. Coercions of this kind were first introduced in PVS to
implement subset types~\cite{pvs?} and later in Coq as additional machinery~\cite{russell}. In Matita all coercions can open proof obligations.
580
Proof obligations impel us to state and prove a few auxiliary lemmas related
to the transitivity of subtyping. For instance, an addressing mode that belongs
to an allowed set also belongs to any of its supersets. At the moment,
Matita's automation exploits these lemmas to completely solve all the proof
obligations opened in our formalisation, including the 200 proof obligations
related to the main \texttt{execute\_1} function.
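
Stated slightly more formally, and writing $\mathit{is\_in}(l, a)$ for the membership test, the superset property has roughly the following shape.
The quantifier structure is our reading of the prose above, not a quotation of the Matita lemma:
\[
  \forall a, l_1, l_2.\;
  (\forall t.\; t \in l_1 \Rightarrow t \in l_2) \;\wedge\;
  \mathit{is\_in}(l_1, a) = \mathit{true}
  \;\Longrightarrow\;
  \mathit{is\_in}(l_2, a) = \mathit{true}
\]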
587
588The machinery just described allows us to restrict the set of addressing
589modes expected by a function and use this information during pattern matching
590to skip impossible cases.
591For instance, consider \texttt{set\_arg\_16}, which expects only a \texttt{DPTR}:
592\begin{lstlisting}
definition set_arg_16: Status $\rightarrow$ Word $\rightarrow$ $\llbracket$ dptr $\rrbracket$ $\rightarrow$ Status ≝ $~\lambda$s, v, a.
   match a return $\lambda$x. bool_to_Prop (is_in ? $\llbracket$ dptr $\rrbracket$ x) $\rightarrow$ ? with
     [ DPTR $\Rightarrow$ $\lambda$_: True.
       let 〈 bu, bl 〉 := split $\ldots$ eight eight v in
       let status := set_8051_sfr s SFR_DPH bu in
       let status := set_8051_sfr status SFR_DPL bl in
         status
     | _ $\Rightarrow$ $\lambda$_: False. $\bot$ ] $~$(subaddressing_modein $\ldots$ a).
601\end{lstlisting}
We feed to the pattern matching the proof \texttt{(subaddressing\_modein} $\ldots$ \texttt{a)} that the argument $a$ is in the set $\llbracket$ \texttt{dptr} $\rrbracket$. In all cases but \texttt{DPTR}, the proof is a proof of \texttt{False}, and we can ask the system to open a proof obligation $\bot$ that will be discharged automatically.
Attempting to match against a disallowed addressing mode
(replacing \texttt{False} with \texttt{True} in the branch) will produce a
type error.
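
For comparison, an O'Caml rendering of the same function using polymorphic variants might look as follows.
This is a sketch under simplifying assumptions: the \texttt{status} record is a toy with three integer fields, words and bytes are plain integers, and the two special function registers are updated directly rather than through \texttt{set\_8051\_sfr}.
\begin{lstlisting}
(* Toy status (hypothetical): the accumulator and the two halves of DPTR. *)
type status = { acc : int; dph : int; dpl : int }

(* The argument type admits a single tag, so one branch is exhaustive. *)
let set_arg_16 (s : status) (v : int) (a : [ `DPTR ]) : status =
  match a with
  | `DPTR ->
    let bu = (v lsr 8) land 0xFF in  (* upper byte of the 16-bit word *)
    let bl = v land 0xFF in          (* lower byte of the 16-bit word *)
    { s with dph = bu; dpl = bl }
\end{lstlisting}
Here the type checker rules out the impossible cases directly, whereas in Matita the same guarantee is obtained from the membership proof passed to the match.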
607
All the other dependently and non-dependently typed solutions we tried before
the current one proved to be suboptimal in practice. In particular, since
we need a large number of different combinations of addressing modes to describe
the whole instruction set, it is infeasible to declare a data type for each
one of these combinations. Moreover, the current solution is the one that
best matches the corresponding O'Caml code, to the point that the translation
from O'Caml to Matita is almost purely syntactic. We would also like to
investigate the possibility of changing the code extraction procedure of
Matita to recognize this programming pattern and output the original
code based on polymorphic variants.
618
619% Talk about extraction to O'Caml code, which hopefully will allow us to extract back to using polymorphic variants, or when extracting vectors we could extract using phantom types
620% Discuss alternative approaches, i.e. Sigma types to piece together smaller types into larger ones, as opposed to using a predicate to cut out' pieces of a larger type, which is what we did
621
622%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
623% SECTION                                                                      %
624%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
625\subsection{I/O and timers}
626\label{subsect.i/o.timers}
627
628% Real clock' for I/O and timers
629The O'Caml emulator has code for handling timers, asynchronous I/O and interrupts (these are not yet ported to the Matita emulator).
630All three of these features interact with each other in subtle ways.
For instance, interrupts can `fire' when an input is detected on the processor's UART port, and, in certain modes, timers reset when a high signal is detected on one of the MCS-51's communication pins.
632
633To accurately model timers and I/O, we add an unbounded integral field \texttt{clock} to the central \texttt{status} record.
634This field is only logical, since it does not represent any quantity stored in the actual processor, and is used to keep track of the current processor time.
635Before every execution step, \texttt{clock} is incremented by the number of processor cycles that the instruction just fetched will take to execute.
The processor then executes the instruction, followed by the code implementing the timers and I/O\footnote{Though it is not fully specified by the manufacturer's data sheets whether I/O is handled at the beginning or the end of each cycle.}. In order to model I/O, we also store in the status a \emph{continuation}, which is a description of the behaviour of the environment:
638\begin{lstlisting}
type line =
  [ `P1 of byte | `P3 of byte
  | `SerialBuff of [ `Eight of byte | `Nine of BitVectors.bit * byte ]]
type continuation =
  [`In of time * line * epsilon * continuation] option *
  [`Out of (time -> line -> time * continuation)]
645\end{lstlisting}
646At each moment, the second projection of the continuation $k$ describes how the environment will react to an output event performed in the future by the processor.
647If the processor at time $\tau$ starts an asynchronous output $o$ either on the P1 or P3 output lines, or on the UART, then the environment will receive the output at time $\tau'$.
Moreover, the status is immediately updated with the continuation $k'$ where $\pi_2(k)(\tau,o) = \langle \tau',k' \rangle$.
649
650Further, if $\pi_1(k) = \mathtt{Some}~\langle \tau',i,\epsilon,k'\rangle$, then at time $\tau'$ the environment will send the asynchronous input $i$ to the processor and the status will be updated with the continuation $k'$.
651This input will become visible to the processor only at time $\tau' + \epsilon$.
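
The protocol just described can be illustrated with a small self-contained sketch.
It is not the emulator's code: for simplicity the continuation is a record rather than a pair of polymorphic variants, lines carry plain integers, and the environment \texttt{quiet} together with its two-tick delay is invented for the example.
\begin{lstlisting}
type time = int
type line = P1 of int | P3 of int

(* Environment behaviour: an optional pending input, plus a reaction
   to any future output of the processor. *)
type continuation = {
  input : (time * line * time * continuation) option;
    (* tau', i, epsilon, k' *)
  output : time -> line -> time * continuation;
    (* the second projection of k in the text *)
}

(* Hypothetical environment: never sends input, and receives every
   output two ticks after the processor starts it. *)
let rec quiet : continuation =
  { input = None; output = (fun tau _ -> (tau + 2, quiet)) }

(* Processor side: start output o at time tau; the result is the
   delivery time and the continuation to store back in the status. *)
let emit (k : continuation) (tau : time) (o : line) : time * continuation =
  k.output tau o
\end{lstlisting}
Because the delay is computed by the environment, different timing assumptions can be modelled by swapping in a different continuation without touching the processor model.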
652
653The time required to perform an I/O operation is partially specified in the data sheets of the UART module.
654However, this computation is complex so we prefer to abstract over it.
655We therefore leave the computation of the delay time to the environment.
656
657We use only the P1 and P3 lines despite the MCS-51 having four output lines, P0--P3.
658This is because P0 and P2 become inoperable if the processor is equipped with XRAM (which we assume it is).
659
The UART port can work in several modes, depending on how the SFRs are set.
In an asynchronous mode, the UART transmits eight bits at a time, using a ninth line for synchronization.
In a synchronous mode the ninth line is used to transmit an additional bit.
663
664%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
665% SECTION                                                                      %
666%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
667\subsection{Computation of cost traces}
668\label{subsect.computation.cost.traces}
669
670As mentioned in Subsection~\ref{subsect.labels.pseudoinstructions} we introduced a notion of \emph{cost label}.
671Cost labels are inserted by the prototype C compiler in specific locations in the object code.
672Roughly, for those familiar with control flow graphs, they are inserted at the start of every basic block.
673
674Cost labels are used to calculate a precise costing for a program by marking the location of basic blocks.
675During the assembly phase, where labels and pseudoinstructions are eliminated, a map is generated associating cost labels with memory locations.
This map is later used in a separate analysis, which computes the cost of a program by traversing it, fetching one instruction at a time, and computing the cost of each block.
677These block costings are stored in another map, and will later be passed back to the prototype compiler.
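
The idea behind the block costing can be sketched as follows.
The sketch makes strong simplifying assumptions that the real analysis does not: the program is a straight-line list with no jumps, each entry is a pair of an optional cost label and a cycle count, and the result is an association list rather than a map.
The function \texttt{block\_costs} is a hypothetical name.
\begin{lstlisting}
(* Each entry: the cost label attached to the instruction (if any)
   and the number of cycles the instruction takes. *)
let block_costs (prog : (string option * int) list) : (string * int) list =
  (* Record the block currently being accumulated, if there is one. *)
  let close cur acc =
    match cur with None -> acc | Some (l, c) -> (l, c) :: acc
  in
  let rec go cur acc = function
    | [] -> List.rev (close cur acc)
    | (Some l, c) :: rest ->
      (* A cost label starts a new block. *)
      go (Some (l, c)) (close cur acc) rest
    | (None, c) :: rest ->
      (match cur with
       | Some (l, t) -> go (Some (l, t + c)) acc rest
       | None -> go None acc rest)  (* code before the first label *)
  in
  go None [] prog
\end{lstlisting}
Instructions that precede the first cost label are ignored in this sketch, since there is no label to charge them to.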
678
679%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
680% SECTION                                                                      %
681%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
682\section{Validation}
683\label{sect.validation}
684
685We spent considerable effort attempting to ensure that our formalisation is correct, that is, what we have formalised really is an accurate model of the MCS-51 microprocessor.
686
687First, we made use of multiple data sheets, each from a different semiconductor manufacturer.
688This helped us spot errors in the specification of the processor's instruction set, and its behaviour.
689
690The O'Caml prototype was especially useful for validation purposes.
691This is because we wrote a module for parsing and loading the Intel HEX file format.
HEX is a standard format that all compilers targeting the MCS-51, and similar processors, produce.
693It is essentially a snapshot of the processor's code memory in compressed form.
694Using this, we were able to compile C programs with SDCC, an open source compiler, and load the resulting program directly into our emulator's code memory, ready for execution.
695Further, we are able to produce a HEX file from our emulator's code memory, for loading into third party tools.
696After each step of execution, we can print out both the instruction that had been executed, along with its arguments, and a snapshot of the processor's state, including all flags and register contents.
697For example:
698\begin{frametxt}
699\begin{verbatim}
...
08: mov 81 #07

 Processor status:

   ACC: 0   B: 0   PSW: 0
    with flags set as:
     CY: false  AC: false  FO: false  RS1: false
     RS0: false  OV: false  UD: false  P: false
   SP: 7  IP: 0  PC: 8  DPL: 0  DPH: 0  SCON: 0
   SBUF: 0  TMOD: 0  TCON: 0
   Registers:
    R0: 0  R1: 0  R2: 0  R3: 0  R4: 0  R5: 0  R6: 0  R7: 0
...
714\end{verbatim}
715\end{frametxt}
Here, the trace indicates that the instruction \texttt{mov 81 \#07} has just been executed by the processor, which is now in the state indicated.
These traces were useful in spotting anything that was `obviously' wrong with the execution of the program.
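
To give a flavour of the HEX loading step mentioned above, the following sketch parses a single Intel HEX record.
Each record is a colon followed by hexadecimal byte pairs: a byte count, a 16-bit address, a record type, the data bytes, and a checksum chosen so that all bytes sum to zero modulo 256.
The function \texttt{parse\_record} is a hypothetical helper written for this illustration, not the parser used in our emulator.
\begin{lstlisting}
(* Parse one Intel HEX record.  Returns (address, record type, data)
   or None if the line is malformed or its checksum is wrong. *)
let parse_record (line : string) : (int * int * int list) option =
  let n = String.length line in
  if n < 11 || line.[0] <> ':' || (n - 1) mod 2 <> 0 then None
  else
    match
      (* Decode every pair of hexadecimal digits after the colon. *)
      List.init ((n - 1) / 2) (fun i ->
          int_of_string ("0x" ^ String.sub line (1 + 2 * i) 2))
    with
    | exception Failure _ -> None
    | count :: hi :: lo :: typ :: rest ->
      let sum = List.fold_left ( + ) (count + hi + lo + typ) rest in
      (* rest holds the data bytes followed by the checksum byte. *)
      if List.length rest <> count + 1 || sum land 0xFF <> 0 then None
      else
        let data = List.filteri (fun i _ -> i < count) rest in
        Some ((hi lsl 8) lor lo, typ, data)
    | _ -> None
\end{lstlisting}
A loader would apply such a function to each line of the file and copy the data bytes of records of type 0 into code memory at the given address, stopping at the end-of-file record (type 1).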
718
719Further, we used MCU 8051 IDE as a reference.
720Using our execution traces, we were able to step through a compiled program, one instruction at a time, in MCU 8051 IDE, and compare the resulting execution trace with the trace produced by our emulator.
721
722Our Matita formalisation was largely copied from the O'Caml source code, apart from changes related to addressing modes already mentioned.
723However, as the Matita emulator is executable, we could perform further validation by comparing the trace of a program's execution in the Matita emulator with the trace of the same program in the O'Caml emulator.
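
Comparing two traces amounts to locating the first step at which they disagree.
A minimal sketch, assuming that both traces have already been collected as lists of comparable states (the name \texttt{first\_divergence} is hypothetical):
\begin{lstlisting}
(* Index of the first step at which two traces differ, or None if
   they are identical.  A trace that ends early counts as diverging
   at the step where it ends. *)
let rec first_divergence ?(step = 0) (xs : 'a list) (ys : 'a list)
  : int option =
  match xs, ys with
  | [], [] -> None
  | x :: xs', y :: ys' when x = y ->
    first_divergence ~step:(step + 1) xs' ys'
  | _ -> Some step
\end{lstlisting}
The reported index identifies the instruction whose emulation should be inspected first.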
724
725%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
726% SECTION                                                                      %
727%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
728\section{Related work}
729\label{sect.related.work}
730There exists a large body of literature on the formalisation of microprocessors.
731The majority of it aims to prove correctness of the implementation of the microprocessor at the microcode or gate level.
732However, we are interested in providing a precise specification of the behaviour of the microprocessor in order to prove the correctness of a compiler which will target the processor.
In particular, we are interested in intensional properties of the processor: precise timings of instruction execution in clock cycles.
734Moreover, in addition to formalising the interface of an MCS-51 processor, we have also built a complete MCS-51 ecosystem: the UART, the I/O lines, and hardware timers, along with an assembler.
735
736Similar work to ours can be found in~\cite{fox:trustworthy:2010}.
737Here, the authors describe the formalisation, in HOL4, of the ARMv7 instruction set architecture, and point to a good list of references to related work in the literature.
738This formalisation also considers the machine code level, as opposed to only considering an abstract assembly language.
739In particular, instruction decoding is explicitly modelled inside HOL4's logic.
However, we go further in also providing an assembly language, complete with assembler, to translate instructions and pseudoinstructions to machine code.
741
742Further, in~\cite{fox:trustworthy:2010} the authors validated their formalisation by using development boards and random testing.
743However, we currently rely on non-exhaustive testing against a third party emulator.
744We leave similar exhaustive testing for future work.
745
746Executability is another key difference between our work and~\cite{fox:trustworthy:2010}.
747Our formalisation is executable: applying the emulation function to an input state eventually reduces to an output state that already satisfies the appropriate conditions.
748This is because Matita is based on a logic that internalizes conversion.
749In~\cite{fox:trustworthy:2010} the authors provide an automation layer to derive single step theorems: if the processor is in a particular state that satisfies some preconditions, then after execution of an instruction it will reside in another state satisfying some postconditions.
750We do not need single step theorems of this form.
751
752Our main difficulties resided in the non-uniformity of an old 8-bit architecture, in terms of the instruction set, addressing modes and memory models.
In contrast, the ARM instruction set and memory model are relatively uniform, simplifying any formalisation considerably.
754
755Perhaps the closest project to CerCo is CompCert~\cite{leroy:formal:2009,leroy:formally:2009,blazy:formal:2006}.
756CompCert concerns the certification of an ARM compiler and includes a formalisation in Coq of a subset of ARM.
757Coq and Matita essentially share the same logic.
758
759Despite this similarity, the two formalisations do not have much in common.
760First, CompCert provides a formalisation at the assembly level (no instruction decoding), and this impels them to trust an unformalised assembler and linker, whereas we provide our own.
761I/O is also not considered at all in CompCert.
762Moreover an idealized abstract and uniform memory model is assumed, while we take into account the complicated overlapping memory model of the MCS-51 architecture.
Finally, around 90 instructions of the 200+ offered by the processor are formalised in CompCert, and the assembly language is augmented with macro instructions that are turned into `real' instructions only during communication with the external assembler.
Even at a technical level the two formalisations differ: while we tried to exploit dependent types as often as possible, CompCert largely sticks to the non-dependent fragment of Coq.
765
766In~\cite{atkey:coqjvm:2007} Atkey presents an executable specification of the Java virtual machine which uses dependent types.
As in our work, dependent types are used to remove spurious partiality from the model, and to lower the need for over-specifying the behaviour of the processor in impossible cases.
768Our use of dependent types will also help to maintain invariants when we prove the correctness of the CerCo prototype compiler.
769
Finally, in~\cite{sarkar:semantics:2009} Sarkar et al.\ provide an executable semantics for x86-CC multiprocessor machine code.
771This machine code exhibits a high degree of non-uniformity similar to the MCS-51.
772However, only a very small subset of the instruction set is considered, and they over-approximate the possibilities of unorthogonality of the instruction set, largely dodging the problems we had to face.
773
774Further, it seems that the definition of the decode function is potentially error prone.
775A small domain specific language of patterns is formalised in HOL4.
This is similar to the specification language of the x86 instruction set found in manufacturers' data sheets.
777A decode function is implemented by copying lines from data sheets into the proof script.
778
779We are currently considering implementing a similar domain specific language in Matita.
780However, we would prefer to certify in Matita the compiler for this language.
781Data sheets could then be compiled down to the efficient code that we currently provide, instead of inefficiently interpreting the data sheets every time an instruction is executed.
782
783%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
784% SECTION                                                                      %
785%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
786\section{Conclusions}
787\label{sect.conclusions}
788
789\CSC{Tell what is NOT formalized/formalizable: the HEX parser/pretty printer
790 and/or the I/O procedure}
791\CSC{Decode: two implementations}
792\CSC{Discuss over-specification}
793
794- WE FORMALIZE ALSO I/O ETC. NOT ONLY THE INSTRUCTION SELECTION (??)
795  How to test it? Specify it?
796
797\bibliography{itp-2011.bib}
798
799\end{document}
800
801\newpage
802
803\appendix
804
805\section{Listing of main O'Caml functions}
806\label{sect.listing.main.ocaml.functions}
807
808\subsubsection{From \texttt{ASMInterpret.ml(i)}}
809
\begin{center}
\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
812Name & Description \\
813\hline
814\texttt{assembly} & Assembles an abstract syntax tree representing an 8051 assembly program into a list of bytes, its compiled form. \\
815\texttt{initialize} & Initializes the emulator status. \\
816\texttt{load} & Loads an assembled program into the emulator's code memory. \\
817\texttt{fetch} & Fetches the next instruction, and automatically increments the program counter. \\
818\texttt{execute} & Emulates the processor.  Accepts as input a function that pretty prints the emulator status after every emulation loop. \\
819\end{tabular*}
820\end{center}
821
822\subsubsection{From \texttt{ASMCosts.ml(i)}}
823
\begin{center}
\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
826Name & Description \\
827\hline
828\texttt{compute} & Computes a map associating costings to basic blocks in the program.
829\end{tabular*}
830\end{center}
831
832\subsubsection{From \texttt{IntelHex.ml(i)}}
833
\begin{center}
\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
836Name & Description \\
837\hline
838\texttt{intel\_hex\_of\_file} & Reads in a file and parses it if in Intel IHX format, otherwise raises an exception. \\
839\texttt{process\_intel\_hex} & Accepts a parsed Intel IHX file and populates a hashmap (of the same type as code memory) with the contents.
840\end{tabular*}
841\end{center}
842
843\subsubsection{From \texttt{Physical.ml(i)}}
844
\begin{center}
\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
847Name & Description \\
848\hline
849\texttt{subb8\_with\_c} & Performs an eight bit subtraction on bitvectors.  The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\
850\texttt{add8\_with\_c} & Performs an eight bit addition on bitvectors.  The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\
851\texttt{dec} & Decrements an eight bit bitvector with underflow, if necessary. \\
852\texttt{inc} & Increments an eight bit bitvector with overflow, if necessary.
853\end{tabular*}
854\end{center}
855
856\newpage
857
858\section{Listing of main Matita functions}
859\label{sect.listing.main.matita.functions}
860
861\subsubsection{From \texttt{Arithmetic.ma}}
862
863\begin{center}
864\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
865Title & Description \\
866\hline
867\texttt{add\_n\_with\_carry} & Performs an $n$ bit addition on bitvectors.  The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\
868\texttt{sub\_8\_with\_carry} & Performs an eight bit subtraction on bitvectors. The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\
869\texttt{half\_add} & Performs a standard half addition on bitvectors, returning the result and carry bit. \\
870\texttt{full\_add} & Performs a standard full addition on bitvectors and a carry bit, returning the result and a carry bit.
871\end{tabular*}
872\end{center}
873
874\subsubsection{From \texttt{Assembly.ma}}
875
876\begin{center}
877\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
878Title & Description \\
879\hline
880\texttt{assemble1} & Assembles a single 8051 assembly instruction into its memory representation. \\
881\texttt{assemble} & Assembles an 8051 assembly program into its memory representation.\\
882\texttt{assemble\_unlabelled\_program} &\\& Assembles a list of (unlabelled) 8051 assembly instructions into its memory representation.
883\end{tabular*}
884\end{center}
885
886\subsubsection{From \texttt{BitVectorTrie.ma}}
887
888\begin{center}
889\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
890Title & Description \\
891\hline
892\texttt{lookup} & Returns the data stored at the end of a particular path (a bitvector) from the trie.  If no data exists, returns a default value. \\
893\texttt{insert} & Inserts data into a tree at the end of the path (a bitvector) indicated.  Automatically expands the tree (by filling in stubs) if necessary.
894\end{tabular*}
895\end{center}
896
897\subsubsection{From \texttt{DoTest.ma}}
898
899\begin{center}
900\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
901Title & Description \\
902\hline
903\texttt{execute\_trace} & Executes an assembly program for a fixed number of steps, recording in a trace which instructions were executed.
904\end{tabular*}
905\end{center}
906
907\subsubsection{From \texttt{Fetch.ma}}
908
909\begin{center}
910\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
911Title & Description \\
912\hline
913\texttt{fetch} & Decodes and returns the instruction currently pointed to by the program counter and automatically increments the program counter the required amount to point to the next instruction. \\
914\end{tabular*}
915\end{center}
916
917\subsubsection{From \texttt{Interpret.ma}}
918
919\begin{center}
920\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
921Title & Description \\
922\hline
923\texttt{execute\_1} & Executes a single step of an 8051 assembly program. \\
924\texttt{execute} & Executes a fixed number of steps of an 8051 assembly program.
925\end{tabular*}
926\end{center}
927
928\subsubsection{From \texttt{Status.ma}}
929
930\begin{center}
931\begin{tabular*}{\textwidth}{p{3cm}p{9cm}}
932Title & Description \\
933\hline