source: Deliverables/D4.1/ITP-Paper/itp-2011.tex @ 536

Last change on this file since 536 was 536, checked in by mulligan, 7 years ago

more added

File size: 42.8 KB
Line 
1\documentclass{llncs}
2
3\usepackage{amsfonts}
4\usepackage{amsmath}
5\usepackage{amssymb} 
6\usepackage[english]{babel}
7\usepackage{color}
8\usepackage{fancybox}
9\usepackage{graphicx}
10\usepackage[utf8x]{inputenc}
11\usepackage{listings}
12\usepackage{mdwlist}
13\usepackage{microtype}
14\usepackage{stmaryrd}
15\usepackage{url}
16
17\newlength{\mylength} 
18\newenvironment{frametxt}%
19        {\setlength{\fboxsep}{5pt}
20                \setlength{\mylength}{\linewidth}%
21                \addtolength{\mylength}{-2\fboxsep}%
22                \addtolength{\mylength}{-2\fboxrule}%
23                \Sbox
24                \minipage{\mylength}%
25                        \setlength{\abovedisplayskip}{0pt}%
26                        \setlength{\belowdisplayskip}{0pt}%
27                }%
28                {\endminipage\endSbox
29                        \[\fbox{\TheSbox}\]}
30
31\lstdefinelanguage{matita-ocaml}
32  {keywords={definition,coercion,lemma,theorem,remark,inductive,record,qed,let,in,rec,match,return,with,Type,try,on},
33   morekeywords={[2]whd,normalize,elim,cases,destruct},
34   morekeywords={[3]type,of,val,assert,let,function},
35   mathescape=true,
36  }
37\lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false,
38        keywordstyle=\color{red}\bfseries,
39        keywordstyle=[2]\color{blue},
40        keywordstyle=[3]\color{blue}\bfseries,
41        commentstyle=\color{green},
42        stringstyle=\color{blue},
43        showspaces=false,showstringspaces=false}
44\lstset{extendedchars=false}
45\lstset{inputencoding=utf8x}
46\DeclareUnicodeCharacter{8797}{:=}
47\DeclareUnicodeCharacter{10746}{++}
48\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
49\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}
50
51% Who is first author?  Is CSC classed as a `S', or as a `C'?
52\author{Dominic P. Mulligan \and Claudio Sacerdoti Coen}
53\authorrunning{D. P. Mulligan and C. Sacerdoti Coen}
54\title{An executable formalisation of the MCS-51 microprocessor in Matita}
55\titlerunning{An executable formalisation of the MCS-51}
56\institute{Dipartimento di Scienze dell'Informazione, University of Bologna}
57
58\begin{document}
59
60\maketitle
61
62\begin{abstract}
63We summarise our formalisation of an emulator for the MCS-51 microprocessor in the Matita proof assistant.
64The MCS-51 is a widely used 8-bit microprocessor, especially popular in embedded devices.
65
66We proceeded in two stages, first implementing in O'Caml a prototype emulator, where bugs could be `ironed out' quickly.
67We then ported our O'Caml emulator to Matita's internal language.
68Though mostly straight-forward, this porting presented multiple problems.
69Of particular interest is how we handle the extreme non-orthoganality of the MSC-51's instruction set.
70In O'Caml, this was handled through heavy use of polymorphic variants.
71In Matita, we achieve the same effect through a non-standard use of dependent types.
72
73Both the O'Caml and Matita emulators are `executable'.
74Assembly programs may be animated within Matita, producing a trace of instructions executed.
75
76Our formalisation is a major component of the ongoing EU-funded CerCo project.
77\end{abstract}
78
79%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
80% SECTION                                                                      %
81%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
82\section{Background}
83\label{sect.introduction}
84
85Formal methods are designed to increase our confidence in the design and implementation of software (and hardware).
86Ideally, we would like all software to come equipped with a formal specification, along with a proof of correctness for the implementation.
87Today practically all programs are written in high level languages and then compiled into low level ones.
88Specifications are therefore also given at a high level and correctness can be proved by reasoning automatically or interactively on the program's source code.
89The code that is actually run, however, is not the high level source code that we reason on, but the object code that is generated by the compiler.
90
91A few simple questions now arise:
92\begin{itemize*}
93\item
94What properties are preserved during compilation?
95\item
96What properties are affected by the compilation strategy?
97\item
98To what extent can you trust your compiler in preserving those properties?
99\end{itemize*}
100These questions, and others like them, motivate a current `hot topic' in computer science research: \emph{compiler verification}.
101So far, the field has been focused on the first and last questions only.
102In particular, much attention has been placed on verifying compiler correctness with respect to extensional properties of programs, which are easily preserved during compilation; it is sufficient to completely preserve the denotational semantics of the input program.
103
104However, if we consider intensional properties of programs---such as space, time or energy spent into computation and transmission of data---the situation is more complex.
105To even be able to express these properties, and to be able to reason about them, we are forced to adopt a cost model that assigns a cost to single, or blocks, of instructions.
106Ideally, we would like to have a compositional cost model that assigns the same cost to all occurrences of one instruction.
107However, compiler optimisations are inherently non-compositional: each occurrence of a high level instruction is usually compiled in a different way according to the context it finds itself in.
108Therefore both the cost model and intensional specifications are affected by the compilation process.
109
110In the current EU project CerCo (`Certified Complexity') we approach the problem of reasoning about intensional properties of programs as follows.
111We are currently developing a compiler that induces a cost model on the high level source code.
112Costs are assigned to each block of high level instructions by considering the costs of the corresponding blocks of compiled object code.
113The cost model is therefore inherently non-compositional.
114However, the model has the potential to be extremely \emph{precise}, capturing a program's \emph{realistic} cost, by taking into account, not ignoring, the compilation process.
115A prototype compiler, where no approximation of the cost is provided, has been developed.
116
117We believe that our approach is especially applicable to certifying real time programs.
118Here, a user can certify that all `deadlines' are met whilst wringing as many clock cycles from the processor---using a cost model that does not over-estimate---as possible.
119
120Further, we see our approach as being relevant to the field of compiler verification (and construction) itself.
121For instance, an optimisation specified only extensionally is only half specified; though the optimisation may preserve the denotational semantics of a program, there is no guarantee that any intensional properties of the program, such as space or time usage, will be improved.
122Another potential application is toward completeness and correctness of the compilation process in the presence of space constraints.
123Here, a compiler could potentially reject a source program targetting an embedded system when the size of the compiled code exceeds the available ROM size.
124Moreover, preservation of a program's semantics may only be required for those programs that do not exhaust the stack or heap.
125Hence the statement of completeness of the compiler must take in to account a realistic cost model.
126
127In the methodology proposed in CerCo we assume we are able to compute on the object code exact and realistic costs for sequential blocks of instructions.
128With modern processors, though possible~\cite{??,??,??}, it is difficult to compute exact costs or to reasonably approximate them.
129This is because the execution of a program itself has an influence on the speed of processing.
130For instance, caching, memory effects and other advanced features such as branch prediction all have a profound effect on execution speeds.
131For this reason CerCo decided to focus on 8-bit microprocessors.
132These are still widely used in embedded systems, and have the advantage of an easily predictable cost model due to the relative sparcity of features that they possess.
133
134In particular, we have fully formalised an executable formal semantics of a family of 8 bit Freescale Microprocessors~\cite{oliboni}, and provided a similar executable formal semantics for the MCS-51 microprocessor.
135The latter work is what we describe in this paper.
136The main focus of the formalisation has been on capturing the intensional behaviour of the processor.
137However, the design of the MCS-51 itself has caused problems in our formalisation.
138For example, the MCS-51 has a highly unorthogonal instruction set.
139To cope with this unorthogonality, and to produce an executable specification, we have exploited the dependent type system of Matita, an interactive proof assistant.
140
141\subsection{The 8051/8052}
142\label{subsect.8051-8052}
143
144The MCS-51 is an eight bit microprocessor introduced by Intel in the late 1970s.
145Commonly called the 8051, in the three decades since its introduction the processor has become a highly popular target for embedded systems engineers.
146Further, the processor, its immediate successor the 8052, and many derivatives are still manufactured \emph{en masse} by a host of semiconductor suppliers.
147
148The 8051 is a well documented processor, and has the additional support of numerous open source and commercial tools, such as compilers for high-level languages and emulators.
149For instance, the open source Small Device C Compiler (SDCC) recognises a dialect of C, and other compilers targeting the 8051 for BASIC, Forth and Modula-2 are also extant.
150An open source emulator for the processor, MCU-8051 IDE, is also available.
151Both MCU-8051 IDE and SDCC were used profitably in the implementation of our formalisation.
152
153\begin{figure}[t]
154\begin{center}
155\includegraphics[scale=0.5]{memorylayout.png}
156\end{center}
157\caption{High level overview of the 8051 memory layout}
158\label{fig.memory.layout}
159\end{figure}
160
161The 8051 has a relatively straightforward architecture, unencumbered by advanced features of modern processors, making it an ideal target for formalisation.
162A high-level overview of the processor's memory layout is provided in Figure~\ref{fig.memory.layout}.
163
164Processor RAM is divided into numerous segments, with the most prominent division being between internal and (optional) external memory.
165Internal memory, commonly provided on the die itself with fast access, is further divided into 128 bytes of internal RAM and numerous Special Function Registers (SFRs) which control the operation of the processor.
166Internal RAM (IRAM) is further divided into eight general purpose bit-addressable registers (R0--R7).
167These sit in the first eight bytes of IRAM, though can be programmatically `shifted up' as needed.
168Bit memory, followed by a small amount of stack space, resides in the memory space immediately after the register banks.
169What remains of the IRAM may be treated as general purpose memory.
170A schematic view of IRAM layout is provided in Figure~\ref{fig.iram.layout}.
171
172External RAM (XRAM), limited to a maximum size of 64 kilobytes, is optional, and may be provided on or off chip, depending on the manufacturer.
173XRAM is accessed using a dedicated instruction, and requires sixteen bits to address fully.
174External code memory (XCODE) is often stored in the form of an EPROM, and limited to 64 kilobytes in size.
175However, depending on the particular manufacturer and processor model, a dedicated on-die read-only memory area for program code (ICODE) may also be supplied.
176
177Memory may be addressed in numerous ways: immediate, direct, indirect, external direct and code indirect.
178As the latter two addressing modes hint, there are some restrictions enforced by the 8051 and its derivatives on which addressing modes may be used with specific types of memory.
179For instance, the 128 bytes of extra internal RAM that the 8052 features cannot be addressed using indirect addressing; rather, external (in)direct addressing must be used.
180
181The 8051 series possesses an eight bit Arithmetic and Logic Unit (ALU), with a wide variety of instructions for performing arithmetic and logical operations on bits and integers.
182Further, the processor possesses two eight bit general purpose accumulators, A and B.
183
184Communication with the device is facilitated by an onboard UART serial port, and associated serial controller, which can operate in numerous modes.
185Serial baud rate is determined by one of two sixteen bit timers included with the 8051, which can be set to multiple modes of operation.
186(The 8052 provides an additional sixteen bit timer.)
187As an additional method of communication, the 8051 also provides a four byte bit-addressable input-output port.
188
189The programmer may take advantage of the interrupt mechanism that the processor provides.
190This is especially useful when dealing with input or output involving the serial device, as an interrupt can be set when a whole character is sent or received via the serial port.
191
192Interrupts immediately halt the flow of execution of the processor, and cause the program counter to jump to a fixed address, where the requisite interrupt handler is stored.
193However, interrupts may be set to one of two priorities: low and high.
194The interrupt handler of an interrupt with high priority is executed ahead of the interrupt handler of an interrupt of lower priority, interrupting a currently executing handler of lower priority, if necessary.
195
196The 8051 has interrupts disabled by default.
197The programmer is free to handle serial input and output manually, by poking serial flags in the SFRs.
198Similarly, `exceptional circumstances' that would otherwise trigger an interrupt on more modern processors, for example, division by zero, are also signalled by setting flags.
199
200\begin{figure}[t]
201\begin{center}
202\includegraphics[scale=0.5]{iramlayout.png}
203\end{center}
204\caption{Schematic view of 8051 IRAM layout}
205\label{fig.iram.layout}
206\end{figure}
207
208%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
209% SECTION                                                                      %
210%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
211\subsection{Overview of paper}
212\label{subsect.overview.paper}
213
214In Section~\ref{sect.development.strategy} we provide a brief overview of how we designed and implemented the formalised microprocessor emulator.
215In Section~\ref{sect.design.issues.formalisation} we describe how we made use of dependent types to handle some of the idiosyncracies of the microprocessor.
216In Section~\ref{sect.related.work} we describe the relation our work has to
217
218%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
219% SECTION                                                                      %
220%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
221\section{Design issues in the formalisation}
222\label{sect.design.issues.formalisation}
223
224From hereonin, we typeset O'Caml source with blue and Matita source with red to distinguish between the two similar syntaxes.
225
226\subsection{Development strategy}
227\label{subsect.development.strategy}
228
229Our implementation progressed in two stages:
230
231\paragraph{O'Caml prototype}
232We began with an emulator written in O'Caml.
233We used this to `iron out' any bugs in our design and implementation within O'Caml's more permissive type system.
234O'Caml's ability to perform file input-output also eased debugging and validation.
235Once we were happy with the performance and design of the O'Caml emulator, we moved to the Matita formalisation.
236
237\paragraph{Matita formalisation}
238Matita's syntax is lexically similar to O'Caml's.
239This eased the translation, as large swathes of code were merely copy-pasted with minor modifications.
240However, several major issues had to be addresses when moving from O'Caml to Matita.
241These are now discussed.
242
243%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
244% SECTION                                                                      %
245%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
246\subsection{Representation of integers}
247\label{subsect.representation.integers}
248
249\begin{figure}[t]
250\begin{minipage}[t]{0.45\textwidth}
251\vspace{0pt}
252\begin{lstlisting}
253type 'a vect = bit list
254type word = [`Sixteen] vect
255type byte = [`Eight] vect
256$\color{blue}{\mathtt{let}}$ from_nibble =
257 function
258    [b1;b2;b3;b4] -> b1,b2,b3,b4
259  | _ -> assert false
260\end{lstlisting}
261\end{minipage}
262%
263\begin{minipage}[t]{0.55\textwidth}
264\vspace{0pt}
265\begin{lstlisting}
266type 'a vect
267type word = [`Sixteen] vect
268type byte = [`Eight] vect
269val from_nibble: nibble -> bit*bit*bit*bit
270\end{lstlisting}
271\end{minipage}
272\caption{Sample of O'Caml implementation and interface for bitvectors module}
273\label{fig.ocaml.implementation.bitvectors}
274\end{figure}
275
276Integers are represented using bitvectors, i.e. fixed length vectors of booleans.
277In our O'Caml emulator, we `faked' bitvectors using phantom types and polymorphic variants, as in Figure~\ref{fig.ocaml.implementation.bitvectors}.
278From within the bitvector module (left column) bitvectors are just lists of bits.
279However, the module's interface (right column) hides this implementation completely.
280
281In Matita, we are able to use the full power of dependent types to define `real' bitvectors:
282\begin{lstlisting}
283inductive Vector (A: Type[0]): nat → Type[0] ≝
284  VEmpty: Vector A O
285| VCons: ∀n: nat. A → Vector A n → Vector A (S n).
286\end{lstlisting}
287We define \texttt{BitVector} as a specialization of \texttt{Vector} to \texttt{bool}.
288We may use Matita's type system to provide even stronger guarantees, here on a function that splits a vector into two pieces at any index, providing that the index is smaller than the length of the \texttt{Vector} to be split:
289\begin{lstlisting}
290let rec split (A: Type[0]) (m,n: nat) on m:
291   Vector A (plus m n) $\rightarrow$ (Vector A m) $\times$ (Vector A n) := ...
292\end{lstlisting}
293
294%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
295% SECTION                                                                      %
296%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
297\subsection{Representing memory}
298\label{subsect.representing.memory}
299
300The MCS-51 has numerous different types of memory.
301In our prototype implementation, we simply used a map datastructure from the O'Caml standard library.
302Matita's standard library is relatively small, and does not contain a generic map datastructure.
303Therefore, we had the opportunity of crafting a special-purpose datastructure for the job.
304
305We worked under the assumption that large swathes of memory would often be uninitialized.
306Na\"ively, using a complete binary tree, for instance, would be extremely memory inefficient.
307Instead, we chose to use a modified form of trie, where paths are represented by bitvectors.
308As bitvectors were widely used in our implementation already for representing integers, this worked well:
309\begin{lstlisting}
310inductive BitVectorTrie (A: Type[0]): nat $\rightarrow$ Type[0] ≝
311  Leaf: A $\rightarrow$ BitVectorTrie A 0
312| Node: ∀n. BitVectorTrie A n $\rightarrow$ BitVectorTrie A n $\rightarrow$ BitVectorTrie A (S n)
313| Stub: ∀n. BitVectorTrie A n.
314\end{lstlisting}
315Here, \texttt{Stub} is a constructor that can appear at any point in our tries.
316It internalises the notion of `uninitialized data'.
317Performing a lookup in memory is now straight-forward.
318We merely traverse a path, and if at any point we encounter a \texttt{Stub}, we return a default value\footnote{All manufacturer data sheets that we consulted were silent on the subject of what should be returned if we attempt to access uninitialized memory.  We defaulted to simply returning zero, though our \texttt{lookup} function is parametric in this choice.  We do not believe that this is an outrageous decision, as SDCC for instance generates code which first `zeroes out' all memory in a preamble before executing the program proper.  This is in line with the C standard, which guarantees that all global variables will be zero initialized piecewise.}.
319As we are using bitvectors, we may make full use of dependent types and ensure that our bitvector paths are of the same length as the height of the tree.
320
321%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
322% SECTION                                                                      %
323%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
324\subsection{Labels and pseudoinstructions}
325\label{subsect.labels.pseudoinstructions}
326
327Aside from implementing the core MCS-51 instruction set, we also provided \emph{pseudoinstructions}, \emph{labels} and \emph{cost labels}.
328The purpose of \emph{cost labels} will be explained in Subsection~\ref{subsect.computation.cost.traces}.
329
330Introducing pseudoinstructions had the effect of simplifying a C compiler---another component of the CerCo project---that was being implemented in parallel with our implementation.
331To understand why this is so, consider the fact that the MCS-51's instruction set has numerous instructions for unconditional and conditional jumps to memory locations.
332For instance, the instructions \texttt{AJMP}, \texttt{JMP} and \texttt{LJMP} all perform unconditional jumps.
333However, these instructions differ in how large the maximum size of the offset of the jump to be performed can be.
334Further, all jump instructions require a concrete memory address---to jump to---to be specified.
335Requiring the compiler to compute these offsets, and select appropriate jump instructions, was seen as needleslly burdensome.
336
337Introducing labels also had a simplifying effect on the design of the compiler.
338Instead of jumping to a concrete address, the compiler could `just' jump to a label.
339In this vein, we introduced pseudoinstructions for both unconditional and conditional jumps to a label.
340
341Further, we also introduced labels for storing global data in a preamble before the program.
342A pseudoinstruction \texttt{Mov} moves (16-bit) data stored at a label into the (16-bit) register \texttt{DPTR}.
343We believe this facility, of storing global data in a preamble referenced by a label, will also make any future extension considering separate compilation much simpler.
344
345Our pseudoinstructions and labels induce an assembly language similar to that of SDCC.
346All pseudoinstructions and labels are `assembled away', prior to program execution, using a preprocessing stage.
347Jumps are computed in two stages.
348The first stage builds a map associating memory addresses to labels, with the second stage removing pseudojumps with concrete jumps to the correct address.
349
350%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
351% SECTION                                                                      %
352%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
353\subsection{Anatomy of the (Matita) emulator}
354\label{subsect.anatomy.matita.emulator}
355
356The internal state of our Matita emulator is represented as a record:
357\begin{lstlisting}
358record Status: Type[0] ≝
359{
360  code_memory: BitVectorTrie Byte 16;
361  low_internal_ram: BitVectorTrie Byte 7;
362  high_internal_ram: BitVectorTrie Byte 7;
363  external_ram: BitVectorTrie Byte 16;
364  program_counter: Word;
365  special_function_registers_8051: Vector Byte 19;
366  special_function_registers_8052: Vector Byte 5;
367  ...
368}.
369\end{lstlisting}
370This record neatly encapsulates the current memory contents, the program counter, the state of the current SFRs, and so on.
371One peculiarity is the packing of the 24 combined SFRs into fixed length vectors.
372This was due to a bug in Matita when we were constructing the emulator, since fixed, where the time needed to typecheck a record grew exponentially with the number of fields.
373
374Here, it appears that the MCS-51's memory spaces are completely disjoint.
375This is not so; many of them overlap with each other, and there's a many-many relationship between addressing modes and memory spaces.
376For instance, \texttt{DIRECT} addressing can be used to address low internal RAM and the SFRs, but not high internal RAM.
377
378For simplicity, we merely treat memory spaces as if they are completely disjoint in the \texttt{Status} record.
379Overlapping, and checking which addressing modes can be used to address particular memory spaces, is handled through numerous \texttt{get\_arg\_XX} and \texttt{set\_arg\_XX} (for 1, 8 and 16 bits) functions.
380
381Both the Matita and O'Caml emulators follows the classic `fetch-decode-execute' model of processor operation.
382The next instruction to be processed, indexed by the program counter, is fetched from code memory with \texttt{fetch}.
383An updated program counter, along with the concrete cost, in processor cycles for executing this instruction, is also returned.
384These costs are taken from a Siemen's data sheet for the MCS-51, and will likely vary across manufacturers and particular derivatives of the processor.
385\begin{lstlisting}
386definition fetch:
387  BitVectorTrie Byte 16 $\rightarrow$ Word $\rightarrow$ instruction $\times$ Word $\times$ nat := ...
388\end{lstlisting}
389A single instruction is assembled into its corresponding bit encoding with \texttt{assembly1}:
390\begin{lstlisting}
391definition assembly1: instruction $\rightarrow$ list Byte := ...
392\end{lstlisting}
393An assembly program, consisting of a preamble containing global data, and a list of (pseudo)instructions, is assembled using \texttt{assembly}.
394Pseudoinstructions and labels are eliminated in favour of concrete instructions from the MCS-51 instruction set.
395A map associating memory locations and cost labels (see Subsection~\ref{subsect.computation.cost.traces}) is also produced.
396\begin{lstlisting}
397definition assembly:
398  assembly_program $\rightarrow$ option (list Byte $\times$ (BitVectorTrie String 16)) := ...
399\end{lstlisting}
400A single execution step of the processor is evaluated using \texttt{execute\_1}, mapping a \texttt{Status} to a \texttt{Status}:
401\begin{lstlisting}
402definition execute_1: Status $\rightarrow$ Status := ...
403\end{lstlisting}
404Multiple steps of processor execution are implemented in \texttt{execute}, which wraps \texttt{execute\_1}:
405\begin{lstlisting}
406let rec execute (n: nat) (s: Status) on n: Status := ...
407\end{lstlisting}
408This differs slightly from the design of the O'Caml emulator, which executed a program indefinitely, and also accepted a callback function as an argument, which could `witness' the execution as it happened, and providing a print-out of the processor state, and other debugging information.
409Due to Matita's requirement that all functions be strongly normalizing, \texttt{execute} cannot execute a program indefinitely, and must execute a fixed number of steps.
410
411%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
412% SECTION                                                                      %
413%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
414\subsection{Instruction set unorthogonality}
415\label{subsect.instruction.set.unorthogonality}
416
417A peculiarity of the MCS-51 is the non-orthogonality of its instruction set.
418For instance, the \texttt{MOV} instruction, can be invoked using one of sixteen combinations of addressing modes.
419
420% Show example of pattern matching with polymorphic variants
421
422Such non-orthogonality in the instruction set was handled with the use of polymorphic variants in the O'Caml emulator.
423For instance, we introduced types corresponding to each addressing mode:
424\begin{lstlisting}
425type direct = [ `DIRECT of byte ]
426type indirect = [ `INDIRECT of bit ]
427...
428\end{lstlisting}
429Which were then used in our inductive datatype for assembly instructions, as follows:
430\begin{lstlisting}
431type 'addr preinstruction =
432 [ `ADD of acc * [ reg | direct | indirect | data ]
433...
434 | `MOV of
435    (acc * [ reg | direct | indirect | data ],
436     [ reg | indirect ] * [ acc | direct | data ],
437     direct * [ acc | reg | direct | indirect | data ],
438     dptr * data16,
439     carry * bit,
440     bit * carry
441     ) union6
442...
443\end{lstlisting}
444Here, \texttt{union6} is a disjoint union type, defined as follows:
445\begin{lstlisting}
446type ('a,'b,'c,'d,'e,'f) union6 = [ `U1 of 'a | ... | `U6 of 'f ]
447\end{lstlisting}
448For our purposes, the types \texttt{union2}, \texttt{union3} and \texttt{union6} sufficed.
449
450This polymorphic variant machinery worked well: it introduced a certain level of type safety (for instance, the type of our \texttt{MOV} instruction above guarantees it cannot be invoked with arguments in the \texttt{carry} and \texttt{data16} addressing modes, respectively), and also allowed us to pattern match against instructions, when necessary.
451However, this polymorphic variant machinery is \emph{not} present in Matita.
452We needed some way to produce the same effect, which Matita supported.
453For this task, we used dependent types.
454
455We first provided an inductive data type representing all possible addressing modes, a type that functions will pattern match against:
456\begin{lstlisting}
457inductive addressing_mode: Type[0] ≝
458  DIRECT: Byte $\rightarrow$ addressing_mode
459| INDIRECT: Bit $\rightarrow$ addressing_mode
460...
461\end{lstlisting}
462We also wished to express in the type of functions the \emph{impossibility} of pattern matching against certain constructors.
463In order to do this, we introduced an inductive type of addressing mode `tags'.
464The constructors of \texttt{addressing\_mode\_tag} are in one-to-one correspondence with the constructors of \texttt{addressing\_mode}:
465\begin{lstlisting}
466inductive addressing_mode_tag : Type[0] ≝
467  direct: addressing_mode_tag
468| indirect: addressing_mode_tag
469...
470\end{lstlisting}
471A function that checks whether an \texttt{addressing\_mode} is `morally' an \texttt{addressing\_mode\_tag} is provided, as follows:
472\begin{lstlisting}
473let rec is_a (d: addressing_mode_tag) (A: addressing_mode) on d ≝
474  match d with
475   [ direct $\Rightarrow$ match A with [ DIRECT _ $\Rightarrow$ true | _ $\Rightarrow$ false ]
476   | indirect $\Rightarrow$ match A with [ INDIRECT _ $\Rightarrow$ true | _ $\Rightarrow$ false ]
477...
478\end{lstlisting}
479We also extend this check to vectors of \texttt{addressing\_mode\_tag}'s in the obvious manner:
480\begin{lstlisting}
481let rec is_in (n) (l: Vector addressing_mode_tag n) (A: addressing_mode) on l ≝
482 match l return $\lambda$m.$\lambda$_: Vector addressing_mode_tag m. bool with
483  [ VEmpty $\Rightarrow$ false
484  | VCons m he (tl: Vector addressing_mode_tag m) $\Rightarrow$
485     is_a he A $\vee$ is_in ? tl A ].
486\end{lstlisting}
487Here $\mathtt{\vee}$ is inclusive disjunction on the \texttt{bool} datatype.
488\begin{lstlisting}
489record subaddressing_mode (n: Nat) (l: Vector addressing_mode_tag (S n)) : Type[0] ≝
490{
491  subaddressing_modeel :> addressing_mode;
492  subaddressing_modein: bool_to_Prop (is_in ? l subaddressing_modeel)
493}.
494\end{lstlisting}
495We can now provide an inductive type of preinstructions with precise typings:
496\begin{lstlisting}
497inductive preinstruction (A: Type[0]): Type[0] ≝
498   ADD: $\llbracket$ acc_a $\rrbracket$ $\rightarrow$ $\llbracket$ register; direct; indirect; data $\rrbracket$ $\rightarrow$ preinstruction A
499 | ADDC: $\llbracket$ acc_a $\rrbracket$ $\rightarrow$ $\llbracket$ register; direct; indirect; data $\rrbracket$ $\rightarrow$ preinstruction A
500...
501\end{lstlisting}
502Here $\llbracket - \rrbracket$ is syntax denoting a vector.
503We see that the constructor \texttt{ADD} expects two parameters, the first being the accumulator A (\texttt{acc\_a}), and the second being one of a register, direct, indirect or data addressing mode.
504
505% One of these coercions opens up a proof obligation which needs discussing
506% Have lemmas proving that if an element is a member of a sub, then it is a member of a superlist, and so on
507The final, missing component is a pair of type coercions from \texttt{addressing\_mode} to \texttt{subaddressing\_mode} and from \texttt{subaddressing\_mode} to \texttt{Type$\lbrack0\rbrack$}, respectively.
508The previous machinery allows us to state in the type of a function what addressing modes that function expects.
509For instance, consider \texttt{set\_arg\_16}, which expects only a \texttt{DPTR}:
510\begin{lstlisting}
511definition set_arg_16: Status $\rightarrow$ Word $\rightarrow$ $\llbracket$ dptr $\rrbracket$ $\rightarrow$ Status ≝
512  $\lambda$s, v, a.
513   match a return $\lambda$x. bool_to_Prop (is_in ? $\llbracket$ dptr $\rrbracket$ x) $\rightarrow$ ? with
514     [ DPTR $\Rightarrow$ $\lambda$_: True.
515       let 〈 bu, bl 〉 := split $\ldots$ eight eight v in
516       let status := set_8051_sfr s SFR_DPH bu in
517       let status := set_8051_sfr status SFR_DPL bl in
518         status
519     | _ $\Rightarrow$ $\lambda$_: False.
520       match K in False with
521       [
522       ]
523     ] (subaddressing_modein $\ldots$ a).
524\end{lstlisting}
525All other cases are discharged by the catch-all at the bottom of the match expression.
526Attempting to match against another addressing mode not indicated in the type (for example, \texttt{REGISTER}) will produce a type-error.
527
528% Note the execute_1 function which leaves open over 200 proof obligations which we can close automatically, provided we have the lemmas mentioned above
529% Give an example of the type of proof obligations left open?
530% Talk about extraction to O'Caml code, which hopefully will allow us to extract back to using polymorphic variants, or when extracting vectors we could extract using phantom types
531% Discuss alternative approaches, i.e. Sigma types to piece together smaller types into larger ones, as opposed to using a predicate to `cut out' pieces of a larger type, which is what we did
532
533%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
534% SECTION                                                                      %
535%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
536\subsection{I/O and timers}
537\label{subsect.i/o.timers}
538
539% `Real clock' for I/O and timers
540The O'Caml emulator has code for handling timers, I/O and interrupts (these are not yet ported to the Matita emulator).
541All three of these features interact with each other in subtle ways.
542For instance, interrupts can `fire' when an input is detected on the processor's UART port, and, in certain modes, timers reset when a high signal is detected on one of the MCS-51's communication pins.
543
544To accurately model timers, we must modify the central \texttt{status} record of the emulator to keep track of the current time:
545\begin{lstlisting}
546type time = int
547type status = { ... clock: time; ... }
548\end{lstlisting}
549Before every execution step, the \texttt{clock} is incremented by the number of processor cycles that the instruction just fetched will take to execute.
550The processor then executes the instruction, followed by the code implementing the timers.
551
552I/O is handled with a continuation:
553\begin{lstlisting}
554type line =
555  [ `P1 of byte | `P3 of byte
556  | `SerialBuff of [ `Eight of byte | `Nine of BitVectors.bit * byte ]]
557type continuation =
558  [`In of time * line * epsilon * continuation] option *
559  [`Out of (time -> line -> time * continuation)]
560\end{lstlisting}
561The \texttt{line} datatype signals which input/output channel is being used, either the P1 or P3 lines, or the UART port in eight or nine bit mode.
562Input and output are handled with \texttt{continuation},
563
564%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
565% SECTION                                                                      %
566%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
567\subsection{Computation of cost traces}
568\label{subsect.computation.cost.traces}
569
570As mentioned in Subsection~\ref{subsect.labels.pseudoinstructions} we introduced a notion of \emph{cost label}.
571Cost labels are inserted by the prototype C compiler in specific locations in the object code.
572Roughly, for those familiar with control flow graphs, they are inserted at the start of every basic block.
573
574Cost labels are used to calculate a precise costing for a program by marking the location of basic blocks.
575During the assembly phase, where labels and pseudoinstructions are eliminated, a map is generated associating cost labels with memory locations.
576This map is later used in a separate analysis which computes the cost of a program by traversing through a program, fetching one instruction at a time, and computing the cost of blocks.
577These block costings are stored in another map, and will later be passed back to the prototype compiler.
578
579%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
580% SECTION                                                                      %
581%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
582\section{Validation}
583\label{sect.validation}
584
585We spent considerable effort attempting to ensure that our formalisation is correct, that is, what we have formalised really is an accurate model of the MCS-51 microprocessor.
586
587First, we made use of multiple data sheets, each from a different semiconductor manufacturer.
588This helped us spot errors in the specification of the processor's instruction set, and its behaviour.
589
590The O'Caml prototype was especially useful for validation purposes.
591This is because we wrote a module for parsing and loading the Intel HEX file format.
592HEX is a standard format that all compilers targetting the MCS-51, and similar processors, produce.
593It is essentially a snapshot of the processor's code memory in compressed form.
594Using this, we were able to compile C programs with SDCC, an open source compiler, and load the resulting program directly into our emulator's code memory, ready for execution.
595Further, we are able to produce a HEX file from our emulator's code memory, for loading into third party tools.
596After each step of execution, we can print out both the instruction that had been executed, along with its arguments, and a snapshot of the processor's state, including all flags and register contents.
597For example:
598\begin{frametxt}
599\begin{verbatim}
600...
601
60208: mov 81 #07
603
604 Processor status:                               
605
606   ACC : 0 (00000000) B   : 0 (00000000)
607   PSW : 0 (00000000) with flags set as:
608     CY  : false   AC  : false
609     FO  : false   RS1 : false
610     RS0 : false   OV  : false
611     UD  : false   P   : false
612   SP  : 7 (00000111) IP  : 0 (00000000)
613   PC  : 8 (0000000000001000)
614   DPL : 0 (00000000) DPH : 0 (00000000)
615   SCON: 0 (00000000) SBUF: 0 (00000000)
616   TMOD: 0 (00000000) TCON: 0 (00000000)
617   Registers:                                   
618    R0 : 0 (00000000) R1 : 0 (00000000)
619    R2 : 0 (00000000) R3 : 0 (00000000)
620    R4 : 0 (00000000) R5 : 0 (00000000)
621    R6 : 0 (00000000) R7 : 0 (00000000)
622
623...
624\end{verbatim}
625\end{frametxt}
626Here, the traces indicates that the instruction \texttt{mov 81 \#07} has just been executed by the processor, which is now in the state indicated.
627These traces were useful in spotting anything that was `obviously' wrong with the execution of the program.
628
629Further, we made use of an open source emulator for the MCS-51, \texttt{mcu8051ide}.
630Using our execution traces, we were able to step through a compiled program, one instruction at a time, in \texttt{mcu8051ide}, and compare the resulting execution trace with the trace produced by our emulator.
631
632Our Matita formalisation was largely copied from the O'Caml source code, apart from changes related to addressing modes already mentioned.
633However, as the Matita emulator is executable, we could perform further validation by comparing the trace of a program's execution in the Matita emulator with the trace of the same program in the O'Caml emulator.
634
635%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
636% SECTION                                                                      %
637%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
638\section{Related work}
639\label{sect.related.work}
640
641\CSC{Tell what is NOT formalized/formalizable: the HEX parser/pretty printer
642 and/or the I/O procedure}
643\CSC{Appendix with main functions interfaces?}
644\CSC{Decode: two implementations}
645\CSC{Discuss over-specification}
646
647- WE FORMALIZE ALSO I/O ETC. NOT ONLY THE INSTRUCTION SELECTION (??)
648  How to test it? Specify it?
649
650------------------------------------------
651
652Hardware verification?
653
654------------------------------------------------------------
655
656A Trustworthy Monadic Formalization of the ARMv7 Instruction Set Architecture Anthony Fox and Magnus O. Myreen:
657
658- The model operates at the machine code level, as opposed to a more abstract
659assembly code level. In particular, the model does not provide assembly
660level “pseudo instructions” and instruction decoding is explicitly modelled
661inside the logic. This means that the the model can be directly validated
662by comparing results against the behaviour of hardware employing ARM
663processors – this is discussed in Section 5.
664
665- Monadic style: what do we do in OCaml/Matita?
666- Single Step Theorems
667- validation against developments boards and random testing
668
669This section discusses related work in formalizing various commercial instruction
670set architectures using interactive theorem provers, i.e. in ACL2, Coq, Isabelle
671and HOL4. There is much work that is indirectly related, but here we exclude
672non-commercial architectures (e.g. DLX) and informal or semi-formal ISA mod-
673els (e.g. in C, System Verilog, Haskell and so forth).
674
675ARM (this paper)
676x86 (15/7)
677JVM (many)
678
679WE ALSO HAVE AN ASSEMBLER (THEY DON'T)
680
681-----------------
682
683CompCert:
684 - assembly level model (no instruction decoding, etc.)
685 - abstract memory model
686 They also have a more abstract view of memory which is expressed in terms of memory blocks, in contrast to our very concrete mapping from 32-bit addresses to 8-bit data.
687 - 90 instructions of the 200+ offered by the processor; not all registers too
688 - macro instructions that are turned into real instructions only during
689   pretty-printing!
690 - no proof of correctness of assembly and linking
691
692
693-----------------
694
695Robert Atkey. CoqJVM: An executable specification of the Java virtual machine
696using dependent types. In Marino Miculan, Ivan Scagnetto, and Furio Honsell,
697editors, TYPES, volume 4941 of Lecture Notes in Computer Science, pages 18–32.
698Springer, 2007.
699
700CoqJVM dependent types:
701
702                                       We remove this spurious partiality from
703the model by making use of dependent types to maintain invariants about the
704state of the JVM. These invariants are then available to all proofs concerning
705the model.
706
707            Our belief is that this will make large-scale proofs using the model
708easier to perform, and we have some initial evidence that this is the case, but
709detailed research of this claim is still required.
710
711------------------------------------
712
713Susmit Sarkar, Pater Sewell, Francesco Zappa Nardelli, Scott Owens, Tom Ridge,
714Thomas Braibant Magnus O. Myreen, and Jade Alglave. The semantics of x86-CC
715multiprocessor machine code. In Principles of Programming Languages (POPL).
716ACM, 2009.
717
718(What Yann complained on was done in this paper:)
719                                    To make the semantics
720scalable, without introducing many errors, we formalised
721the interpretation of these encodings inside the HOL logic,
722and built a HOL decoding function by directly copying the
723relevant lines from the manual into the HOL script.
724
725However, they only consider a very small subset of the instructions and they
726over-approximate the possibilities of unorthogonality of the instruction set.
727
728
729%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
730% SECTION                                                                      %
731%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
732\section{Conclusions}
733\label{sect.conclusions}
734
735\end{document}
Note: See TracBrowser for help on using the repository browser.