1 | \documentclass{llncs} |
---|
2 | |
---|
3 | \usepackage{amsfonts} |
---|
4 | \usepackage{amsmath} |
---|
5 | \usepackage{amssymb} |
---|
6 | \usepackage[english]{babel} |
---|
7 | \usepackage{color} |
---|
8 | \usepackage{fancybox} |
---|
9 | \usepackage{graphicx} |
---|
10 | \usepackage[colorlinks]{hyperref} |
---|
11 | \usepackage[utf8x]{inputenc} |
---|
12 | \usepackage{listings} |
---|
13 | \usepackage{mdwlist} |
---|
14 | \usepackage{microtype} |
---|
15 | \usepackage{stmaryrd} |
---|
16 | \usepackage{url} |
---|
17 | |
---|
18 | \newlength{\mylength} |
---|
19 | \newenvironment{frametxt}% |
---|
20 | {\setlength{\fboxsep}{5pt} |
---|
21 | \setlength{\mylength}{\linewidth}% |
---|
22 | \addtolength{\mylength}{-2\fboxsep}% |
---|
23 | \addtolength{\mylength}{-2\fboxrule}% |
---|
24 | \Sbox |
---|
25 | \minipage{\mylength}% |
---|
26 | \setlength{\abovedisplayskip}{0pt}% |
---|
27 | \setlength{\belowdisplayskip}{0pt}% |
---|
28 | }% |
---|
29 | {\endminipage\endSbox |
---|
30 | \[\fbox{\TheSbox}\]} |
---|
31 | |
---|
32 | \lstdefinelanguage{matita-ocaml} |
---|
33 | {keywords={definition,coercion,lemma,theorem,remark,inductive,record,qed,let,in,rec,match,return,with,Type,try,on,to}, |
---|
34 | morekeywords={[2]whd,normalize,elim,cases,destruct}, |
---|
35 | morekeywords={[3]type,of,val,assert,let,function}, |
---|
36 | mathescape=true, |
---|
37 | } |
---|
38 | \lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false, |
---|
39 | keywordstyle=\color{red}\bfseries, |
---|
40 | keywordstyle=[2]\color{blue}, |
---|
41 | keywordstyle=[3]\color{blue}\bfseries, |
---|
42 | commentstyle=\color{green}, |
---|
43 | stringstyle=\color{blue}, |
---|
44 | showspaces=false,showstringspaces=false} |
---|
45 | \lstset{extendedchars=false} |
---|
46 | \lstset{inputencoding=utf8x} |
---|
47 | \DeclareUnicodeCharacter{8797}{:=} |
---|
48 | \DeclareUnicodeCharacter{10746}{++} |
---|
49 | \DeclareUnicodeCharacter{9001}{\ensuremath{\langle}} |
---|
50 | \DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}} |
---|
51 | |
---|
52 | \author{Dominic P. Mulligan\thanks{The project CerCo acknowledges the financial support of the Future and |
---|
53 | Emerging Technologies (FET) programme within the Seventh Framework |
---|
54 | Programme for Research of the European Commission, under FET-Open grant |
---|
55 | number: 243881} \and Claudio Sacerdoti Coen$^\star$} |
---|
56 | \authorrunning{D. P. Mulligan and C. Sacerdoti Coen} |
---|
57 | \title{An executable formalisation of the MCS-51 microprocessor in Matita} |
---|
58 | \titlerunning{An executable formalisation of the MCS-51} |
---|
59 | \institute{Dipartimento di Scienze dell'Informazione, Universit\`a di Bologna} |
---|
60 | |
---|
61 | \bibliographystyle{plain} |
---|
62 | |
---|
63 | \begin{document} |
---|
64 | |
---|
65 | \maketitle |
---|
66 | |
---|
67 | \begin{abstract} |
---|
68 | We summarise the formalisation of an emulator for the MCS-51 microprocessor in the Matita proof assistant. |
---|
69 | The MCS-51 is a widely used 8-bit microprocessor, especially popular in embedded devices. |
---|
70 | |
---|
71 | The formalisation proceeded in two stages, first implementing an O'Caml prototype, for quickly `ironing out' bugs, and then porting the O'Caml emulator to Matita. |
---|
72 | Though mostly straight-forward, this porting presented multiple problems. |
---|
73 | Of particular interest is how the unorthoganality of the MSC-51's instruction set is handled. |
---|
74 | In O'Caml, this was handled with polymorphic variants. |
---|
75 | In Matita, we achieved the same effect with a non-standard use of dependent types. |
---|
76 | |
---|
77 | Both the O'Caml and Matita emulators are `executable'. |
---|
78 | Assembly programs may be animated within Matita, producing a trace of instructions executed. |
---|
79 | The formalisation is a major component of the ongoing EU-funded CerCo project. |
---|
80 | \end{abstract} |
---|
81 | |
---|
82 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
83 | % SECTION % |
---|
84 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
85 | \section{Introduction} |
---|
86 | \label{sect.introduction} |
---|
87 | |
---|
88 | Formal methods aim to increase our confidence in the design and implementation of software. |
---|
89 | Ideally, all software should come equipped with a formal specification and a proof of correctness for the corresponding implementation. |
---|
90 | The majority of programs are written in high level languages and then compiled into low level ones. |
---|
91 | Specifications are therefore also given at a high level and correctness can be proved by reasoning on the program's source code. |
---|
92 | The code that is actually run, however, is not the high level source code that we reason on, but low level code generated by the compiler. |
---|
93 | A few questions now arise: |
---|
94 | \begin{itemize*} |
---|
95 | \item |
---|
96 | What properties are preserved during compilation? |
---|
97 | \item |
---|
98 | What properties are affected by the compilation strategy? |
---|
99 | \item |
---|
100 | To what extent can you trust your compiler in preserving those properties? |
---|
101 | \end{itemize*} |
---|
102 | These questions, and others like them, motivate a current `hot topic' in computer science research: \emph{compiler verification} (for instance~\cite{chlipala:verified:2010,leroy:formal:2009}, and many others). |
---|
103 | So far, the field has only been focused on the first and last questions. |
---|
104 | Much attention has been placed on verifying compiler correctness with respect to extensional properties of programs. |
---|
105 | These are `easily' preserved during compilation. |
---|
106 | |
---|
107 | If we consider intensional properties of programs---space, time, and so forth---the situation is more complex. |
---|
108 | To express these properties, and reason about them, we must adopt a cost model that assigns a cost to single, or blocks, of instructions. |
---|
109 | A compositional cost model, assigning the same cost to all occurrences of one instruction, would be ideal. |
---|
110 | However, compiler optimisations are inherently non-compositional: each occurrence of a high level instruction may be compiled in a different way depending on its context. |
---|
111 | Therefore both the cost model and intensional specifications are affected by the compilation process. |
---|
112 | |
---|
113 | In the CerCo project (`Certified Complexity')~\cite{cerco:2011} we approach the problem of reasoning about intensional properties of programs as follows. |
---|
114 | We are currently developing a compiler that induces a cost model on high level source code. |
---|
115 | Costs are assigned to each block of high level instructions by considering the costs of the corresponding blocks of compiled code. |
---|
116 | The cost model is therefore inherently non-compositional, but has the potential to be extremely \emph{precise}, capturing a program's \emph{realistic} cost. |
---|
117 | That is, the compilation process is taken into account, not ignored. |
---|
118 | A prototype compiler, where no approximation of the cost is provided, has been developed. |
---|
119 | (The technical details of the cost model is explained in~\cite{amadio:certifying:2010}.) |
---|
120 | |
---|
121 | We believe that our approach is applicable to certifying real time programs. |
---|
122 | A user can certify that `deadlines' are met whilst wringing as many clock cycles from the processor---using a cost model that does not over-estimate---as possible. |
---|
123 | |
---|
124 | We also see our approach as being relevant to compiler verification (and construction) itself. |
---|
125 | \emph{An optimisation specified only extensionally is only half specified}. |
---|
126 | Though the optimisation may preserve the denotational semantics of a program, there is no guarantee that any intensional properties of the program will be improved. |
---|
127 | |
---|
128 | Another potential application is toward completeness and correctness of the compilation process in the presence of space constraints. |
---|
129 | A compiler could reject a source program targetting an embedded system when the size of the compiled code exceeds the available ROM size. |
---|
130 | Preservation of a program's semantics may only be required for those programs that do not exhaust the stack or heap. |
---|
131 | The statement of completeness of the compiler must therefore take in to account a realistic cost model. |
---|
132 | |
---|
133 | With the CerCo methodology, we assume we can assign to object code exact and realistic costs for sequential blocks of instructions. |
---|
134 | This is possible with modern processors (see~\cite{bate:wcet:2011,yan:wcet:2008} for instance) but difficult, as the structure and execution of a program itself has an influence on the speed of processing. |
---|
135 | Caching, memory effects, and advanced features such as branch prediction all have an effect on execution speed. |
---|
136 | For this reason CerCo decided to focus on 8-bit microprocessors. |
---|
137 | These are still used in embedded systems, with the advantage of a predictable cost model due to their relative paucity of features. |
---|
138 | |
---|
139 | We have fully formalised an executable formal semantics of a family of 8-bit Freescale microprocessors~\cite{oliboni:matita:2008}, and provided a similar executable formal semantics for the MCS-51 microprocessor. |
---|
140 | The latter is what we describe in this paper. |
---|
141 | The focus of the formalisation has been on capturing the intensional behaviour of the processor. |
---|
142 | However, the design of the MCS-51 itself has caused problems in the formalisation. |
---|
143 | For example, the MCS-51 has a highly unorthogonal instruction set. |
---|
144 | To cope with this unorthogonality, and to produce an executable specification, we rely on Matita's dependent types. |
---|
145 | |
---|
146 | \paragraph{The MCS-51}\quad |
---|
147 | The MCS-51 is an 8-bit microprocessor introduced by Intel in the late 1970s. |
---|
148 | Commonly called the 8051, in the decades since its introduction the processor has become a popular component of embedded systems. |
---|
149 | The processor, its successor the 8052, and derivatives are still manufactured \emph{en masse} by a host of vendors. |
---|
150 | |
---|
151 | The 8051 is a well documented processor, and has the support of numerous open source and commercial tools, such as compilers and emulators. |
---|
152 | For instance, the open source Small Device C Compiler (SDCC) recognises a dialect of C~\cite{sdcc:2010}, and other compilers for BASIC, Forth and Modula-2 are also extant. |
---|
153 | An open source emulator for the processor, MCU 8051 IDE, is also available~\cite{mcu8051ide:2010}. |
---|
154 | Both MCU 8051 IDE and SDCC were used in for validating the formalisation. |
---|
155 | |
---|
156 | \begin{figure}[t] |
---|
157 | \setlength{\unitlength}{0.87pt} |
---|
158 | \begin{picture}(410,250)(-50,200) |
---|
159 | %\put(-50,200){\framebox(410,250){}} |
---|
160 | \put(12,410){\makebox(80,0)[b]{Internal (256B)}} |
---|
161 | \put(13,242){\line(0,1){165}} |
---|
162 | \put(93,242){\line(0,1){165}} |
---|
163 | \put(13,407){\line(1,0){80}} |
---|
164 | \put(12,400){\makebox(0,0)[r]{0h}} \put(14,400){\makebox(0,0)[l]{Register bank 0}} |
---|
165 | \put(13,393){\line(1,0){80}} |
---|
166 | \put(12,386){\makebox(0,0)[r]{8h}} \put(14,386){\makebox(0,0)[l]{Register bank 1}} |
---|
167 | \put(13,379){\line(1,0){80}} |
---|
168 | \put(12,372){\makebox(0,0)[r]{10h}} \put(14,372){\makebox(0,0)[l]{Register bank 2}} |
---|
169 | \put(13,365){\line(1,0){80}} |
---|
170 | \put(12,358){\makebox(0,0)[r]{18h}} \put(14,358){\makebox(0,0)[l]{Register bank 3}} |
---|
171 | \put(13,351){\line(1,0){80}} |
---|
172 | \put(12,344){\makebox(0,0)[r]{20h}} \put(14,344){\makebox(0,0)[l]{Bit addressable}} |
---|
173 | \put(13,323){\line(1,0){80}} |
---|
174 | \put(12,316){\makebox(0,0)[r]{30h}} |
---|
175 | \put(14,309){\makebox(0,0)[l]{\quad \vdots}} |
---|
176 | \put(13,291){\line(1,0){80}} |
---|
177 | \put(12,284){\makebox(0,0)[r]{80h}} |
---|
178 | \put(14,263){\makebox(0,0)[l]{\quad \vdots}} |
---|
179 | \put(12,249){\makebox(0,0)[r]{ffh}} |
---|
180 | \put(13,242){\line(1,0){80}} |
---|
181 | |
---|
182 | \qbezier(-2,407)(-6,407)(-6,393) |
---|
183 | \qbezier(-6,393)(-6,324)(-10,324) |
---|
184 | \put(-12,324){\makebox(0,0)[r]{Indirect/stack}} |
---|
185 | \qbezier(-6,256)(-6,324)(-10,324) |
---|
186 | \qbezier(-2,242)(-6,242)(-6,256) |
---|
187 | |
---|
188 | \qbezier(94,407)(98,407)(98,393) |
---|
189 | \qbezier(98,393)(98,349)(102,349) |
---|
190 | \put(104,349){\makebox(0,0)[l]{Direct}} |
---|
191 | \qbezier(98,305)(98,349)(102,349) |
---|
192 | \qbezier(94,291)(98,291)(98,305) |
---|
193 | |
---|
194 | \put(102,242){\framebox(20,49){SFR}} |
---|
195 | % bit access to sfrs? |
---|
196 | |
---|
197 | \qbezier(124,291)(128,291)(128,277) |
---|
198 | \qbezier(128,277)(128,266)(132,266) |
---|
199 | \put(134,266){\makebox(0,0)[l]{Direct}} |
---|
200 | \qbezier(128,257)(128,266)(132,266) |
---|
201 | \qbezier(124,242)(128,242)(128,256) |
---|
202 | |
---|
203 | \put(164,410){\makebox(80,0)[b]{External (64kB)}} |
---|
204 | \put(164,220){\line(0,1){187}} |
---|
205 | \put(164,407){\line(1,0){80}} |
---|
206 | \put(244,220){\line(0,1){187}} |
---|
207 | \put(164,242){\line(1,0){80}} |
---|
208 | \put(163,400){\makebox(0,0)[r]{0h}} |
---|
209 | \put(164,324){\makebox(80,0){Paged access}} |
---|
210 | \put(164,310){\makebox(80,0){Direct/indirect}} |
---|
211 | \put(163,235){\makebox(0,0)[r]{80h}} |
---|
212 | \put(164,228){\makebox(80,0){\vdots}} |
---|
213 | \put(164,210){\makebox(80,0){Direct/indirect}} |
---|
214 | |
---|
215 | \put(264,410){\makebox(80,0)[b]{Code (64kB)}} |
---|
216 | \put(264,220){\line(0,1){187}} |
---|
217 | \put(264,407){\line(1,0){80}} |
---|
218 | \put(344,220){\line(0,1){187}} |
---|
219 | \put(263,400){\makebox(0,0)[r]{0h}} |
---|
220 | \put(264,228){\makebox(80,0){\vdots}} |
---|
221 | \put(264,324){\makebox(80,0){Direct}} |
---|
222 | \put(264,310){\makebox(80,0){PC relative}} |
---|
223 | \end{picture} |
---|
224 | \caption{The 8051 memory model} |
---|
225 | \label{fig.memory.layout} |
---|
226 | \end{figure} |
---|
227 | |
---|
228 | The 8051 has a relatively straightforward architecture, unencumbered by advanced features of modern processors, making it an ideal target for formalisation. |
---|
229 | A high-level overview of the processor's memory layout, along with the ways in which different memory spaces may be addressed, is provided in Figure~\ref{fig.memory.layout}. |
---|
230 | |
---|
231 | Processor RAM is divided into numerous segments, with the most prominent division being between internal and (optional) external memory. |
---|
232 | Internal memory, commonly provided on the die itself with fast access, is composed of 256 bytes, but, in direct addressing mode, half of them are overloaded with 128 bytes of memory-mapped Special Function Registers (SFRs). |
---|
233 | SFRs control the operation of the processor. |
---|
234 | Internal RAM (IRAM) is divided again into 8 general purpose bit-addressable registers (R0--R7). |
---|
235 | These sit in the first 8 bytes of IRAM, though can be programmatically `shifted up' as needed. |
---|
236 | Bit memory, followed by a small amount of stack space, resides in the memory space immediately following the register banks. |
---|
237 | What remains of IRAM may be treated as general purpose memory. |
---|
238 | A schematic view of IRAM layout is also provided in Figure~\ref{fig.memory.layout}. |
---|
239 | |
---|
240 | External RAM (XRAM), limited to a maximum size of 64 kilobytes, is optional, and may be provided on or off chip, depending on the vendor. |
---|
241 | XRAM is accessed using a dedicated instruction, and requires 16 bits to address fully. |
---|
242 | External code memory (XCODE) is often stored as an EPROM, and limited to 64 kilobytes in size. |
---|
243 | However, depending on the particular processor model, a dedicated on-die read-only memory area for program code (ICODE) may be supplied. |
---|
244 | |
---|
245 | Memory may be addressed in numerous ways: immediate, direct, indirect, external direct and code indirect. |
---|
246 | As the latter two addressing modes hint, there are some restrictions enforced by the 8051, and its derivatives, on which addressing modes may be used with specific types of memory. |
---|
247 | For instance, the extra 128 bytes of IRAM of the 8052 cannot be addressed using indirect addressing; rather, external (in)direct addressing must be used. Moreover, some memory segments are addressed using 8-bit pointers while others require 16-bits. |
---|
248 | |
---|
249 | The 8051 possesses an 8-bit Arithmetic and Logic Unit (ALU), with a variety of instructions for performing arithmetic and logical operations on bits and integers. |
---|
250 | Two 8-bit general purpose accumulators, A and B, are provided. |
---|
251 | |
---|
252 | Communication with the device is handled by an inbuilt UART serial port and controller. |
---|
253 | This can operate in numerous modes. |
---|
254 | Serial baud rate is determined by one of two 16-bit timers included with the 8051, which can be set to multiple modes of operation. |
---|
255 | (The 8052 provides an additional 16-bit timer.) |
---|
256 | The 8051 also provides a 4 byte bit-addressable I/O port. |
---|
257 | |
---|
258 | The programmer may take advantage of an interrupt mechanism. |
---|
259 | This is especially useful when dealing with I/O involving the serial device, as an interrupt can be set when a whole character is sent or received via the UART. |
---|
260 | |
---|
261 | Interrupts immediately halt the flow of execution of the processor, and cause the program counter to jump to a fixed address, where the requisite interrupt handler is stored. |
---|
262 | However, interrupts may be set to one of two priorities: low and high. |
---|
263 | The interrupt handler of an interrupt with high priority is executed ahead of the interrupt handler of an interrupt of lower priority, interrupting a currently executing handler of lower priority, if necessary. |
---|
264 | |
---|
265 | The 8051 has interrupts disabled by default. |
---|
266 | The programmer is free to handle serial input and output manually, by poking serial flags in the SFRs. |
---|
267 | `Exceptional circumstances' that would otherwise trigger an interrupt on more modern processors, (e.g. division by zero) are also signalled by setting flags. |
---|
268 | |
---|
269 | %\begin{figure}[t] |
---|
270 | %\begin{center} |
---|
271 | %\includegraphics[scale=0.5]{iramlayout.png} |
---|
272 | %\end{center} |
---|
273 | %\caption{Schematic view of 8051 IRAM layout} |
---|
274 | %\label{fig.iram.layout} |
---|
275 | %\end{figure} |
---|
276 | |
---|
277 | \paragraph{Overview of paper}\quad |
---|
278 | In Section~\ref{sect.design.issues.formalisation} we discuss design issues in the development of the formalisation. |
---|
279 | In Section~\ref{sect.validation} we discuss how we validated the design and implementation of the emulator to ensure that what we formalised was an accurate model of an MCS-51 series microprocessor. |
---|
280 | In Section~\ref{sect.related.work} we describe previous work, with an eye toward describing its relation with the work described herein. |
---|
281 | In Section~\ref{sect.conclusions} we conclude. |
---|
282 | |
---|
283 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
284 | % SECTION % |
---|
285 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
286 | \section{Design issues in the formalisation} |
---|
287 | \label{sect.design.issues.formalisation} |
---|
288 | |
---|
289 | From hereonin, we typeset O'Caml source with \texttt{\color{blue}{blue}} and Matita source with \texttt{\color{red}{red}}. |
---|
290 | Matita's syntax is straightforward if familiar with Coq or O'Caml. |
---|
291 | One subtlety is the use of `\texttt{?}' or `\texttt{$\ldots$}' in an argument position denoting an argument or arguments to be inferred, respectively. |
---|
292 | |
---|
293 | A full account of the formalisation can be found in~\cite{cerco-report:2011}. |
---|
294 | |
---|
295 | \subsection{Development strategy} |
---|
296 | \label{subsect.development.strategy} |
---|
297 | |
---|
298 | The implementation progressed in two stages. |
---|
299 | We began with an emulator written in O'Caml to `iron out' any bugs in the design and implementation. |
---|
300 | O'Caml's ability to perform file I/O also eased debugging and validation. |
---|
301 | Once we were happy with the design of the O'Caml emulator, we moved to Matita. |
---|
302 | |
---|
303 | Matita's syntax is lexically similar to O'Caml's. |
---|
304 | This eased the translation, as swathes of code were copied with minor modifications. |
---|
305 | However, several major issues had to be addressed when moving to Matita. |
---|
306 | These are now discussed. |
---|
307 | |
---|
308 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
309 | % SECTION % |
---|
310 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
311 | \subsection{Representation of bytes, words, etc.} |
---|
312 | \label{subsect.representation.integers} |
---|
313 | |
---|
314 | \begin{figure}[t] |
---|
315 | \begin{minipage}[t]{0.45\textwidth} |
---|
316 | \vspace{0pt} |
---|
317 | \begin{lstlisting} |
---|
318 | type 'a vect = bit list |
---|
319 | type nibble = [`Sixteen] vect |
---|
320 | type byte = [`Eight] vect |
---|
321 | $\color{blue}{\mathtt{let}}$ split_word w = split_nth 4 w |
---|
322 | $\color{blue}{\mathtt{let}}$ split_byte b = split_nth 2 b |
---|
323 | \end{lstlisting} |
---|
324 | \end{minipage} |
---|
325 | % |
---|
326 | \begin{minipage}[t]{0.55\textwidth} |
---|
327 | \vspace{0pt} |
---|
328 | \begin{lstlisting} |
---|
329 | type 'a vect |
---|
330 | type word = [`Sixteen] vect |
---|
331 | type byte = [`Eight] vect |
---|
332 | val split_word: word -> byte * word |
---|
333 | val split_byte: byte -> nibble * nibble |
---|
334 | \end{lstlisting} |
---|
335 | \end{minipage} |
---|
336 | \caption{Sample of O'Caml implementation and interface for bitvectors module} |
---|
337 | \label{fig.ocaml.implementation.bitvectors} |
---|
338 | \end{figure} |
---|
339 | |
---|
340 | The formalization of MCS-51 must deal with bytes (8-bits), words (16-bits), and also more exoteric quantities (7, 3 and 9-bits). |
---|
341 | To avoid difficult-to-trace size mismatch bugs, we represented all quantities using bitvectors, i.e. fixed length vectors of booleans. |
---|
342 | In the O'Caml emulator, we `faked' bitvectors using phantom types~\cite{leijen:domain:1999} implemented with polymorphic variants~\cite{garrigue:programming:1998}, as in Figure~\ref{fig.ocaml.implementation.bitvectors}. |
---|
343 | From within the bitvector module (left column) bitvectors are just lists of bits and no guarantee is provided on sizes. |
---|
344 | However, the module's interface (right column) enforces size invariants in the rest of the code. |
---|
345 | |
---|
346 | In Matita, we are able to use the full power of dependent types to always work with vectors of a known size: |
---|
347 | \begin{lstlisting} |
---|
348 | inductive Vector (A: Type[0]): nat $\rightarrow$ Type[0] ≝ |
---|
349 | VEmpty: Vector A O |
---|
350 | | VCons: $\forall$n: nat. A $\rightarrow$ Vector A n $\rightarrow$ Vector A (S n). |
---|
351 | \end{lstlisting} |
---|
352 | We define \texttt{BitVector} as a specialization of \texttt{Vector} to \texttt{bool}. |
---|
353 | We may use Matita's type system to provide precise typings for functions that are polymorphic in the size without code duplication: |
---|
354 | \begin{lstlisting} |
---|
355 | let rec split (A: Type[0]) (m,n: nat) on m: |
---|
356 | Vector A (plus m n) $\rightarrow$ (Vector A m) $\times$ (Vector A n) := ... |
---|
357 | \end{lstlisting} |
---|
358 | |
---|
359 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
360 | % SECTION % |
---|
361 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
362 | \subsection{Representing memory} |
---|
363 | \label{subsect.representing.memory} |
---|
364 | |
---|
365 | The MCS-51 has numerous disjoint memory spaces addressed by differently sized pointers. |
---|
366 | In the O'Caml implementation, we use a map data structure (from the standard library) for each space. |
---|
367 | Matita's standard library is small, and does not contain a generic map data structure. |
---|
368 | We had the opportunity of crafting a dependently typed special-purpose data structure for the job to enforce the correspondence between the size of pointer and the size of the memory space. |
---|
369 | Further, we assumed that large swathes of memory would often be uninitialized. |
---|
370 | |
---|
371 | We picked a modified form of trie of fixed height $h$. |
---|
372 | Paths are represented by bitvectors (already used in the implementation for addresses and registers) of length $h$: |
---|
373 | \begin{lstlisting} |
---|
374 | inductive BitVectorTrie (A: Type[0]): nat $\rightarrow$ Type[0] ≝ |
---|
375 | Leaf: A $\rightarrow$ BitVectorTrie A 0 |
---|
376 | | Node: ∀n. BitVectorTrie A n $\rightarrow$ BitVectorTrie A n $\rightarrow$ BitVectorTrie A (S n) |
---|
377 | | Stub: ∀n. BitVectorTrie A n. |
---|
378 | \end{lstlisting} |
---|
379 | \texttt{Stub} is a constructor that can appear at any point in a trie. |
---|
380 | It represents `uninitialized data'. |
---|
381 | Performing a lookup in memory is now straight-forward. |
---|
382 | The only subtlety over normal trie lookup is how we handle \texttt{Stub}. |
---|
383 | We traverse a path, and upon encountering \texttt{Stub}, we return a default value\footnote{All manufacturer data sheets that we consulted were silent on the subject of what should be returned if we attempt to access uninitialized memory. We defaulted to simply returning zero, though our \texttt{lookup} function is parametric in this choice. We do not believe that this is an outrageous decision, as SDCC for instance generates code which first `zeroes out' all memory in a preamble before executing the program proper. This is in line with the C standard, which guarantees that all global variables will be zero initialized piecewise.}. |
---|
384 | |
---|
385 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
386 | % SECTION % |
---|
387 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
388 | \subsection{Labels and pseudoinstructions} |
---|
389 | \label{subsect.labels.pseudoinstructions} |
---|
390 | |
---|
391 | Aside from implementing the core MCS-51 instruction set, we also provided \emph{pseudoinstructions}, \emph{labels} and \emph{cost labels}. |
---|
392 | The purpose of \emph{cost labels} will be explained in Subsection~\ref{subsect.computation.cost.traces}. |
---|
393 | |
---|
394 | Introducing pseudoinstructions had the effect of simplifying a C compiler---another component of the CerCo project---that was being implemented in parallel with our implementation. |
---|
395 | To see why, consider the fact that the MCS-51's instruction set has numerous instructions for unconditional and conditional jumps to memory locations. |
---|
396 | For instance, the instructions \texttt{AJMP}, \texttt{JMP} and \texttt{LJMP} all perform unconditional jumps. |
---|
397 | However, these instructions differ in how large the maximum size of the offset of the jump to be performed can be. |
---|
398 | Further, all jump instructions require a concrete memory address---to jump to---to be specified. |
---|
399 | Compilers that support separate compilation cannot directly compute these offsets and select the appropriate jump instructions. |
---|
400 | These operations are also burdensome for compilers that do not do separate compilation and are handled by assemblers. |
---|
401 | We followed suit. |
---|
402 | |
---|
403 | While introducing pseudoinstructions, we also introduced labels for locations to jump to, and for global data. |
---|
404 | To specify global data via labels, we introduced a preamble before the program where labels and the size of reserved space for data is stored. |
---|
405 | A pseudoinstruction \texttt{Mov} moves (16-bit) data stored at a label into the MCS-51's one 16-bit register, \texttt{DPTR}. |
---|
406 | |
---|
407 | The pseudoinstructions and labels induce an assembly language similar to that of SDCC's. |
---|
408 | All pseudoinstructions and labels are `assembled away' prior to program execution. |
---|
409 | Jumps are computed in two stages. |
---|
410 | A map associating memory addresses to labels is built, before replacing pseudojumps with concrete jumps to the correct address. |
---|
411 | The algorithm currently implemented does not try to minimize object code size by picking the shortest possible jump instruction. |
---|
412 | A better algorithm is left for future work. |
---|
413 | |
---|
414 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
415 | % SECTION % |
---|
416 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
417 | \subsection{Anatomy of the (Matita) emulator} |
---|
418 | \label{subsect.anatomy.matita.emulator} |
---|
419 | |
---|
420 | The internal state of the Matita emulator is represented as a record: |
---|
421 | \begin{lstlisting} |
---|
422 | record Status: Type[0] ≝ { |
---|
423 | code_memory: BitVectorTrie Byte 16; |
---|
424 | low_internal_ram: BitVectorTrie Byte 7; |
---|
425 | high_internal_ram: BitVectorTrie Byte 7; |
---|
426 | external_ram: BitVectorTrie Byte 16; |
---|
427 | program_counter: Word; |
---|
428 | special_function_registers_8051: Vector Byte 19; |
---|
429 | special_function_registers_8052: Vector Byte 5; |
---|
430 | ... }. |
---|
431 | \end{lstlisting} |
---|
432 | This record encapsulates the current memory contents, the program counter, the state of the current SFRs, and so on. |
---|
433 | |
---|
434 | Here the MCS-51's memory model is implemented using four disjoint memory spaces, plus SFRs. |
---|
435 | From the programmer's point of view, what \emph{really} matters are the addressing modes that are in a many-to-many relationship with the spaces. |
---|
436 | \texttt{DIRECT} addressing can be used to address either lower IRAM (if the first bit is 0) or the SFRs (if the first bit is 1), for instance. |
---|
437 | That's why DIRECT uses 8-bit addresses but pointers to lower IRAM only use 7 bits. |
---|
438 | The complexity of the memory model is captured in a pair of functions, \texttt{get\_arg\_XX} and \texttt{set\_arg\_XX}, that `get' and `set' data of size \texttt{XX} from memory. |
---|
439 | |
---|
440 | %Overlapping, and checking which addressing modes can be used to address particular memory spaces, is handled through numerous \texttt{get\_arg\_XX} and \texttt{set\_arg\_XX} (for 1, 8 and 16 bits) functions. |
---|
441 | |
---|
442 | Both the Matita and O'Caml emulators follow the classic `fetch-decode-execute' model of processor operation. |
---|
443 | The next instruction to be processed, indexed by the program counter, is fetched from code memory with \texttt{fetch}. |
---|
444 | An updated program counter, along with its concrete cost in processor cycles, is also returned. |
---|
445 | These costs are taken from a Siemens Semiconductor Group data sheet for the MCS-51~\cite{siemens:2011}, and will likely vary across manufacturers and derivatives of the processor. |
---|
446 | \begin{lstlisting} |
---|
447 | definition fetch: BitVectorTrie Byte 16 $\rightarrow$ Word $\rightarrow$ instruction $\times$ Word $\times$ nat |
---|
448 | \end{lstlisting} |
---|
449 | Instruction are assembled to bit encodings by \texttt{assembly1}: |
---|
450 | \begin{lstlisting} |
---|
451 | definition assembly1: instruction $\rightarrow$ list Byte |
---|
452 | \end{lstlisting} |
---|
453 | An assembly program---comprising a preamble containing global data and a list of pseudoinstructions---is assembled using \texttt{assembly}. |
---|
454 | Pseudoinstructions and labels are eliminated in favour of instructions from the MCS-51 instruction set. |
---|
455 | A map associating memory locations and cost labels (see Subsection~\ref{subsect.computation.cost.traces}) is produced. |
---|
456 | \begin{lstlisting} |
---|
457 | definition assembly: |
---|
458 | assembly_program $\rightarrow$ option (list Byte $\times$ (BitVectorTrie String 16)) |
---|
459 | \end{lstlisting} |
---|
460 | A single fetch-decode-execute cycle is performed by \texttt{execute\_1}: |
---|
461 | \begin{lstlisting} |
---|
462 | definition execute_1: Status $\rightarrow$ Status |
---|
463 | \end{lstlisting} |
---|
464 | The \texttt{execute} functions performs a fixed number of cycles by iterating |
---|
465 | \texttt{execute\_1}: |
---|
466 | \begin{lstlisting} |
---|
467 | let rec execute (n: nat) (s: Status) on n: Status := ... |
---|
468 | \end{lstlisting} |
---|
469 | This differs from the O'Caml emulator, which executed a program indefinitely. |
---|
470 | A callback function was also accepted as an argument, which could `witness' the execution as it happened, providing a print-out of the processor state. |
---|
471 | Due to Matita's termination requirement, \texttt{execute} cannot execute a program indefinitely. |
---|
472 | An alternative approach would be to produce an infinite stream of statuses representing an execution trace. |
---|
473 | Matita supports infinite streams through co-inductive types. |
---|
474 | |
---|
475 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
476 | % SECTION % |
---|
477 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
478 | \subsection{Instruction set unorthogonality} |
---|
479 | \label{subsect.instruction.set.unorthogonality} |
---|
480 | |
---|
481 | A peculiarity of the MCS-51 is its unorthogonal instruction set. |
---|
482 | For instance, the \texttt{MOV} instruction can be invoked using one of 16 combinations of addressing modes out of a possible 361. |
---|
483 | |
---|
484 | % Show example of pattern matching with polymorphic variants |
---|
485 | |
---|
486 | Such unorthogonality in the instruction set was handled with the use of polymorphic variants in O'Caml. |
---|
487 | For instance, we introduced types corresponding to each addressing mode: |
---|
488 | \begin{lstlisting} |
---|
489 | type direct = [ `DIRECT of byte ] |
---|
490 | type indirect = [ `INDIRECT of bit ] |
---|
491 | ... |
---|
492 | \end{lstlisting} |
---|
493 | Which were then combined in the inductive datatype for assembly preinstructions using the union operator `$|$': |
---|
494 | \begin{lstlisting} |
---|
495 | type 'addr preinstruction = |
---|
496 | [ `ADD of acc * [ reg | direct | indirect | data ] |
---|
497 | ... |
---|
498 | | `MOV of |
---|
499 | (acc * [ reg | direct | indirect | data ], |
---|
500 | [ reg | indirect ] * [ acc | direct | data ], |
---|
501 | direct * [ acc | reg | direct | indirect | data ], |
---|
502 | dptr * data16, |
---|
503 | carry * bit, |
---|
504 | bit * carry |
---|
505 | ) union6 |
---|
506 | ... |
---|
507 | \end{lstlisting} |
---|
508 | Here, \texttt{union6} is a disjoint union type, defined as follows: |
---|
509 | \begin{lstlisting} |
---|
510 | type ('a,'b,'c,'d,'e,'f) union6 = [ `U1 of 'a | ... | `U6 of 'f ] |
---|
511 | \end{lstlisting} |
---|
512 | For our purposes, the types \texttt{union2}, \texttt{union3} and \texttt{union6} sufficed. |
---|
513 | |
---|
514 | This polymorphic variant machinery worked well: it introduced a certain level of type safety (for instance, the type of \texttt{MOV} above guarantees it cannot be invoked with arguments in the \texttt{carry} and \texttt{data16} addressing modes, respectively), and also allowed us to pattern match against instructions, when necessary. |
---|
515 | However, this polymorphic variant machinery is \emph{not} present in Matita. |
---|
516 | We needed some way to produce the same effect, which Matita supported. |
---|
517 | For this task, we used dependent types. |
---|
518 | |
---|
519 | We first provided an inductive data type representing all possible addressing modes, a type that functions will pattern match against: |
---|
520 | \begin{lstlisting} |
---|
521 | inductive addressing_mode: Type[0] ≝ |
---|
522 | DIRECT: Byte $\rightarrow$ addressing_mode |
---|
523 | | INDIRECT: Bit $\rightarrow$ addressing_mode |
---|
524 | ... |
---|
525 | \end{lstlisting} |
---|
526 | We also wished to express in the type of functions the \emph{impossibility} of pattern matching against certain constructors. |
---|
527 | In order to do this, we introduced an inductive type of addressing mode `tags'. |
---|
528 | The constructors of \texttt{addressing\_mode\_tag} are in one-to-one correspondence with the constructors of \texttt{addressing\_mode}: |
---|
529 | \begin{lstlisting} |
---|
530 | inductive addressing_mode_tag : Type[0] ≝ |
---|
531 | direct: addressing_mode_tag |
---|
532 | | indirect: addressing_mode_tag |
---|
533 | ... |
---|
534 | \end{lstlisting} |
---|
535 | The \texttt{is\_a} function checks if an \texttt{addressing\_mode} matches an \texttt{addressing\_mode\_tag}: |
---|
536 | \begin{lstlisting} |
---|
537 | let rec is_a (d: addressing_mode_tag) (A: addressing_mode) on d := |
---|
538 | match d with |
---|
539 | [ direct $\Rightarrow$ match A with [ DIRECT _ $\Rightarrow$ true | _ $\Rightarrow$ false ] |
---|
540 | | indirect $\Rightarrow$ match A with [ INDIRECT _ $\Rightarrow$ true | _ $\Rightarrow$ false ] |
---|
541 | ... |
---|
542 | \end{lstlisting} |
---|
543 | The \texttt{is\_in} function checks if an \texttt{addressing\_mode} matches a set of tags represented as a vector. It simply extends the \texttt{is\_a} function in the obvious manner. |
---|
544 | |
---|
545 | A \texttt{subaddressing\_mode} is an \emph{ad hoc} non-empty $\Sigma$-type of \texttt{addressing\_mode}s constrained to be in a set of tags: |
---|
546 | \begin{lstlisting} |
---|
547 | record subaddressing_mode n (l: Vector addressing_mode_tag (S n)): Type[0] := |
---|
548 | { subaddressing_modeel :> addressing_mode; |
---|
549 | subaddressing_modein: bool_to_Prop (is_in ? l subaddressing_modeel) }. |
---|
550 | \end{lstlisting} |
---|
551 | An implicit coercion is provided to promote vectors of tags (denoted with $\llbracket - \rrbracket$) to the corresponding \texttt{subaddressing\_mode} so that we can use a syntax close to that of O'Caml to specify \texttt{preinstruction}s: |
---|
552 | \begin{lstlisting} |
---|
553 | inductive preinstruction (A: Type[0]): Type[0] ≝ |
---|
554 | ADD: $\llbracket$ acc_a $\rrbracket$ $\rightarrow$ $\llbracket$ register; direct; indirect; data $\rrbracket$ $\rightarrow$ preinstruction A |
---|
555 | | ADDC: $\llbracket$ acc_a $\rrbracket$ $\rightarrow$ $\llbracket$ register; direct; indirect; data $\rrbracket$ $\rightarrow$ preinstruction A |
---|
556 | ... |
---|
557 | \end{lstlisting} |
---|
558 | The constructor \texttt{ADD} expects two parameters, the first being the accumulator A (\texttt{acc\_a}), the second being a register, direct, indirect or data addressing mode. |
---|
559 | |
---|
560 | % One of these coercions opens up a proof obligation which needs discussing |
---|
561 | % Have lemmas proving that if an element is a member of a sub, then it is a member of a superlist, and so on |
---|
562 | The final component is a pair of type coercions from \texttt{addressing\_mode} to \texttt{subaddressing\_mode} and from \texttt{subaddressing\_mode} to \texttt{Type$\lbrack0\rbrack$}, respectively. |
---|
563 | The first is a forgetful coercion, while the second opens a proof obligation wherein we must prove that the provided value is in the admissible set. |
---|
564 | These coercions were first introduced by PVS to implement subset types~\cite{shankar:principles:1999}, and later in Coq as part of Russell~\cite{sozeau:subset:2006}. |
---|
565 | In Matita all coercions can open proof obligations. |
---|
566 | |
---|
567 | Proof obligations require us to state and prove a few auxilliary lemmas related to the transitivity of subtyping. |
---|
568 | For instance, an \texttt{addressing\_mode} that belongs to an allowed set also belongs to any one of its supersets. |
---|
569 | At the moment, Matita's automation exploits these lemmas to completely solve all the proof obligations opened in the formalisation. |
---|
570 | The \texttt{execute\_1} function, for instance, opens over 200 proof obligations during type checking. |
---|
571 | |
---|
572 | The machinery just described allows us to restrict the set of \texttt{addressing\_mode}s expected by a function and use this information during pattern matching. |
---|
573 | This allows us to skip impossible cases. |
---|
574 | For instance, consider \texttt{set\_arg\_16}, which expects only a \texttt{DPTR}: |
---|
575 | \begin{lstlisting} |
---|
576 | definition set_arg_16: Status $\rightarrow$ Word $\rightarrow$ $\llbracket$ dptr $\rrbracket$ $\rightarrow$ Status ≝ $~\lambda$s, v, a. |
---|
577 | match a return $\lambda$x. bool_to_Prop (is_in ? $\llbracket$ dptr $\rrbracket$ x) $\rightarrow$ ? with |
---|
578 | [ DPTR $\Rightarrow$ $\lambda$_: True. |
---|
579 | let 〈 bu, bl 〉 := split $\ldots$ eight eight v in |
---|
580 | let status := set_8051_sfr s SFR_DPH bu in |
---|
581 | let status := set_8051_sfr status SFR_DPL bl in |
---|
582 | status |
---|
583 | | _ $\Rightarrow$ $\lambda$_: False. $\bot$ ] $~$(subaddressing_modein $\ldots$ a). |
---|
584 | \end{lstlisting} |
---|
585 | We give a proof (the expression \texttt{(subaddressing\_modein} $\ldots$ \texttt{a)}) that the argument $a$ is in the set $\llbracket$ \texttt{dptr} $\rrbracket$ to the \texttt{match} expression. |
---|
586 | In every case but \texttt{DPTR}, the proof is a proof of \texttt{False}, and the system opens a proof obligation $\bot$ that can be discarded using \emph{ex falso}. |
---|
587 | Attempting to match against a disallowed addressing mode (replacing \texttt{False} with \texttt{True} in the branch) produces a type error. |
---|
588 | |
---|
589 | We tried other dependently and non-dependently typed solutions before settling on this approach. |
---|
590 | As we need a large number of different combinations of addressing modes to describe the whole instruction set, it is infeasible to declare a datatype for each one of these combinations. |
---|
591 | The current solution is closest to the corresponding O'Caml code, to the point that the translation from O'Caml to Matita is almost syntactical. |
---|
592 | We would like to investigate the possibility of changing the code extraction procedure of Matita so that it recognises this programming pattern and outputs O'Caml code using polymorphic variants. |
---|
593 | |
---|
594 | % Talk about extraction to O'Caml code, which hopefully will allow us to extract back to using polymorphic variants, or when extracting vectors we could extract using phantom types |
---|
595 | % Discuss alternative approaches, i.e. Sigma types to piece together smaller types into larger ones, as opposed to using a predicate to `cut out' pieces of a larger type, which is what we did |
---|
596 | |
---|
597 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
598 | % SECTION % |
---|
599 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
600 | \subsection{I/O and timers} |
---|
601 | \label{subsect.i/o.timers} |
---|
602 | |
---|
603 | % `Real clock' for I/O and timers |
---|
604 | The O'Caml emulator has code for handling timers, asynchronous I/O and interrupts (these are not yet ported to the Matita emulator). |
---|
605 | All three of these features interact with each other in subtle ways. |
---|
606 | Interrupts can `fire' when an input is detected on the processor's UART port, and, in certain modes, timers reset when a high signal is detected on one of the MCS-51's communication pins. |
---|
607 | |
---|
608 | To accurately model timers and I/O, we add an unbounded integral field \texttt{clock} to the central \texttt{status} record. |
---|
609 | This field is only logical, since it does not represent any quantity stored in the physical processor, and is used to keep track of the current `processor time'. |
---|
610 | Before every execution step, \texttt{clock} is incremented by the number of processor cycles that the instruction just fetched will take to execute. |
---|
611 | The emulator then executes the instruction, followed by the code implementing the timers and I/O\footnote{Though it isn't fully specified by the manufacturer's data sheets if I/O is handled at the beginning or the end of each cycle.}. |
---|
612 | In order to model I/O, we also store in \texttt{status} a \emph{continuation} which is a description of the behaviour of the environment: |
---|
613 | \begin{lstlisting} |
---|
614 | type line = |
---|
615 | [ `P1 of byte | `P3 of byte |
---|
616 | | `SerialBuff of [ `Eight of byte | `Nine of BitVectors.bit * byte ]] |
---|
617 | type continuation = |
---|
618 | [`In of time * line * epsilon * continuation] option * |
---|
619 | [`Out of (time -> line -> time * continuation)] |
---|
620 | \end{lstlisting} |
---|
621 | At each moment, the second projection of the continuation $k$ describes how the environment will react to an output event performed in the future by the processor. |
---|
622 | Suppose $\pi_2(k)(\tau,o) = \langle \tau',k' \rangle$. |
---|
623 | If the emulator at time $\tau$ starts an asynchronous output $o$ either on the P1 or P3 output lines, or on the UART, then the environment will receive the output at time $\tau'$. |
---|
624 | Moreover \texttt{status} is immediately updated with the continuation $k'$. |
---|
625 | |
---|
626 | Further, if $\pi_1(k) = \mathtt{Some}~\langle \tau',i,\epsilon,k'\rangle$, then at time $\tau'$ the environment will send the asynchronous input $i$ to the emulator and \texttt{status} is updated with the continuation $k'$. |
---|
627 | This input is visible to the emulator only at time $\tau' + \epsilon$. |
---|
628 | |
---|
629 | The time required to perform an I/O operation is partially specified in the data sheets of the UART module. |
---|
630 | This computation is complex so we prefer to abstract over it. |
---|
631 | We leave the computation of the delay time to the environment. |
---|
632 | |
---|
633 | We use only the P1 and P3 lines despite the MCS-51 having 4 output lines, P0--P3. |
---|
634 | This is because P0 and P2 become inoperable if the processor is equipped with XRAM (we assume it is). |
---|
635 | |
---|
636 | The UART port can work in several modes, depending on the how the SFRs are set. |
---|
637 | In an asyncrhonous mode, the UART transmits 8 bits at a time, using a ninth line for synchronisation. |
---|
638 | In a synchronous mode the ninth line is used to transmit an additional bit. |
---|
639 | |
---|
640 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
641 | % SECTION % |
---|
642 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
643 | \subsection{Computation of cost traces} |
---|
644 | \label{subsect.computation.cost.traces} |
---|
645 | |
---|
646 | As mentioned in Subsection~\ref{subsect.labels.pseudoinstructions} we introduced a notion of \emph{cost label}. |
---|
647 | Cost labels are inserted by the prototype C compiler at specific locations in the object code. |
---|
648 | Roughly, for those familiar with control flow graphs, they are inserted at the start of every basic block. |
---|
649 | |
---|
650 | Cost labels are used to calculate a precise costing for a program by marking the location of basic blocks. |
---|
651 | During the assembly phase, where labels and pseudoinstructions are eliminated, a map is generated associating cost labels with memory locations. |
---|
652 | This map is later used in a separate analysis which computes the cost of a program by traversing through a program, fetching one instruction at a time, and computing the cost of blocks. |
---|
653 | These block costings are stored in another map, and will later be passed back to the prototype compiler. |
---|
654 | |
---|
655 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
656 | % SECTION % |
---|
657 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
658 | \section{Validation} |
---|
659 | \label{sect.validation} |
---|
660 | |
---|
661 | We spent considerable effort attempting to ensure that what we have formalised is an accurate model of the MCS-51 microprocessor. |
---|
662 | |
---|
663 | We made use of multiple data sheets, each from a different semiconductor manufacturer. |
---|
664 | This helped us triangulate errors in the specification of the processor's instruction set, and its behaviour, for instance, in a data sheet from Philips Semiconductor. |
---|
665 | |
---|
666 | The O'Caml prototype was especially useful for validation purposes. |
---|
667 | We wrote a module for parsing and loading Intel HEX format files. |
---|
668 | Intel HEX is a standard format that all compilers targetting the MCS-51, and similar processors, produce. |
---|
669 | It is essentially a snapshot of the processor's code memory in compressed form. |
---|
670 | Using this we were able to compile C programs with SDCC and load the resulting program directly into the emulator's code memory, ready for execution. |
---|
671 | Further, we can produce a HEX file from the emulator's code memory for loading into third party tools. |
---|
672 | After each step of execution, we can print out both the instruction that had been executed and a snapshot of the processor's state, including all flags and register contents. |
---|
673 | For example: |
---|
674 | \begin{frametxt} |
---|
675 | \begin{small} |
---|
676 | \begin{verbatim} |
---|
677 | ... |
---|
678 | 08: mov 81 #07 |
---|
679 | |
---|
680 | Processor status: |
---|
681 | |
---|
682 | ACC: 0 B: 0 PSW: 0 |
---|
683 | with flags set as: |
---|
684 | CY: false AC: false FO: false RS1: false |
---|
685 | RS0: false OV: false UD: false P: false |
---|
686 | SP: 7 IP: 0 PC: 8 DPL: 0 DPH: 0 SCON: 0 |
---|
687 | SBUF: 0 TMOD: 0 TCON: 0 |
---|
688 | Registers: |
---|
689 | R0: 0 R1: 0 R2: 0 R3: 0 R4: 0 R5: 0 R6: 0 R7: 0 |
---|
690 | ... |
---|
691 | \end{verbatim} |
---|
692 | \end{small} |
---|
693 | \end{frametxt} |
---|
694 | Here, the trace indicates that the instruction \texttt{mov 81 \#07} has just been executed by the processor, which is now in the state indicated. |
---|
695 | These traces were useful in spotting anything that was `obviously' wrong with the execution of the program. |
---|
696 | |
---|
697 | We further used MCU 8051 IDE as a reference, which allows a user to step through an assembly program one instruction at a time. |
---|
698 | Using these execution traces, we were able to step through a compiled program in MCU 8051 IDE and compare the resulting execution trace with the trace produced by our emulator. |
---|
699 | |
---|
700 | The Matita formalisation was largely copied from the O'Caml source code, apart from the changes already mentioned. |
---|
701 | However, as the Matita emulator is executable, we could perform further validation by comparing the trace of a program's execution in the Matita emulator with the trace of the same program in the O'Caml emulator. |
---|
702 | |
---|
703 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
704 | % SECTION % |
---|
705 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
706 | \section{Related work} |
---|
707 | \label{sect.related.work} |
---|
708 | A large body of literature on the formalisation of microprocessors exists. |
---|
709 | The majority of it deals with proving correctness of implementations of microprocessors at the microcode or gate level. |
---|
710 | We are interested in providing a precise specification of the behaviour of the microprocessor in order to prove the correctness of a compiler which will target the processor. |
---|
711 | In particular, we are interested in intensional properties of the processor; precise timings of instruction execution in clock cycles. |
---|
712 | Moreover, in addition to formalising the interface of an MCS-51 processor, we have also built a complete MCS-51 ecosystem: UART, I/O lines, and hardware timers, complete with an assembler. |
---|
713 | |
---|
714 | Work closely related to our own can be found in~\cite{fox:trustworthy:2010}. |
---|
715 | Here, the authors describe the formalisation, in HOL4, of the ARMv7 instruction set architecture. |
---|
716 | They further point to an excellent list of references to related work in the literature for the interested reader. |
---|
717 | This formalisation also considers the machine code level, opposed to their formalisation, which only considering an abstract assembly language. |
---|
718 | In particular, instruction decoding is explicitly modeled inside HOL4's logic. |
---|
719 | We go further in also providing an assembly language, complete with assembler, to translate instructions and pseudoinstruction into machine code. |
---|
720 | |
---|
721 | Further, in~\cite{fox:trustworthy:2010} the authors validated their formalisation by using development boards and random testing. |
---|
722 | We currently rely on non-exhaustive testing against a third party emulator. |
---|
723 | We recognise the importance of this exhaustive testing, but currently leave it for future work. |
---|
724 | |
---|
725 | Executability is another key difference between our work and that of~\cite{fox:trustworthy:2010}. |
---|
726 | Our formalisation is executable: applying the emulation function to an input state eventually reduces to an output state. |
---|
727 | This is because Matita is based on a logic, CIC, which internalizes conversion. |
---|
728 | In~\cite{fox:trustworthy:2010} the authors provide an automation layer to derive single step theorems: if the processor is in a particular state that satisfies some preconditions, then after execution of an instruction it will reside in another state satisfying some postconditions. |
---|
729 | We do not need single step theorems of this form. |
---|
730 | |
---|
731 | Our main difficulties resided in the non-uniformity of an old 8-bit architecture, in terms of the instruction set, addressing modes and memory models. |
---|
732 | In contrast, the ARM instruction set and memory model is relatively uniform, simplifying any formalisation considerably. |
---|
733 | |
---|
734 | Perhaps the closest project to CerCo is CompCert~\cite{leroy:formally:2009}. |
---|
735 | CompCert concerns the certification of a C compiler and includes a formalisation in Coq of a subset of PowerPC. |
---|
736 | (Coq and Matita essentially share the same logic.) |
---|
737 | |
---|
738 | Despite this similarity, the two formalisations do not have much in common. |
---|
739 | First, CompCert provides a formalisation at the assembly level (no instruction decoding). |
---|
740 | This impels them to trust an unformalised assembler and linker, whereas we provide our own. |
---|
741 | Our formalisation is \emph{directly} executable, while the one in CompCert only provides a relation that describes execution. |
---|
742 | I/O is also not considered at all in CompCert. |
---|
743 | Moreover an idealized abstract and uniform memory model is assumed, while we take into account the complicated overlapping memory model of the MCS-51 architecture. |
---|
744 | Finally, 82 instructions of the more than 200 offered by the processor are formalised in CompCert, and the assembly language is augmented with macro instructions that are turned into `real' instructions only during communication with the external assembler. |
---|
745 | Even from a technical level the two formalisations differ: we tried to exploit dependent types whilst CompCert largely sticks to a non-dependent fragment of Coq. |
---|
746 | |
---|
747 | In~\cite{atkey:coqjvm:2007} an executable specification of the Java Virtual Machine, using dependent types, is presented. |
---|
748 | As we do, dependent types there are used to remove spurious partiality from the model. |
---|
749 | They also lower the need for over-specifying the behaviour of the processor in impossible cases. |
---|
750 | Our use of dependent types will also help to maintain invariants when we prove the correctness of the CerCo prototype C compiler. |
---|
751 | |
---|
752 | Finally~\cite{sarkar:semantics:2009} provides an executable semantics for x86-CC multiprocessor machine code. |
---|
753 | This machine code exhibits a high degree of non-uniformity similar to the MCS-51. |
---|
754 | However, only a small subset of the instruction set is considered, and they over-approximate the possibilities of unorthogonality of the instruction set, largely dodging the problems we had to face. |
---|
755 | |
---|
756 | Further, it seems that the definition of the decode function is potentially error prone. |
---|
757 | A small domain specific language of patterns is formalised in HOL4. |
---|
758 | This is similar to the specification language of the x86 instruction set found in manufacturer's data sheets. |
---|
759 | A decode function is implemented by copying lines from data sheets into the proof script, which are then interpreted. |
---|
760 | |
---|
761 | We are currently considering implementing a similar domain specific language in Matita. |
---|
762 | However, we would prefer to certify in Matita the compiler for this language. |
---|
763 | Data sheets could then be compiled down to the efficient code that we currently provide, instead of inefficiently interpreting the data sheets every time an instruction is executed. |
---|
764 | |
---|
765 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
766 | % SECTION % |
---|
767 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
768 | \section{Conclusions} |
---|
769 | \label{sect.conclusions} |
---|
770 | |
---|
771 | In CerCo, we are interested in the certification of a compiler for C that induces a precise cost model on the source code. |
---|
772 | Our cost model assigns costs to blocks of instructions by tracing the way that blocks are compiled, and by computing exact costs on generated machine language. |
---|
773 | To perform this accurately, we have provided an executable semantics for the MCS-51 family of processors. |
---|
774 | The formalisation was done twice, first in O'Caml and then in Matita, and captures the exact timings of the processor (according to a Siemen's data sheet). |
---|
775 | Moreover, the O'Caml formalisation also considers timers and I/O. |
---|
776 | Adding support for I/O and timers in Matita is on-going work that will not present any major problem, and was delayed only because the addition is not immediately useful for the formalisation of the CerCo compiler. |
---|
777 | |
---|
778 | The formalisation is done at machine level and not at assembly level; we also formalise fetching and decoding. |
---|
779 | We separately provide an assembly language, enhanched with labels and pseudoinstructions, and an assembler from this language to machine code. |
---|
780 | This assembly language is similar to those found in `industrial strength' compilers, such as SDCC. |
---|
781 | We introduce cost labels in the machine language to relate the data flow of the assembly program to that of the C source language, in order to associate costs to the C program. |
---|
782 | For the O'Caml version, we provide a parser and pretty printer from code memory to Intel HEX format. |
---|
783 | Hence we can perform testing on programs compiled using any free or commercial compiler. |
---|
784 | |
---|
785 | Our main difficulty in formalising the MCS-51 was the unorthogonality of its memory model and instruction set. |
---|
786 | These problems are easily handled in O'Caml by using advanced language features like polymorphic variants and phantom types, simulating Generalized Abstract Data Types. |
---|
787 | In Matita, we use dependent types to recover the same flexibility, to reduce spurious partiality, and to grant invariants that will be later useful in other formalisations in the CerCo project. |
---|
788 | |
---|
789 | The formalisation has been partially verified by computing execution traces on selected programs and comparing them with an existing emulator. |
---|
790 | All instructions have been tested at least once, but we have not yet pushed testing further, for example with random testing or by using development boards. |
---|
791 | I/O in particular has not been tested yet, and it is currently unclear how to provide exhaustive testing in the presence of I/O. |
---|
792 | Finally, we are aware of having over-specified the processor in several places, by fixing a behaviour hopefully consistent with the real machine, where manufacturer data sheets are ambiguous or under-specified. |
---|
793 | |
---|
794 | \bibliography{itp-2011.bib} |
---|
795 | |
---|
796 | \end{document} |
---|