Changeset 574

Ignore:
Timestamp:
Feb 18, 2011, 1:32:46 PM (8 years ago)
Message:

Tweaked the document, and removed all junk from the end of the file.

File:
1 edited

Legend:

Unmodified
 r572 \begin{abstract} We summarise our formalisation of an emulator for the MCS-51 microprocessor in the Matita proof assistant. We summarise the formalisation of an emulator for the MCS-51 microprocessor in the Matita proof assistant. The MCS-51 is a widely used 8-bit microprocessor, especially popular in embedded devices. The formalisation proceeded in two stages, first implementing an O'Caml prototype, for quickly ironing out' bugs, and then porting the O'Caml emulator to Matita. Though mostly straight-forward, this porting presented multiple problems. Of particular interest is how we handled the unorthoganality of the MSC-51's instruction set. Of particular interest is how the unorthoganality of the MSC-51's instruction set is handled. In O'Caml, this was handled with polymorphic variants. In Matita, we achieved the same effect with a non-standard use of dependent types. The latter is what we describe in this paper. The focus of the formalisation has been on capturing the intensional behaviour of the processor. However, the design of the MCS-51 itself has caused problems in our formalisation. However, the design of the MCS-51 itself has caused problems in the formalisation. For example, the MCS-51 has a highly unorthogonal instruction set. To cope with this unorthogonality, and to produce an executable specification, we rely on Matita's dependent types. For instance, the open source Small Device C Compiler (SDCC) recognises a dialect of C~\cite{sdcc:2010}, and other compilers for BASIC, Forth and Modula-2 are also extant. An open source emulator for the processor, MCU 8051 IDE, is also available~\cite{mcu8051ide:2010}. Both MCU 8051 IDE and SDCC were used in for validating our formalisation. Both MCU 8051 IDE and SDCC were used in for validating the formalisation. \begin{figure}[t] \paragraph{Overview of paper}\quad In Section~\ref{sect.design.issues.formalisation} we discuss design issues in the development of the formalisation. In Section~\ref{sect.validation} we discuss how we validated the design and implementation of our emulator to ensure that what we formalised was an accurate model of an MCS-51 series microprocessor. In Section~\ref{sect.validation} we discuss how we validated the design and implementation of the emulator to ensure that what we formalised was an accurate model of an MCS-51 series microprocessor. In Section~\ref{sect.related.work} we describe previous work, with an eye toward describing its relation with the work described herein. In Section~\ref{sect.conclusions} we conclude. \label{subsect.development.strategy} Our implementation progressed in two stages. The implementation progressed in two stages. We began with an emulator written in O'Caml to iron out' any bugs in the design and implementation. O'Caml's ability to perform file I/O also eased debugging and validation. The formalization of MCS-51 must deal with bytes (8-bits), words (16-bits), and also more exoteric quantities (7, 3 and 9-bits). To avoid difficult-to-trace size mismatch bugs, we represented all quantities using bitvectors, i.e. fixed length vectors of booleans. In our O'Caml emulator, we faked' bitvectors using phantom types~\cite{leijen:domain:1999} implemented with polymorphic variants~\cite{garrigue:programming:1998}, as in Figure~\ref{fig.ocaml.implementation.bitvectors}. In the O'Caml emulator, we faked' bitvectors using phantom types~\cite{leijen:domain:1999} implemented with polymorphic variants~\cite{garrigue:programming:1998}, as in Figure~\ref{fig.ocaml.implementation.bitvectors}. From within the bitvector module (left column) bitvectors are just lists of bits and no guarantee is provided on sizes. However, the module's interface (right column) enforces size invariants in the rest of the code. The MCS-51 has numerous disjoint memory spaces addressed by differently sized pointers. In our O'Caml implementation, we use a map data structure (from the standard library) for each space. In the O'Caml implementation, we use a map data structure (from the standard library) for each space. Matita's standard library is small, and does not contain a generic map data structure. We had the opportunity of crafting a dependently typed special-purpose data structure for the job to enforce the correspondence between the size of pointer and the size of the memory space. We picked a modified form of trie of fixed height $h$. Paths are represented by bitvectors (already used in our implementation for addresses and registers) of length $h$: Paths are represented by bitvectors (already used in the implementation for addresses and registers) of length $h$: \begin{lstlisting} inductive BitVectorTrie (A: Type[0]): nat $\rightarrow$ Type[0] ≝ A pseudoinstruction \texttt{Mov} moves (16-bit) data stored at a label into the MCS-51's one 16-bit register, \texttt{DPTR}. Our pseudoinstructions and labels induce an assembly language similar to that of SDCC's. The pseudoinstructions and labels induce an assembly language similar to that of SDCC's. All pseudoinstructions and labels are assembled away' prior to program execution. Jumps are computed in two stages. ... \end{lstlisting} Which were then combined in our inductive datatype for assembly instructions using the union operator $|$': Which were then combined in the inductive datatype for assembly preinstructions using the union operator `$|$': \begin{lstlisting} type 'addr preinstruction = Proof obligations require us to state and prove a few auxilliary lemmas related to the transitivity of subtyping. For instance, an \texttt{addressing\_mode} that belongs to an allowed set also belongs to any one of its supersets. At the moment, Matita's automation exploits these lemmas to completely solve all the proof obligations opened in our formalisation. At the moment, Matita's automation exploits these lemmas to completely solve all the proof obligations opened in the formalisation. The \texttt{execute\_1} function, for instance, opens over 200 proof obligations during type checking. We leave the computation of the delay time to the environment. We use only the P1 and P3 lines despite the MCS-51 having four output lines, P0--P3. We use only the P1 and P3 lines despite the MCS-51 having 4 output lines, P0--P3. This is because P0 and P2 become inoperable if the processor is equipped with XRAM (we assume it is). Intel HEX is a standard format that all compilers targetting the MCS-51, and similar processors, produce. It is essentially a snapshot of the processor's code memory in compressed form. Using this we were able to compile C programs with SDCC and load the resulting program directly into our emulator's code memory, ready for execution. Further, we can produce a HEX file from our emulator's code memory for loading into third party tools. Using this we were able to compile C programs with SDCC and load the resulting program directly into the emulator's code memory, ready for execution. Further, we can produce a HEX file from the emulator's code memory for loading into third party tools. After each step of execution, we can print out both the instruction that had been executed and a snapshot of the processor's state, including all flags and register contents. For example: We further used MCU 8051 IDE as a reference, which allows a user to step through an assembly program one instruction at a time. Using our execution traces, we were able to step through a compiled program in MCU 8051 IDE and compare the resulting execution trace with the trace produced by our emulator. Our Matita formalisation was largely copied from the O'Caml source code, apart from the changes already mentioned. Using these execution traces, we were able to step through a compiled program in MCU 8051 IDE and compare the resulting execution trace with the trace produced by our emulator. The Matita formalisation was largely copied from the O'Caml source code, apart from the changes already mentioned. However, as the Matita emulator is executable, we could perform further validation by comparing the trace of a program's execution in the Matita emulator with the trace of the same program in the O'Caml emulator. \end{document} \newpage \appendix \section{Listing of main O'Caml functions} \label{sect.listing.main.ocaml.functions} \subsubsection{From \texttt{ASMInterpret.ml(i)}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}@{\quad}p{9cm}} Name & Description \\ \hline \texttt{assembly} & Assembles an abstract syntax tree representing an 8051 assembly program into a list of bytes, its compiled form. \\ \texttt{initialize} & Initializes the emulator status. \\ \texttt{load} & Loads an assembled program into the emulator's code memory. \\ \texttt{fetch} & Fetches the next instruction, and automatically increments the program counter. \\ \texttt{execute} & Emulates the processor.  Accepts as input a function that pretty prints the emulator status after every emulation loop. \\ \end{tabular*} \end{center} \subsubsection{From \texttt{ASMCosts.ml(i)}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}@{\quad}p{9cm}} Name & Description \\ \hline \texttt{compute} & Computes a map associating costings to basic blocks in the program. \end{tabular*} \end{center} \subsubsection{From \texttt{IntelHex.ml(i)}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}@{\quad}p{9cm}} Name & Description \\ \hline \texttt{intel\_hex\_of\_file} & Reads in a file and parses it if in Intel IHX format, otherwise raises an exception. \\ \texttt{process\_intel\_hex} & Accepts a parsed Intel IHX file and populates a hashmap (of the same type as code memory) with the contents. \end{tabular*} \end{center} \subsubsection{From \texttt{Physical.ml(i)}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}@{\quad}p{9cm}} Name & Description \\ \hline \texttt{subb8\_with\_c} & Performs an eight bit subtraction on bitvectors.  The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\ \texttt{add8\_with\_c} & Performs an eight bit addition on bitvectors.  The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\ \texttt{dec} & Decrements an eight bit bitvector with underflow, if necessary. \\ \texttt{inc} & Increments an eight bit bitvector with overflow, if necessary. \end{tabular*} \end{center} \newpage \section{Listing of main Matita functions} \label{sect.listing.main.matita.functions} \subsubsection{From \texttt{Arithmetic.ma}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}p{9cm}} Title & Description \\ \hline \texttt{add\_n\_with\_carry} & Performs an $n$ bit addition on bitvectors.  The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\ \texttt{sub\_8\_with\_carry} & Performs an eight bit subtraction on bitvectors. The function also returns the most important PSW flags for the 8051: carry, auxiliary carry and overflow. \\ \texttt{half\_add} & Performs a standard half addition on bitvectors, returning the result and carry bit. \\ \texttt{full\_add} & Performs a standard full addition on bitvectors and a carry bit, returning the result and a carry bit. \end{tabular*} \end{center} \subsubsection{From \texttt{Assembly.ma}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}p{9cm}} Title & Description \\ \hline \texttt{assemble1} & Assembles a single 8051 assembly instruction into its memory representation. \\ \texttt{assemble} & Assembles an 8051 assembly program into its memory representation.\\ \texttt{assemble\_unlabelled\_program} &\\& Assembles a list of (unlabelled) 8051 assembly instructions into its memory representation. \end{tabular*} \end{center} \subsubsection{From \texttt{BitVectorTrie.ma}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}p{9cm}} Title & Description \\ \hline \texttt{lookup} & Returns the data stored at the end of a particular path (a bitvector) from the trie.  If no data exists, returns a default value. \\ \texttt{insert} & Inserts data into a tree at the end of the path (a bitvector) indicated.  Automatically expands the tree (by filling in stubs) if necessary. \end{tabular*} \end{center} \subsubsection{From \texttt{DoTest.ma}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}p{9cm}} Title & Description \\ \hline \texttt{execute\_trace} & Executes an assembly program for a fixed number of steps, recording in a trace which instructions were executed. \end{tabular*} \end{center} \subsubsection{From \texttt{Fetch.ma}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}p{9cm}} Title & Description \\ \hline \texttt{fetch} & Decodes and returns the instruction currently pointed to by the program counter and automatically increments the program counter the required amount to point to the next instruction. \\ \end{tabular*} \end{center} \subsubsection{From \texttt{Interpret.ma}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}p{9cm}} Title & Description \\ \hline \texttt{execute\_1} & Executes a single step of an 8051 assembly program. \\ \texttt{execute} & Executes a fixed number of steps of an 8051 assembly program. \end{tabular*} \end{center} \subsubsection{From \texttt{Status.ma}} \begin{center} \begin{tabular*}{\textwidth}{p{3cm}p{9cm}} Title & Description \\ \hline \texttt{load} & Loads an assembled 8051 assembly program into code memory. \end{tabular*} \end{center} \end{document}