# Changeset 579

Ignore:
Timestamp:
Feb 18, 2011, 3:08:04 PM (9 years ago)
Message:

slight tweaks

Location:
Deliverables/D4.1/ITP-Paper
Files:
2 edited

### Legend:

Unmodified
 r578 A user can certify that deadlines' are met whilst wringing as many clock cycles from the processor---using a cost model that does not over-estimate---as possible. We also see our approach as being relevant to compiler verification (and construction) itself. We also see our approach as being relevant to compiler verification (and construction). \emph{An optimisation specified only extensionally is only half specified}. Though the optimisation may preserve the denotational semantics of a program, there is no guarantee that any intensional properties of the program will be improved. Though the optimisation may preserve the denotational semantics of a program, there is no guarantee that intensional properties of the program improve. Another potential application is toward completeness and correctness of the compilation process in the presence of space constraints. However, the design of the MCS-51 itself has caused problems in the formalisation. For example, the MCS-51 has a highly unorthogonal instruction set. To cope with this unorthogonality, and to produce an executable specification, we rely on Matita's dependent types. To cope with this unorthogonality, and to produce an executable specification, we rely on the dependent types of Matita, an interactive proof assistant~\cite{asperti:user:2007}. \paragraph{The MCS-51}\quad The MCS-51 is an 8-bit microprocessor introduced by Intel in the late 1970s. Commonly called the 8051, in the decades since its introduction the processor has become a popular component of embedded systems. The processor, its successor the 8052, and derivatives are still manufactured \emph{en masse} by a host of vendors. The processor and derivatives are still manufactured \emph{en masse} by a host of vendors. The 8051 is a well documented processor, and has the support of numerous open source and commercial tools, such as compilers and emulators. \label{sect.design.issues.formalisation} From hereonin, we typeset O'Caml source with \texttt{\color{blue}{blue}} and Matita source with \texttt{\color{red}{red}}. We typeset O'Caml source with \texttt{\color{blue}{blue}} and Matita source with \texttt{\color{red}{red}}. Matita's syntax is straightforward if familiar with Coq or O'Caml. One subtlety is the use of \texttt{?}' or \texttt{$\ldots$}' in an argument position denoting an argument or arguments to be inferred, respectively. One subtlety is the use of \texttt{?}' or \texttt{$\ldots$}' denoting an argument or arguments to be inferred, respectively. A full account of the formalisation can be found in~\cite{cerco-report:2011}. All source code is available from the CerCo project website~\cite{cerco-code:2011}. \subsection{Development strategy} To specify global data via labels, we introduced a preamble before the program where labels and the size of reserved space for data is stored. A pseudoinstruction \texttt{Mov} moves (16-bit) data stored at a label into the MCS-51's one 16-bit register, \texttt{DPTR}. (This register is used for indirect addressing of data stored in external memory.) The pseudoinstructions and labels induce an assembly language similar to that of SDCC's. The next instruction to be processed, indexed by the program counter, is fetched from code memory with \texttt{fetch}. An updated program counter, along with its concrete cost in processor cycles, is also returned. These costs are taken from a Siemens Semiconductor Group data sheet for the MCS-51~\cite{siemens:2011}, and will likely vary across manufacturers and derivatives of the processor. These costs are taken from a Siemens Semiconductor Group data sheet for the MCS-51~\cite{siemens:2011}, and will likely vary between particular implementations. \begin{lstlisting} definition fetch: BitVectorTrie Byte 16 $\rightarrow$ Word $\rightarrow$ instruction $\times$ Word $\times$ nat \end{lstlisting} This differs from the O'Caml emulator, which executed a program indefinitely. A callback function was also accepted as an argument, which could witness' the execution as it happened, providing a print-out of the processor state. A callback function was also accepted as an argument, which witnessed' the execution as it happened. Due to Matita's termination requirement, \texttt{execute} cannot execute a program indefinitely. An alternative approach would be to produce an infinite stream of statuses representing an execution trace. \label{sect.validation} \begin{figure}[t] \begin{scriptsize} \begin{verbatim} 08: mov 81 #07 Processor status: ACC: 0   B: 0   PSW: 0 with flags set as: CY: false    AC: false   FO: false   RS1: false RS0: false   OV: false   UD: false   P: false SP: 7   IP: 0   PC: 8   DPL: 0   DPH: 0   SCON: 0 SBUF: 0   TMOD: 0   TCON: 0 Registers: R0: 0   R1: 0   R2: 0   R3: 0   R4: 0   R5: 0   R6: 0   R7: 0 \end{verbatim} \end{scriptsize} \caption{An example snippet from an emulator execution trace} \label{fig.execution.trace} \end{figure} We spent considerable effort attempting to ensure that what we have formalised is an accurate model of the MCS-51 microprocessor. We made use of multiple data sheets, each from a different semiconductor manufacturer. We made use of multiple data sheets, each from a different manufacturer. This helped us triangulate errors in the specification of the processor's instruction set, and its behaviour, for instance, in a data sheet from Philips Semiconductor. Further, we can produce a HEX file from the emulator's code memory for loading into third party tools. After each step of execution, we can print out both the instruction that had been executed and a snapshot of the processor's state, including all flags and register contents. For example: \begin{frametxt} \begin{small} \begin{verbatim} ... 08: mov 81 #07 Processor status: ACC: 0   B: 0   PSW: 0 with flags set as: CY: false  AC: false  FO: false  RS1: false RS0: false  OV: false UD: false  P: false SP: 7  IP: 0  PC: 8  DPL: 0  DPH: 0  SCON: 0 SBUF: 0  TMOD: 0  TCON: 0 Registers: R0: 0  R1: 0  R2: 0  R3: 0  R4: 0  R5: 0  R6: 0  R7: 0 ... \end{verbatim} \end{small} \end{frametxt} A snippet from an execution trace is found in Figure~\ref{fig.execution.trace}. Here, the trace indicates that the instruction \texttt{mov 81 \#07} has just been executed by the processor, which is now in the state indicated. These traces were useful in spotting anything that was obviously' wrong with the execution of the program. We further used MCU 8051 IDE as a reference, which allows a user to step through an assembly program one instruction at a time. Using these execution traces, we were able to step through a compiled program in MCU 8051 IDE and compare the resulting execution trace with the trace produced by our emulator. We partially validated the assembly function by checking that on defined opcodes the \texttt{assembly\_1} and \texttt{fetch} functions are inverse. The Matita formalisation was largely copied from the O'Caml source code, apart from the changes already mentioned. Our formalisation is executable: applying the emulation function to an input state eventually reduces to an output state. This is because Matita is based on a logic, CIC, which internalizes conversion. In~\cite{fox:trustworthy:2010} the authors provide an automation layer to derive single step theorems: if the processor is in a particular state that satisfies some preconditions, then after execution of an instruction it will reside in another state satisfying some postconditions. In~\cite{fox:trustworthy:2010} the authors provide an automation layer to derive single step theorems: if the processor is in a state that satisfies some preconditions, then after execution of an instruction it will reside in a state satisfying some postconditions. We do not need single step theorems of this form. This assembly language is similar to those found in `industrial strength' compilers, such as SDCC. We introduce cost labels in the machine language to relate the data flow of the assembly program to that of the C source language, in order to associate costs to the C program. For the O'Caml version, we provide a parser and pretty printer from code memory to Intel HEX format. For the O'Caml version, we provide a parser and pretty printer from code memory to Intel HEX. Hence we can perform testing on programs compiled using any free or commercial compiler.