Ignore:
Timestamp:
May 19, 2011, 3:16:20 PM (9 years ago)
Author:
mulligan
Message:

final changes, now to get under 8 pages

File:
1 edited

Legend:

Unmodified
Added
Removed
  • Deliverables/D4.1/ITP-Paper/itp-2011.tex

    r815 r817  
    169169We have fully formalised an executable formal semantics of a family of 8-bit Freescale microprocessors~\cite{oliboni:matita:2008}, and provided a similar executable formal semantics for the MCS-51 microprocessor.
    170170The latter is what we describe in this paper.
    171 The focus of the formalisation has been on capturing the intensional behaviour of the processor.
     171The focus of the formalisation has been on capturing the intensional behaviour of the processor; this is not novel.
    172172However, the design of the MCS-51 itself has caused problems in the formalisation.
    173173For example, the MCS-51 has a highly unorthogonal instruction set.
    174174To cope with this unorthogonality, and to produce an executable specification, we rely on the dependent types of Matita, an interactive proof assistant~\cite{asperti:user:2007}.
     175The manner in which we combined dependent types and coercions to handle this problem is novel.
    175176
    176177\paragraph*{The MCS-51}\quad
     
    310311\paragraph*{Overview of paper}\quad
    311312In Section~\ref{sect.design.issues.formalisation} we discuss design issues in the development of the formalisation.
    312 In Section~\ref{sect.validation} we discuss how we validated the design and implementation of the emulator to ensure that what we formalised was an accurate model of an MCS-51 series microprocessor.
     313In Section~\ref{sect.validation} we discuss validation of the emulator to ensure that what we formalised was an accurate model of an MCS-51 series microprocessor.
    313314In Section~\ref{sect.related.work} we describe previous work, with an eye toward describing its relation with the work described herein.
    314315In Section~\ref{sect.conclusions} we conclude.
     
    319320\section{Design issues in the formalisation}
    320321\label{sect.design.issues.formalisation}
     322
     323We implemented two emulators, one in O'Caml and one in Matita.
     324The O'Caml emulator is intended to be `feature complete' with respect to the MCS-51 device.
     325However, the Matita emulator is intended to be used as a target for a certified, complexity preserving C compiler.
     326As a result, not all features of the MCS-51 are formalised in the Matita emulator.
     327
     328We designed the O'Caml emulator to be as efficient as possible, under the constraint that it would be eventually translated into Matita.
     329One performance drain in the O'Caml emulator is the use of purely functional map datastructures to represent memory spaces, used to maintain the close correspondence between the Matita and O'Caml emulators.
    321330
    322331Matita~\cite{asperti:user:2007} is a proof assistant based on the Calculus of Coinductive constructions, similar to Coq.
     
    409418We traverse a path, and upon encountering \texttt{Stub}, we return a default value\footnote{All manufacturer data sheets that we consulted were silent on the subject of what should be returned if we attempt to access uninitialized memory.  We defaulted to simply returning zero, though our \texttt{lookup} function is parametric in this choice.  We do not believe that this is an outrageous decision, as SDCC for instance generates code which first `zeroes out' all memory in a preamble before executing the program proper.  This is in line with the C standard, which guarantees that all global variables will be zero initialized piecewise.}.
    410419
     420\texttt{BitVectorTrie} and \texttt{Vector} datastructures, and related functions, can be used in the formalising other microprocessors.
     421
    411422%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    412423% SECTION                                                                      %
     
    429440While introducing pseudoinstructions, we also introduced labels for locations to jump to, and for global data.
    430441To specify global data via labels, we introduced a preamble before the program where labels and the size of reserved space for data is stored.
    431 A pseudoinstruction \texttt{Mov} moves (16-bit) data stored at a label into the MCS-51's one 16-bit register, \texttt{DPTR}.
     442A pseudoinstruction \texttt{Mov} moves (16-bit) data stored at a label into the MCS-51's 16-bit register, \texttt{DPTR}.
    432443(This register is used for indirect addressing of data stored in external memory.)
    433444
     
    680691
    681692We use only the P1 and P3 lines despite the MCS-51 having 4 output lines, P0--P3.
    682 This is because P0 and P2 become inoperable if the processor is equipped with XRAM (we assume it is).
     693This is because P0 and P2 become inoperable if XRAM is present (we assume it is).
    683694
    684695The UART port can work in several modes, depending on the how the SFRs are set.
     
    748759Using these execution traces, we were able to step through a compiled program in MCU 8051 IDE and compare the resulting execution trace with the trace produced by our emulator.
    749760
    750 We partially validated the assembler by checking that on defined opcodes the \texttt{assembly\_1} and \texttt{fetch} functions are inverse.
     761We partially validated the assembler by proving in Matita that on all defined opcodes the \texttt{assembly\_1} and \texttt{fetch} functions are inverse.
    751762
    752763The Matita formalisation was largely copied from the O'Caml source code, apart from the changes already mentioned.
     
    759770\label{sect.related.work}
    760771A large body of literature on the formalisation of microprocessors exists.
    761 The majority of it deals with proving correctness of implementations of microprocessors at the microcode or gate level.
     772The majority of it deals with proving correctness of implementations of microprocessors at the microcode or gate level, with many considering `cycle accurate' models.
    762773We are interested in providing a precise specification of the behaviour of the microprocessor in order to prove the correctness of a compiler which will target the processor.
    763774In particular, we are interested in intensional properties of the processor; precise timings of instruction execution in clock cycles.
     
    784795In contrast, the various ARM instruction sets and memory models are relatively uniform.
    785796
    786 Perhaps the closest project to CerCo is CompCert~\cite{leroy:formally:2009}.
     797Two close projects to CerCo are CompCert~\cite{leroy:formally:2009} and the CLI Stack.
    787798CompCert concerns the certification of a C compiler and includes a formalisation in Coq of a subset of PowerPC.
     799The CLI Stack consists of the design and verification of a whole chain of artifacts including a 32-bit microprocessor, the Piton assembler and two compilers for high-level languages.
     800Like CerCo, the CLI Stack compilers gave the cost of high-level instructions in processor cycles.
     801However, unlike CerCo, both the CLI Stack high-level languages ($\mu$Gypsy and Nqthm Pure Lisp) and FM9001 microprocessor were not commercial products, but `artificial' designs created for the purpose of verification (see~\cite{moore:grand:2005}).
     802
    788803The CompCert C compiler is extracted to O'Caml using Coq's code extraction facility.
    789 We aim to make use of a similar facility in Matita.
    790804Many other formalised emulators/compilers have also been extracted from proof assistants using similar technology (e.g. see~\cite{blanqui:designing:2010}).
    791 
    792 Despite this similarity, the two formalisations do not have much in common.
     805We aim to make use of a similar code extraction facility in Matita, but only if the extracted code exhibits the same degree of type safety, provided by polymorphic variants, and human readability that the O'Caml emulator posseses.
     806This is because we aim to use the emulator as a library for non-certified software written directly in O'Caml.
     807How we have used Matita's dependent types to handle the instruction set (Subsection~\ref{subsect.instruction.set.unorthogonality}) could enable code extraction to make use of polymorphic variants.
     808Using Coq's current code extraction algorithm we could write assembly programs that would generate runtime errors when emulated.
     809We leave this for future work.
     810
     811Despite the apparent similarity between CerCo and CompCert, the two formalisations do not have much in common.
    793812First, CompCert provides a formalisation at the assembly level (no instruction decoding).
    794813This impels them to trust an unformalised assembler and linker, whereas we provide our own.
     
    836855Our main difficulty in formalising the MCS-51 was the unorthogonality of its memory model and instruction set.
    837856These problems are easily handled in O'Caml by using advanced language features like polymorphic variants and phantom types, simulating Generalized Abstract Data Types~\cite{xi:guarded:2003}.
    838 In Matita, we use dependent types to recover the same flexibility, to reduce spurious partiality, and to grant invariants that will be later useful in other formalisations in the CerCo project.
     857Importantly, we discovered the best manner of using dependent types to recover the same flexibility, to reduce spurious partiality, and to grant invariants that will be later useful in other formalisations in the CerCo project.
    839858
    840859The formalisation has been partially verified by computing execution traces on selected programs and comparing them with an existing emulator.
Note: See TracChangeset for help on using the changeset viewer.