Changeset 3512 for Papers

Oct 20, 2014, 12:53:29 PM (4 years ago)

more work on the introduction

1 edited


  • Papers/sttt/main.tex

    r3477 r3512  
    5151We consider the formalisation of an assembler for the Intel MCS-51 8-bit microprocessor in the Matita proof assistant~\cite{asperti:user:2007}.
    5252This formalisation forms a major component of the EU-funded CerCo (`Certified Complexity') project~\cite{cerco:2011}, concerning the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language.
    54 The \textsc{mcs-51}, commonly called the 8051, is an 8-bit Harvard architecture \textsc{cisc} instruction set micro-controller dating from the early 1980s, and was originally manufactured by Intel.
    55 An extended variant, the \textsc{mcs-52} or 8052, was subsequently released adding extra \textsc{ram} and \textsc{rom} above and beyond that offered by the original \textsc{mcs-51}, and an extra timer.
    56 Derivatives are still widely manufactured by a number of semiconductor foundries, with the processor being used especially in embedded systems.
    58 The MCS-51 has a relative paucity of features compared to its more modern brethren, with a lack of any caching or pipelining features meaning that timing of execution is predictable, making the MCS-51 very attractive for CerCo's ends.
    59 However, the MCS-51's paucity of features---though an advantage in many respects---also quickly becomes a hindrance, as the MCS-51 features a relatively minuscule series of memory spaces by modern standards.
    60 As a result our C compiler, to be able to successfully compile realistic programs for embedded devices, ought to produce `tight' machine code.
     53Though this document describes the formalisation of an assembler for a specific microprocessor, we believe that the techniques described are general, and should be of interest to anybody providing a proof of correctness for an optimising assembler.
     55The \textsc{mcs-51}, commonly called the 8051, is an 8-bit Harvard architecture \textsc{cisc} instruction set microprocessor dating from the early 1980s.
     56An extended variant, the \textsc{mcs-52}, or 8052, was subsequently released adding extra \textsc{ram} and \textsc{rom} above and beyond that offered by the original \textsc{mcs-51}, along with an additional timer.
     57Originally manufactured and marketed by Intel, the microprocessor quickly gained and maintained its popularity, especially in embedded systems development, with dozens of semiconductor foundries subsequently releasing derivative copies, many of which are still manufactured today.
     59Admittedly, a typical \textsc{mcs-51} derivative lacks many features compared to their more modern brethren.
     60In particular, most derivatives lack a cache, any pipelining features, branch speculation, out-of-order execution, or other advanced features we now take for granted on state-of-the-art microprocessors.
     61However, as a direct result of this simplicity, program execution is predictable: as part of the processor's specification provided by the manufacturer we can be sure that a \texttt{NOP} instruction will always take exactly one processor cycle to execute, whilst an \texttt{ADD} instruction always takes two processor cycles, and so on and so forth for all instructions in the processor's instruction set.
     63Certainty surrounding execution times makes the \textsc{mcs-51} especially attractive for certain embedded development tasks.
     64Moreover, this same certainty also makes the \textsc{mcs-51} especially attractive for CerCo's ends, wherein we aim to produce a verified concrete-complexity preserving compiler that lifts a cost model for a C source file---induced on the program's machine code image under compilation---back through the compilation chain.
     65This cost model will ultimately be made manifest as complexity assertions annotating the original C source code, and these annotations may then be used by the programmer, perhaps in conjunction with external reasoning tools, to derive accurate resource bounds for their program.
     66As a result, knowing that a given block of machine code instructions will take a fixed number of cycles to execute, irrespective of the program context or processor state when executed, allows us to generate a cycle-accurate cost model on the original C source code.
    6269To do this, we must solve the `branch displacement' problem---deciding how best to expand pseudojumps to labels in assembly language to machine code jumps.
Note: See TracChangeset for help on using the changeset viewer.