Changeset 1033


Ignore:
Timestamp:
Jun 21, 2011, 3:08:25 PM (8 years ago)
Author:
sacerdot
Message:

ispelled & submitted

File:
1 edited

Legend:

Unmodified
Added
Removed
  • src/ASM/CPP2011/cpp-2011.tex

    r1032 r1033  
    4747\begin{abstract}
    4848We consider the formalisation of an assembler for Intel MCS-51 assembly language in the Matita proof assistant.
    49 This formalisation forms a major component of the EU-funded CerCo project, concering the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language.
     49This formalisation forms a major component of the EU-funded CerCo project, concerning the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language.
    5050
    5151The efficient expansion of pseudoinstructions---particularly jumps---into MCS-51 machine instructions is complex.
     
    6767
    6868We consider the formalisation of an assembler for the Intel MCS-51 8-bit microprocessor in the Matita proof assistant~\cite{asperti:user:2007}.
    69 This formalisation forms a major component of the EU-funded CerCo project~\cite{cerco:2011}, concering the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language.
     69This formalisation forms a major component of the EU-funded CerCo project~\cite{cerco:2011}, concerning the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language.
    7070
    7171The MCS-51 dates from the early 1980s and is commonly called the 8051/8052.
     
    7979For the MCS-51, if a manufacturer states that a particular opcode takes three clock cycles to execute, then that opcode \emph{always} takes three clock cycles to execute.
    8080
    81 This predicability of timing information is especially attractive to the CerCo consortium.
     81This predictability of timing information is especially attractive to the CerCo consortium.
    8282We are in the process of constructing a certified, concrete complexity compiler for a realistic processor, and not for building and formalising the worst case execution time tools (WCET---see~\cite{bate:wcet:2011}, amongst many others, for an application of WCET technology to microprocessors with more complex designs) that would be necessary to achieve the same result with, for example, a modern ARM or PowerPC microprocessor.
    8383
     
    9191Consequently, the size of each opcode is different, and to squeeze as much code as possible into the MCS-51's limited code memory, the smallest possible opcode that will suffice should be selected.
    9292
    93 The prototype CerCo C compiler does not attempt to select the smallest jump opcode in this manner, as this was thought to unneccessarily complicate the compilation chain, making the eventual translation and formalisation of the compiler into Matita much harder.
     93The prototype CerCo C compiler does not attempt to select the smallest jump opcode in this manner, as this was thought to unnecessarily complicate the compilation chain, making the eventual translation and formalisation of the compiler into Matita much harder.
    9494Instead, the compiler targets a bespoke assembly language, similar to `real world' assembly languages, complete with pseudoinstructions including \texttt{Jmp} and \texttt{Call} instructions.
    9595Labels, conditional jumps to labels, a program preamble containing global data and a \texttt{MOV} instruction for moving this global data into the MCS-51's one 16-bit register also feature.
     
    155155Depending on the particular MCS-51 derivative at hand, an \texttt{SJMP} could in theory take a different number of clock cycles to execute than an \texttt{LJMP}.
    156156These issues must also be dealt with in order to prove that the translation pass preserves the concrete complexity of assembly code, and that the semantics of a program using the MCS-51's I/O facilities does not change.
    157 We address this problem by parameterizing the semantics over a cost model.
     157We address this problem by parameterising the semantics over a cost model.
    158158We prove the preservation of concrete complexity only for the program-dependent cost model induced by the optimisation.
    159159
     
    172172We observe that $100_{16} = 256_{10}$, and lies \emph{just} outside the range expressible in an 8-bit byte (0--255).
    173173
    174 As our example shows, given an occurence $l$ of an \texttt{LJMP} instruction, it may be possible to shrink $l$ to an occurence of an \texttt{SJMP}---consuming fewer bytes of code memory---provided we can shrink any \texttt{LJMP}s that exist between $l$ and its target location.
     174As our example shows, given an occurrence $l$ of an \texttt{LJMP} instruction, it may be possible to shrink $l$ to an occurrence of an \texttt{SJMP}---consuming fewer bytes of code memory---provided we can shrink any \texttt{LJMP}s that exist between $l$ and its target location.
    175175In particular, if we wish to shrink the \texttt{LJMP} occurring at line 1, then we must shrink the \texttt{LJMP} occurring at line 3.
    176176However, to shrink the \texttt{LJMP} occurring at line 3 we must also shrink the \texttt{LJMP} occurring at line 5, and \emph{vice versa}.
     
    181181
    182182How we went about resolving this problem affected the shape of our proof of correctness for the whole assembler in a rather profound way.
    183 We first attempted to synthesize a solution bottom up: starting with no solution, we gradually refine a solution using the same functions that implement the jump expansion process.
     183We first attempted to synthesise a solution bottom up: starting with no solution, we gradually refine a solution using the same functions that implement the jump expansion process.
    184184Using this technique, solutions can fail to exist, and the proof of correctness for the assembler quickly descends into a diabolical quagmire.
    185185
     
    447447Then, we check with \texttt{fetch\_many} that the number of machine instructions that were fetched matches the number of instruction that \texttt{expand\_pseudo\_instruction} expanded.
    448448
    449 The final lemma in this series is \texttt{fetch\_assembly\_pseudo2} that combines the Lemma \texttt{fetch\_aasembly\_pseudo} with the correctness of the functions that load object code into the processor's memory.
     449The final lemma in this series is \texttt{fetch\_assembly\_pseudo2} that combines the Lemma \texttt{fetch\_assembly\_pseudo} with the correctness of the functions that load object code into the processor's memory.
    450450\begin{lstlisting}
    451451lemma fetch_assembly_pseudo2:
     
    575575However, this comes at the expense of assembler completeness: the generated program may be too large to fit into code memory.
    576576In this respect, there is a trade-off between the completeness of the assembler and the efficiency of the assembled program.
    577 The definition and proof of a complete, optimal (in the sense that object code size is minimized) and correct jump expansion policy is ongoing work.
     577The definition and proof of a complete, optimal (in the sense that object code size is minimised) and correct jump expansion policy is ongoing work.
    578578
    579579Aside from their application in verified compiler projects such as CerCo and CompCert, verified assemblers such as ours could also be applied to the verification of operating system kernels.
     
    585585If CompCert chooses to assume the existence of an optimising assembler, then care should be made to ensure that any assembly program produced by the CompCert compiler falls into the subset of programs that have a hope of having their semantics preserved by an optimising assembler.
    586586
    587 Our formalization exploits dependent types in different ways and for multiple
    588 purposes. The first purpose is to reduce potential errors in the formalization
     587Our formalisation exploits dependent types in different ways and for multiple
     588purposes. The first purpose is to reduce potential errors in the formalisation
    589589of the microprocessor. In particular,
    590590dependent types are used to constraint the size of bit-vectors and
     
    607607proof has been done this way: we only used this style to prove that a
    608608function satisfies a specification that only involves that function in a
    609 significant way. For example, it would be innatural to see the proof that
     609significant way. For example, it would be unnatural to see the proof that
    610610fetch and assembly commute as the specification of one of the two functions.
    611611
Note: See TracChangeset for help on using the changeset viewer.