# Changeset 1026 for src/ASM/CPP2011/cpp-2011.tex

Ignore:
Timestamp:
Jun 21, 2011, 12:47:27 PM (9 years ago)
Message:

final version? under 16 pages

File:
1 edited

Unmodified
Added
Removed
• ## src/ASM/CPP2011/cpp-2011.tex

 r1025 This formalisation forms a major component of the EU-funded CerCo project~\cite{cerco:2011}, concering the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language. The MCS-51 dates from the early 1980s and is commonly called the 8051/8052.\footnote{Being strict, the 8051 and 8052 are two different microprocessors, though the features that the 8052 added over the 8051 are minor, and largely irrelevant for our formalisation project.} The MCS-51 dates from the early 1980s and is commonly called the 8051/8052. Despite the microprocessor's age, derivatives are still widely manufactured by a number of semiconductor foundries. As a result the processor is widely used, especially in embedded systems development, where well-tested, cheap, predictable microprocessors find their niche. This predicability of timing information is especially attractive to the CerCo consortium. We are in the process of constructing a certified, concrete complexity compiler for a realistic processor, and not for building and formalising the worst case execution time tools (WCET---see~\cite{yan:wcet:2008} and~\cite{bate:wcet:2011}, amongst many others, for an application of WCET technology to microprocessors with more complex designs) that would be necessary to achieve the same result with, for example, a modern ARM or PowerPC microprocessor. We are in the process of constructing a certified, concrete complexity compiler for a realistic processor, and not for building and formalising the worst case execution time tools (WCET---see~\cite{bate:wcet:2011}, amongst many others, for an application of WCET technology to microprocessors with more complex designs) that would be necessary to achieve the same result with, for example, a modern ARM or PowerPC microprocessor. As in most things, what one hand giveth, the other taketh away: the MCS-51's paucity of features, though an advantage in many respects, also quickly become a hindrance, and successfully compiling high-level code for this architecture is a cumbrous and involved process. Second, as mentioned, the CerCo consortium is in the business of constructing a verified compiler for the C programming language. However, unlike CompCert~\cite{compcert:2011,leroy:formal:2009,leroy:formally:2009}---which currently represents the state of the art for industrial grade' verified compilers---CerCo considers not just the \emph{extensional correctness} of the compiler, but also its \emph{intensional correctness}. However, unlike CompCert~\cite{compcert:2011,leroy:formal:2009}---which currently represents the state of the art for industrial grade' verified compilers---CerCo considers not just the \emph{extensional correctness} of the compiler, but also its \emph{intensional correctness}. That is, CompCert focusses solely on the preservation of the \emph{meaning} of a program during the compilation process, guaranteeing that the program's meaning does not change as it is gradually transformed into assembly code. However in any realistic compiler (even the CompCert compiler!) there is no guarantee that the program's time properties are preserved during the compilation process; a compiler's optimisations' could, in theory, even conspire to degrade the concrete complexity of certain classes of programs. This cost model is induced by the compilation process itself, and its non-compositional nature allows us to assign different costs to identical blocks of instructions depending on how they are compiled. In short, we aim to obtain a very precise costing for a program by embracing the compilation process, not ignoring it. This, however, complicates the proof of correctness for the compiler proper: for every translation pass from intermediate language to intermediate language, we must prove that not only has the meaning of a program been preserved, but also its concrete complexity characteristics. This, however, complicates the proof of correctness for the compiler proper. In each translation pass from intermediate language to intermediate language, we must prove that both the meaning and concrete complexity characteristics of the program are preserved. This also applies for the translation from assembly language to machine code. \end{array} \end{displaymath}}} In the translation, if \texttt{JZ} fails, we fall through to the \texttt{SJMP} which jumps over the \texttt{LJMP}. Here, if \texttt{JZ} fails, we fall through to the \texttt{SJMP} which jumps over the \texttt{LJMP}. Naturally, if \texttt{label} is close enough, a conditional jump pseudoinstruction is mapped directly to a conditional jump machine instruction; the above translation only applies if \texttt{label} is not sufficiently local. This leaves the problem, addressed below, of calculating whether a label is indeed close enough' for the simpler translation to be used. We address the calculation of whether a label is indeed close enough' for the simpler translation to be used below. Crucially, the above translation demonstrates the difficulty in predicting how many clock cycles a pseudoinstruction will take to execute. We merely describe enough here to understand the rest of the proof. The MCS-51 emulator centres around a \texttt{Status} record, describing the current state of the microprocessor. This record contains fields corresponding to the microprocessor's program counter, special function registers, and so on. The emulator centres around a \texttt{Status} record, describing the microprocessor's state. This record contains fields corresponding to the microprocessor's program counter, registers, and so on. At the machine code level, code memory is implemented as a compact trie of bytes, addressed by the program counter. Machine code programs are loaded into \texttt{Status} using the \texttt{load\_code\_memory} function. If we change how we expand conditional jumps to labels, for instance, then the costing needs to change, hence \texttt{execute\_1\_pseudo\_instruction}'s parametricity in the costing. The costing returns \emph{pairs} of natural numbers because, in the case of expanding conditional jumps to labels, the expansion of the true branch' and false branch' may differ in the number of clock ticks needed for execution. The costing returns \emph{pairs} of natural numbers because, in the case of expanding conditional jumps to labels, the expansion of the true branch' and `false branch' may differ in execution time. This timing information is used inside \texttt{execute\_1\_pseudo\_instruction} to update the clock of the \texttt{PseudoStatus}. During the proof of correctness of the assembler we relate the clocks of \texttt{Status}es and \texttt{PseudoStatus}es for the policy induced by the cost model and optimisations.
Note: See TracChangeset for help on using the changeset viewer.