# Changeset 1026

Ignore:
Timestamp:
Jun 21, 2011, 12:47:27 PM (9 years ago)
Message:

final version? under 16 pages

Location:
src/ASM/CPP2011
Files:
2 edited

### Legend:

Unmodified
 r1020 author = {Andrea Asperti and Claudio {Sacerdoti Coen} and Enrico Tassi and Stefano Zacchiroli}, title = {User interaction with the {Matita} proof assistant}, journal = {Journal of Automated Reasoning}, journal = {Automated Reasoning}, pages = {109--139}, volume = {39}, author = {Gerwin Klein and Tobias Nipkow}, title = {A machine-checked model for a {Java-like} language, virtual machine and compiler}, journal = {Transactions on Programming Languages and Systems}, journal = {{TOPLAS}}, volume = {28}, number = {4}, author = {Gerwin Klein and June Andronick and Kevin Elphinstone and Gernot Heiser and David Cock and Philip Derrin and Dhammika Elkaduwe and Kai Engelhardt and Rafal Kolanski and Michael Norrish and Thomas Sewell, Harvey Tuch and Simon Winwood}, title = {{seL4}: Formal verification of an operating system kernel}, journal = {Communications of the {ACM}}, journal = {{CACM}}, issue = {6}, volume = {53}, author = {Xavier Leroy}, title = {Formal verification of a realistic compiler}, journal = {Communications of the {Association of Computing Machinery}}, journal = {{CACM}}, volume = {52}, number = {7}, author = {Xavier Leroy}, title = {A formally verified compiler back-end}, journal = {Journal of Automated Reasoning}, journal = {Automated Reasoning}, volume = {43}, number = {4}, author = {Robert Atkey}, title = {{CoqJVM}: An executable specification of the {Java Virtual Machine} using dependent types}, booktitle = {Conference of the {TYPES} Project}, booktitle = {{TYPES}}, pages = {18--32}, year = {2007} author = {Sandrine Blazy and Zaynah Dargaye and Xavier Leroy}, title = {Formal Verification of a {C} Compiler Front-End}, booktitle = {International Symposium on Formal Methods}, booktitle = {{FM}}, pages = {460--475}, year = {2006} author = {Anthony Fox and Magnus O. Myreen}, title = {A Trustworthy Monadic Formalization of the {ARMv7} Instruction Set Architecture}, booktitle = {$\mathrm{1^{st}}$ International Conference on Interactive Theorem Proving}, booktitle = {{ITP}}, pages = {243--258}, year = {2010} author = {Gerwin Klein and June Andronick and Kevin Elphinstone and Gernot Heiser and David Cock and Philip Derrin and Dhammika Elkaduwe and Kai Engelhardt and Rafal Kolanski and Michael Norrish and Thomas Sewell, Harvey Tuch and Simon Winwood}, title = {{seL4}: Formal verification of an operating system kernel}, booktitle = {Proceedings of the $22^\mathrm{nd}$ {ACM} Symposium on Operating Systems Principles}, booktitle = {{SOSP}}, year = {2009} } author = {Dominic P. Mulligan and Claudio {Sacerdoti Coen}}, title = {An executable formal semantics of the {MCS-51} microprocessor in {Matita}}, booktitle = {Formal Methods in Computer Aided Design}, booktitle = {{FMCAD}}, year = {2011}, note = {Submitted} author = {Susmit Sarkar and Peter Sewell and Francesco Zappa Nardelli and Scott Owens and Tom Ridge and Thomas Braibant and Magnus O. Myreen and Jade Alglave}, title = {The semantics of {x86-CC} multiprocessor machine code}, booktitle = {$\mathrm{36^{th}}$ Annual Symposium on Principles of Programming Languages}, booktitle = {{POPL}}, pages = {379--391}, year = {2009} author = {Matthieu Sozeau}, title = {Subset Coercions in {Coq}}, booktitle = {Types for Proofs and Programs}, booktitle = {{TYPES}}, pages = {237--252}, year = {2006} author = {Jun Yan and Wei Zhang}, title = {{WCET} Analysis for Multi-Core Processors with Shared {L2} Instruction Caches}, booktitle = {$\mathrm{14^{th}}$ {IEEE} Symposium on Real-time and Embedded Technology and Applications}, booktitle = {{RTAS}}, pages = {80--89}, year = {2008} } @inproceedings{russell, author    = {Matthieu Sozeau}, title     = {Subset Coercions in {C}oq}, booktitle = {Types for Proofs and Programs}, series    = {LNCS}, year      = {2006}, pages     = {237-252}, VOLUME = {4502/2007}, PUBLISHER = {Springer-Verlag}, doi = {10.1007/978-3-540-74464-1\_16}, }
 r1025 This formalisation forms a major component of the EU-funded CerCo project~\cite{cerco:2011}, concering the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language. The MCS-51 dates from the early 1980s and is commonly called the 8051/8052.\footnote{Being strict, the 8051 and 8052 are two different microprocessors, though the features that the 8052 added over the 8051 are minor, and largely irrelevant for our formalisation project.} The MCS-51 dates from the early 1980s and is commonly called the 8051/8052. Despite the microprocessor's age, derivatives are still widely manufactured by a number of semiconductor foundries. As a result the processor is widely used, especially in embedded systems development, where well-tested, cheap, predictable microprocessors find their niche. This predicability of timing information is especially attractive to the CerCo consortium. We are in the process of constructing a certified, concrete complexity compiler for a realistic processor, and not for building and formalising the worst case execution time tools (WCET---see~\cite{yan:wcet:2008} and~\cite{bate:wcet:2011}, amongst many others, for an application of WCET technology to microprocessors with more complex designs) that would be necessary to achieve the same result with, for example, a modern ARM or PowerPC microprocessor. We are in the process of constructing a certified, concrete complexity compiler for a realistic processor, and not for building and formalising the worst case execution time tools (WCET---see~\cite{bate:wcet:2011}, amongst many others, for an application of WCET technology to microprocessors with more complex designs) that would be necessary to achieve the same result with, for example, a modern ARM or PowerPC microprocessor. As in most things, what one hand giveth, the other taketh away: the MCS-51's paucity of features, though an advantage in many respects, also quickly become a hindrance, and successfully compiling high-level code for this architecture is a cumbrous and involved process. Second, as mentioned, the CerCo consortium is in the business of constructing a verified compiler for the C programming language. However, unlike CompCert~\cite{compcert:2011,leroy:formal:2009,leroy:formally:2009}---which currently represents the state of the art for industrial grade' verified compilers---CerCo considers not just the \emph{extensional correctness} of the compiler, but also its \emph{intensional correctness}. However, unlike CompCert~\cite{compcert:2011,leroy:formal:2009}---which currently represents the state of the art for industrial grade' verified compilers---CerCo considers not just the \emph{extensional correctness} of the compiler, but also its \emph{intensional correctness}. That is, CompCert focusses solely on the preservation of the \emph{meaning} of a program during the compilation process, guaranteeing that the program's meaning does not change as it is gradually transformed into assembly code. However in any realistic compiler (even the CompCert compiler!) there is no guarantee that the program's time properties are preserved during the compilation process; a compiler's optimisations' could, in theory, even conspire to degrade the concrete complexity of certain classes of programs. This cost model is induced by the compilation process itself, and its non-compositional nature allows us to assign different costs to identical blocks of instructions depending on how they are compiled. In short, we aim to obtain a very precise costing for a program by embracing the compilation process, not ignoring it. This, however, complicates the proof of correctness for the compiler proper: for every translation pass from intermediate language to intermediate language, we must prove that not only has the meaning of a program been preserved, but also its concrete complexity characteristics. This, however, complicates the proof of correctness for the compiler proper. In each translation pass from intermediate language to intermediate language, we must prove that both the meaning and concrete complexity characteristics of the program are preserved. This also applies for the translation from assembly language to machine code. \end{array} \end{displaymath}}} In the translation, if \texttt{JZ} fails, we fall through to the \texttt{SJMP} which jumps over the \texttt{LJMP}. Here, if \texttt{JZ} fails, we fall through to the \texttt{SJMP} which jumps over the \texttt{LJMP}. Naturally, if \texttt{label} is close enough, a conditional jump pseudoinstruction is mapped directly to a conditional jump machine instruction; the above translation only applies if \texttt{label} is not sufficiently local. This leaves the problem, addressed below, of calculating whether a label is indeed close enough' for the simpler translation to be used. We address the calculation of whether a label is indeed close enough' for the simpler translation to be used below. Crucially, the above translation demonstrates the difficulty in predicting how many clock cycles a pseudoinstruction will take to execute. We merely describe enough here to understand the rest of the proof. The MCS-51 emulator centres around a \texttt{Status} record, describing the current state of the microprocessor. This record contains fields corresponding to the microprocessor's program counter, special function registers, and so on. The emulator centres around a \texttt{Status} record, describing the microprocessor's state. This record contains fields corresponding to the microprocessor's program counter, registers, and so on. At the machine code level, code memory is implemented as a compact trie of bytes, addressed by the program counter. Machine code programs are loaded into \texttt{Status} using the \texttt{load\_code\_memory} function. If we change how we expand conditional jumps to labels, for instance, then the costing needs to change, hence \texttt{execute\_1\_pseudo\_instruction}'s parametricity in the costing. The costing returns \emph{pairs} of natural numbers because, in the case of expanding conditional jumps to labels, the expansion of the true branch' and false branch' may differ in the number of clock ticks needed for execution. The costing returns \emph{pairs} of natural numbers because, in the case of expanding conditional jumps to labels, the expansion of the true branch' and `false branch' may differ in execution time. This timing information is used inside \texttt{execute\_1\_pseudo\_instruction} to update the clock of the \texttt{PseudoStatus}. During the proof of correctness of the assembler we relate the clocks of \texttt{Status}es and \texttt{PseudoStatus}es for the policy induced by the cost model and optimisations.