Changeset 1032 for src/ASM

Ignore:
Timestamp:
Jun 21, 2011, 3:04:05 PM (9 years ago)
Message:

Final version.

File:
1 edited

Legend:

Unmodified
 r1031 If CompCert chooses to assume the existence of an optimising assembler, then care should be made to ensure that any assembly program produced by the CompCert compiler falls into the subset of programs that have a hope of having their semantics preserved by an optimising assembler. In certain places in our formalisation (e.g. in proving \texttt{build\_maps} is correct) we made use of Matita's implementation of Russell~\cite{sozeau:subset:2006}. In Matita, Russell may be implemented using two coercions and some notational sugaring. % more Our formalization exploits dependent types in different ways and for multiple purposes. The first purpose is to reduce potential errors in the formalization of the microprocessor. In particular, dependent types are used to constraint the size of bit-vectors and tries that represent memory quantities and memory areas respectively. They are also used as explained in~\cite{mulligan:executable:2011}. to simulate polymorphic variants in Matita. Polymorphic variants nicely capture the absolutely unorthogonal instruction set of the MCS-51 where every opcode must accept its own subset of the 11 addressing mode of the processor. The second purpose is to single out the sources of incompleteness. By abstracting our functions over the dependent type of correct policies, we were able to manifest the fact that the compiler never refuses to compile a program where a correct policy exists. This also allowed to simplify the initial proof by dropping lemmas establishing that one function fails if and only if some other one does so. Finally, dependent types, together with Matita's liberal system of coercions, allow to simulate almost entirely in user space the proof methodology Russell'' of Sozeau~\cite{sozeau:subset:2006}. However, not every proof has been done this way: we only used this style to prove that a function satisfies a specification that only involves that function in a significant way. For example, it would be innatural to see the proof that fetch and assembly commute as the specification of one of the two functions. \subsection{Related work} \label{subsect.resources} All files relating to our formalisation effort can be found online at~\url{http://cerco.cs.unibo.it}. Our development, including the definition of the executable semantics of the MCS-51, is spread across 17 files, totalling around 13,000 lines of Matita source. The bulk of the proof described herein is contained in a single file, \texttt{AssemblyProof.ma}, consisting of approximately 3000 lines of Matita source. We admit to using a number of axioms in our development. We do not believe the use of these axioms has been particularly onerous---very few concern anything more interesting than, say, stating that converting from a natural number to a bitvector and back again is the identity---and what axioms remain are rapidly being closed as work continues. All files relating to our formalisation effort can be found online at~\url{http://cerco.cs.unibo.it}. The code of the compiler has been completed, and the proof of correctness described here is still in progress. In particular, we have assumed several properties of library functions'' related in particular to modular arithmetics and datastructures manipulation. Moreover, we only completed the interesting cases of some of the main theorems that proceed by cases on all the possible opcodes. We thus believe that the proof strategy is sound and that we will be able to close soon all axioms, up to possible minor bugs that should have local fixes that do not affect the global proof strategy. The development, including the definition of the executable semantics of the MCS-51, is spread across 17 files, totalling around 11,500 lines of Matita source. The bulk of the proof described herein is contained in a single file, \texttt{AssemblyProof.ma}, consisting at the moment of approximately 2500 lines of Matita source. Another 1000 lines of proofs are spread all over the development because of dependent types and the Russell proof style, that do not allow to separate the code from the proofs. The low ratio between the number of lines of code and the number of lines of proof is unusual. It is justified by the fact that the pseudo-assembly and the assembly language share most constructs and that large parts of the semantics is also shared. Thus many lines of code are required to describe the complex semantics of the processor, but, for the shared cases, the proof of preservation of the semantics is essentially trivial. \bibliography{cpp-2011.bib}