# Changeset 2362 for Papers/cpp-asm-2012

Ignore:
Timestamp:
Sep 27, 2012, 5:34:35 PM
Message:

16 pages at last due to all sorts of fiddling. Any more additions will cause an overflow to 17 pages. Requires Jaap's arXiv reference now.

Location:
Papers/cpp-asm-2012
Files:
2 edited

\label{sect.the.proof} Our aim here is to explain the main ideas and steps of the certified proof of correctness for an optimising assembler for the MCS-51. In Subsect.~\ref{subsect.machine.code.semantics} we sketch an operational semantics (a realistic and efficient emulator) for the MCS-51. We also introduce a syntax for decoded instructions that will be reused for the assembly language. In Subsect.~\ref{subsect.assembly.code.semantics} we describe the assembly language and its operational semantics. The latter is parametric in the cost model that will be induced by the assembler, reusing the semantics of the machine code on all `real' instructions. Branch displacement policies are introduced in Subsect.~\ref{subsect.the.assembler} where we also describe the assembler as a function over such policies. To prove our assembler correct we show that the object code given in output, together with a cost model for the source program, simulates the source program executed using that cost model. The proof can be divided into two main lemmas. The first is correctness with respect to fetching, described in Subsect.~\ref{subsect.total.correctness.of.the.assembler}.
Roughly it states that a step of fetching at the assembly level, returning the decoded instruction $I$, is simulated by $n$ steps of fetching at the object level that return instructions $J_1,\ldots,J_n$, where $J_1,\ldots,J_n$ is, amongst the possible expansions of $I$, the one picked by the policy. The second lemma states that $J_1,\ldots,J_n$ simulates $I$ but only if $I$ is well-behaved, i.e. manipulates addresses in `good' ways. To keep track of well-behaved address manipulations we record where addresses are currently stored (in memory or an accumulator). We introduce a dynamic checking function that inspects this map to determine if the operation is well-behaved, with an affirmative answer being the pre-condition of the lemma. The second lemma is detailed in Subsect.~\ref{subsect.total.correctness.for.well.behaved.assembly.programs} where we also establish correctness of our assembler as a composition of the two lemmas: programs that are well-behaved when executed under the cost model induced by the compiler are correctly simulated by the compiled code. % ---------------------------------------------------------------------------- % The parameterised type $A$ of \texttt{preinstruction} represents the addressing mode allowed for conditional jumps; in the \texttt{RealInstruction} constructor we constrain it to be a relative offset. A different instantiation will be used in the next section for assembly programs.
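The correspondence between one assembly-level fetch and the $n$ object-level fetches of its expansion can be pictured with a toy model. The following Python sketch is purely illustrative: the instruction names, policy representation and helper functions are hypothetical and much simpler than the Matita development.

```python
# Toy model (hypothetical, not the Matita formalisation): a pseudojump to a
# label expands, under the branch displacement policy, to one concrete jump;
# 'real' instructions expand to themselves.

def expand(policy, pseudo):
    """Expansion [J1, ..., Jn] of one pseudoinstruction under a policy."""
    if pseudo[0] == "Jmp":
        label = pseudo[1]
        return [("SJMP", label)] if policy[label] == "short" else [("LJMP", label)]
    return [pseudo]

def assemble(policy, program):
    """Object code: the concatenated expansions of all pseudoinstructions."""
    return [j for i in program for j in expand(policy, i)]

program = [("MOV", "A", 5), ("Jmp", "loop")]
policy = {"loop": "long"}
code = assemble(policy, program)

# Fetching pseudoinstruction I at the assembly level is simulated by the n
# object-level fetches returning exactly the expansion picked by the policy.
assert expand(policy, program[1]) == code[1:2]
```

In this miniature setting the first lemma amounts to the final assertion: the object-level image of each pseudoinstruction is precisely the expansion that the policy selected.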
Once decoded, execution proceeds by a case analysis on the decoded instruction, following the operation of the hardware. The costing is a function that maps valid program counters to pairs of natural numbers representing the number of clock ticks used by the pseudoinstructions stored at those program counters. For conditional jumps the two numbers differ to represent different costs for the `true branch' and the `false branch'. In the next section we will see how the optimising assembler induces the only costing (induced by the branch displacement policy deciding how to expand pseudojumps) that is preserved by compilation. Execution proceeds by first fetching from pseudo-code memory using the program counter---treated as an index into the pseudoinstruction list. The only limitation introduced by this approach is that the size of assembly programs is bounded by $2^{16}$. This will be further discussed in Subsect.~\ref{subsect.total.correctness.for.well.behaved.assembly.programs}. % ---------------------------------------------------------------------------- % The aim of this section is to prove the following informal statement: when we fetch an assembly pseudoinstruction \texttt{I} at address \texttt{ppc}, then we can fetch the expanded pseudoinstruction(s) \texttt{[J1, \ldots, Jn] = fetch\_pseudo\_instruction \ldots\ I\ ppc} from \texttt{policy ppc} in the code memory obtained by loading the assembled object code.
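The costing just described can be sketched as follows; the tick counts below are made up for illustration and are not real MCS-51 timings.

```python
# Hypothetical costing: maps each valid program counter to a pair of
# clock-tick counts (cost of the `true branch', cost of the `false branch').
# The numbers are illustrative, not real MCS-51 timings.
costing = {
    0x00: (1, 1),   # ordinary instruction: both components coincide
    0x01: (2, 2),
    0x02: (4, 3),   # conditional jump: taken and not-taken costs differ
}

def ticks(pc, branch_taken):
    """Clock ticks used by the pseudoinstruction stored at pc."""
    true_cost, false_cost = costing[pc]
    return true_cost if branch_taken else false_cost

assert ticks(0x00, True) == ticks(0x00, False) == 1
assert (ticks(0x02, True), ticks(0x02, False)) == (4, 3)
```

For non-branching instructions the two components always coincide, so a single lookup suffices; only at conditional jumps does the outcome of the test select which component is charged.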
This constitutes the first major step in the proof of correctness of the assembler, the next one being the simulation of \texttt{I} by \texttt{[J1, \ldots, Jn]} (see Subsect.~\ref{subsect.total.correctness.for.well.behaved.assembly.programs}). The \texttt{assembly} function is given a Russell type (slightly simplified here): Here we use $\pi_1$ to project the existential witness from the Russell-typed function \texttt{assembly}. We read \texttt{fetch\_assembly\_pseudo2} as follows. Suppose we are given an assembly program which can be addressed by a 16-bit word and a policy that is correct for this program. In this paper we take a different approach, tracing memory locations (and accumulators) that contain memory addresses. We prove that only those assembly programs that use addresses in `safe' ways have their semantics preserved by the assembly process---a sort of dynamic type system sitting atop memory. In principle this approach allows us to introduce some permitted \emph{benign} manipulations of addresses that the traditional approach, using opaque addresses, cannot handle, therefore expanding the set of input programs that can be assembled correctly. This approach seems similar to one taken by Tuch \emph{et al}~\cite{tuch:types:2007} for reasoning about low-level C code. $\times$ (option address_entry). \end{lstlisting} Here, \texttt{upper\_lower} is a type isomorphic to the booleans denoting whether a byte value is the upper or lower byte of some 16-bit address.
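This bookkeeping can be modelled in miniature. The Python sketch below is hypothetical and far simpler than the actual \texttt{internal\_pseudo\_address\_map}: each tracked byte location optionally records which half of which 16-bit address it holds, and the dynamic check accepts only recombinations of matching halves.

```python
from enum import Enum

# Toy model of the address map (hypothetical; the real
# internal_pseudo_address_map is considerably more involved).

class UpperLower(Enum):  # isomorphic to the booleans
    UPPER = 0
    LOWER = 1

def split(addr16):
    """A 16-bit address is stored as two 8-bit bytes."""
    return (addr16 >> 8) & 0xFF, addr16 & 0xFF

# Maps a byte location to an optional entry: which half of which address.
address_map = {
    0x30: (UpperLower.UPPER, 0xCAFE),
    0x31: (UpperLower.LOWER, 0xCAFE),
    0x32: None,  # not known to contain half an address
}

def recombine(loc_hi, loc_lo):
    """Dynamic check: only matching halves of one address may be rejoined."""
    hi, lo = address_map[loc_hi], address_map[loc_lo]
    assert hi is not None and lo is not None and hi[1] == lo[1]
    assert (hi[0], lo[0]) == (UpperLower.UPPER, UpperLower.LOWER)
    return hi[1]

assert split(0xCAFE) == (0xCA, 0xFE)
assert recombine(0x30, 0x31) == 0xCAFE
```

A program that, say, swapped the two bytes before recombining them would fail the dynamic check and fall outside the set of well-behaved programs, which is exactly the pre-condition of the second lemma.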
The implementation of \texttt{internal\_pseudo\_address\_map} is complicated by some peculiarities of the MCS-51's instruction set. Note here that all addresses are 16 bit words, but are stored (and manipulated) as 8 bit bytes. The verified assembler, complete with the underlying formalisation of the semantics of MCS-51 machine code, will form the bedrock layer upon which the rest of CerCo will build its verified compiler platform. We may compare our work to an `industrial grade' assembler for the MCS-51: SDCC~\cite{sdcc:2011}, the only open source C compiler that targets the MCS-51 instruction set. It appears that all pseudojumps in SDCC assembly are expanded to \texttt{LJMP} instructions, the worst possible jump expansion policy from an efficiency point of view. Note that this policy is the only possible policy \emph{in theory} that can preserve the semantics of an assembly program during the assembly process, coming at the expense of assembler completeness as the generated program may be too large for code memory, there being a trade-off between the completeness of the assembler and the efficiency of the assembled program. The definition and proof of a terminating, correct jump expansion policy is described elsewhere~\cite{boender:correctness:2012}. Verified assemblers could also be applied to the verification of operating system kernels and other formalised compilers. For instance the verified seL4 kernel~\cite{klein:sel4:2009}, CompCert~\cite{leroy:formally:2009} and CompCertTSO~\cite{sevcik:relaxed-memory:2011} all explicitly assume the existence of trustworthy assemblers. For instance, the CompCert C compiler's default backend stops at the PowerPC assembly language. The fact that an optimising assembler cannot preserve the semantics of all assembly programs may have consequences for these projects (though in the case of CompCertTSO, targeting a multiprocessor, what exactly constitutes the subset of `good programs' may not be entirely clear). If CompCert chooses to assume the existence of an optimising assembler, then care should be taken to ensure that any assembly program produced by the CompCert compiler falls into the subset of programs that can have their semantics preserved by an optimising assembler. Our formalisation exploits dependent types in different ways and for multiple purposes.
We are not the first to consider the correctness of an assembler for a non-trivial assembly language. The most impressive piece of work in this domain is Piton~\cite{moore:piton:1996}, a stack of verified components, written and verified in ACL2, ranging from a proprietary FM9001 microprocessor verified at the gate level, to assemblers and compilers for two high-level languages---Lisp and $\mu$Gypsy~\cite{moore:grand:2005}. % jinja Klein and Nipkow also provide a compiler, virtual machine and operational semantics for the Jinja~\cite{klein:machine:2006} language and prove that their compiler is semantics and type preserving. Though other verified assemblers exist, what sets our work apart from that above is our attempt to optimise the generated machine code. This complicates a formalisation as an attempt at the best possible selection of machine instructions must be made---especially important on devices with limited code memory. Care must be taken to ensure that the time properties of an assembly program are not modified by assembly lest we affect the semantics of any program employing the MCS-51's I/O facilities. This is only possible by inducing a cost model on the source code from the optimisation strategy and input program.
We believe that the proof strategy is sound and that all axioms can be closed, up to minor bugs that should have local fixes that do not affect the global proof strategy. The complete development, including the definition of the executable semantics of the MCS-51, is spread across 29 files with around 20,000 lines of Matita source. The relevant files are \texttt{AssemblyProof.ma}, \texttt{AssemblyProofSplit.ma} and \texttt{AssemblyProofSplitSplit.ma}, consisting of approximately 4500 lines of Matita source. Numerous other lines of proofs are spread all over the development because of dependent types and the Russell proof style, which does not allow one to separate the code from the proofs. The low ratio between source lines and the number of lines of proof is unusual, but justified by the fact that the pseudo-assembly and the assembly language share most constructs and large swathes of the semantics are shared. Many lines of code are required to describe the complex semantics of the processor, but for the shared cases the proof of preservation of the semantics is essentially trivial. \bibliography{cpp-2012-asm.bib}