# Changeset 999 for src/ASM/CPP2011/cpp-2011.tex

Ignore:
Timestamp:
Jun 20, 2011, 5:07:29 PM (10 years ago)
Message:

conclusions

File:
1 edited

### Legend:

Unmodified
 r997 We prove, under the assumption of the existence of a correct policy, that the assembly process never fails and preserves the semantics of a subset of assembly programs. Correct policies fail to exist only in a limited number of pathological circumstances. Quite surprisingly, it is impossible for an optimising assembler to preserve the semantics of every assembly program. Our assembler is complete with respect to the choice of policy. Surprisingly, we observe that it is impossible for an optimising assembler to preserve the semantics of every assembly program. \end{abstract} To understand why this problem is not trivial, consider the following snippet of assembly code: \begin{displaymath} \text{dpm: finish me} \begin{array}{r@{\qquad}r@{\quad}l@{\;\;}l@{\qquad}l} \text{1:} & \mathtt{0x000}  & \texttt{LJMP} & \texttt{0x100}  & \text{\texttt{;; Jump forward 256.}} \\ \text{2:} & \mathtt{...}    & \mathtt{...}  &                 &                                               \\ \text{3:} & \mathtt{0x0FA}  & \texttt{LJMP} & \texttt{0x100}  & \text{\texttt{;; Jump forward 256.}} \\ \text{4:} & \mathtt{...}    & \mathtt{...}  &                 &                                               \\ \text{5:} & \mathtt{0x100}  & \texttt{LJMP} & \texttt{0x100}  & \text{\texttt{;; Jump forward 256.}} \\ \text{6:} & \mathtt{...}    & \mathtt{...}  &                 &                                               \\ \text{7:} & \mathtt{0x1F9}  & \texttt{LJMP} & \texttt{-0x300} & \text{\texttt{;; Arbitrary instruction.}} \\ \end{array} \end{displaymath} As our example shows, given an occurence $l$ of an \texttt{LJMP} instruction, it may be possible to shrink $l$ to an occurence of an \texttt{SJMP} providing we can shrink any \texttt{LJMP}s that exist between $l$ and its target location. However, shrinking these \texttt{LJMP}s may in turn depend on shrinking $l$ to an \texttt{SJMP}, as it is perfectly possible to jump backwards. In short, unless we can somehow break this loop of circularity, and similar knotty configurations of jumps, we are stuck with a suboptimal solution to the expanding jumps problem. We observe that $100_{16} = 256_{10}$, and lies \emph{just} outside the range expressible in an 8-bit byte (0--255). As our example shows, given an occurence $l$ of an \texttt{LJMP} instruction, it may be possible to shrink $l$ to an occurence of an \texttt{SJMP}---consuming fewer bytes of code memory---provided we can shrink any \texttt{LJMP}s that exist between $l$ and its target location. In particular, if we can somehow shrink the \texttt{LJMP} occurring at line 5, the \texttt{LJMP}s occurring at lines 1 and 3 above may both be replaced by \texttt{SJMP}s, provided that \emph{both} jumps at 1 and 3 are replaced at the same time. Thinking more, it is easy to imagine knotty, circular configurations of jumps developing, each jump occurrence only being shrinkable if every other is. Finding a solution to this shrinking jumps' problem then involves us finding a method to break any vicious circularities that develop. How we went about resolving this problem affected the shape of our proof of correctness for the whole assembler in a rather profound way. The counterpart of the \texttt{Status} record is \texttt{PseudoStatus}. Instead of code memory being implemented as tries of bytes, code memory is here implemented as lists of pseudoinstructions, and program counters are merely indices into this list. In actual fact, both \texttt{Status} and \texttt{PseudoStatus} are both specialisations of the same \texttt{PreStatus} record, parametric in the representation of code memory. This allows us to share some code that is common to both records (for instance, setter' and getter' functions). A further benefit of this sharing is that those instructions that are completely ambivalent about the particular representation of code memory can be factored out into their own type. Our analogue of \texttt{execute\_1} is \texttt{execute\_1\_pseudo\_instruction}: \begin{lstlisting} The costing returns \emph{pairs} of natural numbers because, in the case of expanding conditional jumps to labels, the expansion of the true branch' and false branch' may differ in the number of clock ticks needed for execution. This timing information is used inside \texttt{execute\_1\_pseudo\_instruction} to update the clock of the \texttt{PseudoStatus}. During the proof of correctness of the assembler we relate the clocks of \texttt{Status}es and \texttt{PseudoStatus}es. During the proof of correctness of the assembler we relate the clocks of \texttt{Status}es and \texttt{PseudoStatus}es for the policy induced by the cost model and optimisations. The assembler, mapping programs consisting of lists of pseudoinstructions to lists of bytes, operates in a mostly straightforward manner. To a degree of approximation, the assembler on an assembly program, consisting of $n$ pseudoinstructions $\mathtt{P_i}$ for $1 \leq i \leq n$, works as in the following diagram: To a degree of approximation, the assembler on an assembly program, consisting of $n$ pseudoinstructions $\mathtt{P_i}$ for $1 \leq i \leq n$, works as in the following diagram (we use $-^{*}$ to denote a combined map and flatten operation): \begin{displaymath} [\mathtt{P_1}, \ldots \mathtt{P_n}] \xrightarrow{\mathtt{flatten}\left(\mathtt{P_i} \xrightarrow{\mbox{\fontsize{7}{9}\selectfont$\mathtt{expand}$}} \mathtt{[I_1^i, \ldots I^q_i]} \xrightarrow{\mbox{\fontsize{7}{9}\selectfont$\mathtt{assembly1}^*$}} \mathtt{[0110]}\right)^{*}} \mathtt{[010101]} [\mathtt{P_1}, \ldots \mathtt{P_n}] \xrightarrow{\left(\mathtt{P_i} \xrightarrow{\mbox{\fontsize{7}{9}\selectfont$\mathtt{expand\_pseudo\_instruction}$}} \mathtt{[I_1^i, \ldots I^q_i]} \xrightarrow{\mbox{\fontsize{7}{9}\selectfont$\mathtt{~~~~~~~~assembly1^{*}~~~~~~~~}$}} \mathtt{[0110]}\right)^{*}} \mathtt{[010101]} \end{displaymath} Here $\mathtt{I^i_j}$ for $1 \leq j \leq q$ are the $q$ machine code instructions obtained by expanding, with \texttt{expand\_pseudo\_instruction}, a single pseudoinstruction. By composition, we then have that the whole assembly process is semantics preserving. This is a tempting approach to the proof, but ultimately the wrong approach. In particular, it is important that we track how the program counter indexing into the assembly program, and the machine's program counter evolve, so that we can relate them. Expanding pseudoinstructions requires that the machine's program counter be incremented by $n$ steps, for $1 \leq n$, for every increment of the assembly program's program counter. Keeping track of the correspondence between the two program counters quickly becomes unfeasible using a compositional approach, and hence the proof must be monolithic. %This is a tempting approach to the proof, but ultimately the wrong approach. %In particular, it is important that we track how the program counter indexing into the assembly program, and the machine's program counter evolve, so that we can relate them. %Expanding pseudoinstructions requires that the machine's program counter be incremented by $n$ steps, for $1 \leq n$, for every increment of the assembly program's program counter. %Keeping track of the correspondence between the two program counters quickly becomes unfeasible using a compositional approach, and hence the proof must be monolithic. % ---------------------------------------------------------------------------- % We have proved the total correctness of an assembler for MCS-51 assembly language. In particular, our assembly language featured labels, arbitrary conditional and unconditional jumps to labels, global data and instructions for moving this data into the MCS-51's single 16-bit register. Expanding these pseudoinstructions into machine code instructions is not trivial, and the proof that the assembly process is correct', in that the semantics of an assembly program are not changed is complex. Expanding these pseudoinstructions into machine code instructions is not trivial, and the proof that the assembly process is correct', in that the semantics of a subset of assembly programs are not changed is complex. Further, we have observed the shocking' fact that any optimising assembler cannot preserve the semantics of all assembly programs. The formalisation is a key component of the CerCo project, which aims to produce a verified concrete complexity preserving compiler for a large subset of the C programming language. The verified assembler, complete with the underlying formalisation of the semantics of MCS-51 machine code (described fully in~\cite{mulligan:executable:2011}), will form the bedrock layer upon which the rest of the CerCo project will build its verified compiler platform. Aside from their application in verified compiler projects such as CerCo, verified assemblers such as ours could also be applied to the verification of operating system kernels. However, further work is needed. In particular, as it stands, the code produced by the prototype CerCo C compiler does not fall into the semantics preserving' subset of assembly programs for our assembler. This is because the MCS-51 features a small stack space, and a larger stack is customarily manually emulated in external RAM. As a result, the majority of programs feature slices of memory addresses and program counters being moved in-and-out of external RAM via the registers, simulating the stack mechanism. At the moment, this movement is not tracked by \texttt{internal\_pseudo\_address\_map}, which only tracks the movement of memory addresses in low internal RAM. We leave extending this tracking of memory addresses throughout the whole of the MCS-51's address spaces as future work. Aside from their application in verified compiler projects such as CerCo and CompCert, verified assemblers such as ours could also be applied to the verification of operating system kernels. Of particular note is the verified seL4 kernel~\cite{klein:sel4:2009,klein:sel4:2010}. This verification explicitly assumes the existence of, amongst other things, a trustworthy assembler and compiler. \paragraph{Use of dependent types and Russell} Our formalisation makes sparing use of dependent types. In certain datastructures, such as tries and vectors, they are used to guarantee invariants. However, we have currently shyed away from making extensive use of dependent types and inductive predicates in the proof of correctness for the assembler itself. This is because complex dependent types and inductive predicates tend not to co\"operate particularly well with tactics such as inversion. However, there are certain cases where the use of dependent types is unavoidable. For instance, when proving that the \texttt{build\_maps} function is correct, a function that collates the cost and data labels of an assembly program into map datastructures. In cases such as these we make use of Matita's implementation of Russell~\cite{sozeau:subset:2006}. In Matita, Russell may be implemented with two coercions and some notational sugaring. We note here that both CompCert and the seL4 formalisation assume the existence of trustworthy' assemblers. Our observation that an optimising assembler cannot preserve the semantics of every assembly program may have important consequences for these projects. In particular, if CompCert chooses to assume the existence of an optimising assembler, then care should be made to ensure that any assembly program produced by the CompCert C compiler falls into the class of assembly programs that have a hope of having their semantics preserved by an optimising assembler. In certain places in our formalisation (e.g. in proving \texttt{build\_maps} is correct) we made use of Matita's implementation of Russell~\cite{sozeau:subset:2006}. In Matita, Russell may be implemented using two coercions and some notational sugaring. % more \subsection{Related work} Perhaps the most impressive piece of work in this domain is the Piton stack~\cite{moore:piton:1996,moore:grand:2005}. This was a stack of verified components, written and verified in ACL2, ranging from a proprietary FM9001 microprocessor verified at the gate level, to assemblers and compilers for two high-level languages---a dialect of Lisp and $\mu$Gypsy~\cite{moore:grand:2005}. %dpm more: weirich paper? % jinja They provide a compiler, virtual machine and operational semantics for the programming language and virtual machine, and prove that their compiler is semantics and type preserving. Finally, mention should be made of CompCert~\cite{compcert:2011,blazy:formal:2006,leroy:formal:2009,leroy:formally:2009}, another verified compiler project related to CerCo. We believe many other verified assemblers exist in the literature. However, what sets our work apart from that above is our attempt to optimise the machine code generated by our assembler. This complicates any formalisation effort, as the best possible selection of machine instructions must be made, especially important on a device such as the MCS-51 with a miniscule code memory. Further, care must be taken to ensure that the time properties of an assembly program are not modified by the assembly process lest we affect the semantics of any program employing the MCS-51's I/O facilities. Finally, mention of CerCo will invariably invite comparisons with CompCert~\cite{compcert:2011,leroy:formal:2009}, another verified compiler project closely related to CerCo. As previously mentioned, CompCert considers only extensional correctness of the compiler, and not intensional correctness, which CerCo focusses on. However, CerCo also extends CompCert in other ways. In essence, the work presented in this publication is one part of CerCo's extension over CompCert. \subsection{Source code} \label{subsect.source.code} All files relating to our formalisation effort can be found online at~\url{http://cerco.cs.unibo.it}. \bibliography{cpp-2011.bib}