# Changeset 2379

Ignore:
Timestamp:
Sep 28, 2012, 6:26:29 PM (9 years ago)
Message:

Down to 16 pages again

File:
1 edited

### Legend:

Unmodified
 r2378 In practice, we note how our approach facilitates more code reuse between the semantics of assembly code and object code. The formalization of the assembler and its correctness proof are given in Section~\ref{sect.the.proof}. Section~\ref{sect.matita} provides a brief introduction to the syntax of Matita. Section~\ref{sect.conclusions} presents the conclusions and relations with previous work. % ---------------------------------------------------------------------------- % % SECTION                                                                      % % ---------------------------------------------------------------------------- % \section{Matita} \label{sect.matita} The formalisation of the assembler and its correctness proof are given in Sect.~\ref{sect.the.proof}. Sect.~\ref{sect.conclusions} presents the conclusions and relations with previous work. % ---------------------------------------------------------------------------- % % SECTION                                                                      % % ---------------------------------------------------------------------------- % \paragraph{Matita} %\section{Matita} %\label{sect.matita} Matita is a proof assistant based on a variant of the Calculus of (Co)inductive Constructions~\cite{asperti:user:2007}. It features dependent types that we exploit in the formalisation. We now begin the proof of correctness of the assembler. Correctness consists of two properties: firstly that the assembly process never fails when fed a correct policy and secondly the object code returned simulates the source code when the latter is executed according to the cost model also returned by the assembler. The second property above can be further decomposed into two main properties: correctness with respect to fetching and decoding and correctness with respect to execution. This second property can be further decomposed into two main properties: correctness with respect to fetching and decoding and correctness with respect to execution. Informally, correctness with respect to fetching is the following statement: when we fetch an assembly pseudoinstruction \texttt{I} at address \texttt{ppc}, then we can fetch the expanded pseudoinstruction(s) \texttt{[J1, \ldots, Jn] = fetch\_pseudo\_instruction \ldots\ I\ ppc} from \texttt{policy ppc} in the code memory obtained by loading the assembled object code. \end{lstlisting} We read \texttt{fetch\_assembly} as follows. Any time the encoding \texttt{assembled} of an instruction \texttt{i} is found in code memory starting at position \texttt{pc} (the hypothesis \texttt{encoding\_check} \ldots), when we fetch at address \texttt{pc} we retrieve the instruction \texttt{i}, the updated program counter is \texttt{pc} plus the length of the encoding, and the cost of the fetched instruction is the one predicted for \texttt{i}. Any time the encoding \texttt{assembled} of an instruction \texttt{i} is found in code memory starting at position \texttt{pc} (the hypothesis \texttt{encoding\_check} \ldots), when we fetch at address \texttt{pc} retrieving the instruction \texttt{i}, the new program counter is \texttt{pc} plus the length of the encoding, and the cost of the fetched instruction is the one predicted for \texttt{i}. Or, in plainer words, assembling, storing and then immediately fetching gets you back to where you started. \end{cases} \end{gather*} In order to prove the preservation of the semantics, we need to associate an object code status to each assembly pseudostatus. To prove semantic preservation we must associate an object code status to each assembly pseudostatus. This operation is driven by the current \texttt{AddrMap}: if at address \texttt{a} the assembly level memory holds \texttt{d}, then if \texttt{AddrMap(a) = Data} the object code memory will hold \texttt{d} (data is preserved), otherwise it will hold \texttt{policy(d)}. If all the operations accepted by the address update map are well-behaved, this is sufficient to show preservation of the semantics for those computation steps that are well-behaved, i.e. such that the map update does not fail. \end{lstlisting} The function that implements the tracking map update, previously denoted by $\llbracket - \rrbracket$, is called \texttt{next\_internal\_pseudo\_address\_map} in the formalisation. For the time being, we decided to accept as good behaviours the copy of addresses among memory cells and the accumulator (\texttt{MOV} pseudoinstruction) and the use of the \texttt{CJNE} conditional jump that compares two addresses and jumps to a given label if the two labels are equal. For the time being, we accept as good behaviours address copying amongst memory cells and the accumulator (\texttt{MOV} pseudoinstruction) and the use of the \texttt{CJNE} conditional jump that compares two addresses and jumps to a given label if the two labels are equal. Moreover, \texttt{RET} to return from a function call is well behaved iff the lower and upper parts of the return address, fetched from the stack, are both marked as complementary parts of the same address (i.e. \texttt{h} is tracked as \texttt{$\langle$Upper,l$\rangle$} and \texttt{l} is tracked as \texttt{$\langle$Lower,h$\rangle$}. These three operations are sufficient to implement the backend of the CerCo compiler. program_counter s < $2^{16}$ $\rightarrow$ option internal_pseudo_address_map \end{lstlisting} We are now ready to state the (slightly simplified) statement of correctness of our compiler, whose proofs combines correctness w.r.t. fetching and correctness w.r.t. execution. It states that the well behaved execution of a single assembly pseudoinstruction according to the cost model induced by compilation is correctly simulated by the execution of (possibly) many machine code instructions. We now state the (simplified) statement of correctness of our compiler, whose proofs combines correctness with respect to fetching and correctness with respect to execution. It states that the well-behaved execution of a single assembly pseudoinstruction according to the cost model induced by compilation is correctly simulated by the execution of (possibly) many machine code instructions. \begin{lstlisting} theorem main_thm: policy. \end{lstlisting} The statement is standard for forward simulation, but restricted to \texttt{PseudoStatuses} \texttt{ps} whose tracking map is \texttt{M} and who are well behaved according to \texttt{internal\_pseudo\_address\_map} \texttt{M}. The \texttt{ticks\_of program policy} function returns the costing computed by assembling the \texttt{program} using the given \texttt{policy}. An obvious corollary of \texttt{main\_thm} is the correct simulation of $n$ well behaved steps by some number of steps $m$, where each step must be well behaved w.r.t. the tracking map returned by the previous step. The statement is standard for forward simulation, but restricted to \texttt{PseudoStatuses} \texttt{ps} whose tracking map is \texttt{M} and who are well-behaved according to \texttt{internal\_pseudo\_address\_map} \texttt{M}. The \texttt{ticks\_of program policy} function returns the costing computed by assembling the \texttt{program} using the given \texttt{policy}. An obvious corollary of \texttt{main\_thm} is the correct simulation of $n$ well behaved steps by some number of steps $m$, where each step must be well-behaved with respect to the tracking map returned by the previous step. % ---------------------------------------------------------------------------- % Not every proof has been carried out in this way: we only used this style to prove that a function satisfies a specification that only involves that function in a significant way. It would not be natural to see the proof that fetch and assembly commute as the specification of one of the two functions. \subsection{Related work} \label{subsect.related.work} \paragraph{Related work} %\subsection{Related work} %\label{subsect.related.work} % piton We are not the first to consider the correctness of an assembler for a non-trivial assembly language. Care must be taken to ensure that the time properties of an assembly program are not modified by assembly lest we affect the semantics of any program employing the MCS-51's I/O facilities. This is only possible by inducing a cost model on the source code from the optimisation strategy and input program. \subsection{Resources} \label{subsect.resources} \paragraph{Resources} %\subsection{Resources} %\label{subsect.resources} Our source files are available at~\url{http://cerco.cs.unibo.it}. We assumed several properties of `library functions', e.g. modular arithmetic and datastructure manipulation.