Changeset 2366


Ignore:
Timestamp:
Sep 28, 2012, 2:19:06 AM (7 years ago)
Author:
sacerdot
Message:

...

File:
1 edited

Legend:

Unmodified
Added
Removed
  • Papers/cpp-asm-2012/cpp-2012-asm.tex

    r2365 r2366  
    100100At the assembler level, this is reflected by our need to induce a cost
    101101model on the assembly code as a function of the assembly program and the
    102 strategy used to solve the branch displacement problem. In particular, the
    103 optimising compiler should also return a map that assigns a cost (in clock
     102strategy used to solve the branch displacement problem. In particular, our
     103optimising assembler should also return a map that assigns a cost (in clock
    104104cycles) to every instruction in the source program. We expect the induced cost
    105 to be preserved by the compiler: we will prove that the compiled code
     105to be preserved by the assembler: we will prove that the compiled code
    106106tightly simulates the source code by taking exactly the predicted amount of
    107107time.
     
    137137In practice, we note how our approach facilitates more code reuse between the semantics of assembly code and object code.
    138138
    139 The rest of this paper is a detailed description of our proof that is marginally still a work in progress.
     139The formalization of the assembler and its correctness proof are given in Section~\ref{sect.the.proof}. Section~\ref{sect.matita} provides a brief introduction
     140to the syntax of Matita. Section~\ref{sect.conclusions} presents the conclusions and relations with previous work.
    140141
    141142% ---------------------------------------------------------------------------- %
     
    160161% SECTION                                                                      %
    161162% ---------------------------------------------------------------------------- %
    162 \section{The proof}
     163\section{Certification of an optimizing assembler}
    163164\label{sect.the.proof}
    164165
     
    193194memory of the processor, represented as a compact trie of bytes addressed
    194195by the program counter.
    195 The \texttt{Status} of the emulator is described as
     196The \texttt{Status} of the emulator is
    196197a record that contains the microprocessor's program counter, registers, stack
    197 pointer, clock, special function registers, code memory, and so on.
     198pointer, clock, special function registers, data memory, and so on.
    198199The value of the code memory is a parameter of the record since it is not
    199200changed during execution.
     
    240241The parameterised type $A$ of \texttt{preinstruction} represents the addressing mode allowed for conditional jumps; in the \texttt{RealInstruction} constructor
    241242we constraint it to be a relative offset.
    242 A different instantiation will be used in the next section for assembly programs.
     243A different instantiation (labels) will be used in the next section for assembly programs.
    243244
    244245Once decoded, execution proceeds by a case analysis on the decoded instruction, following the operation of the hardware.
     
    310311
    311312We do not perform any kind of symbolic execution, wherein data is the disjoint union of bytes and addresses, with addresses kept opaque and immutable.
    312 Labels are immediately translated to concrete addresses, and registers and memory locations only ever contain bytes, never labels.
     313Labels are immediately translated before execution to concrete addresses, and registers and memory locations only ever contain bytes, never labels.
    313314As a consequence, we allow the programmer to mangle, change and generally adjust addresses as they want, under the proviso that the translation process may not be able to preserve the semantics of programs that do this.
     315This will be further discussed in Subsect.~\ref{subsect.total.correctness.for.well.behaved.assembly.programs}.
    314316The only limitation introduced by this approach is that the size of
    315317assembly programs is bounded by $2^{16}$.
    316 This will be further discussed in Subsect.~\ref{subsect.total.correctness.for.well.behaved.assembly.programs}.
    317318
    318319% ---------------------------------------------------------------------------- %
     
    322323\subsection{The assembler}
    323324\label{subsect.the.assembler}
    324 The assembler takes in input an assembly program to expand and a
    325 branch displacement policy for it.
    326 It returns both the assembled program (a list of bytes to be
    327 loaded in code memory for execution) and the costing of the source program.
     325The assembler takes in input an assembly program made of pseudoinstructions
     326and a branch displacement policy for it.
     327It returns both the object code (a list of bytes to be
     328loaded in code memory for execution) and the costing for the source.
    328329
    329330Conceptually the assembler works in two passes.
     
    361362Naturally, if \texttt{label} is `close enough', a conditional jump pseudoinstruction is mapped directly to a conditional jump machine instruction; the above translation only applies if \texttt{label} is not sufficiently local.
    362363
     364The cost returned by the assembler for a pseudoinstruction is set to be the
     365cost of its expansion in clock cycles. For conditional jumps that are expanded
     366as just shown, the costs of takin the true and false branches are different and
     367both need to be returned.
     368
    363369%In order to implement branch displacement it is impossible to really make the \texttt{expand\_pseudo\_instruction} function completely independent of the encoding function.
    364370%This is due to branch displacement requiring the distance in bytes of the target of the jump.
     
    400406preceed the one at address \texttt{a}: the assembler just concatenates all
    401407expansions sequentially. To grant this property, we impose a
    402 correctness criterion over policies. A policy is correct when for all
    403 valid pseudoaddresses \texttt{ppc}
    404 $${\texttt{policy(ppc+1) = policy(ppc) + instruction\_size(ppc)}}$$
     408correctness criterion over policies. A policy is correct when
     409\texttt{policy(0) = 0} and for all valid pseudoaddresses \texttt{ppc}
     410$${\texttt{policy(ppc+1) = policy(ppc) + instruction\_size(ppc)} \leq 2^{16}}$$
    405411Here \texttt{instruction\_size(ppc)} is the size in bytes of the expansion
    406412of the pseudoinstruction found at \texttt{pcc}, i.e. the length of
     
    413419\subsection{Correctness of the assembler with respect to fetching}
    414420\label{subsect.total.correctness.of.the.assembler}
    415 Using our policies, we now work toward proving the correctness of the assembler.
    416 Correctness means that the assembly process never fails when provided with a correct policy and that the process does not change the semantics of a certain class of well-behaved assembly programs.
    417 
    418 The aim of this section is to prove the following informal statement: when we fetch an assembly pseudoinstruction \texttt{I} at address \texttt{ppc}, then we can fetch the expanded pseudoinstruction(s) \texttt{[J1, \ldots, Jn] = fetch\_pseudo\_instruction \ldots\ I\ ppc} from \texttt{policy ppc} in the code memory obtained by loading the assembled object code.
    419 This constitutes the first major step in the proof of correctness of the assembler, the next one being the simulation of \texttt{I} by \texttt{[J1, \ldots, Jn]} (see Subsect.~\ref{subsect.total.correctness.for.well.behaved.assembly.programs}).
    420 
    421 The \texttt{assembly} function is given a Russell type (slightly simplified here):
     421We now begin the proof of correctness of the assembler. Correctness consists
     422of two properties: 1) the assembly process never fails when fed a correct
     423policy;
     4242)~the object code returned simulates the source code when the latter is
     425executed according to the cost model also returned by the assembler.
     426
     427Property 2) can be decomposed into two main properties: correctness with
     428respect to fetching+decoding and correctness w.r.t. execution.
     429
     430Informally, correctness w.r.t. fetching is the following statement: when we fetch an assembly pseudoinstruction \texttt{I} at address \texttt{ppc}, then we can fetch the expanded pseudoinstruction(s) \texttt{[J1, \ldots, Jn] = fetch\_pseudo\_instruction \ldots\ I\ ppc} from \texttt{policy ppc} in the code memory obtained by loading the assembled object code. This Section reviews the main steps
     431to prove correctness w.r.t. fetching. Section~\ref{subsect.total.correctness.for.well.behaved.assembly.programs} deals with correctness w.r.t. execution:
     432the instructions \texttt{[J1, \ldots, Jn]} simulate the pseudoinstruction~\texttt{I}.
     433
     434The (slightly simplified) Russell type for the \texttt{assembly} function is:
    422435\begin{lstlisting}
    423436definition assembly:
    424   $\forall$program: pseudo_assembly_program.
    425   $\forall$policy.
    426     $\Sigma$assembled: list Byte $\times$ (BitVectorTrie costlabel 16).
    427       policy is correct for program $\rightarrow$
    428       $\mid$program$\mid$ < $2^{16}$ $\rightarrow$ $\mid$fst assembled$\mid$ < $2^{16}$ $\wedge$
    429       (policy ($\mid$program$\mid$) = $\mid$fst assembled$\mid$ $\vee$
    430       (policy ($\mid$program$\mid$) = 0 $\wedge$ $\mid$fst assembled$\mid$ = $2^{16}$)) $\wedge$
    431       $\forall$ppc: pseudo_program_counter. ppc < $2^{16}$ $\rightarrow$
    432         let pseudo_instr := fetch from program at ppc in
    433         let assembled_i := assemble pseudo_instr in
    434           $\mid$assembled_i$\mid$ $\leq$ $2^{16}$ $\wedge$
    435             $\forall$n: nat. n < $\mid$assembled_i$\mid$ $\rightarrow$ $\exists$k: nat.
    436               nth assembled_i n = nth assembled (policy ppc + k).
     437 $\forall$program: pseudo_assembly_program.  $\forall$policy.
     438  $\Sigma$assembled: list Byte $\times$ (BitVectorTrie nat 16).
     439   $\mid$program$\mid$ $\leq$ $2^{16}$ $\rightarrow$ policy is correct for program $\rightarrow$
     440     policy ($\mid$program$\mid$) = $\mid$fst assembled$\mid$ $\leq$ $2^{16}$ $\wedge$
     441     $\forall$ppc: pseudo_program_counter. ppc < $2^{16}$ $\rightarrow$
     442       let pseudo_instr := fetch from program at ppc in
     443       let assembled_i := assemble pseudo_instr in
     444         $\mid$assembled_i$\mid$ $\leq$ $2^{16}$ $\wedge$
     445           $\forall$n: nat. n < $\mid$assembled_i$\mid$ $\rightarrow$ $\exists$k: nat.
     446             nth assembled_i n = nth assembled (policy ppc + k).
    437447\end{lstlisting}
    438448In plain words, the type of \texttt{assembly} states the following.
    439 Suppose we are given a policy that is correct for the program we are assembling.
    440 Then we return a list of assembled bytes, complete with a map from program counters to cost labels, such that the following properties hold for the list of bytes.
    441 Under the condition that the policy is `correct' for the program and the program is fully addressable by a 16-bit word, the assembled list is also fully addressable by a 16-bit word, the policy maps the last program counter that can address the program to the last instruction of the assemble pseudoinstruction or overflows, and if we fetch from the pseudo-program counter \texttt{ppc} we get a pseudoinstruction \texttt{pi} and a new pseudo-program counter \texttt{ppc}.
    442 Further, assembling the pseudoinstruction \texttt{pseudo\_instr} results in a list of bytes, \texttt{assembled\_i}.
    443 Then, indexing into this list with any natural number \texttt{n} less than the length of \texttt{assembled\_i} gives the same result as indexing into \texttt{assembled} with \texttt{policy ppc} (the program counter pointing to the start of the expansion in \texttt{assembled}) plus \texttt{k}.
    444 
    445 Essentially the lemma above states that the \texttt{assembly} function correctly expands pseudoinstructions, and that the expanded instruction reside consecutively in memory.
    446 This result is lifted from lists of bytes into a result on tries of bytes (i.e. code memories), using an additional lemma: \texttt{assembly\_ok}.
    447 
    448 Lemma \texttt{fetch\_assembly} establishes that the \texttt{fetch} and \texttt{assembly1} functions interact correctly.
    449 The \texttt{fetch} function, as its name implies, fetches the instruction indexed by the program counter in the code memory, while \texttt{assembly1} maps a single instruction to its byte encoding:
     449Given a correct policy for the program to be assembled, the assembler never
     450fails and returns some object code and a costing function.
     451Under the condition that the policy is `correct' for the program and the program is fully addressable by a 16-bit word, the object code is also fully addressable by a 16-bit word. Moreover, the
     452result of assemblying the pseudoinstruction obtained fetching from the
     453assembly address \texttt{ppc}
     454is a list of bytes found in the generated object code starting from
     455the object code address \texttt{policy(ppc)}.
     456
     457Essentially the type above states that the \texttt{assembly} function correctly expands pseudoinstructions, and that the expanded instruction reside consecutively in memory. The fundamental hypothesis is correctness of the policy which
     458allows to prove the inductive step of the proof, which is by induction over
     459the assembly program. It is then simple to lift the property from lists
     460of bytes (object code) to tries of bytes (i.e. code memories after loading).
     461The \texttt{assembly\_ok} lemma does the lifting.
     462
     463We have established that every pseudoinstruction is compiled to a sequence of
     464bytes that is found in memory at the expect place. This does not imply
     465trivially that those bytes will be decoded in a correct way to recover the
     466pseudoinstruction expansion. Indeed, we need first to prove a lemma that
     467establish that the \texttt{fetch} function is the left inverse of the
     468\texttt{assembly1} function:
    450469\begin{lstlisting}
    451470lemma fetch_assembly:
Note: See TracChangeset for help on using the changeset viewer.