 Timestamp:
 Sep 28, 2012, 2:19:06 AM (7 years ago)
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

Papers/cppasm2012/cpp2012asm.tex
r2365 r2366 100 100 At the assembler level, this is reflected by our need to induce a cost 101 101 model on the assembly code as a function of the assembly program and the 102 strategy used to solve the branch displacement problem. In particular, the103 optimising compiler should also return a map that assigns a cost (in clock102 strategy used to solve the branch displacement problem. In particular, our 103 optimising assembler should also return a map that assigns a cost (in clock 104 104 cycles) to every instruction in the source program. We expect the induced cost 105 to be preserved by the compiler: we will prove that the compiled code105 to be preserved by the assembler: we will prove that the compiled code 106 106 tightly simulates the source code by taking exactly the predicted amount of 107 107 time. … … 137 137 In practice, we note how our approach facilitates more code reuse between the semantics of assembly code and object code. 138 138 139 The rest of this paper is a detailed description of our proof that is marginally still a work in progress. 139 The formalization of the assembler and its correctness proof are given in Section~\ref{sect.the.proof}. Section~\ref{sect.matita} provides a brief introduction 140 to the syntax of Matita. Section~\ref{sect.conclusions} presents the conclusions and relations with previous work. 140 141 141 142 %  % … … 160 161 % SECTION % 161 162 %  % 162 \section{ The proof}163 \section{Certification of an optimizing assembler} 163 164 \label{sect.the.proof} 164 165 … … 193 194 memory of the processor, represented as a compact trie of bytes addressed 194 195 by the program counter. 195 The \texttt{Status} of the emulator is described as196 The \texttt{Status} of the emulator is 196 197 a record that contains the microprocessor's program counter, registers, stack 197 pointer, clock, special function registers, codememory, and so on.198 pointer, clock, special function registers, data memory, and so on. 198 199 The value of the code memory is a parameter of the record since it is not 199 200 changed during execution. … … 240 241 The parameterised type $A$ of \texttt{preinstruction} represents the addressing mode allowed for conditional jumps; in the \texttt{RealInstruction} constructor 241 242 we constraint it to be a relative offset. 242 A different instantiation will be used in the next section for assembly programs.243 A different instantiation (labels) will be used in the next section for assembly programs. 243 244 244 245 Once decoded, execution proceeds by a case analysis on the decoded instruction, following the operation of the hardware. … … 310 311 311 312 We do not perform any kind of symbolic execution, wherein data is the disjoint union of bytes and addresses, with addresses kept opaque and immutable. 312 Labels are immediately translated to concrete addresses, and registers and memory locations only ever contain bytes, never labels.313 Labels are immediately translated before execution to concrete addresses, and registers and memory locations only ever contain bytes, never labels. 313 314 As a consequence, we allow the programmer to mangle, change and generally adjust addresses as they want, under the proviso that the translation process may not be able to preserve the semantics of programs that do this. 315 This will be further discussed in Subsect.~\ref{subsect.total.correctness.for.well.behaved.assembly.programs}. 314 316 The only limitation introduced by this approach is that the size of 315 317 assembly programs is bounded by $2^{16}$. 316 This will be further discussed in Subsect.~\ref{subsect.total.correctness.for.well.behaved.assembly.programs}.317 318 318 319 %  % … … 322 323 \subsection{The assembler} 323 324 \label{subsect.the.assembler} 324 The assembler takes in input an assembly program to expand and a325 branch displacement policy for it.326 It returns both the assembled program(a list of bytes to be327 loaded in code memory for execution) and the costing of the source program.325 The assembler takes in input an assembly program made of pseudoinstructions 326 and a branch displacement policy for it. 327 It returns both the object code (a list of bytes to be 328 loaded in code memory for execution) and the costing for the source. 328 329 329 330 Conceptually the assembler works in two passes. … … 361 362 Naturally, if \texttt{label} is `close enough', a conditional jump pseudoinstruction is mapped directly to a conditional jump machine instruction; the above translation only applies if \texttt{label} is not sufficiently local. 362 363 364 The cost returned by the assembler for a pseudoinstruction is set to be the 365 cost of its expansion in clock cycles. For conditional jumps that are expanded 366 as just shown, the costs of takin the true and false branches are different and 367 both need to be returned. 368 363 369 %In order to implement branch displacement it is impossible to really make the \texttt{expand\_pseudo\_instruction} function completely independent of the encoding function. 364 370 %This is due to branch displacement requiring the distance in bytes of the target of the jump. … … 400 406 preceed the one at address \texttt{a}: the assembler just concatenates all 401 407 expansions sequentially. To grant this property, we impose a 402 correctness criterion over policies. A policy is correct when for all403 valid pseudoaddresses \texttt{ppc}404 $${\texttt{policy(ppc+1) = policy(ppc) + instruction\_size(ppc)} }$$408 correctness criterion over policies. A policy is correct when 409 \texttt{policy(0) = 0} and for all valid pseudoaddresses \texttt{ppc} 410 $${\texttt{policy(ppc+1) = policy(ppc) + instruction\_size(ppc)} \leq 2^{16}}$$ 405 411 Here \texttt{instruction\_size(ppc)} is the size in bytes of the expansion 406 412 of the pseudoinstruction found at \texttt{pcc}, i.e. the length of … … 413 419 \subsection{Correctness of the assembler with respect to fetching} 414 420 \label{subsect.total.correctness.of.the.assembler} 415 Using our policies, we now work toward proving the correctness of the assembler. 416 Correctness means that the assembly process never fails when provided with a correct policy and that the process does not change the semantics of a certain class of wellbehaved assembly programs. 417 418 The aim of this section is to prove the following informal statement: when we fetch an assembly pseudoinstruction \texttt{I} at address \texttt{ppc}, then we can fetch the expanded pseudoinstruction(s) \texttt{[J1, \ldots, Jn] = fetch\_pseudo\_instruction \ldots\ I\ ppc} from \texttt{policy ppc} in the code memory obtained by loading the assembled object code. 419 This constitutes the first major step in the proof of correctness of the assembler, the next one being the simulation of \texttt{I} by \texttt{[J1, \ldots, Jn]} (see Subsect.~\ref{subsect.total.correctness.for.well.behaved.assembly.programs}). 420 421 The \texttt{assembly} function is given a Russell type (slightly simplified here): 421 We now begin the proof of correctness of the assembler. Correctness consists 422 of two properties: 1) the assembly process never fails when fed a correct 423 policy; 424 2)~the object code returned simulates the source code when the latter is 425 executed according to the cost model also returned by the assembler. 426 427 Property 2) can be decomposed into two main properties: correctness with 428 respect to fetching+decoding and correctness w.r.t. execution. 429 430 Informally, correctness w.r.t. fetching is the following statement: when we fetch an assembly pseudoinstruction \texttt{I} at address \texttt{ppc}, then we can fetch the expanded pseudoinstruction(s) \texttt{[J1, \ldots, Jn] = fetch\_pseudo\_instruction \ldots\ I\ ppc} from \texttt{policy ppc} in the code memory obtained by loading the assembled object code. This Section reviews the main steps 431 to prove correctness w.r.t. fetching. Section~\ref{subsect.total.correctness.for.well.behaved.assembly.programs} deals with correctness w.r.t. execution: 432 the instructions \texttt{[J1, \ldots, Jn]} simulate the pseudoinstruction~\texttt{I}. 433 434 The (slightly simplified) Russell type for the \texttt{assembly} function is: 422 435 \begin{lstlisting} 423 436 definition assembly: 424 $\forall$program: pseudo_assembly_program. 425 $\forall$policy. 426 $\Sigma$assembled: list Byte $\times$ (BitVectorTrie costlabel 16). 427 policy is correct for program $\rightarrow$ 428 $\mid$program$\mid$ < $2^{16}$ $\rightarrow$ $\mid$fst assembled$\mid$ < $2^{16}$ $\wedge$ 429 (policy ($\mid$program$\mid$) = $\mid$fst assembled$\mid$ $\vee$ 430 (policy ($\mid$program$\mid$) = 0 $\wedge$ $\mid$fst assembled$\mid$ = $2^{16}$)) $\wedge$ 431 $\forall$ppc: pseudo_program_counter. ppc < $2^{16}$ $\rightarrow$ 432 let pseudo_instr := fetch from program at ppc in 433 let assembled_i := assemble pseudo_instr in 434 $\mid$assembled_i$\mid$ $\leq$ $2^{16}$ $\wedge$ 435 $\forall$n: nat. n < $\mid$assembled_i$\mid$ $\rightarrow$ $\exists$k: nat. 436 nth assembled_i n = nth assembled (policy ppc + k). 437 $\forall$program: pseudo_assembly_program. $\forall$policy. 438 $\Sigma$assembled: list Byte $\times$ (BitVectorTrie nat 16). 439 $\mid$program$\mid$ $\leq$ $2^{16}$ $\rightarrow$ policy is correct for program $\rightarrow$ 440 policy ($\mid$program$\mid$) = $\mid$fst assembled$\mid$ $\leq$ $2^{16}$ $\wedge$ 441 $\forall$ppc: pseudo_program_counter. ppc < $2^{16}$ $\rightarrow$ 442 let pseudo_instr := fetch from program at ppc in 443 let assembled_i := assemble pseudo_instr in 444 $\mid$assembled_i$\mid$ $\leq$ $2^{16}$ $\wedge$ 445 $\forall$n: nat. n < $\mid$assembled_i$\mid$ $\rightarrow$ $\exists$k: nat. 446 nth assembled_i n = nth assembled (policy ppc + k). 437 447 \end{lstlisting} 438 448 In plain words, the type of \texttt{assembly} states the following. 439 Suppose we are given a policy that is correct for the program we are assembling. 440 Then we return a list of assembled bytes, complete with a map from program counters to cost labels, such that the following properties hold for the list of bytes. 441 Under the condition that the policy is `correct' for the program and the program is fully addressable by a 16bit word, the assembled list is also fully addressable by a 16bit word, the policy maps the last program counter that can address the program to the last instruction of the assemble pseudoinstruction or overflows, and if we fetch from the pseudoprogram counter \texttt{ppc} we get a pseudoinstruction \texttt{pi} and a new pseudoprogram counter \texttt{ppc}. 442 Further, assembling the pseudoinstruction \texttt{pseudo\_instr} results in a list of bytes, \texttt{assembled\_i}. 443 Then, indexing into this list with any natural number \texttt{n} less than the length of \texttt{assembled\_i} gives the same result as indexing into \texttt{assembled} with \texttt{policy ppc} (the program counter pointing to the start of the expansion in \texttt{assembled}) plus \texttt{k}. 444 445 Essentially the lemma above states that the \texttt{assembly} function correctly expands pseudoinstructions, and that the expanded instruction reside consecutively in memory. 446 This result is lifted from lists of bytes into a result on tries of bytes (i.e. code memories), using an additional lemma: \texttt{assembly\_ok}. 447 448 Lemma \texttt{fetch\_assembly} establishes that the \texttt{fetch} and \texttt{assembly1} functions interact correctly. 449 The \texttt{fetch} function, as its name implies, fetches the instruction indexed by the program counter in the code memory, while \texttt{assembly1} maps a single instruction to its byte encoding: 449 Given a correct policy for the program to be assembled, the assembler never 450 fails and returns some object code and a costing function. 451 Under the condition that the policy is `correct' for the program and the program is fully addressable by a 16bit word, the object code is also fully addressable by a 16bit word. Moreover, the 452 result of assemblying the pseudoinstruction obtained fetching from the 453 assembly address \texttt{ppc} 454 is a list of bytes found in the generated object code starting from 455 the object code address \texttt{policy(ppc)}. 456 457 Essentially the type above states that the \texttt{assembly} function correctly expands pseudoinstructions, and that the expanded instruction reside consecutively in memory. The fundamental hypothesis is correctness of the policy which 458 allows to prove the inductive step of the proof, which is by induction over 459 the assembly program. It is then simple to lift the property from lists 460 of bytes (object code) to tries of bytes (i.e. code memories after loading). 461 The \texttt{assembly\_ok} lemma does the lifting. 462 463 We have established that every pseudoinstruction is compiled to a sequence of 464 bytes that is found in memory at the expect place. This does not imply 465 trivially that those bytes will be decoded in a correct way to recover the 466 pseudoinstruction expansion. Indeed, we need first to prove a lemma that 467 establish that the \texttt{fetch} function is the left inverse of the 468 \texttt{assembly1} function: 450 469 \begin{lstlisting} 451 470 lemma fetch_assembly:
Note: See TracChangeset
for help on using the changeset viewer.