Changeset 983
 Timestamp:
 Jun 16, 2011, 4:00:54 PM (8 years ago)
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

src/ASM/CPP2011/cpp2011.tex
r981 r983 46 46 47 47 \begin{abstract} 48 We consider the formalisation of an assembler for the Intel MCS51 8bit microprocessorin the Matita proof assistant.48 We consider the formalisation of an assembler for Intel MCS51 assembly language in the Matita proof assistant. 49 49 This formalisation forms a major component of the EUfunded CerCo project, concering the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language. 50 ... 50 51 The efficient expansion of pseudoinstructionsparticularly jumpsinto MCS51 machine instructions is complex. 52 We employ a strategy, involving the use of `policies', that separates the decision making over how jumps should be expanded from the expansion process itself. 53 This makes the proof of correctness for the assembler significantly more straightforward. 54 55 We prove, under the assumption of the existence of a correct policy, that the assembly process preserves the semantics of assembly programs. 56 Correct policies fail to exist only in a limited number of pathological circumstances. 51 57 \end{abstract} 52 58 … … 294 300 For this, we use the \texttt{sigma} machinery defined at the end of Subsection~\ref{subsect.policies}. 295 301 296 CSC: no options using policy297 \begin{lstlisting}298 lemma fetch_assembly_pseudo2: $\forall$program, assembled, costs, labels.299 Some $\ldots$ $\langle$labels,costs$\rangle$ = build_maps program $\rightarrow$300 Some $\ldots$ $\langle$assembled,costs$\rangle$ = assembly program $\rightarrow$ $\forall$ppc.301 let code_memory := load_code_memory assembled in302 let preamble := $\pi_1$ program in303 let data_labels := construct_datalabels preamble in304 let lk_labels :=305 λx. sigma program (address_of_word_labels_code_mem ($\pi_2$ program) x) in306 let lk_dlabels := λx. lookup ? ? x data_labels (zero ?) in307 let expansion := jump_expansion ppc program in308 let $\langle$pi,newppc$\rangle$ := fetch_pseudo_instruction ($\pi_2$ program) ppc in309 let ppc' := sigma program ppc in310 let newppc' := sigma program newppc in311 let instructions' :=312 expand_pseudo_instruction lk_labels lk_dlabels ppc' expansion pi in313 let fetched := fetch_many code_memory newppc' ppc' instructions in314 $\exists$instructions. Some ? instructions = instructions' $\wedge$ fetched.315 \end{lstlisting}316 317 302 We expand pseudoinstructions using the function \texttt{expand\_pseudo\_instruction}. 318 303 This function accepts a `policy decision'an element of type \texttt{jump\_length}that is used when expanding a \texttt{Call}, \texttt{Jmp} or conditional jump to a label into the correct machine instruction. … … 327 312 For instance, if the policy decision dictates that we should expand a \texttt{Call} pseudoinstruction into a `short jump', then we fail, as the MCS51's instruction set only features instructions \texttt{ACALL} and \texttt{LCALL}. 328 313 314 % dpm todo 329 315 \begin{lstlisting} 330 316 axiom assembly_ok: ∀program,assembled,costs,labels. … … 348 334 \end{lstlisting} 349 335 350 Lemma \texttt{fetch\_assembly\_pseudo} establishes the correctness of expanding and then assembling pseudoinstructions: 336 % dpm todo 337 \begin{lstlisting} 338 theorem fetch_assembly: $\forall$pc, i, cmem, assembled. 339 assembled = assembly1 i $\rightarrow$ 340 let len := length $\ldots$ assembled in 341 encoding_check cmem pc (pc + len) assembled $\rightarrow$ 342 let fetched := fetch code_memory (bitvector_of_nat $\ldots$ pc) in 343 let $\langle$instr_pc, ticks$\rangle$ := fetched in 344 let $\langle$instr, pc'$\rangle$ := instr_pc in 345 (eq_instruction instr i $\wedge$ 346 eqb ticks (ticks_of_instruction instr) $\wedge$ 347 eq_bv $\ldots$ pc' (pc + len)) = true. 348 \end{lstlisting} 349 350 Lemma \texttt{fetch\_assembly\_pseudo} establishes a basic property between \texttt{expand\_pseudo\_instruction} and \texttt{assembly\_1\_pseudoinstruction}: 351 351 \begin{lstlisting} 352 352 lemma fetch_assembly_pseudo: $\forall$program, ppc, lk_labels, lk_dlabels. … … 366 366 The function \texttt{fetch\_many} fetches multiple machine code instructions from code memory and performs some routine checks. 367 367 368 Intuitively, lemma \texttt{fetch\_assembly\_pseudo} can be read as follows.368 Intuitively, Lemma \texttt{fetch\_assembly\_pseudo} can be read as follows. 369 369 Suppose our policy \texttt{jump\_expansion} dictates that the pseudoinstruction indexed by the pseudo program counter \texttt{ppc} in assembly program \texttt{program} gives us the policy decision \texttt{jexp}. 370 370 Further, suppose we expand the pseudoinstruction at \texttt{ppc} with the policy decision \texttt{jexp}, obtaining an (optional) list of machine code instructions \texttt{exp}. 371 Suppose we also 372 373 \begin{lstlisting} 374 theorem fetch_assembly: $\forall$pc, i, cmem, assembled. 375 assembled = assembly1 i $\rightarrow$ 376 let len := length $\ldots$ assembled in 377 encoding_check cmem pc (pc + len) assembled $\rightarrow$ 378 let fetched := fetch code_memory (bitvector_of_nat $\ldots$ pc) in 379 let $\langle$instr_pc, ticks$\rangle$ := fetched in 380 let $\langle$instr, pc'$\rangle$ := instr_pc in 381 (eq_instruction instr i $\wedge$ 382 eqb ticks (ticks_of_instruction instr) $\wedge$ 383 eq_bv $\ldots$ pc' (pc + len)) = true. 384 \end{lstlisting} 371 Suppose we also assemble the pseudoinstruction at \texttt{ppc} to obtain \texttt{ass}, a list of bytes. 372 Then, under the assumption that neither the expansion of the pseudoinstruction to obtain \texttt{exp}, nor the assembly of the pseudoinstruction to obtain \texttt{ass}, failed, we check with \texttt{fetch\_many} that the number of machine instructions that were fetched matches the number of instruction that \texttt{expand\_pseudo\_instruction} expanded. 373 374 At first sight, Lemma \texttt{fetch\_assembly\_pseudo2} appears to nearly establish the correctness of the assembler: 375 \begin{lstlisting} 376 lemma fetch_assembly_pseudo2: $\forall$program, assembled, costs, labels. 377 Some $\ldots$ $\langle$labels, costs$\rangle$ = build_maps program $\rightarrow$ 378 Some $\ldots$ $\langle$assembled, costs$\rangle$ = assembly program $\rightarrow$ $\forall$ppc. 379 let code_memory := load_code_memory assembled in 380 let preamble := $\pi_1$ program in 381 let data_labels := construct_datalabels preamble in 382 let lk_labels := 383 λx. sigma program (address_of_word_labels_code_mem ($\pi_2$ program) x) in 384 let lk_dlabels := λx. lookup ? ? x data_labels (zero ?) in 385 let expansion := jump_expansion ppc program in 386 let $\langle$pi, newppc$\rangle$ := fetch_pseudo_instruction ($\pi_2$ program) ppc in 387 let ppc' := sigma program ppc in 388 let newppc' := sigma program newppc in 389 let instructions' := 390 expand_pseudo_instruction lk_labels lk_dlabels ppc' expansion pi in 391 let fetched := $\lambda$instr. fetch_many code_memory newppc' ppc' instr in 392 $\exists$instrs. Some ? instrs = instructions' $\wedge$ fetched instrs. 393 \end{lstlisting} 394 Intuitively, we may read \texttt{fetch\_assembly\_pseudo2} as follows. 395 Suppose we are able to successfully assemble an assembly program using \texttt{assembly} and produce a code memory, \texttt{code\_memory}. 396 Then there exists some list of machine instructions equal to the expansion of a pseudoinstruction and the number of machine instructions that need to be fetched is equal to the number of machine instructions that the pseudoinstruction was expanded into. 397 398 However, this property is \emph{not} strong enough to establish that the semantics of an assembly program has been preserved by the assembly process. 399 In particular, \texttt{fetch\_assembly\_pseudo2} says nothing about how 385 400 386 401 An \texttt{internal\_pseudo\_address\_map} positions in the memory of a \texttt{PseudoStatus} with a physical memory address. … … 426 441 \end{lstlisting} 427 442 The statement can be given an intuitive reading as follows. 428 Suppose our \texttt{PseudoStatus} $ps$ can be successfully converted into a \texttt{Status} $s$.429 Suppose further that, after executing a single assembly instruction and converting the resulting \texttt{PseudoStatus} into a \texttt{Status}, we obtain $s''$, being careful to track the number of ticks executed.430 Then, there exists some number $n$, so that executing $n$ machine code instructions in \texttt{Status} $s$ gives us \texttt{Status} $s''$.443 Suppose our \texttt{PseudoStatus}, \texttt{ps}, can be successfully converted into a \texttt{Status}, \texttt{s}. 444 Suppose further that, after executing a single assembly instruction and converting the resulting \texttt{PseudoStatus} into a \texttt{Status}, we obtain \texttt{s''}, being careful to track the number of ticks executed. 445 Then, there exists some number \texttt{n}, so that executing \texttt{n} machine code instructions in \texttt{Status} \texttt{s} gives us \texttt{Status} \texttt{s''}. 431 446 Theorem \texttt{main\_thm} establishes the correctness of the assembly process. 432 447 … … 441 456 Expanding these pseudoinstructions into machine code instructions is not trivial, and the proof that the assembly process is `correct', in that the semantics of an assembly program are not changed is complex. 442 457 443 Aside from their application in verified compiler projects such as CerCo, verified assemblers could also be applied to the verification of operating system kernels. 458 The formalisation is a key component of the CerCo project, which aims to produce a verified concrete complexity preserving compiler for a large subset of the C programming language. 459 The verified assembler, complete with the underlying formalisation of the semantics of MCS51 machine code (described fully in~\cite{mulligan:executable:2011}), will form the bedrock layer upon which the rest of the CerCo project will build its verified compiler platform. 460 461 Aside from their application in verified compiler projects such as CerCo, verified assemblers such as ours could also be applied to the verification of operating system kernels. 444 462 Of particular note is the verified seL4 kernel~\cite{klein:sel4:2009,klein:sel4:2010}. 445 This verification explicitly assumes the existence of, amongst other things, a trustworthy assembler .463 This verification explicitly assumes the existence of, amongst other things, a trustworthy assembler and compiler. 446 464 447 465 \paragraph{Use of dependent types and Russell}
Note: See TracChangeset
for help on using the changeset viewer.