Changeset 983


Ignore:
Timestamp:
Jun 16, 2011, 4:00:54 PM (8 years ago)
Author:
mulligan
Message:

more work added

File:
1 edited

Legend:

Unmodified
Added
Removed
  • src/ASM/CPP2011/cpp-2011.tex

    r981 r983  
    4646
    4747\begin{abstract}
    48 We consider the formalisation of an assembler for the Intel MCS-51 8-bit microprocessor in the Matita proof assistant.
     48We consider the formalisation of an assembler for Intel MCS-51 assembly language in the Matita proof assistant.
    4949This formalisation forms a major component of the EU-funded CerCo project, concering the construction and formalisation of a concrete complexity preserving compiler for a large subset of the C programming language.
    50 ...
     50
     51The efficient expansion of pseudoinstructions---particularly jumps---into MCS-51 machine instructions is complex.
     52We employ a strategy, involving the use of `policies', that separates the decision making over how jumps should be expanded from the expansion process itself.
     53This makes the proof of correctness for the assembler significantly more straightforward.
     54
     55We prove, under the assumption of the existence of a correct policy, that the assembly process preserves the semantics of assembly programs.
     56Correct policies fail to exist only in a limited number of pathological circumstances.
    5157\end{abstract}
    5258
     
    294300For this, we use the \texttt{sigma} machinery defined at the end of Subsection~\ref{subsect.policies}.
    295301
    296 CSC: no options using policy
    297 \begin{lstlisting}
    298 lemma fetch_assembly_pseudo2: $\forall$program, assembled, costs, labels.
    299   Some $\ldots$ $\langle$labels,costs$\rangle$ = build_maps program $\rightarrow$
    300   Some $\ldots$ $\langle$assembled,costs$\rangle$ = assembly program $\rightarrow$ $\forall$ppc.
    301   let code_memory := load_code_memory assembled in
    302   let preamble := $\pi_1$ program in
    303   let data_labels := construct_datalabels preamble in
    304   let lk_labels :=
    305     λx. sigma program (address_of_word_labels_code_mem ($\pi_2$ program) x) in
    306   let lk_dlabels := λx. lookup ? ? x data_labels (zero ?) in
    307   let expansion := jump_expansion ppc program in
    308   let $\langle$pi,newppc$\rangle$ := fetch_pseudo_instruction ($\pi_2$ program) ppc in
    309   let ppc' := sigma program ppc in
    310   let newppc' := sigma program newppc in
    311   let instructions' :=
    312     expand_pseudo_instruction lk_labels lk_dlabels ppc' expansion pi in
    313   let fetched := fetch_many code_memory newppc' ppc' instructions in
    314     $\exists$instructions. Some ? instructions = instructions' $\wedge$ fetched.
    315 \end{lstlisting}
    316 
    317302We expand pseudoinstructions using the function \texttt{expand\_pseudo\_instruction}.
    318303This function accepts a `policy decision'---an element of type \texttt{jump\_length}---that is used when expanding a \texttt{Call}, \texttt{Jmp} or conditional jump to a label into the correct machine instruction.
     
    327312For instance, if the policy decision dictates that we should expand a \texttt{Call} pseudoinstruction into a `short jump', then we fail, as the MCS-51's instruction set only features instructions \texttt{ACALL} and \texttt{LCALL}.
    328313
     314% dpm todo
    329315\begin{lstlisting}
    330316axiom assembly_ok: ∀program,assembled,costs,labels.
     
    348334\end{lstlisting}
    349335
    350 Lemma \texttt{fetch\_assembly\_pseudo} establishes the correctness of expanding and then assembling pseudoinstructions:
     336% dpm todo
     337\begin{lstlisting}
     338theorem fetch_assembly: $\forall$pc, i, cmem, assembled.
     339  assembled = assembly1 i $\rightarrow$
     340  let len := length $\ldots$ assembled in
     341    encoding_check cmem pc (pc + len) assembled $\rightarrow$
     342    let fetched := fetch code_memory (bitvector_of_nat $\ldots$ pc) in
     343    let $\langle$instr_pc, ticks$\rangle$ := fetched in
     344    let $\langle$instr, pc'$\rangle$ := instr_pc in
     345      (eq_instruction instr i $\wedge$
     346       eqb ticks (ticks_of_instruction instr) $\wedge$
     347       eq_bv $\ldots$ pc' (pc + len)) = true.
     348\end{lstlisting}
     349
     350Lemma \texttt{fetch\_assembly\_pseudo} establishes a basic property between \texttt{expand\_pseudo\_instruction} and \texttt{assembly\_1\_pseudoinstruction}:
    351351\begin{lstlisting}
    352352lemma fetch_assembly_pseudo: $\forall$program, ppc, lk_labels, lk_dlabels.
     
    366366The function \texttt{fetch\_many} fetches multiple machine code instructions from code memory and performs some routine checks.
    367367
    368 Intuitively, lemma \texttt{fetch\_assembly\_pseudo} can be read as follows.
     368Intuitively, Lemma \texttt{fetch\_assembly\_pseudo} can be read as follows.
    369369Suppose our policy \texttt{jump\_expansion} dictates that the pseudoinstruction indexed by the pseudo program counter \texttt{ppc} in assembly program \texttt{program} gives us the policy decision \texttt{jexp}.
    370370Further, suppose we expand the pseudoinstruction at \texttt{ppc} with the policy decision \texttt{jexp}, obtaining an (optional) list of machine code instructions \texttt{exp}.
    371 Suppose we also
    372 
    373 \begin{lstlisting}
    374 theorem fetch_assembly: $\forall$pc, i, cmem, assembled.
    375   assembled = assembly1 i $\rightarrow$
    376   let len := length $\ldots$ assembled in
    377     encoding_check cmem pc (pc + len) assembled $\rightarrow$
    378     let fetched := fetch code_memory (bitvector_of_nat $\ldots$ pc) in
    379     let $\langle$instr_pc, ticks$\rangle$ := fetched in
    380     let $\langle$instr, pc'$\rangle$ := instr_pc in
    381       (eq_instruction instr i $\wedge$
    382        eqb ticks (ticks_of_instruction instr) $\wedge$
    383        eq_bv $\ldots$ pc' (pc + len)) = true.
    384 \end{lstlisting}
     371Suppose we also assemble the pseudoinstruction at \texttt{ppc} to obtain \texttt{ass}, a list of bytes.
     372Then, under the assumption that neither the expansion of the pseudoinstruction to obtain \texttt{exp}, nor the assembly of the pseudoinstruction to obtain \texttt{ass}, failed, we check with \texttt{fetch\_many} that the number of machine instructions that were fetched matches the number of instruction that \texttt{expand\_pseudo\_instruction} expanded.
     373
     374At first sight, Lemma \texttt{fetch\_assembly\_pseudo2} appears to nearly establish the correctness of the assembler:
     375\begin{lstlisting}
     376lemma fetch_assembly_pseudo2: $\forall$program, assembled, costs, labels.
     377  Some $\ldots$ $\langle$labels, costs$\rangle$ = build_maps program $\rightarrow$
     378  Some $\ldots$ $\langle$assembled, costs$\rangle$ = assembly program $\rightarrow$ $\forall$ppc.
     379  let code_memory := load_code_memory assembled in
     380  let preamble := $\pi_1$ program in
     381  let data_labels := construct_datalabels preamble in
     382  let lk_labels :=
     383    λx. sigma program (address_of_word_labels_code_mem ($\pi_2$ program) x) in
     384  let lk_dlabels := λx. lookup ? ? x data_labels (zero ?) in
     385  let expansion := jump_expansion ppc program in
     386  let $\langle$pi, newppc$\rangle$ := fetch_pseudo_instruction ($\pi_2$ program) ppc in
     387  let ppc' := sigma program ppc in
     388  let newppc' := sigma program newppc in
     389  let instructions' :=
     390    expand_pseudo_instruction lk_labels lk_dlabels ppc' expansion pi in
     391  let fetched := $\lambda$instr. fetch_many code_memory newppc' ppc' instr in
     392    $\exists$instrs. Some ? instrs = instructions' $\wedge$ fetched instrs.
     393\end{lstlisting}
     394Intuitively, we may read \texttt{fetch\_assembly\_pseudo2} as follows.
     395Suppose we are able to successfully assemble an assembly program using \texttt{assembly} and produce a code memory, \texttt{code\_memory}.
     396Then there exists some list of machine instructions equal to the expansion of a pseudoinstruction and the number of machine instructions that need to be fetched is equal to the number of machine instructions that the pseudoinstruction was expanded into.
     397
     398However, this property is \emph{not} strong enough to establish that the semantics of an assembly program has been preserved by the assembly process.
     399In particular, \texttt{fetch\_assembly\_pseudo2} says nothing about how
    385400
    386401An \texttt{internal\_pseudo\_address\_map} positions in the memory of a \texttt{PseudoStatus} with a physical memory address.
     
    426441\end{lstlisting}
    427442The statement can be given an intuitive reading as follows.
    428 Suppose our \texttt{PseudoStatus} $ps$ can be successfully converted into a \texttt{Status} $s$.
    429 Suppose further that, after executing a single assembly instruction and converting the resulting \texttt{PseudoStatus} into a \texttt{Status}, we obtain $s''$, being careful to track the number of ticks executed.
    430 Then, there exists some number $n$, so that executing $n$ machine code instructions in \texttt{Status} $s$ gives us \texttt{Status} $s''$.
     443Suppose our \texttt{PseudoStatus}, \texttt{ps}, can be successfully converted into a \texttt{Status}, \texttt{s}.
     444Suppose further that, after executing a single assembly instruction and converting the resulting \texttt{PseudoStatus} into a \texttt{Status}, we obtain \texttt{s''}, being careful to track the number of ticks executed.
     445Then, there exists some number \texttt{n}, so that executing \texttt{n} machine code instructions in \texttt{Status} \texttt{s} gives us \texttt{Status} \texttt{s''}.
    431446Theorem \texttt{main\_thm} establishes the correctness of the assembly process.
    432447
     
    441456Expanding these pseudoinstructions into machine code instructions is not trivial, and the proof that the assembly process is `correct', in that the semantics of an assembly program are not changed is complex.
    442457
    443 Aside from their application in verified compiler projects such as CerCo, verified assemblers could also be applied to the verification of operating system kernels.
     458The formalisation is a key component of the CerCo project, which aims to produce a verified concrete complexity preserving compiler for a large subset of the C programming language.
     459The verified assembler, complete with the underlying formalisation of the semantics of MCS-51 machine code (described fully in~\cite{mulligan:executable:2011}), will form the bedrock layer upon which the rest of the CerCo project will build its verified compiler platform.
     460
     461Aside from their application in verified compiler projects such as CerCo, verified assemblers such as ours could also be applied to the verification of operating system kernels.
    444462Of particular note is the verified seL4 kernel~\cite{klein:sel4:2009,klein:sel4:2010}.
    445 This verification explicitly assumes the existence of, amongst other things, a trustworthy assembler.
     463This verification explicitly assumes the existence of, amongst other things, a trustworthy assembler and compiler.
    446464
    447465\paragraph{Use of dependent types and Russell}
Note: See TracChangeset for help on using the changeset viewer.