# Changeset 2349 for Papers

Ignore:
Timestamp:
Sep 26, 2012, 5:44:31 PM (7 years ago)
Message:

Up to section 3.5.

File:
1 edited

### Legend:

Unmodified
 r2348 In this way, Matita supports the Russell' proof methodology developed by Sozeau in~\cite{sozeau:subset:2006}, with an implementation that is lighter and more tightly integrated with the system than that of Coq. Throughout this paper we simplify the statements of lemmas and types of definitions in order to emphasise readability. % ---------------------------------------------------------------------------- % % SECTION                                                                      % definition expand_pseudo_instruction: $\forall$lookup_labels: Identifier $\rightarrow$ Word. $\forall$sigma: Word $\rightarrow$ Word. $\forall$policy: Word $\rightarrow$ bool. $\forall$policy. $\forall$ppc: Word. $\forall$lookup_datalabels: Identifier $\rightarrow$ Word. Here, the functions \texttt{lookup\_labels} and \texttt{lookup\_datalabels} are the functions that map labels and datalabels to program counters respectively, both of them used in the semantics of assembly. The input \texttt{pi} is the pseudoinstruction to be expanded and is found at address \texttt{ppc} in the assembly program. The policy is defined using the two functions \texttt{sigma} and \texttt{policy}, of which \texttt{sigma} is the most interesting. The function \texttt{sigma} maps pseudoprogram counters to program counters: the encoding of the expansion of the pseudoinstruction found at address \texttt{a} in the assembly code should be placed into code memory at address \texttt{sigma(a)}. The function takes \texttt{policy} as an input. In reality, this is a pair of functions, but for the purposes of this paper we simplify. The \texttt{policy} maps pseudo-program counters to program counters: the encoding of the expansion of the pseudoinstruction found at address \texttt{a} in the assembly code should be placed into code memory at address \texttt{policy(a)}. Of course this is possible only if the policy is correct, which means that the encoding of consecutive assembly instructions must be consecutive in code memory. \begin{displaymath} \texttt{sigma}(\texttt{ppc} + 1) = \texttt{pc} + \texttt{current\_instruction\_size} \texttt{policy}(\texttt{ppc} + 1) = \texttt{pc} + \texttt{current\_instruction\_size} \end{displaymath} Here, \texttt{current\_instruction\_size} is the size in bytes of the encoding of the expanded pseudoinstruction found at \texttt{ppc}. A companion submission to this one~\cite{boender:correctness:2012} certifies an algorithm that finds branch displacement policies for the assembler described in this paper. The \texttt{expand\_pseudo\_instruction} function uses the \texttt{sigma} map to determine the size of jump required when expanding pseudojumps, computing the jump size by examining the size of the differences between program counters. For instance, if at address \texttt{ppc} in the assembly program we found \texttt{Jmp l} such that \texttt{lookup\_labels l = a}, if the offset \texttt{d = sigma(a) - sigma(ppc + 1)} is such that \texttt{d} $< 128$ then \texttt{Jmp l} is normally translated to the best local solution, the short jump \texttt{SJMP d}. The \texttt{expand\_pseudo\_instruction} function uses the \texttt{policy} map to determine the size of jump required when expanding pseudojumps, computing the jump size by examining the size of the differences between program counters. For instance, if at address \texttt{ppc} in the assembly program we found \texttt{Jmp l} such that \texttt{lookup\_labels l = a}, if the offset \texttt{d = policy(a) - policy(ppc + 1)} is such that \texttt{d} $< 128$ then \texttt{Jmp l} is normally translated to the best local solution, the short jump \texttt{SJMP d}. A global best solution to the branch displacement problem, however, is not always made of locally best solutions. Therefore, in some circumstances, it is necessary to force the assembler to expand jumps into larger ones. This is achieved by the \texttt{policy} function: if \texttt{policy ppc = true} then a \texttt{Jmp l} at address \texttt{ppc} is always translated to a long jump. This is achieved by another boolean-valued function such that if the function applied to \texttt{ppc} returns true then a \texttt{Jmp l} at address \texttt{ppc} is always translated to a long jump. An essentially identical mechanism exists for call instructions. nth assembled_i n = nth assembled (policy ppc + k). \end{lstlisting} [dpm: update] In plain words, the type of \texttt{assembly} states the following. Suppose we are given a policy that is correct for the program we are assembling, and suppose the program to be assembled is fully addressable by a 16-bit word. Then if we fetch from the pseudo-program counter \texttt{ppc} we get a pseudoinstruction \texttt{pi} and a new pseudo-program counter \texttt{ppc}. Further, assembling the pseudoinstruction \texttt{pi} results in a list of bytes, \texttt{a}. Then, indexing into this list with any natural number \texttt{j} less than the length of \texttt{a} gives the same result as indexing into \texttt{assembled} with \texttt{sigma(ppc)} (the program counter pointing to the start of the expansion in \texttt{assembled}) plus \texttt{j}. Suppose we are given a policy that is correct for the program we are assembling. Then we return a list of assembled bytes, complete with a map from program counters to cost labels, such that the following properties hold for the list of bytes. Under the condition that the policy is correct' for the program and the program is fully addressable by a 16-bit word, the assembled list is also fully addressable by a 16-bit word, the policy maps the last program counter that can address the program to the last instruction of the assemble pseudoinstruction or overflows, and if we fetch from the pseudo-program counter \texttt{ppc} we get a pseudoinstruction \texttt{pi} and a new pseudo-program counter \texttt{ppc}. Further, assembling the pseudoinstruction \texttt{pseudo\_instr} results in a list of bytes, \texttt{assembled\_i}. Then, indexing into this list with any natural number \texttt{n} less than the length of \texttt{assembled\_i} gives the same result as indexing into \texttt{assembled} with \texttt{policy ppc} (the program counter pointing to the start of the expansion in \texttt{assembled}) plus \texttt{k}. Essentially the lemma above states that the \texttt{assembly} function correctly expands pseudoinstructions, and that the expanded instruction reside consecutively in memory. Or, in plainer words, assembling and then immediately fetching again gets you back to where you started. Lemma \texttt{fetch\_assembly\_pseudo} (slightly simplified, here) is obtained by composition of \texttt{expand\_pseudo\_instruction} and \texttt{assembly\_1\_pseudoinstruction}: Lemma \texttt{fetch\_assembly\_pseudo} is obtained by composition of \texttt{expand\_pseudo\_instruction} and \texttt{assembly\_1\_pseudoinstruction}: \begin{lstlisting} lemma fetch_assembly_pseudo: let pi := $\pi_1$ (fetch_pseudo_instruction instr_list ppc) in let pc := policy ppc in let instrs := expand_pseudo_instructio sigma policy ppc pi in let $\langle$l, a$\rangle$ := assembly_1_pseudoinstruction sigma policy ppc pi in let instrs := expand_pseudo_instructio policy ppc pi in let $\langle$l, a$\rangle$ := assembly_1_pseudoinstruction policy ppc pi in let pc_plus_len := pc + l in encoding_check code_memory pc pc_plus_len a $\rightarrow$ Intuitively, Lemma \texttt{fetch\_assembly\_pseudo} can be read as follows. Suppose we expand the pseudoinstruction at \texttt{ppc} with the policy decision \texttt{sigma}, obtaining the list of machine code instructions \texttt{instrs}. Suppose we expand the pseudoinstruction at \texttt{ppc} with the policy, obtaining the list of machine code instructions \texttt{instrs}. Suppose we also assemble the pseudoinstruction at \texttt{ppc} to obtain \texttt{a}, a list of bytes. Then, we check with \texttt{fetch\_many} that the number of machine instructions that were fetched matches the number of instruction that \texttt{expand\_pseudo\_instruction} expanded. The final lemma in this series is \texttt{fetch\_assembly\_pseudo2} that combines the Lemma \texttt{fetch\_assembly\_pseudo} with the correctness of the functions that load object code into the processor's memory. Again, we slightly simplify: The final lemma in this series is \texttt{fetch\_assembly\_pseudo2} that combines the Lemma \texttt{fetch\_assembly\_pseudo} with the correctness of the functions that load object code into the processor's memory: \begin{lstlisting} lemma fetch_assembly_pseudo2: policy is correct for program $\rightarrow$ $\forall$ppc. ppc < $\mid$snd program$\mid$ $\rightarrow$ let $\langle$preamble, instr_list$\rangle$ := program in let $\langle$labels, costs$\rangle$ := create_label_cost_map instr_list in let $\langle$labels, costs$\rangle$ := create_label_cost_map program in let $\langle$assembled, costs'$\rangle$ := $\pi_1$ (assembly program sigma policy) in let cmem := load_code_memory assembled in let $\langle$pi, newppc$\rangle$ := fetch_pseudo_instruction instr_list ppc in let $\langle$pi, newppc$\rangle$ := fetch_pseudo_instruction program ppc in let instructions := expand_pseudo_instruction policy ppc pi in fetch_many cmem (policy newppc) (policy ppc) instructions. We read \texttt{fetch\_assembly\_pseudo2} as follows. Suppose we are given an assembly program which can be addressed by a 16-bit word and a policy that is correct for this program. Suppose we are able to successfully assemble an assembly program using \texttt{assembly} and produce a code memory, \texttt{cmem}. Then, fetching a pseudoinstruction from the pseudo-code memory at address \texttt{ppc} corresponds to fetching a sequence of instructions from the real code memory using \texttt{sigma} to expand pseudoinstructions. The fetched sequence corresponds to the expansion, according to \texttt{sigma}, of the pseudoinstruction. Then, fetching a pseudoinstruction from the pseudo-code memory at address \texttt{ppc} corresponds to fetching a sequence of instructions from the real code memory using \texttt{policy} to expand pseudoinstructions. The fetched sequence corresponds to the expansion, according to the policy, of the pseudoinstruction. At first, the lemma appears to immediately imply the correctness of the assembler.