Changeset 2096


Timestamp:
Jun 15, 2012, 3:25:21 PM
Author:
mulligan
Message:

Changes to the English for Jaap, and some tidying up and making consistent with the other CPP submission.

Location:
src/ASM/CPP2012-policy
Files:
4 edited

  • src/ASM/CPP2012-policy/algorithm.tex

    r2091 r2096  
    44
    55Given the NP-completeness of the problem, to arrive at an optimal solution
    6 within a short space of time (using, for example, a constraint solver) will
    7 potentially take a great amount of time.
     6(using, for example, a constraint solver) will potentially take a great amount
     7of time.
    88
    9 The SDCC compiler~\cite{SDCC2011}, which has the MCS-51 among its target
    10 instruction sets, simply encodes every branch instruction as a long jump
     9The SDCC compiler~\cite{SDCC2011}, which has a backend targeting the MCS-51
     10instruction set, simply encodes every branch instruction as a long jump
    1111without taking the distance into account. While certainly correct (the long
    12 jump can reach any destination in memory) and rapid, it does result in a less
    13 than optimal solution.
     12jump can reach any destination in memory) and very fast to compute,
     13it results in a less than optimal solution.
    1414
    15 The {\tt gcc} compiler suite~\cite{GCC2012}, while compiling C on the x86
    16 architecture, uses a greatest fix point algorithm. In other words, it starts
    17 off with all branch instructions encoded as the largest jumps available, and
    18 then tries to reduce branch instructions as much as possible.
     15On the other hand, the {\tt gcc} compiler suite~\cite{GCC2012}, while compiling
     16C on the x86 architecture, uses a greatest fixed point algorithm. In other words,
     17it starts with all branch instructions encoded as the largest jumps
     18available, and then tries to reduce the size of branch instructions as much as
     19possible.
    1920
    2021Such an algorithm has the advantage that any intermediate result it returns
     
    2223jump is always possible, and the algorithm only reduces those branch
    2324instructions whose destination address is in range for a shorter jump.
    24 The algorithm can thus be stopped after a determined amount of steps without
    25 losing correctness.
     25The algorithm can thus be stopped after a predetermined number of steps without
     26sacrificing correctness.
    2627
    27 The result, however, is not necessarily optimal, even if the algorithm is run
    28 until it terminates naturally: the fixed point reached is the {\em greatest}
     28The result, however, is not necessarily optimal. Even if the algorithm is run
     29until it terminates naturally, the fixed point reached is the {\em greatest}
    2930fixed point, not the least fixed point. Furthermore, {\tt gcc} (at least for
    3031the x86 architecture) only uses short and long jumps. This makes the algorithm
    31 more rapid, as shown in the previous section, but also results in a less
     32more efficient, as shown in the previous section, but also results in a less
    3233optimal solution.
    3334
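The greatest-fixed-point strategy described above can be pictured concretely. The following is an illustrative Python sketch, not the actual {\tt gcc} or CerCo code: the instruction representation, sizes and the short-jump range are invented for the example. It starts with every branch encoded as a long jump and repeatedly shrinks branches that are in range, and, as the text notes, it can be stopped after any number of steps without losing correctness.

```python
# Illustrative sketch of a gcc-style greatest-fixed-point shrinking pass.
# Instructions are (kind, target_index) pairs; sizes and ranges are invented.
SIZES = {"short": 2, "long": 3, "other": 1}
SHORT_RANGE = 127  # assumed maximum displacement of a short jump

def addresses(program, enc):
    """Compute each instruction's address under the current encodings."""
    addr, pc = [], 0
    for i, (kind, _) in enumerate(program):
        addr.append(pc)
        pc += SIZES[enc[i]]
    return addr

def shrink(program, max_steps=None):
    # Start with every jump encoded as a long jump: always correct.
    enc = ["long" if kind == "jump" else "other" for kind, _ in program]
    step = 0
    while max_steps is None or step < max_steps:
        addr = addresses(program, enc)
        changed = False
        for i, (kind, target) in enumerate(program):
            if kind == "jump" and enc[i] == "long":
                if abs(addr[target] - addr[i]) <= SHORT_RANGE:
                    # Shrinking only decreases distances, so a jump that is
                    # in range now stays in range: correctness is preserved.
                    enc[i] = "short"
                    changed = True
        if not changed:
            break
        step += 1
    return enc
```

Stopping with `max_steps=0` simply returns the all-long (correct but pessimistic) encoding, which illustrates why intermediate results remain usable.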
     
    3940encoding can only switch from short to long, but never in the other direction.
    4041When we add absolute jumps, however, it is theoretically possible for a branch
    41 instruction to switch from absolute to long and back, as shown in the previous
    42 section.
     42instruction to switch from absolute to long and back, as previously explained.
    4343
    4444Proving termination then becomes difficult, because there is nothing that
    45 precludes a branch instruction from switching back and forth between absolute
    46 and long indefinitely.
     45precludes a branch instruction from oscillating back and forth between absolute
     46and long jumps indefinitely.
    4747
    4848In order to keep the algorithm in the same complexity class and more easily
     
    5454from absolute to long.
    5555
    56 There is one complicating factor: suppose that a branch instruction is encoded
     56There is one complicating factor. Suppose that a branch instruction is encoded
    5757in step $n$ as an absolute jump, but in step $n+1$ it is determined that
    5858(because of changes elsewhere) it can now be encoded as a short jump. Due to
     
    6161$n+1$ as well.
    6262
    63 This is not necessarily correct, however: a branch instruction that can be
    64 encoded as a short jump cannot always also be encoded as an absolute jump
    65 (a short jump can bridge segments, whereas an absolute jump cannot). Therefore,
    66 in this situation we decide to encode the branch instruction as a long jump,
    67 which is always correct.
     63This is not necessarily correct. A branch instruction that can be
     64encoded as a short jump cannot always also be encoded as an absolute jump, as a
     65short jump can bridge segments, whereas an absolute jump cannot. Therefore,
     66in this situation we have decided to encode the branch instruction as a long
     67jump, which is always correct.
    6868
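The re-encoding rule just described can be summarised as a small decision function. This is a hypothetical sketch of the rule, not the Matita development; the encoding names and boolean parameters are invented for illustration. The key case is an absolute jump that later qualifies as short: since a short jump can bridge segments while an absolute jump cannot, the safe choice is a long jump.

```python
# Hypothetical sketch of the re-encoding rule: an absolute jump that later
# fits a short encoding is forced to long, which is always correct.
def reencode(previous, fits_short, same_segment):
    """Choose the new encoding for a branch given its previous encoding."""
    if previous == "absolute" and fits_short:
        # A short jump may cross a segment border, an absolute jump may not,
        # so neither is guaranteed to stay valid: fall back to a long jump.
        return "long"
    if fits_short:
        return "short"
    if same_segment:
        return "absolute"
    return "long"
```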
    6969The resulting algorithm, while not optimal, is at least as good as the ones
     
    7575
    7676The branch displacement algorithm forms part of the translation from
    77 pseudo-code to assembler. More specifically, it is used by the function that
     77pseudocode to assembler. More specifically, it is used by the function that
    7878translates pseudo-addresses (natural numbers indicating the position of the
    7979instruction in the program) to actual addresses in memory.
    8080
    81 The original intention was to have two different functions, one function
     81Our original intention was to have two different functions, one function
    8282$\mathtt{policy}: \mathbb{N} \rightarrow \{\mathtt{short\_jump},
    8383\mathtt{absolute\_jump}, \mathtt{long\_jump}\}$ to associate jumps to their
    8484intended encoding, and a function $\sigma: \mathbb{N} \rightarrow
    85 \mathtt{Word}$ to associate pseudo-addresses to actual addresses. $\sigma$
     85\mathtt{Word}$ to associate pseudo-addresses to machine addresses. $\sigma$
    8686would use $\mathtt{policy}$ to determine the size of jump instructions.
    8787
     
    9191From the algorithmic point of view, in order to create the $\mathtt{policy}$
    9292function, we must necessarily have a translation from pseudo-addresses
    93 to actual addresses (i.e. a $\sigma$ function): in order to judge the distance
     93to machine addresses (i.e. a $\sigma$ function): in order to judge the distance
    9494between a jump and its destination, we must know their memory locations.
    9595Conversely, in order to create the $\sigma$ function, we need to have the
     
    104104algorithms. We now have a function
    105105$\sigma: \mathbb{N} \rightarrow \mathtt{Word} \times \mathtt{bool}$ which
    106 associates a pseudo-address to an actual address. The boolean denotes a forced
     106associates a pseudo-address to a machine address. The boolean denotes a forced
    107107long jump; as noted in the previous section, if during the fixed point
    108108computation an absolute jump changes to be potentially re-encoded as a short
     
    143143\end{figure}
    144144
    145 The algorithm, shown in figure~\ref{f:jump_expansion_step}, works by folding the
     145The algorithm, shown in Figure~\ref{f:jump_expansion_step}, works by folding the
    146146function {\sc f} over the entire program, thus gradually constructing $\sigma$.
    147147This constitutes one step in the fixed point calculation; successive steps
     
    162162
    163163The first two are parameters that remain the same through one iteration, the
    164 last three are standard parameters for a fold function (including $ppc$,
     164final three are standard parameters for a fold function (including $ppc$,
    165165which is simply the number of instructions of the program already processed).
    166166
     
    169169$\sigma: \mathbb{N} \rightarrow \mathbb{N} \times \{\mathtt{short\_jump},
    170170\mathtt{absolute\_jump},\mathtt{long\_jump}\}$; a function that associates a
    171 pseudo-address with an memory address and a jump length. We do this to be able
    172 to more easily compare the jump lengths between iterations. In the algorithm,
      171pseudo-address with a memory address and a jump length. We do this to ease
      172the comparison of jump lengths between iterations. In the algorithm,
    173173we use the notation $\sigma_1(x)$ to denote the memory address corresponding to
    174174$x$, and $\sigma_2(x)$ to denote the jump length corresponding to $x$.
     
    182182
    183183We cannot use $old\_sigma$ without change: it might be the case that we have
    184 already increased the size some branch instructions before, making the program
     184already increased the size of some branch instructions before, making the program
    185185longer and moving every instruction forward. We must compensate for this by
    186186adding the size increase of the program to the label's memory address according
     
    189189Note also that we add the pc to $\sigma$ at location $ppc+1$, whereas we add the
    190190jump length at location $ppc$. We do this so that $\sigma(ppc)$ will always
    191 return a couple with the start address of the instruction at $ppc$ and the
     191return a pair with the start address of the instruction at $ppc$ and the
    192192length of its branch instruction (if any); the end address of the program can
    193193be found at $\sigma(n+1)$, where $n$ is the number of instructions in the
  • src/ASM/CPP2012-policy/conclusion.tex

    r2093 r2096  
    11\section{Conclusion}
    22
    3 In the previous sections, we have discussed the branch displacement optimisation
     3In the previous sections we have discussed the branch displacement optimisation
    44problem, presented an optimised solution, and discussed the proof of
    55termination and correctness for this algorithm, as formalised in Matita.
     
    88optimal solution would need techniques like constraint solvers. While outside
    99the scope of the present research, it would be interesting to see if enough
    10 heuristics could be find to make such a solution practical for implementing
    11 in an existent compiler; this would be especially useful for embedded systems,
    12 where it is important to have a small solution as possible.
     10heuristics could be found to make such a solution practical to implement
     11in an existing compiler; this would be especially useful for embedded systems,
     12where it is important to have as small a solution as possible.
    1313
    14 This algorithm is part of a greater whole, the CerCo project, which aims at
    15 the complete formalisation and verification of a compiler. More information
     14This algorithm is part of a greater whole, the CerCo project, which aims to
     15completely formalise and verify a concrete cost-preserving compiler for a large
     16subset of the C programming language. More information
    1617on the formalisation of the assembler, of which the present work is a part,
    1718can be found in a companion publication~\cite{DC2012}.
     
    2223displacement optimisation algorithm.
    2324
    24 The CompCert project is another project aimed at formally verifying a compiler.
    25 Their backend~\cite{Leroy2009} generates assembly code for (amongst others) the
     25The CompCert project is another compiler verification effort.
     26Their backend~\cite{Leroy2009} generates assembly code for (amongst others) subsets of the
    2627PowerPC and x86 (32-bit) architectures. At the assembly code stage, there is
    2728no distinction between the span-dependent jump instructions, so a branch
     
    3031An offshoot of the CompCert project is the CompCertTSO project, which adds thread
    3132concurrency and synchronisation to the CompCert compiler~\cite{TSO2011}. This
    32 compiler also generate assembly code and therefore does not include a branch
     33compiler also generates assembly code and therefore does not include a branch
    3334displacement algorithm.
    3435 
    35 There is also Piton~\cite{Moore1996}, which not only includes the
     36Finally, there is also the Piton stack~\cite{Moore1996}, which not only includes the
    3637formal verification of a compiler, but also of the machine architecture
    37 targeted by that compiler. However, this architecture does not have different
     38targeted by that compiler, a bespoke microprocessor called the FM9001.
     39However, this architecture does not have different
    3840jump sizes (branching is simulated by assigning values to the program counter),
    39 so the branch displacement problem does not occur.
     41so the branch displacement problem is irrelevant.
    4042 
    4143\subsection{Formal development}
  • src/ASM/CPP2012-policy/problem.tex

    r2091 r2096  
    55fact that in many instruction sets, the encoding (and therefore size) of some
    66instructions depends on the distance to their operand (the instruction 'span').
    7 The branch displacement optimisation problem consists in encoding these
     7The branch displacement optimisation problem consists of encoding these
    88span-dependent instructions in such a way that the resulting program is as
    99small as possible.
     
    1111This problem is the subject of the present paper. After introducing the problem
    1212in more detail, we will discuss the solutions used by other compilers, present
    13 the algorithm used by us in the CerCo assembler, and discuss its verification,
    14 that is the proofs of termination and correctness using the Matita theorem
    15 prover~\cite{Asperti2007}.
     13the algorithm we use in the CerCo assembler, and discuss its verification,
     14that is the proofs of termination and correctness using the Matita proof
     15assistant~\cite{Asperti2007}.
    1616 
    1717The research presented in this paper has been executed within the CerCo project
     
    1919target architecture for this project is the MCS-51, whose instruction set
    2020contains span-dependent instructions. Furthermore, its maximum addressable
    21 memory size is very small (65 Kbytes), which makes it important to generate
     21memory size is very small (64 KB), which makes it important to generate
    2222programs that are as small as possible.
    2323
     
    3131In most modern instruction sets that have them, the only span-dependent
    3232instructions are branch instructions. Taking the ubiquitous x86-64 instruction
    33 set as an example, we find that it contains are eleven different forms of the
     33set as an example, we find that it contains eleven different forms of the
    3434unconditional branch instruction, all with different ranges, instruction sizes
    3535and semantics (only six are valid in 64-bit mode, for example). Some examples
    36 are shown in figure~\ref{f:x86jumps}.
     36are shown in Figure~\ref{f:x86jumps}.
    3737
    3838\begin{figure}[h]
     
    5555The chosen target architecture of the CerCo project is the Intel MCS-51, which
    5656features three types of branch instructions (or jump instructions; the two terms
    57 are used interchangeably), as shown in figure~\ref{f:mcs51jumps}.
     57are used interchangeably), as shown in Figure~\ref{f:mcs51jumps}.
    5858
    5959\begin{figure}[h]
     
    7878encoded using three branch instructions (for instructions whose logical
    7979negation is available, it can be done with two branch instructions, but for
    80 some instructions, this opposite is not available); the call instruction is
     80some instructions such a negation is not available); the call instruction is
    8181only available in absolute and long forms.
    8282
    83 Note that even though the MCS-51 architecture is much less advanced and more
    84 simple than the x86-64 architecture, the basic types of branch instruction
     83Note that even though the MCS-51 architecture is much less advanced and simpler
     84than the x86-64 architecture, the basic types of branch instruction
    8585remain the same: a short jump with a limited range, an intra-segment jump and a
    8686jump that can reach the entire available memory.
    8787 
    88 Generally, in the code that is sent to the assembler as input, the only
    89 difference made between branch instructions is by semantics, not by span. This
     88Generally, in code fed to the assembler as input, the only
     89difference between branch instructions is semantics, not span. This
    9090means that a distinction is made between an unconditional branch and the
    9191several kinds of conditional branch, but not between their short, absolute or
     
    9494The algorithm used by the assembler to encode these branch instructions into
    9595the different machine instructions is known as the {\em branch displacement
    96 algorithm}. The optimisation problem consists in using as small an encoding as
     96algorithm}. The optimisation problem consists of finding as small an encoding as
    9797possible, thus minimising program length and execution time.
    9898
     
    102102The canonical solution, as shown by Szymanski~\cite{Szymanski1978} or more
    103103recently by Dickson~\cite{Dickson2008} for the x86 instruction set, is to use a
    104 fixed point algorithm that starts out with the shortest possible encoding (all
    105 branch instruction encoded as short jumps, which is very probably not a correct
     104fixed point algorithm that starts with the shortest possible encoding (all
     105branch instructions encoded as short jumps, which is likely not a correct
    106106solution) and then iterates over the program to re-encode those branch
    107107instructions whose target is outside their range.
     
    120120situation where the span of $j$ is so small that it can be encoded as a short
    121121jump. This argument continues to hold throughout the subsequent iterations of
    122 the algorithm: short jumps can change into long jumps, but not vice versa
    123 (spans only increase). Hence, the algorithm either terminates when a fixed
     122the algorithm: short jumps can change into long jumps, but not \emph{vice versa},
     123as spans only increase. Hence, the algorithm either terminates early when a fixed
    124124point is reached or when all short jumps have been changed into long jumps.
    125125
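The canonical least-fixed-point algorithm with only short and long jumps can be sketched as follows. This is an illustrative reconstruction under invented sizes and ranges, not the CerCo code: each iteration recomputes addresses and grows any short jump whose target is out of range. Since encodings only grow from short to long and there are finitely many jumps, the loop terminates.

```python
# Sketch of the canonical least-fixed-point algorithm (short/long jumps only).
# Instructions are (kind, target_index) pairs; sizes and ranges are invented.
SIZES = {"short": 2, "long": 3, "other": 1}
SHORT_RANGE = 127  # assumed maximum displacement of a short jump

def expand(program):
    # Start with the shortest possible (probably incorrect) encoding.
    enc = ["short" if kind == "jump" else "other" for kind, _ in program]
    while True:
        # Recompute addresses under the current encodings.
        addr, pc = [], 0
        for i, _ in enumerate(program):
            addr.append(pc)
            pc += SIZES[enc[i]]
        changed = False
        for i, (kind, target) in enumerate(program):
            if kind == "jump" and enc[i] == "short":
                if abs(addr[target] - addr[i]) > SHORT_RANGE:
                    enc[i] = "long"  # spans only increase: no going back
                    changed = True
        if not changed:
            return enc
```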
     
    128128
    129129However, neither of these claims (termination nor optimality) hold when we add
    130 the absolute jump.
    131 
    132 The reason for this is that with absolute jumps, the encoding of a branch
     130the absolute jump, as with absolute jumps, the encoding of a branch
    133131instruction no longer depends only on the distance between the branch
    134132instruction and its target: in order for an absolute jump to be possible, they
     
    139137long if this is not the case).
    140138
    141 This invalidates the termination argument: a branch instruction, once encoded
     139\begin{figure}[ht]
     140\begin{alltt}
     141    jmp X
     142    \vdots
     143L\(\sb{0}\):
     144    \vdots
     145    jmp L\(\sb{0}\)
     146\end{alltt}
     147\caption{Example of a program where a long jump becomes absolute}
     148\label{f:term_example}
     149\end{figure}
     150
     151This invalidates our earlier termination argument: a branch instruction, once encoded
    142152as a long jump, can be re-encoded during a later iteration as an absolute jump.
    143 Consider the program shown in figure~\ref{f:term_example}. At the start of the
     153Consider the program shown in Figure~\ref{f:term_example}. At the start of the
    144154first iteration, both the branch to {\tt X} and the branch to $\mathtt{L}_{0}$
    145155are encoded as small jumps. Let us assume that in this case, the placement of
     
    156166Hence, in the third iteration, the branch to $\mathtt{L}_{0}$ can be encoded as
    157167an absolute jump. At first glance, there is nothing that prevents us from
    158 making a construction where two branch instructions interact in such a way as
    159 to keep switching between long and absolute encodings for an indefinite amount
    160 of iterations.
    161 
    162 \begin{figure}[h]
    163 \begin{alltt}
    164     jmp X
    165     \vdots
    166 L\(\sb{0}\):
    167     \vdots
    168     jmp L\(\sb{0}\)
    169 \end{alltt}
    170 \caption{Example of a program where a long jump becomes absolute}
    171 \label{f:term_example}
    172 \end{figure}
    173 
    174 In fact, this situation mirrors the explanation by
    175 Szymanski~\cite{Szymanski1978} of why the branch displacement optimisation
    176 problem is NP-complete. In this explanation, a condition for NP-completeness
    177 is the fact that programs be allowed to contain {\em pathological} jumps.
    178 These are branch instructions that can normally not be encoded as a short(er)
    179 jump, but gain this property when some other branch instructions are encoded as
    180 a long(er) jump. This is exactly what happens in figure~\ref{f:term_example}:
    181 by encoding the first branch instruction as a long jump, another branch
    182 instructions switches from long to absolute (which is shorter).
    183 
    184 The optimality argument no longer holds either. Let us consider the program
    185 shown in figure~\ref{f:opt_example}. Suppose that the distance between
     168constructing a configuration where two branch instructions interact in such a
      169way as to oscillate indefinitely between long and absolute encodings.
     170
     171This situation mirrors the explanation by Szymanski~\cite{Szymanski1978} of why
     172the branch displacement optimisation problem is NP-complete. In this explanation,
     173a condition for NP-completeness is the fact that programs be allowed to contain
     174{\em pathological} jumps. These are branch instructions that can normally not be
     175encoded as a short(er) jump, but gain this property when some other branch
     176instructions are encoded as a long(er) jump. This is exactly what happens in
     177Figure~\ref{f:term_example}. By encoding the first branch instruction as a long
     178jump, another branch instruction switches from long to absolute (which is
     179shorter).
     180
     181In addition, our previous optimality argument no longer holds. Consider the program
     182shown in Figure~\ref{f:opt_example}. Suppose that the distance between
    186183$\mathtt{L}_{0}$ and $\mathtt{L}_{1}$ is such that if {\tt jmp X} is encoded
    187184as a short jump, there is a segment border just after $\mathtt{L}_{1}$. Let
     
    189186segment, but far enough away from $\mathtt{L}_{1}$ that they cannot be encoded
    190187as short jumps.
     188
     189\begin{figure}[ht]
     190\begin{alltt}
     191L\(\sb{0}\): jmp X
     192X:
     193    \vdots
     194L\(\sb{1}\):
     195    \vdots
     196    jmp L\(\sb{1}\)
     197    \vdots
     198    jmp L\(\sb{1}\)
     199    \vdots
     200    jmp L\(\sb{1}\)
     201    \vdots
     202\end{alltt}
     203\caption{Example of a program where the fixed-point algorithm is not optimal}
     204\label{f:opt_example}
     205\end{figure}
    191206
    192207Then, if {\tt jmp X} were to be encoded as a short jump, which is clearly
     
    198213this solution might actually be smaller than the one reached by the smallest
    199214fixed point algorithm.
    200 
    201 \begin{figure}[h]
    202 \begin{alltt}
    203 L\(\sb{0}\): jmp X
    204 X:
    205     \vdots
    206 L\(\sb{1}\):
    207     \vdots
    208     jmp L\(\sb{1}\)
    209     \vdots
    210     jmp L\(\sb{1}\)
    211     \vdots
    212     jmp L\(\sb{1}\)
    213     \vdots
    214 \end{alltt}
    215 \caption{Example of a program where the fixed-point algorithm is not optimal}
    216 \label{f:opt_example}
    217 \end{figure}
  • src/ASM/CPP2012-policy/proof.tex

    r2091 r2096  
    11\section{The proof}
    22
    3 In this section, we will present the correctness proof of the algorithm in more
    4 detail.
    5 
    6 The main correctness statement is as follows:
     3In this section, we present the correctness proof for the algorithm in more
      4detail. The main correctness statement is as follows (slightly simplified here):
    75
    86\clearpage
    97
    108\begin{lstlisting}
    11 definition sigma_policy_specification :=:
    12 $\lambda$program: pseudo_assembly_program.
    13 $\lambda$sigma: Word → Word.
    14 $\lambda$policy: Word → bool.
    15  sigma (zero $\ldots$) = zero $\ldots$ $\wedge$
    16  $\forall$ppc: Word.$\forall$ppc_ok.
    17  let instr_list := \snd program in
    18  let pc ≝ sigma ppc in
    19  let labels := \fst (create_label_cost_map (\snd program)) in
    20  let lookup_labels :=
    21   $\lambda$x.bitvector_of_nat ? (lookup_def ?? labels x 0) in
    22  let instruction :=
    23   \fst (fetch_pseudo_instruction (\snd program) ppc ppc_ok) in
    24  let next_pc := \fst (sigma (add ? ppc (bitvector_of_nat ? 1))) in
    25   (nat_of_bitvector $\ldots$ ppc ≤ |instr_list| →
    26    next_pc = add ? pc (bitvector_of_nat $\ldots$
    27    (instruction_size lookup_labels sigma policy ppc instruction)))
    28   $\wedge$     
    29   ((nat_of_bitvector $\ldots$ ppc < |instr_list| →
    30    nat_of_bitvector $\ldots$ pc < nat_of_bitvector $\ldots$ next_pc)
    31   $\vee$ (nat_of_bitvector $\ldots$ ppc = |instr_list| → next_pc = (zero $\ldots$))).
     9definition sigma_policy_specification :=
     10 $\lambda$program: pseudo_assembly_program.
     11 $\lambda$sigma: Word $\rightarrow$ Word.
     12 $\lambda$policy: Word $\rightarrow$ bool.
     13  sigma (zero $\ldots$) = zero $\ldots$ $\wedge$
     14  $\forall$ppc: Word.$\forall$ppc_ok.
     15  let $\langle$preamble, instr_list$\rangle$ := program in
     16  let pc := sigma ppc in
     17  let instruction :=
     18   \fst (fetch_pseudo_instruction instr_list ppc ppc_ok) in
     19  let next_pc := \fst (sigma (add ? ppc (bitvector_of_nat ? 1))) in
     20   (nat_of_bitvector $\ldots$ ppc $\leq$ |instr_list| $\rightarrow$
     21    next_pc = add ? pc (bitvector_of_nat $\ldots$
     22    (instruction_size $\ldots$ sigma policy ppc instruction)))
     23   $\wedge$     
     24   ((nat_of_bitvector $\ldots$ ppc < |instr_list| $\rightarrow$
     25    nat_of_bitvector $\ldots$ pc < nat_of_bitvector $\ldots$ next_pc)
     26   $\vee$ (nat_of_bitvector $\ldots$ ppc = |instr_list| $\rightarrow$ next_pc = (zero $\ldots$))).
    3227\end{lstlisting}
    3328
    3429Informally, this means that when fetching a pseudo-instruction at $ppc$, the
    3530translation by $\sigma$ of $ppc+1$ is the same as $\sigma(ppc)$ plus the size
    36 of the instruction at $ppc$; i.e. an instruction is placed immediately
     31of the instruction at $ppc$.  That is, an instruction is placed consecutively
    3732after the previous one, and there are no overlaps.
    3833
    39 The other condition enforced is the fact that instructions are stocked in
      34Instructions are also stored in
    4035order: the memory address of the instruction at $ppc$ should be smaller than
    4136the memory address of the instruction at $ppc+1$. There is one exception to
     
    4439to the amount of memory).
    4540
    46 And finally, we enforce that the program starts at address 0, i.e.
    47 $\sigma(0) = 0$.
     41Finally, we enforce that the program starts at address 0, i.e. $\sigma(0) = 0$.
    4842
    4943Since our computation is a least fixed point computation, we must prove
     
    5145a number of steps without reaching a fixed point, the solution is not
    5246guaranteed to be correct. More specifically, branch instructions might be
    53 encoded who do not coincide with the span between their location and their
      47given an encoding that does not coincide with the span between their location and their
    5448destination.
    5549
     
    6256long, there can be at most $2n$ changes.
    6357
    64 This proof has been executed in the ``Russell'' style from~\cite{Sozeau2006}.
     58The proof has been carried out using the ``Russell'' style from~\cite{Sozeau2006}.
    6559We have proven some invariants of the {\sc f} function from the previous
    6660section; these invariants are then used to prove properties that hold for every
     
    7468
    7569Note that during the fixed point computation, the $\sigma$ function is
    76 implemented as a trie for ease access; computing $\sigma(x)$ is done by looking
     70implemented as a trie for ease of access; computing $\sigma(x)$ is achieved by looking
    7771up the value of $x$ in the trie. Actually, during the fold, the value we
    78 pass along is a couple $\mathbb{N} \times \mathtt{ppc_pc_map}$. The natural
    79 number is the number of bytes added to the program so far with respect to
    80 the previous iteration, and {\tt ppc\_pc\_map} is a couple of the current
    81 size of the program and our $\sigma$ function.
     72pass along is a pair $\mathbb{N} \times \mathtt{ppc_pc_map}$. The first component
     73is the number of bytes added to the program so far with respect to
     74the previous iteration, and the second component, {\tt ppc\_pc\_map}, is a pair
     75consisting of the current size of the program and our $\sigma$ function.
    8276
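The pair threaded through the fold can be modelled concretely. The following is an illustrative Python sketch, not the Matita development: the bitvector trie is stood in for by a dict from pseudo-addresses to (address, jump length) pairs, and all helper names are invented. It shows the shape of the state (bytes added so far, and a pair of the current program size and $\sigma$), with the jump length stored at $ppc$ and the next instruction's address at $ppc+1$.

```python
# Minimal model of the fold state: (bytes added this iteration,
# (current program size, sigma-as-dict)). Names invented for the sketch.
def initial_state():
    return (0, (0, {0: (0, "short_jump")}))

def lookup(sigma, ppc):
    # Mirrors the defaulting lookup: absent entries act as (0, short_jump).
    return sigma.get(ppc, (0, "short_jump"))

def record(state, ppc, jump_length, instr_size, size_increase):
    """Store the jump length at ppc and the next address at ppc + 1."""
    added, (program_size, sigma) = state
    pc, _ = lookup(sigma, ppc)
    sigma[ppc] = (pc, jump_length)
    sigma[ppc + 1] = (pc + instr_size, "short_jump")
    return (added + size_increase, (pc + instr_size, sigma))
```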
    8377\begin{lstlisting}
    8478definition out_of_program_none :=
    8579 $\lambda$prefix:list labelled_instruction.$\lambda$sigma:ppc_pc_map.
    86  $\forall$i.i < 2^16 (i > |prefix| $\leftrightarrow$
     80 $\forall$i.i < 2^16 $\rightarrow$ (i > |prefix| $\leftrightarrow$
    8781  bvt_lookup_opt $\ldots$ (bitvector_of_nat ? i) (\snd sigma) = None ?).
    8882\end{lstlisting}
    8983
    90 This invariant states that every pseudo-address not yet treated cannot be
    91 found in the lookup trie.
    92 
    93 \begin{lstlisting}
    94 definition not_jump_default
     84This invariant states that any pseudo-address not yet examined is not
     85present in the lookup trie.
     86
     87\begin{lstlisting}
     88definition not_jump_default :=
    9589 $\lambda$prefix:list labelled_instruction.$\lambda$sigma:ppc_pc_map.
    96  $\forall$i.i < |prefix|
    97   ¬is_jump (\snd (nth i ? prefix $\langle$None ?, Comment []$\rangle$))
     90 $\forall$i.i < |prefix| $\rightarrow$
     91  ¬is_jump (\snd (nth i ? prefix $\langle$None ?, Comment []$\rangle$)) $\rightarrow$
    9892  \snd (bvt_lookup $\ldots$ (bitvector_of_nat ? i) (\snd sigma)
    9993   $\langle$0,short_jump$\rangle$) = short_jump.
     
    106100\begin{lstlisting}
    107101definition jump_increase :=
    108  λprefix:list labelled_instruction.λop:ppc_pc_map.λp:ppc_pc_map.
    109  ∀i.i ≤ |prefix| →
     102 $\lambda$prefix:list labelled_instruction.$\lambda$op:ppc_pc_map.$\lambda$p:ppc_pc_map.
     103 $\forall$i.i $\leq$ |prefix| $\rightarrow$
    110104 let $\langle$opc,oj$\rangle$ :=
    111105  bvt_lookup $\ldots$ (bitvector_of_nat ? i) (\snd op) $\langle$0,short_jump$\rangle$ in
     
    115109\end{lstlisting}
    116110
    117 This invariant states that between iterations ($op$ being the previous
     111This invariant states that between iterations (with $op$ being the previous
    118112iteration, and $p$ the current one), jump lengths either remain equal or
    119113increase. It is needed for proving termination.
     
\begin{lstlisting}
definition sigma_compact_unsafe :=
 $\lambda$program:list labelled_instruction.$\lambda$labels:label_map.$\lambda$sigma:ppc_pc_map.
 $\forall$n.n < |program| $\rightarrow$
  match bvt_lookup_opt $\ldots$ (bitvector_of_nat ? n) (\snd sigma) with
  [ None $\Rightarrow$ False
  | Some x $\Rightarrow$ let $\langle$pc,j$\rangle$ := x in
    match bvt_lookup_opt $\ldots$ (bitvector_of_nat ? (S n)) (\snd sigma) with
    [ None $\Rightarrow$ False
    | Some x1 $\Rightarrow$ let $\langle$pc1,j1$\rangle$ := x1 in
       pc1 = pc + instruction_size_jmplen j
        (\snd (nth n ? program $\langle$None ?, Comment []$\rangle$)))

This is a temporary formulation of the main property
({\tt sigma\_policy\_specification}); its main difference
from the final version is that it uses {\tt instruction\_size\_jmplen} to
compute the instruction size. This function uses $j$ to compute the span
of branch instructions (i.e. it uses the $\sigma$ function under construction),
     
\begin{lstlisting}
definition sigma_safe :=
 $\lambda$prefix:list labelled_instruction.$\lambda$labels:label_map.$\lambda$added:$\mathbb{N}$.
 $\lambda$old_sigma:ppc_pc_map.$\lambda$sigma:ppc_pc_map.
 $\forall$i.i < |prefix| $\rightarrow$ let $\langle$pc,j$\rangle$ :=
  bvt_lookup $\ldots$ (bitvector_of_nat ? i) (\snd sigma) $\langle$0,short_jump$\rangle$ in
  let pc_plus_jmp_length := bitvector_of_nat ? (\fst (bvt_lookup $\ldots$
     
Note that we compute the distance using the memory address of the instruction
plus its size: this follows the behaviour of the MCS-51 microprocessor, which
increases the program counter directly after fetching, and only then executes
the branch instruction (by changing the program counter again).
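Concretely (a hedged Python sketch, with function and parameter names of our own invention), the reachability test for a short jump therefore computes the displacement relative to the address of the \emph{next} instruction:

```python
# Sketch: can a branch at address `pc`, whose encoding occupies `size`
# bytes, reach `dest` as a short jump? The MCS-51 applies the signed
# 8-bit offset to the already-incremented program counter, i.e. to
# pc + size rather than to pc itself.
def short_jump_reachable(pc: int, size: int, dest: int) -> bool:
    displacement = dest - (pc + size)
    return -128 <= displacement <= 127
```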
     
\begin{lstlisting}
(added = 0 $\rightarrow$ policy_pc_equal prefix old_sigma policy))
(policy_jump_equal prefix old_sigma policy $\rightarrow$ added = 0))
\end{lstlisting}

And finally, two properties that deal with what happens when the previous
iteration does not change with respect to the current one. $added$ is a
variable that keeps track of the number of bytes we have added to the program
size by changing the encoding of branch instructions. If $added$ is 0, the program
has not changed and vice versa.

We need to use two different formulations, because the fact that $added$ is 0
does not guarantee that no branch instructions have changed. For instance,
it is possible that we have replaced a short jump with an absolute jump, which
does not change the size of the branch instruction.
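For example (illustrative figures: on the MCS-51, {\tt SJMP} and {\tt AJMP} both occupy two bytes, while {\tt LJMP} occupies three), a short-to-absolute re-encoding leaves $added$ at 0:

```python
# Byte sizes of the three MCS-51 jump encodings: SJMP (2 bytes, 8-bit
# relative), AJMP (2 bytes, absolute within a 2 Kb page) and LJMP
# (3 bytes, full 16-bit address).
JUMP_SIZE = {"short_jump": 2, "absolute_jump": 2, "long_jump": 3}

# Re-encoding a short jump as an absolute jump changes the policy but
# not the program size, so `added` remains 0 even though the encoding
# of a branch instruction changed.
added = JUMP_SIZE["absolute_jump"] - JUMP_SIZE["short_jump"]
```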

Therefore {\tt policy\_pc\_equal} states that $old\_sigma_1(x) = sigma_1(x)$,
     
difference between these invariants and the fold invariants is that after the
completion of the fold, we check whether the program size does not exceed
64 Kb, the maximum memory size the MCS-51 can address.

The type of an iteration therefore becomes an option type: {\tt None} in case
the program becomes larger than 64 Kb, or $\mathtt{Some}\ \sigma$
otherwise. We also no longer use a natural number to pass along the number of
bytes added to the program size, but a boolean that indicates whether we have
     
\begin{lstlisting}
definition nec_plus_ultra :=
 $\lambda$program:list labelled_instruction.$\lambda$p:ppc_pc_map.
 ¬($\forall$i.i < |program| $\rightarrow$
  is_jump (\snd (nth i ? program $\langle$None ?, Comment []$\rangle$)) $\rightarrow$
  \snd (bvt_lookup $\ldots$ (bitvector_of_nat 16 i) (\snd p) $\langle$0,short_jump$\rangle$) =
   long_jump).

This invariant is applied to $old\_sigma$; if our program becomes too large
for memory, the previous iteration cannot have every branch instruction encoded
as a long jump. This is needed later in the proof of termination.

If the iteration returns $\mathtt{Some}\ \sigma$, the invariants
     
\begin{lstlisting}
definition sigma_compact :=
 $\lambda$program:list labelled_instruction.$\lambda$labels:label_map.$\lambda$sigma:ppc_pc_map.
 $\forall$n.n < |program| $\rightarrow$
  match bvt_lookup_opt $\ldots$ (bitvector_of_nat ? n) (\snd sigma) with
  [ None $\Rightarrow$ False
  | Some x $\Rightarrow$ let $\langle$pc,j$\rangle$ := x in
    match bvt_lookup_opt $\ldots$ (bitvector_of_nat ? (S n)) (\snd sigma) with
    [ None $\Rightarrow$ False
    | Some x1 $\Rightarrow$ let $\langle$pc1,j1$\rangle$ := x1 in
      pc1 = pc + instruction_size
       ($\lambda$id.bitvector_of_nat ? (lookup_def ?? labels id 0))
       ($\lambda$ppc.bitvector_of_nat ?
        (\fst (bvt_lookup $\ldots$ ppc (\snd sigma) $\langle$0,short_jump$\rangle$)))
       ($\lambda$ppc.jmpeqb long_jump (\snd (bvt_lookup $\ldots$ ppc
        (\snd sigma) $\langle$0,short_jump$\rangle$))) (bitvector_of_nat ? n)
       (\snd (nth n ? program $\langle$None ?, Comment []$\rangle$))

\end{lstlisting}

This is almost the same invariant as ${\tt sigma\_compact\_unsafe}$, but differs in that it
computes the sizes of branch instructions by looking at the distance between
position and destination using $\sigma$.
     
The proof of {\tt nec\_plus\_ultra} works as follows: if we return {\tt None},
then the program size must be greater than 64 Kb. However, since the
previous iteration did not return {\tt None} (because otherwise we would
terminate immediately), the program size in the previous iteration must have
been smaller than 64 Kb.

Suppose that all the branch instructions in the previous iteration are
encoded as long jumps. This means that all branch instructions in this
iteration are long jumps as well, and therefore that both iterations are equal
in the encoding of their branch instructions. Per the invariant, this means that

But if all addresses are equal, the program sizes must be equal too, which
means that the program size in the current iteration must be smaller than
64 Kb. This contradicts the earlier hypothesis, hence not all branch
instructions in the previous iteration are encoded as long jumps.

These are the invariants that hold after $2n$ iterations, where $n$ is the
program size (we use the program size for convenience; we could also use the
number of branch instructions, but this is more complex). Here, we only
need {\tt out\_of\_program\_none}, {\tt sigma\_compact} and the fact that
$\sigma(0) = 0$.
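The overall shape of the algorithm and its termination bound can be sketched as follows (a simplified Python model with hypothetical names, not the verified Matita code): each of the $n$ instructions can grow at most twice (short to absolute to long), so a fixed point is reached within $2n$ passes.

```python
# Jump lengths only ever grow, and "long" is a fixed point of growth.
GROW = {"short": "absolute", "absolute": "long", "long": "long"}

def fix_jump_lengths(needs_growth, lengths):
    """Iterate growth passes until iteration k equals iteration k+1.
    `needs_growth(i, lengths)` is a stand-in oracle deciding whether
    instruction i's current encoding cannot reach its target."""
    n = len(lengths)
    for _ in range(2 * n + 1):  # 2n passes suffice: each jump grows at most twice
        new = [GROW[l] if needs_growth(i, lengths) else l
               for i, l in enumerate(lengths)]
        if new == lengths:      # fixed point reached
            return lengths
        lengths = new
    raise AssertionError("unreachable by the 2n termination bound")
```

Note that the bound relies on monotonicity: a (monotone) oracle can only push a jump upward through the two growth steps, so after $2n$ passes nothing can change any more.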

Termination can now be proved using the fact that there is a $k \leq 2n$, with
$n$ the length of the program, such that iteration $k$ is equal to iteration
$k+1$. There are two possibilities: either there is a $k < 2n$ such that this