Changeset 3393
 Timestamp:
 Oct 11, 2013, 4:39:32 PM (7 years ago)
 Location:
 src/ASM/TACAS2013policy
 Files:

 5 edited
Legend:
 Unmodified
 Added
 Removed

src/ASM/TACAS2013policy/algorithm.tex
r3370 r3393 13 13 execution time. 14 14 15 On the other hand, the {\tt gcc} compiler suite ~\cite{GCC2012}, while compiling15 On the other hand, the {\tt gcc} compiler suite, while compiling 16 16 C on the x86 architecture, uses a greatest fix point algorithm. In other words, 17 17 it starts with all branch instructions encoded as the largest jumps … … 41 41 When we add absolute jumps, however, it is theoretically possible for a branch 42 42 instruction to switch from absolute to long and back, as previously explained. 43 44 43 Proving termination then becomes difficult, because there is nothing that 45 44 precludes a branch instruction from oscillating back and forth between absolute … … 72 71 return a smaller or equal solution. 73 72 73 Experiments on the gcc 2.3.3 test suite of C programs have shown that on 74 average, about 25 percent of jumps are encoded as short or absolute jumps by the algorithm. As not all instructions are jumps, this does not make for a large reduction in size, but it can make for a reduction in execution time: if jumps 75 are executed multiple times, for example in loops, the fact that short jumps take less cycles to execute than long jumps can have great effect. 76 74 77 As for complexity, there are at most $2n$ iterations, with $n$ the number of 75 78 branch instructions. Practical tests within the CerCo project on small to 76 79 medium pieces of code have shown that in almost all cases, a fixed point is 77 reached in 3 passes. Only in one case did the algorithm need 4. 78 79 This is not surprising: after all, the difference between short/absolute and 80 reached in 3 passes. Only in one case did the algorithm need 4. This is not surprising: after all, the difference between short/absolute and 80 81 long jumps is only one byte (three for conditional jumps). For a change from 81 82 short/absolute to long to have an effect on other jumps is therefore relatively … … 87 88 pseudocode to assembler. More specifically, it is used by the function that 88 89 translates pseudoaddresses (natural numbers indicating the position of the 89 instruction in the program) to actual addresses in memory. 90 instruction in the program) to actual addresses in memory. Note that in pseudocode, all instructions are of size 1. 90 91 91 92 Our original intention was to have two different functions, one function … … 94 95 intended encoding, and a function $\sigma: \mathbb{N} \rightarrow 95 96 \mathtt{Word}$ to associate pseudoaddresses to machine addresses. $\sigma$ 96 would use $\mathtt{policy}$ to determine the size of jump instructions. 97 98 This turned out to be suboptimal from the algorithmic point of view and 97 would use $\mathtt{policy}$ to determine the size of jump instructions. This turned out to be suboptimal from the algorithmic point of view and 99 98 impossible to prove correct. 100 99 … … 129 128 130 129 \begin{figure}[t] 130 \small 131 131 \begin{algorithmic} 132 132 \Function{f}{$labels$,$old\_sigma$,$instr$,$ppc$,$acc$} … … 136 136 \ElsIf {$instr$ is a forward jump to $j$} 137 137 \State $length \gets \mathrm{jump\_size}(pc,old\_sigma_1(labels(j))+added)$ 138 \Else139 \State $length \gets \mathtt{short\_jump}$140 138 \EndIf 141 139 \State $old\_length \gets \mathrm{old\_sigma_1}(ppc)$ … … 157 155 function {\sc f} over the entire program, thus gradually constructing $sigma$. 158 156 This constitutes one step in the fixed point calculation; successive steps 159 repeat the fold until a fixed point is reached. 157 repeat the fold until a fixed point is reached. We have abstracted away the case where an instruction is not a jump, since the size of these instructions is constant. 160 158 161 159 Parameters of the function {\sc f} are: … … 171 169 $\sigma$ function under construction. 172 170 \end{itemize} 173 174 171 The first two are parameters that remain the same through one iteration, the 175 172 final three are standard parameters for a fold function (including $ppc$, … … 193 190 194 191 We cannot use $old\_sigma$ without change: it might be the case that we have 195 already increased the size of some branch instructions before, making the program 196 longer and moving every instruction forward. We must compensate for this by 197 adding the size increase of the program to the label's memory address according 198 to $old\_sigma$, so that branch instruction spans do not get compromised. 199 200 Note also that we add the pc to $sigma$ at location $ppc+1$, whereas we add the 201 jump length at location $ppc$. We do this so that $sigma(ppc)$ will always 202 return a pair with the start address of the instruction at $ppc$ and the 203 length of its branch instruction (if any); the end address of the program can 204 be found at $sigma(n+1)$, where $n$ is the number of instructions in the 205 program. 192 already increased the size of some branch instructions before, making the 193 program longer and moving every instruction forward. We must compensate for this 194 by adding the size increase of the program to the label's memory address 195 according to $old\_sigma$, so that branch instruction spans do not get 196 compromised. 197 198 %Note also that we add the pc to $sigma$ at location $ppc+1$, whereas we add the 199 %jump length at location $ppc$. We do this so that $sigma(ppc)$ will always 200 %return a pair with the start address of the instruction at $ppc$ and the 201 %length of its branch instruction (if any); the end address of the program can 202 %be found at $sigma(n+1)$, where $n$ is the number of instructions in the 203 %program. 
src/ASM/TACAS2013policy/biblio.bib
r3370 r3393 61 61 year = {2011}, 62 62 key = {SDCC2011} 63 }64 65 @misc{GCC2012,66 title = {GNU Compiler Collection 4.7.0},67 howpublished = {\url{http://gcc.gnu.org/}},68 year = {2012},69 key = {GCC2012}70 63 } 71 64 
src/ASM/TACAS2013policy/conclusion.tex
r3370 r3393 62 62 surely also interesting to formally prove that the assembler never rejects 63 63 programs that should be accepted, i.e. that the algorithm itself is correct. 64 This was the topic of the current paper.64 This is the topic of the current paper. 65 65 66 66 \subsection{Related work} … … 75 75 displacement optimisation algorithm is not needed. 76 76 77 An offshoot of the CompCert project is the CompCertTSO project, which adds78 thread concurrency and synchronisation to the CompCert compiler~\cite{TSO2011}.79 This compiler also generates assembly code and therefore does not include a80 branch displacement algorithm.77 %An offshoot of the CompCert project is the CompCertTSO project, which adds 78 %thread concurrency and synchronisation to the CompCert compiler~\cite{TSO2011}. 79 %This compiler also generates assembly code and therefore does not include a 80 %branch displacement algorithm. 81 81 82 Finally, there is also the Piton stack~\cite{Moore1996}, which not only includes the83 formal verification of a compiler, but also of the machine architecture84 targeted by that compiler, a microprocessor called the FM9001.85 However, this architecture does not have different86 jump sizes (branching is simulated by assigning values to the program counter),87 so the branch displacement problem is irrelevant.82 %Finally, there is also the Piton stack~\cite{Moore1996}, which not only includes the 83 %formal verification of a compiler, but also of the machine architecture 84 %targeted by that compiler, a microprocessor called the FM9001. 85 %However, this architecture does not have different 86 %jump sizes (branching is simulated by assigning values to the program counter), 87 %so the branch displacement problem is irrelevant. 88 88 89 \subsection{Formal development}90 91 All Matita files related to this development can be found on the CerCo92 website, \url{http://cerco.cs.unibo.it}. The specific part that contains the93 branch displacement algorithm is in the {\tt ASM} subdirectory, in the files94 {\tt PolicyFront.ma}, {\tt PolicyStep.ma} and {\tt Policy.ma}. 
src/ASM/TACAS2013policy/problem.tex
r3364 r3393 27 27 contains spandependent instructions. Furthermore, its maximum addressable 28 28 memory size is very small (64 Kb), which makes it important to generate 29 programs that are as small as possible. 30 31 With this optimisation, however, comes increased complexity and hence 32 increased possibility for error. We must make sure that the branch instructions 33 are encoded correctly, otherwise the assembled program will behave 34 unpredictably.29 programs that are as small as possible. With this optimisation, however, comes increased complexity and hence increased possibility for error. We must make sure that the branch instructions are encoded correctly, otherwise the assembled program will behave unpredictably. 30 31 All Matita files related to this development can be found on the CerCo 32 website, \url{http://cerco.cs.unibo.it}. The specific part that contains the 33 branch displacement algorithm is in the {\tt ASM} subdirectory, in the files 34 {\tt PolicyFront.ma}, {\tt PolicyStep.ma} and {\tt Policy.ma}. 35 35 36 36 \section{The branch displacement optimisation problem} … … 123 123 Here, termination of the smallest fixed point algorithm is easy to prove. All 124 124 branch instructions start out encoded as short jumps, which means that the 125 distance between any branch instruction and its target is as short as possible. 125 distance between any branch instruction and its target is as short as possible 126 (all the intervening jumps are short). 126 127 If, in this situation, there is a branch instruction $b$ whose span is not 127 128 within the range for a short jump, we can be sure that we can never reach a … … 147 148 \begin{figure}[t] 148 149 \begin{subfigure}[b]{.45\linewidth} 150 \small 149 151 \begin{alltt} 150 152 jmp X … … 161 163 \hfill 162 164 \begin{subfigure}[b]{.45\linewidth} 165 \small 163 166 \begin{alltt} 164 167 L\(\sb{0}\): jmp X 
src/ASM/TACAS2013policy/proof.tex
r3370 r3393 3 3 In this section, we present the correctness proof for the algorithm in more 4 4 detail. The main correctness statement is shown, slightly simplified, in~Figure~\ref{statement}. 5 6 \label{sigmapolspec} 5 % 7 6 \begin{figure}[t] 7 \small 8 8 \begin{alignat*}{6} 9 9 \mathtt{sigma}&\omit\rlap{$\mathtt{\_policy\_specification} \equiv 10 \lambda program.\lambda sigma. \lambda policy.$} \notag\\10 \lambda program.\lambda sigma.$} \notag\\ 11 11 & \omit\rlap{$sigma\ 0 = 0\ \wedge$} \notag\\ 12 12 & \mathbf{let}\ & & \omit\rlap{$instr\_list \equiv code\ program\ \mathbf{in}$} \notag\\ … … 15 15 &&& \mathbf{let}\ && instruction \equiv \mathtt{fetch\_pseudo\_instruction}\ instr\_list\ ppc\ \mathbf{in} \notag\\ 16 16 &&& \mathbf{let}\ && next\_pc \equiv sigma\ (ppc+1)\ \mathbf{in}\notag\\ 17 &&&&& next\_pc = pc + \mathtt{instruction\_size}\ sigma\ p olicy\ ppc\ instruction\ \wedge\notag\\18 &&&&& (pc + \mathtt{instruction\_size}\ sigma\ p olicy\ ppc\ instruction < 2^{16}\ \vee\notag\\17 &&&&& next\_pc = pc + \mathtt{instruction\_size}\ sigma\ ppc\ instruction\ \wedge\notag\\ 18 &&&&& (pc + \mathtt{instruction\_size}\ sigma\ ppc\ instruction < 2^{16}\ \vee\notag\\ 19 19 &&&&& (\forall ppc'.ppc' < instr\_list \rightarrow ppc < ppc' \rightarrow \notag\\ 20 20 &&&&& \mathbf{let}\ instruction' \equiv \mathtt{fetch\_pseudo\_instruction}\ instr\_list\ ppc'\ ppc\_ok'\ \mathbf{in} \notag\\ 21 &&&&&\ \mathtt{instruction\_size}\ sigma\ p olicy\ ppc'\ instruction' = 0)\ \wedge \notag\\22 &&&&& pc + \mathtt{instruction\_size}\ sigma\ p olicy\ ppc\ instruction = 2^{16})21 &&&&&\ \mathtt{instruction\_size}\ sigma\ ppc'\ instruction' = 0)\ \wedge \notag\\ 22 &&&&& pc + \mathtt{instruction\_size}\ sigma\ ppc\ instruction = 2^{16}) 23 23 \end{alignat*} 24 24 \caption{Main correctness statement\label{statement}} 25 \label{sigmapolspec} 25 26 \end{figure} 26 27 % 27 28 Informally, this means that when fetching a pseudoinstruction at $ppc$, the 28 29 translation by $\sigma$ of $ppc+1$ is the same as $\sigma(ppc)$ plus the size 29 30 of the instruction at $ppc$. That is, an instruction is placed consecutively 30 after the previous one, and there are no overlaps. 31 32 Instructions are also stocked in 33 order: the memory address of the instruction at $ppc$ should be smaller than 34 the memory address of the instruction at $ppc+1$. There is one exception to 35 this rule: the instruction at the very end of the program, whose successor 36 address can be zero (this is the case where the program size is exactly equal 37 to the amount of memory). 38 39 Finally, we enforce that the program starts at address 0, i.e. $\sigma(0) = 0$. 31 after the previous one, and there are no overlaps. The rest of the statement deals with memory size: either the next instruction fits within memory ($next\_pc < 2^{16}$) or it ends exactly at the limit memory, 32 in which case it must be the last translated instruction in the program (enforced by specfiying that the size of all subsequent instructions is 0: there may be comments or cost annotations that are not translated). 33 34 Finally, we enforce that the program starts at address 0, i.e. $\sigma(0) = 0$. It may seem strange that we do not explicitly include a safety property stating that every jump instruction is of the right type with respect to its target (akin to the lemma from Figure~\ref{sigmasafe}), but this is not necessary. The distance is recalculated according to the instruction addresses from $\sigma$, which implicitly expresses safety. 40 35 41 36 Since our computation is a least fixed point computation, we must prove … … 54 49 can be at most $2n$ changes. 55 50 56 The proof has been carried out using the ``Russell'' style from~\cite{Sozeau2006}.57 We have proven some invariants of the {\sc f} function from the previous58 section; these invariants are then used to prove properties that hold for every59 iteration of the fixed point computation; and finally, we can prove some60 properties of the fixed point.51 %The proof has been carried out using the ``Russell'' style from~\cite{Sozeau2006}. 52 %We have proven some invariants of the {\sc f} function from the previous 53 %section; these invariants are then used to prove properties that hold for every 54 %iteration of the fixed point computation; and finally, we can prove some 55 %properties of the fixed point. 61 56 62 57 \subsection{Fold invariants} … … 64 59 In this section, we present the invariants that hold during the fold of {\sc f} 65 60 over the program. These will be used later on to prove the properties of the 66 iteration. 67 68 Note that during the fixed point computation, the $\sigma$ function is 61 iteration. During the fixed point computation, the $\sigma$ function is 69 62 implemented as a trie for ease of access; computing $\sigma(x)$ is achieved by 70 63 looking up the value of $x$ in the trie. Actually, during the fold, the value … … 72 65 component is the number of bytes added to the program so far with respect to 73 66 the previous iteration, and the second component, {\tt ppc\_pc\_map}, is the 74 actual $\sigma$ trie. 75 67 actual $\sigma$ trie (which we'll call $strie$ to avoid confusion). 68 % 69 {\small 76 70 \begin{alignat*}{2} 77 \mathtt{out} & \mathtt{\_of\_program\_none} \equiv \lambda prefix.\lambda s igma. \notag\\71 \mathtt{out} & \mathtt{\_of\_program\_none} \equiv \lambda prefix.\lambda strie. \notag\\ 78 72 & \forall i.i < 2^{16} \rightarrow (i > prefix \leftrightarrow 79 \mathtt{lookup\_opt}\ i\ (\mathtt{snd}\ s igma) = \mathtt{None})80 \end{alignat*} 81 73 \mathtt{lookup\_opt}\ i\ (\mathtt{snd}\ strie) = \mathtt{None}) 74 \end{alignat*}} 75 % 82 76 The first invariant states that any pseudoaddress not yet examined is not 83 77 present in the lookup trie. 84 78 % 79 {\small 85 80 \begin{alignat*}{2} 86 \mathtt{not} & \mathtt{\_jump\_default} \equiv \lambda prefix.\lambda sigma.\notag\\ 87 & \forall i.i < prefix \rightarrow\notag\\ 88 & \neg\mathtt{is\_jump}\ (\mathtt{nth}\ i\ prefix) \rightarrow\notag\\ 89 & \mathtt{lookup}\ i\ (\mathtt{snd}\ sigma) = \mathtt{short\_jump} 90 \end{alignat*} 91 81 \mathtt{not} & \mathtt{\_jump\_default} \equiv \lambda prefix.\lambda strie.\forall i.i < prefix \rightarrow\notag\\ 82 & \neg\mathtt{is\_jump}\ (\mathtt{nth}\ i\ prefix) \rightarrow \mathtt{lookup}\ i\ (\mathtt{snd}\ strie) = \mathtt{short\_jump} 83 \end{alignat*}} 84 % 92 85 This invariant states that when we try to look up the jump length of a 93 86 pseudoaddress where there is no branch instruction, we will get the default 94 87 value, a short jump. 95 88 % 89 {\small 96 90 \begin{alignat*}{4} 97 \mathtt{jump} & \omit\rlap{$\mathtt{\_increase} \equiv \lambda pc.\lambda op.\lambda p.$} \notag\\ 98 & \omit\rlap{$\forall i.i < prefix \rightarrow$} \notag\\ 99 & \mathbf{let}\ && oj \equiv \mathtt{lookup}\ i\ (\mathtt{snd}\ op)\ \mathbf{in} \notag\\ 100 & \mathbf{let}\ && j \equiv \mathtt{lookup}\ i\ (\mathtt{snd}\ p)\ \mathbf{in} \notag\\ 101 &&& \mathtt{jmpleq}\ oj\ j 102 \end{alignat*} 103 91 \mathtt{jump} & \mathtt{\_increase} \equiv \lambda pc.\lambda op.\lambda p.\forall i.i < prefix \rightarrow \notag\\ 92 & \mathbf{let}\ oj \equiv \mathtt{lookup}\ i\ (\mathtt{snd}\ op)\ \mathbf{in} \notag\\ 93 & \mathbf{let}\ j \equiv \mathtt{lookup}\ i\ (\mathtt{snd}\ p)\ \mathbf{in}\ \mathtt{jmpleq}\ oj\ j 94 \end{alignat*}} 95 % 104 96 This invariant states that between iterations (with $op$ being the previous 105 97 iteration, and $p$ the current one), jump lengths either remain equal or 106 98 increase. It is needed for proving termination. 107 108 \begin{figure}[h t]109 \ begin{alignat*}{6}110 \ mathtt{sigma} & \omit\rlap{$\mathtt{\_compact\_unsafe} \equiv \lambda prefix.\lambda sigma.$}\notag\\111 & \omit\rlap{$\forall n.n < prefix \rightarrow$}\notag\\112 & \mathbf{match}\ && \omit\rlap{$\mathtt{lookup\_opt}\ n\ (\mathtt{snd}\ s igma)\ \mathbf{with}$}\notag\\99 % 100 \begin{figure}[h] 101 \small 102 \begin{alignat*}{6} 103 \mathtt{sigma} & \omit\rlap{$\mathtt{\_compact\_unsafe} \equiv \lambda prefix.\lambda strie.\forall n.n < prefix \rightarrow$}\notag\\ 104 & \mathbf{match}\ && \omit\rlap{$\mathtt{lookup\_opt}\ n\ (\mathtt{snd}\ strie)\ \mathbf{with}$}\notag\\ 113 105 &&& \omit\rlap{$\mathtt{None} \Rightarrow \mathrm{False}$} \notag\\ 114 106 &&& \omit\rlap{$\mathtt{Some}\ \langle pc, j \rangle \Rightarrow$} \notag\\ 115 &&& \mathbf{match}\ && \mathtt{lookup\_opt}\ (n+1)\ (\mathtt{snd}\ s igma)\ \mathbf{with}\notag\\107 &&& \mathbf{match}\ && \mathtt{lookup\_opt}\ (n+1)\ (\mathtt{snd}\ strie)\ \mathbf{with}\notag\\ 116 108 &&&&& \mathtt{None} \Rightarrow \mathrm{False} \notag\\ 117 109 &&&&& \mathtt{Some}\ \langle pc_1, j_1 \rangle \Rightarrow … … 122 114 \label{sigmacompactunsafe} 123 115 \end{figure} 124 116 % 125 117 We now proceed with the safety lemmas. The lemma in 126 118 Figure~\ref{sigmacompactunsafe} is a temporary formulation of the main … … 132 124 $\sigma$ is still under construction; we will prove below that after the 133 125 final iteration, {\tt sigma\_compact\_unsafe} is equivalent to the main 134 property .135 136 \begin{figure}[h t]137 \ begin{alignat*}{6}138 \ mathtt{sigma} & \omit\rlap{$\mathtt{\_safe} \equiv \lambda prefix.\lambda labels.\lambda old\_sigma.\lambda sigma$}\notag\\139 & \omit\rlap{$\forall i.i < prefix \rightarrow$} \notag\\126 property in Figure~\ref{sigmasafe} which holds at the end of the computation. 127 % 128 \begin{figure}[h] 129 \small 130 \begin{alignat*}{6} 131 \mathtt{sigma} & \omit\rlap{$\mathtt{\_safe} \equiv \lambda prefix.\lambda labels.\lambda old\_strie.\lambda strie.\forall i.i < prefix \rightarrow$} \notag\\ 140 132 & \omit\rlap{$\forall dest\_label.\mathtt{is\_jump\_to\ (\mathtt{nth}\ i\ prefix})\ dest\_label \rightarrow$} \notag\\ 141 133 & \mathbf{let} && \omit\rlap{$\ paddr \equiv \mathtt{lookup}\ labels\ dest\_label\ \mathbf{in}$} \notag\\ 142 & \mathbf{let} && \omit\rlap{$\ \langle j, src, dest \rangle \equiv$}\notag\\ 143 &&& \omit\rlap{$\mathbf{if} \ paddr\ \leq\ i\ \mathbf{then}$}\notag\\ 144 &&&&& \mathbf{let}\ \langle \_, j \rangle \equiv \mathtt{lookup}\ i\ (\mathtt{snd}\ sigma)\ \mathbf{in} \notag\\ 145 &&&&& \mathbf{let}\ \langle pc\_plus\_jl, \_ \rangle \equiv \mathtt{lookup}\ (i+1)\ (\mathtt{snd}\ sigma)\ \mathbf{in}\notag\\ 146 &&&&& \mathbf{let}\ \langle addr, \_ \rangle \equiv \mathtt{lookup}\ paddr\ (\mathtt{snd}\ sigma)\ \mathbf{in}\notag\\ 134 & \mathbf{let} && \omit\rlap{$\ \langle j, src, dest \rangle \equiv \mathbf{if} \ paddr\ \leq\ i\ \mathbf{then}$}\notag\\ 135 &&&&& \mathbf{let}\ \langle \_, j \rangle \equiv \mathtt{lookup}\ i\ (\mathtt{snd}\ strie)\ \mathbf{in} \notag\\ 136 &&&&& \mathbf{let}\ \langle pc\_plus\_jl, \_ \rangle \equiv \mathtt{lookup}\ (i+1)\ (\mathtt{snd}\ strie)\ \mathbf{in}\notag\\ 137 &&&&& \mathbf{let}\ \langle addr, \_ \rangle \equiv \mathtt{lookup}\ paddr\ (\mathtt{snd}\ strie)\ \mathbf{in}\notag\\ 147 138 &&&&& \langle j, pc\_plus\_jl, addr \rangle\notag\\ 148 139 &&&\mathbf{else} \notag\\ 149 &&&&&\mathbf{let}\ \langle \_, j \rangle \equiv \mathtt{lookup}\ i\ (\mathtt{snd}\ sigma)\ \mathbf{in} \notag\\ 150 &&&&&\mathbf{let}\ \langle pc\_plus\_jl, \_ \rangle \equiv \mathtt{lookup}\ (i+1)\ (\mathtt{snd}\ old\_sigma)\ \mathbf{in}\notag\\ 151 &&&&&\mathbf{let}\ \langle addr, \_ \rangle \equiv \mathtt{lookup}\ paddr\ (\mathtt{snd}\ old\_sigma)\ \mathbf{in}\notag\\ 152 &&&&&\langle j, pc\_plus\_jl, addr \rangle\notag\\ 153 &&&\omit\rlap{$\mathbf{in}$}\notag\\ 140 &&&&&\mathbf{let}\ \langle \_, j \rangle \equiv \mathtt{lookup}\ i\ (\mathtt{snd}\ strie)\ \mathbf{in} \notag\\ 141 &&&&&\mathbf{let}\ \langle pc\_plus\_jl, \_ \rangle \equiv \mathtt{lookup}\ (i+1)\ (\mathtt{snd}\ old\_strie)\ \mathbf{in}\notag\\ 142 &&&&&\mathbf{let}\ \langle addr, \_ \rangle \equiv \mathtt{lookup}\ paddr\ (\mathtt{snd}\ old\_strie)\ \mathbf{in}\notag\\ 143 &&&&&\langle j, pc\_plus\_jl, addr \rangle \mathbf{in}\ \notag\\ 154 144 &&&\mathbf{match} && \ j\ \mathbf{with} \notag\\ 155 145 &&&&&\mathrm{short\_jump} \Rightarrow \mathtt{short\_jump\_valid}\ src\ dest\notag\\ … … 160 150 \label{sigmasafe} 161 151 \end{figure} 162 163 The lemma in figure~\ref{sigmasafe} is a safety property. It states 164 that branch instructions are encoded properly, and that no wrong branch 165 instructions are chosen. 166 167 Note that we compute the distance using the memory address of the instruction 152 % 153 We compute the distance using the memory address of the instruction 168 154 plus its size. This follows the behaviour of the MCS51 microprocessor, which 169 155 increases the program counter directly after fetching, and only then executes 170 156 the branch instruction (by changing the program counter again). 171 157 172 We now encode some final,simple, properties to make sure that our policy158 There are also some simple, properties to make sure that our policy 173 159 remains consistent, and to keep track of whether the fixed point has been 174 reached. 175 176 \begin{align*} 177 & \mathtt{lookup}\ 0\ (\mathtt{snd}\ policy) = 0 \notag\\ 178 & \mathtt{lookup}\ prefix\ (\mathtt{snd}\ policy) = \mathtt{fst}\ policy 179 \end{align*} 180 181 These two properties give the values of $\sigma$ for the start and end of the 182 program; $\sigma(0) = 0$ and $\sigma(n)$, where $n$ is the number of 183 instructions up until now, is equal to the maximum memory address so far. 184 185 \begin{align*} 186 & added = 0\ \rightarrow\ \mathtt{policy\_pc\_equal}\ prefix\ old\_sigma\ policy \notag\\ 187 & \mathtt{policy\_jump\_equal}\ prefix\ old\_sigma\ policy\ \rightarrow\ added = 0 188 \end{align*} 189 190 And finally, two properties that deal with what happens when the previous 160 reached. We do not include them here in detail. Two of these properties give the values of $\sigma$ for the start and end of the program; $\sigma(0) = 0$ and $\sigma(n)$, where $n$ is the number of instructions up until now, is equal to the maximum memory address so far. There are also two properties that deal with what happens when the previous 191 161 iteration does not change with respect to the current one. $added$ is a 192 162 variable that keeps track of the number of bytes we have added to the program … … 194 164 has not changed and vice versa. 195 165 166 %{\small 167 %\begin{align*} 168 %& \mathtt{lookup}\ 0\ (\mathtt{snd}\ strie) = 0 \notag\\ 169 %& \mathtt{lookup}\ prefix\ (\mathtt{snd}\ strie) = \mathtt{fst}\ strie 170 %\end{align*}} 171 172 173 %{\small 174 %\begin{align*} 175 %& added = 0\ \rightarrow\ \mathtt{policy\_pc\_equal}\ prefix\ old\_strie\ strie \notag\\ 176 %& \mathtt{policy\_jump\_equal}\ prefix\ old\_strie\ strie\ \rightarrow\ added = 0 177 %\end{align*}} 178 196 179 We need to use two different formulations, because the fact that $added$ is 0 197 180 does not guarantee that no branch instructions have changed. For instance, 198 181 it is possible that we have replaced a short jump with an absolute jump, which 199 does not change the size of the branch instruction. 200 201 Therefore {\tt policy\_pc\_equal} states that $old\_sigma_1(x) = sigma_1(x)$, 202 whereas {\tt policy\_jump\_equal} states that $old\_sigma_2(x) = sigma_2(x)$. 203 This formulation is sufficient to prove termination and compactness. 182 does not change the size of the branch instruction. Therefore {\tt policy\_pc\_equal} states that $old\_sigma_1(x) = sigma_1(x)$, whereas {\tt policy\_jump\_equal} states that $old\_sigma_2(x) = sigma_2(x)$. This formulation is sufficient to prove termination and compactness. 204 183 205 184 Proving these invariants is simple, usually by induction on the prefix length. … … 210 189 difference between these invariants and the fold invariants is that after the 211 190 completion of the fold, we check whether the program size does not supersede 212 64 Kb, the maximum memory size the MCS51 can address. 213 214 The type of an iteration therefore becomes an option type: {\tt None} in case 191 64 Kb, the maximum memory size the MCS51 can address. The type of an iteration therefore becomes an option type: {\tt None} in case 215 192 the program becomes larger than 64 Kb, or $\mathtt{Some}\ \sigma$ 216 193 otherwise. We also no longer pass along the number of bytes added to the 217 194 program size, but a boolean that indicates whether we have changed something 218 during the iteration or not. 219 220 If an iteration returns {\tt None}, we have the following invariant: 221 222 \begin{alignat*}{2} 223 \mathtt{nec} & \mathtt{\_plus\_ultra} \equiv \lambda program.\lambda p. \notag\\ 224 &\neg(\forall i.i < program\ \rightarrow \notag\\ 225 & \mathtt{is\_jump}\ (\mathtt{nth}\ i\ program)\ \rightarrow \notag\\ 226 & \mathtt{lookup}\ i\ (\mathtt{snd}\ p) = \mathrm{long\_jump}). 227 \end{alignat*} 228 229 This invariant is applied to $old\_sigma$; if our program becomes too large 230 for memory, the previous iteration cannot have every branch instruction encoded 231 as a long jump. This is needed later in the proof of termination. 232 233 If the iteration returns $\mathtt{Some}\ \sigma$, the invariants 234 {\tt out\_of\_program\_none},\\ 235 {\tt not\_jump\_default}, {\tt jump\_increase}, 236 and the two invariants that deal with $\sigma(0)$ and $\sigma(n)$ are 237 retained without change. 195 during the iteration or not. 196 197 If the iteration returns {\tt None}, which means that it has become too large for memory, there is an invariant that states that the previous iteration cannot 198 have every branch instruction encoded as a long jump. This is needed later in the proof of termination. If the iteration returns $\mathtt{Some}\ \sigma$, the fold invariants are retained without change. 238 199 239 200 Instead of using {\tt sigma\_compact\_unsafe}, we can now use the proper 240 201 invariant: 241 202 % 203 {\small 242 204 \begin{alignat*}{6} 243 205 \mathtt{sigma} & \omit\rlap{$\mathtt{\_compact} \equiv \lambda program.\lambda sigma.$} \notag\\ … … 250 212 &&&&& \mathrm{Some} \langle pc1, j1 \rangle \Rightarrow\notag\\ 251 213 &&&&& \ \ pc1 = pc + \mathtt{instruction\_size}\ n\ (\mathtt{nth}\ n\ program) 252 \end{alignat*} 253 254 214 \end{alignat*}} 215 % 255 216 This is almost the same invariant as ${\tt sigma\_compact\_unsafe}$, but differs in that it 256 217 computes the sizes of branch instructions by looking at the distance between 257 position and destination using $\sigma$. 258 259 In actual use, the invariant is qualified: $\sigma$ is compact if there have 218 position and destination using $\sigma$. In actual use, the invariant is qualified: $\sigma$ is compact if there have 260 219 been no changes (i.e. the boolean passed along is {\tt true}). This is to 261 220 reflect the fact that we are doing a least fixed point computation: the result … … 265 224 $\mathtt{Some}\ \sigma$: it must hold that $\mathtt{fst}\ sigma < 2^{16}$. 266 225 We need this invariant to make sure that addresses do not overflow. 267 268 The invariants taken directly from the fold invariants are trivial to prove.269 226 270 227 The proof of {\tt nec\_plus\_ultra} goes as follows: if we return {\tt None},
Note: See TracChangeset
for help on using the changeset viewer.