Changeset 2096 for src/ASM/CPP2012policy/problem.tex
 Timestamp:
 Jun 15, 2012, 3:25:21 PM (7 years ago)
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

src/ASM/CPP2012policy/problem.tex
r2091 r2096 5 5 fact that in many architecture sets, the encoding (and therefore size) of some 6 6 instructions depends on the distance to their operand (the instruction 'span'). 7 The branch displacement optimisation problem consists inencoding these7 The branch displacement optimisation problem consists of encoding these 8 8 spandependent instructions in such a way that the resulting program is as 9 9 small as possible. … … 11 11 This problem is the subject of the present paper. After introducing the problem 12 12 in more detail, we will discuss the solutions used by other compilers, present 13 the algorithm used by usin the CerCo assembler, and discuss its verification,14 that is the proofs of termination and correctness using the Matita theorem15 prover~\cite{Asperti2007}.13 the algorithm we use in the CerCo assembler, and discuss its verification, 14 that is the proofs of termination and correctness using the Matita proof 15 assistant~\cite{Asperti2007}. 16 16 17 17 The research presented in this paper has been executed within the CerCo project … … 19 19 target architecture for this project is the MCS51, whose instruction set 20 20 contains spandependent instructions. Furthermore, its maximum addressable 21 memory size is very small (6 5 Kbytes), which makes it important to generate21 memory size is very small (64 Kb), which makes it important to generate 22 22 programs that are as small as possible. 23 23 … … 31 31 In most modern instruction sets that have them, the only spandependent 32 32 instructions are branch instructions. Taking the ubiquitous x8664 instruction 33 set as an example, we find that it contains areeleven different forms of the33 set as an example, we find that it contains eleven different forms of the 34 34 unconditional branch instruction, all with different ranges, instruction sizes 35 35 and semantics (only six are valid in 64bit mode, for example). Some examples 36 are shown in figure~\ref{f:x86jumps}.36 are shown in Figure~\ref{f:x86jumps}. 37 37 38 38 \begin{figure}[h] … … 55 55 The chosen target architecture of the CerCo project is the Intel MCS51, which 56 56 features three types of branch instructions (or jump instructions; the two terms 57 are used interchangeably), as shown in figure~\ref{f:mcs51jumps}.57 are used interchangeably), as shown in Figure~\ref{f:mcs51jumps}. 58 58 59 59 \begin{figure}[h] … … 78 78 encoded using three branch instructions (for instructions whose logical 79 79 negation is available, it can be done with two branch instructions, but for 80 some instructions , this oppositeis not available); the call instruction is80 some instructions this is not available); the call instruction is 81 81 only available in absolute and long forms. 82 82 83 Note that even though the MCS51 architecture is much less advanced and more84 simplethan the x8664 architecture, the basic types of branch instruction83 Note that even though the MCS51 architecture is much less advanced and simpler 84 than the x8664 architecture, the basic types of branch instruction 85 85 remain the same: a short jump with a limited range, an intrasegment jump and a 86 86 jump that can reach the entire available memory. 87 87 88 Generally, in the code that is sentto the assembler as input, the only89 difference made between branch instructions is by semantics, not byspan. This88 Generally, in code fed to the assembler as input, the only 89 difference between branch instructions is semantics, not span. This 90 90 means that a distinction is made between an unconditional branch and the 91 91 several kinds of conditional branch, but not between their short, absolute or … … 94 94 The algorithm used by the assembler to encode these branch instructions into 95 95 the different machine instructions is known as the {\em branch displacement 96 algorithm}. The optimisation problem consists in using as small an encoding as96 algorithm}. The optimisation problem consists of finding as small an encoding as 97 97 possible, thus minimising program length and execution time. 98 98 … … 102 102 The canonical solution, as shown by Szymanski~\cite{Szymanski1978} or more 103 103 recently by Dickson~\cite{Dickson2008} for the x86 instruction set, is to use a 104 fixed point algorithm that starts outwith the shortest possible encoding (all105 branch instruction encoded as short jumps, which is very probably not a correct104 fixed point algorithm that starts with the shortest possible encoding (all 105 branch instruction encoded as short jumps, which is likely not a correct 106 106 solution) and then iterates over the program to reencode those branch 107 107 instructions whose target is outside their range. … … 120 120 situation where the span of $j$ is so small that it can be encoded as a short 121 121 jump. This argument continues to hold throughout the subsequent iterations of 122 the algorithm: short jumps can change into long jumps, but not vice versa123 (spans only increase). Hence, the algorithm either terminateswhen a fixed122 the algorithm: short jumps can change into long jumps, but not \emph{vice versa}, 123 as spans only increase. Hence, the algorithm either terminates early when a fixed 124 124 point is reached or when all short jumps have been changed into long jumps. 125 125 … … 128 128 129 129 However, neither of these claims (termination nor optimality) hold when we add 130 the absolute jump. 131 132 The reason for this is that with absolute jumps, the encoding of a branch 130 the absolute jump, as with absolute jumps, the encoding of a branch 133 131 instruction no longer depends only on the distance between the branch 134 132 instruction and its target: in order for an absolute jump to be possible, they … … 139 137 long if this is not the case). 140 138 141 This invalidates the termination argument: a branch instruction, once encoded 139 \begin{figure}[ht] 140 \begin{alltt} 141 jmp X 142 \vdots 143 L\(\sb{0}\): 144 \vdots 145 jmp L\(\sb{0}\) 146 \end{alltt} 147 \caption{Example of a program where a long jump becomes absolute} 148 \label{f:term_example} 149 \end{figure} 150 151 This invalidates our earlier termination argument: a branch instruction, once encoded 142 152 as a long jump, can be reencoded during a later iteration as an absolute jump. 143 Consider the program shown in figure~\ref{f:term_example}. At the start of the153 Consider the program shown in Figure~\ref{f:term_example}. At the start of the 144 154 first iteration, both the branch to {\tt X} and the branch to $\mathtt{L}_{0}$ 145 155 are encoded as small jumps. Let us assume that in this case, the placement of … … 156 166 Hence, in the third iteration, the branch to $\mathtt{L}_{0}$ can be encoded as 157 167 an absolute jump. At first glance, there is nothing that prevents us from 158 making a construction where two branch instructions interact in such a way as 159 to keep switching between long and absolute encodings for an indefinite amount 160 of iterations. 161 162 \begin{figure}[h] 163 \begin{alltt} 164 jmp X 165 \vdots 166 L\(\sb{0}\): 167 \vdots 168 jmp L\(\sb{0}\) 169 \end{alltt} 170 \caption{Example of a program where a long jump becomes absolute} 171 \label{f:term_example} 172 \end{figure} 173 174 In fact, this situation mirrors the explanation by 175 Szymanski~\cite{Szymanski1978} of why the branch displacement optimisation 176 problem is NPcomplete. In this explanation, a condition for NPcompleteness 177 is the fact that programs be allowed to contain {\em pathological} jumps. 178 These are branch instructions that can normally not be encoded as a short(er) 179 jump, but gain this property when some other branch instructions are encoded as 180 a long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: 181 by encoding the first branch instruction as a long jump, another branch 182 instructions switches from long to absolute (which is shorter). 183 184 The optimality argument no longer holds either. Let us consider the program 185 shown in figure~\ref{f:opt_example}. Suppose that the distance between 168 constructing a configuration where two branch instructions interact in such a 169 way as to iterate indefinitely between long and absolute encodings. 170 171 This situation mirrors the explanation by Szymanski~\cite{Szymanski1978} of why 172 the branch displacement optimisation problem is NPcomplete. In this explanation, 173 a condition for NPcompleteness is the fact that programs be allowed to contain 174 {\em pathological} jumps. These are branch instructions that can normally not be 175 encoded as a short(er) jump, but gain this property when some other branch 176 instructions are encoded as a long(er) jump. This is exactly what happens in 177 Figure~\ref{f:term_example}. By encoding the first branch instruction as a long 178 jump, another branch instruction switches from long to absolute (which is 179 shorter). 180 181 In addition, our previous optimality argument no longer holds. Consider the program 182 shown in Figure~\ref{f:opt_example}. Suppose that the distance between 186 183 $\mathtt{L}_{0}$ and $\mathtt{L}_{1}$ is such that if {\tt jmp X} is encoded 187 184 as a short jump, there is a segment border just after $\mathtt{L}_{1}$. Let … … 189 186 segment, but far enough away from $\mathtt{L}_{1}$ that they cannot be encoded 190 187 as short jumps. 188 189 \begin{figure}[ht] 190 \begin{alltt} 191 L\(\sb{0}\): jmp X 192 X: 193 \vdots 194 L\(\sb{1}\): 195 \vdots 196 jmp L\(\sb{1}\) 197 \vdots 198 jmp L\(\sb{1}\) 199 \vdots 200 jmp L\(\sb{1}\) 201 \vdots 202 \end{alltt} 203 \caption{Example of a program where the fixedpoint algorithm is not optimal} 204 \label{f:opt_example} 205 \end{figure} 191 206 192 207 Then, if {\tt jmp X} were to be encoded as a short jump, which is clearly … … 198 213 this solution might actually be smaller than the one reached by the smallest 199 214 fixed point algorithm. 200 201 \begin{figure}[h]202 \begin{alltt}203 L\(\sb{0}\): jmp X204 X:205 \vdots206 L\(\sb{1}\):207 \vdots208 jmp L\(\sb{1}\)209 \vdots210 jmp L\(\sb{1}\)211 \vdots212 jmp L\(\sb{1}\)213 \vdots214 \end{alltt}215 \caption{Example of a program where the fixedpoint algorithm is not optimal}216 \label{f:opt_example}217 \end{figure}
Note: See TracChangeset
for help on using the changeset viewer.