Changeset 2085 for src/ASM/CPP2012policy/problem.tex
 Timestamp:
 Jun 15, 2012, 11:36:47 AM (9 years ago)
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

src/ASM/CPP2012policy/problem.tex
r2064 r2085 2 2 3 3 The problem of branch displacement optimisation, also known as jump encoding, is 4 a wellknown problem in assembler design. 5 6 In many instruction sets, amongst which the ubiquitous x86 architecture (both 7 its 32 and 64bit incarnations), there are instructions whose addressing 8 mode varies with the distance between the instruction and the object they 9 address. Mostly this occurs with jump instructions; for example, in the 10 x8664 instruction set there are eleven different forms of the unconditional 11 jump instruction, all with different ranges, instruction sizes and semantics 12 (only six are valid in 64bit mode, for example). Some examples are shown in 13 figure~\ref{f:x86jumps}: 14 15 \begin{figure}[h] 4 a wellknown problem in assembler design. It is caused by the fact that in 5 many architecture sets, the encoding (and therefore size) of some instructions 6 depends on the distance to their operand (the instruction 'span'). The branch 7 displacement optimisation problem consists in encoding these spandependent 8 instructions in such a way that the resulting program is as small as possible. 9 10 This problem is the subject of the present paper. After introducing the problem 11 in more detail, we will discuss the solutions used by other compilers, present 12 the algorithm used by us in the CerCo assembler, and discuss its verification, 13 that is the proofs of termination and correctness. 14 15 The research presented in this paper has been executed within the CerCo project 16 which aims at formally verifying a C compiler with cost annotations. The 17 target architecture for this project is the MCS51, whose instruction set 18 contains spandependent instructions. Furthermore, its maximum addressable 19 memory size is very small (65 Kbytes), which makes it important to generate 20 programs that are as small as possible. 21 22 With this optimisation, however, comes increased complexity and hence 23 increased possibility for error. We must make sure that the jumps are encoded 24 correctly, otherwise the assembled program will behave impredictably. 25 26 \section{The branch displacement optimisation problem} 27 28 In most modern instruction sets, the only spandependent instructions are 29 branch instructions. Taking the ubiquitous x8664 instruction set as an 30 example, we find that it contains are eleven different forms of the 31 unconditional jump instruction, all with different ranges, instruction sizes 32 and semantics (only six are valid in 64bit mode, for example). Some examples 33 are shown in figure~\ref{f:x86jumps}. 34 35 \begin{figure}[h] 36 \begin{center} 16 37 \begin{tabular}{lll} 17 38 \hline … … 24 45 \hline 25 46 \end{tabular} 47 \end{center} 26 48 \caption{List of x86 jump instructions} 27 49 \label{f:x86jumps} … … 29 51 30 52 The chosen target architecture of the CerCo project is the Intel MCS51, which 31 features three types of jump instructions, as shown in 32 figure~\ref{f:mcs51jumps}. 33 34 \begin{figure}[h] 53 features three types of branch instructions (or jump instructions; the two terms 54 are used interchangeably), as shown in figure~\ref{f:mcs51jumps}. 55 56 \begin{figure}[h] 57 \begin{center} 35 58 \begin{tabular}{llll} 36 59 \hline … … 39 62 \hline 40 63 SJMP (`short jump') & 2 & 2 & 128 to 127 bytes \\ 41 AJMP (` mediumjump') & 2 & 2 & one segment (11bit address) \\64 AJMP (`absolute jump') & 2 & 2 & one segment (11bit address) \\ 42 65 LJMP (`long jump') & 3 & 3 & entire memory \\ 43 66 \hline 44 67 \end{tabular} 68 \end{center} 45 69 \caption{List of MCS51 jump instructions} 46 70 \label{f:mcs51jumps} … … 50 74 means that a conditional jump outside the short address range has to be 51 75 encoded using two jumps; the call instruction is only available in 52 mediumand long forms.76 absolute and long forms. 53 77 54 78 Note that even though the MCS51 architecture is much less advanced and more … … 57 81 reach the entire available memory. 58 82 59 Generally, in assembly code, there is only one way to indicate a jump; the 60 algorithm used by the assembler to encode these jumps into the different 83 Generally, in the code that is sent to the assembler as input, the only 84 difference made between jump instructions is by semantics, not by span. This 85 means that a distinction is made between the inconditional jump and the several 86 kinds of conditional jump, but not between their short, absolute or long 87 variants. 88 89 The algorithm used by the assembler to encode these jumps into the different 61 90 machine instructions is known as the {\tt branch displacement algorithm}. The 62 91 optimisation problem consists in using as small an encoding as possible, thus … … 66 95 which could make finding an optimal solution very timeconsuming. 67 96 68 The canonical solution, as shown in~\cite{Szymanski1978} or more recently69 in~\cite{Dickson2008} for the x86 instruction set, is to use a fixed point 70 algorithm that starts out with the shortest possible encoding (all jumps 71 encoded as short jumps, which is very probably not a correct solution) and then 72 iterates over the program to reencode those jumps whose target is outside 73 their range.74 75 \subsection*{Adding mediumjumps}97 The canonical solution, as shown by Szymanski~\cite{Szymanski1978} or more 98 recently by Dickson~\cite{Dickson2008} for the x86 instruction set, is to use a 99 fixed point algorithm that starts out with the shortest possible encoding (all 100 jumps encoded as short jumps, which is very probably not a correct solution) 101 and then iterates over the program to reencode those jumps whose target is 102 outside their range. 103 104 \subsection*{Adding absolute jumps} 76 105 77 106 In both papers mentioned above, the encoding of a jump is only dependent on the … … 81 110 Here, termination of the smallest fixed point algorithm is easy to prove. All 82 111 jumps start out encoded as short jumps, which means that the distance between 83 any jump and its target is as short as possible. If we therefore need to encode 84 a jump $j$ as a long jump, we can be sure that we can never reach a situation 85 where the span of $j$ is so small that it can be encoded as a short jump. This 86 reasoning holds throughout the subsequent iterations of the algorithms: short 87 jumps can change into long jumps, but not vice versa. Hence, the algorithm 88 either terminates when a fixed point is reached or when all short jumps have 89 been changed into long jumps. 112 any jump and its target is as short as possible. If, in this situation, there 113 is a jump $j$ whose span is not within the range for a short jump, we can be 114 sure that we can never reach a situation where the span of $j$ is so small that 115 it can be encoded as a short jump. This reasoning holds throughout the 116 subsequent iterations of the algorithm: short jumps can change into long jumps, 117 but not vice versa (spans only increase). Hence, the algorithm either 118 terminates when a fixed point is reached or when all short jumps have been 119 changed into long jumps. 90 120 91 121 Also, we can be certain that we have reached an optimal solution: a short jump … … 93 123 94 124 However, neither of these claims (termination nor optimality) hold when we add 95 the medium jump. 96 97 The reason for this is that with medium jumps, the encoding of a jump no longer 98 depends only on the distance between the jump and its target: in order for a 99 medium jump to be possible, they need to be in the same segment. It is therefore 100 entirely possible for two jumps with the same span to be encoded in different 101 ways (medium if the jump and its destination are in the same segment, long if 102 this is not the case). 125 the absolute jump. 126 127 The reason for this is that with absolute jumps, the encoding of a jump no 128 longer depends only on the distance between the jump and its target: in order 129 for an absolute jump to be possible, they need to be in the same segment (for 130 the MCS51, this means that the first 5 bytes of their addesses have to be 131 equal). It is therefore entirely possible for two jumps with the same span to 132 be encoded in different ways (absolute if the jump and its target are in the 133 same segment, long if this is not the case). 103 134 104 135 This invalidates the termination argument: a jump, once encoded as a long jump, 105 can be reencoded during a later iteration as a mediumjump. Consider the136 can be reencoded during a later iteration as an absolute jump. Consider the 106 137 program shown in figure~\ref{f:term_example}. At the start of the first 107 138 iteration, both the jump to {\tt X} and the jump to $\mathtt{L}_{0}$ are … … 115 146 be encoded as a long jump. If we assume that the jump to {\tt X} is encoded as 116 147 a long jump as well, the size of the jump will increase and $\mathtt{L}_{0}$ 117 will be `propelled' into the same segment as its jump. Hence, in the next118 iteration, it can be encoded as a mediumjump. At first glance, there is148 will be `propelled' into the same segment as its jump. Hence, in the third 149 iteration, it can be encoded as an absolute jump. At first glance, there is 119 150 nothing that prevents us from making a construction where two jumps interact 120 in such a way as to keep switching between long and medium indefinitely. 151 in such a way as to keep switching between long and absolute encodings for 152 an indefinite amount of iterations. 121 153 122 154 \begin{figure}[h] … … 128 160 jmp L\(\sb{0}\) 129 161 \end{alltt} 130 \caption{Example of a program where a long jump becomes medium}162 \caption{Example of a program where a long jump becomes absolute} 131 163 \label{f:term_example} 132 164 \end{figure} 165 166 In fact, this situation mirrors the explanation by 167 Szymanski~\cite{Szymanski1978} of why the branch displacement optimisation 168 problem is NPcomplete. In this explanation, a condition for NPcompleteness 169 is the fact that programs be allowed to contain {\em pathological} jumps. 170 These are jumps that can normally not be encoded as a short(er) jump, but gain 171 this property when some other jumps are encoded as a long(er) jump. This is 172 exactly what happens in figure~\ref{f:term_example}: by encoding the first 173 jump as a long jump, another jump switches from long to absolute 174 (which is shorter). 133 175 134 176 The optimality argument no longer holds either. Let us consider the program … … 145 187 therefore increase in size, $\mathtt{L}_{1}$ would be `propelled' across the 146 188 segment border, so that the three jumps to $\mathtt{L}_{1}$ could be encoded 147 as medium jumps. Depending on the relative sizes of long and medium jumps, this148 solution might actually be smaller than the one reached by the smallest189 as absolute jumps. Depending on the relative sizes of long and absolute jumps, 190 this solution might actually be smaller than the one reached by the smallest 149 191 fixed point algorithm. 150 192
Note: See TracChangeset
for help on using the changeset viewer.