\section{Introduction}

Branch displacement optimisation, also known as jump encoding, is a well-known
problem in assembler design.

In many instruction sets, including the ubiquitous x86 architecture (both
its 32- and 64-bit incarnations), there are instructions whose addressing
mode varies with the distance between the instruction and the object they
address. This occurs mostly with jump instructions; for example, in the
x86-64 instruction set there are eleven different forms of the unconditional
jump instruction, all with different ranges, instruction sizes and semantics
(only six of them are valid in 64-bit mode). Some examples are shown in
figure~\ref{f:x86jumps}.

\begin{figure}[h]
\begin{tabular}{|l|l|l|}
\hline
Instruction & Size (bytes) & Displacement range \\
\hline
Short jump & 2 & $-128$ to $127$ bytes \\
Relative near jump & 5 & $-2^{31}$ to $2^{31}-1$ bytes \\
Absolute near jump & 6 & one segment (64-bit address) \\
Far jump & 8 & entire memory \\
\hline
\end{tabular}
\caption{List of x86 jump instructions}
\label{f:x86jumps}
\end{figure}

The chosen target architecture of the CerCo project is the Intel MCS-51, which
features three types of jump instructions, as shown in
figure~\ref{f:mcs51jumps}.

\begin{figure}[h]
\begin{tabular}{|l|l|l|l|}
\hline
Instruction & Size & Execution time & Displacement range \\
 & (bytes) & (cycles) & \\
\hline
SJMP (`short jump') & 2 & 2 & $-128$ to $127$ bytes \\
AJMP (`medium jump') & 2 & 2 & one segment (11-bit address) \\
LJMP (`long jump') & 3 & 3 & entire memory \\
\hline
\end{tabular}
\caption{List of MCS-51 jump instructions}
\label{f:mcs51jumps}
\end{figure}

The conditional jump instruction is only available in short form, which
means that a conditional jump outside the short address range has to be
encoded using two jumps; the call instruction is only available in
medium and long forms.

Note that even though the MCS-51 architecture is much less advanced and much
simpler than the x86-64 architecture, the basic types of jump remain the same:
a short jump with a limited range, an intra-segment jump and a jump that can
reach the entire available memory.

Generally, in assembly code, there is only one way to indicate a jump; the
algorithm used by the assembler to encode these jumps into the different
machine instructions is known as the {\em branch displacement algorithm}. The
optimisation problem consists in using as small an encoding as possible, thus
minimising program length and execution time.

This problem is known to be NP-complete~\cite{Robertson1979,Szymanski1978},
so finding an optimal solution can be very time-consuming.

The canonical solution, as shown in~\cite{Szymanski1978} or more recently
in~\cite{Dickson2008} for the x86 instruction set, is to use a fixed point
algorithm that starts out with the shortest possible encoding (all jumps
encoded as short jumps, which is very probably not a correct solution) and then
iterates over the program to re-encode those jumps whose target is outside
their range.
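
A minimal sketch of this iteration for the two-size case (short and long jumps
only) is given below in Python. The program representation, the names
{\tt layout} and {\tt encode}, and the use of the MCS-51 sizes are assumptions
made for illustration; they are not taken from the cited papers.

```python
SHORT, LONG = 2, 3  # assumed sizes in bytes (MCS-51 SJMP and LJMP)

def layout(program, enc):
    """Compute item addresses and label addresses for the encoding `enc`."""
    addr, addrs, labels = 0, [], {}
    for i, (kind, arg) in enumerate(program):
        addrs.append(addr)
        if kind == "label":
            labels[arg] = addr      # a label occupies no space
        elif kind == "code":
            addr += arg             # arg = size in bytes of non-jump code
        else:                       # kind == "jmp", arg = target label
            addr += enc[i]
    return addrs, labels

def encode(program):
    """Start with every jump short; lengthen out-of-range jumps until stable."""
    enc = {i: SHORT for i, (kind, _) in enumerate(program) if kind == "jmp"}
    changed = True
    while changed:
        changed = False
        addrs, labels = layout(program, enc)
        for i in enc:
            if enc[i] == SHORT:
                # Displacement is taken relative to the end of the short jump.
                disp = labels[program[i][1]] - (addrs[i] + SHORT)
                if not -128 <= disp <= 127:
                    enc[i] = LONG   # a long jump is never shortened again
                    changed = True
    return enc

# Hypothetical program: two jumps, with 125 and 200 bytes of other code.
program = [("jmp", "A"), ("code", 125), ("jmp", "B"),
           ("label", "A"), ("code", 200), ("label", "B")]
print(encode(program))
```

In this hypothetical program the first pass lengthens only the jump to
{\tt B}. That extra byte pushes {\tt A} just out of range of the first jump,
which is lengthened in the second pass; the third pass changes nothing and the
iteration stops.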

\subsection*{Adding medium jumps}

In both papers mentioned above, the encoding of a jump depends only on the
distance between the jump and its target: below a certain value a short jump
can be used; above this value the jump must be encoded as a long jump.

Here, termination of the smallest fixed point algorithm is easy to prove. All
jumps start out encoded as short jumps, which means that the distance between
any jump and its target is as short as possible. Re-encoding can only make
instructions larger, so this distance can only grow in later iterations. If we
therefore need to encode a jump $j$ as a long jump, we can be sure that we can
never reach a situation where the span of $j$ is so small that it can be
encoded as a short jump. This reasoning holds throughout the subsequent
iterations of the algorithm: short jumps can change into long jumps, but not
vice versa. Hence, the algorithm terminates either when a fixed point is
reached or when all short jumps have been changed into long jumps.

Also, we can be certain that we have reached an optimal solution: a short jump
is only changed into a long jump if it is absolutely necessary.

However, neither of these claims (termination nor optimality) holds when we add
the medium jump.

The reason for this is that with medium jumps, the encoding of a jump no longer
depends only on the distance between the jump and its target: for a medium jump
to be possible, the jump and its target must be in the same segment. It is
therefore entirely possible for two jumps with the same span to be encoded in
different ways (medium if the jump and its destination are in the same segment,
long if this is not the case).
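
The dependence on segments can be illustrated with a small Python sketch. It
assumes 2 KiB segments (an 11-bit address, as for AJMP in
figure~\ref{f:mcs51jumps}) and assumes that the relevant segment is that of the
instruction following the jump; the function name {\tt choose} and the example
addresses are hypothetical.

```python
SEGMENT_BITS = 11  # assumed: a medium jump supplies the low 11 address bits

def choose(jump_addr, target_addr):
    """Pick the smallest encoding for one jump at fixed addresses."""
    next_addr = jump_addr + 2  # short and medium jumps are both 2 bytes long
    if -128 <= target_addr - next_addr <= 127:
        return "short"
    if next_addr >> SEGMENT_BITS == target_addr >> SEGMENT_BITS:
        return "medium"        # jump and target lie in the same segment
    return "long"

print(choose(0, 1002))     # span of 1000 bytes, entirely inside one segment
print(choose(1500, 2502))  # span of 1000 bytes, crossing the border at 2048
```

Both calls describe a jump with the same span, yet the first can be a medium
jump while the second has to be a long jump.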

This invalidates the termination argument: a jump, once encoded as a long jump,
can be re-encoded during a later iteration as a medium jump. Consider the
program shown in figure~\ref{f:term_example}. At the start of the first
iteration, both the jump to {\tt X} and the jump to $\mathtt{L}_{0}$ are
encoded as short jumps. Let us assume that in this case, the placement of
$\mathtt{L}_{0}$ and the jump to it are such that $\mathtt{L}_{0}$ is just
outside the segment that contains this jump. Let us also assume that the
distance between $\mathtt{L}_{0}$ and the jump to it is too large for the jump
to be encoded as a short jump.

All this means that in the second iteration, the jump to $\mathtt{L}_{0}$ will
be encoded as a long jump. If we assume that the jump to {\tt X} is encoded as
a long jump as well, that jump will grow in size and $\mathtt{L}_{0}$
will be `propelled' into the same segment as its jump. Hence, in the next
iteration, the jump to $\mathtt{L}_{0}$ can be encoded as a medium jump. At
first glance, there is nothing that prevents us from making a construction
where two jumps interact in such a way as to keep switching between long and
medium indefinitely.

\begin{figure}[h]
\begin{alltt}
    jmp X
    \vdots
L\(\sb{0}\):
    \vdots
    jmp L\(\sb{0}\)
\end{alltt}
\caption{Example of a program where a long jump becomes medium}
\label{f:term_example}
\end{figure}

The optimality argument no longer holds either. Let us consider the program
shown in figure~\ref{f:opt_example}. Suppose that the distance between
$\mathtt{L}_{0}$ and $\mathtt{L}_{1}$ is such that if {\tt jmp X} is encoded
as a short jump, there is a segment border just after $\mathtt{L}_{1}$. Let
us also assume that the three jumps to $\mathtt{L}_{1}$ are all in the same
segment beyond that border, but far enough away from $\mathtt{L}_{1}$ that
they cannot be encoded as short jumps.

Then, if {\tt jmp X} were to be encoded as a short jump, which is clearly
possible, all of the jumps to $\mathtt{L}_{1}$ would have to be encoded as
long jumps. However, if {\tt jmp X} were to be encoded as a long jump, and
therefore increase in size, $\mathtt{L}_{1}$ would be `propelled' across the
segment border, so that the three jumps to $\mathtt{L}_{1}$ could be encoded
as medium jumps. Depending on the relative sizes of long and medium jumps, this
solution might actually be smaller than the one reached by the smallest
fixed point algorithm.
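
To make the comparison concrete, assume the MCS-51 sizes of
figure~\ref{f:mcs51jumps} (2 bytes for short and medium jumps, 3 bytes for long
jumps) and count only the four jumps of the example, since the rest of the
program has the same size in both encodings. Assume also that the three jumps
to $\mathtt{L}_{1}$ remain in one segment after the one-byte shift.
\[
\begin{array}{ll}
\mbox{{\tt jmp X} short, three long jumps:} & 2 + 3 \cdot 3 = 11 \mbox{ bytes} \\
\mbox{{\tt jmp X} long, three medium jumps:} & 3 + 3 \cdot 2 = 9 \mbox{ bytes}
\end{array}
\]
Under these assumptions, lengthening {\tt jmp X} costs one byte and saves
three, so the second encoding is two bytes shorter.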

\begin{figure}[h]
\begin{alltt}
L\(\sb{0}\): jmp X
X:
    \vdots
L\(\sb{1}\):
    \vdots
    jmp L\(\sb{1}\)
    \vdots
    jmp L\(\sb{1}\)
    \vdots
    jmp L\(\sb{1}\)
    \vdots
\end{alltt}
\caption{Example of a program where the fixed-point algorithm is not optimal}
\label{f:opt_example}
\end{figure}