# Changeset 2091 for src/ASM/CPP2012-policy/problem.tex

Ignore:
Timestamp:
Jun 15, 2012, 1:35:46 PM (8 years ago)
Message:
• systematically changed 'jump' to 'branch'
File:
1 edited

### Legend:

Unmodified
 r2086 The problem of branch displacement optimisation, also known as jump encoding, is a well-known problem in assembler design. It is caused by the fact that in many architecture sets, the encoding (and therefore size) of some instructions depends on the distance to their operand (the instruction 'span'). The branch displacement optimisation problem consists in encoding these span-dependent instructions in such a way that the resulting program is as small as possible. a well-known problem in assembler design~\cite{Hyde2006}. It is caused by the fact that in many architecture sets, the encoding (and therefore size) of some instructions depends on the distance to their operand (the instruction 'span'). The branch displacement optimisation problem consists in encoding these span-dependent instructions in such a way that the resulting program is as small as possible. This problem is the subject of the present paper. After introducing the problem in more detail, we will discuss the solutions used by other compilers, present the algorithm used by us in the CerCo assembler, and discuss its verification, that is the proofs of termination and correctness. that is the proofs of termination and correctness using the Matita theorem prover~\cite{Asperti2007}. The research presented in this paper has been executed within the CerCo project With this optimisation, however, comes increased complexity and hence increased possibility for error. We must make sure that the jumps are encoded correctly, otherwise the assembled program will behave unpredictably. increased possibility for error. We must make sure that the branch instructions are encoded correctly, otherwise the assembled program will behave unpredictably. \section{The branch displacement optimisation problem} In most modern instruction sets, the only span-dependent instructions are branch instructions. Taking the ubiquitous x86-64 instruction set as an example, we find that it contains are eleven different forms of the unconditional jump instruction, all with different ranges, instruction sizes In most modern instruction sets that have them, the only span-dependent instructions are branch instructions. Taking the ubiquitous x86-64 instruction set as an example, we find that it contains are eleven different forms of the unconditional branch instruction, all with different ranges, instruction sizes and semantics (only six are valid in 64-bit mode, for example). Some examples are shown in figure~\ref{f:x86jumps}. \end{tabular} \end{center} \caption{List of x86 jump instructions} \caption{List of x86 branch instructions} \label{f:x86jumps} \end{figure} \end{tabular} \end{center} \caption{List of MCS-51 jump instructions} \caption{List of MCS-51 branch instructions} \label{f:mcs51jumps} \end{figure} The conditional jump instruction is only available in short form, which means that a conditional jump outside the short address range has to be encoded using two jumps; the call instruction is only available in absolute and long forms. Conditional branch instructions are only available in short form, which means that a conditional branch outside the short address range has to be encoded using three branch instructions (for instructions whose logical negation is available, it can be done with two branch instructions, but for some instructions, this opposite is not available); the call instruction is only available in absolute and long forms. Note that even though the MCS-51 architecture is much less advanced and more simple than the x86-64 architecture, the basic types of jump remain the same: a short jump with a limited range, an intra-segment jump and a jump that can reach the entire available memory. simple than the x86-64 architecture, the basic types of branch instruction remain the same: a short jump with a limited range, an intra-segment jump and a jump that can reach the entire available memory. Generally, in the code that is sent to the assembler as input, the only difference made between jump instructions is by semantics, not by span. This means that a distinction is made between the unconditional jump and the several kinds of conditional jump, but not between their short, absolute or long variants. The algorithm used by the assembler to encode these jumps into the different machine instructions is known as the {\tt branch displacement algorithm}. The optimisation problem consists in using as small an encoding as possible, thus minimising program length and execution time. difference made between branch instructions is by semantics, not by span. This means that a distinction is made between an unconditional branch and the several kinds of conditional branch, but not between their short, absolute or long variants. The algorithm used by the assembler to encode these branch instructions into the different machine instructions is known as the {\em branch displacement algorithm}. The optimisation problem consists in using as small an encoding as possible, thus minimising program length and execution time. This problem is known to be NP-complete~\cite{Robertson1979,Szymanski1978}, recently by Dickson~\cite{Dickson2008} for the x86 instruction set, is to use a fixed point algorithm that starts out with the shortest possible encoding (all jumps encoded as short jumps, which is very probably not a correct solution) and then iterates over the program to re-encode those jumps whose target is outside their range. branch instruction encoded as short jumps, which is very probably not a correct solution) and then iterates over the program to re-encode those branch instructions whose target is outside their range. \subsection*{Adding absolute jumps} Here, termination of the smallest fixed point algorithm is easy to prove. All jumps start out encoded as short jumps, which means that the distance between any jump and its target is as short as possible. If, in this situation, there is a jump $j$ whose span is not within the range for a short jump, we can be sure that we can never reach a situation where the span of $j$ is so small that it can be encoded as a short jump. This reasoning holds throughout the subsequent iterations of the algorithm: short jumps can change into long jumps, but not vice versa (spans only increase). Hence, the algorithm either terminates when a fixed point is reached or when all short jumps have been changed into long jumps. branch instructions start out encoded as short jumps, which means that the distance between any branch instruction and its target is as short as possible. If, in this situation, there is a branch instruction $b$ whose span is not within the range for a short jump, we can be sure that we can never reach a situation where the span of $j$ is so small that it can be encoded as a short jump. This argument continues to hold throughout the subsequent iterations of the algorithm: short jumps can change into long jumps, but not vice versa (spans only increase). Hence, the algorithm either terminates when a fixed point is reached or when all short jumps have been changed into long jumps. Also, we can be certain that we have reached an optimal solution: a short jump the absolute jump. The reason for this is that with absolute jumps, the encoding of a jump no longer depends only on the distance between the jump and its target: in order for an absolute jump to be possible, they need to be in the same segment (for the MCS-51, this means that the first 5 bytes of their addresses have to be equal). It is therefore entirely possible for two jumps with the same span to be encoded in different ways (absolute if the jump and its target are in the same segment, long if this is not the case). This invalidates the termination argument: a jump, once encoded as a long jump, can be re-encoded during a later iteration as an absolute jump. Consider the program shown in figure~\ref{f:term_example}. At the start of the first iteration, both the jump to {\tt X} and the jump to $\mathtt{L}_{0}$ are encoded as small jumps. Let us assume that in this case, the placement of $\mathtt{L}_{0}$ and the jump to it are such that $\mathtt{L}_{0}$ is just outside the segment that contains this jump. Let us also assume that the distance between $\mathtt{L}_{0}$ and the jump to it are too large for the jump to be encoded as a short jump. All this means that in the second iteration, the jump to $\mathtt{L}_{0}$ will be encoded as a long jump. If we assume that the jump to {\tt X} is encoded as a long jump as well, the size of the jump will increase and $\mathtt{L}_{0}$ will be propelled' into the same segment as its jump. Hence, in the third iteration, it can be encoded as an absolute jump. At first glance, there is nothing that prevents us from making a construction where two jumps interact in such a way as to keep switching between long and absolute encodings for an indefinite amount of iterations. The reason for this is that with absolute jumps, the encoding of a branch instruction no longer depends only on the distance between the branch instruction and its target: in order for an absolute jump to be possible, they need to be in the same segment (for the MCS-51, this means that the first 5 bytes of their addresses have to be equal). It is therefore entirely possible for two branch instructions with the same span to be encoded in different ways (absolute if the branch instruction and its target are in the same segment, long if this is not the case). This invalidates the termination argument: a branch instruction, once encoded as a long jump, can be re-encoded during a later iteration as an absolute jump. Consider the program shown in figure~\ref{f:term_example}. At the start of the first iteration, both the branch to {\tt X} and the branch to $\mathtt{L}_{0}$ are encoded as small jumps. Let us assume that in this case, the placement of $\mathtt{L}_{0}$ and the branch to it are such that $\mathtt{L}_{0}$ is just outside the segment that contains this branch. Let us also assume that the distance between $\mathtt{L}_{0}$ and the branch to it are too large for the branch instruction to be encoded as a short jump. All this means that in the second iteration, the branch to $\mathtt{L}_{0}$ will be encoded as a long jump. If we assume that the branch to {\tt X} is encoded as a long jump as well, the size of the branch instruction will increase and $\mathtt{L}_{0}$ will be propelled' into the same segment as its branch instruction, because every subsequent instruction will move one byte forward. Hence, in the third iteration, the branch to $\mathtt{L}_{0}$ can be encoded as an absolute jump. At first glance, there is nothing that prevents us from making a construction where two branch instructions interact in such a way as to keep switching between long and absolute encodings for an indefinite amount of iterations. \begin{figure}[h] problem is NP-complete. In this explanation, a condition for NP-completeness is the fact that programs be allowed to contain {\em pathological} jumps. These are jumps that can normally not be encoded as a short(er) jump, but gain this property when some other jumps are encoded as a long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: by encoding the first jump as a long jump, another jump switches from long to absolute (which is shorter). These are branch instructions that can normally not be encoded as a short(er) jump, but gain this property when some other branch instructions are encoded as a long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: by encoding the first branch instruction as a long jump, another branch instructions switches from long to absolute (which is shorter). The optimality argument no longer holds either. Let us consider the program $\mathtt{L}_{0}$ and $\mathtt{L}_{1}$ is such that if {\tt jmp X} is encoded as a short jump, there is a segment border just after $\mathtt{L}_{1}$. Let us also assume that the three jumps to $\mathtt{L}_{1}$ are all in the same us also assume that the three branches to $\mathtt{L}_{1}$ are all in the same segment, but far enough away from $\mathtt{L}_{1}$ that they cannot be encoded as short jumps. Then, if {\tt jmp X} were to be encoded as a short jump, which is clearly possible, all of the jumps to $\mathtt{L}_{1}$ would have to be encoded as possible, all of the branches to $\mathtt{L}_{1}$ would have to be encoded as long jumps. However, if {\tt jmp X} were to be encoded as a long jump, and therefore increase in size, $\mathtt{L}_{1}$ would be `propelled' across the segment border, so that the three jumps to $\mathtt{L}_{1}$ could be encoded segment border, so that the three branches to $\mathtt{L}_{1}$ could be encoded as absolute jumps. Depending on the relative sizes of long and absolute jumps, this solution might actually be smaller than the one reached by the smallest