# Changeset 2085 for src/ASM/CPP2012-policy/problem.tex

Ignore:
Timestamp:
Jun 15, 2012, 11:36:47 AM (9 years ago)
Message:
• rewrote introduction
• changed 'medium' to 'absolute'
• added a bit to conclusion (CompCert?, Piton, ...)
File:
1 edited

Unmodified
Added
Removed
• ## src/ASM/CPP2012-policy/problem.tex

 r2064 The problem of branch displacement optimisation, also known as jump encoding, is a well-known problem in assembler design. In many instruction sets, amongst which the ubiquitous x86 architecture (both its 32-- and 64--bit incarnations), there are instructions whose addressing mode varies with the distance between the instruction and the object they address. Mostly this occurs with jump instructions; for example, in the x86-64 instruction set there are eleven different forms of the unconditional jump instruction, all with different ranges, instruction sizes and semantics (only six are valid in 64-bit mode, for example). Some examples are shown in figure~\ref{f:x86jumps}: \begin{figure}[h] a well-known problem in assembler design. It is caused by the fact that in many architecture sets, the encoding (and therefore size) of some instructions depends on the distance to their operand (the instruction 'span'). The branch displacement optimisation problem consists in encoding these span-dependent instructions in such a way that the resulting program is as small as possible. This problem is the subject of the present paper. After introducing the problem in more detail, we will discuss the solutions used by other compilers, present the algorithm used by us in the CerCo assembler, and discuss its verification, that is the proofs of termination and correctness. The research presented in this paper has been executed within the CerCo project which aims at formally verifying a C compiler with cost annotations. The target architecture for this project is the MCS-51, whose instruction set contains span-dependent instructions. Furthermore, its maximum addressable memory size is very small (65 Kbytes), which makes it important to generate programs that are as small as possible. With this optimisation, however, comes increased complexity and hence increased possibility for error. We must make sure that the jumps are encoded correctly, otherwise the assembled program will behave impredictably. \section{The branch displacement optimisation problem} In most modern instruction sets, the only span-dependent instructions are branch instructions. Taking the ubiquitous x86-64 instruction set as an example, we find that it contains are eleven different forms of the unconditional jump instruction, all with different ranges, instruction sizes and semantics (only six are valid in 64-bit mode, for example). Some examples are shown in figure~\ref{f:x86jumps}. \begin{figure}[h] \begin{center} \begin{tabular}{|l|l|l|} \hline \hline \end{tabular} \end{center} \caption{List of x86 jump instructions} \label{f:x86jumps} The chosen target architecture of the CerCo project is the Intel MCS-51, which features three types of jump instructions, as shown in figure~\ref{f:mcs51jumps}. \begin{figure}[h] features three types of branch instructions (or jump instructions; the two terms are used interchangeably), as shown in figure~\ref{f:mcs51jumps}. \begin{figure}[h] \begin{center} \begin{tabular}{|l|l|l|l|} \hline \hline SJMP (short jump') & 2 & 2 & -128 to 127 bytes \\ AJMP (medium jump') & 2 & 2 & one segment (11-bit address) \\ AJMP (absolute jump') & 2 & 2 & one segment (11-bit address) \\ LJMP (long jump') & 3 & 3 & entire memory \\ \hline \end{tabular} \end{center} \caption{List of MCS-51 jump instructions} \label{f:mcs51jumps} means that a conditional jump outside the short address range has to be encoded using two jumps; the call instruction is only available in medium and long forms. absolute and long forms. Note that even though the MCS-51 architecture is much less advanced and more reach the entire available memory. Generally, in assembly code, there is only one way to indicate a jump; the algorithm used by the assembler to encode these jumps into the different Generally, in the code that is sent to the assembler as input, the only difference made between jump instructions is by semantics, not by span. This means that a distinction is made between the inconditional jump and the several kinds of conditional jump, but not between their short, absolute or long variants. The algorithm used by the assembler to encode these jumps into the different machine instructions is known as the {\tt branch displacement algorithm}. The optimisation problem consists in using as small an encoding as possible, thus which could make finding an optimal solution very time-consuming. The canonical solution, as shown in~\cite{Szymanski1978} or more recently in~\cite{Dickson2008} for the x86 instruction set, is to use a fixed point algorithm that starts out with the shortest possible encoding (all jumps encoded as short jumps, which is very probably not a correct solution) and then iterates over the program to re-encode those jumps whose target is outside their range. \subsection*{Adding medium jumps} The canonical solution, as shown by Szymanski~\cite{Szymanski1978} or more recently by Dickson~\cite{Dickson2008} for the x86 instruction set, is to use a fixed point algorithm that starts out with the shortest possible encoding (all jumps encoded as short jumps, which is very probably not a correct solution) and then iterates over the program to re-encode those jumps whose target is outside their range. \subsection*{Adding absolute jumps} In both papers mentioned above, the encoding of a jump is only dependent on the Here, termination of the smallest fixed point algorithm is easy to prove. All jumps start out encoded as short jumps, which means that the distance between any jump and its target is as short as possible. If we therefore need to encode a jump $j$ as a long jump, we can be sure that we can never reach a situation where the span of $j$ is so small that it can be encoded as a short jump. This reasoning holds throughout the subsequent iterations of the algorithms: short jumps can change into long jumps, but not vice versa. Hence, the algorithm either terminates when a fixed point is reached or when all short jumps have been changed into long jumps. any jump and its target is as short as possible. If, in this situation, there is a jump $j$ whose span is not within the range for a short jump, we can be sure that we can never reach a situation where the span of $j$ is so small that it can be encoded as a short jump. This reasoning holds throughout the subsequent iterations of the algorithm: short jumps can change into long jumps, but not vice versa (spans only increase). Hence, the algorithm either terminates when a fixed point is reached or when all short jumps have been changed into long jumps. Also, we can be certain that we have reached an optimal solution: a short jump However, neither of these claims (termination nor optimality) hold when we add the medium jump. The reason for this is that with medium jumps, the encoding of a jump no longer depends only on the distance between the jump and its target: in order for a medium jump to be possible, they need to be in the same segment. It is therefore entirely possible for two jumps with the same span to be encoded in different ways (medium if the jump and its destination are in the same segment, long if this is not the case). the absolute jump. The reason for this is that with absolute jumps, the encoding of a jump no longer depends only on the distance between the jump and its target: in order for an absolute jump to be possible, they need to be in the same segment (for the MCS-51, this means that the first 5 bytes of their addesses have to be equal). It is therefore entirely possible for two jumps with the same span to be encoded in different ways (absolute if the jump and its target are in the same segment, long if this is not the case). This invalidates the termination argument: a jump, once encoded as a long jump, can be re-encoded during a later iteration as a medium jump. Consider the can be re-encoded during a later iteration as an absolute jump. Consider the program shown in figure~\ref{f:term_example}. At the start of the first iteration, both the jump to {\tt X} and the jump to $\mathtt{L}_{0}$ are be encoded as a long jump. If we assume that the jump to {\tt X} is encoded as a long jump as well, the size of the jump will increase and $\mathtt{L}_{0}$ will be propelled' into the same segment as its jump. Hence, in the next iteration, it can be encoded as a medium jump. At first glance, there is will be propelled' into the same segment as its jump. Hence, in the third iteration, it can be encoded as an absolute jump. At first glance, there is nothing that prevents us from making a construction where two jumps interact in such a way as to keep switching between long and medium indefinitely. in such a way as to keep switching between long and absolute encodings for an indefinite amount of iterations. \begin{figure}[h] jmp L$$\sb{0}$$ \end{alltt} \caption{Example of a program where a long jump becomes medium} \caption{Example of a program where a long jump becomes absolute} \label{f:term_example} \end{figure} In fact, this situation mirrors the explanation by Szymanski~\cite{Szymanski1978} of why the branch displacement optimisation problem is NP-complete. In this explanation, a condition for NP-completeness is the fact that programs be allowed to contain {\em pathological} jumps. These are jumps that can normally not be encoded as a short(er) jump, but gain this property when some other jumps are encoded as a long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: by encoding the first jump as a long jump, another jump switches from long to absolute (which is shorter). The optimality argument no longer holds either. Let us consider the program therefore increase in size, $\mathtt{L}_{1}$ would be `propelled' across the segment border, so that the three jumps to $\mathtt{L}_{1}$ could be encoded as medium jumps. Depending on the relative sizes of long and medium jumps, this solution might actually be smaller than the one reached by the smallest as absolute jumps. Depending on the relative sizes of long and absolute jumps, this solution might actually be smaller than the one reached by the smallest fixed point algorithm.
Note: See TracChangeset for help on using the changeset viewer.