# Changeset 2096 for src/ASM/CPP2012-policy/problem.tex

Ignore:
Timestamp:
Jun 15, 2012, 3:25:21 PM (8 years ago)
Message:

Changes to the English for Jaap, and some tidying up and making consistent with the other CPP submission.

File:
1 edited

### Legend:

Unmodified
 r2091 fact that in many architecture sets, the encoding (and therefore size) of some instructions depends on the distance to their operand (the instruction 'span'). The branch displacement optimisation problem consists in encoding these The branch displacement optimisation problem consists of encoding these span-dependent instructions in such a way that the resulting program is as small as possible. This problem is the subject of the present paper. After introducing the problem in more detail, we will discuss the solutions used by other compilers, present the algorithm used by us in the CerCo assembler, and discuss its verification, that is the proofs of termination and correctness using the Matita theorem prover~\cite{Asperti2007}. the algorithm we use in the CerCo assembler, and discuss its verification, that is the proofs of termination and correctness using the Matita proof assistant~\cite{Asperti2007}. The research presented in this paper has been executed within the CerCo project target architecture for this project is the MCS-51, whose instruction set contains span-dependent instructions. Furthermore, its maximum addressable memory size is very small (65 Kbytes), which makes it important to generate memory size is very small (64 Kb), which makes it important to generate programs that are as small as possible. In most modern instruction sets that have them, the only span-dependent instructions are branch instructions. Taking the ubiquitous x86-64 instruction set as an example, we find that it contains are eleven different forms of the set as an example, we find that it contains eleven different forms of the unconditional branch instruction, all with different ranges, instruction sizes and semantics (only six are valid in 64-bit mode, for example). Some examples are shown in figure~\ref{f:x86jumps}. are shown in Figure~\ref{f:x86jumps}. \begin{figure}[h] The chosen target architecture of the CerCo project is the Intel MCS-51, which features three types of branch instructions (or jump instructions; the two terms are used interchangeably), as shown in figure~\ref{f:mcs51jumps}. are used interchangeably), as shown in Figure~\ref{f:mcs51jumps}. \begin{figure}[h] encoded using three branch instructions (for instructions whose logical negation is available, it can be done with two branch instructions, but for some instructions, this opposite is not available); the call instruction is some instructions this is not available); the call instruction is only available in absolute and long forms. Note that even though the MCS-51 architecture is much less advanced and more simple than the x86-64 architecture, the basic types of branch instruction Note that even though the MCS-51 architecture is much less advanced and simpler than the x86-64 architecture, the basic types of branch instruction remain the same: a short jump with a limited range, an intra-segment jump and a jump that can reach the entire available memory. Generally, in the code that is sent to the assembler as input, the only difference made between branch instructions is by semantics, not by span. This Generally, in code fed to the assembler as input, the only difference between branch instructions is semantics, not span. This means that a distinction is made between an unconditional branch and the several kinds of conditional branch, but not between their short, absolute or The algorithm used by the assembler to encode these branch instructions into the different machine instructions is known as the {\em branch displacement algorithm}. The optimisation problem consists in using as small an encoding as algorithm}. The optimisation problem consists of finding as small an encoding as possible, thus minimising program length and execution time. The canonical solution, as shown by Szymanski~\cite{Szymanski1978} or more recently by Dickson~\cite{Dickson2008} for the x86 instruction set, is to use a fixed point algorithm that starts out with the shortest possible encoding (all branch instruction encoded as short jumps, which is very probably not a correct fixed point algorithm that starts with the shortest possible encoding (all branch instruction encoded as short jumps, which is likely not a correct solution) and then iterates over the program to re-encode those branch instructions whose target is outside their range. situation where the span of $j$ is so small that it can be encoded as a short jump. This argument continues to hold throughout the subsequent iterations of the algorithm: short jumps can change into long jumps, but not vice versa (spans only increase). Hence, the algorithm either terminates when a fixed the algorithm: short jumps can change into long jumps, but not \emph{vice versa}, as spans only increase. Hence, the algorithm either terminates early when a fixed point is reached or when all short jumps have been changed into long jumps. However, neither of these claims (termination nor optimality) hold when we add the absolute jump. The reason for this is that with absolute jumps, the encoding of a branch the absolute jump, as with absolute jumps, the encoding of a branch instruction no longer depends only on the distance between the branch instruction and its target: in order for an absolute jump to be possible, they long if this is not the case). This invalidates the termination argument: a branch instruction, once encoded \begin{figure}[ht] \begin{alltt} jmp X \vdots L$$\sb{0}$$: \vdots jmp L$$\sb{0}$$ \end{alltt} \caption{Example of a program where a long jump becomes absolute} \label{f:term_example} \end{figure} This invalidates our earlier termination argument: a branch instruction, once encoded as a long jump, can be re-encoded during a later iteration as an absolute jump. Consider the program shown in figure~\ref{f:term_example}. At the start of the Consider the program shown in Figure~\ref{f:term_example}. At the start of the first iteration, both the branch to {\tt X} and the branch to $\mathtt{L}_{0}$ are encoded as small jumps. Let us assume that in this case, the placement of Hence, in the third iteration, the branch to $\mathtt{L}_{0}$ can be encoded as an absolute jump. At first glance, there is nothing that prevents us from making a construction where two branch instructions interact in such a way as to keep switching between long and absolute encodings for an indefinite amount of iterations. \begin{figure}[h] \begin{alltt} jmp X \vdots L$$\sb{0}$$: \vdots jmp L$$\sb{0}$$ \end{alltt} \caption{Example of a program where a long jump becomes absolute} \label{f:term_example} \end{figure} In fact, this situation mirrors the explanation by Szymanski~\cite{Szymanski1978} of why the branch displacement optimisation problem is NP-complete. In this explanation, a condition for NP-completeness is the fact that programs be allowed to contain {\em pathological} jumps. These are branch instructions that can normally not be encoded as a short(er) jump, but gain this property when some other branch instructions are encoded as a long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: by encoding the first branch instruction as a long jump, another branch instructions switches from long to absolute (which is shorter). The optimality argument no longer holds either. Let us consider the program shown in figure~\ref{f:opt_example}. Suppose that the distance between constructing a configuration where two branch instructions interact in such a way as to iterate indefinitely between long and absolute encodings. This situation mirrors the explanation by Szymanski~\cite{Szymanski1978} of why the branch displacement optimisation problem is NP-complete. In this explanation, a condition for NP-completeness is the fact that programs be allowed to contain {\em pathological} jumps. These are branch instructions that can normally not be encoded as a short(er) jump, but gain this property when some other branch instructions are encoded as a long(er) jump. This is exactly what happens in Figure~\ref{f:term_example}. By encoding the first branch instruction as a long jump, another branch instruction switches from long to absolute (which is shorter). In addition, our previous optimality argument no longer holds. Consider the program shown in Figure~\ref{f:opt_example}. Suppose that the distance between $\mathtt{L}_{0}$ and $\mathtt{L}_{1}$ is such that if {\tt jmp X} is encoded as a short jump, there is a segment border just after $\mathtt{L}_{1}$. Let segment, but far enough away from $\mathtt{L}_{1}$ that they cannot be encoded as short jumps. \begin{figure}[ht] \begin{alltt} L$$\sb{0}$$: jmp X X: \vdots L$$\sb{1}$$: \vdots jmp L$$\sb{1}$$ \vdots jmp L$$\sb{1}$$ \vdots jmp L$$\sb{1}$$ \vdots \end{alltt} \caption{Example of a program where the fixed-point algorithm is not optimal} \label{f:opt_example} \end{figure} Then, if {\tt jmp X} were to be encoded as a short jump, which is clearly this solution might actually be smaller than the one reached by the smallest fixed point algorithm. \begin{figure}[h] \begin{alltt} L$$\sb{0}$$: jmp X X: \vdots L$$\sb{1}$$: \vdots jmp L$$\sb{1}$$ \vdots jmp L$$\sb{1}$$ \vdots jmp L$$\sb{1}$$ \vdots \end{alltt} \caption{Example of a program where the fixed-point algorithm is not optimal} \label{f:opt_example} \end{figure}