\section{The problem}
The problem of branch displacement optimisation, also known as jump encoding, is
a well-known problem in assembler design.
In many instruction sets, amongst which the ubiquitous x86 architecture (both
its 32-- and 64--bit incarnations), there are instructions whose addressing
mode varies with the distance between the instruction and the object they
address. Mostly this occurs with jump instructions; for example, in the
x86-64 instruction set there are eleven different forms of the unconditional
jump instruction, all with different ranges and semantics (only six are valid
in 64-bit mode, for example). Some examples are shown in
figure~\ref{f:x86jumps}:
\begin{figure}[h]
\begin{tabular}{|l|l|l|}
\hline
Instruction & Size (bytes) & Displacement range \\
\hline
Short jump & 2 & -128 to 127 bytes \\
Relative near jump & 5 & $-2^{32}$ to $2^{32}-1$ bytes \\
Absolute near jump & 6 & one segment (64-bit address) \\
Far jump & 8 & entire memory \\
\hline
\end{tabular}
\caption{List of x86 jump instructions}
\label{f:x86jumps}
\end{figure}
The chosen target architecture of the CerCo project is the Intel MCS-51, which
features three types of jump instructions, as shown in
figure~\ref{f:mcs51jumps}.
\begin{figure}[h]
\begin{tabular}{|l|l|l|l|}
\hline
Instruction & Size & Execution time & Displacement range \\
& (bytes) & (cycles) & \\
\hline
SJMP (`short jump') & 2 & 2 & -128 to 127 bytes \\
AJMP (`medium jump') & 2 & 2 & one segment (11-bit address) \\
LJMP (`long jump') & 3 & 3 & entire memory \\
\hline
\end{tabular}
\caption{List of MCS-51 jump instructions}
\label{f:mcs51jumps}
\end{figure}
The conditional jump instruction is only available in short form, which
means that a conditional jump outside the short address range has to be
encoded using two jumps; the call instruction is only available in
medium and long forms.
Note that even though the MCS-51 architecture is much less advanced and more
simple than the x86-64 architecture, the basic types of jump remain the same:
a short jump with a limited range, an intra-segment jump and a jump that can
reach the entire available memory.
Generally, in assembly code, there is only one way to indicate a jump; the
algorithm used by the assembler to encode these jumps into the different
machine instructions is known as the {\tt branch displacement algorithm}. The
optimisation problem consists in using as small an encoding as possible, thus
minimising program length and execution time.
This problem is known to be NP-complete~\cite{Robertson1979,Szymanski1978},
which could make finding an optimal solution very time-consuming.
The canonical solution, as shown in~\cite{Szymanski1978} or more recently
in~\cite{Dickson2008} for the x86 instruction set, is to use a fixpoint
algorithm that starts out with the shortest possible encoding (all jumps
encoded as short jumps, which is very probably not a correct solution) and then
iterates over the program to re-encode those jumps whose target is outside
their range.
\subsection*{Adding medium jumps}
In both the canonical solutions presented, the encoding of a jump is only
dependent on the distance between the jump and its target: below a certain
value a short jump can be used; above this value the jump must be encoded
as a long jump.
Here, termination of the smallest fixpoint algorithm is easy to prove. All
jumps start out encoded as short jumps, which means that the distance between
any jump and its target is as short as possible. If we therefore need to encode
a jump $j$ as a long jump, we can be sure that we can never reach a situation
where the span of $j$ is so small that it can be encoded as a short jump. This
reasoning holds throughout the subsequent iterations of the algorithms: short
jumps can change into long jumps, but not vice versa. Hence, the algorithm
either terminates when a fixpoint is reached or when all short jumps have been
changed into long jumps.
Also, we can be certain that we have reached an optimal solution: a short jump
is only changed into a long jump if it is absolutely necessary.
However, neither of these claims (termination nor optimality) hold when we add
the medium jump.
The reason for this is that with medium jumps, the encoding of a jump no longer
depends only on the distance between the jump and its target: in order for a
medium jump to be possible, they need to be in the same segment. It is therefore
entirely possible for two jumps with the same span to be encoded in different
ways (medium if the jump and its destination are in the same segment, long if
this is not the case).
This invalidates the termination argument: a jump, once encoded as a long jump,
can be re-encoded during a later iteration as a medium jump. Consider the
program shown in figure~\ref{f:term_example}. At the start of the first
iteration, both the jump to {\tt X} and the jump to $\mathtt{L}_{0}$ are
encoded as small jumps. Let us assume that in this case, the placement of
$\mathtt{L}_{0}$ and the jump to it are such that $\mathtt{L}_{0}$ is just
outside the segment that contains this jump. Let us also assume that the
distance between $\mathtt{L}_{0}$ and the jump to it are too large for the jump
to be encoded as a short jump.
All this means that in the second iteration, the jump to $\mathtt{L}_{0}$ will
be encoded as a long jump. If we assume that the jump to {\tt X} is encoded as
a long jump as well, the size of the jump will increase and $\mathtt{L}_{0}$
will be `propelled' into the same segment as its jump. Hence, in the next
iteration, it can be encoded as a medium jump. At first glance, there is
nothing that prevents us from making a construction where two jumps interact
in such a way as to keep switching between long and medium indefinitely.
\begin{figure}[h]
\begin{alltt}
jmp X
\vdots
L\(\sb{0}\):
\vdots
jmp L\(\sb{0}\)
\end{alltt}
\caption{Example of a program where a long jump becomes medium}
\label{f:term_example}
\end{figure}
The optimality argument no longer holds either. Let us consider the program
shown in figure~\ref{f:opt_example}. Suppose that the distance between
$\mathtt{L}_{0}$ and $\mathtt{L}_{1}$ is such that if {\tt jmp X} is encoded
as a short jump, there is a segment border just after $\mathtt{L}_{1}$. Let
us also assume that the three jumps to $\mathtt{L}_{1}$ are all in the same
segment, but far enough away from $\mathtt{L}_{1}$ that they cannot be encoded
as short jumps.
Then, if {\tt jmp X} were to be encoded as a short jump, which is clearly
possible, all of the jumps to $\mathtt{L}_{1}$ would have to be encoded as
long jumps. However, if {\tt jmp X} were to be encoded as a long jump, and
therefore increase in size, $\mathtt{L}_{1}$ would be `propelled' across the
segment border, so that the three jumps to $\mathtt{L}_{1}$ could be encoded
as medium jumps. Depending on the relative sizes of long and medium jumps, this
solution might actually be smaller than the one reached by the smallest
fixpoint algorithm.
\begin{figure}[h]
\begin{alltt}
L\(\sb{0}\): jmp X
X:
\vdots
L\(\sb{1}\):
\vdots
jmp L\(\sb{1}\)
\vdots
jmp L\(\sb{1}\)
\vdots
jmp L\(\sb{1}\)
\vdots
\end{alltt}
\caption{Example of a program where the fixed-point algorithm is not optimal}
\label{f:opt_example}
\end{figure}