Ignore:
Timestamp:
Jun 15, 2012, 1:35:46 PM (8 years ago)
Author:
boender
Message:
  • systematically changed 'jump' to 'branch'
File:
1 edited

Legend:

Unmodified
Added
Removed
  • src/ASM/CPP2012-policy/problem.tex

    r2086 r2091  
    22
    33The problem of branch displacement optimisation, also known as jump encoding, is
    4 a well-known problem in assembler design. It is caused by the fact that in
    5 many architecture sets, the encoding (and therefore size) of some instructions
    6 depends on the distance to their operand (the instruction 'span'). The branch
    7 displacement optimisation problem consists in encoding these span-dependent
    8 instructions in such a way that the resulting program is as small as possible.
     4a well-known problem in assembler design~\cite{Hyde2006}. It is caused by the
     5fact that in many architecture sets, the encoding (and therefore size) of some
     6instructions depends on the distance to their operand (the instruction 'span').
     7The branch displacement optimisation problem consists in encoding these
     8span-dependent instructions in such a way that the resulting program is as
     9small as possible.
    910
    1011This problem is the subject of the present paper. After introducing the problem
    1112in more detail, we will discuss the solutions used by other compilers, present
    1213the algorithm used by us in the CerCo assembler, and discuss its verification,
    13 that is the proofs of termination and correctness.
     14that is the proofs of termination and correctness using the Matita theorem
     15prover~\cite{Asperti2007}.
    1416 
    1517The research presented in this paper has been executed within the CerCo project
     
    2123
    2224With this optimisation, however, comes increased complexity and hence
    23 increased possibility for error. We must make sure that the jumps are encoded
    24 correctly, otherwise the assembled program will behave unpredictably.
     25increased possibility for error. We must make sure that the branch instructions
     26are encoded correctly, otherwise the assembled program will behave
     27unpredictably.
    2528
    2629\section{The branch displacement optimisation problem}
    2730
    28 In most modern instruction sets, the only span-dependent instructions are
    29 branch instructions. Taking the ubiquitous x86-64 instruction set as an
    30 example, we find that it contains are eleven different forms of the
    31 unconditional jump instruction, all with different ranges, instruction sizes
     31In most modern instruction sets that have them, the only span-dependent
     32instructions are branch instructions. Taking the ubiquitous x86-64 instruction
     33set as an example, we find that it contains are eleven different forms of the
     34unconditional branch instruction, all with different ranges, instruction sizes
    3235and semantics (only six are valid in 64-bit mode, for example). Some examples
    3336are shown in figure~\ref{f:x86jumps}.
     
    4649\end{tabular}
    4750\end{center}
    48 \caption{List of x86 jump instructions}
     51\caption{List of x86 branch instructions}
    4952\label{f:x86jumps}
    5053\end{figure}
     
    6770\end{tabular}
    6871\end{center}
    69 \caption{List of MCS-51 jump instructions}
     72\caption{List of MCS-51 branch instructions}
    7073\label{f:mcs51jumps}
    7174\end{figure}
    7275
    73 The conditional jump instruction is only available in short form, which
    74 means that a conditional jump outside the short address range has to be
    75 encoded using two jumps; the call instruction is only available in
    76 absolute and long forms.
     76Conditional branch instructions are only available in short form, which
     77means that a conditional branch outside the short address range has to be
     78encoded using three branch instructions (for instructions whose logical
     79negation is available, it can be done with two branch instructions, but for
     80some instructions, this opposite is not available); the call instruction is
     81only available in absolute and long forms.
    7782
    7883Note that even though the MCS-51 architecture is much less advanced and more
    79 simple than the x86-64 architecture, the basic types of jump remain the same:
    80 a short jump with a limited range, an intra-segment jump and a jump that can
    81 reach the entire available memory.
     84simple than the x86-64 architecture, the basic types of branch instruction
     85remain the same: a short jump with a limited range, an intra-segment jump and a
     86jump that can reach the entire available memory.
    8287 
    8388Generally, in the code that is sent to the assembler as input, the only
    84 difference made between jump instructions is by semantics, not by span. This
    85 means that a distinction is made between the unconditional jump and the several
    86 kinds of conditional jump, but not between their short, absolute or long
    87 variants.
    88 
    89 The algorithm used by the assembler to encode these jumps into the different
    90 machine instructions is known as the {\tt branch displacement algorithm}. The
    91 optimisation problem consists in using as small an encoding as possible, thus
    92 minimising program length and execution time.
     89difference made between branch instructions is by semantics, not by span. This
     90means that a distinction is made between an unconditional branch and the
     91several kinds of conditional branch, but not between their short, absolute or
     92long variants.
     93
     94The algorithm used by the assembler to encode these branch instructions into
     95the different machine instructions is known as the {\em branch displacement
     96algorithm}. The optimisation problem consists in using as small an encoding as
     97possible, thus minimising program length and execution time.
    9398
    9499This problem is known to be NP-complete~\cite{Robertson1979,Szymanski1978},
     
    98103recently by Dickson~\cite{Dickson2008} for the x86 instruction set, is to use a
    99104fixed point algorithm that starts out with the shortest possible encoding (all
    100 jumps encoded as short jumps, which is very probably not a correct solution)
    101 and then iterates over the program to re-encode those jumps whose target is
    102 outside their range.
     105branch instruction encoded as short jumps, which is very probably not a correct
     106solution) and then iterates over the program to re-encode those branch
     107instructions whose target is outside their range.
    103108
    104109\subsection*{Adding absolute jumps}
     
    109114
    110115Here, termination of the smallest fixed point algorithm is easy to prove. All
    111 jumps start out encoded as short jumps, which means that the distance between
    112 any jump and its target is as short as possible. If, in this situation, there
    113 is a jump $j$ whose span is not within the range for a short jump, we can be
    114 sure that we can never reach a situation where the span of $j$ is so small that
    115 it can be encoded as a short jump. This reasoning holds throughout the
    116 subsequent iterations of the algorithm: short jumps can change into long jumps,
    117 but not vice versa (spans only increase). Hence, the algorithm either
    118 terminates when a fixed point is reached or when all short jumps have been
    119 changed into long jumps.
     116branch instructions start out encoded as short jumps, which means that the
     117distance between any branch instruction and its target is as short as possible.
     118If, in this situation, there is a branch instruction $b$ whose span is not
     119within the range for a short jump, we can be sure that we can never reach a
     120situation where the span of $j$ is so small that it can be encoded as a short
     121jump. This argument continues to hold throughout the subsequent iterations of
     122the algorithm: short jumps can change into long jumps, but not vice versa
     123(spans only increase). Hence, the algorithm either terminates when a fixed
     124point is reached or when all short jumps have been changed into long jumps.
    120125
    121126Also, we can be certain that we have reached an optimal solution: a short jump
     
    125130the absolute jump.
    126131
    127 The reason for this is that with absolute jumps, the encoding of a jump no
    128 longer depends only on the distance between the jump and its target: in order
    129 for an absolute jump to be possible, they need to be in the same segment (for
    130 the MCS-51, this means that the first 5 bytes of their addresses have to be
    131 equal). It is therefore entirely possible for two jumps with the same span to
    132 be encoded in different ways (absolute if the jump and its target are in the
    133 same segment, long if this is not the case).
    134 
    135 This invalidates the termination argument: a jump, once encoded as a long jump,
    136 can be re-encoded during a later iteration as an absolute jump. Consider the
    137 program shown in figure~\ref{f:term_example}. At the start of the first
    138 iteration, both the jump to {\tt X} and the jump to $\mathtt{L}_{0}$ are
    139 encoded as small jumps. Let us assume that in this case, the placement of
    140 $\mathtt{L}_{0}$ and the jump to it are such that $\mathtt{L}_{0}$ is just
    141 outside the segment that contains this jump. Let us also assume that the
    142 distance between $\mathtt{L}_{0}$ and the jump to it are too large for the jump
    143 to be encoded as a short jump.
    144 
    145 All this means that in the second iteration, the jump to $\mathtt{L}_{0}$ will
    146 be encoded as a long jump. If we assume that the jump to {\tt X} is encoded as
    147 a long jump as well, the size of the jump will increase and $\mathtt{L}_{0}$
    148 will be `propelled' into the same segment as its jump. Hence, in the third
    149 iteration, it can be encoded as an absolute jump. At first glance, there is
    150 nothing that prevents us from making a construction where two jumps interact
    151 in such a way as to keep switching between long and absolute encodings for
    152 an indefinite amount of iterations.
     132The reason for this is that with absolute jumps, the encoding of a branch
     133instruction no longer depends only on the distance between the branch
     134instruction and its target: in order for an absolute jump to be possible, they
     135need to be in the same segment (for the MCS-51, this means that the first 5
     136bytes of their addresses have to be equal). It is therefore entirely possible
     137for two branch instructions with the same span to be encoded in different ways
     138(absolute if the branch instruction and its target are in the same segment,
     139long if this is not the case).
     140
     141This invalidates the termination argument: a branch instruction, once encoded
     142as a long jump, can be re-encoded during a later iteration as an absolute jump.
     143Consider the program shown in figure~\ref{f:term_example}. At the start of the
     144first iteration, both the branch to {\tt X} and the branch to $\mathtt{L}_{0}$
     145are encoded as small jumps. Let us assume that in this case, the placement of
     146$\mathtt{L}_{0}$ and the branch to it are such that $\mathtt{L}_{0}$ is just
     147outside the segment that contains this branch. Let us also assume that the
     148distance between $\mathtt{L}_{0}$ and the branch to it are too large for the
     149branch instruction to be encoded as a short jump.
     150
     151All this means that in the second iteration, the branch to $\mathtt{L}_{0}$ will
     152be encoded as a long jump. If we assume that the branch to {\tt X} is encoded as
     153a long jump as well, the size of the branch instruction will increase and
     154$\mathtt{L}_{0}$ will be `propelled' into the same segment as its branch
     155instruction, because every subsequent instruction will move one byte forward.
     156Hence, in the third iteration, the branch to $\mathtt{L}_{0}$ can be encoded as
     157an absolute jump. At first glance, there is nothing that prevents us from
     158making a construction where two branch instructions interact in such a way as
     159to keep switching between long and absolute encodings for an indefinite amount
     160of iterations.
    153161
    154162\begin{figure}[h]
     
    168176problem is NP-complete. In this explanation, a condition for NP-completeness
    169177is the fact that programs be allowed to contain {\em pathological} jumps.
    170 These are jumps that can normally not be encoded as a short(er) jump, but gain
    171 this property when some other jumps are encoded as a long(er) jump. This is
    172 exactly what happens in figure~\ref{f:term_example}: by encoding the first
    173 jump as a long jump, another jump switches from long to absolute
    174 (which is shorter).
     178These are branch instructions that can normally not be encoded as a short(er)
     179jump, but gain this property when some other branch instructions are encoded as
     180a long(er) jump. This is exactly what happens in figure~\ref{f:term_example}:
     181by encoding the first branch instruction as a long jump, another branch
     182instructions switches from long to absolute (which is shorter).
    175183
    176184The optimality argument no longer holds either. Let us consider the program
     
    178186$\mathtt{L}_{0}$ and $\mathtt{L}_{1}$ is such that if {\tt jmp X} is encoded
    179187as a short jump, there is a segment border just after $\mathtt{L}_{1}$. Let
    180 us also assume that the three jumps to $\mathtt{L}_{1}$ are all in the same
     188us also assume that the three branches to $\mathtt{L}_{1}$ are all in the same
    181189segment, but far enough away from $\mathtt{L}_{1}$ that they cannot be encoded
    182190as short jumps.
    183191
    184192Then, if {\tt jmp X} were to be encoded as a short jump, which is clearly
    185 possible, all of the jumps to $\mathtt{L}_{1}$ would have to be encoded as
     193possible, all of the branches to $\mathtt{L}_{1}$ would have to be encoded as
    186194long jumps. However, if {\tt jmp X} were to be encoded as a long jump, and
    187195therefore increase in size, $\mathtt{L}_{1}$ would be `propelled' across the
    188 segment border, so that the three jumps to $\mathtt{L}_{1}$ could be encoded
     196segment border, so that the three branches to $\mathtt{L}_{1}$ could be encoded
    189197as absolute jumps. Depending on the relative sizes of long and absolute jumps,
    190198this solution might actually be smaller than the one reached by the smallest
Note: See TracChangeset for help on using the changeset viewer.