Changeset 2085


Ignore:
Timestamp:
Jun 15, 2012, 11:36:47 AM (5 years ago)
Author:
boender
Message:
  • rewrote introduction
  • changed 'medium' to 'absolute'
  • added a bit to conclusion (CompCert?, Piton, ...)
Location:
src/ASM/CPP2012-policy
Files:
5 edited

Legend:

Unmodified
Added
Removed
  • src/ASM/CPP2012-policy/algorithm.tex

    r2080 r2085  
    3333
    3434In the CerCo assembler, we opted at first for a least fixed point algorithm,
    35 taking medium jumps into account.
     35taking absolute jumps into account.
    3636
    37 Here, we ran into a problem with proving termination: whereas the SDCC
    38 algorithm only switches jumps from short to long, when we add medium jumps,
    39 it is theoretically possible for a jump to switch from medium to long and back,
    40 as explained in the previous section.
     37Here, we ran into a problem with proving termination, as explained in the
     38previous section: if we only take short and long jumps into account, the jump
     39encoding can only switch from short to long, but never in the other direction.
     40When we add absolute jumps, however, it is theoretically possible for a jump to
     41switch from absolute to long and back, as explained in the previous section.
    4142
    4243Proving termination then becomes difficult, because there is nothing that
    43 precludes a jump switching back and forth between medium and long indefinitely.
     44precludes a jump from switching back and forth between absolute and long
     45indefinitely.
    4446
    45 In fact, this mirrors the argument from~\cite{Szymanski1978}. There, it is
    46 argued that for the problem to be NP-complete, it must be allowed to contain
    47 {\em pathological} jumps. These are jumps that can normally not be encoded as a
    48 short(er) jump, but gain this property when some other jumps are encoded as a
    49 long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: by
    50 encoding the first jump as a long jump, another jump switches from long to
    51 medium (which is shorter).
    52 
    53 In order to keep the algorithm linear and more easily prove termination, we
    54 decided to explicitly enforce the `jumps must always increase' requirement: if
    55 a jump is encoded as a long jump in one step, it will also be encoded as a
    56 long jump in all the following steps. This means that any jump can change at
    57 maximum two times: once from short to medium (or long), and once from medium
    58 to long.
     47In order to keep the algorithm in the same complexity class and more easily
     48prove termination, we decided to explicitly enforce the `jumps must always
     49increase' requirement: if a jump is encoded as a long jump in one step, it will
     50also be encoded as a long jump in all the following steps. This means that any
     51jump can change at most two times: once from short to absolute (or long), and
     52once from absolute to long.
    5953
    6054There is one complicating factor: suppose that a jump is encoded in step $n$
    61 as a medium jump, but in step $n+1$ it is determined that (because of changes
     55as an absolute jump, but in step $n+1$ it is determined that (because of changes
    6256elsewhere) it can now be encoded as a short jump. Due to the requirement that
    63 jumps must always increase, this means that the jump will be encoded as a
    64 medium jump in step $n+1$ as well.
     57jumps must always increase, this means that the jump will be encoded as an
     58absolute jump in step $n+1$ as well.
    6559
    6660This is not necessarily correct, however: it is not the case that any short
    67 jump can correctly be encoded as a medium jump (a short jump can bridge
    68 segments, whereas a medium jump cannot). Therefore, in this situation
     61jump can correctly be encoded as an absolute jump (a short jump can bridge
     62segments, whereas an absolute jump cannot). Therefore, in this situation
    6963we decide to encode the jump as a long jump, which is always correct.
    7064
    7165The resulting algorithm, while not optimal, is at least as good as the ones
    72 from {\tt gcc} and SDCC, and potentially better. Its complexity remains
    73 linear (though with a higher constant than SDCC).
     66from SDCC and {\tt gcc}, and potentially better. Its complexity remains
     67the same (there are at most $2n$ iterations, where $n$ is the number of jumps
     68in the program).
    7469
    7570\subsection{The algorithm in detail}
     
    8176
    8277The original intention was to have two different functions, one function
    83 $\mathtt{policy}: \mathbb{N} \rightarrow \{\mathtt{short}, \mathtt{medium},
    84 \mathtt{long}\}$ to associate jumps to their intended translation, and a
    85 function $\sigma: \mathbb{N} \rightarrow \mathtt{Word}$ to associate
    86 pseudo-addresses to actual addresses. $\sigma$ would use $\mathtt{policy}$ to
    87 determine the size of jump instructions.
     78$\mathtt{policy}: \mathbb{N} \rightarrow \{\mathtt{short\_jump},
     79\mathtt{absolute\_jump}, \mathtt{long\_jump}\}$ to associate jumps to their
     80intended encoding, and a function $\sigma: \mathbb{N} \rightarrow
     81\mathtt{Word}$ to associate pseudo-addresses to actual addresses. $\sigma$
     82would use $\mathtt{policy}$ to determine the size of jump instructions.
    8883
    8984This turned out to be suboptimal from the algorithmic point of view and
     
    107102associates a pseudo-address to an actual address. The boolean denotes a forced
    108103long jump; as noted in the previous section, if during the fixed point
    109 computation a medium jump needs to be re-encoded as a short jump, the result
    110 is actually a long jump. It might therefore be the case that jumps are encoded
    111 as long jumps without this actually being necessary.
     104computation an absolute jump changes to be potentially re-encoded as a short
     105jump, the result is actually a long jump. It might therefore be the case that
     106jumps are encoded as long jumps without this actually being necessary, and this
     107information needs to be passed to the code generating function.
    112108
    113109The assembler function encodes the jumps by checking the distance between
    114 source and destination according to $\sigma$, so it could select a medium
     110source and destination according to $\sigma$, so it could select an absolute
    115111jump in a situation where there should be a long jump. The boolean is there
    116112to prevent this from happening by indicating the locations where a long jump
     
    168164$\sigma$ function: they are of type
    169165$\sigma: \mathbb{N} \rightarrow \mathbb{N} \times \{\mathtt{short\_jump},
    170 \mathtt{medium\_jump},\mathtt{long\_jump}\}$; a function that associates a
     166\mathtt{absolute\_jump},\mathtt{long\_jump}\}$; a function that associates a
    171167pseudo-address with an memory address and a jump length. We do this to be able
    172168to more easily compare the jump lengths between iterations. In the algorithm,
  • src/ASM/CPP2012-policy/biblio.bib

    r2080 r2085  
    6767        year = {2012}
    6868}
     69
     70@article {Leroy2009,
     71   author = {Leroy, Xavier},
     72   affiliation = {INRIA Paris-Rocquencourt B.P. 105 78153 Le Chesnay France},
     73   title = {A Formally Verified Compiler Back-end},
     74   journal = {Journal of Automated Reasoning},
     75   publisher = {Springer Netherlands},
     76   issn = {0168-7433},
     77   keyword = {Computer Science},
     78   pages = {363-446},
     79   volume = {43},
     80   issue = {4},
     81   url = {http://dx.doi.org/10.1007/s10817-009-9155-4},
     82   note = {10.1007/s10817-009-9155-4},
     83   year = {2009}
     84}
     85
     86@book {Moore1996,
     87  author = {J Strother Moore},
     88  title = {Piton: A mechanically verified assembly language},
     89  series = {Automated Reasoning Series},
     90  volume = {3},
     91  isbn = {978-0-7923-3920-5},
     92  publisher = {Springer},
     93  year = {1996}
     94}
     95
  • src/ASM/CPP2012-policy/conclusion.tex

    r2084 r2085  
    55termination and correctness for this algorithm, as formalised in Matita.
    66
     7The algorithm we have presented is fast and correct, but not optimal; a true
     8optimal solution would need techniques like constraint solvers. While outside
     9the scope of the present research, it would be interesting to see if enough
     10heuristics could be find to make such a solution practical for implementing
     11in an existent compiler; this would be especially useful for embedded systems,
     12where it is important to have a small solution as possible.
     13
     14This algorithm is part of a greater whole, the CerCo project, which aims at
     15the complete formalisation and verification of a compiler. More information
     16on the formalisation of the assembler, of which the present work is a part,
     17can be found in a companion publication.
     18
    719\subsection{Related work}
    820
     
    1022displacement algorithm.
    1123
    12 (Piton? CompCert?)
     24The CompCert project is another project aimed at formally verifying a compiler.
     25Their backend~\cite{Leroy2009} generates assembly for (amongst others) the
     26PowerPC and x86 (32-bit) architectures, but does not include a branch
     27displacement algorithm; at least for the x86 architecture, all jumps are
     28encoded as long jumps.
    1329
     30There is also Piton~\cite{Moore1996}, which not only includes the
     31formal verification of a compiler, but also of the machine architecture
     32targeted by that compiler. However, this architecture does not have different
     33jump sizes (branching is simulated by assigning values to the program counter),
     34so the branch displacement problem does not occur.
     35 
    1436\subsection{Formal development}
    1537
  • src/ASM/CPP2012-policy/problem.tex

    r2064 r2085  
    22
    33The problem of branch displacement optimisation, also known as jump encoding, is
    4 a well-known problem in assembler design.
    5 
    6 In many instruction sets, amongst which the ubiquitous x86 architecture (both
    7 its 32-- and 64--bit incarnations), there are instructions whose addressing
    8 mode varies with the distance between the instruction and the object they
    9 address. Mostly this occurs with jump instructions; for example, in the
    10 x86-64 instruction set there are eleven different forms of the unconditional
    11 jump instruction, all with different ranges, instruction sizes and semantics
    12 (only six are valid in 64-bit mode, for example). Some examples are shown in
    13 figure~\ref{f:x86jumps}:
    14 
    15 \begin{figure}[h]
     4a well-known problem in assembler design. It is caused by the fact that in
     5many architecture sets, the encoding (and therefore size) of some instructions
     6depends on the distance to their operand (the instruction 'span'). The branch
     7displacement optimisation problem consists in encoding these span-dependent
     8instructions in such a way that the resulting program is as small as possible.
     9
     10This problem is the subject of the present paper. After introducing the problem
     11in more detail, we will discuss the solutions used by other compilers, present
     12the algorithm used by us in the CerCo assembler, and discuss its verification,
     13that is the proofs of termination and correctness.
     14 
     15The research presented in this paper has been executed within the CerCo project
     16which aims at formally verifying a C compiler with cost annotations. The
     17target architecture for this project is the MCS-51, whose instruction set
     18contains span-dependent instructions. Furthermore, its maximum addressable
     19memory size is very small (65 Kbytes), which makes it important to generate
     20programs that are as small as possible.
     21
     22With this optimisation, however, comes increased complexity and hence
     23increased possibility for error. We must make sure that the jumps are encoded
     24correctly, otherwise the assembled program will behave impredictably.
     25
     26\section{The branch displacement optimisation problem}
     27
     28In most modern instruction sets, the only span-dependent instructions are
     29branch instructions. Taking the ubiquitous x86-64 instruction set as an
     30example, we find that it contains are eleven different forms of the
     31unconditional jump instruction, all with different ranges, instruction sizes
     32and semantics (only six are valid in 64-bit mode, for example). Some examples
     33are shown in figure~\ref{f:x86jumps}.
     34
     35\begin{figure}[h]
     36\begin{center}
    1637\begin{tabular}{|l|l|l|}
    1738\hline
     
    2445\hline
    2546\end{tabular}
     47\end{center}
    2648\caption{List of x86 jump instructions}
    2749\label{f:x86jumps}
     
    2951
    3052The chosen target architecture of the CerCo project is the Intel MCS-51, which
    31 features three types of jump instructions, as shown in
    32 figure~\ref{f:mcs51jumps}.
    33 
    34 \begin{figure}[h]
     53features three types of branch instructions (or jump instructions; the two terms
     54are used interchangeably), as shown in figure~\ref{f:mcs51jumps}.
     55
     56\begin{figure}[h]
     57\begin{center}
    3558\begin{tabular}{|l|l|l|l|}
    3659\hline
     
    3962\hline
    4063SJMP (`short jump') & 2 & 2 & -128 to 127 bytes \\
    41 AJMP (`medium jump') & 2 & 2 & one segment (11-bit address) \\
     64AJMP (`absolute jump') & 2 & 2 & one segment (11-bit address) \\
    4265LJMP (`long jump') & 3 & 3 & entire memory \\
    4366\hline
    4467\end{tabular}
     68\end{center}
    4569\caption{List of MCS-51 jump instructions}
    4670\label{f:mcs51jumps}
     
    5074means that a conditional jump outside the short address range has to be
    5175encoded using two jumps; the call instruction is only available in
    52 medium and long forms.
     76absolute and long forms.
    5377
    5478Note that even though the MCS-51 architecture is much less advanced and more
     
    5781reach the entire available memory.
    5882 
    59 Generally, in assembly code, there is only one way to indicate a jump; the
    60 algorithm used by the assembler to encode these jumps into the different
     83Generally, in the code that is sent to the assembler as input, the only
     84difference made between jump instructions is by semantics, not by span. This
     85means that a distinction is made between the inconditional jump and the several
     86kinds of conditional jump, but not between their short, absolute or long
     87variants.
     88
     89The algorithm used by the assembler to encode these jumps into the different
    6190machine instructions is known as the {\tt branch displacement algorithm}. The
    6291optimisation problem consists in using as small an encoding as possible, thus
     
    6695which could make finding an optimal solution very time-consuming.
    6796
    68 The canonical solution, as shown in~\cite{Szymanski1978} or more recently
    69 in~\cite{Dickson2008} for the x86 instruction set, is to use a fixed point
    70 algorithm that starts out with the shortest possible encoding (all jumps
    71 encoded as short jumps, which is very probably not a correct solution) and then
    72 iterates over the program to re-encode those jumps whose target is outside
    73 their range.
    74 
    75 \subsection*{Adding medium jumps}
     97The canonical solution, as shown by Szymanski~\cite{Szymanski1978} or more
     98recently by Dickson~\cite{Dickson2008} for the x86 instruction set, is to use a
     99fixed point algorithm that starts out with the shortest possible encoding (all
     100jumps encoded as short jumps, which is very probably not a correct solution)
     101and then iterates over the program to re-encode those jumps whose target is
     102outside their range.
     103
     104\subsection*{Adding absolute jumps}
    76105
    77106In both papers mentioned above, the encoding of a jump is only dependent on the
     
    81110Here, termination of the smallest fixed point algorithm is easy to prove. All
    82111jumps start out encoded as short jumps, which means that the distance between
    83 any jump and its target is as short as possible. If we therefore need to encode
    84 a jump $j$ as a long jump, we can be sure that we can never reach a situation
    85 where the span of $j$ is so small that it can be encoded as a short jump. This
    86 reasoning holds throughout the subsequent iterations of the algorithms: short
    87 jumps can change into long jumps, but not vice versa. Hence, the algorithm
    88 either terminates when a fixed point is reached or when all short jumps have
    89 been changed into long jumps.
     112any jump and its target is as short as possible. If, in this situation, there
     113is a jump $j$ whose span is not within the range for a short jump, we can be
     114sure that we can never reach a situation where the span of $j$ is so small that
     115it can be encoded as a short jump. This reasoning holds throughout the
     116subsequent iterations of the algorithm: short jumps can change into long jumps,
     117but not vice versa (spans only increase). Hence, the algorithm either
     118terminates when a fixed point is reached or when all short jumps have been
     119changed into long jumps.
    90120
    91121Also, we can be certain that we have reached an optimal solution: a short jump
     
    93123
    94124However, neither of these claims (termination nor optimality) hold when we add
    95 the medium jump.
    96 
    97 The reason for this is that with medium jumps, the encoding of a jump no longer
    98 depends only on the distance between the jump and its target: in order for a
    99 medium jump to be possible, they need to be in the same segment. It is therefore
    100 entirely possible for two jumps with the same span to be encoded in different
    101 ways (medium if the jump and its destination are in the same segment, long if
    102 this is not the case).
     125the absolute jump.
     126
     127The reason for this is that with absolute jumps, the encoding of a jump no
     128longer depends only on the distance between the jump and its target: in order
     129for an absolute jump to be possible, they need to be in the same segment (for
     130the MCS-51, this means that the first 5 bytes of their addesses have to be
     131equal). It is therefore entirely possible for two jumps with the same span to
     132be encoded in different ways (absolute if the jump and its target are in the
     133same segment, long if this is not the case).
    103134
    104135This invalidates the termination argument: a jump, once encoded as a long jump,
    105 can be re-encoded during a later iteration as a medium jump. Consider the
     136can be re-encoded during a later iteration as an absolute jump. Consider the
    106137program shown in figure~\ref{f:term_example}. At the start of the first
    107138iteration, both the jump to {\tt X} and the jump to $\mathtt{L}_{0}$ are
     
    115146be encoded as a long jump. If we assume that the jump to {\tt X} is encoded as
    116147a long jump as well, the size of the jump will increase and $\mathtt{L}_{0}$
    117 will be `propelled' into the same segment as its jump. Hence, in the next
    118 iteration, it can be encoded as a medium jump. At first glance, there is
     148will be `propelled' into the same segment as its jump. Hence, in the third
     149iteration, it can be encoded as an absolute jump. At first glance, there is
    119150nothing that prevents us from making a construction where two jumps interact
    120 in such a way as to keep switching between long and medium indefinitely.
     151in such a way as to keep switching between long and absolute encodings for
     152an indefinite amount of iterations.
    121153
    122154\begin{figure}[h]
     
    128160    jmp L\(\sb{0}\)
    129161\end{alltt}
    130 \caption{Example of a program where a long jump becomes medium}
     162\caption{Example of a program where a long jump becomes absolute}
    131163\label{f:term_example}
    132164\end{figure}
     165
     166In fact, this situation mirrors the explanation by
     167Szymanski~\cite{Szymanski1978} of why the branch displacement optimisation
     168problem is NP-complete. In this explanation, a condition for NP-completeness
     169is the fact that programs be allowed to contain {\em pathological} jumps.
     170These are jumps that can normally not be encoded as a short(er) jump, but gain
     171this property when some other jumps are encoded as a long(er) jump. This is
     172exactly what happens in figure~\ref{f:term_example}: by encoding the first
     173jump as a long jump, another jump switches from long to absolute
     174(which is shorter).
    133175
    134176The optimality argument no longer holds either. Let us consider the program
     
    145187therefore increase in size, $\mathtt{L}_{1}$ would be `propelled' across the
    146188segment border, so that the three jumps to $\mathtt{L}_{1}$ could be encoded
    147 as medium jumps. Depending on the relative sizes of long and medium jumps, this
    148 solution might actually be smaller than the one reached by the smallest
     189as absolute jumps. Depending on the relative sizes of long and absolute jumps,
     190this solution might actually be smaller than the one reached by the smallest
    149191fixed point algorithm.
    150192
  • src/ASM/CPP2012-policy/proof.tex

    r2084 r2085  
    55
    66The main correctness statement is as follows:
     7
     8\clearpage
    79
    810\begin{lstlisting}
     
    5557$2n$ the number of jumps in the program. This worst case is reached if at every
    5658iteration, we change the encoding of exactly one jump; since a jump can change
    57 from {\tt short} to {\tt medium} and from {\tt medium} to {\tt long}, there
    58 can be at most $2n$ changes.
     59from short to absolute and from absolute to long, there can be at most $2n$
     60changes.
    5961
    6062This proof has been executed in the ``Russell'' style from~\cite{Sozeau2006}.
     
    114116iteration, and $p$ the current one), jump lengths either remain equal or
    115117increase. It is needed for proving termination.
     118
     119\clearpage
    116120
    117121\begin{lstlisting}
     
    160164    [ short_jump $\Rightarrow$ $\neg$is_call instr $\wedge$
    161165       \fst (short_jump_cond pc_plus_jmp_length addr) = true
    162     | medium_jump $\Rightarrow$ $\neg$is_relative_jump instr $\wedge$
    163        \fst (medium_jump_cond pc_plus_jmp_length addr) = true $\wedge$
     166    | absolute_jump $\Rightarrow$ $\neg$is_relative_jump instr $\wedge$
     167       \fst (absolute_jump_cond pc_plus_jmp_length addr) = true $\wedge$
    164168       \fst (short_jump_cond pc_plus_jmp_length addr) = false
    165169    | long_jump $\Rightarrow$ \fst (short_jump_cond pc_plus_jmp_length addr) = false
    166        $\wedge$ \fst (medium_jump_cond pc_plus_jmp_length addr) = false
     170       $\wedge$ \fst (absolute_jump_cond pc_plus_jmp_length addr) = false
    167171    ].
    168172\end{lstlisting}
     
    200204We need to use two different formulations, because the fact that $added$ is 0
    201205does not guarantee that no jumps have changed: it is possible that we have
    202 replaced a short jump with a medium jump, which does not change the size.
     206replaced a short jump with a absolute jump, which does not change the size.
    203207
    204208Therefore {\tt policy\_pc\_equal} states that $old\_sigma_1(x) = sigma_1(x)$,
     
    245249
    246250\begin{lstlisting}
    247 definition sigma_compact: list labelled_instruction → label_map → ppc_pc_map → Prop :=
    248  λprogram.λlabels.λsigma.
     251definition sigma_compact :=
     252 λprogram:list labelled_instruction.λlabels:label_map.λsigma:ppc_pc_map.
    249253 ∀n.n < |program| →
    250254  match bvt_lookup_opt $\ldots$ (bitvector_of_nat ? n) (\snd sigma) with
Note: See TracChangeset for help on using the changeset viewer.