source:src/ASM/CPP2012-policy/algorithm.tex@2064

Last change on this file since 2064 was 2064, checked in by boender, 8 years ago
• more progress
File size: 10.3 KB
Line
1\section{Our algorithm}
2
3\subsection{Design decisions}
4
5Given the NP-completeness of the problem, to arrive at an optimal solution
6within a short space of time (using, for example, a constraint solver) will
7potentially take a great amount of time.
8
9The {\tt gcc} compiler suite, for the x86 architecture, uses a greatest fix
10point algorithm. In other words, it starts off with all jumps encoded as the
11largest jumps possible, and then tries to reduce jumps as much as possible.
12
13Such an algorithm has the advantage that any intermediate results it returns
14are correct: the solution where every jump is encoded as a large jump is
15always possible, and the algorithm only reduces those jumps where the
16destination address is in range for a shorter jump instruction. The algorithm
17can thus be stopped after a determined amount of steps without losing
18correctness.
19
20The result, however, is not necessarily optimal, even if the algorithm is run
21until it terminates naturally: the fixed point reached is the {\em greatest}
22fixed point, not the least fixed point.
23
24The SDCC compiler, which has the MCS-51 among its target instruction sets, uses
25a least fix point algorithm, but does not take the presence of medium jumps
26into account. This makes the algorithm run in linear time with respect to the
27number of jumps in the program: starting out with every jump encoded as a
28short jump, each jump can be switched to a long jump once, but no jump ever
29goes back from long to short. This algorithm must be run until a fixed point
30is reached, because the intermediate solutions are not necessarily correct.
31
32This algorithm results in a least fixed point, which means its solution is
33potentially more optimal then the one reached by the {\tt gcc} algorithm.
34However, the solution is still not optimal, since there might be jumps whose
35destination is in the same segment. These jumps could be encoded as medium
36jumps, which are smaller than long jumps.
37
38Our first attempt at an algorithm was a least fixed point algorithm that took
39medium jumps into account.
40
41Here, we ran into a problem with proving termination: whereas the SDCC
42algorithm only switches jumps from short to long, when we add medium jumps,
43it is theoretically possible for a jump to switch from medium to long and back,
44as explained in the previous section.
45
46Proving termination then becomes difficult, because there is nothing that
47precludes a jump switching back and forth between medium and long indefinitely.
48
49In fact, this mirrors the argument from~\cite{Szymanski1978}. There, it is
50argued that for the problem to be NP-complete, it must be allowed to contain
51{\em pathological} jumps. These are jumps that can normally not be encoded as a
52short(er) jump, but gain this property when some other jumps are encoded as a
53long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: by
54encoding the first jump as a long jump, another jump switches from long to
55medium (which is shorter).
56
57In order to keep the algorithm linear and more easily prove termination, we
58decided to explicitly enforce the `jumps must always increase' requirement: if
59a jump is encoded as a long jump in one step, it will also be encoded as a
60long jump in all the following steps. This means that any jump can change at
61maximum two times: once from short to medium (or long), and once from medium
62to long.
63
64There is one complicating factor: suppose that a jump is encoded in step $n$
65as a medium jump, but in step $n+1$ it is determined that (because of changes
66elsewhere) it can now be encoded as a short jump. Due to the requirement that
67jumps must always increase, this means that the jump will be encoded as a
68medium jump in step $n+1$ as well.
69
70This is not necessarily correct, however: it is not the case that any short
71jump can correctly be encoded as a medium jump (a short jump can bridge
72segments, whereas a medium jump cannot). Therefore, in this situation
73we decide to encode the jump as a long jump, which is always correct.
74
75The resulting algorithm, while not optimal, is at least as good as the ones
76from {\tt gcc} and SDCC, and potentially better. Its complexity remains
77linear (though with a higher constant than SDCC).
78
79\subsection{The algorithm in detail}
80
81The branch displacement algorithm forms part of the translation from
82pseudo-code to assembler. More specifically, it is used by the function that
83translates pseudo-addresses (natural numbers indicating the position of the
84instruction in the program) to actual addresses in memory.
85
86The original intention was to have two different functions, one function
87$\mathtt{policy}: \mathbb{N} \rightarrow \{\mathtt{short}, \mathtt{medium}, 88\mathtt{long}\}$ to associate jumps to their intended translation, and a
89function $\sigma: \mathbb{N} \rightarrow \mathtt{Word}$ to associate
90pseudo-addresses to actual addresses. $\sigma$ would use $\mathtt{policy}$ to
91determine the size of jump instructions.
92
93This turned out to be suboptimal from the algorithmic point of view and
94impossible to prove correct.
95
96From the algorithmic point of view, in order to create the $\mathtt{policy}$
97function, we must necessarily have a translation from pseudo-addresses
98to actual addresses (i.e. a $\sigma$ function): in order to judge the distance
99between a jump and its destination, we must know their memory locations.
100Conversely, in order to create the $\sigma$ function, we need to have the
101$\mathtt{policy}$ function, otherwise we do not know the sizes of the jump
102instructions in the program.
103
104Much the same problem appears when we try to prove the algorithm correct: the
105correctness of $\mathtt{policy}$ depends on the correctness of $\sigma$, and
106the correctness of $\sigma$ depends on the correctness of $\mathtt{policy}$.
107
108We solved this problem by integrating the $\mathtt{policy}$ and $\sigma$
109algorithms. We now have a function
110$\sigma: \mathbb{N} \rightarrow \mathtt{Word} \times \mathtt{bool}$ which
111associates a pseudo-address to an actual address. The boolean denotes a forced
112long jump; as noted in the previous section, if during the fixed point
113computation a medium jump needs to be re-encoded as a short jump, the result
114is actually a long jump. It might therefore be the case that jumps are encoded
115as long jumps without this actually being necessary.
116
117The assembler function encodes the jumps by checking the distance between
118source and destination according to $\sigma$, so it could select a medium
119jump in a situation where there should be a long jump. The boolean is there
120to prevent this from happening by indicating the locations where a long jump
121should be encoded, even if a shorter jump is possible. This has no effect on
122correctness, since a long jump is applicable in any situation.
123
124\begin{figure}
125\begin{algorithmic}
126\Function{f}{$labels$,$old\_sigma$,$instr$,$ppc$,$acc$}
127        \State $\langle added, pc, sigma \rangle \gets acc$
128        \If {$instr$ is a backward jump to $j$}
129                \State $length \gets \mathrm{jump\_size}(pc,sigma_1(labels(j)))$
130        \ElsIf {$instr$ is a forward jump to $j$}
131                \State $length \gets \mathrm{jump\_size}(pc,old\_sigma_1(labels(j))+added)$
132        \Else
133                \State $length \gets \mathtt{short\_jump}$
134        \EndIf
135        \State $old\_length \gets \mathrm{old\_sigma_1}(ppc)$
136        \State $new\_length \gets \mathrm{max}(old\_length, length)$
137        \State $old\_size \gets \mathrm{old\_sigma_2}(ppc)$
138        \State $new\_size \gets \mathrm{instruction\_size}(instr,new\_length)$
139        \State $new\_added \gets added+(new\_size-old\_size)$
140        \State $new\_sigma_1(ppc+1) \gets pc+new\_size$
141        \State $new\_sigma_2(ppc) \gets new\_length$ \\
142        \Return $\langle new\_added, pc+new\_size, new\_sigma \rangle$
143\EndFunction
144\end{algorithmic}
145\caption{The heart of the algorithm}
146\label{f:jump_expansion_step}
147\end{figure}
148
149The algorithm, shown in figure~\ref{f:jump_expansion_step}, works by folding the
150function {\sc f} over the entire program, thus gradually constructing $sigma$.
151This constitutes one step in the fixed point calculation; successive steps
152repeat the fold until a fixed point is reached.
153
154Parameters of the function {\sc f} are:
155\begin{itemize}
156        \item a function $labels$ that associates a label to its pseudo-address;
157        \item $old\_sigma$, the $\sigma$ function returned by the previous
158                iteration of the fixed point calculcation;
159        \item $instr$, the instruction currently under consideration;
160        \item $ppc$, the pseudo-address of $instr$;
161        \item $acc$, the fold accumulator, which contains $pc$ (the highest memory
162                address reached so far), $added$ (the number of bytes added to the program
163                size with respect to the previous iteration), and of course $sigma$, the
164                $\sigma$ function under construction.
165\end{itemize}
166
167The first two are parameters that remain the same through one iteration, the
168last three are standard parameters for a fold function (including $ppc$,
169which is simply the number of instructions of the program already processed).
170
171The $\sigma$ functions used by {\sc f} are not of the same type as the final
172$\sigma$ function: they are of type
173$\sigma: \mathbb{N} \rightarrow \mathbb{N} \times \{\mathtt{short\_jump}, 174\mathtt{medium\_jump},\mathtt{long\_jump}\}$; a function that associates a
175pseudo-address with an memory address and a jump length. We do this to be able
176to more easily compare the jump lengths between iterations. In the algorithm,
177we use the notation $sigma_1(x)$ to denote the memory address corresponding to
178$x$, and $sigma_2(x)$ to denote the jump length corresponding to $x$.
179
180Note that the $\sigma$ function used for label lookup varies depending on
181whether the label is behind our current position or ahead of it. For
182backward jumps, where the label is behind our current position, we can use
183$sigma$ for lookup, since its memory address is already known. However, for
184forward jumps, the memory address of the address of the label is not yet
185known, so we must use $old\_sigma$.
186
187We cannot use $old\_sigma$ without change: it might be the case that we have
188already changed some jumps before, making the program longer. We must therefore
189compensate for this by adding the size increase of the program to the label's
190memory address according to $old\_sigma$, so that jump distances do not get
191compromised.
192
193Note also that we add the pc to $sigma$ at location $ppc+1$, whereas we add the
194jump length at location $ppc$. We do this so that $sigma(ppc)$ will always
195return a couple with the start address of the instruction at $ppc$ and the
196length of its jump; the end address of the program can be found at $sigma(n+1)$,
197where $n$ is the number of instructions in the program.
Note: See TracBrowser for help on using the repository browser.