source: src/ASM/CPP2012-policy/algorithm.tex @ 2084

Last change on this file since 2084 was 2080, checked in by boender, 7 years ago
  • added references to SDCC and gcc (thanks, Dominic)
  • updated sigma policy specification
  • changed description of SDCC algorithm to actual SDCC behaviour
File size: 10.0 KB
RevLine 
[1889]1\section{Our algorithm}
[2049]2
3\subsection{Design decisions}
4
5Given the NP-completeness of the problem, to arrive at an optimal solution
6within a short space of time (using, for example, a constraint solver) will
7potentially take a great amount of time.
8
[2080]9The SDCC compiler~\cite{SDCC2011}, which has the MCS-51 among its target
10instruction sets, simply encodes every jump as a long jump without taking the
11distance into account. While certainly correct (the long jump can reach any
12destination in memory) and rapid, it does result in a less than optimal
13solution.
[2049]14
[2080]15The {\tt gcc} compiler suite~\cite{GCC2012}, while compiling C on the x86
16architecture, uses a greatest fix point algorithm. In other words, it starts
17off with all jumps encoded as the largest jumps available, and then tries to
18reduce jumps as much as possible.
19
[2049]20Such an algorithm has the advantage that any intermediate results it returns
21are correct: the solution where every jump is encoded as a large jump is
22always possible, and the algorithm only reduces those jumps where the
23destination address is in range for a shorter jump instruction. The algorithm
24can thus be stopped after a determined amount of steps without losing
25correctness.
26
27The result, however, is not necessarily optimal, even if the algorithm is run
28until it terminates naturally: the fixed point reached is the {\em greatest}
[2080]29fixed point, not the least fixed point. Furthermore, {\tt gcc} (at least for
30the x86 architecture) only uses short and long jumps. This makes the algorithm
31more rapid, as shown in the previous section, but also results in a less
32optimal solution.
[2049]33
[2080]34In the CerCo assembler, we opted at first for a least fixed point algorithm,
35taking medium jumps into account.
[2049]36
37Here, we ran into a problem with proving termination: whereas the SDCC
38algorithm only switches jumps from short to long, when we add medium jumps,
39it is theoretically possible for a jump to switch from medium to long and back,
40as explained in the previous section.
41
42Proving termination then becomes difficult, because there is nothing that
43precludes a jump switching back and forth between medium and long indefinitely.
44
45In fact, this mirrors the argument from~\cite{Szymanski1978}. There, it is
46argued that for the problem to be NP-complete, it must be allowed to contain
47{\em pathological} jumps. These are jumps that can normally not be encoded as a
48short(er) jump, but gain this property when some other jumps are encoded as a
49long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: by
50encoding the first jump as a long jump, another jump switches from long to
51medium (which is shorter).
52
53In order to keep the algorithm linear and more easily prove termination, we
54decided to explicitly enforce the `jumps must always increase' requirement: if
55a jump is encoded as a long jump in one step, it will also be encoded as a
56long jump in all the following steps. This means that any jump can change at
57maximum two times: once from short to medium (or long), and once from medium
58to long.
59
60There is one complicating factor: suppose that a jump is encoded in step $n$
61as a medium jump, but in step $n+1$ it is determined that (because of changes
62elsewhere) it can now be encoded as a short jump. Due to the requirement that
63jumps must always increase, this means that the jump will be encoded as a
64medium jump in step $n+1$ as well.
65
66This is not necessarily correct, however: it is not the case that any short
[2054]67jump can correctly be encoded as a medium jump (a short jump can bridge
68segments, whereas a medium jump cannot). Therefore, in this situation
[2049]69we decide to encode the jump as a long jump, which is always correct.
70
71The resulting algorithm, while not optimal, is at least as good as the ones
72from {\tt gcc} and SDCC, and potentially better. Its complexity remains
73linear (though with a higher constant than SDCC).
[2054]74
75\subsection{The algorithm in detail}
76
77The branch displacement algorithm forms part of the translation from
78pseudo-code to assembler. More specifically, it is used by the function that
79translates pseudo-addresses (natural numbers indicating the position of the
80instruction in the program) to actual addresses in memory.
81
82The original intention was to have two different functions, one function
83$\mathtt{policy}: \mathbb{N} \rightarrow \{\mathtt{short}, \mathtt{medium},
84\mathtt{long}\}$ to associate jumps to their intended translation, and a
85function $\sigma: \mathbb{N} \rightarrow \mathtt{Word}$ to associate
86pseudo-addresses to actual addresses. $\sigma$ would use $\mathtt{policy}$ to
87determine the size of jump instructions.
88
89This turned out to be suboptimal from the algorithmic point of view and
90impossible to prove correct.
91
92From the algorithmic point of view, in order to create the $\mathtt{policy}$
93function, we must necessarily have a translation from pseudo-addresses
94to actual addresses (i.e. a $\sigma$ function): in order to judge the distance
95between a jump and its destination, we must know their memory locations.
96Conversely, in order to create the $\sigma$ function, we need to have the
97$\mathtt{policy}$ function, otherwise we do not know the sizes of the jump
98instructions in the program.
99
100Much the same problem appears when we try to prove the algorithm correct: the
101correctness of $\mathtt{policy}$ depends on the correctness of $\sigma$, and
102the correctness of $\sigma$ depends on the correctness of $\mathtt{policy}$.
103
104We solved this problem by integrating the $\mathtt{policy}$ and $\sigma$
105algorithms. We now have a function
106$\sigma: \mathbb{N} \rightarrow \mathtt{Word} \times \mathtt{bool}$ which
107associates a pseudo-address to an actual address. The boolean denotes a forced
108long jump; as noted in the previous section, if during the fixed point
109computation a medium jump needs to be re-encoded as a short jump, the result
110is actually a long jump. It might therefore be the case that jumps are encoded
111as long jumps without this actually being necessary.
112
113The assembler function encodes the jumps by checking the distance between
114source and destination according to $\sigma$, so it could select a medium
115jump in a situation where there should be a long jump. The boolean is there
116to prevent this from happening by indicating the locations where a long jump
117should be encoded, even if a shorter jump is possible. This has no effect on
118correctness, since a long jump is applicable in any situation.
119
120\begin{figure}
121\begin{algorithmic}
[2064]122\Function{f}{$labels$,$old\_sigma$,$instr$,$ppc$,$acc$}
123        \State $\langle added, pc, sigma \rangle \gets acc$
124        \If {$instr$ is a backward jump to $j$}
125                \State $length \gets \mathrm{jump\_size}(pc,sigma_1(labels(j)))$
126        \ElsIf {$instr$ is a forward jump to $j$}
127                \State $length \gets \mathrm{jump\_size}(pc,old\_sigma_1(labels(j))+added)$
128        \Else
129                \State $length \gets \mathtt{short\_jump}$
130        \EndIf
131        \State $old\_length \gets \mathrm{old\_sigma_1}(ppc)$
132        \State $new\_length \gets \mathrm{max}(old\_length, length)$
133        \State $old\_size \gets \mathrm{old\_sigma_2}(ppc)$
134        \State $new\_size \gets \mathrm{instruction\_size}(instr,new\_length)$
135        \State $new\_added \gets added+(new\_size-old\_size)$
136        \State $new\_sigma_1(ppc+1) \gets pc+new\_size$
137        \State $new\_sigma_2(ppc) \gets new\_length$ \\
138        \Return $\langle new\_added, pc+new\_size, new\_sigma \rangle$
[2054]139\EndFunction
140\end{algorithmic}
141\caption{The heart of the algorithm}
142\label{f:jump_expansion_step}
143\end{figure}
[2064]144
145The algorithm, shown in figure~\ref{f:jump_expansion_step}, works by folding the
146function {\sc f} over the entire program, thus gradually constructing $sigma$.
147This constitutes one step in the fixed point calculation; successive steps
148repeat the fold until a fixed point is reached.
149
150Parameters of the function {\sc f} are:
151\begin{itemize}
152        \item a function $labels$ that associates a label to its pseudo-address;
153        \item $old\_sigma$, the $\sigma$ function returned by the previous
154                iteration of the fixed point calculcation;
155        \item $instr$, the instruction currently under consideration;
156        \item $ppc$, the pseudo-address of $instr$;
157        \item $acc$, the fold accumulator, which contains $pc$ (the highest memory
158                address reached so far), $added$ (the number of bytes added to the program
159                size with respect to the previous iteration), and of course $sigma$, the
160                $\sigma$ function under construction.
161\end{itemize}
162
163The first two are parameters that remain the same through one iteration, the
164last three are standard parameters for a fold function (including $ppc$,
165which is simply the number of instructions of the program already processed).
166
167The $\sigma$ functions used by {\sc f} are not of the same type as the final
168$\sigma$ function: they are of type
169$\sigma: \mathbb{N} \rightarrow \mathbb{N} \times \{\mathtt{short\_jump},
170\mathtt{medium\_jump},\mathtt{long\_jump}\}$; a function that associates a
171pseudo-address with an memory address and a jump length. We do this to be able
172to more easily compare the jump lengths between iterations. In the algorithm,
173we use the notation $sigma_1(x)$ to denote the memory address corresponding to
174$x$, and $sigma_2(x)$ to denote the jump length corresponding to $x$.
175
176Note that the $\sigma$ function used for label lookup varies depending on
177whether the label is behind our current position or ahead of it. For
178backward jumps, where the label is behind our current position, we can use
179$sigma$ for lookup, since its memory address is already known. However, for
180forward jumps, the memory address of the address of the label is not yet
181known, so we must use $old\_sigma$.
182
183We cannot use $old\_sigma$ without change: it might be the case that we have
184already changed some jumps before, making the program longer. We must therefore
185compensate for this by adding the size increase of the program to the label's
186memory address according to $old\_sigma$, so that jump distances do not get
187compromised.
188
189Note also that we add the pc to $sigma$ at location $ppc+1$, whereas we add the
190jump length at location $ppc$. We do this so that $sigma(ppc)$ will always
191return a couple with the start address of the instruction at $ppc$ and the
192length of its jump; the end address of the program can be found at $sigma(n+1)$,
193where $n$ is the number of instructions in the program.
Note: See TracBrowser for help on using the repository browser.