Changeset 2049
 Timestamp:
 Jun 12, 2012, 3:49:27 PM (5 years ago)
 Location:
 src/ASM/CPP2012policy
 Files:

 2 edited
Legend:
 Unmodified
 Added
 Removed

src/ASM/CPP2012policy/algorithm.tex
r1889 r2049 1 1 \section{Our algorithm} 2 3 \subsection{Design decisions} 4 5 Given the NPcompleteness of the problem, to arrive at an optimal solution 6 within a short space of time (using, for example, a constraint solver) will 7 potentially take a great amount of time. 8 9 The {\tt gcc} compiler suite, for the x86 architecture, uses a greatest fix 10 point algorithm. In other words, it starts off with all jumps encoded as the 11 largest jumps possible, and then tries to reduce jumps as much as possible. 12 13 Such an algorithm has the advantage that any intermediate results it returns 14 are correct: the solution where every jump is encoded as a large jump is 15 always possible, and the algorithm only reduces those jumps where the 16 destination address is in range for a shorter jump instruction. The algorithm 17 can thus be stopped after a determined amount of steps without losing 18 correctness. 19 20 The result, however, is not necessarily optimal, even if the algorithm is run 21 until it terminates naturally: the fixed point reached is the {\em greatest} 22 fixed point, not the least fixed point. 23 24 The SDCC compiler, which has the MCS51 among its target instruction sets, uses 25 a least fix point algorithm, but does not take the presence of medium jumps 26 into account. This makes the algorithm run in linear time with respect to the 27 number of jumps in the program: starting out with every jump encoded as a 28 short jump, each jump can be switched to a long jump once, but no jump ever 29 goes back from long to short. This algorithm must be run until a fixed point 30 is reached, because the intermediate solutions are not necessarily correct. 31 32 This algorithm results in a least fixed point, which means its solution is 33 potentially more optimal then the one reached by the {\tt gcc} algorithm. 34 However, the solution is still not optimal, since there might be jumps whose 35 destination is in the same segment. These jumps could be encoded as medium 36 jumps, which are smaller than long jumps. 37 38 Our first attempt at an algorithm was a least fixed point algorithm that took 39 medium jumps into account. 40 41 Here, we ran into a problem with proving termination: whereas the SDCC 42 algorithm only switches jumps from short to long, when we add medium jumps, 43 it is theoretically possible for a jump to switch from medium to long and back, 44 as explained in the previous section. 45 46 Proving termination then becomes difficult, because there is nothing that 47 precludes a jump switching back and forth between medium and long indefinitely. 48 49 In fact, this mirrors the argument from~\cite{Szymanski1978}. There, it is 50 argued that for the problem to be NPcomplete, it must be allowed to contain 51 {\em pathological} jumps. These are jumps that can normally not be encoded as a 52 short(er) jump, but gain this property when some other jumps are encoded as a 53 long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: by 54 encoding the first jump as a long jump, another jump switches from long to 55 medium (which is shorter). 56 57 In order to keep the algorithm linear and more easily prove termination, we 58 decided to explicitly enforce the `jumps must always increase' requirement: if 59 a jump is encoded as a long jump in one step, it will also be encoded as a 60 long jump in all the following steps. This means that any jump can change at 61 maximum two times: once from short to medium (or long), and once from medium 62 to long. 63 64 There is one complicating factor: suppose that a jump is encoded in step $n$ 65 as a medium jump, but in step $n+1$ it is determined that (because of changes 66 elsewhere) it can now be encoded as a short jump. Due to the requirement that 67 jumps must always increase, this means that the jump will be encoded as a 68 medium jump in step $n+1$ as well. 69 70 This is not necessarily correct, however: it is not the case that any short 71 jump can correctly be encoded as a medium jump. Therefore, in this situation 72 we decide to encode the jump as a long jump, which is always correct. 73 74 The resulting algorithm, while not optimal, is at least as good as the ones 75 from {\tt gcc} and SDCC, and potentially better. Its complexity remains 76 linear (though with a higher constant than SDCC). 
src/ASM/CPP2012policy/problem.tex
r1889 r2049 9 9 address. Mostly this occurs with jump instructions; for example, in the 10 10 x8664 instruction set there are eleven different forms of the unconditional 11 jump instruction, all with different ranges and semantics (only six are valid12 in 64bit mode, for example). Some examples are shown in11 jump instruction, all with different ranges, instruction sizes and semantics 12 (only six are valid in 64bit mode, for example). Some examples are shown in 13 13 figure~\ref{f:x86jumps}: 14 14 … … 75 75 \subsection*{Adding medium jumps} 76 76 77 In both the canonical solutions presented, the encoding of a jump is only 78 dependent on the distance between the jump and its target: below a certain 79 value a short jump can be used; above this value the jump must be encoded 80 as a long jump. 77 In both papers mentioned above, the encoding of a jump is only dependent on the 78 distance between the jump and its target: below a certain value a short jump 79 can be used; above this value the jump must be encoded as a long jump. 81 80 82 81 Here, termination of the smallest fixpoint algorithm is easy to prove. All
Note: See TracChangeset
for help on using the changeset viewer.