Ignore:
Timestamp:
Jun 15, 2012, 11:36:47 AM (8 years ago)
Author:
boender
Message:
  • rewrote introduction
  • changed 'medium' to 'absolute'
  • added a bit to conclusion (CompCert?, Piton, ...)
File:
1 edited

Legend:

Unmodified
Added
Removed
  • src/ASM/CPP2012-policy/algorithm.tex

    r2080 r2085  
    3333
    3434In the CerCo assembler, we opted at first for a least fixed point algorithm,
    35 taking medium jumps into account.
     35taking absolute jumps into account.
    3636
    37 Here, we ran into a problem with proving termination: whereas the SDCC
    38 algorithm only switches jumps from short to long, when we add medium jumps,
    39 it is theoretically possible for a jump to switch from medium to long and back,
    40 as explained in the previous section.
     37Here, we ran into a problem with proving termination, as explained in the
     38previous section: if we only take short and long jumps into account, the jump
     39encoding can only switch from short to long, but never in the other direction.
     40When we add absolute jumps, however, it is theoretically possible for a jump to
     41switch from absolute to long and back, as explained in the previous section.
    4142
    4243Proving termination then becomes difficult, because there is nothing that
    43 precludes a jump switching back and forth between medium and long indefinitely.
     44precludes a jump from switching back and forth between absolute and long
     45indefinitely.
    4446
    45 In fact, this mirrors the argument from~\cite{Szymanski1978}. There, it is
    46 argued that for the problem to be NP-complete, it must be allowed to contain
    47 {\em pathological} jumps. These are jumps that can normally not be encoded as a
    48 short(er) jump, but gain this property when some other jumps are encoded as a
    49 long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: by
    50 encoding the first jump as a long jump, another jump switches from long to
    51 medium (which is shorter).
    52 
    53 In order to keep the algorithm linear and more easily prove termination, we
    54 decided to explicitly enforce the `jumps must always increase' requirement: if
    55 a jump is encoded as a long jump in one step, it will also be encoded as a
    56 long jump in all the following steps. This means that any jump can change at
    57 maximum two times: once from short to medium (or long), and once from medium
    58 to long.
     47In order to keep the algorithm in the same complexity class and more easily
     48prove termination, we decided to explicitly enforce the `jumps must always
     49increase' requirement: if a jump is encoded as a long jump in one step, it will
     50also be encoded as a long jump in all the following steps. This means that any
     51jump can change at most two times: once from short to absolute (or long), and
     52once from absolute to long.
    5953
    6054There is one complicating factor: suppose that a jump is encoded in step $n$
    61 as a medium jump, but in step $n+1$ it is determined that (because of changes
     55as an absolute jump, but in step $n+1$ it is determined that (because of changes
    6256elsewhere) it can now be encoded as a short jump. Due to the requirement that
    63 jumps must always increase, this means that the jump will be encoded as a
    64 medium jump in step $n+1$ as well.
     57jumps must always increase, this means that the jump will be encoded as an
     58absolute jump in step $n+1$ as well.
    6559
    6660This is not necessarily correct, however: it is not the case that any short
    67 jump can correctly be encoded as a medium jump (a short jump can bridge
    68 segments, whereas a medium jump cannot). Therefore, in this situation
     61jump can correctly be encoded as an absolute jump (a short jump can bridge
     62segments, whereas an absolute jump cannot). Therefore, in this situation
    6963we decide to encode the jump as a long jump, which is always correct.
    7064
    7165The resulting algorithm, while not optimal, is at least as good as the ones
    72 from {\tt gcc} and SDCC, and potentially better. Its complexity remains
    73 linear (though with a higher constant than SDCC).
     66from SDCC and {\tt gcc}, and potentially better. Its complexity remains
     67the same (there are at most $2n$ iterations, where $n$ is the number of jumps
     68in the program).
    7469
    7570\subsection{The algorithm in detail}
     
    8176
    8277The original intention was to have two different functions, one function
    83 $\mathtt{policy}: \mathbb{N} \rightarrow \{\mathtt{short}, \mathtt{medium},
    84 \mathtt{long}\}$ to associate jumps to their intended translation, and a
    85 function $\sigma: \mathbb{N} \rightarrow \mathtt{Word}$ to associate
    86 pseudo-addresses to actual addresses. $\sigma$ would use $\mathtt{policy}$ to
    87 determine the size of jump instructions.
     78$\mathtt{policy}: \mathbb{N} \rightarrow \{\mathtt{short\_jump},
     79\mathtt{absolute\_jump}, \mathtt{long\_jump}\}$ to associate jumps to their
     80intended encoding, and a function $\sigma: \mathbb{N} \rightarrow
     81\mathtt{Word}$ to associate pseudo-addresses to actual addresses. $\sigma$
     82would use $\mathtt{policy}$ to determine the size of jump instructions.
    8883
    8984This turned out to be suboptimal from the algorithmic point of view and
     
    107102associates a pseudo-address to an actual address. The boolean denotes a forced
    108103long jump; as noted in the previous section, if during the fixed point
    109 computation a medium jump needs to be re-encoded as a short jump, the result
    110 is actually a long jump. It might therefore be the case that jumps are encoded
    111 as long jumps without this actually being necessary.
     104computation an absolute jump changes to be potentially re-encoded as a short
     105jump, the result is actually a long jump. It might therefore be the case that
     106jumps are encoded as long jumps without this actually being necessary, and this
     107information needs to be passed to the code generating function.
    112108
    113109The assembler function encodes the jumps by checking the distance between
    114 source and destination according to $\sigma$, so it could select a medium
     110source and destination according to $\sigma$, so it could select an absolute
    115111jump in a situation where there should be a long jump. The boolean is there
    116112to prevent this from happening by indicating the locations where a long jump
     
    168164$\sigma$ function: they are of type
    169165$\sigma: \mathbb{N} \rightarrow \mathbb{N} \times \{\mathtt{short\_jump},
    170 \mathtt{medium\_jump},\mathtt{long\_jump}\}$; a function that associates a
     166\mathtt{absolute\_jump},\mathtt{long\_jump}\}$; a function that associates a
    171167pseudo-address with an memory address and a jump length. We do this to be able
    172168to more easily compare the jump lengths between iterations. In the algorithm,
Note: See TracChangeset for help on using the changeset viewer.