Changeset 2085 for src/ASM/CPP2012policy
 Timestamp:
 Jun 15, 2012, 11:36:47 AM (8 years ago)
 Location:
 src/ASM/CPP2012policy
 Files:

 5 edited
Legend:
 Unmodified
 Added
 Removed

src/ASM/CPP2012policy/algorithm.tex
r2080 r2085 33 33 34 34 In the CerCo assembler, we opted at first for a least fixed point algorithm, 35 taking mediumjumps into account.35 taking absolute jumps into account. 36 36 37 Here, we ran into a problem with proving termination: whereas the SDCC 38 algorithm only switches jumps from short to long, when we add medium jumps, 39 it is theoretically possible for a jump to switch from medium to long and back, 40 as explained in the previous section. 37 Here, we ran into a problem with proving termination, as explained in the 38 previous section: if we only take short and long jumps into account, the jump 39 encoding can only switch from short to long, but never in the other direction. 40 When we add absolute jumps, however, it is theoretically possible for a jump to 41 switch from absolute to long and back, as explained in the previous section. 41 42 42 43 Proving termination then becomes difficult, because there is nothing that 43 precludes a jump switching back and forth between medium and long indefinitely. 44 precludes a jump from switching back and forth between absolute and long 45 indefinitely. 44 46 45 In fact, this mirrors the argument from~\cite{Szymanski1978}. There, it is 46 argued that for the problem to be NPcomplete, it must be allowed to contain 47 {\em pathological} jumps. These are jumps that can normally not be encoded as a 48 short(er) jump, but gain this property when some other jumps are encoded as a 49 long(er) jump. This is exactly what happens in figure~\ref{f:term_example}: by 50 encoding the first jump as a long jump, another jump switches from long to 51 medium (which is shorter). 52 53 In order to keep the algorithm linear and more easily prove termination, we 54 decided to explicitly enforce the `jumps must always increase' requirement: if 55 a jump is encoded as a long jump in one step, it will also be encoded as a 56 long jump in all the following steps. This means that any jump can change at 57 maximum two times: once from short to medium (or long), and once from medium 58 to long. 47 In order to keep the algorithm in the same complexity class and more easily 48 prove termination, we decided to explicitly enforce the `jumps must always 49 increase' requirement: if a jump is encoded as a long jump in one step, it will 50 also be encoded as a long jump in all the following steps. This means that any 51 jump can change at most two times: once from short to absolute (or long), and 52 once from absolute to long. 59 53 60 54 There is one complicating factor: suppose that a jump is encoded in step $n$ 61 as a mediumjump, but in step $n+1$ it is determined that (because of changes55 as an absolute jump, but in step $n+1$ it is determined that (because of changes 62 56 elsewhere) it can now be encoded as a short jump. Due to the requirement that 63 jumps must always increase, this means that the jump will be encoded as a 64 mediumjump in step $n+1$ as well.57 jumps must always increase, this means that the jump will be encoded as an 58 absolute jump in step $n+1$ as well. 65 59 66 60 This is not necessarily correct, however: it is not the case that any short 67 jump can correctly be encoded as a mediumjump (a short jump can bridge68 segments, whereas a mediumjump cannot). Therefore, in this situation61 jump can correctly be encoded as an absolute jump (a short jump can bridge 62 segments, whereas an absolute jump cannot). Therefore, in this situation 69 63 we decide to encode the jump as a long jump, which is always correct. 70 64 71 65 The resulting algorithm, while not optimal, is at least as good as the ones 72 from {\tt gcc} and SDCC, and potentially better. Its complexity remains 73 linear (though with a higher constant than SDCC). 66 from SDCC and {\tt gcc}, and potentially better. Its complexity remains 67 the same (there are at most $2n$ iterations, where $n$ is the number of jumps 68 in the program). 74 69 75 70 \subsection{The algorithm in detail} … … 81 76 82 77 The original intention was to have two different functions, one function 83 $\mathtt{policy}: \mathbb{N} \rightarrow \{\mathtt{short }, \mathtt{medium},84 \mathtt{ long}\}$ to associate jumps to their intended translation, and a85 function $\sigma: \mathbb{N} \rightarrow \mathtt{Word}$ to associate 86 pseudoaddresses to actual addresses. $\sigma$ would use $\mathtt{policy}$ to 87 determine the size of jump instructions.78 $\mathtt{policy}: \mathbb{N} \rightarrow \{\mathtt{short\_jump}, 79 \mathtt{absolute\_jump}, \mathtt{long\_jump}\}$ to associate jumps to their 80 intended encoding, and a function $\sigma: \mathbb{N} \rightarrow 81 \mathtt{Word}$ to associate pseudoaddresses to actual addresses. $\sigma$ 82 would use $\mathtt{policy}$ to determine the size of jump instructions. 88 83 89 84 This turned out to be suboptimal from the algorithmic point of view and … … 107 102 associates a pseudoaddress to an actual address. The boolean denotes a forced 108 103 long jump; as noted in the previous section, if during the fixed point 109 computation a medium jump needs to be reencoded as a short jump, the result 110 is actually a long jump. It might therefore be the case that jumps are encoded 111 as long jumps without this actually being necessary. 104 computation an absolute jump changes to be potentially reencoded as a short 105 jump, the result is actually a long jump. It might therefore be the case that 106 jumps are encoded as long jumps without this actually being necessary, and this 107 information needs to be passed to the code generating function. 112 108 113 109 The assembler function encodes the jumps by checking the distance between 114 source and destination according to $\sigma$, so it could select a medium110 source and destination according to $\sigma$, so it could select an absolute 115 111 jump in a situation where there should be a long jump. The boolean is there 116 112 to prevent this from happening by indicating the locations where a long jump … … 168 164 $\sigma$ function: they are of type 169 165 $\sigma: \mathbb{N} \rightarrow \mathbb{N} \times \{\mathtt{short\_jump}, 170 \mathtt{ medium\_jump},\mathtt{long\_jump}\}$; a function that associates a166 \mathtt{absolute\_jump},\mathtt{long\_jump}\}$; a function that associates a 171 167 pseudoaddress with an memory address and a jump length. We do this to be able 172 168 to more easily compare the jump lengths between iterations. In the algorithm, 
src/ASM/CPP2012policy/biblio.bib
r2080 r2085 67 67 year = {2012} 68 68 } 69 70 @article {Leroy2009, 71 author = {Leroy, Xavier}, 72 affiliation = {INRIA ParisRocquencourt B.P. 105 78153 Le Chesnay France}, 73 title = {A Formally Verified Compiler Backend}, 74 journal = {Journal of Automated Reasoning}, 75 publisher = {Springer Netherlands}, 76 issn = {01687433}, 77 keyword = {Computer Science}, 78 pages = {363446}, 79 volume = {43}, 80 issue = {4}, 81 url = {http://dx.doi.org/10.1007/s1081700991554}, 82 note = {10.1007/s1081700991554}, 83 year = {2009} 84 } 85 86 @book {Moore1996, 87 author = {J Strother Moore}, 88 title = {Piton: A mechanically verified assembly language}, 89 series = {Automated Reasoning Series}, 90 volume = {3}, 91 isbn = {9780792339205}, 92 publisher = {Springer}, 93 year = {1996} 94 } 95 
src/ASM/CPP2012policy/conclusion.tex
r2084 r2085 5 5 termination and correctness for this algorithm, as formalised in Matita. 6 6 7 The algorithm we have presented is fast and correct, but not optimal; a true 8 optimal solution would need techniques like constraint solvers. While outside 9 the scope of the present research, it would be interesting to see if enough 10 heuristics could be find to make such a solution practical for implementing 11 in an existent compiler; this would be especially useful for embedded systems, 12 where it is important to have a small solution as possible. 13 14 This algorithm is part of a greater whole, the CerCo project, which aims at 15 the complete formalisation and verification of a compiler. More information 16 on the formalisation of the assembler, of which the present work is a part, 17 can be found in a companion publication. 18 7 19 \subsection{Related work} 8 20 … … 10 22 displacement algorithm. 11 23 12 (Piton? CompCert?) 24 The CompCert project is another project aimed at formally verifying a compiler. 25 Their backend~\cite{Leroy2009} generates assembly for (amongst others) the 26 PowerPC and x86 (32bit) architectures, but does not include a branch 27 displacement algorithm; at least for the x86 architecture, all jumps are 28 encoded as long jumps. 13 29 30 There is also Piton~\cite{Moore1996}, which not only includes the 31 formal verification of a compiler, but also of the machine architecture 32 targeted by that compiler. However, this architecture does not have different 33 jump sizes (branching is simulated by assigning values to the program counter), 34 so the branch displacement problem does not occur. 35 14 36 \subsection{Formal development} 15 37 
src/ASM/CPP2012policy/problem.tex
r2064 r2085 2 2 3 3 The problem of branch displacement optimisation, also known as jump encoding, is 4 a wellknown problem in assembler design. 5 6 In many instruction sets, amongst which the ubiquitous x86 architecture (both 7 its 32 and 64bit incarnations), there are instructions whose addressing 8 mode varies with the distance between the instruction and the object they 9 address. Mostly this occurs with jump instructions; for example, in the 10 x8664 instruction set there are eleven different forms of the unconditional 11 jump instruction, all with different ranges, instruction sizes and semantics 12 (only six are valid in 64bit mode, for example). Some examples are shown in 13 figure~\ref{f:x86jumps}: 14 15 \begin{figure}[h] 4 a wellknown problem in assembler design. It is caused by the fact that in 5 many architecture sets, the encoding (and therefore size) of some instructions 6 depends on the distance to their operand (the instruction 'span'). The branch 7 displacement optimisation problem consists in encoding these spandependent 8 instructions in such a way that the resulting program is as small as possible. 9 10 This problem is the subject of the present paper. After introducing the problem 11 in more detail, we will discuss the solutions used by other compilers, present 12 the algorithm used by us in the CerCo assembler, and discuss its verification, 13 that is the proofs of termination and correctness. 14 15 The research presented in this paper has been executed within the CerCo project 16 which aims at formally verifying a C compiler with cost annotations. The 17 target architecture for this project is the MCS51, whose instruction set 18 contains spandependent instructions. Furthermore, its maximum addressable 19 memory size is very small (65 Kbytes), which makes it important to generate 20 programs that are as small as possible. 21 22 With this optimisation, however, comes increased complexity and hence 23 increased possibility for error. We must make sure that the jumps are encoded 24 correctly, otherwise the assembled program will behave impredictably. 25 26 \section{The branch displacement optimisation problem} 27 28 In most modern instruction sets, the only spandependent instructions are 29 branch instructions. Taking the ubiquitous x8664 instruction set as an 30 example, we find that it contains are eleven different forms of the 31 unconditional jump instruction, all with different ranges, instruction sizes 32 and semantics (only six are valid in 64bit mode, for example). Some examples 33 are shown in figure~\ref{f:x86jumps}. 34 35 \begin{figure}[h] 36 \begin{center} 16 37 \begin{tabular}{lll} 17 38 \hline … … 24 45 \hline 25 46 \end{tabular} 47 \end{center} 26 48 \caption{List of x86 jump instructions} 27 49 \label{f:x86jumps} … … 29 51 30 52 The chosen target architecture of the CerCo project is the Intel MCS51, which 31 features three types of jump instructions, as shown in 32 figure~\ref{f:mcs51jumps}. 33 34 \begin{figure}[h] 53 features three types of branch instructions (or jump instructions; the two terms 54 are used interchangeably), as shown in figure~\ref{f:mcs51jumps}. 55 56 \begin{figure}[h] 57 \begin{center} 35 58 \begin{tabular}{llll} 36 59 \hline … … 39 62 \hline 40 63 SJMP (`short jump') & 2 & 2 & 128 to 127 bytes \\ 41 AJMP (` mediumjump') & 2 & 2 & one segment (11bit address) \\64 AJMP (`absolute jump') & 2 & 2 & one segment (11bit address) \\ 42 65 LJMP (`long jump') & 3 & 3 & entire memory \\ 43 66 \hline 44 67 \end{tabular} 68 \end{center} 45 69 \caption{List of MCS51 jump instructions} 46 70 \label{f:mcs51jumps} … … 50 74 means that a conditional jump outside the short address range has to be 51 75 encoded using two jumps; the call instruction is only available in 52 mediumand long forms.76 absolute and long forms. 53 77 54 78 Note that even though the MCS51 architecture is much less advanced and more … … 57 81 reach the entire available memory. 58 82 59 Generally, in assembly code, there is only one way to indicate a jump; the 60 algorithm used by the assembler to encode these jumps into the different 83 Generally, in the code that is sent to the assembler as input, the only 84 difference made between jump instructions is by semantics, not by span. This 85 means that a distinction is made between the inconditional jump and the several 86 kinds of conditional jump, but not between their short, absolute or long 87 variants. 88 89 The algorithm used by the assembler to encode these jumps into the different 61 90 machine instructions is known as the {\tt branch displacement algorithm}. The 62 91 optimisation problem consists in using as small an encoding as possible, thus … … 66 95 which could make finding an optimal solution very timeconsuming. 67 96 68 The canonical solution, as shown in~\cite{Szymanski1978} or more recently69 in~\cite{Dickson2008} for the x86 instruction set, is to use a fixed point 70 algorithm that starts out with the shortest possible encoding (all jumps 71 encoded as short jumps, which is very probably not a correct solution) and then 72 iterates over the program to reencode those jumps whose target is outside 73 their range.74 75 \subsection*{Adding mediumjumps}97 The canonical solution, as shown by Szymanski~\cite{Szymanski1978} or more 98 recently by Dickson~\cite{Dickson2008} for the x86 instruction set, is to use a 99 fixed point algorithm that starts out with the shortest possible encoding (all 100 jumps encoded as short jumps, which is very probably not a correct solution) 101 and then iterates over the program to reencode those jumps whose target is 102 outside their range. 103 104 \subsection*{Adding absolute jumps} 76 105 77 106 In both papers mentioned above, the encoding of a jump is only dependent on the … … 81 110 Here, termination of the smallest fixed point algorithm is easy to prove. All 82 111 jumps start out encoded as short jumps, which means that the distance between 83 any jump and its target is as short as possible. If we therefore need to encode 84 a jump $j$ as a long jump, we can be sure that we can never reach a situation 85 where the span of $j$ is so small that it can be encoded as a short jump. This 86 reasoning holds throughout the subsequent iterations of the algorithms: short 87 jumps can change into long jumps, but not vice versa. Hence, the algorithm 88 either terminates when a fixed point is reached or when all short jumps have 89 been changed into long jumps. 112 any jump and its target is as short as possible. If, in this situation, there 113 is a jump $j$ whose span is not within the range for a short jump, we can be 114 sure that we can never reach a situation where the span of $j$ is so small that 115 it can be encoded as a short jump. This reasoning holds throughout the 116 subsequent iterations of the algorithm: short jumps can change into long jumps, 117 but not vice versa (spans only increase). Hence, the algorithm either 118 terminates when a fixed point is reached or when all short jumps have been 119 changed into long jumps. 90 120 91 121 Also, we can be certain that we have reached an optimal solution: a short jump … … 93 123 94 124 However, neither of these claims (termination nor optimality) hold when we add 95 the medium jump. 96 97 The reason for this is that with medium jumps, the encoding of a jump no longer 98 depends only on the distance between the jump and its target: in order for a 99 medium jump to be possible, they need to be in the same segment. It is therefore 100 entirely possible for two jumps with the same span to be encoded in different 101 ways (medium if the jump and its destination are in the same segment, long if 102 this is not the case). 125 the absolute jump. 126 127 The reason for this is that with absolute jumps, the encoding of a jump no 128 longer depends only on the distance between the jump and its target: in order 129 for an absolute jump to be possible, they need to be in the same segment (for 130 the MCS51, this means that the first 5 bytes of their addesses have to be 131 equal). It is therefore entirely possible for two jumps with the same span to 132 be encoded in different ways (absolute if the jump and its target are in the 133 same segment, long if this is not the case). 103 134 104 135 This invalidates the termination argument: a jump, once encoded as a long jump, 105 can be reencoded during a later iteration as a mediumjump. Consider the136 can be reencoded during a later iteration as an absolute jump. Consider the 106 137 program shown in figure~\ref{f:term_example}. At the start of the first 107 138 iteration, both the jump to {\tt X} and the jump to $\mathtt{L}_{0}$ are … … 115 146 be encoded as a long jump. If we assume that the jump to {\tt X} is encoded as 116 147 a long jump as well, the size of the jump will increase and $\mathtt{L}_{0}$ 117 will be `propelled' into the same segment as its jump. Hence, in the next118 iteration, it can be encoded as a mediumjump. At first glance, there is148 will be `propelled' into the same segment as its jump. Hence, in the third 149 iteration, it can be encoded as an absolute jump. At first glance, there is 119 150 nothing that prevents us from making a construction where two jumps interact 120 in such a way as to keep switching between long and medium indefinitely. 151 in such a way as to keep switching between long and absolute encodings for 152 an indefinite amount of iterations. 121 153 122 154 \begin{figure}[h] … … 128 160 jmp L\(\sb{0}\) 129 161 \end{alltt} 130 \caption{Example of a program where a long jump becomes medium}162 \caption{Example of a program where a long jump becomes absolute} 131 163 \label{f:term_example} 132 164 \end{figure} 165 166 In fact, this situation mirrors the explanation by 167 Szymanski~\cite{Szymanski1978} of why the branch displacement optimisation 168 problem is NPcomplete. In this explanation, a condition for NPcompleteness 169 is the fact that programs be allowed to contain {\em pathological} jumps. 170 These are jumps that can normally not be encoded as a short(er) jump, but gain 171 this property when some other jumps are encoded as a long(er) jump. This is 172 exactly what happens in figure~\ref{f:term_example}: by encoding the first 173 jump as a long jump, another jump switches from long to absolute 174 (which is shorter). 133 175 134 176 The optimality argument no longer holds either. Let us consider the program … … 145 187 therefore increase in size, $\mathtt{L}_{1}$ would be `propelled' across the 146 188 segment border, so that the three jumps to $\mathtt{L}_{1}$ could be encoded 147 as medium jumps. Depending on the relative sizes of long and medium jumps, this148 solution might actually be smaller than the one reached by the smallest189 as absolute jumps. Depending on the relative sizes of long and absolute jumps, 190 this solution might actually be smaller than the one reached by the smallest 149 191 fixed point algorithm. 150 192 
src/ASM/CPP2012policy/proof.tex
r2084 r2085 5 5 6 6 The main correctness statement is as follows: 7 8 \clearpage 7 9 8 10 \begin{lstlisting} … … 55 57 $2n$ the number of jumps in the program. This worst case is reached if at every 56 58 iteration, we change the encoding of exactly one jump; since a jump can change 57 from {\tt short} to {\tt medium} and from {\tt medium} to {\tt long}, there58 c an be at most $2n$ changes.59 from short to absolute and from absolute to long, there can be at most $2n$ 60 changes. 59 61 60 62 This proof has been executed in the ``Russell'' style from~\cite{Sozeau2006}. … … 114 116 iteration, and $p$ the current one), jump lengths either remain equal or 115 117 increase. It is needed for proving termination. 118 119 \clearpage 116 120 117 121 \begin{lstlisting} … … 160 164 [ short_jump $\Rightarrow$ $\neg$is_call instr $\wedge$ 161 165 \fst (short_jump_cond pc_plus_jmp_length addr) = true 162  medium_jump $\Rightarrow$ $\neg$is_relative_jump instr $\wedge$163 \fst ( medium_jump_cond pc_plus_jmp_length addr) = true $\wedge$166  absolute_jump $\Rightarrow$ $\neg$is_relative_jump instr $\wedge$ 167 \fst (absolute_jump_cond pc_plus_jmp_length addr) = true $\wedge$ 164 168 \fst (short_jump_cond pc_plus_jmp_length addr) = false 165 169  long_jump $\Rightarrow$ \fst (short_jump_cond pc_plus_jmp_length addr) = false 166 $\wedge$ \fst ( medium_jump_cond pc_plus_jmp_length addr) = false170 $\wedge$ \fst (absolute_jump_cond pc_plus_jmp_length addr) = false 167 171 ]. 168 172 \end{lstlisting} … … 200 204 We need to use two different formulations, because the fact that $added$ is 0 201 205 does not guarantee that no jumps have changed: it is possible that we have 202 replaced a short jump with a mediumjump, which does not change the size.206 replaced a short jump with a absolute jump, which does not change the size. 203 207 204 208 Therefore {\tt policy\_pc\_equal} states that $old\_sigma_1(x) = sigma_1(x)$, … … 245 249 246 250 \begin{lstlisting} 247 definition sigma_compact : list labelled_instruction → label_map → ppc_pc_map → Prop:=248 λprogram .λlabels.λsigma.251 definition sigma_compact := 252 λprogram:list labelled_instruction.λlabels:label_map.λsigma:ppc_pc_map. 249 253 ∀n.n < program → 250 254 match bvt_lookup_opt $\ldots$ (bitvector_of_nat ? n) (\snd sigma) with
Note: See TracChangeset
for help on using the changeset viewer.