1 | \documentclass[a4paper, 10pt]{article} |
---|
2 | |
---|
3 | \usepackage{a4wide} |
---|
4 | \usepackage{amsfonts} |
---|
5 | \usepackage{amsmath} |
---|
6 | \usepackage{amssymb} |
---|
7 | \usepackage[english]{babel} |
---|
8 | \usepackage{color} |
---|
9 | \usepackage{diagrams} |
---|
10 | \usepackage{graphicx} |
---|
11 | \usepackage[colorlinks]{hyperref} |
---|
12 | \usepackage[utf8x]{inputenc} |
---|
13 | \usepackage{listings} |
---|
14 | \usepackage{microtype} |
---|
15 | \usepackage{skull} |
---|
16 | \usepackage{stmaryrd} |
---|
17 | \usepackage{wasysym} |
---|
18 | |
---|
19 | \lstdefinelanguage{matita-ocaml} { |
---|
20 | mathescape=true |
---|
21 | } |
---|
22 | \lstset{ |
---|
23 | language=matita-ocaml,basicstyle=\tt,columns=flexible,breaklines=false, |
---|
24 | showspaces=false, showstringspaces=false, extendedchars=false, |
---|
25 | inputencoding=utf8x, tabsize=2 |
---|
26 | } |
---|
27 | |
---|
28 | \title{Proof outline for the correctness of the CerCo compiler} |
---|
29 | \date{\today} |
---|
30 | \author{The CerCo team} |
---|
31 | |
---|
32 | \begin{document} |
---|
33 | |
---|
34 | \maketitle |
---|
35 | |
---|
36 | \section{Introduction} |
---|
37 | \label{sect.introduction} |
---|
38 | |
---|
39 | In the last project review of the CerCo project, the project reviewers expressed the opinion that it would be a good idea to attempt to write down some of the statements of the correctness theorems that we intend to prove about the complexity preserving compiler. |
---|
40 | This document provides a very high-level, pen-and-paper sketch of what we view as the best path to completing the correctness proof for the compiler. |
---|
41 | In particular, for every translation between two intermediate languages, in both the front- and back-ends, we identify the key translation steps, and identify some invariants that we view as being important for the correctness proof. |
---|
42 | |
---|
43 | \section{Front-end: Clight to RTLabs} |
---|
44 | |
---|
45 | The front-end of the CerCo compiler consists of several stages: |
---|
46 | |
---|
47 | \begin{center} |
---|
48 | \begin{minipage}{.8\linewidth} |
---|
49 | \begin{tabbing} |
---|
50 | \quad \= $\downarrow$ \quad \= \kill |
---|
51 | \textsf{Clight}\\ |
---|
52 | \> $\downarrow$ \> cast removal\\ |
---|
53 | \> $\downarrow$ \> add runtime functions\footnote{Following the last project |
---|
54 | meeting we intend to move this transformation to the back-end}\\ |
---|
55 | \> $\downarrow$ \> cost labelling\\ |
---|
56 | \> $\downarrow$ \> loop optimizations\footnote{\label{lab:opt2}To be ported from the untrusted compiler and certified only in case of early completion of the certification of the other passes.} (an endo-transformation)\\ |
---|
57 | \> $\downarrow$ \> partial redundancy elimination$^{\mbox{\scriptsize \ref{lab:opt2}}}$ (an endo-transformation)\\ |
---|
58 | \> $\downarrow$ \> stack variable allocation and control structure |
---|
59 | simplification\\ |
---|
60 | \textsf{Cminor}\\ |
---|
61 | \> $\downarrow$ \> generate global variable initialisation code\\ |
---|
62 | \> $\downarrow$ \> transform to RTL graph\\ |
---|
63 | \textsf{RTLabs}\\ |
---|
64 | \> $\downarrow$ \> \\ |
---|
65 | \>\,\vdots |
---|
66 | \end{tabbing} |
---|
67 | \end{minipage} |
---|
68 | \end{center} |
---|
69 | |
---|
70 | Here, by `endo-transformation', we mean a mapping from language back to itself: |
---|
71 | the loop optimization step maps the Clight language to itself. |
---|
72 | |
---|
73 | %Our overall statements of correctness with respect to costs will |
---|
74 | %require a correctly labelled program |
---|
75 | There are three layers in most of the proofs proposed: |
---|
76 | \begin{enumerate} |
---|
77 | \item invariants closely tied to the syntax and transformations using |
---|
78 | dependent types, |
---|
79 | \item a forward simulation proof, and |
---|
80 | \item syntactic proofs about the cost labelling. |
---|
81 | \end{enumerate} |
---|
82 | The first will support both functional correctness and allow us to show |
---|
83 | the totality of some of the compiler stages (that is, those stages of |
---|
84 | the compiler cannot fail). The second provides the main functional |
---|
85 | correctness result, and the last will be crucial for applying |
---|
86 | correctness results about the costings from the back-end. |
---|
87 | |
---|
88 | We will also prove that a suitably labelled RTLabs trace can be turned |
---|
89 | into a \emph{structured trace} which splits the execution trace into |
---|
90 | cost-label to cost-label chunks with nested function calls. This |
---|
91 | structure was identified during work on the correctness of the |
---|
92 | back-end cost analysis as retaining important information about the |
---|
93 | structure of the execution that is difficult to reconstruct later in |
---|
94 | the compiler. |
---|
95 | |
---|
96 | \subsection{Clight cast removal} |
---|
97 | |
---|
98 | This transformation removes some casts inserted by the parser to make |
---|
99 | arithmetic promotion explicit but which are superfluous (such as |
---|
100 | \lstinline[language=C]'c = (short)((int)a + (int)b);' where |
---|
101 | \lstinline'a' and \lstinline'b' are \lstinline[language=C]'short'). |
---|
102 | This is necessary for producing good code for our target architecture. |
---|
103 | |
---|
104 | It only affects Clight expressions, recursively detecting casts that |
---|
105 | can be safely eliminated. The semantics provides a big-step |
---|
106 | definition for expression, so we should be able to show a lock-step |
---|
107 | forward simulation between otherwise identical states using a lemma |
---|
108 | showing that cast elimination does not change the evaluation of |
---|
109 | expressions. This lemma will follow from a structural induction on |
---|
110 | the source expression. We have already proved a few of the underlying |
---|
111 | arithmetic results necessary to validate the approach. |
---|
112 | |
---|
113 | \subsection{Clight cost labelling} |
---|
114 | |
---|
115 | This adds cost labels before and after selected statements and |
---|
116 | expressions, and the execution traces ought to be equivalent modulo |
---|
117 | the new cost labels. Hence it requires a simple forward simulation |
---|
118 | with a limited amount of stuttering whereever a new cost label is |
---|
119 | introduced. A bound can be given for the amount of stuttering allowed |
---|
120 | based on the statement or continuation to be evaluated next. |
---|
121 | |
---|
122 | We also intend to show three syntactic properties about the cost |
---|
123 | labelling: |
---|
124 | \begin{enumerate} |
---|
125 | \item every function starts with a cost label, |
---|
126 | \item every branching instruction is followed by a cost label (note that |
---|
127 | exiting a loop is treated as a branch), and |
---|
128 | \item the head of every loop (and any \lstinline'goto' destination) is |
---|
129 | a cost label. |
---|
130 | \end{enumerate} |
---|
131 | These can be shown by structural induction on the source term. |
---|
132 | |
---|
133 | \subsection{Clight to Cminor translation} |
---|
134 | |
---|
135 | This translation is the first to introduce some invariants, with the |
---|
136 | proofs closely tied to the implementation by dependent typing. These |
---|
137 | are largely complete and show that the generated code enjoys: |
---|
138 | \begin{itemize} |
---|
139 | \item some minimal type safety shown by explicit checks on the |
---|
140 | Cminor types during the transformation (a little more work remains |
---|
141 | to be done here, but follows the same form); |
---|
142 | \item that variables named in the parameter and local variable |
---|
143 | environments are distinct from one another, again by an explicit |
---|
144 | check; |
---|
145 | \item that variables used in the generated code are present in the |
---|
146 | resulting environment (either by checking their presence in the |
---|
147 | source environment, or from a list of freshly generated temporary variables); |
---|
148 | and |
---|
149 | \item that all \lstinline[language=C]'goto' labels are present (by |
---|
150 | checking them against a list of source labels and proving that all |
---|
151 | source labels are preserved). |
---|
152 | \end{itemize} |
---|
153 | |
---|
154 | The simulation will be similar to the relevant stages of CompCert |
---|
155 | (Clight to Csharpminor and Csharpminor to Cminor --- in the event that |
---|
156 | the direct proof is unwieldy we could introduce an intermediate |
---|
157 | language corresponding to Csharpminor). During early experimentation |
---|
158 | with porting CompCert definitions to the Matita proof assistant we |
---|
159 | found little difficulty reproving the results for the memory model, so |
---|
160 | we plan to port the memory injection properties and use them to relate |
---|
161 | Clight in-memory variables with either the value of the local variable or a |
---|
162 | stack slot, depending on how it was classified. |
---|
163 | |
---|
164 | This should be sufficient to show the equivalence of (big-step) |
---|
165 | expression evaluation. The simulation can then be shown by relating |
---|
166 | corresponding blocks of statement and continuations with their Cminor |
---|
167 | counterparts and proving that a few steps reaches the next matching |
---|
168 | state. |
---|
169 | |
---|
170 | The syntactic properties required for cost labels remain similar and a |
---|
171 | structural induction on the function bodies should be sufficient to |
---|
172 | show that they are preserved. |
---|
173 | |
---|
174 | \subsection{Cminor global initialisation code} |
---|
175 | |
---|
176 | This short phase replaces the global variable initialisation data with |
---|
177 | code that executes when the program starts. Each piece of |
---|
178 | initialisation data in the source is matched by a new statement |
---|
179 | storing that data. As each global variable is allocated a distinct |
---|
180 | memory block, the program state after the initialisation statements |
---|
181 | will be the same as the original program's state at the start of |
---|
182 | execution, and will proceed in the same manner afterwards. |
---|
183 | |
---|
184 | % Actually, the above is wrong... |
---|
185 | % ... this ought to be in a fresh main function with a fresh cost label |
---|
186 | |
---|
187 | \subsection{Cminor to RTLabs translation} |
---|
188 | |
---|
189 | In this part of the compiler we transform the program's functions into |
---|
190 | control flow graphs. It is closely related to CompCert's Cminorsel to |
---|
191 | RTL transformation, albeit with target-independent operations. |
---|
192 | |
---|
193 | We already enforce several invariants with dependent types: some type |
---|
194 | safety, mostly shown using the type information from Cminor; and |
---|
195 | that the graph is closed (by showing that each successor was recently |
---|
196 | added, or corresponds to a \lstinline[language=C]'goto' label which |
---|
197 | are all added before the end). Note that this relies on a |
---|
198 | monotonicity property; CompCert maintains a similar property in a |
---|
199 | similar way while building RTL graphs. We will also add a result |
---|
200 | showing that all of the pseudo-register names are distinct for use by |
---|
201 | later stages using the same method as Cminor. |
---|
202 | |
---|
203 | The simulation will relate Cminor states to RTLabs states which are about to |
---|
204 | execute the code corresponding to the Cminor statement or continuation. |
---|
205 | Each Cminor statement becomes zero or more RTLabs statements, with a |
---|
206 | decreasing measure based on the statement and continuations similar to |
---|
207 | CompCert's. We may also follow CompCert in using a relational |
---|
208 | specification of this stage so as to abstract away from the functional |
---|
209 | (and highly dependently typed) definition. |
---|
210 | |
---|
211 | The first two labelling properties remain as before; we will show that |
---|
212 | cost labels are preserved, so the function entry point will be a cost |
---|
213 | label, and successors to any statement that are cost labels map still |
---|
214 | map to cost labels, preserving the condition on branches. We replace |
---|
215 | the property for loops with the notion that we will always reach a |
---|
216 | cost label or the end of the function after following a bounded number of |
---|
217 | successors. This can be easily seen in Cminor using the requirement |
---|
218 | for cost labels at the head of loops and after gotos. It remains to |
---|
219 | show that this is preserved by the translation to RTLabs. % how? |
---|
220 | |
---|
221 | \subsection{RTLabs structured trace generation} |
---|
222 | |
---|
223 | This proof-only step incorporates the function call structure and cost |
---|
224 | labelling properties into the execution trace. As the function calls |
---|
225 | are nested within the trace, we need to distinguish between |
---|
226 | terminating and non-terminating function calls. Thus we use the |
---|
227 | excluded middle (specialised to a function termination property) to do |
---|
228 | this. |
---|
229 | |
---|
230 | Structured traces for terminating functions are built by following the |
---|
231 | flat trace, breaking it into chunks between cost labels and |
---|
232 | recursively processing function calls. The main difficulties here are |
---|
233 | the non-structurally recursive nature of the function (instead we use |
---|
234 | the size of the termination proof as a measure) and using the RTLabs |
---|
235 | cost labelling properties to show that the constraints of the |
---|
236 | structured traces are observed. We also show that the lower stack |
---|
237 | frames are preserved during function calls in order to prove that |
---|
238 | after returning from a function call we resume execution of the |
---|
239 | correct code. This part of the work has already been constructed, but |
---|
240 | still requires a simple proof to show that flattening the structured |
---|
241 | trace recreates the original flat trace. |
---|
242 | |
---|
243 | The non-terminating case follows the trace like the terminating |
---|
244 | version to build up chunks of trace from cost-label to cost-label |
---|
245 | (which, by the finite distance to a cost label property shown before, |
---|
246 | can be represented by an inductive type). These chunks are chained |
---|
247 | together in a coinductive data structure that can represent |
---|
248 | non-terminating traces. The excluded middle is used to decide whether |
---|
249 | function calls terminate, in which case the function described above |
---|
250 | constructs an inductive terminating structured trace which is nested |
---|
251 | in the caller's trace. Otherwise, another coinductive constructor is |
---|
252 | used to embed the non-terminating trace of the callee, generated by |
---|
253 | corecursion. This part of the trace transformation is currently under |
---|
254 | construction, and will also need a flattening result to show that it |
---|
255 | is correct. |
---|
256 | |
---|
257 | |
---|
258 | \section{Backend: RTLabs to machine code} |
---|
259 | \label{sect.backend.rtlabs.machine.code} |
---|
260 | |
---|
261 | The compiler backend consists of the following intermediate languages, and stages of translation: |
---|
262 | |
---|
263 | \begin{center} |
---|
264 | \begin{minipage}{.8\linewidth} |
---|
265 | \begin{tabbing} |
---|
266 | \quad \=\,\vdots\= \\ |
---|
267 | \> $\downarrow$ \>\\ |
---|
268 | \> $\downarrow$ \quad \= \kill |
---|
269 | \textsf{RTLabs}\\ |
---|
270 | \> $\downarrow$ \> copy propagation\footnote{\label{lab:opt}To be ported from the untrusted compiler and certified only in case of early completion of the certification of the other passes.} (an endo-transformation) \\ |
---|
271 | \> $\downarrow$ \> instruction selection\\ |
---|
272 | \textsf{RTL}\\ |
---|
273 | \> $\downarrow$ \> constant propagation$^{\mbox{\scriptsize \ref{lab:opt}}}$ (an endo-transformation) \\ |
---|
274 | \> $\downarrow$ \> calling convention made explicit \\ |
---|
275 | \> $\downarrow$ \> layout of activation records \\ |
---|
276 | \textsf{ERTL}\\ |
---|
277 | \> $\downarrow$ \> register allocation and spilling\\ |
---|
278 | \> $\downarrow$ \> dead code elimination\\ |
---|
279 | \textsf{LTL}\\ |
---|
280 | \> $\downarrow$ \> function linearisation\\ |
---|
281 | \> $\downarrow$ \> branch compression (an endo-transformation) \\ |
---|
282 | \textsf{LIN}\\ |
---|
283 | \> $\downarrow$ \> relabeling\\ |
---|
284 | \textsf{ASM}\\ |
---|
285 | \> $\downarrow$ \> pseudoinstruction expansion\\ |
---|
286 | \textsf{MCS-51 machine code}\\ |
---|
287 | \end{tabbing} |
---|
288 | \end{minipage} |
---|
289 | \end{center} |
---|
290 | |
---|
291 | \subsection{The RTLabs to RTL translation} |
---|
292 | \label{subsect.rtlabs.rtl.translation} |
---|
293 | |
---|
294 | % dpm: type system and casting load (???) |
---|
295 | We require a map, $\sigma$, between \texttt{Values} of the front-end memory model to lists of \texttt{BEValues} of the back-end memory model: |
---|
296 | |
---|
297 | \begin{displaymath} |
---|
298 | \mathtt{Value} ::= \bot \mid \mathtt{int(size)} \mid \mathtt{float} \mid \mathtt{null} \mid \mathtt{ptr} \quad\stackrel{\sigma}{\longrightarrow}\quad \mathtt{BEValue} ::= \bot \mid \mathtt{byte} \mid \mathtt{int}_i \mid \mathtt{ptr}_i |
---|
299 | \end{displaymath} |
---|
300 | |
---|
301 | We further require a map, $\sigma$, which maps the front-end \texttt{Memory} and the back-end's notion of \texttt{BEMemory}. |
---|
302 | |
---|
303 | \begin{displaymath} |
---|
304 | \mathtt{Mem}\ \alpha = \mathtt{Block} \rightarrow (\mathbb{Z} \rightarrow \alpha) |
---|
305 | \end{displaymath} |
---|
306 | |
---|
307 | where |
---|
308 | |
---|
309 | \begin{displaymath} |
---|
310 | \mathtt{Block} ::= \mathtt{Region} \cup \mathtt{ID} |
---|
311 | \end{displaymath} |
---|
312 | |
---|
313 | \begin{displaymath} |
---|
314 | \mathtt{BEMem} = \mathtt{Mem} \mathtt{Value} |
---|
315 | \end{displaymath} |
---|
316 | |
---|
317 | \begin{displaymath} |
---|
318 | \mathtt{Address} = \mathtt{BEValue} \times \mathtt{BEValue} \\ |
---|
319 | \end{displaymath} |
---|
320 | |
---|
321 | \begin{displaymath} |
---|
322 | \mathtt{Mem} = \mathtt{Block} \rightarrow (\mathbb{Z} \rightarrow \mathtt{Cont} \mid \mathtt{Value} \times \mathtt{Size} \mid \mathtt{null}) |
---|
323 | \end{displaymath} |
---|
324 | |
---|
325 | \begin{center} |
---|
326 | \begin{picture}(2, 2) |
---|
327 | \put(-150,-20){\framebox(25,25)[c]{\texttt{v}}} |
---|
328 | \put(-125,-20){\framebox(25,25)[c]{\texttt{4}}} |
---|
329 | \put(-100,-20){\framebox(25,25)[c]{\texttt{cont}}} |
---|
330 | \put(-75,-20){\framebox(25,25)[c]{\texttt{cont}}} |
---|
331 | \put(-50,-20){\framebox(25,25)[c]{\texttt{cont}}} |
---|
332 | \put(-15,-10){\vector(1, 0){30}} |
---|
333 | \put(25,-20){\framebox(25,25)[c]{\texttt{\texttt{v1}}}} |
---|
334 | \put(50,-20){\framebox(25,25)[c]{\texttt{\texttt{v2}}}} |
---|
335 | \put(75,-20){\framebox(25,25)[c]{\texttt{\texttt{v3}}}} |
---|
336 | \put(100,-20){\framebox(25,25)[c]{\texttt{\texttt{v4}}}} |
---|
337 | \end{picture} |
---|
338 | \end{center} |
---|
339 | |
---|
340 | \begin{displaymath} |
---|
341 | \mathtt{load}\ s\ a\ M = \mathtt{Some}\ v \rightarrow \forall i \leq s.\ \mathtt{load}\ s\ (a + i)\ \sigma(M) = \mathtt{Some}\ v_i |
---|
342 | \end{displaymath} |
---|
343 | |
---|
344 | \begin{displaymath} |
---|
345 | \sigma(\mathtt{store}\ v\ M) = \mathtt{store}\ \sigma(v)\ \sigma(M) |
---|
346 | \end{displaymath} |
---|
347 | |
---|
348 | \begin{displaymath} |
---|
349 | \texttt{load}^* (\mathtt{store}\ \sigma(v)\ \sigma(M))\ \sigma(a)\ \sigma(M) = \mathtt{load}^*\ \sigma(s)\ \sigma(a)\ \sigma(M) |
---|
350 | \end{displaymath} |
---|
351 | |
---|
352 | \begin{displaymath} |
---|
353 | \begin{array}{rll} |
---|
354 | \mathtt{State} & ::= & (\mathtt{State} : \mathtt{Frame}^* \times \mathtt{Frame} \\ |
---|
355 | & \mid & \mathtt{Call} : \mathtt{Frame}^* \times \mathtt{Args} \times \mathtt{Return} \times \mathtt{Fun} \\ |
---|
356 | & \mid & \mathtt{Return} : \mathtt{Frame}^* \times \mathtt{Value} \times \mathtt{Return}) \times \mathtt{Mem} |
---|
357 | \end{array} |
---|
358 | \end{displaymath} |
---|
359 | |
---|
360 | \begin{displaymath} |
---|
361 | \mathtt{State} ::= \mathtt{Frame}^* \times \mathtt{PC} \times \mathtt{SP} \times \mathtt{ISP} \times \mathtt{CARRY} \times \mathtt{REGS} |
---|
362 | \end{displaymath} |
---|
363 | |
---|
364 | \begin{displaymath} |
---|
365 | \mathtt{State} \stackrel{\sigma}{\longrightarrow} \mathtt{State} |
---|
366 | \end{displaymath} |
---|
367 | |
---|
368 | \begin{displaymath} |
---|
369 | \sigma(\mathtt{State} (\mathtt{Frame}^* \times \mathtt{Frame})) \longrightarrow ((\sigma(\mathtt{Frame}^*), \sigma(\mathtt{PC}), \sigma(\mathtt{SP}), 0, 0, \sigma(\mathtt{REGS})), \sigma(\mathtt{Mem})) |
---|
370 | \end{displaymath} |
---|
371 | |
---|
372 | \begin{displaymath} |
---|
373 | \sigma(\mathtt{Return}(-)) \longrightarrow \sigma \circ \text{return one step} |
---|
374 | \end{displaymath} |
---|
375 | |
---|
376 | \begin{displaymath} |
---|
377 | \sigma(\mathtt{Call}(-)) \longrightarrow \sigma \circ \text{call one step} |
---|
378 | \end{displaymath} |
---|
379 | |
---|
380 | Return one step commuting diagram: |
---|
381 | |
---|
382 | \begin{displaymath} |
---|
383 | \begin{diagram} |
---|
384 | s & \rTo^{\text{one step of execution}} & s' \\ |
---|
385 | & \rdTo & \dTo \\ |
---|
386 | & & \llbracket s'' \rrbracket |
---|
387 | \end{diagram} |
---|
388 | \end{displaymath} |
---|
389 | |
---|
390 | Call one step commuting diagram: |
---|
391 | |
---|
392 | \begin{displaymath} |
---|
393 | \begin{diagram} |
---|
394 | s & \rTo^{\text{one step of execution}} & s' \\ |
---|
395 | & \rdTo & \dTo \\ |
---|
396 | & & \llbracket s'' \rrbracket |
---|
397 | \end{diagram} |
---|
398 | \end{displaymath} |
---|
399 | |
---|
400 | \begin{displaymath} |
---|
401 | \begin{array}{rcl} |
---|
402 | \mathtt{Call(id,\ args,\ dst,\ pc),\ State(FRAME, FRAMES)} & \longrightarrow & \mathtt{Call(M(args), dst)}, \\ |
---|
403 | & & \mathtt{PUSH(current\_frame[PC := after\_return])} |
---|
404 | \end{array} |
---|
405 | \end{displaymath} |
---|
406 | |
---|
407 | In the case where the call is to an external function, we have: |
---|
408 | |
---|
409 | \begin{displaymath} |
---|
410 | \begin{array}{rcl} |
---|
411 | \mathtt{Call(M(args), dst)}, & \stackrel{\mathtt{ret\_val = f(M(args))}}{\longrightarrow} & \mathtt{Return(ret\_val,\ dst,\ PUSH(...))} \\ |
---|
412 | \mathtt{PUSH(current\_frame[PC := after\_return])} & & |
---|
413 | \end{array} |
---|
414 | \end{displaymath} |
---|
415 | |
---|
416 | then: |
---|
417 | |
---|
418 | \begin{displaymath} |
---|
419 | \begin{array}{rcl} |
---|
420 | \mathtt{Return(ret\_val,\ dst,\ PUSH(...))} & \longrightarrow & \mathtt{pc = POP\_STACK(regs[dst := M(ret\_val)],\ pc)} |
---|
421 | \end{array} |
---|
422 | \end{displaymath} |
---|
423 | |
---|
424 | In the case where the call is to an internal function, we have: |
---|
425 | |
---|
426 | \begin{displaymath} |
---|
427 | \begin{array}{rcl} |
---|
428 | \mathtt{CALL}(\mathtt{id}, \mathtt{args}, \mathtt{dst}, \mathtt{pc}) & \longrightarrow & \mathtt{CALL\_ID}(\mathtt{id}, \sigma'(\mathtt{args}), \sigma(\mathtt{dst}), \mathtt{pc}) \\ |
---|
429 | \mathtt{RETURN} & \longrightarrow & \mathtt{RETURN} \\ |
---|
430 | \end{array} |
---|
431 | \end{displaymath} |
---|
432 | |
---|
433 | \begin{displaymath} |
---|
434 | \begin{array}{rcl} |
---|
435 | \mathtt{Call(M(args), dst)} & \longrightarrow & \mathtt{sp = alloc}, regs = \emptyset[- := PARAMS] \\ |
---|
436 | \mathtt{PUSH(current\_frame[PC := after\_return])} & & \mathtt{State}(regs,\ sp,\ pc_\emptyset,\ dst) |
---|
437 | \end{array} |
---|
438 | \end{displaymath} |
---|
439 | |
---|
440 | then: |
---|
441 | |
---|
442 | \begin{displaymath} |
---|
443 | \begin{array}{rcl} |
---|
444 | \mathtt{sp = alloc}, regs = \emptyset[- := PARAMS] & \longrightarrow & \mathtt{free(sp)} \\ |
---|
445 | \mathtt{State}(regs,\ sp,\ pc_\emptyset,\ dst) & & \mathtt{Return(M(ret\_val), dst, frames)} |
---|
446 | \end{array} |
---|
447 | \end{displaymath} |
---|
448 | |
---|
449 | and finally: |
---|
450 | |
---|
451 | \begin{displaymath} |
---|
452 | \begin{array}{rcl} |
---|
453 | \mathtt{free(sp)} & \longrightarrow & \mathtt{pc = POP\_STACK(regs[dst := M(ret\_val)],\ pc)} \\ |
---|
454 | \mathtt{Return(M(ret\_val), dst, frames)} & & |
---|
455 | \end{array} |
---|
456 | \end{displaymath} |
---|
457 | |
---|
458 | \begin{displaymath} |
---|
459 | \begin{array}{rcl} |
---|
460 | \sigma & : & \mathtt{register} \rightarrow \mathtt{list\ register} \\ |
---|
461 | \sigma' & : & \mathtt{list\ register} \rightarrow \mathtt{list\ register} |
---|
462 | \end{array} |
---|
463 | \end{displaymath} |
---|
464 | |
---|
465 | \subsection{The RTL to ERTL translation} |
---|
466 | \label{subsect.rtl.ertl.translation} |
---|
467 | |
---|
468 | \begin{displaymath} |
---|
469 | \begin{diagram} |
---|
470 | & & \llbracket \mathtt{CALL\_ID}(\mathtt{id}, \mathtt{args}, \mathtt{dst}, \mathtt{pc})\rrbracket & & \\ |
---|
471 | & \ldTo^{\text{external}} & & \rdTo^{\text{internal}} & \\ |
---|
472 | \skull & & & & \mathtt{regs} = [\mathtt{params}/-] \\ |
---|
473 | & & & & \mathtt{sp} = \mathtt{ALLOC} \\ |
---|
474 | & & & & \mathtt{PUSH}(\mathtt{carry}, \mathtt{regs}, \mathtt{dst}, \mathtt{return\_addr}), \mathtt{pc}_{0}, \mathtt{regs}, \mathtt{sp} \\ |
---|
475 | \end{diagram} |
---|
476 | \end{displaymath} |
---|
477 | |
---|
478 | \begin{align*} |
---|
479 | \llbracket \mathtt{RETURN} \rrbracket \\ |
---|
480 | \mathtt{return\_addr} & := \mathtt{top}(\mathtt{stack}) \\ |
---|
481 | v* & := m(\mathtt{rv\_regs}) \\ |
---|
482 | \mathtt{dst}, \mathtt{sp}, \mathtt{carry}, \mathtt{regs} & := \mathtt{pop} \\ |
---|
483 | \mathtt{regs}[v* / \mathtt{dst}] \\ |
---|
484 | \end{align*} |
---|
485 | |
---|
486 | \begin{displaymath} |
---|
487 | \begin{diagram} |
---|
488 | s & \rTo^1 & s' & \rTo^1 & s'' \\ |
---|
489 | \dTo & & & \rdTo & \dTo \\ |
---|
490 | \llbracket s \rrbracket & \rTo(1,3)^1 & & & \llbracket s'' \rrbracket \\ |
---|
491 | \mathtt{CALL} \\ |
---|
492 | \end{diagram} |
---|
493 | \end{displaymath} |
---|
494 | |
---|
495 | \begin{displaymath} |
---|
496 | \begin{diagram} |
---|
497 | s & \rTo^1 & s' & \rTo^1 & s'' \\ |
---|
498 | \dTo & & & \rdTo & \dTo \\ |
---|
499 | \ & \rTo(1,3) & & & \ \\ |
---|
500 | \mathtt{RETURN} \\ |
---|
501 | \end{diagram} |
---|
502 | \end{displaymath} |
---|
503 | |
---|
504 | \begin{displaymath} |
---|
505 | \mathtt{b\_graph\_translate}: (\mathtt{label} \rightarrow \mathtt{blist'}) |
---|
506 | \rightarrow \mathtt{graph} \rightarrow \mathtt{graph} |
---|
507 | \end{displaymath} |
---|
508 | |
---|
509 | \begin{align*} |
---|
510 | \mathtt{theorem} &\ \mathtt{b\_graph\_translate\_ok}: \\ |
---|
511 | & \forall f.\forall G_{i}.\mathtt{let}\ G_{\sigma} := \mathtt{b\_graph\_translate}\ f\ G_{i}\ \mathtt{in} \\ |
---|
512 | & \forall l \in G_{i}.\mathtt{subgraph}\ (f\ l)\ l\ (\mathtt{next}\ l\ G_{i})\ G_{\sigma} |
---|
513 | \end{align*} |
---|
514 | |
---|
515 | \begin{align*} |
---|
516 | \mathtt{lemma} &\ \mathtt{execute\_1\_step\_ok}: \\ |
---|
517 | & \forall s. \mathtt{let}\ s' := s\ \sigma\ \mathtt{in} \\ |
---|
518 | & \mathtt{let}\ l := pc\ s\ \mathtt{in} \\ |
---|
519 | & s \stackrel{1}{\rightarrow} s^{*} \Rightarrow \exists n. s' \stackrel{n}{\rightarrow} s'^{*} \wedge s'^{*} = s'\ \sigma |
---|
520 | \end{align*} |
---|
521 | |
---|
522 | \begin{align*} |
---|
523 | \mathrm{RTL\ status} & \ \ \mathrm{ERTL\ status} \\ |
---|
524 | \mathtt{sp} & = \mathtt{spl} / \mathtt{sph} \\ |
---|
525 | \mathtt{graph} & \mapsto \mathtt{graph} + \mathtt{prologue}(s) + \mathtt{epilogue}(s) \\ |
---|
526 | & \mathrm{where}\ s = \mathrm{callee\ saved} + \nu \mathrm{RA} \\ |
---|
527 | \end{align*} |
---|
528 | |
---|
529 | \begin{displaymath} |
---|
530 | \begin{diagram} |
---|
531 | \mathtt{CALL} & \rTo^1 & \mathtt{inside\ function} \\ |
---|
532 | \dTo & & \dTo \\ |
---|
533 | \underbrace{\ldots}_{\llbracket \mathtt{CALL} \rrbracket} & \rTo & |
---|
534 | \underbrace{\ldots}_{\mathtt{prologue}} \\ |
---|
535 | \end{diagram} |
---|
536 | \end{displaymath} |
---|
537 | |
---|
538 | \begin{displaymath} |
---|
539 | \begin{diagram} |
---|
540 | \mathtt{RETURN} & \rTo^1 & \mathtt{.} \\ |
---|
541 | \dTo & & \dTo \\ |
---|
542 | \underbrace{\ldots}_{\mathtt{epilogue}} & \rTo & |
---|
543 | \underbrace{\ldots} \\ |
---|
544 | \end{diagram} |
---|
545 | \end{displaymath} |
---|
546 | |
---|
547 | \begin{align*} |
---|
548 | \mathtt{prologue}(s) = & \mathtt{create\_new\_frame}; \\ |
---|
549 | & \mathtt{pop\ ra}; \\ |
---|
550 | & \mathtt{save\ callee\_saved}; \\ |
---|
551 | & \mathtt{get\_params} \\ |
---|
552 | & \ \ \mathtt{reg\_params}: \mathtt{move} \\ |
---|
553 | & \ \ \mathtt{stack\_params}: \mathtt{push}/\mathtt{pop}/\mathtt{move} \\ |
---|
554 | \end{align*} |
---|
555 | |
---|
556 | \begin{align*} |
---|
557 | \mathtt{epilogue}(s) = & \mathtt{save\ return\ to\ tmp\ real\ regs}; \\ |
---|
558 | & \mathtt{restore\_registers}; \\ |
---|
559 | & \mathtt{push\ ra}; \\ |
---|
560 | & \mathtt{delete\_frame}; \\ |
---|
561 | & \mathtt{save return} \\ |
---|
562 | \end{align*} |
---|
563 | |
---|
564 | \begin{displaymath} |
---|
565 | \mathtt{CALL}\ id \mapsto \mathtt{set\_params};\ \mathtt{CALL}\ id;\ \mathtt{fetch\_result} |
---|
566 | \end{displaymath} |
---|
567 | |
---|
568 | \subsection{The ERTL to LTL translation} |
---|
569 | \label{subsect.ertl.ltl.translation} |
---|
570 | |
---|
571 | \subsection{The LTL to LIN translation} |
---|
572 | \label{subsect.ltl.lin.translation} |
---|
573 | |
---|
574 | We require a map, $\sigma$, from LTL statuses, where program counters are represented as labels in a graph data structure, to LIN statuses, where program counters are natural numbers: |
---|
575 | \begin{displaymath} |
---|
576 | \mathtt{pc : label} \stackrel{\sigma}{\longrightarrow} \mathbb{N} |
---|
577 | \end{displaymath} |
---|
578 | |
---|
579 | The LTL to LIN translation pass also linearises the graph data structure into a list of instructions. |
---|
580 | Pseudocode for the linearisation process is as follows: |
---|
581 | |
---|
582 | \begin{lstlisting} |
---|
583 | let rec linearise graph visited required generated todo := |
---|
584 | match todo with |
---|
585 | | l::todo -> |
---|
586 | if l $\in$ visited then |
---|
587 | let generated := generated $\cup\ \{$ Goto l $\}$ in |
---|
588 | let required := required $\cup$ l in |
---|
589 | linearise graph visited required generated todo |
---|
590 | else |
---|
591 | -- Get the instruction at label `l' in the graph |
---|
592 | let lookup := graph(l) in |
---|
593 | let generated := generated $\cup\ \{$ lookup $\}$ in |
---|
594 | -- Find the successor of the instruction at label `l' in the graph |
---|
595 | let successor := succ(l, graph) in |
---|
596 | let todo := successor::todo in |
---|
597 | linearise graph visited required generated todo |
---|
598 | | [] -> (required, generated) |
---|
599 | \end{lstlisting} |
---|
600 | |
---|
601 | It is easy to see that this linearisation process eventually terminates. |
---|
602 | In particular, the size of the visited label set is monotonically increasing, and is bounded above by the size of the graph that we are linearising. |
---|
603 | |
---|
604 | The initial call to \texttt{linearise} sees the \texttt{visited}, \texttt{required} and \texttt{generated} sets set to the empty set, and \texttt{todo} initialized with the singleton list consisting of the entry point of the graph. |
---|
605 | We envisage needing to prove the following invariants on the linearisation function above: |
---|
606 | |
---|
607 | \begin{enumerate} |
---|
608 | \item |
---|
609 | $\mathtt{visited} \approx \mathtt{generated}$, where $\approx$ is \emph{multiset} equality, as \texttt{generated} is a set of instructions where instructions may mention labels multiple times, and \texttt{visited} is a set of labels, |
---|
610 | \item |
---|
611 | $\forall \mathtt{l} \in \mathtt{generated}.\ \mathtt{succ(l,\ graph)} \subseteq \mathtt{required} \cup \mathtt{todo}$, |
---|
612 | \item |
---|
613 | $\mathtt{required} \subseteq \mathtt{visited}$, |
---|
614 | \item |
---|
615 | $\mathtt{visited} \cap \mathtt{todo} = \emptyset$. |
---|
616 | \end{enumerate} |
---|
617 | |
---|
618 | The invariants collectively imply the following properties, crucial to correctness, about the linearisation process: |
---|
619 | |
---|
620 | \begin{enumerate} |
---|
621 | \item |
---|
622 | Every graph node is visited at most once, |
---|
623 | \item |
---|
624 | Every instruction that is generated is generated due to some graph node being visited, |
---|
625 | \item |
---|
626 | The successor instruction of every instruction that has been visited already will eventually be visited too. |
---|
627 | \end{enumerate} |
---|
628 | |
---|
629 | Note, because the LTL to LIN transformation is the first time the program is linearised, we must discover a notion of `well formed program' suitable for linearised forms. |
---|
630 | In particular, we see the notion of well formedness (yet to be formally defined) resting on the following conditions: |
---|
631 | |
---|
632 | \begin{enumerate} |
---|
633 | \item |
---|
634 | For every jump to a label in a linearised program, the target label exists at some point in the program, |
---|
635 | \item |
---|
636 | Each label is unique, appearing only once in the program, |
---|
637 | \item |
---|
638 | The final instruction of a program must be a return. |
---|
639 | \end{enumerate} |
---|
640 | |
---|
641 | We assume that these properties will be easy consequences of the invariants on the linearisation function defined above. |
---|
642 | |
---|
643 | The final condition above is potentially a little opaque, so we explain further. |
---|
644 | First, the only instructions that can reasonably appear in final position at the end of a program are returns or backward jumps, as any other instruction would cause execution to `fall out' of the end of the program (for example, when a function invoked with \texttt{CALL} returns, it returns to the next instruction past the \texttt{CALL} that invoked it). |
---|
645 | However, in LIN, though each function's graph has been linearised, the entire program is not yet fully linearised into a list of instructions, but rather, a list of `functions', each consisting of a linearised body along with other data. |
---|
646 | Each well-formed function must end with a call to \texttt{RET}, and therefore the only correct instruction that can terminate a LIN program is a \texttt{RET} instruction. |
---|
647 | |
---|
648 | \subsection{The LIN to ASM and ASM to MCS-51 machine code translations} |
---|
649 | \label{subsect.lin.asm.translation} |
---|
650 | |
---|
651 | The LIN to ASM translation step is trivial, being almost the identity function. |
---|
652 | The only non-trivial feature of the LIN to ASM translation is that all labels are `named apart' so that there is no chance of freshly generated labels from different namespaces clashing with labels from another namespace. |
---|
653 | |
---|
654 | The ASM to MCS-51 machine code translation step, and the required statements of correctness, are found in an unpublished manuscript attached to this document. |
---|
655 | |
---|
656 | \section{Estimated effort} |
---|
657 | Based on the rough analysis performed so far we can estimate the total |
---|
658 | effort for the certification of the compiler. We obtain this estimation by |
---|
659 | combining, for each pass: 1) the number of lines of code to be certified; |
---|
660 | 2) the ratio of number of lines of proof to number of lines of code from |
---|
661 | the CompCert project~\cite{compcert} for the CompCert pass that is closest to |
---|
662 | ours; 3) an estimation of the complexity of the pass according to the |
---|
663 | analysis above. |
---|
664 | |
---|
665 | \begin{tabular}{lrll} |
---|
666 | Pass origin & Code lines & CompCert ratio & Estimated effort \\ |
---|
667 | \hline |
---|
668 | Cminor & 2913 & 5.23 \permil & 15.24 \\ |
---|
669 | RTLabs & 1252 & 1.17 \permil & 1.48 \\ |
---|
670 | RTL & 469 & 4.17 \permil & 1.95 \\ |
---|
671 | ERTL & 789 & 3.01 \permil & 2.38 \\ |
---|
672 | LTL & 92 & 5.94 \permil & 0.55 \\ |
---|
673 | LIN & 354 & 6.54 \permil & 2.31 \\ |
---|
674 | ASM & 984 & 4.80 \permil & 4.72 \\ |
---|
675 | \hline |
---|
676 | Total & 6853 & 4.17 \permil & 28.63 \\ |
---|
677 | \end{tabular} |
---|
678 | |
---|
679 | We provide now some additional informations on the methodology used in the |
---|
680 | computation. The passes in Cerco and CompCert front-end closely match each |
---|
681 | other. However, there is no clear correspondence between the two back-ends. |
---|
682 | For instance, we enforce the calling convention immediately after instruction |
---|
683 | selection, whereas in CompCert this is performed in a later phase. Or we |
---|
684 | linearize the code at the very end, whereas CompCert performs linearization |
---|
685 | as soon as possible. Therefore, the first part of the exercise has consisted |
---|
686 | in shuffling and partitioning the CompCert code in order to assign to each |
---|
687 | CerCo pass the CompCert code that performs the same transformation. |
---|
688 | |
---|
689 | After this preliminary step, using the data given in~\cite{compcert} (which |
---|
690 | are relative to an early version of CompCert) we computed the ratio between |
---|
691 | men months and lines of code in CompCert for each CerCo pass. This is shown |
---|
692 | in the third column of Table~\ref{wildguess}. For those CerCo passes that |
---|
693 | have no correspondence in CompCert (like the optimizing assembler) or where |
---|
694 | we have insufficient data, we have used the average of the ratios computed |
---|
695 | above. |
---|
696 | |
---|
697 | The first column of the table shows the number of lines of code for each |
---|
698 | pass in CerCo. The third column is obtained multiplying the first with the |
---|
699 | CompCert ratio. It provides an estimate of the effort required (in men months) |
---|
700 | if the complexity of the proofs for CerCo and Compcert would be the same. |
---|
701 | |
---|
702 | The two proof styles, however, are on purpose completely different. Where |
---|
703 | CompCert uses non executable semantics, describing the various semantics with |
---|
704 | inductive types, we have preferred executable semantics. Therefore, CompCert |
---|
705 | proofs by induction and inversion become proof by functional inversion, |
---|
706 | performed using the Russel methodology (now called Program in Coq, but whose |
---|
707 | behaviour differs from Matita's one). Moreover, CompCert code is written using |
---|
708 | only types that belong to the Hindley-Milner fragment, whereas we have |
---|
709 | heavily exploited dependent types all over the code. The dependent type |
---|
710 | discipline offers many advantages from the point of view of clarity of the |
---|
711 | invariants involved and early detection of errors and it naturally combines |
---|
712 | well with the Russel approach which is based on dependent types. However, it |
---|
713 | is also well known to introduce technical problems all over the code, like |
---|
714 | the need to explicitly prove type equalities to be able to manipulate |
---|
715 | expressions in certain ways. In many situations, the difficulties encountered |
---|
716 | with manipulating dependent types are better addressed by improving the Matita |
---|
717 | system, according to the formalization driven system development. For this |
---|
718 | reason, and assuming a pessimistic point of view on our performance, the |
---|
719 | fourth columns presents the final estimation of the effort required, that also |
---|
720 | takes in account the complexity of the proof suggested by the informal proofs |
---|
721 | sketched in the previous section. |
---|
722 | |
---|
723 | \end{document} |
---|