source: Deliverables/D1.2/CompilerProofOutline/outline.tex @ 1751

Last change on this file since 1751 was 1751, checked in by mulligan, 8 years ago

Commit to avoid conflicts with Claudio

File size: 35.9 KB
Line 
1\documentclass[a4paper, 10pt]{article}
2
3\usepackage{a4wide}
4\usepackage{amsfonts}
5\usepackage{amsmath}
6\usepackage{amssymb}
7\usepackage[english]{babel}
8\usepackage{color}
9\usepackage{diagrams}
10\usepackage{graphicx}
11\usepackage[colorlinks]{hyperref}
12\usepackage[utf8x]{inputenc}
13\usepackage{listings}
14\usepackage{microtype}
15\usepackage{skull}
16\usepackage{stmaryrd}
17\usepackage{array}
18\newcolumntype{b}{@{}>{{}}}
19\newcolumntype{B}{@{}>{{}}c<{{}}@{}}
20\newcolumntype{h}[1]{@{\hspace{#1}}}
21\newcolumntype{L}{>{$}l<{$}}
22\newcolumntype{C}{>{$}c<{$}}
23\newcolumntype{R}{>{$}r<{$}}
24\newcolumntype{S}{>{$(}r<{)$}}
25\newcolumntype{n}{@{}}
26\usepackage{wasysym}
27
28
29\lstdefinelanguage{matita-ocaml} {
30  mathescape=true
31}
32\lstset{
33  language=matita-ocaml,basicstyle=\tt,columns=flexible,breaklines=false,
34  showspaces=false, showstringspaces=false, extendedchars=false,
35  inputencoding=utf8x, tabsize=2
36}
37
38\title{Proof outline for the correctness of the CerCo compiler}
39\date{\today}
40\author{The CerCo team}
41
42\begin{document}
43
44\maketitle
45
46\section{Introduction}
47\label{sect.introduction}
48
49In the last project review of the CerCo project, the project reviewers
50recommended us to quickly outline a paper-and-pencil correctness proof
51for each of the stages of the CerCo compiler in order to allow for an
52estimation of the complexity and time required to complete the formalization
53of the proof. This has been possible starting from month 18 when we have
54completed the formalization in Matita of the datastructures and code of
55the compiler.
56
57In this document we provide a very high-level, pen-and-paper
58sketch of what we view as the best path to completing the correctness proof
59for the compiler. In particular, for every translation between two intermediate languages, in both the front- and back-ends, we identify the key translation steps, and identify some invariants that we view as being important for the correctness proof. We also briefly describe the parts of the proof that have already
60been completed at the end of the First Period.
61
62In the last section we finally present an estimation of the effort required
63for the certification in Matita of the compiler.
64
65\section{Front-end: Clight to RTLabs}
66
67The front-end of the CerCo compiler consists of several stages:
68
69\begin{center}
70\begin{minipage}{.8\linewidth}
71\begin{tabbing}
72\quad \= $\downarrow$ \quad \= \kill
73\textsf{Clight}\\
74\> $\downarrow$ \> cast removal\\
75\> $\downarrow$ \> add runtime functions\footnote{Following the last project
76meeting we intend to move this transformation to the back-end}\\
77\> $\downarrow$ \> cost labelling\\
78\> $\downarrow$ \> loop optimizations\footnote{\label{lab:opt2}To be ported from the untrusted compiler and certified only in case of early completion of the certification of the other passes.} (an endo-transformation)\\
79\> $\downarrow$ \> partial redundancy elimination$^{\mbox{\scriptsize \ref{lab:opt2}}}$ (an endo-transformation)\\
80\> $\downarrow$ \> stack variable allocation and control structure
81 simplification\\
82\textsf{Cminor}\\
83\> $\downarrow$ \> generate global variable initialisation code\\
84\> $\downarrow$ \> transform to RTL graph\\
85\textsf{RTLabs}\\
86\> $\downarrow$ \> \\
87\>\,\vdots
88\end{tabbing}
89\end{minipage}
90\end{center}
91
92Here, by `endo-transformation', we mean a mapping from language back to itself:
93the loop optimization step maps the Clight language to itself.
94
95%Our overall statements of correctness with respect to costs will
96%require a correctly labelled program
97There are three layers in most of the proofs proposed:
98\begin{enumerate}
99\item invariants closely tied to the syntax and transformations using
100  dependent types,
101\item a forward simulation proof, and
102\item syntactic proofs about the cost labelling.
103\end{enumerate}
104The first will support both functional correctness and allow us to show
105the totality of some of the compiler stages (that is, those stages of
106the compiler cannot fail).  The second provides the main functional
107correctness result, and the last will be crucial for applying
108correctness results about the costings from the back-end.
109
110We will also prove that a suitably labelled RTLabs trace can be turned
111into a \emph{structured trace} which splits the execution trace into
112cost-label to cost-label chunks with nested function calls.  This
113structure was identified during work on the correctness of the
114back-end cost analysis as retaining important information about the
115structure of the execution that is difficult to reconstruct later in
116the compiler.
117
118\subsection{Clight cast removal}
119
120This transformation removes some casts inserted by the parser to make
121arithmetic promotion explicit but which are superfluous (such as
122\lstinline[language=C]'c = (short)((int)a + (int)b);' where
123\lstinline'a' and \lstinline'b' are \lstinline[language=C]'short').
124This is necessary for producing good code for our target architecture.
125
126It only affects Clight expressions, recursively detecting casts that
127can be safely eliminated.  The semantics provides a big-step
128definition for expression, so we should be able to show a lock-step
129forward simulation between otherwise identical states using a lemma
130showing that cast elimination does not change the evaluation of
131expressions.  This lemma will follow from a structural induction on
132the source expression.  We have already proved a few of the underlying
133arithmetic results necessary to validate the approach.
134
135\subsection{Clight cost labelling}
136
137This adds cost labels before and after selected statements and
138expressions, and the execution traces ought to be equivalent modulo
139the new cost labels.  Hence it requires a simple forward simulation
140with a limited amount of stuttering whereever a new cost label is
141introduced.  A bound can be given for the amount of stuttering allowed
142based on the statement or continuation to be evaluated next.
143
144We also intend to show three syntactic properties about the cost
145labelling:
146\begin{enumerate}
147\item every function starts with a cost label,
148\item every branching instruction is followed by a cost label (note that
149  exiting a loop is treated as a branch), and
150\item the head of every loop (and any \lstinline'goto' destination) is
151  a cost label.
152\end{enumerate}
153These can be shown by structural induction on the source term.
154
155\subsection{Clight to Cminor translation}
156
157This translation is the first to introduce some invariants, with the
158proofs closely tied to the implementation by dependent typing.  These
159are largely complete and show that the generated code enjoys:
160\begin{itemize}
161\item some minimal type safety shown by explicit checks on the
162  Cminor types during the transformation (a little more work remains
163  to be done here, but follows the same form);
164\item that variables named in the parameter and local variable
165  environments are distinct from one another, again by an explicit
166  check;
167\item that variables used in the generated code are present in the
168  resulting environment (either by checking their presence in the
169  source environment, or from a list of freshly generated temporary variables);
170  and
171\item that all \lstinline[language=C]'goto' labels are present (by
172  checking them against a list of source labels and proving that all
173  source labels are preserved).
174\end{itemize}
175
176The simulation will be similar to the relevant stages of CompCert
177(Clight to Csharpminor and Csharpminor to Cminor --- in the event that
178the direct proof is unwieldy we could introduce an intermediate
179language corresponding to Csharpminor).  During early experimentation
180with porting CompCert definitions to the Matita proof assistant we
181found little difficulty reproving the results for the memory model, so
182we plan to port the memory injection properties and use them to relate
183Clight in-memory variables with either the value of the local variable or a
184stack slot, depending on how it was classified.
185
186This should be sufficient to show the equivalence of (big-step)
187expression evaluation.  The simulation can then be shown by relating
188corresponding blocks of statement and continuations with their Cminor
189counterparts and proving that a few steps reaches the next matching
190state.
191
192The syntactic properties required for cost labels remain similar and a
193structural induction on the function bodies should be sufficient to
194show that they are preserved.
195
196\subsection{Cminor global initialisation code}
197
198This short phase replaces the global variable initialisation data with
199code that executes when the program starts.  Each piece of
200initialisation data in the source is matched by a new statement
201storing that data.  As each global variable is allocated a distinct
202memory block, the program state after the initialisation statements
203will be the same as the original program's state at the start of
204execution, and will proceed in the same manner afterwards.
205
206% Actually, the above is wrong...
207% ... this ought to be in a fresh main function with a fresh cost label
208
209\subsection{Cminor to RTLabs translation}
210
211In this part of the compiler we transform the program's functions into
212control flow graphs.  It is closely related to CompCert's Cminorsel to
213RTL transformation, albeit with target-independent operations.
214
215We already enforce several invariants with dependent types: some type
216safety, mostly shown using the type information from Cminor; and
217that the graph is closed (by showing that each successor was recently
218added, or corresponds to a \lstinline[language=C]'goto' label which
219are all added before the end).  Note that this relies on a
220monotonicity property; CompCert maintains a similar property in a
221similar way while building RTL graphs.  We will also add a result
222showing that all of the pseudo-register names are distinct for use by
223later stages using the same method as Cminor.
224
225The simulation will relate Cminor states to RTLabs states which are about to
226execute the code corresponding to the Cminor statement or continuation.
227Each Cminor statement becomes zero or more RTLabs statements, with a
228decreasing measure based on the statement and continuations similar to
229CompCert's.  We may also follow CompCert in using a relational
230specification of this stage so as to abstract away from the functional
231(and highly dependently typed) definition.
232
233The first two labelling properties remain as before; we will show that
234cost labels are preserved, so the function entry point will be a cost
235label, and successors to any statement that are cost labels map still
236map to cost labels, preserving the condition on branches.  We replace
237the property for loops with the notion that we will always reach a
238cost label or the end of the function after following a bounded number of
239successors.  This can be easily seen in Cminor using the requirement
240for cost labels at the head of loops and after gotos.  It remains to
241show that this is preserved by the translation to RTLabs.  % how?
242
243\subsection{RTLabs structured trace generation}
244
245This proof-only step incorporates the function call structure and cost
246labelling properties into the execution trace.  As the function calls
247are nested within the trace, we need to distinguish between
248terminating and non-terminating function calls.  Thus we use the
249excluded middle (specialised to a function termination property) to do
250this.
251
252Structured traces for terminating functions are built by following the
253flat trace, breaking it into chunks between cost labels and
254recursively processing function calls.  The main difficulties here are
255the non-structurally recursive nature of the function (instead we use
256the size of the termination proof as a measure) and using the RTLabs
257cost labelling properties to show that the constraints of the
258structured traces are observed.  We also show that the lower stack
259frames are preserved during function calls in order to prove that
260after returning from a function call we resume execution of the
261correct code.  This part of the work has already been constructed, but
262still requires a simple proof to show that flattening the structured
263trace recreates the original flat trace.
264
265The non-terminating case follows the trace like the terminating
266version to build up chunks of trace from cost-label to cost-label
267(which, by the finite distance to a cost label property shown before,
268can be represented by an inductive type).  These chunks are chained
269together in a coinductive data structure that can represent
270non-terminating traces.  The excluded middle is used to decide whether
271function calls terminate, in which case the function described above
272constructs an inductive terminating structured trace which is nested
273in the caller's trace.  Otherwise, another coinductive constructor is
274used to embed the non-terminating trace of the callee, generated by
275corecursion.  This part of the trace transformation is currently under
276construction, and will also need a flattening result to show that it
277is correct.
278
279
280\section{Backend: RTLabs to machine code}
281\label{sect.backend.rtlabs.machine.code}
282
283The compiler backend consists of the following intermediate languages, and stages of translation:
284
285\begin{center}
286\begin{minipage}{.8\linewidth}
287\begin{tabbing}
288\quad \=\,\vdots\= \\
289\> $\downarrow$ \>\\
290\> $\downarrow$ \quad \= \kill
291\textsf{RTLabs}\\
292\> $\downarrow$ \> copy propagation\footnote{\label{lab:opt}To be ported from the untrusted compiler and certified only in case of early completion of the certification of the other passes.} (an endo-transformation) \\
293\> $\downarrow$ \> instruction selection\\
294\> $\downarrow$ \> change of memory models in compiler\\
295\textsf{RTL}\\
296\> $\downarrow$ \> constant propagation$^{\mbox{\scriptsize \ref{lab:opt}}}$ (an endo-transformation) \\
297\> $\downarrow$ \> calling convention made explicit \\
298\> $\downarrow$ \> layout of activation records \\
299\textsf{ERTL}\\
300\> $\downarrow$ \> register allocation and spilling\\
301\> $\downarrow$ \> dead code elimination\\
302\textsf{LTL}\\
303\> $\downarrow$ \> function linearisation\\
304\> $\downarrow$ \> branch compression (an endo-transformation) \\
305\textsf{LIN}\\
306\> $\downarrow$ \> relabeling\\
307\textsf{ASM}\\
308\> $\downarrow$ \> pseudoinstruction expansion\\
309\textsf{MCS-51 machine code}\\
310\end{tabbing}
311\end{minipage}
312\end{center}
313
314\subsection{The RTLabs to RTL translation}
315\label{subsect.rtlabs.rtl.translation}
316
317The RTLabs to RTL translation pass marks the frontier between the two memory models used in the CerCo project.
318As a result, we require some method of translating between the values that the two memory models permit.
319Suppose we have such a translation, $\sigma$.
320Then the translation between values of the two memory models may be pictured with:
321
322\begin{displaymath}
323\mathtt{Value} ::= \bot \mid \mathtt{int(size)} \mid \mathtt{float} \mid \mathtt{null} \mid \mathtt{ptr} \quad\stackrel{\sigma}{\longrightarrow}\quad \mathtt{BEValue} ::= \bot \mid \mathtt{byte} \mid \mathtt{int}_i \mid \mathtt{ptr}_i
324\end{displaymath}
325
326In the front-end, we have both integer and float values, where integer values are `sized, along with null values and pointers.
327In the back-end memory model, floats values are no longer present, as floating point arithmetic is not supported by the CerCo compiler.
328However, the back-end memory model introduces an undefined value ($\bot$) and sized pointer values.
329
330We further require a map, $\sigma$, which maps the front-end \texttt{Memory} and the back-end's notion of \texttt{BEMemory}.
331
332\begin{displaymath}
333\mathtt{Mem}\ \alpha = \mathtt{Block} \rightarrow (\mathbb{Z} \rightarrow \alpha)
334\end{displaymath}
335
336Here, \texttt{Block} consists of a \texttt{Region} paired with an identifier.
337
338\begin{displaymath}
339\mathtt{Block} ::= \mathtt{Region} \cup \mathtt{ID}
340\end{displaymath}
341
342We now have what we need for defining what is meant by the `memory' in the backend memory model.
343Namely, we instantiate the previously defined \texttt{Mem} type with the type of back-end memory values.
344
345\begin{displaymath}
346\mathtt{BEMem} = \mathtt{Mem} \mathtt{BEValue}
347\end{displaymath}
348
349Memory addresses consist of a pair of back-end memory values:
350
351\begin{displaymath}
352\mathtt{Address} = \mathtt{BEValue} \times  \mathtt{BEValue} \\
353\end{displaymath}
354
355The back- and front-end memory models differ in how they represent sized integeer values in memory.
356In particular, the front-end stores integer values as a header, with size information, followed by a string of `continuation' blocks, marking out the full representation of the value in memory.
357In contrast, the layout of sized integer values in the back-end memory model consists of a series of byte-sized `chunks':
358\begin{center}
359\begin{picture}(2, 2)
360\put(-150,-20){\framebox(25,25)[c]{\texttt{v}}}
361\put(-125,-20){\framebox(25,25)[c]{\texttt{4}}}
362\put(-100,-20){\framebox(25,25)[c]{\texttt{cont}}}
363\put(-75,-20){\framebox(25,25)[c]{\texttt{cont}}}
364\put(-50,-20){\framebox(25,25)[c]{\texttt{cont}}}
365\put(-15,-10){\vector(1, 0){30}}
366\put(25,-20){\framebox(25,25)[c]{\texttt{\texttt{v1}}}}
367\put(50,-20){\framebox(25,25)[c]{\texttt{\texttt{v2}}}}
368\put(75,-20){\framebox(25,25)[c]{\texttt{\texttt{v3}}}}
369\put(100,-20){\framebox(25,25)[c]{\texttt{\texttt{v4}}}}
370\end{picture}
371\end{center}
372The differing memory representations of values in the two memory models imply the need for a series of lemmas on the actions of \texttt{load} and \texttt{store} to ensure correctness.
373The first lemma required has the following statement:
374\begin{displaymath}
375\mathtt{load}\ s\ a\ M = \mathtt{Some}\ v \rightarrow \forall i \leq s.\ \mathtt{load}\ s\ (a + i)\ \sigma(M) = \mathtt{Some}\ v_i
376\end{displaymath}
377That is, if we are successful in reading a value of size $s$ from memory at address $a$ in front-end memory, then we should successfully be able to read a value from memory in the back-end memory at \emph{any} address from address $a$ up to and including address $a + i$, where $i \leq s$.
378
379\begin{displaymath}
380\sigma(\mathtt{store}\ v\ M) = \mathtt{store}\ \sigma(v)\ \sigma(M)
381\end{displaymath}
382
383\begin{displaymath}
384\texttt{load}^* (\mathtt{store}\ \sigma(v)\ \sigma(M))\ \sigma(a)\ \sigma(M) = \mathtt{load}^*\ \sigma(s)\ \sigma(a)\ \sigma(M)
385\end{displaymath}
386
387\begin{displaymath}
388\begin{array}{rll}
389\mathtt{State} & ::=  & (\mathtt{State} : \mathtt{Frame}^* \times \mathtt{Frame} \\
390               & \mid & \mathtt{Call} : \mathtt{Frame}^* \times \mathtt{Args} \times \mathtt{Return} \times \mathtt{Fun} \\
391               & \mid & \mathtt{Return} : \mathtt{Frame}^* \times \mathtt{Value} \times \mathtt{Return}) \times \mathtt{Mem}
392\end{array}
393\end{displaymath}
394
395\begin{displaymath}
396\mathtt{State} ::= \mathtt{Frame}^* \times \mathtt{PC} \times \mathtt{SP} \times \mathtt{ISP} \times \mathtt{CARRY} \times \mathtt{REGS}
397\end{displaymath}
398
399\begin{displaymath}
400\mathtt{State} \stackrel{\sigma}{\longrightarrow} \mathtt{State}
401\end{displaymath}
402
403\begin{displaymath}
404\sigma(\mathtt{State} (\mathtt{Frame}^* \times \mathtt{Frame})) \longrightarrow ((\sigma(\mathtt{Frame}^*), \sigma(\mathtt{PC}), \sigma(\mathtt{SP}), 0, 0, \sigma(\mathtt{REGS})), \sigma(\mathtt{Mem}))
405\end{displaymath}
406
407\begin{displaymath}
408\sigma(\mathtt{Return}(-)) \longrightarrow \sigma \circ \text{return one step}
409\end{displaymath}
410
411\begin{displaymath}
412\sigma(\mathtt{Call}(-)) \longrightarrow \sigma \circ \text{call one step}
413\end{displaymath}
414
415Return one step commuting diagram:
416
417\begin{displaymath}
418\begin{diagram}
419s & \rTo^{\text{one step of execution}} & s'   \\
420  & \rdTo                             & \dTo \\
421  &                                   & \llbracket s'' \rrbracket
422\end{diagram}
423\end{displaymath}
424
425Call one step commuting diagram:
426
427\begin{displaymath}
428\begin{diagram}
429s & \rTo^{\text{one step of execution}} & s'   \\
430  & \rdTo                             & \dTo \\
431  &                                   & \llbracket s'' \rrbracket
432\end{diagram}
433\end{displaymath}
434
435\begin{displaymath}
436\begin{array}{rcl}
437\mathtt{Call(id,\ args,\ dst,\ pc),\ State(FRAME, FRAMES)} & \longrightarrow & \mathtt{Call(M(args), dst)}, \\
438                                                           &                 & \mathtt{PUSH(current\_frame[PC := after\_return])}
439\end{array}
440\end{displaymath}
441
442In the case where the call is to an external function, we have:
443
444\begin{displaymath}
445\begin{array}{rcl}
446\mathtt{Call(M(args), dst)},                       & \stackrel{\mathtt{ret\_val = f(M(args))}}{\longrightarrow} & \mathtt{Return(ret\_val,\ dst,\ PUSH(...))} \\
447\mathtt{PUSH(current\_frame[PC := after\_return])} &                                                            & 
448\end{array}
449\end{displaymath}
450
451then:
452
453\begin{displaymath}
454\begin{array}{rcl}
455\mathtt{Return(ret\_val,\ dst,\ PUSH(...))} & \longrightarrow & \mathtt{pc = POP\_STACK(regs[dst := M(ret\_val)],\ pc)}
456\end{array}
457\end{displaymath}
458
459In the case where the call is to an internal function, we have:
460
461\begin{displaymath}
462\begin{array}{rcl}
463\mathtt{CALL}(\mathtt{id}, \mathtt{args}, \mathtt{dst}, \mathtt{pc}) & \longrightarrow & \mathtt{CALL\_ID}(\mathtt{id}, \sigma'(\mathtt{args}), \sigma(\mathtt{dst}), \mathtt{pc}) \\
464\mathtt{RETURN}                                                      & \longrightarrow & \mathtt{RETURN} \\
465\end{array} 
466\end{displaymath}
467
468\begin{displaymath}
469\begin{array}{rcl}
470\mathtt{Call(M(args), dst)}                        & \longrightarrow & \mathtt{sp = alloc}, regs = \emptyset[- := PARAMS] \\
471\mathtt{PUSH(current\_frame[PC := after\_return])} &                 & \mathtt{State}(regs,\ sp,\ pc_\emptyset,\ dst)
472\end{array}
473\end{displaymath}
474
475then:
476
477\begin{displaymath}
478\begin{array}{rcl}
479\mathtt{sp = alloc}, regs = \emptyset[- := PARAMS] & \longrightarrow & \mathtt{free(sp)} \\
480\mathtt{State}(regs,\ sp,\ pc_\emptyset,\ dst)     &                 & \mathtt{Return(M(ret\_val), dst, frames)}
481\end{array}
482\end{displaymath}
483
484and finally:
485
486\begin{displaymath}
487\begin{array}{rcl}
488\mathtt{free(sp)}                         & \longrightarrow & \mathtt{pc = POP\_STACK(regs[dst := M(ret\_val)],\ pc)} \\
489\mathtt{Return(M(ret\_val), dst, frames)} &                 & 
490\end{array}
491\end{displaymath}
492
493\begin{displaymath}
494\begin{array}{rcl}
495\sigma & : & \mathtt{register} \rightarrow \mathtt{list\ register} \\
496\sigma' & : & \mathtt{list\ register} \rightarrow \mathtt{list\ register}
497\end{array}
498\end{displaymath}
499
500\subsection{The RTL to ERTL translation}
501\label{subsect.rtl.ertl.translation}
502
503\begin{displaymath}
504\begin{diagram}
505& & \llbracket \mathtt{CALL\_ID}(\mathtt{id}, \mathtt{args}, \mathtt{dst}, \mathtt{pc})\rrbracket & & \\
506& \ldTo^{\text{external}} & & \rdTo^{\text{internal}} & \\
507\skull & & & & \mathtt{regs} = [\mathtt{params}/-] \\
508& & & & \mathtt{sp} = \mathtt{ALLOC} \\
509& & & & \mathtt{PUSH}(\mathtt{carry}, \mathtt{regs}, \mathtt{dst}, \mathtt{return\_addr}), \mathtt{pc}_{0}, \mathtt{regs}, \mathtt{sp} \\
510\end{diagram}
511\end{displaymath}
512
513\begin{align*}
514\llbracket \mathtt{RETURN} \rrbracket \\
515\mathtt{return\_addr} & := \mathtt{top}(\mathtt{stack}) \\
516v*                    & := m(\mathtt{rv\_regs}) \\
517\mathtt{dst}, \mathtt{sp}, \mathtt{carry}, \mathtt{regs} & := \mathtt{pop} \\
518\mathtt{regs}[v* / \mathtt{dst}] \\
519\end{align*}
520
521\begin{displaymath}
522\begin{diagram}
523s    & \rTo^1 & s' & \rTo^1 & s'' \\
524\dTo &        &    & \rdTo  & \dTo \\
525\llbracket s \rrbracket & \rTo(1,3)^1 & & & \llbracket s'' \rrbracket \\ 
526\mathtt{CALL} \\
527\end{diagram}
528\end{displaymath}
529
530\begin{displaymath}
531\begin{diagram}
532s    & \rTo^1 & s' & \rTo^1 & s'' \\
533\dTo &        &    & \rdTo  & \dTo \\
534\  & \rTo(1,3) & & & \ \\
535\mathtt{RETURN} \\
536\end{diagram}
537\end{displaymath}
538
539\begin{displaymath}
540\mathtt{b\_graph\_translate}: (\mathtt{label} \rightarrow \mathtt{blist'})
541\rightarrow \mathtt{graph} \rightarrow \mathtt{graph}
542\end{displaymath}
543
544\begin{align*}
545\mathtt{theorem} &\ \mathtt{b\_graph\_translate\_ok}: \\
546& \forall  f.\forall G_{i}.\mathtt{let}\ G_{\sigma} := \mathtt{b\_graph\_translate}\ f\ G_{i}\ \mathtt{in} \\
547&       \forall l \in G_{i}.\mathtt{subgraph}\ (f\ l)\ l\ (\mathtt{next}\ l\ G_{i})\ G_{\sigma}
548\end{align*}
549
550\begin{align*}
551\mathtt{lemma} &\ \mathtt{execute\_1\_step\_ok}: \\
552&       \forall s.  \mathtt{let}\ s' := s\ \sigma\ \mathtt{in} \\
553&       \mathtt{let}\ l := pc\ s\ \mathtt{in} \\
554&       s \stackrel{1}{\rightarrow} s^{*} \Rightarrow \exists n. s' \stackrel{n}{\rightarrow} s'^{*} \wedge s'^{*} = s'\ \sigma
555\end{align*}
556
557\begin{align*}
558\mathrm{RTL\ status} & \ \ \mathrm{ERTL\ status} \\
559\mathtt{sp} & = \mathtt{spl} / \mathtt{sph} \\
560\mathtt{graph} &  \mathtt{graph} + \mathtt{prologue}(s) + \mathtt{epilogue}(s) \\
561& \mathrm{where}\ s = \mathrm{callee\ saved} + \nu \mathrm{RA} \\
562\end{align*}
563
564\begin{displaymath}
565\begin{diagram}
566\mathtt{CALL} & \rTo^1 & \mathtt{inside\ function} \\
567\dTo & & \dTo \\
568\underbrace{\ldots}_{\llbracket \mathtt{CALL} \rrbracket} & \rTo &
569\underbrace{\ldots}_{\mathtt{prologue}} \\
570\end{diagram}
571\end{displaymath}
572
573\begin{displaymath}
574\begin{diagram}
575\mathtt{RETURN} & \rTo^1 & \mathtt{.} \\
576\dTo & & \dTo \\
577\underbrace{\ldots}_{\mathtt{epilogue}} & \rTo &
578\underbrace{\ldots} \\
579\end{diagram}
580\end{displaymath}
581
582\begin{align*}
583\mathtt{prologue}(s) = & \mathtt{create\_new\_frame}; \\
584                       & \mathtt{pop\ ra}; \\
585                       & \mathtt{save\ callee\_saved}; \\
586                                                                                         & \mathtt{get\_params} \\
587                                                                                         & \ \ \mathtt{reg\_params}: \mathtt{move} \\
588                                                                                         & \ \ \mathtt{stack\_params}: \mathtt{push}/\mathtt{pop}/\mathtt{move} \\
589\end{align*}
590
591\begin{align*}
592\mathtt{epilogue}(s) = & \mathtt{save\ return\ to\ tmp\ real\ regs}; \\
593                                                                                         & \mathtt{restore\_registers}; \\
594                       & \mathtt{push\ ra}; \\
595                       & \mathtt{delete\_frame}; \\
596                       & \mathtt{save return} \\
597\end{align*}
598
599\begin{displaymath}
600\mathtt{CALL}\ id \mapsto \mathtt{set\_params};\ \mathtt{CALL}\ id;\ \mathtt{fetch\_result}
601\end{displaymath}
602
603\subsection{The ERTL to LTL translation}
604\label{subsect.ertl.ltl.translation}
605\newcommand{\declsf}[1]{\expandafter\newcommand\expandafter{\csname #1\endcsname}{\mathop{\mathsf{#1}}\nolimits}}
606\declsf{Livebefore}
607\declsf{Liveafter}
608\declsf{Defined}
609\declsf{Used}
610\declsf{Eliminable}
611\declsf{StatementSem}
612For the liveness analysis, we aim at a map
613$\ell \in \mathtt{label} \mapsto $ live registers at $\ell$.
614We define the following operators on ERTL statements.
615$$
616\begin{array}{lL>{(ex. $}L<{)$}}
617\Defined(s) & registers defined at $s$ & r_1\leftarrow r_2+r_3 \mapsto \{r_1,C\}, \mathtt{CALL}id\mapsto \text{caller-save}
618\\
619\Used(s) & registers used at $s$ & r_1\leftarrow r_2+r_3 \mapsto \{r_2,r_3\}, \mathtt{CALL}id\mapsto \text{parameters}
620\end{array}
621$$
622Given $LA:\mathtt{label}\to\mathtt{lattice}$ (where $\mathtt{lattice}$
623is the type of sets of registers\footnote{More precisely, it is thethe lattice
624of pairs of sets of pseudo-registers and sets of hardware registers,
625with pointwise operations.}, we also have have the following
626predicates:
627$$
628\begin{array}{lL}
629\Eliminable_{LA}(\ell) & iff $s(\ell)$ has side-effects only on $r\notin LA(\ell)$
630\\&
631(ex.\ $\ell : r_1\leftarrow r_2+r_3 \mapsto (\{r_1,C\}\cap LA(\ell)\neq\emptyset,
632  \mathtt{CALL}id\mapsto \text{never}$)
633\\
634\Livebefore_{LA}(\ell) &$:=
635  \begin{cases}
636    LA(\ell) &\text{if $\Eliminable_{LA}(\ell)$,}\\
637    (LA(\ell)\setminus \Defined(s(\ell)))\cup \Used(s(\ell) &\text{otherwise}.
638  \end{cases}$
639\end{array}
640$$
641In particular, $\Livebefore$ has type $(\mathtt{label}\to\mathtt{lattice})\to
642\mathtt{label}\to\mathtt{lattice}$.
643
644The equation on which we build the fixpoint is then
645$$\Liveafter(\ell) \doteq \bigcup_{\ell' >_1 \ell} \Livebefore_{\Liveafter}(\ell')$$
646where $\ell' >_1 \ell$ denotes that $\ell'$ is an immediate successor of $\ell$
647in the graph. We do not require the fixpoint to be the least one, so the hypothesis
648on $\Liveafter$ that we require is
649$$\Liveafter(\ell) \supseteq \bigcup_{\ell' >_1 \ell} \Livebefore(\ell')$$
650(for shortness we drop the subscript from $\Livebefore$).
651\subsection{The LTL to LIN translation}
652\label{subsect.ltl.lin.translation}
653
654We require a map, $\sigma$, from LTL statuses, where program counters are represented as labels in a graph data structure, to LIN statuses, where program counters are natural numbers:
655\begin{displaymath}
656\mathtt{pc : label} \stackrel{\sigma}{\longrightarrow} \mathbb{N}
657\end{displaymath}
658
659The LTL to LIN translation pass also linearises the graph data structure into a list of instructions.
660Pseudocode for the linearisation process is as follows:
661
662\begin{lstlisting}
663let rec linearise graph visited required generated todo :=
664  match todo with
665  | l::todo ->
666    if l $\in$ visited then
667      let generated := generated $\cup\ \{$ Goto l $\}$ in
668      let required := required $\cup$ l in
669        linearise graph visited required generated todo
670    else
671      -- Get the instruction at label `l' in the graph
672      let lookup := graph(l) in
673      let generated := generated $\cup\ \{$ lookup $\}$ in
674      -- Find the successor of the instruction at label `l' in the graph
675      let successor := succ(l, graph) in
676      let todo := successor::todo in
677        linearise graph visited required generated todo
678  | []      -> (required, generated)
679\end{lstlisting}
680
681It is easy to see that this linearisation process eventually terminates.
682In particular, the size of the visited label set is monotonically increasing, and is bounded above by the size of the graph that we are linearising.
683
684The initial call to \texttt{linearise} sees the \texttt{visited}, \texttt{required} and \texttt{generated} sets set to the empty set, and \texttt{todo} initialized with the singleton list consisting of the entry point of the graph.
685We envisage needing to prove the following invariants on the linearisation function above:
686
687\begin{enumerate}
688\item
689$\mathtt{visited} \approx \mathtt{generated}$, where $\approx$ is \emph{multiset} equality, as \texttt{generated} is a set of instructions where instructions may mention labels multiple times, and \texttt{visited} is a set of labels,
690\item
691$\forall \mathtt{l} \in \mathtt{generated}.\ \mathtt{succ(l,\ graph)} \subseteq \mathtt{required} \cup \mathtt{todo}$,
692\item
693$\mathtt{required} \subseteq \mathtt{visited}$,
694\item
695$\mathtt{visited} \cap \mathtt{todo} = \emptyset$.
696\end{enumerate}
697
698The invariants collectively imply the following properties, crucial to correctness, about the linearisation process:
699
700\begin{enumerate}
701\item
702Every graph node is visited at most once,
703\item
704Every instruction that is generated is generated due to some graph node being visited,
705\item
706The successor instruction of every instruction that has been visited already will eventually be visited too.
707\end{enumerate}
708
709Note, because the LTL to LIN transformation is the first time the program is linearised, we must discover a notion of `well formed program' suitable for linearised forms.
710In particular, we see the notion of well formedness (yet to be formally defined) resting on the following conditions:
711
712\begin{enumerate}
713\item
714For every jump to a label in a linearised program, the target label exists at some point in the program,
715\item
716Each label is unique, appearing only once in the program,
717\item
718The final instruction of a program must be a return.
719\end{enumerate}
720
721We assume that these properties will be easy consequences of the invariants on the linearisation function defined above.
722
723The final condition above is potentially a little opaque, so we explain further.
724First, the only instructions that can reasonably appear in final position at the end of a program are returns or backward jumps, as any other instruction would cause execution to `fall out' of the end of the program (for example, when a function invoked with \texttt{CALL} returns, it returns to the next instruction past the \texttt{CALL} that invoked it).
725However, in LIN, though each function's graph has been linearised, the entire program is not yet fully linearised into a list of instructions, but rather, a list of `functions', each consisting of a linearised body along with other data.
726Each well-formed function must end with a call to \texttt{RET}, and therefore the only correct instruction that can terminate a LIN program is a \texttt{RET} instruction.
727
728\subsection{The LIN to ASM and ASM to MCS-51 machine code translations}
729\label{subsect.lin.asm.translation}
730
731The LIN to ASM translation step is trivial, being almost the identity function.
732The only non-trivial feature of the LIN to ASM translation is that all labels are `named apart' so that there is no chance of freshly generated labels from different namespaces clashing with labels from another namespace.
733
734The ASM to MCS-51 machine code translation step, and the required statements of correctness, are found in an unpublished manuscript attached to this document.
735
736\section{Estimated effort}
737Based on the rough analysis performed so far we can estimate the total
738effort for the certification of the compiler. We obtain this estimation by
739combining, for each pass: 1) the number of lines of code to be certified;
7402) the ratio of number of lines of proof to number of lines of code from
741the CompCert project~\cite{compcert} for the CompCert pass that is closest to
742ours; 3) an estimation of the complexity of the pass according to the
743analysis above.
744
745\begin{tabular}{lrlll}
746Pass origin & Code lines & CompCert ratio & Estimated effort & Estimated effort \\
747            &            &                & (based on CompCert) & \\
748\hline
749Common & 12759 & 4.25 \permil & 54.22 \\
750Cminor &  1057 & 5.23 \permil & 5.53 \\
751Clight &  1856 & 5.23 \permil & 9.71 \\ 
752RTLabs &  1252 & 1.17 \permil & 1.48 \\
753RTL    &   469 & 4.17 \permil & 1.95 \\
754ERTL   &   789 & 3.01 \permil & 2.38 \\
755LTL    &    92 & 5.94 \permil & 0.55 \\
756LIN    &   354 & 6.54 \permil & 2.31 \\
757ASM    &   984 & 4.80 \permil & 4.72 \\
758\hline
759Total common    & 12759 & 4.25 \permil & 54.22 \\
760Total front-end &  2913 & 5.23 \permil & 15.24 \\
761Total back-end  &  6853 & 4.17 \permil & 13.39 \\
762\hline
763Total           & 22525 & 3.75 \permil & 84.85 \\
764\end{tabular}
765
766We provide now some additional informations on the methodology used in the
767computation. The passes in Cerco and CompCert front-end closely match each
768other. However, there is no clear correspondence between the two back-ends.
769For instance, we enforce the calling convention immediately after instruction
770selection, whereas in CompCert this is performed in a later phase. Or we
771linearize the code at the very end, whereas CompCert performs linearization
772as soon as possible. Therefore, the first part of the exercise has consisted
773in shuffling and partitioning the CompCert code in order to assign to each
774CerCo pass the CompCert code that performs the same transformation.
775
776After this preliminary step, using the data given in~\cite{compcert} (which
777are relative to an early version of CompCert) we computed the ratio between
778men months and lines of code in CompCert for each CerCo pass. This is shown
779in the third column of Table~\ref{wildguess}. For those CerCo passes that
780have no correspondence in CompCert (like the optimizing assembler) or where
781we have insufficient data, we have used the average of the ratios computed
782above.
783
784The first column of the table shows the number of lines of code for each
785pass in CerCo. The third column is obtained multiplying the first with the
786CompCert ratio. It provides an estimate of the effort required (in men months)
787if the complexity of the proofs for CerCo and Compcert would be the same.
788
789The two proof styles, however, are on purpose completely different. Where
790CompCert uses non executable semantics, describing the various semantics with
791inductive types, we have preferred executable semantics. Therefore, CompCert
792proofs by induction and inversion become proof by functional inversion,
793performed using the Russel methodology (now called Program in Coq, but whose
794behaviour differs from Matita's one). Moreover, CompCert code is written using
795only types that belong to the Hindley-Milner fragment, whereas we have
796heavily exploited dependent types all over the code. The dependent type
797discipline offers many advantages from the point of view of clarity of the
798invariants involved and early detection of errors and it naturally combines
799well with the Russel approach which is based on dependent types. However, it
800is also well known to introduce technical problems all over the code, like
801the need to explicitly prove type equalities to be able to manipulate
802expressions in certain ways. In many situations, the difficulties encountered
803with manipulating dependent types are better addressed by improving the Matita
804system, according to the formalization driven system development. For this
805reason, and assuming a pessimistic point of view on our performance, the
806fourth columns presents the final estimation of the effort required, that also
807takes in account the complexity of the proof suggested by the informal proofs
808sketched in the previous section.
809
810\end{document}
Note: See TracBrowser for help on using the repository browser.