\documentclass[11pt,a4paper]{article}
\usepackage{a4wide}
\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{array}
\usepackage[english]{babel}
\usepackage{../../style/cerco}
\usepackage{color}
\usepackage{diagrams}
\usepackage{graphicx}
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage{microtype}
\usepackage{skull}
\usepackage{stmaryrd}
\usepackage{wasysym}

\newcolumntype{b}{\at{}>{{}}}
\newcolumntype{B}{\at{}>{{}}c<{{}}\at{}}
\newcolumntype{h}[1]{\at{\hspace{#1}}}
\newcolumntype{L}{>{$}l<{$}}
\newcolumntype{C}{>{$}c<{$}}
\newcolumntype{R}{>{$}r<{$}}
\newcolumntype{S}{>{$(}r<{)$}}
\newcolumntype{n}{\at{}}

\lstdefinelanguage{matita-ocaml}
  {keywords={definition,coercion,lemma,theorem,remark,inductive,record,qed,let,in,rec,match,return,with,Type,try,on,to},
   morekeywords={[2]whd,normalize,elim,cases,destruct},
   morekeywords={[3]type,of,val,assert,let,function},
   mathescape=true,
  }
\lstset{language=matita-ocaml,basicstyle=\tt,columns=flexible,breaklines=false,
        keywordstyle=\bfseries, %\color{red}\bfseries,
        keywordstyle=[2]\bfseries, %\color{blue},
        keywordstyle=[3]\bfseries, %\color{blue}\bfseries,
%        commentstyle=\color{green},
%        stringstyle=\color{blue},
        showspaces=false,showstringspaces=false}
\DeclareUnicodeCharacter{8797}{:=}
\DeclareUnicodeCharacter{10746}{++}
\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}

\lstset{
  extendedchars=false,
  inputencoding=utf8x,
  tabsize=1
}

\hypersetup{bookmarksopenlevel=2}

\title{
INFORMATION AND COMMUNICATION TECHNOLOGIES\\
(ICT)\\
PROGRAMME\\
\vspace*{1cm}Project FP7-ICT-2009-C-243881 {\cerco}}

\date{ }
\author{}

\begin{document}
\thispagestyle{empty}

\vspace*{-1cm}
\begin{center}
\includegraphics[width=0.6\textwidth]{../../style/cerco_logo.png}
\end{center}

\begin{minipage}{\textwidth}
\maketitle
\end{minipage}


\vspace*{0.5cm}
\begin{center}
\begin{LARGE}
\bf
Proof outline for the correctness of the CerCo compiler
\end{LARGE}
\end{center}

\vspace*{2cm}
\begin{center}
\begin{large}
Version 1.0
\end{large}
\end{center}

\vspace*{0.5cm}
\begin{center}
\begin{large}
Main Authors:\\
B. Campbell, D. Mulligan, P. Tranquilli, C. Sacerdoti Coen
\end{large}
\end{center}

\vspace*{\fill}
\noindent
Project Acronym: {\cerco}\\
Project full title: Certified Complexity\\
Proposal/Contract no.: FP7-ICT-2009-C-243881 {\cerco}\\

\clearpage \pagestyle{myheadings} \markright{{\cerco}, FP7-ICT-2009-C-243881}

\tableofcontents

\section{Introduction}
\label{sect.introduction}

At the last project review of the CerCo project, the reviewers
recommended that we quickly outline a paper-and-pencil correctness proof
for each of the stages of the CerCo compiler, in order to allow an
estimation of the complexity and time required to complete the formalization
of the proof. This has been possible since month 18, when we completed
the formalization in Matita of the data structures and code of
the compiler.

In this document we provide a very high-level, pen-and-paper
sketch of what we view as the best path to completing the correctness proof
for the compiler. In particular, for every translation between two intermediate languages, in both the front- and back-ends, we identify the key translation steps and the invariants that we view as important for the correctness proof. We sketch the overall correctness results and also briefly describe the parts of the proof that have already
been completed by the end of the First Period.

In the last section we present an estimate of the effort required
to certify the compiler in Matita, and we draw conclusions.

\section{Front-end: Clight to RTLabs}

The front-end of the CerCo compiler consists of several stages:

\begin{center}
\begin{minipage}{.8\linewidth}
\begin{tabbing}
\quad \= $\downarrow$ \quad \= \kill
\textsf{Clight}\\
\> $\downarrow$ \> cast removal\\
\> $\downarrow$ \> add runtime functions\footnote{Following the last project
meeting we intend to move this transformation to the back-end.}\\
\> $\downarrow$ \> cost labelling\\
\> $\downarrow$ \> loop optimizations\footnote{\label{lab:opt2}To be ported from the untrusted compiler and certified only in case of early completion of the certification of the other passes.} (an endo-transformation)\\
\> $\downarrow$ \> partial redundancy elimination$^{\mbox{\scriptsize \ref{lab:opt2}}}$ (an endo-transformation)\\
\> $\downarrow$ \> stack variable allocation and control structure
 simplification\\
\textsf{Cminor}\\
\> $\downarrow$ \> generate global variable initialisation code\\
\> $\downarrow$ \> transform to RTL graph\\
\textsf{RTLabs}\\
\> $\downarrow$ \> \\
\>\,\vdots
\end{tabbing}
\end{minipage}
\end{center}

Here, by `endo-transformation', we mean a mapping from a language back to itself:
the loop optimization step maps the Clight language to itself.

%Our overall statements of correctness with respect to costs will
%require a correctly labelled program
There are three layers in most of the proofs proposed:
\begin{enumerate}
\item invariants closely tied to the syntax and transformations using
  dependent types (such as the presence of variable names in environments),
\item a forward simulation proof relating each small-step of the
  source to zero or more steps of the target, and
\item proofs about syntactic properties of the cost labelling.
\end{enumerate}
The first will support both functional correctness and the totality of
some of the compiler stages (that is, that those stages of the compiler
cannot fail).  The second provides the main
functional correctness result, including the preservation of cost
labels in the traces, and the last will be crucial for applying
correctness results about the costings from the back-end, by showing
that cost labels appear in enough places for us to assign all of the
execution costs to them.

We will also prove that a suitably labelled RTLabs trace can be turned
into a \emph{structured trace} which splits the execution trace into
cost-label to cost-label chunks with nested function calls.  This
structure was identified during work on the correctness of the
back-end cost analysis as retaining important information about the
structure of the execution that is difficult to reconstruct later in
the compiler.

\subsection{Clight cast removal}

This transformation removes some casts inserted by the parser to make
arithmetic promotion explicit but which are superfluous (such as
\lstinline[language=C]'c = (short)((int)a + (int)b);' where
\lstinline'a' and \lstinline'b' are \lstinline[language=C]'short').
This is necessary for producing good code for our target architecture.

It only affects Clight expressions, recursively detecting casts that
can be safely eliminated.  The semantics provides a big-step
definition for expression evaluation, so we should be able to show a lock-step
forward simulation between otherwise identical states using a lemma
showing that cast elimination does not change the evaluation of
expressions.  This lemma will follow from a structural induction on
the source expression.  We have already proved a few of the underlying
arithmetic results necessary to validate the approach.
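
The following is a minimal OCaml sketch of the idea on a toy expression
type; the constructors and the bit-size bookkeeping are hypothetical
simplifications of the actual Clight syntax, not the formalized
definitions.

\begin{lstlisting}
(* A toy fragment of Clight expressions: sized integer constants and
   variables, casts, and addition. *)
type expr =
  | Const of int * int        (* bit size, value *)
  | Var   of int * string     (* bit size, name *)
  | Cast  of int * expr       (* cast to the given bit size *)
  | Add   of expr * expr

let rec size_of = function
  | Const (s, _) | Var (s, _) | Cast (s, _) -> s
  | Add (e, _) -> size_of e   (* operands share the promoted size *)

(* Remove inner promotions when an outer cast truncates the result back
   to the operands' original size, as in (short)((int)a + (int)b): the
   low bits of a sum depend only on the low bits of its operands. *)
let rec simplify = function
  | Cast (s, Add (Cast (_, a), Cast (_, b)))
    when size_of a = s && size_of b = s ->
      Add (simplify a, simplify b)
  | Cast (s, e) -> Cast (s, simplify e)
  | Add (a, b) -> Add (simplify a, simplify b)
  | (Const _ | Var _) as e -> e
\end{lstlisting}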

\subsection{Clight cost labelling}

This adds cost labels before and after selected statements and
expressions, and the execution traces ought to be equivalent modulo
the new cost labels.  Hence it requires a simple forward simulation
with a limited amount of stuttering wherever a new cost label is
introduced.  A bound can be given for the amount of stuttering allowed
based on the statement or continuation to be evaluated next.

We also intend to show three syntactic properties about the cost
labelling:
\begin{enumerate}
\item every function starts with a cost label,
\item every branching instruction is followed by a cost label (note that
  exiting a loop is treated as a branch), and
\item the head of every loop (and any \lstinline'goto' destination) is
  a cost label.
\end{enumerate}
These can be shown by structural induction on the source term.
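
As an illustration, here is a much simplified OCaml sketch of such a
syntactic check on a toy statement type, covering properties 2 and 3
for conditionals and loop heads; the real check is over full Clight
and must also treat loop exits and \lstinline'goto' destinations.

\begin{lstlisting}
type stmt =
  | Cost  of string * stmt         (* cost label guarding a statement *)
  | Seq   of stmt * stmt
  | If    of string * stmt * stmt  (* abstracted condition, branches *)
  | While of string * stmt         (* abstracted condition, body *)
  | Skip

let starts_with_cost = function Cost _ -> true | _ -> false

(* Both branches of a conditional and the head of every loop must be
   cost labels; we check this recursively over the whole statement. *)
let rec properly_labelled = function
  | Cost (_, s) -> properly_labelled s
  | Seq (s1, s2) -> properly_labelled s1 && properly_labelled s2
  | If (_, s1, s2) ->
      starts_with_cost s1 && starts_with_cost s2
      && properly_labelled s1 && properly_labelled s2
  | While (_, s) -> starts_with_cost s && properly_labelled s
  | Skip -> true
\end{lstlisting}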

\subsection{Clight to Cminor translation}

This translation is the first to introduce some invariants, with the
proofs closely tied to the implementation by dependent typing.  These
are largely complete and show that the generated code enjoys:
\begin{itemize}
\item some minimal type safety, shown by explicit checks on the
  Cminor types during the transformation (a little more work remains
  to be done here, but follows the same form);
\item that variables named in the parameter and local variable
  environments are distinct from one another, again by an explicit
  check;
\item that variables used in the generated code are present in the
  resulting environment (either by checking their presence in the
  source environment, or from a list of freshly generated temporary variables);
  and
\item that all \lstinline[language=C]'goto' labels are present (by
  checking them against a list of source labels and proving that all
  source labels are preserved).
\end{itemize}

The simulation will be similar to the relevant stages of CompCert
(Clight to Csharpminor and Csharpminor to Cminor; in the event that
the direct proof is unwieldy we could introduce an intermediate
language corresponding to Csharpminor).  During early experimentation
with porting CompCert definitions to the Matita proof assistant we
found little difficulty reproving the results for the memory model, so
we plan to port the memory injection properties and use them to relate
Clight in-memory variables with either the value of the local variable or a
stack slot, depending on how each variable was classified.

This should be sufficient to show the equivalence of (big-step)
expression evaluation.  The simulation can then be shown by relating
corresponding blocks of statements and continuations with their Cminor
counterparts and proving that a few steps reach the next matching
state.

The syntactic properties required for cost labels remain similar and a
structural induction on the function bodies should be sufficient to
show that they are preserved.

\subsection{Cminor global initialisation code}

This short phase replaces the global variable initialisation data with
code that executes when the program starts.  Each piece of
initialisation data in the source is matched by a new statement
storing that data.  As each global variable is allocated a distinct
memory block, the program state after the initialisation statements
will be the same as the original program's state at the start of
execution, and will proceed in the same manner afterwards.
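
As a concrete illustration, here is a small OCaml sketch of turning
initialisation data into store statements; the data and statement
constructors are hypothetical stand-ins for the formalized Cminor
syntax.

\begin{lstlisting}
type init_datum = Int8 of int | Int16 of int | Space of int

type stmt =
  | StoreInt8  of string * int * int   (* variable, offset, value *)
  | StoreInt16 of string * int * int

(* One store statement per initialisation datum, at increasing offsets
   inside the variable's own memory block. *)
let statements_of_global (x, data) =
  let rec go off = function
    | [] -> []
    | Int8 n :: rest -> StoreInt8 (x, off, n) :: go (off + 1) rest
    | Int16 n :: rest -> StoreInt16 (x, off, n) :: go (off + 2) rest
    | Space k :: rest -> go (off + k) rest     (* uninitialised padding *)
  in
  go 0 data
\end{lstlisting}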

% Actually, the above is wrong...
% ... this ought to be in a fresh main function with a fresh cost label

\subsection{Cminor to RTLabs translation}

In this part of the compiler we transform the program's functions into
control flow graphs.  It is closely related to CompCert's Cminorsel to
RTL transformation, albeit with target-independent operations.

We already enforce several invariants with dependent types: some type
safety, mostly shown using the type information from Cminor; and
that the graph is closed (by showing that each successor was recently
added, or corresponds to a \lstinline[language=C]'goto' label, all of
which are added before the end).  Note that this relies on a
monotonicity property; CompCert maintains a similar property in a
similar way while building RTL graphs.  We will also add a result
showing that all of the pseudo-register names are distinct for use by
later stages, using the same method as for Cminor.

The simulation will relate Cminor states to RTLabs states which are about to
execute the code corresponding to the Cminor statement or continuation.
Each Cminor statement becomes zero or more RTLabs statements, with a
decreasing measure based on the statement and continuations similar to
CompCert's.  We may also follow CompCert in using a relational
specification of this stage so as to abstract away from the functional
(and highly dependently typed) definition.

The first two labelling properties remain as before; we will show that
cost labels are preserved, so the function entry point will be a cost
label, and successors to any statement that are cost labels still
map to cost labels, preserving the condition on branches.  We replace
the property for loops with the notion that we will always reach a
cost label or the end of the function after following a bounded number of
successors.  This can be easily seen in Cminor using the requirement
for cost labels at the head of loops and after gotos.  It remains to
show that this is preserved by the translation to RTLabs.  % how?

\subsection{RTLabs structured trace generation}

This proof-only step incorporates the function call structure and cost
labelling properties into the execution trace.  As the function calls
are nested within the trace, we need to distinguish between
terminating and non-terminating function calls, and thus we use the
excluded middle (specialised to a function termination property) to do
so.

Structured traces for terminating functions are built by following the
flat trace, breaking it into chunks between cost labels and
recursively processing function calls.  The main difficulties here are
the non-structurally recursive nature of the function (instead we use
the size of the termination proof as a measure) and using the RTLabs
cost labelling properties to show that the constraints of the
structured traces are observed.  We also show that the lower stack
frames are preserved during function calls in order to prove that
after returning from a function call we resume execution of the
correct code.  This part of the work has already been constructed, but
still requires a simple proof to show that flattening the structured
trace recreates the original flat trace.

The non-terminating case follows the trace like the terminating
version to build up chunks of trace from cost label to cost label
(which, by the finite distance to a cost label property shown before,
can be represented by an inductive type).  These chunks are chained
together in a coinductive data structure that can represent
non-terminating traces.  The excluded middle is used to decide whether
function calls terminate, in which case the function described above
constructs an inductive terminating structured trace which is nested
in the caller's trace.  Otherwise, another coinductive constructor is
used to embed the non-terminating trace of the callee, generated by
corecursion.  This part of the trace transformation is currently under
construction, and will also need a flattening result to show that it
is correct.
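
To fix intuitions, the shape of these trace structures can be sketched
in OCaml as follows; the constructors are hypothetical, and OCaml's
\texttt{Lazy.t} merely stands in for the genuine coinductive type used
in the Matita formalization.

\begin{lstlisting}
type step = Instr of string | CostLabel of string

(* Inductive structured traces for terminating executions: chunks of
   steps from one cost label to the next, with nested calls. *)
type fin_trace =
  | Chunk of step list * fin_trace  (* steps up to the next cost label *)
  | FCall of fin_trace * fin_trace  (* terminating callee, continuation *)
  | End

(* Possibly non-terminating traces: each chunk is inductive, but the
   tail is produced lazily/coinductively. *)
type inf_trace =
  | IChunk   of step list * inf_trace Lazy.t
  | ICallFin of fin_trace * inf_trace Lazy.t  (* callee terminates *)
  | ICallInf of inf_trace Lazy.t              (* callee diverges *)
\end{lstlisting}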


\section{Backend: RTLabs to machine code}
\label{sect.backend.rtlabs.machine.code}

The compiler backend consists of the following intermediate languages, and stages of translation:

\begin{center}
\begin{minipage}{.8\linewidth}
\begin{tabbing}
\quad \=\,\vdots\= \\
\> $\downarrow$ \>\\
\> $\downarrow$ \quad \= \kill
\textsf{RTLabs}\\
\> $\downarrow$ \> copy propagation\footnote{\label{lab:opt}To be ported from the untrusted compiler and certified only in case of early completion of the certification of the other passes.} (an endo-transformation) \\
\> $\downarrow$ \> instruction selection\\
\> $\downarrow$ \> change of memory models in compiler\\
\textsf{RTL}\\
\> $\downarrow$ \> constant propagation$^{\mbox{\scriptsize \ref{lab:opt}}}$ (an endo-transformation) \\
\> $\downarrow$ \> calling convention made explicit \\
\> $\downarrow$ \> layout of activation records \\
\textsf{ERTL}\\
\> $\downarrow$ \> register allocation and spilling\\
\> $\downarrow$ \> dead code elimination\\
\textsf{LTL}\\
\> $\downarrow$ \> function linearisation\\
\> $\downarrow$ \> branch compression (an endo-transformation) \\
\textsf{LIN}\\
\> $\downarrow$ \> relabeling\\
\textsf{ASM}\\
\> $\downarrow$ \> pseudoinstruction expansion\\
\textsf{MCS-51 machine code}\\
\end{tabbing}
\end{minipage}
\end{center}

\subsection{Graph translations}
RTLabs and most intermediate languages in the back-end have a graph
representation:
the code of each function is represented by a graph of instructions.
The graph maps a set of labels (the names of the nodes) to the instruction
stored at each label (the nodes of the graph).
Instructions reference zero or more additional labels that are the immediate
successors of the instruction: zero for a return from a function; more than one
for conditional jumps and calls; one in all other cases. The references
from one instruction to its immediate successors are the arcs of the graph.

States of graph languages always include a program counter that holds a
representation of a reference to the current instruction.

A translation between two consecutive graph languages maps each instruction
stored at location $l$ in the first graph and with immediate successors
$\{l_1,\ldots,l_n\}$ to a subgraph of the output graph that has a single
entry point at location $l$ and exit arcs to $\{l_1,\ldots,l_n\}$. Moreover,
the labels of all non-entry nodes in the subgraph are distinct from all the
labels in the source graph.

In order to simplify the translations and the corresponding proofs of forward
simulation, after the release of D4.2 and D4.3 we have provided:
\begin{itemize}
 \item a new data type (called \texttt{blist}) that represents a
   sequence of instructions to be added to the output graph.
   The ``b'' in the name stands for binder, since a \texttt{blist} is
   either empty, an extension of a \texttt{blist} with an instruction
   at the front, or the generation of a fresh quantity followed by a
   \texttt{blist}. The latter feature is used, for instance, to generate
   fresh register names. The instructions in the list are unlabelled and
   all of them but the last one are also sequential, like in a linear program.
 \item a new iterator (called \texttt{b\_graph\_translate}) of type
\begin{displaymath}
\mathtt{b\_graph\_translate}: (\mathtt{label} \rightarrow \mathtt{blist})
\rightarrow \mathtt{graph} \rightarrow \mathtt{graph}
\end{displaymath}
   The iterator transforms the input graph into the output graph by replacing
   each node with the subgraph that corresponds to the linear \texttt{blist}
   obtained by applying the input function to the node's label.
\end{itemize}

Using the iterator above, the code can be written in such a way that
the programmer does not see any distinction between writing a transformation
on linear or graph languages.
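
A minimal OCaml sketch of the \texttt{blist} idea follows; the concrete
instruction, register, label and graph types are placeholders for the
actual Matita definitions.

\begin{lstlisting}
type instr      (* abstract: unlabelled, sequential instructions *)
type register   (* abstract: pseudo-register names *)

(* A blist is either empty, an instruction followed by a blist, or a
   binder generating a fresh register that the rest may mention. *)
type blist =
  | Nil
  | Cons  of instr * blist
  | Fresh of (register -> blist)

(* The iterator then has, roughly, the type
     b_graph_translate : (label -> blist) -> graph -> graph
   replacing every node by the subgraph denoted by its blist. *)
\end{lstlisting}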

In order to prove simulations for translations obtained using the iterator,
we will prove the following theorem:

\begin{align*}
\mathtt{theorem} &\ \mathtt{b\_graph\_translate\_ok}: \\
& \forall  f.\forall G_{i}.\mathtt{let}\ G_{\sigma} := \mathtt{b\_graph\_translate}\ f\ G_{i}\ \mathtt{in} \\
&       \forall l \in G_{i}.\mathtt{subgraph}\ (f\ l)\ l\ (\mathtt{next}\ l\ G_i)\ G_{\sigma}
\end{align*}

Here \texttt{subgraph} is a computational predicate that, given a \texttt{blist}
$[i_1, \ldots, i_n]$, an entry label $l$, an exit label $l'$ and a graph $G$,
expands to the fact that fetching from $G$ at address $l$ retrieves a node
$i_1$ with a successor $l_1$ that, when fetched, yields a node $i_2$ with a
successor $l_2$, and so on. The successor of $i_n$ is $l'$.

Using the theorem above, proving a forward simulation diagram of the
following kind is now as simple as doing the same using standard small-step
operational semantics over linear languages.

\begin{align*}
\mathtt{lemma} &\ \mathtt{execute\_1\_step\_ok}: \\
&       \forall s.  \mathtt{let}\ s' := s\ \sigma\ \mathtt{in} \\
&       \mathtt{let}\ l := pc\ s\ \mathtt{in} \\
&       s \stackrel{1}{\rightarrow} s^{*} \Rightarrow \exists n. s' \stackrel{n}{\rightarrow} s'^{*} \wedge s'^{*} = s^{*}\ \sigma
\end{align*}

Because graph translation preserves the entry and exit labels of
translated statements, the state translation function $\sigma$ simply
preserves the value of the program counter. The program code, which is
part of the state, is translated using the iterator.

The proof is then roughly as follows. Let $l$ be the program counter of the
input state $s$. We proceed by cases on the current instruction of $s$.
Let $[i_1, \ldots, i_n]$ be the \texttt{blist} associated to $l$ and $s$
by the translation function. The witness required for the existential
statement is simply $n$. By applying the theorem above we know that the
next $n$ instructions that will be fetched from $s\ \sigma$ will be
$[i_1, \ldots, i_n]$, and it is now sufficient to prove that they simulate
the original instruction.

\subsection{The RTLabs to RTL translation}
\label{subsect.rtlabs.rtl.translation}

\subsubsection*{Translation of values and memory}

The RTLabs to RTL translation pass marks the frontier between the two memory models used in the CerCo project.
As a result, we require some method of translating between the values that the two memory models permit.
Suppose we have such a translation, $\sigma$.
Then the translation between values of the two memory models may be pictured with:

\begin{displaymath}
\mathtt{Value} ::= \bot \mid \mathtt{int(size)} \mid \mathtt{float} \mid \mathtt{null} \mid \mathtt{ptr} \quad\stackrel{\sigma}{\longrightarrow}\quad \mathtt{BEValue} ::= \bot \mid \mathtt{byte} \mid \mathtt{null}_i \mid \mathtt{ptr}_i
\end{displaymath}

In the front-end, we have both integer and float values, where integer values are `sized', along with null values and pointers. Some front-end values are
representable in a single byte, but others require more bits.

In the back-end model all values are meant to be represented in a single byte.
Values can therefore be undefined, be one-byte integers, or be indexed
fragments of a pointer, null or not. Float values are no longer present, as floating point arithmetic is not supported by the CerCo compiler.

The $\sigma$ map implements a one-to-many relation: a single front-end value
is mapped to a sequence of back-end values when its size is more than one byte.

We further require a map, $\sigma$, between the front-end \texttt{Memory} and the back-end's notion of \texttt{BEMemory}. Both kinds of memory can be
thought of as instances of a generic \texttt{Mem} data type, parameterized over
the kind of values stored in memory.

\begin{displaymath}
\mathtt{Mem}\ \alpha = \mathtt{Block} \rightarrow (\mathbb{Z} \rightarrow \alpha)
\end{displaymath}

Here, \texttt{Block} consists of a \texttt{Region} paired with an identifier.

\begin{displaymath}
\mathtt{Block} ::= \mathtt{Region} \times \mathtt{ID}
\end{displaymath}

We now have what we need to define what is meant by `memory' in the back-end memory model.
Namely, we instantiate the previously defined \texttt{Mem} type with the type of back-end memory values.

\begin{displaymath}
\mathtt{BEMem} = \mathtt{Mem}~\mathtt{BEValue}
\end{displaymath}

Memory addresses consist of a pair of back-end memory values:

\begin{displaymath}
\mathtt{Address} = \mathtt{BEValue} \times  \mathtt{BEValue}
\end{displaymath}

The back- and front-end memory models differ in how they represent sized integer values in memory.
In particular, the front-end stores integer values as a header, with size information, followed by a string of `continuation' blocks, marking out the full representation of the value in memory.
In contrast, the layout of sized integer values in the back-end memory model consists of a series of byte-sized `chunks':

\begin{center}
\begin{picture}(0, 25)
\put(-125,0){\framebox(25,25)[c]{\texttt{v,4}}}
\put(-100,0){\framebox(25,25)[c]{\texttt{cont}}}
\put(-75,0){\framebox(25,25)[c]{\texttt{cont}}}
\put(-50,0){\framebox(25,25)[c]{\texttt{cont}}}
\put(-15,10){\vector(1, 0){30}}
\put(25,0){\framebox(25,25)[c]{\texttt{v$_1$}}}
\put(50,0){\framebox(25,25)[c]{\texttt{v$_2$}}}
\put(75,0){\framebox(25,25)[c]{\texttt{v$_3$}}}
\put(100,0){\framebox(25,25)[c]{\texttt{v$_4$}}}
\end{picture}
\end{center}

Chunks for pointers are pairs made of the original pointer and the index of the chunk.
Therefore, when assembling the chunks together, we can always recognize whether all chunks refer to the same value or whether the operation is meaningless.
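
A small OCaml sketch of back-end values and of chunk reassembly may help;
the pointer type and the constructor names are hypothetical stand-ins for
the formalized definitions.

\begin{lstlisting}
type pointer   (* abstract: the front-end pointer type *)

(* Back-end values: one byte each, with pointer chunks remembering
   which slice of the original pointer they carry. *)
type bevalue =
  | BVundef
  | BVbyte of int             (* an 8-bit integer *)
  | BVnull of int             (* i-th chunk of a null pointer *)
  | BVptr  of pointer * int   (* i-th chunk of a pointer *)

(* Reassembly succeeds only when the chunks are consecutive fragments
   of one and the same pointer; otherwise the operation is meaningless. *)
let reassemble = function
  | BVptr (p, 0) :: rest ->
      let rec consecutive i = function
        | [] -> true
        | BVptr (q, j) :: rest' -> q == p && j = i && consecutive (i + 1) rest'
        | _ -> false
      in
      if consecutive 1 rest then Some p else None
  | _ -> None
\end{lstlisting}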

The differing memory representations of values in the two memory models imply the need for a series of lemmas on the actions of \texttt{load} and \texttt{store} to ensure correctness.
The first lemma required has the following statement:
\begin{displaymath}
\mathtt{load}\ s\ a\ M = \mathtt{Some}\ v \rightarrow \forall i \leq s.\ \mathtt{load}\ s\ (a + i)\ \sigma(M) = \mathtt{Some}\ v_i
\end{displaymath}
That is, if we are successful in reading a value of size $s$ from memory at address $a$ in front-end memory, then we should successfully be able to read all of its chunks from the back-end memory at the appropriate addresses (from address $a$ up to and including address $a + i$, for each $i \leq s$).

Next, we must show that \texttt{store} properly commutes with the $\sigma$-map between memory spaces:
\begin{displaymath}
\sigma(\mathtt{store}\ a\ v\ M) = \mathtt{store}\ \sigma(a)\ \sigma(v)\ \sigma(M)
\end{displaymath}
That is, if we store a value \texttt{v} in the front-end memory \texttt{M} at address \texttt{a} and transform the resulting memory with $\sigma$, then this is equivalent to storing a transformed value $\mathtt{\sigma(v)}$ at address $\mathtt{\sigma(a)}$ into the back-end memory $\mathtt{\sigma(M)}$.

Finally, the commutation properties between \texttt{load} and \texttt{store} are weakened in the $\sigma$-image of the memory.
Writing \texttt{load}$^*$ for the multiple consecutive iterations of \texttt{load} used to fetch all chunks of a value, we must prove that, when $a \neq a'$:
\begin{displaymath}
\mathtt{load}^*\ \sigma(a)\ (\mathtt{store}\ \sigma(a')\ \sigma(v)\ \sigma(M)) = \mathtt{load}^*\ \sigma(a)\ \sigma(M)
\end{displaymath}
That is, suppose we store a transformed value $\mathtt{\sigma(v)}$ into a back-end memory $\mathtt{\sigma(M)}$ at address $\mathtt{\sigma(a')}$, using \texttt{store}, and then load from the address $\sigma(a)$. Even if $a$ and $a'$ are
distinct by hypothesis, there is a priori no guarantee that the consecutive
bytes for the value stored at $\sigma(a)$ are disjoint from those for the
values stored at $\sigma(a')$. The fact that this holds is a non-trivial
property of $\sigma$ that must be proved.

\subsubsection*{Translation of RTLabs states}
RTLabs states come in three flavours:
\begin{displaymath}
\begin{array}{rll}
\mathtt{State} & ::=  & (\mathtt{State} : \mathtt{Frame}^* \times \mathtt{Frame} \\
               & \mid & \mathtt{Call} : \mathtt{Frame}^* \times \mathtt{Args} \times \mathtt{Return} \times \mathtt{Fun} \\
               & \mid & \mathtt{Return} : \mathtt{Frame}^* \times \mathtt{Value} \times \mathtt{Return}) \times \mathtt{Mem}
\end{array}
\end{displaymath}
\texttt{State} is the default state, in which RTLabs programs spend almost all of their execution.
The \texttt{Call} state is only entered when a call instruction is being executed, and then we immediately return to being in \texttt{State}.
Similarly, \texttt{Return} is only entered when a return instruction is being executed, before returning immediately to \texttt{State}.
All RTLabs states are accompanied by a memory, \texttt{Mem}, with \texttt{Call} and \texttt{Return} keeping track of arguments, return addresses and the results of functions.
\texttt{State} keeps track of a list of stack frames.

RTL states differ from their RTLabs counterparts in including a program counter \texttt{PC}, a stack pointer \texttt{SP}, an internal stack pointer \texttt{ISP}, a carry flag \texttt{CARRY} and a set of registers \texttt{REGS}:
\begin{displaymath}
\mathtt{State} ::= \mathtt{Frame}^* \times \mathtt{PC} \times \mathtt{SP} \times \mathtt{ISP} \times \mathtt{CARRY} \times \mathtt{REGS}
\end{displaymath}
The internal stack pointer \texttt{ISP}, and its relationship with the stack pointer \texttt{SP}, needs some comment.
Due to the design of the MCS-51, and its minuscule stack, it was decided that the compiler would implement an emulated stack in external memory.
As a result, we have two stack pointers in our state: \texttt{ISP}, which points into the real hardware stack, and \texttt{SP}, which is the stack pointer of the emulated stack in memory.
The emulated stack is used for pushing and popping stack frames when calling or returning from function calls; however, this is done using the hardware stack, indexed by \texttt{ISP}, as an intermediary.
Instructions like \texttt{LCALL} and \texttt{ACALL} are hardwired by the processor's design to push the return address on to the hardware stack. Therefore, after a call has been made, and before a call returns, the compiler emits code to move the return address back and forth between the two stacks. Parameters, return values
and local variables are only present in the external stack.
As a result, for most of the execution of the processor, the hardware stack is empty, or contains a single item ready to be moved into external memory.

Once more, we require a relation $\sigma$ between RTLabs states and RTL states.
Because $\sigma$ is one-to-many and, morally, a multi-function,
in the following we use functional notation for $\sigma$, with $\star$
in the output of $\sigma$ meaning that any value is accepted.
\begin{displaymath}
\mathtt{State} \stackrel{\sigma}{\longrightarrow} \mathtt{State}
\end{displaymath}

Translating an RTLabs state to an RTL state proceeds by cases on the particular type of state we are trying to translate, either a \texttt{State}, \texttt{Call} or a \texttt{Return}.
For \texttt{State} we perform a further case analysis of the top stack frame, which decomposes into a tuple holding the current program counter value, the current stack pointer and the value of the registers:
\begin{displaymath}
\sigma(\mathtt{State} (\mathtt{Frame}^* \times \mathtt{\langle PC, REGS, SP \rangle})) \longrightarrow ((\sigma(\mathtt{Frame}^*), \sigma(\mathtt{PC}), \sigma(\mathtt{SP}), \star, \star, \sigma(\mathtt{REGS})), \sigma(\mathtt{Mem}))
\end{displaymath}
Translation then proceeds by translating the remaining stack frames, as well as the contents of the top stack frame. Any values for the internal stack pointer
and the carry bit are admitted.

Translating \texttt{Call} and \texttt{Return} states is more involved, as a commutation between a single step of execution and the translation process must hold:
\begin{displaymath}
\sigma(\mathtt{Return}(-)) \longrightarrow \sigma \circ \text{return one step}
\end{displaymath}

\begin{displaymath}
\sigma(\mathtt{Call}(-)) \longrightarrow \sigma \circ \text{call one step}
\end{displaymath}

Here \emph{return one step} and \emph{call one step} refer to a pair of commuting diagrams relating the one-step execution of a call and return state and the translation of both.
We provide the one-step commuting diagrams in Figure~\ref{fig.commuting.diagrams}. The fact that one execution step in the source language is not performed
in the target language is not problematic for the preservation of divergence
because it is easy to show that every step from a \texttt{Call} or
\texttt{Return} state is always preceded/followed by one step that is always
simulated.

\begin{figure}
\begin{displaymath}
\begin{diagram}
s & \rTo^{\text{one step of execution}} & s'   \\
  & \rdTo                             & \dTo \\
  &                                   & \llbracket s'' \rrbracket
\end{diagram}
\end{displaymath}

\begin{displaymath}
\begin{diagram}
s & \rTo^{\text{one step of execution}} & s'   \\
  & \rdTo                             & \dTo \\
  &                                   & \llbracket s'' \rrbracket
\end{diagram}
\end{displaymath}
\caption{The one-step commuting diagrams for \texttt{Call} and \texttt{Return} state translations}
\label{fig.commuting.diagrams}
\end{figure}

\subsubsection*{The forward simulation proof}
The forward simulation proofs for all steps that do not involve function calls are lengthy, but routine.
They consist of simulating a front-end operation on front-end pseudo-registers and the front-end memory with sequences of back-end operations on the back-end pseudo-registers and back-end memory.
The properties of $\sigma$ presented before, relating values and memories, will need to be heavily exploited.

The simulation of invocation of functions and returns from functions is less obvious.
We sketch here what happens on the source code and on its translation.

\subparagraph{Function call/return in RTLabs}

\begin{displaymath}
\begin{array}{rcl}
\mathtt{Call(id,\ args,\ dst,\ pc) \in State(Frame^*, Frame)} & \longrightarrow & \mathtt{Call(M(args), dst)}, \\
                                                           &                 & \mathtt{PUSH(Frame[pc := after\_return])}
\end{array}
\end{displaymath}
Suppose we are given a \texttt{State} with a list of stack frames, with the top frame being \texttt{Frame}.
Suppose also that the program counter in \texttt{Frame} points to a \texttt{Call} instruction, complete with arguments and destination address.
Then this is executed by entering a \texttt{Call} state in which the arguments are loaded from memory and the address of the instruction immediately following the \texttt{Call} instruction is filled in; the current stack frame is pushed on top of the stack with the return address substituted for the program counter.

Now, what happens next depends on whether we are executing an internal or an external function.
In the case where the call is to an external function, we have:
\begin{displaymath}
\begin{array}{rcl}
\mathtt{Call(M(args), dst)},                       & \stackrel{\mathtt{ret\_val = f(M(args))}}{\longrightarrow} & \mathtt{Return(ret\_val,\ dst,\ PUSH(...))} \\
\mathtt{PUSH(current\_frame[pc := after\_return])} &                                                            &
\end{array}
\end{displaymath}
That is, the call to the external function enters a return state after first computing the return value by executing the external function on the arguments.
Then the return state restores the program counter by popping the stack, and execution proceeds in a new \texttt{State}:
\begin{displaymath}
\begin{array}{rcl}
\mathtt{Return(ret\_val,\ dst,\ PUSH(...))} & \longrightarrow & \mathtt{pc = POP\_STACK(regs[dst := M(ret\_val)],\ pc)} \\
                                            &                 & \mathtt{State(regs[dst := M(ret\_val)],\ pc)}
\end{array}
\end{displaymath}

Suppose instead that we are executing an internal function:
\begin{displaymath}
\begin{array}{rcl}
\mathtt{Call(M(args), dst)}                        & \longrightarrow & \mathtt{sp = alloc,\ regs = \emptyset[- := params]} \\
\mathtt{PUSH(current\_frame[pc := after\_return])} &                 & \mathtt{State(regs,\ sp,\ pc_\emptyset,\ dst)}
\end{array}
\end{displaymath}
A new stack frame is allocated and its address is stored in the stack pointer.
The register map is initialized first to the empty map, assigning an undefined value to all registers, before the values of the parameters are inserted into the argument registers, and a new \texttt{State} follows.


Eventually, a \texttt{Return} instruction is encountered:
the return value is fetched from the register map,
the stack frame is deallocated
and a \texttt{Return} state is entered:
\begin{displaymath}
\begin{array}{rcl}
\mathtt{Return(ret\_val) \in State(Frame^*, Frame)} & \longrightarrow & \mathtt{free(sp)} \\
   &                 & \mathtt{Return(M(ret\_val), dst, Frames)}
\end{array}
\end{displaymath}

Then the return state restores the program counter by popping the stack, and execution proceeds in a new \texttt{State}, as in the case of external functions:
\begin{displaymath}
\begin{array}{rcl}
\mathtt{free(sp)}                         & \longrightarrow & \mathtt{pc = POP\_STACK(regs[dst := M(ret\_val)],\ pc)} \\
\mathtt{Return(M(ret\_val), dst, frames)} &                 & \mathtt{State(regs[dst := M(ret\_val)],\ pc)}
\end{array}
\end{displaymath}

\subparagraph{The RTLabs to RTL translation for function calls}
Return instructions are translated to return instructions:
\begin{displaymath}
\mathtt{Return} \longrightarrow \mathtt{Return}
\end{displaymath}

\texttt{Call} instructions are translated to \texttt{Call\_ID} instructions:
\begin{displaymath}
\mathtt{Call(id,\ args,\ dst,\ pc)} \longrightarrow \mathtt{CALL\_ID(id,\ \Sigma'(args),\ \Sigma(dst),\ pc)}
\end{displaymath}
Here $\Sigma$ is the map, computed by the compiler,
that translates each pseudo-register holding a front-end value into the list of
pseudo-registers holding the chunks of that front-end value.
The specification of $\Sigma$ is that for every state $s$,
$$\sigma(s(r)) = (\sigma(s))(\Sigma(r))$$
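
A small OCaml sketch of the shape of such a map follows; the helpers
\texttt{fresh\_reg} and \texttt{size\_of\_reg} are hypothetical, standing
for the compiler's fresh-name generator and for the byte size of the
values a front-end pseudo-register holds, and registers are plain
strings here.

\begin{lstlisting}
(* Memoize the expansion so that each front-end pseudo-register is
   always mapped to the same list of back-end pseudo-registers. *)
let sigma_tbl : (string, string list) Hashtbl.t = Hashtbl.create 17

let sigma_reg fresh_reg size_of_reg r =
  match Hashtbl.find_opt sigma_tbl r with
  | Some rs -> rs
  | None ->
      let rs = List.init (size_of_reg r) (fun _ -> fresh_reg ()) in
      Hashtbl.add sigma_tbl r rs; rs
\end{lstlisting}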

\subparagraph{Function call/return in RTL}

In the case of RTL, execution proceeds as follows.
Suppose we are executing a \texttt{CALL\_ID} instruction.
Then a case split occurs depending on whether we are executing an internal or an external function, as in the RTLabs case:
\begin{displaymath}
\hspace{-3.5cm}
\begin{diagram}
& & \llbracket \mathtt{CALL\_ID}(\mathtt{id}, \mathtt{args}, \mathtt{dst}, \mathtt{pc})\rrbracket & & \\
& \ldTo^{\text{external}} & & \rdTo^{\text{internal}} & \\
\skull & & & &
\begin{array}{l}
\mathtt{sp = alloc,\ regs = \emptyset[- := params]} \\
\mathtt{PUSH}(\mathtt{carry}, \mathtt{regs}, \mathtt{dst}, \mathtt{return\_addr}), \mathtt{pc}_{0}, \mathtt{regs}, \mathtt{sp}
\end{array}
\end{diagram}
\end{displaymath}
Here, however, we differ from RTLabs when we attempt to execute an external function, in that we use a daemon (i.e.\ an axiom that can close any goal) to artificially close the case, as we have not yet implemented external functions in the back-end.
The reason for this lack of implementation is as follows.
Though we have implemented an optimising assembler as the target of the compiler's back-end, we have not yet implemented a linker for that assembler, so external functions cannot yet be called.
Whilst external functions are carried through the entirety of the compiler's front-end, we choose not to do the same for the back-end, instead eliminating them in RTL.
However, it is plausible that we could have carried external functions through, in order to eliminate them at a later stage (i.e.\ when translating from LIN to assembly).

In the case of an internal function being executed, we proceed as follows.
The register map is initialized to the empty map, where all registers are assigned the undefined value, and then the registers corresponding to the function parameters are assigned the values of the parameters.
Further, the stack pointer is updated to make room for an extra stack frame, and a frame is pushed onto the stack with the correct address to jump back to in place of the program counter.

Note, in particular, that this final act of pushing a frame on the stack leaves us in an identical state to the RTLabs case, where the instruction
\begin{displaymath}
\mathtt{PUSH(current\_frame[pc := after\_return])}
\end{displaymath}
was executed. To summarize, up to the different number of transitions required
to do the job, the RTL code for internal function calls closely simulates
the RTLabs code.

The execution of \texttt{Return} in RTL is similarly straightforward, with the return address, stack pointer, and so on being computed by popping off the top of the stack, and the return value computed by the function being retrieved from memory:
\begin{align*}
\mathtt{return\_addr} & := \mathtt{top}(\mathtt{stack}) \\
v^*                    & := M(\mathtt{rv\_regs}) \\
\mathtt{dst}, \mathtt{sp}, \mathtt{carry}, \mathtt{regs} & := \mathtt{pop} \\
\mathtt{regs}[v^* / \mathtt{dst}]
\end{align*}

To summarize, the forward simulation diagrams for function calls and returns amount to a pair of commutation properties between translation and execution for the \texttt{Return} and \texttt{Call} cases.
Starting from any \texttt{Return} or \texttt{Call} state, translating and then executing a single step must be the same as executing exactly two steps and then translating, with the intermediate state obtained by executing once also being translatable to the final state.
This is exemplified by the following diagram:
\begin{displaymath}
\begin{diagram}
s    & \rTo^1 & s' & \rTo^1 & s'' \\
\dTo &        &    & \rdTo  & \dTo \\
\llbracket s \rrbracket & \rTo(1,3)^1 & & & \llbracket s'' \rrbracket \\
\end{diagram}
\end{displaymath}

\subsection{The RTL to ERTL translation}
\label{subsect.rtl.ertl.translation}

We map RTL states to ERTL states as follows:
\begin{align*}
\mathtt{sp} & = \mathtt{RegisterSPH} / \mathtt{RegisterSPL} \\
\mathtt{graph} & = \mathtt{graph} + \mathtt{prologue}(s) + \mathtt{epilogue}(s) \\
& \mathrm{where}\ s = \mathrm{callee\ saved} + \nu \mathrm{RA}
\end{align*}
The 16-bit RTL stack pointer \texttt{SP} is mapped to a pair of 8-bit hardware registers \texttt{RegisterSPH} and \texttt{RegisterSPL}.
The internal function graphs of RTL are augmented with a prologue and an epilogue, indexed by a set of registers consisting of a fresh pair of registers \texttt{RA} and the set of registers that must be saved by the callee of a function.

The prologue and epilogue that are added to the function graph do the following:
\begin{align*}
\mathtt{prologue}(s) = & \mathtt{create\_new\_frame}; \\
                       & \mathtt{pop\ ra}; \\
                       & \mathtt{save\ callee\_saved}; \\
                       & \mathtt{get\_params}: \\
                       & \ \ \mathtt{reg\_params}: \mathtt{move} \\
                       & \ \ \mathtt{stack\_params}: \mathtt{push}/\mathtt{pop}/\mathtt{move}
\end{align*}
That is, the prologue first creates a new stack frame, pops the return address from the stack, saves all the callee-saved registers (i.e.\ the set \texttt{s}), fetches the parameters that are passed via registers or via the stack, and moves them into the correct registers.
In other words, the prologue of a function correctly sets up the calling convention used in the compiler when calling a function.
On the other hand, the epilogue undoes the action of the prologue:
\begin{align*}
\mathtt{epilogue}(s) = & \mathtt{save\ return\ to\ tmp\ real\ regs}; \\
                       & \mathtt{restore\_registers}; \\
                       & \mathtt{push\ ra}; \\
                       & \mathtt{delete\_frame}; \\
                       & \mathtt{save\ return}
\end{align*}
That is, the epilogue first saves the return value to temporary real registers, restores all the registers, pushes the return address on to the stack, deletes the stack frame that the prologue created, and stores the return value.

The \texttt{CALL} instruction is translated as follows:
\begin{displaymath}
\mathtt{CALL}\ id \mapsto \mathtt{set\_params};\ \mathtt{CALL}\ id;\ \mathtt{fetch\_result}
\end{displaymath}
Here, \texttt{set\_params} and \texttt{fetch\_result} are functions that implement what the caller of the function needs to do when calling a function, as opposed to the epilogue and prologue, which implement what the callee must do.
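
The caller-side expansion can be sketched in OCaml as follows; the code
generators \texttt{set\_params\_code} and \texttt{fetch\_result\_code} are
assumptions standing for the actual ERTL instruction emitters.

\begin{lstlisting}
(* Wrap a call with the caller's side of the calling convention:
   marshal the arguments, perform the call, read back the result. *)
let translate_call set_params_code fetch_result_code id args dst =
  set_params_code args @ [ `Call id ] @ fetch_result_code dst
\end{lstlisting}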

The translation from RTL to ERTL and the execution functions must satisfy the following properties for \texttt{CALL} and \texttt{RETURN} instructions, respectively:
\begin{displaymath}
\begin{diagram}
\mathtt{CALL} & \rTo^1 & \mathtt{inside\ function} \\
\dTo & & \dTo \\
\underbrace{\ldots}_{\llbracket \mathtt{CALL} \rrbracket} & \rTo &
\underbrace{\ldots}_{\mathtt{prologue}} \\
\end{diagram}
\end{displaymath}
That is, if we start at an RTL \texttt{CALL} instruction, and translate this to an ERTL \texttt{CALL} instruction, then executing the RTL \texttt{CALL} instruction for one step and translating should land us in the prologue of the translated function.
A similar property should hold for \texttt{RETURN}, with the epilogue of the function being translated in place of the prologue:
\begin{displaymath}
\begin{diagram}
\mathtt{RETURN} & \rTo^1 & \mathtt{.} \\
\dTo & & \dTo \\
\underbrace{\ldots}_{\mathtt{epilogue}} & \rTo &
\underbrace{\ldots} \\
\end{diagram}
\end{displaymath}

\subsection{The ERTL to LTL translation}
\label{subsect.ertl.ltl.translation}
During the ERTL to LTL translation, pseudo-registers are stored in hardware
registers or spilled on to the stack frame. The decision is based on a liveness
analysis of the ERTL code that determines which pairs of pseudo-registers are
live at the same time at a given location. A colouring algorithm is then used to
choose where to store the pseudo-registers, allowing pseudo-registers that
are never live at the same time to share the same location.

We will not certify any colouring algorithm or control flow analysis.
Instead, we axiomatically assume the existence of solutions to the
colouring and liveness problems. In a later phase we plan to validate the
solutions by writing and certifying the code of a validator.

We describe the liveness analysis and the colouring analysis first, and then
the ERTL to LTL translation.

Throughout this section, we denote pseudo-registers by the type $\mathtt{register}$ and hardware ones by $\mathtt{hdwregister}$.
\subsubsection{Liveness analysis}
\newcommand{\declsf}[1]{\expandafter\newcommand\expandafter{\csname #1\endcsname}{\mathop{\mathsf{#1}}\nolimits}}
\declsf{Livebefore}
\declsf{Liveafter}
\declsf{Defined}
\declsf{Used}
\declsf{Eliminable}
\declsf{StatementSem}

For the liveness analysis, we aim at a map sending each
$\ell \in \mathtt{label}$ to the set of registers live at $\ell$.
We define the following operators on ERTL statements.
$$
\begin{array}{lL>{(ex. $}L<{)$}}
\Defined(\ell) & registers defined at $\ell$ & \ell:r_1\leftarrow r_2+r_3 \mapsto \{r_1,C\}, \ell:\mathtt{CALL}~id\mapsto \text{caller-save}
\\
\Used(\ell) & registers used at $\ell$ & \ell:r_1\leftarrow r_2+r_3 \mapsto \{r_2,r_3\}, \ell:\mathtt{CALL}~id\mapsto \text{parameters}
\end{array}
$$
Given $LA:\mathtt{label}\to\mathtt{lattice}$ (where $\mathtt{lattice}$
is the type of sets of registers\footnote{More precisely, it is the lattice
of pairs of sets of pseudo-registers and sets of hardware registers,
with pointwise operations.}), we also have the following
predicates:
$$
\begin{array}{lL}
\Eliminable_{LA}(\ell) & iff executing $\ell$ has side-effects only on $r\notin LA(\ell)$
\\&
(ex.\ $\ell : r_1\leftarrow r_2+r_3 \mapsto (\{r_1,C\}\cap LA(\ell)=\emptyset),
  \mathtt{CALL}~id\mapsto \text{never}$)
\\
\Livebefore_{LA}(\ell) &$:=
  \begin{cases}
    LA(\ell) &\text{if $\Eliminable_{LA}(\ell)$,}\\
    (LA(\ell)\setminus \Defined(\ell))\cup \Used(\ell) &\text{otherwise}.
  \end{cases}$
\end{array}
$$
In particular, $\Livebefore$ has type $(\mathtt{label}\to\mathtt{lattice})\to
\mathtt{label}\to\mathtt{lattice}$.

The equation on which we build the fixpoint is then
$$\Liveafter(\ell) \doteq \bigcup_{\ell <_1 \ell'} \Livebefore_{\Liveafter}(\ell')$$
where $\ell <_1 \ell'$ denotes that $\ell'$ is an immediate successor of $\ell$
in the graph. We do not require the fixpoint to be the least one, so the hypothesis
on $\Liveafter$ that we require is
\begin{equation}
\label{eq:livefixpoint}
\Liveafter(\ell) \supseteq \bigcup_{\ell <_1 \ell'} \Livebefore(\ell')
\end{equation}
(for shortness we drop the subscript from $\Livebefore$).
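
Such a fixpoint can be computed by a standard Kleene iteration. The OCaml
sketch below works on a toy representation where labels are strings,
register sets are lists, and the successor, defined and used functions are
supplied as inputs; the $\Eliminable$ refinement is omitted for brevity.

\begin{lstlisting}
module LMap = Map.Make (String)

let union a b = List.sort_uniq compare (a @ b)
let diff a b = List.filter (fun x -> not (List.mem x b)) a

(* Iterate LA(l) := union over successors l' of livebefore(l') until the
   map is stable; the sets only grow from the empty map, so on a finite
   register set the iteration terminates. *)
let live_fixpoint labels succ defined used =
  let live_before get l = union (diff (get l) (defined l)) (used l) in
  let rec iterate live_after =
    let get l = try LMap.find l live_after with Not_found -> [] in
    let step l _ =
      List.fold_left (fun acc l' -> union acc (live_before get l')) [] (succ l)
    in
    let live_after' = LMap.mapi step live_after in
    if LMap.equal (=) live_after live_after' then live_after
    else iterate live_after'
  in
  iterate (List.fold_left (fun m l -> LMap.add l [] m) LMap.empty labels)
\end{lstlisting}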

\subsubsection{Colouring}
\declsf{Colour}
\newcommand{\at}{\mathrel{@}}
The aim of the liveness analysis is to define the properties we need
of the colouring function, which is a map (computed separately for each
internal function)
$$\Colour:\mathtt{register}\to\mathtt{hdwregister}+\mathtt{nat}$$
which identifies pseudo-registers with hardware ones if it is able to, and
otherwise spills them to the stack. We will just state the property we need
from such a map. First, we extend the definition to all types of registers by:
$$\begin{aligned}
   \Colour^+:\mathtt{hdwregister}+\mathtt{register} &\to \mathtt{hdwregister}+\mathtt{nat}\\
   r & \mapsto
\begin{cases}
  \Colour(r) &\text{if $r\in\mathtt{register}$,}\\
  r &\text{if $r\in\mathtt{hdwregister}$.}
\end{cases}
  \end{aligned}$$
The other piece of information we compute for each function is a \emph{similarity}
relation, which is an equivalence relation on all kinds of registers which depends
on the point of the program. We write
$$x\sim y \at \ell$$
to state that registers $x$ and $y$ are similar at $\ell$. The formal definition
will be given next, but intuitively it means that those two registers \emph{must}
have the same value at $\ell$. The analysis that produces this information can be
coarse: in our case, we just set two different registers to be similar at $\ell$
if at $\ell$ itself there is a move instruction between the two.

The property required of colouring is the following:
\begin{equation}
\label{eq:colourprop}
\forall \ell.\forall x,y. x,y\in \Liveafter(\ell)\Rightarrow
  \Colour^+(x)=\Colour^+(y) \Rightarrow x\sim y \at\ell.
\end{equation}

\subsubsection{The translation}
The translation of each ERTL statement is driven by the liveness and
colouring information. For example:
$$\ell : r_1\leftarrow r_2+r_3 \mapsto \begin{cases}
                                 \varepsilon & \text{if $\Eliminable(\ell)$},\\
                                 \Colour(r_1) \leftarrow \Colour(r_2) + \Colour(r_3) & \text{otherwise}.
                                \end{cases}$$
where $\varepsilon$ is the empty block of instructions (i.e.\ a \texttt{GOTO}),
and $\Colour(r_1) \leftarrow \Colour(r_2) + \Colour(r_3)$ is a notation for a
block of instructions that takes into account:
\begin{itemize}
 \item load and store operations on the stack if any colouring is in fact a spill;
 \item using the accumulator to store intermediate values.
\end{itemize}
The overall effect is that if $T$ is an LTL state with $\ell(T)=\ell$,
then we will have $T\to^+T'$ where $T'(\Colour(r_1))=T(\Colour(r_2))+T(\Colour(r_3))$.
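
A hypothetical OCaml sketch of generating such a block for a binary
operation may clarify the role of spilling and of the accumulator; the
instructions are rendered as strings for brevity, and the stack-slot
addressing is left abstract.

\begin{lstlisting}
type colour = Hdw of string | Spill of int   (* hardware reg or stack slot *)

(* Read an operand into the accumulator A, possibly from a stack slot. *)
let read_into_acc = function
  | Hdw r -> [ "MOV A, " ^ r ]
  | Spill n -> [ "load A from stack slot " ^ string_of_int n ]

(* Write the accumulator back to wherever the destination lives. *)
let write_from_acc = function
  | Hdw r -> [ "MOV " ^ r ^ ", A" ]
  | Spill n -> [ "store A to stack slot " ^ string_of_int n ]

let translate_add colour r1 r2 r3 =
  read_into_acc (colour r2)
  @ (match colour r3 with
     | Hdw r -> [ "ADD A, " ^ r ]
     | Spill n -> [ "load operand from stack slot " ^ string_of_int n;
                    "ADD A, operand" ])
  @ write_from_acc (colour r1)
\end{lstlisting}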

We skip over the details of correctly dealing with the stack and its size.
\subsubsection{The relation between ERTL and LTL states}
Given a state $S$ in ERTL, we abuse notation by using $S$ as the underlying map
$$S : \mathtt{hdwregister}+\mathtt{register} \to \mathtt{Value}.$$
The program counter in $S$ is written as $\ell(S)$. At this point we can state
the property asked of similarity:
\begin{equation}
\label{eq:similprop}
\forall S,S'.S\to S' \Rightarrow \forall x,y.x\sim y \at \ell(S) \Rightarrow S'(x) = S'(y).
\end{equation}

Next, we relate ERTL states with LTL ones. For a state $T$ in LTL we again
abuse notation, using $T$ as a map
$$T: \mathtt{hdwregister}+\mathtt{nat} \to \mathtt{Value}$$
which maps hardware registers and \emph{local stack offsets} to values (in particular,
$T$ as a map depends on the saved frames for computing the correct absolute
stack values).

The relation existing between the states at the two sides of this translation step,
which depends on liveness and colouring, is
then defined as
$$S\mathrel\sigma T \iff \ldots \wedge \forall x. x\in \Livebefore(\ell(S))\Rightarrow T(\Colour^+(x)) = S(x).$$
The ellipsis stands for other straightforward preservation properties, among which
$\ell(T) = \ell(S)$ and, inductively, the preservation of frames.
987
988\subsubsection{Proof of preservation}
989We will prove the following proposition:
990$$\forall S, T. S \mathrel\sigma T \Rightarrow S \to S' \Rightarrow \exists T'.T\to^+ T' \wedge S'\mathrel\sigma T'$$
991(with appropriate cost-labelled trace preservation which we omit). We will call $S\mathrel \sigma T$
992the inductive hypothsis, as it will be such in the complete proof by induction on the trace of the program.
993As usual, this step is done by cases
994on the statement at $\ell(S)$ and how it is translated. We carry out in some detail a single case, the one of
995a binary operation on registers.
996
997Suppose that $\ell(S):r_1 \leftarrow r_2+r_3$, so that
998$$S'(x)=\begin{cases}S(r_1)+S(r_2) &\text{if $x=r_1$,}\\S(x) &\text{otherwise.}\end{cases}$$
999\paragraph*{Case $\Eliminable(\ell(S))$.}
1000By definition we have $r_1\notin \Liveafter(\ell(S))$, and the translation
1001of the operation yields a \texttt{GOTO}. We take $T'$ the immediate successor
1002of $T$.
1003
1004Now in order to prove $S'\mathrel\sigma T'$, take any
1005$$x\in\Livebefore(\ell(S'))\subseteq \Liveafter(\ell(S)) = \Livebefore(\ell(S))$$
1006where we have used property~\eqref{eq:livefixpoint} and the definition
1007of $\Livebefore$ when $\Eliminable(\ell(S))$. We get the following chain of equalities:
1008$$T'(\Colour^+(x))\stackrel 1=T(\Colour^+(x))\stackrel 2=S(x) \stackrel 3= S'(x)$$
1009where
1010\begin{enumerate}
1011 \item is because $T'$ has the same store as $T$,
1012 \item is by inductive hypothesis as $x\in\Livebefore(\ell(S))$,
1013 \item is because $x\neq r_1$, as $r_1\notin \Liveafter(\ell(S))\ni x$.
1014\end{enumerate}
1015\paragraph*{Case $\neg\Eliminable(\ell(S))$.}
1016We then have $r_1\in\Liveafter(\ell(S))$, and
1017$$\Livebefore(\ell(S))=(\Liveafter(\ell(S))\setminus\{r_1\})\cup\{r_2,r_3\}.$$
1018Moreover the statement is translated to $\Colour(r_1)\leftarrow\Colour(r_2)+\Colour(r_3)$,
1019and we take the $T'\leftarrow^+T$ such that $T'(\Colour(r_1))=T(\Colour(r_2))+T(\Colour(r_3))$ and
1020$T'(\Colour^+(x))=T(\Colour^+(x))$ for all $x$ with $\Colour^+(x)\neq\Colour(r_1)$.
1021
1022Take any $x\in\Livebefore(\ell(S'))\subseteq \Liveafter(\ell(S))$ (by property~\eqref{eq:livefixpoint}).
1023
1024If $\Colour^+(x)=\Colour(r_1)$, we have by property~\eqref{eq:colourprop}
1025that $x\sim r_1\at \ell(S)$ (as both $r_1,x\in\Liveafter(\ell(S))$, so that
1026$$T'(\Colour^+(x))=T(\Colour(r_2))+T(\Colour(r_3))\stackrel 1=S(r_2)+S(r_3)=S'(r_1)\stackrel 2=S(x)$$
1027where
1028\begin{enumerate}
1029 \item is by two uses of inductive hypothesis, as $r_2,r_3\in\Livebefore(\ell(S))$,
 \item is by property~\eqref{eq:similprop}\footnote{Notice that in our particular implementation,
for this case of binary operation, $x\sim r_1\at\ell(S)$ implies $x=r_1$. But nothing prevents us from
employing finer heuristics for similarity.}.
1033\end{enumerate}
1034
1035If $\Colour^+(x)\neq\Colour(r_1)$ (so in particular $x\neq r_1$), then $x\in\Livebefore(\ell(S))$,
1036so by inductive hypothesis we have
1037$$T'(\Colour^+(x))=T(\Colour^+(x))=S(x)=S'(x).$$
1038
1039\subsection{The LTL to LIN translation}
1040\label{subsect.ltl.lin.translation}
As detailed elsewhere in the reports, due to the parameterized representation of
the back-end languages, the pass described here is actually much more generic
than the translation from LTL to LIN. It consists of a linearisation pass
1044that maps any graph-based back-end language to its corresponding linear form,
1045preserving its semantics. In the rest of the section, however, we will keep
1046the names LTL and LIN for the two partial instantiations of the parameterized
1047language.
1048
1049We require a map, $\sigma$, from LTL statuses, where program counters are represented as labels in a graph data structure, to LIN statuses, where program counters are natural numbers:
1050\begin{displaymath}
1051\mathtt{pc : label} \stackrel{\sigma}{\longrightarrow} \mathbb{N}
1052\end{displaymath}
1053
1054The LTL to LIN translation pass also linearises the graph data structure into a list of instructions.
1055Pseudocode for the linearisation process is as follows:
1056
1057\begin{lstlisting}
let rec linearise graph visited required generated todo :=
  match todo with
  | l::todo ->
    if l $\in$ visited then
      let generated := generated $\cup\ \{$ Goto l $\}$ in
      let required := required $\cup\ \{$ l $\}$ in
        linearise graph visited required generated todo
    else
      -- Mark the label `l' as visited
      let visited := visited $\cup\ \{$ l $\}$ in
      -- Get the instruction at label `l' in the graph
      let lookup := graph(l) in
      let generated := generated $\cup\ \{$ lookup $\}$ in
      -- Find the successor of the instruction at label `l' in the graph
      let successor := succ(l, graph) in
      let todo := successor::todo in
        linearise graph visited required generated todo
  | []      -> (required, generated)
1074\end{lstlisting}
1075
It is easy to see that this linearisation process eventually terminates.
In particular, each recursive call either enlarges the visited label set, which is bounded above by the (finite) set of labels in the graph being linearised, or leaves it unchanged while shortening the \texttt{todo} list.
1078
The initial call to \texttt{linearise} has the \texttt{visited}, \texttt{required} and \texttt{generated} sets initialized to the empty set, and \texttt{todo} initialized to the singleton list consisting of the entry point of the graph, as shown below.
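For concreteness, in the notation of the pseudocode above, the initial invocation is (writing \texttt{entry} for the entry label of \texttt{graph}):

\begin{lstlisting}
let (required, generated) :=
  linearise graph $\emptyset$ $\emptyset$ $\emptyset$ [entry]
\end{lstlisting}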
1080We envisage needing to prove the following invariants on the linearisation function above:
1081
1082\begin{enumerate}
1083\item
$\mathtt{visited} \approx \mathtt{generated}$, where $\approx$ is \emph{multiset} equality: \texttt{visited} is a set of labels, whereas \texttt{generated} is a multiset of instructions in which the same instruction may be generated for several distinct labels,
1085\item
$\forall \mathtt{l} \in \mathtt{visited}.\ \mathtt{succ(l,\ graph)} \subseteq \mathtt{visited} \cup \mathtt{todo}$,
1087\item
1088$\mathtt{required} \subseteq \mathtt{visited}$,
1089\item
1090$\mathtt{visited} \cap \mathtt{todo} = \emptyset$.
1091\end{enumerate}
1092
1093The invariants collectively imply the following properties, crucial to correctness, about the linearisation process:
1094
1095\begin{enumerate}
1096\item
1097Every graph node is visited at most once,
1098\item
1099Every instruction that is generated is generated due to some graph node being visited,
1100\item
1101The successor instruction of every instruction that has been visited already will eventually be visited too.
1102\end{enumerate}
1103
Note that, because the LTL to LIN transformation is the first point in the back-end at which the code of
a function is linearised, we must identify a notion of `well formed function code' suitable for linearised forms.
1106In particular, we see the notion of well formedness (yet to be formally defined) resting on the following conditions:
1107
1108\begin{enumerate}
1109\item
1110For every jump to a label in a linearised function code, the target label exists at some point in the function code,
1111\item
1112Each label is unique, appearing only once in the function code,
1113\item
1114The final instruction of a function code must be a return or an unconditional
1115jump.
1116\end{enumerate}
1117
1118We assume that these properties will be easy consequences of the invariants on the linearisation function defined above.
1119
The final condition above is potentially a little opaque, so we explain further.
The only instructions that can reasonably appear in the final position of a function's code are returns or unconditional jumps (necessarily backward, given the first condition), as any other instruction would cause execution to `fall out' of the end of the function (for example, when a function invoked with \texttt{CALL} returns, execution resumes at the instruction immediately past the \texttt{CALL} that invoked it).
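Purely as an illustration of these three conditions, a well-formedness check over a simplified linearised code could look as follows; the OCaml instruction type and all names here are hypothetical simplifications, not the formalised LIN syntax:

\begin{lstlisting}
(* Illustrative only: a simplified linearised instruction set. *)
type instr =
  | Label of string      (* a pseudo-instruction naming a label *)
  | Goto of string       (* unconditional jump to a label *)
  | Branch of string     (* conditional jump to a label *)
  | Return
  | Other                (* any non-control-flow instruction *)

let targets = function Goto l | Branch l -> [l] | _ -> []

let labels code =
  List.filter_map (function Label l -> Some l | _ -> None) code

let well_formed (code : instr list) : bool =
  let ls = labels code in
  (* 1. every jump targets a label that exists in the code *)
  List.for_all (fun i -> List.for_all (fun l -> List.mem l ls) (targets i)) code
  (* 2. labels are pairwise distinct *)
  && List.length ls = List.length (List.sort_uniq compare ls)
  (* 3. the final instruction is a return or an unconditional jump *)
  && (match List.rev code with
      | (Return | Goto _) :: _ -> true
      | _ -> false)
\end{lstlisting}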
1122
1123\subsection{The LIN to ASM and ASM to MCS-51 machine code translations}
1124\label{subsect.lin.asm.translation}
1125
1126The LIN to ASM translation step is trivial, being almost the identity function.
The only non-trivial feature of the LIN to ASM translation is that all labels are `named apart', so that freshly generated labels from one namespace cannot clash with labels from another.
1128
1129The ASM to MCS-51 machine code translation step, and the required statements of correctness, are found in an unpublished manuscript attached to this document.
1130This is the most complex translation because of the huge number of cases
1131to be addressed and because of the complexity of the two semantics.
Moreover, in the assembly code we have conditional and unconditional jumps
to arbitrary locations in the code, which are not supported by the MCS-51
instruction set. The latter has several kinds of jumps, characterized by
different instruction sizes and execution times, but limited in range. For
instance, conditional jumps whose destination is more than
$2^7$ bytes away from the jump instruction are not supported at
all and must be emulated with a code transformation. The problem, which
is known in the literature as branch displacement and also applies
to modern architectures, is known to be hard, with optimal solutions being
NP-complete in general. As far as we
know, we will provide the first formally verified proof of correctness for
an assembler that implements branch displacement. We are also working on
the first verified proof of correctness of a mildly optimizing branch
displacement algorithm; at the moment this proof is almost finished, but it is
not described in the companion paper. This proof by itself took about six man
months.
1147
1148\section{Correctness of cost prediction}
Roughly speaking,
the proof of correctness of cost prediction shows that the cost of executing
a labelled object code program is the same as the sum, over all labels in the
program execution trace, of the cost statically associated with each label and
computed on the object code itself.
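In symbols, writing $k(\ell)$ for the cost statically associated with a cost label $\ell$ and $\delta(s \to s')$ for the dynamic cost of one execution step, the intended statement is roughly
$$\sum_{s \to s' \in \tau} \delta(s \to s') \;=\; \sum_{\ell \in \mathrm{labels}(\tau)} k(\ell)$$
for every terminating execution trace $\tau$. The notation here is ours and is meant only to fix intuitions; the formal statement is phrased on structured traces, introduced below.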
1154
In the presence of object level function calls, however, the previous statement is
incorrect. The reason is twofold. First of all, a function call may diverge.
Nevertheless, to the last label that comes before the call we also associate
the cost of the instructions that follow the call. Therefore, in the
sum over all labels, when we meet a label we pre-pay for the instructions
after function calls, assuming all calls to be terminating. This choice is
driven by considerations on the source code. Functions can also be called
inside expressions, and it would be too disruptive to put labels inside
expressions to capture the cost of instructions that follow a call. Moreover,
adding a label after each call would produce a much higher number of proof
obligations in the certification of source programs using Frama-C. These
proof obligations, moreover, would be guarded by termination of all functions
involved, which also generates lots of additional complex proof obligations
that have little to do with execution costs. With our approach, instead, we
put less burden on the user, at the price of proving a weaker statement:
the estimated and actual costs will be the same if and only if the high level
program converges. For prefixes of diverging programs we can provide
a similar result where the equality is replaced by an inequality (loss of
precision).
1174
Assuming totality of functions is, however, not yet sufficient at the object
level. Even if a function returns, there is no guarantee that it will transfer
1177control back to the calling point. For instance, the function could have
1178manipulated the return address from its stack frame. Moreover, an object level
1179program can forge any address and transfer control to it, with no guarantee
1180on the execution behaviour and labelling properties of the called program.
1181
To solve the problem, we introduced the notion of \emph{structured trace},
which comes in two flavours: structured traces for total programs (an inductive
1184type) and structured traces for diverging programs (a co-inductive type based
1185on the previous one). Roughly speaking, a structured trace represents the
1186execution of a well behaved program that is subject to several constraints
1187like
1188\begin{enumerate}
1189 \item All function calls return control just after the calling point
 \item The execution of each function body starts with a label and ends with
   a RET (even for functions reached by invoking a function pointer)
1192 \item All instructions are covered by a label (required by correctness of
1193   the labelling approach)
1194 \item The target of all conditional jumps must be labelled (a sufficient
1195   but not necessary condition for precision of the labelling approach)
1196 \item \label{prop5} Two structured traces with the same structure yield the same
1197   cost traces.
1198\end{enumerate}
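As a purely indicative sketch of the shape of the definition, the finite flavour can be modelled by a family of three mutually defined datatypes, mirroring the three mutual inductive types of the Matita development; all names below are illustrative simplifications of ours:

\begin{lstlisting}
(* A hypothetical abstract state: a program counter plus a flag
   telling whether the current instruction carries a cost label. *)
type state = { pc : int; labelled : bool }

(* Execution from a label up to and including a RET. *)
type trace_label_return =
  | Tlr_base of trace_label_label                       (* final block ending in RET *)
  | Tlr_step of trace_label_label * trace_label_return  (* a block, then the rest *)

(* Execution from one cost label to just before the next one. *)
and trace_label_label = Tll of state * trace_any_label

(* Label-free execution steps, with calls expanded structurally. *)
and trace_any_label =
  | Tal_base of state                                   (* a single, final step *)
  | Tal_call of state * trace_label_return * trace_any_label
      (* a call: the callee's whole structured trace, then the continuation *)
  | Tal_step of state * trace_any_label                 (* one ordinary step *)
\end{lstlisting}
The diverging flavour replaces the inductive reading of these types with a co-inductive one built on top of them.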
1199
1200Correctness of cost predictions is proved only for structured execution traces,
i.e. well behaved programs. The forward simulation proofs for all back-end
passes will actually be proofs of preservation of the structure of
the structured traces, which, because of property \ref{prop5}, will imply
correctness of the cost prediction for the back-end. The Clight to RTLabs
pass will also include a proof that associates to each converging execution its
1206converging structured trace and to each diverging execution its diverging
1207structured trace.
1208
There are also two other issues that invalidate the naive statement of
correctness of cost prediction given above. The algorithm that statically
1211computes the cost of blocks is correct only when the object code is \emph{well
1212formed} and the program counter is \emph{reachable}.
1213A well formed object code is such that
the program counter will never overflow after an execution step of
the processor. An overflow that occurs during fetching but is overwritten
during execution is, however, harmless and necessary to accept correct
1217programs that are as large as the program memory. Temporary overflows add
1218complications to the proof. A reachable address is an address that can be
1219obtained by fetching (not executing!) a finite number of times from the
1220beginning of the code memory without ever overflowing. The complication is that
1221the static prediction traverses the code memory assuming that the memory will
1222be read sequentially from the beginning and that all jumps jump only to
reachable addresses. When this property is violated, the way the code memory
is interpreted is incorrect and the cost computed is totally meaningless.
For well formed programs, the reachability relation is closed under fetching.
The property that calls through function pointers only target reachable and
well labelled locations, however, is not statically predictable, and it is
enforced in the structured trace.
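As a minimal sketch, under our simplifying assumptions, reachability can be computed by iterated fetching; the \texttt{fetch\_size} function, returning the size in bytes of the instruction encoded at a given address, and the flat view of code memory are hypothetical:

\begin{lstlisting}
(* Sketch: the addresses reachable by fetching sequentially from the
   start of code memory, stopping before the program counter can
   overflow past `code_size'. *)
let reachable (code_size : int) (fetch_size : int -> int) : int list =
  let rec go pc acc =
    if pc >= code_size then List.rev acc              (* past the code: stop *)
    else
      let next = pc + fetch_size pc in
      if next > code_size then List.rev (pc :: acc)   (* fetching would overflow *)
      else go next (pc :: acc)
  in
  go 0 []
\end{lstlisting}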
1229
The proof of correctness of cost predictions has been quite complex. Setting
up the right invariants (structured traces, well formed programs, reachability)
and completing the proof has required more than three man months, while the
initially estimated effort was much lower. In the paper-and-pencil proof for IMP, the
corresponding proof was obvious and only took two lines.
1235
1236The proof itself is quite involved. We
1237basically need to show as an important lemma that the sum of the execution
1238costs over a structured trace, where the costs are summed in execution order,
1239is equivalent to the sum of the execution costs in the order of pre-payment.
1240The two orders are quite different and the proof is by mutual recursion over
1241the definition of the converging structured traces, which is a family of three
mutual inductive types. The fact that this property only holds for converging
function calls is hidden in the definition of the structured traces.
1244Then we need to show that the order of pre-payment
1245corresponds to the order induced by the cost traces extracted from the
1246structured trace. Finally, we need to show that the statically computed cost
for one block corresponds to the cost dynamically computed in pre-payment
1248order.
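Schematically, writing $\mathrm{costs}_{\mathrm{exec}}(\tau)$ for the block costs of a converging structured trace $\tau$ listed in execution order, and $\mathrm{costs}_{\mathrm{pre}}(\tau)$ for the same costs listed in pre-payment order, the lemma amounts to
$$\sum \mathrm{costs}_{\mathrm{exec}}(\tau) \;=\; \sum \mathrm{costs}_{\mathrm{pre}}(\tau)$$
where, once more, the notation is ours and serves only to convey the shape of the statement.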
1249
1250\section{Overall results}
1251
1252Functional correctness of the compiled code can be shown by composing
1253the simulations to show that the target behaviour matches the
1254behaviour of the source program, if the source program does not `go
1255wrong'.  More precisely, we show that there is a forward simulation
1256between the source trace and a (flattened structured) trace of the
1257output, and conclude equivalence because the target's semantics are
1258in the form of an executable function, and hence
1259deterministic.
1260
1261Combining this with the correctness of the assignment of costs to cost
1262labels at the ASM level for a structured trace, we can show that the
1263cost of executing any compiled function (including the main function)
1264is equal to the sum of all the values for cost labels encountered in
1265the \emph{source code's} trace of the function.
1266
1267\section{Estimated effort}
Based on the rough analysis performed so far, we can estimate the total
effort for the certification of the compiler. We obtain this estimation by
combining, for each pass: 1) the number of lines of code to be certified;
2) the ratio between man months and lines of code taken from
the CompCert project~\cite{compcert} for the CompCert pass that is closest to
ours; 3) an estimation of the complexity of the pass according to the
analysis above. The result is shown in Table~\ref{table}.
1275
\begin{table}[ht]
\begin{tabular}{lrlrr}
Pass origin & Code lines & CompCert ratio & Estimated effort & Estimated effort \\
            &            & (man months/line) & (based on CompCert) & (revised) \\
1280\hline
1281Common &  4864 & 4.25 \permil & 20.67 & 17.0 \\
1282Cminor &  1057 & 5.23 \permil & 5.53  &  6.0 \\
1283Clight &  1856 & 5.23 \permil & 9.71  & 10.0 \\ 
1284RTLabs &  1252 & 1.17 \permil & 1.48  &  5.0 \\
1285RTL    &   469 & 4.17 \permil & 1.95  &  2.0 \\
1286ERTL   &   789 & 3.01 \permil & 2.38  & 2.5 \\
1287LTL    &    92 & 5.94 \permil & 0.55  & 0.5 \\
1288LIN    &   354 & 6.54 \permil & 2.31  &   1.0 \\
1289ASM    &   984 & 4.80 \permil & 4.72  &  10.0 \\
1290\hline
1291Total common    &  4864 & 4.25 \permil & 20.67 & 17.0 \\
1292Total front-end &  2913 & 5.23 \permil & 15.24 & 16.0 \\
1293Total back-end  &  6853 & 4.17 \permil & 13.39 & 21.0 \\
1294\hline
1295Total           & 14630 & 3.75 \permil & 49.30 & 54.0 \\
1296\end{tabular}
1297\caption{\label{table} Estimated effort}
1298\end{table}
1299
We now provide some additional information on the methodology used in the
computation. The front-end passes of CerCo and CompCert closely match each
other. However, there is no clear correspondence between the two back-ends.
For instance, we enforce the calling convention immediately after instruction
selection, whereas in CompCert this is performed in a later phase. Similarly, we
linearise the code at the very end, whereas CompCert performs linearisation
as soon as possible. Therefore, the first part of the exercise consisted
in shuffling and partitioning the CompCert code in order to assign to each
CerCo pass the CompCert code that performs the same transformation.
1309
After this preliminary step, using the data given in~\cite{compcert} (which
refer to an early version of CompCert), we computed the ratio between
man months and lines of code in CompCert for each CerCo pass. This is shown
1313in the third column of Table~\ref{table}. For those CerCo passes that
1314have no correspondence in CompCert (like the optimizing assembler) or where
1315we have insufficient data, we have used the average of the ratios computed
1316above.
1317
The second column of the table shows the number of lines of code for each
pass in CerCo. The fourth column is obtained by multiplying the second by the
CompCert ratio. It provides an estimate of the effort required (in man months)
were the complexity of the proofs for CerCo and CompCert the same.
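For instance, for the Clight pass the table yields $1856 \times 5.23\permil \approx 9.71$ man months, a figure that is then revised to $10.0$ in the last column to account for the expected proof complexity.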
1322
The two proof styles, however, are deliberately completely different. Where
CompCert uses non-executable semantics, describing the various semantics with
inductive types, we have preferred executable semantics. Therefore, CompCert's
proofs by induction and inversion become proofs by functional inversion,
performed using the Russell methodology (now called Program in Coq, though its
behaviour differs from Matita's). Moreover, CompCert's code is written using
only types that belong to the Hindley-Milner fragment, whereas we have
heavily exploited dependent types all over the code. The dependent type
discipline offers many advantages from the point of view of clarity of the
invariants involved and early detection of errors, and it naturally combines
well with the Russell approach, which is based on dependent types. However, it
1334is also well known to introduce technical problems all over the code, like
1335the need to explicitly prove type equalities to be able to manipulate
1336expressions in certain ways. In many situations, the difficulties encountered
with manipulating dependent types are better addressed by improving the Matita
system, following the formalization-driven system development approach. For this
reason, and assuming a pessimistic point of view on our performance, the
fifth column presents the final estimation of the required effort, which also
takes into account the complexity of the proof suggested by the informal proofs
sketched in the previous section.
1343
1344\subsection{Contingency plan}
1345On the basis of the proof strategy sketched in this document and the
estimated effort, we can refine our contingency plan. If we finish
the certification of the basic compiler ahead of schedule, we will have the choice
of either proving loop optimizations and/or partial redundancy elimination
correct (both tasks that seem difficult to achieve in a short time) or
considering the MCS-51 specific extensions introduced during the first period
and under-used in the formalized prototype. Yet another possibility would be
to study more closely the retargeting of the code and the commutation properties between
1353different compiler passes. The latter study is easily enabled by our
1354approach where all back-end languages are instances of the same parameterized
1355language.
1356
In the case of a significant delay in the certification of some
components, we will address first the passes that are more likely to have
undetected bugs, and we will follow a top-down approach, axiomatizing
the properties of the data structures used in the compiler to focus more
on the algorithms. The rationale is that data structures are easier than
algorithms to test using well-known methodologies.
The effort table clearly shows that common definitions
and data structures account for a quarter of the code size and require slightly
less than a third of the total effort. At least half of this effort really goes
into simple data structures (vectors, bounded and unbounded integers, tries
and maps) whose certification is not interesting and whose code could be
taken, without re-proving it, from the library of any other theorem prover.
1369
1370\section{Conclusions}
The overall exercise, whose details have been only minimally reported here,
has been very useful. It has allowed us to spot early some critical issues
in the proof that required major changes to the proof plan. It has also
shown that the last passes of the compilation (e.g. assembly) and cost
prediction on the object code are much more involved than the higher level
passes.
1377
The final estimation of the effort admittedly comes with a low degree of
confidence. It is however sufficient to conclude that the effort required
is in line with the man power that was scheduled for the second half of the
second period and for the third period. Compared to the number of man months
declared in Annex I of the contract, we will need more man months. However,
both at UNIBO and UEDIN there have been major differences in hiring with
respect to the Annex. Therefore both sites now have a higher number of man
months available, with the trade-off of a lower level of maturity of the
people employed.
1387
1388The reviewers suggested that we use this estimation to compare two possible
1389scenarios: a) proceed as planned, porting all the CompCert proofs to Matita
1390or b) port D3.1 and D4.1 to Coq and re-use the CompCert proofs.
We remark here again that the back-ends of the two compilers, from the
memory model onwards, are substantially different: we are not re-proving correctness
of the same piece of code. Moreover, the proof techniques are different for
the front-end too. Switching to the CompCert formalization would imply
abandoning the untrusted compiler, abandoning the experiment with
a different proof technique, abandoning the quest for an open source
proof, and abandoning the co-development of the formalization and the
Matita proof assistant. In the Commitment Letter~\cite{letter} delivered
to the Officer in May we clarified our personal perspective on the project
goals and objectives. We do not repeat here the point of view presented
in the letter, which we can condense into ``we value diversity''.
1402
Clearly, had the exercise suggested that concluding the formalization, or
getting close to that, was infeasible in terms of effort, we would have
abandoned our path and embraced the reviewers' suggestion. However, the
analysis we performed in autumn reassured us, and the further progress made
during the winter does not yet show any major delay with respect to the
proof schedule. We are thus planning to continue the certification according
to the more detailed proof plan that emerged from the exercise reported in
this manuscript.
1411
1412\begin{thebibliography}{2}
\bibitem{compcert} X. Leroy, ``A Formally Verified Compiler Back-end'',
Journal of Automated Reasoning 43(4):363--446, 2009.
1415
\bibitem{letter} The CerCo team, ``Commitment to the Consideration of Reviewers' Recommendations'', 16/05/2011.
1417\end{thebibliography}
1418
1419
1420\end{document}