source: Deliverables/D3.4/Report/report.tex @ 3168

Last change on this file since 3168 was 3168, checked in by campbell, 6 years ago

Add some text from Ilias.

File size: 37.8 KB
Line 
1\documentclass[11pt,epsf,a4wide]{article}
2\usepackage[mathletters]{ucs}
3\usepackage[utf8x]{inputenc}
4\usepackage{stmaryrd}
5\usepackage{listings}
6\usepackage{../../style/cerco}
7\newcommand{\ocaml}{OCaml}
8\newcommand{\clight}{Clight}
9\newcommand{\matita}{Matita}
10\newcommand{\sdcc}{\texttt{sdcc}}
11
12\newcommand{\textSigma}{\ensuremath{\Sigma}}
13
14% LaTeX Companion, p 74
15\newcommand{\todo}[1]{\marginpar{\raggedright - #1}}
16
17\lstdefinelanguage{coq}
18  {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
19   morekeywords={[2]if,then,else},
20  }
21
22\lstdefinelanguage{matita}
23  {keywords={definition,lemma,theorem,remark,inductive,record,qed,let,rec,match,with,Type,and,on},
24   morekeywords={[2]whd,normalize,elim,cases,destruct},
25   mathescape=true,
26   morecomment=[n]{(*}{*)},
27  }
28
29\lstset{language=matita,basicstyle=\small\tt,columns=flexible,breaklines=false,
30        keywordstyle=\color{red}\bfseries,
31        keywordstyle=[2]\color{blue},
32        commentstyle=\color{green},
33        stringstyle=\color{blue},
34        showspaces=false,showstringspaces=false}
35
36\lstset{extendedchars=false}
37\lstset{inputencoding=utf8x}
38\DeclareUnicodeCharacter{8797}{:=}
39\DeclareUnicodeCharacter{10746}{++}
40\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
41\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}
42
43
44\title{
45INFORMATION AND COMMUNICATION TECHNOLOGIES\\
46(ICT)\\
47PROGRAMME\\
48\vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}}
49
50\date{ }
51\author{}
52
53\begin{document}
54\thispagestyle{empty}
55
56\vspace*{-1cm}
57\begin{center}
58\includegraphics[width=0.6\textwidth]{../../style/cerco_logo.png}
59\end{center}
60
61\begin{minipage}{\textwidth}
62\maketitle
63\end{minipage}
64
65
66\vspace*{0.5cm}
67\begin{center}
68\begin{LARGE}
69\bf
70Report n. D3.4\\
71Front-end Correctness Proofs\\
72\end{LARGE} 
73\end{center}
74
75\vspace*{2cm}
76\begin{center}
77\begin{large}
78Version 1.0
79\end{large}
80\end{center}
81
82\vspace*{0.5cm}
83\begin{center}
84\begin{large}
85Authors:\\
86Brian~Campbell, Ilias~Garnier, James~McKinna, Ian~Stark
87\end{large}
88\end{center}
89
90\vspace*{\fill}
91\noindent
92Project Acronym: \cerco{}\\
93Project full title: Certified Complexity\\
94Proposal/Contract no.: FP7-ICT-2009-C-243881 \cerco{}\\
95
96\clearpage \pagestyle{myheadings} \markright{\cerco{}, FP7-ICT-2009-C-243881}
97
98\newpage
99
100\vspace*{7cm}
101\paragraph{Abstract}
102We report on the correctness proofs for the front-end of the \cerco{} cost
103lifting compiler, considering three distinct parts of the task: showing that
104the \emph{annotated source code} output by the compiler has equivalent
105behaviour to the original input (up to the annotations); showing that a
106\emph{measurable} subtrace of the annotated source code corresponds to an
107equivalent measurable subtrace in the code produced by the front-end, including
108costs; and finally showing that the enriched \emph{structured} execution traces
109required for cost correctness in the back-end can be constructed from the
110properties of the code produced by the front-end.
111
112A key part of our work is that the intensional correctness results that show
113that we get consistent cost measurements throughout the intermediate languages
114of the compiler can be layered on top of normal forward simulation results,
115if we split them into local call-structure preserving results.
116
117This deliverable shows correctness results about the formalised compiler
118described in D3.2, using the source language semantics from D3.1 and
119intermediate language semantics from D3.3.  Together with the companion
120deliverable about the correctness of the back-end, D4.4, we obtain results
121about the whole formalised compiler.
122
123% TODO: mention the deliverable about the extracted compiler et al?
124
125\newpage 
126
127\tableofcontents
128
129% CHECK: clear up any -ize vs -ise
130% CHECK: clear up any "front end" vs "front-end"
131% CHECK: clear up any mentions of languages that aren't textsf'd.
132% CHECK: fix unicode in listings
133
134\section{Introduction}
135
136\todo{add stack space for more precise statement.  Also do some
137translation validation on sound, precise labelling properties.}
138
139The \cerco{} compiler compiles C code targeting microcontrollers implementing
140the Intel 8051 architecture, which produces both the object code and source
141code containing annotations describing the timing behavior of the object code.
142There are two versions: first, an initial prototype implemented in
143\ocaml{}~\cite{d2.2}, and a version formalised in the \matita{} proof
144assistant~\cite{d3.2,d4.2} and then extracted to \ocaml{} code to produce an
145executable compiler.  In this document we present results formalised in
146\matita{} about the front-end of that version of the compiler, and how that fits
147into the verification of the whole compiler.
148
149\todo{maybe mention stack space here?  other additions?  refer to "layering"?}
150A key part of this work was to separate the proofs about the compiled code's
151extensional behaviour (that is, the functional correctness of the compiler)
152from the intensional correctness that the costs given are correct.
153Unfortunately, the ambitious goal of completely verifying the compiler was not
154feasible within the time available, but thanks to this separation of extensional
155and intensional proofs we are able to axiomatize simulation results similar to
156those in other compiler verification projects and concentrate on the novel
157intensional proofs.  The proofs were also made more tractable by introducing
158compile-time checks for the `sound and precise' cost labelling properties
159rather than proving that they are preserved throughout.
160
161The overall statement of correctness says that the annotated program has the
162same behaviour as the input, and that for any suitably well-structured part of
163the execution (which we call \emph{measurable}), the object code will execute
164the same behaviour taking precisely the time given by the cost annotations in
165the annotated source program.
166
167In the next section we recall the structure of the compiler and make the overall
168statement more precise.  Following that, in Section~\ref{sec:fegoals} we
169describe the statements we need to prove about the intermediate \textsf{RTLabs}
170programs sufficient for the back-end proofs.  \todo{rest of document structure}
171
172\section{The compiler and main goals}
173
174TODO: outline compiler, maybe figure from talk, maybe something like the figure
175below.
176
177TODO: might want a version of this figure
178\begin{figure}
179\begin{center}
180\begin{minipage}{.8\linewidth}
181\begin{tabbing}
182\quad \= $\downarrow$ \quad \= \kill
183\textsf{C} (unformalized)\\
184\> $\downarrow$ \> CIL parser (unformalized \ocaml)\\
185\textsf{Clight}\\
186%\> $\downarrow$ \> add runtime functions\\
187\> $\downarrow$ \> \lstinline[language=C]'switch' removal\\
188\> $\downarrow$ \> labelling\\
189\> $\downarrow$ \> cast removal\\
190\> $\downarrow$ \> stack variable allocation and control structure
191 simplification\\
192\textsf{Cminor}\\
193%\> $\downarrow$ \> generate global variable initialization code\\
194\> $\downarrow$ \> transform to RTL graph\\
195\textsf{RTLabs}\\
196\> $\downarrow$ \> check cost labelled properties of RTL graph\\
197\> $\downarrow$ \> start of target specific back-end\\
198\>\quad \vdots
199\end{tabbing}
200\end{minipage}
201\end{center}
202\caption{Front-end languages and compiler passes}
203\label{fig:summary}
204\end{figure}
205
206The compiler function returns the following record on success:
207\begin{lstlisting}[language=matita]
208record compiler_output : Type[0] :=
209 { c_labelled_object_code: labelled_object_code
210 ; c_stack_cost: stack_cost_model
211 ; c_max_stack: nat
212 ; c_init_costlabel: costlabel
213 ; c_labelled_clight: clight_program
214 ; c_clight_cost_map: clight_cost_map
215 }.
216\end{lstlisting}
217It consists of annotated 8051 object code, a mapping from function
218identifiers to the function's stack space usage\footnote{The compiled
219  code's only stack usage is to allocate a fixed-size frame on each
220  function entry and discard it on exit.}, a cost label covering the
221initialisation of global variables and calling the
222\lstinline[language=C]'main' function, the annotated source code, and
223finally a mapping from cost labels to actual execution time costs.
224
225An \ocaml{} pretty printer is used to provide a concrete version of the output
226code.  In the case of the annotated source code, it also inserts the actual
227costs alongside the cost labels, and optionally adds a global cost variable
228and instrumentation to support further reasoning.  \todo{Cross-ref case study
229deliverables}
230
231\subsection{Revisions to the prototype compiler}
232
233TODO: could be a good idea to have this again; stack space,
234initialisation, cost checks, had we dropped cminor loops in previous
235writing?, check mailing list in case I've forgotten something
236
237TODO: continued use of dep types to reduce partiality
238
239\subsection{Main goals}
240
241TODO: need an example for this
242
243Informally, our main intensional result links the time difference in a source
244code execution to the time difference in the object code, expressing the time
245for the source by summing the values for the cost labels in the trace, and the
246time for the target by a clock built in to the 8051 executable semantics.
247
248The availability of precise timing information for 8501
249implementations and the design of the compiler allow it to give exact
250time costs in terms of processor cycles.  However, these exact results
251are only available if the subtrace we measure starts and ends at
252suitable points.  In particular, pure computation with no observable
253effects may be reordered and moved past cost labels, so we cannot
254measure time between arbitrary statements in the program.
255
256There is also a constraint on the subtraces that we
257measure due to the requirements of the correctness proof for the
258object code timing analysis.  To be sure that the timings are assigned
259to the correct cost label, we need to know that each return from a
260function call must go to the correct return address.  It is difficult
261to observe this property locally in the object code because it relies
262on much earlier stages in the compiler.  To convey this information to
263the timing analysis extra structure is imposed on the subtraces, which
264we will give more details on in Section~\ref{sec:fegoals}.
265
266% Regarding the footnote, would there even be much point?
267These restrictions are reflected in the subtraces that we give timing
268guarantees on; they must start at a cost label and end at the return
269of the enclosing function of the cost label\footnote{We expect that
270  this would generalise to subtraces between cost labels in the same
271  function, but could not justify the extra complexity that would be
272  required to show this.}.  A typical example of such a subtrace is
273the execution of an entire function from the cost label at the start
274of the function until it returns.  We call such any such subtrace
275\emph{measurable} if it (and the prefix of the trace before it) can
276also be executed within the available stack space.
277
278Now we can give the main intensional statement for the compiler.
279Given a \emph{measurable} subtrace for a labelled \textsf{Clight}
280program, there is a subtrace of the 8051 object code program where the
281time differences match.  Moreover, \emph{observable} parts of the
282trace also match --- these are the appearance of cost labels and
283function calls and returns.
284
285More formally, the definition of this statement in \matita{} is
286\begin{lstlisting}[language=matita]
287definition simulates :=
288  $\lambda$p: compiler_output.
289  let initial_status := initialise_status $\dots$ (cm (c_labelled_object_code $\dots$ p)) in
290  $\forall$m1,m2.
291   measurable Clight_pcs (c_labelled_clight $\dots$ p) m1 m2
292    (lookup_stack_cost (c_stack_cost $\dots$ p)) (c_max_stack $\dots$ p) $\rightarrow$
293  $\forall$c1,c2.
294   clock_after Clight_pcs (c_labelled_clight $\dots$ p) m1 (c_clight_cost_map $\dots$ p) = OK $\dots$ c1 $\rightarrow$
295   clock_after Clight_pcs (c_labelled_clight $\dots$ p) (m1+m2) (c_clight_cost_map $\dots$ p) = OK $\dots$ c2 $\rightarrow$
296  $\exists$n1,n2.
297   observables Clight_pcs (c_labelled_clight $\dots$ p) m1 m2 =
298   observables (OC_preclassified_system (c_labelled_object_code $\dots$ p))
299    (c_labelled_object_code $\dots$ p) n1 n2
300  $\wedge$
301   c2 - c1 = clock $\dots$ (execute n2 ? initial_status) - clock $\dots$ (execute n1 ? initial_status).
302\end{lstlisting}
303where the \lstinline'measurable', \lstinline'clock_after' and
304\lstinline'observables' definitions can be applied to multiple
305languages; in this case the \lstinline'Clight_pcs' record applies them
306to \textsf{Clight} programs.
307
308There is a second part to the statement, which says that the initial
309processing of the input program to produce the cost labelled version
310does not affect the semantics of the program:
311% Yes, I'm paraphrasing the result a tiny bit to remove the observe non-function
312\begin{lstlisting}[language=matita]
313  $\forall$input_program,output.
314  compile input_program = return output $\rightarrow$
315  not_wrong … (exec_inf … clight_fullexec input_program) $\rightarrow$
316  sim_with_labels
317   (exec_inf … clight_fullexec input_program)
318   (exec_inf … clight_fullexec (c_labelled_clight … output))
319\end{lstlisting}
320That is, any successful compilation produces a labelled program that
321has identical behaviour to the original, so long as there is no
322`undefined behaviour'.
323
324Note that this provides full functional correctness, including
325preservation of (non-)termination.  The intensional result above does
326not do this directly --- it does not guarantee the same result or same
327termination.  There are two mitigating factors, however: first, to
328prove the intensional property you need local simulation results that
329can be pieced together to form full behavioural equivalence, only time
330constraints have prevented us from doing so.  Second, if we wish to
331confirm a result, termination, or non-termination we could add an
332observable witness, such as a function that is only called if the
333correct result is given.  The intensional result guarantees that the
334observable witness is preserved, so the program must behave correctly.
335
336\section{Intermediate goals for the front-end}
337\label{sec:fegoals}
338
339The essential parts of the intensional proof were outlined during work
340on a toy
341compiler~\cite{d2.1,springerlink:10.1007/978-3-642-32469-7_3}.  These
342are
343\begin{enumerate}
344\item functional correctness, in particular preserving the trace of
345  cost labels,
346\item the \emph{soundness} and \emph{precision} of the cost labelling
347  on the object code, and
348\item the timing analysis on the object code produces a correct
349  mapping from cost labels to time.
350\end{enumerate}
351
352However, that toy development did not include function calls.  For the
353full \cerco{} compiler we also need to maintain the invariant that
354functions return to the correct program location in the caller, as we
355mentioned in the previous section.  During work on the back-end timing
356analysis (describe in more detail in the companion deliverable, D4.4)
357the notion of a \emph{structured trace} was developed to enforce this
358return property, and also most of the cost labelling properties too.
359
360\begin{figure}
361\begin{center}
362\includegraphics[width=0.5\linewidth]{compiler.pdf}
363\end{center}
364\caption{The compiler and proof outline}
365\label{fig:compiler}
366\end{figure}
367
368Jointly, we generalised the structured traces to apply to any of the
369intermediate languages with some idea of program counter.  This means
370that they are introduced part way through the compiler, see
371Figure~\ref{fig:compiler}.  Proving that a structured trace can be
372constructed at \textsf{RTLabs} has several virtues:
373\begin{itemize}
374\item This is the first language where every operation has its own
375  unique, easily addressable, statement.
376\item Function calls and returns are still handled in the language and
377  so the structural properties are ensured by the semantics.
378\item Many of the back-end languages from \textsf{RTL} share a common
379  core set of definitions, and using structured traces throughout
380  increases this uniformity.
381\end{itemize}
382
383\begin{figure}
384\begin{center}
385\includegraphics[width=0.6\linewidth]{strtraces.pdf}
386\end{center}
387\caption{Nesting of functions in structured traces}
388\label{fig:strtrace}
389\end{figure}
390A structured trace is a mutually inductive data type which principally
391contains the steps from a normal program trace, but arranged into a
392nested structure which groups entire function calls together and
393aggregates individual steps between cost labels (or between the final
394cost label and the return from the function), see
395Figure~\ref{fig:strtrace}.  This capture the idea that the cost labels
396only represent costs \emph{within} a function --- calls to other
397functions are accounted for in the trace for their execution, and we
398can locally regard function calls as a single step.
399
400These structured traces form the core part of the intermediate results
401that we must prove so that the back-end can complete the main
402intensional result stated above.  In full, we provide the back-end
403with
404\begin{enumerate}
405\item A normal trace of the \textbf{prefix} of the program's execution
406  before reaching the measurable subtrace.  (This needs to be
407  preserved so that we know that the stack space consumed is correct.)
408\item The \textbf{structured trace} corresponding to the measurable
409  subtrace.
410\item An additional property about the structured trace that no
411  `program counter' is \textbf{repeated} between cost labels.  Together with
412  the structure in the trace, this takes over from showing that
413  cost labelling is sound and precise.
414\item A proof that the \textbf{observables} have been preserved.
415\item A proof that the \textbf{stack limit} is still observed by the prefix and
416  the structure trace.  (This is largely a consequence of the
417  preservation of observables.)
418\end{enumerate}
419
420Following the outline in Figure~\ref{fig:compiler}, we will first deal
421with the transformations in \textsf{Clight} that produce the source
422program with cost labels, then show that measurable traces can be
423lifted to \textsf{RTLabs}, and finally that we can construct the
424properties listed above ready for the back-end proofs.
425
426\section{Input code to cost labelled program}
427
428The simple form of labelling used in the formalised compiler is not
429quite capable of capturing costs arising from complex C
430\lstinline[language=C]'switch' statements, largely due to the
431fall-through behaviour.  Our first pass replaces these statements with
432simpler C code, allowing our second pass to perform the cost
433labelling.  We show that the behaviour of programs is unchanged by
434these passes.
435
436TODO: both give one-step-sim-by-many forward sim results; switch
437removal tricky, uses aux var to keep result of expr, not central to
438intensional correctness so curtailed proof effort once reasonable
439level of confidence in code gained; labelling much simpler; don't care
440what the labels are at this stage, just need to know when to go
441through extra steps.  Rolled up into a single result with a cofixpoint
442to obtain coinductive statement of equivalence (show).
443
444\section{Finding corresponding measurable subtraces}
445
446There follow the three main passes of the front-end:
447\begin{enumerate}
448\item simplification of casts in \textsf{Clight} code
449\item \textsf{Clight} to \textsf{Cminor} translation, performing stack
450  variable allocation and simplifying control structures
451\item transformation to \textsf{RTLabs} control flow graph
452\end{enumerate}
453\todo{I keep mentioning forward sim results - I probably ought to say
454  something about determinancy} We have taken a common approach to
455each pass: first we build (or axiomatise) forward simulation results
456that are similar to normal compiler proofs, but slightly more
457fine-grained so that we can see that the call structure and relative
458placement of cost labels is preserved.
459
460Then we instantiate a general result which shows that we can find a
461\emph{measurable} subtrace in the target of the pass that corresponds
462to the measurable subtract in the source.  By repeated application of
463this result we can find a measurable subtrace of the execution of the
464\textsf{RTLabs} code, suitable for the construction of a structured
465trace (see Section~\ref{sec:structuredtrace}.  This is essentially an
466extra layer on top of the simulation proofs that provides us with the
467extra information required for our intensional correctness proof.
468
469\subsection{Generic measurable subtrace lifting proof}
470
471Our generic proof is parametrised on a record containing small-step
472semantics for the source and target language, a classification of
473states (the same form of classification is used when defining
474structured traces), a simulation relation which loosely respects the
475classification and cost labelling \todo{this isn't very helpful} and
476four simulation results:
477\begin{enumerate}
478\item a step from a `normal' state (which is not classified as a call
479  or return) which is not a cost label is simulated by zero or more
480  `normal' steps;
481\item a step from a `call' state followed by a cost label step is
482  simulated by a step from a `call' state, a corresponding label step,
483  then zero or more `normal' steps;
484\item a step from a `call' state not followed by a cost label
485  similarly (note that this case cannot occur in a well-labelled
486  program, but we do not have enough information locally to exploit
487  this); and
488\item a cost label step is simulated by a cost label step.
489\end{enumerate}
490Finally, we need to know that a successfully translated program will
491have an initial state in the simulation relation with the original
492program's initial state.
493
494\begin{figure}
495\begin{center}
496\includegraphics[width=0.5\linewidth]{meassim.pdf}
497\end{center}
498\caption{Tiling of simulation for a measurable subtrace}
499\label{fig:tiling}
500\end{figure}
501
502To find the measurable subtrace in the target program's execution we
503walk along the original program's execution trace applying the
504appropriate simulation result by induction on the number of steps.
505While the number of steps taken varies, the overall structure is
506preserved, as illustrated in Figure~\ref{fig:tiling}.  By preserving
507the structure we also maintain the same intensional observables.  One
508delicate point is that the cost label following a call must remain
509directly afterwards\footnote{The prototype compiler allowed some
510  straight-line code to appear before the cost label until a later
511  stage of the compiler, but we must move the requirement forward to
512  fit with the structured traces.}
513% Damn it, I should have just moved the cost label forwards in RTLabs,
514% like the prototype does in RTL to ERTL; the result would have been
515% simpler.  Or was there some reason not to do that?
516(both in the program code and in the execution trace), even if we
517introduce extra steps, for example to store parameters in memory in
518\textsf{Cminor}.  Thus we have a version of the call simulation
519that deals with both in one result.
520
521In addition to the subtrace we are interested in measuring we must
522also prove that the earlier part of the trace is also preserved in
523order to use the simulation from the initial state.  It also
524guarantees that we do not run out of stack space before the subtrace
525we are interested in.  The lemmas for this prefix and the measurable
526subtrace are similar, following the pattern above.  However, the
527measurable subtrace also requires us to rebuild the termination
528proof.  This has a recursive form:
529\begin{lstlisting}[language=matita]
530let rec will_return_aux C (depth:nat)
531                             (trace:list (cs_state … C × trace)) on trace : bool :=
532match trace with
533[ nil $\Rightarrow$ false
534| cons h tl $\Rightarrow$
535  let $\langle$s,tr$\rangle$ := h in
536  match cs_classify C s with
537  [ cl_call $\Rightarrow$ will_return_aux C (S depth) tl
538  | cl_return $\Rightarrow$
539      match depth with
540      [ O $\Rightarrow$ match tl with [ nil $\Rightarrow$ true | _ $\Rightarrow$ false ]
541      | S d $\Rightarrow$ will_return_aux C d tl
542      ]
543  | _ $\Rightarrow$ will_return_aux C depth tl
544  ]
545].
546\end{lstlisting}
547The \lstinline'depth' is the number of return states we need to see
548before we have returned to the original function (initially zero) and
549\lstinline'trace' the measurable subtrace obtained from the running
550the semantics for the correct number of steps.  This definition
551unfolds tail recursively for each step, and once the corresponding
552simulation result has been applied a new one for the target can be
553asserted by unfolding and applying the induction hypothesis on the
554shorter trace.
555
556This then gives us an overall result for any simulation fitting the
557requirements above (contained in the \lstinline'meas_sim' record):
558\begin{lstlisting}[language=matita]
559theorem measured_subtrace_preserved :
560  $\forall$MS:meas_sim.
561  $\forall$p1,p2,m,n,stack_cost,max.
562  ms_compiled MS p1 p2 $\rightarrow$
563  measurable (ms_C1 MS) p1 m n stack_cost max $\rightarrow$
564  $\exists$m',n'.
565    measurable (ms_C2 MS) p2 m' n' stack_cost max $\wedge$
566    observables (ms_C1 MS) p1 m n = observables (ms_C2 MS) p2 m' n'.
567\end{lstlisting}
568The stack space requirement that is embedded in \lstinline'measurable'
569is a consequence of the preservation of observables.
570
571TODO: how to deal with untidy edges wrt to sim rel; anything to
572say about obs?
573
574TODO: say something about termination measures; cost labels are
575statements/exprs in these languages; split call/return gives simpler
576simulations
577
578\subsection{Simulation results for each pass}
579
580\todo{don't use loop structures from CompCert, go straight to jumps}
581
582\paragraph{Cast simplification}
583
584The parser used in Cerco introduces a lot of explicit type casts. If left
585as they are, these casts can greatly hamper the quality of the generated
586code -- especially as the architecture we consider is an $8$ bit one.
587In \textsf{Clight}, casts are expressions. Hence, most of the work of
588this transformation proceeds on expressions, and the one-step simulation
589result one those is easily lifted to statement evaluations.
590
591\todo{introduce the explicit type signature of the functions below ?}
592
593The tranformation proceeds by recursively trying to coerce an expression to
594a particular type, which is in practice a integer type smaller than the
595original one. This functionality is implemented by two mutually recursive
596functions: \textbf{simplify\_expr} and \textbf{simplify\_inside}. The latter
597acts as a wrapper for the first one. Whenever \textbf{simplify\_inside} 
598encounters a \textbf{Ecast} expression, it tries to coerce the sub-expression
599to the desired type using \textbf{simplify\_expr}, which tries to perform the
600actual coercion. In return, \textbf{simplify\_expr} calls back
601\textbf{simplify\_inside} in some particular positions, where we decided to
602be conservative in order to simplify the proofs. However, the current design
603allows to incrementally revert to a more aggressive version, by replacing
604recursive calls to \textbf{simplify\_inside} by calls to \textbf{simplify\_expr} 
605\emph{and} proving the corresponding invariants -- where possible.
606
607The main problem encountered with this particular pass was dealing with inconsistently
608typed programs, a canonical case being a particular integer constant of a certain
609size typed with another size. This prompted the need to introduce numerous
610type checks, complexifying both the implementation and the proof.
611
612\paragraph{Switch removal}
613
614Our intermediate languages have no jump tables. Hence, we have to compile the
615\lstinline[language=C]'switch' constructs away before going from \textsf{Clight} to
616\textsf{Cminor}. Note that this is not necessarily a bad decision performance-wise
617for small switches. For instance, compilers such as GCC introduce balanced trees
618of ``if-then-else'' constructs. However, our implementation strategy is much simpler.
619Let us consider the following input statement.
620
621\begin{lstlisting}[language=C]
622   switch(e) {
623   case v1:
624     stmt1;
625   case v2:
626     stmt2;
627   default:
628     stmt_default;
629   }
630\end{lstlisting}
631
632Note that stmt1, stmt2, \ldots stmt\_default may contain "break" statements,
633which have the effect of exiting the switch statement. In the absence of break,
634the execution falls through each case sequentially. In our current implementation,
635we produce an equivalent sequence of ``if-then'' chained by gotos:
636\begin{lstlisting}[language=C]
637   fresh = e;
638   if(fresh == v1) {
639     $\llbracket$stmt1$\rrbracket$;
640     goto lbl_case2;
641   };
642   if(fresh == v2) {
643     lbl_case2:
644     $\llbracket$stmt2;$\rrbracket$
645     goto lbl_case2;
646   };
647   $\llbracket$stmt_default$\rrbracket$;
648   exit_label:
649\end{lstlisting}
650
651The proof had to tackle the following points:
652\begin{itemize}
653\item the source and target memories are not the same (cf. fresh variable),
654\item the flow of control is changed in a non-local way (e.g. \textbf{goto} 
655  instead of \textbf{break}).
656\end{itemize}
657
658In order to tackle the first point, we implemented a version of memory
659extensions similar to Compcert's. What has been done is the simulation proof
660for all ``easy'' cases, that do not interact with the switch removal per se,
661plus a bit of the switch case. This comprises propagating the memory extension
662through  each statement (except switch), as well as various invariants that
663are needed for the switch case (in particular, freshness hypotheses). The
664details of the evaluation process for the source switch statement and its
665target counterpart can be found in the file \textbf{switchRemoval.ma},
666along more details on the transformation itself.
667
668Proving the correctness of the second point would require reasoning on the
669semantics of \lstinline[language=C]'goto' statements. In the \textsf{Clight} 
670semantics, this is implemented as a function-wide lookup of the target label.
671The invariant we would need is the fact that a global label lookup on a freshly
672created goto is equivalent to a local lookup. This would in turn require the
673propagation of some freshness hypotheses on labels. For reasons expressed above,
674we decided to omit this part of the correctness proof.
675
676\paragraph{Clight to Cminor}
677
678This pass is the last one operating on the Clight intermediate language.
679Its input is a full \textsf{Clight} program, and its output is a \textsf{Cminor} 
680program. Note that we do not use an equivalent of the C\#minor language: we
681translate directly to Cminor. This presents the advantage of not requiring the
682special loop constructs, nor the explicit block structure. Another salient
683point of our approach is that a significant part of the properties needed for
684the simulation proof were directly encoded in dependently typed translation
685functions. This concerns more precisely freshness conditions and well-typedness
686conditions. The main effects of the transformation from \textsf{Clight} to
687\textsf{Cminor} are listed below.
688
689\begin{itemize}
690\item Variables are classified as being either globals, stack-allocated
691  locals or potentially register-allocated locals. The value of register-allocated
692  local variables is moved out of the modelled memory and stored in a
693  dedicated environment.
694\item In Clight, each local variable has a dedicated memory block, whereas
695  stack-allocated locals are bundled together on a function-by-function basis.
696\item Loops are converted to jumps.
697\end{itemize}
698
699The first two points require memory injections which are more flexible that those
700needed in the switch removal case. In the remainder of this section, we briefly
701discuss our implementation of memory injections, and then the simulation proof.
702
703\subparagraph{Memory injections}
704
705Our memory injections are modelled after the work of Blazy \& Leroy.
706However, the corresponding paper is based on the first version of the
707Compcert memory model, whereas we use a much more concrete model, allowing byte-level
708manipulations (as in the later version of Compcert's memory model). We proved
709roughly 80 \% of the required lemmas. Some difficulties encountered were notably
710due to some too relaxed conditions on pointer validity (problem fixed during development).
711Some more conditions had to be added to take care of possible overflows when converting
712from \textbf{Z} block bounds to $16$ bit pointer offsets (in pratice, these overflows
713only occur in edge cases that are easily ruled out -- but this fact is not visible
714in memory injections). Concretely, some of the lemmas on the preservation of simulation of
715loads after writes were axiomatised, for lack of time.
716
717\subparagraph{Simulation proof}
718
719The correction proof for this transformation was not terminated. We proved the
720simulation result for expressions and for some subset of the critical statement cases.
721Notably lacking are the function entry and exit, where the memory injection is
722properly set up. As can be guessed, a significant amount of work has to be performed
723to show the conservation of all invariants at each simulation step.
724
725\todo{list cases, explain while loop, explain labeling problem}
726
727\section{Checking cost labelling properties}
728
729Ideally, we would provide proofs that the cost labelling pass always
730produced programs that are soundly and precisely labelled and that
731each subsequent pass preserves these properties.  This would match our
732use of dependent types to eliminate impossible sources of errors
733during compilation, in particular retaining intermediate language type
734information.
735
736However, given the limited amount of time available we realised that
737implementing a compile-time check for a sound and precise labelling of
738the \textsf{RTLabs} intermediate code would reduce the proof burden
739considerably.  This is similar in spirit to the use of translation
740validation in certified compilation\todo{Cite some suitable work
741  here}, which makes a similar trade-off between the potential for
742compile-time failure and the volume of proof required.
743
744The check cannot be pushed into a later stage of the compiler because
745much of the information is embedded into the structured traces.
746However, if an alternative method was used to show that function
747returns in the compiled code are sufficiently well-behaved, then we
748could consider pushing the cost property checks into the timing
749analysis itself.  We leave this as a possible area for future work.
750
751\subsection{Implementation and correctness}
752
753For a cost labelling to be sound and precise we need a cost label at
754the start of each function, after each branch and at least once in
755every loop.  The first two parts are trivial to check by examining the
756code.  In \textsf{RTLabs} the last part is specified by saying
757that there is a bound on the number of successive instruction nodes in
758the CFG that you can follow before you encounter a cost label, and
759checking this is more difficult.
760
761The implementation works through the set of nodes in the graph,
762following successors until a cost label is found or a label-free cycle
763is discovered (in which case the property does not hold and we stop).
764This is made easier by the prior knowledge that any branch is followed
765by cost labels, so we do not need to search each branch.  When a label
766is found, we remove the chain from the set and continue until it is
767empty, at which point we know that there is a bound for every node in
768the graph.
769
770Directly reasoning about the function that implements this would be
771rather awkward, so an inductive specification of a single step of its
772behaviour was written and proved to match the implementation.  This
773was then used to prove the implementation sound and complete.
774
775While we have not attempted to proof that the cost labelled properties
776are established and preserved earlier in the compiler, we expect that
777the effort for the \textsf{Cminor} to \textsf{RTLabs} would be similar
778to the work outlined above, because it involves the change from
779requiring a cost label at particular positions to requiring cost
780labels to break loops in the CFG.  As there are another three passes
781to consider (including the labelling itself), we believe that using
782the check above is much simpler overall.
783
784% TODO? Found some Clight to Cminor bugs quite quickly
785
786\section{Existence of a structured trace}
787\label{sec:structuredtrace}
788
789\emph{Structured traces} enrich the execution trace of a program by
790nesting function calls in a mixed-step style\todo{Can I find a
791  justification for mixed-step} and embedding the cost properties of
792the program.  It was originally designed to support the proof of
793correctness for the timing analysis of the object code in the
794back-end, then generalised to provide a common structure to use from
795the end of the front-end to the object code.  See
796Figure~\ref{fig:strtrace} on page~\pageref{fig:strtrace}
797for an illustration of a structured trace.
798
799To make the definition generic we abstract over the semantics of the
800language,
801\begin{lstlisting}[language=matita]
802record abstract_status : Type[1] :=
803  { as_status :> Type[0]
804  ; as_execute : relation as_status
805  ; as_pc : DeqSet
806  ; as_pc_of : as_status $\rightarrow$ as_pc
807  ; as_classify : as_status $\rightarrow$ status_class
808  ; as_label_of_pc : as_pc $\rightarrow$ option costlabel
809  ; as_after_return : ($\Sigma$s:as_status. as_classify s =  cl_call) $\rightarrow$ as_status $\rightarrow$ Prop
810  ; as_result: as_status $\rightarrow$ option int
811  ; as_call_ident : ($\Sigma$s:as_status.as_classify s = cl_call) $\rightarrow$ ident
812  ; as_tailcall_ident : ($\Sigma$s:as_status.as_classify s = cl_tailcall) $\rightarrow$ ident
813  }.
814\end{lstlisting}
815which gives a type of states, an execution relation, some notion of
816program counters with decidable equality, the classification of states
817and functions to extract the observable intensional information (cost
818labels and the identity of functions that are called).
819
820The structured traces are defined using three mutually inductive
821types.  The core data structure is \lstinline'trace_any_label', which
822captures some straight-line execution until the next cost label or
823function return.  Each function call is embedded as a single step,
824with its own trace nested inside, and states classified as a `jump'
825(in particular branches) must be followed by a cost label.
826
827The second type, \lstinline'trace_label_label', is a
828\lstinline'trace_any_label' where the initial state is cost labelled.
829Thus a trace in this type identifies a series of steps whose cost is
830entirely accounted for by the label at the start.
831
832Finally, \lstinline'trace_label_return' is a sequence of
833\lstinline'trace_label_label' values which end in the return from the
834function.  These correspond to a measurable subtrace, and in
835particular include executions of an entire function call.
836
837TODO: design, basic structure from termination proof, how cost
838labelling props are baked in; unrepeating PCs, remainder of sound
839labellings; coinductive version for whole programs, reason/relevance,
840use of em (maybe a general comment about uses of classical reasoning
841in development)
842
843\section{Conclusion}
844
845TODO
846
847TODO: appendix on code layout?
848
849\bibliographystyle{plain}
850\bibliography{report}
851
852\end{document}
Note: See TracBrowser for help on using the repository browser.