% \usepackage{amsmath}
% NB: might be worth removing this if changing class in favour of
% a built-in definition.
  {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
   morekeywords={[2]if,then,else,forall,Prop,match,with,end,let},
  }
  morekeywords={int8_t}
        keywordstyle=\color{red}\bfseries,
        keywordstyle=[2]\color{blue},
        commentstyle=\color{green},
        stringstyle=\color{blue},
        showspaces=false,showstringspaces=false,
        xleftmargin=1em}
\pgfutil@ifundefined{pgf@arrow@code@implies}{% supply the double arrow special arrow tip if it is not there%
  \pgfarrowsdeclare{implies}{implies}%
  {%
  \pgfarrowsleftextend{2.2pt}%
  \pgfarrowsrightextend{2.2pt}%
  }%
  {%
    \pgfsetdash{}{0pt} % do not dash%
    \pgfsetlinewidth{.33pt}%
    \pgfsetroundjoin   % fix join%
    \pgfsetroundcap    % fix cap%
    \pgfpathmoveto{\pgfpoint{-1.5pt}{2.5pt}}%
    \pgfpathcurveto{\pgfpoint{-.75pt}{1.5pt}}{\pgfpoint{0.5pt}{.5pt}}{\pgfpoint{2pt}{0pt}}%
    \pgfpathcurveto{\pgfpoint{0.5pt}{-.5pt}}{\pgfpoint{-.75pt}{-1.5pt}}{\pgfpoint{-1.5pt}{-2.5pt}}%
    \pgfusepathqstroke%
  }%
\tikzset{state/.style={inner sep = 0, outer sep = 2pt, draw, fill},
         every node/.style={inner sep=2pt},
         every on chain/.style = {inner sep = 0, outer sep = 2pt},
         join all/.style = {every on chain/.append style={join}},
         on/.style={on chain={#1}, state},
         m/.style={execute at begin node=$, execute at end node=$},
         node distance=3mm,
         is other/.style={circle, minimum size = 3pt, state},
         other/.style={on, is other},
         is jump/.style={diamond, minimum size = 6pt, state},
         jump/.style={on, is jump},
         is call/.style={regular polygon, regular polygon sides=3, minimum size=5pt, state},
         call/.style={on=going below, is call, node distance=6mm, label=above left:$#1$},
         is ret/.style={regular polygon, regular polygon sides=3, minimum size=5pt, shape border rotate=180, state},
         ret/.style={on=going above, is ret, node distance=6mm},
         chain/.style={start chain=#1 going left},
         rev ar/.style={stealth-, thick},
         ar/.style={-stealth, thick},
         every join/.style={rev ar},
         labelled/.style={fill=white, label=above:$#1$},
         vcenter/.style={baseline={([yshift=-.5ex]current bounding box)}},
         every picture/.style={thick},
         double equal sign distance/.prefix style={double distance=1.5pt}, %% if already defined (newest version of pgf) it should be ignored%}
         implies/.style={double, -implies, thin, double equal sign distance, shorten <=5pt, shorten >=5pt},
         new/.style={densely dashed},
         rel/.style={font=\scriptsize, fill=white, inner sep=2pt},
         diag/.style={row sep={11mm,between origins},
                      column sep={11mm,between origins},
                      every node/.style={draw, is other}},
         small vgap/.style={row sep={7mm,between origins}},
\def\L{\mathrel{\mathcal L}}
\def\S{\mathrel{\mathcal S}}
\def\R{\mathrel{\mathcal R}}
\def\C{\mathrel{\mathcal C}}
\savebox{\execbox}{\tikz[baseline=-.5ex]\draw [-stealth] (0,0) -- ++(1em, 0);}
\title{Certification of the Preservation of Structure by a Compiler's Back-end Pass\thanks{The project CerCo acknowledges the financial support of the Future and
Emerging Technologies (FET) programme within the Seventh Framework
Programme for Research of the European Commission, under FET-Open grant
number: 243881}}
\author{Paolo Tranquilli \and Claudio Sacerdoti Coen}
\institute{Department of Computer Science and Engineering, University of Bologna,\\\email{}, \email{}}
The labelling approach is a technique to lift cost models for non-functional
properties of programs from the object code to the source code. It is based
on the preservation of the structure of the high level program in every
intermediate language used by the compiler. Such structure is captured by
observables that are added to the semantics and that need to be preserved
by the forward simulation proof of correctness of the compiler. Additional
special observables are required for function calls. In this paper we
present a generic forward simulation proof that preserves all these observables.
The proof statement is based on a new mechanised semantics that traces the
structure of execution when the language is unstructured. The generic semantics
and simulation proof have been mechanised in the interactive theorem prover
Matita.
The \emph{labelling approach} has been introduced in~\cite{easylabelling} as
a technique to \emph{lift} cost models for non-functional properties of programs
from the object code to the source code. Examples of non-functional properties
are execution time, amount of stack/heap space consumed and energy required for
communication. The basic idea of the approach is that it is impossible to
provide a \emph{uniform} cost model for a high level language that is preserved
\emph{precisely} by a compiler. For instance, two instances of an assignment
$x = y$ in the source code can be compiled very differently according to the
place (registers vs stack) where $x$ and $y$ are stored at the moment of
execution. Therefore a precise cost model must assign a different cost
to every occurrence, and the exact cost can only be known after compilation.
According to the labelling approach, the compiler is free to compile and optimise
the source code without any major restriction, but it must keep track
of what happens to basic blocks during the compilation. The cost model is
then computed on the object code. It assigns a cost to every basic block.
Finally, the compiler propagates the cost model back to the source level,
assigning a cost to each basic block of the source code.
Implementing the labelling approach in a certified compiler
allows one to reason formally on the high level source code of a program to prove
non-functional properties that are guaranteed to be preserved by the compiler
itself. The trusted code base is then reduced to 1) the interactive theorem
prover (or its kernel) used in the certification of the compiler and
2) the software used to certify the property on the source language, which
can itself be certified, further reducing the trusted code base.
In~\cite{easylabelling} the authors provide an example of a simple
certified compiler that implements the labelling approach for the
imperative \texttt{While} language~\cite{while}, which has neither
pointers nor function calls.
The labelling approach has been shown to scale to more interesting scenarios.
In particular in~\cite{functionallabelling} it has been applied to a functional
language, and in~\cite{loopoptimizations} it has been shown that the approach
can be extended, at the price of some complication, to handle loop optimisations
and, more generally, program optimisations that do not preserve the structure
of basic blocks.
On-going work shows that the labelling approach is also compatible with
the complex analyses required to obtain a cost model for object code
on processors that implement advanced features like pipelining, superscalar
architectures and caches.
In the European Project CerCo (Certified Complexity~\footnote{\url{}})~\cite{cerco} we are certifying a labelling approach based compiler for a large subset of C to
8051 object code. The compiler is
moderately optimising and implements a compilation chain that is largely
inspired by that of CompCert~\cite{compcert1,compcert2}. Compared to the work done in~\cite{easylabelling}, the main novelty and source of difficulties is the presence
of function calls. Surprisingly, the addition of function calls requires a
revision of the proof technique given in~\cite{easylabelling}. In
particular, at the core of the labelling approach there is a forward
simulation proof that, in the case of \texttt{While}, is only minimally
more complex than the proof required for the preservation of the
functional properties alone. In the case of a programming language with
function calls, instead, it turns out that the forward simulation proof for
the back-end languages must guarantee a whole new set of invariants.
In this paper we present a formalisation in the Matita interactive theorem
prover~\cite{matita1,matita2} of a generic version of the simulation proof required for unstructured
languages. All back-end languages of the CerCo compiler are unstructured
languages, so the proof covers half of the correctness of the compiler.
The statement of the generic proof is based on a new semantics
for imperative unstructured languages that is based on \emph{structured
traces} and that restores the preservation of structure in the observables of
the semantics. The generic proof allows one to almost completely separate the
part of the simulation that deals with functional properties only from the
part that deals with the preservation of structure.
The plan of this paper is the following. In Section~\ref{labelling} we
sketch the labelling method and the problems that arise when it is applied
to languages with function calls. In Section~\ref{semantics} we introduce
a generic description of an unstructured imperative language and the
corresponding structured traces (the novel semantics). In
Section~\ref{simulation} we describe the forward simulation proof.
Conclusions and future work are in Section~\ref{conclusions}.
\section{The labelling approach}
We briefly sketch here a simplified version of the labelling approach as
introduced in~\cite{easylabelling}. The simplification strengthens the
sufficient conditions given in~\cite{easylabelling} to allow a simpler
explanation. The simplified conditions given here are also used in the
CerCo compiler to simplify the proof.
Let $\mathcal{P}$ be a programming language whose semantics is given in
terms of observables: a run of a program yields a finite or infinite
stream of observables. We also assume for the time being that function
calls are not available in $\mathcal{P}$. We want to associate a cost
model to a program $P$ written in $\mathcal{P}$. The first step is to
extend the syntax of $\mathcal{P}$ with a new construct $\texttt{emit L}$
where $L$ is a label distinct from all observables of $\mathcal{P}$.
The semantics of $\texttt{emit L}$ is the emission of the observable
\texttt{L}, which is meant to signal the beginning of a basic block.

There exists an automatic procedure that injects into the program $P$ an
$\texttt{emit L}$ at the beginning of each basic block, using a fresh
\texttt{L} for each block. In particular, the bodies of loops, both branches
of \texttt{if-then-else}s and the targets of \texttt{goto}s must all start
with an emission statement.
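As an illustration (not the CerCo implementation), the injection procedure can be sketched on a toy structured language; the constructors and the integer label representation are our own:

```ocaml
(* A toy structured language and a labelling pass that injects a fresh
   [Emit] at the start of every basic block: the whole program, both
   branches of an [If], and the body of a [While]. Illustrative only. *)
type stmt =
  | Skip
  | Assign of string * string
  | Seq    of stmt * stmt
  | If     of string * stmt * stmt
  | While  of string * stmt
  | Emit   of int                    (* the added [emit L] statement *)

let fresh = let n = ref 0 in fun () -> incr n; !n

let rec label = function
  | If (c, s1, s2) ->
      If (c, Seq (Emit (fresh ()), label s1),
             Seq (Emit (fresh ()), label s2))
  | While (c, s)   -> While (c, Seq (Emit (fresh ()), label s))
  | Seq (s1, s2)   -> Seq (label s1, label s2)
  | s              -> s

(* the program itself starts a basic block, hence an initial label *)
let label_program p = Seq (Emit (fresh ()), label p)
```

Each \texttt{Emit} carries a fresh label, so every basic block of the labelled program starts with a distinct emission statement.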
Let now $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
organised in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
language used by the compiler. We can easily extend every
intermediate language (and its semantics) with an $\texttt{emit L}$ statement
as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
the additional difficulty that the syntax of object code is given as a
sequence of bytes. The injection of an emission statement in the object code
can be done using a map that associates the statement with pairs of consecutive
code addresses. The intended semantics is that, if $(pc_1,pc_2) \mapsto \texttt{emit L}$, then the observable \texttt{L} is emitted after the execution of the
instruction stored at $pc_1$ and before the execution of the instruction
stored at $pc_2$. The two program counters are necessary because the
instruction stored at $pc_1$ can have multiple possible successors (e.g.
in case of a conditional branch or an indirect call). Dually, the instruction
stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
target of a jump).
The compiler, to be functionally correct, must preserve observational
equivalence, i.e. executing the program after each compiler pass should
yield the same stream of observables. After the injection of emission
statements, observables now capture both functional and non-functional
behaviour.
This correctness property is called in the literature a forward simulation
and is sufficient for correctness when the target language is deterministic.
We also require a stronger, non-functional preservation property: after each
pass all basic blocks must start with an emission statement, and all labels
\texttt{L} must be unique.
Now let $M$ be the object code obtained for the program $P$. Let us suppose
that we can statically inspect the code $M$ and associate to each basic block
a cost (e.g. the number of clock cycles required to execute all instructions
in the basic block, or an upper bound on that time). Every basic block is
labelled with a unique label \texttt{L}, thus we can actually associate the
cost to \texttt{L}. Let us call it $k(\texttt{L})$.

The function $k$ is defined as the cost model for the object code control
blocks. It can equally be used as the cost model for the source
control blocks. Indeed, if the semantics of $P$ is the stream
$L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$ because
every instruction belongs to a control block and every control block is
labelled. Thus it is correct to say that the execution cost of $P$ is also
$\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
the blocks of the high level program $P$ that is preserved by compilation.
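The cost computation just described amounts to folding $k$ over the emitted stream; a minimal sketch with a hypothetical cost function:

```ocaml
(* The execution cost of a (finite) run is the sum of k(L) over the
   stream of emitted labels. By forward simulation the same stream is
   observed at source and object level, so the total is the same. *)
let total_cost k labels =
  List.fold_left (fun acc l -> acc + k l) 0 labels
```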
How can the user profit from the high level cost model? Suppose, for instance,
that he wants to prove that the WCET of his program is bounded by $c$. It
is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
a purely functional property of the code. He can therefore use any technique
available to certify functional properties of the source code.
What is suggested in~\cite{easylabelling} is to actually instrument the
source code $P$ by replacing every label emission statement
$\texttt{emit L}$ with the instruction $\texttt{cost += k(L)}$ that increments
a global fresh variable \texttt{cost}. The bound is now proved by establishing
the program invariant $\texttt{cost} \leq c$, which can be done for example
using the Frama-C~\cite{framaC} suite if the source code is some variant of
C.
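The instrumentation step can be pictured as a map over instructions, replacing each emission by a cost increment (toy constructors of our own, not the CerCo instrumenter):

```ocaml
(* Instrumentation: every [emit L] becomes [cost += k L]; every other
   instruction is left untouched. Sketched on a flat instruction list. *)
type instr  = Emit of string | Instr of string
type instr' = CostIncr of int | Instr' of string

let instrument k =
  List.map (function
    | Emit l  -> CostIncr (k l)
    | Instr s -> Instr' s)
```

After this pass no emission statements remain: reasoning about the bound reduces to reasoning about the accumulated value of \texttt{cost}.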
\subsection{Labelling function calls}
We now want to extend the labelling approach to support function calls.
On the high level, \emph{structured} programming language $\mathcal{P}$ there
is not much to change.
When a function is invoked, the current basic block is temporarily exited
and the basic block the function starts with takes control. When the function
returns, the execution of the original basic block is resumed. Thus the only
significant change is that basic blocks can now be nested. Let \texttt{E}
be the label of the external block and \texttt{I} the label of a nested one.
Since the external block starts before the internal one, the semantics observed will be
\texttt{E I} and the cost associated to it on the source language will be
$k(\texttt{E}) + k(\texttt{I})$, i.e. the cost of executing all instructions
in the block \texttt{E} first plus the cost of executing all the instructions in
the block \texttt{I}. However, we know that some instructions in \texttt{E} are
executed after the last instruction in \texttt{I}. This is actually irrelevant
because we are here assuming that costs are additive, so that we can freely
permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non-trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
the function call terminates and yields control back to the basic block
\texttt{E}. If the call diverges, the instrumentation
$\texttt{cost += k(E)}$ executed at the beginning of \texttt{E} is still valid,
but just as an upper bound on the real execution cost: only precision is lost.
Let us now consider what happens when we move down the compilation chain to an
unstructured intermediate or final language. Here unstructured means that
the only control operators are conditional and unconditional jumps, function
calls and returns. Unlike a structured language, though, there is no guarantee
that a function will return control just after the function call point.
The semantics of the return statement, indeed, consists in fetching the
return address from some internal structure (typically the control stack) and
jumping directly to it. The code can freely manipulate the control stack to
make the procedure return to an arbitrary position. Indeed, it is also possible
to break the well nesting of function calls/returns.

Is it the case that the code produced by a correct compiler must respect the
additional property that every function returns just after its function call
point? The answer is negative and the property is not implied by forward
simulation proofs. For instance, imagine modifying a correct compiler pass
by systematically adding one to the return address on the stack and by
putting a \texttt{NOP} (or any other instruction that takes one byte) after
every function call. The obtained code will be functionally indistinguishable,
and the added instructions will all be dead code.
This lack of structure in the semantics badly interferes with the labelling
approach. The reason is the following: when a basic block labelled with
\texttt{E} contains a function call, it no longer makes any sense to associate
to the label \texttt{E} the sum of the costs of all the instructions in the block.
Indeed, there is no guarantee that the function will return into the block and
that the instructions executed after the return will be the ones
we are paying for in the cost model.
How can we make the labelling approach work in this scenario? We only see two
possible ways. The first one consists in injecting an emission statement after
every function call: basic blocks no longer contain function calls, but are now
terminated by them. This completely solves the problem and allows the compiler
to break the structure of function calls/returns at will. However, the
technique has several drawbacks. First of all, it greatly increases the number
of cost labels that are injected in the source code and that become
instrumentation statements. Thus, when reasoning on the source code to prove
non-functional properties, the user (or the automation tool) will have to handle
larger expressions. Second, the more labels are emitted, the more difficult it
becomes to implement powerful optimisations respecting the code structure.
Indeed, function calls are usually implemented in such a way that most registers
are preserved by the call, so that the static analysis of the block is not
interrupted by the call and an optimisation can involve both the code before
and after the function call. Third, instrumenting the source code may require
unpleasant modifications of it. Take, for example, the code
\texttt{f(g(x));}. We need to inject an emission statement/instrumentation
instruction just after the execution of \texttt{g}. The only way to do that
is to rewrite the code as \texttt{y = g(x); emit L; f(y);} for some fresh
variable \texttt{y}. It is pretty clear how in certain situations the obtained
code would be more obfuscated and thus more difficult to reason about manually.
For the previous reasons, in this paper and in the CerCo project we adopt a
different approach. We do not inject emission statements after every
function call. However, we want to propagate a strong additional invariant in
the forward simulation proof. The invariant is the propagation of the structure
of the original high level code, even if the target language is unstructured.
The structure we want to propagate, which will become clearer in the next
section, comprises 1) the property that every function should return just after
the function call point, which in turn implies well nesting of function calls;
2) the property that every basic block starts with a code emission statement.

In the original labelling approach of~\cite{easylabelling}, the second property
was guaranteed syntactically, as a property of the generated code.
In our revised approach, instead, we will impose the property on the runs:
it will be possible to generate code that does not respect the syntactic
property, as long as all possible runs respect it. For instance, dead code will no longer
be required to have all basic blocks correctly labelled. The switch is suggested
by the fact that the first of the two properties --- the one related to
function calls/returns --- can only be defined as a property of runs,
not of the static code. The switch is
beneficial to the proof because the original proof was made of two parts:
the forward simulation proof and the proof that the static property was granted.
In our revised approach the latter disappears and only the forward simulation
is kept.

In order to capture the structure in the semantics so that it is preserved
by a forward simulation argument, we need to make the structure observable
in the semantics. This is the topic of the next section.
\section{Structured traces}
The program semantics adopted in the traditional labelling approach is based
on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
a set of states $\mathcal{S}$, the semantics of one deterministic execution
step is
defined as a function $S \to S \times O^*$ where $O^*$ is a (finite) stream of
observables. The semantics is then lifted compositionally to multiple (finite
or infinite) execution steps.
Finally, the semantics of a whole program execution is obtained by forgetting
about the final state (if any), yielding a function $S \to O^*$ that given an
initial state returns the finite or infinite stream of observables in output.
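Concretely, the lifting of the one-step semantics can be sketched as follows (a bounded, terminating run; the step function is a stand-in, not the paper's semantics):

```ocaml
(* One deterministic step has type S -> S * O list; running the
   program concatenates the observables of each step and, at the end,
   forgets the final state, yielding only the stream of observables. *)
let rec observables step n s =
  if n = 0 then []
  else
    let s', obs = step s in
    obs @ observables step (n - 1) s'
```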
We present here a new definition of semantics where the structure of execution,
as defined in the previous section, is now observable. The idea is to replace
the stream of observables with a structured data type that makes
function calls and returns explicit and that guarantees some additional invariants by
construction. The data structure, called \emph{structured traces}, is
defined inductively for terminating programs and coinductively for diverging
ones. In the paper we focus only on the inductive structure, i.e. we assume
that all programs that are given a semantics are total. The Matita formalisation
also shows the coinductive definitions. The semantics of a program is then
defined as a function that maps an initial state into a structured trace.

In order to have a definition that works on multiple intermediate languages,
we abstract the type of structured traces over an abstract data type of
abstract statuses, which we aptly call $\texttt{abstract\_status}$. The fields
of this record are the following.
 \item \verb+S : Type[0]+, the type of states.
 \item \verb+as_execute : S $\to$ S $\to$ Prop+, a binary predicate stating
 an execution step. We write $s_1\exec s_2$ for $\verb+as_execute+~s_1~s_2$.
 \item \verb+as_classifier : S $\to$ classification+, a function tagging all
 states with a class in
 $\{\texttt{cl\_return,cl\_jump,cl\_call,cl\_other}\}$, depending on the instruction
 that is about to be executed (we omit tail-calls for simplicity). We will
 use $s \class c$ as a shorthand for both $\texttt{as\_classifier}~s=c$
 (if $c$ is a classification) and $\texttt{as\_classifier}~s\in c$
 (if $c$ is a set of classifications).
 \item \verb+as_label : S $\to$ option label+, telling whether the
 next instruction to be executed in $s$ is a cost emission statement,
 and if so returning the associated cost label. Our shorthand for this function
 will be $\ell$, and we will also abuse the notation by using $\ell~s$ as a
 predicate stating that $s$ is labelled.
 \item \verb+as_call_ident : ($\Sigma$s:S. s $\class$ cl_call) $\to$ label+,
 giving the identifier of the function which is being called in a
 \verb+cl_call+ state. We will use the shorthand $s\uparrow f$ for
 $\verb+as_call_ident+~s = f$.
 \item \verb+as_after_return : ($\Sigma$s:S. s $\class$ cl_call) $\to$ S $\to$ Prop+,
 which holds on the \verb+cl_call+ state $s_1$ and a state $s_2$ when the
 instruction to be executed in $s_2$ follows the function call to be
 executed in (the witness of the $\Sigma$-type) $s_1$. We will use the notation
 $s_1\ar s_2$ for this relation.
% \begin{alltt}
% record abstract_status := \{ S: Type[0];
%  as_execute: S \(\to\) S \(\to\) Prop;   as_classifier: S \(\to\) classification;
%  as_label: S \(\to\) option label;    as_called: (\(\Sigma\)s:S. c s = cl_call) \(\to\) label;
%  as_after_return: (\(\Sigma\)s:S. c s = cl_call) \(\to\) S \(\to\) Prop \}
% \end{alltt}
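For readers unfamiliar with Matita syntax, the record can be transcribed into OCaml, with \texttt{Prop}-valued fields rendered as boolean predicates and the $\Sigma$-types rendered as plain states with an implicit side condition (a sketch of ours, not the formalisation):

```ocaml
(* An OCaml rendering of abstract_status: Prop-valued fields become
   bool-valued predicates; the Sigma-typed arguments of as_call_ident
   and as_after_return become plain states, with the cl_call side
   condition left implicit. *)
type classification = Cl_return | Cl_jump | Cl_call | Cl_other

type 's abstract_status = {
  as_execute      : 's -> 's -> bool;      (* s1 ~> s2 *)
  as_classifier   : 's -> classification;
  as_label        : 's -> string option;   (* cost label, if any *)
  as_call_ident   : 's -> string;          (* meaningful on Cl_call *)
  as_after_return : 's -> 's -> bool;      (* s2 follows the call in s1 *)
}
```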
The inductive type for structured traces is actually made of three mutually
inductive types with the following semantics:
 \item $(\texttt{trace\_label\_return}~s_1~s_2)$ (shorthand $\verb+TLR+~s_1~s_2$)
   is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_1$ is a label emission
   statement and the one to be executed in the state before $s_2$ is a return
   statement. Thus $s_2$ is the state after the return. The trace
   may contain other label emission statements. It captures the structure of
   the execution of function bodies: they must start with a cost emission
   statement and must end with a return; they are obtained by concatenating
   one or more basic blocks, all starting with a label emission
   (e.g. in case of loops).
 \item $(\texttt{trace\_any\_label}~b~s_1~s_2)$ (shorthand $\verb+TAL+~b~s_1~s_2$)
   is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that, according to the boolean $b$, either the instruction to be
   executed in $s_2$ is a label emission statement, or the one to be executed
   in the state before $s_2$ is a return. It must not contain
   any other label emission statement. It captures the notion of a suffix of a
   basic block.
 \item $(\texttt{trace\_label\_label}~b~s_1~s_2)$ (shorthand $\verb+TLL+~b~s_1~s_2$) is the special case of
   $\texttt{TAL}~b~s_1~s_2$ such that the instruction to be
   executed in $s_1$ is a label emission statement. It captures the notion of
   a basic block.
 {\texttt{TLL}~true~s_1~s_2}
 {\texttt{TLR}~s_1~s_2}
 {\texttt{TLL}~false~s_1~s_2 \andalso
  \texttt{TLR}~s_2~s_3
 }
 {\texttt{TLR}~s_1~s_3}
 {\texttt{TAL}~b~s_1~s_2 \andalso
  \ell~s_1
 }
 {\texttt{TLL}~b~s_1~s_2}
 {s_1\exec s_2 \andalso
  s_1\class\{\verb+cl_jump+, \verb+cl_other+\}\andalso
  \ell~s_2
 }
 {\texttt{TAL}~false~s_1~s_2}
 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_return}
 }
 {\texttt{TAL}~true~s_1~s_2}
 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_call} \andalso
  s_1\ar s_3 \andalso
  \texttt{TLR}~s_2~s_3 \andalso
  \ell~s_3
 }
 {\texttt{TAL}~false~s_1~s_3}
 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_call} \andalso
  s_1\ar s_3 \andalso
  \texttt{TLR}~s_2~s_3 \andalso
  \texttt{TAL}~b~s_3~s_4
 }
 {\texttt{TAL}~b~s_1~s_4}
 {s_1\exec s_2 \andalso
  \lnot \ell~s_2 \andalso
  \texttt{TAL}~b~s_2~s_3\andalso
  s_1 \class \texttt{cl\_other}
 }
 {\texttt{TAL}~b~s_1~s_3}
inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
  | tlr_base:
      ∀status_before: S.
      ∀status_after: S.
        trace_label_label S ends_with_ret status_before status_after →
        trace_label_return S status_before status_after
  | tlr_step:
      ∀status_initial: S.
      ∀status_labelled: S.
      ∀status_final: S.
        trace_label_label S doesnt_end_with_ret status_initial status_labelled →
        trace_label_return S status_labelled status_final →
          trace_label_return S status_initial status_final
with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
  | tll_base:
      ∀ends_flag: trace_ends_with_ret.
      ∀start_status: S.
      ∀end_status: S.
        trace_any_label S ends_flag start_status end_status →
        as_costed S start_status →
          trace_label_label S ends_flag start_status end_status
with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
  (* Single steps within a function which reach a label.
     Note that this is the only case applicable for a jump. *)
  | tal_base_not_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        (as_classifier S start_status cl_jump ∨
         as_classifier S start_status cl_other) →
        as_costed S final_status →
          trace_any_label S doesnt_end_with_ret start_status final_status
  | tal_base_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        as_classifier S start_status cl_return →
          trace_any_label S ends_with_ret start_status final_status
  (* A call followed by a label on return. *)
  | tal_base_call:
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_final →
          trace_label_return S status_start_fun_call status_final →
          as_costed S status_final →
            trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
  (* A call followed by a non-empty trace. *)
  | tal_step_call:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_after_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_after_fun_call →
          trace_label_return S status_start_fun_call status_after_fun_call →
          ¬ as_costed S status_after_fun_call →
          trace_any_label S end_flag status_after_fun_call status_final →
            trace_any_label S end_flag status_pre_fun_call status_final
  | tal_step_default:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre: S.
      ∀status_init: S.
      ∀status_end: S.
        as_execute S status_pre status_init →
        trace_any_label S end_flag status_init status_end →
        as_classifier S status_pre cl_other →
        ¬ (as_costed S status_init) →
          trace_any_label S end_flag status_pre status_end.
A \texttt{trace\_label\_return} is isomorphic to a list of
\texttt{trace\_label\_label}s that ends with a cost emission followed by a
return-terminated \texttt{trace\_label\_label}.
The interesting cases are those of $\texttt{trace\_any\_label}~b~s_1~s_2$.
A \texttt{trace\_any\_label} is a sequence of steps built by a syntax-directed
definition on the classification of $s_1$. The constructors of the datatype
impose several invariants that are meant to impose structure on the
otherwise unstructured execution. In particular, the following invariants are
required:
\begin{itemize}
 \item the trace is never empty; it ends with a return iff $b$ is
       true
 \item a jump must always be the last instruction of the trace, and it must
       be followed by a cost emission statement; i.e. the target of a jump
       is always the beginning of a new basic block; as such it must start
       with a cost emission statement
 \item a cost emission statement can never occur inside the trace, only in
       the status immediately after
 \item the trace for a function call step is made of a subtrace for the
       function body of type
       $\texttt{trace\_label\_return}~s_1~s_2$, possibly followed by the
       rest of the trace for this basic block. The subtrace represents the
       function execution. Being an inductive datum, it guarantees the
       termination of the function call. The status $s_2$ is the one that
       follows the return statement. The next instruction of $s_2$ must be
       the one that follows the function call instruction. As a consequence,
       function calls are also well nested.
\end{itemize}

There are three mutually recursive structural functions, one for each of
\verb+TLR+, \verb+TLL+ and \verb+TAL+, for which we use the same notation
$|\,.\,|$: the \emph{flattening} of the traces. These functions
extract from a structured trace the list of emitted cost labels.
%  We only show here the type of one
% of them:
% \begin{alltt}
% flatten_trace_label_return:
%  \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
%   trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
% \end{alltt}
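To fix intuitions, the flattening functions can be sketched in ordinary
functional code. The following Rust rendering is illustrative only: it elides
the state indices and proof obligations of the Matita datatypes, and all type
and constructor names are our own ad-hoc simplifications.

```rust
// Illustrative only: a simplified rendering of the structured-trace
// datatypes (proofs and state indices elided) and of their flattening,
// i.e. the list of cost labels emitted along the trace.
enum Tal {                        // trace_any_label
    BaseNotReturn,                // tal_base_not_return
    BaseReturn,                   // tal_base_return
    BaseCall(Box<Tlr>),           // tal_base_call: a call and the callee body
    StepCall(Box<Tlr>, Box<Tal>), // tal_step_call: call, body, rest of block
    StepDefault(Box<Tal>),        // tal_step_default
}
struct Tll(String, Box<Tal>);     // trace_label_label: a label, then a TAL
enum Tlr {                        // trace_label_return
    Base(Tll),
    Step(Tll, Box<Tlr>),
}

fn flatten_tal(t: &Tal) -> Vec<String> {
    match t {
        Tal::BaseNotReturn | Tal::BaseReturn => vec![],
        Tal::BaseCall(tlr) => flatten_tlr(tlr),
        Tal::StepCall(tlr, tal) => {
            let mut v = flatten_tlr(tlr);
            v.extend(flatten_tal(tal));
            v
        }
        Tal::StepDefault(tal) => flatten_tal(tal),
    }
}
fn flatten_tll(tll: &Tll) -> Vec<String> {
    let Tll(label, tal) = tll;
    let mut v = vec![label.clone()];
    v.extend(flatten_tal(tal));
    v
}
fn flatten_tlr(t: &Tlr) -> Vec<String> {
    match t {
        Tlr::Base(tll) => flatten_tll(tll),
        Tlr::Step(tll, tlr) => {
            let mut v = flatten_tll(tll);
            v.extend(flatten_tlr(tlr));
            v
        }
    }
}
```

Note how a \texttt{tal\_base\_call} or \texttt{tal\_step\_call} carries the
whole \texttt{TLR} of the callee, so flattening recurses into function bodies,
matching the well-nestedness of calls.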
\paragraph{Cost prediction on structured traces.}

The first main theorem of CerCo about traces
(theorem \texttt{compute\_max\_trace\_label\_return\_cost\_ok\_with\_trace})
holds for the instantiation
of the structured traces to the concrete status of object code programs.
Simplifying a bit, it states that
\begin{equation}
\forall s_1,s_2. \forall \tau: \texttt{TLR}~s_1~s_2.~
  \texttt{clock}~s_2 = \texttt{clock}~s_1 +
  \Sigma_{\alpha \in |\tau|}\;k(\alpha)
\label{th1}
\end{equation}
where the cost model $k$ is statically computed from the object code
by associating to each label $\alpha$ the sum of the costs of the instructions
in the basic block that starts at $\alpha$ and ends before the next labelled
instruction. The theorem is proved by structural induction over the structured
trace, and is based on the following invariant: if the function that computes
the cost model has analysed the instruction to be executed at $s_2$ after
the one to be executed at $s_1$, and if the structured trace starts with
$s_1$, then it will eventually also contain $s_2$. When $s_1$ is not a
function call, the result holds trivially because of the $s_1\exec s_2$
condition obtained by inversion on the trace. The only non-trivial case is
that of function calls: the cost model computation function recurses on the
first instruction that follows the function call; the
\texttt{as\_after\_return} condition of the \texttt{tal\_base\_call} and
\texttt{tal\_step\_call} constructors guarantees precisely that the execution
will eventually reach this state.
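As a toy numeric instance of the statement above (with invented labels and
costs, purely for illustration), one can check the equality on a flattened
trace directly:

```rust
// Toy instance of the cost-prediction statement: the clock increment
// along a trace equals the sum of the static cost model k over the
// flattened trace |tau|. Labels and costs are invented for illustration.
fn k(label: &str) -> u64 {
    match label {
        "alpha" => 4, // cost of the basic block starting at label alpha
        "beta" => 7,  // cost of the basic block starting at label beta
        _ => 0,
    }
}

// clock s2 = clock s1 + sum over |tau| of k(alpha)
fn predicted_clock(clock_s1: u64, flattened: &[&str]) -> u64 {
    clock_s1 + flattened.iter().map(|l| k(l)).sum::<u64>()
}
```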
\paragraph{Structured traces similarity and cost prediction invariance.}

A compiler pass maps source code to object code and initial states to initial
states. The source code and the initial state uniquely determine the structured
trace of a program, if it exists. The structured trace fails to exist iff
the structural conditions are violated by the program execution (e.g. a function
body does not start with a cost emission statement). Let us assume that the
target structured trace exists.
What is the relation between the source and target structured traces?
In general, the two traces can be arbitrarily different. However, we are
interested only in those compiler passes that map a trace $\tau_1$ to a trace
$\tau_2$ such that
\begin{equation}|\tau_1| = |\tau_2|.\label{th2}\end{equation}
The reason is that the combination of~\eqref{th1} with~\eqref{th2} yields the
corollary
\begin{equation}
\forall s_1,s_2. \forall \tau: \texttt{TLR}~s_1~s_2.~
  \texttt{clock}~s_2 - \texttt{clock}~s_1 =
  \Sigma_{\alpha \in |\tau_1|}\;k(\alpha) =
  \Sigma_{\alpha \in |\tau_2|}\;k(\alpha).
\end{equation}
This corollary states that the actual execution time of the program can be
computed equally well on the source or on the target language. Thus it becomes
possible to transfer the cost model from the target to the source code and
reason on the source code only.
We are therefore interested in conditions stronger than~\eqref{th2},
and to this end we introduce a similarity relation between traces with
the same structure. Theorem~\texttt{tlr\_rel\_to\_traces\_same\_flatten}
in the Matita formalisation shows that~\eqref{th2} holds for every pair
$(\tau_1,\tau_2)$ of similar traces.
Intuitively, two traces are similar when one can be obtained from
the other by erasing or inserting silent steps, i.e. states that are
not \texttt{as\_costed} and that are classified as \texttt{cl\_other}.
Silent steps do not alter the structure of the traces.
In particular,
the relation maps function calls to function calls to the same function,
label emission statements to emissions of the same label, and concatenations of
subtraces to concatenations of subtraces of the same length that start with
the same emission statement, etc.
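This intuition can be sketched on flat sequences of observable events rather
than on the mutually inductive traces themselves; the event type and the
helper below are our own hypothetical simplification, which deliberately
ignores the structural invariants enforced by the real relation.

```rust
// Similarity, approximated on flat event sequences: erasing silent
// steps must leave the same skeleton of label emissions, calls and
// returns. A hypothetical simplification of the real three relations.
#[derive(Clone, PartialEq)]
enum Event {
    Label(String), // cost emission
    Call(String),  // function call, with callee name
    Ret,           // function return
    Silent,        // cl_other state that is not as_costed
}

fn skeleton(trace: &[Event]) -> Vec<Event> {
    trace.iter().filter(|e| **e != Event::Silent).cloned().collect()
}

fn similar(t1: &[Event], t2: &[Event]) -> bool {
    skeleton(t1) == skeleton(t2)
}
```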
In the formalisation the three similarity relations --- one for each trace
kind --- are defined by structural recursion on the first trace and pattern
matching over the second. Here we turn
the definition into inference rules for the sake of readability. We also
omit from the trace constructors all arguments except those that are traces
or that are used in the premises of the rules. By abuse of notation we denote
all three relations by the infix $\approx$
 {tll_1\approx tll_2
 }
 {\texttt{tlr\_base}~tll_1 \approx \texttt{tlr\_base}~tll_2}

 {tll_1 \approx tll_2 \andalso
  tlr_1 \approx tlr_2
 }
 {\texttt{tlr\_step}~tll_1~tlr_1 \approx \texttt{tlr\_step}~tll_2~tlr_2}

 {\ell~s_1 = \ell~s_2 \andalso
  tal_1\approx tal_2
 }
 {\texttt{tll\_base}~s_1~tal_1 \approx \texttt{tll\_base}~s_2~tal_2}

 {}
 {\texttt{tal\_base\_not\_return}\approx taa \append \texttt{tal\_base\_not\_return}}

 {}
 {\texttt{tal\_base\_return}\approx taa \append \texttt{tal\_base\_return}}

 {tlr_1\approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f
 }
 {\texttt{tal\_base\_call}~s_1~tlr_1\approx taa \append \texttt{tal\_base\_call}~s_2~tlr_2}

 {tlr_1\approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f \andalso
  \texttt{tal\_collapsable}~tal_2
 }
 {\texttt{tal\_base\_call}~s_1~tlr_1 \approx taa \append \texttt{tal\_step\_call}~s_2~tlr_2~tal_2}

 {tlr_1\approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f \andalso
  \texttt{tal\_collapsable}~tal_1
 }
 {\texttt{tal\_step\_call}~s_1~tlr_1~tal_1 \approx taa \append \texttt{tal\_base\_call}~s_2~tlr_2}

 {tlr_1 \approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f \andalso
  tal_1 \approx tal_2
 }
 {\texttt{tal\_step\_call}~s_1~tlr_1~tal_1 \approx taa \append \texttt{tal\_step\_call}~s_2~tlr_2~tal_2}

 {tal_1\approx tal_2
 }
 {\texttt{tal\_step\_default}~tal_1 \approx tal_2}
let rec tlr_rel S1 st1 st1' S2 st2 st2'
  (tlr1 : trace_label_return S1 st1 st1')
  (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
match tlr1 with
  [ tlr_base st1 st1' tll1 ⇒
    match tlr2 with
    [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
    | _ ⇒ False
    ]
  | tlr_step st1 st1' st1'' tll1 tl1 ⇒
    match tlr2 with
    [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
      tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
    | _ ⇒ False
    ]
  ]
and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
 (tll1 : trace_label_label S1 fl1 st1 st1')
 (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
  match tll1 with
  [ tll_base fl1 st1 st1' tal1 H ⇒
    match tll2 with
    [ tll_base fl2 st2 st2' tal2 G ⇒
      as_label_safe … («?, H») = as_label_safe … («?, G») ∧
      tal_rel … tal1 tal2
    ]
  ]
and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
 (tal1 : trace_any_label S1 fl1 st1 st1')
 (tal2 : trace_any_label S2 fl2 st2 st2')
   on tal1 : Prop ≝
  match tal1 with
  [ tal_base_not_return st1 st1' _ _ _ ⇒
    fl2 = doesnt_end_with_ret ∧
    ∃st2mid,taa,H,G,K.
    tal2 ≃ taa_append_tal ? st2 ??? taa
      (tal_base_not_return ? st2mid st2' H G K)
  | tal_base_return st1 st1' _ _ ⇒
    fl2 = ends_with_ret ∧
    ∃st2mid,taa,H,G.
    tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
      (tal_base_return ? st2mid st2' H G)
  | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
    fl2 = doesnt_end_with_ret ∧
    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    (* we must allow a tal_base_call to be similar to a call followed
      by a collapsable trace (trace_any_any followed by a base_not_return;
      we cannot use trace_any_any as it disallows labels in the end as soon
      as it is non-empty) *)
    (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
      tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
      tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
  | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
      tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
      tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
      tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
  | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
    tal_rel … tl1 tal2 (* <- this makes it many to many *)
  ].
In the preceding rules, a $taa$ is an inhabitant of the type
$\texttt{trace\_any\_any}~s_1~s_2$ (shorthand $\texttt{TAA}~s_1~s_2$),
an inductive data type whose definition
is omitted from the paper for lack of space. It is the type of valid
prefixes (even empty ones) of \texttt{TAL}'s that do not contain
any function call. It is therefore
possible to concatenate (using ``$\append$'') a \texttt{TAA} to the
left of a \texttt{TAL}. A \texttt{TAA} captures
a sequence of silent moves.

The \texttt{tal\_collapsable} unary predicate over \texttt{TAL}'s
holds when its argument does not contain any function call and ends
with a label rather than a return. The intuition is that after a function
call we can still perform a sequence of silent actions while remaining
similar.
\section{Forward simulation}

We summarise here the results of the previous sections. Each intermediate
unstructured language can be given a semantics based on structured traces,
which single out those runs that respect a certain number of invariants.
A cost model can be computed on the object code and used to predict
the execution costs of the runs that produce structured traces. The cost model
can be lifted from the target to the source code of a pass if the pass maps
structured traces to similar structured traces. The latter property is called
a \emph{forward simulation}.
As with labelled transition systems, in order to establish the forward
simulation we are interested in (i.e.\ preservation of observables), we are
forced to prove a stronger notion of forward simulation that introduces
an explicit relation between states. The classical notion of a 1-to-0-or-many
forward simulation is the existence of a relation $R$ over states such that
if $s_1 R s_2$ and $s_1 \to^1 s_1'$ then there exists an $s_2'$ such that
$s_2 \to^* s_2'$ and $s_1' R s_2'$. In our context, we need to replace the
one-step and multi-step transition relations $\to^n$ with the existence of
a structured trace between the two states, and we need to add the requirement
that the two structured traces are similar. Thus what we would like to state is
something like:\\
for all $s_1,s_2,s_1'$ such that there is a $\tau_1$ from
$s_1$ to $s_1'$ and $s_1 R s_2$, there exist an $s_2'$ such that
$s_1' R s_2'$ and a $\tau_2$ from $s_2$ to $s_2'$ such that
$\tau_1$ is similar to $\tau_2$. We call this particular form of forward
simulation \emph{trace reconstruction}.
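For the classical notion, a brute-force check over finite systems makes the
quantifier structure concrete. The sketch below is illustrative only: states
are plain integers, \texttt{fuel} bounds the length of the $\to^*$ chain, and
all names are ours.

```rust
// Classical 1-to-0-or-many forward simulation, checked by brute force
// on finite transition systems: whenever s1 R s2 and s1 -> s1', some
// s2' must exist with s2 ->* s2' and s1' R s2'.
fn star(step: &dyn Fn(u32) -> Vec<u32>, fuel: u32, s: u32, t: u32) -> bool {
    s == t || (fuel > 0 && step(s).into_iter().any(|m| star(step, fuel - 1, m, t)))
}

fn is_forward_sim(
    step1: &dyn Fn(u32) -> Vec<u32>, // source transition relation
    step2: &dyn Fn(u32) -> Vec<u32>, // target transition relation
    rel: &dyn Fn(u32, u32) -> bool,  // the relation R over states
    states1: &[u32],
    states2: &[u32],
    fuel: u32,
) -> bool {
    states1.iter().all(|&s1| {
        states2.iter().all(|&s2| {
            !rel(s1, s2)
                || step1(s1).into_iter().all(|s1p| {
                    states2
                        .iter()
                        .any(|&s2p| rel(s1p, s2p) && star(step2, fuel, s2, s2p))
                })
        })
    })
}
```

Trace reconstruction strengthens this scheme by replacing $\to^*$ with the
existence of a structured trace and by additionally requiring similarity of
the two traces.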
The statement just introduced, however, is too simplistic and not provable
in the general case. To understand why, consider the case of a function call
and the pass that fixes the parameter passing conventions. A function
call in the source code takes as input an arbitrary number of pseudo-registers
(the actual parameters to pass) and returns an arbitrary number of
pseudo-registers (where the result is stored). A function call in the target
language has neither input nor output parameters. The pass must add explicit
code before and after the function call to move the content of the
pseudo-registers from/to the hardware registers or the stack, in order to
implement the parameter passing strategy.
Similarly, each function body must be augmented with a preamble and a postamble
to complete/initiate the parameter passing strategy for the call/return phase.
Therefore what used to be a call followed by the next instruction to execute
after the function return now becomes a sequence of instructions, followed by
a call, followed by another sequence. The two states at the beginning of the
first sequence and at the end of the second sequence are in relation with
the states before/after the call in the source code, as in a usual forward
simulation. How can we prove, however, the additional condition for function
calls which asks that, when the function returns, the instruction immediately
after the function call is executed? To guarantee this invariant, there must
be another relation between the addresses of the function call in the source
and in the target code.
This additional relation is to be used in particular to relate the two stacks.
Another example is given by the preservation of code emission statements. A
single code emission instruction can be simulated by a sequence of steps,
followed by a code emission, followed by another sequence. Clearly the initial
and final states of the sequence are to be in relation with the states
before/after the code emission in the source code. In order to preserve the
structured trace invariants, however, we must consider a second relation
between states that tracks the preservation of the code emission statement.
Therefore we now introduce an abstract notion of relation set between abstract
statuses and an abstract notion of 1-to-0-or-many forward simulation conditions.
These two definitions enjoy the following remarkable properties:
\begin{enumerate}
 \item they are generic enough to accommodate all passes of the CerCo compiler
 \item the conjunction of the 1-to-0-or-many forward simulation conditions is
       just slightly stricter than the statement of a 1-to-0-or-many forward
       simulation in the classical case. In particular, the conditions only
       require the construction of very simple forms of structured traces
       made of silent states only.
 \item they allow us to prove the main result of this paper: the 1-to-0-or-many
       forward simulation conditions are sufficient to prove the trace
       reconstruction theorem
\end{enumerate}
Point 3 is the important one. First of all it means that we have reduced
the complex problem of trace reconstruction to a much simpler one that,
moreover, can be solved with slight adaptations of the forward simulation proof
that is performed for a compiler that only cares about functional properties.
Therefore we have separated as much as possible the proof of
preservation of functional properties from that of non-functional ones.
Secondly, combined with the results in the previous section, it implies
that the cost model can be computed on the object code and lifted to the
source code to reason on non-functional properties, assuming that
the 1-to-0-or-many forward simulation conditions are fulfilled for every
compiler pass.
\paragraph{Relation sets.}

We now introduce the four relations $\mathcal{S,C,L,R}$ between abstract
statuses that are used to correlate the corresponding statuses before and
after a compiler pass. The first two are abstract and must be instantiated
by every pass. The remaining two are derived relations.

The $\mathcal{S}$ relation between states is the classical relation used
in forward simulation proofs. It correlates the data of the statuses
(e.g. registers, memory, etc.).

The $\mathcal{C}$ relation correlates call states. It allows us to track the
position in the target code of every call in the source code.

The $\mathcal{L}$ relation simply says that the two states are both label
emitting states that emit the same label. It allows us to track the position in
the target code of every cost emitting statement in the source code.

Finally, the $\mathcal{R}$ relation is the most complex one. Two states
$s_1$ and $s_2$ are $\mathcal{R}$-correlated if, whenever $s_1$ is the
successor of a call state that is $\mathcal{C}$-related to a call state
$s_2'$ in the target code, then $s_2$ is the successor of $s_2'$.
We will require all pairs of states that follow a related call to be
$\mathcal{R}$-related. This is the fundamental requirement that guarantees
that the target trace is well structured, i.e. that function calls are well
nested and always return where they are supposed to.
record status_rel (S1,S2 : abstract_status) : Type[1] := \{
  \(\mathcal{S}\): S1 \(\to\) S2 \(\to\) Prop;
  \(\mathcal{C}\): (\(\Sigma\)s.as_classifier S1 s cl_call) \(\to\)
     (\(\Sigma\)s.as_classifier S2 s cl_call) \(\to\) Prop \}.

definition \(\mathcal{L}\) S1 S2 st1 st2 := as_label S1 st1 = as_label S2 st2.

definition \(\mathcal{R}\) S1 S2 (R: status_rel S1 S2) s1_ret s2_ret ≝
 \(\forall\)s1_pre,s2_pre.
  as_after_return s1_pre s1_ret \(\to\) s1_pre \(\mathcal{C}\) s2_pre \(\to\)
   as_after_return s2_pre s2_ret.
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag] (m) {%
         \node (s1) [is jump] {}; & \node [fill=white] (t1) {};\\
         \node (s2) {}; & \node (t2) {}; \\
    };
    \node [above=0 of t1, overlay] {$\alpha$};
    {[-stealth]
    \draw (s1) -- (t1);
    \draw [new] (s2) -- node [above] {$*$} (t2);
    }
    \draw (s1) to node [rel] {$\S$} (s2);
    \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
\end{tikzpicture}
\caption{The \texttt{cl\_jump} case.}
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag] (m) {%
         \node (s1) {}; & \node (t1) {};\\
         \node (s2) {}; & \node (t2) {}; \\
    };
    {[-stealth]
    \draw (s1) -- (t1);
    \draw [new] (s2) -- node [above] {$*$} (t2);
    }
    \draw (s1) to node [rel] {$\S$} (s2);
    \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
\end{tikzpicture}
\caption{The \texttt{cl\_other} case.}
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag, small vgap] (m) {%
        \node (t1) {}; \\
         \node (s1) [is call] {}; \\
         & \node (l) {}; & \node (t2) {};\\
         \node (s2) {}; & \node (c) [is call] {};\\
    };
    {[-stealth]
    \draw (s1) -- node [left] {$f$} (t1);
    \draw [new] (s2) -- node [above] {$*$} (c);
    \draw [new] (c) -- node [right] {$f$} (l);
    \draw [new] (l) -- node [above] {$*$} (t2);
    }
    \draw (s1) to node [rel] {$\S$} (s2);
    \draw [new] (t1) to [bend left] node [rel] {$\S$} (t2);
    \draw [new] (t1) to [bend left] node [rel] {$\L$} (l);
    \draw [new] (t1) to node [rel] {$\C$} (c);
    \end{tikzpicture}
\caption{The \texttt{cl\_call} case.}
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag, small vgap] (m) {%
        \node (s1) [is ret] {}; \\
        \node (t1) {}; \\
        \node (s2) {}; & \node (c) [is ret] {};\\
        & \node (r) {}; & \node (t2) {}; \\
    };
    {[-stealth]
    \draw (s1) -- (t1);
    \draw [new] (s2) -- node [above] {$*$} (c);
    \draw [new] (c) -- (r);
    \draw [new] (r) -- node [above] {$*$} (t2);
    }
    \draw (s1) to [bend right=45] node [rel] {$\S$} (s2);
    \draw [new, overlay] (t1) to [bend left=90, looseness=1] node [rel] {$\S,\L$} (t2);
    \draw [new, overlay] (t1) to [bend left=90, looseness=1.2] node [rel] {$\R$} (r);
\end{tikzpicture}
\caption{The \texttt{cl\_return} case.}
\caption{The hypotheses for the preservation of structured traces, simplified.
         Dashed lines
         and arrows indicate how the diagrams must be closed when solid relations
         are present.}
\paragraph{1-to-0-or-many forward simulation conditions.}
\begin{condition}[Cases \texttt{cl\_other} and \texttt{cl\_jump}]
 For all $s_1,s_1',s_2$ such that $s_1 \S s_2$, and
 $s_1\exec s_1'$, and either $s_1 \class \texttt{cl\_other}$ or
 both $s_1\class\texttt{cl\_jump}$ and $\ell~s_1'$,
 there exists an $s_2'$ and a $\texttt{trace\_any\_any\_free}~s_2~s_2'$ called $taaf$
 such that $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
$taaf$ is non empty, or one among $s_1$ and $s_1'$ is not \texttt{as\_costed}.
\end{condition}
In the above condition, a $\texttt{trace\_any\_any\_free}~s_1~s_2$ (from now on
abbreviated as \verb+TAAF+) is an
inductive type of structured traces that do not contain function calls or
cost emission statements. Differently from a \verb+TAA+, the
instruction to be executed in the lookahead state $s_2$ may be a cost emission
statement.
The intuition behind the condition is that one step can be replaced with zero
or more steps if this preserves the relation between the data and if the two
final statuses are labelled in the same way. Moreover, we must take special
care of the empty case to avoid collapsing two consecutive states that emit a
label, which would miss one of the two emissions.
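The hazard in the empty case can be seen on a toy count of emissions (all
names invented): if an empty $taaf$ were allowed when both $s_1$ and $s_1'$
emit a label, the single target state $s_2'$ would account for one emission
where the source performs two, and the flattenings could no longer agree.

```rust
// Toy illustration of the empty-taaf side condition: count the labels
// emitted along a run, where `costed` says which states emit a label.
fn emitted(costed: &dyn Fn(&str) -> bool, run: &[&str]) -> usize {
    run.iter().filter(|s| costed(s)).count()
}
```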
\begin{condition}[Case \texttt{cl\_call}]
 For all $s_1,s_1',s_2$ s.t. $s_1 \S s_2$ and
 $s_1\exec s_1'$ and $s_1 \class \texttt{cl\_call}$, there exist $s_a, s_b, s_2'$, a
$\verb+TAA+~s_2~s_a$, and a
$\verb+TAA+~s_b~s_2'$ such that:
$s_a\class\texttt{cl\_call}$, the \texttt{as\_call\_ident}'s of
the two call states are the same, $s_1 \mathcal{C} s_a$,
$s_a\exec s_b$, $s_1' \L s_b$ and
$s_1' \S s_2'$.
\end{condition}
The condition says that, to simulate a function call, we can perform a
sequence of silent actions before and after the function call itself.
The old and new call states must be $\mathcal{C}$-related, the old and new
states at the beginning of the function execution must be $\mathcal{L}$-related
and, finally, the two initial and final states must be $\mathcal{S}$-related
as usual.
\begin{condition}[Case \texttt{cl\_return}]
 For all $s_1,s_1',s_2$ s.t. $s_1 \S s_2$,
 $s_1\exec s_1'$ and $s_1 \class \texttt{cl\_return}$, there exist $s_a, s_b, s_2'$, a
$\verb+TAA+~s_2~s_a$, and a
$\verb+TAAF+~s_b~s_2'$ called $taaf$ such that:
$s_a$ is classified as a \texttt{cl\_return},
$s_a\exec s_b$,
$s_1' \R s_b$ and
$s_1' \mathrel{{\S} \cap {\L}} s_2'$ and, if
$taaf$ is non empty, then $\lnot \ell~s_b$.
\end{condition}
Similarly to the call condition, to simulate a return we can perform a
sequence of silent actions before and after the return statement itself.
The old and the new states after the return must be $\mathcal{R}$-related,
to guarantee that they return to corresponding calls.
The two initial and final states must be $\mathcal{S}$-related
as usual and, moreover, they must exhibit the same labels. Finally, when
the suffix is non empty we must take care not to insert a new,
unmatched cost emission statement just after the return statement.
definition status_simulation ≝
  λS1 : abstract_status.
  λS2 : abstract_status.
  λsim_status_rel : status_rel S1 S2.
    ∀st1,st1',st2.as_execute S1 st1 st1' →
    sim_status_rel st1 st2 →
    match as_classify … st1 with
    [ None ⇒ True
    | Some cl ⇒
      match cl with
      [ cl_call ⇒ ∀prf.
        (*
             st1' ------------S----------\
              ↑ \                         \
             st1 \--L--\                   \
              | \       \                   \
              S  \-C-\  st2_after_call →taa→ st2'
              |       \     ↑
             st2 →taa→ st2_pre_call
        *)
        ∃st2_pre_call.
        as_call_ident ? st2_pre_call = as_call_ident ? («st1, prf») ∧
        call_rel ?? sim_status_rel «st1, prf» st2_pre_call ∧
        ∃st2_after_call,st2'.
        ∃taa2 : trace_any_any … st2 st2_pre_call.
        ∃taa2' : trace_any_any … st2_after_call st2'.
        as_execute … st2_pre_call st2_after_call ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2_after_call
      | cl_return ⇒
        (*
             st1
            / ↓
           | st1'----------S,L------------\
           S   \                           \
            \   \-----R-------\            |
             \                 |           |
             st2 →taa→ st2_ret |           |
                          ↓   /            |
                     st2_after_ret →taaf→ st2'

           we also ask that st2_after_ret be not labelled if the taaf tail is
           not empty
        *)
        ∃st2_ret,st2_after_ret,st2'.
        ∃taa2 : trace_any_any … st2 st2_ret.
        ∃taa2' : trace_any_any_free … st2_after_ret st2'.
        (if taaf_non_empty … taa2' then ¬as_costed … st2_after_ret else True) ∧
        as_classifier … st2_ret cl_return ∧
        as_execute … st2_ret st2_after_ret ∧ sim_status_rel st1' st2' ∧
        ret_rel … sim_status_rel st1' st2_after_ret ∧
        label_rel … st1' st2'
      | cl_other ⇒
          (*
          st1 → st1'
            |      \
            S      S,L
            |        \
           st2 →taaf→ st2'

           the taaf can be empty (e.g. tunneling) but we ask it must not be the
           case when both st1 and st1' are labelled (we would be able to collapse
           labels otherwise)
         *)
        ∃st2'.
        ∃taa2 : trace_any_any_free … st2 st2'.
        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2'
      | cl_jump ⇒
        (* just like cl_other, but with one more hypothesis *)
        as_costed … st1' →
        ∃st2'.
        ∃taa2 : trace_any_any_free … st2 st2'.
        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2'
      ]
    ].
\paragraph{Main result: the 1-to-0-or-many forward simulation conditions
are sufficient for trace reconstruction}

Let us assume that a relation set is given such that the 1-to-0-or-many
forward simulation conditions are satisfied. Under this assumption we
can prove the following three trace reconstruction theorems by mutual
structural induction over the traces given in input between the
$s_1$ and $s_1'$ states.
In particular, the \texttt{status\_simulation\_produce\_tlr} theorem,
applied to the \texttt{main} function of the program and with the
$s_{2_b}$ and $s_2$ states taken to be equal, shows that, for every initial
state in the source code that induces a structured trace in the source code,
the compiled code produces a similar structured trace.
\begin{theorem}
For every $s_1,s_1',s_{2_b},s_2$ s.t.
there is a $\texttt{TLR}~s_1~s_1'$ called $tlr_1$ and a
$\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
$s_1 \S s_2$, there exist $s_{2_m},s_2'$ s.t.
there is a $\texttt{TLR}~s_{2_b}~s_{2_m}$ called $tlr_2$ and
there is a $\verb+TAAF+~s_{2_m}~s_2'$ called $taaf$
s.t. if $taaf$ is non empty then $\lnot (\ell~s_{2_m})$,
and $tlr_1\approx tlr_2$
and $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
$s_1' \R s_{2_m}$.
\end{theorem}
The theorem states that a \texttt{trace\_label\_return} in the source code,
together with a precomputed preamble of silent states
(the \verb+TAA+) in the target code, induces a
similar \texttt{trace\_label\_return} in the target code, which can be
followed by a sequence of silent states. Note that the statement does not
require the produced \texttt{trace\_label\_return} to start with the
precomputed preamble, even if this is likely to be the case in concrete
implementations. The preamble in input is necessary for compositionality,
e.g. because the 1-to-0-or-many forward simulation conditions allow, in the
case of function calls, the execution of a preamble of silent instructions
just after the function call.
\begin{theorem}
For every $s_1,s_1',s_{2_b},s_2$ s.t.
there is a $\texttt{TLL}~b~s_1~s_1'$ called $tll_1$ and a
$\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
$s_1 \S s_2$, then
\begin{itemize}
 \item if $b$ (the trace ends with a return) then there exist $s_{2_m},s_2'$
       and a trace $\texttt{TLL}~b~s_{2_b}~s_{2_m}$ called $tll_2$
       and a $\texttt{TAAF}~s_{2_m}~s_2'$ called $taa_2$ s.t.
       $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
       $s_1' \R s_{2_m}$ and
       $tll_1\approx tll_2$ and
       if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
 \item else there exists $s_2'$ and a
       $\texttt{TLL}~b~s_{2_b}~s_2'$ called $tll_2$ such that
       $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
       $tll_1\approx tll_2$.
\end{itemize}
\end{theorem}
The statement is similar to the previous one: a source
\texttt{trace\_label\_label} and a given target preamble of silent states
in the target code induce a similar \texttt{trace\_label\_label} in the
target code, possibly followed by a sequence of silent moves that become the
preamble for the next \texttt{trace\_label\_label} translation.
For every $s_1,s_1',s_2$ s.t.
there is a $\texttt{TAL}~b~s_1~s_1'$ called $tal_1$ and
$s_1 \S s_2$, the following holds:
\begin{itemize}
 \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
   and a trace $\texttt{TAL}~b~s_2~s_{2_m}$ called $tal_2$ and a
   $\texttt{TAAF}~s_{2_m}~s_2'$ called $taa_2$ s.t.
   $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
   $s_1' \R s_{2_m}$ and
   $tal_1 \approx tal_2$ and
   if $taa_2$ is non-empty then $\lnot (\ell~s_{2_m})$;
 \item else there exists $s_2'$ and a
   $\texttt{TAL}~b~s_2~s_2'$ called $tal_2$ such that
   either $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
       $tal_1\approx tal_2$,
   or $s_1' \mathrel{{\S} \cap {\L}} s_2$ and
   $\texttt{tal\_collapsable}~tal_1$ and $\lnot (\ell~s_1)$.
\end{itemize}
The statement is similar to the previous ones, except for the lack of
the target code preamble.
For every $s_1,s_1',s_2$ s.t.
there is a $\texttt{trace\_label\_return}~s_1~s_1'$ called $tlr_1$ and
$s_1 \mathrel{{\mathcal{L}} \cap {\mathcal{S}}} s_2$,
there exists $s_{2_m},s_2'$ s.t.
there is a $\texttt{trace\_label\_return}~s_2~s_{2_m}$ called $tlr_2$ and
there is a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taaf$
s.t. if $taaf$ is non-empty then $\lnot (\texttt{as\_costed}~s_{2_m})$,
and $\texttt{tlr\_rel}~tlr_1~tlr_2$
and $s_1' \mathrel{{\mathcal{S}} \cap {\mathcal{L}}} s_2'$ and
$s_1' \mathcal{R} s_{2_m}$.
status_simulation_produce_tlr S1 S2 R
(* we start from this situation
     st1 →→→→tlr→→→→ st1'
      | \
      L  \---S--\
      |          \
   st2_lab →taa→ st2   (the taa preamble is in general either empty or given
                        by the preceding call)

   and we produce
     st1 →→→→tlr→→→→ st1'
             \\      /  \
             //     R    \-L,S-\
             \\     |           \
   st2_lab →tlr→ st2_mid →taaf→ st2' *)
  st1 st1' st2_lab st2
  (tlr1 : trace_label_return S1 st1 st1')
  (taa2_pre : trace_any_any S2 st2_lab st2)
  (sim_execute : status_simulation S1 S2 R)
  on tlr1 : R st1 st2 → label_rel … st1 st2_lab →
  ∃st2_mid.∃st2'.
  ∃tlr2 : trace_label_return S2 st2_lab st2_mid.
  ∃taa2 : trace_any_any_free … st2_mid st2'.
  (if taaf_non_empty … taa2 then ¬as_costed … st2_mid else True) ∧
  R st1' st2' ∧ ret_rel … R st1' st2_mid ∧ label_rel … st1' st2' ∧
  tlr_rel … tlr1 tlr2
\section{Conclusions and future work}
The labelling approach is a technique to implement compilers that induce on
the source code a non-uniform cost model determined from the object code
produced. The cost model assigns a cost to each basic block of the program.
The main theorem of the approach states that there is an exact
correspondence between the sequences of basic blocks started in the source
and object code, and that no instruction in the source or object code is
executed outside a basic block. Thus the cost of object code execution
can be computed precisely on the source.
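The arithmetic behind this lifting can be sketched in a few lines of Python
(a hypothetical illustration, not part of the CerCo development): since every
executed instruction belongs to exactly one started basic block, the cost of a
run is the sum, over the emitted cost labels, of the statically computed cost
of the corresponding block.

```python
# Hypothetical sketch, not the CerCo implementation: lifting a per-basic-block
# cost model (computed on the object code) to a source-level execution,
# identified by the sequence of cost labels it emits.

def execution_cost(emitted_labels, cost_model):
    """Sum the static cost of every basic block whose label was emitted."""
    return sum(cost_model[label] for label in emitted_labels)

# Three basic blocks with made-up object-code cycle counts.
cost_model = {"L1": 12, "L2": 5, "L3": 30}
trace = ["L1", "L2", "L2", "L3"]   # labels observed during one run
print(execution_cost(trace, cost_model))  # 12 + 5 + 5 + 30 = 52
```

By the exact-correspondence theorem, the source and object runs emit the same
label sequence, so this sum computed on the source agrees with the actual
object-code cost.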
In this paper we scale the labelling approach to cover a programming language
with function calls. This introduces new difficulties only when the language
is unstructured, i.e. it allows function calls to return anywhere in the code,
destroying the hope of a static prediction of the cost of basic blocks.
We restore static predictability by introducing a new semantics for unstructured
programs that singles out well-structured executions. The latter are represented
by structured traces, a generalisation of streams of observables that captures
several structural invariants of the execution, such as the well-nesting of
function calls or the fact that every basic block must start with a code
emission statement. We show that structured traces are sufficiently structured
to statically compute a precise cost model on the object code.
We introduce a similarity relation on structured traces that must hold between
source and target traces. When the relation holds for every program, we prove
that the cost model can be lifted from the object to the source code.
In order to prove that similarity holds, we present a generic proof of forward
simulation that is aimed at separating as much as possible the part of the
simulation related to non-functional properties (preservation of structure)
from that related to functional properties. In particular, we reduce the
problem of preservation of structure to that of showing a 1-to-0-or-many
forward simulation that only adds a few additional proof obligations to those
of a traditional proof concerned only with functional properties.
All results presented in the paper are part of a larger certification of a
C compiler which is based on the labelling approach. The certification, done
in Matita, is the main deliverable of the FET-Open project Certified
Complexity (CerCo).

Short-term future work consists in completing the certification of
the CerCo compiler by exploiting the main theorem of this paper.
\paragraph{Related work.}
CerCo is the first project that explicitly tries to induce a
precise cost model on the source code in order to establish non-functional
properties of programs of a high-level language. Traditional certifications
of compilers, like~\cite{compcert2,piton}, only explicitly prove preservation
of the functional properties.
Usually forward simulations take the following form: for each transition
from $s_1$ to $s_2$ in the source code, there exists an equivalent sequence of
transitions in the target code of length $n$. The number $n$ of transition steps
in the target code can just be the witness of the existential statement.
An equivalent alternative, when the proof of simulation is constructive,
consists in providing an explicit function, called a \emph{clock function} in
the literature~\cite{clockfunctions}, that computes $n$ from $s_1$. Every clock
function then constitutes a cost model for the source code, in the spirit of
what we are doing in CerCo. However, we believe our solution to be superior
in the following respects: 1) the machinery of the labelling approach is
insensitive to the resource being measured. Indeed, any cost model computed on
the object code can be lifted to the source code (e.g. stack space used,
energy consumed, etc.). On the contrary, clock functions only talk about
the number of transition steps. In order to extend the approach with clock
functions to other resources, additional functions must be introduced.
Moreover, the additional functions would be handled differently in the proof.
2) The cost models induced by the labelling approach have a simple
presentation. In particular, they associate a number to each basic block. More
complex models can be induced when the approach is scaled to cover, for
instance, loop optimisations~\cite{loopoptimizations}, but the costs are still
meant to be easy to understand and manipulate in an interactive theorem prover
or in Frama-C.
On the contrary, a clock function is a complex function of the state $s_1$
which, as a function, is an opaque object that is difficult to reify as
source code in order to reason about it.
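The two simulation shapes discussed above can be contrasted abstractly. The
following Lean sketch uses hypothetical names (\texttt{Steps},
\texttt{ForwardSim}, \texttt{ClockSim}) and is only meant to show how turning
the existential witness $n$ into a function of the source state yields a clock
function, i.e. a transition-count cost model:

```lean
-- Hedged sketch (hypothetical names, not the CerCo formalisation):
-- classical forward simulation and its clock-function variant.
variable {S T : Type}

-- n-step reachability in the target transition system.
inductive Steps (step : T → T → Prop) : Nat → T → T → Prop
  | refl (t : T) : Steps step 0 t t
  | cons {n : Nat} {t₁ t₂ t₃ : T} :
      step t₁ t₂ → Steps step n t₂ t₃ → Steps step (n + 1) t₁ t₃

-- For each source transition, some sequence of n target transitions,
-- with n the witness of the existential.
def ForwardSim (stepS : S → S → Prop) (stepT : T → T → Prop)
    (R : S → T → Prop) : Prop :=
  ∀ s₁ s₂ t₁, R s₁ t₁ → stepS s₁ s₂ →
    ∃ n t₂, Steps stepT n t₁ t₂ ∧ R s₂ t₂

-- Constructive variant: the clock function predicts n from s₁ alone,
-- and thus induces a (step-count) cost model on the source.
def ClockSim (stepS : S → S → Prop) (stepT : T → T → Prop)
    (R : S → T → Prop) (clock : S → Nat) : Prop :=
  ∀ s₁ s₂ t₁, R s₁ t₁ → stepS s₁ s₂ →
    ∃ t₂, Steps stepT (clock s₁) t₁ t₂ ∧ R s₂ t₂
```

Note how \texttt{ClockSim} measures only transition steps: measuring a
different resource would require a further function alongside \texttt{clock},
whereas the labelling approach lifts any per-block cost model unchanged.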
% \appendix
% \section{Notes for the reviewers}

% The results described in the paper are part of a larger formalization
% (the certification of the CerCo compiler). At the moment of the submission
% we need to single out from the CerCo formalization the results presented here.
% Before the 16th of February we will submit an attachment that contains the
% minimal subset of the CerCo formalization that allows us to prove those
% results. At that time it will also be possible to measure exactly the size
% of the formalization described here. At the moment a rough approximation
% suggests about 2700 lines of Matita code.

% We will also attach the development version of the interactive theorem
% prover Matita that compiles the submitted formalization. Another possibility
% is to backport the development to the last released version of the system
% to avoid having to re-compile Matita from scratch.

% The programming and certification style used in the formalization heavily
% exploits dependent types. Dependent types are used: 1) to impose invariants
% by construction on the data types and operations (e.g. a trace from a state
% $s_1$ to a state $s_2$ can be concatenated to a trace from a state
% $s_2'$ to a state $s_3$ only if $s_2$ is convertible with $s_2'$); 2)
% to state and prove the theorems by using the Russell methodology of
% Matthieu Sozeau\footnote{Subset Coercions in Coq in TYPES'06. Matthieu Sozeau. Thorsten Altenkirch and Conor McBride (Eds). Volume 4502 of Lecture Notes in Computer Science. Springer, 2007, pp. 237-252.
% }, better known in the Coq world as ``\texttt{Program}'' and reimplemented in a simpler way in Matita using coercion propagations\footnote{Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, Enrico Tassi: A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions. Logical Methods in Computer Science 8(1) (2012)}. However, no result presented depends
% mandatorily on dependent types: it should be easy to adapt the technique
% and results presented in the paper to HOL.

% Finally, Matita and Coq are based on minor variations of the Calculus of
% (Co)Inductive Constructions. These variations do not affect the CerCo
% formalization. Therefore a porting of the proofs and ideas to Coq would be
% rather straightforward.