% NB: might be worth removing this if changing class in favour of
% a built-in definition.
 {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
  morekeywords={[2]if,then,else,forall,Prop,match,with,end,let},
 }
morekeywords={int8_t}
       keywordstyle=\color{red}\bfseries,
       keywordstyle=[2]\color{blue},
       commentstyle=\color{green},
       stringstyle=\color{blue},
       showspaces=false,showstringspaces=false,
       xleftmargin=1em}
\title{Certification of the Preservation of Structure by a Compiler's Back-end Pass\thanks{The project CerCo acknowledges the financial support of the Future and
Emerging Technologies (FET) programme within the Seventh Framework
Programme for Research of the European Commission, under FET-Open grant
number: 243881}}
\author{Paolo Tranquilli \and Claudio Sacerdoti Coen}
\institute{Department of Computer Science and Engineering, University of Bologna,\\\email{}, \email{}}
\begin{abstract}
The labelling approach is a technique to lift cost models for non-functional
properties of programs from the object code to the source code. It is based
on the preservation of the structure of the high level program in every
intermediate language used by the compiler. Such structure is captured by
observables that are added to the semantics and that need to be preserved
by the forward simulation proof of correctness of the compiler. Additional
special observables are required for function calls. In this paper we
present a generic forward simulation proof that preserves all these observables.
The proof statement is based on a new mechanised semantics that traces the
structure of execution when the language is unstructured. The generic semantics
and simulation proof have been mechanised in the interactive theorem prover
Matita.
\end{abstract}
The \emph{labelling approach} has been introduced in~\cite{easylabelling} as
a technique to \emph{lift} cost models for non-functional properties of programs
from the object code to the source code. Examples of non-functional properties
are execution time, amount of stack/heap space consumed and energy required for
communication. The basic idea of the approach is that it is impossible to
provide a \emph{uniform} cost model for a high level language that is preserved
\emph{precisely} by a compiler. For instance, two instances of an assignment
$x = y$ in the source code can be compiled very differently according to the
place (registers vs stack) where $x$ and $y$ are stored at the moment of
execution. Therefore a precise cost model must assign a different cost
to every occurrence, and the exact cost can only be known after compilation.

According to the labelling approach, the compiler is free to compile and optimise
the source code without any major restriction, but it must keep track
of what happens to basic blocks during the compilation. The cost model is
then computed on the object code. It assigns a cost to every basic block.
Finally, the compiler propagates the cost model back to the source level,
assigning a cost to each basic block of the source code.

Implementing the labelling approach in a certified compiler
allows one to reason formally on the high level source code of a program to prove
non-functional properties that are guaranteed to be preserved by the compiler
itself. The trusted code base is then reduced to 1) the interactive theorem
prover (or its kernel) used in the certification of the compiler and
2) the software used to certify the property on the source language, which
can itself be certified, further reducing the trusted code base.
In~\cite{easylabelling} the authors provide an example of a simple
certified compiler that implements the labelling approach for the
imperative \texttt{While} language~\cite{while}, which has neither
pointers nor function calls.

The labelling approach has been shown to scale to more interesting scenarios.
In particular, in~\cite{functionallabelling} it has been applied to a functional
language, and in~\cite{loopoptimizations} it has been shown that the approach
can be slightly complicated to handle loop optimisations and, more generally,
program optimisations that do not preserve the structure of basic blocks.
On-going work also shows that the labelling approach is compatible with
the complex analyses required to obtain a cost model for object code
on processors that implement advanced features like pipelining, superscalar
architectures and caches.

In the European Project CerCo (Certified Complexity\footnote{\url{}})~\cite{cerco} we are certifying a labelling approach based compiler for a large subset of C to
8051 object code. The compiler is
moderately optimising and implements a compilation chain that is largely
inspired by that of CompCert~\cite{compcert1,compcert2}. Compared to the work done in~\cite{easylabelling}, the main novelty and source of difficulties is the presence
of function calls. Surprisingly, the addition of function calls requires a
revisitation of the proof technique given in~\cite{easylabelling}. In
particular, at the core of the labelling approach there is a forward
simulation proof that, in the case of \texttt{While}, is only minimally
more complex than the proof required for the preservation of the
functional properties alone. In the case of a programming language with
function calls, instead, it turns out that the forward simulation proof for
the back-end languages must guarantee a whole new set of invariants.

In this paper we present a formalisation, in the Matita interactive theorem
prover~\cite{matita1,matita2}, of a generic version of the simulation proof required for unstructured
languages. All back-end languages of the CerCo compiler are unstructured
languages, so the proof covers half of the correctness of the compiler.
The statement of the generic proof is based on a new semantics
for imperative unstructured languages that is based on \emph{structured
traces} and that restores the preservation of structure in the observables of
the semantics. The generic proof makes it possible to almost completely separate the
part of the simulation that deals with functional properties only from the
part that deals with the preservation of structure.

The plan of this paper is the following. In Section~\ref{labelling} we
sketch the labelling method and the problems arising from its application
to languages with function calls. In Section~\ref{semantics} we introduce
a generic description of an unstructured imperative language and the
corresponding structured traces (the novel semantics). In
Section~\ref{simulation} we describe the forward simulation proof.
Conclusions and future work are in Section~\ref{conclusions}.

\section{The labelling approach}

We briefly sketch here a simplified version of the labelling approach as
introduced in~\cite{easylabelling}. The simplification strengthens the
sufficient conditions given in~\cite{easylabelling} to allow a simpler
explanation. The simplified conditions given here are also used in the
CerCo compiler to simplify the proof.

Let $\mathcal{P}$ be a programming language whose semantics is given in
terms of observables: a run of a program yields a finite or infinite
stream of observables. We also assume for the time being that function
calls are not available in $\mathcal{P}$. We want to associate a cost
model to a program $P$ written in $\mathcal{P}$. The first step is to
extend the syntax of $\mathcal{P}$ with a new construct $\texttt{emit L}$
where $L$ is a label distinct from all observables of $\mathcal{P}$.
The semantics of $\texttt{emit L}$ is the emission of the observable
\texttt{L}, which is meant to signal the beginning of a basic block.

There exists an automatic procedure that injects into the program $P$ an
$\texttt{emit L}$ statement at the beginning of each basic block, using a fresh
\texttt{L} for each block. In particular, the bodies of loops, both branches
of \texttt{if-then-else}s and the targets of \texttt{goto}s must all start
with an emission statement.
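The injection procedure can be sketched on a toy structured language. The following Python fragment is our own illustration, not the CerCo implementation: the tuple-based AST encoding and the names \texttt{inject} and \texttt{label\_program} are hypothetical, and the toy language has no \texttt{goto}.

```python
import itertools

_fresh = itertools.count(1)

def fresh():
    """Generate a fresh cost label."""
    return f"L{next(_fresh)}"

def inject(stmt):
    """Inject an emit statement at the start of every nested basic block."""
    kind = stmt[0]
    if kind == "seq":
        return ("seq", [inject(s) for s in stmt[1]])
    if kind == "if":
        _, cond, then_b, else_b = stmt
        # both branches start a new basic block
        return ("if", cond,
                ("seq", [("emit", fresh()), inject(then_b)]),
                ("seq", [("emit", fresh()), inject(else_b)]))
    if kind == "while":
        _, cond, body = stmt
        # the loop body starts a new basic block
        return ("while", cond, ("seq", [("emit", fresh()), inject(body)]))
    return stmt                      # plain statements are left untouched

def label_program(program):
    """The program entry point is itself the start of a basic block."""
    return ("seq", [("emit", fresh()), inject(program)])

def count_emits(stmt):
    kind = stmt[0]
    if kind == "emit":
        return 1
    if kind == "seq":
        return sum(count_emits(s) for s in stmt[1])
    if kind == "if":
        return count_emits(stmt[2]) + count_emits(stmt[3])
    if kind == "while":
        return count_emits(stmt[2])
    return 0

prog = ("seq", [("assign", "x", "y"),
                ("if", "c", ("assign", "a", "b"), ("assign", "a", "c")),
                ("while", "c", ("assign", "x", "x+1"))])

# entry point + two branches + one loop body = 4 fresh labels
assert count_emits(label_program(prog)) == 4
```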
Let now $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
organised in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
language used by the compiler. We can easily extend every
intermediate language (and its semantics) with an $\texttt{emit L}$ statement
as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
the additional difficulty that the syntax of object code is given as a
sequence of bytes. The injection of an emission statement in the object code
can be done using a map that associates the statement to pairs of consecutive
code addresses. The intended semantics is that, if $(pc_1,pc_2) \mapsto \texttt{emit L}$, then the observable \texttt{L} is emitted after the execution of the
instruction stored at $pc_1$ and before the execution of the instruction
stored at $pc_2$. The two program counters are necessary because the
instruction stored at $pc_1$ can have multiple possible successors (e.g.
in case of a conditional branch or an indirect call). Dually, the instruction
stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
target of a jump).
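The address-pair map can be pictured with a small executable sketch (ours, for illustration; the addresses and labels are invented). Note how the same instruction at $pc_1$ emits different labels depending on which successor is taken.

```python
# Map from pairs of consecutive program counters to emitted labels.
# The branch at 0x100 has two possible successors, each starting a
# differently labelled basic block.
emit_map = {
    (0x100, 0x103): "L1",   # fall-through successor of the branch
    (0x100, 0x200): "L2",   # taken successor of the same branch
}

def observables(pc_trace, emit_map):
    """Labels emitted along a run: one per adjacent pc pair in the map."""
    return [emit_map[p]
            for p in zip(pc_trace, pc_trace[1:])
            if p in emit_map]

assert observables([0x100, 0x103], emit_map) == ["L1"]
assert observables([0x100, 0x200], emit_map) == ["L2"]
```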
The compiler, to be functionally correct, must preserve observational
equivalence, i.e. executing the program after each compiler pass should
yield the same stream of observables. After the injection of emission
statements, observables now capture both functional and non-functional
behaviour.
This correctness property is called in the literature a forward simulation
and is sufficient for correctness when the target language is
deterministic.

We also require a stronger, non-functional preservation property: after each
pass all basic blocks must start with an emission statement, and all labels
\texttt{L} must be unique.

Now let $M$ be the object code obtained for the program $P$. Let us suppose
that we can statically inspect the code $M$ and associate to each basic block
a cost (e.g. the number of clock cycles required to execute all instructions
in the basic block, or an upper bound on that time). Every basic block is
labelled with a unique label \texttt{L}, thus we can actually associate the
cost to \texttt{L}. Let us call it $k(\texttt{L})$.

The function $k$ is defined as the cost model for the object code control
blocks. It can be used equally well as the cost model for the source
control blocks. Indeed, if the semantics of $P$ is the stream
$L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$, because
every instruction belongs to a control block and every control block is
labelled. Thus it is correct to say that the execution cost of $P$ is also
$\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
the blocks of the high level program $P$ that is preserved by compilation.
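The computation just described is a plain sum over the emitted stream. The snippet below is an illustrative sketch of ours (the cost model and label streams are invented): forward simulation makes the source and object streams equal, so the sum is the same on both sides.

```python
def execution_cost(label_stream, k):
    """Total cost of a finite run: the sum of its block costs k(L)."""
    return sum(k[label] for label in label_stream)

# Hypothetical cost model computed on the object code.
k = {"L1": 12, "L2": 5, "L3": 7}

source_run = ["L1", "L2", "L2", "L3"]   # observables of P
object_run = ["L1", "L2", "L2", "L3"]   # observables of M: same stream,
                                        # by forward simulation

assert execution_cost(source_run, k) == execution_cost(object_run, k) == 29
```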
How can the user profit from the high level cost model? Suppose, for instance,
that he wants to prove that the WCET of his program is bounded by $c$. It
is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
a purely functional property of the code. He can therefore use any technique
available to certify functional properties of the source code.
What is suggested in~\cite{easylabelling} is to actually instrument the
source code $P$ by replacing every label emission statement
$\texttt{emit L}$ with the instruction $\texttt{cost += k(L)}$ that increments
a global fresh variable \texttt{cost}. The bound is now proved by establishing
the program invariant $\texttt{cost} \leq c$, which can be done for example
using the Frama-C~\cite{framaC} suite if the source code is some variant of C.

\subsection{Labelling function calls}

We now want to extend the labelling approach to support function calls.
In the high level, \emph{structured} programming language $\mathcal{P}$ there
is not much to change.
When a function is invoked, the current basic block is temporarily exited
and the basic block the function starts with takes control. When the function
returns, the execution of the original basic block is resumed. Thus the only
significant change is that basic blocks can now be nested. Let \texttt{E}
be the label of the external block and \texttt{I} the label of a nested one.
Since the external block starts before the internal one, the semantics observed will be
\texttt{E I} and the cost associated to it on the source language will be
$k(\texttt{E}) + k(\texttt{I})$, i.e. the cost of executing all instructions
in the block \texttt{E} first plus the cost of executing all the instructions in
the block \texttt{I}. However, we know that some instructions in \texttt{E} are
executed after the last instruction in \texttt{I}. This is actually irrelevant
because we are here assuming that costs are additive, so that we can freely
permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non-trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
the function call terminates and yields control back to the basic block
\texttt{E}. If the call diverges, the instrumentation
$\texttt{cost += k(E)}$ executed at the beginning of \texttt{E} is still valid,
but just as an upper bound on the real execution cost: only precision is lost.

Let us now consider what happens when we move down the compilation chain to an
unstructured intermediate or final language. Here unstructured means that
the only control operators are conditional and unconditional jumps, function
calls and returns. Unlike a structured language, though, there is no guarantee
that a function will return control just after the function call point.
The semantics of the return statement, indeed, consists in fetching the
return address from some internal structure (typically the control stack) and
jumping directly to it. The code can freely manipulate the control stack to
make the procedure return to whatever position it likes. Indeed, it is also possible
to break the well nesting of function calls/returns.

Is it the case that the code produced by a correct compiler must respect the
additional property that every function returns just after its function call
point? The answer is negative and the property is not implied by forward
simulation proofs. For instance, imagine modifying a correct compiler pass
by systematically adding one to the return address on the stack and by
putting a \texttt{NOP} (or any other instruction that takes one byte) after
every function call. The code obtained will be functionally indistinguishable,
and the added instructions will all be dead code.

This lack of structure in the semantics badly interferes with the labelling
approach. The reason is the following: when a basic block labelled with
\texttt{E} contains a function call, it no longer makes any sense to associate
to the label \texttt{E} the sum of the costs of all the instructions in the block.
Indeed, there is no guarantee that the function will return into the block and
that the instructions that will be executed after the return will be the ones
we are paying for in the cost model.

How can we make the labelling approach work in this scenario? We only see two
possible ways. The first one consists in injecting an emission statement after
every function call: basic blocks no longer contain function calls, but are now
terminated by them. This completely solves the problem and allows the compiler
to break the structure of function calls/returns at will. However, the
technique has several drawbacks. First of all, it greatly augments the number
of cost labels that are injected in the source code and that become
instrumentation statements. Thus, when reasoning on the source code to prove
non-functional properties, the user (or the automation tool) will have to handle
larger expressions. Second, the more labels are emitted, the more difficult it
becomes to implement powerful optimisations respecting the code structure.
Indeed, function calls are usually implemented in such a way that most registers
are preserved by the call, so that the static analysis of the block is not
interrupted by the call and an optimisation can involve both the code before
and after the function call. Third, instrumenting the source code may require
unpleasant modifications of it. Take, for example, the code
\texttt{f(g(x));}. We need to inject an emission statement/instrumentation
instruction just after the execution of \texttt{g}. The only way to do that
is to rewrite the code as \texttt{y = g(x); emit L; f(y);} for some fresh
variable \texttt{y}. It is pretty clear how in certain situations the obtained
code would be more obfuscated and thus more difficult to reason on manually.

For the previous reasons, in this paper and in the CerCo project we adopt a
different approach. We do not inject emission statements after every
function call. However, we want to propagate a strong additional invariant in
the forward simulation proof. The invariant is the propagation of the structure
of the original high level code, even if the target language is unstructured.
The structure we want to propagate, which will become clearer in the next
section, comprises 1) the property that every function should return just after
the function call point, which in turn implies well nesting of function calls;
2) the property that every basic block starts with a code emission statement.

In the original labelling approach of~\cite{easylabelling}, the second property
was granted syntactically as a property of the generated code.
In our revised approach, instead, we will impose the property on the runs:
it will be possible to generate code that does not respect the syntactic
property, as long as all possible runs respect it. For instance, dead code will no longer
be required to have all basic blocks correctly labelled. The switch is suggested
by the fact that the first of the two properties --- the one related to
function calls/returns --- can only be defined as a property of runs,
not of the static code. The switch is
beneficial to the proof because the original proof was made of two parts:
the forward simulation proof and the proof that the static property was granted.
In our revised approach the latter disappears and only the forward simulation
is kept.

In order to capture the structure in the semantics so that it is preserved
by a forward simulation argument, we need to make the structure observable
in the semantics. This is the topic of the next section.

\section{Structured traces}

The program semantics adopted in the traditional labelling approach is based
on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
a set of states $\mathcal{S}$, the semantics of one deterministic execution
step is
defined as a function $S \to S \times O^*$ where $O^*$ is a (finite) stream of
observables. The semantics is then lifted compositionally to multiple (finite
or infinite) execution steps.
Finally, the semantics of a whole program execution is obtained by forgetting
about the final state (if any), yielding a function $S \to O^*$ that given an
initial state returns the finite or infinite stream of observables in output.

We present here a new definition of semantics where the structure of execution,
as defined in the previous section, is now observable. The idea is to replace
the stream of observables with a structured data type that makes explicit
function calls and returns and that grants some additional invariants by
construction. The data structure, called \emph{structured traces}, is
defined inductively for terminating programs and coinductively for diverging
ones. In the paper we focus only on the inductive structure, i.e. we assume
that all programs that are given a semantics are total. The Matita formalisation
also shows the coinductive definitions. The semantics of a program is then
defined as a function that maps an initial state into a structured trace.

In order to have a definition that works on multiple intermediate languages,
we abstract the type of structured traces over an abstract data type of
abstract statuses:
\begin{alltt}
record abstract_status := \{ S: Type[0];
 as_execute: S \(\to\) S \(\to\) Prop;     as_classify: S \(\to\) classification;
 as_costed: S \(\to\) Prop;      as_label: \(\forall\) s. as_costed S s \(\to\) label;
 as_call_ident: \(\forall\) s. as_classify S s = cl_call \(\to\) label;
 as_after_return:
  (\(\Sigma\)s:as_status. as_classify s = Some ? cl_call) \(\to\) as_status \(\to\) Prop \}
\end{alltt}
The predicate $\texttt{as\_execute}~s_1~s_2$ holds if $s_1$ evolves into
$s_2$ in one step;\\ $\texttt{as\_classify}~s~c$ holds if the next instruction
to be executed in $s$ is classified according to $c \in
\{\texttt{cl\_return,cl\_jump,cl\_call,cl\_other}\}$ (we omit tail-calls for simplicity);
the predicate $\texttt{as\_costed}~s$ holds if the next instruction to be
executed in $s$ is a cost emission statement (also classified
as \texttt{cl\_other}); finally $(\texttt{as\_after\_return}~s_1~s_2)$ holds
if the next instruction to be executed in $s_2$ follows the function call
to be executed in (the witness of the $\Sigma$-type) $s_1$. The two functions
\texttt{as\_label} and \texttt{as\_call\_ident} are used to extract the
cost label/function call target from states whose next instruction is
a cost emission/function call statement.

The inductive type for structured traces is actually made of three mutually
inductive types with the following semantics:
\begin{itemize}
 \item $(\texttt{trace\_label\_return}~s_1~s_2)$ is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_1$ is a label emission
   statement and the one to be executed in the state before $s_2$ is a return
   statement. Thus $s_2$ is the state after the return. The trace
   may contain other label emission statements. It captures the structure of
   the execution of function bodies: they must start with a cost emission
   statement and must end with a return; they are obtained by concatenating
   one or more basic blocks, all starting with a label emission
   (e.g. in case of loops).
 \item $(\texttt{trace\_any\_label}~b~s_1~s_2)$ is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_2$/in the state before
   $s_2$ is either a label emission statement
   or a return, according to the boolean $b$. It must not contain
   any label emission statement. It captures the notion of a suffix of a
   basic block.
 \item $(\texttt{trace\_label\_label}~b~s_1~s_2)$ is the special case of
   $(\texttt{trace\_any\_label}~b~s_1~s_2)$ such that the instruction to be
   executed in $s_1$ is a label emission statement. It captures the notion of
   a basic block.
\end{itemize}
 {\texttt{trace\_label\_label}~true~s_1~s_2}
 {\texttt{trace\_label\_return}~s_1~s_2}

 {\texttt{trace\_label\_label}~false~s_1~s_2 \andalso
  \texttt{trace\_label\_return}~s_2~s_3
 }
 {\texttt{trace\_label\_return}~s_1~s_3}

 {\texttt{trace\_any\_label}~b~s_1~s_2 \andalso
  \texttt{as\_costed}~s_1
 }
 {\texttt{trace\_label\_label}~b~s_1~s_2}

 {\texttt{as\_execute}~s_1~s_2 \andalso
  \texttt{as\_classify}~s_1 \in \{\texttt{cl\_jump,cl\_other}\} \andalso
  \texttt{as\_costed}~s_2
 }
 {\texttt{trace\_any\_label}~false~s_1~s_2}

 {\texttt{as\_execute}~s_1~s_2 \andalso
  \texttt{as\_classify}~s_1 = \texttt{cl\_return} \\
 }
 {\texttt{trace\_any\_label}~true~s_1~s_2}

 {\texttt{as\_execute}~s_1~s_2 \andalso
  \texttt{as\_classify}~s_1 = \texttt{cl\_call} \\
  \texttt{as\_after\_return}~s_1~s_3 \andalso
  \texttt{trace\_label\_return}~s_2~s_3 \andalso
  \texttt{as\_costed}~s_3
 }
 {\texttt{trace\_any\_label}~false~s_1~s_3}

 {\texttt{as\_execute}~s_1~s_2 \andalso
  \texttt{as\_classify}~s_1 = \texttt{cl\_call} \\
  \texttt{as\_after\_return}~s_1~s_3 \andalso
  \texttt{trace\_label\_return}~s_2~s_3 \\
  \lnot \texttt{as\_costed}~s_3 \andalso
  \texttt{trace\_any\_label}~b~s_3~s_4
 }
 {\texttt{trace\_any\_label}~b~s_1~s_4}

 {\texttt{as\_execute}~s_1~s_2 \andalso
  \lnot \texttt{as\_costed}~s_2 \\
  \texttt{trace\_any\_label}~b~s_2~s_3 \andalso
  \texttt{as\_classify}~s_1 = \texttt{cl\_other}
 }
 {\texttt{trace\_any\_label}~b~s_1~s_3}
inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
  | tlr_base:
      ∀status_before: S.
      ∀status_after: S.
        trace_label_label S ends_with_ret status_before status_after →
        trace_label_return S status_before status_after
  | tlr_step:
      ∀status_initial: S.
      ∀status_labelled: S.
      ∀status_final: S.
        trace_label_label S doesnt_end_with_ret status_initial status_labelled →
        trace_label_return S status_labelled status_final →
          trace_label_return S status_initial status_final
with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
  | tll_base:
      ∀ends_flag: trace_ends_with_ret.
      ∀start_status: S.
      ∀end_status: S.
        trace_any_label S ends_flag start_status end_status →
        as_costed S start_status →
          trace_label_label S ends_flag start_status end_status
with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
  (* Single steps within a function which reach a label.
     Note that this is the only case applicable for a jump. *)
  | tal_base_not_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        (as_classifier S start_status cl_jump ∨
         as_classifier S start_status cl_other) →
        as_costed S final_status →
          trace_any_label S doesnt_end_with_ret start_status final_status
  | tal_base_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        as_classifier S start_status cl_return →
          trace_any_label S ends_with_ret start_status final_status
  (* A call followed by a label on return. *)
  | tal_base_call:
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_final →
          trace_label_return S status_start_fun_call status_final →
          as_costed S status_final →
            trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
  (* A call followed by a non-empty trace. *)
  | tal_step_call:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_after_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_after_fun_call →
          trace_label_return S status_start_fun_call status_after_fun_call →
          ¬ as_costed S status_after_fun_call →
          trace_any_label S end_flag status_after_fun_call status_final →
            trace_any_label S end_flag status_pre_fun_call status_final
  | tal_step_default:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre: S.
      ∀status_init: S.
      ∀status_end: S.
        as_execute S status_pre status_init →
        trace_any_label S end_flag status_init status_end →
        as_classifier S status_pre cl_other →
        ¬ (as_costed S status_init) →
          trace_any_label S end_flag status_pre status_end.
A \texttt{trace\_label\_return} is isomorphic to a list of
\texttt{trace\_label\_label}s that ends with a cost emission followed by a
return-terminated \texttt{trace\_label\_label}.
The interesting cases are those of $\texttt{trace\_any\_label}~b~s_1~s_2$.
A \texttt{trace\_any\_label} is a sequence of steps built by a syntax directed
definition on the classification of $s_1$. The constructors of the datatype
enforce several invariants that are meant to give a structure to the
otherwise unstructured execution. In particular, the following invariants are
granted:
\begin{itemize}
 \item the trace is never empty; it ends with a return iff $b$ is
       true;
 \item a jump must always be the last instruction of the trace, and it must
       be followed by a cost emission statement; i.e. the target of a jump
       is always the beginning of a new basic block and as such it must start
       with a cost emission statement;
 \item a cost emission statement can never occur inside the trace, only in
       the status immediately after;
 \item the trace for a function call step is made of a subtrace for the
       function body of type
       $\texttt{trace\_label\_return}~s_1~s_2$, possibly followed by the
       rest of the trace for this basic block. The subtrace represents the
       function execution. Being an inductive datum, it guarantees that
       the function call terminates. The status $s_2$ is the one that follows the return
       statement. The next instruction of $s_2$ must follow the function call
       instruction. As a consequence, function calls are also well nested.
\end{itemize}
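A simplified subset of these invariants can be cross-checked on flat runs with a small executable sketch. The encoding below is ours, for illustration only: a run is a list of events, and the checker verifies well nesting of calls/returns, that every function body starts with a cost emission, and that every jump lands on one.

```python
def check_structured(events):
    """Check a flat run against (a subset of) the structural invariants."""
    depth = 0
    for i, ev in enumerate(events):
        nxt = events[i + 1][0] if i + 1 < len(events) else None
        if ev[0] == "call":
            depth += 1
            if nxt != "emit":      # function bodies start with an emission
                return False
        elif ev[0] == "ret":
            if depth == 0:         # return with no pending call
                return False
            depth -= 1
        elif ev[0] == "jump":
            if nxt != "emit":      # jump targets begin a basic block
                return False
    return depth == 0              # every call eventually returned

ok = [("emit", "L1"), ("instr",), ("call",),
      ("emit", "L2"), ("instr",), ("ret",), ("instr",)]
assert check_structured(ok)

# a function body that does not start with an emission is rejected,
# and so is a jump whose target is not labelled
assert not check_structured([("emit", "L1"), ("call",), ("instr",)])
assert not check_structured([("emit", "L1"), ("jump",), ("instr",)])
```

The inductive definition above enforces these properties by construction rather than by a posteriori checking, which is what makes them available as hypotheses during the simulation proof.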
The three mutually recursive structural functions \texttt{flatten\_trace\_label\_return}, \texttt{flatten\_trace\_label\_label} and \texttt{flatten\_trace\_any\_label}
allow one to extract from a structured trace the list of states whose next
instruction is a cost emission statement. We only show here the type of one
of them:
\begin{alltt}
 \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
  trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
\end{alltt}
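What the flattening functions compute can be pictured in plain Python. The nested-list encoding below is a hypothetical model of ours: a \texttt{trace\_label\_return} is a list of blocks, each a pair of a label and its steps, where a step may carry the trace of a callee.

```python
def flatten(trace):
    """Cost labels traversed by the trace, in execution order,
    descending into the traces of called functions."""
    labels = []
    for label, steps in trace:
        labels.append(label)
        for step in steps:
            if isinstance(step, tuple) and step[0] == "call":
                labels.extend(flatten(step[1]))
    return labels

inner = [("L3", ["instr"])]                   # callee body: one block
outer = [("L1", ["instr", ("call", inner)]),  # call inside the first block
         ("L2", ["instr", "ret"])]

# the callee's labels appear between those of the caller's blocks
assert flatten(outer) == ["L1", "L3", "L2"]
```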
\paragraph{Cost prediction on structured traces}

The first main theorem of CerCo about traces
(theorem \texttt{compute\_max\_trace\_label\_return\_cost\_ok\_with\_trace})
holds for the instantiation
of the structured traces to the concrete status of object code programs.
Simplifying a bit, it states that
\begin{equation}\begin{array}{l}\forall s_1,s_2. \forall \tau: \texttt{trace\_label\_return}~s_1~s_2.\\~~
  \texttt{clock}~s_2 = \texttt{clock}~s_1 + \Sigma_{s \in (\texttt{flatten\_trace\_label\_return}~\tau)}\;k(\mathcal{L}(s))
\end{array}\label{th1}\end{equation}
573where $\mathcal{L}$ maps a labelled state to its emitted label, and the
574cost model $k$ is statically computed from the object code
575by associating to each label \texttt{L} the sum of the cost of the instructions
576in the basic block that starts at \texttt{L} and ends before the next labelled
577instruction. The theorem is proved by structural induction over the structured
trace, and is based on the following invariant: if
the function that computes the cost model has analysed the instruction
to be executed at $s_2$ after the one to be executed at $s_1$, and if
the structured trace starts with $s_1$, then it will eventually also contain
$s_2$. When $s_1$ is not a function call, the result holds trivially because
of the $(\texttt{as\_execute}~s_1~s_2)$ condition obtained by inversion on
the trace. The only non-trivial case is that of function calls: the cost model
computation function recurs on the first instruction that follows the function call; the
\texttt{as\_after\_return} condition of the \texttt{tal\_base\_call} and
\texttt{tal\_step\_call} constructors guarantees exactly that the execution will eventually reach
this state.
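The block-summing computation of the cost model $k$ can be conveyed by a small sketch outside the formalisation. The instruction list, labels and costs below are invented for illustration; only the idea of charging each instruction to the basic block opened by the last label is taken from the text above.

```python
# Hypothetical sketch (not part of the Matita formalisation): compute the
# cost model k by summing, for each label L, the costs of the instructions
# in the basic block that starts at L and ends before the next label.

def compute_cost_model(instructions):
    """instructions: list of (label_or_None, cost) pairs in program order."""
    k = {}
    current = None
    for label, cost in instructions:
        if label is not None:
            current = label          # a new basic block starts here
            k[current] = 0
        if current is not None:
            k[current] += cost       # charge the cost to the current block
    return k

# A toy object program: two labelled blocks of costs 2+3 and 1+4+1.
obj_code = [("L1", 2), (None, 3), ("L2", 1), (None, 4), (None, 1)]
k = compute_cost_model(obj_code)
# A run whose flattened trace emits L1, L2, L1 is then predicted to cost
# k["L1"] + k["L2"] + k["L1"].
```

In the formal development this computation is performed once, statically, on the object code; the theorem above then equates the predicted sum with the actual clock difference.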
591\paragraph{Structured traces similarity and cost prediction invariance}
593A compiler pass maps source to object code and initial states to initial
594states. The source code and initial state uniquely determine the structured
trace of a program, if it exists. The structured trace fails to exist iff
596the structural conditions are violated by the program execution (e.g. a function
597body does not start with a cost emission statement). Let us assume that the
598target structured trace exists.
600What is the relation between the source and target structured traces?
601In general, the two traces can be arbitrarily different. However, we are
interested only in those compiler passes that map a trace $\tau_1$ to a trace
603$\tau_2$ such that
604\begin{equation}\texttt{flatten\_trace\_label\_return}~\tau_1 = \texttt{flatten\_trace\_label\_return}~\tau_2\label{condition1}\label{th2}\end{equation}
605The reason is that the combination of~\ref{th1} with~\ref{th2} yields the
608\forall s_1,s_2. \forall \tau: \texttt{trace\_label\_return}~s_1~s_2.\\~~~~~\;
609  clock~s_2 - clock~s_1\\~ = \Sigma_{s \in (\texttt{flatten\_trace\_label\_return}~\tau_1)}\;k(\mathcal{L}(s))\\~ = \Sigma_{s \in (\texttt{flatten\_trace\_label\_return}~\tau_2)}\;k(\mathcal{L}(s))
611This corollary states that the actual execution time of the program can be computed equally well on the source or target language. Thus it becomes possible to
612transfer the cost model from the target to the source code and reason on the
613source code only.
We are therefore interested in conditions stronger than~\ref{condition1}.
Hence we introduce here a similarity relation between traces with
617the same structure. Theorem~\texttt{tlr\_rel\_to\_traces\_same\_flatten}
618in the Matita formalisation shows that~\ref{th2} holds for every pair
619$(\tau_1,\tau_2)$ of similar traces.
621Intuitively, two traces are similar when one can be obtained from
622the other by erasing or inserting silent steps, i.e. states that are
623not \texttt{as\_costed} and that are classified as \texttt{other}.
624Silent steps do not alter the structure of the traces.
625In particular,
the relation maps function calls to calls of the same function,
627label emission statements to emissions of the same label, concatenation of
628subtraces to concatenation of subtraces of the same length and starting with
629the same emission statement, etc.
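To convey the intuition only, here is a toy sketch of ours, not the formal definition: traces are flat lists of events, and similarity degenerates to equality of observables after erasing silent steps.

```python
# Toy illustration of the similarity intuition: erasing or inserting silent
# steps (uncosted states classified as `other`) leaves the observables
# unchanged. The event encoding below is ours, not the formalisation's.

def observables(trace):
    """Keep label emissions, calls and returns; drop silent steps."""
    return [e for e in trace if e[0] != "silent"]

t1 = [("emit", "L1"), ("call", "f"), ("emit", "L2"), ("ret",)]
t2 = [("emit", "L1"), ("silent",), ("call", "f"),
      ("silent",), ("emit", "L2"), ("ret",)]
# t2 only inserts silent steps, so the two traces are "similar":
same = observables(t1) == observables(t2)
```

The real relations are necessarily more delicate: they must also preserve the nesting structure of function calls, which a flat list of events cannot express.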
631In the formalisation the three similarity relations --- one for each trace
632kind --- are defined by structural recursion on the first trace and pattern
633matching over the second. Here we turn
634the definition into inference rules for the sake of readability. We also
635omit from trace constructors all arguments, but those that are traces or that
636are used in the premises of the rules.
639 {\texttt{tll\_rel}~tll_1~tll_2
640 }
641 {\texttt{tlr\_rel}~(\texttt{tlr\_base}~tll_1)~(\texttt{tlr\_base}~tll_2)}
644 {\texttt{tll\_rel}~tll_1~tll_2 \andalso
645  \texttt{tlr\_rel}~tlr_1~tlr_2
646 }
647 {\texttt{tlr\_rel}~(\texttt{tlr\_step}~tll_1~tlr_1)~(\texttt{tlr\_step}~tll_2~tlr_2)}
650 {\texttt{as\_label}~H_1 = \texttt{as\_label}~H_2 \andalso
651  \texttt{tal\_rel}~tal_1~tal_2
652 }
653 {\texttt{tll\_rel}~(\texttt{tll\_base}~tal_1~H_1)~(\texttt{tll\_base}~tal_2~H_2)}
656 {}
 {\texttt{tal\_rel}~\texttt{tal\_base\_not\_return}~(taa @ \texttt{tal\_base\_not\_return})}
660 {}
 {\texttt{tal\_rel}~\texttt{tal\_base\_return}~(taa @ \texttt{tal\_base\_return})}
664 {\texttt{tlr\_rel}~tlr_1~tlr_2 \andalso
665  \texttt{as\_call\_ident}~H_1 = \texttt{as\_call\_ident}~H_2
666 }
667 {\texttt{tal\_rel}~(\texttt{tal\_base\_call}~H_1~tlr_1)~(taa @ \texttt{tal\_base\_call}~H_2~tlr_2)}
670 {\texttt{tlr\_rel}~tlr_1~tlr_2 \andalso
671  \texttt{as\_call\_ident}~H_1 = \texttt{as\_call\_ident}~H_2 \andalso
672  \texttt{tal\_collapsable}~tal_2
673 }
 {\texttt{tal\_rel}~(\texttt{tal\_base\_call}~H_1~tlr_1)~(taa @ \texttt{tal\_step\_call}~H_2~tlr_2~tal_2)}
677 {\texttt{tlr\_rel}~tlr_1~tlr_2 \andalso
678  \texttt{as\_call\_ident}~H_1 = \texttt{as\_call\_ident}~H_2 \andalso
679  \texttt{tal\_collapsable}~tal_1
680 }
 {\texttt{tal\_rel}~(\texttt{tal\_step\_call}~H_1~tlr_1~tal_1)~(taa @ \texttt{tal\_base\_call}~H_2~tlr_2)}
684 {\texttt{tlr\_rel}~tlr_1~tlr_2 \andalso
685  \texttt{tal\_rel}~tal_1~tal_2 \andalso
686  \texttt{as\_call\_ident}~H_1 = \texttt{as\_call\_ident}~H_2
687 }
 {\texttt{tal\_rel}~(\texttt{tal\_step\_call}~H_1~tlr_1~tal_1)~(taa @ \texttt{tal\_step\_call}~H_2~tlr_2~tal_2)}
691 {\texttt{tal\_rel}~tal_1~tal_2
692 }
693 {\texttt{tal\_rel}~(\texttt{tal\_step\_default}~tal_1)~tal_2}
696let rec tlr_rel S1 st1 st1' S2 st2 st2'
697  (tlr1 : trace_label_return S1 st1 st1')
698  (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
699match tlr1 with
700  [ tlr_base st1 st1' tll1 ⇒
701    match tlr2 with
702    [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
703    | _ ⇒ False
704    ]
705  | tlr_step st1 st1' st1'' tll1 tl1 ⇒
706    match tlr2 with
707    [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
708      tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
709    | _ ⇒ False
710    ]
711  ]
712and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
713 (tll1 : trace_label_label S1 fl1 st1 st1')
714 (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
715  match tll1 with
716  [ tll_base fl1 st1 st1' tal1 H ⇒
717    match tll2 with
    [ tll_base fl2 st2 st2' tal2 G ⇒
719      as_label_safe … («?, H») = as_label_safe … («?, G») ∧
720      tal_rel … tal1 tal2
721    ]
722  ]
723and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
724 (tal1 : trace_any_label S1 fl1 st1 st1')
725 (tal2 : trace_any_label S2 fl2 st2 st2')
726   on tal1 : Prop ≝
727  match tal1 with
728  [ tal_base_not_return st1 st1' _ _ _ ⇒
729    fl2 = doesnt_end_with_ret ∧
730    ∃st2mid,taa,H,G,K.
731    tal2 ≃ taa_append_tal ? st2 ??? taa
732      (tal_base_not_return ? st2mid st2' H G K)
733  | tal_base_return st1 st1' _ _ ⇒
734    fl2 = ends_with_ret ∧
735    ∃st2mid,taa,H,G.
736    tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
737      (tal_base_return ? st2mid st2' H G)
738  | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
739    fl2 = doesnt_end_with_ret ∧
740    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
741    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
742    (* we must allow a tal_base_call to be similar to a call followed
743      by a collapsable trace (trace_any_any followed by a base_not_return;
744      we cannot use trace_any_any as it disallows labels in the end as soon
745      as it is non-empty) *)
746    (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
747      tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
748    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
749    ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
750      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
751      tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
752  | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
753    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
754    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
755    (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
756      tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
757      tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
758    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
759    ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
760      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
761      tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
762  | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
763    tal_rel … tl1 tal2 (* <- this makes it many to many *)
764  ].
768In the preceding rules, a $taa$ is an inhabitant of the
769$\texttt{trace\_any\_any}~s_1~s_2$ inductive data type whose definition
is omitted from the paper for lack of space. It is the type of valid
771prefixes (even empty ones) of \texttt{trace\_any\_label}s that do not contain
772any function call. Therefore it
773is possible to concatenate (using ``$@$'') a \texttt{trace\_any\_any} to the
774left of a \texttt{trace\_any\_label}. A \texttt{trace\_any\_any} captures
775a sequence of silent moves.
777The \texttt{tal\_collapsable} unary predicate over \texttt{trace\_any\_label}s
778holds when the argument does not contain any function call and it ends
779with a label (not a return). The intuition is that after a function call we
780can still perform a sequence of silent actions while remaining similar.
782\section{Forward simulation}
785We summarise here the results of the previous sections. Each intermediate
unstructured language can be given a semantics based on structured traces
that singles out those runs that respect a certain number of invariants.
788A cost model can be computed on the object code and it can be used to predict
789the execution costs of runs that produce structured traces. The cost model
790can be lifted from the target to the source code of a pass if the pass maps
791structured traces to similar structured traces. The latter property is called
792a \emph{forward simulation}.
794As for labelled transition systems, in order to establish the forward
795simulation we are interested in (preservation of observables), we are
796forced to prove a stronger notion of forward simulation that introduces
797an explicit relation between states. The classical notion of a 1-to-0-or-many
798forward simulation is the existence of a relation $R$ over states such that
799if $s_1 R s_2$ and $s_1 \to^1 s_1'$ then there exists an $s_2'$ such that
800$s_2 \to^* s_2'$ and $s_1' R s_2'$. In our context, we need to replace the
801one and multi step transition relations $\to^n$ with the existence of
802a structured trace between the two states, and we need to add the request that
803the two structured traces are similar. Thus what we would like to state is
804something like:\\
805for all $s_1,s_2,s_1'$ such that there is a $\tau_1$ from
806$s_1$ to $s_1'$ and $s_1 R s_2$ there exists an $s_2'$ such that
807$s_1' R s_2'$ and a $\tau_2$ from $s_2$ to $s_2'$ such that
808$\tau_1$ is similar to $\tau_2$. We call this particular form of forward
809simulation \emph{trace reconstruction}.
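On finite transition systems the classical 1-to-0-or-many condition can even be checked mechanically. The following sketch, with invented toy systems, is only meant to make the quantifier structure of the definition concrete; it does not model structured traces or their similarity.

```python
from itertools import product

def is_forward_simulation(R, step1, step2, states1, states2):
    """Classical 1-to-0-or-many condition on finite transition systems:
    whenever s1 R s2 and s1 steps to s1', there must exist an s2'
    reachable from s2 in zero or more steps with s1' R s2'."""
    def reachable(s):                     # reflexive-transitive closure of step2
        seen, todo = {s}, [s]
        while todo:
            x = todo.pop()
            for y in step2(x):
                if y not in seen:
                    seen.add(y)
                    todo.append(y)
        return seen
    return all(
        any((s1p, s2p) in R for s2p in reachable(s2))
        for s1, s2 in product(states1, states2) if (s1, s2) in R
        for s1p in step1(s1)
    )

# Source: a -> b.  Target: A -> X -> B, where X is an extra silent state.
step1 = lambda s: {"a": ["b"], "b": []}[s]
step2 = lambda s: {"A": ["X"], "X": ["B"], "B": []}[s]
R = {("a", "A"), ("b", "B")}
ok = is_forward_simulation(R, step1, step2, ["a", "b"], ["A", "X", "B"])
# ok holds: the single source step is simulated by two target steps
```

Trace reconstruction strengthens exactly this statement: the multi-step target run is replaced by a structured trace required to be similar to the source one.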
811The statement just introduced, however, is too simplistic and not provable
812in the general case. To understand why, consider the case of a function call
813and the pass that fixes the parameter passing conventions. A function
call in the source code takes as input an arbitrary number of pseudo-registers (the actual parameters to pass) and returns an arbitrary number of pseudo-registers (where the result is stored). A function call in the target language has neither
input nor output parameters. The pass must add explicit code before and after
816the function call to move the pseudo-registers content from/to the hardware
817registers or the stack in order to implement the parameter passing strategy.
818Similarly, each function body must be augmented with a preamble and a postamble
819to complete/initiate the parameter passing strategy for the call/return phase.
820Therefore what used to be a call followed by the next instruction to execute
821after the function return, now becomes a sequence of instructions, followed by
822a call, followed by another sequence. The two states at the beginning of the
823first sequence and at the end of the second sequence are in relation with
the status before/after the call in the source code, as in a usual forward
simulation. How can we prove, however, the additional condition for function calls,
namely that when the function returns, execution resumes at the instruction immediately after the
function call? To guarantee this invariant, there must be another relation
between the addresses of the function call in the source and in the target code.
829This additional relation is to be used in particular to relate the two stacks.
831Another example is given by preservation of code emission statements. A single
832code emission instruction can be simulated by a sequence of steps, followed
833by a code emission, followed by another sequence. Clearly the initial and final
834statuses of the sequence are to be in relation with the status before/after the
835code emission in the source code. In order to preserve the structured traces
836invariants, however, we must consider a second relation between states that
837traces the preservation of the code emission statement.
839Therefore we now introduce an abstract notion of relation set between abstract
840statuses and an abstract notion of 1-to-0-or-many forward simulation conditions.
841These two definitions enjoy the following remarkable properties:
843 \item they are generic enough to accommodate all passes of the CerCo compiler
 \item the conjunction of the 1-to-0-or-many forward simulation conditions is
       just slightly stricter than the statement of a 1-to-0-or-many forward
       simulation in the classical case. In particular, they only require
847       the construction of very simple forms of structured traces made of
848       silent states only.
 \item they allow us to prove the main result of the paper: the 1-to-0-or-many
       forward simulation conditions are sufficient to prove the trace
       reconstruction theorem.
Point 3 is the important one. First of all it means that we have reduced
the complex problem of trace reconstruction to a much simpler one that,
moreover, can be solved with slight adaptations of the forward simulation proof
that is performed for a compiler that only cares about functional properties.
Therefore we have successfully separated as much as possible the proof of
preservation of functional properties from that of non-functional ones.
860Secondly, combined with the results in the previous section, it implies
861that the cost model can be computed on the object code and lifted to the
862source code to reason on non-functional properties, assuming that
863the 1-to-0-or-many forward simulation conditions are fulfilled for every
864compiler pass.
866\paragraph{Relation sets}
868We introduce now the four relations $\mathcal{S,C,L,R}$ between abstract
statuses that are used to correlate the corresponding statuses before and
870after a compiler pass. The first two are abstract and must be instantiated
871by every pass. The remaining two are derived relations.
873The $\mathcal{S}$ relation between states is the classical relation used
874in forward simulation proofs. It correlates the data of the status
875(e.g. registers, memory, etc.).
The $\mathcal{C}$ relation correlates call states. It allows us to track the
position in the target code of every call in the source code.
880The $\mathcal{L}$ relation simply says that the two states are both label
emitting states that emit the same label. It allows us to track the position in
882the target code of every cost emitting statement in the source code.
Finally, the $\mathcal{R}$ relation is the most complex one. Two states
$s_1$ and $s_2$ are $\mathcal{R}$-correlated if, whenever $s_1$ is the
successor of a call state that is $\mathcal{C}$-related to a call state
$s_2'$ in the target code, then $s_2$ is the successor of $s_2'$.
888We will require all pairs of states that follow a related call to be
$\mathcal{R}$-related. This is the fundamental requirement to guarantee
890that the target trace is well structured, i.e. that function calls are well
891nested and always return where they are supposed to.
894record status_rel (S1,S2 : abstract_status) : Type[1] := \{ 
895  \(\mathcal{S}\): S1 \(\to\) S2 \(\to\) Prop;
896  \(\mathcal{C}\): (\(\Sigma\)s.as_classifier S1 s cl_call) \(\to\)
897     (\(\Sigma\)s.as_classifier S2 s cl_call) \(\to\) Prop \}.
899definition \(\mathcal{L}\) S1 S2 st1 st2 := as_label S1 st1 = as_label S2 st2.
901definition \(\mathcal{R}\) S1 S2 (R: status_rel S1 S2) s1_ret s2_ret ≝
902 \(\forall\)s1_pre,s2_pre.
  as_after_return s1_pre s1_ret \(\to\) s1_pre \(\mathcal{C}\) s2_pre \(\to\)
904   as_after_return s2_pre s2_ret.
907\paragraph{1-to-0-or-many forward simulation conditions}
908\begin{condition}[Cases \texttt{cl\_other} and \texttt{cl\_jump}]
 For all $s_1,s_1',s_2$ such that $s_1 \mathcal{S} s_2$, and
 $\texttt{as\_execute}~s_1~s_1'$, and either $\texttt{as\_classify}~s_1 = \texttt{cl\_other}$, or $\texttt{as\_classify}~s_1 = \texttt{cl\_jump}$ and
$\texttt{as\_costed}~s_1'$, there exists an $s_2'$ and a $\texttt{trace\_any\_any\_free}~s_2~s_2'$ called $taaf$ such that $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and either
$taaf$ is non empty, or at least one among $s_1$ and $s_1'$ is not \texttt{as\_costed}.
915In the above condition, a $\texttt{trace\_any\_any\_free}~s_1~s_2$ is an
916inductive type of structured traces that do not contain function calls or
cost emission statements. Unlike a \texttt{trace\_any\_any}, the
instruction to be executed in the lookahead state $s_2$ may be a cost emission
statement.
921The intuition of the condition is that one step can be replaced with zero or more steps if it
922preserves the relation between the data and if the two final statuses are
923labelled in the same way. Moreover, we must take special care of the empty case
924to avoid collapsing two consecutive states that emit the same label to just
925one state, missing one of the two emissions.
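A toy illustration, with invented states and labels, of why the empty case needs this special care:

```python
# Toy illustration (invented states/labels) of the empty-trace corner case:
# if one source step between two label-emitting states were simulated by the
# empty target trace, a single target state would have to account for two
# label emissions, so the flattened traces could no longer agree.

def emitted(labels):
    """Labels emitted along a run; None marks a silent state."""
    return [l for l in labels if l is not None]

source_steps = ["L1", "L2"]   # s1 emits L1, its successor s1' emits L2
target_zero = ["L1"]          # empty simulating trace: s2' = s2, one state
assert emitted(source_steps) != emitted(target_zero)
```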
927\begin{condition}[Case \texttt{cl\_call}]
 For all $s_1,s_1',s_2$ s.t. $s_1 \mathcal{S} s_2$ and
 $\texttt{as\_execute}~s_1~s_1'$ and $\texttt{as\_classify}~s_1 = \texttt{cl\_call}$, there exist $s_2', s_b, s_a$, a
$\texttt{trace\_any\_any}~s_2~s_b$, and a
$\texttt{trace\_any\_any\_free}~s_a~s_2'$ such that:
$s_b$ is classified as a \texttt{cl\_call}, the \texttt{as\_call\_ident}s of
the two call states are the same, $s_1 \mathcal{C} s_b$,
$\texttt{as\_execute}~s_b~s_a$ holds, $s_1' \mathcal{L} s_a$ and
$s_1' \mathcal{S} s_2'$.
938The condition says that, to simulate a function call, we can perform a
939sequence of silent actions before and after the function call itself.
940The old and new call states must be $\mathcal{C}$-related, the old and new
941states at the beginning of the function execution must be $\mathcal{L}$-related
942and, finally, the two initial and final states must be $\mathcal{S}$-related
943as usual.
945\begin{condition}[Case \texttt{cl\_return}]
 For all $s_1,s_1',s_2$ s.t. $s_1 \mathcal{S} s_2$,
 $\texttt{as\_execute}~s_1~s_1'$ and $\texttt{as\_classify}~s_1 = \texttt{cl\_return}$, there exist $s_2', s_b, s_a$, a
$\texttt{trace\_any\_any}~s_2~s_b$, and a
$\texttt{trace\_any\_any\_free}~s_a~s_2'$ called $taaf$ such that:
$s_b$ is classified as a \texttt{cl\_return},
the predicate
$\texttt{as\_execute}~s_b~s_a$ holds,
$s_1' \mathcal{R} s_a$,
$s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and, if
$taaf$ is non empty, then $s_a$ is not \texttt{as\_costed}.
958Similarly to the call condition, to simulate a return we can perform a
959sequence of silent actions before and after the return statement itself.
The old and the new states after the return must be $\mathcal{R}$-related,
to guarantee that they returned to corresponding calls.
The two initial and final states must be $\mathcal{S}$-related
as usual and, moreover, they must exhibit the same labels. Finally, when
the suffix is non empty we must take care not to insert a new
unmatched cost emission statement just after the return statement.
969definition status_simulation ≝
970  λS1 : abstract_status.
971  λS2 : abstract_status.
972  λsim_status_rel : status_rel S1 S2.
973    ∀st1,st1',st2.as_execute S1 st1 st1' →
974    sim_status_rel st1 st2 →
975    match as_classify … st1 with
976    [ None ⇒ True
977    | Some cl ⇒
978      match cl with
979      [ cl_call ⇒ ∀prf.
980        (*
981             st1' ------------S----------\
982              ↑ \                         \
983             st1 \--L--\                   \
984              | \       \                   \
985              S  \-C-\  st2_after_call →taa→ st2'
986              |       \     ↑
987             st2 →taa→ st2_pre_call
988        *)
989        ∃st2_pre_call.
990        as_call_ident ? st2_pre_call = as_call_ident ? («st1, prf») ∧
991        call_rel ?? sim_status_rel «st1, prf» st2_pre_call ∧
992        ∃st2_after_call,st2'.
993        ∃taa2 : trace_any_any … st2 st2_pre_call.
994        ∃taa2' : trace_any_any … st2_after_call st2'.
995        as_execute … st2_pre_call st2_after_call ∧
996        sim_status_rel st1' st2' ∧
997        label_rel … st1' st2_after_call
998      | cl_return ⇒
999        (*
1000             st1
1001            / ↓
1002           | st1'----------S,L------------\
1003           S   \                           \
1004            \   \-----R-------\            |     
1005             \                 |           |
1006             st2 →taa→ st2_ret |           |
1007                          ↓   /            |
1008                     st2_after_ret →taaf→ st2'
1010           we also ask that st2_after_ret be not labelled if the taaf tail is
1011           not empty
1012        *)
1013        ∃st2_ret,st2_after_ret,st2'.
1014        ∃taa2 : trace_any_any … st2 st2_ret.
1015        ∃taa2' : trace_any_any_free … st2_after_ret st2'.
1016        (if taaf_non_empty … taa2' then ¬as_costed … st2_after_ret else True) ∧
1017        as_classifier … st2_ret cl_return ∧
1018        as_execute … st2_ret st2_after_ret ∧ sim_status_rel st1' st2' ∧
1019        ret_rel … sim_status_rel st1' st2_after_ret ∧
1020        label_rel … st1' st2'
1021      | cl_other ⇒
1022          (*         
1023          st1 → st1'
1024            |      \
1025            S      S,L
1026            |        \
1027           st2 →taaf→ st2'
1029           the taaf can be empty (e.g. tunneling) but we ask it must not be the
1030           case when both st1 and st1' are labelled (we would be able to collapse
1031           labels otherwise)
1032         *)
1033        ∃st2'.
1034        ∃taa2 : trace_any_any_free … st2 st2'.
1035        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
1036        sim_status_rel st1' st2' ∧
1037        label_rel … st1' st2'
1038      | cl_jump ⇒
1039        (* just like cl_other, but with a hypothesis more *)
1040        as_costed … st1' →
1041        ∃st2'.
1042        ∃taa2 : trace_any_any_free … st2 st2'.
1043        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
1044        sim_status_rel st1' st2' ∧
1045        label_rel … st1' st2'
1046      ]
1047    ].
\paragraph{Main result: the 1-to-0-or-many forward simulation conditions
are sufficient for trace reconstruction}
1054Let us assume that a relation set is given such that the 1-to-0-or-many
1055forward simulation conditions are satisfied. Under this assumption we
1056can prove the following three trace reconstruction theorems by mutual
1057structural induction over the traces given in input between the
1058$s_1$ and $s_1'$ states.
1060In particular, the \texttt{status\_simulation\_produce\_tlr} theorem
1061applied to the \texttt{main} function of the program and equal
1062$s_{2_b}$ and $s_2$ states shows that, for every initial state in the
1063source code that induces a structured trace in the source code,
1064the compiled code produces a similar structured trace.
1067For every $s_1,s_1',s_{2_b},s_2$ s.t.
1068there is a $\texttt{trace\_label\_return}~s_1~s_1'$ called $tlr_1$ and a
1069$\texttt{trace\_any\_any}~s_{2_b}~s_2$ and $s_1 \mathcal{L} s_{2_b}$ and
1070$s_1 \mathcal{S} s_2$, there exists $s_{2_m},s_2'$ s.t.
1071there is a $\texttt{trace\_label\_return}~s_{2_b}~s_{2_m}$ called $tlr_2$ and
1072there is a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taaf$
1073s.t. if $taaf$ is non empty then $\lnot (\texttt{as\_costed}~s_{2_m})$,
1074and $\texttt{tlr\_rel}~tlr_1~tlr_2$
1075and $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1076$s_1' \mathcal{R} s_{2_m}$.
1079The theorem states that a \texttt{trace\_label\_return} in the source code
1080together with a precomputed preamble of silent states
1081(the \texttt{trace\_any\_any}) in the target code induces a
1082similar \texttt{trace\_label\_return} trace in the target code which can be
1083followed by a sequence of silent states. Note that the statement does not
1084require the produced \texttt{trace\_label\_return} trace to start with the
1085precomputed preamble, even if this is likely to be the case in concrete
1086implementations. The preamble in input is necessary for compositionality, e.g.
because the 1-to-0-or-many forward simulation conditions allow, in the
case of function calls, the execution of a preamble of silent instructions just after
the function call.
1092For every $s_1,s_1',s_{2_b},s_2$ s.t.
1093there is a $\texttt{trace\_label\_label}~b~s_1~s_1'$ called $tll_1$ and a
1094$\texttt{trace\_any\_any}~s_{2_b}~s_2$ and $s_1 \mathcal{L} s_{2_b}$ and
$s_1 \mathcal{S} s_2$, the following holds:
1097 \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
1098       and a trace $\texttt{trace\_label\_label}~b~s_{2_b}~s_{2_m}$ called $tll_2$
1099       and a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taa_2$ s.t.
1100       $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1101       $s_1' \mathcal{R} s_{2_m}$ and
1102       $\texttt{tll\_rel}~tll_1~tll_2$ and
1103       if $taa_2$ is non empty then $\lnot (\texttt{as\_costed}~s_{2_m})$
1104 \item else there exists $s_2'$ and a
1105       $\texttt{trace\_label\_label}~b~s_{2_b}~s_2'$ called $tll_2$ such that
1106       $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1107       $\texttt{tll\_rel}~tll_1~tll_2$.
1111The statement is similar to the previous one: a source
1112\texttt{trace\_label\_label} and a given target preamble of silent states
1113in the target code induce a similar \texttt{trace\_label\_label} in the
1114target code, possibly followed by a sequence of silent moves that become the
1115preamble for the next \texttt{trace\_label\_label} translation.
1118For every $s_1,s_1',s_2$ s.t.
1119there is a $\texttt{trace\_any\_label}~b~s_1~s_1'$ called $tal_1$ and
1120$s_1 \mathcal{S} s_2$
1122 \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
1123   and a trace $\texttt{trace\_any\_label}~b~s_2~s_{2_m}$ called $tal_2$ and a
1124   $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taa_2$ s.t.
1125   $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1126   $s_1' \mathcal{R} s_{2_m}$ and
1127   $\texttt{tal\_rel}~tal_1~tal_2$ and
1128   if $taa_2$ is non empty then $\lnot (\texttt{as\_costed}~s_{2_m})$
1129 \item else there exists $s_2'$ and a
1130   $\texttt{trace\_any\_label}~b~s_2~s_2'$ called $tal_2$ such that
1131   either $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1132       $\texttt{tal\_rel}~tal_1~tal_2$
1133   or $s_1' (\mathcal{S} \cap \mathcal{L}) s_2$ and
1134   $\texttt{tal\_collapsable}~tal_1$ and $\lnot (\texttt{as\_costed}~s_1)$
The statement is similar to the previous ones, except for the lack of
the target code preamble.
1143For every $s_1,s_1',s_2$ s.t.
1144there is a $\texttt{trace\_label\_return}~s_1~s_1'$ called $tlr_1$ and
1145$s_1 (\mathcal{L} \cap \mathcal{S}) s_2$
1146there exists $s_{2_m},s_2'$ s.t.
1147there is a $\texttt{trace\_label\_return}~s_2~s_{2_m}$ called $tlr_2$ and
1148there is a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taaf$
1149s.t. if $taaf$ is non empty then $\lnot (\texttt{as\_costed}~s_{2_m})$,
1150and $\texttt{tlr\_rel}~tlr_1~tlr_2$
1151and $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1152$s_1' \mathcal{R} s_{2_m}$.
1158status_simulation_produce_tlr S1 S2 R
1159(* we start from this situation
1160     st1 →→→→tlr→→→→ st1'
1161      | \
1162      L  \---S--\
1163      |          \
1164   st2_lab →taa→ st2   (the taa preamble is in general either empty or given
1165                        by the preceding call)
1167   and we produce
1168     st1 →→→→tlr→→→→ st1'
1169             \\      /  \
1170             //     R    \-L,S-\
1171             \\     |           \
1172   st2_lab →tlr→ st2_mid →taaf→ st2'
1174  st1 st1' st2_lab st2
1175  (tlr1 : trace_label_return S1 st1 st1')
1176  (taa2_pre : trace_any_any S2 st2_lab st2)
1177  (sim_execute : status_simulation S1 S2 R)
1178  on tlr1 : R st1 st2 → label_rel … st1 st2_lab →
1179  ∃st2_mid.∃st2'.
1180  ∃tlr2 : trace_label_return S2 st2_lab st2_mid.
1181  ∃taa2 : trace_any_any_free … st2_mid st2'.
1182  (if taaf_non_empty … taa2 then ¬as_costed … st2_mid else True) ∧
1183  R st1' st2' ∧ ret_rel … R st1' st2_mid ∧ label_rel … st1' st2' ∧
1184  tlr_rel … tlr1 tlr2
\section{Conclusions and future work}
1190The labelling approach is a technique to implement compilers that induce on
1191the source code a non uniform cost model determined from the object code
1192produced. The cost model assigns a cost to each basic block of the program.
1193The main theorem of the approach says that there is an exact
1194correspondence between the sequence of basic blocks started in the source
1195and object code, and that no instruction in the source or object code is
1196executed outside a basic block. Thus the cost of object code execution
1197can be computed precisely on the source.
1199In this paper we scale the labelling approach to cover a programming language
1200with function calls. This introduces new difficulties only when the language
1201is unstructured, i.e. it allows function calls to return anywhere in the code,
1202destroying the hope of a static prediction of the cost of basic blocks.
1203We restore static predictability by introducing a new semantics for unstructured
programs that singles out well structured executions. The latter are represented
1205by structured traces, a generalisation of streams of observables that capture
1206several structural invariants of the execution, like well nesting of functions
1207or the fact that every basic block must start with a code emission statement.
1208We show that structured traces are sufficiently structured to statically compute
1209a precise cost model on the object code.
1211We introduce a similarity relation on structured traces that must hold between
1212source and target traces. When the relation holds for every program, we prove
1213that the cost model can be lifted from the object to the source code.
1215In order to prove that similarity holds, we present a generic proof of forward
1216simulation that is aimed at pulling apart as much as possible the part of the
1217simulation related to non-functional properties (preservation of structure)
1218from that related to functional properties. In particular, we reduce the
1219problem of preservation of structure to that of showing a 1-to-0-or-many
1220forward simulation that only adds a few additional proof obligations to those
of a traditional proof that considers functional properties only.
1223All results presented in the paper are part of a larger certification of a
1224C compiler which is based on the labelling approach. The certification, done
in Matita, is the main deliverable of the FET-Open Certified Complexity (CerCo) project.
The short-term future work consists in completing the certification of
1228the CerCo compiler exploiting the main theorem of this paper.

\paragraph{Related works}
CerCo is the first project that explicitly tries to induce a
precise cost model on the source code in order to establish non-functional
properties of programs in a high-level language. Traditional certifications
of compilers, like~\cite{compcert2,piton}, only explicitly prove preservation
of functional properties.

Usually forward simulations take the following form: for each transition
from $s_1$ to $s_2$ in the source code, there exists an equivalent sequence of
transitions in the target code of length $n$. The number $n$ of transition steps
in the target code can just be the witness of the existential statement.
An equivalent alternative, when the proof of simulation is constructive, consists
in providing an explicit function, called \emph{clock function} in the
literature~\cite{clockfunctions}, that computes $n$ from $s_1$. Every clock
function then constitutes a cost model for the source code, in the spirit of
what we are doing in CerCo. However, we believe our solution to be superior
in the following respects: 1) the machinery of the labelling approach is
insensitive to the resource being measured. Indeed, any cost model computed on
the object code can be lifted to the source code (e.g.\ stack space used,
energy consumed, etc.), whereas clock functions only count the
number of transition steps. In order to extend the clock-function approach
to other resources, additional functions must be introduced, and those
additional functions would be handled differently in the proof.
2) the cost models induced by the labelling approach have a simple presentation:
they associate a number to each basic block. More complex
models can be induced when the approach is scaled to cover, for instance,
loop optimisations~\cite{loopoptimizations}, but the costs are still meant to
be easy to understand and manipulate in an interactive theorem prover or
in Frama-C.
On the contrary, a clock function is a complex function of the state $s_1$
which, as a function, is an opaque object that is difficult to reify as
source code in order to reason about it.
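
To make the comparison concrete, the two formulations can be sketched as
follows, where $\sigma$ maps source states to target states, $\to^n$ denotes
a sequence of $n$ target transitions, and $\kappa$ is the clock function
(the notation here is ours, not that of~\cite{clockfunctions}):
\[
\forall s_1 \to s_2.\; \exists n.\; \sigma(s_1) \to^{n} \sigma(s_2)
\qquad\text{vs.}\qquad
\forall s_1 \to s_2.\; \sigma(s_1) \to^{\kappa(s_1)} \sigma(s_2)
\]
In the second form the existential witness is replaced by $\kappa(s_1)$, so a
step-counting cost model can be read off the simulation proof directly.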

\section{Notes for the reviewers}

The results described in the paper are part of a larger formalisation
(the certification of the CerCo compiler). At the time of submission
we still need to single out from the CerCo formalisation the results presented here.
Before the 16th of February we will submit an attachment containing the
minimal subset of the CerCo formalisation that allows those results to be proved.
At that time it will also be possible to measure exactly the size of the
formalisation described here. At the moment a rough approximation suggests
about 2700 lines of Matita code.

We will also attach the development version of the interactive theorem
prover Matita that compiles the submitted formalisation. Another possibility
is to backport the development to the latest released version of the system,
to avoid having to re-compile Matita from scratch.

The programming and certification style used in the formalisation heavily
exploits dependent types. Dependent types are used: 1) to impose invariants
by construction on data types and operations (e.g.\ a trace from a state
$s_1$ to a state $s_2$ can be concatenated with a trace from a state
$s_2'$ to a state $s_3$ only if $s_2$ is convertible with $s_2'$); 2)
to state and prove the theorems by using the Russell methodology of
Matthieu Sozeau\footnote{Subset Coercions in Coq in TYPES'06. Matthieu Sozeau. Thorsten Altenkirch and Conor McBride (Eds). Volume 4502 of Lecture Notes in Computer Science. Springer, 2007, pp.~237--252.}, better known in the Coq world as ``\texttt{Program}'' and reimplemented in a simpler way in Matita using coercion propagations\footnote{Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, Enrico Tassi: A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions. Logical Methods in Computer Science 8(1) (2012).}. However, no result presented here
depends essentially on dependent types: it should be easy to adapt the techniques
and results presented in the paper to HOL.
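
As an illustration of the first use, endpoint-indexed traces can be sketched
as follows (a minimal reconstruction in Lean syntax; the names and the
simplified definition are ours, not those of the actual Matita development):

```lean
-- Traces indexed by their start and end states: a value of type
-- `Trace step s₁ s₂` records a sequence of `step` transitions from s₁ to s₂.
inductive Trace {S : Type} (step : S → S → Prop) : S → S → Type where
  | nil  (s : S) : Trace step s s
  | cons {s₁ s₂ s₃ : S} : step s₁ s₂ → Trace step s₂ s₃ → Trace step s₁ s₃

-- Concatenation is only well-typed when the end state of the first trace
-- coincides (up to conversion) with the start state of the second.
def Trace.append {S : Type} {step : S → S → Prop} {s₁ s₂ s₃ : S} :
    Trace step s₁ s₂ → Trace step s₂ s₃ → Trace step s₁ s₃
  | .nil _,        t => t
  | .cons st rest, t => .cons st (rest.append t)
```

The invariant is thus enforced by the type checker rather than restated as a
side condition in every lemma about concatenation.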

Finally, Matita and Coq are based on minor variations of the Calculus of
(Co)Inductive Constructions, and these variations do not affect the CerCo
formalisation. Therefore porting the proofs and ideas to Coq would be
rather straightforward.