% Source: Papers/itp-2013/ccexec.tex @ 2605 (tentative submission to ITP 2013, checked in by sacerdot)
% NB: might be worth removing this if changing class in favour of
% a built-in definition.
  {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
   morekeywords={[2]if,then,else,forall,Prop,match,with,end,let},
  }
  morekeywords={int8_t}
        keywordstyle=\color{red}\bfseries,
        keywordstyle=[2]\color{blue},
        commentstyle=\color{green},
        stringstyle=\color{blue},
        showspaces=false,showstringspaces=false,
        xleftmargin=1em}
\title{Certification of the Preservation of Structure by a Compiler's Back-end Pass\thanks{The project CerCo acknowledges the financial support of the Future and
Emerging Technologies (FET) programme within the Seventh Framework
Programme for Research of the European Commission, under FET-Open grant
number: 243881}}
\author{Brian Campbell}
\institute{LFCS, University of Edinburgh,\\\email{}}
The labelling approach is a technique to lift cost models for non-functional
properties of programs from the object code to the source code. It is based
on the preservation of the structure of the high level program in every
intermediate language used by the compiler. Such structure is captured by
observables that are added to the semantics and that need to be preserved
by the forward simulation proof of correctness of the compiler. Additional
special observables are required for function calls. In this paper we
present a generic forward simulation proof that preserves all these observables.
The proof statement is based on a new mechanized semantics that traces the
structure of execution when the language is unstructured. The generic semantics
and simulation proof have been mechanized in the interactive theorem prover
Matita.
The \emph{labelling approach} has been introduced in~\cite{easylabelling} as
a technique to \emph{lift} cost models for non-functional properties of programs
from the object code to the source code. Examples of non-functional properties
are execution time, amount of stack/heap space consumed and energy required for
communication. The basic idea of the approach is that it is impossible to
provide a \emph{uniform} cost model for a high-level language that is preserved
\emph{precisely} by a compiler. For instance, two instances of an assignment
$x = y$ in the source code can be compiled very differently according to the
place (registers vs stack) where $x$ and $y$ are stored at the moment of
execution. Therefore a precise cost model must assign a different cost
to every occurrence, and the exact cost can only be known after compilation.
According to the labelling approach, the compiler is free to compile and optimize
the source code without any major restriction, but it must keep track
of what happens to basic blocks during the compilation. The cost model is
then computed on the object code. It assigns a cost to every basic block.
Finally, the compiler propagates the cost model back to the source level,
assigning a cost to each basic block of the source code.
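As a toy illustration of this pipeline, the following Python sketch computes a per-label cost model $k$ on a mock object program and reuses the very same $k$ for the source blocks. All names and the instruction set are hypothetical, not part of the CerCo compiler.

```python
# Hypothetical sketch of the labelling pipeline: block costs are computed
# on the object code and propagated back to the source-level labels.

# Per-instruction cycle counts of a toy object program, grouped by the
# label that opens each basic block.
object_blocks = {
    "L1": ["MOV", "ADD"],   # block starting at label L1
    "L2": ["CMP", "JNZ"],   # block starting at label L2
}
cycles = {"MOV": 1, "ADD": 1, "CMP": 1, "JNZ": 2}

# Cost model computed on the object code: one cost per label.
k = {lbl: sum(cycles[i] for i in instrs)
     for lbl, instrs in object_blocks.items()}

# Because labels are preserved by compilation, the same k serves as the
# cost model of the *source* basic blocks.
print(k["L1"], k["L2"])  # 2 3
```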
Implementing the labelling approach in a certified compiler
makes it possible to reason formally on the high level source code of a program to prove
non-functional properties that are guaranteed to be preserved by the compiler
itself. The trusted code base is then reduced to 1) the interactive theorem
prover (or its kernel) used in the certification of the compiler and
2) the software used to certify the property on the source language, which
can itself be certified, further reducing the trusted code base.
In~\cite{easylabelling} the authors provide an example of a simple
certified compiler that implements the labelling approach for the
imperative \texttt{While} language~\cite{while}, which does not have
pointers and function calls.
The labelling approach has been shown to scale to more interesting scenarios.
In particular, in~\cite{functionallabelling} it has been applied to a functional
language, and in~\cite{loopoptimizations} it has been shown that the approach
can be extended, at the price of some complication, to handle loop optimizations and, more generally,
program optimizations that do not preserve the structure of basic blocks.
On-going work also shows that the labelling approach is compatible with
the complex analyses required to obtain a cost model for object code
on processors that implement advanced features like pipelining, superscalar
architectures and caches.
In the European Project CerCo (Certified Complexity\footnote{\url{}}) we are certifying a labelling-approach-based compiler for a large subset of C to
8051 object code. The compiler is
moderately optimizing and implements a compilation chain that is largely
inspired by that of CompCert~\cite{compcert}. Compared to the work done in~\cite{easylabelling}, the main novelty and source of difficulties is the presence
of function calls. Surprisingly, the addition of function calls requires a
revisitation of the proof technique given in~\cite{easylabelling}. In
particular, at the core of the labelling approach there is a forward
simulation proof that, in the case of \texttt{While}, is only minimally
more complex than the proof required for the preservation of the
functional properties alone. In the case of a programming language with
function calls, instead, it turns out that the forward simulation proof for
the back-end languages must guarantee a whole new set of invariants.
In this paper we present a formalization in the Matita interactive theorem
prover of a generic version of the simulation proof required for unstructured
languages. All back-end languages of the CerCo compiler are unstructured
languages, so the proof covers half of the correctness of the compiler.
The statement of the generic proof is based on a new semantics
for imperative unstructured languages that is based on \emph{structured
traces} and that restores the preservation of structure in the observables of
the semantics. The generic proof makes it possible to almost completely separate the
part of the simulation that deals with functional properties only from the
part that deals with the preservation of structure.
The plan of this paper is the following. In Section~\ref{labelling} we
sketch the labelling method and the problems that arise when it is applied
to languages with function calls. In Section~\ref{semantics} we introduce
a generic description of an unstructured imperative language and the
corresponding structured traces (the novel semantics). In
Section~\ref{simulation} we describe the forward simulation proof.
Conclusions and future work are in Section~\ref{conclusions}.
\section{The labelling approach}
\label{labelling}
We briefly sketch here a simplified version of the labelling approach as
introduced in~\cite{easylabelling}. The simplification strengthens the
sufficient conditions given in~\cite{easylabelling} to allow a simpler
explanation. The simplified conditions given here are also used in the
CerCo compiler to simplify the proof.
Let $\mathcal{P}$ be a programming language whose semantics is given in
terms of observables: a run of a program yields a finite or infinite
stream of observables. We also assume for the time being that function
calls are not available in $\mathcal{P}$. We want to associate a cost
model to a program $P$ written in $\mathcal{P}$. The first step is to
extend the syntax of $\mathcal{P}$ with a new construct $\texttt{emit L}$
where $L$ is a label distinct from all observables of $\mathcal{P}$.
The semantics of $\texttt{emit L}$ is the emission of the observable
\texttt{L} that is meant to signal the beginning of a basic block.
There exists an automatic procedure that injects into the program $P$ an
$\texttt{emit L}$ at the beginning of each basic block, using a fresh
\texttt{L} for each block. In particular, the bodies of loops, both branches
of \texttt{if-then-else}s and the targets of \texttt{goto}s must all start
with an emission statement.
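The injection procedure can be sketched as the classical leader-detection pass over a flat program. The toy representation below is illustrative only (and, for simplicity, the sketch does not re-target the \texttt{goto}s after insertion):

```python
# Illustrative injection of `emit L` at basic-block boundaries.
# A toy program: each item is a tuple; ("goto", i) jumps to index i.
prog = [("set", "x", 1), ("goto", 3), ("set", "y", 2), ("set", "z", 3)]

# Leaders (block starts): the entry point, every goto target, and every
# instruction that follows a goto.
leaders = {0}
for i, instr in enumerate(prog):
    if instr[0] == "goto":
        leaders.add(instr[1])
        leaders.add(i + 1)

# Insert a fresh `emit` before each leader (goto targets not re-indexed here).
labelled, fresh = [], 0
for i, instr in enumerate(prog):
    if i in leaders:
        fresh += 1
        labelled.append(("emit", f"L{fresh}"))
    labelled.append(instr)
```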
Let now $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
organized in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
language used by the compiler. We can easily extend every
intermediate language (and its semantics) with an $\texttt{emit L}$ statement
as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
the additional difficulty that the syntax of object code is given as a
sequence of bytes. The injection of an emission statement in the object code
can be done using a map that associates the statement to a pair of
consecutive code addresses. The intended semantics is that, if $(pc_1,pc_2) \mapsto \texttt{emit L}$, then the observable \texttt{L} is emitted after the execution of the
instruction stored at $pc_1$ and before the execution of the instruction
stored at $pc_2$. The two program counters are necessary because the
instruction stored at $pc_1$ can have multiple possible successors (e.g.
in case of a conditional branch or an indirect call). Dually, the instruction
stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
target of a jump).
The compiler, to be functionally correct, must preserve observational
equivalence, i.e. executing the program after each compiler pass should
yield the same stream of observables. After the injection of emission
statements, observables now capture both functional and non-functional
behaviour.
This correctness property is called in the literature a forward simulation,
and it is sufficient for correctness when the target language is
deterministic.
We also require a stronger, non-functional preservation property: after each
pass all basic blocks must start with an emission statement, and all labels
\texttt{L} must be unique.
Now let $M$ be the object code obtained for the program $P$. Let us suppose
that we can statically inspect the code $M$ and associate to each basic block
a cost (e.g. the number of clock cycles required to execute all instructions
in the basic block, or an upper bound on that time). Every basic block is
labelled with a unique label \texttt{L}, thus we can actually associate the
cost to \texttt{L}. Let us call it $k(\texttt{L})$.
The function $k$ is the cost model for the object code control
blocks. It can equally be used as the cost model for the source
control blocks. Indeed, if the semantics of $P$ is the stream
$L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$, because
every instruction belongs to a control block and every control block is
labelled. Thus it is correct to say that the execution cost of $P$ is also
$\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
the blocks of the high level program $P$ that is preserved by compilation.
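Concretely, the cost of one run is obtained by folding $k$ over the emitted stream, as in this toy computation:

```python
# The execution cost of a run is the sum of k(L) over the emitted labels.
k = {"L1": 2, "L2": 5}                 # toy per-block cost model
stream = ["L1", "L2", "L2", "L1"]      # observables of one (toy) run
cost = sum(k[lbl] for lbl in stream)   # Σ_i k(L_i)
```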
How can the user profit from the high level cost model? Suppose, for instance,
that he wants to prove that the WCET of his program is bounded by $c$. It
is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
a purely functional property of the code. He can therefore use any technique
available to certify functional properties of the source code.
What is suggested in~\cite{easylabelling} is to actually instrument the
source code $P$ by replacing every label emission statement
$\texttt{emit L}$ with the instruction $\texttt{cost += k(L)}$ that increments
a global fresh variable \texttt{cost}. The bound is now proved by establishing
the program invariant $\texttt{cost} \leq c$, which can be done for example
using the Frama-C~\cite{framaC} suite if the source code is some variant of
C.
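The instrumentation can be pictured as follows; the example is a hypothetical Python rendering (the real targets are C programs and Frama-C annotations):

```python
# Source-level instrumentation: each `emit L` becomes `cost += k(L)`,
# so the WCET bound Σ_i k(L_i) <= c becomes a purely functional invariant.
k = {"L1": 2, "L2": 5}                # toy cost model

def instrumented_run():
    cost = 0
    cost += k["L1"]   # was: emit L1
    # ... instructions of block L1 ...
    cost += k["L2"]   # was: emit L2
    # ... instructions of block L2 ...
    return cost

# The WCET claim, here with bound c = 10, checked as a plain assertion.
assert instrumented_run() <= 10
```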
\subsection{Labelling function calls}
We now want to extend the labelling approach to support function calls.
In the high level, \emph{structured} programming language $\mathcal{P}$ there
is not much to change.
When a function is invoked, the current basic block is temporarily exited
and the basic block the function starts with takes control. When the function
returns, the execution of the original basic block is resumed. Thus the only
significant change is that basic blocks can now be nested. Let \texttt{E}
be the label of the external block and \texttt{I} the label of a nested one.
Since the external block starts before the internal one, the semantics observed will be
\texttt{E I} and the cost associated to it on the source language will be
$k(\texttt{E}) + k(\texttt{I})$, i.e. the cost of executing all instructions
in the block \texttt{E} first plus the cost of executing all the instructions in
the block \texttt{I}. However, we know that some instructions in \texttt{E} are
executed after the last instruction in \texttt{I}. This is actually irrelevant
because we are here assuming that costs are additive, so that we can freely
permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non-trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
the function call terminates and yields control back to the basic block
\texttt{E}. If the call diverges, the instrumentation
$\texttt{cost += k(E)}$ executed at the beginning of \texttt{E} is still valid,
but only as an upper bound on the real execution cost: only precision is lost.
Let us now consider what happens when we move down the compilation chain to an
unstructured intermediate or final language. Here unstructured means that
the only control operators are conditional and unconditional jumps, function
calls and returns. Unlike a structured language, though, there is no guarantee
that a function will return control just after the function call point.
The semantics of the return statement, indeed, consists in fetching the
return address from some internal structure (typically the control stack) and
jumping directly to it. The code can freely manipulate the control stack to
make a procedure return to an arbitrary position. Indeed, it is also possible
to break the well-nestedness of function calls/returns.
Is it the case that the code produced by a correct compiler must respect the
additional property that every function returns just after its function call
point? The answer is negative and the property is not implied by forward
simulation proofs. For instance, imagine modifying a correct compiler pass
by systematically adding one to the return address on the stack and by
putting a \texttt{NOP} (or any other instruction that takes one byte) after
every function call. The code obtained will be functionally indistinguishable from the original,
and the added instructions will all be dead code.
This lack of structure in the semantics badly interferes with the labelling
approach. The reason is the following: when a basic block labelled with
\texttt{E} contains a function call, it no longer makes any sense to associate
to the label \texttt{E} the sum of the costs of all the instructions in the block.
Indeed, there is no guarantee that the function will return into the block and
that the instructions that will be executed after the return will be the ones
we are paying for in the cost model.
How can we make the labelling approach work in this scenario? We only see two
possible ways. The first one consists in injecting an emission statement after
every function call: basic blocks no longer contain function calls, but are now
terminated by them. This completely solves the problem and allows the compiler
to break the structure of function calls/returns at will. However, the
technique has several drawbacks. First of all, it greatly increases the number
of cost labels that are injected in the source code and that become
instrumentation statements. Thus, when reasoning on the source code to prove
non-functional properties, the user (or the automation tool) will have to handle
larger expressions. Second, the more labels are emitted, the more difficult it
becomes to implement powerful optimizations respecting the code structure.
Indeed, function calls are usually implemented in such a way that most registers
are preserved by the call, so that the static analysis of the block is not
interrupted by the call and an optimization can involve both the code before
and after the function call. Third, instrumenting the source code may require
unpleasant modifications of it. Take, for example, the code
\texttt{f(g(x));}. We need to inject an emission statement/instrumentation
instruction just after the execution of \texttt{g}. The only way to do that
is to rewrite the code as \texttt{y = g(x); emit L$_1$; f(y);} for some fresh
variable \texttt{y}. It is pretty clear how in certain situations the obtained
code would be more obfuscated and thus more difficult to reason about manually.
For the previous reasons, in this paper and in the CerCo project we adopt a
different approach. We do not inject emission statements after
function calls. However, we want to propagate a strong additional invariant in
the forward simulation proof. The invariant is the propagation of the structure
of the original high level code, even if the target language is unstructured.
The structure we want to propagate, which will become clearer in the next
section, comprises 1) the property that every function should return just after
the function call point, which in turn implies well-nestedness of function calls;
2) the property that every basic block starts with a code emission statement.
In the original labelling approach of~\cite{easylabelling}, the second property
was guaranteed statically (syntactically) as a property of the generated code.
In our revised approach, instead, we will guarantee the property dynamically:
it will be possible to generate code that does not respect it, as long as
all execution traces do respect it. For instance, dead code will no longer
be required to have all basic blocks correctly labelled. The switch is suggested
by the fact that the first of the two properties --- the one related to
function calls/returns --- can only be defined as a dynamic property, i.e.
as a property of execution traces, not of the static code. The switch is
beneficial to the proof because the original proof was made of two parts:
the forward simulation proof and the proof that the static property was guaranteed.
In our revised approach the latter disappears and only the forward simulation
is kept.
In order to capture the structure in the semantics, so that it is preserved
by a forward simulation argument, we need to make the structure observable
in the semantics. This is the topic of the next section.
\section{Structured traces}
\label{semantics}
The program semantics adopted in the traditional labelling approach is based
on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
a set of states $\mathcal{S}$, the semantics of one deterministic execution
step is
defined as a function $S \to S \times O^*$ where $O^*$ is a (finite) stream of
observables. The semantics is then lifted compositionally to multiple (finite
or infinite) execution steps.
Finally, the semantics of a whole program execution is obtained by forgetting
about the final state (if any), yielding a function $S \to O^*$ that, given an
initial status, returns the finite or infinite stream of observables in output.
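The step function and its lifting can be pictured by this toy deterministic transition system (states and observables are invented for illustration; a fuel parameter stands in for the finite/infinite distinction):

```python
# One deterministic step returns the next state together with the (finite)
# list of observables it emits: a function S -> S x O*.
def step(s):
    nxt = {0: (1, ["L1"]), 1: (2, []), 2: (3, ["L2"])}
    return nxt[s]

# Lifting to many steps and forgetting the final state yields S -> O*.
def run(s, fuel):
    out = []
    for _ in range(fuel):
        if s not in (0, 1, 2):       # s is a stuck/final state
            break
        s, obs = step(s)
        out += obs
    return out
```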
We present here a new definition of semantics where the structure of execution,
as defined in the previous section, is now observable. The idea is to replace
the stream of observables with a structured data type that makes
function calls and returns explicit and that guarantees some additional invariants by
construction. The data structure, called \emph{structured traces}, is
defined inductively for terminating programs and coinductively for diverging
ones. In this paper we focus only on the inductive structure, i.e. we assume
that all programs that are given a semantics are total. The Matita formalization
also shows the coinductive definitions. The semantics of a program is then
defined as a function that maps an initial state into a structured trace.
In order to have a definition that works on multiple intermediate languages,
we abstract the type of structured traces over an abstract data type of
abstract statuses:
record abstract_status := {
   S: Type[0]
;  as_execute: S -> S -> Prop
;  as_classify: S -> classification
;  as_costed: S -> Prop
;  as_after_return:
    (Σs:as_status. as_classify s = Some ? cl_call) → as_status → Prop
}
The predicate $\texttt{as\_execute}~s_1~s_2$ holds if $s_1$ evolves into
$s_2$ in one step; $\texttt{as\_classify}~s~c$ holds if the next instruction
to be executed in $s$ is classified according to $c \in
\{\texttt{return,jump,call,other}\}$ (we omit tailcalls for simplicity);
the predicate $\texttt{as\_costed}~s$ holds if the next instruction to be
executed in $s$ is a cost emission statement (also classified
as \texttt{other}); finally $(\texttt{as\_after\_return}~s_1~s_2)$ holds
if the next instruction to be executed in $s_2$ follows the function call
to be executed in (the witness of the $\Sigma$-type) $s_1$.
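For readers unfamiliar with Matita records, the interface can be transliterated into the following Python protocol. This is a rough dictionary only; all names mirror the record fields, and the $\Sigma$-type witness is dropped:

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A Python rendering of the abstract_status record, for intuition only:
# a step relation, a classification of the next instruction, a costed-ness
# predicate, and the after-return relation for function calls.
@dataclass
class AbstractStatus:
    execute: Callable[[object, object], bool]        # as_execute s1 s2
    classify: Callable[[object], Optional[str]]      # 'return'|'jump'|'call'|'other'
    costed: Callable[[object], bool]                 # as_costed s
    after_return: Callable[[object, object], bool]   # as_after_return s1 s2

# A trivial instance: every step allowed, everything classified as 'other'.
trivial = AbstractStatus(
    execute=lambda s1, s2: True,
    classify=lambda s: "other",
    costed=lambda s: False,
    after_return=lambda s1, s2: False,
)
```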
The inductive type for structured traces is actually made of three mutually
inductive types with the following semantics:
 \item $(\texttt{trace\_label\_return}~s_1~s_2)$ is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_1$ is a label emission
   statement and the one to be executed in the state before $s_2$ is a return
   statement. Thus $s_2$ is the state after the return. The trace
   may contain other label emission statements. It captures the structure of
   the execution of function bodies: they must start with a cost emission
   statement and must end with a return; they are obtained by concatenating
   one or more basic blocks, all starting with a label emission
   (e.g. in case of loops).
 \item $(\texttt{trace\_any\_label}~b~s_1~s_2)$ is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_2$ (respectively, in the
   state before $s_2$) is a label emission statement (respectively, a
   return), according to the boolean $b$. It must not contain
   any label emission statement. It captures the notion of a suffix of a
   basic block.
 \item $(\texttt{trace\_label\_label}~b~s_1~s_2)$ is the special case of
   $(\texttt{trace\_any\_label}~b~s_1~s_2)$ such that the instruction to be
   executed in $s_1$ is a label emission statement. It captures the notion of
   a basic block.
inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
  | tlr_base:
      ∀status_before: S.
      ∀status_after: S.
        trace_label_label S ends_with_ret status_before status_after →
        trace_label_return S status_before status_after
  | tlr_step:
      ∀status_initial: S.
      ∀status_labelled: S.
      ∀status_final: S.
        trace_label_label S doesnt_end_with_ret status_initial status_labelled →
        trace_label_return S status_labelled status_final →
          trace_label_return S status_initial status_final
with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
  | tll_base:
      ∀ends_flag: trace_ends_with_ret.
      ∀start_status: S.
      ∀end_status: S.
        trace_any_label S ends_flag start_status end_status →
        as_costed S start_status →
          trace_label_label S ends_flag start_status end_status
with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
  (* Single steps within a function which reach a label.
     Note that this is the only case applicable for a jump. *)
  | tal_base_not_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        (as_classifier S start_status cl_jump ∨
         as_classifier S start_status cl_other) →
        as_costed S final_status →
          trace_any_label S doesnt_end_with_ret start_status final_status
  | tal_base_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        as_classifier S start_status cl_return →
          trace_any_label S ends_with_ret start_status final_status
  (* A call followed by a label on return. *)
  | tal_base_call:
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_final →
          trace_label_return S status_start_fun_call status_final →
          as_costed S status_final →
            trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
  (* A call followed by a non-empty trace. *)
  | tal_step_call:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_after_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_after_fun_call →
          trace_label_return S status_start_fun_call status_after_fun_call →
          ¬ as_costed S status_after_fun_call →
          trace_any_label S end_flag status_after_fun_call status_final →
            trace_any_label S end_flag status_pre_fun_call status_final
  | tal_step_default:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre: S.
      ∀status_init: S.
      ∀status_end: S.
        as_execute S status_pre status_init →
        trace_any_label S end_flag status_init status_end →
        as_classifier S status_pre cl_other →
        ¬ (as_costed S status_init) →
          trace_any_label S end_flag status_pre status_end.
A \texttt{trace\_label\_return} is isomorphic to a list of
\texttt{trace\_label\_label}s that end with a cost emission followed by a
\texttt{trace\_label\_label} that ends with a return.
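To make this shape concrete, here is a toy Python encoding of the three trace types (purely illustrative; the real Matita definitions additionally carry the proof obligations listed above):

```python
# Toy encodings, to make the nesting concrete:
#   tal: a list of steps within one basic-block suffix,
#   tll: a basic block = (costed start state, tal),
#   tlr: a non-empty list of tll whose last block ends with a return.
tal_body = [("step", "s1", "s2"), ("jump", "s2", "s3")]   # ends at a label
tll_block = ("costed", "s1", tal_body)                    # a basic block
tlr_fun = [tll_block,
           ("costed", "s3", [("ret", "s3", "s4")])]       # function body
# The last tll of a tlr ends with a return: here the step ("ret", "s3", "s4").
```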
The interesting cases are those of $(\texttt{trace\_any\_label}~b~s_1~s_2)$.
A \texttt{trace\_any\_label} is a sequence of steps built by a syntax-directed
definition on the classification of $s_1$. The constructors of the datatype
enforce several invariants that are meant to impose structure on the
otherwise unstructured execution. In particular, the following invariants are
guaranteed:
 \item the trace is never empty; it ends with a return iff $b$ is
       true (\texttt{ends\_with\_ret})
 \item a jump must always be the last instruction of the trace, and it must
       be followed by a cost emission statement; i.e. the target of a jump
       is always the beginning of a new basic block and as such it must start
       with a cost emission statement
 \item a cost emission statement can never occur inside the trace, only in
       the status immediately after
 \item the trace for a function call step is made from a subtrace of type
       $(\texttt{trace\_label\_return}~s_1~s_2)$, possibly followed by the
       rest of the trace for this basic block. The subtrace represents the
       function execution. Being an inductive datum, it guarantees termination
       of the function call. The status $s_2$ is the one that follows the return
       statement. The next instruction to be executed in $s_2$ must follow the function call
       instruction. As a consequence, function calls are also well nested.
The three mutually structurally recursive functions \texttt{flatten\_trace\_label\_return}, \texttt{flatten\_trace\_label\_label} and \texttt{flatten\_trace\_any\_label}
make it possible to extract from a structured trace the list of states whose next
instruction is a cost emission statement. We only show here the type of one
of them:
let rec flatten_trace_label_return
  (S: abstract_status)
    (start_status: S) (final_status: S)
      (the_trace: trace_label_return S start_status final_status)
        on the_trace: list (as_cost_label S)
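On the toy list-based encoding of a \texttt{trace\_label\_return}, flattening amounts to collecting the labels of the costed block-start states in order; the following sketch is illustrative only:

```python
# Sketch of flattening: extract, in order, the labels of the costed states
# of a structured trace, encoded here as a list of (label, steps) blocks.
def flatten_tlr(tlr):
    return [lbl for (lbl, _steps) in tlr]   # one label per tll block

trace = [("L1", ["i1", "i2"]), ("L2", ["i3", "ret"])]
labels = flatten_tlr(trace)
```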
\paragraph{Cost prediction on structured traces}
The first main theorem of CerCo about structured traces
(theorem \texttt{compute\_max\_trace\_label\_return\_cost\_ok\_with\_trace})
holds for the instantiation
of the structured traces to the concrete statuses of object code programs.
Simplifying a bit, it states that
$\forall s_1,s_2. \forall \tau: \texttt{trace\_label\_return}~s_1~s_2.~
  clock~s_2 = clock~s_1 + \Sigma_{s \in \texttt{flatten\_trace\_label\_return}~\tau} k(L(s))$, where $L$ maps a labelled state to its emitted label and the
cost model $k$ is statically computed from the object code
by associating to each label \texttt{L} the sum of the costs of the instructions
in the basic block that starts at \texttt{L} and ends before the next labelled
instruction. The theorem is proved by structural induction over the structured
trace, and is based on the following invariant:
if the function that computes the cost model has analyzed the instruction
to be executed at $s_2$ after the one to be executed at $s_1$, and if
the structured trace starts with $s_1$, then eventually it will also contain
$s_2$. When $s_1$ is not a function call, the result holds trivially because
of the $(\texttt{as\_execute}~s_1~s_2)$ condition obtained by inversion on
the trace. The only non-trivial case is the one of function calls: the cost model computation function
recurs on the first instruction that follows the function call; the
\texttt{as\_after\_return} condition of the \texttt{tal\_base\_call} and
\texttt{tal\_step\_call} constructors guarantees precisely that the execution will eventually reach
this state.
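The statement of the theorem can be checked on a toy run where the static cost model agrees with the cycles actually spent in each block (all numbers invented):

```python
# Checking the cost-prediction equation on a toy run: the clock difference
# equals the sum of k over the labels met along the structured trace.
k = {"L1": 3, "L2": 4}            # statically computed per-block costs
blocks = [("L1", 3), ("L2", 4)]   # (label, cycles actually spent) per block

clock = 100                       # clock of the initial state s1
for _lbl, spent in blocks:
    clock += spent                # execute the block

# clock s2 = clock s1 + Σ_{s in flatten τ} k(L(s))
predicted = 100 + sum(k[lbl] for lbl, _ in blocks)
assert clock == predicted
```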
\paragraph{Structured traces similarity and cost prediction invariance}
A compiler pass maps source code to object code and initial states to initial
states. The source code and the initial state uniquely determine the structured
trace of a program, if it exists. The structured trace fails to exist iff
the structural conditions are violated by the program execution (e.g. a function
body does not start with a cost emission statement). Let us assume that the
target structured trace exists.
What is the relation between the source and target structured traces?
In general, the two traces can be arbitrarily different. However, we are
interested only in those compiler passes that map a trace $\tau_1$ to a trace
$\tau_2$ such that
\begin{equation}\texttt{flatten\_trace\_label\_return}~\tau_1 = \texttt{flatten\_trace\_label\_return}~\tau_2\label{condition1}\end{equation}
The reason is that when this happens we obviously have the
following corollary to the previous theorem:
$$\forall s_1,s_2. \forall \tau_1: \texttt{trace\_label\_return}~s_1~s_2.
  clock~s_2 - clock~s_1 = \Sigma_{s \in \texttt{flatten\_trace\_label\_return}~\tau_1} k(L(s)) = \Sigma_{s \in \texttt{flatten\_trace\_label\_return}~\tau_2} k(L(s))$$
The corollary states that the actual execution time of the program can be computed equally well on the source or target language. Thus it becomes possible to
transfer the cost model from the target to the source code and reason on the
source code only.
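As a concrete illustration of the corollary, the following Python sketch (all names and costs are invented for illustration: \texttt{k} plays the role of the cost model and the label lists play the role of the two flattenings) shows that the predicted cost is the same on both sides whenever the flattenings coincide:

```python
def predicted_cost(labels, k):
    """Sum the per-label static costs over an emitted label sequence."""
    return sum(k[l] for l in labels)

# An invented cost model, standing in for the one computed on the object code.
k = {"L1": 4, "L2": 7, "L3": 2}

# The two flattenings coincide (the condition on the pass), so the predicted
# cost -- and hence clock s_2 - clock s_1 -- can be computed on either side.
source_labels = ["L1", "L2", "L3"]  # flatten_trace_label_return tau_1
target_labels = ["L1", "L2", "L3"]  # flatten_trace_label_return tau_2

assert predicted_cost(source_labels, k) == predicted_cost(target_labels, k) == 13
```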
We are therefore interested in conditions stronger than condition~\ref{condition1}.
To this end we introduce a similarity relation between traces that captures
the fact that two traces have the same structure and that
implies condition~\ref{condition1} (theorem \texttt{tlr\_rel\_to\_traces\_same\_flatten}). Two traces are similar when one can be obtained from
the other by erasing or inserting steps classified as \texttt{other} into the
\texttt{trace\_any\_label} blocks of the trace. In particular,
the relation maps function calls to function calls to the same function,
label emission statements to emissions of the same label, concatenations of
subtraces to concatenations of subtraces of the same length starting with
the same emission statement, etc.
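The intuition behind the relation can be conveyed by a small executable sketch, not part of the Matita development: represent a trace as a flat list of observable events, where steps classified as \texttt{other} carry no observable and may be erased or inserted freely; two traces are then similar when their remaining observables coincide.

```python
def observables(trace):
    """Erase the 'other' steps; keep calls, returns and label emissions."""
    return [ev for ev in trace if ev[0] != "other"]

def similar(t1, t2):
    """Simplified similarity: equality up to erasure/insertion of 'other'."""
    return observables(t1) == observables(t2)

t_src = [("label", "L1"), ("other",), ("call", "f"), ("ret",), ("label", "L2")]
t_tgt = [("label", "L1"), ("call", "f"), ("other",), ("other",),
         ("ret",), ("label", "L2")]

assert similar(t_src, t_tgt)

# Similarity implies that the flattened label sequences are equal,
# the analogue of tlr_rel_to_traces_same_flatten.
def labels(t):
    return [l for (kind, *rest) in t if kind == "label" for l in rest]

assert labels(t_src) == labels(t_tgt)
```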
The definition of the similarity relation is simple, but it involves a large
number of cases. We show here only a few of them:
let rec tlr_rel S1 st1 st1' S2 st2 st2'
  (tlr1 : trace_label_return S1 st1 st1')
  (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
match tlr1 with
  [ tlr_base st1 st1' tll1 ⇒
    match tlr2 with
    [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
    | _ ⇒ False
    ]
  | tlr_step st1 st1' st1'' tll1 tl1 ⇒
    match tlr2 with
    [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
      tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
    | _ ⇒ False
    ]
  ]
and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
 (tll1 : trace_label_label S1 fl1 st1 st1')
 (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
  match tll1 with
  [ tll_base fl1 st1 st1' tal1 H ⇒
    match tll2 with
    [ tll_base fl2 st2 st2' tal2 G ⇒
      as_label_safe … («?, H») = as_label_safe … («?, G») ∧
      tal_rel … tal1 tal2
    ]
  ]
and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
 (tal1 : trace_any_label S1 fl1 st1 st1')
 (tal2 : trace_any_label S2 fl2 st2 st2')
   on tal1 : Prop ≝
  match tal1 with
  [ tal_base_not_return st1 st1' _ _ _ ⇒
    fl2 = doesnt_end_with_ret ∧
    ∃st2mid,taa,H,G,K.
    tal2 ≃ taa_append_tal ? st2 ??? taa
      (tal_base_not_return ? st2mid st2' H G K)
  | tal_base_return st1 st1' _ _ ⇒
    fl2 = ends_with_ret ∧
    ∃st2mid,taa,H,G.
    tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
      (tal_base_return ? st2mid st2' H G)
  | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
    fl2 = doesnt_end_with_ret ∧
    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    (* we must allow a tal_base_call to be similar to a call followed
      by a collapsable trace (trace_any_any followed by a base_not_return;
      we cannot use trace_any_any as it disallows labels in the end as soon
      as it is non-empty) *)
    (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
      tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
      tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
  | tal_base_tailcall st1 st1' st1'' _ prf tlr1 ⇒
    fl2 = ends_with_ret ∧
    ∃st2mid,G.as_tailcall_ident S2 («st2mid, G») = as_tailcall_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    ∃tlr2 : trace_label_return ? st2mid' st2'.
      tal2 ≃ taa @ (tal_base_tailcall … H G tlr2) ∧ tlr_rel … tlr1 tlr2
  | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
      tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
      tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
      tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
  | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
    tal_rel … tl1 tal2 (* <- this makes it many to many *)
  ].
\section{Forward simulation}
We summarize here the results of the previous sections. Each intermediate
unstructured language can be given a semantics based on structured traces,
which singles out those runs that respect a certain number of invariants.
A cost model can be computed on the object code and used to predict
the execution costs of runs that produce structured traces. The cost model
can be lifted from the target to the source code of a pass if the pass maps
structured traces to similar structured traces. The latter property is called
a \emph{forward simulation}.
As with labelled transition systems, in order to establish the forward
simulation we are interested in (preservation of observables), we are
forced to prove a stronger notion of forward simulation that introduces
an explicit relation between states. The classical notion of a 1-to-0-or-many
forward simulation is the existence of a relation $R$ over states such that
if $s_1 R s_2$ and $s_1 \to^1 s_1'$ then there exists an $s_2'$ such that
$s_2 \to^* s_2'$ and $s_1' R s_2'$. In our context, we need to replace the
one-step and multi-step transition relations $\to^n$ with the existence of
a structured trace between the two states, and we need to add the requirement that
the two structured traces are similar. Thus what we would like to state is
something like:\\
for all $s_1,s_2,s_1'$ such that there is a $\tau_1$ from
$s_1$ to $s_1'$ and $s_1 R s_2$, there exist an $s_2'$ such that
$s_1' R s_2'$ and a $\tau_2$ from $s_2$ to $s_2'$ such that
$\tau_1 \approx \tau_2$.\\
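For comparison, the classical notion can be checked mechanically on finite systems; the following Python sketch (the toy transition systems and the relation $R$ are entirely invented) tests the 1-to-0-or-many condition by brute force:

```python
def reachable(step, s):
    """States reachable from s in zero or more steps of a finite system."""
    seen, todo = {s}, [s]
    while todo:
        for t in step[todo.pop()]:
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

# Two toy transition systems: the target needs two steps to simulate 'a' -> 'b'.
step1 = {"a": ["b"], "b": ["c"], "c": []}
step2 = {"x": ["x1"], "x1": ["y"], "y": ["z"], "z": []}

R = {("a", "x"), ("b", "y"), ("c", "z")}

def is_forward_simulation(R, step1, step2):
    """s1 R s2 and s1 -> s1' imply some s2' with s2 ->* s2' and s1' R s2'."""
    return all(
        any((s1p, s2p) in R for s2p in reachable(step2, s2))
        for (s1, s2) in R
        for s1p in step1[s1]
    )

assert is_forward_simulation(R, step1, step2)
```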
The statement just introduced, however, is too simplistic and not provable
in the general case. To understand why, consider the case of a function call
and the pass that fixes the parameter passing conventions. A function
call in the source code takes as input an arbitrary number of pseudo-registers (the actual parameters to pass) and returns an arbitrary number of pseudo-registers (where the result is stored). A function call in the target code has no
input or output parameters. The pass must add explicit code before and after
the function call to move the contents of the pseudo-registers from/to the hardware
registers or the stack in order to implement the parameter passing strategy.
Similarly, each function body must be augmented with a preamble and a postamble
to complete/initiate the parameter passing strategy for the call/return phase.
Therefore what used to be a call followed by the next instruction to execute
after the function return now becomes a sequence of instructions, followed by
a call, followed by another sequence. The two states at the beginning of the
first sequence and at the end of the second sequence are in relation with
the statuses before/after the call in the source code, as in a usual forward
simulation. How can we prove, however, the additional condition for function calls
that asks that when the function returns the instruction immediately after the
function call is executed? To ensure this invariant, there must be another relation
between the addresses of the function call in the source and in the target code.
This additional relation is used in particular to relate the two stacks.
Another example is given by the preservation of code emission statements. A single
code emission instruction can be simulated by a sequence of steps, followed
by a code emission, followed by another sequence. Clearly the initial and final
statuses of the sequence are to be in relation with the statuses before/after the
code emission in the source code. In order to preserve the structured trace
invariants, however, we must consider a second relation between states that
traces the preservation of the code emission statement.
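Both examples share the same shape: a single source step is simulated by a sequence of target steps that preserves the observables (emitted labels, calls and returns). The following Python sketch (the step mapping is entirely invented) illustrates how such per-step simulations concatenate into a simulation of whole traces while leaving the observables untouched:

```python
def simulate_step(src_step):
    """1-to-0-or-many: each source step becomes 0 or more target steps."""
    mapping = {
        "emit L1": ["emit L1"],          # label emissions are preserved exactly
        "call f":  ["spill", "call f"],  # calls may acquire parameter-passing code
        "other":   [],                   # some steps are erased by the pass
        "ret":     ["ret", "reload"],    # returns may acquire clean-up code
    }
    return mapping[src_step]

def lift(src_trace):
    """Concatenate the per-step simulations into a whole target trace."""
    return [t for s in src_trace for t in simulate_step(s)]

src = ["emit L1", "other", "call f", "ret"]
tgt = lift(src)
assert tgt == ["emit L1", "spill", "call f", "ret", "reload"]

# The observables survive the lifting unchanged.
def obs(trace):
    return [e for e in trace if e.startswith(("emit", "call", "ret"))]

assert obs(src) == obs(tgt)
```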
Therefore we now introduce an abstract notion of relation set between abstract
statuses and an abstract notion of 1-to-0-or-many forward simulation. These
two definitions enjoy the following remarkable properties:
 \item they are generic enough to accommodate all passes of the CerCo compiler
 \item they allow us to lift the 1-to-0-or-many forward simulation to the
   structured-trace-preserving forward simulation we are interested in
The latter lifting is proved by the following theorem, which has the
1-to-0-or-many forward simulation statement as a hypothesis. Combined with
the results in the previous section, it has as a corollary that the cost
model can be computed on the object code and lifted to the source code to
reason on non-functional properties iff a simpler 1-to-0-or-many forward
simulation is proved. The statement of the latter is far more complex than a
traditional 1-to-0-or-many forward simulation statement, but the proofs now have
about the same complexity. Indeed, it should be possible to largely reuse an
existing traditional forward simulation proof. The lifting theorem below,
proved once and for all in this paper, is the one that does all the work needed
to put the labelling approach in motion.
definition status_simulation ≝
  λS1 : abstract_status.
  λS2 : abstract_status.
  λsim_status_rel : status_rel S1 S2.
    ∀st1,st1',st2.as_execute S1 st1 st1' →
    sim_status_rel st1 st2 →
    match as_classify … st1 with
    [ None ⇒ True
    | Some cl ⇒
      match cl with
      [ cl_call ⇒ ∀prf.
        (*
             st1' ------------S----------\
              ↑ \                         \
             st1 \--L--\                   \
              | \       \                   \
              S  \-C-\  st2_after_call →taa→ st2'
              |       \     ↑
             st2 →taa→ st2_pre_call
        *)
        ∃st2_pre_call.
        as_call_ident ? st2_pre_call = as_call_ident ? («st1, prf») ∧
        call_rel ?? sim_status_rel «st1, prf» st2_pre_call ∧
        ∃st2_after_call,st2'.
        ∃taa2 : trace_any_any … st2 st2_pre_call.
        ∃taa2' : trace_any_any … st2_after_call st2'.
        as_execute … st2_pre_call st2_after_call ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2_after_call
      | cl_tailcall ⇒ ∀prf.
        (*
             st1' ------------S----------\
              ↑ \                         \
             st1 \--L--\                   \
              |         \                   \
              S         st2_after_call →taa→ st2'
              |             ↑
             st2 →taa→ st2_pre_call
        *)
        ∃st2_pre_call.
        as_tailcall_ident ? st2_pre_call = as_tailcall_ident ? («st1, prf») ∧
        ∃st2_after_call,st2'.
        ∃taa2 : trace_any_any … st2 st2_pre_call.
        ∃taa2' : trace_any_any … st2_after_call st2'.
        as_execute … st2_pre_call st2_after_call ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2_after_call
      | cl_return ⇒
        (*
             st1
            / ↓
           | st1'----------S,L------------\
           S   \                           \
            \   \-----R-------\            |
             \                 |           |
             st2 →taa→ st2_ret |           |
                          ↓   /            |
                     st2_after_ret →taaf→ st2'

           we also ask that st2_after_ret be not labelled if the taaf tail is
           not empty
        *)
        ∃st2_ret,st2_after_ret,st2'.
        ∃taa2 : trace_any_any … st2 st2_ret.
        ∃taa2' : trace_any_any_free … st2_after_ret st2'.
        (if taaf_non_empty … taa2' then ¬as_costed … st2_after_ret else True) ∧
        as_classifier … st2_ret cl_return ∧
        as_execute … st2_ret st2_after_ret ∧ sim_status_rel st1' st2' ∧
        ret_rel … sim_status_rel st1' st2_after_ret ∧
        label_rel … st1' st2'
      | cl_other ⇒
          (*
          st1 → st1'
            |      \
            S      S,L
            |        \
           st2 →taaf→ st2'

           the taaf can be empty (e.g. tunneling) but we ask it must not be the
           case when both st1 and st1' are labelled (we would be able to collapse
           labels otherwise)
         *)
        ∃st2'.
        ∃taa2 : trace_any_any_free … st2 st2'.
        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2'
      | cl_jump ⇒
        (* just like cl_other, but with one more hypothesis *)
        as_costed … st1' →
        ∃st2'.
        ∃taa2 : trace_any_any_free … st2 st2'.
        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2'
      ]
    ].
status_simulation_produce_tlr S1 S2 R
(* we start from this situation
     st1 →→→→tlr→→→→ st1'
      | \
      L  \---S--\
      |          \
   st2_lab →taa→ st2   (the taa preamble is in general either empty or given
                        by the preceding call)

   and we produce
     st1 →→→→tlr→→→→ st1'
             \\      /  \
             //     R    \-L,S-\
             \\     |           \
   st2_lab →tlr→ st2_mid →taaf→ st2'
*)
  st1 st1' st2_lab st2
  (tlr1 : trace_label_return S1 st1 st1')
  (taa2_pre : trace_any_any S2 st2_lab st2)
  (sim_execute : status_simulation S1 S2 R)
  on tlr1 : R st1 st2 → label_rel … st1 st2_lab →
  ∃st2_mid.∃st2'.
  ∃tlr2 : trace_label_return S2 st2_lab st2_mid.
  ∃taa2 : trace_any_any_free … st2_mid st2'.
  (if taaf_non_empty … taa2 then ¬as_costed … st2_mid else True) ∧
  R st1' st2' ∧ ret_rel … R st1' st2_mid ∧ label_rel … st1' st2' ∧
  tlr_rel … tlr1 tlr2
\section{Conclusions and future work}