% source: Papers/itp-2013/ccexec.tex @ 2610, checked in by sacerdot
% NB: might be worth removing this if changing class in favour of
% a built-in definition.
  {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
   morekeywords={[2]if,then,else,forall,Prop,match,with,end,let},
  }
  morekeywords={int8_t}
        keywordstyle=\color{red}\bfseries,
        keywordstyle=[2]\color{blue},
        commentstyle=\color{green},
        stringstyle=\color{blue},
        showspaces=false,showstringspaces=false,
        xleftmargin=1em}
\title{Certification of the Preservation of Structure by a Compiler's Back-end Pass\thanks{The project CerCo acknowledges the financial support of the Future and
Emerging Technologies (FET) programme within the Seventh Framework
Programme for Research of the European Commission, under FET-Open grant
number: 243881}}
\author{Paolo Tranquilli \and Claudio Sacerdoti Coen}
\institute{Department of Computer Science and Engineering, University of Bologna,\\\email{}, \email{}}
The labelling approach is a technique to lift cost models for non-functional
properties of programs from the object code to the source code. It is based
on the preservation of the structure of the high-level program in every
intermediate language used by the compiler. Such structure is captured by
observables that are added to the semantics and that need to be preserved
by the forward simulation proof of correctness of the compiler. Additional
special observables are required for function calls. In this paper we
present a generic forward simulation proof that preserves all these observables.
The proof statement is based on a new mechanized semantics that traces the
structure of execution when the language is unstructured. The generic semantics
and simulation proof have been mechanized in the Matita interactive theorem
prover.
The \emph{labelling approach} has been introduced in~\cite{easylabelling} as
a technique to \emph{lift} cost models for non-functional properties of programs
from the object code to the source code. Examples of non-functional properties
are execution time, amount of stack/heap space consumed and energy required for
communication. The basic idea of the approach is that it is impossible to
provide a \emph{uniform} cost model for a high-level language that is preserved
\emph{precisely} by a compiler. For instance, two instances of an assignment
$x = y$ in the source code can be compiled very differently according to the
place (registers vs stack) where $x$ and $y$ are stored at the moment of
execution. Therefore a precise cost model must assign a different cost
to every occurrence, and the exact cost can only be known after compilation.
According to the labelling approach, the compiler is free to compile and optimize
the source code without any major restriction, but it must keep track
of what happens to basic blocks during the compilation. The cost model is
then computed on the object code. It assigns a cost to every basic block.
Finally, the compiler propagates the cost model back to the source level,
assigning a cost to each basic block of the source code.
Implementing the labelling approach in a certified compiler
makes it possible to reason formally on the high-level source code of a program to prove
non-functional properties that are guaranteed to be preserved by the compiler
itself. The trusted code base is then reduced to 1) the interactive theorem
prover (or its kernel) used in the certification of the compiler and
2) the software used to certify the property on the source language, which
can itself be certified, further reducing the trusted code base.
In~\cite{easylabelling} the authors provide an example of a simple
certified compiler that implements the labelling approach for the
imperative \texttt{While} language~\cite{while}, which has neither
pointers nor function calls.
The labelling approach has been shown to scale to more interesting scenarios.
In particular, in~\cite{functionallabelling} it has been applied to a functional
language, and in~\cite{loopoptimizations} it has been shown that the approach
can be extended, at the price of some complication, to handle loop optimizations
and, more generally, program optimizations that do not preserve the structure
of basic blocks.
On-going work also shows that the labelling approach is compatible with
the complex analyses required to obtain a cost model for object code
on processors that implement advanced features like pipelining, superscalar
architectures and caches.
In the European Project CerCo (Certified Complexity~\footnote{\url{}})~\cite{cerco} we are certifying a labelling approach based compiler for a large subset of C to
8051 object code. The compiler is
moderately optimizing and implements a compilation chain that is largely
inspired by that of CompCert~\cite{compcert1,compcert2}. Compared to the work done in~\cite{easylabelling}, the main novelty and source of difficulty is the presence
of function calls. Surprisingly, the addition of function calls requires a
revision of the proof technique given in~\cite{easylabelling}. In
particular, at the core of the labelling approach there is a forward
simulation proof that, in the case of \texttt{While}, is only minimally
more complex than the proof required for the preservation of the
functional properties alone. In the case of a programming language with
function calls, instead, it turns out that the forward simulation proof for
the back-end languages must guarantee a whole new set of invariants.
In this paper we present a formalization, in the Matita interactive theorem
prover, of a generic version of the simulation proof required for unstructured
languages. All back-end languages of the CerCo compiler are unstructured
languages, so the proof covers half of the correctness of the compiler.
The statement of the generic proof is based on a new semantics
for imperative unstructured languages that is based on \emph{structured
traces} and that restores the preservation of structure in the observables of
the semantics. The generic proof makes it possible to almost completely separate the
part of the simulation that deals with functional properties only from the
part that deals with the preservation of structure.
The plan of this paper is the following. In Section~\ref{labelling} we
sketch the labelling method and the problems that arise when it is applied
to languages with function calls. In Section~\ref{semantics} we introduce
a generic description of an unstructured imperative language and the
corresponding structured traces (the novel semantics). In
Section~\ref{simulation} we describe the forward simulation proof.
Conclusions and future work are in Section~\ref{conclusions}.
\section{The labelling approach}
We briefly sketch here a simplified version of the labelling approach as
introduced in~\cite{easylabelling}. The simplification strengthens the
sufficient conditions given in~\cite{easylabelling} to allow a simpler
explanation. The simplified conditions given here are also used in the
CerCo compiler to simplify the proof.
Let $\mathcal{P}$ be a programming language whose semantics is given in
terms of observables: a run of a program yields a finite or infinite
stream of observables. We also assume for the time being that function
calls are not available in $\mathcal{P}$. We want to associate a cost
model to a program $P$ written in $\mathcal{P}$. The first step is to
extend the syntax of $\mathcal{P}$ with a new construct $\texttt{emit L}$
where $L$ is a label distinct from all observables of $\mathcal{P}$.
The semantics of $\texttt{emit L}$ is the emission of the observable
\texttt{L} that is meant to signal the beginning of a basic block.
There exists an automatic procedure that injects into the program $P$ an
$\texttt{emit L}$ statement at the beginning of each basic block, using a fresh
\texttt{L} for each block. In particular, the bodies of loops, both branches
of \texttt{if-then-else}s and the targets of \texttt{goto}s must all start
with an emission statement.
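As an informal illustration (not part of the CerCo formalization), the injection procedure can be sketched in Python on a toy While-like AST; the tuple encoding, \texttt{Emit} representation and \texttt{fresh\_label} helper are all our own invention:

```python
import itertools

# Toy AST for a While-like language, encoded as nested tuples:
# ('seq', s1, s2), ('if', cond, s_then, s_else), ('while', cond, body),
# ('assign', x, e), and ('emit', L) for the new labelling construct.

_counter = itertools.count(1)

def fresh_label():
    return f"L{next(_counter)}"

def inject(stmt):
    """Prefix loop bodies and both branches of conditionals
    with a fresh 'emit' statement, recursively."""
    tag = stmt[0]
    if tag == 'seq':
        return ('seq', inject(stmt[1]), inject(stmt[2]))
    if tag == 'if':
        return ('if', stmt[1],
                ('seq', ('emit', fresh_label()), inject(stmt[2])),
                ('seq', ('emit', fresh_label()), inject(stmt[3])))
    if tag == 'while':
        return ('while', stmt[1],
                ('seq', ('emit', fresh_label()), inject(stmt[2])))
    return stmt  # straight-line statements are left unchanged

def label_program(p):
    # the whole program is itself a basic block, so it starts labelled
    return ('seq', ('emit', fresh_label()), inject(p))

prog = ('if', 'x > 0', ('assign', 'y', '1'), ('assign', 'y', '2'))
labelled = label_program(prog)
```

After injection, the program and both branches of the conditional start with a fresh emission, as the text requires.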
Let now $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
that is organized in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
language used by the compiler. We can easily extend every
intermediate language (and its semantics) with an $\texttt{emit L}$ statement
as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
the additional difficulty that the syntax of object code is given as a
sequence of bytes. The injection of an emission statement in the object code
can be done using a map that associates the statement with a pair of
consecutive code addresses. The intended semantics is that, if $(pc_1,pc_2) \mapsto \texttt{emit L}$, then the observable \texttt{L} is emitted after the execution of the
instruction stored at $pc_1$ and before the execution of the instruction
stored at $pc_2$. The two program counters are necessary because the
instruction stored at $pc_1$ can have multiple possible successors (e.g.
in case of a conditional branch or an indirect call). Dually, the instruction
stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
target of a jump).
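To make the pc-pair map concrete, here is a hedged Python sketch of a fetch-execute loop over a toy object program; the instruction encoding and the \texttt{emission\_map} are invented for illustration only:

```python
# Toy object code: each instruction is (mnemonic, argument).
# 'jz' jumps to its argument when the flag is set, else falls through.
program = {
    0: ('jz', 3),      # two possible successors: 1 and 3
    1: ('nop', None),
    2: ('halt', None),
    3: ('nop', None),
    4: ('halt', None),
}

# Emit L between two *consecutive* program counters; a single pc would
# be ambiguous exactly at branch sources and at join targets.
emission_map = {(0, 1): 'L_then', (0, 3): 'L_else'}

def run(flag):
    pc, observables = 0, []
    while True:
        op, arg = program[pc]
        if op == 'halt':
            return observables
        next_pc = arg if (op == 'jz' and flag) else pc + 1
        label = emission_map.get((pc, next_pc))
        if label is not None:
            observables.append(label)  # emitted between pc and next_pc
        pc = next_pc
```

Depending on the branch taken out of address 0, a different label is observed, which is precisely why the map is keyed on pairs of addresses.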
The compiler, to be functionally correct, must preserve observational
equivalence, i.e. executing the program after each compiler pass should
yield the same stream of observables. After the injection of emission
statements, observables now capture both functional and non-functional
behaviour.
This correctness property is called a forward simulation in the literature,
and it is sufficient for correctness when the target language is deterministic.
We also require a stronger, non-functional preservation property: after each
pass all basic blocks must start with an emission statement, and all labels
\texttt{L} must be unique.
Now let $M$ be the object code obtained for the program $P$. Let us suppose
that we can statically inspect the code $M$ and associate to each basic block
a cost (e.g. the number of clock cycles required to execute all instructions
in the basic block, or an upper bound on that time). Every basic block is
labelled with a unique label \texttt{L}, thus we can actually associate the
cost to \texttt{L}. Let us call it $k(\texttt{L})$.
The function $k$ is defined as the cost model for the object code control
blocks. It can equally well be used as the cost model for the source
control blocks. Indeed, if the semantics of $P$ is the stream
$L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$ because
every instruction belongs to a control block and every control block is
labelled. Thus it is correct to say that the execution cost of $P$ is also
$\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
the blocks of the high-level program $P$ that is preserved by compilation.
How can the user profit from the high-level cost model? Suppose, for instance,
that he wants to prove that the WCET of his program is bounded by $c$. It
is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
a purely functional property of the code. He can therefore use any technique
available to certify functional properties of the source code.
What is suggested in~\cite{easylabelling} is to actually instrument the
source code $P$ by replacing every label emission statement
$\texttt{emit L}$ with the instruction $\texttt{cost += k(L)}$ that increments
a global fresh variable \texttt{cost}. The bound is now proved by establishing
the program invariant $\texttt{cost} \leq c$, which can be done for example
using the Frama-C~\cite{framaC} suite if the source code is some variant of
C.
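The instrumentation idea can be sketched in a few lines of illustrative Python (the cost table \texttt{k}, the label stream and the bound are invented for the example):

```python
# Cost model computed on the object code: label -> clock cycles.
k = {'L1': 12, 'L2': 7, 'L3': 9}

def wcet_check(emitted_labels, c):
    """Replace each 'emit L' observable by the increment cost += k(L)
    and check the invariant cost <= c along the whole run."""
    cost = 0
    for label in emitted_labels:
        cost += k[label]
        if cost > c:
            return False  # the invariant cost <= c is violated
    return True

# A run whose semantics is the stream L1 L2 L2 L3 (a loop ran twice).
trace = ['L1', 'L2', 'L2', 'L3']
total = sum(k[l] for l in trace)   # Sigma_i k(L_i)
```

Proving $\Sigma_i k(L_i) \leq c$ for every possible stream of emitted labels is the purely functional property referred to above.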
\subsection{Labelling function calls}
We now want to extend the labelling approach to support function calls.
In the high-level, \emph{structured} programming language $\mathcal{P}$ there
is not much to change.
When a function is invoked, the current basic block is temporarily exited
and the basic block the function starts with takes control. When the function
returns, the execution of the original basic block is resumed. Thus the only
significant change is that basic blocks can now be nested. Let \texttt{E}
be the label of the external block and \texttt{I} the label of a nested one.
Since the external block starts before the internal one, the semantics observed
will be \texttt{E I} and the cost associated to it on the source language will be
$k(\texttt{E}) + k(\texttt{I})$, i.e. the cost of executing all instructions
in the block \texttt{E} first plus the cost of executing all the instructions in
the block \texttt{I}. However, we know that some instructions in \texttt{E} are
executed after the last instruction in \texttt{I}. This is actually irrelevant
because we are here assuming that costs are additive, so that we can freely
permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non-trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
the function call terminates and yields back control to the basic block
\texttt{E}. If the call diverges, the instrumentation
$\texttt{cost += k(E)}$ executed at the beginning of \texttt{E} is still valid,
but just as an upper bound on the real execution cost: only precision is lost.
Let us now consider what happens when we move down the compilation chain to an
unstructured intermediate or final language. Here unstructured means that
the only control operators are conditional and unconditional jumps, function
calls and returns. Unlike a structured language, though, there is no guarantee
that a function will return control just after the function call point.
The semantics of the return statement, indeed, consists in fetching the
return address from some internal structure (typically the control stack) and
jumping directly to it. The code can freely manipulate the control stack to
make the procedure return to an arbitrary position. Indeed, it is also possible
to break the well nesting of function calls/returns.
Is it the case that the code produced by a correct compiler must respect the
additional property that every function returns just after its function call
point? The answer is negative and the property is not implied by forward
simulation proofs. For instance, imagine modifying a correct compiler pass
by systematically adding one to the return address on the stack and by
putting a \texttt{NOP} (or any other instruction that takes one byte) after
every function call. The obtained code will be functionally indistinguishable,
and the added instructions will all be dead code.
This lack of structure in the semantics badly interferes with the labelling
approach. The reason is the following: when a basic block labelled with
\texttt{E} contains a function call, it no longer makes any sense to associate
to the label \texttt{E} the sum of the costs of all the instructions in the block.
Indeed, there is no guarantee that the function will return into the block and
that the instructions that will be executed after the return will be the ones
we are paying for in the cost model.
How can we make the labelling approach work in this scenario? We only see two
possible ways. The first one consists in injecting an emission statement after
every function call: basic blocks no longer contain function calls, but are now
terminated by them. This completely solves the problem and allows the compiler
to break the structure of function calls/returns at will. However, the
technique has several drawbacks. First of all, it greatly augments the number
of cost labels that are injected in the source code and that become
instrumentation statements. Thus, when reasoning on the source code to prove
non-functional properties, the user (or the automation tool) will have to handle
larger expressions. Second, the more labels are emitted, the more difficult it
becomes to implement powerful optimizations respecting the code structure.
Indeed, function calls are usually implemented in such a way that most registers
are preserved by the call, so that the static analysis of the block is not
interrupted by the call and an optimization can involve both the code before
and after the function call. Third, instrumenting the source code may require
unpleasant modifications of it. Take, for example, the code
\texttt{f(g(x));}. We need to inject an emission statement/instrumentation
instruction just after the execution of \texttt{g}. The only way to do that
is to rewrite the code as \texttt{y = g(x); emit L$_1$; f(y);} for some fresh
variable \texttt{y}. It is pretty clear how in certain situations the obtained
code would be more obfuscated and thus more difficult to reason on manually.
For the previous reasons, in this paper and in the CerCo project we adopt a
different approach. We do not inject emission statements after
function calls. However, we want to propagate a strong additional invariant in
the forward simulation proof. The invariant is the propagation of the structure
of the original high-level code, even if the target language is unstructured.
The structure we want to propagate, which will become clearer in the next
section, comprises 1) the property that every function should return just after
the function call point, which in turn implies well nesting of function calls;
2) the property that every basic block starts with a code emission statement.
In the original labelling approach of~\cite{easylabelling}, the second property
was guaranteed statically (syntactically) as a property of the generated code.
In our revised approach, instead, we will guarantee the property dynamically:
it will be possible to generate code that does not respect it, as long as
all execution traces do respect it. For instance, dead code will no longer
be required to have all basic blocks correctly labelled. The switch is suggested
by the fact that the first of the two properties --- the one related to
function calls/returns --- can only be defined as a dynamic property, i.e.
as a property of execution traces, not of the static code. The switch is
beneficial to the proof because the original proof was made of two parts:
the forward simulation proof and the proof that the static property was guaranteed.
In our revised approach the latter disappears and only the forward simulation
is kept.
In order to capture the structure in the semantics so that it is preserved
by a forward simulation argument, we need to make the structure observable
in the semantics. This is the topic of the next Section.
\section{Structured traces}
The program semantics adopted in the traditional labelling approach is based
on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
a set of states $\mathcal{S}$, the semantics of one deterministic execution
step is
defined as a function $S \to S \times O^*$ where $O^*$ is a (finite) stream of
observables. The semantics is then lifted compositionally to multiple (finite
or infinite) execution steps.
Finally, the semantics of a whole program execution is obtained by forgetting
about the final state (if any), yielding a function $S \to O^*$ that given an
initial state returns the finite or infinite stream of observables in output.
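The lifting can be pictured with a small Python sketch of our own (the \texttt{step} function and its final-state convention are invented for the example):

```python
def step(state):
    """One deterministic execution step: S -> S x O*.
    States are integers here; each step emits a small stream."""
    if state >= 3:
        return None                      # final state: no successor
    return state + 1, [f"obs{state}"]    # next state plus emitted stream

def semantics(state):
    """Lift the single-step function to a whole (finite) run,
    forgetting the final state: S -> O*."""
    observables = []
    while True:
        outcome = step(state)
        if outcome is None:
            return observables
        state, emitted = outcome
        observables.extend(emitted)
```

The structured traces introduced next replace the flat list returned by \texttt{semantics} with a richer data type.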
We present here a new definition of semantics where the structure of execution,
as defined in the previous Section, is now observable. The idea is to replace
the stream of observables with a structured data type that makes explicit
function calls and returns and that guarantees some additional invariants by
construction. The data structure, called \emph{structured traces}, is
defined inductively for terminating programs and coinductively for diverging
ones. In the paper we focus only on the inductive structure, i.e. we assume
that all programs that are given a semantics are total. The Matita formalization
also shows the coinductive definitions. The semantics of a program is then
defined as a function that maps an initial state into a structured trace.

In order to have a definition that works on multiple intermediate languages,
we abstract the type of structured traces over an abstract data type of
abstract statuses:
record abstract_status := {
   S: Type[0]
;  as_execute: S -> S -> Prop
;  as_classify: S -> classification
;  as_costed: S -> Prop
;  as_after_return:
    (Σs:S. as_classify s = Some ? cl_call) → S → Prop
}
The predicate $\texttt{as\_execute}~s_1~s_2$ holds if $s_1$ evolves into
$s_2$ in one step; $\texttt{as\_classify}~s~c$ holds if the next instruction
to be executed in $s$ is classified according to $c \in
\{\texttt{return,jump,call,other}\}$ (we omit tailcalls for simplicity);
the predicate $\texttt{as\_costed}~s$ holds if the next instruction to be
executed in $s$ is a cost emission statement (also classified
as \texttt{other}); finally $(\texttt{as\_after\_return}~s_1~s_2)$ holds
if the next instruction to be executed in $s_2$ follows the function call
to be executed in (the witness of the $\Sigma$-type) $s_1$.
The inductive type of structured traces is actually made of three mutually
inductive types with the following semantics:
\begin{itemize}
 \item $(\texttt{trace\_label\_return}~s_1~s_2)$ is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_1$ is a label emission
   statement and the one to be executed in the state before $s_2$ is a return
   statement. Thus $s_2$ is the state after the return. The trace
   may contain other label emission statements. It captures the structure of
   the execution of function bodies: they must start with a cost emission
   statement and must end with a return; they are obtained by concatenating
   one or more basic blocks, all starting with a label emission
   (e.g. in case of loops).
 \item $(\texttt{trace\_any\_label}~b~s_1~s_2)$ is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_2$/in the state before
   $s_2$ is either a label emission statement or
   a return, according to the boolean $b$. It must not contain
   any label emission statement. It captures the notion of a suffix of a
   basic block.
 \item $(\texttt{trace\_label\_label}~b~s_1~s_2)$ is the special case of
   $(\texttt{trace\_any\_label}~b~s_1~s_2)$ such that the instruction to be
   executed in $s_1$ is a label emission statement. It captures the notion of
   a basic block.
\end{itemize}
inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
  | tlr_base:
      ∀status_before: S.
      ∀status_after: S.
        trace_label_label S ends_with_ret status_before status_after →
        trace_label_return S status_before status_after
  | tlr_step:
      ∀status_initial: S.
      ∀status_labelled: S.
      ∀status_final: S.
        trace_label_label S doesnt_end_with_ret status_initial status_labelled →
        trace_label_return S status_labelled status_final →
          trace_label_return S status_initial status_final
with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
  | tll_base:
      ∀ends_flag: trace_ends_with_ret.
      ∀start_status: S.
      ∀end_status: S.
        trace_any_label S ends_flag start_status end_status →
        as_costed S start_status →
          trace_label_label S ends_flag start_status end_status
with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
  (* Single steps within a function which reach a label.
     Note that this is the only case applicable for a jump. *)
  | tal_base_not_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        (as_classifier S start_status cl_jump ∨
         as_classifier S start_status cl_other) →
        as_costed S final_status →
          trace_any_label S doesnt_end_with_ret start_status final_status
  | tal_base_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        as_classifier S start_status cl_return →
          trace_any_label S ends_with_ret start_status final_status
  (* A call followed by a label on return. *)
  | tal_base_call:
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_final →
          trace_label_return S status_start_fun_call status_final →
          as_costed S status_final →
            trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
  (* A call followed by a non-empty trace. *)
  | tal_step_call:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_after_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_after_fun_call →
          trace_label_return S status_start_fun_call status_after_fun_call →
          ¬ as_costed S status_after_fun_call →
          trace_any_label S end_flag status_after_fun_call status_final →
            trace_any_label S end_flag status_pre_fun_call status_final
  | tal_step_default:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre: S.
      ∀status_init: S.
      ∀status_end: S.
        as_execute S status_pre status_init →
        trace_any_label S end_flag status_init status_end →
        as_classifier S status_pre cl_other →
        ¬ (as_costed S status_init) →
          trace_any_label S end_flag status_pre status_end.
A \texttt{trace\_label\_return} is isomorphic to a list of
\texttt{trace\_label\_label}s that end with a cost emission followed by a
\texttt{trace\_label\_label} that ends with a return.
The interesting cases are those of $(\texttt{trace\_any\_label}~b~s_1~s_2)$.
A \texttt{trace\_any\_label} is a sequence of steps built by a syntax directed
definition on the classification of $s_1$. The constructors of the datatype
impose several invariants that are meant to impose a structure on the
otherwise unstructured execution. In particular, the following invariants are
satisfied:
\begin{itemize}
 \item the trace is never empty; it ends with a return iff $b$ is
       true (\texttt{ends\_with\_ret});
 \item a jump must always be the last instruction of the trace, and it must
       be followed by a cost emission statement; i.e. the target of a jump
       is always the beginning of a new basic block; as such it must start
       with a cost emission statement;
 \item a cost emission statement can never occur inside the trace, only in
       the status immediately after;
 \item the trace for a function call step is made from a subtrace of type
       $(\texttt{trace\_label\_return}~s_1~s_2)$, possibly followed by the
       rest of the trace for this basic block. The subtrace represents the
       function execution. Being an inductive datum, it guarantees totality of
       the function call. The status $s_2$ is the one that follows the return
       statement. The next instruction of $s_2$ must follow the function call
       instruction. As a consequence, function calls are also well nested.
\end{itemize}
The three mutual structural recursive functions \texttt{flatten\_trace\_label\_return}, \texttt{flatten\_trace\_label\_label} and \texttt{flatten\_trace\_any\_label}
allow one to extract from a structured trace the list of states whose next
instruction is a cost emission statement. We only show here the type of one
of them:

let rec flatten_trace_label_return
  (S: abstract_status)
    (start_status: S) (final_status: S)
      (the_trace: trace_label_return S start_status final_status)
        on the_trace: list (as_cost_label S)
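As an informal illustration of flattening (a Python toy of ours, not the Matita definition): represent a structured trace as a list of steps, where a call step carries the sub-trace of the callee's body, and collect the labelled states in execution order.

```python
# A toy structured trace: a list of steps; 'emit' steps carry a cost
# label, 'call' steps carry the nested trace of the callee's body.
# This loosely mirrors trace_label_return / tal_step_call.
trace = [
    ('emit', 'L1'),
    ('other', None),
    ('call', [('emit', 'L2'), ('other', None), ('return', None)]),
    ('other', None),
    ('return', None),
]

def flatten(tr):
    """Collect, in execution order, the cost labels emitted along
    the trace, descending into sub-traces attached to call steps."""
    labels = []
    for kind, payload in tr:
        if kind == 'emit':
            labels.append(payload)
        elif kind == 'call':
            labels.extend(flatten(payload))  # the callee's labels
    return labels
```

Under a cost model $k$, summing $k$ over the flattened label list gives the clock difference across the trace, which is the content of the cost-prediction theorem.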
\paragraph{Cost prediction on structured traces}

The first main theorem of CerCo about structured traces
(theorem \texttt{compute\_max\_trace\_label\_return\_cost\_ok\_with\_trace})
holds for the instantiation
of the structured traces to the concrete status of object code programs.
Simplifying a bit, it states that
$\forall s_1,s_2. \forall \tau: \texttt{trace\_label\_return}~s_1~s_2.
  clock~s_2 = clock~s_1 + \Sigma_{s \in \texttt{flatten\_trace\_label\_return}~\tau} k(L(s))$ where $L$ maps a labelled state to its emitted label, and the
cost model $k$ is statically computed from the object code
by associating to each label \texttt{L} the sum of the costs of the instructions
in the basic block that starts at \texttt{L} and ends before the next labelled
instruction. The theorem is proved by structural induction over the structured
trace, and is based on the invariant that
if the function that computes the cost model has analyzed the instruction
to be executed at $s_2$ after the one to be executed at $s_1$, and if
the structured trace starts with $s_1$, then eventually it will also contain
$s_2$. When $s_1$ is not a function call, the result holds trivially because
of the $(\texttt{as\_execute}~s_1~s_2)$ condition obtained by inversion on
the trace. The only non-trivial
case is that of function calls: the cost model computation function
recurs on the first instruction that follows the function call; the
\texttt{as\_after\_return} condition of the \texttt{tal\_base\_call} and
\texttt{tal\_step\_call} constructors guarantees exactly that the execution
will eventually reach this state.
526\paragraph{Structured traces similarity and cost prediction invariance}
528A compiler pass maps source to object code and initial states to initial
529states. The source code and initial state uniquely determine the structured
530trace of a program, if it exists. The structured trace fails to exists iff
531the structural conditions are violated by the program execution (e.g. a function
532body does not start with a cost emission statement). Let us assume that the
533target structured trace exists.
What is the relation between the source and target structured traces?
In general, the two traces can be arbitrarily different. However, we are
interested only in those compiler passes that map a trace $\tau_1$ to a trace
$\tau_2$ such that
\begin{displaymath}(\texttt{flatten\_trace\_label\_return}~\tau_1 = \texttt{flatten\_trace\_label\_return}~\tau_2)\label{condition1}\end{displaymath}
The reason is that when this happens we obviously have the
following corollary to the previous theorem:
$$\forall s_1,s_2. \forall \tau_1: \texttt{trace\_label\_return}~s_1~s_2.
  clock~s_2 - clock~s_1 = \Sigma_{s \in \texttt{flatten\_trace\_label\_return}~\tau_1} k(L(s)) = \Sigma_{s \in \texttt{flatten\_trace\_label\_return}~\tau_2} k(L(s))$$
The corollary states that the actual execution time of the program can be
computed equally well on the source or target language. Thus it becomes
possible to transfer the cost model from the target to the source code and
reason on the source code only.
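The corollary above can be rendered as a small executable sketch: when two flattened traces expose the same sequence of cost labels, the cost model $k$ predicts the same total cost for both runs. The labels and per-block costs below are invented purely for illustration.

```python
# k maps each cost label to the statically computed cost of its basic block
# (the concrete labels and numbers here are hypothetical).
k = {"L1": 4, "L2": 7, "Lret": 2}

def cost(flattened_trace):
    """Total predicted cost: the sum of k over the emitted labels."""
    return sum(k[label] for label in flattened_trace)

tau1 = ["L1", "L2", "Lret"]   # flattened source-level trace
tau2 = ["L1", "L2", "Lret"]   # flattened target-level trace

assert tau1 == tau2                      # equal flattenings (condition (1))
assert cost(tau1) == cost(tau2) == 13    # hence equal predicted cost
```

The point of the sketch is that, under condition (1), the cost computation never needs to look at the target trace again: the source-level flattening suffices.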
We are therefore interested in conditions stronger than~\ref{condition1}.
We introduce here a similarity relation between traces that captures
the fact that two traces have the same structure and that
implies~\ref{condition1} (theorem \texttt{tlr\_rel\_to\_traces\_same\_flatten}).
Two traces are similar when one can be obtained from
the other by erasing or inserting steps classified as \texttt{other} into the
\texttt{trace\_any\_label} blocks of the trace. In particular,
the relation maps function calls to function calls to the same function,
label emission statements to emissions of the same label, concatenations of
subtraces to concatenations of subtraces of the same length and starting with
the same emission statement, etc.
The definition of the similarity relation is simple, but it involves a large
number of cases. We only show a few of them here:
let rec tlr_rel S1 st1 st1' S2 st2 st2'
  (tlr1 : trace_label_return S1 st1 st1')
  (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
match tlr1 with
  [ tlr_base st1 st1' tll1 ⇒
    match tlr2 with
    [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
    | _ ⇒ False
    ]
  | tlr_step st1 st1' st1'' tll1 tl1 ⇒
    match tlr2 with
    [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
      tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
    | _ ⇒ False
    ]
  ]
and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
 (tll1 : trace_label_label S1 fl1 st1 st1')
 (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
  match tll1 with
  [ tll_base fl1 st1 st1' tal1 H ⇒
    match tll2 with
    [ tll_base fl2 st2 st2' tal2 G ⇒
      as_label_safe … («?, H») = as_label_safe … («?, G») ∧
      tal_rel … tal1 tal2
    ]
  ]
and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
 (tal1 : trace_any_label S1 fl1 st1 st1')
 (tal2 : trace_any_label S2 fl2 st2 st2')
   on tal1 : Prop ≝
  match tal1 with
  [ tal_base_not_return st1 st1' _ _ _ ⇒
    fl2 = doesnt_end_with_ret ∧
    ∃st2mid,taa,H,G,K.
    tal2 ≃ taa_append_tal ? st2 ??? taa
      (tal_base_not_return ? st2mid st2' H G K)
  | tal_base_return st1 st1' _ _ ⇒
    fl2 = ends_with_ret ∧
    ∃st2mid,taa,H,G.
    tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
      (tal_base_return ? st2mid st2' H G)
  | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
    fl2 = doesnt_end_with_ret ∧
    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    (* we must allow a tal_base_call to be similar to a call followed
      by a collapsable trace (trace_any_any followed by a base_not_return;
      we cannot use trace_any_any as it disallows labels at the end as soon
      as it is non-empty) *)
    (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
      tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
      tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
  | tal_base_tailcall st1 st1' st1'' _ prf tlr1 ⇒
    fl2 = ends_with_ret ∧
    ∃st2mid,G.as_tailcall_ident S2 («st2mid, G») = as_tailcall_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    ∃tlr2 : trace_label_return ? st2mid' st2'.
      tal2 ≃ taa @ (tal_base_tailcall … H G tlr2) ∧ tlr_rel … tlr1 tlr2
  | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
      tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
      tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
      tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
  | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
    tal_rel … tl1 tal2 (* <- this makes it many to many *)
  ].
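The erasure intuition behind \texttt{tal\_rel} can be conveyed by a drastically simplified executable sketch: collapse the nested trace structure into a flat list of steps, erase the steps classified as \texttt{other}, and compare the remaining skeletons of observables. This deliberately ignores the dependent-type bookkeeping of the real definition, and the step encoding below is invented for illustration only.

```python
# Hypothetical flat encoding of trace steps: observables are tuples,
# non-observable steps are the "other" marker.
OTHER = ("other",)

def skeleton(trace):
    """Erase non-observable steps, keeping labels, calls and returns."""
    return [step for step in trace if step != OTHER]

def similar(t1, t2):
    # Many-to-many: traces may differ by any amount of "other" padding.
    return skeleton(t1) == skeleton(t2)

source = [("label", "L1"), OTHER, ("call", "f"), ("return",), ("label", "L2")]
target = [("label", "L1"), ("call", "f"), OTHER, OTHER, ("return",),
          ("label", "L2")]

assert similar(source, target)
assert not similar(source, [("label", "L1")])
```

Two similar traces in this toy model clearly have equal flattenings, which is the content of \texttt{tlr\_rel\_to\_traces\_same\_flatten} in the mechanized development.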
\section{Forward simulation}
We summarize here the results of the previous sections. Each intermediate
unstructured language can be given a semantics based on structured traces,
which singles out those runs that respect a certain number of invariants.
A cost model can be computed on the object code and used to predict
the execution costs of runs that produce structured traces. The cost model
can be lifted from the target to the source code of a pass if the pass maps
structured traces to similar structured traces. The latter property is called
a \emph{forward simulation}.
As for labelled transition systems, in order to establish the forward
simulation we are interested in (preservation of observables), we are
forced to prove a stronger notion of forward simulation that introduces
an explicit relation between states. The classical notion of a 1-to-0-or-many
forward simulation is the existence of a relation $R$ over states such that
if $s_1 R s_2$ and $s_1 \to^1 s_1'$ then there exists an $s_2'$ such that
$s_2 \to^* s_2'$ and $s_1' R s_2'$. In our context, we need to replace the
one and multi step transition relations $\to^n$ with the existence of
a structured trace between the two states, and we need to add the requirement
that the two structured traces are similar. Thus what we would like to state
is something like:\\
for all $s_1,s_2,s_1'$ such that there is a $\tau_1$ from
$s_1$ to $s_1'$ and $s_1 R s_2$, there exist an $s_2'$ such that
$s_1' R s_2'$ and a $\tau_2$ from $s_2$ to $s_2'$ such that
$\tau_1 \approx \tau_2$.\\
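The classical 1-to-0-or-many diagram can be checked mechanically on a finite toy transition system: for every related pair $s_1 \mathrel{R} s_2$ and every source step $s_1 \to^1 s_1'$, some $s_2'$ reachable from $s_2$ in zero or more target steps must satisfy $s_1' \mathrel{R} s_2'$. The states, transitions and relation below are invented for illustration.

```python
step1 = {0: [1], 1: [2], 2: []}                 # source transitions
step2 = {10: [11], 11: [12], 12: [20], 20: []}  # target transitions

def reachable(s, fuel=5):
    """States reachable from s in at most `fuel` target steps (incl. s)."""
    result = {s}
    for _ in range(fuel):
        result |= {t for r in result for t in step2[r]}
    return result

R = {(0, 10), (1, 11), (2, 20)}                 # the candidate relation

def simulates():
    """Check the 1-to-0-or-many diagram for every related pair."""
    return all(
        any((s1p, s2p) in R for s2p in reachable(s2))
        for (s1, s2) in R
        for s1p in step1[s1]
    )

assert simulates()
```

Note that one source step (0 to 1) is matched here by two target steps (10 to 11 to 12 would overshoot, but 10 to 11 suffices), while the structured-trace version additionally demands that the matching target run be a similar structured trace, not just any sequence of steps.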
The statement just introduced, however, is too simplistic and not provable
in the general case. To understand why, consider the case of a function call
and the pass that fixes the parameter passing conventions. A function
call in the source code takes in input an arbitrary number of pseudo-registers
(the actual parameters to pass) and returns an arbitrary number of
pseudo-registers (where the result is stored). A function call in the target
code has no input nor output parameters. The pass must add explicit code
before and after the function call to move the pseudo-register contents
from/to the hardware registers or the stack in order to implement the
parameter passing strategy. Similarly, each function body must be augmented
with a preamble and a postamble to complete/initiate the parameter passing
strategy for the call/return phase. Therefore what used to be a call followed
by the next instruction to execute after the function return now becomes a
sequence of instructions, followed by a call, followed by another sequence.
The two states at the beginning of the first sequence and at the end of the
second sequence are in relation with the states before/after the call in the
source code, as in a usual forward simulation. However, how can we prove the
additional condition for function calls that asks that when the function
returns the instruction immediately after the function call is executed? To
guarantee this invariant, there must be another relation between the address
of the function call in the source and in the target code. This additional
relation is to be used in particular to relate the two stacks.
Another example is given by the preservation of code emission statements. A
single code emission instruction can be simulated by a sequence of steps,
followed by a code emission, followed by another sequence. Clearly the initial
and final states of the sequence are to be in relation with the states
before/after the code emission in the source code. In order to preserve the
structured trace invariants, however, we must consider a second relation
between states that traces the preservation of the code emission statement.
Therefore we now introduce an abstract notion of relation set between abstract
statuses and an abstract notion of 1-to-0-or-many forward simulation. These
two definitions enjoy the following remarkable properties:
 \item they are generic enough to accommodate all passes of the CerCo compiler
 \item they allow us to lift the 1-to-0-or-many forward simulation to the
   structured-trace-preserving forward simulation we are interested in
The latter lifting is proved by the following theorem, which has the
1-to-0-or-many forward simulation statement as a hypothesis. Combined with
the results in the previous section, it has as a corollary that the cost
model can be computed on the object code and lifted to the source code to
reason on non-functional properties iff a simpler 1-to-0-or-many forward
simulation is proved. The statement of the latter is considerably more complex
than a traditional 1-to-0-or-many forward simulation statement, but the proofs
now have about the same complexity. Indeed, it should be possible to largely
reuse an existing traditional forward simulation proof. The lifting theorem
below, proved once and for all in this paper, is the one that does all the
work to put the labelling approach in motion.
definition status_simulation ≝
  λS1 : abstract_status.
  λS2 : abstract_status.
  λsim_status_rel : status_rel S1 S2.
    ∀st1,st1',st2.as_execute S1 st1 st1' →
    sim_status_rel st1 st2 →
    match as_classify … st1 with
    [ None ⇒ True
    | Some cl ⇒
      match cl with
      [ cl_call ⇒ ∀prf.
        (*
             st1' ------------S----------\
              ↑ \                         \
             st1 \--L--\                   \
              | \       \                   \
              S  \-C-\  st2_after_call →taa→ st2'
              |       \     ↑
             st2 →taa→ st2_pre_call
        *)
        ∃st2_pre_call.
        as_call_ident ? st2_pre_call = as_call_ident ? («st1, prf») ∧
        call_rel ?? sim_status_rel «st1, prf» st2_pre_call ∧
        ∃st2_after_call,st2'.
        ∃taa2 : trace_any_any … st2 st2_pre_call.
        ∃taa2' : trace_any_any … st2_after_call st2'.
        as_execute … st2_pre_call st2_after_call ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2_after_call
      | cl_tailcall ⇒ ∀prf.
        (*
             st1' ------------S----------\
              ↑ \                         \
             st1 \--L--\                   \
              |         \                   \
              S         st2_after_call →taa→ st2'
              |             ↑
             st2 →taa→ st2_pre_call
        *)
        ∃st2_pre_call.
        as_tailcall_ident ? st2_pre_call = as_tailcall_ident ? («st1, prf») ∧
        ∃st2_after_call,st2'.
        ∃taa2 : trace_any_any … st2 st2_pre_call.
        ∃taa2' : trace_any_any … st2_after_call st2'.
        as_execute … st2_pre_call st2_after_call ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2_after_call
      | cl_return ⇒
        (*
             st1
            / ↓
           | st1'----------S,L------------\
           S   \                           \
            \   \-----R-------\            |
             \                 |           |
             st2 →taa→ st2_ret |           |
                          ↓   /            |
                     st2_after_ret →taaf→ st2'

           we also ask that st2_after_ret not be labelled if the taaf tail is
           not empty
        *)
        ∃st2_ret,st2_after_ret,st2'.
        ∃taa2 : trace_any_any … st2 st2_ret.
        ∃taa2' : trace_any_any_free … st2_after_ret st2'.
        (if taaf_non_empty … taa2' then ¬as_costed … st2_after_ret else True) ∧
        as_classifier … st2_ret cl_return ∧
        as_execute … st2_ret st2_after_ret ∧ sim_status_rel st1' st2' ∧
        ret_rel … sim_status_rel st1' st2_after_ret ∧
        label_rel … st1' st2'
      | cl_other ⇒
          (*
          st1 → st1'
            |      \
            S      S,L
            |        \
           st2 →taaf→ st2'

           the taaf can be empty (e.g. tunneling) but we ask that it not be
           empty when both st1 and st1' are labelled (we would be able to
           collapse labels otherwise)
         *)
        ∃st2'.
        ∃taa2 : trace_any_any_free … st2 st2'.
        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2'
      | cl_jump ⇒
        (* just like cl_other, but with one more hypothesis *)
        as_costed … st1' →
        ∃st2'.
        ∃taa2 : trace_any_any_free … st2 st2'.
        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2'
      ]
    ].
status_simulation_produce_tlr S1 S2 R
(* we start from this situation
     st1 →→→→tlr→→→→ st1'
      | \
      L  \---S--\
      |          \
   st2_lab →taa→ st2   (the taa preamble is in general either empty or given
                        by the preceding call)

   and we produce
     st1 →→→→tlr→→→→ st1'
             \\      /  \
             //     R    \-L,S-\
             \\     |           \
   st2_lab →tlr→ st2_mid →taaf→ st2'
*)
  st1 st1' st2_lab st2
  (tlr1 : trace_label_return S1 st1 st1')
  (taa2_pre : trace_any_any S2 st2_lab st2)
  (sim_execute : status_simulation S1 S2 R)
  on tlr1 : R st1 st2 → label_rel … st1 st2_lab →
  ∃st2_mid.∃st2'.
  ∃tlr2 : trace_label_return S2 st2_lab st2_mid.
  ∃taa2 : trace_any_any_free … st2_mid st2'.
  (if taaf_non_empty … taa2 then ¬as_costed … st2_mid else True) ∧
  R st1' st2' ∧ ret_rel … R st1' st2_mid ∧ label_rel … st1' st2' ∧
  tlr_rel … tlr1 tlr2
\section{Conclusions and future work}
The labelling approach is a technique to implement compilers that induce on
the source code a non-uniform cost model determined from the object code
produced. The cost model assigns a cost to each basic block of the program.
The main theorem of the labelling approach says that there is an exact
correspondence between the sequence of basic blocks started in the source
and object code, and that no instruction in the source or object code is
executed outside a basic block. Thus the overall cost of object code execution
can be computed precisely on the source code.
In this paper we scale the labelling approach to cover a programming language
with function calls. This introduces new difficulties only when the language
is unstructured, i.e. it allows function calls to return anywhere in the code,
destroying the hope of a static prediction of the cost of basic blocks.
We restore static predictability by introducing a new semantics for unstructured
programs that singles out well structured executions. The latter are represented
by structured traces, a generalization of streams of observables that capture
several structural invariants of the execution, such as the well nesting of
function calls or the fact that every basic block must start with a code
emission statement. We show that structured traces are sufficiently structured
to statically compute a precise cost model on the object code.
We introduce a similarity relation on structured traces that must hold between
source and target traces. When the relation holds for every program, we prove
that the cost model can be lifted from the object to the source code.
In order to prove that similarity holds, we present a generic proof of forward
simulation that is aimed at separating as much as possible the part of the
simulation related to non-functional properties (preservation of structure)
from that related to functional properties. In particular, we reduce the
problem of preservation of structure to that of showing a 1-to-0-or-many
forward simulation that only adds a few proof obligations to those
of a traditional proof of functional properties only.
All the results presented in the paper are part of a larger certification of a
C compiler which is based on the labelling approach. The certification, done
in Matita, is the main deliverable of the European Project CerCo.

The short-term future work consists in the completion of the certification of
the CerCo compiler. In particular, the results presented in this paper will
be used to certify all the passes of the back-end of the compiler.
\paragraph{Related work}
CerCo is the first certification project which explicitly tries to induce a
precise cost model on the source code in order to establish non-functional
properties of programs in a high-level language. Traditional certifications
of compilers, like~\cite{compcert2,piton}, only explicitly prove preservation
of the functional properties via forward simulations.
Usually forward simulations take the following form: for each transition
from $s_1$ to $s_2$ in the source code, there exists an equivalent sequence of
transitions in the target code of length $n$. The number $n$ of transition
steps in the target code can just be the witness of the existential statement.
An equivalent alternative, when the proof of simulation is constructive,
consists in providing an explicit function, called a \emph{clock function} in
the literature~\cite{clockfunctions}, that computes $n$ from $s_1$. Every
clock function then constitutes a cost model for the source code, in the
spirit of what we are doing in CerCo. However, we believe our solution to be
superior on the following points: 1) the machinery of the labelling approach
is insensitive to the resource being measured. Indeed, any cost model computed
on the object code can be lifted to the source code (e.g. stack space used,
energy consumed, etc.). On the contrary, clock functions only talk about the
number of transition steps. In order to extend the clock function approach
to other resources, additional functions must be introduced; moreover, the
additional functions would be handled differently in the proof.
2) the cost models induced by the labelling approach have a simple
presentation. In particular, they associate a number to each basic block.
More complex models can be induced when the approach is scaled to cover, for
instance, loop optimizations~\cite{loopoptimizations}, but the costs are
still meant to be easy to understand and manipulate in an interactive theorem
prover. On the contrary, a clock function is a complex function of the state
$s_1$ which, as a function, is an opaque object that is difficult to reify as
source code in order to reason on it.
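The clock function idea discussed above can be sketched executably: a function that, from a source state (here reduced to the next instruction), returns the number $n$ of target transition steps simulating it. The instruction set and expansion factors below are invented for illustration; note that, as argued above, such a function measures only transition steps, not other resources.

```python
# Hypothetical expansion factors: target steps per source instruction.
clock = {"load": 2, "add": 1, "store": 2}

def target_steps(run):
    """The cost model induced by the clock function:
    total number of target transition steps for a source run."""
    return sum(clock[i] for i in run)

assert target_steps(["load", "add", "store"]) == 5
assert target_steps([]) == 0
```

To measure a second resource (say, stack space) a second, independent function would have to be introduced and threaded through the simulation proof, which is precisely the overhead the labelling approach avoids.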