\documentclass{llncs}
\usepackage{hyperref}
\usepackage{graphicx}
\usepackage{color}
\usepackage{listings}
\usepackage{bcprules}
\usepackage{verbatim}
\usepackage{alltt}

% NB: might be worth removing this if changing class in favour of
% a built-in definition.
%\newtheorem{theorem}{Theorem}
\newtheorem{condition}{Condition}

\lstdefinelanguage{coq}
  {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
   morekeywords={[2]if,then,else,forall,Prop,match,with,end,let},
  }

\lstdefinelanguage[mine]{C}[ANSI]{C}{
  morekeywords={int8_t}
}

\lstset{language=[mine]C,basicstyle=\footnotesize\tt,columns=flexible,breaklines=false,
        keywordstyle=\color{red}\bfseries,
        keywordstyle=[2]\color{blue},
        commentstyle=\color{green},
        stringstyle=\color{blue},
        showspaces=false,showstringspaces=false,
        xleftmargin=1em}

\begin{document}
\pagestyle{plain}

\title{Certification of the Preservation of Structure by a Compiler's Back-end Pass\thanks{The project CerCo acknowledges the financial support of the Future and
Emerging Technologies (FET) programme within the Seventh Framework
Programme for Research of the European Commission, under FET-Open grant
number: 243881}}
\author{Paolo Tranquilli \and Claudio Sacerdoti Coen}
\institute{Department of Computer Science and Engineering, University of Bologna,\\\email{Paolo.Tranquilli@unibo.it}, \email{Claudio.SacerdotiCoen@unibo.it}}
\maketitle
\begin{abstract}
The labelling approach is a technique to lift cost models for non-functional
properties of programs from the object code to the source code. It is based
on the preservation of the structure of the high-level program in every
intermediate language used by the compiler. Such structure is captured by
observables that are added to the semantics and that need to be preserved
by the forward simulation proof of correctness of the compiler. Additional
special observables are required for function calls. In this paper we
present a generic forward simulation proof that preserves all these observables.
The proof statement is based on a new mechanised semantics that traces the
structure of execution when the language is unstructured. The generic semantics
and simulation proof have been mechanised in the interactive theorem prover
Matita.
\end{abstract}

\section{Introduction}

The \emph{labelling approach} has been introduced in~\cite{easylabelling} as
a technique to \emph{lift} cost models for non-functional properties of programs
from the object code to the source code. Examples of non-functional properties
are execution time, amount of stack/heap space consumed and energy required for
communication. The basic idea of the approach is that it is impossible to
provide a \emph{uniform} cost model for a high-level language that is preserved
\emph{precisely} by a compiler. For instance, two instances of an assignment
$x = y$ in the source code can be compiled very differently according to the
place (registers vs stack) where $x$ and $y$ are stored at the moment of
execution. Therefore a precise cost model must assign a different cost
to every occurrence, and the exact cost can only be known after compilation.

According to the labelling approach, the compiler is free to compile and optimise
the source code without any major restriction, but it must keep track
of what happens to basic blocks during the compilation. The cost model is
then computed on the object code. It assigns a cost to every basic block.
Finally, the compiler propagates the cost model back to the source level,
assigning a cost to each basic block of the source code.

Implementing the labelling approach in a certified compiler
makes it possible to reason formally on the high-level source code of a program to prove
non-functional properties that are guaranteed to be preserved by the compiler
itself. The trusted code base is then reduced to 1) the interactive theorem
prover (or its kernel) used in the certification of the compiler and
2) the software used to certify the property on the source language, which
can itself be certified, further reducing the trusted code base.
In~\cite{easylabelling} the authors provide an example of a simple
certified compiler that implements the labelling approach for the
imperative \texttt{While} language~\cite{while}, which has neither
pointers nor function calls.

The labelling approach has been shown to scale to more interesting scenarios.
In particular in~\cite{functionallabelling} it has been applied to a functional
language, and in~\cite{loopoptimizations} it has been shown that the approach
can be slightly complicated to handle loop optimisations and, more generally,
program optimisations that do not preserve the structure of basic blocks.
On-going work also shows that the labelling approach is compatible with
the complex analyses required to obtain a cost model for object code
on processors that implement advanced features like pipelining, superscalar
architectures and caches.

In the European Project CerCo (Certified Complexity\footnote{\url{http://cerco.cs.unibo.it}})~\cite{cerco} we are certifying a labelling approach based compiler for a large subset of C to
8051 object code. The compiler is
moderately optimising and implements a compilation chain that is largely
inspired by that of CompCert~\cite{compcert1,compcert2}. Compared to the work done in~\cite{easylabelling}, the main novelty and source of difficulties is the presence
of function calls. Surprisingly, the addition of function calls requires a
revision of the proof technique given in~\cite{easylabelling}. In
particular, at the core of the labelling approach there is a forward
simulation proof that, in the case of \texttt{While}, is only minimally
more complex than the proof required for the preservation of the
functional properties alone. In the case of a programming language with
function calls, instead, it turns out that the forward simulation proof for
the back-end languages must establish a whole new set of invariants.

In this paper we present a formalisation in the Matita interactive theorem
prover~\cite{matita1,matita2} of a generic version of the simulation proof required for unstructured
languages. All back-end languages of the CerCo compiler are unstructured
languages, so the proof covers half of the correctness of the compiler.
The statement of the generic proof is based on a new semantics
for imperative unstructured languages that is based on \emph{structured
traces} and that restores the preservation of structure in the observables of
the semantics. The generic proof makes it possible to almost completely separate the
part of the simulation that deals with functional properties only from the
part that deals with the preservation of structure.

The plan of this paper is the following. In Section~\ref{labelling} we
sketch the labelling method and the problems derived from its application
to languages with function calls. In Section~\ref{semantics} we introduce
a generic description of an unstructured imperative language and the
corresponding structured traces (the novel semantics). In
Section~\ref{simulation} we describe the forward simulation proof.
Conclusions and future work are in Section~\ref{conclusions}.

\section{The labelling approach}
\label{labelling}
We briefly sketch here a simplified version of the labelling approach as
introduced in~\cite{easylabelling}. The simplification strengthens the
sufficient conditions given in~\cite{easylabelling} to allow a simpler
explanation. The simplified conditions given here are also used in the
CerCo compiler to simplify the proof.

Let $\mathcal{P}$ be a programming language whose semantics is given in
terms of observables: a run of a program yields a finite or infinite
stream of observables. We also assume for the time being that function
calls are not available in $\mathcal{P}$. We want to associate a cost
model to a program $P$ written in $\mathcal{P}$. The first step is to
extend the syntax of $\mathcal{P}$ with a new construct $\texttt{emit L}$
where $L$ is a label distinct from all observables of $\mathcal{P}$.
The semantics of $\texttt{emit L}$ is the emission of the observable
\texttt{L} that is meant to signal the beginning of a basic block.

There exists an automatic procedure that injects into the program $P$ an
$\texttt{emit L}$ at the beginning of each basic block, using a fresh
\texttt{L} for each block. In particular, the bodies of loops, both branches
of \texttt{if-then-else}s and the targets of \texttt{goto}s must all start
with an emission statement.
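The injection procedure can be sketched concretely. The following Python fragment is a toy illustration, not the CerCo implementation: statements are encoded as tagged tuples, and only loop bodies and conditional branches are labelled (a complete procedure would also label join points and \texttt{goto} targets).

```python
from itertools import count

_fresh = count(1)

def fresh():
    """Generate a fresh cost label."""
    return f"L{next(_fresh)}"

def label_stmt(s):
    """Prefix every basic block (loop bodies, both branches of a
    conditional) with a fresh `emit` statement."""
    op = s[0]
    if op == "seq":
        return ("seq", label_stmt(s[1]), label_stmt(s[2]))
    if op == "if":
        return ("if", s[1],
                ("seq", ("emit", fresh()), label_stmt(s[2])),
                ("seq", ("emit", fresh()), label_stmt(s[3])))
    if op == "while":
        return ("while", s[1], ("seq", ("emit", fresh()), label_stmt(s[2])))
    return s  # skip, assign, emit: not block-opening constructs

def label_program(p):
    # The program entry point opens a basic block too.
    return ("seq", ("emit", fresh()), label_stmt(p))
```

For instance, `label_program(("while", "b", ("assign", "x", "y")))` labels both the program entry and the loop body, yielding two fresh emission statements.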

Now let $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
organised in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
language used by the compiler. We can easily extend every
intermediate language (and its semantics) with an $\texttt{emit L}$ statement
as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
the additional difficulty that the syntax of object code is given as a
sequence of bytes. The injection of an emission statement in the object code
can be done using a map that associates the statement to a pair of consecutive
code addresses. The intended semantics is that, if $(pc_1,pc_2) \mapsto \texttt{emit L}$ then the observable \texttt{L} is emitted after the execution of the
instruction stored at $pc_1$ and before the execution of the instruction
stored at $pc_2$. The two program counters are necessary because the
instruction stored at $pc_1$ can have multiple possible successors (e.g.
in case of a conditional branch or an indirect call). Dually, the instruction
stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
target of a jump).
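The address-pair map can be pictured as a finite dictionary consulted at every executed transition. The sketch below uses invented addresses and labels; it only illustrates how observables would be produced while running the object code.

```python
# Map from pairs of consecutive programme counters to the label emitted
# on that transition (addresses and labels are invented for the example).
emit_map = {
    (0x100, 0x102): "L1",   # fall-through successor of 0x100
    (0x100, 0x110): "L2",   # branch-taken successor of 0x100
}

def observables_of_step(pc1, pc2):
    """Observables emitted on the executed transition pc1 -> pc2."""
    label = emit_map.get((pc1, pc2))
    return [label] if label is not None else []
```

Keying on the pair disambiguates both the multiple successors of `0x100` and, dually, multiple predecessors of a jump target.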

To be functionally correct, the compiler must preserve observational
equivalence, i.e. executing the program after each compiler pass should
yield the same stream of observables. After the injection of emission
statements, observables now capture both functional and non-functional
behaviours.
This correctness property is called a forward simulation in the literature
and is sufficient for correctness when the target language is
deterministic~\cite{compcert3}.
We also require a stronger, non-functional preservation property: after each
pass all basic blocks must start with an emission statement, and all labels
\texttt{L} must be unique.

Now let $M$ be the object code obtained for the program $P$. Let us suppose
that we can statically inspect the code $M$ and associate to each basic block
a cost (e.g. the number of clock cycles required to execute all instructions
in the basic block, or an upper bound on that time). Every basic block is
labelled with a unique label \texttt{L}, thus we can actually associate the
cost to \texttt{L}. Let us call it $k(\texttt{L})$.

The function $k$ is defined as the cost model for the object code control
blocks. It can equally be used as the cost model for the source
control blocks. Indeed, if the semantics of $P$ is the stream
$L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$ because
every instruction belongs to a control block and every control block is
labelled. Thus it is correct to say that the execution cost of $P$ is also
$\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
the blocks of the high-level program $P$ that is preserved by compilation.

How can the user profit from the high level cost model? Suppose, for instance,
that he wants to prove that the WCET of his program is bounded by $c$. It
is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
a purely functional property of the code. He can therefore use any technique
available to certify functional properties of the source code.
What is suggested in~\cite{easylabelling} is to actually instrument the
source code $P$ by replacing every label emission statement
$\texttt{emit L}$ with the instruction $\texttt{cost += k(L)}$ that increments
a global fresh variable \texttt{cost}. The bound is now proved by establishing
the program invariant $\texttt{cost} \leq c$, which can be done for example
using the Frama-C~\cite{framaC} suite if the source code is some variant of
C.
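The instrumentation argument can be replayed in a few lines. In the sketch below the labels and cycle counts are invented; the final value of the global \texttt{cost} variable of the instrumented program is modelled by a fold over the stream of emitted labels.

```python
# Toy cost model k, as it would be computed on the object code
# (labels and cycle counts are invented).
k = {"L1": 4, "L2": 7}

def run_cost(trace):
    """Final value of the instrumented `cost` variable: the sum of
    k(L_i) over the stream of emitted labels."""
    cost = 0
    for label in trace:
        cost += k.get(label, 0)   # each `emit L` became `cost += k(L)`
    return cost
```

Proving a WCET bound $c$ then amounts to proving `run_cost(trace) <= c` for every possible trace of the program, a purely functional property.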

\subsection{Labelling function calls}
We now want to extend the labelling approach to support function calls.
On the high-level, \emph{structured} programming language $\mathcal{P}$ there
is not much to change.
When a function is invoked, the current basic block is temporarily exited
and the basic block the function starts with takes control. When the function
returns, the execution of the original basic block is resumed. Thus the only
significant change is that basic blocks can now be nested. Let \texttt{E}
be the label of the external block and \texttt{I} the label of a nested one.
Since the external block starts before the internal one, the semantics observed will be
\texttt{E I} and the cost associated to it on the source language will be
$k(\texttt{E}) + k(\texttt{I})$, i.e. the cost of executing all instructions
in the block \texttt{E} first plus the cost of executing all the instructions in
the block \texttt{I}. However, we know that some instructions in \texttt{E} are
executed after the last instruction in \texttt{I}. This is actually irrelevant
because we are here assuming that costs are additive, so that we can freely
permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non-trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
the function call terminates and yields back control to the basic block
\texttt{E}. If the call diverges, the instrumentation
$\texttt{cost += k(E)}$ executed at the beginning of \texttt{E} is still valid,
but just as an upper bound on the real execution cost: only precision is lost.

Let us now consider what happens when we move down the compilation chain to an
unstructured intermediate or final language. Here unstructured means that
the only control operators are conditional and unconditional jumps, function
calls and returns. Unlike a structured language, though, there is no guarantee
that a function will return control just after the function call point.
The semantics of the return statement, indeed, consists in fetching the
return address from some internal structure (typically the control stack) and
jumping directly to it. The code can freely manipulate the control stack to
make the procedure return to an arbitrary position. Indeed, it is also possible
to break the well-nesting of function calls/returns.

Is it the case that the code produced by a correct compiler must respect the
additional property that every function returns just after its function call
point? The answer is negative and the property is not implied by forward
simulation proofs. For instance, imagine modifying a correct compiler pass
by systematically adding one to the return address on the stack and by
putting a \texttt{NOP} (or any other instruction that takes one byte) after
every function call. The code obtained will be functionally indistinguishable,
and the added instructions will all be dead code.

This lack of structure in the semantics badly interferes with the labelling
approach. The reason is the following: when a basic block labelled with
\texttt{E} contains a function call, it no longer makes any sense to associate
to a label \texttt{E} the sum of the costs of all the instructions in the block.
Indeed, there is no guarantee that the function will return into the block and
that the instructions that will be executed after the return will be the ones
we are paying for in the cost model.

How can we make the labelling approach work in this scenario? We only see two
possible ways. The first one consists in injecting an emission statement after
every function call: basic blocks no longer contain function calls, but are now
terminated by them. This completely solves the problem and allows the compiler
to break the structure of function calls/returns at will. However, the
technique has several drawbacks. First of all, it greatly increases the number
of cost labels that are injected in the source code and that become
instrumentation statements. Thus, when reasoning on the source code to prove
non-functional properties, the user (or the automation tool) will have to handle
larger expressions. Second, the more labels are emitted, the more difficult it
becomes to implement powerful optimisations respecting the code structure.
Indeed, function calls are usually implemented in such a way that most registers
are preserved by the call, so that the static analysis of the block is not
interrupted by the call and an optimisation can involve both the code before
and after the function call. Third, instrumenting the source code may require
unpleasant modifications of it. Take, for example, the code
\texttt{f(g(x));}. We need to inject an emission statement/instrumentation
instruction just after the execution of \texttt{g}. The only way to do that
is to rewrite the code as \texttt{y = g(x); emit L; f(y);} for some fresh
variable \texttt{y}. It is pretty clear how in certain situations the code
obtained would be more obfuscated and thus more difficult to reason on manually.

For the previous reasons, in this paper and in the CerCo project we adopt a
different approach. We do not inject emission statements after every
function call. However, we want to propagate a strong additional invariant in
the forward simulation proof. The invariant is the propagation of the structure
of the original high-level code, even if the target language is unstructured.
The structure we want to propagate, which will become clearer in the next
section, comprises 1) the property that every function should return just after
the function call point, which in turn implies well-nesting of function calls;
2) the property that every basic block starts with a code emission statement.

In the original labelling approach of~\cite{easylabelling}, the second property
was guaranteed syntactically as a property of the generated code.
In our revised approach, instead, we will impose the property on the runs:
it will be possible to generate code that does not respect the syntactic
property, as long as all possible runs respect it. For instance, dead code will no longer
be required to have all basic blocks correctly labelled. The switch is suggested
by the fact that the first of the two properties --- the one related to
function calls/returns --- can only be defined as a property of runs,
not of the static code. The switch is
beneficial to the proof because the original proof was made of two parts:
the forward simulation proof and the proof that the static property was guaranteed.
In our revised approach the latter disappears and only the forward simulation
is kept.

In order to capture the structure in the semantics so that it is preserved
by a forward simulation argument, we need to make the structure observable
in the semantics. This is the topic of the next section.

\section{Structured traces}
\label{semantics}
The program semantics adopted in the traditional labelling approach is based
on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
a set of states $\mathcal{S}$, the semantics of one deterministic execution
step is
defined as a function $S \to S \times O^*$ where $O^*$ is a (finite) stream of
observables. The semantics is then lifted compositionally to multiple (finite
or infinite) execution steps.
Finally, the semantics of a whole program execution is obtained by forgetting
about the final state (if any), yielding a function $S \to O^*$ that given an
initial status returns the finite or infinite stream of observables in output.

We present here a new definition of semantics where the structure of execution,
as defined in the previous section, is now observable. The idea is to replace
the stream of observables with a structured data type that makes explicit
function calls and returns and that grants some additional invariants by
construction. The data structure, called \emph{structured traces}, is
defined inductively for terminating programs and coinductively for diverging
ones. In the paper we focus only on the inductive structure, i.e. we assume
that all programs that are given a semantics are total. The Matita formalisation
also shows the coinductive definitions. The semantics of a program is then
defined as a function that maps an initial state into a structured trace.

In order to have a definition that works on multiple intermediate languages,
we abstract the type of structured traces over an abstract data type of
abstract statuses:
\begin{alltt}
record abstract_status := \{ S: Type[0];
 as_execute: S \(\to\) S \(\to\) Prop;     as_classify: S \(\to\) classification;
 as_costed: S \(\to\) Prop;      as_label: \(\forall\) s. as_costed S s \(\to\) label;
 as_call_ident: \(\forall\) s. as_classify S s = cl_call \(\to\) label;
 as_after_return:
  (\(\Sigma\)s:as_status. as_classify s = Some ? cl_call) \(\to\) as_status \(\to\) Prop \}
\end{alltt}
The predicate $\texttt{as\_execute}~s_1~s_2$ holds if $s_1$ evolves into
$s_2$ in one step;\\ $\texttt{as\_classify}~s = c$ holds if the next instruction
to be executed in $s$ is classified according to $c \in
\{\texttt{cl\_return,cl\_jump,cl\_call,cl\_other}\}$ (we omit tail-calls for simplicity);
the predicate $\texttt{as\_costed}~s$ holds if the next instruction to be
executed in $s$ is a cost emission statement (also classified
as \texttt{cl\_other}); finally $(\texttt{as\_after\_return}~s_1~s_2)$ holds
if the next instruction to be executed in $s_2$ follows the function call
to be executed in (the witness of the $\Sigma$-type) $s_1$. The two functions
\texttt{as\_label} and \texttt{as\_call\_ident} are used to extract the
cost label/function call target from states whose next instruction is
a cost emission/function call statement.
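For readers unfamiliar with Matita syntax, the record can be approximated in ordinary code. The Python sketch below erases the dependent types: \texttt{as\_label}, \texttt{as\_call\_ident} and \texttt{as\_after\_return} become plain functions with optional results; the trivial instantiation on integer states is invented for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional

CL_RETURN, CL_JUMP, CL_CALL, CL_OTHER = "return", "jump", "call", "other"

@dataclass
class AbstractStatus:
    """Erased transliteration of the Matita record abstract_status."""
    as_execute: Callable[[object, object], bool]       # one-step transition
    as_classify: Callable[[object], str]               # next instruction kind
    as_costed: Callable[[object], bool]                # next instr emits a label
    as_label: Callable[[object], Optional[str]]        # the label, if costed
    as_call_ident: Callable[[object], Optional[str]]   # callee, if a call
    as_after_return: Callable[[object, object], bool]  # s2 resumes after call in s1

# A trivial instantiation on integer states, for illustration only.
counter_status = AbstractStatus(
    as_execute=lambda s1, s2: s2 == s1 + 1,
    as_classify=lambda s: CL_OTHER,
    as_costed=lambda s: False,
    as_label=lambda s: None,
    as_call_ident=lambda s: None,
    as_after_return=lambda s1, s2: False,
)
```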


The inductive type for structured traces is actually made of three mutually
inductive types with the following semantics:
\begin{enumerate}
 \item $(\texttt{trace\_label\_return}~s_1~s_2)$ is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_1$ is a label emission
   statement and the one to be executed in the state before $s_2$ is a return
   statement. Thus $s_2$ is the state after the return. The trace
   may contain other label emission statements. It captures the structure of
   the execution of function bodies: they must start with a cost emission
   statement and must end with a return; they are obtained by concatenating
   one or more basic blocks, all starting with a label emission
   (e.g. in case of loops).
 \item $(\texttt{trace\_any\_label}~b~s_1~s_2)$ is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_2$/in the state before
   $s_2$ is either a label emission statement
   or a return, according to the boolean $b$. It must not contain
   any label emission statement. It captures the notion of a suffix of a
   basic block.
 \item $(\texttt{trace\_label\_label}~b~s_1~s_2)$ is the special case of
   $(\texttt{trace\_any\_label}~b~s_1~s_2)$ such that the instruction to be
   executed in $s_1$ is a label emission statement. It captures the notion of
   a basic block.
\end{enumerate}
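Forgetting the dependent indexing over states and the boolean flag, the shape of the three types can be pictured with tagged-tuple constructors (a simplified sketch, not the Matita definition; the example trace and its state names are invented):

```python
# Constructors of the three (simplified) mutually inductive trace types,
# encoded as tagged tuples; states are plain values.
def tlr_base(tll):                 return ("tlr_base", tll)
def tlr_step(tll, rest):           return ("tlr_step", tll, rest)

def tll(costed_state, tal):        return ("tll", costed_state, tal)

def tal_base_not_return(s1, s2):   return ("tal_base_not_return", s1, s2)
def tal_base_return(s1, s2):       return ("tal_base_return", s1, s2)
def tal_base_call(s1, body, s3):   return ("tal_base_call", s1, body, s3)
def tal_step_call(s1, body, rest): return ("tal_step_call", s1, body, rest)
def tal_step_default(s1, rest):    return ("tal_step_default", s1, rest)

# A terminating run: a labelled block performing a call whose body is
# itself a (nested) trace_label_return, then returning.
example = tlr_base(
    tll("emit L1",
        tal_step_call("call f",
                      tlr_base(tll("emit L2", tal_base_return("ret", "s"))),
                      tal_base_return("ret2", "end"))))
```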

\infrule[\texttt{tlr\_base}]
 {\texttt{trace\_label\_label}~true~s_1~s_2}
 {\texttt{trace\_label\_return}~s_1~s_2}

\infrule[\texttt{tlr\_step}]
 {\texttt{trace\_label\_label}~false~s_1~s_2 \andalso
  \texttt{trace\_label\_return}~s_2~s_3
 }
 {\texttt{trace\_label\_return}~s_1~s_3}

\infrule[\texttt{tll\_base}]
 {\texttt{trace\_any\_label}~b~s_1~s_2 \andalso
  \texttt{as\_costed}~s_1
 }
 {\texttt{trace\_label\_label}~b~s_1~s_2}

\infrule[\texttt{tal\_base\_not\_return}]
 {\texttt{as\_execute}~s_1~s_2 \andalso
  \texttt{as\_classify}~s_1 \in \{\texttt{cl\_jump,cl\_other}\} \andalso
  \texttt{as\_costed}~s_2
 }
 {\texttt{trace\_any\_label}~false~s_1~s_2}

\infrule[\texttt{tal\_base\_return}]
 {\texttt{as\_execute}~s_1~s_2 \andalso
  \texttt{as\_classify}~s_1 = \texttt{cl\_return}
 }
 {\texttt{trace\_any\_label}~true~s_1~s_2}

\infrule[\texttt{tal\_base\_call}]
 {\texttt{as\_execute}~s_1~s_2 \andalso
  \texttt{as\_classify}~s_1 = \texttt{cl\_call} \\
  \texttt{as\_after\_return}~s_1~s_3 \andalso
  \texttt{trace\_label\_return}~s_2~s_3 \andalso
  \texttt{as\_costed}~s_3
 }
 {\texttt{trace\_any\_label}~false~s_1~s_3}

\infrule[\texttt{tal\_step\_call}]
 {\texttt{as\_execute}~s_1~s_2 \andalso
  \texttt{as\_classify}~s_1 = \texttt{cl\_call} \\
  \texttt{as\_after\_return}~s_1~s_3 \andalso
  \texttt{trace\_label\_return}~s_2~s_3 \\
  \lnot \texttt{as\_costed}~s_3 \andalso
  \texttt{trace\_any\_label}~b~s_3~s_4
 }
 {\texttt{trace\_any\_label}~b~s_1~s_4}

\infrule[\texttt{tal\_step\_default}]
 {\texttt{as\_execute}~s_1~s_2 \andalso
  \lnot \texttt{as\_costed}~s_2 \\
  \texttt{trace\_any\_label}~b~s_2~s_3 \andalso
  \texttt{as\_classify}~s_1 = \texttt{cl\_other}
 }
 {\texttt{trace\_any\_label}~b~s_1~s_3}


\begin{comment}
\begin{verbatim}
inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
  | tlr_base:
      ∀status_before: S.
      ∀status_after: S.
        trace_label_label S ends_with_ret status_before status_after →
        trace_label_return S status_before status_after
  | tlr_step:
      ∀status_initial: S.
      ∀status_labelled: S.
      ∀status_final: S.
        trace_label_label S doesnt_end_with_ret status_initial status_labelled →
        trace_label_return S status_labelled status_final →
          trace_label_return S status_initial status_final
with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
  | tll_base:
      ∀ends_flag: trace_ends_with_ret.
      ∀start_status: S.
      ∀end_status: S.
        trace_any_label S ends_flag start_status end_status →
        as_costed S start_status →
          trace_label_label S ends_flag start_status end_status
with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
  (* Single steps within a function which reach a label.
     Note that this is the only case applicable for a jump. *)
  | tal_base_not_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        (as_classifier S start_status cl_jump ∨
         as_classifier S start_status cl_other) →
        as_costed S final_status →
          trace_any_label S doesnt_end_with_ret start_status final_status
  | tal_base_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        as_classifier S start_status cl_return →
          trace_any_label S ends_with_ret start_status final_status
  (* A call followed by a label on return. *)
  | tal_base_call:
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_final →
          trace_label_return S status_start_fun_call status_final →
          as_costed S status_final →
            trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
  (* A call followed by a non-empty trace. *)
  | tal_step_call:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_after_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_after_fun_call →
          trace_label_return S status_start_fun_call status_after_fun_call →
          ¬ as_costed S status_after_fun_call →
          trace_any_label S end_flag status_after_fun_call status_final →
            trace_any_label S end_flag status_pre_fun_call status_final
  | tal_step_default:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre: S.
      ∀status_init: S.
      ∀status_end: S.
        as_execute S status_pre status_init →
        trace_any_label S end_flag status_init status_end →
        as_classifier S status_pre cl_other →
        ¬ (as_costed S status_init) →
          trace_any_label S end_flag status_pre status_end.
\end{verbatim}
\end{comment}
A \texttt{trace\_label\_return} is isomorphic to a non-empty list of
\texttt{trace\_label\_label}s, the last of which is terminated by a
return.
The interesting cases are those of $\texttt{trace\_any\_label}~b~s_1~s_2$.
A \texttt{trace\_any\_label} is a sequence of steps built by a syntax-directed
definition on the classification of $s_1$. The constructors of the datatype
impose several invariants that are meant to impose a structure on the
otherwise unstructured execution. In particular, the following invariants are
imposed:
\begin{enumerate}
 \item the trace is never empty; it ends with a return iff $b$ is
       true
 \item a jump must always be the last instruction of the trace, and it must
       be followed by a cost emission statement; i.e. the target of a jump
       is always the beginning of a new basic block; as such it must start
       with a cost emission statement
 \item a cost emission statement can never occur inside the trace, only in
       the status immediately after
 \item the trace for a function call step is made of a subtrace for the
       function body of type
       $\texttt{trace\_label\_return}~s_1~s_2$, possibly followed by the
       rest of the trace for this basic block. The subtrace represents the
       function execution. Being an inductive datum, it grants totality of
       the function call. The status $s_2$ is the one that follows the return
       statement. The next instruction of $s_2$ must follow the function call
       instruction. As a consequence, function calls are also well nested.
\end{enumerate}
550
The three mutually recursive functions \texttt{flatten\_trace\_label\_return, flatten\_trace\_label\_label} and \texttt{flatten\_trace\_any\_label}
allow one to extract from a structured trace the list of states whose next
instruction is a cost emission statement. We only show here the type of one
of them:
555\begin{alltt}
556flatten_trace_label_return:
557 \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
558  trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
559\end{alltt}
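To illustrate what flattening computes, the following Python sketch (ours; it conflates the three mutual datatypes and functions into one) reduces a trace to the ordered list of cost labels it emits, inlining the labels of a function body at the call position:

```python
# Condensed sketch of the three mutual flatten functions.  A trace is
# encoded as a nested tuple:
#   ("emit", label, rest) | ("call", body, rest) | ("silent", rest) | ("stop",)

def flatten(trace):
    kind = trace[0]
    if kind == "emit":        # a cost emission opening a basic block
        return [trace[1]] + flatten(trace[2])
    if kind == "call":        # function body first, then the rest
        return flatten(trace[1]) + flatten(trace[2])
    if kind == "silent":      # silent steps emit nothing
        return flatten(trace[1])
    return []                 # "stop"

# emit L1, call a function that emits L2, return, then emit L3
example = ("emit", "L1",
           ("call", ("emit", "L2", ("stop",)),
                    ("silent", ("emit", "L3", ("stop",)))))
```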
560
561\paragraph{Cost prediction on structured traces}
562
563The first main theorem of CerCo about traces
564(theorem \texttt{compute\_max\_trace\_label\_return\_cost\_ok\_with\_trace})
565holds for the
566instantiation
567of the structured traces to the concrete status of object code programs.
568Simplifying a bit, it states that
569\begin{equation}\label{th1}
570\begin{array}{l}\forall s_1,s_2. \forall \tau: \texttt{trace\_label\_return}~s_1~s_2.\\~~
571  \texttt{clock}~s_2 = \texttt{clock}~s_1 + \Sigma_{s \in (\texttt{flatten\_trace\_label\_return}~\tau)}\;k(\mathcal{L}(s))
572\end{array}
573\end{equation}
574where $\mathcal{L}$ maps a labelled state to its emitted label, and the
575cost model $k$ is statically computed from the object code
576by associating to each label \texttt{L} the sum of the cost of the instructions
577in the basic block that starts at \texttt{L} and ends before the next labelled
578instruction. The theorem is proved by structural induction over the structured
579trace, and is based on the invariant that
if the function that computes the cost model has analysed the instruction
to be executed at $s_2$ after the one to be executed at $s_1$, and if
the structured trace starts with $s_1$, then it will eventually also contain
$s_2$. When $s_1$ is not a function call, the result holds trivially because
584of the $(\texttt{as\_execute}~s_1~s_2)$ condition obtained by inversion on
the trace. The only non-trivial case is that of function calls: the cost model
computation function recurs on the first instruction that follows the function
call; the \texttt{as\_after\_return} condition of the \texttt{tal\_base\_call} and
\texttt{tal\_step\_call} constructors guarantees exactly that the execution will
eventually reach this state.
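A toy instance of the statement may help: with an invented cost model and emitted label sequence, paying each basic block's cost during ``execution'' yields exactly the clock difference predicted by summing $k(\mathcal{L}(s))$ over the flattened trace. The Python sketch below is our own illustration.

```python
# Toy instance of the cost-prediction theorem.  The cost model k and the
# emitted label sequence are invented.
k = {"L1": 4, "L2": 7, "L3": 2}

# the flattened trace: labels emitted by the run, in order
emitted = ["L1", "L2", "L2", "L3"]

clock_s1 = 100
clock_s2 = clock_s1
for label in emitted:          # "execution": pay each block's cost in turn
    clock_s2 += k[label]

# the static prediction: initial clock plus the sum of the block costs
predicted = clock_s1 + sum(k[l] for l in emitted)
```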
591
592\paragraph{Structured traces similarity and cost prediction invariance}
593
594A compiler pass maps source to object code and initial states to initial
595states. The source code and initial state uniquely determine the structured
trace of a program, if it exists. The structured trace fails to exist iff
597the structural conditions are violated by the program execution (e.g. a function
598body does not start with a cost emission statement). Let us assume that the
599target structured trace exists.
600
601What is the relation between the source and target structured traces?
In general, the two traces can be arbitrarily different. However, we are
interested only in those compiler passes that map a trace $\tau_1$ to a trace
$\tau_2$ such that
605\begin{equation}\texttt{flatten\_trace\_label\_return}~\tau_1 = \texttt{flatten\_trace\_label\_return}~\tau_2\label{condition1}\label{th2}\end{equation}
606The reason is that the combination of~\ref{th1} with~\ref{th2} yields the
607corollary
608\begin{equation}\begin{array}{l}\label{th3}
\forall s_1,s_2. \forall \tau_1,\tau_2: \texttt{trace\_label\_return}~s_1~s_2.\\~~~~~\;
  \texttt{clock}~s_2 - \texttt{clock}~s_1\\~ = \Sigma_{s \in (\texttt{flatten\_trace\_label\_return}~\tau_1)}\;k(\mathcal{L}(s))\\~ = \Sigma_{s \in (\texttt{flatten\_trace\_label\_return}~\tau_2)}\;k(\mathcal{L}(s))
611\end{array}\end{equation}
612This corollary states that the actual execution time of the program can be computed equally well on the source or target language. Thus it becomes possible to
613transfer the cost model from the target to the source code and reason on the
614source code only.
615
We are therefore interested in conditions stronger than~\ref{condition1}.
To this end we introduce a similarity relation between traces with
618the same structure. Theorem~\texttt{tlr\_rel\_to\_traces\_same\_flatten}
619in the Matita formalisation shows that~\ref{th2} holds for every pair
620$(\tau_1,\tau_2)$ of similar traces.
621
622Intuitively, two traces are similar when one can be obtained from
623the other by erasing or inserting silent steps, i.e. states that are
624not \texttt{as\_costed} and that are classified as \texttt{other}.
625Silent steps do not alter the structure of the traces.
In particular,
the relation relates function calls to calls of the same function,
label emission statements to emissions of the same label, and concatenations of
subtraces to concatenations of subtraces of the same length that start with
the same emission statement, and so on.
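As a first approximation, the intuition can be pictured on flat event sequences: erase the silent steps and require the remainders to coincide. The Python sketch below is our own illustration (the actual relation is finer, matching the traces constructor by constructor, as the rules below show); the event encodings are invented.

```python
# First-approximation sketch of trace similarity on flat event sequences:
#   ("emit", label) | ("call", f) | ("return",) | ("silent",)
# Two runs are "similar" when they agree after erasing silent steps.

def erase_silent(trace):
    return [e for e in trace if e != ("silent",)]

def similar(t1, t2):
    return erase_silent(t1) == erase_silent(t2)
```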
631
632In the formalisation the three similarity relations --- one for each trace
633kind --- are defined by structural recursion on the first trace and pattern
634matching over the second. Here we turn
the definitions into inference rules for the sake of readability. We also
omit from the trace constructors all arguments but those that are traces or
are used in the premises of the rules.
638
639\infrule
640 {\texttt{tll\_rel}~tll_1~tll_2
641 }
642 {\texttt{tlr\_rel}~(\texttt{tlr\_base}~tll_1)~(\texttt{tlr\_base}~tll_2)}
643
644\infrule
645 {\texttt{tll\_rel}~tll_1~tll_2 \andalso
646  \texttt{tlr\_rel}~tlr_1~tlr_2
647 }
648 {\texttt{tlr\_rel}~(\texttt{tlr\_step}~tll_1~tlr_1)~(\texttt{tlr\_step}~tll_2~tlr_2)}
649
650\infrule
651 {\texttt{as\_label}~H_1 = \texttt{as\_label}~H_2 \andalso
652  \texttt{tal\_rel}~tal_1~tal_2
653 }
654 {\texttt{tll\_rel}~(\texttt{tll\_base}~tal_1~H_1)~(\texttt{tll\_base}~tal_2~H_2)}
655
656\infrule
657 {}
 {\texttt{tal\_rel}~\texttt{tal\_base\_not\_return}~(taa @ \texttt{tal\_base\_not\_return})}
659
660\infrule
661 {}
 {\texttt{tal\_rel}~\texttt{tal\_base\_return}~(taa @ \texttt{tal\_base\_return})}
663
664\infrule
665 {\texttt{tlr\_rel}~tlr_1~tlr_2 \andalso
666  \texttt{as\_call\_ident}~H_1 = \texttt{as\_call\_ident}~H_2
667 }
668 {\texttt{tal\_rel}~(\texttt{tal\_base\_call}~H_1~tlr_1)~(taa @ \texttt{tal\_base\_call}~H_2~tlr_2)}
669
670\infrule
671 {\texttt{tlr\_rel}~tlr_1~tlr_2 \andalso
672  \texttt{as\_call\_ident}~H_1 = \texttt{as\_call\_ident}~H_2 \andalso
673  \texttt{tal\_collapsable}~tal_2
674 }
 {\texttt{tal\_rel}~(\texttt{tal\_base\_call}~H_1~tlr_1)~(taa @ \texttt{tal\_step\_call}~H_2~tlr_2~tal_2)}
676
677\infrule
678 {\texttt{tlr\_rel}~tlr_1~tlr_2 \andalso
679  \texttt{as\_call\_ident}~H_1 = \texttt{as\_call\_ident}~H_2 \andalso
680  \texttt{tal\_collapsable}~tal_1
681 }
 {\texttt{tal\_rel}~(\texttt{tal\_step\_call}~H_1~tlr_1~tal_1)~(taa @ \texttt{tal\_base\_call}~H_2~tlr_2)}
683
684\infrule
685 {\texttt{tlr\_rel}~tlr_1~tlr_2 \andalso
686  \texttt{tal\_rel}~tal_1~tal_2 \andalso
687  \texttt{as\_call\_ident}~H_1 = \texttt{as\_call\_ident}~H_2
688 }
 {\texttt{tal\_rel}~(\texttt{tal\_step\_call}~H_1~tlr_1~tal_1)~(taa @ \texttt{tal\_step\_call}~H_2~tlr_2~tal_2)}
690
691\infrule
692 {\texttt{tal\_rel}~tal_1~tal_2
693 }
694 {\texttt{tal\_rel}~(\texttt{tal\_step\_default}~tal_1)~tal_2}
695\begin{comment}
696\begin{verbatim}
697let rec tlr_rel S1 st1 st1' S2 st2 st2'
698  (tlr1 : trace_label_return S1 st1 st1')
699  (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
700match tlr1 with
701  [ tlr_base st1 st1' tll1 ⇒
702    match tlr2 with
703    [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
704    | _ ⇒ False
705    ]
706  | tlr_step st1 st1' st1'' tll1 tl1 ⇒
707    match tlr2 with
708    [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
709      tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
710    | _ ⇒ False
711    ]
712  ]
713and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
714 (tll1 : trace_label_label S1 fl1 st1 st1')
715 (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
716  match tll1 with
717  [ tll_base fl1 st1 st1' tal1 H ⇒
718    match tll2 with
719    [ tll_base fl2 st2 st2 tal2 G ⇒
720      as_label_safe … («?, H») = as_label_safe … («?, G») ∧
721      tal_rel … tal1 tal2
722    ]
723  ]
724and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
725 (tal1 : trace_any_label S1 fl1 st1 st1')
726 (tal2 : trace_any_label S2 fl2 st2 st2')
727   on tal1 : Prop ≝
728  match tal1 with
729  [ tal_base_not_return st1 st1' _ _ _ ⇒
730    fl2 = doesnt_end_with_ret ∧
731    ∃st2mid,taa,H,G,K.
732    tal2 ≃ taa_append_tal ? st2 ??? taa
733      (tal_base_not_return ? st2mid st2' H G K)
734  | tal_base_return st1 st1' _ _ ⇒
735    fl2 = ends_with_ret ∧
736    ∃st2mid,taa,H,G.
737    tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
738      (tal_base_return ? st2mid st2' H G)
739  | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
740    fl2 = doesnt_end_with_ret ∧
741    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
742    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
743    (* we must allow a tal_base_call to be similar to a call followed
744      by a collapsable trace (trace_any_any followed by a base_not_return;
745      we cannot use trace_any_any as it disallows labels in the end as soon
746      as it is non-empty) *)
747    (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
748      tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
749    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
750    ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
751      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
752      tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
753  | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
754    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
755    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
756    (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
757      tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
758      tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
759    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
760    ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
761      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
762      tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
763  | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
764    tal_rel … tl1 tal2 (* <- this makes it many to many *)
765  ].
766\end{verbatim}
767\end{comment}
768
769In the preceding rules, a $taa$ is an inhabitant of the
770$\texttt{trace\_any\_any}~s_1~s_2$ inductive data type whose definition
771is not in the paper for lack of space. It is the type of valid
772prefixes (even empty ones) of \texttt{trace\_any\_label}s that do not contain
773any function call. Therefore it
774is possible to concatenate (using ``$@$'') a \texttt{trace\_any\_any} to the
775left of a \texttt{trace\_any\_label}. A \texttt{trace\_any\_any} captures
776a sequence of silent moves.
777
778The \texttt{tal\_collapsable} unary predicate over \texttt{trace\_any\_label}s
779holds when the argument does not contain any function call and it ends
780with a label (not a return). The intuition is that after a function call we
781can still perform a sequence of silent actions while remaining similar.
782
783\section{Forward simulation}
784\label{simulation}
785
We summarise here the results of the previous sections. Each intermediate
unstructured language can be given a semantics based on structured traces,
which singles out those runs that respect a certain number of invariants.
789A cost model can be computed on the object code and it can be used to predict
790the execution costs of runs that produce structured traces. The cost model
791can be lifted from the target to the source code of a pass if the pass maps
792structured traces to similar structured traces. The latter property is called
793a \emph{forward simulation}.
794
As happens for labelled transition systems, in order to establish the forward
simulation we are interested in (i.e. the preservation of observables), we are
forced to prove a stronger notion of forward simulation, one that introduces
an explicit relation between states. The classical notion of a 1-to-0-or-many
799forward simulation is the existence of a relation $R$ over states such that
800if $s_1 R s_2$ and $s_1 \to^1 s_1'$ then there exists an $s_2'$ such that
801$s_2 \to^* s_2'$ and $s_1' R s_2'$. In our context, we need to replace the
802one and multi step transition relations $\to^n$ with the existence of
a structured trace between the two states, and we need to require that
the two structured traces are similar. Thus what we would like to state is
something like:\\
806for all $s_1,s_2,s_1'$ such that there is a $\tau_1$ from
807$s_1$ to $s_1'$ and $s_1 R s_2$ there exists an $s_2'$ such that
808$s_1' R s_2'$ and a $\tau_2$ from $s_2$ to $s_2'$ such that
809$\tau_1$ is similar to $\tau_2$. We call this particular form of forward
810simulation \emph{trace reconstruction}.
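The classical diagram recalled above can be checked exhaustively on finite toy systems. In the following Python sketch (both systems and the relation are invented for illustration) the target takes two steps where the source takes one, and the diagram still closes:

```python
# Exhaustive check of the classical 1-to-0-or-many forward simulation
# diagram: whenever s1 R s2 and s1 steps to s1', some s2' reachable from
# s2 in zero or more steps satisfies s1' R s2'.
step1 = {0: [1], 1: [2], 2: []}
# the target goes through the extra state 10 where the source takes one step
step2 = {0: [10], 10: [1], 1: [2], 2: []}

def related(s1, s2):
    return s1 == s2               # toy relation: states related when equal

def reach(step, s, bound):
    """States reachable from s in at most `bound` steps."""
    seen = {s}
    for _ in range(bound):
        seen |= {t for u in seen for t in step.get(u, [])}
    return seen

def simulates(states):
    return all(
        any(related(s1p, s2p) for s2p in reach(step2, s2, 3))
        for s1 in states for s2 in states if related(s1, s2)
        for s1p in step1[s1]
    )
```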
811
812The statement just introduced, however, is too simplistic and not provable
813in the general case. To understand why, consider the case of a function call
and the pass that fixes the parameter passing conventions. A function
call in the source code takes as input an arbitrary number of pseudo-registers (the actual parameters to pass) and returns an arbitrary number of pseudo-registers (where the result is stored). A function call in the target language has neither
input nor output parameters. The pass must add explicit code before and after
the function call to move the pseudo-registers content from/to the hardware
819Similarly, each function body must be augmented with a preamble and a postamble
820to complete/initiate the parameter passing strategy for the call/return phase.
821Therefore what used to be a call followed by the next instruction to execute
822after the function return, now becomes a sequence of instructions, followed by
823a call, followed by another sequence. The two states at the beginning of the
824first sequence and at the end of the second sequence are in relation with
the status before/after the call in the source code, as in a usual forward
simulation. How can we prove, however, the additional condition for function
calls that asks that, when the function returns, execution resumes at the
instruction immediately after the function call? To guarantee this invariant,
there must be another relation between the addresses of the function call in
the source and in the target code.
830This additional relation is to be used in particular to relate the two stacks.
831
832Another example is given by preservation of code emission statements. A single
833code emission instruction can be simulated by a sequence of steps, followed
834by a code emission, followed by another sequence. Clearly the initial and final
835statuses of the sequence are to be in relation with the status before/after the
836code emission in the source code. In order to preserve the structured traces
837invariants, however, we must consider a second relation between states that
838traces the preservation of the code emission statement.
839
840Therefore we now introduce an abstract notion of relation set between abstract
841statuses and an abstract notion of 1-to-0-or-many forward simulation conditions.
842These two definitions enjoy the following remarkable properties:
843\begin{enumerate}
 \item they are generic enough to accommodate all passes of the CerCo compiler;
 \item the conjunction of the 1-to-0-or-many forward simulation conditions is
       just slightly stricter than the statement of a 1-to-0-or-many forward
       simulation in the classical case. In particular, it only requires
       the construction of very simple forms of structured traces made of
       silent states only;
 \item they allow us to prove the main result of this paper: the 1-to-0-or-many
       forward simulation conditions are sufficient to prove the trace
       reconstruction theorem.
853\end{enumerate}
854
Point 3 is the important one. First of all it means that we have reduced
the complex problem of trace reconstruction to a much simpler one that,
moreover, can be solved with slight adaptations of the forward simulation proof
that is performed for a compiler that only cares about functional properties.
Therefore we have separated as much as possible the proof of
preservation of functional properties from that of non-functional ones.
861Secondly, combined with the results in the previous section, it implies
862that the cost model can be computed on the object code and lifted to the
863source code to reason on non-functional properties, assuming that
864the 1-to-0-or-many forward simulation conditions are fulfilled for every
865compiler pass.
866
867\paragraph{Relation sets}
868
We now introduce the four relations $\mathcal{S,C,L,R}$ between abstract
statuses that are used to correlate the corresponding states before and
after a compiler pass. The first two are abstract and must be instantiated
by every pass. The remaining two are derived relations.
873
874The $\mathcal{S}$ relation between states is the classical relation used
875in forward simulation proofs. It correlates the data of the status
876(e.g. registers, memory, etc.).
877
The $\mathcal{C}$ relation correlates call states. It allows us to track the
position in the target code of every call in the source code.

The $\mathcal{L}$ relation simply says that the two states are both label
emitting states that emit the same label. It allows us to track the position in
the target code of every cost emitting statement in the source code.
884
Finally, the $\mathcal{R}$ relation is the most complex one. Two states
$s_1$ and $s_2$ are $\mathcal{R}$-correlated if, every time $s_1$ is the
successor of a call state that is $\mathcal{C}$-related to a call state
$s_2'$ in the target code, then $s_2$ is the successor of $s_2'$.
We will require all pairs of states that follow a related call to be
$\mathcal{R}$-related. This is the fundamental requirement to guarantee
that the target trace is well structured, i.e. that function calls are well
nested and always return where they are supposed to.
893
894\begin{alltt}
895record status_rel (S1,S2 : abstract_status) : Type[1] := \{ 
896  \(\mathcal{S}\): S1 \(\to\) S2 \(\to\) Prop;
897  \(\mathcal{C}\): (\(\Sigma\)s.as_classifier S1 s cl_call) \(\to\)
898     (\(\Sigma\)s.as_classifier S2 s cl_call) \(\to\) Prop \}.
899
900definition \(\mathcal{L}\) S1 S2 st1 st2 := as_label S1 st1 = as_label S2 st2.
901
902definition \(\mathcal{R}\) S1 S2 (R: status_rel S1 S2) s1_ret s2_ret ≝
903 \(\forall\)s1_pre,s2_pre.
  as_after_return s1_pre s1_ret \(\to\) s1_pre \(\mathcal{C}\) s2_pre \(\to\)
905   as_after_return s2_pre s2_ret.
906\end{alltt}
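For illustration, the derived relations can also be rendered in a Python sketch of ours, where the dependent typing over abstract statuses is dropped, states become plain dictionaries with invented fields, and the universal quantification over predecessor states becomes an explicit finite list:

```python
# Sketch of the relation set: a pass provides S and C; L and R are derived.
# States are dictionaries with invented fields "label" and "pc".

def l_rel(st1, st2):
    # L: the two states emit the same label
    return st1["label"] == st2["label"]

def r_rel(c_rel, after_return1, after_return2, pre_pairs, s1_ret, s2_ret):
    # R: whenever s1_ret follows a call that is C-related to a target call,
    # s2_ret must follow that target call.  `pre_pairs` enumerates the
    # candidate pairs of predecessor (call) states.
    return all(
        after_return2(p2, s2_ret)
        for (p1, p2) in pre_pairs
        if after_return1(p1, s1_ret) and c_rel(p1, p2)
    )
```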
907
908\paragraph{1-to-0-or-many forward simulation conditions}
909\begin{condition}[Cases \texttt{cl\_other} and \texttt{cl\_jump}]
 For all $s_1,s_1',s_2$ such that $s_1 \mathcal{S} s_2$, and
 $\texttt{as\_execute}~s_1~s_1'$, and either $\texttt{as\_classify}~s_1 = \texttt{cl\_other}$, or $\texttt{as\_classify}~s_1 = \texttt{cl\_jump}$ and
$\texttt{as\_costed}~s_1'$, there exist an $s_2'$ and a $\texttt{trace\_any\_any\_free}~s_2~s_2'$ called $taaf$ such that $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and either
$taaf$ is non empty, or one among $s_1$ and $s_1'$ is not \texttt{as\_costed}.
914\end{condition}
915
916In the above condition, a $\texttt{trace\_any\_any\_free}~s_1~s_2$ is an
917inductive type of structured traces that do not contain function calls or
918cost emission statements. Differently from a \texttt{trace\_any\_any}, the
919instruction to be executed in the lookahead state $s_2$ may be a cost emission
920statement.
921
922The intuition of the condition is that one step can be replaced with zero or more steps if it
923preserves the relation between the data and if the two final statuses are
924labelled in the same way. Moreover, we must take special care of the empty case
925to avoid collapsing two consecutive states that emit the same label to just
926one state, missing one of the two emissions.
927
928\begin{condition}[Case \texttt{cl\_call}]
 For all $s_1,s_1',s_2$ s.t. $s_1 \mathcal{S} s_2$ and
 $\texttt{as\_execute}~s_1~s_1'$ and $\texttt{as\_classify}~s_1 = \texttt{cl\_call}$, there exist $s_2', s_b, s_a$, a
$\texttt{trace\_any\_any}~s_2~s_b$, and a
$\texttt{trace\_any\_any\_free}~s_a~s_2'$ such that:
$s_a$ is classified as a \texttt{cl\_call}, the \texttt{as\_call\_ident}s of
the two call states are the same, $s_1 \mathcal{C} s_b$,
$\texttt{as\_execute}~s_b~s_a$ holds, $s_1' \mathcal{L} s_a$ and
$s_1' \mathcal{S} s_2'$.
937\end{condition}
938
939The condition says that, to simulate a function call, we can perform a
940sequence of silent actions before and after the function call itself.
941The old and new call states must be $\mathcal{C}$-related, the old and new
942states at the beginning of the function execution must be $\mathcal{L}$-related
943and, finally, the two initial and final states must be $\mathcal{S}$-related
944as usual.
945
946\begin{condition}[Case \texttt{cl\_return}]
 For all $s_1,s_1',s_2$ s.t. $s_1 \mathcal{S} s_2$,
 $\texttt{as\_execute}~s_1~s_1'$ and $\texttt{as\_classify}~s_1 = \texttt{cl\_return}$, there exist $s_2', s_b, s_a$, a
$\texttt{trace\_any\_any}~s_2~s_b$, and a
$\texttt{trace\_any\_any\_free}~s_a~s_2'$ called $taaf$ such that:
$s_a$ is classified as a \texttt{cl\_return},
$s_1 \mathcal{C} s_b$, the predicate
$\texttt{as\_execute}~s_b~s_a$ holds,
$s_1' \mathcal{R} s_a$ and
$s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and either
$taaf$ is empty, or $s_a$ is not \texttt{as\_costed}.
957\end{condition}
958
959Similarly to the call condition, to simulate a return we can perform a
960sequence of silent actions before and after the return statement itself.
The old and the new states after the return must be $\mathcal{R}$-related,
to guarantee that they returned to corresponding calls.
The two initial and final states must be $\mathcal{S}$-related
as usual and, moreover, they must emit the same labels. Finally, when
the suffix is non empty we must take care not to insert a new
unmatched cost emission statement just after the return statement.
967
968\begin{comment}
969\begin{verbatim}
970definition status_simulation ≝
971  λS1 : abstract_status.
972  λS2 : abstract_status.
973  λsim_status_rel : status_rel S1 S2.
974    ∀st1,st1',st2.as_execute S1 st1 st1' →
975    sim_status_rel st1 st2 →
976    match as_classify … st1 with
977    [ None ⇒ True
978    | Some cl ⇒
979      match cl with
980      [ cl_call ⇒ ∀prf.
981        (*
982             st1' ------------S----------\
983              ↑ \                         \
984             st1 \--L--\                   \
985              | \       \                   \
986              S  \-C-\  st2_after_call →taa→ st2'
987              |       \     ↑
988             st2 →taa→ st2_pre_call
989        *)
990        ∃st2_pre_call.
991        as_call_ident ? st2_pre_call = as_call_ident ? («st1, prf») ∧
992        call_rel ?? sim_status_rel «st1, prf» st2_pre_call ∧
993        ∃st2_after_call,st2'.
994        ∃taa2 : trace_any_any … st2 st2_pre_call.
995        ∃taa2' : trace_any_any … st2_after_call st2'.
996        as_execute … st2_pre_call st2_after_call ∧
997        sim_status_rel st1' st2' ∧
998        label_rel … st1' st2_after_call
999      | cl_return ⇒
1000        (*
1001             st1
1002            / ↓
1003           | st1'----------S,L------------\
1004           S   \                           \
1005            \   \-----R-------\            |     
1006             \                 |           |
1007             st2 →taa→ st2_ret |           |
1008                          ↓   /            |
1009                     st2_after_ret →taaf→ st2'
1010
1011           we also ask that st2_after_ret be not labelled if the taaf tail is
1012           not empty
1013        *)
1014        ∃st2_ret,st2_after_ret,st2'.
1015        ∃taa2 : trace_any_any … st2 st2_ret.
1016        ∃taa2' : trace_any_any_free … st2_after_ret st2'.
1017        (if taaf_non_empty … taa2' then ¬as_costed … st2_after_ret else True) ∧
1018        as_classifier … st2_ret cl_return ∧
1019        as_execute … st2_ret st2_after_ret ∧ sim_status_rel st1' st2' ∧
1020        ret_rel … sim_status_rel st1' st2_after_ret ∧
1021        label_rel … st1' st2'
1022      | cl_other ⇒
1023          (*         
1024          st1 → st1'
1025            |      \
1026            S      S,L
1027            |        \
1028           st2 →taaf→ st2'
1029           
1030           the taaf can be empty (e.g. tunneling) but we ask it must not be the
1031           case when both st1 and st1' are labelled (we would be able to collapse
1032           labels otherwise)
1033         *)
1034        ∃st2'.
1035        ∃taa2 : trace_any_any_free … st2 st2'.
1036        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
1037        sim_status_rel st1' st2' ∧
1038        label_rel … st1' st2'
1039      | cl_jump ⇒
1040        (* just like cl_other, but with a hypothesis more *)
1041        as_costed … st1' →
1042        ∃st2'.
1043        ∃taa2 : trace_any_any_free … st2 st2'.
1044        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
1045        sim_status_rel st1' st2' ∧
1046        label_rel … st1' st2'
1047      ]
1048    ].
1049\end{verbatim}
1050\end{comment}
1051
\paragraph{Main result: the 1-to-0-or-many forward simulation conditions
are sufficient for trace reconstruction}
1054
1055Let us assume that a relation set is given such that the 1-to-0-or-many
1056forward simulation conditions are satisfied. Under this assumption we
1057can prove the following three trace reconstruction theorems by mutual
1058structural induction over the traces given in input between the
1059$s_1$ and $s_1'$ states.
1060
1061In particular, the \texttt{status\_simulation\_produce\_tlr} theorem
1062applied to the \texttt{main} function of the program and equal
1063$s_{2_b}$ and $s_2$ states shows that, for every initial state in the
1064source code that induces a structured trace in the source code,
1065the compiled code produces a similar structured trace.
1066
1067\begin{theorem}[\texttt{status\_simulation\_produce\_tlr}]
1068For every $s_1,s_1',s_{2_b},s_2$ s.t.
1069there is a $\texttt{trace\_label\_return}~s_1~s_1'$ called $tlr_1$ and a
1070$\texttt{trace\_any\_any}~s_{2_b}~s_2$ and $s_1 \mathcal{L} s_{2_b}$ and
$s_1 \mathcal{S} s_2$, there exist $s_{2_m},s_2'$ s.t.
1072there is a $\texttt{trace\_label\_return}~s_{2_b}~s_{2_m}$ called $tlr_2$ and
1073there is a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taaf$
1074s.t. if $taaf$ is non empty then $\lnot (\texttt{as\_costed}~s_{2_m})$,
1075and $\texttt{tlr\_rel}~tlr_1~tlr_2$
1076and $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1077$s_1' \mathcal{R} s_{2_m}$.
1078\end{theorem}
1079
1080The theorem states that a \texttt{trace\_label\_return} in the source code
1081together with a precomputed preamble of silent states
1082(the \texttt{trace\_any\_any}) in the target code induces a
1083similar \texttt{trace\_label\_return} trace in the target code which can be
1084followed by a sequence of silent states. Note that the statement does not
1085require the produced \texttt{trace\_label\_return} trace to start with the
1086precomputed preamble, even if this is likely to be the case in concrete
1087implementations. The preamble in input is necessary for compositionality, e.g.
1088because the 1-to-0-or-many forward simulation conditions allow in the
1089case of function calls to execute a preamble of silent instructions just after
1090the function call.
1091
1092\begin{theorem}[\texttt{status\_simulation\_produce\_tll}]
1093For every $s_1,s_1',s_{2_b},s_2$ s.t.
1094there is a $\texttt{trace\_label\_label}~b~s_1~s_1'$ called $tll_1$ and a
1095$\texttt{trace\_any\_any}~s_{2_b}~s_2$ and $s_1 \mathcal{L} s_{2_b}$ and
$s_1 \mathcal{S} s_2$, the following holds:
1097\begin{itemize}
1098 \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
1099       and a trace $\texttt{trace\_label\_label}~b~s_{2_b}~s_{2_m}$ called $tll_2$
1100       and a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taa_2$ s.t.
1101       $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1102       $s_1' \mathcal{R} s_{2_m}$ and
1103       $\texttt{tll\_rel}~tll_1~tll_2$ and
1104       if $taa_2$ is non empty then $\lnot (\texttt{as\_costed}~s_{2_m})$
1105 \item else there exists $s_2'$ and a
1106       $\texttt{trace\_label\_label}~b~s_{2_b}~s_2'$ called $tll_2$ such that
1107       $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1108       $\texttt{tll\_rel}~tll_1~tll_2$.
1109\end{itemize}
1110\end{theorem}
1111
1112The statement is similar to the previous one: a source
1113\texttt{trace\_label\_label} and a given target preamble of silent states
1114in the target code induce a similar \texttt{trace\_label\_label} in the
1115target code, possibly followed by a sequence of silent moves that become the
1116preamble for the next \texttt{trace\_label\_label} translation.
1117
1118\begin{theorem}[\texttt{status\_simulation\_produce\_tal}]
1119For every $s_1,s_1',s_2$ s.t.
1120there is a $\texttt{trace\_any\_label}~b~s_1~s_1'$ called $tal_1$ and
1121$s_1 \mathcal{S} s_2$
1122\begin{itemize}
1123 \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
1124   and a trace $\texttt{trace\_any\_label}~b~s_2~s_{2_m}$ called $tal_2$ and a
1125   $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taa_2$ s.t.
1126   $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1127   $s_1' \mathcal{R} s_{2_m}$ and
1128   $\texttt{tal\_rel}~tal_1~tal_2$ and
1129   if $taa_2$ is non empty then $\lnot (\texttt{as\_costed}~s_{2_m})$
1130 \item else there exists $s_2'$ and a
1131   $\texttt{trace\_any\_label}~b~s_2~s_2'$ called $tal_2$ such that
1132   either $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1133       $\texttt{tal\_rel}~tal_1~tal_2$
1134   or $s_1' (\mathcal{S} \cap \mathcal{L}) s_2$ and
1135   $\texttt{tal\_collapsable}~tal_1$ and $\lnot (\texttt{as\_costed}~s_1)$
1136\end{itemize}
1137\end{theorem}
1138
The statement is also similar to the previous ones, except for the absence of
the target code preamble.
1141
1142\begin{comment}
1143\begin{corollary}
1144For every $s_1,s_1',s_2$ s.t.
1145there is a $\texttt{trace\_label\_return}~s_1~s_1'$ called $tlr_1$ and
1146$s_1 (\mathcal{L} \cap \mathcal{S}) s_2$
1147there exists $s_{2_m},s_2'$ s.t.
1148there is a $\texttt{trace\_label\_return}~s_2~s_{2_m}$ called $tlr_2$ and
1149there is a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taaf$
1150s.t. if $taaf$ is non empty then $\lnot (\texttt{as\_costed}~s_{2_m})$,
1151and $\texttt{tlr\_rel}~tlr_1~tlr_2$
1152and $s_1' (\mathcal{S} \cap \mathcal{L}) s_2'$ and
1153$s_1' \mathcal{R} s_{2_m}$.
1154\end{corollary}
1155\end{comment}
1156
\begin{comment}
\begin{verbatim}
status_simulation_produce_tlr S1 S2 R
(* we start from this situation
     st1 →→→→tlr→→→→ st1'
      | \
      L  \---S--\
      |          \
   st2_lab →taa→ st2   (the taa preamble is in general either empty or given
                        by the preceding call)

   and we produce
     st1 →→→→tlr→→→→ st1'
             \\      /  \
             //     R    \-L,S-\
             \\     |           \
   st2_lab →tlr→ st2_mid →taaf→ st2'
*)
  st1 st1' st2_lab st2
  (tlr1 : trace_label_return S1 st1 st1')
  (taa2_pre : trace_any_any S2 st2_lab st2)
  (sim_execute : status_simulation S1 S2 R)
  on tlr1 : R st1 st2 → label_rel … st1 st2_lab →
  ∃st2_mid.∃st2'.
  ∃tlr2 : trace_label_return S2 st2_lab st2_mid.
  ∃taa2 : trace_any_any_free … st2_mid st2'.
  (if taaf_non_empty … taa2 then ¬as_costed … st2_mid else True) ∧
  R st1' st2' ∧ ret_rel … R st1' st2_mid ∧ label_rel … st1' st2' ∧
  tlr_rel … tlr1 tlr2
\end{verbatim}
\end{comment}

\section{Conclusions and future works}
\label{conclusions}
The labelling approach is a technique to implement compilers that induce on
the source code a non-uniform cost model determined from the object code
produced. The cost model assigns a cost to each basic block of the program.
The main theorem of the approach states that there is an exact
correspondence between the sequences of basic blocks started in the source
and object code, and that no instruction in the source or object code is
executed outside a basic block. Thus the cost of object code execution
can be computed precisely on the source.
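The correspondence can be phrased as follows (the notation here is ours and
only meant as a summary): if a run of the source program emits the sequence
of cost labels $\ell_1,\ldots,\ell_n$, and $k(\ell)$ denotes the cost
statically computed on the object code for the basic block labelled $\ell$,
then the cost of the corresponding object code run is
\[
  \sum_{i=1}^{n} k(\ell_i).
\]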

In this paper we scale the labelling approach to cover a programming language
with function calls. This introduces new difficulties only when the language
is unstructured, i.e. when it allows function calls to return anywhere in the
code, destroying the hope of statically predicting the cost of basic blocks.
We restore static predictability by introducing a new semantics for unstructured
programs that singles out well-structured executions. The latter are represented
by structured traces, a generalisation of streams of observables that capture
several structural invariants of the execution, such as the well-nesting of
function calls or the fact that every basic block must start with a code
emission statement. We show that structured traces are sufficiently structured
to statically compute a precise cost model on the object code.

We introduce a similarity relation on structured traces that must hold between
source and target traces. When the relation holds for every program, we prove
that the cost model can be lifted from the object to the source code.

In order to prove that similarity holds, we present a generic proof of forward
simulation that is aimed at separating as much as possible the part of the
simulation related to non-functional properties (preservation of structure)
from that related to functional properties. In particular, we reduce the
problem of preservation of structure to that of showing a 1-to-0-or-many
forward simulation that only adds a few proof obligations to those
of a traditional proof concerned with functional properties only.

All results presented in the paper are part of a larger certification of a
C compiler which is based on the labelling approach. The certification, done
in Matita, is the main deliverable of the FET-Open project Certified
Complexity (CerCo).

Our short-term future work consists in completing the certification of
the CerCo compiler by exploiting the main theorem of this paper.

\paragraph{Related works}
CerCo is the first project that explicitly tries to induce a
precise cost model on the source code in order to establish non-functional
properties of programs at the level of a high-level language. Traditional
certifications of compilers, like~\cite{compcert2,piton}, only explicitly
prove preservation of the functional properties.

Usually forward simulations take the following form: for each transition
from $s_1$ to $s_2$ in the source code, there exists an equivalent sequence of
transitions in the target code of length $n$. The number $n$ of transition steps
in the target code can just be the witness of the existential statement.
An equivalent alternative, when the proof of simulation is constructive,
consists in providing an explicit function, called a \emph{clock function} in
the literature~\cite{clockfunctions}, that computes $n$ from $s_1$. Every clock
function then constitutes a cost model for the source code, in the spirit of
what we are doing in CerCo. However, we believe our solution to be superior
in the following respects: 1) the machinery of the labelling approach is
insensitive to the resource being measured. Indeed, any cost model computed on
the object code can be lifted to the source code (e.g. stack space used,
energy consumed, etc.). On the contrary, clock functions only talk about
the number of transition steps. In order to extend the approach with clock
functions to other resources, additional functions must be introduced.
Moreover, these additional functions would be handled differently in the proof.
2) the cost models induced by the labelling approach have a simple
presentation. In particular, they associate a number to each basic block. More
complex models can be induced when the approach is scaled to cover, for
instance, loop optimisations~\cite{loopoptimizations}, but the costs are still
meant to be easy to understand and manipulate in an interactive theorem prover
or in Frama-C.
On the contrary, a clock function is a complex function of the state $s_1$
which, as a function, is an opaque object that is difficult to reify as
source code in order to reason on it.

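To make the comparison concrete, a clock-function simulation statement has,
schematically, the following shape (our notation, assuming a compilation
function $\sigma$ mapping source states to target states and a clock function
$\kappa$):
\[
  \forall s_1\,s_1'.\quad
  s_1 \to s_1' \implies \sigma(s_1) \to^{\kappa(s_1)} \sigma(s_1')
\]
so that the only resource the statement can speak about is the number
$\kappa(s_1)$ of target transition steps.
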
\bibliographystyle{splncs03}
\bibliography{ccexec}

% \appendix
% \section{Notes for the reviewers}
%
% The results described in the paper are part of a larger formalization
% (the certification of the CerCo compiler). At the moment of the submission
% we need to single out from the CerCo formalization the results presented here.
% Before the 16-th of February we will submit an attachment that contains the
% minimal subset of the CerCo formalization that allows to prove those results.
% At that time it will also be possible to measure exactly the size of the
% formalization described here. At the moment a rough approximation suggests
% about 2700 lines of Matita code.
%
% We will also attach the development version of the interactive theorem
% prover Matita that compiles the submitted formalization. Another possibility
% is to backport the development to the last released version of the system
% to avoid having to re-compile Matita from scratch.
%
% The programming and certification style used in the formalization heavily
% exploits dependent types. Dependent types are used: 1) to impose invariants
% by construction on the data types and operations (e.g. a trace from a state
% $s_1$ to a state $s_2$ can be concatenated to a trace from a state
% $s_2'$ to a state $s_3$ only if $s_2$ is convertible with $s_2'$); 2)
% to state and prove the theorems by using the Russell methodology of
% Matthieu Sozeau\footnote{Subset Coercions in Coq in TYPES'06. Matthieu Sozeau. Thorsten Altenkirch and Conor McBride (Eds). Volume 4502 of Lecture Notes in Computer Science. Springer, 2007, pp.237-252.
% }, better known in the Coq world as ``\texttt{Program}'' and reimplemented in a simpler way in Matita using coercion propagations\footnote{Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, Enrico Tassi: A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions. Logical Methods in Computer Science 8(1) (2012)}. However, no result presented depends
% mandatorily on dependent types: it should be easy to adapt the technique
% and results presented in the paper to HOL.
%
% Finally, Matita and Coq are based on minor variations of the Calculus of
% (Co)Inductive Constructions. These variations do not affect the CerCo
% formalization. Therefore a porting of the proofs and ideas to Coq would be
% rather straightforward.

\end{document}