source: Papers/itp-2013/ccexec.tex

% \usepackage{amsmath}

% NB: might be worth removing this if changing class in favour of
% a built-in definition.

 {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
  morekeywords={[2]if,then,else,forall,Prop,match,with,end,let},
 }

 morekeywords={int8_t}

      keywordstyle=\color{red}\bfseries,
      keywordstyle=[2]\color{blue},
      commentstyle=\color{green},
      stringstyle=\color{blue},
      showspaces=false,showstringspaces=false,
      xleftmargin=1em}

\pgfutil@ifundefined{pgf@arrow@code@implies}{% supply for lack of double arrow special arrow tip if it is not there%
  \pgfarrowsdeclare{implies}{implies}%
  {%
  \pgfarrowsleftextend{2.2pt}%
  \pgfarrowsrightextend{2.2pt}%
  }%
  {%
    \pgfsetdash{}{0pt} % do not dash%
    \pgfsetlinewidth{.33pt}%
    \pgfsetroundjoin   % fix join%
    \pgfsetroundcap    % fix cap%
    \pgfpathmoveto{\pgfpoint{-1.5pt}{2.5pt}}%
    \pgfpathcurveto{\pgfpoint{-.75pt}{1.5pt}}{\pgfpoint{0.5pt}{.5pt}}{\pgfpoint{2pt}{0pt}}%
    \pgfpathcurveto{\pgfpoint{0.5pt}{-.5pt}}{\pgfpoint{-.75pt}{-1.5pt}}{\pgfpoint{-1.5pt}{-2.5pt}}%
    \pgfusepathqstroke%
  }%

\tikzset{state/.style={inner sep = 0, outer sep = 2pt, draw, fill},
         every node/.style={inner sep=2pt},
         every on chain/.style = {inner sep = 0, outer sep = 2pt},
         join all/.style = {every on chain/.append style={join}},
         on/.style={on chain={#1}, state},
         m/.style={execute at begin node=$, execute at end node=$},
         node distance=3mm,
         is other/.style={circle, minimum size = 3pt, state},
         other/.style={on, is other},
         is jump/.style={diamond, minimum size = 6pt, state},
         jump/.style={on, is jump},
         is call/.style={regular polygon, regular polygon sides=3, minimum size=5pt, state},
         call/.style={on=going below, is call, node distance=6mm, label=above left:$#1$},
         is ret/.style={regular polygon, regular polygon sides=3, minimum size=5pt, shape border rotate=180, state},
         ret/.style={on=going above, is ret, node distance=6mm},
         chain/.style={start chain=#1 going left},
         rev ar/.style={stealth-, thick},
         ar/.style={-stealth, thick},
         every join/.style={rev ar},
         labelled/.style={fill=white, label=above:$#1$},
         vcenter/.style={baseline={([yshift=-.5ex]current bounding box)}},
         every picture/.style={thick},
         double equal sign distance/.prefix style={double distance=1.5pt}, %% if already defined (newest version of pgf) it should be ignored%}
         implies/.style={double, -implies, thin, double equal sign distance, shorten <=5pt, shorten >=5pt},
         new/.style={densely dashed},
         rel/.style={font=\scriptsize, fill=white, inner sep=2pt},
         diag/.style={row sep={11mm,between origins},
                      column sep={11mm,between origins},
                      every node/.style={draw, is other}},
         small vgap/.style={row sep={7mm,between origins}},
         % for article, maybe temporary
         is jump/.style=is other,
         is call/.style=is other,
         is ret/.style=is other,

\def\L{\mathrel{\mathcal L}}
\def\S{\mathrel{\mathcal S}}
\def\R{\mathrel{\mathcal R}}
\def\C{\mathrel{\mathcal C}}

\savebox{\execbox}{\tikz[baseline=-.5ex]\draw [-stealth] (0,0) -- ++(1em, 0);}

\title{Certification of the Preservation of Structure by a Compiler's Back-end Pass\thanks{The project CerCo acknowledges the financial support of the Future and
Emerging Technologies (FET) programme within the Seventh Framework
Programme for Research of the European Commission, under FET-Open grant
number: 243881}}
\author{Paolo Tranquilli \and Claudio Sacerdoti Coen}
\institute{Department of Computer Science and Engineering, University of Bologna,\\\email{}, \email{}}

The labelling approach is a technique to lift cost models for non-functional
properties of programs from the object code to the source code. It is based
on the preservation of the structure of the high-level program in every
intermediate language used by the compiler. Such structure is captured by
observables that are added to the semantics and that need to be preserved
by the forward simulation proof of correctness of the compiler. Additional
special observables are required for function calls. In this paper we
present a generic forward simulation proof that preserves all these observables.
The proof statement is based on a new mechanised semantics that traces the
structure of execution when the language is unstructured. The generic semantics
and simulation proof have been mechanised in the interactive theorem prover
Matita.

The \emph{labelling approach} has been introduced in~\cite{easylabelling} as
a technique to \emph{lift} cost models for non-functional properties of programs
from the object code to the source code. Examples of non-functional properties
are execution time, amount of stack/heap space consumed and energy required for
communication. The basic idea of the approach is that it is impossible to
provide a \emph{uniform} cost model for a high-level language that is preserved
\emph{precisely} by a compiler. For instance, two instances of an assignment
$x = y$ in the source code can be compiled very differently according to the
place (registers vs stack) where $x$ and $y$ are stored at the moment of
execution. Therefore a precise cost model must assign a different cost
to every occurrence, and the exact cost can only be known after compilation.

According to the labelling approach, the compiler is free to compile and optimise
the source code without any major restriction, but it must keep track
of what happens to basic blocks during compilation. The cost model is
then computed on the object code: it assigns a cost to every basic block.
Finally, the compiler propagates the cost model back to the source level,
assigning a cost to each basic block of the source code.

Implementing the labelling approach in a certified compiler
allows one to reason formally on the high-level source code of a program to prove
non-functional properties that are guaranteed to be preserved by the compiler
itself. The trusted code base is then reduced to 1) the interactive theorem
prover (or its kernel) used in the certification of the compiler and
2) the software used to certify the property on the source language, which
can itself be certified, further reducing the trusted code base.
In~\cite{easylabelling} the authors provide an example of a simple
certified compiler that implements the labelling approach for the
imperative \texttt{While} language~\cite{while}, which has neither
pointers nor function calls.

The labelling approach has been shown to scale to more interesting scenarios.
In particular, in~\cite{functionallabelling} it has been applied to a functional
language, and in~\cite{loopoptimizations} it has been shown that the approach
can be adapted, at the price of some complication, to handle loop optimisations
and, more generally, program optimisations that do not preserve the structure
of basic blocks. On-going work also shows that the labelling approach is
compatible with the complex analyses required to obtain a cost model for
object code on processors that implement advanced features like pipelining,
superscalar architectures and caches.

In the European Project CerCo (Certified Complexity~\footnote{\url{}})~\cite{cerco} we are certifying a labelling-approach-based compiler for a large subset of C to
8051 object code. The compiler is
moderately optimising and implements a compilation chain that is largely
inspired by that of CompCert~\cite{compcert1,compcert2}. Compared to the work done in~\cite{easylabelling}, the main novelty and source of difficulties is the presence
of function calls. Surprisingly, the addition of function calls requires a
revision of the proof technique given in~\cite{easylabelling}. In
particular, at the core of the labelling approach there is a forward
simulation proof that, in the case of \texttt{While}, is only minimally
more complex than the proof required for the preservation of the
functional properties alone. In the case of a programming language with
function calls, instead, it turns out that the forward simulation proof for
the back-end languages must guarantee a whole new set of invariants.

In this paper we present a formalisation in the Matita interactive theorem
prover~\cite{matita1,matita2} of a generic version of the simulation proof required for unstructured
languages. All back-end languages of the CerCo compiler are unstructured
languages, so the proof covers half of the correctness of the compiler.
The statement of the generic proof is based on a new semantics
for imperative unstructured languages that is based on \emph{structured
traces} and that restores the preservation of structure in the observables of
the semantics. The generic proof allows us to almost completely separate the
part of the simulation that deals with functional properties only from the
part that deals with the preservation of structure.

The plan of this paper is the following. In Section~\ref{labelling} we
sketch the labelling method and the problems arising from its application
to languages with function calls. In Section~\ref{semantics} we introduce
a generic description of an unstructured imperative language and the
corresponding structured traces (the novel semantics). In
Section~\ref{simulation} we describe the forward simulation proof.
Conclusions and future work are in Section~\ref{conclusions}.

\section{The labelling approach}

We briefly sketch here a simplified version of the labelling approach as
introduced in~\cite{easylabelling}. The simplification strengthens the
sufficient conditions given in~\cite{easylabelling} to allow a simpler
explanation. The simplified conditions given here are also used in the
CerCo compiler to simplify the proof.

Let $\mathcal{P}$ be a programming language whose semantics is given in
terms of observables: a run of a program yields a finite or infinite
stream of observables. We also assume for the time being that function
calls are not available in $\mathcal{P}$. We want to associate a cost
model to a program $P$ written in $\mathcal{P}$. The first step is to
extend the syntax of $\mathcal{P}$ with a new construct $\texttt{emit L}$
where $L$ is a label distinct from all observables of $\mathcal{P}$.
The semantics of $\texttt{emit L}$ is the emission of the observable
\texttt{L}, which is meant to signal the beginning of a basic block.

There exists an automatic procedure that injects into the program $P$ an
$\texttt{emit L}$ at the beginning of each basic block, using a fresh
\texttt{L} for each block. In particular, the bodies of loops, both branches
of \texttt{if-then-else}s and the targets of \texttt{goto}s must all start
with an emission statement.

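The injection procedure can be sketched concretely. The following Python fragment is an illustration only, not the CerCo implementation: the program representation and all names are invented for the example.

```python
# Illustrative sketch only: a toy program is a list of (opcode, operand)
# pairs, where branch operands are instruction indices. Every basic-block
# leader (entry point, branch target, fall-through after a branch) gets a
# fresh `emit` in front of it. A real pass would also re-index the targets.

def inject_emits(prog):
    leaders = {0}
    for i, (op, arg) in enumerate(prog):
        if op in ("goto", "branch"):
            leaders.add(arg)      # the target starts a basic block
            leaders.add(i + 1)    # so does the instruction after the branch
    out, labels = [], {}
    for i, instr in enumerate(prog):
        if i in leaders:
            labels[i] = "L%d" % len(labels)
            out.append(("emit", labels[i]))
        out.append(instr)
    return out, labels

prog = [("x:=", 1), ("branch", 3), ("x:=", 2), ("ret", None)]
instrumented, labels = inject_emits(prog)   # three blocks, three fresh labels
```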
Let now $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
organised in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
language used by the compiler. We can easily extend every
intermediate language (and its semantics) with an $\texttt{emit L}$ statement
as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
the additional difficulty that the syntax of object code is given as a
sequence of bytes. The injection of an emission statement into the object code
can be done using a map that associates the statement to a pair of consecutive
code addresses. The intended semantics is that, if $(pc_1,pc_2) \mapsto \texttt{emit L}$, then the observable \texttt{L} is emitted after the execution of the
instruction stored at $pc_1$ and before the execution of the instruction
stored at $pc_2$. The two program counters are necessary because the
instruction stored at $pc_1$ can have multiple possible successors (e.g.
in case of a conditional branch or an indirect call). Dually, the instruction
stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
target of a jump).

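A minimal executable model of this pc-pair map (hypothetical names and addresses; the real object code semantics is considerably richer) shows why both addresses are needed: the label is emitted only when control actually flows from $pc_1$ to $pc_2$.

```python
# Sketch: object code is modelled as a map from a pc to a function computing
# the next pc; emit_map attaches labels to pairs of consecutive addresses.
# All names and numbers here are invented for illustration.

def run(code, emit_map, pc=0):
    trace = []
    while pc in code:
        next_pc = code[pc]()
        if (pc, next_pc) in emit_map:
            trace.append(emit_map[(pc, next_pc)])
        pc = next_pc
    return trace

code = {0: lambda: 1, 1: lambda: 4}          # straight-line toy program
emit_map = {(0, 1): "L1", (1, 4): "L2"}      # (pc1, pc2) -> label
```

Had the instruction at address $0$ two successors, the entries $(0,1)$ and $(0,2)$ could carry different labels, which is exactly why a map on single addresses would not suffice.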
The compiler, to be functionally correct, must preserve observational
equivalence, i.e. executing the program after each compiler pass should
yield the same stream of observables. After the injection of emission
statements, observables now capture both functional and non-functional
properties.
This correctness property is called in the literature a forward simulation
and is sufficient for correctness when the target language is
deterministic.
We also require a stronger, non-functional preservation property: after each
pass all basic blocks must start with an emission statement, and all labels
\texttt{L} must be unique.

Now let $M$ be the object code obtained for the program $P$. Let us suppose
that we can statically inspect the code $M$ and associate to each basic block
a cost (e.g. the number of clock cycles required to execute all instructions
in the basic block, or an upper bound on that time). Every basic block is
labelled with a unique label \texttt{L}, thus we can actually associate the
cost to \texttt{L}. Let us call it $k(\texttt{L})$.

The function $k$ is the cost model for the object code control
blocks. It can equally be used as the cost model for the source-level control
blocks. Indeed, if the semantics of $P$ is the stream
$L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$, because
every instruction belongs to a control block and every control block is
labelled. Thus it is correct to say that the execution cost of $P$ is also
$\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
the blocks of the high-level program $P$ that is preserved by compilation.

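As a toy arithmetic check of the argument above (with made-up costs, not measured ones): the cost of a run is the same whether computed on the source trace or on the object trace, since forward simulation guarantees both emit the same stream.

```python
# k maps each cost label to the cost of its block (invented numbers).
def run_cost(stream, k):
    return sum(k[lab] for lab in stream)

k = {"L1": 4, "L2": 7}
source_stream = ["L1", "L2", "L2"]   # semantics of P
object_stream = ["L1", "L2", "L2"]   # semantics of M: same, by forward simulation
assert run_cost(source_stream, k) == run_cost(object_stream, k) == 18
```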
How can the user profit from the high-level cost model? Suppose, for instance,
that he wants to prove that the WCET of his program is bounded by $c$. It
is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
a purely functional property of the code. He can therefore use any technique
available to certify functional properties of the source code.
What is suggested in~\cite{easylabelling} is to actually instrument the
source code $P$ by replacing every label emission statement
$\texttt{emit L}$ with the instruction $\texttt{cost += k(L)}$ that increments
a global fresh variable \texttt{cost}. The bound is then proved by establishing
the program invariant $\texttt{cost} \leq c$, which can be done for example
using the Frama-C~\cite{framaC} suite if the source code is some variant of
C.

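The instrumentation step can be pictured as a source-to-source rewrite. The sketch below (an illustrative toy representation, not Frama-C input) replaces each emission by the corresponding increment and checks that executing the instrumented program accounts the expected total.

```python
def instrument(prog, k):
    # replace every `emit L` statement by `cost += k(L)`
    return [("cost+=", k[arg]) if op == "emit" else (op, arg)
            for op, arg in prog]

def exec_cost(prog):
    # run only the cost-relevant part of the toy program
    cost = 0
    for op, arg in prog:
        if op == "cost+=":
            cost += arg
    return cost

k = {"L1": 4, "L2": 7}
prog = [("emit", "L1"), ("x:=", 1), ("emit", "L2")]
assert exec_cost(instrument(prog, k)) == 11   # k(L1) + k(L2)
```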
\subsection{Labelling function calls}
We now want to extend the labelling approach to support function calls.
On the high-level, \emph{structured} programming language $\mathcal{P}$ there
is not much to change.
When a function is invoked, the current basic block is temporarily exited
and the basic block the function starts with takes control. When the function
returns, the execution of the original basic block is resumed. Thus the only
significant change is that basic blocks can now be nested. Let \texttt{E}
be the label of the external block and \texttt{I} the label of a nested one.
Since the external block starts before the internal one, the semantics observed will be
\texttt{E I} and the cost associated to it on the source language will be
$k(\texttt{E}) + k(\texttt{I})$, i.e. the cost of executing all instructions
in the block \texttt{E} plus the cost of executing all the instructions in
the block \texttt{I}. However, we know that some instructions in \texttt{E} are
executed after the last instruction of \texttt{I}. This is actually irrelevant
because we are here assuming that costs are additive, so that we can freely
permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non-trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
the function call terminates and yields control back to the basic block
\texttt{E}. If the call diverges, the instrumentation
$\texttt{cost += k(E)}$ executed at the beginning of \texttt{E} is still valid,
but only as an upper bound on the real execution cost: only precision is lost.

Let us now consider what happens when we move down the compilation chain to an
unstructured intermediate or final language. Here unstructured means that
the only control operators are conditional and unconditional jumps, function
calls and returns. Unlike a structured language, though, there is no guarantee
that a function will return control just after the function call point.
The semantics of the return statement, indeed, consists in fetching the
return address from some internal structure (typically the control stack) and
jumping directly to it. The code can freely manipulate the control stack to
make the procedure return to an arbitrary position. Indeed, it is even possible
to break the well-nesting of function calls/returns.

Is it the case that the code produced by a correct compiler must respect the
additional property that every function returns just after its function call
point? The answer is negative and the property is not implied by forward
simulation proofs. For instance, imagine modifying a correct compiler pass
by systematically adding one to the return address on the stack and by
putting a \texttt{NOP} (or any other instruction that takes one byte) after
every function call. The code obtained will be functionally indistinguishable,
and the added instructions will all be dead code.

This lack of structure in the semantics badly interferes with the labelling
approach. The reason is the following: when a basic block labelled with
\texttt{E} contains a function call, it no longer makes any sense to associate
to the label \texttt{E} the sum of the costs of all the instructions in the block.
Indeed, there is no guarantee that the function will return into the block and
that the instructions executed after the return will be the ones
we are paying for in the cost model.

How can we make the labelling approach work in this scenario? We only see two
possible ways. The first one consists in injecting an emission statement after
every function call: basic blocks no longer contain function calls, but are now
terminated by them. This completely solves the problem and allows the compiler
to break the structure of function calls/returns at will. However, the
technique has several drawbacks. First of all, it greatly increases the number
of cost labels that are injected into the source code and that become
instrumentation statements. Thus, when reasoning on the source code to prove
non-functional properties, the user (or the automation tool) will have to handle
larger expressions. Second, the more labels are emitted, the more difficult it
becomes to implement powerful optimisations respecting the code structure.
Indeed, function calls are usually implemented in such a way that most registers
are preserved by the call, so that the static analysis of the block is not
interrupted by the call and an optimisation can involve both the code before
and after the function call. Third, instrumenting the source code may require
unpleasant modifications to it. Take, for example, the code
\texttt{f(g(x));}. We need to inject an emission statement/instrumentation
instruction just after the execution of \texttt{g}. The only way to do that
is to rewrite the code as \texttt{y = g(x); emit L; f(y);} for some fresh
variable \texttt{y}. It is pretty clear how in certain situations the code
obtained would be more obfuscated and thus more difficult to reason on manually.

For the previous reasons, in this paper and in the CerCo project we adopt a
different approach. We do not inject emission statements after every
function call. However, we want to propagate a strong additional invariant in
the forward simulation proof. The invariant is the propagation of the structure
of the original high-level code, even if the target language is unstructured.
The structure we want to propagate, which will become clearer in the next
section, comprises 1) the property that every function should return just after
the function call point, which in turn implies well-nesting of function calls;
2) the property that every basic block starts with a code emission statement.

In the original labelling approach of~\cite{easylabelling}, the second property
was guaranteed syntactically, as a property of the generated code.
In our revised approach, instead, we impose the property on the runs:
it will be possible to generate code that does not respect the syntactic
property, as long as all possible runs respect it. For instance, dead code will no longer
be required to have all basic blocks correctly labelled. The switch is suggested
by the fact that the first of the two properties --- the one related to
function calls/returns --- can only be defined as a property of runs,
not of the static code. The switch is
beneficial to the proof because the original proof was made of two parts:
the forward simulation proof and the proof that the static property held.
In our revised approach the latter disappears and only the forward simulation
is kept.

In order to capture the structure in the semantics, so that it is preserved
by a forward simulation argument, we need to make the structure observable
in the semantics. This is the topic of the next section.

\section{Structured traces}

The program semantics adopted in the traditional labelling approach is based
on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
a set of states $\mathcal{S}$, the semantics of one deterministic execution
step is
defined as a function $\mathcal{S} \to \mathcal{S} \times \mathcal{O}^*$ where $\mathcal{O}^*$ is a (finite) stream of
observables. The semantics is then lifted compositionally to multiple (finite
or infinite) execution steps.
Finally, the semantics of a whole program execution is obtained by forgetting
about the final state (if any), yielding a function $\mathcal{S} \to \mathcal{O}^*$ that given an
initial state returns the finite or infinite stream of observables in output.

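In code, the traditional semantics amounts to iterating a step function and concatenating its outputs. The following sketch (terminating case only; all names are invented for illustration) is the lifting just described:

```python
# One deterministic step: state -> (state, finite list of observables).
# The lifting iterates until a final state and forgets the final state,
# returning only the concatenated stream of observables.
def lift(step, is_final, s):
    out = []
    while not is_final(s):
        s, obs = step(s)
        out.extend(obs)
    return out

# Toy instance: states are naturals counting down, each step observes a label.
step = lambda n: (n - 1, ["L%d" % n])
assert lift(step, lambda n: n == 0, 2) == ["L2", "L1"]
```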
We present here a new definition of the semantics where the structure of execution,
as defined in the previous section, is observable. The idea is to replace
the stream of observables with a structured data type that makes
function calls and returns explicit and that guarantees some additional invariants by
construction. The data structure, called \emph{structured traces}, is
defined inductively for terminating programs and coinductively for diverging
ones. In the paper we focus only on the inductive structure, i.e. we assume
that all programs that are given a semantics are total. The Matita formalisation
also contains the coinductive definitions. The semantics of a program is then
defined as a function that maps an initial state to a structured trace.

In order to have a definition that works on multiple intermediate languages,
we abstract the type of structured traces over an abstract data type of
abstract statuses, which we aptly call $\texttt{abstract\_status}$. The fields
of this record are the following.
\begin{itemize}
 \item \verb+S : Type[0]+, the type of states.
 \item \verb+as_execute : S $\to$ S $\to$ Prop+, a binary predicate stating
 an execution step. We write $s_1\exec s_2$ for $\verb+as_execute+~s_1~s_2$.
 \item \verb+as_classifier : S $\to$ classification+, a function tagging all
 states with a class in
 $\{\texttt{cl\_return,cl\_jump,cl\_call,cl\_other}\}$, depending on the instruction
 that is about to be executed (we omit tail-calls for simplicity). We will
 use $s \class c$ as a shorthand for both $\texttt{as\_classifier}~s=c$
 (if $c$ is a classification) and $\texttt{as\_classifier}~s\in c$
 (if $c$ is a set of classifications).
 \item \verb+as_label : S $\to$ option label+, telling whether the
 next instruction to be executed in $s$ is a cost emission statement,
 and if so returning the associated cost label. Our shorthand for this function
 will be $\ell$, and we will also abuse the notation by using $\ell~s$ as a
 predicate stating that $s$ is labelled.
 \item \verb+as_call_ident : ($\Sigma$s:S. s $\class$ cl_call) $\to$ label+,
 returning the identifier of the function which is being called in a
 \verb+cl_call+ state. We will use the shorthand $s\uparrow f$ for
 $\verb+as_call_ident+~s = f$.
 \item \verb+as_after_return : ($\Sigma$s:S. s $\class$ cl_call) $\to$ S $\to$ Prop+,
 which holds on a \verb+cl_call+ state $s_1$ and a state $s_2$ when the
 instruction to be executed in $s_2$ follows the function call to be
 executed in (the witness of the $\Sigma$-type) $s_1$. We will use the notation
 $s_1\ar s_2$ for this relation.
\end{itemize}

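As a rough transliteration (the formalisation is in Matita; this Python rendering is only meant to fix intuitions, and it collapses the dependent $\Sigma$-types into plain functions), the record can be pictured as:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class AbstractStatus:
    # as_execute: the one-step execution relation s1 ~> s2
    execute: Callable[[object, object], bool]
    # as_classifier: 'cl_return' | 'cl_jump' | 'cl_call' | 'cl_other'
    classifier: Callable[[object], str]
    # as_label: the cost label of the next instruction, if any
    label: Callable[[object], Optional[str]]
    # as_call_ident: only meaningful on cl_call states
    call_ident: Callable[[object], str]
    # as_after_return: holds when the next instruction of the second
    # state follows the function call performed in the first (a call) state
    after_return: Callable[[object, object], bool]
```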
% \begin{alltt}
% record abstract_status := \{ S: Type[0];
%  as_execute: S \(\to\) S \(\to\) Prop;   as_classifier: S \(\to\) classification;
%  as_label: S \(\to\) option label;    as_called: (\(\Sigma\)s:S. c s = cl_call) \(\to\) label;
%  as_after_return: (\(\Sigma\)s:S. c s = cl_call) \(\to\) S \(\to\) Prop \}
% \end{alltt}

The inductive type for structured traces is actually made of three mutually
inductive types with the following semantics:
\begin{itemize}
 \item $(\texttt{trace\_label\_return}~s_1~s_2)$ (shorthand $\verb+TLR+~s_1~s_2$)
   is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_1$ is a label emission
   statement and the one to be executed in the state before $s_2$ is a return
   statement. Thus $s_2$ is the state after the return. The trace
   may contain other label emission statements. It captures the structure of
   the execution of function bodies: they must start with a cost emission
   statement and must end with a return; they are obtained by concatenating
   one or more basic blocks, all starting with a label emission
   (e.g. in case of loops).
 \item $(\texttt{trace\_any\_label}~b~s_1~s_2)$ (shorthand $\verb+TAL+~b~s_1~s_2$)
   is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that, according to the boolean $b$, either the instruction to be
   executed in $s_2$ is a label emission statement, or the instruction
   executed in the state before $s_2$ is a return. It must not contain
   any label emission statement. It captures the notion of a suffix of a
   basic block.
 \item $(\texttt{trace\_label\_label}~b~s_1~s_2)$ (shorthand $\verb+TLL+~b~s_1~s_2$) is the special case of
   $\verb+TAL+~b~s_1~s_2$ such that the instruction to be
   executed in $s_1$ is a label emission statement. It captures the notion of
   a basic block.
\end{itemize}

 {\texttt{TLL}~true~s_1~s_2}
 {\texttt{TLR}~s_1~s_2}

 {\texttt{TLL}~false~s_1~s_2 \andalso
  \texttt{TLR}~s_2~s_3
 }
 {\texttt{TLR}~s_1~s_3}

 {\texttt{TAL}~b~s_1~s_2 \andalso
  \ell~s_1
 }
 {\texttt{TLL}~b~s_1~s_2}

 {s_1\exec s_2 \andalso
  s_1\class\{\verb+cl_jump+, \verb+cl_other+\}\andalso
  \ell~s_2
 }
 {\texttt{TAL}~false~s_1~s_2}

 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_return}
 }
 {\texttt{TAL}~true~s_1~s_2}

 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_call} \andalso
  s_1\ar s_3 \andalso
  \texttt{TLR}~s_2~s_3 \andalso
  \ell~s_3
 }
 {\texttt{TAL}~false~s_1~s_3}

 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_call} \andalso
  s_1\ar s_3 \andalso
  \texttt{TLR}~s_2~s_3 \andalso
  \texttt{TAL}~b~s_3~s_4
 }
 {\texttt{TAL}~b~s_1~s_4}

 {s_1\exec s_2 \andalso
  \lnot \ell~s_2 \andalso
  \texttt{TAL}~b~s_2~s_3\andalso
  s_1 \class \texttt{cl\_other}
 }
 {\texttt{TAL}~b~s_1~s_3}

inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
  | tlr_base:
      ∀status_before: S.
      ∀status_after: S.
        trace_label_label S ends_with_ret status_before status_after →
        trace_label_return S status_before status_after
  | tlr_step:
      ∀status_initial: S.
      ∀status_labelled: S.
      ∀status_final: S.
        trace_label_label S doesnt_end_with_ret status_initial status_labelled →
        trace_label_return S status_labelled status_final →
          trace_label_return S status_initial status_final
with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
  | tll_base:
      ∀ends_flag: trace_ends_with_ret.
      ∀start_status: S.
      ∀end_status: S.
        trace_any_label S ends_flag start_status end_status →
        as_costed S start_status →
          trace_label_label S ends_flag start_status end_status
with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
  (* Single steps within a function which reach a label.
     Note that this is the only case applicable for a jump. *)
  | tal_base_not_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        (as_classifier S start_status cl_jump ∨
         as_classifier S start_status cl_other) →
        as_costed S final_status →
          trace_any_label S doesnt_end_with_ret start_status final_status
  | tal_base_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        as_classifier S start_status cl_return →
          trace_any_label S ends_with_ret start_status final_status
  (* A call followed by a label on return. *)
  | tal_base_call:
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_final →
          trace_label_return S status_start_fun_call status_final →
          as_costed S status_final →
            trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
  (* A call followed by a non-empty trace. *)
  | tal_step_call:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_after_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_after_fun_call →
          trace_label_return S status_start_fun_call status_after_fun_call →
          ¬ as_costed S status_after_fun_call →
          trace_any_label S end_flag status_after_fun_call status_final →
            trace_any_label S end_flag status_pre_fun_call status_final
  | tal_step_default:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre: S.
      ∀status_init: S.
      ∀status_end: S.
        as_execute S status_pre status_init →
        trace_any_label S end_flag status_init status_end →
        as_classifier S status_pre cl_other →
        ¬ (as_costed S status_init) →
          trace_any_label S end_flag status_pre status_end.

A \texttt{trace\_label\_return} is isomorphic to a list of
\texttt{trace\_label\_label}s that ends with a cost emission followed by a
return-terminated \texttt{trace\_label\_label}.
The interesting cases are those of $\texttt{trace\_any\_label}~b~s_1~s_2$.
A \texttt{trace\_any\_label} is a sequence of steps built by a syntax-directed
definition on the classification of $s_1$. The constructors of the datatype
enforce several invariants that are meant to impose a structure on the
otherwise unstructured execution. In particular, the following invariants are
 \item the trace is never empty; it ends with a return iff $b$ is
       true
 \item a jump must always be the last instruction of the trace, and it must
       be followed by a cost emission statement; i.e. the target of a jump
       is always the beginning of a new basic block; as such it must start
       with a cost emission statement
 \item a cost emission statement can never occur inside the trace, only in
       the status immediately after
 \item the trace for a function call step is made of a subtrace for the
       function body of type
       $\texttt{trace\_label\_return}~s_1~s_2$, possibly followed by the
       rest of the trace for this basic block. The subtrace represents the
       function execution. Being an inductive datum, it guarantees that the
       function call terminates. The status $s_2$ is the one that follows the
       return statement. The next instruction to be executed in $s_2$ must be
       the one that follows the function call instruction. As a consequence,
       function calls are also well nested.
There are three functions, defined by mutual structural recursion, one for each
of \verb+TLR+, \verb+TLL+ and \verb+TAL+, for which we use the same notation
$|\,.\,|$: the \emph{flattening} of the traces. These functions
allow one to extract from a structured trace the list of emitted cost labels.
%  We only show here the type of one
% of them:
% \begin{alltt}
% flatten_trace_label_return:
%  \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
%   trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
% \end{alltt}
\paragraph{Cost prediction on structured traces.}
The first main theorem of CerCo about traces
(theorem \texttt{compute\_max\_trace\_label\_return\_cost\_ok\_with\_trace})
holds for the instantiation
of the structured traces to the concrete status of object code programs.
Simplifying a bit, it states that
\begin{array}{l}\forall s_1,s_2. \forall \tau: \texttt{TLR}~s_1~s_2.~
  \texttt{clock}~s_2 = \texttt{clock}~s_1 +
  \Sigma_{\alpha \in |\tau|}\;k(\alpha)
where the cost model $k$ is statically computed from the object code
by associating to each label $\alpha$ the sum of the cost of the instructions
in the basic block that starts at $\alpha$ and ends before the next labelled
instruction. The theorem is proved by structural induction over the structured
trace, and is based on the invariant that
if the function that computes the cost model has analysed the instruction
to be executed at $s_2$ after the one to be executed at $s_1$, and if
the structured trace starts with $s_1$, then eventually it will contain also
$s_2$. When $s_1$ is not a function call, the result holds trivially because
of the $s_1\exec s_2$ condition obtained by inversion on
the trace. The only non-trivial
case is the one of function calls: the cost model computation function
does recursion on the first instruction that follows that function call; the
\texttt{as\_after\_return} condition of the \texttt{tal\_base\_call} and
\texttt{tal\_step\_call} constructors guarantees exactly that the execution
will eventually reach this state.
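As a concrete illustration, consider a hypothetical trace $\tau$ whose
flattening is $|\tau| = [\alpha; \beta; \alpha]$: the statement instantiates to
$$\texttt{clock}~s_2 = \texttt{clock}~s_1 + k(\alpha) + k(\beta) + k(\alpha)$$
so that a label occurring several times in the flattening contributes its cost
once per occurrence.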
\paragraph{Structured traces similarity and cost prediction invariance.}
A compiler pass maps source to object code and initial states to initial
states. The source code and initial state uniquely determine the structured
trace of a program, if it exists. The structured trace fails to exist iff
the structural conditions are violated by the program execution (e.g. a function
body does not start with a cost emission statement). Let us assume that the
target structured trace exists.
What is the relation between the source and target structured traces?
In general, the two traces can be arbitrarily different. However, we are
interested only in those compiler passes that map a trace $\tau_1$ to a trace
$\tau_2$ such that
\begin{equation}|\tau_1| = |\tau_2|.\label{th2}\end{equation}
The reason is that the combination of~\eqref{th1} with~\eqref{th2} yields the
\forall s_1,s_2. \forall \tau: \texttt{TLR}~s_1~s_2.~
  \texttt{clock}~s_2 - \texttt{clock}~s_1 =
  \Sigma_{\alpha \in |\tau_1|}\;k(\alpha) =
  \Sigma_{\alpha \in |\tau_2|}\;k(\alpha).
This corollary states that the actual execution time of the program can be
computed equally well on the source or target language. Thus it becomes possible
to transfer the cost model from the target to the source code and reason on the
source code only.
We are therefore interested in conditions stronger than~\eqref{th2}.
To this end, we introduce here a similarity relation between traces with
the same structure. Theorem~\texttt{tlr\_rel\_to\_traces\_same\_flatten}
in the Matita formalisation shows that~\eqref{th2} holds for every pair
$(\tau_1,\tau_2)$ of similar traces.

Intuitively, two traces are similar when one can be obtained from
the other by erasing or inserting silent steps, i.e. states that are
not \texttt{as\_costed} and that are classified as \texttt{cl\_other}.
Silent steps do not alter the structure of the traces.
In particular,
the relation maps function calls to function calls of the same function,
label emission statements to emissions of the same label, concatenation of
subtraces to concatenation of subtraces of the same length and starting with
the same emission statement, etc.
In the formalisation the three similarity relations --- one for each trace
kind --- are defined by structural recursion on the first trace and pattern
matching over the second. Here we turn
the definition into the inference rules shown in \autoref{fig:txx_rel}
for the sake of readability. We also omit from trace constructors all
arguments except those that are traces or that are used in the premises of
the rules. By abuse of notation we denote all three
relations by infixing $\approx$.
 {tll_1\approx tll_2
 }
 {\texttt{tlr\_base}~tll_1 \approx \texttt{tlr\_base}~tll_2}
 {tll_1 \approx tll_2 \andalso
  tlr_1 \approx tlr_2
 }
 {\texttt{tlr\_step}~tll_1~tlr_1 \approx \texttt{tlr\_step}~tll_2~tlr_2}
 {\ell~s_1 = \ell~s_2 \andalso
  tal_1\approx tal_2
 }
 {\texttt{tll\_base}~s_1~tal_1 \approx \texttt{tll\_base}~s_2~tal_2}
 {tal_1\approx tal_2
 }
 {\texttt{tal\_step\_default}~tal_1 \approx tal_2}
 {}
 {\texttt{tal\_base\_not\_return}\approx taa \append \texttt{tal\_base\_not\_return}}
 {}
 {\texttt{tal\_base\_return}\approx taa \append \texttt{tal\_base\_return}}
 {tlr_1\approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f
 }
 {\texttt{tal\_base\_call}~s_1~tlr_1\approx taa \append \texttt{tal\_base\_call}~s_2~tlr_2}
 {tlr_1\approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f \andalso
  \texttt{tal\_collapsable}~tal_2
 }
 {\texttt{tal\_base\_call}~s_1~tlr_1 \approx taa \append \texttt{tal\_step\_call}~s_2~tlr_2~tal_2}
 {tlr_1\approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f \andalso
  \texttt{tal\_collapsable}~tal_1
 }
 {\texttt{tal\_step\_call}~s_1~tlr_1~tal_1 \approx taa \append \texttt{tal\_base\_call}~s_2~tlr_2}
 {tlr_1 \approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f \andalso
  tal_1 \approx tal_2
 }
 {\texttt{tal\_step\_call}~s_1~tlr_1~tal_1 \approx taa \append \texttt{tal\_step\_call}~s_2~tlr_2~tal_2}
\caption{The inference rules for the relation $\approx$.}
let rec tlr_rel S1 st1 st1' S2 st2 st2'
  (tlr1 : trace_label_return S1 st1 st1')
  (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
match tlr1 with
  [ tlr_base st1 st1' tll1 ⇒
    match tlr2 with
    [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
    | _ ⇒ False
    ]
  | tlr_step st1 st1' st1'' tll1 tl1 ⇒
    match tlr2 with
    [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
      tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
    | _ ⇒ False
    ]
  ]
and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
 (tll1 : trace_label_label S1 fl1 st1 st1')
 (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
  match tll1 with
  [ tll_base fl1 st1 st1' tal1 H ⇒
    match tll2 with
    [ tll_base fl2 st2 st2 tal2 G ⇒
      as_label_safe … («?, H») = as_label_safe … («?, G») ∧
      tal_rel … tal1 tal2
    ]
  ]
and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
 (tal1 : trace_any_label S1 fl1 st1 st1')
 (tal2 : trace_any_label S2 fl2 st2 st2')
   on tal1 : Prop ≝
  match tal1 with
  [ tal_base_not_return st1 st1' _ _ _ ⇒
    fl2 = doesnt_end_with_ret ∧
    ∃st2mid,taa,H,G,K.
    tal2 ≃ taa_append_tal ? st2 ??? taa
      (tal_base_not_return ? st2mid st2' H G K)
  | tal_base_return st1 st1' _ _ ⇒
    fl2 = ends_with_ret ∧
    ∃st2mid,taa,H,G.
    tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
      (tal_base_return ? st2mid st2' H G)
  | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
    fl2 = doesnt_end_with_ret ∧
    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    (* we must allow a tal_base_call to be similar to a call followed
      by a collapsable trace (trace_any_any followed by a base_not_return;
      we cannot use trace_any_any as it disallows labels in the end as soon
      as it is non-empty) *)
    (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
      tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
      tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
  | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
      tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
      tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
      tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
  | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
    tal_rel … tl1 tal2 (* <- this makes it many to many *)
  ].
In the preceding rules, a $taa$ is an inhabitant of the type
$\texttt{trace\_any\_any}~s_1~s_2$ (shorthand $\texttt{TAA}~s_1~s_2$),
an inductive data type whose definition
is omitted from the paper for lack of space. It is the type of valid
prefixes (even empty ones) of \texttt{TAL}'s that do not contain
any function call. Therefore it
is possible to concatenate (using ``$\append$'') a \texttt{TAA} to the
left of a \texttt{TAL}. A \texttt{TAA} captures
a sequence of silent moves.
The \texttt{tal\_collapsable} unary predicate over \texttt{TAL}'s
holds when the argument does not contain any function call and ends
with a label (not a return). The intuition is that after a function call we
can still perform a sequence of silent actions while remaining similar.
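Though its formal definition is also omitted here, \texttt{tal\_collapsable}
can be pictured as an inductive predicate along the following lines (a sketch
with most arguments elided, not the exact Matita code):
\begin{alltt}
inductive tal_collapsable:
  \(\forall\)S,b,\(s_1\),\(s_2\). trace_any_label S b \(s_1\) \(s_2\) \(\to\) Prop :=
| tc_base: \(\forall\)\ldots. tal_collapsable \ldots (tal_base_not_return \ldots)
| tc_step: \(\forall\)\ldots tl. tal_collapsable \ldots tl \(\to\)
    tal_collapsable \ldots (tal_step_default \ldots tl).
\end{alltt}
In other words, the predicate holds exactly for traces built from
\texttt{tal\_base\_not\_return} and \texttt{tal\_step\_default}, which by
construction contain no function call and end in a costed state.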
As should be expected, even though the rules are asymmetric, $\approx$ is in
fact an equivalence relation.
\section{Forward simulation}
We summarise here the results of the previous sections. Each intermediate
unstructured language can be given a semantics based on structured traces,
which single out those runs that respect a certain number of invariants.
A cost model can be computed on the object code and it can be used to predict
the execution costs of runs that produce structured traces. The cost model
can be lifted from the target to the source code of a pass if the pass maps
structured traces to similar structured traces. The latter property is called
a \emph{forward simulation}.
As with labelled transition systems, in order to establish the forward
simulation we are interested in (preservation of observables), we are
forced to prove a stronger notion of forward simulation that introduces
an explicit relation between states. The classical notion of a 1-to-many
forward simulation is the existence of a relation $\S$ over states such that
if $s_1 \S s_2$ and $s_1 \to^1 s_1'$ then there exists an $s_2'$ such that
$s_2 \to^* s_2'$ and $s_1' \S s_2'$. In our context, we need to replace the
one-step and multi-step transition relations $\to^n$ with the existence of
a structured trace between the two states, and to add the requirement that
the two structured traces are similar. Thus what we would like to state is
something like:\\
for all $s_1,s_2,s_1'$ such that there is a $\tau_1$ from
$s_1$ to $s_1'$ and $s_1 \S s_2$ there exists an $s_2'$ such that
$s_1' \S s_2'$ and a $\tau_2$ from $s_2$ to $s_2'$ such that
$\tau_1$ is similar to $\tau_2$. We call this particular form of forward
simulation \emph{trace reconstruction}.
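In symbols, taking the structured traces to be, for instance, \texttt{TLR}s,
this naive statement would read:
$$\forall s_1,s_2,s_1'.~ s_1 \S s_2 \to \forall \tau_1: \texttt{TLR}~s_1~s_1'.~
  \exists s_2',\tau_2: \texttt{TLR}~s_2~s_2'.~ s_1' \S s_2' \wedge
  \tau_1 \approx \tau_2.$$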
The statement just introduced, however, is too simplistic and not provable
in the general case. To understand why, consider the case of a function call
and the pass that fixes the parameter passing conventions. A function
call in the source code takes as input an arbitrary number of pseudo-registers (the actual parameters to pass) and returns an arbitrary number of pseudo-registers (where the result is stored). A function call in the target language has no
input nor output parameters. The pass must add explicit code before and after
the function call to move the pseudo-registers content from/to the hardware
registers or the stack in order to implement the parameter passing strategy.
Similarly, each function body must be augmented with a preamble and a postamble
to complete/initiate the parameter passing strategy for the call/return phase.
Therefore what used to be a call followed by the next instruction to execute
after the function return, now becomes a sequence of instructions, followed by
a call, followed by another sequence. The two states at the beginning of the
first sequence and at the end of the second sequence are in relation with
the status before/after the call in the source code, as in a usual forward
simulation. How can we prove, however, the additional condition for function
calls, which asks that when the function returns, execution resumes at the
instruction immediately after the function call? To guarantee this invariant,
there must be another relation between the addresses of the function call in
the source and in the target code.
This additional relation is to be used in particular to relate the two stacks.
Another example is given by preservation of code emission statements. A single
code emission instruction can be simulated by a sequence of steps, followed
by a code emission, followed by another sequence. Clearly the initial and final
statuses of the sequence are to be in relation with the status before/after the
code emission in the source code. In order to preserve the structured traces
invariants, however, we must consider a second relation between states that
tracks the preservation of the code emission statement.
Therefore we now introduce an abstract notion of relation set between abstract
statuses and an abstract notion of 1-to-many forward simulation conditions.
These two definitions enjoy the following remarkable properties:
 \item they are generic enough to accommodate all passes of the CerCo compiler
 \item the conjunction of the 1-to-many forward simulation conditions is
       just slightly stricter than the statement of a 1-to-many forward
       simulation in the classical case. In particular, they only require
       the construction of very simple forms of structured traces made of
       silent states only.
 \item they allow us to prove the main result of the paper: the 1-to-many
       forward simulation conditions are sufficient to prove the trace
       reconstruction theorem
Point 3 is the important one. First of all it means that we have reduced
the complex problem of trace reconstruction to a much simpler one that,
moreover, can be solved with slight adaptations of the forward simulation proof
that is performed for a compiler that only cares about functional properties.
Therefore we have successfully separated, as much as possible, the proof of
preservation of functional properties from that of non-functional ones.
Secondly, combined with the results in the previous section, it implies
that the cost model can be computed on the object code and lifted to the
source code to reason on non-functional properties, assuming that
the 1-to-many forward simulation conditions are fulfilled for every
compiler pass.
\paragraph{Relation sets.}

We introduce now the four relations $\mathcal{S,C,L,R}$ between abstract
statuses that are used to correlate the corresponding statuses before and
after a compiler pass. The first two are abstract and must be instantiated
by every pass. The remaining two are derived relations.

The $\S$ relation between states is the classical relation used
in forward simulation proofs. It correlates the data of the status
(e.g. registers, memory, etc.).

The $\C$ relation correlates call states. It allows us to track the
position in the target code of every call in the source code.

The $\L$ relation simply says that the two states are both label
emitting states that emit the same label, \emph{i.e.}\ $s_1\L s_2\iffdef \ell~s_1=\ell~s_2$.
It allows us to track the position in
the target code of every cost emitting statement in the source code.

Finally, the $\R$ relation is the most complex one. Two states
$s_1$ and $s_2$ are $\R$-correlated if, whenever $s_1$ is the
successor of a call state $s_1'$ that is $\C$-related to a call state
$s_2'$ in the target code, then $s_2$ is the successor of $s_2'$. Formally:
$$s_1\R s_2 \iffdef \forall s_1',s_2'.s_1'\C s_2' \to s_1'\ar s_1 \to s_2' \ar s_2.$$
We will require all pairs of states that follow a related call to be
$\R$-related. This is the fundamental requirement guaranteeing
that the target trace is well structured, \emph{i.e.}\ that calls are well
nested and return where they are supposed to.
% \begin{alltt}
% record status_rel (S1,S2 : abstract_status) : Type[1] := \{
%   \(\S\): S1 \(\to\) S2 \(\to\) Prop;
%   \(\C\): (\(\Sigma\)s.as_classifier S1 s cl_call) \(\to\)
%      (\(\Sigma\)s.as_classifier S2 s cl_call) \(\to\) Prop \}.
% definition \(\L\) S1 S2 st1 st2 := as_label S1 st1 = as_label S2 st2.
% definition \(\R\) S1 S2 (R: status_rel S1 S2) s1_ret s2_ret ≝
%  \(\forall\)s1_pre,s2_pre.
%   as_after_return s1_pre s1_ret \(\to\) s1_pre \(\R\) s2_pre \(\to\)
%    as_after_return s2_pre s2_ret.
% \end{alltt}
% \begin{subfigure}{.475\linewidth}
% \centering
% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
%                             every label/.style=overlay, node distance=10mm]
%     \matrix [diag] (m) {%
%          \node (s1) [is jump] {}; & \node [fill=white] (t1) {};\\
%          \node (s2) {}; & \node (t2) {}; \\
%     };
%     \node [above=0 of t1, overlay] {$\alpha$};
%     {[-stealth]
%     \draw (s1) -- (t1);
%     \draw [new] (s2) -- node [above] {$*$} (t2);
%     }
%     \draw (s1) to node [rel] {$\S$} (s2);
%     \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
% \end{tikzpicture}
% \caption{The \texttt{cl\_jump} case.}
% \label{subfig:cl_jump}
% \end{subfigure}
% &
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag] (m) {%
         \node (s1) {}; & \node (t1) {};\\
         \node (s2) {}; & \node (t2) {}; \\
    };
    {[-stealth]
    \draw (s1) -- (t1);
    \draw [new] (s2) -- node [above] {$*$} (t2);
    }
    \draw (s1) to node [rel] {$\S$} (s2);
    \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
\caption{The \texttt{cl\_other} and \texttt{cl\_jump} cases.}
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag, small vgap] (m) {%
        \node (t1) {}; \\
         \node (s1) [is call] {}; \\
         & \node (l) {}; & \node (t2) {};\\
         \node (s2) {}; & \node (c) [is call] {};\\
    };
    {[-stealth]
    \draw (s1) -- node [left] {$f$} (t1);
    \draw [new] (s2) -- node [above] {$*$} (c);
    \draw [new] (c) -- node [right] {$f$} (l);
    \draw [new] (l) -- node [above] {$*$} (t2);
    }
    \draw (s1) to node [rel] {$\S$} (s2);
    \draw [new] (t1) to [bend left] node [rel] {$\S$} (t2);
    \draw [new] (t1) to [bend left] node [rel] {$\L$} (l);
    \draw [new] (t1) to node [rel] {$\C$} (c);
    \end{tikzpicture}
\caption{The \texttt{cl\_call} case.}
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag, small vgap] (m) {%
        \node (s1) [is ret] {}; \\
        \node (t1) {}; \\
        \node (s2) {}; & \node (c) [is ret] {};\\
        & \node (r) {}; & \node (t2) {}; \\
    };
    {[-stealth]
    \draw (s1) -- (t1);
    \draw [new] (s2) -- node [above] {$*$} (c);
    \draw [new] (c) -- (r);
    \draw [new] (r) -- node [above] {$*$} (t2);
    }
    \draw (s1) to [bend right=45] node [rel] {$\S$} (s2);
    \draw [new, overlay] (t1) to [bend left=90, looseness=1] node [rel] {$\S,\L$} (t2);
    \draw [new, overlay] (t1) to [bend left=90, looseness=1.2] node [rel] {$\R$} (r);
\caption{The \texttt{cl\_return} case.}
\caption{Mnemonic diagrams depicting the hypotheses for the preservation of structured traces.
         Dashed lines
         and arrows indicate how the diagrams must be closed when solid relations
         are present.}
\paragraph{1-to-many forward simulation conditions.}
\begin{condition}[Cases \texttt{cl\_other} and \texttt{cl\_jump}]
 For all $s_1,s_1',s_2$ such that $s_1 \S s_2$, and
 $s_1\exec s_1'$, and either $s_1 \class \texttt{cl\_other}$ or
 both $s_1\class\texttt{cl\_jump}$ and $\ell~s_1'$,
 there exists an $s_2'$ and a $\texttt{trace\_any\_any\_free}~s_2~s_2'$ called $taaf$
 such that $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
$taaf$ is non-empty, or one among $s_1$ and $s_1'$ is \texttt{as\_costed}.
In the above condition depicted in \autoref{subfig:cl_other_jump},
a $\texttt{trace\_any\_any\_free}~s_1~s_2$ (which from now on
will be shorthanded as \verb+TAAF+) is an
inductive type of structured traces that do not contain function calls or
cost emission statements. Unlike a \verb+TAA+, the
instruction to be executed in the lookahead state $s_2$ may be a cost emission
statement.

The intuition behind the condition is that one step can be replaced with zero or
more steps if it
preserves the relation between the data and if the two final statuses are
labelled in the same way. Moreover, we must take special care in the empty case
to avoid collapsing two consecutive states that emit a label, thereby losing one
of the two emissions.
\begin{condition}[Case \texttt{cl\_call}]
 For all $s_1,s_1',s_2$ s.t. $s_1 \S s_2$ and
 $s_1\exec s_1'$ and $s_1 \class \texttt{cl\_call}$, there exists $s_a, s_b, s_2'$, a
$\verb+TAA+~s_2~s_a$, and a
$\verb+TAAF+~s_b~s_2'$ such that:
$s_a\class\texttt{cl\_call}$, the \texttt{as\_call\_ident}'s of
the two call states are the same, $s_1 \C s_a$,
$s_a\exec s_b$, $s_1' \L s_b$ and
$s_1' \S s_2'$.
The condition, depicted in \autoref{subfig:cl_call}, says that, to simulate a function call, we can perform a
sequence of silent actions before and after the function call itself.
The old and new call states must be $\C$-related, the old and new
states at the beginning of the function execution must be $\L$-related
and, finally, the two initial and final states must be $\S$-related
as usual.
\begin{condition}[Case \texttt{cl\_return}]
 For all $s_1,s_1',s_2$ s.t. $s_1 \S s_2$,
 $s_1\exec s_1'$ and $s_1 \class \texttt{cl\_return}$, there exists $s_a, s_b, s_2'$, a
$\verb+TAA+~s_2~s_a$, a
$\verb+TAAF+~s_b~s_2'$ called $taaf$ such that:
$s_a\class\texttt{cl\_return}$,
$s_a\exec s_b$,
$s_1' \R s_b$ and
$s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
$taaf$ is non-empty, or $\lnot \ell~s_a$.
Similarly to the call condition, to simulate a return we can perform a
sequence of silent actions before and after the return statement itself,
as depicted in \autoref{subfig:cl_return}.
The old and the new statements after the return must be $\R$-related,
to guarantee that they return to corresponding calls.
The two initial and final states must be $\S$-related
as usual and, moreover, they must exhibit the same labels. Finally, when
the suffix is non-empty we must take care not to insert a new
unmatched cost emission statement just after the return statement.
definition status_simulation ≝
  λS1 : abstract_status.
  λS2 : abstract_status.
  λsim_status_rel : status_rel S1 S2.
    ∀st1,st1',st2.as_execute S1 st1 st1' →
    sim_status_rel st1 st2 →
    match as_classify … st1 with
    [ None ⇒ True
    | Some cl ⇒
      match cl with
      [ cl_call ⇒ ∀prf.
        (*
             st1' ------------S----------\
              ↑ \                         \
             st1 \--L--\                   \
              | \       \                   \
              S  \-C-\  st2_after_call →taa→ st2'
              |       \     ↑
             st2 →taa→ st2_pre_call
        *)
        ∃st2_pre_call.
        as_call_ident ? st2_pre_call = as_call_ident ? («st1, prf») ∧
        call_rel ?? sim_status_rel «st1, prf» st2_pre_call ∧
        ∃st2_after_call,st2'.
        ∃taa2 : trace_any_any … st2 st2_pre_call.
        ∃taa2' : trace_any_any … st2_after_call st2'.
        as_execute … st2_pre_call st2_after_call ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2_after_call
      | cl_return ⇒
        (*
             st1
            / ↓
           | st1'----------S,L------------\
           S   \                           \
            \   \-----R-------\            |
             \                 |           |
             st2 →taa→ st2_ret |           |
                          ↓   /            |
                     st2_after_ret →taaf→ st2'

           we also ask that st2_after_ret be not labelled if the taaf tail is
           not empty
        *)
        ∃st2_ret,st2_after_ret,st2'.
        ∃taa2 : trace_any_any … st2 st2_ret.
        ∃taa2' : trace_any_any_free … st2_after_ret st2'.
        (if taaf_non_empty … taa2' then ¬as_costed … st2_after_ret else True) ∧
        as_classifier … st2_ret cl_return ∧
        as_execute … st2_ret st2_after_ret ∧ sim_status_rel st1' st2' ∧
        ret_rel … sim_status_rel st1' st2_after_ret ∧
        label_rel … st1' st2'
      | cl_other ⇒
          (*
          st1 → st1'
            |      \
            S      S,L
            |        \
           st2 →taaf→ st2'

           the taaf can be empty (e.g. tunneling) but we ask it must not be the
           case when both st1 and st1' are labelled (we would be able to collapse
           labels otherwise)
         *)
        ∃st2'.
        ∃taa2 : trace_any_any_free … st2 st2'.
        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2'
      | cl_jump ⇒
        (* just like cl_other, but with a hypothesis more *)
        as_costed … st1' →
        ∃st2'.
        ∃taa2 : trace_any_any_free … st2 st2'.
        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2'
      ]
    ].
\paragraph{Main result: the 1-to-many forward simulation conditions
are sufficient for trace reconstruction}

Let us assume that a relation set is given such that the 1-to-many
forward simulation conditions are satisfied. Under this assumption we
can prove the following three trace reconstruction theorems by mutual
structural induction over the traces given in input between the
$s_1$ and $s_1'$ states.
In particular, the \texttt{status\_simulation\_produce\_tlr} theorem,
applied to the \texttt{main} function of the program with equal
$s_{2_b}$ and $s_2$ states, shows that, for every initial state in the
source code that induces a structured trace in the source code,
the compiled code produces a similar structured trace.
For every $s_1,s_1',s_{2_b},s_2$ s.t.
there is a $\texttt{TLR}~s_1~s_1'$ called $tlr_1$ and a
$\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
$s_1 \S s_2$, there exists $s_{2_m},s_2'$ s.t.
there is a $\texttt{TLR}~s_{2_b}~s_{2_m}$ called $tlr_2$ and
there is a $\verb+TAAF+~s_{2_m}~s_2'$ called $taaf$
s.t. if $taaf$ is non-empty then $\lnot (\ell~s_{2_m})$,
and $tlr_1\approx tlr_2$
and $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
$s_1' \R s_{2_m}$.
The theorem states that a \texttt{trace\_label\_return} in the source code
together with a precomputed preamble of silent states
(the \verb+TAA+) in the target code induces a
similar \texttt{trace\_label\_return} in the target code which can be
followed by a sequence of silent states. Note that the statement does not
require the produced \texttt{trace\_label\_return} to start with the
precomputed preamble, even if this is likely to be the case in concrete
implementations. The preamble in input is necessary for compositionality, e.g.
because the 1-to-many forward simulation conditions allow, in the
case of function calls, a preamble of silent instructions to be executed just
after the function call.

Clearly similar results are also available for the other two types of structured
traces (in fact, they are all proved simultaneously by mutual induction).
% \begin{theorem}[\texttt{status\_simulation\_produce\_tll}]
% For every $s_1,s_1',s_{2_b},s_2$ s.t.
% there is a $\texttt{TLL}~b~s_1~s_1'$ called $tll_1$ and a
% $\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
% $s_1 \S s_2$, there exists $s_{2_m},s_2'$ s.t.
% \begin{itemize}
%  \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
%        and a trace $\texttt{TLL}~b~s_{2_b}~s_{2_m}$ called $tll_2$
%        and a $\texttt{TAAF}~s_{2_m}~s_2'$ called $taa_2$ s.t.
%        $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
%        $s_1' \R s_{2_m}$ and
%        $tll_1\approx tll_2$ and
%        if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
%  \item else there exists $s_2'$ and a
%        $\texttt{TLL}~b~s_{2_b}~s_2'$ called $tll_2$ such that
%        $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
%        $tll_1\approx tll_2$.
% \end{itemize}
% \end{theorem}

% The statement is similar to the previous one: a source
% \texttt{trace\_label\_label} and a given target preamble of silent states
% in the target code induce a similar \texttt{trace\_label\_label} in the
% target code, possibly followed by a sequence of silent moves that become the
% preamble for the next \texttt{trace\_label\_label} translation.

% \begin{theorem}[\texttt{status\_simulation\_produce\_tal}]
% For every $s_1,s_1',s_2$ s.t.
% there is a $\texttt{TAL}~b~s_1~s_1'$ called $tal_1$ and
% $s_1 \S s_2$
% \begin{itemize}
%  \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
%    and a trace $\texttt{TAL}~b~s_2~s_{2_m}$ called $tal_2$ and a
%    $\texttt{TAAF}~s_{2_m}~s_2'$ called $taa_2$ s.t.
%    $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
%    $s_1' \R s_{2_m}$ and
%    $tal_1 \approx tal_2$ and
%    if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
%  \item else there exists $s_2'$ and a
%    $\texttt{TAL}~b~s_2~s_2'$ called $tal_2$ such that
%    either $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
%        $tal_1\approx tal_2$
%    or $s_1' \mathrel{{\S} \cap {\L}} s_2$ and
%    $\texttt{tal\_collapsable}~tal_1$ and $\lnot \ell~s_1$.
% \end{itemize}
% \end{theorem}

% The statement is also similar to the previous ones, but for the lack of
% the target code preamble.
For every $s_1,s_1',s_2$ s.t.
there is a $\texttt{trace\_label\_return}~s_1~s_1'$ called $tlr_1$ and
$s_1 \mathrel{{\L} \cap {\S}} s_2$,
there exist $s_{2_m},s_2'$ s.t.
there is a $\texttt{trace\_label\_return}~s_2~s_{2_m}$ called $tlr_2$ and
there is a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taaf$
s.t. if $taaf$ is non-empty then $\lnot (\texttt{as\_costed}~s_{2_m})$,
and $\texttt{tlr\_rel}~tlr_1~tlr_2$
and $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
$s_1' \R s_{2_m}$.
status_simulation_produce_tlr S1 S2 R
(* we start from this situation
     st1 →→→→tlr→→→→ st1'
      | \
      L  \---S--\
      |          \
   st2_lab →taa→ st2   (the taa preamble is in general either empty or given
                        by the preceding call)

   and we produce
     st1 →→→→tlr→→→→ st1'
             \\      /  \
             //     R    \-L,S-\
             \\     |           \
   st2_lab →tlr→ st2_mid →taaf→ st2'
*)
  st1 st1' st2_lab st2
  (tlr1 : trace_label_return S1 st1 st1')
  (taa2_pre : trace_any_any S2 st2_lab st2)
  (sim_execute : status_simulation S1 S2 R)
  on tlr1 : R st1 st2 → label_rel … st1 st2_lab →
  ∃st2_mid.∃st2'.
  ∃tlr2 : trace_label_return S2 st2_lab st2_mid.
  ∃taa2 : trace_any_any_free … st2_mid st2'.
  (if taaf_non_empty … taa2 then ¬as_costed … st2_mid else True) ∧
  R st1' st2' ∧ ret_rel … R st1' st2_mid ∧ label_rel … st1' st2' ∧
  tlr_rel … tlr1 tlr2
\section{Conclusions and future work}

The labelling approach is a technique to implement compilers that induce on
the source code a non-uniform cost model determined from the object code
produced. The cost model assigns a cost to each basic block of the program.
The main theorem of the approach states that there is an exact
correspondence between the sequence of basic blocks started in the source
and object code, and that no instruction in the source or object code is
executed outside a basic block. Thus the cost of object code execution
can be computed precisely on the source.

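The cost computation on the source can be sketched as follows; this is a minimal, hypothetical example (the labels, per-block costs, and trace are invented for exposition and are not taken from the formalisation). Since source and object executions start the same sequence of basic blocks, summing the per-block costs measured on the object code along the source-level label trace yields the exact object-code cost.

```python
# Hypothetical per-block costs measured on the object code; each label
# marks the start of a basic block.
object_cost = {"L1": 4, "L2": 7, "L3": 2}

def execution_cost(label_trace, cost_model):
    # Total cost of an execution, reconstructed from the labels it emits.
    return sum(cost_model[label] for label in label_trace)

# The main theorem guarantees that source and object executions emit the
# same label sequence, so this source-level sum equals the object cost.
source_trace = ["L1", "L2", "L2", "L3"]
print(execution_cost(source_trace, object_cost))  # 4 + 7 + 7 + 2 = 20
```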
In this paper we scale the labelling approach to cover a programming language
with function calls. This introduces new difficulties only when the language
is unstructured, i.e. it allows function calls to return anywhere in the code,
destroying the hope of a static prediction of the cost of basic blocks.
We restore static predictability by introducing a new semantics for unstructured
programs that singles out well structured executions. The latter are represented
by structured traces, a generalisation of streams of observables that capture
several structural invariants of the execution, such as the well nesting of
function calls or the fact that every basic block must start with a code
emission statement. We show that structured traces are sufficiently structured
to statically compute a precise cost model on the object code.

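These two invariants can be illustrated by a small sketch (the event encoding is hypothetical, not the paper's formal definition): it checks that calls and returns are well nested and that execution begins, and resumes after every call and return, with a code emission statement.

```python
def is_structured(events):
    # events: tuples ('label', L), ('call', f), ('ret', f) or ('step',).
    stack = []            # pending function calls, for well nesting
    expect_label = True   # a basic block must begin with a label emission
    for event in events:
        kind = event[0]
        if expect_label:
            if kind != "label":
                return False            # block did not start with a label
            expect_label = False
        elif kind == "call":
            stack.append(event[1])
            expect_label = True         # the callee starts a new block
        elif kind == "ret":
            if not stack or stack.pop() != event[1]:
                return False            # return without a matching call
            expect_label = True         # the post-call block starts with a label
    return not stack                    # every call has returned
```

A trace such as label, call, label, return, label is accepted; a return with no pending call, or a trace that does not open with a label emission, is rejected.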
We introduce a similarity relation on structured traces that must hold between
source and target traces. When the relation holds for every program, we prove
that the cost model can be lifted from the object to the source code.

In order to prove that similarity holds, we present a generic proof of forward
simulation that is aimed at separating as much as possible the part of the
simulation related to non-functional properties (preservation of structure)
from that related to functional properties. In particular, we reduce the
problem of preservation of structure to that of showing a 1-to-many
forward simulation that only adds a few additional proof obligations to those
of a traditional proof that covers functional properties only.

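On finite toy transition systems, a 1-to-many forward simulation can be sketched as follows (the states, steps, and relation below are invented for exposition): every source step must be matched by a non-empty sequence of target steps that re-establishes the simulation relation.

```python
# Toy deterministic transition systems (hypothetical states and steps).
source_steps = {"a": "b", "b": "c"}                # source: a → b → c
target_steps = {"A": "B1", "B1": "B2", "B2": "C"}  # target: A → B1 → B2 → C
R = {("a", "A"), ("b", "B2"), ("c", "C")}          # simulation relation

def simulates(s, t, max_steps=10):
    # Check that every source step from s is matched by 1..max_steps
    # target steps from t, landing back in the relation R.
    while s in source_steps:
        s = source_steps[s]
        for _ in range(max_steps):
            if t not in target_steps:
                return False           # target is stuck before matching
            t = target_steps[t]
            if (s, t) in R:
                break                  # source step matched
        else:
            return False               # no match within max_steps steps
    return True

print(simulates("a", "A"))  # True: a→b matched by A→B1→B2, b→c by B2→C
```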
All results presented in the paper are part of a larger certification of a
C compiler which is based on the labelling approach. The certification, done
in Matita, is the main deliverable of the FET-Open Certified Complexity
(CerCo) project.

Short-term future work consists in completing the certification of
the CerCo compiler by exploiting the main theorem of this paper.

\paragraph{Related work.}
CerCo is the first project that explicitly tries to induce a
precise cost model on the source code in order to establish non-functional
properties of programs in a high-level language. Traditional certifications
of compilers, like~\cite{compcert2,piton}, explicitly prove only the
preservation of functional properties.

Usually forward simulations take the following form: for each transition
from $s_1$ to $s_2$ in the source code, there exists an equivalent sequence of
transitions in the target code of length $n$. The number $n$ of transition steps
in the target code can just be the witness of the existential statement.
An equivalent alternative, when the proof of simulation is constructive,
consists in providing an explicit function, called \emph{clock function} in the
literature~\cite{clockfunctions}, that computes $n$ from $s_1$. Every clock
function then constitutes a cost model for the source code, in the spirit of
what we are doing in CerCo. However, we believe our solution to be superior
in the following respects: 1) The machinery of the labelling approach is
insensitive to the resource being measured. Indeed, any cost model computed on
the object code can be lifted to the source code (e.g. stack space used,
energy consumed, etc.). On the contrary, clock functions only talk about
the number of transition steps. In order to extend the approach with clock
functions to other resources, additional functions must be introduced.
Moreover, the additional functions would be handled differently in the proof.
2) The cost models induced by the labelling approach have a simple presentation.
In particular, they associate a number to each basic block. More complex
models can be induced when the approach is scaled to cover, for instance,
loop optimisations~\cite{loopoptimizations}, but the costs are still meant to
be easy to understand and manipulate in an interactive theorem prover or
in Frama-C.
On the contrary, a clock function is a complex function of the state $s_1$
which, as a function, is an opaque object that is difficult to reify as
source code in order to reason about it.

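Point 1 can be illustrated with a small hypothetical sketch (labels and costs invented for exposition): lifting a second resource reuses the same label trace and merely swaps the per-block cost map, whereas the clock-function approach would require a new function, handled separately in the proof, for each resource.

```python
# Hypothetical per-block cost maps for two different resources, both
# measured on the object code.
cycle_cost = {"L1": 4, "L2": 7}    # clock cycles per basic block
stack_cost = {"L1": 16, "L2": 0}   # stack space allocated per basic block

def lifted_cost(label_trace, cost_map):
    # The lifting machinery is identical whatever the resource measured.
    return sum(cost_map[label] for label in label_trace)

trace = ["L1", "L2", "L2"]
print(lifted_cost(trace, cycle_cost))  # 4 + 7 + 7 = 18
print(lifted_cost(trace, stack_cost))  # 16 + 0 + 0 = 16
```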
% \appendix
% \section{Notes for the reviewers}

% The results described in the paper are part of a larger formalization
% (the certification of the CerCo compiler). At the moment of the submission
% we need to single out from the CerCo formalization the results presented here.
% Before the 16-th of February we will submit an attachment that contains the
% minimal subset of the CerCo formalization that allows to prove those results.
% At that time it will also be possible to measure exactly the size of the
% formalization described here. At the moment a rough approximation suggests
% about 2700 lines of Matita code.

% We will also attach the development version of the interactive theorem
% prover Matita that compiles the submitted formalization. Another possibility
% is to backport the development to the last released version of the system
% to avoid having to re-compile Matita from scratch.

% The programming and certification style used in the formalization heavily
% exploits dependent types. Dependent types are used: 1) to impose invariants
% by construction on the data types and operations (e.g. a trace from a state
% $s_1$ to a state $s_2$ can be concatenated to a trace from a state
% $s_2'$ to a state $s_3$ only if $s_2$ is convertible with $s_2'$); 2)
% to state and prove the theorems by using the Russell methodology of
% Matthieu Sozeau\footnote{Subset Coercions in Coq in TYPES'06. Matthieu Sozeau. Thorsten Altenkirch and Conor McBride (Eds). Volume 4502 of Lecture Notes in Computer Science. Springer, 2007, pp.237-252.
% }, better known in the Coq world as ``\texttt{Program}'' and reimplemented in a simpler way in Matita using coercion propagations\footnote{Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, Enrico Tassi: A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions. Logical Methods in Computer Science 8(1) (2012)}. However, no result presented depends
% mandatorily on dependent types: it should be easy to adapt the technique
% and results presented in the paper to HOL.

% Finally, Matita and Coq are based on minor variations of the Calculus of
% (Co)Inductive Constructions. These variations do not affect the CerCo
% formalization. Therefore a porting of the proofs and ideas to Coq would be
% rather straightforward.