% source: Papers/itp-2013/ccexec2.tex @ 3343
% \usepackage{amsmath}
% NB: might be worth removing this if changing class in favour of
% a built-in definition.
  {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
   morekeywords={[2]if,then,else,forall,Prop,match,with,end,let},
  }
  morekeywords={int8_t},
  mathescape
        keywordstyle=\color{red}\bfseries,
        keywordstyle=[2]\color{blue},
        commentstyle=\color{green},
        stringstyle=\color{blue},
        showspaces=false,showstringspaces=false,
        xleftmargin=1em}
\pgfutil@ifundefined{pgf@arrow@code@implies}{% supply for lack of double arrow special arrow tip if it is not there%
  \pgfarrowsdeclare{implies}{implies}%
  {%
  \pgfarrowsleftextend{2.2pt}%
  \pgfarrowsrightextend{2.2pt}%
  }%
  {%
    \pgfsetdash{}{0pt} % do not dash%
    \pgfsetlinewidth{.33pt}%
    \pgfsetroundjoin   % fix join%
    \pgfsetroundcap    % fix cap%
    \pgfpathmoveto{\pgfpoint{-1.5pt}{2.5pt}}%
    \pgfpathcurveto{\pgfpoint{-.75pt}{1.5pt}}{\pgfpoint{0.5pt}{.5pt}}{\pgfpoint{2pt}{0pt}}%
    \pgfpathcurveto{\pgfpoint{0.5pt}{-.5pt}}{\pgfpoint{-.75pt}{-1.5pt}}{\pgfpoint{-1.5pt}{-2.5pt}}%
    \pgfusepathqstroke%
  }%
\tikzset{state/.style={inner sep = 0, outer sep = 2pt, draw, fill},
         every node/.style={inner sep=2pt},
         every on chain/.style = {inner sep = 0, outer sep = 2pt},
         join all/.style = {every on chain/.append style={join}},
         on/.style={on chain={#1}, state},
         m/.style={execute at begin node=$, execute at end node=$},
         node distance=3mm,
         is other/.style={circle, minimum size = 3pt, state},
         other/.style={on, is other},
         is jump/.style={diamond, minimum size = 6pt, state},
         jump/.style={on, is jump},
         is call/.style={regular polygon, regular polygon sides=3, minimum size=5pt, state},
         call/.style={on=going below, is call, node distance=6mm, label=above left:$#1$},
         is ret/.style={regular polygon, regular polygon sides=3, minimum size=5pt, shape border rotate=180, state},
         ret/.style={on=going above, is ret, node distance=6mm},
         chain/.style={start chain=#1 going left},
         rev ar/.style={stealth-, thick},
         ar/.style={-stealth, thick},
         every join/.style={rev ar},
         labelled/.style={fill=white, label=above:$#1$},
         vcenter/.style={baseline={([yshift=-.5ex]current bounding box)}},
         every picture/.style={thick},
         double equal sign distance/.prefix style={double distance=1.5pt}, %% if already defined (newest version of pgf) it should be ignored%}
         implies/.style={double, -implies, thin, double equal sign distance, shorten <=5pt, shorten >=5pt},
         new/.style={densely dashed},
         rel/.style={font=\scriptsize, fill=white, inner sep=2pt},
         diag/.style={row sep={11mm,between origins},
                      column sep={11mm,between origins},
                      every node/.style={draw, is other}},
         small vgap/.style={row sep={7mm,between origins}},
         % for article, maybe temporary
         is jump/.style=is other,
         is call/.style=is other,
         is ret/.style=is other,
\def\L{\mathrel{\mathcal L}}
\def\S{\mathrel{\mathcal S}}
\def\R{\mathrel{\mathcal R}}
\def\C{\mathrel{\mathcal C}}
\savebox{\execbox}{\tikz[baseline=-.5ex]\draw [-stealth] (0,0) -- ++(1em, 0);}
\title{Certification of the Preservation of Structure by a Compiler's Back-end Pass\thanks{The project CerCo acknowledges the financial support of the Future and
Emerging Technologies (FET) programme within the Seventh Framework
Programme for Research of the European Commission, under FET-Open grant
number: 243881}}
\author{Paolo Tranquilli \and Claudio Sacerdoti Coen}
\institute{Department of Computer Science and Engineering, University of Bologna,\\\email{}, \email{}}
The labelling approach is a technique to lift cost models for non-functional
properties of programs from the object code to the source code. It is based
on the preservation of the structure of the high level program in every
intermediate language used by the compiler. Such structure is captured by
observables that are added to the semantics and that need to be preserved
by the forward simulation proof of correctness of the compiler. Additional
special observables are required for function calls. In this paper we
present a generic forward simulation proof that preserves all these observables.
The proof statement is based on a new mechanised semantics that traces the
structure of execution when the language is unstructured. The generic semantics
and simulation proof have been mechanised in the interactive theorem prover
Matita.

The \emph{labelling approach} has been introduced in~\cite{easylabelling} as
a technique to \emph{lift} cost models for non-functional properties of programs
from the object code to the source code. Examples of non-functional properties
are execution time, amount of stack/heap space consumed and energy required for
communication. The basic idea of the approach is that it is impossible to
provide a \emph{uniform} cost model for a high level language that is preserved
\emph{precisely} by a compiler. For instance, two instances of an assignment
$x = y$ in the source code can be compiled very differently according to the
place (registers vs stack) where $x$ and $y$ are stored at the moment of
execution. Therefore a precise cost model must assign a different cost
to every occurrence, and the exact cost can only be known after compilation.

According to the labelling approach, the compiler is free to compile and optimise
the source code without any major restriction, but it must keep track
of what happens to basic blocks during the compilation. The cost model is
then computed on the object code: it assigns a cost to every basic block.
Finally, the compiler propagates the cost model back to the source level,
assigning a cost to each basic block of the source code.

Implementing the labelling approach in a certified compiler
allows one to reason formally on the high level source code of a program to prove
non-functional properties that are guaranteed to be preserved by the compiler
itself. The trusted code base is then reduced to 1) the interactive theorem
prover (or its kernel) used in the certification of the compiler and
2) the software used to certify the property on the source language, which
can itself be certified, further reducing the trusted code base.
In~\cite{easylabelling} the authors provide an example of a simple
certified compiler that implements the labelling approach for the
imperative \texttt{While} language~\cite{while}, which has neither
pointers nor function calls.

The labelling approach has been shown to scale to more interesting scenarios.
In particular, in~\cite{functionallabelling} it has been applied to a functional
language, and in~\cite{loopoptimizations} it has been shown that the approach
can be adapted, at the price of some complication, to handle loop optimisations
and, more generally,
program optimisations that do not preserve the structure of basic blocks.
On-going work also shows that the labelling approach is compatible with
the complex analyses required to obtain a cost model for object code
on processors that implement advanced features like pipelining, superscalar
architectures and caches.

In the European Project CerCo (Certified Complexity~\footnote{\url{}})~\cite{cerco} we are certifying a labelling approach based compiler for a large subset of C to
8051 object code. The compiler is
moderately optimising and implements a compilation chain that is largely
inspired by that of CompCert~\cite{compcert1,compcert2}. Compared to the work done in~\cite{easylabelling}, the main novelty and source of difficulties is the presence
of function calls. Surprisingly, the addition of function calls requires a
substantial revision of the proof technique given in~\cite{easylabelling}. In
particular, at the core of the labelling approach there is a forward
simulation proof that, in the case of \texttt{While}, is only minimally
more complex than the proof required for the preservation of the
functional properties alone. In the case of a programming language with
function calls, instead, it turns out that the forward simulation proof for
the back-end languages must guarantee a whole new set of invariants.

In this paper we present a formalisation in the Matita interactive theorem
prover~\cite{matita1,matita2} of a generic version of the simulation proof required for unstructured
languages. All back-end languages of the CerCo compiler are unstructured
languages, so the proof covers half of the correctness of the compiler.
The statement of the generic proof is based on a new semantics
for imperative unstructured languages that is based on \emph{structured
traces} and that restores the preservation of structure in the observables of
the semantics. The generic proof allows us to almost completely separate the
part of the simulation that deals with functional properties only from the
part that deals with the preservation of structure.

The plan of this paper is the following. In Section~\ref{labelling} we
sketch the labelling method and the problems arising from its application
to languages with function calls. In Section~\ref{semantics} we introduce
a generic description of an unstructured imperative language and the
corresponding structured traces (the novel semantics). In
Section~\ref{simulation} we describe the forward simulation proof.
Conclusions and future work are in Section~\ref{conclusions}.

\section{The labelling approach}

We briefly sketch here a simplified version of the labelling approach as
introduced in~\cite{easylabelling}. The simplification strengthens the
sufficient conditions given in~\cite{easylabelling} to allow a simpler
explanation. The simplified conditions given here are also used in the
CerCo compiler to simplify the proof.

Let $\mathcal{P}$ be a programming language whose semantics is given in
terms of observables: a run of a program yields a finite or infinite
stream of observables. We also assume for the time being that function
calls are not available in $\mathcal{P}$. We want to associate a cost
model to a program $P$ written in $\mathcal{P}$. The first step is to
extend the syntax of $\mathcal{P}$ with a new construct $\texttt{emit L}$,
where $L$ is a label distinct from all observables of $\mathcal{P}$.
The semantics of $\texttt{emit L}$ is the emission of the observable
\texttt{L}, which is meant to signal the beginning of a basic block.

There exists an automatic procedure that injects into the program $P$ an
$\texttt{emit L}$ statement at the beginning of each basic block, using a fresh
\texttt{L} for each block. In particular, the bodies of loops, both branches
of \texttt{if-then-else}s and the targets of \texttt{goto}s must all start
with an emission statement.

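As an illustration only (this is not CerCo code, and the tuple encoding of statements is hypothetical), such an injection procedure can be sketched on a toy block-structured syntax:

```python
# Toy sketch: inject a fresh `emit L` at the start of every basic block
# of a block-structured program. Statements are encoded as tuples whose
# first component is the statement kind (a made-up encoding).

import itertools

_fresh = itertools.count()

def fresh_label():
    return f"L{next(_fresh)}"

def inject_emits(stmt):
    """Recursively prefix loop bodies and both branches of conditionals
    with an emission statement; other statements are left untouched."""
    kind = stmt[0]
    if kind == "if":                      # ("if", cond, then_branch, else_branch)
        _, cond, then_b, else_b = stmt
        return ("if", cond,
                [("emit", fresh_label())] + [inject_emits(s) for s in then_b],
                [("emit", fresh_label())] + [inject_emits(s) for s in else_b])
    if kind == "while":                   # ("while", cond, body)
        _, cond, body = stmt
        return ("while", cond,
                [("emit", fresh_label())] + [inject_emits(s) for s in body])
    return stmt                           # plain statements: unchanged

def label_program(prog):
    # the program entry point is itself the start of a basic block
    return [("emit", fresh_label())] + [inject_emits(s) for s in prog]
```

Every loop body and both branches of each conditional end up starting with a distinct emission, matching the conditions stated above.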
Let now $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
organised in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
language used by the compiler. We can easily extend every
intermediate language (and its semantics) with an $\texttt{emit L}$ statement
as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
the additional difficulty that the syntax of object code is given as a
sequence of bytes. The injection of an emission statement in the object code
can be done using a map that associates the statement with pairs of consecutive
code addresses. The intended semantics is that, if $(pc_1,pc_2) \mapsto \texttt{emit L}$, then the observable \texttt{L} is emitted after the execution of the
instruction stored at $pc_1$ and before the execution of the instruction
stored at $pc_2$. The two program counters are necessary because the
instruction stored at $pc_1$ can have multiple possible successors (e.g.
in case of a conditional branch or an indirect call). Dually, the instruction
stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
target of a jump).

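A minimal sketch of such an emission map, with made-up addresses and labels, shows why pairs of program counters are needed: the label emitted depends on which successor of the instruction at $pc_1$ is actually taken.

```python
# Sketch of the object-code emission map: observables are attached to
# *pairs* of consecutive program counters. Addresses and labels are
# invented for illustration.

emission_map = {(0, 2): "L1",    # fall-through of a conditional branch at 0
                (0, 8): "L2"}    # taken branch of the same instruction

def observables(pcs):
    """Given the sequence of program counters of a run, return the stream
    of labels emitted between consecutive executed instructions."""
    return [emission_map[edge]
            for edge in zip(pcs, pcs[1:])
            if edge in emission_map]
```

The same instruction at address 0 yields the observable \texttt{L1} or \texttt{L2} according to the successor chosen at run time.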
The compiler, to be functionally correct, must preserve observational
equivalence, i.e. executing the program after each compiler pass should
yield the same stream of observables. After the injection of emission
statements, observables now capture both functional and non-functional
behaviour.

This correctness property is called in the literature a forward simulation
and is sufficient for correctness when the target language is
deterministic.

We also require a stronger, non-functional preservation property: after each
pass all basic blocks must start with an emission statement, and all labels
\texttt{L} must be unique.

Now let $M$ be the object code obtained for the program $P$. Let us suppose
that we can statically inspect the code $M$ and associate to each basic block
a cost (e.g. the number of clock cycles required to execute all instructions
in the basic block, or an upper bound on that time). Every basic block is
labelled with a unique label \texttt{L}, thus we can actually associate the
cost to \texttt{L}. Let us call it $k(\texttt{L})$.

The function $k$ is the cost model for the object code control
blocks. It can equally be used as the cost model for the source
control blocks. Indeed, if the semantics of $P$ is the stream
$L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$, because
every instruction belongs to a control block and every control block is
labelled. Thus it is correct to say that the execution cost of $P$ is also
$\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
the blocks of the high level program $P$ that is preserved by compilation.

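The argument can be rendered numerically with a made-up cost model $k$ and a made-up emitted trace; a block entered twice (e.g. by a loop) contributes its cost twice.

```python
# Illustrative only: k maps labels to cycles per block, trace is the
# common stream of observables of both P and M (forward simulation).

k = {"L1": 7, "L2": 3}
trace = ["L1", "L2", "L2"]          # block L2 is entered twice

cost = sum(k[label] for label in trace)
assert cost == 7 + 3 + 3            # Sigma_i k(L_i)
```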
How can the user profit from the high level cost model? Suppose, for instance,
that he wants to prove that the WCET of his program is bounded by $c$. It
is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
a purely functional property of the code. He can therefore use any technique
available to certify functional properties of the source code.
What is suggested in~\cite{easylabelling} is to actually instrument the
source code $P$ by replacing every label emission statement
$\texttt{emit L}$ with the instruction $\texttt{cost += k(L)}$, which increments
a global fresh variable \texttt{cost}. The bound is then proved by establishing
the program invariant $\texttt{cost} \leq c$, which can be done for example
using the Frama-C~\cite{framaC} suite if the source code is some variant of
C.

In order to extend the labelling approach to function calls we make
\verb+CALL f+ emit the observable \verb+f+ and \verb+RET+ emit a distinguished observable.
For example, the following execution history of the program in \autoref{fig:esempio}
$$I_1; \verb+CALL f+; \verb+COND l+; \verb+EMIT $\ell_2$+; I_3; \verb+RET+; I_2; \verb+RET+$$
emits the trace
$$\verb+main+, \verb+f+$$

main: $\!I_1$
      CALL f
      $I_2$
      RET
f: $\!$COND l
   EMIT $\ell_2$
   RET
l: $\!$EMIT $\ell_3$
   $I_3$
   RET

\subsection{Labelling function calls}
We now want to extend the labelling approach to support function calls.
On the high level, \emph{structured} programming language $\mathcal{P}$ there
is not much to change.
When a function is invoked, the current basic block is temporarily exited
and the basic block the function starts with takes control. When the function
returns, the execution of the original basic block is resumed. Thus the only
significant change is that basic blocks can now be nested. Let \texttt{E}
be the label of the external block and \texttt{I} the label of a nested one.
Since the external block starts before the internal one, the semantics observed will be
\texttt{E I} and the cost associated to it on the source language will be
$k(\texttt{E}) + k(\texttt{I})$, i.e. the cost of executing all instructions
in the block \texttt{E} first plus the cost of executing all the instructions in
the block \texttt{I}. However, we know that some instructions in \texttt{E} are
executed after the last instruction in \texttt{I}. This is actually irrelevant
because we are here assuming that costs are additive, so that we can freely
permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non-trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
the function call terminates and yields control back to the basic block
\texttt{E}. If the call diverges, the instrumentation
$\texttt{cost += k(E)}$ executed at the beginning of \texttt{E} is still valid,
but only as an upper bound on the real execution cost: only precision is lost.

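Additivity can be made concrete with made-up per-instruction costs: tag each instruction of the run with the block it belongs to, and the block costs $k(\texttt{E})$, $k(\texttt{I})$ sum to the total cost of the run no matter how the instructions of \texttt{E} and \texttt{I} interleave.

```python
# Each pair is (owning block, instruction cost); the numbers are invented.
# Instructions of E both precede and follow those of I, as in the text.

run = [("E", 3), ("E", 2), ("I", 1), ("I", 1), ("E", 4)]

def k(block):
    """Cost of a block: sum of the costs of its instructions."""
    return sum(c for b, c in run if b == block)

total = sum(c for _, c in run)
assert total == k("E") + k("I")     # permuting addends does not change the sum
```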
Let us now consider what happens when we move down the compilation chain to an
unstructured intermediate or final language. Here unstructured means that
the only control operators are conditional and unconditional jumps, function
calls and returns. Unlike a structured language, though, there is no guarantee
that a function will return control just after the function call point.
The semantics of the return statement, indeed, consists in fetching the
return address from some internal structure (typically the control stack) and
jumping directly to it. The code can freely manipulate the control stack to
make the procedure return to an arbitrary position. Indeed, it is even possible
to break the well-nesting of function calls/returns.

Is it the case that the code produced by a correct compiler must respect the
additional property that every function returns just after its function call
point? The answer is negative and the property is not implied by forward
simulation proofs. For instance, imagine modifying a correct compiler pass
by systematically adding one to the return address on the stack and by
putting a \texttt{NOP} (or any other instruction that takes one byte) after
every function call. The code obtained will be functionally indistinguishable,
and the added instructions will all be dead code.

This lack of structure in the semantics badly interferes with the labelling
approach. The reason is the following: when a basic block labelled with
\texttt{E} contains a function call, it no longer makes any sense to associate
to the label \texttt{E} the sum of the costs of all the instructions in the block.
Indeed, there is no guarantee that the function will return into the block and
that the instructions executed after the return will be the ones
we are paying for in the cost model.

How can we make the labelling approach work in this scenario? We only see two
possible ways. The first one consists in injecting an emission statement after
every function call: basic blocks no longer contain function calls, but are now
terminated by them. This completely solves the problem and allows the compiler
to break the structure of function calls/returns at will. However, the
technique has several drawbacks. First of all, it greatly increases the number
of cost labels that are injected in the source code and that become
instrumentation statements. Thus, when reasoning on the source code to prove
non-functional properties, the user (or the automation tool) will have to handle
larger expressions. Second, the more labels are emitted, the more difficult it
becomes to implement powerful optimisations respecting the code structure.
Indeed, function calls are usually implemented in such a way that most registers
are preserved by the call, so that the static analysis of the block is not
interrupted by the call and an optimisation can involve both the code before
and after the function call. Third, instrumenting the source code may require
unpleasant modifications of it. Take, for example, the code
\texttt{f(g(x));}. We need to inject an emission statement/instrumentation
instruction just after the execution of \texttt{g}. The only way to do that
is to rewrite the code as \texttt{y = g(x); emit L; f(y);} for some fresh
variable \texttt{y}. It is pretty clear how in certain situations the code
obtained would be more obfuscated and thus more difficult to reason on manually.

For the previous reasons, in this paper and in the CerCo project we adopt a
different approach. We do not inject emission statements after every
function call. However, we want to propagate a strong additional invariant in
the forward simulation proof. The invariant is the preservation of the structure
of the original high level code, even if the target language is unstructured.
The structure we want to propagate, which will become clearer in the next
section, comprises 1) the property that every function should return just after
the function call point, which in turn implies well-nesting of function calls;
2) the property that every basic block starts with a code emission statement.

In the original labelling approach of~\cite{easylabelling}, the second property
was guaranteed syntactically, as a property of the generated code.
In our revised approach, instead, we impose the property on the runs:
it is possible to generate code that does not respect the syntactic
property, as long as all possible runs respect it. For instance, dead code will no longer
be required to have all basic blocks correctly labelled. The switch is suggested
by the fact that the first of the two properties --- the one related to
function calls/returns --- can only be defined as a property of runs,
not of the static code. The switch is
beneficial to the proof because the original proof was made of two parts:
the forward simulation proof and the proof that the static property holds.
In our revised approach the latter disappears and only the forward simulation
is kept.

In order to capture the structure in the semantics, so that it is preserved
by a forward simulation argument, we need to make the structure observable
in the semantics. This is the topic of the next section.

\section{Structured traces}

The program semantics adopted in the traditional labelling approach is based
on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
a set of states $\mathcal{S}$, the semantics of one deterministic execution
step is
defined as a function $S \to S \times O^*$ where $O^*$ is a (finite) stream of
observables. The semantics is then lifted compositionally to multiple (finite
or infinite) execution steps.
Finally, the semantics of a whole program execution is obtained by forgetting
about the final state (if any), yielding a function $S \to O^*$ that given an
initial state returns the finite or infinite stream of observables in output.

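The traditional semantics can be shown in miniature: a deterministic step function of type $S \to S \times O^*$, iterated and flattened into a stream of observables. The state (a bare counter) and the emissions are of course toy choices of ours.

```python
# A deterministic labelled small-step semantics in the traditional style.

def step(s):
    """One execution step: return the next state and the (finite) list
    of observables this step emits."""
    obs = [f"L{s}"] if s % 2 == 0 else []    # even states emit a label
    return s + 1, obs

def semantics(s, fuel):
    """Lift `step` to `fuel` steps and forget the final state, yielding
    the stream of observables of the (truncated) run."""
    out = []
    for _ in range(fuel):
        s, obs = step(s)
        out.extend(obs)
    return out
```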
We present here a new definition of semantics where the structure of execution,
as defined in the previous section, is now observable. The idea is to replace
the stream of observables with a structured data type that makes explicit
function calls and returns and that guarantees some additional invariants by
construction. The data structure, called \emph{structured traces}, is
defined inductively for terminating programs and coinductively for diverging
ones. In the paper we focus only on the inductive structure, i.e. we assume
that all programs that are given a semantics are total. The Matita formalisation
also contains the coinductive definitions. The semantics of a program is then
defined as a function that maps an initial state to a structured trace.

In order to have a definition that works on multiple intermediate languages,
we abstract the type of structured traces over an abstract data type of
abstract statuses, which we aptly call $\texttt{abstract\_status}$. The fields
of this record are the following.
 \item \verb+S : Type[0]+, the type of states.
 \item \verb+as_execute : S $\to$ S $\to$ Prop+, a binary predicate modelling
 an execution step. We write $s_1\exec s_2$ for $\verb+as_execute+~s_1~s_2$.
 \item \verb+as_classifier : S $\to$ classification+, a function tagging all
 states with a class in
 $\{\texttt{cl\_return,cl\_jump,cl\_call,cl\_other}\}$, depending on the instruction
 that is about to be executed (we omit tail-calls for simplicity). We will
 use $s \class c$ as a shorthand for both $\texttt{as\_classifier}~s=c$
 (if $c$ is a classification) and $\texttt{as\_classifier}~s\in c$
 (if $c$ is a set of classifications).
 \item \verb+as_label : S $\to$ option label+, telling whether the
 next instruction to be executed in $s$ is a cost emission statement,
 and if so returning the associated cost label. Our shorthand for this function
 will be $\ell$, and we will also abuse the notation by using $\ell~s$ as a
 predicate stating that $s$ is labelled.
 \item \verb+as_call_ident : ($\Sigma$s:S. s $\class$ cl_call) $\to$ label+,
 giving the identifier of the function being called in a
 \verb+cl_call+ state. We will use the shorthand $s\uparrow f$ for
 $\verb+as_call_ident+~s = f$.
 \item \verb+as_after_return : ($\Sigma$s:S. s $\class$ cl_call) $\to$ S $\to$ Prop+,
 which holds on a \verb+cl_call+ state $s_1$ and a state $s_2$ when the
 instruction to be executed in $s_2$ follows the function call to be
 executed in (the witness of the $\Sigma$-type) $s_1$. We will use the notation
 $s_1\ar s_2$ for this relation.
% \begin{alltt}
% record abstract_status := \{ S: Type[0];
%  as_execute: S \(\to\) S \(\to\) Prop;   as_classifier: S \(\to\) classification;
%  as_label: S \(\to\) option label;    as_called: (\(\Sigma\)s:S. c s = cl_call) \(\to\) label;
%  as_after_return: (\(\Sigma\)s:S. c s = cl_call) \(\to\) S \(\to\) Prop \}
% \end{alltt}

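An executable rendering of the record (ours, for intuition only) makes each field a callable over a toy state type, here an index into a fixed instruction list classified by its opcode:

```python
# Purely illustrative rendering of abstract_status; every name below is
# our invention except the field names it mirrors (as_execute, etc.).

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class AbstractStatus:
    execute: Callable[[int, int], bool]        # as_execute
    classify: Callable[[int], str]             # as_classifier
    label: Callable[[int], Optional[str]]      # as_label
    call_ident: Callable[[int], str]           # as_call_ident
    after_return: Callable[[int, int], bool]   # as_after_return

# Toy program: emit L1; call f; some other instruction; return.
prog = [("emit", "L1"), ("call", "f"), ("other", None), ("return", None)]

toy = AbstractStatus(
    execute=lambda s1, s2: s2 == s1 + 1,       # straight-line steps only
    classify=lambda s: {"call": "cl_call",
                        "return": "cl_return"}.get(prog[s][0], "cl_other"),
    label=lambda s: prog[s][1] if prog[s][0] == "emit" else None,
    call_ident=lambda s: prog[s][1],           # meaningful on cl_call only
    after_return=lambda s1, s2: s2 == s1 + 1,  # return lands just after call
)
```

Note that in the formalisation the last two fields are defined only on states classified \verb+cl_call+ (the $\Sigma$-type); the Python sketch leaves that precondition implicit.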
The inductive type for structured traces is actually made of three mutually
inductive types with the following semantics:
 \item $(\texttt{trace\_label\_return}~s_1~s_2)$ (shorthand $\verb+TLR+~s_1~s_2$)
   is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_1$ is a label emission
   statement and the one to be executed in the state before $s_2$ is a return
   statement. Thus $s_2$ is the state after the return. The trace
   may contain other label emission statements. It captures the structure of
   the execution of function bodies: they must start with a cost emission
   statement and must end with a return; they are obtained by concatenating
   one or more basic blocks, all starting with a label emission
   (e.g. in case of loops).
 \item $(\texttt{trace\_any\_label}~b~s_1~s_2)$ (shorthand $\verb+TAL+~b~s_1~s_2$)
   is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_2$/in the state before
   $s_2$ is either a label emission statement
   or a return, according to the boolean $b$. It must not contain
   any label emission statement. It captures the notion of a suffix of a
   basic block.
 \item $(\texttt{trace\_label\_label}~b~s_1~s_2)$ (shorthand $\verb+TLL+~b~s_1~s_2$) is the special case of
   $\texttt{TAL}~b~s_1~s_2$ such that the instruction to be
   executed in $s_1$ is a label emission statement. It captures the notion of
   a basic block.
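To make the intent of \texttt{TAL} concrete, here is a simplified, executable check of its invariants, restricted for brevity to runs without function calls (a call would require validating a nested \texttt{TLR} subtrace); the encoding of states and classifications is ours, not the formalisation's.

```python
# Check that `states` (s1 included, up to s2 excluded, plus s2 itself as
# the last element) forms a valid call-free trace_any_label with flag b.
# classify/labelled play the role of as_classifier and as_label.

def is_tal(states, ends_with_ret, classify, labelled):
    if len(states) < 2:
        return False                 # the trace is never empty
    *body, last = states
    if any(labelled(s) for s in body[1:]):
        return False                 # no cost emission inside the trace
    if any(classify(s) != "cl_other" for s in body[:-1]):
        return False                 # only the final step may jump/return
    pre = body[-1]
    if ends_with_ret:
        return classify(pre) == "cl_return"
    # otherwise the last step is a jump or plain step into a labelled state
    return classify(pre) in ("cl_jump", "cl_other") and labelled(last)
```

For instance, a run whose last step is a jump is accepted only if the jump target is labelled, mirroring the rule that a jump ends a basic block.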
 {\texttt{TLL}~true~s_1~s_2}
 {\texttt{TLR}~s_1~s_2}

 {\texttt{TLL}~false~s_1~s_2 \andalso
  \texttt{TLR}~s_2~s_3
 }
 {\texttt{TLR}~s_1~s_3}

 {\texttt{TAL}~b~s_1~s_2 \andalso
  \ell~s_1
 }
 {\texttt{TLL}~b~s_1~s_2}

 {s_1\exec s_2 \andalso
  s_1\class\{\verb+cl_jump+, \verb+cl_other+\}\andalso
  \ell~s_2
 }
 {\texttt{TAL}~false~s_1~s_2}

 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_return}
 }
 {\texttt{TAL}~true~s_1~s_2}

 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_call} \andalso
  s_1\ar s_3 \andalso
  \texttt{TLR}~s_2~s_3 \andalso
  \ell~s_3
 }
 {\texttt{TAL}~false~s_1~s_3}

 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_call} \andalso
  s_1\ar s_3 \andalso
  \texttt{TLR}~s_2~s_3 \andalso
  \texttt{TAL}~b~s_3~s_4
 }
 {\texttt{TAL}~b~s_1~s_4}

 {s_1\exec s_2 \andalso
  \lnot \ell~s_2 \andalso
  \texttt{TAL}~b~s_2~s_3\andalso
  s_1 \class \texttt{cl\_other}
 }
 {\texttt{TAL}~b~s_1~s_3}
inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
  | tlr_base:
      ∀status_before: S.
      ∀status_after: S.
        trace_label_label S ends_with_ret status_before status_after →
        trace_label_return S status_before status_after
  | tlr_step:
      ∀status_initial: S.
      ∀status_labelled: S.
      ∀status_final: S.
        trace_label_label S doesnt_end_with_ret status_initial status_labelled →
        trace_label_return S status_labelled status_final →
          trace_label_return S status_initial status_final
with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
  | tll_base:
      ∀ends_flag: trace_ends_with_ret.
      ∀start_status: S.
      ∀end_status: S.
        trace_any_label S ends_flag start_status end_status →
        as_costed S start_status →
          trace_label_label S ends_flag start_status end_status
with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
  (* Single steps within a function which reach a label.
     Note that this is the only case applicable for a jump. *)
  | tal_base_not_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        (as_classifier S start_status cl_jump ∨
         as_classifier S start_status cl_other) →
        as_costed S final_status →
          trace_any_label S doesnt_end_with_ret start_status final_status
  | tal_base_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        as_classifier S start_status cl_return →
          trace_any_label S ends_with_ret start_status final_status
  (* A call followed by a label on return. *)
  | tal_base_call:
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_final →
          trace_label_return S status_start_fun_call status_final →
          as_costed S status_final →
            trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
  (* A call followed by a non-empty trace. *)
  | tal_step_call:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_after_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_after_fun_call →
          trace_label_return S status_start_fun_call status_after_fun_call →
          ¬ as_costed S status_after_fun_call →
          trace_any_label S end_flag status_after_fun_call status_final →
            trace_any_label S end_flag status_pre_fun_call status_final
  | tal_step_default:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre: S.
      ∀status_init: S.
      ∀status_end: S.
        as_execute S status_pre status_init →
        trace_any_label S end_flag status_init status_end →
        as_classifier S status_pre cl_other →
        ¬ (as_costed S status_init) →
          trace_any_label S end_flag status_pre status_end.
A \texttt{trace\_label\_return} is isomorphic to a list of
\texttt{trace\_label\_label}s that ends with a cost emission followed by a
return-terminated \texttt{trace\_label\_label}.
The interesting cases are those of $\texttt{trace\_any\_label}~b~s_1~s_2$.
A \texttt{trace\_any\_label} is a sequence of steps built by a syntax-directed
definition on the classification of $s_1$. The constructors of the datatype
impose several invariants that are meant to impose structure on the
otherwise unstructured execution. In particular, the following invariants are
respected:
\begin{itemize}
 \item the trace is never empty; it ends with a return iff $b$ is
       \texttt{ends\_with\_ret}
 \item a jump must always be the last instruction of the trace, and it must
       be followed by a cost emission statement; i.e.\ the target of a jump
       is always the beginning of a new basic block and, as such, it must start
       with a cost emission statement
 \item a cost emission statement can never occur inside the trace, only in
       the status immediately after
 \item the trace for a function call step is made of a subtrace for the
       function body of type
       $\texttt{trace\_label\_return}~s_1~s_2$, possibly followed by the
       rest of the trace for this basic block. The subtrace represents the
       function execution. Being an inductive datum, it guarantees that the
       function call terminates. The status $s_2$ is the one that follows the
       return statement, and the instruction to be executed in $s_2$ must be
       the one that follows the function call instruction. As a consequence,
       function calls are also well nested.
\end{itemize}
There are three mutually recursive structural functions, one for each of
\verb+TLR+, \verb+TLL+ and \verb+TAL+, for which we use the same notation
$|\,.\,|$: the \emph{flattening} of the traces. These functions
extract from a structured trace the list of emitted cost labels.
%  We only show here the type of one
% of them:
% \begin{alltt}
% flatten_trace_label_return:
%  \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
%   trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
% \end{alltt}
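To make the flattening concrete, here is a toy Python model (ours, not the Matita definitions) in which a \verb+TLR+ is a non-empty list of \verb+TLL+'s, a \verb+TLL+ pairs the label of its costed initial state with a \verb+TAL+, and a \verb+TAL+ is a list of steps whose call steps carry the nested \verb+TLR+ of the function body:

```python
# Toy model of the flattening functions |.| (not the Matita definitions).
# TLR: non-empty list of TLLs; TLL: (label, TAL); TAL: list of steps,
# where a ("call", tlr) step carries the TLR of the function body.

def flatten_tal(tal):
    labels = []
    for kind, payload in tal:
        if kind == "call":            # recurse into the function body
            labels += flatten_tlr(payload)
        # "other", "jump" and "return" steps emit no label themselves
    return labels

def flatten_tll(tll):
    label, tal = tll
    return [label] + flatten_tal(tal)

def flatten_tlr(tlr):
    return [lab for tll in tlr for lab in flatten_tll(tll)]

# A run that emits L1, calls a function whose body emits L2, then returns:
trace = [("L1", [("call", [("L2", [("return", None)])]), ("return", None)])]
print(flatten_tlr(trace))  # ['L1', 'L2']
```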
\paragraph{Cost prediction on structured traces.}

The first main theorem of CerCo about traces
(theorem \texttt{compute\_max\_trace\_label\_return\_cost\_ok\_with\_trace})
holds for the instantiation
of the structured traces to the concrete status of object code programs.
Simplifying a bit, it states that
\begin{equation}
\forall s_1,s_2. \forall \tau: \texttt{TLR}~s_1~s_2.~
  \texttt{clock}~s_2 = \texttt{clock}~s_1 +
  \Sigma_{\alpha \in |\tau|}\;k(\alpha)
\label{th1}
\end{equation}
where the cost model $k$ is statically computed from the object code
by associating to each label $\alpha$ the sum of the costs of the instructions
in the basic block that starts at $\alpha$ and ends before the next labelled
instruction. The theorem is proved by structural induction over the structured
trace, and is based on the invariant that
if the function that computes the cost model has analysed the instruction
to be executed at $s_2$ after the one to be executed at $s_1$, and if
the structured trace starts with $s_1$, then it will eventually also contain
$s_2$. When $s_1$ is not a function call, the result holds trivially because
of the $s_1\exec s_2$ condition obtained by inversion on
the trace. The only non-trivial case is that of function calls: the cost model
computation function recurses on the first instruction that follows the function
call; the \texttt{as\_after\_return} condition of the \texttt{tal\_base\_call} and
\texttt{tal\_step\_call} constructors guarantees exactly that the execution will
eventually reach this state.
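The statement can be mimicked numerically with a toy Python computation (ours, purely illustrative; labels and costs are invented):

```python
# Toy illustration of the cost-prediction theorem (not CerCo code):
# k maps each label to the summed cost of the instructions in its basic block.
k = {"L1": 3, "L2": 5}

# |tau|: the labels emitted along one trace_label_return run
flat = ["L1", "L2", "L1"]

clock_s1 = 100
# executing the trace advances the clock by k(alpha) for each traversed block
clock_s2 = clock_s1 + sum(k[a] for a in flat)

# clock s2 - clock s1 = sum over alpha in |tau| of k(alpha)
print(clock_s2 - clock_s1)  # 11
```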
\paragraph{Structured traces similarity and cost prediction invariance.}

A compiler pass maps source code to object code and initial states to initial
states. The source code and initial state uniquely determine the structured
trace of a program, if it exists. The structured trace fails to exist iff
the structural conditions are violated by the program execution (e.g.\ a function
body does not start with a cost emission statement). Let us assume that the
target structured trace exists.
What is the relation between the source and target structured traces?
In general, the two traces can be arbitrarily different. However, we are
interested only in those compiler passes that map a trace $\tau_1$ to a trace
$\tau_2$ such that
\begin{equation}|\tau_1| = |\tau_2|.\label{th2}\end{equation}
The reason is that the combination of~\eqref{th1} with~\eqref{th2} yields the
corollary
\begin{displaymath}
\forall s_1,s_2. \forall \tau_2: \texttt{TLR}~s_1~s_2.~
  \texttt{clock}~s_2 - \texttt{clock}~s_1 =
  \Sigma_{\alpha \in |\tau_2|}\;k(\alpha) =
  \Sigma_{\alpha \in |\tau_1|}\;k(\alpha).
\end{displaymath}
This corollary states that the actual execution time of the program can be
computed equally well on the source or on the target language. Thus it becomes
possible to transfer the cost model from the target to the source code and to
reason on the source code only.
We are therefore interested in conditions stronger than~\eqref{th2}.
To this end, we introduce a similarity relation between traces with
the same structure. Theorem~\texttt{tlr\_rel\_to\_traces\_same\_flatten}
in the Matita formalisation shows that~\eqref{th2} holds for every pair
$(\tau_1,\tau_2)$ of similar traces.

Intuitively, two traces are similar when one can be obtained from
the other by erasing or inserting silent steps, i.e.\ states that are
not \texttt{as\_costed} and that are classified as \texttt{cl\_other}.
Silent steps do not alter the structure of the traces.
In particular,
the relation maps function calls to calls of the same function,
label emission statements to emissions of the same label, concatenations of
subtraces to concatenations of subtraces of the same length starting with
the same emission statement, and so on.
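As a rough Python illustration (ours, far simpler than the formal mutual definitions), if a trace is viewed as a list of events in which silent steps are explicitly marked, similarity implies that erasing the silent steps leaves the same observable skeleton:

```python
# Toy approximation of trace similarity (not the formal tlr/tll/tal relations):
# a "silent" event stands for a state that is not costed and classified cl_other.
def observable(trace):
    return [e for e in trace if e[0] != "silent"]

def similar(t1, t2):
    # inserting or erasing silent steps preserves the observable events
    return observable(t1) == observable(t2)

t1 = [("emit", "L1"), ("call", "f"), ("emit", "L2")]
t2 = [("silent",), ("emit", "L1"), ("call", "f"), ("silent",), ("emit", "L2")]

print(similar(t1, t2))                # True
print(similar(t1, [("emit", "L1")]))  # False
```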
In the formalisation the three similarity relations --- one for each trace
kind --- are defined by structural recursion on the first trace and pattern
matching over the second. Here we turn
the definition into the inference rules shown in \autoref{fig:txx_rel}
for the sake of readability. We also omit from the trace constructors all
arguments except those that are traces or that
are used in the premises of the rules. By abuse of notation we denote all three
relations by the infix symbol $\approx$.
 {tll_1\approx tll_2
 }
 {\texttt{tlr\_base}~tll_1 \approx \texttt{tlr\_base}~tll_2}
 {tll_1 \approx tll_2 \andalso
  tlr_1 \approx tlr_2
 }
 {\texttt{tlr\_step}~tll_1~tlr_1 \approx \texttt{tlr\_step}~tll_2~tlr_2}
 {\ell~s_1 = \ell~s_2 \andalso
  tal_1\approx tal_2
 }
 {\texttt{tll\_base}~s_1~tal_1 \approx \texttt{tll\_base}~s_2~tal_2}
 {tal_1\approx tal_2
 }
 {\texttt{tal\_step\_default}~tal_1 \approx tal_2}
 {}
 {\texttt{tal\_base\_not\_return}\approx taa \append \texttt{tal\_base\_not\_return}}
 {}
 {\texttt{tal\_base\_return}\approx taa \append \texttt{tal\_base\_return}}
 {tlr_1\approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f
 }
 {\texttt{tal\_base\_call}~s_1~tlr_1\approx taa \append \texttt{tal\_base\_call}~s_2~tlr_2}
 {tlr_1\approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f \andalso
  \texttt{tal\_collapsable}~tal_2
 }
 {\texttt{tal\_base\_call}~s_1~tlr_1 \approx taa \append \texttt{tal\_step\_call}~s_2~tlr_2~tal_2}
 {tlr_1\approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f \andalso
  \texttt{tal\_collapsable}~tal_1
 }
 {\texttt{tal\_step\_call}~s_1~tlr_1~tal_1 \approx taa \append \texttt{tal\_base\_call}~s_2~tlr_2}
 {tlr_1 \approx tlr_2 \andalso
  s_1 \uparrow f \andalso s_2\uparrow f \andalso
  tal_1 \approx tal_2
 }
 {\texttt{tal\_step\_call}~s_1~tlr_1~tal_1 \approx taa \append \texttt{tal\_step\_call}~s_2~tlr_2~tal_2}
\caption{The inference rules for the relation $\approx$.}
let rec tlr_rel S1 st1 st1' S2 st2 st2'
  (tlr1 : trace_label_return S1 st1 st1')
  (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
match tlr1 with
  [ tlr_base st1 st1' tll1 ⇒
    match tlr2 with
    [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
    | _ ⇒ False
    ]
  | tlr_step st1 st1' st1'' tll1 tl1 ⇒
    match tlr2 with
    [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
      tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
    | _ ⇒ False
    ]
  ]
and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
 (tll1 : trace_label_label S1 fl1 st1 st1')
 (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
  match tll1 with
  [ tll_base fl1 st1 st1' tal1 H ⇒
    match tll2 with
    [ tll_base fl2 st2 st2' tal2 G ⇒
      as_label_safe … («?, H») = as_label_safe … («?, G») ∧
      tal_rel … tal1 tal2
    ]
  ]
and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
 (tal1 : trace_any_label S1 fl1 st1 st1')
 (tal2 : trace_any_label S2 fl2 st2 st2')
   on tal1 : Prop ≝
  match tal1 with
  [ tal_base_not_return st1 st1' _ _ _ ⇒
    fl2 = doesnt_end_with_ret ∧
    ∃st2mid,taa,H,G,K.
    tal2 ≃ taa_append_tal ? st2 ??? taa
      (tal_base_not_return ? st2mid st2' H G K)
  | tal_base_return st1 st1' _ _ ⇒
    fl2 = ends_with_ret ∧
    ∃st2mid,taa,H,G.
    tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
      (tal_base_return ? st2mid st2' H G)
  | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
    fl2 = doesnt_end_with_ret ∧
    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    (* we must allow a tal_base_call to be similar to a call followed
      by a collapsable trace (trace_any_any followed by a base_not_return;
      we cannot use trace_any_any as it disallows labels in the end as soon
      as it is non-empty) *)
    (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
      tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
      tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
  | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
    ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
      tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
      tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
    ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
      tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
      tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
  | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
    tal_rel … tl1 tal2 (* <- this makes it many to many *)
  ].
In the preceding rules, a $taa$ is an inhabitant of the type
$\texttt{trace\_any\_any}~s_1~s_2$ (shorthand $\texttt{TAA}~s_1~s_2$),
an inductive data type whose definition
is omitted here for lack of space. It is the type of valid
prefixes (even empty ones) of \texttt{TAL}'s that do not contain
any function call. Therefore it
is possible to concatenate (using ``$\append$'') a \texttt{TAA} to the
left of a \texttt{TAL}. A \texttt{TAA} captures
a sequence of silent moves.
The \texttt{tal\_collapsable} unary predicate over \texttt{TAL}'s
holds when the argument contains no function call and ends
with a label (not a return). The intuition is that after a function call we
can still perform a sequence of silent actions while remaining similar.

As should be expected, even though the rules are asymmetric, $\approx$ is in fact
an equivalence relation.
\section{Forward simulation}

We summarise here the results of the previous sections. Each intermediate
unstructured language can be given a semantics based on structured traces,
which single out those runs that respect a certain number of invariants.
A cost model can be computed on the object code and used to predict
the execution costs of runs that produce structured traces. The cost model
can be lifted from the target to the source code of a pass if the pass maps
structured traces to similar structured traces. The latter property is called
a \emph{forward simulation}.
As with labelled transition systems, in order to establish the forward
simulation we are interested in (preservation of observables), we are
forced to prove a stronger notion of forward simulation that introduces
an explicit relation between states. The classical notion of a 1-to-many
forward simulation is the existence of a relation $\S$ over states such that
if $s_1 \S s_2$ and $s_1 \to^1 s_1'$ then there exists an $s_2'$ such that
$s_2 \to^* s_2'$ and $s_1' \S s_2'$. In our context, we need to replace the
one-step and multi-step transition relations $\to^n$ with the existence of
a structured trace between the two states, and we need to add the requirement
that the two structured traces are similar. Thus what we would like to state is
something like:\\
for all $s_1,s_2,s_1'$ such that there is a $\tau_1$ from
$s_1$ to $s_1'$ and $s_1 \S s_2$, there exist an $s_2'$ such that
$s_1' \S s_2'$ and a $\tau_2$ from $s_2$ to $s_2'$ such that
$\tau_1$ is similar to $\tau_2$. We call this particular form of forward
simulation \emph{trace reconstruction}.
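In symbols (writing \texttt{TLR} for the traces, as in the eventual theorems), this naive statement would read:
\begin{displaymath}
\forall s_1,s_1',s_2.~ \forall \tau_1 : \texttt{TLR}~s_1~s_1'.~
s_1 \S s_2 \to \exists s_2'.~ \exists \tau_2 : \texttt{TLR}~s_2~s_2'.~
s_1' \S s_2' \wedge \tau_1 \approx \tau_2.
\end{displaymath}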
The statement just introduced, however, is too simplistic and not provable
in the general case. To understand why, consider the case of a function call
and the pass that fixes the parameter passing conventions. A function
call in the source code takes as input an arbitrary number of pseudo-registers
(the actual parameters to pass) and returns an arbitrary number of
pseudo-registers (where the result is stored). A function call in the target
language has neither input nor output parameters. The pass must add explicit
code before and after the function call to move the contents of the
pseudo-registers from/to the hardware registers or the stack in order to
implement the parameter passing strategy.
Similarly, each function body must be augmented with a preamble and a postamble
to complete/initiate the parameter passing strategy for the call/return phase.
Therefore what used to be a call followed by the next instruction to execute
after the function return now becomes a sequence of instructions, followed by
a call, followed by another sequence. The two states at the beginning of the
first sequence and at the end of the second sequence are in relation with
the status before/after the call in the source code, as in a usual forward
simulation. How can we prove, however, the additional condition for function
calls requiring that, when the function returns, execution resumes at the
instruction immediately after the function call? To guarantee this invariant,
there must be another relation between the addresses of the function call in
the source and in the target code.
This additional relation is used in particular to relate the two stacks.

Another example is given by the preservation of code emission statements. A
single code emission instruction can be simulated by a sequence of steps,
followed by a code emission, followed by another sequence. Clearly the initial
and final statuses of the sequence are to be in relation with the status
before/after the code emission in the source code. In order to preserve the
structured trace invariants, however, we must consider a second relation
between states that tracks the preservation of the code emission statement.
Therefore we now introduce an abstract notion of relation set between abstract
statuses and an abstract notion of 1-to-many forward simulation conditions.
These two definitions enjoy the following remarkable properties:
\begin{enumerate}
 \item they are generic enough to accommodate all passes of the CerCo compiler
 \item the conjunction of the 1-to-many forward simulation conditions is
       only slightly stricter than the statement of a 1-to-many forward
       simulation in the classical case. In particular, the conditions only
       require the construction of very simple forms of structured traces made
       of silent states only.
 \item they allow us to prove the main result of the paper: the 1-to-many
       forward simulation conditions are sufficient to prove the trace
       reconstruction theorem
\end{enumerate}

Point 3 is the important one. First of all it means that we have reduced
the complex problem of trace reconstruction to a much simpler one that,
moreover, can be solved with slight adaptations of the forward simulation proof
that is performed for a compiler that only cares about functional properties.
Therefore we have successfully separated as much as possible the proof of
preservation of functional properties from that of non-functional ones.
Secondly, combined with the results in the previous section, it implies
that the cost model can be computed on the object code and lifted to the
source code to reason on non-functional properties, assuming that
the 1-to-many forward simulation conditions are fulfilled for every
compiler pass.
\paragraph{Relation sets.}

We now introduce the four relations $\mathcal{S,C,L,R}$ between abstract
statuses that are used to correlate the corresponding statuses before and
after a compiler pass. The first two are abstract and must be instantiated
by every pass. The remaining two are derived relations.

The $\S$ relation between states is the classical relation used
in forward simulation proofs. It correlates the data of the status
(e.g.\ registers, memory, etc.).

The $\C$ relation correlates call states. It allows us to track the
position in the target code of every call in the source code.

The $\L$ relation simply says that the two states are both label
emitting states that emit the same label, \emph{i.e.}\ $s_1\L s_2\iffdef \ell~s_1=\ell~s_2$.
It allows us to track the position in
the target code of every cost emitting statement in the source code.

Finally, the $\R$ relation is the most complex one. Two states
$s_1$ and $s_2$ are $\R$-correlated if, whenever $s_1$ is the
successor of a call state $s_1'$ that is $\C$-related to a call state
$s_2'$ in the target code, $s_2$ is the successor of $s_2'$. Formally:
$$s_1\R s_2 \iffdef \forall s_1',s_2'.s_1'\C s_2' \to s_1'\ar s_1 \to s_2' \ar s_2.$$
We will require all pairs of states that follow a related call to be
$\R$-related. This is the fundamental requirement guaranteeing
that the target trace is well structured, \emph{i.e.}\ that calls are well
nested and return where they are supposed to.
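A toy Python rendition (ours, not the formalisation) of this definition may help: given the set of $\C$-related pairs of call states and the successor of each call, $s_1 \mathrel{\R} s_2$ checks that every $\C$-related pair whose source call is followed by $s_1$ has its target call followed by $s_2$:

```python
# Toy check of the R relation (not the Matita definition).
C = {("c1", "k1")}                  # C-related call states (source, target)
succ = {"c1": "r1", "k1": "r2"}     # state following each call instruction

def R(s1, s2):
    # s1 R s2 iff forall (a, b) in C: succ(a) = s1 implies succ(b) = s2
    return all(succ[b] == s2 for (a, b) in C if succ[a] == s1)

print(R("r1", "r2"))     # True
print(R("r1", "other"))  # False
```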
% \begin{alltt}
% record status_rel (S1,S2 : abstract_status) : Type[1] := \{
%   \(\S\): S1 \(\to\) S2 \(\to\) Prop;
%   \(\C\): (\(\Sigma\)s.as_classifier S1 s cl_call) \(\to\)
%      (\(\Sigma\)s.as_classifier S2 s cl_call) \(\to\) Prop \}.
% definition \(\L\) S1 S2 st1 st2 := as_label S1 st1 = as_label S2 st2.
% definition \(\R\) S1 S2 (R: status_rel S1 S2) s1_ret s2_ret ≝
%  \(\forall\)s1_pre,s2_pre.
%   as_after_return s1_pre s1_ret \(\to\) s1_pre \(\R\) s2_pre \(\to\)
%    as_after_return s2_pre s2_ret.
% \end{alltt}
% \begin{subfigure}{.475\linewidth}
% \centering
% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
%                             every label/.style=overlay, node distance=10mm]
%     \matrix [diag] (m) {%
%          \node (s1) [is jump] {}; & \node [fill=white] (t1) {};\\
%          \node (s2) {}; & \node (t2) {}; \\
%     };
%     \node [above=0 of t1, overlay] {$\alpha$};
%     {[-stealth]
%     \draw (s1) -- (t1);
%     \draw [new] (s2) -- node [above] {$*$} (t2);
%     }
%     \draw (s1) to node [rel] {$\S$} (s2);
%     \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
% \end{tikzpicture}
% \caption{The \texttt{cl\_jump} case.}
% \label{subfig:cl_jump}
% \end{subfigure}
% &
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag] (m) {%
         \node (s1) {}; & \node (t1) {};\\
         \node (s2) {}; & \node (t2) {}; \\
    };
    {[-stealth]
    \draw (s1) -- (t1);
    \draw [new] (s2) -- node [above] {$*$} (t2);
    }
    \draw (s1) to node [rel] {$\S$} (s2);
    \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
\end{tikzpicture}
\caption{The \texttt{cl\_other} and \texttt{cl\_jump} cases.}
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag, small vgap] (m) {%
        \node (t1) {}; \\
         \node (s1) [is call] {}; \\
         & \node (l) {}; & \node (t2) {};\\
         \node (s2) {}; & \node (c) [is call] {};\\
    };
    {[-stealth]
    \draw (s1) -- node [left] {$f$} (t1);
    \draw [new] (s2) -- node [above] {$*$} (c);
    \draw [new] (c) -- node [right] {$f$} (l);
    \draw [new] (l) -- node [above] {$*$} (t2);
    }
    \draw (s1) to node [rel] {$\S$} (s2);
    \draw [new] (t1) to [bend left] node [rel] {$\S$} (t2);
    \draw [new] (t1) to [bend left] node [rel] {$\L$} (l);
    \draw [new] (t1) to node [rel] {$\C$} (c);
    \end{tikzpicture}
\caption{The \texttt{cl\_call} case.}
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag, small vgap] (m) {%
        \node (s1) [is ret] {}; \\
        \node (t1) {}; \\
        \node (s2) {}; & \node (c) [is ret] {};\\
        & \node (r) {}; & \node (t2) {}; \\
    };
    {[-stealth]
    \draw (s1) -- (t1);
    \draw [new] (s2) -- node [above] {$*$} (c);
    \draw [new] (c) -- (r);
    \draw [new] (r) -- node [above] {$*$} (t2);
    }
    \draw (s1) to [bend right=45] node [rel] {$\S$} (s2);
    \draw [new, overlay] (t1) to [bend left=90, looseness=1] node [rel] {$\S,\L$} (t2);
    \draw [new, overlay] (t1) to [bend left=90, looseness=1.2] node [rel] {$\R$} (r);
\end{tikzpicture}
\caption{The \texttt{cl\_return} case.}
\caption{Mnemonic diagrams depicting the hypotheses for the preservation of structured traces.
         Dashed lines
         and arrows indicate how the diagrams must be closed when solid relations
         are present.}
\paragraph{1-to-many forward simulation conditions.}
\begin{condition}[Cases \texttt{cl\_other} and \texttt{cl\_jump}]
 For all $s_1,s_1',s_2$ such that $s_1 \S s_2$, and
 $s_1\exec s_1'$, and either $s_1 \class \texttt{cl\_other}$ or
 both $s_1\class\texttt{cl\_jump}$ and $\ell~s_1'$,
 there exists an $s_2'$ and a $\texttt{trace\_any\_any\_free}~s_2~s_2'$ called $taaf$
 such that $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
$taaf$ is non-empty, or at least one of $s_1$ and $s_1'$ is not \texttt{as\_costed}.
\end{condition}

In the above condition, depicted in \autoref{subfig:cl_other_jump},
a $\texttt{trace\_any\_any\_free}~s_1~s_2$ (which from now on
will be shorthanded as \verb+TAAF+) is an
inductive type of structured traces that do not contain function calls or
cost emission statements. Unlike a \verb+TAA+, the
instruction to be executed in the lookahead state $s_2$ may be a cost emission
statement.

The intuition behind the condition is that one step can be replaced with zero or
more steps if this preserves the relation between the data and if the two final
statuses are labelled in the same way. Moreover, we must take special care of
the empty case to avoid collapsing two consecutive states that emit a label,
which would lose one of the two emissions.
\begin{condition}[Case \texttt{cl\_call}]
 For all $s_1,s_1',s_2$ s.t. $s_1 \S s_2$ and
 $s_1\exec s_1'$ and $s_1 \class \texttt{cl\_call}$, there exist $s_a, s_b, s_2'$, a
$\verb+TAA+~s_2~s_a$, and a
$\verb+TAAF+~s_b~s_2'$ such that:
$s_a\class\texttt{cl\_call}$, the \texttt{as\_call\_ident}'s of
the two call states are the same, $s_1 \C s_a$,
$s_a\exec s_b$, $s_1' \L s_b$ and
$s_1' \S s_2'$.
\end{condition}

The condition, depicted in \autoref{subfig:cl_call}, says that, to simulate a
function call, we can perform a
sequence of silent actions before and after the function call itself.
The old and new call states must be $\C$-related, the old and new
states at the beginning of the function execution must be $\L$-related
and, finally, the two initial and final states must be $\S$-related
as usual.
\begin{condition}[Case \texttt{cl\_return}]
 For all $s_1,s_1',s_2$ s.t. $s_1 \S s_2$,
 $s_1\exec s_1'$ and $s_1 \class \texttt{cl\_return}$, there exist $s_a, s_b, s_2'$, a
$\verb+TAA+~s_2~s_a$, and a
$\verb+TAAF+~s_b~s_2'$ called $taaf$ such that:
$s_a\class\texttt{cl\_return}$,
$s_a\exec s_b$,
$s_1' \R s_b$ and
$s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
$taaf$ is non-empty, or $\lnot \ell~s_a$.
\end{condition}

Similarly to the call condition, to simulate a return we can perform a
sequence of silent actions before and after the return statement itself,
as depicted in \autoref{subfig:cl_return}.
The old and the new statements after the return must be $\R$-related,
to guarantee that they return to corresponding calls.
The two initial and final states must be $\S$-related
as usual and, moreover, they must exhibit the same labels. Finally, when
the suffix is non-empty we must take care not to insert a new
unmatched cost emission statement just after the return statement.
definition status_simulation ≝
  λS1 : abstract_status.
  λS2 : abstract_status.
  λsim_status_rel : status_rel S1 S2.
    ∀st1,st1',st2.as_execute S1 st1 st1' →
    sim_status_rel st1 st2 →
    match as_classify … st1 with
    [ None ⇒ True
    | Some cl ⇒
      match cl with
      [ cl_call ⇒ ∀prf.
        (*
             st1' ------------S----------\
              ↑ \                         \
             st1 \--L--\                   \
              | \       \                   \
              S  \-C-\  st2_after_call →taa→ st2'
              |       \     ↑
             st2 →taa→ st2_pre_call
        *)
        ∃st2_pre_call.
        as_call_ident ? st2_pre_call = as_call_ident ? («st1, prf») ∧
        call_rel ?? sim_status_rel «st1, prf» st2_pre_call ∧
        ∃st2_after_call,st2'.
        ∃taa2 : trace_any_any … st2 st2_pre_call.
        ∃taa2' : trace_any_any … st2_after_call st2'.
        as_execute … st2_pre_call st2_after_call ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2_after_call
      | cl_return ⇒
        (*
             st1
            / ↓
           | st1'----------S,L------------\
           S   \                           \
            \   \-----R-------\            |
             \                 |           |
             st2 →taa→ st2_ret |           |
                          ↓   /            |
                     st2_after_ret →taaf→ st2'

           we also ask that st2_after_ret be not labelled if the taaf tail is
           not empty
        *)
        ∃st2_ret,st2_after_ret,st2'.
        ∃taa2 : trace_any_any … st2 st2_ret.
        ∃taa2' : trace_any_any_free … st2_after_ret st2'.
        (if taaf_non_empty … taa2' then ¬as_costed … st2_after_ret else True) ∧
        as_classifier … st2_ret cl_return ∧
        as_execute … st2_ret st2_after_ret ∧ sim_status_rel st1' st2' ∧
        ret_rel … sim_status_rel st1' st2_after_ret ∧
        label_rel … st1' st2'
      | cl_other ⇒
          (*
          st1 → st1'
            |      \
            S      S,L
            |        \
           st2 →taaf→ st2'

           the taaf can be empty (e.g. tunneling) but we ask it must not be the
           case when both st1 and st1' are labelled (we would be able to collapse
           labels otherwise)
         *)
        ∃st2'.
        ∃taa2 : trace_any_any_free … st2 st2'.
        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2'
      | cl_jump ⇒
        (* just like cl_other, but with a hypothesis more *)
        as_costed … st1' →
        ∃st2'.
        ∃taa2 : trace_any_any_free … st2 st2'.
        (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
        sim_status_rel st1' st2' ∧
        label_rel … st1' st2'
      ]
    ].
\paragraph{Main result: the 1-to-many forward simulation conditions
are sufficient for trace reconstruction}

Let us assume that a relation set is given such that the 1-to-many
forward simulation conditions are satisfied. Under this assumption we
can prove the following three trace reconstruction theorems by mutual
structural induction over the traces given in input between the
$s_1$ and $s_1'$ states.

In particular, the \texttt{status\_simulation\_produce\_tlr} theorem,
applied to the \texttt{main} function of the program with
$s_{2_b}$ and $s_2$ taken to be equal states, shows that, for every initial
state in the source code that induces a structured trace in the source code,
the compiled code produces a similar structured trace.
\begin{theorem}[\texttt{status\_simulation\_produce\_tlr}]
For every $s_1,s_1',s_{2_b},s_2$ s.t.
there is a $\texttt{TLR}~s_1~s_1'$ called $tlr_1$ and a
$\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
$s_1 \S s_2$, there exist $s_{2_m},s_2'$ s.t.
there is a $\texttt{TLR}~s_{2_b}~s_{2_m}$ called $tlr_2$ and
there is a $\verb+TAAF+~s_{2_m}~s_2'$ called $taaf$
s.t.\ if $taaf$ is non-empty then $\lnot (\ell~s_{2_m})$,
and $tlr_1\approx tlr_2$
and $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
$s_1' \R s_{2_m}$.
\end{theorem}
The theorem states that a \texttt{trace\_label\_return} in the source code,
together with a precomputed preamble of silent states
(the \verb+TAA+) in the target code, induces a
similar \texttt{trace\_label\_return} in the target code, which can be
followed by a sequence of silent states. Note that the statement does not
require the produced \texttt{trace\_label\_return} to start with the
precomputed preamble, even if this is likely to be the case in concrete
implementations. The preamble in input is necessary for compositionality,
e.g. because the 1-to-many forward simulation conditions allow, in the
case of function calls, the execution of a preamble of silent instructions
just after the function call.
Clearly, similar results are also available for the other two types of structured
traces (in fact, they are all proved simultaneously by mutual induction).
% \begin{theorem}[\texttt{status\_simulation\_produce\_tll}]
% For every $s_1,s_1',s_{2_b},s_2$ s.t.
% there is a $\texttt{TLL}~b~s_1~s_1'$ called $tll_1$ and a
% $\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
% $s_1 \S s_2$, there exists $s_{2_m},s_2'$ s.t.
% \begin{itemize}
%  \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
%        and a trace $\texttt{TLL}~b~s_{2_b}~s_{2_m}$ called $tll_2$
%        and a $\texttt{TAAF}~s_{2_m}~s_2'$ called $taa_2$ s.t.
%        $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
%        $s_1' \R s_{2_m}$ and
%        $tll_1\approx tll_2$ and
%        if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
%  \item else there exists $s_2'$ and a
%        $\texttt{TLL}~b~s_{2_b}~s_2'$ called $tll_2$ such that
%        $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
%        $tll_1\approx tll_2$.
% \end{itemize}
% \end{theorem}

% The statement is similar to the previous one: a source
% \texttt{trace\_label\_label} and a given target preamble of silent states
% in the target code induce a similar \texttt{trace\_label\_label} in the
% target code, possibly followed by a sequence of silent moves that become the
% preamble for the next \texttt{trace\_label\_label} translation.

% \begin{theorem}[\texttt{status\_simulation\_produce\_tal}]
% For every $s_1,s_1',s_2$ s.t.
% there is a $\texttt{TAL}~b~s_1~s_1'$ called $tal_1$ and
% $s_1 \S s_2$
% \begin{itemize}
%  \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
%    and a trace $\texttt{TAL}~b~s_2~s_{2_m}$ called $tal_2$ and a
%    $\texttt{TAAF}~s_{2_m}~s_2'$ called $taa_2$ s.t.
%    $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
%    $s_1' \R s_{2_m}$ and
%    $tal_1 \approx tal_2$ and
%    if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
%  \item else there exists $s_2'$ and a
%    $\texttt{TAL}~b~s_2~s_2'$ called $tal_2$ such that
%    either $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
%        $tal_1\approx tal_2$
%    or $s_1' \mathrel{{\S} \cap {\L}} s_2$ and
%    $\texttt{tal\_collapsable}~tal_1$ and $\lnot \ell~s_1$.
% \end{itemize}
% \end{theorem}

% The statement is also similar to the previous ones, but for the lack of
% the target code preamble.
For every $s_1,s_1',s_2$ s.t.
there is a $\texttt{trace\_label\_return}~s_1~s_1'$ called $tlr_1$ and
$s_1 (\L \cap \S) s_2$,
there exist $s_{2_m},s_2'$ s.t.
there is a $\texttt{trace\_label\_return}~s_2~s_{2_m}$ called $tlr_2$ and
there is a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taaf$
s.t. if $taaf$ is non-empty then $\lnot (\texttt{as\_costed}~s_{2_m})$,
and $\texttt{tlr\_rel}~tlr_1~tlr_2$
and $s_1' (\S \cap \L) s_2'$ and
$s_1' \R s_{2_m}$.
status_simulation_produce_tlr S1 S2 R
(* we start from this situation
     st1 →→→→tlr→→→→ st1'
      | \
      L  \---S--\
      |          \
   st2_lab →taa→ st2   (the taa preamble is in general either empty or given
                        by the preceding call)

   and we produce
     st1 →→→→tlr→→→→ st1'
             \\      /  \
             //     R    \-L,S-\
             \\     |           \
   st2_lab →tlr→ st2_mid →taaf→ st2' *)
  st1 st1' st2_lab st2
  (tlr1 : trace_label_return S1 st1 st1')
  (taa2_pre : trace_any_any S2 st2_lab st2)
  (sim_execute : status_simulation S1 S2 R)
  on tlr1 : R st1 st2 → label_rel … st1 st2_lab →
  ∃st2_mid.∃st2'.
  ∃tlr2 : trace_label_return S2 st2_lab st2_mid.
  ∃taa2 : trace_any_any_free … st2_mid st2'.
  (if taaf_non_empty … taa2 then ¬as_costed … st2_mid else True) ∧
  R st1' st2' ∧ ret_rel … R st1' st2_mid ∧ label_rel … st1' st2' ∧
  tlr_rel … tlr1 tlr2
\section{Conclusions and future work}
The labelling approach is a technique to implement compilers that induce on
the source code a non-uniform cost model determined from the object code
produced. The cost model assigns a cost to each basic block of the program.
The main theorem of the approach states that there is an exact
correspondence between the sequence of basic blocks started in the source
and object code, and that no instruction in the source or object code is
executed outside a basic block. Thus the cost of object code execution
can be computed precisely on the source.
In this paper we scale the labelling approach to cover a programming language
with function calls. This introduces new difficulties only when the language
is unstructured, i.e. it allows function calls to return anywhere in the code,
destroying the hope of a static prediction of the cost of basic blocks.
We restore static predictability by introducing a new semantics for unstructured
programs that singles out well-structured executions. The latter are represented
by structured traces, a generalisation of streams of observables that captures
several structural invariants of the execution, such as the well nesting of
function calls and the fact that every basic block must start with a code
emission statement. We show that structured traces are sufficiently structured
to statically compute a precise cost model on the object code.
We introduce a similarity relation on structured traces that must hold between
source and target traces. When the relation holds for every program, we prove
that the cost model can be lifted from the object to the source code.
In order to prove that similarity holds, we present a generic proof of forward
simulation that is aimed at separating as much as possible the part of the
simulation related to non-functional properties (preservation of structure)
from the part related to functional properties. In particular, we reduce the
problem of preservation of structure to that of showing a 1-to-many
forward simulation that only adds a few proof obligations to those
of a traditional proof concerned only with functional properties.
All results presented in the paper are part of a larger certification of a
C compiler based on the labelling approach. The certification, carried out
in Matita, is the main deliverable of the FET-Open project Certified
Complexity (CerCo).

The short-term future work consists in completing the certification of
the CerCo compiler by exploiting the main theorem of this paper.
\paragraph{Related works.}
CerCo is the first project that explicitly tries to induce a
precise cost model on the source code in order to establish non-functional
properties of programs in a high-level language. Traditional certifications
of compilers, like~\cite{compcert2,piton}, only explicitly prove preservation
of the functional properties.
Usually forward simulations take the following form: for each transition
from $s_1$ to $s_2$ in the source code, there exists an equivalent sequence of
transitions in the target code of length $n$. The number $n$ of transition steps
in the target code can just be the witness of the existential statement.
An equivalent alternative, when the proof of simulation is constructive,
consists in providing an explicit function, called a \emph{clock function} in
the literature~\cite{clockfunctions}, that computes $n$ from $s_1$. Every clock
function then constitutes a cost model for the source code, in the spirit of
what we are doing in CerCo. However, we believe our solution to be superior
in the following respects: 1) the machinery of the labelling approach is
insensitive to the resource being measured. Indeed, any cost model computed on
the object code can be lifted to the source code (e.g. stack space used,
energy consumed, etc.). On the contrary, clock functions only talk about
the number of transition steps. In order to extend the approach with clock
functions to other resources, additional functions must be introduced.
Moreover, the additional functions would be handled differently in the proof.
2) the cost models induced by the labelling approach have a simple presentation.
In particular, they associate a number to each basic block. More complex
models can be induced when the approach is scaled to cover, for instance,
loop optimisations~\cite{loopoptimizations}, but the costs are still meant to
be easy to understand and manipulate in an interactive theorem prover or
in Frama-C.
On the contrary, a clock function is a complex function of the state $s_1$
which, as a function, is an opaque object that is difficult to reify as
source code in order to reason on it.
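The contrast can be made concrete with a small sketch of a clock function (the toy states and step counts are invented for illustration): it maps each source state to the number of target transitions simulating one source step, so the cost model it yields necessarily measures transition counts only.

```python
# Illustrative sketch of a clock function: for each source state it
# returns how many target transitions simulate one source step.

# Hypothetical mapping: source program counter -> number of target
# instructions the compiler emitted for that source statement.
steps_per_pc = {0: 3, 1: 1, 2: 5}

def clock(source_state):
    """Clock function: target steps needed to simulate this source step."""
    return steps_per_pc[source_state["pc"]]

def run_cost(source_trace):
    """Total target transition count for a source trace. This is a cost
    model, but one tied to step counting: measuring a different resource
    (stack, energy, ...) would require yet another function."""
    return sum(clock(s) for s in source_trace)

print(run_cost([{"pc": 0}, {"pc": 1}, {"pc": 2}]))  # 3 + 1 + 5 = 9
```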
% \appendix
% \section{Notes for the reviewers}

% The results described in the paper are part of a larger formalization
% (the certification of the CerCo compiler). At the moment of the submission
% we need to single out from the CerCo formalization the results presented here.
% Before the 16-th of February we will submit an attachment that contains the
% minimal subset of the CerCo formalization that allows to prove those results.
% At that time it will also be possible to measure exactly the size of the
% formalization described here. At the moment a rough approximation suggests
% about 2700 lines of Matita code.

% We will also attach the development version of the interactive theorem
% prover Matita that compiles the submitted formalization. Another possibility
% is to backport the development to the last released version of the system
% to avoid having to re-compile Matita from scratch.

% The programming and certification style used in the formalization heavily
% exploits dependent types. Dependent types are used: 1) to impose invariants
% by construction on the data types and operations (e.g. a trace from a state
% $s_1$ to a state $s_2$ can be concatenated to a trace from a state
% $s_2'$ to a state $s_3$ only if $s_2$ is convertible with $s_2'$); 2)
% to state and prove the theorems by using the Russell methodology of
% Matthieu Sozeau\footnote{Subset Coercions in Coq in TYPES'06. Matthieu Sozeau. Thorsten Altenkirch and Conor McBride (Eds). Volume 4502 of Lecture Notes in Computer Science. Springer, 2007, pp.237-252.
% }, better known in the Coq world as ``\texttt{Program}'' and reimplemented in a simpler way in Matita using coercion propagations\footnote{Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, Enrico Tassi: A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions. Logical Methods in Computer Science 8(1) (2012)}. However, no result presented depends
% mandatorily on dependent types: it should be easy to adapt the technique
% and results presented in the paper to HOL.

% Finally, Matita and Coq are based on minor variations of the Calculus of
% (Co)Inductive Constructions. These variations do not affect the CerCo
% formalization. Therefore a porting of the proofs and ideas to Coq would be
% rather straightforward.