source: Papers/itp-2013/ccexec2.tex @ 3350

Last change on this file since 3350 was 3350, checked in by sacerdot, 6 years ago


File size: 101.6 KB
[3343]12% \usepackage{amsmath}
17% NB: might be worth removing this if changing class in favour of
18% a built-in definition.
23  {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
24   morekeywords={[2]if,then,else,forall,Prop,match,with,end,let},
25  }
28  morekeywords={int8_t},
[3347]29  mathescape,
30  escapechar=\%
36%         keywordstyle=\color{red}\bfseries,
37%         keywordstyle=[2]\color{blue},
38%         commentstyle=\color{green},
39%         stringstyle=\color{blue},
[3343]40        showspaces=false,showstringspaces=false,
[3347]41%         xleftmargin=1em
42        }
47\pgfutil@ifundefined{pgf@arrow@code@implies}{% supply for lack of double arrow special arrow tip if it is not there%
48  \pgfarrowsdeclare{implies}{implies}%
49  {%
50  \pgfarrowsleftextend{2.2pt}%
51  \pgfarrowsrightextend{2.2pt}%
52  }%
53  {%
54    \pgfsetdash{}{0pt} % do not dash%
55    \pgfsetlinewidth{.33pt}%
56    \pgfsetroundjoin   % fix join%
57    \pgfsetroundcap    % fix cap%
58    \pgfpathmoveto{\pgfpoint{-1.5pt}{2.5pt}}%
59    \pgfpathcurveto{\pgfpoint{-.75pt}{1.5pt}}{\pgfpoint{0.5pt}{.5pt}}{\pgfpoint{2pt}{0pt}}%
60    \pgfpathcurveto{\pgfpoint{0.5pt}{-.5pt}}{\pgfpoint{-.75pt}{-1.5pt}}{\pgfpoint{-1.5pt}{-2.5pt}}%
61    \pgfusepathqstroke%
62  }%
66\tikzset{state/.style={inner sep = 0, outer sep = 2pt, draw, fill},
67         every node/.style={inner sep=2pt},
68         every on chain/.style = {inner sep = 0, outer sep = 2pt},
69         join all/.style = {every on chain/.append style={join}},
70         on/.style={on chain={#1}, state},
71         m/.style={execute at begin node=$, execute at end node=$},
72         node distance=3mm,
73         is other/.style={circle, minimum size = 3pt, state},
74         other/.style={on, is other},
75         is jump/.style={diamond, minimum size = 6pt, state},
76         jump/.style={on, is jump},
77         is call/.style={regular polygon, regular polygon sides=3, minimum size=5pt, state},
78         call/.style={on=going below, is call, node distance=6mm, label=above left:$#1$},
79         is ret/.style={regular polygon, regular polygon sides=3, minimum size=5pt, shape border rotate=180, state},
80         ret/.style={on=going above, is ret, node distance=6mm},
81         chain/.style={start chain=#1 going left},
82         rev ar/.style={stealth-, thick},
83         ar/.style={-stealth, thick},
84         every join/.style={rev ar},
85         labelled/.style={fill=white, label=above:$#1$},
86         vcenter/.style={baseline={([yshift=-.5ex]current bounding box)}},
87         every picture/.style={thick},
88         double equal sign distance/.prefix style={double distance=1.5pt}, %% if already defined (newest version of pgf) it should be ignored%}
89         implies/.style={double, -implies, thin, double equal sign distance, shorten <=5pt, shorten >=5pt},
90         new/.style={densely dashed},
91         rel/.style={font=\scriptsize, fill=white, inner sep=2pt},
92         diag/.style={row sep={11mm,between origins},
93                      column sep={11mm,between origins},
[3348]94                      every node/.style={execute at begin node=$, execute at end node=$},
95%                       every node/.style={draw, is other},
96                      },
[3343]97         small vgap/.style={row sep={7mm,between origins}},
[3348]98         small hgap/.style={column sep={7mm,between origins}},
99         small gap/.style={small vgap, small hgap},
[3343]100         % for article, maybe temporary
101         is jump/.style=is other,
102         is call/.style=is other,
103         is ret/.style=is other,
106\def\L{\mathrel{\mathcal L}}
107\def\S{\mathrel{\mathcal S}}
108\def\R{\mathrel{\mathcal R}}
109\def\C{\mathrel{\mathcal C}}
[3347]111\def\Labels{\mathbb L}
112\def\Functions{\mathbb F}
115\savebox{\execbox}{\tikz[baseline=-.5ex]\draw [-stealth] (0,0) -- ++(1em, 0);}
141\title{Certification of the Preservation of Structure by a Compiler's Back-end Pass\thanks{The project CerCo acknowledges the financial support of the Future and
142Emerging Technologies (FET) programme within the Seventh Framework
143Programme for Research of the European Commission, under FET-Open grant
144number: 243881}}
145\author{Paolo Tranquilli \and Claudio Sacerdoti Coen}
146\institute{Department of Computer Science and Engineering, University of Bologna,\\\email{}, \email{}}
149The labelling approach is a technique to lift cost models for non-functional
150properties of programs from the object code to the source code. It is based
151on the preservation of the structure of the high level program in every
152intermediate language used by the compiler. Such structure is captured by
153observables that are added to the semantics and that needs to be preserved
154by the forward simulation proof of correctness of the compiler. Additional
155special observables are required for function calls. In this paper we
156present a generic forward simulation proof that preserves all these observables.
157The proof statement is based on a new mechanised semantics that traces the
158structure of execution when the language is unstructured. The generic semantics
159and simulation proof have been mechanised in the interactive theorem prover
164The \emph{labelling approach} has been introduced in~\cite{easylabelling} as
165a technique to \emph{lift} cost models for non-functional properties of programs
166from the object code to the source code. Examples of non-functional properties
167are execution time, amount of stack/heap space consumed and energy required for
168communication. The basic idea of the approach is that it is impossible to
169provide a \emph{uniform} cost model for an high level language that is preserved
170\emph{precisely} by a compiler. For instance, two instances of an assignment
171$x = y$ in the source code can be compiled very differently according to the
172place (registers vs stack) where $x$ and $y$ are stored at the moment of
173execution. Therefore a precise cost model must assign a different cost
174to every occurrence, and the exact cost can only be known after compilation.
176According to the labelling approach, the compiler is free to compile and optimise
177the source code without any major restriction, but it must keep trace
178of what happens to basic blocks during the compilation. The cost model is
179then computed on the object code. It assigns a cost to every basic block.
180Finally, the compiler propagates back the cost model to the source level,
181assigning a cost to each basic block of the source code.
183Implementing the labelling approach in a certified compiler
184allows to reason formally on the high level source code of a program to prove
185non-functional properties that are granted to be preserved by the compiler
186itself. The trusted code base is then reduced to 1) the interactive theorem
187prover (or its kernel) used in the certification of the compiler and
1882) the software used to certify the property on the source language, that
189can be itself certified further reducing the trusted code base.
190In~\cite{easylabelling} the authors provide an example of a simple
191certified compiler that implements the labelling approach for the
[3347]192imperative \verb+While+ language~\cite{while}, that does not have
[3343]193pointers and function calls.
195The labelling approach has been shown to scale to more interesting scenarios.
196In particular in~\cite{functionallabelling} it has been applied to a functional
197language and in~\cite{loopoptimizations} it has been shown that the approach
198can be slightly complicated to handle loop optimisations and, more generally,
199program optimisations that do not preserve the structure of basic blocks.
200On-going work also shows that the labelling approach is also compatible with
201the complex analyses required to obtain a cost model for object code
202on processors that implement advanced features like pipelining, superscalar
203architectures and caches.
205In the European Project CerCo (Certified Complexity~\footnote{\url{}})~\cite{cerco} we are certifying a labelling approach based compiler for a large subset of C to
2068051 object code. The compiler is
207moderately optimising and implements a compilation chain that is largely
208inspired to that of CompCert~\cite{compcert1,compcert2}. Compared to work done in~\cite{easylabelling}, the main novelty and source of difficulties is due to the presence
209of function calls. Surprisingly, the addition of function calls require a
210revisitation of the proof technique given in~\cite{easylabelling}. In
211particular, at the core of the labelling approach there is a forward
[3347]212simulation proof that, in the case of \verb+While+, is only minimally
[3343]213more complex than the proof required for the preservation of the
214functional properties only. In the case of a programming language with
215function calls, instead, it turns out that the forward simulation proof for
216the back-end languages must grant a whole new set of invariants.
218In this paper we present a formalisation in the Matita interactive theorem
219prover~\cite{matita1,matita2} of a generic version of the simulation proof required for unstructured
220languages. All back-end languages of the CerCo compiler are unstructured
221languages, so the proof covers half of the correctness of the compiler.
222The statement of the generic proof is based on a new semantics
223for imperative unstructured languages that is based on \emph{structured
224traces} and that restores the preservation of structure in the observables of
225the semantics. The generic proof allows to almost completely split the
226part of the simulation that deals with functional properties only from the
227part that deals with the preservation of structure.
229The plan of this paper is the following. In Section~\ref{labelling} we
230sketch the labelling method and the problems derived from the application
231to languages with function calls. In Section~\ref{semantics} we introduce
232a generic description of an unstructured imperative language and the
233corresponding structured traces (the novel semantics). In
234Section~\ref{simulation} we describe the forward simulation proof.
235Conclusions and future works are in Section~\ref{conclusions}
237\section{The labelling approach}
[3347]239% \subsection{A brief introduction to the labelling approach}
240We briefly explain the labelling approach on the example in \autoref{examplewhile}.
241The user wants to analyse the execution time of the program (the black lines in
242\autoref{subfig:example_input}). He compiles the program using
[3344]243a special compiler that first inserts in the code three label emission
[3347]244statements (\verb+EMIT L_i+) to mark the beginning of basic blocks;
245then the compiler compiles the code to machine code (\autoref{subfig:example_oc}),
246granting that the execution of the source and object
[3344]247code emits the same sequence of labels ($L_1; L_2; L_2; L_3$ in the example).
248This is achieved by keeping track of basic blocks during compilation, avoiding
249all optimizations that alter the control flow. The latter can be recovered with
250a more refined version of the labelling approach~\cite{tranquill}, but in the
251present paper we stick to this simple variant for simplicity. Once the object
252code is produced, the compiler runs a static code analyzer to associate to
253each label $L_1, \ldots, L_3$ the cost (in clock cycles) of the instructions
254that belong to the corresponding basic block. For example, the cost $k_1$
255associated to $L_1$ is the number of cycles required to execute the block
[3347]256$I_3$ and \verb+COND l_2+, while the cost $k_2$ associated to $L_2$ counts the
[3350]257cycles required by the block $I_4$, \verb+GOTO l_1+ and \verb+COND l_2+. The compiler also guarantees
[3344]258that every executed instruction is in the scope of some code emission label,
259that each scope does not contain loops (to associate a finite cost), and that
260both branches of a conditional statement are followed by a code emission
261statement. Under these assumptions it is true that the total execution cost
262of the program $\Delta_t$ is equal to the sum over the sequence of emitted
263labels of the cost associated to every label:
[3347]264$\Delta_t = k(L_1; L_2; L_2; L_3) = k_1 + k_2 + k_2 + k_3$.
[3344]265Finally, the compiler emits an instrumented version of the source code
[3347]266(\autoref{subfig:example_instrument}) where label emission statements are replaced
267by increments of a global variable \verb+cost+ that, before every increment, holds the
[3344]268exact number of clock cycles spent by the microprocessor so far:
[3347]269the difference $\Deltacost$ between the final and initial value of the
270internal clock is $\Deltacost = k_1 + k_2 + k_2 + k_3 = \Delta_t$. Finally, the
[3344]271user can employ any available method (e.g. Hoare logic, invariant generators,
[3347]272abstract interpretation and automated provers) to certify that $\Deltacost$
[3344]273never exceeds a certain bound~\cite{cerco}, which is now a functional property
274of the code.
278|EMIT L_1;|
280for (i=0; i<2; i++) {
281  |EMIT L_2;|
282  $I_2$;
284|EMIT L_3;|
286\caption{The input program (black lines) with its labelling (red lines).}
292     EMIT L_1
293     $I_3$
294l_1: COND l_2
295     EMIT L_2
296     $I_4$
297     GOTO l_1
298l_2: EMIT L_3
300\caption{The output labelled object code.}
306cost += $k_1$;
308for (i=0; i<2; i++) {
309cost += $k_2$;
310  $I_2$;
312cost += $k_3$;           
314\caption{The output instrumented code.}
317% \begin{verbatim}
318% EMIT L_1;                         EMIT L_1         cost += k_1;
319% I_1;                              I_3              I_1;
320% for (i=0; i<2; i++) {        l_1: COND l_2         for (i=0; i<2; i++) {
321%   EMIT L_2;                       EMIT L_2           cost += k_2;           
322%   I_2;                            I_4                I_2;               
323%  }                                GOTO l_1          }                   
324% EMIT L_3;                    l_2: EMIT L_3         cost += k_3;           
325% \end{verbatim}
326\caption{The labelling approach applied to a simple program.\label{examplewhile}. The $I_i$ are sequences of instructions not containing jumps or loops. }
[3347]329\subsection{The labelling approach in presence of calls}
[3344]331Let's now consider a simple program written in C that contains a function
[3347]332pointer call inside the scope of the cost label $L_1$, in \autoref{subfig:call_input}.
[3344]333The labelling method works exactly as before, inserting
[3347]334code emission statements/\verb+cost+ variable increments at the beginning
[3344]335of every basic block and at the beginning of every function. The compiler
336still grants that the sequence of labels observed on the two programs are
[3347]337the same. A new difficulty appears when the compiler needs to statically
[3344]338analyze the object code to assign a cost to every label. What should the scope
[3347]339of the $L_1$ label be? After executing the $I_4$ block, the \verb+CALL+
[3344]340statement passes control to a function that cannot be determined statically.
341Therefore the cost of executing the body must be paid by some other label
342(hence the requirement that every function starts with a code emission
343statement). What label should pay for the cost for the block $I_5$? The only
344reasonable answer is $L_1$, i.e. \emph{the scope of labels should extend to the
[3347]345next label emission statement or the end of the function, stepping over function calls}.
352void main() {
353  EMIT L_1;
354  $I_1$;
355  (*f)();
356  $I_2$;
358void g() {
359  EMIT L_2;
360  $I_3$;
363\caption{The input labelled C program.}
371  EMIT L_1
372  $I_4$
373  CALL
374  $I_5$
377  EMIT L_2
378  $I_6$
381\caption{The output labelled object code.}
385\caption{An example compilation of a simple program with a function pointer
386         call.}
390The latter definition of scope is adeguate on the source level because
391C is a structured language that guarantees that every function call, if it
392returns, passes control to the first instruction that follows the call. However,
393this is not guaranteed for object code, the backend languages of a compiler
394and, more generally, for unstructured
395languages that use a writable control stack to store the return address of
396calls. For example, $I_6$ could increment by $1$ the return address on the
[3347]397stack so that the next \verb+RETURN+ would start at the second instruction
[3344]398of $I_5$. The compiler would still be perfectly correct if a random, dead
[3347]399code instruction was also added just after each \verb+CALL+. More generally,
[3344]400\emph{there is no guarantee that a correct compiler that respects the functional
401behaviour of a program also respects the calling structure of the source code}.
402Without such an assumption, however, it may not be true that the execution cost
403of the program is the sum of the costs associated to the labels emitted. In our
404example, the cost of $I_5$ is paid by $L_1$, but in place of $I_5$ the processor could execute any other code after $g$ returns.
406Obviously, any reasonably written compiler produces object code that behaves
407as if the language was structured (i.e. by properly nesting function
408calls/returns and without tampering with the return addresses on the control
409stack). This property, however, is a property of the runs of object code
410programs, and not a property of the object code that can be easily statically
411verified (as the ones we required for the basic labelling method).
412Therefore, we now need to single out those runs whose cost behaviour can be
413statically predicted, and we need to prove that every run of programs generated
414by our compiler are of that type. We call them \emph{structured} since their
415main property is to respect properties that hold for free on the source code
[3347]416because of structure. Moreover, in order to avoid proving
[3344]417too many preservation properties of our compiler, we drop the original
[3347]418requirements on the object code (all instructons must be in scope of some labels,
419no loops inside a scope, etc.) in favour of the corresponding requirement
420for structured runs (a structured run must start with a label emission, no
421instruction can be executed twice between two emissions, etc.).
423We will therefore proceed as follows. In the following section
4241) we formally introduce the notion of
425structured trace, which captures structured runs in the style of labelled
426transition systems; 2) we show that on the object code we can correctly
427compute the execution time of a structured run from the sequence of labels
428observed; 3) we give unstructured languages a semantics in terms of structured
429traces; 4) we show that on the source code we can correctly compute the
430execution time of a program if the compiler produces object code whose
431runs are weakly similar to the source code runs.
[3350]433The notion of weak simulation for structured traces is a global property
[3344]434which is hard to prove formally and much more demanding than the simple forward
435simulation required for proofs of preservation of functional properties.
[3347]436Therefore in \autoref{simulation} we will present a set of local simulation
[3344]437conditions that refine the corresponding conditions for forward simulation and
438that are sufficient to grant the production of weakly similar traces.
440All the definitions and theorems presented in the paper have been formalized
441in the interactive theorem prover Matita and are being used to certify
442the complexity preserving compiler developed in the CerCo project~\cite{cerco}.
443The formalization can be
444found at~\ref{YYY} and it heavily relies on algebraic and dependent types for
445both structured traces and the definition of weak similarity. In the paper
446we did not try to stay close to the formalization. On the contrary,
447the definitions given in the paper are the result of a significant
448simplification effort for
449the sake of presentation and to make easier the re-implementation of the
450concepts in a proof assistant which is not based on the Calculus of Inductive
[3347]451Constructions. In any case the formalization is heavily commented to allow the
[3344]452reader to understand the technical details of the formalization.
[3347]455% @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
457% We briefly sketch here a simplified version of the labelling approach as
458% introduced in~\cite{easylabelling}. The simplification strengthens the
459% sufficient conditions given in~\cite{easylabelling} to allow a simpler
460% explanation. The simplified conditions given here are also used in the
461% CerCo compiler to simplify the proof.
463% Let $\mathcal{P}$ be a programming language whose semantics is given in
464% terms of observables: a run of a program yields a finite or infinite
465% stream of observables. We also assume for the time being that function
466% calls are not available in $\mathcal{P}$. We want to associate a cost
467% model to a program $P$ written in $\mathcal{P}$. The first step is to
468% extend the syntax of $\mathcal{P}$ with a new construct $\verb+emit L+$
469% where $L$ is a label distinct from all observables of $\mathcal{P}$.
470% The semantics of $\verb+emit L+$ is the emission of the observable
471% \verb+L+ that is meant to signal the beginning of a basic block.
473% There exists an automatic procedure that injects into the program $P$ an
474% $\verb+emit L+$ at the beginning of each basic block, using a fresh
475% \verb+L+ for each block. In particular, the bodies of loops, both branches
476% of \verb+if-then-else+s and the targets of \verb+goto+s must all start
477% with an emission statement.
479% Let now $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
480% that is organised in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
481% language used by the compiler. We can easily extend every
482% intermediate language (and its semantics) with an $\verb+emit L+$ statement
483% as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
484% the additional difficulty that the syntax of object code is given as a
485% sequence of bytes. The injection of an emission statement in the object code
486% can be done using a map that maps two consecutive code addresses with the
487% statement. The intended semantics is that, if $(pc_1,pc_2) \mapsto \verb+emit L+$ then the observable \verb+L+ is emitted after the execution of the
488% instruction stored at $pc_1$ and before the execution of the instruction
489% stored at $pc_2$. The two program counters are necessary because the
490% instruction stored at $pc_1$ can have multiple possible successors (e.g.
491% in case of a conditional branch or an indirect call). Dually, the instruction
492% stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
493% target of a jump).
495% The compiler, to be functionally correct, must preserve the observational
496% equivalence, i.e. executing the program after each compiler pass should
497% yield the same stream of observables. After the injection of emission
498% statements, observables now capture both functional and non-functional
499% behaviours.
500% This correctness property is called in the literature a forward simulation
501% and is sufficient for correctness when the target language is
502% deterministic~\cite{compcert3}.
503% We also require a stronger, non-functional preservation property: after each
504% pass all basic blocks must start with an emission statement, and all labels
505% \verb+L+ must be unique.
507% Now let $M$ be the object code obtained for the program $P$. Let us suppose
508% that we can statically inspect the code $M$ and associate to each basic block
509% a cost (e.g. the number of clock cycles required to execute all instructions
510% in the basic block, or an upper bound to that time). Every basic block is
511% labelled with an unique label \verb+L+, thus we can actually associate the
512% cost to \verb+L+. Let call it $k(\verb+L+)$.
514% The function $k$ is defined as the cost model for the object code control
515% blocks. It can be equally used as well as the cost model for the source
516% control blocks. Indeed, if the semantics of $P$ is the stream
517% $L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
518% also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$ because
519% every instruction belongs to a control block and every control block is
520% labelled. Thus it is correct to say that the execution cost of $P$ is also
521% $\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
522% the blocks of the high level program $P$ that is preserved by compilation.
524% How can the user profit from the high level cost model? Suppose, for instance,
525% that he wants to prove that the WCET of his program is bounded by $c$. It
526% is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
527% a purely functional property of the code. He can therefore use any technique
528% available to certify functional properties of the source code.
529% What is suggested in~\cite{easylabelling} is to actually instrument the
530% source code $P$ by replacing every label emission statement
531% $\verb+emit L+$ with the instruction $\verb+cost += k(L)+$ that increments
532% a global fresh variable \verb+cost+. The bound is now proved by establishing
533% the program invariant $\verb+cost+ \leq c$, which can be done for example
534% using the Frama-C~\cite{framaC} suite if the source code is some variant of
535% C.
537% In order to extend the labelling approach to function calls we make
538% \verb+CALL f+ emit the observable \verb+f+ and \verb+RET+ emit a distinguished observable
539% \verb+ret+.
541% For example the following execution history of the program in \autoref{fig:esempio}
542% $$I_1; \verb+CALL f+; \verb+COND l+; \verb+EMIT $\ell_2$+; I_3; \verb+RET+; I_2; \verb+RET+$$
543% emits the trace
544% $$\verb+main+, \verb+f+$$
545% \begin{figure}
546% \hfil
547% \begin{minipage}{.2\linewidth}
548% \begin{lstlisting}
549% main: $\!I_1$
550%       CALL f
551%       $I_2$
552%       RET
553% \end{lstlisting}
554% \end{minipage}
555% \begin{minipage}{.1\linewidth}
556% \begin{lstlisting}
557% main
558% main
559% main
560% main
561% \end{lstlisting}
562% \end{minipage}
563% \hfil
564% \begin{minipage}{.2\linewidth}
565% \begin{lstlisting}
566% f: $\!$COND l
567%    EMIT $\ell_2$
568%    RET
569% l: $\!$EMIT $\ell_3$
570%    $I_3$
571%    RET
572% \end{lstlisting}
573% \end{minipage}
574% \begin{minipage}{.1\linewidth}
575% \begin{lstlisting}
576% f
578% $\ell_2$
580% $\ell_3$
581% $\ell_3$
582% \end{lstlisting}
583% \end{minipage}
584% \hfil{}
585% \caption{}
586% \label{fig:esempio}
587% \end{figure}
590% \subsection{Labelling function calls}
591% We now want to extend the labelling approach to support function calls.
592% On the high level, \emph{structured} programming language $\mathcal{P}$ there
593% is not much to change.
594% When a function is invoked, the current basic block is temporarily exited
595% and the basic block the function starts with take control. When the function
596% returns, the execution of the original basic block is resumed. Thus the only
597% significant change is that basic blocks can now be nested. Let \verb+E+
598% be the label of the external block and \verb+I+ the label of a nested one.
599% Since the external starts before the internal, the semantics observed will be
600% \verb+E I+ and the cost associated to it on the source language will be
601% $k(\verb+E+) + k(\verb+I+)$, i.e. the cost of executing all instructions
602% in the block \verb+E+ first plus the cost of executing all the instructions in
603% the block \verb+I+. However, we know that some instructions in \verb+E+ are
604% executed after the last instruction in \verb+I+. This is actually irrelevant
605% because we are here assuming that costs are additive, so that we can freely
606% permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
607% the function call terminates and yields back control to the basic block
608% \verb+E+. If the call diverges, the instrumentation
609% $\verb+cost += k(E)+$ executed at the beginning of \verb+E+ is still valid,
610% but just as an upper bound to the real execution cost: only precision is lost.
612% Let now consider what happens when we move down the compilation chain to an
613% unstructured intermediate or final language. Here unstructured means that
614% the only control operators are conditional and unconditional jumps, function
615% calls and returns. Unlike a structured language, though, there is no guarantee
616% that a function will return control just after the function call point.
617% The semantics of the return statement, indeed, consists in fetching the
618% return address from some internal structure (typically the control stack) and
619% jumping directly to it. The code can freely manipulate the control stack to
620% make the procedure returns to whatever position. Indeed, it is also possible
621% to break the well nesting of function calls/returns.
623% Is it the case that the code produced by a correct compiler must respect the
624% additional property that every function returns just after its function call
625% point? The answer is negative and the property is not implied by forward
626% simulation proofs. For instance, imagine to modify a correct compiler pass
627% by systematically adding one to the return address on the stack and by
628% putting a \verb+NOP+ (or any other instruction that takes one byte) after
629% every function call. The obtained code will be functionally indistinguishable,
630% and the added instructions will all be dead code.
632% This lack of structure in the semantics badly interferes with the labelling
633% approach. The reason is the following: when a basic block labelled with
634% \verb+E+ contains a function call, it no longer makes any sense to associate
635% to a label \verb+E+ the sum of the costs of all the instructions in the block.
636% Indeed, there is no guarantee that the function will return into the block and
637% that the instructions that will be executed after the return will be the ones
638% we are paying for in the cost model.
640% How can we make the labelling approach work in this scenario? We only see two
641% possible ways. The first one consists in injecting an emission statement after
642% every function call: basic blocks no longer contain function calls, but are now
643% terminated by them. This completely solves the problem and allows the compiler
644% to break the structure of function calls/returns at will. However, the
645% technique has several drawbacks. First of all, it greatly augments the number
646% of cost labels that are injected in the source code and that become
647% instrumentation statements. Thus, when reasoning on the source code to prove
648% non-functional properties, the user (or the automation tool) will have to handle
649% larger expressions. Second, the more labels are emitted, the more difficult it
650% becomes to implement powerful optimisations respecting the code structure.
651% Indeed, function calls are usually implemented in such a way that most registers
652% are preserved by the call, so that the static analysis of the block is not
653% interrupted by the call and an optimisation can involve both the code before
654% and after the function call. Third, instrumenting the source code may require
655% unpleasant modification of it. Take, for example, the code
656% \verb+f(g(x));+. We need to inject an emission statement/instrumentation
657% instruction just after the execution of \verb+g+. The only way to do that
658% is to rewrite the code as \verb+y = g(x); emit L; f(y);+ for some fresh
659% variable \verb+y+. It is pretty clear how in certain situations the obtained
660% code would be more obfuscated and then more difficult to manually reason on.
662% For the previous reasons, in this paper and in the CerCo project we adopt a
663% different approach. We do not inject emission statements after every
664% function call. However, we want to propagate a strong additional invariant in
665% the forward simulation proof. The invariant is the propagation of the structure
666%  of the original high level code, even if the target language is unstructured.
667% The structure we want to propagate, that will become more clear in the next
668% section, comprises 1) the property that every function should return just after
669% the function call point, which in turns imply well nesting of function calls;
670% 2) the property that every basic block starts with a code emission statement.
672% In the original labelling approach of~\cite{easylabelling}, the second property
673% was granted syntactically as a property of the generated code.
674% In our revised approach, instead, we will impose the property on the runs:
675% it will be possible to generate code that does not respect the syntactic
676% property, as soon as all possible runs respect it. For instance, dead code will no longer
677% be required to have all basic blocks correctly la. The switch is suggested
678% from the fact that the first of the two properties --- that related to
679% function calls/returns --- can only be defined as property of runs,
680% not of the static code. The switch is
681% beneficial to the proof because the original proof was made of two parts:
682% the forward simulation proof and the proof that the static property was granted.
683% In our revised approach the latter disappears and only the forward simulation
684% is kept.
686% In order to capture the structure semantics so that it is preserved
687% by a forward simulation argument, we need to make the structure observable
688% in the semantics. This is the topic of the next section.
[3343]690\section{Structured traces}
693Let's consider a generic unstructured language already equipped with a
694small step structured operational semantics (SOS). We introduce a
[3350]695deterministic labelled transition system~\cite{LTS} $(S,\Lambda,\to)$
[3345]696that refines the
697SOS by observing function calls and the beginning of basic blocks.
[3350]698$S$ is the set of states of the program and
[3347]699$\Lambda = \{ \tau, RET \} \cup \Labels \cup \Functions$
700where $\Functions$ is the set of names of functions that can occur in the
701program, $\Labels$ is a set of labels disjoint from $\Functions$
[3348]702and $\tau$ and $RET$ do not belong to $\Functions \cup \Labels$. Moreover there
703is an injective function $\ell : \Functions \to \Labels$ that tells the
704starting label of the body of each function, and $\ell(\Functions)\subseteq \Labels$
705denotes the image of this function.
[3347]706The transition function is defined as $s_1 \to[o] s_2$ if
[3350]707$s_1$ moves to $s_2$ according to the SOS and $o = f \in \Functions$ if
[3347]708the function $f$ is called, $o = RET$ if a \verb+RETURN+ is executed,
709$o = L \in \Labels$ if an \verb+EMIT $L$+ is executed to signal the
[3345]710beginning of a basic block, and $o = \tau$ in all other cases.
711Because we assume the language to be deterministic, the label emitted can
[3347]712actually be computed simply observing $s_1$. Finally, $S$ is also endowed with
[3350]713a relation $s\ar s'$ ($s'$ \emph{follows} $s$) that holds when the instruction
714to be executed in $s'$ follows syntactically the one in $s$ in the source program.
[3350]716In the rest of the paper we write $s_0 \to^{*} s_n$ for the finite execution
717fragment $T = s_0 \to[o_0] s_1 \to[o_1] \ldots \to[o_{n-1}] s_n$
718and, we call \emph{weak trace} of $T$ (denoted as $|T|$) the
719subsequence $o_{i_0} \ldots o_{i_m}$ of $o_0 \ldots o_{n-1}$ obtained dropping
720every internal action $\tau$.
722%Let $k$ be a cost model for observables actions that maps elements of
723%$\Lambda \setminus \{\tau\}$ to any commutative cost monoid
724%(e.g. natural numbers). We extend the domain of $k$ to executable fragments
725%by posing $k(T) = \Sigma_{o \in |T|} k(o)$.
727\paragraph{Structured execution fragments}
[3347]728Among all possible finite execution fragments we want to
[3345]729identify the ones that satisfy the requirements we sketched in the previous
[3347]730section. We say that an execution fragment
[3350]731$s_0 \to^{*} s_n$
732is \emph{structured} (and we denote it as $s_0 \To s_n$) iff the following conditions
[3347]733are met.
[3347]735 \item For every $i$, if $s_i \to[f] s_{i+1}$ then there is a
736   label $L$ and a $k\ge i+2$ such that
[3348]737    $s_{i+1} \to[\ell(f)] s_{i+2} \To s_k \to[RET] s_{k+1}$, with
[3347]738    $s_i \ar s_{k+1}$.
[3348]739   In other words, $s_{i+1}$ must start execution with \verb+EMIT $\ell(f)$+
[3350]740   --- so that no instruction falls outside the scope of every label ---
[3347]741   and then continue with a structured fragment returning control
742   to the instruction immediately following the call.
744   The condition also enforces convergence of every function call, which is
745   necessary to bound the cost of the fragment. Note that
746   non convergent programs may still have structured execution fragments
747   that are worth measuring. For example, we can measure the reaction time
748   of a server implemented as an unbounded loop whose body waits for an
749   input, process it and performs an output before looping: the processing
750   steps form a structured execution fragment.
[3347]751 \item The number of $RET$'s in the fragment is equal to the number of
[3350]752   calls performed. In combination with the previous condition, this ensures
753   well-backeting of function calls.
754 \item
755   \label{req3}
756   For every $i$ and $f$, if $s_{i+1}\to[\ell(f)]s_{i+2}$ then
757   $s_i\to[f]s_{i+1}$. This is a technical condition needed to ensure that a
758   label associated with a function is only used at the beginning of its
759   body. Its use will become clear in~\autoref{simulation}.
[3347]760 \item For every $i$, if the instruction to be executed in $s_i$ is a
761   conditional branch, then there is an $L$ such that $s_{i+1} \to[L] s_{i+2}$ or, equivalently, that $s_{i+1}$ must start execution with an
762   \verb+EMIT $L$+. This captures the requirement that every branch which is
[3350]763   live code must start with a label emission. Otherwise, it would be possible
764   to have conditional instructions whose branches are assigned different
765   costs, making impossible to assign a single cost to the label whose scope
766   contains the jump.
[3348]768One might wonder why $f$ and $\ell(f)$, that aways appear in this order, are not
[3350]769collapsed into a single observable. This would simplify some aspects of the
770formalisation at the price of others. For example, we should add special
771cases when the fragment starts at the beginning of a function body
772(e.g. the one of \texttt{main}) because in that case nobody would have emitted
773the observable $\ell(f)$.
[3350]775\paragraph{Measurable execution fragments and their cost prediction.}
776The first main theorem of CerCo deals with programs written in object code.
777It states that the execution cost of
778certain execution fragments, that we call \emph{measurable fragments}, can be
779computed from their weak trace by choosing the cost model $k$ that assigns to
780any label the cost (in clock cycles) of the instructions in its scope, and
781$0$ to function calls and $RET$ observables.
784 \label{static}
785 for all measurable fragment $T = s_0 \to^{*} s_n$,\\
786 $$\Delta_t := \verb+clock+_{s_n} - \verb+clock+_{s_0} = \Sigma_{o \in |T|} k(o)$$
[3350]789An execution fragment $s_0 \to^{*} s_n$ is
790measurable if it is structured (up to a possible final \texttt{RETURN}) and
791if it does not start or end in the middle of a basic block.
792Ending in the middle of a block would mean that the last label encountered
793would have pre-paid more instructions than the ones executed; starting in the
794middle would mean not paying any instruction up to the first label emission.
796Formally, $s_0 \to^{*} s_n$ is measurable iff $o_0 \in \Labels$ (or equivalently
[3347]797in $s_0$ the program must emit a label) and either
[3350]798$s_0 \To s_{n-1}$ and $s_{n-1}\to[RET]s_n$ or
799$s_0 \To s_n$ and $s_n$ must be a label emission statement.
[3350]801\textbf{CSC: PROVA----------------------}
802% The theorem is proved by structural induction over the structured
803% trace, and is based on the invariant that
804% iff the function that computes the cost model has analysed the instruction
805% to be executed at $s_2$ after the one to be executed at $s_1$, and if
806% the structured trace starts with $s_1$, then eventually it will contain also
807% $s_2$. When $s_1$ is not a function call, the result holds trivially because
808% of the $s_1\exec s_2$ condition obtained by inversion on
809% the trace. The only non
810% trivial case is the one of function calls: the cost model computation function
811% does recursion on the first instruction that follows that function call; the
812% \verb+as_after_return+ condition of the \verb+tal_base_call+ and
813% \verb+tal_step_call+ grants exactly that the execution will eventually reach
814% this state.
816\paragraph{Weak similarity and cost invariance.}
[3346]817Given two deterministic unstructured programming languages with their own
[3350]818operational semantics, we say that two execution fragments are
819\emph{weakly trace equivalent} if their weak traces are equal.
[3350]821A compiler (pass) that preserves the program semantics also preserves weak
822traces and propagates measurability iff for every measurable
823fragment $T_1 = s_1 \to^{*} s_1'$ of the source code, the corresponding
824execution fragment $T_2 = s_2 \to^{*} s_2'$ of the object code is measurable
825and $T_1$ and $T_2$ are weakly trace equivalent. The very intuitive notion of
826``corresponding fragment'' is made clear in the forward simulation proof of
827preservation of the semantics of the program by saying that $s_2$ and $s_1$
828are in a certain relation. Clearly the property holds for a compiler if it
829holds for each compiler pass.
831Having proved in~\autoref{static} that the statically computed cost model is
832accurate for the object code, we get as a corollary that it is also accurate
833for the source code if the compiler preserves weak traces and
834propagates measurability. Thus it becomes possible to compute cost models
835on the object code, transfer it to the source code and then reason comfortably
836on the source code only.
839Given a compiler that preserves weak traces and propagates measurability,
840for all measurable execution fragment $T_1 = s_1 \to^{*} s_1'$ of the source
841code such that $T_2 = s_2 \to^{*} s_2'$ is the corresponding fragment of the
842object code,
844$$\Delta_t := \verb+clock+_{s_2'} - \verb+clock+_{s_2} = \Sigma_{o \in |T_2|} k(o) = \Sigma_{o \in |T_1|} k(o)$$
847\section{Forward simulation}
849Because of \autoref{preservation}, to certify a compiler for the labelling
850approach we need to both prove that it respects the functional semantics of the
851program, and that it preserves weak traces and propagates measurability.
852The first property is standard and can be proved by means of a forward simulation argument (see for example~\cite{compcert}) that runs like this.
853First a relation between the corresponding
854source and target states is established. Then a lemma establishes
855a local simulation condition: given two states in relation, if the source
856one performs one step then the target one performs zero or more steps and
857the two resulting states are synchronized again according to the relation.
858Finally, the lemma is iterated over the execution trace to establish the
859final result.
861In principle, preservation of weak traces could be easily shown with the same
862argument (and also at the same time). Surprisingly, propagation of
863measurability cannot. What makes the standard forward
864simulation proof work is the fact that usually a compiler pass performs some
865kind of local or global analysis of the code followed by a compositional, order
866preserving translation of every instruction. In order to produce structured
867traces, however, code emission cannot be fully compositional any longer.
869For example, consider~requirement \ref{req3} that asks every function body
870to start with a label emission statement. Some compiler passes must
871add preambles to functions, for example to take care of the parameter passing
872convention. In order to not violate the requirement, the preamble must be
873inserted after the label emission. In the forward simulation proof, however,
874function call steps in the source language are simulated by the new function
875call followed by the execution of the preamble, and only at the end of the
876preamble the reached states are again in the expected relation. In the meantime,
877however, the object code has already performed the label emission statement,
878that still needs to be executed in the source code, breaking forward simulation.
880Another reason why the standard argument breaks is due to the requirement that
881function calls should yield back control after the calling point. This must be
882enforced just after
886A compiler preserves the program semantics by suppressing or introducing
887$\tau$ actions.
889Intuitively, it is because
891To understand why, consider the case of a function call
892and the pass that fixes the parameter passing conventions. A function
893call in the source code takes in input an arbitrary number of pseudo-registers (the actual parameters to pass) and returns an arbitrary number of pseudo-registers (where the result is stored). A function call in the target language has no
894input nor output parameters. The pass must add explicit code before and after
895the function call to move the pseudo-registers content from/to the hardware
896registers or the stack in order to implement the parameter passing strategy.
897Similarly, each function body must be augmented with a preamble and a postamble
898to complete/initiate the parameter passing strategy for the call/return phase.
899Therefore what used to be a call followed by the next instruction to execute
900after the function return, now becomes a sequence of instructions, followed by
901a call, followed by another sequence. The two states at the beginning of the
902first sequence and at the end of the second sequence are in relation with
903the status before/after the call in the source code, like in an usual forward
904simulation. How can we prove however the additional condition for function calls
905that asks that when the function returns the instruction immediately after the
906function call is called? To grant this invariant, there must be another relation
907between the address of the function call in the source and in the target code.
908This additional relation is to be used in particular to relate the two stacks.
[3347]913% @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[3347]915% The program semantics adopted in the traditional labelling approach is based
916% on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
917% a set of states $\S$, the semantics of one deterministic execution
918% step is
919% defined as a function $S \to S \times O^*$ where $O^*$ is a (finite) stream of
920% observables. The semantics is then lifted compositionally to multiple (finite
921% or infinite) execution steps.
922% Finally, the semantics of a a whole program execution is obtained by forgetting
923% about the final state (if any), yielding a function $S \to O^*$ that given an
924% initial status returns the finite or infinite stream of observables in output.
[3347]926% We present here a new definition of semantics where the structure of execution,
927% as defined in the previous section, is now observable. The idea is to replace
928% the stream of observables with a structured data type that makes explicit
929% function call and returns and that grants some additional invariants by
930% construction. The data structure, called \emph{structured traces}, is
931% defined inductively for terminating programs and coinductively for diverging
932% ones. In the paper we focus only on the inductive structure, i.e. we assume
933% that all programs that are given a semantics are total. The Matita formalisation
934% also shows the coinductive definitions. The semantics of a program is then
935% defined as a function that maps an initial state into a structured trace.
937% In order to have a definition that works on multiple intermediate languages,
938% we abstract the type of structure traces over an abstract data type of
939% abstract statuses, which we aptly call $\verb+abstract_status+$. The fields
940% of this record are the following.
941% \begin{itemize}
942%  \item \verb+S : Type[0]+, the type of states.
943%  \item \verb+as_execute : S $\to$ S $\to$ Prop+, a binary predicate stating
944%  an execution step. We write $s_1\exec s_2$ for $\verb+as_execute+~s_1~s_2$.
945%  \item \verb+as_classifier : S $\to$ classification+, a function tagging all
946%  states with a class in
947%  $\{\verb+cl_return,cl_jump,cl_call,cl_other+\}$, depending on the instruction
948%  that is about to be executed (we omit tail-calls for simplicity). We will
949%  use $s \class c$ as a shorthand for both $\verb+as_classifier+~s=c$
950%  (if $c$ is a classification) and $\verb+as_classifier+~s\in c$
951%  (if $c$ is a set of classifications).
952%  \item \verb+as_label : S $\to$ option label+, telling whether the
953%  next instruction to be executed in $s$ is a cost emission statement,
954%  and if yes returning the associated cost label. Our shorthand for this function
955%  will be $\ell$, and we will also abuse the notation by using $\ell~s$ as a
956%  predicate stating that $s$ is labelled.
957%  \item \verb+as_call_ident : ($\Sigma$s:S. s $\class$ cl_call) $\to$ label+,
958%  telling the identifier of the function which is being called in a
959%  \verb+cl_call+ state. We will use the shorthand $s\uparrow f$ for
960%  $\verb+as_call_ident+~s = f$.
961%  \item \verb+as_after_return : ($\Sigma$s:S. s $\class$ cl_call) $\to$ S $\to$ Prop+,
962%  which holds on the \verb+cl_call+ state $s_1$ and a state $s_2$ when the
963%  instruction to be executed in $s_2$ follows the function call to be
964%  executed in (the witness of the $\Sigma$-type) $s_1$. We will use the notation
965%  $s_1\ar s_2$ for this relation.
966% \end{itemize}
968% % \begin{alltt}
969% % record abstract_status := \{ S: Type[0];
970% %  as_execute: S \(\to\) S \(\to\) Prop;   as_classifier: S \(\to\) classification;
971% %  as_label: S \(\to\) option label;    as_called: (\(\Sigma\)s:S. c s = cl_call) \(\to\) label;
972% %  as_after_return: (\(\Sigma\)s:S. c s = cl_call) \(\to\) S \(\to\) Prop \}
973% % \end{alltt}
975% The inductive type for structured traces is actually made by three multiple
976% inductive types with the following semantics:
977% \begin{enumerate}
978%  \item $(\verb+trace_label_return+~s_1~s_2)$ (shorthand $\verb+TLR+~s_1~s_2$)
979%    is a trace that begins in
980%    the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
981%    such that the instruction to be executed in $s_1$ is a label emission
982%    statement and the one to be executed in the state before $s_2$ is a return
983%    statement. Thus $s_2$ is the state after the return. The trace
984%    may contain other label emission statements. It captures the structure of
985%    the execution of function bodies: they must start with a cost emission
986%    statement and must end with a return; they are obtained by concatenating
987%    one or more basic blocks, all starting with a label emission
988%    (e.g. in case of loops).
989%  \item $(\verb+trace_any_label+~b~s_1~s_2)$ (shorthand $\verb+TAL+~b~s_1~s_2$)
990%    is a trace that begins in
991%    the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
992%    such that the instruction to be executed in $s_2$/in the state before
993%    $s_2$ is either a label emission statement or
994%    or a return, according to the boolean $b$. It must not contain
995%    any label emission statement. It captures the notion of a suffix of a
996%    basic block.
997%  \item $(\verb+trace_label_label+~b~s_1~s_2)$ (shorthand $\verb+TLL+~b~s_1~s_2$ is the special case of
998%    $\verb+TAL+~b~s_1~s_2)$ such that the instruction to be
999%    executed in $s_1$ is a label emission statement. It captures the notion of
1000%    a basic block.
1001% \end{enumerate}
1003% \begin{multicols}{3}
1004% \infrule[\verb+tlr_base+]
1005%  {\verb+TLL+~true~s_1~s_2}
1006%  {\verb+TLR+~s_1~s_2}
1008% \infrule[\verb+tlr_step+]
1009%  {\verb+TLL+~false~s_1~s_2 \andalso
1010%   \verb+TLR+~s_2~s_3
1011%  }
1012%  {\verb+TLR+~s_1~s_3}
1014% \infrule[\verb+tll_base+]
1015%  {\verb+TAL+~b~s_1~s_2 \andalso
1016%   \ell~s_1
1017%  }
1018%  {\verb+TLL+~b~s_1~s_2}
1019% \end{multicols}
1021% \infrule[\verb+tal_base_not_return+]
1022%  {s_1\exec s_2 \andalso
1023%   s_1\class\{\verb+cl_jump+, \verb+cl_other+\}\andalso
1024%   \ell~s_2
1025%  }
1026%  {\verb+TAL+~false~s_1~s_2}
1028% \infrule[\verb+tal_base_return+]
1029%  {s_1\exec s_2 \andalso
1030%   s_1 \class \verb+cl_return+
1031%  }
1032%  {\verb+TAL+~true~s_1~s_2}
1034% \infrule[\verb+tal_base_call+]
1035%  {s_1\exec s_2 \andalso
1036%   s_1 \class \verb+cl_call+ \andalso
1037%   s_1\ar s_3 \andalso
1038%   \verb+TLR+~s_2~s_3 \andalso
1039%   \ell~s_3
1040%  }
1041%  {\verb+TAL+~false~s_1~s_3}
1043% \infrule[\verb+tal_step_call+]
1044%  {s_1\exec s_2 \andalso
1045%   s_1 \class \verb+cl_call+ \andalso
1046%   s_1\ar s_3 \andalso
1047%   \verb+TLR+~s_2~s_3 \andalso
1048%   \verb+TAL+~b~s_3~s_4
1049%  }
1050%  {\verb+TAL+~b~s_1~s_4}
1052% \infrule[\verb+tal_step_default+]
1053%  {s_1\exec s_2 \andalso
1054%   \lnot \ell~s_2 \andalso
1055%   \verb+TAL+~b~s_2~s_3\andalso
1056%   s_1 \class \verb+cl_other+
1057%  }
1058%  {\verb+TAL+~b~s_1~s_3}
1059% \begin{comment}
1060% \begin{verbatim}
1061% inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
1062%   | tlr_base:
1063%       ∀status_before: S.
1064%       ∀status_after: S.
1065%         trace_label_label S ends_with_ret status_before status_after →
1066%         trace_label_return S status_before status_after
1067%   | tlr_step:
1068%       ∀status_initial: S.
1069%       ∀status_labelled: S.
1070%       ∀status_final: S.
1071%         trace_label_label S doesnt_end_with_ret status_initial status_labelled →
1072%         trace_label_return S status_labelled status_final →
1073%           trace_label_return S status_initial status_final
1074% with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
1075%   | tll_base:
1076%       ∀ends_flag: trace_ends_with_ret.
1077%       ∀start_status: S.
1078%       ∀end_status: S.
1079%         trace_any_label S ends_flag start_status end_status →
1080%         as_costed S start_status →
1081%           trace_label_label S ends_flag start_status end_status
1082% with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
1083%   (* Single steps within a function which reach a label.
1084%      Note that this is the only case applicable for a jump. *)
1085%   | tal_base_not_return:
1086%       ∀start_status: S.
1087%       ∀final_status: S.
1088%         as_execute S start_status final_status →
1089%         (as_classifier S start_status cl_jump ∨
1090%          as_classifier S start_status cl_other) →
1091%         as_costed S final_status →
1092%           trace_any_label S doesnt_end_with_ret start_status final_status
1093%   | tal_base_return:
1094%       ∀start_status: S.
1095%       ∀final_status: S.
1096%         as_execute S start_status final_status →
1097%         as_classifier S start_status cl_return →
1098%           trace_any_label S ends_with_ret start_status final_status
1099%   (* A call followed by a label on return. *)
1100%   | tal_base_call:
1101%       ∀status_pre_fun_call: S.
1102%       ∀status_start_fun_call: S.
1103%       ∀status_final: S.
1104%         as_execute S status_pre_fun_call status_start_fun_call →
1105%         ∀H:as_classifier S status_pre_fun_call cl_call.
1106%           as_after_return S «status_pre_fun_call, H» status_final →
1107%           trace_label_return S status_start_fun_call status_final →
1108%           as_costed S status_final →
1109%             trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
1110%   (* A call followed by a non-empty trace. *)
1111%   | tal_step_call:
1112%       ∀end_flag: trace_ends_with_ret.
1113%       ∀status_pre_fun_call: S.
1114%       ∀status_start_fun_call: S.
1115%       ∀status_after_fun_call: S.
1116%       ∀status_final: S.
1117%         as_execute S status_pre_fun_call status_start_fun_call →
1118%         ∀H:as_classifier S status_pre_fun_call cl_call.
1119%           as_after_return S «status_pre_fun_call, H» status_after_fun_call →
1120%           trace_label_return S status_start_fun_call status_after_fun_call →
1121%           ¬ as_costed S status_after_fun_call →
1122%           trace_any_label S end_flag status_after_fun_call status_final →
1123%             trace_any_label S end_flag status_pre_fun_call status_final
1124%   | tal_step_default:
1125%       ∀end_flag: trace_ends_with_ret.
1126%       ∀status_pre: S.
1127%       ∀status_init: S.
1128%       ∀status_end: S.
1129%         as_execute S status_pre status_init →
1130%         trace_any_label S end_flag status_init status_end →
1131%         as_classifier S status_pre cl_other →
1132%         ¬ (as_costed S status_init) →
1133%           trace_any_label S end_flag status_pre status_end.
1134% \end{verbatim}
1135% \end{comment}
1136% A \verb+trace_label_return+ is isomorphic to a list of
1137% \verb+trace_label_label+s that ends with a cost emission followed by a
1138% return terminated \verb+trace_label_label+.
1139% The interesting cases are those of $\verb+trace_any_label+~b~s_1~s_2$.
1140% A \verb+trace_any_label+ is a sequence of steps built by a syntax directed
1141% definition on the classification of $s_1$. The constructors of the datatype
1142% impose several invariants that are meant to impose a structure to the
1143% otherwise unstructured execution. In particular, the following invariants are
1144% imposed:
1145% \begin{enumerate}
1146%  \item the trace is never empty; it ends with a return iff $b$ is
1147%        true
1148%  \item a jump must always be the last instruction of the trace, and it must
1149%        be followed by a cost emission statement; i.e. the target of a jump
1150%        is always the beginning of a new basic block; as such it must start
1151%        with a cost emission statement
1152%  \item a cost emission statement can never occur inside the trace, only in
1153%        the status immediately after
1154%  \item the trace for a function call step is made of a subtrace for the
1155%        function body of type
1156%        $\verb+trace_label_return+~s_1~s_2$, possibly followed by the
1157%        rest of the trace for this basic block. The subtrace represents the
1158%        function execution. Being an inductive datum, it grants totality of
1159%        the function call. The status $s_2$ is the one that follows the return
1160%        statement. The next instruction of $s_2$ must follow the function call
1161%        instruction. As a consequence, function calls are also well nested.
1162% \end{enumerate}
1164% There are three mutual structural recursive functions, one for each of
1165% \verb+TLR+, \verb+TLL+ and \verb+TAL+, for which we use the same notation
1166% $|\,.\,|$: the \emph{flattening} of the traces. These functions
1167% allow to extract from a structured trace the list of emitted cost labels.
1168% %  We only show here the type of one
1169% % of them:
1170% % \begin{alltt}
1171% % flatten_trace_label_return:
1172% %  \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
1173% %   trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
1174% % \end{alltt}
1176% \paragraph{Structured traces similarity and cost prediction invariance.}
1178% A compiler pass maps source to object code and initial states to initial
1179% states. The source code and initial state uniquely determine the structured
1180% trace of a program, if it exists. The structured trace fails to exists iff
1181% the structural conditions are violated by the program execution (e.g. a function
1182% body does not start with a cost emission statement). Let us assume that the
1183% target structured trace exists.
1185% What is the relation between the source and target structured traces?
1186% In general, the two traces can be arbitrarily different. However, we are
1187% interested only in those compiler passes that maps a trace $\tau_1$ to a trace
1188% $\tau_2$ such that
1189% \begin{equation}|\tau_1| = |\tau_2|.\label{th2}\end{equation}
1190% The reason is that the combination of~\eqref{th1} with~\eqref{th2} yields the
1191% corollary
1192% \begin{equation}\label{th3}
1193% \forall s_1,s_2. \forall \tau: \verb+TLR+~s_1~s_2.~
1194%   \verb+clock+~s_2 - \verb+clock+~s_1 =
1195%   \Sigma_{\alpha \in |\tau_1|}\;k(\alpha) =
1196%   \Sigma_{\alpha \in |\tau_2|}\;k(\alpha).
1197% \end{equation}
1198% This corollary states that the actual execution time of the program can be computed equally well on the source or target language. Thus it becomes possible to
1199% transfer the cost model from the target to the source code and reason on the
1200% source code only.
1202% We are therefore interested in conditions stronger than~\eqref{th2}.
1203% Therefore we introduce here a similarity relation between traces with
1204% the same structure. Theorem~\verb+tlr_rel_to_traces_same_flatten+
1205% in the Matita formalisation shows that~\eqref{th2} holds for every pair
1206% $(\tau_1,\tau_2)$ of similar traces.
1208% Intuitively, two traces are similar when one can be obtained from
1209% the other by erasing or inserting silent steps, i.e. states that are
1210% not \verb+as_costed+ and that are classified as \verb+cl_other+.
1211% Silent steps do not alter the structure of the traces.
1212% In particular,
1213% the relation maps function calls to function calls to the same function,
1214% label emission statements to emissions of the same label, concatenation of
1215% subtraces to concatenation of subtraces of the same length and starting with
1216% the same emission statement, etc.
1218% In the formalisation the three similarity relations --- one for each trace
1219% kind --- are defined by structural recursion on the first trace and pattern
1220% matching over the second. Here we turn
1221% the definition into the inference rules shown in \autoref{fig:txx_rel}
1222% for the sake of readability. We also omit from trace constructors all arguments,
1223% but those that are traces or that
1224% are used in the premises of the rules. By abuse of notation we denote all three
1225% relations by infixing $\approx$.
1227% \begin{figure}
1228% \begin{multicols}{2}
1229% \infrule
1230%  {tll_1\approx tll_2
1231%  }
1232%  {\verb+tlr_base+~tll_1 \approx \verb+tlr_base+~tll_2}
1234% \infrule
1235%  {tll_1 \approx tll_2 \andalso
1236%   tlr_1 \approx tlr_2
1237%  }
1238%  {\verb+tlr_step+~tll_1~tlr_1 \approx \verb+tlr_step+~tll_2~tlr_2}
1239% \end{multicols}
1240% \vspace{3ex}
1241% \begin{multicols}{2}
1242% \infrule
1243%  {\ell~s_1 = \ell~s_2 \andalso
1244%   tal_1\approx tal_2
1245%  }
1246%  {\verb+tll_base+~s_1~tal_1 \approx \verb+tll_base+~s_2~tal_2}
1248% \infrule
1249%  {tal_1\approx tal_2
1250%  }
1251%  {\verb+tal_step_default+~tal_1 \approx tal_2}
1252% \end{multicols}
1253% \vspace{3ex}
1254% \infrule
1255%  {}
1256%  {\verb+tal_base_not_return+\approx taa \append \verb+tal_base_not_return+}
1257% \vspace{1ex}
1258% \infrule
1259%  {}
1260%  {\verb+tal_base_return+\approx taa \append \verb+tal_base_return+}
1261% \vspace{1ex}
1262% \infrule
1263%  {tlr_1\approx tlr_2 \andalso
1264%   s_1 \uparrow f \andalso s_2\uparrow f
1265%  }
1266%  {\verb+tal_base_call+~s_1~tlr_1\approx taa \append \verb+tal_base_call+~s_2~tlr_2}
1267% \vspace{1ex}
1268% \infrule
1269%  {tlr_1\approx tlr_2 \andalso
1270%   s_1 \uparrow f \andalso s_2\uparrow f \andalso
1271%   \verb+tal_collapsable+~tal_2
1272%  }
1273%  {\verb+tal_base_call+~s_1~tlr_1 \approx taa \append \verb+tal_step_call+~s_2~tlr_2~tal_2)}
1274% \vspace{1ex}
1275% \infrule
1276%  {tlr_1\approx tlr_2 \andalso
1277%   s_1 \uparrow f \andalso s_2\uparrow f \andalso
1278%   \verb+tal_collapsable+~tal_1
1279%  }
1280%  {\verb+tal_step_call+~s_1~tlr_1~tal_1 \approx taa \append \verb+tal_base_call+~s_2~tlr_2)}
1281% \vspace{1ex}
1282% \infrule
1283%  {tlr_1 \approx tlr_2 \andalso
1284%   s_1 \uparrow f \andalso s_2\uparrow f\andalso
1285%   tal_1 \approx tal_2 \andalso
1286%  }
1287%  {\verb+tal_step_call+~s_1~tlr_1~tal_1 \approx taa \append \verb+tal_step_call+~s_2~tlr_2~tal_2}
1288% \caption{The inference rule for the relation $\approx$.}
1289% \label{fig:txx_rel}
1290% \end{figure}
1291% %
1292% \begin{comment}
1293% \begin{verbatim}
1294% let rec tlr_rel S1 st1 st1' S2 st2 st2'
1295%   (tlr1 : trace_label_return S1 st1 st1')
1296%   (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
1297% match tlr1 with
1298%   [ tlr_base st1 st1' tll1 ⇒
1299%     match tlr2 with
1300%     [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
1301%     | _ ⇒ False
1302%     ]
1303%   | tlr_step st1 st1' st1'' tll1 tl1 ⇒
1304%     match tlr2 with
1305%     [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
1306%       tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
1307%     | _ ⇒ False
1308%     ]
1309%   ]
1310% and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
1311%  (tll1 : trace_label_label S1 fl1 st1 st1')
1312%  (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
1313%   match tll1 with
1314%   [ tll_base fl1 st1 st1' tal1 H ⇒
1315%     match tll2 with
1316%     [ tll_base fl2 st2 st2 tal2 G ⇒
1317%       as_label_safe … («?, H») = as_label_safe … («?, G») ∧
1318%       tal_rel … tal1 tal2
1319%     ]
1320%   ]
1321% and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
1322%  (tal1 : trace_any_label S1 fl1 st1 st1')
1323%  (tal2 : trace_any_label S2 fl2 st2 st2')
1324%    on tal1 : Prop ≝
1325%   match tal1 with
1326%   [ tal_base_not_return st1 st1' _ _ _ ⇒
1327%     fl2 = doesnt_end_with_ret ∧
1328%     ∃st2mid,taa,H,G,K.
1329%     tal2 ≃ taa_append_tal ? st2 ??? taa
1330%       (tal_base_not_return ? st2mid st2' H G K)
1331%   | tal_base_return st1 st1' _ _ ⇒
1332%     fl2 = ends_with_ret ∧
1333%     ∃st2mid,taa,H,G.
1334%     tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
1335%       (tal_base_return ? st2mid st2' H G)
1336%   | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
1337%     fl2 = doesnt_end_with_ret ∧
1338%     ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
1339%     ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
1340%     (* we must allow a tal_base_call to be similar to a call followed
1341%       by a collapsable trace (trace_any_any followed by a base_not_return;
1342%       we cannot use trace_any_any as it disallows labels in the end as soon
1343%       as it is non-empty) *)
1344%     (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
1345%       tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
1346%     ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
1347%     ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
1348%       tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
1349%       tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
1350%   | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
1351%     ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
1352%     ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
1353%     (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
1354%       tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
1355%       tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
1356%     ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
1357%     ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
1358%       tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
1359%       tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
1360%   | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
1361%     tal_rel … tl1 tal2 (* <- this makes it many to many *)
1362%   ].
1363% \end{verbatim}
1364% \end{comment}
1365% %
1366% In the preceding rules, a $taa$ is an inhabitant of the
1367% $\verb+trace_any_any+~s_1~s_2$ (shorthand $\verb+TAA+~s_1~s_2$),
1368% an inductive data type whose definition
1369% is not in the paper for lack of space. It is the type of valid
1370% prefixes (even empty ones) of \verb+TAL+'s that do not contain
1371% any function call. Therefore it
1372% is possible to concatenate (using ``$\append$'') a \verb+TAA+ to the
1373% left of a \verb+TAL+. A \verb+TAA+ captures
1374% a sequence of silent moves.
1375% The \verb+tal_collapsable+ unary predicate over \verb+TAL+'s
1376% holds when the argument does not contain any function call and it ends
1377% with a label (not a return). The intuition is that after a function call we
1378% can still perform a sequence of silent actions while remaining similar.
1380% As should be expected, even though the rules are asymmetric $\approx$ is in fact
1381% an equivalence relation.
1383We summarise here the results of the previous sections. Each intermediate
1384unstructured language can be given a semantics based on structured traces,
1385that single out those runs that respect a certain number of invariants.
1386A cost model can be computed on the object code and it can be used to predict
1387the execution costs of runs that produce structured traces. The cost model
1388can be lifted from the target to the source code of a pass if the pass maps
1389structured traces to similar structured traces. The latter property is called
1390a \emph{forward simulation}.
1392As for labelled transition systems, in order to establish the forward
1393simulation we are interested in (preservation of observables), we are
1394forced to prove a stronger notion of forward simulation that introduces
1395an explicit relation between states. The classical notion of a 1-to-many
1396forward simulation is the existence of a relation $\S$ over states such that
1397if $s_1 \S s_2$ and $s_1 \to^1 s_1'$ then there exists an $s_2'$ such that
1398$s_2 \to^* s_2'$ and $s_1' \S s_2'$. In our context, we need to replace the
1399one and multi step transition relations $\to^n$ with the existence of
1400a structured trace between the two states, and we need to add the request that
1401the two structured traces are similar. Thus what we would like to state is
1402something like:\\
1403for all $s_1,s_2,s_1'$ such that there is a $\tau_1$ from
1404$s_1$ to $s_1'$ and $s_1 \S s_2$ there exists an $s_2'$ such that
1405$s_1' \S s_2'$ and a $\tau_2$ from $s_2$ to $s_2'$ such that
1406$\tau_1$ is similar to $\tau_2$. We call this particular form of forward
1407simulation \emph{trace reconstruction}.
1409The statement just introduced, however, is too simplistic and not provable
1410in the general case. To understand why, consider the case of a function call
1411and the pass that fixes the parameter passing conventions. A function
1412call in the source code takes in input an arbitrary number of pseudo-registers (the actual parameters to pass) and returns an arbitrary number of pseudo-registers (where the result is stored). A function call in the target language has no
1413input nor output parameters. The pass must add explicit code before and after
1414the function call to move the pseudo-registers content from/to the hardware
1415registers or the stack in order to implement the parameter passing strategy.
1416Similarly, each function body must be augmented with a preamble and a postamble
1417to complete/initiate the parameter passing strategy for the call/return phase.
1418Therefore what used to be a call followed by the next instruction to execute
1419after the function return, now becomes a sequence of instructions, followed by
1420a call, followed by another sequence. The two states at the beginning of the
1421first sequence and at the end of the second sequence are in relation with
1422the status before/after the call in the source code, like in an usual forward
1423simulation. How can we prove however the additional condition for function calls
1424that asks that when the function returns the instruction immediately after the
1425function call is called? To grant this invariant, there must be another relation
1426between the address of the function call in the source and in the target code.
1427This additional relation is to be used in particular to relate the two stacks.
1429Another example is given by preservation of code emission statements. A single
1430code emission instruction can be simulated by a sequence of steps, followed
1431by a code emission, followed by another sequence. Clearly the initial and final
1432statuses of the sequence are to be in relation with the status before/after the
1433code emission in the source code. In order to preserve the structured traces
1434invariants, however, we must consider a second relation between states that
1435traces the preservation of the code emission statement.
1437Therefore we now introduce an abstract notion of relation set between abstract
1438statuses and an abstract notion of 1-to-many forward simulation conditions.
1439These two definitions enjoy the following remarkable properties:
1441 \item they are generic enough to accommodate all passes of the CerCo compiler
1442 \item the conjunction of the 1-to-many forward simulation conditions are
1443       just slightly stricter than the statement of a 1-to-many forward
1444       simulation in the classical case. In particular, they only require
1445       the construction of very simple forms of structured traces made of
1446       silent states only.
1447 \item they allow to prove our main result of the paper: the 1-to-many
1448       forward simulation conditions are sufficient to prove the trace
1449       reconstruction theorem
1452Point 3. is the important one. First of all it means that we have reduced
1453the complex problem of trace reconstruction to a much simpler one that,
1454moreover, can be solved with slight adaptations of the forward simulation proof
1455that is performed for a compiler that only cares about functional properties.
1456Therefore we have successfully splitted as much as possible the proof of
1457preservation of functional properties from that of non-functional ones.
1458Secondly, combined with the results in the previous section, it implies
1459that the cost model can be computed on the object code and lifted to the
1460source code to reason on non-functional properties, assuming that
1461the 1-to-many forward simulation conditions are fulfilled for every
1462compiler pass.
1464\paragraph{Relation sets.}
1466Let $S_1$ and $S_2$ be two deterministic labelled transition systems as described
1467in \autoref{semantics}. We introduce now the four relations $\mathcal{S,C,R,L}\subseteq S_1\times S_2$
1468between states of the two systems. The first two are abstract and must be instantiated
1469by every pass. The remaining two are derived.
1471The $\S$ relation between states is the classical relation used
[3349]1472in forward simulation proofs. It correlates the data of the states
[3343]1473(e.g. registers, memory, etc.).
[3349]1475The $\C$ relation correlates states that are about to execute calls.
1476It allows to track the position in the target code of every call in the source code.
[3349]1478% The $\L$ relation simply says that the two states are both label
1479% emitting states that emit the same label, \emph{i.e.}\ $s_1\L s_2\iffdef \ell~s_1=\ell~s_2$.
1480% It allows to track the position in
1481% the target code of every cost emitting statement in the source code.
[3349]1483Two states
1484$s_1$ and $s_2$ are $\R$-related if every time $s_1$ is the
1485successor of a call state that is $\C$-related to a call state
[3343]1486$s_2'$ in the target code, then $s_2$ is the successor of $s_2'$. Formally:
1487$$s_1\R s_2 \iffdef \forall s_1',s_2'.s_1'\C s_2' \to s_1'\ar s_1 \to s_2' \ar s_2.$$
[3349]1488We will require all pairs of states that return from related calls to be
[3343]1489$\R$-related. This is the fundamental requirement granting
1490that the target trace is well structured, \emph{i.e.}\ that calls are well
1491nested and returning where they are supposed to.
[3349]1493We say states in $s_1\in S_1$ and $s_2\in S_2$ are label-related
1494(marked $s_1\L s_2$) if
1496\item they both emit the same label $L$;
1497\item if $L\in \ell(\Functions)$ then
1498there is a state $s_2'$ such that $s_2\to[L]\to[\tau *]s_2'$ with
1499 $s_1\S s_2'$, otherwise if $L\notin\ell(\Functions)$ then $s_1\S s_2$.
1502The results that follow all rely on common hypotheses to hold for the
1503systems $S_1$ and $S_2$ endowed with the relations $\S$ and $\C$. These hypotheses
1504are depicted by the diagrams in \autoref{fig:forwardsim}.
1507% \begin{subfigure}{.475\linewidth}
1508% \centering
1509% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
1510%                             every label/.style=overlay, node distance=10mm]
1511%     \matrix [diag] (m) {%
1512%          \node (s1) [is jump] {}; & \node [fill=white] (t1) {};\\
1513%          \node (s2) {}; & \node (t2) {}; \\
1514%     };
1515%     \node [above=0 of t1, overlay] {$\alpha$};
1516%     {[-stealth]
1517%     \draw (s1) -- (t1);
1518%     \draw [new] (s2) -- node [above] {$*$} (t2);
1519%     }
1520%     \draw (s1) to node [rel] {$\S$} (s2);
1521%     \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
1522% \end{tikzpicture}
[3347]1523% \caption{The \verb+cl_jump+ case.}
[3343]1524% \label{subfig:cl_jump}
1525% \end{subfigure}
1526% &
[3349]1528\tikzset{diag/.append style=
1529         {every node/.append style={is other,
1530                                    text height=0,text depth=0,text width=0,
1531                                    text opacity=0}                                                         
1532         },
1533         }
[3343]1535    \matrix [diag] (m) {%
[3348]1536         \node (s1) {s_1}; & \node (t1) {s_1'};\\
1537         \node (s2) {s_2}; & \node (t2) {s_2'}; \\
[3343]1538    };
1539    {[-stealth]
[3348]1540    \draw (s1) -- node [above] {$\tau$} (t1);
1541    \draw [new] (s2) -- node [above] {$\tau *$} (t2);
[3343]1542    }
[3347]1543    \draw (s1) to [bend right, anchor=mid] node [rel] {$\S$} (s2);
[3348]1544    \draw [new] (t1) to [bend left, anchor=mid] node [rel] {$\S$} (t2);
1548    \matrix [diag] (m) {%
1549         \node (s1) {s_1}; & \node (t1) {s_1'};\\
[3349]1550         \node (s2) {s_2}; & \node (mid) {s_a}; & \node (t2) {s_2'}; \\
[3343]1551    };
1552    {[-stealth]
[3348]1553    \draw (s1) -- node [above] {$L$} (t1);
[3349]1554    \draw [new] (s2) -- node [above] {$L$} (mid);
1555    \draw [new] (mid) -- node [above] {$\tau *$} (t2);
[3348]1556    }
1557    \draw (s1) to [bend right, anchor=mid] node [rel] {$\S$} (s2);
1558    \draw [new] (t1) to [bend left, anchor=mid] node [rel] {$\S$} (t2);
1560\text{ if $L\notin\ell(\Functions)$}
1563    \matrix [diag] (m) {%
1564         \node (s1) {s_1}; & \node (t1) {s_1'};\\
1565         \node (s2) {s_2};\\
1566    };
1567    {[-stealth]
1568    \draw (s1) -- node [above] {$\ell(f)$} (t1);
1569    }
1570    \draw (s1) to [bend right, anchor=mid] node [rel] {$\S$} (s2);
1571    \draw [new] (t1) to [bend left, anchor=mid] node [rel] {$\S$} (s2);
1575    \matrix [diag, small vgap] (m) {%
1576         &\node (t1) {s_1'}; \\
1577         \node (s1) {s_1}; \\
1578         && \node (l1) {s_b}; & \node (l2) {s_c}; & \node (t2) {s_2'};\\
1579         \node (s2) {s_2}; & \node (c) {s_a};\\   
1580    };
1581    {[-stealth]
[3347]1582    \draw (s1) -- node [above left] {$f$} (t1);
[3348]1583    \draw [new] (s2) -- node [above] {$\tau *$} (c);
1584    \draw [new] (c) -- node [above left] {$f$} (l1);
1585    \draw [new] (l1) -- node [above] {$\ell(f)$} (l2);
1586    \draw [new] (l2) -- node [above] {$\tau *$} (t2);
[3343]1587    }
[3347]1588    \draw (s1) to [bend right] node [rel] {$\S$} (s2);
[3343]1589    \draw [new] (t1) to [bend left] node [rel] {$\S$} (t2);
[3347]1590    \draw [new] (t1) to [bend right] node [rel] {$\C$} (c);
1594    \matrix [diag, small vgap] (m) {%
1595        \node (s1) {s_1}; \\
1596        &\node (t1) {s_1'}; \\
1597        \node (s2) {s_2}; & \node (c) {s_a};\\
1598        && \node (r) {s_b}; & \node (t2) {s_2'}; \\   
[3343]1599    };
1600    {[-stealth]
[3347]1601    \draw (s1) -- node [above right] {$RET$} (t1);
[3348]1602    \draw [new] (s2) -- node [above] {$\tau *$} (c);
[3347]1603    \draw [new] (c) -- node [below left] {$RET$} (r);
[3348]1604    \draw [new] (r) -- node [above] {$\tau *$} (t2);
[3343]1605    }
[3347]1606    \draw (s1) to [bend right] node [rel] {$\S$} (s2);
[3348]1607    \draw [new, overlay] (t1) to [bend left=60] node [rel] {$\S$} (t2);
[3347]1608    \draw [new, overlay] (t1) to [bend left ] node [rel] {$\R$} (r);
[3348]1610\caption{The hypotheses of ???. Dashed relations are implied by solid ones.}
1615If $S_1$, $S_2$,$\S$ and $\C$ satisfy the diagrams in \autoref{fig:forwardsim},
1616$T_1:s_1\To s_1'$ is a structured fragment not starting with a $\ell(f)$ emission,
1617and $s_1\S s_2$, then there is $T_2:s_2\To s_2'$ with $T\approx T'$ and $s_1'\S s_2$.
1620If $S_1$, $S_2$, $\S$ and $\C$ satisfy the diagrams in \autoref{fig:forwardsim},
1621$M_1:s_1\MeasTo s_1'$ is a measurable fragment of $S_1$ and $s_2$ is such that
1622$s_1\L s_2$, then there is $M_2:s_2\MeasTo s_2'$ with $M_1\approx M_2$.
1626% \begin{figure}
1627% \centering
1628% \begin{tabular}{@{}c@{}c@{}c@{}}
1629% % \begin{subfigure}{.475\linewidth}
1630% % \centering
1631% % \begin{tikzpicture}[every join/.style={ar}, join all, thick,
1632% %                             every label/.style=overlay, node distance=10mm]
1633% %     \matrix [diag] (m) {%
1634% %          \node (s1) [is jump] {}; & \node [fill=white] (t1) {};\\
1635% %          \node (s2) {}; & \node (t2) {}; \\
1636% %     };
1637% %     \node [above=0 of t1, overlay] {$\alpha$};
1638% %     {[-stealth]
1639% %     \draw (s1) -- (t1);
1640% %     \draw [new] (s2) -- node [above] {$*$} (t2);
1641% %     }
1642% %     \draw (s1) to node [rel] {$\S$} (s2);
1643% %     \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
1644% % \end{tikzpicture}
1645% % \caption{The \verb+cl_jump+ case.}
1646% % \label{subfig:cl_jump}
1647% % \end{subfigure}
1648% % &
1649% \begin{subfigure}{.25\linewidth}
1650% \centering
1651% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
1652%                             every label/.style=overlay, node distance=10mm]
1653%     \matrix [diag] (m) {%
1654%          \node (s1) {}; & \node (t1) {};\\
1655%          \node (s2) {}; & \node (t2) {}; \\
1656%     };
1657%     {[-stealth]
1658%     \draw (s1) -- (t1);
1659%     \draw [new] (s2) -- node [above] {$*$} (t2);
1660%     }
1661%     \draw (s1) to [bend right, anchor=mid] node [rel] {$\S$} (s2);
1662%     \draw [new] (t1) to [bend left, anchor=mid] node [rel] {$\S,\L$} (t2);
1663% \end{tikzpicture}
1664% \caption{The \verb+cl_oher+ and \verb+cl_jump+ cases.}
1665% \label{subfig:cl_other_jump}
1666% \end{subfigure}
1667% &
1668% \begin{subfigure}{.375\linewidth}
1669% \centering
1670% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
1671%                             every label/.style=overlay, node distance=10mm]
1672%     \matrix [diag, small gap] (m) {%
1673%          &\node (t1) {}; \\
1674%          \node (s1) [is call] {}; \\
1675%          && \node (l) {}; & \node (t2) {};\\
1676%          \node (s2) {}; & \node (c) [is call] {};\\   
1677%     };
1678%     {[-stealth]
1679%     \draw (s1) -- node [above left] {$f$} (t1);
1680%     \draw [new] (s2) -- node [above] {$*$} (c);
1681%     \draw [new] (c) -- node [below right] {$f$} (l);
1682%     \draw [new] (l) -- node [above] {$*$} (t2);
1683%     }
1684%     \draw (s1) to [bend right] node [rel] {$\S$} (s2);
1685%     \draw [new] (t1) to [bend left] node [rel] {$\S$} (t2);
1686%     \draw [new] (t1) to [bend left] node [rel] {$\L$} (l);
1687%     \draw [new] (t1) to [bend right] node [rel] {$\C$} (c);
1688%     \end{tikzpicture}
1689% \caption{The \verb+cl_call+ case.}
1690% \label{subfig:cl_call}
1691% \end{subfigure}
1692% &
1693% \begin{subfigure}{.375\linewidth}
1694% \centering
1695% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
1696%                             every label/.style=overlay, node distance=10mm]
1697%     \matrix [diag, small gap] (m) {%
1698%         \node (s1) [is ret] {}; \\
1699%         &\node (t1) {}; \\
1700%         \node (s2) {}; & \node (c) [is ret] {};\\
1701%         && \node (r) {}; & \node (t2) {}; \\   
1702%     };
1703%     {[-stealth]
1704%     \draw (s1) -- node [above right] {$RET$} (t1);
1705%     \draw [new] (s2) -- node [above] {$*$} (c);
1706%     \draw [new] (c) -- node [below left] {$RET$} (r);
1707%     \draw [new] (r) -- node [above] {$*$} (t2);
1708%     }
1709%     \draw (s1) to [bend right] node [rel] {$\S$} (s2);
1710%     \draw [new, overlay] (t1) to [bend left=60] node [rel] {$\S,\L$} (t2);
1711%     \draw [new, overlay] (t1) to [bend left ] node [rel] {$\R$} (r);
1712% \end{tikzpicture}
1713% \caption{The \verb+cl_return+ case.}
1714% \label{subfig:cl_return}
1715% \end{subfigure}
1716% \end{tabular}
1717% \caption{Mnemonic diagrams depicting the hypotheses for the preservation of structured traces.
1718%          Dashed lines
1719%          and arrows indicates how the diagrams must be closed when solid relations
1720%          are present.}
1721% \label{fig:forwardsim}
1722% \end{figure}
[3349]1724% \paragraph{1-to-many forward simulation conditions.}
1725% \begin{condition}[Cases \verb+cl_other+ and \verb+cl_jump+]
1726%  For all $s_1,s_1',s_2$ such that $s_1 \S s_1'$, and
1727%  $s_1\exec s_1'$, and either $s_1 \class \verb+cl_other+$ or
1728%  both $s_1\class\verb+cl_other+\}$ and $\ell~s_1'$,
1729%  there exists an $s_2'$ and a $\verb+trace_any_any_free+~s_2~s_2'$ called $taaf$
1730%  such that $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
1731% $taaf$ is non empty, or one among $s_1$ and $s_1'$ is \verb+as_costed+.
1732% \end{condition}
1734% In the above condition depicted in \autoref{subfig:cl_other_jump},
1735% a $\verb+trace_any_any_free+~s_1~s_2$ (which from now on
1736% will be shorthanded as \verb+TAAF+) is an
1737% inductive type of structured traces that do not contain function calls or
1738% cost emission statements. Differently from a \verb+TAA+, the
1739% instruction to be executed in the lookahead state $s_2$ may be a cost emission
1740% statement.
1742% The intuition of the condition is that one step can be replaced with zero or more steps if it
1743% preserves the relation between the data and if the two final statuses are
1744% labelled in the same way. Moreover, we must take special care of the empty case
1745% to avoid collapsing two consecutive states that emit a label, missing one of the two emissions.
1747% \begin{condition}[Case \verb+cl_call+]
1748%  For all $s_1,s_1',s_2$ s.t. $s_1 \S s_1'$ and
1749%  $s_1\exec s_1'$ and $s_1 \class \verb+cl_call+$, there exists $s_a, s_b, s_2'$, a
1750% $\verb+TAA+~s_2~s_a$, and a
1751% $\verb+TAAF+~s_b~s_2'$ such that:
1752% $s_a\class\verb+cl_call+$, the \verb+as_call_ident+'s of
1753% the two call states are the same, $s_1 \C s_a$,
1754% $s_a\exec s_b$, $s_1' \L s_b$ and
1755% $s_1' \S s_2'$.
1756% \end{condition}
1758% The condition, depicted in \autoref{subfig:cl_call} says that, to simulate a function call, we can perform a
1759% sequence of silent actions before and after the function call itself.
1760% The old and new call states must be $\C$-related, the old and new
1761% states at the beginning of the function execution must be $\L$-related
1762% and, finally, the two initial and final states must be $\S$-related
1763% as usual.
1765% \begin{condition}[Case \verb+cl_return+]
1766%  For all $s_1,s_1',s_2$ s.t. $s_1 \S s_1'$,
1767%  $s_1\exec s_1'$ and $s_1 \class \verb+cl_return+$, there exists $s_a, s_b, s_2'$, a
1768% $\verb+TAA+~s_2~s_a$, a
1769% $\verb+TAAF+~s_b~s_2'$ called $taaf$ such that:
1770% $s_a\class\verb+cl_return+$,
1771% $s_a\exec s_b$,
1772% $s_1' \R s_b$ and
1773% $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
1774% $taaf$ is non empty, or $\lnot \ell~s_a$.
1775% \end{condition}
1777% Similarly to the call condition, to simulate a return we can perform a
1778% sequence of silent actions before and after the return statement itself,
1779% as depicted in \autoref{subfig:cl_return}.
1780% The old and the new statements after the return must be $\R$-related,
1781% to grant that they returned to corresponding calls.
1782% The two initial and final states must be $\S$-related
1783% as usual and, moreover, they must exhibit the same labels. Finally, when
1784% the suffix is non empty we must take care of not inserting a new
1785% unmatched cost emission statement just after the return statement.
1787% \begin{comment}
1788% \begin{verbatim}
1789% definition status_simulation ≝
1790%   λS1 : abstract_status.
1791%   λS2 : abstract_status.
1792%   λsim_status_rel : status_rel S1 S2.
1793%     ∀st1,st1',st2.as_execute S1 st1 st1' →
1794%     sim_status_rel st1 st2 →
1795%     match as_classify … st1 with
1796%     [ None ⇒ True
1797%     | Some cl ⇒
1798%       match cl with
1799%       [ cl_call ⇒ ∀prf.
1800%         (*
1801%              st1' ------------S----------\
1802%               ↑ \                         \
1803%              st1 \--L--\                   \
1804%               | \       \                   \
1805%               S  \-C-\  st2_after_call →taa→ st2'
1806%               |       \     ↑
1807%              st2 →taa→ st2_pre_call
1808%         *)
1809%         ∃st2_pre_call.
1810%         as_call_ident ? st2_pre_call = as_call_ident ? («st1, prf») ∧
1811%         call_rel ?? sim_status_rel «st1, prf» st2_pre_call ∧
1812%         ∃st2_after_call,st2'.
1813%         ∃taa2 : trace_any_any … st2 st2_pre_call.
1814%         ∃taa2' : trace_any_any … st2_after_call st2'.
1815%         as_execute … st2_pre_call st2_after_call ∧
1816%         sim_status_rel st1' st2' ∧
1817%         label_rel … st1' st2_after_call
1818%       | cl_return ⇒
1819%         (*
1820%              st1
1821%             / ↓
1822%            | st1'----------S,L------------\
1823%            S   \                           \
1824%             \   \-----R-------\            |     
1825%              \                 |           |
1826%              st2 →taa→ st2_ret |           |
1827%                           ↓   /            |
1828%                      st2_after_ret →taaf→ st2'
1830%            we also ask that st2_after_ret be not labelled if the taaf tail is
1831%            not empty
1832%         *)
1833%         ∃st2_ret,st2_after_ret,st2'.
1834%         ∃taa2 : trace_any_any … st2 st2_ret.
1835%         ∃taa2' : trace_any_any_free … st2_after_ret st2'.
1836%         (if taaf_non_empty … taa2' then ¬as_costed … st2_after_ret else True) ∧
1837%         as_classifier … st2_ret cl_return ∧
1838%         as_execute … st2_ret st2_after_ret ∧ sim_status_rel st1' st2' ∧
1839%         ret_rel … sim_status_rel st1' st2_after_ret ∧
1840%         label_rel … st1' st2'
1841%       | cl_other ⇒
1842%           (*         
1843%           st1 → st1'
1844%             |      \
1845%             S      S,L
1846%             |        \
1847%            st2 →taaf→ st2'
1849%            the taaf can be empty (e.g. tunneling) but we ask it must not be the
1850%            case when both st1 and st1' are labelled (we would be able to collapse
1851%            labels otherwise)
1852%          *)
1853%         ∃st2'.
1854%         ∃taa2 : trace_any_any_free … st2 st2'.
1855%         (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
1856%         sim_status_rel st1' st2' ∧
1857%         label_rel … st1' st2'
1858%       | cl_jump ⇒
1859%         (* just like cl_other, but with a hypothesis more *)
1860%         as_costed … st1' →
1861%         ∃st2'.
1862%         ∃taa2 : trace_any_any_free … st2 st2'.
1863%         (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
1864%         sim_status_rel st1' st2' ∧
1865%         label_rel … st1' st2'
1866%       ]
1867%     ].
1868% \end{verbatim}
1869% \end{comment}
[3349]1871% \paragraph{Main result: the 1-to-many forward simulation conditions
1872% are sufficient to trace reconstruction}
1874% Let us assume that a relation set is given such that the 1-to-many
1875% forward simulation conditions are satisfied. Under this assumption we
1876% can prove the following three trace reconstruction theorems by mutual
1877% structural induction over the traces given in input between the
1878% $s_1$ and $s_1'$ states.
1880% In particular, the \verb+status_simulation_produce_tlr+ theorem
1881% applied to the \verb+main+ function of the program and equal
1882% $s_{2_b}$ and $s_2$ states shows that, for every initial state in the
1883% source code that induces a structured trace in the source code,
1884% the compiled code produces a similar structured trace.
1886% \begin{theorem}[\verb+status_simulation_produce_tlr+]
1887% For every $s_1,s_1',s_{2_b},s_2$ s.t.
1888% there is a $\verb+TLR+~s_1~s_1'$ called $tlr_1$ and a
1889% $\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
1890% $s_1 \S s_2$, there exists $s_{2_m},s_2'$ s.t.
1891% there is a $\verb+TLR+~s_{2_b}~s_{2_m}$ called $tlr_2$ and
1892% there is a $\verb+TAAF+~s_{2_m}~s_2'$ called $taaf$
1893% s.t. if $taaf$ is non empty then $\lnot (\ell~s_{2_m})$,
1894% and $tlr_1\approx tlr_2$
1895% and $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
1896% $s_1' \R s_{2_m}$.
1897% \end{theorem}
1899% The theorem states that a \verb+trace_label_return+ in the source code
1900% together with a precomputed preamble of silent states
1901% (the \verb+TAA+) in the target code induces a
1902% similar \verb+trace_label_return+ in the target code which can be
1903% followed by a sequence of silent states. Note that the statement does not
1904% require the produced \verb+trace_label_return+ to start with the
1905% precomputed preamble, even if this is likely to be the case in concrete
1906% implementations. The preamble in input is necessary for compositionality, e.g.
1907% because the 1-to-many forward simulation conditions allow in the
1908% case of function calls to execute a preamble of silent instructions just after
1909% the function call.
1911% Clearly similar results are also available for the other two types of structured
1912% traces (in fact, they are all proved simultaneously by mutual induction).
[3347]1913% \begin{theorem}[\verb+status_simulation_produce_tll+]
[3343]1914% For every $s_1,s_1',s_{2_b},s_2$ s.t.
[3347]1915% there is a $\verb+TLL+~b~s_1~s_1'$ called $tll_1$ and a
[3343]1916% $\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
1917% $s_1 \S s_2$, there exists $s_{2_m},s_2'$ s.t.
1918% \begin{itemize}
1919%  \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
[3347]1920%        and a trace $\verb+TLL+~b~s_{2_b}~s_{2_m}$ called $tll_2$
1921%        and a $\verb+TAAF+~s_{2_m}~s_2'$ called $taa_2$ s.t.
[3343]1922%        $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
1923%        $s_1' \R s_{2_m}$ and
1924%        $tll_1\approx tll_2$ and
1925%        if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
1926%  \item else there exists $s_2'$ and a
[3347]1927%        $\verb+TLL+~b~s_{2_b}~s_2'$ called $tll_2$ such that
[3343]1928%        $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
1929%        $tll_1\approx tll_2$.
1930% \end{itemize}
1931% \end{theorem}
1933% The statement is similar to the previous one: a source
[3347]1934% \verb+trace_label_label+ and a given target preamble of silent states
1935% in the target code induce a similar \verb+trace_label_label+ in the
[3343]1936% target code, possibly followed by a sequence of silent moves that become the
[3347]1937% preamble for the next \verb+trace_label_label+ translation.
[3347]1939% \begin{theorem}[\verb+status_simulation_produce_tal+]
[3343]1940% For every $s_1,s_1',s_2$ s.t.
[3347]1941% there is a $\verb+TAL+~b~s_1~s_1'$ called $tal_1$ and
[3343]1942% $s_1 \S s_2$
1943% \begin{itemize}
1944%  \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
[3347]1945%    and a trace $\verb+TAL+~b~s_2~s_{2_m}$ called $tal_2$ and a
1946%    $\verb+TAAF+~s_{2_m}~s_2'$ called $taa_2$ s.t.
[3343]1947%    $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
1948%    $s_1' \R s_{2_m}$ and
1949%    $tal_1 \approx tal_2$ and
1950%    if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
1951%  \item else there exists $s_2'$ and a
[3347]1952%    $\verb+TAL+~b~s_2~s_2'$ called $tal_2$ such that
[3343]1953%    either $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
1954%        $tal_1\approx tal_2$
1955%    or $s_1' \mathrel{{\S} \cap {\L}} s_2$ and
[3347]1956%    $\verb+tal_collapsable+~tal_1$ and $\lnot \ell~s_1$.
[3343]1957% \end{itemize}
1958% \end{theorem}
1960% The statement is also similar to the previous ones, but for the lack of
1961% the target code preamble.
1965For every $s_1,s_1',s_2$ s.t.
[3347]1966there is a $\verb+trace_label_return+~s_1~s_1'$ called $tlr_1$ and
[3343]1967$s_1 (\L \cap \S) s_2$
1968there exists $s_{2_m},s_2'$ s.t.
[3347]1969there is a $\verb+trace_label_return+~s_2~s_{2_m}$ called $tlr_2$ and
1970there is a $\verb+trace_any_any_free+~s_{2_m}~s_2'$ called $taaf$
1971s.t. if $taaf$ is non empty then $\lnot (\verb+as_costed+~s_{2_m})$,
1972and $\verb+tlr_rel+~tlr_1~tlr_2$
[3343]1973and $s_1' (\S \cap \L) s_2'$ and
1974$s_1' \R s_{2_m}$.
1980status_simulation_produce_tlr S1 S2 R
1981(* we start from this situation
1982     st1 →→→→tlr→→→→ st1'
1983      | \
1984      L  \---S--\
1985      |          \
1986   st2_lab →taa→ st2   (the taa preamble is in general either empty or given
1987                        by the preceding call)
1989   and we produce
1990     st1 →→→→tlr→→→→ st1'
1991             \\      /  \
1992             //     R    \-L,S-\
1993             \\     |           \
1994   st2_lab →tlr→ st2_mid →taaf→ st2'
1996  st1 st1' st2_lab st2
1997  (tlr1 : trace_label_return S1 st1 st1')
1998  (taa2_pre : trace_any_any S2 st2_lab st2)
1999  (sim_execute : status_simulation S1 S2 R)
2000  on tlr1 : R st1 st2 → label_rel … st1 st2_lab →
2001  ∃st2_mid.∃st2'.
2002  ∃tlr2 : trace_label_return S2 st2_lab st2_mid.
2003  ∃taa2 : trace_any_any_free … st2_mid st2'.
2004  (if taaf_non_empty … taa2 then ¬as_costed … st2_mid else True) ∧
2005  R st1' st2' ∧ ret_rel … R st1' st2_mid ∧ label_rel … st1' st2' ∧
2006  tlr_rel … tlr1 tlr2
2010\section{Conclusions and future works}
2014The labelling approach is a technique to implement compilers that induce on
2015the source code a non uniform cost model determined from the object code
2016produced. The cost model assigns a cost to each basic block of the program.
2017The main theorem of the approach says that there is an exact
2018correspondence between the sequence of basic blocks started in the source
2019and object code, and that no instruction in the source or object code is
2020executed outside a basic block. Thus the cost of object code execution
2021can be computed precisely on the source.
2023In this paper we scale the labelling approach to cover a programming language
2024with function calls. This introduces new difficulties only when the language
2025is unstructured, i.e. it allows function calls to return anywhere in the code,
2026destroying the hope of a static prediction of the cost of basic blocks.
2027We restore static predictability by introducing a new semantics for unstructured
2028programs that single outs well structured executions. The latter are represented
2029by structured traces, a generalisation of streams of observables that capture
2030several structural invariants of the execution, like well nesting of functions
2031or the fact that every basic block must start with a code emission statement.
2032We show that structured traces are sufficiently structured to statically compute
2033a precise cost model on the object code.
2035We introduce a similarity relation on structured traces that must hold between
2036source and target traces. When the relation holds for every program, we prove
2037that the cost model can be lifted from the object to the source code.
2039In order to prove that similarity holds, we present a generic proof of forward
2040simulation that is aimed at pulling apart as much as possible the part of the
2041simulation related to non-functional properties (preservation of structure)
2042from that related to functional properties. In particular, we reduce the
2043problem of preservation of structure to that of showing a 1-to-many
2044forward simulation that only adds a few additional proof obligations to those
2045of a traditional, function properties only, proof.
2047All results presented in the paper are part of a larger certification of a
2048C compiler which is based on the labelling approach. The certification, done
2049in Matita, is the main deliverable of the FET-Open Certified Complexity (CerCo).
2051The short term future work consists in the completion of the certification of
2052the CerCo compiler exploiting the main theorem of this paper.
2054\paragraph{Related works.}
2055CerCo is the first project that explicitly tries to induce a
2056precise cost model on the source code in order to establish non-functional
2057properties of programs on an high level language. Traditional certifications
2058of compilers, like~\cite{compcert2,piton}, only explicitly prove preservation
2059of the functional properties.
2061Usually forward simulations take the following form: for each transition
2062from $s_1$ to $s_2$ in the source code, there exists an equivalent sequence of
2063transitions in the target code of length $n$. The number $n$ of transition steps
2064in the target code can just be the witness of the existential statement.
2065An equivalent alternative when the proof of simulation is constructive consists
2066in providing an explicit function, called \emph{clock function} in the
2067literature~\cite{clockfunctions}, that computes $n$ from $s_1$. Every clock
2068function constitutes then a cost model for the source code, in the spirit of
2069what we are doing in CerCo. However, we believe our solution to be superior
2070in the following respects: 1) the machinery of the labelling approach is
2071insensible to the resource being measured. Indeed, any cost model computed on
2072the object code can be lifted to the source code (e.g. stack space used,
2073energy consumed, etc.). On the contrary, clock functions only talk about
2074number of transition steps. In order to extend the approach with clock functions
2075to other resources, additional functions must be introduced. Moreover, the
2076additional functions would be handled differently in the proof.
20772) the cost models induced by the labelling approach have a simple presentation.
2078In particular, they associate a number to each basic block. More complex
2079models can be induced when the approach is scaled to cover, for instance,
2080loop optimisations~\cite{loopoptimizations}, but the costs are still meant to
2081be easy to understand and manipulate in an interactive theorem prover or
2082in Frama-C.
2083On the contrary, a clock function is a complex function of the state $s_1$
2084which, as a function, is an opaque object that is difficult to reify as
2085source code in order to reason on it.
2090% \appendix
2091% \section{Notes for the reviewers}
2093% The results described in the paper are part of a larger formalization
2094% (the certification of the CerCo compiler). At the moment of the submission
2095% we need to single out from the CerCo formalization the results presented here.
2096% Before the 16-th of February we will submit an attachment that contains the
2097% minimal subset of the CerCo formalization that allows to prove those results.
2098% At that time it will also be possible to measure exactly the size of the
2099% formalization described here. At the moment a rough approximation suggests
2100% about 2700 lines of Matita code.
2102% We will also attach the development version of the interactive theorem
2103% prover Matita that compiles the submitted formalization. Another possibility
2104% is to backport the development to the last released version of the system
2105% to avoid having to re-compile Matita from scratch.
2107% The programming and certification style used in the formalization heavily
2108% exploit dependent types. Dependent types are used: 1) to impose invariants
2109% by construction on the data types and operations (e.g. a traces from a state
2110% $s_1$ to a state $s_2$ can be concatenad to a trace from a state
2111% $s_2'$ to a state $s_3$ only if $s_2$ is convertible with $s_2'$); 2)
2112% to state and prove the theorems by using the Russell methodology of
2113% Matthieu Sozeau\footnote{Subset Coercions in Coq in TYPES'06. Matthieu Sozeau. Thorsten Altenkirch and Conor McBride (Eds). Volume 4502 of Lecture Notes in Computer Science. Springer, 2007, pp.237-252.
[3347]2114% }, better known in the Coq world as ``\verb+Program+'' and reimplemented in a simpler way in Matita using coercion propagations\footnote{Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, Enrico Tassi: A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions. Logical Methods in Computer Science 8(1) (2012)}. However, no result presented depends
[3343]2115% mandatorily on dependent types: it should be easy to adapt the technique
2116% and results presented in the paper to HOL.
2118% Finally, Matita and Coq are based on minor variations of the Calculus of
2119% (Co)Inductive Constructions. These variations do not affect the CerCo
2120% formalization. Therefore a porting of the proofs and ideas to Coq would be
2121% rather straightforward.
Note: See TracBrowser for help on using the repository browser.