% \usepackage{amsmath}
% NB: might be worth removing this if changing class in favour of
% a built-in definition.
  {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
   morekeywords={[2]if,then,else,forall,Prop,match,with,end,let},
  }
  morekeywords={int8_t},
  mathescape,
  escapechar=\%
%         keywordstyle=\color{red}\bfseries,
%         keywordstyle=[2]\color{blue},
%         commentstyle=\color{green},
%         stringstyle=\color{blue},
        showspaces=false,showstringspaces=false,
%         xleftmargin=1em
        }
\pgfutil@ifundefined{pgf@arrow@code@implies}{% supply for lack of double arrow special arrow tip if it is not there%
  \pgfarrowsdeclare{implies}{implies}%
  {%
  \pgfarrowsleftextend{2.2pt}%
  \pgfarrowsrightextend{2.2pt}%
  }%
  {%
    \pgfsetdash{}{0pt} % do not dash%
    \pgfsetlinewidth{.33pt}%
    \pgfsetroundjoin   % fix join%
    \pgfsetroundcap    % fix cap%
    \pgfpathmoveto{\pgfpoint{-1.5pt}{2.5pt}}%
    \pgfpathcurveto{\pgfpoint{-.75pt}{1.5pt}}{\pgfpoint{0.5pt}{.5pt}}{\pgfpoint{2pt}{0pt}}%
    \pgfpathcurveto{\pgfpoint{0.5pt}{-.5pt}}{\pgfpoint{-.75pt}{-1.5pt}}{\pgfpoint{-1.5pt}{-2.5pt}}%
    \pgfusepathqstroke%
  }%
\tikzset{state/.style={inner sep = 0, outer sep = 2pt, draw, fill},
         every node/.style={inner sep=2pt},
         every on chain/.style = {inner sep = 0, outer sep = 2pt},
         join all/.style = {every on chain/.append style={join}},
         on/.style={on chain={#1}, state},
         m/.style={execute at begin node=$, execute at end node=$},
         node distance=3mm,
         is other/.style={circle, minimum size = 3pt, state},
         other/.style={on, is other},
         is jump/.style={diamond, minimum size = 6pt, state},
         jump/.style={on, is jump},
         is call/.style={regular polygon, regular polygon sides=3, minimum size=5pt, state},
         call/.style={on=going below, is call, node distance=6mm, label=above left:$#1$},
         is ret/.style={regular polygon, regular polygon sides=3, minimum size=5pt, shape border rotate=180, state},
         ret/.style={on=going above, is ret, node distance=6mm},
         chain/.style={start chain=#1 going left},
         rev ar/.style={stealth-, thick},
         ar/.style={-stealth, thick},
         every join/.style={rev ar},
         vcenter/.style={baseline={([yshift=-.5ex]current bounding box)}},
         every picture/.style={thick},
         double equal sign distance/.prefix style={double distance=1.5pt}, %% if already defined (newest version of pgf) it should be ignored%}
         implies/.style={double, -implies, thin, double equal sign distance, shorten <=5pt, shorten >=5pt},
         new/.style={densely dashed},
         newn/.style={solid, fill=white},
         rel/.style={font=\scriptsize, fill=white, inner sep=2pt},
         diag/.style={row sep={11mm,between origins},
                      column sep={11mm,between origins},
                      every node/.style={execute at begin node=$, execute at end node=$},
%                       every node/.style={draw, is other},
                      },
         small vgap/.style={row sep={7mm,between origins}},
         small hgap/.style={column sep={7mm,between origins}},
         small gap/.style={small vgap, small hgap},
         % for article, maybe temporary
         is jump/.style=is other,
         is call/.style=is other,
         is ret/.style=is other,
\def\L{\mathrel{\mathcal L}}
\def\S{\mathrel{\mathcal S}}
\def\R{\mathrel{\mathcal R}}
\def\C{\mathrel{\mathcal C}}
\def\Labels{\mathbb L}
\def\Functions{\mathbb F}
\savebox{\execbox}{\tikz[baseline=-.5ex]\draw [-stealth] (0,0) -- ++(1em, 0);}
% \newcommand{\exec}{\ensuremath{\mathrel{\usebox{\execbox}}}}
                                identifierstyle=\texttt]}
\title{Certification of the Preservation of Structure by a Compiler's Back-end Pass\thanks{The project CerCo acknowledges the financial support of the Future and
Emerging Technologies (FET) programme within the Seventh Framework
Programme for Research of the European Commission, under FET-Open grant
number: 243881}}
\author{Paolo Tranquilli \and Claudio Sacerdoti Coen}
\institute{Department of Computer Science and Engineering, University of Bologna,\\\email{}, \email{}}
The labelling approach is a technique to lift cost models for non-functional
properties of programs from the object code to the source code. It is based
on the preservation of the structure of the high-level program in every
intermediate language used by the compiler. Such structure is captured by
observables that are added to the semantics and that need to be preserved
by the forward simulation proof of correctness of the compiler. Additional
special observables are required for function calls. In this paper we
present a generic forward simulation proof that preserves all these observables.
The proof statement is based on a new mechanised semantics that traces the
structure of execution when the language is unstructured. The generic semantics
and simulation proof have been mechanised in the interactive theorem prover
The \emph{labelling approach} has been introduced in~\cite{easylabelling} as
a technique to \emph{lift} cost models for non-functional properties of programs
from the object code to the source code. Examples of non-functional properties
are execution time, the amount of stack/heap space consumed, and the energy required for
communication. The basic premise of the approach is that it is impossible to
provide a \emph{uniform} cost model for a high-level language that is preserved
\emph{precisely} by a compiler. For instance, two instances of an assignment
$x = y$ in the source code can be compiled very differently according to the
place (registers vs.\ stack) where $x$ and $y$ are stored at the moment of
execution. Therefore a precise cost model must assign a different cost
to every occurrence, and the exact cost can only be known after compilation.
According to the labelling approach, the compiler is free to compile and optimise
the source code without any major restriction, but it must keep track
of what happens to basic blocks during compilation. The cost model is
then computed on the object code. It assigns a cost to every basic block.
Finally, the compiler propagates the cost model back to the source level,
assigning a cost to each basic block of the source code.
Implementing the labelling approach in a certified compiler
allows one to reason formally on the high-level source code of a program to prove
non-functional properties that are guaranteed to be preserved by the compiler
itself. The trusted code base is then reduced to 1) the interactive theorem
prover (or its kernel) used in the certification of the compiler and
2) the software used to certify the property on the source language, which
can itself be certified, further reducing the trusted code base.
In~\cite{easylabelling} the authors provide an example of a simple
certified compiler that implements the labelling approach for the
imperative \verb+While+ language~\cite{while}, which does not have
pointers and function calls.
The labelling approach has been shown to scale to more interesting scenarios.
In particular, in~\cite{functionallabelling} it has been applied to a functional
language, and in~\cite{loopoptimizations} it has been shown that the approach
can be extended, at the price of some complication, to handle loop optimisations and, more generally,
program optimisations that do not preserve the structure of basic blocks.
On-going work also shows that the labelling approach is compatible with
the complex analyses required to obtain a cost model for object code
on processors that implement advanced features like pipelining, superscalar
architectures and caches.
In the European Project CerCo (Certified Complexity\footnote{\url{}})~\cite{cerco} we are certifying a labelling approach based compiler for a large subset of C to
8051 object code. The compiler is
moderately optimising and implements a compilation chain that is largely
inspired by that of CompCert~\cite{compcert1,compcert2}. Compared to the work done in~\cite{easylabelling}, the main novelty and source of difficulties is the presence
of function calls. Surprisingly, the addition of function calls requires a
revision of the proof technique given in~\cite{easylabelling}. In
particular, at the core of the labelling approach there is a forward
simulation proof that, in the case of \verb+While+, is only minimally
more complex than the proof required for the preservation of the
functional properties alone. In the case of a programming language with
function calls, instead, it turns out that the forward simulation proof for
the back-end languages, which are unstructured, must guarantee a whole new set of invariants.
In this paper we present a formalisation in the Matita interactive theorem
prover~\cite{matita1,matita2} of a generic version of the simulation proof required for unstructured
languages. All back-end languages of the CerCo compiler are unstructured
languages, so the proof covers half of the correctness of the compiler.
The statement of the generic proof is based on a new semantics
for imperative unstructured languages that is based on \emph{structured
traces} and that restores the preservation of structure in the observables of
the semantics. The generic proof makes it possible to almost completely separate the
part of the simulation that deals with functional properties only from the
part that deals with the preservation of structure.
The plan of this paper is the following. In Section~\ref{labelling} we
sketch the labelling method and the problems deriving from its application
to languages with function calls. In Section~\ref{semantics} we introduce
a generic description of an unstructured imperative language and the
corresponding structured traces (the novel semantics). In
Section~\ref{simulation} we describe the forward simulation proof.
Conclusions and future work are in Section~\ref{conclusions}.
\section{The labelling approach}
% \subsection{A brief introduction to the labelling approach}
We briefly explain the labelling approach, as introduced in~\cite{easylabelling},
on the example in \autoref{examplewhile}.
The user wants to analyse the execution time of the program (the black lines in
\autoref{subfig:example_input}). He compiles the program using
a special compiler that first inserts in the code three label emission
statements (\verb+EMIT $L_i$+) to mark the beginning of basic blocks
then the compiler compiles the code to machine code (\autoref{subfig:example_oc}),
guaranteeing that the execution of the source and object
code emits the same sequence of labels ($L_1; L_2; L_2; L_3$ in the example).
This is achieved by keeping track of basic blocks during compilation, avoiding
all optimisations that alter the control flow. The latter can be recovered with
a more refined version of the labelling approach~\cite{loopoptimizations}, but in the
present paper we stick to this simpler variant for the sake of presentation. Once the object
code is produced, the compiler runs a static code analyser to associate to
each label $L_1, \ldots, L_3$ the cost (in clock cycles) of the instructions
that belong to the corresponding basic block. For example, the cost $k_1$
associated to $L_1$ is the number of cycles required to execute the block
$I_3$ and \verb+COND $l_2$+, while the cost $k_2$ associated to $L_2$ counts the
cycles required by the block $I_4$, \verb+GOTO $l_1$+ and \verb+COND $l_2$+. The compiler also guarantees
that every executed instruction is in the scope of some code emission label,
that each scope does not contain loops (so that a finite cost can be associated), and that
both branches of a conditional statement are followed by a code emission
statement. Under these assumptions the total execution cost
of the program $\Delta_t$ is equal to the sum over the sequence of emitted
labels of the cost associated to every label:
$\Delta_t = k(L_1; L_2; L_2; L_3) = k_1 + k_2 + k_2 + k_3$.
Finally, the compiler emits an instrumented version of the source code
(\autoref{subfig:example_instrument}) where label emission statements are replaced
by increments of a global variable \verb+cost+ that, before every increment, holds the
exact number of clock cycles spent by the microprocessor so far:
the difference $\Deltacost$ between the final and initial value of the
internal clock is $\Deltacost = k_1 + k_2 + k_2 + k_3 = \Delta_t$. At this point, the
user can employ any available method (e.g.\ Hoare logic, invariant generators,
abstract interpretation and automated provers) to certify that $\Deltacost$
never exceeds a certain bound~\cite{cerco}, which is now a functional property
of the code.
|EMIT $L_1$;|
for (i=0; i<2; i++) {
  |EMIT $L_2$;|
  $I_2$;
|EMIT $L_3$;|
\caption{The input program (black lines) with its labelling (red lines).}
   EMIT $L_1$
   $I_3$
$l_1$:$\!$  COND $l_2$
   EMIT $L_2$
   $I_4$
   GOTO $l_1$
$l_2$:$\!$  EMIT $L_3$
\caption{The output labelled object code.}
cost += $k_1$;
for (i=0; i<2; i++) {
  cost += $k_2$;
  $I_2$;
cost += $k_3$;
\caption{The output instrumented code.}
% \begin{verbatim}
% EMIT L_1;                         EMIT L_1         cost += k_1;
% I_1;                              I_3              I_1;
% for (i=0; i<2; i++) {        l_1: COND l_2         for (i=0; i<2; i++) {
%   EMIT L_2;                       EMIT L_2           cost += k_2;
%   I_2;                            I_4                I_2;
%  }                                GOTO l_1          }
% EMIT L_3;                    l_2: EMIT L_3         cost += k_3;
% \end{verbatim}
\caption{The labelling approach applied to a simple program. The $I_i$ are sequences of instructions not containing jumps or loops.\label{examplewhile}}
\section{Extending the labelling approach to function calls}
Let us now consider a simple program written in C that contains a function
pointer call inside the scope of the cost label $L_1$, shown in \autoref{subfig:call_input}.
The labelling method works exactly as before, inserting
code emission statements/\verb+cost+ variable increments at the beginning
of every basic block and at the beginning of every function. The compiler
still guarantees that the sequence of labels observed on the two programs is
the same. A new difficulty appears when the compiler needs to statically
analyse the object code to assign a cost to every label. What should the scope
of the $L_1$ label be? After executing the $I_4$ block, the \verb+CALL+
statement passes control to a function that cannot be determined statically.
Therefore the cost of executing the body must be paid by some other label
(hence the requirement that every function starts with a code emission
statement). What label should pay for the cost of the block $I_5$? The only
reasonable answer is $L_1$, i.e.\ \emph{the scope of labels should extend to the
next label emission statement or to the end of the function, stepping over function calls}.
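Concretely, with this extended notion of scope the static analysis charges to a label everything executed between its emission and the next one, including the code run after a call returns. For the object code of the example below this amounts to the following (a sketch, writing $k(\cdot)$ for the cycles needed by a block or instruction; whether the cost of the \verb+CALL+ and \verb+RETURN+ instructions is charged to the caller's or the callee's label is a bookkeeping choice we leave open here):
\[
k_1 = k(I_4) + k(\mathtt{CALL}) + k(I_5)
\qquad
k_2 = k(I_6)
\]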
void main() {
  EMIT $L_1$;
  $I_1$;
  (*f)();
  $I_2$;
void g() {
  EMIT $L_2$;
  $I_3$;
\caption{The input labelled C program.}
  EMIT $L_1$
  $I_4$
  CALL
  $I_5$
  EMIT $L_2$
  $I_6$
\caption{The output labelled object code.}
\caption{An example compilation of a simple program with a function pointer
         call.}
The latter definition of scope is adequate at the source level because
C is a structured language that guarantees that every function call, if it
returns, passes control to the first instruction that follows the call. However,
this is not guaranteed for object code, for the back-end languages of a compiler
and, more generally, for unstructured
languages that use a writable control stack to store the return addresses of
calls. For example, $I_6$ could increment by $1$ the return address on the
stack so that the next \verb+RETURN+ would start at the second instruction
of $I_5$. The compiler would still be perfectly correct if a random, dead
code instruction was added after the \verb+CALL+ as the first instruction of $I_5$. More generally,
\emph{there is no guarantee that a correct compiler that respects the functional
behaviour of a program also respects the calling structure of the source code}.
Without such an assumption, however, it may not be true that the execution cost
of the program is the sum of the costs associated to the labels emitted. In our
example, the cost of $I_5$ is paid by $L_1$, but in place of $I_5$ the processor could execute any other code after $g$ returns.
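To make the point concrete, the body of $g$ in such a pathological but functionally correct compilation could look as follows (an illustrative pseudo-assembly sketch, not the actual code produced by the compiler; it assumes the pass planted a dead one-byte \verb+NOP+ after every \verb+CALL+):
\begin{lstlisting}
g: EMIT $L_2$
   $I_6$
   POP  r0     ; fetch the return address from the control stack
   INC  r0     ; skip the dead NOP planted after the CALL
   PUSH r0     ; store the adjusted address back
   RETURN      ; control resumes one byte past the call point
\end{lstlisting}
The run emits exactly the same observables as the well-behaved code, yet the instructions executed after the return are no longer the ones whose cost was charged to $L_1$.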
Obviously, any reasonably written compiler produces object code that behaves
as if the language were structured (i.e.\ by properly nesting function
calls/returns and without tampering with the return addresses on the control
stack). This property, however, is a property of the runs of object code
programs, and not a property of the object code that can easily be statically
verified (like the ones required in \autoref{labelling} in the absence of function calls).
Therefore, we now need to single out those runs whose cost behaviour can be
statically predicted, and we need to prove that every run of a program generated
by our compiler is of that type. We call such runs \emph{structured}, since their
main property is to respect properties that hold for free on the source code
because of its structure. Moreover, in order to avoid proving
too many preservation properties of our compiler, we drop the original
requirements on the object code (all instructions must be in the scope of some label,
no loops inside a scope, etc.) in favour of the corresponding requirements
for structured runs (a structured run must start with a label emission, no
instruction can be executed twice between two emissions, etc.).
We will therefore proceed as follows. In the following section,
1) we formally introduce the notion of
structured trace, which captures structured runs in the style of labelled
transition systems; 2) we show that on the object code we can correctly
compute the execution time of a structured run from the sequence of labels
observed; 3) we show that on the source code we can correctly compute the
execution time of a program if the compiler produces object code whose
runs are structured and weakly similar to the source code runs.
The proof of correctness of such a compiler is harder than a traditional
proof of preservation of functional properties, and a standard forward
simulation argument does not work. In \autoref{simulation} we present
a refinement of forward simulation that grants all required correctness
All the definitions and theorems presented in the paper have been formalised
in the interactive theorem prover Matita and are being used to certify
the complexity preserving compiler developed in the CerCo project~\cite{cerco}.
The formalisation can be
found at~\ref{YYY} and it heavily relies on algebraic and dependent types for
both structured traces and the definition of weak similarity. In the paper
we did not try to stay close to the formalisation. On the contrary,
the definitions given in the paper are the result of a significant
simplification effort for
the sake of presentation and to ease the re-implementation of the
concepts in a proof assistant which is not based on the Calculus of Inductive
Constructions. In any case the formalisation is heavily commented to allow the
reader to understand its technical details.
% @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
% We briefly sketch here a simplified version of the labelling approach as
% introduced in~\cite{easylabelling}. The simplification strengthens the
% sufficient conditions given in~\cite{easylabelling} to allow a simpler
% explanation. The simplified conditions given here are also used in the
% CerCo compiler to simplify the proof.
% Let $\mathcal{P}$ be a programming language whose semantics is given in
% terms of observables: a run of a program yields a finite or infinite
% stream of observables. We also assume for the time being that function
% calls are not available in $\mathcal{P}$. We want to associate a cost
% model to a program $P$ written in $\mathcal{P}$. The first step is to
% extend the syntax of $\mathcal{P}$ with a new construct $\verb+emit L+$
% where $L$ is a label distinct from all observables of $\mathcal{P}$.
% The semantics of $\verb+emit L+$ is the emission of the observable
% \verb+L+ that is meant to signal the beginning of a basic block.
% There exists an automatic procedure that injects into the program $P$ an
% $\verb+emit L+$ at the beginning of each basic block, using a fresh
% \verb+L+ for each block. In particular, the bodies of loops, both branches
% of \verb+if-then-else+s and the targets of \verb+goto+s must all start
% with an emission statement.
% Let now $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
% that is organised in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
% language used by the compiler. We can easily extend every
% intermediate language (and its semantics) with an $\verb+emit L+$ statement
% as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
% the additional difficulty that the syntax of object code is given as a
% sequence of bytes. The injection of an emission statement in the object code
% can be done using a map that maps two consecutive code addresses with the
% statement. The intended semantics is that, if $(pc_1,pc_2) \mapsto \verb+emit L+$ then the observable \verb+L+ is emitted after the execution of the
% instruction stored at $pc_1$ and before the execution of the instruction
% stored at $pc_2$. The two program counters are necessary because the
% instruction stored at $pc_1$ can have multiple possible successors (e.g.
% in case of a conditional branch or an indirect call). Dually, the instruction
% stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
% target of a jump).
% The compiler, to be functionally correct, must preserve the observational
% equivalence, i.e. executing the program after each compiler pass should
% yield the same stream of observables. After the injection of emission
% statements, observables now capture both functional and non-functional
% behaviours.
% This correctness property is called in the literature a forward simulation
% and is sufficient for correctness when the target language is
% deterministic~\cite{compcert3}.
% We also require a stronger, non-functional preservation property: after each
% pass all basic blocks must start with an emission statement, and all labels
% \verb+L+ must be unique.
% Now let $M$ be the object code obtained for the program $P$. Let us suppose
% that we can statically inspect the code $M$ and associate to each basic block
% a cost (e.g. the number of clock cycles required to execute all instructions
% in the basic block, or an upper bound to that time). Every basic block is
% labelled with an unique label \verb+L+, thus we can actually associate the
% cost to \verb+L+. Let call it $k(\verb+L+)$.
% The function $k$ is defined as the cost model for the object code control
% blocks. It can be equally used as well as the cost model for the source
% control blocks. Indeed, if the semantics of $P$ is the stream
% $L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
% also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$ because
% every instruction belongs to a control block and every control block is
% labelled. Thus it is correct to say that the execution cost of $P$ is also
% $\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
% the blocks of the high level program $P$ that is preserved by compilation.
% How can the user profit from the high level cost model? Suppose, for instance,
% that he wants to prove that the WCET of his program is bounded by $c$. It
% is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
% a purely functional property of the code. He can therefore use any technique
% available to certify functional properties of the source code.
% What is suggested in~\cite{easylabelling} is to actually instrument the
% source code $P$ by replacing every label emission statement
% $\verb+emit L+$ with the instruction $\verb+cost += k(L)+$ that increments
% a global fresh variable \verb+cost+. The bound is now proved by establishing
% the program invariant $\verb+cost+ \leq c$, which can be done for example
% using the Frama-C~\cite{framaC} suite if the source code is some variant of
% C.
% In order to extend the labelling approach to function calls we make
% \verb+CALL f+ emit the observable \verb+f+ and \verb+RET+ emit a distinguished observable
% \verb+ret+.
% For example the following execution history of the program in \autoref{fig:esempio}
% $$I_1; \verb+CALL f+; \verb+COND l+; \verb+EMIT $L_2$+; I_3; \verb+RET+; I_2; \verb+RET+$$
% emits the trace
% $$\verb+main+, \verb+f+$$
% \begin{figure}
% \hfil
% \begin{minipage}{.2\linewidth}
% \begin{lstlisting}
% main: $\!I_1$
%       CALL f
%       $I_2$
%       RET
% \end{lstlisting}
% \end{minipage}
% \begin{minipage}{.1\linewidth}
% \begin{lstlisting}
% main
% main
% main
% main
% \end{lstlisting}
% \end{minipage}
% \hfil
% \begin{minipage}{.2\linewidth}
% \begin{lstlisting}
% f: $\!$COND l
%    EMIT $L_2$
%    RET
% l: $\!$EMIT $L_3$
%    $I_3$
%    RET
% \end{lstlisting}
% \end{minipage}
% \begin{minipage}{.1\linewidth}
% \begin{lstlisting}
% f
% $L_2$
% $L_3$
% $L_3$
% \end{lstlisting}
% \end{minipage}
% \hfil{}
% \caption{}
% \label{fig:esempio}
% \end{figure}
% \subsection{Labelling function calls}
% We now want to extend the labelling approach to support function calls.
% On the high level, \emph{structured} programming language $\mathcal{P}$ there
% is not much to change.
% When a function is invoked, the current basic block is temporarily exited
% and the basic block the function starts with take control. When the function
% returns, the execution of the original basic block is resumed. Thus the only
% significant change is that basic blocks can now be nested. Let \verb+E+
% be the label of the external block and \verb+I+ the label of a nested one.
% Since the external starts before the internal, the semantics observed will be
% \verb+E I+ and the cost associated to it on the source language will be
% $k(\verb+E+) + k(\verb+I+)$, i.e. the cost of executing all instructions
% in the block \verb+E+ first plus the cost of executing all the instructions in
% the block \verb+I+. However, we know that some instructions in \verb+E+ are
% executed after the last instruction in \verb+I+. This is actually irrelevant
% because we are here assuming that costs are additive, so that we can freely
% permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
% the function call terminates and yields back control to the basic block
% \verb+E+. If the call diverges, the instrumentation
% $\verb+cost += k(E)+$ executed at the beginning of \verb+E+ is still valid,
% but just as an upper bound to the real execution cost: only precision is lost.
% Let now consider what happens when we move down the compilation chain to an
% unstructured intermediate or final language. Here unstructured means that
% the only control operators are conditional and unconditional jumps, function
% calls and returns. Unlike a structured language, though, there is no guarantee
% that a function will return control just after the function call point.
% The semantics of the return statement, indeed, consists in fetching the
% return address from some internal structure (typically the control stack) and
% jumping directly to it. The code can freely manipulate the control stack to
% make the procedure returns to whatever position. Indeed, it is also possible
% to break the well nesting of function calls/returns.
% Is it the case that the code produced by a correct compiler must respect the
% additional property that every function returns just after its function call
% point? The answer is negative and the property is not implied by forward
% simulation proofs. For instance, imagine to modify a correct compiler pass
% by systematically adding one to the return address on the stack and by
% putting a \verb+NOP+ (or any other instruction that takes one byte) after
% every function call. The obtained code will be functionally indistinguishable,
% and the added instructions will all be dead code.
% This lack of structure in the semantics badly interferes with the labelling
% approach. The reason is the following: when a basic block labelled with
% \verb+E+ contains a function call, it no longer makes any sense to associate
% to a label \verb+E+ the sum of the costs of all the instructions in the block.
% Indeed, there is no guarantee that the function will return into the block and
% that the instructions that will be executed after the return will be the ones
% we are paying for in the cost model.
% How can we make the labelling approach work in this scenario? We only see two
% possible ways. The first one consists in injecting an emission statement after
% every function call: basic blocks no longer contain function calls, but are now
% terminated by them. This completely solves the problem and allows the compiler
% to break the structure of function calls/returns at will. However, the
% technique has several drawbacks. First of all, it greatly augments the number
% of cost labels that are injected in the source code and that become
% instrumentation statements. Thus, when reasoning on the source code to prove
% non-functional properties, the user (or the automation tool) will have to handle
% larger expressions. Second, the more labels are emitted, the more difficult it
% becomes to implement powerful optimisations respecting the code structure.
% Indeed, function calls are usually implemented in such a way that most registers
% are preserved by the call, so that the static analysis of the block is not
% interrupted by the call and an optimisation can involve both the code before
% and after the function call. Third, instrumenting the source code may require
% unpleasant modification of it. Take, for example, the code
% \verb+f(g(x));+. We need to inject an emission statement/instrumentation
% instruction just after the execution of \verb+g+. The only way to do that
% is to rewrite the code as \verb+y = g(x); emit L; f(y);+ for some fresh
% variable \verb+y+. It is pretty clear how in certain situations the obtained
% code would be more obfuscated and then more difficult to manually reason on.
% For the previous reasons, in this paper and in the CerCo project we adopt a
% different approach. We do not inject emission statements after every
% function call. However, we want to propagate a strong additional invariant in
% the forward simulation proof. The invariant is the propagation of the structure
% of the original high level code, even if the target language is unstructured.
% The structure we want to propagate, that will become more clear in the next
% section, comprises 1) the property that every function should return just after
% the function call point, which in turns imply well nesting of function calls;
% 2) the property that every basic block starts with a code emission statement.
% In the original labelling approach of~\cite{easylabelling}, the second property
% was granted syntactically as a property of the generated code.
% In our revised approach, instead, we will impose the property on the runs:
% it will be possible to generate code that does not respect the syntactic
% property, as soon as all possible runs respect it. For instance, dead code will no longer
683% be required to have all basic blocks correctly la. The switch is suggested
684% from the fact that the first of the two properties --- that related to
685% function calls/returns --- can only be defined as property of runs,
686% not of the static code. The switch is
687% beneficial to the proof because the original proof was made of two parts:
688% the forward simulation proof and the proof that the static property was granted.
689% In our revised approach the latter disappears and only the forward simulation
690% is kept.
692% In order to capture the structure semantics so that it is preserved
693% by a forward simulation argument, we need to make the structure observable
694% in the semantics. This is the topic of the next section.
696\subsection{Structured traces}
Let us consider a generic unstructured language already equipped with a
small-step structural operational semantics (SOS). We introduce a
deterministic labelled transition system $(S,\Lambda,\to)$
that refines the
SOS by observing function calls/returns and the beginning of basic blocks.
704$S$ is the set of states of the program and
705$\Lambda = \{ \tau, RET \} \cup \Labels \cup \Functions$
706where $\Functions$ is the set of names of functions that can occur in the
707program, $\Labels$ is a set of labels disjoint from $\Functions$
and $\tau$ and $RET$ do not belong to $\Functions \cup \Labels$. Moreover, there
is an injective function $\ell : \Functions \to \Labels$ that gives the
starting label of the body of each function; $\ell(\Functions)\subseteq \Labels$
denotes the image of this function.
712The transition function is defined as $s_1 \to[o] s_2$ if
713$s_1$ moves to $s_2$ according to the SOS and $o = f \in \Functions$ if
714the function $f$ is called, $o = RET$ if a \verb+RETURN+ is executed,
715$o = L \in \Labels$ if an \verb+EMIT $L$+ is executed to signal the
716beginning of a basic block, and $o = \tau$ in all other cases.
Because we assume the language to be deterministic, the observable emitted can
actually be computed simply by observing $s_1$. Finally, $S$ is also endowed with
719a relation $s\ar s'$ ($s'$ \emph{follows} $s$) that holds when the instruction
720to be executed in $s'$ follows syntactically the one in $s$ in the source program.
722In the rest of the paper we write $s_0 \to^{*} s_n$ for the finite execution
723fragment $T = s_0 \to[o_0] s_1 \to[o_1] \ldots \to[o_{n-1}] s_n$
and we call the \emph{weak trace} of $T$ (denoted by $|T|$) the
subsequence $o_{i_0} \ldots o_{i_m}$ of $o_0 \ldots o_{n-1}$ obtained by dropping
every internal action $\tau$.
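To fix intuition, the weak trace can be computed concretely. The following Python sketch is purely illustrative and not part of the Matita formalisation; the string encoding of observables is our own assumption:

```python
TAU = "tau"  # the internal action; every other string is an observable

def weak_trace(fragment):
    """Return |T|: the subsequence of the fragment's observables
    obtained by dropping every internal action tau."""
    return [o for o in fragment if o != TAU]

# A fragment that emits label L0, calls f (whose body starts by
# emitting ell(f)), performs some internal steps and returns.
T = ["L0", TAU, "f", "ell_f", TAU, TAU, "RET"]
print(weak_trace(T))  # ['L0', 'f', 'ell_f', 'RET']
```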
728%Let $k$ be a cost model for observables actions that maps elements of
729%$\Lambda \setminus \{\tau\}$ to any commutative cost monoid
730%(e.g. natural numbers). We extend the domain of $k$ to executable fragments
731%by posing $k(T) = \Sigma_{o \in |T|} k(o)$.
733\paragraph{Structured execution fragments}
734Among all possible finite execution fragments we want to
735identify the ones that satisfy the requirements we sketched in the previous
736section. We say that an execution fragment
737$s_0 \to^{*} s_n$
738is \emph{structured} (and we denote it as $s_0 \To s_n$) iff the following conditions
739are met.
 \item For every $i$, if $s_i \to[f] s_{i+1}$ then there is a
   $k\ge i+2$ such that
    $s_{i+1} \to[\ell(f)] s_{i+2} \To s_k \to[RET] s_{k+1}$, with
    $s_i \ar s_{k+1}$.
   In other words, $s_{i+1}$ must start execution with \verb+EMIT $\ell(f)$+
   --- so that every instruction falls within the scope of some label ---
   and then continue with a structured fragment returning control
   to the instruction immediately following the call.
   The condition also enforces convergence of every function call, which is
   necessary to bound the cost of the fragment. Note that
   non-convergent programs may still have structured execution fragments
   that are worth measuring. For example, we can measure the reaction time
   of a server implemented as an unbounded loop whose body waits for an
   input, processes it and performs an output before looping: the processing
   steps form a structured execution fragment.
 \item The number of $RET$'s in the fragment is equal to the number of
   calls performed. In combination with the previous condition, this ensures
   well-bracketing of function calls.
760 \item
761   \label{req3}
762   For every $i$ and $f$, if $s_{i+1}\to[\ell(f)]s_{i+2}$ then
763   $s_i\to[f]s_{i+1}$. This is a technical condition needed to ensure that a
764   label associated with a function is only used at the beginning of its
765   body. Its use will become clear in~\autoref{simulation}.
 \item For every $i$, if the instruction to be executed in $s_i$ is a
   conditional branch, then there is an $L$ such that $s_{i+1} \to[L] s_{i+2}$ or, equivalently, $s_{i+1}$ must start execution with an
   \verb+EMIT $L$+. This captures the requirement that every branch which is
   live code must start with a label emission. Otherwise, it would be possible
   to have conditional instructions whose branches are assigned different
   costs, making it impossible to assign a single cost to the label whose scope
   contains the jump.
One might wonder why $f$ and $\ell(f)$, which always appear in this order, are not
collapsed into a single observable. This would simplify some aspects of the
formalisation at the price of complicating others. For example, we would have to add special
cases when the fragment starts at the beginning of a function body
(e.g. the one of \texttt{main}) because in that case nobody would have emitted
the observable $\ell(f)$. We plan to compare the two approaches in the future.
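The first two conditions above can be checked mechanically on the observables of a fragment. The sketch below is a simplified Python illustration under a hypothetical string encoding (\verb+call:f+ for a call to $f$, \verb+RET+ for returns, anything else a label emission); it is not the formal definition, which constrains the fragment itself rather than the sequence of observables:

```python
def check_calls(weak, ell):
    """Check, on a sequence of observables, that every call to f is
    immediately followed by the emission of ell(f), and that calls
    and RETs are balanced and well nested."""
    depth = 0
    for i, o in enumerate(weak):
        if o.startswith("call:"):
            f = o[len("call:"):]
            if i + 1 >= len(weak) or weak[i + 1] != ell[f]:
                return False  # body must start with EMIT ell(f)
            depth += 1
        elif o == "RET":
            depth -= 1
            if depth < 0:
                return False  # a RET without a matching call
    return depth == 0  # as many RETs as calls

ell = {"f": "L_f", "g": "L_g"}
print(check_calls(["L0", "call:f", "L_f", "RET"], ell))  # True
print(check_calls(["L0", "call:f", "L1", "RET"], ell))   # False
```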
781\paragraph{Measurable execution fragments and their cost prediction.}
782The first main theorem of CerCo deals with the object code.
783It states that the execution cost of
certain execution fragments, which we call \emph{measurable fragments}, can be
785computed from their weak trace by choosing the cost model $k$ that assigns to
786any label the cost (in clock cycles) of the instructions in its scope, and
787$0$ to function calls and $RET$ observables.
790 \label{static}
 for every measurable fragment $T = s_0 \to^{*} s_n$,\\
792 $$\Delta_t := \verb+clock+_{s_n} - \verb+clock+_{s_0} = \Sigma_{o \in |T|} k(o)$$
795An execution fragment $s_0 \to^{*} s_n$ is
796measurable if it is structured (up to a possible final \texttt{RETURN}) and
797if it does not start or end in the middle of a basic block.
Ending in the middle of a block would mean that the last label encountered
would have pre-paid for more instructions than the ones actually executed; starting in the
800middle would mean not paying any instruction up to the first label emission.
Formally, $s_0 \to^{*} s_n$ is measurable iff $o_0 \in \Labels$ (or, equivalently,
in $s_0$ the program must emit a label) and either
$s_0 \To s_{n-1}$ and $s_{n-1}\to[RET]s_n$, or
$s_0 \To s_n$ and the next instruction to be executed in $s_n$ is a label
emission statement.
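Continuing the illustrative Python encoding (our own simplification, not the formal statement verified in Matita), the right-hand side of the theorem is just a fold of the cost model over the weak trace, with call and $RET$ observables costing $0$:

```python
def fragment_cost(weak, k):
    """Predicted cost of a measurable fragment: sum k(L) over the
    emitted labels; observables outside k (calls, RET) cost 0."""
    return sum(k.get(o, 0) for o in weak)

# Hypothetical cost model: the block labelled L0 pre-pays 4 clock
# cycles, the one labelled L_f pre-pays 7.
k = {"L0": 4, "L_f": 7}
print(fragment_cost(["L0", "call:f", "L_f", "RET"], k))  # 11
```

The theorem then states that, for a measurable fragment, this sum equals the difference of the hardware clock between the final and initial states.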
807%\textbf{CSC: PROVA----------------------}
808% The theorem is proved by structural induction over the structured
809% trace, and is based on the invariant that
810% iff the function that computes the cost model has analysed the instruction
811% to be executed at $s_2$ after the one to be executed at $s_1$, and if
812% the structured trace starts with $s_1$, then eventually it will contain also
813% $s_2$. When $s_1$ is not a function call, the result holds trivially because
814% of the $s_1\exec s_2$ condition obtained by inversion on
815% the trace. The only non
816% trivial case is the one of function calls: the cost model computation function
817% does recursion on the first instruction that follows that function call; the
818% \verb+as_after_return+ condition of the \verb+tal_base_call+ and
819% \verb+tal_step_call+ grants exactly that the execution will eventually reach
820% this state.
822\paragraph{Weak similarity and cost invariance.}
823Given two deterministic unstructured programming languages with their own
824operational semantics, we say that two execution fragments are
825\emph{weakly trace equivalent} if their weak traces are equal.
827A compiler (pass) that preserves the program semantics also preserves weak
828traces and propagates measurability iff for every measurable
829fragment $T_1 = s_1 \to^{*} s_1'$ of the source code, the corresponding
830execution fragment $T_2 = s_2 \to^{*} s_2'$ of the object code is measurable
and $T_1$ and $T_2$ are weakly trace equivalent. The intuitive notion of
``corresponding fragment'' is made precise in the forward simulation proof of
preservation of the semantics of the program, which requires that $s_2$ and $s_1$
be in a certain relation. Clearly the property holds for a compiler if it
holds for each compiler pass.
837Having proved in~\autoref{static} that the statically computed cost model is
838accurate for the object code, we get as a corollary that it is also accurate
839for the source code if the compiler preserves weak traces and
propagates measurability. Thus, as prescribed by the CerCo methodology~\cite{fopara}, it becomes possible to compute the cost model
on the object code, transfer it to the source code and then reason comfortably
on the source code only.
845Given a compiler that preserves weak traces and propagates measurability,
for every measurable execution fragment $T_1 = s_1 \to^{*} s_1'$ of the source
847code such that $T_2 = s_2 \to^{*} s_2'$ is the corresponding fragment of the
848object code,
850$$\Delta_t := \verb+clock+_{s_2'} - \verb+clock+_{s_2} = \Sigma_{o \in |T_2|} k(o) = \Sigma_{o \in |T_1|} k(o)$$
853\section{Proving the compiler correctness}
855Because of \autoref{preservation}, to certify a compiler for the labelling
approach we need to prove both that it respects the functional semantics of the
program, and that it preserves weak traces and propagates measurability.
We achieve this by independently proving the three properties for each compiler
pass.
The first property is standard and can be proved by means of a forward simulation argument (see for example~\cite{compcert3}) that runs as follows.
First, a relation between corresponding
source and target states is defined. Then, a lemma establishes
863a local simulation condition: given two states in relation, if the source
864one performs one step then the target one performs zero or more steps and
865the two resulting states are synchronized again according to the relation.
866No requirements are imposed on the intermediate target states.
867Finally, the lemma is iterated over the execution trace to establish the
868final result.
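The iteration of the local lemma can be pictured with a toy Python sketch; the step function, relation and states below are hypothetical and chosen only to exhibit the shape of the argument, and termination of the inner loop is exactly what the local simulation lemma guarantees:

```python
def replay(src_trace, tgt_start, tgt_step, R):
    """Iterate the local simulation lemma along a source trace: for
    each source step, let the target perform zero or more steps until
    the two states are related again by R. Assumes R holds between the
    initial states. Returns the final target state."""
    t = tgt_start
    for s in src_trace[1:]:
        while not R(s, t):  # intermediate target states unconstrained
            t = tgt_step(t)
    return t

# Toy instance: the target counter runs twice as fast as the source
# one, and the relation R demands t == 2 * s.
final = replay([0, 1, 2, 3], 0, lambda t: t + 1, lambda s, t: t == 2 * s)
print(final)  # 6
```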
870In principle, preservation of weak traces could be easily shown with the same
871argument (and also at the same time). Surprisingly, propagation of
872measurability cannot. What makes the standard forward
873simulation proof work is the fact that usually a compiler pass performs some
kind of local or global analysis of the code followed by a compositional,
order-preserving translation of every instruction. In order to produce structured
876traces, however, code emission cannot be fully compositional any longer,
877and requirements need to be enforced on intermediate target states.
For example, consider requirement~\ref{req3}, which asks every function body
to start with a label emission statement. Some compiler passes must
add preambles to functions, for example to take care of the parameter passing
convention. In order not to violate the requirement, the preamble must be
inserted after the label emission. In the forward simulation proof, however,
function call steps in the source language are simulated by the new function
call followed by the execution of the preamble, and only at the end of the
preamble are the reached states again in the expected relation. In the meantime,
however, the object code has already performed the label emission statement,
which still needs to be executed in the source code, breaking forward simulation.
Another reason why the standard argument breaks is the requirement that
function calls should yield control back just after the calling point. Some passes
need to translate a function call into a function call followed by some
instructions (for example to restore caller-saved registers in the pass that
implements the parameter passing convention). In the forward simulation proof, these
instructions are taken care of when simulating the \texttt{RETURN} step:
the state just after the return in the source code is matched by the state $s_2$
reached after these instructions in the object code. However, the aforementioned requirement
does not involve $s_2$, but the intermediate state reached just after the return in the object
code. Therefore this requirement too
cannot be enforced with the standard forward simulation argument.
In this section we present a modified forward simulation argument that
can be used to prove at once that a compiler preserves the semantics of the
program and its weak traces, and that it propagates measurability.
906% @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
908% The program semantics adopted in the traditional labelling approach is based
909% on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
910% a set of states $\S$, the semantics of one deterministic execution
911% step is
912% defined as a function $S \to S \times O^*$ where $O^*$ is a (finite) stream of
913% observables. The semantics is then lifted compositionally to multiple (finite
914% or infinite) execution steps.
915% Finally, the semantics of a a whole program execution is obtained by forgetting
916% about the final state (if any), yielding a function $S \to O^*$ that given an
917% initial status returns the finite or infinite stream of observables in output.
919% We present here a new definition of semantics where the structure of execution,
920% as defined in the previous section, is now observable. The idea is to replace
921% the stream of observables with a structured data type that makes explicit
922% function call and returns and that grants some additional invariants by
923% construction. The data structure, called \emph{structured traces}, is
924% defined inductively for terminating programs and coinductively for diverging
925% ones. In the paper we focus only on the inductive structure, i.e. we assume
926% that all programs that are given a semantics are total. The Matita formalisation
927% also shows the coinductive definitions. The semantics of a program is then
928% defined as a function that maps an initial state into a structured trace.
930% In order to have a definition that works on multiple intermediate languages,
931% we abstract the type of structure traces over an abstract data type of
932% abstract statuses, which we aptly call $\verb+abstract_status+$. The fields
933% of this record are the following.
934% \begin{itemize}
935%  \item \verb+S : Type[0]+, the type of states.
936%  \item \verb+as_execute : S $\to$ S $\to$ Prop+, a binary predicate stating
937%  an execution step. We write $s_1\exec s_2$ for $\verb+as_execute+~s_1~s_2$.
938%  \item \verb+as_classifier : S $\to$ classification+, a function tagging all
939%  states with a class in
940%  $\{\verb+cl_return,cl_jump,cl_call,cl_other+\}$, depending on the instruction
941%  that is about to be executed (we omit tail-calls for simplicity). We will
942%  use $s \class c$ as a shorthand for both $\verb+as_classifier+~s=c$
943%  (if $c$ is a classification) and $\verb+as_classifier+~s\in c$
944%  (if $c$ is a set of classifications).
945%  \item \verb+as_label : S $\to$ option label+, telling whether the
946%  next instruction to be executed in $s$ is a cost emission statement,
947%  and if yes returning the associated cost label. Our shorthand for this function
948%  will be $\ell$, and we will also abuse the notation by using $\ell~s$ as a
949%  predicate stating that $s$ is labelled.
950%  \item \verb+as_call_ident : ($\Sigma$s:S. s $\class$ cl_call) $\to$ label+,
951%  telling the identifier of the function which is being called in a
952%  \verb+cl_call+ state. We will use the shorthand $s\uparrow f$ for
953%  $\verb+as_call_ident+~s = f$.
954%  \item \verb+as_after_return : ($\Sigma$s:S. s $\class$ cl_call) $\to$ S $\to$ Prop+,
955%  which holds on the \verb+cl_call+ state $s_1$ and a state $s_2$ when the
956%  instruction to be executed in $s_2$ follows the function call to be
957%  executed in (the witness of the $\Sigma$-type) $s_1$. We will use the notation
958%  $s_1\ar s_2$ for this relation.
959% \end{itemize}
961% % \begin{alltt}
962% % record abstract_status := \{ S: Type[0];
963% %  as_execute: S \(\to\) S \(\to\) Prop;   as_classifier: S \(\to\) classification;
964% %  as_label: S \(\to\) option label;    as_called: (\(\Sigma\)s:S. c s = cl_call) \(\to\) label;
965% %  as_after_return: (\(\Sigma\)s:S. c s = cl_call) \(\to\) S \(\to\) Prop \}
966% % \end{alltt}
968% The inductive type for structured traces is actually made by three multiple
969% inductive types with the following semantics:
970% \begin{enumerate}
971%  \item $(\verb+trace_label_return+~s_1~s_2)$ (shorthand $\verb+TLR+~s_1~s_2$)
972%    is a trace that begins in
973%    the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
974%    such that the instruction to be executed in $s_1$ is a label emission
975%    statement and the one to be executed in the state before $s_2$ is a return
976%    statement. Thus $s_2$ is the state after the return. The trace
977%    may contain other label emission statements. It captures the structure of
978%    the execution of function bodies: they must start with a cost emission
979%    statement and must end with a return; they are obtained by concatenating
980%    one or more basic blocks, all starting with a label emission
981%    (e.g. in case of loops).
982%  \item $(\verb+trace_any_label+~b~s_1~s_2)$ (shorthand $\verb+TAL+~b~s_1~s_2$)
983%    is a trace that begins in
984%    the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
985%    such that the instruction to be executed in $s_2$/in the state before
986%    $s_2$ is either a label emission statement or
987%    or a return, according to the boolean $b$. It must not contain
988%    any label emission statement. It captures the notion of a suffix of a
989%    basic block.
990%  \item $(\verb+trace_label_label+~b~s_1~s_2)$ (shorthand $\verb+TLL+~b~s_1~s_2$ is the special case of
991%    $\verb+TAL+~b~s_1~s_2)$ such that the instruction to be
992%    executed in $s_1$ is a label emission statement. It captures the notion of
993%    a basic block.
994% \end{enumerate}
996% \begin{multicols}{3}
997% \infrule[\verb+tlr_base+]
998%  {\verb+TLL+~true~s_1~s_2}
999%  {\verb+TLR+~s_1~s_2}
1001% \infrule[\verb+tlr_step+]
1002%  {\verb+TLL+~false~s_1~s_2 \andalso
1003%   \verb+TLR+~s_2~s_3
1004%  }
1005%  {\verb+TLR+~s_1~s_3}
1007% \infrule[\verb+tll_base+]
1008%  {\verb+TAL+~b~s_1~s_2 \andalso
1009%   \ell~s_1
1010%  }
1011%  {\verb+TLL+~b~s_1~s_2}
1012% \end{multicols}
1014% \infrule[\verb+tal_base_not_return+]
1015%  {s_1\exec s_2 \andalso
1016%   s_1\class\{\verb+cl_jump+, \verb+cl_other+\}\andalso
1017%   \ell~s_2
1018%  }
1019%  {\verb+TAL+~false~s_1~s_2}
1021% \infrule[\verb+tal_base_return+]
1022%  {s_1\exec s_2 \andalso
1023%   s_1 \class \verb+cl_return+
1024%  }
1025%  {\verb+TAL+~true~s_1~s_2}
1027% \infrule[\verb+tal_base_call+]
1028%  {s_1\exec s_2 \andalso
1029%   s_1 \class \verb+cl_call+ \andalso
1030%   s_1\ar s_3 \andalso
1031%   \verb+TLR+~s_2~s_3 \andalso
1032%   \ell~s_3
1033%  }
1034%  {\verb+TAL+~false~s_1~s_3}
1036% \infrule[\verb+tal_step_call+]
1037%  {s_1\exec s_2 \andalso
1038%   s_1 \class \verb+cl_call+ \andalso
1039%   s_1\ar s_3 \andalso
1040%   \verb+TLR+~s_2~s_3 \andalso
1041%   \verb+TAL+~b~s_3~s_4
1042%  }
1043%  {\verb+TAL+~b~s_1~s_4}
1045% \infrule[\verb+tal_step_default+]
1046%  {s_1\exec s_2 \andalso
1047%   \lnot \ell~s_2 \andalso
1048%   \verb+TAL+~b~s_2~s_3\andalso
1049%   s_1 \class \verb+cl_other+
1050%  }
1051%  {\verb+TAL+~b~s_1~s_3}
1052% \begin{comment}
1053% \begin{verbatim}
1054% inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
1055%   | tlr_base:
1056%       ∀status_before: S.
1057%       ∀status_after: S.
1058%         trace_label_label S ends_with_ret status_before status_after →
1059%         trace_label_return S status_before status_after
1060%   | tlr_step:
1061%       ∀status_initial: S.
1062%       ∀status_labelled: S.
1063%       ∀status_final: S.
1064%         trace_label_label S doesnt_end_with_ret status_initial status_labelled →
1065%         trace_label_return S status_labelled status_final →
1066%           trace_label_return S status_initial status_final
1067% with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
1068%   | tll_base:
1069%       ∀ends_flag: trace_ends_with_ret.
1070%       ∀start_status: S.
1071%       ∀end_status: S.
1072%         trace_any_label S ends_flag start_status end_status →
1073%         as_costed S start_status →
1074%           trace_label_label S ends_flag start_status end_status
1075% with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
1076%   (* Single steps within a function which reach a label.
1077%      Note that this is the only case applicable for a jump. *)
1078%   | tal_base_not_return:
1079%       ∀start_status: S.
1080%       ∀final_status: S.
1081%         as_execute S start_status final_status →
1082%         (as_classifier S start_status cl_jump ∨
1083%          as_classifier S start_status cl_other) →
1084%         as_costed S final_status →
1085%           trace_any_label S doesnt_end_with_ret start_status final_status
1086%   | tal_base_return:
1087%       ∀start_status: S.
1088%       ∀final_status: S.
1089%         as_execute S start_status final_status →
1090%         as_classifier S start_status cl_return →
1091%           trace_any_label S ends_with_ret start_status final_status
1092%   (* A call followed by a label on return. *)
1093%   | tal_base_call:
1094%       ∀status_pre_fun_call: S.
1095%       ∀status_start_fun_call: S.
1096%       ∀status_final: S.
1097%         as_execute S status_pre_fun_call status_start_fun_call →
1098%         ∀H:as_classifier S status_pre_fun_call cl_call.
1099%           as_after_return S «status_pre_fun_call, H» status_final →
1100%           trace_label_return S status_start_fun_call status_final →
1101%           as_costed S status_final →
1102%             trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
1103%   (* A call followed by a non-empty trace. *)
1104%   | tal_step_call:
1105%       ∀end_flag: trace_ends_with_ret.
1106%       ∀status_pre_fun_call: S.
1107%       ∀status_start_fun_call: S.
1108%       ∀status_after_fun_call: S.
1109%       ∀status_final: S.
1110%         as_execute S status_pre_fun_call status_start_fun_call →
1111%         ∀H:as_classifier S status_pre_fun_call cl_call.
1112%           as_after_return S «status_pre_fun_call, H» status_after_fun_call →
1113%           trace_label_return S status_start_fun_call status_after_fun_call →
1114%           ¬ as_costed S status_after_fun_call →
1115%           trace_any_label S end_flag status_after_fun_call status_final →
1116%             trace_any_label S end_flag status_pre_fun_call status_final
1117%   | tal_step_default:
1118%       ∀end_flag: trace_ends_with_ret.
1119%       ∀status_pre: S.
1120%       ∀status_init: S.
1121%       ∀status_end: S.
1122%         as_execute S status_pre status_init →
1123%         trace_any_label S end_flag status_init status_end →
1124%         as_classifier S status_pre cl_other →
1125%         ¬ (as_costed S status_init) →
1126%           trace_any_label S end_flag status_pre status_end.
1127% \end{verbatim}
1128% \end{comment}
1129% A \verb+trace_label_return+ is isomorphic to a list of
1130% \verb+trace_label_label+s that ends with a cost emission followed by a
1131% return terminated \verb+trace_label_label+.
1132% The interesting cases are those of $\verb+trace_any_label+~b~s_1~s_2$.
1133% A \verb+trace_any_label+ is a sequence of steps built by a syntax directed
1134% definition on the classification of $s_1$. The constructors of the datatype
1135% impose several invariants that are meant to impose a structure to the
1136% otherwise unstructured execution. In particular, the following invariants are
1137% imposed:
1138% \begin{enumerate}
1139%  \item the trace is never empty; it ends with a return iff $b$ is
1140%        true
1141%  \item a jump must always be the last instruction of the trace, and it must
1142%        be followed by a cost emission statement; i.e. the target of a jump
1143%        is always the beginning of a new basic block; as such it must start
1144%        with a cost emission statement
1145%  \item a cost emission statement can never occur inside the trace, only in
1146%        the status immediately after
1147%  \item the trace for a function call step is made of a subtrace for the
1148%        function body of type
1149%        $\verb+trace_label_return+~s_1~s_2$, possibly followed by the
1150%        rest of the trace for this basic block. The subtrace represents the
1151%        function execution. Being an inductive datum, it grants totality of
1152%        the function call. The status $s_2$ is the one that follows the return
1153%        statement. The next instruction of $s_2$ must follow the function call
1154%        instruction. As a consequence, function calls are also well nested.
1155% \end{enumerate}
1157% There are three mutual structural recursive functions, one for each of
1158% \verb+TLR+, \verb+TLL+ and \verb+TAL+, for which we use the same notation
1159% $|\,.\,|$: the \emph{flattening} of the traces. These functions
1160% allow to extract from a structured trace the list of emitted cost labels.
1161% %  We only show here the type of one
1162% % of them:
1163% % \begin{alltt}
1164% % flatten_trace_label_return:
1165% %  \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
1166% %   trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
1167% % \end{alltt}
1169% \paragraph{Structured traces similarity and cost prediction invariance.}
1171% A compiler pass maps source to object code and initial states to initial
1172% states. The source code and initial state uniquely determine the structured
1173% trace of a program, if it exists. The structured trace fails to exists iff
1174% the structural conditions are violated by the program execution (e.g. a function
1175% body does not start with a cost emission statement). Let us assume that the
1176% target structured trace exists.
1178% What is the relation between the source and target structured traces?
1179% In general, the two traces can be arbitrarily different. However, we are
1180% interested only in those compiler passes that maps a trace $\tau_1$ to a trace
1181% $\tau_2$ such that
1182% \begin{equation}|\tau_1| = |\tau_2|.\label{th2}\end{equation}
1183% The reason is that the combination of~\eqref{th1} with~\eqref{th2} yields the
1184% corollary
1185% \begin{equation}\label{th3}
1186% \forall s_1,s_2. \forall \tau: \verb+TLR+~s_1~s_2.~
1187%   \verb+clock+~s_2 - \verb+clock+~s_1 =
1188%   \Sigma_{\alpha \in |\tau_1|}\;k(\alpha) =
1189%   \Sigma_{\alpha \in |\tau_2|}\;k(\alpha).
1190% \end{equation}
1191% This corollary states that the actual execution time of the program can be computed equally well on the source or target language. Thus it becomes possible to
1192% transfer the cost model from the target to the source code and reason on the
1193% source code only.
1195% We are therefore interested in conditions stronger than~\eqref{th2}.
1196% Therefore we introduce here a similarity relation between traces with
1197% the same structure. Theorem~\verb+tlr_rel_to_traces_same_flatten+
1198% in the Matita formalisation shows that~\eqref{th2} holds for every pair
1199% $(\tau_1,\tau_2)$ of similar traces.
1201% Intuitively, two traces are similar when one can be obtained from
1202% the other by erasing or inserting silent steps, i.e. states that are
1203% not \verb+as_costed+ and that are classified as \verb+cl_other+.
1204% Silent steps do not alter the structure of the traces.
1205% In particular,
1206% the relation maps function calls to function calls to the same function,
1207% label emission statements to emissions of the same label, concatenation of
1208% subtraces to concatenation of subtraces of the same length and starting with
1209% the same emission statement, etc.
1211% In the formalisation the three similarity relations --- one for each trace
1212% kind --- are defined by structural recursion on the first trace and pattern
1213% matching over the second. Here we turn
1214% the definition into the inference rules shown in \autoref{fig:txx_rel}
1215% for the sake of readability. We also omit from trace constructors all arguments,
1216% but those that are traces or that
1217% are used in the premises of the rules. By abuse of notation we denote all three
1218% relations by infixing $\approx$.
1220% \begin{figure}
1221% \begin{multicols}{2}
1222% \infrule
1223%  {tll_1\approx tll_2
1224%  }
1225%  {\verb+tlr_base+~tll_1 \approx \verb+tlr_base+~tll_2}
1227% \infrule
1228%  {tll_1 \approx tll_2 \andalso
1229%   tlr_1 \approx tlr_2
1230%  }
1231%  {\verb+tlr_step+~tll_1~tlr_1 \approx \verb+tlr_step+~tll_2~tlr_2}
1232% \end{multicols}
1233% \vspace{3ex}
1234% \begin{multicols}{2}
1235% \infrule
1236%  {\ell~s_1 = \ell~s_2 \andalso
1237%   tal_1\approx tal_2
1238%  }
1239%  {\verb+tll_base+~s_1~tal_1 \approx \verb+tll_base+~s_2~tal_2}
1241% \infrule
1242%  {tal_1\approx tal_2
1243%  }
1244%  {\verb+tal_step_default+~tal_1 \approx tal_2}
1245% \end{multicols}
1246% \vspace{3ex}
1247% \infrule
1248%  {}
1249%  {\verb+tal_base_not_return+\approx taa \append \verb+tal_base_not_return+}
1250% \vspace{1ex}
1251% \infrule
1252%  {}
1253%  {\verb+tal_base_return+\approx taa \append \verb+tal_base_return+}
1254% \vspace{1ex}
1255% \infrule
1256%  {tlr_1\approx tlr_2 \andalso
1257%   s_1 \uparrow f \andalso s_2\uparrow f
1258%  }
1259%  {\verb+tal_base_call+~s_1~tlr_1\approx taa \append \verb+tal_base_call+~s_2~tlr_2}
1260% \vspace{1ex}
1261% \infrule
1262%  {tlr_1\approx tlr_2 \andalso
1263%   s_1 \uparrow f \andalso s_2\uparrow f \andalso
1264%   \verb+tal_collapsable+~tal_2
1265%  }
1266%  {\verb+tal_base_call+~s_1~tlr_1 \approx taa \append \verb+tal_step_call+~s_2~tlr_2~tal_2)}
1267% \vspace{1ex}
1268% \infrule
1269%  {tlr_1\approx tlr_2 \andalso
1270%   s_1 \uparrow f \andalso s_2\uparrow f \andalso
1271%   \verb+tal_collapsable+~tal_1
1272%  }
1273%  {\verb+tal_step_call+~s_1~tlr_1~tal_1 \approx taa \append \verb+tal_base_call+~s_2~tlr_2)}
1274% \vspace{1ex}
1275% \infrule
1276%  {tlr_1 \approx tlr_2 \andalso
1277%   s_1 \uparrow f \andalso s_2\uparrow f\andalso
1278%   tal_1 \approx tal_2 \andalso
1279%  }
1280%  {\verb+tal_step_call+~s_1~tlr_1~tal_1 \approx taa \append \verb+tal_step_call+~s_2~tlr_2~tal_2}
1281% \caption{The inference rule for the relation $\approx$.}
1282% \label{fig:txx_rel}
1283% \end{figure}
1284% %
1285% \begin{comment}
1286% \begin{verbatim}
1287% let rec tlr_rel S1 st1 st1' S2 st2 st2'
1288%   (tlr1 : trace_label_return S1 st1 st1')
1289%   (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
1290% match tlr1 with
1291%   [ tlr_base st1 st1' tll1 ⇒
1292%     match tlr2 with
1293%     [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
1294%     | _ ⇒ False
1295%     ]
1296%   | tlr_step st1 st1' st1'' tll1 tl1 ⇒
1297%     match tlr2 with
1298%     [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
1299%       tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
1300%     | _ ⇒ False
1301%     ]
1302%   ]
1303% and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
1304%  (tll1 : trace_label_label S1 fl1 st1 st1')
1305%  (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
1306%   match tll1 with
1307%   [ tll_base fl1 st1 st1' tal1 H ⇒
1308%     match tll2 with
1309%     [ tll_base fl2 st2 st2 tal2 G ⇒
1310%       as_label_safe … («?, H») = as_label_safe … («?, G») ∧
1311%       tal_rel … tal1 tal2
1312%     ]
1313%   ]
1314% and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
1315%  (tal1 : trace_any_label S1 fl1 st1 st1')
1316%  (tal2 : trace_any_label S2 fl2 st2 st2')
1317%    on tal1 : Prop ≝
1318%   match tal1 with
1319%   [ tal_base_not_return st1 st1' _ _ _ ⇒
1320%     fl2 = doesnt_end_with_ret ∧
1321%     ∃st2mid,taa,H,G,K.
1322%     tal2 ≃ taa_append_tal ? st2 ??? taa
1323%       (tal_base_not_return ? st2mid st2' H G K)
1324%   | tal_base_return st1 st1' _ _ ⇒
1325%     fl2 = ends_with_ret ∧
1326%     ∃st2mid,taa,H,G.
1327%     tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
1328%       (tal_base_return ? st2mid st2' H G)
1329%   | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
1330%     fl2 = doesnt_end_with_ret ∧
1331%     ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
1332%     ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
1333%     (* we must allow a tal_base_call to be similar to a call followed
1334%       by a collapsable trace (trace_any_any followed by a base_not_return;
1335%       we cannot use trace_any_any as it disallows labels in the end as soon
1336%       as it is non-empty) *)
1337%     (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
1338%       tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
1339%     ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
1340%     ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
1341%       tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
1342%       tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
1343%   | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
1344%     ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
1345%     ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
1346%     (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
1347%       tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
1348%       tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
1349%     ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
1350%     ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
1351%       tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
1352%       tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
1353%   | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
1354%     tal_rel … tl1 tal2 (* <- this makes it many to many *)
1355%   ].
1356% \end{verbatim}
1357% \end{comment}
1358% %
1359% In the preceding rules, a $taa$ is an inhabitant of the
1360% $\verb+trace_any_any+~s_1~s_2$ (shorthand $\verb+TAA+~s_1~s_2$),
1361% an inductive data type whose definition
1362% is not in the paper for lack of space. It is the type of valid
1363% prefixes (even empty ones) of \verb+TAL+'s that do not contain
1364% any function call. Therefore it
1365% is possible to concatenate (using ``$\append$'') a \verb+TAA+ to the
1366% left of a \verb+TAL+. A \verb+TAA+ captures
1367% a sequence of silent moves.
1368% The \verb+tal_collapsable+ unary predicate over \verb+TAL+'s
1369% holds when the argument does not contain any function call and it ends
1370% with a label (not a return). The intuition is that after a function call we
1371% can still perform a sequence of silent actions while remaining similar.
1373% As should be expected, even though the rules are asymmetric $\approx$ is in fact
1374% an equivalence relation.
Therefore we now introduce an abstract notion of relation set between abstract
statuses and an abstract notion of 1-to-many forward simulation conditions.
These two definitions enjoy the following remarkable properties:
 \item they are generic enough to accommodate all passes of the CerCo compiler;
 \item the conjunction of the 1-to-many forward simulation conditions is
       only slightly stricter than the statement of a 1-to-many forward
       simulation in the classical case. In particular, it only requires
       the construction of very simple forms of structured traces made of
       silent states only;
 \item they allow us to prove the main result of the paper: the 1-to-many
       forward simulation conditions are sufficient to prove the trace
       reconstruction theorem.

Point 3 is the crucial one. First of all it means that we have reduced
the complex problem of trace reconstruction to a much simpler one that,
moreover, can be solved with slight adaptations of the forward simulation proof
that is performed for a compiler that only cares about functional properties.
Therefore we have separated as far as possible the proof of
preservation of functional properties from that of non-functional ones.
\paragraph{Relation sets.}
Let $S_1$ and $S_2$ be two deterministic labelled transition systems as described
in \autoref{semantics}. We now introduce the four relations $\mathcal{S,C,R,L}\subseteq S_1\times S_2$
between states of the two systems. The first two are abstract and must be instantiated
by every pass. The remaining two are derived from them.

The $\S$ relation between states is the classical relation used
in forward simulation proofs. It relates the data of the states
(e.g.\ registers, memory, etc.).

The $\C$ relation relates states that are about to execute calls.
It allows us to track the position in the target code of every call in the source code.
% The $\L$ relation simply says that the two states are both label
% emitting states that emit the same label, \emph{i.e.}\ $s_1\L s_2\iffdef \ell~s_1=\ell~s_2$.
% It allows us to track the position in
% the target code of every cost emitting statement in the source code.
Two states
$s_1$ and $s_2$ are $\R$-related if every time $s_1$ is the
successor of a call state $s_1'$ that is $\C$-related to a call state
$s_2'$ in the target code, then $s_2$ is the successor of $s_2'$. Formally:
$$s_1\R s_2 \iffdef \forall s_1',s_2'.s_1'\C s_2' \to s_1'\ar s_1 \to s_2' \ar s_2.$$
We will require all pairs of states that return from related calls to be
$\R$-related. This, in combination with a dual requirement on function calls,
will guarantee that calls return exactly where they are supposed to.
We say that two states $s_1\in S_1$ and $s_2\in S_2$ are label-related
(written $s_1\L s_2$) if
\item they both emit the same label $L$;
\item if $L\in \ell(\Functions)$ then
there is a state $s_2'$ such that $s_2\to[L]\to[\tau *]s_2'$ with
$s_1\S s_2'$, otherwise if $L\notin\ell(\Functions)$ then $s_1\S s_2$.
This relation captures the fact that synchronisation on labels can be decoupled from
``semantic synchronisation'' (in particular at the start of functions): $s_1\L s_2$
states that the two states are synchronised with respect to labels, and that $s_2$ will eventually
synchronise semantically using only silent execution steps (apart from the very first
step, which emits the label).
Given the relations $\S$ and $\C$, \autoref{fig:forwardsim} defines a set of
local simulation conditions that are sufficient to guarantee the correctness of
the compiler.
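In symbols, the two clauses defining the label relation $\L$ above can be condensed into a single formula (a restatement in the paper's own notation, not an additional definition):

```latex
$$s_1\L s_2 \iffdef \ell~s_1 = \ell~s_2 = L \;\wedge\; \left\{
\begin{array}{ll}
\exists s_2'.\; s_2\to[L]\to[\tau *]s_2' \wedge s_1\S s_2' &
  \mbox{if } L\in\ell(\Functions)\\
s_1\S s_2 &
  \mbox{if } L\notin\ell(\Functions)
\end{array}\right.$$
```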
1449% \begin{subfigure}{.475\linewidth}
1450% \centering
1451% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
1452%                             every label/.style=overlay, node distance=10mm]
1453%     \matrix [diag] (m) {%
1454%          \node (s1) [is jump] {}; & \node [fill=white] (t1) {};\\
1455%          \node (s2) {}; & \node (t2) {}; \\
1456%     };
1457%     \node [above=0 of t1, overlay] {$\alpha$};
1458%     {[-stealth]
1459%     \draw (s1) -- (t1);
1460%     \draw [new] (s2) -- node [above] {$*$} (t2);
1461%     }
1462%     \draw (s1) to node [rel] {$\S$} (s2);
1463%     \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
1464% \end{tikzpicture}
1465% \caption{The \verb+cl_jump+ case.}
1466% \label{subfig:cl_jump}
1467% \end{subfigure}
1468% &
\tikzset{diag/.append style=
         {every node/.append style={is other,
                                    text height=0,text depth=0,text width=0,
                                    text opacity=0}
         },
         }
    \matrix [diag] (m) {%
         \node (s1) {s_1}; & \node (t1) {s_1'};\\
         \node (s2) {s_2}; & \node [newn] (t2) {s_2'}; \\
    };
    {[-stealth]
    \draw (s1) -- node [above] {$\tau$} (t1);
    \draw [new] (s2) -- node [above] {$\tau *$} (t2);
    }
    \draw (s1) to [bend right, anchor=mid] node [rel] {$\S$} (s2);
    \draw [new] (t1) to [bend left, anchor=mid] node [rel] {$\S$} (t2);
    \matrix [diag] (m) {%
         \node (s1) {s_1}; & \node (t1) {s_1'};\\
         \node (s2) {s_2}; & \node [newn] (mid) {s_a}; & \node [newn] (t2) {s_2'}; \\
    };
    {[-stealth]
    \draw (s1) -- node [above] {$L$} (t1);
    \draw [new] (s2) -- node [above] {$L$} (mid);
    \draw [new] (mid) -- node [above] {$\tau *$} (t2);
    }
    \draw (s1) to [bend right, anchor=mid] node [rel] {$\S$} (s2);
    \draw [new] (t1) to [bend left, anchor=mid] node [rel] {$\S$} (t2);
\text{ if $L\notin\ell(\Functions)$}
    \matrix [diag] (m) {%
         \node (s1) {s_1}; & \node (t1) {s_1'};\\
         \node (s2) {s_2};\\
    };
    {[-stealth]
    \draw (s1) -- node [above] {$\ell(f)$} (t1);
    }
    \draw (s1) to [bend right, anchor=mid] node [rel] {$\S$} (s2);
    \draw [new] (t1) to [bend left, anchor=mid] node [rel] {$\S$} (s2);
    \matrix [diag, small vgap] (m) {%
         &\node (t1) {s_1'}; \\
         \node (s1) {s_1}; \\
         && \node [newn] (l1) {s_b}; & \node [newn] (l2) {s_c}; & \node [newn] (t2) {s_2'};\\
         \node [newn] (s2) {s_2}; & \node [newn] (c) {s_a};\\
    };
    {[-stealth]
    \draw (s1) -- node [above left] {$f$} (t1);
    \draw [new] (s2) -- node [above] {$\tau *$} (c);
    \draw [new] (c) -- node [above left] {$f$} (l1);
    \draw [new] (l1) -- node [above] {$\ell(f)$} (l2);
    \draw [new] (l2) -- node [above] {$\tau *$} (t2);
    }
    \draw (s1) to [bend right] node [rel] {$\S$} (s2);
    \draw [new] (t1) to [bend left] node [rel] {$\S$} (t2);
    \draw [new] (s1) to [bend left] node [rel] {$\C$} (c);
    \matrix [diag, small vgap] (m) {%
        \node (s1) {s_1}; \\
        &\node (t1) {s_1'}; \\
        \node (s2) {s_2}; & \node [newn] (c) {s_a};\\
        && \node [newn] (r) {s_b}; & \node [newn] (t2) {s_2'}; \\
    };
    {[-stealth]
    \draw (s1) -- node [above right] {$RET$} (t1);
    \draw [new] (s2) -- node [above] {$\tau *$} (c);
    \draw [new] (c) -- node [below left] {$RET$} (r);
    \draw [new] (r) -- node [above] {$\tau *$} (t2);
    }
    \draw (s1) to [bend right] node [rel] {$\S$} (s2);
    \draw [new, overlay] (t1) to [bend left=60] node [rel] {$\S$} (t2);
    \draw [new, overlay] (t1) to [bend left] node [rel] {$\R$} (r);
\caption{Local simulation conditions. Each one states that the existence of the
target states such that the dashed relations hold is implied by the assumptions
drawn as solid lines.}
If $S_1$, $S_2$, $\S$ and $\C$ satisfy the diagrams in \autoref{fig:forwardsim},
$T_1=s_1\To s_1'$ is a structured fragment not starting with an $\ell(f)$ emission,
and $s_1\S s_2$, then there is $T_2=s_2\To s_2'$ with $T_1\approx T_2$ and $s_1'\S s_2'$.

If $S_1$, $S_2$, $\S$ and $\C$ satisfy the diagrams in \autoref{fig:forwardsim},
$M_1:s_1\MeasTo s_1'$ is a measurable fragment of $S_1$ and $s_2$ is such that
$s_1\L s_2$, then there is $M_2:s_2\MeasTo s_2'$ with $M_1\approx M_2$. Moreover,
there is a state $s_2''$ with $s_2'\to[\tau *]s_2''$ and $s_1'\S s_2''$.
1571% \begin{figure}
1572% \centering
1573% \begin{tabular}{@{}c@{}c@{}c@{}}
1574% % \begin{subfigure}{.475\linewidth}
1575% % \centering
1576% % \begin{tikzpicture}[every join/.style={ar}, join all, thick,
1577% %                             every label/.style=overlay, node distance=10mm]
1578% %     \matrix [diag] (m) {%
1579% %          \node (s1) [is jump] {}; & \node [fill=white] (t1) {};\\
1580% %          \node (s2) {}; & \node (t2) {}; \\
1581% %     };
1582% %     \node [above=0 of t1, overlay] {$\alpha$};
1583% %     {[-stealth]
1584% %     \draw (s1) -- (t1);
1585% %     \draw [new] (s2) -- node [above] {$*$} (t2);
1586% %     }
1587% %     \draw (s1) to node [rel] {$\S$} (s2);
1588% %     \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
1589% % \end{tikzpicture}
1590% % \caption{The \verb+cl_jump+ case.}
1591% % \label{subfig:cl_jump}
1592% % \end{subfigure}
1593% % &
1594% \begin{subfigure}{.25\linewidth}
1595% \centering
1596% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
1597%                             every label/.style=overlay, node distance=10mm]
1598%     \matrix [diag] (m) {%
1599%          \node (s1) {}; & \node (t1) {};\\
1600%          \node (s2) {}; & \node (t2) {}; \\
1601%     };
1602%     {[-stealth]
1603%     \draw (s1) -- (t1);
1604%     \draw [new] (s2) -- node [above] {$*$} (t2);
1605%     }
1606%     \draw (s1) to [bend right, anchor=mid] node [rel] {$\S$} (s2);
1607%     \draw [new] (t1) to [bend left, anchor=mid] node [rel] {$\S,\L$} (t2);
1608% \end{tikzpicture}
1609% \caption{The \verb+cl_oher+ and \verb+cl_jump+ cases.}
1610% \label{subfig:cl_other_jump}
1611% \end{subfigure}
1612% &
1613% \begin{subfigure}{.375\linewidth}
1614% \centering
1615% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
1616%                             every label/.style=overlay, node distance=10mm]
1617%     \matrix [diag, small gap] (m) {%
1618%          &\node (t1) {}; \\
1619%          \node (s1) [is call] {}; \\
1620%          && \node (l) {}; & \node (t2) {};\\
1621%          \node (s2) {}; & \node (c) [is call] {};\\   
1622%     };
1623%     {[-stealth]
1624%     \draw (s1) -- node [above left] {$f$} (t1);
1625%     \draw [new] (s2) -- node [above] {$*$} (c);
1626%     \draw [new] (c) -- node [below right] {$f$} (l);
1627%     \draw [new] (l) -- node [above] {$*$} (t2);
1628%     }
1629%     \draw (s1) to [bend right] node [rel] {$\S$} (s2);
1630%     \draw [new] (t1) to [bend left] node [rel] {$\S$} (t2);
1631%     \draw [new] (t1) to [bend left] node [rel] {$\L$} (l);
1632%     \draw [new] (t1) to [bend right] node [rel] {$\C$} (c);
1633%     \end{tikzpicture}
1634% \caption{The \verb+cl_call+ case.}
1635% \label{subfig:cl_call}
1636% \end{subfigure}
1637% &
1638% \begin{subfigure}{.375\linewidth}
1639% \centering
1640% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
1641%                             every label/.style=overlay, node distance=10mm]
1642%     \matrix [diag, small gap] (m) {%
1643%         \node (s1) [is ret] {}; \\
1644%         &\node (t1) {}; \\
1645%         \node (s2) {}; & \node (c) [is ret] {};\\
1646%         && \node (r) {}; & \node (t2) {}; \\   
1647%     };
1648%     {[-stealth]
1649%     \draw (s1) -- node [above right] {$RET$} (t1);
1650%     \draw [new] (s2) -- node [above] {$*$} (c);
1651%     \draw [new] (c) -- node [below left] {$RET$} (r);
1652%     \draw [new] (r) -- node [above] {$*$} (t2);
1653%     }
1654%     \draw (s1) to [bend right] node [rel] {$\S$} (s2);
1655%     \draw [new, overlay] (t1) to [bend left=60] node [rel] {$\S,\L$} (t2);
1656%     \draw [new, overlay] (t1) to [bend left ] node [rel] {$\R$} (r);
1657% \end{tikzpicture}
1658% \caption{The \verb+cl_return+ case.}
1659% \label{subfig:cl_return}
1660% \end{subfigure}
1661% \end{tabular}
1662% \caption{Mnemonic diagrams depicting the hypotheses for the preservation of structured traces.
1663%          Dashed lines
1664%          and arrows indicates how the diagrams must be closed when solid relations
1665%          are present.}
1666% \label{fig:forwardsim}
1667% \end{figure}
1669% \paragraph{1-to-many forward simulation conditions.}
1670% \begin{condition}[Cases \verb+cl_other+ and \verb+cl_jump+]
1671%  For all $s_1,s_1',s_2$ such that $s_1 \S s_1'$, and
1672%  $s_1\exec s_1'$, and either $s_1 \class \verb+cl_other+$ or
1673%  both $s_1\class\verb+cl_other+\}$ and $\ell~s_1'$,
1674%  there exists an $s_2'$ and a $\verb+trace_any_any_free+~s_2~s_2'$ called $taaf$
1675%  such that $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
1676% $taaf$ is non empty, or one among $s_1$ and $s_1'$ is \verb+as_costed+.
1677% \end{condition}
1679% In the above condition depicted in \autoref{subfig:cl_other_jump},
1680% a $\verb+trace_any_any_free+~s_1~s_2$ (which from now on
1681% will be shorthanded as \verb+TAAF+) is an
1682% inductive type of structured traces that do not contain function calls or
1683% cost emission statements. Differently from a \verb+TAA+, the
1684% instruction to be executed in the lookahead state $s_2$ may be a cost emission
1685% statement.
1687% The intuition of the condition is that one step can be replaced with zero or more steps if it
1688% preserves the relation between the data and if the two final statuses are
1689% labelled in the same way. Moreover, we must take special care of the empty case
1690% to avoid collapsing two consecutive states that emit a label, missing one of the two emissions.
1692% \begin{condition}[Case \verb+cl_call+]
1693%  For all $s_1,s_1',s_2$ s.t. $s_1 \S s_1'$ and
1694%  $s_1\exec s_1'$ and $s_1 \class \verb+cl_call+$, there exists $s_a, s_b, s_2'$, a
1695% $\verb+TAA+~s_2~s_a$, and a
1696% $\verb+TAAF+~s_b~s_2'$ such that:
1697% $s_a\class\verb+cl_call+$, the \verb+as_call_ident+'s of
1698% the two call states are the same, $s_1 \C s_a$,
1699% $s_a\exec s_b$, $s_1' \L s_b$ and
1700% $s_1' \S s_2'$.
1701% \end{condition}
1703% The condition, depicted in \autoref{subfig:cl_call} says that, to simulate a function call, we can perform a
1704% sequence of silent actions before and after the function call itself.
1705% The old and new call states must be $\C$-related, the old and new
1706% states at the beginning of the function execution must be $\L$-related
1707% and, finally, the two initial and final states must be $\S$-related
1708% as usual.
1710% \begin{condition}[Case \verb+cl_return+]
1711%  For all $s_1,s_1',s_2$ s.t. $s_1 \S s_1'$,
1712%  $s_1\exec s_1'$ and $s_1 \class \verb+cl_return+$, there exists $s_a, s_b, s_2'$, a
1713% $\verb+TAA+~s_2~s_a$, a
1714% $\verb+TAAF+~s_b~s_2'$ called $taaf$ such that:
1715% $s_a\class\verb+cl_return+$,
1716% $s_a\exec s_b$,
1717% $s_1' \R s_b$ and
1718% $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
1719% $taaf$ is non empty, or $\lnot \ell~s_a$.
1720% \end{condition}
1722% Similarly to the call condition, to simulate a return we can perform a
1723% sequence of silent actions before and after the return statement itself,
1724% as depicted in \autoref{subfig:cl_return}.
1725% The old and the new statements after the return must be $\R$-related,
1726% to grant that they returned to corresponding calls.
1727% The two initial and final states must be $\S$-related
1728% as usual and, moreover, they must exhibit the same labels. Finally, when
1729% the suffix is non empty we must take care of not inserting a new
1730% unmatched cost emission statement just after the return statement.
1732% \begin{comment}
1733% \begin{verbatim}
1734% definition status_simulation ≝
1735%   λS1 : abstract_status.
1736%   λS2 : abstract_status.
1737%   λsim_status_rel : status_rel S1 S2.
1738%     ∀st1,st1',st2.as_execute S1 st1 st1' →
1739%     sim_status_rel st1 st2 →
1740%     match as_classify … st1 with
1741%     [ None ⇒ True
1742%     | Some cl ⇒
1743%       match cl with
1744%       [ cl_call ⇒ ∀prf.
1745%         (*
1746%              st1' ------------S----------\
1747%               ↑ \                         \
1748%              st1 \--L--\                   \
1749%               | \       \                   \
1750%               S  \-C-\  st2_after_call →taa→ st2'
1751%               |       \     ↑
1752%              st2 →taa→ st2_pre_call
1753%         *)
1754%         ∃st2_pre_call.
1755%         as_call_ident ? st2_pre_call = as_call_ident ? («st1, prf») ∧
1756%         call_rel ?? sim_status_rel «st1, prf» st2_pre_call ∧
1757%         ∃st2_after_call,st2'.
1758%         ∃taa2 : trace_any_any … st2 st2_pre_call.
1759%         ∃taa2' : trace_any_any … st2_after_call st2'.
1760%         as_execute … st2_pre_call st2_after_call ∧
1761%         sim_status_rel st1' st2' ∧
1762%         label_rel … st1' st2_after_call
1763%       | cl_return ⇒
1764%         (*
1765%              st1
1766%             / ↓
1767%            | st1'----------S,L------------\
1768%            S   \                           \
1769%             \   \-----R-------\            |     
1770%              \                 |           |
1771%              st2 →taa→ st2_ret |           |
1772%                           ↓   /            |
1773%                      st2_after_ret →taaf→ st2'
1775%            we also ask that st2_after_ret be not labelled if the taaf tail is
1776%            not empty
1777%         *)
1778%         ∃st2_ret,st2_after_ret,st2'.
1779%         ∃taa2 : trace_any_any … st2 st2_ret.
1780%         ∃taa2' : trace_any_any_free … st2_after_ret st2'.
1781%         (if taaf_non_empty … taa2' then ¬as_costed … st2_after_ret else True) ∧
1782%         as_classifier … st2_ret cl_return ∧
1783%         as_execute … st2_ret st2_after_ret ∧ sim_status_rel st1' st2' ∧
1784%         ret_rel … sim_status_rel st1' st2_after_ret ∧
1785%         label_rel … st1' st2'
1786%       | cl_other ⇒
1787%           (*         
1788%           st1 → st1'
1789%             |      \
1790%             S      S,L
1791%             |        \
1792%            st2 →taaf→ st2'
1794%            the taaf can be empty (e.g. tunneling) but we ask it must not be the
1795%            case when both st1 and st1' are labelled (we would be able to collapse
1796%            labels otherwise)
1797%          *)
1798%         ∃st2'.
1799%         ∃taa2 : trace_any_any_free … st2 st2'.
1800%         (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
1801%         sim_status_rel st1' st2' ∧
1802%         label_rel … st1' st2'
1803%       | cl_jump ⇒
1804%         (* just like cl_other, but with a hypothesis more *)
1805%         as_costed … st1' →
1806%         ∃st2'.
1807%         ∃taa2 : trace_any_any_free … st2 st2'.
1808%         (if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
1809%         sim_status_rel st1' st2' ∧
1810%         label_rel … st1' st2'
1811%       ]
1812%     ].
1813% \end{verbatim}
1814% \end{comment}
1816% \paragraph{Main result: the 1-to-many forward simulation conditions
1817% are sufficient to trace reconstruction}
1819% Let us assume that a relation set is given such that the 1-to-many
1820% forward simulation conditions are satisfied. Under this assumption we
1821% can prove the following three trace reconstruction theorems by mutual
1822% structural induction over the traces given in input between the
1823% $s_1$ and $s_1'$ states.
1825% In particular, the \verb+status_simulation_produce_tlr+ theorem
1826% applied to the \verb+main+ function of the program and equal
1827% $s_{2_b}$ and $s_2$ states shows that, for every initial state in the
1828% source code that induces a structured trace in the source code,
1829% the compiled code produces a similar structured trace.
1831% \begin{theorem}[\verb+status_simulation_produce_tlr+]
1832% For every $s_1,s_1',s_{2_b},s_2$ s.t.
1833% there is a $\verb+TLR+~s_1~s_1'$ called $tlr_1$ and a
1834% $\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
1835% $s_1 \S s_2$, there exists $s_{2_m},s_2'$ s.t.
1836% there is a $\verb+TLR+~s_{2_b}~s_{2_m}$ called $tlr_2$ and
1837% there is a $\verb+TAAF+~s_{2_m}~s_2'$ called $taaf$
1838% s.t. if $taaf$ is non empty then $\lnot (\ell~s_{2_m})$,
1839% and $tlr_1\approx tlr_2$
1840% and $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
1841% $s_1' \R s_{2_m}$.
1842% \end{theorem}
1844% The theorem states that a \verb+trace_label_return+ in the source code
1845% together with a precomputed preamble of silent states
1846% (the \verb+TAA+) in the target code induces a
1847% similar \verb+trace_label_return+ in the target code which can be
1848% followed by a sequence of silent states. Note that the statement does not
1849% require the produced \verb+trace_label_return+ to start with the
1850% precomputed preamble, even if this is likely to be the case in concrete
1851% implementations. The preamble in input is necessary for compositionality, e.g.
1852% because the 1-to-many forward simulation conditions allow in the
1853% case of function calls to execute a preamble of silent instructions just after
1854% the function call.
1856% Clearly similar results are also available for the other two types of structured
1857% traces (in fact, they are all proved simultaneously by mutual induction).
1858% \begin{theorem}[\verb+status_simulation_produce_tll+]
1859% For every $s_1,s_1',s_{2_b},s_2$ s.t.
1860% there is a $\verb+TLL+~b~s_1~s_1'$ called $tll_1$ and a
1861% $\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
1862% $s_1 \S s_2$, there exists $s_{2_m},s_2'$ s.t.
1863% \begin{itemize}
1864%  \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
1865%        and a trace $\verb+TLL+~b~s_{2_b}~s_{2_m}$ called $tll_2$
1866%        and a $\verb+TAAF+~s_{2_m}~s_2'$ called $taa_2$ s.t.
1867%        $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
1868%        $s_1' \R s_{2_m}$ and
1869%        $tll_1\approx tll_2$ and
1870%        if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
1871%  \item else there exists $s_2'$ and a
1872%        $\verb+TLL+~b~s_{2_b}~s_2'$ called $tll_2$ such that
1873%        $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
1874%        $tll_1\approx tll_2$.
1875% \end{itemize}
1876% \end{theorem}
1878% The statement is similar to the previous one: a source
1879% \verb+trace_label_label+ and a given target preamble of silent states
1880% in the target code induce a similar \verb+trace_label_label+ in the
1881% target code, possibly followed by a sequence of silent moves that become the
1882% preamble for the next \verb+trace_label_label+ translation.
1884% \begin{theorem}[\verb+status_simulation_produce_tal+]
1885% For every $s_1,s_1',s_2$ s.t.
1886% there is a $\verb+TAL+~b~s_1~s_1'$ called $tal_1$ and
1887% $s_1 \S s_2$
1888% \begin{itemize}
1889%  \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
1890%    and a trace $\verb+TAL+~b~s_2~s_{2_m}$ called $tal_2$ and a
1891%    $\verb+TAAF+~s_{2_m}~s_2'$ called $taa_2$ s.t.
1892%    $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
1893%    $s_1' \R s_{2_m}$ and
1894%    $tal_1 \approx tal_2$ and
1895%    if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
1896%  \item else there exists $s_2'$ and a
1897%    $\verb+TAL+~b~s_2~s_2'$ called $tal_2$ such that
1898%    either $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
1899%        $tal_1\approx tal_2$
1900%    or $s_1' \mathrel{{\S} \cap {\L}} s_2$ and
1901%    $\verb+tal_collapsable+~tal_1$ and $\lnot \ell~s_1$.
1902% \end{itemize}
1903% \end{theorem}
1905% The statement is also similar to the previous ones, but for the lack of
1906% the target code preamble.
For every $s_1,s_1',s_2$ such that
there is a $\verb+trace_label_return+~s_1~s_1'$ called $tlr_1$ and
$s_1 \mathrel{{\L} \cap {\S}} s_2$,
there exist $s_{2_m},s_2'$ such that
there is a $\verb+trace_label_return+~s_2~s_{2_m}$ called $tlr_2$ and
there is a $\verb+trace_any_any_free+~s_{2_m}~s_2'$ called $taaf$
such that if $taaf$ is non empty then $\lnot (\verb+as_costed+~s_{2_m})$,
and $\verb+tlr_rel+~tlr_1~tlr_2$
and $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
$s_1' \R s_{2_m}$.
status_simulation_produce_tlr S1 S2 R
(* we start from this situation
     st1 →→→→tlr→→→→ st1'
      | \
      L  \---S--\
      |          \
   st2_lab →taa→ st2   (the taa preamble is in general either empty or given
                        by the preceding call)

   and we produce
     st1 →→→→tlr→→→→ st1'
             \\      /  \
             //     R    \-L,S-\
             \\     |           \
   st2_lab →tlr→ st2_mid →taaf→ st2'

  st1 st1' st2_lab st2
  (tlr1 : trace_label_return S1 st1 st1')
  (taa2_pre : trace_any_any S2 st2_lab st2)
  (sim_execute : status_simulation S1 S2 R)
  on tlr1 : R st1 st2 → label_rel … st1 st2_lab →
  ∃st2_mid.∃st2'.
  ∃tlr2 : trace_label_return S2 st2_lab st2_mid.
  ∃taa2 : trace_any_any_free … st2_mid st2'.
  (if taaf_non_empty … taa2 then ¬as_costed … st2_mid else True) ∧
  R st1' st2' ∧ ret_rel … R st1' st2_mid ∧ label_rel … st1' st2' ∧
  tlr_rel … tlr1 tlr2
\section{Conclusions and future work}

The labelling approach is a technique to implement compilers that induce on
the source code a non-uniform cost model determined from the object code
produced. The cost model assigns a cost to each basic block of the program.
The main theorem of the approach states that there is an exact
correspondence between the sequences of basic blocks started in the source
and object code, and that no instruction in the source or object code is
executed outside a basic block. Thus the cost of object code execution
can be computed precisely on the source.
In this paper we scaled the labelling approach to cover a programming language
with function calls. This introduces new difficulties when the language
is unstructured, i.e.\ when it allows function calls to return anywhere in the code,
destroying any hope of statically predicting the cost of basic blocks.
We restored static predictability by introducing a new semantics for unstructured
programs that singles out well structured executions. The latter are represented
by structured traces, a generalisation of streams of observables that capture
several structural invariants of the execution, such as the well-nestedness of
function calls and the fact that every basic block must start with a code emission
statement. We showed that structured traces are sufficiently well behaved to
statically compute a precise cost model on the object code.
We proved that the cost model computed on the object code is also valid
on the source code provided the compiler respects two requirements: the weak
execution traces of the source and target code must be the same, and the object
code execution fragments must be structured.
To prove that a compiler respects these requirements we extended the notion
of forward simulation proof for a labelled transition system to grant
preservation of structured fragments. If the source language of the compiler
is structured, all its execution fragments are, which allows us to deduce from
preservation of structure that object code fragments are structured too.
Finally, we identified measurable execution fragments: those whose
execution time (once compiled) can be exactly computed by looking at the object
code weak execution traces only. A final instrumentation pass on the source
code can then be used to turn the non-functional property of having a certain
cost into the functional property that a certain global variable, incremented
at the beginning of every block according to the induced cost model, has a
certain value.
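A minimal sketch of such an instrumentation pass (the statement representation and all names are invented for illustration; this is not the CerCo implementation): every cost emission statement is replaced by an increment of a global cost variable by the statically computed cost of the block it opens.

```python
# Hypothetical pass: replace each ("EMIT", L) statement by an increment of a
# global __cost variable by k(L), so the predicted cost of the run becomes an
# ordinary program value that functional verification tools can reason about.

def instrument(stmts, k):
    """Turn label emissions into increments of a global __cost variable."""
    out = []
    for s in stmts:
        if s[0] == "EMIT":                        # cost emission statement
            out.append(("INCR", "__cost", k[s[1]]))
        else:
            out.append(s)                         # all other statements kept
    return out

prog = [("EMIT", "L1"), ("ASSIGN", "x", 0), ("EMIT", "L2"), ("RET",)]
k = {"L1": 10, "L2": 33}
print(instrument(prog, k))
```

After the pass, proving "the run costs at most c cycles" reduces to proving that `__cost <= c` holds at the end of the run.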
All results presented in the paper are part of a larger certification of a
C compiler which is based on the labelling approach. The certification, done
in Matita, is the main deliverable of the FET-Open Certified Complexity (CerCo)
project.

Our short-term future work consists in completing the certification of
the CerCo compiler by exploiting the main theorem of this paper.
\paragraph{Related works.}
CerCo is the first project that explicitly tries to induce a
precise cost model on the source code in order to establish non-functional
properties of programs in a high-level language. Traditional certifications
of compilers, like~\cite{compcert2,piton}, only explicitly prove preservation
of the functional properties.
Usually forward simulations take the following form: for each transition
from $s_1$ to $s_2$ in the source code, there exists an equivalent sequence of
transitions in the target code of length $n$. The number $n$ of transition steps
in the target code can just be the witness of the existential statement.
An equivalent alternative, when the proof of simulation is constructive, consists
in providing an explicit function, called \emph{clock function} in the
literature~\cite{clockfunctions}, that computes $n$ from $s_1$. Every clock
function then constitutes a cost model for the source code, in the spirit of
what we are doing in CerCo. However, we believe our solution to be superior
in the following respects: 1) the machinery of the labelling approach is
insensitive to the resource being measured. Indeed, any cost model computed on
the object code can be lifted to the source code (e.g. stack space used,
energy consumed, etc.) simply by re-proving an analogue of~\autoref{static}.
For example, in CerCo we transported to the source level not only the execution
time cost model, but also the amount of stack used by function calls.
On the contrary, clock functions only talk about the
number of transition steps. In order to extend the approach with clock functions
to other resources, additional functions must be introduced. Moreover, the
additional functions would be handled differently in the proof.
2) the cost models induced by the labelling approach have a simple presentation.
In particular, they associate a number with each basic block. More complex
models can be induced when the approach is scaled to cover, for instance,
loop optimisations~\cite{loopoptimizations}, but the costs are still meant to
be easy to understand and manipulate in an interactive theorem prover or
in Frama-C.
On the contrary, a clock function is a complex function of the state $s_1$
which, as a function, is an opaque object that is difficult to reify as
source code in order to reason about it.
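The clock-function alternative discussed above can be pictured with a small sketch (all names and numbers are hypothetical): a constructive simulation proof exposes, for each source state, the number $n$ of target steps simulating one source step, and summing it along a run yields a cost model — but one that counts steps only.

```python
# Hypothetical clock function: target steps needed to simulate one source
# step, as a function of the source state (here states are plain strings).

def clock(s1):
    """Assumed per-state step counts; any state not listed takes 1 step."""
    return {"loop_head": 4, "call_site": 9}.get(s1, 1)

def run_cost(source_states):
    # Total target steps for a source run, in the spirit of clock functions.
    return sum(clock(s) for s in source_states)

print(run_cost(["loop_head", "assign", "call_site"]))  # 4 + 1 + 9 = 14
```

As the comparison in the text notes, extending this scheme to stack space or energy would require additional, separately handled functions, whereas the labelling approach lifts any per-block cost model uniformly.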
\section{The formalisation}
As we already mentioned, for the sake of presentation we described the formal
content of our results departing from the actual Matita formalisation. In this
section we outline the basis of the Matita development, which relies heavily on
inductive definitions and dependent types.

\paragraph*{The main differences.} The main points where the development diverges
from the material presented in the previous sections are the following.
\begin{itemize}
 \item Rather than having a generic notion of fragment and predicates of
 structuredness and measurability, we use inductive definitions that internalise
 the conditions. Among other things this turns the proof of \autoref{preservation} into a matter of
 structural induction, where it would otherwise require more complex induction
 schemes.
 \item As we are dealing with a deterministic labelled transition system, we
 purposely blur the distinction between labelling the transitions and labelling
 the states originating them, and opt in general for the latter in the actual
 definitions.
 \item A small difference lies in the naming conventions. In this paper we went
 with the tradition of labelled transition systems and named sequences of states
 and steps \emph{execution fragments}. In the development we named our structures
 \emph{traces}. In the remainder of this section we will use both names.
\end{itemize}
\paragraph*{The main definitions.}
The notion of deterministic labelled transition system is captured in the development
via the abstract data type called $\verb+abstract_status+$. The fields
of this record are the following.
\begin{itemize}
 \item \verb+S : Type[0]+, the type of states.
 \item \verb+as_execute : S $\to$ S $\to$ Prop+, the binary predicate modelling
 the execution. As in the previous sections, we write $s_1\exec s_2$ for $\verb+as_execute+~s_1~s_2$.
 \item \verb+as_classifier : S $\to$ classification+, a function tagging all
 states with a class in
 $\{$\verb+cl_return,cl_jump,cl_call,cl_other+$\}$, depending on the instruction
 that is about to be executed (we omit tail-calls for simplicity). We will
 use $s \class c$ as a shorthand for both $\verb+as_classifier+~s=c$
 (if $c$ is a classification) and $\verb+as_classifier+~s\in c$
 (if $c$ is a set of classifications). This partly replaces the labelling of execution
 steps.
 \item \verb+as_label : S $\to$ option label+, telling whether the
 next instruction to be executed in $s$ is a cost emission statement,
 and if so returning the associated cost label. Our shorthand for this function
 will be $L$, and we will also abuse the notation by using $L~s$ as a
 predicate stating that $s$ is labelled.
 \item \verb+as_call_ident : ($\Sigma$s:S. s $\class$ cl_call) $\to$ label+,
 giving the identifier of the function which is being called in a
 \verb+cl_call+ state. We will use the shorthand $s\uparrow f$ for
 $\verb+as_call_ident+~s = f$.
 \item \verb+as_after_return : ($\Sigma$s:S. s $\class$ cl_call) $\to$ S $\to$ Prop+,
 which holds on the \verb+cl_call+ state $s_1$ and a state $s_2$ when the
 instruction to be executed in $s_2$ follows the function call to be
 executed in (the witness of the $\Sigma$-type) $s_1$, i.e. when $s_1\ar s_2$.
\end{itemize}
% % \begin{alltt}
% % record abstract_status := \{ S: Type[0];
% %  as_execute: S \(\to\) S \(\to\) Prop;   as_classifier: S \(\to\) classification;
% %  as_label: S \(\to\) option label;    as_called: (\(\Sigma\)s:S. c s = cl_call) \(\to\) label;
% %  as_after_return: (\(\Sigma\)s:S. c s = cl_call) \(\to\) S \(\to\) Prop \}
% % \end{alltt}
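For intuition only, the record just described can be rendered informally as follows (the field types are deliberately simplified, states are plain strings, and the toy instance is invented; the actual Matita record uses $\Sigma$-types to restrict two fields to \verb+cl_call+ states).

```python
# Informal, assumed rendering of the abstract_status record; not the
# formalisation itself.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class AbstractStatus:
    execute: Callable[[str, str], bool]       # as_execute : S -> S -> Prop
    classifier: Callable[[str], str]          # cl_return/cl_jump/cl_call/cl_other
    label: Callable[[str], Optional[str]]     # as_label : S -> option label
    call_ident: Callable[[str], str]          # meaningful on cl_call states only
    after_return: Callable[[str, str], bool]  # as_after_return

# Toy instance: state "c" calls f, and the call returns just before state "r".
toy = AbstractStatus(
    execute=lambda s1, s2: True,
    classifier=lambda s: "cl_call" if s == "c" else "cl_other",
    label=lambda s: "L1" if s == "l" else None,
    call_ident=lambda s: "f",
    after_return=lambda s1, s2: (s1, s2) == ("c", "r"),
)
print(toy.classifier("c"), toy.call_ident("c"))  # cl_call f
```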
The inductive type for structured traces actually consists of three mutually
inductive types with the following semantics:
\begin{itemize}
 \item $(\verb+trace_label_return+~s_1~s_2)$ (shorthand $\verb+TLR+~s_1~s_2$)
   is in fact what we called a measurable fragment ending in a return step.
%    is a trace that begins in
%    the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
%    such that the instruction to be executed in $s_1$ is a label emission
%    statement and the one to be executed in the state before $s_2$ is a return
%    statement. Thus $s_2$ is the state after the return. The trace
%    may contain other label emission statements. It captures the structure of
%    the execution of function bodies: they must start with a cost emission
%    statement and must end with a return; they are obtained by concatenating
%    one or more basic blocks, all starting with a label emission
%    (e.g. in case of loops).
 \item $(\verb+trace_label_label+~b~s_1~s_2)$ (shorthand $\verb+TLL+~b~s_1~s_2$)
   models basic blocks. It is the special case of a measurable fragment containing
   only its first label emission.
 \item $(\verb+trace_any_label+~b~s_1~s_2)$ (shorthand $\verb+TAL+~b~s_1~s_2$)
   is a structured fragment, possibly with a return step appended to it, that
   contains no label apart possibly from the one in its first state.
\end{itemize}
The above definitions summarise the formal rules shown in \autoref{fig:traces}.
\begin{figure}
\infrule
 {\verb+TLL+~true~s_1~s_2}
 {\verb+TLR+~s_1~s_2}

\infrule
 {\verb+TLL+~false~s_1~s_2 \andalso
  \verb+TLR+~s_2~s_3
 }
 {\verb+TLR+~s_1~s_3}

\infrule
 {\verb+TAL+~b~s_1~s_2 \andalso
  L~s_1
 }
 {\verb+TLL+~b~s_1~s_2}

\infrule
 {s_1\exec s_2 \andalso
  s_1\class\{\verb+cl_jump+, \verb+cl_other+\}\andalso
  L~s_2
 }
 {\verb+TAL+~false~s_1~s_2}

\infrule
 {s_1\exec s_2 \andalso
  s_1 \class \verb+cl_return+
 }
 {\verb+TAL+~true~s_1~s_2}

\infrule
 {s_1\exec s_2 \andalso
  s_1 \class \verb+cl_call+ \andalso
  s_1\ar s_3 \andalso
  \verb+TLR+~s_2~s_3 \andalso
  L~s_3
 }
 {\verb+TAL+~false~s_1~s_3}

\infrule
 {s_1\exec s_2 \andalso
  s_1 \class \verb+cl_call+ \andalso
  s_1\ar s_3 \andalso
  \verb+TLR+~s_2~s_3 \andalso
  \verb+TAL+~b~s_3~s_4
 }
 {\verb+TAL+~b~s_1~s_4}

\infrule
 {s_1\exec s_2 \andalso
  \lnot L~s_2 \andalso
  \verb+TAL+~b~s_2~s_3\andalso
  s_1 \class \verb+cl_other+
 }
 {\verb+TAL+~b~s_1~s_3}
\caption{The rules forming the inductive definitions of structured
         traces.}
\label{fig:traces}
\end{figure}
inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
  | tlr_base:
      ∀status_before: S.
      ∀status_after: S.
        trace_label_label S ends_with_ret status_before status_after →
        trace_label_return S status_before status_after
  | tlr_step:
      ∀status_initial: S.
      ∀status_labelled: S.
      ∀status_final: S.
        trace_label_label S doesnt_end_with_ret status_initial status_labelled →
        trace_label_return S status_labelled status_final →
          trace_label_return S status_initial status_final
with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
  | tll_base:
      ∀ends_flag: trace_ends_with_ret.
      ∀start_status: S.
      ∀end_status: S.
        trace_any_label S ends_flag start_status end_status →
        as_costed S start_status →
          trace_label_label S ends_flag start_status end_status
with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
  (* Single steps within a function which reach a label.
     Note that this is the only case applicable for a jump. *)
  | tal_base_not_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        (as_classifier S start_status cl_jump ∨
         as_classifier S start_status cl_other) →
        as_costed S final_status →
          trace_any_label S doesnt_end_with_ret start_status final_status
  | tal_base_return:
      ∀start_status: S.
      ∀final_status: S.
        as_execute S start_status final_status →
        as_classifier S start_status cl_return →
          trace_any_label S ends_with_ret start_status final_status
  (* A call followed by a label on return. *)
  | tal_base_call:
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_final →
          trace_label_return S status_start_fun_call status_final →
          as_costed S status_final →
            trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
  (* A call followed by a non-empty trace. *)
  | tal_step_call:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre_fun_call: S.
      ∀status_start_fun_call: S.
      ∀status_after_fun_call: S.
      ∀status_final: S.
        as_execute S status_pre_fun_call status_start_fun_call →
        ∀H:as_classifier S status_pre_fun_call cl_call.
          as_after_return S «status_pre_fun_call, H» status_after_fun_call →
          trace_label_return S status_start_fun_call status_after_fun_call →
          ¬ as_costed S status_after_fun_call →
          trace_any_label S end_flag status_after_fun_call status_final →
            trace_any_label S end_flag status_pre_fun_call status_final
  | tal_step_default:
      ∀end_flag: trace_ends_with_ret.
      ∀status_pre: S.
      ∀status_init: S.
      ∀status_end: S.
        as_execute S status_pre status_init →
        trace_any_label S end_flag status_init status_end →
        as_classifier S status_pre cl_other →
        ¬ (as_costed S status_init) →
          trace_any_label S end_flag status_pre status_end.
The equivalent of $|T|$ on fragments can be easily defined by mutual recursion
on the three types of traces defined in \autoref{fig:traces}.
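The shape of that mutual recursion can be sketched as follows (the tuple representation of traces is invented for illustration and carries none of the dependent typing of the Matita definition): flattening collects, in order, the cost labels emitted along a structured trace.

```python
# Hypothetical flattening |T| by mutual recursion on the three trace kinds.

def flatten_tlr(t):
    if t[0] == "tlr_base":                    # a single TLL
        return flatten_tll(t[1])
    _, tll, rest = t                          # tlr_step: a TLL then a TLR
    return flatten_tll(tll) + flatten_tlr(rest)

def flatten_tll(t):
    _, label, tal = t                         # tll_base: labelled first state
    return [label] + flatten_tal(tal)

def flatten_tal(t):
    if t[0] in ("tal_base_not_return", "tal_base_return"):
        return []                             # base steps emit no further label
    if t[0] == "tal_base_call":               # nested function body
        return flatten_tlr(t[1])
    if t[0] == "tal_step_call":               # nested body, then more steps
        return flatten_tlr(t[1]) + flatten_tal(t[2])
    return flatten_tal(t[1])                  # tal_step_default: silent step

example = ("tlr_step",
           ("tll_base", "L1", ("tal_base_not_return",)),
           ("tlr_base", ("tll_base", "L2", ("tal_base_return",))))
print(flatten_tlr(example))  # ['L1', 'L2']
```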
\paragraph{The forward simulation result.}
\tikzset{diag/.append style=
         {every node/.append style={is other,
                                    text height=0,text depth=0,text width=0,
                                    text opacity=0}
         },
         }
% \begin{subfigure}{.475\linewidth}
% \centering
% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
%                             every label/.style=overlay, node distance=10mm]
%     \matrix [diag] (m) {%
%          \node (s1) [is jump] {}; & \node [fill=white] (t1) {};\\
%          \node (s2) {}; & \node (t2) {}; \\
%     };
%     \node [above=0 of t1, overlay] {$\alpha$};
%     {[-stealth]
%     \draw (s1) -- (t1);
%     \draw [new] (s2) -- node [above] {$*$} (t2);
%     }
%     \draw (s1) to node [rel] {$\S$} (s2);
%     \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
% \end{tikzpicture}
% \caption{The \verb+cl_jump+ case.}
% \label{subfig:cl_jump}
% \end{subfigure}
% &
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag] (m) {%
         \node (s1) {}; & \node (t1) {};\\
         \node (s2) {}; & \node [newn] (t2) {}; \\
    };
    {[-stealth]
    \draw (s1) -- (t1);
    \draw [new] (s2) -- node [above] {$*$} (t2);
    }
    \draw (s1) to [bend right, anchor=mid] node [rel] {$\S$} (s2);
    \draw [new] (t1) to [bend left, anchor=mid] node [rel] {$\S,\L$} (t2);
\end{tikzpicture}
\caption{The \verb+cl_other+ and \verb+cl_jump+ cases.}
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag, small gap] (m) {%
         &\node (t1) {}; \\
         \node (s1) {}; \\
         && \node [newn] (l) {}; & \node [newn] (t2) {};\\
         \node (s2) {}; & \node [newn] (c) {};\\
    };
    {[-stealth]
    \draw (s1) -- node [above left] {$f$} (t1);
    \draw [new] (s2) -- node [above] {$*$} (c);
    \draw [new] (c) -- node [below right] {$f$} (l);
    \draw [new] (l) -- node [above] {$*$} (t2);
    }
    \draw (s1) to [bend right] node [rel] {$\S$} (s2);
    \draw [new] (t1) to [bend left] node [rel] {$\S$} (t2);
    \draw [new] (t1) to [bend left] node [rel] {$\L$} (l);
    \draw [new] (s1) to [bend left] node [rel] {$\C$} (c);
    \end{tikzpicture}
\caption{The \verb+cl_call+ case.}
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
                            every label/.style=overlay, node distance=10mm]
    \matrix [diag, small gap] (m) {%
        \node (s1) {}; \\
        &\node (t1) {}; \\
        \node (s2) {}; & \node [newn] (c) {};\\
        && \node [newn] (r) {}; & \node [newn] (t2) {}; \\
    };
    {[-stealth]
    \draw (s1) -- (t1);
    \draw [new] (s2) -- node [above] {$*$} (c);
    \draw [new] (c) -- (r);
    \draw [new] (r) -- node [above] {$*$} (t2);
    }
    \draw (s1) to [bend right] node [rel] {$\S$} (s2);
    \draw [new, overlay] (t1) to [bend left=60] node [rel] {$\S,\L$} (t2);
    \draw [new, overlay] (t1) to [bend left ] node [rel] {$\R$} (r);
\end{tikzpicture}
\caption{The \verb+cl_return+ case.}
\caption{Mnemonic diagrams depicting the hypotheses for the preservation of structured traces.
         Dashed lines and arrows indicate how the diagrams must be closed when solid relations
         are present.}
% There are three mutual structural recursive functions, one for each of
% \verb+TLR+, \verb+TLL+ and \verb+TAL+, for which we use the same notation
% $|\,.\,|$: the \emph{flattening} of the traces. These functions
% allow to extract from a structured trace the list of emitted cost labels.
%  We only show here the type of one
% of them:
% \begin{alltt}
% flatten_trace_label_return:
%  \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
%   trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
% \end{alltt}
% \paragraph{Structured traces similarity and cost prediction invariance.}
% A compiler pass maps source to object code and initial states to initial
% states. The source code and initial state uniquely determine the structured
% trace of a program, if it exists. The structured trace fails to exists iff
% the structural conditions are violated by the program execution (e.g. a function
% body does not start with a cost emission statement). Let us assume that the
% target structured trace exists.
% What is the relation between the source and target structured traces?
% In general, the two traces can be arbitrarily different. However, we are
% interested only in those compiler passes that maps a trace $\tau_1$ to a trace
% $\tau_2$ such that
% \begin{equation}|\tau_1| = |\tau_2|.\label{th2}\end{equation}
% The reason is that the combination of~\eqref{th1} with~\eqref{th2} yields the
% corollary
% \begin{equation}\label{th3}
% \forall s_1,s_2. \forall \tau: \verb+TLR+~s_1~s_2.~
%   \verb+clock+~s_2 - \verb+clock+~s_1 =
%   \Sigma_{\alpha \in |\tau_1|}\;k(\alpha) =
%   \Sigma_{\alpha \in |\tau_2|}\;k(\alpha).
% \end{equation}
% This corollary states that the actual execution time of the program can be computed equally well on the source or target language. Thus it becomes possible to
% transfer the cost model from the target to the source code and reason on the
% source code only.
% We are therefore interested in conditions stronger than~\eqref{th2}.
% Therefore we introduce here a similarity relation between traces with
% the same structure. Theorem~\verb+tlr_rel_to_traces_same_flatten+
% in the Matita formalisation shows that~\eqref{th2} holds for every pair
% $(\tau_1,\tau_2)$ of similar traces.
% Intuitively, two traces are similar when one can be obtained from
% the other by erasing or inserting silent steps, i.e. states that are
% not \verb+as_costed+ and that are classified as \verb+cl_other+.
% Silent steps do not alter the structure of the traces.
% In particular,
% the relation maps function calls to function calls to the same function,
% label emission statements to emissions of the same label, concatenation of
% subtraces to concatenation of subtraces of the same length and starting with
% the same emission statement, etc.
% In the formalisation the three similarity relations --- one for each trace
% kind --- are defined by structural recursion on the first trace and pattern
% matching over the second. Here we turn
% the definition into the inference rules shown in \autoref{fig:txx_rel}
% for the sake of readability. We also omit from trace constructors all arguments,
% but those that are traces or that
% are used in the premises of the rules. By abuse of notation we denote all three
% relations by infixing $\approx$.
% \begin{figure}
% \begin{multicols}{2}
% \infrule
%  {tll_1\approx tll_2
%  }
%  {\verb+tlr_base+~tll_1 \approx \verb+tlr_base+~tll_2}
% \infrule
%  {tll_1 \approx tll_2 \andalso
%   tlr_1 \approx tlr_2
%  }
%  {\verb+tlr_step+~tll_1~tlr_1 \approx \verb+tlr_step+~tll_2~tlr_2}
% \end{multicols}
% \vspace{3ex}
% \begin{multicols}{2}
% \infrule
%  {L~s_1 = L~s_2 \andalso
%   tal_1\approx tal_2
%  }
%  {\verb+tll_base+~s_1~tal_1 \approx \verb+tll_base+~s_2~tal_2}
% \infrule
%  {tal_1\approx tal_2
%  }
%  {\verb+tal_step_default+~tal_1 \approx tal_2}
% \end{multicols}
% \vspace{3ex}
% \infrule
%  {}
%  {\verb+tal_base_not_return+\approx taa \append \verb+tal_base_not_return+}
% \vspace{1ex}
% \infrule
%  {}
%  {\verb+tal_base_return+\approx taa \append \verb+tal_base_return+}
% \vspace{1ex}
% \infrule
%  {tlr_1\approx tlr_2 \andalso
%   s_1 \uparrow f \andalso s_2\uparrow f
%  }
%  {\verb+tal_base_call+~s_1~tlr_1\approx taa \append \verb+tal_base_call+~s_2~tlr_2}
% \vspace{1ex}
% \infrule
%  {tlr_1\approx tlr_2 \andalso
%   s_1 \uparrow f \andalso s_2\uparrow f \andalso
%   \verb+tal_collapsable+~tal_2
%  }
%  {\verb+tal_base_call+~s_1~tlr_1 \approx taa \append \verb+tal_step_call+~s_2~tlr_2~tal_2)}
% \vspace{1ex}
% \infrule
%  {tlr_1\approx tlr_2 \andalso
%   s_1 \uparrow f \andalso s_2\uparrow f \andalso
%   \verb+tal_collapsable+~tal_1
%  }
%  {\verb+tal_step_call+~s_1~tlr_1~tal_1 \approx taa \append \verb+tal_base_call+~s_2~tlr_2)}
% \vspace{1ex}
% \infrule
%  {tlr_1 \approx tlr_2 \andalso
%   s_1 \uparrow f \andalso s_2\uparrow f\andalso
%   tal_1 \approx tal_2 \andalso
%  }
%  {\verb+tal_step_call+~s_1~tlr_1~tal_1 \approx taa \append \verb+tal_step_call+~s_2~tlr_2~tal_2}
% \caption{The inference rule for the relation $\approx$.}
% \label{fig:txx_rel}
% \end{figure}
% %
% \begin{comment}
% \begin{verbatim}
% let rec tlr_rel S1 st1 st1' S2 st2 st2'
%   (tlr1 : trace_label_return S1 st1 st1')
%   (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
% match tlr1 with
%   [ tlr_base st1 st1' tll1 ⇒
%     match tlr2 with
%     [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
%     | _ ⇒ False
%     ]
%   | tlr_step st1 st1' st1'' tll1 tl1 ⇒
%     match tlr2 with
%     [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
%       tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
%     | _ ⇒ False
%     ]
%   ]
% and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
%  (tll1 : trace_label_label S1 fl1 st1 st1')
%  (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
%   match tll1 with
%   [ tll_base fl1 st1 st1' tal1 H ⇒
%     match tll2 with
%     [ tll_base fl2 st2 st2 tal2 G ⇒
%       as_label_safe … («?, H») = as_label_safe … («?, G») ∧
%       tal_rel … tal1 tal2
%     ]
%   ]
% and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
%  (tal1 : trace_any_label S1 fl1 st1 st1')
%  (tal2 : trace_any_label S2 fl2 st2 st2')
%    on tal1 : Prop ≝
%   match tal1 with
%   [ tal_base_not_return st1 st1' _ _ _ ⇒
%     fl2 = doesnt_end_with_ret ∧
%     ∃st2mid,taa,H,G,K.
%     tal2 ≃ taa_append_tal ? st2 ??? taa
%       (tal_base_not_return ? st2mid st2' H G K)
%   | tal_base_return st1 st1' _ _ ⇒
%     fl2 = ends_with_ret ∧
%     ∃st2mid,taa,H,G.
%     tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
%       (tal_base_return ? st2mid st2' H G)
%   | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
%     fl2 = doesnt_end_with_ret ∧
%     ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
%     ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
%     (* we must allow a tal_base_call to be similar to a call followed
%       by a collapsable trace (trace_any_any followed by a base_not_return;
%       we cannot use trace_any_any as it disallows labels in the end as soon
%       as it is non-empty) *)
%     (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
%       tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
%     ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
%     ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
%       tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
%       tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
%   | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
%     ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
%     ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
%     (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
%       tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
%       tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
%     ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
%     ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
%       tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
%       tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
%   | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
%     tal_rel … tl1 tal2 (* <- this makes it many to many *)
%   ].
% \end{verbatim}
% \end{comment}
% %
% In the preceding rules, a $taa$ is an inhabitant of the
% $\verb+trace_any_any+~s_1~s_2$ (shorthand $\verb+TAA+~s_1~s_2$),
% an inductive data type whose definition
% is not in the paper for lack of space. It is the type of valid
% prefixes (even empty ones) of \verb+TAL+'s that do not contain
% any function call. Therefore it
% is possible to concatenate (using ``$\append$'') a \verb+TAA+ to the
% left of a \verb+TAL+. A \verb+TAA+ captures
% a sequence of silent moves.
% The \verb+tal_collapsable+ unary predicate over \verb+TAL+'s
% holds when the argument does not contain any function call and it ends
% with a label (not a return). The intuition is that after a function call we
% can still perform a sequence of silent actions while remaining similar.
% As should be expected, even though the rules are asymmetric $\approx$ is in fact
% an equivalence relation.
% \appendix
% \section{Notes for the reviewers}
% The results described in the paper are part of a larger formalization
% (the certification of the CerCo compiler). At the moment of the submission
% we need to single out from the CerCo formalization the results presented here.
% Before the 16-th of February we will submit an attachment that contains the
% minimal subset of the CerCo formalization that allows to prove those results.
% At that time it will also be possible to measure exactly the size of the
% formalization described here. At the moment a rough approximation suggests
% about 2700 lines of Matita code.
% We will also attach the development version of the interactive theorem
% prover Matita that compiles the submitted formalization. Another possibility
% is to backport the development to the last released version of the system
% to avoid having to re-compile Matita from scratch.
% The programming and certification style used in the formalization heavily
% exploit dependent types. Dependent types are used: 1) to impose invariants
% by construction on the data types and operations (e.g. a traces from a state
% $s_1$ to a state $s_2$ can be concatenad to a trace from a state
% $s_2'$ to a state $s_3$ only if $s_2$ is convertible with $s_2'$); 2)
% to state and prove the theorems by using the Russell methodology of
% Matthieu Sozeau\footnote{Subset Coercions in Coq in TYPES'06. Matthieu Sozeau. Thorsten Altenkirch and Conor McBride (Eds). Volume 4502 of Lecture Notes in Computer Science. Springer, 2007, pp.237-252.
% }, better known in the Coq world as ``\verb+Program+'' and reimplemented in a simpler way in Matita using coercion propagations\footnote{Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, Enrico Tassi: A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions. Logical Methods in Computer Science 8(1) (2012)}. However, no result presented depends
% mandatorily on dependent types: it should be easy to adapt the technique
% and results presented in the paper to HOL.
% Finally, Matita and Coq are based on minor variations of the Calculus of
% (Co)Inductive Constructions. These variations do not affect the CerCo
% formalization. Therefore a porting of the proofs and ideas to Coq would be
% rather straightforward.