Changeset 3347


Timestamp:
Jun 12, 2013, 6:14:31 PM
Author:
tranquil
Message:

moved forward...

File:
1 edited

Legend:

Unmodified
Added
Removed
  • Papers/itp-2013/ccexec2.tex

    r3346 r3347  
    1 \documentclass{llncs}
     1\documentclass[bookmarksdepth=2,bookmarksopen]{llncs}
    22\usepackage{hyperref}
    33\usepackage{graphicx}
     
    99\usepackage{subcaption}
    1010\usepackage{listings}
    11 \usepackage{amssymb}
     11\usepackage{amssymb,amsmath}
    1212% \usepackage{amsmath}
    1313\usepackage{multicol}
     
    2727\lstdefinelanguage[mine]{C}[ANSI]{C}{
    2828  morekeywords={int8_t},
    29   mathescape
     29  mathescape,
     30  escapechar=\%
    3031}
    3132
    32 \lstset{language=[mine]C,basicstyle=\footnotesize\tt,columns=flexible,breaklines=false,
    33         keywordstyle=\color{red}\bfseries,
    34         keywordstyle=[2]\color{blue},
    35         commentstyle=\color{green},
    36         stringstyle=\color{blue},
     33\newcommand\LSTfont{\tt\fontsize{8}{8.2}\selectfont}
     34
     35\lstset{language=[mine]C,basicstyle=\LSTfont,breaklines=false,
     36%         keywordstyle=\color{red}\bfseries,
     37%         keywordstyle=[2]\color{blue},
     38%         commentstyle=\color{green},
     39%         stringstyle=\color{blue},
    3740        showspaces=false,showstringspaces=false,
    38         xleftmargin=1em}
     41%         xleftmargin=1em
     42        }
    3943
    4044\usepackage{tikz}
     
    9094                      every node/.style={draw, is other}},
    9195         small vgap/.style={row sep={7mm,between origins}},
     96         small gap/.style={row sep={7mm,between origins},column sep={7mm,between origins}},
    9297         % for article, maybe temporary
    9398         is jump/.style=is other,
     
    101106\def\C{\mathrel{\mathcal C}}
    102107
     108\def\Labels{\mathbb L}
     109\def\Functions{\mathbb F}
     110
    103111\newsavebox{\execbox}
    104112\savebox{\execbox}{\tikz[baseline=-.5ex]\draw [-stealth] (0,0) -- ++(1em, 0);}
    105113\newcommand{\exec}{\ensuremath{\mathrel{\usebox{\execbox}}}}
    106 \let\ar\rightsquigarrow
    107 \renewcommand{\verb}{\lstinline[mathescape]}
     114\let\ar\triangleright
     115\renewcommand{\verb}{\lstinline[mathescape,basicstyle=\tt\selectfont]}
    108116
    109117\let\class\triangleright
     
    112120\newcommand{\append}{\mathbin{@}}
    113121\newcommand{\iffdef}{\mathrel{:\Leftrightarrow}}
     122\newcommand{\Deltacost}{\Delta\verb+cost+}
     123
     124\let\oldto\to
     125\makeatletter
     126\def\to{\@ifnextchar[{\new@to}{\oldto}}
     127\def\new@to[#1]{\xrightarrow{#1}}
     128\def\To{\@ifnextchar[{\new@To}{\implies}}
     129\def\new@To[#1]{\stackrel{#1}{\implies}}
     130\makeatother
     131
    114132
    115133\begin{document}
     
    167185In~\cite{easylabelling} the authors provide an example of a simple
    168186certified compiler that implements the labelling approach for the
    169 imperative \texttt{While} language~\cite{while}, that does not have
      187imperative \verb+While+ language~\cite{while}, which does not have
    170188pointers and function calls.
    171189
     
    187205revisitation of the proof technique given in~\cite{easylabelling}. In
    188206particular, at the core of the labelling approach there is a forward
    189 simulation proof that, in the case of \texttt{While}, is only minimally
     207simulation proof that, in the case of \verb+While+, is only minimally
    190208more complex than the proof required for the preservation of the
    191209functional properties only. In the case of a programming language with
     
    214232\section{The labelling approach}
    215233\label{labelling}
    216 
    217 \section{A brief introduction to the labelling approach}
    218 
    219 \begin{figure}
    220 \begin{verbatim}
    221 EMIT L_1;                         EMIT L_1         cost += k_1;
    222 I_1;                              I_3              I_1;
    223 for (i=0; i<2; i++) {        l_1: COND l_2         for (i=0; i<2; i++) {
    224   EMIT L_2;                       EMIT L_2           cost += k_2;           
    225   I_2;                            I_4                I_2;               
    226  }                                GOTO l_1          }                   
    227 EMIT L_3;                    l_2: EMIT L_3         cost += k_3;           
    228 \end{verbatim}
    229 \caption{The labelling approach applied to a simple program.\label{examplewhile}. The $I_i$ are sequences of instructions not containing jumps or loops. }
    230 \end{figure}
    231 We briefly explain the labelling approach on the example in Figure~\ref{examplewhile}. The user wants to analyse the execution time of the program made by
    232 the black lines in the r.h.s. of the figure. He compiles the program using
     234% \subsection{A brief introduction to the labelling approach}
     235We briefly explain the labelling approach on the example in \autoref{examplewhile}.
     236The user wants to analyse the execution time of the program (the black lines in
     237\autoref{subfig:example_input}). He compiles the program using
    233238a special compiler that first inserts in the code three label emission
    234 statements (\texttt{EMIT L$_i$}, in red) to mark the beginning of basic blocks;
    235 then the compiler compiles the code to machine code (in the
    236 middle of the figure), granting that the execution of the source and object
     239statements (\verb+EMIT L_i+) to mark the beginning of basic blocks;
     240then the compiler compiles the code to machine code (\autoref{subfig:example_oc}),
     241granting that the execution of the source and object
    237242code emits the same sequence of labels ($L_1; L_2; L_2; L_3$ in the example).
    238243This is achieved by keeping track of basic blocks during compilation, avoiding
     
    244249that belong to the corresponding basic block. For example, the cost $k_1$
    245250associated to $L_1$ is the number of cycles required to execute the block
    246 $I_3$ and $COND l_2$, while the cost $k_2$ associated to $L_2$ counts the
    247 cycles required by the block $I_4$ and $GOTO l_1$. The compiler also guarantees
     251$I_3$ and \verb+COND l_2+, while the cost $k_2$ associated to $L_2$ counts the
     252cycles required by the block $I_4$ and \verb+GOTO l_1+. The compiler also guarantees
    248253that every executed instruction is in the scope of some code emission label,
    249254that each scope does not contain loops (to associate a finite cost), and that
     
    252257of the program $\Delta_t$ is equal to the sum over the sequence of emitted
    253258labels of the cost associated to every label:
    254 $\Delta t = k(L_1; L_2; L_2; L_3) = k_1 + k_2 + k_2 + k_3$.
     259$\Delta_t = k(L_1; L_2; L_2; L_3) = k_1 + k_2 + k_2 + k_3$.
    255260Finally, the compiler emits an instrumented version of the source code
    256 (in the r.h.s. of the figure) where label emission statements are replaced
    257 by increments of a global variable cost that, before every increment, holds the
     261(\autoref{subfig:example_instrument}) where label emission statements are replaced
     262by increments of a global variable \verb+cost+ that, before every increment, holds the
    258263exact number of clock cycles spent by the microprocessor so far:
    259 the difference $\Delta clock$ between the final and initial value of the variable clock is $\Delta clock = k_1 + k_2 + k_2 + k_3 = \Delta t$. Finally, the
      264the difference $\Deltacost$ between the final and initial value of the
      265variable \verb+cost+ is $\Deltacost = k_1 + k_2 + k_2 + k_3 = \Delta_t$. Finally, the
    260266user can employ any available method (e.g. Hoare logic, invariant generators,
    261 abstract interpretation and automated provers) to certify that $\Delta clock$
     267abstract interpretation and automated provers) to certify that $\Deltacost$
    262268never exceeds a certain bound~\cite{cerco}, which is now a functional property
    263269of the code.
    264 
    265 \section{The labelling approach in presence of loops}
    266 
     270\begin{figure}
     271\begin{subfigure}[t]{.32\linewidth}
     272\begin{lstlisting}[moredelim={[is][\color{red}]{|}{|}}]
     273|EMIT L_1;|
     274$I_1$;
     275for (i=0; i<2; i++) {
     276  |EMIT L_2;|
     277  $I_2$;
     278}
     279|EMIT L_3;|
     280\end{lstlisting}
     281\caption{The input program (black lines) with its labelling (red lines).}
     282\label{subfig:example_input}
     283\end{subfigure}
     284\hfill
     285\begin{subfigure}[t]{.23\linewidth}
     286\begin{lstlisting}
     287     EMIT L_1
     288     $I_3$
     289l_1: COND l_2
     290     EMIT L_2
     291     $I_4$
     292     GOTO l_1
     293l_2: EMIT L_3
     294\end{lstlisting}
     295\caption{The output labelled object code.}
     296\label{subfig:example_oc}
     297\end{subfigure}
     298\hfill
     299\begin{subfigure}[t]{.32\linewidth}
     300\begin{lstlisting}
     301cost += $k_1$;
     302$I_1$;
     303for (i=0; i<2; i++) {
     304cost += $k_2$;
     305  $I_2$;
     306}
     307cost += $k_3$;           
     308\end{lstlisting}
     309\caption{The output instrumented code.}
     310\label{subfig:example_instrument}
     311\end{subfigure}
     312% \begin{verbatim}
     313% EMIT L_1;                         EMIT L_1         cost += k_1;
     314% I_1;                              I_3              I_1;
     315% for (i=0; i<2; i++) {        l_1: COND l_2         for (i=0; i<2; i++) {
     316%   EMIT L_2;                       EMIT L_2           cost += k_2;           
     317%   I_2;                            I_4                I_2;               
     318%  }                                GOTO l_1          }                   
     319% EMIT L_3;                    l_2: EMIT L_3         cost += k_3;           
     320% \end{verbatim}
      321\caption{The labelling approach applied to a simple program. The $I_i$ are sequences of instructions not containing jumps or loops.\label{examplewhile}}
     322\end{figure}
     323
     324\subsection{The labelling approach in presence of calls}
     325%
    267326Let's now consider a simple program written in C that contains a function
    268 pointer call inside the scope of the cost label $L_1$.
    269 \begin{verbatim}
    270 main: EMIT L_1       g: EMIT L_3   EMIT L_1     g: EMIT L_3
    271       I_1;              I_3;       I_4;            I_6;
    272       (*f)();           return;    CALL            RETURN
    273       I_2;                         I_5;
    274       EMIT L_2                     EMIT L_2
    275 \end{verbatim}
     327pointer call inside the scope of the cost label $L_1$, in \autoref{subfig:call_input}.
    276328The labelling method works exactly as before, inserting
    277 code emission statements/\texttt{cost} variable increments at the beginning
     329code emission statements/\verb+cost+ variable increments at the beginning
    278330of every basic block and at the beginning of every function. The compiler
    279331still grants that the sequence of labels observed on the two programs are
    280 the same. A new difficulty happears when the compiler needs to statically
     332the same. A new difficulty appears when the compiler needs to statically
    281333analyze the object code to assign a cost to every label. What should the scope
    282 of the $L_1$ label be? After executing the $I_4$ block, the \texttt{CALL}
     334of the $L_1$ label be? After executing the $I_4$ block, the \verb+CALL+
    283335statement passes control to a function that cannot be determined statically.
    284336Therefore the cost of executing the body must be paid by some other label
     
    286338statement). What label should pay the cost of the block $I_5$? The only
    287339reasonable answer is $L_1$, i.e. \emph{the scope of labels should extend to the
    288 next label emission statement, stepping over function calls}.
     340next label emission statement or the end of the function, stepping over function calls}.
     341%
     342\begin{figure}
     343{}\hfill
     344\begin{subfigure}[t]{.45\linewidth}
     345\centering
     346\begin{lstlisting}[xleftmargin=20pt]
     347void main() {
     348  EMIT L_1;
     349  $I_1$;
     350  (*f)();
     351  $I_2$;
     352}
     353void g() {
     354  EMIT L_2;
     355  $I_3$;
     356}
     357\end{lstlisting}
     358\caption{The input labelled C program.}
     359\label{subfig:call_input}
     360\end{subfigure}
     361\hfill
     362\begin{subfigure}[t]{.45\linewidth}
     363\centering
     364\begin{lstlisting}[xleftmargin=20pt]
     365main:
     366  EMIT L_1
     367  $I_4$
     368  CALL
     369  $I_5$
     370  RETURN
     371g:
     372  EMIT L_2
     373  $I_6$
     374  RETURN
     375\end{lstlisting}
     376\caption{The output labelled object code.}
     377\label{subfig:call_output}
     378\end{subfigure}
     379\hfill{}
     380\caption{An example compilation of a simple program with a function pointer
     381         call.}
     382\label{subfig:call_example}
     383\end{figure}
    289384
    290385The latter definition of scope is adequate on the source level because
     
    295390languages that use a writable control stack to store the return address of
    296391calls. For example, $I_6$ could increment by $1$ the return address on the
    297 stack so that the next \texttt{RETURN} would start at the second instruction
     392stack so that the next \verb+RETURN+ would start at the second instruction
    298393of $I_5$. The compiler would still be perfectly correct if a random, dead
    299 code instruction was also added just after each \texttt{CALL}. More generally,
     394code instruction was also added just after each \verb+CALL+. More generally,
    300395\emph{there is no guarantee that a correct compiler that respects the functional
    301396behaviour of a program also respects the calling structure of the source code}.
     
    314409by our compiler are of that type. We call them \emph{structured} since their
    315410main property is to respect properties that hold for free on the source code
    316 because the source language is structured. Moreover, in order to avoid proving
     411because of structure. Moreover, in order to avoid proving
    317412too many preservation properties of our compiler, we drop the original
    318 requirements on the object code (all instructons must be in scope of some labels, no loops inside a scope, etc.) in favour of the corresponding requirement
    319 for structured runs (a structured run must start with a label emission, no instruction can be executed twice between two emissions, etc.).
      413requirements on the object code (all instructions must be in the scope of some label,
     414no loops inside a scope, etc.) in favour of the corresponding requirement
     415for structured runs (a structured run must start with a label emission, no
     416instruction can be executed twice between two emissions, etc.).
    320417
    321418We will therefore proceed as follows. In the following section
     
    332429which is hard to prove formally and much more demanding than the simple forward
    333430simulation required for proofs of preservation of functional properties.
    334 Therefore in Section~\ref{XXX} we will present a set of local simulation
     431Therefore in \autoref{simulation} we will present a set of local simulation
    335432conditions that refine the corresponding conditions for forward simulation and
    336433that are sufficient to grant the production of weakly similar traces.
     
    347444the sake of presentation and to ease the re-implementation of the
    348445concepts in a proof assistant which is not based on the Calculus of Inductive
    349 Constructions. However the formalization is heavily commented to allow the
     446Constructions. In any case the formalization is heavily commented to allow the
    350447reader to understand the technical details of the formalization.
    351448
    352449
    353 ====================================================
    354 
    355 We briefly sketch here a simplified version of the labelling approach as
    356 introduced in~\cite{easylabelling}. The simplification strengthens the
    357 sufficient conditions given in~\cite{easylabelling} to allow a simpler
    358 explanation. The simplified conditions given here are also used in the
    359 CerCo compiler to simplify the proof.
    360 
    361 Let $\mathcal{P}$ be a programming language whose semantics is given in
    362 terms of observables: a run of a program yields a finite or infinite
    363 stream of observables. We also assume for the time being that function
    364 calls are not available in $\mathcal{P}$. We want to associate a cost
    365 model to a program $P$ written in $\mathcal{P}$. The first step is to
    366 extend the syntax of $\mathcal{P}$ with a new construct $\texttt{emit L}$
    367 where $L$ is a label distinct from all observables of $\mathcal{P}$.
    368 The semantics of $\texttt{emit L}$ is the emission of the observable
    369 \texttt{L} that is meant to signal the beginning of a basic block.
    370 
    371 There exists an automatic procedure that injects into the program $P$ an
    372 $\texttt{emit L}$ at the beginning of each basic block, using a fresh
    373 \texttt{L} for each block. In particular, the bodies of loops, both branches
    374 of \texttt{if-then-else}s and the targets of \texttt{goto}s must all start
    375 with an emission statement.
    376 
    377 Let now $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
    378 that is organised in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
    379 language used by the compiler. We can easily extend every
    380 intermediate language (and its semantics) with an $\texttt{emit L}$ statement
    381 as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
    382 the additional difficulty that the syntax of object code is given as a
    383 sequence of bytes. The injection of an emission statement in the object code
    384 can be done using a map that maps two consecutive code addresses with the
    385 statement. The intended semantics is that, if $(pc_1,pc_2) \mapsto \texttt{emit L}$ then the observable \texttt{L} is emitted after the execution of the
    386 instruction stored at $pc_1$ and before the execution of the instruction
    387 stored at $pc_2$. The two program counters are necessary because the
    388 instruction stored at $pc_1$ can have multiple possible successors (e.g.
    389 in case of a conditional branch or an indirect call). Dually, the instruction
    390 stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
    391 target of a jump).
    392 
    393 The compiler, to be functionally correct, must preserve the observational
    394 equivalence, i.e. executing the program after each compiler pass should
    395 yield the same stream of observables. After the injection of emission
    396 statements, observables now capture both functional and non-functional
    397 behaviours.
    398 This correctness property is called in the literature a forward simulation
    399 and is sufficient for correctness when the target language is
    400 deterministic~\cite{compcert3}.
    401 We also require a stronger, non-functional preservation property: after each
    402 pass all basic blocks must start with an emission statement, and all labels
    403 \texttt{L} must be unique.
    404 
    405 Now let $M$ be the object code obtained for the program $P$. Let us suppose
    406 that we can statically inspect the code $M$ and associate to each basic block
    407 a cost (e.g. the number of clock cycles required to execute all instructions
    408 in the basic block, or an upper bound to that time). Every basic block is
    409 labelled with an unique label \texttt{L}, thus we can actually associate the
    410 cost to \texttt{L}. Let us call it $k(\texttt{L})$.
    411 
    412 The function $k$ is defined as the cost model for the object code control
    413 blocks. It can equally be used as the cost model for the source
    414 control blocks. Indeed, if the semantics of $P$ is the stream
    415 $L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
    416 also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$ because
    417 every instruction belongs to a control block and every control block is
    418 labelled. Thus it is correct to say that the execution cost of $P$ is also
    419 $\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
    420 the blocks of the high level program $P$ that is preserved by compilation.
    421 
    422 How can the user profit from the high level cost model? Suppose, for instance,
    423 that he wants to prove that the WCET of his program is bounded by $c$. It
    424 is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
    425 a purely functional property of the code. He can therefore use any technique
    426 available to certify functional properties of the source code.
    427 What is suggested in~\cite{easylabelling} is to actually instrument the
    428 source code $P$ by replacing every label emission statement
    429 $\texttt{emit L}$ with the instruction $\texttt{cost += k(L)}$ that increments
    430 a global fresh variable \texttt{cost}. The bound is now proved by establishing
    431 the program invariant $\texttt{cost} \leq c$, which can be done for example
    432 using the Frama-C~\cite{framaC} suite if the source code is some variant of
    433 C.
    434 
    435 In order to extend the labeling approach to function calls we make
    436 \verb+CALL f+ emit the observable \verb+f+ and \verb+RET+ emit a distinguished observable
    437 \verb+ret+.
    438 
    439 For example the following execution history of the program in \autoref{fig:esempio}
    440 $$I_1; \verb+CALL f+; \verb+COND l+; \verb+EMIT $\ell_2$+; I_3; \verb+RET+; I_2; \verb+RET+$$
    441 emits the trace
    442 $$\verb+main+, \verb+f+$$
    443 \begin{figure}
    444 \hfil
    445 \begin{minipage}{.2\linewidth}
    446 \begin{lstlisting}
    447 main: $\!I_1$
    448       CALL f
    449       $I_2$
    450       RET
    451 \end{lstlisting}
    452 \end{minipage}
    453 \begin{minipage}{.1\linewidth}
    454 \begin{lstlisting}
    455 main
    456 main
    457 main
    458 main
    459 \end{lstlisting}
    460 \end{minipage}
    461 \hfil
    462 \begin{minipage}{.2\linewidth}
    463 \begin{lstlisting}
    464 f: $\!$COND l
    465    EMIT $\ell_2$
    466    RET
    467 l: $\!$EMIT $\ell_3$
    468    $I_3$
    469    RET
    470 \end{lstlisting}
    471 \end{minipage}
    472 \begin{minipage}{.1\linewidth}
    473 \begin{lstlisting}
    474 f
    475 
    476 $\ell_2$
    477 
    478 $\ell_3$
    479 $\ell_3$
    480 \end{lstlisting}
    481 \end{minipage}
    482 \hfil{}
    483 \caption{}
    484 \label{fig:esempio}
    485 \end{figure}
    486 
    487 
    488 \subsection{Labelling function calls}
    489 We now want to extend the labelling approach to support function calls.
    490 For the high level, \emph{structured} programming language $\mathcal{P}$ there
    491 is not much to change.
    492 When a function is invoked, the current basic block is temporarily exited
    493 and the basic block the function starts with takes control. When the function
    494 returns, the execution of the original basic block is resumed. Thus the only
    495 significant change is that basic blocks can now be nested. Let \texttt{E}
    496 be the label of the external block and \texttt{I} the label of a nested one.
    497 Since the external starts before the internal, the semantics observed will be
    498 \texttt{E I} and the cost associated to it on the source language will be
    499 $k(\texttt{E}) + k(\texttt{I})$, i.e. the cost of executing all instructions
    500 in the block \texttt{E} first plus the cost of executing all the instructions in
    501 the block \texttt{I}. However, we know that some instructions in \texttt{E} are
    502 executed after the last instruction in \texttt{I}. This is actually irrelevant
    503 because we are here assuming that costs are additive, so that we can freely
    504 permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
    505 the function call terminates and yields back control to the basic block
    506 \texttt{E}. If the call diverges, the instrumentation
    507 $\texttt{cost += k(E)}$ executed at the beginning of \texttt{E} is still valid,
    508 but just as an upper bound to the real execution cost: only precision is lost.
    509 
    510 Let now consider what happens when we move down the compilation chain to an
    511 unstructured intermediate or final language. Here unstructured means that
    512 the only control operators are conditional and unconditional jumps, function
    513 calls and returns. Unlike a structured language, though, there is no guarantee
    514 that a function will return control just after the function call point.
    515 The semantics of the return statement, indeed, consists in fetching the
    516 return address from some internal structure (typically the control stack) and
    517 jumping directly to it. The code can freely manipulate the control stack to
    518 make the procedure return to an arbitrary position. Indeed, it is also possible
    519 to break the well-nesting of function calls/returns.
    520 
    521 Is it the case that the code produced by a correct compiler must respect the
    522 additional property that every function returns just after its function call
    523 point? The answer is negative and the property is not implied by forward
    524 simulation proofs. For instance, imagine modifying a correct compiler pass
    525 by systematically adding one to the return address on the stack and by
    526 putting a \texttt{NOP} (or any other instruction that takes one byte) after
    527 every function call. The obtained code will be functionally indistinguishable,
    528 and the added instructions will all be dead code.
    529 
    530 This lack of structure in the semantics badly interferes with the labelling
    531 approach. The reason is the following: when a basic block labelled with
    532 \texttt{E} contains a function call, it no longer makes any sense to associate
    533 to a label \texttt{E} the sum of the costs of all the instructions in the block.
    534 Indeed, there is no guarantee that the function will return into the block and
    535 that the instructions that will be executed after the return will be the ones
    536 we are paying for in the cost model.
    537 
    538 How can we make the labelling approach work in this scenario? We only see two
    539 possible ways. The first one consists in injecting an emission statement after
    540 every function call: basic blocks no longer contain function calls, but are now
    541 terminated by them. This completely solves the problem and allows the compiler
    542 to break the structure of function calls/returns at will. However, the
    543 technique has several drawbacks. First of all, it greatly augments the number
    544 of cost labels that are injected in the source code and that become
    545 instrumentation statements. Thus, when reasoning on the source code to prove
    546 non-functional properties, the user (or the automation tool) will have to handle
    547 larger expressions. Second, the more labels are emitted, the more difficult it
    548 becomes to implement powerful optimisations respecting the code structure.
    549 Indeed, function calls are usually implemented in such a way that most registers
    550 are preserved by the call, so that the static analysis of the block is not
    551 interrupted by the call and an optimisation can involve both the code before
    552 and after the function call. Third, instrumenting the source code may require
    553 unpleasant modification of it. Take, for example, the code
    554 \texttt{f(g(x));}. We need to inject an emission statement/instrumentation
    555 instruction just after the execution of \texttt{g}. The only way to do that
    556 is to rewrite the code as \texttt{y = g(x); emit L; f(y);} for some fresh
    557 variable \texttt{y}. It is pretty clear how in certain situations the obtained
    558 code would be more obfuscated and thus more difficult to reason about manually.
    559 
    560 For the previous reasons, in this paper and in the CerCo project we adopt a
    561 different approach. We do not inject emission statements after every
    562 function call. However, we want to propagate a strong additional invariant in
    563 the forward simulation proof. The invariant is the propagation of the structure
    564  of the original high level code, even if the target language is unstructured.
    565 The structure we want to propagate, which will become clearer in the next
    566 section, comprises 1) the property that every function should return just after
    567 the function call point, which in turn implies well-nesting of function calls;
    568 2) the property that every basic block starts with a code emission statement.
    569 
    570 In the original labelling approach of~\cite{easylabelling}, the second property
    571 was granted syntactically as a property of the generated code.
    572 In our revised approach, instead, we will impose the property on the runs:
    573 it will be possible to generate code that does not respect the syntactic
    574 property, as long as all possible runs respect it. For instance, dead code will no longer
    575 be required to have all basic blocks correctly labelled. The switch is suggested
    576 from the fact that the first of the two properties --- that related to
    577 function calls/returns --- can only be defined as property of runs,
    578 not of the static code. The switch is
    579 beneficial to the proof because the original proof was made of two parts:
    580 the forward simulation proof and the proof that the static property was granted.
    581 In our revised approach the latter disappears and only the forward simulation
    582 is kept.
    583 
    584 In order to capture the structure semantics so that it is preserved
    585 by a forward simulation argument, we need to make the structure observable
    586 in the semantics. This is the topic of the next section.

\section{Structured traces}

Let us consider a generic unstructured language already equipped with a
small-step structured operational semantics (SOS). We introduce a
deterministic labelled transition system~\cite{LTS} $(S,s_{\mathrm{init}},\Lambda,\to)$
that refines the
SOS by observing function calls and the beginning of basic blocks.
$S$ is the set of states of the program, $s_\mathrm{init}$ the initial state and
$\Lambda = \{ \tau, RET \} \cup \Labels \cup \Functions$,
where $\Functions$ is the set of names of functions that can occur in the
program, $\Labels$ is a set of labels disjoint from $\Functions$,
and $\tau$ and $RET$ belong to neither $\Functions$ nor $\Labels$.
The transition function is defined as $s_1 \to[o] s_2$ if
$s_1$ moves to $s_2$ according to the SOS; moreover $o = f \in \Functions$ if
the function $f$ is called, $o = RET$ if a \verb+RETURN+ is executed,
$o = L \in \Labels$ if an \verb+EMIT $L$+ is executed to signal the
beginning of a basic block, and $o = \tau$ in all other cases.
Because we assume the language to be deterministic, the label emitted can
actually be computed simply by observing $s_1$. Finally, $S$ is also endowed with
a relation $s\ar s'$ ($s'$ \emph{follows} $s$) that holds when the instruction to be
executed in $s'$ is the one immediately after the one in $s$.

Among all possible finite execution fragments we want to
identify the ones that satisfy the requirements we sketched in the previous
section. We say that an execution fragment
$s_0 \to[o_0] s_1 \to[o_1] \ldots \to[o_{n-1}] s_n$
is \emph{structured} (denoted $s_0 \To s_n$) iff the following conditions
are met.
\begin{enumerate}
 \item For every $i$, if $s_i \to[f] s_{i+1}$ then there are a
   label $L$ and a $k\ge i+2$ such that
   $s_{i+1} \to[L] s_{i+2} \To s_k \to[RET] s_{k+1}$, with
   $s_i \ar s_{k+1}$.
   In other words, $s_{i+1}$ must start execution with an \verb+EMIT $L$+
   and then continue with a structured fragment returning control
   to the instruction immediately following the call.
   This captures the requirements that the body of a function call always
   starts with a label emission statement, and that every function
   call converges, yielding back control just after the call.
 \item The number of $RET$ observables in the fragment is equal to the number
   of function calls, i.e.\ the number of observables in $\Functions$. Together
   with the previous condition, this captures the well-bracketing of the
   fragment with respect to function calls.
 \item For every $i$, if the instruction to be executed in $s_i$ is a
   conditional branch, then there is an $L$ such that $s_{i+1} \to[L] s_{i+2}$ or,
   equivalently, $s_{i+1}$ must start execution with an
   \verb+EMIT $L$+. This captures the requirement that every branch which is
   live code must start with a label emission.
\end{enumerate}
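The conditions that mention only the emitted observables (a call is immediately followed by a label emission, and $RET$s and calls are well bracketed and equal in number) can be checked on the weak trace alone. The following Python sketch is purely illustrative and not part of the formalisation: the encoding of observables is our own, and the condition involving the \emph{follows} relation on states is deliberately omitted.

```python
def is_structured(trace):
    """trace: list of observables, each one of 'tau', 'RET',
    ('call', f) or ('emit', L)."""
    depth = 0          # current nesting of pending calls
    calls = rets = 0
    for i, o in enumerate(trace):
        if isinstance(o, tuple) and o[0] == 'call':
            calls += 1
            depth += 1
            # the callee body must start with a label emission (EMIT L)
            nxt = trace[i + 1] if i + 1 < len(trace) else None
            if not (isinstance(nxt, tuple) and nxt[0] == 'emit'):
                return False
        elif o == 'RET':
            rets += 1
            depth -= 1
            if depth < 0:  # a return without a pending call
                return False
    return calls == rets   # well bracketing: every call is closed
```

For instance, the fragment emitting $L_0$, calling $f$ (whose body emits $L_1$) and returning is structured, while a body that does not start with an emission is not.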
     
computed from their weak trace by choosing a $k$ that assigns to each label
the cost of the instructions in its scope.
A structured fragment
$T = s_0 \To s_n$ is
measurable if it does not start or end in the middle of a basic block.
Ending in the middle of a block would mean having pre-paid more instructions
than the ones executed, while starting in the middle would mean not paying for
any instruction up to the first label emission.
Formally, we require $o_0 \in \Labels$ (or, equivalently,
that in $s_0$ the program emits a label) and that either
$s_{n-1}\to[RET]s_n$ or $s_n$ is a label emission statement
(i.e.\ $s_n \to[L] s_{n+1}$).
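The pre-paying discipline on a measurable fragment can be made concrete with a small sketch (again our own illustration, not the formalisation): each block is paid in full at its label emission, so the cost of the fragment is the sum of $k(L)$ over the emitted labels, and a fragment starting mid-block is rejected.

```python
def measurable_cost(trace, k):
    """Cost of a measurable fragment.
    trace: list of observables as in is_structured;
    k: dict mapping each label to the cost of its block."""
    first = trace[0] if trace else None
    if not (isinstance(first, tuple) and first[0] == 'emit'):
        raise ValueError("not measurable: fragment starts in the middle of a block")
    # pre-pay each block at its label emission
    return sum(k[o[1]] for o in trace if isinstance(o, tuple) and o[0] == 'emit')
```

With $k(L_0)=2$ and $k(L_1)=3$, the fragment emitting $L_0$, calling a function whose body emits $L_1$, and returning costs $5$.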

Given two deterministic unstructured programming languages with their own

 or, equivalently
because of determinism, that they are weakly bisimilar.

The program semantics adopted in the traditional labelling approach is based
on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
a set of states $S$, the semantics of one deterministic execution
step is
defined as a function $S \to S \times \mathcal{O}^*$ where $\mathcal{O}^*$ is a (finite) stream of
observables. The semantics is then lifted compositionally to multiple (finite
or infinite) execution steps.
Finally, the semantics of a whole program execution is obtained by forgetting
about the final state (if any), yielding a function $S \to \mathcal{O}^*$ that, given an
initial state, returns the finite or infinite stream of observables produced.
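The compositional lifting of the one-step semantics can be sketched in a few lines of Python (an illustration under our own encoding, not part of the formalisation): a step function either returns the next state together with the observables it emits, or signals termination, and the run concatenates the per-step streams.

```python
def run(step, s):
    """Lift a one-step semantics to the observable stream of a whole run.
    step(s) returns (next_state, observables) or None when the program halts."""
    while True:
        r = step(s)
        if r is None:
            return          # terminating run: finite stream
        s, obs = r
        yield from obs      # concatenate the observables of each step

# A toy three-step program emitting one observable per step:
toy_step = lambda s: (s + 1, ["o%d" % s]) if s < 3 else None
```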

We present here a new definition of semantics where the structure of execution,
as defined in the previous section, is now observable. The idea is to replace
the stream of observables with a structured data type that makes explicit
function calls and returns and that grants some additional invariants by
construction. The data structure, called \emph{structured traces}, is
defined inductively for terminating programs and coinductively for diverging
ones. In this paper we focus only on the inductive structure, i.e.\ we assume
that all programs that are given a semantics are total. The Matita formalisation
also includes the coinductive definitions. The semantics of a program is then
defined as a function that maps an initial state into a structured trace.

In order to have a definition that works on multiple intermediate languages,
we abstract the type of structured traces over an abstract data type of
abstract statuses, which we aptly call $\texttt{abstract\_status}$. The fields
of this record are the following.
\begin{itemize}
 \item \verb+S : Type[0]+, the type of states.
 \item \verb+as_execute : S $\to$ S $\to$ Prop+, a binary predicate expressing
 one execution step. We write $s_1\exec s_2$ for $\verb+as_execute+~s_1~s_2$.
 \item \verb+as_classifier : S $\to$ classification+, a function tagging all
 states with a class in
 $\{\texttt{cl\_return,cl\_jump,cl\_call,cl\_other}\}$, depending on the instruction
 that is about to be executed (we omit tail-calls for simplicity). We will
 use $s \class c$ as a shorthand for both $\texttt{as\_classifier}~s=c$
 (if $c$ is a classification) and $\texttt{as\_classifier}~s\in c$
 (if $c$ is a set of classifications).
 \item \verb+as_label : S $\to$ option label+, telling whether the
 next instruction to be executed in $s$ is a cost emission statement,
 and if so returning the associated cost label. Our shorthand for this function
 will be $\ell$, and we will also abuse the notation by using $\ell~s$ as a
 predicate stating that $s$ is labelled.
 \item \verb+as_call_ident : ($\Sigma$s:S. s $\class$ cl_call) $\to$ label+,
 returning the identifier of the function being called in a
 \verb+cl_call+ state. We will use the shorthand $s\uparrow f$ for
 $\verb+as_call_ident+~s = f$.
 \item \verb+as_after_return : ($\Sigma$s:S. s $\class$ cl_call) $\to$ S $\to$ Prop+,
 which holds on the \verb+cl_call+ state $s_1$ and a state $s_2$ when the
 instruction to be executed in $s_2$ follows the function call to be
 executed in (the witness of the $\Sigma$-type) $s_1$. We will use the notation
 $s_1\ar s_2$ for this relation.
\end{itemize}
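As a hypothetical rendering only, the record can be mirrored as a Python dataclass of function-valued fields; the field names follow the text, but the $\Sigma$-type constraints on \verb+as_call_ident+ and \verb+as_after_return+ are dropped, and everything else in the encoding is our own.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class AbstractStatus:
    """Loose Python mirror of the abstract_status record (illustrative only)."""
    as_execute: Callable[[object, object], bool]       # one execution step s1 -> s2
    as_classifier: Callable[[object], str]             # 'cl_return'|'cl_jump'|'cl_call'|'cl_other'
    as_label: Callable[[object], Optional[str]]        # cost label emitted in s, if any
    as_call_ident: Callable[[object], str]             # callee of a cl_call state
    as_after_return: Callable[[object, object], bool]  # s2 follows the call in s1
```

An instance over integer states can be built with plain lambdas, which is enough to exercise the shorthands $s\class c$, $\ell~s$ and $s_1\ar s_2$ informally.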
The inductive type for structured traces is actually made of three mutually
inductive types with the following semantics:
\begin{enumerate}
 \item $(\texttt{trace\_label\_return}~s_1~s_2)$ (shorthand $\verb+TLR+~s_1~s_2$)
   is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that the instruction to be executed in $s_1$ is a label emission
   statement and the one to be executed in the state before $s_2$ is a return
   statement. Thus $s_2$ is the state after the return. The trace
   may contain other label emission statements. It captures the structure of
   the execution of function bodies: they must start with a cost emission
   statement and must end with a return; they are obtained by concatenating
   one or more basic blocks, all starting with a label emission
   (e.g.\ in case of loops).
 \item $(\texttt{trace\_any\_label}~b~s_1~s_2)$ (shorthand $\verb+TAL+~b~s_1~s_2$)
   is a trace that begins in
   the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
   such that, according to the boolean $b$, either the instruction to be
   executed in the state before $s_2$ is a return (if $b$ is true) or the one
   to be executed in $s_2$ is a label emission statement (if $b$ is false).
   It must not contain
   any label emission statement. It captures the notion of a suffix of a
   basic block.
 \item $(\texttt{trace\_label\_label}~b~s_1~s_2)$ (shorthand $\verb+TLL+~b~s_1~s_2$) is the special case of
   $\verb+TAL+~b~s_1~s_2$ such that the instruction to be
   executed in $s_1$ is a label emission statement. It captures the notion of
   a basic block.
\end{enumerate}

\begin{multicols}{3}
\infrule[\verb+tlr_base+]
 {\texttt{TLL}~true~s_1~s_2}
 {\texttt{TLR}~s_1~s_2}

\infrule[\verb+tlr_step+]
 {\texttt{TLL}~false~s_1~s_2 \andalso
  \texttt{TLR}~s_2~s_3
 }
 {\texttt{TLR}~s_1~s_3}

\infrule[\verb+tll_base+]
 {\texttt{TAL}~b~s_1~s_2 \andalso
  \ell~s_1
 }
 {\texttt{TLL}~b~s_1~s_2}
\end{multicols}

\infrule[\verb+tal_base_not_return+]
 {s_1\exec s_2 \andalso
  s_1\class\{\verb+cl_jump+, \verb+cl_other+\} \andalso
  \ell~s_2
 }
 {\texttt{TAL}~false~s_1~s_2}

\infrule[\verb+tal_base_return+]
 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_return}
 }
 {\texttt{TAL}~true~s_1~s_2}

\infrule[\verb+tal_base_call+]
 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_call} \andalso
  s_1\ar s_3 \andalso
  \texttt{TLR}~s_2~s_3 \andalso
  \ell~s_3
 }
 {\texttt{TAL}~false~s_1~s_3}

\infrule[\verb+tal_step_call+]
 {s_1\exec s_2 \andalso
  s_1 \class \texttt{cl\_call} \andalso
  s_1\ar s_3 \andalso
  \texttt{TLR}~s_2~s_3 \andalso
  \lnot \ell~s_3 \andalso
  \texttt{TAL}~b~s_3~s_4
 }
 {\texttt{TAL}~b~s_1~s_4}

\infrule[\verb+tal_step_default+]
 {s_1\exec s_2 \andalso
  \lnot \ell~s_2 \andalso
  \texttt{TAL}~b~s_2~s_3 \andalso
  s_1 \class \texttt{cl\_other}
 }
 {\texttt{TAL}~b~s_1~s_3}
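The shape of the three trace types and their main well-formedness invariants can be approximated by a loose Python transcription (our own sketch, not the Matita datatypes): a \texttt{TAL} is a non-empty sequence of classified steps whose flag records whether it ends with a return and in which a jump can only occur last, a \texttt{TLL} is a \texttt{TAL} starting at a labelled state, and a \texttt{TLR} is a list of non-return-terminated blocks closed by a return-terminated one.

```python
class TAL:
    """trace_any_label: steps within a block, no label emission inside."""
    def __init__(self, steps, ends_with_ret):
        # steps: list of state classes, e.g. ['cl_other', 'cl_return']
        if not steps:
            raise ValueError("a trace is never empty")
        if ends_with_ret != (steps[-1] == 'cl_return'):
            raise ValueError("the flag must record whether the trace ends with a return")
        if 'cl_jump' in steps[:-1]:
            raise ValueError("a jump must be the last instruction of the trace")
        self.steps, self.ends_with_ret = steps, ends_with_ret

class TLL(TAL):
    """trace_label_label: a TAL whose first state emits a cost label."""

class TLR:
    """trace_label_return: TLLs not ending with a return, then one that does."""
    def __init__(self, blocks):
        if not blocks or not blocks[-1].ends_with_ret:
            raise ValueError("a function body must end with a return")
        if any(b.ends_with_ret for b in blocks[:-1]):
            raise ValueError("only the last block may end with a return")
        self.blocks = blocks
```

Constructing a one-block function body succeeds, while a trace with a jump in a non-final position is rejected, mirroring the side conditions of the rules above.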
    886 A \texttt{trace\_label\_return} is isomorphic to a list of
    887 \texttt{trace\_label\_label}s that ends with a cost emission followed by a
    888 return-terminated \texttt{trace\_label\_label}.
    889 The interesting cases are those of $\texttt{trace\_any\_label}~b~s_1~s_2$.
    890 A \texttt{trace\_any\_label} is a sequence of steps built by a syntax directed
    891 definition on the classification of $s_1$. The constructors of the datatype
    892 enforce several invariants that give a structure to the
    893 otherwise unstructured execution. In particular, the following invariants are
    894 imposed:
    895 \begin{enumerate}
    896  \item the trace is never empty; it ends with a return iff $b$ is
    897        true
    898  \item a jump must always be the last instruction of the trace, and it must
    899        be followed by a cost emission statement; i.e. the target of a jump
    900        is always the beginning of a new basic block; as such it must start
    901        with a cost emission statement
    902  \item a cost emission statement can never occur inside the trace, only in
    903        the status immediately after
    904  \item the trace for a function call step is made of a subtrace for the
    905        function body of type
    906        $\texttt{trace\_label\_return}~s_1~s_2$, possibly followed by the
    907        rest of the trace for this basic block. The subtrace represents the
    908        function execution. Being an inductive datum, it guarantees that
    909        the function call terminates. The status $s_2$ is the one that follows the return
    910        statement. The next instruction of $s_2$ must follow the function call
    911        instruction. As a consequence, function calls are also well nested.
    912 \end{enumerate}
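
As an illustration of how the constructors combine (an example of ours, not
drawn from the formalisation), consider a run that emits a label in $s_0$,
executes a function call in $s_1$ whose body yields a subtrace
$\tau : \texttt{trace\_label\_return}~s_2~s_3$, and, after returning in $s_3$,
ends with a jump into the labelled target state $s_4$. Assuming that the
emission statement in $s_0$ is classified \texttt{cl\_other} and that $s_1$ and
$s_3$ are not costed, the run is captured by the term
\[
\texttt{tll\_base}~s_0~(\texttt{tal\_step\_default}~(\texttt{tal\_step\_call}~s_1~\tau~(\texttt{tal\_base\_not\_return}~s_3~s_4)))
\]
of type $\texttt{TLL}~false~s_0~s_4$: the jump is the last instruction of the
trace, its target $s_4$ is costed, and the function call is well nested.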
    913 
    914 There are three mutual structural recursive functions, one for each of
    915 \verb+TLR+, \verb+TLL+ and \verb+TAL+, for which we use the same notation
    916 $|\,.\,|$: the \emph{flattening} of the traces. These functions
    917 extract from a structured trace the list of emitted cost labels.
    918 %  We only show here the type of one
    919 % of them:
    920 % \begin{alltt}
    921 % flatten_trace_label_return:
    922 %  \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
    923 %   trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
    924 % \end{alltt}
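
To fix intuitions, the flattening functions behave essentially as follows (our
paraphrase of the Matita definitions, omitting non-trace arguments as in
\autoref{fig:txx_rel}, and writing $\append$ for list concatenation, $::$ for
consing and $\ell~s$ for the label emitted at $s$):
\begin{align*}
|\texttt{tlr\_base}~tll| &= |tll| &
|\texttt{tlr\_step}~tll~tlr| &= |tll| \append |tlr|\\
|\texttt{tll\_base}~s~tal| &= \ell~s :: |tal| &
|\texttt{tal\_step\_default}~tal| &= |tal|\\
|\texttt{tal\_base\_call}~tlr| &= |tlr| &
|\texttt{tal\_step\_call}~tlr~tal| &= |tlr| \append |tal|\\
|\texttt{tal\_base\_not\_return}| &= [\,] &
|\texttt{tal\_base\_return}| &= [\,]
\end{align*}
In particular a \texttt{TAL} contributes labels only through the function
bodies nested inside it, in accordance with the invariants above.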
    925 
    926 \paragraph{Cost prediction on structured traces.}
    927 
    928 The first main theorem of CerCo about traces
    929 (theorem \texttt{compute\_max\_trace\_label\_return\_cost\_ok\_with\_trace})
    930 holds for the
    931 instantiation
    932 of the structured traces to the concrete status of object code programs.
    933 Simplifying a bit, it states that
    934 \begin{equation}\label{th1}
    935 \begin{array}{l}\forall s_1,s_2. \forall \tau: \texttt{TLR}~s_1~s_2.~
    936   \texttt{clock}~s_2 = \texttt{clock}~s_1 +
    937   \Sigma_{\alpha \in |\tau|}\;k(\alpha)
    938 \end{array}
    939 \end{equation}
    940 where the cost model $k$ is statically computed from the object code
    941 by associating to each label $\alpha$ the sum of the cost of the instructions
    942 in the basic block that starts at $\alpha$ and ends before the next labelled
    943 instruction. The theorem is proved by structural induction over the structured
    944 trace, and is based on the invariant that
    945 if the function that computes the cost model has analysed the instruction
    946 to be executed at $s_2$ after the one to be executed at $s_1$, and if
    947 the structured trace starts with $s_1$, then eventually it will contain also
    948 $s_2$. When $s_1$ is not a function call, the result holds trivially because
    949 of the $s_1\exec s_2$ condition obtained by inversion on
    950 the trace. The only non-trivial
    951 case is the one of function calls: the cost model computation function
    952 recurses on the first instruction that follows the function call; the
    953 \texttt{as\_after\_return} condition of the \texttt{tal\_base\_call} and
    954 \texttt{tal\_step\_call} constructors guarantees precisely that the execution
    955 will eventually reach this state.
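
For instance (with invented numbers), if $|\tau| = [\alpha;\beta;\beta]$,
e.g.\ because the block labelled $\beta$ is a loop body executed twice, and
the static analysis yields $k(\alpha) = 12$ and $k(\beta) = 5$ clock cycles,
then~\eqref{th1} predicts
\[
\texttt{clock}~s_2 = \texttt{clock}~s_1 + 12 + 5 + 5 = \texttt{clock}~s_1 + 22.
\]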
    956 
    957 \paragraph{Structured traces similarity and cost prediction invariance.}
    958 
    959 A compiler pass maps source to object code and initial states to initial
    960 states. The source code and initial state uniquely determine the structured
    961 trace of a program, if it exists. The structured trace fails to exist iff
    962 the structural conditions are violated by the program execution (e.g. a function
    963 body does not start with a cost emission statement). Let us assume that the
    964 target structured trace exists.
    965 
    966 What is the relation between the source and target structured traces?
    967 In general, the two traces can be arbitrarily different. However, we are
    968 interested only in those compiler passes that map a trace $\tau_1$ to a trace
    969 $\tau_2$ such that
    970 \begin{equation}|\tau_1| = |\tau_2|.\label{th2}\end{equation}
    971 The reason is that the combination of~\eqref{th1} with~\eqref{th2} yields the
    972 corollary
    973 \begin{equation}\label{th3}
    974 \forall s_1,s_2. \forall \tau_2: \texttt{TLR}~s_1~s_2.~
    975   \texttt{clock}~s_2 - \texttt{clock}~s_1 =
    976   \Sigma_{\alpha \in |\tau_1|}\;k(\alpha) =
    977   \Sigma_{\alpha \in |\tau_2|}\;k(\alpha).
    978 \end{equation}
    979 This corollary states that the actual execution time of the program can be computed equally well on the source or target language. Thus it becomes possible to
    980 transfer the cost model from the target to the source code and reason on the
    981 source code only.
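
To make the transfer concrete, the source can be instrumented so that each
basic block increments a global counter by the statically computed cost of its
label, in the style of the labelling approach; the identifier \texttt{\_\_cost}
and the numeric costs below are hypothetical, chosen only for illustration.
\begin{lstlisting}
int __cost = 0;            /* hypothetical global clock counter */
int f(int x) {
  __cost += 12;            /* $k(\alpha)$: cost of the entry block */
  while (x > 0) {
    __cost += 5;           /* $k(\beta)$: cost of the loop body block */
    x--;
  }
  return x;
}
\end{lstlisting}
Reasoning on this instrumented program alone, one can bound the final value of
\texttt{\_\_cost}, which by~\eqref{th3} equals the execution time of the
object code.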
    982 
    983 We are therefore interested in conditions stronger than~\eqref{th2}.
    984 To this end we introduce here a similarity relation between traces with
    985 the same structure. Theorem~\texttt{tlr\_rel\_to\_traces\_same\_flatten}
    986 in the Matita formalisation shows that~\eqref{th2} holds for every pair
    987 $(\tau_1,\tau_2)$ of similar traces.
    988 
    989 Intuitively, two traces are similar when one can be obtained from
    990 the other by erasing or inserting silent steps, i.e. states that are
    991 not \texttt{as\_costed} and that are classified as \texttt{cl\_other}.
    992 Silent steps do not alter the structure of the traces.
    993 In particular,
    994 the relation maps function calls to calls of the same function,
    995 label emission statements to emissions of the same label, concatenation of
    996 subtraces to concatenation of subtraces of the same length and starting with
    997 the same emission statement, etc.
    998 
    999 In the formalisation the three similarity relations --- one for each trace
    1000 kind --- are defined by structural recursion on the first trace and pattern
    1001 matching over the second. Here we turn
    1002 the definition into the inference rules shown in \autoref{fig:txx_rel}
    1003 for the sake of readability. We also omit from trace constructors all arguments
    1004 except those that are traces or that
    1005 are used in the premises of the rules. By abuse of notation we denote all three
    1006 relations by infixing $\approx$.
    1007 
    1008 \begin{figure}
    1009 \begin{multicols}{2}
    1010 \infrule
    1011  {tll_1\approx tll_2
    1012  }
    1013  {\texttt{tlr\_base}~tll_1 \approx \texttt{tlr\_base}~tll_2}
    1014 
    1015 \infrule
    1016  {tll_1 \approx tll_2 \andalso
    1017   tlr_1 \approx tlr_2
    1018  }
    1019  {\texttt{tlr\_step}~tll_1~tlr_1 \approx \texttt{tlr\_step}~tll_2~tlr_2}
    1020 \end{multicols}
    1021 \vspace{3ex}
    1022 \begin{multicols}{2}
    1023 \infrule
    1024  {\ell~s_1 = \ell~s_2 \andalso
    1025   tal_1\approx tal_2
    1026  }
    1027  {\texttt{tll\_base}~s_1~tal_1 \approx \texttt{tll\_base}~s_2~tal_2}
    1028 
    1029 \infrule
    1030  {tal_1\approx tal_2
    1031  }
    1032  {\texttt{tal\_step\_default}~tal_1 \approx tal_2}
    1033 \end{multicols}
    1034 \vspace{3ex}
    1035 \infrule
    1036  {}
    1037  {\texttt{tal\_base\_not\_return}\approx taa \append \texttt{tal\_base\_not\_return}}
    1038 \vspace{1ex}
    1039 \infrule
    1040  {}
    1041  {\texttt{tal\_base\_return}\approx taa \append \texttt{tal\_base\_return}}
    1042 \vspace{1ex}
    1043 \infrule
    1044  {tlr_1\approx tlr_2 \andalso
    1045   s_1 \uparrow f \andalso s_2\uparrow f
    1046  }
    1047  {\texttt{tal\_base\_call}~s_1~tlr_1\approx taa \append \texttt{tal\_base\_call}~s_2~tlr_2}
    1048 \vspace{1ex}
    1049 \infrule
    1050  {tlr_1\approx tlr_2 \andalso
    1051   s_1 \uparrow f \andalso s_2\uparrow f \andalso
    1052   \texttt{tal\_collapsable}~tal_2
    1053  }
    1054  {\texttt{tal\_base\_call}~s_1~tlr_1 \approx taa \append \texttt{tal\_step\_call}~s_2~tlr_2~tal_2}
    1055 \vspace{1ex}
    1056 \infrule
    1057  {tlr_1\approx tlr_2 \andalso
    1058   s_1 \uparrow f \andalso s_2\uparrow f \andalso
    1059   \texttt{tal\_collapsable}~tal_1
    1060  }
    1061  {\texttt{tal\_step\_call}~s_1~tlr_1~tal_1 \approx taa \append \texttt{tal\_base\_call}~s_2~tlr_2}
    1062 \vspace{1ex}
    1063 \infrule
    1064  {tlr_1 \approx tlr_2 \andalso
    1065   s_1 \uparrow f \andalso s_2\uparrow f\andalso
    1066   tal_1 \approx tal_2
    1067  }
    1068  {\texttt{tal\_step\_call}~s_1~tlr_1~tal_1 \approx taa \append \texttt{tal\_step\_call}~s_2~tlr_2~tal_2}
    1069 \caption{The inference rules for the relation $\approx$.}
    1070 \label{fig:txx_rel}
    1071 \end{figure}
    1072 %
    1073 \begin{comment}
    1074 \begin{verbatim}
    1075 let rec tlr_rel S1 st1 st1' S2 st2 st2'
    1076   (tlr1 : trace_label_return S1 st1 st1')
    1077   (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
    1078 match tlr1 with
    1079   [ tlr_base st1 st1' tll1 ⇒
    1080     match tlr2 with
    1081     [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
    1082     | _ ⇒ False
    1083     ]
    1084   | tlr_step st1 st1' st1'' tll1 tl1 ⇒
    1085     match tlr2 with
    1086     [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
    1087       tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
    1088     | _ ⇒ False
    1089     ]
    1090   ]
    1091 and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
    1092  (tll1 : trace_label_label S1 fl1 st1 st1')
    1093  (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
    1094   match tll1 with
    1095   [ tll_base fl1 st1 st1' tal1 H ⇒
    1096     match tll2 with
    1097     [ tll_base fl2 st2 st2 tal2 G ⇒
    1098       as_label_safe … («?, H») = as_label_safe … («?, G») ∧
    1099       tal_rel … tal1 tal2
    1100     ]
    1101   ]
    1102 and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
    1103  (tal1 : trace_any_label S1 fl1 st1 st1')
    1104  (tal2 : trace_any_label S2 fl2 st2 st2')
    1105    on tal1 : Prop ≝
    1106   match tal1 with
    1107   [ tal_base_not_return st1 st1' _ _ _ ⇒
    1108     fl2 = doesnt_end_with_ret ∧
    1109     ∃st2mid,taa,H,G,K.
    1110     tal2 ≃ taa_append_tal ? st2 ??? taa
    1111       (tal_base_not_return ? st2mid st2' H G K)
    1112   | tal_base_return st1 st1' _ _ ⇒
    1113     fl2 = ends_with_ret ∧
    1114     ∃st2mid,taa,H,G.
    1115     tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
    1116       (tal_base_return ? st2mid st2' H G)
    1117   | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
    1118     fl2 = doesnt_end_with_ret ∧
    1119     ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    1120     ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    1121     (* we must allow a tal_base_call to be similar to a call followed
    1122       by a collapsable trace (trace_any_any followed by a base_not_return;
    1123       we cannot use trace_any_any as it disallows labels in the end as soon
    1124       as it is non-empty) *)
    1125     (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
    1126       tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
    1127     ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    1128     ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
    1129       tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
    1130       tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
    1131   | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
    1132     ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
    1133     ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
    1134     (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
    1135       tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
    1136       tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
    1137     ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
    1138     ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
    1139       tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
    1140       tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
    1141   | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
    1142     tal_rel … tl1 tal2 (* <- this makes it many to many *)
    1143   ].
    1144 \end{verbatim}
    1145 \end{comment}
    1146 %
    1147 In the preceding rules, a $taa$ is an inhabitant of
    1148 $\texttt{trace\_any\_any}~s_1~s_2$ (shorthand $\texttt{TAA}~s_1~s_2$),
    1149 an inductive data type whose definition
    1150 we omit for lack of space. It is the type of valid
    1151 prefixes (even empty ones) of \texttt{TAL}'s that do not contain
    1152 any function call. Therefore it
    1153 is possible to concatenate (using ``$\append$'') a \texttt{TAA} to the
    1154 left of a \texttt{TAL}. A \texttt{TAA} captures
    1155 a sequence of silent moves.
    1156 The \texttt{tal\_collapsable} unary predicate over \texttt{TAL}'s
    1157 holds when the argument does not contain any function call and ends
    1158 with a label (not a return). The intuition is that after a function call we
    1159 can still perform a sequence of silent actions while remaining similar.
    1160 
    1161 As should be expected, even though the rules are asymmetric, $\approx$ is in fact
    1162 an equivalence relation.
     767% @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
     768%
     769% The program semantics adopted in the traditional labelling approach is based
     770% on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
     771% a set of states $\S$, the semantics of one deterministic execution
     772% step is
     773% defined as a function $S \to S \times O^*$ where $O^*$ is a (finite) stream of
     774% observables. The semantics is then lifted compositionally to multiple (finite
     775% or infinite) execution steps.
     776% Finally, the semantics of a a whole program execution is obtained by forgetting
     777% about the final state (if any), yielding a function $S \to O^*$ that given an
     778% initial status returns the finite or infinite stream of observables in output.
     779%
     780% We present here a new definition of semantics where the structure of execution,
     781% as defined in the previous section, is now observable. The idea is to replace
     782% the stream of observables with a structured data type that makes explicit
     783% function call and returns and that grants some additional invariants by
     784% construction. The data structure, called \emph{structured traces}, is
     785% defined inductively for terminating programs and coinductively for diverging
     786% ones. In the paper we focus only on the inductive structure, i.e. we assume
     787% that all programs that are given a semantics are total. The Matita formalisation
     788% also shows the coinductive definitions. The semantics of a program is then
     789% defined as a function that maps an initial state into a structured trace.
     790%
     791% In order to have a definition that works on multiple intermediate languages,
     792% we abstract the type of structure traces over an abstract data type of
     793% abstract statuses, which we aptly call $\verb+abstract_status+$. The fields
     794% of this record are the following.
     795% \begin{itemize}
     796%  \item \verb+S : Type[0]+, the type of states.
     797%  \item \verb+as_execute : S $\to$ S $\to$ Prop+, a binary predicate stating
     798%  an execution step. We write $s_1\exec s_2$ for $\verb+as_execute+~s_1~s_2$.
     799%  \item \verb+as_classifier : S $\to$ classification+, a function tagging all
     800%  states with a class in
     801%  $\{\verb+cl_return,cl_jump,cl_call,cl_other+\}$, depending on the instruction
     802%  that is about to be executed (we omit tail-calls for simplicity). We will
     803%  use $s \class c$ as a shorthand for both $\verb+as_classifier+~s=c$
     804%  (if $c$ is a classification) and $\verb+as_classifier+~s\in c$
     805%  (if $c$ is a set of classifications).
     806%  \item \verb+as_label : S $\to$ option label+, telling whether the
     807%  next instruction to be executed in $s$ is a cost emission statement,
     808%  and if yes returning the associated cost label. Our shorthand for this function
     809%  will be $\ell$, and we will also abuse the notation by using $\ell~s$ as a
     810%  predicate stating that $s$ is labelled.
     811%  \item \verb+as_call_ident : ($\Sigma$s:S. s $\class$ cl_call) $\to$ label+,
     812%  telling the identifier of the function which is being called in a
     813%  \verb+cl_call+ state. We will use the shorthand $s\uparrow f$ for
     814%  $\verb+as_call_ident+~s = f$.
     815%  \item \verb+as_after_return : ($\Sigma$s:S. s $\class$ cl_call) $\to$ S $\to$ Prop+,
     816%  which holds on the \verb+cl_call+ state $s_1$ and a state $s_2$ when the
     817%  instruction to be executed in $s_2$ follows the function call to be
     818%  executed in (the witness of the $\Sigma$-type) $s_1$. We will use the notation
     819%  $s_1\ar s_2$ for this relation.
     820% \end{itemize}
     821%
     822% % \begin{alltt}
     823% % record abstract_status := \{ S: Type[0];
     824% %  as_execute: S \(\to\) S \(\to\) Prop;   as_classifier: S \(\to\) classification;
     825% %  as_label: S \(\to\) option label;    as_called: (\(\Sigma\)s:S. c s = cl_call) \(\to\) label;
     826% %  as_after_return: (\(\Sigma\)s:S. c s = cl_call) \(\to\) S \(\to\) Prop \}
     827% % \end{alltt}
     828%
     829% The inductive type for structured traces is actually made by three multiple
     830% inductive types with the following semantics:
     831% \begin{enumerate}
     832%  \item $(\verb+trace_label_return+~s_1~s_2)$ (shorthand $\verb+TLR+~s_1~s_2$)
     833%    is a trace that begins in
     834%    the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
     835%    such that the instruction to be executed in $s_1$ is a label emission
     836%    statement and the one to be executed in the state before $s_2$ is a return
     837%    statement. Thus $s_2$ is the state after the return. The trace
     838%    may contain other label emission statements. It captures the structure of
     839%    the execution of function bodies: they must start with a cost emission
     840%    statement and must end with a return; they are obtained by concatenating
     841%    one or more basic blocks, all starting with a label emission
     842%    (e.g. in case of loops).
     843%  \item $(\verb+trace_any_label+~b~s_1~s_2)$ (shorthand $\verb+TAL+~b~s_1~s_2$)
     844%    is a trace that begins in
     845%    the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
     846%    such that the instruction to be executed in $s_2$/in the state before
     847%    $s_2$ is either a label emission statement or
     848%    or a return, according to the boolean $b$. It must not contain
     849%    any label emission statement. It captures the notion of a suffix of a
     850%    basic block.
     851%  \item $(\verb+trace_label_label+~b~s_1~s_2)$ (shorthand $\verb+TLL+~b~s_1~s_2$ is the special case of
     852%    $\verb+TAL+~b~s_1~s_2)$ such that the instruction to be
     853%    executed in $s_1$ is a label emission statement. It captures the notion of
     854%    a basic block.
     855% \end{enumerate}
     856%
     857% \begin{multicols}{3}
     858% \infrule[\verb+tlr_base+]
     859%  {\verb+TLL+~true~s_1~s_2}
     860%  {\verb+TLR+~s_1~s_2}
     861%
     862% \infrule[\verb+tlr_step+]
     863%  {\verb+TLL+~false~s_1~s_2 \andalso
     864%   \verb+TLR+~s_2~s_3
     865%  }
     866%  {\verb+TLR+~s_1~s_3}
     867%
     868% \infrule[\verb+tll_base+]
     869%  {\verb+TAL+~b~s_1~s_2 \andalso
     870%   \ell~s_1
     871%  }
     872%  {\verb+TLL+~b~s_1~s_2}
     873% \end{multicols}
     874%
     875% \infrule[\verb+tal_base_not_return+]
     876%  {s_1\exec s_2 \andalso
     877%   s_1\class\{\verb+cl_jump+, \verb+cl_other+\}\andalso
     878%   \ell~s_2
     879%  }
     880%  {\verb+TAL+~false~s_1~s_2}
     881%
     882% \infrule[\verb+tal_base_return+]
     883%  {s_1\exec s_2 \andalso
     884%   s_1 \class \verb+cl_return+
     885%  }
     886%  {\verb+TAL+~true~s_1~s_2}
     887%
     888% \infrule[\verb+tal_base_call+]
     889%  {s_1\exec s_2 \andalso
     890%   s_1 \class \verb+cl_call+ \andalso
     891%   s_1\ar s_3 \andalso
     892%   \verb+TLR+~s_2~s_3 \andalso
     893%   \ell~s_3
     894%  }
     895%  {\verb+TAL+~false~s_1~s_3}
     896%
     897% \infrule[\verb+tal_step_call+]
     898%  {s_1\exec s_2 \andalso
     899%   s_1 \class \verb+cl_call+ \andalso
     900%   s_1\ar s_3 \andalso
     901%   \verb+TLR+~s_2~s_3 \andalso
     902%   \verb+TAL+~b~s_3~s_4
     903%  }
     904%  {\verb+TAL+~b~s_1~s_4}
     905%
     906% \infrule[\verb+tal_step_default+]
     907%  {s_1\exec s_2 \andalso
     908%   \lnot \ell~s_2 \andalso
     909%   \verb+TAL+~b~s_2~s_3\andalso
     910%   s_1 \class \verb+cl_other+
     911%  }
     912%  {\verb+TAL+~b~s_1~s_3}
     913% \begin{comment}
     914% \begin{verbatim}
     915% inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
     916%   | tlr_base:
     917%       ∀status_before: S.
     918%       ∀status_after: S.
     919%         trace_label_label S ends_with_ret status_before status_after →
     920%         trace_label_return S status_before status_after
     921%   | tlr_step:
     922%       ∀status_initial: S.
     923%       ∀status_labelled: S.
     924%       ∀status_final: S.
     925%         trace_label_label S doesnt_end_with_ret status_initial status_labelled →
     926%         trace_label_return S status_labelled status_final →
     927%           trace_label_return S status_initial status_final
     928% with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
     929%   | tll_base:
     930%       ∀ends_flag: trace_ends_with_ret.
     931%       ∀start_status: S.
     932%       ∀end_status: S.
     933%         trace_any_label S ends_flag start_status end_status →
     934%         as_costed S start_status →
     935%           trace_label_label S ends_flag start_status end_status
     936% with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
     937%   (* Single steps within a function which reach a label.
     938%      Note that this is the only case applicable for a jump. *)
     939%   | tal_base_not_return:
     940%       ∀start_status: S.
     941%       ∀final_status: S.
     942%         as_execute S start_status final_status →
     943%         (as_classifier S start_status cl_jump ∨
     944%          as_classifier S start_status cl_other) →
     945%         as_costed S final_status →
     946%           trace_any_label S doesnt_end_with_ret start_status final_status
     947%   | tal_base_return:
     948%       ∀start_status: S.
     949%       ∀final_status: S.
     950%         as_execute S start_status final_status →
     951%         as_classifier S start_status cl_return →
     952%           trace_any_label S ends_with_ret start_status final_status
     953%   (* A call followed by a label on return. *)
     954%   | tal_base_call:
     955%       ∀status_pre_fun_call: S.
     956%       ∀status_start_fun_call: S.
     957%       ∀status_final: S.
     958%         as_execute S status_pre_fun_call status_start_fun_call →
     959%         ∀H:as_classifier S status_pre_fun_call cl_call.
     960%           as_after_return S «status_pre_fun_call, H» status_final →
     961%           trace_label_return S status_start_fun_call status_final →
     962%           as_costed S status_final →
     963%             trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
     964%   (* A call followed by a non-empty trace. *)
     965%   | tal_step_call:
     966%       ∀end_flag: trace_ends_with_ret.
     967%       ∀status_pre_fun_call: S.
     968%       ∀status_start_fun_call: S.
     969%       ∀status_after_fun_call: S.
     970%       ∀status_final: S.
     971%         as_execute S status_pre_fun_call status_start_fun_call →
     972%         ∀H:as_classifier S status_pre_fun_call cl_call.
     973%           as_after_return S «status_pre_fun_call, H» status_after_fun_call →
     974%           trace_label_return S status_start_fun_call status_after_fun_call →
     975%           ¬ as_costed S status_after_fun_call →
     976%           trace_any_label S end_flag status_after_fun_call status_final →
     977%             trace_any_label S end_flag status_pre_fun_call status_final
     978%   | tal_step_default:
     979%       ∀end_flag: trace_ends_with_ret.
     980%       ∀status_pre: S.
     981%       ∀status_init: S.
     982%       ∀status_end: S.
     983%         as_execute S status_pre status_init →
     984%         trace_any_label S end_flag status_init status_end →
     985%         as_classifier S status_pre cl_other →
     986%         ¬ (as_costed S status_init) →
     987%           trace_any_label S end_flag status_pre status_end.
     988% \end{verbatim}
     989% \end{comment}
     990% A \verb+trace_label_return+ is isomorphic to a list of
     991% \verb+trace_label_label+s that ends with a cost emission followed by a
     992% return terminated \verb+trace_label_label+.
     993% The interesting cases are those of $\verb+trace_any_label+~b~s_1~s_2$.
     994% A \verb+trace_any_label+ is a sequence of steps built by a syntax directed
     995% definition on the classification of $s_1$. The constructors of the datatype
     996% impose several invariants that are meant to impose a structure to the
     997% otherwise unstructured execution. In particular, the following invariants are
     998% imposed:
     999% \begin{enumerate}
     1000%  \item the trace is never empty; it ends with a return iff $b$ is
     1001%        true
     1002%  \item a jump must always be the last instruction of the trace, and it must
     1003%        be followed by a cost emission statement; i.e. the target of a jump
     1004%        is always the beginning of a new basic block; as such it must start
     1005%        with a cost emission statement
     1006%  \item a cost emission statement can never occur inside the trace, only in
     1007%        the status immediately after
     1008%  \item the trace for a function call step is made of a subtrace for the
     1009%        function body of type
     1010%        $\verb+trace_label_return+~s_1~s_2$, possibly followed by the
     1011%        rest of the trace for this basic block. The subtrace represents the
     1012%        function execution. Being an inductive datum, it grants totality of
     1013%        the function call. The status $s_2$ is the one that follows the return
     1014%        statement. The next instruction of $s_2$ must follow the function call
     1015%        instruction. As a consequence, function calls are also well nested.
     1016% \end{enumerate}
     1017%
     1018% There are three mutual structural recursive functions, one for each of
     1019% \verb+TLR+, \verb+TLL+ and \verb+TAL+, for which we use the same notation
     1020% $|\,.\,|$: the \emph{flattening} of the traces. These functions
     1021% allow to extract from a structured trace the list of emitted cost labels.
     1022% %  We only show here the type of one
     1023% % of them:
     1024% % \begin{alltt}
     1025% % flatten_trace_label_return:
     1026% %  \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
     1027% %   trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
     1028% % \end{alltt}
     1029%
     1030% \paragraph{Cost prediction on structured traces.}
     1031%
     1032% The first main theorem of CerCo about traces
     1033% (theorem \verb+compute_max_trace_label_return_cost_ok_with_trace+)
     1034% holds for the
     1035% instantiation
     1036% of the structured traces to the concrete status of object code programs.
     1037% Simplifying a bit, it states that
     1038% \begin{equation}\label{th1}
     1039% \begin{array}{l}\forall s_1,s_2. \forall \tau: \verb+TLR+~s_1~s_2.~
     1040%   \verb+clock+~s_2 = \verb+clock+~s_1 +
     1041%   \Sigma_{\alpha \in |\tau|}\;k(\alpha)
     1042% \end{array}
     1043% \end{equation}
     1044% where the cost model $k$ is statically computed from the object code
     1045% by associating to each label $\alpha$ the sum of the cost of the instructions
     1046% in the basic block that starts at $\alpha$ and ends before the next labelled
     1047% instruction. The theorem is proved by structural induction over the structured
     1048% trace, and is based on the invariant that
     1049% iff the function that computes the cost model has analysed the instruction
     1050% to be executed at $s_2$ after the one to be executed at $s_1$, and if
     1051% the structured trace starts with $s_1$, then eventually it will contain also
     1052% $s_2$. When $s_1$ is not a function call, the result holds trivially because
     1053% of the $s_1\exec s_2$ condition obtained by inversion on
     1054% the trace. The only non
     1055% trivial case is the one of function calls: the cost model computation function
     1056% does recursion on the first instruction that follows that function call; the
     1057% \verb+as_after_return+ condition of the \verb+tal_base_call+ and
     1058% \verb+tal_step_call+ grants exactly that the execution will eventually reach
     1059% this state.
     1060%
     1061% \paragraph{Structured traces similarity and cost prediction invariance.}
     1062%
     1063% A compiler pass maps source to object code and initial states to initial
     1064% states. The source code and initial state uniquely determine the structured
     1065% trace of a program, if it exists. The structured trace fails to exists iff
     1066% the structural conditions are violated by the program execution (e.g. a function
     1067% body does not start with a cost emission statement). Let us assume that the
     1068% target structured trace exists.
     1069%
     1070% What is the relation between the source and target structured traces?
     1071% In general, the two traces can be arbitrarily different. However, we are
     1072% interested only in those compiler passes that maps a trace $\tau_1$ to a trace
     1073% $\tau_2$ such that
     1074% \begin{equation}|\tau_1| = |\tau_2|.\label{th2}\end{equation}
     1075% The reason is that the combination of~\eqref{th1} with~\eqref{th2} yields the
     1076% corollary
     1077% \begin{equation}\label{th3}
      1078% \forall s_1,s_2. \forall \tau_1: \verb+TLR+~s_1~s_2.~
     1079%   \verb+clock+~s_2 - \verb+clock+~s_1 =
     1080%   \Sigma_{\alpha \in |\tau_1|}\;k(\alpha) =
     1081%   \Sigma_{\alpha \in |\tau_2|}\;k(\alpha).
     1082% \end{equation}
     1083% This corollary states that the actual execution time of the program can be computed equally well on the source or target language. Thus it becomes possible to
     1084% transfer the cost model from the target to the source code and reason on the
     1085% source code only.
     1086%
     1087% We are therefore interested in conditions stronger than~\eqref{th2}.
      1088% To this end we introduce a similarity relation between traces with
     1089% the same structure. Theorem~\verb+tlr_rel_to_traces_same_flatten+
     1090% in the Matita formalisation shows that~\eqref{th2} holds for every pair
     1091% $(\tau_1,\tau_2)$ of similar traces.
     1092%
     1093% Intuitively, two traces are similar when one can be obtained from
     1094% the other by erasing or inserting silent steps, i.e. states that are
     1095% not \verb+as_costed+ and that are classified as \verb+cl_other+.
     1096% Silent steps do not alter the structure of the traces.
     1097% In particular,
     1098% the relation maps function calls to function calls to the same function,
     1099% label emission statements to emissions of the same label, concatenation of
     1100% subtraces to concatenation of subtraces of the same length and starting with
     1101% the same emission statement, etc.
     1102%
     1103% In the formalisation the three similarity relations --- one for each trace
     1104% kind --- are defined by structural recursion on the first trace and pattern
     1105% matching over the second. Here we turn
     1106% the definition into the inference rules shown in \autoref{fig:txx_rel}
     1107% for the sake of readability. We also omit from trace constructors all arguments,
     1108% but those that are traces or that
     1109% are used in the premises of the rules. By abuse of notation we denote all three
     1110% relations by infixing $\approx$.
     1111%
     1112% \begin{figure}
     1113% \begin{multicols}{2}
     1114% \infrule
     1115%  {tll_1\approx tll_2
     1116%  }
     1117%  {\verb+tlr_base+~tll_1 \approx \verb+tlr_base+~tll_2}
     1118%
     1119% \infrule
     1120%  {tll_1 \approx tll_2 \andalso
     1121%   tlr_1 \approx tlr_2
     1122%  }
     1123%  {\verb+tlr_step+~tll_1~tlr_1 \approx \verb+tlr_step+~tll_2~tlr_2}
     1124% \end{multicols}
     1125% \vspace{3ex}
     1126% \begin{multicols}{2}
     1127% \infrule
     1128%  {\ell~s_1 = \ell~s_2 \andalso
     1129%   tal_1\approx tal_2
     1130%  }
     1131%  {\verb+tll_base+~s_1~tal_1 \approx \verb+tll_base+~s_2~tal_2}
     1132%
     1133% \infrule
     1134%  {tal_1\approx tal_2
     1135%  }
     1136%  {\verb+tal_step_default+~tal_1 \approx tal_2}
     1137% \end{multicols}
     1138% \vspace{3ex}
     1139% \infrule
     1140%  {}
     1141%  {\verb+tal_base_not_return+\approx taa \append \verb+tal_base_not_return+}
     1142% \vspace{1ex}
     1143% \infrule
     1144%  {}
     1145%  {\verb+tal_base_return+\approx taa \append \verb+tal_base_return+}
     1146% \vspace{1ex}
     1147% \infrule
     1148%  {tlr_1\approx tlr_2 \andalso
     1149%   s_1 \uparrow f \andalso s_2\uparrow f
     1150%  }
     1151%  {\verb+tal_base_call+~s_1~tlr_1\approx taa \append \verb+tal_base_call+~s_2~tlr_2}
     1152% \vspace{1ex}
     1153% \infrule
     1154%  {tlr_1\approx tlr_2 \andalso
     1155%   s_1 \uparrow f \andalso s_2\uparrow f \andalso
     1156%   \verb+tal_collapsable+~tal_2
     1157%  }
      1158%  {\verb+tal_base_call+~s_1~tlr_1 \approx taa \append \verb+tal_step_call+~s_2~tlr_2~tal_2}
     1159% \vspace{1ex}
     1160% \infrule
     1161%  {tlr_1\approx tlr_2 \andalso
     1162%   s_1 \uparrow f \andalso s_2\uparrow f \andalso
     1163%   \verb+tal_collapsable+~tal_1
     1164%  }
      1165%  {\verb+tal_step_call+~s_1~tlr_1~tal_1 \approx taa \append \verb+tal_base_call+~s_2~tlr_2}
     1166% \vspace{1ex}
     1167% \infrule
     1168%  {tlr_1 \approx tlr_2 \andalso
     1169%   s_1 \uparrow f \andalso s_2\uparrow f\andalso
      1170%   tal_1 \approx tal_2
     1171%  }
     1172%  {\verb+tal_step_call+~s_1~tlr_1~tal_1 \approx taa \append \verb+tal_step_call+~s_2~tlr_2~tal_2}
      1173% \caption{The inference rules for the relation $\approx$.}
     1174% \label{fig:txx_rel}
     1175% \end{figure}
     1176% %
     1177% \begin{comment}
     1178% \begin{verbatim}
     1179% let rec tlr_rel S1 st1 st1' S2 st2 st2'
     1180%   (tlr1 : trace_label_return S1 st1 st1')
     1181%   (tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
     1182% match tlr1 with
     1183%   [ tlr_base st1 st1' tll1 ⇒
     1184%     match tlr2 with
     1185%     [ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
     1186%     | _ ⇒ False
     1187%     ]
     1188%   | tlr_step st1 st1' st1'' tll1 tl1 ⇒
     1189%     match tlr2 with
     1190%     [ tlr_step st2 st2' st2'' tll2 tl2 ⇒
     1191%       tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
     1192%     | _ ⇒ False
     1193%     ]
     1194%   ]
     1195% and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
     1196%  (tll1 : trace_label_label S1 fl1 st1 st1')
     1197%  (tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
     1198%   match tll1 with
     1199%   [ tll_base fl1 st1 st1' tal1 H ⇒
     1200%     match tll2 with
     1201%     [ tll_base fl2 st2 st2 tal2 G ⇒
     1202%       as_label_safe … («?, H») = as_label_safe … («?, G») ∧
     1203%       tal_rel … tal1 tal2
     1204%     ]
     1205%   ]
     1206% and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
     1207%  (tal1 : trace_any_label S1 fl1 st1 st1')
     1208%  (tal2 : trace_any_label S2 fl2 st2 st2')
     1209%    on tal1 : Prop ≝
     1210%   match tal1 with
     1211%   [ tal_base_not_return st1 st1' _ _ _ ⇒
     1212%     fl2 = doesnt_end_with_ret ∧
     1213%     ∃st2mid,taa,H,G,K.
     1214%     tal2 ≃ taa_append_tal ? st2 ??? taa
     1215%       (tal_base_not_return ? st2mid st2' H G K)
     1216%   | tal_base_return st1 st1' _ _ ⇒
     1217%     fl2 = ends_with_ret ∧
     1218%     ∃st2mid,taa,H,G.
     1219%     tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
     1220%       (tal_base_return ? st2mid st2' H G)
     1221%   | tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
     1222%     fl2 = doesnt_end_with_ret ∧
     1223%     ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
     1224%     ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
     1225%     (* we must allow a tal_base_call to be similar to a call followed
     1226%       by a collapsable trace (trace_any_any followed by a base_not_return;
     1227%       we cannot use trace_any_any as it disallows labels in the end as soon
     1228%       as it is non-empty) *)
     1229%     (∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
     1230%       tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
     1231%     ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
     1232%     ∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
     1233%       tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
     1234%       tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
     1235%   | tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
     1236%     ∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
     1237%     ∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
     1238%     (fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
     1239%       tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
     1240%       tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
     1241%     ∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
     1242%     ∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
     1243%       tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
     1244%       tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
     1245%   | tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
     1246%     tal_rel … tl1 tal2 (* <- this makes it many to many *)
     1247%   ].
     1248% \end{verbatim}
     1249% \end{comment}
     1250% %
      1251% In the preceding rules, a $taa$ is an inhabitant of
      1252% $\verb+trace_any_any+~s_1~s_2$ (shorthand $\verb+TAA+~s_1~s_2$),
      1253% an inductive data type whose definition
      1254% is omitted from the paper for lack of space. It is the type of valid
     1255% prefixes (even empty ones) of \verb+TAL+'s that do not contain
     1256% any function call. Therefore it
     1257% is possible to concatenate (using ``$\append$'') a \verb+TAA+ to the
     1258% left of a \verb+TAL+. A \verb+TAA+ captures
     1259% a sequence of silent moves.
     1260% The \verb+tal_collapsable+ unary predicate over \verb+TAL+'s
      1261% holds when the argument does not contain any function call and ends
     1262% with a label (not a return). The intuition is that after a function call we
     1263% can still perform a sequence of silent actions while remaining similar.
     1264%
      1265% As should be expected, even though the rules are asymmetric, $\approx$ is in fact
     1266% an equivalence relation.
    11631267\section{Forward simulation}
    11641268\label{simulation}
     
    13071411%     \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
    13081412% \end{tikzpicture}
    1309 % \caption{The \texttt{cl\_jump} case.}
     1413% \caption{The \verb+cl_jump+ case.}
    13101414% \label{subfig:cl_jump}
    13111415% \end{subfigure}
     
    13231427    \draw [new] (s2) -- node [above] {$*$} (t2);
    13241428    }
    1325     \draw (s1) to node [rel] {$\S$} (s2);
    1326     \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
     1429    \draw (s1) to [bend right, anchor=mid] node [rel] {$\S$} (s2);
     1430    \draw [new] (t1) to [bend left, anchor=mid] node [rel] {$\S,\L$} (t2);
    13271431\end{tikzpicture}
    1328 \caption{The \texttt{cl\_oher} and \texttt{cl\_jump} cases.}
      1432\caption{The \verb+cl_other+ and \verb+cl_jump+ cases.}
    13291433\label{subfig:cl_other_jump}
    13301434\end{subfigure}
     
    13341438\begin{tikzpicture}[every join/.style={ar}, join all, thick,
    13351439                            every label/.style=overlay, node distance=10mm]
    1336     \matrix [diag, small vgap] (m) {%
    1337         \node (t1) {}; \\
     1440    \matrix [diag, small gap] (m) {%
     1441         &\node (t1) {}; \\
    13381442         \node (s1) [is call] {}; \\
    1339          & \node (l) {}; & \node (t2) {};\\
     1443         && \node (l) {}; & \node (t2) {};\\
    13401444         \node (s2) {}; & \node (c) [is call] {};\\   
    13411445    };
    13421446    {[-stealth]
    1343     \draw (s1) -- node [left] {$f$} (t1);
     1447    \draw (s1) -- node [above left] {$f$} (t1);
    13441448    \draw [new] (s2) -- node [above] {$*$} (c);
    1345     \draw [new] (c) -- node [right] {$f$} (l);
     1449    \draw [new] (c) -- node [below right] {$f$} (l);
    13461450    \draw [new] (l) -- node [above] {$*$} (t2);
    13471451    }
    1348     \draw (s1) to node [rel] {$\S$} (s2);
     1452    \draw (s1) to [bend right] node [rel] {$\S$} (s2);
    13491453    \draw [new] (t1) to [bend left] node [rel] {$\S$} (t2);
    13501454    \draw [new] (t1) to [bend left] node [rel] {$\L$} (l);
    1351     \draw [new] (t1) to node [rel] {$\C$} (c);
     1455    \draw [new] (t1) to [bend right] node [rel] {$\C$} (c);
    13521456    \end{tikzpicture}
    1353 \caption{The \texttt{cl\_call} case.}
     1457\caption{The \verb+cl_call+ case.}
    13541458\label{subfig:cl_call}
    13551459\end{subfigure}
     
    13591463\begin{tikzpicture}[every join/.style={ar}, join all, thick,
    13601464                            every label/.style=overlay, node distance=10mm]
    1361     \matrix [diag, small vgap] (m) {%
     1465    \matrix [diag, small gap] (m) {%
    13621466        \node (s1) [is ret] {}; \\
    1363         \node (t1) {}; \\
     1467        &\node (t1) {}; \\
    13641468        \node (s2) {}; & \node (c) [is ret] {};\\
    1365         & \node (r) {}; & \node (t2) {}; \\   
     1469        && \node (r) {}; & \node (t2) {}; \\   
    13661470    };
    13671471    {[-stealth]
    1368     \draw (s1) -- (t1);
     1472    \draw (s1) -- node [above right] {$RET$} (t1);
    13691473    \draw [new] (s2) -- node [above] {$*$} (c);
    1370     \draw [new] (c) -- (r);
     1474    \draw [new] (c) -- node [below left] {$RET$} (r);
    13711475    \draw [new] (r) -- node [above] {$*$} (t2);
    13721476    }
    1373     \draw (s1) to [bend right=45] node [rel] {$\S$} (s2);
    1374     \draw [new, overlay] (t1) to [bend left=90, looseness=1] node [rel] {$\S,\L$} (t2);
    1375     \draw [new, overlay] (t1) to [bend left=90, looseness=1.2] node [rel] {$\R$} (r);
     1477    \draw (s1) to [bend right] node [rel] {$\S$} (s2);
     1478    \draw [new, overlay] (t1) to [bend left=60] node [rel] {$\S,\L$} (t2);
     1479    \draw [new, overlay] (t1) to [bend left ] node [rel] {$\R$} (r);
    13761480\end{tikzpicture}
    1377 \caption{The \texttt{cl\_return} case.}
     1481\caption{The \verb+cl_return+ case.}
    13781482\label{subfig:cl_return}
    13791483\end{subfigure}
     
    13871491
    13881492\paragraph{1-to-many forward simulation conditions.}
    1389 \begin{condition}[Cases \texttt{cl\_other} and \texttt{cl\_jump}]
     1493\begin{condition}[Cases \verb+cl_other+ and \verb+cl_jump+]
     13901494 For all $s_1,s_1',s_2$ such that $s_1 \S s_2$, and
    1391  $s_1\exec s_1'$, and either $s_1 \class \texttt{cl\_other}$ or
    1392  both $s_1\class\texttt{cl\_other}\}$ and $\ell~s_1'$,
    1393  there exists an $s_2'$ and a $\texttt{trace\_any\_any\_free}~s_2~s_2'$ called $taaf$
     1495 $s_1\exec s_1'$, and either $s_1 \class \verb+cl_other+$ or
      1496 both $s_1\class\verb+cl_jump+$ and $\ell~s_1'$,
     1497 there exists an $s_2'$ and a $\verb+trace_any_any_free+~s_2~s_2'$ called $taaf$
    13941498 such that $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
    1395 $taaf$ is non empty, or one among $s_1$ and $s_1'$ is \texttt{as\_costed}.
      1499$taaf$ is non empty, or at least one of $s_1$ and $s_1'$ is \verb+as_costed+.
    13961500\end{condition}
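The condition can be summarised by the following formula; this is our schematic paraphrase of the statement above, reading the second disjunct as the \verb+cl_jump+ case as in \autoref{subfig:cl_other_jump}, with $\epsilon$ denoting the empty trace. It is not the literal Matita statement.

```latex
% Our schematic paraphrase of the cl_other/cl_jump condition (not the Matita statement)
\begin{multline*}
\forall s_1,s_1',s_2.\;
  s_1 \S s_2 \implies s_1 \exec s_1' \implies
  \big(s_1 \class \verb+cl_other+ \lor
       (s_1 \class \verb+cl_jump+ \land \ell~s_1')\big) \implies\\
  \exists s_2'.\,\exists taaf : \verb+TAAF+~s_2~s_2'.\;
  s_1' \mathrel{{\S} \cap {\L}} s_2' \land
  (taaf \neq \epsilon \lor \verb+as_costed+~s_1 \lor \verb+as_costed+~s_1')
\end{multline*}
```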
    13971501
    13981502In the above condition depicted in \autoref{subfig:cl_other_jump},
    1399 a $\texttt{trace\_any\_any\_free}~s_1~s_2$ (which from now on
     1503a $\verb+trace_any_any_free+~s_1~s_2$ (which from now on
    14001504will be shorthanded as \verb+TAAF+) is an
    14011505inductive type of structured traces that do not contain function calls or
     
    14091513to avoid collapsing two consecutive states that emit a label, missing one of the two emissions.
    14101514
    1411 \begin{condition}[Case \texttt{cl\_call}]
     1515\begin{condition}[Case \verb+cl_call+]
     14121516 For all $s_1,s_1',s_2$ s.t. $s_1 \S s_2$ and
    1413  $s_1\exec s_1'$ and $s_1 \class \texttt{cl\_call}$, there exists $s_a, s_b, s_2'$, a
      1517 $s_1\exec s_1'$ and $s_1 \class \verb+cl_call+$, there exist $s_a, s_b, s_2'$, a
    14141518$\verb+TAA+~s_2~s_a$, and a
    14151519$\verb+TAAF+~s_b~s_2'$ such that:
    1416 $s_a\class\texttt{cl\_call}$, the \texttt{as\_call\_ident}'s of
     1520$s_a\class\verb+cl_call+$, the \verb+as_call_ident+'s of
    14171521the two call states are the same, $s_1 \C s_a$,
    14181522$s_a\exec s_b$, $s_1' \L s_b$ and
     
    14271531as usual.
    14281532
    1429 \begin{condition}[Case \texttt{cl\_return}]
     1533\begin{condition}[Case \verb+cl_return+]
     14301534 For all $s_1,s_1',s_2$ s.t. $s_1 \S s_2$,
    1431  $s_1\exec s_1'$ and $s_1 \class \texttt{cl\_return}$, there exists $s_a, s_b, s_2'$, a
      1535 $s_1\exec s_1'$ and $s_1 \class \verb+cl_return+$, there exist $s_a, s_b, s_2'$, a
    14321536$\verb+TAA+~s_2~s_a$, a
    14331537$\verb+TAAF+~s_b~s_2'$ called $taaf$ such that:
    1434 $s_a\class\texttt{cl\_return}$,
     1538$s_a\class\verb+cl_return+$,
    14351539$s_a\exec s_b$,
    14361540$s_1' \R s_b$ and
     
    15421646$s_1$ and $s_1'$ states.
    15431647
    1544 In particular, the \texttt{status\_simulation\_produce\_tlr} theorem
    1545 applied to the \texttt{main} function of the program and equal
     1648In particular, the \verb+status_simulation_produce_tlr+ theorem
     1649applied to the \verb+main+ function of the program and equal
    15461650$s_{2_b}$ and $s_2$ states shows that, for every initial state in the
     15471651source code that induces a structured trace,
    15481652the compiled code produces a similar structured trace.
    15491653
    1550 \begin{theorem}[\texttt{status\_simulation\_produce\_tlr}]
     1654\begin{theorem}[\verb+status_simulation_produce_tlr+]
    15511655For every $s_1,s_1',s_{2_b},s_2$ s.t.
    1552 there is a $\texttt{TLR}~s_1~s_1'$ called $tlr_1$ and a
     1656there is a $\verb+TLR+~s_1~s_1'$ called $tlr_1$ and a
    15531657$\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
     15541658$s_1 \S s_2$, there exist $s_{2_m},s_2'$ s.t.
    1555 there is a $\texttt{TLR}~s_{2_b}~s_{2_m}$ called $tlr_2$ and
     1659there is a $\verb+TLR+~s_{2_b}~s_{2_m}$ called $tlr_2$ and
    15561660there is a $\verb+TAAF+~s_{2_m}~s_2'$ called $taaf$
    15571661s.t. if $taaf$ is non empty then $\lnot (\ell~s_{2_m})$,
     
    15611665\end{theorem}
    15621666
    1563 The theorem states that a \texttt{trace\_label\_return} in the source code
     1667The theorem states that a \verb+trace_label_return+ in the source code
    15641668together with a precomputed preamble of silent states
    15651669(the \verb+TAA+) in the target code induces a
    1566 similar \texttt{trace\_label\_return} in the target code which can be
     1670similar \verb+trace_label_return+ in the target code which can be
    15671671followed by a sequence of silent states. Note that the statement does not
    1568 require the produced \texttt{trace\_label\_return} to start with the
     1672require the produced \verb+trace_label_return+ to start with the
     15691673precomputed preamble, even though this is likely to be the case in concrete
    15701674implementations. The preamble in input is necessary for compositionality, e.g.
     
    15751679Clearly similar results are also available for the other two types of structured
    15761680traces (in fact, they are all proved simultaneously by mutual induction).
    1577 % \begin{theorem}[\texttt{status\_simulation\_produce\_tll}]
     1681% \begin{theorem}[\verb+status_simulation_produce_tll+]
    15781682% For every $s_1,s_1',s_{2_b},s_2$ s.t.
    1579 % there is a $\texttt{TLL}~b~s_1~s_1'$ called $tll_1$ and a
     1683% there is a $\verb+TLL+~b~s_1~s_1'$ called $tll_1$ and a
    15801684% $\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
    15811685% $s_1 \S s_2$, there exists $s_{2_m},s_2'$ s.t.
    15821686% \begin{itemize}
    15831687%  \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
    1584 %        and a trace $\texttt{TLL}~b~s_{2_b}~s_{2_m}$ called $tll_2$
    1585 %        and a $\texttt{TAAF}~s_{2_m}~s_2'$ called $taa_2$ s.t.
     1688%        and a trace $\verb+TLL+~b~s_{2_b}~s_{2_m}$ called $tll_2$
     1689%        and a $\verb+TAAF+~s_{2_m}~s_2'$ called $taa_2$ s.t.
    15861690%        $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
    15871691%        $s_1' \R s_{2_m}$ and
     
    15891693%        if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
    15901694%  \item else there exists $s_2'$ and a
    1591 %        $\texttt{TLL}~b~s_{2_b}~s_2'$ called $tll_2$ such that
     1695%        $\verb+TLL+~b~s_{2_b}~s_2'$ called $tll_2$ such that
    15921696%        $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
    15931697%        $tll_1\approx tll_2$.
     
    15961700%
    15971701% The statement is similar to the previous one: a source
    1598 % \texttt{trace\_label\_label} and a given target preamble of silent states
    1599 % in the target code induce a similar \texttt{trace\_label\_label} in the
     1702% \verb+trace_label_label+ and a given target preamble of silent states
     1703% in the target code induce a similar \verb+trace_label_label+ in the
    16001704% target code, possibly followed by a sequence of silent moves that become the
    1601 % preamble for the next \texttt{trace\_label\_label} translation.
    1602 %
    1603 % \begin{theorem}[\texttt{status\_simulation\_produce\_tal}]
     1705% preamble for the next \verb+trace_label_label+ translation.
     1706%
     1707% \begin{theorem}[\verb+status_simulation_produce_tal+]
    16041708% For every $s_1,s_1',s_2$ s.t.
    1605 % there is a $\texttt{TAL}~b~s_1~s_1'$ called $tal_1$ and
     1709% there is a $\verb+TAL+~b~s_1~s_1'$ called $tal_1$ and
    16061710% $s_1 \S s_2$
    16071711% \begin{itemize}
    16081712%  \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
    1609 %    and a trace $\texttt{TAL}~b~s_2~s_{2_m}$ called $tal_2$ and a
    1610 %    $\texttt{TAAF}~s_{2_m}~s_2'$ called $taa_2$ s.t.
     1713%    and a trace $\verb+TAL+~b~s_2~s_{2_m}$ called $tal_2$ and a
     1714%    $\verb+TAAF+~s_{2_m}~s_2'$ called $taa_2$ s.t.
    16111715%    $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
    16121716%    $s_1' \R s_{2_m}$ and
     
    16141718%    if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
    16151719%  \item else there exists $s_2'$ and a
    1616 %    $\texttt{TAL}~b~s_2~s_2'$ called $tal_2$ such that
     1720%    $\verb+TAL+~b~s_2~s_2'$ called $tal_2$ such that
    16171721%    either $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
    16181722%        $tal_1\approx tal_2$
    16191723%    or $s_1' \mathrel{{\S} \cap {\L}} s_2$ and
    1620 %    $\texttt{tal\_collapsable}~tal_1$ and $\lnot \ell~s_1$.
     1724%    $\verb+tal_collapsable+~tal_1$ and $\lnot \ell~s_1$.
    16211725% \end{itemize}
    16221726% \end{theorem}
     
    16281732\begin{corollary}
    16291733For every $s_1,s_1',s_2$ s.t.
    1630 there is a $\texttt{trace\_label\_return}~s_1~s_1'$ called $tlr_1$ and
     1734there is a $\verb+trace_label_return+~s_1~s_1'$ called $tlr_1$ and
     16311735$s_1 (\L \cap \S) s_2$,
     16321736there exist $s_{2_m},s_2'$ s.t.
    1633 there is a $\texttt{trace\_label\_return}~s_2~s_{2_m}$ called $tlr_2$ and
    1634 there is a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taaf$
    1635 s.t. if $taaf$ is non empty then $\lnot (\texttt{as\_costed}~s_{2_m})$,
    1636 and $\texttt{tlr\_rel}~tlr_1~tlr_2$
     1737there is a $\verb+trace_label_return+~s_2~s_{2_m}$ called $tlr_2$ and
     1738there is a $\verb+trace_any_any_free+~s_{2_m}~s_2'$ called $taaf$
     1739s.t. if $taaf$ is non empty then $\lnot (\verb+as_costed+~s_{2_m})$,
     1740and $\verb+tlr_rel+~tlr_1~tlr_2$
    16371741and $s_1' (\S \cap \L) s_2'$ and
    16381742$s_1' \R s_{2_m}$.
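The corollary is obtained from \verb+status_simulation_produce_tlr+ by instantiating the preamble with the empty \verb+TAA+, so that $s_{2_b} = s_2$. Schematically (our sketch of the instantiation, in the notation of the theorem above):

```latex
% Our sketch: the theorem with the empty preamble yields the corollary
\begin{multline*}
\verb+TLR+~s_1~s_1' \land s_1 \mathrel{{\L} \cap {\S}} s_2 \land
\verb+TAA+~s_2~s_2 = \epsilon \implies\\
\exists s_{2_m},s_2'.\;
\verb+TLR+~s_2~s_{2_m} \land \verb+TAAF+~s_{2_m}~s_2' \land
tlr_1 \approx tlr_2 \land
s_1' \mathrel{{\S} \cap {\L}} s_2' \land s_1' \R s_{2_m}
\end{multline*}
```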
     
    17741878% to state and prove the theorems by using the Russell methodology of
    17751879% Matthieu Sozeau\footnote{Subset Coercions in Coq in TYPES'06. Matthieu Sozeau. Thorsten Altenkirch and Conor McBride (Eds). Volume 4502 of Lecture Notes in Computer Science. Springer, 2007, pp.237-252.
    1776 % }, better known in the Coq world as ``\texttt{Program}'' and reimplemented in a simpler way in Matita using coercion propagations\footnote{Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, Enrico Tassi: A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions. Logical Methods in Computer Science 8(1) (2012)}. However, no result presented depends
     1880% }, better known in the Coq world as ``\verb+Program+'' and reimplemented in a simpler way in Matita using coercion propagations\footnote{Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, Enrico Tassi: A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions. Logical Methods in Computer Science 8(1) (2012)}. However, no result presented depends
    17771881% mandatorily on dependent types: it should be easy to adapt the technique
    17781882% and results presented in the paper to HOL.