\documentclass{llncs}
\usepackage{hyperref}
\usepackage{graphicx}
\usepackage{color}
\usepackage{listings}
\usepackage{bcprules}%\bcprulessavespace
\usepackage{verbatim}
\usepackage{alltt}
\usepackage{subcaption}
\usepackage{listings}
\usepackage{amssymb}
% \usepackage{amsmath}
\usepackage{multicol}
\providecommand{\eqref}[1]{(\ref{#1})}
% NB: might be worth removing this if changing class in favour of
% a built-in definition.
%\newtheorem{theorem}{Theorem}
\newtheorem{condition}{Condition}
\lstdefinelanguage{coq}
{keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
morekeywords={[2]if,then,else,forall,Prop,match,with,end,let},
}
\lstdefinelanguage[mine]{C}[ANSI]{C}{
morekeywords={int8_t}
}
\lstset{language=[mine]C,basicstyle=\footnotesize\tt,columns=flexible,breaklines=false,
keywordstyle=\color{red}\bfseries,
keywordstyle=[2]\color{blue},
commentstyle=\color{green},
stringstyle=\color{blue},
showspaces=false,showstringspaces=false,
xleftmargin=1em}
\usepackage{tikz}
\usetikzlibrary{positioning,calc,patterns,chains,shapes.geometric,scopes}
\makeatletter
\pgfutil@ifundefined{pgf@arrow@code@implies}{% supply for lack of double arrow special arrow tip if it is not there%
\pgfarrowsdeclare{implies}{implies}%
{%
\pgfarrowsleftextend{2.2pt}%
\pgfarrowsrightextend{2.2pt}%
}%
{%
\pgfsetdash{}{0pt} % do not dash%
\pgfsetlinewidth{.33pt}%
\pgfsetroundjoin % fix join%
\pgfsetroundcap % fix cap%
\pgfpathmoveto{\pgfpoint{-1.5pt}{2.5pt}}%
\pgfpathcurveto{\pgfpoint{-.75pt}{1.5pt}}{\pgfpoint{0.5pt}{.5pt}}{\pgfpoint{2pt}{0pt}}%
\pgfpathcurveto{\pgfpoint{0.5pt}{-.5pt}}{\pgfpoint{-.75pt}{-1.5pt}}{\pgfpoint{-1.5pt}{-2.5pt}}%
\pgfusepathqstroke%
}%
}{}%
\makeatother
\tikzset{state/.style={inner sep = 0, outer sep = 2pt, draw, fill},
every node/.style={inner sep=2pt},
every on chain/.style = {inner sep = 0, outer sep = 2pt},
join all/.style = {every on chain/.append style={join}},
on/.style={on chain={#1}, state},
m/.style={execute at begin node=$, execute at end node=$},
node distance=3mm,
is other/.style={circle, minimum size = 3pt, state},
other/.style={on, is other},
is jump/.style={diamond, minimum size = 6pt, state},
jump/.style={on, is jump},
is call/.style={regular polygon, regular polygon sides=3, minimum size=5pt, state},
call/.style={on=going below, is call, node distance=6mm, label=above left:$#1$},
is ret/.style={regular polygon, regular polygon sides=3, minimum size=5pt, shape border rotate=180, state},
ret/.style={on=going above, is ret, node distance=6mm},
chain/.style={start chain=#1 going left},
rev ar/.style={stealth-, thick},
ar/.style={-stealth, thick},
every join/.style={rev ar},
labelled/.style={fill=white, label=above:$#1$},
vcenter/.style={baseline={([yshift=-.5ex]current bounding box)}},
every picture/.style={thick},
double equal sign distance/.prefix style={double distance=1.5pt}, %% if already defined (newest version of pgf) it should be ignored%}
implies/.style={double, -implies, thin, double equal sign distance, shorten <=5pt, shorten >=5pt},
new/.style={densely dashed},
rel/.style={font=\scriptsize, fill=white, inner sep=2pt},
diag/.style={row sep={11mm,between origins},
column sep={11mm,between origins},
every node/.style={draw, is other}},
small vgap/.style={row sep={7mm,between origins}},
% for article, maybe temporary
is jump/.style=is other,
is call/.style=is other,
is ret/.style=is other,
}
\def\L{\mathrel{\mathcal L}}
\def\S{\mathrel{\mathcal S}}
\def\R{\mathrel{\mathcal R}}
\def\C{\mathrel{\mathcal C}}
\newsavebox{\execbox}
\savebox{\execbox}{\tikz[baseline=-.5ex]\draw [-stealth] (0,0) -- ++(1em, 0);}
\newcommand{\exec}{\ensuremath{\mathrel{\usebox{\execbox}}}}
\let\ar\rightsquigarrow
\renewcommand{\verb}{\lstinline[mathescape]}
\let\class\triangleright
\let\andalso\quad
\newcommand{\append}{\mathbin{@}}
\newcommand{\iffdef}{\mathrel{:\Leftrightarrow}}
\begin{document}
\pagestyle{plain}
\title{Certification of the Preservation of Structure by a Compiler's Back-end Pass\thanks{The project CerCo acknowledges the financial support of the Future and
Emerging Technologies (FET) programme within the Seventh Framework
Programme for Research of the European Commission, under FET-Open grant
number: 243881}}
\author{Paolo Tranquilli \and Claudio Sacerdoti Coen}
\institute{Department of Computer Science and Engineering, University of Bologna,\\\email{Paolo.Tranquilli@unibo.it}, \email{Claudio.SacerdotiCoen@unibo.it}}
\maketitle
\begin{abstract}
The labelling approach is a technique to lift cost models for non-functional
properties of programs from the object code to the source code. It is based
on the preservation of the structure of the high level program in every
intermediate language used by the compiler. Such structure is captured by
observables that are added to the semantics and that needs to be preserved
by the forward simulation proof of correctness of the compiler. Additional
special observables are required for function calls. In this paper we
present a generic forward simulation proof that preserves all these observables.
The proof statement is based on a new mechanised semantics that traces the
structure of execution when the language is unstructured. The generic semantics
and simulation proof have been mechanised in the interactive theorem prover
Matita.
\end{abstract}
\section{Introduction}
The \emph{labelling approach} has been introduced in~\cite{easylabelling} as
a technique to \emph{lift} cost models for non-functional properties of programs
from the object code to the source code. Examples of non-functional properties
are execution time, amount of stack/heap space consumed and energy required for
communication. The basic idea of the approach is that it is impossible to
provide a \emph{uniform} cost model for an high level language that is preserved
\emph{precisely} by a compiler. For instance, two instances of an assignment
$x = y$ in the source code can be compiled very differently according to the
place (registers vs stack) where $x$ and $y$ are stored at the moment of
execution. Therefore a precise cost model must assign a different cost
to every occurrence, and the exact cost can only be known after compilation.
According to the labelling approach, the compiler is free to compile and optimise
the source code without any major restriction, but it must keep trace
of what happens to basic blocks during the compilation. The cost model is
then computed on the object code. It assigns a cost to every basic block.
Finally, the compiler propagates back the cost model to the source level,
assigning a cost to each basic block of the source code.
Implementing the labelling approach in a certified compiler
allows to reason formally on the high level source code of a program to prove
non-functional properties that are granted to be preserved by the compiler
itself. The trusted code base is then reduced to 1) the interactive theorem
prover (or its kernel) used in the certification of the compiler and
2) the software used to certify the property on the source language, that
can be itself certified further reducing the trusted code base.
In~\cite{easylabelling} the authors provide an example of a simple
certified compiler that implements the labelling approach for the
imperative \texttt{While} language~\cite{while}, that does not have
pointers and function calls.
The labelling approach has been shown to scale to more interesting scenarios.
In particular in~\cite{functionallabelling} it has been applied to a functional
language and in~\cite{loopoptimizations} it has been shown that the approach
can be slightly complicated to handle loop optimisations and, more generally,
program optimisations that do not preserve the structure of basic blocks.
On-going work also shows that the labelling approach is also compatible with
the complex analyses required to obtain a cost model for object code
on processors that implement advanced features like pipelining, superscalar
architectures and caches.
In the European Project CerCo (Certified Complexity~\footnote{\url{http://cerco.cs.unibo.it}})~\cite{cerco} we are certifying a labelling approach based compiler for a large subset of C to
8051 object code. The compiler is
moderately optimising and implements a compilation chain that is largely
inspired to that of CompCert~\cite{compcert1,compcert2}. Compared to work done in~\cite{easylabelling}, the main novelty and source of difficulties is due to the presence
of function calls. Surprisingly, the addition of function calls require a
revisitation of the proof technique given in~\cite{easylabelling}. In
particular, at the core of the labelling approach there is a forward
simulation proof that, in the case of \texttt{While}, is only minimally
more complex than the proof required for the preservation of the
functional properties only. In the case of a programming language with
function calls, instead, it turns out that the forward simulation proof for
the back-end languages must grant a whole new set of invariants.
In this paper we present a formalisation in the Matita interactive theorem
prover~\cite{matita1,matita2} of a generic version of the simulation proof required for unstructured
languages. All back-end languages of the CerCo compiler are unstructured
languages, so the proof covers half of the correctness of the compiler.
The statement of the generic proof is based on a new semantics
for imperative unstructured languages that is based on \emph{structured
traces} and that restores the preservation of structure in the observables of
the semantics. The generic proof allows to almost completely split the
part of the simulation that deals with functional properties only from the
part that deals with the preservation of structure.
The plan of this paper is the following. In Section~\ref{labelling} we
sketch the labelling method and the problems derived from the application
to languages with function calls. In Section~\ref{semantics} we introduce
a generic description of an unstructured imperative language and the
corresponding structured traces (the novel semantics). In
Section~\ref{simulation} we describe the forward simulation proof.
Conclusions and future works are in Section~\ref{conclusions}
\section{The labelling approach}
\label{labelling}
We briefly sketch here a simplified version of the labelling approach as
introduced in~\cite{easylabelling}. The simplification strengthens the
sufficient conditions given in~\cite{easylabelling} to allow a simpler
explanation. The simplified conditions given here are also used in the
CerCo compiler to simplify the proof.
Let $\mathcal{P}$ be a programming language whose semantics is given in
terms of observables: a run of a program yields a finite or infinite
stream of observables. We also assume for the time being that function
calls are not available in $\mathcal{P}$. We want to associate a cost
model to a program $P$ written in $\mathcal{P}$. The first step is to
extend the syntax of $\mathcal{P}$ with a new construct $\texttt{emit L}$
where $L$ is a label distinct from all observables of $\mathcal{P}$.
The semantics of $\texttt{emit L}$ is the emission of the observable
\texttt{L} that is meant to signal the beginning of a basic block.
There exists an automatic procedure that injects into the program $P$ an
$\texttt{emit L}$ at the beginning of each basic block, using a fresh
\texttt{L} for each block. In particular, the bodies of loops, both branches
of \texttt{if-then-else}s and the targets of \texttt{goto}s must all start
with an emission statement.
Let now $C$ be a compiler from $\mathcal{P}$ to the object code $\mathcal{M}$,
that is organised in passes. Let $\mathcal{Q}_i$ be the $i$-th intermediate
language used by the compiler. We can easily extend every
intermediate language (and its semantics) with an $\texttt{emit L}$ statement
as we did for $\mathcal{P}$. The same is possible for $\mathcal{M}$ too, with
the additional difficulty that the syntax of object code is given as a
sequence of bytes. The injection of an emission statement in the object code
can be done using a map that maps two consecutive code addresses with the
statement. The intended semantics is that, if $(pc_1,pc_2) \mapsto \texttt{emit L}$ then the observable \texttt{L} is emitted after the execution of the
instruction stored at $pc_1$ and before the execution of the instruction
stored at $pc_2$. The two program counters are necessary because the
instruction stored at $pc_1$ can have multiple possible successors (e.g.
in case of a conditional branch or an indirect call). Dually, the instruction
stored at $pc_2$ can have multiple possible predecessors (e.g. if it is the
target of a jump).
The compiler, to be functionally correct, must preserve the observational
equivalence, i.e. executing the program after each compiler pass should
yield the same stream of observables. After the injection of emission
statements, observables now capture both functional and non-functional
behaviours.
This correctness property is called in the literature a forward simulation
and is sufficient for correctness when the target language is
deterministic~\cite{compcert3}.
We also require a stronger, non-functional preservation property: after each
pass all basic blocks must start with an emission statement, and all labels
\texttt{L} must be unique.
Now let $M$ be the object code obtained for the program $P$. Let us suppose
that we can statically inspect the code $M$ and associate to each basic block
a cost (e.g. the number of clock cycles required to execute all instructions
in the basic block, or an upper bound to that time). Every basic block is
labelled with an unique label \texttt{L}, thus we can actually associate the
cost to \texttt{L}. Let call it $k(\texttt{L})$.
The function $k$ is defined as the cost model for the object code control
blocks. It can be equally used as well as the cost model for the source
control blocks. Indeed, if the semantics of $P$ is the stream
$L_1 L_2 \ldots$, then, because of forward simulation, the semantics of $M$ is
also $L_1 L_2 \ldots$ and its actual execution cost is $\Sigma_i k(L_i)$ because
every instruction belongs to a control block and every control block is
labelled. Thus it is correct to say that the execution cost of $P$ is also
$\Sigma_i k(L_i)$. In other words, we have obtained a cost model $k$ for
the blocks of the high level program $P$ that is preserved by compilation.
How can the user profit from the high level cost model? Suppose, for instance,
that he wants to prove that the WCET of his program is bounded by $c$. It
is sufficient for him to prove that $\Sigma_i k(L_i) \leq c$, which is now
a purely functional property of the code. He can therefore use any technique
available to certify functional properties of the source code.
What is suggested in~\cite{easylabelling} is to actually instrument the
source code $P$ by replacing every label emission statement
$\texttt{emit L}$ with the instruction $\texttt{cost += k(L)}$ that increments
a global fresh variable \texttt{cost}. The bound is now proved by establishing
the program invariant $\texttt{cost} \leq c$, which can be done for example
using the Frama-C~\cite{framaC} suite if the source code is some variant of
C.
\subsection{Labelling function calls}
We now want to extend the labelling approach to support function calls.
On the high level, \emph{structured} programming language $\mathcal{P}$ there
is not much to change.
When a function is invoked, the current basic block is temporarily exited
and the basic block the function starts with take control. When the function
returns, the execution of the original basic block is resumed. Thus the only
significant change is that basic blocks can now be nested. Let \texttt{E}
be the label of the external block and \texttt{I} the label of a nested one.
Since the external starts before the internal, the semantics observed will be
\texttt{E I} and the cost associated to it on the source language will be
$k(\texttt{E}) + k(\texttt{I})$, i.e. the cost of executing all instructions
in the block \texttt{E} first plus the cost of executing all the instructions in
the block \texttt{I}. However, we know that some instructions in \texttt{E} are
executed after the last instruction in \texttt{I}. This is actually irrelevant
because we are here assuming that costs are additive, so that we can freely
permute them\footnote{The additivity assumption fails on modern processors that have stateful subsystems, like caches and pipelines. The extension of the labelling approach to those systems is therefore non trivial and under development in the CerCo project.}. Note that, in the present discussion, we are assuming that
the function call terminates and yields back control to the basic block
\texttt{E}. If the call diverges, the instrumentation
$\texttt{cost += k(E)}$ executed at the beginning of \texttt{E} is still valid,
but just as an upper bound to the real execution cost: only precision is lost.
Let now consider what happens when we move down the compilation chain to an
unstructured intermediate or final language. Here unstructured means that
the only control operators are conditional and unconditional jumps, function
calls and returns. Unlike a structured language, though, there is no guarantee
that a function will return control just after the function call point.
The semantics of the return statement, indeed, consists in fetching the
return address from some internal structure (typically the control stack) and
jumping directly to it. The code can freely manipulate the control stack to
make the procedure returns to whatever position. Indeed, it is also possible
to break the well nesting of function calls/returns.
Is it the case that the code produced by a correct compiler must respect the
additional property that every function returns just after its function call
point? The answer is negative and the property is not implied by forward
simulation proofs. For instance, imagine to modify a correct compiler pass
by systematically adding one to the return address on the stack and by
putting a \texttt{NOP} (or any other instruction that takes one byte) after
every function call. The obtained code will be functionally indistinguishable,
and the added instructions will all be dead code.
This lack of structure in the semantics badly interferes with the labelling
approach. The reason is the following: when a basic block labelled with
\texttt{E} contains a function call, it no longer makes any sense to associate
to a label \texttt{E} the sum of the costs of all the instructions in the block.
Indeed, there is no guarantee that the function will return into the block and
that the instructions that will be executed after the return will be the ones
we are paying for in the cost model.
How can we make the labelling approach work in this scenario? We only see two
possible ways. The first one consists in injecting an emission statement after
every function call: basic blocks no longer contain function calls, but are now
terminated by them. This completely solves the problem and allows the compiler
to break the structure of function calls/returns at will. However, the
technique has several drawbacks. First of all, it greatly augments the number
of cost labels that are injected in the source code and that become
instrumentation statements. Thus, when reasoning on the source code to prove
non-functional properties, the user (or the automation tool) will have to handle
larger expressions. Second, the more labels are emitted, the more difficult it
becomes to implement powerful optimisations respecting the code structure.
Indeed, function calls are usually implemented in such a way that most registers
are preserved by the call, so that the static analysis of the block is not
interrupted by the call and an optimisation can involve both the code before
and after the function call. Third, instrumenting the source code may require
unpleasant modification of it. Take, for example, the code
\texttt{f(g(x));}. We need to inject an emission statement/instrumentation
instruction just after the execution of \texttt{g}. The only way to do that
is to rewrite the code as \texttt{y = g(x); emit L; f(y);} for some fresh
variable \texttt{y}. It is pretty clear how in certain situations the obtained
code would be more obfuscated and then more difficult to manually reason on.
For the previous reasons, in this paper and in the CerCo project we adopt a
different approach. We do not inject emission statements after every
function call. However, we want to propagate a strong additional invariant in
the forward simulation proof. The invariant is the propagation of the structure
of the original high level code, even if the target language is unstructured.
The structure we want to propagate, that will become more clear in the next
section, comprises 1) the property that every function should return just after
the function call point, which in turns imply well nesting of function calls;
2) the property that every basic block starts with a code emission statement.
In the original labelling approach of~\cite{easylabelling}, the second property
was granted syntactically as a property of the generated code.
In our revised approach, instead, we will impose the property on the runs:
it will be possible to generate code that does not respect the syntactic
property, as soon as all possible runs respect it. For instance, dead code will no longer
be required to have all basic blocks correctly labelled. The switch is suggested
from the fact that the first of the two properties --- that related to
function calls/returns --- can only be defined as property of runs,
not of the static code. The switch is
beneficial to the proof because the original proof was made of two parts:
the forward simulation proof and the proof that the static property was granted.
In our revised approach the latter disappears and only the forward simulation
is kept.
In order to capture the structure semantics so that it is preserved
by a forward simulation argument, we need to make the structure observable
in the semantics. This is the topic of the next section.
\section{Structured traces}
\label{semantics}
The program semantics adopted in the traditional labelling approach is based
on labelled deductive systems. Given a set of observables $\mathcal{O}$ and
a set of states $\S$, the semantics of one deterministic execution
step is
defined as a function $S \to S \times O^*$ where $O^*$ is a (finite) stream of
observables. The semantics is then lifted compositionally to multiple (finite
or infinite) execution steps.
Finally, the semantics of a a whole program execution is obtained by forgetting
about the final state (if any), yielding a function $S \to O^*$ that given an
initial status returns the finite or infinite stream of observables in output.
We present here a new definition of semantics where the structure of execution,
as defined in the previous section, is now observable. The idea is to replace
the stream of observables with a structured data type that makes explicit
function call and returns and that grants some additional invariants by
construction. The data structure, called \emph{structured traces}, is
defined inductively for terminating programs and coinductively for diverging
ones. In the paper we focus only on the inductive structure, i.e. we assume
that all programs that are given a semantics are total. The Matita formalisation
also shows the coinductive definitions. The semantics of a program is then
defined as a function that maps an initial state into a structured trace.
In order to have a definition that works on multiple intermediate languages,
we abstract the type of structure traces over an abstract data type of
abstract statuses, which we aptly call $\texttt{abstract\_status}$. The fields
of this record are the following.
\begin{itemize}
\item \verb+S : Type[0]+, the type of states.
\item \verb+as_execute : S $\to$ S $\to$ Prop+, a binary predicate stating
an execution step. We write $s_1\exec s_2$ for $\verb+as_execute+~s_1~s_2$.
\item \verb+as_classifier : S $\to$ classification+, a function tagging all
states with a class in
$\{\texttt{cl\_return,cl\_jump,cl\_call,cl\_other}\}$, depending on the instruction
that is about to be executed (we omit tail-calls for simplicity). We will
use $s \class c$ as a shorthand for both $\texttt{as\_classifier}~s=c$
(if $c$ is a classification) and $\texttt{as\_classifier}~s\in c$
(if $c$ is a set of classifications).
\item \verb+as_label : S $\to$ option label+, telling whether the
next instruction to be executed in $s$ is a cost emission statement,
and if yes returning the associated cost label. Our shorthand for this function
will be $\ell$, and we will also abuse the notation by using $\ell~s$ as a
predicate stating that $s$ is labelled.
\item \verb+as_call_ident : ($\Sigma$s:S. s $\class$ cl_call) $\to$ label+,
telling the identifier of the function which is being called in a
\verb+cl_call+ state. We will use the shorthand $s\uparrow f$ for
$\verb+as_call_ident+~s = f$.
\item \verb+as_after_return : ($\Sigma$s:S. s $\class$ cl_call) $\to$ S $\to$ Prop+,
which holds on the \verb+cl_call+ state $s_1$ and a state $s_2$ when the
instruction to be executed in $s_2$ follows the function call to be
executed in (the witness of the $\Sigma$-type) $s_1$. We will use the notation
$s_1\ar s_2$ for this relation.
\end{itemize}
% \begin{alltt}
% record abstract_status := \{ S: Type[0];
% as_execute: S \(\to\) S \(\to\) Prop; as_classifier: S \(\to\) classification;
% as_label: S \(\to\) option label; as_called: (\(\Sigma\)s:S. c s = cl_call) \(\to\) label;
% as_after_return: (\(\Sigma\)s:S. c s = cl_call) \(\to\) S \(\to\) Prop \}
% \end{alltt}
The inductive type for structured traces is actually made by three multiple
inductive types with the following semantics:
\begin{enumerate}
\item $(\texttt{trace\_label\_return}~s_1~s_2)$ (shorthand $\verb+TLR+~s_1~s_2$)
is a trace that begins in
the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
such that the instruction to be executed in $s_1$ is a label emission
statement and the one to be executed in the state before $s_2$ is a return
statement. Thus $s_2$ is the state after the return. The trace
may contain other label emission statements. It captures the structure of
the execution of function bodies: they must start with a cost emission
statement and must end with a return; they are obtained by concatenating
one or more basic blocks, all starting with a label emission
(e.g. in case of loops).
\item $(\texttt{trace\_any\_label}~b~s_1~s_2)$ (shorthand $\verb+TAL+~b~s_1~s_2$)
is a trace that begins in
the state $s_1$ (included) and ends just before the state $s_2$ (excluded)
such that the instruction to be executed in $s_2$/in the state before
$s_2$ is either a label emission statement or
or a return, according to the boolean $b$. It must not contain
any label emission statement. It captures the notion of a suffix of a
basic block.
\item $(\texttt{trace\_label\_label}~b~s_1~s_2)$ (shorthand $\verb+TLL+~b~s_1~s_2$ is the special case of
$\verb+TAL+~b~s_1~s_2)$ such that the instruction to be
executed in $s_1$ is a label emission statement. It captures the notion of
a basic block.
\end{enumerate}
\begin{multicols}{3}
\infrule[\verb+tlr_base+]
{\texttt{TLL}~true~s_1~s_2}
{\texttt{TLR}~s_1~s_2}
\infrule[\verb+tlr_step+]
{\texttt{TLL}~false~s_1~s_2 \andalso
\texttt{TLR}~s_2~s_3
}
{\texttt{TLR}~s_1~s_3}
\infrule[\verb+tll_base+]
{\texttt{TAL}~b~s_1~s_2 \andalso
\ell~s_1
}
{\texttt{TLL}~b~s_1~s_2}
\end{multicols}
\infrule[\verb+tal_base_not_return+]
{s_1\exec s_2 \andalso
s_1\class\{\verb+cl_jump+, \verb+cl_other+\}\andalso
\ell~s_2
}
{\texttt{TAL}~false~s_1~s_2}
\infrule[\verb+tal_base_return+]
{s_1\exec s_2 \andalso
s_1 \class \texttt{cl\_return}
}
{\texttt{TAL}~true~s_1~s_2}
\infrule[\verb+tal_base_call+]
{s_1\exec s_2 \andalso
s_1 \class \texttt{cl\_call} \andalso
s_1\ar s_3 \andalso
\texttt{TLR}~s_2~s_3 \andalso
\ell~s_3
}
{\texttt{TAL}~false~s_1~s_3}
\infrule[\verb+tal_step_call+]
{s_1\exec s_2 \andalso
s_1 \class \texttt{cl\_call} \andalso
s_1\ar s_3 \andalso
\texttt{TLR}~s_2~s_3 \andalso
\texttt{TAL}~b~s_3~s_4
}
{\texttt{TAL}~b~s_1~s_4}
\infrule[\verb+tal_step_default+]
{s_1\exec s_2 \andalso
\lnot \ell~s_2 \andalso
\texttt{TAL}~b~s_2~s_3\andalso
s_1 \class \texttt{cl\_other}
}
{\texttt{TAL}~b~s_1~s_3}
\begin{comment}
\begin{verbatim}
inductive trace_label_return (S:abstract_status) : S → S → Type[0] ≝
| tlr_base:
∀status_before: S.
∀status_after: S.
trace_label_label S ends_with_ret status_before status_after →
trace_label_return S status_before status_after
| tlr_step:
∀status_initial: S.
∀status_labelled: S.
∀status_final: S.
trace_label_label S doesnt_end_with_ret status_initial status_labelled →
trace_label_return S status_labelled status_final →
trace_label_return S status_initial status_final
with trace_label_label: trace_ends_with_ret → S → S → Type[0] ≝
| tll_base:
∀ends_flag: trace_ends_with_ret.
∀start_status: S.
∀end_status: S.
trace_any_label S ends_flag start_status end_status →
as_costed S start_status →
trace_label_label S ends_flag start_status end_status
with trace_any_label: trace_ends_with_ret → S → S → Type[0] ≝
(* Single steps within a function which reach a label.
Note that this is the only case applicable for a jump. *)
| tal_base_not_return:
∀start_status: S.
∀final_status: S.
as_execute S start_status final_status →
(as_classifier S start_status cl_jump ∨
as_classifier S start_status cl_other) →
as_costed S final_status →
trace_any_label S doesnt_end_with_ret start_status final_status
| tal_base_return:
∀start_status: S.
∀final_status: S.
as_execute S start_status final_status →
as_classifier S start_status cl_return →
trace_any_label S ends_with_ret start_status final_status
(* A call followed by a label on return. *)
| tal_base_call:
∀status_pre_fun_call: S.
∀status_start_fun_call: S.
∀status_final: S.
as_execute S status_pre_fun_call status_start_fun_call →
∀H:as_classifier S status_pre_fun_call cl_call.
as_after_return S «status_pre_fun_call, H» status_final →
trace_label_return S status_start_fun_call status_final →
as_costed S status_final →
trace_any_label S doesnt_end_with_ret status_pre_fun_call status_final
(* A call followed by a non-empty trace. *)
| tal_step_call:
∀end_flag: trace_ends_with_ret.
∀status_pre_fun_call: S.
∀status_start_fun_call: S.
∀status_after_fun_call: S.
∀status_final: S.
as_execute S status_pre_fun_call status_start_fun_call →
∀H:as_classifier S status_pre_fun_call cl_call.
as_after_return S «status_pre_fun_call, H» status_after_fun_call →
trace_label_return S status_start_fun_call status_after_fun_call →
¬ as_costed S status_after_fun_call →
trace_any_label S end_flag status_after_fun_call status_final →
trace_any_label S end_flag status_pre_fun_call status_final
| tal_step_default:
∀end_flag: trace_ends_with_ret.
∀status_pre: S.
∀status_init: S.
∀status_end: S.
as_execute S status_pre status_init →
trace_any_label S end_flag status_init status_end →
as_classifier S status_pre cl_other →
¬ (as_costed S status_init) →
trace_any_label S end_flag status_pre status_end.
\end{verbatim}
\end{comment}
A \texttt{trace\_label\_return} is isomorphic to a list of
\texttt{trace\_label\_label}s that ends with a cost emission followed by a
return terminated \texttt{trace\_label\_label}.
The interesting cases are those of $\texttt{trace\_any\_label}~b~s_1~s_2$.
A \texttt{trace\_any\_label} is a sequence of steps built by a syntax directed
definition on the classification of $s_1$. The constructors of the datatype
impose several invariants that are meant to impose a structure to the
otherwise unstructured execution. In particular, the following invariants are
imposed:
\begin{enumerate}
\item the trace is never empty; it ends with a return iff $b$ is
true
\item a jump must always be the last instruction of the trace, and it must
be followed by a cost emission statement; i.e. the target of a jump
is always the beginning of a new basic block; as such it must start
with a cost emission statement
\item a cost emission statement can never occur inside the trace, only in
the status immediately after
\item the trace for a function call step is made of a subtrace for the
function body of type
$\texttt{trace\_label\_return}~s_1~s_2$, possibly followed by the
rest of the trace for this basic block. The subtrace represents the
function execution. Being an inductive datum, it grants totality of
the function call. The status $s_2$ is the one that follows the return
statement. The next instruction of $s_2$ must follow the function call
instruction. As a consequence, function calls are also well nested.
\end{enumerate}
There are three mutual structural recursive functions, one for each of
\verb+TLR+, \verb+TLL+ and \verb+TAL+, for which we use the same notation
$|\,.\,|$: the \emph{flattening} of the traces. These functions
allow to extract from a structured trace the list of emitted cost labels.
% We only show here the type of one
% of them:
% \begin{alltt}
% flatten_trace_label_return:
% \(\forall\)S: abstract_status. \(\forall\)\(s_1,s_2\).
% trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
% \end{alltt}
\paragraph{Cost prediction on structured traces.}
The first main theorem of CerCo about traces
(theorem \texttt{compute\_max\_trace\_label\_return\_cost\_ok\_with\_trace})
holds for the
instantiation
of the structured traces to the concrete status of object code programs.
Simplifying a bit, it states that
\begin{equation}\label{th1}
\begin{array}{l}\forall s_1,s_2. \forall \tau: \texttt{TLR}~s_1~s_2.~
\texttt{clock}~s_2 = \texttt{clock}~s_1 +
\Sigma_{\alpha \in |\tau|}\;k(\alpha)
\end{array}
\end{equation}
where the cost model $k$ is statically computed from the object code
by associating to each label $\alpha$ the sum of the cost of the instructions
in the basic block that starts at $\alpha$ and ends before the next labelled
instruction. The theorem is proved by structural induction over the structured
trace, and is based on the invariant that
iff the function that computes the cost model has analysed the instruction
to be executed at $s_2$ after the one to be executed at $s_1$, and if
the structured trace starts with $s_1$, then eventually it will contain also
$s_2$. When $s_1$ is not a function call, the result holds trivially because
of the $s_1\exec s_2$ condition obtained by inversion on
the trace. The only non
trivial case is the one of function calls: the cost model computation function
does recursion on the first instruction that follows that function call; the
\texttt{as\_after\_return} condition of the \texttt{tal\_base\_call} and
\texttt{tal\_step\_call} grants exactly that the execution will eventually reach
this state.
\paragraph{Structured traces similarity and cost prediction invariance.}
A compiler pass maps source to object code and initial states to initial
states. The source code and initial state uniquely determine the structured
trace of a program, if it exists. The structured trace fails to exists iff
the structural conditions are violated by the program execution (e.g. a function
body does not start with a cost emission statement). Let us assume that the
target structured trace exists.
What is the relation between the source and target structured traces?
In general, the two traces can be arbitrarily different. However, we are
interested only in those compiler passes that maps a trace $\tau_1$ to a trace
$\tau_2$ such that
\begin{equation}|\tau_1| = |\tau_2|.\label{th2}\end{equation}
The reason is that the combination of~\eqref{th1} with~\eqref{th2} yields the
corollary
\begin{equation}\label{th3}
\forall s_1,s_2. \forall \tau: \texttt{TLR}~s_1~s_2.~
\texttt{clock}~s_2 - \texttt{clock}~s_1 =
\Sigma_{\alpha \in |\tau_1|}\;k(\alpha) =
\Sigma_{\alpha \in |\tau_2|}\;k(\alpha).
\end{equation}
This corollary states that the actual execution time of the program can be computed equally well on the source or target language. Thus it becomes possible to
transfer the cost model from the target to the source code and reason on the
source code only.
We are therefore interested in conditions stronger than~\eqref{th2}.
Therefore we introduce here a similarity relation between traces with
the same structure. Theorem~\texttt{tlr\_rel\_to\_traces\_same\_flatten}
in the Matita formalisation shows that~\eqref{th2} holds for every pair
$(\tau_1,\tau_2)$ of similar traces.
Intuitively, two traces are similar when one can be obtained from
the other by erasing or inserting silent steps, i.e. states that are
not \texttt{as\_costed} and that are classified as \texttt{cl\_other}.
Silent steps do not alter the structure of the traces.
In particular,
the relation maps function calls to function calls to the same function,
label emission statements to emissions of the same label, concatenation of
subtraces to concatenation of subtraces of the same length and starting with
the same emission statement, etc.
In the formalisation the three similarity relations --- one for each trace
kind --- are defined by structural recursion on the first trace and pattern
matching over the second. Here we turn
the definition into the inference rules shown in \autoref{fig:txx_rel}
for the sake of readability. We also omit from trace constructors all arguments,
but those that are traces or that
are used in the premises of the rules. By abuse of notation we denote all three
relations by infixing $\approx$.
\begin{figure}
\begin{multicols}{2}
\infrule
{tll_1\approx tll_2
}
{\texttt{tlr\_base}~tll_1 \approx \texttt{tlr\_base}~tll_2}
\infrule
{tll_1 \approx tll_2 \andalso
tlr_1 \approx tlr_2
}
{\texttt{tlr\_step}~tll_1~tlr_1 \approx \texttt{tlr\_step}~tll_2~tlr_2}
\end{multicols}
\vspace{3ex}
\begin{multicols}{2}
\infrule
{\ell~s_1 = \ell~s_2 \andalso
tal_1\approx tal_2
}
{\texttt{tll\_base}~s_1~tal_1 \approx \texttt{tll\_base}~s_2~tal_2}
\infrule
{tal_1\approx tal_2
}
{\texttt{tal\_step\_default}~tal_1 \approx tal_2}
\end{multicols}
\vspace{3ex}
\infrule
{}
{\texttt{tal\_base\_not\_return}\approx taa \append \texttt{tal\_base\_not\_return}}
\vspace{1ex}
\infrule
{}
{\texttt{tal\_base\_return}\approx taa \append \texttt{tal\_base\_return}}
\vspace{1ex}
\infrule
{tlr_1\approx tlr_2 \andalso
s_1 \uparrow f \andalso s_2\uparrow f
}
{\texttt{tal\_base\_call}~s_1~tlr_1\approx taa \append \texttt{tal\_base\_call}~s_2~tlr_2}
\vspace{1ex}
\infrule
{tlr_1\approx tlr_2 \andalso
s_1 \uparrow f \andalso s_2\uparrow f \andalso
\texttt{tal\_collapsable}~tal_2
}
{\texttt{tal\_base\_call}~s_1~tlr_1 \approx taa \append \texttt{tal\_step\_call}~s_2~tlr_2~tal_2)}
\vspace{1ex}
\infrule
{tlr_1\approx tlr_2 \andalso
s_1 \uparrow f \andalso s_2\uparrow f \andalso
\texttt{tal\_collapsable}~tal_1
}
{\texttt{tal\_step\_call}~s_1~tlr_1~tal_1 \approx taa \append \texttt{tal\_base\_call}~s_2~tlr_2)}
\vspace{1ex}
\infrule
{tlr_1 \approx tlr_2 \andalso
s_1 \uparrow f \andalso s_2\uparrow f\andalso
tal_1 \approx tal_2 \andalso
}
{\texttt{tal\_step\_call}~s_1~tlr_1~tal_1 \approx taa \append \texttt{tal\_step\_call}~s_2~tlr_2~tal_2}
\caption{The inference rule for the relation $\approx$.}
\label{fig:txx_rel}
\end{figure}
%
\begin{comment}
\begin{verbatim}
let rec tlr_rel S1 st1 st1' S2 st2 st2'
(tlr1 : trace_label_return S1 st1 st1')
(tlr2 : trace_label_return S2 st2 st2') on tlr1 : Prop ≝
match tlr1 with
[ tlr_base st1 st1' tll1 ⇒
match tlr2 with
[ tlr_base st2 st2' tll2 ⇒ tll_rel … tll1 tll2
| _ ⇒ False
]
| tlr_step st1 st1' st1'' tll1 tl1 ⇒
match tlr2 with
[ tlr_step st2 st2' st2'' tll2 tl2 ⇒
tll_rel … tll1 tll2 ∧ tlr_rel … tl1 tl2
| _ ⇒ False
]
]
and tll_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
(tll1 : trace_label_label S1 fl1 st1 st1')
(tll2 : trace_label_label S2 fl2 st2 st2') on tll1 : Prop ≝
match tll1 with
[ tll_base fl1 st1 st1' tal1 H ⇒
match tll2 with
[ tll_base fl2 st2 st2 tal2 G ⇒
as_label_safe … («?, H») = as_label_safe … («?, G») ∧
tal_rel … tal1 tal2
]
]
and tal_rel S1 fl1 st1 st1' S2 fl2 st2 st2'
(tal1 : trace_any_label S1 fl1 st1 st1')
(tal2 : trace_any_label S2 fl2 st2 st2')
on tal1 : Prop ≝
match tal1 with
[ tal_base_not_return st1 st1' _ _ _ ⇒
fl2 = doesnt_end_with_ret ∧
∃st2mid,taa,H,G,K.
tal2 ≃ taa_append_tal ? st2 ??? taa
(tal_base_not_return ? st2mid st2' H G K)
| tal_base_return st1 st1' _ _ ⇒
fl2 = ends_with_ret ∧
∃st2mid,taa,H,G.
tal2 ≃ taa_append_tal ? st2 ? st2mid st2' taa
(tal_base_return ? st2mid st2' H G)
| tal_base_call st1 st1' st1'' _ prf _ tlr1 _ ⇒
fl2 = doesnt_end_with_ret ∧
∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
(* we must allow a tal_base_call to be similar to a call followed
by a collapsable trace (trace_any_any followed by a base_not_return;
we cannot use trace_any_any as it disallows labels in the end as soon
as it is non-empty) *)
(∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
tal2 ≃ taa @ (tal_base_call … H G K tlr2 L) ∧ tlr_rel … tlr1 tlr2) ∨
∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
∃tl2 : trace_any_label … doesnt_end_with_ret st2mid'' st2'.
tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
tlr_rel … tlr1 tlr2 ∧ tal_collapsable … tl2
| tal_step_call fl1 st1 st1' st1'' st1''' _ prf _ tlr1 _ tl1 ⇒
∃st2mid,G.as_call_ident S2 («st2mid, G») = as_call_ident ? «st1, prf» ∧
∃taa : trace_any_any ? st2 st2mid.∃st2mid',H.
(fl2 = doesnt_end_with_ret ∧ ∃K.∃tlr2 : trace_label_return ? st2mid' st2'.∃L.
tal2 ≃ taa @ tal_base_call … H G K tlr2 L ∧
tal_collapsable … tl1 ∧ tlr_rel … tlr1 tlr2) ∨
∃st2mid'',K.∃tlr2 : trace_label_return ? st2mid' st2mid''.∃L.
∃tl2 : trace_any_label ? fl2 st2mid'' st2'.
tal2 ≃ taa @ (tal_step_call … H G K tlr2 L tl2) ∧
tal_rel … tl1 tl2 ∧ tlr_rel … tlr1 tlr2
| tal_step_default fl1 st1 st1' st1'' _ tl1 _ _ ⇒
tal_rel … tl1 tal2 (* <- this makes it many to many *)
].
\end{verbatim}
\end{comment}
%
In the preceding rules, a $taa$ is an inhabitant of the
$\texttt{trace\_any\_any}~s_1~s_2$ (shorthand $\texttt{TAA}~s_1~s_2$),
an inductive data type whose definition
is not in the paper for lack of space. It is the type of valid
prefixes (even empty ones) of \texttt{TAL}'s that do not contain
any function call. Therefore it
is possible to concatenate (using ``$\append$'') a \texttt{TAA} to the
left of a \texttt{TAL}. A \texttt{TAA} captures
a sequence of silent moves.
The \texttt{tal\_collapsable} unary predicate over \texttt{TAL}'s
holds when the argument does not contain any function call and it ends
with a label (not a return). The intuition is that after a function call we
can still perform a sequence of silent actions while remaining similar.
As should be expected, even though the rules are asymmetric $\approx$ is in fact
an equivalence relation.
\section{Forward simulation}
\label{simulation}
We summarise here the results of the previous sections. Each intermediate
unstructured language can be given a semantics based on structured traces,
that single out those runs that respect a certain number of invariants.
A cost model can be computed on the object code and it can be used to predict
the execution costs of runs that produce structured traces. The cost model
can be lifted from the target to the source code of a pass if the pass maps
structured traces to similar structured traces. The latter property is called
a \emph{forward simulation}.
As for labelled transition systems, in order to establish the forward
simulation we are interested in (preservation of observables), we are
forced to prove a stronger notion of forward simulation that introduces
an explicit relation between states. The classical notion of a 1-to-many
forward simulation is the existence of a relation $\S$ over states such that
if $s_1 \S s_2$ and $s_1 \to^1 s_1'$ then there exists an $s_2'$ such that
$s_2 \to^* s_2'$ and $s_1' \S s_2'$. In our context, we need to replace the
one and multi step transition relations $\to^n$ with the existence of
a structured trace between the two states, and we need to add the request that
the two structured traces are similar. Thus what we would like to state is
something like:\\
for all $s_1,s_2,s_1'$ such that there is a $\tau_1$ from
$s_1$ to $s_1'$ and $s_1 \S s_2$ there exists an $s_2'$ such that
$s_1' \S s_2'$ and a $\tau_2$ from $s_2$ to $s_2'$ such that
$\tau_1$ is similar to $\tau_2$. We call this particular form of forward
simulation \emph{trace reconstruction}.
The statement just introduced, however, is too simplistic and not provable
in the general case. To understand why, consider the case of a function call
and the pass that fixes the parameter passing conventions. A function
call in the source code takes in input an arbitrary number of pseudo-registers (the actual parameters to pass) and returns an arbitrary number of pseudo-registers (where the result is stored). A function call in the target language has no
input nor output parameters. The pass must add explicit code before and after
the function call to move the pseudo-registers content from/to the hardware
registers or the stack in order to implement the parameter passing strategy.
Similarly, each function body must be augmented with a preamble and a postamble
to complete/initiate the parameter passing strategy for the call/return phase.
Therefore what used to be a call followed by the next instruction to execute
after the function return, now becomes a sequence of instructions, followed by
a call, followed by another sequence. The two states at the beginning of the
first sequence and at the end of the second sequence are in relation with
the status before/after the call in the source code, like in an usual forward
simulation. How can we prove however the additional condition for function calls
that asks that when the function returns the instruction immediately after the
function call is called? To grant this invariant, there must be another relation
between the address of the function call in the source and in the target code.
This additional relation is to be used in particular to relate the two stacks.
Another example is given by preservation of code emission statements. A single
code emission instruction can be simulated by a sequence of steps, followed
by a code emission, followed by another sequence. Clearly the initial and final
statuses of the sequence are to be in relation with the status before/after the
code emission in the source code. In order to preserve the structured traces
invariants, however, we must consider a second relation between states that
traces the preservation of the code emission statement.
Therefore we now introduce an abstract notion of relation set between abstract
statuses and an abstract notion of 1-to-many forward simulation conditions.
These two definitions enjoy the following remarkable properties:
\begin{enumerate}
\item they are generic enough to accommodate all passes of the CerCo compiler
\item the conjunction of the 1-to-many forward simulation conditions are
just slightly stricter than the statement of a 1-to-many forward
simulation in the classical case. In particular, they only require
the construction of very simple forms of structured traces made of
silent states only.
\item they allow to prove our main result of the paper: the 1-to-many
forward simulation conditions are sufficient to prove the trace
reconstruction theorem
\end{enumerate}
Point 3. is the important one. First of all it means that we have reduced
the complex problem of trace reconstruction to a much simpler one that,
moreover, can be solved with slight adaptations of the forward simulation proof
that is performed for a compiler that only cares about functional properties.
Therefore we have successfully splitted as much as possible the proof of
preservation of functional properties from that of non-functional ones.
Secondly, combined with the results in the previous section, it implies
that the cost model can be computed on the object code and lifted to the
source code to reason on non-functional properties, assuming that
the 1-to-many forward simulation conditions are fulfilled for every
compiler pass.
\paragraph{Relation sets.}
We introduce now the four relations $\mathcal{S,C,L,R}$ between abstract
statuses that are used to correlate the corresponding statues before and
after a compiler pass. The first two are abstract and must be instantiated
by every pass. The remaining two are derived relations.
The $\S$ relation between states is the classical relation used
in forward simulation proofs. It correlates the data of the status
(e.g. registers, memory, etc.).
The $\C$ relation correlates call states. It allows to track the
position in the target code of every call in the source code.
The $\L$ relation simply says that the two states are both label
emitting states that emit the same label, \emph{i.e.}\ $s_1\L s_2\iffdef \ell~s_1=\ell~s_2$.
It allows to track the position in
the target code of every cost emitting statement in the source code.
Finally the $\R$ relation is the more complex one. Two states
$s_1$ and $s_2$ are $\R$ correlated if every time $s_1$ is the
successors of a call state that is $\C$-related to a call state
$s_2'$ in the target code, then $s_2$ is the successor of $s_2'$. Formally:
$$s_1\R s_2 \iffdef \forall s_1',s_2'.s_1'\C s_2' \to s_1'\ar s_1 \to s_2' \ar s_2.$$
We will require all pairs of states that follow a related call to be
$\R$-related. This is the fundamental requirement granting
that the target trace is well structured, \emph{i.e.}\ that calls are well
nested and returning where they are supposed to.
% \begin{alltt}
% record status_rel (S1,S2 : abstract_status) : Type[1] := \{
% \(\S\): S1 \(\to\) S2 \(\to\) Prop;
% \(\C\): (\(\Sigma\)s.as_classifier S1 s cl_call) \(\to\)
% (\(\Sigma\)s.as_classifier S2 s cl_call) \(\to\) Prop \}.
%
% definition \(\L\) S1 S2 st1 st2 := as_label S1 st1 = as_label S2 st2.
%
% definition \(\R\) S1 S2 (R: status_rel S1 S2) s1_ret s2_ret ≝
% \(\forall\)s1_pre,s2_pre.
% as_after_return s1_pre s1_ret \(\to\) s1_pre \(\R\) s2_pre \(\to\)
% as_after_return s2_pre s2_ret.
% \end{alltt}
\begin{figure}
\centering
\begin{tabular}{@{}c@{}c@{}c@{}}
% \begin{subfigure}{.475\linewidth}
% \centering
% \begin{tikzpicture}[every join/.style={ar}, join all, thick,
% every label/.style=overlay, node distance=10mm]
% \matrix [diag] (m) {%
% \node (s1) [is jump] {}; & \node [fill=white] (t1) {};\\
% \node (s2) {}; & \node (t2) {}; \\
% };
% \node [above=0 of t1, overlay] {$\alpha$};
% {[-stealth]
% \draw (s1) -- (t1);
% \draw [new] (s2) -- node [above] {$*$} (t2);
% }
% \draw (s1) to node [rel] {$\S$} (s2);
% \draw [new] (t1) to node [rel] {$\S,\L$} (t2);
% \end{tikzpicture}
% \caption{The \texttt{cl\_jump} case.}
% \label{subfig:cl_jump}
% \end{subfigure}
% &
\begin{subfigure}{.25\linewidth}
\centering
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
every label/.style=overlay, node distance=10mm]
\matrix [diag] (m) {%
\node (s1) {}; & \node (t1) {};\\
\node (s2) {}; & \node (t2) {}; \\
};
{[-stealth]
\draw (s1) -- (t1);
\draw [new] (s2) -- node [above] {$*$} (t2);
}
\draw (s1) to node [rel] {$\S$} (s2);
\draw [new] (t1) to node [rel] {$\S,\L$} (t2);
\end{tikzpicture}
\caption{The \texttt{cl\_oher} and \texttt{cl\_jump} cases.}
\label{subfig:cl_other_jump}
\end{subfigure}
&
\begin{subfigure}{.375\linewidth}
\centering
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
every label/.style=overlay, node distance=10mm]
\matrix [diag, small vgap] (m) {%
\node (t1) {}; \\
\node (s1) [is call] {}; \\
& \node (l) {}; & \node (t2) {};\\
\node (s2) {}; & \node (c) [is call] {};\\
};
{[-stealth]
\draw (s1) -- node [left] {$f$} (t1);
\draw [new] (s2) -- node [above] {$*$} (c);
\draw [new] (c) -- node [right] {$f$} (l);
\draw [new] (l) -- node [above] {$*$} (t2);
}
\draw (s1) to node [rel] {$\S$} (s2);
\draw [new] (t1) to [bend left] node [rel] {$\S$} (t2);
\draw [new] (t1) to [bend left] node [rel] {$\L$} (l);
\draw [new] (t1) to node [rel] {$\C$} (c);
\end{tikzpicture}
\caption{The \texttt{cl\_call} case.}
\label{subfig:cl_call}
\end{subfigure}
&
\begin{subfigure}{.375\linewidth}
\centering
\begin{tikzpicture}[every join/.style={ar}, join all, thick,
every label/.style=overlay, node distance=10mm]
\matrix [diag, small vgap] (m) {%
\node (s1) [is ret] {}; \\
\node (t1) {}; \\
\node (s2) {}; & \node (c) [is ret] {};\\
& \node (r) {}; & \node (t2) {}; \\
};
{[-stealth]
\draw (s1) -- (t1);
\draw [new] (s2) -- node [above] {$*$} (c);
\draw [new] (c) -- (r);
\draw [new] (r) -- node [above] {$*$} (t2);
}
\draw (s1) to [bend right=45] node [rel] {$\S$} (s2);
\draw [new, overlay] (t1) to [bend left=90, looseness=1] node [rel] {$\S,\L$} (t2);
\draw [new, overlay] (t1) to [bend left=90, looseness=1.2] node [rel] {$\R$} (r);
\end{tikzpicture}
\caption{The \texttt{cl\_return} case.}
\label{subfig:cl_return}
\end{subfigure}
\end{tabular}
\caption{Mnemonic diagrams depicting the hypotheses for the preservation of structured traces.
Dashed lines
and arrows indicates how the diagrams must be closed when solid relations
are present.}
\label{fig:forwardsim}
\end{figure}
\paragraph{1-to-many forward simulation conditions.}
\begin{condition}[Cases \texttt{cl\_other} and \texttt{cl\_jump}]
For all $s_1,s_1',s_2$ such that $s_1 \S s_1'$, and
$s_1\exec s_1'$, and either $s_1 \class \texttt{cl\_other}$ or
both $s_1\class\texttt{cl\_other}\}$ and $\ell~s_1'$,
there exists an $s_2'$ and a $\texttt{trace\_any\_any\_free}~s_2~s_2'$ called $taaf$
such that $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
$taaf$ is non empty, or one among $s_1$ and $s_1'$ is \texttt{as\_costed}.
\end{condition}
In the above condition depicted in \autoref{subfig:cl_other_jump},
a $\texttt{trace\_any\_any\_free}~s_1~s_2$ (which from now on
will be shorthanded as \verb+TAAF+) is an
inductive type of structured traces that do not contain function calls or
cost emission statements. Differently from a \verb+TAA+, the
instruction to be executed in the lookahead state $s_2$ may be a cost emission
statement.
The intuition of the condition is that one step can be replaced with zero or more steps if it
preserves the relation between the data and if the two final statuses are
labelled in the same way. Moreover, we must take special care of the empty case
to avoid collapsing two consecutive states that emit a label, missing one of the two emissions.
\begin{condition}[Case \texttt{cl\_call}]
For all $s_1,s_1',s_2$ s.t. $s_1 \S s_1'$ and
$s_1\exec s_1'$ and $s_1 \class \texttt{cl\_call}$, there exists $s_a, s_b, s_2'$, a
$\verb+TAA+~s_2~s_a$, and a
$\verb+TAAF+~s_b~s_2'$ such that:
$s_a\class\texttt{cl\_call}$, the \texttt{as\_call\_ident}'s of
the two call states are the same, $s_1 \C s_a$,
$s_a\exec s_b$, $s_1' \L s_b$ and
$s_1' \S s_2'$.
\end{condition}
The condition, depicted in \autoref{subfig:cl_call} says that, to simulate a function call, we can perform a
sequence of silent actions before and after the function call itself.
The old and new call states must be $\C$-related, the old and new
states at the beginning of the function execution must be $\L$-related
and, finally, the two initial and final states must be $\S$-related
as usual.
\begin{condition}[Case \texttt{cl\_return}]
For all $s_1,s_1',s_2$ s.t. $s_1 \S s_1'$,
$s_1\exec s_1'$ and $s_1 \class \texttt{cl\_return}$, there exists $s_a, s_b, s_2'$, a
$\verb+TAA+~s_2~s_a$, a
$\verb+TAAF+~s_b~s_2'$ called $taaf$ such that:
$s_a\class\texttt{cl\_return}$,
$s_a\exec s_b$,
$s_1' \R s_b$ and
$s_1' \mathrel{{\S} \cap {\L}} s_2'$ and either
$taaf$ is non empty, or $\lnot \ell~s_a$.
\end{condition}
Similarly to the call condition, to simulate a return we can perform a
sequence of silent actions before and after the return statement itself,
as depicted in \autoref{subfig:cl_return}.
The old and the new statements after the return must be $\R$-related,
to grant that they returned to corresponding calls.
The two initial and final states must be $\S$-related
as usual and, moreover, they must exhibit the same labels. Finally, when
the suffix is non empty we must take care of not inserting a new
unmatched cost emission statement just after the return statement.
\begin{comment}
\begin{verbatim}
definition status_simulation ≝
λS1 : abstract_status.
λS2 : abstract_status.
λsim_status_rel : status_rel S1 S2.
∀st1,st1',st2.as_execute S1 st1 st1' →
sim_status_rel st1 st2 →
match as_classify … st1 with
[ None ⇒ True
| Some cl ⇒
match cl with
[ cl_call ⇒ ∀prf.
(*
st1' ------------S----------\
↑ \ \
st1 \--L--\ \
| \ \ \
S \-C-\ st2_after_call →taa→ st2'
| \ ↑
st2 →taa→ st2_pre_call
*)
∃st2_pre_call.
as_call_ident ? st2_pre_call = as_call_ident ? («st1, prf») ∧
call_rel ?? sim_status_rel «st1, prf» st2_pre_call ∧
∃st2_after_call,st2'.
∃taa2 : trace_any_any … st2 st2_pre_call.
∃taa2' : trace_any_any … st2_after_call st2'.
as_execute … st2_pre_call st2_after_call ∧
sim_status_rel st1' st2' ∧
label_rel … st1' st2_after_call
| cl_return ⇒
(*
st1
/ ↓
| st1'----------S,L------------\
S \ \
\ \-----R-------\ |
\ | |
st2 →taa→ st2_ret | |
↓ / |
st2_after_ret →taaf→ st2'
we also ask that st2_after_ret be not labelled if the taaf tail is
not empty
*)
∃st2_ret,st2_after_ret,st2'.
∃taa2 : trace_any_any … st2 st2_ret.
∃taa2' : trace_any_any_free … st2_after_ret st2'.
(if taaf_non_empty … taa2' then ¬as_costed … st2_after_ret else True) ∧
as_classifier … st2_ret cl_return ∧
as_execute … st2_ret st2_after_ret ∧ sim_status_rel st1' st2' ∧
ret_rel … sim_status_rel st1' st2_after_ret ∧
label_rel … st1' st2'
| cl_other ⇒
(*
st1 → st1'
| \
S S,L
| \
st2 →taaf→ st2'
the taaf can be empty (e.g. tunneling) but we ask it must not be the
case when both st1 and st1' are labelled (we would be able to collapse
labels otherwise)
*)
∃st2'.
∃taa2 : trace_any_any_free … st2 st2'.
(if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
sim_status_rel st1' st2' ∧
label_rel … st1' st2'
| cl_jump ⇒
(* just like cl_other, but with a hypothesis more *)
as_costed … st1' →
∃st2'.
∃taa2 : trace_any_any_free … st2 st2'.
(if taaf_non_empty … taa2 then True else (¬as_costed … st1 ∨ ¬as_costed … st1')) ∧
sim_status_rel st1' st2' ∧
label_rel … st1' st2'
]
].
\end{verbatim}
\end{comment}
\paragraph{Main result: the 1-to-many forward simulation conditions
are sufficient to trace reconstruction}
Let us assume that a relation set is given such that the 1-to-many
forward simulation conditions are satisfied. Under this assumption we
can prove the following three trace reconstruction theorems by mutual
structural induction over the traces given in input between the
$s_1$ and $s_1'$ states.
In particular, the \texttt{status\_simulation\_produce\_tlr} theorem
applied to the \texttt{main} function of the program and equal
$s_{2_b}$ and $s_2$ states shows that, for every initial state in the
source code that induces a structured trace in the source code,
the compiled code produces a similar structured trace.
\begin{theorem}[\texttt{status\_simulation\_produce\_tlr}]
For every $s_1,s_1',s_{2_b},s_2$ s.t.
there is a $\texttt{TLR}~s_1~s_1'$ called $tlr_1$ and a
$\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
$s_1 \S s_2$, there exists $s_{2_m},s_2'$ s.t.
there is a $\texttt{TLR}~s_{2_b}~s_{2_m}$ called $tlr_2$ and
there is a $\verb+TAAF+~s_{2_m}~s_2'$ called $taaf$
s.t. if $taaf$ is non empty then $\lnot (\ell~s_{2_m})$,
and $tlr_1\approx tlr_2$
and $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
$s_1' \R s_{2_m}$.
\end{theorem}
The theorem states that a \texttt{trace\_label\_return} in the source code
together with a precomputed preamble of silent states
(the \verb+TAA+) in the target code induces a
similar \texttt{trace\_label\_return} in the target code which can be
followed by a sequence of silent states. Note that the statement does not
require the produced \texttt{trace\_label\_return} to start with the
precomputed preamble, even if this is likely to be the case in concrete
implementations. The preamble in input is necessary for compositionality, e.g.
because the 1-to-many forward simulation conditions allow in the
case of function calls to execute a preamble of silent instructions just after
the function call.
Clearly similar results are also available for the other two types of structured
traces (in fact, they are all proved simultaneously by mutual induction).
% \begin{theorem}[\texttt{status\_simulation\_produce\_tll}]
% For every $s_1,s_1',s_{2_b},s_2$ s.t.
% there is a $\texttt{TLL}~b~s_1~s_1'$ called $tll_1$ and a
% $\verb+TAA+~s_{2_b}~s_2$ and $s_1 \L s_{2_b}$ and
% $s_1 \S s_2$, there exists $s_{2_m},s_2'$ s.t.
% \begin{itemize}
% \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
% and a trace $\texttt{TLL}~b~s_{2_b}~s_{2_m}$ called $tll_2$
% and a $\texttt{TAAF}~s_{2_m}~s_2'$ called $taa_2$ s.t.
% $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
% $s_1' \R s_{2_m}$ and
% $tll_1\approx tll_2$ and
% if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
% \item else there exists $s_2'$ and a
% $\texttt{TLL}~b~s_{2_b}~s_2'$ called $tll_2$ such that
% $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
% $tll_1\approx tll_2$.
% \end{itemize}
% \end{theorem}
%
% The statement is similar to the previous one: a source
% \texttt{trace\_label\_label} and a given target preamble of silent states
% in the target code induce a similar \texttt{trace\_label\_label} in the
% target code, possibly followed by a sequence of silent moves that become the
% preamble for the next \texttt{trace\_label\_label} translation.
%
% \begin{theorem}[\texttt{status\_simulation\_produce\_tal}]
% For every $s_1,s_1',s_2$ s.t.
% there is a $\texttt{TAL}~b~s_1~s_1'$ called $tal_1$ and
% $s_1 \S s_2$
% \begin{itemize}
% \item if $b$ (the trace ends with a return) then there exists $s_{2_m},s_2'$
% and a trace $\texttt{TAL}~b~s_2~s_{2_m}$ called $tal_2$ and a
% $\texttt{TAAF}~s_{2_m}~s_2'$ called $taa_2$ s.t.
% $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
% $s_1' \R s_{2_m}$ and
% $tal_1 \approx tal_2$ and
% if $taa_2$ is non empty then $\lnot \ell~s_{2_m}$;
% \item else there exists $s_2'$ and a
% $\texttt{TAL}~b~s_2~s_2'$ called $tal_2$ such that
% either $s_1' \mathrel{{\S} \cap {\L}} s_2'$ and
% $tal_1\approx tal_2$
% or $s_1' \mathrel{{\S} \cap {\L}} s_2$ and
% $\texttt{tal\_collapsable}~tal_1$ and $\lnot \ell~s_1$.
% \end{itemize}
% \end{theorem}
%
% The statement is also similar to the previous ones, but for the lack of
% the target code preamble.
\begin{comment}
\begin{corollary}
For every $s_1,s_1',s_2$ s.t.
there is a $\texttt{trace\_label\_return}~s_1~s_1'$ called $tlr_1$ and
$s_1 (\L \cap \S) s_2$
there exists $s_{2_m},s_2'$ s.t.
there is a $\texttt{trace\_label\_return}~s_2~s_{2_m}$ called $tlr_2$ and
there is a $\texttt{trace\_any\_any\_free}~s_{2_m}~s_2'$ called $taaf$
s.t. if $taaf$ is non empty then $\lnot (\texttt{as\_costed}~s_{2_m})$,
and $\texttt{tlr\_rel}~tlr_1~tlr_2$
and $s_1' (\S \cap \L) s_2'$ and
$s_1' \R s_{2_m}$.
\end{corollary}
\end{comment}
\begin{comment}
\begin{verbatim}
status_simulation_produce_tlr S1 S2 R
(* we start from this situation
st1 →→→→tlr→→→→ st1'
| \
L \---S--\
| \
st2_lab →taa→ st2 (the taa preamble is in general either empty or given
by the preceding call)
and we produce
st1 →→→→tlr→→→→ st1'
\\ / \
// R \-L,S-\
\\ | \
st2_lab →tlr→ st2_mid →taaf→ st2'
*)
st1 st1' st2_lab st2
(tlr1 : trace_label_return S1 st1 st1')
(taa2_pre : trace_any_any S2 st2_lab st2)
(sim_execute : status_simulation S1 S2 R)
on tlr1 : R st1 st2 → label_rel … st1 st2_lab →
∃st2_mid.∃st2'.
∃tlr2 : trace_label_return S2 st2_lab st2_mid.
∃taa2 : trace_any_any_free … st2_mid st2'.
(if taaf_non_empty … taa2 then ¬as_costed … st2_mid else True) ∧
R st1' st2' ∧ ret_rel … R st1' st2_mid ∧ label_rel … st1' st2' ∧
tlr_rel … tlr1 tlr2
\end{verbatim}
\end{comment}
\section{Conclusions and future works}
\label{conclusions}
The labelling approach is a technique to implement compilers that induce on
the source code a non uniform cost model determined from the object code
produced. The cost model assigns a cost to each basic block of the program.
The main theorem of the approach says that there is an exact
correspondence between the sequence of basic blocks started in the source
and object code, and that no instruction in the source or object code is
executed outside a basic block. Thus the cost of object code execution
can be computed precisely on the source.
In this paper we scale the labelling approach to cover a programming language
with function calls. This introduces new difficulties only when the language
is unstructured, i.e. it allows function calls to return anywhere in the code,
destroying the hope of a static prediction of the cost of basic blocks.
We restore static predictability by introducing a new semantics for unstructured
programs that single outs well structured executions. The latter are represented
by structured traces, a generalisation of streams of observables that capture
several structural invariants of the execution, like well nesting of functions
or the fact that every basic block must start with a code emission statement.
We show that structured traces are sufficiently structured to statically compute
a precise cost model on the object code.
We introduce a similarity relation on structured traces that must hold between
source and target traces. When the relation holds for every program, we prove
that the cost model can be lifted from the object to the source code.
In order to prove that similarity holds, we present a generic proof of forward
simulation that is aimed at pulling apart as much as possible the part of the
simulation related to non-functional properties (preservation of structure)
from that related to functional properties. In particular, we reduce the
problem of preservation of structure to that of showing a 1-to-many
forward simulation that only adds a few additional proof obligations to those
of a traditional, function properties only, proof.
All results presented in the paper are part of a larger certification of a
C compiler which is based on the labelling approach. The certification, done
in Matita, is the main deliverable of the FET-Open Certified Complexity (CerCo).
The short term future work consists in the completion of the certification of
the CerCo compiler exploiting the main theorem of this paper.
\paragraph{Related works.}
CerCo is the first project that explicitly tries to induce a
precise cost model on the source code in order to establish non-functional
properties of programs on an high level language. Traditional certifications
of compilers, like~\cite{compcert2,piton}, only explicitly prove preservation
of the functional properties.
Usually forward simulations take the following form: for each transition
from $s_1$ to $s_2$ in the source code, there exists an equivalent sequence of
transitions in the target code of length $n$. The number $n$ of transition steps
in the target code can just be the witness of the existential statement.
An equivalent alternative when the proof of simulation is constructive consists
in providing an explicit function, called \emph{clock function} in the
literature~\cite{clockfunctions}, that computes $n$ from $s_1$. Every clock
function constitutes then a cost model for the source code, in the spirit of
what we are doing in CerCo. However, we believe our solution to be superior
in the following respects: 1) the machinery of the labelling approach is
insensible to the resource being measured. Indeed, any cost model computed on
the object code can be lifted to the source code (e.g. stack space used,
energy consumed, etc.). On the contrary, clock functions only talk about
number of transition steps. In order to extend the approach with clock functions
to other resources, additional functions must be introduced. Moreover, the
additional functions would be handled differently in the proof.
2) the cost models induced by the labelling approach have a simple presentation.
In particular, they associate a number to each basic block. More complex
models can be induced when the approach is scaled to cover, for instance,
loop optimisations~\cite{loopoptimizations}, but the costs are still meant to
be easy to understand and manipulate in an interactive theorem prover or
in Frama-C.
On the contrary, a clock function is a complex function of the state $s_1$
which, as a function, is an opaque object that is difficult to reify as
source code in order to reason on it.
\bibliographystyle{splncs03}
\bibliography{ccexec}
% \appendix
% \section{Notes for the reviewers}
%
% The results described in the paper are part of a larger formalization
% (the certification of the CerCo compiler). At the moment of the submission
% we need to single out from the CerCo formalization the results presented here.
% Before the 16-th of February we will submit an attachment that contains the
% minimal subset of the CerCo formalization that allows to prove those results.
% At that time it will also be possible to measure exactly the size of the
% formalization described here. At the moment a rough approximation suggests
% about 2700 lines of Matita code.
%
% We will also attach the development version of the interactive theorem
% prover Matita that compiles the submitted formalization. Another possibility
% is to backport the development to the last released version of the system
% to avoid having to re-compile Matita from scratch.
%
% The programming and certification style used in the formalization heavily
% exploit dependent types. Dependent types are used: 1) to impose invariants
% by construction on the data types and operations (e.g. a traces from a state
% $s_1$ to a state $s_2$ can be concatenad to a trace from a state
% $s_2'$ to a state $s_3$ only if $s_2$ is convertible with $s_2'$); 2)
% to state and prove the theorems by using the Russell methodology of
% Matthieu Sozeau\footnote{Subset Coercions in Coq in TYPES'06. Matthieu Sozeau. Thorsten Altenkirch and Conor McBride (Eds). Volume 4502 of Lecture Notes in Computer Science. Springer, 2007, pp.237-252.
% }, better known in the Coq world as ``\texttt{Program}'' and reimplemented in a simpler way in Matita using coercion propagations\footnote{Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, Enrico Tassi: A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions. Logical Methods in Computer Science 8(1) (2012)}. However, no result presented depends
% mandatorily on dependent types: it should be easy to adapt the technique
% and results presented in the paper to HOL.
%
% Finally, Matita and Coq are based on minor variations of the Calculus of
% (Co)Inductive Constructions. These variations do not affect the CerCo
% formalization. Therefore a porting of the proofs and ideas to Coq would be
% rather straightforward.
\end{document}