Index: /Papers/itp2013/ccexec2.tex
===================================================================
--- /Papers/itp2013/ccexec2.tex (revision 3355)
+++ /Papers/itp2013/ccexec2.tex (revision 3356)
@@ -238,9 +238,11 @@
\label{labelling}
% \subsection{A brief introduction to the labelling approach}
-We briefly explain the labelling approach on the example in \autoref{examplewhile}.
+We briefly explain the labelling approach as introduced in~\cite{easylabelling}
+on the example in \autoref{examplewhile}.
The user wants to analyse the execution time of the program (the black lines in
\autoref{subfig:example_input}). He compiles the program using
a special compiler that first inserts in the code three label emission
-statements (\verb+EMIT L_i+) to mark the beginning of basic blocks;
+statements (\verb+EMIT L_i+) to mark the beginning of basic blocks
+(\autoref{subfig:example_input});
then the compiler compiles the code to machine code (\autoref{subfig:example_oc}),
granting that the execution of the source and object
@@ -248,5 +250,5 @@
This is achieved by keeping track of basic blocks during compilation, avoiding
all optimizations that alter the control flow. The latter can be recovered with
-a more refined version of the labelling approach~\cite{tranquill}, but in the
+a more refined version of the labelling approach~\cite{loopoptimizations}, but in the
present paper we stick to this simple variant for simplicity. Once the object
code is produced, the compiler runs a static code analyzer to associate to
@@ -273,5 +275,7 @@
never exceeds a certain bound~\cite{cerco}, which is now a functional property
of the code.
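The cost prediction described above can be illustrated with a small sketch. This is our own illustration, not part of the paper: the label names and per-block cycle counts are invented, and the real analyser works on the object code rather than on Python dictionaries.

```python
# Sketch: once the static analyser has associated a cost to each label,
# the cost of a run is predicted by summing the costs of the labels
# emitted by the EMIT statements encountered during execution.

def predict_cost(cost_model, emitted_labels):
    """Sum the statically computed per-block costs over an observed
    sequence of emitted labels."""
    return sum(cost_model[label] for label in emitted_labels)

# Invented cost model: cycles for each basic block of a toy program.
cost_model = {"L_1": 4, "L_2": 7, "L_3": 3}

# A run that executes the blocks of L_2 and L_3 twice.
trace = ["L_1", "L_2", "L_3", "L_2", "L_3"]
print(predict_cost(cost_model, trace))  # 24
```

The point of the method is that `cost_model` is computed once, statically, on the object code, while `trace` is determined by the (source-level) control flow alone.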
-\begin{figure}
+
+\vspace{0.5cm}
+\begin{figure}[!h]
\begin{subfigure}[t]{.32\linewidth}
\begin{lstlisting}[moredelim={[is][\color{red}]{}{}}]
@@ -327,5 +331,5 @@
\end{figure}
-\subsection{The labelling approach in presence of calls}
+\section{Extending the labelling approach to function calls}
%
Let's now consider a simple program written in C that contains a function
@@ -351,11 +355,12 @@
\begin{lstlisting}[xleftmargin=20pt]
void main() {
- EMIT L_1;
+ EMIT $L_1$;
$I_1$;
(*f)();
$I_2$;
}
+
void g() {
- EMIT L_2;
+ EMIT $L_2$;
$I_3$;
}
@@ -369,11 +374,12 @@
\begin{lstlisting}[xleftmargin=20pt]
main:
- EMIT L_1
+ EMIT $L_1$
$I_4$
CALL
$I_5$
RETURN
+
g:
- EMIT L_2
+ EMIT $L_2$
$I_6$
RETURN
@@ -397,5 +403,5 @@
stack so that the next \verb+RETURN+ would start at the second instruction
of $I_5$. The compiler would still be perfectly correct if a random, dead
-code instruction was also added just after each \verb+CALL+. More generally,
+code instruction was added after the \verb+CALL+ as the first instruction of $I_5$. More generally,
\emph{there is no guarantee that a correct compiler that respects the functional
behaviour of a program also respects the calling structure of the source code}.
@@ -409,5 +415,5 @@
stack). This property, however, is a property of the runs of object code
programs, and not a property of the object code that can be easily statically
-verified (as the ones we required for the basic labelling method).
+verified (as the ones required in \autoref{labelling} in the absence of function calls).
Therefore, we now need to single out those runs whose cost behaviour can be
statically predicted, and we need to prove that every run of programs generated
@@ -426,15 +432,13 @@
transition systems; 2) we show that on the object code we can correctly
compute the execution time of a structured run from the sequence of labels
-observed; 3) we give unstructured languages a semantics in terms of structured
-traces; 4) we show that on the source code we can correctly compute the
+observed; 3) we show that on the source code we can correctly compute the
execution time of a program if the compiler produces object code whose
-runs are weakly similar to the source code runs.

-The notion of weak simulation for structured traces is a global property
-which is hard to prove formally and much more demanding than the simple forward
-simulation required for proofs of preservation of functional properties.
-Therefore in \autoref{simulation} we will present a set of local simulation
-conditions that refine the corresponding conditions for forward simulation and
-that are sufficient to grant the production of weakly similar traces.
+runs are weakly similar to the source code runs and structured.
+
+The proof of correctness of such a compiler is harder than a traditional
+proof of preservation of functional properties, and a standard forward
+simulation argument does not work. In \autoref{simulation} we present
+a refinement of forward simulation that grants all required correctness
+properties.
All the definitions and theorems presented in the paper have been formalized
@@ -688,5 +692,5 @@
% in the semantics. This is the topic of the next section.
-\section{Structured traces}
+\subsection{Structured traces}
\label{semantics}
@@ -695,5 +699,5 @@
deterministic labelled transition system~\cite{LTS} $(S,\Lambda,\to)$
that refines the
-SOS by observing function calls and the beginning of basic blocks.
+SOS by observing function calls/returns and the beginning of basic blocks.
$S$ is the set of states of the program and
$\Lambda = \{ \tau, RET \} \cup \Labels \cup \Functions$
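The refined transition system can be sketched concretely; the code below is our own toy encoding (names invented), where a run is a list of observables drawn from $\Lambda$ and the weak trace of a run is obtained by erasing the silent $\tau$ steps.

```python
# Sketch: observables are tau (silent), "RET", labels "L_i", or function
# names. The weak trace of a run keeps only the non-silent observables,
# which is what the cost prediction looks at.

TAU = "tau"

def weak_trace(run):
    """Project a run (one observable per step) onto its visible observables."""
    return [obs for obs in run if obs != TAU]

# A run that emits L_1, calls f, emits f's label, and returns.
run = ["L_1", TAU, "f", "L_2", TAU, "RET", TAU]
print(weak_trace(run))  # ['L_1', 'f', 'L_2', 'RET']
```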
@@ -771,8 +775,8 @@
cases when the fragment starts at the beginning of a function body
(e.g. the one of \texttt{main}) because in that case nobody would have emitted
-the observable $\ell(f)$.
+the observable $\ell(f)$. We plan to compare the two approaches in the future.
\paragraph{Measurable execution fragments and their cost prediction.}
-The first main theorem of CerCo deals with programs written in object code.
+The first main theorem of CerCo deals with the object code.
It states that the execution cost of
certain execution fragments, that we call \emph{measurable fragments}, can be
@@ -799,5 +803,5 @@
$s_0 \To s_n$ and $s_n$ must be a label emission statement.
-\textbf{CSC: PROVA}
+%\textbf{CSC: PROVA}
% The theorem is proved by structural induction over the structured
% trace, and is based on the invariant that
@@ -832,5 +836,5 @@
accurate for the object code, we get as a corollary that it is also accurate
for the source code if the compiler preserves weak traces and
-propagates measurability. Thus it becomes possible to compute cost models
+propagates measurability. Thus, as prescribed by CerCo's methodology~\cite{fopara}, it becomes possible to compute cost models
on the object code, transfer it to the source code and then reason comfortably
on the source code only.
@@ -845,15 +849,18 @@
\end{theorem}
-\section{Forward simulation}
+\section{Proving the compiler correctness}
\label{simulation}
Because of \autoref{preservation}, to certify a compiler for the labelling
approach we need to both prove that it respects the functional semantics of the
program, and that it preserves weak traces and propagates measurability.
+We achieve this by independently proving the three properties for each compiler
+pass.
The first property is standard and can be proved by means of a forward simulation argument (see for example~\cite{compcert}) that runs like this.
First a relation between the corresponding
-source and target states is established. Then a lemma establishes
+source and target states is defined. Then a lemma establishes
a local simulation condition: given two states in relation, if the source
one performs one step then the target one performs zero or more steps and
the two resulting states are synchronized again according to the relation.
+No requirements are imposed on the intermediate target states.
Finally, the lemma is iterated over the execution trace to establish the
final result.
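The local simulation condition just described can be checked mechanically on finite toy systems. The sketch below is ours (invented transition systems, not the paper's SOS semantics): it verifies that whenever related states take a source step, the target can answer with zero or more steps and land back in the relation.

```python
# Sketch of the local forward simulation condition on finite systems.
# Transitions are sets of (from, to) pairs; rel is the candidate relation.

def reachable(trans, s, bound=8):
    """States reachable from s in 0..bound steps."""
    seen = {s}
    for _ in range(bound):
        seen |= {b for (a, b) in trans if a in seen}
    return seen

def check_forward_sim(src_trans, tgt_trans, rel):
    """If s1 rel s2 and s1 -> s1' in the source, the target must reach
    (in zero or more steps) some s2' with s1' rel s2'."""
    return all(
        any((s1p, s2p) in rel for s2p in reachable(tgt_trans, s2))
        for (s1, s2) in rel
        for (a, s1p) in src_trans if a == s1
    )

# One source step is matched by two target steps (1-to-many simulation).
src = {("p0", "p1")}
tgt = {("q0", "q1"), ("q1", "q2")}
rel = {("p0", "q0"), ("p1", "q2")}
print(check_forward_sim(src, tgt, rel))  # True
```

Note that, exactly as the text says, nothing is required of the intermediate target state `q1`; this is what the refinement of \autoref{simulation} changes.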
@@ -865,5 +872,6 @@
kind of local or global analysis of the code followed by a compositional, order
preserving translation of every instruction. In order to produce structured
-traces, however, code emission cannot be fully compositional any longer.
+traces, however, code emission cannot be fully compositional any longer,
+and requirements need to be enforced on intermediate target states.
For example, consider~requirement \ref{req3} that asks every function body
@@ -886,6 +894,6 @@
the state just after the return in the source code is matched by the state $s_2$
after these steps in the object code. However, the aforementioned requirement
-does not involve $s_2$, but the state reached after the return in the object
-code, that preceeds $s_2$ in the execution fragment. Therefore the requirement
+does not involve $s_2$, but the intermediate state reached after the return in the object
+code. Therefore this requirement too
cannot be enforced with the standard forward simulation argument.
@@ -1364,59 +1372,5 @@
% an equivalence relation.
\textbf{}

-We summarise here the results of the previous sections. Each intermediate
-unstructured language can be given a semantics based on structured traces,
-that single out those runs that respect a certain number of invariants.
-A cost model can be computed on the object code and it can be used to predict
-the execution costs of runs that produce structured traces. The cost model
-can be lifted from the target to the source code of a pass if the pass maps
-structured traces to similar structured traces. The latter property is called
-a \emph{forward simulation}.
-
-As for labelled transition systems, in order to establish the forward
-simulation we are interested in (preservation of observables), we are
-forced to prove a stronger notion of forward simulation that introduces
-an explicit relation between states. The classical notion of a 1-to-many
-forward simulation is the existence of a relation $\S$ over states such that
-if $s_1 \S s_2$ and $s_1 \to^1 s_1'$ then there exists an $s_2'$ such that
-$s_2 \to^* s_2'$ and $s_1' \S s_2'$. In our context, we need to replace the
-one- and multi-step transition relations $\to^n$ with the existence of
-a structured trace between the two states, and we need to add the request that
-the two structured traces are similar. Thus what we would like to state is
-something like:\\
-for all $s_1,s_2,s_1'$ such that there is a $\tau_1$ from
-$s_1$ to $s_1'$ and $s_1 \S s_2$ there exists an $s_2'$ such that
-$s_1' \S s_2'$ and a $\tau_2$ from $s_2$ to $s_2'$ such that
-$\tau_1$ is similar to $\tau_2$. We call this particular form of forward
-simulation \emph{trace reconstruction}.
-
-The statement just introduced, however, is too simplistic and not provable
-in the general case. To understand why, consider the case of a function call
-and the pass that fixes the parameter passing conventions. A function
-call in the source code takes in input an arbitrary number of pseudo-registers (the actual parameters to pass) and returns an arbitrary number of pseudo-registers (where the result is stored). A function call in the target language has no
-input nor output parameters. The pass must add explicit code before and after
-the function call to move the pseudo-registers content from/to the hardware
-registers or the stack in order to implement the parameter passing strategy.
-Similarly, each function body must be augmented with a preamble and a postamble
-to complete/initiate the parameter passing strategy for the call/return phase.
-Therefore what used to be a call followed by the next instruction to execute
-after the function return, now becomes a sequence of instructions, followed by
-a call, followed by another sequence. The two states at the beginning of the
-first sequence and at the end of the second sequence are in relation with
-the status before/after the call in the source code, like in an usual forward
-simulation. How can we prove however the additional condition for function calls
-that asks that when the function returns the instruction immediately after the
-function call is called? To grant this invariant, there must be another relation
-between the address of the function call in the source and in the target code.
-This additional relation is to be used in particular to relate the two stacks.
-
-Another example is given by preservation of code emission statements. A single
-code emission instruction can be simulated by a sequence of steps, followed
-by a code emission, followed by another sequence. Clearly the initial and final
-statuses of the sequence are to be in relation with the status before/after the
-code emission in the source code. In order to preserve the structured traces
-invariants, however, we must consider a second relation between states that
-traces the preservation of the code emission statement.
+@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Therefore we now introduce an abstract notion of relation set between abstract
@@ -1441,12 +1395,8 @@
-Therefore we have successfully splitted as much as possible the proof of
-preservation of functional properties from that of non-functional ones.
-Secondly, combined with the results in the previous section, it implies
-that the cost model can be computed on the object code and lifted to the
-source code to reason on non-functional properties, assuming that
-the 1-to-many forward simulation conditions are fulfilled for every
-compiler pass.
+
+@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
\paragraph{Relation sets.}
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Let $S_1$ and $S_2$ be two deterministic labelled transition systems as described
in \autoref{semantics}. We introduce now the four relations $\mathcal{S,C,R,L}\subseteq S_1\times S_2$
@@ -1468,11 +1418,10 @@
Two states
$s_1$ and $s_2$ are $\R$-related if every time $s_1$ is the
-successor of a call state that is $\C$-related to a call state
+successor of a call state $s_1'$ that is $\C$-related to a call state
$s_2'$ in the target code, then $s_2$ is the successor of $s_2'$. Formally:
$$s_1\R s_2 \iffdef \forall s_1',s_2'.s_1'\C s_2' \to s_1'\ar s_1 \to s_2' \ar s_2.$$
We will require all pairs of states that return from related calls to be
-$\R$-related. This is the fundamental requirement granting
-that the target trace is well structured, \emph{i.e.}\ that calls are well
-nested and returning where they are supposed to.
+$\R$-related. This, in combination with a dual requirement on function calls,
+will grant that calls return exactly where they are supposed to.
We say that states $s_1\in S_1$ and $s_2\in S_2$ are label-related
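The $\R$ relation can be sketched on finite state spaces; the encoding below is ours (the `succ` maps and state names are invented). It checks that for every $\C$-related pair of call states, if $s_1$ is the successor of the source call then $s_2$ is the successor of the target call.

```python
# Sketch of the R relation: C is a set of pairs of C-related call states;
# succ1/succ2 map a call state to its successor (the return point) in the
# source and target code respectively.

def r_related(s1, s2, C, succ1, succ2):
    """s1 R s2 iff for all (c1, c2) in C: if s1 is c1's successor,
    then s2 is c2's successor."""
    return all(
        succ2.get(c2) == s2
        for (c1, c2) in C
        if succ1.get(c1) == s1
    )

# Calls c and d are C-related; their return points are p1 and q1.
C = {("c", "d")}
succ1 = {"c": "p1"}
succ2 = {"d": "q1"}
print(r_related("p1", "q1", C, succ1, succ2))  # True
print(r_related("p1", "q2", C, succ1, succ2))  # False
```

Requiring all returning pairs to be $\R$-related is thus exactly the well-nesting condition: a return must land at the successor of the matching call.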
@@ -1485,7 +1434,8 @@
\end{itemize}
-The results that follow all rely on common hypotheses to hold for the
-systems $S_1$ and $S_2$ endowed with the relations $\S$ and $\C$. These hypotheses
-are depicted by the diagrams in \autoref{fig:forwardsim}.
+Given the relations $\S$ and $\C$, \autoref{fig:forwardsim} defines a set of
+local simulation conditions that are sufficient to grant the correctness of
+the compiler.
+
\begin{figure}
\centering
@@ -1593,17 +1543,19 @@
\draw [new, overlay] (t1) to [bend left ] node [rel] {$\R$} (r);
\end{tikzpicture}
-\caption{The hypotheses of ???. Dashed relations are implied by solid ones.}
+\caption{Local simulation conditions. Each one states that the existence of
+states in~XXX such that the dashed relations hold is implied by the assumptions
+drawn as solid lines.}
\label{fig:forwardsim}
\end{figure}
\begin{lemma}
-If $S_1$, $S_2$,$\S$ and $\C$ satisfy the diagrams in \autoref{fig:forwardsim},
-$T_1:s_1\To s_1'$ is a structured fragment not starting with a $\ell(f)$ emission,
-and $s_1\S s_2$, then there is $T_2:s_2\To s_2'$ with $T\approx T'$ and $s_1'\S s_2$.
+If $S_1,S_2,\S,\C$ satisfy the diagrams in \autoref{fig:forwardsim},
+$T_1=s_1\To s_1'$ is a structured fragment not starting with a $\ell(f)$ emission,
+and $s_1\S s_2$, then there is $T_2=s_2\To s_2'$ with $T_1\approx T_2$ and $s_1'\S s_2'$.
\end{lemma}
\begin{theorem}
-If $S_1$, $S_2$, $\S$ and $\C$ satisfy the diagrams in \autoref{fig:forwardsim},
-$M_1:s_1\MeasTo s_1'$ is a measurable fragment of $S_1$ and $s_2$ is such that
-$s_1\L s_2$, then there is $M_2:s_2\MeasTo s_2'$ with $M_1\approx M_2$.
+If $S_1,S_2,\S,\C$ satisfy the diagrams in \autoref{fig:forwardsim},
+$M_1:s_1\to^{*} s_1'$ is a measurable fragment of $S_1$ and $s_2$ is such that
+$s_1\L s_2$, then there is $M_2:s_2\to^{*} s_2'$ with $M_1\approx M_2$.
\end{theorem}
@@@@@@@@@@@@@@@@@@@@@@@
@@ -2006,27 +1958,33 @@
can be computed precisely on the source.
-In this paper we scale the labelling approach to cover a programming language
-with function calls. This introduces new difficulties only when the language
+In this paper we scaled the labelling approach to cover a programming language
+with function calls. This introduces new difficulties when the language
is unstructured, i.e. it allows function calls to return anywhere in the code,
destroying the hope of a static prediction of the cost of basic blocks.
-We restore static predictability by introducing a new semantics for unstructured
+We restored static predictability by introducing a new semantics for unstructured
programs that singles out well structured executions. The latter are represented
by structured traces, a generalisation of streams of observables that capture
several structural invariants of the execution, like well nesting of functions
or the fact that every basic block must start with a code emission statement.
-We show that structured traces are sufficiently structured to statically compute
-a precise cost model on the object code.
-
-We introduce a similarity relation on structured traces that must hold between
-source and target traces. When the relation holds for every program, we prove
-that the cost model can be lifted from the object to the source code.
-
-In order to prove that similarity holds, we present a generic proof of forward
-simulation that is aimed at pulling apart as much as possible the part of the
-simulation related to non-functional properties (preservation of structure)
-from that related to functional properties. In particular, we reduce the
-problem of preservation of structure to that of showing a 1-to-many
-forward simulation that only adds a few additional proof obligations to those
-of a traditional, functional properties only, proof.
+We showed that structured traces are sufficiently well behaved to statically compute a precise cost model on the object code.
+
+We also proved that the cost model computed on the object code is valid
+on the source code if the compiler respects two requirements: the weak execution
+traces of the source and target code must be the same, and the object
+code execution fragments must be structured.
+
+To prove that a compiler respects these requirements we extended the notion
+of forward simulation proof for a labelled transition system to grant
+preservation of structured fragments. If the source language of the compiler
+is structured, all its execution fragments are, allowing us to deduce from
+preservation of structure that object code fragments are structured too.
+
+Finally, we identified measurable execution fragments as those whose
+execution time (once compiled) can be computed exactly by looking at the object
+code weak execution traces only. A final instrumentation pass on the source
+code can then be used to turn the non-functional property of having a certain
+cost into the functional property that a certain global variable,
+incremented at the beginning of every block according to the induced cost model,
+has a certain value.
All results presented in the paper are part of a larger certification of a
@@ -2056,5 +2014,8 @@
insensible to the resource being measured. Indeed, any cost model computed on
the object code can be lifted to the source code (e.g. stack space used,
-energy consumed, etc.). On the contrary, clock functions only talk about
+energy consumed, etc.) simply by reproving an analogue of~\autoref{static}.
+For example, in CerCo we transported to the source level not only the execution
+time cost model, but also the amount of stack used by function calls.
+On the contrary, clock functions only talk about
number of transition steps. In order to extend the approach with clock functions
to other resources, additional functions must be introduced. Moreover, the