Index: /Deliverables/addenda/indexed_labels/report.tex
===================================================================
--- /Deliverables/addenda/indexed_labels/report.tex (revision 1677)
+++ /Deliverables/addenda/indexed_labels/report.tex (revision 1678)
@@ -626,5 +626,5 @@
\section{Introduction}
-In~\cite{D2.1}, Armadio et al. propose an approach for building a compiler for a large fragment of the \textsc{c} programming language.
+In~\cite{D2.1}, Armadio \emph{et al.}\ propose an approach for building a compiler for a large fragment of the \textsc{c} programming language.
The novelty of their proposal lies in the fact that their proposed design is capable of lifting execution cost information from the compiled code and presenting it to the user.
This idea is foundational for the CerCo project, which strives to produce a mechanically certified version of such a compiler.
@@ -653,5 +653,5 @@
\end{itemize}
The first point unveils a weakness of the current labelling approach when it comes to some common code transformations performed along a compilation chain.
-In particular, most \emph{loop optimizations} are disruptive, in the sense outlined in the first bulletpoint above.
+In particular, most \emph{loop optimisations} are disruptive, in the sense outlined in the first bullet point above.
An example optimisation of this kind is \emph{loop peeling}.
This optimisation is employed by compilers in order to trigger other optimisations, for example, dead code elimination or invariant code motion.
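To make the transformation concrete, the following is a minimal sketch (in Python, purely illustrative and not part of the CerCo compiler) of loop peeling over loops modelled as guard/body functions on an integer state; peeling preserves the computed result.

```python
# Illustrative model only: a loop is a (guard, body) pair acting on an
# integer state, and peeling unfolds one guarded iteration up front.
def run_while(guard, body, state):
    while guard(state):
        state = body(state)
    return state

def run_peeled(guard, body, state):
    # Peeled form: `if b then S; while b do S`.
    if guard(state):
        state = body(state)
    return run_while(guard, body, state)

guard = lambda s: s < 5
body = lambda s: s + 2
# Both forms compute the same final state.
assert run_while(guard, body, 0) == run_peeled(guard, body, 0) == 6
```

The point of the peeled copy is that it can then be specialised (for example by constant propagation or dead code elimination) independently of the residual loop.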
@@ -665,6 +665,6 @@
If one looks closely, the source of the weakness of the labelling approach as presented in~\cite{D2.1} is common to both points: the inability to state different costs for different occurrences of labels, where the difference might be originated by labels being duplicated along the compilation, or the costs being sensitive to the current state of execution.
The preliminary work we present here addresses this weakness by introducing cost labels that depend on which iteration of their containing loops they occur in.
-This is achieved by means of \emph{indexed labels}; all cost labels are decorated with formal indexes coming from the loops containing such labels.
-These indexes allow us to rebuild, even after multiple loop transformations, which iterations of the original loops in the source code a particular label occurrence belongs to.
+This is achieved by means of \emph{indexed labels}; all cost labels are decorated with formal indices coming from the loops containing such labels.
+These indices allow us to rebuild, even after multiple loop transformations, which iterations of the original loops in the source code a particular label occurrence belongs to.
During the annotation stage of the source code, this information is presented to the user by means of \emph{dependent costs}.
@@ -679,5 +679,5 @@
\paragraph{Loop unrolling}
This optimisation consists of inserting several copies of the loop body inside the loop itself (adding appropriate guards, or avoiding them altogether if enough information about the loop's guard is available at compile time).
-This can limit the number of (conditional or unconditional) jumps executed by the code and trigger further optimizations dealing with pipelining, if appropriate for the architecture.
+This can limit the number of (conditional or unconditional) jumps executed by the code and trigger further optimisations dealing with pipelining, if appropriate for the architecture.
\\\\
Whilst we cover only two loop optimisations in this report, we argue that the work presented herein provides a solid foundation for extending the labelling approach to cover more of the common optimisations, as well as for gaining insight into how to integrate advanced cost estimation techniques, such as cache analysis, into the CerCo compiler.
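A naive model of the unrolling described above, again as an illustrative Python sketch over guard/body pairs on an integer state (not CerCo code): each trip through the unrolled loop executes up to $n$ guarded copies of the body, and the final state is unchanged.

```python
# Illustrative model only: `while b do S` versus its n-way unrolled form
# `while b do (S; if b then S; ...)` with explicit guards between copies.
def run_while(guard, body, state):
    while guard(state):
        state = body(state)
    return state

def run_unrolled(guard, body, state, n=2):
    while guard(state):
        state = body(state)
        for _ in range(n - 1):      # up to n-1 extra guarded copies
            if not guard(state):
                break
            state = body(state)
    return state

guard = lambda s: s < 5
body = lambda s: s + 2
assert run_unrolled(guard, body, 0, n=3) == run_while(guard, body, 0) == 6
```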
@@ -686,9 +686,9 @@
\paragraph{Outline}
-We will present our approach on a minimal `toy' imperative language, \imp{} with \s{goto}s, which we present in \autoref{sec:defimp} along with formal definitions of the loop transformations.
+We will present our approach on a minimal `toy' imperative language, \imp{} with \s{goto}s, which we introduce in Section~\ref{sec:defimp} along with formal definitions of the loop transformations.
This language already presents most of the difficulties encountered when dealing with \textsc{c}, so we stick to it for the sake of this presentation.
-In Section~\autoref{sec:labelling} we summarize the labelling approach as presented in~\cite{D2.1}.
-Section~\autoref{sec:indexedlabels} presents \emph{indexed labels}, our proposal for dependent labels which are able to describe precise costs even in the presence of the various loop transformations we consider.
-Finally Section~\autoref{sec:conc} goes into more detail regarding the implementation of indexed labels in CerCo's untrusted compiler and speculates on further work on the subject.
+In Section~\ref{sec:labelling} we summarise the labelling approach as presented in~\cite{D2.1}.
+Section~\ref{sec:indexedlabels} presents \emph{indexed labels}, our proposal for dependent labels which are able to describe precise costs even in the presence of the various loop transformations we consider.
+Finally Section~\ref{sec:conc} goes into more detail regarding the implementation of indexed labels in CerCo's untrusted compiler and speculates on further work on the subject.
\section{\imp{} with goto}\label{sec:defimp}
@@ -696,5 +696,5 @@
The language was designed in order to pose problems for the existing labelling approach, and as a testing ground for our new notion of indexed labels.
-The syntax and operational semantics of our toy language are presented in \autoref{fig:minimp}.
+The syntax and operational semantics of our toy language are presented in Figure~\ref{fig:minimp}.
Note that we may augment the language further, with \s{break} and \s{continue}, at no extra expense.
\begin{figureplr}
@@ -792,5 +792,5 @@
Note that \s{break} and \s{continue} statements should be replaced with \s{goto}s in the peeled body $S$.
-\paragraph{Loop unrolling.}
+\paragraph{Loop unrolling}
$$
\sop{while}b\sbin{do}S\mapsto
@@ -800,19 +800,19 @@
\sop{if} b \sbin{then} S[\ell^n_i/\ell_i]) \cdots)
$$
-where $\ell^j_i$ are again fresh labels for any $\ell_i$ labelling a statement
-in $S$. This is a willingly naïf version of loop unrolling, which usually
-targets less general loops. The problem it poses to Cerco's labelling approach
-are independent of the cleverness of the actual transformation.
-
-\section{Labelling: a quick sketch of the previous approach}\label{sec:labelling}
-Plainly labelled \imp{} is obtained adding to the code \emph{cost labels}
-(with metavariables $\alpha,\beta,\ldots$), and cost-labelled statements:
-$$S,T\gramm \cdots \mid \alpha: S$$
-Cost labels allow for some program points to be tracked along the compilation
-chain. For further details we refer to~\cite{D2.1}.
-
-With labels the small-step semantics turns into a labelled transition
-system and a natural notion of trace (i.e.\ lists of labels) arises.
-Evaluation of statements is enriched with traces, so that rules are like
+where $\ell^j_i$ are again fresh labels for any $\ell_i$ labelling a statement in $S$.
+This is a deliberately na\"{i}ve version of loop unrolling; realistic versions usually target less general loops.
+The problems this transformation poses to CerCo's labelling approach are independent of the sophistication of the actual transformation.
+
+\section{Labelling: a quick sketch of the previous approach}
+\label{sec:labelling}
+Plainly labelled \imp{} is obtained by adding to the code \emph{cost labels} (with metavariables $\alpha,\beta,\ldots$), and costlabelled statements:
+$$
+S,T\gramm \cdots \mid \alpha: S
+$$
+Cost labels allow us to track some program points along the compilation chain.
+For further details we refer to~\cite{D2.1}.
+
+With labels, the small-step semantics turns into a labelled transition system, and a natural notion of trace (i.e.\ lists of labels) arises.
+The evaluation of statements is enriched with traces, so that rules follow a pattern similar to the following:
$$
\begin{array}{lblL}
@@ -821,17 +821,11 @@
& \text{etc.}
\end{array}$$
-where we identify cost labels $\alpha$ with singleton traces and we use $\varepsilon$
-for the empty trace. Cost labels are emitted by cost-labelled statements only%
-\footnote{In the general case the evaluation of expressions can emit cost labels
-too (see \autoref{sec:conc}).}. We then write $\to[\lambda]\!\!^*$ for the transitive
-closure of the small-step semantics which produces by concatenation the trace
-$\lambda$.
-
-\paragraph{Labelling.}
-Given an \imp{} program $P$ its \emph{labelling} $\alpha:\Ell(P)$ in $\ell\imp$
-is
-defined by putting cost labels after every branching, at the start of both
-branches, and a cost label at the beginning of the program. So the relevant
-cases are
+Here, we identify cost labels $\alpha$ with singleton traces and we use $\varepsilon$ for the empty trace.
+Cost labels are emitted by cost-labelled statements only\footnote{In the general case the evaluation of expressions can emit cost labels too (see Section~\ref{sec:conc}).}.
+We then write $\to[\lambda]\!\!^*$ for the transitive closure of the small-step semantics, which produces by concatenation the trace $\lambda$.
+
+\paragraph{Labelling}
+Given an \imp{} program $P$, its \emph{labelling} $\alpha:\Ell(P)$ in $\ell\imp$ is defined by putting cost labels after every branching statement, at the start of both branches, and a cost label at the beginning of the program.
+The relevant cases are
$$\begin{aligned}
\Ell(\sop{if}e\sbin{then}S\sbin{else}T) &=
@@ -840,53 +834,41 @@
(\sop{while}e\sbin{do}\alpha:\Ell(S));\beta:\s{skip}
\end{aligned}$$
-where $\alpha,\beta$ are fresh cost labels, and while in other cases the
-definition just passes to substatements.
-
-\paragraph{Labels in the rest of the compilation chain.} All languages further
-down the chain get a new sequential statement $\sop{emit}\alpha$ whose effect is
-to be consumed in a labelled transition while keeping the same state. All other
-instructions guard their operational semantics and do not emit cost labels.
-
-Preservation of semantics throughout the compilation process is restated, in
-rough terms, as
-$$\text{starting state of $P$}\to[\lambda]\!\!^*\;\text{halting state} \iff
- \text{starting state of $\mathcal C(P)$} \to[\lambda]\!\!^*\;\text{halting state},$$
-where $P$ is a program of a language along the compilation chain, starting and
-halting states depend on the language, and $\mathcal C$ is the
-compilation function\footnote{The case of divergent computations needs
-to be addressed too. Also, the requirement can be weakened by demanding some
-sort of equivalence of the traces rather than equality. Both these issues escape
-the scope of this presentation.}.
+where $\alpha,\beta$ are fresh cost labels.
+In all other cases the definition just passes to substatements.
+
+\paragraph{Labels in the rest of the compilation chain}
+All languages further down the chain get a new sequential statement $\sop{emit}\alpha$ whose effect is to be consumed in a labelled transition while keeping the same state.
+All other instructions guard their operational semantics and do not emit cost labels.
+
+Preservation of semantics throughout the compilation process is restated, in rough terms, as:
+$$
+\text{starting state of $P$}\to[\lambda]\!\!^*\;\text{halting state} \iff
+\text{starting state of $\mathcal C(P)$} \to[\lambda]\!\!^*\;\text{halting state}
+$$
+Here $P$ is a program of a language along the compilation chain, starting and halting states depend on the language, and $\mathcal C$ is the compilation function\footnote{The case of divergent computations needs to be addressed too.
+Also, the requirement can be weakened by demanding some weaker form of equivalence of the traces rather than equality.
+Both of these issues are beyond the scope of this presentation.}.
\paragraph{Instrumentations}
-Let $\mathcal C$ be the whole compilation from $\ell\imp$ to the labelled
-version of some low-level language $L$. Supposing such compilation has not
-introduced any new loop or branching, we have that
+Let $\mathcal C$ be the whole compilation from $\ell\imp$ to the labelled version of some lowlevel language $L$.
+Supposing such compilation has not introduced any new loop or branching, we have that:
\begin{itemize}
- \item every loop contains at least a cost label (\emph{soundness condition}),
-and
- \item every branching has different labels for the two branches
- (\emph{preciseness condition}).
+\item
+Every loop contains at least one cost label (\emph{soundness condition}), and
+\item
+Every branching has different labels for the two branches (\emph{preciseness condition}).
\end{itemize}
-With these two conditions, we have that each and every cost label in
-$\mathcal C(P)$ for any $P$ corresponds to a block of sequential instructions,
-to which we can assign a constant \emph{cost}\footnote{This in fact requires the
-machine architecture to be simple enough, or for some form of execution analysis
-to take place.} We can therefore assume a \emph{cost mapping} $\kappa_P$ from
-cost labels to natural numbers, assigning to each cost label $\alpha$ the cost
-of the block containing the single occurrance of $\alpha$.
-
-Given any cost mapping $\kappa$, we can enrich a labelled program so that a
-particular fresh variable (the \emph{cost variable} $c$) keeps track of the
-accumulation of costs during the execution. We call this procedure
-\emph{instrumentation} of the program, and it is defined recursively by
-$$
- \mathcal I(\alpha:S) = c \ass c + \kappa(\alpha) ; \mathcal I(S)
-$$
-while in all other cases the definition passes to substatements.
-
-\paragraph{The problem with loop optimizations.}
-Let us take loop peeling, and apply it to the labelling of a program without any
-adjustment:
+With these two conditions, we have that each and every cost label in $\mathcal C(P)$ for any $P$ corresponds to a block of sequential instructions, to which we can assign a constant \emph{cost}\footnote{This in fact requires the machine architecture to be `simple enough', or for some form of execution analysis to take place.}.
+We may therefore assume the existence of a \emph{cost mapping} $\kappa_P$ from cost labels to natural numbers, assigning to each cost label $\alpha$ the cost of the block containing the single occurrence of $\alpha$.
+
+Given any cost mapping $\kappa$, we can enrich a labelled program so that a particular fresh variable (the \emph{cost variable} $c$) keeps track of the accumulation of costs during the execution.
+We call this procedure \emph{instrumentation} of the program, and it is defined recursively by:
+$$
+\mathcal I(\alpha:S) = c \ass c + \kappa(\alpha) ; \mathcal I(S)
+$$
+In all other cases the definition passes to substatements.
+
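The effect of $\mathcal I$ can be mimicked by replaying an emitted trace against a cost mapping; the following Python sketch is illustrative only (the label names and the values of $\kappa$ are hypothetical, not taken from CerCo):

```python
# Replaying a label trace while accumulating the cost variable c, as the
# instrumented program would: each emitted label alpha adds kappa(alpha).
kappa = {"alpha": 3, "beta": 1}    # hypothetical per-label block costs

def accumulated_cost(trace, kappa):
    c = 0
    for label in trace:
        c += kappa[label]
    return c

# A run emitting alpha twice and beta once accumulates 2*3 + 1 = 7.
assert accumulated_cost(["alpha", "alpha", "beta"], kappa) == 7
```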
+\paragraph{The problem with loop optimisations}
+Let us take loop peeling, and apply it to the labelling of a program without any prior adjustment:
$$
(\sop{while}e\sbin{do}\alpha:S);\beta:\s{skip}
@@ -894,86 +876,68 @@
(\sop{if}b\sbin{then} \alpha : S; \sop{while} b \sbin{do} \alpha :
S[\ell'_i/\ell_i]);
-\beta:\s{skip}$$
-What happens is that the cost label $\alpha$ is duplicated into two distinct
-occurrences. If these two occurrences correspond to different costs in the
-compiled code, the best the cost mapping can do is to take the maximum of the
-two, preserving soundness (i.e.\ the cost estimate still bounds the actual one)
-but loosing preciseness (i.e.\ the actual cost would be strictly less than its
-estimate).
-
-\section{Indexed labels}\label{sec:indexedlabels}
-This section presents the core of the new approach. In brief points it amounts
-to the following.
-\begin{enumerate}[\bfseries \ref*{sec:indexedlabels}.1.]
- \item\label{en:outline1}
-Enrich cost labels with formal indexes corresponding, at the beginning of
-the process, to which iteration of the loop they belong to.
- \item\label{en:outline2}
-Each time a loop transformation is applied and a cost labels is split in
-different occurrences, each of these will be reindexed so that every time they
-are emitted their position in the original loop will be reconstructed.
- \item\label{en:outline3}
-Along the compilation chain, add alongside the \s{emit} instruction other
-ones updating the indexes, so that iterations of the original loops can be
-rebuilt at the operational semantics level.
- \item\label{en:outline4}
-The machinery computing the cost mapping will still work, but assigning
-costs to indexed cost labels, rather than to cost labels as we wish: however
-\emph{dependent costs} can be calculated, where dependency is on which iteration
-of the containing loops we are in.
+\beta:\s{skip}
+$$
+What happens is that the cost label $\alpha$ is duplicated into two distinct occurrences.
+If these two occurrences correspond to different costs in the compiled code, the best the cost mapping can do is to take the maximum of the two, preserving soundness (i.e.\ the cost estimate still bounds the actual one) but losing preciseness (i.e.\ the actual cost would be strictly less than its estimate).
+
+\section{Indexed labels}
+\label{sec:indexedlabels}
+This section presents the core of the new approach.
+In brief points it amounts to the following:
+\begin{enumerate}[\bfseries~\ref*{sec:indexedlabels}.1.]
+\item
+\label{en:outline1}
+Enrich cost labels with formal indices corresponding, at the beginning of the process, to which iteration of the loop they belong to.
+\item
+\label{en:outline2}
+Each time a loop transformation is applied and a cost label is split into different occurrences, each of these is reindexed so that every time it is emitted its position in the original loop can be reconstructed.
+\item
+\label{en:outline3}
+Along the compilation chain, alongside the \s{emit} instruction we add other instructions updating the indices, so that iterations of the original loops can be rebuilt at the operational semantics level.
+\item
+\label{en:outline4}
+The machinery computing the cost mapping will still work, but assigning costs to indexed cost labels, rather than to cost labels as we wish.
+However, \emph{dependent costs} can be calculated, where dependency is on which iteration of the containing loops we are in.
\end{enumerate}
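To illustrate the last point, a dependent cost can be modelled as a function from the iteration indices of the containing loops to a cost, rather than as a single constant; the Python sketch below is illustrative and its cost values are purely hypothetical:

```python
# Illustrative dependent cost: after peeling, iteration 0 of a loop may
# have a different (here cheaper) cost than the remaining iterations.
def dependent_cost(i0):
    return 2 if i0 == 0 else 5     # hypothetical costs

# Total cost of four iterations of the containing loop.
total = sum(dependent_cost(i0) for i0 in range(4))
assert total == 2 + 5 + 5 + 5
```

A constant cost mapping would have to charge the maximum (5) at every iteration, which is exactly the loss of preciseness described above.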
\subsection{Indexing the cost labels}\label{ssec:indlabs}
-\paragraph{Formal indexes and $\iota\ell\imp$.}
-Let $i_0,i_1,\ldots$ be a sequence of distinguished fresh identifiers that will
-be used as loop indexes. A \emph{simple expression} is an affine arithmetical
-expression in one of these indexes, that is $a*i_k+b$ with $a,b,k \in \mathbb N$.
-Simple expressions $e_1=a_1*i_k+b_1$, $e_2=a_2*i_k+b_2$ in the same index can be
-composed, yielding $e_1\circ e_2\ass (a_1a_2)*i_k + (a_1b_2+b_1)$, and this operation
-has an identity element in $id_k \ass 1*i_k+0$. Constants can be expressed as simple
-expressions, so that we identify a natural $c$ with $0*i_k+c$.
-
-An \emph{indexing} (with metavariables $I$, $J$, \ldots) is a list of
-transformations of successive formal indexes dictated by simple expressions,
-that is a mapping%
-\footnote{Here we restrict each mapping to be a simple expression
-\emph{on the same index}. This might not be the case if more loop optimizations
-are accounted for (for example, interchanging two nested loops).}
-$$i_0\mapsto a_0*i_0+b_0,\dots, i_{k-1} \mapsto a_{k-1}*i_{k-1}+b_{k-1}.$$
-
-An \emph{indexed cost label} (metavariables $\alphab$, $\betab$, \ldots) is
-the combination of a cost label $\alpha$ and an indexing $I$, written
-$\alpha\la I\ra$. The cost label underlying an indexed one is called its
-\emph{atom}. All plain labels can be considered as indexed ones by taking
-an empty indexing.
-
-\imp{} with indexed labels ($\iota\ell\imp$) is defined by adding to $\imp$
-statements with indexed labels, and by having loops with formal indexes
-attached to them:
-$$S,T,\ldots \gramm \cdots i_k:\sop{while}e\sbin{do}S\mid \alphab : S.$$
-Notice than unindexed loops still are in the language: they will correspond
-to multi-entry loops which are ignored by indexing and optimizations.
+\subsection{Indexing the cost labels}
+\label{ssec:indlabs}
+
+\paragraph{Formal indices and $\iota\ell\imp$}
+Let $i_0,i_1,\ldots$ be a sequence of distinguished fresh identifiers that will be used as loop indices.
+A \emph{simple expression} is an affine arithmetical expression in one of these indices, that is $a*i_k+b$ with $a,b,k \in \mathbb N$.
+Simple expressions $e_1=a_1*i_k+b_1$, $e_2=a_2*i_k+b_2$ in the same index can be composed, yielding $e_1\circ e_2\ass (a_1a_2)*i_k + (a_1b_2+b_1)$, and this operation has an identity element in $id_k \ass 1*i_k+0$.
+Constants can be expressed as simple expressions, so that we identify a natural $c$ with $0*i_k+c$.
+
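Representing a simple expression $a*i_k+b$ by the pair $(a,b)$, composition and its identity laws can be checked directly; the following small Python sketch is illustrative only:

```python
# e1 ∘ e2 = (a1*a2)*i_k + (a1*b2 + b1), with identity id_k = 1*i_k + 0.
def compose(e1, e2):
    a1, b1 = e1
    a2, b2 = e2
    return (a1 * a2, a1 * b2 + b1)

ID = (1, 0)
e = (2, 3)                                   # the expression 2*i_k + 3
assert compose(e, ID) == compose(ID, e) == e # identity laws hold
assert compose((2, 3), (4, 5)) == (8, 13)    # 2*(4*i_k+5)+3 = 8*i_k+13
```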
+An \emph{indexing} (with metavariables $I$, $J$, \ldots) is a list of transformations of successive formal indices dictated by simple expressions, that is a mapping\footnote{Here we restrict each mapping to be a simple expression \emph{on the same index}.
+This might not be the case if more loop optimisations are accounted for (for example, interchanging two nested loops).}
+$$
+i_0\mapsto a_0*i_0+b_0,\dots, i_{k-1} \mapsto a_{k-1}*i_{k-1}+b_{k-1}
+$$
+
+An \emph{indexed cost label} (metavariables $\alphab$, $\betab$, \ldots) is the combination of a cost label $\alpha$ and an indexing $I$, written $\alpha\la I\ra$.
+The cost label underlying an indexed one is called its \emph{atom}.
+All plain labels can be considered as indexed ones by taking an empty indexing.
+
+\imp{} with indexed labels ($\iota\ell\imp$) is defined by adding to $\imp$ statements with indexed labels, and by having loops with formal indices attached to them:
+$$
+S,T,\ldots \gramm \cdots \mid i_k:\sop{while}e\sbin{do}S\mid \alphab : S
+$$
+Note that unindexed loops still exist in the language: they will correspond to multi-entry loops, which are ignored by indexing and optimisations.
We will discuss the semantics later.
-\paragraph{Indexed labelling.}
-Given an $\imp$ program $P$, in order to index loops and assign indexed labels
-we must first of all distinguish single-entry loops. We just sketch how it can
-be computed.
-
-A first pass of the program $P$ can easily compute two maps: $\s{loopof}_P$
-from each label $\ell$ to the occurrence (i.e.\ the path) of a $\s{while}$ loop
-containing $\ell$, or the empty path if none exists; and $\s{gotosof}_P$ from
-a label $\ell$ to the occurrences of \s{goto}s pointing to it. Then the set
-$\s{multientry}_P$ of multi-entry loops of $P$ can be computed by
-$$\s{multientry}_P\ass\{\, p \mid \exists \ell,q.p =\s{loopof}_P(\ell),q\in\s{gotosof}_P(\ell), q \not\le p\,\}$$
-where $\le$ is the prefix relation\footnote{Possible simplification to this
-procedure include keeping track just of while loops containing labels and
-\s{goto}s (rather than paths in the syntactic tree of the program), and making
-two passes while avoiding building the map to sets $\s{gotosof}$}.
-
-Let $Id_k$ be the indexing of length $k$ made from identity simple expressions,
-i.e.\ the sequence $i_0\mapsto id_0, \ldots , i_{k-1}\mapsto id_{k-1}$. We define the tiered indexed labelling
-$\Ell^\iota_P (S,k)$ in program $P$ for occurrence $S$ of a statement in $P$
-and a natural $k$ by recursion, setting
+\paragraph{Indexed labelling}
+Given an $\imp$ program $P$, in order to index loops and assign indexed labels, we must first distinguish single-entry loops.
+We sketch how this can be computed in the sequel.
+
+A first pass of the program $P$ can easily compute two maps: $\s{loopof}_P$ from each label $\ell$ to the occurrence (i.e.\ the path) of a $\s{while}$ loop containing $\ell$, or the empty path if none exists; and $\s{gotosof}_P$ from a label $\ell$ to the occurrences of \s{goto}s pointing to it.
+Then the set $\s{multientry}_P$ of multi-entry loops of $P$ can be computed by
+$$
+\s{multientry}_P\ass\{\, p \mid \exists \ell,q.p =\s{loopof}_P(\ell),q\in\s{gotosof}_P(\ell), q \not\le p\,\}
+$$
+Here $\le$ is the prefix relation\footnote{Possible simplifications to this procedure include keeping track of just the while loops containing labels and \s{goto}s (rather than paths in the syntactic tree of the program), and making two passes while avoiding building the map to sets $\s{gotosof}$.}.
+
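The computation of $\s{multientry}_P$ can be sketched as follows, with loop and \s{goto} occurrences modelled as paths (tuples) in the syntax tree; this is an illustrative reading of the definition, and the example program is hypothetical:

```python
# A loop at path p is multi-entry when some goto targeting a label inside
# it does not itself lie inside the loop, i.e. p is not a prefix of the
# goto's path q.
def is_prefix(p, q):
    return q[:len(p)] == p

def multientry(loopof, gotosof):
    return {p for label, p in loopof.items() if p
              for q in gotosof.get(label, ())
              if not is_prefix(p, q)}

# Label "l1" sits in the loop at path (0,); a goto at path (1, 2) jumps
# into it from outside, so that loop is multi-entry.
loopof = {"l1": (0,), "l2": ()}    # () means: not inside any loop
gotosof = {"l1": [(1, 2)]}
assert multientry(loopof, gotosof) == {(0,)}
```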
+Let $Id_k$ be the indexing of length $k$ made from identity simple expressions, i.e.\ the sequence $i_0\mapsto id_0, \ldots , i_{k-1}\mapsto id_{k-1}$.
+We define the tiered indexed labelling $\Ell^\iota_P (S,k)$ in program $P$ for occurrence $S$ of a statement in $P$ and a natural $k$ by recursion, setting:
$$
\Ell^\iota_P(S,k)\ass
@@ -990,17 +954,11 @@
\right.
$$
-where as usual $\alpha$ and $\beta$ are to be fresh cost labels, and other cases just keep
-making the recursive calls on the substatements. The \emph{indexed labelling} of
-a program $P$ is then defined as $\alpha\la \ra : \Ell^\iota_P(P,0)$, i.e.\ a
-further fresh unindexed cost label is added at the start, and we start from level $0$.
-
-In other words: each single-entry loop is indexed by $i_k$ where $k$ is the number of
-other single-entry loops containing this one, and all cost labels under the scope
-of a single-entry loop indexed by $i_k$ are indexed by all indexes $i_0,\ldots,i_k$,
-without any transformation.
+Here, as usual, $\alpha$ and $\beta$ are fresh cost labels, and other cases just keep making the recursive calls on the substatements.
+The \emph{indexed labelling} of a program $P$ is then defined as $\alpha\la \ra : \Ell^\iota_P(P,0)$, i.e.\ a further fresh unindexed cost label is added at the start, and we start from level $0$.
+
+In plainer words: each single-entry loop is indexed by $i_k$ where $k$ is the number of other single-entry loops containing this one, and all cost labels under the scope of a single-entry loop indexed by $i_k$ are indexed by all indices $i_0,\ldots,i_k$, without any transformation.
\subsection{Indexed labels and loop transformations}
-We define the \emph{reindexing} $I \circ (i_k\mapsto a*i_k+b)$ as an operator
-on indexings by setting
+We define the \emph{reindexing} $I \circ (i_k\mapsto a*i_k+b)$ as an operator on indexings by setting:
\begin{multline*}
(i_0\mapsto e_0,\ldots, i_k \mapsto e_k,\ldots,i_n\mapsto e_n)
@@ -1009,27 +967,20 @@
i_0\mapsto e_0,\ldots, i_k \mapsto e_k \circ(a*i_k+b),\ldots,i_n\mapsto e_n,
\end{multline*}
-and extending then to indexed labels (by $\alpha\la I\ra\circ(i_k\mapsto e)\ass
-\alpha\la I\circ (i_k\mapsto e)\ra$) and to statements in $\iota\ell\imp$
-(by applying the above transformation to all indexed labels).
-
-We can then redefine loop peeling and loop unrolling taking into account indexed labels.
-It will be possible to apply the transformation only to indexed loops, that is loops
-that are single-entry. The attentive reader will notice that no assumption is
-made on the labelling of the statements involved. In particular the transformation
-can be repeated and composed at will. Also, notice that erasing all labelling
-information (i.e.\ indexed cost labels and loop indexes) we recover exactly the
-same transformations presented in \autoref{sec:defimp}.
-
-\paragraph{Indexed loop peeling.}
-
+We further extend to indexed labels (by $\alpha\la I\ra\circ(i_k\mapsto e)\ass \alpha\la I\circ (i_k\mapsto e)\ra$) and also to statements in $\iota\ell\imp$ (by applying the above transformation to all indexed labels).
+
+We can then redefine loop peeling and loop unrolling, taking into account indexed labels.
+It will only be possible to apply the transformation to indexed loops, that is loops that are singleentry.
+The attentive reader will notice that no assumptions are made on the labelling of the statements that are involved.
+In particular the transformation can be repeated and composed at will.
+Also, note that after erasing all labelling information (i.e.\ indexed cost labels and loop indices) we recover exactly the same transformations presented in Section~\ref{sec:defimp}.
+
+\paragraph{Indexed loop peeling}
$$
i_k:\sop{while}b\sbin{do}S\mapsto
\sop{if}b\sbin{then} S\circ (i_k\mapsto 0); i_k : \sop{while} b \sbin{do} S[\ell'_i/\ell_i]\circ(i_k\mapsto i_k + 1)
$$
-As can be expected, the peeled iteration of the loop gets reindexed as always
-being the first iteration of the loop, while the iterations of the remaining
-loop get shifted by $1$.
-
-\paragraph{Indexed loop unrolling.}
+As can be expected, the peeled iteration of the loop is reindexed so that it is always the first iteration of the loop, while the iterations of the remaining loop are shifted by $1$.
+
+\paragraph{Indexed loop unrolling}
$$
\begin{array}{l}
@@ -1048,50 +999,34 @@
\end{array}
$$
-Again, the reindexing is as can be expected: each copy of the unrolled body
-has its indexes remapped so that when they are executed the original iteration
-of the loop to which they correspond can be recovered.
+Again, the reindexing is as expected: each copy of the unrolled body has its indices remapped so that when they are executed, the original iteration of the loop to which they correspond can be recovered.
\subsection{Semantics and compilation of indexed labels}

-In order to make sense of loop indexes, one must keep track of their values
-in the state. A \emph{constant indexing} (metavariables $C,\ldots$) is an
-indexing which employs only constant simple expressions. The evaluation
-of an indexing $I$ in a constant indexing $C$, noted $I_C$, is defined
-by
-$$I\circ(i_0\mapsto c_0,\ldots, i_{k-1}\mapsto c_{k-1}) \ass
- \alphab\circ(i_0\mapsto c_0)\circ\cdots\circ(i_{k-1}\mapsto c_{k-1})$$
-(using the definition of ${}\circ{}$ given in \autoref{ssec:indlabs}), considering it defined only
-if the the resulting indexing is a constant one too\footnote{For example
-$(i_0\mapsto 2*i_0,i_1\mapsto i_1+1)_{i_0\mapsto 2}$ is undefined,
-but $(i_0\mapsto 2*i_0,i_1\mapsto 0)_{i_0\mapsto 2}=
-i_0\mapsto 4,i_1\mapsto 2$, which is indeed a constant indexing,
-even if the domain of the original indexing is not covered by the constant one.}.
+In order to make sense of loop indices, one must keep track of their values in the state.
+A \emph{constant indexing} (metavariables $C,\ldots$) is an indexing which employs only constant simple expressions.
+The evaluation of an indexing $I$ in a constant indexing $C$, noted $I_C$, is defined by:
+$$
+I_{(i_0\mapsto c_0,\ldots, i_{k-1}\mapsto c_{k-1})} \ass I\circ(i_0\mapsto c_0)\circ\cdots\circ(i_{k-1}\mapsto c_{k-1})
+$$
+Here, we are using the definition of ${}\circ{}$ given in Section~\ref{ssec:indlabs}.
+We consider the above defined only if the resulting indexing is a constant one too\footnote{For example $(i_0\mapsto 2*i_0,i_1\mapsto i_1+1)_{i_0\mapsto 2}$ is undefined, but $(i_0\mapsto 2*i_0,i_1\mapsto 0)_{i_0\mapsto 2}= i_0\mapsto 4,i_1\mapsto 2$, which is indeed a constant indexing, even if the domain of the original indexing is not covered by the constant one.}.
The definition is extended to indexed labels by $\alpha\la I\ra_C\ass \alpha\la I_C\ra$.
-Constant indexings will be used to keep track of the exact iterations of the
-original code the emitted labels belong to. We thus define two basic actions to
-update constant indexings: $C[i_k{\uparrow}]$ which increments the value of
-$i_k$ by one, and $C[i_k{\downarrow}0]$ which resets it to $0$.
-
-We are ready to update the definition of the operational semantics of
-indexed labelled \imp. The emitted cost labels will now be ones indexed by
-constant indexings. We add a special indexed loop construct for continuations
-that keeps track of active indexed loop indexes:
-$$K,\ldots \gramm \cdots  i_k:\sop{while} b \sbin {do} S \sbin{then} K$$
-The difference between the regular stack concatenation
-$i_k:\sop{while}b\sbin{do}S\cdot K$ and the new constructor is that the latter
-indicates the loop is the active one in which we already are, while the former
-is a loop that still needs to be started%
-\footnote{In the presence of \s{continue} and \s{break} statements active
-loops need to be kept track of in any case.}.
+Constant indexings will be used to keep track of the exact iterations of the original code that the emitted labels belong to.
+We thus define two basic actions to update constant indexings: $C[i_k{\uparrow}]$ increments the value of $i_k$ by one, and $C[i_k{\downarrow}0]$ resets it to $0$.
+
+We are ready to update the definition of the operational semantics of indexed labelled \imp.
+The emitted cost labels will now be ones indexed by constant indexings.
+We add a special indexed loop construct for continuations that keeps track of active indexed loop indices:
+$$
+K,\ldots \gramm \cdots  i_k:\sop{while} b \sbin {do} S \sbin{then} K
+$$
+The difference between the regular stack concatenation $i_k:\sop{while}b\sbin{do}S\cdot K$ and the new constructor is that the latter indicates the loop is the active one in which we already are, while the former is a loop that still needs to be started\footnote{In the presence of \s{continue} and \s{break} statements active loops need to be kept track of in any case.}.
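As a concrete illustration, the constant indexings introduced above, their two update actions, and the evaluation $I_C$ can be modelled with dictionaries. The following is a minimal Python sketch of our own; the encoding of a simple expression $a*i_k+b$ as a pair $(a,b)$ and all function names are ours, not part of the formal development:

```python
def inc(C, k):
    """C[i_k (up)]: increment loop index i_k in constant indexing C."""
    D = dict(C)
    D[k] = D.get(k, 0) + 1
    return D

def reset(C, k):
    """C[i_k (down) 0]: reset loop index i_k to 0."""
    D = dict(C)
    D[k] = 0
    return D

def evaluate(I, C):
    """I_C: evaluate indexing I in constant indexing C.

    I maps index k to a simple expression (a, b), meaning a*i_k + b.
    Returns a constant indexing, or None when some non-constant
    expression mentions an index outside C's domain (mirroring the
    undefinedness condition in the text)."""
    result = {}
    for k, (a, b) in I.items():
        if a == 0:            # already constant: C need not cover i_k
            result[k] = b
        elif k in C:
            result[k] = a * C[k] + b
        else:
            return None       # result would not be constant: undefined
    return result
```

Under this encoding, the two cases of the footnote behave as stated: an indexing with a non-constant expression over an uncovered index evaluates to `None`, while the second example yields a constant indexing.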
The \s{find} function is updated accordingly with the case
$$
\s{find}(\ell, i_k:\sop{while}b\sbin{do}S, K) \ass \s{find}(\ell, S, i_k: \sop{while}b\sbin{do}S\sbin{then}K).
$$
The state will now be a 4-tuple $(S,K,s,C)$, which adds a constant indexing to the triple of the regular semantics.
The small-step rules for all statements other than cost-labelled ones and indexed loops remain the same, leaving the $C$ parameter untouched (in particular, unindexed loops behave as usual).
The remaining cases are:
$$\begin{aligned}
(\alphab : S,K,s,C) &\to[\alphab_C]_P (S,K,s,C)\\
(i_k:\sop{while}b\sbin{do}S,K,s,C) &\to_P
\begin{cases}
(S,\ i_k:\sop{while}b\sbin{do}S\sbin{then}K,\ s,\ C[i_k{\downarrow}0]) & \text{if $b$ evaluates to true in $s$,}\\
(\s{skip},K,s,C) & \text{otherwise,}
\end{cases}\\
(\s{skip},\ i_k:\sop{while}b\sbin{do}S\sbin{then}K,\ s,\ C) &\to_P
\begin{cases}
(S,\ i_k:\sop{while}b\sbin{do}S\sbin{then}K,\ s,\ C[i_k{\uparrow}]) & \text{if $b$ evaluates to true in $s$,}\\
(\s{skip},K,s,C) & \text{otherwise.}
\end{cases}
\end{aligned}$$
Some explanations are in order:
\begin{itemize}
\item
Emitting a label always instantiates it with the current indexing.
\item
Hitting an indexed loop for the first time initializes the corresponding index to 0; continuing the same loop increments the index as expected.
\item
The \s{find} function ignores the current indexing: this is correct under the assumption that all indexed loops are single-entry, so that when we land inside an indexed loop with a \s{goto}, we are sure that its current index is right.
\item
The starting state with store $s$ for a program $P$ is $(P,\s{halt},s,(i_0\mapsto 0,\dots,i_{n-1}\mapsto 0))$, where $i_0,\ldots,i_{n-1}$ cover all loop indices of $P$\footnote{For a program which is the indexed labelling of an \imp{} one, this corresponds to the maximum nesting of single-entry loops.
We can also avoid computing this value in advance if we define $C[i{\downarrow}0]$ to extend $C$'s domain as needed, so that the starting constant indexing can be the empty one.}.
\end{itemize}
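The bookkeeping these rules perform can be illustrated by simulating two nested single-entry indexed loops. The sketch below is ours (the atom names `alpha` and `beta` and the trace representation are hypothetical): each emitted label is paired with a snapshot of the current constant indexing, the index of a loop is reset on entry and incremented on each subsequent iteration, exactly as in the rules above.

```python
def run(n_outer, n_inner):
    """Simulate the index bookkeeping for
        i_0: while ... do { emit alpha; i_1: while ... do emit beta }
    with the outer loop iterating n_outer times and the inner one
    n_inner times per outer iteration."""
    C = {}       # constant indexing, extended on demand
    trace = []   # emitted indexed labels: (atom, snapshot of C)
    C["i0"] = 0                       # C[i0 down 0] on entering the loop
    for outer in range(n_outer):
        if outer > 0:
            C["i0"] += 1              # C[i0 up] when continuing the loop
        trace.append(("alpha", dict(C)))
        C["i1"] = 0                   # C[i1 down 0] on entering inner loop
        for inner in range(n_inner):
            if inner > 0:
                C["i1"] += 1          # C[i1 up]
            trace.append(("beta", dict(C)))
    return trace
```

Note that on the second outer iteration the stale value of `i1` is still visible in the snapshot attached to `alpha`: indices are only reset upon loop entry, which is harmless because the condition attached to a label only mentions the indices of its containing loops.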
\paragraph{Compilation}
Further down the compilation chain the loop structure is usually partially or completely lost.
We cannot rely on it anymore to keep track of the original source code iterations.
We therefore add, alongside the \s{emit} instruction, two other sequential instructions $\sop{ind_reset}k$ and $\sop{ind_inc}k$ whose sole effect is to reset to 0 (resp.\ increment by 1) the loop index $i_k$, as kept track of in a constant indexing accompanying the state.

The first step of compilation from $\iota\ell\imp$ consists of prefixing the translation of an indexed loop $i_k:\s{while}\ b\ \s{do}\ S$ with $\sop{ind_reset}k$ and postfixing the translation of its body $S$ with $\sop{ind_inc}k$.
Later in the compilation chain we only need to propagate the instructions dealing with cost labels.

We would like to stress that this machinery is only needed to give a suitable semantics of observables on which preservation proofs can be carried out.
By no means are the added instructions and the constant indexing in the state meant to change the actual (let us say denotational) semantics of the programs.
In this regard the two new instructions have a role similar to that of \s{emit}.
A forgetful mapping of everything (syntax, states, operational semantics rules) can be defined by erasing all occurrences of cost labels and loop indices, and the result will always be a regular version of the language considered.
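The essential instrumentation step can be sketched as follows. We gloss over the branch structure of the compiled loop and keep only the placement of the two new instructions around the already-compiled body; the instruction encoding as tuples is ours:

```python
def compile_indexed_loop(k, compiled_body):
    """Instrument the compiled body of indexed loop i_k:
    ind_reset k goes before the loop is entered, ind_inc k runs after
    each pass over the body.  The conditional branches of the actual
    loop translation are elided here for brevity."""
    return [("ind_reset", k)] + compiled_body + [("ind_inc", k)]
```

Since later passes treat `ind_reset` and `ind_inc` exactly like `emit` (opaque sequential instructions to be propagated), no further loop-structure information is needed downstream.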

\paragraph{Stating the preservation of semantics}
In fact, the statement of preservation of semantics does not change at all, except for considering traces of evaluated indexed cost labels rather than traces of plain ones.

\subsection{Dependent costs in the source code}
\label{ssec:depcosts}
The task of producing dependent costs from the constant costs of indexed labels is quite technical.
Before presenting it, we would like to point out that the annotations produced by the procedure described in this subsection, even if correct, can be enormous and unreadable.
In Section~\ref{sec:conc}, where we detail the actual implementation, we also sketch how we mitigated this problem.

Given the result of compiling the indexed labelling $\Ell^\iota(P)$ of an \imp{} program $P$, we may still suppose that a cost mapping can be computed, but from indexed labels to naturals.
We want to annotate the source code, so we need a way to express and compute the costs of cost labels, i.e.\ to group the costs of indexed labels into costs of their atoms.
In order to do so we introduce \emph{dependent costs}.
Let us suppose that, for the sole purpose of annotation, we have available in the language ternary expressions of the form
$$\tern e {f_1}{f_2},$$
and that we have access to common operators on integers, such as equality, order, and modulus.

\paragraph{Simple conditions}
First, we need to shift from \emph{transformations} of loop indices to \emph{conditions} on them.
We identify a set of conditions on natural numbers which can express the image of any composition of simple expressions.
\emph{Simple conditions} are of three possible forms:
\begin{itemize}
\item
Equality $i_k=n$ for some natural $n$.
\item
Inequality $i_k\ge n$ for some natural $n$.
\item
Modular equality together with inequality $i_k\bmod a = b\wedge i_k\ge n$ for naturals $a, b, n$.
\end{itemize}
The `always true' simple condition is given by $i_k\ge 0$.
Similarly, we write $i_k\bmod a = b$ as a simple condition for $i_k\bmod a = b\wedge i_k\ge 0$.

Given a simple condition $p$ and a constant indexing $C$, we can easily define when $p$ holds for $C$ (written $p\circ C$).
A \emph{dependent cost expression} is an expression built solely out of integer constants and ternary expressions with simple conditions at their head.
Given a dependent cost expression $e$ in which all of the appearing loop indices are in the domain of a constant indexing $C$, we can define the value $e\circ C\in \mathbb N$ by:
$$n\circ C\ass n,\qquad (\tern p e f)\circ C\ass
\begin{cases}
e\circ C & \text{if $p\circ C$ holds,}\\
f\circ C & \text{otherwise.}
\end{cases}$$
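To illustrate, here is a small evaluator for $p\circ C$ and $e\circ C$. The tuple encoding of simple conditions and dependent cost expressions is an ad hoc choice of ours, not the report's:

```python
def holds(p, C):
    """p o C for the three forms of simple condition."""
    kind = p[0]
    if kind == "eq":    # ("eq", k, n):  i_k = n
        _, k, n = p
        return C[k] == n
    if kind == "ge":    # ("ge", k, n):  i_k >= n
        _, k, n = p
        return C[k] >= n
    if kind == "mod":   # ("mod", k, a, b, n):  i_k mod a = b  and  i_k >= n
        _, k, a, b, n = p
        return C[k] % a == b and C[k] >= n
    raise ValueError(kind)

def cost(e, C):
    """e o C: a constant evaluates to itself; a ternary ("if", p, e, f)
    evaluates its first branch when p holds for C, its second otherwise."""
    if isinstance(e, int):
        return e
    _, p, t, f = e
    return cost(t, C) if holds(p, C) else cost(f, C)
```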
\paragraph{From indexed costs to dependent ones}
Every simple expression $e$ corresponds to a simple condition $p(e)$ which expresses the set of values that $e$ can take.
The definition of this correspondence is given below.
We recall that in this development, loop indices are always mapped to simple expressions over the same index.
If this were not the case, the condition obtained from an expression would have to be on the mapped index, not on the indeterminate of the simple expression.
We leave all generalisations of what we present here to further work:
$$
p(a*i_k+b)\ass
\begin{cases}
i_k = b & \text{if $a = 0$,}\\
i_k \ge b & \text{if $a = 1$,}\\
i_k\bmod a = b \wedge i_k \ge b & \text{otherwise}.
\end{cases}
$$
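The three cases of $p(a*i_k+b)$ can be sketched as follows. One caveat of ours: in the third case the sketch normalises $b$ modulo $a$, an assumption left implicit in the definition (for $b\ge a$ the raw equation $i_k\bmod a=b$ would never hold); the encoding of conditions as tuples is also ours:

```python
def condition(a, b, k):
    """p(a*i_k + b): the simple condition describing the image of the
    simple expression a*i_k + b, for i_k ranging over the naturals."""
    if a == 0:
        return ("eq", k, b)          # image is exactly {b}
    if a == 1:
        return ("ge", k, b)          # image is {b, b+1, b+2, ...}
    return ("mod", k, a, b % a, b)   # i_k = b (mod a)  and  i_k >= b

def satisfies(v, p):
    """Check whether natural v satisfies the encoded condition p."""
    if p[0] == "eq":
        return v == p[2]
    if p[0] == "ge":
        return v >= p[2]
    _, _, a, b, n = p
    return v % a == b and v >= n
```

As a sanity check, every value in the image of the expression satisfies the condition, and values outside the image do not.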
Now, suppose we are given a mapping $\kappa$ from indexed labels to natural numbers.
We will transform it into a mapping (identified, by abuse of notation, with the same symbol $\kappa$) from atoms to \imp{} expressions built with ternary expressions which depend solely on loop indices.
To that end we define an auxiliary function $\kappa^\alpha_L$, parameterized by atoms and words of simple expressions, and defined on \emph{sets} of $n$-tuples of simple expressions (with $n$ constant across each such set, i.e.\ each set is made of words all of the same length).

We will employ a bijection between words of simple expressions and indexings, given by\footnote{Lists of simple expressions are in fact how indexings are represented in CerCo's current implementation of the compiler.}
$$
i_0\mapsto e_0,\ldots,i_{k-1}\mapsto e_{k-1} \cong e_0\cdots e_{k-1}.
$$
As usual, $\varepsilon$ denotes the empty word/indexing, and juxtaposition denotes word concatenation.

For every set $S$ of $n$-tuples of simple expressions, we are in one of the following three exclusive cases:
\begin{itemize}
\item
$S=\emptyset$.
\item
$S=\{\varepsilon\}$.
\item
There is a simple expression $e$ such that $S$ can be decomposed into $eS'+S''$, with $S'\neq \emptyset$ and none of the words in $S''$ starting with $e$,
\end{itemize}
where $eS'$ denotes prepending $e$ to all elements of $S'$ and $+$ is disjoint union.
This classification can serve as the basis of a definition by recursion on $n+\card S$, where $n$ is the size of tuples in $S$ and $\card S$ is its cardinality.
Indeed, in the third case the size of tuples in $S'$ strictly decreases (and cardinality does not increase), while for $S''$ the size of tuples remains the same but cardinality strictly decreases.
The expression $e$ of the third case will be chosen
as minimal for some total order\footnote{The specific order used does not change
the correctness of the procedure, but different orders can give more or less
readable results. A ``good'' order is the lexicographic one, with $a*i_k+b \le a'*i_k+b'$
if $a