Changeset 1678


Timestamp:
Feb 7, 2012, 4:36:28 PM
Author:
mulligan
Message:

finished editing the english in the report

File:
1 edited

  • Deliverables/addenda/indexed_labels/report.tex

    r1677 r1678  
    626626
    627627\section{Introduction}
    628 In~\cite{D2.1}, Armadio et al. propose an approach for building a compiler for a large fragment of the \textsc{c} programming language.
     628In~\cite{D2.1}, Armadio \emph{et al.} propose an approach for building a compiler for a large fragment of the \textsc{c} programming language.
    629629The novelty of their proposal lies in the fact that their design is capable of lifting execution cost information from the compiled code and presenting it to the user.
    630630This idea is foundational for the CerCo project, which strives to produce a mechanically certified version of such a compiler.
     
    653653\end{itemize}
    654654The first point unveils a weakness of the current labelling approach when it comes to some common code transformations performed along a compilation chain.
    655 In particular, most \emph{loop optimizations} are disruptive, in the sense outlined in the first bulletpoint above.
     655In particular, most \emph{loop optimisations} are disruptive, in the sense outlined in the first bullet point above.
    656656An example optimisation of this kind is \emph{loop peeling}.
    657657This optimisation is employed by compilers in order to trigger other optimisations, for example, dead code elimination or invariant code motion.
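
Since loop peeling recurs throughout this changeset, a minimal sketch may help. The following OCaml fragment (a hypothetical AST and function, not CerCo's actual sources) peels one iteration off a while loop; the fresh-label substitution the report performs on the peeled body is omitted.

    (* A toy imperative AST; illustrative only. *)
    type expr = Var of string | Const of int
    type stmt =
      | Skip
      | Assign of string * expr
      | Seq of stmt * stmt
      | If of expr * stmt * stmt
      | While of expr * stmt

    (* Loop peeling:  while b do S  ~>  if b then (S; while b do S)
       (the report also substitutes fresh labels in the peeled copy). *)
    let peel = function
      | While (b, s) -> If (b, Seq (s, While (b, s)), Skip)
      | s -> s
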
     
    665665If one looks closely, the source of the weakness of the labelling approach as presented in~\cite{D2.1} is common to both points: the inability to state different costs for different occurrences of labels, where the difference might originate from labels being duplicated along the compilation chain, or from the costs being sensitive to the current state of execution.
    666666The preliminary work we present here addresses this weakness by introducing cost labels that are dependent on which iteration of their containing loops they occur in.
    667 This is achieved by means of \emph{indexed labels}; all cost labels are decorated with formal indexes coming from the loops containing such labels.
    668 These indexes allow us to rebuild, even after multiple loop transformations, which iterations of the original loops in the source code a particular label occurrence belongs to.
     667This is achieved by means of \emph{indexed labels}; all cost labels are decorated with formal indices coming from the loops containing such labels.
     668These indices allow us to rebuild, even after multiple loop transformations, which iterations of the original loops in the source code a particular label occurrence belongs to.
    669669During the annotation stage of the source code, this information is presented to the user by means of \emph{dependent costs}.
    670670
     
    679679\paragraph{Loop unrolling}
    680680This optimisation consists of replicating the body of the loop several times inside the loop itself (inserting appropriate guards, or avoiding them altogether if enough information about the loop's guard is available at compile time).
    681 This can limit the number of (conditional or unconditional) jumps executed by the code and trigger further optimizations dealing with pipelining, if appropriate for the architecture.
     681This can limit the number of (conditional or unconditional) jumps executed by the code and trigger further optimisations dealing with pipelining, if appropriate for the architecture.
    682682\\\\
    683683Whilst we cover only two loop optimisations in this report, we argue that the work presented herein provides a good foundation for extending the labelling approach to cover more of the common optimisations, as well as for gaining insight into how to integrate advanced cost estimation techniques, such as cache analysis, into the CerCo compiler.
     
    686686
    687687\paragraph{Outline}
    688 We will present our approach on a minimal `toy' imperative language, \imp{} with \s{goto}s, which we present in \autoref{sec:defimp} along with formal definitions of the loop transformations.
     688We will present our approach on a minimal `toy' imperative language, \imp{} with \s{goto}s, which we present in Section~\ref{sec:defimp} along with formal definitions of the loop transformations.
    689689This language already presents most of the difficulties encountered when dealing with \textsc{c}, so we stick to it for the sake of this presentation.
    690 In Section~\autoref{sec:labelling} we summarize the labelling approach as presented in~\cite{D2.1}.
    691 Section~\autoref{sec:indexedlabels} presents \emph{indexed labels}, our proposal for dependent labels which are able to describe precise costs even in the presence of the various loop transformations we consider.
    692 Finally Section~\autoref{sec:conc} goes into more detail regarding the implementation of indexed labels in CerCo's untrusted compiler and speculates on further work on the subject.
     690In Section~\ref{sec:labelling} we summarise the labelling approach as presented in~\cite{D2.1}.
     691Section~\ref{sec:indexedlabels} presents \emph{indexed labels}, our proposal for dependent labels which are able to describe precise costs even in the presence of the various loop transformations we consider.
     692Finally Section~\ref{sec:conc} goes into more detail regarding the implementation of indexed labels in CerCo's untrusted compiler and speculates on further work on the subject.
    693693
    694694\section{\imp{} with goto}\label{sec:defimp}
     
    696696The language was designed in order to pose problems for the existing labelling approach, and as a testing ground for our new notion of indexed labels.
    697697
    698 The syntax and operational semantics of our toy language are presented in \autoref{fig:minimp}.
     698The syntax and operational semantics of our toy language are presented in Figure~\ref{fig:minimp}.
    699699Note that we may augment the language further, with \s{break} and \s{continue}, at no further expense.
    700700\begin{figureplr}
     
    792792Note that any \s{break} and \s{continue} statements should be replaced with \s{goto}s in the peeled body $S$.
    793793
    794 \paragraph{Loop unrolling.}
     794\paragraph{Loop unrolling}
    795795$$
    796796\sop{while}b\sbin{do}S\mapsto
     
    800800  \sop{if} b \sbin{then} S[\ell^n_i/\ell_i]) \cdots)
    801801$$
    802 where $\ell^j_i$ are again fresh labels for any $\ell_i$ labelling a statement
    803 in $S$. This is a willingly naïf version of loop unrolling, which usually
    804 targets less general loops. The problem it poses to Cerco's labelling approach
    805 are independent of the cleverness of the actual transformation.
    806 
    807 \section{Labelling: a quick sketch of the previous approach}\label{sec:labelling}
    808 Plainly labelled \imp{} is obtained adding to the code \emph{cost labels}
    809 (with metavariables $\alpha,\beta,\ldots$), and cost-labelled statements:
    810 $$S,T\gramm \cdots \mid \alpha: S$$
    811 Cost labels allow for some program points to be tracked along the compilation
    812 chain. For further details we refer to\cite{D2.1}.
    813 
    814 With labels the small step semantics turns into a labelled transition
    815 system and a natural notion of trace (i.e.\ lists of labels) arises.
    816 Evaluation of statements is enriched with traces, so that rules are like
     802where $\ell^j_i$ are again fresh labels for any $\ell_i$ labelling a statement in $S$.
     803This is a deliberately na\"{i}ve version of loop unrolling, which usually targets less general loops.
     804The problems this transformation poses to CerCo's labelling approach are independent of the sophistication of the actual transformation.
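
As a companion to the formal rule above, here is a hypothetical OCaml sketch of this naive $n$-fold unrolling; as with peeling, the fresh-label substitutions $S[\ell^j_i/\ell_i]$ are omitted, and the guard is simplified to a boolean.

    (* Naive unrolling:  while b do S  ~>
       while b do (S; if b then (S; if b then (S; ...))), n copies in all. *)
    type stmt =
      | Skip
      | Seq of stmt * stmt
      | If of bool * stmt * stmt
      | While of bool * stmt

    let unroll n = function
      | While (b, s) ->
          let rec guarded k =
            if k = 0 then Skip else If (b, Seq (s, guarded (k - 1)), Skip)
          in
          While (b, Seq (s, guarded (n - 1)))
      | s -> s
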
     805
     806\section{Labelling: a quick sketch of the previous approach}
     807\label{sec:labelling}
     808Plainly labelled \imp{} is obtained by adding to the code \emph{cost labels} (with metavariables $\alpha,\beta,\ldots$), and cost-labelled statements:
     809$$
     810S,T\gramm \cdots \mid \alpha: S
     811$$
     812Cost labels allow us to track some program points along the compilation chain.
     813For further details we refer to~\cite{D2.1}.
     814
     815With labels, the small-step semantics turns into a labelled transition system, and a natural notion of trace (i.e.\ lists of labels) arises.
     816The evaluation of statements is enriched with traces, so that rules follow a pattern similar to the following:
    817817$$
    818818\begin{array}{lblL}
     
    821821& \text{etc.}
    822822\end{array}$$
    823 where we identify cost labels $\alpha$ with singleton traces and we use $\varepsilon$
    824 for the empty trace. Cost labels are emitted by cost-labelled statements only%
    825 \footnote{In the general case the evaluation of expressions can emit cost labels
    826 too (see \autoref{sec:conc}).}. We then write $\to[\lambda]\!\!^*$ for the transitive
    827 closure of the small step semantics which produces by concatenation the trace
    828 $\lambda$.
    829 
    830 \paragraph{Labelling.}
    831 Given an \imp{} program $P$ its \emph{labelling} $\alpha:\Ell(P)$ in $\ell-\imp$
    832 is
    833 defined by putting cost labels after every branching, at the start of both
    834 branches, and a cost label at the beginning of the program. So the relevant
    835 cases are
     823Here, we identify cost labels $\alpha$ with singleton traces and we use $\varepsilon$ for the empty trace.
     824Cost labels are emitted by cost-labelled statements only\footnote{In the general case the evaluation of expressions can emit cost labels too (see Section~\ref{sec:conc}).}.
     825We then write $\to[\lambda]\!\!^*$ for the transitive closure of the small step semantics which produces by concatenation the trace $\lambda$.
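
To illustrate how cost-labelled statements induce traces, the following small OCaml sketch (hypothetical types, not the report's definitions) threads the emitted trace through a big-step evaluator, identifying a label with the singleton trace as above.

    type stmt =
      | Skip
      | Assign of string * int
      | Seq of stmt * stmt
      | Label of string * stmt        (* the cost-labelled statement alpha : S *)

    type store = (string * int) list

    (* Evaluation returns the final store together with the emitted trace;
       sequencing concatenates traces, Skip emits the empty trace. *)
    let rec eval (st : store) : stmt -> store * string list = function
      | Skip -> (st, [])
      | Assign (x, n) -> ((x, n) :: st, [])
      | Seq (s1, s2) ->
          let (st1, t1) = eval st s1 in
          let (st2, t2) = eval st1 s2 in
          (st2, t1 @ t2)
      | Label (a, s) -> let (st', t) = eval st s in (st', a :: t)

    let () =
      let p = Label ("alpha", Seq (Assign ("x", 1), Label ("beta", Skip))) in
      let (_, trace) = eval [] p in
      print_endline (String.concat " " trace)   (* prints: alpha beta *)
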
     826
     827\paragraph{Labelling}
     828Given an \imp{} program $P$ its \emph{labelling} $\alpha:\Ell(P)$ in $\ell-\imp$ is defined by putting cost labels after every branching statement, at the start of both branches, and a cost label at the beginning of the program.
     829The relevant cases are
    836830$$\begin{aligned}
    837831  \Ell(\sop{if}e\sbin{then}S\sbin{else}T) &=
     
    840834    (\sop{while}e\sbin{do}\alpha:\Ell(S));\beta:\s{skip}
    841835  \end{aligned}$$
    842 where $\alpha,\beta$ are fresh cost labels, and while in other cases the
    843 definition just passes to sub-statements.
    844 
    845 \paragraph{Labels in the rest of the compilation chain.} All languages further
    846 down the chain get a new sequential statement $\sop{emit}\alpha$ whose effect is
    847 to be consumed in a labelled transition while keeping the same state. All other
    848 instructions guard their operational semantics and do not emit cost labels.
    849 
    850 Preservation of semantics throughout the compilation process is restated, in
    851 rough terms, as
    852 $$\text{starting state of $P$}\to[\lambda]\!\!^*\;\text{halting state} \iff
    853   \text{starting state of $\mathcal C(P)$} \to[\lambda]\!\!^*\;\text{halting state},$$
    854 where $P$ is a program of a language along the compilation chain, starting and
    855 halting states depend on the language, and $\mathcal C$ is the
    856 compilation function\footnote{The case of divergent computations needs
    857 to be addressed too. Also, the requirement can be weakened by demanding some
    858 sort of equivalence of the traces rather than equality. Both these issues escape
    859 the scope of this presentation.}.
     836where $\alpha,\beta$ are fresh cost labels.
     837In all other cases the definition just passes to substatements.
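
A sketch of the labelling function in OCaml may clarify the two relevant cases; the AST, the boolean guard, and the fresh-label generator are illustrative assumptions of ours.

    type stmt =
      | Skip
      | Seq of stmt * stmt
      | If of bool * stmt * stmt
      | While of bool * stmt
      | Label of int * stmt

    let counter = ref 0
    let fresh () = incr counter; !counter       (* fresh cost labels *)

    (* A label at the start of both branches of an `if`, inside and after
       each `while`; the program gets one more label at the front. *)
    let rec ell = function
      | If (b, s1, s2) ->
          If (b, Label (fresh (), ell s1), Label (fresh (), ell s2))
      | While (b, s) ->
          Seq (While (b, Label (fresh (), ell s)), Label (fresh (), Skip))
      | Seq (s1, s2) -> Seq (ell s1, ell s2)
      | s -> s

    let ell_program p = Label (fresh (), ell p)
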
     838
     839\paragraph{Labels in the rest of the compilation chain}
     840All languages further down the chain get a new sequential statement $\sop{emit}\alpha$ whose effect is to be consumed in a labelled transition while keeping the same state.
     841All other instructions guard their operational semantics and do not emit cost labels.
     842
     843Preservation of semantics throughout the compilation process is restated, in rough terms, as:
     844$$
     845\text{starting state of $P$}\to[\lambda]\!\!^*\;\text{halting state} \iff
     846\text{starting state of $\mathcal C(P)$} \to[\lambda]\!\!^*\;\text{halting state}
     847$$
     848Here $P$ is a program of a language along the compilation chain, starting and halting states depend on the language, and $\mathcal C$ is the compilation function\footnote{The case of divergent computations needs to be addressed too.
     849Also, the requirement can be weakened by demanding some weaker form of equivalence of the traces rather than equality.
     850Both of these issues are beyond the scope of this presentation.}.
    860851
    861852\paragraph{Instrumentations}
    862 Let $\mathcal C$ be the whole compilation from $\ell\imp$ to the labelled
    863 version of some low-level language $L$. Supposing such compilation has not
    864 introduced any new loop or branching, we have that
     853Let $\mathcal C$ be the whole compilation from $\ell\imp$ to the labelled version of some low-level language $L$.
     854Supposing such a compilation has not introduced any new loops or branches, we have that:
    865855\begin{itemize}
    866  \item every loop contains at least a cost label (\emph{soundness condition}),
    867 and
    868  \item every branching has different labels for the two branches
    869   (\emph{preciseness condition}).
     856\item
     857Every loop contains at least one cost label (the \emph{soundness condition}).
     858\item
     859Every branching has different labels for the two branches (the \emph{preciseness condition}).
    870860\end{itemize}
    871 With these two conditions, we have that each and every cost label in
    872 $\mathcal C(P)$ for any $P$ corresponds to a block of sequential instructions,
    873 to which we can assign a constant \emph{cost}\footnote{This in fact requires the
    874 machine architecture to be simple enough, or for some form of execution analysis
    875 to take place.} We can therefore assume a \emph{cost mapping} $\kappa_P$ from
    876 cost labels to natural numbers, assigning to each cost label $\alpha$ the cost
    877 of the block containing the single occurrance of $\alpha$.
    878 
    879 Given any cost mapping $\kappa$, we can enrich a labelled program so that a
    880 particular fresh variable (the \emph{cost variable} $c$) keeps track of the
    881 assumulation of costs during the execution. We call this procedure
    882 \emph{instrumentation} of the program, and it is defined recursively by
    883 $$
    884   \mathcal I(\alpha:S) = c \ass c + \kappa(\alpha) ; \mathcal I(S)
    885 $$
    886 while in all other cases the definition passes to substatements.
    887 
    888 \paragraph{The problem with loop optimizations.}
    889 Let us take loop peeling, and apply it to the labelling of a program without any
    890 adjustment:
     861With these two conditions, we have that each and every cost label in $\mathcal C(P)$ for any $P$ corresponds to a block of sequential instructions, to which we can assign a constant \emph{cost}.\footnote{This in fact requires the machine architecture to be `simple enough', or for some form of execution analysis to take place.}
     862We may therefore assume the existence of a \emph{cost mapping} $\kappa_P$ from cost labels to natural numbers, assigning to each cost label $\alpha$ the cost of the block containing the single occurrence of $\alpha$.
     863
     864Given any cost mapping $\kappa$, we can enrich a labelled program so that a particular fresh variable (the \emph{cost variable} $c$) keeps track of the accumulation of costs during execution.
     865We call this procedure \emph{instrumentation} of the program, and it is defined recursively by:
     866$$
     867\mathcal I(\alpha:S) = c \ass c + \kappa(\alpha) ; \mathcal I(S)
     868$$
     869In all other cases the definition passes to substatements.
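
Rendered as code, instrumentation is a one-case recursion. The sketch below uses a hypothetical AST, with the increment $c \ass c + \kappa(\alpha)$ reified as a constructor; it follows the definition above.

    type stmt =
      | Skip
      | Seq of stmt * stmt
      | IncCost of int                 (* stands for  c := c + n  *)
      | Label of string * stmt

    (* I(alpha : S) = c := c + kappa(alpha); I(S); other cases recurse. *)
    let rec instrument (kappa : string -> int) = function
      | Label (a, s) -> Seq (IncCost (kappa a), instrument kappa s)
      | Seq (s1, s2) -> Seq (instrument kappa s1, instrument kappa s2)
      | s -> s
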
     870
     871\paragraph{The problem with loop optimisations}
     872Let us take loop peeling, and apply it to the labelling of a program without any prior adjustment:
    891873$$
    892874(\sop{while}e\sbin{do}\alpha:S);\beta:\s{skip}
     
    894876(\sop{if}b\sbin{then} \alpha : S; \sop{while} b \sbin{do} \alpha :
    895877S[\ell'_i/\ell_i]);
    896 \beta:\s{skip}$$
    897 What happens is that the cost label $\alpha$ is duplicated into two distinct
    898 occurrences. If these two occurrences correspond to different costs in the
    899 compiled code, the best the cost mapping can do is to take the maximum of the
    900 two, preserving soundness (i.e.\ the cost estimate still bounds the actual one)
    901 but loosing preciseness (i.e.\ the actual cost would be strictly less than its
    902 estimate).
    903 
    904 \section{Indexed labels}\label{sec:indexedlabels}
    905 This section presents the core of the new approach. In brief points it amounts
    906 to the following.
    907 \begin{enumerate}[\bfseries \ref*{sec:indexedlabels}.1.]
    908  \item\label{en:outline1}
    909 Enrich cost labels with formal indexes corresponding, at the beginning of
    910 the process, to which iteration of the loop they belong to.
    911  \item\label{en:outline2}
    912 Each time a loop transformation is applied and a cost labels is split in
    913 different occurrences, each of these will be reindexed so that every time they
    914 are emitted their position in the original loop will be reconstructed.
    915  \item\label{en:outline3}
    916 Along the compilation chain, add alongside the \s{emit} instruction other
    917 ones updating the indexes, so that iterations of the original loops can be
    918 rebuilt at the operational semantics level.
    919  \item\label{en:outline4}
    920 The machinery computing the cost mapping will still work, but assigning
    921 costs to indexed cost labels, rather than to cost labels as we wish: however
    922 \emph{dependent costs} can be calculated, where dependency is on which iteration
    923 of the containing loops we are in.
     878\beta:\s{skip}
     879$$
     880What happens is that the cost label $\alpha$ is duplicated into two distinct occurrences.
     881If these two occurrences correspond to different costs in the compiled code, the best the cost mapping can do is to take the maximum of the two, preserving soundness (i.e.\ the cost estimate still bounds the actual one) but losing preciseness (i.e.\ the actual cost would be strictly less than its estimate).
     882
     883\section{Indexed labels}
     884\label{sec:indexedlabels}
     885This section presents the core of the new approach.
     886In brief, it amounts to the following:
     887\begin{enumerate}[\bfseries~\ref*{sec:indexedlabels}.1.]
     888\item
     889\label{en:outline1}
     890Enrich cost labels with formal indices corresponding, at the beginning of the process, to which iteration of the loop they belong to.
     891\item
     892\label{en:outline2}
     893Each time a loop transformation is applied and a cost label is split into different occurrences, each of these is reindexed so that every time it is emitted its position in the original loop can be reconstructed.
     894\item
     895\label{en:outline3}
     896Along the compilation chain, alongside the \s{emit} instruction we add other instructions updating the indices, so that iterations of the original loops can be rebuilt at the operational semantics level.
     897\item
     898\label{en:outline4}
     899The machinery computing the cost mapping will still work, but will assign costs to indexed cost labels rather than to plain cost labels, as we would wish.
     900However, \emph{dependent costs} can be calculated, where dependency is on which iteration of the containing loops we are in.
    924901\end{enumerate}
    925902
    926 \subsection{Indexing the cost labels}\label{ssec:indlabs}
    927 \paragraph{Formal indexes and $\iota\ell\imp$.}
    928 Let $i_0,i_1,\ldots$ be a sequence of distinguished fresh identifiers that will
    929 be used as loop indexes. A \emph{simple expression} is an affine arithmetical
    930 expression in one of these indexes, that is $a*i_k+b$ with $a,b,k \in \mathbb N$.
    931 Simple expressions $e_1=a_1*i_k+b_1$, $e_2=a2*i_k+b_2$ in the same index can be
    932 composed, yielding $e_1\circ e_2\ass (a_1a_2)*i_k + (a_1b2+b_1)$, and this operation
    933 has an identity element in $id_k \ass 1*i_k+0$. Constants can be expressed as simple
    934 expressions, so that we identify a natural $c$ with $0*i_k+c$.
    935 
    936 An \emph{indexing} (with metavariables $I$, $J$, \ldots) is a list of
    937 transformations of successive formal indexes dictated by simple expressions,
    938 that is a mapping%
    939 \footnote{Here we restrict each mapping to be a simple expression
    940 \emph{on the same index}. This might not be the case if more loop optimizations
    941 are accounted for (for example, interchanging two nested loops).}
    942 $$i_0\mapsto a_0*i_0+b_0,\dots, i_{k-1} \mapsto a_{k-1}*i_{k-1}+b_{k-1}.$$
    943 
    944 An \emph{indexed cost label} (metavariables $\alphab$, $\betab$, \ldots) is
    945 the combination of a cost label $\alpha$ and an indexing $I$, written
    946 $\alpha\la I\ra$. The cost label underlying an indexed one is called its
    947 \emph{atom}. All plain labels can be considered as indexed ones by taking
    948 an empty indexing.
    949 
    950 \imp{} with indexed labels ($\iota\ell\imp$) is defined by adding to $\imp$
    951 statements with indexed labels, and by having loops with formal indexes
    952 attached to them:
    953 $$S,T,\ldots \gramm \cdots i_k:\sop{while}e\sbin{do}S\mid \alphab : S.$$
    954 Notice than unindexed loops still are in the language: they will correspond
    955 to multi-entry loops which are ignored by indexing and optimizations.
     903\subsection{Indexing the cost labels}
     904\label{ssec:indlabs}
     905
     906\paragraph{Formal indices and $\iota\ell\imp$}
     907Let $i_0,i_1,\ldots$ be a sequence of distinguished fresh identifiers that will be used as loop indices.
     908A \emph{simple expression} is an affine arithmetical expression in one of these indices, that is $a*i_k+b$ with $a,b,k \in \mathbb N$.
     909Simple expressions $e_1=a_1*i_k+b_1$, $e_2=a_2*i_k+b_2$ in the same index can be composed, yielding $e_1\circ e_2\ass (a_1a_2)*i_k + (a_1b_2+b_1)$, and this operation has an identity element in $id_k \ass 1*i_k+0$.
     910Constants can be expressed as simple expressions, so that we identify a natural $c$ with $0*i_k+c$.
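
Concretely, simple expressions and their composition admit a direct encoding. The following OCaml sketch (the record representation is an assumption of ours) implements $e_1\circ e_2$, the identity $id_k$, and constants.

    (* a * i_k + b, with all three components natural numbers. *)
    type sexpr = { a : int; b : int; k : int }

    let id k = { a = 1; b = 0; k }               (* 1 * i_k + 0 *)
    let const k c = { a = 0; b = c; k }          (* the natural c as 0 * i_k + c *)

    (* e1 o e2 = (a1*a2) * i_k + (a1*b2 + b1); only defined on the same index. *)
    let compose e1 e2 =
      assert (e1.k = e2.k);
      { a = e1.a * e2.a; b = e1.a * e2.b + e1.b; k = e1.k }
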
     911
     912An \emph{indexing} (with metavariables $I$, $J$, \ldots) is a list of transformations of successive formal indices dictated by simple expressions, that is a mapping\footnote{Here we restrict each mapping to be a simple expression \emph{on the same index}.
     913This might not be the case if more loop optimisations are accounted for (for example, interchanging two nested loops).}
     914$$
     915i_0\mapsto a_0*i_0+b_0,\dots, i_{k-1} \mapsto a_{k-1}*i_{k-1}+b_{k-1}
     916$$
     917
     918An \emph{indexed cost label} (metavariables $\alphab$, $\betab$, \ldots) is the combination of a cost label $\alpha$ and an indexing $I$, written $\alpha\la I\ra$.
     919The cost label underlying an indexed one is called its \emph{atom}.
     920All plain labels can be considered as indexed ones by taking an empty indexing.
     921
     922\imp{} with indexed labels ($\iota\ell\imp$) is defined by adding to $\imp$ statements with indexed labels, and by having loops with formal indices attached to them:
     923$$
     924S,T,\ldots \gramm \cdots i_k:\sop{while}e\sbin{do}S\mid \alphab : S
     925$$
     926Note that unindexed loops still exist in the language: they will correspond to multi-entry loops, which are ignored by indexing and optimisations.
    956927We will discuss the semantics later.
    957928
    958 \paragraph{Indexed labelling.}
    959 Given an $\imp$ program $P$, in order to index loops and assign indexed labels
    960 we must first of all distinguish single-entry loops. We just sketch how it can
    961 be computed.
    962 
    963 A first pass of the program $P$ can easily compute two maps: $\s{loopof}_P$
    964 from each label $\ell$ to the occurrence (i.e.\ the path) of a $\s{while}$ loop
    965 containing $\ell$, or the empty path if none exists; and $\s{gotosof}_P$ from
    966 a label $\ell$ to the occurrences of \s{goto}s pointing to it. Then the set
    967 $\s{multientry}_P$ of multi-entry loops of $P$ can be computed by
    968 $$\s{multientry}_P\ass\{\, p \mid \exists \ell,q.p =\s{loopof}_P(\ell),q\in\s{gotosof}_P(\ell), q \not\le p\,\}$$
    969 where $\le$ is the prefix relation\footnote{Possible simplification to this
    970 procedure include keeping track just of while loops containing labels and
    971 \s{goto}s (rather than paths in the syntactic tree of the program), and making
    972 two passes while avoiding building the map to sets $\s{gotosof}$}.
    973 
    974 Let $Id_k$ be the indexing of length $k$ made from identity simple expressions,
    975 i.e.\ the sequence $i_0\mapsto id_0, \ldots , i_{k-1}\mapsto id_{k-1}$. We define the tiered indexed labelling
    976 $\Ell^\iota_P (S,k)$ in program $P$ for occurrence $S$ of a statement in $P$
    977 and a natural $k$ by recursion, setting
     929\paragraph{Indexed labelling}
     930Given an $\imp$ program $P$, in order to index loops and assign indexed labels, we must first distinguish single-entry loops.
     931We sketch below how this can be computed.
     932
     933A first pass of the program $P$ can easily compute two maps: $\s{loopof}_P$ from each label $\ell$ to the occurrence (i.e.\ the path) of a $\s{while}$ loop containing $\ell$, or the empty path if none exists; and $\s{gotosof}_P$ from a label $\ell$ to the occurrences of \s{goto}s pointing to it.
     934Then the set $\s{multientry}_P$ of multi-entry loops of $P$ can be computed by
     935$$
     936\s{multientry}_P\ass\{\, p \mid \exists \ell,q.p =\s{loopof}_P(\ell),q\in\s{gotosof}_P(\ell), q \not\le p\,\}
     937$$
     938Here $\le$ is the prefix relation\footnote{Possible simplifications to this procedure include keeping track of just the while loops containing labels and \s{goto}s (rather than paths in the syntactic tree of the program), and making two passes while avoiding building the map to sets $\s{gotosof}$.}.
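
This computation can be sketched in OCaml under representation choices of our own (occurrences as paths of integers, the two maps passed as functions), reading $q\not\le p$ as `the \s{goto} occurrence does not lie inside the loop':

    (* Occurrences are paths in the syntax tree, here lists of child indices. *)
    let rec is_prefix p q =
      match p, q with
      | [], _ -> true
      | x :: p', y :: q' -> x = y && is_prefix p' q'
      | _ :: _, [] -> false

    (* A loop is multi-entry when some goto targeting a label it contains
       does not itself lie inside the loop. *)
    let multientry labels (loopof : string -> int list option)
                          (gotosof : string -> int list list) =
      List.concat_map
        (fun l ->
           match loopof l with
           | None -> []
           | Some p ->
               if List.exists (fun q -> not (is_prefix p q)) (gotosof l)
               then [p]
               else [])
        labels
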
     939
     940Let $Id_k$ be the indexing of length $k$ made from identity simple expressions, i.e.\ the sequence $i_0\mapsto id_0, \ldots , i_{k-1}\mapsto id_{k-1}$.
     941We define the tiered indexed labelling $\Ell^\iota_P (S,k)$ in program $P$ for occurrence $S$ of a statement in $P$ and a natural $k$ by recursion, setting:
    978942$$
    979943\Ell^\iota_P(S,k)\ass
     
    990954\right.
    991955$$
    992 where as usual $\alpha$ and $\beta$ are to be fresh cost labels, and other cases just keep
    993 making the recursive calls on the substatements. The \emph{indexed labelling} of
    994 a program $P$ is then defined as $\alpha\la \ra : \Ell^\iota_P(P,0)$, i.e.\ a
    995 further fresh unindexed cost label is added at the start, and we start from level $0$.
    996 
    997 In other words: each single-entry loop is indexed by $i_k$ where $k$ is the number of
    998 other single-entry loops containing this one, and all cost labels under the scope
    999 of a single-entry loop indexed by $i_k$ are indexed by all indexes $i_0,\ldots,i_k$,
    1000 without any transformation.
     956Here, as usual, $\alpha$ and $\beta$ are fresh cost labels; the other cases of the definition just make recursive calls on the substatements.
     957The \emph{indexed labelling} of a program $P$ is then defined as $\alpha\la \ra : \Ell^\iota_P(P,0)$, i.e.\ a further fresh unindexed cost label is added at the start, and we start from level $0$.
     958
     959In plainer words: each single-entry loop is indexed by $i_k$ where $k$ is the number of other single-entry loops containing this one, and all cost labels under the scope of a single-entry loop indexed by $i_k$ are indexed by all indices $i_0,\ldots,i_k$, without any transformation.
    1001960
    1002961\subsection{Indexed labels and loop transformations}
    1003 We define the \emph{reindexing} $I \circ (i_k\mapsto a*i_k+b)$ as an operator
    1004 on indexings by setting
     962We define the \emph{reindexing} $I \circ (i_k\mapsto a*i_k+b)$ as an operator on indexings by setting:
    1005963\begin{multline*}
    1006964(i_0\mapsto e_0,\ldots, i_k \mapsto e_k,\ldots,i_n\mapsto e_n)
     
    1009967i_0\mapsto e_0,\ldots, i_k \mapsto e_k \circ(a*i_k+b),\ldots,i_n\mapsto e_n,
    1010968\end{multline*}
    1011 and extending then to indexed labels (by $\alpha\la I\ra\circ(i_k\mapsto e)\ass
    1012 \alpha\la I\circ (i_k\mapsto e)\ra$) and to statements in $\iota\ell\imp$
    1013 (by applying the above transformation to all indexed labels).
    1014 
    1015 We can then redefine loop peeling and loop unrolling taking into account indexed labels.
    1016 It will be possible to apply the transformation only to indexed loops, that is loops
    1017 that are single-entry. The attentive reader will notice that no assumption is
    1018 made on the labelling of the statements involved. In particular the transformation
    1019 can be repeated and composed at will. Also, notice that erasing all labelling
    1020 information (i.e.\ indexed cost labels and loop indexes) we recover exactly the
    1021 same transformations presented in \autoref{sec:defimp}.
    1022 
    1023 \paragraph{Indexed loop peeling.}
    1024 
     969We further extend to indexed labels (by $\alpha\la I\ra\circ(i_k\mapsto e)\ass \alpha\la I\circ (i_k\mapsto e)\ra$) and also to statements in $\iota\ell\imp$ (by applying the above transformation to all indexed labels).
     970
     971We can then redefine loop peeling and loop unrolling, taking into account indexed labels.
     972It will only be possible to apply the transformation to indexed loops, that is loops that are single-entry.
     973The attentive reader will notice that no assumptions are made on the labelling of the statements that are involved.
     974In particular the transformation can be repeated and composed at will.
     975Also, note that after erasing all labelling information (i.e.\ indexed cost labels and loop indices) we recover exactly the same transformations presented in Section~\ref{sec:defimp}.
     976
     977\paragraph{Indexed loop peeling}
    1025978$$
    1026979i_k:\sop{while}b\sbin{do}S\mapsto
    1027980\sop{if}b\sbin{then} S\circ (i_k\mapsto 0); i_k : \sop{while} b \sbin{do} S[\ell'_i/\ell_i]\circ(i_k\mapsto i_k + 1)
    1028981$$
    1029 As can be expected, the peeled iteration of the loop gets reindexed as always
    1030 being the first iteration of the loop, while the iterations of the remaining
    1031 loop get shifted by $1$.
    1032 
    1033 \paragraph{Indexed loop unrolling.}
     982As can be expected, the peeled iteration of the loop is reindexed as the first iteration of the loop, while the iterations of the remaining loop are shifted by $1$.
     983
     984\paragraph{Indexed loop unrolling}
    1034985$$
    1035986\begin{array}{l}
     
    1048999\end{array}
    10491000$$
    1050 Again, the reindexing is as can be expected: each copy of the unrolled body
    1051 has its indexes remapped so that when they are executed the original iteration
    1052 of the loop to which they correspond can be recovered.
     1001Again, the reindexing is as expected: each copy of the unrolled body has its indices remapped so that when they are executed, the original iteration of the loop to which they correspond can be recovered.
    10531002
    10541003\subsection{Semantics and compilation of indexed labels}
    1055 
    1056 In order to make sense of loop indexes, one must keep track of their values
    1057 in the state. A \emph{constant indexing} (metavariables $C,\ldots$) is an
    1058 indexing which employs only constant simple expressions. The evaluation
    1059 of an indexing $I$ in a constant indexing $C$, noted $I|_C$, is defined
    1060 by
    1061 $$I\circ(i_0\mapsto c_0,\ldots, i_{k-1}\mapsto c_{k-1}) \ass
    1062   \alphab\circ(i_0\mapsto c_0)\circ\cdots\circ(i_{k-1}\mapsto c_{k-1})$$
    1063 (using the definition of ${-}\circ{-}$ given in \autoref{ssec:indlabs}), considering it defined only
    1064 if the the resulting indexing is a constant one too\footnote{For example
    1065 $(i_0\mapsto 2*i_0,i_1\mapsto i_1+1)|_{i_0\mapsto 2}$ is undefined,
    1066 but $(i_0\mapsto 2*i_0,i_1\mapsto 0)|_{i_0\mapsto 2}=
    1067 i_0\mapsto 4,i_1\mapsto 2$, which is indeed a constant indexing,
    1068 even if the domain of the original indexing is not covered by the constant one.}.
     1004In order to make sense of loop indices, one must keep track of their values in the state.
     1005A \emph{constant indexing} (metavariables $C,\ldots$) is an indexing which employs only constant simple expressions.
     1006The evaluation of an indexing $I$ in a constant indexing $C$, written $I|_C$, is defined by:
     1007$$
     1008I|_{(i_0\mapsto c_0,\ldots, i_{k-1}\mapsto c_{k-1})} \ass I\circ(i_0\mapsto c_0)\circ\cdots\circ(i_{k-1}\mapsto c_{k-1})
     1009$$
     1010Here, we are using the definition of ${-}\circ{-}$ given in Section~\ref{ssec:indlabs}.
     1011We consider the above defined only if the resulting indexing is a constant one too\footnote{For example $(i_0\mapsto 2*i_0,i_1\mapsto i_1+1)|_{i_0\mapsto 2}$ is undefined, but $(i_0\mapsto 2*i_0,i_1\mapsto 0)|_{i_0\mapsto 2}= i_0\mapsto 4,i_1\mapsto 0$, which is indeed a constant indexing, even if the domain of the original indexing is not covered by the constant one.}.
    10691012The definition is extended to indexed labels by $\alpha\la I\ra|_C\ass \alpha\la I|_C\ra$.
    10701013
    1071 Constant indexings will be used to keep track of the exact iterations of the
    1072 original code the emitted labels belong to. We thus define two basic actions to
    1073 update constant indexings: $C[i_k{\uparrow}]$ which increments the value of
    1074 $i_k$ by one, and $C[i_k{\downarrow}0]$ which resets it to $0$.
    1075 
    1076 We are ready to update the definition of the operational semantics of
    1077 indexed labelled \imp. The emitted cost labels will now be ones indexed by
    1078 constant indexings. We add a special indexed loop construct for continuations
    1079 that keeps track of active indexed loop indexes:
    1080 $$K,\ldots  \gramm \cdots | i_k:\sop{while} b \sbin {do} S \sbin{then}  K$$
    1081 The difference between the regular stack concatenation
    1082 $i_k:\sop{while}b\sbin{do}S\cdot K$ and the new constructor is that the latter
    1083 indicates the loop is the active one in which we already are, while the former
    1084 is a loop that still needs to be started%
    1085 \footnote{In the presence of \s{continue} and \s{break} statements active
    1086 loops need to be kept track of in any case.}.
     1014Constant indexings will be used to keep track of the exact iterations of the original code that the emitted labels belong to.
     1015We thus define two basic actions to update constant indexings: $C[i_k{\uparrow}]$ increments the value of $i_k$ by one, and $C[i_k{\downarrow}0]$ resets it to $0$.
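
As a sketch, a constant indexing can be kept as a mutable map with the two update actions; defaulting absent indices to $0$ also realises the suggestion, made in a footnote below, of extending the domain on demand. Representation and names are assumptions of ours.

    type cindex = (int, int) Hashtbl.t          (* loop index number -> value *)

    let get (c : cindex) k = try Hashtbl.find c k with Not_found -> 0

    let inc (c : cindex) k = Hashtbl.replace c k (get c k + 1)  (* C[i_k up]     *)
    let reset (c : cindex) k = Hashtbl.replace c k 0            (* C[i_k down 0] *)

    (* Evaluating the simple expression a*i_k+b under C gives a constant. *)
    let eval_sexpr (c : cindex) a k b = a * get c k + b
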
     1016
     1017We are ready to update the definition of the operational semantics of indexed labelled \imp.
     1018The emitted cost labels will now be ones indexed by constant indexings.
     1019We add a special indexed loop construct for continuations that keeps track of active indexed loop indices:
     1020$$
     1021K,\ldots  \gramm \cdots | i_k:\sop{while} b \sbin {do} S \sbin{then}  K
     1022$$
     1023The difference between the regular stack concatenation $i_k:\sop{while}b\sbin{do}S\cdot K$ and the new constructor is that the latter indicates the loop is the active one in which we already are, while the former is a loop that still needs to be started\footnote{In the presence of \s{continue} and \s{break} statements active loops need to be kept track of in any case.}.
    10871024The \s{find} function is updated accordingly with the case
    1088 $$ \s{find}(\ell, i_k:\sop{while}b\sbin{do}S, K) \ass \s{find}(\ell, S, i_k:
    1089   \sop{while}b\sbin{do}S\sbin{then}K).$$
    1090 The state will now be a 4-uple
    1091 $(S,K,s,C)$ which adds a constant indexing to the 3-uple of regular
    1092 semantics. The small-step rules for all statements but the
    1093 cost-labelled ones and the indexed loops remain the same, without
    1094 touching the $C$ parameter (in particular unindexed loops behave the same
    1095 as usual). The remaining cases are:
     1025$$
     1026\s{find}(\ell, i_k:\sop{while}b\sbin{do}S, K) \ass \s{find}(\ell, S, i_k: \sop{while}b\sbin{do}S\sbin{then}K)
     1027$$
     1028The state will now be a 4-tuple $(S,K,s,C)$ which adds a constant indexing to the triple of the regular semantics.
     1029The small-step rules for all statements remain the same, without touching the $C$ parameter (in particular unindexed loops behave as usual), apart from the ones regarding cost labels and indexed loops.
     1030The remaining cases are:
    10961031$$\begin{aligned}
    10971032   (\alphab : S,K,s,C) &\to[\alphab|_C]_P (S,K,s,C)\\
     
    11091044    \end{cases}
    11101045  \end{aligned}$$
    1111 Some explications are in order.
     1046Some explanations are in order:
    11121047\begin{itemize}
    1113  \item Emitting a label always instantiates it with the current indexing.
    1114  \item Hitting an indexed loop the first time initializes to 0 the corresponding
    1115   index; continuing the same loop inrements the index as expected.
    1116  \item The \s{find} function ignores the current indexing: this is correct
    1117   under the assumption that all indexed loops are single entry ones, so that
    1118   when we land inside an indexed loop with a \s{goto}, we are sure that its
    1119   current index is right.
    1120   \item The starting state with store $s$ for a program $P$ is
    1121 $(P,\s{halt},s,(i_0\mapsto 0,\dots,i_{n-1}\mapsto 0)$ where $i_0,\ldots,i_{n-1}$ cover
    1122 all loop indexes of $P$\footnote{For a program which is the indexed labelling of an
    1123 \imp{} one this corresponds to the maximum nesting of single-entry loops. We can also
    1124 avoid computing this value in advance if we define $C[i{\downarrow}0]$ to extend
    1125 $C$'s domain as needed, so that the starting constant indexing can be the empty one.}.
     1048\item
     1049Emitting a label always instantiates it with the current indexing.
     1050\item
     1051Hitting an indexed loop for the first time initialises the corresponding index to 0; continuing the same loop increments the index as expected.
     1052\item
     1053The \s{find} function ignores the current indexing: this is correct under the assumption that all indexed loops are single entry, so that when we land inside an indexed loop with a \s{goto}, we are sure that its current index is right.
     1054\item
     1055The starting state with store $s$ for a program $P$ is $(P,\s{halt},s,(i_0\mapsto 0,\dots,i_{n-1}\mapsto 0))$ where $i_0,\ldots,i_{n-1}$ cover all loop indices of $P$\footnote{For a program which is the indexed labelling of an \imp{} one this corresponds to the maximum nesting of single-entry loops.
     1056We can also avoid computing this value in advance if we define $C[i{\downarrow}0]$ to extend $C$'s domain as needed, so that the starting constant indexing can be the empty one.}.
    11261057\end{itemize}
    11271058
    1128 \paragraph{Compilation.}
    1129 Further down the compilation chain the loop
    1130 structure is usually partially or completely lost. We cannot rely on it anymore
    1131 to ensure keeping track of original source code iterations. We therefore add
    1132 alongside the \s{emit} instruction two other sequential instructions
    1133 $\sop{ind_reset}k$ and $\sop{ind_inc}k$ whose sole effect is to reset to
    1134 0 (resp.\ increment by 1) the loop index $i_k$, as kept track of in a constant
    1135 indexing accompanying the state.
    1136 
    1137 The first step of compilation from $\iota\ell\imp$ consists in prefixing the
    1138 translation of an indexed loop $i_k:\s{while}\ b\ \s{do}\ S$ with
    1139 $\sop{ind_reset}k$ and postfixing the translation of its body $S$ with
    1140 $\sop{ind_inc}k$. Later on in the compilation chain we just need to propagate
    1141 the instructions dealing with cost labels.
    1142 
    1143 We would like to stress the fact that this machinery is only needed to give a
    1144 suitable semantics of observables on which preservation proofs can be done. By no
    1145 means the added instructions and the constant indexing in the state are meant
    1146 to change the actual (let us say denotational) semantics of the programs. In this
    1147 regard the two new instruction have a similar role as the \s{emit} one. A
    1148 forgetful mapping of everything (syntax, states, operational semantics rules)
    1149 can be defined erasing all occurrences of cost labels and loop indexes, and the
    1150 result will always be a regular version of the language considered.
    1151 
    1152 \paragraph{Stating the preservation of semantics.} In fact, the statement of preservation
    1153 of semantics does not change at all, if not for considering traces of evaluated
    1154 indexed cost labels rather than traces of plain ones.
    1155 
    1156 
    1157 \subsection{Dependent costs in the source code}\label{ssec:depcosts}
    1158 The task of producing dependent costs out of the constant costs of indexed labels
    1159 is quite technical. Before presenting it here, we would like to point out that
    1160 the annotations produced by the procedure described in this subsection, even
    1161 if correct, can be enormous and unreadable. In \autoref{sec:conc}, when we will
    1162 detail the actual implementation, we will also sketch how we mitigated this
    1163 problem.
    1164 
    1165 Having the result of compiling the indexed labelling $\Ell^\iota(P)$ of an \imp{}
    1166 program $P$, we can still suppose that a cost mapping can be computed, but
    1167 from indexed labels to naturals. We want to annotate the source code, so we need
    1168 a way to express and compute costs of cost labels, i.e.\ group the costs of
    1169 indexed labels to ones of their atoms. In order to do so we introduce
    1170 \emph{dependent costs}. Let us suppose that for the sole purpose of annotation,
    1171 we have available in the language ternary expressions of the form
     1059\paragraph{Compilation}
     1060Further down the compilation chain the loop structure is usually partially or completely lost.
     1061We cannot rely on it anymore to keep track of the original source code iterations.
     1062We therefore add, alongside the \s{emit} instruction, two other sequential instructions $\sop{ind_reset}k$ and $\sop{ind_inc}k$ whose sole effect is to reset to 0 (resp.\ increment by 1) the loop index $i_k$, as kept track of in a constant indexing accompanying the state.
     1063
     1064The first step of compilation from $\iota\ell\imp$ consists of prefixing the translation of an indexed loop $i_k:\s{while}\ b\ \s{do}\ S$ with $\sop{ind_reset}k$ and postfixing the translation of its body $S$ with $\sop{ind_inc}k$.
     1065Later in the compilation chain we must propagate the instructions dealing with cost labels.
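
This first step can be sketched as follows (hypothetical AST, with $\sop{ind_reset}$ and $\sop{ind_inc}$ reified as constructors): each indexed loop is prefixed by a reset of its index, and its body is postfixed by an increment.

    type stmt =
      | Skip
      | Seq of stmt * stmt
      | While of bool * stmt             (* plain, unindexed loop *)
      | IWhile of int * bool * stmt      (* i_k : while b do S    *)
      | IndReset of int                  (* ind_reset k           *)
      | IndInc of int                    (* ind_inc k             *)

    let rec compile = function
      | IWhile (k, b, s) ->
          Seq (IndReset k, While (b, Seq (compile s, IndInc k)))
      | Seq (s1, s2) -> Seq (compile s1, compile s2)
      | While (b, s) -> While (b, compile s)
      | s -> s
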
     1066
     1067We would like to stress the fact that this machinery is only needed to give a suitable semantics of observables on which preservation proofs can be done.
     1068By no means are the added instructions and the constant indexing in the state meant to change the actual (let us say denotational) semantics of the programs.
     1069In this regard the two new instructions have a similar role to the \s{emit} one.
     1070A forgetful mapping of everything (syntax, states, operational semantics rules) can be defined by erasing all occurrences of cost labels and loop indices, and the result will always be a regular version of the language considered.
     1071
     1072\paragraph{Stating the preservation of semantics}
     1073In fact, the statement of preservation of semantics does not change at all, apart from considering traces of evaluated indexed cost labels rather than traces of plain ones.
     1074
     1075\subsection{Dependent costs in the source code}
     1076\label{ssec:depcosts}
     1077The task of producing dependent costs from constant costs induced by indexed labels is quite technical.
     1078Before presenting it here, we would like to point out that the annotations produced by the procedure described in this subsection, even if correct, can be enormous and unreadable.
     1079In Section~\ref{sec:conc}, where we detail the actual implementation, we will also sketch how we mitigated this problem.
     1080
     1081Having the result of compiling the indexed labelling $\Ell^\iota(P)$ of an \imp{} program $P$, we may still suppose that a cost mapping can be computed, but from indexed labels to naturals.
     1082We want to annotate the source code, so we need a way to express and compute the costs of cost labels, i.e.\ group the costs of indexed labels into those of their atoms.
     1083In order to do so we introduce \emph{dependent costs}.
     1084Let us suppose that for the sole purpose of annotation, we have available in the language ternary expressions of the form
    11721085$$\tern e {f_1}{f_2},$$
    1173 and that we have access to common operators on integers such as equality, order
    1174 and modulus.
    1175 
    1176 \paragraph{Simple conditions.}
    1177 First, we need to shift from \emph{transformations} of loop indexes to
    1178 \emph{conditions} on them. We identify a set of conditions on natural numbers
    1179 which are able to express the image of any composition of simple expressions.
    1180 
     1086and that we have access to common operators on integers such as equality, order and modulus.
     1087
     1088\paragraph{Simple conditions}
     1089
     1090First, we need to shift from \emph{transformations} of loop indices to \emph{conditions} on them.
     1091We identify a set of conditions on natural numbers which are able to express the image of any composition of simple expressions.
    11811092\emph{Simple conditions} are of three possible forms:
    11821093\begin{itemize}
    1183  \item equality $i_k=n$ for some natural $n$;
    1184  \item inequality $i_k\ge n$ for some natural $n$;
    1185  \item modular equality together with inequality $i_k\bmod a = b\wedge i_k\ge n$
    1186   for naturals $a, b, n$.
     1094\item
     1095Equality $i_k=n$ for some natural $n$.
     1096\item
     1097Inequality $i_k\ge n$ for some natural $n$.
     1098\item
     1099Modular equality together with inequality $i_k\bmod a = b\wedge i_k\ge n$ for naturals $a, b, n$.
    11871100\end{itemize}
    1188 The always true simple condition is given by $i_k\ge 0$, and similarly we
    1189 write $i_k\bmod a = b$ as a simple condition for $i_k\bmod a = b\wedge i_k\ge 0$.
    1190 Given a simple condition $p$ and a constant indexing $C$ we can easily define
    1191 when $p$ holds for $C$ (written $p\circ C$). A \emph{dependent cost expression}
    1192 is an expression built solely out of integer constants and ternary expressions
    1193 with simple conditions at their head. Given a dependent cost expression $e$ where
    1194 all of the loop indexes appearing in it are in the domain of a constant indexing
    1195 $C$, we can define the value $e\circ C\in \mathbb N$ by
     1101The `always true' simple condition is given by $i_k\ge 0$.
     1102We write $i_k\bmod a = b$ as a simple condition for $i_k\bmod a = b\wedge i_k\ge 0$.
     1103
     1104Given a simple condition $p$ and a constant indexing $C$ we can easily define when $p$ holds for $C$ (written $p\circ C$).
     1105A \emph{dependent cost expression} is an expression built solely out of integer constants and ternary expressions with simple conditions at their head.
     1106Given a dependent cost expression $e$ where all of the loop indices appearing in it are in the domain of a constant indexing $C$, we can define the value $e\circ C\in \mathbb N$ by:
    11961107$$n\circ C\ass n,\qquad (\tern p e f)\circ C\ass
    11971108\begin{cases}
     
    12001111\end{cases}$$
    12011112
    1202 \paragraph{From indexed costs to dependent ones.}
    1203 Every simple expression $e$ corresponds to a simple condition $p(e)$ which expresses the
    1204 set of values that can be the result of it. Following is the definition of such
    1205 relation. We recall that in this development loop indexes are always mapped to
    1206 simple expressions over the same index. If it was not the case, the condition
    1207 obtained from an expression should be on the mapped index, not the indeterminate
    1208 of the simple expression. We leave to further work all generalizations of what
    1209 we present here.
     1113\paragraph{From indexed costs to dependent ones}
     1114Every simple expression $e$ corresponds to a simple condition $p(e)$ which expresses the set of values that $e$ can take.
     1115Following is the definition of such a relation.
     1116We recall that in this development, loop indices are always mapped to simple expressions over the same index.
     1117If it was not the case, the condition obtained from an expression should be on the mapped index, not the indeterminate of the simple expression.
     1118We leave all generalisations of what we present here for further work:
    12101119$$
    12111120p(a*i_k+b)\ass
     
    12141123i_k \ge b & \text{if $a = 1$,}\\
    12151124i_k\bmod a = b \wedge i_k \ge b & \text{otherwise}.
    1216 \end{cases}$$
    1217 Now, suppose we are given a mapping $\kappa$ from indexed labels to natural
    1218 numbers. We will transform it in a mapping (identified with an abuse of notation
    1219 with the same symbol $\kappa$) from atoms to \imp{} expressions built with
    1220 ternary expressions which depend solely on loop indexes. To that end we define
    1221 an auxiliary function $\kappa^\alpha_L$ parameterized by atoms and words of
    1222 simple expressions and defined on \emph{sets} of $n$-uples of simple expressions
    1223 (with $n$ constant across each such set, i.e.\ each set is made of words with
    1224 the same length).
    1225 
    1226 We will employ a bijection between words of simple expressions and indexings,
    1227 given by\footnote{Lists of simple expressions is in fact how indexings are
    1228 represented in Cerco's current implementation of the compiler.}
    1229 $$i_0\mapsto e_0,\ldots,i_{k-1}\mapsto e_{k-1} \cong e_0\cdots e_{k-1}.$$
    1230 As usual, $\varepsilon$ denotes the empty word/indexing, and juxtaposition
    1231 word concatenation.
    1232 
    1233 For every set $s$ of $n$-uples of simple expressions, we are in one of the following
    1234 three exclusive cases:
     1125\end{cases}
     1126$$
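
In code, the case split is immediate. The sketch below uses our own constructors for the three forms of simple condition, and reconstructs the first case (elided by the diff context) as equality, since a constant expression's image is a single value.

    type scond =
      | Eq of int * int              (* i_k = n                     *)
      | Ge of int * int              (* i_k >= n                    *)
      | ModGe of int * int * int     (* i_k mod a = b  /\  i_k >= b *)

    (* p(a*i_k+b): the image of the simple expression as a simple condition. *)
    let p a k b =
      if a = 0 then Eq (k, b)        (* image is the single value b;
                                        reconstructed elided case           *)
      else if a = 1 then Ge (k, b)   (* image is {b, b+1, b+2, ...}         *)
      else ModGe (k, a, b)           (* image is {b, b+a, b+2a, ...}        *)
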
     1127Now, suppose we are given a mapping $\kappa$ from indexed labels to natural numbers.
     1128We will transform it into a mapping (identified, by abuse of notation, with the same symbol $\kappa$) from atoms to \imp{} expressions built with ternary expressions which depend solely on loop indices.
     1129To that end we define an auxiliary function $\kappa^\alpha_L$, parameterized by atoms and words of simple expressions, and defined on \emph{sets} of $n$-tuples of simple expressions (with $n$ constant across each such set, i.e.\ each set is made of words all with the same length).
     1130
     1131We will employ a bijection between words of simple expressions and indexings, given by:\footnote{Lists of simple expressions are in fact how indexings are represented in CerCo's current implementation of the compiler.}
     1132$$
     1133i_0\mapsto e_0,\ldots,i_{k-1}\mapsto e_{k-1} \cong e_0\cdots e_{k-1}.
     1134$$
     1135As usual, $\varepsilon$ denotes the empty word/indexing, and juxtaposition is used to denote word concatenation.
     1136
     1137For every set $S$ of $n$-tuples of simple expressions, we are in one of the following three exclusive cases:
    12351138\begin{itemize}
    1236  \item $S=\emptyset$, or
    1237  \item $S=\{\varepsilon\}$, or
    1238  \item there is a simple expression $e$ such that $S$ can be decomposed in
    1239   $eS'+S''$, with $S'\neq \emptyset$ and none of the words in $S''$ starting with $e$
     1139\item
     1140$S=\emptyset$.
     1141\item
     1142$S=\{\varepsilon\}$.
     1143\item
     1144There is a simple expression $e$ such that $S$ can be decomposed as $eS'+S''$, with $S'\neq \emptyset$ and none of the words in $S''$ starting with $e$.
    12401145\end{itemize}
    1241 where $eS'$ denotes prepending $e$ to all elements of $S'$ and $+$ is disjoint
    1242 union. This classification can serve as the basis of a definition by recursion
    1243 on $n+\card S$ where $n$ is the size of tuples in $S$ and $\card S$ is its cardinality.
    1244 Indeed in the third case in $S'$ the size of tuples decreases strictly (and
    1245 cardinality does not increase) while for $S''$ the size of tuples remains the same
    1246 but cardinality strictly decreases. The expression $e$ of the third case will be chosen
    1247 as minimal for some total order\footnote{The specific order used does not change
    1248 the correctness of the procedure, but different orders can give more or less
    1249 readable results. A ``good'' order is the lexicographic one, with $a*i_k+b \le a'*i_k+b'$
    1250 if $a<a'$ or $a=a'$ and $b\le b'$.}.
    1251 
    1252 Following is the definition of the auxiliary function $\kappa^\alpha_L$, following
    1253 the recursion scheme presented above.
     1146Here $eS'$ denotes prepending $e$ to all elements of $S'$ and $+$ is disjoint union.
     1147This classification can serve as the basis of a definition by recursion on $n+\card S$ where $n$ is the size of tuples in $S$ and $\card S$ is its cardinality.
     1148Indeed, in the third case the size of tuples in $S'$ decreases strictly (and cardinality does not increase), while for $S''$ the size of tuples remains the same but cardinality strictly decreases.
     1149The expression $e$ of the third case will be chosen as minimal for some total order\footnote{The specific order used does not change the correctness of the procedure, but different orders can give more or less readable results. A ``good'' order is the lexicographic one, with $a*i_k+b \le a'*i_k+b'$ if $a<a'$ or $a=a'$ and $b\le b'$.}.
     1150
     1151The definition of the auxiliary function $\kappa^\alpha_L$, following the recursion scheme presented above, is:
    12541152$$
    12551153\begin{aligned}
     
    12591157\end{aligned}
    12601158$$
    1261 
     1159\noindent
    12621160Finally, the desired dependent cost mapping is defined by
    1263 $$\kappa(\alpha)\ass\kappa^\alpha_\varepsilon(\{\,L\mid \alpha\la L\ra \text{ appears
    1264 in the compiled code}\,\}).$$
    1265 
    1266 \paragraph{Indexed instrumentation.}
    1267 The \emph{indexed instrumentation} generalizes the instrumentation presented in
    1268 \autoref{sec:labelling}.
    1269 We described
    1270 above how cost atoms can be mapped to dependent costs. The instrumentation must
    1271 also insert code dealing with the loop indexes. As instrumentation is done
    1272 on the code produced by the labelling phase, all cost labels are indexed by
    1273 identity indexings. The relevant cases of the recursive definition
    1274 (supposing $c$ is the cost variable) are then
     1161$$
     1162\kappa(\alpha)\ass\kappa^\alpha_\varepsilon(\{\,L\mid \alpha\la L\ra \text{ appears in the compiled code}\,\})
     1163$$
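
The whole construction can be sketched in OCaml. Here words are lists of simple-expression triples, the minimal head is chosen by the structural order on triples (any total order works, per the footnote), and the equations elided by the diff context are reconstructed as $\kappa^\alpha_L(\emptyset)=0$ and $\kappa^\alpha_L(\{\varepsilon\})=\kappa(\alpha\la L\ra)$; all of these are assumptions of ours, consistent with the recursion scheme described above.

    type sexpr = int * int * int                    (* (a, k, b) for a*i_k+b *)
    type cost =
      | Const of int
      | Tern of sexpr * cost * cost                 (* (p(e)) ? f1 : f2 *)

    (* kappa gives the cost of the indexed label alpha<L> for a fixed atom
       alpha; l is the word L accumulated so far (in reverse); s is the set
       S, a duplicate-free list of words of equal length. *)
    let rec kappa_aux (kappa : sexpr list -> int) l (s : sexpr list list) =
      match s with
      | [] -> Const 0                               (* S = empty set      *)
      | [[]] -> Const (kappa (List.rev l))          (* S = { epsilon }    *)
      | _ ->
          (* minimal head expression e, then S = eS' + S'' *)
          let e = List.hd (List.sort compare (List.map List.hd s)) in
          let s' = List.filter_map
                     (function x :: w when x = e -> Some w | _ -> None) s in
          let s'' = List.filter (fun w -> List.hd w <> e) s in
          Tern (e, kappa_aux kappa (e :: l) s', kappa_aux kappa l s'')

With this, the $\kappa(\alpha)$ just defined corresponds to `kappa_aux kappa [] s` where `s` collects the words of the indexings with which $\alpha$ appears in the compiled code.
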
     1164
     1165\paragraph{Indexed instrumentation}
     1166The \emph{indexed instrumentation} generalises the instrumentation presented in Section~\ref{sec:labelling}.
     1167We described above how cost atoms can be mapped to dependent costs.
     1168The instrumentation must also insert code dealing with the loop indices.
     1169As instrumentation is done on the code produced by the labelling phase, all cost labels are indexed by identity indexings.
     1170The relevant cases of the recursive definition (supposing $c$ is the cost variable) are then:
    12751171$$
    12761172\begin{aligned}
     
    12811177$$
    12821178
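As a concrete illustration of these cases, the following OCaml sketch performs the rewriting on a toy statement type; the cost variable \texttt{\_c}, the index variables \texttt{\_i\_k} and the printer \texttt{pp\_dcost} (assumed to render $\kappa(\alpha)$ as a source-level conditional expression over the index variables) are our own assumptions, not the prototype's code.
\begin{verbatim}
type stmt =
  | Emit of string             (* emit alpha, indexed by the identity *)
  | ResetIndex of int          (* i_k := 0 *)
  | IncIndex of int            (* i_k := i_k + 1 *)
  | Seq of stmt * stmt

let rec instrument (pp_dcost : string -> string) : stmt -> string = function
  | Emit alpha   -> "_c = _c + (" ^ pp_dcost alpha ^ ");"
  | ResetIndex k -> Printf.sprintf "_i_%d = 0;" k
  | IncIndex k   -> Printf.sprintf "_i_%d = _i_%d + 1;" k k
  | Seq (s, t)   -> instrument pp_dcost s ^ "\n" ^ instrument pp_dcost t
\end{verbatim}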
    1283 \section{Notes on the implementation and further work}\label{sec:conc}
    1284 Implementing the indexed label approach in CerCo's untrusted OcaML prototype
    1285 does not introduce many aspects beyond what has already been presented for the toy
    1286 language \imp{} with \s{goto}s. \s{Clight}, the fragment of \s{C} which is the
    1287 source language of CerCo's compilation chain~\cite{D2.1}, has several more fetaures,
    1288 but few demand changes in the indexed labelled approach.
    1289 \paragraph{Indexed loops vs index update instructions.} In this presentation
    1290 we had indexed loops in $\iota\ell\imp$, while we hinted at the fact that later
    1291 languages in the compilation chain would have specific index update instructions.
    1292 In CerCo's actual compilation chain from \s{Clight} to 8051 assembly, indexed
    1293 loops are only in \s{Clight}, while from \s{Cminor} onward all languages have
    1294 the same three cost-involving instructions: label emitting, index resetting and
    1295 index incrementing.
    1296 \paragraph{Loop transformations in the front end.} We decided to implement
    1297 the two loop transformations in the front end, namely in \s{Clight}. This
    1298 decision is due to user readability concerns: if costs are to be presented to
    1299 the programmer, they should depend on structures written by the programmer
    1300 himself. If loop transformation were performed later it would be harder to
    1301 relate information on loops in the control flow graph with actual loops
    1302 written in the source code. Another solution would be however to index
    1303 loops in the source code and then use these indexes later in the compilation
    1304 chain to pinpoint explicit loops of the source code: loop indexes can be used
    1305 to preserve such information just like cost labels.
    1306 \paragraph{Break and continue statements.} \s{Clight}'s loop flow control
    1307 statements for breaking and continuing a loop are equivalent to appropriate
    1308 \s{goto} statements. The only difference is that we are assured that they cannot
    1309 cause loops to be multi-entry, and that when a transformation such as loop
    1310 peeling is done, they need to be replaced by actual \s{goto}s (which will happen
    1311 further down the compilation chain anyway).
    1312 \paragraph{Function calls.} Every internal function definition has its own
    1313 space of loop indexes. Executable semantics must thus take into account saving
    1314 and resetting the constant indexing of current loops upon hitting a function
    1315 call, and restoring it upon return of control. A peculiarity is that this cannot
    1316 be attached to actions saving and restoring frames: namely in the case of
    1317 tail calls the constant indexing needs to be saved whereas the frame does not.
    1318 \paragraph{Cost-labelled expressions.} In labelled \s{Clight} expressions get
    1319 cost labels too, because of the presence of ternary conditional expressions
    1320 (and lazy logical operators, which get translated to ternary expressions too).
    1321 Adapting the indexed labelled approach to cost-labelled expressions does not
    1322 pose particular problems.
     1179\section{Notes on the implementation and further work}
     1180\label{sec:conc}
      1181Implementing the indexed label approach in CerCo's untrusted OCaml prototype does not introduce many new challenges beyond what has already been presented for the toy language \imp{} with \s{goto}s.
      1182\s{Clight}, the fragment of \s{C} that serves as the source language of CerCo's compilation chain~\cite{D2.1}, has several more features, but few demand changes to the indexed label approach.
     1183
     1184\paragraph{Indexed loops \emph{vs}. index update instructions}
      1185In our presentation we gave $\iota\ell\imp$ indexed loops, while hinting that later languages in the compilation chain would have specific index update instructions.
     1186In CerCo's actual compilation chain from \s{Clight} to 8051 assembly, indexed loops are only in \s{Clight}, while from \s{Cminor} onward all languages have the same three cost-involving instructions: label emitting, index resetting and index incrementing.
     1187
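As a sketch of what this lowering might look like, the fragment below turns a structured indexed loop into an index reset before the loop, label emission inside the body, and an index increment at the end of each iteration; the instruction type and the treatment of guards as opaque strings are illustrative assumptions, not CerCo's actual intermediate representation.
\begin{verbatim}
type instr =
  | EmitLabel of string        (* label emitting *)
  | ResetIndex of int          (* index resetting: i_k := 0 *)
  | IncIndex of int            (* index incrementing: i_k := i_k + 1 *)
  | Loop of string * instr list
  | Block of instr list

(* i^k while b do (alpha : s)
   ~>  i_k := 0; while b do (emit alpha; s; i_k := i_k + 1) *)
let lower_indexed_loop k guard alpha body =
  Block [ ResetIndex k;
          Loop (guard, EmitLabel alpha :: body @ [ IncIndex k ]) ]
\end{verbatim}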
     1188\paragraph{Loop transformations in the front end}
     1189We decided to implement the two loop transformations in the front end, namely in \s{Clight}.
      1190This decision is due to user readability concerns: if costs are to be presented to the programmer, they should depend on the structures the programmer actually wrote.
      1191If loop transformations were performed later, it would be harder to relate loops in the control flow graph to the actual loops written in the source code.
     1192However, another solution would be to index loops in the source code and then use these indices later in the compilation chain to pinpoint explicit loops of the source code: loop indices can be used to preserve such information, just like cost labels.
     1193
     1194\paragraph{Break and continue statements}
     1195\s{Clight}'s loop flow control statements for breaking and continuing a loop are equivalent to appropriate \s{goto} statements.
     1196The only difference is that we are assured that they cannot cause loops to be multi-entry, and that when a transformation such as loop peeling is complete, they need to be replaced by actual \s{goto}s (which happens further down the compilation chain anyway).
     1197
     1198\paragraph{Function calls}
     1199Every internal function definition has its own space of loop indices.
     1200Executable semantics must thus take into account saving and resetting the constant indexing of current loops upon hitting a function call, and restoring it upon return of control.
      1201A peculiarity is that this cannot be attached to the actions that save and restore frames: namely, in the case of tail calls the constant indexing needs to be saved, whereas the frame does not.
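A schematic executable-semantics fragment showing this asymmetry follows; the state record and operations are our own illustration, encoding only the observation above that the indexing is saved on every call while the frame is pushed only on non-tail calls.
\begin{verbatim}
type indexing = (int * int) list      (* loop index |-> current value *)

type state = {
  mutable frames   : string list;     (* a stand-in for actual frames *)
  mutable saved    : indexing list;   (* saved constant indexings *)
  mutable indexing : indexing;        (* indexing of the current loops *)
}

let call ~tail st frame =
  st.saved    <- st.indexing :: st.saved;   (* always save the indexing *)
  st.indexing <- [];                        (* callee has no open loops yet *)
  if not tail then st.frames <- frame :: st.frames

let return st =
  match st.saved with
  | ix :: rest -> st.indexing <- ix; st.saved <- rest
  | []         -> invalid_arg "return: nothing to restore"
\end{verbatim}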
     1202
     1203\paragraph{Cost-labelled expressions}
     1204In labelled \s{Clight}, expressions also get cost labels, due to the presence of ternary conditional expressions (and lazy logical operators, which get translated to ternary expressions too).
      1205Adapting the indexed label approach to cost-labelled expressions does not pose any particular problems.
    13231206
    13241207\paragraph{Simplification of dependent costs}
    1325 As previously mentioned, the na\"{i}ve application of the procedure described in~\autoref{ssec:depcosts} produces unwieldy cost annotations.
      1208As previously mentioned, the na\"{i}ve application of the procedure described in Section~\ref{ssec:depcosts} produces unwieldy cost annotations.
    13261209In our implementation several transformations are used to simplify such complex dependent costs.
    13271210
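As an illustration of the kind of rewriting involved, the sketch below collapses conditionals whose branches carry equal costs, on a toy dependent cost type with opaque conditions; whether this particular rule is among the implemented transformations is our assumption.
\begin{verbatim}
type dcost =
  | Const of int
  | Branch of string * dcost * dcost   (* condition ? p : q *)

let rec simplify = function
  | Const n -> Const n
  | Branch (cond, p, q) ->
      match simplify p, simplify q with
      | Const n, Const m when n = m -> Const n   (* c ? n : n  ~>  n *)
      | p', q' -> Branch (cond, p', q')
\end{verbatim}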
     
    13601243The cost analysis carried out by the plugin now takes into account such dependent costs.
    13611244
    1362 The only limitation (which provided a simplification in the code) is that within a dependent cost simple conditions with modulus on the
    1363 same loop index should not be modulo different numbers, which corresponds to
    1364 the reasonable limitation of not applying multiple times loop unrolling to
    1365 the same loops.
    1366 \paragraph{Further work.}
    1367 Indexed labels are for now implemented only in the untrusted OcaML compiler,
    1368 while they are not present yet in the Matita code. Porting them should pose no
    1369 significant problem, and then the proof effort should begin.
    1370 
    1371 Because most of the executable operational semantics of the languages across the
    1372 front end and the back end are oblivious to cost labels, it should be expected
    1373 that the bulk of the semantic preservation proofs that still needs to be done
    1374 will not get any harder because of indexed labels. The only trickier point
    1375 would be in the translation of \s{Clight} to \s{Cminor}, where we
    1376 pass from structured indexed loops to atomic instructions on loop indexes.
    1377 
    1378 An invariant which should probably be proved and provably preserved along compilation
    1379 is the non-overlap of indexings for the same atom. Then, supposing cost
    1380 correctness for the unindexed approach, the indexed one will just need to
    1381 add the proof that
      1245The only limitation (which actually simplifies the code) is that, within a dependent cost, simple conditions involving a modulus on the same loop index must all use the same modulus.
     1246This corresponds to a reasonable limitation on the number of times loop unrolling may be applied to the same loop: at most once.
     1247
     1248\paragraph{Further work}
      1249For the time being, indexed labels are implemented only in the untrusted OCaml compiler, and are not yet present in the Matita code.
     1250Porting them should pose no significant problem.
     1251Once ported, the task of proving properties about them in Matita can begin.
     1252
      1253Because most of the executable operational semantics of the languages across the front end and the back end are oblivious to cost labels, it should be expected that the bulk of the semantic preservation proofs still to be done will not become any harder because of indexed labels.
     1254The only trickier point that we foresee would be in the translation of \s{Clight} to \s{Cminor}, where we pass from structured indexed loops to atomic instructions on loop indices.
     1255
     1256An invariant which should probably be proved and provably preserved along the compilation chain is the non-overlap of indexings for the same atom.
      1257Then, supposing cost correctness for the unindexed approach, the indexed one will just need to add a proof that
    13821258$$\forall C\text{ constant indexing}.\forall \alpha\la I\ra\text{ appearing in the compiled code}.
    1383   \kappa(\alpha)\circ (I\circ C) = \kappa(\alpha\la I \ra).$$
    1384 $C$ represents a snapshot of loop indexes in the compiled code, while
    1385 $I\circ C$ is the corresponding snapshot in the source code. Semantics preservation
    1386 will make sure that when with snapshot $C$ we emit $\alpha\la I\ra$ (that is,
    1387 we have $\alpha\la I\circ C\ra$ in the trace), this means that $\alpha$ must be
    1388 emitted in the source code with indexing $I\circ C$, so the cost
    1389 $\kappa(\alpha)\circ (I\circ C)$ applies.
    1390 
    1391 Apart from carrying over the proofs, we would like to extend the approach
    1392 to more loop transformations. Important examples are loop inversion
    1393 (where a for loop is reversed, usually to make iterations appear as truly
    1394 independent) or loop interchange (where two nested loops are swapped, usually
    1395 to have more loop invariants or to enhance strength reduction). This introduces
    1396 interesting changes to the approach, where we would have indexings such as
     1259  \kappa(\alpha)\circ (I\circ C) = \kappa(\alpha\la I \ra).
     1260$$
     1261Here, $C$ represents a snapshot of loop indices in the compiled code, while $I\circ C$ is the corresponding snapshot in the source code.
     1262Semantics preservation will ensure that when, with snapshot $C$, we emit $\alpha\la I\ra$ (that is, we have $\alpha\la I\circ C\ra$ in the trace), $\alpha$ must also be emitted in the source code with indexing $I\circ C$, so the cost $\kappa(\alpha)\circ (I\circ C)$ applies.
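For a concrete instance, assume for illustration that loop peeling leaves the occurrences inside the residual loop with the reindexing $I = (i_0\mapsto i_0+1)$, and take the compiled snapshot $C = (i_0\mapsto 2)$, so that
$$I = (i_0\mapsto i_0+1),\qquad C = (i_0\mapsto 2),\qquad I\circ C = (i_0\mapsto 3).$$
The equation then requires that $\kappa(\alpha)$ evaluated at the source snapshot $i_0 = 3$ coincide with the cost assigned to the compiled occurrence $\alpha\la i_0+1\ra$ evaluated at the compiled snapshot $i_0 = 2$.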
     1263
     1264Aside from carrying over the proofs, we would like to extend the approach to more loop transformations.
     1265Important examples are loop inversion (where a for loop is reversed, usually to make iterations appear to be truly independent) or loop interchange (where two nested loops are swapped, usually to have more loop invariants or to enhance strength reduction).
     1266This introduces interesting changes to the approach, where we would have indexings such as:
    13971267$$i_0\mapsto n - i_0\quad\text{or}\quad i_0\mapsto i_1, i_1\mapsto i_0.$$
    1398 In particular dependency over actual variables of the code would enter the
    1399 frame, as indexings would depend on the number of iterations of a well-behaving
    1400 guarded loop (the $n$ in the first example).
    1401 
    1402 Finally, as stated in the introduction, the approach should allow integrating
    1403 some techniques for cache analysis, a possibility that for now has been put aside
    1404 as the standard 8051 architecture targeted by the CerCo project has no cache.
    1405 As possible developments of this line of work without straying from the 8051
    1406 architecture, two possibilities present themselves.
      1268In particular, dependencies on actual program variables would enter the picture, as indexings would depend on the number of iterations of a well-behaved guarded loop (the $n$ in the first example).
     1269
     1270Finally, as stated in the introduction, the approach should allow some integration of techniques for cache analysis, a possibility that for now has been put aside as the standard 8051 target architecture for the CerCo project lacks a cache.
     1271Two possible developments for this line of work present themselves:
    14071272\begin{enumerate}
    1408  \item One could extend the development to some 8051 variants, of which some have
    1409   been produced with a cache.
    1410  \item One could make the compiler implement its own cache: this cannot
    1411   apply to RAM accesses of the standard 8051 architecture, as the difference in
    1412   cost of accessing the two types of RAM is of just 1 clock cycle, which makes
    1413   any implementation of cache counterproductive. So this possibility should
    1414   either artificially change the accessing cost of RAM of the model just for the
    1415   sake of possible future adaptations to other architectures, or otherwise
    1416   model access to an external memory by means of the serial port.\end{enumerate}
     1273\item
      1274One could extend the development to those 8051 variants that have been produced with a cache.
     1275\item
      1276One could make the compiler implement its own cache: this cannot apply to \textsc{ram} accesses of the standard 8051 architecture, as the difference in cost of accessing the two types of \textsc{ram} is only one clock cycle, which makes any cache implementation counterproductive.
      1277So this proposal would either artificially change the model's \textsc{ram} access cost, purely for the sake of possible future adaptations to other architectures, or model access to an external memory by means of the serial port.
     1278\end{enumerate}
    14171279
    14181280