# Changeset 3666

Timestamp:
Mar 16, 2017, 5:21:42 PM
Message:

Updated the proof part

Location:
Papers/jar-cerco-2017
Files:
4 edited

• Papers/jar-cerco-2017/cerco.bib

 r3661 doi = {10.1007/3-540-45937-5_16} } @Inbook{Ayache2012, author="Ayache, Nicolas and Amadio, Roberto M. and R{\'e}gis-Gianas, Yann", editor="Stoelinga, Mari{\"e}lle and Pinger, Ralf", title="Certifying and Reasoning on Cost Annotations in C Programs", bookTitle="Formal Methods for Industrial Critical Systems: 17th International Workshop, FMICS 2012, Paris, France, August 27-28, 2012. Proceedings", year="2012", publisher="Springer Berlin Heidelberg", address="Berlin, Heidelberg", pages="32--46", isbn="978-3-642-32469-7", doi="10.1007/978-3-642-32469-7_3", url="http://dx.doi.org/10.1007/978-3-642-32469-7_3" }
• Papers/jar-cerco-2017/cerco.tex

 r3659 \usepackage{stmaryrd} \usepackage{url} \usepackage[all]{xy} \usepackage{tikz}
• Papers/jar-cerco-2017/development.tex

 r3662 \label{sect.formal.development} \subsection*{Statistics} \subsection{Statistics} \begin{tabular}{|c|c|}
• Papers/jar-cerco-2017/proof.tex

 r3659 %   Main theorem statement \section{Introduction} In~\cite{D2.1}, Amadio \emph{et al.} propose an approach for building a compiler for a large fragment of the \textsc{c} programming language. The novelty of their proposal lies in the fact that their proposed design is capable of lifting execution cost information from the compiled code and presenting it to the user. This idea is foundational for the CerCo project, which strives to produce a mechanically certified version of such a compiler. To summarise, Amadio's proposal consisted of `decorations' on the source code, along with the insertion of labels at key points. These labels are preserved as compilation progresses, from one intermediate language to another. Once the final object code is produced, such labels should correspond to the parts of the compiled code that have a constant cost. Two properties must hold of any cost estimate. The first property, paramount to the correctness of the method, is \emph{soundness}: the actual execution cost is bounded by the estimate. In the labelling approach, this is guaranteed if every loop in the control flow of the compiled code passes through at least one cost label. The second property, optional but desirable, is \emph{preciseness}: the estimate \emph{is} the actual cost. In the labelling approach, this will be true if, for every label, every possible execution of the compiled code starting from such a label yields the same cost before hitting another one. In simple architectures such as the 8051 micro-controller this can be guaranteed by placing labels at the start of any branch in the control flow, and by ensuring that no labels are duplicated. The reader should note that the above-mentioned requirements must hold when executing the code obtained at the end of the compilation chain. 
So even if one is careful about injecting the labels at suitable places in the source code, the requirements might still fail because of two main obstacles: \begin{itemize} \item The compilation process introduces important changes in the control flow, inserting loops or branches: for example, the insertion of functions in place of instructions that are unavailable in the target architecture. This requires loops to be inserted (for example, for multi-word division and generic shift in the 8051 architecture), or effort spent in providing unbranching translations of higher level instructions~\cite{D2.2}. \item Even when the compiled code \emph{does}---as far as the syntactic control flow graph is concerned---respect the conditions for soundness and preciseness, the cost of blocks of instructions might not be independent of context, so that different passes through a label might have different costs. This becomes a concern if one wishes to apply the approach to more complex architectures, for example one with caching or pipelining. \end{itemize} The first point unveils a weakness of the current labelling approach when it comes to some common code transformations performed along a compilation chain. In particular, most \emph{loop optimisations} are disruptive, in the sense outlined in the first bullet point above. An example optimisation of this kind is \emph{loop peeling}. This optimisation is employed by compilers in order to trigger other optimisations, for example, dead code elimination or invariant code motion. Here, a first iteration of the loop is hoisted out of the body of the loop, possibly being assigned a different cost than later iterations. The second bullet point above highlights another weakness. Various tools make it possible to predict, to a certain extent, the behaviour of the cache. 
For example, the well known tool \s{aiT}~\cite{absint}---based on abstract interpretation---allows the user to estimate the worst-case execution time (\textsc{wcet}) of a piece of source code, taking into account advanced features of the target architecture. While such a tool is not fit for the compositional approach that is central to the CerCo project\footnote{\s{aiT} assumes the cache is empty at the start of computation, and treats each procedure call separately, unrolling a great part of the control flow.}, \s{aiT}'s ability to produce tight estimates of execution costs would still enhance the effectiveness of the CerCo compiler, \eg{} by integrating such techniques in its development. A typical case where cache analysis yields a difference in the execution cost of a block is in loops: the first iteration will usually stumble upon more cache misses than subsequent iterations. If one looks closely, the source of the weakness of the labelling approach as presented in~\cite{D2.1} is common to both points: the inability to state different costs for different occurrences of labels, where the difference might originate from labels being duplicated along the compilation, or from the costs being sensitive to the current state of execution. The preliminary work we present here addresses this weakness by introducing cost labels that depend on which iteration of their containing loops they occur in. \section{Compiler proof} \label{sect.compiler.proof} \subsection{Labelling approach} In this section, we explore the labelling approach mentioned in the introduction in more detail. The `labelling' approach to the problem of building cost annotations is summarised in the following diagram. 
\newcolumntype{M}[1]{>{\raggedright}m{#1}} \- \begin{tabular}{M{10cm}||M{4cm}} $$\xymatrix{ % Row 1 L_1 \\ % % Row 2 L_{1,\ell} \ar[u]^{\cl{I}} \ar@/^/[d]^{\w{er}_1} \ar[r]^{\cl{C}_1} % & L_{2,\ell} \ar[d]^{\w{er}_2} % & \ldots \hspace{0.3cm}\ar[r]^{\cl{C}_k} % & L_{k+1, \ell} \ar[d]^{\w{er}_{k+1}} \\ % % Row 3 L_1 \ar@/^/[u]^{\cl{L}} \ar[r]^{\cl{C}_1} & L_2 & \ldots\hspace{0.3cm} \ar[r]^{\cl{C}_{k}} & L_{k+1} }$$ & $$\begin{array}{ccc} \w{er}_{i+1} \comp \cl{C}_{i} &= &\cl{C}_{i} \comp \w{er}_{i} \\ \w{er}_{1} \comp \cl{L} &= &\w{id}_{L_{1}} \\ \w{An}_{1} &= &\cl{I} \comp \cl{L} \end{array}$$ \end{tabular} \- For each language $L_i$ considered in the compilation process, we define an extended {\em labelled} language $L_{i,\ell}$ and an extended operational semantics. The labels are used to mark certain points of the control. The semantics makes sure that whenever we cross a labelled control point a labelled and observable transition is produced. For each labelled language there is an obvious function $\w{er}_i$ erasing all labels and producing a program in the corresponding unlabelled language. The compilation functions $\cl{C}_i$ are extended from the unlabelled to the labelled language so that they enjoy commutation with the erasure functions. Moreover, we lift the soundness properties of the compilation functions from the unlabelled to the labelled languages and transition systems. A {\em labelling} $\cl{L}$  of the source language $L_1$ is just a function such that $\w{er}_{L_{1}} \comp \cl{L}$ is the identity function. An {\em instrumentation} $\cl{I}$ of the source labelled language  $L_{1,\ell}$ is a function replacing the labels with suitable increments of, say, a fresh \w{cost} variable. Then an {\em annotation} $\w{An}_{1}$ of the source program can be derived simply as the composition of the labelling and the instrumentation functions: $\w{An}_{1} = \cl{I}\comp \cl{L}$. 
As for the direct approach, suppose $s$ is some adequate representation of the state of a program. Let $S$ be a source program and suppose that its annotation satisfies the following property: \label{STEP1} (\w{An}_{1}(S),s[c/\w{cost}])  \eval s'[c+\delta/\w{cost}] where $\delta$ is some non-negative number. Then the definition of the instrumentation and the fact that the soundness proofs of the compilation functions have been lifted to the labelled languages allow us to conclude that \label{step2} (\cl{C}(L(S)),s[c/\w{cost}])  \eval (s'[c/\w{cost}],\lambda) where $\cl{C} = \cl{C}_{k} \comp \cdots \comp \cl{C}_{1}$ and $\lambda$ is a sequence (or a multi-set) of labels whose `cost' corresponds to the number $\delta$ produced by the annotated program. Then the commutation properties of erasure and compilation functions allow us to conclude that the {\em erasure} of the compiled labelled code $\w{er}_{k+1}(\cl{C}(L(S)))$ is actually equal to the compiled code $\cl{C}(S)$ we are interested in. Given this, the following question arises: \begin{quote} Under which conditions is the sequence $\lambda$, {\em i.e.}, the increment $\delta$, a sound and possibly precise description of the execution cost of the object code? \end{quote} To answer this question, we observe that the object code we are interested in is some kind of assembly code, whose control flow can be easily represented as a control flow graph. The fact that we have to prove the soundness of the compilation functions means that we have plenty of information on the way the control flows in the compiled code, in particular as far as procedure calls and returns are concerned. This information allows us to build a rather accurate representation of the control flow of the compiled code at run time. The idea is then to perform two simple checks on the control flow graph. The first check is to verify that all loops go through a labelled node. 
If this is the case then we can associate a finite cost with every label and prove that the cost annotations are sound. The second check amounts to verifying that all paths starting from a label have the same cost. If this check is successful then we can conclude that the cost annotations are precise. This approach, as published in~\cite{Ayache2012}, is the one used in the CerCo project. However, it has some weaknesses, which we describe in the next subsection, along with a method to resolve them. \subsection{Indexed labelling} The labelling approach as described above does not deal well with some optimisations the compiler might introduce, such as changes in the control flow. For example, the 8051 architecture does not have an instruction for multi-word division, which means that a function containing loops has to be inserted into the code. In fact, this is symptomatic of a more general weakness: the inability to state different costs for different occurrences of labels, where the difference might originate from labels being duplicated along the compilation, or from the costs being sensitive to the current state of execution. We present a solution that addresses this weakness by introducing cost labels that are dependent on the iteration of their containing loop. This is achieved by means of \emph{indexed labels}; all cost labels are decorated with formal indices coming from the loops containing such labels. These indices allow us to keep track, even after multiple loop transformations, of which iteration of the original loop in the source code a particular label occurrence belongs to. During the annotation stage of the source code, this information is presented to the user by means of \emph{dependent costs}. 
We concentrate on integrating the labelling approach with two loop transformations. For general information on compiler optimisations (and loop optimisations in particular) we refer the reader to the vast literature on the subject (\eg~\cite{muchnick,morgan}). Over the course of this section, we will use two specific types of optimisations to introduce the indexed labelling approach. We argue that these two types pose a good foundation for extending the labelling approach, in order to cover more and more common optimisations, as well as gaining insight into how to integrate advanced cost estimation techniques, such as cache analysis, into the CerCo compiler. Moreover, loop peeling itself has the fortuitous property of enhancing and enabling other optimisations. Experimentation with CerCo's untrusted prototype compiler, which implements constant propagation and partial redundancy elimination~\cite{PRE,muchnick}, shows how loop peeling enhances those other optimisations. \paragraph{Loop peeling} As already mentioned, loop peeling consists in preceding the loop with a copy of its body, appropriately guarded. This is used, in general, to trigger further optimisations, such as those that rely on execution information which can be computed at compile time, but which is erased by further iterations of the loop, or those that use the hoisted code to be more effective at eliminating redundant code. Integrating this transformation into the labelling approach would also allow the integration of the common case of cache analysis explained above; the analysis of cache hits and misses usually benefits from a form of \emph{virtual} loop peeling~\cite{cacheprediction}. 
\paragraph{Loop unrolling} This optimisation consists of the repetition of several copies of the body of the loop inside the loop itself (inserting appropriate guards, or avoiding them altogether if enough information about the loop's guard is available at compile time). This can limit the number of (conditional or unconditional) jumps executed by the code and trigger further optimisations dealing with pipelining, if appropriate for the architecture. \paragraph{Outline} We will present our approach on a minimal `toy' imperative language, \imp{} with \s{goto}s, which we present in Section~\ref{sec:defimp} along with formal definitions of the loop transformations. This language already presents most of the difficulties encountered when dealing with \textsc{c}, so we stick to it for the sake of this presentation. In Section~\ref{sec:labelling} we summarise the labelling approach as presented in~\cite{D2.1}. Section~\ref{sec:indexedlabels} presents \emph{indexed labels}, our proposal for dependent labels which are able to describe precise costs even in the presence of the various loop transformations we consider. Finally, Section~\ref{sec:conc} goes into more detail regarding the implementation of indexed labels in CerCo's untrusted compiler and speculates on further work on the subject. 
\subsubsection{\imp{} with goto}\label{sec:defimp} We briefly outline a toy language, \imp{} with \s{goto}s. This language was designed in order to pose problems for the existing labelling approach, and as a testing ground for our new notion of indexed labels. The syntax and operational semantics of \imp{} are presented in figure~\ref{fig:minimp}. Note that we can augment the language with \s{break} and \s{continue} statements at no further expense. \begin{figureplr} $$\begin{gathered} \label{fig:minimp} \end{figureplr} The precise grammar for expressions is not particularly relevant so we do not give one in full. For the sake of conciseness we also treat boolean and arithmetic expressions together (with the usual \textsc{c} convention of an expression being true iff non-zero). We posit the existence of a suitable notion of `sequential instructions', wherein each instruction has a single natural successor to which we can add our own, for every language $L$ further down the compilation chain. \subsubsection{Loop transformations} We call a loop $L$ \emph{single-entry} in $P$ if there is no \s{goto} in $P$ outside of $L$ which jumps into $L$.\footnote{This is a reasonable approximation: it defines a loop as multi-entry if it has an external but unreachable \s{goto} jumping into it.} Many loop optimisations do not preserve the semantics of multi-entry loops in general, or are otherwise rendered ineffective. 
\end{example} \section{Labelling: a quick sketch of the previous approach} \label{sec:labelling} Plainly labelled \imp{} is obtained by adding to the code \emph{cost labels} (with metavariables $\alpha,\beta,\ldots$), and cost-labelled statements:$$ S,T\gramm \cdots \mid \alpha: S $$Cost labels allow us to track some program points along the compilation chain. For further details we refer to~\cite{D2.1}. With labels, the small step semantics turns into a labelled transition system, and a natural notion of trace (\ie{} a list of labels) arises. The evaluation of statements is enriched with traces, so that rules follow a pattern similar to the following:$$ \begin{array}{lblL} (\alpha: S, K,s) &\to[\alpha]_P (S,K,s)\\ (\s{skip}, S \cdot K,s) &\to[\varepsilon]_P (S, K, s)\\ & \text{etc.} \end{array}$$Here, we identify cost labels $\alpha$ with singleton traces and we use $\varepsilon$ for the empty trace. Cost labels are emitted by cost-labelled statements only\footnote{In the general case the evaluation of expressions can emit cost labels too (see~\ref{sec:conc}).}. We then write $\to[\lambda]\!\!^*$ for the transitive closure of the small step semantics which produces by concatenation the trace $\lambda$. \paragraph{Labelling} Given an \imp{} program $P$, its \emph{labelling} $\alpha:\Ell(P)$ in $\ell$-\imp{} is defined by putting cost labels after every branching statement, at the start of both branches, and a cost label at the beginning of the program. Also, every labelled statement gets a cost label, which is a conservative approach to ensuring that all loops have labels inside them, as a loop might be constructed with \s{goto}s. The relevant cases are$$\begin{aligned} \Ell(\sop{if}e\sbin{then}S\sbin{else}T) &= \sop{if}e\sbin{then}\alpha:\Ell(S)\sbin{else}\beta:\Ell(T)\\ \Ell(\sop{while}e\sbin{do}S) &= (\sop{while}e\sbin{do}\alpha:\Ell(S));\beta:\s{skip}\\ \Ell(\ell : S) &= (\ell : \alpha : \Ell(S)) \end{aligned}$$where $\alpha,\beta$ are fresh cost labels. 
In all other cases the definition just passes to substatements. \paragraph{Labels in the rest of the compilation chain} All languages further down the chain get a new sequential statement $\sop{emit}\alpha$ whose effect is to be consumed in a labelled transition while keeping the same state. All other instructions guard their operational semantics and do not emit cost labels. Preservation of semantics throughout the compilation process is restated, in rough terms, as:$$ \text{starting state of $P$}\to[\lambda]\!\!^*\;\text{halting state} \iff \text{starting state of $\mathcal C(P)$} \to[\lambda]\!\!^*\;\text{halting state} $$Here $P$ is a program of a language along the compilation chain, starting and halting states depend on the language, and $\mathcal C$ is the compilation function\footnote{The case of divergent computations needs to be addressed too. Also, the requirement can be weakened by demanding some weaker form of equivalence of the traces than equality. Both of these issues are beyond the scope of this presentation.}. \paragraph{Instrumentations} Let $\mathcal C$ be the whole compilation from $\ell$-\imp{} to the labelled version of some low-level language $L$. Supposing such compilation has not introduced any new loop or branching, we have that: \begin{itemize} \item Every loop contains at least one cost label (\emph{soundness condition}) \item Every branching has different labels for the two branches (\emph{preciseness condition}). 
\end{itemize} With these two conditions, we have that each and every cost label in $\mathcal C(P)$ for any $P$ corresponds to a block of sequential instructions, to which we can assign a constant \emph{cost}\footnote{This in fact requires the machine architecture to be `simple enough', or for some form of execution analysis to take place.}. We therefore may assume the existence of a \emph{cost mapping} $\kappa_P$ from cost labels to natural numbers, assigning to each cost label $\alpha$ the cost of the block containing the single occurrence of $\alpha$. Given any cost mapping $\kappa$, we can enrich a labelled program so that a particular fresh variable (the \emph{cost variable} $c$) keeps track of the summation of costs during the execution. We call this procedure \emph{instrumentation} of the program, and it is defined recursively by:$$ \mathcal I(\alpha:S) = c \ass c + \kappa(\alpha) ; \mathcal I(S) $$In all other cases the definition passes to substatements. \paragraph{The problem with loop optimisations} Let us take loop peeling, and apply it to the labelling of a program without any prior adjustment:$$ (\sop{while}e\sbin{do}\alpha:S);\beta:\s{skip} \mapsto (\sop{if}e\sbin{then} \alpha : S; \sop{while} e \sbin{do} \alpha : S[\ell'_i/\ell_i]); \beta:\s{skip} $$What happens is that the cost label $\alpha$ is duplicated into two distinct occurrences. If these two occurrences correspond to different costs in the compiled code, the best the cost mapping can do is to take the maximum of the two, preserving soundness (\ie the cost estimate still bounds the actual one) but losing preciseness (\ie the actual cost could be strictly less than its estimate). \subsubsection{Indexed labels} \label{sec:indexedlabels} This section presents the core of the new approach. 
\end{enumerate} \subsubsection{Indexing the cost labels} \label{ssec:indlabs} In plainer words: each single-entry loop is indexed by $i_k$ where $k$ is the number of other single-entry loops containing this one, and all cost labels under the scope of a single-entry loop indexed by $i_k$ are indexed by all indices $i_0,\ldots,i_k$, without any transformation. \subsubsection{Indexed labels and loop transformations}\label{ssec:looptrans} We define the \emph{reindexing} $I \circ (i_k\mapsto a*i_k+b)$ as an operator on indexings by setting: \begin{multline*} Again, the reindexing is as expected: each copy of the unrolled body has its indices remapped so that when they are executed, the original iteration of the loop to which they correspond can be recovered. \subsubsection{Semantics and compilation of indexed labels} In order to make sense of loop indices, one must keep track of their values in the state. A \emph{constant indexing} (metavariables $C,\ldots$) is an indexing which employs only constant simple expressions. In fact, the statement of preservation of semantics does not change at all, if not for considering traces of evaluated indexed cost labels rather than traces of plain ones. \subsubsection{Dependent costs in the source code} \label{ssec:depcosts} The task of producing dependent costs from constant costs induced by indexed labels is quite technical. \subsubsection{A detailed example}\label{ssec:detailedex} Take the program in \autoref{fig:example1}. Its initial labelling will be: $$\begin{array}{l} We will see later on \autopageref{pag:continued} how such an expression can be simplified. 
\subsection{Notes on the implementation and further work} \label{sec:conc} Implementing the indexed label approach in CerCo's untrusted OCaml prototype does not introduce many new challenges beyond what has already been presented for the toy language, \imp{} with \s{goto}s. This corresponds to a reasonable limitation on the number of times loop unrolling may be applied to the same loop: at most once. \paragraph{Further work} For the time being, indexed labels are only implemented in the untrusted OCaml compiler; they are not yet present in the Matita code. Porting them should pose no significant problem. Once ported, the task of proving properties about them in Matita can begin. Because most of the executable operational semantics of the languages across the frontend and the backend are oblivious to cost labels, it should be expected that the bulk of the semantic preservation proofs that still need to be done will not get any harder because of indexed labels. The only trickier point that we foresee would be in the translation of \s{Clight} to \s{Cminor}, where we pass from structured indexed loops to atomic instructions on loop indices. An invariant which should probably be proved, and proved to be preserved along the compilation chain, is the non-overlap of indexings for the same atom. Then, supposing cost correctness for the unindexed approach, the indexed one will just need to amend the proof that$$\forall C\text{ constant indexing}.\forall \alpha\la I\ra\text{ appearing in the compiled code}. \kappa(\alpha)\circ (I\circ C) = \kappa(\alpha\la I \ra). $$Here, $C$ represents a snapshot of loop indices in the compiled code, while $I\circ C$ is the corresponding snapshot in the source code. 
Semantics preservation will ensure that when, with snapshot $C$, we emit $\alpha\la I\ra$ (that is, we have $\alpha\la I\circ C\ra$ in the trace), $\alpha$ must also be emitted in the source code with indexing $I\circ C$, so the cost $\kappa(\alpha)\circ (I\circ C)$ applies. Aside from carrying over the proofs, we would like to extend the approach to more loop transformations. Important examples are loop inversion (where a for loop is reversed, usually to make iterations appear to be truly independent) or loop interchange (where two nested loops are swapped, usually to have more loop invariants or to enhance strength reduction). This introduces interesting changes to the approach, where we would have indexings such as:$$i_0\mapsto n - i_0\quad\text{or}\quad i_0\mapsto i_1, i_1\mapsto i_0.$$ In particular, dependency on actual variables of the code would enter the frame, as indexings would depend on the number of iterations of a well-behaving guarded loop (the $n$ in the first example). Finally, as stated in the introduction, the approach should allow some integration of techniques for cache analysis, a possibility that for now has been put aside, as the standard 8051 target architecture for the CerCo project lacks a cache. Two possible developments for this line of work present themselves: \begin{enumerate} \item One could extend the development to some 8051 variants, of which some have been produced with a cache. \item One could make the compiler implement its own cache: this cannot apply to \textsc{ram} accesses of the standard 8051 architecture, as the difference in cost of accessing the two types of \textsc{ram} is only one clock cycle, which makes any implementation of a cache counterproductive. So for this proposal, we could either artificially change the accessing cost of \textsc{ram} of the model just for the sake of possible future adaptations to other architectures, or otherwise model access to an external memory by means of the serial port. 
\end{enumerate}