# Changeset 3433 for Papers

Ignore:
Timestamp:
Feb 14, 2014, 2:43:25 PM (6 years ago)
Message:

Revise sections 4.1, 4.2.

File:
1 edited

### Legend:

Unmodified
 r3432 \section{The typical CerCo workflow} \label{sec:workflow} \begin{figure}[!t] \begin{tabular}{l@{\hspace{0.2cm}}|@{\hspace{0.2cm}}l} behaviour; 4) after execution and in interesting points during execution, the cost information computed by $P'$ must be an upper bound of the one spent by $O$ to perform the corresponding operations (soundness property); 5) the difference to perform the corresponding operations (\emph{soundness} property); 5) the difference between the costs computed by $P'$ and the execution costs of $O$ must be bounded by a program-dependent constant (precision property). bounded by a program-dependent constant (\emph{precision} property). \paragraph{The labelling software components.} We solve the problem in four The first component labels the source program $P$ by injecting label emission statements in appropriate positions to mark the beginning of basic blocks. These are the positions where the cost instrumentation will appear in the final output. % The % set of labels with their positions is called labelling. % of label emissions: $\ell_1,\ldots,\ell_n$, called the program trace. We clarify the conditions % that the labelling must respect later. For the example in Section~\ref{sec:workflow} this is just the original C code with EMIT'' instructions added at every point a \lstinline'__cost_incr' call appears in the final code. The second component is a labelling preserving compiler. It can be obtained from language and by propagating label emission statements during compilation. The compiler is correct if it preserves both the functional behaviour of the program and the traces of observables. and the traces of observables, including the labels `emitted'. % We may also ask that the function that erases the cost % emission statements commute with compilation. This optional property grants that % set of requirements will be added later. The third component is a static object code analyser. It takes as input a labelled object code and it computes the scope of each of its label emission statements, i.e.\ the tree of instructions that may be executed after the statement and before a new label emission is encountered. It is a tree and not a sequence because the scope may contain a branching statement. In order to grant that such a finite tree exists, the object code must not contain any loop that is not broken by a label emission statement. This is the first requirement of a sound labelling. The analyser fails if the labelling is unsound. For each scope, the analyser computes an upper bound of the execution time required by the scope, using the maximum of the costs of the two branches in case of a conditional statement. Finally, the analyser computes the cost of a label by taking the maximum of the costs of the scopes of every statement that emits that label. The fourth and last component takes in input the cost model computed at step 3 and the labelled code computed at step 1. It outputs a source program obtained by replacing each label emission statement with a statement that increments the global cost variable with the cost associated to the label by the cost model. The obtained source code is the instrumented source code. \paragraph{Correctness.} Requirements 1, 2 and 3 of the problem statement obviously hold, with 2 and 3 being a consequence of the definition of a correct labelling preserving compiler. It is also obvious that the value of the global cost variable of an instrumented source code is at any time equal to the sum of the costs of the labels emitted by the corresponding labelled code. Moreover, because the compiler preserves all traces, the sum of the costs of the labels emitted in the source and target labelled code are the same. Therefore, to satisfy the fourth requirement, we need to grant that the time taken to execute the object code is equal to the sum of the costs of the labels emitted by the object code. We collect all the necessary conditions for this to happen in the definition of a sound labelling: a) all loops must be broken by a cost emission statement;  b) all program instructions must be in the scope of some cost emission statement. To satisfy also the fifth requirement, additional requirements must be imposed on the object code labelling to avoid all uses of the maximum in the cost computation: the labelling is precise if every label is emitted at most once and both branches of a conditional jump start with a label The third component analyses the labelled object code to compute the scope of each of its label emission statements, i.e.\ the instructions that may be executed after the statement and before a new label emission is encountered, and then computes the maximum cost of each.  Note that we only have enough information at this point to compute the cost of loop-free portions of code.  We will consider how to ensure that every loop is broken by a cost label shortly. %It is a tree and not a sequence because the scope %may contain a branching statement. In order to grant that such a finite tree %exists, the object code must not contain any loop that is not broken by a label %emission statement. This is the first requirement of a sound labelling. The %analyser fails if the labelling is unsound. For each scope, the analyser %computes an upper bound of the execution time required by the scope, using the %maximum of the costs of the two branches in case of a conditional statement. %Finally, the analyser computes the cost of a label by taking the maximum of the %costs of the scopes of every statement that emits that label. The fourth and final component replaces the labels in the labelled version of the source code produced at the start with the costs computed for each label's scope.  This yields the instrumented source code.  For the example, this is the code in \autoref{itest1}, except for the specifications in comments, which we consider in Section~\ref{sec:exploit}. \paragraph{Correctness.} Requirements 1 and 2 hold because of the non-invasive labelling procedure.  Requirement 3 can be satisfied by implementing compilation correctly.  It is obvious that the value of the global cost variable of the instrumented source code is always equal to the sum of the costs of the labels emitted by the corresponding labelled code. Moreover, because the compiler preserves all traces, the sum of the costs of the labels emitted in the source and target labelled code are the same. Therefore, to satisfy the soundness requirement, we need to ensure that the time taken to execute the object code is equal to the sum of the costs of the labels emitted by the object code. We collect all the necessary conditions for this to happen in the definition of a \emph{sound} labelling: a) all loops must be broken by a cost emission statement; b) all program instructions must be in the scope of some cost emission statement. This ensures that every label's scope is a tree of instructions, with the cost being the most expensive path. To satisfy also the precision requirement, we must make the scopes flat sequences of instructions. We require a \emph{precise} labelling where every label is emitted at most once and both branches of each conditional jump start with a label emission statement. The correctness and precision of the labelling approach only rely on the correctness and precision of the object code labelling. The simplest way to achieve them is to impose correctness and precision requirements also on the initial labelling produced by the first software component, and to ask the compiler to preserve these way to achieve that is to impose correctness and precision requirements on the source code labelling produced at the start, and to demand that the compiler to preserve these properties too. The latter requirement imposes serious limitations on the compilation strategy and optimizations: the compiler may not duplicate any code the presence of stateful hardware whose state influences the cost of operations, like pipelines and caches. In Section~\ref{lab:deplabel} we will see an extension of the basic labelling approach to cover this situation. basic labelling approach which tackles these problems. In CerCo we have developed several cost preserving compilers based on The second C compiler is the \emph{Trusted CerCo Compiler}, whose cost predictions are formally verified. The code distributed is extracted OCaml code from the Matita implementation. In the rest of this section we will only focus on the Trusted CerCo Compiler, that only targets the C language and the 8051/8052 family, and that does not implement any advanced optimisations yet. Its user interface, however, is the same as the one of the other version, in order to trade safety with additional performances. In particular, the Frama-C CerCo plugin can work without recompilation predictions are formally verified. The executable code is OCaml code extracted from the Matita implementation. The Trusted CerCo Compiler only targets the C language and the 8051/8052 family, and does not yet implement any advanced optimisations. Its user interface, however, is the same as the other version for interoperability. In particular, the Frama-C CerCo plugin descibed in Section~\ref{sec:exploit} can work without recompilation with both of our C compilers. The 8051/8052 microprocessor is a very simple one, with constant-cost instructions. It was chosen to separate the issue of exact propagation of the cost model from the orthogonal problem of the static analysis of object code that may require approximations or dependent costs. cost model from the orthogonal problem of low-level timing analysis of object code that may require approximation or dependent costs. The (trusted) CerCo compiler implements the following optimisations: cast \item All the intermediate languages include label emitting instructions to implement the labelling approach, and the compiler preserves execution traces. \item Traditionally compilers target an assembly language with additional macro-instructions to be expanded before assembly; for CerCo we need to go all the way down to object code in order to perform the static analysis. Therefore we integrated also an optimising assembler and a static analyser. \item Instead of targeting an assembly language with additional macro-instructions which are expanded before assembly, we directly produce object code in order to perform the timing analysis, using an integrated optimising assembler. \item In order to avoid the additional work of implementing a linker and a loader, we do not support separate becoming common again in multi-core processors. Therefore the compiler has to keep track of data and it must move data between memory regions in the proper way. Moreover the size of pointers to different regions is not uniform. An additional difficulty was that the space available for the stack in internal memory in the 8051 is tiny, allowing only a minor number of nested calls. To support full recursion in order to test the CerCo tools also on recursive programs, the compiler implements a stack in external memory. way. Moreover the size of pointers to different regions is not uniform. %An %additional difficulty was that the space available for the stack in internal %memory in the 8051 is tiny, allowing only a minor number of nested calls. To %support full recursion in order to test the CerCo tools also on recursive %programs, the compiler implements a stack in external memory. \end{enumerate} \subsection{Techniques to exploit the induced cost model} \label{sec:exploit} We now turn our attention to synthesising high-level costs, such as the execution time of a whole program.  We consider as our starting point source level costs