# Changeset 3425 for Papers

Timestamp:
Feb 11, 2014, 5:02:28 PM
Message:

Rearrange section 4

File:
1 edited

\end{figure}

\section{Main scientific and technical results}

We will now review the main results that the CerCo project has achieved. First we describe the basic labeling approach and our compiler implementations that use it. This is suitable for basic architectures with simple cost models. Then we will discuss the dependent labeling extension, which is suitable for more advanced processor architectures and compiler optimisations. At the end of this section we will demonstrate automated high-level reasoning about the source-level costs provided by the compilers.
% \emph{Dependent labeling~\cite{?}.}
The basic labeling approach assumes that a constant cost can be attributed to every instruction. This is not possible in the presence of stateful hardware whose state influences the cost of operations, like pipelines and caches. In Section~\ref{lab:deplabel} we will see an extension of the basic labeling approach to cover this situation.
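To fix intuitions about the basic approach, here is a minimal sketch in C of what the instrumentation phase produces. The variable name \verb+__cost+ and the per-block constants are hypothetical: in reality the compiler computes the constants from the object code.

```c
#include <assert.h>

/* Instrumentation sketch for the basic labeling approach.
   `__cost` and the per-block constants are illustrative names/values;
   the compiler derives the real constants from the object code. */

unsigned long __cost = 0;     /* global cost variable */

#define COST_L1 23            /* cost of the block labelled L1 (illustrative) */
#define COST_L2 12            /* cost of the block labelled L2 (illustrative) */

int abs_val(int x) {
  __cost += COST_L1;          /* replaces: emit label L1 (entry block) */
  if (x < 0) {
    __cost += COST_L2;        /* replaces: emit label L2 (negative branch) */
    x = -x;
  }
  return x;
}
```

Executing \verb+abs_val(-3)+ increments the cost variable by both constants, while \verb+abs_val(3)+ increments it only by the first; reasoning on the cost variable at the source level therefore bounds the object code execution cost, provided the constants are sound.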
In CerCo we have developed several cost preserving compilers based on the labeling approach. Excluding an initial certified compiler for a `while' language, all remaining compilers target realistic source languages---a pure higher-order functional language and a large subset of C with pointers, gotos and all data structures---and real world target processors---MIPS and the Intel 8051/8052 processor family. Moreover, they achieve a level of optimisation that ranges from moderate (comparable to GCC level 1) to intermediate (including loop peeling and unrolling, hoisting and late constant propagation). We describe the C compilers in detail in the following section.

Two compilation chains were implemented for a purely functional higher-order language~\cite{labeling2}. The two main changes required to deal with functional languages are: 1) because global variables and updates are not available, the instrumentation phase produces monadic code to `update' the global costs; 2) the requirements for a sound and precise labeling of the source code must be changed when the compilation is based on CPS translations. In particular, we need to introduce labels emitted before a statement is executed and also labels emitted after a statement is executed. The latter capture code that is inserted by the CPS translation and that would escape all label scopes.
% Brian: one of the reviewers pointed out that standard Prolog implementations
% non-backtrackable state could support our labeling approach.

\subsection{The CerCo C compilers}

We implemented two C compilers, one implemented directly in OCaml and the other implemented in the interactive theorem prover Matita~\cite{matita}. The first acted as a prototype for the second, but also supported MIPS and acted as a testbed for more advanced features such as the dependent labeling approach in Section~\ref{lab:deplabel}. The second C compiler is the \emph{Trusted CerCo Compiler}, whose cost predictions are formally verified; the code distributed is OCaml code extracted from the Matita implementation. In the rest of this section we will focus only on the Trusted CerCo Compiler, which targets only the C language and the 8051/8052 family, and which does not implement any advanced optimisations yet. Its user interface, however, is the same as that of the other version, so that the user can trade safety for additional performance.
In particular, the Frama-C CerCo plugin works without recompilation with both of our C compilers. The 8051/8052 microprocessor is a very simple one, with constant-cost instructions.

\subsection{Formal certification of the CerCo compiler}

We have formally certified in the Matita interactive proof assistant that the cost models induced on the source code by the Trusted CerCo Compiler correctly and precisely predict the object code behaviour. There are two cost models, one for execution time and one for stack space consumption. We show the correctness of the prediction only for those programs that do not exhaust the available machine stack space.

\subsection{Dependent labeling}
\label{lab:deplabel}

The core idea of the basic labeling approach is to establish a tight connection between basic blocks executed in the source and target languages. Once the connection is established, any cost model computed on the object code can be transferred to the source code, without affecting the code of the compiler or its proof. In particular, it is immediate that we can also transport cost models that associate to each label functions from hardware state to natural numbers. However, a problem arises during the instrumentation phase that replaces cost emission statements with increments of global cost variables. The global cost variable must be incremented with the result of applying the function associated to the label to the hardware state at the time of execution of the block.
The hardware state comprises both the functional state that affects the computation (the value of the registers and memory) and the non-functional state that does not (the pipeline and cache contents, for example). The former is in correspondence with the source code state, but reconstructing the correspondence may be hard, and lifting the cost model to work on the source code state is likely to produce cost expressions that are too complex to understand and reason about. Fortunately, in all modern architectures the cost of executing single instructions is either independent of the functional state or the jitter---the difference between the worst and best case execution times---is small enough to be bounded without losing too much precision. Therefore we only consider dependencies on the `non-functional' parts of the state. The non-functional state has no correspondence in the high level state and does not influence the functional properties. What can be done is to expose the non-functional state in the source code. We present here the basic intuition in a simplified form: the technical details that allow us to handle the general case are more complex and can be found in~\cite{paolo}. We add to the source code an additional global variable that represents the non-functional state and another one that remembers the last labels emitted. The state variable must be updated at every label emission statement, using an update function which is also computed during static analysis. The update function associates to each label a function from the recently emitted labels and the old state to the new state. It is computed by composing the semantics of every instruction in a basic block and restricting it to the non-functional part of the state. Not all the details of the non-functional state need to be exposed, and the technique works better when the part of the state that is required can be summarized in a simple data structure.
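As a sketch, the instrumented source then looks as follows. All names and numbers are hypothetical: the state is abstracted as a small opaque integer, and the cost and update functions stand in for those produced by static analysis.

```c
#include <assert.h>

/* Dependent-labeling sketch. The non-functional state is summarised
   by a small integer; `__state`, `cost_L1` and `next_L1` are
   hypothetical names and the numbers are purely illustrative. */

unsigned long __cost  = 0;   /* global cost variable */
unsigned int  __state = 0;   /* abstract non-functional state */

/* cost of the block labelled L1, now a function of the state */
static unsigned long cost_L1(unsigned int s) { return 20 + s; }

/* update function for L1, computed by static analysis */
static unsigned int next_L1(unsigned int s) { return s > 0 ? s - 1 : 2; }

void block_L1(void) {
  /* at the emission point of label L1 the instrumentation now
     consults and updates the state instead of adding a constant */
  __cost += cost_L1(__state);
  __state = next_L1(__state);
  /* ... functional code of the block ... */
}
```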
For example, to handle simple but realistic pipelines it is sufficient to remember a short integer that encodes the position of bubbles (stuck instructions) in the pipeline. In any case, it is not necessary for the user to understand the meaning of the state to reason about the properties of the program. Moreover, at any moment the user, or the invariant generator tools that analyse the instrumented source code produced by the compiler, can decide to trade precision of the analysis for simplicity by approximating the parametric cost with safe non-parametric bounds. Interestingly, the functional analysis of the code could determine which blocks are executed more frequently in order to use more aggressive approximations for those that are executed least. Dependent labeling can also be applied to allow the compiler to duplicate blocks that contain labels (e.g.\ in loop optimisations)~\cite{paolo}. The effect is to assign a different cost to the different occurrences of a duplicated label. For example, loop peeling turns a loop into the concatenation of a copy of the loop body for the first iteration and the conditional execution of the loop for successive iterations. Further optimisations will compile the two copies of the loop differently, with the first body usually taking more time. By introducing a variable that keeps track of the iteration number, we can associate to the label a cost that is a function of the iteration number. The same technique works for loop unrolling without modification: the function will assign one cost to the even iterations and another cost to the odd ones. The actual work to be done consists of introducing within the source code, and for each loop, a variable that counts the number of iterations. The loop optimisation code that duplicates the loop bodies must also modify the code to correctly propagate the update of the iteration numbers. The technical details are more complex and can be found in the CerCo reports and publications.
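The loop-peeling case can be pictured as follows. This is a hand-written sketch, not compiler output: the cost function and its values are invented, and the loop index doubles as the iteration counter that the optimisation would introduce.

```c
#include <assert.h>

/* Iteration-dependent cost after loop peeling: the duplicated label L
   costs more on the (peeled) first iteration than in steady state.
   All names and numbers are illustrative. */

unsigned long __cost = 0;

static unsigned long cost_L(unsigned int iter) {
  return iter == 0 ? 50 : 30;   /* peeled first iteration vs. the rest */
}

unsigned long sum(const unsigned int *a, unsigned int n) {
  unsigned long s = 0;
  for (unsigned int i = 0; i < n; i++) {  /* i doubles as iteration counter */
    __cost += cost_L(i);                  /* label L, duplicated by peeling */
    s += a[i];
  }
  return s;
}
```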
The implementation, however, is quite simple (and forms part of our OCaml version of the compiler) and the changes to a loop optimising compiler are minimal.

\subsection{Techniques to exploit the induced cost model}

We now turn our attention to synthesising high-level costs, such as the execution time of a whole program. We consider as our starting point the source level costs provided by basic labeling, that is, annotations on the source code which are constants providing a sound and sufficiently precise upper bound on the cost of executing the blocks after compilation to object code. The principle that we have followed in designing the cost synthesis tools is that the synthetic bounds should be expressed and proved within a general purpose tool built to reason on the source code. In particular, we rely on the Frama-C tool to reason on C code and on the Coq proof-assistant to reason on higher-order functional programs. This principle entails that: 1) the inferred synthetic bounds are indeed correct as long as the general purpose tool is; 2) there is no limitation on the class of programs that can be handled, as long as the user is willing to carry out an interactive proof. Of course, automation is desirable whenever possible. Within this framework, automation means writing programs that give hints to the general purpose tool. These hints may take the form, say, of loop invariants/variants, of predicates describing the structure of the heap, or of types in a light logic. If these hints are correct and sufficiently precise, the general purpose tool will produce a proof automatically; otherwise, user interaction is required.

\paragraph{The Cost plugin and its application to the Lustre compiler.}
Frama-C~\cite{framac} is a set of analysers for C programs with a specification language called ACSL. New analyses can be dynamically added via a plugin system.
For instance, the Jessie plugin allows deductive verification of C programs with respect to their specification in ACSL, with various provers as back-end tools. We developed the CerCo Cost plugin for the Frama-C platform as a proof of concept of an automatic environment exploiting the cost annotations produced by the CerCo compiler. It consists of an OCaml program which essentially takes the following actions: 1) it receives as input a C program; 2) it applies the CerCo compiler to produce a related C program with cost annotations; 3) it applies some heuristics to produce a tentative bound on the cost of executing the C functions of the program as a function of the value of their parameters; 4) the user can then call the Jessie plugin to discharge the related proof obligations. In the following we elaborate on the soundness of the framework and the experiments we performed with the Cost tool on the C programs produced by a Lustre compiler.

\paragraph{Soundness.}
The soundness of the whole framework depends on the cost annotations added by the CerCo compiler, the synthetic costs produced by the cost plugin, the verification conditions (VCs) generated by Jessie, and the external provers discharging the VCs. Since the synthetic costs are in ACSL format, Jessie can be used to verify them. Thus, even if the added synthetic costs are incorrect (relative to the cost annotations), the process as a whole is still correct: indeed, Jessie will not validate incorrect costs and no conclusion can be made about the WCET of the program in this case. In other terms, the soundness does not really depend on the action of the cost plugin, which can in principle produce any synthetic cost. However, in order to be able to actually prove a WCET of a C function, we need to add correct annotations in such a way that Jessie and subsequent automatic provers have enough information to deduce their validity.
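Concretely, a synthetic cost can take the shape of an ACSL post-condition on the cost variable. The following sketch is hypothetical (the function, the block costs and the bound are invented); the bound is the maximum over the two branches, which is what Jessie would be asked to validate:

```c
#include <assert.h>

unsigned long __cost = 0;    /* cost variable updated by the instrumentation */

/*@ ensures __cost <= \old(__cost) + 113; */   /* hypothetical synthetic bound */
int max_of(int a, int b) {
  __cost += 47;              /* entry block */
  if (a < b) {
    __cost += 66;            /* taken branch: 47 + 66 = 113 */
    return b;
  }
  __cost += 52;              /* fall-through: 47 + 52 = 99 */
  return a;
}
```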
In practice this is not straightforward even for very simple programs composed of branching and assignments (no loops and no recursion) because a fine analysis of the VCs associated with branching may lead to a complexity blow up.

\paragraph{Experience with Lustre.}
Lustre is a data-flow language for programming synchronous systems, with a compiler which targets C. We designed a wrapper for supporting Lustre files. The C function produced by the compiler implements the step function of the synchronous system, and computing the WCET of the function amounts to obtaining a bound on the reaction time of the system. We tested the Cost plugin and the Lustre wrapper on the C programs generated by the Lustre compiler. For programs consisting of a few hundred lines of code, the cost plugin computes a WCET and Alt-Ergo is able to discharge all VCs automatically.

\paragraph{Handling C programs with simple loops.}
The cost annotations added by the CerCo compiler take the form of C instructions that update a fresh global variable called the cost variable by a constant. Synthesizing a WCET of a C function thus consists of statically resolving an upper bound on the difference between the value of the cost variable before and after the execution of the function, i.e.\ finding the instructions that update the cost variable and establishing the number of times they are passed through during the flow of execution. To perform the analysis the plugin makes the following assumptions on the programs: 1) there are no recursive functions; 2) every loop must be annotated with a variant. The variants of `for' loops are automatically inferred. The plugin proceeds as follows. First the call graph of the program is computed. Then the cost of each function is computed by traversing its control flow graph. If the function $f$ calls the function $g$ then the function $g$ is processed before the function $f$. The cost at a node is the maximum of the costs of the successors.
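These traversal rules can be pictured with plain arithmetic (the constants are invented): sequential composition adds costs, a branch contributes the maximum of its arms, and, as detailed next, a loop with a constant-cost body contributes the body cost multiplied by the variant.

```c
#include <assert.h>

static unsigned long max_ul(unsigned long a, unsigned long b) {
  return a > b ? a : b;
}

/* WCET bound of a hypothetical function made of an entry block,
   a two-armed branch and a loop whose variant is n. */
unsigned long wcet_bound(unsigned long n) {
  unsigned long entry  = 10;             /* cost of the entry block     */
  unsigned long branch = max_ul(25, 40); /* branch: max of the two arms */
  unsigned long loop   = n * 15;         /* loop: variant * body cost   */
  return entry + branch + loop;          /* sequence: sum of the parts  */
}
```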
In the case of a loop with a body that has a constant cost for every step of the loop, the cost is the product of the cost of the body and of the variant taken at the start of the loop. In the case of a loop with a body whose cost depends on the values of some free variables, a fresh logic function $f$ is introduced to represent the cost of the loop in the logic assertions. This logic function takes the variant as a first parameter. The other parameters of $f$ are the free variables of the body of the loop. An axiom is added to account for the fact that the cost is accumulated at each step of the loop. The cost of the function is directly added as a post-condition of the function. The user can influence the annotation by two different means: 1) by using more precise variants; 2) by annotating functions with cost specifications, in which case the plugin uses the given cost for the function instead of computing it.

\paragraph{C programs with pointers.}
When it comes to verifying programs involving pointer-based data structures, such as linked lists, trees, or graphs, the use of traditional first-order logic to specify, and of SMT solvers to verify, shows some limitations. Separation logic~\cite{separation} is an elegant alternative: a program logic with a new notion of conjunction to express spatial heap separation. Bobot has recently introduced automatically generated separation predicates to simulate separation logic reasoning in the Jessie plugin, where the specification language, the verification condition generator, and the theorem provers were not designed with separation logic in mind. CerCo's plugin can exploit the separation predicates to automatically reason on the cost of execution of simple heap manipulation programs such as an in-place list reversal.

\section{Conclusions and future work}