Changeset 3314


Ignore:
Timestamp:
Jun 5, 2013, 6:09:01 PM (4 years ago)
Author:
sacerdot
Message:

Text shortened.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • Papers/fopara2013/fopara13.tex

    r3313 r3314  
    102102
    103103The second reason to perform the analysis on the object code is that bounding the worst case execution time of small code fragments in isolation (e.g. loop bodies) and then adding up the bounds yields very poor estimations because no knowledge on the hardware state can be assumed when executing the fragment. By analysing longer runs (e.g. by full unrolling loops) the bound obtained becomes more precise because the lack of knowledge on the initial state has less effects on longer computations.
     104
    104105In CerCo we propose a radically new approach to the problem: we reject the idea of a uniform cost model and we propose that the compiler, which knows how the code is translated, must return the cost model for basic blocks of high level instructions. It must do so by keeping track of the control flow modifications to reverse them and by interfacing with static analysers.
    105106
     
    129130\begin{enumerate}
    130131\item \emph{The (basic) labelling approach.} It is the main technique that underlies all the developments in CerCo. It allows to track the evolution of basic blocks during compilation in order to propagate the cost model from the object code to the source code without loosing precision in the process.
    131 \item \emph{Dependent labelling.} The basic labelling approach assumes a bijective mapping between object code and source code O(1) blocks (called basic blocks). This assumption is violated by many program optimizations (e.g. loop peeling and loop unrolling). It also assumes the cost model computed on the object code to be non parametric: every block must be assigned a cost that does not depend on the state. This assumption is violated by stateful hardware like pipelines, caches, branch prediction units. The dependent labelling approach is an extension of the basic labelling approach that allows to handle parametric cost models. We showed how to the method allows to deal with loop optimizations and pipelines, and we speculated about its applications to caches.
     132\item \emph{Dependent labelling.} The basic labelling approach assumes a bijective mapping between object code and source code O(1) blocks (called basic blocks). This assumption is violated by many program optimizations (e.g. loop peeling and loop unrolling). It also assumes the cost model computed on the object code to be non parametric: every block must be assigned a cost that does not depend on the state. This assumption is violated by stateful hardware like pipelines, caches, branch prediction units. The dependent labelling approach is an extension of the basic labelling approach that allows to handle parametric cost models. We showed how the method allows to deal with loop optimizations and pipelines, and we speculated about its applications to caches.
    132133\item \emph{Techniques to exploit the induced cost model.} Every technique used for the analysis of functional properties of programs can be adapted to analyse the non functional properties of the source code instrumented by compilers that implement the labelling approach. In order to gain confidence in this claim, we showed how to implement a cost invariant generator combining abstract interpretation with separation logic ideas. We integrated everything in the Frama-C modular architecture, in order to automatically compute proof obligations from the functional and the cost invariants and to use automatic theorem provers to proof them. This is an example of a new technique that is not currently exploited in WCET analysis. It also shows how precise functional invariants benefits the non functional analysis too. Finally, we show how to fully automatically analyse the reaction time of Lustre nodes that are first compiled to C using a standard Lustre compiler and then processed by a C compiler that implements the labelling approach.
    133134\item \emph{The CerCo compiler.} This is a compiler from a large subset of the C program to 8051/8052 object code. The compiler implements the labelling approach and integrates a static analyser for 8051 executables. The latter can be implemented easily and does not require dependent costs because the 8051 microprocessor is a very simple processor whose instructions generally have a constant cost. It was picked to separate the issue of exact propagation of the cost model from the target to the source language from the orthogonal problem of the static analysis of object code that requires approximations or dependent costs. The compiler comes in several versions: some are prototypes implemented directly in OCaml, and they implement both the basic and dependent labelling approaches; the final version is extracted from a Matita certification and at the moment implements only the basic approach.
     
    161162Not all the details of the non functional state needs to be exposed, and the technique works better when the part of state that is required can be summarized in a simple data structure. For example, to handle simple but realistic pipelines it is sufficient to remember a short integer that encodes the position of bubbles (stuck instructions) in the pipeline. In any case, the user does not need to understand the meaning of the state to reason over the properties of the program. Moreover, at any moment the user or the invariant generator tools that analyse the instrumented source code produced by the compiler can decide to trade precision of the analysis with simplicity by approximating the parametric cost with safe non parametric bounds. Interestingly, the functional analysis of the code can determine which blocks are executed more frequently in order to approximate more aggressively the ones that are executed less.
    162163
    163 The idea of dependent labelling can also be applied to allow the compiler to duplicate blocks that contain labels (e.g. to allow loop optimizations). The effect of duplication is to assign a different cost to the different occurrences of a duplicated label. For example, loop peeling turns a loop into the concatenation of a copy the loop body (that executes the first iteration) with the conditional execution of the loop (for the successive iterations). Because of further optimizations, the two copies of the loop will be compiled differently, with the first body usually taking more time.
     164The idea of dependent labelling can also be applied to allow the compiler to duplicate blocks that contain labels (e.g. to allow loop optimizations). The effect of duplication is to assign a different cost to the different occurrences of a duplicated label. For example, loop peeling turns a loop into the concatenation of a copy of the loop body (that executes the first iteration) with the conditional execution of the loop (for the successive iterations). Because of further optimizations, the two copies of the loop will be compiled differently, with the first body usually taking more time.
    164165
    165166By introducing a variable that keep tracks of the iteration number, we can associate to the label a cost that is a function of the iteration number. The same technique works for loop unrolling without modifications: the function will assign a cost to the even iterations and another cost to the odd ones. The actual work to be done consists in introducing in the source code and for each loop a variable that counts the number of iterations. The loop optimization code that duplicate the loop bodies must also modify the code to propagate correctly the update of the iteration numbers. The technical details are more complex and can be found in the CerCo reports and publications. The implementation, however, is quite simple and the changes to a loop optimizing compiler are minimal.
     
    203204by annotating function with cost specification. The plugin will use this cost for the function instead of computing it.
    204205\paragraph{C programs with pointers}
    205 When it comes to verifying programs involving pointer-based data structures, such as linked lists, trees, or graphs, the use of traditional first-order logic to specify, and of SMT solvers to verify, shows some limitations. Separation logic is then an elegant alternative. Designed at the turn of the century, it is a program logic with a new notion of conjunction to express spatial heap separation. Separation logic has been implemented in dedicated theorem provers such as Smallfoot or VeriFast. One drawback of such provers, however, is to either limit the expressiveness of formulas (e.g. to the so-called symbolic heaps), or to require some user-guidance (e.g. open/close commands in VeriFast).
    206 In an attempt to conciliate both approaches, we introduced the notion of separation predicates. The approach consists in reformulating some ideas from separation logic into a traditional verification framework where the specification language, the verification condition generator, and the theorem provers were not designed with separation logic in mind. Separation predicates are automatically derived from user-defined inductive predicates, on demand. Then they can be used in program annotations, exactly as other predicates, i.e., without any constraint. Simply speaking, where one would write P ∗ Q in separation logic, one will here ask for the generation of a separation predicate sep and then use it as P ∧ Q ∧ sep(P, Q). We have implemented separation predicates within the Jessie plug-in and tested it on non-trivial case studies (e.g. the composite pattern from the VACID-0 benchmark). In these cases, we achieve a fully automatic proof using three existing SMT solver.
    207 We have also used the separation predicates to reason on the cost of executing simple heap manipulating programs such as an in-place list reversal.
     206When it comes to verifying programs involving pointer-based data structures, such as linked lists, trees, or graphs, the use of traditional first-order logic to specify, and of SMT solvers to verify, shows some limitations. Separation logic is then an elegant alternative. Designed at the turn of the century, it is a program logic with a new notion of conjunction to express spatial heap separation.Bobot has recently introduced in the Jessie plug-in automatically generated separation predicates that allow to simulate separation logic reasoning in a traditional verification framework where the specification language, the verification condition generator, and the theorem provers were not designed with separation logic in mind. CerCo's plug-in can exploit the separation predicates to automatically reason on the cost of execution of simple heap manipulation programs such as an in-place list reversal.
    208207
    209208\subsection{The CerCo Compiler}
Note: See TracChangeset for help on using the changeset viewer.