# Changeset 3323

Ignore:
Timestamp:
Jun 6, 2013, 1:50:02 PM (5 years ago)
Message:

citazioni

File:
1 edited

Unmodified
Removed
• ## Papers/fopara2013/fopara13.tex

 r3322 What has previously prevented this approach is lack of a uniform and precise cost model for high-level code: 1) each statement occurrence is compiled differently and optimizations may change control flow; 2) the cost of an object code instruction may depend on the runtime state of hardware components like pipelines and caches, which is not visible in the source code. To solve the issue, we envision a new generation of compilers able to keep track of program structure through compilation and optimisation, and able to exploit that information to define a cost model for source code that is precise, non-uniform, and accounts for runtime state. With such a source-level cost model we can reduce non-functional verification to the functional case and exploit the state of the art in automated high-level verification. The techniques previously used by WCET analysers on the object code are still available, but can now be coupled with additional source-level analysis. Where the approach produces precise cost models too complex to reason about, safe approximations can be used to trade complexity with precision. Finally, analysis on source code can be performed even during early development stages, when components have been specified but not yet implemented: source code modularity means that it is enough to specify the non-functional behavior of unimplemented components. \paragraph{Contributions:} We have developed a technique, the labelling approach, to implement compilers that induce cost models on source programs by very lightweight tracking of code changes through compilation. We have studied how to formally prove the correctness of compilers implementing the technique. We have implemented such a compiler from C to object binaries for the 8051 microcontroller, and verified it in an interactive theorem prover. We have implemented a Frama-C plug-in that invokes the compiler on a source program and uses this to generate invariants on the high-level source that correctly model low-level costs. Finally, the plug-in certifies that the program respects these costs by calling automated theorem provers, a new and innovative technique in the field of cost analysis. As a case study, we show how the plug-in can automatically compute and certify the exact reaction time of Lustre dataflow programs compiled into C. To solve the issue, we envision a new generation of compilers able to keep track of program structure through compilation and optimisation, and able to exploit that information to define a cost model for source code that is precise, non-uniform, and accounts for runtime state. With such a source-level cost model we can reduce non-functional verification to the functional case and exploit the state of the art in automated high-level verification \cite{????}. The techniques previously used by Worst Case Execution Time (WCET) analysers on the object code are still available, but can now be coupled with additional source-level analysis. Where the approach produces precise cost models too complex to reason about, safe approximations can be used to trade complexity with precision. Finally, analysis on source code can be performed even during early development stages, when components have been specified but not yet implemented: source code modularity means that it is enough to specify the non-functional behavior of unimplemented components. \paragraph{Contributions:} We have developed a technique, the labelling approach, to implement compilers that induce cost models on source programs by very lightweight tracking of code changes through compilation. We have studied how to formally prove the correctness of compilers implementing the technique. We have implemented such a compiler from C to object binaries for the 8051 microcontroller, and verified it in an interactive theorem prover. We have implemented a Frama-C plug-in \cite{???} that invokes the compiler on a source program and uses this to generate invariants on the high-level source that correctly model low-level costs. Finally, the plug-in certifies that the program respects these costs by calling automated theorem provers, a new and innovative technique in the field of cost analysis. As a case study, we show how the plug-in can automatically compute and certify the exact reaction time of Lustre \cite{????} dataflow programs compiled into C. \section{Project context and objectives} The scenario for the verification of non functional properties (time spent, memory used, energy consumed) is more bleak and the future seems to be getting even worse. Most industries verify that real time systems meets their deadlines simply measuring the time spent in many runs of the systems,  computing the maximum time and adding an empirical safety margin, claiming the result to be a bound for the Worst Case Execution Time of the program. Formal methods and software to statically analyse the WCET of programs exist, but they often produce bounds that are too pessimistic to be useful. Recent advancements in hardware architectures is all focused on the improvement of the average case performance, not the predictability of the worst case. Execution time is getting more and more dependent from the execution history, that determines the internal state of hardware components like pipelines and caches. Multi-core processors and non uniform memory models are drastically reducing the possibility of performing static analysis in isolation, because programs are less and less time composable. Clock precise hardware models are necessary to static analysis, and getting them is becoming harder as a consequence of the increased hardware complexity. Despite the latter scenario, the need for reliable real time systems and programs is increasing, and there is an increasing pressure from the research community towards the differentiation of hardware. The aim is the introduction of alternative hardware whose behavior would be more predictable and more suitable to be statically analysed, for example decoupling execution time from the execution history by introducing randomization. Despite the latter scenario, the need for reliable real time systems and programs is increasing, and there is an increasing pressure from the research community towards the differentiation of hardware. The aim is the introduction of alternative hardware whose behavior would be more predictable and more suitable to be statically analysed, for example decoupling execution time from the execution history by introducing randomization \cite{?????}. In the CerCo project we do not try to address this problem, optimistically assuming that static analysis of non functional properties of programs will return to be feasible in the long term. The main objective of our work is instead to bring together static analysis of functional and non functional properties, which, according to the current state of the art, are completely independent activities with limited exchange of information: while the functional properties are verified on the source code, the analysis of non functional properties is entirely performed on the object code to exploit clock precise hardware models. There are two main reasons to currently perform the analysis on the object code. The first one is the obvious lack of a uniform, precise cost model for source code instructions (or even basic blocks). During compilation, high level instructions are torn apart and reassembled in context specific ways so that there is no way to identify a fragment of object code with a single high level instruction. Even the control flow of the object and source code can be very different as a result of optimizations. For instance, loop optimizations reduce the number or the order of the iterations of loops, and may assign different object code, and thus different costs, to different iterations.  Despite the lack of a uniform, compilation and program independent cost model on the source language, the literature on the analysis of non asymptotic execution time on high level languages that assumes such a model is growing and getting momentum. Its practical usefulness is doomed to be minimal, unless we can provide a replacement for such cost models. Some hope has been provided by the EmBounded project, which compositionally compiles high level code to a byte code that is executed by an emulator with guarantees on the maximal execution time spent for each byte code instruction. The approach indeed provides a uniform model, at the price of loosing precision of the model (each cost is a pessimistic upper bound) and performance of the executed code (because the byte code is emulated compositionally instead of performing a fully non compositional compilation). There are two main reasons to currently perform the analysis on the object code. The first one is the obvious lack of a uniform, precise cost model for source code instructions (or even basic blocks). During compilation, high level instructions are torn apart and reassembled in context specific ways so that there is no way to identify a fragment of object code with a single high level instruction. Even the control flow of the object and source code can be very different as a result of optimizations. For instance, loop optimizations reduce the number or the order of the iterations of loops, and may assign different object code, and thus different costs, to different iterations.  Despite the lack of a uniform, compilation and program independent cost model on the source language, the literature on the analysis of non asymptotic execution time on high level languages that assumes such a model is growing and getting momentum. Its practical usefulness is doomed to be minimal, unless we can provide a replacement for such cost models. Some hope has been provided by the EmBounded project \cite{???}, which compositionally compiles high level code to a byte code that is executed by an emulator with guarantees on the maximal execution time spent for each byte code instruction. The approach indeed provides a uniform model, at the price of loosing precision of the model (each cost is a pessimistic upper bound) and performance of the executed code (because the byte code is emulated compositionally instead of performing a fully non compositional compilation). The second reason to perform the analysis on the object code is that bounding the worst case execution time of small code fragments in isolation (e.g. loop bodies) and then adding up the bounds yields very poor estimations because no knowledge on the hardware state can be assumed when executing the fragment. By analysing longer runs the bound obtained becomes more precise because the lack of knowledge on the initial state has less effects on longer computations. By embracing compilation, instead of avoiding it like EmBounded did, a CerCo compiler can at the same time produce efficient code and return costs that are as precise as the static analysis can be. Moreover, we allow our costs to be parametric: the cost of a block can depend on actual program data or on a summary of the execution history or on an approximated representation of the hardware state. For example, loop optimizations assign to a loop body a cost that is a function of the number of iterations performed. For another example, the cost of a loop body may be a function of the vector of stalled pipeline states, which can be exposed in the source code and updated at each basic block exit. It is parametricity that allows to analyse small code fragments without loosing precision: in the analysis of the code fragment we do not have to be ignorant on the initial hardware state. On the contrary, we can assume to know exactly which state (or mode, as WCET literature calls it) we are in. The cost of an execution is always the sum of the cost of basic blocks multiplied by the number of times they are executed, which is a functional property of the program. Therefore, in order to perform (parametric) time analysis of programs, it is necessary to combine a cost model with control and data flow analysis. Current state of the art WCET technology performs the analysis on the object code, where the logic of the program is harder to reconstruct and most information available on the source code (e.g. types) has been lost. Imprecision in the analysis leads to useless bounds. To augment precision, the tools ask the user to provide constraints on the object code control flow, usually in the form of bounds on the number of iterations of loops or linear inequalities on them. This requires the user to manually link the source and object code, translating his often wrong assumptions on the source code to object code constraints. The task is error prone and, in presence of complex optimizations, may be very hard if not impossible. The cost of an execution is always the sum of the cost of basic blocks multiplied by the number of times they are executed, which is a functional property of the program. Therefore, in order to perform (parametric) time analysis of programs, it is necessary to combine a cost model with control and data flow analysis. Current state of the art WCET technology performs the analysis on the object code, where the logic of the program is harder to reconstruct and most information available on the source code (e.g. types) has been lost. Imprecision in the analysis leads to useless bounds. To augment precision, the tools ask the user to provide constraints on the object code control flow, usually in the form of bounds on the number of iterations of loops or linear inequalities on them. This requires the user to manually link the source and object code, translating his often wrong assumptions on the source code to object code constraints. The task is error prone and, in presence of complex optimizations, may be very hard if not impossible \cite{????}. The CerCo approach has the potentiality to dramatically improve the state of the art. By performing control and data flow analysis on the source code, the error prone translation of invariants is completely avoided. It is in fact performed by the compiler itself when it induces the cost model on the source language. Moreover, any available technique for the verification of functional properties can be immediately reused and multiple techniques can collaborate together to infer and certify cost invariants for the program. Parametric cost analysis becomes the default one, with non parametric bounds used as last resorts when trading the complexity of the analysis with its precision. A priori, no technique previously used in traditional WCET is lost: they can just be applied on the source code. \emph{Dependent labelling.} The basic labelling approach assumes a bijective mapping between object code and source code O(1) blocks (called basic blocks). This assumption is violated by many program optimizations (e.g. loop peeling and loop unrolling). It also assumes the cost model computed on the object code to be non parametric: every block must be assigned a cost that does not depend on the state. This assumption is violated by stateful hardware like pipelines, caches, branch prediction units. The dependent labelling approach is an extension of the basic labelling approach that allows to handle parametric cost models. We showed how the method allows to deal with loop optimizations and pipelines, and we speculated about its applications to caches. \emph{Techniques to exploit the induced cost model.} Every technique used for the analysis of functional properties of programs can be adapted to analyse the non functional properties of the source code instrumented by compilers that implement the labelling approach. In order to gain confidence in this claim, we showed how to implement a cost invariant generator combining abstract interpretation with separation logic ideas. We integrated everything in the Frama-C modular architecture, in order to automatically compute proof obligations from the functional and the cost invariants and to use automatic theorem provers to proof them. This is an example of a new technique that is not currently exploited in WCET analysis. It also shows how precise functional invariants benefits the non functional analysis too. Finally, we show how to fully automatically analyse the reaction time of Lustre nodes that are first compiled to C using a standard Lustre compiler and then processed by a C compiler that implements the labelling approach. \emph{Techniques to exploit the induced cost model.} Every technique used for the analysis of functional properties of programs can be adapted to analyse the non functional properties of the source code instrumented by compilers that implement the labelling approach. In order to gain confidence in this claim, we showed how to implement a cost invariant generator combining abstract interpretation with separation logic ideas \cite{???}. We integrated everything in the Frama-C modular architecture, in order to automatically compute proof obligations from the functional and the cost invariants and to use automatic theorem provers to proof them. This is an example of a new technique that is not currently exploited in WCET analysis. It also shows how precise functional invariants benefits the non functional analysis too. Finally, we show how to fully automatically analyse the reaction time of Lustre nodes that are first compiled to C using a standard Lustre compiler and then processed by a C compiler that implements the labelling approach. \emph{The CerCo compiler.} This is a compiler from a large subset of the C program to 8051/8052 object code. The compiler implements the labelling approach and integrates a static analyser for 8051 executables. The latter can be implemented easily and does not require dependent costs because the 8051 microprocessor is a very simple processor whose instructions generally have a constant cost. It was picked to separate the issue of exact propagation of the cost model from the target to the source language from the orthogonal problem of the static analysis of object code that requires approximations or dependent costs. The compiler comes in several versions: some are prototypes implemented directly in OCaml, and they implement both the basic and dependent labelling approaches; the final version is extracted from a Matita certification and at the moment implements only the basic approach. \subsection{The (basic) labelling approach.} \paragraph{Problem statement:} given a source program P, we want to obtain an instrumented source program P',  written in the same programming language, and the object code O such that: 1) P' is obtained by inserting into P some additional instructions to update global cost information like the amount of time spent during execution or the maximal stack space required; 2) P and P' must have the same functional behavior, i.e., they must produce that same output and intermediate observables; 3) P and O must have the same functional behavior; 4) after execution and in interesting points during execution, the cost information computed by P' must be an upper bound of the one spent by O  to perform the corresponding operations (soundness property); 5) the difference between the costs computed by P' and the execution costs of O must be bounded by a program dependent constant (precision property). \paragraph{Problem statement:} given a source program $P$, we want to obtain an instrumented source program $P'$,  written in the same programming language, and the object code $O$ such that: 1) $P'$ is obtained by inserting into $P$ some additional instructions to update global cost information like the amount of time spent during execution or the maximal stack space required; 2) $P$ and $P'$ must have the same functional behavior, i.e., they must produce that same output and intermediate observables; 3) $P$ and $O$ must have the same functional behavior; 4) after execution and in interesting points during execution, the cost information computed by $P'$ must be an upper bound of the one spent by $O$  to perform the corresponding operations (soundness property); 5) the difference between the costs computed by $P'$ and the execution costs of $O$ must be bounded by a program dependent constant (precision property). \paragraph{The labeling software components:} we solve the problem in four stages, implemented by four software components that are used in sequence. The first component labels the source program P by injecting label emission statements in appropriate positions to mark the beginning of basic blocks. The set of labels with their positions is called labelling. The syntax and semantics of the source programming language is augmented with label emission statements. The statement “EMIT l” behaves like a NOP instruction that does not affect the program state or control flow, but it changes the semantics by making the label l observable. Therefore the observables of a run of a program becomes a stream of label emissions: l1 … ln, called the program trace. We clarify the conditions that the labelling must respect later. The first component labels the source program $P$ by injecting label emission statements in appropriate positions to mark the beginning of basic blocks. The set of labels with their positions is called labelling. The syntax and semantics of the source programming language is augmented with label emission statements. The statement “EMIT l” behaves like a NOP instruction that does not affect the program state or control flow, but it changes the semantics by making the label l observable. Therefore the observables of a run of a program becomes a stream of label emissions: l1 … ln, called the program trace. We clarify the conditions that the labelling must respect later. The second component is a labelling preserving compiler. It can be obtained from an existing compiler by adding label emission statements to every intermediate language and by propagating label emission statements during compilation. The compiler is correct if it preserves both the functional behavior of the program and the generated traces. We may also ask that the function that erases the cost emission statements commute with compilation. This optional property grants that the labelling does not interfere with the original compiler behavior. A further set of requirements will be added later. The fourth and last component takes in input the cost model computed at step 3 and the labelled code computed at step 1. It outputs a source program obtained by replacing each label emission statement with a statement that increments the global cost variable with the cost associated to the label by the cost model.  The obtained source code is the instrumented source code. \paragraph{Correctness:} Requirements 1, 2 and 3 of the program statement obviously hold, with 2 and 3 being a consequence of the definition of a correct labelling preserving compiler. It is also obvious that the value of the global cost variable of an instrumented source code is at any time equal to the sum of the costs of the labels emitted by the corresponding labelled code. Moreover, because the compiler preserves all traces, the sum of the costs of the labels emitted in the source and target labelled code are the same. Therefore, to satisfy the 4th requirement, we need to grant that the time taken to execute the object code is equal to the sum of the costs of the labels emitted by the object code. We collect all the necessary conditions for this to happen in the definition of a sound labelling: a) all loops must be broken by a cost emission statement;  b) all program instructions must be in the scope of some cost emission statement. To satisfy also the 5th requirement, additional requirements must be imposed on the object code labelling to avoid all uses of the maximum in the cost computation: the labelling is precise if every label is emitted at most once and both branches of a conditional jump start with a label emission statement. \paragraph{Correctness:} Requirements 1, 2 and 3 of the program statement obviously hold, with 2 and 3 being a consequence of the definition of a correct labelling preserving compiler. It is also obvious that the value of the global cost variable of an instrumented source code is at any time equal to the sum of the costs of the labels emitted by the corresponding labelled code. Moreover, because the compiler preserves all traces, the sum of the costs of the labels emitted in the source and target labelled code are the same. Therefore, to satisfy the fourth requirement, we need to grant that the time taken to execute the object code is equal to the sum of the costs of the labels emitted by the object code. We collect all the necessary conditions for this to happen in the definition of a sound labelling: a) all loops must be broken by a cost emission statement;  b) all program instructions must be in the scope of some cost emission statement. To satisfy also the fifth requirement, additional requirements must be imposed on the object code labelling to avoid all uses of the maximum in the cost computation: the labelling is precise if every label is emitted at most once and both branches of a conditional jump start with a label emission statement. The correctness and precision of the labelling approach only rely on the correctness and precision of the object code labelling. The simplest, but not necessary, way to achieve them is to impose correctness and precision requirements also on the initial labelling produced by the first software component, and to ask the labelling preserving compiler to preserve these properties too. The latter requirement imposes serious limitations on the compilation strategy and optimizations: the compiler may not duplicate any code that contains label emission statements, like loop bodies. Therefore several loop optimizations like peeling or unrolling are prevented. Moreover, precision of the object code labelling is not sufficient per se to obtain global precision: we also implicitly assumed the static analysis to be able to associate a precise constant cost to every instruction. This is not possible in presence of stateful hardware whose state influences the cost of operations, like pipelines and caches. In the next section we will see an extension of the basic labelling approach to cover this situation. Dependent labelling can also be applied to allow the compiler to duplicate blocks that contain labels (e.g. in loop optimizations). The effect is to assign a different cost to the different occurrences of a duplicated label. For example, loop peeling turns a loop into the concatenation of a copy of the loop body (that executes the first iteration) with the conditional execution of the loop (for the successive iterations). Because of further optimizations, the two copies of the loop will be compiled differently, with the first body usually taking more time. By introducing a variable that keep tracks of the iteration number, we can associate to the label a cost that is a function of the iteration number. The same technique works for loop unrolling without modifications: the function will assign a cost to the even iterations and another cost to the odd ones. The actual work to be done consists in introducing in the source code and for each loop a variable that counts the number of iterations. The loop optimization code that duplicate the loop bodies must also modify the code to propagate correctly the update of the iteration numbers. The technical details are more complex and can be found in the CerCo reports and publications. The implementation, however, is quite simple and the changes to a loop optimizing compiler are minimal. By introducing a variable that keep tracks of the iteration number, we can associate to the label a cost that is a function of the iteration number. The same technique works for loop unrolling without modifications: the function will assign a cost to the even iterations and another cost to the odd ones. The actual work to be done consists in introducing in the source code and for each loop a variable that counts the number of iterations. The loop optimization code that duplicate the loop bodies must also modify the code to propagate correctly the update of the iteration numbers. The technical details are more complex and can be found in the CerCo reports and publications. The implementation, however, is quite simple and the changes to a loop optimizing compiler are minimal \cite{Paolo}. \subsection{Techniques to exploit the induced cost model.} \paragraph{The Cost plug-in and its application to the Lustre compiler} Frama − C is a set of analysers for C programs with a specification language called ACSL. New analyses can be dynamically added through a plug-in system. For instance, the Jessie plug-in allows deductive verification of C programs with respect to their specification in ACSL, with various provers as back-end tools. We developed the CerCo Cost plug-in for the Frama − C platform as a proof of concept of an automatic environment exploiting the cost annotations produced by the CerCo compiler. It consists of an OCaml program which in first approximation takes the following actions: (1) it receives as input a C program, (2) it applies the CerCo compiler to produce a related C program with cost annotations, (3) it applies some heuristics to produce a tentative bound on the cost of executing the C functions of the program as a function of the value of their parameters, (4) the user can then call the Jessie tool to discharge the related proof obligations. Frama − C \cite{???} is a set of analysers for C programs with a specification language called ACSL. New analyses can be dynamically added through a plug-in system. For instance, the Jessie plug-in allows deductive verification of C programs with respect to their specification in ACSL, with various provers as back-end tools. We developed the CerCo Cost plug-in for the Frama − C platform as a proof of concept of an automatic environment exploiting the cost annotations produced by the CerCo compiler. It consists of an OCaml program which in first approximation takes the following actions: (1) it receives as input a C program, (2) it applies the CerCo compiler to produce a related C program with cost annotations, (3) it applies some heuristics to produce a tentative bound on the cost of executing the C functions of the program as a function of the value of their parameters, (4) the user can then call the Jessie tool \cite{???} to discharge the related proof obligations. In the following we elaborate on the soundness of the framework and the experiments we performed with the Cost tool on the C programs produced by a Lustre compiler. by annotating function with cost specification. The plugin will use this cost for the function instead of computing it. \paragraph{C programs with pointers} When it comes to verifying programs involving pointer-based data structures, such as linked lists, trees, or graphs, the use of traditional first-order logic to specify, and of SMT solvers to verify, shows some limitations. Separation logic is then an elegant alternative. It is a program logic with a new notion of conjunction to express spatial heap separation. Bobot has recently introduced in the Jessie plug-in automatically generated separation predicates to simulate separation logic reasoning in a traditional verification framework where the specification language, the verification condition generator, and the theorem provers were not designed with separation logic in mind. CerCo's plug-in can exploit the separation predicates to automatically reason on the cost of execution of simple heap manipulation programs such as an in-place list reversal. When it comes to verifying programs involving pointer-based data structures, such as linked lists, trees, or graphs, the use of traditional first-order logic to specify, and of SMT solvers to verify, shows some limitations. Separation logic \cite{???} is then an elegant alternative. It is a program logic with a new notion of conjunction to express spatial heap separation. Bobot has recently introduced in the Jessie plug-in automatically generated separation predicates to simulate separation logic reasoning in a traditional verification framework where the specification language, the verification condition generator, and the theorem provers were not designed with separation logic in mind. CerCo's plug-in can exploit the separation predicates to automatically reason on the cost of execution of simple heap manipulation programs such as an in-place list reversal. \subsection{The CerCo Compiler} The (trusted) CerCo compiler implements the following optimizations: cast simplification, constant propagation in expressions, liveness analysis driven spilling of registers, dead code elimination, branch displacement, tunneling. The two latter optimizations are performed by our optimizing assembler. The back-end of the compiler works on three addresses instructions, preferred to static single assignment code for the simplicity of the formal certification. The CerCo compiler is loosely based on the CompCert compiler, a recently developed certified compiler from C to the PowerPC, ARM and x86 microprocessors. Contrarily to CompCert, both the CerCo code and its certification are open source. Some data structures and language definitions for the front-end are directly taken from CompCert, while the back-end is a redesign of a compiler from Pascal to MIPS used by Francois Pottier for a course at the Ecole Polytechnique. The CerCo compiler is loosely based on the CompCert compiler, a recently developed certified compiler from C to the PowerPC, ARM and x86 microprocessors. Contrarily to CompCert, both the CerCo code and its certification are open source. Some data structures and language definitions for the front-end are directly taken from CompCert, while the back-end is a redesign of a compiler from Pascal to MIPS used by Francois Pottier for a course at the Ecole Polytechnique \cite{????}. The main peculiarities of the CerCo compiler are:
Note: See TracChangeset for help on using the changeset viewer.