Changeset 3321


Timestamp: Jun 6, 2013, 12:16:12 PM (4 years ago)
Author: sacerdot
Message:

...

File: 1 edited

  • Papers/fopara2013/fopara13.tex

    r3320 r3321  
    325325
    326326\paragraph{The labelling software components:} We solve the problem in four stages, implemented by four software components that are used in sequence.
    327 \begin{enumerate}
    328 \item The first component labels the source program P by injecting label emission statements in the program in appropriate positions. The set of labels with their positions is called a labelling. The syntax and semantics of the source programming language are augmented with label emission statements. The statement “EMIT l” behaves like a NOP instruction that does not affect the program state or control flow, but it changes the semantics by making the label l observable. Therefore the observables of a run of a program become a stream of label emissions: l1 … ln, called the program trace. We clarify later the conditions that the labelling must respect.
    329 \item The second component is a labelling-preserving compiler. It can be obtained from an existing compiler by adding label emission statements to every intermediate language and by propagating label emission statements during compilation. The compiler is correct if it preserves both the functional behavior of the program and the generated traces. We may also ask that the function that erases the cost emission statements commute with compilation. This optional property guarantees that the labelling does not interfere with the original compiler behavior. A further set of requirements will be added later.
    330 \item The third component is a static object code analyser. It takes as input the object code augmented with label emission statements and computes, for every such statement, its scope. The scope of a label emission statement is the tree of instructions that may be executed after the statement and before a new label emission statement is found. It is a tree and not a sequence because the scope may contain a branching statement. In order to guarantee that such a finite tree exists, the object code must not contain any loop that is not broken by a label emission statement. This is the first requirement of a sound labelling. The analyser fails if the labelling is unsound. For each scope, the analyser computes an upper bound of the execution time required by the scope, using the maximum of the costs of the two branches in the case of a conditional statement. Finally, the analyser computes the cost of a label by taking the maximum of the costs of the scopes of every statement that emits that label.
    331 \item The fourth and last component takes as input the cost model computed at step 3 and the labelled code computed at step 1. It outputs a source program obtained by replacing each label emission statement with a statement that increments the global cost variable by the cost that the cost model associates with the label. The obtained source code is the instrumented source code.
    332 \end{enumerate}
     327
     328The first component labels the source program P by injecting label emission statements in appropriate positions to mark the beginning of basic blocks. The set of labels with their positions is called a labelling. The syntax and semantics of the source programming language are augmented with label emission statements. The statement “EMIT l” behaves like a NOP instruction that does not affect the program state or control flow, but it changes the semantics by making the label l observable. Therefore the observables of a run of a program become a stream of label emissions: l1 … ln, called the program trace. We clarify later the conditions that the labelling must respect.
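As a purely illustrative sketch (not the CerCo syntax), a labelled C fragment could be modelled as follows, with a hypothetical EMIT macro that makes the label observable while leaving the program state untouched:
\begin{verbatim}
#include <stdio.h>

/* hypothetical model of label emission: observable, state-preserving */
#define EMIT(l) printf("emit %s\n", #l)

int fact(int n) {
  EMIT(L1);            /* marks the block entering fact            */
  int r = 1;
  while (n > 1) {
    EMIT(L2);          /* breaks the loop: emitted every iteration */
    r = r * n;
    n = n - 1;
  }
  EMIT(L3);            /* marks the block after the loop           */
  return r;
}
\end{verbatim}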
     329
     330The second component is a labelling-preserving compiler. It can be obtained from an existing compiler by adding label emission statements to every intermediate language and by propagating label emission statements during compilation. The compiler is correct if it preserves both the functional behavior of the program and the generated traces. We may also ask that the function that erases the cost emission statements commute with compilation. This optional property guarantees that the labelling does not interfere with the original compiler behavior. A further set of requirements will be added later.
     331
     332The third component is a static object code analyser. It takes as input the object code augmented with label emission statements and computes, for every such statement, its scope. The scope of a label emission statement is the tree of instructions that may be executed after the statement and before a new label emission statement is found. It is a tree and not a sequence because the scope may contain a branching statement. In order to guarantee that such a finite tree exists, the object code must not contain any loop that is not broken by a label emission statement. This is the first requirement of a sound labelling. The analyser fails if the labelling is unsound. For each scope, the analyser computes an upper bound of the execution time required by the scope, using the maximum of the costs of the two branches in the case of a conditional statement. Finally, the analyser computes the cost of a label by taking the maximum of the costs of the scopes of every statement that emits that label.
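The per-scope computation can be pictured with the following sketch, in which the data structure and the names are invented for illustration: a scope is a tree of instructions, and the analyser returns an upper bound by taking the maximum over the branches of a conditional.
\begin{verbatim}
#include <stddef.h>

/* Invented representation of a scope: each node is an instruction with
   its cost; a conditional has two children, other instructions one.   */
typedef struct node {
  int cost;
  struct node *taken;        /* successor on the taken branch (or NULL) */
  struct node *fallthrough;  /* successor on the other branch (or NULL) */
} node;

/* Upper bound on the execution time of a scope: a NULL child means the
   next label emission statement has been reached.                      */
static int scope_cost(const node *n) {
  if (n == NULL) return 0;
  int a = scope_cost(n->taken);
  int b = scope_cost(n->fallthrough);
  return n->cost + (a > b ? a : b);
}
\end{verbatim}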
     333
     334The fourth and last component takes as input the cost model computed at step 3 and the labelled code computed at step 1. It outputs a source program obtained by replacing each label emission statement with a statement that increments the global cost variable by the cost that the cost model associates with the label. The obtained source code is the instrumented source code.
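Continuing the illustrative fragment above, the instrumented source code might read as follows, assuming the computed cost model assigned 5, 23 and 3 cycles to L1, L2 and L3 (the figures and the name of the global cost variable are invented):
\begin{verbatim}
unsigned long __cost = 0;   /* global cost variable (illustrative name) */

int fact(int n) {
  __cost += 5;              /* was EMIT(L1) */
  int r = 1;
  while (n > 1) {
    __cost += 23;           /* was EMIT(L2) */
    r = r * n;
    n = n - 1;
  }
  __cost += 3;              /* was EMIT(L3) */
  return r;
}
\end{verbatim}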
     335
    333336\paragraph{Correctness:} Requirements 1, 2 and 3 of the problem statement obviously hold, with 2 and 3 being a consequence of the definition of a correct labelling-preserving compiler. It is also obvious that the value of the global cost variable of an instrumented source code is at any time equal to the sum of the costs of the labels emitted by the corresponding labelled code. Moreover, because the compiler preserves all traces, the sum of the costs of the labels emitted is the same in the source and target labelled code. Therefore, to satisfy the 4th requirement, we need to guarantee that the time taken to execute the object code is equal to the sum of the costs of the labels emitted by the object code. We collect all the necessary conditions for this to happen in the definition of a sound labelling: a) all loops must be broken by a cost emission statement; b) all program instructions must be in the scope of some cost emission statement. To also satisfy the 5th requirement, additional requirements must be imposed on the object code labelling to avoid all uses of the maximum in the cost computation: the labelling is precise if every label is emitted at most once and both branches of a conditional jump start with a label emission statement.
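For instance, reusing the illustrative EMIT convention introduced earlier (here reduced to a no-op for brevity), a precise labelling of a conditional emits a distinct label at the beginning of each branch, and each label is emitted at most once:
\begin{verbatim}
#define EMIT(l) ((void)0)      /* emission elided in this sketch */

int abs_labelled(int x) {
  int y;
  if (x > 0) { EMIT(Lthen); y = x;  }  /* each branch opens with its own label */
  else       { EMIT(Lelse); y = -x; }
  EMIT(Ljoin);                         /* emitted once: no maximum is needed   */
  return y;
}
\end{verbatim}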
    334337
    335338The correctness and precision of the labelling approach only rely on the correctness and precision of the object code labelling. The simplest, though not necessary, way to achieve them is to impose correctness and precision requirements also on the initial labelling produced by the first software component, and to ask the labelling-preserving compiler to preserve these properties too. The latter requirement imposes serious limitations on the compilation strategy and optimizations: the compiler may not duplicate any code that contains label emission statements, such as loop bodies. Therefore several loop optimizations, such as peeling or unrolling, are prevented. Moreover, precision of the object code labelling is not sufficient per se to obtain global precision: we also implicitly assumed that the static analysis is able to associate a precise constant cost with every instruction. This is not possible in the presence of stateful hardware whose state influences the cost of operations, such as pipelines and caches. In the next section we will see an extension of the basic labelling approach to cover this situation.
    336339
    337 The labelling approach described in this section can be applied equally well and with minor modifications to imperative and functional languages. In the CerCo project, we have tested it on a simple imperative language without functions (a While language), on a subset of C and on two compilation chains for a purely functional higher-order language. The two main changes required to deal with functional languages are: 1) because global variables and updates are not available, the instrumentation phase produces monadic code to “update” the global costs; 2) the requirements for a sound and precise labelling of the source code must be changed when the compilation is based on CPS translations. In particular, we need to introduce both labels emitted before a statement is executed and labels emitted after a statement is executed. The latter capture code that is inserted by the CPS translation and that would otherwise escape all label scopes.
     340The labelling approach described in this section can be applied equally well and with minor modifications to imperative and functional languages. We have tested it on a simple imperative language without functions (a While language), on a subset of C and on two compilation chains for a purely functional higher-order language. The two main changes required to deal with functional languages are: 1) because global variables and updates are not available, the instrumentation phase produces monadic code to “update” the global costs; 2) the requirements for a sound and precise labelling of the source code must be changed when the compilation is based on CPS translations. In particular, we need to introduce both labels emitted before a statement is executed and labels emitted after a statement is executed. The latter capture code that is inserted by the CPS translation and that would otherwise escape all label scopes.
    338341
    339342Phases 1, 2 and 3 can also be applied to logic languages (e.g. Prolog). However, the instrumentation phase cannot: in standard Prolog there is no notion of a (global) variable whose state is not retracted during backtracking. Therefore, the cost of executing computations that are later backtracked would not be counted correctly. Any extension of logic languages with non-backtrackable state should be able to support the labelling approach.
     
    347350Not all the details of the non-functional state need to be exposed, and the technique works better when the part of the state that is required can be summarized in a simple data structure. For example, to handle simple but realistic pipelines it is sufficient to remember a short integer that encodes the position of bubbles (stuck instructions) in the pipeline. In any case, the user does not need to understand the meaning of the state to reason about the properties of the program. Moreover, at any moment the user or the invariant generator tools that analyse the instrumented source code produced by the compiler can decide to trade precision of the analysis for simplicity by approximating the parametric cost with safe non-parametric bounds. Interestingly, the functional analysis of the code can determine which blocks are executed more frequently in order to approximate more aggressively the ones that are executed less often.
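A hedged sketch of what such a summary might look like, with an invented encoding in which each bit of a byte records a bubble in the corresponding pipeline stage and an invented base cost of 23 cycles:
\begin{verbatim}
typedef unsigned char pipe_state;  /* one bit per pipeline stage (invented) */

/* Dependent cost of a label: a function of the summarized hardware state. */
static unsigned cost_of_label(pipe_state s) {
  unsigned stalls = 0;
  while (s) { stalls += s & 1u; s >>= 1; }  /* one stall per recorded bubble */
  return 23u + stalls;
}
\end{verbatim}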
    348351
    349 The idea of dependent labelling can also be applied to allow the compiler to duplicate blocks that contain labels (e.g. to allow loop optimizations). The effect of duplication is to assign a different cost to the different occurrences of a duplicated label. For example, loop peeling turns a loop into the concatenation of a copy of the loop body (that executes the first iteration) with the conditional execution of the loop (for the successive iterations). Because of further optimizations, the two copies of the loop will be compiled differently, with the first body usually taking more time.
     352Dependent labelling can also be applied to allow the compiler to duplicate blocks that contain labels (e.g. in loop optimizations). The effect is to assign a different cost to the different occurrences of a duplicated label. For example, loop peeling turns a loop into the concatenation of a copy of the loop body (that executes the first iteration) with the conditional execution of the loop (for the successive iterations). Because of further optimizations, the two copies of the loop will be compiled differently, with the first body usually taking more time.
    350353
    351354By introducing a variable that keeps track of the iteration number, we can associate with the label a cost that is a function of the iteration number. The same technique works for loop unrolling without modifications: the function will assign a cost to the even iterations and another cost to the odd ones. The actual work to be done consists in introducing in the source code, for each loop, a variable that counts the number of iterations. The loop optimization code that duplicates the loop bodies must also modify the code to correctly propagate the updates of the iteration numbers. The technical details are more complex and can be found in the CerCo reports and publications. The implementation, however, is quite simple and the changes to a loop-optimizing compiler are minimal.
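A minimal sketch of the resulting instrumentation, with invented costs (40 cycles for the peeled first iteration, 23 for the others) and an invented name for the global cost variable:
\begin{verbatim}
unsigned long __cost = 0;            /* illustrative global cost variable */

/* Dependent cost of the loop label as a function of the iteration number. */
static unsigned long body_cost(unsigned it) {
  return it == 0 ? 40 : 23;          /* peeled first iteration costs more */
}

int sum(const int *a, unsigned n) {
  int s = 0;
  for (unsigned it = 0; it < n; it++) {  /* it: the added iteration counter */
    __cost += body_cost(it);             /* dependent cost of the label     */
    s += a[it];
  }
  return s;
}
\end{verbatim}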
     
    377380
    378381The plugin proceeds as follows.
    379 \begin{enumerate}
    380 \item First the call graph of the program is computed. If the function f calls the function g
    381 then the function g is processed before the function f .
    382 \item The computation of the cost of the function is performed by traversing its control flow graph. The cost at a node is the maximum of the costs of the successors.
    383 \item In the case of a loop with a body that has a constant cost for every step of the loop, the cost is the product of the cost of the body and of the variant taken at the start of the loop.
    384 \item In the case of a loop with a body whose cost depends on the values of some free variables, a fresh logic function f is introduced to represent the cost of the loop in the logic assertions. This logic function takes the variant as its first parameter. The other parameters of f are the free variables of the body of the loop. An axiom is added to account for the fact that the cost is accumulated at each step of the loop.
    385 \item The cost of the function is directly added as a post-condition of the function.
    386 \end{enumerate}
     382First the call graph of the program is computed: if a function f calls a function g, then g is processed before f.
     383Then the cost of each function is computed by traversing its control flow graph. The cost at a node is the maximum of the costs of its successors.
     384In the case of a loop with a body that has a constant cost for every step of the loop, the cost is the product of the cost of the body and the value of the variant taken at the start of the loop.
     385In the case of a loop with a body whose cost depends on the values of some free variables, a fresh logic function f is introduced to represent the cost of the loop in the logic assertions. This logic function takes the variant as its first parameter. The other parameters of f are the free variables of the body of the loop. An axiom is added to account for the fact that the cost is accumulated at each step of the loop.
     386The cost of the function is directly added as a post-condition of the function.
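As an illustration of the kind of annotation produced in the simple constant-cost case, consider the following hedged ACSL sketch; the constants, the name of the cost variable and the exact shape of the annotations are invented and are not the plug-in's actual output:
\begin{verbatim}
unsigned long __cost = 0;            /* illustrative global cost variable */

/*@ ensures __cost <= \old(__cost) + 10 + 23 * \old(n); */
void count_down(unsigned n) {
  __cost += 10;                      /* cost of the entry block        */
  /*@ loop variant n; */
  while (n > 0) {
    __cost += 23;                    /* constant cost of the loop body */
    n--;
  }
}
\end{verbatim}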
     387
    387388The user can influence the annotation by different means:
    388389by using more precise variants, or
    389390by annotating functions with a cost specification; the plugin will then use this cost for the function instead of computing it.
    390391\paragraph{C programs with pointers}
    391 When it comes to verifying programs involving pointer-based data structures, such as linked lists, trees, or graphs, the use of traditional first-order logic to specify, and of SMT solvers to verify, shows some limitations. Separation logic is then an elegant alternative. Designed at the turn of the century, it is a program logic with a new notion of conjunction to express spatial heap separation. Bobot has recently introduced, in the Jessie plug-in, automatically generated separation predicates that make it possible to simulate separation logic reasoning in a traditional verification framework where the specification language, the verification condition generator, and the theorem provers were not designed with separation logic in mind. CerCo's plug-in can exploit the separation predicates to automatically reason about the cost of execution of simple heap manipulation programs such as an in-place list reversal.
     392When it comes to verifying programs involving pointer-based data structures, such as linked lists, trees, or graphs, the use of traditional first-order logic to specify, and of SMT solvers to verify, shows some limitations. Separation logic is then an elegant alternative. It is a program logic with a new notion of conjunction to express spatial heap separation. Bobot has recently introduced, in the Jessie plug-in, automatically generated separation predicates to simulate separation logic reasoning in a traditional verification framework where the specification language, the verification condition generator, and the theorem provers were not designed with separation logic in mind. CerCo's plug-in can exploit the separation predicates to automatically reason about the cost of execution of simple heap manipulation programs such as an in-place list reversal.
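The in-place list reversal mentioned here can be pictured, in instrumented form, as in the following sketch (the per-cell cost of 17 cycles and the name of the cost variable are invented); the separation predicates are what lets the provers argue that each cell is visited exactly once, so the bound on the cost variable is linear in the length of the list:
\begin{verbatim}
#include <stddef.h>

typedef struct cell { int hd; struct cell *tl; } cell;

unsigned long __cost = 0;            /* illustrative global cost variable */

cell *reverse(cell *l) {
  cell *r = NULL;
  while (l != NULL) {
    __cost += 17;                    /* invented cost of one iteration */
    cell *next = l->tl;
    l->tl = r;                       /* in-place pointer reversal      */
    r = l;
    l = next;
  }
  return r;
}
\end{verbatim}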
    392393
    393394\subsection{The CerCo Compiler}
    394395In CerCo we have developed a number of cost-preserving compilers based on the labelling approach. Excluding an initial certified compiler for a While language, all remaining compilers target realistic source languages --- a pure higher-order functional language and a large subset of C with pointers, gotos and all data structures --- and real-world target processors --- MIPS and the Intel 8051/8052 processor family. Moreover, they achieve a level of optimization that ranges from moderate (comparable to gcc level 1) to intermediate (including loop peeling and unrolling, hoisting and late constant propagation). The so-called Trusted CerCo Compiler is the only one that was implemented in the interactive theorem prover Matita and whose costs were certified. The distributed code is obtained by extracting OCaml code from the Matita implementation. In the rest of this section we will only focus on the Trusted CerCo Compiler, which only targets the C language and the 8051/8052 family, and which does not implement the advanced optimizations yet. Its user interface, however, is the same as that of the other versions, so that safety can be traded for additional performance. In particular, the Frama-C CerCo plug-in works with all compilers without recompilation.
    395396
    396 The (trusted) CerCo compiler implements the following optimizations: cast simplification, constant propagation in expressions, liveness-analysis-driven spilling of registers, dead code elimination, branch displacement, and tunneling. The latter two optimizations are performed by the optimizing assembler, which is part of the compiler. The back-end of the compiler works on three-address instructions, preferred to static single assignment code for the simplicity of the formal certification.
    397 
    398 The CerCo compiler is loosely based on the CompCert compiler, a recently developed certified compiler from C to the PowerPC, ARM and x86 microprocessors. Unlike CompCert, both the CerCo code and its certification are open source. Some data structures and language definitions for the front-end are directly taken from CompCert, while the back-end is a redesign and reimplementation of a didactic compiler from Pascal to MIPS used by Francois Pottier for a course at the Ecole Polytechnique.
     397The (trusted) CerCo compiler implements the following optimizations: cast simplification, constant propagation in expressions, liveness-analysis-driven spilling of registers, dead code elimination, branch displacement, and tunneling. The latter two optimizations are performed by our optimizing assembler. The back-end of the compiler works on three-address instructions, preferred to static single assignment code for the simplicity of the formal certification.
     398
     399The CerCo compiler is loosely based on the CompCert compiler, a recently developed certified compiler from C to the PowerPC, ARM and x86 microprocessors. Unlike CompCert, both the CerCo code and its certification are open source. Some data structures and language definitions for the front-end are directly taken from CompCert, while the back-end is a redesign of a compiler from Pascal to MIPS used by Francois Pottier for a course at the Ecole Polytechnique.
    399400
    400401The main peculiarities of the CerCo compiler are:
    401402\begin{enumerate}
    402403\item all of our intermediate languages include label emitting instructions to implement the labelling approach, and the compiler preserves execution traces.
    403 \item traditionally compilers target an assembly language with additional macro-instructions to be expanded before assembly; for CerCo we need to go all the way down to object code in order to perform the static analysis. Therefore we also developed an optimizing assembler and a static analyser, both integrated in the compiler.
    404 \item to avoid implementing a linker and a loader, we do not support separate compilation and external calls. Adding a linker and a loader is transparent to the labelling approach and should create no new problems.
     404\item traditionally compilers target an assembly language with additional macro-instructions to be expanded before assembly; for CerCo we need to go all the way down to object code in order to perform the static analysis. Therefore we also integrated an optimizing assembler and a static analyser.
     405\item to avoid implementing a linker and a loader, we do not support separate compilation and external calls. Adding a linker and a loader is transparent to the labelling approach and should create no new problems.
    405406\item we target an 8-bit processor. Targeting an 8-bit processor requires several changes and increases code size, but it is not fundamentally more complex. The proof of correctness, however, becomes much harder.
    406 \item we target a microprocessor that has a non-uniform memory model, which is still often the case for microprocessors used in embedded systems and which is becoming common again in multi-core processors. Therefore the compiler has to keep track of the position of data and it must move data between memory regions in the proper way. Also the size of pointers to different regions is not uniform. In our case, an additional difficulty was that the space available for the stack in internal memory in the 8051 is tiny, allowing only a small number of nested calls. To support full recursion in order to test the CerCo tools also on recursive programs, the compiler manually manages a stack in external memory.
     407\item we target a microprocessor that has a non-uniform memory model, which is still often the case for microprocessors used in embedded systems and which is becoming common again in multi-core processors. Therefore the compiler has to keep track of the position of data and it must move data between memory regions in the proper way. Also the size of pointers to different regions is not uniform. An additional difficulty was that the space available for the stack in internal memory in the 8051 is tiny, allowing only a small number of nested calls. To support full recursion in order to test the CerCo tools also on recursive programs, the compiler manually manages a stack in external memory.
    407408\end{enumerate}
    408409