Changeset 3152

Timestamp: Apr 16, 2013, 2:46:53 PM
Author: amadio
Message: r

File: 1 edited

  • Deliverables/D6.3/report.tex

    r3126 r3152  
\usepackage{bbm}

% Amadio's macros
\newcommand{\cost}{{\sf Cost}}
\newcommand{\lamcost}{\sf LamCost}
\newcommand{\cil}{{\sf CIL}}
\newcommand{\scade}{{\sf Scade}}
\newcommand{\absint}{{\sf AbsInt}}
\newcommand{\framac}{{\sf Frama-C}}
\newcommand{\acsl}{{\sf {ACSL}}}
\newcommand{\jessie}{{\sf {Jessie}}}
\newcommand{\powerpc}{{\sf PowerPC}}
\newcommand{\lustre}{{\sf Lustre}}
\newcommand{\esterel}{{\sf Esterel}}
\newcommand{\ml}{{\sf ML}}
\newcommand{\altergo}{{\sf {Alt-Ergo}}}
\newcommand{\why}{{\sf {Why}}}
\newcommand{\old}{\backslash old}
\newcommand{\C}{{\sf C}}
\newcommand{\coq}{{\sf Coq}}
\newcommand{\ocaml}{{\sf OCaml}}
\newcommand{\AND}{\wedge}               % conjunction
\newcommand{\w}[1]{{\it #1}}    % To write in math style

\title{
INFORMATION AND COMMUNICATION TECHNOLOGIES\\
     
\begin{large}
Main Authors:\\
Roberto M.~Amadio, Claudio Sacerdoti Coen
%Dominic P. Mulligan and Claudio Sacerdoti Coen
\end{large}
\end{center}
     
\newpage


\section{Review of cost synthesis techniques}
We review the {\em cost synthesis techniques} developed in the project.

The {\em starting hypothesis} is that we have a certified methodology
to annotate `blocks' in the source code with constants which provide
a sound and possibly precise upper bound on the cost of executing the
`blocks' after compilation to binary code.

The {\em principle} that we have followed in designing the cost synthesis
tools is that the synthetic bounds should be expressed and proved within
a general purpose tool built to reason on the source code.
In particular, we rely on the $\framac$ tool to reason on $\C$ code and on the $\coq$
proof assistant to reason on higher-order functional programs.

This principle entails that:

\begin{itemize}

\item The inferred synthetic bounds are indeed {\em correct}
as long as the general purpose tool is.

\item There is {\em no limitation on the class of programs} that can be handled,
as long as the user is willing to carry out an interactive proof.

\end{itemize}

Of course, {\em automation} is desirable whenever possible.  Within this
framework, automation means writing programs that give hints to the
general purpose tool. These hints may take the form, say, of loop
invariants/variants, of predicates describing the structure of the heap,
or of types in a light logic.  If these hints are
correct and sufficiently precise, the general purpose tool will produce
a proof automatically; otherwise, user interaction is required.  What
follows is a summary of work described in more detail in deliverables
D5.1 and D5.3. The cost synthesis techniques we outline are at
varying degrees of maturity, ranging from complete experimental
validation to preliminary thought experiments.

\subsection{The $\cost$ plug-in and its application to the Lustre compiler}
$\framac$ is a set of analysers for $\C$ programs with a specification language
called $\acsl$. New analyses can be dynamically added through a plug-in
system. For instance, the $\jessie$ plug-in allows deductive verification of
$\C$ programs with respect to their specification in $\acsl$, with various
provers as back-end tools.

We developed the $\cost$ plug-in for the $\framac$ platform as a proof of
concept of an automatic environment exploiting the cost annotations produced by
the $\cerco$ compiler. It consists of an $\ocaml$ program
which, to a first approximation, takes the following actions: (1) it receives as
input a $\C$ program, (2) it applies the $\cerco$ compiler to produce a related
$\C$ program with cost annotations, (3) it applies some heuristics to produce a
tentative bound on the cost of executing the $\C$ functions of the program as a
function of the value of their parameters, and (4) the user can then call the
$\jessie$ tool to discharge the related proof
obligations.  %\marginpar{Some facts/pointer to Cost tool.}

In the following we elaborate on the soundness of the framework
and the experiments we performed with the $\cost$ tool
on the $\C$ programs produced by a $\lustre$ compiler.



% First, the input file is fed to
% $\framac$ that will in turn send it to the $\cost$ plug-in. The action of the
% plug-in is then:
% \begin{enumerate}
% \item apply the $\cerco$ compiler to the source file;
% \item synthesize an upper bound of the WCET of each function of the source
%   program by reading the cost annotations added by $\cerco$;
% \item add the results in the form of post-conditions in $\acsl$ format, relating
%   the cost of the function before and after its execution.
% \end{enumerate}

\paragraph{Soundness}
The soundness of the whole framework depends on the cost annotations
added by the $\cerco$ compiler, the synthetic costs produced by the $\cost$
plug-in, the verification conditions (VCs) generated by $\jessie$, and
the external provers discharging the VCs.  Since the synthetic costs are
expressed in $\acsl$ format, $\jessie$ can be used to verify them.
%RA There was a confusion between basic cost annotations and synthetic costs.
Thus, even if the added synthetic costs are incorrect (relative to the cost annotations),
the process as a whole is still correct: indeed, $\jessie$ will not validate
incorrect costs, and no conclusion can be drawn about the WCET of
the program in this case. In other words, soundness does not really depend on the action
of the $\cost$ plug-in, which can in principle produce \emph{any}
synthetic cost. However, in order to actually prove a WCET of a
$\C$ function, we need to add correct annotations in such a way that $\jessie$
and the subsequent automatic provers have enough information to deduce
their validity. In practice this is not straightforward,
even for very simple programs composed of branching and assignments
(no loops and no recursion), because a fine analysis of the VCs associated with
branching may lead to a complexity blow-up.

\paragraph{Experience with $\lustre$}
$\lustre$ is a data-flow language for programming
synchronous systems, and the language comes with a compiler to
$\C$.
We designed a wrapper to support $\lustre$ files.
The $\C$ function produced by the compiler implements the {\em step function}
of the synchronous system, and computing the WCET of the function amounts to obtaining
a bound on the reaction time of the system.
We tested the $\cost$ plug-in and the $\lustre$ wrapper on the $\C$ programs generated by the
$\lustre$ compiler. For programs consisting of a few hundred lines of code,
the $\cost$ plug-in computes a WCET and $\altergo$ is able to discharge all VCs
automatically.



\subsection{Handling $\C$ programs with simple loops}
The cost annotations added by the $\cerco$ compiler take the form of $\C$
instructions that update by a constant a fresh global variable called the
\emph{cost variable}. Synthesizing a WCET of a $\C$ function thus consists in
statically determining an upper bound on the difference between the value of the
cost variable before and after the execution of the function, {\em i.e.}, finding in the
function the instructions that update the cost variable and establishing how many
times they are executed along the flow of execution. In
order to perform the analysis, the plug-in makes the following assumptions on the programs:
\begin{itemize}
\item No recursive functions.
\item Every loop must be annotated with a {\em variant}. The
  variants of `for' loops are automatically inferred.
\end{itemize}

The plug-in proceeds as follows.

\begin{itemize}

\item
First, the callgraph of the program is computed.
If function $f$ calls function $g$, then $g$ is processed before $f$.

\item The cost of each function is computed by traversing its control flow graph.
The cost at a node is the maximum of the costs of its successors.
%If the node is an assignment it is substituted in the cost in the same way as for weakest-precondition.

\item In the case of a loop with a body that has a constant cost for
  every step of the loop, the cost is the product of the cost of the
  body and of the variant taken at the start of the loop.

\item In the case of a loop with a body whose cost depends on
  the values of some free variables, a fresh logic function $f$ is
  introduced to represent the cost of the loop in the logic
  assertions. This logic function takes the variant as its first
  parameter. The other parameters of $f$ are the free variables of the
  body of the loop.  An axiom is added to account for the fact that the
  cost is accumulated at each step of the loop:
  \[
  f(v, \vec x) =
    \textrm{if } v < 0 \textrm{ then } 0 \textrm{ else } (f(v-1, \phi(\vec x)) + \psi(\vec x))
  \]
  where $\vec x$ are the free variables, $v$ is the variant, $\phi$ computes the
  modification of the variables at each step of the loop, and $\psi$ is the
  cost of the body of the loop.

\item The cost of the function is directly added as a post-condition of
  the function: $\_\_cost \le \old(\_\_cost) + t$, where $t$ is the term
  computing the cost of the function, $\_\_cost$ is the time taken from
  the start of the program, and $\old(\_\_cost)$ is the same quantity before
  the execution of the function.
\end{itemize}

The user can influence the annotation in several ways:
\begin{itemize}
\item By using more precise variants.
\item By annotating a function with a cost specification. The plug-in
  will then use this cost for the function instead of computing it.
\end{itemize}



\subsection{$\C$ programs with pointers}
When it comes to verifying programs involving pointer-based data
structures, such as linked lists, trees, or graphs, the use of
traditional first-order logic to specify, and of SMT solvers to
verify, shows some limitations. {\em Separation logic} is an
elegant alternative. Designed at the turn of the century, it is a
program logic with a new notion of conjunction to express spatial
heap separation.  Separation logic has been implemented in dedicated theorem provers
such as {\sf Smallfoot} or {\sf VeriFast}. One
drawback of such provers, however, is that they either limit the
expressiveness of formulas (e.g.\ to so-called symbolic heaps) or
require some user guidance (e.g.\ open/close commands in {\sf VeriFast}).

In an attempt to conciliate both approaches, Bobot introduced the
notion of separation predicates during his PhD thesis. The approach
consists in reformulating some ideas from separation logic within a
traditional verification framework where the specification language,
the verification condition generator, and the theorem provers were not
designed with separation logic in mind. Separation predicates are
automatically derived from user-defined inductive predicates, on
demand. They can then be used in program annotations exactly as other
predicates, {\em i.e.}, without any constraint. Simply speaking, where
one would write $P*Q$ in separation logic, one will here ask for the
generation of a separation predicate {\em sep} and then use it as $P
\AND Q \AND \w{sep}(P, Q)$.  We have implemented separation predicates
within the $\jessie$ plug-in and tested them on a non-trivial case study
(the composite pattern from the VACID-0 benchmark).  In this case, we
achieve a fully automatic proof using three existing SMT solvers. We
have also used the separation predicates to reason on the {\em cost}
of executing simple heap-manipulating programs such as an in-place
list reversal.



\subsection{The cost of higher-order functional programs}
We have analysed a rather standard compilation chain from
a higher-order functional language to an abstract
{\sf RTL} language which corresponds directly to the source
language of the back-end of the $\C$ compiler developed in the $\cerco$ project.
The compilation consists of four transformations: continuation-passing style,
value naming, closure conversion, and hoisting.

We have shown that it is possible to extend the labelling approach
described for the $\C$ language to a higher-order call-by-value functional language.

The first issue we have considered is that of designing a `good
labelling' function, {\em i.e.}, a function that inserts labels in the
source code which correspond to `basic blocks' of the compiled
code. To this end, we have introduced two labelling operators: a
{\em pre-labelling} $\ell>M$ which emits the label $\ell$ before running $M$,
and a {\em post-labelling} $M>\ell$ which reduces $M$ to a value and then
emits the label $\ell$.  Roughly speaking, the `good labelling'
associates a pre-labelling to every function abstraction and a
post-labelling to every application which is not immediately
surrounded by an abstraction.  In particular, the post-labelling
takes care of the functions created by the CPS translation.

The second issue relates to the instrumentation of the program.
To this end, we have relied on a {\em cost monad} which associates
to each program a pair consisting of its denotation and the
cost of reducing the program to a value.
In this way, the instrumented program can still be regarded
as a higher-order functional program.

The third issue concerns the method to {\em reason on the instrumented
(functional) program}. We have built on a higher-order Hoare logic and a related tool
that automatically generates the proof obligations. These proof
obligations can be discharged either automatically or
interactively, using the {\sf Coq} proof assistant and its tactics.
Some simple experiments are described in the $\lamcost$ software.


\subsection{The cost of memory management}
In a realistic implementation of a functional programming language,
the runtime environment usually includes a garbage collector.  In
spite of considerable progress in {\em real-time garbage collectors},
it seems to us that
such collectors do not yet offer a viable path to a certified and
usable WCET prediction of the running time of functional programs.
As far as we know, the cost predictions concern the {\em amortized case}
rather than the {\em worst case}, and are supported more by experimental evaluations
than by formal proofs.

The approach we have adopted instead, following the seminal work
of Tofte {\em et al.}, is to enrich the last calculus of
the compilation chain: (1) with a notion
of {\em memory region}, (2) with operations to allocate and dispose of
memory regions, and (3) with a {\em type and effect system} that guarantees
the safety of the dispose operation. This allows us to further extend
the compilation chain mentioned above and thus to include the cost
of safe memory management in our analysis. Because effects are intertwined with types,
what we have actually done, following the work of Morrisett {\em et al.},
is to extend a {\em typed} version of the compilation chain.
An experimental validation of the approach is left for future work; it
would require the integration of region-inference algorithms, such as those
developed by Aiken {\em et al.}, into the compilation chain.



\subsection{Feasible bounds by light typing}
In our experience, the cost analysis of higher-order programs requires
human intervention both at the level of the specification and at that of the
proofs.
One path to automation consists in devising programming disciplines
that entail feasible bounds (polynomial time). The most interesting approaches to this problem
build on {\em light} versions of linear logic.
Our main contribution is a type system that guarantees feasible
bounds for a higher-order call-by-value functional language with
references and threads.
The first proof of this result relies on a kind of standardisation theorem and is of a combinatorial nature.
More recently, we have shown that a proof of a similar result can be obtained
by semantic means, building on the so-called {\em quantitative realizability
models} proposed by Dal Lago and Hofmann.
We believe this semantic setting is particularly appropriate because
it allows one to reason both on typed and on untyped programs.
Thus one can imagine a framework where some programs are feasible `by typing'
while others are feasible as a result of an `interactive proof' of the obligations
generated by quantitative realizability.
Beyond building such a framework, an interesting issue concerns the
certification of {\em concrete bounds} at the level of the {\em compiled code}.
This has to be contrasted with the current state of the art in implicit computational complexity,
where most bounds are {\em asymptotic} and are stated at the level of the {\em source code}.



\newpage

\includepdf[pages={-}]{pipelines.pdf}