# source:Papers/fopara2013/fopara13.tex@3441

Last change on this file since 3441 was 3441, checked in by mulligan, 6 years ago

Some language/spelling changes in fopara2013.tex up to Section 2. Continuing...

File size: 64.5 KB
RevLine
[3307]1\documentclass{llncs}
2
3\usepackage{amsfonts}
4\usepackage{amsmath}
5\usepackage{amssymb}
6\usepackage[english]{babel}
7\usepackage{color}
8\usepackage{fancybox}
[3310]9\usepackage{fancyvrb}
[3307]10\usepackage{graphicx}
12            bookmarks,bookmarksopen,bookmarksdepth=2]{hyperref}
[3307]13\usepackage{hyphenat}
14\usepackage[utf8x]{inputenc}
15\usepackage{listings}
16\usepackage{mdwlist}
17\usepackage{microtype}
18\usepackage{stmaryrd}
19\usepackage{url}
20
[3313]21\usepackage{tikz}
22\usetikzlibrary{positioning,calc,patterns,chains,shapes.geometric,scopes}
23
[3307]24%\renewcommand{\verb}{\lstinline}
25%\def\lstlanguagefiles{lst-grafite.tex}
26%\lstset{language=Grafite}
[3429]27\lstset{language=C,basicstyle=\tt,basewidth=.5em,lineskip=-1.5pt}
[3307]28
29\newlength{\mylength}
30\newenvironment{frametxt}%
31        {\setlength{\fboxsep}{5pt}
32                \setlength{\mylength}{\linewidth}%
35                \Sbox
36                \minipage{\mylength}%
37                        \setlength{\abovedisplayskip}{0pt}%
38                        \setlength{\belowdisplayskip}{0pt}%
39                }%
40                {\endminipage\endSbox
41                        $\fbox{\TheSbox}$}
42
[3327]43\title{Certified Complexity (CerCo)\thanks{The project CerCo acknowledges the
44financial support of the Future and Emerging Technologies (FET) programme within
45the Seventh Framework Programme for Research of the European Commission, under
46FET-Open grant number: 243881}}
[3320]47\author{
48%Roberto
[3335]49R.~M. Amadio$^{4}$ \and
[3320]50%Nicolas
[3335]51N.~Ayache$^{3,4}$ \and
[3320]52%François
[3335]53F.~Bobot$^{3,4}$ \and
[3320]54%Jacob
[3333]55J.~P. Boender$^1$ \and
[3320]56%Brian
[3333]57B.~Campbell$^2$ \and
[3320]58%Ilias
[3333]59I.~Garnier$^2$ \and
[3320]60%Antoine
[3333]61A.~Madet$^4$ \and
[3320]62%James
[3333]63J.~McKinna$^2$ \and
[3320]64%Dominic
[3333]65D.~P. Mulligan$^1$ \and
[3320]66%Mauro
[3333]67M.~Piccolo$^1$ \and
[3332]68%Randy
[3333]69R.~Pollack$^2$ \and
[3320]70%Yann
[3335]71Y.~R\'egis-Gianas$^{3,4}$ \and
[3320]72%Claudio
[3333]73C.~Sacerdoti Coen$^1$ \and
[3320]74%Ian
[3333]75I.~Stark$^2$ \and
[3320]76%Paolo
[3333]77P.~Tranquilli$^1$}
[3327]78\institute{Dipartimento di Informatica - Scienza e Ingegneria, Universit\'a di
79Bologna \and
[3333]80LFCS, School of Informatics, University of Edinburgh
[3431]81\and INRIA (Team $\pi r^2$)
[3308]82\and
[3335]83Universit\e Paris Diderot
[3308]84}
[3307]85
86\bibliographystyle{splncs03}
87
88\begin{document}
89
90\maketitle
91
[3419]92% Brian: I've changed "without loss of accuracy" because we discuss
94% in the dependent labelling section.
[3307]95\begin{abstract}
[3320]96This paper provides an overview of the FET-Open Project CerCo
[3334]97(Certified Complexity'). The project's main achievement is
[3419]98the development of a technique for analysing non-functional
99properties of programs (time, space) at the source level, with little or no loss of accuracy
[3430]100and with a small trusted code base. The main software component
[3335]101developed is a certified compiler producing cost annotations. The compiler
[3331]102translates source code to object code and produces an instrumented copy of the
[3334]103source code. This instrumentation exposes at
[3331]104the source level---and tracks precisely---the actual (non-asymptotic)
105computational cost of the input program. Untrusted invariant generators and trusted theorem provers
[3419]106can then be used to compute and certify the parametric execution time of the
[3309]107code.
[3307]108\end{abstract}
109
110% ---------------------------------------------------------------------------- %
111% SECTION                                                                      %
112% ---------------------------------------------------------------------------- %
113\section{Introduction}
114
[3430]115%\paragraph{Problem statement.}
116Computer programs can be specified with both
[3327]117functional constraints (what a program must do) and non-functional constraints
[3430]118(what resources---time, space, etc.---the program may use).  In the current
[3327]119state of the art, functional properties are verified for high-level source code
120by combining user annotations (e.g. preconditions and invariants) with a
121multitude of automated analyses (invariant generators, type systems, abstract
[3437]122interpretation, theorem proving, and so on). By contrast, many non-functional properties
123are generally checked on low-level object code,
124%\footnote{A notable
125%  exception is the explicit allocation of heap space in languages like C, which
126%  can be handled at the source level.}
127but also demand information
[3430]128about high-level functional behaviour that must somehow be reconstructed.
[3307]129
[3327]130This situation presents several problems: 1) it can be hard to infer this
[3334]131high-level structure in the presence of compiler optimisations; 2) techniques
132working on object code are not useful in early development, yet problems
[3327]133detected later are more expensive to tackle; 3) parametric cost analysis is very
134hard: how can we reflect a cost that depends on the execution state (e.g. the
135value of a register or a carry bit) to a cost that the user can understand
[3430]136looking at source code? 4) performing functional analysis on the object code
137makes it hard for the programmer to provide information, leading to less precision in
[3327]138the estimates.
[3307]139
[3437]140\paragraph{Vision and approach.}
141 We want to reconcile functional and
[3334]142non-functional analyses: to share information and perform both at the same time
[3327]143on source code.
144%
145What has previously prevented this approach is the lack of a uniform and precise
146cost model for high-level code: 1) each statement occurrence is compiled
[3334]147differently and optimisations may change control flow; 2) the cost of an object
[3327]148code instruction may depend on the runtime state of hardware components like
[3334]149pipelines and caches, all of which are not visible in the source code.
[3307]150
[3327]151To solve the issue, we envision a new generation of compilers able to keep track
152of program structure through compilation and optimisation, and to exploit that
153information to define a cost model for source code that is precise, non-uniform,
[3430]154and which accounts for runtime state. With such a source-level cost model we can
[3327]155reduce non-functional verification to the functional case and exploit the state
[3334]156of the art in automated high-level verification~\cite{survey}. The techniques
[3441]157currently used by the Worst Case Execution Time (WCET) community, who perform analyses on object code,
[3417]158are still available but can now be coupled with an additional source-level
[3327]159analysis. Where the approach produces precise cost models too complex to reason
[3430]161Finally, source code analysis can be used during the early stages of development, when
[3327]162components have been specified but not implemented: source code modularity means
[3329]163that it is enough to specify the non-functional behaviour of unimplemented
[3327]164components.
[3307]165
[3437]166\paragraph{Contributions.}
167We have developed what we refer to as \emph{the labelling approach} \cite{labelling}, a
[3327]168technique to implement compilers that induce cost models on source programs by
169very lightweight tracking of code changes through compilation. We have studied
170how to formally prove the correctness of compilers implementing this technique.
[3441]171We have implemented such a compiler from C to object binaries for the MCS-51
[3437]172micro-controller for predicting execution time and stack space usage,
173and verified it in an interactive theorem prover.  As we are targeting
174an embedded micro-controller we do not consider dynamic memory allocation.
[3307]175
[3430]176To demonstrate source-level verification of costs we have implemented
177a Frama-C plugin \cite{framac} that invokes the compiler on a source
178program and uses this to generate invariants on the high-level source
179that correctly model low-level costs. The plugin certifies that the
180program respects these costs by calling automated theorem provers, a
181new and innovative technique in the field of cost analysis. Finally,
[3441]182we have conducted several case studies, including showing that the
[3430]183plugin can automatically compute and certify the exact reaction time
184of Lustre~\cite{lustre} data flow programs compiled into C.
185
186\section{Project context and approach}
[3327]187Formal methods for verification of functional properties of programs have
[3334]188now reached a level of maturity and automation that is facilitating a slow but
[3327]189increasing adoption in production environments. For safety critical code, it is
[3334]190becoming commonplace to combine rigorous software engineering methodologies and testing
191with static analysis, taking the strong points of each approach and mitigating
192their weaknesses. Of particular interest are open frameworks
[3327]193for the combination of different formal methods, where the programs can be
194progressively specified and are continuously enriched with new safety
195guarantees: every method contributes knowledge (e.g. new invariants) that
196becomes an assumption for later analysis.
[3307]197
[3327]198The scenario for the verification of non-functional properties (time spent,
[3430]199memory used, energy consumed) is bleaker.
200% and the future seems to be getting even worse.
201Most industries verify that real time systems meet their deadlines
[3334]202by simply performing many runs of the system and timing their execution,  computing the
[3327]203maximum time and adding an empirical safety margin, claiming the result to be a
204bound for the WCET of the program. Formal methods and software to statically
205analyse the WCET of programs exist, but they often produce bounds that are too
[3430]206pessimistic to be useful. Recent advancements in hardware architecture
207have been
[3327]208focused on the improvement of the average case performance, not the
[3334]209predictability of the worst case. Execution time is becoming increasingly
210dependent on execution history and the internal state of
211hardware components like pipelines and caches. Multi-core processors and non-uniform
212memory models are drastically reducing the possibility of performing
[3327]213static analysis in isolation, because programs are less and less time
[3334]214composable. Clock-precise hardware models are necessary for static analysis, and
[3430]215obtaining them is becoming harder due to the increased sophistication
[3334]216of hardware design.
[3307]217
[3430]218Despite the aforementioned problems, the need for reliable real time
219systems and programs is increasing, and there is increasing pressure
220from the research community towards the introduction of alternative
221hardware with more predictable behaviour, which would be more suitable
222for static analysis.  One example being investigated by the Proartis
223project~\cite{proartis} is to decouple execution time from execution
224history by introducing randomisation.
[3307]225
[3330]226In the CerCo project \cite{cerco} we do not try to address this problem, optimistically
[3430]227assuming that improvements in low-level timing analysis or architecture will make
228verification feasible in the longer term. The main objective of our work is
[3327]229instead to bring together static analysis of functional and non-functional
230properties, which, according to the current state of the art, are completely
231independent activities with limited exchange of information: while the
232functional properties are verified on the source code, the analysis of
233non-functional properties is entirely performed on the object code to exploit
[3334]234clock-precise hardware models.
[3307]235
[3422]236\subsection{Current object-code methods}
237
[3334]238Analysis currently takes place on object code for two main reasons.
[3422]239First, there cannot be a uniform, precise cost model for source
[3327]240code instructions (or even basic blocks). During compilation, high level
241instructions are torn apart and reassembled in context-specific ways so that
[3334]242identifying a fragment of object code and a single high level instruction is
[3329]243infeasible. Even the control flow of the object and source code can be very
[3334]244different as a result of optimisations, for example aggressive loop
245optimisations may completely transform source level loops. Despite the lack of a uniform, compilation- and
[3327]246program-independent cost model on the source language, the literature on the
[3334]247analysis of non-asymptotic execution time on high level languages that assumes
[3430]248such a model is growing and gaining momentum. However, unless we provide a
[3334]249replacement for such cost models, this literature's future practical impact looks
250to be minimal. Some hope has been provided by the EmBounded project \cite{embounded}, which
251compositionally compiles high-level code to a byte code that is executed by an
[3430]252interpreter with guarantees on the maximal execution time spent for each byte code
253instruction. This provides a uniform model at the expense of the model's
254precision (each cost is a pessimistic upper bound) and the performance of the
255executed code (because the byte code is interpreted compositionally instead of
[3334]256performing a fully non-compositional compilation).
[3422]257
[3327]258The second reason to perform the analysis on the object code is that bounding
259the worst case execution time of small code fragments in isolation (e.g. loop
260bodies) and then adding up the bounds yields very poor estimates because no
[3430]261knowledge of the hardware state on executing the fragment can be assumed. By
[3327]262analysing longer runs the bound obtained becomes more precise because the lack
[3430]263of information about the initial state has a relatively small impact.
[3307]264
[3430]265To calculate the cost of an execution, value and control flow analyses
266are required to bound the number of times each basic block is
267executed.  Current state
268of the art WCET technology such as AbsInt performs these analyses on the
269object code, where the logic of the program is harder to reconstruct
270and most information available at the source code level has been lost;
271see~\cite{stateart} for a survey. Imprecision in the analysis can lead to useless bounds. To
272augment precision, the tools ask the user to provide constraints on
273the object code control flow, usually in the form of bounds on the
274number of iterations of loops or linear inequalities on them. This
275requires the user to manually link the source and object code,
276translating his assumptions on the source code (which may be wrong) to
277object code constraints. The task is error prone and hard, especially
278in the presence of complex compiler optimisations.
[3422]279
280Traditional techniques for WCET that work on object code are also affected by
281another problem: they cannot be applied before the generation of the object
282code. Functional properties can be analysed in early development stages, while
283analysis of non-functional properties may come too late to avoid expensive
284changes to the program architecture.
285
[3437]286\subsection{CerCo's approach}
[3422]287
[3327]288In CerCo we propose a radically new approach to the problem: we reject the idea
289of a uniform cost model and we propose that the compiler, which knows how the
290code is translated, must return the cost model for basic blocks of high level
291instructions. It must do so by keeping track of the control flow modifications
[3430]292to reverse them and by interfacing with processor timing analysis.
[3314]293
[3327]294By embracing compilation, instead of avoiding it like EmBounded did, a CerCo
[3430]295compiler can both produce efficient code and return costs that are
296as precise as the processor timing analysis can be. Moreover, our costs can be
297parametric: the cost of a block can depend on actual program data, on a
298summary of the execution history, or on an approximated representation of the
[3327]299hardware state. For example, loop optimizations may assign to a loop body a cost
300that is a function of the number of iterations performed. As another example,
301the cost of a block may be a function of the vector of stalled pipeline states,
302which can be exposed in the source code and updated at each basic block exit. It
[3334]303is parametricity that allows one to analyse small code fragments without losing
[3327]304precision: in the analysis of the code fragment we do not have to ignore the
[3334]305initial hardware state. On the contrary, we can assume that we know exactly which
306state (or mode, as the WCET literature calls it) we are in.
[3307]307
[3334]308The CerCo approach has the potential to dramatically improve the state of the
309art. By performing control and data flow analyses on the source code, the error
[3437]310prone translation of invariants is completely avoided. Instead, this
311work is done at the source level using tools of the user's choice.
312Any available technique for the verification of functional properties
[3327]313can be immediately reused and multiple techniques can collaborate together to
[3437]314infer and certify cost invariants for the program.  There are no
315limitations on the types of loops or data structures involved. Parametric cost analysis
[3430]316becomes the default one, with non-parametric bounds used as a last resort when
[3334]317trading the complexity of the analysis with its precision. \emph{A priori}, no
[3430]318technique previously used in traditional WCET is lost: processor
319timing analyses can be used by the compiler on the object code, and the rest can be applied
[3334]320at the source code level.
[3422]321 Our approach can also work in the early
322stages of development by axiomatically attaching costs to unimplemented components.
[3307]323
324
[3327]325All software used to verify properties of programs must be as bug free as
[3430]326possible. The trusted code base for verification consists of the code that needs
[3327]327to be trusted to believe that the property holds. The trusted code base of
328state-of-the-art WCET tools is very large: one needs to trust the control flow
329analyser and the linear programming libraries it uses and also the formal models
330of the hardware. In CerCo we are moving the control flow analysis to the source
[3334]331code and we are introducing a non-standard compiler too. To reduce the trusted
[3327]332code base, we implemented a prototype and a static analyser in an interactive
[3430]333theorem prover, which was used to certify that the costs added to the source
334code are indeed those incurred by the hardware. Formal models of the
[3327]335hardware and of the high level source languages were also implemented in the
336interactive theorem prover. Control flow analysis on the source code has been
337obtained using invariant generators, tools to produce proof obligations from
338generated invariants and automatic theorem provers to verify the obligations. If
339the automatic provers are able to generate proof traces that can be
340independently checked, the only remaining component that enters the trusted code
341base is an off-the-shelf invariant generator which, in turn, can be proved
342correct using an interactive theorem prover. Therefore we achieve the double
[3417]343objective of allowing the use of more off-the-shelf components (e.g. provers and
[3334]344invariant generators) whilst reducing the trusted code base at the same time.
[3307]345
[3327]346%\paragraph{Summary of the CerCo objectives.} To summarize, the goal of CerCo is
347% to reconcile functional with non-functional analysis by performing them together
348% on the source code, sharing common knowledge about execution invariants. We want
349% to achieve the goal implementing a new generation of compilers that induce a
350% parametric, precise cost model for basic blocks on the source code. The compiler
351% should be certified using an interactive theorem prover to minimize the trusted
352% code base of the analysis. Once the cost model is induced, off-the-shelf tools
353% and techniques can be combined together to infer and prove parametric cost
354% bounds.
[3320]355%The long term benefits of the CerCo vision are expected to be:
356%1. the possibility to perform static analysis during early development stages
[3327]358%3.  the application of off-the-shelf techniques currently unused for the
359% analysis of non-functional properties, like automated proving and type systems
360%4. simpler and safer interaction with the user, that is still asked for
361% knowledge, but on the source code, with the additional possibility of actually
362% verifying the provided knowledge
[3320]363%5. a reduced trusted code base
364%6. the increased accuracy of the bounds themselves.
365%
[3327]366%The long term success of the project is hindered by the increased complexity of
[3329]367% the static prediction of the non-functional behaviour of modern hardware. In the
[3327]368% time frame of the European contribution we have focused on the general
369% methodology and on the difficulties related to the development and certification
370% of a cost model inducing compiler.
[3307]371
[3334]372\section{The typical CerCo workflow}
[3433]373\label{sec:workflow}
[3320]374\begin{figure}[!t]
[3313]375\begin{tabular}{l@{\hspace{0.2cm}}|@{\hspace{0.2cm}}l}
[3310]376\begin{lstlisting}
[3420]377char a[] = {3, 2, 7, 14};
378char threshold = 4;
[3310]379
[3420]380int count(char *p, int len) {
[3311]381  char j;
382  int found = 0;
[3420]383  for (j=0; j < len; j++) {
384    if (*p <= threshold)
385      found++;
[3311]386    p++;
[3420]387    }
[3311]388  return found;
[3310]389}
[3420]390
391int main() {
392  return count(a,4);
393}
[3310]394\end{lstlisting}
[3311]395&
[3313]396%  $\vcenter{\includegraphics[width=7.5cm]{interaction_diagram.pdf}}$
397\begin{tikzpicture}[
398    baseline={([yshift=-.5ex]current bounding box)},
399    element/.style={draw, text width=1.6cm, on chain, text badly centered},
400    >=stealth
401    ]
402{[start chain=going below, node distance=.5cm]
403\node [element] (cerco) {CerCo\\compiler};
404\node [element] (cost)  {CerCo\\cost plugin};
405{[densely dashed]
406\node [element] (ded)   {Deductive\\platform};
407\node [element] (check) {Proof\\checker};
408}
409}
410\coordinate [left=3.5cm of cerco] (left);
411{[every node/.style={above, text width=3.5cm, text badly centered,
412                     font=\scriptsize}]
413\draw [<-] ([yshift=-1ex]cerco.north west) coordinate (t) --
414    node {C source}
415    (t-|left);
416\draw [->] (cerco) -- (cost);
417\draw [->] ([yshift=1ex]cerco.south west) coordinate (t) --
[3420]418    node {C source+\color{red}{cost annotations}}
[3313]419    (t-|left) coordinate (cerco out);
420\draw [->] ([yshift=1ex]cost.south west) coordinate (t) --
[3420]421    node {C source+\color{red}{cost annotations}\\+\color{blue}{synthesized assertions}}
[3313]422    (t-|left) coordinate (out);
423{[densely dashed]
424\draw [<-] ([yshift=-1ex]ded.north west) coordinate (t) --
[3420]425    node {C source+\color{red}{cost annotations}\\+\color{blue}{complexity assertions}}
[3313]426    (t-|left) coordinate (ded in) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
427\draw [->] ([yshift=1ex]ded.south west) coordinate (t) --
428    node {complexity obligations}
429    (t-|left) coordinate (out);
430\draw [<-] ([yshift=-1ex]check.north west) coordinate (t) --
431    node {complexity proof}
432    (t-|left) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
[3327]433\draw [dash phase=2.5pt] (cerco out) .. controls +(-1, 0) and +(-1, 0) ..
434    (ded in);
[3313]435}}
436%% user:
438\draw (current bounding box.west) ++(-.2,.5)
439    circle (.2) ++(0,-.2) -- ++(0,-.1) coordinate (t);
440% arms
441\draw (t) -- +(-.3,-.2);
442\draw (t) -- +(.3,-.2);
443% body
444\draw (t) -- ++(0,-.4) coordinate (t);
445% legs
446\draw (t) -- +(-.2,-.6);
447\draw (t) -- +(.2,-.6);
448\end{tikzpicture}
[3311]449\end{tabular}
[3320]450\caption{On the left: code to count the array's elements
[3420]451 that are less than or equal to the threshold. On the right: CerCo's interaction
[3313]452 diagram. CerCo's components are drawn solid.}
[3310]453\label{test1}
454\end{figure}
[3427]455We illustrate the workflow we envisage (on the right
456of~\autoref{test1}) on an example program (on the left
457of~\autoref{test1}).  The user writes the program and feeds it to the
458CerCo compiler, which outputs an instrumented version of the same
459program that updates global variables that record the elapsed
460execution time and the stack space usage.  The red lines in
461\autoref{itest1} introducing variables, functions and function calls
[3430]462starting with \lstinline'__cost' and \lstinline'__stack' are the instrumentation introduced by the
463compiler.  For example, the two calls at the start of
464\lstinline'count' say that 4 bytes of stack are required, and that it
465takes 111 cycles to reach the next cost annotation (in the loop body).
[3432]466The compiler measures these on the labelled object code that it generates.
[3420]467
[3427]468 The annotated program can then be enriched with complexity
469assertions in the style of Hoare logic, that are passed to a deductive
470platform (in our case Frama-C). We provide as a Frama-C cost plugin a
[3430]471simple automatic synthesiser for complexity assertions which can
472be overridden by the user to increase or decrease accuracy.  These are the blue
[3431]473comments starting with \lstinline'/*@' in \autoref{itest1}, written in
474Frama-C's specification language, ACSL.  From the
[3427]475assertions, a general purpose deductive platform produces proof
476obligations which in turn can be closed by automatic or interactive
477provers, ending in a proof certificate.
478
[3420]479% NB: if you try this example with the live CD you should increase the timeout
480
481Twelve proof obligations are generated from~\autoref{itest1} (to prove
482that the loop invariant holds after one execution if it holds before,
483to prove that the whole program execution takes at most 1358 cycles,
484etc.).  Note that the synthesised time bound for \lstinline'count',
[3430]485$178+214*(1+\text{\lstinline'len'})$ cycles, is parametric in the length of
[3420]486the array. The CVC3 prover
487closes all obligations within half a minute on routine commodity
488hardware.  A simpler non-parametric version can be solved in a few
489seconds.
[3317]490%
[3310]491\fvset{commandchars=\\\{\}}
492\lstset{morecomment=[s][\color{blue}]{/*@}{*/},
493        moredelim=[is][\color{blue}]{$}{$},
[3429]494        moredelim=[is][\color{red}]{|}{|},
495        lineskip=-2pt}
[3310]496\begin{figure}[!p]
497\begin{lstlisting}
[3420]498|int __cost = 33, __stack = 5, __stack_max = 5;|
499|void __cost_incr(int incr) { __cost += incr; }|
[3310]500|void __stack_incr(int incr) {
[3420]501  __stack += incr;
[3310]502  __stack_max = __stack_max < __stack ? __stack : __stack_max;
503}|
504
[3420]505char a[4] = {3, 2, 7, 14};  char threshold = 4;
[3310]506
[3420]507/*@ behavior stack_cost:
508      ensures __stack_max <= __max(\old(__stack_max), 4+\old(__stack));
[3310]509      ensures __stack == \old(__stack);
[3420]510    behavior time_cost:
511      ensures __cost <= \old(__cost)+(178+214*__max(1+\at(len,Pre), 0));
512*/
[3430]513int count(char *p, int len) {
[3420]514  char j;  int found = 0;
515  |__stack_incr(4);  __cost_incr(111);|
[3310]516  $__l: /* internal */$
517  /*@ for time_cost: loop invariant
[3420]518        __cost <= \at(__cost,__l)+
519                  214*(__max(\at((len-j)+1,__l), 0)-__max(1+(len-j), 0));
[3310]520      for stack_cost: loop invariant
521        __stack_max == \at(__stack_max,__l);
522      for stack_cost: loop invariant
523        __stack == \at(__stack,__l);
[3420]524      loop variant len-j;
525  */
526  for (j = 0; j < len; j++) {
527    |__cost_incr(78);|
528    if (*p <= threshold) { |__cost_incr(136);| found ++; }
529    else { |__cost_incr(114);| }
530    p ++;
[3310]531  }
[3420]532  |__cost_incr(67);  __stack_incr(-4);|
[3310]533  return found;
534}
[3420]535
536/*@ behavior stack_cost:
537      ensures __stack_max <= __max(\old(__stack_max), 6+\old(__stack));
538      ensures __stack == \old(__stack);
539    behavior time_cost:
540      ensures __cost <= \old(__cost)+1358;
541*/
542int main(void) {
543  int t;
544  |__stack_incr(2);  __cost_incr(110);|
545  t = count(a,4);
546  |__stack_incr(-2);|
547  return t;
548}
[3310]549\end{lstlisting}
[3317]550\caption{The instrumented version of the program in \autoref{test1},
[3320]551 with instrumentation added by the CerCo compiler in red and cost invariants
552 added by the CerCo Frama-C plugin in blue. The \texttt{\_\_cost},
[3327]553 \texttt{\_\_stack} and \texttt{\_\_stack\_max} variables hold the elapsed time
[3329]554in clock cycles and the current and maximum stack usage. Their initial values
555hold the clock cycles spent in initialising the global data before calling
[3320]556\texttt{main} and the space required by global data (and thus unavailable for
557the stack).
558}
[3310]559\label{itest1}
560\end{figure}
[3339]561\section{Main scientific and technical results}
[3432]562First we describe the basic labelling approach and our compiler
[3425]563implementations that use it.  This is suitable for basic architectures
[3432]564with simple cost models.  Then we will discuss the dependent labelling
[3425]565extension which is suitable for more advanced processor architectures
566and compiler optimisations.  At the end of this section we will
567demonstrate automated high level reasoning about the source level
568costs provided by the compilers.
[3317]569
[3432]570% \emph{Dependent labelling~\cite{?}.} The basic labelling approach assumes a
[3328]571% bijective mapping between object code and source code O(1) blocks (called basic
572% blocks). This assumption is violated by many program optimizations (e.g. loop
573% peeling and loop unrolling). It also assumes the cost model computed on the
574% object code to be non parametric: every block must be assigned a cost that does
575% not depend on the state. This assumption is violated by stateful hardware like
[3432]576% pipelines, caches, branch prediction units. The dependent labelling approach is
577% an extension of the basic labelling approach that allows to handle parametric
[3328]578% cost models. We showed how the method allows to deal with loop optimizations and
579% pipelines, and we speculated about its applications to caches.
580%
581% \emph{Techniques to exploit the induced cost model.} Every technique used for
582% the analysis of functional properties of programs can be adapted to analyse the
583% non-functional properties of the source code instrumented by compilers that
[3432]584% implement the labelling approach. In order to gain confidence in this claim, we
[3328]585% showed how to implement a cost invariant generator combining abstract
586% interpretation with separation logic ideas \cite{separation}. We integrated
587% everything in the Frama-C modular architecture, in order to compute proof
588% obligations from functional and cost invariants and to use automatic theorem
589% provers on them. This is an example of a new technique that is not currently
590% exploited in WCET analysis. It also shows how precise functional invariants
591% benefits the non-functional analysis too. Finally, we show how to fully
592% automatically analyse the reaction time of Lustre nodes that are first compiled
593% to C using a standard Lustre compiler and then processed by a C compiler that
[3432]594% implements the labelling approach.
[3328]595
596% \emph{The CerCo compiler.} This is a compiler from a large subset of the C
597% program to 8051/8052 object code,
[3432]598% integrating the labelling approach and a static analyser for 8051 executables.
[3328]599% The latter can be implemented easily and does not require dependent costs
600% because the 8051 microprocessor is a very simple one, with constant-cost
601% instruction. It was chosen to separate the issue of exact propagation of the
602% cost model from the orthogonal problem of the static analysis of object code
603% that may require approximations or dependent costs. The compiler comes in
604% several versions: some are prototypes implemented directly in OCaml, and they
[3432]605% implement both the basic and dependent labelling approaches; the final version
[3328]606% is extracted from a Matita certification and at the moment implements only the
607% basic approach.
608
[3432]609\subsection{The (basic) labelling approach}
610The labelling approach is the foundational insight that underlies all the developments in CerCo.
[3336]611It allows the tracking of the evolution of
612basic blocks during the compilation process in order to propagate the cost model from the
613object code to the source code without losing precision in the process.
[3328]614\paragraph{Problem statement.} Given a source program $P$, we want to obtain an
[3327]615instrumented source program $P'$,  written in the same programming language, and
616the object code $O$ such that: 1) $P'$ is obtained by inserting into $P$ some
617additional instructions to update global cost information like the amount of
618time spent during execution or the maximal stack space required; 2) $P$ and $P'$
[3329]619must have the same functional behaviour, i.e.\ they must produce that same output
[3327]620and intermediate observables; 3) $P$ and $O$ must have the same functional
[3329]621behaviour; 4) after execution and in interesting points during execution, the
[3327]622cost information computed by $P'$ must be an upper bound of the one spent by $O$
[3433]623to perform the corresponding operations (\emph{soundness} property); 5) the difference
[3327]624between the costs computed by $P'$ and the execution costs of $O$ must be
[3433]625bounded by a program-dependent constant (\emph{precision} property).
[3317]626
[3432]627\paragraph{The labelling software components.} We solve the problem in four
628stages \cite{labelling}, implemented by four software components that are used
[3327]629in sequence.
[3321]630
[3327]631The first component labels the source program $P$ by injecting label emission
[3328]632statements in appropriate positions to mark the beginning of basic blocks.
[3433]633These are the positions where the cost instrumentation will appear in the final output.
[3328]634% The
[3432]635% set of labels with their positions is called labelling.
[3328]636The syntax and semantics
[3327]637of the source programming language is augmented with label emission statements.
[3328]638The statement EMIT $\ell$'' behaves like a NOP instruction that does not affect the
639program state or control flow, but its execution is observable.
640% Therefore the observables of a run of a program becomes a stream
641% of label emissions: $\ell_1,\ldots,\ell_n$, called the program trace. We clarify the conditions
[3432]642% that the labelling must respect later.
[3433]643For the example in Section~\ref{sec:workflow} this is just the original C code
644with EMIT'' instructions added at every point a \lstinline'__cost_incr' call
645appears in the final code.
[3321]646
[3432]647The second component is a labelling preserving compiler. It can be obtained from
[3327]648an existing compiler by adding label emission statements to every intermediate
649language and by propagating label emission statements during compilation. The
[3329]650compiler is correct if it preserves both the functional behaviour of the program
[3433]651and the traces of observables, including the labels emitted'.
[3328]652% We may also ask that the function that erases the cost
653% emission statements commute with compilation. This optional property grants that
[3432]654% the labelling does not interfere with the original compiler behaviour. A further
[3328]655% set of requirements will be added later.
[3321]656
[3433]657The third component analyses the labelled
658object code to compute the scope of each of its label emission statements,
659i.e.\ the instructions that may be executed after the statement and before
660a new label emission is encountered, and then computes the maximum cost of
661each.  Note that we only have enough information at this point to compute
662the cost of loop-free portions of code.  We will consider how to ensure that
663every loop is broken by a cost label shortly.
664%It is a tree and not a sequence because the scope
665%may contain a branching statement. In order to grant that such a finite tree
666%exists, the object code must not contain any loop that is not broken by a label
667%emission statement. This is the first requirement of a sound labelling. The
668%analyser fails if the labelling is unsound. For each scope, the analyser
669%computes an upper bound of the execution time required by the scope, using the
670%maximum of the costs of the two branches in case of a conditional statement.
671%Finally, the analyser computes the cost of a label by taking the maximum of the
672%costs of the scopes of every statement that emits that label.
[3321]673
[3433]674The fourth and final component replaces the labels in the labelled
675version of the source code produced at the start with the costs
676computed for each label's scope.  This yields the instrumented source
677code.  For the example, this is the code in \autoref{itest1}, except
678for the specifications in comments, which we consider in
679Section~\ref{sec:exploit}.
[3321]680
[3433]681\paragraph{Correctness.} Requirements 1 and 2 hold because of the
682non-invasive labelling procedure.  Requirement 3 can be satisfied by
683implementing compilation correctly.  It is obvious that the value of
684the global cost variable of the instrumented source code is always
685equal to the sum of the costs of the labels emitted by the
686corresponding labelled code. Moreover, because the compiler preserves
687all traces, the sum of the costs of the labels emitted in the source
688and target labelled code are the same. Therefore, to satisfy the
689soundness requirement, we need to ensure that the time taken to
690execute the object code is equal to the sum of the costs of the labels
691emitted by the object code. We collect all the necessary conditions
692for this to happen in the definition of a \emph{sound} labelling: a)
693all loops must be broken by a cost emission statement; b) all program
694instructions must be in the scope of some cost emission statement.
695This ensures that every label's scope is a tree of instructions, with
696the cost being the most expensive path. To satisfy also the precision
697requirement, we must make the scopes flat sequences of instructions.
698We require a \emph{precise} labelling where every label is emitted at
699most once and both branches of each conditional jump start with a label
[3327]700emission statement.
[3317]701
[3432]702The correctness and precision of the labelling approach only rely on the
703correctness and precision of the object code labelling. The simplest
[3433]704way to achieve that is to impose correctness and precision
705requirements on the source code labelling produced at the start, and
706to demand that the compiler to preserve these
[3327]707properties too. The latter requirement imposes serious limitations on the
708compilation strategy and optimizations: the compiler may not duplicate any code
709that contains label emission statements, like loop bodies. Therefore several
[3336]710loop optimisations like peeling or unrolling are prevented. Moreover, precision
[3432]711of the object code labelling is not sufficient \emph{per se} to obtain global
[3417]712precision: we implicitly assumed that a precise constant cost can be assigned
713to every instruction. This is not possible in
[3336]714the presence of stateful hardware whose state influences the cost of operations,
[3425]715like pipelines and caches. In Section~\ref{lab:deplabel} we will see an extension of the
[3433]716basic labelling approach which tackles these problems.
[3317]717
[3425]718In CerCo we have developed several cost preserving compilers based on
[3432]719the labelling approach. Excluding an initial certified compiler for a
[3425]720while' language, all remaining compilers target realistic source
721languages---a pure higher order functional language and a large subset
722of C with pointers, gotos and all data structures---and real world
723target processors---MIPS and the Intel 8051/8052 processor
724family. Moreover, they achieve a level of optimisation that ranges
725from moderate (comparable to GCC level 1) to intermediate (including
726loop peeling and unrolling, hoisting and late constant propagation).
727We describe the C compilers in detail in the following section.
[3317]728
[3425]729Two compilation chains were implemented for a purely functional
[3432]730higher-order language~\cite{labelling2}. The two main changes required
[3425]731to deal with functional languages are: 1) because global variables and
733code to update' the global costs; 2) the requirements for a sound and
[3432]734precise labelling of the source code must be changed when the
[3425]735compilation is based on CPS translations.  In particular, we need to
736introduce labels emitted before a statement is executed and also
737labels emitted after a statement is executed. The latter capture code
738that is inserted by the CPS translation and that would escape all
739label scopes.
740
[3417]741% Brian: one of the reviewers pointed out that standard Prolog implementations
742% do have some global state that apparently survives backtracking and might be
743% used.  As we haven't experimented with this, I think it's best to elide it
744% entirely.
[3317]745
[3417]746% Phases 1, 2 and 3 can be applied as well to logic languages (e.g. Prolog).
747% However, the instrumentation phase cannot: in standard Prolog there is no notion
748% of (global) variable whose state is not retracted during backtracking.
749% Therefore, the cost of executing computations that are later backtracked would
750% not be correctly counted in. Any extension of logic languages with
[3432]751% non-backtrackable state could support our labelling approach.
[3417]752
[3425]753\subsection{The CerCo C compilers}
754We implemented two C compilers, one implemented directly in OCaml and
755the other implemented in the interactive theorem prover
756Matita~\cite{matita}.  The first acted as a prototype for the second,
757but also supported MIPS and acted as a testbed for more advanced
[3432]758features such as the dependent labelling approach
[3425]759in Section~\ref{lab:deplabel}.
760
761The second C compiler is the \emph{Trusted CerCo Compiler}, whose cost
[3433]762predictions are formally verified. The executable code
763is OCaml code extracted from the Matita implementation. The Trusted
764CerCo Compiler only targets
765the C language and the 8051/8052 family, and does not yet implement any
766advanced optimisations. Its user interface, however, is the same as the
767other version for interoperability. In
768particular, the Frama-C CerCo plugin descibed in
769Section~\ref{sec:exploit} can work without recompilation
[3425]770with both of our C compilers.
771
772The 8051/8052 microprocessor is a very simple one, with constant-cost
773instructions. It was chosen to separate the issue of exact propagation of the
[3433]774cost model from the orthogonal problem of low-level timing analysis of object code
775that may require approximation or dependent costs.
[3425]776
777The (trusted) CerCo compiler implements the following optimisations: cast
778simplification, constant propagation in expressions, liveness analysis driven
779spilling of registers, dead code elimination, branch displacement, and tunneling.
780The two latter optimisations are performed by our optimising assembler
781\cite{correctness}. The back-end of the compiler works on three address
782instructions, preferred to static single assignment code for the simplicity of
783the formal certification.
784
785The CerCo compiler is loosely based on the CompCert compiler \cite{compcert}, a
786recently developed certified compiler from C to the PowerPC, ARM and x86
787microprocessors. In contrast to CompCert, both the CerCo code and its
788certification are fully open source. Some data structures and language definitions for
789the front-end are directly taken from CompCert, while the back-end is a redesign
790of a compiler from Pascal to MIPS used by Fran\c{c}ois Pottier for a course at the
791Ecole Polytechnique.
792
793The main differences in the CerCo compiler are:
794\begin{enumerate}
795\item All the intermediate languages include label emitting instructions to
[3432]796implement the labelling approach, and the compiler preserves execution traces.
798macro-instructions which are expanded before assembly, we directly
799produce object code in order to perform the timing analysis, using
800an integrated optimising assembler.
[3425]801\item In order to avoid the additional work of implementing a linker
802 and a loader, we do not support separate
803compilation and external calls. Adding them is orthogonal
[3432]804to the labelling approach and should not introduce extra problems.
[3425]805\item We target an 8-bit processor, in contrast to CompCert's 32-bit targets. This requires
806many changes and more compiler code, but it is not fundamentally more
807complex. The proof of correctness, however, becomes much harder.
[3430]808\item We target a microprocessor that has a non-uniform memory model, which is
[3425]809still often the case for microprocessors used in embedded systems and that is
810becoming common again in multi-core processors. Therefore the compiler has to
811keep track of data and it must move data between memory regions in the proper
[3433]812way. Moreover the size of pointers to different regions is not uniform. %An
813%additional difficulty was that the space available for the stack in internal
814%memory in the 8051 is tiny, allowing only a minor number of nested calls. To
815%support full recursion in order to test the CerCo tools also on recursive
816%programs, the compiler implements a stack in external memory.
[3425]817\end{enumerate}
818
819\subsection{Formal certification of the CerCo compiler}
820We have formally
821certified in the Matita interactive proof assistant that the cost models induced on the source code by the
822Trusted CerCo Compiler correctly and precisely
823predict the object code behaviour. There are two cost models, one for execution
824time and one for stack space consumption. We show the correctness of the prediction
[3434]825only for those programs that do not exhaust the available stack space, a
[3425]826property that---thanks to the stack cost model---we can statically analyse on the
[3434]827source code in sharp contrast to other certified compilers.  Other projects have
[3425]828already certified the preservation of functional semantics in similar compilers,
[3434]829so we have not attempted to directly repeat that work and assume
830functional correctness for most passes. In order to complete the
831proof for non-functional properties, we have introduced a new,
832structured, form of execution traces, with the
833related notions for forward similarity and the intensional
[3425]834consequences of forward similarity. We have also introduced a unified
835representation for back-end intermediate languages that was exploited to provide
836a uniform proof of forward similarity.
837
838The details on the proof techniques employed
839and
[3434]840the proof sketch can be found in the CerCo deliverables and papers.
[3425]841In this section we will only hint at the correctness statement, which turned
[3434]842out to be more complex than expected.
[3425]843
[3434]844\paragraph{The correctness statement.}
[3425]845Real time programs are often reactive programs that loop forever responding to
846events (inputs) by performing some computation followed by some action (output)
[3434]847and continuing as before. For these programs the overall execution
848time does not make sense. The same is true for reactive programs that spend an
849unpredictable amount of time in I/O. Instead, what is interesting is the reaction time ---
850 the time spent between I/O events. Moreover, we are interested in
[3428]851predicting and ruling out crashes due to running out of space on certain
852inputs.
[3434]853Therefore we need a statement that talks about sub-runs of a
854program. A natural candidate is that the time predicted on the source
[3425]855code and spent on the object code by two corresponding sub-runs are the same.
[3434]856To make this statement formal we must identify the
[3425]857corresponding sub-runs and how to single out those that are meaningful.
[3434]858We introduce the notion of a \emph{measurable} sub-run of a run
859which does not exhaust the available stack before or during the
860sub-run, the number of function calls and returns
861in the sub-run is the same, the sub-run does not perform any I/O, and the sub-run
[3425]862starts with a label emission statement and ends with a return or another label
[3434]863emission statement. The stack usage is bounded using the stack usage model
[3425]864that is computed by the compiler.
865
866The statement that we formally proved is: for each C run with a measurable
[3434]867sub-run, there exists an object code run with a sub-run, with the same
868execution trace for both the prefix of the run and the sub-run itself,
869 and where the time spent by the
870object code in the sub-run is the same as the time predicted on the source code
[3425]871using the time cost model generated by the compiler.
[3434]872
873
[3425]874We briefly discuss the constraints for measurability. Not exhausting the stack
[3434]875space is necessary for a run to be meaningful, because the source semantics
876has no notion of running out of memory. Balancing
[3432]877function calls and returns is a requirement for precision: the labelling approach
[3434]878allows the scope of a label to extend after function calls to
879minimize the number of labels. (The scope excludes the called
880function's execution.) If the number of
[3425]881calls/returns is unbalanced, it means that there is a call we have not returned
882to that could be followed by additional instructions whose cost has already been
[3434]883taken in account.  The last condition on the start and end points of
884a run is also required to make the bound precise.  With these
885restrictions and the 8051's simple timing model we obtain \emph{exact}
886predictions.  If we relax these conditions then we obtain a corollary
887with an upper bound on the cost.
888Finally, I/O operations can be performed in the prefix of the run, but not in the
[3425]889measurable sub-run. Therefore we prove that we can predict reaction times, but
890not I/O times, as desired.
891
[3432]892\subsection{Dependent labelling}
[3425]893\label{lab:deplabel}
[3432]894The core idea of the basic labelling approach is to establish a tight connection
[3327]895between basic blocks executed in the source and target languages. Once the
896connection is established, any cost model computed on the object code can be
897transferred to the source code, without affecting the code of the compiler or
[3435]898its proof. In particular, we can also transport cost models
899that associate to each label a \emph{function} from the hardware state
900to a natural number.
901However, a problem arises during the instrumentation phase that replaces label
902emission statements with increments of global cost variables. They are
903incremented by the result of applying the label's cost function
904to the hardware state at the time of execution of the block. However,
905the hardware state comprises both the functional state that affects the
[3328]906computation (the value of the registers and memory) and the non-functional
[3435]907state that does not (the pipeline and cache contents, for example).
908We can find corresponding information for the former in the source code state, but constructing the
[3327]909correspondence may be hard and lifting the cost model to work on the source code
[3336]910state is likely to produce cost expressions that are too complex to understand and reason about.
[3435]911Fortunately, in modern architectures the cost of executing single
[3328]912instructions is either independent of the functional state or the jitter---the
913difference between the worst and best case execution times---is small enough
[3424]914to be bounded without losing too much precision. Therefore we only consider
915dependencies on the non-functional' parts of the state.
[3317]916
[3428]917The non-functional state is not directly related to the high level
918state and does not influence the functional properties. What can be
919done is to expose key aspects of the non-functional state in the
920source code. We present here the basic intuition in a simplified form:
921the technical details that allow us to handle the general case are
922more complex and can be found in~\cite{paolo}. We add to the source
923code an additional global variable that represents the non-functional
924state and another one that remembers the last few labels emitted. The
925state variable must be updated at every label emission statement,
[3435]926using an update function which is computed during the processor timing
927analysis.
928This update function assigns to each label a function from the
[3428]929recently emitted labels and old state to the new state. It is computed
[3435]930by composing the semantics of every instruction in a basic block
931restricted to the non-functional part of the state.
[3317]932
[3327]933Not all the details of the non-functional state needs to be exposed, and the
934technique works better when the part of state that is required can be summarized
935in a simple data structure. For example, to handle simple but realistic
936pipelines it is sufficient to remember a short integer that encodes the position
[3417]937of bubbles (stuck instructions) in the pipeline. In any case, it is not necessary
938for the user to understand the meaning of the state to reason over the properties
939of the
[3435]940program. Moreover, the user, or the invariant generator tools that
[3417]941analyse the instrumented source code produced by the compiler, can decide to
[3428]942trade precision of the analysis for simplicity by approximating the
[3435]943cost by safe bounds that do not depend on the processor state. Interestingly, the functional analysis of
[3424]944the code could determine which blocks are executed more frequently in order to
[3435]945use more aggressive approximations for those that are executed least.
[3317]946
[3432]947Dependent labelling can also be applied to allow the compiler to duplicate
[3336]948blocks that contain labels (e.g. in loop optimisations)~\cite{paolo}. The effect is to assign
[3327]949a different cost to the different occurrences of a duplicated label. For
950example, loop peeling turns a loop into the concatenation of a copy of the loop
[3424]951body for the first iteration and the conditional execution of the
952loop for successive iterations. Further optimisations will compile the two
953copies of the loop differently, with the first body usually
[3327]954taking more time.
[3317]955
[3417]956By introducing a variable that keeps track of the iteration number, we can
[3327]957associate to the label a cost that is a function of the iteration number. The
[3435]958same technique works for loop unrolling without modification: the function will
959assign one cost to the even iterations and another cost to the odd
960ones.  The loop optimisation code
[3424]961that duplicate the loop bodies must also modify the code to correctly propagate
[3435]962the update of the iteration numbers. The technical details are more complicated and
[3327]963can be found in the CerCo reports and publications. The implementation, however,
[3425]964is quite simple (and forms part of our OCaml version of the compiler)
965and the changes to a loop optimising compiler are minimal.
[3317]966
[3328]967\subsection{Techniques to exploit the induced cost model}
[3433]968\label{sec:exploit}
[3425]969We now turn our attention to synthesising high-level costs, such as
[3436]970the reaction time of a real-time program.  We consider as our starting point source level costs
[3432]971provided by basic labelling, in other words annotations
[3425]972on the source code which are constants that provide a sound and sufficiently
[3327]973precise upper bound on the cost of executing the blocks after compilation to
974object code.
[3317]975
[3327]976The principle that we have followed in designing the cost synthesis tools is
[3436]977that the synthesised bounds should be expressed and proved within a general
[3327]978purpose tool built to reason on the source code. In particular, we rely on the
[3328]979Frama-C tool to reason on C code and on the Coq proof-assistant to reason on
[3327]980higher-order functional programs.
[3436]981This principle entails that
982the inferred synthetic bounds are indeed correct as long as the general purpose
983tool is, and that there is no limitation on the class of programs that can be handled,
984for example by resorting to interactive proof.
[3317]985
[3327]986Of course, automation is desirable whenever possible. Within this framework,
987automation means writing programs that give hints to the general purpose tool.
988These hints may take the form, say, of loop invariants/variants, of predicates
989describing the structure of the heap, or of types in a light logic. If these
990hints are correct and sufficiently precise the general purpose tool will produce
991a proof automatically, otherwise, user interaction is required.
[3317]992
[3339]993\paragraph{The Cost plugin and its application to the Lustre compiler.}
[3328]994Frama-C \cite{framac} is a set of analysers for C programs with a
[3327]995specification language called ACSL. New analyses can be dynamically added
[3337]996via a plugin system. For instance, the Jessie plugin allows deductive
[3327]997verification of C programs with respect to their specification in ACSL, with
998various provers as back-end tools.
[3328]999We developed the CerCo Cost plugin for the Frama-C platform as a proof of
[3327]1000concept of an automatic environment exploiting the cost annotations produced by
[3417]1001the CerCo compiler. It consists of an OCaml program which essentially
[3436]1002uses the CerCo compiler to produce a related C program with cost annotations,
1003and applies some heuristics to produce a tentative bound on the cost of
[3327]1004executing the C functions of the program as a function of the value of their
[3436]1005parameters. The user can then call the Jessie plugin to discharge the
[3327]1006related proof obligations.
1007In the following we elaborate on the soundness of the framework and the
[3436]1008experiments we performed with the Cost tool on C programs, including some produced by a
[3327]1009Lustre compiler.
[3317]1010
[3339]1011\paragraph{Soundness.} The soundness of the whole framework depends on the cost
[3436]1012annotations added by the CerCo compiler,
1013the verification conditions (VCs) generated by Jessie, and the
1014external provers discharging the VCs. Jessie can be used to verify the
1015synthesised bounds because our plugin generates them in ACSL format. Thus, even if the added synthetic costs are
[3327]1016incorrect (relatively to the cost annotations), the process as a whole is still
1017correct: indeed, Jessie will not validate incorrect costs and no conclusion can
1018be made about the WCET of the program in this case. In other terms, the
[3436]1019soundness does not depend on the cost plugin, which can in
[3327]1020principle produce any synthetic cost. However, in order to be able to actually
1021prove a WCET of a C function, we need to add correct annotations in a way that
1022Jessie and subsequent automatic provers have enough information to deduce their
1023validity. In practice this is not straightforward even for very simple programs
1024composed of branching and assignments (no loops and no recursion) because a fine
1025analysis of the VCs associated with branching may lead to a complexity blow up.
[3436]1026
[3339]1027\paragraph{Experience with Lustre.} Lustre is a data-flow language for programming
[3417]1028synchronous systems, with a compiler which targets C. We designed a
[3327]1029wrapper for supporting Lustre files.
[3436]1030The C function produced by the compiler is relatively simple loop-free
1031code which implements the step function of the
1032synchronous system and computing the WCET of the function amounts to obtaining a
[3328]1033bound on the reaction time of the system. We tested the Cost plugin and the
[3327]1034Lustre wrapper on the C programs generated by the Lustre compiler. For programs
[3424]1035consisting of a few hundred lines of code, the cost plugin computes a
1036WCET and Alt-Ergo is able to discharge all VCs automatically.
[3317]1037
[3339]1038\paragraph{Handling C programs with simple loops.}
[3327]1039The cost annotations added by the CerCo compiler take the form of C instructions
[3424]1040that update a fresh global variable called the cost variable by a constant.
1041Synthesizing a WCET of a C function thus consists of statically resolving an
[3327]1042upper bound of the difference between the value of the cost variable before and
[3424]1043after the execution of the function, i.e. finding the instructions
[3327]1044that update the cost variable and establish the number of times they are passed
[3424]1045through during the flow of execution. To perform the analysis the plugin
[3436]1046assumes that there are no recursive functions in the program, and that
1047every loop is annotated with a variant. In the case of `for' loops the
1048variants are automatically inferred where a loop counter can be
1049syntactically detected.
[3317]1050
[3436]1051The plugin computes a call-graph and proceeds to calculate bounds for
1052each function from the leaves up to the main function.
1053The computation of the cost of each function is performed by traversing its
1054control flow graph, where the cost of a node is the maximum of the
[3327]1055costs of the successors.
1056In the case of a loop with a body that has a constant cost for every step of the
1057loop, the cost is the product of the cost of the body and of the variant taken
1058at the start of the loop.
1059In the case of a loop with a body whose cost depends on the values of some free
[3335]1060variables, a fresh logic function $f$ is introduced to represent the cost of the
[3327]1061loop in the logic assertions. This logic function takes the variant as a first
[3335]1062parameter. The other parameters of $f$ are the free variables of the body of the
[3327]1063loop. An axiom is added to account the fact that the cost is accumulated at each
1064step of the loop.
[3337]1065The cost of the function is directly added as post-condition of the function.
[3321]1066
[3436]1067The user can also specify more precise variants and annotate functions
1068with their own cost specifications. The plugin will use these
1069instead of computing its own, allowing greater precision and the
1070ability to analyse programs which the variant generator does not
1071support.
1072
1073In addition to the loop-free Lustre code, this method was successfully
[3437]1074applied to a small range of cryptographic code.  See~\cite{labelling}
1075for more details.  The example in Section~\ref{sec:workflow} was also
1076produced using the plug-in.  The variant was calculated automatically
1077by noticing that \lstinline'j' is a loop counter with maximum value
1078\lstinline'len'.  The most expensive path through the loop body
1079($78+136 = 214$) is then multiplied by the number of iterations to
1080give the cost of the loop.
[3436]1081
[3339]1082\paragraph{C programs with pointers.}
[3327]1083When it comes to verifying programs involving pointer-based data structures,
1084such as linked lists, trees, or graphs, the use of traditional first-order logic
1085to specify, and of SMT solvers to verify, shows some limitations. Separation
[3431]1086logic is an elegant alternative. It is a program logic
[3327]1087with a new notion of conjunction to express spatial heap separation. Bobot has
[3337]1088recently introduced automatically generated separation
1089predicates to simulate separation logic reasoning in the Jessie plugin where the specification language, the verification condition
[3327]1090generator, and the theorem provers were not designed with separation logic in
[3431]1091mind~\cite{bobot}. CerCo's plugin can exploit the separation predicates to automatically
[3327]1092reason on the cost of execution of simple heap manipulation programs such as an
1093in-place list reversal.
[3317]1094
[3417]1095\section{Conclusions and future work}
[3316]1096
[3337]1097All the CerCo software and deliverables can be found on the CerCo homepage at~\url{http://cerco.cs.unibo.it}.
[3316]1098
1099The results obtained so far are encouraging and provide evidence that
1100it is possible to perform static time and space analysis at the source level
[3337]1101without losing accuracy, reducing the trusted code base and reconciling the
[3327]1102study of functional and non-functional properties of programs. The
[3316]1103techniques introduced seem to be scalable, cover both imperative and
[3337]1104functional languages and are compatible with every compiler optimisation
[3316]1105considered by us so far.
1106
[3337]1107To prove that compilers can keep track of optimisations
[3316]1108and induce a precise cost model on the source code, we targeted a simple
1109architecture that admits a cost model that is execution history independent.
[3322]1110The most important future work is dealing with hardware architectures
[3316]1111characterized by history dependent stateful components, like caches and
[3428]1112pipelines. The main issue is to assign a parametric, dependent cost
[3432]1113to basic blocks that can be later transferred by the labelling approach to
[3316]1114the source code and represented in a meaningful way to the user. The dependent
[3432]1115labelling approach that we have studied seems a promising tool to achieve
[3428]1116this goal, but more work is required to provide good source level
1117approximations of the relevant processor state.
[3316]1118
[3428]1119Other examples of future work are to improve the cost invariant
1120generator algorithms and the coverage of compiler optimizations, to combining
[3432]1121the labelling approach with the type and effect discipline of~\cite{typeffects}
[3428]1122to handle languages with implicit memory management, and to experiment with
1123our tools in the early phases of development. Larger case studies are also necessary
[3337]1124to evaluate the CerCo's prototype on realistic, industrial-scale programs.
[3316]1125
[3339]1126% \bibliographystyle{splncs}
1127\bibliography{fopara13}
1128% \begin{thebibliography}{19}
1129%
1130% \bibitem{survey} \textbf{A Survey of Static Program Analysis Techniques}
1131% W.~W\"ogerer. Technical report. Technische Universit\"at Wien 2005
1132%
1133% \bibitem{cerco} \textbf{Certified Complexity}. R.M. Amadio, A. Asperti, N. Ayache,
1134% B. Campbell, D. P. Mulligan, R. Pollack, Y. Regis-Gianas, C. Sacerdoti Coen, I.
1135% Stark, in Procedia Computer Science, Volume 7, 2011, Proceedings of the 2 nd
1136% European Future Technologies Conference and Exhibition 2011 (FET 11), 175-177.
1137%
[3432]1138% \bibitem{labelling} \textbf{Certifying and Reasoning on Cost Annotations in C
[3339]1139% Programs}, N.  Ayache, R.M. Amadio, Y.R\'{e}gis-Gianas, in Proc. FMICS, Springer
1140% LNCS
1141% 7437: 32-46, 2012.
1142% %, DOI:10.1007/978-3-642-32469-7\_3.
1143%
[3432]1144% \bibitem{labelling2} \textbf{Certifying and reasoning on cost annotations of
[3339]1145% functional programs}.
1146% R.M. Amadio, Y. R\'{e}gis-Gianas. Proceedings of the Second international conference
1147% on Foundational and Practical Aspects of Resource Analysis FOPARA 2011 Springer
1148% LNCS 7177:72-89, 2012.
1149%
1150% \bibitem{compcert} \textbf{Formal verification of a realistic compiler}. X. Leroy,  In Commun. ACM 52(7), 107–115, 2009.
1151%
1152% \bibitem{framac} \textbf{Frama-C user manual}. L. Correnson, P. Cuoq, F. Kirchner, V. Prevosto, A. Puccetti, J. Signoles,
1153% B. Yakobowski. in CEA-LIST, Software Safety Laboratory, Saclay, F-91191,
1154% \url{http://frama-c.com/}.
1155%
1156% \bibitem{paolo} \textbf{Indexed Labels for Loop Iteration Dependent Costs}. P.
1157% Tranquilli, in Proceedings of the 11th International Workshop on Quantitative
1158% Aspects of Programming Languages and Systems (QAPL 2013), Rome, 23rd-24th March
1159% 2013, Electronic Proceedings in Theoretical Computer Science, to appear in 2013.
1160%
1161% \bibitem{separation} \textbf{Intuitionistic reasoning about shared mutable data
1162% structure} J.C. Reynolds. In Millennial Perspectives in Computer Science,
1163% pages 303–321, Houndsmill, Hampshire, 2000. Palgrave.
1164%
1165% \bibitem{lustre} \textbf{LUSTRE: a declarative language for real-time
1166% programming}
1167% P. Caspi, D. Pilaud, N. Halbwachs, J.A. Plaice. In Proceedings of
1168% the 14th ACM SIGACT-SIGPLAN symposium on Principles of programming languages ACM
1169% 1987.
1170%
1171% \bibitem{correctness} \textbf{On the correctness of an optimising assembler for
1172% the intel MCS-51 microprocessor}.
1173%   D. P. Mulligan, C. Sacerdoti Coen. In Proceedings of the Second
1174% international conference on Certified Programs and Proofs, Springer-Verlag 2012.
1175%
1176% \bibitem{proartis} \textbf{PROARTIS: Probabilistically Analysable Real-Time
1177% Systems}, F.J. Cazorla, E. Qui\~{n}ones, T. Vardanega, L. Cucu, B. Triquet, G.
1178% Bernat, E. Berger, J. Abella, F. Wartel, M. Houston, et al., in ACM Transactions
1179% on Embedded Computing Systems, 2012.
1180%
1181% \bibitem{embounded} \textbf{The EmBounded project (project paper)}. K. Hammond,
1182% R. Dyckhoff, C. Ferdinand, R. Heckmann, M. Hofmann, H. Loidl, G. Michaelson, J.
1183% Serot, A. Wallace, in Trends in Functional Programming, Volume 6, Intellect
1184% Press, 2006.
1185%
1186% \bibitem{matita}
1187% \textbf{The Matita Interactive Theorem Prover}.
1188% A. Asperti, C. Sacerdoti Coen, W. Ricciotti, E. Tassi.
1189% 23rd International Conference on Automated Deduction, CADE 2011.
1190%
1191% \bibitem{typeffects} \textbf{The Type and Effect Discipline}. J.-P. Talpin,
1192%  P. Jouvelot.
1193%   In Proceedings of the Seventh Annual Symposium on Logic in Computer Science
1194% (LICS '92), Santa Cruz, California, USA, June 22-25, 1992.
1195% IEEE Computer Society 1992.
1196%
1197% \bibitem{stateart} \textbf{The worst-case execution-time problem overview of
1198% methods
1199% and survey of tools.} R. Wilhelm et al., in  ACM Transactions on Embedded
1200% Computing Systems, 7:1–53, May 2008.
1201%
1202% %\bibitem{proartis2} \textbf{A Cache Design for Probabilistic Real-Time
1203% % Systems}, L. Kosmidis, J. Abella, E. Quinones, and F. Cazorla, in Design,
1204% % Automation, and Test in Europe (DATE), Grenoble, France, 03/2013.
1205%
1206% \end{thebibliography}
[3307]1207
[3328]1208
[3324]1209%\bibliography{fopara13.bib}
1210
[3438]1211\appendix
1212
1213\include{appendix}
1214
[3307]1215\end{document}
Note: See TracBrowser for help on using the repository browser.