source: Papers/jar-cerco-2017/cerco.tex @ 3607

Last change on this file since 3607 was 3607, checked in by mulligan, 4 years ago

more structure to paper

File size: 28.2 KB
2%!PS-Adobe-3.0 EPSF-3.0
3%%BoundingBox: 19 19 221 221
4%%CreationDate: Mon Sep 29 1997
5%%Creator: programmed by hand (JK)
9  20 20 moveto
10  20 220 lineto
11  220 220 lineto
12  220 20 lineto
142 setlinewidth
16  .4 setgray fill
35            bookmarks,bookmarksopen,bookmarksdepth=2]{hyperref}
51        {\setlength{\fboxsep}{5pt}
52                \setlength{\mylength}{\linewidth}%
53                \addtolength{\mylength}{-2\fboxsep}%
54                \addtolength{\mylength}{-2\fboxrule}%
55                \Sbox
56                \minipage{\mylength}%
57                        \setlength{\abovedisplayskip}{0pt}%
58                        \setlength{\belowdisplayskip}{0pt}%
59                }%
60                {\endminipage\endSbox
61                        \[\fbox{\TheSbox}\]}
65\title{CerCo: Certified Complexity\thanks{The project CerCo acknowledges the
66financial support of the Future and Emerging Technologies (FET) programme within
67the Seventh Framework Programme for Research of the European Commission, under
68FET-Open grant number: 243881}}
69\subtitle{Verified lifting of concrete complexity annotations through a realistic C compiler}
70\journalname{Journal of Automated Reasoning}
71\titlerunning{Certified Complexity}
72\date{Received: date / Accepted: date}
73\author{Jaap Boender \and Brian Campbell \and Dominic P. Mulligan \and Claudio {Sacerdoti Coen}} % who else?
74\authorrunning{Boender, Campbell, Mulligan, and Sacerdoti Coen}
75\institute{Jaap Boender \at
76              Faculty of Science and Technology,\\
77                                                        Middlesex University London,\\
78                                                        United Kingdom.\\
79              \email{}
80           \and
81           Brian Campbell \at
82              Department of Informatics,\\
83              University of Edinburgh,\\
84              United Kingdom.\\
85              \email{}
86           \and
87           Dominic P. Mulligan \at
88             Computer Laboratory,\\
89             University of Cambridge, \\
90             United Kingdom.\\
91             \email{ }
92           \and
93           Claudio Sacerdoti Coen \at
94              Dipartimento di Informatica---Scienza e Ingegneria (DISI),\\
95              University of Bologna,\\
96              Italy.\\
97              \email{}}
104We provide an overview of the FET-Open Project CerCo (`Certified Complexity').
105Our main achievement is the development of a technique for analysing non-functional properties of programs (time, space) at the source level with little or no loss of accuracy and a small trusted code base.
106The core component is a C compiler, verified in the Matita theorem prover, that produces an instrumented copy of the source code in addition to generating object code.
107This instrumentation exposes, and tracks precisely, the actual (non-asymptotic) computational cost of the input program at the source level.
108Untrusted invariant generators and trusted theorem provers may then be used to compute and certify the parametric execution time of the code.
109\keywords{Verified compilation \and Complexity analysis \and CerCo (`Certified Complexity')}
112% ---------------------------------------------------------------------------- %
113% SECTION                                                                      %
114% ---------------------------------------------------------------------------- %
119%\paragraph{Problem statement.}
120Programs can be specified with both
121functional constraints (what the program must do) and non-functional constraints (what time, space or other resources the program may use).  In the current
122state of the art, functional properties are verified
123by combining user annotations---preconditions, invariants, and so on---with a
124multitude of automated analyses---invariant generators, type systems, abstract
125interpretation, theorem proving, and so on---on the program's high-level source code.
126By contrast, many non-functional properties
127are verified using analyses on low-level object code,
128%\footnote{A notable
129%  exception is the explicit allocation of heap space in languages like C, which
130%  can be handled at the source level.}
131but these analyses may then need information
132about the high-level functional behaviour of the program that must then be reconstructed.
133This analysis on low-level object code has several problems:
136It can be hard to deduce the high-level structure of the program after compiler optimisations.
137The object code produced by an optimising compiler may have radically different control flow to the original source code program.
139Techniques that operate on object code are not useful early in the development process of a program, yet problems with a program's design or implementation are cheaper to resolve earlier in the process, rather than later.
141Parametric cost analysis is very hard: how can we reflect a cost that depends on the execution state, for example the
142value of a register or a carry bit, to a cost that the user can understand
143looking at the source code?
145Performing functional analyses on object code makes it hard for the programmer to provide information about the program and its expected execution, leading to a loss of precision in the resulting analyses.
147\paragraph{Vision and approach.}
148We want to reconcile functional and
149non-functional analyses: to share information and perform both at the same time
150on high-level source code.
152What has previously prevented this approach is the lack of a uniform and precise
153cost model for high-level code as each statement occurrence is compiled
154differently, optimisations may change control flow, and the cost of an object
155code instruction may depend on the runtime state of hardware components like
156pipelines and caches, all of which are not visible in the source code.
158We envision a new generation of compilers that track program structure through compilation and optimisation and exploit this
159information to define a precise, non-uniform cost model for source code that accounts for runtime state. With such a cost model we can
160reduce non-functional verification to the functional case and exploit the state
161of the art in automated high-level verification~\cite{survey}. The techniques
162currently used by the Worst Case Execution Time (WCET) community, who perform analyses on object code,
163are still available but can be coupled with additional source-level
164analyses. Where our approach produces overly complex cost models, safe approximations can be used to trade complexity with precision.
165Finally, source code analysis can be used early in the development process, when
166components have been specified but not implemented, as modularity means
167that it is enough to specify the non-functional behaviour of missing
170We have developed \emph{the labelling approach}~\cite{labelling}, a
171technique to implement compilers that induce cost models on source programs by
172very lightweight tracking of code changes through compilation. We have studied
173how to formally prove the correctness of compilers implementing this technique, and
174have implemented such a compiler from C to object binaries for the 8051
175microcontroller for predicting execution time and stack space usage,
176verifying it in an interactive theorem prover.  As we are targeting
177an embedded microcontroller we do not consider dynamic memory allocation.
179To demonstrate source-level verification of costs we have implemented
180a Frama-C plugin~\cite{framac} that invokes the compiler on a source
181program and uses it to generate invariants on the high-level source
182that correctly model low-level costs. The plugin certifies that the
183program respects these costs by calling automated theorem provers, a
184new and innovative technique in the field of cost analysis. Finally,
185we have conducted several case studies, including showing that the
186plugin can automatically compute and certify the exact reaction time
187of Lustre~\cite{lustre} data flow programs compiled into C.
189% ---------------------------------------------------------------------------- %
190% SECTION                                                                      %
191% ---------------------------------------------------------------------------- %
193\section{Project context and approach}
196Formal methods for verifying functional properties of programs have 
197now reached a level of maturity and automation that their adoption is slowly
198increasing in production environments. For safety critical code, it is
199becoming commonplace to combine rigorous software engineering methodologies and testing
200with static analyses, taking the strengths of each and mitigating
201their weaknesses. Of particular interest are open frameworks
202for the combination of different formal methods, where the programs can be
203progressively specified and enriched with new safety
204guarantees: every method contributes knowledge (e.g. new invariants) that
205becomes an assumption for later analysis.
207The outlook for verifying non-functional properties of programs (time spent,
208memory used, energy consumed) is bleaker.
209% and the future seems to be getting even worse.
210Most industries verify that real time systems meet their deadlines
211by simply performing many runs of the system and timing their execution,  computing the
212maximum time and adding an empirical safety margin, claiming the result to be a
213bound for the WCET of the program. Formal methods and software to statically
214analyse the WCET of programs exist, but they often produce bounds that are too
215pessimistic to be useful. Recent advancements in hardware architecture
216have been
217focused on the improvement of the average case performance, not the
218predictability of the worst case. Execution time is becoming increasingly
219dependent on execution history and the internal state of
220hardware components like pipelines and caches. Multi-core processors and non-uniform
221memory models are drastically reducing the possibility of performing
222static analysis in isolation, because programs are less and less time
223composable. Clock-precise hardware models are necessary for static analysis, and
224obtaining them is becoming harder due to the increased sophistication
225of hardware design.
227Despite these problems, the need for reliable real time
228systems and programs is increasing, and there is pressure
229from the research community for the introduction of
230hardware with more predictable behaviour, which would be more suitable
231for static analysis.  One example, being investigated by the Proartis
232project~\cite{proartis}, is to decouple execution time from execution
233history by introducing randomisation.
235In CerCo~\cite{cerco} we do not address this problem, optimistically
236assuming that improvements in low-level timing analysis or architecture will make
237verification feasible in the longer term. Instead, the main objective of our work is
238to bring together the static analysis of functional and non-functional
239properties, which in the current state of the art are
240independent activities with limited exchange of information: while the
241functional properties are verified on the source code, the analysis of
242non-functional properties is performed on object code to exploit
243clock-precise hardware models.
245\subsection{Current object-code methods}
247Analysis currently takes place on object code for two main reasons.
248First, there cannot be a uniform, precise cost model for source
249code instructions (or even basic blocks). During compilation, high level
250instructions are broken up and reassembled in context-specific ways so that
251identifying a fragment of object code and a single high level instruction is
252infeasible. Even the control flow of the object and source code can be very
253different as a result of optimisations, for example aggressive loop
254optimisations may completely transform source level loops. Despite the lack of a uniform, compilation- and
255program-independent cost model on the source language, the literature on the
256analysis of non-asymptotic execution time on high level languages assuming
257such a model is growing and gaining momentum. However, unless we provide a
258replacement for such cost models, this literature's future practical impact looks
259to be minimal. Some hope has been provided by the EmBounded project \cite{embounded}, which
260compositionally compiles high-level code to a byte code that is executed by an
261interpreter with guarantees on the maximal execution time spent for each byte code
262instruction. This provides a uniform model at the expense of the model's
263precision (each cost is a pessimistic upper bound) and the performance of the
264executed code (because the byte code is interpreted compositionally instead of
265performing a fully non-compositional compilation).
267The second reason to perform the analysis on the object code is that bounding
268the worst case execution time of small code fragments in isolation (e.g. loop
269bodies) and then adding up the bounds yields very poor estimates as no
270knowledge of the hardware state prior to executing the fragment can be assumed. By
271analysing longer runs the bound obtained becomes more precise because the lack
272of information about the initial state has a relatively small impact.
274To calculate the cost of an execution, value and control flow analyses
275are required to bound the number of times each basic block is
276executed.  Currently, state
277of the art WCET analysis tools, such as AbsInt's aiT toolset~\cite{absint}, perform these analyses on
278object code, where the logic of the program is harder to reconstruct
279and most information available at the source code level has been lost;
280see~\cite{stateart} for a survey. Imprecision in the analysis can lead to useless bounds. To
281augment precision, the tools ask the user to provide constraints on
282the object code control flow, usually in the form of bounds on the
283number of iterations of loops or linear inequalities on them. This
284requires the user to manually link the source and object code,
285translating his assumptions on the source code (which may be wrong) to
286object code constraints. The task is error prone and hard, especially
287in the presence of complex compiler optimisations.
289Traditional techniques for WCET that work on object code are also affected by
290another problem: they cannot be applied before the generation of the object
291code. Functional properties can be analysed in early development stages, while
292analysis of non-functional properties may come too late to avoid expensive
293changes to the program architecture.
295\subsection{CerCo's approach}
297In CerCo we propose a radically new approach to the problem: we reject the idea
298of a uniform cost model and we propose that the compiler, which knows how the
299code is translated, must return the cost model for basic blocks of high level
300instructions. It must do so by keeping track of the control flow modifications
301to reverse them and by interfacing with processor timing analysis.
303By embracing compilation, instead of avoiding it like EmBounded did, a CerCo
304compiler can both produce efficient code and return costs that are
305as precise as the processor timing analysis can be. Moreover, our costs can be
306parametric: the cost of a block can depend on actual program data, on a
307summary of the execution history, or on an approximated representation of the
308hardware state. For example, loop optimisations may assign a cost to a loop body
309that is a function of the number of iterations performed. As another example,
310the cost of a block may be a function of the vector of stalled pipeline states,
311which can be exposed in the source code and updated at each basic block exit. It
312is parametricity that allows one to analyse small code fragments without losing
313precision. In the analysis of the code fragment we do not have to ignore the
314initial hardware state, rather, we may assume that we know exactly which
315state (or mode, as the WCET literature calls it) we are in.
317The CerCo approach has the potential to dramatically improve the state of the
318art. By performing control and data flow analyses on the source code, the error
319prone translation of invariants is completely avoided. Instead, this
320work is done at the source level using tools of the user's choice.
321Any available technique for the verification of functional properties
322can be immediately reused and multiple techniques can collaborate together to
323infer and certify cost invariants for the program.  There are no
324limitations on the types of loops or data structures involved. Parametric cost analysis
325becomes the default one, with non-parametric bounds used as a last resort when the user
326decides to trade the complexity of the analysis with its precision. \emph{A priori}, no
327technique previously used in traditional WCET is lost: processor
328timing analyses can be used by the compiler on the object code, and the rest can be applied
329at the source code level.
330 Our approach can also work in the early
331stages of development by axiomatically attaching costs to unimplemented components.
334Software used to verify properties of programs must be as bug free as
335possible. The trusted code base for verification consists of the code that needs
336to be trusted to believe that the property holds. The trusted code base of
337state-of-the-art WCET tools is very large: one needs to trust the control flow
338analyser, the linear programming libraries used, and also the formal models
339of the hardware under analysis, for example. In CerCo we are moving the control flow analysis to the source
340code and we are introducing a non-standard compiler too. To reduce the trusted
341code base, we implemented a prototype and a static analyser in an interactive
342theorem prover, which was used to certify that the costs added to the source
343code are indeed those incurred by the hardware. Formal models of the
344hardware and of the high level source languages were also implemented in the
345interactive theorem prover. Control flow analysis on the source code has been
346obtained using invariant generators, tools to produce proof obligations from
347generated invariants and automatic theorem provers to verify the obligations. If
348these tools are able to generate proof traces that can be
349independently checked, the only remaining component that enters the trusted code
350base is an off-the-shelf invariant generator which, in turn, can be proved
351correct using an interactive theorem prover. Therefore we achieve the double
352objective of allowing the use of more off-the-shelf components (e.g. provers and
353invariant generators) whilst reducing the trusted code base at the same time.
355%\paragraph{Summary of the CerCo objectives.} To summarize, the goal of CerCo is
356% to reconcile functional with non-functional analysis by performing them together
357% on the source code, sharing common knowledge about execution invariants. We want
358% to achieve the goal implementing a new generation of compilers that induce a
359% parametric, precise cost model for basic blocks on the source code. The compiler
360% should be certified using an interactive theorem prover to minimize the trusted
361% code base of the analysis. Once the cost model is induced, off-the-shelf tools
362% and techniques can be combined together to infer and prove parametric cost
363% bounds.
364%The long term benefits of the CerCo vision are expected to be:
365%1. the possibility to perform static analysis during early development stages
366%2.  parametric bounds made easier
367%3.  the application of off-the-shelf techniques currently unused for the
368% analysis of non-functional properties, like automated proving and type systems
369%4. simpler and safer interaction with the user, that is still asked for
370% knowledge, but on the source code, with the additional possibility of actually
371% verifying the provided knowledge
372%5. a reduced trusted code base
373%6. the increased accuracy of the bounds themselves.
375%The long term success of the project is hindered by the increased complexity of
376% the static prediction of the non-functional behaviour of modern hardware. In the
377% time frame of the European contribution we have focused on the general
378% methodology and on the difficulties related to the development and certification
379% of a cost model inducing compiler.
381% ---------------------------------------------------------------------------- %
382% SECTION                                                                      %
383% ---------------------------------------------------------------------------- %
385\section{The typical CerCo workflow}
391char a[] = {3, 2, 7, 14};
392char threshold = 4;
394int count(char *p, int len) {
395  char j;
396  int found = 0;
397  for (j=0; j < len; j++) {
398    if (*p <= threshold)
399      found++;
400    p++;
401    }
402  return found;
405int main() {
406  return count(a,4);
410%  $\vcenter{\includegraphics[width=7.5cm]{interaction_diagram.pdf}}$
412    baseline={([yshift=-.5ex]current bounding box)},
413    element/.style={draw, text width=1.6cm, on chain, text badly centered},
414    >=stealth
415    ]
416{[start chain=going below, node distance=.5cm]
417\node [element] (cerco) {CerCo\\compiler};
418\node [element] (cost)  {CerCo\\cost plugin};
419{[densely dashed]
420\node [element] (ded)   {Deductive\\platform};
421\node [element] (check) {Proof\\checker};
424\coordinate [left=3.5cm of cerco] (left);
425{[every node/.style={above, text width=3.5cm, text badly centered,
426                     font=\scriptsize}]
427\draw [<-] ([yshift=-1ex]cerco.north west) coordinate (t) --
428    node {C source}
429    (t-|left);
430\draw [->] (cerco) -- (cost);
431\draw [->] ([yshift=1ex]cerco.south west) coordinate (t) --
432    node {C source+\color{red}{cost annotations}}
433    (t-|left) coordinate (cerco out);
434\draw [->] ([yshift=1ex]cost.south west) coordinate (t) --
435    node {C source+\color{red}{cost annotations}\\+\color{blue}{synthesized assertions}}
436    (t-|left) coordinate (out);
437{[densely dashed]
438\draw [<-] ([yshift=-1ex]ded.north west) coordinate (t) --
439    node {C source+\color{red}{cost annotations}\\+\color{blue}{complexity assertions}}
440    (t-|left) coordinate (ded in) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
441\draw [->] ([yshift=1ex]ded.south west) coordinate (t) --
442    node {complexity obligations}
443    (t-|left) coordinate (out);
444\draw [<-] ([yshift=-1ex]check.north west) coordinate (t) --
445    node {complexity proof}
446    (t-|left) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
447\draw [dash phase=2.5pt] (cerco out) .. controls +(-1, 0) and +(-1, 0) ..
448    (ded in);
450%% user:
451% head
452\draw (current bounding box.west) ++(-.2,.5)
453    circle (.2) ++(0,-.2) -- ++(0,-.1) coordinate (t);
454% arms
455\draw (t) -- +(-.3,-.2);
456\draw (t) -- +(.3,-.2);
457% body
458\draw (t) -- ++(0,-.4) coordinate (t);
459% legs
460\draw (t) -- +(-.2,-.6);
461\draw (t) -- +(.2,-.6);
464\caption{On the left: C code to count the number of elements in an array
465 that are less than or equal to a given threshold. On the right: CerCo's interaction
466 diagram. Components provided by CerCo are drawn with a solid border.}
469We illustrate the workflow we envisage (on the right
470of~\autoref{test1}) on an example program (on the left
471of~\autoref{test1}).  The user writes the program and feeds it to the
472CerCo compiler, which outputs an instrumented version of the same
473program that updates global variables that record the elapsed
474execution time and the stack space usage.  The red lines in
475\autoref{itest1} introducing variables, functions and function calls
476starting with \lstinline'__cost' and \lstinline'__stack' are the instrumentation introduced by the
477compiler.  For example, the two calls at the start of
478\lstinline'count' say that 4 bytes of stack are required, and that it
479takes 111 cycles to reach the next cost annotation (in the loop body).
480The compiler measures these on the labelled object code that it generates.
482 The annotated program can then be enriched with complexity
483assertions in the style of Hoare logic, that are passed to a deductive
484platform (in our case Frama-C). We provide as a Frama-C cost plugin a
485simple automatic synthesiser for complexity assertions which can
486be overridden by the user to increase or decrease accuracy.  These are the blue
487comments starting with \lstinline'/*@' in \autoref{itest1}, written in
488Frama-C's specification language, ACSL.  From the
489assertions, a general purpose deductive platform produces proof
490obligations which in turn can be closed by automatic or interactive
491provers, ending in a proof certificate.
493% NB: if you try this example with the live CD you should increase the timeout
495Twelve proof obligations are generated from~\autoref{itest1} (to prove
496that the loop invariant holds after one execution if it holds before,
497to prove that the whole program execution takes at most 1358 cycles, and so on).  Note that the synthesised time bound for \lstinline'count',
498$178+214*(1+\text{\lstinline'len'})$ cycles, is parametric in the length of
499the array. The CVC3 prover
500closes all obligations within half a minute on routine commodity
501hardware.  A simpler non-parametric version can be solved in a few
506        moredelim=[is][\color{blue}]{$}{$},
507        moredelim=[is][\color{red}]{|}{|},
508        lineskip=-2pt}
511|int __cost = 33, __stack = 5, __stack_max = 5;|
512|void __cost_incr(int incr) { __cost += incr; }|
513|void __stack_incr(int incr) {
514  __stack += incr;
515  __stack_max = __stack_max < __stack ? __stack : __stack_max;
518char a[4] = {3, 2, 7, 14};  char threshold = 4;
520/*@ behavior stack_cost:
521      ensures __stack_max <= __max(\old(__stack_max), 4+\old(__stack));
522      ensures __stack == \old(__stack);
523    behavior time_cost:
524      ensures __cost <= \old(__cost)+(178+214*__max(1+\at(len,Pre), 0));
526int count(char *p, int len) {
527  char j;  int found = 0;
528  |__stack_incr(4);  __cost_incr(111);|
529  $__l: /* internal */$
530  /*@ for time_cost: loop invariant
531        __cost <= \at(__cost,__l)+
532                  214*(__max(\at((len-j)+1,__l), 0)-__max(1+(len-j), 0));
533      for stack_cost: loop invariant
534        __stack_max == \at(__stack_max,__l);
535      for stack_cost: loop invariant
536        __stack == \at(__stack,__l);
537      loop variant len-j;
538  */
539  for (j = 0; j < len; j++) {
540    |__cost_incr(78);|
541    if (*p <= threshold) { |__cost_incr(136);| found ++; }
542    else { |__cost_incr(114);| }
543    p ++;
544  }
545  |__cost_incr(67);  __stack_incr(-4);|
546  return found;
549/*@ behavior stack_cost:
550      ensures __stack_max <= __max(\old(__stack_max), 6+\old(__stack));
551      ensures __stack == \old(__stack);
552    behavior time_cost:
553      ensures __cost <= \old(__cost)+1358;
555int main(void) {
556  int t;
557  |__stack_incr(2);  __cost_incr(110);|
558  t = count(a,4);
559  |__stack_incr(-2);|
560  return t;
563\caption{The instrumented version of the program in \autoref{test1},
564 with instrumentation added by the CerCo compiler in red and cost invariants
565 added by the CerCo Frama-C plugin in blue. The \lstinline'__cost',
566 \lstinline'__stack' and \lstinline'__stack_max' variables hold the elapsed time
567in clock cycles and the current and maximum stack usage. Their initial values
568hold the clock cycles spent in initialising the global data before calling
569\lstinline'main' and the space required by global data (and thus unavailable for
570the stack).
575% ---------------------------------------------------------------------------- %
576% SECTION                                                                      %
577% ---------------------------------------------------------------------------- %
Note: See TracBrowser for help on using the repository browser.