1\documentclass{llncs}
2
3\usepackage{amsfonts}
4\usepackage{amsmath}
5\usepackage{amssymb} 
6\usepackage[english]{babel}
7\usepackage{color}
8\usepackage{fancybox}
9\usepackage{fancyvrb}
10\usepackage{graphicx}
11\usepackage[colorlinks,
12            bookmarks,bookmarksopen,bookmarksdepth=2]{hyperref}
13\usepackage{hyphenat}
14\usepackage[utf8x]{inputenc}
15\usepackage{listings}
16\usepackage{mdwlist}
17\usepackage{microtype}
18\usepackage{stmaryrd}
19\usepackage{url}
20
21\usepackage{tikz}
22\usetikzlibrary{positioning,calc,patterns,chains,shapes.geometric,scopes}
23
24%\renewcommand{\verb}{\lstinline}
25%\def\lstlanguagefiles{lst-grafite.tex}
26%\lstset{language=Grafite}
27\lstset{language=C,basicstyle=\tt,basewidth=.5em,lineskip=-1.5pt}
28
29\newlength{\mylength}
30\newenvironment{frametxt}%
31        {\setlength{\fboxsep}{5pt}
32                \setlength{\mylength}{\linewidth}%
33                \addtolength{\mylength}{-2\fboxsep}%
34                \addtolength{\mylength}{-2\fboxrule}%
35                \Sbox
36                \minipage{\mylength}%
37                        \setlength{\abovedisplayskip}{0pt}%
38                        \setlength{\belowdisplayskip}{0pt}%
39                }%
40                {\endminipage\endSbox
41                        \[\fbox{\TheSbox}\]}
42
43\title{Certified Complexity (CerCo)\thanks{The project CerCo acknowledges the
44financial support of the Future and Emerging Technologies (FET) programme within
45the Seventh Framework Programme for Research of the European Commission, under
46FET-Open grant number: 243881}}
47\author{
48%Roberto
49R.~M. Amadio$^{4}$ \and
50%Nicolas
51N.~Ayache$^{3,4}$ \and
52%François
53F.~Bobot$^{3,4}$ \and
54%Jacob
55J.~P. Boender$^1$ \and
56%Brian
57B.~Campbell$^2$ \and
58%Ilias
59I.~Garnier$^2$ \and
60%Antoine
61A.~Madet$^4$ \and
62%James
63J.~McKinna$^2$ \and
64%Dominic
65D.~P. Mulligan$^1$ \and
66%Mauro
67M.~Piccolo$^1$ \and
68%Randy
69R.~Pollack$^2$ \and
70%Yann
71Y.~R\'egis-Gianas$^{3,4}$ \and
72%Claudio
73C.~Sacerdoti Coen$^1$ \and
74%Ian
75I.~Stark$^2$ \and
76%Paolo
77P.~Tranquilli$^1$}
\institute{Dipartimento di Informatica -- Scienza e Ingegneria, Universit\`a di
Bologna \and
LFCS, School of Informatics, University of Edinburgh
\and INRIA (Team $\pi r^2$)
\and
Universit\'e Paris Diderot
}
85
86\bibliographystyle{splncs03}
87
88\begin{document}
89
90\maketitle
91
92% Brian: I've changed "without loss of accuracy" because we discuss
93% important precision/complexity tradeoffs for advanced architectures
94% in the dependent labelling section.
95\begin{abstract}
96We provide an overview of the FET-Open Project CerCo
97(`Certified Complexity'). Our main achievement is
98the development of a technique for analysing non-functional
99properties of programs (time, space) at the source level with little or no loss of accuracy
100and a small trusted code base. We developed a C compiler, verified in Matita, that translates source code to object code, producing a copy of the source code embellished with cost annotations as an additional side-effect. These annotations expose and track precisely the actual (non-asymptotic)
101computational cost of the input program at
102the source level. Untrusted invariant generators and trusted theorem provers
103may then be used to compute and certify the parametric execution time of the
104code.
105\end{abstract}
106
107% ---------------------------------------------------------------------------- %
108% SECTION                                                                      %
109% ---------------------------------------------------------------------------- %
110\section{Introduction}
111
112%\paragraph{Problem statement.}
113Programs can be specified with both
114functional constraints (what the program must do) and non-functional constraints (what time, space or other resources the program may use).  In the current
115state of the art, functional properties are verified
116by combining user annotations---preconditions, invariants, and so on---with a
117multitude of automated analyses---invariant generators, type systems, abstract
118interpretation, theorem proving, and so on---on the program's high-level source code.
By contrast, many non-functional properties
are verified using analyses on low-level object code,
%\footnote{A notable
%  exception is the explicit allocation of heap space in languages like C, which
%  can be handled at the source level.}
but these analyses may need information
about the program's high-level functional behaviour, which must then be reconstructed.
This analysis on low-level object code has several problems:
127\begin{itemize*}
128\item
129It can be hard to deduce the high-level structure of the program after compiler optimisations.
130The object code produced by an optimising compiler may have radically different control flow to the original source code program.
131\item
132Techniques that operate on object code are not useful early in the development process of a program, yet problems with a program's design or implementation are cheaper to resolve earlier in the process, rather than later.
133\item
Parametric cost analysis is very hard: how can we translate a cost that depends on the execution state, for example the
value of a register or a carry bit, into a cost that the user can understand by
looking at the source code?
137\item
138Performing functional analyses on object code makes it hard for the programmer to provide information about the program and its expected execution, leading to a loss of precision in the resulting analyses.
139\end{itemize*}
140\paragraph{Vision and approach.}
141We wish to reconcile functional and
142non-functional analyses: to share information and perform both at the same time
143on high-level source code.
144%
What has previously prevented this approach is the lack of a uniform and precise
cost model for high-level code: each statement occurrence is compiled
differently, optimisations may change control flow, and the cost of an object
code instruction may depend on the runtime state of hardware components like
pipelines and caches, none of which is visible in the source code.
150
151We envision a new generation of compilers that track program structure through compilation and optimisation and exploit this
152information to define a precise, non-uniform cost model for source code that accounts for runtime state. With such a cost model we can
153reduce non-functional verification to the functional case and exploit the state
154of the art in automated high-level verification~\cite{survey}. The techniques
155currently used by the Worst Case Execution Time (WCET) community, who perform analyses on object code,
156are still available but can be coupled with additional source-level
157analyses. Where our approach produces overly complex cost models, safe approximations can be used to trade complexity with precision.
158Finally, source code analysis can be used early in the development process, when
159components have been specified but not implemented, as modularity means
160that it is enough to specify the non-functional behaviour of missing
161components.
162\paragraph{Contributions.}
163We have developed \emph{the labelling approach}~\cite{labelling}, a
164technique to implement compilers that induce cost models on source programs by
165very lightweight tracking of code changes through compilation. We have studied
166how to formally prove the correctness of compilers implementing this technique, and
167have implemented such a compiler from C to object binaries for the MCS-51
168microcontroller for predicting execution time and stack space usage,
169verifying it in an interactive theorem prover.  As we are targeting
170an embedded microcontroller we do not consider dynamic memory allocation.
171
172To demonstrate source-level verification of costs we have implemented
173a Frama-C plugin~\cite{framac} that invokes the compiler on a source
174program and uses it to generate invariants on the high-level source
175that correctly model low-level costs. The plugin certifies that the
176program respects these costs by calling automated theorem provers, a
177new and innovative technique in the field of cost analysis. Finally,
178we have conducted several case studies, including showing that the
179plugin can automatically compute and certify the exact reaction time
180of Lustre~\cite{lustre} data flow programs compiled into C.
181
182\section{Project context and approach}
183Formal methods for the verification of functional properties of programs are now mature enough to find increasing application in production environments. For safety critical code, it is commonplace to combine rigorous engineering methodologies and testing
184with static analyses, taking the strengths of each and mitigating
185their weaknesses. Of particular interest are open frameworks
186for the combination of different formal methods, where the programs can be
187progressively specified and enriched with new safety
188guarantees: every method contributes knowledge (e.g. new invariants) that
189becomes an assumption for later analysis.
190
191The outlook for verifying non-functional properties of programs (time spent,
192memory used, energy consumed) is bleaker.
193% and the future seems to be getting even worse.
194Most industries verify that real time systems meet their deadlines
195by simply performing many runs of the system and timing their execution,  computing the
196maximum time and adding an empirical safety margin, claiming the result to be a
197bound for the WCET of the program. Formal methods and software to statically
198analyse the WCET of programs exist, but they often produce bounds that are too
199pessimistic to be useful. Recent advancements in hardware architecture
200have been
201focused on the improvement of the average case performance, not the
202predictability of the worst case. Execution time is becoming increasingly
203dependent on execution history and the internal state of
204hardware components like pipelines and caches. Multi-core processors and non-uniform
205memory models are drastically reducing the possibility of performing
206static analysis in isolation, because programs are less and less time
207composable. Clock-precise hardware models are necessary for static analysis, and
208obtaining them is becoming harder due to the increased sophistication
209of hardware design.
210
211Despite these problems, the need for reliable real time
212systems and programs is increasing, and there is a push
213from the research community towards the introduction of
214hardware with more predictable behaviour, which would be more suitable
for static analysis.  One such approach, exemplified by the Proartis
project~\cite{proartis}, is to decouple execution time from execution
history by introducing randomisation.
218
219In CerCo~\cite{cerco} we do not address this problem, optimistically
220assuming that improvements in low-level timing analysis or architecture will make
221verification feasible in the longer term. Instead, the main objective of our work is
222to bring together the static analysis of functional and non-functional
223properties, which in the current state of the art are
224independent activities with limited exchange of information: while the
225functional properties are verified on the source code, the analysis of
226non-functional properties is performed on object code to exploit
227clock-precise hardware models.
228
229\subsection{Current object-code methods}
230
231Analysis currently takes place on object code for two main reasons.
232First, there cannot be a uniform, precise cost model for source
233code instructions (or even basic blocks). During compilation, high level
instructions are broken apart and reassembled in context-specific ways so that
identifying a fragment of object code with a single high level instruction is
infeasible. Even the control flow of the object and source code can be very
237different as a result of optimisations, for example aggressive loop
238optimisations may completely transform source level loops. Despite the lack of a uniform, compilation- and
239program-independent cost model on the source language, the literature on the
240analysis of non-asymptotic execution time on high level languages assuming
241such a model is growing and gaining momentum. However, unless we provide a
242replacement for such cost models, this literature's future practical impact looks
243to be minimal. Some hope has been provided by the EmBounded project \cite{embounded}, which
244compositionally compiles high-level code to a byte code that is executed by an
245interpreter with guarantees on the maximal execution time spent for each byte code
246instruction. This provides a uniform model at the expense of the model's
247precision (each cost is a pessimistic upper bound) and the performance of the
executed code (because the byte code is interpreted compositionally rather than
compiled in a fully non-compositional manner).
250
251The second reason to perform the analysis on the object code is that bounding
252the worst case execution time of small code fragments in isolation (e.g. loop
253bodies) and then adding up the bounds yields very poor estimates as no
254knowledge of the hardware state prior to executing the fragment can be assumed. By
255analysing longer runs the bound obtained becomes more precise because the lack
256of information about the initial state has a relatively small impact.
257
258To calculate the cost of an execution, value and control flow analyses
259are required to bound the number of times each basic block is
260executed.  Currently, state
261of the art WCET analysis tools, such as AbsInt's aiT toolset~\cite{absint}, perform these analyses on
262object code, where the logic of the program is harder to reconstruct
263and most information available at the source code level has been lost;
264see~\cite{stateart} for a survey. Imprecision in the analysis can lead to useless bounds. To
265augment precision, the tools ask the user to provide constraints on
266the object code control flow, usually in the form of bounds on the
267number of iterations of loops or linear inequalities on them. This
268requires the user to manually link the source and object code,
translating their assumptions on the source code (which may be wrong) to
270object code constraints. The task is error prone and hard, especially
271in the presence of complex compiler optimisations.
272
273Traditional techniques for WCET that work on object code are also affected by
274another problem: they cannot be applied before the generation of the object
275code. Functional properties can be analysed in early development stages, while
276analysis of non-functional properties may come too late to avoid expensive
277changes to the program architecture.
278
279\subsection{CerCo's approach}
280
281In CerCo we propose a new approach to the problem: we abandon the idea
282of a uniform cost model and we propose that the compiler, which knows how the
283code is translated, must return the cost model for basic blocks of high level
284instructions. It must do so by keeping track of the control flow modifications
285to reverse them and by interfacing with processor timing analysis.
286
287By embracing the knowledge implicit in the compilation process, instead of avoiding it like EmBounded did, a CerCo
288compiler can both produce efficient code and return costs that are
289as precise as the processor timing analysis can be. Moreover, our costs can be
290parametric: the cost of a block can depend on actual program data, on a
291summary of the execution history, or on an approximated representation of the
292hardware state. For example, loop optimisations may assign a cost to a loop body
293that is a function of the number of iterations performed. As another example,
294the cost of a block may be a function of the vector of stalled pipeline states,
295which can be exposed in the source code and updated at each basic block exit. It
296is parametricity that allows one to analyse small code fragments without losing
precision. In the analysis of the code fragment we do not have to ignore the
initial hardware state; rather, we may assume that we know exactly which
state (or mode, as the WCET literature calls it) we are in.
300
301The CerCo approach has the potential to substantially improve upon the current state of the
302art. By performing control and data flow analyses on the source code, the error
303prone translation of invariants is avoided. Instead, this
304work is done at the source level using tools of the user's choice.
305Any available technique for the verification of functional properties
306can be immediately reused and multiple techniques can collaborate together to
307infer and certify cost invariants for the program.  There are no
308limitations on the types of loops or data structures involved. Parametric cost analysis
becomes the default one, with non-parametric bounds used as a last resort when the user
decides to trade the complexity of the analysis for precision. \emph{A priori}, no
311technique previously used in traditional WCET is lost: processor
312timing analyses can be used by the compiler on the object code, and the rest can be applied
313at the source code level.
314 Our approach can also work in the early
315stages of development by axiomatically attaching costs to unimplemented components.
316
317
We take for granted that all software used to verify properties of programs must be as bug-free as
possible. The trusted code base for verification consists of the code that needs
320to be trusted to believe that the property holds. The trusted code base of
321state-of-the-art WCET tools is very large: one needs to trust the control flow
322analyser, the linear programming libraries used, and also the formal models
323of the hardware under analysis, for example. In CerCo we are moving the control flow analysis to the source
324code and we are introducing a non-standard compiler too. To reduce the trusted
code base, we implemented a prototype compiler and a static analyser in an interactive
326theorem prover, which was used to certify that the costs added to the source
327code are indeed those incurred by the hardware. Formal models of the
328hardware and of the high level source languages were also implemented in the
329interactive theorem prover. Control flow analysis on the source code has been
330obtained using invariant generators, tools to produce proof obligations from
331generated invariants and automatic theorem provers to verify the obligations. If
332the automatic provers are able to generate proof traces that can be
333independently checked, the only remaining component that enters the trusted code
334base is an off-the-shelf invariant generator which, in turn, can be proved
335correct using an interactive theorem prover. Therefore we achieve the double
336objective of allowing the use of more off-the-shelf components (e.g. provers and
337invariant generators) whilst reducing the trusted code base at the same time.
338
339%\paragraph{Summary of the CerCo objectives.} To summarize, the goal of CerCo is
340% to reconcile functional with non-functional analysis by performing them together
341% on the source code, sharing common knowledge about execution invariants. We want
342% to achieve the goal implementing a new generation of compilers that induce a
343% parametric, precise cost model for basic blocks on the source code. The compiler
344% should be certified using an interactive theorem prover to minimize the trusted
345% code base of the analysis. Once the cost model is induced, off-the-shelf tools
346% and techniques can be combined together to infer and prove parametric cost
347% bounds.
348%The long term benefits of the CerCo vision are expected to be:
349%1. the possibility to perform static analysis during early development stages
350%2.  parametric bounds made easier
351%3.  the application of off-the-shelf techniques currently unused for the
352% analysis of non-functional properties, like automated proving and type systems
353%4. simpler and safer interaction with the user, that is still asked for
354% knowledge, but on the source code, with the additional possibility of actually
355% verifying the provided knowledge
356%5. a reduced trusted code base
357%6. the increased accuracy of the bounds themselves.
358%
359%The long term success of the project is hindered by the increased complexity of
360% the static prediction of the non-functional behaviour of modern hardware. In the
361% time frame of the European contribution we have focused on the general
362% methodology and on the difficulties related to the development and certification
363% of a cost model inducing compiler.
364
365\section{The typical CerCo workflow}
366\label{sec:workflow}
367\begin{figure}[!t]
368\begin{tabular}{l@{\hspace{0.2cm}}|@{\hspace{0.2cm}}l}
369\begin{lstlisting}
370char a[] = {3, 2, 7, 14};
371char threshold = 4;
372
373int count(char *p, int len) {
374  char j;
375  int found = 0;
376  for (j=0; j < len; j++) {
377    if (*p <= threshold)
378      found++;
379    p++;
380    }
381  return found;
382}
383
384int main() {
385  return count(a,4);
386}
387\end{lstlisting}
388&
389%  $\vcenter{\includegraphics[width=7.5cm]{interaction_diagram.pdf}}$
390\begin{tikzpicture}[
391    baseline={([yshift=-.5ex]current bounding box)},
392    element/.style={draw, text width=1.6cm, on chain, text badly centered},
393    >=stealth
394    ]
395{[start chain=going below, node distance=.5cm]
396\node [element] (cerco) {CerCo\\compiler};
397\node [element] (cost)  {CerCo\\cost plugin};
398{[densely dashed]
399\node [element] (ded)   {Deductive\\platform};
400\node [element] (check) {Proof\\checker};
401}
402}
403\coordinate [left=3.5cm of cerco] (left);
404{[every node/.style={above, text width=3.5cm, text badly centered,
405                     font=\scriptsize}]
406\draw [<-] ([yshift=-1ex]cerco.north west) coordinate (t) --
407    node {C source}
408    (t-|left);
409\draw [->] (cerco) -- (cost);
410\draw [->] ([yshift=1ex]cerco.south west) coordinate (t) --
411    node {C source+\color{red}{cost annotations}}
412    (t-|left) coordinate (cerco out);
413\draw [->] ([yshift=1ex]cost.south west) coordinate (t) --
414    node {C source+\color{red}{cost annotations}\\+\color{blue}{synthesized assertions}}
415    (t-|left) coordinate (out);
416{[densely dashed]
417\draw [<-] ([yshift=-1ex]ded.north west) coordinate (t) --
418    node {C source+\color{red}{cost annotations}\\+\color{blue}{complexity assertions}}
419    (t-|left) coordinate (ded in) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
420\draw [->] ([yshift=1ex]ded.south west) coordinate (t) --
421    node {complexity obligations}
422    (t-|left) coordinate (out);
423\draw [<-] ([yshift=-1ex]check.north west) coordinate (t) --
424    node {complexity proof}
425    (t-|left) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
426\draw [dash phase=2.5pt] (cerco out) .. controls +(-1, 0) and +(-1, 0) ..
427    (ded in);
428}}
429%% user:
430% head
431\draw (current bounding box.west) ++(-.2,.5)
432    circle (.2) ++(0,-.2) -- ++(0,-.1) coordinate (t);
433% arms
434\draw (t) -- +(-.3,-.2);
435\draw (t) -- +(.3,-.2);
436% body
437\draw (t) -- ++(0,-.4) coordinate (t);
438% legs
439\draw (t) -- +(-.2,-.6);
440\draw (t) -- +(.2,-.6);
441\end{tikzpicture}
442\end{tabular}
443\caption{On the left: code to count the array's elements
444 that are less than or equal to the threshold. On the right: CerCo's interaction
445 diagram. CerCo's components are drawn solid.}
446\label{test1}
447\end{figure}
448We illustrate the workflow we envisage (on the right
449of~\autoref{test1}) on an example program (on the left
450of~\autoref{test1}).  The user writes the program and feeds it to the
451CerCo compiler, which outputs an instrumented version of the same
452program that updates global variables that record the elapsed
execution time and the stack space usage.  The red lines in
\autoref{itest1}, which introduce variables, functions and function calls
whose names start with \lstinline'__cost' and \lstinline'__stack', are the instrumentation added by the
compiler.  For example, the two calls at the start of
457\lstinline'count' say that 4 bytes of stack are required, and that it
458takes 111 cycles to reach the next cost annotation (in the loop body).
459The compiler measures these on the labelled object code that it generates.
460
The annotated program can then be enriched with complexity
assertions in the style of Hoare logic, which are passed to a deductive
platform (in our case Frama-C). We provide as a Frama-C cost plugin a
464simple automatic synthesiser for complexity assertions which can
465be overridden by the user to increase or decrease accuracy.  These are the blue
466comments starting with \lstinline'/*@' in \autoref{itest1}, written in
467Frama-C's specification language, ACSL.  From the
468assertions, a general purpose deductive platform produces proof
469obligations which in turn can be closed by automatic or interactive
470provers, ending in a proof certificate.
471
472% NB: if you try this example with the live CD you should increase the timeout
473
474Twelve proof obligations are generated from~\autoref{itest1} (to prove
475that the loop invariant holds after one execution if it holds before,
476to prove that the whole program execution takes at most 1358 cycles,
477etc.).  Note that the synthesised time bound for \lstinline'count',
478$178+214*(1+\text{\lstinline'len'})$ cycles, is parametric in the length of
479the array. The CVC3 prover
480closes all obligations within half a minute on routine commodity
481hardware.  A simpler non-parametric version can be solved in a few
482seconds.
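As a sanity check, the whole-program bound of 1358 cycles can be recovered by hand from the
constants visible in \autoref{itest1}, assuming the more expensive branch of the conditional
is always taken:
\[
\underbrace{110}_{\text{\texttt{main}}}
\;+\; \underbrace{111 + 67}_{\text{entry/exit of \texttt{count}}}
\;+\; \underbrace{(78 + 136)}_{\text{worst-case loop body}} \times (1 + 4)
\;=\; 110 + 178 + 214 \times 5 \;=\; 1358.
\]
The factor $1+\text{\lstinline'len'}$ rather than \lstinline'len' is, presumably, a safe
over-approximation introduced by the synthesiser.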
483%
484\fvset{commandchars=\\\{\}}
485\lstset{morecomment=[s][\color{blue}]{/*@}{*/},
486        moredelim=[is][\color{blue}]{$}{$},
487        moredelim=[is][\color{red}]{|}{|},
488        lineskip=-2pt}
489\begin{figure}[!p]
490\begin{lstlisting}
491|int __cost = 33, __stack = 5, __stack_max = 5;|
492|void __cost_incr(int incr) { __cost += incr; }|
493|void __stack_incr(int incr) {
494  __stack += incr;
495  __stack_max = __stack_max < __stack ? __stack : __stack_max;
496}|
497
498char a[4] = {3, 2, 7, 14};  char threshold = 4;
499
500/*@ behaviour stack_cost:
501      ensures __stack_max <= __max(\old(__stack_max), 4+\old(__stack));
502      ensures __stack == \old(__stack);
503    behaviour time_cost:
504      ensures __cost <= \old(__cost)+(178+214*__max(1+\at(len,Pre), 0));
505*/
506int count(char *p, int len) {
507  char j;  int found = 0;
508  |__stack_incr(4);  __cost_incr(111);|
509  $__l: /* internal */$
510  /*@ for time_cost: loop invariant
511        __cost <= \at(__cost,__l)+
512                  214*(__max(\at((len-j)+1,__l), 0)-__max(1+(len-j), 0));
513      for stack_cost: loop invariant
514        __stack_max == \at(__stack_max,__l);
515      for stack_cost: loop invariant
516        __stack == \at(__stack,__l);
517      loop variant len-j;
518  */
519  for (j = 0; j < len; j++) {
520    |__cost_incr(78);|
521    if (*p <= threshold) { |__cost_incr(136);| found ++; }
522    else { |__cost_incr(114);| }
523    p ++;
524  }
525  |__cost_incr(67);  __stack_incr(-4);|
526  return found;
527}
528
529/*@ behaviour stack_cost:
530      ensures __stack_max <= __max(\old(__stack_max), 6+\old(__stack));
531      ensures __stack == \old(__stack);
532    behaviour time_cost:
533      ensures __cost <= \old(__cost)+1358;
534*/
535int main(void) {
536  int t;
537  |__stack_incr(2);  __cost_incr(110);|
538  t = count(a,4);
539  |__stack_incr(-2);|
540  return t;
541}
542\end{lstlisting}
543\caption{The instrumented version of the program in \autoref{test1},
544 with instrumentation added by the CerCo compiler in red and cost invariants
545 added by the CerCo Frama-C plugin in blue. The \texttt{\_\_cost},
546 \texttt{\_\_stack} and \texttt{\_\_stack\_max} variables hold the elapsed time
547in clock cycles and the current and maximum stack usage. Their initial values
548hold the clock cycles spent in initialising the global data before calling
549\texttt{main} and the space required by global data (and thus unavailable for
550the stack).
551}
552\label{itest1}
553\end{figure}
554\section{Main scientific and technical results}
555First we describe the basic labelling approach and our compiler
556implementations that use it.  This is suitable for basic architectures
557with simple cost models.  Then we will discuss the dependent labelling
558extension which is suitable for more advanced processor architectures
559and compiler optimisations.  At the end of this section we will
560demonstrate automated high level reasoning about the source level
561costs provided by the compilers.
562
563% \emph{Dependent labelling~\cite{?}.} The basic labelling approach assumes a
564% bijective mapping between object code and source code O(1) blocks (called basic
565% blocks). This assumption is violated by many program optimizations (e.g. loop
566% peeling and loop unrolling). It also assumes the cost model computed on the
567% object code to be non parametric: every block must be assigned a cost that does
568% not depend on the state. This assumption is violated by stateful hardware like
569% pipelines, caches, branch prediction units. The dependent labelling approach is
570% an extension of the basic labelling approach that allows to handle parametric
571% cost models. We showed how the method allows to deal with loop optimizations and
572% pipelines, and we speculated about its applications to caches.
573%
574% \emph{Techniques to exploit the induced cost model.} Every technique used for
575% the analysis of functional properties of programs can be adapted to analyse the
576% non-functional properties of the source code instrumented by compilers that
577% implement the labelling approach. In order to gain confidence in this claim, we
578% showed how to implement a cost invariant generator combining abstract
579% interpretation with separation logic ideas \cite{separation}. We integrated
580% everything in the Frama-C modular architecture, in order to compute proof
581% obligations from functional and cost invariants and to use automatic theorem
582% provers on them. This is an example of a new technique that is not currently
583% exploited in WCET analysis. It also shows how precise functional invariants
584% benefits the non-functional analysis too. Finally, we show how to fully
585% automatically analyse the reaction time of Lustre nodes that are first compiled
586% to C using a standard Lustre compiler and then processed by a C compiler that
587% implements the labelling approach.
588
589% \emph{The CerCo compiler.} This is a compiler from a large subset of the C
590% program to 8051/8052 object code,
591% integrating the labelling approach and a static analyser for 8051 executables.
592% The latter can be implemented easily and does not require dependent costs
593% because the 8051 microprocessor is a very simple one, with constant-cost
594% instruction. It was chosen to separate the issue of exact propagation of the
595% cost model from the orthogonal problem of the static analysis of object code
596% that may require approximations or dependent costs. The compiler comes in
597% several versions: some are prototypes implemented directly in OCaml, and they
598% implement both the basic and dependent labelling approaches; the final version
599% is extracted from a Matita certification and at the moment implements only the
600% basic approach.
601
602\subsection{The (basic) labelling approach}
The labelling approach is the foundational insight that underlies all the developments in CerCo.
It tracks how basic blocks evolve during compilation so that the cost model computed on the
object code can be propagated back to the source code without losing precision.
606\paragraph{Problem statement.} Given a source program $P$, we want to obtain an
607instrumented source program $P'$,  written in the same programming language, and
608the object code $O$ such that: 1) $P'$ is obtained by inserting into $P$ some
609additional instructions to update global cost information like the amount of
610time spent during execution or the maximal stack space required; 2) $P$ and $P'$ 
must have the same functional behaviour, i.e.\ they must produce the same output
612and intermediate observables; 3) $P$ and $O$ must have the same functional
613behaviour; 4) after execution and in interesting points during execution, the
614cost information computed by $P'$ must be an upper bound of the one spent by $O$ 
615to perform the corresponding operations (\emph{soundness} property); 5) the difference
616between the costs computed by $P'$ and the execution costs of $O$ must be
617bounded by a program-dependent constant (\emph{precision} property).
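Schematically, writing $\kappa_{P'}(s)$ for the increase of the cost information computed by
$P'$ over a sub-run $s$ and $\kappa_{O}(s)$ for the cost actually spent by $O$ on the
corresponding sub-run, requirements 4 and 5 can be read as follows (a sketch that elides the
exact correspondence between sub-runs):
\[
\kappa_{O}(s) \;\le\; \kappa_{P'}(s) \;\le\; \kappa_{O}(s) + k_P
\qquad \text{for some constant $k_P$ depending only on $P$.}
\]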
618
619\paragraph{The labelling software components.} We solve the problem in four
620stages \cite{labelling}, implemented by four software components that are used
621in sequence.
622
623The first component labels the source program $P$ by injecting label emission
624statements in appropriate positions to mark the beginning of basic blocks.
625These are the positions where the cost instrumentation will appear in the final output.
626% The
627% set of labels with their positions is called labelling.
628The syntax and semantics
629of the source programming language is augmented with label emission statements.
630The statement ``EMIT $\ell$'' behaves like a NOP instruction that does not affect the
631program state or control flow, but its execution is observable.
632% Therefore the observables of a run of a program becomes a stream
633% of label emissions: $\ell_1,\ldots,\ell_n$, called the program trace. We clarify the conditions
634% that the labelling must respect later.
635For the example in Section~\ref{sec:workflow} this is just the original C code
636with ``EMIT'' instructions added at every point a \lstinline'__cost_incr' call
637appears in the final code.
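For instance, a labelled version of \lstinline'count' might look as follows. This is an
illustrative rendering only: the \lstinline'__emit' macro and the label numbers are invented
here, as the concrete syntax of the emission statements is internal to the compiler.
\begin{lstlisting}
#define __emit(k)   /* observable no-op marking basic block k */
extern char threshold;            /* the global of the example program */

int count(char *p, int len) {
  char j;
  int found = 0;
  __emit(1);                      /* entry block, up to the loop      */
  for (j = 0; j < len; j++) {
    __emit(2);                    /* loop body                        */
    if (*p <= threshold) { __emit(3); found++; }  /* then branch      */
    else { __emit(4); }                           /* else branch      */
    p++;
  }
  __emit(5);                      /* block after the loop             */
  return found;
}
\end{lstlisting}
Each \lstinline'__emit' marks the point where one of the \lstinline'__cost_incr' calls of
\autoref{itest1} will eventually appear.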
638
639The second component is a labelling preserving compiler. It can be obtained from
640an existing compiler by adding label emission statements to every intermediate
641language and by propagating label emission statements during compilation. The
642compiler is correct if it preserves both the functional behaviour of the program
643and the traces of observables, including the labels `emitted'.
644% We may also ask that the function that erases the cost
645% emission statements commute with compilation. This optional property grants that
646% the labelling does not interfere with the original compiler behaviour. A further
647% set of requirements will be added later.
648
649The third component analyses the labelled
650object code to compute the scope of each of its label emission statements,
651i.e.\ the instructions that may be executed after the statement and before
652a new label emission is encountered, and then computes the maximum cost of
653each.  Note that we only have enough information at this point to compute
654the cost of loop-free portions of code.  We will consider how to ensure that
655every loop is broken by a cost label shortly.
656%It is a tree and not a sequence because the scope
657%may contain a branching statement. In order to grant that such a finite tree
658%exists, the object code must not contain any loop that is not broken by a label
659%emission statement. This is the first requirement of a sound labelling. The
660%analyser fails if the labelling is unsound. For each scope, the analyser
661%computes an upper bound of the execution time required by the scope, using the
662%maximum of the costs of the two branches in case of a conditional statement.
663%Finally, the analyser computes the cost of a label by taking the maximum of the
664%costs of the scopes of every statement that emits that label.
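The following sketch conveys the flavour of this analysis under illustrative assumptions
about the representation of the object code (a \lstinline'struct insn' with a fixed cycle
cost and at most two successors; the real analyser works on 8051 binaries): the cost of a
label is the cost of the most expensive path from its emission statement to the next one.
\begin{lstlisting}
#include <stddef.h>

struct insn {
  unsigned cost;          /* cycles taken by this instruction          */
  int      emits_label;   /* non-zero if this is a label emission      */
  struct insn *succ[2];   /* successor instructions, NULL if absent    */
};

static unsigned max_u(unsigned a, unsigned b) { return a > b ? a : b; }

/* Worst-case cost of the instructions reachable from i before the next
   label emission.  Terminates because a sound labelling breaks every
   loop with an emission statement.                                    */
static unsigned scope_cost(const struct insn *i) {
  unsigned best = 0;
  int k;
  if (i == NULL || i->emits_label) return 0;
  for (k = 0; k < 2; k++)
    if (i->succ[k] != NULL) best = max_u(best, scope_cost(i->succ[k]));
  return i->cost + best;
}
\end{lstlisting}
The cost assigned to a label is then \lstinline'scope_cost' of the instruction that follows
its emission statement, maximised over all occurrences of that emission.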
665
666The fourth and final component replaces the labels in the labelled
667version of the source code produced at the start with the costs
668computed for each label's scope.  This yields the instrumented source
669code.  For the example, this is the code in \autoref{itest1}, except
670for the specifications in comments, which we consider in
671Section~\ref{sec:exploit}.
672
673\paragraph{Correctness.} Requirements 1 and 2 hold because of the
674non-invasive labelling procedure.  Requirement 3 can be satisfied by
675implementing compilation correctly.  It is obvious that the value of
676the global cost variable of the instrumented source code is always
677equal to the sum of the costs of the labels emitted by the
678corresponding labelled code. Moreover, because the compiler preserves
679all traces, the sum of the costs of the labels emitted in the source
680and target labelled code are the same. Therefore, to satisfy the
681soundness requirement, we need to ensure that the time taken to
682execute the object code is equal to the sum of the costs of the labels
683emitted by the object code. We collect all the necessary conditions
684for this to happen in the definition of a \emph{sound} labelling: a)
685all loops must be broken by a cost emission statement; b) all program
686instructions must be in the scope of some cost emission statement.
687This ensures that every label's scope is a tree of instructions, with
688the cost being the most expensive path. To satisfy also the precision
689requirement, we must make the scopes flat sequences of instructions.
690We require a \emph{precise} labelling where every label is emitted at
691most once and both branches of each conditional jump start with a label
692emission statement.
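The two conditions can be visualised on a small example, reusing the illustrative
\lstinline'__emit' notation introduced earlier (both functions are invented for the purpose
of the illustration):
\begin{lstlisting}
/* Unsound: the loop is not broken by an emission statement, so the
   scope of label 1 contains a cycle and has no finite worst-case cost. */
int sum_unsound(int *a, int n) {
  int i, s = 0;
  __emit(1);
  for (i = 0; i < n; i++) s += a[i];
  return s;
}

/* Sound: every iteration starts with its own emission, so each scope
   is a loop-free region with a finite worst-case cost.                 */
int sum_sound(int *a, int n) {
  int i, s = 0;
  __emit(1);
  for (i = 0; i < n; i++) { __emit(2); s += a[i]; }
  __emit(3);
  return s;
}
\end{lstlisting}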
693
694The correctness and precision of the labelling approach only rely on the
695correctness and precision of the object code labelling. The simplest
696way to achieve that is to impose correctness and precision
697requirements on the source code labelling produced at the start, and
698to demand that the compiler preserves these
699properties too. The latter requirement imposes serious limitations on the
700compilation strategy and optimisations: the compiler may not duplicate any code
701that contains label emission statements, like loop bodies. Therefore various
702loop optimisations like peeling or unrolling are prevented. Moreover, precision
703of the object code labelling is not sufficient \emph{per se} to obtain global
704precision: we implicitly assumed that a precise constant cost can be assigned
705to every instruction. This is not possible in
706the presence of stateful hardware whose state influences the cost of operations,
707like pipelines and caches. In Section~\ref{lab:deplabel} we will see an extension of the
708basic labelling approach which tackles these problems.
709
710In CerCo we have developed several cost preserving compilers based on
711the labelling approach. Excluding an initial certified compiler for a
712`while' language, all remaining compilers target realistic source
713languages---a pure higher order functional language and a large subset
714of C with pointers, \texttt{goto}s and all data structures---and real world
715target processors---MIPS and the Intel 8051/8052 processor
716family. Moreover, they achieve a level of optimisation that ranges
717from moderate (comparable to GCC level 1) to intermediate (including
718loop peeling and unrolling, hoisting and late constant propagation).
719We describe the C compilers in detail in the following section.
720
721Two compilation chains were implemented for a purely functional
722higher-order language~\cite{labelling2}. The two main changes required
723to deal with functional languages are: 1) because global variables and
724updates are not available, the instrumentation phase produces monadic
725code to `update' the global costs; 2) the requirements for a sound and
726precise labelling of the source code must be changed when the
727compilation is based on CPS translations.  In particular, we need to
728introduce labels emitted before a statement is executed and also
729labels emitted after a statement is executed. The latter capture code
730that is inserted by the CPS translation and that would escape all
731label scopes.
732
733% Brian: one of the reviewers pointed out that standard Prolog implementations
734% do have some global state that apparently survives backtracking and might be
735% used.  As we haven't experimented with this, I think it's best to elide it
736% entirely.
737
738% Phases 1, 2 and 3 can be applied as well to logic languages (e.g. Prolog).
739% However, the instrumentation phase cannot: in standard Prolog there is no notion
740% of (global) variable whose state is not retracted during backtracking.
741% Therefore, the cost of executing computations that are later backtracked would
742% not be correctly counted in. Any extension of logic languages with
743% non-backtrackable state could support our labelling approach.
744
745\subsection{The CerCo C compilers}
746We implemented two C compilers, one implemented directly in OCaml and
747the other implemented in Matita, an interactive theorem prover~\cite{matita}.  The first acted as a prototype for the second,
748but also supported MIPS and acted as a testbed for more advanced
749features such as the dependent labelling approach
750in Section~\ref{lab:deplabel}.
751
752The second C compiler is the \emph{Trusted CerCo Compiler}, whose cost
753predictions are formally verified. The executable code
754is OCaml code extracted from the Matita implementation. The Trusted
755CerCo Compiler only targets
756the C language and the 8051/8052 family, and does not yet implement any
757advanced optimisations. Its user interface, however, is the same as the
758other version for interoperability purposes. In
particular, the Frama-C CerCo plugin described in
760Section~\ref{sec:exploit} can work without recompilation
761with both of our C compilers.
762
763The 8051/8052 microprocessor is a very simple one, with constant-cost
764instructions. It was chosen to separate the issue of exact propagation of the
765cost model from the orthogonal problem of low-level timing analysis of object code
766that may require approximation or dependent costs.
767
768The (trusted) CerCo compiler implements the following optimisations: cast
769simplification, constant propagation in expressions, liveness analysis driven
770spilling of registers, dead code elimination, branch displacement, and tunnelling.
771The two latter optimisations are performed by our optimising assembler
772\cite{correctness}. The back-end of the compiler works on three address
773instructions, preferred to static single assignment code for the simplicity of
774the formal certification.
775
776The CerCo compiler is loosely based on the CompCert compiler \cite{compcert}, a
777recently developed certified compiler from C to the PowerPC, ARM and x86
778microprocessors. In contrast to CompCert, both the CerCo code and its
779certification are fully open source. Some data structures and language definitions for
780the front-end are directly taken from CompCert, while the back-end is a redesign
781of a compiler from Pascal to MIPS used by Fran\c{c}ois Pottier for a course at the
782\'{E}cole Polytechnique.
783The main differences in the CerCo compiler are:
784\begin{itemize*}
785\item All the intermediate languages include label emitting instructions to
786implement the labelling approach, and the compiler preserves execution traces.
787\item Instead of targeting an assembly language with additional
788macro-instructions which are expanded before assembly, we directly
789produce object code in order to perform the timing analysis, using
790an integrated optimising assembler.
791\item In order to avoid the additional work of implementing a linker
792 and a loader, we do not support separate
793compilation and external calls. Adding them is orthogonal
794to the labelling approach and should not introduce extra problems.
795\item We target an 8-bit processor, in contrast to CompCert's 32-bit targets. This requires
796many changes and more compiler code, but it is not fundamentally more
797complex. The proof of correctness, however, becomes much harder.
798\item We target a microprocessor that has a non-uniform memory model, which is
799still often the case for microprocessors used in embedded systems and that is
800becoming common again in multi-core processors. Therefore the compiler has to
801keep track of data and it must move data between memory regions in the proper
802way. Moreover the size of pointers to different regions is not uniform. %An
803%additional difficulty was that the space available for the stack in internal
804%memory in the 8051 is tiny, allowing only a minor number of nested calls. To
805%support full recursion in order to test the CerCo tools also on recursive
806%programs, the compiler implements a stack in external memory.
807\end{itemize*}
808
809\subsection{Formal certification of the CerCo compiler}
810We have formally
811certified in Matita that the cost models induced on the source code by the
812Trusted CerCo Compiler correctly and precisely
813predict the object code behaviour. There are two cost models, one for execution
814time and one for stack space consumption. We show the correctness of the prediction
815only for those programs that do not exhaust the available stack space, a
816property that---thanks to the stack cost model---we can statically analyse on the
817source code in sharp contrast to other certified compilers.  Other projects have
818already certified the preservation of functional semantics in similar compilers,
819so we have not attempted to directly repeat that work and assume
820functional correctness for most passes. In order to complete the
821proof for non-functional properties, we have introduced a new,
822structured, form of execution trace, with the
823related notions for forward similarity and the intensional
824consequences of forward similarity. We have also introduced a unified
825representation for back-end intermediate languages that was exploited to provide
826a uniform proof of forward similarity.
827
828The details on the proof techniques employed
829and
830the proof sketch can be found in the CerCo deliverables and papers~\cite{cerco-deliverables}.
831In this section we will only hint at the correctness statement, which turned
832out to be more complex than expected.
833
834\paragraph{The correctness statement.}
835Real time programs are often reactive programs that loop forever responding to
836events (inputs) by performing some computation followed by some action (output)
837and continuing as before. For these programs the overall execution
838time does not make sense. The same is true for reactive programs that spend an
839unpredictable amount of time in I/O. Instead, what is interesting is the reaction time ---
840 the time spent between I/O events. Moreover, we are interested in
841predicting and ruling out crashes due to running out of space on certain
842inputs.
843Therefore we need a statement that talks about sub-runs of a
844program. A natural candidate is that the time predicted on the source
845code and spent on the object code by two corresponding sub-runs are the same.
846To make this statement formal we must identify the
847corresponding sub-runs and how to single out those that are meaningful.
We introduce the notion of a \emph{measurable} sub-run of a run:
one that does not exhaust the available stack before or during the
sub-run, in which the numbers of function calls and returns
are equal, that does not perform any I/O, and that
starts with a label emission statement and ends with a return or another label
emission statement. The stack usage is bounded using the stack usage model
that is computed by the compiler.
855
856The statement that we formally proved is: for each C run with a measurable
857sub-run, there exists an object code run with a sub-run, with the same
858execution trace for both the prefix of the run and the sub-run itself,
859 and where the time spent by the
860object code in the sub-run is the same as the time predicted on the source code
861using the time cost model generated by the compiler.
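Glossing over many details, and in particular the exact notion of structured trace, the
shape of the statement is roughly the following, where $\mathit{obs}$ is the trace of
emitted labels and $\kappa(\sigma)$ is the time predicted on the source sub-run $\sigma$ by
the compiler-generated cost model:
\[
\begin{array}{l}
\forall\, \text{C run } \pi\cdot\sigma \text{ with } \sigma \text{ measurable}.\;
\exists\, \text{object code run } \pi'\cdot\sigma'.\\[2pt]
\qquad \mathit{obs}(\pi') = \mathit{obs}(\pi)
\;\wedge\; \mathit{obs}(\sigma') = \mathit{obs}(\sigma)
\;\wedge\; \mathit{time}_O(\sigma') = \kappa(\sigma).
\end{array}
\]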
862
863
864We briefly discuss the constraints for measurability. Not exhausting the stack
865space is necessary for a run to be meaningful, because the source semantics
866has no notion of running out of memory. Balancing
867function calls and returns is a requirement for precision: the labelling approach
868allows the scope of a label to extend after function calls to
869minimize the number of labels. (The scope excludes the called
870function's execution.) If the number of
871calls/returns is unbalanced, it means that there is a call we have not returned
872to that could be followed by additional instructions whose cost has already been
873taken in account.  The last condition on the start and end points of
874a run is also required to make the bound precise.  With these
875restrictions and the 8051's simple timing model we obtain \emph{exact}
876predictions.  If we relax these conditions then we obtain a corollary
877with an upper bound on the cost.
878Finally, I/O operations can be performed in the prefix of the run, but not in the
879measurable sub-run. Therefore we prove that we can predict reaction times, but
880not I/O times, as desired.
881
882\subsection{Dependent labelling}
883\label{lab:deplabel}
884The core idea of the basic labelling approach is to establish a tight connection
885between basic blocks executed in the source and target languages. Once the
886connection is established, any cost model computed on the object code can be
887transferred to the source code, without affecting the code of the compiler or
888its proof. In particular, we can also transport cost models
889that associate to each label a \emph{function} from the hardware state
890to a natural number.
891However, a problem arises during the instrumentation phase that replaces label
892emission statements with increments of global cost variables. They are
893incremented by the result of applying the label's cost function
894to the hardware state at the time of execution of the block. However,
895the hardware state comprises both the functional state that affects the
896computation (the value of the registers and memory) and the non-functional
897state that does not (the pipeline and cache contents, for example).
898We can find corresponding information for the former in the source code state, but constructing the
899correspondence may be hard and lifting the cost model to work on the source code
900state is likely to produce cost expressions that are too complex to understand and reason about.
901Fortunately, in modern architectures the cost of executing single
902instructions is either independent of the functional state or the jitter---the
903difference between the worst and best case execution times---is small enough
904to be bounded without losing too much precision. Therefore we only consider
905dependencies on the `non-functional' parts of the state.
906
907The non-functional state is not directly related to the high level
908state and does not influence the functional properties. What can be
909done is to expose key aspects of the non-functional state in the
910source code. We present here the basic intuition in a simplified form:
911the technical details that allow us to handle the general case are
912more complex and can be found in~\cite{paolo}. We add to the source
913code an additional global variable that represents the non-functional
914state and another one that remembers the last few labels emitted. The
915state variable must be updated at every label emission statement,
916using an update function which is computed during the processor timing
917analysis.
918This update function assigns to each label a function from the
919recently emitted labels and old state to the new state. It is computed
920by composing the semantics of every instruction in a basic block
921restricted to the non-functional part of the state.
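In a simplified form the resulting instrumentation might look as follows. This is a sketch
only: the non-functional state is summarised by a single byte, only the last emitted label
is remembered, and the names \lstinline'__emit_dep', \lstinline'__next_state' and
\lstinline'__block_cost', as well as the table sizes, are invented for the example; the
actual tables would be produced by the processor timing analysis.
\begin{lstlisting}
static unsigned char __state = 0;  /* abstract pipeline state          */
static unsigned char __prev  = 0;  /* last label emitted               */
static unsigned long __cost  = 0;  /* elapsed cycles                   */

/* tables computed by the timing analysis (placeholder sizes)          */
extern const unsigned char  __next_state[16][16][4];
extern const unsigned short __block_cost[16][16][4];

/* replaces the constant increment of the basic labelling approach     */
static void __emit_dep(unsigned char label) {
  __cost  += __block_cost[__prev][label][__state]; /* state-dependent  */
  __state  = __next_state[__prev][label][__state]; /* new state        */
  __prev   = label;
}
\end{lstlisting}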
922
Not all the details of the non-functional state need to be exposed, and the
technique works better when the part of the state that is required can be summarised
925in a simple data structure. For example, to handle simple but realistic
926pipelines it is sufficient to remember a short integer that encodes the position
927of bubbles (stuck instructions) in the pipeline. In any case, it is not necessary
928for the user to understand the meaning of the state to reason over the properties
929of the
930program. Moreover, the user, or the invariant generator tools that
931analyse the instrumented source code produced by the compiler, can decide to
932trade precision of the analysis for simplicity by approximating the
933cost by safe bounds that do not depend on the processor state. Interestingly, the functional analysis of
934the code could determine which blocks are executed more frequently in order to
935use more aggressive approximations for those that are executed least.
936
937Dependent labelling can also be applied to allow the compiler to duplicate
938blocks that contain labels (e.g. in loop optimisations)~\cite{paolo}. The effect is to assign
939a different cost to the different occurrences of a duplicated label. For
940example, loop peeling turns a loop into the concatenation of a copy of the loop
941body for the first iteration and the conditional execution of the
942loop for successive iterations. Further optimisations will compile the two
943copies of the loop differently, with the first body usually
944taking more time.
945
946By introducing a variable that keeps track of the iteration number, we can
947associate to the label a cost that is a function of the iteration number. The
948same technique works for loop unrolling without modification: the function will
949assign one cost to the even iterations and another cost to the odd
950ones.  The optimisation code
951that duplicates the loop bodies must also modify the code to correctly propagate
952the update of the iteration numbers. The technical details are more complicated and
953can be found in the CerCo reports and publications. The implementation, however,
954is quite simple (and forms part of our OCaml version of the compiler)
955and the changes to a loop optimising compiler are minimal.
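A sketch of such an instrumentation, with the iteration number playing the role of the
tracking variable (the function \lstinline'__cost_incr_body' and the constants 97 and 78
are invented for the example; the peeled first iteration is charged more):
\begin{lstlisting}
static unsigned long __cost = 0;
static char threshold = 4;

/* cost of the loop body as a function of the iteration number         */
static void __cost_incr_body(int iteration) {
  __cost += (iteration == 0) ? 97 : 78;
}

int count(char *p, int len) {
  int j, found = 0;
  for (j = 0; j < len; j++) {
    __cost_incr_body(j);          /* cost depends on the iteration      */
    if (*p <= threshold) found++;
    p++;
  }
  return found;
}
\end{lstlisting}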
956
957\subsection{Techniques to exploit the induced cost model}
958\label{sec:exploit}
959We now turn our attention to synthesising high-level costs, such as
960the reaction time of a real-time program.  We consider as our starting point source level costs
961provided by basic labelling, in other words annotations
962on the source code which are constants that provide a sound and sufficiently
963precise upper bound on the cost of executing the blocks after compilation to
964object code.
965
966The principle that we have followed in designing the cost synthesis tools is
967that the synthesised bounds should be expressed and proved within a general
968purpose tool built to reason on the source code. In particular, we rely on the
969Frama-C tool to reason on C code and on the Coq proof-assistant to reason on
970higher-order functional programs.
971This principle entails that
972the inferred synthetic bounds are indeed correct as long as the general purpose
973tool is, and that there is no limitation on the class of programs that can be handled,
974for example by resorting to interactive proof.
975
976Of course, automation is desirable whenever possible. Within this framework,
977automation means writing programs that give hints to the general purpose tool.
978These hints may take the form, say, of loop invariants/variants, of predicates
979describing the structure of the heap, or of types in a light logic. If these
980hints are correct and sufficiently precise the general purpose tool will produce
981a proof automatically, otherwise, user interaction is required.

\paragraph{The Cost plugin and its application to the Lustre compiler.}
Frama-C~\cite{framac} is a set of analysers for C programs with a
specification language, ACSL. New analyses can be added dynamically
via plugins. For instance, the Jessie plugin~\cite{jessie} allows deductive
verification of C programs with respect to their specification in ACSL, with
various provers as back-end tools.
We developed the CerCo Cost plugin for the Frama-C platform as a proof of
concept of an automatic environment exploiting the cost annotations produced by
the CerCo compiler. It consists of an OCaml program which essentially
uses the CerCo compiler to produce a related C program with cost annotations,
and applies some heuristics to produce a tentative bound on the cost of
executing the C functions of the program as a function of the value of their
parameters. The user can then call the Jessie plugin to discharge the
related proof obligations.
In the following we elaborate on the soundness of the framework and the
experiments we performed with the Cost tool on C programs, including some produced by a
Lustre compiler.

\paragraph{Soundness.} The soundness of the whole framework depends on the cost
annotations added by the CerCo compiler,
the verification conditions (VCs) generated by Jessie, and the
external provers discharging the VCs. Jessie can be used to verify the
synthesised bounds because our plugin generates them in ACSL format. Thus, even if the added synthetic costs are
incorrect (relative to the cost annotations), the process as a whole remains
correct: indeed, Jessie will not validate incorrect costs, and no conclusion can
be drawn about the WCET of the program in this case. In other words, the
soundness does not depend on the cost plugin, which can in
principle produce any synthetic cost. However, in order to actually
prove a WCET of a C function, we need to add correct annotations in such a way that
Jessie and the subsequent automatic provers have enough information to deduce their
validity. In practice this is not straightforward, even for very simple programs
composed of branching and assignments (no loops and no recursion), because a fine
analysis of the VCs associated with branching may lead to a complexity blow-up.

\paragraph{Experience with Lustre.} Lustre~\cite{lustre} is a data-flow language for programming
synchronous systems, with a compiler which targets C. We designed a
wrapper for supporting Lustre files.
The C function produced by the compiler is relatively simple, loop-free
code which implements the step function of the
synchronous system; computing the WCET of this function amounts to obtaining a
bound on the reaction time of the system. We tested the Cost plugin and the
Lustre wrapper on the C programs generated by the Lustre compiler. For programs
consisting of a few hundred lines of code, the Cost plugin computes a
WCET and Alt-Ergo is able to discharge all VCs automatically.

\paragraph{Handling C programs with simple loops.}
The cost annotations added by the CerCo compiler take the form of C instructions
that update a fresh global variable, the cost variable, by a constant.
Synthesising a WCET of a C function thus consists of statically determining an
upper bound on the difference between the value of the cost variable before and
after the execution of the function, i.e. finding the instructions
that update the cost variable and establishing the number of times they are passed
through during the flow of execution. To perform the analysis the plugin
assumes that there are no recursive functions in the program, and that
every loop is annotated with a variant. In the case of `for' loops the
variants are inferred automatically where a loop counter can be
syntactically detected.
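
The instrumented code that the plugin analyses therefore looks roughly like the following sketch (the variable name \lstinline'__cost' and the constants are again illustrative); here \lstinline'i' is syntactically recognised as a loop counter, so the variant \lstinline'n - i' can be inferred automatically.
\begin{lstlisting}
unsigned long __cost = 0;      /* fresh global cost variable (name assumed) */

int count(int *a, int n, int x) {
  int i, c = 0;
  __cost += 91;                /* cost of prologue and loop set-up (assumed)  */
  for (i = 0; i < n; i++) {    /* `i' is a syntactic loop counter             */
    __cost += 47;              /* worst-case cost of one iteration (assumed)  */
    if (a[i] == x)
      c++;
  }
  __cost += 15;                /* cost of the return path (assumed) */
  return c;
}
\end{lstlisting}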

The plugin computes a call-graph and proceeds to calculate bounds for
each function from the leaves up to the main function.
The computation of the cost of each function is performed by traversing its
control-flow graph, where the cost of a node is the maximum of the
costs of its successors.
In the case of a loop with a body that has a constant cost for every step of the
loop, the cost is the product of the cost of the body and the value of the variant
at the start of the loop.
In the case of a loop with a body whose cost depends on the values of some free
variables, a fresh logic function $f$ is introduced to represent the cost of the
loop in the logic assertions. This logic function takes the variant as its first
parameter. The other parameters of $f$ are the free variables of the body of the
loop. An axiom is added to account for the fact that the cost is accumulated at each
step of the loop.
The cost of the function is added as a post-condition of the function.
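
The specifications synthesised in this case might take roughly the following ACSL shape (a sketch with names, costs, and a single free variable \lstinline'm' of our own choosing; the plugin's actual output may differ).
\begin{lstlisting}
extern unsigned long __cost;   /* global cost variable (name assumed) */

/* f(v, m) stands for the cost of the loop when the variant is v and
   the free variable occurring in the body is m. */
/*@ axiomatic loop_cost {
  @   logic integer f(integer v, integer m);
  @   axiom f_base: \forall integer m; f(0, m) == 0;
  @   axiom f_step: \forall integer v, m;
  @     v > 0 ==> f(v, m) == (36 + 12 * m) + f(v - 1, m);
  @ }
  @*/

/*@ ensures __cost <= \old(__cost) + 25 + f(n, m); */
void h(int n, int m);
\end{lstlisting}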

The user can also specify more precise variants and annotate functions
with their own cost specifications. The plugin will use these
instead of computing its own, allowing greater precision and the
ability to analyse programs which the variant generator does not
support.

In addition to the loop-free Lustre code, this method was successfully
applied to a small range of cryptographic code.  See~\cite{labelling}
for more details.  The example in Section~\ref{sec:workflow} was also
produced using the plugin.  The variant was calculated automatically
by noticing that \lstinline'j' is a loop counter with maximum value
\lstinline'len'.  The most expensive path through the loop body
($78+136 = 214$) is then multiplied by the number of iterations to
give the cost of the loop.
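
The synthesised post-condition is then a bound of roughly the following form (a sketch: the additive constant 111 accounting for the code outside the loop and the prototype are ours, not the plugin's literal output).
\begin{lstlisting}
extern unsigned long __cost;   /* global cost variable (name assumed) */

/*@ requires len >= 0;
  @ ensures __cost <= \old(__cost) + 111 + 214 * len;
  @*/
void encrypt(unsigned char *buf, int len);
\end{lstlisting}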

\paragraph{C programs with pointers.}
Using first-order logic and SMT solvers to specify and verify programs involving pointer-based
data structures such as linked lists or graphs has some limitations.
Separation logic, a program logic
with a new notion of conjunction to express spatial heap separation, is an elegant alternative.
Bobot has
recently introduced automatically generated separation
predicates to simulate separation logic reasoning in the Jessie plugin, whose specification language, verification condition
generator, and theorem provers were not designed with separation logic in
mind~\cite{bobot}. CerCo's plugin can exploit these predicates to automatically
reason about the execution cost of simple heap-manipulating programs such as an
in-place list reversal.
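
A sketch of such a program is given below: an in-place list reversal in which each iteration is annotated with a constant cost (the structure, variable names, and the constant are illustrative); the separation predicates generated for Jessie then allow the provers to reason about the heap traversed by the loop and to bound \lstinline'__cost' linearly in the length of the list.
\begin{lstlisting}
extern unsigned long __cost;   /* global cost variable (name assumed) */

struct cell { int hd; struct cell *tl; };

struct cell *reverse(struct cell *l) {
  struct cell *prev = 0;
  while (l != 0) {
    struct cell *next = l->tl;
    __cost += 29;              /* cost of one iteration (assumed) */
    l->tl = prev;
    prev  = l;
    l     = next;
  }
  return prev;
}
\end{lstlisting}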

\section{Conclusions and future work}

All CerCo software and deliverables may be found on the project homepage~\cite{cerco}.

The results obtained so far are encouraging and provide evidence that
it is possible to perform static time and space analysis at the source level
without losing accuracy, while reducing the trusted code base and reconciling the
study of functional and non-functional properties of programs. The
techniques introduced appear to be scalable, to cover both imperative and
functional languages, and to be compatible with every compiler optimisation
we have considered so far.

To prove that compilers can keep track of optimisations
and induce a precise cost model on the source code, we targeted a simple
architecture that admits a cost model independent of the execution history.
The most important future work is dealing with hardware architectures
characterised by history-dependent stateful components, like caches and
pipelines. The main issue is to assign a parametric, dependent cost
to basic blocks that can later be transferred by the labelling approach to
the source code and represented in a meaningful way to the user. The dependent
labelling approach that we have studied seems a promising tool to achieve
this goal, but more work is required to provide good source-level
approximations of the relevant processor state.

Other examples of future work are to improve the cost invariant
generator algorithms and the coverage of compiler optimisations, to combine
the labelling approach with the type and effect discipline of~\cite{typeffects}
to handle languages with implicit memory management, and to experiment with
our tools in the early phases of development. Larger case studies are also necessary
to evaluate the CerCo prototype on realistic, industrial-scale programs.

% \bibliographystyle{splncs}
\bibliography{fopara13}
% \begin{thebibliography}{19}
%
% \bibitem{survey} \textbf{A Survey of Static Program Analysis Techniques}
% W.~W\"ogerer. Technical report. Technische Universit\"at Wien 2005
%
% \bibitem{cerco} \textbf{Certified Complexity}. R.M. Amadio, A. Asperti, N. Ayache,
% B. Campbell, D. P. Mulligan, R. Pollack, Y. Regis-Gianas, C. Sacerdoti Coen, I.
% Stark, in Procedia Computer Science, Volume 7, 2011, Proceedings of the 2nd
% European Future Technologies Conference and Exhibition 2011 (FET 11), 175-177.
%
% \bibitem{labelling} \textbf{Certifying and Reasoning on Cost Annotations in C
% Programs}, N. Ayache, R.M. Amadio, Y. R\'{e}gis-Gianas, in Proc. FMICS, Springer
% LNCS
% 7437: 32-46, 2012.
% %, DOI:10.1007/978-3-642-32469-7\_3.
%
% \bibitem{labelling2} \textbf{Certifying and reasoning on cost annotations of
% functional programs}.
% R.M. Amadio, Y. R\'{e}gis-Gianas. Proceedings of the Second international conference
% on Foundational and Practical Aspects of Resource Analysis FOPARA 2011 Springer
% LNCS 7177:72-89, 2012.
%
% \bibitem{compcert} \textbf{Formal verification of a realistic compiler}. X. Leroy, In Commun. ACM 52(7), 107–115, 2009.
%
% \bibitem{framac} \textbf{Frama-C user manual}. L. Correnson, P. Cuoq, F. Kirchner, V. Prevosto, A. Puccetti, J. Signoles,
% B. Yakobowski. in CEA-LIST, Software Safety Laboratory, Saclay, F-91191,
% \url{http://frama-c.com/}.
%
% \bibitem{paolo} \textbf{Indexed Labels for Loop Iteration Dependent Costs}. P.
% Tranquilli, in Proceedings of the 11th International Workshop on Quantitative
% Aspects of Programming Languages and Systems (QAPL 2013), Rome, 23rd-24th March
% 2013, Electronic Proceedings in Theoretical Computer Science, to appear in 2013.
%
% \bibitem{separation} \textbf{Intuitionistic reasoning about shared mutable data
% structure} J.C. Reynolds. In Millennial Perspectives in Computer Science,
% pages 303–321, Houndsmill, Hampshire, 2000. Palgrave.
%
% \bibitem{lustre} \textbf{LUSTRE: a declarative language for real-time
% programming}
% P. Caspi, D. Pilaud, N. Halbwachs, J.A. Plaice. In Proceedings of
% the 14th ACM SIGACT-SIGPLAN symposium on Principles of programming languages ACM
% 1987.
%
% \bibitem{correctness} \textbf{On the correctness of an optimising assembler for
% the intel MCS-51 microprocessor}.
% D. P. Mulligan, C. Sacerdoti Coen. In Proceedings of the Second
% international conference on Certified Programs and Proofs, Springer-Verlag 2012.
%
% \bibitem{proartis} \textbf{PROARTIS: Probabilistically Analysable Real-Time
% Systems}, F.J. Cazorla, E. Qui\~{n}ones, T. Vardanega, L. Cucu, B. Triquet, G.
% Bernat, E. Berger, J. Abella, F. Wartel, M. Houston, et al., in ACM Transactions
% on Embedded Computing Systems, 2012.
%
% \bibitem{embounded} \textbf{The EmBounded project (project paper)}. K. Hammond,
% R. Dyckhoff, C. Ferdinand, R. Heckmann, M. Hofmann, H. Loidl, G. Michaelson, J.
% Serot, A. Wallace, in Trends in Functional Programming, Volume 6, Intellect
% Press, 2006.
%
% \bibitem{matita}
% \textbf{The Matita Interactive Theorem Prover}.
% A. Asperti, C. Sacerdoti Coen, W. Ricciotti, E. Tassi.
% 23rd International Conference on Automated Deduction, CADE 2011.
%
% \bibitem{typeffects} \textbf{The Type and Effect Discipline}. J.-P. Talpin,
% P. Jouvelot.
% In Proceedings of the Seventh Annual Symposium on Logic in Computer Science
% (LICS '92), Santa Cruz, California, USA, June 22-25, 1992.
% IEEE Computer Society 1992.
%
% \bibitem{stateart} \textbf{The worst-case execution-time problem overview of
% methods
% and survey of tools.} R. Wilhelm et al., in ACM Transactions on Embedded
% Computing Systems, 7:1–53, May 2008.
%
% %\bibitem{proartis2} \textbf{A Cache Design for Probabilistic Real-Time
% % Systems}, L. Kosmidis, J. Abella, E. Quinones, and F. Cazorla, in Design,
% % Automation, and Test in Europe (DATE), Grenoble, France, 03/2013.
%
% \end{thebibliography}


%\bibliography{fopara13.bib}

\appendix

\include{appendix}

\end{document}