1\documentclass{llncs}
2
3\usepackage{amsfonts}
4\usepackage{amsmath}
5\usepackage{amssymb} 
6\usepackage[english]{babel}
7\usepackage{color}
8\usepackage{fancybox}
9\usepackage{fancyvrb}
10\usepackage{graphicx}
11\usepackage[colorlinks,
12            bookmarks,bookmarksopen,bookmarksdepth=2]{hyperref}
13\usepackage{hyphenat}
14\usepackage[utf8x]{inputenc}
15\usepackage{listings}
16\usepackage{mdwlist}
17\usepackage{microtype}
18\usepackage{stmaryrd}
19\usepackage{url}
20
21\usepackage{tikz}
22\usetikzlibrary{positioning,calc,patterns,chains,shapes.geometric,scopes}
23
24%\renewcommand{\verb}{\lstinline}
25%\def\lstlanguagefiles{lst-grafite.tex}
26%\lstset{language=Grafite}
27\lstset{language=C,basicstyle=\tt,basewidth=.5em,lineskip=-1.5pt}
28
29\newlength{\mylength}
30\newenvironment{frametxt}%
31        {\setlength{\fboxsep}{5pt}
32                \setlength{\mylength}{\linewidth}%
33                \addtolength{\mylength}{-2\fboxsep}%
34                \addtolength{\mylength}{-2\fboxrule}%
35                \Sbox
36                \minipage{\mylength}%
37                        \setlength{\abovedisplayskip}{0pt}%
38                        \setlength{\belowdisplayskip}{0pt}%
39                }%
40                {\endminipage\endSbox
41                        \[\fbox{\TheSbox}\]}
42
43\title{Certified Complexity (CerCo)\thanks{The project CerCo acknowledges the
44financial support of the Future and Emerging Technologies (FET) programme within
45the Seventh Framework Programme for Research of the European Commission, under
46FET-Open grant number: 243881}}
47\author{
48%Roberto
49R.~M. Amadio$^{4}$ \and
50%Nicolas
51N.~Ayache$^{3,4}$ \and
52%François
53F.~Bobot$^{3,4}$ \and
54%Jacob
55J.~P. Boender$^1$ \and
56%Brian
57B.~Campbell$^2$ \and
58%Ilias
59I.~Garnier$^2$ \and
60%Antoine
61A.~Madet$^4$ \and
62%James
63J.~McKinna$^2$ \and
64%Dominic
65D.~P. Mulligan$^1$ \and
66%Mauro
67M.~Piccolo$^1$ \and
68%Randy
69R.~Pollack$^2$ \and
70%Yann
71Y.~R\'egis-Gianas$^{3,4}$ \and
72%Claudio
73C.~Sacerdoti Coen$^1$ \and
74%Ian
75I.~Stark$^2$ \and
76%Paolo
77P.~Tranquilli$^1$}
\institute{Dipartimento di Informatica - Scienza e Ingegneria, Universit\`a di
79Bologna \and
80LFCS, School of Informatics, University of Edinburgh
81\and INRIA (Team $\pi r^2$)
82\and
Universit\'e Paris Diderot
84}
85
86\bibliographystyle{splncs03}
87
88\begin{document}
89
90\maketitle
91
92% Brian: I've changed "without loss of accuracy" because we discuss
93% important precision/complexity tradeoffs for advanced architectures
94% in the dependent labelling section.
95\begin{abstract}
96This paper provides an overview of the FET-Open Project CerCo
97(`Certified Complexity'). The project's main achievement is
98the development of a technique for analysing non-functional
99properties of programs (time, space) at the source level, with little or no loss of accuracy
100and with a small trusted code base. The main software component
101developed is a certified compiler producing cost annotations. The compiler
102translates source code to object code and produces an instrumented copy of the
103source code. This instrumentation exposes at
104the source level---and tracks precisely---the actual (non-asymptotic)
105computational cost of the input program. Untrusted invariant generators and trusted theorem provers
106can then be used to compute and certify the parametric execution time of the
107code.
108\end{abstract}
109
110% ---------------------------------------------------------------------------- %
111% SECTION                                                                      %
112% ---------------------------------------------------------------------------- %
113\section{Introduction}
114
115%\paragraph{Problem statement.}
116Computer programs can be specified with both
117functional constraints (what a program must do) and non-functional constraints
118(what resources---time, space, etc.---the program may use).  In the current
119state of the art, functional properties are verified for high-level source code
120by combining user annotations (e.g. preconditions and invariants) with a
121multitude of automated analyses (invariant generators, type systems, abstract
122interpretation, theorem proving, and so on). By contrast, non-functional properties
123are generally checked on low-level object code, but also demand information
124about high-level functional behaviour that must somehow be reconstructed.
125
126This situation presents several problems: 1) it can be hard to infer this
127high-level structure in the presence of compiler optimisations; 2) techniques
128working on object code are not useful in early development, yet problems
129detected later are more expensive to tackle; 3) parametric cost analysis is very
hard: how can we reflect a cost that depends on the execution state (e.g. the
value of a register or a carry bit) as a cost that the user can understand
when looking at the source code? 4) performing functional analysis on the object code
133makes it hard for the programmer to provide information, leading to less precision in
134the estimates.
135
136\paragraph{Vision and approach.} We want to reconcile functional and
137non-functional analyses: to share information and perform both at the same time
138on source code.
139%
140What has previously prevented this approach is the lack of a uniform and precise
141cost model for high-level code: 1) each statement occurrence is compiled
142differently and optimisations may change control flow; 2) the cost of an object
143code instruction may depend on the runtime state of hardware components like
pipelines and caches, none of which is visible in the source code.
145
146To solve the issue, we envision a new generation of compilers able to keep track
147of program structure through compilation and optimisation, and to exploit that
148information to define a cost model for source code that is precise, non-uniform,
149and which accounts for runtime state. With such a source-level cost model we can
150reduce non-functional verification to the functional case and exploit the state
151of the art in automated high-level verification~\cite{survey}. The techniques
152currently used by the Worst Case Execution Time (WCET) community, which perform the analysis on object code,
153are still available but can now be coupled with an additional source-level
154analysis. Where the approach produces precise cost models too complex to reason
about, safe approximations can be used to trade precision for simplicity.
156Finally, source code analysis can be used during the early stages of development, when
157components have been specified but not implemented: source code modularity means
158that it is enough to specify the non-functional behaviour of unimplemented
159components.
160
161\paragraph{Contributions.} We have developed what we refer to as \emph{the labelling approach} \cite{labelling}, a
162technique to implement compilers that induce cost models on source programs by
163very lightweight tracking of code changes through compilation. We have studied
164how to formally prove the correctness of compilers implementing this technique.
165We have implemented such a compiler from C to object binaries for the 8051
166micro-controller, and verified it in an interactive theorem prover.
167
168To demonstrate source-level verification of costs we have implemented
169a Frama-C plugin \cite{framac} that invokes the compiler on a source
170program and uses this to generate invariants on the high-level source
171that correctly model low-level costs. The plugin certifies that the
172program respects these costs by calling automated theorem provers, a
173new and innovative technique in the field of cost analysis. Finally,
we have conducted several case studies, including showing that the
175plugin can automatically compute and certify the exact reaction time
176of Lustre~\cite{lustre} data flow programs compiled into C.
177
178\section{Project context and approach}
179Formal methods for verification of functional properties of programs have
180now reached a level of maturity and automation that is facilitating a slow but
181increasing adoption in production environments. For safety critical code, it is
182becoming commonplace to combine rigorous software engineering methodologies and testing
183with static analysis, taking the strong points of each approach and mitigating
184their weaknesses. Of particular interest are open frameworks
185for the combination of different formal methods, where the programs can be
186progressively specified and are continuously enriched with new safety
187guarantees: every method contributes knowledge (e.g. new invariants) that
188becomes an assumption for later analysis.
189
190The scenario for the verification of non-functional properties (time spent,
191memory used, energy consumed) is bleaker.
192% and the future seems to be getting even worse.
In industry, it is common to verify that real time systems meet their deadlines
194by simply performing many runs of the system and timing their execution,  computing the
195maximum time and adding an empirical safety margin, claiming the result to be a
196bound for the WCET of the program. Formal methods and software to statically
197analyse the WCET of programs exist, but they often produce bounds that are too
198pessimistic to be useful. Recent advancements in hardware architecture
have focused
on improving average-case performance, not the
201predictability of the worst case. Execution time is becoming increasingly
202dependent on execution history and the internal state of
203hardware components like pipelines and caches. Multi-core processors and non-uniform
204memory models are drastically reducing the possibility of performing
205static analysis in isolation, because programs are less and less time
206composable. Clock-precise hardware models are necessary for static analysis, and
207obtaining them is becoming harder due to the increased sophistication
208of hardware design.
209
210Despite the aforementioned problems, the need for reliable real time
211systems and programs is increasing, and there is increasing pressure
212from the research community towards the introduction of alternative
213hardware with more predictable behaviour, which would be more suitable
214for static analysis.  One example being investigated by the Proartis
215project~\cite{proartis} is to decouple execution time from execution
216history by introducing randomisation.
217
218In the CerCo project \cite{cerco} we do not try to address this problem, optimistically
219assuming that improvements in low-level timing analysis or architecture will make
220verification feasible in the longer term. The main objective of our work is
221instead to bring together static analysis of functional and non-functional
222properties, which, according to the current state of the art, are completely
223independent activities with limited exchange of information: while the
224functional properties are verified on the source code, the analysis of
225non-functional properties is entirely performed on the object code to exploit
226clock-precise hardware models.
227
228\subsection{Current object-code methods}
229
230Analysis currently takes place on object code for two main reasons.
231First, there cannot be a uniform, precise cost model for source
232code instructions (or even basic blocks). During compilation, high level
233instructions are torn apart and reassembled in context-specific ways so that
identifying a fragment of object code with a single high-level instruction is
infeasible. Even the control flow of the object and source code can be very
different as a result of optimisations; for example, aggressive loop
optimisations may completely transform source-level loops. Despite the lack of a uniform, compilation- and
238program-independent cost model on the source language, the literature on the
239analysis of non-asymptotic execution time on high level languages that assumes
240such a model is growing and gaining momentum. However, unless we provide a
241replacement for such cost models, this literature's future practical impact looks
242to be minimal. Some hope has been provided by the EmBounded project \cite{embounded}, which
243compositionally compiles high-level code to a byte code that is executed by an
244interpreter with guarantees on the maximal execution time spent for each byte code
245instruction. This provides a uniform model at the expense of the model's
246precision (each cost is a pessimistic upper bound) and the performance of the
247executed code (because the byte code is interpreted compositionally instead of
248performing a fully non-compositional compilation).
249
250The second reason to perform the analysis on the object code is that bounding
251the worst case execution time of small code fragments in isolation (e.g. loop
252bodies) and then adding up the bounds yields very poor estimates because no
knowledge of the hardware state prior to executing the fragment can be assumed. By
254analysing longer runs the bound obtained becomes more precise because the lack
255of information about the initial state has a relatively small impact.
256
257To calculate the cost of an execution, value and control flow analyses
258are required to bound the number of times each basic block is
executed.  Current state-of-the-art
WCET tools such as AbsInt's aiT perform these analyses on the
261object code, where the logic of the program is harder to reconstruct
262and most information available at the source code level has been lost;
263see~\cite{stateart} for a survey. Imprecision in the analysis can lead to useless bounds. To
264augment precision, the tools ask the user to provide constraints on
265the object code control flow, usually in the form of bounds on the
266number of iterations of loops or linear inequalities on them. This
267requires the user to manually link the source and object code,
translating their assumptions about the source code (which may be wrong) into
269object code constraints. The task is error prone and hard, especially
270in the presence of complex compiler optimisations.
271
272Traditional techniques for WCET that work on object code are also affected by
273another problem: they cannot be applied before the generation of the object
274code. Functional properties can be analysed in early development stages, while
275analysis of non-functional properties may come too late to avoid expensive
276changes to the program architecture.
277
278\subsection{CerCo}
279
280In CerCo we propose a radically new approach to the problem: we reject the idea
281of a uniform cost model and we propose that the compiler, which knows how the
282code is translated, must return the cost model for basic blocks of high level
instructions. It must do so by keeping track of control flow modifications so
that they can be reversed, and by interfacing with processor timing analysis.
285
286By embracing compilation, instead of avoiding it like EmBounded did, a CerCo
287compiler can both produce efficient code and return costs that are
288as precise as the processor timing analysis can be. Moreover, our costs can be
289parametric: the cost of a block can depend on actual program data, on a
290summary of the execution history, or on an approximated representation of the
hardware state. For example, loop optimisations may assign to a loop body a cost
292that is a function of the number of iterations performed. As another example,
293the cost of a block may be a function of the vector of stalled pipeline states,
294which can be exposed in the source code and updated at each basic block exit. It
295is parametricity that allows one to analyse small code fragments without losing
296precision: in the analysis of the code fragment we do not have to ignore the
297initial hardware state. On the contrary, we can assume that we know exactly which
298state (or mode, as the WCET literature calls it) we are in.
299
300The CerCo approach has the potential to dramatically improve the state of the
301art. By performing control and data flow analyses on the source code, the error
302prone translation of invariants is completely avoided. It is in fact performed
303by the compiler itself when it induces the cost model on the source language.
304Moreover, any available technique for the verification of functional properties
305can be immediately reused and multiple techniques can collaborate together to
306infer and certify cost invariants for the program. Parametric cost analysis
becomes the default, with non-parametric bounds used as a last resort when
trading the precision of the analysis for simplicity. \emph{A priori}, no
309technique previously used in traditional WCET is lost: processor
310timing analyses can be used by the compiler on the object code, and the rest can be applied
311at the source code level.
312 Our approach can also work in the early
313stages of development by axiomatically attaching costs to unimplemented components.
314
315
All software used to verify properties of programs must be as bug-free as
317possible. The trusted code base for verification consists of the code that needs
318to be trusted to believe that the property holds. The trusted code base of
319state-of-the-art WCET tools is very large: one needs to trust the control flow
320analyser and the linear programming libraries it uses and also the formal models
321of the hardware. In CerCo we are moving the control flow analysis to the source
322code and we are introducing a non-standard compiler too. To reduce the trusted
323code base, we implemented a prototype and a static analyser in an interactive
324theorem prover, which was used to certify that the costs added to the source
325code are indeed those incurred by the hardware. Formal models of the
326hardware and of the high level source languages were also implemented in the
327interactive theorem prover. Control flow analysis on the source code has been
328obtained using invariant generators, tools to produce proof obligations from
329generated invariants and automatic theorem provers to verify the obligations. If
330the automatic provers are able to generate proof traces that can be
331independently checked, the only remaining component that enters the trusted code
332base is an off-the-shelf invariant generator which, in turn, can be proved
correct using an interactive theorem prover. Therefore we achieve the twin
objectives of allowing the use of more off-the-shelf components (e.g. provers and
invariant generators) and reducing the trusted code base at the same time.
336
337%\paragraph{Summary of the CerCo objectives.} To summarize, the goal of CerCo is
338% to reconcile functional with non-functional analysis by performing them together
339% on the source code, sharing common knowledge about execution invariants. We want
340% to achieve the goal implementing a new generation of compilers that induce a
341% parametric, precise cost model for basic blocks on the source code. The compiler
342% should be certified using an interactive theorem prover to minimize the trusted
343% code base of the analysis. Once the cost model is induced, off-the-shelf tools
344% and techniques can be combined together to infer and prove parametric cost
345% bounds.
346%The long term benefits of the CerCo vision are expected to be:
347%1. the possibility to perform static analysis during early development stages
348%2.  parametric bounds made easier
349%3.  the application of off-the-shelf techniques currently unused for the
350% analysis of non-functional properties, like automated proving and type systems
351%4. simpler and safer interaction with the user, that is still asked for
352% knowledge, but on the source code, with the additional possibility of actually
353% verifying the provided knowledge
354%5. a reduced trusted code base
355%6. the increased accuracy of the bounds themselves.
356%
357%The long term success of the project is hindered by the increased complexity of
358% the static prediction of the non-functional behaviour of modern hardware. In the
359% time frame of the European contribution we have focused on the general
360% methodology and on the difficulties related to the development and certification
361% of a cost model inducing compiler.
362
363\section{The typical CerCo workflow}
364\label{sec:workflow}
365\begin{figure}[!t]
366\begin{tabular}{l@{\hspace{0.2cm}}|@{\hspace{0.2cm}}l}
367\begin{lstlisting}
368char a[] = {3, 2, 7, 14};
369char threshold = 4;
370
371int count(char *p, int len) {
372  char j;
373  int found = 0;
374  for (j=0; j < len; j++) {
375    if (*p <= threshold)
376      found++;
377    p++;
378    }
379  return found;
380}
381
382int main() {
383  return count(a,4);
384}
385\end{lstlisting}
386&
387%  $\vcenter{\includegraphics[width=7.5cm]{interaction_diagram.pdf}}$
388\begin{tikzpicture}[
389    baseline={([yshift=-.5ex]current bounding box)},
390    element/.style={draw, text width=1.6cm, on chain, text badly centered},
391    >=stealth
392    ]
393{[start chain=going below, node distance=.5cm]
394\node [element] (cerco) {CerCo\\compiler};
395\node [element] (cost)  {CerCo\\cost plugin};
396{[densely dashed]
397\node [element] (ded)   {Deductive\\platform};
398\node [element] (check) {Proof\\checker};
399}
400}
401\coordinate [left=3.5cm of cerco] (left);
402{[every node/.style={above, text width=3.5cm, text badly centered,
403                     font=\scriptsize}]
404\draw [<-] ([yshift=-1ex]cerco.north west) coordinate (t) --
405    node {C source}
406    (t-|left);
407\draw [->] (cerco) -- (cost);
408\draw [->] ([yshift=1ex]cerco.south west) coordinate (t) --
409    node {C source+\color{red}{cost annotations}}
410    (t-|left) coordinate (cerco out);
411\draw [->] ([yshift=1ex]cost.south west) coordinate (t) --
412    node {C source+\color{red}{cost annotations}\\+\color{blue}{synthesized assertions}}
413    (t-|left) coordinate (out);
414{[densely dashed]
415\draw [<-] ([yshift=-1ex]ded.north west) coordinate (t) --
416    node {C source+\color{red}{cost annotations}\\+\color{blue}{complexity assertions}}
417    (t-|left) coordinate (ded in) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
418\draw [->] ([yshift=1ex]ded.south west) coordinate (t) --
419    node {complexity obligations}
420    (t-|left) coordinate (out);
421\draw [<-] ([yshift=-1ex]check.north west) coordinate (t) --
422    node {complexity proof}
423    (t-|left) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
424\draw [dash phase=2.5pt] (cerco out) .. controls +(-1, 0) and +(-1, 0) ..
425    (ded in);
426}}
427%% user:
428% head
429\draw (current bounding box.west) ++(-.2,.5)
430    circle (.2) ++(0,-.2) -- ++(0,-.1) coordinate (t);
431% arms
432\draw (t) -- +(-.3,-.2);
433\draw (t) -- +(.3,-.2);
434% body
435\draw (t) -- ++(0,-.4) coordinate (t);
436% legs
437\draw (t) -- +(-.2,-.6);
438\draw (t) -- +(.2,-.6);
439\end{tikzpicture}
440\end{tabular}
441\caption{On the left: code to count the array's elements
442 that are less than or equal to the threshold. On the right: CerCo's interaction
443 diagram. CerCo's components are drawn solid.}
444\label{test1}
445\end{figure}
446We illustrate the workflow we envisage (on the right
447of~\autoref{test1}) on an example program (on the left
448of~\autoref{test1}).  The user writes the program and feeds it to the
449CerCo compiler, which outputs an instrumented version of the same
program, which updates global variables recording the elapsed
451execution time and the stack space usage.  The red lines in
452\autoref{itest1} introducing variables, functions and function calls
453starting with \lstinline'__cost' and \lstinline'__stack' are the instrumentation introduced by the
454compiler.  For example, the two calls at the start of
455\lstinline'count' say that 4 bytes of stack are required, and that it
456takes 111 cycles to reach the next cost annotation (in the loop body).
457The compiler measures these on the labelled object code that it generates.
458
The annotated program can then be enriched with complexity
assertions in the style of Hoare logic, which are passed to a deductive
platform (in our case Frama-C). As a Frama-C cost plugin we provide a
simple automatic synthesiser for complexity assertions, which can
be overridden by the user to increase or decrease accuracy.  These are the blue
464comments starting with \lstinline'/*@' in \autoref{itest1}, written in
465Frama-C's specification language, ACSL.  From the
466assertions, a general purpose deductive platform produces proof
467obligations which in turn can be closed by automatic or interactive
468provers, ending in a proof certificate.
469
470% NB: if you try this example with the live CD you should increase the timeout
471
472Twelve proof obligations are generated from~\autoref{itest1} (to prove
that the loop invariant holds after one iteration of the loop if it held before,
474to prove that the whole program execution takes at most 1358 cycles,
475etc.).  Note that the synthesised time bound for \lstinline'count',
476$178+214*(1+\text{\lstinline'len'})$ cycles, is parametric in the length of
477the array. The CVC3 prover
478closes all obligations within half a minute on routine commodity
479hardware.  A simpler non-parametric version can be solved in a few
480seconds.
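As a quick sanity check of these figures against the instrumentation constants
in \autoref{itest1} (the following arithmetic is ours): $178 = 111 + 67$
accounts for the code of \lstinline'count' before and after the loop,
$214 = 78 + 136$ is the worst-case cost of one iteration, and instantiating the
bound at \lstinline'len' $= 4$ and adding the 110 cycles charged in
\lstinline'main' before the call gives
\[
110 + \bigl(178 + 214 \times (1 + 4)\bigr) = 110 + 1248 = 1358
\]
cycles, exactly the whole-program bound mentioned above.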
481%
482\fvset{commandchars=\\\{\}}
483\lstset{morecomment=[s][\color{blue}]{/*@}{*/},
484        moredelim=[is][\color{blue}]{$}{$},
485        moredelim=[is][\color{red}]{|}{|},
486        lineskip=-2pt}
487\begin{figure}[!p]
488\begin{lstlisting}
489|int __cost = 33, __stack = 5, __stack_max = 5;|
490|void __cost_incr(int incr) { __cost += incr; }|
491|void __stack_incr(int incr) {
492  __stack += incr;
493  __stack_max = __stack_max < __stack ? __stack : __stack_max;
494}|
495
496char a[4] = {3, 2, 7, 14};  char threshold = 4;
497
498/*@ behavior stack_cost:
499      ensures __stack_max <= __max(\old(__stack_max), 4+\old(__stack));
500      ensures __stack == \old(__stack);
501    behavior time_cost:
502      ensures __cost <= \old(__cost)+(178+214*__max(1+\at(len,Pre), 0));
503*/
504int count(char *p, int len) {
505  char j;  int found = 0;
506  |__stack_incr(4);  __cost_incr(111);|
507  $__l: /* internal */$
508  /*@ for time_cost: loop invariant
509        __cost <= \at(__cost,__l)+
510                  214*(__max(\at((len-j)+1,__l), 0)-__max(1+(len-j), 0));
511      for stack_cost: loop invariant
512        __stack_max == \at(__stack_max,__l);
513      for stack_cost: loop invariant
514        __stack == \at(__stack,__l);
515      loop variant len-j;
516  */
517  for (j = 0; j < len; j++) {
518    |__cost_incr(78);|
519    if (*p <= threshold) { |__cost_incr(136);| found ++; }
520    else { |__cost_incr(114);| }
521    p ++;
522  }
523  |__cost_incr(67);  __stack_incr(-4);|
524  return found;
525}
526
527/*@ behavior stack_cost:
528      ensures __stack_max <= __max(\old(__stack_max), 6+\old(__stack));
529      ensures __stack == \old(__stack);
530    behavior time_cost:
531      ensures __cost <= \old(__cost)+1358;
532*/
533int main(void) {
534  int t;
535  |__stack_incr(2);  __cost_incr(110);|
536  t = count(a,4);
537  |__stack_incr(-2);|
538  return t;
539}
540\end{lstlisting}
541\caption{The instrumented version of the program in \autoref{test1},
542 with instrumentation added by the CerCo compiler in red and cost invariants
543 added by the CerCo Frama-C plugin in blue. The \texttt{\_\_cost},
544 \texttt{\_\_stack} and \texttt{\_\_stack\_max} variables hold the elapsed time
545in clock cycles and the current and maximum stack usage. Their initial values
546hold the clock cycles spent in initialising the global data before calling
547\texttt{main} and the space required by global data (and thus unavailable for
548the stack).
549}
550\label{itest1}
551\end{figure}
552\section{Main scientific and technical results}
553First we describe the basic labelling approach and our compiler
554implementations that use it.  This is suitable for basic architectures
555with simple cost models.  Then we will discuss the dependent labelling
556extension which is suitable for more advanced processor architectures
557and compiler optimisations.  At the end of this section we will
558demonstrate automated high level reasoning about the source level
559costs provided by the compilers.
560
561% \emph{Dependent labelling~\cite{?}.} The basic labelling approach assumes a
562% bijective mapping between object code and source code O(1) blocks (called basic
563% blocks). This assumption is violated by many program optimizations (e.g. loop
564% peeling and loop unrolling). It also assumes the cost model computed on the
565% object code to be non parametric: every block must be assigned a cost that does
566% not depend on the state. This assumption is violated by stateful hardware like
567% pipelines, caches, branch prediction units. The dependent labelling approach is
568% an extension of the basic labelling approach that allows to handle parametric
569% cost models. We showed how the method allows to deal with loop optimizations and
570% pipelines, and we speculated about its applications to caches.
571%
572% \emph{Techniques to exploit the induced cost model.} Every technique used for
573% the analysis of functional properties of programs can be adapted to analyse the
574% non-functional properties of the source code instrumented by compilers that
575% implement the labelling approach. In order to gain confidence in this claim, we
576% showed how to implement a cost invariant generator combining abstract
577% interpretation with separation logic ideas \cite{separation}. We integrated
578% everything in the Frama-C modular architecture, in order to compute proof
579% obligations from functional and cost invariants and to use automatic theorem
580% provers on them. This is an example of a new technique that is not currently
581% exploited in WCET analysis. It also shows how precise functional invariants
582% benefits the non-functional analysis too. Finally, we show how to fully
583% automatically analyse the reaction time of Lustre nodes that are first compiled
584% to C using a standard Lustre compiler and then processed by a C compiler that
585% implements the labelling approach.
586
587% \emph{The CerCo compiler.} This is a compiler from a large subset of the C
588% program to 8051/8052 object code,
589% integrating the labelling approach and a static analyser for 8051 executables.
590% The latter can be implemented easily and does not require dependent costs
591% because the 8051 microprocessor is a very simple one, with constant-cost
592% instruction. It was chosen to separate the issue of exact propagation of the
593% cost model from the orthogonal problem of the static analysis of object code
594% that may require approximations or dependent costs. The compiler comes in
595% several versions: some are prototypes implemented directly in OCaml, and they
596% implement both the basic and dependent labelling approaches; the final version
597% is extracted from a Matita certification and at the moment implements only the
598% basic approach.
599
600\subsection{The (basic) labelling approach}
601The labelling approach is the foundational insight that underlies all the developments in CerCo.
It tracks the evolution of
basic blocks during compilation in order to propagate the cost model from the
object code to the source code without losing precision.
605\paragraph{Problem statement.} Given a source program $P$, we want to obtain an
606instrumented source program $P'$,  written in the same programming language, and
607the object code $O$ such that: 1) $P'$ is obtained by inserting into $P$ some
608additional instructions to update global cost information like the amount of
609time spent during execution or the maximal stack space required; 2) $P$ and $P'$ 
must have the same functional behaviour, i.e.\ they must produce the same output
611and intermediate observables; 3) $P$ and $O$ must have the same functional
behaviour; 4) after execution, and at interesting points during execution, the
cost information computed by $P'$ must be an upper bound on the cost incurred by $O$
to perform the corresponding operations (the \emph{soundness} property); 5) the difference
615between the costs computed by $P'$ and the execution costs of $O$ must be
616bounded by a program-dependent constant (\emph{precision} property).
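Informally, if $c_{P'}$ denotes the cost accumulated by the instrumentation in
$P'$ and $c_O$ the cost actually incurred by $O$ over the corresponding
execution, requirements 4 and 5 can be summarised (in our own notation, not
that of the formal development) as
\[
c_O \;\le\; c_{P'}
\qquad\text{and}\qquad
c_{P'} - c_O \;\le\; k_P
\]
for some constant $k_P$ depending only on the program.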
617
618\paragraph{The labelling software components.} We solve the problem in four
619stages \cite{labelling}, implemented by four software components that are used
620in sequence.
621
622The first component labels the source program $P$ by injecting label emission
623statements in appropriate positions to mark the beginning of basic blocks.
624These are the positions where the cost instrumentation will appear in the final output.
625% The
626% set of labels with their positions is called labelling.
The syntax and semantics
of the source programming language are augmented with label emission statements.
629The statement ``EMIT $\ell$'' behaves like a NOP instruction that does not affect the
630program state or control flow, but its execution is observable.
631% Therefore the observables of a run of a program becomes a stream
632% of label emissions: $\ell_1,\ldots,\ell_n$, called the program trace. We clarify the conditions
633% that the labelling must respect later.
634For the example in Section~\ref{sec:workflow} this is just the original C code
635with ``EMIT'' instructions added at every point a \lstinline'__cost_incr' call
636appears in the final code.
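Concretely, the labelled version of \lstinline'count' from \autoref{test1} has
the shape sketched below, where \lstinline'EMIT(...)' is our illustrative
C-like rendering of the ``EMIT $\ell$'' statement and the label names are
invented for the example:
\begin{lstlisting}
int count(char *p, int len) {
  char j;  int found = 0;
  EMIT(L_entry);    /* becomes __cost_incr(111) after instrumentation */
  for (j = 0; j < len; j++) {
    EMIT(L_body);   /* becomes __cost_incr(78) */
    if (*p <= threshold) { EMIT(L_then); found++; }  /* __cost_incr(136) */
    else { EMIT(L_else); }                           /* __cost_incr(114) */
    p++;
  }
  EMIT(L_exit);     /* becomes __cost_incr(67) */
  return found;
}
\end{lstlisting}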
637
The second component is a labelling-preserving compiler. It can be obtained from
639an existing compiler by adding label emission statements to every intermediate
640language and by propagating label emission statements during compilation. The
641compiler is correct if it preserves both the functional behaviour of the program
642and the traces of observables, including the labels `emitted'.
643% We may also ask that the function that erases the cost
644% emission statements commute with compilation. This optional property grants that
645% the labelling does not interfere with the original compiler behaviour. A further
646% set of requirements will be added later.
647
648The third component analyses the labelled
649object code to compute the scope of each of its label emission statements,
650i.e.\ the instructions that may be executed after the statement and before
651a new label emission is encountered, and then computes the maximum cost of
652each.  Note that we only have enough information at this point to compute
653the cost of loop-free portions of code.  We will consider how to ensure that
654every loop is broken by a cost label shortly.
655%It is a tree and not a sequence because the scope
656%may contain a branching statement. In order to grant that such a finite tree
657%exists, the object code must not contain any loop that is not broken by a label
658%emission statement. This is the first requirement of a sound labelling. The
659%analyser fails if the labelling is unsound. For each scope, the analyser
660%computes an upper bound of the execution time required by the scope, using the
661%maximum of the costs of the two branches in case of a conditional statement.
662%Finally, the analyser computes the cost of a label by taking the maximum of the
663%costs of the scopes of every statement that emits that label.
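The per-label computation is conceptually simple; the following C sketch (our
own illustration, not the CerCo analyser) shows how the cost of one label's
scope can be obtained on loop-free object code by summing instruction costs
along each path and keeping the worst branch.
\begin{lstlisting}
#include <stddef.h>

struct block {                   /* a basic block of object code          */
  unsigned cost;                 /* summed cost of its instructions       */
  struct block *next, *branch;   /* successors; NULL where a new label
                                    emission (or a return) ends the scope */
};

unsigned scope_cost(const struct block *b) {
  if (b == NULL) return 0;       /* reached the boundary of the scope     */
  unsigned rest = scope_cost(b->next);
  if (b->branch != NULL) {
    unsigned alt = scope_cost(b->branch);
    if (alt > rest) rest = alt;  /* keep the more expensive branch        */
  }
  return b->cost + rest;
}
\end{lstlisting}
Termination relies on the soundness condition discussed below: because every
loop is broken by a label emission, each scope is finite.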
664
665The fourth and final component replaces the labels in the labelled
666version of the source code produced at the start with the costs
667computed for each label's scope.  This yields the instrumented source
668code.  For the example, this is the code in \autoref{itest1}, except
669for the specifications in comments, which we consider in
670Section~\ref{sec:exploit}.
671
672\paragraph{Correctness.} Requirements 1 and 2 hold because of the
673non-invasive labelling procedure.  Requirement 3 can be satisfied by
674implementing compilation correctly.  It is obvious that the value of
675the global cost variable of the instrumented source code is always
676equal to the sum of the costs of the labels emitted by the
677corresponding labelled code. Moreover, because the compiler preserves
678all traces, the sum of the costs of the labels emitted in the source
and target labelled code is the same. Therefore, to satisfy the
680soundness requirement, we need to ensure that the time taken to
681execute the object code is equal to the sum of the costs of the labels
682emitted by the object code. We collect all the necessary conditions
683for this to happen in the definition of a \emph{sound} labelling: a)
684all loops must be broken by a cost emission statement; b) all program
685instructions must be in the scope of some cost emission statement.
686This ensures that every label's scope is a tree of instructions, with
687the cost being the most expensive path. To satisfy also the precision
688requirement, we must make the scopes flat sequences of instructions.
689We require a \emph{precise} labelling where every label is emitted at
690most once and both branches of each conditional jump start with a label
691emission statement.
692
693The correctness and precision of the labelling approach only rely on the
694correctness and precision of the object code labelling. The simplest
695way to achieve that is to impose correctness and precision
696requirements on the source code labelling produced at the start, and
to demand that the compiler preserve these
properties too. The latter requirement imposes serious limitations on the
compilation strategy and optimisations: the compiler may not duplicate any code
700that contains label emission statements, like loop bodies. Therefore several
701loop optimisations like peeling or unrolling are prevented. Moreover, precision
702of the object code labelling is not sufficient \emph{per se} to obtain global
703precision: we implicitly assumed that a precise constant cost can be assigned
704to every instruction. This is not possible in
705the presence of stateful hardware whose state influences the cost of operations,
706like pipelines and caches. In Section~\ref{lab:deplabel} we will see an extension of the
707basic labelling approach which tackles these problems.
708
In CerCo we have developed several cost-preserving compilers based on
710the labelling approach. Excluding an initial certified compiler for a
711`while' language, all remaining compilers target realistic source
712languages---a pure higher order functional language and a large subset
713of C with pointers, gotos and all data structures---and real world
714target processors---MIPS and the Intel 8051/8052 processor
715family. Moreover, they achieve a level of optimisation that ranges
716from moderate (comparable to GCC level 1) to intermediate (including
717loop peeling and unrolling, hoisting and late constant propagation).
718We describe the C compilers in detail in the following section.
719
720Two compilation chains were implemented for a purely functional
721higher-order language~\cite{labelling2}. The two main changes required
722to deal with functional languages are: 1) because global variables and
723updates are not available, the instrumentation phase produces monadic
724code to `update' the global costs; 2) the requirements for a sound and
725precise labelling of the source code must be changed when the
726compilation is based on CPS translations.  In particular, we need to
727introduce labels emitted before a statement is executed and also
728labels emitted after a statement is executed. The latter capture code
729that is inserted by the CPS translation and that would escape all
730label scopes.
731
732% Brian: one of the reviewers pointed out that standard Prolog implementations
733% do have some global state that apparently survives backtracking and might be
734% used.  As we haven't experimented with this, I think it's best to elide it
735% entirely.
736
737% Phases 1, 2 and 3 can be applied as well to logic languages (e.g. Prolog).
738% However, the instrumentation phase cannot: in standard Prolog there is no notion
739% of (global) variable whose state is not retracted during backtracking.
740% Therefore, the cost of executing computations that are later backtracked would
741% not be correctly counted in. Any extension of logic languages with
742% non-backtrackable state could support our labelling approach.
743
744\subsection{The CerCo C compilers}
We implemented two C compilers, one directly in OCaml and the other in
the interactive theorem prover Matita~\cite{matita}.  The first acted as
a prototype for the second, but also targeted MIPS and served as a
testbed for more advanced features such as the dependent labelling
approach of Section~\ref{lab:deplabel}.
751
752The second C compiler is the \emph{Trusted CerCo Compiler}, whose cost
753predictions are formally verified. The executable code
754is OCaml code extracted from the Matita implementation. The Trusted
755CerCo Compiler only targets
756the C language and the 8051/8052 family, and does not yet implement any
757advanced optimisations. Its user interface, however, is the same as the
758other version for interoperability. In
particular, the Frama-C CerCo plugin described in
760Section~\ref{sec:exploit} can work without recompilation
761with both of our C compilers.
762
763The 8051/8052 microprocessor is a very simple one, with constant-cost
764instructions. It was chosen to separate the issue of exact propagation of the
765cost model from the orthogonal problem of low-level timing analysis of object code
766that may require approximation or dependent costs.
767
768The (trusted) CerCo compiler implements the following optimisations: cast
769simplification, constant propagation in expressions, liveness analysis driven
770spilling of registers, dead code elimination, branch displacement, and tunneling.
The last two optimisations are performed by our optimising assembler
\cite{correctness}. The back-end of the compiler works on three-address
instructions, which were preferred to static single assignment code for the
simplicity of the formal certification.
775
776The CerCo compiler is loosely based on the CompCert compiler \cite{compcert}, a
777recently developed certified compiler from C to the PowerPC, ARM and x86
778microprocessors. In contrast to CompCert, both the CerCo code and its
779certification are fully open source. Some data structures and language definitions for
780the front-end are directly taken from CompCert, while the back-end is a redesign
781of a compiler from Pascal to MIPS used by Fran\c{c}ois Pottier for a course at the
\'Ecole Polytechnique.
783
784The main differences in the CerCo compiler are:
785\begin{enumerate}
786\item All the intermediate languages include label emitting instructions to
787implement the labelling approach, and the compiler preserves execution traces.
788\item Instead of targeting an assembly language with additional
789macro-instructions which are expanded before assembly, we directly
790produce object code in order to perform the timing analysis, using
791an integrated optimising assembler.
792\item In order to avoid the additional work of implementing a linker
793 and a loader, we do not support separate
794compilation and external calls. Adding them is orthogonal
795to the labelling approach and should not introduce extra problems.
796\item We target an 8-bit processor, in contrast to CompCert's 32-bit targets. This requires
797many changes and more compiler code, but it is not fundamentally more
798complex. The proof of correctness, however, becomes much harder.
799\item We target a microprocessor that has a non-uniform memory model, which is
800still often the case for microprocessors used in embedded systems and that is
becoming common again in multi-core processors. Therefore the compiler has to
keep track of where data is located and must move it between memory regions in
the proper way. Moreover, the size of pointers to different regions is not uniform. %An
804%additional difficulty was that the space available for the stack in internal
805%memory in the 8051 is tiny, allowing only a minor number of nested calls. To
806%support full recursion in order to test the CerCo tools also on recursive
807%programs, the compiler implements a stack in external memory.
808\end{enumerate}
809
810\subsection{Formal certification of the CerCo compiler}
811We have formally
812certified in the Matita interactive proof assistant that the cost models induced on the source code by the
813Trusted CerCo Compiler correctly and precisely
814predict the object code behaviour. There are two cost models, one for execution
815time and one for stack space consumption. We show the correctness of the prediction
816only for those programs that do not exhaust the available stack space, a
817property that---thanks to the stack cost model---we can statically analyse on the
source code, in sharp contrast to other certified compilers.  Other projects have
819already certified the preservation of functional semantics in similar compilers,
820so we have not attempted to directly repeat that work and assume
functional correctness for most passes. In order to complete the
proof for non-functional properties, we have introduced a new,
structured form of execution traces, together with the related notion
of forward similarity and its intensional consequences. We have also
introduced a unified representation for back-end intermediate languages
that was exploited to provide a uniform proof of forward similarity.
828
829The details on the proof techniques employed
830and
831the proof sketch can be found in the CerCo deliverables and papers.
832In this section we will only hint at the correctness statement, which turned
833out to be more complex than expected.
834
835\paragraph{The correctness statement.}
836Real time programs are often reactive programs that loop forever responding to
837events (inputs) by performing some computation followed by some action (output)
838and continuing as before. For these programs the overall execution
839time does not make sense. The same is true for reactive programs that spend an
unpredictable amount of time in I/O. Instead, what is interesting is the
reaction time---the time spent between I/O events. Moreover, we are interested in
842predicting and ruling out crashes due to running out of space on certain
843inputs.
844Therefore we need a statement that talks about sub-runs of a
845program. A natural candidate is that the time predicted on the source
846code and spent on the object code by two corresponding sub-runs are the same.
847To make this statement formal we must identify the
848corresponding sub-runs and how to single out those that are meaningful.
We introduce the notion of a \emph{measurable} sub-run of a run: the
available stack is not exhausted before or during the sub-run, the number
of function calls and returns in the sub-run is the same, the sub-run
performs no I/O, and it starts with a label emission statement and ends
with a return or another label emission statement. The stack usage is
bounded using the stack usage model
855that is computed by the compiler.
856
857The statement that we formally proved is: for each C run with a measurable
858sub-run, there exists an object code run with a sub-run, with the same
859execution trace for both the prefix of the run and the sub-run itself,
860 and where the time spent by the
861object code in the sub-run is the same as the time predicted on the source code
862using the time cost model generated by the compiler.
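Schematically, and only as a paraphrase in our own notation (the machine-checked
statement also fixes the trace of the prefix and the precise definitions of
traces and measurability):
\[
\forall r_C.\ \mathit{measurable}(r_C) \implies
\exists r_O.\ \mathit{trace}(r_O) = \mathit{trace}(r_C)
\;\wedge\;
\mathit{clock}(r_O) = \mathit{cost}(r_C)
\]
where $r_C$ ranges over measurable C sub-runs, $r_O$ over the corresponding
object code sub-runs, $\mathit{clock}$ counts the cycles spent by the hardware
and $\mathit{cost}$ is the prediction given by the source-level cost model.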
863
864
865We briefly discuss the constraints for measurability. Not exhausting the stack
866space is necessary for a run to be meaningful, because the source semantics
867has no notion of running out of memory. Balancing
868function calls and returns is a requirement for precision: the labelling approach
869allows the scope of a label to extend after function calls to
870minimize the number of labels. (The scope excludes the called
871function's execution.) If the number of
872calls/returns is unbalanced, it means that there is a call we have not returned
873to that could be followed by additional instructions whose cost has already been
taken into account.  The last condition on the start and end points of
875a run is also required to make the bound precise.  With these
876restrictions and the 8051's simple timing model we obtain \emph{exact}
877predictions.  If we relax these conditions then we obtain a corollary
878with an upper bound on the cost.
879Finally, I/O operations can be performed in the prefix of the run, but not in the
880measurable sub-run. Therefore we prove that we can predict reaction times, but
881not I/O times, as desired.
882
883\subsection{Dependent labelling}
884\label{lab:deplabel}
885The core idea of the basic labelling approach is to establish a tight connection
886between basic blocks executed in the source and target languages. Once the
887connection is established, any cost model computed on the object code can be
888transferred to the source code, without affecting the code of the compiler or
889its proof. In particular, we can also transport cost models
890that associate to each label a \emph{function} from the hardware state
891to a natural number.
892However, a problem arises during the instrumentation phase that replaces label
893emission statements with increments of global cost variables. They are
894incremented by the result of applying the label's cost function
895to the hardware state at the time of execution of the block. However,
896the hardware state comprises both the functional state that affects the
897computation (the value of the registers and memory) and the non-functional
898state that does not (the pipeline and cache contents, for example).
899We can find corresponding information for the former in the source code state, but constructing the
900correspondence may be hard and lifting the cost model to work on the source code
901state is likely to produce cost expressions that are too complex to understand and reason about.
902Fortunately, in modern architectures the cost of executing single
903instructions is either independent of the functional state or the jitter---the
904difference between the worst and best case execution times---is small enough
905to be bounded without losing too much precision. Therefore we only consider
906dependencies on the `non-functional' parts of the state.
907
908The non-functional state is not directly related to the high level
909state and does not influence the functional properties. What can be
910done is to expose key aspects of the non-functional state in the
911source code. We present here the basic intuition in a simplified form:
912the technical details that allow us to handle the general case are
913more complex and can be found in~\cite{paolo}. We add to the source
914code an additional global variable that represents the non-functional
915state and another one that remembers the last few labels emitted. The
916state variable must be updated at every label emission statement,
917using an update function which is computed during the processor timing
918analysis.
919This update function assigns to each label a function from the
920recently emitted labels and old state to the new state. It is computed
921by composing the semantics of every instruction in a basic block
922restricted to the non-functional part of the state.
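The following sketch (our own illustration, with hypothetical names and costs,
not actual compiler output) shows the shape of the resulting instrumentation
for a single label when the exposed state is a pipeline summary:
\begin{lstlisting}
unsigned __cost = 0;
unsigned __pipe = 0;   /* abstract non-functional state, e.g. an encoding
                          of the bubbles currently in the pipeline        */

/* produced by the processor timing analysis for label L_body */
unsigned __cost_L_body(unsigned s)  { return s == 0 ? 78 : 80; }
unsigned __state_L_body(unsigned s) { return 0; /* pipeline drained */ }

/* what "EMIT L_body" is replaced by in the instrumented source */
void __emit_L_body(void) {
  __cost += __cost_L_body(__pipe);
  __pipe  = __state_L_body(__pipe);
}
\end{lstlisting}
The variable recording the last few labels emitted, mentioned above, is omitted
here for brevity.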
923
Not all the details of the non-functional state need to be exposed, and the
925technique works better when the part of state that is required can be summarized
926in a simple data structure. For example, to handle simple but realistic
927pipelines it is sufficient to remember a short integer that encodes the position
928of bubbles (stuck instructions) in the pipeline. In any case, it is not necessary
for the user to understand the meaning of the state to reason about the properties
930of the
931program. Moreover, the user, or the invariant generator tools that
932analyse the instrumented source code produced by the compiler, can decide to
933trade precision of the analysis for simplicity by approximating the
934cost by safe bounds that do not depend on the processor state. Interestingly, the functional analysis of
935the code could determine which blocks are executed more frequently in order to
936use more aggressive approximations for those that are executed least.
937
938Dependent labelling can also be applied to allow the compiler to duplicate
939blocks that contain labels (e.g. in loop optimisations)~\cite{paolo}. The effect is to assign
940a different cost to the different occurrences of a duplicated label. For
941example, loop peeling turns a loop into the concatenation of a copy of the loop
942body for the first iteration and the conditional execution of the
943loop for successive iterations. Further optimisations will compile the two
944copies of the loop differently, with the first body usually
945taking more time.
946
947By introducing a variable that keeps track of the iteration number, we can
948associate to the label a cost that is a function of the iteration number. The
949same technique works for loop unrolling without modification: the function will
950assign one cost to the even iterations and another cost to the odd
951ones.  The loop optimisation code
that duplicates the loop bodies must also modify the code to correctly propagate
953the update of the iteration numbers. The technical details are more complicated and
954can be found in the CerCo reports and publications. The implementation, however,
955is quite simple (and forms part of our OCaml version of the compiler)
956and the changes to a loop optimising compiler are minimal.
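As a small illustration (ours; the names and cycle counts are made up), a
source loop whose body will be peeled can be instrumented with a cost that
depends on a fresh iteration counter:
\begin{lstlisting}
unsigned __cost = 0;

/* cost of the loop-body label as a function of the iteration number:
   the peeled first copy is compiled differently and costs more        */
unsigned __cost_body(unsigned it) { return it == 0 ? 130 : 95; }

int sum(int *a, int n) {
  int s = 0;
  unsigned __it = 0;
  for (int i = 0; i < n; i++, __it++) {
    __cost += __cost_body(__it);
    s += a[i];
  }
  return s;
}
\end{lstlisting}
For loop unrolling the cost function would instead distinguish even and odd
values of \lstinline'__it'.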
957
958\subsection{Techniques to exploit the induced cost model}
959\label{sec:exploit}
960We now turn our attention to synthesising high-level costs, such as
961the execution time of a whole program.  We consider as our starting point source level costs
962provided by basic labelling, in other words annotations
963on the source code which are constants that provide a sound and sufficiently
964precise upper bound on the cost of executing the blocks after compilation to
965object code.
966
967The principle that we have followed in designing the cost synthesis tools is
968that the synthetic bounds should be expressed and proved within a general
969purpose tool built to reason on the source code. In particular, we rely on the
970Frama-C tool to reason on C code and on the Coq proof-assistant to reason on
971higher-order functional programs.
972
973This principle entails that: 1)
974The inferred synthetic bounds are indeed correct as long as the general purpose
975tool is; 2) there is no limitation on the class of programs that can be handled
as long as the user is willing to carry out an interactive proof.
977
978Of course, automation is desirable whenever possible. Within this framework,
979automation means writing programs that give hints to the general purpose tool.
980These hints may take the form, say, of loop invariants/variants, of predicates
981describing the structure of the heap, or of types in a light logic. If these
hints are correct and sufficiently precise, the general purpose tool will produce
a proof automatically; otherwise, user interaction is required.

\paragraph{The Cost plugin and its application to the Lustre compiler.}
Frama-C \cite{framac} is a set of analysers for C programs with a
specification language called ACSL. New analyses can be dynamically added
via a plugin system. For instance, the Jessie plugin allows deductive
verification of C programs with respect to their specification in ACSL, with
various provers as back-end tools.
We developed the CerCo Cost plugin for the Frama-C platform as a proof of
concept of an automatic environment exploiting the cost annotations produced by
the CerCo compiler. It consists of an OCaml program which essentially
takes the following actions: 1) it receives as input a C program; 2) it
applies the CerCo compiler to produce a related C program with cost annotations;
3) it applies some heuristics to produce a tentative bound on the cost of
executing the C functions of the program as a function of the value of their
parameters; 4) the user can then call the Jessie plugin to discharge the
related proof obligations.
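
As an example of the output of step 3), the tentative bound for a function with
an integer parameter \texttt{n} could take the shape of an ACSL post-condition
on the cost variable; the names and constants below are invented, and the
concrete syntax produced by the plugin may differ in detail:
\begin{lstlisting}
extern unsigned long __cost;   /* the cost variable */

/*@ requires n >= 0;
    ensures  __cost <= \old(__cost) + 35 + 23 * n;
*/
void step(int n);
\end{lstlisting}
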
In the following we elaborate on the soundness of the framework and the
experiments we performed with the Cost tool on the C programs produced by a
Lustre compiler.

\paragraph{Soundness.} The soundness of the whole framework depends on the cost
annotations added by the CerCo compiler, the synthetic costs produced by the
cost plugin, the verification conditions (VCs) generated by Jessie, and the
external provers discharging the VCs. Since the synthetic costs are expressed in
ACSL, Jessie can be used to verify them. Thus, even if the added synthetic costs
are incorrect (relative to the cost annotations), the process as a whole is still
correct: indeed, Jessie will not validate incorrect costs and no conclusion can
be drawn about the WCET of the program in this case. In other words, the
soundness does not really depend on the action of the cost plugin, which can in
principle produce any synthetic cost. However, in order to actually
prove a WCET of a C function, we need to add correct annotations in such a way
that Jessie and subsequent automatic provers have enough information to deduce
their validity. In practice this is not straightforward even for very simple
programs composed of branching and assignments (no loops and no recursion),
because a fine analysis of the VCs associated with branching may lead to a
complexity blow-up.
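
As an illustration of the issue (the code and the constants are ours), consider
a function made only of conditionals and constant cost increments:
\begin{lstlisting}
static unsigned long __cost = 0;

void branches(int a, int b, int c) {
  if (a) { __cost += 5;  } else { __cost += 7;  }
  if (b) { __cost += 3;  } else { __cost += 11; }
  if (c) { __cost += 13; } else { __cost += 2;  }
}
\end{lstlisting}
A sound bound is obtained by summing the maximum of each pair of branches, but
a precise analysis of the VCs has to distinguish the $2^3$ execution paths, and
this case analysis grows exponentially with the number of conditionals.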
\paragraph{Experience with Lustre.} Lustre is a data-flow language for programming
synchronous systems, with a compiler which targets C. We designed a
wrapper to support Lustre files.
The C function produced by the compiler implements the step function of the
synchronous system, and computing the WCET of the function amounts to obtaining a
bound on the reaction time of the system. We tested the Cost plugin and the
Lustre wrapper on the C programs generated by the Lustre compiler. For programs
consisting of a few hundred lines of code, the cost plugin computes a
WCET and Alt-Ergo is able to discharge all VCs automatically.

\paragraph{Handling C programs with simple loops.}
The cost annotations added by the CerCo compiler take the form of C instructions
that update a fresh global variable called the cost variable by a constant.
Synthesising a WCET of a C function thus consists of statically determining an
upper bound on the difference between the value of the cost variable before and
after the execution of the function, i.e.\ finding the instructions
that update the cost variable and establishing the number of times they are
passed through during the flow of execution. To perform the analysis the plugin
makes the following assumptions about the programs:
1) there are no recursive functions;
2) every loop must be annotated with a variant. The variants of `for' loops are
automatically inferred. A small example in this fragment is sketched below.
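
In the sketch (the cost variable \texttt{\_\_cost}, the constants and the
contract are invented and are not literal plugin output), the WCET of
\texttt{f} is a bound on the increase of the cost variable, the \texttt{while}
loop carries an explicit variant, and the loop invariant gives the provers
enough information to discharge the post-condition:
\begin{lstlisting}
static unsigned long __cost = 0;   /* the cost variable */

/*@ requires n >= 0;
    ensures  __cost <= \old(__cost) + 12 + 9 * n;
*/
void f(int n) {
  __cost += 12;                    /* entry block                    */
  int i = 0;
  /*@ loop invariant 0 <= i <= n;
      loop invariant __cost <= \at(__cost, Pre) + 12 + 9 * i;
      loop variant n - i;
  */
  while (i < n) {
    __cost += 9;                   /* constant cost of the loop body */
    i++;
  }
}
\end{lstlisting}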

The plugin proceeds as follows.
First the call graph of the program is computed.
Then the cost of each function is computed by traversing its
control flow graph. If the function $f$ calls the function $g$
then the function $g$
is processed before the function $f$. The cost at a node is the maximum of the
costs of the successors.
In the case of a loop with a body that has a constant cost for every step of the
loop, the cost is the product of the cost of the body and of the variant taken
at the start of the loop.
In the case of a loop with a body whose cost depends on the values of some free
variables, a fresh logic function $f$ is introduced to represent the cost of the
loop in the logic assertions. This logic function takes the variant as its first
parameter. The other parameters of $f$ are the free variables of the body of the
loop. An axiom is added to account for the fact that the cost is accumulated at
each step of the loop.
The cost of the function is directly added as a post-condition of the function.
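
A simplified sketch of the accumulation axiom for the logic function $f$,
writing $v$ for the variant, $\vec{x}$ for the free variables of the body and
$\vec{x}\,'$ for their values after one iteration (the axioms actually
generated are expressed in ACSL and differ in detail), is:
\[
f(v,\vec{x}) \;=\;
\begin{cases}
0 & \text{if } v \le 0,\\
\mathit{cost}_{\mathit{body}}(\vec{x}) + f(v-1,\vec{x}\,') & \text{if } v > 0,
\end{cases}
\]
so that the cost of the whole loop is $f(v_0,\vec{x}_0)$, where $v_0$ and
$\vec{x}_0$ are the values of the variant and of the free variables when the
loop is first entered.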

The user can influence the annotation by two different means:
1) by using more precise variants;
2) by annotating functions with cost specifications. The plugin will then use
this cost for the function instead of computing it.
\paragraph{C programs with pointers.}
When it comes to verifying programs involving pointer-based data structures,
such as linked lists, trees, or graphs, the use of traditional first-order logic
to specify, and of SMT solvers to verify, shows some limitations. Separation
logic is an elegant alternative. It is a program logic
with a new notion of conjunction to express spatial heap separation. Bobot has
recently introduced automatically generated separation
predicates to simulate separation logic reasoning in the Jessie plugin, where the
specification language, the verification condition
generator, and the theorem provers were not designed with separation logic in
mind~\cite{bobot}. CerCo's plugin can exploit the separation predicates to
automatically reason about the cost of execution of simple heap-manipulating
programs, such as an in-place list reversal.
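
As an illustration of the kind of program involved, here is an in-place list
reversal with the cost instrumentation made explicit (the structure, the
variable \texttt{\_\_cost} and the constant are invented for the sketch). The
intended specification bounds the increase of the cost variable linearly in the
length of the input list; reasoning about such a bound is where the
automatically generated separation predicates come into play.
\begin{lstlisting}
#include <stddef.h>

struct node { int hd; struct node *tl; };

static unsigned long __cost = 0;   /* the cost variable */

struct node *reverse(struct node *l) {
  struct node *prev = NULL;
  while (l != NULL) {
    __cost += 42;                  /* constant cost of one iteration */
    struct node *next = l->tl;
    l->tl = prev;
    prev  = l;
    l     = next;
  }
  return prev;
}
\end{lstlisting}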

\section{Conclusions and future work}

All the CerCo software and deliverables can be found on the CerCo homepage at~\url{http://cerco.cs.unibo.it}.

The results obtained so far are encouraging and provide evidence that
it is possible to perform static time and space analysis at the source level
without losing accuracy, while reducing the trusted code base and reconciling the
study of functional and non-functional properties of programs. The
techniques introduced seem to be scalable, cover both imperative and
functional languages, and are compatible with every compiler optimisation
we have considered so far.

To prove that compilers can keep track of optimisations
and induce a precise cost model on the source code, we targeted a simple
architecture that admits a cost model independent of the execution history.
The most important future work is dealing with hardware architectures
characterised by history-dependent stateful components, such as caches and
pipelines. The main issue is to assign a parametric, dependent cost
to basic blocks that can later be transferred by the labelling approach to
the source code and represented in a meaningful way to the user. The dependent
labelling approach that we have studied seems a promising tool to achieve
this goal, but more work is required to provide good source-level
approximations of the relevant processor state.

Other examples of future work are to improve the cost invariant
generator algorithms and the coverage of compiler optimisations, to combine
the labelling approach with the type and effect discipline of~\cite{typeffects}
to handle languages with implicit memory management, and to experiment with
our tools in the early phases of development. Larger case studies are also
necessary to evaluate the CerCo prototype on realistic, industrial-scale
programs.

\bibliography{fopara13}

\end{document}