\begin{filecontents*}{example.eps}
%!PS-Adobe-3.0 EPSF-3.0
%%BoundingBox: 19 19 221 221
%%CreationDate: Mon Sep 29 1997
%%Creator: programmed by hand (JK)
%%EndComments
gsave
newpath
  20 20 moveto
  20 220 lineto
  220 220 lineto
  220 20 lineto
closepath
2 setlinewidth
gsave
  .4 setgray fill
grestore
stroke
grestore
\end{filecontents*}

\RequirePackage{fix-cm}

\documentclass[smallextended]{svjour3}

\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage[british]{babel}
\usepackage{color}
\usepackage{fancybox}
\usepackage{fancyvrb}
\usepackage{graphicx}
\usepackage[colorlinks,
            bookmarks,bookmarksopen,bookmarksdepth=2]{hyperref}
\usepackage{hyphenat}
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage{mdwlist}
\usepackage{microtype}
\usepackage{stmaryrd}
\usepackage{url}

\usepackage{tikz}
\usetikzlibrary{positioning,calc,patterns,chains,shapes.geometric,scopes}

\lstset{language=C,basicstyle=\tt,basewidth=.5em,lineskip=-1.5pt}

\newlength{\mylength}
\newenvironment{frametxt}%
        {\setlength{\fboxsep}{5pt}
                \setlength{\mylength}{\linewidth}%
                \addtolength{\mylength}{-2\fboxsep}%
                \addtolength{\mylength}{-2\fboxrule}%
                \Sbox
                \minipage{\mylength}%
                        \setlength{\abovedisplayskip}{0pt}%
                        \setlength{\belowdisplayskip}{0pt}%
                }%
                {\endminipage\endSbox
                        \[\fbox{\TheSbox}\]}

\smartqed

% PROPOSED PAPER STRUCTURE

% Introduction
%   Problem being solved, background, etc.
%   Current state of the art (and problem with it)
%   The `CerCo approach' (tm)
%   Brief Matita overview
%   Map of paper

% CerCo compiler architecture
%   Description of languages
%   Target hardware description

% Compiler proof
%   Structure of proof, and high-level discussion
%   Technical devices: structured traces, labelling, etc.
%   Assembler proof
%   Technical issues in front end (Brian?)
%   Main theorem statement

% FramaC plugin
%   Purpose
%   Description of architecture
%   Case study/worked example (crypto example?)

% Formal development
%   Source code repo link
%   Statistics (number of lines, etc.)
%   Description of remaining axioms --- try and explain them away/make them sound reasonable

% Conclusions
%   Summary
%   Related work
%   Future work

% Bibliography

\title{CerCo: Certified Complexity\thanks{The project CerCo acknowledges the
financial support of the Future and Emerging Technologies (FET) programme within
the Seventh Framework Programme for Research of the European Commission, under
FET-Open grant number: 243881}}
\subtitle{Verified lifting of concrete complexity annotations through a realistic C compiler}
\journalname{Journal of Automated Reasoning}
\titlerunning{Certified Complexity}
\date{Received: date / Accepted: date}
\author{Jaap Boender \and Brian Campbell \and Dominic P. Mulligan \and Claudio Sacerdoti~Coen} % who else?
\authorrunning{Boender, Campbell, Mulligan, and Sacerdoti~Coen}
\institute{Jaap Boender \at
              Faculty of Science and Technology,\\
              Middlesex University London,\\
              United Kingdom.\\
              \email{J.Boender@mdx.ac.uk}
           \and
           Brian Campbell \at
              Department of Informatics,\\
              University of Edinburgh,\\
              United Kingdom.\\
              \email{Brian.Campbell@ed.ac.uk}
           \and
           Dominic P. Mulligan \at
             Computer Laboratory,\\
             University of Cambridge,\\
             United Kingdom.\\
             \email{Dominic.Mulligan@cl.cam.ac.uk}
           \and
           Claudio Sacerdoti~Coen \at
              Dipartimento di Informatica---Scienza e Ingegneria (DISI),\\
              University of Bologna,\\
              Italy.\\
              \email{Claudio.SacerdotiCoen@unibo.it}}

\begin{document}

\maketitle

\begin{abstract}
We provide an overview of the FET-Open Project CerCo (`Certified Complexity').
Our main achievement is the development of a technique for analysing non-functional properties of programs (time, space) at the source level with little or no loss of accuracy and a small trusted code base.
The core component is a C compiler, verified in the Matita theorem prover, that produces an instrumented copy of the source code in addition to generating object code.
This instrumentation exposes, and tracks precisely, the actual (non-asymptotic) computational cost of the input program at the source level.
Untrusted invariant generators and trusted theorem provers may then be used to compute and certify the parametric execution time of the code.
\keywords{Verified compilation \and Complexity analysis \and CerCo (`Certified Complexity')}
\end{abstract}

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %

\section{Introduction}
\label{sect.introduction}

%\paragraph{Problem statement.}
Programs can be specified using both
functional constraints (what the program must do) and non-functional constraints (what time, space or other resources the program can use).  At the current
state of the art, functional properties are verified
by combining user annotations---preconditions, invariants, and so on---with a
multitude of automated analyses---invariant generators, type systems, abstract
interpretation, theorem proving, and so on---on the program's high-level source code.
By contrast, many non-functional properties
are verified by analysing the low-level object code.
%\footnote{A notable
%  exception is the explicit allocation of heap space in languages like C, which
%  can be handled at the source level.}

Analysis on low-level object code has several problems, however:
\begin{itemize*}
\item
It can be hard to deduce the high-level structure of the program after compiler optimisations.
The object code produced by an optimising compiler may have radically different control flow from the original source code program.
\item
Techniques that operate on object code are not useful early in the development process of a program, yet problems with a program's design or implementation are cheaper to resolve earlier in the process rather than later.
\item
Parametric cost analysis is very hard: how can we reflect a cost that depends on the execution state, for example the
value of a register or a carry bit, as a cost that the user can understand
by looking at the source code?
\item
Performing functional analysis on object code makes it hard for the programmer to provide information about the program and its expected execution, leading to a loss of precision in the resulting analysis.
\end{itemize*}

\paragraph{Vision and approach.}
We want to reconcile functional and
non-functional analysis: to share information between them and perform both at
the same time on high-level source code.
%
What has previously prevented this approach from being applied is the lack of a
uniform and precise cost model for high-level code, since each statement
occurrence is compiled
differently, optimisations may change control flow, and the cost of an object
code instruction may depend on the runtime state of hardware components like
pipelines and caches, none of which are visible in the source code.

We envision a new generation of compilers that track program structure through compilation and optimisation and exploit this
information to define a precise, non-uniform cost model for source code that accounts for runtime state. With such a cost model we can
reduce non-functional verification to the functional case and exploit the state
of the art in automated high-level verification~\cite{survey}. The techniques
currently used by the Worst Case Execution Time (WCET) community, which perform
analysis on object code, are still available but can be coupled with additional source-level analysis. In cases where our approach produces overly complex cost
models, safe approximations can be used to trade complexity for precision.
Finally, source code analysis can be used early in the development process,
when components have been specified but not implemented, as modularity means
that it is enough to specify the non-functional behaviour of missing
components.

\paragraph{Contributions.}
We have developed \emph{the labelling approach}~\cite{labelling}, a
technique to implement compilers that induce cost models on source programs by
very lightweight tracking of code changes through compilation. We have studied
how to formally prove the correctness of compilers implementing this technique,
and have implemented such a compiler from C to object binaries for the 8051
microcontroller, verifying it in an interactive theorem prover. The compiler
predicts both execution time and stack space usage. As we are targeting
an embedded microcontroller we do not consider dynamic memory allocation.
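In outline, the idea is the following: the compiler marks the start of each basic block of the source program with a unique cost label, carries those labels through every compilation pass, measures on the labelled object code the cost of the instructions attributed to each label, and finally replaces each label in the source with an increment of a global cost counter. The fragment below is only a schematic sketch of this transformation; the cycle counts are invented for exposition and \lstinline'__cost_incr' is the instrumentation function shown later in \autoref{itest1}, where the actual output of our compiler on a similar program is given.
\begin{lstlisting}
/* Source program, conceptually marked with cost labels L0 and L1:
   each label denotes a basic block whose cost is measured on the
   labelled object code emitted by the compiler. */
int sum(int *p, int len) {
  int j, s = 0;              /* L0: function entry block */
  for (j = 0; j < len; j++)
    s = s + p[j];            /* L1: loop body block      */
  return s;                  /* L2: exit block           */
}

/* Instrumented source returned by a labelling compiler: each label is
   replaced by an increment of the global cost counter with the cost
   measured for its block (the constants here are invented). */
int sum_instr(int *p, int len) {
  int j, s = 0;  __cost_incr(90);      /* cost measured for L0 */
  for (j = 0; j < len; j++) {
    __cost_incr(150);  s = s + p[j];   /* cost measured for L1 */
  }
  __cost_incr(40);                     /* cost measured for L2 */
  return s;
}
\end{lstlisting}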

To demonstrate source-level verification of costs we have implemented
a Frama-C plugin~\cite{framac} that invokes the compiler on a source
program and uses it to generate invariants on the high-level source
that correctly model low-level costs. The plugin certifies that the
program respects these costs by calling automated theorem provers, a
new and innovative technique in the field of cost analysis. Finally,
we have conducted several case studies, including showing that the
plugin can automatically compute and certify the exact reaction time
of Lustre~\cite{lustre} data flow programs compiled into C.

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %

\section{Project context and approach}
\label{sect.project.context.and.approach}

Formal methods for verifying functional properties of programs have
now reached such a level of maturity and automation that their adoption is slowly
increasing in production environments. For safety critical code, it is
becoming commonplace to combine rigorous software engineering methodologies and testing
with static analyses, taking the strengths of each and mitigating
their weaknesses. Of particular interest are open frameworks
for the combination of different formal methods, where the programs can be
progressively specified and enriched with new safety
guarantees: every method contributes knowledge (e.g. new invariants) that
becomes an assumption for later analysis.

The outlook for verifying non-functional properties of programs (time spent,
memory used, energy consumed) is bleaker.
% and the future seems to be getting even worse.
Most industries verify that real time systems meet their deadlines
by simply performing many runs of the system and timing their execution, computing the
maximum time and adding an empirical safety margin, claiming the result to be a
bound for the WCET of the program. Formal methods and software to statically
analyse the WCET of programs exist, but they often produce bounds that are too
pessimistic to be useful. Recent advances in hardware architecture have
focused on improving average case performance, not the
predictability of the worst case. Execution time is becoming increasingly
dependent on execution history and the internal state of
hardware components like pipelines and caches. Multi-core processors and non-uniform
memory models are drastically reducing the possibility of performing
static analysis in isolation, because programs are less and less time
composable. Clock-precise hardware models are necessary for static analysis, and
obtaining them is becoming harder due to the increased sophistication
of hardware design.

Despite these problems, the need for reliable real time
systems and programs is increasing, and there is pressure
from the research community for the introduction of
hardware with more predictable behaviour, which would be more suitable
for static analysis.  One example, being investigated by the Proartis
project~\cite{proartis}, is to decouple execution time from execution
history by introducing randomisation.

In CerCo~\cite{cerco} we do not address this problem, optimistically
assuming that improvements in low-level timing analysis or architecture will make
verification feasible in the longer term. Instead, the main objective of our work is
to bring together the static analysis of functional and non-functional
properties, which in the current state of the art are
independent activities with limited exchange of information: while the
functional properties are verified on the source code, the analysis of
non-functional properties is performed on object code to exploit
clock-precise hardware models.

\subsection{Current object-code methods}

Analysis currently takes place on object code for two main reasons.
First, there cannot be a uniform, precise cost model for source
code instructions (or even basic blocks). During compilation, high level
instructions are broken up and reassembled in context-specific ways, so that
identifying a fragment of object code with a single high level instruction is
infeasible. Even the control flow of the object and source code can be very
different as a result of optimisations; for example, aggressive loop
optimisations may completely transform source level loops. Despite the lack of a uniform, compilation- and
program-independent cost model on the source language, the literature on the
analysis of non-asymptotic execution time on high level languages assuming
such a model is growing and gaining momentum. However, unless we provide a
replacement for such cost models, this literature's future practical impact looks
to be minimal. Some hope has been provided by the EmBounded project \cite{embounded}, which
compositionally compiles high-level code to a byte code that is executed by an
interpreter with guarantees on the maximal execution time spent for each byte code
instruction. This provides a uniform model at the expense of the model's
precision (each cost is a pessimistic upper bound) and the performance of the
executed code (because the byte code is interpreted compositionally instead of
performing a fully non-compositional compilation).

The second reason to perform the analysis on the object code is that bounding
the worst case execution time of small code fragments in isolation (e.g. loop
bodies) and then adding up the bounds yields very poor estimates as no
knowledge of the hardware state prior to executing the fragment can be assumed. By
analysing longer runs the bound obtained becomes more precise because the lack
of information about the initial state has a relatively small impact.

To calculate the cost of an execution, value and control flow analyses
are required to bound the number of times each basic block is
executed.  Currently, state
of the art WCET analysis tools, such as AbsInt's aiT toolset~\cite{absint}, perform these analyses on
object code, where the logic of the program is harder to reconstruct
and most information available at the source code level has been lost;
see~\cite{stateart} for a survey. Imprecision in the analysis can lead to useless bounds. To
augment precision, the tools ask the user to provide constraints on
the object code control flow, usually in the form of bounds on the
number of iterations of loops or linear inequalities on them. This
requires the user to manually link the source and object code,
translating their assumptions on the source code (which may be wrong) into
object code constraints. The task is hard and error-prone, especially
in the presence of complex compiler optimisations.

Traditional techniques for WCET that work on object code are also affected by
another problem: they cannot be applied before the generation of the object
code. Functional properties can be analysed in early development stages, while
analysis of non-functional properties may come too late to avoid expensive
changes to the program architecture.

\subsection{CerCo's approach}

In CerCo we propose a radically new approach to the problem: we reject the idea
of a uniform cost model and we propose that the compiler, which knows how the
code is translated, must return the cost model for basic blocks of high level
instructions. It must do so by keeping track of control flow modifications,
so that they can be reversed, and by interfacing with processor timing analysis.

By embracing compilation, instead of avoiding it like EmBounded did, a CerCo
compiler can both produce efficient code and return costs that are
as precise as the processor timing analysis can be. Moreover, our costs can be
parametric: the cost of a block can depend on actual program data, on a
summary of the execution history, or on an approximated representation of the
hardware state. For example, loop optimisations may assign a cost to a loop body
that is a function of the number of iterations performed. As another example,
the cost of a block may be a function of the vector of stalled pipeline states,
which can be exposed in the source code and updated at each basic block exit. It
is parametricity that allows one to analyse small code fragments without losing
precision. In the analysis of a code fragment we do not have to ignore the
initial hardware state; rather, we may assume that we know exactly which
state (or mode, as the WCET literature calls it) we are in.
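As a purely hypothetical illustration of such a dependent cost (the variable \lstinline'__pipe', the function \lstinline'__block7_cost' and the constants below are invented for exposition and are not produced by our compiler; \lstinline'__cost_incr' is the instrumentation function of \autoref{itest1}), the instrumentation could expose a scalar summary of the pipeline state in the source and charge each basic block a cost that is a function of that summary, updating the summary at block exit:
\begin{lstlisting}
int __pipe = 0;                 /* abstract summary of stalled pipeline states */

int __block7_cost(int pipe) {   /* dependent cost of one basic block           */
  return pipe ? 17 : 12;        /* extra cycles when a stall is still pending  */
}

/* ... inside the instrumented source, around one basic block ... */
  __cost_incr(__block7_cost(__pipe));  /* charge a cost that depends on __pipe */
  /* body of the basic block */
  __pipe = 1;                          /* summary updated at basic block exit  */
\end{lstlisting}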

The CerCo approach has the potential to dramatically improve the state of the
art. By performing control and data flow analyses on the source code, the
error-prone translation of invariants is completely avoided. Instead, this
work is done at the source level using tools of the user's choice.
Any available technique for the verification of functional properties
can be immediately reused and multiple techniques can collaborate to
infer and certify cost invariants for the program.  There are no
limitations on the types of loops or data structures involved. Parametric cost analysis
becomes the default, with non-parametric bounds used as a last resort when the user
decides to trade the precision of the analysis for simplicity. \emph{A priori}, no
technique previously used in traditional WCET is lost: processor
timing analyses can be used by the compiler on the object code, and the rest can be applied
at the source code level.
Our approach can also work in the early
stages of development by axiomatically attaching costs to unimplemented components.

Software used to verify properties of programs must be as bug free as
possible. The trusted code base for verification consists of the code that needs
to be trusted to believe that the property holds. The trusted code base of
state-of-the-art WCET tools is very large: one needs to trust the control flow
analyser, the linear programming libraries used, and also the formal models
of the hardware under analysis, for example. In CerCo we are moving the control flow analysis to the source
code and we are introducing a non-standard compiler too. To reduce the trusted
code base, we implemented a prototype and a static analyser in an interactive
theorem prover, which was used to certify that the costs added to the source
code are indeed those incurred by the hardware. Formal models of the
hardware and of the high level source languages were also implemented in the
interactive theorem prover. Control flow analysis on the source code has been
obtained using invariant generators, tools to produce proof obligations from
generated invariants and automatic theorem provers to verify the obligations. If
these tools are able to generate proof traces that can be
independently checked, the only remaining component that enters the trusted code
base is an off-the-shelf invariant generator which, in turn, can be proved
correct using an interactive theorem prover. Therefore we achieve the double
objective of allowing the use of more off-the-shelf components (e.g. provers and
invariant generators) whilst simultaneously reducing the trusted code base.

%\paragraph{Summary of the CerCo objectives.} To summarize, the goal of CerCo is
% to reconcile functional with non-functional analysis by performing them together
% on the source code, sharing common knowledge about execution invariants. We want
% to achieve the goal implementing a new generation of compilers that induce a
% parametric, precise cost model for basic blocks on the source code. The compiler
% should be certified using an interactive theorem prover to minimize the trusted
% code base of the analysis. Once the cost model is induced, off-the-shelf tools
% and techniques can be combined together to infer and prove parametric cost
% bounds.
%The long term benefits of the CerCo vision are expected to be:
%1. the possibility to perform static analysis during early development stages
%2.  parametric bounds made easier
%3.  the application of off-the-shelf techniques currently unused for the
% analysis of non-functional properties, like automated proving and type systems
%4. simpler and safer interaction with the user, that is still asked for
% knowledge, but on the source code, with the additional possibility of actually
% verifying the provided knowledge
%5. a reduced trusted code base
%6. the increased accuracy of the bounds themselves.
%
%The long term success of the project is hindered by the increased complexity of
% the static prediction of the non-functional behaviour of modern hardware. In the
% time frame of the European contribution we have focused on the general
% methodology and on the difficulties related to the development and certification
% of a cost model inducing compiler.

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %

\section{The typical CerCo workflow}
\label{sec:workflow}

\begin{figure}[!t]
\begin{tabular}{l@{\hspace{0.2cm}}|@{\hspace{0.2cm}}l}
\begin{lstlisting}
char a[] = {3, 2, 7, 14};
char threshold = 4;

int count(char *p, int len) {
  char j;
  int found = 0;
  for (j = 0; j < len; j++) {
    if (*p <= threshold)
      found++;
    p++;
  }
  return found;
}

int main() {
  return count(a,4);
}
\end{lstlisting}
&
%  $\vcenter{\includegraphics[width=7.5cm]{interaction_diagram.pdf}}$
\begin{tikzpicture}[
    baseline={([yshift=-.5ex]current bounding box)},
    element/.style={draw, text width=1.6cm, on chain, text badly centered},
    >=stealth
    ]
{[start chain=going below, node distance=.5cm]
\node [element] (cerco) {CerCo\\compiler};
\node [element] (cost)  {CerCo\\cost plugin};
{[densely dashed]
\node [element] (ded)   {Deductive\\platform};
\node [element] (check) {Proof\\checker};
}
}
\coordinate [left=3.5cm of cerco] (left);
{[every node/.style={above, text width=3.5cm, text badly centered,
                     font=\scriptsize}]
\draw [<-] ([yshift=-1ex]cerco.north west) coordinate (t) --
    node {C source}
    (t-|left);
\draw [->] (cerco) -- (cost);
\draw [->] ([yshift=1ex]cerco.south west) coordinate (t) --
    node {C source+\color{red}{cost annotations}}
    (t-|left) coordinate (cerco out);
\draw [->] ([yshift=1ex]cost.south west) coordinate (t) --
    node {C source+\color{red}{cost annotations}\\+\color{blue}{synthesized assertions}}
    (t-|left) coordinate (out);
{[densely dashed]
\draw [<-] ([yshift=-1ex]ded.north west) coordinate (t) --
    node {C source+\color{red}{cost annotations}\\+\color{blue}{complexity assertions}}
    (t-|left) coordinate (ded in) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
\draw [->] ([yshift=1ex]ded.south west) coordinate (t) --
    node {complexity obligations}
    (t-|left) coordinate (out);
\draw [<-] ([yshift=-1ex]check.north west) coordinate (t) --
    node {complexity proof}
    (t-|left) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
\draw [dash phase=2.5pt] (cerco out) .. controls +(-1, 0) and +(-1, 0) ..
    (ded in);
}}
%% user:
% head
\draw (current bounding box.west) ++(-.2,.5)
    circle (.2) ++(0,-.2) -- ++(0,-.1) coordinate (t);
% arms
\draw (t) -- +(-.3,-.2);
\draw (t) -- +(.3,-.2);
% body
\draw (t) -- ++(0,-.4) coordinate (t);
% legs
\draw (t) -- +(-.2,-.6);
\draw (t) -- +(.2,-.6);
\end{tikzpicture}
\end{tabular}
\caption{On the left: C code to count the number of elements in an array
 that are less than or equal to a given threshold. On the right: CerCo's interaction
 diagram. Components provided by CerCo are drawn with a solid border.}
\label{test1}
\end{figure}
We illustrate the workflow we envisage (on the right
of~\autoref{test1}) on an example program (on the left
of~\autoref{test1}).  The user writes the program and feeds it to the
CerCo compiler, which outputs an instrumented version of the same
program that updates global variables that record the elapsed
execution time and the stack space usage.  The red lines in
\autoref{itest1} introducing variables, functions and function calls
starting with \lstinline'__cost' and \lstinline'__stack' are the instrumentation introduced by the
compiler.  For example, the two calls at the start of
\lstinline'count' say that 4 bytes of stack are required, and that it
takes 111 cycles to reach the next cost annotation (in the loop body).
The compiler measures these on the labelled object code that it generates.

The annotated program can then be enriched with complexity
assertions in the style of Hoare logic, which are then passed to a deductive
platform (in our case Frama-C). As a Frama-C cost plugin, we provide a
simple automatic synthesiser for complexity assertions, which can
be overridden by the user to increase or decrease accuracy.  These are the blue
comments starting with \lstinline'/*@' in \autoref{itest1}, written in
Frama-C's specification language, ACSL.  From the
assertions, a general purpose deductive platform produces proof
obligations which in turn can be closed by automatic or interactive
provers, ending in a proof certificate.

% NB: if you try this example with the live CD you should increase the timeout

Twelve proof obligations are generated from~\autoref{itest1} (to prove
that the loop invariant holds after one execution if it holds before,
to prove that the whole program execution takes at most 1358 cycles, and so on).  Note that the synthesised time bound for \lstinline'count',
$178+214*(1+\text{\lstinline'len'})$ cycles, is parametric in the length of
the array. The CVC3 prover
closes all obligations within half a minute on routine commodity
hardware.  A simpler non-parametric version can be solved in a few
seconds.
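The constants in this bound can be read off the instrumentation in \autoref{itest1}: $178 = 111 + 67$ cycles for the \lstinline'__cost_incr' calls before and after the loop of \lstinline'count', and $214 = 78 + 136$ cycles for the per-iteration call plus the worst-case branch. As a sanity check, instantiating the bound at the call site in \lstinline'main', where \lstinline'len' is $4$, and adding the $110$ cycles charged before the call recovers the whole-program bound:
\[
110 + 178 + 214 \cdot (1 + 4) = 1358 \text{ cycles}.
\]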
%
\fvset{commandchars=\\\{\}}
\lstset{morecomment=[s][\color{blue}]{/*@}{*/},
        moredelim=[is][\color{blue}]{$}{$},
        moredelim=[is][\color{red}]{|}{|},
        lineskip=-2pt}
\begin{figure}[!p]
\begin{lstlisting}
|int __cost = 33, __stack = 5, __stack_max = 5;|
|void __cost_incr(int incr) { __cost += incr; }|
|void __stack_incr(int incr) {
  __stack += incr;
  __stack_max = __stack_max < __stack ? __stack : __stack_max;
}|

char a[4] = {3, 2, 7, 14};  char threshold = 4;

/*@ behavior stack_cost:
      ensures __stack_max <= __max(\old(__stack_max), 4+\old(__stack));
      ensures __stack == \old(__stack);
    behavior time_cost:
      ensures __cost <= \old(__cost)+(178+214*__max(1+\at(len,Pre), 0));
*/
int count(char *p, int len) {
  char j;  int found = 0;
  |__stack_incr(4);  __cost_incr(111);|
  $__l: /* internal */$
  /*@ for time_cost: loop invariant
        __cost <= \at(__cost,__l)+
                  214*(__max(\at((len-j)+1,__l), 0)-__max(1+(len-j), 0));
      for stack_cost: loop invariant
        __stack_max == \at(__stack_max,__l);
      for stack_cost: loop invariant
        __stack == \at(__stack,__l);
      loop variant len-j;
  */
  for (j = 0; j < len; j++) {
    |__cost_incr(78);|
    if (*p <= threshold) { |__cost_incr(136);| found++; }
    else { |__cost_incr(114);| }
    p++;
  }
  |__cost_incr(67);  __stack_incr(-4);|
  return found;
}

/*@ behavior stack_cost:
      ensures __stack_max <= __max(\old(__stack_max), 6+\old(__stack));
      ensures __stack == \old(__stack);
    behavior time_cost:
      ensures __cost <= \old(__cost)+1358;
*/
int main(void) {
  int t;
  |__stack_incr(2);  __cost_incr(110);|
  t = count(a,4);
  |__stack_incr(-2);|
  return t;
}
\end{lstlisting}
\caption{The instrumented version of the program in \autoref{test1},
 with instrumentation added by the CerCo compiler in red and cost invariants
 added by the CerCo Frama-C plugin in blue. The \lstinline'__cost',
 \lstinline'__stack' and \lstinline'__stack_max' variables hold the elapsed time
in clock cycles and the current and maximum stack usage. Their initial values
hold the clock cycles spent in initialising the global data before calling
\lstinline'main' and the space required by global data (and thus unavailable for
the stack).
}
\label{itest1}
\end{figure}

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %

\section{Compiler architecture}
\label{sect.compiler.architecture}

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %

\section{Compiler proof}
\label{sect.compiler.proof}

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %

\section{Frama-C plugin}
\label{sect.framac.plugin}

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %

\section{Formal development}
\label{sect.formal.development}

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %

\section{Conclusions}
\label{sect.conclusions}

%   Summary
%   Related work
%   Future work

In many application domains the intensional properties of programs---asymptotic or concrete time and space usage, for example---are an important factor in overall correctness.
`Soft real time' applications and cryptography libraries are two important classes of programs fitting this pattern.

Worst Case Execution Time (WCET) tools currently represent the state of the art in determining accurate concrete resource bounds for programs.

The CerCo verified compiler and associated toolchain have demonstrated that it is possible to perform static time and space analysis at the source level without losing accuracy, reducing the trusted code base and reconciling the study of functional and non-functional properties of programs.
The techniques introduced in the compiler proof seem to be scalable, and are applicable to both imperative and functional programming languages.
Further, all techniques presented are compatible with every compiler optimisation considered by us to date.

To prove that compilers can keep track of optimisations
and induce a precise cost model on the source code, we targeted a simple
architecture that admits a cost model that is execution history independent.
The most important future work is dealing with hardware architectures
characterised by history-dependent stateful components, like caches and
pipelines. The main issue is to assign a parametric, dependent cost
to basic blocks that can be later transferred by the labelling approach to
the source code and represented in a meaningful way to the user. The dependent
labelling approach that we have studied seems a promising tool to achieve
this goal, but more work is required to provide good source level
approximations of the relevant processor state.

Other examples of future work are to improve the cost invariant
generator algorithms and the coverage of compiler optimisations, to combine
the labelling approach with the type and effect discipline of~\cite{typeffects}
to handle languages with implicit memory management, and to experiment with
our tools in the early phases of development. Larger case studies are also necessary
to evaluate the CerCo prototype on realistic, industrial-scale programs.

\begin{acknowledgements}
\end{acknowledgements}

\bibliographystyle{spmpsci}
\bibliography{cerco}

\end{document}