\documentclass{llncs}

\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage[english]{babel}
\usepackage{color}
\usepackage{fancybox}
\usepackage{fancyvrb}
\usepackage{graphicx}
\usepackage[colorlinks,
 bookmarks,bookmarksopen,bookmarksdepth=2]{hyperref}
\usepackage{hyphenat}
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage{mdwlist}
\usepackage{microtype}
\usepackage{stmaryrd}
\usepackage{url}

\usepackage{tikz}
\usetikzlibrary{positioning,calc,patterns,chains,shapes.geometric,scopes}

%\renewcommand{\verb}{\lstinline}
%\def\lstlanguagefiles{lst-grafite.tex}
%\lstset{language=Grafite}
\lstset{language=C,basewidth=.5em,lineskip=-2pt}

\newlength{\mylength}
\newenvironment{frametxt}%
  {\setlength{\fboxsep}{5pt}
   \setlength{\mylength}{\linewidth}%
   \addtolength{\mylength}{-2\fboxsep}%
   \addtolength{\mylength}{-2\fboxrule}%
   \Sbox
   \minipage{\mylength}%
   \setlength{\abovedisplayskip}{0pt}%
   \setlength{\belowdisplayskip}{0pt}%
  }%
  {\endminipage\endSbox
   \[\fbox{\TheSbox}\]}

\title{Certified Complexity (CerCo)\thanks{The project CerCo acknowledges the
financial support of the Future and Emerging Technologies (FET) programme within
the Seventh Framework Programme for Research of the European Commission, under
FET-Open grant number: 243881}}
\author{
%Roberto
R.~M. Amadio$^{4}$ \and
%Nicolas
N.~Ayache$^{3,4}$ \and
%François
F.~Bobot$^{3,4}$ \and
%Jacob
J.~P. Boender$^1$ \and
%Brian
B.~Campbell$^2$ \and
%Ilias
I.~Garnier$^2$ \and
%Antoine
A.~Madet$^4$ \and
%James
J.~McKinna$^2$ \and
%Dominic
D.~P. Mulligan$^1$ \and
%Mauro
M.~Piccolo$^1$ \and
%Randy
R.~Pollack$^2$ \and
%Yann
Y.~R\'egis-Gianas$^{3,4}$ \and
%Claudio
C.~Sacerdoti Coen$^1$ \and
%Ian
I.~Stark$^2$ \and
%Paolo
P.~Tranquilli$^1$}
\institute{Dipartimento di Informatica - Scienza e Ingegneria, Universit\`a di
Bologna \and
LFCS, School of Informatics, University of Edinburgh
\and INRIA (Team $\pi r^2$)
\and
Universit\'e Paris Diderot
}

\bibliographystyle{splncs03}

\begin{document}

\maketitle

% Brian: I've changed "without loss of accuracy" because we discuss
% important precision/complexity tradeoffs for advanced architectures
% in the dependent labelling section.
\begin{abstract}
This paper provides an overview of the FET-Open Project CerCo
(`Certified Complexity'). The project's main achievement is
the development of a technique for analysing non-functional
properties of programs (time, space) at the source level, with little or no loss of accuracy
and with a small trusted code base. The main software component
developed is a certified compiler producing cost annotations. The compiler
translates source code to object code and produces an instrumented copy of the
source code. This instrumentation exposes at
the source level---and tracks precisely---the actual (non-asymptotic)
computational cost of the input program. Untrusted invariant generators and trusted theorem provers
can then be used to compute and certify the parametric execution time of the
code.
\end{abstract}

% ---------------------------------------------------------------------------- %
% SECTION                                                                      %
% ---------------------------------------------------------------------------- %
\section{Introduction}

\paragraph{Problem statement.} Computer programs can be specified with both
functional constraints (what a program must do) and non-functional constraints
(e.g. what resources---time, space, etc.---the program may use). In the current
state of the art, functional properties are verified for high-level source code
by combining user annotations (e.g. preconditions and invariants) with a
multitude of automated analyses (invariant generators, type systems, abstract
interpretation, theorem proving, and so on). By contrast, non-functional properties
are generally checked on low-level object code, but also demand information
about high-level functional behaviour that must somehow be recreated.

This situation presents several problems: 1) it can be hard to infer this
high-level structure in the presence of compiler optimisations; 2) techniques
working on object code are not useful in early development, yet problems
detected later are more expensive to tackle; 3) parametric cost analysis is very
hard: how can we translate a cost that depends on the execution state (e.g. the
value of a register or a carry bit) into a cost that the user can understand by
looking at the source code? 4) functional analysis performed only on object code
leaves the programmer little opportunity to contribute, leading to less precision in
the estimates.

134 | |
---|
135 | \paragraph{Vision and approach.} We want to reconcile functional and |
---|
136 | non-functional analyses: to share information and perform both at the same time |
---|
137 | on source code. |
---|
138 | % |
---|
139 | What has previously prevented this approach is the lack of a uniform and precise |
---|
140 | cost model for high-level code: 1) each statement occurrence is compiled |
---|
141 | differently and optimisations may change control flow; 2) the cost of an object |
---|
142 | code instruction may depend on the runtime state of hardware components like |
---|
143 | pipelines and caches, all of which are not visible in the source code. |
---|
144 | |
---|
145 | To solve the issue, we envision a new generation of compilers able to keep track |
---|
146 | of program structure through compilation and optimisation, and to exploit that |
---|
147 | information to define a cost model for source code that is precise, non-uniform, |
---|
148 | and accounts for runtime state. With such a source-level cost model we can |
---|
149 | reduce non-functional verification to the functional case and exploit the state |
---|
150 | of the art in automated high-level verification~\cite{survey}. The techniques |
---|
151 | currently used by the Worst Case Execution Time (WCET) community, which perform the analysis on object code, |
---|
152 | are still available but can now be coupled with an additional source-level |
---|
153 | analysis. Where the approach produces precise cost models too complex to reason |
---|
154 | about, safe approximations can be used to trade complexity with precision. |
---|
155 | Finally, source code analysis can be made during early development stages, when |
---|
156 | components have been specified but not implemented: source code modularity means |
---|
157 | that it is enough to specify the non-functional behaviour of unimplemented |
---|
158 | components. |
---|
159 | |
---|
160 | \paragraph{Contributions.} We have developed what we refer to as \emph{the labeling approach} \cite{labeling}, a |
---|
161 | technique to implement compilers that induce cost models on source programs by |
---|
162 | very lightweight tracking of code changes through compilation. We have studied |
---|
163 | how to formally prove the correctness of compilers implementing this technique. |
---|
164 | We have implemented such a compiler from C to object binaries for the 8051 |
---|
165 | micro-controller, and verified it in an interactive theorem prover. We have |
---|
166 | implemented a Frama-C plugin \cite{framac} that invokes the compiler on a source |
---|
167 | program and uses this to generate invariants on the high-level source that |
---|
168 | correctly model low-level costs. Finally, the plugin certifies that the program |
---|
169 | respects these costs by calling automated theorem provers, a new and innovative |
---|
170 | technique in the field of cost analysis. As a case study, we show how the |
---|
171 | plugin can automatically compute and certify the exact reaction time of Lustre |
---|
172 | \cite{lustre} data flow programs compiled into C. |
---|
173 | |
---|
174 | \section{Project context and objectives} |
---|
175 | Formal methods for verification of functional properties of programs have |
---|
176 | now reached a level of maturity and automation that is facilitating a slow but |
---|
177 | increasing adoption in production environments. For safety critical code, it is |
---|
178 | becoming commonplace to combine rigorous software engineering methodologies and testing |
---|
179 | with static analysis, taking the strong points of each approach and mitigating |
---|
180 | their weaknesses. Of particular interest are open frameworks |
---|
181 | for the combination of different formal methods, where the programs can be |
---|
182 | progressively specified and are continuously enriched with new safety |
---|
183 | guarantees: every method contributes knowledge (e.g. new invariants) that |
---|
184 | becomes an assumption for later analysis. |
---|
185 | |
---|
186 | The scenario for the verification of non-functional properties (time spent, |
---|
187 | memory used, energy consumed) is bleaker and the future seems to be getting even |
---|
188 | worse. Most industries verify that real time systems meet their deadlines |
---|
189 | by simply performing many runs of the system and timing their execution, computing the |
---|
190 | maximum time and adding an empirical safety margin, claiming the result to be a |
---|
191 | bound for the WCET of the program. Formal methods and software to statically |
---|
192 | analyse the WCET of programs exist, but they often produce bounds that are too |
---|
193 | pessimistic to be useful. Recent advancements in hardware architectures are all |
---|
194 | focused on the improvement of the average case performance, not the |
---|
195 | predictability of the worst case. Execution time is becoming increasingly |
---|
196 | dependent on execution history and the internal state of |
---|
197 | hardware components like pipelines and caches. Multi-core processors and non-uniform |
---|
198 | memory models are drastically reducing the possibility of performing |
---|
199 | static analysis in isolation, because programs are less and less time |
---|
200 | composable. Clock-precise hardware models are necessary for static analysis, and |
---|
201 | obtaining them is becoming harder as a consequence of the increased sophistication |
---|
202 | of hardware design. |
---|
203 | |
---|
Despite the aforementioned problems, the need for reliable real time systems and programs is
increasing, and there is growing pressure from the research community towards
the differentiation of hardware. The aim is to introduce alternative
hardware with more predictable behaviour, and hence better suited to static
analysis: for example, the decoupling of execution time from execution history
by introducing randomisation \cite{proartis}.

In the CerCo project \cite{cerco} we do not try to address this problem, optimistically
assuming that static analysis of non-functional properties of programs will
become feasible in the longer term. The main objective of our work is
instead to bring together static analysis of functional and non-functional
properties, which, according to the current state of the art, are completely
independent activities with limited exchange of information: while the
functional properties are verified on the source code, the analysis of
non-functional properties is entirely performed on the object code to exploit
clock-precise hardware models.

\subsection{Current object-code methods}

Analysis currently takes place on object code for two main reasons.
First, there cannot be a uniform, precise cost model for source
code instructions (or even basic blocks). During compilation, high level
instructions are torn apart and reassembled in context-specific ways so that
matching a fragment of object code with a single high level instruction is
infeasible. Even the control flow of the object and source code can be very
different as a result of optimisations; for example, aggressive loop
optimisations may completely transform source level loops. Despite the lack of a uniform, compilation- and
program-independent cost model on the source language, the literature on the
analysis of non-asymptotic execution time on high level languages that assumes
such a model is growing and gaining momentum. However, unless we can provide a
replacement for such cost models, this literature's future practical impact looks
to be minimal. Some hope has been provided by the EmBounded project \cite{embounded}, which
compositionally compiles high-level code to a byte code that is executed by an
emulator with guarantees on the maximal execution time spent for each byte code
instruction. That approach provides a uniform model at the price of the model's
precision (each cost is a pessimistic upper bound) and the performance of the
executed code (because the byte code is emulated compositionally instead of
performing a fully non-compositional compilation).

The second reason to perform the analysis on the object code is that bounding
the worst case execution time of small code fragments in isolation (e.g. loop
bodies) and then adding up the bounds yields very poor estimates because no
knowledge of the hardware state can be assumed when executing the fragment. By
analysing longer runs the bound obtained becomes more precise because the lack
of knowledge of the initial state has less of an effect on longer computations.

The cost of an execution is the sum of the cost of basic blocks multiplied by
the number of times they are executed, which is a functional property of the
program. Therefore, in order to perform (parametric) time analysis of programs,
it is necessary to combine a cost model with control and data flow analysis.
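In symbols: writing $\mathrm{cost}(b)$ for the cost assigned to a basic block
$b$ and $\mathrm{count}(b)$ for the number of times $b$ is executed in a given
run, the cost of the run is
\[
\sum_{b} \mathrm{cost}(b) \cdot \mathrm{count}(b),
\]
where the counts are precisely the functional information that control and data
flow analysis must supply.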
Current state of the art WCET technology
\cite{stateart} performs the analysis on the object code, where the logic of the
program is harder to reconstruct and most information available at the source
code level has been lost; imprecision in the analysis leads to useless bounds. To
augment precision, the tools ask the user to provide constraints on the object
code control flow, usually in the form of bounds on the number of iterations of
loops or linear inequalities on them. This requires the user to manually link
the source and object code, translating their assumptions on the source code
(which may be wrong) into object code constraints. The task is error prone and
hard, especially in the presence of complex compiler optimisations.

Traditional techniques for WCET that work on object code are also affected by
another problem: they cannot be applied before the generation of the object
code. Functional properties can be analysed in early development stages, while
analysis of non-functional properties may come too late to avoid expensive
changes to the program architecture.

\subsection{CerCo}

In CerCo we propose a radically new approach to the problem: we reject the idea
of a uniform cost model and we propose that the compiler, which knows how the
code is translated, must return the cost model for basic blocks of high level
instructions. It must do so by keeping track of control flow modifications
in order to reverse them, and by interfacing with static analysers.

By embracing compilation, instead of avoiding it like EmBounded did, a CerCo
compiler can at the same time produce efficient code and return costs that are
as precise as the static analysis can be. Moreover, we allow our costs to be
parametric: the cost of a block can depend on actual program data, on a
summary of the execution history, or on an approximated representation of the
hardware state. For example, loop optimisations may assign to a loop body a cost
that is a function of the number of iterations performed. As another example,
the cost of a block may be a function of the vector of stalled pipeline states,
which can be exposed in the source code and updated at each basic block exit. It
is parametricity that allows one to analyse small code fragments without losing
precision: in the analysis of the code fragment we do not have to ignore the
initial hardware state. On the contrary, we can assume that we know exactly which
state (or mode, as the WCET literature calls it) we are in.

The CerCo approach has the potential to dramatically improve the state of the
art. By performing control and data flow analyses on the source code, the error
prone translation of invariants is completely avoided. It is in fact performed
by the compiler itself when it induces the cost model on the source language.
Moreover, any available technique for the verification of functional properties
can be immediately reused, and multiple techniques can collaborate to
infer and certify cost invariants for the program. Parametric cost analysis
becomes the default, with non-parametric bounds used as a last resort when
trading the precision of the analysis for tractability. \emph{A priori}, no
technique previously used in traditional WCET is lost: they can just be applied
at the source code level.
Our approach can also work in the early
stages of development by axiomatically attaching costs to unimplemented components.


All software used to verify properties of programs must be as bug free as
possible. The trusted code base for verification consists of the code that must
be trusted in order to believe that the property holds. The trusted code base of
state-of-the-art WCET tools is very large: one needs to trust the control flow
analyser, the linear programming libraries it uses, and also the formal models
of the hardware. In CerCo we are moving the control flow analysis to the source
code and we are introducing a non-standard compiler too. To reduce the trusted
code base, we implemented a prototype and a static analyser in an interactive
theorem prover, which was used to certify that the cost computed on the source
code is indeed the one actually spent by the hardware. Formal models of the
hardware and of the high level source languages were also implemented in the
interactive theorem prover. Control flow analysis on the source code has been
obtained using invariant generators, tools to produce proof obligations from
generated invariants, and automatic theorem provers to verify the obligations. If
the automatic provers are able to generate proof traces that can be
independently checked, the only remaining component that enters the trusted code
base is an off-the-shelf invariant generator which, in turn, can be proved
correct using an interactive theorem prover. Therefore we achieve the double
objective of allowing the use of more off-the-shelf components (e.g. provers and
invariant generators) whilst reducing the trusted code base at the same time.

%\paragraph{Summary of the CerCo objectives.} To summarize, the goal of CerCo is
% to reconcile functional with non-functional analysis by performing them together
% on the source code, sharing common knowledge about execution invariants. We want
% to achieve the goal implementing a new generation of compilers that induce a
% parametric, precise cost model for basic blocks on the source code. The compiler
% should be certified using an interactive theorem prover to minimize the trusted
% code base of the analysis. Once the cost model is induced, off-the-shelf tools
% and techniques can be combined together to infer and prove parametric cost
% bounds.
%The long term benefits of the CerCo vision are expected to be:
%1. the possibility to perform static analysis during early development stages
%2. parametric bounds made easier
%3. the application of off-the-shelf techniques currently unused for the
% analysis of non-functional properties, like automated proving and type systems
%4. simpler and safer interaction with the user, that is still asked for
% knowledge, but on the source code, with the additional possibility of actually
% verifying the provided knowledge
%5. a reduced trusted code base
%6. the increased accuracy of the bounds themselves.
%
%The long term success of the project is hindered by the increased complexity of
% the static prediction of the non-functional behaviour of modern hardware. In the
% time frame of the European contribution we have focused on the general
% methodology and on the difficulties related to the development and certification
% of a cost model inducing compiler.

\section{The typical CerCo workflow}
\begin{figure}[!t]
\begin{tabular}{l@{\hspace{0.2cm}}|@{\hspace{0.2cm}}l}
\begin{lstlisting}
char a[] = {3, 2, 7, 14};
char threshold = 4;

int count(char *p, int len) {
  char j;
  int found = 0;
  for (j=0; j < len; j++) {
    if (*p <= threshold)
      found++;
    p++;
  }
  return found;
}

int main() {
  return count(a,4);
}
\end{lstlisting}
&
% $\vcenter{\includegraphics[width=7.5cm]{interaction_diagram.pdf}}$
\begin{tikzpicture}[
  baseline={([yshift=-.5ex]current bounding box)},
  element/.style={draw, text width=1.6cm, on chain, text badly centered},
  >=stealth
  ]
{[start chain=going below, node distance=.5cm]
  \node [element] (cerco) {CerCo\\compiler};
  \node [element] (cost) {CerCo\\cost plugin};
  {[densely dashed]
    \node [element] (ded) {Deductive\\platform};
    \node [element] (check) {Proof\\checker};
  }
}
\coordinate [left=3.5cm of cerco] (left);
{[every node/.style={above, text width=3.5cm, text badly centered,
  font=\scriptsize}]
  \draw [<-] ([yshift=-1ex]cerco.north west) coordinate (t) --
    node {C source}
    (t-|left);
  \draw [->] (cerco) -- (cost);
  \draw [->] ([yshift=1ex]cerco.south west) coordinate (t) --
    node {C source+\color{red}{cost annotations}}
    (t-|left) coordinate (cerco out);
  \draw [->] ([yshift=1ex]cost.south west) coordinate (t) --
    node {C source+\color{red}{cost annotations}\\+\color{blue}{synthesized assertions}}
    (t-|left) coordinate (out);
  {[densely dashed]
    \draw [<-] ([yshift=-1ex]ded.north west) coordinate (t) --
      node {C source+\color{red}{cost annotations}\\+\color{blue}{complexity assertions}}
      (t-|left) coordinate (ded in) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
    \draw [->] ([yshift=1ex]ded.south west) coordinate (t) --
      node {complexity obligations}
      (t-|left) coordinate (out);
    \draw [<-] ([yshift=-1ex]check.north west) coordinate (t) --
      node {complexity proof}
      (t-|left) .. controls +(-.5, 0) and +(-.5, 0) .. (out);
    \draw [dash phase=2.5pt] (cerco out) .. controls +(-1, 0) and +(-1, 0) ..
      (ded in);
  }}
%% user:
% head
\draw (current bounding box.west) ++(-.2,.5)
  circle (.2) ++(0,-.2) -- ++(0,-.1) coordinate (t);
% arms
\draw (t) -- +(-.3,-.2);
\draw (t) -- +(.3,-.2);
% body
\draw (t) -- ++(0,-.4) coordinate (t);
% legs
\draw (t) -- +(-.2,-.6);
\draw (t) -- +(.2,-.6);
\end{tikzpicture}
\end{tabular}
\caption{On the left: code to count the array's elements
that are less than or equal to the threshold. On the right: CerCo's interaction
diagram. CerCo's components are drawn solid.}
\label{test1}
\end{figure}
We illustrate the workflow we envisage (on the right of~\autoref{test1})
on an example program (on the left of~\autoref{test1}).
The user writes the program and feeds it to the CerCo compiler, which outputs
an instrumented version of the same program that updates global variables
recording the elapsed execution time and the stack space usage.
The red lines in \autoref{itest1} are the instrumentation introduced by the
compiler. The annotated program can then be enriched with complexity assertions
in the style of Hoare logic, which are passed to a deductive platform (in our
case Frama-C). We provide as a Frama-C cost plugin a simple automatic
synthesiser for complexity assertions (the blue lines in \autoref{itest1}),
which can be overridden by the user to increase or decrease accuracy. From the
assertions, a general purpose deductive platform produces proof obligations
which in turn can be closed by automatic or interactive provers, ending in a
proof certificate.

% NB: if you try this example with the live CD you should increase the timeout

Twelve proof obligations are generated from~\autoref{itest1} (to prove
that the loop invariant holds after one execution if it holds before,
to prove that the whole program execution takes at most 1358 cycles,
etc.). Note that the synthesised time bound for \lstinline'count',
$178+214*(1+\lstinline'len')$ cycles, is parametric in the length of
the array. The CVC3 prover
closes all obligations within half a minute on routine commodity
hardware. A simpler non-parametric version can be solved in a few
seconds.
%
\fvset{commandchars=\\\{\}}
\lstset{morecomment=[s][\color{blue}]{/*@}{*/},
        moredelim=[is][\color{blue}]{$}{$},
        moredelim=[is][\color{red}]{|}{|}}
\begin{figure}[!p]
\begin{lstlisting}
|int __cost = 33, __stack = 5, __stack_max = 5;|
|void __cost_incr(int incr) { __cost += incr; }|
|void __stack_incr(int incr) {
  __stack += incr;
  __stack_max = __stack_max < __stack ? __stack : __stack_max;
}|

char a[4] = {3, 2, 7, 14};  char threshold = 4;

/*@ behavior stack_cost:
      ensures __stack_max <= __max(\old(__stack_max), 4+\old(__stack));
      ensures __stack == \old(__stack);
    behavior time_cost:
      ensures __cost <= \old(__cost)+(178+214*__max(1+\at(len,Pre), 0));
*/
int count(unsigned char *p, int len) {
  char j;  int found = 0;
  |__stack_incr(4);  __cost_incr(111);|
  $__l: /* internal */$
  /*@ for time_cost: loop invariant
        __cost <= \at(__cost,__l)+
                  214*(__max(\at((len-j)+1,__l), 0)-__max(1+(len-j), 0));
      for stack_cost: loop invariant
        __stack_max == \at(__stack_max,__l);
      for stack_cost: loop invariant
        __stack == \at(__stack,__l);
      loop variant len-j;
  */
  for (j = 0; j < len; j++) {
    |__cost_incr(78);|
    if (*p <= threshold) { |__cost_incr(136);| found ++; }
    else { |__cost_incr(114);| }
    p ++;
  }
  |__cost_incr(67);  __stack_incr(-4);|
  return found;
}

/*@ behavior stack_cost:
      ensures __stack_max <= __max(\old(__stack_max), 6+\old(__stack));
      ensures __stack == \old(__stack);
    behavior time_cost:
      ensures __cost <= \old(__cost)+1358;
*/
int main(void) {
  int t;
  |__stack_incr(2);  __cost_incr(110);|
  t = count(a,4);
  |__stack_incr(-2);|
  return t;
}
\end{lstlisting}
\caption{The instrumented version of the program in \autoref{test1},
with instrumentation added by the CerCo compiler in red and cost invariants
added by the CerCo Frama-C plugin in blue. The \texttt{\_\_cost},
\texttt{\_\_stack} and \texttt{\_\_stack\_max} variables hold the elapsed time
in clock cycles and the current and maximum stack usage. Their initial values
hold the clock cycles spent in initialising the global data before calling
\texttt{main} and the space required by global data (and thus unavailable for
the stack).
}
\label{itest1}
\end{figure}
\section{Main scientific and technical results}
We will now review the main results that the CerCo project has achieved.

% \emph{Dependent labeling~\cite{?}.} The basic labeling approach assumes a
% bijective mapping between object code and source code O(1) blocks (called basic
% blocks). This assumption is violated by many program optimizations (e.g. loop
% peeling and loop unrolling). It also assumes the cost model computed on the
% object code to be non parametric: every block must be assigned a cost that does
% not depend on the state. This assumption is violated by stateful hardware like
% pipelines, caches, branch prediction units. The dependent labeling approach is
% an extension of the basic labeling approach that allows to handle parametric
% cost models. We showed how the method allows to deal with loop optimizations and
% pipelines, and we speculated about its applications to caches.
%
% \emph{Techniques to exploit the induced cost model.} Every technique used for
% the analysis of functional properties of programs can be adapted to analyse the
% non-functional properties of the source code instrumented by compilers that
% implement the labeling approach. In order to gain confidence in this claim, we
% showed how to implement a cost invariant generator combining abstract
% interpretation with separation logic ideas \cite{separation}. We integrated
% everything in the Frama-C modular architecture, in order to compute proof
% obligations from functional and cost invariants and to use automatic theorem
% provers on them. This is an example of a new technique that is not currently
% exploited in WCET analysis. It also shows how precise functional invariants
% benefits the non-functional analysis too. Finally, we show how to fully
% automatically analyse the reaction time of Lustre nodes that are first compiled
% to C using a standard Lustre compiler and then processed by a C compiler that
% implements the labeling approach.

% \emph{The CerCo compiler.} This is a compiler from a large subset of the C
% program to 8051/8052 object code,
% integrating the labeling approach and a static analyser for 8051 executables.
% The latter can be implemented easily and does not require dependent costs
% because the 8051 microprocessor is a very simple one, with constant-cost
% instruction. It was chosen to separate the issue of exact propagation of the
% cost model from the orthogonal problem of the static analysis of object code
% that may require approximations or dependent costs. The compiler comes in
% several versions: some are prototypes implemented directly in OCaml, and they
% implement both the basic and dependent labeling approaches; the final version
% is extracted from a Matita certification and at the moment implements only the
% basic approach.

\subsection{The (basic) labeling approach}
The labeling approach is the foundational insight that underlies all the developments in CerCo.
It allows the tracking of the evolution of
basic blocks during the compilation process in order to propagate the cost model from the
object code to the source code without losing precision.
\paragraph{Problem statement.} Given a source program $P$, we want to obtain an
instrumented source program $P'$, written in the same programming language, and
the object code $O$ such that: 1) $P'$ is obtained by inserting into $P$ some
additional instructions to update global cost information like the amount of
time spent during execution or the maximal stack space required; 2) $P$ and $P'$
must have the same functional behaviour, i.e.\ they must produce the same output
and intermediate observables; 3) $P$ and $O$ must have the same functional
behaviour; 4) after execution, and at interesting points during execution, the
cost information computed by $P'$ must be an upper bound of the one spent by $O$
to perform the corresponding operations (soundness property); 5) the difference
between the costs computed by $P'$ and the execution costs of $O$ must be
bounded by a program-dependent constant (precision property).
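In symbols, writing $c_{P'}$ for the cost information computed by $P'$ and
$c_O$ for the cost actually spent by $O$ on the corresponding operations,
requirements 4 and 5 together amount to
\[
c_O \;\le\; c_{P'} \;\le\; c_O + k
\]
for a constant $k$ that depends on the program but not on its input.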

\paragraph{The labeling software components.} We solve the problem in four
stages \cite{labeling}, implemented by four software components that are used
in sequence.

The first component labels the source program $P$ by injecting label emission
statements in appropriate positions to mark the beginning of basic blocks.
% The
% set of labels with their positions is called labeling.
The syntax and semantics
of the source programming language are augmented with label emission statements.
The statement ``EMIT $\ell$'' behaves like a NOP instruction that does not affect the
program state or control flow, but its execution is observable.
% Therefore the observables of a run of a program becomes a stream
% of label emissions: $\ell_1,\ldots,\ell_n$, called the program trace. We clarify the conditions
% that the labeling must respect later.
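As a minimal illustration, consider labeling a small C function (the concrete
syntax \lstinline'EMIT(L1)' is an illustrative rendering of the ``EMIT $\ell$''
statement, not the compiler's actual output):
\begin{lstlisting}
int sum(int *p, int n) {
  int s = 0;
  EMIT(L1);      /* start of the entry block */
  for (int i = 0; i < n; i++) {
    EMIT(L2);    /* start of the loop body: breaks the loop */
    s = s + p[i];
  }
  EMIT(L3);      /* start of the exit block */
  return s;
}
\end{lstlisting}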

The second component is a labeling preserving compiler. It can be obtained from
an existing compiler by adding label emission statements to every intermediate
language and by propagating label emission statements during compilation. The
compiler is correct if it preserves both the functional behaviour of the program
and the traces of observables.
% We may also ask that the function that erases the cost
% emission statements commute with compilation. This optional property grants that
% the labeling does not interfere with the original compiler behaviour. A further
% set of requirements will be added later.

The third component is a static object code analyser. It takes as input a labeled
object code and it computes the scope of each of its label emission statements,
i.e.\ the tree of instructions that may be executed after the statement and before
a new label emission is encountered.
It is a tree and not a sequence because the scope
may contain a branching statement. To guarantee that such a finite tree
exists, the object code must not contain any loop that is not broken by a label
emission statement. This is the first requirement of a sound labeling. The
analyser fails if the labeling is unsound. For each scope, the analyser
computes an upper bound of the execution time required by the scope, using the
maximum of the costs of the two branches in case of a conditional statement.
Finally, the analyser computes the cost of a label by taking the maximum of the
costs of the scopes of every statement that emits that label.

The fourth and last component takes as input the cost model computed at step 3
and the labelled code computed at step 1. It outputs a source program obtained
by replacing each label emission statement with a statement that increments the
global cost variable by the cost that the cost model associates with the label.
The obtained source code is the instrumented source code.
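Continuing the sketch above, if the static analyser assigned (hypothetical)
costs of 23, 47 and 15 cycles to L1, L2 and L3, instrumentation would yield:
\begin{lstlisting}
int __cost = 0;

int sum(int *p, int n) {
  int s = 0;
  __cost += 23;    /* was: EMIT(L1) */
  for (int i = 0; i < n; i++) {
    __cost += 47;  /* was: EMIT(L2) */
    s = s + p[i];
  }
  __cost += 15;    /* was: EMIT(L3) */
  return s;
}
\end{lstlisting}
(The actual compiler emits calls to an increment function such as
\lstinline'__cost_incr', as in \autoref{itest1}; the increments are inlined
here for brevity.)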

\paragraph{Correctness.} Requirements 1, 2 and 3 of the problem statement
obviously hold, with 2 and 3 being a consequence of the definition of a correct
labeling preserving compiler. It is also obvious that the value of the global
cost variable of an instrumented source code is at any time equal to the sum of
the costs of the labels emitted by the corresponding labelled code. Moreover,
because the compiler preserves all traces, the sums of the costs of the labels
emitted in the source and target labelled code are the same. Therefore, to
satisfy the fourth requirement, we must guarantee that the time taken to execute
the object code is equal to the sum of the costs of the labels emitted by the
object code. We collect all the necessary conditions for this to happen in the
definition of a sound labeling: a) all loops must be broken by a cost emission
statement; b) all program instructions must be in the scope of some cost
emission statement. To satisfy the fifth requirement as well, additional
requirements must be imposed on the object code labeling to avoid all uses of
the maximum in the cost computation: the labeling is precise if every label is
emitted at most once and both branches of a conditional jump start with a label
emission statement.

The correctness and precision of the labeling approach rely only on the
correctness and precision of the object code labeling. The simplest
way to achieve them is to impose the correctness and precision
requirements on the initial labeling produced by the first software
component as well, and to ask the compiler to preserve these
properties. The latter requirement imposes serious limitations on the
compilation strategy and optimisations: the compiler may not duplicate any code
that contains label emission statements, like loop bodies. Therefore several
loop optimisations like peeling or unrolling are prevented. Moreover, precision
of the object code labeling is not sufficient \emph{per se} to obtain global
precision: we implicitly assumed that a precise constant cost can be assigned
to every instruction. This is not possible in
the presence of stateful hardware whose state influences the cost of operations,
like pipelines and caches. In the next subsection we will see an extension of the
basic labeling approach that covers this situation.

The labeling approach described in this section can be applied equally well and
with minor modifications to imperative and functional languages
\cite{labeling2}. We have tested it on a simple imperative language without
functions (a `while' language), on a subset of C and on two compilation chains for
a purely functional higher-order language. The two main changes required to deal
with functional languages are: 1) because global variables and updates are not
available, the instrumentation phase produces monadic code to `update' the
global costs; 2) the requirements for a sound and precise labeling of the
source code must be changed when the compilation is based on CPS translations.
In particular, we need to introduce labels emitted before a statement is
executed and also labels emitted after a statement is executed. The latter capture
code that is inserted by the CPS translation and that would escape all label
scopes.

% Brian: one of the reviewers pointed out that standard Prolog implementations
% do have some global state that apparently survives backtracking and might be
% used. As we haven't experimented with this, I think it's best to elide it
% entirely.

% Phases 1, 2 and 3 can be applied as well to logic languages (e.g. Prolog).
% However, the instrumentation phase cannot: in standard Prolog there is no notion
% of (global) variable whose state is not retracted during backtracking.
% Therefore, the cost of executing computations that are later backtracked would
% not be correctly counted in. Any extension of logic languages with
% non-backtrackable state could support our labeling approach.

\subsection{Dependent labeling}
The core idea of the basic labeling approach is to establish a tight connection
between basic blocks executed in the source and target languages. Once the
connection is established, any cost model computed on the object code can be
transferred to the source code, without affecting the code of the compiler or
its proof. In particular, it is immediate that we can also transport cost models
that associate with each label a function from hardware states to natural numbers.
However, a problem arises during the instrumentation phase that replaces cost
emission statements with increments of global cost variables. The global cost
variable must be incremented by the result of applying the function associated
with the label to the hardware state at the time of execution of the block.
The hardware state comprises both the functional state that affects the
computation (the value of the registers and memory) and the non-functional
state that does not (the pipeline and cache contents, for example). The former is
in correspondence with the source code state, but reconstructing the
correspondence may be hard, and lifting the cost model to work on the source code
state is likely to produce cost expressions that are too complex to understand and reason about.
Fortunately, in all modern architectures the cost of executing single
instructions is either independent of the functional state, or the jitter---the
difference between the worst and best case execution times---is small enough
to be bounded without losing too much precision. Therefore we can concentrate
on dependencies over the `non-functional' parts of the state only.

The non-functional state has no correspondence in the high level state and does
not influence the functional properties. What can be done is to expose the
non-functional state in the source code. We present here the basic
intuition in a simplified form: the technical details that allow us to handle the
general case are more complex and can be found in~\cite{paolo}. We add
to the source code an additional global variable that represents the
non-functional state and another one that remembers the last labels emitted. The
state variable must be updated at every label emission statement, using an
update function which is also computed during static analysis. The update
function associates with each label a function from the recently emitted labels
and the old state to the new state. It is computed by composing the semantics of every
instruction in a basic block and restricting it to the non-functional part of
the state.
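As a simplified sketch (the state encoding and the cost and update functions
are invented for illustration; the general treatment is in~\cite{paolo}),
exposing a one-word pipeline state in the source code could look like:
\begin{lstlisting}
int __cost = 0;
int __state = 0;  /* e.g. encodes the position of pipeline bubbles */

/* tables computed by the static analyser for label L2 */
int cost_L2(int state)   { return state == 0 ? 40 : 47; }
int update_L2(int state) { return 0; /* pipeline refilled */ }

void emit_L2(void) {  /* replaces the emission of label L2 */
  __cost = __cost + cost_L2(__state);
  __state = update_L2(__state);
}
\end{lstlisting}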

Not all the details of the non-functional state need to be exposed, and the
technique works better when the part of the state that is required can be summarised
in a simple data structure. For example, to handle simple but realistic
pipelines it is sufficient to remember a short integer that encodes the position
of bubbles (stuck instructions) in the pipeline. In any case, it is not necessary
for the user to understand the meaning of the state to reason about the properties
of the
program. Moreover, at any moment the user, or the invariant generator tools that
analyse the instrumented source code produced by the compiler, can decide to
trade precision of the analysis for simplicity by approximating the parametric
cost with safe non-parametric bounds. Interestingly, the functional analysis of
the code can determine which blocks are executed more frequently in order to
approximate more aggressively the ones that are executed less.

Dependent labeling can also be applied to allow the compiler to duplicate
blocks that contain labels (e.g. in loop optimisations)~\cite{paolo}. The effect is to assign
a different cost to the different occurrences of a duplicated label. For
example, loop peeling turns a loop into the concatenation of a copy of the loop
body (that executes the first iteration) with the conditional execution of the
loop (for the successive iterations). Because of further optimisations, the two
copies of the loop will be compiled differently, with the first body usually
taking more time.

By introducing a variable that keeps track of the iteration number, we can
associate with the label a cost that is a function of the iteration number. The
same technique works for loop unrolling without modifications: the function will
assign one cost to the even iterations and another cost to the odd ones. The
actual work to be done consists of introducing, for each loop, a source code
variable that counts the number of iterations. The loop optimisation code
that duplicates the loop bodies must also modify the code to correctly propagate
the updates of the iteration counters. The technical details are more complex and
can be found in the CerCo reports and publications. The implementation, however,
is quite simple and the changes to a loop optimising compiler are minimal.
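A sketch of the unrolling case (the counter placement and the two costs are
illustrative): the instrumented source charges even and odd iterations
differently, mirroring the two copies of the body in the unrolled object code.
\begin{lstlisting}
int __cost = 0;

int sum(int *p, int n) {
  int s = 0;
  int __iter = 0;  /* iteration counter added by the compiler */
  for (int i = 0; i < n; i++) {
    /* even iterations were compiled into the first unrolled copy
       of the body, odd iterations into the second */
    __cost = __cost + ((__iter % 2 == 0) ? 40 : 35);
    __iter = __iter + 1;
    s = s + p[i];
  }
  return s;
}
\end{lstlisting}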

\subsection{Techniques to exploit the induced cost model}
We review the cost synthesis techniques developed in the project.
The starting hypothesis is that we have a certified methodology to annotate
blocks in the source code with constants which provide a sound and sufficiently
precise upper bound on the cost of executing the blocks after compilation to
object code.

The principle that we have followed in designing the cost synthesis tools is
that the synthetic bounds should be expressed and proved within a general
purpose tool built to reason on the source code. In particular, we rely on the
Frama-C tool to reason on C code and on the Coq proof-assistant to reason on
higher-order functional programs.

This principle entails that: 1)
the inferred synthetic bounds are indeed correct as long as the general purpose
tool is; 2) there is no limitation on the class of programs that can be handled,
as long as the user is willing to carry out an interactive proof.

Of course, automation is desirable whenever possible. Within this framework,
automation means writing programs that give hints to the general purpose tool.
These hints may take the form, say, of loop invariants/variants, of predicates
describing the structure of the heap, or of types in a light logic. If these
hints are correct and sufficiently precise the general purpose tool will produce
a proof automatically; otherwise, user interaction is required.

\paragraph{The Cost plugin and its application to the Lustre compiler.}
Frama-C \cite{framac} is a set of analysers for C programs with a
specification language called ACSL. New analyses can be dynamically added
via a plugin system. For instance, the Jessie plugin allows deductive
verification of C programs with respect to their specification in ACSL, with
various provers as back-end tools.
We developed the CerCo Cost plugin for the Frama-C platform as a proof of
concept of an automatic environment exploiting the cost annotations produced by
the CerCo compiler. It consists of an OCaml program which essentially
takes the following actions: 1) it receives as input a C program, 2) it
applies the CerCo compiler to produce a related C program with cost annotations,
3) it applies some heuristics to produce a tentative bound on the cost of
executing the C functions of the program as a function of the value of their
parameters, 4) the user can then call the Jessie plugin to discharge the
related proof obligations.
In the following we elaborate on the soundness of the framework and the
experiments we performed with the Cost tool on the C programs produced by a
Lustre compiler.

\paragraph{Soundness.} The soundness of the whole framework depends on the cost
annotations added by the CerCo compiler, the synthetic costs produced by the
cost plugin, the verification conditions (VCs) generated by Jessie, and the
external provers discharging the VCs. Because the synthetic costs are expressed in ACSL,
Jessie can be used to verify them. Thus, even if the added synthetic costs are
incorrect (relative to the cost annotations), the process as a whole is still
correct: indeed, Jessie will not validate incorrect costs and no conclusion can
be drawn about the WCET of the program in this case. In other terms, the
soundness does not really depend on the action of the cost plugin, which can in
principle produce any synthetic cost. However, in order to be able to actually
prove a WCET of a C function, we need to add correct annotations in such a way that
Jessie and subsequent automatic provers have enough information to deduce their
validity. In practice this is not straightforward even for very simple programs
composed of branching and assignments (no loops and no recursion) because a fine
analysis of the VCs associated with branching may lead to a complexity blow up.
\paragraph{Experience with Lustre.} Lustre is a data-flow language for programming
synchronous systems, with a compiler which targets C. We designed a
wrapper for supporting Lustre files.
The C function produced by the compiler implements the step function of the
synchronous system, and computing the WCET of the function amounts to obtaining a
bound on the reaction time of the system. We tested the Cost plugin and the
Lustre wrapper on the C programs generated by the Lustre compiler. For programs
consisting of a few hundred lines of code, the cost plugin computes a WCET and
Alt-Ergo is able to discharge all VCs automatically.

839 | \paragraph{Handling C programs with simple loops.} |
---|
The cost annotations added by the CerCo compiler take the form of C instructions
that increment a fresh global variable, called the cost variable, by a constant.
Synthesising a WCET for a C function thus consists in statically establishing an
upper bound on the difference between the value of the cost variable before and
after the execution of the function, i.e.\ finding the instructions in the
function that update the cost variable and determining how many times they are
executed along the control flow. In order to perform the analysis the plugin
makes the following assumptions about the programs:
1) there are no recursive functions;
2) every loop is annotated with a variant (the variants of `for' loops are
inferred automatically). The sketch below illustrates the instrumented code and
the kind of contract to be synthesised.
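Here the cost variable \lstinline|__cost|, the constants, and the loop
invariant (which must be strong enough for the provers to check the bound) are
all illustrative:
\begin{lstlisting}
/*@ requires n >= 0;
  @ ensures __cost <= \old(__cost) + 101 + 278 * n;
  @*/
int sum(int *a, int n) {
  int s = 0, i;
  __cost += 97;                  /* block before the loop          */
  /*@ loop invariant 0 <= i <= n;
    @ loop invariant __cost <= \at(__cost,Pre) + 97 + 278 * i;
    @ loop variant n - i;
    @*/
  for (i = 0; i < n; i++) {
    __cost += 278;               /* constant cost of one iteration */
    s += a[i];
  }
  __cost += 4;                   /* block after the loop           */
  return s;
}
\end{lstlisting}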
---|
851 | |
---|
The plugin proceeds as follows.
First the call graph of the program is computed.
Then the cost of each function is computed by traversing its
control flow graph. If a function $f$ calls a function $g$,
then $g$ is processed before $f$. The cost at a node is the maximum of the
costs of its successors.
In the case of a loop with a body that has a constant cost for every step of the
loop, the cost is the product of the cost of the body and the value of the
variant at loop entry.
In the case of a loop with a body whose cost depends on the values of some free
variables, a fresh logic function $f$ is introduced to represent the cost of the
loop in the logic assertions. This logic function takes the variant as its first
parameter. The other parameters of $f$ are the free variables of the body of the
loop. An axiom is added to account for the fact that the cost is accumulated at
each step of the loop.
The cost of the function is added directly as a post-condition of the function.
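Such a logic function and its accumulation axiom might be rendered in ACSL as
follows (a sketch under our own naming; \lstinline|body_cost| stands for the
plugin-generated expression for the cost of one iteration, with \lstinline|v|
the variant and \lstinline|x| a free variable of the body):
\begin{lstlisting}
/*@ axiomatic loop_cost {
  @   logic integer body_cost(integer x);
  @   logic integer f(integer v, integer x);
  @   axiom f_base: \forall integer v, x;
  @     v <= 0 ==> f(v, x) == 0;
  @   axiom f_step: \forall integer v, x;
  @     v > 0 ==> f(v, x) == body_cost(x) + f(v - 1, x);
  @ }
  @*/
\end{lstlisting}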
---|
869 | |
---|
The user can influence the annotation in two ways:
1) by providing more precise variants;
2) by annotating functions with cost specifications, which the plugin then uses
instead of computing the cost itself, as sketched below.
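For instance, an externally implemented function could carry a user-supplied
cost contract like the following (the function name and the constant are
hypothetical):
\begin{lstlisting}
/*@ assigns __cost;
  @ ensures __cost <= \old(__cost) + 5000;
  @*/
extern void dsp_step(int *buf);  /* cost specified, not computed */
\end{lstlisting}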
---|
\paragraph{C programs with pointers.}
When it comes to verifying programs involving pointer-based data structures,
such as linked lists, trees, or graphs, the use of traditional first-order logic
to specify, and of SMT solvers to verify, shows some limitations. Separation
logic~\cite{separation} is an elegant alternative: it is a program logic
with a new notion of conjunction to express spatial heap separation. Bobot has
recently introduced automatically generated separation
predicates to simulate separation logic reasoning in the Jessie plugin, whose
specification language, verification condition
generator, and theorem provers were not designed with separation logic in
mind. CerCo's plugin can exploit the separation predicates to automatically
reason about the execution cost of simple heap manipulation programs such as
in-place list reversal.
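A sketch of such a program follows; the cost constants are again illustrative,
and we elide the precondition linking \lstinline|n| to the length of the list
and the acyclicity of the list, which are exactly what the generated separation
predicates make expressible:
\begin{lstlisting}
typedef struct cell { int hd; struct cell *tl; } cell;

/*@ requires n >= 0;
  @ ensures __cost <= \old(__cost) + 12 + 47 * n;
  @*/
cell *rev(cell *l, int n) {     /* n: length of the list l */
  cell *r = 0;
  __cost += 10;
  /*@ loop variant n; */
  while (l) {
    __cost += 47;               /* constant cost per cell   */
    cell *next = l->tl;
    l->tl = r;                  /* reverse one link         */
    r = l;
    l = next;
    n--;
  }
  __cost += 2;
  return r;
}
\end{lstlisting}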
---|
886 | |
---|
887 | \subsection{The CerCo compiler} |
---|
In CerCo we have developed a number of cost-preserving compilers based
on the labeling approach. Excluding an initial certified compiler for a `while'
language, all remaining compilers target realistic source languages---a pure
higher order functional language and a large subset of C with pointers, gotos
and all data structures---and real world target processors---MIPS and the
Intel 8051/8052 processor family. Moreover, they achieve a level of optimisation
that ranges from moderate (comparable to GCC level 1) to intermediate (including
loop peeling and unrolling, hoisting and late constant propagation). The
so-called \emph{Trusted CerCo Compiler} is the only one implemented in the
interactive theorem prover Matita~\cite{matita} and the only one whose cost
annotations have been certified. The distributed code is OCaml code extracted
from the Matita implementation. In the rest of
this section we focus only on the Trusted CerCo Compiler, which targets only
the C language and the 8051/8052 family, and which does not yet implement any
advanced optimisations. Its user interface, however, is identical to that of
the other versions, so that users can swap compilers to trade safety for
additional performance. In
particular, the Frama-C CerCo plugin works with all the compilers without
recompilation.
---|
905 | |
---|
The 8051/8052 microprocessor is a very simple one, with constant-cost
instructions. It was chosen to separate the issue of exact propagation of the
cost model from the orthogonal problem of the static analysis of object code,
which may require approximations or dependent costs.
---|
910 | |
---|
The (trusted) CerCo compiler implements the following optimisations: cast
simplification, constant propagation in expressions, liveness-analysis-driven
spilling of registers, dead code elimination, branch displacement, and
tunneling. The latter two optimisations are performed by our optimising
assembler \cite{correctness}. The back-end of the compiler works on
three-address instructions, which we preferred to static single assignment form
for the simplicity of the formal certification.
---|
918 | |
---|
The CerCo compiler is loosely based on the CompCert compiler \cite{compcert}, a
recently developed certified compiler from C to the PowerPC, ARM and x86
microprocessors. Contrary to CompCert, both the CerCo code and its
certification are open source. Some data structures and language definitions for
the front-end are directly taken from CompCert, while the back-end is a redesign
of a compiler from Pascal to MIPS used by Fran\c{c}ois Pottier for a course at
the \'Ecole Polytechnique.
---|
926 | |
---|
The main peculiarities of the CerCo compiler are the following.
\begin{enumerate}
\item All the intermediate languages include label emitting instructions to
implement the labeling approach, and the compiler preserves execution traces.
\item Traditionally compilers target an assembly language with additional
macro-instructions to be expanded before assembly; for CerCo we need to go all
the way down to object code in order to perform the static analysis. Therefore
we also integrated an optimising assembler and a static analyser.
\item In order to avoid implementing a linker and a loader, we do not support
separate compilation and external calls. Adding them would be transparent to
the labeling approach and should pose no new problems.
\item We target an 8-bit processor, in contrast to CompCert's 32-bit targets.
Targeting an 8-bit processor requires several changes and increases code size,
but it is not fundamentally more complex. The proof of correctness, however,
becomes much harder.
\item We target a microprocessor with a non-uniform memory model, which is
still often the case for microprocessors used in embedded systems and which is
becoming common again in multi-core processors. Therefore the compiler has to
keep track of data and move it between memory regions in the proper way.
Moreover, the size of pointers to different regions is not uniform. An
additional difficulty was that the space available for the stack in the
internal memory of the 8051 is tiny, allowing only a small number of nested
calls. To support full recursion, so that the CerCo tools can also be tested on
recursive programs, the compiler implements a stack in external memory.
\end{enumerate}
---|
951 | |
---|
952 | \subsection{Formal certification of the CerCo compiler} |
---|
We implemented the
CerCo compiler in the interactive theorem prover Matita and have formally
certified that the cost model induced on the source code correctly and precisely
predicts the object code behaviour. We actually induce two cost models, one for
time and one for stack space consumption. We show the correctness of the
prediction only for those programs that do not exhaust the available machine
stack space, a property that---thanks to the stack cost model---we can
statically analyse on the source code using our Frama-C tool. We take the
preservation of functional properties as an assumption; it is not itself
formally proved in CerCo. Other projects have
already certified the preservation of functional semantics in similar
compilers, and we have not attempted to directly repeat that work. In order to
complete the proof for non-functional properties, we have introduced a new
semantics for programming languages based on a new kind of structured
observables, together with the corresponding notions of forward similarity and
a study of the intensional consequences of forward similarity. We have also
introduced a unified representation for back-end intermediate languages that
was exploited to provide a uniform proof of forward similarity.
---|
970 | |
---|
The details of the proof techniques employed and a sketch of the proof can be
found in the CerCo deliverables and papers.
In this section we only hint at the correctness statement, which turned
out to be more complex than we expected at the beginning.
---|
976 | |
---|
\paragraph{The statement.}
Real time programs are often reactive programs that loop forever, responding to
events (inputs) by performing some computation followed by some action (output)
and a return to the initial state. For looping programs the overall execution
time is meaningless. The same holds for reactive programs that spend an
unpredictable amount of time in I/O. What is interesting is the reaction time,
which measures the time spent between I/O events. Moreover, we are interested
in predicting and ruling out programs that crash by running out of space on a
certain input.
---|
Therefore we need to look for a statement that talks about sub-runs of a
program. The most natural statement is that the time predicted on the source
code and the time spent on the object code by two corresponding sub-runs are
the same. To make this statement formal we must identify the corresponding
sub-runs and single out those that are meaningful.
The solution we found is based on the notion of measurability. We say that a
run has a \emph{measurable sub-run} when:
\begin{enumerate}
\item neither the prefix of the sub-run nor the sub-run itself exhausts the
stack space;
\item the numbers of function calls and returns in the sub-run are equal;
\item the sub-run does not perform any I/O;
\item the sub-run starts with a label emission statement and ends with a return
or another label emission statement.
\end{enumerate}
Stack usage is estimated using the stack usage model computed by the compiler.
---|
998 | |
---|
The statement that we formally proved is: for each C run with a measurable
sub-run, there exists an object code run with a sub-run, such that the
observables of the corresponding pairs of prefixes and sub-runs are the same,
and the time spent by the object code in its sub-run is the same as the time
predicted on the source code using the time cost model generated by the
compiler.
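Schematically, in our notation rather than in the formal Matita statement:
writing $R = p \cdot m$ for a run factored into a prefix $p$ and a sub-run $m$,
$\mathcal{O}(\cdot)$ for the observables of a (sub-)run, $t(\cdot)$ for the
object code execution time, and $\kappa(\cdot)$ for the source-level
prediction, the statement reads
\[
\forall R_C = p_C \cdot m_C\ \text{with}\ m_C\ \text{measurable}.\;
\exists R_O = p_O \cdot m_O.\;
\mathcal{O}(p_C) = \mathcal{O}(p_O) \,\wedge\,
\mathcal{O}(m_C) = \mathcal{O}(m_O) \,\wedge\,
t(m_O) = \kappa(m_C).
\]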
---|
We briefly discuss the constraints for measurability. Not exhausting the stack
space is a clear requirement for the meaningfulness of a run, because source
programs do not crash for lack of space while object code ones do. The
balancing of function calls and returns is a requirement for precision: the
labeling approach allows the scope of label emission statements to extend after
function calls to minimise the number of labels. Therefore a label pays for all
the instructions in a block, excluding those executed in nested function calls.
If the number of calls/returns is unbalanced, it means that there is a call we
have not returned to that could be followed by additional instructions whose
cost has already been taken into account. To make the statement true (but less
precise) in this situation, we could only say that the cost predicted on the
source code is a safe bound, not that it is exact. The last condition, on the
entry/exit points of a run, is used to identify sub-runs whose code corresponds
to a sequence of blocks that we can measure precisely. Any other choice would
start or end the run in the middle of a block, and we would again be forced to
weaken the statement, taking as a bound the cost obtained by counting in all
the instructions that precede the starting one in the block, or follow the
final one.
I/O operations can be performed in the prefix of the run, but not in the
measurable sub-run. Therefore we prove that we can predict reaction times, but
not I/O times, as it should be.
---|
1024 | |
---|
1025 | \section{Conclusions and future work} |
---|
1026 | |
---|
1027 | All the CerCo software and deliverables can be found on the CerCo homepage at~\url{http://cerco.cs.unibo.it}. |
---|
1028 | |
---|
The results obtained so far are encouraging and provide evidence that
it is possible to perform static time and space analysis at the source level
without losing accuracy, reducing the trusted code base and reconciling the
study of functional and non-functional properties of programs. The
techniques introduced seem to be scalable, cover both imperative and
functional languages, and are compatible with every compiler optimisation
we have considered so far.
---|
1036 | |
---|
To prove that compilers can keep track of optimisations
and induce a precise cost model on the source code, we targeted a simple
architecture that admits an execution-history-independent cost model.
The most important future work is dealing with hardware architectures
characterised by history-dependent stateful components, like caches and
pipelines. The main issue consists in assigning a parametric, dependent cost
to basic blocks that can later be transferred by the labeling approach to
the source code and represented in a meaningful way to the user. The dependent
labeling approach that we have studied seems a promising tool for achieving
this goal, but the cost model generated for a realistic processor could be too
large and complex to be exposed in the source code. Further study is required
to evaluate the technique on a realistic processor and to introduce early
approximations of the cost model to make the technique feasible.
---|
1050 | |
---|
Further future work includes improving the cost invariant
generation algorithms and the coverage of compiler optimisations, combining
the labeling approach with the type and effect discipline of~\cite{typeffects}
to handle languages with implicit memory management, and experimenting with
our tools in early development phases. Larger case studies are also necessary
to evaluate the CerCo prototype on realistic, industrial-scale programs.
---|
1057 | |
---|
---|
1059 | \bibliography{fopara13} |
---|
---|
1142 | |
---|
1143 | \end{document} |
---|