Changes between Version 1 and Version 2 of Proposal1.2


Timestamp: Jun 15, 2010, 1:18:56 PM
Author: sacerdot
== Progress beyond state of the art ==
Automatic verification of properties of software components has reached a level of maturity allowing complete correctness proofs of entire compilers, that is, of the semantic equivalence between the generated assembly code and its source program. For instance, in the framework of the [http://www.info.uni-karlsruhe.de/publications.php/id=213 Verifix Project] a compiler from a subset of Common Lisp to Transputer code was formally checked in PVS (see [Dold and Vialard]). [Strecker] and [Klein and Nipkow] certified bytecode compilers from a subset of Java to a subset of the Java Virtual Machine in Isabelle. In the same system, [Leinenbach et al.] formally verified a compiler from a subset of C to DLX assembly code. [Chlipala] recently wrote in Coq a certified compiler from the simply-typed lambda calculus to assembly language, making extensive use of dependent types. Perhaps the most advanced project is [http://pauillac.inria.fr/~xleroy/compcert/ Compcert], headed by Leroy, based on the use of Coq both for programming the compiler and proving its correctness. In particular, both the back-end ([Leroy06]) and the front-end ([Leroy and Tristan]) of an optimising compiler from a subset of C to PowerPC assembly have been certified in this way.
However, very little is known about the preservation of intensional properties of programs, and in particular about their (concrete) complexity. The theoretical study of the complexity impact of program transformations between different computational models has so far been confined to very foundational devices. Here we propose to address a concrete case of compilation from a typical high-level language to assembly. It is worth remarking that constant-time transformations between foundational models are unlikely to exist: for instance, coding a multitape Turing machine into a single-tape one may introduce a polynomial slow-down. Thus, complexity is architecture dependent, and the claim that one may pass from one language to another while preserving the performance of one's algorithms must be taken with due caution. In particular, surprising as it may be, very little is known about the complexity behaviour of compiled code with respect to its source; as a matter of fact, most industries producing robots or devices with strong temporal constraints (such as, e.g., photoelectric safety barriers) still program such devices in assembly.
The tacit assumption that the complexity of algorithms is preserved along compilation, while plausible under suitable assumptions, is not supported by any practical or theoretical study. For instance, a single register is usually used to point to activation records, implicitly limiting their number; one may devote more registers to this purpose, but unless their number is fixed a priori (hence fixing the size of the stack), one cannot expect to access data in activation records in constant time. In particular, the memory model adopted by Leroy assumes an infinite memory, where allocation requests always succeed; this clearly conflicts with the reality of embedded software, where one has to work within precise (often relatively small) memory bounds. If working in restricted space on one side allows us to properly weight memory access at unit cost, on the other side it introduces a subtle interplay between space complexity, time complexity and correctness that will be one of the crucial issues of the project.
Even admitting (as we hope to prove) that in a confined universe we may actually preserve complexity, the main interest of the project is in producing a (certified) computational cost (in terms of clock cycles) for all instruction slices of the source program with O(1) complexity, thus providing precise values for all constants appearing in the cost function for the source. This opens the possibility of computing time constraints for executable code by reasoning directly on the high-level input language. In particular, we are not aiming to help analyse the complexity (or termination) of programs (which has to be guessed by the user, as one guesses invariants in axiomatic semantics); rather, we shall build the necessary infrastructure to reflect a high-level, abstract complexity analysis of the source on a concrete instantiation of its target code.
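The kind of source-level cost reflection we aim at can be illustrated by a small executable sketch: a quicksort body instrumented with a global cycle counter. The per-path constants below (21, 6, 4, 5) are illustrative placeholders for the numbers a cost-annotating compiler would actually emit.

```c
#include <assert.h>

/* Hypothetical sketch: a global clock-cycle counter incremented with
   made-up per-path constants, mimicking compiler-emitted annotations. */
static unsigned long time_cost = 0;

static void swap(int t[], int i, int j) { int x = t[i]; t[i] = t[j]; t[j] = x; }

static void quick_rec(int t[], int l, int r) {
    if (l < r) {
        time_cost += 21;                     /* placeholder cost of the l < r path */
        int v = t[l], m = l;
        for (int i = l + 1; i <= r; i++) {
            time_cost += (t[i] < v) ? 4 : 5; /* placeholder per-iteration cost */
            if (t[i] < v) { m++; swap(t, i, m); }
        }
        swap(t, l, m);
        quick_rec(t, l, m - 1);
        quick_rec(t, m + 1, r);
    } else {
        time_cost += 6;                      /* placeholder cost of the trivial path */
    }
}
```

Running `quick_rec` both sorts the array and accumulates in `time_cost` a bound expressed through the placeholder constants; replacing them with certified, architecture-specific values is exactly the infrastructure the project aims to provide.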
Such an instantiation depends on the target architecture: for instance, some microcontrollers lack a multiplication instruction as a primitive operation, preventing us from counting such an operation at a fixed cost. Moreover, if we are interested in a really tight complexity measure, we cannot expect a uniform cost for input instructions since, due to register allocation and optimisations, their actual cost depends on their surrounding context. In other words, we have to face the non-compositional nature (in terms of the source structure) of most compiler techniques and optimisations.
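As an illustration of this architecture dependence, consider a target without a hardware multiplier: multiplication must then be compiled to a shift-and-add loop whose iteration count depends on the operand, so it cannot be charged a fixed O(1) cost. A minimal sketch (the function and its step counter are hypothetical, not taken from any concrete compiler):

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-add multiplication, as a compiler might emit it for a
   microcontroller lacking a multiply instruction. The step counter
   stands for the data-dependent number of iterations executed. */
static uint32_t soft_mul(uint32_t a, uint32_t b, unsigned *steps) {
    uint32_t acc = 0;
    *steps = 0;
    while (b != 0) {        /* one iteration per remaining bit of b */
        if (b & 1u) acc += a;
        a <<= 1;
        b >>= 1;
        (*steps)++;         /* cost grows with the bit-length of b */
    }
    return acc;
}
```

The same source expression `a * b` thus takes a different number of iterations for different values of `b`, which is precisely why a uniform per-instruction cost cannot be assumed.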
The Compcert project represents the current baseline for any future work on compiler certification, including the one we plan to do. We will improve on Compcert in two directions: by assuming a formal model where resources (memory) are constrained, and by preserving the complexity of O(1) operations, also tracing the way they are mapped to assembly so as to reflect actual computational costs on the source code. Both improvements greatly increase the exploitation potential, in particular in the domain of embedded systems and real-time computation.
=== The CerCo approach ===
The complexity of a program only depends on its control-flow structure, and in particular on its cycles (procedure calls are just a special case of cycles). Proving that a compiler preserves complexity amounts to proving that it preserves (up to local modifications, like loop unrolling, in-line expansion, etc.) the control-flow structure of the source (this requires, in turn, the preservation of semantics: the right conditions must be tested, and procedures must be called with the correct parameters) and, less trivially, that all other instructions are compiled into assembly code whose execution takes a bounded number of clock cycles (i.e. with O(1) complexity). The interest of the project lies in the possibility of computing these costs directly on the target code and then referring them back to the source program, making it possible to state precise and trusted temporal assertions about execution by reasoning on the source code.
As already mentioned, the main problem in the backward translation of costs from target code to source code is the fact that, apart from the overall control-flow structure, all remaining structure of the input is usually irremediably lost during compilation: optimisations can move instructions around, change the order of jumps, and in general perform operations that are far from compositional w.r.t. the high-level syntactic structure of the input program. So there is no hope of computing costs on an instruction-by-instruction basis in the source language, since the actual cost of the executable is not compositional in these figures. We have to find another, possibly coarser, level of granularity at which the source can be sensibly annotated with target costs.
We regard a C program as a collection of mutually defined procedures. The flow inside each procedure is determined by branching instructions like if-then-else; `while` loops can be regarded as a special kind of tail-recursive procedure. The resulting flow can thus be represented as a directed acyclic graph (DAG). We call a path of the directed acyclic graph an ''execution path''.
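To make the reading of `while` loops as tail-recursive procedures concrete, here is a sketch in which the inner loop of the quicksort program of this section is turned into a recursive function (`swap`, assumed here, exchanges two array cells):

```c
#include <assert.h>

static void swap(int t[], int i, int j) { int x = t[i]; t[i] = t[j]; t[j] = x; }

/* The loop
     while (i <= r) { if (t[i] < v) { m++; swap(t, i, m); } i++; }
   read as a tail-recursive procedure: the exit test becomes the base
   case and each recursive call is one iteration. */
static int scan(int t[], int i, int r, int v, int m) {
    if (i > r) return m;                   /* loop exit */
    if (t[i] < v) { m++; swap(t, i, m); }  /* loop body */
    return scan(t, i + 1, r, v, m);        /* next iteration (tail call) */
}
```

Under this reading every iteration is a procedure call, so loops introduce no control flow other than calls and branches, and each procedure body is indeed a DAG.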
    116 As already mentioned, the main problem in the backward translation of
    117 costs from target code to source code is the fact that, apart from the
    118 overall control flow structure, all remaining structure of the input
    119 is usually irremediably lost during compilation: optimizations can
    120 move instructions around, change the order of jumps, and in general
    121 perform operations that are far from compositional w.r.t.\ the high
    122 level syntactic structure of the input program.  So there is no hope
    123 to compute costs on an instruction-by-instruction basis of the source
    124 language, since the actual cost of the executable is not compositional
    125 in these figures. We have to find another, eventually coarser, level
    126 of granularity where the source can be sensibly annotated by target
    127 costs.
'''Figure 2:''' Quicksort
{{{
void quicksort(int t[], int l, int r) {
  if (l < r) {
    int v = t[l];
    int m = l;
    int i = l + 1;
    while (i <= r) {
      if (t[i] < v) { m++; swap(t, i, m); }
      i++;
    }
    swap(t, l, m);
    quicksort(t, l, m - 1);
    quicksort(t, m + 1, r);
  }
}
}}}
As a simple example, consider the quicksort program of Fig. 2. This algorithm performs in-place sorting of the input array `t`, whose bounds are `l` and `r`; initially `l` is expected to be zero, while `r` is the length of the array minus one. The outermost conditional terminates the recursion when the bounds of the array are illegal (sorting an empty array ends the recursive behaviour of the algorithm). The variable `v` is the so-called pivot: a selected element of the array that will be compared with all other elements. Bigger elements are moved (by the swap function) to the end of the array (the upper part), while smaller elements are placed at the beginning of the array (the lower part). Then the pivot is placed between the lower and the upper part of the array, in position `m`, its position in the resulting sorted array; all elements before the pivot are smaller and all elements following it are bigger. The algorithm completes the sorting with recursive calls on the lower and on the upper parts of the array.
    158 As a simple example, consider the quicksort program of
    159 Fig.~\ref{code}.  This algorithm performs in-place sorting of input
    160 array $t$ whose bounds are $l$ and $r$; initially $l$ is expected to
    161 be zero, while $r$ is the length of the array minus one. The outermost
    162 conditional terminates when the bounds of the array are illegal.
    163 (Sorting an empty array will end the recursive behaviour of the
    164 algorithm.) The variable $v$ is the so called pivot: a selected
    165 element of the array that will be compared with all other elements.
    166 Bigger elements will be moved (by the swap function) to the end of the
    167 array (the upper part), while smaller elements are placed at the
    168 beginning of the array (the lower part).  Then the pivot is placed
    169 between the lower and the upper part of the array, in position $m$,
    170 its position in the resulting sorted array; all elements before the
    171 pivot are smaller and all elements following it are bigger.  The
    172 algorithm completes the sorting with recursive calls on the lower and
    173 on the upper parts of the array.
In the body of the `quicksort` procedure there are only two execution paths, corresponding to the two cases `l < r` and `l >= r`. The latter is a trivial path, immediately leading to termination. The former leads to the while loop (which is regarded as a procedure call), the call to swap, and the two recursive calls. Similarly, the body of the while loop is composed of two paths, corresponding to the two conditions `i <= r` and `i > r`.
All operations performed along any of these paths take some constant time `c`. The complexity magnitude of the program only depends on the loops and recursive calls met along its execution paths, not on their associated constants. On the other hand, if we want to give tight performance bounds on the execution time, we have to compute the real constants on the executable.
The compiler must be able to return a set of pairs `(p_i, c_i)`, where each `p_i` is an execution path and `c_i` is its actual cost (a more flexible result would consist in returning pairs `(p_i, a_i)`, where `a_i` is the sequence of assembly instructions corresponding to `p_i`; this would allow space to be taken into consideration, as well as time). It is then up to the user to guess (and to prove, assisted by interactive tools) the complexity of the program: the compiler only provides the infrastructure required to map a complexity analysis on the source into a faithful analogue on the target. This approach looks compatible with most local optimisations. Moreover, since we work on a cycle-by-cycle (procedure-by-procedure) basis, the approach should scale up well.
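A minimal sketch of what such a set of pairs could look like as a data structure; the representation is hypothetical, while the two costs (6 and 21 cycles) are the ones computed for the quicksort body in this section:

```c
#include <assert.h>

/* Hypothetical representation of the compiler's output: each execution
   path p_i paired with its cycle cost c_i (called functions excluded). */
struct path_cost {
    const char *condition;  /* algebraic condition selecting the path */
    unsigned    cycles;     /* cost of the path in clock cycles */
};

/* The two execution paths of the quicksort body and their costs. */
static const struct path_cost quicksort_paths[] = {
    { "l >= r", 6  },
    { "l < r",  21 },
};

/* One possible client of this output: the worst-case cost of a single
   execution of the procedure body, taken as the maximum over paths. */
static unsigned worst_case(const struct path_cost *p, int n) {
    unsigned w = 0;
    for (int i = 0; i < n; i++)
        if (p[i].cycles > w) w = p[i].cycles;
    return w;
}
```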
=== User interaction flow ===
The left part of Fig. 3 shows the interaction diagram for the final user of the system (the right part is a planned case study for a possible automation of the process and will be discussed later).
'''Figure 3:''' Interaction and automation diagrams
'''Figure 4:''' Cost annotated C code (generated by the compiler)
{{{
void quick_rec(int t[], int l, int r) {
  /* Cost annotation for quick_rec body (1 cycle only)
     @ if (l<r) time += 21;
     @ if (l>=r) time += 6; */
  if (l < r) {
    int v = t[l];
    int m = l;
    int i = l + 1;
    while (i <= r) {
      /* Cost annotation for while (1 cycle only)
         @ if (t[i] < v) time += 4; else time += 5; */
      if (t[i] < v) { m++; swap(t, i, m); }
      i++;
    }
    swap(t, l, m);
    quick_rec(t, l, m - 1);
    quick_rec(t, m + 1, r);
  }
}
}}}
The interaction is done in several steps. We illustrate them using the quicksort program above.
 1. The user writes her code in C (see Fig. 2) and compiles it with our CerCo (Certified Complexity) compiler.
 1. The CerCo compiler outputs both the object code and an annotated copy of the C source (Fig. 4).
'''Figure 5:''' Automatic inference of cost annotations from assembly code

C source:
{{{
void quicksort(t,l,r) {
  if (l < r) {
    int i = l + 1;
    int m = l;
    int v = t[l];
    while (i <= r) {
      if (t[i] < v) {
        m++;
        swap(t, i, m); }
      i++;}
    swap(t, l, m);
    quicksort(t, l, m - 1);
    quicksort(t, m + 1, r);
}
}}}

Pseudo-Assembly code:
{{{
24: r <- r3
29: l <- r2
34: cmp l r
36: t <- r1
3a: jump c4 if l >= r
40: i <- l + 1
44: r8 <- t
48: r7 <- l
4b: m <- l

  (while loop)
  4e: cmp i r
  53: v <- *(r8 + r7 * 4)
  57: jump 97 if i > r
  59: r7 <- i
  5c: r9 <- r8 + r7 * 4
  60: jump 6e
  62: i <- i + 0x1
  65: r9 <- r9 + 0x4
  69: cmp i r
  6c: jump 92 if i > r
  6e: cmp v *r9
  72: jump 62 if v <= r9
  74: r1 <- t
  78: m <- m + 0x1
  7c: r2 <- i
  7e: r3 <- r12d
  81: i <- i + 0x1
  84: r9 <- r9 + 0x4
  88: call swap
  8d: cmp i r
  90: jump 6e if i <= r
  92: r6 <- m + 0x1

97: r1 <- t
9b: r3 <- m
9e: r2 <- l
a1: call swap
a6: r1 <- t
aa: r3 <- m - 0x1
af: r2 <- l
b2: call quicksort
bc: l <- r6
bf: call quicksort
c4: ret
}}}

Execution Paths:
{{{
l >= r:
24: r <- r3
29: l <- r2
34: cmp l r
36: t <- r1
3a: jump c4 if l >= r
c4: ret

total: 6 clock cycles

l < r:
24: r <- r3
29: l <- r2
34: cmp l r
36: t <- r1
3a: jump c4 if l >= r
40: i <- l + 1
44: r8 <- t
48: r7 <- l
4b: m <- l
(while loop)
97: r1 <- t
9b: r3 <- m
9e: r2 <- l
a1: call swap
(swap)
a6: r1 <- t
aa: r3 <- m - 0x1
af: r2 <- l
b2: call quicksort
(quicksort)
bc: l <- r6
bf: call quicksort
(quicksort)
c4: ret

total: 21 clock cycles + function calls
}}}
  Each loop and function body is annotated with the cost of one iteration, along all its possible execution paths. The cost is expressed as a function of the state of the program, which comprises the value of every variable. (In the example, we use a textual annotation for simplicity, but we expect to produce a more structured output.) The leftmost column of Fig. 5 shows the original source code. Colours are used to relate source code statements with their respective (human readable) assembly instructions, reported in the central column. That assembly was produced by gcc with a moderate level of optimisations for an Intel 386 family microprocessor. We used `l`, `r`, `t`, `v` and `m` to denote the locations (registers and/or stack frame temporaries) in which the corresponding C variables are placed, while the various `rN` are other registers or temporaries that have no direct mapping to C. The calling convention puts the first three parameters in `r1`, `r2` and `r3`, and it is up to the callee to possibly store them in local temporaries. Assignment is denoted with `<-`, addition and multiplication with `+` and `*`; the jump instruction is followed by the target address and, when the jump is conditional, by a C-like expression (whose evaluation is performed early by the `cmp` instruction, which sets a CPU flag recording the result of the comparison). The only tricky expression is `*(r8 + r7 * 4)`, which exploits an advanced addressing mechanism corresponding to array indexing (4 is the size of an array cell in bytes, `r7` is the index and `r8` is the address at which the array starts); it corresponds to the C expression `t[l]` that computes the pivot. The rightmost column shows two possible execution paths, with a precise estimation of their cost (here 6 and 21 CPU cycles plus the cost of function calls) and the algebraic conditions characterizing these paths. More precisely:
    * The CerCo compiler avoids intra-procedural optimisations and loop optimisations that may change the number of iterations performed in a non-trivial way.
    * Some intra-procedural or loop optimisations (like the `while` to `repeat` pre-hoisting optimisation applied by gcc in Fig. 5) can be allowed, provided that the compiler records them precisely.
    * Once the assembly code is produced, the assembly-level control flow graph is analysed in order to compute the cost of execution paths. Fig. 5 shows two of them in the rightmost column; the analysis of the `while` loop has been omitted, but is similar.
 1. The user computes (by hand or semi-automatically) the complexity invariants of each cycle and (recursive) function, and adds them to the C code as special comments (Fig. 6).
 '''Figure 6:''' Invariants annotated C code. The invariants are user provided.
 '''Figure 7:''' Complexity obligations (automatically generated). The user should prove every complexity obligation.
  The quicksort complexity invariant states the maximum number of clock cycles required by its execution on an array delimited by `l` and `r`. Since the procedure can be called with wrong indices (`l > r`), the formula has to take into account that the difference `r - l` could be negative, using the `max` function to raise that difference to zero when needed. The literature suggests that this quicksort implementation (where the pivot `v` is chosen deterministically) has a quadratic complexity in the worst case. Cleaning the formula of multiplicative and additive constants, one obtains the expected quadratic asymptotic complexity. The coefficients are those returned by the cost-annotating compiler. Similarly, the user has to give a complexity invariant for the inner while loop.
     59  1. The user- and compiler-annotated C code is fed into an already existing tool (in this example, Caduceus, [[[node60.html#caduceus|Filliâtre and Marché]]]) that produces one complexity obligation for each execution path (Fig. [[#obligations|7]]).
     60  1. The user should prove all complexity obligations. The proofs are the certificate that the user-provided complexity invariant is correct. In many cases, the obligations can be proved automatically using a general-purpose automatic theorem prover or an ad-hoc procedure. For instance, to prove the complexity obligations of Fig. [[#obligations|7]], we must show that a system of inequations holds, which may be done automatically. When an automatic proof is not possible, the user can resort to an interactive proof.
    29361
    294 \colorbox{white}{while loop}\\
    295 \tiny{
    296 \colorbox{colf}{4e: cmp i r}\\
    297 \colorbox{cole}{53: v <- *(r8 + r7 * 4) }\\
    298 \colorbox{colf}{57: jump 97 if i > r}\\
    299 \colorbox{colg}{59: r7 <- i}\\
    300 \colorbox{colg}{5c: r9 <- r8 + r7 * 4}\\
    301 \colorbox{colf}{60: jump 6e}\\
    302 \colorbox{colj}{62: i <- i + 0x1}\\
    303 \colorbox{colg}{65: r9 <- r9 + 0x4}\\
    304 \colorbox{colf}{69: cmp i r}\\
    305 \colorbox{colf}{6c: jump 92 if i > r}\\
    306 \colorbox{colg}{6e: cmp v *r9}\\
    307 \colorbox{colg}{72: jump 62 if v <= r9}\\
    308 \colorbox{coli}{74: r1 <- t}\\
    309 \colorbox{colh}{78: m <- m + 0x1}\\
    310 \colorbox{coli}{7c: r2 <- i}\\
    311 \colorbox{coli}{7e: r3 <- r12d}\\
    312 \colorbox{colj}{81: i <- i + 0x1}\\
    313 \colorbox{colg}{84: r9 <- r9 + 0x4}\\
    314 \colorbox{coli}{88: call swap}\\
    315 \colorbox{colf}{8d: cmp i r}\\
    316 \colorbox{colf}{90: jump 6e if i <= r}\\
    317 \colorbox{coln}{92: r6 <- m + 0x1}\\
    318 }
    319 \end{minipage}
    320 }
    321 \vspace{0.4em}
    322 \\
     62  The right part of Fig. [[#interaction|3]] describes a planned case study for the automation of the complexity proof. We start with a synchronous program which is compiled to C code. The CerCo compiler then produces suitable cost annotations which are used by an invariant synthesizer to build complexity assertions on the C code. The synthesizer can take advantage of the high level control flow information contained in the source synchronous program. The deductive platform (Caduceus) generates complexity obligations which are passed to an ad-hoc proof generator to produce a machine-checked proof from which we can extract a certified bound on the reaction time of the original synchronous program. The proof generator can also take advantage of the high level information coming from the original source program, and user interaction can be used to drive the generator in critical cases.
    32363
    324 \colorbox{coll}{97: r1 <- t}\\
    325 \colorbox{coll}{9b: r3 <- m}\\
    326 \colorbox{coll}{9e: r2 <- l}\\
    327 \colorbox{coll}{a1: call swap}\\
    328 \colorbox{colm}{a6: r1 <- t}\\
    329 \colorbox{colm}{aa: r3 <- m {}- 0x1}\\
    330 \colorbox{colm}{af: r2 <- l}\\
    331 \colorbox{colm}{b2: call quicksort}\\
    332 \colorbox{coln}{bc: l <- r6}\\
    333 \colorbox{coln}{bf: call quicksort}\\
    334 \colorbox{cola}{c4: ret}
    335 }}
    336 \end{minipage}
    337 &
    338 \begin{minipage}[t]{.3\textwidth}
    339 \textbf{Execution Paths}\\~\\
    340 \texttt{\scriptsize{
    341 \colorbox{white}{l >= r $\rightarrow$ }\\
    342 \colorbox{cola}{24: r <- r3}\\
    343 \colorbox{cola}{29: l <- r2}\\
    344 \colorbox{colb}{34: cmp l r}\\
    345 \colorbox{cola}{36: t <- r1}\\
    346 \colorbox{colb}{3a: jump c4 if l >= r}\\
    347 \colorbox{cola}{c4: ret}\\
    348 \colorbox{white}{~}\\
    349 \colorbox{white}{total: 6 clock cycles}\\
    350 \colorbox{white}{~}\\
    351 \colorbox{white}{l < r $\rightarrow$ }\\
    352 \colorbox{cola}{24: r <- r3}\\
    353 \colorbox{cola}{29: l <- r2}\\
    354 \colorbox{colb}{34: cmp l r}\\
    355 \colorbox{cola}{36: t <- r1}\\
    356 \colorbox{colb}{3a: jump c4 if l >= r}\\
    357 \colorbox{colc}{40: i <- l + 1}\\
    358 \colorbox{cole}{44: r8 <- t}\\
    359 \colorbox{cole}{48: r7 <- l}\\
    360 \colorbox{cold}{4b: m <- l}\\
    361 \colorbox{white}{while loop}\\
     64[[BR]]
    36265
    363 \colorbox{coll}{97: r1 <- t}\\
    364 \colorbox{coll}{9b: r3 <- m}\\
    365 \colorbox{coll}{9e: r2 <- l}\\
    366 \colorbox{coll}{a1: call swap }\\
    367 \colorbox{white}{swap }\\
    368 \colorbox{colm}{a6: r1 <- t}\\
    369 \colorbox{colm}{aa: r3 <- m {}- 0x1}\\
    370 \colorbox{colm}{af: r2 <- l}\\
    371 \colorbox{colm}{b2: call quicksort}\\
    372 \colorbox{white}{quicksort}\\
    373 \colorbox{coln}{bc: l <- r6}\\
    374 \colorbox{coln}{bf: call quicksort}\\
    375 \colorbox{white}{quicksort}\\
    376 \colorbox{cola}{c4: ret}\\
    377 \colorbox{white}{~}\\
    378 \colorbox{white}{total: 21 clock cycles + function calls}
    379 }}
    380 \end{minipage}
    381 \end{tabular}
    382 \end{center}
    383 \caption{\label{runs}Automatic inference of cost annotations from assembly code}
    384 \end{figure}
    385 Each loop and function body is annotated with the cost of one
    386 iteration, along all its possible execution paths. The cost is
    387 expressed as a function of the state of the program, which comprises
    388 the value of every variable.  (In the example, we use a textual
    389 annotation for simplicity, but we expect to produce a more structured
    390 output.)
     66=== Certification: tools and techniques ===
    39167
     68  In order to trust the process described in the previous section, we need to trust the CerCo compiler. That is, we need to fulfil the following requirements:
    39269
    393 The leftmost column of Fig.~\ref{runs} shows the original source code.  Colours
    394 are used to relate source code statements with their respective (human
    395 readable) assembly instructions, reported in the central column.  That assembly
    396 was produced by gcc\footnote{GNU compiler collection, version 4.2} with a
    397 moderate level of optimizations for an Intel 386 family microprocessor.  We
    398 used $l$, $r$, $t$, $v$ and $m$ to mention locations (registers and/or stack
    399 frame temporaries) in which the corresponding C variables are placed, while $r1,
    400 \ldots, r9$ are other register or temporaries that have no direct mapping to C.
    401 The calling convention puts the first three parameters in $r1$, $r2$
    402 and $r3$, and it is up to the callee to eventually store them in local
    403 temporaries.  Assignment is denoted with \texttt{<-}, addition and
    404 multiplication with
    405 \texttt{+} and \texttt{*}; the jump instruction is followed by the target
    406 address, and when the jump is conditional a C like expression follows (but its
    407 evaluation is performed early by the \texttt{cmp} instruction, that sets a CPU
    408 flag recording the result of the comparison). The only tricky expression is
    409 ``\texttt{*(r8 + r7 * 4)}'', that exploits an advanced addressing mechanism
    410 corresponding to array indexing (4 is the size of an array cell in bytes, $r7$
    411 is the index and $r8$ is the address at which the array starts). It amounts to
    412 the C statement ``\texttt{t[l]}'' that computes the pivot.
     70  1. the compiled assembly program must respect the semantics of the C program;
     71  1. the number of iterations performed by the C and assembly programs must be the same, i.e. the compiler must preserve the complexity;
     72  1. the cost annotations generated by the compiler must correspond to the number of clock cycles actually spent by the hardware.
    41373
    414 The rightmost column shows two possible execution paths, with a
    415 precise estimation of their cost (here 6 and 21 CPU cycles plus the
    416 cost of function calls) and the algebraic conditions characterizing
    417 these paths.
     74For this reason, we plan to[[footnode.html#foot438|^5^]]:
    41875
     76  1. develop an untrusted CerCo compiler prototype in a high-level programming language;
     77  1. provide an executable formal specification of the target microprocessor;
     78  1. provide an executable formal specification of the C language;
     79  1. develop an executable version of the CerCo compiler in a language suitable for formal correctness proofs;
     80  1. give a machine-checkable proof that the latter implementation satisfies all the requirements mentioned above.
    41981
    420 More precisely
    421    \begin{itemize}
    422      \item The CerCo compiler avoids intra-procedural optimisations and loop
    423            optimisations that may change the number of
    424            iterations performed in a non trivial way.
    425      \item Some intra-procedural or loop optimisations (like the
    426            \texttt{while} to \texttt{repeat} pre-hoisting optimisation applied
    427            by gcc in Fig.~\ref{runs})
    428            can be allowed, provided that the compiler records them precisely.
    429      \item Once the assembly code is produced, the assembly-level control flow
    430            graph is analysed in order
    431            to compute the cost of execution paths. Fig.~\ref{runs}
    432            shows two of them in the rightmost column; the analysis  of
    433            the \texttt{while} loop has been omitted, but is similar.
    434    \end{itemize}
    435  \item The user computes (by hand or semi-automatically) the complexity
    436        invariants of each cycle and (recursive) function, and he adds them
    437        to the C code as special comments\footnote{Again, more interactive forms
    438        of annotations can be considered.} (Fig.~\ref{invariants}).
    439 \begin{figure}[p]
    440 \begin{quote}
    441 \begin{verbatim}
     82  The untrusted compiler will be written in the [http://caml.inria.fr/ OCaml] programming language, developed and distributed by INRIA, France's national research institute for computer science. OCaml is a general-purpose programming language, especially suited for the symbolic manipulation of tree-like data structures of the kind typically used during compilation. It is a simple and efficient language, designed for program safety and reliability and particularly suited for rapid prototyping.
    44283
    443 /* User provided invariant for the quicksort function.
    444    @ ensures (time <= \old(time) + max(0,((1+r)-l))*(max(0,(r-l))*5+21) + 6) */
    445 void quick_rec(int t[], int l, int r) {
    446   /* Cost annotation for quick_rec body (1 cycle only)
    447      @ if (l<r) time += 21;
    448      @ if (l>=r) time += 6; */
    449   if (l < r) {
    450     int v = t[l];
    451     int m = l;
    452     int i = l + 1;
    453     /* @ label L
    454        User provided invariant for the while loop. To perform i iterations
    455        we need at most (i-l)*5 clock cycles.
    456        @ invariant time <= \at(time,L) + (i-l) * 5 &&
    457        Additional invariants of the loop
    458        @ l <= m && l+1 <= i && m <= r && i <= r+1 && m < i && l < r */
    459     while (i <= r) {
    460       /* Cost annotation for while (1 cycle only)
    461          @ if (t[i] < v) time += 4; else time += 5; */
    462       if (t[i] < v) { m++; swap(t, i, m); }
    463       i++;
    464     }
    465     swap(t, l, m);
    466     quick_rec(t, l, m - 1);
    467     quick_rec(t, m + 1, r);
    468   }
    469 }
    470 \end{verbatim}
    471 \end{quote}
    472 \vspace{-15pt}
    473 \caption{\label{invariants}Invariants annotated C code. The invariants are
    474  user provided.}
    475 \end{figure}
    476 \begin{figure}[p]
    477 \begin{center}
    478 \framebox[0.9\linewidth]{
    479 \begin{minipage}[t]{0.8\linewidth}
    480 \vspace{0.5em}
     84  For the certification of the compiler we plan to use the [http://matita.cs.unibo.it Matita] Interactive Theorem Prover, developed at the Computer Science Department of the University of Bologna. Matita is based on the Calculus of Inductive Constructions, the same foundational paradigm as INRIA's Coq system, and it is partially compatible with it. It adopts a tactic-based editing mode. Proof objects (XML-encoded) are produced for storage and exchange. Its graphical interface, inspired by CtCoq and Proof General, supports high-quality bidimensional rendering of proofs and formulae, transformed on the fly to MathML markup. In spite of its young age it has already been used for complex formalizations, including non-trivial results in Number Theory and problems from the POPLmark challenge [[[node60.html#poplmark|POPLmark]]]. An executable specification of all models of Freescale 8-bit ALUs (families HC05/HC08/RS08/HCS08) and memories (RAM, ROM, Flash) has also already been formalised in Matita.
    48185
    482 \emph{First complexity obligation}: case where $l < r$.
    483 The number of clock cycles spent in one iteration
    484 of the recursive function must be greater or equal than the sum of the number
    485 of clock cycles spent in the function body, in the \texttt{while} loop and
    486 in the two recursive calls:
    487 $$\begin{array}{l}
    488   \forall i,l,m,r.~ l < r \land l+1 \leq  r \land r < i \leq r+1 \land
    489   m \leq r \land l+1 \leq i \land l \leq m \Rightarrow\\
    490   (i - l) * 5 + 1 + \\
    491   max(0,1 + r - (m + 1)) * (max(0,r - (m + 1)) * 5 + 21) + 6 + \\
    492   max(0,1 + (m - 1) - l) * (max(0,m - 1 - l) * 5 + 21) + \\
    493   6 \\
    494   \leq
    495    max(0,1 + r - l) * (max(0,r - l) * 5 + 21) + 6
    496  \end{array}
    497 $$
     86  For the management of cost annotations and proof-obligation synthesis we plan to interface with the [http://caduceus.lri.fr/ Caduceus] verification tool for C programs, developed by the Computer Science Laboratory of the University of Paris-Sud. Caduceus is built on top of Why, a general-purpose verification condition generator exploiting Dijkstra's weakest-precondition calculus. The Why tool allows the declaration of logical models (types, functions, predicates and axioms) that can be used in programs and annotations; moreover, it can be interfaced to a wide set of existing provers for the verification of the resulting conditions. In particular, we will rely on [http://alt-ergo.lri.fr Alt-Ergo], which includes a decision procedure for linear arithmetic.
    49887
    499 %\item
    500 \emph{Second complexity obligation}: case $l \leq r$. The number of clock cycles spent
    501 in one iteration must be greater than the number of clock cycles required when
    502 the \texttt{if} statement fails.
    503 $$l \leq r \Rightarrow 6 \leq  max(0,1 + r - l) * (max(0,r - l) * 5 + 21) + 6
    504 $$
     88  We plan to support almost every ANSI C construct (functions, pointers, arrays and structures) and data type (integers of various sizes) except function pointers, explicit jumps (`goto`) and pointer aliasing or casting. These features do not seem to pose major additional challenges for the technology we plan to develop, but could be time-consuming to implement, formalize and prove correct. Moreover, they are not currently supported by Caduceus, posing additional problems for the development of the proof-of-concept prototype of the whole application. We could also support floating-point numbers, but the kind of micro-controller we are targeting (8- and 16-bit processors) does not usually provide instructions to process them efficiently. Moreover, floating-point numbers are seldom used in embedded software for that very reason, making them a feature of ANSI C of limited interest in our scenario.
    50589
    506 %\end{itemize}
     507 A few other complexity obligations are automatically generated but, being
    508 trivial, are automatically proved by the system.\\
    509 \end{minipage}
    510 }
    511 \caption{\label{obligations}Complexity obligations (automatically generated).
    512  The user should prove every complexity obligation.}
    513 \end{center}
    514 \end{figure}
    515        \noindent
    516        The quicksort complexity invariant states the maximum
    517        number of clock cycles required by its execution on an array delimited
    518        by $l$ and $r$. Since the procedure can be called with wrong indices
     90  We stress that the proposed approach handles cost annotations for C programs as a special case of standard annotations for imperative programs, whose management we plan to automatize with tools such as Caduceus. As explained above, the choice of the tool does have an impact on the fragment of ANSI C constructs we will handle, and future advances in this domain could enlarge the fragment of ANSI C under consideration. On the other hand, tools like Caduceus pose no limitation on the invariants, which can be freely described in a computational and very expressive logic. Hence, every technique to automatically infer invariants and cost annotations can be exploited (and often automatized) in Caduceus.
    51991
    520        ($r < l$) the formula has to take into account that the $r-l$ could
    521        be negative using the $max$ function to raise that difference to zero
    522        when needed.
    523        The literature suggests that this quicksort implementation (where
    524        the pivot $v$ is chosen deterministically) has a quadratic complexity
    525        in the worst case. Cleaning up the
    526        formula from multiplicative and additive constants one obtains the
    527        expected asymptotic complexity $(r-l)^2$.
    528 
    529        The coefficients are those returned by the cost-annotating compiler.
    530        Similarly, the user has to give a complexity invariant for the
    531        inner while loop.
    532  \item The user and compiler annotated C code is fed into an already existing
    533        tool (in this example, Caduceus, \cite{caduceus}) that produces
    534        one complexity obligation for each execution path (Fig.~\ref{obligations}).
    535  \item The user should prove all complexity obligations. The proofs
    536  are the certificate that the user provided complexity invariant is
    537  correct.  In many cases, the obligations can be proved automatically
    538  using a general purpose automatic theorem prover or an ad-hoc
    539  procedure.  For instance, to prove the complexity obligations of
    540  Fig.~\ref{obligations}, we must show that a system of inequations
    541  holds, which may be done automatically. When an automatic proof is
    542  not possible, the user can resort to an interactive proof.
    543 \end{enumerate}
    544 
    545 
    546 The right part of Fig.~\ref{interaction} describes a planned case study for the
    547 automation of the complexity proof. We start with a synchronous program
    548 which is compiled to C code. The CerCo compiler then produces suitable
    549 cost annotations which are used by an invariant synthesizer to build
    550 complexity assertions on the C code. The synthesizer can take advantage of the
    551 high level control flow information contained in the source synchronous program.
    552 The deductive platform (Caduceus) generates complexity obligations which are passed to an ad-hoc proof generator to produce a machine-checked proof from which
    553 we can extract a certified bound on the reaction time of the original
    554 synchronous program. The proof generator can also take advantage of the high
    555 level information coming from the original source program, and user interaction
    556 can be used to drive the generator in critical cases.
    557 
    558 \subsubsection{Certification: tools and techniques}
    559 In order to trust the process described in the previous section,
    560 we need to trust the CerCo compiler.
    561 I.e.\ we need to fulfil the following requirements:
    562 \begin{enumerate}
    563 \item the compiled assembly program respects the semantics of the C program
    564 \item we need to know that the number of iterations performed by the C
    565       and assembly programs are the same, i.e. to prove that the
    566       compiler preserves the complexity
    567 \item we need to prove that the cost annotations generated by the compiler
    568       really correspond to the number of clock cycles spent by the hardware
    569 \end{enumerate}
    570 
    571 
    572 For this reason, we plan to\footnote{See next section for a
    573 more articulated description of the methodology}:
    574 \begin{enumerate}
    575 \item develop an untrusted CerCo compiler prototype in high level
    576       programming language;
    577 \item provide an executable
    578       formal specification of the target microprocessor;
    579 \item provide an executable
    580       formal specification of the C language;
    581 \item develop an executable version of the CerCo compiler in a language
    582       suitable to formal correctness proofs;
    583 \item give a machine checkable proof that the latter implementation
    584       satisfies all the requirements mentioned above.
    585 \end{enumerate}
    586 
    587 The untrusted compiler will be written in the
    588 \href{http://caml.inria.fr/}{OCaml} programming
    589 language, developed and distributed by \nobreak{INRIA}, France's
    590 national research institute for computer science. OCaml is a
    591 general-purpose programming language, especially suited for symbolic
    592 manipulation of tree-like data structures, of the kind typically used
    593 during compilation.  It is a simple and efficient language, designed
    594 for program safety and reliability and particularly suited for rapid
    595 prototyping.
    596 
    597 For the certification of the compiler we plan to use the
    598 \href{http://matita.cs.unibo.it}{Matita}
    599 Interactive Theorem Prover, developed at the Computer Science
    600 Department of the University of Bologna.
    601 Matita is based on the Calculus of Inductive Constructions,
    602 the same foundational paradigm as INRIA's Coq system,
    603 and it is partially compatible with it.
    604 It adopts a tactic based editing mode.
    605 Proof objects (XML-encoded) are produced for storage and exchange.
    606 Its graphical interface, inspired by CtCoq and Proof General,
    607 supports high quality bidimensional rendering of proofs and formulae,
    608 transformed on-the-fly to MathML markup.
    609 In spite of its young age it has already been used for complex
    610 
    611 formalizations, including non trivial results in Number Theory
    612 and problems from the Poplmark challenge~\cite{poplmark}.
    613 An executable specification for all models of Freescale 8bit
    614 ALUs (Families HC05/HC08/RS08/HCS08) and memories (RAM, ROM, Flash) has
    615 also already been formalised in Matita.
    616 
    617 For the management of cost annotations and proof obligation synthesis
    618 we plan to interface with the \href{http://caduceus.lri.fr/}{Caduceus}
    619 verification tool for C programs, developed by the Computer Science
    620 Laboratory of the University of Paris sud. Caduceus is built on top of
    621 Why, a general-purpose verification condition generator, exploiting
    622 Dijkstra's weakest precondition calculus. The Why tool allows the
    623 declaration of logical models (types, functions, predicates and
    624 axioms) that can be used in programs and annotations; moreover, it can
    625 be interfaced to a wide set of existing provers for verification of
    626 the resulting conditions. In particular, we will rely on
    627 \href{http://alt-ergo.lri.fr}{Alt-Ergo}, which includes a decision
    628 procedure for linear arithmetic.
    629 
    630 
    631 We plan to support almost every ANSI C construct (functions, pointers, arrays
    632 and structures) and data-types (integers of various sizes) except function
    633 pointers, explicit jumps (goto) and pointer aliasing or casting.
    634 These features do not seem to pose major additional challenges to the
    635 technology we plan to develop, but could be time expensive to implement,
    636 formalize and prove correct. Moreover, they are not currently supported
    637 by Caduceus, posing additional problems for the development
    638 of the proof-of-concept prototype of the whole application. We could also
    639 support floating point numbers, but the kind of micro-controller
    640 we are targeting (8 and 16 bits processors) do not usually provide
    641 instructions to efficiently process them. Moreover floating point
    642 numbers are seldom used in embedded software for that very reason,
    643 making them a feature of ANSI C of limited interest in our scenario.
    644 
     645 We stress that the proposed approach handles cost annotations for C programs
    646 as a special case of standard annotations for imperative programs whose
    647 management we plan to automatize with tools such as Caduceus. As explained
    648 above, the choice of the tool does have an impact on the fragment of ANSI C
    649 constructs we will handle, and future advances in this domain could enlarge
    650 the fragment of ANSI C under consideration. On the other hand, tools like
    651 Caduceus pose no limitation on the invariants, which can be freely described in
    652 a computational and very expressive logic. Hence, every technique to
    653 automatically infer invariants and cost annotations can be exploited (and
    654 often automatized) in Caduceus.
    655 
     92[[BR]]