\documentclass[11pt]{article}

\usepackage{../style/cerco}

\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage[english]{babel}
\usepackage{graphicx}
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage{stmaryrd}
\usepackage{url}

\title{
INFORMATION AND COMMUNICATION TECHNOLOGIES\\
(ICT)\\
PROGRAMME\\
\vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}}

\lstdefinelanguage{matita-ocaml}
  {keywords={ndefinition,ncoercion,nlemma,ntheorem,nremark,ninductive,nrecord,nqed,nlet,let,in,rec,match,return,with,Type,try},
   morekeywords={[2]nwhd,nnormalize,nelim,ncases,ndestruct},
   morekeywords={[3]type,of},
   mathescape=true,
  }

\lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false,
        keywordstyle=\color{red}\bfseries,
        keywordstyle=[2]\color{blue},
        keywordstyle=[3]\color{blue}\bfseries,
        commentstyle=\color{green},
        stringstyle=\color{blue},
        showspaces=false,showstringspaces=false}

\lstset{extendedchars=false}
\lstset{inputencoding=utf8x}
\DeclareUnicodeCharacter{8797}{:=}
\DeclareUnicodeCharacter{10746}{++}
\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}

\date{}
\author{}
\begin{document}

\thispagestyle{empty}

\vspace*{-1cm}
\begin{center}
\includegraphics[width=0.6\textwidth]{../style/cerco_logo.png}
\end{center}

\begin{minipage}{\textwidth}
\maketitle
\end{minipage}

\vspace*{0.5cm}
\begin{center}
\begin{LARGE}
\textbf{
Report n. D5.2\\
Trusted CerCo Prototype}
\end{LARGE}
\end{center}

\vspace*{2cm}
\begin{center}
\begin{large}
Version 1.0
\end{large}
\end{center}

\vspace*{0.5cm}
\begin{center}
\begin{large}
Main Authors:\\
Roberto M. Amadio, Nicolas Ayache, François Bobot, Jaap Boender, Brian Campbell, Ilias Garnier,
Antoine Madet, James McKinna,
Dominic P. Mulligan, Mauro Piccolo, Yann R\'egis-Gianas,
Claudio Sacerdoti Coen, Ian Stark and Paolo Tranquilli
\end{large}
\end{center}

\vspace*{\fill}

\noindent
Project Acronym: \cerco{}\\
Project full title: Certified Complexity\\
Proposal/Contract no.: FP7-ICT-2009-C-243881 \cerco{}\\

\clearpage
\pagestyle{myheadings}
\markright{\cerco{}, FP7-ICT-2009-C-243881}

\newpage

\vspace*{7cm}
\paragraph{Abstract}
The Trusted CerCo Prototype is the last piece of software produced
in the CerCo project. It consists of
\begin{enumerate}
 \item A compiler from a large subset of C to the Intel HEX format for the
   object code of the 8051 microprocessor family. The compiler also computes
   the cost models for the time spent in basic blocks and the stack space used
   by the functions of the source code. The cost models are exposed to the
   user as an instrumented C code, obtained by injecting into the original
   source code instructions that update three global variables recording
   the current clock, the current stack usage and the maximum stack usage so far.
 \item A plug-in for the Frama-C Software Analyser architecture. The plug-in
   takes a C file as input, compiles it with the CerCo compiler and then
   automatically infers cost invariants for every loop and every function in
   the source code. The invariants can be fed by Frama-C to various theorem
   provers for automatic verification. Those that are not automatically
   verified can be checked manually by the user with an interactive theorem
   prover.
 \item A wrapper that interfaces and integrates the two pieces of software
   above with the Frama-C Jessie plug-in and with the Why3 platform.
\end{enumerate}

The plug-in is fully functional and does not need to be formally verified,
because it does not belong to the trusted code base of programs verified using
CerCo's technology: a bug in the plug-in can at worst infer wrong time/space
invariants and bounds, which will then be rejected by the automatic provers.

The compiler is fully functional, even if not fully certified yet.
The certification will continue after the end of the project.

In this deliverable we discuss the status of the Trusted CerCo Prototype at the
time of the official end of the project.
\newpage
\tableofcontents

\newpage

\section{Task}
\label{sect.task}

The Grant Agreement describes deliverable D5.2 as follows:
\begin{quotation}
\textbf{Trusted CerCo Prototype}: Final, fully trusted version of the
system.
\end{quotation}

This report describes the state of the implementation of the Trusted CerCo
Prototype at the official end of the project.
\section{The Trusted CerCo Compiler}
\subsection{Command line interface}
The Trusted CerCo Compiler is the first component of the Trusted CerCo
Prototype; the second is the Frama-C plug-in already developed in
D5.1 and not modified since. The Trusted CerCo Compiler replaces the Untrusted
CerCo Compiler delivered in D5.1 as part of the Untrusted CerCo
Prototype. The trusted and untrusted compilers are meant to provide the
same command line interface, so that they can be easily swapped without
affecting the plug-in. In particular, the two compilers share the following
minimal subset of options, which is sufficient for the plug-in:

\begin{verbatim}
Usage: acc [options] file...
  -a                        Add cost annotations on the source code.
  -is                       Outputs and interprets all the compilation passes,
                            showing the execution traces
  -o                        Prefix of the output files.
\end{verbatim}
\begin{figure}[t]
\begin{verbatim}
char a[] = {3, 2, 7, -4};
char treshold = 4;

int main() {
        char j;
        char *p = a;
        int found = 0;
        for (j=0; j < 4; j++) {
                if (*p <= treshold) { found++; }
                p++;
        }
        return found;
}
\end{verbatim}
\caption{A simple C program.}
\label{test1}
\end{figure}
\begin{figure}[!p]
\begin{verbatim}
int __cost = 33;

int __stack_size = 5, __stack_size_max = 5;

void __cost_incr(int incr) {
  __cost = __cost + incr;
}

void __stack_size_incr(int incr) {
  __stack_size = __stack_size + incr;
  __stack_size_max = __stack_size_max < __stack_size ? __stack_size : __stack_size_max;
}

unsigned char a[4] = { 3, 2, 7, 252, };

unsigned char treshold = 4;

int main(void)
{
  unsigned char j;
  unsigned char *p;
  int found;
  __stack_size_incr(0);
  __cost_incr(91);
  p = a;
  found = 0;
  for (j = (unsigned char)0; (int)j < 4; j = (unsigned char)((int)j + 1)) {
    __cost_incr(76);
    if ((int)*p <= (int)treshold) {
      __cost_incr(144);
      found = found + 1;
    } else {
      __cost_incr(122);
      /*skip*/;
    }
    p = p + 1;
  }
  __cost_incr(37);
  /*skip*/;
  __stack_size_incr(-0);
  return found;
  __stack_size_incr(-0);

}
\end{verbatim}
\caption{The instrumented version of the program in \autoref{test1}.}
\label{itest1}
\end{figure}
Let the file \texttt{test.c} contain the code presented in \autoref{test1}.
By calling \texttt{acc -a test.c}, the user obtains the following files:
\begin{itemize}
 \item \texttt{test-instrumented.c} (\autoref{itest1}): the file is a copy
   of \texttt{test.c} obtained by adding code that keeps track of the amount
   of time and space used by the source code.

   The global variable
   \texttt{\_\_cost} records the number of clock cycles used by the
   microprocessor. Its initial value may not be zero because we emit object code
   to initialize the global variables of the program. The initialization code
   is not part of \texttt{main} (to allow \texttt{main} to be recursive, even
   if the standard is not clear on whether this should be allowed).
   The initial value of \texttt{\_\_cost} is the time spent in the
   initialization code.
   The \texttt{\_\_cost} variable is incremented at the beginning of every
   basic block using the \texttt{\_\_cost\_incr} function, whose argument
   is the number of clock cycles that will be spent by the basic block that
   is beginning. Therefore the value stored in the variable is always a safe
   upper bound of the actual time. Moreover, the difference between the values
   of the \texttt{\_\_cost} variable at the beginning and at the end of a
   function is also exact (a worked example for \autoref{itest1} is given
   after this list). The latter statement --- with several technical
   complications --- is the one that will eventually be certified in Matita.
   The global variables \texttt{\_\_stack\_size} and
   \texttt{\_\_stack\_size\_max} are handled similarly to \texttt{\_\_cost}.
   Their values represent the amount of external data memory
   currently in use and the maximum amount of memory used so far by the program.
   The two values are updated by the \texttt{\_\_stack\_size\_incr} function
   at the beginning of every function body, at its end, and just before every
   \texttt{return}. A negative increment is used to represent stack release.
   The initial value of the two variables may not be zero because some
   external data memory is used for global and static variables. Moreover, the
   last byte of external data memory may not be addressable on the 8051, so
   we avoid using it. The compiled code runs correctly only until stack
   overflow, which happens when the value of \texttt{\_\_stack\_size} reaches
   $2^{16}$. It is up to the user to state and maintain the invariant that no
   stack overflow occurs. In case of stack overflow, the compiled code
   no longer simulates the source code. Automatic provers are often able to
   prove the invariant in simple cases.

   In order to instrument the code, all basic blocks that are hidden in the
   source code but that will contain code in the object code need to be made
   manifest. In particular, \texttt{if-then} instructions without an explicit
   \texttt{else} branch are given one (compare Figures~\ref{test1} and \ref{itest1}).
 \item \texttt{test.hex}: the file is an Intel HEX representation of the
   object code for an 8051 microprocessor. The file can be loaded into any
   8051 emulator (like the MCU 8051 IDE) for running or disassembling it.
   Moreover, an executable semantics (an emulator) for the 8051 is linked
   with the CerCo compilers and can be used to run the object code at the
   end of the compilation.
 \item \texttt{test.cerco} and \texttt{test.stack\_cerco}: these two files
   are used to pass information from the CerCo compilers to the Frama-C plug-in
   and they are not interesting to the user. They just list the
   global variables and increment functions inserted by the compiler and used
   later by the plug-in to compute the cost invariants.
\end{itemize}
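
As a worked example (ours, not part of the compiler output), consider the
instrumented program in \autoref{itest1}. The loop body is executed exactly
four times, so the final value of the \texttt{\_\_cost} variable is
\[
33 + 91 + 4 \cdot 76 + f \cdot 144 + (4 - f) \cdot 122 + 37
\]
where $f$ is the number of array elements that pass the test
\texttt{*p <= treshold} ($f = 2$ for the given input, yielding $997$ clock
cycles). An automatic prover that cannot determine $f$ can still derive the
worst-case bound $33 + 91 + 4 \cdot (76 + 144) + 37 = 1041$ clock cycles.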

When the option \texttt{-is} is used, the compilers run the program after
every intermediate compilation pass. The untrusted compiler only outputs the
trace of (cost) observables and the final value returned by \texttt{main}; the
trusted compiler also observes every function call and return (the stack-usage
observables) and therefore allows the user to track program execution more
closely. The traces observed after every pass should always be equal, up to the
initial observable that corresponds to the initialization of global variables.
That observable is currently emitted only in back-end languages. The
preservation of the traces will actually be guaranteed by the main theorem of
the formalization (the forward simulation theorem) once the proof is completed.

The compilers also create a file
for each pass with the intermediate code. However, the syntaxes used by the two
compilers for the intermediate passes are not the same, and we do not yet output
any intermediate syntax for the assembly code. The latter can be obtained
by disassembling the object code, e.g.\ with the MCU 8051 IDE.

\subsection{Code structure}
The code of the trusted compiler consists of three different parts:
\begin{itemize}
 \item Code extracted to OCaml from the formalization of the compiler in Matita.
   This is the code that will eventually be fully certified using Matita.
   It is fully contained in the \texttt{extracted} directory (excluding
   the subdirectory \texttt{untrusted}).
   It translates the source C-light code to:
   \begin{itemize}
    \item An instrumented version of the same C-light code. The instrumented
      version is obtained by inserting cost emission statements~\texttt{COST k}
      at the beginning of basic blocks. The emitted labels are the ones that
      are observed when calling \texttt{acc -is}. They will be replaced in the
      final instrumented code by increments of the \texttt{\_\_cost}
      variable.
    \item The object code for the 8051 as a list of bytes to be loaded into
      code memory. The code is coupled with a trie over bit vectors that maps
      program counters to cost labels \texttt{k}. The intended semantics is
      that, when the current program counter is associated with \texttt{k},
      the observable label \texttt{k} should be emitted during the execution
      of the next object code instruction. Similarly, a second trie, reminiscent
      of a symbol table, maps program counters to function names to emit the
      observables associated with stack usage.
   \end{itemize}
   During the ERTL to LTL pass, which deals with register allocation,
   the extracted code calls the two untrusted components that we are now going
   to describe.
 \item Untrusted code called by the trusted compiler (directory
   \texttt{extracted/untrusted}). The two main untrusted components in this
    directory are
  \begin{itemize}
   \item The \texttt{Fix.ml} module by Fran\c{c}ois Pottier, which provides a
    generic algorithm to compute the least solution of a system of monotonic
    equations, described in the paper \cite{LazyLeastFixedPointsInML}.
    The trusted code uses this module to perform liveness analysis and only
    assumes that the module computes a pre-fixpoint of the system of equations
    it is given. The performance of this module is critical to the compiler,
    and the implementation exploits some clever programming techniques
    and imperative data structures that would be difficult to prove correct.
    Checking that the result of the computation actually is a pre-fixpoint,
    instead, is very simple to do using only functional code of low
    computational complexity. Therefore we do not plan to prove this piece of
    code correct. Instead, we will certify a verifier for the results of the
    computation (a sketch of such a verifier is given after this list).

    Note: this module, as well as the next one, has been reused from the
    untrusted compiler. Therefore both have been thoroughly tested for bugs
    during the last year of the project.
   \item The \texttt{coloring.ml} module, taken from Fran\c{c}ois Pottier's
    PP compiler, used in an undergraduate compilers course.
    The module takes an interference graph
    (a data structure implemented in module \texttt{untrusted\_interference.ml})
    and colours it, assigning to each node of the interference graph either a
    colour or the constant \texttt{Spilled}. A heuristic is used to drive the
    algorithm: the caller must provide a function that returns the number of
    times a register is accessed, to decrease its likelihood of being spilled.
    To minimise bugs and reduce code size, we actually implement the heuristic
    in Matita and then extract the code. Therefore the untrusted code also calls
    back extracted code for untrusted computations.

    Finally, module \texttt{compute\_colouring.ml} is the one that actually
    computes the colouring of an interference graph of registers, given the
    result of the liveness analysis. The code first creates the interference
    graph; then it colours the graph once, using real registers as colours, to
    determine which pseudo-registers need spilling;
    finally it colours the graph again, using real registers and spilling slots
    as colours, to assign to each pseudo-register either a spilling slot or a
    real register.
  \end{itemize}
  The directory also contains a few more files that provide glue code between
  OCaml's data structures from the standard library (e.g.\ booleans and lists)
  and the corresponding data structures extracted from Matita. The glue is
  necessary to translate results back and forth between the trusted (extracted)
  and untrusted components, because the latter have been written using data
  structures from the standard library of OCaml.
 \item Untrusted code that drives the trusted compiler (directory
   \texttt{driver}). The directory contains the \texttt{acc.ml} module,
   which implements the command line interface of the trusted compiler.
   It also contains or links three untrusted components which are safety
   critical and therefore enter the trusted code base of CerCo:
   \begin{itemize}
    \item The CIL parser~\cite{CIL} that parses standard C code and simplifies
      it to C-light code.
    \item A pretty-printer for C-light code, used to print the
      \texttt{-instrumented.c} file.
    \item The instrumenter module (integrated with the pretty-printer for
      C-light code). It takes the output of the compiler and replaces every
      statement \texttt{COST k} (which emits the cost label \texttt{k}) with
      \texttt{\_\_cost\_incr(n)}, where $n$ is the cost associated with
      \texttt{k} in the map returned by the compiler. A similar operation is
      performed to update the stack-related global variables. Eventually this
      module needs to be certified with the following specification: if
      a run of the labelled program produces the trace
      $\tau = k_1 \ldots k_n$, the
      equivalent run of the instrumented program should yield the same
      result and be such that at the end the value of the \texttt{\_\_cost}
      variable equals $\Sigma_{i=1}^{n} K(k_i)$, where $K(k)$ is the
      lookup of the cost of $k$ in the cost map $K$ returned by the compiler.
      A similar statement is expected for stack usage.
   \end{itemize}
\end{itemize}
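
To illustrate why verifying the output of \texttt{Fix.ml} is much simpler than
certifying the solver itself, the following OCaml sketch shows the shape such a
certified verifier could take. It is only an illustration under our own naming
conventions: the types and function names are hypothetical and do not belong to
the actual code base.
\begin{verbatim}
(* A system of monotonic equations over a lattice of type 'a:
   one right-hand side per variable, plus the lattice order. *)
type 'a system = {
  variables : int list;                 (* the unknowns            *)
  rhs       : (int -> 'a) -> int -> 'a; (* rhs of each equation    *)
  leq       : 'a -> 'a -> bool;         (* the order of the lattice *)
}

(* An assignment is a pre-fixpoint iff every right-hand side,
   evaluated on the assignment, stays below the assignment.
   The check is purely functional and linear in the number of
   equations (times the cost of rhs and leq). *)
let is_prefixpoint (s : 'a system) (candidate : int -> 'a) : bool =
  List.for_all
    (fun v -> s.leq (s.rhs candidate v) (candidate v))
    s.variables
\end{verbatim}
The solver remains free to use imperative data structures internally: if the
cheap check above succeeds, the liveness information is a safe
over-approximation regardless of how it was computed.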

\subsection{Functional differences w.r.t.\ the untrusted compiler}
From the user's perspective, the trusted and untrusted compilers differ in
several respects:
\begin{itemize}
 \item The untrusted compiler puts the initialization code for global variables
   at the beginning of \texttt{main}. The trusted compiler uses a separate
   function. Therefore only the trusted compiler allows recursive calls to
   \texttt{main}. To pay for the initialization code, the \texttt{\_\_cost}
   variable is not always initialized to 0 by the trusted compiler, while it is
   always 0 in the untrusted one.
 \item The two compilers do not compile the code in exactly the same way, even
   if they adopt the same compilation scheme.
   Therefore the object code produced is different and the control blocks are
   given different costs. On average, the code generated by the trusted
   compiler is about 3 times faster and may use less stack space.
 \item The trusted compiler is slightly slower than the untrusted one, and the
   trusted executable semantics are also slightly slower than the untrusted
   ones.
   The only passes that are significantly slower at the moment are the
   policy computation pass, which is a preliminary to assembly, and the
   assembly pass itself. These are the passes that manipulate the largest
   quantity of data, because assembly programs are much larger than the
   corresponding Clight sources. The reason for the slowness is currently under
   investigation. It is likely to be due to the quality of the extracted code
   (see \autoref{quality}).
 \item Neither compiler's back-end handles external functions,
   because we have not implemented a linker. The trusted compiler fails
   during compilation, while the untrusted compiler silently turns every
   external function call into a \texttt{NOP}.
 \item The untrusted compiler had \emph{ad hoc} options to deal with C files
   generated by a Lustre compiler. The \emph{ad hoc} code simplified the C files
   by avoiding some calls to external functions and added some
   debugging code to print the actual reaction time of every Lustre node.
   The trusted compiler does not implement any \emph{ad hoc} Lustre option yet.
\end{itemize}

\subsection{Implementation differences w.r.t.\ the untrusted compiler}
The code of the trusted compiler differs greatly from the code of the
untrusted prototype. The main architectural difference is the
representation of back-end languages. In the trusted compiler we have adopted
a unified syntax (data structure), semantics and pretty-printing for every
back-end language.
In order to accommodate the differences between the original languages, the
syntax and semantics have been abstracted over a number of parameters, like
the type of arguments of the instructions. For example, early languages use
pseudo-registers to hold data, while late languages store data in real machine
registers or stack locations (a schematic sketch of this parametrisation is
given at the end of this subsection). The unification of the languages has
brought a few benefits and can potentially bring new ones in the future:
\begin{itemize}
 \item Code size reduction and faster detection and correction of bugs.

   About three quarters of the code for the semantics and pretty-printing of
   back-end languages is shared, while one quarter is pass-dependent. Sharing
   the semantics automatically implies sharing semantic properties, i.e.\
   reducing the number of lemmas to be proved to one sixth (six being the
   number of back-end intermediate languages). Moreover, several back-end passes
   ---a pass between two alternative semantics for RTL, the RTL to ERTL pass
   and the ERTL to LTL pass--- transform a graph instance of the generic
   back-end intermediate language into another graph instance. The
   graph-to-graph transformation has also been generalized and parametrised
   over the pass-specific details. While the code saved in this way is not
   much, several significant lemmas are provided once and for all on the
   transformation. At the time of writing,
   the generalized languages, semantics, transformations and related
   properties take about 3,900 lines of Matita code (definitions and lemmas).

   We also benefit from the typical advantage of code sharing over cut\&paste:
   once a bug is found and fixed, the fix immediately applies to every
   instance. This becomes particularly significant for code certification,
   where one simple bug fix usually requires complex work to fix the
   related correctness proofs.
 \item Some passes and several proofs can be given in greater generality,
   allowing more reuse.

   For example, in the untrusted prototype the
   LTL to LIN pass turned a graph language into a linearised language
   with the very same instructions and a similar semantics. In particular, the
   two semantics shared the same execute phase, while fetching was different.
   The pass consisted of a visit of the graph, emitting the
   instructions in visit order. When the visit detected a cycle, the back-link
   arc was represented by a newly introduced \texttt{GOTO} statement.

   Obviously, the transformation just described could be applied as well to
   any language with a \texttt{GOTO} statement. In our formalization in Matita,
   we have rewritten and proved correct the translation between any two
   instances of the generalized back-end languages such that: 1) the
   fetching-related parameters of the two passes are instantiated respectively
   with graph and linear operations; 2) the execute-related parameters are
   shared.

   Obviously, we could also prove correct the reverse translation, from a
   linear to a graph-based version of the same language. The two combined
   passes would make it possible to switch temporarily to a graph-based
   representation whenever a data-flow analysis over the code is required.
   In our compiler, for example, at the moment only the RTL to ERTL pass is
   based on a data flow analysis. A similar pair of translations could also be
   provided between one of the two representations and a static single
   assignment (SSA) one.
   As a final observation, the insertion of another graph-based language after
   the LTL one is now made easy: the linearisation pass need not be redone
   for the new pass.

\item Pass commutation and reuse.
   Every pass is responsible for reducing a difference
   between the source and target languages. For example, the RTL to ERTL pass
   is responsible for the parameter passing conversion, while the ERTL to LTL
   pass performs pseudo-register allocation. At the moment, both CompCert
   and CerCo fix the relative order of the two passes in opposite and
   partially arbitrary ways, and it is not possible to simply switch the
   two passes. We believe instead that it should be possible to generalize
   most passes in such a way that they could be freely composed in any order,
   also with repetitions. For example, real world compilers like GCC perform
   some optimizations like constant propagation multiple times, after
   every optimization that is likely to trigger more constant propagation.
   Thanks to our generalized intermediate language, we can already implement
   a generic constant propagation pass that can be freely applied.
%   For instance, the table below shows the amount of Matita
%   code taken by the description of the syntax, semantics and pretty printing
%   rules for the back-end passes. It also compares the size with the untrusted
%   C code.
%   \begin{tabular}[h]{lrr}
%       & Untrusted code & Matita code \\
%    RTL    &   615          &    456 \\
%   ERTL    &   891          &    342 \\
%   LIN     &   568          &     79 \\
%   LTL     &   636          &     94 \\
%   LTL/LIN &                &    466 \\
%   joint   &                &   1615 \\ \hline \\
%   Tot.    & 2,710          &   3,052 \\
%   \end{tabular}
%   The table shows that the sizes of all the untrusted code languages are
%   comparable.
\end{itemize}
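
The following OCaml-style sketch conveys the flavour of the parametrisation
discussed above. It is a deliberately simplified illustration using our own
hypothetical names: the actual Matita definitions (the ``joint'' language) are
dependently typed records and are considerably richer.
\begin{verbatim}
(* A back-end language is obtained by instantiating a record of
   parameters.  Early languages use pseudo-registers for 'reg,
   late ones hardware registers or stack slots; 'label abstracts
   over fetching (graph labels vs. positions in a linear program). *)
type ('reg, 'label) joint_params = {
  pretty_reg : 'reg -> string;          (* how operands are printed *)
  successors : 'label -> 'label list;   (* how code is fetched      *)
}

(* A generic instruction set, shared by all the instances. *)
type ('reg, 'label) joint_instr =
  | Move of 'reg * 'reg
  | Goto of 'label

(* Generic functions (semantics, printing, graph visits) are
   written once against joint_params and inherited by all six
   back-end languages. *)
let pretty p = function
  | Move (dst, src) -> p.pretty_reg dst ^ " := " ^ p.pretty_reg src
  | Goto _ -> "goto"
\end{verbatim}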


\subsection{Quality of the extracted code}\label{quality}

We have extracted the Matita code of the compiler to OCaml in order to compile
and execute it efficiently and without any need to install Matita.
The implementation of code extraction for Matita has been obtained by
generalizing that of Coq over the data structures of Coq, and then
instantiating the resulting code for Matita. Differences between the two calculi
have also been taken into account during the generalization. Therefore we can
expect the extraction procedure to be reasonably bug free, because bugs in the
core of the code extraction would likely have been detected in Coq as well.

The quality of the extracted code ranges from sub-optimal to poor, depending
on the part of the formalization. We analyse here the causes of the poor
quality:
\begin{itemize}
 \item Useless computations. The extraction procedure
   removes from the extracted code almost all of the non computational parts,
   replacing the ones that are not removed with low-complexity code.
   However, following Coq's tradition, detection of the useless parts is not
   done according to the computationally expensive algorithm by
   Berardi~\cite{berardi1,berardi2}. Instead, the user decides which data
   structures should be considered computationally relevant by declaring them
   in one of the \texttt{Type\_i} sorts of the Calculus of (Co)Inductive
   Constructions. The non computational structures are declared in
   \texttt{Prop}, the sort of impredicative, possibly classical propositions.
   Every computation that produces a data structure in \texttt{Prop} is
   guaranteed to be computationally irrelevant. Computations that produce data
   structures in \texttt{Type\_i}, instead, may actually be relevant or
   irrelevant, and the extraction code conservatively considers them relevant.
   The result is extracted OCaml code that computes values that are passed to
   functions that do not use them, or that are returned to callers that ignore
   them.

   The phenomenon is particularly visible when the dependently typed discipline
   is employed, which is our choice for CerCo. Under this discipline, terms
   can be passed to type formers. For example, the data type for back-end
   languages in CerCo is parametrised over the list of global variables that
   may be referred to by the code. Another example is the type of vectors,
   which is parametrised over a natural number giving the size of the vector,
   or the type of vector tries, which is parametrised over the fixed height of
   the tree and which can be read and updated only using keys (vectors of
   bits) whose length is the height of the trie. Functions that compute these
   dependent types also have to compute the new indexes (parameters) for the
   types, even if this information is used only for typing. For example,
   appending two vectors requires the computation of the length of the result
   vector just to type the result. In turn, this computation requires the
   lengths of the two vectors in input. Therefore, functions that call append
   have to compute the lengths of the vectors to append, even if the lengths
   will actually be ignored by the extracted code of the append function
   (a concrete sketch of this phenomenon is given after this list).

   In the literature there are proposals for allowing the user to more
   accurately distinguish computational from non computational arguments of
   functions. The proposals introduce two different types of
   $\lambda$-abstractions and \emph{ad hoc} typing rules to ensure that
   computationally irrelevant bound variables are not used in computationally
   relevant positions. An OCaml prototype that implements these ideas for Coq
   is available~\cite{implicitcoq}, but it is heavily bugged. We have not yet
   tried to do anything along these lines in Matita. To avoid modifying the
   system, another approach, based on the explicit use of a non computational
   monad, has also been proposed in the literature, but it introduces many
   complications in the formalization and it cannot be used in every situation.

   Improving code extraction to more aggressively remove irrelevant
   code from code extracted from Matita is left as future work.
   At the moment, it seems that useless computations are indeed responsible
   for the poor performance of some parts of the extracted code. We have
   experimented with a few manual optimizations of the extracted code and
   saw that a few minor patches already allow a 25\% speed up of the
   assembly pass. To maximize reliability, the code released with this
   deliverable is the one without the manual patches.
 \item Poor typing. A nice theoretical result is that the terms of the
   Calculus of Constructions (CoC),
   the upper-right corner of Barendregt's cube, can be re-typed in System
   $F_\omega$, the corresponding typed lambda calculus without dependent
   types~\cite{coctofomega}. The calculi implemented by Coq and Matita, however,
   are more expressive than CoC, and several type constructions have no
   counterpart in System $F_\omega$. Moreover, core OCaml does not even
   implement $F_\omega$, but only the Hindley-Milner fragment of it.
   Therefore, in all the situations listed below, code extraction is not able
   to type the extracted code using informative types; it falls back on the
   super-type of OCaml values, reached via \texttt{Obj.magic} and
   abbreviated \texttt{\_\_} in the extracted code. The lack of more precise
   typing has very limited impact on the performance of the compiler's OCaml
   code, but it makes it very hard to combine the extracted code with untrusted
   code. The latter needs to introduce explicit unsafe casts between the
   super-type and the concrete types used by instances of the generic
   functions. Code written in this way is very error prone. For this reason we
   have decided to write in Matita also parts of the untrusted code of the
   CerCo compiler (e.g.\ the pretty-printing functions), in order to benefit
   from the type checker of Matita.

   The exact situations that trigger uses of the super-type are:
   \begin{enumerate}
   \item The calculi feature a cumulative
    hierarchy of universes that allows writing functions that can be used
    both as term formers and type formers, according to the arguments that are
    passed to them. In System $F_\omega$, instead, terms and types are
    syntactically distinct. Extracting terms according to all their possible
    uses may be impractical because the number of uses is exponential in the
    number of arguments of sort $Type_i$ with $i \geq 2$.
   \item Case analysis and recursion over inhabitants of primitive inductive
    types can be used in types (strong elimination), which is not allowed in
    $F_\omega$. In the CerCo compiler we largely exploit type formers declared
    in this way, for example to provide the same level of type safety achieved
    in the untrusted compiler via polymorphic variants~\cite{garrigue98}.
    In particular, we have terms that syntactically describe, as first class
    citizens, the large number of combinations of operand modes of object code
    instructions. On the instructions we provide ``generic'' functions that work
    on some combinations of the operand modes, and whose type is computed by
    primitive recursion on the syntactic description of the operand modes of
    the argument of the function.
   \item $\Sigma$-types and, more generally, dependently typed records can have
    at the same time fields that are type declarations and fields that are
    terms. This situation occurs all the time in CerCo because we stick
    to the dependently typed discipline and because we often generalize our
    data structures over the types used in them. Concretely, the generalization
    happens over a record containing a type --- e.g.\ the type of
    (pseudo)-registers for back-end languages --- some terms working on the
    type --- e.g.\ functions to set/get values from (pseudo)-registers --- and
    properties over them. In System $F_\omega$, term and type abstractions
    are kept syntactically separate and there is no way to pack them in
    records.
  \item The type of the extracted function does not belong to the Hindley-Milner
    fragment. Sometimes simple code transformations could be used to make the
    function typeable, but the increased complexity of the extraction code
    would outweigh the benefits.
 \end{enumerate}
\end{itemize}
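
As an illustration of the useless computations mentioned above, the following
OCaml fragment sketches what extraction produces for length-indexed vectors.
The code is our own simplified reconstruction, not the actual extracted code:
lengths are extracted to unary naturals and vectors to plain lists.
\begin{verbatim}
(* Unary naturals: the extracted image of the vector indexes. *)
type nat = O | S of nat

let rec plus n m = match n with O -> m | S p -> S (plus p m)

(* append ignores its length arguments, which survive extraction
   only because the dependent type of the result mentions them. *)
let rec append (n : nat) v1 (m : nat) v2 =
  match v1 with
  | [] -> v2
  | x :: tl -> x :: append (match n with S p -> p | O -> O) tl m v2

(* A caller must still compute plus n m --- a value that append
   will never inspect --- to mirror the length of the appended
   vector in the (erased) type of the result. *)
let concat3 n v1 m v2 k v3 =
  append (plus n m) (append n v1 m v2) k v3
\end{verbatim}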

We should note how other projects, in particular CompCert, decided from
the beginning to avoid dependent types, in order to guarantee a high quality
of the extracted code and maximal control over it. Therefore, at the current
state of the art of code extraction, there seems to be a huge trade-off between
extracted code quality and the exploitation of advanced typing and proving
techniques in the source code. In the near future, the code base of CerCo can
provide an interesting example of a large formalization based on dependent
types and in need of high quality extracted code. Therefore we plan to use it
as a driver and test bench for future work on the improvement of code
extraction. In particular, we are planning to study the following improvements
to the code extraction of Matita:
\begin{itemize}
 \item We already have a prototype that extracts Matita code to Haskell, using
  several GHC extensions that provide a very large subset of System
  $F_\omega$. However, the prototype is not fully functional yet because we
  still need to solve, at the theoretical level, a problem of interaction
  between $F_\omega$ types and strong elimination. Roughly speaking, the
  two branches of a strong elimination always admit a most general unifier in
  Hindley-Milner extended with the super-type (\texttt{Obj.magic}), but the
  same property is lost for $F_\omega$. As a consequence, we lose modularity
  in code extraction and we need to figure out static analysis techniques to
  reduce the impact of the loss of modularity.
 \item The two most recent versions of OCaml have introduced first class
  modules, which are exactly the feature needed for extracting code that uses
  records containing both type and term declarations (a small example is given
  after this list). However, the syntax
  required for first class modules is extremely cumbersome, and it requires
  the explicit introduction of type expressions to make manifest the type
  declaration/definition fields of the module. This complicates code
  extraction with the need to perform some computations at extraction time,
  which is not in the tradition of code extraction. Moreover, the actual
  performance of OCaml code that heavily uses first class modules is unknown
  to us. We plan to experiment with first class modules for extraction very
  soon.
  \item Algebraic data types are generalized to families of algebraic data
    types in the calculi implemented by Coq and Matita, even if the two
    generalizations are different. Families of algebraic data types are
    traditionally not supported by programming languages, but some
    restrictions of them have recently been considered under the name of
    Generalized Algebraic Data Types (GADTs), and they are now implemented in
    recent versions of OCaml and GHC. Future work includes investigating
    the possibility of exploiting GADTs during code extraction.
\end{itemize}
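
To give an idea of the encoding based on first class modules, the following
OCaml sketch (with hypothetical names of our own choosing) packs a type of
registers together with an operation on it, which is the shape a dependently
typed record would take after extraction. Note the explicit \texttt{with type}
constraint needed to let the type component escape:
\begin{verbatim}
(* The signature plays the role of the dependently typed record:
   a type field plus a term field operating on it. *)
module type REGS = sig
  type reg
  val to_string : reg -> string
end

(* An instance to be packed as a first class module. *)
module IntRegs = struct
  type reg = int
  let to_string = string_of_int
end

(* Unpacking: the type field must be made manifest explicitly
   ("with type reg = a") for the code to type-check. *)
let print_reg (type a) (module R : REGS with type reg = a) (r : a) =
  print_endline (R.to_string r)

let () = print_reg (module IntRegs) 3
\end{verbatim}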

\section{The Cost-Annotating Plug-In}

The functionalities of the Cost-Annotating Plug-In have already been described
in Deliverables~D5.1 and D5.3.
The plug-in interfaces with the CerCo compiler via
the command line. Therefore there was no need to update the plug-in code for
integration in the Trusted CerCo Prototype. It is even possible
to install the untrusted and the trusted compilers at the same time. The
\texttt{-cost-acc} flag of the plug-in can be used to select the executable
to be used for compilation.
The code of the plug-in has been modified w.r.t.\ D5.1 to address two issues.

On the one hand, the analysis of stack-size consumption has been integrated
into it. From the user's point of view,
time and space cost annotations and invariants are handled in a similar way.
Nevertheless, we expect automated theorem provers to face more difficulties
in dealing with stack usage, because elapsed time is additive, whereas what
is relevant for space usage is the maximum amount of stack used, which is
not additive. However, programs produced by our compiler require
more stack only at function calls. Therefore the proof obligations generated
to bound the maximum stack size of non-recursive programs are trivial.
Most C programs, and in particular those used in time critical systems, avoid
recursive functions.

On the other hand, the plug-in has been updated to take advantage of the new
Why3 platform.
\section{The \texttt{cerco} Wrapper}
The Why3 platform is a complete rewrite of the old Why2 one. The update
has introduced several additional steps to enable the use of the cost plug-in
in conjunction with the Jessie one and the automatic and interactive theorem
provers federated by the Why3 platform, mainly because the Jessie plug-in still
uses Why2. These steps, which previously required either
tedious manual commands or a complicated makefile, have prompted us to write
a script wrapping all the functionalities provided by the software described
in this deliverable.
\begin{verbatim}
Syntax: cerco [-ide] [-untrusted] filename.c
\end{verbatim}
The C file provided is processed by the cost plug-in and then passed on to the
Why3 platform.
The two available options control the following features.
\begin{itemize}
 \item \verb+-ide+: launch the Why3 interactive graphical interface for
 fine-grained control over proving the synthesised program invariants. If not
 provided, the script launches all available automatic theorem provers with a
 5 second time-out, and just reports failure or success.
 \item \verb+-untrusted+: if it is installed, use the untrusted prototype
 rather than the trusted one (which is the default behaviour).
\end{itemize}
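For example, \texttt{cerco -ide test.c} compiles \texttt{test.c} with the
trusted compiler, synthesises the time and stack invariants, and opens the
Why3 interface on the resulting proof obligations, while \texttt{cerco test.c}
attempts to discharge all the obligations automatically and reports the
outcome.
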
The minimum dependencies for the use of this script are
\begin{itemize}
 \item either the trusted or the untrusted \verb+acc+ CerCo compiler;
 \item both Why2 and Why3;
 \item the cost and Jessie plug-ins.
\end{itemize}
However, we recommend installing as many Why3-compatible automatic provers
as possible to maximise the effectiveness of the command. The provers provided
by default were not very effective in our experience.

\section{Connection with other deliverables}
\label{subsect.connection.other.deliverables}

This deliverable represents the final milestone of the CerCo project.
The software delivered links together most of the software already developed
in previous deliverables, and it benefits from the studies performed in
other deliverables. In particular:

\begin{itemize}
 \item The compiler links the code extracted from the executable formal
  semantics of C (D3.1), machine code (D4.1), front-end intermediate languages
  (D3.3) and back-end intermediate languages (D4.3). The \texttt{-is} flag
  of the compiler invokes the semantics of every intermediate
  representation of the program to be compiled. The executability of the
  C and machine code languages has been important to debug the two
  semantics, which are part of the trusted code base of the compiler.
  The executability of the intermediate languages has been important during
  development for early detection of bugs, both in the semantics and in the
  compiler passes. It is also useful to users for profiling programs in early
  compilation stages, to detect where the code spends most time.
  Those semantics, however, are not part of the trusted code base.
 \item The compiler links the code extracted from the CIC encoding of the
  front-end (D3.2) and back-end (D4.2). The two encodings have been partially
  proved correct in D3.4 (Front End Correctness Proof) and D4.4 (Back End
  Correctness Proof). The formal proofs to be delivered in those deliverables
  have not been completed. The most urgent future work after the end of the
  project will consist in completing those proofs.
 \item Debian packages have been provided in D6.6 for the Cost-Annotating
  Plug-In, the Untrusted CerCo Compiler and the Trusted CerCo Compiler.
  The first and the third, installed together, form a full installation of the
  Trusted CerCo Prototype. In order to allow interested people to test the
  prototype more easily, we also provided in D6.6 a Live CD based on
  Debian with the CerCo packages pre-installed.
\end{itemize}
\bibliographystyle{unsrt}
\bibliography{report}
\end{document}