\documentclass[11pt, epsf, a4wide]{article}

\usepackage{../style/cerco}

\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage[english]{babel}
\usepackage{graphicx}
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage{stmaryrd}
\usepackage{url}

\title{
INFORMATION AND COMMUNICATION TECHNOLOGIES\\
(ICT)\\
PROGRAMME\\
\vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}}

\lstdefinelanguage{matita-ocaml}
  {keywords={ndefinition,ncoercion,nlemma,ntheorem,nremark,ninductive,nrecord,nqed,nlet,let,in,rec,match,return,with,Type,try},
   morekeywords={[2]nwhd,nnormalize,nelim,ncases,ndestruct},
   morekeywords={[3]type,of},
   mathescape=true,
  }

\lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false,
        keywordstyle=\color{red}\bfseries,
        keywordstyle=[2]\color{blue},
        keywordstyle=[3]\color{blue}\bfseries,
        commentstyle=\color{green},
        stringstyle=\color{blue},
        showspaces=false,showstringspaces=false}

\lstset{extendedchars=false}
\lstset{inputencoding=utf8x}
\DeclareUnicodeCharacter{8797}{:=}
\DeclareUnicodeCharacter{10746}{++}
\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}

\date{}
\author{}
\begin{document}

\thispagestyle{empty}

\vspace*{-1cm}
\begin{center}
\includegraphics[width=0.6\textwidth]{../style/cerco_logo.png}
\end{center}

\begin{minipage}{\textwidth}
\maketitle
\end{minipage}

\vspace*{0.5cm}
\begin{center}
\begin{LARGE}
\textbf{
Report n. D5.2\\
Trusted CerCo Prototype}
\end{LARGE}
\end{center}

\vspace*{2cm}
\begin{center}
\begin{large}
Version 1.0
\end{large}
\end{center}

\vspace*{0.5cm}
\begin{center}
\begin{large}
Main Authors:\\
XXXX %Dominic P. Mulligan and Claudio Sacerdoti Coen
\end{large}
\end{center}

\vspace*{\fill}

\noindent
Project Acronym: \cerco{}\\
Project full title: Certified Complexity\\
Proposal/Contract no.: FP7-ICT-2009-C-243881 \cerco{}\\

\clearpage
\pagestyle{myheadings}
\markright{\cerco{}, FP7-ICT-2009-C-243881}

\newpage

\vspace*{7cm}
\paragraph{Abstract}
The Trusted CerCo Prototype is meant to be the last piece of software produced
in the CerCo project. It consists of:
\begin{enumerate}
 \item A compiler from a large subset of C to the Intel HEX format for the
   object code of the 8051 microprocessor family. The compiler also computes
   the cost models for the time spent in basic blocks and the stack space used
   by the functions in the source code. The cost models are exposed to the
   user as an instrumented C program, obtained by injecting into the original
   source code instructions that update three global variables recording
   the current clock, the current stack usage and the maximum stack usage so far.
 \item A plug-in for the Frama-C Software Analyser architecture. The plug-in
   takes as input a C file, compiles it with the CerCo compiler and then
   automatically infers cost invariants for every loop and every function in
   the source code. Frama-C can feed the invariants to various theorem
   provers to verify them automatically. Those that are not automatically
   verified can be checked manually by the user with a theorem prover.
\end{enumerate}

The plug-in is fully functional and does not need to be formally verified,
because it does not belong to the trusted code base of programs verified using
CerCo's technology. A bug in the plug-in can at worst produce wrong time/space
invariants and bounds, which will be rejected by the automatic provers.

The compiler is currently fully functional, even if not fully
certified yet. The certification will continue after the end of the project.

In this deliverable we discuss the status of the Trusted CerCo Prototype at the
time of the official end of the project.
\newpage

\tableofcontents

\newpage

\section{Task}
\label{sect.task}

The Grant Agreement describes deliverable D5.2 as follows:
\begin{quotation}
\textbf{Trusted CerCo Prototype}: Final, fully trustable version of the
system.
\end{quotation}

This report describes the state of the implementation of the Trusted CerCo
Prototype at the official end of the project.
\section{The Trusted CerCo Compiler}
\subsection{Command line interface}
The Trusted CerCo Compiler is the first component of the Trusted CerCo
Prototype, the second one being the Frama-C plug-in already developed in
D5.1 and left unmodified. The Trusted CerCo Compiler replaces the Untrusted
CerCo Compiler already delivered in D5.1 as part of the Untrusted CerCo
Prototype. The Trusted and Untrusted compilers are meant to provide the
same command line interface, so that they can be easily swapped without
affecting the plug-in. In practice, the two compilers share the following
minimal subset of options, which is sufficient for the plug-in:

\begin{verbatim}
Usage: acc [options] file...
  -a                        Add cost annotations on the source code.
  -is                       Outputs and interprets all the compilation passes,
                            showing the execution traces
  -o                        Prefix of the output files.
\end{verbatim}

\begin{figure}[t]
\begin{verbatim}
char a[] = {3, 2, 7, -4};
char threshold = 4;

int main() {
        char j;
        char *p = a;
        int found = 0;
        for (j=0; j < 4; j++) {
                if (*p <= threshold) { found++; }
                p++;
        }
        return found;
}
\end{verbatim}
\caption{A simple C program~\label{test1}}
\end{figure}

\begin{figure}[!p]
\begin{verbatim}
int __cost = 33;

int __stack_size = 5, __stack_size_max = 5;

void __cost_incr(int incr) {
  __cost = __cost + incr;
}

void __stack_size_incr(int incr) {
  __stack_size = __stack_size + incr;
  __stack_size_max = __stack_size_max < __stack_size ? __stack_size : __stack_size_max;
}

unsigned char a[4] = { 3, 2, 7, 252, };

unsigned char threshold = 4;

int main(void)
{
  unsigned char j;
  unsigned char *p;
  int found;
  __stack_size_incr(0);
  __cost_incr(91);
  p = a;
  found = 0;
  for (j = (unsigned char)0; (int)j < 4; j = (unsigned char)((int)j + 1)) {
    __cost_incr(76);
    if ((int)*p <= (int)threshold) {
      __cost_incr(144);
      found = found + 1;
    } else {
      __cost_incr(122);
      /*skip*/;
    }
    p = p + 1;
  }
  __cost_incr(37);
  /*skip*/;
  __stack_size_incr(-0);
  return found;
  __stack_size_incr(-0);

}
\end{verbatim}
\caption{The instrumented version of the program in Figure~\ref{test1}\label{itest1}}
\end{figure}

Assume the file \texttt{test.c} contains the code presented in Figure~\ref{test1}.
By calling \texttt{acc -a test.c}, the user obtains the following files:
\begin{itemize}
 \item \texttt{test-instrumented.c} (Figure~\ref{itest1}): the file is a copy
   of \texttt{test.c} obtained by adding code that keeps track of the amount
   of time and space used by the source code.

   The global variable
   \texttt{\_\_cost} records the number of clock cycles used by the
   microprocessor. Its initial value may not be zero because we emit object code
   to initialize the global variables of the program. The initialization code
   is not part of \texttt{main} (to allow \texttt{main} to be recursive, even
   though the standard does not make clear whether this should be allowed).
   The initial value of \texttt{\_\_cost} is the time spent in the
   initialization code.
   The \texttt{\_\_cost} variable is incremented at the beginning of every
   basic block using the \texttt{\_\_cost\_incr} function, whose argument
   is the number of clock cycles that will be spent by the basic block that
   is beginning. Therefore the value stored in the variable is always a safe
   upper bound of the actual time. Moreover, the difference between the values
   of the \texttt{\_\_cost} variable at the beginning and at the end of a
   function is also exact. The latter statement --- with several technical
   complications --- is the one that will eventually be certified in Matita.

   The global variables \texttt{\_\_stack\_size} and
   \texttt{\_\_stack\_size\_max} are handled similarly to \texttt{\_\_cost}.
   Their values represent the amount of external data memory
   currently in use and the maximum amount of memory used so far by the program.
   The two values are updated by the \texttt{\_\_stack\_size\_incr} function
   at the beginning of every function body, at its end and just before every
   \texttt{return}. A negative increment is used to represent stack release.
   The initial values of the two variables may not be zero because some
   external data memory is used for global and static variables. Moreover, the
   last byte of external data memory may not be addressable in the 8051, so
   we avoid using it. The compiled code runs correctly only until stack
   overflow, which happens when the value of \texttt{\_\_stack\_size} reaches
   $2^{16}$. It is up to the user to state and maintain the invariant that no
   stack overflow occurs. In case of stack overflow, the compiled code no
   longer simulates the source code. Automatic provers are often able to
   prove the invariant in simple cases.

   In order to instrument the code, all basic blocks that are hidden in the
   source code but that will contain code in the object code need to be made
   manifest. In particular, \texttt{if-then} instructions without an explicit
   \texttt{else} are given one (compare Figures~\ref{test1} and \ref{itest1}).
 \item \texttt{test.hex}: the file is an Intel HEX representation of the
   object code for an 8051 microprocessor. The file can be loaded in any
   8051 emulator (like the MCU 8051 IDE) for running or disassembling it.
   Moreover, an executable semantics (an emulator) for the 8051 is linked
   with the CerCo compilers and can be used to run the object code at the
   end of the compilation.
 \item \texttt{test.cerco} and \texttt{test.stack\_cerco}: these two files
   are used to pass information from the CerCo compilers to the Frama-C plug-in
   and are not of direct interest to the user. They just list the
   global variables and increment functions inserted by the compiler and used
   later by the plug-in to compute the cost invariants.
\end{itemize}

When the option \texttt{-is} is used, the compilers run the program after
every intermediate compilation pass. The untrusted compiler only outputs the
trace of (cost) observables and the final value returned by \texttt{main}; the
trusted compiler also observes every function call and return (the stack-usage
observables) and therefore allows program execution to be tracked more closely.
The traces observed after every pass should always be equal, up to the initial
observable that corresponds to the initialization of global variables. That
observable is currently only emitted in back-end languages. The preservation of
the traces will actually be granted by the main theorem of the formalization
(the forward simulation theorem) once the proof is completed.

The compilers also create a file
for each pass with the intermediate code. However, the syntaxes used by the two
compilers for the intermediate passes are not the same, and we do not yet output
any intermediate syntax for the assembly code. The latter can be obtained
by disassembling the object code, e.g. by using the MCU 8051 IDE.

\subsection{Code structure}
The code of the trusted compiler consists of three parts:
\begin{itemize}
 \item Code extracted to OCaml from the formalization of the compiler in Matita.
   This is the code that will eventually be fully certified using Matita.
   It is fully contained in the \texttt{extracted} directory (excluding
   the subdirectory \texttt{untrusted}).
   It translates the source C-light code to:
   \begin{itemize}
    \item An instrumented version of the same C-light code. The instrumented
      version is obtained by inserting cost emission statements~\texttt{COST k}
      at the beginning of basic blocks. The emitted labels are the ones that
      are observed calling \texttt{acc -is}. They will be replaced in the
      final instrumented code with increments of the \texttt{\_\_cost}
      variable.
    \item The object code for the 8051 as a list of bytes to be loaded in
      code memory. The code is coupled with a trie over bitvectors that maps
      program counters to cost labels \texttt{k}. The intended semantics is
      that, when the current program counter is associated with \texttt{k},
      the observable label \texttt{k} should be emitted during the execution
      of the next object code instruction. Similarly, a second trie, reminiscent
      of a symbol table, maps program counters to function names to emit the
      observables associated to stack usage.
   \end{itemize}
   During the ERTL to LTL pass that deals with register allocation,
   the trusted code calls two untrusted components, which we now
   describe.
 \item Untrusted code called by the trusted compiler (directory
   \texttt{extracted/untrusted}). The two main untrusted components in this
    directory are
  \begin{itemize}
   \item The \texttt{Fix.ml} module by Fran\c{c}ois Pottier, which provides a
    generic algorithm to compute the least solution of a system of monotonic
    equations, described in the paper \cite{LazyLeastFixedPointsInML}.
    The trusted code uses this piece of code to do liveness analysis and only
    assumes that the module computes a pre-fixpoint of the system of equations
    provided. The performance of this piece of code is critical to the compiler,
    and the implementation used exploits some clever programming techniques
    and imperative data structures that would be difficult to prove correct.
    Checking whether the result of the computation actually is a pre-fixpoint
    is instead very simple to do using only functional code of low
    computational complexity. Therefore we do not plan to prove this piece of
    code correct. Instead, we will certify a verifier for the results of the
    computation (a sketch of such a verifier is given after this list).

    Note: this module, as well as the next one, has been reused from the
    untrusted compiler. Therefore both have been thoroughly tested for bugs
    during the last year of the project.
   \item The \texttt{coloring.ml} module taken from Fran\c{c}ois Pottier's
    PP compiler, used in an undergraduate compilers course.
    The module takes an interference graph
    (a data structure implemented in module \texttt{untrusted\_interference.ml})
    and colors it, assigning to nodes of the interference graph either a
    color or the constant \texttt{Spilled}. A heuristic is used to drive the
    algorithm: the caller must provide a function that returns the number of
    times a register is accessed, to decrease its likelihood of being spilled.
    To minimise bugs and reduce code size, we actually implement the heuristics
    in Matita and then extract the code. Therefore the untrusted code also calls
    back into extracted code for some of its computations.

    Finally, module \texttt{compute\_colouring.ml} is the one that actually
    computes the colouring of an interference graph of registers given the
    result of the liveness analysis. The code first creates the interference
    graph; then it colours the graph once, using real registers as colours, to
    determine which pseudo-registers need spilling;
    finally it colours the graph again, using real registers and spilling slots
    as colours, to assign to each pseudo-register either a spilling slot or a
    real register.
  \end{itemize}
  The directory also contains a few more files that provide glue code between
  OCaml's data structures from the standard library (e.g. booleans and lists)
  and the corresponding data structures extracted from Matita. The glue is
  necessary to translate results back and forth between the trusted (extracted)
  and untrusted components, because the latter have been written using data
  structures from the standard library of OCaml.
 \item Untrusted code that drives the trusted compiler (directory
   \texttt{driver}). The directory contains the \texttt{acc.ml} module
   that implements the command line interface of the trusted compiler.
   It also contains or links three untrusted components which are safety
   critical and therefore enter the trusted code base of CerCo:
   \begin{itemize}
    \item The CIL parser~\cite{CIL} that parses standard C code and simplifies
      it to C-light code.
    \item A pretty-printer for C-light code, used to print the
      \texttt{-instrumented.c} file.
    \item The instrumenter module (integrated with the pretty-printer for
      C-light code). It takes the output of the compiler and replaces every
      statement \texttt{COST k} (which emits the cost label \texttt{k}) with
      \texttt{\_\_cost\_incr(n)}, where $n$ is the cost associated with
      \texttt{k} in the map returned by the compiler (a toy sketch of this
      replacement is also given after this list). A similar operation is
      performed to update the stack-related global variables. Eventually this
      module needs to be certified with the following specification: if
      a run of the labelled program produces the trace
      $\tau = k_1 \ldots k_n$, the
      equivalent run of the instrumented program should yield the same
      result and be such that at the end the value of the \texttt{\_\_cost}
      variable is equal to $\sum_{k \in \tau} K(k)$, where $K(k)$ is the
      lookup of the cost of $k$ in the cost map $K$ returned by the compiler.
      A similar statement is expected for stack usage.
   \end{itemize}
\end{itemize}
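
The following minimal OCaml sketch illustrates the kind of low-complexity
functional verifier mentioned above for the results of \texttt{Fix.ml}. All
names and representations are hypothetical: we assume the equation system is
given as one monotonic right-hand side per variable, over sets of
pseudo-register names, which is close in spirit to, but much simpler than, the
interface actually used in the compiler.

\begin{verbatim}
module RegSet = Set.Make (String)

type equation_system = {
  variables : int list;                        (* one variable per program point *)
  rhs : (int -> RegSet.t) -> int -> RegSet.t;  (* monotonic right-hand sides *)
}

(* Check that the candidate solution returned by the untrusted solver
   satisfies "rhs sol v is included in sol v" for every variable: a single
   linear scan, with no fixpoint iteration at all. *)
let is_prefixpoint (sys : equation_system) (sol : int -> RegSet.t) : bool =
  List.for_all (fun v -> RegSet.subset (sys.rhs sol v) (sol v)) sys.variables
\end{verbatim}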
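
Similarly, the core of the instrumenter can be pictured as the following toy
OCaml fragment, again with hypothetical names and a deliberately simplified
statement type: every \texttt{COST k} statement is replaced by a call to
\texttt{\_\_cost\_incr(n)}, where $n$ is looked up in the cost map returned by
the compiler.

\begin{verbatim}
type stmt =
  | Cost of string     (* emission of the cost label k *)
  | CostIncr of int    (* call to __cost_incr(n) *)
  | Other of string    (* any other C-light statement, kept abstract here *)

(* Replace every cost emission with the increment prescribed by the map;
   List.assoc raises Not_found on labels missing from the map. *)
let instrument (cost_map : (string * int) list) (program : stmt list) =
  List.map
    (function Cost k -> CostIncr (List.assoc k cost_map) | s -> s)
    program
\end{verbatim}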

\subsection{Functional differences w.r.t. the untrusted compiler}
From the user perspective, the trusted and untrusted compilers differ in some
respects:
\begin{itemize}
 \item The untrusted compiler puts the initialization code for global variables
   at the beginning of \texttt{main}. The trusted compiler uses a separate
   function. Therefore only the trusted compiler allows \texttt{main} to be
   called recursively. To account for
   the initialization code, the \texttt{\_\_cost} variable is not always
   initialized to 0 in the trusted compiler, while it is always 0 in the
   untrusted code.
 \item The two compilers do not compile the code in exactly the same way, even
   if they adopt the same compilation scheme.
   Therefore the object code produced is different and the basic blocks are
   given different costs. On average, the code generated by the trusted
   compiler is about 3 times faster and may use less stack space.
 \item The trusted compiler is slightly slower than the untrusted one, and the
   trusted executable semantics are also slightly slower than the untrusted
   ones.
   The only passes that are significantly slower at the moment are the
   policy computation pass, which is a preliminary to the assembly, and the
   assembly pass. These are the passes that manipulate the largest quantity
   of data, because assembly programs are much larger than the corresponding
   Clight sources. The reason for the slowness is currently under investigation.
   It is likely to be due to the quality of the extracted code (see
   Section~\ref{quality}).
 \item The back-ends of both compilers do not handle external functions,
   because we have not implemented a linker. The trusted compiler fails
   during compilation, while the untrusted compiler silently turns every
   external function call into a \texttt{NOP}.
 \item The untrusted compiler had ad-hoc options to deal with C files
   generated from a Lustre compiler. The ad-hoc code simplified the C files
   by avoiding some calls to external functions, and it added some
   debugging code to print the actual reaction time of every Lustre node.
   The trusted compiler does not implement any ad-hoc Lustre option yet.
\end{itemize}

\subsection{Implementation differences w.r.t. the untrusted compiler}\label{implementation}
The code of the trusted compiler differs greatly from the code of the
untrusted prototype. The main architectural difference is the
representation of back-end languages. In the trusted compiler we have adopted
a unified syntax (data structure), semantics and pretty-printing for every
back-end language.
In order to accommodate the differences between the original languages, the
syntax and semantics have been abstracted over a number of parameters, like
the type of arguments of the instructions. For example, early languages use
pseudo-registers to hold data while late languages store data in real machine
registers or stack locations. The unification of the languages has brought
a few benefits and can potentially bring new ones in the future:
\begin{itemize}
 \item Code size reduction and faster detection and correction of bugs.

   About three quarters of the code for the semantics and pretty-printing of
   back-end languages is shared, while one quarter is pass-dependent. Sharing
   the semantics automatically implies sharing semantic properties, i.e.
   reducing the number of lemmas to be proved to one sixth (6 is the number of
   back-end intermediate languages). Moreover, several back-end passes
   --- a pass between two alternative semantics for RTL, the RTL to ERTL pass
   and the ERTL to LTL pass --- transform a graph instance of the generic
   back-end intermediate language into another graph instance. The
   graph-to-graph transformation has also been generalized and parameterized
   over the pass-specific details. While the code saved in this way is not
   much, several significant lemmas are provided once and for all on the
   transformation. At the time of writing,
   the generalized languages, semantics, transformations and related
   properties take about 3,900 lines of Matita code (definitions and lemmas).

   We also benefit from the typical advantage of code sharing over cut\&paste:
   once a bug is found and fixed, the fix immediately applies to every
   instance. This becomes particularly significant for code certification,
   where one simple bug fix usually requires complex work to fix the
   related correctness proofs.

 \item Some passes and several proofs can be given in greater generality,
   allowing more reuse.

   For example, in the untrusted prototype the
   LTL to LIN pass was turning a graph language into a linearized language
   with the very same instructions and similar semantics. In particular, the
   two semantics shared the same execute phase, while fetching was different.
   The pass consisted in performing a visit of the graph, emitting the
   instructions in visit order. When the visit detected a cycle, the back-link
   arc was represented by a newly introduced explicit \texttt{GOTO} statement
   (a simplified sketch of this visit is given after this list).

   Obviously, the transformation just described could be applied as well to
   any language with a \texttt{GOTO} statement. In our formalization in Matita,
   we have rewritten and proved correct the translation between any two
   instances of the generalized back-end languages such that: 1) the
   fetching-related parameters of the two passes are instantiated respectively
   with graphs and linear operations; 2) the execute-related parameters are
   shared.

   Obviously, we could also prove correct the reverse translation, from a
   linear to a graph-based version of the same language. The two combined
   passes would make it possible to temporarily switch to a graph-based
   representation only when a data-flow analysis over the code is required.
   In our compiler,
   for example, at the moment only the ERTL to LTL pass is based on a data
   flow analysis. A similar pair of translations could also be provided between
   one of the two representations and a static single assignment (SSA) one.
   As a final observation, the insertion of another graph-based language after
   the LTL one is now made easy: the linearization pass need not be redone
   for the new pass.

\item Pass commutation and reuse.
   Every pass is responsible for reducing a difference
   between the source and target languages. For example, the RTL to ERTL pass
   is responsible for the parameter passing conversion, while the ERTL to LTL
   pass performs pseudo-register allocation. At the moment, both CompCert
   and CerCo fix the relative order of the two passes in opposite and
   partially arbitrary ways, and it is not possible to simply swap the
   two passes. We believe instead that it should be possible to generalize
   most passes in such a way that they could be freely composed in any order,
   also with repetitions. For example, real world compilers like GCC perform
   some optimizations like constant propagation multiple times, after
   every optimization that is likely to trigger more constant propagation.
   Thanks to our generalized intermediate language, we can already implement
   a generic constant propagation pass that can be freely applied.
%   For instance, table below shows the amount of Matita
%   code taken by the description of the syntax, semantics and pretty printing
%   rules for the back-end passes. It also compares the size with the untrusted
%   C code.
%   \begin{tabular}[h]{lrr}
%       & Untrusted code & Matita code \\
%    RTL    &   615          &    456 \\
%   ERTL    &   891          &    342 \\
%   LIN     &   568          &     79 \\
%   LTL     &   636          &     94 \\
%   LTL/LIN &                &    466 \\
%   joint   &                &   1615 \\ \hline \\
%   Tot.    & 2,710          &   3,052 \\
%   The tables shows that the size of all untrusted code languages is
%   comparable.
\end{itemize}
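
As an illustration of the linearization pass described in the list above, here
is a simplified OCaml sketch. All names are hypothetical and, for brevity, it
only covers the degenerate case in which every node has at most one successor
(branches are omitted); it conveys, however, how back-link arcs become
explicit \texttt{GOTO}s during the visit.

\begin{verbatim}
type label = string
type node = { instr : string; next : label option }

type lin_instr =
  | Emit of label * string   (* labelled instruction, emitted in visit order *)
  | Goto of label            (* introduced to represent a back-link arc *)

let linearize (graph : (label * node) list) (entry : label) : lin_instr list =
  let visited = Hashtbl.create 17 in
  let rec visit l =
    if Hashtbl.mem visited l then [Goto l]   (* cycle detected: emit a goto *)
    else begin
      Hashtbl.add visited l ();
      let n = List.assoc l graph in
      Emit (l, n.instr)
      :: (match n.next with None -> [] | Some l' -> visit l')
    end
  in
  visit entry
\end{verbatim}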


\subsection{Quality of the extracted code}\label{quality}

We have extracted the Matita code of the compiler to OCaml in order to compile
and execute it efficiently and without any need to install Matita.
The implementation of code extraction for Matita has been obtained by
generalizing the one of Coq over the data structures of Coq, and then
instantiating the resulting code for Matita. Differences between the two calculi
have also been taken into account during the generalization. Therefore we can
expect the extraction procedure to be reasonably bug free, because bugs in the
core of the code extraction would be likely to be detected in Coq too.

The quality of the extracted code ranges from sub-optimal to poor, depending
on the part of the formalization. We analyse here the causes of the poor
quality:
\begin{itemize}
 \item Useless computations. The extraction procedure
   removes from the extracted code almost all of the non-computational parts,
   replacing the ones that are not removed with code of low complexity.
   However, following Coq's tradition, detection of the useless parts is not
   done according to the computationally expensive algorithm by
   Berardi~\cite{berardixxx}. Instead, the user decides which data structures
   should be assigned computational interest by declaring them in one of the
   \texttt{Type\_i} sorts of the Calculus of (Co)Inductive Constructions.
   The non-computational structures are declared in \texttt{Prop}, the sort
   of impredicative, possibly classical propositions. Every computation that
   produces a data structure in \texttt{Prop} is guaranteed to be computationally
   irrelevant. Computations that produce data structures in \texttt{Type\_i},
   instead, may actually be relevant or irrelevant, even if the extraction code
   conservatively considers them relevant. The result is extracted
   OCaml code that computes values that will be passed to functions that do
   not use them, or that will be returned to callers that ignore them.

   The phenomenon is particularly visible when the dependently typed discipline
   is employed, which is our choice for CerCo. Under this discipline, terms
   can be passed to type formers. For example, the data type for back-end
   languages in CerCo is parameterized over the list of global variables that
   may be referred to by the code. Another example is the type of vectors,
   which is parameterized over a natural number that is the size of the vector,
   or the type of vector tries, which is parameterized over the fixed height of
   the tree and can be read and updated only using keys (vectors of
   bits) whose length is the height of the trie. Functions that compute these
   dependent types also have to compute the new indexes (parameters) for the
   types, even if this information is used only for typing. For example,
   appending two vectors requires the computation of the length of the result
   vector just to type the result. In turn, this computation requires the
   lengths of the two vectors in input. Therefore, functions that call append
   have to compute the lengths of the vectors to append even if the lengths
   will actually be ignored by the extracted code of the append function
   (the sketch given after this list illustrates the phenomenon).

   In the literature there are proposals for allowing the user to more
   accurately distinguish computational from non-computational arguments of
   functions. The proposals introduce two different types of
   $\lambda$-abstractions and ad-hoc typing rules to ensure that computationally
   irrelevant bound variables are not used in computationally relevant
   positions. An OCaml prototype that implements these ideas for Coq is
   available~\cite{xxyy}, but it is heavily bugged. We have not yet tried to do
   anything along these lines in Matita. To avoid modifying the system,
   another approach based on the explicit use of a non-computational monad
   has also been proposed in the literature, but it introduces many
   complications in the formalization and it cannot be used in every situation.

   Improving code extraction to more aggressively remove irrelevant
   code from code extracted from Matita is left as future work.
   At the moment, it seems that useless computations are indeed responsible
   for the poor performance of some parts of the extracted code. We have
   experimented with a few manual optimizations of the extracted code and
   we saw that a few minor patches already allow a 25\% speed-up of the
   assembly pass. The code released with this deliverable is the one without
   the manual patches, to maximize reliability.
 \item Poor typing. A nice theoretical result is that the terms of the
   Calculus of Constructions (CoC),
   the upper-right corner of Barendregt's cube, can be re-typed in System
   $F_\omega$, the corresponding typed lambda calculus without dependent
   types~\cite{christinexx}. The calculi implemented by Coq and Matita, however,
   are more expressive than CoC, and several type constructions have no
   counterpart in System $F_\omega$. Moreover, core OCaml does not even
   implement $F_\omega$, but only the Hindley-Milner fragment of it.
   Therefore, in all the situations listed below, code extraction is not able
   to type the extracted code using informative types; it falls back on
   OCaml's super-type --- with \texttt{Obj.magic} casts, abbreviated
   \texttt{\_\_} in the extracted code. The lack of more precise typing has
   very limited impact on the performance of the compiled OCaml code, but it
   makes it very hard to plug the extracted code together with untrusted code.
   The latter needs to introduce explicit unsafe casts between the super-type
   and the concrete types used by instances of the generic functions. Code
   written in this way is very error prone. For this reason we have decided to
   also write in Matita parts of the untrusted code of the CerCo compiler
   (e.g. the pretty-printing functions), in order to benefit from the type
   checker of Matita.

   The exact situations that trigger uses of the super-type are:
   \begin{enumerate}
   \item The calculi feature a cumulative
    hierarchy of universes that allows writing functions that can be used
    both as term formers and type formers, according to the arguments that are
    passed to them. In System $F_\omega$, instead, terms and types are
    syntactically distinct. Extracting terms according to all their possible
    uses may be impractical because the number of uses is exponential in the
    number of arguments of sort $Type_i$ with $i \geq 2$.
   \item Case analysis and recursion over inhabitants of primitive inductive
    types can be used in types (strong elimination), which is not allowed in
    $F_\omega$. In the CerCo compiler we largely exploit type formers declared
    in this way, for example to provide the same level of type safety achieved
    in the untrusted compiler via polymorphic variants~\cite{guarriguesxx}.
    In particular, we have terms to syntactically describe as first class
    citizens the large number of combinations of operand modes of object code
    instructions. On the instructions we provide ``generic'' functions that work
    on some combinations of the operand modes, and whose type is computed by
    primitive recursion on the syntactic description of the operand modes of
    the argument of the function.
   \item $\Sigma$-types and, more generally, dependently typed records can have
    at the same time fields that are type declarations and fields that are
    terms. This situation happens all the time in CerCo because we stick
    to the dependently typed discipline and because we often generalize our
    data structures over the types used in them. Concretely, the generalization
    happens over a record containing a type --- e.g. the type of
    (pseudo)-registers for back-end languages --- some terms working on the
    type --- e.g. functions to set/get values from (pseudo)-registers --- and
    properties over them. In System $F_\omega$, term and type abstractions
    are kept syntactically separate and there is no way to pack them in
    records.
  \item The type of the extracted function does not belong to the Hindley-Milner
    fragment. Sometimes simple code transformations could be used to make the
    function typeable, but the increased complexity of the extraction code would
    outweigh the benefits.
 \end{enumerate}
\end{itemize}
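
The following OCaml fragment --- an illustration, not actual CerCo output ---
shows the shape of the useless computations described in the list above for
vector append. After extraction the length index of the vector type is erased,
yet the length arguments survive even though the body never uses them for the
result, and every caller must still build those naturals.

\begin{verbatim}
type nat = O | S of nat

type 'a vector = VNil | VCons of 'a * 'a vector  (* length index erased *)

(* The length arguments n and m are dead code, yet they are kept. *)
let rec vappend (n : nat) (m : nat) (v1 : 'a vector) (v2 : 'a vector) =
  match v1 with
  | VNil -> v2
  | VCons (x, tl) -> VCons (x, vappend n m tl v2)

let rec vlength = function VNil -> O | VCons (_, tl) -> S (vlength tl)

(* A caller pays a linear-time computation of both lengths, only for the
   results to be ignored by vappend. *)
let concat v1 v2 = vappend (vlength v1) (vlength v2) v1 v2
\end{verbatim}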

We should note how other projects, in particular CompCert, have decided from
the beginning to avoid dependent types in order to guarantee high quality of
the extracted code and maximal control over it. Therefore, at the current
state of the art of
code extraction, there seems to be a huge trade-off between extracted code
quality and the exploitation of advanced typing and proving techniques in the
source code. In the near future, the code base of CerCo can provide an
interesting example of a large formalization based on dependent types and
in need of high quality extracted code. Therefore we plan to use it as a
driver and test bench for future work on the improvement of code extraction.
In particular, we are planning to study the following improvements to the
code extraction of Matita:
\begin{itemize}
 \item We already have a prototype that extracts code from Matita to GHC plus
  several extensions that allow GHC to use a very large subset of System
  $F_\omega$. However, the prototype is not fully functional yet because we
  still need to solve at the theoretical level a problem of interaction
  between $F_\omega$ types and strong elimination. Roughly speaking, the
  two branches of a strong elimination always admit a most general unifier in
  Hindley-Milner plus the super-type \texttt{Obj.magic}, but the same property
  is lost for $F_\omega$. As a consequence, we lose modularity in code
  extraction and we need to figure out static analysis techniques to reduce
  the impact of the loss of modularity.
 \item The two most recent versions of OCaml have introduced first class
  modules, which are exactly the feature needed for extracting code that uses
  records containing both type and term declarations (see the sketch after
  this list). However, the syntax
  required for first class modules is extremely cumbersome and it requires
  the explicit introduction of type expressions to make manifest the type
  declaration/definition fields of the module. This complicates code
  extraction with the need to perform some computations at extraction time,
  which is not in the tradition of code extraction. Moreover, the actual
  performance of OCaml code that uses first class modules heavily is unknown
  to us. We plan to experiment with first class modules for extraction very
  soon.
  \item Algebraic data types are generalized to families of algebraic data
    types in the calculi implemented by Coq and Matita, even if the two
    generalizations are different. Families of algebraic data types are
    traditionally not supported by programming languages, but some
    restrictions have recently been considered under the name of
    Generalized Algebraic Data Types (GADTs), and they are now implemented in
    recent versions of OCaml and GHC. A future work is the investigation of
    the possibility of exploiting GADTs during code extraction.
\end{itemize}
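
To make the point about first class modules concrete, the following
hypothetical OCaml sketch packs a type together with operations on it, loosely
in the style of the back-end parameter records mentioned above. Even in this
tiny example, the explicit signature and the \texttt{(val ...)} unpacking
syntax show the verbosity we referred to.

\begin{verbatim}
module type REGISTERS = sig
  type reg                          (* the type field of the record *)
  val fresh : unit -> reg           (* term fields operating on it *)
  val to_string : reg -> string
end

(* A first class value packing the type int together with its operations. *)
let int_registers : (module REGISTERS) =
  (module struct
    type reg = int
    let counter = ref 0
    let fresh () = incr counter; !counter
    let to_string r = "r" ^ string_of_int r
  end)

(* Unpacking requires an explicit module binding and signature. *)
let demo () =
  let module R = (val int_registers : REGISTERS) in
  R.to_string (R.fresh ())
\end{verbatim}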

\section{The Cost-Annotating Plug-In}

The functionalities of the Cost-Annotating Plug-In have already been described
in Deliverables~D5.1 and D5.3.
The plug-in interfaces with the CerCo compiler via
the command line. Therefore there was no need to update the plug-in code for
integration in the Trusted CerCo Prototype. Indeed, it is even possible
to install the untrusted and the trusted compilers at the same time. The
\texttt{-cost-acc} flag of the plug-in can be used to select the executable
to be used for compilation.

The code of the plug-in has been modified w.r.t. D5.1 to also take care
of the cost model for stack-size consumption. From the user's point of view,
time and space cost annotations and invariants are handled in a similar way.
Nevertheless, we expect automated theorem provers to face more difficulties
in dealing with stack usage, because elapsed time is additive, whereas what
is interesting for space usage is the maximum amount of stack used, which is
not additive. On the other hand, programs produced by our compiler require
more stack only at function calls. Therefore the proof obligations generated
to bound the maximum stack size of non-recursive programs are trivial.
Most C programs, and in particular those used in time critical systems, avoid
recursive functions.
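
As a hypothetical example of this difference, consider a \texttt{main} that
first calls $f$ and later calls $g$. Writing $T(\cdot)$ and $S(\cdot)$ for the
time and maximum-stack bounds, and $t_{\mathtt{main}}$, $s_{\mathtt{main}}$
for the local contributions of \texttt{main}, the bounds compose as
\[
T(\mathtt{main}) = t_{\mathtt{main}} + T(f) + T(g)
\qquad\text{but}\qquad
S(\mathtt{main}) = s_{\mathtt{main}} + \max(S(f),\,S(g)),
\]
and it is the $\max$ in the second bound that automated provers handle less
easily.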

\section{Connection with other deliverables}
\label{subsect.connection.other.deliverables}

This deliverable represents the final milestone of the CerCo project.
The software delivered links together most of the software already developed
in previous deliverables, and it benefits from the studies performed in
other deliverables. In particular:

\begin{itemize}
 \item The compiler links the code extracted from the executable formal
  semantics of C (D3.1), machine code (D4.1), front-end intermediate languages
  (D3.3) and back-end intermediate languages (D4.3). The \texttt{-is} flag
  of the compiler invokes the semantics of every intermediate
  representation of the program to be compiled. The executability of the
  C and machine code languages has been important to debug the two
  semantics, which are part of the trusted code base of the compiler.
  The executability of the intermediate languages has been important during
  development for early detection of bugs both in the semantics and in the
  compiler passes. They are also useful to users for profiling programs in
  early compilation stages to detect where the code spends most of its time.
  Those semantics, however, are not part of the trusted code base.
 \item The compiler links the code extracted from the CIC encoding of the
  front-end (D3.2) and back-end (D4.2). The two encodings have been partially
  proved correct in D3.4 (Front End Correctness Proof) and D4.4 (Back End
  Correctness Proof). The formal proofs to be delivered in those deliverables
  have not been completed. The most urgent future work after the end of the
  project will consist in completing those proofs.
 \item Debian packages have been provided in D6.6 for the Cost-Annotating
  Plug-In, the Untrusted CerCo Compiler and the Trusted CerCo Compiler.
  The first and the third, installed together, form a full installation of the
  Trusted CerCo Prototype. In order to allow interested people to test the
  prototype more easily, we also provided in D6.6 a Live CD based on
  Debian with the CerCo packages pre-installed.
\end{itemize}

\end{document}