\documentclass[11pt, epsf, a4wide]{article}

\usepackage{../style/cerco}

\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage[english]{babel}
\usepackage{graphicx}
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage{stmaryrd}
\usepackage{url}

\title{
INFORMATION AND COMMUNICATION TECHNOLOGIES\\
(ICT)\\
PROGRAMME\\
\vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}}

\lstdefinelanguage{matita-ocaml}
{keywords={ndefinition,ncoercion,nlemma,ntheorem,nremark,ninductive,nrecord,nqed,nlet,let,in,rec,match,return,with,Type,try},
morekeywords={[2]nwhd,nnormalize,nelim,ncases,ndestruct},
morekeywords={[3]type,of},
mathescape=true,
}

\lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false,
keywordstyle=\color{red}\bfseries,
keywordstyle=[2]\color{blue},
keywordstyle=[3]\color{blue}\bfseries,
commentstyle=\color{green},
stringstyle=\color{blue},
showspaces=false,showstringspaces=false}

\lstset{extendedchars=false}
\lstset{inputencoding=utf8x}
\DeclareUnicodeCharacter{8797}{:=}
\DeclareUnicodeCharacter{10746}{++}
\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}

\date{}
\author{}

\begin{document}

\thispagestyle{empty}

\vspace*{-1cm}
\begin{center}
\includegraphics[width=0.6\textwidth]{../style/cerco_logo.png}
\end{center}

\begin{minipage}{\textwidth}
\maketitle
\end{minipage}

\vspace*{0.5cm}
\begin{center}
\begin{LARGE}
\textbf{
Report n. D5.2\\
Trusted CerCo Prototype}
\end{LARGE}
\end{center}

\vspace*{2cm}
\begin{center}
\begin{large}
Version 1.0
\end{large}
\end{center}

\vspace*{0.5cm}
\begin{center}
\begin{large}
Main Authors:\\
XXXX %Dominic P. Mulligan and Claudio Sacerdoti Coen
\end{large}
\end{center}

\vspace*{\fill}

\noindent
Project Acronym: \cerco{}\\
Project full title: Certified Complexity\\
Proposal/Contract no.: FP7-ICT-2009-C-243881 \cerco{}\\

\clearpage
\pagestyle{myheadings}
\markright{\cerco{}, FP7-ICT-2009-C-243881}

\newpage

\vspace*{7cm}
\paragraph{Abstract}
The Trusted CerCo Prototype is meant to be the last piece of software produced
in the CerCo project. It consists of
\begin{enumerate}
\item A compiler from a large subset of C to the Intel HEX format for the
object code of the 8051 microprocessor family. The compiler also computes
the cost models for the time spent in basic blocks and the stack space used
by the functions in the source code. The cost models are exposed to the
user by producing an instrumented C code, obtained by injecting into the
original source code instructions that update three global variables
recording the current clock, the current stack usage and the maximum stack
usage so far.
\item A plug-in for the Frama-C Software Analyser architecture. The plug-in
takes as input a C file, compiles it with the CerCo compiler and then
automatically infers cost invariants for every loop and every function in
the source code. Frama-C can feed the invariants to various theorem
provers to verify them automatically. Those that are not automatically
verified can be checked manually by the user with a theorem prover.
\end{enumerate}

The plug-in is fully functional and it does not need to be formally verified,
because it does not belong to the trusted code base of programs verified using
CerCo's technology. A bug in the plug-in can only produce wrong time/space
invariants and bounds, which will then be rejected by the automatic provers.

The compiler is currently fully functional, even if not fully
certified yet. The certification will continue after the end of the project.

In this deliverable we discuss the status of the Trusted CerCo Prototype at the
time of the official end of the project.
\newpage

\tableofcontents

\newpage

\section{Task}
\label{sect.task}

The Grant Agreement describes deliverable D5.2 as follows:
\begin{quotation}
\textbf{Trusted CerCo Prototype}: Final, fully trustable version of the
system.
\end{quotation}

This report describes the state of the implementation of the Trusted CerCo
Prototype at the official end of the project.

\section{The Trusted CerCo Compiler}
\subsection{Command line interface}
The Trusted CerCo Compiler is the first component of the Trusted CerCo
Prototype, the second one being the Frama-C plug-in already developed in
D5.1 and not modified since. The Trusted CerCo Compiler replaces the Untrusted
CerCo Compiler already delivered in D5.1 as part of the Untrusted CerCo
Prototype. The trusted and untrusted compilers are meant to provide the
same command line interface, so that they can be easily swapped without
affecting the plug-in. In practice, the two compilers share the following
minimal subset of options, which is sufficient for the plug-in:

\begin{verbatim}
Usage: acc [options] file...
 -a      Add cost annotations on the source code.
 -is     Outputs and interprets all the compilation passes,
         showing the execution traces
 -o      Prefix of the output files.
\end{verbatim}

\begin{figure}[t]
\begin{verbatim}
char a[] = {3, 2, 7, -4};
char threshold = 4;

int main() {
  char j;
  char *p = a;
  int found = 0;
  for (j=0; j < 4; j++) {
    if (*p <= threshold) { found++; }
    p++;
  }
  return found;
}
\end{verbatim}
\caption{A simple C program~\label{test1}}
\end{figure}

\begin{figure}[!p]
\begin{verbatim}
int __cost = 33;

int __stack_size = 5, __stack_size_max = 5;

void __cost_incr(int incr) {
  __cost = __cost + incr;
}

void __stack_size_incr(int incr) {
  __stack_size = __stack_size + incr;
  __stack_size_max = __stack_size_max < __stack_size ? __stack_size : __stack_size_max;
}

unsigned char a[4] = { 3, 2, 7, 252, };

unsigned char threshold = 4;

int main(void)
{
  unsigned char j;
  unsigned char *p;
  int found;
  __stack_size_incr(0);
  __cost_incr(91);
  p = a;
  found = 0;
  for (j = (unsigned char)0; (int)j < 4; j = (unsigned char)((int)j + 1)) {
    __cost_incr(76);
    if ((int)*p <= (int)threshold) {
      __cost_incr(144);
      found = found + 1;
    } else {
      __cost_incr(122);
      /*skip*/;
    }
    p = p + 1;
  }
  __cost_incr(37);
  /*skip*/;
  __stack_size_incr(-0);
  return found;
  __stack_size_incr(-0);

}
\end{verbatim}
\caption{The instrumented version of the program in Figure~\ref{test1}\label{itest1}}
\end{figure}

Let the file \texttt{test.c} contain the code presented in Figure~\ref{test1}.
By calling \texttt{acc -a test.c}, the user obtains the following files:
\begin{itemize}
\item \texttt{test-instrumented.c} (Figure~\ref{itest1}): the file is a copy
 of \texttt{test.c} obtained by adding code that keeps track of the amount
 of time and space used by the source code.

 The global variable
 \texttt{\_\_cost} records the number of clock cycles used by the
 microprocessor. Its initial value may not be zero because we emit object code
 to initialize the global variables of the program. The initialization code
 is not part of \texttt{main} (to allow \texttt{main} to be recursive, even
 if the standard is not clear on whether this should be allowed or not).
 The initial value of \texttt{\_\_cost} is the time spent in the
 initialization code.
 The \texttt{\_\_cost} variable is incremented at the beginning of every
 basic block using the \texttt{\_\_cost\_incr} function, whose argument
 is the number of clock cycles that will be spent by the basic block that
 is beginning. Therefore the value stored in the variable is always a safe
 upper bound of the actual time. Moreover, the difference between the values
 of the \texttt{\_\_cost} variable at the beginning and at the end of a
 function is also exact. The latter statement --- with several technical
 complications --- is the one that will eventually be certified in Matita.
 A worked example of this cost accounting for the program of
 Figure~\ref{itest1} is given just after this list.

 The global variables \texttt{\_\_stack\_size} and
 \texttt{\_\_stack\_size\_max} are handled similarly to \texttt{\_\_cost}.
 Their values represent the amount of external data memory
 currently in use and the maximum amount of memory used so far by the program.
 The two values are updated by the \texttt{\_\_stack\_size\_incr} function
 at the beginning of every function body, at its end and just before every
 \texttt{return}. A negative increment is used to represent stack release.
 The initial value of the two variables may not be zero because some
 external data memory is used for global and static variables. Moreover, the
 last byte of external data memory may not be addressable in the 8051, so
 we avoid using it. The compiled code runs correctly only until stack
 overflow, which happens when the value of \texttt{\_\_stack\_size} reaches
 $2^{16}$. It is up to the user to state and maintain the invariant that no
 stack overflow occurs. In case of stack overflow, the compiled code no
 longer simulates the source code. Automatic provers are often able to
 prove the invariant in simple cases.

 In order to instrument the code, all basic blocks that are hidden in the
 source code but that will contain code in the object code need to be made
 manifest. In particular, \texttt{if-then} instructions without an explicit
 \texttt{else} are given one (compare Figures~\ref{test1} and \ref{itest1}).
\item \texttt{test.hex}: the file is an Intel HEX representation of the
 object code for an 8051 microprocessor. The file can be loaded in any
 8051 emulator (like the MCU 8051 IDE) for running or disassembling it.
 Moreover, an executable semantics (an emulator) for the 8051 is linked
 with the CerCo compilers and can be used to run the object code at the
 end of the compilation.
\item \texttt{test.cerco} and \texttt{test.stack\_cerco}: these two files
 are used to pass information from the CerCo compilers to the Frama-C plug-in
 and are not interesting to the user. They just list the
 global variables and increment functions inserted by the compiler and used
 later by the plug-in to compute the cost invariants.
\end{itemize}
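
As a concrete illustration of the cost annotations (a hand computation on the
code of Figure~\ref{itest1}, not something produced by the tools), consider a
run of the instrumented program: the loop body is executed four times and the
\texttt{then} branch is taken twice, so the final value of \texttt{\_\_cost} is
\[
33 + 91 + 4 \cdot 76 + 2 \cdot 144 + 2 \cdot 122 + 37 = 997
\]
clock cycles, where $33$ is the initial value accounting for the
initialization of the global variables.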

When the option \texttt{-is} is used, the compilers run the program after
every intermediate compilation pass. The untrusted compiler only outputs the
trace of (cost) observables and the final value returned by \texttt{main}; the
trusted compiler also observes every function call and return (the stack-usage
observables) and therefore allows program execution to be tracked more closely.
The traces observed after every pass should always be equal, up to the initial
observable that corresponds to the initialization of global variables. That
observable is currently only emitted in back-end languages. The preservation of
the traces will actually be guaranteed by the main theorem of the formalization
(the forward simulation theorem) once the proof is completed.
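
The comparison performed on the traces is straightforward. The following
OCaml sketch (with illustrative names, not the ones used in the compiler)
shows the kind of check involved, assuming observables are represented by a
simple datatype:
\begin{verbatim}
(* Hypothetical sketch: observables emitted when running with -is. *)
type observable =
  | Cost of string      (* cost label crossed                        *)
  | Call of string      (* function entered (stack-usage observable) *)
  | Return of string    (* function left (stack-usage observable)    *)

(* Drop the optional initial observable for global initialization. *)
let drop_init = function
  | Cost "_init_globals" :: rest -> rest
  | trace -> trace

(* Two passes agree when their traces coincide after dropping it. *)
let traces_agree t1 t2 = drop_init t1 = drop_init t2
\end{verbatim}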

The compilers also create a file
for each pass with the intermediate code. However, the syntaxes used by the two
compilers for the intermediate passes are not the same and we do not yet output
any intermediate syntax for the assembly code. The latter can be obtained
by disassembling the object code, e.g.\ by using the MCU 8051 IDE.

\subsection{Code structure}
The code of the trusted compiler is made of three different parts:
\begin{itemize}
\item Code extracted to OCaml from the formalization of the compiler in Matita.
 This is the code that will eventually be fully certified using Matita.
 It is fully contained in the \texttt{extracted} directory (not comprising
 the subdirectory \texttt{unstrusted}).
 It translates the source C-light code to:
 \begin{itemize}
 \item An instrumented version of the same C-light code. The instrumented
  version is obtained by inserting cost emission statements~\texttt{COST k}
  at the beginning of basic blocks. The emitted labels are the ones that
  are observed when calling \texttt{acc -is}. They will be replaced in the
  final instrumented code with increments of the \texttt{\_\_cost}
  variable.
 \item The object code for the 8051 as a list of bytes to be loaded in
  code memory. The code is coupled with a trie over bitvectors that maps
  program counters to cost labels \texttt{k}. The intended semantics is
  that, when the current program counter is associated to \texttt{k},
  the observable label \texttt{k} should be emitted during the execution
  of the next object code instruction. Similarly, a second trie, reminiscent
  of a symbol table, maps program counters to function names in order to emit
  the observables associated with stack usage.
 \end{itemize}
 During the ERTL to LTL pass, which deals with register allocation,
 the extracted code calls two untrusted components that we are now going to
 describe.
\item Untrusted code called by the trusted compiler (directory
 \texttt{extracted/unstrusted}). The two main untrusted components in this
 directory are
 \begin{itemize}
 \item The \texttt{Fix.ml} module by Fran\c{c}ois Pottier, which provides a
  generic algorithm to compute the least solution of a system of monotonic
  equations, described in the paper \cite{LazyLeastFixedPointsInML}.
  The trusted code uses this piece of code to do liveness analysis and only
  assumes that the module computes a pre-fixpoint of the system of equations
  provided. The performance of this piece of code is critical to the compiler
  and the implementation used exploits some clever programming techniques
  and imperative data structures that would be difficult to prove correct.
  Checking whether the result of the computation is actually a pre-fixpoint
  is instead very simple to do using only functional code of low
  computational complexity. Therefore we do not plan to prove this piece of
  code correct. Instead, we will certify a verifier for the results of the
  computation (a minimal sketch of such a verifier is shown after this list).

  Note: this module, as well as the next one, has been reused from the
  untrusted compiler. Both have therefore been thoroughly tested for bugs
  during the last year of the project.
 \item The \texttt{coloring.ml} module taken from Fran\c{c}ois Pottier's
  PP compiler, used in an undergraduate compilers course.
  The module takes an interference graph
  (a data structure implemented in module \texttt{untrusted\_interference.ml})
  and colors it, assigning to each node of the interference graph either a
  color or the constant \texttt{Spilled}. A heuristic is used to drive the
  algorithm: the caller must provide a function that returns the number of
  times a register is accessed, to decrease its likelihood of being spilled.
  To minimise bugs and reduce code size, we actually implement the heuristics
  in Matita and then extract the code. Therefore the untrusted code also calls
  back into extracted code for some of its computations.

  Finally, module \texttt{compute\_colouring.ml} is the one that actually
  computes the colouring of an interference graph of registers given the
  result of the liveness analysis. The code first creates the interference
  graph; then it colours the graph once, using real registers as colours, to
  determine which pseudo-registers need spilling;
  finally it colours the graph again, using real registers and spilling slots
  as colours, to assign to each pseudo-register either a spilling slot or a
  real register.
 \end{itemize}
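
As an illustration of the planned verifier, here is a minimal OCaml sketch of
the pre-fixpoint check (the module type and names are illustrative and do not
reflect the actual CerCo interfaces): given the candidate solution returned by
the untrusted solver, it suffices to check that, for every equation, the
right-hand side evaluated on the candidate is below the value assigned to the
left-hand side.
\begin{verbatim}
(* Hypothetical sketch of a pre-fixpoint verifier. *)
module type EQUATIONS = sig
  type variable
  type property                             (* element of the lattice      *)
  val leq : property -> property -> bool    (* lattice order               *)
  val rhs : variable -> (variable -> property) -> property
                                            (* right-hand side of equation *)
  val variables : variable list
end

module Check (E : EQUATIONS) = struct
  (* [valuation] is a pre-fixpoint iff, for every variable v,
     rhs v valuation <= valuation v. *)
  let is_prefixpoint (valuation : E.variable -> E.property) : bool =
    List.for_all (fun v -> E.leq (E.rhs v valuation) (valuation v))
      E.variables
end
\end{verbatim}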
 The directory also contains a few more files that provide glue code between
 OCaml's data structures from the standard library (e.g.\ booleans and lists)
 and the corresponding data structures extracted from Matita. The glue is
 necessary to translate results back and forth between the trusted (extracted)
 and untrusted components, because the latter have been written using data
 structures from the standard library of OCaml.
\item Untrusted code that drives the trusted compiler (directory
 \texttt{driver}). The directory contains the \texttt{acc.ml} module
 that implements the command line interface of the trusted compiler.
 It also contains or links three untrusted components which are safety
 critical and therefore enter the trusted code base of CerCo:
 \begin{itemize}
 \item The CIL parser~\cite{CIL} that parses standard C code and simplifies
  it to C-light code.
 \item A pretty-printer for C-light code, used to print the
  \texttt{-instrumented.c} file.
 \item The instrumenter module (integrated with the pretty-printer for
  C-light code). It takes the output of the compiler and replaces every
  statement \texttt{COST k} (that emits the cost label \texttt{k}) with
  \texttt{\_\_cost\_incr(n)}, where $n$ is the cost associated to
  \texttt{k} in the map returned by the compiler. A similar operation is
  performed to update the stack-related global variables. Eventually this
  module needs to be certified against the following specification: if
  a run of the labelled program produces the trace
  $\tau = k_1 \ldots k_n$, the
  equivalent run of the instrumented program should yield the same
  result and be such that at the end the value of the \texttt{\_\_cost}
  variable is equal to $\sum_{k \in \tau} K(k)$, where $K(k)$ is the
  lookup of the cost of $k$ in the cost map $K$ returned by the compiler
  (a small sketch of this accounting is given after this list).
  A similar statement is expected for stack usage.
 \end{itemize}
\end{itemize}
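
The expected relation between traces and the \texttt{\_\_cost} variable can be
phrased as a simple reference computation. The OCaml sketch below (with
illustrative names only, assuming cost labels are strings and the cost map is
an association list) computes the value that \texttt{\_\_cost} must hold at
the end of a run that emitted the trace $\tau$, given its initial value:
\begin{verbatim}
(* Hypothetical sketch of the instrumenter's accounting specification. *)
let expected_cost (initial : int)
                  (cost_map : (string * int) list)   (* the map K     *)
                  (trace : string list)              (* the trace tau *)
                  : int =
  List.fold_left (fun acc k -> acc + List.assoc k cost_map) initial trace

(* For a trace [k1; k2; k2] the instrumented program must end with
   __cost = initial + K(k1) + 2 * K(k2). *)
\end{verbatim}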

\subsection{Functional differences w.r.t. the untrusted compiler}
From the user perspective, the trusted and the untrusted compiler exhibit some
differences:
\begin{itemize}
\item The untrusted compiler puts the initialization code for global variables
 at the beginning of \texttt{main}. The trusted compiler uses a separate
 function. Therefore only the trusted compiler allows recursive calls to
 \texttt{main}. To account for the initialization code, the \texttt{\_\_cost}
 variable is not always initialized to 0 by the trusted compiler, while it is
 always 0 in the untrusted one.
\item The two compilers do not compile the code in exactly the same way, even
 if they adopt the same compilation scheme.
 Therefore the object code produced is different and the basic blocks are
 given different costs. On average, the code generated by the trusted
 compiler is about 3 times faster and may use less stack space.
\item The trusted compiler is slightly slower than the untrusted one and the
 trusted executable semantics are also slightly slower than the untrusted
 ones.
 The only passes that are significantly slower at the moment are the
 policy computation pass, which is a preliminary to assembly, and the
 assembly pass itself. These are the passes that manipulate the largest
 quantity of data, because assembly programs are much larger than the
 corresponding Clight sources. The reason for the slowness is currently under
 investigation. It is likely to be due to the quality of the extracted code
 (see Section~\ref{quality}).
\item The back-ends of both compilers do not handle external functions,
 because we have not implemented a linker. The trusted compiler fails
 during compilation, while the untrusted compiler silently turns every
 external function call into a \texttt{NOP}.
\item The untrusted compiler had ad-hoc options to deal with C files
 generated by a Lustre compiler. The ad-hoc code simplified the C files
 by avoiding some calls to external functions and it added some
 debugging code to print the actual reaction time of every Lustre node.
 The trusted compiler does not implement any ad-hoc Lustre option yet.
\end{itemize}

\subsection{Implementation differences w.r.t. the untrusted compiler}\label{implementation}
The code of the trusted compiler greatly differs from the code of the
untrusted prototype. The main architectural difference concerns the
representation of back-end languages. In the trusted compiler we have adopted
a unified syntax (data structure), semantics and pretty-printing for every
back-end language.
In order to accommodate the differences between the original languages, the
syntax and semantics have been abstracted over a number of parameters, like
the type of arguments of the instructions. For example, early languages use
pseudo-registers to hold data while late languages store data into real machine
registers or stack locations. The unification of the languages has brought
a few benefits and can potentially bring new ones in the future (an OCaml
sketch of the underlying idea is given after the following list):
\begin{itemize}
\item Code size reduction and faster detection and correction of bugs.

 About three quarters of the code for the semantics and pretty-printing of
 back-end languages is shared, while one quarter is pass-dependent. Sharing
 the semantics automatically implies sharing semantic properties, i.e.
 reducing to one sixth the number of lemmas to be proved (six being the number
 of back-end intermediate languages). Moreover, several back-end passes
 --- a pass between two alternative semantics for RTL, the RTL to ERTL pass
 and the ERTL to LTL pass --- transform a graph instance of the generic
 back-end intermediate language into another graph instance. The
 graph-to-graph transformation has also been generalized and parameterized
 over the pass-specific details. While the amount of code saved in this way is
 small, several significant lemmas about the transformation are provided once
 and for all. At the time of writing,
 the generalized languages, semantics, transformations and related
 properties take about 3,900 lines of Matita code (definitions and lemmas).

 We also benefit from the typical advantage of code sharing over cut\&paste:
 once a bug is found and fixed, the fix immediately applies to every
 instance. This becomes particularly significant for code certification,
 where one simple bug fix usually requires complex work to repair the
 related correctness proofs.

\item Some passes and several proofs can be given in greater generality,
 allowing more reuse.

 For example, in the untrusted prototype the
 LTL to LIN pass turned a graph language into a linearized language
 with the very same instructions and similar semantics. In particular, the
 two semantics shared the same execute phase, while fetching was different.
 The pass consisted of a visit of the graph that emitted the
 instructions in visit order. When the visit detected a cycle, the back-link
 arc was represented by a newly introduced explicit \texttt{GOTO} statement.

 Obviously, the transformation just described could be applied as well to
 any language with a \texttt{GOTO} statement. In our formalization in Matita,
 we have rewritten and proved correct the translation between any two
 instances of the generalized back-end languages such that: 1) the
 fetching-related parameters of the two instances are instantiated
 respectively with graphs and with linear code; 2) the execute-related
 parameters are shared.

 Obviously, we could also prove correct the reverse translation, from a
 linear to a graph-based version of the same language. The two combined
 passes would make it possible to switch temporarily to a graph-based
 representation only when a data-flow analysis over the code is required.
 In our compiler,
 for example, at the moment only the RTL to ERTL pass is based on a data
 flow analysis. A similar pair of translations could also be provided between
 one of the two representations and a static single assignment (SSA) one.
 As a final observation, the insertion of another graph-based language after
 the LTL one is now made easy: the linearization pass need not be redone
 for the new pass.

\item Pass commutation and reuse.
 Every pass is responsible for reducing a difference
 between the source and target languages. For example, the RTL to ERTL pass
 is responsible for the parameter passing conversion, while the ERTL to LTL
 pass performs pseudo-register allocation. At the moment, both CompCert
 and CerCo fix the relative order of the two passes in opposite, and
 partially arbitrary, ways and it is not possible to simply swap the
 two passes. We believe instead that it should be possible to generalize
 most passes in such a way that they could be freely composed in any order,
 also with repetitions. For example, real world compilers like GCC perform
 some optimizations like constant propagation multiple times, after
 every optimization that is likely to trigger more constant propagation.
 Thanks to our generalized intermediate language, we can already implement
 a generic constant propagation pass that can be freely applied.
% For instance, table below shows the amount of Matita
% code taken by the description of the syntax, semantics and pretty printing
% rules for the back-end passes. It also compares the size with the untrusted
% C code.
% \begin{tabular}[h]{lrr}
%  & Untrusted code & Matita code \\
% RTL & 615 & 456 \\
% ERTL & 891 & 342 \\
% LIN & 568 & 79 \\
% LTL & 636 & 94 \\
% LTL/LIN & & 466 \\
% joint & & 1615 \\ \hline \\
% Tot. & 2,710 & 3,052 \\
% The tables shows that the size of all untrusted code languages is
% comparable.
\end{itemize}
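
The following OCaml sketch conveys the idea behind the unified back-end
syntax (the constructor and type names are purely illustrative and much
simpler than the actual Matita definitions): a single instruction type is
parameterized over the kind of arguments used by each language, so that the
same definitions, semantics and lemmas can be instantiated several times.
\begin{verbatim}
(* Hypothetical sketch of a unified, parameterized back-end syntax. *)
type label = string

type 'reg joint_instruction =
  | Move of 'reg * 'reg
  | Op   of string * 'reg * 'reg        (* abstract binary operation *)
  | Call of string
  | Cost of label                       (* cost-label emission       *)
  | Goto of label

(* Early languages instantiate 'reg with pseudo-registers ...        *)
type pseudo_register = Psd of int
type rtl_instruction = pseudo_register joint_instruction

(* ... while late languages use hardware registers or stack slots.   *)
type hw_location = Reg of int | SpillSlot of int
type ltl_instruction = hw_location joint_instruction
\end{verbatim}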


\subsection{Quality of the extracted code}\label{quality}

We have extracted the Matita code of the compiler to OCaml in order to compile
and execute it in an efficient way and without any need to install Matita.
The implementation of code extraction for Matita has been obtained by
generalizing the one of Coq over the data structures of Coq, and then
instantiating the resulting code for Matita. Differences between the two
calculi have also been taken into account during the generalization. Therefore
we can expect the extraction procedure to be reasonably bug free, because bugs
in the core of the code extraction would be likely to be detected in Coq as
well.

The quality of the extracted code ranges from sub-optimal to poor, depending
on the part of the formalization. We analyse here the causes of the poor
quality:
\begin{itemize}
\item Useless computations. The extraction procedure
 removes from the extracted code almost all of the non-computational parts,
 replacing the ones that are not removed with low-complexity code.
 However, following Coq's tradition, detection of the useless parts is not
 done according to the computationally expensive algorithm by
 Berardi~\cite{berardixxx}. Instead, the user decides which data structures
 should be considered computationally relevant by declaring them in one of the
 \texttt{Type\_i} sorts of the Calculus of (Co)Inductive Constructions.
 The non-computational structures are declared in \texttt{Prop}, the sort
 of impredicative, possibly classical propositions. Every computation that
 produces a data structure in \texttt{Prop} is guaranteed to be computationally
 irrelevant. Computations that produce data structures in \texttt{Type\_i},
 instead, may actually be relevant or irrelevant, even if code extraction
 conservatively considers them relevant. The result is extracted
 OCaml code that computes values which are then passed to functions that do
 not use them, or returned to callers that ignore them.

 The phenomenon is particularly visible when the dependently typed discipline
 is employed, which is our choice for CerCo. Under this discipline, terms
 can be passed to type formers. For example, the data type for back-end
 languages in CerCo is parameterized over the list of global variables that
 may be referred to by the code. Another example is the type of vectors,
 which is parameterized over a natural number representing the size of the
 vector, or the type of vector tries, which is parameterized over the fixed
 height of the tree and which can be read and updated only using keys (vectors
 of bits) whose length is the height of the trie. Functions that compute these
 dependent types also have to compute the new indexes (parameters) for the
 types, even if this information is used only for typing. For example,
 appending two vectors requires the computation of the length of the result
 vector just to type the result. In turn, this computation requires the
 lengths of the two input vectors. Therefore, functions that call append
 have to compute the lengths of the vectors to append even if those lengths
 will actually be ignored by the extracted code of the append function.

 In the literature there are proposals to allow the user to distinguish more
 accurately computational from non-computational arguments of
 functions. The proposals introduce two different types of
 $\lambda$-abstractions and ad-hoc typing rules to ensure that computationally
 irrelevant bound variables are not used in computationally relevant
 positions. An OCaml prototype that implements these ideas for Coq is
 available~\cite{xxyy}, but it is heavily bugged. We have not yet tried to do
 anything along these lines in Matita. To avoid modifying the system,
 another approach, based on the explicit use of a non-computational monad,
 has also been proposed in the literature, but it introduces many
 complications in the formalization and it cannot be used in every situation.

 Improving code extraction to remove irrelevant code more aggressively from
 code extracted from Matita is left as future work.
 At the moment, it seems that useless computations are indeed responsible
 for the poor performance of some parts of the extracted code. We have
 experimented with a few manual optimizations of the extracted code and
 observed that a few minor patches already allow a 25\% speed-up of the
 assembly pass. To maximize reliability, the code released with this
 deliverable is the one without the manual patches.
\item Poor typing. A nice theoretical result is that the terms of the
 Calculus of Constructions (CoC),
 the upper-right corner of Barendregt's cube, can be re-typed in System
 $F_\omega$, the corresponding typed lambda calculus without dependent
 types~\cite{christinexx}. The calculi implemented by Coq and Matita, however,
 are more expressive than CoC, and several type constructions have no
 counterpart in System $F_\omega$. Moreover, core OCaml does not even
 implement $F_\omega$, but only the Hindley-Milner fragment of it.
 Therefore, in all the situations listed below, code extraction is not able to
 type the extracted code using informative types, but it uses the
 super-type \texttt{Obj.magic} of OCaml --- abbreviated \texttt{\_\_} in
 the extracted code. The lack of more precise typing has very limited
 impact on the performance of the compiled OCaml code, but it makes it
 very hard to plug the extracted code together with untrusted code. The latter
 needs to introduce explicit unsafe casts between the super-type and the
 concrete types used by instances of the generic functions. The code written
 in this way is very error-prone. For this reason we have decided to write in
 Matita also parts of the untrusted code of the CerCo compiler (e.g.\ the
 pretty-printing functions), in order to benefit from the type checker of
 Matita.

 The exact situations that trigger uses of the super-type are:
 \begin{enumerate}
 \item The calculi feature a cumulative
  hierarchy of universes that allows one to write functions that can be used
  both as term formers and as type formers, according to the arguments that
  are passed to them. In System $F_\omega$, instead, terms and types are
  syntactically distinct. Extracting terms according to all their possible
  uses may be impractical because the number of uses is exponential in the
  number of arguments of sort $Type_i$ with $i \geq 2$.
 \item Case analysis and recursion over inhabitants of primitive inductive
  types can be used in types (strong elimination), which is not allowed in
  $F_\omega$. In the CerCo compiler we largely exploit type formers declared
  in this way, for example to provide the same level of type safety achieved
  in the untrusted compiler via polymorphic variants~\cite{guarriguesxx}.
  In particular, we have terms that syntactically describe, as first-class
  citizens, the large number of combinations of operand modes of object code
  instructions. On the instructions we provide ``generic'' functions that work
  on some combinations of the operand modes, and whose type is computed by
  primitive recursion on the syntactic description of the operand modes of
  the argument of the function.
 \item $\Sigma$-types and, more generally, dependently typed records can have
  at the same time fields that are type declarations and fields that are
  terms. This situation happens all the time in CerCo because we stick
  to the dependently typed discipline and because we often generalize our
  data structures over the types used in them. Concretely, the generalization
  happens over a record containing a type --- e.g. the type of
  (pseudo)-registers for back-end languages --- some terms working on the
  type --- e.g. functions to set/get values from (pseudo)-registers --- and
  properties over them. In System $F_\omega$, term and type abstractions
  are kept syntactically separate and there is no way to pack them in
  records.
 \item The type of the extracted function does not belong to the Hindley-Milner
  fragment. Sometimes simple code transformations could be used to make the
  function typeable, but the increased complexity of code extraction would
  outweigh the benefits.
 \end{enumerate}
\end{itemize}

We should note that other projects, in particular CompCert, have decided from
the beginning to avoid dependent types in order to guarantee high quality of
the extracted code and maximal control over it. Therefore, at the current
state of the art of code extraction, there seems to be a significant trade-off
between extracted code quality and exploitation of advanced typing and proving
techniques in the source code. In the near future, the code base of CerCo can
provide an interesting example of a large formalization based on dependent
types and in need of high-quality extracted code. Therefore we plan to use it
as a driver and test bench for future work on the improvement of code
extraction. In particular, we are planning to study the following improvements
to the code extraction of Matita:
\begin{itemize}
\item We already have a prototype that extracts code from Matita to GHC
 Haskell, using several extensions that allow GHC to type a very large subset
 of System $F_\omega$. However, the prototype is not fully functional yet
 because we still need to solve at the theoretical level a problem of
 interaction between $F_\omega$ types and strong elimination. Roughly
 speaking, the two branches of a strong elimination always admit a most
 general unifier in Hindley-Milner plus the super-type \texttt{Obj.magic},
 but the same property is lost for $F_\omega$. As a consequence, we lose
 modularity in code extraction and we need to figure out static analysis
 techniques to reduce the impact of the loss of modularity.
\item The two most recent versions of OCaml have introduced first-class
 modules, which are exactly the feature needed for extracting code that uses
 records containing both type and term declarations. However, the syntax
 required for first-class modules is extremely cumbersome and it requires
 the explicit introduction of type expressions to make manifest the type
 declaration/definition fields of the module. This complicates code
 extraction with the need to perform some computations at extraction time,
 which is not in the tradition of code extraction. Moreover, the actual
 performance of OCaml code that uses first-class modules heavily is unknown
 to us. We plan to experiment with first-class modules for extraction very
 soon.
\item Algebraic data types are generalized to families of algebraic data
 types in the calculi implemented by Coq and Matita, even if the two
 generalizations are different. Families of algebraic data types are
 traditionally not supported by programming languages, but some
 restrictions of them have recently been considered under the name of
 Generalized Algebraic Data Types (GADTs) and are now implemented in
 recent versions of OCaml and GHC. A future work is the investigation of
 the possibility of exploiting GADTs during code extraction.
\end{itemize}

\section{The Cost-Annotating Plug-In}

The functionalities of the Cost-Annotating Plug-In have already been described
in Deliverables~D5.1 and D5.3.
The plug-in interfaces with the CerCo compiler via
the command line. Therefore there was no need to update the plug-in code for
integration in the Trusted CerCo Prototype. Indeed, it is even possible
to install the untrusted and the trusted compilers at the same time. The
\texttt{-cost-acc} flag of the plug-in can be used to select the executable
to be used for compilation.

The code of the plug-in has been modified w.r.t. D5.1 in order to also take
care of the cost model for stack-size consumption. From the user's point of
view, time and space cost annotations and invariants are handled in a similar
way. Nevertheless, we expect automated theorem provers to face more
difficulties in dealing with stack usage because elapsed time is additive,
whereas what is interesting for space usage is the maximum amount of stack
used, which is not additive. On the other hand, programs produced by our
compiler require more stack only at function calls. Therefore the proof
obligations generated to bound the maximum stack size of non-recursive
programs are trivial. Most C programs, and in particular those used in
time-critical systems, avoid recursive functions.
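
As an illustration of the kind of bound involved (a hand-derived shape, not
the literal output of the plug-in), for the program of Figure~\ref{test1} one
expects a loop invariant bounding \texttt{\_\_cost} by its value before the
loop plus the per-iteration worst case, and a final bound for \texttt{main}
such as
\[
  33 + 91 + 4 \cdot (76 + 144) + 37 = 1041
\]
clock cycles, obtained from the costs shown in Figure~\ref{itest1} by always
charging the more expensive branch of the conditional. Obligations of this
simple shape are the ones that automatic provers are expected to discharge
without user intervention.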

\section{Connection with other deliverables}
\label{subsect.connection.other.deliverables}

This deliverable represents the final milestone of the CerCo project.
The software delivered links together most of the software already developed
in previous deliverables, and it benefits from the studies performed in
other deliverables. In particular:

\begin{itemize}
\item The compiler links the code extracted from the executable formal
 semantics of C (D3.1), machine code (D4.1), front-end intermediate languages
 (D3.3) and back-end intermediate languages (D4.3). The \texttt{-is} flag
 of the compiler invokes the semantics of every intermediate
 representation of the program to be compiled. The executability of the
 C and machine code languages has been important to debug the two
 semantics, which are part of the trusted code base of the compiler.
 The executability of the intermediate languages has been important during
 development for early detection of bugs both in the semantics and in the
 compiler passes. It is also useful for users to profile programs in early
 compilation stages and to detect where the code spends most of the time.
 Those semantics, however, are not part of the trusted code base.
\item The compiler links the code extracted from the CIC encoding of the
 Front-end (D3.2) and Back-end (D4.2). The two encodings have been partially
 proved correct in D3.4 (Front End Correctness Proof) and D4.4 (Back End
 Correctness Proof). The formal proofs to be delivered in those deliverables
 have not been completed. The most urgent future work after the end of the
 project will consist in completing those proofs.
\item Debian packages have been provided in D6.6 for the Cost-Annotating
 Plug-In, the Untrusted CerCo Compiler and the Trusted CerCo Compiler.
 The first and the third, installed together, form a full installation of the
 Trusted CerCo Prototype. In order to allow interested people to test the
 prototype more easily, we also provided in D6.6 a Live CD based on
 Debian with the CerCo packages pre-installed.
\end{itemize}

\end{document}