source: Deliverables/Dissemination/final-review/wp6-dissemination.ltx @ 3300

Last change on this file since 3300 was 3300, checked in by mckinna, 7 years ago

updates

File size: 17.0 KB
Line 
1
2\begin{frame}{Summary}
3
4Dissemination and exploitation in the CerCo project has been both:
5
6\begin{description}
7\item[direct] in the form of Deliverables such as:
8  \begin{itemize}
9  \item papers
10  \item workshops
11  \item invited talks
12  \item software
13  \end{itemize}
14%%%\end{description}
15\item[and]
16%%%\begin{description}
17\item[\red{indirect}] in the form of:
18  \begin{itemize}
19  \item ideas, influence;
20  \item mobility of researchers;
21  \item follow-on projects
22  \end{itemize}
23
24\end{description}
25
26\end{frame}
27
28\begin{frame}{Activities in the Third Period}
29  A number of distinct activities took place, as reported in D1.3/D1.4
30  and in the individual Deliverables already submitted. Noteworthy direct outcomes are:
31
32  \begin{description}
33  \item[papers] two at CPP 2012 (UEDIN, UNIBO; Kyoto); one at FMICS (UPD; Paris); also work on implicit computational complexity (UPD)
34  \item[workshops] HiPEAC (D6.4: Berlin, Jan 2013); ETAPS (D6.5: Rome, March 2013)
35  \item[software] extracted trusted compiler (D5.2) and live CD (D6.6)
36  \end{description}
37
38  Indirect outcomes:
39  \begin{description}
40  \item[people] a number of researchers moved on to other projects or industry
41  \item[ideas] executable semantics for CompCert C: Leroy, following Campbell (UEDIN)
42  \item[projects] COST Action TACLe (Lisper; SE), EPSRC REMS (Sewell, \textit{et al.}\ldots; UK) DARPA Crash/Safe (Morrissett \textit{et al.}\ldots; US)
43  \end{description}
44
45\end{frame}
46
47\begin{frame}{Research Context: Projects and People}
48Building on established networks of EU-funded research:
49\begin{description}
50  \item[FP5] IST 33149 ``Mobile Resource Guarantees''
51  \item[FP6] \begin{itemize}
52        \item IST 015905 ``Mobius: Mobility, Ubiquity, Security''
53        \item FET-Open STReP 510255 ``EmBounded''
54        \end{itemize}
55  \item[FP7] \red{FET-Open STReP 243881 ``CerCo''}
56  %%%\item COST: ICT IC2102 ``TACLe''
57\end{description}
58former researchers on CerCo have moved to related research projects, or industrial formal methods groups:
59\begin{description}
60\item[UEDIN]
61  \begin{itemize}
62  \item Pollack, now researcher on Crash/Safe %%%(Morrissett \textit{et al.}\ldots; DARPA, US)
63  \item Campbell, now researcher on REMS %%%(Sewell \textit{et al.}\ldots; EPSRC, UK)
64  \end{itemize}
65\item[UPD] \begin{itemize}
66  \item Ayache, engineer at CLEARSY (Atelier B, FR)
67  \item Bobot, researcher at CEA (Frama-C, FR)
68  \end{itemize}
69\item[UNIBO] \begin{itemize}
70  \item Tranquilli, to join Nokia (PL)
71  \item Mulligan, now researcher on REMS (UK)
72  \item Boender, assistant professor at Middlesex (UK)
73  \end{itemize}
74\end{description}
75
76\end{frame}
77
78\begin{frame}{Prospects for further exploitation}
79  In addition to the movement of people, and influence of ideas already sketched:
80  \begin{description}
81  \item[further proof/software development] UNIBO, with existing staff
82  \item[more advanced hardware models] UNIBO, as above, UEDIN with REMS
83  \item[further collaborations] COST: ICT IC2102 ``TACLe'' (Lisper, SE). UNIBO and UEDIN
84  \end{description}
85  In particular, the role of source-level (timing) analysis in the future of WCET, envisaged as ``parametric'':
86  \begin{itemize}
87    \item symbolic
88    \item local, modular
89    \item relativised (\textit{cf.} dependent labelling)
90
91  \end{itemize}
92\end{frame}
93
94
95%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
96
97\comment{
98\begin{frame}{CerCo Outcomes}
99
100  The final results of the CerCo project include the following:
101
102  \medskip %
103  \begin{itemize}
104  \item A compiler from C to executable 8051 binaries, formalised in Matita as
105    \blue{\lstinline|compiler.ma|} \dots
106  \item \dots which adds labels and cost information to C source code
107  \item \dots with exact values for time and space used by the binary.
108
109    \medskip %
110  \item A machine-checked proof of this in Matita
111    \blue{\lstinline|correctness.ma|}:
112
113    \smallskip %
114    \begin{itemize}
115    \item Addressing \red{intensional} properties --- clock cycles, stack bytes
116
117      \smallskip %
118    \item As well as \red{extensional} properties --- functional correctness.
119    \end{itemize}
120  \item An executable cost-annotating compiler \blue{\lstinline|cerco|}
121    extracted from the formalisation.
122  \end{itemize}
123
124  \medskip %
125  This talk treats the front-end compiler and proof, describing the technical
126  challenges and contributions from the work in Period~3.
127 
128\end{frame}
129
130\begin{frame}{Compiler}
131
132  The CerCo compiler, given C source code, will generate:
133  \begin{itemize}
134  \item Labelled Clight source code;
135  \item Labelled 8051 object code;
136  \item Cost maps: for each label a count of time (clock cycles) and space
137    (stack bytes) usage.
138  \end{itemize}
139
140  \medskip %
141  The compiler operates in multiple passes, with each intermediate language
142  having an executable semantics in Matita: %
143  \vspace*{-\medskipamount}
144  \begin{center}
145    \includegraphics[width=0.7\linewidth]{compiler-plain.pdf.eps}
146  \end{center}
147  \vspace*{-\medskipamount}
148
149\end{frame}
150
151\begin{frame}{Correctness}
152 
153  Suppose we have C source with labelled Clight, labelled 8051 binary, and
154  cost maps all generated from the CerCo compiler.
155
156  \medskip %
157  Then the Matita theorem for correctness is that:
158  \begin{itemize}
159  \item Executing the original and labelled Clight source gives the same
160    behaviour;
161  \item Executing the 8051 binary gives the same observables: labels and
162    call/return;
163  \item Applying the cost maps to the trace of C labels correctly
164    describes the time and space usage of the 8051 execution.
165  \end{itemize}
166
167  In fact we give an exact result on all \red{measurable subtraces}: from any
168  label to the end of its enclosing function.
169
170  \medskip %
171  The machine-checked proof is the concatenation of correctness results for
172  the front end, the back end, and the assembler.
173
174\end{frame}
175
176\begin{frame}{Front-End Progress}
177 
178  \vspace*{-\bigskipamount}
179  \begin{center}
180    \includegraphics[width=0.7\linewidth]{compiler-plain.pdf.eps}
181  \end{center}
182  \vspace*{-\medskipamount}
183
184  At the start of Period~3 the front end included in Matita:
185  \begin{itemize}
186  \item Executable semantics for source and intermediate languages;
187  \item Labelling and compilation of source through these;
188  \item Innovation of \red{structured traces}, originally to support timing
189    computation in the assembler.
190  \end{itemize}
191
192  At the completion of Period~3, the front end now includes proofs of
193  correctness for all these stages, extension to stack usage, and considerable
194  refinement of the existing formal development.
195 
196\end{frame}
197
198\begin{frame}{Front-End Contribution}
199
200  Notable features and original elements of the front-end include:
201  \begin{itemize}
202  \item Proof layering:
203    \begin{itemize}
204    \item A \blue{functional correctness proof};
205    \item A \blue{cost proof}, separate but depending on functional
206      correctness.
207    \end{itemize}
208  \item \blue{Internal checks} as a proof shortcut, like translation
209    validation.
210  \item \blue{Structured traces} for holding detailed data on resource
211    behaviour, not just for assembler cost calculation.
212  \item Introduction of \blue{measurable subtraces} as a unit of cost
213    proof.
214  \end{itemize}
215  These features are significant in particular because:
216  \begin{itemize}
217  \item Compilation is not 1-1 transliteration: code is rearranged and
218    modified, but label sequencing is preserved;
219  \item The source language has implicit control flow constraints not present
220    in the target (call/return, branches, loops).
221  \end{itemize}
222
223\end{frame}
224
225\begin{frame}{Axiomatization}
226
227  The translation in \blue{\lstinline|compile.ma|} is complete --- essential
228  for extracting a compiler.
229
230  \medskip %
231  The simulation results in \blue{\lstinline|correctness.ma|} are fully
232  specified, and with substantially complete proofs.  Some parts, however,
233  have been axiomatized and their correctness assumed.
234
235  \medskip %
236  In the front-end there are two sources of this axiomatisation:
237  \begin{itemize}
238  \item \red{Functional correctness.}  Other projects, notably
239    \blue{CompCert}, provide assurances that such proofs can be completed; so
240    certain parts of this proof have been assumed.
241  \item Simulation steps with \red{many cases}, some very similar; here we
242    have identified representative and more challenging cases for detailed
243    proof.
244  \end{itemize}
245 
246\end{frame}
247
248\begin{frame}{Structure of the Proof}
249
250  \begin{center}
251    \includegraphics[width=0.7\linewidth]{compiler.pdf.eps}
252  \end{center}
253
254  The transition from front-end to back-end marks the change:
255  \begin{itemize}
256  \item From a language with explicit call/return structure and high-level
257    structured control flow;
258  \item To a language where all control flow is unconstrained jumps.
259  \end{itemize}
260
261  Correspondingly, the proof hands over
262  \begin{itemize}
263  \item From measurable subtraces with labelling that must be checked
264    \blue{sound} and \blue{precise};
265  \item To \blue{structured traces} which embed these invariants.
266  \end{itemize}
267
268\end{frame}
269
270\begin{frame}{Structured Traces}
271
272  Embed call structure and label invariants into execution traces.
273
274  \begin{center}
275    \includegraphics{strtraces.pdf.eps}
276  \end{center}
277  \vspace*{-\bigskipamount}
278  \begin{itemize}
279  \item Enforces \blue{invariants} on positions of cost labels.
280  \item \blue{Groups} execution steps according to preceding cost label.
281  \item Forbids \blue{repeating} addresses in grouped steps.
282  \end{itemize}
283
284  Constructed by folding up \red{measurable subtraces}, using their termination
285  and the fact that labelling is \blue{sound} and \blue{precise}.
286
287\end{frame}
288
289\begin{frame}{Step-by-Step: Getting to Labelled Source}
290
291  \blue{C $\to$ Clight}
292
293  \smallskip %
294  To translate from C to Clight we reuse the CompCert parser.
295
296  \bigskip %
297  \blue{Switch Removal}
298
299  \smallskip %
300  Control flow for \red{\lstinline'switch'} is too subtle for simple
301  labelling; we replace with an \red{\lstinline'if .. then .. else'} tree.
302  Proof requires tracking memory extension for an extra test variable.
303
304  \bigskip %
305  \blue{Cost Labels}
306
307  \smallskip %
308  Labels added at function start, branch targets, \dots
309
310  \smallskip %
311  Simulation of execution is exact, just skipping over labels.
312
313\end{frame}
314
315\begin{frame}{Front-End Correctness}
316
317  Suppose we have a \blue{measurable} subtrace of Clight execution, including
318  the \blue{prefix} of that trace from initial state.
319
320  \medskip %
321  Then for handover to the back-end we have in RTLabs:
322  \begin{itemize}
323  \item A corresponding \red{prefix};
324  \item A \red{structured trace} corresponding to the measurable subtrace;
325  \item The required invariants, including \red{no repeating addresses};
326  \item A proof that the same \red{stack limit} is observed.
327  \end{itemize}
328
329  \medskip %
330  In addition, to link this with the source program:
331  \begin{itemize}
332  \item The observables for the RTLabs \blue{prefix} and \blue{structured
333      trace} are the same as for their counterparts in Clight.
334  \end{itemize}
335
336\end{frame}
337
338\begin{frame}{Lifting Measurable Traces}
339
340  For each pass find a target \red{measurable subtrace} equivalent to any
341  source \red{measurable subtrace}.
342
343  \begin{center}
344    \includegraphics{meassim.pdf.eps}
345  \end{center}
346
347  Split normal simulation proof:
348  \begin{enumerate}
349  \item each \blue{call} becomes a \blue{call} step plus zero or more
350    \blue{normal} steps; following cost labels should stay next to the call
351    step
352  \item each \blue{return} becomes a \blue{return} plus zero or
353    more \blue{normal} steps
354  \item \blue{cost} label steps are preserved
355  \item other \blue{normal} steps become zero or more \blue{normal} steps
356  \end{enumerate}
357
358  This preserves the defining property for \red{measurable subtraces}.
359
360\end{frame}
361
362\begin{frame}{Front-End Simulation Results}
363  \blue{Cast simplification}
364  \begin{itemize}
365  \item Simplify expressions for 8-bit target.
366  \item Only expressions change --- statements are in lock-step.
367  \end{itemize}
368
369  \blue{Clight to Cminor}
370  \begin{itemize}
371  \item Stack allocation and control flow transformation.
372  \item Similar to \blue{CompCert}, but using \red{\lstinline'goto'} instead
373    of \red{\lstinline'loop'}
374  \item Large proof terms for invariants embedded in Cminor programs using
375    dependent types.
376  \end{itemize}
377
378  \blue{Cminor to RTLabs}
379  \begin{itemize}
380  \item Construction of control-flow graph.
381  \item Embedded invariants mean that translation function is \red{total}.
382  \end{itemize}
383\end{frame}
384
385\begin{frame}{Checking Cost Labelling}
386
387  Building structured traces in \blue{RTLabs} depends on having cost labelling
388  on the linear trace that is \red{sound} and \red{precise}.
389
390  \medskip %
391  Instead of carrying this as an invariant all the way through each previous
392  pass, we take a \blue{shortcut} and check on the spot.
393
394  \medskip %
395  This is similar to \blue{translation validation}.
396
397  \medskip %
398   The specific requirements for labelling \blue{RTLabs} code are:
399   \begin{description}[\red{soundness}]
400   \item[\red{soundness}] At every point, a finite bound on the number of
401     instructions until the next label;
402   \item[\red{soundness}] A label at the start of every function;
403   \item[\red{precision}] A label after every branch.
404   \end{description}
405 
406   \medskip %
407   The bound is the hard part; the rest is a simple syntactic check.
408\end{frame}
409
410\begin{frame}[fragile]{Checking Bound to Next Label}
411
412    \begin{center}
413  \begin{overprint}[0.5\linewidth]
414      \onslide<1>\includegraphics[width=0.8\linewidth]{loop.pdf.eps}
415      \onslide<2>\includegraphics[width=0.8\linewidth]{loop1.pdf.eps}
416      \onslide<3>\includegraphics[width=0.8\linewidth]{loop2.pdf.eps}
417      \onslide<4>\includegraphics[width=0.8\linewidth]{loop3.pdf.eps}
418      \onslide<5>\includegraphics[width=0.8\linewidth]{loop4.pdf.eps}
419      \onslide<6->\includegraphics[width=0.8\linewidth]{loopx.pdf.eps}
420  \end{overprint}
421    \end{center}
422  \begin{itemize}
423  \item Pick an arbitrary node in the CFG;
424  \item<2-> Follow successor instructions;
425  \item<4-> If we find a \alert{cost label} all the traced nodes have a
426    bound\dots
427  \item<5-> \dots so remove them and pick a new node, continue;
428  \item<6-> But if we find an \alert{unlabelled loop} then the labelling is
429    unsound, report an error.
430  \end{itemize}
431
432  \onslide<7->
433  When complete, we have a proof that every loop contains a label.
434
435\end{frame}
436
437\begin{frame}{Checking Cost Labelling}
438
439  This check of soundness and precision in RTLabs:
440  \begin{itemize}
441  \item Is partial --- will only succeed when invariants hold;
442  \item Extracts to additional checking code in the compiler;
443  \item[\checkmark] Significantly reduces the proof effort.
444  \end{itemize}
445
446  \medskip %
447  The last point is our key observation: a conventional proof would require
448  showing that:
449  \begin{quote}
450    \red{If} every loop in \blue{Cminor} code is headed by a cost label;
451    \\[1\jot]
452    \red{Then} at every point in every \blue{RTLabs} execution there is a
453    finite bound on the number of instructions until the next cost label.
454  \end{quote}
455  We expect that proving existence in general of this bound alone to be
456  considerably harder than the whole check.
457
458\end{frame}
459
460\begin{frame}{Putting the Proofs Together}
461
462  \begin{center}
463    \includegraphics[width=0.7\linewidth]{compiler.pdf.eps}
464  \end{center}
465
466  The front-end proof demonstrates that for any Clight execution:
467  \begin{itemize}
468  \item We have a corresponding \blue{structured trace} with \blue{invariants}
469    \dots
470  \item \dots and with the same \blue{observables} of labels and call/return.
471  \end{itemize}
472
473  The back-end proof takes this structured trace and shows that
474  \begin{itemize}
475  \item There is a corresponding assembler trace \dots
476  \item \dots with \blue{sound} and \blue{precise} labelling and the same
477    \blue{observables}.
478  \end{itemize}
479
480  Combining these gives a proof that time and space costs computed on the
481  source are exactly those observed on executing the binary.
482
483\end{frame}
484
485\begin{frame}{Summary}
486
487  CerCo WP3 has produced the front-end of a C-to-8051 compiler that annotates
488  C source with runtime cycle counts and stack space.
489
490  \medskip %
491  This is formalised in the Matita proof assistant, with a proof of
492  correctness, and can be extracted as an executable compiler.
493
494  \smallskip %
495  \begin{itemize}
496  \item Modular \blue{layering} of intensional proofs over extensional
497    results;
498  \item Extensional results richer than normal and carefully chosen, but do
499    not require large changes to proof techniques.
500  \item \blue{Compile-time checks} on intermediate code reduce proof effort.
501  \item \blue{Structured traces} as a repository for invariant information.
502  \item \blue{Measurable subtraces} as a unit of cost proof.
503  \end{itemize}
504
505  \smallskip %
506  Going beyond this, in future projects (REMS, \dots) we plan to:
507  \begin{itemize}
508  \item Generalize  labelling schemes for more sophisticated targets;
509  \item Apply \blue{decompilation} to jump the gap from source to binary.
510  \end{itemize}
511
512\end{frame}
513}
514
515% LocalWords:  LFCS Matita CerCo intensional WCET toolchain CompCert Clight ASM
516% LocalWords:  reexecutes subtrace RTLabs subtraces Garnier McKinna cerco
517% LocalWords:  Axiomatization axiomatized Cminor goto
518
519%%% Local Variables:
520%%% mode: latex
521%%% TeX-master: "wp6-dissemination.tex"
522%%% End:
Note: See TracBrowser for help on using the repository browser.