source: Deliverables/Dissemination/final-review/wp6-dissemination.ltx @ 3301

Last change on this file since 3301 was 3301, checked in by mckinna, 8 years ago

updates: wp6.tex is definitive

File size: 18.1 KB
Line 
1
2\begin{frame}{Summary}
3
4Dissemination and exploitation in the CerCo project has been both:
5
6\begin{description}
7\item[direct] in the form of Deliverables such as:
8  \begin{itemize}
9  \item papers
10  \item workshops
11  \item invited talks
12  \item software
13  \end{itemize}
14%%%\end{description}
15\item[and]
16%%%\begin{description}
17\item[\red{indirect}] in the form of:
18  \begin{itemize}
19  \item ideas, influence;
20  \item mobility of researchers;
21  \item follow-on projects
22  \end{itemize}
23
24\end{description}
25
26\end{frame}
27
28\begin{frame}{Activities in the Third Period}
29  A number of distinct activities took place, as reported in D1.3/D1.4
30  and in the individual Deliverables already submitted. Noteworthy \alert{direct} outcomes are:
31
32  \begin{description}
33  \item[papers] two at CPP 2012 (UEDIN, UNIBO; Kyoto); one at FMICS (UPD; Paris); also work on implicit computational complexity (UPD)
34  \item[workshops] HiPEAC (D6.4: Berlin, Jan 2013); ETAPS (D6.5: Rome, March 2013)
35  \item[software] extracted trusted compiler (D5.2) and live CD (D6.6)
36  \end{description}
37
38  \alert{Indirect} outcomes:
39  \begin{description}
40  \item[people] a number of researchers moved on to other projects or industry
41  \item[ideas] executable semantics for CompCert C: Leroy, following Campbell (UEDIN)
42  \item[projects] COST Action TACLe (Lisper; SE), EPSRC REMS (Sewell, \textit{et al.}\ldots; UK) DARPA Crash/Safe (Morrissett \textit{et al.}\ldots; US)
43  \end{description}
44
45\end{frame}
46
47\begin{frame}
48  {Reflections}
49  \begin{description}
50  \item[HiPEAC]
51    \begin{description}
52    \item[+] well targeted: \textit{cf.} Koen de Bosschere (U.Gent)
53    \item[--] poorly attended
54    \end{description}
55  \item[ETAPS]
56    \begin{description}
57    \item[++] Intersection with PROARTIS (Vardanega) and TACLe (Lisper)
58    \item[-] scheduling/balance with QAPL meeting
59    \end{description}
60  \item[software]
61    \begin{description}
62    \item[++] Reusable, out-of-the-box installation; integration with Why3/Frama-C
63    \item[-] Quality of the extraction; refinements, improvements, completion of the proofs
64    \end{description}
65  \item[papers]
66    \begin{description}
67    \item[++] High quality! \textit{cf.} Leroy reusing Campbell's executable C semantics
68    \item[--] Challenge: identifying the target audience(s)?
69    \end{description}
70  \item[ideas] \begin{description}
71    \item[++] labelled source code; structured traces
72    \item[+/-] r{\^o}le of types in certification
73    \end{description}
74  \end{description}
75
76\end{frame}
77
78\begin{frame}{Research Context: Projects and People}
79Building on established networks of EU-funded research:
80\begin{description}
81  \item[FP5] IST 33149 ``Mobile Resource Guarantees''
82  \item[FP6] \begin{itemize}
83        \item IST 015905 ``Mobius: Mobility, Ubiquity, Security''
84        \item FET-Open STReP 510255 ``EmBounded''
85        \end{itemize}
86  \item[FP7] \red{FET-Open STReP 243881 ``CerCo''}
87  %%%\item COST: ICT IC2102 ``TACLe''
88\end{description}
89CerCo researchers now on related projects, or practising formal methods in industry:
90\begin{description}
91\item[UEDIN]
92  \begin{itemize}
93  \item Pollack, now researcher on Crash/Safe (US)%%%(Morrissett \textit{et al.}\ldots; DARPA, US)
94  \item Campbell, now researcher on REMS (UK)%%%(Sewell \textit{et al.}\ldots; EPSRC, UK)
95  \end{itemize}
96\item[UPD] \begin{itemize}
97  \item Ayache, engineer at CLEARSY (Atelier B, FR)
98  \item Bobot, researcher at CEA (Frama-C, FR)
99  \item Madet, researcher on PARKAS (INRIA, FR)
100  \end{itemize}
101\item[UNIBO] \begin{itemize}
102  \item Tranquilli, to join Nokia-Siemens Networks (PL)
103  \item Mulligan, now researcher on REMS (UK)
104  \item Boender, assistant professor at Middlesex (UK)
105  \end{itemize}
106\end{description}
107
108\end{frame}
109
110\begin{frame}{Prospects for further exploitation}
111  In addition to the movement of people, and influence of ideas already sketched:
112  \begin{description}
113  \item[further proof/software development] UNIBO, with existing staff
114  \item[more advanced hardware models] UNIBO, as above, UEDIN with REMS
115  \item[further collaborations] COST: ICT IC2102 ``TACLe'' (Lisper, SE). UNIBO and UEDIN
116  \end{description}
117  In particular, the role of source-level (timing) analysis in the future of WCET, envisaged as ``parametric'':
118  \begin{itemize}
119    \item symbolic
120    \item local, modular
121    \item relativised (\textit{cf.} dependent labelling)
122
123  \end{itemize}
124\end{frame}
125
126
127%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
128
129\comment{
130\begin{frame}{CerCo Outcomes}
131
132  The final results of the CerCo project include the following:
133
134  \medskip %
135  \begin{itemize}
136  \item A compiler from C to executable 8051 binaries, formalised in Matita as
137    \blue{\lstinline|compiler.ma|} \dots
138  \item \dots which adds labels and cost information to C source code
139  \item \dots with exact values for time and space used by the binary.
140
141    \medskip %
142  \item A machine-checked proof of this in Matita
143    \blue{\lstinline|correctness.ma|}:
144
145    \smallskip %
146    \begin{itemize}
147    \item Addressing \red{intensional} properties --- clock cycles, stack bytes
148
149      \smallskip %
150    \item As well as \red{extensional} properties --- functional correctness.
151    \end{itemize}
152  \item An executable cost-annotating compiler \blue{\lstinline|cerco|}
153    extracted from the formalisation.
154  \end{itemize}
155
156  \medskip %
157  This talk treats the front-end compiler and proof, describing the technical
158  challenges and contributions from the work in Period~3.
159 
160\end{frame}
161
162\begin{frame}{Compiler}
163
164  The CerCo compiler, given C source code, will generate:
165  \begin{itemize}
166  \item Labelled Clight source code;
167  \item Labelled 8051 object code;
168  \item Cost maps: for each label a count of time (clock cycles) and space
169    (stack bytes) usage.
170  \end{itemize}
171
172  \medskip %
173  The compiler operates in multiple passes, with each intermediate language
174  having an executable semantics in Matita: %
175  \vspace*{-\medskipamount}
176  \begin{center}
177    \includegraphics[width=0.7\linewidth]{compiler-plain.pdf.eps}
178  \end{center}
179  \vspace*{-\medskipamount}
180
181\end{frame}
182
183\begin{frame}{Correctness}
184 
185  Suppose we have C source with labelled Clight, labelled 8051 binary, and
186  cost maps all generated from the CerCo compiler.
187
188  \medskip %
189  Then the Matita theorem for correctness is that:
190  \begin{itemize}
191  \item Executing the original and labelled Clight source gives the same
192    behaviour;
193  \item Executing the 8051 binary gives the same observables: labels and
194    call/return;
195  \item Applying the cost maps to the trace of C labels correctly
196    describes the time and space usage of the 8051 execution.
197  \end{itemize}
198
199  In fact we give an exact result on all \red{measurable subtraces}: from any
200  label to the end of its enclosing function.
201
202  \medskip %
203  The machine-checked proof is the concatenation of correctness results for
204  the front end, the back end, and the assembler.
205
206\end{frame}
207
208\begin{frame}{Front-End Progress}
209 
210  \vspace*{-\bigskipamount}
211  \begin{center}
212    \includegraphics[width=0.7\linewidth]{compiler-plain.pdf.eps}
213  \end{center}
214  \vspace*{-\medskipamount}
215
216  At the start of Period~3 the front end included in Matita:
217  \begin{itemize}
218  \item Executable semantics for source and intermediate languages;
219  \item Labelling and compilation of source through these;
220  \item Innovation of \red{structured traces}, originally to support timing
221    computation in the assembler.
222  \end{itemize}
223
224  At the completion of Period~3, the front end now includes proofs of
225  correctness for all these stages, extension to stack usage, and considerable
226  refinement of the existing formal development.
227 
228\end{frame}
229
230\begin{frame}{Front-End Contribution}
231
232  Notable features and original elements of the front-end include:
233  \begin{itemize}
234  \item Proof layering:
235    \begin{itemize}
236    \item A \blue{functional correctness proof};
237    \item A \blue{cost proof}, separate but depending on functional
238      correctness.
239    \end{itemize}
240  \item \blue{Internal checks} as a proof shortcut, like translation
241    validation.
242  \item \blue{Structured traces} for holding detailed data on resource
243    behaviour, not just for assembler cost calculation.
244  \item Introduction of \blue{measurable subtraces} as a unit of cost
245    proof.
246  \end{itemize}
247  These features are significant in particular because:
248  \begin{itemize}
249  \item Compilation is not 1-1 transliteration: code is rearranged and
250    modified, but label sequencing is preserved;
251  \item The source language has implicit control flow constraints not present
252    in the target (call/return, branches, loops).
253  \end{itemize}
254
255\end{frame}
256
257\begin{frame}{Axiomatization}
258
259  The translation in \blue{\lstinline|compile.ma|} is complete --- essential
260  for extracting a compiler.
261
262  \medskip %
263  The simulation results in \blue{\lstinline|correctness.ma|} are fully
264  specified, and with substantially complete proofs.  Some parts, however,
265  have been axiomatized and their correctness assumed.
266
267  \medskip %
268  In the front-end there are two sources of this axiomatisation:
269  \begin{itemize}
270  \item \red{Functional correctness.}  Other projects, notably
271    \blue{CompCert}, provide assurances that such proofs can be completed; so
272    certain parts of this proof have been assumed.
273  \item Simulation steps with \red{many cases}, some very similar; here we
274    have identified representative and more challenging cases for detailed
275    proof.
276  \end{itemize}
277 
278\end{frame}
279
280\begin{frame}{Structure of the Proof}
281
282  \begin{center}
283    \includegraphics[width=0.7\linewidth]{compiler.pdf.eps}
284  \end{center}
285
286  The transition from front-end to back-end marks the change:
287  \begin{itemize}
288  \item From a language with explicit call/return structure and high-level
289    structured control flow;
290  \item To a language where all control flow is unconstrained jumps.
291  \end{itemize}
292
293  Correspondingly, the proof hands over
294  \begin{itemize}
295  \item From measurable subtraces with labelling that must be checked
296    \blue{sound} and \blue{precise};
297  \item To \blue{structured traces} which embed these invariants.
298  \end{itemize}
299
300\end{frame}
301
302\begin{frame}{Structured Traces}
303
304  Embed call structure and label invariants into execution traces.
305
306  \begin{center}
307    \includegraphics{strtraces.pdf.eps}
308  \end{center}
309  \vspace*{-\bigskipamount}
310  \begin{itemize}
311  \item Enforces \blue{invariants} on positions of cost labels.
312  \item \blue{Groups} execution steps according to preceding cost label.
313  \item Forbids \blue{repeating} addresses in grouped steps.
314  \end{itemize}
315
316  Constructed by folding up \red{measurable subtraces}, using their termination
317  and the fact that labelling is \blue{sound} and \blue{precise}.
318
319\end{frame}
320
321\begin{frame}{Step-by-Step: Getting to Labelled Source}
322
323  \blue{C $\to$ Clight}
324
325  \smallskip %
326  To translate from C to Clight we reuse the CompCert parser.
327
328  \bigskip %
329  \blue{Switch Removal}
330
331  \smallskip %
332  Control flow for \red{\lstinline'switch'} is too subtle for simple
333  labelling; we replace with an \red{\lstinline'if .. then .. else'} tree.
334  Proof requires tracking memory extension for an extra test variable.
335
336  \bigskip %
337  \blue{Cost Labels}
338
339  \smallskip %
340  Labels added at function start, branch targets, \dots
341
342  \smallskip %
343  Simulation of execution is exact, just skipping over labels.
344
345\end{frame}
346
347\begin{frame}{Front-End Correctness}
348
349  Suppose we have a \blue{measurable} subtrace of Clight execution, including
350  the \blue{prefix} of that trace from initial state.
351
352  \medskip %
353  Then for handover to the back-end we have in RTLabs:
354  \begin{itemize}
355  \item A corresponding \red{prefix};
356  \item A \red{structured trace} corresponding to the measurable subtrace;
357  \item The required invariants, including \red{no repeating addresses};
358  \item A proof that the same \red{stack limit} is observed.
359  \end{itemize}
360
361  \medskip %
362  In addition, to link this with the source program:
363  \begin{itemize}
364  \item The observables for the RTLabs \blue{prefix} and \blue{structured
365      trace} are the same as for their counterparts in Clight.
366  \end{itemize}
367
368\end{frame}
369
370\begin{frame}{Lifting Measurable Traces}
371
372  For each pass find a target \red{measurable subtrace} equivalent to any
373  source \red{measurable subtrace}.
374
375  \begin{center}
376    \includegraphics{meassim.pdf.eps}
377  \end{center}
378
379  Split normal simulation proof:
380  \begin{enumerate}
381  \item each \blue{call} becomes a \blue{call} step plus zero or more
382    \blue{normal} steps; following cost labels should stay next to the call
383    step
384  \item each \blue{return} becomes a \blue{return} plus zero or
385    more \blue{normal} steps
386  \item \blue{cost} label steps are preserved
387  \item other \blue{normal} steps become zero or more \blue{normal} steps
388  \end{enumerate}
389
390  This preserves the defining property for \red{measurable subtraces}.
391
392\end{frame}
393
394\begin{frame}{Front-End Simulation Results}
395  \blue{Cast simplification}
396  \begin{itemize}
397  \item Simplify expressions for 8-bit target.
398  \item Only expressions change --- statements are in lock-step.
399  \end{itemize}
400
401  \blue{Clight to Cminor}
402  \begin{itemize}
403  \item Stack allocation and control flow transformation.
404  \item Similar to \blue{CompCert}, but using \red{\lstinline'goto'} instead
405    of \red{\lstinline'loop'}
406  \item Large proof terms for invariants embedded in Cminor programs using
407    dependent types.
408  \end{itemize}
409
410  \blue{Cminor to RTLabs}
411  \begin{itemize}
412  \item Construction of control-flow graph.
413  \item Embedded invariants mean that translation function is \red{total}.
414  \end{itemize}
415\end{frame}
416
417\begin{frame}{Checking Cost Labelling}
418
419  Building structured traces in \blue{RTLabs} depends on having cost labelling
420  on the linear trace that is \red{sound} and \red{precise}.
421
422  \medskip %
423  Instead of carrying this as an invariant all the way through each previous
424  pass, we take a \blue{shortcut} and check on the spot.
425
426  \medskip %
427  This is similar to \blue{translation validation}.
428
429  \medskip %
430   The specific requirements for labelling \blue{RTLabs} code are:
431   \begin{description}[\red{soundness}]
432   \item[\red{soundness}] At every point, a finite bound on the number of
433     instructions until the next label;
434   \item[\red{soundness}] A label at the start of every function;
435   \item[\red{precision}] A label after every branch.
436   \end{description}
437 
438   \medskip %
439   The bound is the hard part; the rest is a simple syntactic check.
440\end{frame}
441
442\begin{frame}[fragile]{Checking Bound to Next Label}
443
444    \begin{center}
445  \begin{overprint}[0.5\linewidth]
446      \onslide<1>\includegraphics[width=0.8\linewidth]{loop.pdf.eps}
447      \onslide<2>\includegraphics[width=0.8\linewidth]{loop1.pdf.eps}
448      \onslide<3>\includegraphics[width=0.8\linewidth]{loop2.pdf.eps}
449      \onslide<4>\includegraphics[width=0.8\linewidth]{loop3.pdf.eps}
450      \onslide<5>\includegraphics[width=0.8\linewidth]{loop4.pdf.eps}
451      \onslide<6->\includegraphics[width=0.8\linewidth]{loopx.pdf.eps}
452  \end{overprint}
453    \end{center}
454  \begin{itemize}
455  \item Pick an arbitrary node in the CFG;
456  \item<2-> Follow successor instructions;
457  \item<4-> If we find a \alert{cost label} all the traced nodes have a
458    bound\dots
459  \item<5-> \dots so remove them and pick a new node, continue;
460  \item<6-> But if we find an \alert{unlabelled loop} then the labelling is
461    unsound, report an error.
462  \end{itemize}
463
464  \onslide<7->
465  When complete, we have a proof that every loop contains a label.
466
467\end{frame}
468
469\begin{frame}{Checking Cost Labelling}
470
471  This check of soundness and precision in RTLabs:
472  \begin{itemize}
473  \item Is partial --- will only succeed when invariants hold;
474  \item Extracts to additional checking code in the compiler;
475  \item[\checkmark] Significantly reduces the proof effort.
476  \end{itemize}
477
478  \medskip %
479  The last point is our key observation: a conventional proof would require
480  showing that:
481  \begin{quote}
482    \red{If} every loop in \blue{Cminor} code is headed by a cost label;
483    \\[1\jot]
484    \red{Then} at every point in every \blue{RTLabs} execution there is a
485    finite bound on the number of instructions until the next cost label.
486  \end{quote}
487  We expect that proving existence in general of this bound alone to be
488  considerably harder than the whole check.
489
490\end{frame}
491
492\begin{frame}{Putting the Proofs Together}
493
494  \begin{center}
495    \includegraphics[width=0.7\linewidth]{compiler.pdf.eps}
496  \end{center}
497
498  The front-end proof demonstrates that for any Clight execution:
499  \begin{itemize}
500  \item We have a corresponding \blue{structured trace} with \blue{invariants}
501    \dots
502  \item \dots and with the same \blue{observables} of labels and call/return.
503  \end{itemize}
504
505  The back-end proof takes this structured trace and shows that
506  \begin{itemize}
507  \item There is a corresponding assembler trace \dots
508  \item \dots with \blue{sound} and \blue{precise} labelling and the same
509    \blue{observables}.
510  \end{itemize}
511
512  Combining these gives a proof that time and space costs computed on the
513  source are exactly those observed on executing the binary.
514
515\end{frame}
516
517\begin{frame}{Summary}
518
519  CerCo WP3 has produced the front-end of a C-to-8051 compiler that annotates
520  C source with runtime cycle counts and stack space.
521
522  \medskip %
523  This is formalised in the Matita proof assistant, with a proof of
524  correctness, and can be extracted as an executable compiler.
525
526  \smallskip %
527  \begin{itemize}
528  \item Modular \blue{layering} of intensional proofs over extensional
529    results;
530  \item Extensional results richer than normal and carefully chosen, but do
531    not require large changes to proof techniques.
532  \item \blue{Compile-time checks} on intermediate code reduce proof effort.
533  \item \blue{Structured traces} as a repository for invariant information.
534  \item \blue{Measurable subtraces} as a unit of cost proof.
535  \end{itemize}
536
537  \smallskip %
538  Going beyond this, in future projects (REMS, \dots) we plan to:
539  \begin{itemize}
540  \item Generalize  labelling schemes for more sophisticated targets;
541  \item Apply \blue{decompilation} to jump the gap from source to binary.
542  \end{itemize}
543
544\end{frame}
545}
546
547% LocalWords:  LFCS Matita CerCo intensional WCET toolchain CompCert Clight ASM
548% LocalWords:  reexecutes subtrace RTLabs subtraces Garnier McKinna cerco
549% LocalWords:  Axiomatization axiomatized Cminor goto
550
551%%% Local Variables:
552%%% mode: latex
553%%% TeX-master: "wp6-dissemination.tex"
554%%% End:
Note: See TracBrowser for help on using the repository browser.