source: Deliverables/Dissemination/final-review/wp6-dissemination.ltx @ 3294

Last change on this file since 3294 was 3294, checked in by mckinna, 7 years ago

draft final version: more to say than to put on slides etc.

File size: 16.9 KB
Line 
1
2\begin{frame}{Summary}
3
4Dissemination and exploitation in the CerCo project has been both:
5
6\begin{description}
7\item[direct] in the form of Deliverables such as:
8  \begin{itemize}
9  \item papers
10  \item workshops
11  \item invited talks
12  \item software
13  \end{itemize}
14\\and\hfill\\
15\item[indirect] in the form of:
16  \begin{itemize}
17  \item ideas, influence;
18  \item mobility of researchers;
19  \item follow-on projects
20  \end{itemize}
21
22\end{description}
23
24\end{frame}
25
26\begin{frame}{Activities in the Third Period}
27  A number of distinct activities took place, as reported in D1.3/D1.4
28  and in the individual Deliverables already submitted. Noteworthy direct outcomes are:
29
30  \begin{description}
31  \item[papers] two at CPP 2012 (UEDIN, UNIBO; Kyoto); one at FMICS (UPD; Paris)
32  \item[workshops] HiPEAC (D6.4: Berlin, Jan 2013); ETAPS (D6.5: Rome, March 2013)
33  \item[software] extracted trusted compiler (D5.2) and live CD (D6.6)
34  \end{description}
35
36  Indirect outcomes:
37  \begin{description}
38  \item[people] a number of researchers moved on to other projects or industry
39  \item[ideas] executable semantics for CompCert C: Leroy, following Campbell (UEDIN)
40  \item[projects] COST Action TACLe (Lisper; SE), EPSRC REMS (Sewell, \textit{et al.}\ldots; UK) DARPA Crash/Safe (Morrissett \textit{et al.}\ldots; US)
41  \end{description}
42
43\end{frame}
44
45\begin{frame}{Research Context: Projects and People}
46Building on established networks of EU-funded research: \begin{itemize}
47  \item FP5: IST 33149 ``Mobile Resource Guarantees''
48  \item FP6: \begin{itemize}
49        \item IST 015905 ``Mobius: Mobility, Ubiquity, Security''
50        \item FET-Open STReP 510255 ``EmBounded''
51        \end{itemize}
52  \item FP7: \red{FET-Open STReP 243881 ``CerCo''}
53  %%%\item COST: ICT IC2102 ``TACLe''
54\end{itemize}
55former researchers on CerCo have moved to related research projects, or industrial formal methods groups:
56\begin{description}
57\item[UEDIN]
58  \begin{itemize}
59  \item Pollack, now researcher on Crash/Safe %%%(Morrissett \textit{et al.}\ldots; DARPA, US)
60  \item Campbell, now researcher on REMS %%%(Sewell \textit{et al.}\ldots; EPSRC, UK)
61  \end{itemize}
62\item[UPD] \begin{itemize}
63  \item Ayache, engineer at CLEARSY (Atelier B, FR)
64  \item Bobot, researcher at CEA (Frama-C, FR)
65  \end{itemize}
66\item[UNIBO] \begin{itemize}
67  \item Tranquilli, to join Nokia (PL)
68  \item Mulligan, now researcher on REMS (UK)
69  \item Boender, assistant professor at Middlesex (UK)
70  \end{itemize}
71\end{description}
72
73\end{frame}
74
75\begin{frame}{Prospects for further exploitation}
76  In addition to the movement of people, and influence of ideas already sketched:
77  \begin{description}
78  \item[further proof/software development] UNIBO, with existing staff
79  \item[more advanced hardware models] UNIBO, as above, UEDIN with REMS
80  \item[further collaborations] COST: ICT IC2102 ``TACLe'' (Lisper, SE). UNIBO and UEDIN
81  \end{description}
82  In particular, the role of source-level (timing) analysis in the future of WCET, envisaged as ``parametric'':
83  \begin{itemize}
84    \item symbolic
85    \item local, modular
86    \item relativised (\textit{cf.} dependent labelling)
87
88  \end{itemize}
89\end{frame}
90
91
92%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
93
94\comment{
95\begin{frame}{CerCo Outcomes}
96
97  The final results of the CerCo project include the following:
98
99  \medskip %
100  \begin{itemize}
101  \item A compiler from C to executable 8051 binaries, formalised in Matita as
102    \blue{\lstinline|compiler.ma|} \dots
103  \item \dots which adds labels and cost information to C source code
104  \item \dots with exact values for time and space used by the binary.
105
106    \medskip %
107  \item A machine-checked proof of this in Matita
108    \blue{\lstinline|correctness.ma|}:
109
110    \smallskip %
111    \begin{itemize}
112    \item Addressing \red{intensional} properties --- clock cycles, stack bytes
113
114      \smallskip %
115    \item As well as \red{extensional} properties --- functional correctness.
116    \end{itemize}
117  \item An executable cost-annotating compiler \blue{\lstinline|cerco|}
118    extracted from the formalisation.
119  \end{itemize}
120
121  \medskip %
122  This talk treats the front-end compiler and proof, describing the technical
123  challenges and contributions from the work in Period~3.
124 
125\end{frame}
126
127\begin{frame}{Compiler}
128
129  The CerCo compiler, given C source code, will generate:
130  \begin{itemize}
131  \item Labelled Clight source code;
132  \item Labelled 8051 object code;
133  \item Cost maps: for each label a count of time (clock cycles) and space
134    (stack bytes) usage.
135  \end{itemize}
136
137  \medskip %
138  The compiler operates in multiple passes, with each intermediate language
139  having an executable semantics in Matita: %
140  \vspace*{-\medskipamount}
141  \begin{center}
142    \includegraphics[width=0.7\linewidth]{compiler-plain.pdf.eps}
143  \end{center}
144  \vspace*{-\medskipamount}
145
146\end{frame}
147
148\begin{frame}{Correctness}
149 
150  Suppose we have C source with labelled Clight, labelled 8051 binary, and
151  cost maps all generated from the CerCo compiler.
152
153  \medskip %
154  Then the Matita theorem for correctness is that:
155  \begin{itemize}
156  \item Executing the original and labelled Clight source gives the same
157    behaviour;
158  \item Executing the 8051 binary gives the same observables: labels and
159    call/return;
160  \item Applying the cost maps to the trace of C labels correctly
161    describes the time and space usage of the 8051 execution.
162  \end{itemize}
163
164  In fact we give an exact result on all \red{measurable subtraces}: from any
165  label to the end of its enclosing function.
166
167  \medskip %
168  The machine-checked proof is the concatenation of correctness results for
169  the front end, the back end, and the assembler.
170
171\end{frame}
172
173\begin{frame}{Front-End Progress}
174 
175  \vspace*{-\bigskipamount}
176  \begin{center}
177    \includegraphics[width=0.7\linewidth]{compiler-plain.pdf.eps}
178  \end{center}
179  \vspace*{-\medskipamount}
180
181  At the start of Period~3 the front end included in Matita:
182  \begin{itemize}
183  \item Executable semantics for source and intermediate languages;
184  \item Labelling and compilation of source through these;
185  \item Innovation of \red{structured traces}, originally to support timing
186    computation in the assembler.
187  \end{itemize}
188
189  At the completion of Period~3, the front end now includes proofs of
190  correctness for all these stages, extension to stack usage, and considerable
191  refinement of the existing formal development.
192 
193\end{frame}
194
195\begin{frame}{Front-End Contribution}
196
197  Notable features and original elements of the front-end include:
198  \begin{itemize}
199  \item Proof layering:
200    \begin{itemize}
201    \item A \blue{functional correctness proof};
202    \item A \blue{cost proof}, separate but depending on functional
203      correctness.
204    \end{itemize}
205  \item \blue{Internal checks} as a proof shortcut, like translation
206    validation.
207  \item \blue{Structured traces} for holding detailed data on resource
208    behaviour, not just for assembler cost calculation.
209  \item Introduction of \blue{measurable subtraces} as a unit of cost
210    proof.
211  \end{itemize}
212  These features are significant in particular because:
213  \begin{itemize}
214  \item Compilation is not 1-1 transliteration: code is rearranged and
215    modified, but label sequencing is preserved;
216  \item The source language has implicit control flow constraints not present
217    in the target (call/return, branches, loops).
218  \end{itemize}
219
220\end{frame}
221
222\begin{frame}{Axiomatization}
223
224  The translation in \blue{\lstinline|compile.ma|} is complete --- essential
225  for extracting a compiler.
226
227  \medskip %
228  The simulation results in \blue{\lstinline|correctness.ma|} are fully
229  specified, and with substantially complete proofs.  Some parts, however,
230  have been axiomatized and their correctness assumed.
231
232  \medskip %
233  In the front-end there are two sources of this axiomatisation:
234  \begin{itemize}
235  \item \red{Functional correctness.}  Other projects, notably
236    \blue{CompCert}, provide assurances that such proofs can be completed; so
237    certain parts of this proof have been assumed.
238  \item Simulation steps with \red{many cases}, some very similar; here we
239    have identified representative and more challenging cases for detailed
240    proof.
241  \end{itemize}
242 
243\end{frame}
244
245\begin{frame}{Structure of the Proof}
246
247  \begin{center}
248    \includegraphics[width=0.7\linewidth]{compiler.pdf.eps}
249  \end{center}
250
251  The transition from front-end to back-end marks the change:
252  \begin{itemize}
253  \item From a language with explicit call/return structure and high-level
254    structured control flow;
255  \item To a language where all control flow is unconstrained jumps.
256  \end{itemize}
257
258  Correspondingly, the proof hands over
259  \begin{itemize}
260  \item From measurable subtraces with labelling that must be checked
261    \blue{sound} and \blue{precise};
262  \item To \blue{structured traces} which embed these invariants.
263  \end{itemize}
264
265\end{frame}
266
267\begin{frame}{Structured Traces}
268
269  Embed call structure and label invariants into execution traces.
270
271  \begin{center}
272    \includegraphics{strtraces.pdf.eps}
273  \end{center}
274  \vspace*{-\bigskipamount}
275  \begin{itemize}
276  \item Enforces \blue{invariants} on positions of cost labels.
277  \item \blue{Groups} execution steps according to preceding cost label.
278  \item Forbids \blue{repeating} addresses in grouped steps.
279  \end{itemize}
280
281  Constructed by folding up \red{measurable subtraces}, using their termination
282  and the fact that labelling is \blue{sound} and \blue{precise}.
283
284\end{frame}
285
286\begin{frame}{Step-by-Step: Getting to Labelled Source}
287
288  \blue{C $\to$ Clight}
289
290  \smallskip %
291  To translate from C to Clight we reuse the CompCert parser.
292
293  \bigskip %
294  \blue{Switch Removal}
295
296  \smallskip %
297  Control flow for \red{\lstinline'switch'} is too subtle for simple
298  labelling; we replace with an \red{\lstinline'if .. then .. else'} tree.
299  Proof requires tracking memory extension for an extra test variable.
300
301  \bigskip %
302  \blue{Cost Labels}
303
304  \smallskip %
305  Labels added at function start, branch targets, \dots
306
307  \smallskip %
308  Simulation of execution is exact, just skipping over labels.
309
310\end{frame}
311
312\begin{frame}{Front-End Correctness}
313
314  Suppose we have a \blue{measurable} subtrace of Clight execution, including
315  the \blue{prefix} of that trace from initial state.
316
317  \medskip %
318  Then for handover to the back-end we have in RTLabs:
319  \begin{itemize}
320  \item A corresponding \red{prefix};
321  \item A \red{structured trace} corresponding to the measurable subtrace;
322  \item The required invariants, including \red{no repeating addresses};
323  \item A proof that the same \red{stack limit} is observed.
324  \end{itemize}
325
326  \medskip %
327  In addition, to link this with the source program:
328  \begin{itemize}
329  \item The observables for the RTLabs \blue{prefix} and \blue{structured
330      trace} are the same as for their counterparts in Clight.
331  \end{itemize}
332
333\end{frame}
334
335\begin{frame}{Lifting Measurable Traces}
336
337  For each pass find a target \red{measurable subtrace} equivalent to any
338  source \red{measurable subtrace}.
339
340  \begin{center}
341    \includegraphics{meassim.pdf.eps}
342  \end{center}
343
344  Split normal simulation proof:
345  \begin{enumerate}
346  \item each \blue{call} becomes a \blue{call} step plus zero or more
347    \blue{normal} steps; following cost labels should stay next to the call
348    step
349  \item each \blue{return} becomes a \blue{return} plus zero or
350    more \blue{normal} steps
351  \item \blue{cost} label steps are preserved
352  \item other \blue{normal} steps become zero or more \blue{normal} steps
353  \end{enumerate}
354
355  This preserves the defining property for \red{measurable subtraces}.
356
357\end{frame}
358
359\begin{frame}{Front-End Simulation Results}
360  \blue{Cast simplification}
361  \begin{itemize}
362  \item Simplify expressions for 8-bit target.
363  \item Only expressions change --- statements are in lock-step.
364  \end{itemize}
365
366  \blue{Clight to Cminor}
367  \begin{itemize}
368  \item Stack allocation and control flow transformation.
369  \item Similar to \blue{CompCert}, but using \red{\lstinline'goto'} instead
370    of \red{\lstinline'loop'}
371  \item Large proof terms for invariants embedded in Cminor programs using
372    dependent types.
373  \end{itemize}
374
375  \blue{Cminor to RTLabs}
376  \begin{itemize}
377  \item Construction of control-flow graph.
378  \item Embedded invariants mean that translation function is \red{total}.
379  \end{itemize}
380\end{frame}
381
382\begin{frame}{Checking Cost Labelling}
383
384  Building structured traces in \blue{RTLabs} depends on having cost labelling
385  on the linear trace that is \red{sound} and \red{precise}.
386
387  \medskip %
388  Instead of carrying this as an invariant all the way through each previous
389  pass, we take a \blue{shortcut} and check on the spot.
390
391  \medskip %
392  This is similar to \blue{translation validation}.
393
394  \medskip %
395   The specific requirements for labelling \blue{RTLabs} code are:
396   \begin{description}[\red{soundness}]
397   \item[\red{soundness}] At every point, a finite bound on the number of
398     instructions until the next label;
399   \item[\red{soundness}] A label at the start of every function;
400   \item[\red{precision}] A label after every branch.
401   \end{description}
402 
403   \medskip %
404   The bound is the hard part; the rest is a simple syntactic check.
405\end{frame}
406
407\begin{frame}[fragile]{Checking Bound to Next Label}
408
409    \begin{center}
410  \begin{overprint}[0.5\linewidth]
411      \onslide<1>\includegraphics[width=0.8\linewidth]{loop.pdf.eps}
412      \onslide<2>\includegraphics[width=0.8\linewidth]{loop1.pdf.eps}
413      \onslide<3>\includegraphics[width=0.8\linewidth]{loop2.pdf.eps}
414      \onslide<4>\includegraphics[width=0.8\linewidth]{loop3.pdf.eps}
415      \onslide<5>\includegraphics[width=0.8\linewidth]{loop4.pdf.eps}
416      \onslide<6->\includegraphics[width=0.8\linewidth]{loopx.pdf.eps}
417  \end{overprint}
418    \end{center}
419  \begin{itemize}
420  \item Pick an arbitrary node in the CFG;
421  \item<2-> Follow successor instructions;
422  \item<4-> If we find a \alert{cost label} all the traced nodes have a
423    bound\dots
424  \item<5-> \dots so remove them and pick a new node, continue;
425  \item<6-> But if we find an \alert{unlabelled loop} then the labelling is
426    unsound, report an error.
427  \end{itemize}
428
429  \onslide<7->
430  When complete, we have a proof that every loop contains a label.
431
432\end{frame}
433
434\begin{frame}{Checking Cost Labelling}
435
436  This check of soundness and precision in RTLabs:
437  \begin{itemize}
438  \item Is partial --- will only succeed when invariants hold;
439  \item Extracts to additional checking code in the compiler;
440  \item[\checkmark] Significantly reduces the proof effort.
441  \end{itemize}
442
443  \medskip %
444  The last point is our key observation: a conventional proof would require
445  showing that:
446  \begin{quote}
447    \red{If} every loop in \blue{Cminor} code is headed by a cost label;
448    \\[1\jot]
449    \red{Then} at every point in every \blue{RTLabs} execution there is a
450    finite bound on the number of instructions until the next cost label.
451  \end{quote}
452  We expect that proving existence in general of this bound alone to be
453  considerably harder than the whole check.
454
455\end{frame}
456
457\begin{frame}{Putting the Proofs Together}
458
459  \begin{center}
460    \includegraphics[width=0.7\linewidth]{compiler.pdf.eps}
461  \end{center}
462
463  The front-end proof demonstrates that for any Clight execution:
464  \begin{itemize}
465  \item We have a corresponding \blue{structured trace} with \blue{invariants}
466    \dots
467  \item \dots and with the same \blue{observables} of labels and call/return.
468  \end{itemize}
469
470  The back-end proof takes this structured trace and shows that
471  \begin{itemize}
472  \item There is a corresponding assembler trace \dots
473  \item \dots with \blue{sound} and \blue{precise} labelling and the same
474    \blue{observables}.
475  \end{itemize}
476
477  Combining these gives a proof that time and space costs computed on the
478  source are exactly those observed on executing the binary.
479
480\end{frame}
481
482\begin{frame}{Summary}
483
484  CerCo WP3 has produced the front-end of a C-to-8051 compiler that annotates
485  C source with runtime cycle counts and stack space.
486
487  \medskip %
488  This is formalised in the Matita proof assistant, with a proof of
489  correctness, and can be extracted as an executable compiler.
490
491  \smallskip %
492  \begin{itemize}
493  \item Modular \blue{layering} of intensional proofs over extensional
494    results;
495  \item Extensional results richer than normal and carefully chosen, but do
496    not require large changes to proof techniques.
497  \item \blue{Compile-time checks} on intermediate code reduce proof effort.
498  \item \blue{Structured traces} as a repository for invariant information.
499  \item \blue{Measurable subtraces} as a unit of cost proof.
500  \end{itemize}
501
502  \smallskip %
503  Going beyond this, in future projects (REMS, \dots) we plan to:
504  \begin{itemize}
505  \item Generalize  labelling schemes for more sophisticated targets;
506  \item Apply \blue{decompilation} to jump the gap from source to binary.
507  \end{itemize}
508
509\end{frame}
510}
511
512% LocalWords:  LFCS Matita CerCo intensional WCET toolchain CompCert Clight ASM
513% LocalWords:  reexecutes subtrace RTLabs subtraces Garnier McKinna cerco
514% LocalWords:  Axiomatization axiomatized Cminor goto
515
516%%% Local Variables:
517%%% mode: latex
518%%% TeX-master: "wp6-dissemination.tex"
519%%% End:
Note: See TracBrowser for help on using the repository browser.