source: Deliverables/Dissemination/front-end/final-review/front-end.tex @ 3280

Last change on this file since 3280 was 3280, checked in by stark, 4 years ago

Front end final review talk?

  • Property svn:eol-style set to native
  • Property svn:mime-type set to text/x-latex
File size: 14.7 KB
Line 
1\documentclass[nopanel]{beamer}
2\usepackage{graphicx,color}
3\usepackage{amsfonts}
4\usepackage{listings}
5
6\lstset{basicstyle=\footnotesize\tt,columns=flexible,breaklines=false,
7        keywordstyle=\color{red}\bfseries,
8        keywordstyle=[2]\color{blue},
9        commentstyle=\color{green},
10        stringstyle=\color{blue},
11        showspaces=false,showstringspaces=false}
12
13
14\setbeamertemplate{footline}[page number]
15\setbeamertemplate{sidebar right}{}
16
17\setbeamercolor{alerted text}{fg=red!70!orange}
18
19\newcommand{\red}[1]{\textcolor{red}{#1}}
20\newcommand{\blue}[1]{\textcolor{blue}{#1}}
21\newcommand{\green}[1]{\textcolor{green}{#1}}
22\newcommand{\grey}[1]{\textcolor{grey}{#1}}
23
24\addtolength{\parskip}{0.3em}
25
26\begin{document}
27
28\title{WP3: Verified Compiler --- Front End\\[1\jot]
29D3.4: Front-End Correctness Proofs}
30\author{Brian~Campbell, Ilias~Garnier, James~McKinna, \underline{Ian~Stark}}
31\institute{LFCS, University of Edinburgh\\[1ex]
32\includegraphics[width=.3\linewidth]{cerco_logo.png}\\
33\footnotesize Project FP7-ICT-2009-C-243881
34}
35\date{CerCo review meeting\\16th May 2013}
36\maketitle
37
38
39\begin{frame}{CerCo Outcomes}
40
41  The final results of the CerCo project include the following:
42
43  \medskip %
44  \begin{itemize}
45  \item A compiler from C to executable 8051 binaries, formalised in Matita as
46    \blue{\lstinline|compiler.ma|} \dots
47  \item \dots which adds labels and cost information to C source code
48  \item \dots with exact values for time and space used by the binary.
49
50    \medskip %
51  \item A machine-checked proof of this in Matita
52    \blue{\lstinline|correctness.ma|}:
53
54    \smallskip %
55    \begin{itemize}
56    \item Addressing \red{intensional} properties --- clock cycles, stack bytes
57
58      \smallskip %
59    \item As well as \red{extensional} properties --- functional correctness.
60    \end{itemize}
61  \item An executable cost-annotating compiler \blue{\lstinline|cerco|}
62    extracted from the formalisation.
63  \end{itemize}
64
65  \medskip %
66  This talk treats the front-end compiler and proof, describing the technical
67  challenges and contributions from the work in Period~3.
68 
69\end{frame}
70
71\begin{frame}{Compiler}
72
73  The CerCo compiler, given C source code, will generate:
74  \begin{itemize}
75  \item Labelled Clight source code;
76  \item Labelled 8051 object code;
77  \item Cost maps: for each label a count of time (clock cycles) and space
78    (stack bytes) usage.
79  \end{itemize}
80
81  \medskip %
82  The compiler operates in multiple passes, with each intermediate language
83  having an executable semantics in Matita: %
84  \vspace*{-\medskipamount}
85  \begin{center}
86    \includegraphics[width=0.7\linewidth]{compiler-plain.pdf}
87  \end{center}
88  \vspace*{-\medskipamount}
89
90\end{frame}
91
92\begin{frame}{Correctness}
93 
94  Suppose we have C source with labelled Clight, labelled 8051 binary, and
95  cost maps all generated from the CerCo compiler.
96
97  \medskip %
98  Then the Matita theorem for correctness is that:
99  \begin{itemize}
100  \item Executing the original and labelled Clight source gives the same
101    behaviour;
102  \item Executing the 8051 binary gives the same observables: labels and
103    call/return;
104  \item Applying the cost maps to the trace of C labels correctly
105    describes the time and space usage of the 8051 execution.
106  \end{itemize}
107
108  In fact we give an exact result on all \red{measurable subtraces}: from any
109  label to the end of its enclosing function.
110
111  \medskip %
112  The machine-checked proof is the concatenation of correctness results for
113  the front end, the back end, and the assembler.
114
115\end{frame}
116
117\begin{frame}{Front-End Progress}
118 
119  \vspace*{-\bigskipamount}
120  \begin{center}
121    \includegraphics[width=0.7\linewidth]{compiler-plain.pdf}
122  \end{center}
123  \vspace*{-\medskipamount}
124
125  At the start of Period~3 the front end included in Matita:
126  \begin{itemize}
127  \item Executable semantics for source and intermediate languages;
128  \item Labelling and compilation of source through these;
129  \item Innovation of \red{structured traces}, originally to support timing
130    computation in the assembler.
131  \end{itemize}
132
133  At the completion of Period~3, the front end now includes proofs of
134  correctness for all these stages, extension to stack usage, and considerable
135  refinement of the existing formal development.
136 
137\end{frame}
138
139\begin{frame}{Front-End Contribution}
140
141  Notable features and original elements of the front-end include:
142  \begin{itemize}
143  \item Proof layering:
144    \begin{itemize}
145    \item A \blue{functional correctness proof};
146    \item A \blue{cost proof}, separate but depending on functional
147      correctness.
148    \end{itemize}
149  \item \blue{Internal checks} as a proof shortcut, like translation
150    validation.
151  \item \blue{Structured traces} for holding detailed data on resource
152    behaviour, not just for assembler cost calculation.
153  \item Introduction of \blue{measurable subtraces} as a unit of cost
154    proof.
155  \end{itemize}
156  These features are significant in particular because:
157  \begin{itemize}
158  \item Compilation is not 1-1 transliteration: code is rearranged and
159    modified, but label sequencing is preserved;
160  \item The source language has implicit control flow constraints not present
161    in the target (call/return, branches, loops).
162  \end{itemize}
163
164\end{frame}
165
166\begin{frame}{Axiomatization}
167
168  The translation in \blue{\lstinline|compile.ma|} is complete --- essential
169  for extracting a compiler.
170
171  \medskip %
172  The simulation results in \blue{\lstinline|correctness.ma|} are fully
173  specified, and with substantially complete proofs.  Some parts, however,
174  have been axiomatized and their correctness assumed.
175
176  \medskip %
177  In the front-end there are two sources of this axiomatisation:
178  \begin{itemize}
179  \item \red{Functional correctness.}  Other projects, notably
180    \blue{CompCert}, provide assurances that such proofs can be completed; so
181    certain parts of this proof have been assumed.
182  \item Simulation steps with \red{many cases}, some very similar; here we
183    have identified representative and more challenging cases for detailed
184    proof.
185  \end{itemize}
186 
187\end{frame}
188
189\begin{frame}{Structure of the Proof}
190
191  \begin{center}
192    \includegraphics[width=0.7\linewidth]{compiler.pdf}
193  \end{center}
194
195  The transition from front-end to back-end marks the change:
196  \begin{itemize}
197  \item From a language with explicit call/return structure and high-level
198    structured control flow;
199  \item To a language where all control flow is unconstrained jumps.
200  \end{itemize}
201
202  Correspondingly, the proof hands over
203  \begin{itemize}
204  \item From measurable subtraces with labelling that must be checked
205    \blue{sound} and \blue{precise};
206  \item To \blue{structured traces} which embed these invariants.
207  \end{itemize}
208
209\end{frame}
210
211\begin{frame}{Structured Traces}
212
213  Embed call structure and label invariants into execution traces.
214
215  \begin{center}
216    \includegraphics{strtraces.pdf}
217  \end{center}
218  \vspace*{-\bigskipamount}
219  \begin{itemize}
220  \item Enforces \blue{invariants} on positions of cost labels.
221  \item \blue{Groups} execution steps according to preceding cost label.
222  \item Forbids \blue{repeating} addresses in grouped steps.
223  \end{itemize}
224
225  Constructed by folding up \red{measurable subtraces}, using their termination
226  and the fact that labelling is \blue{sound} and \blue{precise}.
227
228\end{frame}
229
230\begin{frame}{Step-by-Step: Getting to Labelled Source}
231
232  \blue{C $\to$ Clight}
233
234  \smallskip %
235  To translate from C to Clight we reuse the CompCert parser.
236
237  \bigskip %
238  \blue{Switch Removal}
239
240  \smallskip %
241  Control flow for \red{\lstinline'switch'} is too subtle for simple
242  labelling; we replace with an \red{\lstinline'if .. then .. else'} tree.
243  Proof requires tracking memory extension for an extra test variable.
244
245  \bigskip %
246  \blue{Cost Labels}
247
248  \smallskip %
249  Labels added at function start, branch targets, \dots
250
251  \smallskip %
252  Simulation of execution is exact, just skipping over labels.
253
254\end{frame}
255
256\begin{frame}{Front-End Correctness}
257
258  Suppose we have a \blue{measurable} subtrace of Clight execution, including
259  the \blue{prefix} of that trace from initial state.
260
261  \medskip %
262  Then for handover to the back-end we have in RTLabs:
263  \begin{itemize}
264  \item A corresponding \red{prefix};
265  \item A \red{structured trace} corresponding to the measurable subtrace;
266  \item The required invariants, including \red{no repeating addresses};
267  \item A proof that the same \red{stack limit} is observed.
268  \end{itemize}
269
270  \medskip %
271  In addition, to link this with the source program:
272  \begin{itemize}
273  \item The observables for the RTLabs \blue{prefix} and \blue{structured
274      trace} are the same as for their counterparts in Clight.
275  \end{itemize}
276
277\end{frame}
278
279\begin{frame}{Lifting Measurable Traces}
280
281  For each pass find a target \red{measurable subtrace} equivalent to any
282  source \red{measurable subtrace}.
283
284  \begin{center}
285    \includegraphics{meassim.pdf}
286  \end{center}
287
288  Split normal simulation proof:
289  \begin{enumerate}
290  \item each \blue{call} becomes a \blue{call} step plus zero or more
291    \blue{normal} steps; following cost labels should stay next to the call
292    step
293  \item each \blue{return} becomes a \blue{return} plus zero or
294    more \blue{normal} steps
295  \item \blue{cost} label steps are preserved
296  \item other \blue{normal} steps become zero or more \blue{normal} steps
297  \end{enumerate}
298
299  This preserves the defining property for \red{measurable subtraces}.
300
301\end{frame}
302
303\begin{frame}{Front-End Simulation Results}
304  \blue{Cast simplification}
305  \begin{itemize}
306  \item Simplify expressions for 8-bit target.
307  \item Only expressions change --- statements are in lock-step.
308  \end{itemize}
309
310  \blue{Clight to Cminor}
311  \begin{itemize}
312  \item Stack allocation and control flow transformation.
313  \item Similar to \blue{CompCert}, but using \red{\lstinline'goto'} instead
314    of \red{\lstinline'loop'}
315  \item Large proof terms for invariants embedded in Cminor programs using
316    dependent types.
317  \end{itemize}
318
319  \blue{Cminor to RTLabs}
320  \begin{itemize}
321  \item Construction of control-flow graph.
322  \item Embedded invariants mean that translation function is \red{total}.
323  \end{itemize}
324\end{frame}
325
326\begin{frame}{Checking Cost Labelling}
327
328  Building structured traces in \blue{RTLabs} depends on having cost labelling
329  on the linear trace that is \red{sound} and \red{precise}.
330
331  \medskip %
332  Instead of carrying this as an invariant all the way through each previous
333  pass, we take a \blue{shortcut} and check on the spot.
334
335  \medskip %
336  This is similar to \blue{translation validation}.
337
338  \medskip %
339   The specific requirements for labelling \blue{RTLabs} code are:
340   \begin{description}[\red{soundness}]
341   \item[\red{soundness}] At every point, a finite bound on the number of
342     instructions until the next label;
343   \item[\red{soundness}] A label at the start of every function;
344   \item[\red{precision}] A label after every branch.
345   \end{description}
346 
347   \medskip %
348   The bound is the hard part; the rest is a simple syntactic check.
349\end{frame}
350
351\begin{frame}[fragile]{Checking Bound to Next Label}
352
353    \begin{center}
354  \begin{overprint}[0.5\linewidth]
355      \onslide<1>\includegraphics[width=0.8\linewidth]{loop.pdf}
356      \onslide<2>\includegraphics[width=0.8\linewidth]{loop1.pdf}
357      \onslide<3>\includegraphics[width=0.8\linewidth]{loop2.pdf}
358      \onslide<4>\includegraphics[width=0.8\linewidth]{loop3.pdf}
359      \onslide<5>\includegraphics[width=0.8\linewidth]{loop4.pdf}
360      \onslide<6->\includegraphics[width=0.8\linewidth]{loopx.pdf}
361  \end{overprint}
362    \end{center}
363  \begin{itemize}
364  \item Pick an arbitrary node in the CFG;
365  \item<2-> Follow successor instructions;
366  \item<4-> If we find a \alert{cost label} all the traced nodes have a
367    bound\dots
368  \item<5-> \dots so remove them and pick a new node, continue;
369  \item<6-> But if we find an \alert{unlabelled loop} then the labelling is
370    unsound, report an error.
371  \end{itemize}
372
373  \onslide<7->
374  When complete, we have a proof that every loop contains a label.
375
376\end{frame}
377
378\begin{frame}{Checking Cost Labelling}
379
380  This check of soundness and precision in RTLabs:
381  \begin{itemize}
382  \item Is partial --- will only succeed when invariants hold;
383  \item Extracts to additional checking code in the compiler;
384  \item[\checkmark] Significantly reduces the proof effort.
385  \end{itemize}
386
387  \medskip %
388  The last point is our key observation: a conventional proof would require
389  showing that:
390  \begin{quote}
391    \red{If} every loop in \blue{Cminor} code is headed by a cost label;
392    \\[1\jot]
393    \red{Then} at every point in every \blue{RTLabs} execution there is a
394    finite bound on the number of instructions until the next cost label.
395  \end{quote}
396  We expect that proving existence in general of this bound alone to be
397  considerably harder than the whole check.
398
399\end{frame}
400
401\begin{frame}{Putting the Proofs Together}
402
403  \begin{center}
404    \includegraphics[width=0.7\linewidth]{compiler.pdf}
405  \end{center}
406
407  The front-end proof demonstrates that for any Clight execution:
408  \begin{itemize}
409  \item We have a corresponding \blue{structured trace} with \blue{invariants}
410    \dots
411  \item \dots and with the same \blue{observables} of labels and call/return.
412  \end{itemize}
413
414  The back-end proof takes this structured trace and shows that
415  \begin{itemize}
416  \item There is a corresponding assembler trace \dots
417  \item \dots with \blue{sound} and \blue{precise} labelling and the same
418    \blue{observables}.
419  \end{itemize}
420
421  Combining these gives a proof that time and space costs computed on the
422  source are exactly those observed on executing the binary.
423
424\end{frame}
425
426\begin{frame}{Summary}
427
428  CerCo WP3 has produced the front-end of a C-to-8051 compiler that annotates
429  C source with runtime cycle counts and stack space.
430
431  \medskip %
432  This is formalised in the Matita proof assistant, with a proof of
433  correctness, and can be extracted as an executable compiler.
434
435  \smallskip %
436  \begin{itemize}
437  \item Modular \blue{layering} of intensional proofs over extensional
438    results;
439  \item Extensional results richer than normal and carefully chosen, but do
440    not require large changes to proof techniques.
441  \item \blue{Compile-time checks} on intermediate code reduce proof effort.
442  \item \blue{Structured traces} as a repository for invariant information.
443  \item \blue{Measurable subtraces} as a unit of cost proof.
444  \end{itemize}
445
446  \smallskip %
447  Going beyond this, in future projects (REMS, \dots) we plan to:
448  \begin{itemize}
449  \item Generalize  labelling schemes for more sophisticated targets;
450  \item Apply \blue{decompilation} to jump the gap from source to binary.
451  \end{itemize}
452
453\end{frame}
454\end{document}
455
456% LocalWords:  LFCS Matita CerCo intensional WCET toolchain CompCert Clight ASM
457% LocalWords:  reexecutes subtrace RTLabs subtraces Garnier McKinna cerco
458%  LocalWords:  Axiomatization axiomatized Cminor goto
Note: See TracBrowser for help on using the repository browser.