source: Deliverables/D4.4/paolo.tex @ 3213

Last change on this file since 3213 was 3213, checked in by tranquil, 8 years ago

summary for D4.4, and other modifications

File size: 19.9 KB
Line 
1\documentclass[11pt, epsf, a4wide]{article}
2
3\usepackage{../style/cerco}
4
5\usepackage{amsfonts}
6\usepackage{amsmath}
7\usepackage{amssymb} 
8\usepackage[english]{babel}
9\usepackage{graphicx}
10\usepackage[utf8x]{inputenc}
11\usepackage{listings}
12\usepackage[all]{xy}
13
14\newcommand{\clap}[1]{\hbox to 0pt{\hss#1\hss}}
15\def\mathclap{\mathpalette\mathclapinternal}
16\def\mathclapinternal#1#2{%
17           \clap{$\mathsurround=0pt#1{#2}$}}
18
19\newcommand{\unif}{\ensuremath{\stackrel{\scriptscriptstyle ?}{\equiv}}}
20\newcommand{\founif}{\ensuremath{\stackrel{\scriptscriptstyle ?}{=}}}
21\newcommand{\vect}[1]{\ensuremath{\stackrel{\to}{#1}}}
22\newcommand{\sem}[1]{\ensuremath{[\![#1]\!]}}
23\newcommand{\bsem}[2]{\ensuremath{[\![#1;\;#2]\!]}}
24\newcommand{\s}[1]{\textsf{#1}}
25
26\renewcommand{\verb}{\lstinline}
27\def\lstlanguagefiles{lst-grafite.tex}
28\lstset{language=Grafite}
29
30\usepackage{lscape}
31\usepackage{stmaryrd}
32\usepackage{threeparttable}
33\usepackage{caption,subcaption}
34\usepackage{url}
35\usepackage{tikz}
36\usetikzlibrary{positioning,calc,patterns}
37\title{
38INFORMATION AND COMMUNICATION TECHNOLOGIES\\
39(ICT)\\
40PROGRAMME\\
41\vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}}
42
43\lstdefinelanguage{matita-ocaml}
44  {keywords={definition,coercion,lemma,theorem,remark,inductive,record,qed,let,in,rec,match,return,with,Type,try},
45   morekeywords={[2]whd,normalize,elim,cases,destruct},
46   morekeywords={[3]type,of},
47   mathescape=true,
48  }
49
50\lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false,
51        keywordstyle=\color{red}\bfseries,
52        keywordstyle=[2]\color{blue},
53        keywordstyle=[3]\color{blue}\bfseries,
54        commentstyle=\color{green},
55        stringstyle=\color{blue},
56        showspaces=false,showstringspaces=false}
57
58\lstset{extendedchars=false}
59\lstset{inputencoding=utf8x}
60\DeclareUnicodeCharacter{8797}{:=\ }
61\DeclareUnicodeCharacter{10746}{++}
62\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
63\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}
64\DeclareUnicodeCharacter{8230}{.\!\!.\!\!.\@\ }
65\author{Mauro Piccolo, Claudio Sacerdoti Coen and Paolo Tranquilli\\[15pt]
66\small Department of Computer Science and Engineering, University of Bologna\\
67\small \verb|\{Mauro.Piccolo, Claudio.SacerdotiCoen, Paolo.Tranquilli\}@unibo.it|}
68\title{Certifying the back end pass of the CerCo annotating compiler}
69\date{}
70
71\begin{document}
72\maketitle
73\begin{abstract}
74We present the certifying effort of the back end of the CerCo annotating
75compiler to 8051 assembly. Apart from the usual effort needed in propagating
76the extensional part of execution traces, additional care must be taken in
77order to ensure a form of intensional correctness of the pass, necessary to
78prove the lifting of computational costs correct. We concentrate here on two
79aspects of the proof: how stack is dealt with during the pass, and how generic graph
80language passes can be proven correct from an extensional and intensional point
81of view.
82\end{abstract}
83
84\section{The Back-End Correctness Proof at a Glance}
85\label{sec:glance}
86
87The back-end part of the compiler takes an \s{RTLabs} program together with
88an initialising cost label and gives the assembly code to be fed to the
89assembler. From the semantic point of view, at this stage of the compiler,
90we are interested in propagating structured traces, as explained in \cite{simulation}.
91
92A schema with all the back-end passes and the salient and common points of each one is the following:
93$$\overbrace{\s{RTLabs} \longrightarrow
94\overbrace{\underbrace{\overbrace{\s{RTL}}^{\mathclap{\text{stack merge}}} \longrightarrow \s{ERTL} \longrightarrow \s{LTL}}_{\texttt{joint}\text{ graph languages}}\longrightarrow
95\hskip -42.5pt\overbrace{\hskip 42.5pt\s{LIN}}^{\text{linearisation}}}^{\texttt{joint}\text{ languages}}\longrightarrow
96\hskip -42.5pt\underbrace{\hskip 42.5pt\s{ASM}}_{\text{linear map}}}^{\texttt{abstract\_status}}$$
97
98First, here all language semantics share a common interface via \verb+abstract_status+,
99which is at the base of the definition of structured traces.
100The generic shape of each pass of the back-end is thus the same: we get in input an unstructured
101prefix trace followed by a structured one which we want to measure in the source language,
102and we need to prove the existence of the corresponding unstructured and structured
103traces in the target language, with the two traces producing the same observables.
104
105From \s{RTL} down to \s{LIN} we are in the common syntactic and semantic language
106we call \verb+joint+. Inside the first of these languages, \s{RTL}, we pass
107from separate local memory spaces for each function call to a unique one. Even
108though parameters are passed on stack only in \s{ERTL} and variable allocation
109and spilling is done in \s{LTL}, we already have values on stack at this stage,
110and we ensure all the stack space that will eventually be needed is allocated
111at each call.
112
113The next two passes live in the graph part of \verb+joint+, and can benefit from
114a generic approach to graph translation and its proof of correctness, as described
115in~\autoref{modular}.
116
117Next, a generic \verb+joint+ linearisation pass is performed to go from
118\s{LTL} to \s{LIN}. The proof is generic too, with reasonable and straightforward proof
119obligations asking that the common semantic operations of the two languages
120depend on program counters in memory only to control the flow. Program counters
121are indeed the only semantic entity that changes during this pass.
122
123The last pass is, as far as function bodies are concerned, a simple linear
124map---every \s{LIN} instruction is mapped to a single \s{ASM} instruction,
125so that we may say that \s{LIN} is a subset of \s{ASM}. Less straightforward
126is that function bodies are merged together, and that pointers pass from
127formal data to actual one. Values in memory need therefore to be remapped
128to relate the states in the two languages.
129
130\subsection{A common invariant}
131This block of traces with properties is recurrent enough to merit the
132\s{matita} definition in \autoref{fig:ft_and_tlr}. We rely on the fact
133that all semantics from \s{RTLabs} down to object code can be expressed
134with the \verb+abstract_status+ record, by which all our definitions of traces
135are expressed.
136\begin{figure}
137\begin{lstlisting}
138record ft_and_tlr (S : abstract_status) (prefix, subtrace : list intensional_event)
139(fn : ident) (st1 : S) : Prop ≝
140{ st2 : S
141; st3 : S
142; ft : flat_trace S st1 st2
143; tlr : trace_label_return S st2 st3
144; tlr_unrpt : tlr_unrepeating … tlr
145; ft_is_prefix : ft_observables … ft = prefix
146; fn_is_current : ft_current_function … ft = Some ? fn
147; tlr_is_subtrace : observables_trace_label_return … tlr fn = subtrace
148}.
149\end{lstlisting}
150\caption{The \verb+ft_and_tlr+ record (which stands for
151\verb+flat_trace+ and \verb+trace_label_return+), which abstracts the kind
152of invariant that all back-end passes need to preserve.}
153\label{fig:ft_and_tlr}
154\end{figure}
155The parameters of the record fix respectively the intensional events encountered
156in the prefix (as stated by \verb+ft_is_prefix+), the ones in the structured trace
157(\verb+tlr_is_subtrace+) and the name of the function in which \verb+tlr+
158takes place (\verb+fn_is_current+), and the initial state.
159The additional property \verb+tlr_unrpt+
160tells that program counters do not repeat locally (i.e.\ between two labels)
161in \verb+tlr+, a property needed for the soundness of the cost static analysis.
162
163\subsection{The statement}
164The back-end correctness theorem proves the \s{matita} statement in
165\autoref{fig:statement}. The output of the a proof is
166a \verb+ft_and_tlr+ structure as outlined before, which constitutes the
167preconditions of the assembly proof. The \verb+sigma+ and \verb+pol+
168parameters are passed to \s{ASM}'s semantics, however they have significance
169only with respect to the \s{ASM} to object code correctness
170\footnote{\verb+sigma+ and \verb+pol+ are what allows to maps instructions
171between \s{ASM} and the produced object code. This information is gained during
172the jump expansion pass, cf.~\cite{policy}.}.
173
174\begin{figure}
175\begin{subfigure}{\linewidth}
176\begin{lstlisting}
177theorem back_end_correct :
178∀observe,init_cost,p_rtlabs,p_asm,stack_m,max_stack,prefix,subtrace,fn.
179back_end observe init_cost p_rtlabs =
180    return 〈 p_asm, stack_m, max_stack 〉 →
181back_end_preconditions p_rtlabs (stack_sizes stack_m) max_stack prefix subtrace fn →
182∀sigma,pol.
183ft_and_tlr (ASM_status p_asm sigma pol)
184    prefix subtrace fn (initialise_status ? p_asm).
185\end{lstlisting}
186\caption{The statement.}
187\end{subfigure}
188\\[10pt]
189\begin{subfigure}{\linewidth}
190\begin{lstlisting}
191record back_end_preconditions (p_rtlabs : RTLabs_program)
192(stacksizes : ident → option ℕ) (max_stack : ℕ)
193(prefix, subtrace : list intensional_event) (fn : ident) : Prop ≝
194{ ra_init : RTLabs_status (make_global p_rtlabs)
195; ra_init_ok : make_ext_initial_state p_rtlabs = return ra_init
196; ra_max :
197  le (maximum (update_stacksize_info stacksizes (mk_stacksize_info 0 0)
198    (extract_call_ud_from_observables (prefix @ subtrace)))) max_stack
199; ra_ft_tlr : ft_and_tlr (RTLabs_status (make_global p_rtlabs))
200    prefix subtrace fn ra_init
201}.
202\end{lstlisting}
203\caption{The definition of preconditions for the correctness of the back-end.}
204\label{subfig:back_end_preconditions}
205\end{subfigure}
206\caption{The statement of the back-end correctness result.}
207\label{fig:statement}
208\end{figure}
209
210Care must be taken in dealing with the stack. The back-end pass produces a
211\emph{stack model}: the amount of stack needed by each function (\verb+stack_m+),
212together with the maximal usable stack (\verb+max_stack+, $2^{16}$ minus the size occupied by globals).
213While programs in the front end do not fail for stack overflow, the stack
214information is lifted to source much in the same ways as time consumption information,
215so that we can actually use as a hypothesis on input source traces that they
216do not use more stack than allowed.
217This hypothesis is included in the \verb+measurable+ predicate over
218front-end traces, and we put it to use during the back-end pass,
219where we pass to a unique bounded stack space for all functions. More details
220will be presented in \autoref{sec:stack}.
221
222The above explanation is why the back-end correctness results requires some additional preconditions
223with respect to a \verb+ft_and_tlr+ for \s{RTLabs}. This is accomplished by the
224\verb+ra_max+ field in the record \verb+back_end_preconditions+
225(\autoref{subfig:back_end_preconditions}). The other fields hold the initial
226state \verb+ra_init+ of the program in \s{RTLabs} (with a proof that it is
227actually the initial state).
228
229\section{Dealing with the Stack}
230\label{sec:stack}
231Setting apart the extensional details of the proofs, a delicate point is how
232we deal with stack.
233
234A first minor issue is that the information we have about stack usage of
235each function call evolves along the back-end passes:
236\begin{itemize}
237 \item in \s{RTL} we must allocate some stack for what where referenced
238 local variables (including local arrays) in source code;
239 \item in \s{ERTL} we add up the stack used to pass parameters that overflow
240 the hardware registers available for the task;
241 \item finally in \s{LTL} we add the space necessary for spilled
242 pseudo-registers, and the stack usage of functions is finally fixed.
243\end{itemize}
244
245Another issue that we already mentioned above is that somewhere during the
246back-end pass we must pass from a semantics with no stack overflow error to
247one that can fail.
248
249In the following we describe the approach we have taken to these issues.
250
251\subsection{An evolving stack usage}
252\label{subsec:evolvingstack}
253The stack size we know each function must at least have is written in
254a field of \verb+joint+ internal functions (called \verb+joint_if_stacksize+),
255and as already said
256evolves during the back-end pass. The \emph{naïve} approach to defining
257semantics of these languages is to allocate the minimal necessary space for
258each function call, reading it from this syntactic field. The drawback is
259that then at each update of the stack size we must remap the data pointers
260in memory, moreover in a way that is dependent on the call stack. This is
261exemplified by the picture in \autoref{subfig:stacktight}.
262\begin{figure}
263\begin{subfigure}{.475\linewidth}
264\centering
265$
266\begin{tikzpicture}[thick, line cap=round,
267    m/.style = {execute at begin node=$, execute at end node=$}]
268\begin{scope}[every node/.style={draw, rectangle,
269    minimum width=1.25cm, m}, every label/.style={minimum width=0, draw=none}]
270    \node (main src) [minimum height=.5cm]
271        [label=left:main] {};
272    \node (f1 src) [above=-1pt of main src,minimum height=1cm]
273        [label=left:f] {};
274    \node (g src) [above=-1pt of f1 src,minimum height=.75cm]
275        [label=left:g] {};
276    \node (f2 src) [above=-1pt of g src,minimum height=1cm]
277        [label=left:f] {};
278
279    \node (main tgt) at ([xshift=2cm]main src.south east) [anchor=south west,minimum height=1cm]
280        [label=right:main] {};
281    \node (f1 tgt) [above=-1pt of main tgt,minimum height=1.25cm]
282        [label=right:f] {};
283    \node (g tgt) [above=-1pt of f1 tgt,minimum height=1.5cm]
284        [label=right:g] {};
285    \node (f2 tgt) [above=-1pt of g tgt,minimum height=1.25cm]
286        [label=right:f] {};
287
288\end{scope}
289\foreach \x/\off in {main/.5, f1/1, g/.75, f2/1} {
290    \draw [densely dashed, semithick]
291        ([yshift=1pt]\x\space src.south east) --
292        ([yshift=1pt]\x\space tgt.south west);
293    \draw [densely dashed, semithick] (\x\space src.north east) --
294        ([yshift=\off cm]\x\space tgt.south west) coordinate (tmp)
295        -- (tmp-|\x\space tgt.east);
296    \fill [pattern=dots, opacity=.5] (tmp) |- (\x\space tgt.north east) |- (tmp);
297    \path (\x\space src.east) -- node [sloped] {$\hookrightarrow$}
298        ($(\x\space tgt.south west)!.5!(tmp)$);
299}
300\node (heading tgt) [above=.125cm of f2 tgt] {target stack};
301\node (heading src) at (heading tgt-|f2 src) {source stack};
302
303\draw [|->] (heading src) -- (heading tgt);
304\end{tikzpicture}
305$
306\caption{How stack is remapped if we use tight stack spaces for each language.
307Dotted areas are the new parts of the local stacks.}
308\label{subfig:stacktight}
309\end{subfigure}
310\hfill
311\begin{subfigure}{.475\linewidth}
312\centering
313$
314\begin{tikzpicture}[thick, line cap=round,
315    m/.style = {execute at begin node=$, execute at end node=$}]
316\begin{scope}[every node/.style={draw, rectangle,
317    minimum width=1.25cm, m}, every label/.style={minimum width=0, draw=none}]
318    \node (main src) [minimum height=1.25cm]
319        [label=left:main] {};
320    \node (f1 src) [above=-1pt of main src,minimum height=1.75cm]
321        [label=left:f] {};
322    \node (g src) [above=-1pt of f1 src,minimum height=1.75cm]
323        [label=left:g] {};
324    \node (f2 src) [above=-1pt of g src,minimum height=1.75cm]
325        [label=left:f] {};
326
327    \node (main tgt) at ([xshift=2cm]main src.south east) [anchor=south west,minimum height=1.25cm]
328        [label=right:main] {};
329    \node (f1 tgt) [above=-1pt of main tgt,minimum height=1.75cm]
330        [label=right:f] {};
331    \node (g tgt) [above=-1pt of f1 tgt,minimum height=1.75cm]
332        [label=right:g] {};
333    \node (f2 tgt) [above=-1pt of g tgt,minimum height=1.75cm]
334        [label=right:f] {};
335
336\end{scope}
337\foreach \x/\off/\o in {main/.5/1, f1/1/1.25, g/.75/1.5, f2/1/1.25} {
338    \draw [densely dashed, semithick]
339        ([yshift=1pt]\x\space src.south east) --
340        ([yshift=1pt]\x\space tgt.south west);
341    \draw [densely dashed, semithick]
342        ([yshift=\off cm]\x\space src.south west) --
343        ([yshift=\off cm]\x\space tgt.south west) coordinate (tmp)
344        -- (tmp-|\x\space tgt.east);
345    \fill [pattern=dots, opacity=.5] (tmp) --
346        ([yshift=\o cm]\x\space tgt.south west) coordinate (tmp2) -| (tmp-|\x\space tgt.east) -- (tmp);
347    \draw [densely dashed] (tmp2) -- (tmp2-|\x\space tgt.east);
348    \fill [pattern=north east lines, opacity=.5]
349        (tmp2) |- (\x\space tgt.north east) |- (tmp2);
350    \path ($(\x\space src.south east)!.5!(\x\space src.east|-tmp)$) --
351        node {$\hookrightarrow$}
352        ($(\x\space tgt.south west)!.5!(tmp)$);
353    \fill [pattern=north east lines, opacity=.5]
354        (\x\space src.west|-tmp) coordinate (tmp) |- (\x\space src.north east) |- (tmp);
355}
356\node (heading tgt) [above=.125cm of f2 tgt] {target stack};
357\node (heading src) at (heading tgt-|f2 src) {source stack};
358
359\draw [|->] (heading src) -- (heading tgt);
360\end{tikzpicture}
361$
362\caption{Constant stack usage taken as a semantic parameter across languages.
363Shaded areas are unused parts of the stack.}
364\label{subfig:stacksame}
365\end{subfigure}
366\caption{Comparison between using tight stack sizes across langauges, i.e.\
367reading how much stack is required from the function, and using the same
368stack sizes provided as a parameter to the semantics. We suppose the call stack
369is $main, f, g, f$.}
370\label{fig:stacktightsame}
371\end{figure}
372
373We decided to take another approach, where stack sizes of each function call
374are a parameter to the semantics, and all passes are proved correct for the
375same stack sizes in source and target language, possibly with a hypothesis
376on these ``semantic'' stack sizes to be greater than the ``syntactic'' ones
377(but not all passes require it). The picture becomes thus the one in
378\autoref{subfig:stacksame}. The advantage is that now the data in memory does
379not change along the evolving of stack usage, apart from the first time we
380pass from separate independent stack spaces and a unique one, a situation
381we describe in the next subsection.
382
383\subsection{From an unbounded to a bounded stack}
384The way the \verb+ft_and_tlr+ is propagated is almost everwhere by means of
385the proof of existence of a special relation between states that is a simulation
386with respect to execution. There are more details involved than in a usual
387extensional proof, regarding the intensional preservation. The details are
388contained in~\cite{simulation}.
389
390Here we concentrate on a particular aspect that eludes the generic treatment:
391the moment when we pass from an unbounded stack to a bounded one. This passage
392is delicate for two reasons. Firstly, while usually we can assure forward simulation
393of an execution step by simply depending on that step not producing an error,
394here is no more the case, as the stack overflow error must start happening somewhere in
395the back-end. Secondly, merging the stack spaces of the function calls requires
396mapping the pointer values contained in memory.
397
398In order to tackle these subtleties, we firstly decided to isolate this passage,
399without mingling it with other pass-dependent semantic preservation proofs.
400Rather than performing the proof when passing from a language to another,
401we decided to split in three the semantics of a single language (\s{RTL}).
402
403The first one, called \verb+RTL_semantics_separate+, mimics how stack is modelled
404in \s{RTLabs}: each function call gets allocated an independent stack space.
405Passing from \s{RTLabs} to this semantics involves the usual simulation relation.
406A peculiarity is that data pointers in memory need not change.
407
408The middle, second one, is called
409\verb+RTL_semantics_separate_overflow+, and it has separate allocated stack spaces
410just as above. However the execution is defined
411so that if the call stack as a whole uses more space than (later) physically
412available (as provided by a parameter), an error is issued. The actual states are identical,
413and the simulation is an almost trivial 1-to-1, where the stack correctness
414inherited from \verb+back_end_preconditions+ can be easily put to use even
415if the generic results for structure preserving simulation actually cannot be
416used here.
417
418The third one, defined as \verb+RTL_semantics_unique+, switches to a semantics with
419a single, bounded stack space, where moreover memory has already been granted to
420globals too. From this semantics down to \s{LIN}, all data in memory excluding
421pieces of program counters remain the same, as specified in \autoref{subsec:evolvingstack}.
422However in order to show a
423forwarding simulation from \verb+RTL_semantics_separate_overflow+, one needs to
424remap data pointers. No further hypothesis on stack usage needs to be employed,
425as it is integrated in the fact that the execution step does not fail.
426
427\input{mauro.tex}
428
429\bibliographystyle{plain}
430\bibliography{report}
431\end{document}
Note: See TracBrowser for help on using the repository browser.