source: Deliverables/D4.4/paolo.tex

Last change on this file was 3222, checked in by tranquil, 8 years ago

added pages to included papers. final version.

File size: 19.9 KB
Line 
1\documentclass[11pt, epsf, a4wide]{article}
2
3\usepackage{../style/cerco}
4
5\usepackage{amsfonts}
6\usepackage{amsmath}
7\usepackage{amssymb} 
8\usepackage[english]{babel}
9\usepackage{graphicx}
10\usepackage[utf8x]{inputenc}
11\usepackage{listings}
12\usepackage[all]{xy}
13% \usepackage{supertabular}
14
15\newcommand{\clap}[1]{\hbox to 0pt{\hss#1\hss}}
16\def\mathclap{\mathpalette\mathclapinternal}
17\def\mathclapinternal#1#2{%
18           \clap{$\mathsurround=0pt#1{#2}$}}
19
20\newcommand{\unif}{\ensuremath{\stackrel{\scriptscriptstyle ?}{\equiv}}}
21\newcommand{\founif}{\ensuremath{\stackrel{\scriptscriptstyle ?}{=}}}
22\newcommand{\vect}[1]{\ensuremath{\stackrel{\to}{#1}}}
23\newcommand{\sem}[1]{\ensuremath{[\![#1]\!]}}
24\newcommand{\bsem}[2]{\ensuremath{[\![#1;\;#2]\!]}}
25\newcommand{\s}[1]{\textsf{#1}}
26
27\renewcommand{\verb}{\lstinline}
28\def\lstlanguagefiles{lst-grafite.tex}
29\lstset{language=Grafite}
30
31\usepackage{lscape}
32\usepackage{stmaryrd}
33\usepackage{threeparttable}
34\usepackage{caption,subcaption}
35\usepackage{url}
36\usepackage{tikz}
37\usetikzlibrary{positioning,calc,patterns}
38\title{
39INFORMATION AND COMMUNICATION TECHNOLOGIES\\
40(ICT)\\
41PROGRAMME\\
42\vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}}
43
44\lstdefinelanguage{matita-ocaml}
45  {keywords={definition,coercion,lemma,theorem,remark,inductive,record,qed,let,in,rec,match,return,with,Type,try},
46   morekeywords={[2]whd,normalize,elim,cases,destruct},
47   morekeywords={[3]type,of},
48   mathescape=true,
49  }
50
51\lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false,
52        keywordstyle=\color{red}\bfseries,
53        keywordstyle=[2]\color{blue},
54        keywordstyle=[3]\color{blue}\bfseries,
55        commentstyle=\color{green},
56        stringstyle=\color{blue},
57        showspaces=false,showstringspaces=false}
58
59\lstset{extendedchars=false}
60\lstset{inputencoding=utf8x}
61\DeclareUnicodeCharacter{8797}{:=\ }
62\DeclareUnicodeCharacter{10746}{++}
63\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
64\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}
65\DeclareUnicodeCharacter{8230}{.\!\!.\!\!.\@\ }
66\author{Mauro Piccolo, Claudio Sacerdoti Coen and Paolo Tranquilli\\[15pt]
67\small Department of Computer Science and Engineering, University of Bologna\\
68\small \verb|\{Mauro.Piccolo, Claudio.SacerdotiCoen, Paolo.Tranquilli\}@unibo.it|}
69\title{Certifying the back-end pass of the CerCo annotating compiler}
70\date{}
71
72\begin{document}
73\maketitle
74\begin{abstract}
75We present the certifying effort of the back-end of the CerCo annotating
76compiler to 8051 assembly. Apart from the proofs needed for propagating
77the extensional part of execution traces, additional care must be taken in
78order to ensure a form of intensional correctness of the pass, necessary to
79prove that the lifting of computational costs is correct. We concentrate here on two
80aspects of the proof: how stack is dealt with during the pass, and how generic graph
81language passes can be proven correct from an extensional and intensional point
82of view.
83\end{abstract}
84
85\section{The Back-End Correctness Proof at a Glance}
86\label{sec:glance}
87
88The back-end part of the compiler takes an \s{RTLabs} program together with
89an initialising cost label and gives the assembly code to be fed to the
90assembler. From the semantic point of view, at this stage of the compiler,
91we are interested in propagating structured traces, as explained in \cite{simulation}.
92
93A schema with all the back-end passes and the salient and common points of each one is the following:
94$$\overbrace{\s{RTLabs} \longrightarrow
95\overbrace{\underbrace{\overbrace{\s{RTL}}^{\mathclap{\text{stack merge}}} \longrightarrow \s{ERTL} \longrightarrow \s{LTL}}_{\texttt{joint}\text{ graph languages}}\longrightarrow
96\hskip -42.5pt\overbrace{\hskip 42.5pt\s{LIN}}^{\text{linearisation}}}^{\texttt{joint}\text{ languages}}\longrightarrow
97\hskip -42.5pt\underbrace{\hskip 42.5pt\s{ASM}}_{\text{linear map}}}^{\texttt{abstract\_status}}$$
98
99First, here all language semantics share a common interface via \verb+abstract_status+,
100which is at the base of the definition of structured traces.
101The generic shape of each pass of the back-end is thus the same: we get in input an unstructured
102prefix trace followed by a structured one which we want to measure in the source language,
103and we need to prove the existence of the corresponding unstructured and structured
104traces in the target language, with the two traces producing the same observables.
105
106From \s{RTL} down to \s{LIN} we are in the common syntactic and semantic language
107we call \verb+joint+. Inside the first of these languages, \s{RTL}, we pass
108from separate local memory spaces for each function call to a unique one. Even
109though parameters are passed on stack only in \s{ERTL} and variable allocation
110and spilling is done in \s{LTL}, we already have values on stack at this stage,
111and we ensure all the stack space that will eventually be needed is allocated
112at each call.
113
114The next two passes live in the graph part of \verb+joint+, and can benefit from
115a generic approach to graph translation and its proof of correctness, as described
116in~\autoref{modular}.
117
118Next, a generic \verb+joint+ linearisation pass is performed to go from
119\s{LTL} to \s{LIN}. The proof is generic too, with reasonable and straightforward proof
120obligations asking that the common semantic operations of the two languages
121depend on program counters in memory only to control the flow. Program counters
122are indeed the only semantic entity that changes during this pass.
123
124The last pass is, as far as function bodies are concerned, a simple linear
125map---every \s{LIN} instruction is mapped to a single \s{ASM} instruction,
126so that we may say that \s{LIN} is a subset of \s{ASM}. Less straightforward
127is that function bodies are merged together, and that pointers pass from
128formal data to actual one. Values in memory need therefore to be remapped
129to relate the states in the two languages.
130
131\subsection{A common invariant}
132This block of traces with properties is recurrent enough to merit the
133\s{matita} definition in \autoref{fig:ft_and_tlr}. We rely on the fact
134that all semantics from \s{RTLabs} down to object code can be expressed
135with the \verb+abstract_status+ record, by which all our definitions of traces
136are expressed.
137\begin{figure}
138\begin{lstlisting}
139record ft_and_tlr (S : abstract_status) (prefix, subtrace : list intensional_event)
140(fn : ident) (st1 : S) : Prop ≝
141{ st2 : S
142; st3 : S
143; ft : flat_trace S st1 st2
144; tlr : trace_label_return S st2 st3
145; tlr_unrpt : tlr_unrepeating … tlr
146; ft_is_prefix : ft_observables … ft = prefix
147; fn_is_current : ft_current_function … ft = Some ? fn
148; tlr_is_subtrace : observables_trace_label_return … tlr fn = subtrace
149}.
150\end{lstlisting}
151\caption{The \verb+ft_and_tlr+ record (which stands for
152\verb+flat_trace+ and \verb+trace_label_return+), which abstracts the kind
153of invariant that all back-end passes need to preserve.}
154\label{fig:ft_and_tlr}
155\end{figure}
156The parameters of the record fix respectively the intensional events encountered
157in the prefix (as stated by \verb+ft_is_prefix+), the ones in the structured trace
158(\verb+tlr_is_subtrace+) and the name of the function in which \verb+tlr+
159takes place (\verb+fn_is_current+), and the initial state.
160The additional property \verb+tlr_unrpt+
161tells that program counters do not repeat locally (i.e.\ between two labels)
162in \verb+tlr+, a property needed for the soundness of the cost static analysis.
163
164\subsection{The statement}
165The back-end correctness theorem proves the \s{matita} statement in
166\autoref{fig:statement}. The output of the a proof is
167a \verb+ft_and_tlr+ structure as outlined before, which constitutes the
168preconditions of the assembly proof. The \verb+sigma+ and \verb+pol+
169parameters are passed to \s{ASM}'s semantics, however they have significance
170only with respect to the \s{ASM} to object code correctness
171\footnote{\verb+sigma+ and \verb+pol+ are what allows to maps instructions
172between \s{ASM} and the produced object code. This information is gained during
173the jump expansion pass, cf.~\cite{policy}.}.
174
175\begin{figure}
176\begin{subfigure}{\linewidth}
177\begin{lstlisting}
178theorem back_end_correct :
179∀observe,init_cost,p_rtlabs,p_asm,stack_m,max_stack,prefix,subtrace,fn.
180back_end observe init_cost p_rtlabs =
181    return 〈 p_asm, stack_m, max_stack 〉 →
182back_end_preconditions p_rtlabs (stack_sizes stack_m) max_stack prefix subtrace fn →
183∀sigma,pol.
184ft_and_tlr (ASM_status p_asm sigma pol)
185    prefix subtrace fn (initialise_status ? p_asm).
186\end{lstlisting}
187\caption{The statement.}
188\end{subfigure}
189\\[10pt]
190\begin{subfigure}{\linewidth}
191\begin{lstlisting}
192record back_end_preconditions (p_rtlabs : RTLabs_program)
193(stacksizes : ident → option ℕ) (max_stack : ℕ)
194(prefix, subtrace : list intensional_event) (fn : ident) : Prop ≝
195{ ra_init : RTLabs_status (make_global p_rtlabs)
196; ra_init_ok : make_ext_initial_state p_rtlabs = return ra_init
197; ra_max :
198  le (maximum (update_stacksize_info stacksizes (mk_stacksize_info 0 0)
199    (extract_call_ud_from_observables (prefix @ subtrace)))) max_stack
200; ra_ft_tlr : ft_and_tlr (RTLabs_status (make_global p_rtlabs))
201    prefix subtrace fn ra_init
202}.
203\end{lstlisting}
204\caption{The definition of preconditions for the correctness of the back-end.}
205\label{subfig:back_end_preconditions}
206\end{subfigure}
207\caption{The statement of the back-end correctness result.}
208\label{fig:statement}
209\end{figure}
210
211Care must be taken in dealing with the stack. The back-end pass produces a
212\emph{stack model}: the amount of stack needed by each function (\verb+stack_m+),
213together with the maximal usable stack (\verb+max_stack+, $2^{16}$ minus the size occupied by global variables).
214While programs in the front end do not fail for stack overflow, the stack
215information is lifted to source much in the same ways as time consumption information,
216so that we can actually use as a hypothesis on input source traces that they
217do not use more stack than allowed.
218This hypothesis is included in the \verb+measurable+ predicate over
219front-end traces, and we put it to use during the back-end pass,
220where we pass to a unique bounded stack space for all functions. More details
221will be presented in \autoref{sec:stack}.
222
223The above explanation is why the back-end correctness results requires some additional preconditions
224with respect to a \verb+ft_and_tlr+ for \s{RTLabs}. This is accomplished by the
225\verb+ra_max+ field in the record \verb+back_end_preconditions+
226(\autoref{subfig:back_end_preconditions}). The other fields hold the initial
227state \verb+ra_init+ of the program in \s{RTLabs} (with a proof that it is
228actually the initial state).
229
230\section{Dealing with the Stack}
231\label{sec:stack}
232Setting apart the extensional details of the proofs, a delicate point is how
233we deal with stack.
234
235A first minor issue is that the information we have about stack usage of
236each function call evolves along the back-end passes:
237\begin{itemize}
238 \item in \s{RTL} we must allocate some stack for what where referenced
239 local variables (including local arrays) in source code;
240 \item in \s{ERTL} we add up the stack used to pass parameters that overflow
241 the hardware registers available for the task;
242 \item finally in \s{LTL} we add the space necessary for spilled
243 pseudo-registers, and the stack usage of functions is finally fixed.
244\end{itemize}
245
246Another issue that we already mentioned above is that somewhere during the
247back-end pass we must pass from a semantics with no stack overflow error to
248one that can fail.
249
250In the following we describe the approach we have taken to these issues.
251
252\subsection{An evolving stack usage}
253\label{subsec:evolvingstack}
254The stack size we know each function must at least have is written in
255a field of \verb+joint+ internal functions (called \verb+joint_if_stacksize+),
256and as already said
257evolves during the back-end pass. The \emph{naïve} approach to defining
258semantics of these languages is to allocate the minimal necessary space for
259each function call, reading it from this syntactic field. The drawback is
260that then at each update of the stack size we must remap the data pointers
261in memory, moreover in a way that is dependent on the call stack. This is
262exemplified by the picture in \autoref{subfig:stacktight}.
263\begin{figure}
264\begin{subfigure}{.475\linewidth}
265\centering
266$
267\begin{tikzpicture}[thick, line cap=round,
268    m/.style = {execute at begin node=$, execute at end node=$}]
269\begin{scope}[every node/.style={draw, rectangle,
270    minimum width=1.25cm, m}, every label/.style={minimum width=0, draw=none}]
271    \node (main src) [minimum height=.5cm]
272        [label=left:main] {};
273    \node (f1 src) [above=-1pt of main src,minimum height=1cm]
274        [label=left:f] {};
275    \node (g src) [above=-1pt of f1 src,minimum height=.75cm]
276        [label=left:g] {};
277    \node (f2 src) [above=-1pt of g src,minimum height=1cm]
278        [label=left:f] {};
279
280    \node (main tgt) at ([xshift=2cm]main src.south east) [anchor=south west,minimum height=1cm]
281        [label=right:main] {};
282    \node (f1 tgt) [above=-1pt of main tgt,minimum height=1.25cm]
283        [label=right:f] {};
284    \node (g tgt) [above=-1pt of f1 tgt,minimum height=1.5cm]
285        [label=right:g] {};
286    \node (f2 tgt) [above=-1pt of g tgt,minimum height=1.25cm]
287        [label=right:f] {};
288
289\end{scope}
290\foreach \x/\off in {main/.5, f1/1, g/.75, f2/1} {
291    \draw [densely dashed, semithick]
292        ([yshift=1pt]\x\space src.south east) --
293        ([yshift=1pt]\x\space tgt.south west);
294    \draw [densely dashed, semithick] (\x\space src.north east) --
295        ([yshift=\off cm]\x\space tgt.south west) coordinate (tmp)
296        -- (tmp-|\x\space tgt.east);
297    \fill [pattern=dots, opacity=.5] (tmp) |- (\x\space tgt.north east) |- (tmp);
298    \path (\x\space src.east) -- node [sloped] {$\hookrightarrow$}
299        ($(\x\space tgt.south west)!.5!(tmp)$);
300}
301\node (heading tgt) [above=.125cm of f2 tgt] {target stack};
302\node (heading src) at (heading tgt-|f2 src) {source stack};
303
304\draw [|->] (heading src) -- (heading tgt);
305\end{tikzpicture}
306$
307\caption{How stack is remapped if we use tight stack spaces for each language.
308Dotted areas are the new parts of the local stacks.}
309\label{subfig:stacktight}
310\end{subfigure}
311\hfill
312\begin{subfigure}{.475\linewidth}
313\centering
314$
315\begin{tikzpicture}[thick, line cap=round,
316    m/.style = {execute at begin node=$, execute at end node=$}]
317\begin{scope}[every node/.style={draw, rectangle,
318    minimum width=1.25cm, m}, every label/.style={minimum width=0, draw=none}]
319    \node (main src) [minimum height=1.25cm]
320        [label=left:main] {};
321    \node (f1 src) [above=-1pt of main src,minimum height=1.75cm]
322        [label=left:f] {};
323    \node (g src) [above=-1pt of f1 src,minimum height=1.75cm]
324        [label=left:g] {};
325    \node (f2 src) [above=-1pt of g src,minimum height=1.75cm]
326        [label=left:f] {};
327
328    \node (main tgt) at ([xshift=2cm]main src.south east) [anchor=south west,minimum height=1.25cm]
329        [label=right:main] {};
330    \node (f1 tgt) [above=-1pt of main tgt,minimum height=1.75cm]
331        [label=right:f] {};
332    \node (g tgt) [above=-1pt of f1 tgt,minimum height=1.75cm]
333        [label=right:g] {};
334    \node (f2 tgt) [above=-1pt of g tgt,minimum height=1.75cm]
335        [label=right:f] {};
336
337\end{scope}
338\foreach \x/\off/\o in {main/.5/1, f1/1/1.25, g/.75/1.5, f2/1/1.25} {
339    \draw [densely dashed, semithick]
340        ([yshift=1pt]\x\space src.south east) --
341        ([yshift=1pt]\x\space tgt.south west);
342    \draw [densely dashed, semithick]
343        ([yshift=\off cm]\x\space src.south west) --
344        ([yshift=\off cm]\x\space tgt.south west) coordinate (tmp)
345        -- (tmp-|\x\space tgt.east);
346    \fill [pattern=dots, opacity=.5] (tmp) --
347        ([yshift=\o cm]\x\space tgt.south west) coordinate (tmp2) -| (tmp-|\x\space tgt.east) -- (tmp);
348    \draw [densely dashed] (tmp2) -- (tmp2-|\x\space tgt.east);
349    \fill [pattern=north east lines, opacity=.5]
350        (tmp2) |- (\x\space tgt.north east) |- (tmp2);
351    \path ($(\x\space src.south east)!.5!(\x\space src.east|-tmp)$) --
352        node {$\hookrightarrow$}
353        ($(\x\space tgt.south west)!.5!(tmp)$);
354    \fill [pattern=north east lines, opacity=.5]
355        (\x\space src.west|-tmp) coordinate (tmp) |- (\x\space src.north east) |- (tmp);
356}
357\node (heading tgt) [above=.125cm of f2 tgt] {target stack};
358\node (heading src) at (heading tgt-|f2 src) {source stack};
359
360\draw [|->] (heading src) -- (heading tgt);
361\end{tikzpicture}
362$
363\caption{Constant stack usage taken as a semantic parameter across languages.
364Shaded areas are unused parts of the stack.}
365\label{subfig:stacksame}
366\end{subfigure}
367\caption{Comparison between using tight stack sizes across languages, i.e.\
368reading how much stack is required from the function, and using the same
369stack sizes provided as a parameter to the semantics. We suppose the call stack
370is $main, f, g, f$.}
371\label{fig:stacktightsame}
372\end{figure}
373
374We decided to take another approach, where stack sizes of each function call
375are a parameter to the semantics, and all passes are proved correct for the
376same stack sizes in source and target language, possibly with a hypothesis
377on these ``semantic'' stack sizes to be greater than the ``syntactic'' ones
378(but not all passes require it). The picture becomes thus the one in
379\autoref{subfig:stacksame}. The advantage is that now the data in memory does
380not change along the evolving of stack usage, apart from the first time we
381pass from separate independent stack spaces and a unique one, a situation
382we describe in the next subsection.
383
384\subsection{From an unbounded to a bounded stack}
385The way the \verb+ft_and_tlr+ is propagated is almost everywhere by means of
386the proof of existence of a special relation between states that is a simulation
387with respect to execution. There are more details involved than in a usual
388extensional proof, regarding the intensional preservation. The details are
389contained in~\cite{simulation}.
390
391Here we concentrate on a particular aspect that eludes the generic treatment:
392the moment when we pass from an unbounded stack to a bounded one. This passage
393is delicate for two reasons. Firstly, while usually we can assure forward simulation
394of an execution step by simply depending on that step not producing an error,
395here is no more the case, as the stack overflow error must start happening somewhere in
396the back-end. Secondly, merging the stack spaces of the function calls requires
397mapping the pointer values contained in memory.
398
399In order to tackle these subtleties, we firstly decided to isolate this passage,
400without mingling it with other pass-dependent semantic preservation proofs.
401Rather than performing the proof when passing from a language to another,
402we decided to split in three the semantics of a single language (\s{RTL}).
403
404The first one, called \verb+RTL_semantics_separate+, mimics how stack is modelled
405in \s{RTLabs}: each function call gets allocated an independent stack space.
406Passing from \s{RTLabs} to this semantics involves the usual simulation relation.
407A peculiarity is that data pointers in memory need not change.
408
409The middle, second one, is called
410\verb+RTL_semantics_separate_overflow+, and it has separate allocated stack spaces
411just as above. However the execution is defined
412so that if the call stack as a whole uses more space than (later) physically
413available (as provided by a parameter), an error is issued. The actual states are identical,
414and the simulation is an almost trivial 1-to-1, where the stack correctness
415inherited from \verb+back_end_preconditions+ can be easily put to use even
416if the generic results for structure preserving simulation actually cannot be
417used here.
418
419The third one, defined as \verb+RTL_semantics_unique+, switches to a semantics with
420a single, bounded stack space, where moreover memory has already been granted to
421global variables too. From this semantics down to \s{LIN}, all data in memory excluding
422pieces of program counters remain the same, as specified in \autoref{subsec:evolvingstack}.
423However in order to show a
424forwarding simulation from \verb+RTL_semantics_separate_overflow+, one needs to
425remap data pointers. No further hypothesis on stack usage needs to be employed,
426as it is integrated in the fact that the execution step does not fail.
427
428\input{mauro.tex}
429
430\bibliographystyle{plain}
431\bibliography{report}
432\end{document}
Note: See TracBrowser for help on using the repository browser.