source: Deliverables/D4.4/paolo.tex @ 3194

Last change on this file since 3194 was 3194, checked in by tranquil, 8 years ago

more on the role of the stack in the back end pass.
moved mauro.tex as an include of paolo.tex

File size: 19.6 KB
Line 
1\documentclass[11pt, epsf, a4wide]{article}
2
3\usepackage{../style/cerco}
4
5\usepackage{amsfonts}
6\usepackage{amsmath}
7\usepackage{amssymb} 
8\usepackage[english]{babel}
9\usepackage{graphicx}
10\usepackage[utf8x]{inputenc}
11\usepackage{listings}
12\usepackage[all]{xy}
13
14\newcommand{\clap}[1]{\hbox to 0pt{\hss#1\hss}}
15\def\mathclap{\mathpalette\mathclapinternal}
16\def\mathclapinternal#1#2{%
17           \clap{$\mathsurround=0pt#1{#2}$}}
18
19\newcommand{\unif}{\ensuremath{\stackrel{\scriptscriptstyle ?}{\equiv}}}
20\newcommand{\founif}{\ensuremath{\stackrel{\scriptscriptstyle ?}{=}}}
21\newcommand{\vect}[1]{\ensuremath{\stackrel{\to}{#1}}}
22\newcommand{\sem}[1]{\ensuremath{[\![#1]\!]}}
23\newcommand{\bsem}[2]{\ensuremath{[\![#1;\;#2]\!]}}
24\newcommand{\s}[1]{\textsf{#1}}
25
26\renewcommand{\verb}{\lstinline}
27\def\lstlanguagefiles{lst-grafite.tex}
28\lstset{language=Grafite}
29
30\usepackage{lscape}
31\usepackage{stmaryrd}
32\usepackage{threeparttable}
33\usepackage{caption,subcaption}
34\usepackage{url}
35\usepackage{tikz}
36\usetikzlibrary{positioning,calc,patterns}
37\title{
38INFORMATION AND COMMUNICATION TECHNOLOGIES\\
39(ICT)\\
40PROGRAMME\\
41\vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}}
42
43\lstdefinelanguage{matita-ocaml}
44  {keywords={definition,coercion,lemma,theorem,remark,inductive,record,qed,let,in,rec,match,return,with,Type,try},
45   morekeywords={[2]whd,normalize,elim,cases,destruct},
46   morekeywords={[3]type,of},
47   mathescape=true,
48  }
49
50\lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false,
51        keywordstyle=\color{red}\bfseries,
52        keywordstyle=[2]\color{blue},
53        keywordstyle=[3]\color{blue}\bfseries,
54        commentstyle=\color{green},
55        stringstyle=\color{blue},
56        showspaces=false,showstringspaces=false}
57
58\lstset{extendedchars=false}
59\lstset{inputencoding=utf8x}
60\DeclareUnicodeCharacter{8797}{:=}
61\DeclareUnicodeCharacter{10746}{++}
62\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
63\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}
64
65\date{}
66\author{}
67
68\begin{document}
69
70\thispagestyle{empty}
71
72\vspace*{-1cm}
73\begin{center}
74\includegraphics[width=0.6\textwidth]{../style/cerco_logo.png}
75\end{center}
76
77\begin{minipage}{\textwidth}
78\maketitle
79\end{minipage}
80
81\vspace*{0.5cm}
82\begin{center}
83\begin{LARGE}
84\textbf{
85Report n. D4.4\\
86??????????????
87}
88\end{LARGE} 
89\end{center}
90
91\vspace*{2cm}
92\begin{center}
93\begin{large}
94Version 1.1
95\end{large}
96\end{center}
97
98\vspace*{0.5cm}
99\begin{center}
100\begin{large}
101Main Authors:\\
102????????????
103\end{large}
104\end{center}
105
106\vspace*{\fill}
107
108\noindent
109Project Acronym: \cerco{}\\
110Project full title: Certified Complexity\\
111Proposal/Contract no.: FP7-ICT-2009-C-243881 \cerco{}\\
112
113\clearpage
114\pagestyle{myheadings}
115\markright{\cerco{}, FP7-ICT-2009-C-243881}
116
117\newpage
118
119\vspace*{7cm}
120\paragraph{Abstract}
121BLA BLA BLA
122\newpage
123
124\tableofcontents
125
126\newpage
127
128\section{The Back-End Correctness Proof at a Glance}
129
130The back-end part of the compiler takes an \s{RTLabs} program together with
131an initialising cost label and gives the assembly code to be fed to the
132assembler. From the semantic point of view, at this stage of the compiler,
133we are interested in propagating structured traces, as explained in \cite{?}.
134
135A schema with all the back-end passes and the salient and common points of each one is the following:
136$$\overbrace{\s{RTLabs} \longrightarrow
137\overbrace{\underbrace{\overbrace{\s{RTL}}^{\mathclap{\text{stack merge}}} \longrightarrow \s{ERTL} \longrightarrow \s{LTL}}_{\texttt{joint}\text{ graph languages}}\longrightarrow
138\hskip -42.5pt\overbrace{\hskip 42.5pt\s{LIN}}^{\text{linearisation}}}^{\texttt{joint}\text{ languages}}\longrightarrow
139\hskip -42.5pt\underbrace{\hskip 42.5pt\s{ASM}}_{\text{linear map}}}^{\texttt{abstract\_status}}$$
140
141First, here all language semantics share a common interface via \verb+abstract_status+,
142which is at the base of the definition of structured traces.
143The generic shape of each pass of the back-end is thus the same: we get in input an unstructured
144prefix trace followed by a structured one which we want to measure in the source language,
145and we need to prove the existence of the corresponding unstructured and structured
146traces in the target language, with the two traces producing the same observables.
147
148From \s{RTL} down to \s{LIN} we are in the common syntactic and semantic language
149we call \verb+joint+. Inside the first of these languages, \s{RTL}, we pass
150from separate local memory spaces for each function call to a unique one. Even
151though parameters are passed on stack only in \s{ERTL} and variable allocation
152and spilling is done in \s{LTL}, we already have values on stack at this stage,
153and we ensure all the stack space that will eventually be needed is allocated
154at each call.
155
156The next two passes live in the graph part of \verb+joint+, and can benefit from
157a generic approach to graph translation and its proof of correctness, as described
158in~\cite{mauro}.
159
160Next, a generic \verb+joint+ linearisation pass is performed to go from
161\s{LTL} to \s{LIN}. The proof is generic too, with reasonable and straightforward proof
162obligations asking that the common semantic operations of the two languages
163depend on program counters in memory only to control the flow. Program counters
164are indeed the only semantic entity that changes during this pass.
165
166The last pass is, as far as function bodies are concerned, a simple linear
167map---every \s{LIN} instruction is mapped to a single \s{ASM} instruction,
168so that we may say that \s{LIN} is a subset of \s{ASM}. Less straightforward
169is that function bodies are merged together, and that pointers pass from
170formal data to actual one. Values in memory need therefore to be remapped
171to relate the states in the two languages.
172
173\subsection{A common invariant} This block of traces with properties is recurrent enough to merit the
174\s{matita} definition in \autoref{fig:ft_and_tlr}. We rely on the fact
175that all semantics from \s{RTLabs} down to object code can be expressed
176with the \verb+abstract_status+ record, by which all our definitions of traces
177are expressed.
178\begin{figure}
179\begin{lstlisting}
180record ft_and_tlr (S : abstract_status) (prefix, subtrace : list intensional_event)
181(fn : ident) (st1 : S) : Prop ≝
182{ st2 : S
183; st3 : S
184; ft : flat_trace S st1 st2
185; tlr : trace_label_return S st2 st3
186; tlr_unrpt : tlr_unrepeating … tlr
187; ft_is_prefix : ft_observables … ft = prefix
188; fn_is_current : ft_current_function … ft = Some ? fn
189; tlr_is_subtrace : observables_trace_label_return … tlr fn = subtrace
190}.
191\end{lstlisting}
192\caption{The \verb+ft_and_tlr+ record (which stands for
193\verb+flat_trace+ and \verb+trace_label_return+), which abstracts the kind
194of invariant that all back-end passes need to preserve.}
195\label{fig:ft_and_tlr}
196\end{figure}
197The parameters of the record fix respectively the intensional events encountered
198in the prefix (as stated by \verb+ft_is_prefix+), the ones in the structured trace
199(\verb+tlr_is_subtrace+) and the name of the function in which \verb+tlr+
200takes place (\verb+fn_is_current+), and the initial state.
201The additional property \verb+tlr_unrpt+
202tells that program counters do not repeat locally (i.e.\ between two labels)
203in \verb+tlr+, a property needed for the soundness of the cost static analysis
204(cf.~\cite{?}).
205
206\subsection{The statement}
207The back-end correctness theorem proves the \s{matita} statement in
208\autoref{fig:statement}. The output of the a proof is
209a \verb+ft_and_tlr+ structure as outlined before, which constitutes the
210preconditions of the assembly proof. The \verb+sigma+ and \verb+pol+
211parameters are passed to \s{ASM}'s semantics, however they have significance
212only with respect to the \s{ASM} to object code correctness
213\footnote{\verb+sigma+ and \verb+pol+ are what allows to maps instructions
214between \s{ASM} and the produced object code. This information isgained during
215the jump expansion pass, cf.~\cite{?}.}.
216
217\begin{figure}
218\begin{subfigure}{\linewidth}
219\begin{lstlisting}
220theorem back_end_correct :
221∀observe,init_cost,p_rtlabs,p_asm,stack_m,max_stack,prefix,subtrace,fn.
222back_end observe init_cost p_rtlabs =
223    return 〈 p_asm, stack_m, max_stack 〉 →
224back_end_preconditions p_rtlabs (stack_sizes stack_m) max_stack prefix subtrace fn →
225∀sigma,pol.
226ft_and_tlr (ASM_status p_asm sigma pol)
227    prefix subtrace fn (initialise_status ? p_asm).
228\end{lstlisting}
229\caption{The statement.}
230\end{subfigure}
231\\[10pt]
232\begin{subfigure}{\linewidth}
233\begin{lstlisting}
234record back_end_preconditions (p_rtlabs : RTLabs_program)
235(stacksizes : ident → option ℕ) (max_stack : ℕ)
236(prefix, subtrace : list intensional_event) (fn : ident) : Prop ≝
237{ ra_init : RTLabs_status (make_global p_rtlabs)
238; ra_init_ok : make_ext_initial_state p_rtlabs = return ra_init
239; ra_max :
240  le (maximum (update_stacksize_info stacksizes (mk_stacksize_info 0 0)
241    (extract_call_ud_from_observables (prefix @ subtrace)))) max_stack
242; ra_ft_tlr : ft_and_tlr (RTLabs_status (make_global p_rtlabs))
243    prefix subtrace fn ra_init
244}.
245\end{lstlisting}
246\caption{The definition of preconditions for the correctness of the back-end.}
247\label{subfig:back_end_preconditions}
248\end{subfigure}
249\caption{The statement of the back-end correctness result.}
250\label{fig:statement}
251\end{figure}
252
253Care must be taken in dealing with the stack. The back-end pass produces a
254\emph{stack model}: the amount of stack needed by each function (\verb+stack_m+),
255together with the maximal usable stack (\verb+max_stack+, $2^{16}$ minus the size occupied by globals).
256While programs in the front end do not fail for stack overflow, the stack
257information is lifted to source much in the same ways as time consumption information,
258so that we can actually use as a hypothesis on input source traces that they
259do not use more stack than allowed.
260This hypothesis is included in the \verb+measurable+ predicate over
261front-end traces, and we put it to use during the back-end pass,
262where we pass to a unique bounded stack space for all functions. More details
263will be presented in \autoref{sec:?}.
264
265The above explanation is why the back-end correctness results requires some additional preconditions
266with respect to a \verb+ft_and_tlr+ for \s{RTLabs}. This is accomplished by the
267\verb+ra_max+ field in the record \verb+back_end_preconditions+
268(\autoref{subfig:back_end_preconditions}). The other fields hold the initial
269state \verb+ra_init+ of the program in \s{RTLabs} (with a proof that it is
270actually the initial state).
271
272\section{Dealing with the stack}
273Setting apart the extensional details of the proofs, a delicate point is how
274we deal with stack.
275
276A first minor issue is that the information we have about stack usage of
277each function call evolves along the back-end passes:
278\begin{itemize}
279 \item in \s{RTL} we must allocate some stack for what where referenced
280 local variables (including local arrays) in source code;
281 \item in \s{ERTL} we add up the stack used to pass parameters that overflow
282 the hardware registers available for the task;
283 \item finally in \s{LTL} we add the space necessary for spilled
284 pseudo-registers, and the stack usage of functions is finally fixed.
285\end{itemize}
286
287Another issue that we already mentioned above is that somewhere during the
288back-end pass we must pass from a semantics with no stack overflow error to
289one that can fail.
290
291In the following we describe the approach we have taken to these issues.
292
293\subsection{An evolving stack usage}
294\label{subsec:evolvingstack}
295The stack size we know each function must at least have is written in
296a field of \verb+joint+ internal functions (called \verb+joint_if_stacksize+),
297and as already said
298evolves during the back-end pass. The \emph{naïve} approach to defining
299semantics of these languages is to allocate the minimal necessary space for
300each function call, reading it from this syntactic field. The drawback is
301that then at each update of the stack size we must remap the data pointers
302in memory, moreover in a way that is dependent on the call stack. This is
303exemplified by the picture in \autoref{subfig:stacktight}.
304\begin{figure}
305\begin{subfigure}{.475\linewidth}
306\centering
307$
308\begin{tikzpicture}[thick, line cap=round,
309    m/.style = {execute at begin node=$, execute at end node=$}]
310\begin{scope}[every node/.style={draw, rectangle,
311    minimum width=1.25cm, m}, every label/.style={minimum width=0, draw=none}]
312    \node (main src) [minimum height=.5cm]
313        [label=left:main] {};
314    \node (f1 src) [above=-1pt of main src,minimum height=1cm]
315        [label=left:f] {};
316    \node (g src) [above=-1pt of f1 src,minimum height=.75cm]
317        [label=left:g] {};
318    \node (f2 src) [above=-1pt of g src,minimum height=1cm]
319        [label=left:f] {};
320
321    \node (main tgt) at ([xshift=2cm]main src.south east) [anchor=south west,minimum height=1cm]
322        [label=right:main] {};
323    \node (f1 tgt) [above=-1pt of main tgt,minimum height=1.25cm]
324        [label=right:f] {};
325    \node (g tgt) [above=-1pt of f1 tgt,minimum height=1.5cm]
326        [label=right:g] {};
327    \node (f2 tgt) [above=-1pt of g tgt,minimum height=1.25cm]
328        [label=right:f] {};
329
330\end{scope}
331\foreach \x/\off in {main/.5, f1/1, g/.75, f2/1} {
332    \draw [densely dashed, semithick]
333        ([yshift=1pt]\x\space src.south east) --
334        ([yshift=1pt]\x\space tgt.south west);
335    \draw [densely dashed, semithick] (\x\space src.north east) --
336        ([yshift=\off cm]\x\space tgt.south west) coordinate (tmp)
337        -- (tmp-|\x\space tgt.east);
338    \fill [pattern=dots, opacity=.5] (tmp) |- (\x\space tgt.north east) |- (tmp);
339    \path (\x\space src.east) -- node [sloped] {$\hookrightarrow$}
340        ($(\x\space tgt.south west)!.5!(tmp)$);
341}
342\node (heading tgt) [above=.125cm of f2 tgt] {target stack};
343\node (heading src) at (heading tgt-|f2 src) {source stack};
344
345\draw [|->] (heading src) -- (heading tgt);
346\end{tikzpicture}
347$
348\caption{How stack is remapped if we use tight stack spaces for each language.
349Dotted areas are the new parts of the local stacks.}
350\label{subfig:stacktight}
351\end{subfigure}
352\hfill
353\begin{subfigure}{.475\linewidth}
354\centering
355$
356\begin{tikzpicture}[thick, line cap=round,
357    m/.style = {execute at begin node=$, execute at end node=$}]
358\begin{scope}[every node/.style={draw, rectangle,
359    minimum width=1.25cm, m}, every label/.style={minimum width=0, draw=none}]
360    \node (main src) [minimum height=1.25cm]
361        [label=left:main] {};
362    \node (f1 src) [above=-1pt of main src,minimum height=1.75cm]
363        [label=left:f] {};
364    \node (g src) [above=-1pt of f1 src,minimum height=1.75cm]
365        [label=left:g] {};
366    \node (f2 src) [above=-1pt of g src,minimum height=1.75cm]
367        [label=left:f] {};
368
369    \node (main tgt) at ([xshift=2cm]main src.south east) [anchor=south west,minimum height=1.25cm]
370        [label=right:main] {};
371    \node (f1 tgt) [above=-1pt of main tgt,minimum height=1.75cm]
372        [label=right:f] {};
373    \node (g tgt) [above=-1pt of f1 tgt,minimum height=1.75cm]
374        [label=right:g] {};
375    \node (f2 tgt) [above=-1pt of g tgt,minimum height=1.75cm]
376        [label=right:f] {};
377
378\end{scope}
379\foreach \x/\off/\o in {main/.5/1, f1/1/1.25, g/.75/1.5, f2/1/1.25} {
380    \draw [densely dashed, semithick]
381        ([yshift=1pt]\x\space src.south east) --
382        ([yshift=1pt]\x\space tgt.south west);
383    \draw [densely dashed, semithick]
384        ([yshift=\off cm]\x\space src.south west) --
385        ([yshift=\off cm]\x\space tgt.south west) coordinate (tmp)
386        -- (tmp-|\x\space tgt.east);
387    \fill [pattern=dots, opacity=.5] (tmp) --
388        ([yshift=\o cm]\x\space tgt.south west) coordinate (tmp2) -| (tmp-|\x\space tgt.east) -- (tmp);
389    \draw [densely dashed] (tmp2) -- (tmp2-|\x\space tgt.east);
390    \fill [pattern=north east lines, opacity=.5]
391        (tmp2) |- (\x\space tgt.north east) |- (tmp2);
392    \path ($(\x\space src.south east)!.5!(\x\space src.east|-tmp)$) --
393        node {$\hookrightarrow$}
394        ($(\x\space tgt.south west)!.5!(tmp)$);
395    \fill [pattern=north east lines, opacity=.5]
396        (\x\space src.west|-tmp) coordinate (tmp) |- (\x\space src.north east) |- (tmp);
397}
398\node (heading tgt) [above=.125cm of f2 tgt] {target stack};
399\node (heading src) at (heading tgt-|f2 src) {source stack};
400
401\draw [|->] (heading src) -- (heading tgt);
402\end{tikzpicture}
403$
404\caption{Constant stack usage taken as a semantic parameter across languages.
405Shaded areas are unused parts of the stack.}
406\label{subfig:stacksame}
407\end{subfigure}
408\caption{Comparison between using tight stack sizes across langauges, i.e.\
409reading how much stack is required from the function, and using the same
410stack sizes provided as a parameter to the semantics. We suppose the call stack
411is $main, f, g, f$.}
412\label{fig:stacktightsame}
413\end{figure}
414
415We decided to take another approach, where stack sizes of each function call
416are a parameter to the semantics, and all passes are proved correct for the
417same stack sizes in source and target language, possibly with a hypothesis
418on these ``semantic'' stack sizes to be greater than the ``syntactic'' ones
419(but not all passes require it). The picture becomes thus the one in
420\autoref{subfig:stacksame}. The advantage is that now the data in memory does
421not change along the evolving of stack usage, apart from the first time we
422pass from separate independent stack spaces and a unique one, a situation
423we describe in the next subsection.
424
425\subsection{From an unbounded to a bounded stack}
426The way the \verb+ft_and_tlr+ is propagated is almost everwhere by means of
427the proof of existence of a special relation between states that is a simulation
428with respect to execution. There are more details involved than in a usual
429extensional proof, regarding the intensional preservation. The details are
430contained in~\cite{?}.
431
432Here we concentrate on a particular aspect that eludes the generic treatment:
433the moment when we pass from an unbounded stack to a bounded one. This passage
434is delicate for two reasons. Firstly, while usually we can assure forward simulation
435of an execution step by simply depending on that step not producing an error,
436here is no more the case, as the stack overflow error must start happening somewhere in
437the back-end. Secondly, merging the stack spaces of the function calls requires
438mapping the pointer values contained in memory.
439
440In order to tackle these subtleties, we firstly decided to isolate this passage,
441without mingling it with other pass-dependent semantic preservation proofs.
442Rather than performing the proof when passing from a language to another,
443we decided to split in three the semantics of a single language (\s{RTL}).
444
445The first one, called \verb+RTL_semantics_separate+, mimics how stack is modelled
446in \s{RTLabs}: each function call gets allocated an independent stack space.
447Passing from \s{RTLabs} to this semantics involves the usual simulation relation.
448A peculiarity is that data pointers in memory need not change.
449
450The middle, second one, is called
451\verb+RTL_semantics_separate_overflow+, and it has separate allocated stack spaces
452just as above. However the execution is defined
453so that if the call stack as a whole uses more space than (later) physically
454available (as provided by a parameter), an error is issued. The actual states are identical,
455and the simulation is an almost trivial 1-to-1, where the stack correctness
456inherited from \verb+back_end_preconditions+ can be easily put to use even
457if the generic results for structure preserving simulation actually cannot be
458used here.
459
460The third one, defined as \verb+RTL_semantics_unique+, switches to a semantics with
461a single, bounded stack space, where moreover memory has already been granted to
462globals too. From this semantics down to \s{LIN}, all data in memory excluding
463pieces of program counters remain the same, as specified in \autoref{subsec:evolvingstack}.
464However in order to show a
465forwarding simulation from \verb+RTL_semantics_separate_overflow+, one needs to
466remap data pointers. No further hypothesis on stack usage needs to be employed,
467as it is integrated in the fact that the execution step does not fail.
468
469\include{mauro}
470\end{document}
Note: See TracBrowser for help on using the repository browser.