\documentclass[a4paper, 10pt]{article}

\usepackage{a4wide}
\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage[english]{babel}
\usepackage{color}
\usepackage{diagrams}
\usepackage{graphicx}
\usepackage[colorlinks]{hyperref}
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage{microtype}
\usepackage{skull}
\usepackage{stmaryrd}

\lstdefinelanguage{matita-ocaml} {
  mathescape=true
}
\lstset{
  language=matita-ocaml,basicstyle=\tt,columns=flexible,breaklines=false,
  showspaces=false, showstringspaces=false, extendedchars=false,
  inputencoding=utf8x, tabsize=2
}

\title{Proof outline for the correctness of the CerCo compiler}
\date{\today}
\author{The CerCo team}

\begin{document}

\maketitle

\section{Introduction}
\label{sect.introduction}

At the last project review of the CerCo project, the reviewers suggested that we write down the statements of the correctness theorems that we intend to prove about the complexity-preserving compiler.
This document provides a high-level, pen-and-paper sketch of what we view as the best path to completing the correctness proof for the compiler.
In particular, for every translation between two intermediate languages, in both the front- and back-ends, we identify the key translation steps and the invariants that we view as being important for the correctness proof.

\section{Front-end: Clight to RTLabs}

The front-end of the CerCo compiler consists of several stages:

\begin{center}
\begin{minipage}{.8\linewidth}
\begin{tabbing}
\quad \= $\downarrow$ \quad \= \kill
\textsf{Clight}\\
\> $\downarrow$ \> cast removal\\
\> $\downarrow$ \> add runtime functions\footnote{Following the last project
meeting we intend to move this transformation to the back-end}\\
\> $\downarrow$ \> cost labelling\\
\> $\downarrow$ \> stack variable allocation and control structure
 simplification\\
\textsf{Cminor}\\
\> $\downarrow$ \> generate global variable initialisation code\\
\> $\downarrow$ \> transform to RTL graph\\
\textsf{RTLabs}\\
\> $\downarrow$ \> start of target specific back-end\\
\>\quad \vdots
\end{tabbing}
\end{minipage}
\end{center}


%Our overall statements of correctness with respect to costs will
%require a correctly labelled program

There are three layers in most of the proofs proposed:
\begin{enumerate}
\item invariants closely tied to the syntax and transformations using
  dependent types,
\item a forward simulation proof, and
\item syntactic proofs about the cost labelling.
\end{enumerate}
The first will support both functional correctness and allow us to show
the totality of some of the compiler stages (that is, that these stages of
the compiler cannot fail).  The second provides the main functional
correctness result, and the last will be crucial for applying
correctness results about the costings from the back-end.

We will also prove that a suitably labelled RTLabs trace can be turned
into a \emph{structured trace} which splits the execution trace into
cost-label-to-cost-label chunks with nested function calls.  This
structure was identified during work on the correctness of the
back-end cost analysis as retaining important information about the
structure of the execution that is difficult to reconstruct later in
the compiler.


\subsection{Clight cast removal}

This removes some casts inserted by the parser to make arithmetic
promotion explicit when they are superfluous (such as
\lstinline[language=C]'c = (short)((int)a + (int)b);').

The transformation only affects Clight expressions, recursively
detecting casts that can be safely eliminated.  The semantics provides
a big-step definition for expressions, so we should be able to show a
lock-step forward simulation using a lemma showing that cast
elimination does not change the evaluation of expressions.  This lemma
will follow from a structural induction on the source expression.  We
have already proved a few of the underlying arithmetic results
necessary to validate the approach.
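
As an illustration, the shape of the transformation can be sketched in OCaml over a toy expression type (the types and the single simplification rule below are hypothetical stand-ins for the Matita development, not its actual definitions):

\begin{lstlisting}
(* Toy expressions with explicit casts, standing in for Clight. *)
type ty = Short | Int

type expr =
  | Var of ty * string
  | Add of ty * expr * expr
  | Cast of ty * expr

let type_of = function
  | Var (t, _) | Add (t, _, _) | Cast (t, _) -> t

(* Remove superfluous promotion casts: truncating a sum to a short
   gives the same result as summing the truncated operands, so the
   inner widening casts can be dropped. *)
let rec simplify = function
  | Cast (Short, Add (Int, e1, e2)) ->
      Cast (Short, Add (Short, strip e1, strip e2))
  | Cast (t, e) -> Cast (t, simplify e)
  | Add (t, e1, e2) -> Add (t, simplify e1, simplify e2)
  | e -> e
and strip = function
  | Cast (Int, e) when type_of e = Short -> simplify e
  | e -> simplify e
\end{lstlisting}

On the example above, the expression \lstinline[language=C]'(short)((int)a + (int)b)' becomes \lstinline[language=C]'(short)(a + b)', with the addition performed at short type.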

\subsection{Clight cost labelling}

This adds cost labels before and after selected statements and
expressions, and the execution traces ought to be equivalent modulo
the cost labels.  Hence it requires a simple forward simulation with a
limited amount of stuttering wherever a new cost label is introduced.
A bound on the amount of stuttering allowed can be given based on the
statement or continuation to be evaluated next.

We also intend to show three syntactic properties about the cost
labelling:
\begin{itemize}
\item every function starts with a cost label,
\item every branching instruction is followed by a cost label (note that
  exiting a loop is treated as a branch), and
\item the head of every loop (and any \lstinline'goto' destination) is
  a cost label.
\end{itemize}
These can be shown by structural induction on the source term.
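
Such a check can itself be phrased as a structurally recursive function; a minimal OCaml sketch of the branching property, over a hypothetical statement type rather than the actual Clight syntax, is:

\begin{lstlisting}
(* Toy statements, standing in for Clight; CostLabel attaches a cost
   label in front of a statement. *)
type stmt =
  | Skip
  | CostLabel of string * stmt
  | Seq of stmt * stmt
  | If of string * stmt * stmt

let starts_with_label = function CostLabel _ -> true | _ -> false

(* Both arms of every conditional must begin with a cost label. *)
let rec branches_labelled = function
  | Skip -> true
  | CostLabel (_, s) -> branches_labelled s
  | Seq (s1, s2) -> branches_labelled s1 && branches_labelled s2
  | If (_, s1, s2) ->
      starts_with_label s1 && starts_with_label s2
      && branches_labelled s1 && branches_labelled s2
\end{lstlisting}

The other two properties admit similar structurally recursive checks.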

\subsection{Clight to Cminor translation}

This translation is the first to introduce some invariants, with the
proofs closely tied to the implementation by dependent typing.  These
are largely complete and show that the generated code enjoys:
\begin{itemize}
\item some minimal type safety, shown by explicit checks on the
  Cminor types during the transformation (a little more work remains
  to be done here, but follows the same form);
\item that variables named in the parameter and local variable
  environments are distinct from one another, again by an explicit
  check;
\item that variables used in the generated code are present in the
  resulting environment (either by checking their presence in the
  source environment, or because they come from a list of fresh
  temporary variables); and
\item that all \lstinline[language=C]'goto' labels are present (by
  checking them against a list of source labels and proving that all
  source labels are preserved).
\end{itemize}

The simulation will be similar to the relevant stages of CompCert
(Clight to Csharpminor and Csharpminor to Cminor --- in the event that
the direct proof is unwieldy we could introduce a corresponding
intermediate language).  During early experimentation with porting
CompCert definitions to the Matita proof assistant we found little
difficulty reproving the results for the memory model, so we plan to
port the memory injection properties and use them to relate Clight
in-memory variables with either a local variable valuation or a stack
slot, depending on how each variable was classified.

This should be sufficient to show the equivalence of (big-step)
expression evaluation.  The simulation can then be shown by relating
corresponding blocks of statements and continuations with their Cminor
counterparts and proving that a few steps reaches the next matching
state.

The syntactic properties required for cost labels remain similar and a
structural induction on the function bodies should be sufficient to
show that they are preserved.

\subsection{Cminor global initialisation code}

This short phase replaces the global variable initialisation data with
code that executes when the program starts.  Each piece of
initialisation data in the source is matched by a new statement
storing that data.  As each global variable is allocated a distinct
memory block, the program state after the initialisation statements
will be the same as the original program's state at the start of
execution, and execution will proceed in the same manner afterwards.
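
The replacement itself is a simple fold over the initialisation data; a sketch, with hypothetical representations of initialisation data and of Cminor-like statements (the names are illustrative, not those of the actual development):

\begin{lstlisting}
type init_datum = { block : string; offset : int; value : int }

type stmt =
  | StoreGlobal of string * int * int  (* block, offset, value *)
  | Seq of stmt * stmt
  | Skip

(* Emit one store per piece of initialisation data, in order. *)
let init_code (data : init_datum list) : stmt =
  List.fold_right
    (fun d rest -> Seq (StoreGlobal (d.block, d.offset, d.value), rest))
    data Skip
\end{lstlisting}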

% Actually, the above is wrong...
% ... this ought to be in a fresh main function with a fresh cost label

\subsection{Cminor to RTLabs translation}

In this part of the compiler we transform the program's functions into
control flow graphs.  It is closely related to CompCert's Cminorsel to
RTL transformation, albeit with target-independent operations.

We already enforce several invariants with dependent types: some type
safety, enforced mostly using the type information from Cminor; and
that the graph is closed (by showing that each successor was recently
added, or corresponds to a \lstinline[language=C]'goto' label, all of
which are added before the end).  Note that this relies on a
monotonicity property; CompCert maintains a similar property in a
similar way while building RTL graphs.  We will also add a result
showing that all of the pseudo-register names are distinct, for use by
later stages, using the same method as for Cminor.

The simulation will relate Cminor states to RTLabs states about to
execute the corresponding code (with a corresponding continuation).
Each Cminor statement becomes zero or more RTLabs statements, with a
decreasing measure based on the statement and continuations, similar to
CompCert's.  We may also follow CompCert in using a relational
specification of this stage so as to abstract away from the functional
(and highly dependently typed) definition.

The first two labelling properties remain as before; we will show that
cost labels are preserved, so the function entry point will be a cost
label, and successors that are cost labels still map to cost
labels, preserving the condition on branches.  We replace the property
for loops with the notion that we will always reach a cost label or
the end of the function by following a bounded number of successors.
This can easily be seen in Cminor using the requirement for cost
labels at the head of loops and after gotos.  It remains to show that
this is preserved by the translation to RTLabs.  % how?
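
One way to state the bounded-successor property is as a terminating, depth-bounded search over the graph; a sketch over a hypothetical graph representation (an association list from labels to nodes, not the actual RTLabs definition):

\begin{lstlisting}
(* Hypothetical RTLabs-like nodes: a cost label, a return, or a plain
   statement with its successor labels. *)
type node =
  | Cost of string
  | Return
  | Plain of string list

(* From label l we reach a cost label or the end of the function
   within n steps, along every path. *)
let rec reaches_label graph n l =
  n > 0 &&
  match List.assoc l graph with
  | Cost _ | Return -> true
  | Plain succs -> List.for_all (reaches_label graph (n - 1)) succs
\end{lstlisting}

An unlabelled loop falsifies the property for every bound, which is exactly why cost labels are required at loop heads.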

\subsection{RTLabs structured trace generation}

This proof-only step incorporates the function call structure and cost
labelling properties into the execution trace.  As the function calls
are nested within the trace, we need to distinguish between
terminating and non-terminating function calls.

We use the law of excluded middle (specialised to a function termination
property) to do this.  Terminating function calls are built by following
the trace, breaking it into chunks between cost labels and recursively
processing function calls.  The main difficulties here are the
non-structurally recursive nature of the construction (instead we use the
size of the termination proof as a measure) and using the properties
of the RTLabs cost labelling to create the correct structure.  We also
show that the lower stack frames are preserved during function calls
in order to prove that after returning from a function call we execute
the correct code.  This part of the work has already been constructed,
but still requires a simple proof to show that flattening the structured
trace recreates the original flat trace.

The non-terminating case uses the excluded middle to decide whether to
use the function described above to construct an inductive terminating
structured trace, or a coinductive version for non-termination.  It
follows the trace like the terminating version to build up chunks of
trace from cost label to cost label (which, by the finite distance to
a cost label property shown before, can be represented by an inductive
type), unless it encounters a non-terminating function call, in which case
it corecursively processes that call.  This part of the trace
transformation is currently under construction, and will also need a
flattening result to show that it is correct.
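
In OCaml-like notation, the terminating fragment of the structured trace and its flattening can be sketched as follows (toy types, not the Matita definitions):

\begin{lstlisting}
(* A chunk runs from one cost label to the next; a call nests the
   callee's whole structured trace at the call site. *)
type step = Label of string | Instr | Call of strace
and strace = Chunk of step list * strace option
  (* None: the function returns after this chunk *)

(* Flattening inlines nested calls where they occurred, recreating
   the flat execution trace. *)
let rec flatten (Chunk (steps, rest)) =
  List.concat_map (function Call t -> flatten t | s -> [s]) steps
  @ (match rest with None -> [] | Some t -> flatten t)
\end{lstlisting}

The coinductive, non-terminating variant replaces the tail of the trace by a lazily produced continuation, but the flattening statement has the same shape.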

\section{Back-end: RTLabs to machine code}
\label{sect.backend.rtlabs.machine.code}

The compiler back-end consists of the following intermediate languages and stages of translation:

\begin{center}
\begin{minipage}{.8\linewidth}
\begin{tabbing}
\quad \= $\downarrow$ \quad \= \kill
\textsf{RTLabs}\\
\> $\downarrow$ \> instruction selection\\
\textsf{RTL}\\
\> $\downarrow$ \> constant propagation (an endo-transformation) \\
\> $\downarrow$ \> copy propagation (an endo-transformation) \\
\> $\downarrow$ \> calling convention made explicit \\
\> $\downarrow$ \> stack frame generation \\
\textsf{ERTL}\\
\> $\downarrow$ \> register allocation and spilling\\
\> $\downarrow$ \> dead code elimination\\
\textsf{LTL}\\
\> $\downarrow$ \> function linearisation\\
\> $\downarrow$ \> branch compression (an endo-transformation) \\
\textsf{LIN}\\
\> $\downarrow$ \> relabelling\\
\textsf{ASM}\\
\> $\downarrow$ \> pseudoinstruction expansion\\
\textsf{Machine code}\\
\end{tabbing}
\end{minipage}
\end{center}

Here, by `endo-transformation', we mean a translation from a language to itself.
For example, the constant propagation step maps the RTL language to the RTL language.

\subsection{The RTLabs to RTL translation}
\label{subsect.rtlabs.rtl.translation}

% dpm: type system and casting load (???)
We require a map, $\sigma$, from \texttt{Values} of the front-end memory model to lists of \texttt{BEValues} of the back-end memory model:

\begin{displaymath}
\mathtt{Value} ::= \bot \mid \mathtt{int(size)} \mid \mathtt{float} \mid \mathtt{null} \mid \mathtt{ptr} \quad\stackrel{\sigma}{\longrightarrow}\quad \mathtt{BEValue} ::= \bot \mid \mathtt{byte} \mid \mathtt{int}_i \mid \mathtt{ptr}_i
\end{displaymath}
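
A possible rendering of this value map in OCaml, assuming (hypothetically) one-byte back-end values and two-byte pointers split into low and high bytes, as on the MCS-51, might be (floats are omitted; the constructor names are illustrative):

\begin{lstlisting}
type value = Undef | Int of int * int (* size in bytes, value *)
           | Null | Ptr of int
type bevalue = BUndef | Byte of int | PtrLo of int | PtrHi of int

let sigma : value -> bevalue list = function
  | Undef -> [BUndef]
  | Int (size, n) ->
      (* little-endian decomposition into size bytes *)
      List.init size (fun i -> Byte ((n lsr (8 * i)) land 0xff))
  | Null -> [PtrLo 0; PtrHi 0]
  | Ptr a -> [PtrLo (a land 0xff); PtrHi ((a lsr 8) land 0xff)]
\end{lstlisting}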

We further require a map, $\sigma$, from the front-end \texttt{Memory} to the back-end's notion of \texttt{BEMemory}.

\begin{displaymath}
\mathtt{Mem}\ \alpha = \mathtt{Block} \rightarrow (\mathbb{Z} \rightarrow \alpha)
\end{displaymath}

where

\begin{displaymath}
\mathtt{Block} ::= \mathtt{Region} \cup \mathtt{ID}
\end{displaymath}

\begin{displaymath}
\mathtt{BEMem} = \mathtt{Mem}\ \mathtt{Value}
\end{displaymath}

\begin{displaymath}
\mathtt{Address} = \mathtt{BEValue} \times \mathtt{BEValue}
\end{displaymath}

\begin{displaymath}
\mathtt{Mem} = \mathtt{Block} \rightarrow (\mathbb{Z} \rightarrow \mathtt{Cont} \mid \mathtt{Value} \times \mathtt{Size} \mid \mathtt{null})
\end{displaymath}

\begin{center}
\begin{picture}(2, 2)
% picture of sigma mapping memory to memory
TODO
\end{picture}
\end{center}

\begin{displaymath}
\mathtt{load}\ s\ a\ M = \mathtt{Some}\ v \rightarrow \forall i \leq s.\ \mathtt{load}\ s\ (a + i)\ \sigma(M) = \mathtt{Some}\ v_i
\end{displaymath}

\begin{displaymath}
\sigma(\mathtt{store}\ v\ M) = \mathtt{store}\ \sigma(v)\ \sigma(M)
\end{displaymath}

\begin{displaymath}
\begin{array}{rll}
\mathtt{State} & ::=  & (\mathtt{State} : \mathtt{Frame}^* \times \mathtt{Frame} \\
               & \mid & \mathtt{Call} : \mathtt{Frame}^* \times \mathtt{Args} \times \mathtt{Return} \times \mathtt{Fun} \\
               & \mid & \mathtt{Return} : \mathtt{Frame}^* \times \mathtt{Value} \times \mathtt{Return}) \times \mathtt{Mem}
\end{array}
\end{displaymath}

\begin{displaymath}
\mathtt{State} ::= \mathtt{Frame}^* \times \mathtt{PC} \times \mathtt{SP} \times \mathtt{ISP} \times \mathtt{CARRY} \times \mathtt{REGS}
\end{displaymath}

\begin{displaymath}
\mathtt{State} \stackrel{\sigma}{\longrightarrow} \mathtt{State}
\end{displaymath}

\begin{displaymath}
\sigma(\mathtt{State} (\mathtt{Frame}^* \times \mathtt{Frame})) \longrightarrow ((\sigma(\mathtt{Frame}^*), \sigma(\mathtt{PC}), \sigma(\mathtt{SP}), 0, 0, \sigma(\mathtt{REGS})), \sigma(\mathtt{Mem}))
\end{displaymath}

\begin{displaymath}
\sigma(\mathtt{Return}(-)) \longrightarrow \sigma \circ \text{return one step}
\end{displaymath}

\begin{displaymath}
\sigma(\mathtt{Call}(-)) \longrightarrow \sigma \circ \text{call one step}
\end{displaymath}

Return one step commuting diagram:

\begin{displaymath}
\begin{diagram}
s & \rTo^{\text{one step of execution}} & s'   \\
  & \rdTo                             & \dTo \\
  &                                   & \llbracket s'' \rrbracket
\end{diagram}
\end{displaymath}

Call one step commuting diagram:

\begin{displaymath}
\begin{diagram}
s & \rTo^{\text{one step of execution}} & s'   \\
  & \rdTo                             & \dTo \\
  &                                   & \llbracket s'' \rrbracket
\end{diagram}
\end{displaymath}

\begin{displaymath}
\begin{diagram}
\mathtt{CALL}(\mathtt{id},\ \mathtt{args},\ \mathtt{dst},\ \mathtt{pc}),\ \mathtt{State}(\mathtt{Frame},\ \mathtt{Frames}) & \rTo & \mathtt{Call}(FINISH ME)
\end{diagram}
\end{displaymath}

\begin{displaymath}
\begin{array}{rcl}
\mathtt{CALL}(\mathtt{id}, \mathtt{args}, \mathtt{dst}, \mathtt{pc}) & \longrightarrow & \mathtt{CALL\_ID}(\mathtt{id}, \sigma'(\mathtt{args}), \sigma(\mathtt{dst}), \mathtt{pc}) \\
\mathtt{RETURN}                                                      & \longrightarrow & \mathtt{RETURN} \\
\end{array}
\end{displaymath}

\begin{displaymath}
\begin{array}{rcl}
\sigma & : & \mathtt{register} \rightarrow \mathtt{list\ register} \\
\sigma' & : & \mathtt{list\ register} \rightarrow \mathtt{list\ register}
\end{array}
\end{displaymath}

\subsection{The RTL to ERTL translation}
\label{subsect.rtl.ertl.translation}

\begin{displaymath}
\begin{diagram}
& & \llbracket \mathtt{CALL\_ID}(\mathtt{id}, \mathtt{args}, \mathtt{dst}, \mathtt{pc})\rrbracket & & \\
& \ldTo^{\text{external}} & & \rdTo^{\text{internal}} & \\
\skull & & & & \mathtt{regs} = [\mathtt{params}/-] \\
& & & & \mathtt{sp} = \mathtt{ALLOC} \\
& & & & \mathtt{PUSH}(\mathtt{carry}, \mathtt{regs}, \mathtt{dst}, \mathtt{return\_addr}), \mathtt{pc}_{0}, \mathtt{regs}, \mathtt{sp} \\
\end{diagram}
\end{displaymath}

\begin{align*}
\llbracket \mathtt{RETURN} \rrbracket \\
\mathtt{return\_addr} & := \mathtt{top}(\mathtt{stack}) \\
v^*                   & := m(\mathtt{rv\_regs}) \\
\mathtt{dst}, \mathtt{sp}, \mathtt{carry}, \mathtt{regs} & := \mathtt{pop} \\
\mathtt{regs}[v^* / \mathtt{dst}] \\
\end{align*}

\begin{displaymath}
\begin{diagram}
s    & \rTo^1 & s' & \rTo^1 & s'' \\
\dTo &        &    & \rdTo  & \dTo \\
\llbracket s \rrbracket & \rTo(1,3)^1 & & & \llbracket s'' \rrbracket \\
\mathtt{CALL} \\
\end{diagram}
\end{displaymath}

\begin{displaymath}
\begin{diagram}
s    & \rTo^1 & s' & \rTo^1 & s'' \\
\dTo &        &    & \rdTo  & \dTo \\
\llbracket s \rrbracket & \rTo(1,3)^1 & & & \llbracket s'' \rrbracket \\
\mathtt{RETURN} \\
\end{diagram}
\end{displaymath}

\begin{displaymath}
\mathtt{b\_graph\_translate}: (\mathtt{label} \rightarrow \mathtt{blist'})
\rightarrow \mathtt{graph} \rightarrow \mathtt{graph}
\end{displaymath}

\begin{align*}
\mathtt{theorem} &\ \mathtt{b\_graph\_translate\_ok}: \\
& \forall  f.\forall G_{i}.\mathtt{let}\ G_{\sigma} := \mathtt{b\_graph\_translate}\ f\ G_{i}\ \mathtt{in} \\
&       \forall l \in G_{i}.\mathtt{subgraph}\ (f\ l)\ l\ (\mathtt{next}\ l\ G_{i})\ G_{\sigma}
\end{align*}

\begin{align*}
\mathtt{lemma} &\ \mathtt{execute\_1\_step\_ok}: \\
&       \forall s.  \mathtt{let}\ s' := s\ \sigma\ \mathtt{in} \\
&       \mathtt{let}\ l := pc\ s\ \mathtt{in} \\
&       s \stackrel{1}{\rightarrow} s^{*} \Rightarrow \exists n. s' \stackrel{n}{\rightarrow} s'^{*} \wedge s'^{*} = s^{*}\ \sigma
\end{align*}

\begin{align*}
\mathrm{RTL\ status} & \ \ \mathrm{ERTL\ status} \\
\mathtt{sp} & = \mathtt{spl} / \mathtt{sph} \\
\mathtt{graph} & \mapsto \mathtt{graph} + \mathtt{prologue}(s) + \mathtt{epilogue}(s) \\
& \mathrm{where}\ s = \mathrm{callee\ saved} + \nu \mathrm{RA} \\
\end{align*}

\begin{displaymath}
\begin{diagram}
\mathtt{CALL} & \rTo^1 & \mathtt{inside\ function} \\
\dTo & & \dTo \\
\underbrace{\ldots}_{\llbracket \mathtt{CALL} \rrbracket} & \rTo &
\underbrace{\ldots}_{\mathtt{prologue}} \\
\end{diagram}
\end{displaymath}

\begin{displaymath}
\begin{diagram}
\mathtt{RETURN} & \rTo^1 & \mathtt{.} \\
\dTo & & \dTo \\
\underbrace{\ldots}_{\mathtt{epilogue}} & \rTo &
\underbrace{\ldots} \\
\end{diagram}
\end{displaymath}

\begin{align*}
\mathtt{prologue}(s) = & \ \mathtt{create\_new\_frame}; \\
                       & \ \mathtt{pop\ ra}; \\
                       & \ \mathtt{save\ callee\_saved}; \\
                       & \ \mathtt{get\_params} \\
                       & \ \ \ \mathtt{reg\_params}: \mathtt{move} \\
                       & \ \ \ \mathtt{stack\_params}: \mathtt{push}/\mathtt{pop}/\mathtt{move} \\
\end{align*}

\begin{align*}
\mathtt{epilogue}(s) = & \ \mathtt{save\ return\ to\ tmp\ real\ regs}; \\
                       & \ \mathtt{restore\_registers}; \\
                       & \ \mathtt{push\ ra}; \\
                       & \ \mathtt{delete\_frame}; \\
                       & \ \mathtt{save\ return} \\
\end{align*}

\begin{displaymath}
\mathtt{CALL}\ id \mapsto \mathtt{set\_params};\ \mathtt{CALL}\ id;\ \mathtt{fetch\_result}
\end{displaymath}

\subsection{The ERTL to LTL translation}
\label{subsect.ertl.ltl.translation}

\subsection{The LTL to LIN translation}
\label{subsect.ltl.lin.translation}

We require a map, $\sigma$, from LTL statuses, where program counters are represented as labels in a graph data structure, to LIN statuses, where program counters are natural numbers:
\begin{displaymath}
\mathtt{pc : label} \stackrel{\sigma}{\longrightarrow} \mathbb{N}
\end{displaymath}

The LTL to LIN translation pass also linearises the graph data structure into a list of instructions.
Pseudocode for the linearisation process is as follows:

\begin{lstlisting}
let rec linearise graph visited required generated todo :=
  match todo with
  | l::todo ->
    if l $\in$ visited then
      let generated := generated $\cup\ \{$ Goto l $\}$ in
      let required := required $\cup\ \{$ l $\}$ in
        linearise graph visited required generated todo
    else
      -- Get the instruction at label `l' in the graph
      let lookup := graph(l) in
      let generated := generated $\cup\ \{$ lookup $\}$ in
      -- Mark `l' as visited
      let visited := visited $\cup\ \{$ l $\}$ in
      -- Find the successor of the instruction at label `l' in the graph
      let successor := succ(l, graph) in
      let todo := successor::todo in
        linearise graph visited required generated todo
  | []      -> (required, generated)
\end{lstlisting}

It is easy to see that this linearisation process eventually terminates.
In particular, the size of the visited label set is monotonically increasing, and is bounded above by the size of the graph that we are linearising.
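
An executable rendering of the pseudocode, using lists rather than sets and a hypothetical node type carrying an optional successor (returns have none), might be:

\begin{lstlisting}
type node = { op : string; succ : string option }

let rec linearise graph visited required generated todo =
  match todo with
  | [] -> (required, List.rev generated)
  | l :: rest ->
      if List.mem l visited then
        (* Already emitted: jump back to it and record that its label
           must be kept in the output. *)
        linearise graph visited (l :: required)
          (("goto " ^ l) :: generated) rest
      else
        let n = List.assoc l graph in
        linearise graph (l :: visited) required
          (n.op :: generated)
          (match n.succ with Some s -> s :: rest | None -> rest)
\end{lstlisting}

Linearising a two-node loop, for example, emits both instructions followed by a single \texttt{goto} back to the (required) entry label.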

The initial call to \texttt{linearise} has the \texttt{visited}, \texttt{required} and \texttt{generated} sets initialised to the empty set, and \texttt{todo} initialised to the singleton list consisting of the entry point of the graph.
We envisage needing to prove the following invariants on the linearisation function above:

\begin{enumerate}
\item
$\mathtt{visited} \approx \mathtt{generated}$, where $\approx$ is \emph{multiset} equality, as \texttt{generated} is a set of instructions where instructions may mention labels multiple times, and \texttt{visited} is a set of labels,
\item
$\forall \mathtt{l} \in \mathtt{generated}.\ \mathtt{succ(l,\ graph)} \subseteq \mathtt{required} \cup \mathtt{todo}$,
\item
$\mathtt{required} \subseteq \mathtt{visited}$,
\item
$\mathtt{visited} \cap \mathtt{todo} = \emptyset$.
\end{enumerate}

The invariants collectively imply the following properties, crucial to correctness, about the linearisation process:

\begin{enumerate}
\item
Every graph node is visited at most once,
\item
Every instruction that is generated is generated due to some graph node being visited,
\item
The successor instruction of every instruction that has been visited already will eventually be visited too.
\end{enumerate}

Note that, because the LTL to LIN transformation is the first time the program is linearised, we must discover a notion of `well-formed program' suitable for linearised forms.
In particular, we see the notion of well-formedness (yet to be formally defined) resting on the following conditions:

\begin{enumerate}
\item
For every jump to a label in a linearised program, the target label exists at some point in the program,
\item
Each label is unique, appearing only once in the program,
\item
The final instruction of a program must be a return.
\end{enumerate}

We assume that these properties will be easy consequences of the invariants on the linearisation function defined above.

The final condition above is potentially a little opaque, so we explain further.
First, the only instructions that can reasonably appear in final position in a program are returns or backward jumps, as any other instruction would cause execution to `fall out' of the end of the program (for example, when a function invoked with \texttt{CALL} returns, it returns to the instruction following the \texttt{CALL} that invoked it).
However, in LIN, though each function's graph has been linearised, the entire program is not yet fully linearised into a list of instructions, but is rather a list of `functions', each consisting of a linearised body along with other data.
Each well-formed function must end with a \texttt{RET} instruction, and therefore the only correct instruction that can terminate a LIN program is a \texttt{RET} instruction.

\subsection{The LIN to ASM and ASM to MCS-51 machine code translations}
\label{subsect.lin.asm.translation}

The LIN to ASM translation step is trivial, being almost the identity function.
The only non-trivial feature of the LIN to ASM translation is that all labels are `named apart', so that freshly generated labels cannot clash with labels from another namespace.

The ASM to MCS-51 machine code translation step, and the required statements of correctness, are found in an unpublished manuscript attached to this document.

\end{document}