\section{Introduction}\label{intro-sec}
The formal description and certification of software components is
reaching a certain level of maturity with impressive case studies
ranging from compilers to kernels of operating systems.  A
well-documented example is the proof of functional correctness of a
moderately optimising compiler from a large subset of the $\C$ language
to a typical assembly language of the kind used in embedded systems
\cite{Leroy09}.

In the framework of the {\em Certified Complexity} ($\cerco$) project \cite{Cerco10}, we
aim to refine this line of work by focusing on the issue of the {\em
execution cost} of the compiled code.  Specifically, we aim to build a
formally verified $\C$ compiler that, given a source program, automatically
produces a functionally equivalent object code plus an annotation
of the source code which is a sound and precise description
of the execution cost of the object code.

We target in particular the kind of $\C$ programs produced for embedded
applications; these programs are eventually compiled to binaries
executable on specific processors.  The current state of the art in
commercial products such as $\scade$ \cite{SCADE,Fornari10} is that the {\em reaction
time} of the program is estimated by means of abstract interpretation
methods (such as those developed by $\absint$ \cite{AbsInt,AbsintScade}) that operate on the
binaries.  These methods rely on specific knowledge of the
architecture of the processor and may require explicit annotations of
the binaries to determine the number of times a loop is iterated (see,
{\em e.g.}, \cite{W09} for a survey of the state of the art).

In this context, our aim is to produce a functionally correct compiler
which can {\em lift} in a provably correct way the pieces of information on the
execution cost of the binary code to cost annotations on the source $\C$
code.  Eventually, we plan to manipulate the cost annotations
with automatic tools such as $\framac$ \cite{Frama-C}.

In order to carry out our project, we need a clear and flexible picture
of: (i) the meaning of cost annotations, (ii) the method to prove them
sound and precise, and (iii) the way such proofs can be composed.  Our
purpose here is to propose two methodologies addressing these three
questions and to consider their concrete application to
a simple toy compiler and to a moderately optimising $\C$ compiler.

\subsection{Meaning of cost annotations}
The execution cost of the source programs we are interested in depends on
their control structure. Typically, the source programs are composed
of mutually recursive procedures and loops, and their execution cost
depends, up to some multiplicative constant, on the number of times
procedure calls and loop iterations are performed.

Producing a {\em cost annotation} of a source program amounts to:
\begin{itemize}

\item enriching the program with a collection of {\em global cost variables}
to measure resource consumption (time, stack size, heap size, $\ldots$),

\item injecting suitable code at some critical points (procedures, loops, $\ldots$) to keep track of the execution cost.

\end{itemize}

Thus producing a cost annotation of a source program $S$ amounts to
building an {\em annotated program} $\w{An}(S)$ which behaves as $S$
while self-monitoring its execution cost. In particular, if we do {\em
  not} observe the cost variables then we expect the annotated
program $\w{An}(S)$ to be functionally equivalent to $S$.  Notice that
in the proposed approach an annotated program is a program in the
source language. Therefore the meaning of the cost
annotations is automatically defined by the semantics of the source
language, and tools developed to reason on source programs
can be directly applied to the annotated programs too.
72
73\subsection{Soundness and precision of cost annotations}
74Suppose we have a functionally correct compiler $\cl{C}$ that
75associates with a program $S$ in the source language a program
76$\cl{C}(S)$ in the object language.  Further suppose we have some
77obvious way of defining the execution cost of an object code. For
78instance, we have a good estimate of the number of cycles needed for
79the execution of each instruction of the object code.
80Now the annotation of the source program $\w{An}(S)$ is {\em sound} if its
81prediction of the execution cost is an upper bound for the
82`real' execution cost. Moreover, we say that the annotation is {\em precise} 
83if the {\em difference} between the predicted and real execution costs
84is bounded by a constant which depends on the program.
85
86
87\subsection{Compositionality}\label{comp-intro}
88In order to master the complexity of the compilation process (and its
89verification), the compilation function $\cl{C}$ must be regarded as
90the result of the composition of a certain number of program transformations
91$\cl{C}=\cl{C}_{k} \comp \cdots \comp \cl{C}_{1}$.
92When building a system of
93cost annotations on top of an existing compiler 
94a certain number of problems arise.
95First, the estimated cost of executing a piece of source code
96is determined only at the {\em end} of the compilation process.
97Thus while we are used to define the compilation
98functions $\cl{C}_i$ in increasing order (from left to right),
99the annotation function $\w{An}$ is the result of
100a progressive abstraction from
101the object to the source code (from right to left).
102Second, we must be able to foresee in the source language
103the looping and branching  points of the object code.
104Missing a loop may lead to unsound cost annotations
105while missing a branching point may lead to rough cost
106predictions. This means that we must have a rather good
107idea of the way the source code will eventually be compiled
108to object code.
109Third, the definition of the annotation of the source
110code depends heavily on contextual information. For instance,
111the cost of the compiled code associated with
112a simple expression such as $x+1$ will depend on the
113place in the memory hierarchy where the variable $x$ is allocated.
114
115
\subsection{Direct approach to cost annotations}\label{direct-intro}
A first `direct' approach to the problem of building
cost annotations is summarised by the following diagram.


\[
\xymatrix{
% Row 1
  L_1 \ar[d]^{An_1} \ar[r]^{\cl{C}_1}
& L_2 \ar[d]^{An_2}
& \ldots \hspace{0.3cm}\ar[r]^{\cl{C}_k}
& L_{k+1} \ar[d]^{An_{k+1}}  \\
% Row 2
  L_1
& L_2
& & L_{k+1}
}
\]

With respect to our previous discussion, $L_1$ is the source
language with the related annotation function $\w{An}_1$ while
$L_{k+1}$ is the object language with a related annotation
$\w{An}_{k+1}$. This annotation of the object code is supposed to be truly
straightforward and it is taken as an `axiomatic' definition of
the `real' execution cost of the program.
The languages $L_i$, for $2\leq i \leq k$, are
intermediate languages which are also equipped with increasingly `realistic'
annotation functions.
Suppose we denote with $S$ the source program,
with $\cl{C}(S)$ the compiled program, and
that we write $(P,s)\eval s'$ to mean that
the (source or object) program $P$ in the state $s$
terminates successfully in the state $s'$.
The soundness proof of the compilation function
guarantees that if $(S,s)\eval s'$ then
$(\cl{C}(S),s)\eval s'$.
In the direct approach, the {\em proof of soundness} of
the cost annotations amounts to lifting the proof of functional
equivalence of the source program and the object code to a proof of
`quasi-equivalence' of the respective instrumented codes.
Suppose we write $s[c/\w{cost}]$ for a state that associates $c$
with the cost variable $\w{cost}$. Then what we want to show is
that whenever $(\w{An}_{1}(S),s[c/\w{cost}])\eval s'[c'/\w{cost}]$
we have that $(\w{An}_{k+1}(\cl{C}(S)),s[d/\w{cost}])\eval s'[d'/\w{cost}]$
and $|d'-d|\leq |c'-c|+k$.
This means that the increment in the annotated source program bounds, up to
an additive constant, the increment in the annotated object program.
We will also say that the cost annotation is precise if we can also prove that
$|c'-c|\leq |d'-d|$, {\em i.e.}, the `estimated' cost is not too far
away from the `real' one.
We will see that while in theory one can build sound and precise
annotation functions, in practice definitions and proofs become
unwieldy.


\subsection{Labelling approach to cost annotations}\label{label-intro}
The `labelling' approach to the problem of building
cost annotations is summarized in the following diagram.

\newcolumntype{M}[1]{>{\raggedright}m{#1}}

\-

\begin{tabular}{M{10cm}||M{4cm}}
$$
\xymatrix{
% Row 1
  L_1 \\
%
% Row 2
  L_{1,\ell}
  \ar[u]^{\cl{I}}
  \ar@/^/[d]^{\w{er}_1}
  \ar[r]^{\cl{C}_1}
%
& L_{2,\ell}
  \ar[d]^{\w{er}_2}
%
& \ldots \hspace{0.3cm}\ar[r]^{\cl{C}_k}
%
& L_{k+1, \ell}
  \ar[d]^{\w{er}_{k+1}}  \\
%
% Row 3
  L_1
  \ar@/^/[u]^{\cl{L}}
  \ar[r]^{\cl{C}_1}
& L_2
& \ldots\hspace{0.3cm}
  \ar[r]^{\cl{C}_{k}}
& L_{k+1}
}
$$
&
$$
\begin{array}{ccc}
\w{er}_{i+1} \comp \cl{C}_{i} &= &\cl{C}_{i} \comp \w{er}_{i} \\

\w{er}_{1} \comp \cl{L} &= &\w{id}_{L_{1}} \\

\w{An}_{1} &= &\cl{I} \comp \cl{L}
\end{array}
$$
\end{tabular}

\-
For each language $L_i$ considered in the compilation process,
we define an extended {\em labelled} language $L_{i,\ell}$ and an
extended operational semantics. The labels are used to mark
certain points of the control. The semantics makes sure
that whenever we cross a labelled control point a labelled and observable
transition is produced.

For each labelled language there is an obvious function $\w{er}_i$
erasing all labels and producing a program in the corresponding
unlabelled language.
The compilation functions $\cl{C}_i$ are extended from the unlabelled
to the labelled language so that they commute
with the erasure functions. Moreover, we lift the soundness properties of
the compilation functions from the unlabelled to the labelled languages
and transition systems.

A {\em labelling} $\cl{L}$ of the source language $L_1$ is just a function
such that $\w{er}_{1} \comp \cl{L}$ is the identity function.
An {\em instrumentation} $\cl{I}$ of the labelled source language $L_{1,\ell}$
is a function replacing the labels with suitable increments of, say,
a fresh \w{cost} variable. Then an {\em annotation} $\w{An}_{1}$ of the
source program can be derived simply as the composition of the labelling
and the instrumentation functions: $\w{An}_{1} = \cl{I}\comp \cl{L}$.

As for the direct approach, suppose $s$ is some adequate representation
of the state of a program. Let $S$ be a source program and suppose
that its annotation satisfies the following property:
\begin{equation}\label{STEP1}
(\w{An}_{1}(S),s[c/\w{cost}])  \eval s'[c+\delta/\w{cost}]
\end{equation}
where $\delta$ is some non-negative number.
Then the definition of the instrumentation and the fact that
the soundness proofs of the compilation functions have been lifted
to the labelled languages allow us to conclude that
\begin{equation}\label{step2}
(\cl{C}(\cl{L}(S)),s[c/\w{cost}])  \eval (s'[c/\w{cost}],\lambda)
\end{equation}
where $\cl{C} = \cl{C}_{k} \comp \cdots \comp \cl{C}_{1}$ and
$\lambda$ is a sequence (or a multi-set) of labels whose `cost'
corresponds to the number $\delta$ produced by the
annotated program.
Then the commutation properties of the erasure and compilation functions
allow us to conclude that the {\em erasure} of the compiled
labelled code $\w{er}_{k+1}(\cl{C}(\cl{L}(S)))$ is actually equal to
the compiled code $\cl{C}(S)$ we are interested in.
Given this, the following question arises:

\begin{quote}
Under which conditions is the sequence $\lambda$, {\em i.e.}, the increment $\delta$,
a sound and possibly precise description of the
execution cost of the object code?
\end{quote}

To answer this question, we observe that the object code we are interested in is some kind of assembly code
and its control flow can be easily represented as a control flow
graph. The fact that we have to prove the soundness of the compilation
functions means that we have plenty of pieces of information
on the way the control flows in the compiled code, in particular as
far as procedure calls and returns are concerned.
These pieces of information allow us to build a rather accurate representation
of the control flow of the compiled code at run time.

The idea is then to perform two simple checks on the control flow
graph. The first check is to verify that all loops go through a labelled node.
If this is the case then we can associate a finite cost with
every label and prove that the cost annotations are sound.
The second check amounts to verifying that all paths starting from
a label have the same cost. If this check is successful then we can
conclude that the cost annotations are precise.


\subsection{A toy compiler}\label{toy-intro}
As a first case study for the two approaches to cost
annotations we have sketched, we introduce a
{\em toy compiler} which is summarised by the
following diagram.

\[
\xymatrix{
\imp
\ar[r]^{\cl{C}}
& \vm
\ar[r]^{\cl{C'}}
& \mips
}
\]

The three languages considered can be briefly described as follows:
$\imp$ is a very simple imperative language
with pure expressions, branching and looping commands,
$\vm$ is an assembly-like language enriched with a stack, and
$\mips$ is a $\mips$-like assembly language with
registers and main memory.
The first compilation function $\cl{C}$ relies on the stack
of the $\vm$ language to implement expression evaluation while
the second compilation function $\cl{C'}$ allocates (statically)
the base of the stack in the registers and the rest in main memory.
This is of course a naive strategy but it suffices to expose
some of the problems that arise in defining a compositional approach
(cf. section \ref{comp-intro}).

\subsection{A C compiler}\label{Ccompiler-intro}
As a second, more complex, case study we consider a
$\C$ compiler we have built in $\ocaml$ whose
structure is summarised by the following diagram:

{\footnotesize
\[
\begin{array}{cccccccccc}
&&\C &\arrow &\Clight &\arrow &\Cminor &\arrow &\RTLAbs &\qquad \mbox{(front-end)}\\
                                              &&&&&&&&\downarrow \\
 \mips &\leftarrow &\LIN &\leftarrow  &\LTL &\leftarrow  &\ERTL  &\leftarrow &\RTL &\qquad \mbox{(back-end)}
\end{array}
\]
}

The structure follows rather closely that of the $\compcert$ compiler.
Notable differences are that some compilation steps are fused, that
the front-end goes up to $\RTLAbs$ (rather than $\Cminor$), and that we
target the $\mips$ assembly language (rather than $\powerpc$).
These differences are contingent on the way we built the compiler.
The compilation from $\C$ to $\Clight$ relies on the $\cil$ front-end \cite{CIL02}. The one from $\Clight$ to $\RTL$ has been programmed from scratch and is
partly based on the $\coq$ definitions available in the $\compcert$
compiler.  Finally, the back-end from $\RTL$ to $\mips$ is based on a
compiler developed in $\ocaml$ for pedagogical
purposes \cite{Pottier}.  The main optimisations it performs are common
subexpression elimination, liveness analysis and register allocation,
and graph compression.

\subsection{Organisation}
The rest of the paper is organised as follows.
Section \ref{toy-compiler-sec} describes the three languages and the
two compilation steps of the toy compiler.
Section \ref{toy-direct-sec} describes the application of the
direct approach to the toy compiler and points
out its limitations.
Section \ref{label-toy-sec} describes the application of
the labelling approach to the toy compiler.
Section \ref{C-compiler-sec} provides some details on the
structure of the $\C$ compiler we have implemented.
Section \ref{C-label-sec} reports our experience in implementing
and testing the labelling approach on the $\C$ compiler.
Section \ref{conclusion-sec} summarises our contribution and
outlines some perspectives for future work.
Section \ref{paper-proofs} sketches the proofs that have not
been mechanically checked in $\coq$.

\section{A toy compiler}\label{toy-compiler-sec}
We formalise the toy compiler
introduced in section \ref{toy-intro}.

\subsection{\imp: language and semantics}\label{imp-sec}
The syntax of the $\imp$ language is described below.
This is a rather standard imperative language with
\s{while} loops and \s{if-then-else}.

{\footnotesize
\[
\begin{array}{lll}

\w{id}&::= x\Alt y\Alt \ldots       &\mbox{(identifiers)} \\
\w{n} &::= 0 \Alt -1 \Alt +1 \Alt \ldots &\mbox{(integers)} \\
v &::= n \Alt \s{true} \Alt \s{false}  &\mbox{(values)} \\
e &::= \w{id} \Alt n \Alt e+e    &\mbox{(numerical expressions)} \\
b &::= e<e  &\mbox{(boolean conditions)} \\
S &::= \s{skip} \Alt \w{id}:=e \Alt S;S \Alt
  \s{if} \ b \ \s{then} \ S \ \s{else} \ S
   \Alt \s{while} \ b \ \s{do} \ S  &\mbox{(commands)} \\
P &::= \s{prog} \ S      &\mbox{(programs)}

\end{array}
\]}

Let $s$ be a total function from identifiers to integers representing
the {\bf state}.  If $s$ is a state, $x$ an identifier, and $n$ an
integer then $s[n/x]$ is the `updated' state such that $s[n/x](x)=n$
and $s[n/x](y)=s(y)$ if $x\neq y$.
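To fix ideas, the syntax could be represented in $\ocaml$ as follows. This is only an illustrative sketch: the type and constructor names are our own and do not come from the actual development.

{\footnotesize
\begin{verbatim}
(* A possible OCaml representation of the Imp syntax (illustrative). *)
type expr =
  | Evar of string                (* identifiers *)
  | Eint of int                   (* integer constants *)
  | Eadd of expr * expr           (* e + e *)

type cond = Clt of expr * expr    (* e < e *)

type stmt =
  | Sskip                         (* skip *)
  | Sassign of string * expr      (* id := e *)
  | Sseq of stmt * stmt           (* S ; S *)
  | Sif of cond * stmt * stmt     (* if b then S else S *)
  | Swhile of cond * stmt         (* while b do S *)

type prog = Prog of stmt
\end{verbatim}}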

\subsection{Big-step semantics}
The big-step operational semantics of $\imp$ programs is
defined by the following judgements:
\[
(e,s)\eval v  \quad (b,s) \eval v \qquad (S,s)\eval s'\quad (P,s)\eval s'
\]
and it is described in table \ref{semantics-imp}.
We assume that the addition $v +_{\Z} v'$ is only defined if
$v$ and $v'$ are integers and the comparison $v <_{\Z} v'$ produces
a value \s{true} or \s{false} only if $v$ and $v'$ are integers.
\begin{table}
{\footnotesize
\[
\begin{array}{c}

\infer{}
{(v,s)\eval v}

\qquad

\infer{}
{(x,s) \eval s(x)}

\qquad

\infer{(e,s)\eval v\quad (e',s)\eval v'}
{(e+e',s)\eval (v+_{\Z} v')}

\qquad
\infer{(e,s)\eval v\quad (e',s)\eval v'}
{(e<e',s) \eval (v<_{\Z} v')} \\ \\

\infer{}
{(\s{skip},s)\eval s}

\qquad

\infer{(e,s) \eval v}
{(x:=e,s) \eval s[v/x]}

\qquad

\infer{(S_1,s)\eval s'\quad (S_2,s')\eval s''}
{(S_1;S_2,s)\eval s''} \\\\


\infer{(b,s)\eval \s{true} \quad (S,s)\eval s'}
{ (\s{if} \ b \ \s{then} \ S \ \s{else} \ S',s) \eval  s'}
\qquad
\infer{(b,s)\eval \s{false} \quad (S',s)\eval s'}
{ (\s{if} \ b \ \s{then} \ S \ \s{else} \ S',s) \eval  s'}
\\ \\

\infer{(b,s) \eval \s{false}}
{(\s{while} \ b \ \s{do} \  S ,s)\eval s}
\qquad

\infer{(b,s) \eval \s{true} \quad (S;\s{while} \ b \ \s{do} \  S,s) \eval s'}
{(\s{while} \ b \ \s{do} \  S ,s)\eval s'} \\ \\

\infer{(S,s)\eval s'}
{(\s{prog} \ S,s)\eval s'}

\end{array}
\]}
\caption{Big-step operational semantics of $\imp$}\label{semantics-imp}
\end{table}
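The rules of table \ref{semantics-imp} can be read as a recursive interpreter. The following $\ocaml$ sketch (continuing the illustrative syntax above, with states as total functions from identifiers to integers) is one way to make this concrete; like the big-step judgement, it is only meaningful on terminating programs.

{\footnotesize
\begin{verbatim}
(* Sketch of an interpreter following table \ref{semantics-imp}. *)
type state = string -> int

let update (s : state) (x : string) (n : int) : state =
  fun y -> if y = x then n else s y          (* s[n/x] *)

let rec eval_e (e : expr) (s : state) : int =
  match e with
  | Evar x -> s x
  | Eint n -> n
  | Eadd (e1, e2) -> eval_e e1 s + eval_e e2 s

let eval_b (Clt (e1, e2)) (s : state) : bool =
  eval_e e1 s < eval_e e2 s

let rec exec (c : stmt) (s : state) : state =
  match c with
  | Sskip -> s
  | Sassign (x, e) -> update s x (eval_e e s)
  | Sseq (c1, c2) -> exec c2 (exec c1 s)
  | Sif (b, c1, c2) -> exec (if eval_b b s then c1 else c2) s
  | Swhile (b, c1) ->
      if eval_b b s then exec (Swhile (b, c1)) (exec c1 s) else s

let exec_p (Prog c) (s : state) : state = exec c s
\end{verbatim}}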

\subsection{Small-step semantics}
We also introduce an alternative small-step semantics of the
$\imp$ language.
A {\em continuation} $K$ is a list of commands which terminates with a special
symbol \s{halt}.
\[
K::= \s{halt} \Alt S \cdot K
\]
Table \ref{small-step-imp} defines a small-step semantics of $\imp$ commands
whose basic judgement has the shape:
\[
(S,K,s) \arrow  (S',K',s')~.
\]
We define the semantics of a program $\s{prog} \ S$ as the semantics
of the command $S$ with continuation $\s{halt}$.
We derive a big-step semantics from the small-step one as follows:
\[
\begin{array}{ll}
(S,s) \eval s'                   &\mbox{if }(S,\s{halt},s) \arrow \cdots \arrow (\s{skip},\s{halt},s') ~.
\end{array}
\]

\begin{table}
{\footnotesize
\[
\begin{array}{lll}
(x:=e,K,s) &\arrow &(\s{skip},K,s[v/x]) \qquad\mbox{if }(e,s)\eval v \\ \\

(S;S',K,s) &\arrow &(S,S'\cdot K,s) \\ \\

(\s{if} \ b \ \s{then} \ S \ \s{else} \ S',K,s) &\arrow
&\left\{
\begin{array}{ll}
(S,K,s) &\mbox{if }(b,s)\eval \s{true} \\
(S',K,s) &\mbox{if }(b,s)\eval \s{false}
\end{array}
\right. \\ \\


(\s{while} \ b \ \s{do} \ S ,K,s) &\arrow
&\left\{
\begin{array}{ll}
(S,(\s{while} \ b \ \s{do} \ S)\cdot K,s) &\mbox{if }(b,s)\eval \s{true} \\
(\s{skip},K,s) &\mbox{if }(b,s)\eval \s{false}
\end{array}
\right. \\ \\


(\s{skip},S\cdot K,s) &\arrow
&(S,K,s)

\end{array}
\]}
\caption{Small-step operational semantics of $\imp$ commands}\label{small-step-imp}
\end{table}



\subsection{\vm: language and semantics}\label{vm-sec}
Following \cite{Leroy09-ln},
we define a virtual machine $\vm$ and its programming language.
The machine includes the following elements:
(1) a fixed code $C$ (a possibly empty sequence of instructions),
(2) a program counter \w{pc},
(3) a store $s$ (as for the source program),
(4) a stack of integers $\sigma$.

We will rely on the following instructions with the associated
informal semantics:

{\footnotesize
\[
\begin{array}{ll}
{\tt cnst(n)}        &\mbox{push }n\mbox{ on the stack} \\
{\tt var(x)}         &\mbox{push the value of }x   \\
{\tt setvar(x)}      &\mbox{pop a value and assign it to }x \\
{\tt add}            &\mbox{pop 2 values and push their sum} \\
{\tt branch(k)}      &\mbox{jump with offset }k        \\
{\tt bge(k)}         &\mbox{pop 2 values and jump with offset }k\mbox{ if greater or equal} \\
{\tt halt}           &\mbox{stop computation}
\end{array}
\]}

In the branching instructions, $k$ is an integer that has to be added
to the current program counter in order to determine the next
instruction to be executed.
Given a sequence $C$, we denote with $|C|$ its length and
with $C[i]$ its $i^{\w{th}}$ element (the leftmost element
being the $0^{\w{th}}$ element).
The operational semantics of the instructions is formalised by rules of the shape:
\[
C \Gives (i,\sigma,s) \arrow (j,\sigma',s')
\]
and it is fully described in table \ref{semantics-vm}.
Notice that $\imp$ and $\vm$ semantics share the same notion of store.
We write, {\em e.g.}, $n\cdot \sigma$ to stress that the top element
of the stack exists and is $n$.
We will also write $(C,s)\eval s'$ if
$C\Gives (0,\epsilon,s) \trarrow (i,\epsilon,s')$ and $C[i]=\s{halt}$.
\begin{table}
{\footnotesize
\[
\begin{array}{l|l}

\mbox{Rule}                                                 & C[i]= \\\hline

C \Gives (i, \sigma, s) \arrow (i+1, n\cdot \sigma,s)  & {\tt cnst}(n) \\
C \Gives (i, \sigma, s) \arrow (i+1,s(x)\cdot \sigma,s)  & {\tt var}(x) \\
C \Gives (i, n \cdot \sigma, s) \arrow (i+1,\sigma,s[n/x])  & {\tt setvar}(x) \\
C \Gives (i, n \cdot n' \cdot \sigma, s) \arrow (i+1, (n+_{\Z} n')\cdot \sigma,s)  & {\tt add} \\
C \Gives (i, \sigma, s) \arrow (i+k+1, \sigma,s)  & {\tt branch(k)} \\
C \Gives (i, n \cdot n' \cdot \sigma, s) \arrow (i+1, \sigma,s)  & {\tt bge(k)} \mbox{ and }n<_{\Z} n'\\
C \Gives (i, n \cdot n' \cdot \sigma, s) \arrow (i+k+1, \sigma,s)  & {\tt bge(k)} \mbox{ and }n\geq_{\Z} n'\\

\end{array}
\]}
\caption{Operational semantics of $\vm$ programs}\label{semantics-vm}
\end{table}
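For concreteness, here is an $\ocaml$ sketch of the instruction set and of a single execution step of the machine, following table \ref{semantics-vm}; it reuses the state representation of the $\imp$ sketch, and the constructor names are again our own.

{\footnotesize
\begin{verbatim}
(* Sketch of the vm instructions and of one execution step. *)
type instr =
  | Icnst of int        (* push n on the stack *)
  | Ivar of string      (* push the value of x *)
  | Isetvar of string   (* pop a value and assign it to x *)
  | Iadd                (* pop two values and push their sum *)
  | Ibranch of int      (* jump with offset k *)
  | Ibge of int         (* pop two values, jump if greater or equal *)
  | Ihalt               (* stop computation *)

(* C |- (i, sigma, s) -> (j, sigma', s'); the stack is a list whose
   head is the top element. *)
let step (code : instr array) (i, stack, s) =
  match code.(i), stack with
  | Icnst n, _             -> (i + 1, n :: stack, s)
  | Ivar x, _              -> (i + 1, s x :: stack, s)
  | Isetvar x, n :: st     -> (i + 1, st, update s x n)
  | Iadd, n :: n' :: st    -> (i + 1, (n + n') :: st, s)
  | Ibranch k, _           -> (i + k + 1, stack, s)
  | Ibge k, n :: n' :: st  ->
      if n >= n' then (i + k + 1, st, s) else (i + 1, st, s)
  | _                      -> failwith "stuck configuration"
\end{verbatim}}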

Code coming from the compilation of $\imp$ programs has specific
properties that are used in the following compilation step when values
on the stack are allocated either in registers or in main memory.  In
particular, it turns out that for every instruction of the compiled
code it is possible to predict statically the {\em height of the stack}
whenever the instruction is executed.  We now proceed to define a simple notion
of {\em well-formed} code and show that it enjoys this property.  In
the following section, we will define the compilation function from
$\imp$ to $\vm$ and show that it produces well-formed code.


\begin{definition}
We say that a sequence of instructions $C$ is well formed if there
is a function $h:\set{0,\ldots,|C|} \arrow \Nat$ which satisfies the
conditions listed in table \ref{well-formed-instr} for $0\leq i \leq |C|-1$.
In this case we write $C:h$.
\end{definition}

\begin{table}
{\footnotesize
\[
\begin{array}{l|l}

C[i]=         & \mbox{Conditions for }C:h \\\hline

\s{cnst}(n)
\mbox{ or } \s{var}(x)
&h(i+1) = h(i)+1 \\

\s{add}
& h(i)\geq 2, \quad h(i+1)=h(i)-1 \\

\s{setvar}(x)
& h(i)=1, \quad h(i+1)=0 \\

\s{branch}(k)
& 0\leq i+k+1 \leq |C|, \quad h(i)=h(i+1)=h(i+k+1)=0 \\

\s{bge}(k)
& 0\leq i+k+1 \leq |C|, \quad h(i)=2, \quad h(i+1)=h(i+k+1)=0 \\

\s{halt}
&i=|C|-1, \quad h(i)=h(i+1)=0

\end{array}
\]}
\caption{Conditions for well-formed code}\label{well-formed-instr}
\end{table}
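The conditions of table \ref{well-formed-instr} can be checked by a single scan of the code. The following $\ocaml$ sketch, over the illustrative \texttt{instr} type above, is only an outline: it assumes that $h(i)$ is already determined when instruction $i$ is processed, which holds for the code produced by the compilation function of the next section.

{\footnotesize
\begin{verbatim}
(* Sketch: compute the candidate height function h with h(0) = n and
   check the conditions of table \ref{well-formed-instr};
   returns Some h (as an array of length |C|+1) when they hold. *)
let well_formed (code : instr array) (n : int) : int array option =
  let len = Array.length code in
  let h = Array.make (len + 1) (-1) in      (* -1: not yet constrained *)
  let ok = ref true in
  let assign i v =
    if i < 0 || i > len then ok := false
    else if h.(i) = -1 then h.(i) <- v
    else if h.(i) <> v then ok := false
  in
  assign 0 n;
  for i = 0 to len - 1 do
    (match code.(i) with
     | Icnst _ | Ivar _ -> assign (i + 1) (h.(i) + 1)
     | Iadd             -> if h.(i) < 2 then ok := false;
                           assign (i + 1) (h.(i) - 1)
     | Isetvar _        -> assign i 1; assign (i + 1) 0
     | Ibranch k        -> assign i 0; assign (i + 1) 0;
                           assign (i + k + 1) 0
     | Ibge k           -> assign i 2; assign (i + 1) 0;
                           assign (i + k + 1) 0
     | Ihalt            -> if i <> len - 1 then ok := false;
                           assign i 0; assign (i + 1) 0)
  done;
  if !ok then Some h else None
\end{verbatim}}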

The conditions defining the predicate $C:h$ are strong enough
to entail that $h$ correctly predicts the stack height
and to guarantee the uniqueness of $h$ up to the initial condition.

\begin{proposition}\label{unicity-height-prop}
\Defitemf{(1)}
If $C:h$, $C\Gives (i,\sigma,s) \trarrow (j,\sigma',s')$, and
$h(i)=|\sigma|$ then $h(j)=|\sigma'|$.

\Defitem{(2)} If $C:h$, $C:h'$ and $h(0)=h'(0)$ then $h=h'$.
\end{proposition}

\subsection{Compilation from $\imp$ to $\vm$}\label{compil-imp-vm-sec}
In table \ref{compil-imp-vm},
we define compilation functions $\cl{C}$ from $\imp$ to $\vm$ which
operate on expressions, boolean conditions, statements, and programs.
We write $\w{sz}(e)$, $\w{sz}(b)$, $\w{sz}(S)$ for the number of instructions
the compilation function associates with the expression $e$, the boolean
condition $b$, and the statement $S$, respectively.
\begin{table}
{\footnotesize
\[
\begin{array}{c}

\cl{C}(x) = {\tt var}(x)   \qquad
\cl{C}(n)  ={\tt cnst}(n)   \qquad
\cl{C}(e+e') =\cl{C}(e) \cdot \cl{C}(e') \cdot {\tt add} \\ \\

\cl{C}(e<e',k) = \cl{C}(e) \cdot \cl{C}(e') \cdot {\tt bge(k)} \\ \\

\qquad  \cl{C}(x:=e) = \cl{C}(e) \cdot {\tt setvar(x)}

\qquad \cl{C}(S;S') = \cl{C}(S) \cdot \cl{C}(S')\\ \\


\cl{C}(\s{if} \ b \ \s{then} \ S\  \s{else} \ S') =
\cl{C}(b,k) \cdot \cl{C}(S) \cdot (\s{branch}(k')) \cdot \cl{C}(S')  \\
\mbox{where: } k= \w{sz}(S)+1, \quad k'=\w{sz}(S')

\\ \\

\cl{C}(\s{while} \ b \ \s{do} \  S )
= \cl{C}(b,k)  \cdot \cl{C}(S)  \cdot \s{branch}(k') \\
\mbox{where: } k=\w{sz}(S)+1, \quad k'=-(\w{sz}(b)+\w{sz}(S)+1)\\\\

\cl{C}(\s{prog} \ S) =
\cl{C}(S) \cdot \s{halt}

\end{array}
\]}

\caption{Compilation from $\imp$ to $\vm$}\label{compil-imp-vm}
\end{table}
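Read off table \ref{compil-imp-vm}, the compilation function can be sketched in $\ocaml$ as follows, continuing the illustrative types; the table leaves $\s{skip}$ implicit and we compile it to the empty sequence.

{\footnotesize
\begin{verbatim}
(* Sketch of the compilation function of table \ref{compil-imp-vm},
   with code represented as instruction lists. *)
let rec compile_e : expr -> instr list = function
  | Evar x -> [Ivar x]
  | Eint n -> [Icnst n]
  | Eadd (e, e') -> compile_e e @ compile_e e' @ [Iadd]

let compile_b (Clt (e, e')) (k : int) : instr list =
  compile_e e @ compile_e e' @ [Ibge k]

let rec compile_s : stmt -> instr list = function
  | Sskip -> []                   (* skip: empty sequence (implicit) *)
  | Sassign (x, e) -> compile_e e @ [Isetvar x]
  | Sseq (c, c') -> compile_s c @ compile_s c'
  | Sif (b, c, c') ->
      let cc = compile_s c and cc' = compile_s c' in
      compile_b b (List.length cc + 1)            (* k  = sz(S) + 1 *)
      @ cc @ [Ibranch (List.length cc')] @ cc'    (* k' = sz(S')    *)
  | Swhile (b, c) ->
      let cb = compile_b b 0 and cc = compile_s c in
      let sz_b = List.length cb and sz_s = List.length cc in
      compile_b b (sz_s + 1)                      (* k  = sz(S) + 1 *)
      @ cc @ [Ibranch (- (sz_b + sz_s + 1))]      (* back to the test *)

let compile_p (Prog c) : instr list = compile_s c @ [Ihalt]
\end{verbatim}}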


\subsection{Soundness of compilation for the big-step semantics}
We follow \cite{Leroy09-ln} for
the proof of correctness of the compilation function with
respect to the big-step semantics (see also \cite{MP67} for a
much older reference).

\begin{proposition}\label{soundness-big-step-prop}
The following properties hold:

\Defitem{(1)}
If $(e,s)\eval v$ then
$C \cdot \cl{C}(e) \cdot C'
\Gives (i,\sigma,s) \trarrow (j,v\cdot \sigma,s)$
where $i=|C|$ and $j=|C \cdot \cl{C}(e)|$.


\Defitem{(2)}
If $(b,s)\eval \s{true}$ then
$C \cdot \cl{C}(b,k) \cdot C'
\Gives (i,\sigma,s) \trarrow (j+k,\sigma,s)$
where  $i=|C|$ and $j=|C \cdot \cl{C}(b,k)|$.

\Defitem{(3)}
If $(b,s)\eval \s{false}$
then $C \cdot \cl{C}(b,k) \cdot C'
\Gives (i,\sigma,s) \trarrow (j,\sigma,s)$
where  $i=|C|$ and $j=|C \cdot \cl{C}(b,k)|$.

\Defitem{(4)}
If $(S,s)\eval s'$ then
$C \cdot \cl{C}(S) \cdot C'
\Gives (i,\sigma,s) \trarrow (j,\sigma,s')$
where  $i=|C|$ and $j=|C \cdot \cl{C}(S)|$.
\end{proposition}

\subsection{Soundness of compilation for the small-step semantics}
We prove soundness with respect to the small-step semantics too.
To this end, given a $\vm$ code $C$, we define an `accessibility relation'
$\access{C}$ as the least binary relation on $\set{0,\ldots,|C|-1}$ such that:

{\footnotesize
\[
\begin{array}{ll}

\infer{}
{i\access{C} i}
\quad
&\infer{C[i]=\s{branch}(k)\quad (i+k+1)\access{C} j}
{i\access{C} j}

\end{array}
\]}

We also introduce a ternary relation $R(C,i,K)$ which relates
a $\vm$ code $C$, a number $i\in\set{0,\ldots,|C|-1}$ and a continuation
$K$. The intuition is that relative to the code $C$,
the instruction $i$ can be regarded as having continuation $K$.
Formally, the relation $R$ is
defined as the least one that satisfies the following conditions.

{\footnotesize
\[
\begin{array}{ll}

\infer{
\begin{array}{c}
\\
i\access{C} j\qquad C[j]=\s{halt}
\end{array}}
{R(C,i,\s{halt})}

\qquad
&\infer{
\begin{array}{c}
i\access{C} i'\quad C=C_1 \cdot \cl{C}(S) \cdot C_2\\
i'=|C_1|\quad j=|C_1\cdot \cl{C}(S)| \quad R(C,j,K)
\end{array}}
{R(C,i,S\cdot K)}~.

\end{array}
\]}


We can then state the correctness of the compilation function as follows.


\begin{proposition}\label{soundness-small-step}
If $(S,K,s) \arrow (S',K',s')$ and $R(C,i,S\cdot K)$ then
$C\Gives (i,\sigma,s) \trarrow (j,\sigma,s')$ and
$R(C,j,S'\cdot K')$.
\end{proposition}


\subsection{Compiled code is well-formed}
As announced, we can prove that the result of the compilation
is well-formed code.

\begin{proposition}\label{well-formed-prop}
For any expression $e$, statement $S$, and program $P$ the
following holds:

\Defitem{(1)}
For any $n\in\Nat$ there is a unique $h$ such that
$\cl{C}(e):h$, $h(0)=n$, and $h(|\cl{C}(e)|)=h(0)+1$.

\Defitem{(2)}
For any $S$,
there is a unique $h$ such that $\cl{C}(S):h$, $h(0)=0$, and
$h(|\cl{C}(S)|)=0$.

\Defitem{(3)}
There is a unique $h$ such that $\cl{C}(P):h$.
\end{proposition}


\subsection{$\mips$: language and semantics}\label{mips-sec}
We consider a $\mips$-like machine \cite{Larus05}
which includes the following elements:
(1) a fixed code $M$ (a sequence of instructions),
(2) a program counter \w{pc},
(3) a finite set of registers including the registers
$A$, $B$, and $R_0,\ldots,R_{b-1}$,
and (4) an (infinite) main memory which maps
locations to integers.

We denote with $R,R',\ldots$ registers, with $l,l',\ldots$ locations,
and with $m,m',\ldots$ memories, which are total functions from
registers and locations to (unbounded) integers.
We will rely on the following instructions with the associated
informal semantics:

{\footnotesize
\[
\begin{array}{ll}
\s{loadi} \ R, n    &\mbox{store the value }n\mbox{ in the register }R \\
\s{load} \ R,l   &\mbox{store the contents of location }l \mbox{ in the register }R   \\
\s{store} \ R,l &\mbox{store the contents of register }R \mbox{ in the location }l   \\
\s{add}\ R,R',R''  &\mbox{add the contents of }R',R''\mbox{ and store the result in }R \\
\s{branch}\ k      &\mbox{jump with offset }k        \\
\s{bge}\  R,R',k     &\mbox{jump with offset }k \mbox{ if the contents of }R \mbox{ is greater or equal to that of }R' \\
\s{halt}           &\mbox{stop computation}
\end{array}
\]}

We denote with $M$ a list of instructions.
The operational semantics is formalised in
table \ref{semantics-mips} by rules of the shape:
\[
M \Gives (i,m) \arrow (j,m')
\]
where $M$ is a list of $\mips$ instructions, $i,j$ are natural numbers and $m,m'$ are memories.
We write $(M,m) \eval m'$ if
$M\Gives (0,m) \trarrow (j,m')$ and $M[j]=\s{halt}$.

\begin{table}
{\footnotesize
\[
\begin{array}{l|l}

\mbox{Rule}                                                 & M[i]= \\\hline

M \Gives (i, m) \arrow (i+1, m[n/R])  & \s{loadi} \ R,n \\

M \Gives (i, m) \arrow (i+1,m[m(l)/R])  & \s{load} \ R,l \\

M \Gives (i, m ) \arrow (i+1,m[m(R)/l])  & \s{store} \ R,l \\

M \Gives (i, m)
\arrow (i+1, m[m(R')+m(R'')/R])  & \s{add}\ R,R',R'' \\

M \Gives (i, m) \arrow (i+k+1, m)  & \s{branch}\ k \\

M \Gives (i, m) \arrow (i+1, m)  & \s{bge}\ R,R',k \mbox{ and }m(R)<_{\Z} m(R')\\

M \Gives (i, m) \arrow (i+k+1, m)  & \s{bge}\ R,R',k \mbox{ and }m(R)\geq_{\Z} m(R')
\end{array}
\]}
\caption{Operational semantics of $\mips$ programs}\label{semantics-mips}
\end{table}

\subsection{Compilation from $\vm$ to $\mips$}\label{compil-vm-mips-sec}
In order to compile $\vm$ programs to $\mips$ programs we make the following
hypotheses:
(1) for every $\vm$ program variable $x$ we reserve an address $l_x$,
(2) for every natural number $h\geq b$, we reserve an address $l_h$ (the addresses $l_x,l_h,\ldots$ are all distinct),
(3) we store the first $b$ elements of the stack $\sigma$ in the registers
$R_0,\ldots,R_{b-1}$ and the remaining ones (if any) at the addresses
$l_b,l_{b+1},\ldots$.

We say that the memory $m$ represents the stack $\sigma$ and
the store $s$, and write $m\real \sigma,s$,
if the following conditions are satisfied:
(1) $s(x) = m(l_x)$, and (2) if $0\leq i < |\sigma|$ then
\[
\sigma[i] = \left\{ \begin{array}{ll}
                    m(R_i) &\mbox{if }i<b \\
                    m(l_i) &\mbox{if }i\geq b
                    \end{array} \right.
\]
The compilation function $\cl{C'}$ from
$\vm$ to $\mips$ is described in table \ref{compil-vm-mips}.
It operates on a well-formed $\vm$ code $C$ whose last instruction
is $\s{halt}$. Hence, by proposition \ref{well-formed-prop}(3),
there is a unique $h$ such that $C:h$.
We denote with $\cl{C'}(C)$ the concatenation $\cl{C'}(0,C)\cdots \cl{C'}(|C|-1,C)$.
\begin{table}
{\footnotesize
\[
\begin{array}{l|l}

C[i]=          &\cl{C'}(i,C)=
\\\hline

\s{cnst}(n)
& \left\{ \begin{array}{ll}
                   (\s{loadi} \ R_h,n)                                &\mbox{if } h=h(i)<b \\
                   (\s{loadi} \ A,n ) \cdot  (\s{store} \ A, l_h)  &\mbox{otherwise}
                   \end{array} \right. \\ \\


\s{var}(x)
& \left\{ \begin{array}{ll}
                   (\s{load} \ R_h,l_x)                           &\mbox{if } h=h(i)<b \\
                   (\s{load} \ A,l_x ) \cdot  (\s{store} \ A, l_h)  &\mbox{otherwise}
                   \end{array} \right. \\ \\



\s{add}
& \left\{ \begin{array}{ll}
                     (\s{add} \ R_{h-2} , R_{h-2} , R_{h-1})   &\mbox{if }h=h(i)<(b-1)   \\
                     (\s{load} \ A,l_{h-1}) \cdot (\s{add} \ R_{h-2}, R_{h-2}, A)         &\mbox{if }h=h(i)=(b-1)  \\


                           (\s{load} \ A, l_{h-1}) \cdot (\s{load} \ B,l_{h-2}) \cdot  &\mbox{if }h=h(i)>(b-1) \\
                           (\s{add} \ A,B,A) \cdot (\s{store} \ A, l_{h-2})
                        \end{array}\right. \\ \\

\s{setvar}(x)
& \left\{ \begin{array}{ll}
                         (\s{store} \ R_{h-1}, l_x) &\mbox{if }h=h(i)<b\\
                         (\s{load} \ A,l_{h-1})\cdot (\s{store} \ A,l_{x}) &\mbox{if }h=h(i)\geq b
                         \end{array}\right. \\ \\



\s{branch}(k)
& (\s{branch} \ k')  \quad \mbox{if }k'=p(i+k+1,C)-p(i+1,C)\\ \\


\s{bge}(k)
&\left\{ \begin{array}{ll}
                     (\s{bge} \ R_{h-2} , R_{h-1} , k')                                &\mbox{if }h=h(i)<(b-1)   \\

                     (\s{load} \ A,l_{h-1}) \cdot (\s{bge} \ R_{h-2}, A, k')         &\mbox{if }h=h(i)=(b-1)  \\


                           (\s{load} \ A, l_{h-2}) \cdot (\s{load} \ B,l_{h-1}) \cdot
                           (\s{bge} \ A,B,k')         &\mbox{if }h=h(i)>(b-1) \mbox{ and} \\

&\quad k'=p(i+k+1,C)-p(i+1,C)
          \end{array}\right. \\\\



\s{halt}
&\s{halt}

\end{array}
\]}
\caption{Compilation from $\vm$ to $\mips$}\label{compil-vm-mips}
\end{table}
Given a well-formed $\vm$ code $C$ with $i<|C|$ we denote with
$p(i,C)$ the position of the first instruction in $\cl{C'}(C)$ which
corresponds to the compilation of the instruction with position $i$ in $C$.
This is defined as:\footnote{There is an obvious circularity in this definition that can be easily eliminated by defining first the function
$d$ following the case
analysis in table \ref{compil-vm-mips}, then the function $p$ as in (\ref{p-function}), and finally the function $\cl{C'}$ as in table \ref{compil-vm-mips}.}
\begin{equation}\label{p-function}
p(i,C) = \Sigma_{0\leq j <i} \ d(j,C),
\end{equation}
where the function $d(i,C)$ is defined as follows:
\begin{equation}\label{d-function}
d(i,C) = |\cl{C'}(i,C)|~.
\end{equation}
Hence $d(i,C)$ is the number of $\mips$ instructions associated
with the $i^{\w{th}}$ instruction of the (well-formed) code $C$.
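In $\ocaml$, the functions $d$ and $p$ could be sketched as follows; \texttt{compile\_instr} is a hypothetical function implementing $\cl{C'}(i,C)$ of table \ref{compil-vm-mips} and returning a list of $\mips$ instructions.

{\footnotesize
\begin{verbatim}
(* Sketch of the functions d and p of (\ref{d-function}) and
   (\ref{p-function}); compile_instr is an assumed implementation of
   the case analysis of table \ref{compil-vm-mips}. *)
let p (compile_instr : int -> instr array -> 'a list)
      (code : instr array) (i : int) : int =
  let d j = List.length (compile_instr j code) in   (* d(j, C) *)
  let rec sum j = if j >= i then 0 else d j + sum (j + 1) in
  sum 0
\end{verbatim}}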

The functional correctness of the compilation function can then be stated
as follows.

\begin{proposition}\label{simulation-vm-mips}
Let $C : h$ be a well-formed code.
If  $C \Gives (i,\sigma,s) \arrow (j,\sigma',s')$ with $h(i) = |\sigma|$ and  $m \real \sigma,s$
then
$\cl{C'}(C) \Gives (p(i,C),m) \trarrow (p(j,C),m')$ and $m'\real \sigma',s'$.
\end{proposition}



\section{Direct approach for the toy compiler}\label{toy-direct-sec}
We apply the direct approach discussed in section \ref{direct-intro}
to the toy compiler, which results in the following diagram:


\[
\xymatrix{
\imp
\ar[r]^{\cl{C}}
\ar[d]^{An_{\imp}}
&
\vm
\ar[r]^{\cl{C'}}
\ar[d]^{An_{\vm}}
&
\mips
\ar[d]^{An_{\mips}}
\\
\imp
&
\vm
& \mips\\
}
\]

\subsection{$\mips$ and $\vm$ cost annotations}
The definition of the cost annotation $\w{An}_{\mips}(M)$ for a $\mips$ code $M$ goes as follows, assuming that all the $\mips$ instructions have cost $1$.

\begin{enumerate}

\item  We select fresh locations $l_{\w{cost}},l_A,l_B$ not used in $M$.

\item Before each instruction in $M$ we insert the following
list of $8$ instructions whose effect is to
increase by $1$ the contents of location $l_{\w{cost}}$:

{\footnotesize
\[
\begin{array}{l}

(\s{store} \ A,l_{A}) \cdot
(\s{store} \ B,l_{B}) \cdot
(\s{loadi} \ A, 1) \cdot
(\s{load} \ B, l_{\w{cost}}) \cdot \\
(\s{add} \ A, A, B) \cdot
(\s{store} \ A,l_{\w{cost}}) \cdot
(\s{load} \ A, l_{A}) \cdot
(\s{load} \ B, l_{B})~.
\end{array}
\]}


\item We update the offsets of the branching instructions according to the map
$k \mapsto (9\cdot k)$.

\end{enumerate}

The definition of the cost annotation
$\w{An}_{\vm}(C)$ for a well-formed $\vm$ code $C$ such that
$C:h$ goes as follows.

\begin{enumerate}

\item We select a fresh $\w{cost}$ variable not used in $C$.

\item Before the $i^{\w{th}}$ instruction in $C$,
we insert the following list of 4 instructions whose
effect is to increase the $\w{cost}$ variable by the maximum number
$\kappa(C[i])$ of instructions associated with $C[i]$ (as specified
in table \ref{cost-approx-instr}):

{\footnotesize\[
(\s{cnst}(\kappa(C[i]))) \cdot
(\s{var}(\w{cost})) \cdot
(\s{add}) \cdot
(\s{setvar}(\w{cost}))~.
\]}

\item We update the offsets of the branching instructions according to the map
$k\mapsto (5 \cdot k)$.

\end{enumerate}
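Concretely, $\w{An}_{\vm}$ can be sketched in $\ocaml$ over the illustrative \texttt{instr} type as follows; \texttt{kappa} is assumed to implement table \ref{cost-approx-instr} and the variable \texttt{cost} to be fresh. The definition of $\w{An}_{\mips}$ follows the same pattern with the $8$-instruction sequence above and the map $k\mapsto 9\cdot k$.

{\footnotesize
\begin{verbatim}
(* Sketch of An_vm: insert, before every instruction, the four
   instructions increasing cost by kappa(C[i]), and rescale the
   branching offsets by the map k -> 5k. *)
let an_vm (kappa : instr -> int) (code : instr list) : instr list =
  List.concat_map
    (fun ins ->
       let ins' = match ins with       (* update branching offsets *)
         | Ibranch k -> Ibranch (5 * k)
         | Ibge k    -> Ibge (5 * k)
         | _         -> ins
       in
       [Icnst (kappa ins); Ivar "cost"; Iadd; Isetvar "cost"; ins'])
    code
\end{verbatim}}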


\begin{table}
{\footnotesize
\[
\begin{array}{c|cccccc}

C[i]=

&\s{cnst}(n) \mbox{ or } \s{var}(x)

&\s{add}

&\s{setvar}(x)

&\s{branch}(k)

&\s{bge}(k)

&\s{halt} \\ \hline


\kappa(C[i])=

&2

& 4

& 2

& 1

& 3

& 1

\end{array}
\]}
\caption{Upper bounds on the cost of $\vm$ instructions}\label{cost-approx-instr}
\end{table}

To summarise, the situation is that the instruction $i$ of the $\vm$
code $C$ corresponds to: the instruction $5\cdot i +4$ of the
annotated $\vm$ code $\w{An}_{\vm}(C)$, the instruction $p(i,C)$ of
the $\mips$ code $\cl{C'}(C)$, and the instruction $9\cdot p(i,C)+8$
of the instrumented $\mips$ code $\w{An}_{\mips}(\cl{C'}(C))$.

The following lemma describes the effect of the injected sequences
of instructions.

\begin{lemma}\label{annot-vm-mips}
The following properties hold:

\Defitem{(1)} If $M$ is a $\mips$ instruction then $C_1 \cdot \w{An}_{\mips}(M) \cdot C_2 \Gives  (|C_1|,m[c/l_{\w{cost}}]) \trarrow (|C_1|+8,m[c+1/l_{\w{cost}},m(A)/l_A,m(B)/l_B])$.

\Defitem{(2)} If $C$ is a $\vm$ instruction then $C_1 \cdot \w{An}_{\vm}(C) \cdot C_2 \Gives (|C_1|,\sigma,s[c/\w{cost}]) \trarrow (|C_1|+4,\sigma,s[c + \kappa(C[i])/\w{cost}])$, where
$i=|C_1|$.

\end{lemma}


The simulation proposition \ref{simulation-vm-mips} is extended to the
annotated codes as follows, where we write $\arrow^k$ for the relation
obtained by composing $k$ times the reduction relation $\arrow$.


\begin{proposition}\label{labelled-simulation-vm-mips}
Let $C : h$ be a well-formed code. If $\w{An}_{\vm}(C) \Gives (5 \cdot
i, \sigma, s) \arrow^5 (5 \cdot j, \sigma', s')$, $m \real \sigma,
s$, and $h(i) = |\sigma|$ then $\w{An}_{\mips}(\cl{C'}(C)) \Gives (9
\cdot p(i,C),m) \trarrow (9\cdot p(j,C),m')$, with $m'[s'(\w{cost}) /
  l_{\w{cost}}] \real \sigma', s'$ and
\[
m'(l_{\w{cost}}) - m(l_{\w{cost}}) \le s'(\w{cost}) - s(\w{cost})~.
\]
\end{proposition}


\subsection{$\imp$ cost annotation}\label{imp-cost-annotation}
The cost annotation $\w{An}_{\imp}(P)$ of an $\imp$
program $P$ is defined in table \ref{annotation-imp} and it relies on
an auxiliary function $\kappa$ which provides an upper bound on the
cost of executing the various operators of the language.
For the sake of simplicity, this
annotation introduces two main {\em approximations}:

\begin{itemize}

\item It over-approximates the cost of
an $\s{if\_then\_else}$ by taking the maximum of the
costs of the two branches.

\item  It always considers the worst case where
the data in the stack reside in the main memory.

\end{itemize}

We will discuss in section \ref{limitations-direct-sec} to what extent
these approximations can be removed.

\begin{table}
{\footnotesize
\[
\begin{array}{ll}

\w{An}_{\imp}(\s{prog} \ S) &= \w{cost}:=\w{cost}+\kappa(S); \w{An}_{\imp}(S) \\
\w{An}_{\imp}(\s{skip})     &= \s{skip} \\
\w{An}_{\imp}(x:=e)         &= x:=e \\
\w{An}_{\imp}(S;S')         &= \w{An}_{\imp}(S);\w{An}_{\imp}(S') \\
\w{An}_{\imp}(\s{if} \ b \ \s{then} \ S \ \s{else} \ S')
&=  (\s{if} \ b \ \s{then} \  \w{An}_{\imp}(S) \\
&\qquad\qquad  \s{else} \ \w{An}_{\imp}(S')) \\

\w{An}_{\imp}(\s{while} \ b \ \s{do} \ S)
&= (\s{while} \ b \
\s{do} \ \w{cost}:=\w{cost}+\kappa(b)+\kappa(S)+1; \w{An}_{\imp}(S)  )

\end{array}
\]
\[
\begin{array}{ll}

\kappa(\s{skip})  =0
&\kappa(x:=e)       =\kappa(e) + \kappa(\s{setvar}) \\

\kappa(S;S')       = \kappa(S)+\kappa(S')

&\kappa(\s{if} \ b \ \s{then} \ S \ \s{else} \ S')
                   =\kappa(b)+\w{max}(\kappa(S) + \kappa(\s{branch}),\kappa(S')) \\

\kappa(\s{while} \ b \ \s{do} \ S)
                   =\kappa(b)  \\

\kappa(e<e')      =\kappa(e)+\kappa(e')+ \kappa(\s{bge})

&\kappa(e+e')    = \kappa(e)+\kappa(e')+ \kappa(\s{add})  \\

\kappa(n)       = \kappa(\s{cnst})

&\kappa(x)       = \kappa(\s{var})

\end{array}
\]}
\caption{Annotation for $\imp$ programs}\label{annotation-imp}
\end{table}
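For instance, instantiating $\kappa(\s{cnst})$, $\kappa(\s{var})$, $\kappa(\s{add})$, and $\kappa(\s{setvar})$ with the upper bounds $2,2,4,2$ of table \ref{cost-approx-instr}, we obtain for a simple assignment:

{\footnotesize
\[
\kappa(x:=x+1) = \kappa(x)+\kappa(1)+\kappa(\s{add})+\kappa(\s{setvar}) = 2+2+4+2 = 10~.
\]}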

Next we formulate the soundness of the $\imp$ annotation with respect
to the $\vm$ annotation along the pattern of proposition
\ref{soundness-big-step-prop}.



\begin{proposition}\label{annot-imp-vm}
The following properties hold.

\Defitem{(1)} If $(e,s)\eval v$ then
\[
C \cdot \w{An}_{\vm}(\cl{C}(e)) \cdot C'
\Gives (i,\sigma,s[c/\w{cost}]) \trarrow
(j,v \cdot \sigma,s[c+\kappa(e)/\w{cost}])
\]
where $i=|C|$, $j=|C\cdot  \w{An}_{\vm}(\cl{C}(e))|$.


\Defitem{(2)} If $(b,s)\eval \s{true}$ then
\[
C \cdot \w{An}_{\vm}(\cl{C}(b,k)) \cdot C'
\Gives (i,\sigma,s[c/\w{cost}]) \trarrow
(5\cdot k +j, \sigma,s[c+\kappa(b)/\w{cost}])
\]
where $i=|C|$, $j=|C\cdot  \w{An}_{\vm}(\cl{C}(b,k))|$.

\Defitem{(3)} If
$(b,s)\eval \s{false}$ then
\[
C \cdot \w{An}_{\vm}(\cl{C}(b,k)) \cdot C'
\Gives (i,\sigma,s[c/\w{cost}]) \trarrow
(j,\sigma,s[c+\kappa(b)/\w{cost}])
\]
where $i=|C|$, $j=|C\cdot  \w{An}_{\vm}(\cl{C}(b,k))|$.

\Defitem{(4)}
If $(\w{An}_{\imp}(S),s[c/\w{cost}])\eval s'[c'/\w{cost}]$ then
\[
C \cdot \w{An}_{\vm}(\cl{C}(S)) \cdot C'
\Gives (i,\sigma,s[d/\w{cost}]) \trarrow (j,\sigma,s'[d'/\w{cost}])
\]
where $i=|C|$, $j=|C\cdot  \w{An}_{\vm}(\cl{C}(S))|$,
and $(d'-d) \leq (c'-c)+\kappa(S)$.
\end{proposition}


\subsection{Composition}
The soundness of the cost annotations can be composed
so as to obtain the soundness of the cost annotations of the source
program relative to the one of the object code.

\begin{proposition}\label{annot-composition}
If $(\w{An}_{\imp}(P),s[0/\w{cost}])\eval s'[c'/\w{cost}]$
and $m\real \epsilon,s[0/l_{\w{cost}}]$ then
\[
(\w{An}_{\mips}(\cl{C'}(\cl{C}(P))),m)
 \eval m'
\]
where $m'(l_{\w{cost}})\leq c'$ and $m'[c'/l_{\w{cost}}] \real \epsilon,s'[c'/\w{cost}]$.
\end{proposition}


\subsection{$\coq$ development}\label{coq-development-sec}
We have
formalised and mechanically checked in {\sc Coq} the
application of the direct approach to the toy compiler
(except for propositions \ref{labelled-simulation-vm-mips} and
\ref{annot-composition}, for which
we provide a `paper proof' in the appendix).
By current standards, this is a small-size
development including 1000 lines of specifications and 4000 lines of
proofs.  Still, there are a couple of points that deserve to be
mentioned.  First, we did not find a way of re-using the soundness
proof of the compiler in a modular way.  As a matter of fact, the
soundness proof of the annotations is intertwined with the soundness
proof of the compiler.  Second, the manipulation of the cost variables
in the annotated programs entails a significant increase in the size
of the proofs.  In particular, the soundness proof for the compilation
function $\cl{C}$ from $\imp$ to $\vm$ available in \cite{Leroy09-ln}
is roughly $7$ times smaller than the soundness proof of the
annotation function of the $\imp$ code relative to the one of the
$\vm$ code.

\subsection{Limitations of the direct approach}\label{limitations-direct-sec}
As mentioned in section \ref{imp-cost-annotation}, the annotation function
for the source language introduces some over-approximation, thus {\em failing to be precise}.
On one hand, it is easy to modify the definitions
in table \ref{annotation-imp} so that
they compute the cost of each branch of an
{\tt if\_then\_else} separately
rather than taking the maximum cost of the two branches.
On the other hand, it is rather difficult to refine the annotation
function so that it accounts for the memory hierarchy in the $\mips$ machine;
one needs to pass to the function $\kappa$ a `hidden
parameter' which corresponds to the stack height.
This process of pushing hidden parameters into the definition of
the annotation is error prone and seems unlikely to work in practice
for a realistic compiler. We programmed sound (but not precise) cost annotations
for the $\C$ compiler introduced in section \ref{Ccompiler-intro} and found
that the approach is difficult to test because an over-approximation
of the cost at some point may easily compensate for an under-approximation
somewhere else. By contrast, in the labelling approach introduced in section \ref{label-intro},
we manipulate costs at an abstract level as labels and produce
numerical values only at the very end of the construction.


\section{Labelling approach for the toy compiler}\label{label-toy-sec}
We apply the labelling approach introduced in section
\ref{label-intro} to the toy compiler, which results in the following
diagram.

\-

\newcolumntype{M}[1]{>{\raggedright}m{#1}}
\begin{tabular}{M{9cm}||M{4cm}}
$$
\xymatrix{
% Row 1
  \imp \\
%
% Row 2
  \imp_\ell
  \ar[u]^{\cl{I}}
  \ar@/^/[d]^{\w{er}_\imp}
  \ar[r]^{\cl{C}}
%
& \vm_\ell
  \ar[d]^{\w{er}_\vm}
  \ar[r]^{\cl{C}'}
%
& \mips_\ell
  \ar[d]^{\w{er}_{\mips}}  \\
%
% Row 3
  \imp
  \ar@/^/[u]^{\cl{L}}
  \ar[r]^{\cl{C}}
& \vm
  \ar[r]^{\cl{C}'}
& \mips
}
$$
&
$$
\begin{array}{ccc}
\w{er}_\vm \comp \cl{C} &= &\cl{C} \comp \w{er}_{\imp} \\

\w{er}_\mips \comp \cl{C'} &= &\cl{C'} \comp \w{er}_{\vm} \\

\w{er}_\imp \comp \cl{L} & = & id_{\imp} \\

\w{An}_\imp &= &\cl{I} \comp \cl{L}
\end{array}
$$
\end{tabular}


\subsection{Labelled $\imp$}
We extend the syntax so that statements
and boolean conditions in \s{while} loops
can be labelled.

{\footnotesize
\[
\begin{array}{lll}

\w{lb}  &::= \ell:b
&\mbox{(labelled boolean conditions)}\\
S  &::= \ldots \Alt \ell:S \Alt \s{while} \ \w{lb} \ \s{do} \ S
&\mbox{(labelled commands)}

\end{array}
\]}

For instance $\ell:(\s{while} \ \ell' : (n<x) \ \s{do} \ \ell:S )$ is a labelled command.
Also notice that the definition allows labels in commands to
be nested as in $\ell:(\ell':(x:=e))$, though this feature is not really used.
The evaluation predicate $\eval$ on labelled boolean conditions is
extended as follows:

{\footnotesize
\begin{equation}
\infer{(b,s) \eval v}
{(\ell:b,s) \eval (v,\ell)}
\end{equation}}

So a labelled boolean condition evaluates into a pair
composed of a boolean value and a label.
The small-step semantics of statements
defined in table \ref{small-step-imp}
is extended as follows.

{\footnotesize
\[
\begin{array}{lll}
(\ell:S,K,s) &\act{\ell} &(S,K,s) \\  \\

(\s{while} \ \w{lb} \ \s{do} \ S ,K,s) &\act{\ell}
&\left\{
\begin{array}{ll}
(S,(\s{while} \ \w{lb} \ \s{do} \ S)\cdot K,s) &\mbox{if }(\w{lb},s)\eval (\s{true},\ell) \\
(\s{skip},K,s) &\mbox{if }(\w{lb},s)\eval (\s{false},\ell)
\end{array}
\right. \\

\end{array}
\]}

We denote with $\lambda,\lambda',\ldots$
finite sequences of labels. In particular, we denote with $\epsilon$ the empty sequence
and identify an unlabelled transition with a transition labelled with $\epsilon$.
Then the small-step reduction relation we have defined
on statements becomes a {\em labelled transition system}.
There is an obvious {\em erasure} function $\w{er}_{\imp}$
from the labelled language to the unlabelled one which is the identity on expressions,
removes labels from labelled boolean conditions
and is the identity on unlabelled ones, and traverses commands
removing all labels.
We derive a {\em labelled} big-step semantics as follows:

{\footnotesize
\[
\begin{array}{ll}

(S,s)\eval (s',\lambda)          &\mbox{if } (S,\s{halt},s) \act{\lambda_1} \cdots \act{\lambda_n}
                                             (\s{skip},\s{halt},s')  \mbox{ and }\lambda =\lambda_1 \cdots \lambda_n~.

\end{array}
\]}


\subsection{Labelled $\vm$}
We introduce a new instruction
$\s{nop}(\ell)$
whose semantics is defined as follows:

{\footnotesize
\[
C \Gives (i,\sigma,s) \act{\ell} (i+1,\sigma,s) \qquad \mbox{if }C[i]=\s{nop}(\ell)~.
\]}

The erasure function $\w{er}_{\vm}$ amounts to removing from a
$\vm$ code $C$ all the $\s{nop}(\ell)$ instructions and recomputing the jumps
accordingly. Specifically, let $n(C,i,j)$ be the number of \s{nop} instructions
in the interval $[i,j]$. Then, assuming $C[i]=\s{branch}(k)$, we replace the offset
$k$ with an offset $k'$ determined as follows:

{\footnotesize
\[
k' =
\left\{
\begin{array}{ll}

k-n(C,i,i+k) &\mbox{if }k\geq 0 \\

k+n(C,i+1+k,i)   &\mbox{if }k<0

\end{array}
\right.
\]}
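In $\ocaml$, and assuming the illustrative \texttt{instr} type of section \ref{vm-sec} extended with a constructor \texttt{Inop of string} for $\s{nop}(\ell)$ (labels represented as strings), the offset recomputation could be sketched as follows.

{\footnotesize
\begin{verbatim}
(* Sketch of the offset recomputation performed by er_vm.
   nops code i j counts the nop instructions at positions i..j. *)
let nops (code : instr array) (i : int) (j : int) : int =
  let c = ref 0 in
  for p = i to j do
    (match code.(p) with Inop _ -> incr c | _ -> ())
  done;
  !c

let new_offset (code : instr array) (i : int) (k : int) : int =
  if k >= 0 then k - nops code i (i + k)
  else k + nops code (i + 1 + k) i
\end{verbatim}}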
%
The compilation function $\cl{C}$ is extended to $\imp_\ell$ by defining:

{\footnotesize
\[
\begin{array}{llll}
\cl{C}(\ell:b,k) &=(\s{nop}(\ell))\cdot \cl{C}(b,k) \qquad
&\cl{C}(\ell:S)   &=   (\s{nop}(\ell))\cdot \cl{C}(S)~.

\end{array}
\]}

\begin{proposition}\label{labelled-sim-imp-vm}
For all commands $S$ in $\imp_\ell$ we have that:

\Defitem{(1)} $\w{er}_{\vm}(\cl{C}(S)) = \cl{C}(\w{er}_{\imp}(S))$.

\Defitem{(2)}
If $(S,s)\eval (s',\lambda)$ then $(\cl{C}(S),s)\eval(s',\lambda)$.
\end{proposition}


\begin{remark}
In the current formulation,
a sequence of transitions $\lambda$ in the source code
must be simulated by the same sequence of transitions
in the object code. However, in the actual computation
of the costs, the order of the labels occurring in the sequence is
immaterial. Therefore one may consider a more relaxed notion of simulation
where $\lambda$ is a multi-set of labels.
\end{remark}

1574\subsection{Labelled $\mips$}
1575The labelled extension of $\mips$ is similar to the one of $\vm$.
1576We add an instruction $\s{nop} \ \ell$ whose semantics is defined
1577as follows:
1578
1579{\footnotesize
1580\[
1581\begin{array}{ll}
1582M\Gives (i,m) \act{\ell} (i+1,m)    &\mbox{if }M[i]=(\s{nop} \ \ell)~.
1583\end{array}
1584\]}
1585
1586The {\em erasure function} $\w{er}_{\mips}$ is also similar to the one
1587of $\vm$ as it amounts to remove from a $\mips$ code all the $(\s{nop} \ \ell)$ 
1588instructions and recompute jumps accordingly.
1589The compilation function $\cl{C'}$ is extended to $\vm_\ell$ by
1590simply translating $\s{nop}(\ell)$ as $(\s{nop} \ \ell)$:
1591
1592{\footnotesize
1593\[
1594\begin{array}{ll}   
1595
1596\cl{C'}(i,C) = (\s{nop} \ \ell)
1597&\mbox{if }C[i]=\s{nop}(\ell)
1598
1599\end{array}
1600\]}
1601The evaluation predicate for labelled $\mips$ is defined as
1602$(M,m)\eval (m',\lambda)$
1603if
1604$M\Gives (0,m) \act{\lambda_1} \cdots \act{\lambda_n}  (j,m')$,
1605$\lambda =\lambda_1 \cdots \lambda_n$  and $M[j]=\s{halt}$.
The following proposition relates a $\vm_\ell$ code to its compilation;
it is similar to proposition \ref{labelled-sim-imp-vm}.
1608
1609
1610\begin{proposition}\label{sim-vm-mips-prop}
1611Let $C$ be a $\vm_\ell$ code. Then:
1612
1613\Defitem{(1)} $\w{er}_{\mips}(\cl{C'}(C)) = \cl{C'}(\w{er}_{\vm}(C))$.
1614
1615\Defitem{(2)} 
If $(C,s) \eval (s',\lambda)$ and $m\real \epsilon,s$ then
$(\cl{C'}(C),m) \eval (m',\lambda)$ and $m'\real \epsilon,s'$.
1618\end{proposition}
1619
1620
1621
1622
1623\subsection{Labellings and instrumentations}
Assuming a function $\kappa$ that associates an integer with every
label and a distinct variable \w{cost} which does not occur in the program $P$ under consideration,
we abbreviate with $\w{inc}(\ell)$ the assignment $\w{cost}:=\w{cost}+\kappa(\ell)$.
1627Then we define the instrumentation $\cl{I}$ (relative to $\kappa$ and $\w{cost})$ 
1628as follows.
1629
1630{\footnotesize
1631\[
1632\begin{array}{ll}
1633
1634\cl{I}(\ell:S) &= \w{inc}(\ell) ; \cl{I}(S) \\
1635
1636
1637\cl{I}(\s{while} \ \ell:b \ \s{do} \ S )
1638&= \w{inc}(\ell); \s{while} \ b \ \s{do} \ (\cl{I}(S);\w{inc}(\ell))
1639
1640\end{array}
1641\]}
1642
1643The function $\cl{I}$ just distributes over the other operators of the language.
1644We extend the function $\kappa$ on labels to sequences of labels by defining
1645$\kappa(\ell_1,\ldots,\ell_n) = \kappa(\ell_1)+\cdots +\kappa(\ell_n)$.
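
For instance, on a labelled loop, the instrumentation produces the
following shape (a C-flavoured sketch of ours, assuming a hypothetical
cost $\kappa(\ell)=5$):

{\footnotesize
\begin{verbatim}
unsigned int cost = 0;   /* the fresh cost variable                   */
int x = 10;

/* I(while l:(0 < x) do x := x - 1), with kappa(l) = 5 */
void run(void)
{
    cost += 5;           /* inc(l): reaching the loop test            */
    while (0 < x) {
        x = x - 1;       /* I(S): the instrumented loop body          */
        cost += 5;       /* inc(l): one more evaluation of the test   */
    }
}
\end{verbatim}}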
1646
1647
The instrumented $\imp$ program relates to the labelled one as follows.
1649
1650\begin{proposition}\label{lab-instr-erasure-imp}
1651Let $S$ be an $\imp_\ell$ command. If
1652$(\cl{I}(S),s[c/\w{cost}]) \eval s'[c+\delta/\w{cost}]$
1653then
1654$\xst{\lambda}{\kappa(\lambda)=\delta \mbox{ and }(S,s[c/\w{cost}])\eval(s'[c/\w{cost}],\lambda)}$.
1655\end{proposition}
1656
1657
1658
1659
1660
1661
1662
1663\begin{definition}\label{labelling-def}
1664A {\em labelling} is a function $\cl{L}$ from an unlabelled language to
1665the corresponding labelled one such that
1666$\w{er}_{\imp} \comp \cl{L}$ is the identity function on the
1667$\imp$ language.
1668\end{definition}
1669
1670\begin{proposition}\label{global-commutation-prop}
1671For any labelling function $\cl{L}$, and $\imp$ program $P$, the following holds:
1672\begin{equation}
\w{er}_{\mips}(\cl{C'}(\cl{C}(\cl{L}(P)))) = \cl{C'}(\cl{C}(P))~.
1674\end{equation}
1675\end{proposition}
1676
1677
1678
1679
1680
1681\begin{proposition}\label{instrument-to-label-prop}
1682Given a function $\kappa$ for the labels and a labelling function $\cl{L}$,
1683for all programs $P$ of the source language
1684if
1685$(\cl{I}(\cl{L}(P)),s[c/\w{cost}]) \eval s'[c+\delta/\w{cost}]$ 
1686and $m\real s[c/\w{cost}]$ then
1687$(\cl{C'}(\cl{C}(\cl{L}(P))),m) \eval (m',\lambda)$, $m'\real s'[c/\w{cost}]$ and
1688$\kappa(\lambda)=\delta$.
1689\end{proposition}
1690
1691
1692
1693
1694
1695\subsection{Sound and precise labellings}\label{sound-label-sec}
1696With any $\mips_\ell$ code $M$ we can associate a directed and rooted
1697(control flow) graph whose nodes are the instruction positions $\set{0,\ldots,|M|-1}$,
1698whose root is the node $0$,  and
1699whose directed edges correspond to the possible transitions between instructions. 
We say that a node is labelled if it corresponds to an instruction
$\s{nop} \ \ell$.
1702
1703\begin{definition}
1704A {\em simple path} in a $\mips_\ell$ code $M$ is a directed finite path in the graph associated with
1705$M$ where the first node is labelled, the last node is the predecessor of either a labelled node or a leaf,
1706and all the other nodes are unlabelled.
1707\end{definition}
1708
1709
1710\begin{definition}
1711A $\mips_\ell$ code $M$ is {\em soundly labelled} if
1712in the associated graph
1713the root node $0$ is labelled
1714and there are no loops that do not go through a labelled node.
1715\end{definition}
1716
1717
1718In a soundly labelled graph there are finitely many simple paths.
1719Thus, given a soundly labelled $\mips$ code $M$, we can associate
1720with every label $\ell$ a number $\kappa(\ell)$ which is the
1721maximum (estimated) cost  of executing a simple
1722path whose first node is labelled with $\ell$.
1723We stress that in the following we assume that
1724the cost of a simple path is proportional to the number of $\mips$
1725instructions that are crossed in the path.
1726
1727
1728\begin{proposition}\label{sound-label-prop}
1729If $M$ is soundly labelled and $(M,m) \eval (m',\lambda)$ then the cost
1730of the computation is bounded by $\kappa(\lambda)$.
1731\end{proposition}
1732
1733
Thus for a soundly labelled $\mips$ code the sequence of labels associated
with a computation provides significant information on the execution cost.
1736
1737
1738\begin{definition}
1739We say that a soundly labelled code is {\em precise}
1740if for every label $\ell$ in the code, the simple
1741paths starting from a node labelled with $\ell$ have
1742the same cost.
1743\end{definition}
1744
1745In particular, a code is precise if we can associate at most one simple
1746path with every label.
1747
1748
1749
1750\begin{proposition}\label{precise-label-prop}
1751If $M$ is precisely labelled and $(M,m) \eval (m',\lambda)$ then the cost
1752of the computation is $\kappa(\lambda)$.
1753\end{proposition}
1754
1755
1756
1757
1758
1759
1760The next point we have to check is that
1761there are labelling functions (of the source code)
1762such that the compilation function does produce sound and possibly
1763precise labelled $\mips$ code.  To discuss this point, we introduce in table
1764\ref{labelling} two labelling functions $\cl{L}_s$ and $\cl{L}_p$ for the
1765$\imp$ language where the operator \w{new} is meant to return fresh
1766labels.
1767
1768\begin{table}
1769{\footnotesize
1770\[
1771\begin{array}{ll}
1772\cl{L}_s(\s{prog} \ S) &=  \s{prog}\  \ell: \cl{L}_s(S) \\
1773
1774\cl{L}_s(\s{skip})     &=\s{skip} \\
1775
1776\cl{L}_s(x:=e)         &=x:=e \\
1777
1778\cl{L}_s(S;S')         &= \cl{L}_s(S);\cl{L}_s(S') \\
1779
1780
1781\cl{L}_s(\s{if} \ b \ \s{then} \ S_1 \ \s{else} \ S_2) &= 
1782
1783\s{if} \ b \ \s{then} \ \cl{L}_s(S_1) \ \s{else} \ \cl{L}_s(S_2) \\ 
1784
1785
1786\cl{L}_s(\s{while} \ b \ \s{do} \ S) &= \s{while} \ b \ \s{do} \ \ell:\cl{L}_s(S) \\  \\
1787
1788
1789
1790
1791\cl{L}_p(\s{prog} \ S) &= \w{let} \ \ell=\w{new} \ \w{in}\ \quad \s{prog}\  \ell: \cl{L}_p(S) \\
1792
1793\cl{L}_p(\s{skip})     &=\w{let} \ \ell=\w{new} \ \w{in} \ \quad \ell:(\s{skip}) \\
1794
1795\cl{L}_p(x:=e)         &= \w{let} \ \ell=\w{new} \ \w{in} \ \quad \ell:(x:=e) \\
1796
1797\cl{L}_p(S;S')         &= \cl{L}_p(S);\cl{L}_p(S') \\
1798
1799
1800\cl{L}_p(\s{if} \ b \ \s{then} \ S_1 \ \s{else} \ S_2) &= \w{let} \ \ell=\w{new} \ \w{in} \quad\quad
1801\s{if} \ \ell:b \ \s{then} \ \cl{L}_p(S_1) \ \s{else} \ \cl{L}_p(S_2) \\ 
1802
1803
1804\cl{L}_p(\s{while} \ b \ \s{do} \ S) &= \w{let} \ \ell=\w{new} \ \w{in} \quad\quad 
1805\s{while} \ \ell:b \ \s{do} \ \cl{L}_p(S) 
1806
1807\end{array}
1808\]}
1809\caption{Two labellings for the $\imp$ language} \label{labelling}
1810\end{table}
1811
1812\begin{proposition}\label{lab-sound}
1813For all $\imp$ programs $P$:
1814
1815\Defitem{(1)} 
$\cl{C'}(\cl{C}(\cl{L}_s(P)))$ 
1817is a soundly labelled $\mips$ code.
1818
1819\Defitem{(2)}
$\cl{C'}(\cl{C}(\cl{L}_p(P)))$ is a
1821soundly and precisely  labelled $\mips$ code.
1822\end{proposition}
1823
1824
For an example of a command which is not soundly labelled, consider
$\ell: \s{while} \ 0<x \ \s{do} \ x:=x+1$ which when compiled produces
a loop that does not go through any label.  On the other hand, for an
example of a program which is not precisely labelled, consider
1829$(\s{while} \ \ell:(0<x) \ \s{do} \ x:=x+1)$.  In the compiled code, we
1830find two simple paths associated with the label $\ell$ whose cost will
1831be quite different in general.
1832
1833Once a sound and possibly precise labelling $\cl{L}$ has been
1834designed, we can determine the cost of each label and define an instrumentation
1835$\cl{I}$ whose composition with $\cl{L}$ will produce the desired
1836cost annotation.
1837
1838
1839\begin{definition}
1840Given a labelling function $\cl{L}$ for the source language $\imp$ 
1841and a program $P$  in the $\imp$ language,  we define an
1842annotation for the source program as follows:
1843\[
1844\begin{array}{ll}
1845
\w{An}_{\imp}(P) &= \cl{I}(\cl{L}(P))~.
1847
1848\end{array}
1849\]
1850\end{definition}
1851
1852
1853
1854
1855\begin{proposition}\label{ann-correct}
If $P$ is a program and $\cl{C'}(\cl{C}(\cl{L}(P)))$ is soundly (soundly and precisely)
labelled then $(\w{An}_{\imp}(P),s[c/\w{cost}]) \eval s'[c+\delta/\w{cost}]$
and $m\real s[c/\w{cost}]$ entail that
$(\cl{C'}(\cl{C}(P)),m) \eval m'$, $m'\real s'[c/\w{cost}]$,
and the cost of the execution is bounded by (respectively, is exactly) $\delta$.
1861\end{proposition}
1862
1863
1864To summarise, producing sound and precise labellings is mainly a
1865matter of designing the labelled source language so that
1866the labelling is sufficiently {\em fine grained}.
1867For instance, in the toy compiler, the fact that boolean conditions
1868are labelled is instrumental to precision while
1869the labelling of expressions turns out to be unnecessary.
1870
1871
Besides soundness and precision, a third criterion to evaluate
labellings is that they do not introduce too many unnecessary labels.
We call this property {\em economy}.
1875There are two reasons for this requirement. On one hand we would like
1876to minimise the number of labels so that the source program
1877is not cluttered by too many cost annotations and
1878on the other hand we would like to maximise the length of the simple
1879paths because in a non-trivial processor the longer the sequence of
1880instructions we consider the more accurate is the estimation of
1881their execution cost (on a long sequence certain costs are
1882amortized).
1883In practice, it seems that one can produce first a sound and
1884possibly precise labelling and
1885then apply heuristics to eliminate unnecessary labels.
1886
1887
1888
1889\section{A $\C$ compiler}\label{C-compiler-sec}
1890
1891This section gives an informal overview of the compiler, in particular it
1892highlights the main features of the intermediate languages, the purpose of the
1893compilation steps, and the optimisations.
1894
1895\subsection{$\Clight$} 
1896$\Clight$ is a large subset of the $\C$ language that we adopt as
1897the source language of our compiler.
1898It features most of the types and operators
1899of $\C$. It includes pointer arithmetic, pointers to
1900functions, and \texttt{struct} and \texttt{union} types, as well as
1901all $\C$ control structures. The main difference with the $\C$
1902language is that $\Clight$ expressions are side-effect free, which
1903means that side-effect operators ($\codeex{=}$,$\codeex{+=}$,$\codeex{++}$,$\ldots$) and function calls
1904within expressions are not supported.
Given a $\C$ program, we rely on the $\cil$ tool \cite{CIL02} to deal
with the idiosyncrasies of the $\C$ concrete syntax and to produce an
1907equivalent program in $\Clight$ abstract syntax.
1908We refer to  the $\compcert$ project \cite{Leroy09} 
1909for a formal definition of the
1910$\Clight$ language. Here we just recall  in
1911figure \ref{syntax-clight} its syntax which is
1912classically structured in expressions, statements, functions,
1913and whole programs. In order to limit the implementation effort,
1914our current compiler for $\Clight$ does {\em not} cover the operators
1915relating to the floating point type {\tt float}. So, in a nutshell,
1916the fragment of $\C$ we have implemented is $\Clight$ without
1917floating point.
1918
1919
1920
1921\begin{figure}
1922  \label{syntax-clight}
1923  \footnotesize{
1924  \begin{tabular}{l l l l}
1925    Expressions: & $a$ ::= & $id$               & variable identifier \\
1926    & & $|$ $n$                                 & integer constant \\
1927%    & & $|$ $f$                                & float constant \\
1928    & & $|$ \texttt{sizeof}($\tau$)             & size of a type \\
1929    & & $|$ $op_1$ $a$                          & unary arithmetic operation \\
1930    & & $|$ $a$ $op_2$ $a$                      & binary arithmetic operation \\
1931    & & $|$ $*a$                                & pointer dereferencing \\
1932    & & $|$ $a.id$                              & field access \\
1933    & & $|$ $\&a$                               & taking the address of \\
1934    & & $|$ $(\tau)a$                           & type cast \\
1935    & & $|$ $a ? a : a$                         & conditional expression \\[10pt]
1936    Statements: & $s$ ::= & \texttt{skip}       & empty statement \\
1937    & & $|$ $a=a$                               & assignment \\
1938    & & $|$ $a=a(a^*)$                          & function call \\
1939    & & $|$ $a(a^*)$                            & procedure call \\
1940    & & $|$ $s;s$                               & sequence \\
1941    & & $|$ \texttt{if} $a$ \texttt{then} $s$ \texttt{else} $s$ & conditional \\
1942    & & $|$ \texttt{switch} $a$ $sw$            & multi-way branch \\
1943    & & $|$ \texttt{while} $a$ \texttt{do} $s$  & ``while'' loop \\
1944    & & $|$ \texttt{do} $s$ \texttt{while} $a$  & ``do'' loop \\
1945    & & $|$ \texttt{for}($s$,$a$,$s$) $s$       & ``for'' loop\\
1946    & & $|$ \texttt{break}                      & exit from current loop \\
1947    & & $|$ \texttt{continue}                   & next iteration of the current loop \\
1948    & & $|$ \texttt{return} $a^?$               & return from current function \\
1949    & & $|$ \texttt{goto} $lbl$                 & branching \\
1950    & & $|$ $lbl$ : $s$                         & labelled statement \\[10pt]
1951    Switch cases: & $sw$ ::= & \texttt{default} : $s$           & default case \\
1952    & & $|$ \texttt{case } $n$ : $s;sw$                         & labelled case \\[10pt]
1953    Variable declarations: & $dcl$ ::= & $(\tau\quad id)^*$             & type and name\\[10pt]
1954    Functions: & $Fd$ ::= & $\tau$ $id(dcl)\{dcl;s\}$   & internal function \\
1955    & & $|$ \texttt{extern} $\tau$ $id(dcl)$                    & external function\\[10pt]
1956    Programs: & $P$ ::= & $dcl;Fd^*;\texttt{main}=id$           & global variables, functions, entry point
1957  \end{tabular}}
1958  \caption{Syntax of the $\Clight$ language}
1959\end{figure}
1960
1961
1962\subsection{$\Cminor$}
1963
1964$\Cminor$ is a simple, low-level imperative language, comparable to a
1965stripped-down, typeless variant of $\C$. Again we refer
1966to the $\compcert$ project for its formal definition
1967and we just recall in figure \ref{syntax-cminor}
1968its syntax which as for $\Clight$ is structured in
1969expressions, statements, functions, and whole programs.
1970
1971\begin{figure}
1972  \label{syntax-cminor}
1973  \footnotesize{
1974  \begin{tabular}{l l l l}
1975    Signatures: & $sig$ ::= & \texttt{sig} $\vec{\texttt{int}}$ $(\texttt{int}|\texttt{void})$ & arguments and result \\[10pt]
1976    Expressions: & $a$ ::= & $id$               & local variable \\
1977     & & $|$ $n$                                & integer constant \\
1978%     & & $|$ $f$                               & float constant \\
1979     & & $|$ \texttt{addrsymbol}($id$)          & address of global symbol \\
1980     & & $|$ \texttt{addrstack}($\delta$)       & address within stack data \\
1981     & & $|$ $op_1$ $a$                         & unary arithmetic operation \\
1982     & & $|$ $op_2$ $a$ $a$                     & binary arithmetic operation \\
1983     & & $|$ $\kappa[a]$                        & memory read\\
1984     & & $|$ $a?a:a$                    & conditional expression \\[10pt]
1985    Statements: & $s$ ::= & \texttt{skip}       & empty statement \\
1986    & & $|$ $id=a$                              & assignment \\
1987    & & $|$ $\kappa[a]=a$                       & memory write \\
1988    & & $|$ $id^?=a(\vec{a}):sig$               & function call \\
1989    & & $|$ \texttt{tailcall} $a(\vec{a}):sig$  & function tail call \\
1990    & & $|$ \texttt{return}$(a^?)$              & function return \\
1991    & & $|$ $s;s$                               & sequence \\
1992    & & $|$ \texttt{if} $a$ \texttt{then} $s$ \texttt{else} $s$         & conditional  \\
1993    & & $|$ \texttt{loop} $s$                   & infinite loop \\
1994    & & $|$ \texttt{block} $s$          & block delimiting \texttt{exit} constructs \\
1995    & & $|$ \texttt{exit} $n$                   & terminate the $(n+1)^{th}$ enclosing block \\
1996    & & $|$ \texttt{switch} $a$ $tbl$           & multi-way test and exit\\
1997    & & $|$ $lbl:s$                             & labelled statement \\
1998    & & $|$ \texttt{goto} $lbl$                 & jump to a label\\[10pt]
1999    Switch tables: & $tbl$ ::= & \texttt{default:exit}($n$) & \\
2000    & & $|$ \texttt{case} $i$: \texttt{exit}($n$);$tbl$ & \\[10pt]
2001    Functions: & $Fd$ ::= & \texttt{internal} $sig$ $\vec{id}$ $\vec{id}$ $n$ $s$ & internal function: signature, parameters, \\
2002    & & & local variables, stack size and body \\
2003    & & $|$ \texttt{external} $id$ $sig$ & external function \\[10pt]
2004    Programs: & $P$ ::= & \texttt{prog} $(id=data)^*$ $(id=Fd)^*$ $id$   & global variables, functions and entry point
2005  \end{tabular}}
2006  \caption{Syntax of the $\Cminor$ language}
2007\end{figure}
2008
2009\paragraph{Translation of $\Clight$ to $\Cminor$.}
Since stack operations are explicit in $\Cminor$, one has to know
which variables are stored on the stack. This
information is produced by a static analysis that determines
the variables whose address may be `taken'.
Space is also reserved for local arrays and structures. In a
second step, the proper compilation is performed: it consists mainly
in translating $\Clight$ control structures to the basic
ones available in $\Cminor$.
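
For instance (an illustration of ours), in the function below only
\codeex{y} needs a slot in the stack data, because its address is taken:

{\footnotesize
\begin{verbatim}
int f(void)
{
    int x = 1;     /* address never taken: may stay in a local   */
    int y = 2;     /* &y occurs below: y goes to the stack data  */
    int *p = &y;
    *p = x + 1;
    return x + y;  /* returns 3 */
}
\end{verbatim}}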
2018
2019\subsection{$\RTLAbs$}
2020
$\RTLAbs$ is the last architecture-independent language in the
2022compilation process.  It is a rather straightforward {\em abstraction} of
2023the {\em architecture-dependent} $\RTL$ intermediate language
2024available in the $\compcert$ project and it is intended to factorize
2025some work common to the various target assembly languages
2026(e.g. optimizations) and thus to make retargeting of the compiler a
2027simpler matter.
2028
2029We stress that in $\RTLAbs$ the structure of $\Cminor$ expressions
2030is lost and that this may have a negative impact on the following
2031instruction selection step.
Still, the subtleties of instruction selection seem rather orthogonal
to our goals and we deem the ability to retarget the compiler easily
more important than the efficiency of the generated code.
2035
2036
2037
2038\paragraph{Syntax.}
2039In $\RTLAbs$, programs are represented as \emph{control flow
2040  graphs} (CFGs for short). We associate with the nodes of the graphs
2041instructions reflecting the $\Cminor$ commands. As usual, commands that change the control
2042flow of the program (e.g. loops, conditionals) are translated by inserting
2043suitable branching instructions in the CFG.
2044The syntax of the language is depicted in table
2045\ref{RTLAbs:syntax}. Local variables are now represented by \emph{pseudo
2046  registers} that are available in unbounded number. The grammar rule $\w{op}$ that is
2047not detailed in table \ref{RTLAbs:syntax} defines usual arithmetic and boolean
2048operations (\codeex{+}, \codeex{xor}, \codeex{$\le$}, etc.) as well as constants
2049and conversions between sized integers.
2050
2051\begin{table}
2052{\footnotesize
2053\[
2054\begin{array}{lllllll}
2055\w{return\_type} & ::= & \s{int} \Alt \s{void} & \qquad
2056\w{signature} & ::= & (\s{int} \rightarrow)^*\ \w{return\_type}\\
2057\end{array}
2058\]
2059
2060\[
2061\begin{array}{lllllll}
2062\w{memq} & ::= & \s{int8s} \Alt \s{int8u} \Alt \s{int16s} \Alt \s{int16u}
2063                 \Alt \s{int32} & \qquad
2064\w{fun\_ref} & ::= & \w{fun\_name} \Alt \w{psd\_reg}
2065\end{array}
2066\]
2067
2068\[
2069\begin{array}{llll}
2070\w{instruction} & ::= & \Alt \s{skip} \rightarrow \w{node} &
2071                        \quad \mbox{(no instruction)}\\
2072                &     & \Alt \w{psd\_reg} := \w{op}(\w{psd\_reg}^*)
2073                        \rightarrow \w{node} & \quad \mbox{(operation)}\\
2074                &     & \Alt \w{psd\_reg} := \s{\&}\w{var\_name}
2075                        \rightarrow \w{node} &
2076                        \quad \mbox{(address of a global)}\\
2077                &     & \Alt \w{psd\_reg} := \s{\&locals}[n]
2078                        \rightarrow \w{node} &
2079                        \quad \mbox{(address of a local)}\\
2080                &     & \Alt \w{psd\_reg} := \w{fun\_name}
2081                        \rightarrow \w{node} &
2082                        \quad \mbox{(address of a function)}\\
2083                &     & \Alt \w{psd\_reg} :=
2084                        \w{memq}(\w{psd\_reg}[\w{psd\_reg}])
2085                        \rightarrow \w{node} & \quad \mbox{(memory load)}\\
2086                &     & \Alt \w{memq}(\w{psd\_reg}[\w{psd\_reg}]) :=
2087                        \w{psd\_reg}
2088                        \rightarrow \w{node} & \quad \mbox{(memory store)}\\
2089                &     & \Alt \w{psd\_reg} := \w{fun\_ref}(\w{psd\_reg^*}) :
2090                        \w{signature} \rightarrow \w{node} &
2091                        \quad \mbox{(function call)}\\
2092                &     & \Alt \w{fun\_ref}(\w{psd\_reg^*}) : \w{signature}
2093                        & \quad \mbox{(function tail call)}\\
2094                &     & \Alt \s{test}\ \w{op}(\w{psd\_reg}^*) \rightarrow
2095                        \w{node}, \w{node} & \quad \mbox{(branch)}\\
2096                &     & \Alt \s{return}\ \w{psd\_reg}? & \quad \mbox{(return)}
2097\end{array}
2098\]
2099
2100\[
2101\begin{array}{lll}
2102\w{fun\_def} & ::= & \w{fun\_name}(\w{psd\_reg}^*): \w{signature}\\
2103             &     & \s{result:} \w{psd\_reg}?\\
2104             &     & \s{locals:} \w{psd\_reg}^*\\
2105             &     & \s{stack:} n\\
2106             &     & \s{entry:} \w{node}\\
2107             &     & \s{exit:} \w{node}\\
2108             &     & (\w{node:} \w{instruction})^*
2109\end{array}
2110\]
2111
2112\[
2113\begin{array}{lllllll}
2114\w{init\_datum} & ::= & \s{reserve}(n) \Alt \s{int8}(n) \Alt \s{int16}(n)
2115                       \Alt \s{int32}(n) & \qquad
2116\w{init\_data} & ::= & \w{init\_datum}^+
2117\end{array}
2118\]
2119
2120\[
2121\begin{array}{lllllll}
2122\w{global\_decl} & ::= & \s{var}\ \w{var\_name} \w{\{init\_data\}} & \qquad
2123\w{fun\_decl} & ::= & \s{extern}\ \w{fun\_name} \w{(signature)} \Alt
2124                      \w{fun\_def}
2125\end{array}
2126\]
2127
2128\[
2129\begin{array}{lll}
2130\w{program} & ::= & \w{global\_decl}^*\\
2131            &     & \w{fun\_decl}^*
2132\end{array}
2133\]}
2134\caption{Syntax of the $\RTLAbs$ language}\label{RTLAbs:syntax}
2135\end{table}
2136
2137\paragraph{Translation of $\Cminor$ to $\RTLAbs$.}
Translating $\Cminor$ programs to $\RTLAbs$ programs mainly consists in
transforming $\Cminor$ commands into CFGs. Most commands are sequential and have a
rather straightforward linear translation. A conditional is translated into a
branch instruction; a loop is translated using a back edge in the CFG.
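
Schematically (an example of ours), a loop such as
\codeex{while (x < 10) x = x + 1;} could give rise to a CFG of the
following shape, where $r_1$ holds \codeex{x} and $r_2$, $r_3$ have been
loaded with the constants $10$ and $1$:

{\footnotesize
\[
\begin{array}{lll}
n_1: & \s{test}\ \w{lt}(r_1, r_2) \rightarrow n_2, n_3 & \quad \mbox{(loop guard)}\\
n_2: & r_1 := \w{add}(r_1, r_3) \rightarrow n_1 & \quad \mbox{(body, back edge to the guard)}\\
n_3: & \ldots & \quad \mbox{(code following the loop)}
\end{array}
\]}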
2142
2143
2144
2145\subsection{$\RTL$}
2146
2147As in $\RTLAbs$, the structure of $\RTL$ programs is based on CFGs. $\RTL$ is
the first architecture-dependent intermediate language of our compiler
2149which, in its current version, targets the $\mips$ assembly language.
2150
2151\paragraph{Syntax.}
$\RTL$ is very close to $\RTLAbs$. It is based on CFGs and makes explicit the $\mips$
instructions corresponding to the $\RTLAbs$ instructions. Type information
disappears: everything is represented using 32-bit integers. Moreover, each
global of the program is associated with an offset. The syntax of the language can
2156be found in table \ref{RTL:syntax}. The grammar rules $\w{unop}$, $\w{binop}$,
2157$\w{uncon}$, and $\w{bincon}$, respectively, represent the sets of unary
2158operations, binary operations, unary conditions and binary conditions of the
2159$\mips$ language.
2160
2161\begin{table}
2162{\footnotesize
2163\[
2164\begin{array}{lllllll}
2165\w{size} & ::= & \s{Byte} \Alt \s{HalfWord} \Alt \s{Word}  & \qquad
2166\w{fun\_ref} & ::= & \w{fun\_name} \Alt \w{psd\_reg}
2167\end{array}
2168\]
2169
2170\[
2171\begin{array}{llll}
2172\w{instruction} & ::= & \Alt \s{skip} \rightarrow \w{node} &
2173                        \quad \mbox{(no instruction)}\\
2174                &     & \Alt \w{psd\_reg} := n
2175                        \rightarrow \w{node} & \quad \mbox{(constant)}\\
2176                &     & \Alt \w{psd\_reg} := \w{unop}(\w{psd\_reg})
2177                        \rightarrow \w{node} & \quad \mbox{(unary operation)}\\
2178                &     & \Alt \w{psd\_reg} :=
2179                        \w{binop}(\w{psd\_reg},\w{psd\_reg})
2180                        \rightarrow \w{node} & \quad
2181                        \mbox{(binary operation)}\\
2182                &     & \Alt \w{psd\_reg} := \s{\&globals}[n]
2183                        \rightarrow \w{node} &
2184                        \quad \mbox{(address of a global)}\\
2185                &     & \Alt \w{psd\_reg} := \s{\&locals}[n]
2186                        \rightarrow \w{node} &
2187                        \quad \mbox{(address of a local)}\\
2188                &     & \Alt \w{psd\_reg} := \w{fun\_name}
2189                        \rightarrow \w{node} &
2190                        \quad \mbox{(address of a function)}\\
2191                &     & \Alt \w{psd\_reg} :=
2192                        \w{size}(\w{psd\_reg}[n])
2193                        \rightarrow \w{node} & \quad \mbox{(memory load)}\\
2194                &     & \Alt \w{size}(\w{psd\_reg}[n]) :=
2195                        \w{psd\_reg}
2196                        \rightarrow \w{node} & \quad \mbox{(memory store)}\\
2197                &     & \Alt \w{psd\_reg} := \w{fun\_ref}(\w{psd\_reg^*})
2198                        \rightarrow \w{node} &
2199                        \quad \mbox{(function call)}\\
2200                &     & \Alt \w{fun\_ref}(\w{psd\_reg^*})
2201                        & \quad \mbox{(function tail call)}\\
2202                &     & \Alt \s{test}\ \w{uncon}(\w{psd\_reg}) \rightarrow
2203                        \w{node}, \w{node} & \quad
2204                        \mbox{(branch unary condition)}\\
2205                &     & \Alt \s{test}\ \w{bincon}(\w{psd\_reg},\w{psd\_reg})
2206                        \rightarrow \w{node}, \w{node} & \quad
2207                        \mbox{(branch binary condition)}\\
2208                &     & \Alt \s{return}\ \w{psd\_reg}? & \quad
2209                        \mbox{(return)}
2210\end{array}
2211\]
2212
2213\[
2214\begin{array}{lllllll}
2215\w{fun\_def} & ::= & \w{fun\_name}(\w{psd\_reg}^*) & \qquad
2216\w{program} & ::= & \s{globals}: n\\
2217             &     & \s{result:} \w{psd\_reg}? & \qquad
2218            &     & \w{fun\_def}^*\\
2219             &     & \s{locals:} \w{psd\_reg}^*\\
2220             &     & \s{stack:} n\\
2221             &     & \s{entry:} \w{node}\\
2222             &     & \s{exit:} \w{node}\\
2223             &     & (\w{node:} \w{instruction})^*
2224\end{array}
2225\]}
2226\caption{Syntax of the $\RTL$ language}\label{RTL:syntax}
2227\end{table}
2228
2229\paragraph{Translation of $\RTLAbs$ to $\RTL$.}
2230This translation is mostly straightforward. A $\RTLAbs$ instruction is often
2231directly translated to a corresponding $\mips$ instruction. There are a few
exceptions: some $\RTLAbs$ instructions are expanded into two or more $\mips$
instructions. When the translation of a $\RTLAbs$ instruction requires more than
a few simple $\mips$ instructions, it is translated into a call to a function
defined in the preamble of the compilation result.
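
For instance (operator and function names are ours), a conversion
between sized integers, which would need more than a few $\mips$
instructions, is turned into a call to a conversion function defined in
the preamble:

{\footnotesize
\[
\begin{array}{ll}
\mbox{$\RTLAbs$:} & r_1 := \w{int8u\_to\_int32}(r_2) \rightarrow n \\
\mbox{$\RTL$:} & r_1 := \w{conv\_8u\_32}(r_2) \rightarrow n \qquad \mbox{(call to a preamble function)}
\end{array}
\]}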
2236
2237\subsection{$\ERTL$}
2238
As in $\RTL$, the structure of $\ERTL$ programs is based on CFGs. $\ERTL$
makes explicit the calling conventions of the $\mips$ assembly language.
2241
2242\paragraph{Syntax.}
2243The syntax of the language is given in table \ref{ERTL:syntax}. The main
2244difference between $\RTL$ and $\ERTL$ is the use of hardware registers.
Parameters are passed in specific hardware registers; if there are too many
parameters, the remaining ones are stored on the stack. Other hardware
registers are reserved by convention: a register that holds the result of a function, a
register that holds the base address of the globals, a register that holds the
address of the top of the stack, and some registers that need to be saved when
entering a function and whose values are restored when leaving a
function. Following these conventions, function calls do not list their
parameters anymore; they only mention their number. Two new instructions appear
to allocate and deallocate on the stack the space needed by a function to
execute. Along with these two instructions come two instructions to fetch or
assign a value in the parameter sections of the stack; these instructions cannot
yet be translated using regular load and store instructions because we do not
know the final size of the stack area of each function. Finally, the return
instruction has a boolean argument that tells whether the result of the function
may later be used or not (this is exploited for optimizations).
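
For instance (a sketch of ours), a one-parameter call $r_2 := f(r_1)$
coming from $\RTL$ could be expanded as follows, writing $A_0$ and $V_0$
for the hardware registers that, by convention, carry the first
parameter and the result (\codeex{\$a0} and \codeex{\$v0} on $\mips$):

{\footnotesize
\[
\begin{array}{lll}
n_1: & A_0 := r_1 \rightarrow n_2 & \quad \mbox{(pseudo to hardware)}\\
n_2: & f(1) \rightarrow n_3 & \quad \mbox{(call with $1$ parameter)}\\
n_3: & r_2 := V_0 \rightarrow n_4 & \quad \mbox{(hardware to pseudo)}
\end{array}
\]}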
2260
2261\begin{table}
2262{\footnotesize
2263\[
2264\begin{array}{lllllll}
2265\w{size} & ::= & \s{Byte} \Alt \s{HalfWord} \Alt \s{Word}  & \qquad
2266\w{fun\_ref} & ::= & \w{fun\_name} \Alt \w{psd\_reg}
2267\end{array}
2268\]
2269
2270\[
2271\begin{array}{llll}
2272\w{instruction} & ::= & \Alt \s{skip} \rightarrow \w{node} &
2273                        \quad \mbox{(no instruction)}\\
2274                &     & \Alt \s{NewFrame} \rightarrow \w{node} &
2275                        \quad \mbox{(frame creation)}\\
2276                &     & \Alt \s{DelFrame} \rightarrow \w{node} &
2277                        \quad \mbox{(frame deletion)}\\
2278                &     & \Alt \w{psd\_reg} := \s{stack}[\w{slot}, n]
2279                        \rightarrow \w{node} &
2280                        \quad \mbox{(stack load)}\\
2281                &     & \Alt \s{stack}[\w{slot}, n] := \w{psd\_reg}
2282                        \rightarrow \w{node} &
2283                        \quad \mbox{(stack store)}\\
2284                &     & \Alt \w{hdw\_reg} := \w{psd\_reg}
2285                        \rightarrow \w{node} &
2286                        \quad \mbox{(pseudo to hardware)}\\
2287                &     & \Alt \w{psd\_reg} := \w{hdw\_reg}
2288                        \rightarrow \w{node} &
2289                        \quad \mbox{(hardware to pseudo)}\\
2290                &     & \Alt \w{psd\_reg} := n
2291                        \rightarrow \w{node} & \quad \mbox{(constant)}\\
2292                &     & \Alt \w{psd\_reg} := \w{unop}(\w{psd\_reg})
2293                        \rightarrow \w{node} & \quad \mbox{(unary operation)}\\
2294                &     & \Alt \w{psd\_reg} :=
2295                        \w{binop}(\w{psd\_reg},\w{psd\_reg})
2296                        \rightarrow \w{node} & \quad
2297                        \mbox{(binary operation)}\\
2298                &     & \Alt \w{psd\_reg} := \w{fun\_name}
2299                        \rightarrow \w{node} &
2300                        \quad \mbox{(address of a function)}\\
2301                &     & \Alt \w{psd\_reg} :=
2302                        \w{size}(\w{psd\_reg}[n])
2303                        \rightarrow \w{node} & \quad \mbox{(memory load)}\\
2304                &     & \Alt \w{size}(\w{psd\_reg}[n]) :=
2305                        \w{psd\_reg}
2306                        \rightarrow \w{node} & \quad \mbox{(memory store)}\\
2307                &     & \Alt \w{fun\_ref}(n) \rightarrow \w{node} &
2308                        \quad \mbox{(function call)}\\
2309                &     & \Alt \w{fun\_ref}(n)
2310                        & \quad \mbox{(function tail call)}\\
2311                &     & \Alt \s{test}\ \w{uncon}(\w{psd\_reg}) \rightarrow
2312                        \w{node}, \w{node} & \quad
2313                        \mbox{(branch unary condition)}\\
2314                &     & \Alt \s{test}\ \w{bincon}(\w{psd\_reg},\w{psd\_reg})
2315                        \rightarrow \w{node}, \w{node} & \quad
2316                        \mbox{(branch binary condition)}\\
2317                &     & \Alt \s{return}\ b & \quad \mbox{(return)}
2318\end{array}
2319\]
2320
2321\[
2322\begin{array}{lllllll}
2323\w{fun\_def} & ::= & \w{fun\_name}(n) & \qquad
2324\w{program} & ::= & \s{globals}: n\\
2325             &     & \s{locals:} \w{psd\_reg}^* & \qquad
2326            &     & \w{fun\_def}^*\\
2327             &     & \s{stack:} n\\
2328             &     & \s{entry:} \w{node}\\
2329             &     & (\w{node:} \w{instruction})^*
2330\end{array}
2331\]}
2332\caption{Syntax of the $\ERTL$ language}\label{ERTL:syntax}
2333\end{table}
2334
2335\paragraph{Translation of $\RTL$ to $\ERTL$.}
The work consists in making explicit the conventions previously mentioned. These
2337conventions appear when entering, calling and leaving a function, and when
2338referencing a global variable or the address of a local variable.
2339
2340\paragraph{Optimizations.}
2341A \emph{liveness analysis} is performed on $\ERTL$ to replace unused
2342instructions by a $\s{skip}$. An instruction is tagged as unused when it
2343performs an assignment on a register that will not be read afterwards. Also, the
result of the liveness analysis is exploited by a \emph{register allocation}
algorithm that efficiently associates a physical location (a
hardware register or an address in the stack) with each pseudo register of the
program.
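
As a schematic example (ours), an assignment to a pseudo register that
is never read afterwards is dead and is replaced:

{\footnotesize
\[
\begin{array}{lll}
n_1: r_1 := \w{add}(r_2, r_3) \rightarrow n_2 & \qquad \mbox{becomes} \qquad & n_1: \s{skip} \rightarrow n_2
\end{array}
\]}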
2348
2349\subsection{$\LTL$}
2350
2351As in $\ERTL$, the structure of $\LTL$ programs is based on CFGs. Pseudo
2352registers are not used anymore; instead, they are replaced by physical locations
2353(a hardware register or an address in the stack).
2354
2355\paragraph{Syntax.}
With a few exceptions, the instructions of the language are those of
$\ERTL$ with hardware registers replacing pseudo registers. Calling and
returning conventions were made explicit in $\ERTL$; thus, function calls and
returns do not need parameters in $\LTL$. The syntax is defined in table
2360\ref{LTL:syntax}.
2361
2362\begin{table}
2363{\footnotesize
2364\[
2365\begin{array}{lllllll}
2366\w{size} & ::= & \s{Byte} \Alt \s{HalfWord} \Alt \s{Word}  & \qquad
2367\w{fun\_ref} & ::= & \w{fun\_name} \Alt \w{hdw\_reg}
2368\end{array}
2369\]
2370
2371\[
2372\begin{array}{llll}
2373\w{instruction} & ::= & \Alt \s{skip} \rightarrow \w{node} &
2374                        \quad \mbox{(no instruction)}\\
2375                &     & \Alt \s{NewFrame} \rightarrow \w{node} &
2376                        \quad \mbox{(frame creation)}\\
2377                &     & \Alt \s{DelFrame} \rightarrow \w{node} &
2378                        \quad \mbox{(frame deletion)}\\
2379                &     & \Alt \w{hdw\_reg} := n
2380                        \rightarrow \w{node} & \quad \mbox{(constant)}\\
2381                &     & \Alt \w{hdw\_reg} := \w{unop}(\w{hdw\_reg})
2382                        \rightarrow \w{node} & \quad \mbox{(unary operation)}\\
2383                &     & \Alt \w{hdw\_reg} :=
2384                        \w{binop}(\w{hdw\_reg},\w{hdw\_reg})
2385                        \rightarrow \w{node} & \quad
2386                        \mbox{(binary operation)}\\
2387                &     & \Alt \w{hdw\_reg} := \w{fun\_name}
2388                        \rightarrow \w{node} &
2389                        \quad \mbox{(address of a function)}\\
2390                &     & \Alt \w{hdw\_reg} := \w{size}(\w{hdw\_reg}[n])
2391                        \rightarrow \w{node} & \quad \mbox{(memory load)}\\
2392                &     & \Alt \w{size}(\w{hdw\_reg}[n]) := \w{hdw\_reg}
2393                        \rightarrow \w{node} & \quad \mbox{(memory store)}\\
2394                &     & \Alt \w{fun\_ref}() \rightarrow \w{node} &
2395                        \quad \mbox{(function call)}\\
2396                &     & \Alt \w{fun\_ref}()
2397                        & \quad \mbox{(function tail call)}\\
2398                &     & \Alt \s{test}\ \w{uncon}(\w{hdw\_reg}) \rightarrow
2399                        \w{node}, \w{node} & \quad
2400                        \mbox{(branch unary condition)}\\
2401                &     & \Alt \s{test}\ \w{bincon}(\w{hdw\_reg},\w{hdw\_reg})
2402                        \rightarrow \w{node}, \w{node} & \quad
2403                        \mbox{(branch binary condition)}\\
2404                &     & \Alt \s{return} & \quad \mbox{(return)}
2405\end{array}
2406\]
2407
2408\[
2409\begin{array}{lllllll}
2410\w{fun\_def} & ::= & \w{fun\_name}(n) & \qquad
2411\w{program} & ::= & \s{globals}: n\\
2412             &     & \s{locals:} n & \qquad
2413            &     & \w{fun\_def}^*\\
2414             &     & \s{stack:} n\\
2415             &     & \s{entry:} \w{node}\\
2416             &     & (\w{node:} \w{instruction})^*
2417\end{array}
2418\]}
2419\caption{Syntax of the $\LTL$ language}\label{LTL:syntax}
2420\end{table}
2421
2422\paragraph{Translation of $\ERTL$ to $\LTL$.} The translation relies on the
2423results of the liveness analysis and of the register allocation. Unused
2424instructions are eliminated and each pseudo register is replaced by a physical
2425location. In $\LTL$, the size of the stack frame of a function is known;
2426instructions intended to load or store values in the stack are translated
2427using regular load and store instructions.
2428
2429\paragraph{Optimizations.} A \emph{graph compression} algorithm removes empty
2430instructions generated by previous compilation passes and by the liveness
2431analysis.
2432
2433\subsection{$\LIN$}
2434
2435In $\LIN$, the structure of a program is no longer based on CFGs. Every function
2436is represented as a sequence of instructions.
2437
2438\paragraph{Syntax.}
2439The instructions of $\LIN$ are very close to those of $\LTL$. \emph{Program
2440  labels}, \emph{gotos} and branch instructions handle the changes in the
2441control flow. The syntax of $\LIN$ programs is shown in table \ref{LIN:syntax}.
2442
2443\begin{table}
2444{\footnotesize
2445\[
2446\begin{array}{lllllll}
2447\w{size} & ::= & \s{Byte} \Alt \s{HalfWord} \Alt \s{Word}  & \qquad
2448\w{fun\_ref} & ::= & \w{fun\_name} \Alt \w{hdw\_reg}
2449\end{array}
2450\]
2451
2452\[
2453\begin{array}{llll}
2454\w{instruction} & ::= & \Alt \s{NewFrame} &
2455                        \quad \mbox{(frame creation)}\\
2456                &     & \Alt \s{DelFrame} &
2457                        \quad \mbox{(frame deletion)}\\
2458                &     & \Alt \w{hdw\_reg} := n & \quad \mbox{(constant)}\\
2459                &     & \Alt \w{hdw\_reg} := \w{unop}(\w{hdw\_reg})
2460                        & \quad \mbox{(unary operation)}\\
2461                &     & \Alt \w{hdw\_reg} :=
2462                        \w{binop}(\w{hdw\_reg},\w{hdw\_reg})
2463                        & \quad \mbox{(binary operation)}\\
2464                &     & \Alt \w{hdw\_reg} := \w{fun\_name}
2465                        & \quad \mbox{(address of a function)}\\
2466                &     & \Alt \w{hdw\_reg} := \w{size}(\w{hdw\_reg}[n])
2467                        & \quad \mbox{(memory load)}\\
2468                &     & \Alt \w{size}(\w{hdw\_reg}[n]) := \w{hdw\_reg}
2469                        & \quad \mbox{(memory store)}\\
2470                &     & \Alt \s{call}\ \w{fun\_ref}
2471                        & \quad \mbox{(function call)}\\
2472                &     & \Alt \s{tailcall}\ \w{fun\_ref}
2473                        & \quad \mbox{(function tail call)}\\
2474                &     & \Alt \w{uncon}(\w{hdw\_reg}) \rightarrow
2475                        \w{node} & \quad
2476                        \mbox{(branch unary condition)}\\
2477                &     & \Alt \w{bincon}(\w{hdw\_reg},\w{hdw\_reg})
2478                        \rightarrow \w{node} & \quad
2479                        \mbox{(branch binary condition)}\\
2480                &     & \Alt \w{mips\_label:} & \quad \mbox{($\mips$ label)}\\
2481                &     & \Alt \s{goto}\ \w{mips\_label} & \quad \mbox{(goto)}\\
2482                &     & \Alt \s{return} & \quad \mbox{(return)}
2483\end{array}
2484\]
2485
2486\[
2487\begin{array}{lllllll}
2488\w{fun\_def} & ::= & \w{fun\_name}(n) & \qquad
2489\w{program} & ::= & \s{globals}: n\\
2490             &     & \s{locals:} n & \qquad
2491            &     & \w{fun\_def}^*\\
2492             &     & \w{instruction}^*
2493\end{array}
2494\]}
2495\caption{Syntax of the $\LIN$ language}\label{LIN:syntax}
2496\end{table}
2497
2498\paragraph{Translation of $\LTL$ to $\LIN$.}
This translation amounts to transforming, in an efficient way, the graph structure of
functions into a linear sequence of instructions.
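
Schematically (an example of ours), an $\LTL$ branch is linearised by
placing one successor right after the branch instruction and reaching
the other through a label, writing $A_0$, $A_1$ for two hardware
registers and $l_1$ for a fresh $\mips$ label:

{\footnotesize
\[
\begin{array}{ll}
 & \w{eq}(A_0, A_1) \rightarrow l_1 \qquad \mbox{(jump to $l_1$ when the condition holds)}\\
 & \mbox{(code of the fall-through successor, ending with a \emph{goto})}\\
l_1: & \mbox{(code of the other successor)}
\end{array}
\]}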
2501
2502\subsection{$\mips$}
2503
2504$\mips$ is a rather simple assembly language. As for other assembly languages, a
2505program in $\mips$ is a sequence of instructions. The $\mips$ code produced by
2506the compilation of a $\Clight$ program starts with a preamble in which some
useful and non-primitive functions are predefined (e.g. conversion from 8-bit
unsigned integers to 32-bit integers). The subset of the $\mips$ assembly
2509language that the compilation produces is defined in table \ref{mips:syntax}.
2510
2511\begin{table}
2512{\footnotesize
2513\[
2514\begin{array}{lllllllllll}
2515\w{load} & ::= & \s{lb} \Alt \s{lhw} \Alt \s{lw} & \qquad
2516\w{store} & ::= & \s{sb} \Alt \s{shw} \Alt \s{sw} & \qquad
2517\w{fun\_ref} & ::= & \w{fun\_name} \Alt \w{hdw\_reg}
2518\end{array}
2519\]
2520
2521\[
2522\begin{array}{llll}
2523\w{instruction} & ::= & \Alt \s{nop} & \quad \mbox{(empty instruction)}\\
2524                &     & \Alt \s{li}\ \w{hdw\_reg}, n & \quad \mbox{(constant)}\\
2525                &     & \Alt \w{unop}\ \w{hdw\_reg}, \w{hdw\_reg}
2526                        & \quad \mbox{(unary operation)}\\
2527                &     & \Alt \w{binop}\
2528                        \w{hdw\_reg},\w{hdw\_reg},\w{hdw\_reg}
2529                        & \quad \mbox{(binary operation)}\\
2530                &     & \Alt \s{la}\ \w{hdw\_reg}, \w{fun\_name}
2531                        & \quad \mbox{(address of a function)}\\
2532                &     & \Alt \w{load}\ \w{hdw\_reg}, n(\w{hdw\_reg})
2533                        & \quad \mbox{(memory load)}\\
2534                &     & \Alt \w{store}\ \w{hdw\_reg}, n(\w{hdw\_reg})
2535                        & \quad \mbox{(memory store)}\\
2536                &     & \Alt \s{call}\ \w{fun\_ref}
2537                        & \quad \mbox{(function call)}\\
2538                &     & \Alt \w{uncon}\ \w{hdw\_reg}, \w{node} & \quad
2539                        \mbox{(branch unary condition)}\\
2540                &     & \Alt \w{bincon}\ \w{hdw\_reg},\w{hdw\_reg},\w{node}
2541                        & \quad \mbox{(branch binary condition)}\\
2542                &     & \Alt \w{mips\_label:} & \quad \mbox{($\mips$ label)}\\
2543                &     & \Alt \s{j}\ \w{mips\_label} & \quad \mbox{(goto)}\\
2544                &     & \Alt \s{return} & \quad \mbox{(return)}
2545\end{array}
2546\]
2547
2548\[
2549\begin{array}{lll}
2550\w{program} & ::= & \s{globals}: n\\
2551            &     & \s{entry}: \w{mips\_label}^*\\
2552            &     & \w{instruction}^*
2553\end{array}
2554\]}
2555\caption{Syntax of the $\mips$ language}\label{mips:syntax}
2556\end{table}
2557
\paragraph{Translation of $\LIN$ to $\mips$.} This final translation is simple
enough. Stack allocation and deallocation are made explicit and the function
definitions are sequentialized.
2561
2562
2563
2564%\newpage
2565
2566\section{Labelling approach for the $\C$ compiler}\label{C-label-sec}
2567
2568This section informally describes the labelled extensions of the languages in
2569the compilation chain, the way the labels are propagated by the compilation
2570functions, the labelling of the source code, the hypotheses on the control flow
2571of the labelled $\mips$ code and the verification that we perform on it, the way
2572we build the instrumentation, and finally the way the labelling approach has
2573been tested.
2574The process of annotating a $\Clight$ program using the labelling approach is
2575summarized in table \ref{annot-clight-summary} and is detailed in the following sections.
2576
2577\begin{table}
2578{\footnotesize
2579
25801. Label the input $\Clight$ program.\\
2581
25822. Compile the labelled $\Clight$ program in the labelled world. This
2583  produces a labelled $\mips$ code. \\
2584
25853. For each label of the labelled $\mips$ code, compute the cost of the
2586  instructions under its scope and generate a \emph{label-cost mapping}. An
2587  unlabelled $\mips$ code --- the result of the compilation --- is obtained by
2588  removing the labels from the labelled $\mips$ code. \\
2589
25904. Add a fresh \emph{cost variable} to the labelled $\Clight$ program and
2591  replace the labels by an increment of this cost variable according to the
2592  label-cost mapping. The result is an \emph{annotated} $\Clight$ program with no
2593  label.
2594}
2595\caption{Building the annotation of a $\Clight$ program in the labelling approach}\label{annot-clight-summary}
2596\end{table}
2597
2598
2599
2600
2601\subsection{Labelled $\Clight$ and labelled $\Cminor$} 
2602
2603Both the $\Clight$ and $\Cminor$ languages are extended in the same way
2604by labelling both statements and expressions (by comparison, in the
2605toy language $\imp$ we just labelled statements and boolean conditions).
2606The labelling of expressions aims to capture precisely their execution cost.
Indeed, $\Clight$ and $\Cminor$ include expressions such
as $a_1?a_2:a_3$ whose evaluation cost depends on the boolean value of $a_1$.
2609
2610As both languages are extended in the same way, the extended
2611compilation does nothing more than sending $\Clight$ labelled
2612statements and expressions to $\Cminor$ labelled statements and
2613expressions.
2614
2615\subsection{Labels in $\RTLAbs$ and the back-end languages}
The labelled versions of $\RTLAbs$ and of the back-end languages
simply consist in adding a new instruction whose semantics is to emit a label
2618without modifying the state. For the CFG based languages
2619($\RTLAbs$ to $\LTL$), this new instruction is $\s{emit}\ \w{label} \rightarrow
2620\w{node}$. For $\LIN$ and $\mips$, it is $\s{emit}\ \w{label}$. The translation
2621of these label instructions is immediate.
2622In $\mips$, we also
2623rely on a reserved label $\s{begin\_function}$  to pinpoint the
2624beginning of a function code (cf. section \ref{fun-call-sec}).
2625
2626
2627
2628\subsection{Labelling of the source language}
2629
2630The goal here is to add labels in the source program that cover every reachable
2631instruction of the program and avoid unlabelled loops; this can be seen as a \emph{soundness}
2632property. Another important point is \emph{precision}, meaning that a label
2633might cover several paths to the next labels only if those paths have equal
2634costs. Several labellings might satisfy the soundness and precision conditions,
2635but from an engineering point of view, a labelling that makes obvious which
instruction is under the scope of which label would be better. There is a fine
line between too many labels --- which may obfuscate the code --- and
too few labels --- which makes it harder to see which instruction is under the
scope of which label. The balance leans a bit towards the \emph{economy} of
labels because the cost of executing an assembly instruction often depends on
its context (for instance, on the status of the cache memory). We explain our
2642labelling by considering the constructions of $\Clight$ and their compilation to
2643$\mips$.
2644
2645\subsubsection{Sequential instructions}
2646
2647A sequence of $\Clight$ instructions that compile to sequential $\mips$ code,
2648such as a sequence of assignments, can be handled by a single
2649label which covers the unique execution path.
2650The example below illustrates the labelling of `sequential' $\Clight$ instructions.
2651
2652\begin{center}
2653  {\footnotesize
2654    \begin{tabular}{lllll}
2655      $\Clight$ & $\xrightarrow{Labelling}$ & Labelled $\Clight$ &
2656      $\xrightarrow{Compilation}$ & Labelled $\mips$\\\\
2657      \codeex{i = 0;} & & \textbf{\_cost:} & & \codeex{emit} \textbf{\_cost}\\
2658      \codeex{tab[i] = x;} & & \codeex{i = 0;} & & \codeex{li\ \ \ \$v0, 4}\\
2659      \codeex{x++;} & & \codeex{tab[i] = x;} & &
2660      \codeex{mul\ \ \$v0, \$zero, \$v0}\\
2661      & & \codeex{x++;} & & \codeex{add\ \ \$v0, \$a1, \$v0}\\
2662      & & & & \codeex{sw\ \ \ \$a0, 0(\$v0)}\\
2663      & & & & \codeex{li\ \ \ \$v0, 1}\\
2664      & & & & \codeex{add\ \ \$a0, \$a0, \$v0}
2665    \end{tabular}}
2666\end{center}
2667
2668\subsubsection{Ternary expressions}
2669
2670Most $\Clight$ expressions compile to sequential $\mips$ code. There is one
2671exception: {\em ternary expressions} that introduce a branching in the control
2672flow.
2673Because of the precision condition, we must associate a label with each branch.
2674
2675
2676\begin{center}
2677  {\footnotesize
2678    \begin{tabular}{lllll}
2679      $\Clight$ & $\xrightarrow{Labelling}$ & Labelled $\Clight$ &
2680      $\xrightarrow{Compilation}$ & Labelled $\mips$\\\\
2681      \codeex{b ? x+1 :} & &
2682      \codeex{b ?} \textbf{(\_cost1:} \codeex{x+1}\textbf{)} \codeex{:} & &
2683      \codeex{beq\ \ \$a0, \$zero, c\_false}\\
2684      \codeex{\ \ \ \ \ y} & &
2685      \codeex{\ \ \ \ }\textbf{(\_cost2:} \codeex{y}\textbf{)}
2686      & & \codeex{emit} \textbf{\_cost1}\\
2687      & & & & \codeex{li\ \ \ \$v0, 1}\\
2688      & & & & \codeex{add\ \ \$v0, \$a1, \$v0}\\
2689      & & & & \codeex{j\ \ \ \ exit}\\
2690      & & & & \codeex{c\_false:}\\
2691      & & & & \codeex{emit} \textbf{\_cost2}\\
2692      & & & & \codeex{move \$v0, \$a2}\\
2693      & & & & \codeex{exit:}
2694    \end{tabular}}
2695\end{center}
2696
2697\paragraph{Related cases.}
2698The two $\Clight$ boolean operations \texttt{\&\&} and \texttt{||}
2699have a lazy semantics: depending on the evaluation of the first
2700argument, the second one might be evaluated or not.  There is an
2701obvious translation to ternary expressions.  For instance, the
2702expression \texttt{x \&\& y} is translated into the expression
2703\texttt{x?(y?1:0):0}.  Our compiler performs this translation
2704\emph{before} computing the labelling.
2705
2706
2707
2708\subsubsection{Conditionals}
2709
Conditionals are another way to introduce branching. As for ternary
2711expressions, the labelling of a conditional consists in adding a starting label
2712to the labelling of each branch.
2713
2714\begin{center}
2715  {\footnotesize
2716    \begin{tabular}{lllll}
2717      $\Clight$ & $\xrightarrow{Labelling}$ & Labelled $\Clight$ &
2718      $\xrightarrow{Compilation}$ & Labelled $\mips$\\\\
2719      \codeex{if (b) \{} & & \codeex{if (b) \{} & &
2720      \codeex{beq\ \ \$a0, \$zero, c\_false}\\
2721      \codeex{\ \ x = 1;} & & \ \ \textbf{\_cost1:}
2722      & & \codeex{emit} \textbf{\_cost1}\\
2723      \codeex{\ \ ... \}} & & \codeex{\ \ x = 1;} & & \codeex{li\ \ \ \$v0, 1}\\
2724      \codeex{else \{} & & \codeex{\ \ ... \}} & & \codeex{...}\\
2725      \codeex{\ \ x = 2;} & & \codeex{else \{} & & \codeex{j\ \ \ \ exit}\\
2726      \codeex{\ \ ... \}} & & \ \ \textbf{\_cost2:} & & \codeex{c\_false:}\\
2727      & & \codeex{\ \ x = 2;} & & \codeex{emit} \textbf{\_cost2}\\
2728      & & \codeex{\ \ ... \}} & & \codeex{li\ \ \ \$v0, 2}\\
2729      & & & & \codeex{...}\\
2730      & & & & \codeex{exit:}
2731    \end{tabular}}
2732\end{center}
2733
2734\subsubsection{Loops}
2735
2736Loops in $\Clight$ are guarded by a condition. Following the arguments of the
2737previous cases, we add two labels when encountering a loop construct: one label
2738to start the loop's body, and one label when exiting the loop.
2739This is enough to guarantee that the loop in the compiled code goes through a label.
2740
2741
2742\begin{center}
2743  {\footnotesize
2744    \begin{tabular}{lllll}
2745      $\Clight$ & $\xrightarrow{Labelling}$ & Labelled $\Clight$ &
2746      $\xrightarrow{Compilation}$ & Labelled $\mips$\\\\
2747      \codeex{while (b) \{} & & \codeex{while (b) \{} & &
2748      \codeex{loop:}\\
2749      \codeex{\ \ i++;} & & \textbf{\ \ \_cost1:} & &
2750      \codeex{beq\ \ \$a0, \$zero, exit}\\
2751      \codeex{\ \ ... \}} & & \codeex{\ \ i++;} & &
2752      \codeex{emit} \textbf{\_cost1}\\
2753      \codeex{x = i;} & & \codeex{\ \ ... \}} & & \codeex{li\ \ \ \$v0, 1}\\
2754      & & \textbf{\_cost2:} & & \codeex{add\ \ \$a1, \$a1, \$v0}\\
2755      & & \codeex{x = i;} & & \codeex{...}\\
2756      & & & & \codeex{j\ \ \ \ loop}\\
2757      & & & & \codeex{exit:}\\
2758      & & & & \codeex{emit} \textbf{\_cost2}\\
2759      & & & & \codeex{move \$a2, \$a1}
2760    \end{tabular}}
2761\end{center}
2762
2763\subsubsection{Program Labels and Gotos}
2764
2765In $\Clight$, program labels and gotos are intraprocedural. Their only effect on
2766the control flow of the resulting assembly code is to potentially introduce an
2767unguarded loop. This loop must contain at least one cost label in order to
2768satisfy the soundness condition, which we ensure by adding a cost label right
2769after a program label.
2770
2771\begin{center}
2772  {\footnotesize
2773    \begin{tabular}{lllll}
2774      $\Clight$ & $\xrightarrow{Labelling}$ & Labelled $\Clight$ &
2775      $\xrightarrow{Compilation}$ & Labelled $\mips$\\\\
2776      \codeex{lbl:} & & \codeex{lbl:} & & \codeex{lbl:}\\
2777      \codeex{i++;} & & \textbf{\_cost:} & &
2778      \codeex{emit} \textbf{\_cost}\\
2779      \codeex{...} & & \codeex{i++;} & &
2780      \codeex{li\ \ \ \$v0, 1}\\
2781      \codeex{goto lbl;} & & \codeex{...} & &
2782      \codeex{add\ \ \$a0, \$a0, \$v0}\\
2783      & & \codeex{goto lbl;} & & \codeex{...}\\
2784      & & & & \codeex{j\ \ \ \ lbl}
2785    \end{tabular}}
2786\end{center}
2787
2788\subsubsection{Function calls}\label{fun-call-sec}
2789
2790Function calls in $\mips$ are performed by indirect jumps, the address of the
2791callee being in a register. In the general case, this address cannot be inferred
2792statically. Even though the destination point of a function call is unknown,
2793when the considered $\mips$ code has been produced by our compiler, we know for
2794a fact that this function ends with a return statement that transfers the
2795control back to the instruction following the function call in the caller. As a
2796result, we treat function calls according to the following principles:
2797(1) the instructions of a function are covered by the labels inside
2798this function, (2) we assume a function call always returns and runs
2799the instruction following the call.
2800
2801
2802\bigskip
2803
2804Principle (1) entails in particular that each function must contain at least one
2805label. To ensure this, we simply add a starting label in every function
2806definition. The example below illustrates this point:
2807
2808\begin{center}
2809  {\footnotesize
2810    \begin{tabular}{lllll}
2811      $\Clight$ & $\xrightarrow{Labelling}$ & Labelled $\Clight$ &
2812      $\xrightarrow{Compilation}$ & Labelled $\mips$\\\\
2813      \codeex{void f () \{} & & \codeex{void f () \{} & & \codeex{f\_start:}\\
2814      \codeex{\ \ f}\emph{'s body} & & \textbf{\ \ \_cost:} & &
2815      \emph{Frame Creation}\\
2816      \codeex{\}} & & \codeex{\ \ f}\emph{'s body} & & \emph{Initializations}\\
2817      & & \codeex{\}} & & \codeex{emit} \textbf{\_cost} \\
2818      & & & & \codeex{f}\emph{'s body}\\
2819      & & & & \emph{Frame Deletion}\\
2820      & & & & \codeex{return}
2821    \end{tabular}}
2822\end{center}
2823
We notice that some instructions in $\mips$ are inserted \emph{before}
the first label is emitted. These instructions relate to frame
creation and/or variable initializations, and consist
of sequential instructions (no branching). To deal with
this issue, we adopt the convention that the instructions
that precede the first label in a function's code fall
under the scope of that first label.

\bigskip

Principle (2) is of course an over-approximation of the program
behaviour, as a function might fail to return because of an infinite loop.
In this case, the proposed labelling remains correct: it just assumes
that the instructions following the function call will be executed, and takes
their cost into consideration. The final computed cost is still an
over-approximation of the actual cost.

\subsection{Verifications on the object code}

The labelling previously described has been designed so that the
compiled $\mips$ code satisfies the soundness and precision
conditions. However, we do not need to prove this; instead, we
devise an algorithm that checks the conditions on the compiled code.
The algorithm assumes a correct management of function calls in the
compiled code. In particular, when we call a function we always jump
to the first instruction of the corresponding code segment, and when we
return we always jump to an instruction that follows a call.  We
stress that this is a reasonable hypothesis that is essentially
subsumed by the proof that the object code {\em simulates} the source
code.

In our current implementation, we check the soundness and the
precision conditions while building the label-cost mapping at the
same time. To this end, the algorithm takes the following main steps.

\begin{itemize}
\item First, for each function a control flow graph is built.

\item For each graph, we check whether there is a unique
label that is reachable from the root by a unique path.
This unique path corresponds to the instructions generated by
the calling conventions as discussed in section \ref{fun-call-sec}.
We shift the occurrence of the label to the root of the graph.

\item
Using a strongly connected components algorithm,
we check whether every loop in the graphs goes through at least one label.

\item We perform a (depth-first) search of the graph. Whenever we reach a
  labelled node, we perform a second (depth-first) search that stops at labelled
  nodes and computes an upper bound on the cost of the occurrence of the label
  (see the sketch after this list). Of course, when crossing a branching
  instruction, we take the maximum cost of the
  branches.  When the second search stops, we update the current cost of the
  label-cost mapping (by taking a maximum) and we continue the first search.

\item Warning messages are emitted whenever the maximum is taken between two
  different values, as in this case the precision condition may be violated.

\end{itemize}
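
To illustrate the cost-computation step, the following $\ocaml$ fragment is a
minimal sketch under simplifying assumptions: a control flow graph is an array
of nodes, each node carrying the (statically known) cost of its instruction, an
optional cost label, and the indices of its successors. The type and function
names are ours, for illustration only; this is not the actual code of the
prototype.

{\footnotesize
\begin{verbatim}
(* Hypothetical, simplified representation of a control flow graph node. *)
type node = { cost : int; label : string option; succ : int list }

(* Upper bound on the cost of the simple paths starting at node i and
   stopping at the next labelled node or at a leaf.  Termination relies
   on the soundness condition: every loop crosses a labelled node. *)
let rec path_cost (g : node array) (i : int) : int =
  let n = g.(i) in
  match n.label, n.succ with
  | Some _, _ -> 0      (* a labelled node accounts for its own cost *)
  | None, [] -> n.cost  (* leaf, e.g. a halt instruction *)
  | None, js ->
      (* branching: take the maximum cost over the branches *)
      n.cost + List.fold_left (fun m j -> max m (path_cost g j)) 0 js

(* Cost bound associated with the label occurring at node i. *)
let label_cost (g : node array) (i : int) : int =
  g.(i).cost
  + List.fold_left (fun m j -> max m (path_cost g j)) 0 g.(i).succ
\end{verbatim}}
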
\subsection{Building the cost annotation}

Once the label-cost mapping is computed, instrumenting the labelled source code
is an easy task. A fresh global variable, which we call the
\emph{cost variable}, is added to the source program with the purpose of
holding the cost value; it is initialised at the very beginning of the
\codeex{main} program.
Then, every label is replaced by an increment of the
cost variable according to the label-cost
mapping.
Following this replacement, the cost labels disappear and the result is
a $\Clight$ program with annotations in the form of assignments.
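
For instance, assuming a hypothetical label-cost mapping that binds
\textbf{\_cost1} to the constant $15$, the instrumentation rewrites the
labelled source as follows (\codeex{\_cost} being the fresh cost variable):

\begin{center}
  {\footnotesize
    \begin{tabular}{lll}
      Labelled $\Clight$ & $\xrightarrow{Instrumentation}$ & Instrumented $\C$\\\\
      \textbf{\_cost1:} & & \codeex{\_cost = \_cost + 15;}\\
      \codeex{i++;} & & \codeex{i++;}
    \end{tabular}}
\end{center}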

There is one final problem: labels inside expressions.
As we already mentioned, $\Clight$ does not allow
writing side-effect instructions --- such as cost
increments --- inside expressions. To cope with this restriction, we first
produce an instrumented $\C$ program --- with side-effects in expressions
--- that we translate back to $\Clight$ using $\cil$.
This process is summarized below.

\begin{center}
  {\footnotesize
    \begin{tabular}{lllll}
      $\left . \begin{array}{l}
      \text{Labelled $\Clight$}\\
      \text{label-cost mapping}
    \end{array} \right \}$ & $\xrightarrow{\text{Instrumentation}}$ &
      Instrumented $\C$ & $\xrightarrow{\cil}$ & Instrumented $\Clight$
    \end{tabular}}
\end{center}

\subsection{Testing}
It is desirable to test the coherence of the labelling from $\Clight$ to $\mips$.
To this end, each labelled language comes with an interpreter
that produces the trace of the labels encountered during the computation.
Then, a naive approach is to test
the equality of the traces
produced by the program at the different stages of the compilation.
Our current implementation passes this kind of test.
For some optimisations that may re-order computations, a weaker
condition could be considered, which consists in abstracting the
traces as {\em multi-sets} of labels before comparing them, as sketched below.
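
As a minimal sketch, assuming traces are represented as lists of label names,
the multi-set comparison can be phrased in $\ocaml$ as follows:

{\footnotesize
\begin{verbatim}
(* Compare two label traces up to re-ordering: abstract each trace as
   a multi-set by sorting it, then test the sorted lists for equality. *)
let equal_as_multisets (t1 : string list) (t2 : string list) : bool =
  List.sort compare t1 = List.sort compare t2
\end{verbatim}}
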
\section{Conclusion and future work}\label{conclusion-sec}
We have discussed the problem of building a compiler which can {\em
  lift} in a provably correct way pieces of information on the
execution cost of the object code to cost annotations on the source
code.  To this end, we have introduced the so-called {\em direct} and
{\em labelling} approaches and discussed their formal application to a
toy compiler. Based on this experience, we have argued that the second
approach has better scalability properties.  To substantiate this
claim, we have reported on our successful experience in implementing
and testing the labelling approach on top of a prototype compiler
written in $\ocaml$ for a large fragment of the $\C$ language, which
can be briefly described as $\Clight$ without floating point.

We discuss next a few directions for future work.
%
First, we plan to test the current compiler on
the kind of $\C$ code produced for embedded applications,
a typical example being the $\C$ code produced by
the compilation of synchronous languages such as $\lustre$ or $\esterel$.
Starting from the annotated $\C$ code, we expect to
produce automatically meaningful information on, say,
the reaction time of a given synchronous program.
%
Second, we plan to port the current compiler to commercial assembly
languages. In particular, it would be interesting to target
one of the assembly languages covered by the $\absint$ tool so as
to obtain more realistic estimations of
the execution cost of sequences of instructions.
%
Third, we plan to formalise and validate in the {\em Calculus of Inductive
Constructions} the prototype implementation of the labelling approach
for the $\C$ compiler described in section \ref{C-compiler-sec}.
This requires a major implementation effort which will be carried
out in collaboration with our partners of the $\cerco$ project \cite{Cerco10}.
%
Fourth, we plan to study the applicability of the labelling
approach to other optimisation techniques in the realm of
$\C$ compiler technology, such as {\em loop optimisations},
and to other languages which rely on
rather different compilation technologies,
such as the languages of the $\ml$ family.


{\footnotesize
\begin{thebibliography}{99}

\bibitem{AbsInt}
AbsInt Angewandte Informatik. {\tt http://www.absint.com/}.

\bibitem{Cerco10}
Certified Complexity (Project description).
\newblock ICT-2007.8.0 FET Open, Grant 243881.
{\tt http://cerco.cs.unibo.it}.
%\newblock  Partners: Universities of Bologna, Edinburgh, and Paris Diderot, 2010.

\bibitem{SCADE}
Esterel Technologies.
{\tt http://www.esterel-technologies.com}.

\bibitem{Frama-C}
$\framac$ software analysers.
{\tt http://frama-c.com/}.

\bibitem{AbsintScade}
C.~Ferdinand, R.~Heckmann, T.~Le~Sergent, D.~Lopes, B.~Martin, X.~Fornari, and F.~Martin.
\newblock Combining a high-level design tool for safety-critical
systems with a tool for {WCET} analysis of executables.
\newblock In Proc. of the 4th European Congress on
{\em Embedded Real Time Software (ERTS)}, Toulouse, 2008.

\bibitem{Fornari10}
X.~Fornari.
\newblock Understanding how {SCADE} suite {KCG} generates safe {C} code.
\newblock White paper, Esterel Technologies, 2010.

\bibitem{Larus05}
J.~Larus.
\newblock Assemblers, linkers, and the {SPIM} simulator.
\newblock Appendix of {\em Computer Organization and Design: the hardware/software interface},
book by Hennessy and Patterson, 2005.

\bibitem{Leroy09}
X.~Leroy.
\newblock Formal verification of a realistic compiler.
\newblock {\em Commun. ACM}, 52(7):107--115, 2009.

\bibitem{Leroy09-ln}
X.~Leroy.
\newblock Mechanized semantics, with applications to program proof and compiler verification (lecture notes and Coq development).
\newblock {\em Marktoberdorf summer school}, Germany, 2009.

\bibitem{MP67}
J.~McCarthy and J.~Painter.
\newblock Correctness of a compiler for arithmetic expressions.
\newblock In {\em Mathematical aspects of computer science 1}, volume 19
of Symposia in Applied Mathematics, AMS, 1967.

\bibitem{CIL02}
G.~Necula, S.~McPeak, S.P.~Rahul, and W.~Weimer.
\newblock {CIL}: Intermediate language and tools for analysis and transformation of {C} programs.
\newblock In {\em Proceedings of the Conference on Compiler Construction},
Springer LNCS 2304:213--228, 2002.

\bibitem{Pottier}
F.~Pottier.
\newblock Compilation (INF 564), \'Ecole Polytechnique, 2009--2010.
{\tt http://www.enseignement.polytechnique.fr/informatique/INF564/}.

\bibitem{W09}
R.~Wilhelm et al.
\newblock The worst-case execution-time problem -- overview of methods and survey of tools.
\newblock {\em ACM Trans. Embedded Comput. Syst.}, 7(3), 2008.

\end{thebibliography}}

\newpage
\appendix

%\section{Choice of target architecture and intermediate languages}

\section{Assessment of the deliverable within the $\cerco$ project}
%RA This section is supposed to be a kind of `executive summary' for
%the project reviewer. How about systematically adding a final appendix
%to the scientific deliverables with a title such as the one proposed above?

Following the extensive experiments described in the previous sections,
we now feel confident about the possibility of scaling our approach to
a realistic, mildly optimising, complexity-preserving\footnote{In the sense
of the project proposal.} compiler
% RA:
% Deliverable 2.1 makes no explicit reference to the notion of
% `complexity preservation' though this can be simply derived.
% If a program is soundly labelled and its execution generates a
% sequence of labels lambda then the execution cost of the compiled code
% is proportional to the length of the sequence lambda.
for a target architecture of the kind described in the proposal. While we believe
that our approach should scale to a retargetable compiler, in the timeframe
of the European project and according to the project work-plan we only commit
to the investigation of a compiler having a single target processor.

In particular, we will target the 8051 (also known as 8052 or MCS51) family
of processors. The 8051 is an 8-bit CISC microprocessor introduced in 1980
by Intel, very popular and still manufactured by a host of companies,
many of them European. It is widely used in embedded systems and, thanks to its
predictable behaviour and execution cost, it will allow us to compute fully
accurate measurements of the actual computational complexity of $O(1)$
assembly program slices\footnote{In the sense of the project proposal.},
to be manifested at the C level.
% RA:
% The terminology of `program slice' is used in the proposal but not in the
% deliverable. Note that some confusion is possible with the notion of
% `program slicing' which is a technique we plan to use in Frama-C

With respect to the test-cases previously described, in particular the
C compiler for MIPS, the main difficulty introduced by the 8051 is its
non-uniform concrete memory model: the processor has different types of
memory (on-chip RAM, external RAM, on-chip and/or external ROM) that can
be accessed using different access modes and pointer types.
Moreover, memory-mapped I/O is heavily used in the design of the chip,
to the point that all registers (apart from the inaccessible program
counter) are seen as memory locations and have their own memory address.
To complete the picture, the different memories are
split into regions that may overlap, and the same access mode can
point to different regions (or even to memory-mapped registers) according
to the value of the pointer. Finally, the amount of memory available is
quite limited, with a stack that is at most 80 bytes wide, and with different
speeds and opcode sizes for accessing the different memory areas. For these
reasons, compilers that target 8051 processors usually abound in directives
to drive the tool in assigning memory locations to values.

Hence, because of the peculiar choice of target processor, in the next six
months of the $\cerco$ project, and partly as an extension of the original work-plan, we will:
% RA: Added partly

\begin{itemize}

 \item Extend the memory model used so far (and taken from CompCert) to
       accurately describe the different memory types and regions of the 8051,
       as well as to obey the \texttt{volatile} qualifier used to map
       memory-mapped regions to program variables. This work will be done
       as part of Tasks~T2.3 and T4.1 and the outcome will be described in
       Deliverables~D2.2 and D4.1, where we will also provide a formalization in
       Matita of the executable semantics of the extended memory model.
       In case we decide to adopt the same memory model for all compilation
       phases, as is done in CompCert, the memory model extension will
       also span Task~T3.1, requiring close cooperation between
       the front-end formalization (led by Edinburgh) and the back-end
       formalization (led by Bologna).

 \item Devise language extensions (possibly inspired by the variable modifiers
       of the SDCC compiler) to let the user suggest or force the compiler to
       put data into particular memory regions. Similarly, we could refine the
       pointer types of ANSI C into classes of pointer types pointing to
       particular regions, in order to reflect the difference in sizes of
       pointers to different regions (8 bits to address internal RAM and
       memory-mapped registers; 8 bits to address bits in the bit memory
       or in registers using an ad-hoc addressing mode;
       16 bits for code memory and external memory; 24 bits for generic
       pointers). All of these language extensions, if any, must be reflected
       throughout the compilation chain, with modifications to the intermediate
       languages and/or memory model. This work will be done as part of
       Tasks~T2.3, T3.1 and T4.1 and the outcome will be described in
       Deliverables~D2.2, D3.1, D3.3, D4.1, and D4.3, where we will provide
       executable semantics in Matita of the source, target and intermediate
       languages.

       In order to remain compatible with the project schedule, we will first
       focus on compiling standard Clight without floating point to the 8051,
       adding the language extensions at a later stage, within Task~T2.3.

%RA: If we want to follow the CompCert approach (one memory model for the
% whole verification chain) then these two points should be done in cooperation
% by Edinburgh (front-end formalization) and Bologna (back-end formalization).


 \item Modify and extend the experimental compiler from Clight
(without floating point) to MIPS in order to target the 8051
architecture (Task~T2.3, Deliverable~D2.2).
%RA: In the two SVN repositories (the one we used to prepare the proposal and the
%one we are using now) I cannot find the latest version of the proposal
%with the correct numbering of the deliverables. Would it be possible
%to add it to the Contracts directory of the SVN?

%RA: We (Paris) are supposed to deliver the prototype annotating compiler end of January,
% which means that the bulk of the work must be done before Christmas (also remember
% that Nicolas will leave us end of February).
% However, we started talking about the specification of the memory model
% and the extension of the source Clight language only a few days ago.
% This means that unless we get a solid specification in a couple of weeks,
% deliverable 2.2 will NOT necessarily cover all the `language extensions'
% mentioned above but will focus on compiling Clight without floating point to 8051.


\end{itemize}

Apart from the modifications required by the selection of the target processor (8051),
we plan to rely as much as possible on the architecture and the
intermediate languages described in this deliverable D2.1.


\newpage
\section{Proofs}\label{paper-proofs}

\subsection{Notation}
Let $\act{t}$ be a family of reduction relations where
$t$ ranges over the set of labels and $\epsilon$. Then we
define:
\[
\wact{t} = \left\{
\begin{array}{ll}
(\act{\epsilon})^* &\mbox{if }t=\epsilon \\
(\act{\epsilon})^* \comp \act{t} \comp (\act{\epsilon})^*  &\mbox{otherwise}
\end{array}
\right.
\]
where, as usual, $R^*$ denotes the reflexive and transitive closure of the relation $R$
and $\comp$ denotes the composition of relations.

\subsection{Proof of proposition \ref{soundness-small-step}}

The following properties are useful.

\begin{lemma}\label{access-lemma}
\Defitemf{(1)} The relation $\access{C}$ is transitive.

\Defitem{(2)}  If $i\access{C} j$ and $R(C,j,K)$ then $R(C,i,K)$.
\end{lemma}
The first property can be proven by induction on the definition of $\access{C}$ and the second by induction on the structure of $K$.
Next we can focus on the proposition.
The notation $C \stackrel{i}{\cdot} C'$ means that $i=|C|$.
Suppose that:

{\footnotesize
\[
\begin{array}{lll}
(S,K,s) \arrow (S',K',s') \quad (1)  &\text{and}  &R(C,i,S\cdot K) \quad (2)~.
\end{array}
\]}
From $(2)$, we know that there exist $i'$ and $i''$ such that:

{\footnotesize\[
\begin{array}{llll}
i \access{C} i' \quad (3),
&C=C_1\stackrel{i'}{\cdot}\cl{C}(S)\stackrel{i''}{\cdot}C_2 \quad (4),
&\text{and}
&R(C,i'',K) \quad (5)
\end{array}
\]}
and from $(3)$ it follows that:

{\footnotesize
\[ C\Gives (i,\sigma,s) \trarrow (i',\sigma,s) \quad (3')~.
\]}
We are looking for $j$ such that:

{\footnotesize
\[
\begin{array}{lll}
C \Gives (i,\sigma,s) \trarrow (j,\sigma,s') \quad (6),
&\text{and}
&R(C,j,S'\cdot K') \quad (7) ~.
\end{array}
\]}
We proceed by case analysis on $S$. We just detail the case of the conditional command, as
the remaining cases have similar proofs.
If $S=\s{if}$ $e_1<e_2$ $\s{then}$ $S_1$ $\s{else}$ $S_2$ then $(4)$ is rewritten as follows:

{\footnotesize  $$ C=C_1\stackrel{i'}{\cdot}\cl{C}(e_1)\cdot \cl{C}(e_2)\cdot\s{bge}(k_1)\stackrel{a}{\cdot}\cl{C}(S_1)\stackrel{b}{\cdot}\s{branch}(k_2)
\stackrel{c}{\cdot} \cl{C}(S_2)\stackrel{i''}{\cdot}C_2~ $$}
where $c = a+k_1$ and $i''=c+k_2$.
We distinguish two cases according to the evaluation of the boolean condition.
We describe the case
$(e_1<e_2) \eval \s{true}$. We set $j=a$.
\begin{itemize}
	\item The instance of $(1)$ is $(S,K,s) \arrow (S_1,K,s)$.
	\item The reduction required in (6) takes the form
$C \Gives (i,\sigma,s) \trarrow (i',\sigma,s) \trarrow (a,\sigma,s')$,
and it follows from $(3')$, the fact that $(e_1<e_2) \eval \s{true}$, and
proposition \ref{soundness-big-step-prop}(2).

	\item Property $(7)$ follows from lemma \ref{access-lemma}(2), fact $(5)$, and
the following proof tree:

{\footnotesize
$$ \infer{j\access{C} j \quad \infer{b \access{C} i'' \quad R(C,i'',K)}{R(C,b,K)}}{R(C,j,S_1\cdot K)} ~.$$ }
~\qed
\end{itemize}


\subsection{Proof of proposition \ref{labelled-simulation-vm-mips}}
We recall that the compiled code $\cl{C'}(C)$ does not read or write
the locations $l_{\w{cost}}$, $l_A$, and $l_B$.
Then we note the following properties.

{\footnotesize
\Defitem{(a)} If $\w{An}_{\vm}(C) \Gives (5 \cdot i, \sigma, s) \arrow^5 (5 \cdot j, \sigma', s')$
 then $C \Gives (i, \sigma, s) \arrow (j, \sigma', s'[s(\w{cost}) / \w{cost}])$.

\Defitem{(b)} If $M \Gives (i, m) \arrow (j, m')$
then $\w{An}_{\mips}(M) \Gives (9 \cdot i, m) \trarrow (9 \cdot j, m'[m(l_{\w{cost}}) + 1/ l_{\w{cost}}, m(A) / l_A, m(B) / l_B])$.
}

Using $(a)$ and the hypothesis, we derive:

{\footnotesize
\[
C \Gives (i, \sigma, s) \arrow (j, \sigma', s'[s(\w{cost}) / \w{cost}])~.
\]}
Then, by proposition~\ref{simulation-vm-mips} and the hypothesis on $m$, we derive:

{\footnotesize
\begin{equation}\label{step1}
\cl{C'}(C) \Gives (p(i,C),m) \trarrow (p(j,C),m') \mbox{ and } m' \real \sigma', s'[s(\w{cost}) / \w{cost}]~.
\end{equation}}
We perform a case analysis on the instruction $C[i]$ and on $h(i)$ to derive:

{\footnotesize
\[
\w{An}_{\mips}(\cl{C'}(C)) \Gives (9 \cdot p(i,C), m) \trarrow (9 \cdot p(j,C), m'[m(l_{\w{cost}}) + d(i,C) / l_{\w{cost}}, x / l_A, y / l_B])
\]}
where $x$ and $y$ are respectively the values of $l_A$ and $l_B$ in the
last intermediate configuration.


\paragraph{The memory realisation.}
The final memory state is: $m'[m(l_{\w{cost}}) + d(i,C) / l_{\w{cost}}, x / l_A, y / l_B]$.
Then we have to verify:

{\footnotesize\[
(m'[m(l_{\w{cost}}) + d(i,C) / l_{\w{cost}}, x / l_A, y / l_B]) [s'(\w{cost}) / l_{\w{cost}}] \real \sigma', s'~.
\]}
This is equivalent to:

{\footnotesize \[
m'[s'(\w{cost}) / l_{\w{cost}}, x / l_A, y / l_B] \real \sigma', s'~.
\]}
Since the realisation predicate does not depend on the locations $l_A$ and $l_B$,
we can conclude using the second part of (\ref{step1}).

\paragraph{The inequation.}
To conclude, we verify the inequation:

{\footnotesize
\[
m'[m(l_{\w{cost}}) + d(i,C) / l_{\w{cost}}, x / l_A, y / l_B](l_{\w{cost}}) - m(l_{\w{cost}}) \le s'(\w{cost}) - s(\w{cost})~.
\]}
\noindent By rewriting, we get: $d(i,C) \le s'(\w{cost}) - s(\w{cost})$.
This follows by case analysis on $C[i]$ and $h(i)$ since, by definition, $\w{An}_{\vm}$ uses the worst possible value of $d(i,C)$.
\qed


\subsection{Proof of proposition \ref{annot-composition}}
We have $P = \s{prog} \ S$ and, by definition, $\w{An}_{\imp}(P) = \w{cost}:=\w{cost}+\kappa(S); \w{An}_{\imp}(S)$.
With our simulation hypothesis, we derive $(\w{An}_{\imp}(S), s[\kappa(S) /\w{cost}]) \eval s'[c'/\w{cost}]$.
Using proposition~\ref{annot-imp-vm} with $C = C' = \epsilon$, $d = 0$ and $\sigma = \epsilon$, we have:

{\footnotesize
\[
\w{An}_{\vm}(\cl{C}(S)) \Gives (0,\epsilon, s[0/\w{cost}]) \trarrow (|\w{An}_{\vm}(\cl{C}(P))|,\epsilon,s'[d'/\w{cost}]) ~,
\]}
where $d' \leq (c' - \kappa(S)) +\kappa(S) = c'$.
%
By definition, $\cl{C}(P) = \cl{C}(S) \cdot \s{halt}$ and $|\w{An}_{\vm}(\cl{C}(P))| = 5 \cdot |\cl{C}(P)|$. We can rewrite the simulation:

{\footnotesize\[
\w{An}_{\vm}(\cl{C}(P)) \Gives (0,\epsilon, s[0/\w{cost}]) \trarrow (5 \cdot |\cl{C}(P)|,\epsilon,s'[d'/\w{cost}]) ~.
\]}
By proposition~\ref{well-formed-prop}, there is a decoration $h$ such that $\cl{C}(P) : h$ and $h(0) = 0$.
By iterating proposition~\ref{labelled-simulation-vm-mips},
we derive:

{\footnotesize\[
\w{An}_{\mips}(\cl{C'}(\cl{C}(P))) \Gives (9 \cdot p(0,\cl{C}(P)), m) \trarrow (9 \cdot p(|\cl{C}(P)|, \cl{C}(P)), m')~,
\]}
with $m'[d'/l_{\w{cost}}] \real \epsilon, s'[d'/\w{cost}]$ and $m'(l_{\w{cost}}) \le d'$.
%
We know that the last instruction of $\cl{C}(P)$ is a $\s{halt}$, therefore the last instruction of $\cl{C'}(\cl{C}(P))$ is also a $\s{halt}$. By definition, $p(|\cl{C}(P)|, \cl{C}(P))$ is the position after this instruction. That gives us:

{\footnotesize\[
(\w{An}_{\mips}(\cl{C'}(\cl{C}(P))),m)
 \eval m' ~.
\]}
By transitivity, we have
$m'(l_{\w{cost}}) \le d' \le c'$. \qed


\subsection{Proof of proposition \ref{labelled-sim-imp-vm}}

\Defitemf{(1)} By induction on the structure of the command $S$.

\Defitem{(2)} By iterating the following proposition.

\begin{proposition}
  If $(S,K,s) \stackrel{t}{\arrow} (S',K',s')$ and $R(C,i,S\cdot K)$, with $t=\ell$ or $t=\epsilon$, then
  $C\Gives (i,\sigma,s) \wact{t} (j,\sigma,s')$ and
  $R(C,j,S'\cdot K')$.
\end{proposition}

This is an extension of proposition \ref{soundness-small-step} and it is proven in the same way, with
an additional case for labelled commands. \qed

\subsection{Proof of proposition \ref{sim-vm-mips-prop}}
\Proofitemf{(1)} The compilation of the $\vm$ instruction $\s{nop}(\ell)$ is the $\mips$ instruction $(\s{nop} \ \ell)$.

\Proofitem{(2)}
By iterating the following proposition.

\begin{proposition}
Let $C : h$ be a well-formed code.
If $C \Gives (i,\sigma,s) \stackrel{t}\arrow (j,\sigma',s')$ with $t=\ell$ or $t=\epsilon$, $h(i) = |\sigma|$, and $m \real \sigma,s$,
then
$\cl{C'}(C) \Gives (p(i,C),m) \wact{t} (p(j,C),m')$ and $m'\real \sigma',s'$.
\end{proposition}

This is an extension of proposition \ref{simulation-vm-mips} and it is proven in the same way,
with an additional case for the \s{nop} instruction.\qed

\subsection{Proof of proposition \ref{lab-instr-erasure-imp}}
To carry out this proof, one needs to develop a bit
further the properties of the small-step operational semantics.
In particular, one needs to show that a continuation such
as $S\cdot (S'\cdot K)$ is `observationally equivalent' to the
continuation $(S;S')\cdot K$. \qed


\subsection{Proof of proposition \ref{global-commutation-prop}}
By diagram chasing, using propositions \ref{labelled-sim-imp-vm}(1),
\ref{sim-vm-mips-prop}(1), and the definition \ref{labelling-def} of labelling. \qed


\subsection{Proof of proposition \ref{instrument-to-label-prop}}
Suppose that:
$$ (\cl{I}(\cl{L}(P)),s[c/\w{cost}]) \eval s'[c+\delta/\w{cost}] \text{ and } m\real s[c/\w{cost}]~. $$
Then, by proposition \ref{lab-instr-erasure-imp}, for some $\lambda$:
$$ (\cl{L}(P),s[c/\w{cost}])\eval (s'[c/\w{cost}],\lambda) \text{ and } \kappa(\lambda)=\delta~. $$
Finally, by propositions \ref{labelled-sim-imp-vm}(2) and \ref{sim-vm-mips-prop}(2):
$$ (\cl{C'}(\cl{C}(\cl{L}(P))),m) \eval (m',\lambda) \text{ and } m'\real s'[c/\w{cost}]~.$$
\qed

\subsection{Proof of proposition \ref{sound-label-prop}}
If $\lambda=\ell_1 \cdots \ell_n$ then the
computation is the concatenation of simple paths
labelled with $\ell_1,\ldots,\ell_n$. Since $\kappa(\ell_i)$
bounds the cost of a simple path labelled with $\ell_i$, the
cost of the overall computation is bounded by $\kappa(\lambda)=
\kappa(\ell_1)+\cdots+\kappa(\ell_n)$. \qed


\subsection{Proof of proposition \ref{precise-label-prop}}
Same proof as for proposition \ref{sound-label-prop}, replacing the word \emph{bounds} by \emph{is exactly} and the words \emph{bounded by} by \emph{exactly}. \qed

\subsection{Proof of proposition \ref{lab-sound}}
In both labellings under consideration the root node is labelled.  An
obvious observation is that only commands of the shape \s{while} $b$
\s{do} $S$ and \s{while} $lb$ \s{do} $S$ introduce loops in the
compiled code.  In the second case, the compilation ensures that a
label is placed in the loop.  In the first case, we notice that
both labellings introduce a label in the loop (though at different places).
Thus all loops go through a label and the compiled code is always sound.

To show the precision of the second labelling $\cl{L}_p$, we note the
following property.

\begin{lemma}\label{precise-lemma}
A soundly labelled graph is precise if each label occurs at most once in the graph and
if the immediate successors of the \s{bge} nodes are either \s{halt} (no successor) or labelled nodes.
\end{lemma}

Indeed, in such a graph, starting from a labelled node we can follow a unique path
up to a leaf, another labelled node, or a $\s{bge}$ node. In the last case, the
hypotheses of lemma \ref{precise-lemma} guarantee that the two
simple paths one can follow from the $\s{bge}$ node have
the same length/cost.  \qed


\subsection{Proof of proposition \ref{ann-correct}}
By applying consecutively proposition \ref{instrument-to-label-prop}
and then proposition \ref{sound-label-prop} or \ref{precise-label-prop}. \qed