1 | \documentclass[11pt,epsf,a4wide]{article} |
---|
2 | \usepackage[mathletters]{ucs} |
---|
3 | \usepackage[utf8x]{inputenc} |
---|
4 | \usepackage{listings} |
---|
5 | \usepackage{../../style/cerco} |
---|
6 | \newcommand{\ocaml}{OCaml} |
---|
7 | \newcommand{\clight}{Clight} |
---|
8 | \newcommand{\matita}{Matita} |
---|
9 | \newcommand{\sdcc}{\texttt{sdcc}} |
---|
10 | |
---|
11 | \newcommand{\textSigma}{\ensuremath{\Sigma}} |
---|
12 | |
---|
13 | % LaTeX Companion, p 74 |
---|
14 | \newcommand{\todo}[1]{\marginpar{\raggedright - #1}} |
---|
15 | |
---|
16 | \lstdefinelanguage{coq} |
---|
17 | {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record}, |
---|
18 | morekeywords={[2]if,then,else}, |
---|
19 | } |
---|
20 | |
---|
21 | \lstdefinelanguage{matita} |
---|
22 | {keywords={definition,lemma,theorem,remark,inductive,record,qed,let,rec,match,with,Type,and,on}, |
---|
23 | morekeywords={[2]whd,normalize,elim,cases,destruct}, |
---|
24 | mathescape=true, |
---|
25 | morecomment=[n]{(*}{*)}, |
---|
26 | } |
---|
27 | |
---|
28 | \lstset{language=matita,basicstyle=\small\tt,columns=flexible,breaklines=false, |
---|
29 | keywordstyle=\color{red}\bfseries, |
---|
30 | keywordstyle=[2]\color{blue}, |
---|
31 | commentstyle=\color{green}, |
---|
32 | stringstyle=\color{blue}, |
---|
33 | showspaces=false,showstringspaces=false} |
---|
34 | |
---|
35 | \lstset{extendedchars=false} |
---|
36 | \lstset{inputencoding=utf8x} |
---|
37 | \DeclareUnicodeCharacter{8797}{:=} |
---|
38 | \DeclareUnicodeCharacter{10746}{++} |
---|
39 | \DeclareUnicodeCharacter{9001}{\ensuremath{\langle}} |
---|
40 | \DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}} |
---|
41 | |
---|
42 | |
---|
43 | \title{ |
---|
44 | INFORMATION AND COMMUNICATION TECHNOLOGIES\\ |
---|
45 | (ICT)\\ |
---|
46 | PROGRAMME\\ |
---|
47 | \vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}} |
---|
48 | |
---|
49 | \date{ } |
---|
50 | \author{} |
---|
51 | |
---|
52 | \begin{document} |
---|
53 | \thispagestyle{empty} |
---|
54 | |
---|
55 | \vspace*{-1cm} |
---|
56 | \begin{center} |
---|
57 | \includegraphics[width=0.6\textwidth]{../../style/cerco_logo.png} |
---|
58 | \end{center} |
---|
59 | |
---|
60 | \begin{minipage}{\textwidth} |
---|
61 | \maketitle |
---|
62 | \end{minipage} |
---|
63 | |
---|
64 | |
---|
65 | \vspace*{0.5cm} |
---|
66 | \begin{center} |
---|
67 | \begin{LARGE} |
---|
68 | \bf |
---|
69 | Report n. D3.3\\ |
---|
70 | Executable Formal Semantics of front end intermediate languages\\ |
---|
71 | \end{LARGE} |
---|
72 | \end{center} |
---|
73 | |
---|
74 | \vspace*{2cm} |
---|
75 | \begin{center} |
---|
76 | \begin{large} |
---|
77 | Version 1.0 |
---|
78 | \end{large} |
---|
79 | \end{center} |
---|
80 | |
---|
81 | \vspace*{0.5cm} |
---|
82 | \begin{center} |
---|
83 | \begin{large} |
---|
84 | Main Authors:\\ |
---|
85 | Brian~Campbell |
---|
86 | \end{large} |
---|
87 | \end{center} |
---|
88 | |
---|
89 | \vspace*{\fill} |
---|
90 | \noindent |
---|
91 | Project Acronym: \cerco{}\\ |
---|
92 | Project full title: Certified Complexity\\ |
---|
93 | Proposal/Contract no.: FP7-ICT-2009-C-243881 \cerco{}\\ |
---|
94 | |
---|
95 | \clearpage \pagestyle{myheadings} \markright{\cerco{}, FP7-ICT-2009-C-243881} |
---|
96 | |
---|
97 | \newpage |
---|
98 | |
---|
99 | \vspace*{7cm} |
---|
100 | \paragraph{Abstract} |
---|
101 | We report on the formalization of the front end intermediate languages for the |
---|
102 | \cerco{} project's compiler using executable semantics in the \matita{} proof |
---|
103 | assistant. This includes common definitions for fresh identifier handling, |
---|
104 | $n$-bit arithmetic and operations, and testing of the semantics. We also |
---|
105 | consider embedding invariants into the semantics for showing correctness |
---|
106 | properties. |
---|
107 | |
---|
108 | \newpage |
---|
109 | |
---|
110 | \tableofcontents |
---|
111 | |
---|
112 | % TODO: clear up any -ize vs -ise |
---|
113 | % TODO: clear up any "front end" vs "front-end" |
---|
114 | % TODO: clear up any mentions of languages that aren't textsf'd. |
---|
115 | % TODO: fix unicode in listings |
---|
116 | |
---|
117 | \section{Introduction} |
---|
118 | |
---|
119 | This work formalizes the front end intermediate languages from the \cerco{} |
---|
120 | prototype compiler, described in previous deliverables~\cite{d2.1,d2.2}. The |
---|
121 | front end of the compiler is summarized in Figure~\ref{fig:summary} including |
---|
122 | the intermediate languages and the compiler passes described in the |
---|
123 | accompanying deliverable 3.2. We have also refined parts of the formal |
---|
124 | development that are used for several of the languages in the compiler. |
---|
125 | |
---|
126 | The input language to the formalized front end is the \textsf{Clight} |
---|
127 | language. The executable semantics for this language were presented in a |
---|
128 | previous deliverable~\cite{d3.1}. Here we will report on some |
---|
129 | minor changes to its semantics made to better align it with the whole |
---|
130 | development. |
---|
131 | |
---|
132 | The formalization of each language takes the form of definitions for abstract |
---|
133 | syntax and functions providing a small-step executable semantics. This is |
---|
134 | done in the Calculus of Inductive Constructions (CIC), as implemented in the |
---|
135 | \matita{} proof assistant. These definitions will be essential for the |
---|
136 | correctness proofs of the compiler in task 3.4. |
---|
137 | |
---|
138 | Finally, we will report on work to add several invariants to the |
---|
139 | languages. This activity overlaps with task 3.4 on the correctness of the |
---|
140 | compiler front end. However, the use of dependent types means that this work |
---|
141 | is tied closely to the definition of the languages and the transformations of |
---|
142 | the front end in task 3.2. By considering it now we can experiment with and |
---|
143 | judge its impact on the formal semantics, and how feasible retrofitting such |
---|
144 | invariants is. |
---|
145 | |
---|
146 | \begin{figure} |
---|
147 | \begin{center} |
---|
148 | \begin{minipage}{.8\linewidth} |
---|
149 | \begin{tabbing} |
---|
150 | \quad \= $\downarrow$ \quad \= \kill |
---|
151 | \textsf{C} (unformalized)\\ |
---|
152 | \> $\downarrow$ \> CIL parser (unformalized \ocaml)\\ |
---|
153 | \textsf{Clight}\\ |
---|
154 | \> $\downarrow$ \> cast removal\\ |
---|
155 | \> $\downarrow$ \> add runtime functions\\ |
---|
156 | \> $\downarrow$ \> labelling\\ |
---|
157 | \> $\downarrow$ \> stack variable allocation and control structure |
---|
158 | simplification\\ |
---|
159 | \textsf{Cminor}\\ |
---|
160 | \> $\downarrow$ \> generate global variable initialization code\\ |
---|
161 | \> $\downarrow$ \> transform to RTL graph\\ |
---|
162 | \textsf{RTLabs}\\ |
---|
163 | \> $\downarrow$ \> start of target specific backend\\ |
---|
164 | \>\quad \vdots |
---|
165 | \end{tabbing} |
---|
166 | \end{minipage} |
---|
167 | \end{center} |
---|
168 | \caption{Front end languages and transformations} |
---|
169 | \label{fig:summary} |
---|
170 | \end{figure} |
---|
171 | |
---|
172 | \subsection{Revisions to the prototype compiler} |
---|
173 | |
---|
174 | Ongoing work to maintain and improve the prototype compiler has resulted in |
---|
175 | several changes, mostly minor. The most important changes are that the |
---|
176 | transformations to replace 16 and 32 bit types have been moved from the |
---|
177 | \textsf{Clight} language to the target-specific stage between \textsf{RTLabs} |
---|
178 | and \textsf{RTL} to help generate better code, and that a \textsf{Clight} |
---|
179 | cast removal transformation has been added to reduce the number of 16 and 32 |
---|
180 | bit operations. |
---|
181 | |
---|
182 | The formalized semantics have tracked these changes, and in this report we |
---|
183 | describe them as they currently stand. |
---|
184 | |
---|
185 | \section{Definitions common to several languages} |
---|
186 | |
---|
187 | The semantics for many of the languages in the compiler share some core parts: |
---|
188 | the memory model, environments, runtime values, definitions of operations and |
---|
189 | a monad for encapsulating failure and I/O (using resumptions). The executable |
---|
190 | memory model was ported from CompCert as part of the work on \textsf{Clight} |
---|
191 | and was reused for the front end languages\footnote{However, it is likely that |
---|
192 | we will revise the memory model to make it better suited for describing the |
---|
193 | back-end of our compiler.}. The failure and I/O monad was introduced in the |
---|
194 | previous deliverable on \textsf{Clight}~\cite[\S4.2]{d3.1}. In all of the |
---|
195 | languages except \textsf{Clight} we have a basic form of type, \lstinline'typ' |
---|
196 | identifying integers and pointers along with their sizes. The other parts |
---|
197 | are discussed below, with the only change to the runtime values being the |
---|
198 | representation of integers. |
---|
199 | |
---|
200 | \subsection{Identifiers} |
---|
201 | |
---|
202 | Each language features one or more kinds of names to represent variables, such |
---|
203 | as registers, \texttt{goto} labels or RTL graph labels. We also need to |
---|
204 | describe various maps whose domain is a set of identifiers when defining the |
---|
205 | semantics and compilation. |
---|
206 | |
---|
207 | Previous work on the executable semantics of the target machine |
---|
208 | code included bit vectors and bit vector tries to define various integers in |
---|
209 | the semantics, and give a low level view of memory~\cite{d4.1}. To keep the |
---|
210 | size of the development down we have reused these data structures for |
---|
211 | identifiers and maps respectively. |
---|
212 | |
---|
213 | One difficulty with using fixed size bit vectors for identifiers is that |
---|
214 | fresh name generation can fail if we generate too many. While we use an error |
---|
215 | monad to deal with failures, we wish to minimize its use in the compiler. |
---|
216 | Thus we add a flag to detect overflows, and check it after the phase of the |
---|
217 | compiler is complete to report exhaustion. The rest of the phase can then be |
---|
218 | written as if name generation always succeeds. In practice exhaustion will never |
---|
219 | occur on normal programs because more identifiers of each sort are available |
---|
220 | than bytes of code memory on the target. |
---|
221 | |
---|
222 | Given the wide variety of identifiers used in the compiler we also |
---|
223 | wish to separate the different classes of identifier. Thus we encapsulate the |
---|
224 | bit vector representing the identifier in a datatype that also carries a tag |
---|
225 | identifying which class of identifier we are using: |
---|
226 | \begin{lstlisting} |
---|
227 | inductive identifier (tag:String) : Type[0] ≝ |
---|
228 | an_identifier : Word $\rightarrow$ identifier tag. |
---|
229 | \end{lstlisting} |
---|
230 | The tries are also tagged in the same manner. These tags have also proved |
---|
231 | useful during testing by making the resulting terms more readable. |
---|
232 | |
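The overflow flag and the tagging described above can be sketched together
outside \matita{}. The following Python model is only illustrative and is not
part of the formal development: the names \lstinline'Universe',
\lstinline'fresh' and \lstinline'run_phase', the representation of an
identifier as a pair of a tag and a number, and the default width are all
assumptions made for the example.

\begin{lstlisting}[language=Python]
class Universe:
    """Fresh-name supply for one class (tag) of identifier."""

    def __init__(self, tag, width=16):
        self.tag = tag          # class of identifier, e.g. "Label"
        self.width = width      # assumed bit width of an identifier
        self.next = 0           # number of identifiers handed out
        self.overflow = False   # set once a bit pattern is reused

    def fresh(self):
        """Return a new (tag, value) pair; never raises."""
        if self.next >= (1 << self.width):
            self.overflow = True        # all patterns already used
        ident = (self.tag, self.next % (1 << self.width))
        self.next += 1
        return ident


def run_phase(universe, phase):
    """Run a phase as if fresh() cannot fail, then check once."""
    result = phase(universe)
    if universe.overflow:
        raise RuntimeError("identifiers exhausted: " + universe.tag)
    return result
\end{lstlisting}

A phase written against this interface calls \lstinline'fresh' freely, and
exhaustion is reported only once, after the phase has finished. Because the
tag is part of every identifier, a label can never compare equal to a
register that happens to share the same bit pattern.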
---|
233 | \subsection{Machine integers and arithmetic} |
---|
234 | |
---|
235 | The bit vectors in~\cite{d4.1} also came equipped with some basic arithmetic |
---|
236 | for the target semantics. The front end required these operations to be |
---|
237 | generalized and extended. In particular, we required operations such as zero |
---|
238 | and sign extension and translation between bit vectors and full integers. It |
---|
239 | also became apparent that while the original definitions worked reasonably on |
---|
240 | 8-bit vectors, they did not scale up to 32-bit integers. The definitions were |
---|
241 | then reworked to make them efficient enough to animate programs in the |
---|
242 | front-end semantics. |
---|
243 | |
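As an illustration of the operations involved, the following Python sketch
models zero extension, sign extension and the translation between bit
patterns and full integers. It represents an $n$-bit vector as a non-negative
Python integer together with an explicit width, which is an assumption of the
sketch; the \matita{} development works on bit vectors directly, and all of
the function names here are hypothetical.

\begin{lstlisting}[language=Python]
def to_unsigned(value, width):
    """Reduce an arbitrary integer to its width-bit pattern."""
    return value & ((1 << width) - 1)


def to_signed(bits, width):
    """Read a width-bit pattern as a two's complement integer."""
    sign_bit = 1 << (width - 1)
    return (to_unsigned(bits, width) ^ sign_bit) - sign_bit


def zero_extend(bits, from_width, to_width):
    """Widen by filling the new high bits with zeros."""
    assert from_width <= to_width
    return to_unsigned(bits, from_width)


def sign_extend(bits, from_width, to_width):
    """Widen by copying the sign bit into the new high bits."""
    assert from_width <= to_width
    return to_unsigned(to_signed(bits, from_width), to_width)


def add(a, b, width):
    """Addition modulo 2**width, as on fixed-size machine integers."""
    return to_unsigned(a + b, width)
\end{lstlisting}

For instance, the 8-bit pattern \lstinline'0xFF' zero-extends to the 32-bit
pattern \lstinline'0xFF' but sign-extends to \lstinline'0xFFFFFFFF', since it
denotes $-1$ when read as a signed value.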
---|
244 | \subsection{Front end operations} |
---|
245 | |
---|
246 | The two front end intermediate languages, \textsf{Cminor} and \textsf{RTLabs}, |
---|
247 | share the same set of operations on values. They differ from |
---|
248 | \textsf{Clight}'s operations by incorporating casts and by having a separate |
---|
249 | operation for each type of data operated upon. For example, subtraction of |
---|
250 | pointers is treated as a different operation from subtraction of integers. |
---|
251 | |
---|
252 | A common semantics is given for these operations in the form of simple |
---|
253 | CIC functions on the operation and runtime values. |
---|
254 | |
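To illustrate what having one operation per type of data means, here is a
small Python sketch of an evaluator that distinguishes integer subtraction
from pointer subtraction. The operation names \lstinline'sub_int' and
\lstinline'sub_ptr', the tuple encoding of runtime values and the 32 bit
width chosen for a pointer difference are assumptions of the example and are
not taken from the formal development.

\begin{lstlisting}[language=Python]
def eval_binop(op, v1, v2):
    """Evaluate a binary operation on tagged runtime values.

    Values are ("int", width, bits) or ("ptr", block, offset).
    Returns None when the operation does not apply.
    """
    if op == "sub_int":
        if v1[0] == "int" and v2[0] == "int" and v1[1] == v2[1]:
            width = v1[1]
            mask = (1 << width) - 1
            return ("int", width, (v1[2] - v2[2]) & mask)
    elif op == "sub_ptr":
        # only pointers into the same block can be subtracted
        if v1[0] == "ptr" and v2[0] == "ptr" and v1[1] == v2[1]:
            return ("int", 32, (v1[2] - v2[2]) & 0xFFFFFFFF)
    return None
\end{lstlisting}

Because each operation fixes the kind of value it accepts, applying
\lstinline'sub_int' to a pointer is simply undefined rather than being
resolved by inspecting the arguments at run time.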
---|
255 | \section{\textsf{Clight} modifications} |
---|
256 | |
---|
257 | The \textsf{Clight} input language remained largely the same as in the |
---|
258 | previous deliverable~\cite{d3.1}. The principal changes were to use the |
---|
259 | identifiers and arithmetic described above in place of the arbitrarily large |
---|
260 | integers used before. For the identifiers, this relieved us of the burden of |
---|
261 | adding an efficient datatype for maps by reusing the bit vector tries instead. |
---|
262 | |
---|
263 | The arithmetic replaced a dependent pair of an arbitrary integer and a proof |
---|
264 | that it was in the range of 32 bit integers by the exact bit vector for each |
---|
265 | size of integer. This direct approach is closer to the implementation and |
---|
266 | more obviously correct --- no extra precision can be left in by accident. |
---|
267 | |
---|
268 | \section{\textsf{Cminor}} |
---|
269 | |
---|
270 | The \textsf{Cminor} language does not store local variables in memory, and has |
---|
271 | simpler control structures than \textsf{Clight}. It is similar in nature to |
---|
272 | the \textsf{Cminor} language in CompCert, although the semantics have been |
---|
273 | based on the \cerco{} prototype rather than ported from CompCert. The syntax |
---|
274 | is similar to the prototype, except that the types attached to expressions are |
---|
275 | restricted so that some corner cases are ruled out in the \textsf{Cminor} to |
---|
276 | \textsf{RTLabs} stage (see the accompanying deliverable 3.2 for details): |
---|
277 | \begin{lstlisting} |
---|
278 | inductive expr : typ $\rightarrow$ Type[0] ≝ |
---|
279 | | Id : $\forall$t. ident $\rightarrow$ expr t |
---|
280 | | Cst : $\forall$t. constant $\rightarrow$ expr t |
---|
281 | | Op1 : $\forall$t,t'. unary_operation $\rightarrow$ expr t $\rightarrow$ expr t' |
---|
282 | | Op2 : $\forall$t1,t2,t'. binary_operation $\rightarrow$ expr t1 $\rightarrow$ expr t2 $\rightarrow$ expr t' |
---|
283 | | Mem : $\forall$t,r. memory_chunk $\rightarrow$ expr (ASTptr r) $\rightarrow$ expr t |
---|
284 | | Cond : $\forall$sz,sg,t. expr (ASTint sz sg) $\rightarrow$ expr t $\rightarrow$ expr t $\rightarrow$ expr t |
---|
285 | | Ecost : $\forall$t. costlabel $\rightarrow$ expr t $\rightarrow$ expr t. |
---|
286 | \end{lstlisting} |
---|
287 | For example, note that conditional expressions only switch on integer |
---|
288 | expressions. |
---|
289 | In principle we could extend this to statically ensure that only well-typed |
---|
290 | \textsf{Cminor} expressions are considered, and we will consider this as part |
---|
291 | of the work on correctness in task 3.4. |
---|
292 | |
---|
293 | We also provide a variant of the syntax where the only initialization data is |
---|
294 | the size of each global variable, for use after the initialization code has |
---|
295 | been generated. |
---|
296 | |
---|
297 | The definition of the semantics is routine: a functional definition of a |
---|
298 | single small step of the machine is given, reusing the memory model, |
---|
299 | environments, arithmetic and operations mentioned above. |
---|
300 | |
---|
301 | \section{\textsf{RTLabs}} |
---|
302 | |
---|
303 | The \textsf{RTLabs} language provides a target independent Register Transfer |
---|
304 | Language, where programs are represented as control flow graphs. We use the |
---|
305 | identifiers described above for the graph labels and the maps for the graph |
---|
306 | itself. The tagging mechanism ensures that labels cannot be mixed up with |
---|
307 | other identifiers in the program (in particular, we cannot accidentally reuse |
---|
308 | a \texttt{goto} label from \textsf{Cminor} where a graph label should appear). |
---|
309 | |
---|
310 | Otherwise, the syntax and semantics of \textsf{RTLabs} mirror those of the |
---|
311 | prototype compiler. Some of the syntax is shown below, including the type of |
---|
312 | the control flow graphs. The same common elements are used as for \textsf{Cminor}, |
---|
313 | including the front end operations mentioned above. |
---|
314 | |
---|
315 | \begin{lstlisting}[language=matita] |
---|
316 | inductive statement : Type[0] ≝ |
---|
317 | | St_skip : label $\rightarrow$ statement |
---|
318 | | St_cost : costlabel $\rightarrow$ label $\rightarrow$ statement |
---|
319 | | St_const : register $\rightarrow$ constant $\rightarrow$ label $\rightarrow$ statement |
---|
320 | | St_op1 : unary_operation $\rightarrow$ register $\rightarrow$ register $\rightarrow$ label $\rightarrow$ statement |
---|
321 | ... |
---|
322 | | St_return : statement |
---|
323 | . |
---|
324 | |
---|
325 | definition graph : Type[0] $\rightarrow$ Type[0] ≝ identifier_map LabelTag. |
---|
326 | |
---|
327 | record internal_function : Type[0] ≝ |
---|
328 | { f_labgen : universe LabelTag |
---|
329 | ; f_reggen : universe RegisterTag |
---|
330 | ; f_result : option (register $\times$ typ) |
---|
331 | ; f_params : list (register $\times$ typ) |
---|
332 | ; f_locals : list (register $\times$ typ) |
---|
333 | ; f_stacksize : nat |
---|
334 | ; f_graph : graph statement |
---|
335 | }. |
---|
336 | \end{lstlisting} |
---|
337 | |
---|
338 | \section{Testing} |
---|
339 | |
---|
340 | To provide some assurance that the semantics were properly implemented, and to |
---|
341 | support the testing described in the accompanying deliverable 3.2, we have |
---|
342 | adapted the pretty printers in the prototype compiler to produce \matita{} |
---|
343 | terms for the syntax of each language described above. |
---|
344 | |
---|
345 | A few common definitions were added for animating the small step semantics |
---|
346 | definitions for any of the front end languages in \matita. We then used a |
---|
347 | small selection of test cases to ensure basic functionality. However, this is |
---|
348 | still a time consuming process, so more testing will be carried out once the |
---|
349 | extraction of CIC terms to \ocaml{} programs is implemented in \matita. |
---|
350 | |
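The idea of animating a small-step semantics can be sketched generically: a
driver repeatedly applies the single-step function until the program finishes,
fails, or a step budget runs out. The Python below is a minimal model under
assumed conventions (the result tags, the \lstinline'fuel' parameter and the
toy \lstinline'countdown' semantics are all hypothetical) and does not
reproduce the \matita{} definitions.

\begin{lstlisting}[language=Python]
def run(step, state, fuel):
    """Iterate a small-step function at most fuel times.

    step maps a state to ("next", new_state), ("final", value)
    or ("error", message).
    """
    for _ in range(fuel):
        kind, payload = step(state)
        if kind != "next":
            return (kind, payload)    # finished or failed: stop
        state = payload
    return ("out_of_fuel", state)     # step budget used up


def countdown(n):
    """Toy semantics: count down to zero, reject negatives."""
    if n < 0:
        return ("error", "negative state")
    if n == 0:
        return ("final", "done")
    return ("next", n - 1)
\end{lstlisting}

Bounding the number of steps keeps the driver total, which matters in a
setting such as CIC where all functions must terminate.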
---|
351 | \section{Embedding invariants} |
---|
352 | |
---|
353 | Each phase of the prototype compiler can fail in a number of places if the |
---|
354 | input language permits programs that are badly structured in some sense: a |
---|
355 | missing label in a \texttt{goto} statement or CFG, an undefined variable name, |
---|
356 | a \texttt{break} statement outside of a loop or \texttt{switch}, and so on. |
---|
357 | We wish to restrict our intermediate languages using dependent types to remove |
---|
358 | as many `junk' programs as possible to rule out such failures. We also hope |
---|
359 | that such restrictions will help in other correctness proofs. |
---|
360 | |
---|
361 | This goal lies in the overlap between several tasks in the project: it |
---|
362 | involves manipulating the syntax and semantics of the intermediate languages |
---|
363 | (the present work), the encoding of the front-end compiler phases in \matita{} |
---|
364 | (task 3.2) and the correctness of the front-end (task 3.4). Thus this work is |
---|
365 | rather experimental; it is being carried out on branches in our source code |
---|
366 | repository and the final form will be decided and merged in during task 3.4. |
---|
367 | |
---|
368 | So far we have tried adding two forms of invariant: one uses dependent |
---|
369 | types to index statements in \textsf{Cminor} by their block depth, and the |
---|
370 | other asserts that variables and labels are present in the appropriate |
---|
371 | environments by adding a separate invariant to each function. Note that these |
---|
372 | do not yet cover all of the properties that a program in these languages is |
---|
373 | expected to enjoy; for example, there are currently no checks that references |
---|
374 | to globals are well-defined. |
---|
375 | |
---|
376 | \subsection{\textsf{Cminor} block depth} |
---|
377 | |
---|
378 | The \textsf{Cminor} language has relatively simple control structures. |
---|
379 | Statements are provided for infinite loops, non-looping blocks and |
---|
380 | exiting an arbitrary number of blocks (for \texttt{break}, failing loop guards |
---|
381 | and the switch statement\footnote{We are considering replacing the |
---|
382 | \textsf{Cminor} switch statement with one that uses \texttt{goto}-like labels |
---|
383 | in both the prototype and the formalized compilers, but for now we stick with |
---|
384 | this CompCert-style arrangement.}). |
---|
385 | |
---|
386 | However, this means that there are badly-formed \textsf{Cminor} programs such |
---|
387 | as |
---|
388 | \begin{lstlisting}[language={}] |
---|
389 | int main() { |
---|
390 | block { |
---|
391 | loop { |
---|
392 | exit 5 |
---|
393 | } |
---|
394 | } |
---|
395 | } |
---|
396 | \end{lstlisting} |
---|
397 | where we attempt to exit more blocks than exist. To rule these out (including |
---|
398 | demonstrating that the previous phase of the compiler does not generate them) |
---|
399 | we can index the statements of the language by the depth of the enclosing |
---|
400 | blocks. |
---|
401 | |
---|
402 | The adaptation of the syntax adds the depth to every statement, and uses bounded |
---|
403 | integers in the exit and switch statements: |
---|
404 | \begin{lstlisting}[language=matita] |
---|
405 | inductive stmt : $\forall$blockdepth:nat. Type[0] ≝ |
---|
406 | | St_skip : $\forall$n. stmt n |
---|
407 | ... |
---|
408 | | St_loop : $\forall$n. stmt n $\rightarrow$ stmt n |
---|
409 | | St_block : $\forall$n. stmt (S n) $\rightarrow$ stmt n |
---|
410 | | St_exit : $\forall$n. Fin n $\rightarrow$ stmt n |
---|
411 | (* expr to switch on, table of <switch value, #blocks to exit>, default *) |
---|
412 | | St_switch : expr $\rightarrow$ $\forall$n. list (int $\times$ (Fin n)) $\rightarrow$ Fin n $\rightarrow$ stmt n |
---|
413 | ... |
---|
414 | \end{lstlisting} |
---|
415 | where \texttt{stmt n} is a statement enclosed in \texttt{n} blocks, |
---|
416 | and \texttt{Fin n} is a standard construction for a natural number which is |
---|
417 | strictly less than \texttt{n}. |
---|
418 | |
---|
419 | In the semantics the number of blocks is also added to the continuations and |
---|
420 | state, and the function to find the continuation from an exit statement can be |
---|
421 | made failure-free. |
---|
422 | We note in passing that adding this parameter detected a small mistake in the |
---|
423 | semantics concerning continuations and tail calls, although the mistake itself |
---|
424 | was benign. |
---|
425 | |
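The property enforced by the depth index can be approximated by a dynamic
check, which may help to see what the types rule out. The Python sketch below
encodes statements as tuples and accepts \lstinline'exit k' only when $k$ is
strictly less than the number of enclosing blocks, mirroring the bound given
by \texttt{Fin n}. The tuple encoding and the function name are assumptions
of the example; in the \matita{} syntax the same condition holds by
construction and needs no run-time test.

\begin{lstlisting}[language=Python]
def exits_in_range(stmt, depth=0):
    """Check every exit against the number of enclosing blocks."""
    kind = stmt[0]
    if kind == "exit":      # ("exit", k)
        return stmt[1] < depth
    if kind == "block":     # ("block", body): one more block
        return exits_in_range(stmt[1], depth + 1)
    if kind == "loop":      # ("loop", body): depth unchanged
        return exits_in_range(stmt[1], depth)
    if kind == "seq":       # ("seq", s1, s2, ...)
        return all(exits_in_range(s, depth) for s in stmt[1:])
    return True             # "skip" and other leaf statements
\end{lstlisting}

The badly-formed program shown earlier, encoded as
\lstinline'("block", ("loop", ("exit", 5)))', is rejected by this check
because only one block encloses the exit.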
---|
426 | \subsection{Identifier invariants} |
---|
427 | |
---|
428 | To show that the variables and labels occurring in the body of a function |
---|
429 | are present in the relevant structures we add an additional invariant to the |
---|
430 | function records. |
---|
431 | |
---|
432 | For \textsf{Cminor} we use a higher-order predicate which recursively applies |
---|
433 | a predicate to all substatements: |
---|
434 | \begin{lstlisting}[language=matita] |
---|
435 | let rec stmt_P (P:stmt $\rightarrow$ Prop) (s:stmt) on s : Prop ≝ |
---|
436 | match s with |
---|
437 | [ St_seq s1 s2 $\Rightarrow$ P s $\wedge$ stmt_P P s1 $\wedge$ stmt_P P s2 |
---|
438 | | St_ifthenelse _ _ _ s1 s2 $\Rightarrow$ P s $\wedge$ stmt_P P s1 $\wedge$ stmt_P P s2 |
---|
439 | | St_loop s' $\Rightarrow$ P s $\wedge$ stmt_P P s' |
---|
440 | | St_block s' $\Rightarrow$ P s $\wedge$ stmt_P P s' |
---|
441 | | St_label _ s' $\Rightarrow$ P s $\wedge$ stmt_P P s' |
---|
442 | | St_cost _ s' $\Rightarrow$ P s $\wedge$ stmt_P P s' |
---|
443 | | _ $\Rightarrow$ P s |
---|
444 | ]. |
---|
445 | \end{lstlisting} |
---|
446 | Dependent pattern matching on statements thus allows an accompanying |
---|
447 | \lstinline'stmt_P' fact to be unfolded to the predicate on the current statement |
---|
448 | and the predicate applied to all substatements. |
---|
449 | |
---|
450 | We require two properties to hold in \textsf{Cminor} functions: |
---|
451 | \begin{enumerate} |
---|
452 | \item All variables in the body are present in the list of parameters or the |
---|
453 | list of variables for the function (this also uses a similar recursive |
---|
454 | predicate on expressions). |
---|
455 | \item All labels in \texttt{goto} statements appear in a label statement. |
---|
456 | \end{enumerate} |
---|
457 | The function definition thus becomes: |
---|
458 | \begin{lstlisting}[language=matita] |
---|
459 | record internal_function : Type[0] ≝ |
---|
460 | { f_return : option typ |
---|
461 | ; f_params : list (ident $\times$ typ) |
---|
462 | ; f_vars : list (ident $\times$ typ) |
---|
463 | ; f_stacksize : nat |
---|
464 | ; f_body : stmt |
---|
465 | ; f_inv : stmt_P ($\lambda$s.stmt_vars ($\lambda$i.Exists ? ($\lambda$x.\fst x = i) (f_params @ f_vars)) s $\wedge$ |
---|
466 | stmt_labels ($\lambda$l.Exists ? ($\lambda$l'.l' = l) (labels_of f_body)) s) f_body |
---|
467 | }. |
---|
468 | \end{lstlisting} |
---|
469 | where \lstinline'stmt_vars' and \lstinline'stmt_labels' constrain the |
---|
470 | variables and labels that appear directly in a statement (but not |
---|
471 | substatements) to appear in the given list, and |
---|
472 | \lstinline'labels_of' returns a list of all the labels defined in a |
---|
473 | statement. |
---|
474 | |
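The shape of this invariant can be illustrated with a small Python analogue
of \lstinline'stmt_P' and of the label condition. Here the predicate is
evaluated as a boolean instead of being stated as a proposition, statements
are encoded as tuples with fewer fields than the \matita{} constructors, and
the names \lstinline'stmt_all' and \lstinline'gotos_defined' are
hypothetical; it is a sketch of the idea and not the formal definition.

\begin{lstlisting}[language=Python]
def stmt_all(pred, stmt):
    """pred holds for stmt and, recursively, all substatements."""
    kind = stmt[0]
    if kind == "seq":                  # ("seq", s1, s2)
        subs = [stmt[1], stmt[2]]
    elif kind == "ifthenelse":         # ("ifthenelse", e, s1, s2)
        subs = [stmt[2], stmt[3]]
    elif kind in ("loop", "block"):    # ("loop", s), ("block", s)
        subs = [stmt[1]]
    elif kind in ("label", "cost"):    # ("label", l, s), ("cost", c, s)
        subs = [stmt[2]]
    else:
        subs = []                      # leaf statement
    return pred(stmt) and all(stmt_all(pred, s) for s in subs)


def labels_of(stmt):
    """List every label defined by a label statement."""
    found = []

    def visit(s):
        if s[0] == "label":
            found.append(s[1])
        return True                    # keep traversing

    stmt_all(visit, stmt)
    return found


def gotos_defined(body):
    """Every ("goto", l) in body targets a label defined in body."""
    defined = labels_of(body)
    return stmt_all(lambda s: s[0] != "goto" or s[1] in defined,
                    body)
\end{lstlisting}

The check on variables follows the same pattern, with the list of parameters
and local variables playing the role of \lstinline'labels_of'.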
---|
475 | The \textsf{Clight} semantics can be amended to use these invariants, although |
---|
476 | the main benefit is for the compiler stages (see the accompanying deliverable |
---|
477 | 3.2 for details). The semantics require the invariants to be added to the |
---|
478 | state and continuations. It was convenient to split the continuations between |
---|
479 | the local continuation representing the rest of the code to be executed within |
---|
480 | the function, and the stack of function calls because it becomes easier to |
---|
481 | state the property on the local continuation alone. |
---|
482 | The invariant for variables is |
---|
483 | slightly different --- we require that every variable appear in the local |
---|
484 | environment. We use \lstinline'f_inv' from the function to establish this |
---|
485 | invariant when the environment is set up on function entry. |
---|
486 | |
---|
487 | It is unclear whether changing the semantics is really worthwhile. It |
---|
488 | witnesses that the invariants are those we wanted, but makes no difference to |
---|
489 | the actual execution of the program, especially as the execution can still |
---|
490 | fail due to genuine runtime errors. Moreover, it is unclear what effect the |
---|
491 | presence of proof terms and more dependent pattern matching in the semantics |
---|
492 | will have on the complexity of future correctness proofs. We plan to examine |
---|
493 | this issue during task 3.4. |
---|
494 | |
---|
495 | We use a similar method to specify the invariant that the \textsf{RTLabs} |
---|
496 | graph is closed --- that is, any successor labels in a statement in the graph |
---|
497 | are present in the graph. The definition is simpler in \textsf{RTLabs} |
---|
498 | because the flat representation of the graph does not require recursive |
---|
499 | definitions like \lstinline'stmt_P' above. |
---|
500 | |
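Because the graph is a flat map from labels to statements, the closure
property amounts to a single pass over its entries. A minimal Python sketch,
assuming a dictionary for the graph and a tuple encoding in which the
successor label is the last field of a statement (the \lstinline'cond'
statement with two successors is a hypothetical addition for the example):

\begin{lstlisting}[language=Python]
def successors(stmt):
    """Successor labels mentioned by a single statement."""
    kind = stmt[0]
    if kind == "return":        # ("return",): no successor
        return []
    if kind == "cond":          # ("cond", reg, l_true, l_false)
        return [stmt[2], stmt[3]]
    return [stmt[-1]]           # ("skip", l), ("cost", c, l), ...


def graph_closed(graph):
    """Every successor label is itself a key of the graph."""
    return all(label in graph
               for stmt in graph.values()
               for label in successors(stmt))
\end{lstlisting}

No recursion over statements is needed, in contrast with the
\textsf{Cminor} case.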
---|
501 | \section{Conclusion} |
---|
502 | |
---|
503 | We have developed executable semantics for each of the front-end languages of |
---|
504 | the \cerco{} compiler. These will form the basis of the correctness |
---|
505 | statements for each stage of the compiler in task 3.4. We have also shown |
---|
506 | that useful invariants can be added as dependent types, and intend to use |
---|
507 | these in subsequent work. |
---|
508 | |
---|
509 | \bibliographystyle{plain} |
---|
510 | \bibliography{report} |
---|
511 | |
---|
512 | \end{document} |
---|