1 | \documentclass[11pt,epsf,a4wide]{article} |
---|
2 | \usepackage[utf8x]{inputenc} |
---|
3 | \usepackage{listings} |
---|
4 | \usepackage{../../style/cerco} |
---|
5 | \newcommand{\ocaml}{OCaml} |
---|
6 | \newcommand{\clight}{Clight} |
---|
7 | \newcommand{\matita}{Matita} |
---|
8 | \newcommand{\sdcc}{\texttt{sdcc}} |
---|
9 | |
---|
10 | \newcommand{\textSigma}{\ensuremath{\Sigma}} |
---|
11 | |
---|
12 | % LaTeX Companion, p 74 |
---|
13 | \newcommand{\todo}[1]{\marginpar{\raggedright - #1}} |
---|
14 | |
---|
15 | \lstdefinelanguage{coq} |
---|
16 | {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record}, |
---|
17 | morekeywords={[2]if,then,else}, |
---|
18 | } |
---|
19 | |
---|
20 | \lstdefinelanguage{matita} |
---|
21 | {keywords={ndefinition,nlemma,ntheorem,nremark,ninductive,nrecord,nqed,nlet,rec,match,with,Type,and,on}, |
---|
22 | morekeywords={[2]nwhd,nnormalize,nelim,ncases,ndestruct}, |
---|
23 | mathescape=true, |
---|
24 | } |
---|
25 | |
---|
26 | \lstset{language=matita,basicstyle=\small\tt,columns=flexible,breaklines=false, |
---|
27 | keywordstyle=\color{red}\bfseries, |
---|
28 | keywordstyle=[2]\color{blue}, |
---|
29 | commentstyle=\color{green}, |
---|
30 | stringstyle=\color{blue}, |
---|
31 | showspaces=false,showstringspaces=false} |
---|
32 | |
---|
33 | \lstset{extendedchars=false} |
---|
34 | \lstset{inputencoding=utf8x} |
---|
35 | \DeclareUnicodeCharacter{8797}{:=} |
---|
36 | \DeclareUnicodeCharacter{10746}{++} |
---|
37 | \DeclareUnicodeCharacter{9001}{\ensuremath{\langle}} |
---|
38 | \DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}} |
---|
39 | |
---|
40 | |
---|
41 | \title{ |
---|
42 | INFORMATION AND COMMUNICATION TECHNOLOGIES\\ |
---|
43 | (ICT)\\ |
---|
44 | PROGRAMME\\ |
---|
45 | \vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}} |
---|
46 | |
---|
47 | \date{ } |
---|
48 | \author{} |
---|
49 | |
---|
50 | \begin{document} |
---|
51 | \thispagestyle{empty} |
---|
52 | |
---|
53 | \vspace*{-1cm} |
---|
54 | \begin{center} |
---|
55 | \includegraphics[width=0.6\textwidth]{../../style/cerco_logo.png} |
---|
56 | \end{center} |
---|
57 | |
---|
58 | \begin{minipage}{\textwidth} |
---|
59 | \maketitle |
---|
60 | \end{minipage} |
---|
61 | |
---|
62 | |
---|
63 | \vspace*{0.5cm} |
---|
64 | \begin{center} |
---|
65 | \begin{LARGE} |
---|
66 | \bf |
---|
67 | Report n. D3.1\\ |
---|
68 | Executable Formal Semantics of C\\ |
---|
69 | \end{LARGE} |
---|
70 | \end{center} |
---|
71 | |
---|
72 | \vspace*{2cm} |
---|
73 | \begin{center} |
---|
74 | \begin{large} |
---|
75 | Version 1.0 |
---|
76 | \end{large} |
---|
77 | \end{center} |
---|
78 | |
---|
79 | \vspace*{0.5cm} |
---|
80 | \begin{center} |
---|
81 | \begin{large} |
---|
82 | Main Authors:\\ |
---|
83 | Brian~Campbell, Randy~Pollack |
---|
84 | \end{large} |
---|
85 | \end{center} |
---|
86 | |
---|
87 | \vspace*{\fill} |
---|
88 | \noindent |
---|
89 | Project Acronym: \cerco{}\\ |
---|
90 | Project full title: Certified Complexity\\ |
---|
91 | Proposal/Contract no.: FP7-ICT-2009-C-243881 \cerco{}\\ |
---|
92 | |
---|
93 | \clearpage \pagestyle{myheadings} \markright{\cerco{}, FP7-ICT-2009-C-243881} |
---|
94 | |
---|
95 | \newpage |
---|
96 | |
---|
97 | \vspace*{7cm} |
---|
98 | \paragraph{Abstract} |
---|
99 | We present an execution semantics of the C programming language for |
---|
100 | use in the \cerco{} project. It is based on the small-step inductive |
---|
101 | semantics used by the CompCert verified compiler. We discuss the |
---|
102 | extensions required for our target architecture, porting the semantics |
---|
103 | to our choice of tool, \matita{}, providing an equivalent executable |
---|
104 | semantics and the validation of the semantics. |
---|
105 | |
---|
106 | \newpage |
---|
107 | |
---|
108 | \tableofcontents |
---|
109 | |
---|
110 | \section{Introduction} |
---|
111 | |
---|
112 | We present an executable formal semantics of the C programming language which |
---|
113 | will serve as the specification of the input language for the \cerco{} |
---|
114 | verified compiler. Our semantics is based on Leroy et.~al.'s C semantics for |
---|
115 | the CompCert project~\cite{1111042,compcert-mm08}, which divides the treatment |
---|
116 | of C into two pieces. The first is an \ocaml{} stage which parses and |
---|
117 | elaborates C into an abstract syntax tree for the simpler \clight{} language, |
---|
118 | based on the CIL C parser. The second part is a small step semantics for |
---|
119 | \clight{} formalised in the proof tool, which we have ported from Coq to the |
---|
120 | \matita{} theorem prover. |
---|
121 | This semantics is given in the form of inductive definitions, and so we have |
---|
122 | added a third part giving an equivalent functional presentation in \matita{}. |
---|
123 | |
---|
124 | The \cerco{} compiler needs to deal with the constrained memory model of |
---|
125 | the target microcontroller (in our case, the 8051). Thus each part of the |
---|
126 | semantics has been extended to allow explicit handling of the |
---|
127 | microcontroller's memory spaces. \emph{Cost labels} have also been added to |
---|
128 | the \clight{} semantics to support the labelling approach to cost annotations |
---|
129 | presented in a previous deliverable~\cite{d2.1}. |
---|
130 | |
---|
131 | The following section discusses the C language extensions for memory spaces. |
---|
132 | Then the port of the two stages of the CompCert \clight{} semantics is |
---|
133 | described in Section~\ref{sec:port}, followed by the new executable semantics |
---|
134 | in Section~\ref{sec:exec}. Finally we discuss how the semantics has |
---|
135 | been validated |
---|
136 | in Section~\ref{sec:valid}. |
---|
137 | |
---|
138 | \section{Language extensions for the 8051 memory model} |
---|
139 | \label{sec:ext} |
---|
140 | |
---|
141 | The choice of an extended 8051 target for the \cerco{} compiler imposes an |
---|
142 | irregular memory model with tight resource constraints. The different memory |
---|
143 | spaces and access modes are summarised in Figure~\ref{fig:memorymodel} --- |
---|
144 | essentially the evolution of the 8051 family has fragmented memory into |
---|
145 | four regions: one half of the `internal' memory is fully accessible but also |
---|
146 | contains the register banks, the second half cannot be accessed by direct addressing |
---|
147 | because it is shadowed by the `Special Function Registers' (SFRs) for I/O; |
---|
148 | `external memory' provides the bulk of memory in a separate address space; and |
---|
149 | the code is in its own read-only space. |
---|
150 | \begin{figure} |
---|
151 | \setlength{\unitlength}{1pt} |
---|
152 | \begin{picture}(410,250)(-50,200) |
---|
153 | %\put(-50,200){\framebox(410,250){}} |
---|
154 | \put(12,410){\makebox(80,0)[b]{Internal (256B)}} |
---|
155 | \put(13,242){\line(0,1){165}} |
---|
156 | \put(93,242){\line(0,1){165}} |
---|
157 | \put(13,407){\line(1,0){80}} |
---|
158 | \put(12,400){\makebox(0,0)[r]{0h}} \put(14,400){\makebox(0,0)[l]{Register bank 0}} |
---|
159 | \put(13,393){\line(1,0){80}} |
---|
160 | \put(12,386){\makebox(0,0)[r]{8h}} \put(14,386){\makebox(0,0)[l]{Register bank 1}} |
---|
161 | \put(13,379){\line(1,0){80}} |
---|
162 | \put(12,372){\makebox(0,0)[r]{10h}} \put(14,372){\makebox(0,0)[l]{Register bank 2}} |
---|
163 | \put(13,365){\line(1,0){80}} |
---|
164 | \put(12,358){\makebox(0,0)[r]{18h}} \put(14,358){\makebox(0,0)[l]{Register bank 3}} |
---|
165 | \put(13,351){\line(1,0){80}} |
---|
166 | \put(12,344){\makebox(0,0)[r]{20h}} \put(14,344){\makebox(0,0)[l]{Bit addressable}} |
---|
167 | \put(13,323){\line(1,0){80}} |
---|
168 | \put(12,316){\makebox(0,0)[r]{30h}} |
---|
169 | \put(14,309){\makebox(0,0)[l]{\quad \vdots}} |
---|
170 | \put(13,291){\line(1,0){80}} |
---|
171 | \put(12,284){\makebox(0,0)[r]{80h}} |
---|
172 | \put(14,263){\makebox(0,0)[l]{\quad \vdots}} |
---|
173 | \put(12,249){\makebox(0,0)[r]{ffh}} |
---|
174 | \put(13,242){\line(1,0){80}} |
---|
175 | |
---|
176 | \qbezier(-2,407)(-6,407)(-6,393) |
---|
177 | \qbezier(-6,393)(-6,324)(-10,324) |
---|
178 | \put(-12,324){\makebox(0,0)[r]{Indirect/Stack}} |
---|
179 | \qbezier(-6,256)(-6,324)(-10,324) |
---|
180 | \qbezier(-2,242)(-6,242)(-6,256) |
---|
181 | |
---|
182 | \qbezier(94,407)(98,407)(98,393) |
---|
183 | \qbezier(98,393)(98,349)(102,349) |
---|
184 | \put(104,349){\makebox(0,0)[l]{Direct}} |
---|
185 | \qbezier(98,305)(98,349)(102,349) |
---|
186 | \qbezier(94,291)(98,291)(98,305) |
---|
187 | |
---|
188 | \put(102,242){\framebox(20,49){SFR}} |
---|
189 | % bit access to sfrs? |
---|
190 | |
---|
191 | \qbezier(124,291)(128,291)(128,277) |
---|
192 | \qbezier(128,277)(128,266)(132,266) |
---|
193 | \put(134,266){\makebox(0,0)[l]{Direct}} |
---|
194 | \qbezier(128,257)(128,266)(132,266) |
---|
195 | \qbezier(124,242)(128,242)(128,256) |
---|
196 | |
---|
197 | \put(164,410){\makebox(80,0)[b]{External (64kB)}} |
---|
198 | \put(164,220){\line(0,1){187}} |
---|
199 | \put(164,407){\line(1,0){80}} |
---|
200 | \put(244,220){\line(0,1){187}} |
---|
201 | \put(164,242){\line(1,0){80}} |
---|
202 | \put(163,400){\makebox(0,0)[r]{0h}} |
---|
203 | \put(164,324){\makebox(80,0){Paged access}} |
---|
204 | \put(164,310){\makebox(80,0){direct/indirect}} |
---|
205 | \put(163,235){\makebox(0,0)[r]{80h}} |
---|
206 | \put(164,228){\makebox(80,0){\vdots}} |
---|
207 | \put(164,210){\makebox(80,0){Direct/indirect}} |
---|
208 | |
---|
209 | \put(264,410){\makebox(80,0)[b]{Code (64kB)}} |
---|
210 | \put(264,220){\line(0,1){187}} |
---|
211 | \put(264,407){\line(1,0){80}} |
---|
212 | \put(344,220){\line(0,1){187}} |
---|
213 | \put(263,400){\makebox(0,0)[r]{0h}} |
---|
214 | \put(264,228){\makebox(80,0){\vdots}} |
---|
215 | \put(264,324){\makebox(80,0){Dsirect}} |
---|
216 | \put(264,310){\makebox(80,0){PC relative}} |
---|
217 | \end{picture} |
---|
218 | \caption{The extended 8051 memory model} |
---|
219 | \label{fig:memorymodel} |
---|
220 | \end{figure} |
---|
221 | |
---|
222 | To make efficient use of the limited amount of memory, compilers for 8051 |
---|
223 | microcontrollers provide extra keywords to allocate global variables to |
---|
224 | particular memory spaces, and to limit pointers to address a particular space. |
---|
225 | The freely available \sdcc{} compiler provides the following extensions for |
---|
226 | the 8051 memory spaces: |
---|
227 | \begin{table}[ht] |
---|
228 | \begin{tabular}{rcl} |
---|
229 | Attribute & Pointer size (bytes) & Memory space \\\hline |
---|
230 | \lstinline'__data' & 1 & Internal, first half (0h -- 7fh) \\ |
---|
231 | \lstinline'__idata' & 1 & Internal, indirect only (80h -- ffh) \\ |
---|
232 | \lstinline'__pdata' & 1 & External, page access (usually 0h -- 7fh) \\ |
---|
233 | \lstinline'__xdata' & 2 & External, any (0h -- ffffh) \\ |
---|
234 | \lstinline'__code' & 2 & Code, any (0h -- ffffh) \\ |
---|
235 | \emph{none}& 3 & Any / Generic pointer |
---|
236 | \end{tabular} |
---|
237 | \end{table}\\ |
---|
238 | The generic pointers are a tagged union of the other kinds of pointers. |
---|
239 | |
---|
240 | We intend the \cerco{} compiler to support extensions that are broadly |
---|
241 | compatible with \sdcc{} to enable the compilation of programs with |
---|
242 | either tool. In particular, this would allow the comparison of the |
---|
243 | behaviour of test cases compiled with each compiler. Thus the C |
---|
244 | syntax and semantics have been extended with the memory space |
---|
245 | attributes listed above. The syntax follows \sdcc{} and in the |
---|
246 | semantics we track the memory space that each block was allocated from |
---|
247 | and only permit access via the appropriate kinds of pointers. The |
---|
248 | details on these changes are given in the following sections. |
---|
249 | |
---|
250 | The \sdcc{} compiler also supports special variable types for accessing the |
---|
251 | SFRs, which provide the standard I/O mechanism for the 8051 family. (Note |
---|
252 | that pointers to these types are not permitted because only direct addressing of |
---|
253 | the SFRs is allowed.) We intend to use CompCert-style `external functions' |
---|
254 | instead of special types. These are functions which are declared, but no C |
---|
255 | implementation of them is provided. Instead they are provided by the runtime or |
---|
256 | compiled directly to the corresponding machine code. This has the advantage |
---|
257 | that no changes from the CompCert semantics are required, and a compatibility |
---|
258 | library can be provided for \sdcc{} if necessary. The 8051 and the \sdcc{} |
---|
259 | compiler also provide bit-level access to a small region of internal memory. |
---|
260 | We do not intend to expose this feature to C programs in the \cerco{} |
---|
261 | compiler, and so no extension is provided for it. |
---|
262 | |
---|
263 | Finally, we have the option of using CompCert's translation of |
---|
264 | \lstinline'volatile' variable accesses to `external' function calls. Should we |
---|
265 | need more flexible I/O than SFRs provide, then we could adopt the \sdcc{} |
---|
266 | extension to allocate a variable at a particular address to provide a way to |
---|
267 | deal with memory mapped I/O in the external memory space. The translation to |
---|
268 | function calls would mean that the semantics presented here would be |
---|
269 | unaffected. |
---|
270 | |
---|
271 | \section{Port of CompCert \clight{} semantics to \matita{}} |
---|
272 | \label{sec:port} |
---|
273 | |
---|
274 | \subsection{Parsing and elaboration} |
---|
275 | The first stage taken from the CompCert semantics is the parsing and |
---|
276 | elaboration of C programs into the simpler \clight{} language. This is based |
---|
277 | upon the CIL library for parsing, analysing and transforming C programs by |
---|
278 | Necula et.~al.~\cite{cil02}. The elaboration provides explicit type |
---|
279 | information throughout the program, including extra casts for |
---|
280 | promotion. It also |
---|
281 | performs simplifications such as breaking up expressions with side effects |
---|
282 | into effect-free expressions along with statements to perform the effects. The |
---|
283 | transformed \clight{} programs are much more manageable and lack the ambiguities |
---|
284 | of C, but also remain easily understood by C programmers. |
---|
285 | |
---|
286 | The parser has been extended with the 8051 memory spaces attributes given |
---|
287 | above. The resulting abstract syntax tree records them on global variable |
---|
288 | declarations and pointer types. However, we also need to deal with them |
---|
289 | during the elaboration process to produce all of the required type information. |
---|
290 | For example, when the address-of operator \lstinline'&' is used it must decide |
---|
291 | which kind of pointer should be used. Thus the extended elaboration process |
---|
292 | keeps track of the memory space (if any) that the value of each |
---|
293 | expression resides in. Where the memory space is not known, a generic pointer |
---|
294 | will be used instead. Moreover, we also include the pointer kind when |
---|
295 | determining whether a cast must be inserted so that conversions between |
---|
296 | pointer representations can be performed. |
---|
297 | |
---|
298 | |
---|
299 | Thus the elaboration turns the following C code |
---|
300 | \begin{quote} |
---|
301 | \begin{lstlisting}[language=C] |
---|
302 | int g(int *x) { return 5; } |
---|
303 | |
---|
304 | int f(__data int *x, int *y) { |
---|
305 | return x==y ? g(x) : *x; |
---|
306 | } |
---|
307 | |
---|
308 | __data int i = 1; |
---|
309 | |
---|
310 | int main(void) { |
---|
311 | return f(&i, &i); |
---|
312 | } |
---|
313 | \end{lstlisting} |
---|
314 | \end{quote} |
---|
315 | into the Clight program below: |
---|
316 | \begin{quote} |
---|
317 | \begin{lstlisting}[language=C] |
---|
318 | int g(int *x) { return 5; } |
---|
319 | |
---|
320 | int f(__data int * x, int * y) |
---|
321 | { |
---|
322 | int t; |
---|
323 | if (x == (__data int * )y) { |
---|
324 | t = g((int * )x); |
---|
325 | } else { |
---|
326 | t = *x; |
---|
327 | } |
---|
328 | return t; |
---|
329 | } |
---|
330 | |
---|
331 | int main(void) |
---|
332 | { |
---|
333 | int t; |
---|
334 | t = f(&i, (int * )(&i)); |
---|
335 | return t; |
---|
336 | } |
---|
337 | \end{lstlisting} |
---|
338 | \end{quote} |
---|
339 | The expression in \lstinline'f' had to be broken up due |
---|
340 | to the call to \lstinline'g', and casts have been added to change between generic pointers |
---|
341 | and pointers specific to the \lstinline'__data' section of memory. The |
---|
342 | underlying data structure also has types attached to every expression, but |
---|
343 | these are impractical to show in source form. |
---|
344 | |
---|
345 | Note that the translation from C to \clight{} is not proven correct |
---|
346 | --- instead it effectively forms a semi-formal part of the whole C |
---|
347 | semantics. We can have some confidence in the code, however, because |
---|
348 | it has received testing in the \cerco{} prototype, and it is very |
---|
349 | close to the version used in CompCert. We can also perform testing of |
---|
350 | the semantics without involving the rest of the compiler because we |
---|
351 | have an executable semantics. Moreover, the cautious programmer could |
---|
352 | choose to inspect the generated \clight{} code, or even work entirely |
---|
353 | in the \clight{} language. |
---|
354 | |
---|
355 | \subsection{Small-step inductive semantics} |
---|
356 | \label{sec:inductive} |
---|
357 | |
---|
358 | The semantics for \clight{} itself has been ported from the Coq development |
---|
359 | used in CompCert to \matita{} for use in \cerco{}. Details about the original |
---|
360 | big-step formalisation of \clight{} can be found in Leroy and |
---|
361 | Blazy~\cite{compcert-mm08} (including a discussion of the translation from C |
---|
362 | in \S 4.1), although we started from a later version with a |
---|
363 | small-step semantics and hence support for \lstinline'goto' statements. |
---|
364 | Several parts of the semantics were shared with other parts of the CompCert |
---|
365 | development, notably: |
---|
366 | \begin{itemize} |
---|
367 | \item the representation of primitive values (integers, pointers and undefined |
---|
368 | values, but not structures or unions) and operations on them, |
---|
369 | \item traces of I/O events, |
---|
370 | \item a memory model that keeps conceptually distinct sections of memory |
---|
371 | strictly separate (assigning `undefined behaviour' to a buffer overflow, for |
---|
372 | instance), |
---|
373 | \item results about composing execution steps of arbitrary small-step |
---|
374 | semantics, |
---|
375 | \item data structures for local and global environments, and |
---|
376 | \item common error handling constructs, in particular an error monad. |
---|
377 | \end{itemize} |
---|
378 | We anticipate a similar arrangement for the \cerco{} verified |
---|
379 | compiler, although this means that there may be further changes to the |
---|
380 | common parts of the semantics later in the project to harmonise the |
---|
381 | stages of the compiler. In particular, some of data structures for |
---|
382 | environments are just preliminary definitions for developing the |
---|
383 | semantics. |
---|
384 | |
---|
385 | The main body of the small-step semantics is a number of inductive definitions |
---|
386 | giving details of the defined behaviour for casts, expressions and statements. |
---|
387 | Expressions are side-effect free in \clight{} and only produce a value as |
---|
388 | output. In our case we also need a trace of any cost labels that are |
---|
389 | `evaluated' so that we will be able to give fine-grained costs for |
---|
390 | the execution of compiled conditional expressions. |
---|
391 | |
---|
392 | As an example of one of the expression rules, consider an expression which |
---|
393 | evaluates a variable, \lstinline'Expr (Evar id) ty'. A variable is an example |
---|
394 | of a class of expressions called \emph{lvalues}, which are roughly those |
---|
395 | expressions which can be assigned to. Thus we use a general rule for lvalues, |
---|
396 | \label{page:evalvar} |
---|
397 | \begin{quote} |
---|
398 | \begin{lstlisting} |
---|
399 | ninductive eval_expr (ge:genv) (e:env) (m:mem) : expr $\rightarrow$ val $\rightarrow$ trace $\rightarrow$ Prop := |
---|
400 | ... |
---|
401 | | eval_Elvalue: $\forall$a,ty,psp,loc,ofs,v,tr. |
---|
402 | eval_lvalue ge e m (Expr a ty) psp loc ofs tr $\rightarrow$ |
---|
403 | load_value_of_type ty m psp loc ofs = Some ? v $\rightarrow$ |
---|
404 | eval_expr ge e m (Expr a ty) v tr |
---|
405 | \end{lstlisting} |
---|
406 | \end{quote} |
---|
407 | where the auxiliary relation \lstinline'eval_lvalue' yields the location of |
---|
408 | the value, \lstinline'psp,loc,ofs', consisting of memory space, memory block, |
---|
409 | and offset into the block, respectively. The expression can thus evaluate to |
---|
410 | the value \lstinline'v' if \lstinline'v' can be loaded from that location. |
---|
411 | One corresponding part of the \lstinline'eval_lvalue' definition is |
---|
412 | \begin{quote} |
---|
413 | \begin{lstlisting} |
---|
414 | ... |
---|
415 | with eval_lvalue (*(ge:genv) (e:env) (m:mem)*) : |
---|
416 | expr $\rightarrow$ memory_space $\rightarrow$ block $\rightarrow$ int $\rightarrow$ trace $\rightarrow$ Prop ≝ |
---|
417 | | eval_Evar_local: $\forall$id,l,ty. |
---|
418 | get ??? id e = Some ? l $\rightarrow$ |
---|
419 | eval_lvalue ge e m (Expr (Evar id) ty) Any l zero E0 |
---|
420 | ... |
---|
421 | \end{lstlisting} |
---|
422 | \end{quote} |
---|
423 | simply looks up the variable in the local environment. The offset is |
---|
424 | zero because all variables are given their own memory block to prevent |
---|
425 | the use of stray pointers. A similar rule handles global variables, |
---|
426 | with an extra check to ensure that no local variable has the same |
---|
427 | name. Note that the two relations are defined using mutual recursion |
---|
428 | because \lstinline'eval_lvalue' uses \lstinline'eval_expr' for the |
---|
429 | evaluation of the pointer expression in the dereferencing rule. |
---|
430 | |
---|
431 | Casts also have an auxiliary relation to specify the allowed changes, and |
---|
432 | operations on values (including the changes in representation performed by |
---|
433 | casting) are given as functions. |
---|
434 | |
---|
435 | The only new expression in our semantics is the cost label which wraps |
---|
436 | around another expression. It does not change the result, but merely |
---|
437 | augments the trace with the given label to identify the branches taken |
---|
438 | in conditional expressions so that accurate cost information can be |
---|
439 | attached to the program: |
---|
440 | \begin{quote} |
---|
441 | \begin{lstlisting} |
---|
442 | | eval_Ecost: $\forall$a,ty,v,l,tr. |
---|
443 | eval_expr ge e m a v tr $\rightarrow$ |
---|
444 | eval_expr ge e m (Expr (Ecost l a) ty) v (tr++Echarge l) |
---|
445 | \end{lstlisting} |
---|
446 | \end{quote} |
---|
447 | |
---|
448 | As the expressions are side-effect free, all of the changes to the state are |
---|
449 | performed by statements. The state itself is represented by records of the |
---|
450 | form |
---|
451 | \begin{quote} |
---|
452 | \begin{lstlisting} |
---|
453 | ninductive state: Type := |
---|
454 | | State: |
---|
455 | $\forall$f: function. |
---|
456 | $\forall$s: statement. |
---|
457 | $\forall$k: cont. |
---|
458 | $\forall$e: env. |
---|
459 | $\forall$m: mem. state |
---|
460 | | Callstate: |
---|
461 | $\forall$fd: fundef. |
---|
462 | $\forall$args: list val. |
---|
463 | $\forall$k: cont. |
---|
464 | $\forall$m: mem. state |
---|
465 | | Returnstate: |
---|
466 | $\forall$res: val. |
---|
467 | $\forall$k: cont. |
---|
468 | $\forall$m: mem. state. |
---|
469 | \end{lstlisting} |
---|
470 | \end{quote} |
---|
471 | During normal execution the state contains the currently executing |
---|
472 | function's definition (used to find \lstinline'goto' labels and also |
---|
473 | to check whether the function is expected to return a value), the |
---|
474 | statement to be executed next, a continuation value to be executed |
---|
475 | afterwards (where successor statements and details of function calls |
---|
476 | and loops are stored), the local environment mapping variables to |
---|
477 | memory locations\footnote{In the semantics all variables are |
---|
478 | allocated, although the compiler may subsequently allocate them to |
---|
479 | registers where possible.} and the current memory state. The |
---|
480 | function call and return states appear to store less information |
---|
481 | because the details of the caller are contained in the continuation. |
---|
482 | |
---|
483 | An example of the statement execution rules is the assignment rule |
---|
484 | (corresponding to the C syntax \lstinline'a1 = a2'), |
---|
485 | \label{page:Sassign} |
---|
486 | \begin{quote} |
---|
487 | \begin{lstlisting} |
---|
488 | ninductive step (ge:genv) : state $\rightarrow$ trace $\rightarrow$ state $\rightarrow$ Prop ≝ |
---|
489 | | step_assign: $\forall$f,a1,a2,k,e,m,psp,loc,ofs,v2,m',tr1,tr2. |
---|
490 | eval_lvalue ge e m a1 psp loc ofs tr1 $\rightarrow$ |
---|
491 | eval_expr ge e m a2 v2 tr2 $\rightarrow$ |
---|
492 | store_value_of_type (typeof a1) m psp loc ofs v2 = Some ? m' $\rightarrow$ |
---|
493 | step ge (State f (Sassign a1 a2) k e m) |
---|
494 | (tr1++tr2) (State f Sskip k e m') |
---|
495 | ... |
---|
496 | \end{lstlisting} |
---|
497 | \end{quote} |
---|
498 | which can be read as: |
---|
499 | \begin{itemize} |
---|
500 | \item if \lstinline'a1' can evaluate to the location \lstinline'psp,loc,ofs', |
---|
501 | \item \lstinline'a2' can evaluate to a value \lstinline'v2', and |
---|
502 | \item storing \lstinline'v2' at location \lstinline'psp,loc,ofs' succeeds, |
---|
503 | yielding the new memory state \lstinline-m'-, then |
---|
504 | \item the program can step from the state about to execute |
---|
505 | \lstinline'Sassign a1 a2' to a state with the updated memory \lstinline-m'- |
---|
506 | about to execute the no-op \lstinline'Sskip'. |
---|
507 | \end{itemize} |
---|
508 | This rule would be followed by one of the rules to which replaces the |
---|
509 | \lstinline'Sskip' statement with the `real' next statement constructed |
---|
510 | from the continuation \lstinline'k'. Note that the only true |
---|
511 | side-effect here is the change in memory --- the local environment is |
---|
512 | initialised once and for all on function entry, and the only events |
---|
513 | appearing in the trace are cost labels used purely for accounting. At |
---|
514 | present this imposes an ordering due to the cost labels. Should this |
---|
515 | prove too restrictive we may change it to produce a set of labels |
---|
516 | encountered. |
---|
517 | |
---|
518 | The \clight{} language provides input and output effects through `external' |
---|
519 | functions and the step rule |
---|
520 | \begin{quote} |
---|
521 | \begin{lstlisting} |
---|
522 | | step_external_function: $\forall$id,targs,tres,vargs,k,m,vres,t. |
---|
523 | event_match (external_function id targs tres) vargs t vres $\rightarrow$ |
---|
524 | step ge (Callstate (External id targs tres) vargs k m) |
---|
525 | t (Returnstate vres k m) |
---|
526 | \end{lstlisting} |
---|
527 | \end{quote} |
---|
528 | which allows the function to be invoked with and return any values subject to |
---|
529 | the enforcement of the typing rules in \lstinline'event_match', which also |
---|
530 | provides the trace. |
---|
531 | |
---|
532 | Cost label statements prefix the trace with the given label, similar to the cost |
---|
533 | label expressions above. |
---|
534 | |
---|
535 | \section{Executable semantics} |
---|
536 | \label{sec:exec} |
---|
537 | |
---|
538 | We have added an equivalent functional version of the \clight{} semantics that |
---|
539 | can be used to animate programs. The definitions roughly follow the inductive |
---|
540 | semantics, but are necessarily rearranged around pattern matching of the |
---|
541 | relevant parts of the state rather than presenting each case separately. |
---|
542 | |
---|
543 | \subsection{Expressions} |
---|
544 | |
---|
545 | The code corresponding to the variable lookup definitions on |
---|
546 | page~\pageref{page:evalvar} is |
---|
547 | \begin{quote} |
---|
548 | \begin{lstlisting} |
---|
549 | nlet rec exec_expr (ge:genv) (en:env) (m:mem) (e:expr) on e : res (val$\times$trace) ≝ |
---|
550 | match e with |
---|
551 | [ Expr e' ty $\Rightarrow$ |
---|
552 | match e' with |
---|
553 | [ ... |
---|
554 | | Evar _ $\Rightarrow$ |
---|
555 | do $\langle$l,tr$\rangle$ $\leftarrow$ exec_lvalue' ge en m e' ty; |
---|
556 | do v $\leftarrow$ opt_to_res ? (load_value_of_type' ty m l); |
---|
557 | OK ? $\langle$v,tr$\rangle$ |
---|
558 | ... |
---|
559 | ] |
---|
560 | ] |
---|
561 | and exec_lvalue' (ge:genv) (en:env) (m:mem) (e':expr_descr) (ty:type) on e' |
---|
562 | : res (memory_space $\times$ block $\times$ int $\times$ trace) ≝ |
---|
563 | match e' with |
---|
564 | [ Evar id $\Rightarrow$ |
---|
565 | match (get … id en) with |
---|
566 | [ None $\Rightarrow$ do $\langle$sp,l$\rangle$ $\leftarrow$ opt_to_res ? (find_symbol ? ? ge id); |
---|
567 | OK ? $\langle$$\langle$$\langle$sp,l$\rangle$,zero$\rangle$,E0$\rangle$ (* global *) |
---|
568 | | Some loc $\Rightarrow$ OK ? $\langle$$\langle$$\langle$Any,loc$\rangle$,zero$\rangle$,E0$\rangle$ (* local *) |
---|
569 | ] |
---|
570 | ... |
---|
571 | \end{lstlisting} |
---|
572 | \end{quote} |
---|
573 | where the result is placed in an error monad (the \lstinline'res' type |
---|
574 | constructor) so that \emph{undefined behaviour} such as dereferencing an |
---|
575 | invalid pointer can be rejected. We use \lstinline'do' notation similar to |
---|
576 | Haskell and CompCert, where |
---|
577 | \begin{quote} |
---|
578 | \begin{lstlisting} |
---|
579 | do x $\leftarrow$ e; e' |
---|
580 | \end{lstlisting} |
---|
581 | \end{quote} |
---|
582 | means evaluate \lstinline'e' and if it yields a value then bind that to |
---|
583 | \lstinline'x' and evaluate \lstinline-e'-, and otherwise propogate the error. |
---|
584 | |
---|
585 | \subsection{Statements} |
---|
586 | |
---|
587 | Evaluating a step of a statement is complicated by the presence of the |
---|
588 | `external' functions for I/O, which can return arbitrary values. These are |
---|
589 | handled by a resumption monad, which on encountering some I/O returns a |
---|
590 | suspension. When the suspension is applied to a value the evaluation of the |
---|
591 | semantics is resumed. Resumption monads are a standard tool for providing |
---|
592 | denotational semantics for input~\cite{Moggi199155} and interleaved |
---|
593 | concurrency~\cite[Chapter 12]{schmidt86}. |
---|
594 | The definition also incorporates errors, and uses a coercion to automatically |
---|
595 | transform values from the plain error monad. |
---|
596 | |
---|
597 | The definition of the monad is: |
---|
598 | \begin{quote} |
---|
599 | \begin{lstlisting} |
---|
600 | ninductive IO (output:Type) (input:output $\rightarrow$ Type) (T:Type) : Type := |
---|
601 | | Interact : $\forall$o:output. (input o $\rightarrow$ IO output input T) $\rightarrow$ IO output input T |
---|
602 | | Value : T $\rightarrow$ IO output input T |
---|
603 | | Wrong : IO output input T. |
---|
604 | |
---|
605 | nlet rec bindIO (O:Type) (I:O $\rightarrow$ Type) (T,T':Type) |
---|
606 | (v:IO O I T) (f:T $\rightarrow$ IO O I T') on v : IO O I T' := |
---|
607 | match v with |
---|
608 | [ Interact out k $\Rightarrow$ (Interact ??? out ($\lambda$res. bindIO O I T T' (k res) f)) |
---|
609 | | Value v' $\Rightarrow$ (f v') |
---|
610 | | Wrong $\Rightarrow$ Wrong O I T' |
---|
611 | ]. |
---|
612 | \end{lstlisting} |
---|
613 | \end{quote} |
---|
614 | Note that the type of the input value is dependent on the output value. |
---|
615 | This enables us to ensure that the input is always well-typed. An |
---|
616 | alternative approach is a check in the semantics, but this causes |
---|
617 | programs to fail in a way that has no counterpart in the inductive |
---|
618 | semantics. |
---|
619 | |
---|
620 | The execution of assignments is straightforward, |
---|
621 | \begin{quote} |
---|
622 | \begin{lstlisting} |
---|
623 | nlet rec exec_step (ge:genv) (st:state) on st : (IO io_out io_in (trace $\times$ state)) ≝ |
---|
624 | match st with |
---|
625 | [ State f s k e m $\Rightarrow$ |
---|
626 | match s with |
---|
627 | [ Sassign a1 a2 $\Rightarrow$ |
---|
628 | ! $\langle$l,tr1$\rangle$ $\leftarrow$ exec_lvalue ge e m a1; |
---|
629 | ! $\langle$v2,tr2$\rangle$ $\leftarrow$ exec_expr ge e m a2; |
---|
630 | ! m' $\leftarrow$ store_value_of_type' (typeof a1) m l v2; |
---|
631 | ret ? $\langle$tr1++tr2, State f Sskip k e m'$\rangle$ |
---|
632 | ... |
---|
633 | \end{lstlisting} |
---|
634 | \end{quote} |
---|
635 | where \lstinline'!' is used in place of \lstinline'do' due to the change in |
---|
636 | monad. The content is essentially the same as the inductive rule given on |
---|
637 | page~\pageref{page:Sassign}. |
---|
638 | |
---|
639 | Most other rules are similar translations of the inductive semantics. |
---|
640 | The handling of external calls uses the |
---|
641 | \begin{quote} |
---|
642 | \begin{lstlisting} |
---|
643 | do_io : ident $\rightarrow$ list eventval $\rightarrow$ IO eventval io_out eventval |
---|
644 | \end{lstlisting} |
---|
645 | \end{quote} |
---|
646 | function to suspend execution: |
---|
647 | \begin{quote} |
---|
648 | \begin{lstlisting} |
---|
649 | ... |
---|
650 | | Callstate f0 vargs k m $\Rightarrow$ |
---|
651 | match f0 with |
---|
652 | [ ... |
---|
653 | | External f argtys retty $\Rightarrow$ |
---|
654 | ! evargs $\leftarrow$ err_to_io_sig … (check_eventval_list vargs (typlist_of_typelist argtys)); |
---|
655 | ! evres $\leftarrow$ do_io f evargs (proj_sig_res (signature_of_type argtys retty)); |
---|
656 | ret ? $\langle$Eextcall f evargs (mk_eventval ? evres), Returnstate (mk_val ? evres) k m$\rangle$ |
---|
657 | ] |
---|
658 | ... |
---|
659 | \end{lstlisting} |
---|
660 | \end{quote} |
---|
661 | The rest of the code after \lstinline'do_io' is included in the |
---|
662 | suspension returned. |
---|
663 | |
---|
664 | Together with functions to provide the initial state for a program and to |
---|
665 | detect a final state we can write a function to run the program up to a given |
---|
666 | number of steps. Similarly, a corecursive function can return the entire |
---|
667 | execution as a stream of trace and state pairs. |
---|
668 | |
---|
669 | \section{Validation} |
---|
670 | \label{sec:valid} |
---|
671 | |
---|
672 | We have used two methods to validate our executable semantics: we have |
---|
673 | proven them equivalent to the inductive semantics of |
---|
674 | Section~\ref{sec:inductive}, and we have animated small examples of |
---|
675 | key areas. |
---|
676 | |
---|
677 | \subsection{Equivalence to inductive semantics} |
---|
678 | |
---|
679 | To show that the executable semantics are sound with respect to the |
---|
680 | inductive semantics we need to prove that any value produced by each |
---|
681 | function satisfies the corresponding relation, modulo errors and |
---|
682 | resumption. To deal with these monads we lift the properties |
---|
683 | required. In particular, for the resumption monad we ignore error |
---|
684 | values, require the property when a value is produced, and quantify |
---|
685 | over any interaction with the outside world: |
---|
686 | \begin{quote} |
---|
687 | \begin{lstlisting} |
---|
688 | nlet rec P_io O I (A:Type) (P:A $\rightarrow$ Prop) (v:IO O I A) on v : Prop := |
---|
689 | match v return $\lambda$_.Prop with |
---|
690 | [ Wrong $\Rightarrow$ True |
---|
691 | | Value z $\Rightarrow$ P z |
---|
692 | | Interact out k $\Rightarrow$ $\forall$v'.P_io O I A P (k v') |
---|
693 | ]. |
---|
694 | \end{lstlisting} |
---|
695 | \end{quote} |
---|
696 | We can use this lifting with the relations from the inductive |
---|
697 | semantics to state soundness properties: |
---|
698 | \begin{quote} |
---|
699 | \begin{lstlisting} |
---|
700 | ntheorem exec_step_sound: $\forall$ge,st. |
---|
701 | P_io ??? ($\lambda$r. step ge st (\fst r) (\snd r)) (exec_step ge st). |
---|
702 | \end{lstlisting} |
---|
703 | \end{quote} |
---|
704 | The proofs of these theorems use case analysis over the state, a few |
---|
705 | lemmas to break up the expressions in the monad and the other |
---|
706 | soundness results to form the corresponding derivation in the |
---|
707 | inductive semantics. |
---|
708 | |
---|
709 | We experimented with a different way of specifying soundness using |
---|
710 | dependent types: |
---|
711 | \begin{quote} |
---|
712 | \begin{lstlisting} |
---|
713 | nlet rec exec_step (ge:genv) (st:state) on st |
---|
714 | : (IO eventval io_out ($\Sigma$r:trace $\times$ state. step ge st (\fst r) (\snd r))) ≝ |
---|
715 | \end{lstlisting} |
---|
716 | \end{quote} |
---|
717 | Note the $\Sigma$ type for the result of the function, which shows that |
---|
718 | successful executions are sound with respect to the inductive semantics. |
---|
719 | Matita automatically generates proof obligations for each case due to a |
---|
720 | coercion between the types |
---|
721 | \begin{center} |
---|
722 | \lstinline'option (res T)' \quad and \quad \lstinline'res ($\Sigma$x:T. P x)' |
---|
723 | \end{center} |
---|
724 | (where a branch marked \lstinline'None' would generate a proof obligation to |
---|
725 | show that it is impossible, although the semantics do not use this feature). |
---|
726 | This is intended to mimic Sozeau's \textsc{Russell} language and elaboration |
---|
727 | into Coq~\cite{sozeau06}. The coercion also triggers an automatic mechanism |
---|
728 | in \matita{} to add equalities for each pattern matched. The proofs |
---|
729 | are essentially the same as before. |
---|
730 | |
---|
731 | However, the soundness proofs then pervade the executable semantics, |
---|
732 | making rewriting in the correctness proofs more difficult. We decided |
---|
733 | to keep the soundness results separate, partly because of the |
---|
734 | increased difficulty of using the resulting terms in proofs, and |
---|
735 | partly because they are of little consequence once equivalence has |
---|
736 | been shown. |
---|
737 | |
---|
738 | The completeness results requiring a dual lifting which requires the |
---|
739 | term to reduce to a particular value, allowing for resumptions with |
---|
740 | existential quantification: |
---|
741 | \begin{quote} |
---|
742 | \begin{lstlisting} |
---|
743 | nlet rec yieldsIO (A:Type) (a:IO io_out io_in A) (v':A) on a : Prop := |
---|
744 | match a with |
---|
745 | [ Value v $\Rightarrow$ v' = v |
---|
746 | | Interact _ k $\Rightarrow$ $\exists$r.yieldsIO A (k r) v' |
---|
747 | | _ $\Rightarrow$ False |
---|
748 | ]. |
---|
749 | \end{lstlisting} |
---|
750 | \end{quote} |
---|
751 | We then show the completeness theorems, such as |
---|
752 | \begin{quote} |
---|
753 | \begin{lstlisting} |
---|
754 | ntheorem step_complete: $\forall$ge,s,tr,s'. |
---|
755 | step ge s tr s' $\rightarrow$ yieldsIO ? (exec_step ge s) $\langle$tr,s'$\rangle$. |
---|
756 | \end{lstlisting} |
---|
757 | \end{quote} |
---|
758 | by case analysis on the inductive derivation and a mixture of |
---|
759 | reduction and rewriting. Thus we know that executing a step in these |
---|
760 | semantics is equivalent to a step in the inductive semantics. |
---|
761 | |
---|
762 | Showing the equivalence of whole program execution is a little |
---|
763 | trickier. Our executable semantics produces a coinductive |
---|
764 | \lstinline'execution' which is really a tree of executions, branching |
---|
765 | at each I/O resumption on the input value: |
---|
766 | \[ |
---|
767 | \begin{array}{rcl@{\;}l} |
---|
768 | & & \mathsf{e\_step}\ t_i\ s_i \rightarrow &\cdots\ \mathsf{e\_stop}\ t_j\ i\ m \\ |
---|
769 | & \nearrow & \quad\quad \vdots \\ |
---|
770 | \mathsf{e\_step}\ \mathsf{E0}\ s_0 \rightarrow \mathsf{e\_step}\ t_1\ s_1 \rightarrow \cdots \mathsf{e\_interact}\ o_1\ k_1 & \rightarrow & \mathsf{e\_step}\ t_i'\ s_i' \rightarrow &\cdots\ \mathsf{e\_step}\ t_j\ s_j \rightarrow \cdots \\ |
---|
771 | & \searrow & \quad\quad\vdots \\ |
---|
772 | && \mathsf{e\_step}\ t_i''\ s_i'' \rightarrow &\cdots\ \mathsf{e\_wrong} |
---|
773 | \end{array} |
---|
774 | \] |
---|
775 | Each \textsf{e\_step} comes with the trace (often the empty trace, |
---|
776 | \textsf{E0}) and current state. Each branch corresponds to calling |
---|
777 | the continuation $k_1$ with a different input value. We use the |
---|
778 | \lstinline'single_exec_of' predicate to identify single executions from |
---|
779 | these trees, essentially fixing a stream of input values. |
---|
780 | |
---|
781 | However, the inductive semantics divides program behaviours into four |
---|
782 | categories which have \emph{individual} (co)inductive descriptions: |
---|
783 | \begin{itemize} |
---|
784 | \item successfully terminating executions; |
---|
785 | \item programs which eventually diverge (with an empty trace); |
---|
786 | \item programs which keep interacting in some way (with an infinite |
---|
787 | trace)\footnote{In our setting this includes passing through cost |
---|
788 | labels as well as I/O.}; and |
---|
789 | \item programs which \emph{go wrong}. |
---|
790 | \end{itemize} |
---|
791 | |
---|
792 | We cannot constructively decide which of these categories an execution |
---|
793 | can fit into because the properties they describe are undecidable. |
---|
794 | Hence we follow CompCert's approach for showing that one of the |
---|
795 | behaviours always exists using classical logic. Thus we characterise |
---|
796 | the executions, then show the existence of the inductive semantics' |
---|
797 | behaviour that matches. We limit the scope of classical reasoning by |
---|
798 | taking the relevant axioms as hypotheses: |
---|
799 | \begin{quote} |
---|
800 | \begin{lstlisting} |
---|
801 | ntheorem exec_inf_equivalence: |
---|
802 | $\forall$classic:($\forall$P:Prop.P $\vee$ $\neg$P). |
---|
803 | $\forall$constructive_indefinite_description:($\forall$A:Type. $\forall$P:A$\rightarrow$Prop. ($\exists$x. P x) $\rightarrow$ $\Sigma$x:A. P x). |
---|
804 | $\forall$p,e. single_exec_of (exec_inf p) e $\rightarrow$ |
---|
805 | $\exists$b.execution_matches_behavior e b $\wedge$ exec_program p b. |
---|
806 | \end{lstlisting} |
---|
807 | \end{quote} |
---|
808 | |
---|
809 | \subsection{Animation of simple C programs} |
---|
810 | |
---|
811 | We are currently working with a development version of \matita{} which |
---|
812 | (temporarily) does not support extraction to OCaml code. Hence to |
---|
813 | animate a program we first parse it with CIL and produce a \matita{} |
---|
814 | term in text format representing the program, then interpret it within |
---|
815 | \matita{}. |
---|
816 | |
---|
817 | This process is rather laborious, so we have concentrated on testing |
---|
818 | small programs which exercised areas of the semantics which depart |
---|
819 | from CompCert. In particular, we tested several aspects of the |
---|
820 | handling of memory spaces and the casting of pointers. Together with |
---|
821 | the results in the previous sections we gain considerable confidence |
---|
822 | that the semantics describe the behaviour of programs properly. |
---|
823 | Nevertheless, we intend to experiment with larger C programs once |
---|
824 | extraction is available. |
---|
825 | |
---|
826 | To give a concrete example, the following C program reads an integer |
---|
827 | using an `external' function and returns its factorial as the |
---|
828 | program's exit value: |
---|
829 | \begin{lstlisting}[language=C] |
---|
830 | int get_input(void); |
---|
831 | |
---|
832 | int main(void) { |
---|
833 | int i = get_input(); |
---|
834 | int r = 1; |
---|
835 | int j; |
---|
836 | |
---|
837 | for (j = 2; j<=i; j++) |
---|
838 | r = r * j; |
---|
839 | |
---|
840 | return r; |
---|
841 | } |
---|
842 | \end{lstlisting} |
---|
843 | The Clight code is essentially the same, and the \matita{} term is: |
---|
844 | \begin{lstlisting} |
---|
845 | ndefinition myprog := mk_program fundef type |
---|
846 | [mk_pair ?? (succ_pos_of_nat 132 (* get_input *)) |
---|
847 | (External (succ_pos_of_nat 132) Tnil (Tint I32 Signed)); |
---|
848 | mk_pair ?? (succ_pos_of_nat 133 (* main *)) (Internal ( |
---|
849 | mk_function (Tint I32 Signed ) [] [mk_pair ?? (succ_pos_of_nat 134) (Tint I32 Signed); |
---|
850 | mk_pair ?? (succ_pos_of_nat 135) (Tint I32 Signed); |
---|
851 | mk_pair ?? (succ_pos_of_nat 136) (Tint I32 Signed)] |
---|
852 | (Ssequence |
---|
853 | (Scall (Some ? (Expr (Evar (succ_pos_of_nat 134)) (Tint I32 Signed))) |
---|
854 | (Expr (Evar (succ_pos_of_nat 132)) |
---|
855 | (Tfunction Tnil (Tint I32 Signed))) |
---|
856 | []) |
---|
857 | (Ssequence |
---|
858 | (Sassign (Expr (Evar (succ_pos_of_nat 135)) (Tint I32 Signed)) |
---|
859 | (Expr (Econst_int (repr 1)) (Tint I32 Signed))) |
---|
860 | (Ssequence |
---|
861 | (Sfor (Sassign (Expr (Evar (succ_pos_of_nat 136)) (Tint I32 Signed)) |
---|
862 | (Expr (Econst_int (repr 2)) (Tint I32 Signed))) |
---|
863 | (Expr (Ebinop Ole |
---|
864 | (Expr (Evar (succ_pos_of_nat 136)) (Tint I32 Signed)) |
---|
865 | (Expr (Evar (succ_pos_of_nat 134)) (Tint I32 Signed))) |
---|
866 | (Tint I32 Signed )) |
---|
867 | (Sassign (Expr (Evar (succ_pos_of_nat 136)) (Tint I32 Signed)) |
---|
868 | (Expr (Ebinop Oadd |
---|
869 | (Expr (Evar (succ_pos_of_nat 136)) (Tint I32 Signed)) |
---|
870 | (Expr (Econst_int (repr 1)) (Tint I32 Signed))) |
---|
871 | (Tint I32 Signed))) |
---|
872 | (Sassign (Expr (Evar (succ_pos_of_nat 135)) (Tint I32 Signed)) |
---|
873 | (Expr (Ebinop Omul |
---|
874 | (Expr (Evar (succ_pos_of_nat 135)) (Tint I32 Signed)) |
---|
875 | (Expr (Evar (succ_pos_of_nat 136)) (Tint I32 Signed))) |
---|
876 | (Tint I32 Signed))) |
---|
877 | ) |
---|
878 | (Sreturn (Some ? (Expr (Evar (succ_pos_of_nat 135)) |
---|
879 | (Tint I32 Signed))))))) |
---|
880 | ))] |
---|
881 | (succ_pos_of_nat 133) |
---|
882 | []. |
---|
883 | \end{lstlisting} |
---|
884 | We can use the definitions in the \texttt{Animation.ma} file to reduce |
---|
885 | the term for a given input (5, in this case; executing a maximum of 40 steps): |
---|
886 | \begin{lstlisting} |
---|
887 | nremark exec: result ? (exec_up_to myprog 40 [EVint (repr 5)]). |
---|
888 | nnormalize; (* you can examine the result here *) |
---|
889 | @; nqed. |
---|
890 | \end{lstlisting} |
---|
891 | The result state records the interaction (the \lstinline'EVextcall') |
---|
892 | and the return result (120 in least-significant-bit first binary): |
---|
893 | \begin{lstlisting} |
---|
894 | result (res (list event$\times$state)) |
---|
895 | (OK (list event$\times$state) |
---|
896 | $\langle$[(EVextcall (p1 (p0 (p1 (p0 (p0 (p0 (p0 one))))))) [] |
---|
897 | (EVint (mk_int (pos (p1 (p0 one))) (inrg_mod (pos (p1 (p0 one)))))))], |
---|
898 | Returnstate (Vint (mk_int (pos (p0 (p0 (p0 (p1 (p1 (p1 one))))))) |
---|
899 | (inrg_mod (pos (p0 (p0 (p0 (p1 (p1 (p1 one)))))))))) |
---|
900 | Kstop MEM $\rangle$) |
---|
901 | \end{lstlisting} |
---|
902 | |
---|
903 | \appendix |
---|
904 | |
---|
905 | \section{Description of the Code} |
---|
906 | |
---|
907 | The files ported from CompCert were based on version 1.6, with some |
---|
908 | minor details taken from 1.7.1 (in particular, the parser and the |
---|
909 | method of building infinite traces for the equivalence proof). |
---|
910 | |
---|
911 | The majority of the semantics is given as \matita{} source files. The |
---|
912 | exception is the changes to the CIL based parser, which is presented |
---|
913 | as a patch to a preliminary version of the \cerco{} prototype |
---|
914 | compiler. This patch provides both the extensions to the parser and a |
---|
915 | pretty printer to produce a usable \matita{} term representing the |
---|
916 | program. |
---|
917 | |
---|
918 | \begin{quote} |
---|
919 | \begin{tabular}{p{9em}l} |
---|
920 | acc-0.1.spaces.patch & Changes to early prototype compiler for parsing |
---|
921 | \end{tabular} |
---|
922 | \end{quote} |
---|
923 | |
---|
924 | \subsubsection*{Ancilliary definitions} |
---|
925 | |
---|
926 | Files corresponding to CompCert. |
---|
927 | |
---|
928 | \begin{quote} |
---|
929 | \begin{tabular}{p{9em}l} |
---|
930 | Coqlib.ma & Minor definitions ported from CompCert \\ |
---|
931 | Errors.ma & The error monad \\ |
---|
932 | Floats.ma & Axiomatised floating point numbers \\ |
---|
933 | Globalenvs.ma & Global environments \\ |
---|
934 | Integers.ma & Integers modulo powers of two \\ |
---|
935 | Maps.ma & Finite maps (used in particular for local environments) \\ |
---|
936 | Smallstep.ma & Generic definitions and lemmas for small step semantics \\ |
---|
937 | \end{tabular} |
---|
938 | \end{quote} |
---|
939 | |
---|
940 | \noindent |
---|
941 | Files specific to this development. |
---|
942 | |
---|
943 | \begin{quote} |
---|
944 | \begin{tabular}{p{9em}l} |
---|
945 | binary/positive.ma & Binary positive numbers \\ |
---|
946 | binary/Z.ma & Binary integers \\ |
---|
947 | extralib.ma & Extensions to \matita{}'s library \\ |
---|
948 | \end{tabular} |
---|
949 | \end{quote} |
---|
950 | |
---|
951 | \subsubsection*{Inductive semantics ported from CompCert} |
---|
952 | |
---|
953 | \begin{quote} |
---|
954 | \begin{tabular}{p{9em}l} |
---|
955 | AST.ma & Minor syntax definitions intended for several compiler stages \\ |
---|
956 | Values.ma & Definitions for values manipulated by Clight programs \\ |
---|
957 | Mem.ma & Definition of the memory model \\ |
---|
958 | Events.ma & I/O events \\ |
---|
959 | CostLabel.ma & Definition of cost labels \\ |
---|
960 | Csyntax.ma & Clight syntax trees \\ |
---|
961 | Csem.ma & Clight inductive semantics \\ |
---|
962 | \end{tabular} |
---|
963 | \end{quote} |
---|
964 | |
---|
965 | \subsubsection*{Executable semantics} |
---|
966 | |
---|
967 | \begin{quote} |
---|
968 | \begin{tabular}{p{9em}l} |
---|
969 | IOMonad.ma & Definitions of I/O resumption monad \\ |
---|
970 | Cexec.ma & Definition of the executable semantics \\ |
---|
971 | CexecSound.ma & Soundness of individual steps \\ |
---|
972 | CexecComplete.ma & Completeness of individual steps \\ |
---|
973 | CexecEquiv.ma & Equivalence of whole program executions \\ |
---|
974 | Animation.ma & Definitions to help test the semantics |
---|
975 | \end{tabular} |
---|
976 | \end{quote} |
---|
977 | |
---|
978 | \bibliographystyle{plain} |
---|
979 | \bibliography{report} |
---|
980 | |
---|
981 | \end{document} |
---|