source: Deliverables/D3.1/Report/report.tex @ 335

Last change on this file since 335 was 335, checked in by campbell, 9 years ago

Quick pass through 3.1 text.

File size: 26.4 KB
Line 
1\documentclass[11pt,epsf,a4wide]{article}
2\usepackage[utf8x]{inputenc}
3\usepackage{listings}
4\usepackage{../../style/cerco}
5\newcommand{\ocaml}{OCaml}
6\newcommand{\clight}{Clight}
7\newcommand{\matita}{matita}
8\newcommand{\sdcc}{\texttt{sdcc}}
9
10\newcommand{\textSigma}{\ensuremath{\Sigma}}
11
12% LaTeX Companion, p 74
13\newcommand{\todo}[1]{\marginpar{\raggedright - #1}}
14
15\lstdefinelanguage{coq}
16  {keywords={Definition,Lemma,Theorem,Remark,Qed,Save,Inductive,Record},
17   morekeywords={[2]if,then,else},
18  }
19
20\lstdefinelanguage{matita}
21  {keywords={ndefinition,nlemma,ntheorem,nremark,ninductive,nrecord,nqed,nlet,rec,match,with,Type},
22   morekeywords={[2]nwhd,nnormalize,nelim,ncases,ndestruct},
23   mathescape=true,
24  }
25
26\lstset{language=matita,basicstyle=\small\tt,columns=flexible,breaklines=false,
27        keywordstyle=\color{red}\bfseries,
28        keywordstyle=[2]\color{blue},
29        commentstyle=\color{green},
30        stringstyle=\color{blue},
31        showspaces=false,showstringspaces=false}
32
33\lstset{extendedchars=false}
34\lstset{inputencoding=utf8x}
35\DeclareUnicodeCharacter{8797}{:=}
36\DeclareUnicodeCharacter{10746}{++}
37\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
38\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}
39
40
41\title{
42INFORMATION AND COMMUNICATION TECHNOLOGIES\\
43(ICT)\\
44PROGRAMME\\
45\vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}}
46
47\date{ }
48\author{}
49
50\begin{document}
51\thispagestyle{empty}
52
53\vspace*{-1cm}
54\begin{center}
55\includegraphics[width=0.6\textwidth]{../../style/cerco_logo.png}
56\end{center}
57
58\begin{minipage}{\textwidth}
59\maketitle
60\end{minipage}
61
62
63\vspace*{0.5cm}
64\begin{center}
65\begin{LARGE}
66\bf
67Report n. D3.1\\
68Executable Formal Semantics of C\\
69\end{LARGE} 
70\end{center}
71
72\vspace*{2cm}
73\begin{center}
74\begin{large}
75Version 1.0
76\end{large}
77\end{center}
78
79\vspace*{0.5cm}
80\begin{center}
81\begin{large}
82Main Authors:\\
83Brian~Campbell, Randy~Pollack
84\end{large}
85\end{center}
86
87\vspace*{\fill}
88\noindent
89Project Acronym: \cerco{}\\
90Project full title: Certified Complexity\\
91Proposal/Contract no.: FP7-ICT-2009-C-243881 \cerco{}\\
92
93\clearpage \pagestyle{myheadings} \markright{\cerco{}, FP7-ICT-2009-C-243881}
94
95\tableofcontents
96
97\section{Introduction}
98
99We present an executable formal semantics of the C programming language which
100will serve as the specification of the input language for the \cerco{}
101verified compiler.  Our semantics is based on Leroy et.~al.'s C semantics for
102the CompCert project~\cite{1111042,compcert-mm08}, which divides the treatment
103of C into two pieces.  The first is an \ocaml{} stage which parses and
104elaborates C into an abstract syntax tree for the simpler \clight{} language,
105based on the CIL C parser.  The second part is a small step semantics for
106\clight{} formalised in the proof tool, which we have ported from Coq to the
107\matita{} theorem prover.
108This semantics is given in the form of inductive definitions, and so we have
109added a third part giving an equivalent functional presentation in \matita{}.
110
111The \cerco{} compiler needs to deal with the constrained memory model of
112the target microcontroller (in our case, the 8051).  Thus each part of the
113semantics has been extended to allow explicit handling of the
114microcontroller's memory spaces.  \emph{Cost labels} have also been added to
115the \clight{} semantics to support the labelling approach to cost annotations
116presented in~\cite{d2.1}.
117
118The following section discusses the C language extensions for memory spaces.
119Then the port of the two stages of the CompCert \clight{} semantics is
120described in Section~\ref{sec:port}, followed by the new executable semantics
121in Section~\ref{sec:exec}.  Finally we discuss how the semantics can be tested
122in Section~\ref{sec:valid}.
123
124\section{Language extensions for the 8051 memory model}
125\label{sec:ext}
126
127The choice of an extended 8051 target for the \cerco{} compiler imposes an
128irregular memory model with tight resource constraints.  The different memory
129spaces and access modes are summarised in Figure~\ref{fig:memorymodel} ---
130essentially the evolution of the 8051 family has fragmented memory into
131four regions:  one half of the `internal' memory is fully accessible but also
132contains the register banks, the second half cannot be accessed by direct addressing
133because it is shadowed by the `Special Function Registers' (SFRs) for I/O;
134`external memory' provides the bulk of memory in a separate address space; and
135the code is in its own read-only space.
136\begin{figure}
137\setlength{\unitlength}{1pt}
138\begin{picture}(410,250)(-50,200)
139%\put(-50,200){\framebox(410,250){}}
140\put(12,410){\makebox(80,0)[b]{Internal (256B)}}
141\put(13,242){\line(0,1){165}}
142\put(93,242){\line(0,1){165}}
143\put(13,407){\line(1,0){80}}
144\put(12,400){\makebox(0,0)[r]{0h}}  \put(14,400){\makebox(0,0)[l]{Register bank 0}}
145\put(13,393){\line(1,0){80}}
146\put(12,386){\makebox(0,0)[r]{8h}}  \put(14,386){\makebox(0,0)[l]{Register bank 1}}
147\put(13,379){\line(1,0){80}}
148\put(12,372){\makebox(0,0)[r]{10h}}  \put(14,372){\makebox(0,0)[l]{Register bank 2}}
149\put(13,365){\line(1,0){80}}
150\put(12,358){\makebox(0,0)[r]{18h}} \put(14,358){\makebox(0,0)[l]{Register bank 3}}
151\put(13,351){\line(1,0){80}}
152\put(12,344){\makebox(0,0)[r]{20h}} \put(14,344){\makebox(0,0)[l]{Bit addressable}}
153\put(13,323){\line(1,0){80}}
154\put(12,316){\makebox(0,0)[r]{30h}}
155  \put(14,309){\makebox(0,0)[l]{\quad \vdots}}
156\put(13,291){\line(1,0){80}}
157\put(12,284){\makebox(0,0)[r]{80h}}
158  \put(14,263){\makebox(0,0)[l]{\quad \vdots}}
159\put(12,249){\makebox(0,0)[r]{ffh}}
160\put(13,242){\line(1,0){80}}
161
162\qbezier(-2,407)(-6,407)(-6,393)
163\qbezier(-6,393)(-6,324)(-10,324)
164\put(-12,324){\makebox(0,0)[r]{indirect}}
165\qbezier(-6,256)(-6,324)(-10,324)
166\qbezier(-2,242)(-6,242)(-6,256)
167
168\qbezier(94,407)(98,407)(98,393)
169\qbezier(98,393)(98,349)(102,349)
170\put(104,349){\makebox(0,0)[l]{direct/stack}}
171\qbezier(98,305)(98,349)(102,349)
172\qbezier(94,291)(98,291)(98,305)
173
174\put(102,242){\framebox(20,49){sfr}}
175% bit access to sfrs?
176
177\qbezier(124,291)(128,291)(128,277)
178\qbezier(128,277)(128,266)(132,266)
179\put(134,266){\makebox(0,0)[l]{direct}}
180\qbezier(128,257)(128,266)(132,266)
181\qbezier(124,242)(128,242)(128,256)
182
183\put(164,410){\makebox(80,0)[b]{External (64kB)}}
184\put(164,220){\line(0,1){187}}
185\put(164,407){\line(1,0){80}}
186\put(244,220){\line(0,1){187}}
187\put(164,242){\line(1,0){80}}
188\put(163,400){\makebox(0,0)[r]{0h}}
189\put(164,324){\makebox(80,0){paged access}}
190  \put(164,310){\makebox(80,0){direct/indirect}}
191\put(163,235){\makebox(0,0)[r]{80h}}
192  \put(164,228){\makebox(80,0){\vdots}}
193  \put(164,210){\makebox(80,0){direct/indirect}}
194
195\put(264,410){\makebox(80,0)[b]{Code (64kB)}}
196\put(264,220){\line(0,1){187}}
197\put(264,407){\line(1,0){80}}
198\put(344,220){\line(0,1){187}}
199\put(263,400){\makebox(0,0)[r]{0h}}
200  \put(264,228){\makebox(80,0){\vdots}}
201  \put(264,324){\makebox(80,0){direct}}
202  \put(264,310){\makebox(80,0){PC relative}}
203\end{picture}
204\caption{The extended 8051 memory model}
205\end{figure}
206
207To make efficient use of the limited amount of memory, compilers for 8051
208microcontrollers provide extra keywords to allocate global variables to
209particular memory spaces, and to limit pointers to address a particular space.
210The freely available \sdcc{} compiler provides the following extensions for
211the 8051 memory spaces:
212\begin{table}[ht]
213\begin{tabular}{rcl}
214Attribute & Pointer size (bytes) & Memory space \\\hline
215\lstinline'__data' & 1 & Internal, first half (0h -- 7fh) \\
216\lstinline'__idata' & 1 & Internal, indirect only (80h -- ffh) \\
217\lstinline'__pdata' & 1 & External, page access (usually 0h -- 7fh) \\
218\lstinline'__xdata' & 2 & External, any (0h -- ffffh) \\
219\lstinline'__code' & 2 & Code, any (0h -- ffffh) \\
220\emph{none}& 3 & Any / Generic pointer
221\end{tabular}
222\end{table}
223The generic pointers are a tagged union of the other kinds of pointers.
224
225We intend the \cerco{} compiler to support extensions that are broadly
226compatible with \sdcc{} to enable the compilation of programs with
227either tool.  In particular, this would allow the comparison of the
228behaviour of test cases compiled with each compiler.  Thus the C
229syntax and semantics have been extended with the memory space
230attributes listed above.  The syntax follows \sdcc{} and in the
231semantics we track the memory space that each block was allocated in
232and only permit access via the appropriate kinds of pointers.  The
233details on these changes are given in the following sections.
234
235The \sdcc{} compiler also supports special variable types for accessing the
236SFRs, which provide the standard I/O mechanism for the 8051 family.  (Note
237that pointers to these types are not permitted because only direct addressing of
238the SFRs is allowed.)  We intend to use CompCert-style `external functions'
239instead of special types.  These are functions which are declared, but no C
240implementation of them is provided.  Instead they are provided by the runtime or
241compiled directly to the corresponding machine code.  This has the advantage
242that no changes from the CompCert semantics are required, and a compatibility
243library can be provided for \sdcc{} if necessary.  The 8051 and the \sdcc{}
244compiler also provide bit-level access to a small region of internal memory.
245We do not intend to expose this feature to C programs in the \cerco{}
246compiler, and so no extension is provided for it.
247
248Finally, we adopt the \sdcc{} extension to allocate a variable at a
249particular address to provide a way to deal with memory mapped I/O in
250the external memory space.
251\todo{Are we really going to do this?}
252
253\section{Port of CompCert \clight{} semantics to \matita{}}
254\label{sec:port}
255
256\todo{Could do with some text here}
257\subsection{Parsing and elaboration}
258The first stage taken from the CompCert semantics is the parsing and
259elaboration of C programs into the simpler \clight{} language.  This is based
260upon the CIL library for parsing, analysing and transforming C programs by
261Necula et.~al.~\cite{cil02}.  The elaboration performed provides explicit type
262information throughout the program, including extra casts for promotion, and
263performs simplifications such as breaking up expressions with side effects
264into effect-free expressions and statements to perform the effects.  The
265transformed \clight{} programs are much more manageable and lack the ambiguities
266of C, but also remain easily understood by C programmers.
267
268The parser has been extended with the 8051 memory spaces attributes given
269above.  The resulting abstract syntax tree records them on global variable
270declarations and pointer types.  However, we also need to deal with them
271during the elaboration process to produce all of the required type information.
272For example, when the address-of operator \lstinline'&' is used it must decide
273which kind of pointer should be used.  Thus the extended elaboration process
274keeps track of the memory space (if any) that the value of each
275expression resides in.  Where the memory space is not known, a generic pointer
276will be used instead.  Moreover, we also include the pointer kind when
277determining whether a cast must be inserted so that conversions between
278pointer representations can be performed.
279
280
281Thus the elaboration turns the following C code
282\begin{quote}
283\begin{lstlisting}[language=C]
284int g(int *x) { return 5; }
285
286int f(__data int *x, int *y) {
287  return x==y ? g(x) : *x;
288}
289
290__data int i = 1;
291
292int main(void) {
293  return f(&i, &i);
294}
295\end{lstlisting}
296\end{quote}
297into the Clight program below:
298\begin{quote}
299\begin{lstlisting}[language=C]
300int g(int *x) { return 5; }
301
302int f(__data int * x, int * y)
303{
304  int t;
305  if (x == (__data int * )y) {
306    t = g((int * )x);
307  } else {
308    t = *x;
309  }
310  return t;
311}
312
313int main(void)
314{
315  int t;
316  t = f(&i, (int * )(&i));
317  return t;
318}
319\end{lstlisting}
320\end{quote}
321The expressions in \lstinline'f' and \lstinline'main' had to be broken up due
322to side-effects, and casts have been added to change between generic pointers
323and pointers specific to the \lstinline'__data' section of memory.  The
324underlying data structure also has types attached to every expression, but
325these are impractical to show in source form.
326
327Note that the translation from C to \clight{} is not proven correct
328--- instead it effectively forms a semi-formal part of the whole C
329semantics.  We can have some confidence in the code, however, because
330it has received testing in the \cerco{} prototype, and it is very
331close to the version used in CompCert.  We can also perform testing of
332the semantics without involving the rest of the compiler because we
333have an executable semantics.  Moreover, the cautious programmer could
334choose to inspect the generated \clight{} code, or even work entirely
335in the \clight{} language.
336
337\subsection{Small-step inductive semantics}
338
339The semantics for \clight{} itself has been ported from the Coq development
340used in CompCert to \matita{} for use in \cerco{}.  Details about the original
341big-step formalisation of \clight{} can be found in Leroy and
342Blazy~\cite{compcert-mm08} (including a discussion of the translation from C
343in \S 4.1), although we started from a later version with a
344small-step semantics and hence support for \lstinline'goto' statements.
345Several parts of the semantics were shared with other parts of the CompCert
346development, notably:
347\begin{itemize}
348\item the representation of primitive values (integers, pointers and undefined
349values, but not structures or unions) and operations on them,
350\item traces of I/O events,
351\item a memory model that keeps conceptually distinct sections of memory
352strictly separate (assigning `undefined behaviour' to a buffer overflow, for
353instance),
354\item results about composing execution steps of arbitrary small-step
355semantics,
356\item data structures for local and global environments, and
357\item common error handling constructs, in particular an error monad.
358\end{itemize}
359We anticipate a similar arrangement for the \cerco{} verified
360compiler, although this means that there may be further changes to the
361common parts of the semantics later in the project to harmonise the
362stages of the compiler.  In particular, some of data structures for
363environments are just preliminary definitions for developing the
364semantics.
365
366The main body of the small-step semantics is a number of inductive definitions
367giving details of the defined behaviour for casts, expressions and statements.
368Expressions are side-effect free in \clight{} and only produce a value as
369output.  In our case we also need a trace of any cost labels that are
370`evaluated' so that we will be able to give fine-grained costs for
371the execution of compiled conditional expressions.
372
373As an example of one of the expression rules, consider an expression which
374evaluates a variable, \lstinline'Expr (Evar id) ty'.  A variable is an example
375of a class of expressions called \emph{lvalues}, which are roughly those
376expressions which can be assigned to.  Thus we use a general rule for lvalues,
377\label{page:evalvar}
378\begin{quote}
379\begin{lstlisting}
380ninductive eval_expr (ge:genv) (e:env) (m:mem) : expr $\rightarrow$ val $\rightarrow$ trace $\rightarrow$ Prop ≝
381...
382  | eval_Elvalue: $\forall$a,ty,psp,loc,ofs,v,tr.
383      eval_lvalue ge e m (Expr a ty) psp loc ofs tr $\rightarrow$
384      load_value_of_type ty m psp loc ofs = Some ? v $\rightarrow$
385      eval_expr ge e m (Expr a ty) v tr
386\end{lstlisting}
387\end{quote}
388where the auxiliary relation \lstinline'eval_lvalue' yields the location of
389the value, \lstinline'psp,loc,ofs', consisting of memory space, memory block,
390and offset into the block, respectively.  The expression can thus evaluate to
391the value \lstinline'v' if \lstinline'v' can be loaded from that location.
392One corresponding part of the \lstinline'eval_lvalue' definition is
393\begin{quote}
394\begin{lstlisting}
395...
396with eval_lvalue (*(ge:genv) (e:env) (m:mem)*) :
397        expr $\rightarrow$ memory_space $\rightarrow$ block $\rightarrow$ int $\rightarrow$ trace $\rightarrow$ Prop ≝
398  | eval_Evar_local:   $\forall$id,l,ty.
399      get ??? id e = Some ? l $\rightarrow$
400      eval_lvalue ge e m (Expr (Evar id) ty) Any l zero E0
401...
402\end{lstlisting}
403\end{quote}
404simply looks up the variable in the local environment.  The offset is
405zero because all variables are given their own memory block to prevent
406the use of stray pointers.  A similar rule handles global variables,
407with an extra check to ensure that no local variable has the same
408name.  Note that the two relations are defined using mutual recursion
409because \lstinline'eval_lvalue' uses \lstinline'eval_expr' for the
410evaluation of the pointer expression in the dereferencing rule.
411
412Casts also have an auxiliary relation to specify the allowed changes, and
413operations on values (including the changes in representation performed by
414casting) are given as functions.
415
416The only new expression in our semantics is the cost label which wraps
417around another expression.  It does not change the result, but merely
418prefixes the trace with the given label to identify the branches taken
419in conditional expressions so that accurate cost information can be
420attached to the program:
421\begin{quote}
422\begin{lstlisting}
423  | eval_Ecost: $\forall$a,ty,v,l,tr.
424      eval_expr ge e m a v tr $\rightarrow$
425      eval_expr ge e m (Expr (Ecost l a) ty) v (tr++Echarge l)
426\end{lstlisting}
427\end{quote}
428\todo{make the semantics of the cost labels clearer}
429
430As the expressions are side-effect free, all of the changes to the state are
431performed by statements.  The state itself is represented by records of the
432form
433\begin{quote}
434\begin{lstlisting}
435ninductive state: Type :=
436  | State:
437      $\forall$f: function.
438      $\forall$s: statement.
439      $\forall$k: cont.
440      $\forall$e: env.
441      $\forall$m: mem.  state
442  | Callstate:
443      $\forall$fd: fundef.
444      $\forall$args: list val.
445      $\forall$k: cont.
446      $\forall$m: mem. state
447  | Returnstate:
448      $\forall$res: val.
449      $\forall$k: cont.
450      $\forall$m: mem. state.
451\end{lstlisting}
452\end{quote}
453During normal execution the state contains the currently executing
454function's definition (used to find \lstinline'goto' labels and also
455to check whether the function is expected to return a value), the
456statement to be executed next, a continuation value to be executed
457afterwards (where successor statements and details of function calls
458and loops are stored), the local environment mapping variables to
459memory locations\footnote{In the semantics all variables are
460  allocated, although the compiler may subsequently allocate them to
461  registers where possible.} and the current memory state.  \todo{need
462  to note that all variables 'appear' to be memory allocated, even if
463  they're subsequently optimised away.}  The function call and return
464states appear to store less information because the details of the
465caller are contained in the continuation.
466
467An example of the statement execution rules is the assignment rule
468(corresponding to the C syntax \lstinline'a1 = a2'),
469\label{page:Sassign}
470\begin{quote}
471\begin{lstlisting}
472ninductive step (ge:genv) : state $\rightarrow$ trace $\rightarrow$ state $\rightarrow$ Prop ≝
473  | step_assign:   $\forall$f,a1,a2,k,e,m,psp,loc,ofs,v2,m',tr1,tr2.
474      eval_lvalue ge e m a1 psp loc ofs tr1 $\rightarrow$
475      eval_expr ge e m a2 v2 tr2 $\rightarrow$
476      store_value_of_type (typeof a1) m psp loc ofs v2 = Some ? m' $\rightarrow$
477      step ge (State f (Sassign a1 a2) k e m)
478           (tr1++tr2) (State f Sskip k e m')
479...
480\end{lstlisting}
481\end{quote}
482which can be read as:
483\begin{itemize}
484\item if \lstinline'a1' can evaluate to the location \lstinline'psp,loc,ofs',
485\item \lstinline'a2' can evaluate to a value \lstinline'v2', and
486\item storing \lstinline'v2' at location \lstinline'psp,loc,ofs' succeeds,
487yielding the new memory state \lstinline-m'-, then
488\item the program can step from the state about to execute
489\lstinline'Sassign a1 a2' to a state with the updated memory \lstinline-m'-
490about to execute the no-op \lstinline'Sskip'.
491\end{itemize}
492This rule would be followed by one of the rules to which replaces the
493\lstinline'Sskip' statement with the `real' next statement constructed
494from the continuation \lstinline'k'.  Note that the only true
495side-effect here is the change in memory --- the local environment is
496initialised once and for all on function entry, and the only events
497appearing in the trace are cost labels used purely for accounting.  At
498present this imposes an ordering due to the cost labels.  Should this
499prove too restrictive we may change it to produce a set of labels
500encountered.
501
502The \clight{} language provides input and output effects through `external'
503functions and the step rule
504\begin{quote}
505\begin{lstlisting}
506  | step_external_function: $\forall$id,targs,tres,vargs,k,m,vres,t.
507      event_match (external_function id targs tres) vargs t vres $\rightarrow$
508      step ge (Callstate (External id targs tres) vargs k m)
509            t (Returnstate vres k m)
510\end{lstlisting}
511\end{quote}
512which allows the function to be invoked with and return any values subject to
513the enforcement of the typing rules in \lstinline'event_match', which also
514provides the trace.
515
516Cost label statements add the given label to the trace analogously to the cost
517label expressions above.
518
519\section{Executable semantics}
520\label{sec:exec}
521
522We have added an equivalent functional version of the \clight{} semantics that
523can be used to animate programs.  The definitions roughly follow the inductive
524semantics, but are necessarily rearranged around pattern matching of the
525relevant parts of the state rather than presenting each case separately.
526
527\subsection{Expressions}
528
529The code corresponding to the variable lookup definitions on
530page~\pageref{page:evalvar} is
531\begin{quote}
532\begin{lstlisting}
533nlet rec exec_expr (ge:genv) (en:env) (m:mem) (e:expr) on e :
534               res ($\Sigma$r:val$\times$trace. eval_expr ge en m e (\fst r) (\snd r)) ≝
535match e with
536[ Expr e' ty $\Rightarrow$
537  match e' with
538  [ ...
539  | Evar _ $\Rightarrow$ Some ? (
540      do $\langle$l,tr$\rangle$ $\leftarrow$ exec_lvalue' ge en m e' ty;
541      do v $\leftarrow$ opt_to_res ? (load_value_of_type' ty m l);
542      OK ? $\langle$v,tr$\rangle$)
543...
544  ]
545]
546and exec_lvalue' (ge:genv) (en:env) (m:mem) (e':expr_descr) (ty:type) on e' :
547    res ($\Sigma$r:memory_space $\times$ block $\times$ int $\times$ trace. eval_lvalue ge en m (Expr e' ty) (\fst (\fst (\fst r))) (\snd (\fst (\fst r))) (\snd (\fst r)) (\snd r)) ≝
548  match e' with
549  [ Evar id $\Rightarrow$ 
550      match (get … id en) with
551      [ None $\Rightarrow$ Some ? (do $\langle$sp,l$\rangle$ $\leftarrow$ opt_to_res ? (find_symbol ? ? ge id); OK ? $\langle$$\langle$$\langle$sp,l$\rangle$,zero$\rangle$,E0$\rangle$) (* global *)
552      | Some loc $\Rightarrow$ Some ? (OK ? $\langle$$\langle$$\langle$Any,loc$\rangle$,zero$\rangle$,E0$\rangle$) (* local *)
553      ]
554...
555\end{lstlisting}
556\end{quote}
557where the result is placed in an error monad (the \lstinline'res' type
558constructor) so that \emph{undefined behaviour} such as dereferencing an
559invalid pointer can be rejected.  We use \lstinline'do' notation similar to
560Haskell and CompCert, where
561\begin{quote}
562\begin{lstlisting}
563do x $\leftarrow$ e; e'
564\end{lstlisting}
565\end{quote}
566means evaluate \lstinline'e' and if it yields a value then bind that to
567\lstinline'x' and evaluate \lstinline-e'-, and otherwise propogate the error.
568
569Note the $\Sigma$ type for the result of the function, which shows that
570successful executions are sound with respect to the inductive semantics.
571Matita automatically generates proof obligations for each case due to a
572coercion between the types
573\begin{center}
574\lstinline'option (res T)' \quad and \quad \lstinline'res ($\Sigma$x:T. P x)'
575\end{center}
576(where a branch marked \lstinline'None' would generate a proof obligation to
577show that it is impossible, although the semantics do not use this feature).
578This is intended to mimic Sozeau's \textsc{Russell} language and elaboration
579into Coq~\cite{sozeau06}.  The coercion also triggers an automatic mechanism
580in \matita{} to add equalities for each pattern matched.  Each case of the
581soundness proof consists of extracting an equality for each computation in the
582monad,
583% "in the monad" is a bit vague here
584making any further case distinctions necessary and applying the corresponding
585rule from the inductive semantics.
586
587\subsection{Statements}
588
589Evaluating a step of a statement is complicated by the presence of the
590`external' functions for I/O, which can return arbitrary values.  These are
591handled by a resumption monad, which on encountering some I/O returns a
592suspension. When the suspension is applied to a value the evaluation of the
593semantics is resumed.  Resumption monads are a standard tool for providing
594denotational semantics for input~\cite{Moggi199155} and interleaved
595concurrency~\cite{??}.
596The definition also incorporates errors, and uses a coercion to automatically
597transform values from the plain error monad.
598\todo{should perhaps give more details of the resumption monad?}
599
600The execution of assignments is straightforward,
601\begin{quote}
602\begin{lstlisting}
603nlet rec exec_step (ge:genv) (st:state) on st : (IO eventval io_out ($\Sigma$r:trace $\times$ state. step ge st (\fst r) (\snd r))) ≝
604match st with
605[ State f s k e m $\Rightarrow$
606  match s with
607  [ Sassign a1 a2 $\Rightarrow$ Some ? (
608    ! $\langle$l,tr1$\rangle$ $\leftarrow$ exec_lvalue ge e m a1;:
609    ! $\langle$v2,tr2$\rangle$ $\leftarrow$ exec_expr ge e m a2;:
610    ! m' $\leftarrow$ store_value_of_type' (typeof a1) m l v2;:
611    ret ? $\langle$tr1++tr2, State f Sskip k e m'$\rangle$)
612...
613\end{lstlisting}
614\end{quote}
615where \lstinline'!' is used in place of \lstinline'do' due to the change in
616monad.  The content is essentially the same as the inductive rule given on
617page~\pageref{page:Sassign}.
618
619The handling of external calls uses the
620\begin{quote}
621\begin{lstlisting}
622do_io : ident $\rightarrow$ list eventval $\rightarrow$ IO eventval io_out eventval
623\end{lstlisting}
624\end{quote}
625function to suspend execution:
626\begin{quote}
627\begin{lstlisting}
628...
629  ]
630| Callstate f0 vargs k m $\Rightarrow$
631  match f0 with
632  [ ...
633  | External f argtys retty $\Rightarrow$ Some ? (
634      ! evargs $\leftarrow$ check_eventval_list vargs (typlist_of_typelist argtys);:
635      ! evres $\leftarrow$ do_io f evargs;:
636      ! vres $\leftarrow$ check_eventval evres (proj_sig_res (signature_of_type argtys rett
637      ret ? $\langle$(Eextcall f evargs evres), Returnstate vres k m$\rangle$)
638...
639\end{lstlisting}
640\end{quote}
641\todo{say something more about the rest?}
642
643Together with functions to provide the initial state for a program and to
644detect a final state we can write a function to run the program up to a given
645number of steps.  Similarly, a corecursive function can return the entire
646execution as a stream of trace and state pairs.
647
648\todo{completeness?}
649
650\section{Validation}
651\label{sec:valid}
652
653\begin{verbatim}
654Animation of simple C programs.
655\end{verbatim}
656
657\begin{verbatim}
658MORE:
659  library stuff: choice of integers, maps
660  check for any axioms
661\end{verbatim}
662
663\bibliographystyle{plain}
664\bibliography{report}
665
666\end{document}
Note: See TracBrowser for help on using the repository browser.