Changeset 402 for Deliverables
 Timestamp:
 Dec 13, 2010, 12:02:29 AM (11 years ago)
 Location:
 Deliverables/D3.1/Report
 Files:

 2 edited
Legend:
 Unmodified
 Added
 Removed

Deliverables/D3.1/Report/report.tex
r393 r402 5 5 \newcommand{\ocaml}{OCaml} 6 6 \newcommand{\clight}{Clight} 7 \newcommand{\matita}{ matita}7 \newcommand{\matita}{Matita} 8 8 \newcommand{\sdcc}{\texttt{sdcc}} 9 9 … … 19 19 20 20 \lstdefinelanguage{matita} 21 {keywords={ndefinition,nlemma,ntheorem,nremark,ninductive,nrecord,nqed,nlet,rec,match,with,Type },21 {keywords={ndefinition,nlemma,ntheorem,nremark,ninductive,nrecord,nqed,nlet,rec,match,with,Type,and,on}, 22 22 morekeywords={[2]nwhd,nnormalize,nelim,ncases,ndestruct}, 23 23 mathescape=true, … … 92 92 93 93 \clearpage \pagestyle{myheadings} \markright{\cerco{}, FP7ICT2009C243881} 94 95 \newpage 96 97 \vspace*{7cm} 98 \paragraph{Abstract} 99 We present an execution semantics of the C programming language for 100 use in the \cerco{} project. It is based on the smallstep inductive 101 semantics used by the CompCert verified compiler. We discuss the 102 extensions required for our target architecture, porting the semantics 103 to our choice of tool, \matita{}, providing an equivalent executable 104 semantics and the validation of the semantics. 105 106 \newpage 94 107 95 108 \tableofcontents … … 114 127 microcontroller's memory spaces. \emph{Cost labels} have also been added to 115 128 the \clight{} semantics to support the labelling approach to cost annotations 116 presented in ~\cite{d2.1}.129 presented in a previous deliverable~\cite{d2.1}. 117 130 118 131 The following section discusses the C language extensions for memory spaces. 119 132 Then the port of the two stages of the CompCert \clight{} semantics is 120 133 described in Section~\ref{sec:port}, followed by the new executable semantics 121 in Section~\ref{sec:exec}. Finally we discuss how the semantics can be tested 134 in Section~\ref{sec:exec}. Finally we discuss how the semantics has 135 been validated 122 136 in Section~\ref{sec:valid}. 123 137 … … 221 235 \emph{none}& 3 & Any / Generic pointer 222 236 \end{tabular} 223 \end{table} 237 \end{table}\\ 224 238 The generic pointers are a tagged union of the other kinds of pointers. 225 239 … … 230 244 syntax and semantics have been extended with the memory space 231 245 attributes listed above. The syntax follows \sdcc{} and in the 232 semantics we track the memory space that each block was allocated in246 semantics we track the memory space that each block was allocated from 233 247 and only permit access via the appropriate kinds of pointers. The 234 248 details on these changes are given in the following sections. … … 247 261 compiler, and so no extension is provided for it. 248 262 249 Finally, we adopt the \sdcc{} extension to allocate a variable at a 250 particular address to provide a way to deal with memory mapped I/O in 251 the external memory space. 252 \todo{Are we really going to do this?} 263 Finally, we have the option of using CompCert's translation of 264 \lstinline'volatile' variable accesses to `external' function calls. Should we 265 need more flexible I/O than SFRs provide, then we could adopt the \sdcc{} 266 extension to allocate a variable at a particular address to provide a way to 267 deal with memory mapped I/O in the external memory space. The translation to 268 function calls would mean that the semantics presented here would be 269 unaffected. 253 270 254 271 \section{Port of CompCert \clight{} semantics to \matita{}} 255 272 \label{sec:port} 256 273 257 \todo{Could do with some text here}258 274 \subsection{Parsing and elaboration} 259 275 The first stage taken from the CompCert semantics is the parsing and 260 276 elaboration of C programs into the simpler \clight{} language. This is based 261 277 upon the CIL library for parsing, analysing and transforming C programs by 262 Necula et.~al.~\cite{cil02}. The elaboration performed provides explicit type 263 information throughout the program, including extra casts for promotion, and 278 Necula et.~al.~\cite{cil02}. The elaboration provides explicit type 279 information throughout the program, including extra casts for 280 promotion. It also 264 281 performs simplifications such as breaking up expressions with side effects 265 into effectfree expressions a ndstatements to perform the effects. The282 into effectfree expressions along with statements to perform the effects. The 266 283 transformed \clight{} programs are much more manageable and lack the ambiguities 267 284 of C, but also remain easily understood by C programmers. … … 320 337 \end{lstlisting} 321 338 \end{quote} 322 The expression s in \lstinline'f' and \lstinline'main' had to be broken up due323 to sideeffects, and casts have been added to change between generic pointers339 The expression in \lstinline'f' had to be broken up due 340 to the call to \lstinline'g', and casts have been added to change between generic pointers 324 341 and pointers specific to the \lstinline'__data' section of memory. The 325 342 underlying data structure also has types attached to every expression, but … … 380 397 \begin{quote} 381 398 \begin{lstlisting} 382 ninductive eval_expr (ge:genv) (e:env) (m:mem) : expr $\rightarrow$ val $\rightarrow$ trace $\rightarrow$ Prop ≝399 ninductive eval_expr (ge:genv) (e:env) (m:mem) : expr $\rightarrow$ val $\rightarrow$ trace $\rightarrow$ Prop := 383 400 ... 384 401  eval_Elvalue: $\forall$a,ty,psp,loc,ofs,v,tr. … … 418 435 The only new expression in our semantics is the cost label which wraps 419 436 around another expression. It does not change the result, but merely 420 prefixes the trace with the given label to identify the branches taken437 augments the trace with the given label to identify the branches taken 421 438 in conditional expressions so that accurate cost information can be 422 439 attached to the program: … … 428 445 \end{lstlisting} 429 446 \end{quote} 430 \todo{make the semantics of the cost labels clearer}431 447 432 448 As the expressions are sideeffect free, all of the changes to the state are … … 461 477 memory locations\footnote{In the semantics all variables are 462 478 allocated, although the compiler may subsequently allocate them to 463 registers where possible.} and the current memory state. \todo{need 464 to note that all variables 'appear' to be memory allocated, even if 465 they're subsequently optimised away.} The function call and return 466 states appear to store less information because the details of the 467 caller are contained in the continuation. 479 registers where possible.} and the current memory state. The 480 function call and return states appear to store less information 481 because the details of the caller are contained in the continuation. 468 482 469 483 An example of the statement execution rules is the assignment rule … … 516 530 provides the trace. 517 531 518 Cost label statements add the given label to the trace analogouslyto the cost532 Cost label statements prefix the trace with the given label, similar to the cost 519 533 label expressions above. 520 534 … … 533 547 \begin{quote} 534 548 \begin{lstlisting} 535 nlet rec exec_expr (ge:genv) (en:env) (m:mem) (e:expr) on e : 536 res ($\Sigma$r:val$\times$trace. eval_expr ge en m e (\fst r) (\snd r)) ≝ 549 nlet rec exec_expr (ge:genv) (en:env) (m:mem) (e:expr) on e : res (val$\times$trace) ≝ 537 550 match e with 538 551 [ Expr e' ty $\Rightarrow$ 539 552 match e' with 540 553 [ ... 541  Evar _ $\Rightarrow$ Some ? (554  Evar _ $\Rightarrow$ 542 555 do $\langle$l,tr$\rangle$ $\leftarrow$ exec_lvalue' ge en m e' ty; 543 556 do v $\leftarrow$ opt_to_res ? (load_value_of_type' ty m l); 544 OK ? $\langle$v,tr$\rangle$ )557 OK ? $\langle$v,tr$\rangle$ 545 558 ... 546 559 ] 547 560 ] 548 and exec_lvalue' (ge:genv) (en:env) (m:mem) (e':expr_descr) (ty:type) on e' :549 res ($\Sigma$r:memory_space $\times$ block $\times$ int $\times$ trace. eval_lvalue ge en m (Expr e' ty) (\fst (\fst (\fst r))) (\snd (\fst (\fst r))) (\snd (\fst r)) (\snd r)) ≝561 and exec_lvalue' (ge:genv) (en:env) (m:mem) (e':expr_descr) (ty:type) on e' 562 : res (memory_space $\times$ block $\times$ int $\times$ trace) ≝ 550 563 match e' with 551 564 [ Evar id $\Rightarrow$ 552 565 match (get … id en) with 553 [ None $\Rightarrow$ Some ? (do $\langle$sp,l$\rangle$ $\leftarrow$ opt_to_res ? (find_symbol ? ? ge id); OK ? $\langle$$\langle$$\langle$sp,l$\rangle$,zero$\rangle$,E0$\rangle$) (* global *) 554  Some loc $\Rightarrow$ Some ? (OK ? $\langle$$\langle$$\langle$Any,loc$\rangle$,zero$\rangle$,E0$\rangle$) (* local *) 566 [ None $\Rightarrow$ do $\langle$sp,l$\rangle$ $\leftarrow$ opt_to_res ? (find_symbol ? ? ge id); 567 OK ? $\langle$$\langle$$\langle$sp,l$\rangle$,zero$\rangle$,E0$\rangle$ (* global *) 568  Some loc $\Rightarrow$ OK ? $\langle$$\langle$$\langle$Any,loc$\rangle$,zero$\rangle$,E0$\rangle$ (* local *) 555 569 ] 556 570 ... … … 589 603  Wrong : IO output input T. 590 604 591 nlet rec bindIO (O:Type) (I:O $\rightarrow$ Type) (T,T':Type) (v:IO O I T) (f:T $\rightarrow$ IO O I T') on v : IO O I T' := 605 nlet rec bindIO (O:Type) (I:O $\rightarrow$ Type) (T,T':Type) 606 (v:IO O I T) (f:T $\rightarrow$ IO O I T') on v : IO O I T' := 592 607 match v with 593 608 [ Interact out k $\Rightarrow$ (Interact ??? out ($\lambda$res. bindIO O I T T' (k res) f)) … … 606 621 \begin{quote} 607 622 \begin{lstlisting} 608 nlet rec exec_step (ge:genv) (st:state) on st : (IO eventval io_out ($\Sigma$r:trace $\times$ state. step ge st (\fst r) (\snd r))) ≝623 nlet rec exec_step (ge:genv) (st:state) on st : (IO io_out io_in (trace $\times$ state)) ≝ 609 624 match st with 610 625 [ State f s k e m $\Rightarrow$ 611 626 match s with 612 [ Sassign a1 a2 $\Rightarrow$ Some ? (613 ! $\langle$l,tr1$\rangle$ $\leftarrow$ exec_lvalue ge e m a1; :614 ! $\langle$v2,tr2$\rangle$ $\leftarrow$ exec_expr ge e m a2; :615 ! m' $\leftarrow$ store_value_of_type' (typeof a1) m l v2; :616 ret ? $\langle$tr1++tr2, State f Sskip k e m'$\rangle$ )627 [ Sassign a1 a2 $\Rightarrow$ 628 ! $\langle$l,tr1$\rangle$ $\leftarrow$ exec_lvalue ge e m a1; 629 ! $\langle$v2,tr2$\rangle$ $\leftarrow$ exec_expr ge e m a2; 630 ! m' $\leftarrow$ store_value_of_type' (typeof a1) m l v2; 631 ret ? $\langle$tr1++tr2, State f Sskip k e m'$\rangle$ 617 632 ... 618 633 \end{lstlisting} … … 622 637 page~\pageref{page:Sassign}. 623 638 639 Most other rules are similar translations of the inductive semantics. 624 640 The handling of external calls uses the 625 641 \begin{quote} … … 632 648 \begin{lstlisting} 633 649 ... 634 ]635 650  Callstate f0 vargs k m $\Rightarrow$ 636 651 match f0 with 637 652 [ ... 638  External f argtys retty $\Rightarrow$ Some ? (639 ! evargs $\leftarrow$ check_eventval_list vargs (typlist_of_typelist argtys);:640 ! evres $\leftarrow$ do_io f evargs ;:641 ! vres $\leftarrow$ check_eventval evres (proj_sig_res (signature_of_type argtys rett642 ret ? $\langle$(Eextcall f evargs evres), Returnstate vres k m$\rangle$)653  External f argtys retty $\Rightarrow$ 654 ! evargs $\leftarrow$ err_to_io_sig … (check_eventval_list vargs (typlist_of_typelist argtys)); 655 ! evres $\leftarrow$ do_io f evargs (proj_sig_res (signature_of_type argtys retty)); 656 ret ? $\langle$Eextcall f evargs (mk_eventval ? evres), Returnstate (mk_val ? evres) k m$\rangle$ 657 ] 643 658 ... 644 659 \end{lstlisting} 645 660 \end{quote} 646 \todo{say something more about the rest?} 661 The rest of the code after \lstinline'do_io' is included in the 662 suspension returned. 647 663 648 664 Together with functions to provide the initial state for a program and to … … 695 711 \begin{quote} 696 712 \begin{lstlisting} 697 nlet rec exec_step (ge:genv) (st:state) on st : (IO eventval io_out ($\Sigma$r:trace $\times$ state. step ge st (\fst r) (\snd r))) ≝ 713 nlet rec exec_step (ge:genv) (st:state) on st 714 : (IO eventval io_out ($\Sigma$r:trace $\times$ state. step ge st (\fst r) (\snd r))) ≝ 698 715 \end{lstlisting} 699 716 \end{quote} … … 709 726 This is intended to mimic Sozeau's \textsc{Russell} language and elaboration 710 727 into Coq~\cite{sozeau06}. The coercion also triggers an automatic mechanism 711 in \matita{} to add equalities for each pattern matched. Each case of the 712 soundness proof consists of extracting an equality for each computation in the 713 monad, 714 % "in the monad" is a bit vague here 715 making any further case distinctions necessary and applying the corresponding 716 rule from the inductive semantics. 728 in \matita{} to add equalities for each pattern matched. The proofs 729 are essentially the same as before. 717 730 718 731 However, the soundness proofs then pervade the executable semantics, 719 making the correctness proofs more difficult. We decided to keep the 720 soundness results separate, partly because of the increased difficulty 721 of using the resulting terms in proofs, and partly because they are of 722 little consequence once equivalence has been shown. 732 making rewriting in the correctness proofs more difficult. We decided 733 to keep the soundness results separate, partly because of the 734 increased difficulty of using the resulting terms in proofs, and 735 partly because they are of little consequence once equivalence has 736 been shown. 723 737 724 738 The completeness results requiring a dual lifting which requires the … … 727 741 \begin{quote} 728 742 \begin{lstlisting} 729 nlet rec yieldsIO bare(A:Type) (a:IO io_out io_in A) (v':A) on a : Prop :=743 nlet rec yieldsIO (A:Type) (a:IO io_out io_in A) (v':A) on a : Prop := 730 744 match a with 731 745 [ Value v $\Rightarrow$ v' = v 732  Interact _ k $\Rightarrow$ $\exists$r.yieldsIO bareA (k r) v'746  Interact _ k $\Rightarrow$ $\exists$r.yieldsIO A (k r) v' 733 747  _ $\Rightarrow$ False 734 748 ]. … … 739 753 \begin{lstlisting} 740 754 ntheorem step_complete: $\forall$ge,s,tr,s'. 741 step ge s tr s' $\rightarrow$ yieldsIO bare? (exec_step ge s) $\langle$tr,s'$\rangle$.755 step ge s tr s' $\rightarrow$ yieldsIO ? (exec_step ge s) $\langle$tr,s'$\rangle$. 742 756 \end{lstlisting} 743 757 \end{quote} 744 758 by case analysis on the inductive derivation and a mixture of 745 reduction and rewriting. 759 reduction and rewriting. Thus we know that executing a step in these 760 semantics is equivalent to a step in the inductive semantics. 746 761 747 762 Showing the equivalence of whole program execution is a little … … 752 767 [TODO] 753 768 \end{quote} 769 The \lstinline'single_exec_of' predicate identifies single executions 770 from these trees, essentially fixing a stream of input values. 771 754 772 However, the inductive semantics divides program behaviours into four 755 773 categories which have \emph{individual} (co)inductive descriptions: … … 758 776 \item programs which eventually diverge (with an empty trace); 759 777 \item programs which keep interacting in some way (with an infinite 760 trace \footnote{In our setting this includes passing through cost778 trace)\footnote{In our setting this includes passing through cost 761 779 labels as well as I/O.}; and 762 780 \item programs which \emph{go wrong}. … … 774 792 ntheorem exec_inf_equivalence: 775 793 $\forall$classic:($\forall$P:Prop.P $\vee$ $\neg$P). 776 $\forall$constructive_indefinite_description:($\forall$A:Type. $\forall$P:A$\rightarrow$Prop. ($\exists$x. P x) $\rightarrow$ $\Sigma$x : A. P x). 777 $\forall$p. $\exists$b.execution_matches_behavior (exec_inf p) b $\wedge$ exec_program p b. 794 $\forall$constructive_indefinite_description:($\forall$A:Type. $\forall$P:A$\rightarrow$Prop. ($\exists$x. P x) $\rightarrow$ $\Sigma$x:A. P x). 795 $\forall$p,e. single_exec_of (exec_inf p) e $\rightarrow$ 796 $\exists$b.execution_matches_behavior e b $\wedge$ exec_program p b. 778 797 \end{lstlisting} 779 798 \end{quote} … … 781 800 \subsection{Animation of simple C programs} 782 801 783 \begin{verbatim} 784 MORE: 785 library stuff: choice of integers, maps 786 check for any axioms 787 \end{verbatim} 802 We are currently working with a development version of \matita{} which 803 (temporarily) does not support extraction to OCaml code. Hence to 804 animate a program we first parse it with CIL and produce a \matita{} 805 term in text format representing the program, then interpret it within 806 \matita{}. 807 808 This process is rather laborious, so we have concentrated on testing 809 small programs which exercised areas of the semantics which depart 810 from CompCert. In particular, we tested several aspects of the 811 handling of memory spaces and the casting of pointers. Together with 812 the results in the previous sections we gain considerable confidence 813 that the semantics describe the behaviour of programs properly. 814 Nevertheless, we index to experiment with larger C programs once 815 extraction is available. 816 817 \appendix 818 819 \section{Description of the Code} 820 821 The majority of the semantics is given as \matita{} source files. The 822 exception is the changes to the CIL based parser, which is presented 823 as a patch to a preliminary version of the \cerco{} prototype 824 compiler. This patch provides both the extensions to the parser and a 825 pretty printer to produce a usable \matita{} term representing the 826 program. 827 828 \begin{quote} 829 \begin{tabular}{p{9em}l} 830 acc0.1.spaces.patch & Changes to early prototype compiler for parsing 831 \end{tabular} 832 \end{quote} 833 834 \subsubsection*{Ancilliary definitions} 835 836 Files corresponding to CompCert. 837 838 \begin{quote} 839 \begin{tabular}{p{9em}l} 840 Coqlib.ma & Minor definitions ported from CompCert \\ 841 Errors.ma & The error monad \\ 842 Floats.ma & Axiomatised floating point numbers \\ 843 Globalenvs.ma & Global environments \\ 844 Integers.ma & Integers modulo powers of two \\ 845 Maps.ma & Finite maps (used in particular for local environments) \\ 846 Smallstep.ma & Generic definitions and lemmas for small step semantics \\ 847 \end{tabular} 848 \end{quote} 849 850 \noindent 851 Files specific to this development. 852 853 \begin{quote} 854 \begin{tabular}{p{9em}l} 855 binary/positive.ma & Binary positive numbers \\ 856 binary/Z.ma & Binary integers \\ 857 extralib.ma & Extensions to \matita{}'s library \\ 858 \end{tabular} 859 \end{quote} 860 861 \subsubsection*{Inductive semantics ported from CompCert} 862 863 \begin{quote} 864 \begin{tabular}{p{9em}l} 865 AST.ma & Minor syntax definitions intended for several compiler stages \\ 866 Values.ma & Definitions for values manipulated by Clight programs \\ 867 Mem.ma & Definition of the memory model \\ 868 Events.ma & I/O events \\ 869 CostLabel.ma & Definition of cost labels \\ 870 Csyntax.ma & Clight syntax trees \\ 871 Csem.ma & Clight inductive semantics \\ 872 \end{tabular} 873 \end{quote} 874 875 \subsubsection*{Executable semantics} 876 877 \begin{quote} 878 \begin{tabular}{p{9em}l} 879 IOMonad.ma & Definitions of I/O resumption monad \\ 880 Cexec.ma & Definition of the executable semantics \\ 881 CexecSound.ma & Soundness of individual steps \\ 882 CexecComplete.ma & Completeness of individual steps \\ 883 CexecEquiv.ma & Equivalence of whole program executions \\ 884 \end{tabular} 885 \end{quote} 788 886 789 887 \bibliographystyle{plain}
Note: See TracChangeset
for help on using the changeset viewer.