Changeset 402 for Deliverables/D3.1


Ignore:
Timestamp:
Dec 13, 2010, 12:02:29 AM (9 years ago)
Author:
campbell
Message:

Revise D3.1, add notes on files.

Location:
Deliverables/D3.1/Report
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • Deliverables/D3.1/Report/report.tex

    r393 r402  
    55\newcommand{\ocaml}{OCaml}
    66\newcommand{\clight}{Clight}
    7 \newcommand{\matita}{matita}
     7\newcommand{\matita}{Matita}
    88\newcommand{\sdcc}{\texttt{sdcc}}
    99
     
    1919
    2020\lstdefinelanguage{matita}
    21   {keywords={ndefinition,nlemma,ntheorem,nremark,ninductive,nrecord,nqed,nlet,rec,match,with,Type},
     21  {keywords={ndefinition,nlemma,ntheorem,nremark,ninductive,nrecord,nqed,nlet,rec,match,with,Type,and,on},
    2222   morekeywords={[2]nwhd,nnormalize,nelim,ncases,ndestruct},
    2323   mathescape=true,
     
    9292
    9393\clearpage \pagestyle{myheadings} \markright{\cerco{}, FP7-ICT-2009-C-243881}
     94
     95\newpage
     96
     97\vspace*{7cm}
     98\paragraph{Abstract}
     99We present an execution semantics of the C programming language for
     100use in the \cerco{} project.  It is based on the small-step inductive
     101semantics used by the CompCert verified compiler.  We discuss the
     102extensions required for our target architecture, porting the semantics
     103to our choice of tool, \matita{}, providing an equivalent executable
     104semantics and the validation of the semantics.
     105
     106\newpage
    94107
    95108\tableofcontents
     
    114127microcontroller's memory spaces.  \emph{Cost labels} have also been added to
    115128the \clight{} semantics to support the labelling approach to cost annotations
    116 presented in~\cite{d2.1}.
     129presented in a previous deliverable~\cite{d2.1}.
    117130
    118131The following section discusses the C language extensions for memory spaces.
    119132Then the port of the two stages of the CompCert \clight{} semantics is
    120133described in Section~\ref{sec:port}, followed by the new executable semantics
    121 in Section~\ref{sec:exec}.  Finally we discuss how the semantics can be tested
     134in Section~\ref{sec:exec}.  Finally we discuss how the semantics has
     135been validated
    122136in Section~\ref{sec:valid}.
    123137
     
    221235\emph{none}& 3 & Any / Generic pointer
    222236\end{tabular}
    223 \end{table}
     237\end{table}\\
    224238The generic pointers are a tagged union of the other kinds of pointers.
    225239
     
    230244syntax and semantics have been extended with the memory space
    231245attributes listed above.  The syntax follows \sdcc{} and in the
    232 semantics we track the memory space that each block was allocated in
     246semantics we track the memory space that each block was allocated from
    233247and only permit access via the appropriate kinds of pointers.  The
    234248details on these changes are given in the following sections.
     
    247261compiler, and so no extension is provided for it.
    248262
    249 Finally, we adopt the \sdcc{} extension to allocate a variable at a
    250 particular address to provide a way to deal with memory mapped I/O in
    251 the external memory space.
    252 \todo{Are we really going to do this?}
     263Finally, we have the option of using CompCert's translation of
     264\lstinline'volatile' variable accesses to `external' function calls.  Should we
     265need more flexible I/O than SFRs provide, then we could adopt the \sdcc{}
     266extension to allocate a variable at a particular address to provide a way to
     267deal with memory mapped I/O in the external memory space.  The translation to
     268function calls would mean that the semantics presented here would be
     269unaffected.
    253270
    254271\section{Port of CompCert \clight{} semantics to \matita{}}
    255272\label{sec:port}
    256273
    257 \todo{Could do with some text here}
    258274\subsection{Parsing and elaboration}
    259275The first stage taken from the CompCert semantics is the parsing and
    260276elaboration of C programs into the simpler \clight{} language.  This is based
    261277upon the CIL library for parsing, analysing and transforming C programs by
    262 Necula et.~al.~\cite{cil02}.  The elaboration performed provides explicit type
    263 information throughout the program, including extra casts for promotion, and
     278Necula et.~al.~\cite{cil02}.  The elaboration provides explicit type
     279information throughout the program, including extra casts for
     280promotion.  It also
    264281performs simplifications such as breaking up expressions with side effects
    265 into effect-free expressions and statements to perform the effects.  The
     282into effect-free expressions along with statements to perform the effects.  The
    266283transformed \clight{} programs are much more manageable and lack the ambiguities
    267284of C, but also remain easily understood by C programmers.
     
    320337\end{lstlisting}
    321338\end{quote}
    322 The expressions in \lstinline'f' and \lstinline'main' had to be broken up due
    323 to side-effects, and casts have been added to change between generic pointers
     339The expression in \lstinline'f' had to be broken up due
     340to the call to \lstinline'g', and casts have been added to change between generic pointers
    324341and pointers specific to the \lstinline'__data' section of memory.  The
    325342underlying data structure also has types attached to every expression, but
     
    380397\begin{quote}
    381398\begin{lstlisting}
    382 ninductive eval_expr (ge:genv) (e:env) (m:mem) : expr $\rightarrow$ val $\rightarrow$ trace $\rightarrow$ Prop
     399ninductive eval_expr (ge:genv) (e:env) (m:mem) : expr $\rightarrow$ val $\rightarrow$ trace $\rightarrow$ Prop :=
    383400...
    384401  | eval_Elvalue: $\forall$a,ty,psp,loc,ofs,v,tr.
     
    418435The only new expression in our semantics is the cost label which wraps
    419436around another expression.  It does not change the result, but merely
    420 prefixes the trace with the given label to identify the branches taken
     437augments the trace with the given label to identify the branches taken
    421438in conditional expressions so that accurate cost information can be
    422439attached to the program:
     
    428445\end{lstlisting}
    429446\end{quote}
    430 \todo{make the semantics of the cost labels clearer}
    431447
    432448As the expressions are side-effect free, all of the changes to the state are
     
    461477memory locations\footnote{In the semantics all variables are
    462478  allocated, although the compiler may subsequently allocate them to
    463   registers where possible.} and the current memory state.  \todo{need
    464   to note that all variables 'appear' to be memory allocated, even if
    465   they're subsequently optimised away.}  The function call and return
    466 states appear to store less information because the details of the
    467 caller are contained in the continuation.
     479  registers where possible.} and the current memory state.  The
     480function call and return states appear to store less information
     481because the details of the caller are contained in the continuation.
    468482
    469483An example of the statement execution rules is the assignment rule
     
    516530provides the trace.
    517531
    518 Cost label statements add the given label to the trace analogously to the cost
     532Cost label statements prefix the trace with the given label, similar to the cost
    519533label expressions above.
    520534
     
    533547\begin{quote}
    534548\begin{lstlisting}
    535 nlet rec exec_expr (ge:genv) (en:env) (m:mem) (e:expr) on e :
    536                res ($\Sigma$r:val$\times$trace. eval_expr ge en m e (\fst r) (\snd r)) ≝
     549nlet rec exec_expr (ge:genv) (en:env) (m:mem) (e:expr) on e : res (val$\times$trace) ≝
    537550match e with
    538551[ Expr e' ty $\Rightarrow$
    539552  match e' with
    540553  [ ...
    541   | Evar _ $\Rightarrow$ Some ? (
     554  | Evar _ $\Rightarrow$
    542555      do $\langle$l,tr$\rangle$ $\leftarrow$ exec_lvalue' ge en m e' ty;
    543556      do v $\leftarrow$ opt_to_res ? (load_value_of_type' ty m l);
    544       OK ? $\langle$v,tr$\rangle$)
     557      OK ? $\langle$v,tr$\rangle$
    545558...
    546559  ]
    547560]
    548 and exec_lvalue' (ge:genv) (en:env) (m:mem) (e':expr_descr) (ty:type) on e' :
    549     res ($\Sigma$r:memory_space $\times$ block $\times$ int $\times$ trace. eval_lvalue ge en m (Expr e' ty) (\fst (\fst (\fst r))) (\snd (\fst (\fst r))) (\snd (\fst r)) (\snd r)) ≝
     561and exec_lvalue' (ge:genv) (en:env) (m:mem) (e':expr_descr) (ty:type) on e'
     562 : res (memory_space $\times$ block $\times$ int $\times$ trace) ≝
    550563  match e' with
    551564  [ Evar id $\Rightarrow$
    552565      match (get … id en) with
    553       [ None $\Rightarrow$ Some ? (do $\langle$sp,l$\rangle$ $\leftarrow$ opt_to_res ? (find_symbol ? ? ge id); OK ? $\langle$$\langle$$\langle$sp,l$\rangle$,zero$\rangle$,E0$\rangle$) (* global *)
    554       | Some loc $\Rightarrow$ Some ? (OK ? $\langle$$\langle$$\langle$Any,loc$\rangle$,zero$\rangle$,E0$\rangle$) (* local *)
     566      [ None $\Rightarrow$ do $\langle$sp,l$\rangle$ $\leftarrow$ opt_to_res ? (find_symbol ? ? ge id);
     567                 OK ? $\langle$$\langle$$\langle$sp,l$\rangle$,zero$\rangle$,E0$\rangle$ (* global *)
     568      | Some loc $\Rightarrow$ OK ? $\langle$$\langle$$\langle$Any,loc$\rangle$,zero$\rangle$,E0$\rangle$ (* local *)
    555569      ]
    556570...
     
    589603| Wrong : IO output input T.
    590604
    591 nlet rec bindIO (O:Type) (I:O $\rightarrow$ Type) (T,T':Type) (v:IO O I T) (f:T $\rightarrow$ IO O I T') on v : IO O I T' :=
     605nlet rec bindIO (O:Type) (I:O $\rightarrow$ Type) (T,T':Type)
     606                  (v:IO O I T) (f:T $\rightarrow$ IO O I T') on v : IO O I T' :=
    592607match v with
    593608[ Interact out k $\Rightarrow$ (Interact ??? out ($\lambda$res. bindIO O I T T' (k res) f))
     
    606621\begin{quote}
    607622\begin{lstlisting}
    608 nlet rec exec_step (ge:genv) (st:state) on st : (IO eventval io_out ($\Sigma$r:trace $\times$ state. step ge st (\fst r) (\snd r))) ≝
     623nlet rec exec_step (ge:genv) (st:state) on st : (IO io_out io_in (trace $\times$ state)) ≝
    609624match st with
    610625[ State f s k e m $\Rightarrow$
    611626  match s with
    612   [ Sassign a1 a2 $\Rightarrow$ Some ? (
    613     ! $\langle$l,tr1$\rangle$ $\leftarrow$ exec_lvalue ge e m a1;:
    614     ! $\langle$v2,tr2$\rangle$ $\leftarrow$ exec_expr ge e m a2;:
    615     ! m' $\leftarrow$ store_value_of_type' (typeof a1) m l v2;:
    616     ret ? $\langle$tr1++tr2, State f Sskip k e m'$\rangle$)
     627  [ Sassign a1 a2 $\Rightarrow$
     628    ! $\langle$l,tr1$\rangle$ $\leftarrow$ exec_lvalue ge e m a1;
     629    ! $\langle$v2,tr2$\rangle$ $\leftarrow$ exec_expr ge e m a2;
     630    ! m' $\leftarrow$ store_value_of_type' (typeof a1) m l v2;
     631    ret ? $\langle$tr1++tr2, State f Sskip k e m'$\rangle$
    617632...
    618633\end{lstlisting}
     
    622637page~\pageref{page:Sassign}.
    623638
     639Most other rules are similar translations of the inductive semantics.
    624640The handling of external calls uses the
    625641\begin{quote}
     
    632648\begin{lstlisting}
    633649...
    634   ]
    635650| Callstate f0 vargs k m $\Rightarrow$
    636651  match f0 with
    637652  [ ...
    638   | External f argtys retty $\Rightarrow$ Some ? (
    639       ! evargs $\leftarrow$ check_eventval_list vargs (typlist_of_typelist argtys);:
    640       ! evres $\leftarrow$ do_io f evargs;:
    641       ! vres $\leftarrow$ check_eventval evres (proj_sig_res (signature_of_type argtys rett
    642       ret ? $\langle$(Eextcall f evargs evres), Returnstate vres k m$\rangle$)
     653  | External f argtys retty $\Rightarrow$
     654      ! evargs $\leftarrow$ err_to_io_sig … (check_eventval_list vargs (typlist_of_typelist argtys));
     655      ! evres $\leftarrow$ do_io f evargs (proj_sig_res (signature_of_type argtys retty));
     656      ret ? $\langle$Eextcall f evargs (mk_eventval ? evres), Returnstate (mk_val ? evres) k m$\rangle$
     657  ]
    643658...
    644659\end{lstlisting}
    645660\end{quote}
    646 \todo{say something more about the rest?}
     661The rest of the code after \lstinline'do_io' is included in the
     662suspension returned.
    647663
    648664Together with functions to provide the initial state for a program and to
     
    695711\begin{quote}
    696712\begin{lstlisting}
    697 nlet rec exec_step (ge:genv) (st:state) on st : (IO eventval io_out ($\Sigma$r:trace $\times$ state. step ge st (\fst r) (\snd r))) ≝
     713nlet rec exec_step (ge:genv) (st:state) on st
     714 : (IO eventval io_out ($\Sigma$r:trace $\times$ state. step ge st (\fst r) (\snd r))) ≝
    698715\end{lstlisting}
    699716\end{quote}
     
    709726This is intended to mimic Sozeau's \textsc{Russell} language and elaboration
    710727into Coq~\cite{sozeau06}.  The coercion also triggers an automatic mechanism
    711 in \matita{} to add equalities for each pattern matched.  Each case of the
    712 soundness proof consists of extracting an equality for each computation in the
    713 monad,
    714 % "in the monad" is a bit vague here
    715 making any further case distinctions necessary and applying the corresponding
    716 rule from the inductive semantics.
     728in \matita{} to add equalities for each pattern matched.  The proofs
     729are essentially the same as before.
    717730
    718731However, the soundness proofs then pervade the executable semantics,
    719 making the correctness proofs more difficult.  We decided to keep the
    720 soundness results separate, partly because of the increased difficulty
    721 of using the resulting terms in proofs, and partly because they are of
    722 little consequence once equivalence has been shown.
     732making rewriting in the correctness proofs more difficult.  We decided
     733to keep the soundness results separate, partly because of the
     734increased difficulty of using the resulting terms in proofs, and
     735partly because they are of little consequence once equivalence has
     736been shown.
    723737
    724738The completeness results requiring a dual lifting which requires the
     
    727741\begin{quote}
    728742\begin{lstlisting}
    729 nlet rec yieldsIObare (A:Type) (a:IO io_out io_in A) (v':A) on a : Prop :=
     743nlet rec yieldsIO (A:Type) (a:IO io_out io_in A) (v':A) on a : Prop :=
    730744match a with
    731745[ Value v $\Rightarrow$ v' = v
    732 | Interact _ k $\Rightarrow$ $\exists$r.yieldsIObare A (k r) v'
     746| Interact _ k $\Rightarrow$ $\exists$r.yieldsIO A (k r) v'
    733747| _ $\Rightarrow$ False
    734748].
     
    739753\begin{lstlisting}
    740754ntheorem step_complete: $\forall$ge,s,tr,s'.
    741   step ge s tr s' $\rightarrow$ yieldsIObare ? (exec_step ge s) $\langle$tr,s'$\rangle$.
     755  step ge s tr s' $\rightarrow$ yieldsIO ? (exec_step ge s) $\langle$tr,s'$\rangle$.
    742756\end{lstlisting}
    743757\end{quote}
    744758by case analysis on the inductive derivation and a mixture of
    745 reduction and rewriting.
     759reduction and rewriting.  Thus we know that executing a step in these
     760semantics is equivalent to a step in the inductive semantics.
    746761
    747762Showing the equivalence of whole program execution is a little
     
    752767[TODO]
    753768\end{quote}
     769The \lstinline'single_exec_of' predicate identifies single executions
     770from these trees, essentially fixing a stream of input values.
     771
    754772However, the inductive semantics divides program behaviours into four
    755773categories which have \emph{individual} (co)inductive descriptions:
     
    758776\item programs which eventually diverge (with an empty trace);
    759777\item programs which keep interacting in some way (with an infinite
    760   trace\footnote{In our setting this includes passing through cost
     778  trace)\footnote{In our setting this includes passing through cost
    761779    labels as well as I/O.}; and
    762780\item programs which \emph{go wrong}.
     
    774792ntheorem exec_inf_equivalence:
    775793  $\forall$classic:($\forall$P:Prop.P $\vee$ $\neg$P).
    776   $\forall$constructive_indefinite_description:($\forall$A:Type. $\forall$P:A$\rightarrow$Prop. ($\exists$x. P x) $\rightarrow$ $\Sigma$x : A. P x).
    777   $\forall$p. $\exists$b.execution_matches_behavior (exec_inf p) b $\wedge$ exec_program p b.
     794  $\forall$constructive_indefinite_description:($\forall$A:Type. $\forall$P:A$\rightarrow$Prop. ($\exists$x. P x) $\rightarrow$ $\Sigma$x:A. P x).
     795  $\forall$p,e. single_exec_of (exec_inf p) e $\rightarrow$
     796   $\exists$b.execution_matches_behavior e b $\wedge$ exec_program p b.
    778797\end{lstlisting}
    779798\end{quote}
     
    781800\subsection{Animation of simple C programs}
    782801
    783 \begin{verbatim}
    784 MORE:
    785   library stuff: choice of integers, maps
    786   check for any axioms
    787 \end{verbatim}
     802We are currently working with a development version of \matita{} which
     803(temporarily) does not support extraction to OCaml code.  Hence to
     804animate a program we first parse it with CIL and produce a \matita{}
     805term in text format representing the program, then interpret it within
     806\matita{}.
     807
     808This process is rather laborious, so we have concentrated on testing
     809small programs which exercised areas of the semantics which depart
     810from CompCert.  In particular, we tested several aspects of the
     811handling of memory spaces and the casting of pointers.  Together with
     812the results in the previous sections we gain considerable confidence
     813that the semantics describe the behaviour of programs properly.
     814Nevertheless, we index to experiment with larger C programs once
     815extraction is available.
     816
     817\appendix
     818
     819\section{Description of the Code}
     820
     821The majority of the semantics is given as \matita{} source files.  The
     822exception is the changes to the CIL based parser, which is presented
     823as a patch to a preliminary version of the \cerco{} prototype
     824compiler.  This patch provides both the extensions to the parser and a
     825pretty printer to produce a usable \matita{} term representing the
     826program.
     827
     828\begin{quote}
     829\begin{tabular}{p{9em}l}
     830acc-0.1.spaces.patch & Changes to early prototype compiler for parsing
     831\end{tabular}
     832\end{quote}
     833
     834\subsubsection*{Ancilliary definitions}
     835
     836Files corresponding to CompCert.
     837
     838\begin{quote}
     839\begin{tabular}{p{9em}l}
     840Coqlib.ma & Minor definitions ported from CompCert \\
     841Errors.ma & The error monad \\
     842Floats.ma & Axiomatised floating point numbers \\
     843Globalenvs.ma & Global environments \\
     844Integers.ma & Integers modulo powers of two \\
     845Maps.ma & Finite maps (used in particular for local environments) \\
     846Smallstep.ma & Generic definitions and lemmas for small step semantics \\
     847\end{tabular}
     848\end{quote}
     849
     850\noindent
     851Files specific to this development.
     852
     853\begin{quote}
     854\begin{tabular}{p{9em}l}
     855binary/positive.ma & Binary positive numbers \\
     856binary/Z.ma & Binary integers \\
     857extralib.ma & Extensions to \matita{}'s library \\
     858\end{tabular}
     859\end{quote}
     860
     861\subsubsection*{Inductive semantics ported from CompCert}
     862
     863\begin{quote}
     864\begin{tabular}{p{9em}l}
     865AST.ma &   Minor syntax definitions intended for several compiler stages \\
     866Values.ma &   Definitions for values manipulated by Clight programs \\
     867Mem.ma &   Definition of the memory model \\
     868Events.ma &   I/O events \\
     869CostLabel.ma &   Definition of cost labels \\
     870Csyntax.ma &   Clight syntax trees \\
     871Csem.ma &   Clight inductive semantics \\
     872\end{tabular}
     873\end{quote}
     874
     875\subsubsection*{Executable semantics}
     876
     877\begin{quote}
     878\begin{tabular}{p{9em}l}
     879IOMonad.ma &  Definitions of I/O resumption monad \\
     880Cexec.ma &  Definition of the executable semantics \\
     881CexecSound.ma &  Soundness of individual steps \\
     882CexecComplete.ma &  Completeness of individual steps \\
     883CexecEquiv.ma &  Equivalence of whole program executions \\
     884\end{tabular}
     885\end{quote}
    788886
    789887\bibliographystyle{plain}
Note: See TracChangeset for help on using the changeset viewer.