# Changeset 335

Ignore:
Timestamp:
Nov 29, 2010, 6:42:10 PM (9 years ago)
Message:

Quick pass through 3.1 text.

Location:
Deliverables/D3.1/Report
Files:
2 edited

### Legend:

Unmodified
 r207 irregular memory model with tight resource constraints.  The different memory spaces and access modes are summarised in Figure~\ref{fig:memorymodel} --- essentially the development of the 8051 family has fragmented memory into essentially the evolution of the 8051 family has fragmented memory into four regions:  one half of the internal' memory is fully accessible but also contains the register banks, the second half cannot be directly addressed contains the register banks, the second half cannot be accessed by direct addressing because it is shadowed by the Special Function Registers' (SFRs) for I/O; external memory' provides the bulk of memory in a separate address space; and the code is in its own read-only space. \begin{figure} \setlength{\unitlength}{1pt} \begin{picture}(410,250)(-50,200) %\put(-50,200){\framebox(410,250){}} \put(12,410){\makebox(80,0)[b]{Internal (256B)}} \put(13,242){\line(0,1){165}} \put(93,242){\line(0,1){165}} \put(13,407){\line(1,0){80}} \put(12,400){\makebox(0,0)[r]{0h}}  \put(14,400){\makebox(0,0)[l]{Register bank 0}} \put(13,393){\line(1,0){80}} \put(12,386){\makebox(0,0)[r]{8h}}  \put(14,386){\makebox(0,0)[l]{Register bank 1}} \put(13,379){\line(1,0){80}} \put(12,372){\makebox(0,0)[r]{10h}}  \put(14,372){\makebox(0,0)[l]{Register bank 2}} \put(13,365){\line(1,0){80}} \put(12,358){\makebox(0,0)[r]{18h}} \put(14,358){\makebox(0,0)[l]{Register bank 3}} \put(13,351){\line(1,0){80}} \put(12,344){\makebox(0,0)[r]{20h}} \put(14,344){\makebox(0,0)[l]{Bit addressable}} \put(13,323){\line(1,0){80}} \put(12,316){\makebox(0,0)[r]{30h}} \put(14,309){\makebox(0,0)[l]{\quad \vdots}} \put(13,291){\line(1,0){80}} \put(12,284){\makebox(0,0)[r]{80h}} \put(14,263){\makebox(0,0)[l]{\quad \vdots}} \put(12,249){\makebox(0,0)[r]{ffh}} \put(13,242){\line(1,0){80}} \qbezier(-2,407)(-6,407)(-6,393) \qbezier(-6,393)(-6,324)(-10,324) \put(-12,324){\makebox(0,0)[r]{indirect}} \qbezier(-6,256)(-6,324)(-10,324) \qbezier(-2,242)(-6,242)(-6,256) \qbezier(94,407)(98,407)(98,393) \qbezier(98,393)(98,349)(102,349) \put(104,349){\makebox(0,0)[l]{direct/stack}} \qbezier(98,305)(98,349)(102,349) \qbezier(94,291)(98,291)(98,305) \put(102,242){\framebox(20,49){sfr}} % bit access to sfrs? \qbezier(124,291)(128,291)(128,277) \qbezier(128,277)(128,266)(132,266) \put(134,266){\makebox(0,0)[l]{direct}} \qbezier(128,257)(128,266)(132,266) \qbezier(124,242)(128,242)(128,256) \put(164,410){\makebox(80,0)[b]{External (64kB)}} \put(164,220){\line(0,1){187}} \put(164,407){\line(1,0){80}} \put(244,220){\line(0,1){187}} \put(164,242){\line(1,0){80}} \put(163,400){\makebox(0,0)[r]{0h}} \put(164,324){\makebox(80,0){paged access}} \put(164,310){\makebox(80,0){direct/indirect}} \put(163,235){\makebox(0,0)[r]{80h}} \put(164,228){\makebox(80,0){\vdots}} \put(164,210){\makebox(80,0){direct/indirect}} \put(264,410){\makebox(80,0)[b]{Code (64kB)}} \put(264,220){\line(0,1){187}} \put(264,407){\line(1,0){80}} \put(344,220){\line(0,1){187}} \put(263,400){\makebox(0,0)[r]{0h}} \put(264,228){\makebox(80,0){\vdots}} \put(264,324){\makebox(80,0){direct}} \put(264,310){\makebox(80,0){PC relative}} \end{picture} \caption{The extended 8051 memory model} \end{figure} To make efficient use of the limited amount of memory, compilers for 8051 \begin{table}[ht] \begin{tabular}{rcl} Attribute & Pointer size & Memory space \\\hline Attribute & Pointer size (bytes) & Memory space \\\hline \lstinline'__data' & 1 & Internal, first half (0h -- 7fh) \\ \lstinline'__idata' & 1 & Internal, indirect only (80h -- ffh) \\ We intend the \cerco{} compiler to support extensions that are broadly compatible with \sdcc{} to enable the compilation of programs with either.  In particular, this would allow the comparison of the behaviour of test cases compiled with both compilers.  Thus the C syntax and semantics have been extended with the memory space attributes listed above.  The syntax follows \sdcc{} and in the semantics we track the memory space that each block was allocated in and only permit access via the appropriate kinds of pointers. The details on these changes are given in the following sections. compatible with \sdcc{} to enable the compilation of programs with either tool.  In particular, this would allow the comparison of the behaviour of test cases compiled with each compiler.  Thus the C syntax and semantics have been extended with the memory space attributes listed above.  The syntax follows \sdcc{} and in the semantics we track the memory space that each block was allocated in and only permit access via the appropriate kinds of pointers.  The details on these changes are given in the following sections. The \sdcc{} compiler also supports special variable types for accessing the SFRs, which provide the standard I/O mechanism for the 8051 family.  (Note that pointers to these types are not allowed because only direct addressing of that pointers to these types are not permitted because only direct addressing of the SFRs is allowed.)  We intend to use CompCert-style external functions' instead of special types.  These are functions which are declared, but no C implementation is provided.  Instead they are provided by the runtime or implementation of them is provided.  Instead they are provided by the runtime or compiled directly to the corresponding machine code.  This has the advantage that no changes from the CompCert semantics are required, and a compatibility compiler, and so no extension is provided for it. Finally, we adopt the \sdcc{} extension to allocate a variable at a particular address to provide a way to deal with memory mapped I/O in the external memory space. \todo{Are we really going to do this?} \section{Port of CompCert \clight{} semantics to \matita{}} \label{sec:port} \todo{Could do with some text here} \subsection{Parsing and elaboration} The first stage taken from the CompCert semantics is the parsing and elaboration of C programs into the simpler \clight{} language.  This is based information throughout the program, including extra casts for promotion, and performs simplifications such as breaking up expressions with side effects into statements that perform the effects using effect-free expressions.  The into effect-free expressions and statements to perform the effects.  The transformed \clight{} programs are much more manageable and lack the ambiguities of C, but also remain easily understood by C programmers. above.  The resulting abstract syntax tree records them on global variable declarations and pointer types.  However, we also need to deal with them during the elaboration process to produce all of the required information. during the elaboration process to produce all of the required type information. For example, when the address-of operator \lstinline'&' is used it must decide which kind of pointer should be used.  Thus the extended elaboration process Thus the elaboration turns the C code Thus the elaboration turns the following C code \begin{quote} \begin{lstlisting}[language=C] \end{lstlisting} \end{quote} into the following Clight program: into the Clight program below: \begin{quote} \begin{lstlisting}[language=C] and pointers specific to the \lstinline'__data' section of memory.  The underlying data structure also has types attached to every expression, but these are inconvenient to show in source form. Note that the translation from C to \clight{} is not proven correct --- instead it effectively forms a semi-formal part of the whole C semantics.  We can have some confidence in the code, however, because it has received testing in the \cerco{} prototype, and it is very close to the version used in CompCert.  We can also perform testing of the semantics without involving the rest of the compiler because we have an executable semantics.  Moreover, the cautious programmer could choose to inspect the \clight{} code, or even work entirely in the \clight{} language. these are impractical to show in source form. Note that the translation from C to \clight{} is not proven correct --- instead it effectively forms a semi-formal part of the whole C semantics.  We can have some confidence in the code, however, because it has received testing in the \cerco{} prototype, and it is very close to the version used in CompCert.  We can also perform testing of the semantics without involving the rest of the compiler because we have an executable semantics.  Moreover, the cautious programmer could choose to inspect the generated \clight{} code, or even work entirely in the \clight{} language. \subsection{Small-step inductive semantics} The semantics for \clight{} itself has been ported from the Coq development development, notably: \begin{itemize} \item representation of primitive values (integers, pointers and undefined \item the representation of primitive values (integers, pointers and undefined values, but not structures or unions) and operations on them, \item traces of I/O events, \item common error handling constructs, in particular an error monad. \end{itemize} We anticipate a similar arrangement for the \cerco{} verified compiler, although this means that there may be further changes to the common parts of the semantics later in the project to harmonise the stages of the compiler. We anticipate a similar arrangement for the \cerco{} verified compiler, although this means that there may be further changes to the common parts of the semantics later in the project to harmonise the stages of the compiler.  In particular, some of data structures for environments are just preliminary definitions for developing the semantics. The main body of the small-step semantics is a number of inductive definitions \end{lstlisting} \end{quote} simply looks up the variable in the local environment.  A similar rule handles global variables, with an extra check to ensure that no local variable has the same name.  Note that the two relations are defined using mutual recursion because \lstinline'eval_lvalue' uses \lstinline'eval_expr' for the evaluation of the pointer expression in the dereferencing rule. simply looks up the variable in the local environment.  The offset is zero because all variables are given their own memory block to prevent the use of stray pointers.  A similar rule handles global variables, with an extra check to ensure that no local variable has the same name.  Note that the two relations are defined using mutual recursion because \lstinline'eval_lvalue' uses \lstinline'eval_expr' for the evaluation of the pointer expression in the dereferencing rule. Casts also have an auxiliary relation to specify the allowed changes, and casting) are given as functions. The only new expression in our semantics is the cost label which wraps around another expression.  It does not change the result, but merely adds the given label to the trace to identify the branches taken in conditional expressions so that accurate cost information can be attached to the program: The only new expression in our semantics is the cost label which wraps around another expression.  It does not change the result, but merely prefixes the trace with the given label to identify the branches taken in conditional expressions so that accurate cost information can be attached to the program: \begin{quote} \begin{lstlisting} \end{lstlisting} \end{quote} During normal execution the state contains the currently executing function's definition (used to find \lstinline'goto' labels and also to check whether the function is expected to return a value), the statement to be executed next, a continuation value to be executed afterwards (where successor statements and details of function calls and loops are stored), the local environment mapping variables to memory locations and the current memory state. \todo{need to note that all variables 'appear' to be memory allocated, even if they're subsequently optimised away.} The function call and return states appear to store less information because the details of the caller are contained in the continuation. During normal execution the state contains the currently executing function's definition (used to find \lstinline'goto' labels and also to check whether the function is expected to return a value), the statement to be executed next, a continuation value to be executed afterwards (where successor statements and details of function calls and loops are stored), the local environment mapping variables to memory locations\footnote{In the semantics all variables are allocated, although the compiler may subsequently allocate them to registers where possible.} and the current memory state.  \todo{need to note that all variables 'appear' to be memory allocated, even if they're subsequently optimised away.}  The function call and return states appear to store less information because the details of the caller are contained in the continuation. An example of the statement execution rules is the assignment rule \begin{itemize} \item if \lstinline'a1' can evaluate to the location \lstinline'psp,loc,ofs', \item \lstinline'a2' evaluates to a value \lstinline'v2', and \item \lstinline'a2' can evaluate to a value \lstinline'v2', and \item storing \lstinline'v2' at location \lstinline'psp,loc,ofs' succeeds, yielding the new memory state \lstinline-m'-, then \end{itemize} This rule would be followed by one of the rules to which replaces the \lstinline'Sskip' statement with the real' next statement constructed from the continuation \lstinline'k'.  Note that the only true side-effect here is the change in memory --- the local environment is initialised once and for all on function entry, and the only events appearing in the trace are cost labels used purely for accounting. \todo{ordering imposed by cost labels?} \lstinline'Sskip' statement with the real' next statement constructed from the continuation \lstinline'k'.  Note that the only true side-effect here is the change in memory --- the local environment is initialised once and for all on function entry, and the only events appearing in the trace are cost labels used purely for accounting.  At present this imposes an ordering due to the cost labels.  Should this prove too restrictive we may change it to produce a set of labels encountered. The \clight{} language provides input and output effects through external' \label{sec:exec} \todo{bit of a cheat to claim that it's equivalent without a completeness proof.} We have added an equivalent functional version of the \clight{} semantics that can be used to animate programs.  The definitions roughly follow the inductive semantics, but are necessarily rearranged around pattern matching of the relevant parts of the state rather than presenting each case separately. \subsection{Expressions} The code corresponding to the variable lookup definitions on rule from the inductive semantics. \subsection{Statements} Evaluating a step of a statement is complicated by the presence of the external' functions for I/O, which can return arbitrary values.  These are