source: C-semantics/8051-Memory/memory.tex @ 189

Last change on this file since 189 was 157, checked in by campbell, 10 years ago

Make proposed memory spaces semantics more explicit.

File size: 10.2 KB
Line 
1\documentclass[a4paper]{article}
2\usepackage{graphics}
3\usepackage{listings}
4
5\lstset{language=C,basicstyle=\small\tt,columns=flexible,breaklines=false,
6        % numbers=left, numberstyle=\tiny, stepnumber=2, numbersep=5pt
7        %keywordstyle=\color{red}\bfseries,
8        %keywordstyle=[2]\color{blue},
9        %commentstyle=\color{green},
10        %stringstyle=\color{blue},
11        showspaces=false,showstringspaces=false}
12
13\newcommand{\sdcc}{\texttt{sdcc}}
14
15\begin{document}
16\title{C extensions for the 8051}
17\maketitle
18
19The 8051 (as modern implementations have it) features a rather complex memory
20arrangement, see Figure~\ref{fig:memory}.  This document outlines the C
21extensions present in the \sdcc\ compiler for dealing with them, and suggests
22what the corresponding extensions in the CerCo project should be.
23
24\begin{figure}[h]
25\setlength{\unitlength}{1pt}
26\begin{picture}(410,250)(-50,200)
27%\put(-50,200){\framebox(410,250){}}
28\put(12,410){\makebox(80,0)[b]{Internal (256B)}}
29\put(13,242){\line(0,1){165}}
30\put(93,242){\line(0,1){165}}
31\put(13,407){\line(1,0){80}}
32\put(12,400){\makebox(0,0)[r]{0h}}  \put(14,400){\makebox(0,0)[l]{Register bank 0}}
33\put(13,393){\line(1,0){80}}
34\put(12,386){\makebox(0,0)[r]{8h}}  \put(14,386){\makebox(0,0)[l]{Register bank 1}}
35\put(13,379){\line(1,0){80}}
36\put(12,372){\makebox(0,0)[r]{10h}}  \put(14,372){\makebox(0,0)[l]{Register bank 2}}
37\put(13,365){\line(1,0){80}}
38\put(12,358){\makebox(0,0)[r]{18h}} \put(14,358){\makebox(0,0)[l]{Register bank 3}}
39\put(13,351){\line(1,0){80}}
40\put(12,344){\makebox(0,0)[r]{20h}} \put(14,344){\makebox(0,0)[l]{Bit addressable}}
41\put(13,323){\line(1,0){80}}
42\put(12,316){\makebox(0,0)[r]{30h}}
43  \put(14,309){\makebox(0,0)[l]{\quad \vdots}}
44\put(13,291){\line(1,0){80}}
45\put(12,284){\makebox(0,0)[r]{80h}}
46  \put(14,263){\makebox(0,0)[l]{\quad \vdots}}
47\put(12,249){\makebox(0,0)[r]{ffh}}
48\put(13,242){\line(1,0){80}}
49
50\qbezier(-2,407)(-6,407)(-6,393)
51\qbezier(-6,393)(-6,324)(-10,324)
52\put(-12,324){\makebox(0,0)[r]{indirect}}
53\qbezier(-6,256)(-6,324)(-10,324)
54\qbezier(-2,242)(-6,242)(-6,256)
55
56\qbezier(94,407)(98,407)(98,393)
57\qbezier(98,393)(98,349)(102,349)
58\put(104,349){\makebox(0,0)[l]{direct/stack}}
59\qbezier(98,305)(98,349)(102,349)
60\qbezier(94,291)(98,291)(98,305)
61
62\put(102,242){\framebox(20,49){sfr}}
63% bit access to sfrs?
64
65\qbezier(124,291)(128,291)(128,277)
66\qbezier(128,277)(128,266)(132,266)
67\put(134,266){\makebox(0,0)[l]{direct}}
68\qbezier(128,257)(128,266)(132,266)
69\qbezier(124,242)(128,242)(128,256)
70
71\put(164,410){\makebox(80,0)[b]{External (64kB)}}
72\put(164,220){\line(0,1){187}}
73\put(164,407){\line(1,0){80}}
74\put(244,220){\line(0,1){187}}
75\put(164,242){\line(1,0){80}}
76\put(163,400){\makebox(0,0)[r]{0h}}
77\put(164,324){\makebox(80,0){paged access}}
78  \put(164,310){\makebox(80,0){direct/indirect}}
79\put(163,235){\makebox(0,0)[r]{80h}}
80  \put(164,228){\makebox(80,0){\vdots}}
81  \put(164,210){\makebox(80,0){direct/indirect}}
82
83\put(264,410){\makebox(80,0)[b]{Code (64kB)}}
84\put(264,220){\line(0,1){187}}
85\put(264,407){\line(1,0){80}}
86\put(344,220){\line(0,1){187}}
87\put(263,400){\makebox(0,0)[r]{0h}}
88  \put(264,228){\makebox(80,0){\vdots}}
89  \put(264,324){\makebox(80,0){direct}}
90  \put(264,310){\makebox(80,0){PC relative}}
91
92\end{picture}
93\caption{(extended) 8051 memory layout --- not to scale}
94\label{fig:memory}
95\end{figure}
96
97\section{\sdcc\ extensions}
98
99\sdcc\ adds a number of \emph{storage classes} to the standard C classes
100(\lstinline'auto', \lstinline'extern', and so on) to specify the memory space
101that should be used for storage.  They can also be used to refine pointer
102types for particular areas\footnote{One side effect of this dual meaning is
103that they appear as \emph{type specifiers} in the extended C grammar.}.  The
104new storage classes and corresponding pointer sizes are:
105\begin{table}[ht]
106\begin{tabular}{rcl}
107Attribute & Pointer size & Memory space \\\hline
108\lstinline'__data' & 1 & Internal, first half (0h -- 7fh) \\
109\lstinline'__idata' & 1 & Internal, indirect only (80h -- ffh) \\
110\lstinline'__pdata' & 1 & External, page access (usually 0h -- 7fh) \\
111\lstinline'__xdata' & 2 & External, any (0h -- ffffh) \\
112\lstinline'__code' & 2 & Code, any (0h -- ffffh) \\
113\emph{none}& 3 & Any / Generic pointer
114\end{tabular}
115\end{table}
116
117There are two areas of overlap between these classes: generic pointers overlap
118with everything else and one page of external memory can be accessed by both
119\lstinline'__pdata' and \lstinline'__xdata' pointers.
120
121Two types with an implied storage class are also defined.  The
122\lstinline'__bit' type declare a single bit variable, and \lstinline'__sbit',
123\lstinline'__sfr', \lstinline'__sfr16' and \lstinline'__sfr32' declare
124`variables' which access the \emph{Special Function Registers} (SFR).
125Pointers to these types are not allowed --- the corresponding 8051 instructions
126only allow direct addressing for them.
127
128These are all defined in Section~3.4.1 of the \sdcc\ manual.
129
130Explicit addresses can be given for variables, using the
131\lstinline'__at (0x1234)' syntax.  From the manual, it appears that these are
132sometimes used with \lstinline'volatile' in preference to using the SFR types,
133or where memory mapped I/O appears in the extended memory space.
134
135\subsection{Pointer operations}
136
137C allows pointers to be compared for equality and order, have integers added
138to them, and for subtraction of two pointers to yield an integer difference.
139
140\sdcc\ forbids comparison between incompatible pointer types, allows some
141form of subtraction for mixed pointers (even though it doesn't make much
142sense) and appears to reject mixed order comparisons, even when one of the
143pointers is generic.
144
145\section{Proposals}
146
147Where these are more restrictive than \sdcc, the intention is to prevent
148aliasing or reduce complexity.
149
150\begin{itemize}
151\item Allow each of the ordinary storage classes from \sdcc, but forbid
152explicit addresses except for \lstinline'volatile' variables.  The compiler
153should ensure that \lstinline'volatile' variables do not overlap with any
154non-\lstinline'volatile' data.  Note that this means that the semantics do not
155include accessing registers through a pointer.
156\item No bit variables are provided to the programmer (although the compiler
157could choose to use them internally).
158\item Bit and byte access to SFRs will be provided by `external functions' (in the CompCert sense).  They could be compiled to the corresponding instruction
159rather than implemented with actual functions.
160\item Casting between pointer types for different memory space attributes is
161defined when the pointer can be represented in both.  In particular, conversion
162to and from a generic pointer type to a specific memory area pointer type is
163guaranteed to work on values pointing to that memory area, and conversions
164between \lstinline'__pdata' and \lstinline'__xdata' pointers will work on
165values that point to the \lstinline'__pdata' space.  Other conversions are
166undefined.
167\item Zeros of any integer type can be converted to any pointer type to give
168a null pointer.  Conversions of any other integer are undefined.  [We could
169extend this to give a semantics for more conversions, or restrict it to
170only deal with integer constant zeros.]
171\item Only provide equality, ordering and subtraction operations on pointer
172types with the same class (with promotion to generic?).  Note that comparison
173between pointers to different blocks is not specified by C or CompCert anyway.
174\item String literals are placed in \lstinline'__code' memory.
175\end{itemize}
176There are no alignment constraints on the 8051.
177
178\section{Impact on formal C semantics}
179
180Values, types and the memory model are changed as follows:
181\begin{itemize}
182\item Each pointer value has a `class' associated with it, either generic or
183identifying a particular memory space.  Each pointer type has a corresponding
184space, too.
185\item Array types are also given a space, because the underlying values are
186pointers.
187\item Each block has a class associated with it, either `unspecified' or
188identifying a particular memory space.
189\end{itemize}
190The behaviour of casts between pointer types is defined only when the target
191pointer type is suitable for the memory space pointed at.  Thus casts now
192depend upon the memory rather than just the values and types.
193
194Stack memory is treated as in some `unspecified' space at present.  This has
195the problem that only generic pointers may be used to refer to it.  It may
196be possible to parametrise the semantics by the stack memory space to allow
197for a choice of memory model (like the \texttt{--model-}$\star$ options in
198\sdcc{}).
199
200SFR access (both by byte and by bit) do not require any changes as they will
201be implemented as `external' I/O functions.
202
203The definitions for alignment constraints from CompCert may be kept for
204consistency with CompCert, but made vacuous by only `requiring' byte alignment.
205Note that the size of pointer depends on the memory space associated with its
206type, and that we need to deal with 24 bit `chunks' for generic pointer values.
207
208\section{Changes to CIL and Clight conversion}
209
210The prototype compiler uses the CIL-derived C parser from CompCert.  This can
211be altered to accept the memory spaces as extra attributes for declarations.
212Note that this allows them to appear in meaningless places (such as local
213variable declarations which are always held in stack or registers), which
214ought to be rejected during translation to Clight.
215
216As an extreme example, consider the following:
217\begin{lstlisting}
218/* fn1 is a global variable located in idata memory,
219   which contains a pointer to a value in code memory,
220   which is a function that takes and returns a pointer to an integer in xdata.
221 */
222__xdata int * __code (* __idata fn1)(__xdata int *);
223\end{lstlisting}
224
225As noted above, the Clight abstract syntax trees are extended by memory spaces
226added to the pointer and array types, and added to global variable declarations.
227To support this, the transformations in CIL are extended to keep track of the
228memory space for each expression during elaboration so that the appropriate
229pointer types and casts are inserted.
230
231The Clight version of the above variable declaration is:
232\begin{lstlisting}
233(mk_pair ?? (mk_pair ?? (mk_pair ?? (succ_pos_of_nat 143 (* fn1 *))
234    [(Init_space 4) ]) IData)
235    (Tpointer Code (Tfunction (Tcons (Tpointer XData (Tint I32 Signed  )) Tnil)
236                              (Tpointer XData (Tint I32 Signed  )))))]
237\end{lstlisting}
238
239\end{document}
Note: See TracBrowser for help on using the repository browser.