 Timestamp:
 Sep 10, 2012, 4:45:44 PM (7 years ago)
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

src/ASM/CPP2012asm/cpp2012asm.tex
r2095 r2327 49 49 This assembler constitutes a major component of the EU's CerCo project. 50 50 51 The efficient expansion of pseudoinstructions particularly jumpsinto MCS51 machine instructions is complex.51 The efficient expansion of pseudoinstructionsnamely jumpsinto MCS51 machine instructions is complex. 52 52 We isolate the decision making over how jumps should be expanded from the expansion process itself as much as possible using `policies'. 53 53 This makes the proof of correctness for the assembler significantly more straightforward. … … 102 102 103 103 Abandoning this attempt, we instead split the `policy'the decision over how any particular jump should be expandedfrom the implementation that actually expands assembly programs into machine code. 104 Assuming the existence of a correct policy, we prove dthe implementation of the assembler correct.104 Assuming the existence of a correct policy, we prove the implementation of the assembler correct. 105 105 Further, we proved that the assembler fails to assemble an assembly program if and only if a correct policy does not exist. 106 106 This is achieved by means of dependent types: the assembly function is total over a program, a policy and the proof that the policy is correct for that program. … … 111 111 The rest of this paper is a detailed description of our proof that is, in part, still a work in progress. 112 112 113 %  % 114 % SECTION % 115 %  % 116 \subsection{Overview of the paper} 117 \label{subsect.overview.of.the.paper} 113 We provide the reader with a brief `roadmap' for the rest of the paper. 118 114 In Section~\ref{sect.matita} we provide a brief overview of the Matita proof assistant for the unfamiliar reader. 119 115 In Section~\ref{sect.the.proof} we discuss the design and implementation of the proof proper. … … 127 123 128 124 Matita is a proof assistant based on a variant of the Calculus of (Co)inductive Constructions~\cite{asperti:user:2007}. 129 In particular, it features dependent types that we heavilyexploit in the formalisation.125 In particular, it features dependent types that we exploit in the formalisation. 130 126 The syntax of the statements and definitions in the paper should be selfexplanatory, at least to those exposed to dependent type theory. 131 We only remark th e use ofof `$\mathtt{?}$' or `$\mathtt{\ldots}$' for omitting single terms or sequences of terms to be inferred automatically by the system, respectively.127 We only remark that the use of `$\mathtt{?}$' or `$\mathtt{\ldots}$' for omitting single terms or sequences of terms to be inferred automatically by the system, respectively. 132 128 Those that are not inferred are left to the user as proof obligations. 133 129 Pairs are denoted with angular brackets, $\langle, \rangle$. … … 155 151 156 152 Our emulator centres around a \texttt{Status} record, describing the microprocessor's state. 157 This record contains fields corresponding to the microprocessor's program counter, registers, and so on.153 This record contains fields corresponding to the microprocessor's program counter, registers, stack pointer, special function registers, and so on. 158 154 At the machine code level, code memory is implemented as a compact trie of bytes addressed by the program counter. 159 155 We parameterise \texttt{Status} records by this representation as a few technical tasks manipulating statuses are made simpler using this approach, as well as permitting a modicum of abstraction. … … 261 257 The function \texttt{assembly\_1\_pseudo\_instruction} used in the body of the paper is essentially the composition of the two passes. 262 258 263 The branch displacement problem s consists of the problemof expanding pseudojumps into their concrete counterparts, preferably as efficiently as possible.259 The branch displacement problem refers to the task of expanding pseudojumps into their concrete counterparts, preferably as efficiently as possible. 264 260 For instance, the MCS51 features three unconditional jump instructions: \texttt{LJMP} and \texttt{SJMP}`long jump' and `short jump' respectivelyand an 11bit oddity of the MCS51, \texttt{AJMP}. 265 261 Each of these three instructions expects arguments in different sizes and behaves in markedly different ways: \texttt{SJMP} may only perform a `local jump'; \texttt{LJMP} may jump to any address in the MCS51's memory space and \texttt{AJMP} may jump to any address in the current memory page. … … 345 341 nth j a = nth (add $\ldots$ (sigma ppc) (bitvector_of_nat ? j))) assembled 346 342 \end{lstlisting} 347 In plain words, the type of assemblystates the following.343 In plain words, the type of \texttt{assembly} states the following. 348 344 Suppose we are given a policy that is correct for the program we are assembling, and suppose the program to be assembled is fully addressable by a 16bit word. 349 345 Then if we fetch from the pseudo program counter \texttt{ppc} we get a pseudoinstruction \texttt{pi} and a new pseudo program counter \texttt{ppc}. … … 419 415 \end{lstlisting} 420 416 421 Here we use $\pi_1 \ldots$ to eject out the existential witness from the Russelltyped function \texttt{assembly}.417 Here we use $\pi_1 \ldots$ to project the existential witness from the Russelltyped function \texttt{assembly}. 422 418 423 419 We read \texttt{fetch\_assembly\_pseudo2} as follows. … … 426 422 The fetched sequence corresponds to the expansion, according to \texttt{sigma}, of the pseudoinstruction. 427 423 428 At first, the lemma sappears to immediately imply the correctness of the assembler.424 At first, the lemma appears to immediately imply the correctness of the assembler. 429 425 However, this property is \emph{not} strong enough to establish that the semantics of an assembly program has been preserved by the assembly process since it does not establish the correspondence between the semantics of a pseudoinstruction and that of its expansion. 430 426 In particular, the two semantics differ on instructions that \emph{could} directly manipulate program addresses. … … 455 451 456 452 We believe that this approach is more flexible when compared to the traditional approach, as in principle it allows us to introduce some permitted \emph{benign} manipulations of addresses that the traditional approach, using opaque addresses, cannot handle, therefore expanding the set of input programs that can be assembled correctly. 457 This approach, of using real addresses coupled with a weak, dynamic typing system sitting atop of memory, is similar to one taken by Huth \emph{et al}~\cite{tuch:types:2007}, for reasoning about lowlevel C code.453 This approach, of using real addresses coupled with a weak, dynamic typing system sitting atop of memory, is similar to one taken by Tuch \emph{et al}~\cite{tuch:types:2007}, for reasoning about lowlevel C code. 458 454 459 455 Our analogue of the semantic function above is then merely a wrapper around the function that implements the semantics of machine code, paired with a function that keeps track of addresses. … … 470 466 Note here that all addresses are 16 bit words, but are stored (and manipulated) as 8 bit bytes. 471 467 All \texttt{MOV} instructions in the MCS51 must use the accumulator \texttt{A} as an intermediary, moving a byte at a time. 472 The second component of \texttt{internal\_pseudo\_address\_map} therefore states whether the accumulator currently holds a piece of an address, and if so, whether it is the upper or lower byte of the address (using the boolean flag) complete with the corresponding ,source address in full.468 The second component of \texttt{internal\_pseudo\_address\_map} therefore states whether the accumulator currently holds a piece of an address, and if so, whether it is the upper or lower byte of the address (using the boolean flag) complete with the corresponding source address in full. 473 469 The first component, on the other hand, performs a similar task for the rest of external RAM. 474 470 Again, we use a boolean flag to describe whether a byte is the upper or lower component of a 16bit address. … … 565 561 Our formalisation exploits dependent types in different ways and for multiple purposes. 566 562 The first purpose is to reduce potential errors in the formalisation of the microprocessor. 567 In particular, dependent types are used to constrain tthe size of bitvectors and tries that represent memory quantities and memory areas respectively.568 They are also used to simulate polymorphic variants in Matita, in order to provide precise typings to various functions expecting only a subset of all possible addressing modes tha nthe MCS51 offers.563 In particular, dependent types are used to constrain the size of bitvectors and tries that represent memory quantities and memory areas respectively. 564 They are also used to simulate polymorphic variants in Matita, in order to provide precise typings to various functions expecting only a subset of all possible addressing modes that the MCS51 offers. 569 565 Polymorphic variants nicely capture the absolutely unorthogonal instruction set of the MCS51 where every opcode must accept its own subset of the 11 addressing mode of the processor. 570 566
Note: See TracChangeset
for help on using the changeset viewer.