Changeset 2327 for src/ASM/CPP2012-asm


Ignore:
Timestamp:
Sep 10, 2012, 4:45:44 PM (7 years ago)
Author:
mulligan
Message:

Fixed typos in paper highlighted by referees. More substantial changes will have to wait due to us moving house.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • src/ASM/CPP2012-asm/cpp-2012-asm.tex

    r2095 r2327  
    4949This assembler constitutes a major component of the EU's CerCo project.
    5050
    51 The efficient expansion of pseudoinstructions---particularly jumps---into MCS-51 machine instructions is complex.
     51The efficient expansion of pseudoinstructions---namely jumps---into MCS-51 machine instructions is complex.
    5252We isolate the decision making over how jumps should be expanded from the expansion process itself as much as possible using `policies'.
    5353This makes the proof of correctness for the assembler significantly more straightforward.
     
    102102
    103103Abandoning this attempt, we instead split the `policy'---the decision over how any particular jump should be expanded---from the implementation that actually expands assembly programs into machine code.
    104 Assuming the existence of a correct policy, we proved the implementation of the assembler correct.
     104Assuming the existence of a correct policy, we prove the implementation of the assembler correct.
    105105Further, we proved that the assembler fails to assemble an assembly program if and only if a correct policy does not exist.
    106106This is achieved by means of dependent types: the assembly function is total over a program, a policy and the proof that the policy is correct for that program.
     
    111111The rest of this paper is a detailed description of our proof that is, in part, still a work in progress.
    112112
    113 % ---------------------------------------------------------------------------- %
    114 % SECTION                                                                      %
    115 % ---------------------------------------------------------------------------- %
    116 \subsection{Overview of the paper}
    117 \label{subsect.overview.of.the.paper}
     113We provide the reader with a brief `roadmap' for the rest of the paper.
    118114In Section~\ref{sect.matita} we provide a brief overview of the Matita proof assistant for the unfamiliar reader.
    119115In Section~\ref{sect.the.proof} we discuss the design and implementation of the proof proper.
     
    127123
    128124Matita is a proof assistant based on a variant of the Calculus of (Co)inductive Constructions~\cite{asperti:user:2007}.
    129 In particular, it features dependent types that we heavily exploit in the formalisation.
     125In particular, it features dependent types that we exploit in the formalisation.
    130126The syntax of the statements and definitions in the paper should be self-explanatory, at least to those exposed to dependent type theory.
    131 We only remark the use of of `$\mathtt{?}$' or `$\mathtt{\ldots}$' for omitting single terms or sequences of terms to be inferred automatically by the system, respectively.
     127We only remark that the use of `$\mathtt{?}$' or `$\mathtt{\ldots}$' for omitting single terms or sequences of terms to be inferred automatically by the system, respectively.
    132128Those that are not inferred are left to the user as proof obligations.
    133129Pairs are denoted with angular brackets, $\langle-, -\rangle$.
     
    155151
    156152Our emulator centres around a \texttt{Status} record, describing the microprocessor's state.
    157 This record contains fields corresponding to the microprocessor's program counter, registers, and so on.
     153This record contains fields corresponding to the microprocessor's program counter, registers, stack pointer, special function registers, and so on.
    158154At the machine code level, code memory is implemented as a compact trie of bytes addressed by the program counter.
    159155We parameterise \texttt{Status} records by this representation as a few technical tasks manipulating statuses are made simpler using this approach, as well as permitting a modicum of abstraction.
     
    261257The function \texttt{assembly\_1\_pseudo\_instruction} used in the body of the paper is essentially the composition of the two passes.
    262258
    263 The branch displacement problems consists of the problem of expanding pseudojumps into their concrete counterparts, preferably as efficiently as possible.
     259The branch displacement problem refers to the task of expanding pseudojumps into their concrete counterparts, preferably as efficiently as possible.
    264260For instance, the MCS-51 features three unconditional jump instructions: \texttt{LJMP} and \texttt{SJMP}---`long jump' and `short jump' respectively---and an 11-bit oddity of the MCS-51, \texttt{AJMP}.
    265261Each of these three instructions expects arguments in different sizes and behaves in markedly different ways: \texttt{SJMP} may only perform a `local jump'; \texttt{LJMP} may jump to any address in the MCS-51's memory space and \texttt{AJMP} may jump to any address in the current memory page.
     
    345341      nth j a = nth (add $\ldots$ (sigma ppc) (bitvector_of_nat ? j))) assembled
    346342\end{lstlisting}
    347 In plain words, the type of assembly states the following.
     343In plain words, the type of \texttt{assembly} states the following.
    348344Suppose we are given a policy that is correct for the program we are assembling, and suppose the program to be assembled is fully addressable by a 16-bit word.
    349345Then if we fetch from the pseudo program counter \texttt{ppc} we get a pseudoinstruction \texttt{pi} and a new pseudo program counter \texttt{ppc}.
     
    419415\end{lstlisting}
    420416
    421 Here we use $\pi_1 \ldots$ to eject out the existential witness from the Russell-typed function \texttt{assembly}.
     417Here we use $\pi_1 \ldots$ to project the existential witness from the Russell-typed function \texttt{assembly}.
    422418
    423419We read \texttt{fetch\_assembly\_pseudo2} as follows.
     
    426422The fetched sequence corresponds to the expansion, according to \texttt{sigma}, of the pseudoinstruction.
    427423
    428 At first, the lemmas appears to immediately imply the correctness of the assembler.
     424At first, the lemma appears to immediately imply the correctness of the assembler.
    429425However, this property is \emph{not} strong enough to establish that the semantics of an assembly program has been preserved by the assembly process since it does not establish the correspondence between the semantics of a pseudoinstruction and that of its expansion.
    430426In particular, the two semantics differ on instructions that \emph{could} directly manipulate program addresses.
     
    455451
    456452We believe that this approach is more flexible when compared to the traditional approach, as in principle it allows us to introduce some permitted \emph{benign} manipulations of addresses that the traditional approach, using opaque addresses, cannot handle, therefore expanding the set of input programs that can be assembled correctly.
    457 This approach, of using real addresses coupled with a weak, dynamic typing system sitting atop of memory, is similar to one taken by Huth \emph{et al}~\cite{tuch:types:2007}, for reasoning about low-level C code.
     453This approach, of using real addresses coupled with a weak, dynamic typing system sitting atop of memory, is similar to one taken by Tuch \emph{et al}~\cite{tuch:types:2007}, for reasoning about low-level C code.
    458454
    459455Our analogue of the semantic function above is then merely a wrapper around the function that implements the semantics of machine code, paired with a function that keeps track of addresses.
     
    470466Note here that all addresses are 16 bit words, but are stored (and manipulated) as 8 bit bytes.
    471467All \texttt{MOV} instructions in the MCS-51 must use the accumulator \texttt{A} as an intermediary, moving a byte at a time.
    472 The second component of \texttt{internal\_pseudo\_address\_map} therefore states whether the accumulator currently holds a piece of an address, and if so, whether it is the upper or lower byte of the address (using the boolean flag) complete with the corresponding, source address in full.
     468The second component of \texttt{internal\_pseudo\_address\_map} therefore states whether the accumulator currently holds a piece of an address, and if so, whether it is the upper or lower byte of the address (using the boolean flag) complete with the corresponding source address in full.
    473469The first component, on the other hand, performs a similar task for the rest of external RAM.
    474470Again, we use a boolean flag to describe whether a byte is the upper or lower component of a 16-bit address.
     
    565561Our formalisation exploits dependent types in different ways and for multiple purposes.
    566562The first purpose is to reduce potential errors in the formalisation of the microprocessor.
    567 In particular, dependent types are used to constraint the size of bitvectors and tries that represent memory quantities and memory areas respectively.
    568 They are also used to simulate polymorphic variants in Matita, in order to provide precise typings to various functions expecting only a subset of all possible addressing modes than the MCS-51 offers.
     563In particular, dependent types are used to constrain the size of bitvectors and tries that represent memory quantities and memory areas respectively.
     564They are also used to simulate polymorphic variants in Matita, in order to provide precise typings to various functions expecting only a subset of all possible addressing modes that the MCS-51 offers.
    569565Polymorphic variants nicely capture the absolutely unorthogonal instruction set of the MCS-51 where every opcode must accept its own subset of the 11 addressing mode of the processor.
    570566
Note: See TracChangeset for help on using the changeset viewer.