# Changeset 528 for Deliverables/D4.1/ITP-Paper/itp-2011.tex

Ignore:
Timestamp:
Feb 16, 2011, 11:14:29 AM (9 years ago)
Message:

Removed all \begin{quote} .. \end{quote} around code snippets for better layout. Reduced to 14 pages as a result.

File:
1 edited

### Legend:

Unmodified
 r527 Instead, we chose to use a modified form of trie, where paths are represented by bitvectors. As bitvectors were widely used in our implementation already for representing integers, this worked well: \begin{quote} \begin{lstlisting} inductive BitVectorTrie (A: Type[0]): nat $\rightarrow$ Type[0] ≝ | Stub: ∀n. BitVectorTrie A n. \end{lstlisting} \end{quote} Here, \texttt{Stub} is a constructor that can appear at any point in our tries. It internalises the notion of uninitialized data'. The internal state of our Matita emulator is represented as a record: \begin{quote} \begin{lstlisting} record Status: Type[0] ≝ }. \end{lstlisting} \end{quote} This record neatly encapsulates the current memory contents, the program counter, the state of the current SFRs, and so on. One peculiarity is the packing of the 24 combined SFRs into fixed length vectors. Both the Matita and O'Caml emulators follows the classic fetch-decode-execute' model of processor operation. The most important functions in the Matita emulator are highlighted below. \begin{quote} \begin{lstlisting} definition fetch: BitVectorTrie Byte 16 → Word → instruction × Word × nat := ... \end{lstlisting} \end{quote} The next instruction, indexed by the program counter, is fetched from code memory with \texttt{fetch}. An updated program counter, along with the concrete cost, in processor cycles for executing this instruction, is also returned. These costs are taken from a Siemen's data sheet for the MCS-51, and will likely vary across manufacturers and particular derivatives of the processor. \begin{quote} \begin{lstlisting} definition fetch: BitVectorTrie Byte 16 $\rightarrow$ Word $\rightarrow$ instruction $\times$ Word $\times$ nat := ... \end{lstlisting} \end{quote} A single instruction is assembled into its corresponding bit encoding with \texttt{assembly1}: \begin{quote} \begin{lstlisting} definition assembly1: instruction $\rightarrow$ list Byte := ... \end{lstlisting} \end{quote} An assembly program, consisting of a preamble containing global data, and a list of (pseudo)instructions, is assembled using \texttt{assembly}. Pseudoinstructions and labels are eliminated in favour of concrete instructions from the MCS-51 instruction set. A map associating memory locations and cost labels (see Subsection~\ref{subsect.computation.cost.traces}) is also produced. \begin{quote} \begin{lstlisting} definition assembly: assembly_program $\rightarrow$ option (list Byte $\times$ (BitVectorTrie String 16)) := ... \end{lstlisting} \end{quote} A single execution step of the processor is evaluated using \texttt{execute\_1}, mapping a \texttt{Status} to a \texttt{Status}: \begin{quote} \begin{lstlisting} definition execute_1: Status → Status := ... \end{lstlisting} \end{quote} Multiple steps of processor execution are implemented in \texttt{execute}, which wraps \texttt{execute\_1}: \begin{quote} \begin{lstlisting} let rec execute (n: nat) (s: Status) on n: Status := ... \end{lstlisting} \end{quote} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Such non-orthogonality in the instruction set was handled with the use of polymorphic variants in the O'Caml emulator. For instance, we introduced types corresponding to each addressing mode: \begin{quote} \begin{lstlisting} type direct = [ DIRECT of byte ] ... \end{lstlisting} \end{quote} Which were then used in our inductive datatype for assembly instructions, as follows: \begin{quote} \begin{lstlisting} type 'addr preinstruction = ... \end{lstlisting} \end{quote} Here, \texttt{union6} is a disjoint union type, defined as follows: \begin{quote} \begin{lstlisting} type ('a,'b,'c,'d,'e,'f) union6 = [ U1 of 'a | ... | U6 of 'f ] \end{lstlisting} \end{quote} For our purposes, the types \texttt{union2}, \texttt{union3} and \texttt{union6} sufficed. We first provided an inductive data type representing all possible addressing modes, a type that functions will pattern match against: \begin{quote} \begin{lstlisting} inductive addressing_mode: Type[0] ≝ ... \end{lstlisting} \end{quote} We also wished to express in the type of functions the \emph{impossibility} of pattern matching against certain constructors. In order to do this, we introduced an inductive type of addressing mode tags'. The constructors of \texttt{addressing\_mode\_tag} are in one-to-one correspondence with the constructors of \texttt{addressing\_mode}: \begin{quote} \begin{lstlisting} inductive addressing_mode_tag : Type[0] ≝ ... \end{lstlisting} \end{quote} A function that checks whether an \texttt{addressing\_mode} is `morally' an \texttt{addressing\_mode\_tag} is provided, as follows: \begin{quote} \begin{lstlisting} let rec is_a (d: addressing_mode_tag) (A: addressing_mode) on d ≝ ... \end{lstlisting} \end{quote} We also extend this check to vectors of \texttt{addressing\_mode\_tag}'s in the obvious manner: \begin{quote} \begin{lstlisting} let rec is_in (n) (l: Vector addressing_mode_tag n) (A: addressing_mode) on l ≝ is_a he A $\vee$ is_in ? tl A ]. \end{lstlisting} \end{quote} Here \texttt{VEmpty} and \texttt{VCons} are the two constructors of the \texttt{Vector} data type, and $\mathtt{\vee}$ is inclusive disjunction on Booleans. \begin{quote} Here $\mathtt{\vee}$ is inclusive disjunction on the \texttt{bool} datatype. \begin{lstlisting} record subaddressing_mode (n: Nat) (l: Vector addressing_mode_tag (S n)) : Type[0] ≝ }. \end{lstlisting} \end{quote} We can now provide an inductive type of preinstructions with precise typings: \begin{quote} \begin{lstlisting} inductive preinstruction (A: Type[0]): Type[0] ≝ ... \end{lstlisting} \end{quote} Here $\llbracket - \rrbracket$ is syntax denoting a vector. We see that the constructor \texttt{ADD} expects two parameters, the first being the accumulator A (\texttt{acc\_a}), and the second being one of a register, direct, indirect or data addressing mode. The previous machinery allows us to state in the type of a function what addressing modes that function expects. For instance, consider \texttt{set\_arg\_16}, which expects only a \texttt{DPTR}: \begin{quote} \begin{lstlisting} definition set_arg_16: Status $\rightarrow$ Word $\rightarrow$ $\llbracket$ dptr $\rrbracket$ $\rightarrow$ Status ≝ ] (subaddressing_modein $\ldots$ a). \end{lstlisting} \end{quote} All other cases are discharged by the catch-all at the bottom of the match expression. Attempting to match against another addressing mode not indicated in the type (for example, \texttt{REGISTER}) will produce a type-error. To accurately model timers, we must modify the central \texttt{status} record of the emulator to keep track of the current time: \begin{quote} \begin{lstlisting} type time = int type status = { ... clock: time; ... } \end{lstlisting} \end{quote} Before every execution step, the \texttt{clock} is incremented by the number of processor cycles that the instruction just fetched will take to execute. The processor then executes the instruction, followed by the code implementing the timers.