Changeset 2348 for Papers/cpp-asm-2012


Timestamp: Sep 26, 2012, 3:43:36 PM
Author: mulligan
Message: Edited up to Sec. 3.3
File: 1 edited

Legend: removed lines are prefixed with `-', added lines with `+'; unchanged context lines carry no prefix.
  • Papers/cpp-asm-2012/cpp-2012-asm.tex

r2347 → r2348
formalisation is available at~\url{http://cerco.cs.unibo.it}.

- In Section~\ref{subsect.machine.code.semantics} we sketch an operational
- semantics (a realistic and efficient emulator) for the MCS-51.
- We also introduce a syntax for decoded instructions that will be reused for
- the assembly language.
-
- In Section~\ref{subsect.assembly.code.semantics} we describe the assembly code
- language and operational semantics. The latter is parametric in the cost model
- that will be induced by the assembler. It fully reuses the semantics of the
- machine code on all `real' (i.e. non pseudo) instructions.
+ In Section~\ref{subsect.machine.code.semantics} we sketch an operational semantics (a realistic and efficient emulator) for the MCS-51.
+ We also introduce a syntax for decoded instructions that will be reused for the assembly language.
+
+ In Section~\ref{subsect.assembly.code.semantics} we describe the assembly language and its operational semantics.
+ The latter is parametric in the cost model that will be induced by the assembler.
+ It reuses the semantics of the machine code on all `real' (i.e. non-pseudo) instructions.

Branch displacement policies are introduced in Section~\ref{subsect.the.assembler}, where we also describe the assembler as a function over policies, as previously discussed.

- The proof of correctness of the assembler consists in showing that the
- object code given in output together with a cost model for the source program
- simulates the source program executed using that cost model. The proof
- can be divided into two main lemmas. The first is correctness with respect
- to fetching, described in
- Section~\ref{subsect.total.correctness.of.the.assembler}. It roughly
- states that one step of fetching at the assembly level that returns the
- decoded instruction $I$ is simulated by
- $n$ steps of fetching at object level that return instructions
- $J_1,\ldots,J_n$, where $J_1,\ldots,J_n$ is, among the possible expansions
- of $I$, the one picked by the policy.
- The second lemma shows that $J_1,\ldots,J_n$ simulates $I$ but only if
- the $I$ is well-behaved, i.e. it manipulates addresses in ways that are
- anticipated in the correctness proof. To keep track of the well-behaved ways,
- we couple the assembly status with a map that records where addresses are
- currently stored in memory or in the registers. Then we introduce a dynamic
- checking function that inspects the assembly status and the map to determine
- if the operation is well behaved. A positive answer is the pre-condition of
- the lemma. The second lemma is part of
- Section~\ref{subsect.total.correctness.for.well.behaved.assembly.programs}
- where we also establish \emph{total correctness} of our assembler as a
- composition of the two lemmas: programs that are well behaved when executed
- under the cost model induced by the compiler are correctly simulated by the
- compiled code.
+ The proof of correctness of the assembler consists in showing that the object code produced as output, together with a cost model for the source program, simulates the source program executed using that cost model.
+ The proof can be divided into two main lemmas.
+ The first is correctness with respect to fetching, described in Section~\ref{subsect.total.correctness.of.the.assembler}.
+ It roughly states that one step of fetching at the assembly level that returns the decoded instruction $I$ is simulated by $n$ steps of fetching at the object level that return instructions $J_1,\ldots,J_n$, where $J_1,\ldots,J_n$ is, amongst the possible expansions of $I$, the one picked by the policy.
+ The second lemma shows that $J_1,\ldots,J_n$ simulates $I$, but only if $I$ is well-behaved, i.e. it manipulates addresses in ways that are anticipated in the correctness proof.
+ To keep track of well-behaved address manipulations, we couple the assembly status with a map that records where addresses are currently stored in memory or in the processor's accumulators.
+ We then introduce a dynamic checking function that inspects the assembly status and this map to determine if the operation is well behaved.
+ An affirmative answer is the pre-condition of the lemma.
+ The second lemma is detailed in Section~\ref{subsect.total.correctness.for.well.behaved.assembly.programs}, where we also establish \emph{total correctness} of our assembler as a composition of the two lemmas: programs that are well behaved when executed under the cost model induced by the compiler are correctly simulated by the compiled code.

% ---------------------------------------------------------------------------- %
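For concreteness, the first of the two lemmas described in the added paragraph above can be rendered schematically as follows. This is an illustrative paraphrase, not the statement from the Matita development: the names $\sigma$ (the policy-induced map from assembly program counters to object-code addresses) and $\mathsf{expand}$ are invented here.

\[
\mathsf{fetch}_{\mathrm{asm}}(\mathit{ppc}) = I
  \;\Longrightarrow\;
  \mathsf{fetch}_{\mathrm{obj}}^{\,n}\bigl(\sigma(\mathit{ppc})\bigr) = \langle J_1, \ldots, J_n \rangle
  \;\wedge\;
  \langle J_1, \ldots, J_n \rangle = \mathsf{expand}_{\mathit{policy}}(I)
\]

Here $n$ and the particular sequence $J_1,\ldots,J_n$ are fixed by the policy's expansion of $I$, so the right-hand conjunct pins down exactly which object-code instructions are fetched.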
     
definition execute_1: $\forall$cm. Status cm $\rightarrow$ Status cm
\end{lstlisting}
- Here \texttt{Status} is parameterised by \texttt{cm}, a code memory represented as a trie.
+ Here \texttt{cm} is our trie of bytes representing code memory.
%The function \texttt{execute} allows one to execute an arbitrary, but fixed (due to Matita's normalisation requirement) number of steps of a program.
- The function \texttt{execute\_1} closely matches the operation of the MCS-51 hardware, as described by a Siemen's manufacturer's data sheet (specifically, this one~\cite{siemens:2011}).
+ The function \texttt{execute\_1} closely matches the operation of the MCS-51 hardware, as described by the manufacturer's (Siemens) data sheet~\cite{siemens:2011}.
We first fetch, using the program counter, from code memory the first byte of the instruction to be executed, decoding the resulting opcode and fetching more bytes as necessary.
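The fetch--decode--execute step just described can be pictured with the following minimal OCaml sketch. It is not the Matita code of the formalisation: the byte-array code memory, the record fields, and the two opcodes handled are simplifying assumptions, chosen only to show the shape of \texttt{execute\_1}.

(* One step of fetch-decode-execute over a status parameterised by its
   code memory.  Only two illustrative opcodes are modelled. *)
type instruction =
  | ADD_IMM of int                   (* add an immediate to the accumulator *)
  | SJMP of int                      (* short relative jump                 *)
  | NOP

type 'cm status = {
  code_memory : 'cm;                 (* bytes of object code                *)
  pc          : int;                 (* program counter                     *)
  acc         : int;                 (* accumulator                         *)
}

(* Fetch the byte at [pc]; here code memory is simply a byte array. *)
let fetch (cm : int array) (pc : int) : int = cm.(pc)

(* Decode an opcode, fetching further argument bytes as needed, and return
   the decoded instruction together with the updated program counter. *)
let fetch_and_decode (st : int array status) : instruction * int =
  match fetch st.code_memory st.pc with
  | 0x24 -> (ADD_IMM (fetch st.code_memory (st.pc + 1)), st.pc + 2)
  | 0x80 -> (SJMP (fetch st.code_memory (st.pc + 1)), st.pc + 2)
  | _    -> (NOP, st.pc + 1)

(* One step of execution, mirroring the shape of [execute_1]; sign
   extension of jump displacements is ignored in this sketch. *)
let execute_1 (st : int array status) : int array status =
  match fetch_and_decode st with
  | (ADD_IMM n, pc') -> { st with acc = (st.acc + n) land 0xff; pc = pc' }
  | (SJMP d, pc')    -> { st with pc = pc' + d }
  | (NOP, pc')       -> { st with pc = pc' }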
Decoded instructions are represented as an inductive type, where $\llbracket - \rrbracket$ denotes a fixed-length vector:
     
These are jumps and calls to absolute addresses, which we do not wish to allow at the assembly level.

- Execution of pseudoinstructions is a function from \texttt{PseudoStatus} to \texttt{PseudoStatus}.
+ Execution of pseudoinstructions is an endofunction on \texttt{PseudoStatus}.
Both \texttt{Status} and \texttt{PseudoStatus} are instances of a more general type parameterised over the representation of code memory.
The more general type is crucial for sharing the majority of the semantics of the two languages.
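One way to picture this sharing is the following OCaml sketch, with all names invented and most of the processor state elided: a single record type is parameterised by the representation of code memory, and the two statuses arise as instantiations of that parameter. It is an illustration only, not the parameterisation used in the Matita development.

(* A status type parameterised over the representation of code memory;
   the processor state in the record is shared by both levels. *)
type pseudo_instruction =
  | Instruction of string            (* a real MCS-51 instruction *)
  | Jump of string                   (* pseudo: jump to a label   *)
  | Call of string                   (* pseudo: call a label      *)

type 'cm generic_status = {
  code_memory     : 'cm;
  program_counter : int;
  accumulator     : int;
  (* ... the rest of the processor state, shared by both levels ... *)
}

(* Machine level: code memory maps addresses to bytes
   (a bitvector trie in the actual formalisation). *)
type status = (int -> int) generic_status

(* Assembly level: code memory is a list of pseudoinstructions. *)
type pseudo_status = pseudo_instruction list generic_status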
     
\end{lstlisting}
The type of \texttt{execute\_1\_pseudo\_instruction} is interesting.
- Note here that our representation of code memory \texttt{cm} is no longer a bitvector trie of bytes, but a list of pseudoinstructions.
- Further, we take in a function that maps program counters (at the assembly level) to pairs of natural numbers representing the number of clock ticks that the pseudoinstructions needs to execute, post expansion.
- We call this function a \emph{costing}, and note that the costing is induced by the particular strategy we use to expand pseudoinstructions.
+ Here our representation of code memory \texttt{cm} is no longer a trie of bytes, but a list of pseudoinstructions.
+ We take in a function that maps program counters (at the assembly level) to pairs of natural numbers representing the number of clock ticks that the pseudoinstruction needs to execute, post expansion.
+ This function is called a \emph{costing}, and the costing is induced by whatever strategy we use to expand pseudoinstructions.
If we change how we expand conditional jumps to labels, for instance, then the costing needs to change, hence \texttt{execute\_1\_pseudo\_instruction}'s parametricity in the costing.
- This costing function expects a proof, at its second argument, stating that the program counter falls within the bounds of the pseudoprogram.
- A similar invariant holds of the pseudoprogram counter passed to the \texttt{execute\_1\_pseudo\_instruction} function.
+ This costing function expects a proof, as its second argument, stating that the program counter falls within the bounds of the pseudoprogram.
+ A similar proof is required for the pseudoprogram counter passed to the \texttt{execute\_1\_pseudo\_instruction} function.

The costing returns \emph{pairs} of natural numbers because, in the case of expanding conditional jumps to labels, the expansion of the `true branch' and `false branch' may differ in execution time.
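A minimal OCaml sketch of a costing follows, under one hypothetical expansion strategy in which a conditional pseudo-jump is turned into an inverted short jump over an \texttt{LJMP}. The tick counts and the small pseudoinstruction type are invented for illustration, and the bound proof required by the Matita version is omitted here.

(* A costing maps an assembly-level program counter to a pair of
   clock-tick counts: the expansion of a conditional jump to a label may
   take different times on its two branches. *)
type pseudo_instruction =
  | Instruction of string            (* a real MCS-51 instruction            *)
  | Jump of string                   (* unconditional pseudo-jump to a label *)
  | CondJump of string               (* conditional pseudo-jump to a label   *)

type costing = int -> int * int     (* ppc -> (ticks if taken, ticks if not) *)

(* A costing induced by the hypothetical expansion strategy above. *)
let costing_of (program : pseudo_instruction list) : costing =
  fun ppc ->
    match List.nth program ppc with
    | Instruction _ -> (1, 1)
    | Jump _        -> (2, 2)        (* e.g. expanded to a single SJMP *)
    | CondJump _    -> (4, 2)        (* e.g. JNZ over an LJMP          *)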