# Changeset 3555 for Papers/jar-assembler-2015

Ignore:
Timestamp:
Apr 24, 2015, 1:04:26 PM (6 years ago)
Message:

dominic, jaap taking another look at the paper

Location:
Papers/jar-assembler-2015
Files:
2 deleted
2 copied
1 moved

### Legend:

Unmodified
 r3554 \documentclass[a4paper]{article} \usepackage{a4wide} \usepackage{algpseudocode} \usepackage{alltt} \usepackage{subcaption} \usepackage{kpfonts} %\usepackage{kpfonts} \usepackage[T1]{fontenc} \usepackage[ttdefault=true]{AnonymousPro} %\usepackage[ttdefault=true]{AnonymousPro} \newcommand{\All}[1]{\forall{#1}.\;} All proofs are checked by the Matita theorem prover, and the work forms a major component of the Certified Complexity (CerCo') project, aiming to produce a verified concrete complexity-preserving compiler for a subset of the C programming language. Our optimisations complicate the proof of correctness of the assembler proper and therefore we employ the use of policies', separating out decisions of how a pseudoinstruction should be expanded from the expansion process itself, simplifying our proofs. Our optimisations complicate the proof of correctness of the assembler proper and therefore we employ the use of policies', separating out the decision of how a pseudoinstruction should be expanded from the expansion process itself, simplifying our proofs. Our proof strategy also tracks good' addresses, and only programs that use good addresses have their semantics preserved under assembly. This tracking offers increased flexibility over the traditional approach---where addresses are kept abstract and immutable---as we may experiment with some manipulations of machine addresses. \tableofcontents \blfootnote{Research supported by the CerCo project, within the Future and Emerging Technologies (FET) programme of the Seventh Framework Programme for Research of the European Commission, under FET-Open grant number 243881} \blfootnote{Research supported by the CerCo project, within the Future and Emerging Technologies (FET) programme of the Seventh Framework Programme for Research of the European Commission, under FET-Open grant number 243881.} \newpage \label{sect.introduction} We present an overview of work formalising and proving correct an optimising assembler for a typical embedded microcontroller. Though the work described herein is specialised to the 8-bit Intel MCS-51 series, we believe the design decisions and techniques developed for our proof of correctness are of general interest. % Everybody \label{subsect.matita} In this Subsection we provide an overview of Matita, the host system for all proofs reported in this paper. Matita is a theorem prover based on a variant of the Calculus of Coinductive Constructions~\cite{asperti:user:2007}. The system features a full spectrum of dependent types and (co)inductive families, a sophisticated system of uniform coercions, a tactic-driven proof construction engine~\cite{sacerdoti-coen:tinycals:2007}, and paramodulation based automation~\cite{asperti:higher-order:2007} all of which we exploit in the formalisation described herein. Matita's syntax is largely similar to the syntaxes of mainstream functional programming languages such as OCaml or Standard ML, and the type theory that Matita implements is very similar to that of Coq~\cite{coq:2004} and Agda~\cite{bove:brief:2009}. The system features a full spectrum of dependent types and (co)inductive families, a system of coercions, a tactic-driven proof construction engine~\cite{sacerdoti-coen:tinycals:2007}, and paramodulation based automation~\cite{asperti:higher-order:2007}, all of which we exploit in the formalisation described herein. Matita's syntax is similar to the syntaxes of mainstream functional programming languages such as OCaml or Standard ML. The type theory that Matita implements is broadly akin to that of Coq~\cite{coq:2004} and Agda~\cite{bove:brief:2009}. Nevertheless, we provide a brief explanation of the main syntactic and type-theoretic features of Matita that will be needed to follow the body of the paper: \begin{itemize} \item Non-recursive functions and definitions are introduced via the \texttt{definition} keyword whilst recursive functions are introduced with \texttt{let rec}. Non-recursive functions and definitions are introduced via the \texttt{definition} keyword. Recursive functions are introduced with \texttt{let rec}. Mutually recursive functions are separated via the \texttt{and} keyword. Matita checks that all recursive functions are terminating before being admitted. Matita's termination checker ensures that all recursive functions are terminating before being admitted to maintain soundness of the system's logic. \item Matita has an infinite hierarchy of type universes. \item Matita's type theory plays host to a rich and expressive higher-order logic. Constant \texttt{True} and \texttt{False} represent truth and falsity in \texttt{Prop} respectively. Constants \texttt{True} and \texttt{False} represent truth and falsity in \texttt{Prop} respectively. Two inductive families in \texttt{Prop} encode conjunction and disjunction---$\mathtt{P \wedge Q}$ and $\mathtt{P \vee Q}$ respectively. As is usual, implication and universal quantification are identified with the dependent function space ($\Pi$ types), whereas (constructive) existential quantification is encoded as a dependent sum (a $\Sigma$-type). We write $\All{x : \phi}\psi$ for the dependent function space, and abbreviate this as $\phi \rightarrow \psi$ when $x \not\in fv(\psi)$ as usual. \item Inductive and coinductive families are introduced via the \texttt{inductive} and \texttt{coinductive} keywords respectively, with named constructor declarations separated with a bar. We use $\langle M,N \rangle$ for the pairing of $M$ and $N$. \item Inductive and coinductive families are introduced via the \texttt{inductive} and \texttt{coinductive} keywords respectively, with named constructor declarations separated by a bar. Mutually inductive data family declarations are separated via \texttt{with}. In the following declaration In the following declaration: \begin{lstlisting} inductive I ($P_1$ : $\tau_1$) $\ldots$ ($P_n$ : $\tau_n$) : $\phi_1 \rightarrow \ldots \rightarrow \phi_m \rightarrow \phi$ := $\ldots$ \end{lstlisting} we call $P_i$ for $0 \leq i \leq n$ the \deffont{parameters} of \texttt{I} and $\phi_j$ for $0 \leq j \leq m$ the \deffont{indices} of \texttt{I}. Matita checks that constructors have strictly-positive types before admitting an inductive family. We call $P_i$ for $0 \leq i \leq n$ the \deffont{parameters} of \texttt{I} and $\phi_j$ for $0 \leq j \leq m$ the \deffont{indices} of \texttt{I}. Matita's positivity checker ensures that constructors have strictly-positive types before admitting an inductive family to maintain soundness of the system's logic. \item Records are introduced with the \texttt{record} keyword. | mk_R : nat -> R. \end{lstlisting} The declaration of a record \texttt{R} causes a \deffont{constructor} function \texttt{mk\_R} to be entered into the global context as a side effect. \deffont{Projections}, one for each of the record's fields, are also registered in the global context, and in the example record above \texttt{F1} of type $R \rightarrow nat$ is registered as a field projection. A record field's type may depend on fields declared earlier in the record. Record fields may be marked as coercions. Records may be decomposed with projections. Projections, one for each of field of a record, are registered in the global context. In the example record above, \texttt{F1} of type $R \rightarrow nat$ is registered as a field projection and $mk\_R$ of type $nat \rightarrow R$ is registered as a constructor. Record fields may also be marked as coercions. In the following example \begin{lstlisting} } \end{lstlisting} the field \texttt{Carrier} is declared to be a coercion with \texttt{:>}' with the effect that the field projection \texttt{Carrier} may be omitted where it could be successfully inferred by Matita. the field \texttt{Carrier} is declared to be a coercion with \texttt{:>}'.with the operational effect being that the field projection \texttt{Carrier} may be omitted where it could be successfully inferred by Matita. Field coercions facilitate the informal but common mathematical practice of intentionally confusing a structure with its underlying carrier set. \item Three consecutive dots, \texttt{$\ldots$}, denote multiple terms or types that have been omitted. \item Pattern matching is carried out with a \texttt{match} expression. We may sometimes fully annotate a \texttt{match} expression with its return type. Data may be decomposed by pattern matching with a \texttt{match} expression. We may fully annotate a \texttt{match} expression with its return type. This is especially useful when working with indexed families of types or with invariants, expressed as types, on functions. In the following \item Matita features a liberal system of coercions (distinct from the previously mentioned record field coercions). It is possible to define a uniform coercion $\lam{x}\langle x, ?\rangle$ from every type $T$ to the dependent product $\Sigma{x : T}. P x$ (here $\langle \_,\_ \rangle$ is Matita's sugared syntax for pairs). It is possible to define a uniform coercion $\lam{x}\langle x, ?\rangle$ from every type $T$ to the dependent product $\Sigma{x : T}. P x$. The coercion opens a proof obligation that asks the user to prove that $P$ holds for $x$. When a coercion is to be applied to a complex term (for example, a $\lambda$-abstraction, a local definition, or a case analysis), the system automatically propagates the coercion to the sub-terms. % DPM and CSC One of the central data types in our formalisation is the \texttt{BitVector}. \texttt{BitVector}s are fixed-length lists of bits (implemented as booleans) used to encode words and bytes: \begin{lstlisting} definition BitVector := $\lambda$n: nat. Vector bool n. \end{lstlisting} Arithmetic in our formalisation is carried out on \texttt{BitVector} values. An MCS-51 byte is eight bits wide and an MCS-51 word is a double byte, or sixteen bits wide. Some MCS-51 instructions require four, seven and eleven bit wide sub-bytes and sub-words respectively. We capture these notions as definitions: \begin{lstlisting} definition Nibble := BitVector 4. definition Word11 := BitVector 11. \end{lstlisting} A central design decision in our formalisation is the choice of how to encode memory. The MCS-51 has various memory spaces and registers that can be addressed with bytes and words. \begin{lstlisting} inductive BitVectorTrie (A: Type[0]): nat $\rightarrow$ Type[0] := \newpage \bibliography{boender-jar-2014} \bibliography{boender-jar-2015} \end{document}