# Changeset 2515 for Papers/polymorphic-variants-2012/polymorphic-variants.tex

Ignore:
Timestamp:
Dec 2, 2012, 5:20:39 PM (7 years ago)
Message:

...

File:
1 edited

### Legend:

Unmodified
 r2427 \section{Bounded polymorphic variants via dependent types} Requirements (i.e. O(1) pattern-matching, natural extracted code, etc.) The topic of this paper, and this section in particular, is to present an \emph{encoding} of a subclass of polymorphic variants into dependent type theory. The final aim is to provide a solution to the expression problem that is implementable in user space into interactive theorem provers based on dependent type theories, in particular Coq, Agda and Matita. Moreover, we are only interested into an encoding that satisfies a number of constraints listed below. Essentially, the constraints ask that code automatically extracted from the system should be compiled efficiently without any change to the extraction machine. The encoding has been tested in the interactive theorem prover Matita, but we believe that it should be easily ported to the other cited systems. \subsection{Code extraction} Since our requirements have code extraction in mind, we briefly recall here what code extraction is in our context. Code extraction, as implemented in Coq, Matita and other systems, is the procedure that, given a proof term, extracts a program in a functional programming language that implements the computational content of the proof term. In particular, all proof parts that only establish correctness of the answer, totality of functions, etc. are to be erased. In particular, given a precondition $P$ and a postcondition $Q$, from a constructive proof of a $\forall x:A.P(x) \to \exists y:B. Q(x,y)$ statement one extracts a (partial) function $f$ from the encoding of $A$ to the encoding of $B$ that satisfies $\forall x. P(x) \to Q(x,f(x))$. Code extraction, as implemented by the above systems, does not attempt any optimization of the representation of data types. The only modification allowed is that parts of the data that are propositional and thus bare no content (being isomorphic to the unit type) are simplified to the unit type and possibly completely erased using the categorical properties of unit types ($1 * T \symeq T$, $1 \to T \symeq T$, etc.). For more information, one can read~\cite{berardi,christine-mohring,letouzey}. Thus, if a polymorphic variant is encoded in type theory as an n-ary sum of n-ary products, these being inefficiently represented using binary sums and products, the same holds for the extracted code. \subsection{The requirements} We list the following requirements. \begin{enumerate} \item The representation of a polymorphic variant type in the extracted code should be as efficient, in memory and time, as the isomorphic inductive type. Efficiency should here be considered up to constants, and not asymptotic. \item Polymorphic types in the subtyping relation should be encoded using data types that, once extracted, are in the subtyping relation too. Before code extraction, instead, it is allowed to use explicit non identity functions to coerce inhabitants of the sub-type into the super-type. What matters is that the computational content of these coercions should be the identity. \end{enumerate} The first requirements prevents the encoding of polymorphic variants using binary sums and products. For example, a constructor of an inductive type with four constructors requires one memory cell, but the same constructor for a binary sum of binary sums requires two memory cells, and twice the time to inhabit the data type. The solution is to force the code extracted from the polymorphic variant $\tau$ to be an inductive type. In turn this forces the encoding of polymorphic variants to be based on inductive types. The second requirement is in contrast with the possibility to extend a polymorphic variant type: when a type $\tau$ is extended to $\tau + a:T$, an inhabitant of the extraction of $\tau$ should already be an inhabitant of the extraction of $\tau + a:T$. This is possible if polymorphic variants were extracted to polymorphic variants in the target language, but this is not actually the case in any of the considered systems (e.g. Coq or Matita). Indeed, because of the first requirement, $\tau$ and $\tau + a:T$ are to be extracted to two inductive types. The only way to obtain subtyping is to already use the same representation: we need to anticipate or \emph{bound} the extension. \subsection{Simulation (reduction + type checking)} \subsection{Examples}