\documentclass[smallextended]{svjour3}
\smartqed
\usepackage[english]{babel}
\usepackage[colorlinks]{hyperref}
\usepackage{listings}
\usepackage{microtype}
\lstset{basicstyle=\footnotesize\tt,columns=flexible,breaklines=false,
keywordstyle=\color{red}\bfseries,
keywordstyle=[2]\color{blue},
commentstyle=\color{green},
stringstyle=\color{blue},
showspaces=false,showstringspaces=false,
xleftmargin=1em}
\bibliographystyle{spphys}
\author{Dominic P. Mulligan \and Claudio Sacerdoti Coen}
\title{Polymorphic variants in dependent type theory\thanks{The project CerCo acknowledges the financial support of the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under FET-Open grant number: 243881.}}
\institute{
Dominic P. Mulligan \at
Computer Laboratory,\\
University of Cambridge.
\email{dominic.p.mulligan@gmail.com} \and
Claudio Sacerdoti Coen \at
Dipartimento di Scienze dell'Informazione,\\
Universit\`a di Bologna.
\email{sacerdot@cs.unibo.it}
}
\begin{document}
\maketitle
\begin{abstract}
Big long abstract introducing the work
\keywords{Polymorphic variants \and dependent type theory \and Matita theorem prover}
\end{abstract}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Section
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}
\label{sect.introduction}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Section
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Polymorphic variants}
\label{sect.polymorphic.variants}
In this section we attempt to provide a self-contained \emph{pr\'ecis} of polymorphic variants for the unfamiliar reader.
Readers who wish for a more complete introduction to the subject are referred to Garrigue's original publications~\cite{garrigue:programming:1998,garrigue:code:2000}.
Most mainstream functional programming languages, such as OCaml and Haskell, provide mechanisms for defining new inductive types through algebraic data type definitions.
Each algebraic data type may be described as a sum-of-products: the programmer provides a fixed number of distinct \emph{constructors}, with each constructor expecting a (potentially empty) product of arguments.
Values of a given inductive data type are built using a data type's constructors.
Quotidian data structures---such as lists, trees, heaps, zippers, and so forth---can all be introduced using this now familiar mechanism.
Having built data from various combinations of constructors, to complete the picture we now need some facility for picking said data apart, in order to build functions that operate over that data.
Functional languages almost uniformly employ \emph{pattern matching} for this task.
Given any inhabitant of an algebraic data type, by the aforementioned sum-of-products property, we know that it must consist of some constructor applied to various arguments.
Using pattern matching we can therefore deconstruct algebraic data by performing a case analysis on the constructors of a given type.
The combination of algebraic data types and pattern matching is powerful, and is arguably the main branching mechanism for most functional programming languages.
Furthermore, using pattern matching it is easy to define new functions that consume algebraic data---the set of operations that can be defined for any given algebraic type is practically without bound.
Unfortunately, in practical programming, we often want to expand a previously defined algebraic data type, adding more constructors.
When it comes to extending algebraic data types with new constructors in this way, we hit a problem: these types are `closed'.
In order to circumvent this restriction, we must introduce a new algebraic type with the additional constructor, lifting the old type---and any functions defined over it---into this type.
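To make the cost of this lifting concrete, the following OCaml sketch (our own illustration; the type and function names are hypothetical) extends a closed type of arithmetic terms with multiplication:
\begin{lstlisting}
(* A closed algebraic data type and a function over it. *)
type term = Lit of int | Add of term * term

let rec evaluate = function
  | Lit i      -> i
  | Add (l, r) -> evaluate l + evaluate r

(* Extension requires a fresh type embedding the old one... *)
type ext_term = Base of term | Mul of ext_term * ext_term

(* ...and a lifted version of every existing function. *)
let rec evaluate_ext = function
  | Base t     -> evaluate t
  | Mul (l, r) -> evaluate_ext l * evaluate_ext r
\end{lstlisting}
Observe that the embedding is imperfect: a \texttt{Base} subterm can never contain a \texttt{Mul} node, so old and new terms do not mix freely.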
\begin{figure}[ht]
\begin{minipage}[b]{0.45\linewidth}
\begin{lstlisting}
data Term
= Lit Int
| Add Term Term
| Mul Term Term
evaluate :: Term -> Int
evaluate (Lit i) = i
evaluate (Add l r) = evaluate l + evaluate r
evaluate (Mul l r) = evaluate l * evaluate r
\end{lstlisting}
\end{minipage}
\hspace{0.5cm}
\begin{minipage}[b]{0.45\linewidth}
\begin{lstlisting}
interface Term {
  int evaluate();
}
class Lit implements Term {
  private final int i;
  Lit(int i) { this.i = i; }
  public int evaluate() { return i; }
}
class Add implements Term {
  private final Term l, r;
  Add(Term l, Term r) {
    this.l = l; this.r = r;
  }
  public int evaluate() {
    return l.evaluate() + r.evaluate();
  }
}
class Mul implements Term {
  private final Term l, r;
  Mul(Term l, Term r) {
    this.l = l; this.r = r;
  }
  public int evaluate() {
    return l.evaluate() * r.evaluate();
  }
}
\end{lstlisting}
\end{minipage}
\caption{A simple language of integer arithmetic embedded as an algebraic data type and as a class hierarchy.}
\label{fig.pattern-matching.vs.oop}
\end{figure}
We can compare and contrast functional programming languages' use of algebraic data and pattern matching with the approach taken by object-oriented languages (see Figure~\ref{fig.pattern-matching.vs.oop} for a concrete example).
In mainstream object-oriented languages such as Java, algebraic data types correspond to interfaces, or to some base class.
Constructors correspond to classes implementing this interface; branching by pattern matching is emulated using the language's dynamic dispatch mechanism.
The base interface of the object hierarchy specifies the permitted operations defined for the type.
As every operation must be declared in this base interface, it is hard to enlarge the set of operations defined over a given type without altering the entire class hierarchy: if the interface changes, so must every class implementing it.
However, it is easy to extend the hierarchy with a new case---corresponding to the introduction of a new constructor in the functional world---by merely adding another class implementing the interface.
The functional and object-oriented approaches therefore make opposite extensions easy: new operations in the former, new cases in the latter.
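By contrast, polymorphic variants---as implemented in OCaml---require no prior type declaration: tags are used directly, and a function's inferred type records exactly the tags it handles. The following sketch (ours, not drawn verbatim from Garrigue's papers) revisits the arithmetic language:
\begin{lstlisting}
(* No type declaration is needed: tags are used directly. *)
let rec evaluate = function
  | `Lit i      -> i
  | `Add (l, r) -> evaluate l + evaluate r
  | `Mul (l, r) -> evaluate l * evaluate r
(* Inferred type:
   ([< `Add of 'a * 'a | `Lit of int
     | `Mul of 'a * 'a ] as 'a) -> int         *)

let seven = evaluate (`Add (`Lit 1, `Mul (`Lit 2, `Lit 3)))
\end{lstlisting}
The bound \texttt{[<} $\ldots$\texttt{]} in the inferred type says that \texttt{evaluate} accepts any value built from \emph{at most} these three tags: values using fewer tags are accepted without any explicit lifting.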
\begin{itemize}
\item General introduction, motivations
\item Bounded vs not-bounded.
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Section
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Matita}
\label{subsect.matita}
Matita~\cite{asperti:user:2007} is a dependently-typed proof assistant developed in Bologna, implementing the Calculus of (Co)inductive Constructions, a type theory similar to that of Coq.
Throughout this paper, all running examples are provided in Matita's proof script vernacular.
This vernacular, similar to the syntax of OCaml or Coq, should be mostly straightforward.
One possible source of confusion is our use of $?$ and $\ldots$, which denote, respectively, a single term and a sequence of terms to be inferred automatically by unification.
Any terms that cannot be inferred by unification from the surrounding context are left for the user to provide as proof obligations.
\subsection{Subtyping as instantiation vs subtyping as safe static cast}
\subsection{Syntax \& type checking rules}
The ones of Garrigue, plus casts; but also for the bounded case?
Casts support both styles of subtyping.
\subsection{Examples}
The weird function types that only work in subtyping as instantiation
\subsection{Solution to the expression problem}
Our running example in pseudo-OCaml syntax
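For orientation, the standard polymorphic-variant attack on the expression problem---following Garrigue's code-reuse technique~\cite{garrigue:code:2000}, with hypothetical names---leaves the recursion open so that both new cases and new operations can be added later:
\begin{lstlisting}
type 'a base = [ `Lit of int | `Add of 'a * 'a ]

(* Recursion is left open via the [eval] parameter. *)
let eval_base eval : 'a base -> int = function
  | `Lit i      -> i
  | `Add (l, r) -> eval l + eval r

(* New case: old cases are dispatched to the old code. *)
type 'a ext = [ 'a base | `Mul of 'a * 'a ]

let eval_ext eval : 'a ext -> int = function
  | #base as b  -> eval_base eval b
  | `Mul (l, r) -> eval l * eval r

(* Close the recursion at each level of extension. *)
let rec eval e = eval_ext eval e
\end{lstlisting}
A new operation is added analogously, by writing another open-recursive function over the same tags; neither extension requires modifying existing code.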
\section{Bounded polymorphic variants via dependent types}
Requirements (e.g. O(1) pattern-matching, natural extracted code, etc.)
\subsection{Simulation (reduction + type checking)}
\subsection{Examples}
The weird function types redone
\subsection{Subtyping as instantiation vs subtyping as safe static cast}
Here we show/discuss how our approach supports both styles at once.
\subsection{Solution to the expression problem, I}
Using subtyping as cast, the file I have produced
\subsection{Solution to the expression problem, II}
Using subtyping as instantiation, comparisons, pros vs cons
\subsection{Negative encoding (??)}
The negative encoding and application to the expression problem
\subsection{Other encodings (??)}
Hints to other possible encodings
\section{Extensible records (??)}
\section{Comparison to related work and alternatives}
\begin{itemize}
\item Disjoint unions: drawbacks
\item Encoding the unbounded case: drawbacks
\end{itemize}
\section{Appendix: interface of library functions used to implement everything}
\bibliography{polymorphic-variants}
\end{document}