# Changeset 1360

Ignore:
Timestamp:
Oct 12, 2011, 11:47:21 AM (9 years ago)
Message:

added more on use of dependent types, and also discussing the abstraction of the intermediate languages

File:
1 edited

### Legend:

Unmodified
 r1356 \label{sect.backend.intermediate.languages.matita} We now discuss the encoding of the compiler backend languages in the Calculus of Constructions proper. We pay particular heed to changes that we made from the O'Caml prototype. In particular, many aspects of the backend languages have been unified into a single joint' language. We have also made heavy use of dependent types to reduce spurious partiality' and to encode invariants. %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-% % SECTION.                                                                    % %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-% \subsection{Abstracting related languages} \label{subsect.abstracting.related.languages} The O'Caml compiler is written in the following manner. Each intermediate language has its own dedicated syntax, notions of internal function, and so on. Translations map syntaxes to syntaxes, and internal functions to internal functions explicitly. This is a perfectly valid way to write a compiler, where everything is made explicit, but writing a \emph{verified} compiler poses new challenges. In particular, we must look ahead to see how our choice of encodings will affect the size and complexity of the forthcoming proofs of correctness. We now discuss some abstractions, introduced in the Matita code, which we hope will make our proofs shorter, amongst other benefits. \paragraph{Shared code} Many features of individual backend intermediate languages are shared with other intermediate languages. For instance, RTLabs, RTL, ERTL and LTL are all graph based languages, where functions are represented as a graph of statements that form their bodies. Functions for adding statements to a graph, searching the graph, and so on, are remarkably similar across all languages, but are duplicated in the O'Caml code. %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-% % SECTION.                                                                    % \label{subsect.use.of.dependent.types} %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-% % SECTION.                                                                    % %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-% \subsection{Abstracting related languages} \label{subsect.abstracting.related.languages} We use dependent types in the backend for three reasons. First, we encode informal invariants, or uses of \texttt{assert false} in the O'Caml code, with dependent types, converting partial functions into total functions. There are numerous examples of this throughout the backend. For example, in the \texttt{RTLabs} to \texttt{RTL} transformation pass, many functions only make sense' when lists of registers passed to them as arguments conform to some specific length. For instance, the \texttt{translate\_negint} function, which translates a negative integer constant: \begin{lstlisting} definition translate_negint ≝ $\lambda$globals: list ident. $\lambda$destrs: list register. $\lambda$srcrs: list register. $\lambda$start_lbl: label. $\lambda$dest_lbl: label. $\lambda$def: rtl_internal_function globals. $\lambda$prf: |destrs| = |srcrs|. (* assert in caml code *) ... \end{lstlisting} The last argument to the function, \texttt{prf}, is a proof that the lengths of the lists of source and destination registers are the same. This was an assertion in the O'Caml code. Secondly, we make use of dependent types to make the Matita code easier to read, and eventually the proofs of correctness for the compiler easier to write. For instance, many intermediate languages in the backend of the compiler, from RTLabs to LTL, are graph based languages. Here, function definitions consist of a graph (i.e. a map from labels to statements) and a pair of labels denoting the entry and exit points of this graph. Practically, we would always like to ensure that the entry and exit labels are present in the statement graph. We ensure that this is so with a dependent sum type in the \texttt{joint\_internal\_function} record, which all graph based languages specialise to obtain their own internal function representation: \begin{lstlisting} record joint_internal_function (globals: list ident) (p: params globals): Type[0] := { ... joint_if_code     : codeT $\ldots$ p; joint_if_entry    : $\Sigma$l: label. lookup $\ldots$ joint_if_code l $\neq$ None $\ldots$; ... }. \end{lstlisting} Here, \texttt{codeT} is a parameterised type representing the structure' of the function's body (a graph in graph based languages, and a list in the linearised LIN language). Specifically, the \texttt{joint\_if\_entry} is a dependent pair consisting of a label and a proof that the label in question is a vertex in the function's graph. A similar device exists for the exit label. Finally, we make use of dependent types for another reason: experimentation. Namely, CompCert makes little use of dependent types to encode invariants. In contrast, we wish to make as much use of dependent types as possible, both to experiment with different ways of encoding compilers in a proof assistant, but also as a way of `stress testing' Matita's support for dependent types. %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%