Oct 12, 2011, 11:47:21 AM (10 years ago)

added more on use of dependent types, and also discussing the abstraction of the intermediate languages

1 edited


  • Deliverables/D4.2-4.3/reports/D4-2.tex

    r1356 r1360  
     152We now discuss the encoding of the compiler backend languages in the Calculus of Constructions proper.
     153We pay particular heed to changes that we made from the O'Caml prototype.
     154In particular, many aspects of the backend languages have been unified into a single `joint' language.
     155We have also made heavy use of dependent types to reduce `spurious partiality' and to encode invariants.
     158% SECTION.                                                                    %
     160\subsection{Abstracting related languages}
     163The O'Caml compiler is written in the following manner.
     164Each intermediate language has its own dedicated syntax, notions of internal function, and so on.
     165Translations map syntaxes to syntaxes, and internal functions to internal functions explicitly.
     167This is a perfectly valid way to write a compiler, where everything is made explicit, but writing a \emph{verified} compiler poses new challenges.
     168In particular, we must look ahead to see how our choice of encodings will affect the size and complexity of the forthcoming proofs of correctness.
     169We now discuss some abstractions, introduced in the Matita code, which we hope will make our proofs shorter, amongst other benefits.
     171\paragraph{Shared code}
     172Many features of individual backend intermediate languages are shared with other intermediate languages.
     173For instance, RTLabs, RTL, ERTL and LTL are all graph based languages, where functions are represented as a graph of statements that form their bodies.
     174Functions for adding statements to a graph, searching the graph, and so on, are remarkably similar across all languages, but are duplicated in the O'Caml code.
    153177% SECTION.                                                                    %
    158 %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
    159 % SECTION.                                                                    %
    160 %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%
    161 \subsection{Abstracting related languages}
    162 \label{subsect.abstracting.related.languages}
     182We use dependent types in the backend for three reasons.
     184First, we encode informal invariants, or uses of \texttt{assert false} in the O'Caml code, with dependent types, converting partial functions into total functions.
     185There are numerous examples of this throughout the backend.
     186For example, in the \texttt{RTLabs} to \texttt{RTL} transformation pass, many functions only `make sense' when lists of registers passed to them as arguments conform to some specific length.
     187For instance, the \texttt{translate\_negint} function, which translates a negative integer constant:
     189definition translate_negint ≝
     190  $\lambda$globals: list ident.
     191  $\lambda$destrs: list register.
     192  $\lambda$srcrs: list register.
     193  $\lambda$start_lbl: label.
     194  $\lambda$dest_lbl: label.
     195  $\lambda$def: rtl_internal_function globals.
     196  $\lambda$prf: |destrs| = |srcrs|. (* assert in caml code *)
     197    ...
     199The last argument to the function, \texttt{prf}, is a proof that the lengths of the lists of source and destination registers are the same.
     200This was an assertion in the O'Caml code.
     202Secondly, we make use of dependent types to make the Matita code easier to read, and eventually the proofs of correctness for the compiler easier to write.
     203For instance, many intermediate languages in the backend of the compiler, from RTLabs to LTL, are graph based languages.
     204Here, function definitions consist of a graph (i.e. a map from labels to statements) and a pair of labels denoting the entry and exit points of this graph.
     205Practically, we would always like to ensure that the entry and exit labels are present in the statement graph.
     206We ensure that this is so with a dependent sum type in the \texttt{joint\_internal\_function} record, which all graph based languages specialise to obtain their own internal function representation:
     208record joint_internal_function (globals: list ident) (p: params globals): Type[0] :=
     210  ...
     211  joint_if_code     : codeT $\ldots$ p;
     212  joint_if_entry    : $\Sigma$l: label. lookup $\ldots$ joint_if_code l $\neq$ None $\ldots$;
     213  ...
     216Here, \texttt{codeT} is a parameterised type representing the `structure' of the function's body (a graph in graph based languages, and a list in the linearised LIN language).
     217Specifically, the \texttt{joint\_if\_entry} is a dependent pair consisting of a label and a proof that the label in question is a vertex in the function's graph.
     218A similar device exists for the exit label.
     220Finally, we make use of dependent types for another reason: experimentation.
     221Namely, CompCert makes little use of dependent types to encode invariants.
     222In contrast, we wish to make as much use of dependent types as possible, both to experiment with different ways of encoding compilers in a proof assistant, but also as a way of `stress testing' Matita's support for dependent types.
Note: See TracChangeset for help on using the changeset viewer.