Index: /Papers/jar-cerco-2017/cerco.tex
===================================================================
--- /Papers/jar-cerco-2017/cerco.tex (revision 3656)
+++ /Papers/jar-cerco-2017/cerco.tex (revision 3657)
@@ -70,4 +70,10 @@
\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}
+
+\newcommand{\cerco}{CerCo}
+\newcommand{\ocaml}{OCaml}
+\newcommand{\clight}{Clight}
+\newcommand{\matita}{Matita}
+\newcommand{\sdcc}{\texttt{sdcc}}
\title{CerCo: Certified Complexity\thanks{The project CerCo acknowledges the
Index: /Papers/jar-cerco-2017/lst-grafite.tex
===================================================================
--- /Papers/jar-cerco-2017/lst-grafite.tex (revision 3656)
+++ /Papers/jar-cerco-2017/lst-grafite.tex (revision 3657)
@@ -8,4 +8,5 @@
notation, interpretation, lapply, repeat, try, as, clear, in, change, whd, exists},
morekeywords={[2]include},
+keywordstyle=\color{blue}\bfseries,
%emph={[1]Type, Prop, nat, real}, emphstyle={[1]\textit},
% literate=
@@ -150,7 +151,7 @@
captionpos=b,
mathescape=true,
-keywordstyle=\bfseries,
-keywordstyle=[2]\bfseries,
-keywordstyle=[3]\bfseries,
+%keywordstyle=\bfseries,
+%keywordstyle=[2]\bfseries,
+%keywordstyle=[3]\bfseries,
%backgroundcolor=\color{gray},
frame=tblr,
Index: /Papers/jar-cerco-2017/proof.tex
===================================================================
--- /Papers/jar-cerco-2017/proof.tex (revision 3656)
+++ /Papers/jar-cerco-2017/proof.tex (revision 3657)
@@ -8,4 +8,1036 @@
\section{Compiler proof}
\label{sect.compiler.proof}
+
+\section{Introduction}
+
+The \cerco{} compiler produces a version of the source code containing
+annotations describing the timing behaviour of the object code, as
+well as the object code itself. It compiles C code, targeting
+microcontrollers implementing the Intel 8051 architecture. There are
+two versions: first, an initial prototype was implemented in
+\ocaml{}~\cite{d2.2}, then a version was formalised in the \matita{}
+proof assistant~\cite{d3.2,d4.2} and extracted to \ocaml{} code to
+produce an executable compiler. In this document we present results
+from Deliverable 3.4, the formalised proofs in \matita{} about the
+front-end of the latter version of the compiler (culminating in the
+\lstinline'front_end_correct' lemma), and describe how that fits
+into the verification of the whole compiler.
+
+A key part of this work was to layer the \emph{intensional} correctness
+results that show that the costs produced are correct on top of the
+proofs about the compiled code's \emph{extensional} behaviour (that is, the
+functional correctness of the compiler). Unfortunately, the ambitious
+goal of completely verifying the entire compiler was not feasible
+within the time available, but thanks to this separation of
+extensional and intensional proofs we are able to axiomatise some extensional
+simulation results which are similar to those in other compiler verification
+projects and concentrate on the novel intensional proofs. We were
+also able to add stack space costs to obtain a stronger result. The
+proofs were made more tractable by introducing compile-time checks for
+the `sound and precise' cost labelling properties rather than proving
+that they are preserved throughout.
+
+The overall statement of correctness says that the annotated program has the
+same behaviour as the input, and that for any suitably well-structured part of
+the execution (which we call \emph{measurable}), the object code will execute
+the same behaviour taking precisely the time given by the cost annotations in
+the annotated source program.
+
+In the next section we recall the structure of the compiler and make the overall
+statement more precise. Following that, in Section~\ref{sec:fegoals} we
+describe the statements we need to prove about the intermediate \textsf{RTLabs}
+programs for the back-end proofs.
+Section~\ref{sec:inputtolabelling} covers the compiler passes which produce the
+annotated source program and Section~\ref{sec:measurablelifting} the rest
+of the transformations in the front-end. Then the compile-time checks
+for good cost labelling are detailed in Section~\ref{sec:costchecks}
+and the proofs that the structured traces required by the back-end
+exist are discussed in Section~\ref{sec:structuredtrace}.
+
+\section{The compiler and its correctness statement}
+
+The uncertified prototype \ocaml{} \cerco{} compiler was originally described
+in Deliverables 2.1 and 2.2. Its design was replicated in the formal
+\matita{} code, which was presented in Deliverables 3.2 and 4.2, for
+the front-end and back-end respectively.
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=0.5\linewidth]{compiler-plain.pdf}
+\end{center}
+\caption{Languages in the \cerco{} compiler}
+\label{fig:compilerlangs}
+\end{figure}
+
+The compiler uses a number of intermediate languages, as outlined in the
+middle two lines of Figure~\ref{fig:compilerlangs}. The upper line
+represents the front-end of the compiler, and the lower the back-end,
+finishing with Intel 8051 binary code. Not all of the front-end compiler passes
+introduce a new language, and Figure~\ref{fig:summary} presents a
+list of every pass involved.
+
+\begin{figure}
+\begin{center}
+\begin{minipage}{.8\linewidth}
+\begin{tabbing}
+\quad \= $\downarrow$ \quad \= \kill
+\textsf{C} (unformalised)\\
+\> $\downarrow$ \> CIL parser (unformalised \ocaml)\\
+\textsf{Clight}\\
+%\> $\downarrow$ \> add runtime functions\\
+\> $\downarrow$ \> \lstinline[language=C]'switch' removal\\
+\> $\downarrow$ \> labelling\\
+\> $\downarrow$ \> cast removal\\
+\> $\downarrow$ \> stack variable allocation and control structure
+ simplification\\
+\textsf{Cminor}\\
+%\> $\downarrow$ \> generate global variable initialization code\\
+\> $\downarrow$ \> transform to RTL graph\\
+\textsf{RTLabs}\\
+\> $\downarrow$ \> check cost labelled properties of RTL graph\\
+\> $\downarrow$ \> start of target specific back-end\\
+\>\quad \vdots
+\end{tabbing}
+\end{minipage}
+\end{center}
+\caption{Front-end languages and compiler passes}
+\label{fig:summary}
+\end{figure}
+
+\label{page:switchintro}
+The annotated source code is produced by the cost labelling phase.
+Note that there is a pass to replace C \lstinline[language=C]'switch'
+statements before labelling --- we need to remove them because the
+simple form of labelling used in the formalised compiler is not quite
+capable of capturing their execution time costs, largely due to C's
+`fall-through' behaviour where execution from one branch continues in
+the next unless there is an explicit \lstinline[language=C]'break'.
+
+The cast removal phase which follows cost labelling simplifies
+expressions to prevent unnecessary arithmetic promotion, which is
+specified by the C standard but costly for an 8-bit target. The
+transformation to \textsf{Cminor} and subsequently \textsf{RTLabs}
+bear considerable resemblance to some passes of the CompCert
+compiler~\cite{Blazy-Leroy-Clight-09,Leroy-backend}, although we use a simpler \textsf{Cminor} where
+all loops use \lstinline[language=C]'goto' statements, and the
+\textsf{RTLabs} language retains a target-independent flavour. The
+back-end takes \textsf{RTLabs} code as input.
+
+The whole compilation function returns the following information on success:
+\begin{lstlisting}[language=Grafite]
+ record compiler_output : Type[0] :=
+ { c_labelled_object_code: labelled_object_code
+ ; c_stack_cost: stack_cost_model
+ ; c_max_stack: nat
+ ; c_init_costlabel: costlabel
+ ; c_labelled_clight: clight_program
+ ; c_clight_cost_map: clight_cost_map
+ }.
+\end{lstlisting}
+It consists of annotated 8051 object code, a mapping from function
+identifiers to the function's stack space usage, the space available for the
+stack after global variable allocation, a cost label covering the
+execution time for the initialisation of global variables and the call
+to the \lstinline[language=C]'main' function, the annotated source
+code, and finally a mapping from cost labels to actual execution time
+costs.
+
+An \ocaml{} pretty printer is used to provide a concrete version of
+the output code and annotated source code. In the case of the
+annotated source code, it also inserts the actual costs alongside the
+cost labels, and optionally adds a global cost variable and
+instrumentation to support further reasoning in external tools such as
+Frama-C.
+
+\subsection{Revisions to the prototype compiler}
+
+Our focus on intensional properties prompted us to consider whether we
+could incorporate stack space into the costs presented to the user.
+We only allocate one fixed-size frame per function, so modelling this
+was relatively simple. It is the only form of dynamic memory
+allocation provided by the compiler, so we were able to strengthen the
+statement of the goal to guarantee successful execution whenever the
+stack space obeys the \lstinline'c_max_stack' bound calculated by
+subtracting the global variable requirements from the total memory
+available.
+
+The cost labelling checks at the end of Figure~\ref{fig:summary} have been
+introduced to reduce the proof burden, and are described in
+Section~\ref{sec:costchecks}.
+
+The use of dependent types to capture simple intermediate language
+invariants makes every front-end pass a total function, except
+\textsf{Clight} to \textsf{Cminor} and the cost checks. Hence various
+well-formedness and type safety checks are performed only once between
+\textsf{Clight} and \textsf{Cminor}, and the invariants rule out any
+difficulties in the later stages. With the benefit of hindsight we
+would have included an initial checking phase to produce a
+`well-formed' variant of \textsf{Clight}, conjecturing that this would
+simplify various parts of the proofs for the \textsf{Clight} stages
+which deal with potentially ill-formed code.
+
+Following D2.2, we previously generated code for global variable
+initialisation in \textsf{Cminor}, for which we reserved a cost label
+to represent the execution time for initialisation. However, the
+back-end must also add an initial call to the main function, whose
+cost must also be accounted for, so we decided to move the
+initialisation code to the back-end and merge the costs.
+
+\subsection{Main correctness statement}
+
+Informally, our main intensional result links the time difference in a source
+code execution to the time difference in the object code, expressing the time
+for the source by summing the values for the cost labels in the trace, and the
+time for the target by a clock built in to the 8051 executable semantics.
+
+The availability of precise timing information for 8051
+implementations and the design of the compiler allow it to give exact
+time costs in terms of processor cycles, not just upper bounds.
+However, these exact results are only available if the subtrace we
+measure starts and ends at suitable points. In particular, pure
+computation with no observable effects may be reordered and moved past
+cost labels, so we cannot measure time between arbitrary statements in
+the program.
+
+There is also a constraint on the subtraces that we
+measure due to the requirements of the correctness proof for the
+object code timing analysis. To be sure that the timings are assigned
+to the correct cost label, we need to know that each return from a
+function call goes to the correct return address. It is difficult
+to observe this property locally in the object code because it relies
+on much earlier stages in the compiler. To convey this information to
+the timing analysis extra structure is imposed on the subtraces, which
+is described in Section~\ref{sec:fegoals}.
+
+% Regarding the footnote, would there even be much point?
+% TODO: this might be quite easy to add ('just' subtract the
+% measurable subtrace from the second label to the end). Could also
+% measure other traces in this manner.
+These restrictions are reflected in the subtraces that we give timing
+guarantees on; they must start at a cost label and end at the return
+of the enclosing function of the cost label\footnote{We expect that
+ this would generalise to more general subtraces by subtracting costs
+ for unwanted measurable suffixes of a measurable subtrace.}. A
+typical example of such a subtrace is the execution of an entire
+function from the cost label at the start of the function until it
+returns. We call any such subtrace \emph{measurable} if it (and
+the prefix of the trace from the start to the subtrace) can also be
+executed within the available stack space.
+
+Now we can give the main intensional statement for the compiler.
+Given a \emph{measurable} subtrace for a labelled \textsf{Clight}
+program, there is a subtrace of the 8051 object code program where the
+time differences match. Moreover, \emph{observable} parts of the
+trace also match --- these are the appearance of cost labels and
+function calls and returns.
+
+
+
+More formally, the definition of this statement in \matita{} is
+\begin{lstlisting}[language=Grafite]
+definition simulates :=
+ $\lambda$p: compiler_output.
+ let initial_status := initialise_status $...$ (cm (c_labelled_object_code $...$ p)) in
+ $\forall$m1,m2.
+ measurable Clight_pcs (c_labelled_clight $...$ p) m1 m2
+ (stack_sizes (c_stack_cost $...$ p)) (c_max_stack $...$ p) $\rightarrow$
+ $\forall$c1,c2.
+ clock_after Clight_pcs (c_labelled_clight $...$ p) m1 (c_clight_cost_map $...$ p) = OK $...$ c1 $\rightarrow$
+ clock_after Clight_pcs (c_labelled_clight $...$ p) (m1+m2) (c_clight_cost_map $...$ p) = OK $...$ c2 $\rightarrow$
+ $\exists$n1,n2.
+ observables Clight_pcs (c_labelled_clight $...$ p) m1 m2 =
+ observables (OC_preclassified_system (c_labelled_object_code $...$ p))
+ (c_labelled_object_code $...$ p) n1 n2
+ $\wedge$
+ clock ?? (execute (n1+n2) ? initial_status) =
+ clock ?? (execute n1 ? initial_status) + (c2-c1).
+\end{lstlisting}
+where the \lstinline'measurable', \lstinline'clock_after' and
+\lstinline'observables' definitions are generic definitions for multiple
+languages; in this case the \lstinline'Clight_pcs' record applies them
+to \textsf{Clight} programs.
+
+There is a second part to the statement, which says that the initial
+processing of the input program to produce the cost labelled version
+does not affect the semantics of the program:
+% Yes, I'm paraphrasing the result a tiny bit to remove the observe non-function
+\begin{lstlisting}[language=Grafite]
+ $\forall$input_program,output.
+ compile input_program = return output $\rightarrow$
+ not_wrong $...$ (exec_inf $...$ clight_fullexec input_program) $\rightarrow$
+ sim_with_labels
+ (exec_inf $...$ clight_fullexec input_program)
+ (exec_inf $...$ clight_fullexec (c_labelled_clight $...$ output))
+\end{lstlisting}
+That is, any successful compilation produces a labelled program that
+has identical behaviour to the original, so long as there is no
+`undefined behaviour'.
+
+Note that this statement provides full functional correctness, including
+preservation of (non-)termination. The intensional result above does
+not do this directly --- it does not guarantee the same result or same
+termination. There are two mitigating factors, however: first, to
+prove the intensional property you need local simulation results --- these
+can be pieced together to form full behavioural equivalence; only time
+constraints have prevented us from doing so. Second, if we wish to
+confirm a result, termination, or non-termination we could add an
+observable witness, such as a function that is only called if the
+correct result is given. The intensional result guarantees that the
+observable witness is preserved, so the program must behave correctly.
+
+These two results are combined in the \lstinline'correct'
+theorem in the file \lstinline'correctness.ma'.
+
+\section{Correctness statement for the front-end}
+\label{sec:fegoals}
+
+The essential parts of the intensional proof were outlined during work
+on a toy compiler in Task
+2.1~\cite{d2.1,springerlink:10.1007/978-3-642-32469-7_3}. These are
+\begin{enumerate}
+\item functional correctness, in particular preserving the trace of
+ cost labels,
+\item the \emph{soundness} and \emph{precision} of the cost labelling
+ on the object code, and
+\item the timing analysis on the object code produces a correct
+ mapping from cost labels to time.
+\end{enumerate}
+
+However, that toy development did not include function calls. For the
+full \cerco{} compiler we also need to maintain the invariant that
+functions return to the correct program location in the caller, as we
+mentioned in the previous section. During work on the back-end timing
+analysis (described in more detail in the companion deliverable, D4.4)
+the notion of a \emph{structured trace} was developed to enforce this
+return property, and also most of the cost labelling properties too.
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=0.5\linewidth]{compiler.pdf}
+\end{center}
+\caption{The compiler and proof outline}
+\label{fig:compiler}
+\end{figure}
+
+Jointly, we generalised the structured traces to apply to any of the
+intermediate languages which have some notion of a program counter. This means
+that they are introduced part way through the compiler, see
+Figure~\ref{fig:compiler}. Proving that a structured trace can be
+constructed at \textsf{RTLabs} has several virtues:
+\begin{itemize}
+\item This is the first language where every operation has its own
+ unique, easily addressable, statement.
+\item Function calls and returns are still handled implicitly in the
+ language and so the structural properties are ensured by the
+ semantics.
+\item Many of the back-end languages from \textsf{RTL} onwards share a common
+ core set of definitions, and using structured traces throughout
+ increases this uniformity.
+\end{itemize}
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=0.6\linewidth]{strtraces.pdf}
+\end{center}
+\caption{Nesting of functions in structured traces}
+\label{fig:strtrace}
+\end{figure}
+A structured trace is a mutually inductive data type which
+contains the steps from a normal program trace, but arranged into a
+nested structure which groups entire function calls together and
+aggregates individual steps between cost labels (or between the final
+cost label and the return from the function), see
+Figure~\ref{fig:strtrace}. This captures the idea that the cost labels
+only represent costs \emph{within} a function --- calls to other
+functions are accounted for in the nested trace for their execution, and we
+can locally regard function calls as a single step.
+
+These structured traces form the core part of the intermediate results
+that we must prove so that the back-end can complete the main
+intensional result stated above. In full, we provide the back-end
+with
+\begin{enumerate}
+\item A normal trace of the \textbf{prefix} of the program's execution
+ before reaching the measurable subtrace. (This needs to be
+ preserved so that we know that the stack space consumed is correct,
+ and to set up the simulation results.)
+\item The \textbf{structured trace} corresponding to the measurable
+ subtrace.
+\item An additional property about the structured trace that no
+ `program counter' is \textbf{repeated} between cost labels. Together with
+ the structure in the trace, this takes over from showing that
+ cost labelling is sound and precise.
+\item A proof that the \textbf{observables} have been preserved.
+\item A proof that the \textbf{stack limit} is still observed by the prefix and
+  the structured trace. (This is largely a consequence of the
+ preservation of observables.)
+\end{enumerate}
+The \lstinline'front_end_correct' lemma in the
+\lstinline'correctness.ma' file provides a record containing these.
+
+Following the outline in Figure~\ref{fig:compiler}, we will first deal
+with the transformations in \textsf{Clight} that produce the source
+program with cost labels, then show that measurable traces can be
+lifted to \textsf{RTLabs}, and finally show that we can construct the
+properties listed above ready for the back-end proofs.
+
+\section{Input code to cost labelled program}
+\label{sec:inputtolabelling}
+
+As explained on page~\pageref{page:switchintro}, the costs of complex
+C \lstinline[language=C]'switch' statements cannot be represented with
+the simple labelling used in the formalised compiler. Our first pass
+replaces these statements with simpler C code, allowing our second
+pass to perform the cost labelling. We show that the behaviour of
+programs is unchanged by these passes using forward
+simulations\footnote{All of our languages are deterministic, which can
+be seen directly from their executable definitions. Thus we know that
+forward simulations are sufficient because the target cannot have any
+other behaviour.}.
+
+\subsection{Switch removal}
+
+We compile away \lstinline[language=C]'switch' statements into more
+basic \textsf{Clight} code.
+Note that this transformation does not necessarily deteriorate the
+efficiency of the generated code. For instance, compilers such as GCC
+introduce balanced trees of ``if-then-else'' constructs for small
+switches. However, our implementation strategy is much simpler. Let
+us consider the following input statement.
+
+\begin{lstlisting}[language=C]
+ switch(e) {
+ case v1:
+ stmt1;
+ case v2:
+ stmt2;
+ default:
+ stmt_default;
+ }
+\end{lstlisting}
+
+Note that \textsf{stmt1}, \textsf{stmt2}, \ldots \textsf{stmt\_default}
+may contain \lstinline[language=C]'break' statements, which have the
+effect of exiting the switch statement. In the absence of break, the
+execution falls through each case sequentially. In our implementation,
+we produce an equivalent sequence of ``if-then'' chained by gotos:
+\begin{lstlisting}[language=C]
+ fresh = e;
+ if(fresh == v1) {
+ $\llbracket$stmt1$\rrbracket$;
+ goto lbl_case2;
+ };
+  if(fresh == v2) {
+    lbl_case2:
+    $\llbracket$stmt2$\rrbracket$;
+    goto lbl_default;
+  };
+  lbl_default:
+  $\llbracket$stmt_default$\rrbracket$;
+  exit_label: ;
+\end{lstlisting}
+
+The proof had to tackle the following points:
+\begin{itemize}
+\item the source and target memories are not the same (due to the fresh variable),
+\item the flow of control is changed in a non-local way (e.g. \textbf{goto}
+ instead of \textbf{break}).
+\end{itemize}
+In order to tackle the first point, we implemented a version of memory
+extensions similar to those of CompCert.
+
+For the simulation we decided to prove a sufficient amount to give us
+confidence in the definitions and approach, but to curtail the proof
+because this pass does not contribute to the intensional correctness
+result. We tackled several simple cases, that do not interact with
+the switch removal per se, to show that the definitions were usable,
+and part of the switch case to check that the approach is
+reasonable. This comprises propagating the memory extension through
+each statement (except switch), as well as various invariants that are
+needed for the switch case (in particular, freshness hypotheses). The
+details of the evaluation process for the source switch statement and
+its target counterpart can be found in the file
+\lstinline'switchRemoval.ma', along with more details on the transformation
+itself.
+
+Proving the correctness of the second point would require reasoning on the
+semantics of \lstinline[language=C]'goto' statements. In the \textsf{Clight}
+semantics, this is implemented as a function-wide lookup of the target label.
+The invariant we would need is the fact that a global label lookup on a freshly
+created goto is equivalent to a local lookup. This would in turn require the
+propagation of some freshness hypotheses on labels. As discussed,
+we decided to omit this part of the correctness proof.
+
+\subsection{Cost labelling}
+
+The simulation for the cost labelling pass is the simplest in the
+front-end. The main argument is that any step of the source program
+is simulated by the same step of the labelled one, plus any extra
+steps for the added cost labels. The extra instructions do not change
+the memory or local environments, although we have to keep track of
+the extra instructions that appear in continuations, for example
+during the execution of a \lstinline[language=C]'while' loop.
+
+We do not attempt to capture any cost properties of the labelling\footnote{We describe how the cost properties are
+established in Section~\ref{sec:costchecks}.} in
+the simulation proof, which allows the proof to be oblivious to the choice
+of cost labels. Hence we do not have to reason about the threading of
+name generation through the labelling function, greatly reducing the
+amount of effort required.
+
+%TODO: both give one-step-sim-by-many forward sim results; switch
+%removal tricky, uses aux var to keep result of expr, not central to
+%intensional correctness so curtailed proof effort once reasonable
+%level of confidence in code gained; labelling much simpler; don't care
+%what the labels are at this stage, just need to know when to go
+%through extra steps. Rolled up into a single result with a cofixpoint
+%to obtain coinductive statement of equivalence (show).
+
+\section{Finding corresponding measurable subtraces}
+\label{sec:measurablelifting}
+
+There follow the three main passes of the front-end:
+\begin{enumerate}
+\item simplification of casts in \textsf{Clight} code
+\item \textsf{Clight} to \textsf{Cminor} translation, performing stack
+ variable allocation and simplifying control structures
+\item transformation to \textsf{RTLabs} control flow graph
+\end{enumerate}
+We have taken a common approach to
+each pass: first we build (or axiomatise) forward simulation results
+that are similar to normal compiler proofs, but which are slightly more
+fine-grained so that we can see that the call structure and relative
+placement of cost labels is preserved.
+
+Then we instantiate a general result which shows that we can find a
+\emph{measurable} subtrace in the target of the pass that corresponds
+to the measurable subtrace in the source. By repeated application of
+this result we can find a measurable subtrace of the execution of the
+\textsf{RTLabs} code, suitable for the construction of a structured
+trace (see Section~\ref{sec:structuredtrace}). This is essentially an
+extra layer on top of the simulation proofs that provides us with the
+additional information required for our intensional correctness proof.
+
+\subsection{Generic measurable subtrace lifting proof}
+
+Our generic proof is parametrised on a record containing small-step
+semantics for the source and target language, a classification of
+states (the same form of classification is used when defining
+structured traces), a simulation relation which respects the
+classification and cost labelling and
+four simulation results. The simulations are split by the starting state's
+classification and whether it is a cost label, which will allow us to
+observe that the call structure is preserved. They are:
+\begin{enumerate}
+\item a step from a `normal' state (which is not classified as a call
+ or return) which is not a cost label is simulated by zero or more
+ `normal' steps;
+\item a step from a `call' state followed by a cost label step is
+ simulated by a step from a `call' state, a corresponding label step,
+ then zero or more `normal' steps;
+\item a step from a `call' state not followed by a cost label
+ similarly (note that this case cannot occur in a well-labelled
+ program, but we do not have enough information locally to exploit
+ this); and
+\item a cost label step is simulated by a cost label step.
+\end{enumerate}
+Finally, we need to know that a successfully translated program will
+have an initial state in the simulation relation with the original
+program's initial state.
+
+The back-end has similar requirements for lifting simulations to
+structured traces. Fortunately, our treatment of calls and returns
+can be slightly simpler because we have special call and return states
+that correspond to function entry and return that are separate from
+the actual instructions. This was originally inherited from our port
+of CompCert's \textsf{Clight} semantics, but proves useful here
+because we only need to consider adding extra steps \emph{after} a
+call or return state; the instruction step deals with extra
+steps that occur before. The back-end makes all of the call and
+return machinery explicit, and thus needs more complex statements
+about extra steps before and after each call and return.
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=0.5\linewidth]{meassim.pdf}
+\end{center}
+\caption{Tiling of simulation for a measurable subtrace}
+\label{fig:tiling}
+\end{figure}
+
+To find the measurable subtrace in the target program's execution we
+walk along the original program's execution trace applying the
+appropriate simulation result by induction on the number of steps.
+While the number of steps taken varies, the overall structure is
+preserved, as illustrated in Figure~\ref{fig:tiling}. By preserving
+the structure we also maintain the same intensional observables. One
+delicate point is that the cost label following a call must remain
+directly afterwards\footnote{The prototype compiler allowed some
+ straight-line code to appear before the cost label until a later
+ stage of the compiler, but we must move the requirement forward to
+ fit with the structured traces.}
+% Damn it, I should have just moved the cost label forwards in RTLabs,
+% like the prototype does in RTL to ERTL; the result would have been
+% simpler. Or was there some reason not to do that?
+(both in the program code and in the execution trace), even if we
+introduce extra steps, for example to store parameters in memory in
+\textsf{Cminor}. Thus we have a version of the call simulation
+that deals with both the call and the cost label in one result.
+
+In addition to the subtrace we are interested in measuring, we must
+prove that the earlier part of the trace is also preserved in
+order to use the simulation from the initial state. This proof also
+guarantees that we do not run out of stack space before the subtrace
+we are interested in. The lemmas for this prefix and the measurable
+subtrace are similar, following the pattern above. However, the
+measurable subtrace also requires us to rebuild the termination
+proof. This is defined recursively:
+\label{prog:terminationproof}
+\begin{lstlisting}[language=Grafite]
+ let rec will_return_aux C (depth:nat)
+ (trace:list (cs_state $...$ C $\times$ trace)) on trace : bool :=
+ match trace with
+ [ nil $\Rightarrow$ false
+ | cons h tl $\Rightarrow$
+ let $\langle$s,tr$\rangle$ := h in
+ match cs_classify C s with
+ [ cl_call $\Rightarrow$ will_return_aux C (S depth) tl
+ | cl_return $\Rightarrow$
+ match depth with
+ [ O $\Rightarrow$ match tl with [ nil $\Rightarrow$ true | _ $\Rightarrow$ false ]
+ | S d $\Rightarrow$ will_return_aux C d tl
+ ]
+ | _ $\Rightarrow$ will_return_aux C depth tl
+ ]
+ ].
+\end{lstlisting}
+The \lstinline'depth' is the number of return states we need to see
+before we have returned to the original function (initially zero) and
+\lstinline'trace' the measurable subtrace obtained from running
+the semantics for the correct number of steps. This definition
+unfolds tail recursively for each step, and once the corresponding
+simulation result has been applied a new one for the target can be
+asserted by unfolding and applying the induction hypothesis on the
+shorter trace.
+
+Combining the lemmas about the prefix and the measurable subtrace
+requires a little care because the states joining the two might not be
+related in the simulation. In particular, if the measurable subtrace
+starts from the cost label at the beginning of the function there may
+be some extra instructions in the target code to execute to complete
+function entry before the states are back in the relation. Hence we
+carefully phrased the lemmas to allow for such extra steps.
+
+Together, these lemmas give us an overall result for any simulation fitting the
+requirements above (contained in the \lstinline'meas_sim' record):
+\begin{lstlisting}[language=Grafite]
+theorem measured_subtrace_preserved :
+ $\forall$MS:meas_sim.
+ $\forall$p1,p2,m,n,stack_cost,max.
+ ms_compiled MS p1 p2 $\rightarrow$
+ measurable (ms_C1 MS) p1 m n stack_cost max $\rightarrow$
+ $\exists$m',n'.
+ measurable (ms_C2 MS) p2 m' n' stack_cost max $\wedge$
+ observables (ms_C1 MS) p1 m n = observables (ms_C2 MS) p2 m' n'.
+\end{lstlisting}
+The stack space requirement that is embedded in \lstinline'measurable'
+is a consequence of the preservation of observables, because it is
+determined by the functions called and returned from, which are observable.
+
+\subsection{Simulation results for each pass}
+
+We now consider the simulation results for the passes, each of which
+is used to instantiate the
+\lstinline[language=Grafite]'measured_subtrace_preserved' theorem to
+construct the measurable subtrace for the next language.
+
+\subsubsection{Cast simplification}
+
+The parser used in \cerco{} introduces a lot of explicit type casts.
+If left as they are, these constructs can greatly hamper the
+quality of the generated code -- especially as the architecture
+we consider is an $8$-bit one. In \textsf{Clight}, casts are
+expressions. Hence, most of the work of this transformation
+proceeds on expressions. The transformation proceeds by recursively
+trying to coerce an expression to a particular integer type, which
+is in practice smaller than the original one. This functionality
+is implemented by two mutually recursive functions whose signature
+is the following.
+
+\begin{lstlisting}[language=Grafite]
+let rec simplify_expr (e:expr) (target_sz:intsize) (target_sg:signedness)
+ : $\Sigma$result:bool$\times$expr.
+ $\forall$ge,en,m. simplify_inv ge en m e (\snd result) target_sz target_sg (\fst result) := $\ldots$
+
+and simplify_inside (e:expr) : $\Sigma$result:expr. conservation e result := $\ldots$
+\end{lstlisting}
+
+The \textsf{simplify\_inside} function acts as a wrapper for
+\textsf{simplify\_expr}. Whenever \textsf{simplify\_inside} encounters
+a \textsf{Ecast} expression, it tries to coerce the sub-expression
+to the desired type using \textsf{simplify\_expr}, which tries to
+perform the actual coercion. In return, \textsf{simplify\_expr} calls
+back \textsf{simplify\_inside} in some particular positions, where we
+decided to be conservative in order to simplify the proofs. However,
+the current design allows us to move incrementally to a more aggressive
+version, by replacing recursive calls to \textsf{simplify\_inside} with
+calls to \textsf{simplify\_expr} \emph{and} proving the corresponding
+invariants -- where possible.
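+As a small hypothetical illustration (not drawn from the development),
+consider a fragment where the parser has inserted promotions to
+\lstinline[language=C]'int':
+\begin{lstlisting}[language=C]
+unsigned char x, y, z;
+z = (unsigned char)((int)x + (int)y);
+\end{lstlisting}
+Encountering the outer cast, \textsf{simplify\_inside} asks
+\textsf{simplify\_expr} to coerce the sum to an $8$-bit unsigned type;
+since the cast discards the high bits anyway, the addition can safely
+be performed at $8$ bits, eliding both promotions. If the coercion
+failed, the \textsf{Inv\_eq} case would apply and the expression would
+be left unchanged.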
+
+The \textsf{simplify\_inv} invariant encodes either the conservation
+of the semantics during the transformation corresponding to the failure
+of the coercion (\textsf{Inv\_eq} constructor), or the successful
+downcast of the considered expression to the target type
+(\textsf{Inv\_coerce\_ok}).
+
+\begin{lstlisting}[language=Grafite]
+inductive simplify_inv
+ (ge : genv) (en : env) (m : mem)
+ (e1 : expr) (e2 : expr) (target_sz : intsize) (target_sg : signedness) : bool $\rightarrow$ Prop :=
+| Inv_eq : $\forall$result_flag. $\ldots$
+ simplify_inv ge en m e1 e2 target_sz target_sg result_flag
+| Inv_coerce_ok : $\forall$src_sz,src_sg.
+ typeof e1 = Tint src_sz src_sg $\rightarrow$
+ typeof e2 = Tint target_sz target_sg $\rightarrow$
+ smaller_integer_val src_sz target_sz src_sg (exec_expr ge en m e1) (exec_expr ge en m e2) $\rightarrow$
+ simplify_inv ge en m e1 e2 target_sz target_sg true.
+\end{lstlisting}
+
+The \textsf{conservation} invariant for \textsf{simplify\_inside} simply states the conservation
+of the semantics, as in the \textsf{Inv\_eq} constructor of the previous
+invariant.
+
+\begin{lstlisting}[language=Grafite]
+definition conservation := $\lambda$e,result. $\forall$ge,en,m.
+ res_sim ? (exec_expr ge en m e) (exec_expr ge en m result)
+ $\wedge$ res_sim ? (exec_lvalue ge en m e) (exec_lvalue ge en m result)
+ $\wedge$ typeof e = typeof result.
+\end{lstlisting}
+
+This invariant is then easily lifted to statement evaluations.
+The main problem encountered with this particular pass was dealing with
+inconsistently typed programs, a canonical case being a particular
+integer constant of a certain size typed with another size. This
+prompted the need to introduce numerous type checks, making
+both the implementation and the proof more complex, even though more
+comprehensive checks are made in the next stage.
+%\todo{Make this a particular case of the more general statement on baking more invariants in the Clight language}
+
+\subsubsection{Clight to Cminor}
+
+This pass is the last one operating on the \textsf{Clight} language.
+Its input is a full \textsf{Clight} program, and its output is a
+\textsf{Cminor} program. Note that we do not use an equivalent of
+CompCert's \textsf{C\#minor} language: we translate directly to a
+variant of \textsf{Cminor}. This presents the advantage of not
+requiring the special loop constructs, nor the explicit block
+structure. Another salient point of our approach is that a significant
+number of the properties needed for the simulation proof were directly
+encoded in dependently typed translation functions. In particular,
+freshness conditions and well-typedness conditions are included. The
+main effects of the transformation from \textsf{Clight} to
+\textsf{Cminor} are listed below.
+
+\begin{itemize}
+\item Variables are classified as being either globals, stack-allocated
+ locals or potentially register-allocated locals. The value of register-allocated
+ local variables is moved out of the modelled memory and stored in a
+ dedicated environment.
+\item In \textsf{Clight}, each local variable has a dedicated memory block, whereas
+ stack-allocated locals are bundled together on a function-by-function basis.
+\item Loops are converted to jumps.
+\end{itemize}
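+The first point can be pictured on a small C fragment (illustrative
+only, with a hypothetical external function \lstinline[language=C]'g'):
+\begin{lstlisting}[language=C]
+extern void g(int *p);
+
+int f(void) {
+  int a = 1;  /* address never taken: may move to the local environment */
+  int b = 2;  /* address escapes via g: must stay stack-allocated */
+  g(&b);
+  return a + b;
+}
+\end{lstlisting}
+Here \lstinline[language=C]'b' must remain in the modelled memory,
+bundled into the function's stack frame, while
+\lstinline[language=C]'a' can be moved to the dedicated environment.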
+
+The first two points require memory injections which are more flexible than those
+needed in the switch removal case. In the remainder of this section, we briefly
+discuss our implementation of memory injections, and then the simulation proof.
+
+\paragraph{Memory injections.}
+
+Our memory injections are modelled after the work of Blazy \& Leroy.
+However, the corresponding paper is based on the first version of the
+CompCert memory model~\cite{2008-Leroy-Blazy-memory-model}, whereas we use a much more concrete model, allowing byte-level
+manipulations (as in the later version of CompCert's memory model). We proved
+roughly 80\% of the required lemmas. Notably, some of the difficulties encountered were
+due to overly relaxed conditions on pointer validity (fixed during development).
+Some more side conditions had to be added to take care of possible overflows when converting
+from \textbf{Z} block bounds to $16$-bit pointer offsets (in practice, such overflows
+only occur in edge cases that are easily ruled out -- but this fact is not visible
+in memory injections). Concretely, some of the lemmas on the preservation of simulation of
+loads after writes were axiomatised, due to a lack of time.
+
+\paragraph{Simulation proof.}
+
+We proved the simulation result for expressions and a representative
+selection of statements. In particular we tackled
+\lstinline[language=C]'while' statements to ensure that we correctly
+translate loops because our approach differs from CompCert by
+converting directly to \textsf{Cminor} \lstinline[language=C]'goto's
+rather than maintaining a notion of loop in \textsf{Cminor}. We also have a partial
+proof for function entry, covering the setup of the memory injection,
+but not function exit. Exits, and the remaining statements, have been
+axiomatised.
+
+Careful management of the proof state was required because proof terms
+are embedded in \textsf{Cminor} code to show that invariants are
+respected. These proof terms appear in the proof state when inverting
+the translation functions, and they can be large and awkward. While
+generalising them away is usually sufficient, it can be difficult when
+they appear under a binder.
+
+%The correctness proof for this transformation was not completed. We proved the
+%simulation result for expressions and for some subset of the critical statement cases.
+%Notably lacking are the function entry and exit, where the memory injection is
+%properly set up. As would be expected, a significant amount of work has to be performed
+%to show the conservation of all invariants at each simulation step.
+
+%\todo{list cases, explain while loop, explain labeling problem}
+
+\subsubsection{Cminor to RTLabs}
+
+The translation from \textsf{Cminor} to \textsf{RTLabs} is a fairly
+routine control flow graph (CFG) construction. As such, we chose to
+axiomatise the associated extensional simulation results. However, we did prove several
+properties of the generated programs:
+\begin{itemize}
+\item All statements are type correct with respect to the declared
+ pseudo-register type environment.
+\item The CFG is closed, and has a distinguished entry node and a
+ unique exit node.
+\end{itemize}
+
+These properties rely on similar properties about type safety and the
+presence of \lstinline[language=C]'goto'-labels for \textsf{Cminor} programs
+which are checked at the preceding stage. As a result, this
+transformation is total, and any compilation failure must occur
+earlier, while the corresponding \textsf{Clight} source is still
+available and a better error message can be generated.
+
+The proof obligations for these properties include many instances of
+graph inclusion. We automated these proofs using a small amount of
+reflection, making the obligations much easier to handle. One
+drawback to enforcing invariants throughout is that temporarily
+breaking them can be awkward. For example, \lstinline'return'
+statements were originally used as placeholders for
+\lstinline[language=C]'goto' destinations that had not yet been
+translated. However, this made establishing the single exit node
+property rather difficult, and a different placeholder was chosen
+instead. In other circumstances it is possible to prove a more
+complex invariant then simplify it at the end of the transformation.
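+The reflective scheme takes roughly the following shape (a simplified
+sketch with hypothetical names, not the literal development): a
+boolean function decides graph inclusion, and a soundness lemma turns
+a successful run of the checker into the desired proposition, so that
+each obligation is discharged largely by computation:
+\begin{lstlisting}[language=Grafite]
+(* Sketch only: names are illustrative. *)
+definition graph_include_b : graph $\rightarrow$ graph $\rightarrow$ bool := $\ldots$
+
+lemma graph_include_b_sound : $\forall$g1,g2.
+  graph_include_b g1 g2 = true $\rightarrow$ graph_include g1 g2.
+\end{lstlisting}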
+
+\section{Checking cost labelling properties}
+\label{sec:costchecks}
+
+Ideally, we would provide proofs that the cost labelling pass always
+produces programs that are soundly and precisely labelled and that
+each subsequent pass preserves these properties. This would match our
+use of dependent types to eliminate impossible sources of errors
+during compilation, in particular retaining intermediate language type
+information.
+
+However, given the limited amount of time available we realised that
+implementing a compile-time check for a sound and precise labelling of
+the \textsf{RTLabs} intermediate code would reduce the proof burden
+considerably. This is similar in spirit to the use of translation
+validation in certified compilation, which makes a similar trade-off
+between the potential for compile-time failure and the volume of proof
+required.
+
+The check cannot be pushed into a later stage of the compiler because
+much of the information is embedded into the structured traces.
+However, if an alternative method were used to show that function
+returns in the compiled code are sufficiently well-behaved, then we
+could consider pushing the cost property checks into the timing
+analysis itself. We leave this as a possible area for future work.
+
+\subsection{Implementation and correctness}
+\label{sec:costchecksimpl}
+
+For a cost labelling to be sound and precise we need a cost label at
+the start of each function, after each branch and at least one in
+every loop. The first two parts are trivial to check by examining the
+code. In \textsf{RTLabs} the last part is specified by saying
+that there is a bound on the number of successive instruction nodes in
+the CFG that you can follow before you encounter a cost label, and
+checking this is more difficult.
+
+The implementation progresses through the set of nodes in the graph,
+following successors until a cost label is found or a label-free cycle
+is discovered (in which case the property does not hold and we return
+an error). This is made easier by the prior knowledge that every
+successor of a branch instruction is a cost label, so we do not need
+to search each branch. When a label is found, we remove the chain of
+program counters from the set and continue from another node in the
+set until it is empty, at which point we know that there is a bound
+for every node in the graph.
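+In outline, the procedure can be sketched as the following pseudocode
+(illustrative only, not the literal \matita{} definition):
+\begin{lstlisting}[language=Grafite]
+(* Sketch: bound the label-free paths from every node. *)
+check remaining :=
+  if remaining is empty then OK
+  else pick n from remaining, set chain := $\emptyset$, and repeat:
+    if n is a cost label then check (remaining $\setminus$ chain)
+    else if n $\in$ chain then Error  (* label-free cycle *)
+    else add n to chain and move to the successor of n.
+\end{lstlisting}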
+
+Directly reasoning about the function that implements this procedure would be
+rather awkward, so an inductive specification of a single step of its
+behaviour was written and proved to match the implementation. This
+was then used to prove the implementation sound and complete.
+
+While we have not attempted to prove that the cost labelled properties
+are established and preserved earlier in the compiler, we expect that
+the effort for the \textsf{Cminor} to \textsf{RTLabs} stage alone
+would be similar to the work outlined above, because it involves the
+change from requiring a cost label at particular positions to
+requiring cost labels to break loops in the CFG. As there are a
+further three passes to consider (including the labelling pass
+itself), we believe that using the check above is much simpler overall.
+
+% TODO? Found some Clight to Cminor bugs quite quickly
+
+\section{Existence of a structured trace}
+\label{sec:structuredtrace}
+
+The \emph{structured trace} idea introduced in
+Section~\ref{sec:fegoals} enriches the execution trace of a program by
+nesting function calls in a mixed-step style and embedding the cost
+labelling properties of the program. See Figure~\ref{fig:strtrace} on
+page~\pageref{fig:strtrace} for an illustration of a structured trace.
+It was originally designed to support the proof of correctness for the
+timing analysis of the object code in the back-end, then generalised
+to provide a common structure to use from the end of the front-end to
+the object code.
+
+To make the definition generic we abstract over the semantics of the
+language,
+\begin{lstlisting}[language=Grafite]
+record abstract_status : Type[1] :=
+ { as_status :> Type[0]
+ ; as_execute : relation as_status
+ ; as_pc : DeqSet
+ ; as_pc_of : as_status $\rightarrow$ as_pc
+ ; as_classify : as_status $\rightarrow$ status_class
+ ; as_label_of_pc : as_pc $\rightarrow$ option costlabel
+ ; as_after_return : ($\Sigma$s:as_status. as_classify s = cl_call) $\rightarrow$ as_status $\rightarrow$ Prop
+ ; as_result: as_status $\rightarrow$ option int
+ ; as_call_ident : ($\Sigma$s:as_status.as_classify s = cl_call) $\rightarrow$ ident
+ ; as_tailcall_ident : ($\Sigma$s:as_status.as_classify s = cl_tailcall) $\rightarrow$ ident
+ }.
+\end{lstlisting}
+which requires a type of states, an execution relation\footnote{All of
+ our semantics are executable, but using a relation was simpler in
+ the abstraction.}, some notion of abstract
+program counter with decidable equality, the classification of states,
+and functions to extract the observable intensional information (cost
+labels and the identity of functions that are called). The
+\lstinline'as_after_return' property links the state before a function
+call with the state after return, providing the evidence that
+execution returns to the correct place. The precise form varies
+between stages; in \textsf{RTLabs} it insists that the CFG, the
+pointer to the CFG node to execute next, and some call stack
+information are all preserved.
+
+The structured traces are defined using three mutually inductive
+types. The core data structure is \lstinline'trace_any_label', which
+captures some straight-line execution until the next cost label or the
+return from the enclosing function. Each function call is embedded as
+a single step, with its own trace nested inside and the states before
+and after the call linked by \lstinline'as_after_return'; states
+classified as a `jump' (in particular branches) must be followed by a
+cost label.
+
+The second type, \lstinline'trace_label_label', is a
+\lstinline'trace_any_label' where the initial state is cost labelled.
+Thus a trace in this type identifies a series of steps whose cost is
+entirely accounted for by the label at the start.
+
+Finally, \lstinline'trace_label_return' is a sequence of
+\lstinline'trace_label_label' values which end in the return from the
+function. These correspond to a measurable subtrace, and in
+particular include executions of an entire function call (and so are
+used for the nested calls in \lstinline'trace_any_label').
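+The skeleton of the three types is roughly the following (an
+abbreviated sketch: constructors and side conditions are elided, and
+the flag records whether the trace ends in a return):
+\begin{lstlisting}[language=Grafite]
+inductive trace_label_return (S:abstract_status) : S $\rightarrow$ S $\rightarrow$ Type[0] := $\ldots$
+with trace_label_label (S:abstract_status) : trace_ends_with_ret $\rightarrow$ S $\rightarrow$ S $\rightarrow$ Type[0] := $\ldots$
+with trace_any_label (S:abstract_status) : trace_ends_with_ret $\rightarrow$ S $\rightarrow$ S $\rightarrow$ Type[0] := $\ldots$
+\end{lstlisting}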
+
+\subsection{Construction}
+
+The construction of the structured trace replaces syntactic cost
+labelling properties, which place requirements on where labels appear
+in the program, with semantic properties that constrain the execution
+traces of the program. The construction begins by defining versions
+of the sound and precise labelling properties on states and global
+environments (for the code that appears in each of them) rather than
+whole programs, and showing that these are preserved by steps of the
+\textsf{RTLabs} semantics.
+
+Then we show that each cost labelling property required by the
+definition of structured traces is locally satisfied. These proofs are
+broken up by the classification of states. Similarly, we prove a
+step-by-step stack preservation result, which states that the
+\textsf{RTLabs} semantics never changes the lower parts of the stack.
+
+The core part of the construction of a structured trace is to use the
+proof of termination from the measurable trace (defined on
+page~\pageref{prog:terminationproof}) to `fold up' the execution into
+the nested form. The results outlined above fill in the proof
+obligations for the cost labelling properties and the stack
+preservation result shows that calls return to the correct location.
+
+The structured trace alone is not sufficient to capture the property
+that the program is soundly labelled. While the structured trace
+guarantees termination, it still permits a loop to be executed a
+finite number of times without encountering a cost label. We
+eliminate this by proving that no `program counter' repeats within any
+\lstinline'trace_any_label' section by showing that it is incompatible
+with the property that there is a bound on the number of successor
+instructions you can follow in the CFG before you encounter a cost
+label (from Section~\ref{sec:costchecksimpl}).
+
+\subsubsection{Complete execution structured traces}
+
+The development of the construction above started relatively early,
+before the measurable subtrace preservation proofs. To be confident
+that the traces were well-formed at that time, we also developed a
+complete execution form that embeds the traces above. This includes
+non-terminating program executions, where an infinite number of the terminating
+structured traces are embedded. This construction confirmed that our
+definition of structured traces was consistent, although we later
+found that we did not require the whole execution version for the
+compiler correctness results.
+
+To construct these we need to know whether each function call will
+eventually terminate, requiring the use of the excluded middle. This
+classical reasoning is local to the construction of whole program
+traces and is not necessary for our main results.
+
+\section{Conclusion}
+
+In combination with the work on the CerCo back-end and by
+concentrating on the novel intensional parts of the proof, we have
+shown that it is possible to construct certifying compilers that
+correctly report execution time and stack space costs. The layering
+of intensional correctness proofs on top of normal simulation results
+provides a useful separation of concerns, and could permit the reuse
+of existing results.
+
+\section{Files}
+
+The following table gives a high-level overview of the \matita{}
+source files in Deliverable 3.4:
+
+\bigskip
+
+\begin{tabular}{rp{.7\linewidth}}
+\lstinline'compiler.ma' & Top-level compiler definitions, in particular
+\lstinline'front_end', and the whole compiler definition
+\lstinline'compile'. \\
+\lstinline'correctness.ma' & Correctness results: \lstinline'front_end_correct'
+for the front-end and \lstinline'correct' for the whole compiler. \\
+\lstinline'Clight/*' & \textsf{Clight}: proofs for switch
+removal, cost labelling, cast simplification and conversion to
+\textsf{Cminor}. \\
+\lstinline'Cminor/*' & \textsf{Cminor}: axioms of conversion to
+\textsf{RTLabs}. \\
+\lstinline'RTLabs/*' & \textsf{RTLabs}: definitions and proofs for
+compile-time cost labelling checks, construction of structured traces.
+\\
+\lstinline'common/Measurable.ma' & Definitions for measurable
+subtraces. \\
+\lstinline'common/FEMeasurable.ma' & Generic measurable subtrace
+lifting proof. \\
+\lstinline'common/*' & Other common definitions relevant to many parts
+of the compiler and proof. \\
+\lstinline'utilities/*' & General purpose definitions used throughout,
+including extensions to the standard \matita{} library.
+\end{tabular}
\subsection{A brief overview of the backend compilation chain}
@@ -80,5 +1112,5 @@
Our abstraction takes the following form:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
inductive joint_instruction (p: params__) (globals: list ident): Type[0] :=
| COMMENT: String $\rightarrow$ joint_instruction p globals
@@ -86,8 +1118,5 @@
| INT: generic_reg p $\rightarrow$ Byte $\rightarrow$ joint_instruction p globals
...
- | OP1: Op1 $
-ightarrow$ acc_a_reg p $
-ightarrow$ acc_a_reg p $
-ightarrow$ joint_instruction p globals
+ | OP1: Op1 $\rightarrow$ acc_a_reg p $\rightarrow$ acc_a_reg p $\rightarrow$ joint_instruction p globals
...
| extension: extend_statements p $\rightarrow$ joint_instruction p globals.
@@ -106,5 +1135,5 @@
As \texttt{p} varies between intermediate languages, we can provide language specific extensions to the syntax of the joint language.
For example, ERTL's extended syntax consists of the following extra statements:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
inductive ertl_statement_extension: Type[0] :=
| ertl_st_ext_new_frame: ertl_statement_extension
@@ -113,5 +1142,5 @@
\end{lstlisting}
These are further packaged into an ERTL specific instance of \texttt{params\_\_} as follows:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
definition ertl_params__: params__ :=
mk_params__ register register ... ertl_statement_extension.
@@ -129,5 +1158,5 @@
Our joint internal function record looks like so:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record joint_internal_function (globals: list ident) (p:params globals) : Type[0] :=
{
@@ -150,5 +1179,5 @@
The `joint' syntax makes this especially clear.
For instance, in the definition:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
inductive joint_instruction (p:params__) (globals: list ident): Type[0] :=
...
@@ -232,5 +1261,5 @@
For example, in the \texttt{RTLabs} to \texttt{RTL} transformation pass, many functions only `make sense' when lists of registers passed to them as arguments conform to some specific length.
For instance, the \texttt{translate\_negint} function, which translates a negative integer constant:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
definition translate_negint :=
$\lambda$globals: list ident.
@@ -251,5 +1280,5 @@
Practically, we would always like to ensure that the entry and exit labels are present in the statement graph.
We ensure that this is so with a dependent sum type in the \texttt{joint\_internal\_function} record, which all graph based languages specialise to obtain their own internal function representation:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record joint_internal_function (globals: list ident) (p: params globals): Type[0] :=
{
@@ -389,5 +1418,5 @@
We begin the abstraction process with the \texttt{params\_\_} record.
This holds the types of the representations of the different register varieties in the intermediate languages:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record params__: Type[1] :=
{
@@ -421,5 +1450,5 @@
As mentioned in the report for Deliverable D4.2, the record \texttt{params\_\_} is enough to be able to specify the instructions of the joint languages:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
inductive joint_instruction (p: params__) (globals: list ident): Type[0] :=
| COMMENT: String $\rightarrow$ joint_instruction p globals
@@ -435,5 +1464,5 @@
Naturally, as some intermediate languages are graph based and others linearised, passing control on to the `successor' instruction can mean either following a graph edge in a control flow graph or incrementing an index into a list.
We make a distinction between instructions that pass control onto their immediate successors, and those that jump elsewhere in the program, through the use of \texttt{succ}, denoting the immediate successor of the current instruction, in the \texttt{params\_} record described below.
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record params_: Type[1] :=
{
@@ -444,5 +1473,5 @@
The type \texttt{succ} corresponds to labels, in the case of control flow graph based languages, or is instantiated to the unit type for the linearised language, LIN.
Using \texttt{params\_} we can define statements of the joint language:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
inductive joint_statement (p:params_) (globals: list ident): Type[0] :=
| sequential: joint_instruction p globals $\rightarrow$ succ p $\rightarrow$ joint_statement p globals
@@ -457,5 +1486,5 @@
For the semantics, we need further parameterised types.
In particular, we parameterise the result and parameter type of an internal function call in \texttt{params0}:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record params0: Type[1] :=
{
@@ -468,5 +1497,5 @@
We further extend \texttt{params0} with a type for local variables in internal function calls:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record params1 : Type[1] :=
{
@@ -478,5 +1507,5 @@
Further, we hypothesise a generic method for looking up the next instruction in the graph, called \texttt{lookup}.
Note that \texttt{lookup} may fail, and returns an \texttt{option} type:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record params (globals: list ident): Type[1] :=
{
@@ -492,5 +1521,5 @@
In particular, we have a description of the result, parameters and the local variables of a function.
Note also that we have lifted the hypothesised \texttt{lookup} function from \texttt{params} into a dependent sigma type, which combines a label (the entry and exit points of the control flow graph or list) combined with a proof that the label is in the graph structure:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record joint_internal_function (globals: list ident) (p:params globals) : Type[0] :=
{
@@ -509,5 +1538,5 @@
The reason is because some intermediate languages share a host of parameters, and only differ on some others.
For instance, in instantiating the ERTL language, certain parameters are shared with RTL, whilst others are ERTL specific:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
...
definition ertl_params__: params__ :=
@@ -516,5 +1545,5 @@
...
definition ertl_params1: params1 := rtl_ertl_params1 ertl_params0.
-definition ertl_params: ∀globals. params globals := rtl_ertl_params ertl_params0.
+definition ertl_params: $\forall$globals. params globals := rtl_ertl_params ertl_params0.
...
definition ertl_statement := joint_statement ertl_params_.
@@ -524,10 +1553,10 @@
\end{lstlisting}
Here, \texttt{rtl\_ertl\_params1} are the common parameters of the ERTL and RTL languages:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
definition rtl_ertl_params1 := $\lambda$pars0. mk_params1 pars0 (list register).
\end{lstlisting}
The record \texttt{more\_sem\_params} bundles together functions that store and retrieve values in various forms of register:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record more_sem_params (p:params_): Type[1] :=
{
@@ -565,5 +1594,5 @@
We extend \texttt{more\_sem\_params} with yet more parameters via \texttt{more\_sem\_params1}:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record more_sem_params1 (globals: list ident) (p: params globals) : Type[1] :=
{
@@ -605,5 +1634,5 @@
We bundle \texttt{params} and \texttt{sem\_params} together into a single record.
This will be used in the function \texttt{eval\_statement} which executes a single statement of the joint language:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record sem_params2 (globals: list ident): Type[1] :=
{
@@ -614,5 +1643,5 @@
\noindent
The \texttt{state} record holds the current state of the interpreter:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
record state (p: sem_params): Type[0] :=
{
@@ -632,5 +1661,5 @@
We use the function \texttt{eval\_statement} to evaluate a single joint statement:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
definition eval_statement:
$\forall$globals: list ident.$\forall$p:sem_params2 globals.
@@ -663,5 +1692,5 @@
We call this function \texttt{return} and say that it must have type $\alpha \rightarrow M \alpha$, where $M$ is the name of the monad.
In our example, the `lifting' function for the \texttt{option} monad can be implemented as:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
let return x = Some x
\end{lstlisting}
@@ -671,5 +1700,5 @@
We can see that bind `unpacks' a monadic value, applies a function after unpacking, and `repacks' the new value in the monad.
In our example, the sequencing function for the \texttt{option} monad can be implemented as:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
let bind o f =
match o with
@@ -687,5 +1716,5 @@
Here, the monad's sequencing operation ensures that cost label emissions and function calls are maintained in the correct order.
We have already seen how the \texttt{eval\_statement} function of the joint language is monadic, with type:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
definition eval_statement:
$\forall$globals: list ident.$\forall$p:sem_params2 globals.
@@ -695,5 +1724,5 @@
If we examine the body of \texttt{eval\_statement}, we may also see how the monad sequences effects.
For instance, in the case for the \texttt{LOAD} statement, we have the following:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
definition eval_statement:
$\forall$globals: list ident. $\forall$p:sem_params2 globals.
@@ -702,5 +1731,5 @@
...
match s with
- | LOAD dst addrl addrh ⇒
+ | LOAD dst addrl addrh $\Rightarrow$
! vaddrh $\leftarrow$ dph_retrieve ... st addrh;
! vaddrl $\leftarrow$ dpl_retrieve ... st addrl;
@@ -713,5 +1742,5 @@
Here, we employ a certain degree of syntactic sugaring.
The syntax
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
...
! vaddrh $\leftarrow$ dph_retrieve ... st addrh;
@@ -721,5 +1750,5 @@
is sugaring for the \texttt{IO} monad's binding operation.
We can expand this sugaring to the following much more verbose code:
-\begin{lstlisting}
+\begin{lstlisting}[language=Grafite]
...
bind (dph_retrieve ... st addrh) ($\lambda$vaddrh. bind (dpl_retrieve ... st addrl)