Index: /Deliverables/D1.2/CompilerProofOutline/outline.tex
===================================================================
 /Deliverables/D1.2/CompilerProofOutline/outline.tex (revision 1745)
+++ /Deliverables/D1.2/CompilerProofOutline/outline.tex (revision 1746)
@@ 80,6 +80,6 @@
\item syntactic proofs about the cost labelling.
\end{enumerate}
The first will support both function correctness and allow us to show
the totality of some of the compiler stages (that is, these stages of
+The first will support both functional correctness and allow us to show
+the totality of some of the compiler stages (that is, those stages of
the compiler cannot fail). The second provides the main functional
correctness result, and the last will be crucial for applying
@@ 88,5 +88,5 @@
We will also prove that a suitably labelled RTLabs trace can be turned
into a \emph{structured trace} which splits the execution trace into
costlabeltocostlabel chunks with nested function calls. This
+costlabel to costlabel chunks with nested function calls. This
structure was identified during work on the correctness of the
backend cost analysis as retaining important information about the
@@ 94,18 +94,20 @@
the compiler.
\subsection{Clight Cast removal}

This removes some casts inserted by the parser to make arithmetic
promotion explicit when they are superfluous (such as
\lstinline[language=C]'c = (short)((int)a + (int)b);').

The transformation only affects Clight expressions, recursively
detecting casts that can be safely eliminated. The semantics provides
a bigstep definition for expression, so we should be able to show a
lockstep forward simulation using a lemma showing that cast
elimination does not change the evaluation of expressions. This lemma
will follow from a structural induction on the source expression. We
have already proved a few of the underlying arithmetic results
necessary to validate the approach.
+\subsection{Clight cast removal}
+
+This transformation removes some casts inserted by the parser to make
+arithmetic promotion explicit but which are superfluous (such as
+\lstinline[language=C]'c = (short)((int)a + (int)b);' where
+\lstinline'a' and \lstinline'b' are \lstinline[language=C]'short').
+This is necessary for producing good code for our target architecture.
+
+It only affects Clight expressions, recursively detecting casts that
+can be safely eliminated. The semantics provides a bigstep
+definition for expression, so we should be able to show a lockstep
+forward simulation between otherwise identical states using a lemma
+showing that cast elimination does not change the evaluation of
+expressions. This lemma will follow from a structural induction on
+the source expression. We have already proved a few of the underlying
+arithmetic results necessary to validate the approach.
\subsection{Clight cost labelling}
@@ 113,18 +115,18 @@
This adds cost labels before and after selected statements and
expressions, and the execution traces ought to be equivalent modulo
cost labels. Hence it requires a simple forward simulation with a
limited amount of stuttering whereever a new cost label is introduced.
A bound can be given for the amount of stuttering allowed can be given
+the new cost labels. Hence it requires a simple forward simulation
+with a limited amount of stuttering whereever a new cost label is
+introduced. A bound can be given for the amount of stuttering allowed
based on the statement or continuation to be evaluated next.
We also intend to show three syntactic properties about the cost
labelling:
\begin{itemize}
+\begin{enumerate}
\item every function starts with a cost label,
\item every branching instruction is followed by a label (note that
+\item every branching instruction is followed by a cost label (note that
exiting a loop is treated as a branch), and
\item the head of every loop (and any \lstinline'goto' destination) is
a cost label.
\end{itemize}
+\end{enumerate}
These can be shown by structural induction on the source term.
@@ 143,5 +145,5 @@
\item that variables used in the generated code are present in the
resulting environment (either by checking their presence in the
 source environment, or from a list of freshly temporary variables);
+ source environment, or from a list of freshly generated temporary variables);
and
\item that all \lstinline[language=C]'goto' labels are present (by
@@ 152,11 +154,11 @@
The simulation will be similar to the relevant stages of CompCert
(Clight to Csharpminor and Csharpminor to Cminor  in the event that
the direct proof is unwieldy we could introduce a corresponding
intermediate language). During early experimentation with porting
CompCert definitions to the Matita proof assistant we found little
difficulty reproving the results for the memory model, so we plan to
port the memory injection properties and use them to relate Clight
inmemory variables with either a local variable valuation or a stack
slot, depending on how it was classified.
+the direct proof is unwieldy we could introduce an intermediate
+language corresponding to Csharpminor). During early experimentation
+with porting CompCert definitions to the Matita proof assistant we
+found little difficulty reproving the results for the memory model, so
+we plan to port the memory injection properties and use them to relate
+Clight inmemory variables with either the value of the local variable or a
+stack slot, depending on how it was classified.
This should be sufficient to show the equivalence of (bigstep)
@@ 190,5 +192,5 @@
We already enforce several invariants with dependent types: some type
safety is enforced mostly using the type information from Cminor; and
+safety, mostly shown using the type information from Cminor; and
that the graph is closed (by showing that each successor was recently
added, or corresponds to a \lstinline[language=C]'goto' label which
@@ 199,6 +201,6 @@
later stages using the same method as Cminor.
The simulation will relate Cminor states to RTLabs states about to
execute the corresponding code (with a corresponding continuation).
+The simulation will relate Cminor states to RTLabs states which are about to
+execute the code corresponding to the Cminor statement or continuation.
Each Cminor statement becomes zero or more RTLabs statements, with a
decreasing measure based on the statement and continuations similar to
@@ 209,11 +211,11 @@
The first two labelling properties remain as before; we will show that
cost labels are preserved, so the function entry point will be a cost
label, and successors that are cost labels map still map to cost
labels, preserving the condition on branches. We replace the property
for loops with the notion that we will always reach a cost label or
the end of the function by following a bounded number of successors.
This can be easily seen in Cminor using the requirement for cost
labels at the head of loops and after gotos. It remains to show that
this is preserved by the translation to RTLabs. % how?
+label, and successors to any statement that are cost labels map still
+map to cost labels, preserving the condition on branches. We replace
+the property for loops with the notion that we will always reach a
+cost label or the end of the function after following a bounded number of
+successors. This can be easily seen in Cminor using the requirement
+for cost labels at the head of loops and after gotos. It remains to
+show that this is preserved by the translation to RTLabs. % how?
\subsection{RTLabs structured trace generation}
@@ 222,29 +224,35 @@
labelling properties into the execution trace. As the function calls
are nested within the trace, we need to distinguish between
terminating and nonterminating function calls.

Thus we use the excluded middle (specialised to a function termination
property) to do this. Terminating functions are built by following
the trace, breaking it into chunks between cost labels and recursively
processing function calls. The main difficulties here are the
nonstructurally recursive nature of the function (instead we use the
size of the termination proof as a measure) and using the properties
on RTLabs cost labelling to create the correct structure. We also
show that the lower stack frames are preserved during function calls
in order to prove that after returning from a function call we execute
the correct code. This part of the work has already been constructed,
but still requires a simple proof to show that flattening the structured
+terminating and nonterminating function calls. Thus we use the
+excluded middle (specialised to a function termination property) to do
+this.
+
+Structured traces for terminating functions are built by following the
+flat trace, breaking it into chunks between cost labels and
+recursively processing function calls. The main difficulties here are
+the nonstructurally recursive nature of the function (instead we use
+the size of the termination proof as a measure) and using the RTLabs
+cost labelling properties to show that the constraints of the
+structured traces are observed. We also show that the lower stack
+frames are preserved during function calls in order to prove that
+after returning from a function call we resume execution of the
+correct code. This part of the work has already been constructed, but
+still requires a simple proof to show that flattening the structured
trace recreates the original flat trace.
The nonterminating case uses the excluded middle to decide whether to
use the function described above to construct an inductive terminating
structured trace, or a coinductive version for nontermination. It
follows the trace like the terminating version to build up chunks of
trace from costlabel to costlabel (which, by the finite distance to
a cost label property shown before, can be represented by an inductive
type), unless it encounters a nonterminating function in which case
it corecursively processes that. This part of the trace
transformation is currently under construction, and will also need a
flattening result to show that it is correct.
+The nonterminating case follows the trace like the terminating
+version to build up chunks of trace from costlabel to costlabel
+(which, by the finite distance to a cost label property shown before,
+can be represented by an inductive type). These chunks are chained
+together in a coinductive data structure that can represent
+nonterminating traces. The excluded middle is used to decide whether
+function calls terminate, in which case the function described above
+constructs an inductive terminating structured trace which is nested
+in the caller's trace. Otherwise, another coinductive constructor is
+used to embed the nonterminating trace of the callee, generated by
+corecursion. This part of the trace transformation is currently under
+construction, and will also need a flattening result to show that it
+is correct.
+
\section{Backend: RTLabs to machine code}