Changeset 3657


Timestamp: Mar 16, 2017, 12:39:35 PM
Author: mulligan
Message: more cannibalisation
Location: Papers/jar-cerco-2017
Files: 3 edited

  • Papers/jar-cerco-2017/cerco.tex

    r3656 r3657  
    7070\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
    7171\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}
     72
     73\newcommand{\cerco}{CerCo}
     74\newcommand{\ocaml}{OCaml}
     75\newcommand{\clight}{Clight}
     76\newcommand{\matita}{Matita}
     77\newcommand{\sdcc}{\texttt{sdcc}}
    7278
    7379\title{CerCo: Certified Complexity\thanks{The project CerCo acknowledges the
  • Papers/jar-cerco-2017/lst-grafite.tex

    r3622 r3657  
    88        notation, interpretation, lapply, repeat, try, as, clear, in, change, whd, exists},
    99morekeywords={[2]include},
     10keywordstyle=\color{blue}\bfseries,
    1011%emph={[1]Type, Prop, nat, real}, emphstyle={[1]\textit},
    1112% literate=     
     
    150151captionpos=b,
    151152mathescape=true,
    152 keywordstyle=\bfseries,
    153 keywordstyle=[2]\bfseries,
    154 keywordstyle=[3]\bfseries,
     153%keywordstyle=\bfseries,
     154%keywordstyle=[2]\bfseries,
     155%keywordstyle=[3]\bfseries,
    155156%backgroundcolor=\color{gray},
    156157frame=tblr,
  • Papers/jar-cerco-2017/proof.tex

    r3656 r3657  
    88\section{Compiler proof}
    99\label{sect.compiler.proof}
     10
     11\section{Introduction}
     12
     13The \cerco{} compiler produces a version of the source code containing
     14annotations describing the timing behaviour of the object code, as
     15well as the object code itself. It compiles C code, targeting
     16microcontrollers implementing the Intel 8051 architecture.  There are
     17two versions: first, an initial prototype was implemented in
     18\ocaml{}~\cite{d2.2}, then a version was formalised in the \matita{}
     19proof assistant~\cite{d3.2,d4.2} and extracted to \ocaml{} code to
     20produce an executable compiler.  In this document we present results
     21from Deliverable 3.4, the formalised proofs in \matita{} about the
     22front-end of the latter version of the compiler (culminating in the
     23\lstinline'front_end_correct' lemma), and describe how that fits
     24into the verification of the whole compiler.
     25
     26A key part of this work was to layer the \emph{intensional} correctness
     27results that show that the costs produced are correct on top of the
     28proofs about the compiled code's \emph{extensional} behaviour (that is, the
     29functional correctness of the compiler).  Unfortunately, the ambitious
     30goal of completely verifying the entire compiler was not feasible
     31within the time available, but thanks to this separation of
     32extensional and intensional proofs we are able to axiomatise some extensional
     33simulation results which are similar to those in other compiler verification
     34projects and concentrate on the novel intensional proofs.  We were
     35also able to add stack space costs to obtain a stronger result.  The
     36proofs were made more tractable by introducing compile-time checks for
     37the `sound and precise' cost labelling properties rather than proving
     38that they are preserved throughout.
     39
     40The overall statement of correctness says that the annotated program has the
     41same behaviour as the input, and that for any suitably well-structured part of
      42the execution (which we call \emph{measurable}), the object code will exhibit
      43the same behaviour, taking precisely the time given by the cost annotations in
     44the annotated source program.
     45
     46In the next section we recall the structure of the compiler and make the overall
     47statement more precise.  Following that, in Section~\ref{sec:fegoals} we
     48describe the statements we need to prove about the intermediate \textsf{RTLabs}
     49programs for the back-end proofs.
     50Section~\ref{sec:inputtolabelling} covers the compiler passes which produce the
     51annotated source program and Section~\ref{sec:measurablelifting} the rest
     52of the transformations in the front-end.  Then the compile-time checks
     53for good cost labelling are detailed in Section~\ref{sec:costchecks}
     54and the proofs that the structured traces required by the back-end
     55exist are discussed in Section~\ref{sec:structuredtrace}.
     56
     57\section{The compiler and its correctness statement}
     58
     59The uncertified prototype \ocaml{} \cerco{} compiler was originally described
     60in Deliverables 2.1 and 2.2.  Its design was replicated in the formal
     61\matita{} code, which was presented in Deliverables 3.2 and 4.2, for
     62the front-end and back-end respectively.
     63
     64\begin{figure}
     65\begin{center}
     66\includegraphics[width=0.5\linewidth]{compiler-plain.pdf}
     67\end{center}
     68\caption{Languages in the \cerco{} compiler}
     69\label{fig:compilerlangs}
     70\end{figure}
     71
      72The compiler uses a number of intermediate languages, as outlined in the
     73middle two lines of Figure~\ref{fig:compilerlangs}.  The upper line
     74represents the front-end of the compiler, and the lower the back-end,
     75finishing with Intel 8051 binary code.  Not all of the front-end compiler passes
     76introduce a new language, and Figure~\ref{fig:summary} presents a
     77list of every pass involved.
     78
     79\begin{figure}
     80\begin{center}
     81\begin{minipage}{.8\linewidth}
     82\begin{tabbing}
     83\quad \= $\downarrow$ \quad \= \kill
     84\textsf{C} (unformalised)\\
     85\> $\downarrow$ \> CIL parser (unformalised \ocaml)\\
     86\textsf{Clight}\\
     87%\> $\downarrow$ \> add runtime functions\\
     88\> $\downarrow$ \> \lstinline[language=C]'switch' removal\\
     89\> $\downarrow$ \> labelling\\
     90\> $\downarrow$ \> cast removal\\
     91\> $\downarrow$ \> stack variable allocation and control structure
     92 simplification\\
     93\textsf{Cminor}\\
     94%\> $\downarrow$ \> generate global variable initialization code\\
     95\> $\downarrow$ \> transform to RTL graph\\
     96\textsf{RTLabs}\\
     97\> $\downarrow$ \> check cost labelled properties of RTL graph\\
     98\> $\downarrow$ \> start of target specific back-end\\
     99\>\quad \vdots
     100\end{tabbing}
     101\end{minipage}
     102\end{center}
     103\caption{Front-end languages and compiler passes}
     104\label{fig:summary}
     105\end{figure}
     106
     107\label{page:switchintro}
     108The annotated source code is produced by the cost labelling phase.
     109Note that there is a pass to replace C \lstinline[language=C]'switch'
     110statements before labelling --- we need to remove them because the
     111simple form of labelling used in the formalised compiler is not quite
     112capable of capturing their execution time costs, largely due to C's
     113`fall-through' behaviour where execution from one branch continues in
     114the next unless there is an explicit \lstinline[language=C]'break'.
     115
     116The cast removal phase which follows cost labelling simplifies
     117expressions to prevent unnecessary arithmetic promotion, which is
     118specified by the C standard but costly for an 8-bit target.  The
      119transformations to \textsf{Cminor} and subsequently \textsf{RTLabs}
     120bear considerable resemblance to some passes of the CompCert
     121compiler~\cite{Blazy-Leroy-Clight-09,Leroy-backend}, although we use a simpler \textsf{Cminor} where
     122all loops use \lstinline[language=C]'goto' statements, and the
     123\textsf{RTLabs} language retains a target-independent flavour.  The
     124back-end takes \textsf{RTLabs} code as input.
     125
     126The whole compilation function returns the following information on success:
     127\begin{lstlisting}[language=Grafite]
     128  record compiler_output : Type[0] :=
     129   { c_labelled_object_code: labelled_object_code
     130   ; c_stack_cost: stack_cost_model
     131   ; c_max_stack: nat
     132   ; c_init_costlabel: costlabel
     133   ; c_labelled_clight: clight_program
     134   ; c_clight_cost_map: clight_cost_map
     135   }.
     136\end{lstlisting}
     137It consists of annotated 8051 object code, a mapping from function
     138identifiers to the function's stack space usage, the space available for the
     139stack after global variable allocation, a cost label covering the
     140execution time for the initialisation of global variables and the call
     141to the \lstinline[language=C]'main' function, the annotated source
     142code, and finally a mapping from cost labels to actual execution time
     143costs.
     144
     145An \ocaml{} pretty printer is used to provide a concrete version of
     146the output code and annotated source code.  In the case of the
     147annotated source code, it also inserts the actual costs alongside the
     148cost labels, and optionally adds a global cost variable and
     149instrumentation to support further reasoning in external tools such as
     150Frama-C.
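
As a rough sketch of what this instrumentation amounts to (not the actual pretty printer, whose output we do not reproduce here), each cost label can be replaced by an update of a global cost counter with the value the cost map assigns to that label; the emitted statement and the helper name below are invented for illustration:
\begin{lstlisting}[language=Caml]
(* Sketch only: the emitted C fragment and the name __cost_incr are
   invented; the real printer works on the annotated Clight AST. *)
let instrument_label (cost_map : string -> int) (label : string) : string =
  Printf.sprintf "__cost_incr(%d); /* %s */" (cost_map label) label
\end{lstlisting}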
     151
     152\subsection{Revisions to the prototype compiler}
     153
     154Our focus on intensional properties prompted us to consider whether we
     155could incorporate stack space into the costs presented to the user.
     156We only allocate one fixed-size frame per function, so modelling this
     157was relatively simple.  It is the only form of dynamic memory
     158allocation provided by the compiler, so we were able to strengthen the
     159statement of the goal to guarantee successful execution whenever the
     160stack space obeys the \lstinline'c_max_stack' bound calculated by
     161subtracting the global variable requirements from the total memory
     162available.
     163
     164The cost labelling checks at the end of Figure~\ref{fig:summary} have been
     165introduced to reduce the proof burden, and are described in
     166Section~\ref{sec:costchecks}.
     167
     168The use of dependent types to capture simple intermediate language
     169invariants makes every front-end pass a total function, except
     170\textsf{Clight} to \textsf{Cminor} and the cost checks.  Hence various
     171well-formedness and type safety checks are performed only once between
     172\textsf{Clight} and \textsf{Cminor}, and the invariants rule out any
     173difficulties in the later stages.  With the benefit of hindsight we
     174would have included an initial checking phase to produce a
     175`well-formed' variant of \textsf{Clight}, conjecturing that this would
     176simplify various parts of the proofs for the \textsf{Clight} stages
     177which deal with potentially ill-formed code.
     178
     179Following D2.2, we previously generated code for global variable
     180initialisation in \textsf{Cminor}, for which we reserved a cost label
     181to represent the execution time for initialisation.  However, the
     182back-end must also add an initial call to the main function, whose
     183cost must also be accounted for, so we decided to move the
     184initialisation code to the back-end and merge the costs.
     185
     186\subsection{Main correctness statement}
     187
     188Informally, our main intensional result links the time difference in a source
     189code execution to the time difference in the object code, expressing the time
     190for the source by summing the values for the cost labels in the trace, and the
      191time for the target by a clock built into the 8051 executable semantics.
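
A minimal sketch of the source-side clock, assuming a cost map from labels to cycle counts and the list of labels emitted by the subtrace (the names are ours, not those of the formalisation):
\begin{lstlisting}[language=Caml]
(* Source-side time of a subtrace: sum the costs of the labels it emits. *)
type costlabel = string

let source_time (cost_map : costlabel -> int) (labels : costlabel list) : int =
  List.fold_left (fun acc l -> acc + cost_map l) 0 labels
\end{lstlisting}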
     192
      193The availability of precise timing information for 8051
     194implementations and the design of the compiler allow it to give exact
     195time costs in terms of processor cycles, not just upper bounds.
     196However, these exact results are only available if the subtrace we
     197measure starts and ends at suitable points.  In particular, pure
     198computation with no observable effects may be reordered and moved past
     199cost labels, so we cannot measure time between arbitrary statements in
     200the program.
     201
     202There is also a constraint on the subtraces that we
     203measure due to the requirements of the correctness proof for the
     204object code timing analysis.  To be sure that the timings are assigned
     205to the correct cost label, we need to know that each return from a
     206function call must go to the correct return address.  It is difficult
     207to observe this property locally in the object code because it relies
     208on much earlier stages in the compiler.  To convey this information to
      209the timing analysis, extra structure is imposed on the subtraces, which
     210is described in Section~\ref{sec:fegoals}.
     211
     212% Regarding the footnote, would there even be much point?
     213% TODO: this might be quite easy to add ('just' subtract the
     214% measurable subtrace from the second label to the end).  Could also
     215% measure other traces in this manner.
     216These restrictions are reflected in the subtraces that we give timing
     217guarantees on; they must start at a cost label and end at the return
     218of the enclosing function of the cost label\footnote{We expect that
     219  this would generalise to more general subtraces by subtracting costs
     220  for unwanted measurable suffixes of a measurable subtrace.}.  A
     221typical example of such a subtrace is the execution of an entire
     222function from the cost label at the start of the function until it
      223returns.  We call any such subtrace \emph{measurable} if it (and
     224the prefix of the trace from the start to the subtrace) can also be
     225executed within the available stack space.
     226
     227Now we can give the main intensional statement for the compiler.
     228Given a \emph{measurable} subtrace for a labelled \textsf{Clight}
     229program, there is a subtrace of the 8051 object code program where the
     230time differences match.  Moreover, \emph{observable} parts of the
     231trace also match --- these are the appearance of cost labels and
     232function calls and returns.
     233
     234
     235
     236More formally, the definition of this statement in \matita{} is
     237\begin{lstlisting}[language=Grafite]
     238definition simulates :=
     239  $\lambda$p: compiler_output.
     240  let initial_status := initialise_status $...$ (cm (c_labelled_object_code $...$ p)) in
     241  $\forall$m1,m2.
     242   measurable Clight_pcs (c_labelled_clight $...$ p) m1 m2
     243       (stack_sizes (c_stack_cost $...$ p)) (c_max_stack $...$ p) $\rightarrow$
     244  $\forall$c1,c2.
     245   clock_after Clight_pcs (c_labelled_clight $...$ p) m1 (c_clight_cost_map $...$ p) = OK $...$ c1 $\rightarrow$
     246   clock_after Clight_pcs (c_labelled_clight $...$ p) (m1+m2) (c_clight_cost_map $...$ p) = OK $...$ c2 $\rightarrow$
     247  $\exists$n1,n2.
     248   observables Clight_pcs (c_labelled_clight $...$ p) m1 m2 =
     249     observables (OC_preclassified_system (c_labelled_object_code $...$ p))
     250          (c_labelled_object_code $...$ p) n1 n2
     251  $\wedge$
     252   clock ?? (execute (n1+n2) ? initial_status) =
     253     clock ?? (execute n1 ? initial_status) + (c2-c1).
     254\end{lstlisting}
     255where the \lstinline'measurable', \lstinline'clock_after' and
     256\lstinline'observables' definitions are generic definitions for multiple
     257languages; in this case the \lstinline'Clight_pcs' record applies them
     258to \textsf{Clight} programs.
     259
     260There is a second part to the statement, which says that the initial
     261processing of the input program to produce the cost labelled version
     262does not affect the semantics of the program:
     263% Yes, I'm paraphrasing the result a tiny bit to remove the observe non-function
     264\begin{lstlisting}[language=Grafite]
     265  $\forall$input_program,output.
     266  compile input_program = return output $\rightarrow$
     267  not_wrong $...$ (exec_inf $...$ clight_fullexec input_program) $\rightarrow$
     268  sim_with_labels
     269   (exec_inf $...$ clight_fullexec input_program)
     270   (exec_inf $...$ clight_fullexec (c_labelled_clight $...$ output))
     271\end{lstlisting}
     272That is, any successful compilation produces a labelled program that
     273has identical behaviour to the original, so long as there is no
     274`undefined behaviour'.
     275
     276Note that this statement provides full functional correctness, including
     277preservation of (non-)termination.  The intensional result above does
     278not do this directly --- it does not guarantee the same result or same
     279termination.  There are two mitigating factors, however: first, to
     280prove the intensional property you need local simulation results --- these
      281can be pieced together to form full behavioural equivalence; only time
     282constraints have prevented us from doing so.  Second, if we wish to
     283confirm a result, termination, or non-termination we could add an
     284observable witness, such as a function that is only called if the
     285correct result is given.  The intensional result guarantees that the
     286observable witness is preserved, so the program must behave correctly.
     287
      288These two results are combined in the \lstinline'correct'
     289theorem in the file \lstinline'correctness.ma'.
     290
     291\section{Correctness statement for the front-end}
     292\label{sec:fegoals}
     293
     294The essential parts of the intensional proof were outlined during work
     295on a toy compiler in Task
     2962.1~\cite{d2.1,springerlink:10.1007/978-3-642-32469-7_3}.  These are
     297\begin{enumerate}
     298\item functional correctness, in particular preserving the trace of
     299  cost labels,
     300\item the \emph{soundness} and \emph{precision} of the cost labelling
     301  on the object code, and
     302\item the timing analysis on the object code produces a correct
     303  mapping from cost labels to time.
     304\end{enumerate}
     305
     306However, that toy development did not include function calls.  For the
     307full \cerco{} compiler we also need to maintain the invariant that
     308functions return to the correct program location in the caller, as we
     309mentioned in the previous section.  During work on the back-end timing
      310analysis (described in more detail in the companion deliverable, D4.4)
     311the notion of a \emph{structured trace} was developed to enforce this
      312return property, and most of the cost labelling properties as well.
     313
     314\begin{figure}
     315\begin{center}
     316\includegraphics[width=0.5\linewidth]{compiler.pdf}
     317\end{center}
     318\caption{The compiler and proof outline}
     319\label{fig:compiler}
     320\end{figure}
     321
     322Jointly, we generalised the structured traces to apply to any of the
      323intermediate languages which have some notion of a program counter.  This means
      324that they are introduced part way through the compiler; see
     325Figure~\ref{fig:compiler}.  Proving that a structured trace can be
     326constructed at \textsf{RTLabs} has several virtues:
     327\begin{itemize}
     328\item This is the first language where every operation has its own
     329  unique, easily addressable, statement.
     330\item Function calls and returns are still handled implicitly in the
     331  language and so the structural properties are ensured by the
     332  semantics.
     333\item Many of the back-end languages from \textsf{RTL} onwards share a common
     334  core set of definitions, and using structured traces throughout
     335  increases this uniformity.
     336\end{itemize}
     337
     338\begin{figure}
     339\begin{center}
     340\includegraphics[width=0.6\linewidth]{strtraces.pdf}
     341\end{center}
     342\caption{Nesting of functions in structured traces}
     343\label{fig:strtrace}
     344\end{figure}
     345A structured trace is a mutually inductive data type which
     346contains the steps from a normal program trace, but arranged into a
     347nested structure which groups entire function calls together and
     348aggregates individual steps between cost labels (or between the final
     349cost label and the return from the function), see
     350Figure~\ref{fig:strtrace}.  This captures the idea that the cost labels
     351only represent costs \emph{within} a function --- calls to other
     352functions are accounted for in the nested trace for their execution, and we
     353can locally regard function calls as a single step.
     354
     355These structured traces form the core part of the intermediate results
     356that we must prove so that the back-end can complete the main
     357intensional result stated above.  In full, we provide the back-end
     358with
     359\begin{enumerate}
     360\item A normal trace of the \textbf{prefix} of the program's execution
     361  before reaching the measurable subtrace.  (This needs to be
     362  preserved so that we know that the stack space consumed is correct,
     363  and to set up the simulation results.)
     364\item The \textbf{structured trace} corresponding to the measurable
     365  subtrace.
     366\item An additional property about the structured trace that no
     367  `program counter' is \textbf{repeated} between cost labels.  Together with
     368  the structure in the trace, this takes over from showing that
     369  cost labelling is sound and precise.
     370\item A proof that the \textbf{observables} have been preserved.
     371\item A proof that the \textbf{stack limit} is still observed by the prefix and
      372  the structured trace.  (This is largely a consequence of the
     373  preservation of observables.)
     374\end{enumerate}
     375The \lstinline'front_end_correct' lemma in the
     376\lstinline'correctness.ma' file provides a record containing these.
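
Informally, the record can be pictured as follows (an OCaml caricature, not the Matita definition: the last three fields are genuine propositions in the formalisation and appear here only as placeholders):
\begin{lstlisting}[language=Caml]
type ('step, 'strace, 'obs) front_end_evidence = {
  prefix_trace     : 'step list;  (* execution up to the measurable subtrace *)
  structured_trace : 'strace;     (* the measurable subtrace in nested form *)
  pc_not_repeated  : bool;        (* stands for: no pc repeats between labels *)
  observables_ok   : bool;        (* stands for: observables are preserved *)
  stack_ok         : bool;        (* stands for: the stack limit is respected *)
}
\end{lstlisting}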
     377
     378Following the outline in Figure~\ref{fig:compiler}, we will first deal
     379with the transformations in \textsf{Clight} that produce the source
     380program with cost labels, then show that measurable traces can be
     381lifted to \textsf{RTLabs}, and finally show that we can construct the
     382properties listed above ready for the back-end proofs.
     383
     384\section{Input code to cost labelled program}
     385\label{sec:inputtolabelling}
     386
     387As explained on page~\pageref{page:switchintro}, the costs of complex
     388C \lstinline[language=C]'switch' statements cannot be represented with
     389the simple labelling used in the formalised compiler.  Our first pass
     390replaces these statements with simpler C code, allowing our second
     391pass to perform the cost labelling.  We show that the behaviour of
     392programs is unchanged by these passes using forward
     393simulations\footnote{All of our languages are deterministic, which can
     394be seen directly from their executable definitions.  Thus we know that
     395forward simulations are sufficient because the target cannot have any
     396other behaviour.}.
     397
     398\subsection{Switch removal}
     399
     400We compile away \lstinline[language=C]'switch' statements into more
     401basic \textsf{Clight} code.
      402Note that this transformation does not necessarily degrade the
     403efficiency of the generated code. For instance, compilers such as GCC
     404introduce balanced trees of ``if-then-else'' constructs for small
     405switches. However, our implementation strategy is much simpler. Let
     406us consider the following input statement.
     407
     408\begin{lstlisting}[language=C]
     409   switch(e) {
     410   case v1:
     411     stmt1;
     412   case v2:
     413     stmt2;
     414   default:
     415     stmt_default;
     416   }
     417\end{lstlisting}
     418
     419Note that \textsf{stmt1}, \textsf{stmt2}, \ldots \textsf{stmt\_default}
     420may contain \lstinline[language=C]'break' statements, which have the
     421effect of exiting the switch statement. In the absence of break, the
     422execution falls through each case sequentially. In our implementation,
      423we produce an equivalent sequence of ``if-then'' blocks chained by gotos:
     424\begin{lstlisting}[language=C]
     425   fresh = e;
     426   if(fresh == v1) {
     427     $\llbracket$stmt1$\rrbracket$;
     428     goto lbl_case2;
     429   };
     430   if(fresh == v2) {
     431     lbl_case2:
     432     $\llbracket$stmt2$\rrbracket$;
      433     goto lbl_default;
     434   };
      435   lbl_default: $\llbracket$stmt_default$\rrbracket$;
     436   exit_label:
     437\end{lstlisting}
     438
     439The proof had to tackle the following points:
     440\begin{itemize}
     441\item the source and target memories are not the same (due to the fresh variable),
     442\item the flow of control is changed in a non-local way (e.g. \textbf{goto}
     443  instead of \textbf{break}).
     444\end{itemize}
     445In order to tackle the first point, we implemented a version of memory
     446extensions similar to those of CompCert.
     447
     448For the simulation we decided to prove a sufficient amount to give us
     449confidence in the definitions and approach, but to curtail the proof
     450because this pass does not contribute to the intensional correctness
      451result.  We tackled several simple cases that do not interact with
     452the switch removal per se, to show that the definitions were usable,
     453and part of the switch case to check that the approach is
     454reasonable. This comprises propagating the memory extension through
     455each statement (except switch), as well as various invariants that are
     456needed for the switch case (in particular, freshness hypotheses). The
     457details of the evaluation process for the source switch statement and
     458its target counterpart can be found in the file
      459\lstinline'switchRemoval.ma', along with more details on the transformation
     460itself.
     461
     462Proving the correctness of the second point would require reasoning on the
     463semantics of \lstinline[language=C]'goto' statements. In the \textsf{Clight}
     464semantics, this is implemented as a function-wide lookup of the target label.
     465The invariant we would need is the fact that a global label lookup on a freshly
     466created goto is equivalent to a local lookup. This would in turn require the
     467propagation of some freshness hypotheses on labels. As discussed,
     468we decided to omit this part of the correctness proof.
     469
     470\subsection{Cost labelling}
     471
     472The simulation for the cost labelling pass is the simplest in the
     473front-end.  The main argument is that any step of the source program
     474is simulated by the same step of the labelled one, plus any extra
     475steps for the added cost labels.  The extra instructions do not change
     476the memory or local environments, although we have to keep track of
     477the extra instructions that appear in continuations, for example
     478during the execution of a \lstinline[language=C]'while' loop.
     479
     480We do not attempt to capture any cost properties of the labelling\footnote{We describe how the cost properties are
     481established in Section~\ref{sec:costchecks}.} in
     482the simulation proof, which allows the proof to be oblivious to the choice
     483of cost labels.  Hence we do not have to reason about the threading of
     484name generation through the labelling function, greatly reducing the
     485amount of effort required.
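
To illustrate the name-generation threading that the proof avoids reasoning about, here is a toy labelling function over a drastically simplified statement type (ours, not Clight): it inserts a fresh cost label after each branch of a conditional and at the head of each loop body, threading an integer counter for freshness.
\begin{lstlisting}[language=Caml]
type stmt =
  | Skip
  | Seq   of stmt * stmt
  | If    of string * stmt * stmt   (* the condition is kept abstract *)
  | While of string * stmt
  | Cost  of string * stmt          (* a cost label scoping a statement *)

let fresh (k : int) : string = Printf.sprintf "_cost%d" k

let rec label (k : int) (s : stmt) : int * stmt =
  match s with
  | Skip -> (k, Skip)
  | Seq (a, b) ->
      let k, a = label k a in
      let k, b = label k b in
      (k, Seq (a, b))
  | If (c, a, b) ->
      let k, a = label k a in
      let k, b = label k b in
      (k + 2, If (c, Cost (fresh k, a), Cost (fresh (k + 1), b)))
  | While (c, a) ->
      let k, a = label k a in
      (k + 1, While (c, Cost (fresh k, a)))
  | Cost (l, a) ->
      let k, a = label k a in
      (k, Cost (l, a))
\end{lstlisting}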
     486
     487%TODO: both give one-step-sim-by-many forward sim results; switch
     488%removal tricky, uses aux var to keep result of expr, not central to
     489%intensional correctness so curtailed proof effort once reasonable
     490%level of confidence in code gained; labelling much simpler; don't care
     491%what the labels are at this stage, just need to know when to go
     492%through extra steps.  Rolled up into a single result with a cofixpoint
     493%to obtain coinductive statement of equivalence (show).
     494
     495\section{Finding corresponding measurable subtraces}
     496\label{sec:measurablelifting}
     497
     498There follow the three main passes of the front-end:
     499\begin{enumerate}
     500\item simplification of casts in \textsf{Clight} code
     501\item \textsf{Clight} to \textsf{Cminor} translation, performing stack
     502  variable allocation and simplifying control structures
     503\item transformation to \textsf{RTLabs} control flow graph
     504\end{enumerate}
     505We have taken a common approach to
     506each pass: first we build (or axiomatise) forward simulation results
     507that are similar to normal compiler proofs, but which are slightly more
     508fine-grained so that we can see that the call structure and relative
      509placement of cost labels are preserved.
     510
     511Then we instantiate a general result which shows that we can find a
     512\emph{measurable} subtrace in the target of the pass that corresponds
     513to the measurable subtrace in the source.  By repeated application of
     514this result we can find a measurable subtrace of the execution of the
     515\textsf{RTLabs} code, suitable for the construction of a structured
     516trace (see Section~\ref{sec:structuredtrace}).  This is essentially an
     517extra layer on top of the simulation proofs that provides us with the
     518additional information required for our intensional correctness proof.
     519
     520\subsection{Generic measurable subtrace lifting proof}
     521
     522Our generic proof is parametrised on a record containing small-step
     523semantics for the source and target language, a classification of
     524states (the same form of classification is used when defining
     525structured traces), a simulation relation which respects the
      526classification and cost labelling, and
     527four simulation results.  The simulations are split by the starting state's
     528classification and whether it is a cost label, which will allow us to
     529observe that the call structure is preserved.  They are:
     530\begin{enumerate}
     531\item a step from a `normal' state (which is not classified as a call
     532  or return) which is not a cost label is simulated by zero or more
     533  `normal' steps;
     534\item a step from a `call' state followed by a cost label step is
     535  simulated by a step from a `call' state, a corresponding label step,
     536  then zero or more `normal' steps;
     537\item a step from a `call' state not followed by a cost label
     538  similarly (note that this case cannot occur in a well-labelled
     539  program, but we do not have enough information locally to exploit
     540  this); and
     541\item a cost label step is simulated by a cost label step.
     542\end{enumerate}
     543Finally, we need to know that a successfully translated program will
     544have an initial state in the simulation relation with the original
     545program's initial state.
     546
     547The back-end has similar requirements for lifting simulations to
     548structured traces.  Fortunately, our treatment of calls and returns
     549can be slightly simpler because we have special call and return states
     550that correspond to function entry and return that are separate from
     551the actual instructions.  This was originally inherited from our port
     552of CompCert's \textsf{Clight} semantics, but proves useful here
     553because we only need to consider adding extra steps \emph{after} a
      554call or return state, since the instruction step deals with extra
     555steps that occur before.  The back-end makes all of the call and
     556return machinery explicit, and thus needs more complex statements
     557about extra steps before and after each call and return.
     558
     559\begin{figure}
     560\begin{center}
     561\includegraphics[width=0.5\linewidth]{meassim.pdf}
     562\end{center}
     563\caption{Tiling of simulation for a measurable subtrace}
     564\label{fig:tiling}
     565\end{figure}
     566
     567To find the measurable subtrace in the target program's execution we
     568walk along the original program's execution trace applying the
     569appropriate simulation result by induction on the number of steps.
     570While the number of steps taken varies, the overall structure is
     571preserved, as illustrated in Figure~\ref{fig:tiling}.  By preserving
     572the structure we also maintain the same intensional observables.  One
     573delicate point is that the cost label following a call must remain
     574directly afterwards\footnote{The prototype compiler allowed some
     575  straight-line code to appear before the cost label until a later
     576  stage of the compiler, but we must move the requirement forward to
     577  fit with the structured traces.}
     578% Damn it, I should have just moved the cost label forwards in RTLabs,
     579% like the prototype does in RTL to ERTL; the result would have been
     580% simpler.  Or was there some reason not to do that?
     581(both in the program code and in the execution trace), even if we
     582introduce extra steps, for example to store parameters in memory in
     583\textsf{Cminor}.  Thus we have a version of the call simulation
     584that deals with both the call and the cost label in one result.
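
The tiling itself is conceptually simple; as a sketch (ignoring the call-plus-label special case just described, and using invented names), each source step is mapped to the list of target steps supplied by the relevant simulation result, and the induction concatenates them while keeping the observables of both sides equal:
\begin{lstlisting}[language=Caml]
(* Requires OCaml >= 4.10 for List.concat_map / List.filter_map. *)
let tile (simulate_step : 'a -> 'b list) (src : 'a list) : 'b list =
  List.concat_map simulate_step src

let observables_of (obs : 'a -> 'o option) (steps : 'a list) : 'o list =
  List.filter_map obs steps

(* The invariant maintained along the induction: intensional observables match. *)
let preserves_observables obs_src obs_tgt simulate_step src =
  observables_of obs_src src = observables_of obs_tgt (tile simulate_step src)
\end{lstlisting}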
     585
     586In addition to the subtrace we are interested in measuring, we must
     587prove that the earlier part of the trace is also preserved in
     588order to use the simulation from the initial state.  This proof also
     589guarantees that we do not run out of stack space before the subtrace
     590we are interested in.  The lemmas for this prefix and the measurable
     591subtrace are similar, following the pattern above.  However, the
     592measurable subtrace also requires us to rebuild the termination
     593proof.  This is defined recursively:
     594\label{prog:terminationproof}
     595\begin{lstlisting}[language=Grafite]
     596  let rec will_return_aux C (depth:nat)
     597                               (trace:list (cs_state $...$ C $\times$ trace)) on trace : bool :=
     598  match trace with
     599  [ nil $\Rightarrow$ false
     600  | cons h tl $\Rightarrow$
     601    let $\langle$s,tr$\rangle$ := h in
     602    match cs_classify C s with
     603    [ cl_call $\Rightarrow$ will_return_aux C (S depth) tl
     604    | cl_return $\Rightarrow$
     605        match depth with
     606        [ O $\Rightarrow$ match tl with [ nil $\Rightarrow$ true | _ $\Rightarrow$ false ]
     607        | S d $\Rightarrow$ will_return_aux C d tl
     608        ]
     609    | _ $\Rightarrow$ will_return_aux C depth tl
     610    ]
     611  ].
     612\end{lstlisting}
     613The \lstinline'depth' is the number of return states we need to see
     614before we have returned to the original function (initially zero) and
      615\lstinline'trace' the measurable subtrace obtained from running
     616the semantics for the correct number of steps.  This definition
     617unfolds tail recursively for each step, and once the corresponding
     618simulation result has been applied a new one for the target can be
     619asserted by unfolding and applying the induction hypothesis on the
     620shorter trace.
     621
     622Combining the lemmas about the prefix and the measurable subtrace
     623requires a little care because the states joining the two might not be
     624related in the simulation.  In particular, if the measurable subtrace
     625starts from the cost label at the beginning of the function there may
     626be some extra instructions in the target code to execute to complete
     627function entry before the states are back in the relation.  Hence we
     628carefully phrased the lemmas to allow for such extra steps.
     629
      630Together, these give us an overall result for any simulation fitting the
     631requirements above (contained in the \lstinline'meas_sim' record):
     632\begin{lstlisting}[language=Grafite]
     633theorem measured_subtrace_preserved :
     634  $\forall$MS:meas_sim.
     635  $\forall$p1,p2,m,n,stack_cost,max.
     636  ms_compiled MS p1 p2 $\rightarrow$
     637  measurable (ms_C1 MS) p1 m n stack_cost max $\rightarrow$
     638  $\exists$m',n'.
     639    measurable (ms_C2 MS) p2 m' n' stack_cost max $\wedge$
     640    observables (ms_C1 MS) p1 m n = observables (ms_C2 MS) p2 m' n'.
     641\end{lstlisting}
     642The stack space requirement that is embedded in \lstinline'measurable'
     643is a consequence of the preservation of observables, because it is
     644determined by the functions called and returned from, which are observable.
     645
     646\subsection{Simulation results for each pass}
     647
     648We now consider the simulation results for the passes, each of which
     649is used to instantiate the
     650\lstinline[language=Grafite]'measured_subtrace_preserved' theorem to
     651construct the measurable subtrace for the next language.
     652
     653\subsubsection{Cast simplification}
     654
     655The parser used in \cerco{} introduces a lot of explicit type casts.
     656If left as they are, these constructs can greatly hamper the
     657quality of the generated code -- especially as the architecture
     658we consider is an $8$-bit one. In \textsf{Clight}, casts are
     659expressions. Hence, most of the work of this transformation
     660proceeds on expressions. The transformation proceeds by recursively
     661trying to coerce an expression to a particular integer type, which
     662is in practice smaller than the original one. This functionality
      663is implemented by two mutually recursive functions whose signatures
      664are as follows.
     665
     666\begin{lstlisting}[language=Grafite]
     667let rec simplify_expr (e:expr) (target_sz:intsize) (target_sg:signedness)
     668  : $\Sigma$result:bool$\times$expr.
     669    $\forall$ge,en,m. simplify_inv ge en m e (\snd result) target_sz target_sg (\fst result) := $\ldots$
     670
     671and simplify_inside (e:expr) : $\Sigma$result:expr. conservation e result := $\ldots$
     672\end{lstlisting}
     673
      674The function \textsf{simplify\_inside} acts as a wrapper for
     675\textsf{simplify\_expr}. Whenever \textsf{simplify\_inside} encounters
     676a \textsf{Ecast} expression, it tries to coerce the sub-expression
     677to the desired type using \textsf{simplify\_expr}, which tries to
     678perform the actual coercion. In return, \textsf{simplify\_expr} calls
     679back \textsf{simplify\_inside} in some particular positions, where we
     680decided to be conservative in order to simplify the proofs. However,
      681the current design allows us to incrementally revert to a more aggressive
     682version, by replacing recursive calls to \textsf{simplify\_inside} by
     683calls to \textsf{simplify\_expr} \emph{and} proving the corresponding
     684invariants -- where possible.
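
The following toy version (ours, on an invented three-constructor expression type, ignoring the many side conditions of the real pass) shows the shape of the recursion and of the boolean-flagged result returned by \textsf{simplify\_expr}:
\begin{lstlisting}[language=Caml]
type size = I8 | I16 | I32

type expr =
  | Const of size * int
  | Cast  of size * expr
  | Add   of size * expr * expr

let fits (sz : size) (n : int) : bool =
  match sz with
  | I8  -> -128 <= n && n < 128
  | I16 -> -32768 <= n && n < 32768
  | I32 -> true

(* Returns a success flag with the (possibly) simplified expression,
   echoing the bool x expr result type of simplify_expr. *)
let rec simplify_expr (target : size) (e : expr) : bool * expr =
  match e with
  | Const (_, n) when fits target n -> (true, Const (target, n))
  | Cast (sz, e') when sz = target ->
      (match simplify_expr target e' with
       | true, e'' -> (true, e'')    (* the cast is absorbed *)
       | false, _  -> (true, e))     (* the cast itself already has the target size *)
  | Add (_, a, b) ->
      let ok_a, a' = simplify_expr target a in
      let ok_b, b' = simplify_expr target b in
      if ok_a && ok_b then (true, Add (target, a', b')) else (false, e)
  | _ -> (false, e)
\end{lstlisting}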
     685
     686The \textsf{simplify\_inv} invariant encodes either the conservation
     687of the semantics during the transformation corresponding to the failure
     688of the coercion (\textsf{Inv\_eq} constructor), or the successful
     689downcast of the considered expression to the target type
     690(\textsf{Inv\_coerce\_ok}).
     691
     692\begin{lstlisting}[language=Grafite]
     693inductive simplify_inv
     694  (ge : genv) (en : env) (m : mem)
     695  (e1 : expr) (e2 : expr) (target_sz : intsize) (target_sg : signedness) : bool $\rightarrow$ Prop :=
     696| Inv_eq : $\forall$result_flag. $\ldots$
     697     simplify_inv ge en m e1 e2 target_sz target_sg result_flag
     698| Inv_coerce_ok : $\forall$src_sz,src_sg.
     699     typeof e1 = Tint src_sz src_sg $\rightarrow$
     700     typeof e2 = Tint target_sz target_sg $\rightarrow$     
     701     smaller_integer_val src_sz target_sz src_sg (exec_expr ge en m e1) (exec_expr ge en m e2) $\rightarrow$
     702     simplify_inv ge en m e1 e2 target_sz target_sg true.
     703\end{lstlisting}
     704
     705The \textsf{conservation} invariant for \textsf{simplify\_inside} simply states the conservation
     706of the semantics, as in the \textsf{Inv\_eq} constructor of the previous
     707invariant.
     708
     709\begin{lstlisting}[language=Grafite]
     710definition conservation := $\lambda$e,result. $\forall$ge,en,m.
     711          res_sim ? (exec_expr ge en m e) (exec_expr ge en m result)
     712       $\wedge$ res_sim ? (exec_lvalue ge en m e) (exec_lvalue ge en m result)
     713       $\wedge$ typeof e = typeof result.
     714\end{lstlisting}
     715
     716This invariant is then easily lifted to statement evaluations.
     717The main problem encountered with this particular pass was dealing with
     718inconsistently typed programs, a canonical case being a particular
     719integer constant of a certain size typed with another size. This
     720prompted the need to introduce numerous type checks, making
     721both the implementation and the proof more complex, even though more
     722comprehensive checks are made in the next stage.
     723%\todo{Make this a particular case of the more general statement on baking more invariants in the Clight language}
     724
     725\subsubsection{Clight to Cminor}
     726
     727This pass is the last one operating on the \textsf{Clight} language.
     728Its input is a full \textsf{Clight} program, and its output is a
     729\textsf{Cminor} program. Note that we do not use an equivalent of
     730CompCert's \textsf{C\#minor} language: we translate directly to a
      731variant of \textsf{Cminor}. This has the advantage of requiring
      732neither the special loop constructs nor the explicit block
     733structure. Another salient point of our approach is that a significant
     734number of the properties needed for the simulation proof were directly
     735encoded in dependently typed translation functions.  In particular,
     736freshness conditions and well-typedness conditions are included. The
     737main effects of the transformation from \textsf{Clight} to
     738\textsf{Cminor} are listed below.
     739
     740\begin{itemize}
     741\item Variables are classified as being either globals, stack-allocated
     742  locals or potentially register-allocated locals. The value of register-allocated
     743  local variables is moved out of the modelled memory and stored in a
     744  dedicated environment.
     745\item In \textsf{Clight}, each local variable has a dedicated memory block, whereas
     746  stack-allocated locals are bundled together on a function-by-function basis.
     747\item Loops are converted to jumps.
     748\end{itemize}
     749
      750The first two points require memory injections which are more flexible than those
     751needed in the switch removal case. In the remainder of this section, we briefly
     752discuss our implementation of memory injections, and then the simulation proof.
     753
     754\paragraph{Memory injections.}
     755
     756Our memory injections are modelled after the work of Blazy \& Leroy.
     757However, the corresponding paper is based on the first version of the
     758CompCert memory model~\cite{2008-Leroy-Blazy-memory-model}, whereas we use a much more concrete model, allowing byte-level
     759manipulations (as in the later version of CompCert's memory model). We proved
     760roughly 80 \% of the required lemmas. Notably, some of the difficulties encountered were
     761due to overly relaxed conditions on pointer validity (fixed during development).
     762Some more side conditions had to be added to take care of possible overflows when converting
     763from \textbf{Z} block bounds to $16$ bit pointer offsets (in practice, such overflows
     764only occur in edge cases that are easily ruled out -- but this fact is not visible
     765in memory injections). Concretely, some of the lemmas on the preservation of simulation of
     766loads after writes were axiomatised, due to a lack of time.
     767
     768\paragraph{Simulation proof.}
     769
     770We proved the simulation result for expressions and a representative
     771selection of statements.  In particular we tackled
     772\lstinline[language=C]'while' statements to ensure that we correctly
     773translate loops because our approach differs from CompCert by
     774converting directly to \textsf{Cminor} \lstinline[language=C]'goto's
     775rather than maintaining a notion of loop in \textsf{Cminor}.  We also have a partial
     776proof for function entry, covering the setup of the memory injection,
     777but not function exit.  Exits, and the remaining statements, have been
     778axiomatised.
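
As a toy illustration of this difference (not the formalised translation; the statement types are invented), a \lstinline[language=C]'while' loop becomes a labelled entry point, a conditional jump to an exit label and a back jump:
\begin{lstlisting}[language=Caml]
(* Invented Cminor-like statement type with labels and gotos. *)
type cminor_stmt =
  | CSkip
  | CSeq of cminor_stmt * cminor_stmt
  | CIfGoto of string * string        (* if <condition> then goto <label> *)
  | CGoto of string
  | CLabel of string * cminor_stmt

(* while (cond) body  ~~>  entry: if (!cond) goto exit; body; goto entry; exit: *)
let compile_while (n : int) (cond : string) (body : cminor_stmt) : cminor_stmt =
  let entry = Printf.sprintf "_loop%d" n
  and exit_ = Printf.sprintf "_exit%d" n in
  CLabel (entry,
    CSeq (CIfGoto ("!(" ^ cond ^ ")", exit_),
    CSeq (body,
    CSeq (CGoto entry,
          CLabel (exit_, CSkip)))))
\end{lstlisting}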
     779
     780Careful management of the proof state was required because proof terms
     781are embedded in \textsf{Cminor} code to show that invariants are
     782respected.  These proof terms appear in the proof state when inverting
     783the translation functions, and they can be large and awkward.  While
     784generalising them away is usually sufficient, it can be difficult when
     785they appear under a binder.
     786
     787%The correctness proof for this transformation was not completed. We proved the
     788%simulation result for expressions and for some subset of the critical statement cases.
     789%Notably lacking are the function entry and exit, where the memory injection is
     790%properly set up. As would be expected, a significant amount of work has to be performed
     791%to show the conservation of all invariants at each simulation step.
     792
     793%\todo{list cases, explain while loop, explain labeling problem}
     794
     795\subsubsection{Cminor to RTLabs}
     796
     797The translation from \textsf{Cminor} to \textsf{RTLabs} is a fairly
     798routine control flow graph (CFG) construction.  As such, we chose to
     799axiomatise the associated extensional simulation results.  However, we did prove several
     800properties of the generated programs:
     801\begin{itemize}
     802\item All statements are type correct with respect to the declared
     803  pseudo-register type environment.
     804\item The CFG is closed, and has a distinguished entry node and a
     805  unique exit node.
     806\end{itemize}
     807
     808These properties rely on similar properties about type safety and the
     809presence of \lstinline[language=C]'goto'-labels for \textsf{Cminor} programs
     810which are checked at the preceding stage.  As a result, this
     811transformation is total and any compilation failures must occur when
     812the corresponding \textsf{Clight} source is available and a better
     813error message can be generated.
     814
     815The proof obligations for these properties include many instances of
     816graph inclusion.  We automated these proofs using a small amount of
     817reflection, making the obligations much easier to handle.  One
     818drawback to enforcing invariants throughout is that temporarily
     819breaking them can be awkward.  For example, \lstinline'return'
     820statements were originally used as placeholders for
     821\lstinline[language=C]'goto' destinations that had not yet been
     822translated.  However, this made establishing the single exit node
     823property rather difficult, and a different placeholder was chosen
     824instead.  In other circumstances it is possible to prove a more
      825complex invariant and then simplify it at the end of the transformation.
     826
     827\section{Checking cost labelling properties}
     828\label{sec:costchecks}
     829
     830Ideally, we would provide proofs that the cost labelling pass always
     831produces programs that are soundly and precisely labelled and that
     832each subsequent pass preserves these properties.  This would match our
     833use of dependent types to eliminate impossible sources of errors
     834during compilation, in particular retaining intermediate language type
     835information.
     836
     837However, given the limited amount of time available we realised that
     838implementing a compile-time check for a sound and precise labelling of
     839the \textsf{RTLabs} intermediate code would reduce the proof burden
     840considerably.  This is similar in spirit to the use of translation
     841validation in certified compilation, which makes a similar trade-off
     842between the potential for compile-time failure and the volume of proof
     843required.
     844
     845The check cannot be pushed into a later stage of the compiler because
     846much of the information is embedded into the structured traces.
     847However, if an alternative method was used to show that function
     848returns in the compiled code are sufficiently well-behaved, then we
     849could consider pushing the cost property checks into the timing
     850analysis itself.  We leave this as a possible area for future work.
     851
     852\subsection{Implementation and correctness}
     853\label{sec:costchecksimpl}
     854
     855For a cost labelling to be sound and precise we need a cost label at
     856the start of each function, after each branch and at least one in
     857every loop.  The first two parts are trivial to check by examining the
     858code.  In \textsf{RTLabs} the last part is specified by saying
     859that there is a bound on the number of successive instruction nodes in
     860the CFG that you can follow before you encounter a cost label, and
     861checking this is more difficult.
     862
     863The implementation progresses through the set of nodes in the graph,
     864following successors until a cost label is found or a label-free cycle
     865is discovered (in which case the property does not hold and we return
     866an error).  This is made easier by the prior knowledge that every
     867successor of a branch instruction is a cost label, so we do not need
     868to search each branch.  When a label is found, we remove the chain of
     869program counters from the set and continue from another node in the
     870set until it is empty, at which point we know that there is a bound
     871for every node in the graph.
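
A simplified sketch of this check (not the Matita implementation; the graph interface is invented) follows. It relies on the fact noted above that all successors of a branch are cost labels, so only the first successor of each node needs to be chased:
\begin{lstlisting}[language=Caml]
module NodeSet = Set.Make (Int)

type cfg = {
  nodes      : int list;            (* CFG nodes, identified by integers *)
  successors : int -> int list;     (* hypothetical successor function *)
  is_label   : int -> bool;         (* does this node carry a cost label? *)
}

(* Follow successors until a cost label is met, returning the chain of
   visited nodes, or None if a label-free cycle is closed. *)
let check_bounded (g : cfg) : bool =
  let rec chase seen n =
    let seen = NodeSet.add n seen in
    if g.is_label n then Some seen
    else
      match g.successors n with
      | [] -> Some seen                          (* e.g. a return instruction *)
      | s :: _ -> if NodeSet.mem s seen then None else chase seen s
  in
  let rec loop pending =
    match NodeSet.choose_opt pending with
    | None -> true                               (* every node has a bound *)
    | Some n ->
        (match chase NodeSet.empty n with
         | None -> false                         (* found a label-free cycle *)
         | Some chain -> loop (NodeSet.diff pending chain))
  in
  loop (NodeSet.of_list g.nodes)
\end{lstlisting}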
     872
     873Directly reasoning about the function that implements this procedure would be
     874rather awkward, so an inductive specification of a single step of its
     875behaviour was written and proved to match the implementation.  This
     876was then used to prove the implementation sound and complete.
     877
      878While we have not attempted to prove that the cost labelling properties
     879are established and preserved earlier in the compiler, we expect that
     880the effort for the \textsf{Cminor} to \textsf{RTLabs} stage alone
     881would be similar to the work outlined above, because it involves the
     882change from requiring a cost label at particular positions to
     883requiring cost labels to break loops in the CFG.  As there are another
     884three passes to consider (including the labelling itself), we believe
     885that using the check above is much simpler overall.
     886
     887% TODO? Found some Clight to Cminor bugs quite quickly
     888
     889\section{Existence of a structured trace}
     890\label{sec:structuredtrace}
     891
     892The \emph{structured trace} idea introduced in
     893Section~\ref{sec:fegoals} enriches the execution trace of a program by
     894nesting function calls in a mixed-step style and embedding the cost
     895labelling properties of the program.  See Figure~\ref{fig:strtrace} on
     896page~\pageref{fig:strtrace} for an illustration of a structured trace.
     897It was originally designed to support the proof of correctness for the
     898timing analysis of the object code in the back-end, then generalised
     899to provide a common structure to use from the end of the front-end to
     900the object code.
     901
     902To make the definition generic we abstract over the semantics of the
     903language,
     904\begin{lstlisting}[language=Grafite]
     905record abstract_status : Type[1] :=
     906  { as_status :> Type[0]
     907  ; as_execute : relation as_status
     908  ; as_pc : DeqSet
     909  ; as_pc_of : as_status $\rightarrow$ as_pc
     910  ; as_classify : as_status $\rightarrow$ status_class
     911  ; as_label_of_pc : as_pc $\rightarrow$ option costlabel
     912  ; as_after_return : ($\Sigma$s:as_status. as_classify s =  cl_call) $\rightarrow$ as_status $\rightarrow$ Prop
     913  ; as_result: as_status $\rightarrow$ option int
     914  ; as_call_ident : ($\Sigma$s:as_status.as_classify s = cl_call) $\rightarrow$ ident
     915  ; as_tailcall_ident : ($\Sigma$s:as_status.as_classify s = cl_tailcall) $\rightarrow$ ident
     916  }.
     917\end{lstlisting}
     918which requires a type of states, an execution relation\footnote{All of
     919  our semantics are executable, but using a relation was simpler in
     920  the abstraction.}, some notion of abstract
     921program counter with decidable equality, the classification of states,
     922and functions to extract the observable intensional information (cost
     923labels and the identity of functions that are called).  The
     924\lstinline'as_after_return' property links the state before a function
     925call with the state after return, providing the evidence that
     926execution returns to the correct place.  The precise form varies
      927between stages; in \textsf{RTLabs} it insists that the CFG, the pointer to
      928the CFG node to execute next, and some call stack information are
      929preserved.
     930
     931The structured traces are defined using three mutually inductive
     932types.  The core data structure is \lstinline'trace_any_label', which
     933captures some straight-line execution until the next cost label or the
      934return from the enclosing function.  Each function call is embedded as
      935a single step, with its own trace nested inside and the before and
     936after states linked by \lstinline'as_after_return'; and states
     937classified as a `jump' (in particular branches) must be followed by a
     938cost label.
     939
     940The second type, \lstinline'trace_label_label', is a
     941\lstinline'trace_any_label' where the initial state is cost labelled.
     942Thus a trace in this type identifies a series of steps whose cost is
     943entirely accounted for by the label at the start.
     944
     945Finally, \lstinline'trace_label_return' is a sequence of
     946\lstinline'trace_label_label' values which end in the return from the
     947function.  These correspond to a measurable subtrace, and in
     948particular include executions of an entire function call (and so are
     949used for the nested calls in \lstinline'trace_any_label').
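
The shape of the three types can be rendered in OCaml roughly as follows (a caricature: the real Matita definitions are indexed by states and carry the cost labelling and \lstinline'as_after_return' evidence elided here):
\begin{lstlisting}[language=Caml]
type state = Placeholder of int    (* stands for an abstract machine state *)

type trace_any_label =
  | Tal_base of state                                         (* reaches a label or the return *)
  | Tal_step of state * trace_any_label                       (* one ordinary instruction step *)
  | Tal_call of state * trace_label_return * trace_any_label  (* a call, with its nested trace *)

and trace_label_label =
  | Tll of state * trace_any_label          (* the initial state is cost labelled *)

and trace_label_return =
  | Tlr_base of trace_label_label                             (* a final labelled section *)
  | Tlr_step of trace_label_label * trace_label_return        (* followed by more sections *)
\end{lstlisting}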
     950
     951\subsection{Construction}
     952
     953The construction of the structured trace replaces syntactic cost
     954labelling properties, which place requirements on where labels appear
     955in the program, with semantic properties that constrain the execution
     956traces of the program.  The construction begins by defining versions
     957of the sound and precise labelling properties on states and global
     958environments (for the code that appears in each of them) rather than
     959whole programs, and showing that these are preserved by steps of the
     960\textsf{RTLabs} semantics.
     961
     962Then we show that each cost labelling property required by the
     963definition of structured traces is locally satisfied.  These proofs are
     964broken up by the classification of states.  Similarly, we prove a
     965step-by-step stack preservation result, which states that the
     966\textsf{RTLabs} semantics never changes the lower parts of the stack.
     967
     968The core part of the construction of a structured trace is to use the
     969proof of termination from the measurable trace (defined on
     970page~\pageref{prog:terminationproof}) to `fold up' the execution into
      971the nested form.  The results outlined above fill in the proof
      972obligations for the cost labelling properties, while the stack
      973preservation result shows that calls return to the correct location.
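
The flavour of this `folding up' can be conveyed by a much simplified
\ocaml{} sketch: in a flat, terminating run in which calls and returns are
marked, each call opens a nested sub-trace that is closed by the matching
return.  The event and trace types below are invented placeholders, and none
of the proof obligations of the real construction are modelled.
\begin{lstlisting}
type event = Step | Call | Return

type nested =
  | NStep                  (* an ordinary step *)
  | NCall of nested list   (* a call together with its folded sub-trace *)

(* [fold_body evs] folds the events of one function body, returning the
   nested trace of that body together with the events after its return. *)
let rec fold_body (evs : event list) : nested list * event list =
  match evs with
  | [] -> ([], [])                 (* execution has ended at top level *)
  | Return :: rest -> ([], rest)   (* this function body is finished *)
  | Call :: rest ->
      let callee, after_return = fold_body rest in
      let tail, remaining = fold_body after_return in
      (NCall callee :: tail, remaining)
  | Step :: rest ->
      let tail, remaining = fold_body rest in
      (NStep :: tail, remaining)
\end{lstlisting}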
     974
     975The structured trace alone is not sufficient to capture the property
     976that the program is soundly labelled.  While the structured trace
     977guarantees termination, it still permits a loop to be executed a
      978finite number of times without encountering a cost label.  We
      979eliminate this possibility by proving that no `program counter' repeats
      980within any \lstinline'trace_any_label' section: a repetition would be
      981incompatible with the property (from Section~\ref{sec:costchecksimpl})
      982that there is a bound on the number of successor instructions that can
      983be followed in the CFG before a cost label is encountered.
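
The following \ocaml{} fragment sketches the kind of bounded search that this
property licenses: from a given node, every path of successors must reach a
cost label within a fixed bound.  The \texttt{successors} and
\texttt{is\_labelled} functions are hypothetical accessors for the CFG, not
the names used in the formalisation.
\begin{lstlisting}
(* Hypothetical sketch: check that from [node] every path through the CFG
   reaches a cost label within [bound] successor steps. *)
let rec reaches_label ~successors ~is_labelled bound node =
  if is_labelled node then true
  else if bound = 0 then false
  else
    List.for_all
      (reaches_label ~successors ~is_labelled (bound - 1))
      (successors node)
\end{lstlisting}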
     984
     985\subsubsection{Complete execution structured traces}
     986
     987The development of the construction above started relatively early,
     988before the measurable subtrace preservation proofs.  To be confident
     989that the traces were well-formed at that time, we also developed a
     990complete execution form that embeds the traces above.  This includes
     991non-terminating program executions, where an infinite number of the terminating
     992structured traces are embedded.  This construction confirmed that our
     993definition of structured traces was consistent, although we later
     994found that we did not require the whole execution version for the
     995compiler correctness results.
     996
     997To construct these we need to know whether each function call will
     998eventually terminate, requiring the use of the excluded middle.  This
     999classical reasoning is local to the construction of whole program
     1000traces and is not necessary for our main results.
     1001
     1002\section{Conclusion}
     1003
     1004In combination with the work on the CerCo back-end and by
     1005concentrating on the novel intensional parts of the proof, we have
     1006shown that it is possible to construct certifying compilers that
     1007correctly report execution time and stack space costs.  The layering
     1008of intensional correctness proofs on top of normal simulation results
     1009provides a useful separation of concerns, and could permit the reuse
     1010of existing results.
     1011
     1012\section{Files}
     1013
     1014The following table gives a high-level overview of the \matita{}
     1015source files in Deliverable 3.4:
     1016
     1017\bigskip
     1018
     1019\begin{tabular}{rp{.7\linewidth}}
     1020\lstinline'compiler.ma' & Top-level compiler definitions, in particular
     1021\lstinline'front_end', and the whole compiler definition
     1022\lstinline'compile'. \\
      1023\lstinline'correctness.ma' & Correctness results: \lstinline'front_end_correct'
      1024and \lstinline'correct', for the front-end and the whole compiler respectively. \\
     1025\lstinline'Clight/*' & \textsf{Clight}: proofs for switch
     1026removal, cost labelling, cast simplification and conversion to
     1027\textsf{Cminor}. \\
     1028\lstinline'Cminor/*' & \textsf{Cminor}: axioms of conversion to
     1029\textsf{RTLabs}. \\
     1030\lstinline'RTLabs/*' & \textsf{RTLabs}: definitions and proofs for
     1031compile-time cost labelling checks, construction of structured traces.
     1032\\
     1033\lstinline'common/Measurable.ma' & Definitions for measurable
     1034subtraces. \\
     1035\lstinline'common/FEMeasurable.ma' & Generic measurable subtrace
     1036lifting proof. \\
     1037\lstinline'common/*' & Other common definitions relevant to many parts
     1038of the compiler and proof. \\
     1039\lstinline'utilities/*' & General purpose definitions used throughout,
     1040including extensions to the standard \matita{} library.
     1041\end{tabular}
    101042
    111043\subsection{A brief overview of the backend compilation chain}
     
    801112
    811113Our abstraction takes the following form:
    82 \begin{lstlisting}
     1114\begin{lstlisting}[language=Grafite]
    831115inductive joint_instruction (p: params__) (globals: list ident): Type[0] :=
    841116  | COMMENT: String $\rightarrow$ joint_instruction p globals
     
    861118  | INT: generic_reg p $\rightarrow$ Byte $\rightarrow$ joint_instruction p globals
    871119  ...
    88   | OP1: Op1 $
    89 ightarrow$ acc_a_reg p $
    90 ightarrow$ acc_a_reg p $
    91 ightarrow$ joint_instruction p globals
     1120  | OP1: Op1 $\rightarrow$ acc_a_reg p $\rightarrow$ acc_a_reg p $\rightarrow$ joint_instruction p globals
    921121  ...
    931122  | extension: extend_statements p $\rightarrow$ joint_instruction p globals.
     
     1061135As \texttt{p} varies between intermediate languages, we can provide language-specific extensions to the syntax of the joint language.
    1071136For example, ERTL's extended syntax consists of the following extra statements:
    108 \begin{lstlisting}
     1137\begin{lstlisting}[language=Grafite]
    1091138inductive ertl_statement_extension: Type[0] :=
    1101139  | ertl_st_ext_new_frame: ertl_statement_extension
     
    1131142\end{lstlisting}
    1141143These are further packaged into an ERTL specific instance of \texttt{params\_\_} as follows:
    115 \begin{lstlisting}
     1144\begin{lstlisting}[language=Grafite]
    1161145definition ertl_params__: params__ :=
    1171146  mk_params__ register register ... ertl_statement_extension.
     
    1291158
    1301159Our joint internal function record looks like so:
    131 \begin{lstlisting}
     1160\begin{lstlisting}[language=Grafite]
    1321161record joint_internal_function (globals: list ident) (p:params globals) : Type[0] :=
    1331162{
     
    1501179The `joint' syntax makes this especially clear.
    1511180For instance, in the definition:
    152 \begin{lstlisting}
     1181\begin{lstlisting}[language=Grafite]
    1531182inductive joint_instruction (p:params__) (globals: list ident): Type[0] :=
    1541183  ...
     
    2321261For example, in the \texttt{RTLabs} to \texttt{RTL} transformation pass, many functions only `make sense' when lists of registers passed to them as arguments conform to some specific length.
     2331262For instance, the \texttt{translate\_negint} function, which translates the unary negation of an integer:
    234 \begin{lstlisting}
     1263\begin{lstlisting}[language=Grafite]
    2351264definition translate_negint :=
    2361265  $\lambda$globals: list ident.
     
    2511280Practically, we would always like to ensure that the entry and exit labels are present in the statement graph.
    2521281We ensure that this is so with a dependent sum type in the \texttt{joint\_internal\_function} record, which all graph based languages specialise to obtain their own internal function representation:
    253 \begin{lstlisting}
     1282\begin{lstlisting}[language=Grafite]
    2541283record joint_internal_function (globals: list ident) (p: params globals): Type[0] :=
    2551284{
     
    3891418We begin the abstraction process with the \texttt{params\_\_} record.
    3901419This holds the types of the representations of the different register varieties in the intermediate languages:
    391 \begin{lstlisting}
     1420\begin{lstlisting}[language=Grafite]
    3921421record params__: Type[1] :=
    3931422{
     
    4211450
     4221451As mentioned in the report for Deliverable D4.2, the record \texttt{params\_\_} is enough to specify the instructions of the joint languages:
    423 \begin{lstlisting}
     1452\begin{lstlisting}[language=Grafite]
    4241453inductive joint_instruction (p: params__) (globals: list ident): Type[0] :=
    4251454  | COMMENT: String $\rightarrow$ joint_instruction p globals
     
     4351464Naturally, as some intermediate languages are graph based and others linearised, the act of passing control on to the `successor' instruction is either the act of following an edge in a control flow graph or that of incrementing an index into a list.
     4361465We distinguish between instructions that pass control on to their immediate successor and those that jump elsewhere in the program through the use of \texttt{succ}, denoting the immediate successor of the current instruction, in the \texttt{params\_} record described below.
    437 \begin{lstlisting}
     1466\begin{lstlisting}[language=Grafite]
    4381467record params_: Type[1] :=
    4391468{
     
    4441473The type \texttt{succ} corresponds to labels, in the case of control flow graph based languages, or is instantiated to the unit type for the linearised language, LIN.
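
To make this parameterisation concrete, the following hypothetical \ocaml{}
sketch (with invented names; the real definition of \texttt{joint\_statement}
is given below) shows a statement type that is parametric in its successor
type, instantiated with CFG labels for the graph-based languages and with the
unit type for LIN.
\begin{lstlisting}
(* Hypothetical sketch: the successor type is a parameter of the statement
   type, so graph-based languages use CFG labels while LIN uses unit and
   control simply falls through to the next element of the list. *)
type label = Label of int

type 'succ statement =
  | Sequential of string * 'succ   (* an instruction and its successor *)
  | Goto of label
  | Return

type graph_statement = label statement   (* RTL, ERTL, LTL, ... *)
type lin_statement = unit statement      (* LIN *)
\end{lstlisting}
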
     4451474Using \texttt{params\_} we can define statements of the joint language:
    446 \begin{lstlisting}
     1475\begin{lstlisting}[language=Grafite]
    4471476inductive joint_statement (p:params_) (globals: list ident): Type[0] :=
    4481477  | sequential: joint_instruction p globals $\rightarrow$ succ p $\rightarrow$ joint_statement p globals
     
     4571486For the semantics, we need further parameterised types.
    4581487In particular, we parameterise the result and parameter type of an internal function call in \texttt{params0}:
    459 \begin{lstlisting}
     1488\begin{lstlisting}[language=Grafite]
    4601489record params0: Type[1] :=
    4611490{
     
    4681497
    4691498We further extend \texttt{params0} with a type for local variables in internal function calls:
    470 \begin{lstlisting}
     1499\begin{lstlisting}[language=Grafite]
    4711500record params1 : Type[1] :=
    4721501{
     
    4781507Further, we hypothesise a generic method for looking up the next instruction in the graph, called \texttt{lookup}.
    4791508Note that \texttt{lookup} may fail, and returns an \texttt{option} type:
    480 \begin{lstlisting}
     1509\begin{lstlisting}[language=Grafite]
    4811510record params (globals: list ident): Type[1] :=
    4821511{
     
    4921521In particular, we have a description of the result, parameters and the local variables of a function.
     4931522Note also that we have lifted the hypothesised \texttt{lookup} function from \texttt{params} into a dependent sigma type, which combines a label (the entry and exit points of the control flow graph or list) with a proof that the label is in the graph structure:
    494 \begin{lstlisting}
     1523\begin{lstlisting}[language=Grafite]
    4951524record joint_internal_function (globals: list ident) (p:params globals) : Type[0] :=
    4961525{
     
     5091538The reason is that some intermediate languages share a host of parameters, and only differ in others.
     5101539For instance, when instantiating the ERTL language, certain parameters are shared with RTL, whilst others are ERTL-specific:
    511 \begin{lstlisting}
     1540\begin{lstlisting}[language=Grafite]
    5121541...
    5131542definition ertl_params__: params__ :=
     
    5161545...
    5171546definition ertl_params1: params1 := rtl_ertl_params1 ertl_params0.
    518 definition ertl_params: globals. params globals := rtl_ertl_params ertl_params0.
     1547definition ertl_params: $\forall$globals. params globals := rtl_ertl_params ertl_params0.
    5191548...
    5201549definition ertl_statement := joint_statement ertl_params_.
     
    5241553\end{lstlisting}
     5251554Here, \texttt{rtl\_ertl\_params1} provides the parameters common to the ERTL and RTL languages:
    526 \begin{lstlisting}
     1555\begin{lstlisting}[language=Grafite]
    5271556definition rtl_ertl_params1 := $\lambda$pars0. mk_params1 pars0 (list register).
    5281557\end{lstlisting}
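
As an illustration of this sharing, a rough \ocaml{} analogue (with invented
names and deliberately simplified records) is sketched below: the common
builder is a function of the language-specific base record, and each language
obtains its own parameters by applying it.
\begin{lstlisting}
(* Rough OCaml analogue of sharing parameters between RTL and ERTL; the
   record contents are invented placeholders, only the layering matters. *)
type params0 = { acc_reg : string; pair_reg : string }
type params1 = { base : params0; local_vars : string list }

(* the part common to RTL and ERTL, parametric in the base record *)
let rtl_ertl_params1 (p0 : params0) : params1 = { base = p0; local_vars = [] }

(* each language instantiates the shared builder with its own base record *)
let rtl_params0  : params0 = { acc_reg = "rtl_acc";  pair_reg = "rtl_pair" }
let ertl_params0 : params0 = { acc_reg = "ertl_acc"; pair_reg = "ertl_pair" }

let rtl_params1  = rtl_ertl_params1 rtl_params0
let ertl_params1 = rtl_ertl_params1 ertl_params0
\end{lstlisting}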
    5291558
    5301559The record \texttt{more\_sem\_params} bundles together functions that store and retrieve values in various forms of register:
    531 \begin{lstlisting}
     1560\begin{lstlisting}[language=Grafite]
    5321561record more_sem_params (p:params_): Type[1] :=
    5331562{
     
    5651594
     5661595We extend \texttt{more\_sem\_params} with yet more parameters via \texttt{more\_sem\_params1}:
    567 \begin{lstlisting}
     1596\begin{lstlisting}[language=Grafite]
    5681597record more_sem_params1 (globals: list ident) (p: params globals) : Type[1] :=
    5691598{
     
    6051634We bundle \texttt{params} and \texttt{sem\_params} together into a single record.
    6061635This will be used in the function \texttt{eval\_statement} which executes a single statement of the joint language:
    607 \begin{lstlisting}
     1636\begin{lstlisting}[language=Grafite]
    6081637record sem_params2 (globals: list ident): Type[1] :=
    6091638{
     
    6141643\noindent
    6151644The \texttt{state} record holds the current state of the interpreter:
    616 \begin{lstlisting}
     1645\begin{lstlisting}[language=Grafite]
    6171646record state (p: sem_params): Type[0] :=
    6181647{
     
    6321661
    6331662We use the function \texttt{eval\_statement} to evaluate a single joint statement:
    634 \begin{lstlisting}
     1663\begin{lstlisting}[language=Grafite]
    6351664definition eval_statement:
    6361665  $\forall$globals: list ident.$\forall$p:sem_params2 globals.
     
    6631692We call this function \texttt{return} and say that it must have type $\alpha \rightarrow M \alpha$, where $M$ is the name of the monad.
    6641693In our example, the `lifting' function for the \texttt{option} monad can be implemented as:
    665 \begin{lstlisting}
     1694\begin{lstlisting}[language=Grafite]
    6661695let return x = Some x
    6671696\end{lstlisting}
     
    6711700We can see that bind `unpacks' a monadic value, applies a function after unpacking, and `repacks' the new value in the monad.
    6721701In our example, the sequencing function for the \texttt{option} monad can be implemented as:
    673 \begin{lstlisting}
     1702\begin{lstlisting}[language=Grafite]
    6741703let bind o f =
    6751704  match o with
     
    6871716Here, the monad's sequencing operation ensures that cost label emissions and function calls are maintained in the correct order.
    6881717We have already seen how the \texttt{eval\_statement} function of the joint language is monadic, with type:
    689 \begin{lstlisting}
     1718\begin{lstlisting}[language=Grafite]
    6901719definition eval_statement:
    6911720  $\forall$globals: list ident.$\forall$p:sem_params2 globals.
     
    6951724If we examine the body of \texttt{eval\_statement}, we may also see how the monad sequences effects.
    6961725For instance, in the case for the \texttt{LOAD} statement, we have the following:
    697 \begin{lstlisting}
     1726\begin{lstlisting}[language=Grafite]
    6981727definition eval_statement:
    6991728  $\forall$globals: list ident. $\forall$p:sem_params2 globals.
     
    7021731  ...
    7031732  match s with
    704   | LOAD dst addrl addrh
     1733  | LOAD dst addrl addrh $\Rightarrow$
    7051734    ! vaddrh $\leftarrow$ dph_retrieve ... st addrh;
    7061735    ! vaddrl $\leftarrow$ dpl_retrieve ... st addrl;
     
    7131742Here, we employ a certain degree of syntactic sugaring.
    7141743The syntax
    715 \begin{lstlisting}
     1744\begin{lstlisting}[language=Grafite]
    7161745  ...
    7171746! vaddrh $\leftarrow$ dph_retrieve ... st addrh;
     
    7211750is sugaring for the \texttt{IO} monad's binding operation.
    7221751We can expand this sugaring to the following much more verbose code:
    723 \begin{lstlisting}
     1752\begin{lstlisting}[language=Grafite]
    7241753  ...
    7251754  bind (dph_retrieve ... st addrh) ($\lambda$vaddrh. bind (dpl_retrieve ... st addrl)