Changeset 3173

Timestamp: Apr 23, 2013, 6:30:54 PM
Author: campbell
Message: A little reworking of 3.4.

File: 1 edited
  • Deliverables/D3.4/Report/report.tex

    r3168 r3173  
    100100\vspace*{7cm}
    101101\paragraph{Abstract}
    102 We report on the correctness proofs for the front-end of the \cerco{} cost
    103 lifting compiler, considering three distinct parts of the task: showing that
    104 the \emph{annotated source code} output by the compiler has equivalent
    105 behaviour to the original input (up to the annotations); showing that a
    106 \emph{measurable} subtrace of the annotated source code corresponds to an
    107 equivalent measurable subtrace in the code produced by the front-end, including
    108 costs; and finally showing that the enriched \emph{structured} execution traces
    109 required for cost correctness in the back-end can be constructed from the
     102We report on the correctness proofs for the front-end of the \cerco{}
     103cost lifting compiler.  First, we identify the core result we wish to
     104prove, which says that we correctly predict the precise execution
     105time for particular parts of the execution called \emph{measurable}
     106subtraces.  Then we consider the three distinct parts of the task:
     107showing that the \emph{annotated source code} output by the compiler
     108has equivalent behaviour to the original input (up to the
     109annotations); showing that a measurable subtrace of the
     110annotated source code corresponds to an equivalent measurable subtrace
     111in the code produced by the front-end, including costs; and finally
     112showing that the enriched \emph{structured} execution traces required
     113for cost correctness in the back-end can be constructed from the
    110114properties of the code produced by the front-end.
    111115
     
    113117that we get consistent cost measurements throughout the intermediate languages
    114118of the compiler can be layered on top of normal forward simulation results,
    115 if we split them into local call-structure preserving results.
    116 
    117 This deliverable shows correctness results about the formalised compiler
    118 described in D3.2, using the source language semantics from D3.1 and
    119 intermediate language semantics from D3.3.  Together with the companion
    120 deliverable about the correctness of the back-end, D4.4, we obtain results
    121 about the whole formalised compiler.
     119if we split those results into local call-structure preserving simulations.
     120
     121This deliverable shows correctness results about the formalised
     122compiler described in D3.2, using the source language semantics from
     123D3.1 and intermediate language semantics from D3.3.  It builds on
     124earlier work on a toy compiler built to test the labelling approach in
     125D2.1. Together with the companion deliverable about the correctness of
     126the back-end, D4.4, we obtain results about the whole formalised
     127compiler.
    122128
    123129% TODO: mention the deliverable about the extracted compiler et al?
     
    137143translation validation on sound, precise labelling properties.}
    138144
    139 The \cerco{} compiler compiles C code targeting microcontrollers implementing
    140 the Intel 8051 architecture, which produces both the object code and source
    141 code containing annotations describing the timing behavior of the object code.
    142 There are two versions: first, an initial prototype implemented in
    143 \ocaml{}~\cite{d2.2}, and a version formalised in the \matita{} proof
    144 assistant~\cite{d3.2,d4.2} and then extracted to \ocaml{} code to produce an
    145 executable compiler.  In this document we present results formalised in
    146 \matita{} about the front-end of that version of the compiler, and how that fits
     145The \cerco{} compiler compiles C code, targeting microcontrollers
     146implementing the Intel 8051 architecture.  It produces both the object
     147code and source code containing annotations describing the timing
     148behaviour of the object code.  There are two versions: first, an
     149initial prototype implemented in \ocaml{}~\cite{d2.2}, and a version
     150formalised in the \matita{} proof assistant~\cite{d3.2,d4.2} and then
     151extracted to \ocaml{} code to produce an executable compiler.  In this
     152document we present results formalised in \matita{} about the
     153front-end of the latter version of the compiler, and how that fits
    147154into the verification of the whole compiler.
    148155
    149 \todo{maybe mention stack space here?  other additions?  refer to "layering"?}
    150 A key part of this work was to separate the proofs about the compiled code's
    151 extensional behaviour (that is, the functional correctness of the compiler)
    152 from the intensional correctness that the costs given are correct.
    153 Unfortunately, the ambitious goal of completely verifying the compiler was not
    154 feasible within the time available, but thanks to this separation of extensional
    155 and intensional proofs we are able to axiomatize simulation results similar to
    156 those in other compiler verification projects and concentrate on the novel
    157 intensional proofs.  The proofs were also made more tractable by introducing
    158 compile-time checks for the `sound and precise' cost labelling properties
    159 rather than proving that they are preserved throughout.
     156\todo{maybe mention stack space here?  other additions?  refer to
     157  "layering"?}  A key part of this work was to separate the proofs
     158about the compiled code's extensional behaviour (that is, the
     159functional correctness of the compiler) from the intensional
     160correctness that the costs given are correct.  Unfortunately, the
     161ambitious goal of completely verifying the entire compiler was not
     162feasible within the time available, but thanks to this separation of
     163extensional and intensional proofs we are able to axiomatize
     164simulation results similar to those in other compiler verification
     165projects and concentrate on the novel intensional proofs.  The proofs
     166were also made more tractable by introducing compile-time checks for
     167the `sound and precise' cost labelling properties rather than proving
     168that they are preserved throughout.
    160169
    161170The overall statement of correctness says that the annotated program has the
     
    218227identifiers to the function's stack space usage\footnote{The compiled
    219228  code's only stack usage is to allocate a fixed-size frame on each
    220   function entry and discard it on exit.}, a cost label covering the
    221 initialisation of global variables and calling the
    222 \lstinline[language=C]'main' function, the annotated source code, and
    223 finally a mapping from cost labels to actual execution time costs.
     229  function entry and discard it on exit.  No other dynamic allocation
     230  is provided by the compiler.}, the space available for the
     231stack after global variable allocation, a cost label covering the
     232execution time for the initialisation of global variables and the call
     233to the \lstinline[language=C]'main' function, the annotated source
     234code, and finally a mapping from cost labels to actual execution time
     235costs.
    224236
    225237An \ocaml{} pretty printer is used to provide a concrete version of the output
     
    248260The availability of precise timing information for 8051
    249261implementations and the design of the compiler allow it to give exact
    250 time costs in terms of processor cycles.  However, these exact results
    251 are only available if the subtrace we measure starts and ends at
    252 suitable points.  In particular, pure computation with no observable
    253 effects may be reordered and moved past cost labels, so we cannot
    254 measure time between arbitrary statements in the program.
     262time costs in terms of processor cycles, not just upper bounds.
     263However, these exact results are only available if the subtrace we
     264measure starts and ends at suitable points.  In particular, pure
     265computation with no observable effects may be reordered and moved past
     266cost labels, so we cannot measure time between arbitrary statements in
     267the program.
    255268
    256269There is also a constraint on the subtraces that we
     
    262275on much earlier stages in the compiler.  To convey this information to
    263276the timing analysis extra structure is imposed on the subtraces, which
    264 we will give more details on in Section~\ref{sec:fegoals}.
     277is described in Section~\ref{sec:fegoals}.
    265278
    266279% Regarding the footnote, would there even be much point?
     280% TODO: this might be quite easy to add ('just' subtract the
     281% measurable subtrace from the second label to the end).  Could also
     282% measure other traces in this manner.
    267283These restrictions are reflected in the subtraces that we give timing
    268284guarantees on; they must start at a cost label and end at the return
     
    273289the execution of an entire function from the cost label at the start
    274290of the function until it returns.  We call any such subtrace
    275 \emph{measurable} if it (and the prefix of the trace before it) can
    276 also be executed within the available stack space.
     291\emph{measurable} if it (and the prefix of the trace from the start to
     292the subtrace) can also be executed within the available stack space.
    277293
    278294Now we can give the main intensional statement for the compiler.
     
    374390\item This is the first language where every operation has its own
    375391  unique, easily addressable, statement.
    376 \item Function calls and returns are still handled in the language and
    377   so the structural properties are ensured by the semantics.
     392\item Function calls and returns are still handled implicitly in the
     393  language and so the structural properties are ensured by the
     394  semantics.
    378395\item Many of the back-end languages from \textsf{RTL} share a common
    379396  core set of definitions, and using structured traces throughout
     
    388405\label{fig:strtrace}
    389406\end{figure}
    390 A structured trace is a mutually inductive data type which principally
     407A structured trace is a mutually inductive data type which
    391408contains the steps from a normal program trace, but arranged into a
    392409nested structure which groups entire function calls together and
    393410aggregates individual steps between cost labels (or between the final
    394411cost label and the return from the function), see
    395 Figure~\ref{fig:strtrace}.  This capture the idea that the cost labels
     412Figure~\ref{fig:strtrace}.  This captures the idea that the cost labels
    396413only represent costs \emph{within} a function --- calls to other
    397 functions are accounted for in the trace for their execution, and we
     414functions are accounted for in the nested trace for their execution, and we
    398415can locally regard function calls as a single step.
    399416
     
    405422\item A normal trace of the \textbf{prefix} of the program's execution
    406423  before reaching the measurable subtrace.  (This needs to be
    407   preserved so that we know that the stack space consumed is correct.)
     424  preserved so that we know that the stack space consumed is correct,
     425  and to set up the simulation results.)
    408426\item The \textbf{structured trace} corresponding to the measurable
    409427  subtrace.
     
    460478Then we instantiate a general result which shows that we can find a
    461479\emph{measurable} subtrace in the target of the pass that corresponds
    462 to the measurable subtract in the source.  By repeated application of
     480to the measurable subtrace in the source.  By repeated application of
    463481this result we can find a measurable subtrace of the execution of the
    464482\textsf{RTLabs} code, suitable for the construction of a structured
     
    526544subtrace are similar, following the pattern above.  However, the
    527545measurable subtrace also requires us to rebuild the termination
    528 proof.  This has a recursive form:
     546proof.  This is defined recursively:
     547\label{prog:terminationproof}
    529548\begin{lstlisting}[language=matita]
    530549let rec will_return_aux C (depth:nat)
     
    752771
    753772For a cost labelling to be sound and precise we need a cost label at
    754 the start of each function, after each branch and at least once in
     773the start of each function, after each branch and at least one in
    755774every loop.  The first two parts are trivial to check by examining the
    756775code.  In \textsf{RTLabs} the last part is specified by saying
     
    764783This is made easier by the prior knowledge that any branch is followed
    765784by cost labels, so we do not need to search each branch.  When a label
    766 is found, we remove the chain from the set and continue until it is
    767 empty, at which point we know that there is a bound for every node in
    768 the graph.
     785is found, we remove the chain from the set and continue from another
     786node in the set until it is empty, at which point we know that there
     787is a bound for every node in the graph.
    769788
    770789Directly reasoning about the function that implements this would be
     
    816835program counters with decidable equality, the classification of states
    817836and functions to extract the observable intensional information (cost
    818 labels and the identity of functions that are called).
     837labels and the identity of functions that are called).  The
     838\lstinline'as_after_return' property links the state before a function
     839call with the state after return, providing the evidence that
     840execution returns to the correct place.  The precise form varies
     841between stages; in \textsf{RTLabs} it insists that the CFG, the pointer
     842to the CFG node to execute next, and some call stack information are
     843preserved.
    819844
    820845The structured traces are defined using three mutually inductive
     
    822847captures some straight-line execution until the next cost label or
    823848function return.  Each function call is embedded as a single step,
    824 with its own trace nested inside, and states classified as a `jump'
    825 (in particular branches) must be followed by a cost label.
     849with its own trace nested inside and the before and after states
     850linked by \lstinline'as_after_return', and states classified as a
     851`jump' (in particular branches) must be followed by a cost label.
    826852
    827853The second type, \lstinline'trace_label_label', is a
     
    833859\lstinline'trace_label_label' values which end in the return from the
    834860function.  These correspond to a measurable subtrace, and in
    835 particular include executions of an entire function call.
     861particular include executions of an entire function call (and so are
     862used for the nested calls in \lstinline'trace_any_label').
     863
     864\subsection{Construction}
     865
     866The core part of the construction of a fixed trace is to use the proof
     867of termination from the measurable trace (defined on
     868page~\pageref{prog:terminationproof}) to `fold up' the execution into
     869the nested form.
    836870
    837871TODO: design, basic structure from termination proof, how cost