Changeset 3173 for Deliverables
- Timestamp:
- Apr 23, 2013, 6:30:54 PM (7 years ago)
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
Deliverables/D3.4/Report/report.tex
r3168 r3173 100 100 \vspace*{7cm} 101 101 \paragraph{Abstract} 102 We report on the correctness proofs for the front-end of the \cerco{} cost 103 lifting compiler, considering three distinct parts of the task: showing that 104 the \emph{annotated source code} output by the compiler has equivalent 105 behaviour to the original input (up to the annotations); showing that a 106 \emph{measurable} subtrace of the annotated source code corresponds to an 107 equivalent measurable subtrace in the code produced by the front-end, including 108 costs; and finally showing that the enriched \emph{structured} execution traces 109 required for cost correctness in the back-end can be constructed from the 102 We report on the correctness proofs for the front-end of the \cerco{} 103 cost lifting compiler. First, we identify the core result we wish to 104 prove, which says that the we correctly predict the precise execution 105 time for particular parts of the execution called \emph{measurable} 106 subtraces. Then we consider the three distinct parts of the task: 107 showing that the \emph{annotated source code} output by the compiler 108 has equivalent behaviour to the original input (up to the 109 annotations); showing that a measurable subtrace of the 110 annotated source code corresponds to an equivalent measurable subtrace 111 in the code produced by the front-end, including costs; and finally 112 showing that the enriched \emph{structured} execution traces required 113 for cost correctness in the back-end can be constructed from the 110 114 properties of the code produced by the front-end. 111 115 … … 113 117 that we get consistent cost measurements throughout the intermediate languages 114 118 of the compiler can be layered on top of normal forward simulation results, 115 if we split them into local call-structure preserving results. 116 117 This deliverable shows correctness results about the formalised compiler 118 described in D3.2, using the source language semantics from D3.1 and 119 intermediate language semantics from D3.3. Together with the companion 120 deliverable about the correctness of the back-end, D4.4, we obtain results 121 about the whole formalised compiler. 119 if we split those results into local call-structure preserving simulations. 120 121 This deliverable shows correctness results about the formalised 122 compiler described in D3.2, using the source language semantics from 123 D3.1 and intermediate language semantics from D3.3. It builds on 124 earlier work on a toy compiler built to test the labelling approach in 125 D2.1. Together with the companion deliverable about the correctness of 126 the back-end, D4.4, we obtain results about the whole formalised 127 compiler. 122 128 123 129 % TODO: mention the deliverable about the extracted compiler et al? … … 137 143 translation validation on sound, precise labelling properties.} 138 144 139 The \cerco{} compiler compiles C code targeting microcontrollers implementing 140 the Intel 8051 architecture, which produces both the object code and source 141 code containing annotations describing the timing behavior of the object code. 142 There are two versions: first, an initial prototype implemented in 143 \ocaml{}~\cite{d2.2}, and a version formalised in the \matita{} proof 144 assistant~\cite{d3.2,d4.2} and then extracted to \ocaml{} code to produce an 145 executable compiler. In this document we present results formalised in 146 \matita{} about the front-end of that version of the compiler, and how that fits 145 The \cerco{} compiler compiles C code, targeting microcontrollers 146 implementing the Intel 8051 architecture. It produces both the object 147 code and source code containing annotations describing the timing 148 behaviour of the object code. There are two versions: first, an 149 initial prototype implemented in \ocaml{}~\cite{d2.2}, and a version 150 formalised in the \matita{} proof assistant~\cite{d3.2,d4.2} and then 151 extracted to \ocaml{} code to produce an executable compiler. In this 152 document we present results formalised in \matita{} about the 153 front-end of the latter version of the compiler, and how that fits 147 154 into the verification of the whole compiler. 148 155 149 \todo{maybe mention stack space here? other additions? refer to "layering"?} 150 A key part of this work was to separate the proofs about the compiled code's 151 extensional behaviour (that is, the functional correctness of the compiler) 152 from the intensional correctness that the costs given are correct. 153 Unfortunately, the ambitious goal of completely verifying the compiler was not 154 feasible within the time available, but thanks to this separation of extensional 155 and intensional proofs we are able to axiomatize simulation results similar to 156 those in other compiler verification projects and concentrate on the novel 157 intensional proofs. The proofs were also made more tractable by introducing 158 compile-time checks for the `sound and precise' cost labelling properties 159 rather than proving that they are preserved throughout. 156 \todo{maybe mention stack space here? other additions? refer to 157 "layering"?} A key part of this work was to separate the proofs 158 about the compiled code's extensional behaviour (that is, the 159 functional correctness of the compiler) from the intensional 160 correctness that the costs given are correct. Unfortunately, the 161 ambitious goal of completely verifying the entire compiler was not 162 feasible within the time available, but thanks to this separation of 163 extensional and intensional proofs we are able to axiomatize 164 simulation results similar to those in other compiler verification 165 projects and concentrate on the novel intensional proofs. The proofs 166 were also made more tractable by introducing compile-time checks for 167 the `sound and precise' cost labelling properties rather than proving 168 that they are preserved throughout. 160 169 161 170 The overall statement of correctness says that the annotated program has the … … 218 227 identifiers to the function's stack space usage\footnote{The compiled 219 228 code's only stack usage is to allocate a fixed-size frame on each 220 function entry and discard it on exit.}, a cost label covering the 221 initialisation of global variables and calling the 222 \lstinline[language=C]'main' function, the annotated source code, and 223 finally a mapping from cost labels to actual execution time costs. 229 function entry and discard it on exit. No other dynamic allocation 230 is provided by the compiler.}, the space available for the 231 stack after global variable allocation, a cost label covering the 232 execution time for the initialisation of global variables and the call 233 to the \lstinline[language=C]'main' function, the annotated source 234 code, and finally a mapping from cost labels to actual execution time 235 costs. 224 236 225 237 An \ocaml{} pretty printer is used to provide a concrete version of the output … … 248 260 The availability of precise timing information for 8501 249 261 implementations and the design of the compiler allow it to give exact 250 time costs in terms of processor cycles. However, these exact results 251 are only available if the subtrace we measure starts and ends at 252 suitable points. In particular, pure computation with no observable 253 effects may be reordered and moved past cost labels, so we cannot 254 measure time between arbitrary statements in the program. 262 time costs in terms of processor cycles, not just upper bounds. 263 However, these exact results are only available if the subtrace we 264 measure starts and ends at suitable points. In particular, pure 265 computation with no observable effects may be reordered and moved past 266 cost labels, so we cannot measure time between arbitrary statements in 267 the program. 255 268 256 269 There is also a constraint on the subtraces that we … … 262 275 on much earlier stages in the compiler. To convey this information to 263 276 the timing analysis extra structure is imposed on the subtraces, which 264 we will give more details onin Section~\ref{sec:fegoals}.277 is described in Section~\ref{sec:fegoals}. 265 278 266 279 % Regarding the footnote, would there even be much point? 280 % TODO: this might be quite easy to add ('just' subtract the 281 % measurable subtrace from the second label to the end). Could also 282 % measure other traces in this manner. 267 283 These restrictions are reflected in the subtraces that we give timing 268 284 guarantees on; they must start at a cost label and end at the return … … 273 289 the execution of an entire function from the cost label at the start 274 290 of the function until it returns. We call such any such subtrace 275 \emph{measurable} if it (and the prefix of the trace before it) can276 also be executed within the available stack space.291 \emph{measurable} if it (and the prefix of the trace from the start to 292 the subtrace) can also be executed within the available stack space. 277 293 278 294 Now we can give the main intensional statement for the compiler. … … 374 390 \item This is the first language where every operation has its own 375 391 unique, easily addressable, statement. 376 \item Function calls and returns are still handled in the language and 377 so the structural properties are ensured by the semantics. 392 \item Function calls and returns are still handled implicitly in the 393 language and so the structural properties are ensured by the 394 semantics. 378 395 \item Many of the back-end languages from \textsf{RTL} share a common 379 396 core set of definitions, and using structured traces throughout … … 388 405 \label{fig:strtrace} 389 406 \end{figure} 390 A structured trace is a mutually inductive data type which principally407 A structured trace is a mutually inductive data type which 391 408 contains the steps from a normal program trace, but arranged into a 392 409 nested structure which groups entire function calls together and 393 410 aggregates individual steps between cost labels (or between the final 394 411 cost label and the return from the function), see 395 Figure~\ref{fig:strtrace}. This capture the idea that the cost labels412 Figure~\ref{fig:strtrace}. This captures the idea that the cost labels 396 413 only represent costs \emph{within} a function --- calls to other 397 functions are accounted for in the trace for their execution, and we414 functions are accounted for in the nested trace for their execution, and we 398 415 can locally regard function calls as a single step. 399 416 … … 405 422 \item A normal trace of the \textbf{prefix} of the program's execution 406 423 before reaching the measurable subtrace. (This needs to be 407 preserved so that we know that the stack space consumed is correct.) 424 preserved so that we know that the stack space consumed is correct, 425 and to set up the simulation results.) 408 426 \item The \textbf{structured trace} corresponding to the measurable 409 427 subtrace. … … 460 478 Then we instantiate a general result which shows that we can find a 461 479 \emph{measurable} subtrace in the target of the pass that corresponds 462 to the measurable subtrac tin the source. By repeated application of480 to the measurable subtrace in the source. By repeated application of 463 481 this result we can find a measurable subtrace of the execution of the 464 482 \textsf{RTLabs} code, suitable for the construction of a structured … … 526 544 subtrace are similar, following the pattern above. However, the 527 545 measurable subtrace also requires us to rebuild the termination 528 proof. This has a recursive form: 546 proof. This is defined recursively: 547 \label{prog:terminationproof} 529 548 \begin{lstlisting}[language=matita] 530 549 let rec will_return_aux C (depth:nat) … … 752 771 753 772 For a cost labelling to be sound and precise we need a cost label at 754 the start of each function, after each branch and at least on ce in773 the start of each function, after each branch and at least one in 755 774 every loop. The first two parts are trivial to check by examining the 756 775 code. In \textsf{RTLabs} the last part is specified by saying … … 764 783 This is made easier by the prior knowledge that any branch is followed 765 784 by cost labels, so we do not need to search each branch. When a label 766 is found, we remove the chain from the set and continue until it is767 empty, at which point we know that there is a bound for every node in 768 the graph.785 is found, we remove the chain from the set and continue from another 786 node in the set until it is empty, at which point we know that there 787 is a bound for every node in the graph. 769 788 770 789 Directly reasoning about the function that implements this would be … … 816 835 program counters with decidable equality, the classification of states 817 836 and functions to extract the observable intensional information (cost 818 labels and the identity of functions that are called). 837 labels and the identity of functions that are called). The 838 \lstinline'as_after_return' property links the state before a function 839 call with the state after return, providing the evidence that 840 execution returns to the correct place. The precise form varies 841 between stages; in \textsf{RTLabs} it insists the CFG, the pointer to 842 the CFG node to execute next, and some call stack information is 843 preserved. 819 844 820 845 The structured traces are defined using three mutually inductive … … 822 847 captures some straight-line execution until the next cost label or 823 848 function return. Each function call is embedded as a single step, 824 with its own trace nested inside, and states classified as a `jump' 825 (in particular branches) must be followed by a cost label. 849 with its own trace nested inside and the before and after states 850 linked by \lstinline'as_after_return', and states classified as a 851 `jump' (in particular branches) must be followed by a cost label. 826 852 827 853 The second type, \lstinline'trace_label_label', is a … … 833 859 \lstinline'trace_label_label' values which end in the return from the 834 860 function. These correspond to a measurable subtrace, and in 835 particular include executions of an entire function call. 861 particular include executions of an entire function call (and so are 862 used for the nested calls in \lstinline'trace_any_label'). 863 864 \subsection{Construction} 865 866 The core part of the construction of a fixed trace is to use the proof 867 of termination from the measurable trace (defined on 868 page~\pageref{prog:terminationproof}) to `fold up' the execution into 869 the nested form. 836 870 837 871 TODO: design, basic structure from termination proof, how cost
Note: See TracChangeset
for help on using the changeset viewer.