source: Papers/structured-traces-2012/structured-traces.tex @ 2630

Last change on this file since 2630 was 2480, checked in by mulligan, 8 years ago

Some more changes.

File size: 8.4 KB
[2331]1\documentclass[a4paper, 10pt]{article}
7\author{Brian Campbell, Dominic P. Mulligan, Claudio Sacerdoti Coen,\\Ian Stark and Paolo Tranquilli}
8\title{Structured traces}
19% Section.                                                                    %
[2479]24Compiler correctness proofs are typically based on simulation results for local portions of code.
25This is because compiler formalisation efforts have focused their attention on providing guarantees about the extensional behaviour of generated code.
26In contrast, when verified compiler projects also consider the intensional behaviour of generated code such as execution time or space, we have found that invariants about the structure of program execution are particularly valuable.
[2479]28The ideas summaried in this paper have been developed within the context of the CerCo (`Certified Complexity') project, though by no means are limited in their scope or application to this project.
29CerCo's ultimate aim is to produce a verified C compiler for a large subset of the C programming language.
30Unlike previous compiler verification projects---notably Leroy's CompCert project at \textsc{inria}---CerCo is also concerned with the intensional correctness of our compiler.
31In particular, the CerCo C compiler will lift, in a provably correct way, a cost model induced by the compilation process back to the C-source level.
[2480]32As the CerCo compiler knows how every C expression is expanded into machine code by the compilation process, and as we have a precise cost model for machine code instructions, we can pass this cost model backwards through the compiler, back to the C source level.
[2480]34However, the structure of machine code execution is unconstrained.
35For example, an arbitrary machine code program is free to modify the call stack in any way that it chooses.
36As a result, we cannot assume that the execution path of a machine code function call will return to the subsequent instruction after the call is complete.
37This freedom presents a problem when trying to prove that the execution costs of a program are correctly predicted by our compiler.
[2480]39Fortunately, we are able to preserve information from earlier stages of the compilation chain and use this information in the cost preservation proof.
40This information is passed in the form of a \emph{structured trace}, a novel enrichment of execution traces with invariants about the structure of the program.
45% Section.                                                                    %
47\section{Body of paper that is complex}
[2455]51% Definitions, choices, etc                                                   %
53\section{Definition of structured traces}
56TODO: abstraction of intermediate language semantics, definition of structured
57traces, commentary on decisions made, pretty diagrams
60% Edinburgh work on existence of RTLabs str traces                            %
62\section{Existence of structured traces}
65%  existence proof details that I think are irrelevant to this paper:
66%  - fn ids are memory block ids rather than the original name
67%  - a shadow stack is maintained because the semantics don't record the fn id
68%  - we first go from flat traces made from exec_inf to ones which include
69%    evidence of each step of execution (flat_trace)
71% TODO: no I/O in present construction - do we need to do something about that?
73[Bit of a braindump for now.]
75TODO: choice of language (esp, first to have reasonable PCs),
76restricted use of excluded middle for termination
78Currently in the order of the proof, rearrange.
80Brief note on abstraction: PCs are fn id / cfg node id pairs (with slight
81variations for call/ret/fin states)
82% fn ids are actually blocks, and there's a shadow stack to record which we
83% are on, but that's not really relevant
84Only statements considered as `jumps' are conditionals (no unconditional jumps
85because we're in a cfg).
86\verb'as_after_return' property is that next caller node and caller fn must be
89\verb'will_return' asserts that a prefix of a trace eventually returns, using
90the call/ret/jump/other classification fn.  Will use excluded middle on this
91for deciding termination.  Use length of these as part of termination witness
92for construction fn.
93% lemmas for composing (will_return_prepend), slicing (will_return_remove_call)
94% will_return?
96Specifications of cost labelling (???.ma), lift them to runtime notions of
97globalenvs, states, preserved by steps (\verb'well_cost_labelled_state_step').
98Label follows a conditional (\verb'well_cost_labelled_jump'), call
99(semantically, i.e., the first thing in a function is the cost label).
101(RTLabs) Stack preservation - essential for \verb'as_after_return' (and more?)
102- as with structured traces it ends with the state \emph{after} the return.
103Like \verb'will_return' do composition, steps (including over calls).  Then
104we extract \verb'as_after_return'.  Note: these proofs are technically fairly
105routine, once we've got the right definitions.
107\verb'make_label_return' and it's mutually defined functions
108\verb'make_label_label' and \verb'make_any_label' are the main results for
109finite subtraces.  These consume a prefix of the flat trace (given some
110hypotheses about well labelling) according to a \verb'will_return' structure
111that determines how much is used.  A record is returned containing the end
112state, the remainder of the trace, a proof of stack preservation, and the
113remainder (if any) of the \verb'will_terminates' structure.  The functions
114themselves examine the flat trace, checks for a cost label where they can
115optionally appear and proves the properties embedded in the structured traces
116using the above lemmas.  They are given in a mixed definition/proof script
117style; unusually some of the proofs are partly contained in the definition to
118make full use of dependent pattern matching.
121Abstract successors from semantics to syntax; rephrase bound on instructions
122from cost spec to bound on (local) steps to label; prove that the bound
123reduces as you would expect (requires \verb'as_after_return'), that sound
124labelled lifted to semantics is preserved by steps.
126Construction of whole program traces proceeds using excluded middle on existence
127of a \verb'will_return' structure.  Terminating executions just embed the
128structured trace from \verb'make_label_return', but non-terminating executions
129are built by a cofixpoint which extracts a finite prefix, which is constructed
130in a similar way to \verb'make_label_return', but using the bound on steps to a
131cost label to demonstrate termination and stopping on the next cost label in
132the function, or another non-terminating function call.  (Better to describe
133the idea, then note interesting things about the implementation.)
135Both structured traces and flat traces are determined by their start state
136because there's no I/O, so get flattening = original trace.
138Also show separate non-repeating property (no repeated PC without a cost label)
139by splitting on whether a repeated state is a cost label - if so it can't
140reappear in the relevant part of the structured trace, if not we break the
141sound labelling property on instructions by finding a loop of successor
142instructions without a cost label.  (Note: this is on instructions, above we
143used a derived version on states.)  Rule out repeated call states by reference
144to the successor instruction.  Applying this throughout the structured trace
145give the \verb'*_sound_unrepeating' properties.
[2331]148% Section.                                                                    %
Note: See TracBrowser for help on using the repository browser.