Changeset 3350


Timestamp:
Jun 14, 2013, 11:46:10 AM
Author:
sacerdot
Message:

...

File:
1 edited

  • Papers/itp-2013/ccexec2.tex

    r3349 r3350  
    255255associated to $L_1$ is the number of cycles required to execute the block
    256256$I_3$ and \verb+COND l_2+, while the cost $k_2$ associated to $L_2$ counts the
    257 cycles required by the block $I_4$ and \verb+GOTO l_1+. The compiler also guarantees
     257cycles required by the block $I_4$, \verb+GOTO l_1+ and \verb+COND l_2+. The compiler also guarantees
    258258that every executed instruction is in the scope of some code emission label,
    259259that each scope does not contain loops (to associate a finite cost), and that
     
    431431runs are weakly similar to the source code runs.
    432432
    433 The notion of weak bisimulation for structured traces is a global property
     433The notion of weak simulation for structured traces is a global property
    434434which is hard to prove formally and much more demanding than the simple forward
    435435simulation required for proofs of preservation of functional properties.
     
    693693Let's consider a generic unstructured language already equipped with a
    694694small step structured operational semantics (SOS). We introduce a
    695 deterministic labelled transition system~\cite{LTS} $(S,s_{\mathrm{init}},\Lambda,\to)$
     695deterministic labelled transition system~\cite{LTS} $(S,\Lambda,\to)$
    696696that refines the
    697697SOS by observing function calls and the beginning of basic blocks.
    698 $S$ is the set of states of the program, $s_\mathrm{init}$ the initial state and
     698$S$ is the set of states of the program and
    699699$\Lambda = \{ \tau, RET \} \cup \Labels \cup \Functions$
    700700where $\Functions$ is the set of names of functions that can occur in the
     
    705705denotes the image of this function.
    706706The transition function is defined as $s_1 \to[o] s_2$ if
    707 $s_1$ moves to $s_2$ according to the SOS; moreover $o = f \in \Functions$ if
     707$s_1$ moves to $s_2$ according to the SOS and $o = f \in \Functions$ if
    708708the function $f$ is called, $o = RET$ if a \verb+RETURN+ is executed,
    709709$o = L \in \Labels$ if an \verb+EMIT $L$+ is executed to signal the
     
    711711Because we assume the language to be deterministic, the label emitted can
    712712actually be computed by simply observing $s_1$. Finally, $S$ is also endowed with
    713 a relation $s\ar s'$ ($s'$ \emph{follows} $s$) when the instruction to be executed
    714 in $s'$ is just after the one in $s$.
    715 
     713a relation $s\ar s'$ ($s'$ \emph{follows} $s$) that holds when the instruction
     714to be executed in $s'$ follows syntactically the one in $s$ in the source program.
     715
     716In the rest of the paper we write $s_0 \to^{*} s_n$ for the finite execution
     717fragment $T = s_0 \to[o_0] s_1 \to[o_1] \ldots \to[o_{n-1}] s_n$
     718and we call the \emph{weak trace} of $T$ (denoted by $|T|$) the
     719subsequence $o_{i_0} \ldots o_{i_m}$ of $o_0 \ldots o_{n-1}$ obtained by dropping
     720every internal action $\tau$.
     721
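The weak-trace operation just defined can be pictured with a minimal sketch, assuming a string encoding of observables of our own invention (`TAU`, `weak_trace` and the sample fragment are hypothetical names, not part of the formalisation):

```python
# Sketch of the weak-trace operation |T|: drop every internal action tau
# from the sequence of observables o_0 ... o_{n-1} of a fragment.
TAU = "tau"  # our own encoding of the internal action

def weak_trace(observables):
    """Return the subsequence of non-tau observables, i.e. the weak trace |T|."""
    return [o for o in observables if o != TAU]

# Fragment emitting label L1, performing internal steps, calling f (whose body
# emits its label ell_f) and returning:
fragment = ["L1", TAU, TAU, "f", "ell_f", TAU, "RET"]
print(weak_trace(fragment))  # ['L1', 'f', 'ell_f', 'RET']
```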
     722%Let $k$ be a cost model for observables actions that maps elements of
     723%$\Lambda \setminus \{\tau\}$ to any commutative cost monoid
     724%(e.g. natural numbers). We extend the domain of $k$ to executable fragments
     725%by posing $k(T) = \Sigma_{o \in |T|} k(o)$.
     726
     727\paragraph{Structured execution fragments}
    716728Among all possible finite execution fragments we want to
    717729identify the ones that satisfy the requirements we sketched in the previous
    718730section. We say that an execution fragment
    719 $s_0 \to[o_0] s_1 \to[o_1] \ldots \to[o_n] s_n$
    720 is \emph{structured} (marking it as $s_0 \To s_n$) iff the following conditions
     731$s_0 \to^{*} s_n$
     732is \emph{structured} (and we denote it as $s_0 \To s_n$) iff the following conditions
    721733are met.
    722734\begin{enumerate}
     
    726738    $s_i \ar s_{k+1}$.
    727739   In other words, $s_{i+1}$ must start execution with \verb+EMIT $\ell(f)$+
      740   --- so that every instruction falls within the scope of some label ---
    728741   and then continue with a structured fragment returning control
    729742   to the instruction immediately following the call.
    730    This captures the requirements that the body of function calls always start
    731    with a label emission statement, and every function
    732    call must converge yielding back control just after it.
    733  \item For every $i$ and $f$, if $s_{i+1}\to[\ell(f)]s_{i+2}$ then
    734    $s_i\to[f]s_{i+1}$. This is a technical condition needed to ensure that
    735    labels associated with functions always follow a call.
     743
      744   The condition also enforces convergence of every function call, which is
      745   necessary to bound the cost of the fragment. Note that
      746   non-convergent programs may still have structured execution fragments
      747   that are worth measuring. For example, we can measure the reaction time
      748   of a server implemented as an unbounded loop whose body waits for an
      749   input, processes it and performs an output before looping: the processing
      750   steps form a structured execution fragment.
    736751 \item The number of $RET$'s in the fragment is equal to the number of
    737    calls, i.e.\ the number of observables in $\Functions$. This, together with
    738    the above condition, captures the well-bracketing of the fragment with
    739    respect to function calls.
      752   calls performed. In combination with the previous condition, this ensures
      753   well-bracketing of function calls.
     754 \item
     755   \label{req3}
     756   For every $i$ and $f$, if $s_{i+1}\to[\ell(f)]s_{i+2}$ then
     757   $s_i\to[f]s_{i+1}$. This is a technical condition needed to ensure that a
     758   label associated with a function is only used at the beginning of its
     759   body. Its use will become clear in~\autoref{simulation}.
    740760 \item For every $i$, if the instruction to be executed in $s_i$ is a
    741761   conditional branch, then there is an $L$ such that $s_{i+1} \to[L] s_{i+2}$ or, equivalently, that $s_{i+1}$ must start execution with an
    742762   \verb+EMIT $L$+. This captures the requirement that every branch which is
    743    live code must start with a label emission.
     763   live code must start with a label emission. Otherwise, it would be possible
     764   to have conditional instructions whose branches are assigned different
      765   costs, making it impossible to assign a single cost to the label whose scope
     766   contains the jump.
    744767\end{enumerate}
    745768One might wonder why $f$ and $\ell(f)$, which always appear in this order, are not
    746 collapsed into a single observable. This would indeed simplify some aspects of
    747 the formalisation, but has the problem of misassigning the cost of calls, which would
    748 fall under the associated label. As different call instructions with different costs
    749 are possible, this is not acceptable.
    750 
    751 Let $T = s_0 \to[o_0] s_1 \ldots \to[o_n] s_{n+1}$ be an
    752 execution fragment. The \emph{weak trace} $|T|$ associated to $T$ is the
    753 subsequence $o_{i_0} \ldots o_{i_m}$ of $o_0 \ldots o_n$ obtained dropping
    754 every internal action $\tau$.
    755 
    756 Let $k$ be a cost model that maps observables actions to any commutative cost
    757 monoid (e.g. natural numbers). We extend the domain of $k$ to fragments
    758 by posing $k(T) = \Sigma_{o \in |T|} k(o)$.
    759 
    760 The labelling approach is based on the idea that the execution cost of
     769collapsed into a single observable. This would simplify some aspects of the
     770formalisation at the price of others. For example, we should add special
     771cases when the fragment starts at the beginning of a function body
     772(e.g. the one of \texttt{main}) because in that case nobody would have emitted
     773the observable $\ell(f)$.
     774
     775\paragraph{Measurable execution fragments and their cost prediction.}
     776The first main theorem of CerCo deals with programs written in object code.
     777It states that the execution cost of
    761778certain execution fragments, that we call \emph{measurable fragments}, can be
    762 computed from their weak trace by choosing a $k$ that assigns to any label
    763 the cost of the instructions in its scope.
    764 A structured fragment
    765 $T = s_0 \To s_n$ is
    766 measurable if it does not start or end in the middle of a basic block.
    767 Ending in the middle of a block would mean having pre-paid more instructions
    768 than the ones executed, and starting in the middle would mean not paying any
    769 instruction up to the first label emission.
    770 Formally we require $o_0 \in \Labels$ (or equivalently
     779computed from their weak trace by choosing the cost model $k$ that assigns to
     780any label the cost (in clock cycles) of the instructions in its scope, and
     781$0$ to function calls and $RET$ observables.
     782
     783\begin{theorem}
     784 \label{static}
      785 For every measurable fragment $T = s_0 \to^{*} s_n$,\\
     786 $$\Delta_t := \verb+clock+_{s_n} - \verb+clock+_{s_0} = \Sigma_{o \in |T|} k(o)$$
     787\end{theorem}
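Concretely, the right-hand side of the theorem is just a fold of $k$ over the weak trace. A minimal sketch, with an invented cost model purely for illustration:

```python
# Sketch of the cost prediction: sum the statically computed cost k(o) over
# the weak trace of a measurable fragment. Labels are mapped to clock cycles;
# calls and RET cost 0 (obtained here via a default of 0 for unknown keys).
def fragment_cost(weak_trace, k):
    return sum(k.get(o, 0) for o in weak_trace)

# Invented cost model: the scope of L1 costs 4 cycles, that of L2 costs 7.
k = {"L1": 4, "L2": 7}
print(fragment_cost(["L1", "f", "L2", "RET"], k))  # 11
```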
     788
     789An execution fragment $s_0 \to^{*} s_n$ is
     790measurable if it is structured (up to a possible final \texttt{RETURN}) and
     791if it does not start or end in the middle of a basic block.
     792Ending in the middle of a block would mean that the last label encountered
     793would have pre-paid more instructions than the ones executed; starting in the
     794middle would mean not paying any instruction up to the first label emission.
     795
     796Formally, $s_0 \to^{*} s_n$ is measurable iff $o_0 \in \Labels$ (or equivalently
    771797in $s_0$ the program must emit a label) and either
    772 $s_{n-1}\to[RET]s_n$ or $s_n$ must be a label emission statement
    773 (i.e.\ $s_n \to[L] s_{n+1}$).
    774 
     798$s_0 \To s_{n-1}$ and $s_{n-1}\to[RET]s_n$ or
     799$s_0 \To s_n$ and $s_n$ is a label emission statement (i.e.\ $s_n \to[L] s_{n+1}$).
     800
      801\textbf{CSC: PROOF----------------------}
     802% The theorem is proved by structural induction over the structured
     803% trace, and is based on the invariant that
     804% iff the function that computes the cost model has analysed the instruction
     805% to be executed at $s_2$ after the one to be executed at $s_1$, and if
     806% the structured trace starts with $s_1$, then eventually it will contain also
     807% $s_2$. When $s_1$ is not a function call, the result holds trivially because
     808% of the $s_1\exec s_2$ condition obtained by inversion on
     809% the trace. The only non
     810% trivial case is the one of function calls: the cost model computation function
     811% does recursion on the first instruction that follows that function call; the
     812% \verb+as_after_return+ condition of the \verb+tal_base_call+ and
     813% \verb+tal_step_call+ grants exactly that the execution will eventually reach
     814% this state.
     815
     816\paragraph{Weak similarity and cost invariance.}
    775817Given two deterministic unstructured programming languages with their own
    776 operational semantics, we say that a state $s_2$ of the second language
    777 (weakly) simulates the state $s_1$ of the first iff the two unique weak traces
    778 that originate from them are equal. If $s_1$ also (weakly) simulates $s_2$,
    779 then the two states are weakly trace equivalent.
    780 
    781  or, equivalently
    782 because of determinism, that they are weakly bisimilar.
     818operational semantics, we say that two execution fragments are
     819\emph{weakly trace equivalent} if their weak traces are equal.
     820
     821A compiler (pass) that preserves the program semantics also preserves weak
     822traces and propagates measurability iff for every measurable
     823fragment $T_1 = s_1 \to^{*} s_1'$ of the source code, the corresponding
     824execution fragment $T_2 = s_2 \to^{*} s_2'$ of the object code is measurable
      825and $T_1$ and $T_2$ are weakly trace equivalent. The intuitive notion of
      826``corresponding fragment'' is made precise in the forward simulation proof of
      827preservation of the semantics of the program by requiring that $s_2$ and $s_1$
     828are in a certain relation. Clearly the property holds for a compiler if it
     829holds for each compiler pass.
     830
     831Having proved in~\autoref{static} that the statically computed cost model is
     832accurate for the object code, we get as a corollary that it is also accurate
     833for the source code if the compiler preserves weak traces and
      834propagates measurability. Thus it becomes possible to compute the cost model
      835on the object code, transfer it to the source code and then reason comfortably
     836on the source code only.
     837
     838\begin{theorem}\label{preservation}
     839Given a compiler that preserves weak traces and propagates measurability,
      840for every measurable execution fragment $T_1 = s_1 \to^{*} s_1'$ of the source
     841code such that $T_2 = s_2 \to^{*} s_2'$ is the corresponding fragment of the
     842object code,
     843
     844$$\Delta_t := \verb+clock+_{s_2'} - \verb+clock+_{s_2} = \Sigma_{o \in |T_2|} k(o) = \Sigma_{o \in |T_1|} k(o)$$
     845\end{theorem}
     846
     847\section{Forward simulation}
     848\label{simulation}
      849Because of \autoref{preservation}, to certify a compiler for the labelling
      850approach we need to prove both that it respects the functional semantics of the
      851program and that it preserves weak traces and propagates measurability.
      852The first property is standard and can be proved by means of a forward simulation argument (see for example~\cite{compcert}) that runs as follows.
     853First a relation between the corresponding
     854source and target states is established. Then a lemma establishes
     855a local simulation condition: given two states in relation, if the source
     856one performs one step then the target one performs zero or more steps and
     857the two resulting states are synchronized again according to the relation.
     858Finally, the lemma is iterated over the execution trace to establish the
     859final result.
     860
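The iteration of the local lemma over a run can be pictured with a small sketch; everything here (the relation `R`, the `simulate_step` oracle supplied by the local lemma) is a hypothetical encoding, not the actual formal development:

```python
# Sketch of iterating a local forward-simulation lemma over a source run.
# R(s, t) is the relation between source and target states; simulate_step is
# the local lemma: given two related states and one source step, it returns
# the target state reached after zero or more target steps, again related.
def iterate_simulation(R, source_run, simulate_step, t0):
    t = t0
    for s, s_next in zip(source_run, source_run[1:]):
        t = simulate_step(s, s_next, t)
        assert R(s_next, t), "local lemma must re-establish the relation"
    return t

# Trivial instance: an identity compiler, with R taken to be equality.
final_t = iterate_simulation(lambda s, t: s == t, [0, 1, 2],
                             lambda s, s_next, t: s_next, 0)
print(final_t)  # 2
```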
     861In principle, preservation of weak traces could be easily shown with the same
     862argument (and also at the same time). Surprisingly, propagation of
     863measurability cannot. What makes the standard forward
     864simulation proof work is the fact that usually a compiler pass performs some
     865kind of local or global analysis of the code followed by a compositional, order
     866preserving translation of every instruction. In order to produce structured
     867traces, however, code emission cannot be fully compositional any longer.
     868
     869For example, consider~requirement \ref{req3} that asks every function body
     870to start with a label emission statement. Some compiler passes must
     871add preambles to functions, for example to take care of the parameter passing
      872convention. In order not to violate the requirement, the preamble must be
      873inserted after the label emission. In the forward simulation proof, however,
      874function call steps in the source language are simulated by the new function
      875call followed by the execution of the preamble, and only at the end of the
      876preamble are the reached states again in the expected relation. In the meantime,
      877however, the object code has already performed the label emission statement,
      878which still needs to be executed in the source code, breaking forward simulation.
     879
      880Another reason why the standard argument breaks is the requirement that
      881function calls should yield back control just after the calling point. This must be
      882enforced just after
     883
     884\textbf{XXXXXXXXXXXXXXXXXX}
     885
     886A compiler preserves the program semantics by suppressing or introducing
     887$\tau$ actions.
     888
     889Intuitively, it is because
     890
     891To understand why, consider the case of a function call
     892and the pass that fixes the parameter passing conventions. A function
     893call in the source code takes in input an arbitrary number of pseudo-registers (the actual parameters to pass) and returns an arbitrary number of pseudo-registers (where the result is stored). A function call in the target language has no
     894input nor output parameters. The pass must add explicit code before and after
     895the function call to move the pseudo-registers content from/to the hardware
     896registers or the stack in order to implement the parameter passing strategy.
     897Similarly, each function body must be augmented with a preamble and a postamble
     898to complete/initiate the parameter passing strategy for the call/return phase.
     899Therefore what used to be a call followed by the next instruction to execute
     900after the function return, now becomes a sequence of instructions, followed by
      901a call, followed by another sequence. The two states at the beginning of the
      902first sequence and at the end of the second sequence are in relation with
      903the states before/after the call in the source code, as in a usual forward
      904simulation. How can we prove, however, the additional condition for function calls
      905which asks that, when the function returns, control flows to the instruction
      906immediately after the function call? To guarantee this invariant, there must be another relation
      907between the addresses of the function call in the source and in the target code.
      908This additional relation is used in particular to relate the two stacks.
     909
     910
     911
     912
    783913% @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    784914%
     
    10431173% %   trace_label_return \(s_1\) \(s_2\) \(\to\) list (as_cost_label S)
    10441174% % \end{alltt}
    1045 %
    1046 % \paragraph{Cost prediction on structured traces.}
    1047 %
    1048 % The first main theorem of CerCo about traces
    1049 % (theorem \verb+compute_max_trace_label_return_cost_ok_with_trace+)
    1050 % holds for the
    1051 % instantiation
    1052 % of the structured traces to the concrete status of object code programs.
    1053 % Simplifying a bit, it states that
    1054 % \begin{equation}\label{th1}
    1055 % \begin{array}{l}\forall s_1,s_2. \forall \tau: \verb+TLR+~s_1~s_2.~
    1056 %   \verb+clock+~s_2 = \verb+clock+~s_1 +
    1057 %   \Sigma_{\alpha \in |\tau|}\;k(\alpha)
    1058 % \end{array}
    1059 % \end{equation}
    1060 % where the cost model $k$ is statically computed from the object code
    1061 % by associating to each label $\alpha$ the sum of the cost of the instructions
    1062 % in the basic block that starts at $\alpha$ and ends before the next labelled
    1063 % instruction. The theorem is proved by structural induction over the structured
    1064 % trace, and is based on the invariant that
    1065 % iff the function that computes the cost model has analysed the instruction
    1066 % to be executed at $s_2$ after the one to be executed at $s_1$, and if
    1067 % the structured trace starts with $s_1$, then eventually it will contain also
    1068 % $s_2$. When $s_1$ is not a function call, the result holds trivially because
    1069 % of the $s_1\exec s_2$ condition obtained by inversion on
    1070 % the trace. The only non
    1071 % trivial case is the one of function calls: the cost model computation function
    1072 % does recursion on the first instruction that follows that function call; the
    1073 % \verb+as_after_return+ condition of the \verb+tal_base_call+ and
    1074 % \verb+tal_step_call+ grants exactly that the execution will eventually reach
    1075 % this state.
    10761175%
    10771176% \paragraph{Structured traces similarity and cost prediction invariance.}
     
    12811380% As should be expected, even though the rules are asymmetric $\approx$ is in fact
    12821381% an equivalence relation.
    1283 \section{Forward simulation}
    1284 \label{simulation}
    12851382
    12861383We summarise here the results of the previous sections. Each intermediate