# Changeset 3670

Timestamp:
Mar 21, 2017, 3:57:28 PM
Message:

Move a chunk of old specification text into place

Location:
Papers/jar-cerco-2017
Files:
2 edited

r3667

\subsection{Main correctness statement}

Informally, our main intensional result links the time difference in a source code execution to the time difference in the object code, expressing the time for the source by summing the values for the cost labels in the trace, and the time for the target by a clock built into the 8051 executable semantics. The availability of precise timing information for 8051 implementations and the design of the compiler allow it to give exact time costs in terms of processor cycles, not just upper bounds.

However, these exact results are only available if the subtrace we measure starts and ends at suitable points.  In particular, pure computation with no observable effects may be reordered and moved past cost labels, so we cannot measure time between arbitrary statements in the program.

There is also a constraint on the subtraces that we measure due to the requirements of the correctness proof for the object code timing analysis.  To be sure that the timings are assigned to the correct cost label, we need to know that each return from a function call goes to the correct return address.  It is difficult to observe this property locally in the object code because it relies on much earlier stages in the compiler.  To convey this information to the timing analysis, extra structure is imposed on the subtraces, which is described in Section~\ref{sec:fegoals}.

% Regarding the footnote, would there even be much point?
% TODO: this might be quite easy to add ('just' subtract the
% measurable subtrace from the second label to the end).  Could also
% measure other traces in this manner.
These restrictions are reflected in the subtraces that we give timing guarantees on; they must start at a cost label and end at the return of the enclosing function of the cost label\footnote{We expect that this would generalise to more general subtraces by subtracting costs for unwanted measurable suffixes of a measurable subtrace.}.
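The source-side time in the statement below is obtained by summing the cost-map values of the cost labels that appear in a trace segment. A minimal sketch of that bookkeeping, with all names and numbers invented for illustration (this is not code from the CerCo development):

```python
# Hypothetical sketch: the source-side time of a subtrace is the sum of
# the values the cost map assigns to the cost labels emitted within it.

def subtrace_cost(trace, cost_map, start, length):
    """Sum cost-map values for cost labels in trace[start:start+length].

    `trace` is a list of events; cost labels are modelled as strings
    beginning with "_cost"; all other events contribute nothing.
    """
    segment = trace[start:start + length]
    return sum(cost_map[e] for e in segment if e.startswith("_cost"))

# A subtrace starting at a cost label and ending at the function return.
trace = ["call f", "_cost1", "step", "step", "_cost2", "return f"]
cost_map = {"_cost1": 11, "_cost2": 5}  # cycles per label (made up)

assert subtrace_cost(trace, cost_map, 1, 5) == 16  # 11 + 5
```

The restriction to subtraces that begin at a cost label matters here: time spent before the first label in the segment would otherwise be charged to the wrong label.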
A typical example of such a subtrace is the execution of an entire function from the cost label at the start of the function until it returns.  We call any such subtrace \emph{measurable} if it (and the prefix of the trace from the start to the subtrace) can also be executed within the available stack space.

Now we can give the main intensional statement for the compiler. Given a \emph{measurable} subtrace for a labelled \textsf{Clight} program, there is a subtrace of the 8051 object code program where the time differences match.  Moreover, \emph{observable} parts of the trace also match --- these are the appearance of cost labels and function calls and returns. More formally, the definition of this statement in \matita{} is
\begin{lstlisting}[language=Grafite]
definition simulates :=
  $\lambda$p: compiler_output.
  let initial_status := initialise_status $...$ (cm (c_labelled_object_code $...$ p)) in
  $\forall$m1,m2.
    measurable Clight_pcs (c_labelled_clight $...$ p) m1 m2
               (stack_sizes (c_stack_cost $...$ p)) (c_max_stack $...$ p) $\rightarrow$
  $\forall$c1,c2.
    clock_after Clight_pcs (c_labelled_clight $...$ p) m1 (c_clight_cost_map $...$ p) = OK $...$ c1 $\rightarrow$
    clock_after Clight_pcs (c_labelled_clight $...$ p) (m1+m2) (c_clight_cost_map $...$ p) = OK $...$ c2 $\rightarrow$
  $\exists$n1,n2.
    observables Clight_pcs (c_labelled_clight $...$ p) m1 m2 =
    observables (OC_preclassified_system (c_labelled_object_code $...$ p))
                (c_labelled_object_code $...$ p) n1 n2
    $\wedge$
    clock ?? (execute (n1+n2) ? initial_status) =
    clock ?? (execute n1 ? initial_status) + (c2-c1).
\end{lstlisting}
where the \lstinline'measurable', \lstinline'clock_after' and \lstinline'observables' definitions are generic definitions for multiple languages; in this case the \lstinline'Clight_pcs' record applies them to \textsf{Clight} programs.
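Read operationally, the final conjunct of the statement relates two independently computed quantities: the label-derived source clock difference $c_2 - c_1$ and the per-instruction object clock advance over the matching $n_2$ steps. A toy numeric illustration of that equation, with all traces and cycle counts invented:

```python
# Toy model (not the Matita development) of the final conjunct:
#   clock(execute (n1+n2) s0) = clock(execute n1 s0) + (c2 - c1)
# The source clock is driven by cost labels; the object clock by
# per-instruction cycle counts.  For matching subtraces they agree.

def clock_after(trace, cost_map, m):
    """Source-side clock after m steps: sum of cost labels seen so far."""
    return sum(cost_map[e] for e in trace[:m] if e in cost_map)

def obj_clock(step_costs, n):
    """Object-side clock after executing n instructions."""
    return sum(step_costs[:n])

source_trace = ["_a", "s1", "s2", "_b", "s3"]  # "_a"/"_b" are cost labels
cost_map = {"_a": 5, "_b": 3}                  # cycles charged per label
step_costs = [1, 2, 1, 2]                      # object code, per instruction

m1, m2 = 1, 4                                  # measurable source subtrace
c1 = clock_after(source_trace, cost_map, m1)
c2 = clock_after(source_trace, cost_map, m1 + m2)

n1, n2 = 2, 2                                  # matching object subtrace
assert obj_clock(step_costs, n1 + n2) == obj_clock(step_costs, n1) + (c2 - c1)
```

The point of the cost-labelling discipline is precisely to make such matching decompositions exist: each label's value accounts for the cycles of the object instructions attributed to it.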
There is a second part to the statement, which says that the initial processing of the input program to produce the cost labelled version does not affect the semantics of the program:
% Yes, I'm paraphrasing the result a tiny bit to remove the observe non-function
\begin{lstlisting}[language=Grafite]
$\forall$input_program,output.
  compile input_program = return output $\rightarrow$
  not_wrong $...$ (exec_inf $...$ clight_fullexec input_program) $\rightarrow$
  sim_with_labels
    (exec_inf $...$ clight_fullexec input_program)
    (exec_inf $...$ clight_fullexec (c_labelled_clight $...$ output))
\end{lstlisting}
That is, any successful compilation produces a labelled program that has identical behaviour to the original, so long as there is no undefined behaviour.

Note that this statement provides full functional correctness, including preservation of (non-)termination.  The intensional result above does not do this directly --- it does not guarantee the same result or same termination.  There are two mitigating factors, however: first, to prove the intensional property you need local simulation results --- these can be pieced together to form full behavioural equivalence; only time constraints have prevented us from doing so.  Second, if we wish to confirm a result, termination, or non-termination we could add an observable witness, such as a function that is only called if the correct result is given.  The intensional result guarantees that the observable witness is preserved, so the program must behave correctly.

These two results are combined in the \lstinline'correct' theorem in the file \lstinline'correctness.ma'.

[[relocated to specification section]]

\section{Correctness statement for the front-end}
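The observable-witness idea can be sketched concretely. In the following toy model (hypothetical names, not from the development), function calls are observable events, so if a distinguished function is called exactly when the result is correct, preservation of observables forces the compiled program to compute the correct result:

```python
# Sketch of an observable witness: a distinguished function is called
# only when the computation yields the expected result.  Since calls and
# returns are observables, the intensional statement's preservation of
# observables then pins down the result of the compiled program too.

observables = []

def witness():
    observables.append("witness_called")  # a call/return is observable

def compute():
    return 42                             # the program's actual result

def main(expected):
    if compute() == expected:
        witness()                         # observable iff result correct
    return compute()

main(42)
assert observables == ["witness_called"]
```

Under this encoding, checking the result of the compiled program reduces to checking which observables its trace contains, which is exactly what the intensional statement preserves.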
r3669

\subsection{Specifications for phases of the compiler}