# Changeset 3168 for Deliverables/D3.4/Report/report.tex

Timestamp:
Apr 19, 2013, 12:30:32 PM
Message:

r3167

\usepackage[mathletters]{ucs}
\usepackage[utf8x]{inputenc}
\usepackage{stmaryrd}
\usepackage{listings}
\usepackage{../../style/cerco}

\todo{don't use loop structures from CompCert, go straight to jumps}

\paragraph{Cast simplification}

The parser used in CerCo introduces a lot of explicit type casts. If left as they are, these casts can greatly hamper the quality of the generated code, especially since the architecture we target is an $8$-bit one. In \textsf{Clight}, casts are expressions. Hence, most of the work of this transformation proceeds on expressions, and the one-step simulation result on expressions is easily lifted to statement evaluations.

\todo{introduce the explicit type signature of the functions below ?}

The transformation proceeds by recursively trying to coerce an expression to a particular type, which in practice is an integer type smaller than the original one. This functionality is implemented by two mutually recursive functions: \textbf{simplify\_expr} and \textbf{simplify\_inside}. The latter acts as a wrapper for the former. Whenever \textbf{simplify\_inside} encounters an \textbf{Ecast} expression, it tries to coerce the sub-expression to the desired type using \textbf{simplify\_expr}, which attempts to perform the actual coercion. In return, \textbf{simplify\_expr} calls back \textbf{simplify\_inside} in some particular positions, where we decided to be conservative in order to simplify the proofs. However, the current design allows us to revert incrementally to a more aggressive version, by replacing recursive calls to \textbf{simplify\_inside} with calls to \textbf{simplify\_expr} \emph{and} proving the corresponding invariants, where possible.

The main problem encountered with this particular pass was dealing with inconsistently typed programs; a canonical case is an integer constant of one size annotated with a type of a different size.
This prompted the need to introduce numerous type checks, complicating both the implementation and the proof.

\paragraph{Switch removal}

Our intermediate languages have no jump tables. Hence, we have to compile the \lstinline[language=C]'switch' constructs away before going from \textsf{Clight} to \textsf{Cminor}. Note that this is not necessarily a bad decision performance-wise for small switches: compilers such as GCC introduce balanced trees of ``if-then-else'' constructs, for instance. However, our implementation strategy is much simpler. Let us consider the following input statement.

\begin{lstlisting}[language=C]
switch(e) {
case v1:
  stmt1;
case v2:
  stmt2;
default:
  stmt_default;
}
\end{lstlisting}

Note that \texttt{stmt1}, \texttt{stmt2}, \ldots, \texttt{stmt\_default} may contain \lstinline[language=C]'break' statements, which have the effect of exiting the switch statement. In the absence of break, the execution falls through each case sequentially. In our current implementation, we produce an equivalent sequence of ``if-then'' blocks chained by gotos, where translated \lstinline[language=C]'break' statements become jumps to \texttt{exit\_label}:

\begin{lstlisting}[language=C]
fresh = e;
if(fresh == v1) {
  $\llbracket$stmt1$\rrbracket$;
  goto lbl_case2;
};
if(fresh == v2) {
  lbl_case2:
  $\llbracket$stmt2$\rrbracket$;
};
$\llbracket$stmt_default$\rrbracket$;
exit_label:
\end{lstlisting}

The proof had to tackle the following points:
\begin{itemize}
\item the source and target memories are not the same (cf.\ the fresh variable),
\item the flow of control is changed in a non-local way (e.g.\ \textbf{goto} instead of \textbf{break}).
\end{itemize}

In order to tackle the first point, we implemented a version of memory extensions similar to CompCert's. What has been done is the simulation proof for all the ``easy'' cases that do not interact with the switch removal per se, plus a part of the switch case. This comprises propagating the memory extension through each statement (except switch), as well as establishing the various invariants needed for the switch case (in particular, freshness hypotheses).
The details of the evaluation process for the source switch statement and its target counterpart can be found in the file \textbf{switchRemoval.ma}, along with more details on the transformation itself.

Proving the correctness of the second point would require reasoning on the semantics of \lstinline[language=C]'goto' statements. In the \textsf{Clight} semantics, goto is implemented as a function-wide lookup of the target label. The invariant we would need is that a global label lookup on a freshly created goto is equivalent to a local lookup. This would in turn require the propagation of some freshness hypotheses on labels. For the reasons expressed above, we decided to omit this part of the correctness proof.

\paragraph{Clight to Cminor}

This pass is the last one operating on the \textsf{Clight} intermediate language. Its input is a full \textsf{Clight} program, and its output is a \textsf{Cminor} program. Note that we do not use an equivalent of the C\#minor language: we translate directly to \textsf{Cminor}. This has the advantage of requiring neither the special loop constructs nor the explicit block structure. Another salient point of our approach is that a significant part of the properties needed for the simulation proof were directly encoded in dependently typed translation functions, more precisely the freshness conditions and the well-typedness conditions. The main effects of the transformation from \textsf{Clight} to \textsf{Cminor} are listed below.

\begin{itemize}
\item Variables are classified as being either globals, stack-allocated locals or potentially register-allocated locals. The values of register-allocated local variables are moved out of the modelled memory and stored in a dedicated environment.
\item In \textsf{Clight}, each local variable has a dedicated memory block, whereas in \textsf{Cminor} the stack-allocated locals of a function are bundled together in a single block.
\item Loops are converted to jumps.
\end{itemize}

The first two points require memory injections which are more flexible than those needed in the switch removal case. In the remainder of this section, we briefly discuss our implementation of memory injections, and then the simulation proof.

\subparagraph{Memory injections}

Our memory injections are modelled after the work of Blazy \& Leroy. However, the corresponding paper is based on the first version of the CompCert memory model, whereas we use a much more concrete model allowing byte-level manipulations (as in the later version of CompCert's memory model). We proved roughly 80\% of the required lemmas. Some of the difficulties encountered were notably due to overly relaxed conditions on pointer validity, a problem fixed during development. Some more conditions had to be added to take care of possible overflows when converting from \textbf{Z} block bounds to $16$-bit pointer offsets (in practice, these overflows only occur in edge cases that are easily ruled out, but this fact is not visible in the memory injections). Concretely, some of the lemmas on the preservation of the simulation of loads after writes were axiomatised, for lack of time.

\subparagraph{Simulation proof}

The correctness proof for this transformation has not been completed. We proved the simulation result for expressions and for a subset of the critical statement cases. Notably lacking are the function entry and exit cases, where the memory injection is properly set up. As can be guessed, a significant amount of work has to be performed to show the preservation of all invariants at each simulation step.

\todo{list cases, explain while loop, explain labeling problem}

\section{Checking cost labelling properties}