# Changeset 3191 for Deliverables/D3.4

Ignore:
Timestamp:
Apr 27, 2013, 5:00:40 PM (8 years ago)
Message:

 r3181 \todo{don't use loop structures from CompCert, go straight to jumps} \paragraph{Cast simplification} The parser used in Cerco introduces a lot of explicit type casts. If left as they are, these casts can greatly hamper the quality of the generated code -- especially as the architecture we consider is an $8$ bit one. In \textsf{Clight}, casts are expressions. Hence, most of the work of this transformation proceeds on expressions, and the one-step simulation result one those is easily lifted to statement evaluations. \todo{introduce the explicit type signature of the functions below ?} The tranformation proceeds by recursively trying to coerce an expression to a particular type, which is in practice a integer type smaller than the original one. This functionality is implemented by two mutually recursive functions: \textbf{simplify\_expr} and \textbf{simplify\_inside}. The latter acts as a wrapper for the first one. Whenever \textbf{simplify\_inside} encounters a \textbf{Ecast} expression, it tries to coerce the sub-expression to the desired type using \textbf{simplify\_expr}, which tries to perform the actual coercion. In return, \textbf{simplify\_expr} calls back \textbf{simplify\_inside} in some particular positions, where we decided to be conservative in order to simplify the proofs. However, the current design allows to incrementally revert to a more aggressive version, by replacing recursive calls to \textbf{simplify\_inside} by calls to \textbf{simplify\_expr} \emph{and} proving the corresponding invariants -- where possible. The main problem encountered with this particular pass was dealing with inconsistently typed programs, a canonical case being a particular integer constant of a certain size typed with another size. This prompted the need to introduce numerous type checks, complexifying both the implementation and the proof. \paragraph{Switch removal} Our intermediate languages have no jump tables. Hence, we have to compile the \lstinline[language=C]'switch' constructs away before going from \textsf{Clight} to \textsf{Cminor}. Note that this is not necessarily a bad decision performance-wise for small switches. For instance, compilers such as GCC introduce balanced trees of if-then-else'' constructs. However, our implementation strategy is much simpler. Let us consider the following input statement. \paragraph{Cast simplification.} The parser used in Cerco introduces a lot of explicit type casts. If left as they are, these constructs can greatly hamper the quality of the generated code -- especially as the architecture we consider is an $8$ bit one. In \textsf{Clight}, casts are expressions. Hence, most of the work of this transformation proceeds on expressions. The tranformation proceeds by recursively trying to coerce an expression to a particular integer type, which is in practice smaller than the original one. This functionality is implemented by two mutually recursive functions whose signature is the following. \begin{lstlisting}[language=matita] let rec simplify_expr (e:expr) (target_sz:intsize) (target_sg:signedness) : Σresult:bool×expr. ∀ge,en,m. simplify_inv ge en m e (\snd result) target_sz target_sg (\fst result) ≝ $\ldots$ and simplify_inside (e:expr) : Σresult:expr. conservation e result ≝ $\ldots$ \end{lstlisting} The \textsf{simplify\_inside} acts as a wrapper for \textsf{simplify\_expr}. Whenever \textsf{simplify\_inside} encounters a \textsf{Ecast} expression, it tries to coerce the sub-expression to the desired type using \textsf{simplify\_expr}, which tries to perform the actual coercion. In return, \textsf{simplify\_expr} calls back \textsf{simplify\_inside} in some particular positions, where we decided to be conservative in order to simplify the proofs. However, the current design allows to incrementally revert to a more aggressive version, by replacing recursive calls to \textsf{simplify\_inside} by calls to \textsf{simplify\_expr} \emph{and} proving the corresponding invariants -- where possible. The \textsf{simplify\_inv} invariant encodes either the conservation of the semantics during the transformation corresponding to the failure of the transformation (\textsf{Inv\_eq} constructor), or the successful downcast of the considered expression to the target type (\textsf{Inv\_coerce\_ok}). \begin{lstlisting}[language=matita] inductive simplify_inv (ge : genv) (en : env) (m : mem) (e1 : expr) (e2 : expr) (target_sz : intsize) (target_sg : signedness) : bool → Prop ≝ | Inv_eq : ∀result_flag. $\ldots$ simplify_inv ge en m e1 e2 target_sz target_sg result_flag | Inv_coerce_ok : ∀src_sz,src_sg. (typeof e1) = (Tint src_sz src_sg) → (typeof e2) = (Tint target_sz target_sg) → (smaller_integer_val src_sz target_sz src_sg (exec_expr ge en m e1) (exec_expr ge en m e2)) → simplify_inv ge en m e1 e2 target_sz target_sg true. \end{lstlisting} The \textsf{conservation} invariant simply states the conservation of the semantics, as in the \textsf{Inv\_eq} constructor of the previous invariant. \begin{lstlisting}[language=matita] definition conservation ≝ λe,result. ∀ge,en,m. res_sim ? (exec_expr ge en m e) (exec_expr ge en m result) ∧ res_sim ? (exec_lvalue ge en m e) (exec_lvalue ge en m result) ∧ typeof e = typeof result. \end{lstlisting} This invariant is then easily lifted to statement evaluations. The main problem encountered with this particular pass was dealing with inconsistently typed programs, a canonical case being a particular integer constant of a certain size typed with another size. This prompted the need to introduce numerous type checks, complexifying both the implementation and the proof. \todo{Make this a particular case of the more general statement on baking more invariants in the Clight language} \paragraph{Switch removal.} The intermediate languages of the front-end have no jump tables. Hence, we have to compile the \lstinline[language=C]'switch' constructs away before going from \textsf{Clight} to \textsf{Cminor}. Note that this transformation does not necessarily deteriorate the efficiency of the generated code. For instance, compilers such as GCC introduce balanced trees of if-then-else'' constructs for small switches. However, our implementation strategy is much simpler. Let us consider the following input statement. \begin{lstlisting}[language=C] \end{lstlisting} Note that stmt1, stmt2, \ldots stmt\_default may contain "break" statements, which have the effect of exiting the switch statement. In the absence of break, the execution falls through each case sequentially. In our current implementation, Note that \textsf{stmt1}, \textsf{stmt2}, \ldots \textsf{stmt\_default} may contain \lstinline[language=C]'break' statements, which have the effect of exiting the switch statement. In the absence of break, the execution falls through each case sequentially. In our current implementation, we produce an equivalent sequence of if-then'' chained by gotos: \begin{lstlisting}[language=C] we decided to omit this part of the correctness proof. \paragraph{Clight to Cminor} \paragraph{Clight to Cminor.} This pass is the last one operating on the Clight intermediate language. discuss our implementation of memory injections, and then the simulation proof. \subparagraph{Memory injections} \subparagraph{Memory injections.} Our memory injections are modelled after the work of Blazy \& Leroy. loads after writes were axiomatised, for lack of time. \subparagraph{Simulation proof} \subparagraph{Simulation proof.} The correction proof for this transformation was not terminated. We proved the