# Changeset 3233 for Deliverables

Timestamp: Apr 30, 2013, 5:00:37 PM
Message: passed spell checker, added description of the cerco wrapper, minor changes to the cost plug-in section

File: 1 edited

\documentclass[11pt]{article}
\usepackage{../style/cerco}

\begin{large}
Main Authors:\\
Roberto M. Amadio, Nicolas Ayache, François Bobot, Jaap Boender,
Brian Campbell, Ilias Garnier, Antoine Madet, James McKinna,
Dominic P. Mulligan, Mauro Piccolo, Yann R\'egis-Gianas,
Claudio Sacerdoti Coen, Ian Stark and Paolo Tranquilli
\end{large}
\end{center}
\vspace*{7cm}

\paragraph{Abstract}
The trusted CerCo Prototype is meant to be the last piece of software
produced in the CerCo project. It consists of
\begin{enumerate}

user by producing an instrumented C code obtained by injecting into the
original source code some instructions to update three global variables
that record the current clock, the current stack usage and the maximum
stack usage so far.
\item A plug-in for the Frama-C Software Analyser architecture. The plug-in
takes as input a C file, compiles it with the CerCo compiler and then

provers to automatically verify them. Those that are not automatically
verified can be manually checked by the user using a theorem prover.
\item A wrapper that interfaces and integrates the two pieces of software
above with the Frama-C Jessie plug-in and with the Why3 platform.
\end{enumerate}

The Grant Agreement describes deliverable D5.2 as follows:
\begin{quotation}
\textbf{Trusted CerCo Prototype}: Final, fully trusted version of the
system.
\end{quotation}

}
\end{verbatim}
\caption{A simple C program.}
\label{test1}
\end{figure}

}
\end{verbatim}
\caption{The instrumented version of the program in \autoref{test1}.}
\label{itest1}
\end{figure}

Let the file \texttt{test.c} contain the code presented in \autoref{test1}.
By calling \texttt{acc -a test.c}, the user obtains the following files:
\begin{itemize}
\item \texttt{test-instrumented.c} (\autoref{itest1}): the file is a copy
of \texttt{test.c} obtained by adding code that keeps track of the amount
of time and space used by the source code.
\item \texttt{test.hex}: the file is an Intel HEX representation of the
object code for an 8051 microprocessor. The file can be loaded in any
8051 emulator (like the MCU 8051 IDE) for running or disassembling it.
Moreover, an executable semantics (an emulator) for the 8051 is linked
with the CerCo compilers and can be used to run the object code at the

compilers for the intermediate passes are not the same and we do not yet
output any intermediate syntax for the assembly code. The latter can be
obtained by disassembling the object code, e.g.\ by using the MCU 8051 IDE.
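To give a concrete idea of the instrumentation, the following sketch shows
the shape of the code added to \texttt{test-instrumented.c}. The sketch is
illustrative: only the \texttt{\_\_cost} variable is named elsewhere in
this document; the names of the two stack counters and all the constants
are invented for the example.
\begin{verbatim}
int __cost      = 0;   /* the current clock              */
int __stack     = 0;   /* the current stack usage        */
int __stack_max = 0;   /* the maximum stack usage so far */

void __cost_incr(int k) { __cost += k; }

void __stack_incr(int k)
{
  __stack += k;
  if (__stack > __stack_max) __stack_max = __stack;
}

int f(int x)
{
  __stack_incr(4);    /* entering f: its frame takes 4 bytes */
  __cost_incr(87);    /* cost of the first basic block of f  */
  int r = x + x;
  __stack_incr(-4);   /* leaving f: the frame is released    */
  return r;
}
\end{verbatim}
The constants are computed by the compiler from the object code it actually
generates, so that the instrumented source accounts for the time and stack
space consumed by the compiled program.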
\subsection{Code structure}

This is the code that will eventually be fully certified using Matita.
It is fully contained in the \texttt{extracted} directory (not comprising
the subdirectory \texttt{untrusted}).
It translates the source C-light code to:
\begin{itemize}

at the beginning of basic blocks. The emitted labels are the ones
that are observed calling \texttt{acc -is}. They will be replaced in the
final instrumented code with increments of the \texttt{\_\_cost}
variable.
\item The object code for the 8051 as a list of bytes to be loaded in
code memory. The code is coupled with a trie over bit vectors that maps
program counters to cost labels \texttt{k}. The intended semantics is
that, when the current program counter is associated to \texttt{k},

describe. A sketch of how an emulator can exploit this map is given below.
\item Untrusted code called by the trusted compiler (directory
\texttt{extracted/untrusted}). The two main untrusted components in this
directory are
\begin{itemize}

which pseudo-registers need spilling; finally it colours it again using
real registers and spilling slots as colours to assign to each
pseudo-register either a spilling slot or a real register.
\end{itemize}

Clight sources. The reason for the slowness is currently under
investigation. It is likely to be due to the quality of the extracted code
(see \autoref{quality}).
\item The back-ends of both compilers do not handle external functions
because we have not implemented a linker. The trusted compiler fails
during compilation, while the untrusted compiler silently turns every
external function call into a \texttt{NOP}.
\item The untrusted compiler had \emph{ad hoc} options to deal with C files
generated by a Lustre compiler. The \emph{ad hoc} code simplified the C
files by avoiding some calls to external functions and it added some
debugging code to print the actual reaction time of every Lustre node.
The trusted compiler does not implement any \emph{ad hoc} Lustre option yet.
\end{itemize}
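The following sketch illustrates how an emulator can exploit the map from
program counters to cost labels described above: whenever the program
counter reaches an address that carries a cost label, the statically
computed cost of the corresponding block is charged to the clock. The
array-based flattening of the trie and all the names are invented for the
example.
\begin{verbatim}
#include <stdint.h>

#define CODE_SIZE 0x10000   /* 8051 code memory */

/* cost_label_of[pc] is the cost label whose block starts at pc,
   or 0 when no cost label is associated to pc.                  */
static uint16_t cost_label_of[CODE_SIZE];

/* the static cost model: cycles charged when the block labelled
   by k is entered                                               */
static uint32_t cost_of_label[CODE_SIZE];

static uint32_t clock_cycles = 0;

/* decode and execute one instruction, returning the new program
   counter (the actual executable semantics is elided)           */
extern uint16_t execute_at(uint16_t pc);

/* one emulation step */
uint16_t step(uint16_t pc)
{
  uint16_t k = cost_label_of[pc];
  if (k != 0)
    clock_cycles += cost_of_label[k];
  return execute_at(pc);
}
\end{verbatim}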
\subsection{Implementation differences w.r.t.\ the untrusted compiler}
\label{quality}

The code of the trusted compiler greatly differs from the code of the
untrusted prototype. The main architectural difference is a unified
syntax (data structure), semantics and pretty-printing for every back-end
language. In order to accommodate the differences between the original
languages, the syntax and semantics have been abstracted over a number of
parameters, like the type of arguments of the instructions. For example,
early languages use pseudo-registers to hold data while late languages
store data into real machine registers or stack locations. The unification
of the languages has brought a few benefits and can potentially bring new
ones in the future:
\begin{itemize}

reducing to 1/6th the number of lemmas to be proved (6 is the number of
back-end intermediate languages).
Moreover, several back-end passes ---a pass between two alternative
semantics for RTL, the RTL to ERTL pass and the ERTL to LTL pass---
transform a graph instance of the generic back-end intermediate language
to another graph instance. The graph-to-graph transformation has also been
generalised and parametrised over the pass-specific details. While the
code saved in this way is not much, several significant lemmas are
provided once and for all on the

\item Some passes and several proofs can be given in greater generality,
allowing more reuse. For example, in the untrusted prototype the
LTL to LIN pass was turning a graph language into a linearised language
with the very same instructions and similar semantics. In particular, the
two semantics shared the same execute phase, while fetching was different.

one of the two representations and a static single assignment (SSA) one.
As a final observation, the insertion of another graph-based language after
the LTL one is now made easy: the linearisation pass need not be redone
for the new pass.
\item Pass commutation and reuse.
Every pass is responsible for reducing a difference between the
source and target languages. For example, the RTL to ERTL pass

However, following Coq's tradition, detection of the useless parts is not
done according to the computationally expensive algorithm by
Berardi~\cite{berardi1,berardi2}. Instead, the user decides which data
structures should be assigned computational interest by declaring them in
one of the \texttt{Type\_i} sorts of the Calculus of (Co)Inductive
Constructions.

is employed, which is our choice for CerCo. Under this discipline, terms
can be passed to type formers. For example, the data type for back-end
languages in CerCo is parametrised over the list of global variables that
may be referred to by the code. Another example is the type of vectors,
which is parametrised over a natural number that is the size of the vector,
or the type of vector tries, which is parametrised over the fixed height of
the tree and that can be read and updated only using keys (vectors of
bits) whose length is the height of the trie. Functions that compute these
dependent types also have to compute the new indexes (parameters) for the
types, even if this information is used only for typing. For example,
appending two vectors requires the computation of the length of the result
vector just to type the result. In turn, this computation requires the
lengths of the two vectors in input. Therefore, functions that call append
have to compute the length of the vectors to append even if the lengths
will actually be ignored by the extracted code of the append function.
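Schematically, in Matita-like notation, the type of the append function
over vectors is
\[
\mathtt{append} : \forall n,m : \mathbb{N}.\;
\mathrm{Vector}\;A\;n \rightarrow \mathrm{Vector}\;A\;m \rightarrow
\mathrm{Vector}\;A\;(n+m)
\]
The lengths $n$ and $m$ occur only as type indexes, yet every caller must
construct and pass them, and the index $n+m$ of the result must be computed
as well, even though all of these values are ignored by the extracted code.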
In the literature there are proposals for allowing the user to more
accurately distinguish computational from non-computational arguments of
functions. The proposals introduce two different types of
$\lambda$-abstractions and \emph{ad hoc} typing rules to ensure that
computationally irrelevant bound variables are not used in computationally
relevant positions. An OCaml prototype that implements these ideas for Coq
is available~\cite{implicitcoq}, but it is heavily bugged. We have not yet
tried anything along these lines in Matita. To avoid modifying the system,
another approach, based on the explicit use of a non-computational monad,
has also been proposed in the literature, but it introduces many
complications in the formalization and it cannot be used in every
situation. Improvement of the code extraction to more aggressively remove
irrelevant code from the code extracted from Matita is left as future work.
At the moment, it seems that useless computations are indeed responsible

the upper-right corner of the Barendregt cube, can be re-typed in System
$F_\omega$, the corresponding typed lambda calculus without dependent
types~\cite{coctofomega}. The calculi implemented by Coq and Matita,
however, are more expressive than CoC, and several type constructions have
no counterparts in System $F_\omega$. Moreover, core OCaml does not even

both as term formers and type formers, according to the arguments that are
passed to them. In System $F_\omega$, instead, terms and types are
syntactically distinct. Extracting terms according to all their possible
uses may be impractical because the number of uses is exponential in the
number of arguments of sort $Type_i$ with $i \geq 2$.
\item Case analysis and recursion over inhabitants of primitive inductive

$F_\omega$. In the CerCo compiler we largely exploit type formers declared
in this way, for example to provide the same level of type safety achieved
in the untrusted compiler via polymorphic variants~\cite{garrigue98}.
In particular, we have terms that syntactically describe, as first-class
citizens, the large number of combinations of operand modes of object code

fragment. Sometimes simple code transformations could be used to make the
function typeable, but the increased complexity of the extracted code would
outweigh the benefits.
\end{enumerate}
\end{itemize}
\item The two most recent versions of OCaml have introduced first-class
modules, which are exactly the feature needed for extracting code that uses
records containing both types and term declarations. However, the syntax
required for first-class modules is extremely cumbersome and it requires
the explicit introduction of type expressions to make manifest the type

\texttt{-cost-acc} flag of the plug-in can be used to select the executable
to be used for compilation. The code of the plug-in has been modified
w.r.t.\ D5.1 to address two issues. On the one hand, the analysis of the
stack-size consumption has been integrated into it. From the user's point
of view, time and space cost annotations and invariants are handled in a
similar way. Nevertheless, we expect automated theorem provers to face more
difficulties

Most C programs, and in particular those used in time-critical systems,
avoid recursive functions. On the other hand, the plug-in has been updated
to take advantage of the new Why3 platform.
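Concretely, reusing the counters and helper functions of the
instrumentation sketch above, the specifications synthesised by the plug-in
have roughly the following shape (the exact ACSL clauses and the bounds are
invented for illustration):
\begin{verbatim}
/*@ ensures __cost <= \old(__cost) + 120;
    ensures __stack_max <= \old(__stack_max) + 16; */
int f(int x)
{
  __stack_incr(16);
  __cost_incr(120);
  int r = x + x;
  __stack_incr(-16);
  return r;
}
\end{verbatim}
The time bound only involves linear arithmetic over the increments met
along each program path, which automated provers handle well; the stack
bound has the same shape, although, as noted above, we expect provers to
face more difficulties with it.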
\section{The \texttt{cerco} Wrapper}

The Why3 platform is a complete rewrite of the old Why2 one. The update has
introduced several additional steps to enable the use of the cost plug-in
in conjunction with the Jessie one and with the automatic and interactive
theorem provers federated by the Why3 platform, mainly because the Jessie
plug-in still uses Why2. These steps, which required either tedious manual
commands or a complicated makefile, have prompted us to write a script
wrapping all the functionalities provided by the software described in this
deliverable.
\begin{verbatim}
Syntax: cerco [-ide] [-untrusted] filename.c
\end{verbatim}
The C file provided is processed by the cost plug-in and then passed on to
the Why3 platform. The two available options control the following
features.
\begin{itemize}
\item \verb+-ide+: launch the Why3 interactive graphical interface for
fine-grained control over proving the synthesised program invariants. If
the option is not provided, the script launches all available automatic
theorem provers with a 5-second time-out, and just reports failure or
success.
\item \verb+-untrusted+: if it is installed, use the untrusted prototype
rather than the trusted one (which is the default behaviour).
\end{itemize}
The minimum dependencies for the use of this script are
\begin{itemize}
\item either the trusted or the untrusted \verb+acc+ CerCo compiler;
\item both Why2 and Why3;
\item the cost and Jessie plug-ins.
\end{itemize}
However, it is recommended to install as many Why3-compatible automatic
provers as possible to maximise the effectiveness of the command. In our
experience, the provers provided by default were not very effective.

\section{Connection with other deliverables}

\end{itemize}

\bibliographystyle{unsrt}
\bibliography{report}
\end{document}