Changeset 3233


Ignore:
Timestamp:
Apr 30, 2013, 5:00:37 PM (4 years ago)
Author:
tranquil
Message:

passed spell checker, added description of the cerco wrapper, minor changes to the cost plug-in section

File:
1 edited

Legend:

Unmodified
Added
Removed
  • Deliverables/D5.2/report.tex

    r3109 r3233  
    1 \documentclass[11pt, epsf, a4wide]{article}
     1\documentclass[11pt]{article}
    22
    33\usepackage{../style/cerco}
     
    7777\begin{large}
    7878Main Authors:\\
    79 XXXX %Dominic P. Mulligan and Claudio Sacerdoti Coen
     79Roberto M. Amadio, Nicolas Ayache, François Bobot, Jaap Boender, Brian Campbell, Ilias Garnier,
     80Antoine Madet, James McKinna,
     81Dominic P. Mulligan, Mauro Piccolo, Yann R\'egis-Gianas,
     82Claudio Sacerdoti Coen, Ian Stark and Paolo Tranquilli
    8083\end{large}
    8184\end{center}
     
    9699\vspace*{7cm}
    97100\paragraph{Abstract}
    98 The Trusted CerCo Prototype is meant to be the last piece of software produced
     101The trusted CerCo Prototype is meant to be the last piece of software produced
    99102in the CerCo project. It consists of
    100103\begin{enumerate}
     
    105108   user by producing an instrumented C code obtained injecting in the original
    106109   source code some instructions to update three global variables that record
    107    the current clock, the current stack usage and the max stack usage so far.
     110   the current clock, the current stack usage and the maximum stack usage so far.
    108111 \item A plug-in for the Frama-C Software Analyser architecture. The plug-in
    109112   takes in input a C file, compiles it with the CerCo compiler and then
     
    112115   provers to automatically verify them. Those that are not automatically
    113116   verified can be manually checked by the user using a theorem prover.
     117 \item A wrapper that interfaces and integrates the two pieces of software
     118   above with the Frama-C Jessie plugin and with the Why3 platform.
     119 
    114120\end{enumerate}
    115121
     
    135141The Grant Agreement describes deliverable D5.2 as follows:
    136142\begin{quotation}
    137 \textbf{Trusted CerCo Prototype}: Final, fully trustable version of the
     143\textbf{Trusted CerCo Prototype}: Final, fully trusted version of the
    138144system.
    139145\end{quotation}
     
    177183}
    178184\end{verbatim}
    179 \caption{A simple C program~\label{test1}}
     185\caption{A simple C program.}
     186\label{test1}
    180187\end{figure}
    181188
     
    227234}
    228235\end{verbatim}
    229 \caption{The instrumented version of the program in Figure~\ref{test1}\label{itest1}}
     236\caption{The instrumented version of the program in \autoref{test1}.}
     237\label{itest1}
    230238\end{figure}
    231239
    232 Let the file \texttt{test.c} contains the code presented in Figure~\ref{test1}.
     240Let the file \texttt{test.c} contains the code presented in \autoref{test1}.
    233241By calling \texttt{acc -a test.c}, the user obtains the following files:
    234242\begin{itemize}
    235  \item \texttt{test-instrumented.c} (Figure~\ref{itest1}): the file is a copy
     243 \item \texttt{test-instrumented.c} (\autoref{itest1}): the file is a copy
    236244   of \texttt{test.c} obtained by adding code that keeps track of the amount
    237245   of time and space used by the source code.
     
    277285 \item \texttt{test.hex}: the file is an Intel HEX representation of the
    278286   object code for an 8051 microprocessor. The file can be loaded in any
    279    8051 emulator (like the MCU 8051 IDE) for running or dissassemblying it.
     287   8051 emulator (like the MCU 8051 IDE) for running or disassembling it.
    280288   Moreover, an executable semantics (an emulator) for the 8051 is linked
    281289   with the CerCo compilers and can be used to run the object code at the
     
    303311compilers for the intermediate passes are not the same and we do not output
    304312yet any intermediate syntax for the assembly code. The latter can be obtained
    305 by disassemblying the object code, e.g. by using the MCU 8051 IDE.
     313by disassembling the object code, e.g. by using the MCU 8051 IDE.
    306314
    307315\subsection{Code structure}
     
    311319   This is the code that will eventually be fully certified using Matita.
    312320   It is fully contained in the \texttt{extracted} directory (not comprising
    313    the subdirectory \texttt{unstrusted}).
     321   the subdirectory \texttt{untrusted}).
    314322   It translates the source C-light code to:
    315323   \begin{itemize}
     
    318326      at the beginning of basic blocks. The emitted labels are the ones that
    319327      are observed calling \texttt{acc -is}. They will be replaced in the
    320       final instrumented code with incrementes of the \texttt{\_\_cost}
     328      final instrumented code with increments of the \texttt{\_\_cost}
    321329      variable.
    322330    \item The object code for the 8051 as a list of bytes to be loaded in
    323       code memory. The code is coupled with a trie over bitvectors that maps
     331      code memory. The code is coupled with a trie over bit vectors that maps
    324332      program counters to cost labels \texttt{k}. The intended semantics is
    325333      that, when the current program counter is associated to \texttt{k},
     
    333341   describe.
    334342 \item Untrusted code called by trusted compiler (directory
    335    \texttt{extracted/unstrusted}). The two main untrusted components in this
     343   \texttt{extracted/untrusted}). The two main untrusted components in this
    336344    directory are
    337345  \begin{itemize}
     
    370378    which pseudo-registers need spilling;
    371379    finally it colours it again using real registers and spilling slots as
    372     colours to assign to each pseudoregister either a spilling slot or a
     380    colours to assign to each pseudo-register either a spilling slot or a
    373381    real register.
    374382  \end{itemize}
     
    430438   Clight sources. The reason for the slowness is currently under investigation.
    431439   It is likely to be due to the quality of the extracted code (see
    432    Section~\ref{quality}).
     440   \autoref{quality}).
    433441 \item The back-ends of both compilers do not handle external functions
    434442   because we have not implemented a linker. The trusted compiler fails
    435443   during compilation, while the untrusted compiler silently turns every
    436444   external function call into a \texttt{NOP}.
    437  \item The untrusted compiler had ad-hoc options to deal with C files
    438    generated from a Lustre compiler. The ad-hoc code simplified the C files
     445 \item The untrusted compiler had \emph{ad hoc} options to deal with C files
     446   generated from a Lustre compiler. The \emph{ad hoc} code simplified the C files
    439447   by avoiding some calls to external functions and it was adding some
    440448   debugging code to print the actual reaction time of every Lustre node.
    441    The trusted compiler does not implement any ad-hoc Lustre option yet.
     449   The trusted compiler does not implement any \emph{ad hoc} Lustre option yet.
    442450\end{itemize}
    443451
    444 \subsection{Implementative differences w.r.t. the untrusted compiler}\label{quality}
     452\subsection{Implementation differences w.r.t.\ the untrusted compiler}
    445453The code of the trusted compiler greatly differs from the code of the
    446454untrusted prototype. The main architectural difference is the one of
     
    448456a unified syntax (data-structure), semantics and pretty-printing for every
    449457back-end language.
    450 In order to accomodate the differences between the original languages, the
     458In order to accommodate the differences between the original languages, the
    451459syntax and semantics have been abstracted over a number of parameters, like
    452460the type of arguments of the instructions. For example, early languages use
    453461pseudo-registers to hold data while late languages store data into real machine
    454 registers or stack locations. The unification of the languages have bringed
     462registers or stack locations. The unification of the languages have brought
    455463a few benefits and can potentially bring new ones in the future:
    456464\begin{itemize}
     
    462470   reducing to 1/6th the number of lemmas to be proved (6 is the number of
    463471   back-end intermediate languages). Moreover, several back-end passes
    464    --- a pass between two alternative semantics for RTL, the RTL to ERTL pass
    465    and the ERTL to LTL pass --- transform a graph instance of the generic
     472   ---a pass between two alternative semantics for RTL, the RTL to ERTL pass
     473   and the ERTL to LTL pass--- transform a graph instance of the generic
    466474   back-end intermediate language to another graph instance. The
    467    graph-to-graph transformation has also been generalized and parameterized
     475   graph-to-graph transformation has also been generalized and parametrised
    468476   over the pass-specific details. While the code saved in this way is not
    469477   much, several significant lemmas are provided once and for all on the
     
    479487
    480488 \item Some passes and several proofs can be given in greater generality,
    481    allowing more reusal.
     489   allowing more reuse.
    482490
    483491   For example, in the untrusted prototype the
    484    LTL to LIN pass was turning a graph language into a linearized language
     492   LTL to LIN pass was turning a graph language into a linearised language
    485493   with the very same instructions and similar semantics. In particular, the
    486494   two semantics shared the same execute phase, while fetching was different.
     
    505513   one of the two representations and a static single assignment (SSA) one.
    506514   As a final observation, the insertion of another graph-based language after
    507    the LTL one is now made easy: the linearization pass needs not be redone
     515   the LTL one is now made easy: the linearisation pass needs not be redone
    508516   for the new pass.
    509517   
    510 \item Pass commutation and reusal.
     518\item Pass commutation and reuse.
    511519   Every pass is responsible for reducing a difference
    512520   between the source and target languages. For example, the RTL to ERTL pass
     
    560568   However, following Coq's tradition, detection of the useless parts is not
    561569   done according to the computationally expensive algorithm by
    562    Berardi~\cite{berardixxx}. Instead, the user decides which data structures
     570   Berardi~\cite{berardi1,berardi2}. Instead, the user decides which data structures
    563571   should be assigned computation interest by declaring them in one of the
    564572   \texttt{Type\_i} sorts of the Calculus of (Co)Inductive Constructions.
     
    576584   is employed, which is our choice for CerCo. Under this discipline, terms
    577585   can be passed to type formers. For example, the data type for back-end
    578    languages in CerCo is parameterized over the list of global variables that
     586   languages in CerCo is parametrised over the list of global variables that
    579587   may be referred to by the code. Another example is the type of vectors
    580    that is parameterized over a natural which is the size of the vector, or
    581    the type of vector tries which is parameterized over the fixed height of
     588   that is parametrised over a natural which is the size of the vector, or
     589   the type of vector tries which is parametrised over the fixed height of
    582590   the tree and that can be read and updated only using keys (vectors of
    583    bits) whose lenght is the height of the trie. Functions that compute these
     591   bits) whose length is the height of the trie. Functions that compute these
    584592   dependent types also have to compute the new indexes (parameters) for the
    585593   types, even if this information is used only for typing. For example,
    586    appending two vectors require the computation of the lenght of the result
     594   appending two vectors require the computation of the length of the result
    587595   vector just to type the result. In turn, this computation requires the
    588    lenghts of the two vectors in input. Therefore, functions that call append
    589    have to compute the length of the vectors to append even if the lenghts
     596   lengths of the two vectors in input. Therefore, functions that call append
     597   have to compute the length of the vectors to append even if the lengths
    590598   will actually be ignored by the extracted code of the append function.
    591599
    592    In the litterature there are proposals for allowing the user to more
     600   In the literature there are proposals for allowing the user to more
    593601   accurately distinguish computational from non computational arguments of
    594602   functions. The proposals introduce two different types of
    595    $\lambda$-abstractions and ad-hoc typing rules to ensure that computationally
     603   $\lambda$-abstractions and \emph{ad hoc} typing rules to ensure that computationally
    596604   irrelevant bound variables are not used in computationally relevant
    597605   positions. An OCaml prototype that implements these ideas for Coq is
    598    available~\cite{xxyy}, but heavily bugged. We did not try yet to do
     606   available~\cite{implicitcoq}, but heavily bugged. We did not try yet to do
    599607   anything along these lines in Matita. To avoid modifying the system,
    600608   another approach based on the explicit use of a non computational monad
    601    has been also proposed in the litterature, but it introduces many
     609   has been also proposed in the literature, but it introduces many
    602610   complications in the formalization and it cannot be used in every situation.
    603611
    604    Improvement of the code extraction to more aggresively remove irrelevant
     612   Improvement of the code extraction to more aggressively remove irrelevant
    605613   code from code extracted from Matita is left as future work.
    606614   At the moment, it seems that useless computations are indeed responsible
     
    614622   the upper-right corner of Barendregt cube, can be re-typed in System
    615623   $F_\omega$, the corresponding typed lambda calculus without dependent
    616    types~\cite{christinexx}. The calculi implemented by Coq and Matita, however,
     624   types~\cite{coctofomega}. The calculi implemented by Coq and Matita, however,
    617625   are more expressive than CoC, and several type constructions have no
    618626   counterparts in System $F_\omega$. Moreover, core OCaml does not even
     
    637645    both as term formers and type formers, according to the arguments that are
    638646    passed to them. In System $F_\omega$, instead, terms and types are
    639     syntactially distinct. Extracting terms according to all their possible
    640     uses may be unpractical because the number of uses is exponential in the
     647    syntactically distinct. Extracting terms according to all their possible
     648    uses may be impractical because the number of uses is exponential in the
    641649    number of arguments of sort $Type_i$ with $i \geq 2$.
    642650   \item Case analysis and recursion over inhabitants of primitive inductive
     
    644652    $F_\omega$. In the CerCo compiler we largely exploit type formers declared
    645653    in this way, for example to provide the same level of type safety achieved
    646     in the untrusted compiler via polymorphic variants~\cite{guarriguesxx}.
     654    in the untrusted compiler via polymorphic variants~\cite{garrigue98}.
    647655    In particular, we have terms to syntactically describe as first class
    648656    citizens the large number of combinations of operand modes of object code
     
    665673    fragment. Sometimes simple code transformations could be used to make the
    666674    function typeable, but the increased extraction code complexity would
    667     outweight the benefits.
     675    outweigh the benefits.
    668676 \end{enumerate}
    669677\end{itemize}
     
    693701 \item The two most recent versions of OCaml have introduced first class
    694702  modules, which are exactly the feature needed for extracting code that uses
    695   records contining both types and term declarations. However, the syntax
     703  records containing both types and term declarations. However, the syntax
    696704  required for first class modules is extremely cumbersome and it requires
    697705  the explicit introduction of type expressions to make manifest the type
     
    722730\texttt{-cost-acc} flag of the plug-in can be used to select the executable
    723731to be used for compilation.
    724 
    725 The code of the plug-in has been modified w.r.t. D5.1 in order to take care
    726 also of the cost model for stack-size consumption. From the user point of view,
     732The code of the plug-in has been modified w.r.t. D5.1 to address two issues.
     733
     734On the one side, the analysis of the stack-size consumption has been integrated
     735into it. From the user point of view,
    727736time and space cost annotations and invariants are handled in a similar way.
    728737Nevertheless, we expect automated theorem provers to face more difficulties
     
    734743Most C programs, and in particular those used in time critical systems, avoid
    735744recursive functions.
     745
     746On the other side, the plug-in has been updated to take advantage of the new
     747Why3 platform.
     748
     749\section{The \texttt{cerco} Wrapper}
     750The Why3 platform is a complete rewrite of the old Why2 one. The update
     751has triggered several additional passages to enable the use of the cost plug-in
     752in conjunction with the Jessie one and the automatic and interactive theorem
     753provers federated by the Why3 platform, mainly because the Jessie plug-in still
     754uses Why2. These passages, which required either
     755tedious manual commands or a complicated makefile, have prompted us to write
     756a script wrapping all the functionalities provided by the software described
     757in this deliverable.
     758\begin{verbatim}
     759Syntax: cerco [-ide] [-untrusted] filename.c
     760\end{verbatim}
     761The C file provided is processed via the cost plug-in and then to the Why3 platform.
     762The two available options command the following features.
     763\begin{itemize}
     764 \item \verb+-ide+: launch the Why3 interactive graphical interface for a
     765 fine-grained control on proving the synthesised program invariants. If not provided,
     766 the script will launch all available automatic theorem provers with a 5 second
     767 time-out, and just report failure or success.
     768 \item \verb+-untrusted+: if it is installed, use the untrusted prototype
     769 rather than the trusted one (which is the default behaviour).
     770\end{itemize}
     771The minimum dependencies for the use of this script are
     772\begin{itemize}
     773 \item either the trusted or the untrusted \verb+acc+ CerCo compiler;
     774 \item both Why2 and Why3;
     775 \item the cost and Jessie plus-ins.
     776\end{itemize}
     777However it is recommended to install as much Why3-compatible automatic provers
     778as possible to maximise the effectiveness of the command. The provers provided
     779by default were not very effective in our experience.
    736780
    737781\section{Connection with other deliverables}
     
    770814\end{itemize}
    771815
     816\bibliographystyle{unsrt}
     817\bibliography{report}
    772818\end{document}
Note: See TracChangeset for help on using the changeset viewer.