A program's non-functional constraints may be given \emph{concretely} or \emph{asymptotically}. Asymptotic complexity, as every Computer Science undergraduate knows, is important---but so too is concrete complexity for many applications, including the three we highlighted above as examples. A real-time system's response time is measured in seconds, milliseconds, or some other fundamental unit of time; a cryptographic library must have all execution paths execute in the same number of processor cycles, independent of any input passed by the client; and the size of an embedded controller for a pacemaker is measured in bits, bytes, or some other unit of computer memory capacity. In all cases, resource consumption is measured (and specified) in some basal, concrete unit of measure. Currently, a program's functional properties can be established by combining user annotations---preconditions, invariants, and so on---with various automated and semi-automated analyses---invariant generators, type systems, abstract interpretations, applications of theorem proving, and so on---operating on the high-level source code of the program. Functional properties of a program are therefore established by reasoning about the source code that the application programmer actually sees and writes.
Further, the results of any analysis can be communicated to the programmer in terms of abstractions, control flow, and an overall system design with which they are familiar. By contrast, a program's concrete, non-functional properties are established by reasoning on low-level object code produced not by a programmer, but by a compiler. Whilst analyses operating at this level can and do produce very accurate results---Worst Case Execution Time (WCET) analysis can be extraordinarily accurate, for example---analysis at such a low level of abstraction invariably has disadvantages: \begin{itemize} \end{itemize} More ideal would be a high-level analysis of concrete, non-functional properties, coupled with the accuracy one would expect of, for example, existing WCET analysers. This would lead to a reconciliation of functional and non-functional analyses: information could be shared between the two, and both could be performed concurrently on the same high-level source code. What has previously prevented high-level reasoning about non-functional properties is the lack of a uniform and precise cost model for programs written in programming languages such as C.
Since modern compilers---as discussed previously---may compile each occurrence of an expression or statement in radically different ways, since optimisations may change control flow, and since the cost of an object code instruction may depend on the runtime state of hardware components such as pipelines and caches, none of which are visible in the source code, it has not been clear how one could go about defining such a cost model. In this paper, we report on the scientific and technical contributions of the Certified Complexity (`CerCo') project, an EU-funded collaboration between the Universities of Bologna, Edinburgh, and Paris 7. We first present a uniform, precise cost model for a large fragment of the C programming language---\emph{CerCo C}, which is similar to early versions of \emph{CompCert C}. To define this cost model, we have developed a technical device, dubbed \emph{the labelling approach}~\cite{labelling}: a technique for implementing compilers that induce cost models on source programs by very lightweight tracking of code changes through the compilation pipeline. To establish that our approach works, and scales beyond `toy' languages to more realistic programming languages, we have implemented a compiler using this technique. The \emph{CerCo verified C compiler} compiles the CerCo C language fragment to object binaries for the MCS-51 8-bit embedded microcontroller, a well-known and still popular processor in the embedded systems community. Our compiler lifts the cost model induced on the machine code it produces back through the compilation chain to the source level, where it is made manifest as parametric cost annotations inserted into the original source file of the program being compiled. These annotations may be used by the programmer to predict concrete execution time and stack space usage. As we are targeting an embedded microcontroller, we have not considered dynamic memory allocation.
However, we believe that our labelling approach is general enough to handle resources other than execution time and stack space usage, including dynamic memory allocation. To demonstrate source-level verification of costs, we have implemented a Frama-C plugin~\cite{framac} that invokes our compiler on a source program and uses it to generate invariants on the high-level source that correctly model low-level costs. The plugin certifies that the program respects these costs by calling automated theorem provers, a novel technique in the field of static analysis for non-functional properties. Such source-level reasoning also applies when components have been specified but not yet implemented, as modularity means that it is enough to specify the non-functional behaviour of the missing components. Using this plugin, we have conducted case studies, including showing that the plugin can automatically compute and certify the exact reaction time of Lustre~\cite{lustre} data flow programs compiled into C. The compiler is verified using the Matita interactive theorem prover, an implementation of the Calculus of Coinductive Constructions developed at the University of Bologna.
\subsection{Project context and approach}