# Changeset 3647

Timestamp:
Mar 13, 2017, 12:10:59 PM
Message:

Reworked section on object-code approaches to resource analysis. Lots of new citations...

Location:
Papers/jar-cerco-2017
Files:
2 edited


The concrete, non-functional properties of programs---time and space usage, for example---are an important factor in the specification of a program, and therefore in overall program correctness. Indeed, for many application domains, concrete complexity is a critical component of overall correctness: `real time' programs, with bounds on application response time, and libraries exporting cryptographic primitives, designed to be impervious to timing side-channel attacks, are two important classes of application fitting this pattern, for example.

Worst Case Execution Time (WCET) static analysers currently represent the state of the art in determining accurate concrete resource bounds for programs. These tools possess a number of disadvantages, however. For instance, their analyses are at once sophisticated and opaque, with trust in the analysis devolving to trust in the complex WCET tool implementation. Further, they require that the analysis be performed on the machine code produced by a compiler, where all high-level program structure has been `compiled away', making analyses potentially hard to understand for programmers accustomed to reasoning at the source-code level.

More ideal would be a mechanism to `lift' a cost model from the machine code generated by a compiler back to the source-code level, where analyses could be performed in terms of the source code, abstractions, and control-flow constructs written and understood by the programmer. A major challenge here is obtaining a source-level resource analysis that can compete with the precision offered by traditional WCET tools. Further, how to perform this high-level analysis reliably is an open question.

In this paper we have presented an approach to engineering a realistic, mildly-optimising compiler for a large fragment of the C programming language that combines the ergonomics of reasoning at the source level with the accuracy and precision one would expect of traditional static analyses for non-functional properties of programs. In particular, the CerCo verified C compiler is a lightly-optimising compiler for a large subset of the C programming language, and whilst not as aggressive in its optimisation passes as commercial and open-source C compilers such as \texttt{gcc}, \texttt{clang}, or \texttt{icc}, it is still a relatively substantial piece of software.
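To give a concrete flavour of what such a lifted cost model looks like, the sketch below shows a hand-written approximation of a cost-instrumented C function: a global cost counter is advanced by the cost of the object code generated for each source-level block, so that the final value of the counter is parametric in the number of loop iterations. The identifiers and the cycle counts below are purely illustrative assumptions, and are not the literal output of our tools.

\begin{verbatim}
/* Illustrative sketch only: the names __cost and __cost_incr, and the
   cycle counts, are hypothetical; they mimic compiler-generated
   instrumentation rather than reproducing the output of any tool. */
unsigned long __cost = 0;                 /* global clock, in cycles     */
static void __cost_incr(unsigned long k) { __cost += k; }

int sum(int *a, int n)
{
    int s = 0;
    __cost_incr(23);                      /* prologue and loop set-up    */
    for (int i = 0; i < n; i++) {
        __cost_incr(42);                  /* cost of one loop iteration  */
        s += a[i];
    }
    __cost_incr(11);                      /* loop exit and return        */
    return s;
}

/* Under this invented cost model the instrumented program computes a
   cost of 23 + 42*n + 11 cycles, parametric in the iteration count n. */
\end{verbatim}

Reasoning about resource usage then reduces to ordinary source-level reasoning about the final value of the cost counter.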
As a result, the application of the techniques described herein, first developed with pen-and-paper proofs, to the full CerCo compiler is, in our view, already evidence of their scalability, and we see no reason why these techniques could not be applied to a more aggressively optimising compiler, given that none of our techniques are tied to any particular choice of intermediate language or set of optimisations. Further, the techniques that we have presented are not tied to any particular programming language, or to any one programming paradigm, and are in fact amenable to arbitrary imperative and functional programming languages. This last claim is bolstered by a pen-and-paper proof of the correctness of the labelling approach applied to a `toy' compiler for an applicative language.
%cite

\subsubsection{Object-code approaches}

One existing approach to predicting concrete program execution time at the object-code level was developed in the EmBounded project~\cite{embounded}. The EmBounded compiler compositionally compiles programs written in Hume, a typed functional programming language intended for embedded applications, to the byte code of the Hume Abstract Machine (HAM). The HAM provides strong guarantees on the maximal execution time spent on each byte code instruction, derived from empirically testing the maximum running time taken by each instruction on a given architecture~\cite{BoneAFerdCHH2007:IFL}. This provides a cost model that is uniform, though at the expense of both precision (each cost is a pessimistic upper bound) and the performance of the executed code (the byte code is interpreted compositionally). Experiments were conducted using a prototype static analysis tool, and a mainstream WCET analyser, to derive upper bounds on the running time of a Hume program. Case studies demonstrated the feasibility of using the Hume programming language for developing resource-constrained applications~\cite{BoneAChenZHMWW2007:ACM}.
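To illustrate the flavour of such a uniform, per-instruction cost model, the sketch below sums worst-case cycle counts over a straight-line block of instructions. The opcodes and the cycle counts are invented for the purposes of exposition, and do not correspond to the HAM or to any real architecture.

\begin{verbatim}
#include <stddef.h>
#include <stdio.h>

/* Invented opcodes standing in for a byte code instruction set. */
enum opcode { OP_LOAD, OP_STORE, OP_ADD, OP_BRANCH };

/* Invented table of empirically derived worst-case costs, in cycles. */
static const unsigned worst_case_cycles[] = {
    [OP_LOAD]   = 3,
    [OP_STORE]  = 4,
    [OP_ADD]    = 1,
    [OP_BRANCH] = 2,
};

/* With a uniform per-instruction cost model, the worst-case cost of a
   straight-line block is simply the sum of its instruction costs.    */
static unsigned block_cost(const enum opcode *block, size_t len)
{
    unsigned total = 0;
    for (size_t i = 0; i < len; i++)
        total += worst_case_cycles[block[i]];
    return total;
}

int main(void)
{
    enum opcode body[] = { OP_LOAD, OP_ADD, OP_STORE, OP_BRANCH };
    printf("worst-case cost: %u cycles\n",
           block_cost(body, sizeof body / sizeof body[0]));
    return 0;
}
\end{verbatim}

Uniformity makes such a model straightforward to apply, but every bound derived from it inherits the pessimism of the per-instruction maxima.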
The current state of the art in concrete timing analysis is represented by WCET tools, which use Abstract Interpretation~\cite{Cousot:1977:AIU:512950.512973}, control flow analyses, and detailed models of hardware to derive tight bounds on expected program execution time. See~\cite{stateart} for a survey of state-of-the-art techniques in the WCET field. A variety of academic and commercial sources provide advanced WCET tooling: for example, aiT~\cite{absint}, Bound-T~\cite{bound-T}, SWEET~\cite{Lisper2014}, Heptane~\cite{heptane}, and OTAWA~\cite{otawa}. Unlike the approach presented in this paper, the analyses performed by mainstream WCET tools are unverified: the WCET tool itself, typically a large, complex piece of software, must become part of one's trusted code base.

These analyses take place on object code for two main reasons. First, there cannot be a uniform, precise cost model for source-code instructions (or even basic blocks): during compilation, high-level instructions are broken up and reassembled in context-specific ways, so that identifying a fragment of object code with a single high-level instruction is infeasible. Moreover, the control flow of the object and source code can be very different as a result of optimisations; aggressive loop optimisations, for example, can completely transform source-level loops. Second, bounding the worst-case execution time of small code fragments in isolation (e.g. loop bodies) and then adding up the bounds yields very poor estimates, as no knowledge of the hardware state prior to executing the fragment can be assumed. By analysing longer runs, the bound obtained becomes more precise, because the lack of information about the initial state has a relatively small impact.

To calculate the cost of an execution, value and control flow analyses are required to bound the number of times each basic block is executed. WCET tools perform these analyses on object code, where the logic of the program is harder to reconstruct and most of the information available at the source-code level has been lost, and imprecision in the analysis can lead to useless bounds. To augment precision, current tools ask the user to provide constraints on the object-code control flow, usually in the form of bounds on the number of iterations of loops, or linear inequalities on them. This requires the user to manually link the source and object code, translating their assumptions on the source code (which may be wrong) into object-code constraints, a task that is hard and error-prone, especially in the presence of complex compiler optimisations. Traditional WCET techniques that work on object code are also affected by another problem: they cannot be applied before the object code has been generated, so that while functional properties can be analysed in early development stages, the analysis of non-functional properties may come too late to avoid expensive changes to the program architecture.
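To make the manual flow constraints just described concrete, the hypothetical fragment below shows the kind of loop whose trip count a user must communicate to an object-code analyser by hand. The constraint is written as a comment and deliberately does not follow the annotation syntax of any particular WCET tool.

\begin{verbatim}
/* Hypothetical example: the flow constraint is expressed as a comment
   and does not follow the input syntax of any particular WCET tool.  */
#define BUF_SIZE 64

unsigned checksum(const unsigned char *buf, unsigned len)
{
    unsigned acc = 0;
    /* Constraint supplied manually to the analyser: this loop executes
       at most BUF_SIZE times, and the corresponding loop in the
       optimised object code must respect the same bound.             */
    for (unsigned i = 0; i < len && i < BUF_SIZE; i++)
        acc += buf[i];
    return acc;
}
\end{verbatim}

If the compiler unrolls or otherwise restructures the loop, the user-supplied bound must be restated in terms of the transformed object code, which is precisely where errors creep in.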
However, recent work on the CompCert verified C compiler has investigated the verification of a loop-bound analysis, a subcomponent of wider WCET analyses, for the Clight intermediate language~\cite{DBLP:conf/wcet/MaronezeBPP14}. Note that this work is related but orthogonal to the work presented in this paper: our cost annotations at the C source level are \emph{parametric} in the number of iterations of loops appearing in the source code. Indeed, to obtain a final resource analysis using our toolchain, the user must manually bound loop iterations.\footnote{This is typical for embedded and safety- or mission-critical applications, where unrestricted looping and recursion are not permitted. See, for example, rule 16.2 of MISRA C 2004, which prohibits recursive function calls entirely in safety-critical code, and NASA JPL's programming guidelines, which state that all loops must have a bounded number of iterations.}
%cite
As a result, one could potentially imagine pairing such a loop-bound analysis with our parametric cost model to reduce the amount of user interaction required by our toolchain. Abstract Interpretation may also be used to derive upper bounds on stack usage (e.g.~\cite{Regehr:2005:ESO:1113830.1113833}), and abstract interpretation tools for predicting stack space usage, for example the StackAnalyzer tool~\cite{stackanalyzer}, are commercially available.

\subsubsection{High-level approaches}

Despite the lack of a uniform, compilation- and program-independent cost model on the source language, research on the analysis of non-asymptotic execution time of high-level languages assuming such a model is growing and gaining momentum. Unless such cost models are developed, however, the future practical impact of this research looks to be minimal. Resource Aware ML (RAML), developed by Hoffmann and Hofmann, is a first-order, statically typed functional programming language.
%cite
RAML's sophisticated type system is able to automatically infer polynomial upper bounds on the resource usage of a program, by means of an amortised resource analysis built into type inference.

\subsection{Further work}