\documentclass[11pt, epsf, a4wide]{article}

\usepackage{../style/cerco}

\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage[english]{babel}
\usepackage{graphicx}
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage{stmaryrd}
\usepackage{url}

\title{
INFORMATION AND COMMUNICATION TECHNOLOGIES\\
(ICT)\\
PROGRAMME\\
\vspace*{1cm}Project FP7-ICT-2009-C-243881 \cerco{}}

\lstdefinelanguage{matita-ocaml}
  {keywords={ndefinition,ncoercion,nlemma,ntheorem,nremark,ninductive,nrecord,nqed,nlet,let,in,rec,match,return,with,Type,try},
   morekeywords={[2]nwhd,nnormalize,nelim,ncases,ndestruct},
   morekeywords={[3]type,of},
   mathescape=true,
  }

\lstset{language=matita-ocaml,basicstyle=\small\tt,columns=flexible,breaklines=false,
        keywordstyle=\color{red}\bfseries,
        keywordstyle=[2]\color{blue},
        keywordstyle=[3]\color{blue}\bfseries,
        commentstyle=\color{green},
        stringstyle=\color{blue},
        showspaces=false,showstringspaces=false}

\lstset{extendedchars=false}
\lstset{inputencoding=utf8x}
\DeclareUnicodeCharacter{8797}{:=}
\DeclareUnicodeCharacter{10746}{++}
\DeclareUnicodeCharacter{9001}{\ensuremath{\langle}}
\DeclareUnicodeCharacter{9002}{\ensuremath{\rangle}}

\date{}
\author{}
\begin{document}

\thispagestyle{empty}

\vspace*{-1cm}
\begin{center}
\includegraphics[width=0.6\textwidth]{../style/cerco_logo.png}
\end{center}

\begin{minipage}{\textwidth}
\maketitle
\end{minipage}

\vspace*{0.5cm}
\begin{center}
\begin{LARGE}
\textbf{
Report n. D6.3\\
Final Report on User Validation}
\end{LARGE}
\end{center}

\vspace*{2cm}
\begin{center}
\begin{large}
Version 1.0
\end{large}
\end{center}

\vspace*{0.5cm}
\begin{center}
\begin{large}
Main Authors:\\
Dominic P. Mulligan and Claudio Sacerdoti Coen
\end{large}
\end{center}

\vspace*{\fill}

\noindent
Project Acronym: \cerco{}\\
Project full title: Certified Complexity\\
Proposal/Contract no.: FP7-ICT-2009-C-243881 \cerco{}\\

\clearpage
\pagestyle{myheadings}
\markright{\cerco{}, FP7-ICT-2009-C-243881}

\newpage

\vspace*{7cm}
\paragraph{Abstract}
xxx

\newpage

\tableofcontents

\newpage

\section{Task}
\label{sect.task}

The Grant Agreement describes deliverable D6.3 as follows:
\begin{quotation}
\textbf{Final Report on User Validation}: An articulated analysis and critique of the user validation experiences. In particular we will review the effectiveness of the techniques for the exploitation of cost annotations that have been employed in the project and that have been validated on simple and non-trivial examples. We will also identify additional techniques that could be exploited in the middle and long term to bring the CerCo compiler to its full potential.

\end{quotation}

\section{Middle and Long Term Improvements}
In order to identify the middle and long term improvements, we briefly review
here the premises and goals of the CerCo approach to resource analysis.
\begin{itemize}
 \item There is a lot of recent and renewed activity in the formal methods
   community to accommodate resource analysis using techniques derived from
   functional analysis (type systems, logics, abstract interpretation,
   amortized analysis applied to data structures, etc.).
 \item Most of this work, which currently remains at a theoretical level,
   is focused on high level languages and assumes the existence of correct
   and compositional resource cost models.
 \item High level languages are compiled to object code by compilers that
   should respect the functional properties of the program. However, because
   of optimizations and the inherently non-compositional nature of compilation,
   compilers do not respect compositional cost models that are imposed a priori
   on the source language. By controlling the compiler and coupling it with a
   WCET analyser, it is actually possible to
   choose the cost model in such a way that the cost bounds are high enough
   to bound the cost of all the code produced. This was attempted, for instance,
   in the EmBounded project with good success. However, we believe that bounds
   obtained in this way have little chance of being tight.
 \item Therefore our approach consists in having the compiler generate the
   cost model for the user by combining tracking of basic blocks during code
   transformations with a static resource analysis on the object code for basic
   blocks. We formally prove that the compiler respects the cost model that is
   induced on the source level based on very few assumptions: essentially, the
   cost of a sequence of instructions should be associative and commutative,
   and it should not depend on the machine status, except for its program counter.
   Commutativity can be relaxed at the price of introducing more cost updates
   in the instrumented source code (see the sketch after this list).
 \item The cost model for basic blocks induced on the source language must then
   be exploited to derive cost invariants and to prove them automatically.
   In CerCo we have shown how even simple invariant generation techniques
   are sufficient to enable the fully automatic proving of parametric WCET
   bounds for simple C programs and for Lustre programs of arbitrary complexity.
\end{itemize}
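
To make the last two points concrete, the following minimal sketch shows the
shape of the instrumentation the compiler emits. The instrumented program is C
in the actual tool; here we use the OCaml notation adopted elsewhere in the
project, and the cost constants (23 and 88 cycles) as well as the function
itself are invented for illustration only.
\begin{lstlisting}
(* global clock, incremented at the beginning of every basic
   block by the cost the compiler computed for that block *)
let cost = ref 0

let rec count_down n =
  cost := !cost + 23;      (* cost of the test and branch *)
  if n = 0 then 0
  else begin
    cost := !cost + 88;    (* cost of the loop body       *)
    count_down (n - 1)
  end
\end{lstlisting}
Proving a parametric WCET bound for \texttt{count\_down} then reduces to
proving the ordinary functional invariant that, on exit,
$\mathit{cost} \leq 23(n+1) + 88n$, which is exactly the kind of obligation
that invariant generators and automated provers can discharge.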

Compared to traditional WCET techniques, our approach currently has many
similarities, some advantages and some limitations. Both approaches need to
perform data flow analysis on the control flow graph of the program, and both
need to estimate the cost of basic blocks of instructions.

\subsection{Control flow analysis}

The first main difference is in the control flow analysis. Traditional WCET
analysis starts from the object code and reconstructs the control flow graph from it.
Moreover, abstract interpretation is heavily employed to bound the number of
executions of cycles. In order to improve the accuracy of the estimation, control
flow constraints are provided by the user, usually as systems of (linear)
inequalities. In order to do this, the user, helped by the system, needs to
relate the object code control flow graph to the source one, because it is
on the latter that the bounds can be figured out and understood. This
operation is untrusted and potentially error prone for complex optimizations
(like aggressive loop optimizations). Efficient tools from linear algebra are
then used to solve the systems of inequalities obtained from the abstract
interpreter and from the user constraints.
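
For example (a standard formulation, not one specific to any particular tool),
if $x_i$ counts the executions of basic block $i$ and $k_i$ is its cost, a
loop whose body $b$ is known to execute at most $10$ times per entry $e$
contributes the linear constraint
\[ x_b \leq 10\, x_e \]
and the WCET is obtained by maximising $\sum_i k_i x_i$ subject to all such
constraints.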

In CerCo, instead, we assume full control over the compiler, which
is able to relate, even in non-trivial ways, the object code control flow graph
to the source code control flow graph. A clear disadvantage is the
impossibility of applying the tool to object code produced by third party
compilers. On the other hand, we eliminate the possibility of errors
in the reconstruction of the control flow graph and in the translation of
high level constraints into low level constraints. The second potentially
important advantage is that, once we are dealing with the source language,
we can augment the precision of our dataflow analysis by combining
functional and non-functional invariants. This is what we attempted with
the CerCo Cost Annotating Frama-C Plug-In. The Frama-C architecture allows
several plug-ins to perform all kinds of static analysis on the source code,
reusing results from other plug-ins and augmenting the source code with their
results. The techniques are by no means limited to linear algebra and
abstract interpretation, and the most important plug-ins call domain specific
and general purpose automated theorem provers to close proof obligations of
arbitrary shape and complexity.

In principle, the extended flexibility of the analysis should give our
technique a major advantage in terms of precision, especially
considering that all the analyses used in traditional WCET can still be
implemented as plug-ins. In particular, the targets we have in mind are
systems that are both (hard) real time and safety critical. Being safety
critical, we can already expect them to be fully or partially specified at
the functional level.
Therefore we expect the additional functional invariants to allow us to
augment the precision of the cost bounds, up to the point where the parametric
cost bound is fully precise.
In practice, we have not had the time to perform extensive
comparisons on the kind of code used by industry in production systems.
The first middle term improvement of CerCo would then consist in this kind
of analysis, to support or disprove our expectations. It seems that the
newborn TACLe COST Action (Timing Analysis on Code Level) would be the
best framework in which to achieve this improvement.
If our technique remains promising, the next long term improvement would
consist in integrating into the Frama-C plug-in ad-hoc analyses and
combinations of analyses that would augment the coverage and efficiency of
the cost estimation techniques.

\subsection{Static analysis of costs of basic blocks}
At the beginning of the project we deliberately decided to focus our
attention on control flow preservation, cost model propagation and
the exploitation of the cost model induced on the high level code. For this
reason we devoted almost no attention to the static analysis of basic
blocks. This was achieved by picking a very simple hardware architecture
(the 8051 microprocessor family) whose cost model is fully predictable and
compositional: the cost of every instruction --- except those that deal with
I/O --- is constant, i.e. it does not depend on the machine status.
We do not regret this choice because, with the limited amount of man power
available in the project, it would have been difficult to also consider this
aspect. However, without showing that the approach can scale to more complex
architectures, our methodology remains of limited interest to industry.
Therefore, the next important middle term improvement will be the extension
of our methodology to cover pipelines and simple caches. We will now present our
ideas on how we intend to address the problem. The obvious long term
improvement would be to take into consideration multicore systems and complex
memory architectures like the ones currently in use in networks-on-chip.
The problem of execution time analysis for these systems is currently
considered extremely hard or even infeasible, and at the moment it seems
unlikely that our methodology could contribute to its solution.

\subsubsection{Static analysis of costs of basic blocks revisited}
We will now describe what currently seems to be the most interesting technique
for the static analysis of the cost of basic blocks in the presence of complex
hardware architectures whose cost models are not compositional.

We start by presenting an idealized model of the execution of a generic
microprocessor (with caches) that has all interrupts disabled and no I/O
instructions. We then classify such models according to some properties of
their cost model. Finally, we show how to extend the labelling approach of CerCo
to cover models that are classified in a certain way.

\paragraph{The microprocessor model}

Let $\sigma,\sigma_1,\ldots$ range over $\Sigma$, the set of fragments of
the microprocessor state that hold the program counter, the program status word and
all the data manipulated by the object code program, i.e. registers and
memory cells. We call these fragments the \emph{data states}.

Let $\delta,\delta_1,\ldots$ range over $\Delta$, the set of
fragments of the microprocessor state that hold the \emph{internal state} of the
microprocessor (e.g. the content of the pipeline and caches, the status of
the branch prediction unit, etc.).
The internal state of the microprocessor influences the execution cost of the
next instruction, but it has no effect on the functional behaviour of the
processor. The whole state of the processor is represented by a pair
$(\sigma,\delta)$.

Let $I,I_1,\ldots$ range over $\mathcal{I}$, the
set of \emph{instructions} of the processor and let
$\gamma,\gamma_1,\ldots$ range over $\Gamma$, the set of \emph{operands}
of instructions after the fetching and decoding passes.
Thus a pair $(I,\gamma)$ represents a \emph{decoded instruction} and already contains
the data required for execution. Execution needs to access the data state only
to write the result.

Let $fetch: \Sigma \to \mathcal{I} \times \Gamma$ be the function that
performs the fetching and decoding phases, returning the decoded instruction
ready for execution. This is not meant to be the real fetch-decode function,
which also exploits the internal state to speed up execution (e.g. by retrieving
the instruction arguments from caches) and which, in the case of pipelines, works
in several stages. However, such a function exists and
is observationally equivalent to the real fetch-decode.

We capture the semantics of the microprocessor with the following set of
functions:
\begin{itemize}
 \item The \emph{functional transition} function $\longrightarrow : \Sigma \to \Sigma$
   over data states. This is the only part of the semantics that is relevant
   to functional analysis.
 \item The \emph{internal state transition} function
   $\Longrightarrow : \Sigma \times \Delta \to \Delta$ that updates the internal
   state.
 \item The \emph{cost function} $K: \mathcal{I} \times \Gamma \times \Delta \to \mathbb{N}$
   that assigns a cost to transitions. Since decoded instructions hold the data
   they act on, the cost of an instruction may depend both on the data being
   manipulated and on the internal state.
\end{itemize}

Given a processor state $(\sigma,\delta)$, the processor evolves into the
new state $(\sigma',\delta')$ in $n$ cost units if
$\sigma \longrightarrow \sigma'$ and $(\sigma,\delta) \Longrightarrow \delta'$
and $fetch(\sigma) = (I,\gamma)$ and $K(I,\gamma,\delta) = n$.
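
The following OCaml sketch restates the model; the type and function names
are ours, and the concrete types are deliberately left abstract.
\begin{lstlisting}
module type PROCESSOR = sig
  type sigma   (* data states: pc, status word, registers, memory *)
  type delta   (* internal states: pipelines, caches, predictors  *)
  type instr   (* instructions                                    *)
  type gamma   (* operands of decoded instructions                *)

  val fetch : sigma -> instr * gamma           (* fetch-decode          *)
  val step  : sigma -> sigma                   (* functional transition *)
  val next  : sigma * delta -> delta           (* internal transition   *)
  val cost  : instr -> gamma -> delta -> int   (* cost function K       *)
end

(* one evolution step: from (s,d) the processor reaches
   (step s, next (s,d)) in (cost i g d) cost units *)
module Evolution (P : PROCESSOR) = struct
  let one_step (s, d) =
    let (i, g) = P.fetch s in
    ((P.step s, P.next (s, d)), P.cost i g d)
end
\end{lstlisting}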

An \emph{execution history} is a stream of states and transitions
$\sigma_0 \longrightarrow \sigma_1 \longrightarrow \sigma_2 \longrightarrow \ldots$
that can be either finite or infinite. Given an execution history, the
corresponding \emph{execution path} is the stream of program counters
obtained from the execution history by forgetting all the remaining information.
The execution path of the history $\sigma_0 \longrightarrow \sigma_1 \longrightarrow \sigma_2 \longrightarrow \ldots$ is $pc_0,pc_1,\ldots$ where each $pc_i$ is the program
counter of $\sigma_i$.

We claim this simple model to be generic enough to cover real world
architectures.
\paragraph{Classification of cost models}
A cost function is \emph{exact} if it assigns to transitions the real cost
incurred. It is \emph{approximated} if it returns an upper bound of the real
cost.

A cost function is \emph{operand insensitive} if it does not depend on the
operands of the instructions to be executed. Formally, $K$ is operand insensitive
if there exists a $K': \mathcal{I} \times \Delta \to \mathbb{N}$ such that
$K(I,\gamma,\delta) = K'(I,\delta)$. In this case, with an abuse of terminology,
we will identify $K$ with $K'$.

The cost functions of simple hardware architectures, in particular RISC ones,
are naturally operand insensitive. In the other cases an exact operand sensitive
cost function can always be turned into an approximated operand insensitive one
by taking $K'(I,\delta) = \max\{K(I,\gamma,\delta)~|~\gamma \in \Gamma\}$.
The question, when one performs this approximation, is how severe the
approximation is. A measure is given by the \emph{jitter}, which is defined
as the difference between the best and worst cases. In our case, the jitter
of the approximation $K'$ would be $\max\{K(I,\gamma,\delta)~|~\gamma \in \Gamma\} - \min\{K(I,\gamma,\delta)~|~\gamma \in \Gamma\}$. According to experts
in WCET analysis, the jitters relative to operand sensitivity in modern
architectures are small enough to keep WCET estimations useful.
Therefore, in the sequel we will focus on operand insensitive cost models only.
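
In OCaml notation, and under the simplifying assumption that the operand
space $\Gamma$ is given as a finite list, the approximation and its jitter
are straightforward folds; this is our illustration, not code from the
formal development.
\begin{lstlisting}
(* worst case over all operands: turns an exact, operand sensitive
   cost function into an approximated, operand insensitive one *)
let insensitive cost gammas =
  fun i d -> List.fold_left (fun acc g -> max acc (cost i g d)) 0 gammas

(* the jitter of the approximation measures how severe it is *)
let jitter cost gammas i d =
  let costs = List.map (fun g -> cost i g d) gammas in
  List.fold_left max 0 costs - List.fold_left min max_int costs
\end{lstlisting}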

Note that cost models that are operand insensitive may still depend
significantly on the data manipulated by the instructions, because of the
dependency on internal states. For example, an instruction that reads data
from memory may change the state of the cache and thus greatly affect the
execution time of subsequent instructions.

Nevertheless, operand insensitivity is an important property for the labelling
approach. In~\cite{tranquilli} we introduced \emph{dependent labels} and
\emph{dependent costs}, which make it possible to assign to basic
blocks of instructions costs that also depend on the state of the high level
program at the beginning of the block (see the sketch below). The idea we will now try to pursue is
to exploit dependent costs to capture cost models that are sensitive to
the internal states of the microprocessor. Operand sensitivity, however, is
a major issue in the presence of dependent labels: to lift a data sensitive cost
model from the object code to the source language, we need a function that
maps high level states to the operands of the instructions to be executed,
and we need these functions to be simple enough to allow reasoning over them.
Complex optimizations performed by the compiler, however, make the mappings
extremely cumbersome and history dependent. Moreover, keeping
track of status transformations during compilation would be a significant
departure from classical compilation models, which we are not willing to
undertake. By assuming operand insensitivity, we get rid of part
of the problem: we only need to make our costs dependent on the internal
state of the microprocessor. The latter, however, is not at all visible
in the high level code. Our next step is to make it visible.
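
To illustrate dependent costs in the style of~\cite{tranquilli}, the
increment attached to a block becomes a function of the high level state
instead of a constant; the function and the figures below are again invented
for illustration.
\begin{lstlisting}
let cost = ref 0

(* hypothetical block whose cost depends on the high level
   condition b, e.g. because b decides a branch outcome *)
let block b x =
  cost := !cost + (if b then 23 else 88);
  if b then x + 1 else x * 2
\end{lstlisting}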

In general, the value of the internal state at a certain point in the program
history is affected by all the preceding history. For instance, the
pipeline stages either hold operands of instructions in execution or bubbles\footnote{A bubble is formed in pipeline stage $n$ when an instruction is stuck in pipeline stage $n-1$, waiting for some data which is not available yet.}. The execution history contains data states that in turn contain the
object code data, which we do not know how to relate simply to the source code
data. We therefore introduce a new classification.

A \emph{view} over internal states is a pair $(\mathcal{V},|.|)$ where
$\mathcal{V}$ is a non-empty set and $|.|:\Delta \to \mathcal{V}$ is
a forgetful function over internal states.

The operand insensitive cost function $K$ is \emph{dependent on the view
$\mathcal{V}$} if there exists a $K': \mathcal{I} \times \mathcal{V} \to \mathbb{N}$ such that $K(I,\delta) = K'(I,|\delta|)$. In this case, with an abuse of terminology, we will identify $K$ with $K'$.

Among the possible views, the ones that we will easily be able to work
with in the labelling approach are the \emph{execution path dependent views}.
A view $(\mathcal{V},|.|)$ is execution path dependent when there exists
a transition function $\hookrightarrow : PC \times \mathcal{V} \to \mathcal{V}$
such that for all $(\sigma,\delta)$ and $pc$ such that $pc$ is the program
counter of $\sigma$, $(\sigma,\delta) \Longrightarrow \delta'$ iff
$(pc,|\delta|) \hookrightarrow |\delta'|$.
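
For instance (our own example, not one from the formal development), a
single-line instruction cache can be abstracted by a view that only remembers
which line, if any, is cached; the transition function depends on the program
counter alone, as the definition requires.
\begin{lstlisting}
(* view of a single-line instruction cache: Some l when line l
   is cached, None when the cache is empty *)
type view = int option

let line_of pc = pc / 16          (* hypothetical 16-byte lines *)

(* execution path dependent transition: the next view is a
   function of the program counter and the current view only *)
let next_view pc (_ : view) : view = Some (line_of pc)

(* cost of a fetch under this view: hit vs. miss, invented figures *)
let fetch_cost pc v = if v = Some (line_of pc) then 1 else 10
\end{lstlisting}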

As a final definition, we say that a cost function $K$ is
\emph{data independent} if it is dependent on a view that is execution path
dependent. In the next paragraph we will show how we can extend the
labelling approach to deal with data independent cost models.
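
Data independence is exactly what makes costs computable along execution
paths alone: the following sketch (ours) computes the cost of a finite
execution path from the program counters, the view transition and the view
dependent cost function, with no reference to data states.
\begin{lstlisting}
(* cost of a finite execution path, given the view dependent cost
   of the instruction at each program counter and the view
   transition function: no data state is needed *)
let rec path_cost cost_at next_view v = function
  | [] -> 0
  | pc :: rest ->
      cost_at pc v + path_cost cost_at next_view (next_view pc v) rest
\end{lstlisting}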


\end{document}