Changeset 3123


Timestamp:
Apr 11, 2013, 6:22:39 PM
Author:
sacerdot
Message:

...

File:
1 edited

  • Deliverables/D6.3/report.tex

    r3122 r3123

    456 456   statistical informations on the last executions of the branch.
    457 457
    458     - \paragraph{The labelling appraoch for data independent cost models}
        458 + \paragraph{The labelling approach for data independent cost models}
    459 459   We now describe how the labelling approach can be slightly modified to deal
    460 460   with a data independent cost model $(\hookrightarrow,K)$ built over

    602 602
    603 603   \begin{figure}[t]
    604     - \begin{center}
        604 + \begin{tiny}
    605 605   \begin{verbatim}
    606 606   int fact (int n) {
    ...
    614 614   }
    615 615   \end{verbatim}
    616     - \end{center}
        616 + \end{tiny}
    617 617   \caption{A simple program that computes the factorial of 10.\label{factprog}}
    618 618   \end{figure}
\end{itemize}

\paragraph{The problem with caches}
Cost models for pipelines --- at least simple ones --- are data independent,
i.e.\ they depend only on a view that is execution path dependent. In other
words, knowledge of the sequence of executed instructions is sufficient to
predict the cost of future instructions.

The same property does not hold for caches. The cost of accessing a memory
cell strongly depends on the addresses of the memory cells that have been
read in the past. In turn, the accessed addresses are a function of the low
level data state, which cannot be correlated to the source program state.

The strong correlation between the internal state of caches and the data
accessed in the past is one of the two main causes of the lack of precision
of static analysis on modern uni-core architectures. The other is the lack
of precise knowledge of the real behaviour of modern hardware systems. In
order to overcome both problems, which Cazorla et al.~\cite{proartis} call
the ``\emph{Timing Analysis Walls}'', the PROARTIS European Project has
proposed to design ``\emph{a hardware/software architecture whose execution
timing behaviour eradicates dependence on execution history}''
(\cite{proartis}, Section 1.2). The statement is obviously too strong. What
is concretely proposed by PROARTIS is the design of a hardware/software
architecture whose execution timing is \emph{execution path dependent} (in
our terminology).

We have already seen that the labelling approach can accommodate cost
functions that depend on execution path dependent views. Before fully
embracing the PROARTIS vision, we need to check whether other aspects of
the PROARTIS proposal can be problematic for CerCo.

\paragraph{Static Probabilistic Time Analysis}
The approach of PROARTIS to achieving execution path dependent cost models
consists in turning the hard-to-analyze deterministic hardware components
(e.g.\ the cache) into probabilistic ones. Intuitively, algorithms that
used to take decisions based on the program history now throw dice. The
typical example, thoroughly studied in PROARTIS~\cite{proartis2}, is that
of caches. There the proposal is to replace the commonly used deterministic
placement and replacement algorithms (e.g.\ LRU) with fully probabilistic
choices: when the cache needs to evict a line, the victim is selected at
random according to the uniform distribution.
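As an executable illustration (our own sketch, not PROARTIS code; the cache size, tags, and trace are invented), a randomized cache with uniform eviction can be modelled as:

```python
import random

# Sketch of the random replacement policy described above: on a miss the
# victim line is chosen uniformly at random among the N cache entries,
# so eviction no longer depends on the access history.

class RandomCache:
    def __init__(self, n_entries, seed=42):
        self.lines = [None] * n_entries          # stored tags; None = empty
        self.rng = random.Random(seed)

    def access(self, tag):
        """Return True on a hit; on a miss, evict a uniformly random line."""
        if tag in self.lines:
            return True
        self.lines[self.rng.randrange(len(self.lines))] = tag
        return False

cache = RandomCache(8)
trace = [i % 10 for i in range(10000)]           # working set of 10 > 8 lines
hit_rate = sum(cache.access(t) for t in trace) / len(trace)
```

On this cyclic trace a working set slightly larger than the cache makes deterministic LRU miss on every access, while the randomized policy still hits with positive probability: this is the qualitative effect the paragraph describes.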

The expectation is that probabilistic hardware will have worse performance
in the average case, but will exhibit the worst case performance only with
negligible probability. It therefore becomes uninteresting to estimate the
actual worst case bound; what becomes interesting is to plot the
probability that the execution time will exceed a certain threshold. For
all practical purposes, a program that misses its deadline with negligible
probability (e.g.\ $10^{-9}$ per hour of operation) will be perfectly
acceptable when deployed on a hardware system (e.g.\ a car or an airplane)
that is already specified in such a way.

In order to plot the probability distribution of execution times, PROARTIS
proposes two methodologies: Static Probabilistic Time Analysis (SPTA) and
Measurement Based Probabilistic Time Analysis (MBPTA). The first is similar
to traditional static analysis, but operates on probabilistic hardware; it
is the one we would like to embrace. The second is based on measurements
and is justified by the following assumption: if the probabilities
associated with every hardware operation are all independent and
identically distributed, then measuring the time spent on full runs of
sub-system components yields a probabilistic estimate that remains valid
when the sub-system is plugged into a larger one. Moreover, the
probability distribution of past runs must be equal to that of future runs.

We understand that MBPTA is useful to analyze closed (sub)systems whose
functional behaviour is deterministic. It does not seem to have immediate
applications to parametric time analysis, which is what we are interested
in. Therefore we focus on SPTA.

According to~\cite{proartis},
``\emph{in SPTA, execution time probability distributions for individual
operations \ldots are determined statically from a model of the processor.
The design principles of PROARTIS will ensure \ldots that the probabilities
for the execution time of each instruction are independent. \ldots SPTA is
performed by calculating the convolution ($\oplus$) of the discrete
probability distributions which describe the execution time for each
instruction on a CPU; this provides a probability distribution \ldots
representing the timing behaviour of the entire sequence of instructions.}''

We will now analyze to what extent we can embrace SPTA in CerCo.

\paragraph{The labelling approach for Static Probabilistic Time Analysis}

To summarize, the main practical differences between standard static time
analysis and SPTA are:
\begin{itemize}
 \item The cost functions for single instructions or sequences of
   instructions no longer return natural numbers (numbers of cost units)
   but integral random variables.
 \item Cost functions are extended from single instructions to sequences of
   instructions by means of the associative convolution operator $\oplus$.
\end{itemize}

By reviewing the papers that describe the labelling approach, one easily
gets convinced that the codomain of the cost analysis can be lifted from
natural numbers to any group. Moreover, by imposing labels after every
function call, commutativity can be dropped and the approach works on every
monoid (usually called a \emph{cost monoid} in the literature). Because
random variables and convolutions form a monoid, the labelling approach
immediately extends to SPTA. The instrumented code produced by an
SPTA-CerCo compiler will then have random variables (over a finite domain)
as costs and convolutions in place of the \texttt{\_\_cost\_incr} function.

What remains to be understood is how to state and compute the probabilistic
invariants needed for \emph{parametric SPTA}. Indeed, it seems that
PROARTIS was only interested in non-parametric PTA. For example, it is well
known that actually computing the convolutions results in an exponential
growth of the memory required to represent the result. Therefore, the
analysis should proceed symbolically until the moment when we need to
extract information from the convolutions.

Moreover, even assuming that the problem of computing invariants is solved,
the actual behaviour of automated theorem provers on probabilistic
invariants remains to be understood. It is likely that a good amount of
domain specific knowledge about probability theory must be incorporated
into automatic provers to achieve concrete results.

Parametric SPTA using the methodology developed in CerCo is a future
research direction that we believe to be worth exploring in the medium and
long term.

\paragraph{Static Probabilistic Time Analysis for Caches in CerCo}

As a final remark, we note that the analysis in CerCo of systems that
implement probabilistic caches requires a combination of SPTA and data
independent cost models. The need for a probabilistic analysis is obvious
but, as we saw in the previous paragraph, it requires no modification of
the labelling approach.

In order to understand the need for dependent labelling (to work on data
independent cost functions), we need to review the behaviour of
probabilistic caches as proposed by PROARTIS. The interested reader can
consult~\cite{proartis2-articolo30} for further information.

In a randomized cache, the probability of evicting a given line on every
access is $1/N$, where $N$ is the number of cache entries. Therefore the
hit probability of a specific access to such a cache is
$P(\mathit{hit}) = (\frac{N-1}{N})^K$, where $K$ is the number of cache
misses between two consecutive accesses to the same cache entry. For the
purposes of our analysis, we must assume that every cache access could
cause an eviction. Therefore, we define $K$ (the \emph{reuse distance}) to
be the number of memory accesses between two consecutive accesses to the
same cache entry, including the access for which we are computing $K$. In
order to compute $K$ for every code memory address, we need to know the
execution path (in our terminology). In other words, we need a view that
records, for each cache entry, the number of memory accesses that have
occurred since the last access.
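A minimal executable sketch of this computation (our own illustration; the access trace and cache size are invented) derives, for each access, the reuse distance $K$ and from it the hit probability $(\frac{N-1}{N})^K$:

```python
# Sketch of the view described above: for each access we compute the
# reuse distance K -- the number of memory accesses since the previous
# access to the same entry, counting the access itself -- and derive the
# hit probability ((N-1)/N)**K of a randomized cache with N entries.

def hit_probabilities(trace, n_entries):
    """Return, for each access in `trace`, its hit probability, or None
    for a first-time access (a compulsory miss)."""
    last_seen = {}                       # entry -> index of previous access
    probs = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            k = i - last_seen[addr]      # reuse distance (inclusive count)
            probs.append(((n_entries - 1) / n_entries) ** k)
        else:
            probs.append(None)           # no previous access: certain miss
        last_seen[addr] = i
    return probs

# 'a' at index 2 has K = 2 (the access to 'b' plus the access itself),
# so with N = 4 its hit probability is (3/4)**2 = 0.5625.
print(hit_probabilities(['a', 'b', 'a', 'c', 'b'], n_entries=4))
# [None, None, 0.5625, None, 0.421875]
```

Note that the `last_seen` map is exactly the per-entry record the paragraph calls for: knowing the execution path (the access sequence) suffices to maintain it.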

For pipelines with $n$ stages, the number of possible views is limited to
$2^n$: a view can usually be represented by a single machine word. This is
not the case for views on caches, which are in principle very large.
Therefore, the dependent labelling approach for data independent cost
functions that we have presented here could still be impractical for
caches. If that turns out to be the case, a possible strategy is the use of
abstract interpretation techniques on the object code to reduce the size of
the views exposed at the source level, at the price of an early loss of
precision in the analysis.
     884
     885More research work must be performed at the current stage to understand if
     886caches can be analyzed, even probabilistically, using the CerCo technology.
     887This is left for future work and it will be postponed after the work on
     888pipelines.
     889

\end{document}