Changeset 3123 for Deliverables/D6.3/report.tex
 Timestamp:
 Apr 11, 2013, 6:22:39 PM (9 years ago)
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

Deliverables/D6.3/report.tex
r3122 r3123 456 456 statistical informations on the last executions of the branch. 457 457 458 \paragraph{The labelling appr aoch for data independent cost models}458 \paragraph{The labelling approach for data independent cost models} 459 459 We now describe how the labelling approach can be slightly modified to deal 460 460 with a data independent cost model $(\hookrightarrow,K)$ built over … … 602 602 603 603 \begin{figure}[t] 604 \begin{ center}604 \begin{tiny} 605 605 \begin{verbatim} 606 606 int fact (int n) { … … 614 614 } 615 615 \end{verbatim} 616 \end{ center}616 \end{tiny} 617 617 \caption{A simple program that computes the factorial of 10.\label{factprog}} 618 618 \end{figure} … … 730 730 \end{itemize} 731 731 732 \paragraph{The problem with caches} 733 Cost models for pipelines  at least simple ones  are data independent, 734 i.e. they are dependent on a view that is execution path dependent. In other 735 words, the knowledge about the sequence of executed instructions is sufficient 736 to predict the cost of future instructions. 737 738 The same property does not hold for caches. The cost of accessing a memory 739 cell strongly depends on the addresses of the memory cells that have been read 740 in the past. In turn, the accessed addresses are a function of the low level 741 data state, that cannot be correlated to the source program state. 742 743 The strong correlation between the internal state of caches and the data 744 accessed in the past is one of the two main responsibles for the lack of 745 precision of static analysis in modern unicore architectures. The other one 746 is the lack of precise knowledge on the real behaviour of modern hardware 747 systems. In order to overcome both problems, that Cazorla\&alt.~\cite{proartis} 748 call the ``\emph{Timing Analysis Walls}'', the PROARTIS European Project has 749 proposed to design ``\emph{a hardware/software architecture whose execution 750 timing behaviour eradicates dependence on execution history}'' (\cite{proartis}, Section 1.2). The statement is obviously too strong. What is concretely 751 proposed by PROARTIS is the design of a hardware/software architecture whose 752 execution timing is \emph{execution path dependent} (our terminology). 753 754 We have already seen that we are able to accomodate in the labelling approach 755 cost functions that are dependent on views that are execution path dependent. 756 Before fully embracing the PROARTIS vision, 757 we need to check if there are other aspects of the PROARTIS proposal 758 that can be problematic for CerCo. 759 760 \paragraph{Static Probabilistic Time Analysis} 761 The approach of PROARTIS to achieve execution path dependent cost models 762 consists in turning the hardtoanalyze deterministic hardware components 763 (e.g. the cache) into probabilistic hardware components. Intuitively, 764 algorithms that took decision based on the program history now throw a dice. 765 The typical example which has been thoroughly studied in 766 PROARTIS~\cite{proartis2} is that of caches. There the proposal is to replace 767 the commonly used deterministic placement and replacement algorithms (e.g. LRU) 768 with fully probabilistic choices: when the cache needs to evict a page, the 769 page to be evicted is randomly selected according to the uniform distribution. 770 771 The expectation is that probabilistic hardware will have worse performances 772 in the average case, but it will exhibit the worst case performance only with 773 negligible probability. Therefore, it becomes no longer interesting to estimate 774 the actual worst case bound. What becomes interesting is to plot the probability 775 that the execution time will exceed a certain treshold. For all practical 776 purposes, a program that misses its deadline with a negligible probability 777 (e.g. $10^9$ per hour of operation) will be perfectly acceptable when deployed 778 on an hardware system (e.g. a car or an airplane) that is already specified in 779 such a way. 780 781 In order to plot the probability distribution of execution times, PROARTIS 782 proposes two methodologies: Static Probabilistic Time Analysis (SPTA) and 783 Measurement Based Probabilistic Time Analysis (MBPTA). The first one is 784 similar to traditional static analysis, but it operates on probabilistic 785 hardware. It is the one that we would like to embrace. The second one is 786 based on measurements and it is justified by the following assumption: if 787 the probabilities associated to every hardware operation are all independent 788 and identically distributed, then measuring the fime spent on full runs of 789 subsystems components yields a probabilistic estimate that remains valid 790 when the subsystem is plugged in a larger one. Moreover, the probabilistic 791 distribution of past runs must be equal to the one of future runs. 792 793 We understand that MBPTA is useful to analyze closed (sub)systems whose 794 functional behavior is deterministic. It does not seem to have immediate 795 applications to parametric time analysis, which we are interested in. 796 Therefore we focus on SPTA. 797 798 According to~\cite{proartis}, 799 ``\emph{in SPTA, execution time probability distributions for individual operations \ldots are determined statically 800 from a model of the processor. The design principles of PROARTIS 801 will ensure \ldots that the probabilities for the execution time of each 802 instruction are independent. \ldots SPTA is performed by calculating the 803 convolution ($\oplus$) of the discrete probability distributions which describe 804 the execution time for each instruction on a CPU; this provides a probability 805 distribution \ldots representing the timing behaviour of the entire sequence of 806 instructions.}'' 807 808 We will now analyze to what extend we can embrace SPTA in CerCo. 809 810 \paragraph{The labelling approach for Static Probabilistic Time Analysis} 811 812 To summarize, the main practical differences between standard static time 813 analysis and SPTA are: 814 \begin{itemize} 815 \item The cost functions for single instructions or sequences of instructions 816 no longer return a natural numbers (number of cost units) but integral 817 random variables. 818 \item Cost functions are extended from single instructions to sequences of 819 instructions by using the associative convolution operator $\oplus$ 820 \end{itemize} 821 822 By reviewing the papers that described the labelling approach, it is easy to 823 get convinced that the codomain of the cost analysis can be lifted from 824 that of natural numbers to any group. Moreover, by imposing labels after 825 every function call, commutativity can be dropped and the approach works on 826 every monoid (usually called \emph{cost monoids} in the litterature). 827 Because random variables and convolutions form a monoid, we immediately have 828 that the labelling approach extends itself to SPTA. The instrumented code 829 produced by an SPTACerCo compiler will then have random variables (on a finite 830 domain) as costs and convolutions in place of the \texttt\{\_\_cost\_incr\} 831 function. 832 833 What is left to be understood is the way to state and compute the 834 probabilistic invariants to do \emph{parametric SPTA}. Indeed, it seems that 835 PROARTIS only got interested into non parametric PTA. For example, it is 836 well known that actually computing the convolutions results in an exponential 837 growth of the memory required to represent the result of the convolutions. 838 Therefore, the analysis should work symbolically until the moment where 839 we are interested into extracting information from the convolution. 840 841 Moreover, assuming that the problem of computing invariants is solved, 842 the actual behavior of automated theorem 843 provers on probabilistic invariants is to be understood. It is likely that 844 a good amount of domain specific knowledge about probability theory must be 845 exploited and incorporated into automatic provers to achieve concrete results. 846 847 Parametric SPTA using the methodology developed in CerCo is a future research 848 direction that we believe to be worth exploring in the middle and long term. 849 850 \paragraph{Static Probabilistic Time Analysis for Caches in CerCo} 851 852 As a final remark, we note that the analysis in CerCo of systems that implement 853 probabilistic caches requires a combination of SPTA and data independent cost 854 models. The need for a probabilistic analysis is obvious but, as we saw in 855 the previous paragraph, it requires no modification of the labelling approach. 856 857 In order to understand the need for dependent labelling (to work on data 858 independent cost functions), we need to review the behaviour of probabilistic 859 caches as proposed by PROARTIS. The interested reader can 860 consult~\cite{proartis2articolo30} for further informations. 861 862 In a randomized cache, the probability of evicting a given line on every access 863 is $1/N$ where $N$ stands for the number of cache entries. Therefore the 864 hit probability of a specific access to such a cache is 865 $P(hit) = (\frac{N1}{N})^K$ where $K$ is the number of cache misses between 866 two consecutive accesses to the same cache entry. For the purposes of our 867 analysis, we must assume that every cache access could cause an eviction. 868 Therefore, we define $K$ (the \emph{reuse distance}) to be the number of 869 memory accesses between two consecutive accesses to the same cache entry, 870 including the access for which we are computing $K$. In order to compute 871 $K$ for every code memory address, we need to know the execution path (in 872 our terminology). In other words, we need a view that records for each 873 cache entry the number of memory accesses that has occurred since the last 874 access. 875 876 For pipelines with $n$ stages, the number of possible views is limited to 877 $2^n$: a view can usually just be represented by a word. This is not the 878 case for the views on caches, which are in principle very large. Therefore, 879 the dependent labelling approach for data indepedent cost functions that we 880 have presented here could still be unpractical for caches. If that turns out 881 to be the case, a possible strategy is the use of abstract interpretations 882 techniques on the object code to reduce the size of views exposed at the 883 source level, at the price of an early loss of precision in the analysis. 884 885 More research work must be performed at the current stage to understand if 886 caches can be analyzed, even probabilistically, using the CerCo technology. 887 This is left for future work and it will be postponed after the work on 888 pipelines. 889 732 890 \end{document}
Note: See TracChangeset
for help on using the changeset viewer.