Changeset 3114

Apr 10, 2013, 9:48:31 AM (4 years ago)

Some progress on pipelines/caches.

1 edited


  • Deliverables/D6.3/report.tex

    r3113 r3114  
    200200comparisons on the kind of code used by industry in production systems.
    201201The first middle term improvement of CerCo would then consist in this kind
    202 of analysis, to support or disprove our expectations. In the case where our
     202of analysis, to support or disprove our expectations. It seems that the
     203newborn TACLe Cost Action (Timing Analysis on Code Level) would be the
     204best framework to achieve this improvement.
     205In the case where our
    203206technique remains promising, the next long term improvement would consist in
    204207integrating in the Frama-C plug-in ad-hoc analysis and combinations of analysis
    208 \subsection{Static analysis of basic blocks}
     211\subsection{Static analysis of costs of basic blocks}
     212At the beginning of the project we have deliberately decided to focus our
     213attention on the control flow preservation, the cost model propagation and
     214the exploitation of the cost model induced on the high level code. For this
     215reason we have devoted almost no attention to the static analysis of basic
     216blocks. This was achieved by picking a very simple hardware architecture
     217(the 8051 microprocessor family) whose cost model is fully predictable and
     218compositional: the cost of every instruction --- except those that deal with
     219I/O --- is constant, i.e. it does not depend on the machine status.
     220We do not regret this choice because, with the limited amount of man power
     221available in the project, it would have been difficult to also consider this
     222aspect. However, without showing if the approach can scale to most complex
     223architectures, our methodology remains of limited interest for the industry.
     224Therefore, the next important middle term improvement will be the extension
     225of our methodology to cover pipelines and simple caches. We will now present our
     226ideas on how we intend to address the problem. The obvious long term
     227improvement would be to take in consideration multicores system and complex
     228memory architectures like the ones currently in use in networks on chips.
     229The problem of execution time analysis for these systems is currently
     230considered extremely hard or even unfeasible and at the moments it seems
     231unlikely that our methodology could contribute to the solution of the problem.
     233\subsubsection{Static analysis of costs of basic blocks revisited}
     234We will now describe what currently seems to be the most interesting technique
     235for the static analysis of the cost of basic blocks in presence of complex
     236hardware architectures that do not have non compositional cost models.
     238We start presenting an idealized model of the execution of a generic
     239microprocessor (with caches) that has all interrupts disabled and no I/O
     242Let $\sigma,\sigma_1,\ldots$ range over $\Sigma$, the set of the fragments of
     243the microprocessor states that hold the program counter, the program status word and
     244all the data manipulated by the object code program, i.e. registers and
     245memory cells. We call these fragments the \emph{data states}.
     247Let $\delta,\delta_1,\ldots$ range over $\Delta$, the set of the
     248fragments of the microprocessor state that holds the internal state of the
     249microprocessor (e.g. the content of the pipeline and caches, the status of
     250the branch prediction unit, etc.).
     251The internal state of the microprocessor influences the execution cost of the
     252next instruction, but it has no effect on the functional behaviour of the
     253processor. The whole state of the processor is represented by a pair
     256Let $I,I_1,\ldots$ range over $\mathcal{I}$, the
     257the set of instructions of the processor and let
     258$\gamma,\gamma_1,\ldots$ range over $\Gamma$, the set of operands
     259of instructions after the fetching and decoding passes.
     260Thus a pair $(I,\gamma)$ represents a decoded instruction and already contains
     261the data required for execution. Execution needs to access the data state only
     262to write the result.
     264Let $fetch: \Sigma \to \mathcal{I} \times \Gamma$ be the function that
     265performs the fetching and execution phases, returning the decoded instruction
     266ready for execution. This is not meant to be the real fetch-decode function,
     267that exploits the internal state too to speed up execution (e.g. by retrieving
     268the instruction arguments from caches) and that, in case of pipelines, works
     269in several stages. However, such a function exists and
     270it is observationally equivalent to the real fetch-decode.
     272We capture the semantics of the microprocessor with the following set of
     275 \item The functional transition function $\longleftarrow : \Sigma \to \Sigma$
     276   over data states. This is the only part of the semantics that is relevant
     277   to functional analysis.
     278 \item The internal state transition function
     279   $\Longleftarrow : \Sigma \times \Delta \to \Delta$ that updates the internal
     280   state.
     281 \item The cost function $K: \mathcal{I} \times \Delta \to \mathbb{N}$ that
     282   assigns a cost to transitions. Since decoded instructions hold the data
     283   they act on, the cost of an instruction may depend both on the data being
     284   manipulated and on the internal state.
     287Given a processor state $(\sigma,\delta)$, the processor evoltes in the
     288new state $(\sigma',\delta')$ in $n$ cost units if
     289$\sigma \longleftarrow \sigma'$ and $(\sigma,\delta) \Longleftarrow \sigma'$
     290and $fetch(\sigma) = (I,\gamma)$ and $K(I,\gamma,\delta) = n$.
     292We claim this simple model to be generic enough to cover real world
Note: See TracChangeset for help on using the changeset viewer.