Changeset 3430
Timestamp: Feb 13, 2014, 2:12:30 PM
Files: 1 edited (Papers/fopara2013/fopara13.tex)
Legend:
  (no prefix)  unmodified context line
  +            added in r3430
  -            removed from r3429
Papers/fopara2013/fopara13.tex
(diff: r3429 -> r3430)

@@ r3429 lines 98-102, r3430 lines 98-102 @@
  the development of a technique for analysing non-functional
  properties of programs (time, space) at the source level, with little or no loss of accuracy
- and with a small, trusted code base. The main software component
+ and with a small trusted code base. The main software component
  developed is a certified compiler producing cost annotations. The compiler
  translates source code to object code and produces an instrumented copy of the

@@ r3429 lines 113-119, r3430 lines 113-120 @@
  \section{Introduction}

- \paragraph{Problem statement.} Computer programs can be specified with both
+ %\paragraph{Problem statement.}
+ Computer programs can be specified with both
  functional constraints (what a program must do) and non-functional constraints
- (e.g. what resources---time, space, etc.---the program may use). In the current
+ (what resources---time, space, etc.---the program may use). In the current
  state of the art, functional properties are verified for high-level source code
  by combining user annotations (e.g. preconditions and invariants) with a

@@ r3429 lines 121-125, r3430 lines 122-126 @@
  interpretation, theorem proving, and so on). By contrast, non-functional properties
  are generally checked on low-level object code, but also demand information
- about high-level functional behaviour that must somehow be recreated.
+ about high-level functional behaviour that must somehow be reconstructed.

  This situation presents several problems: 1) it can be hard to infer this

@@ r3429 lines 129-134, r3430 lines 130-135 @@
  hard: how can we reflect a cost that depends on the execution state (e.g. the
  value of a register or a carry bit) to a cost that the user can understand
- looking at source code? 4) functional analysis performed only on object code
- makes any contribution from the programmer hard, leading to less precision in
+ looking at source code? 4) performing functional analysis on the object code
+ makes it hard for the programmer to provide information, leading to less precision in
  the estimates.

@@ r3429 lines 146-150, r3430 lines 147-151 @@
  of program structure through compilation and optimisation, and to exploit that
  information to define a cost model for source code that is precise, non-uniform,
- and accounts for runtime state. With such a source-level cost model we can
+ and which accounts for runtime state. With such a source-level cost model we can
  reduce non-functional verification to the functional case and exploit the state
  of the art in automated high-level verification~\cite{survey}. The techniques

@@ r3429 lines 153-157, r3430 lines 154-158 @@
  analysis. Where the approach produces precise cost models too complex to reason
  about, safe approximations can be used to trade complexity with precision.
- Finally, source code analysis can be made during early development stages, when
+ Finally, source code analysis can be used during the early stages of development, when
  components have been specified but not implemented: source code modularity means
  that it is enough to specify the non-functional behaviour of unimplemented
@@ r3429 lines 163-176, r3430 lines 164-180 @@
  how to formally prove the correctness of compilers implementing this technique.
  We have implemented such a compiler from C to object binaries for the 8051
- micro-controller, and verified it in an interactive theorem prover. We have
- implemented a Frama-C plugin \cite{framac} that invokes the compiler on a source
- program and uses this to generate invariants on the high-level source that
- correctly model low-level costs. Finally, the plugin certifies that the program
- respects these costs by calling automated theorem provers, a new and innovative
- technique in the field of cost analysis. As a case study, we show how the
- plugin can automatically compute and certify the exact reaction time of Lustre
- \cite{lustre} data flow programs compiled into C.
-
- \section{Project context and objectives}
+ micro-controller, and verified it in an interactive theorem prover.
+
+ To demonstrate source-level verification of costs we have implemented
+ a Frama-C plugin \cite{framac} that invokes the compiler on a source
+ program and uses this to generate invariants on the high-level source
+ that correctly model low-level costs. The plugin certifies that the
+ program respects these costs by calling automated theorem provers, a
+ new and innovative technique in the field of cost analysis. Finally,
+ we have conducted several case studies, including showing that the
+ plugin can automatically compute and certify the exact reaction time
+ of Lustre~\cite{lustre} data flow programs compiled into C.
+
+ \section{Project context and approach}
  Formal methods for verification of functional properties of programs have
  now reached a level of maturity and automation that is facilitating a slow but

@@ r3429 lines 185-195, r3430 lines 189-201 @@

  The scenario for the verification of non-functional properties (time spent,
- memory used, energy consumed) is bleaker and the future seems to be getting even
- worse. Most industries verify that real time systems meet their deadlines
+ memory used, energy consumed) is bleaker.
+ % and the future seems to be getting even worse.
+ Most industries verify that real time systems meet their deadlines
  by simply performing many runs of the system and timing their execution, computing the
  maximum time and adding an empirical safety margin, claiming the result to be a
  bound for the WCET of the program. Formal methods and software to statically
  analyse the WCET of programs exist, but they often produce bounds that are too
- pessimistic to be useful. Recent advancements in hardware architectures are all
+ pessimistic to be useful. Recent advancements in hardware architecture
+ have been
  focused on the improvement of the average case performance, not the
  predictability of the worst case. Execution time is becoming increasingly
@@ r3429 lines 199-215, r3430 lines 205-222 @@
  static analysis in isolation, because programs are less and less time
  composable. Clock-precise hardware models are necessary for static analysis, and
- obtaining them is becoming harder as a consequence of the increased sophistication
+ obtaining them is becoming harder due to the increased sophistication
  of hardware design.

- Despite the aforementioned problems, the need for reliable real time systems and programs is
- increasing, and there is increasing pressure from the research community towards
- the differentiation of hardware. The aim is to introduce alternative
- hardware with more predictable behaviour and hence more suitability for static
- analyses, for example, the decoupling of execution time from execution history
- by introducing randomisation \cite{proartis}.
+ Despite the aforementioned problems, the need for reliable real time
+ systems and programs is increasing, and there is increasing pressure
+ from the research community towards the introduction of alternative
+ hardware with more predictable behaviour, which would be more suitable
+ for static analysis. One example being investigated by the Proartis
+ project~\cite{proartis} is to decouple execution time from execution
+ history by introducing randomisation.

  In the CerCo project \cite{cerco} we do not try to address this problem, optimistically
- assuming that static analysis of non-functional properties of programs will
- become feasible in the longer term. The main objective of our work is
+ assuming that improvements in low-level timing analysis or architecture will make
+ verification feasible in the longer term. The main objective of our work is
  instead to bring together static analysis of functional and non-functional
  properties, which, according to the current state of the art, are completely

@@ r3429 lines 231-242, r3430 lines 238-249 @@
  program-independent cost model on the source language, the literature on the
  analysis of non-asymptotic execution time on high level languages that assumes
- such a model is growing and gaining momentum. However, unless we can provide a
+ such a model is growing and gaining momentum. However, unless we provide a
  replacement for such cost models, this literature's future practical impact looks
  to be minimal. Some hope has been provided by the EmBounded project \cite{embounded}, which
  compositionally compiles high-level code to a byte code that is executed by an
- emulator with guarantees on the maximal execution time spent for each byte code
- instruction. That approach provides a uniform model at the price of the model's
- precision (each cost is a pessimistic upper bound) and performance of the
- executed code (because the byte code is emulated compositionally instead of
+ interpreter with guarantees on the maximal execution time spent for each byte code
+ instruction. This provides a uniform model at the expense of the model's
+ precision (each cost is a pessimistic upper bound) and the performance of the
+ executed code (because the byte code is interpreted compositionally instead of
  performing a fully non-compositional compilation).

@@ r3429 lines 244-265, r3430 lines 251-272 @@
  the worst case execution time of small code fragments in isolation (e.g. loop
  bodies) and then adding up the bounds yields very poor estimates because no
- knowledge on the hardware state can be assumed when executing the fragment. By
+ knowledge of the hardware state on executing the fragment can be assumed. By
  analysing longer runs the bound obtained becomes more precise because the lack
- of knowledge on the initial state has less of an effect on longer computations.
-
- The cost of an execution is the sum of the cost of basic blocks multiplied by
- the number of times they are executed, which is a functional property of the
- program. Therefore, in order to perform (parametric) time analysis of programs,
- it is necessary to combine a cost model with control and data flow analysis.
- Current state of the art WCET technology such as AbsInt
- (see~\cite{stateart} for a survey) performs the analysis on the object code, where the logic of the
- program is harder to reconstruct and most information available at the source
- code level has been lost. Imprecision in the analysis leads to useless bounds. To
- augment precision, the tools ask the user to provide constraints on the object
- code control flow, usually in the form of bounds on the number of iterations of
- loops or linear inequalities on them. This requires the user to manually link
- the source and object code, translating his assumptions on the source code
- (which may be wrong) to object code constraints. The task is error prone and
- hard, especially in the presence of complex compiler optimisations.
+ of information about the initial state has a relatively small impact.
+
+ To calculate the cost of an execution, value and control flow analyses
+ are required to bound the number of times each basic block is
+ executed. Current state
+ of the art WCET technology such as AbsInt performs these analyses on the
+ object code, where the logic of the program is harder to reconstruct
+ and most information available at the source code level has been lost;
+ see~\cite{stateart} for a survey. Imprecision in the analysis can lead to useless bounds. To
+ augment precision, the tools ask the user to provide constraints on
+ the object code control flow, usually in the form of bounds on the
+ number of iterations of loops or linear inequalities on them. This
+ requires the user to manually link the source and object code,
+ translating his assumptions on the source code (which may be wrong) to
+ object code constraints. The task is error prone and hard, especially
+ in the presence of complex compiler optimisations.

  Traditional techniques for WCET that work on object code are also affected by
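Both the old and the new wording of this hunk rest on the same decomposition: the cost of a run is accumulated per basic block, weighted by how often each block executes, and the flow analyses exist to bound those execution counts. A minimal LaTeX sketch of that decomposition, with notation that is ours rather than the paper's (pi for an execution trace, B for the set of basic blocks):

    % Sketch only; \pi, B and n_\pi are our notation, not the paper's.
    \[
      \mathrm{cost}(\pi) \;=\; \sum_{b \in B} \mathrm{cost}(b)\cdot n_\pi(b)
    \]
    % cost(b): machine-level cost of block b (the program-dependent part that
    %          the CerCo compiler measures and lifts back to the source);
    % n_\pi(b): number of times b is executed along \pi (the quantity that the
    %           value and control flow analyses above must bound).

In these terms, a user-supplied loop bound or linear inequality of the kind mentioned in the hunk is simply an upper bound on some n_pi(b).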
@@ r3429 lines 275-285, r3430 lines 282-292 @@
  code is translated, must return the cost model for basic blocks of high level
  instructions. It must do so by keeping track of the control flow modifications
- to reverse them and by interfacing with static analysers.
+ to reverse them and by interfacing with processor timing analysis.

  By embracing compilation, instead of avoiding it like EmBounded did, a CerCo
- compiler can at the same time produce efficient code and return costs that are
- as precise as the static analysis can be. Moreover, we allow our costs to be
- parametric: the cost of a block can depend on actual program data or on a
- summary of the execution history or on an approximated representation of the
+ compiler can both produce efficient code and return costs that are
+ as precise as the processor timing analysis can be. Moreover, our costs can be
+ parametric: the cost of a block can depend on actual program data, on a
+ summary of the execution history, or on an approximated representation of the
  hardware state. For example, loop optimizations may assign to a loop body a cost
  that is a function of the number of iterations performed. As another example,

@@ r3429 lines 298-304, r3430 lines 305-312 @@
  can be immediately reused and multiple techniques can collaborate together to
  infer and certify cost invariants for the program. Parametric cost analysis
- becomes the default one, with non parametric bounds used as last resorts when
+ becomes the default one, with non-parametric bounds used as a last resort when
  trading the complexity of the analysis with its precision. \emph{A priori}, no
- technique previously used in traditional WCET is lost: they can just be applied
+ technique previously used in traditional WCET is lost: processor
+ timing analyses can be used by the compiler on the object code, and the rest can be applied
  at the source code level.
  Our approach can also work in the early

@@ r3429 lines 307-311, r3430 lines 315-319 @@

  All software used to verify properties of programs must be as bug free as
- possible. The trusted code base for verification is made by the code that needs
+ possible. The trusted code base for verification consists of the code that needs
  to be trusted to believe that the property holds. The trusted code base of
  state-of-the-art WCET tools is very large: one needs to trust the control flow

@@ r3429 lines 314-319, r3430 lines 322-327 @@
  code and we are introducing a non-standard compiler too. To reduce the trusted
  code base, we implemented a prototype and a static analyser in an interactive
- theorem prover, which was used to certify that the cost computed on the source
- code is indeed the one actually spent by the hardware. Formal models of the
+ theorem prover, which was used to certify that the costs added to the source
+ code are indeed those incurred by the hardware. Formal models of the
  hardware and of the high level source languages were also implemented in the
  interactive theorem prover. Control flow analysis on the source code has been

@@ r3429 lines 442-454, r3430 lines 450-465 @@
  execution time and the stack space usage. The red lines in
  \autoref{itest1} introducing variables, functions and function calls
- starting \lstinline'__cost' and \lstinline'__stack' are the instrumentation introduced by the
- compiler.
+ starting with \lstinline'__cost' and \lstinline'__stack' are the instrumentation introduced by the
+ compiler. For example, the two calls at the start of
+ \lstinline'count' say that 4 bytes of stack are required, and that it
+ takes 111 cycles to reach the next cost annotation (in the loop body).
+ The compiler measures these on the labeled object code that it generates.

  The annotated program can then be enriched with complexity
  assertions in the style of Hoare logic, that are passed to a deductive
  platform (in our case Frama-C). We provide as a Frama-C cost plugin a
- simple automatic synthesiser for complexity assertions (the blue
- comments starting with \lstinline'/*@' in \autoref{itest1}), which can
- be overridden by the user to increase or decrease accuracy. From the
+ simple automatic synthesiser for complexity assertions which can
+ be overridden by the user to increase or decrease accuracy. These are the blue
+ comments starting with \lstinline'/*@' in \autoref{itest1}. From the
  assertions, a general purpose deductive platform produces proof
  obligations which in turn can be closed by automatic or interactive

@@ r3429 lines 461-465, r3430 lines 472-476 @@
  to prove that the whole program execution takes at most 1358 cycles,
  etc.). Note that the synthesised time bound for \lstinline'count',
- $178+214*(1+\lstinline'len')$ cycles, is parametric in the length of
+ $178+214*(1+\text{\lstinline'len'})$ cycles, is parametric in the length of
  the array. The CVC3 prover
  closes all obligations within half a minute on routine commodity

@@ r3429 lines 489-493, r3430 lines 500-504 @@
  ensures __cost <= \old(__cost)+(178+214*__max(1+\at(len,Pre), 0));
  */
- int count(unsigned char *p, int len) {
+ int count(char *p, int len) {
  char j; int found = 0;
  |__stack_incr(4); __cost_incr(111);|
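For readers who do not have \autoref{itest1} to hand, the following self-contained C sketch assembles the fragments visible in this changeset (the contract on __cost, the signature of count, and the __stack_incr(4)/__cost_incr(111) calls) into something that compiles and runs. The loop body, the per-iteration increment of 214 cycles, and the frame-release call are our guesses, added only to make the sketch executable; they are not taken from the paper.

    /* Hypothetical reconstruction of the instrumented `count' example.
     * Constants 4, 111 and the bound 178+214*(1+len) come from the paper;
     * everything else (loop body, 214 per iteration, __stack_decr) is assumed. */
    #include <stdio.h>

    static unsigned long __cost  = 0;   /* clock cycles consumed so far */
    static unsigned long __stack = 0;   /* stack bytes currently in use */

    static void __cost_incr(unsigned long k)  { __cost  += k; }
    static void __stack_incr(unsigned long k) { __stack += k; }
    static void __stack_decr(unsigned long k) { __stack -= k; }

    /* Synthesised bound (from the paper's ACSL contract):
     * __cost grows by at most 178 + 214*(1+len) cycles per call. */
    int count(char *p, int len) {
      char j; int found = 0;
      __stack_incr(4); __cost_incr(111);   /* calls shown in the paper's listing */
      for (j = 0; j < len; j++) {
        __cost_incr(214);                  /* assumed per-iteration cost */
        if (p[j] == 'x') found++;          /* assumed loop body */
      }
      __stack_decr(4);                     /* assumed frame release */
      return found;
    }

    int main(void) {
      char buf[] = "xaxbx";
      int n = count(buf, 5);
      printf("found=%d __cost=%lu cycles (bound %d) __stack=%lu bytes\n",
             n, __cost, 178 + 214 * (1 + 5), __stack);
      return 0;
    }

With these assumed constants, the run consumes 1181 of the 1462 cycles allowed by the bound, which is the point of the instrumented copy: a timing question becomes an ordinary assertion about the program variable __cost that a deductive platform can discharge.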
@@ r3429 lines 772-776, r3430 lines 783-787 @@
  many changes and more compiler code, but it is not fundamentally more
  complex. The proof of correctness, however, becomes much harder.
- \item We target a microprocessor that has a non uniform memory model, which is
+ \item We target a microprocessor that has a non-uniform memory model, which is
  still often the case for microprocessors used in embedded systems and that is
  becoming common again in multi-core processors. Therefore the compiler has to