Changeset 1427 for Deliverables/D4.2-4.3/reports/D4-3.tex
- Timestamp:
- Oct 20, 2011, 3:07:31 PM (9 years ago)
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
Deliverables/D4.2-4.3/reports/D4-3.tex
r1418 r1427 163 163 164 164 As a result, we see that the semantics of LIN and LTL are both instances of a single, more general language that is parametric in how the next instruction is fetched. 165 Furthermore, any prospective proof that the semantics of LTL and LIN are identical is no talmost trivial, saving a deal of work in Deliverable D4.4.165 Furthermore, any prospective proof that the semantics of LTL and LIN are identical is now almost trivial, saving a deal of work in Deliverable D4.4. 166 166 167 167 %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-% … … 201 201 \texttt{pair\_reg} & Various different `move' instructions have been merged into a single move instruction in the joint language. A value can either be moved to or from the accumulator in some languages, or moved to and from an arbitrary pseudoregister in others. This type encodes how we should move data around the registers and accumulators. \\ 202 202 \texttt{generic\_reg} & The representation of generic registers (i.e. those that are not devoted to a specific task). \\ 203 \texttt{call\_args} & The number of arguments to a function. For some languages this is irrelevant. \\204 \texttt{call\_dest} & \\203 \texttt{call\_args} & The actual arguments passed to a function. For some languages this is simply the number of arguments passed to the function. \\ 204 \texttt{call\_dest} & The destination of the function call. \\ 205 205 \texttt{extend\_statements} & Instructions that are specific to a particular intermediate language, and which cannot be abstracted into the joint language. 206 206 \end{tabular*} … … 213 213 | COST_LABEL: costlabel → joint_instruction p globals 214 214 ... 215 \end{lstlisting} 216 217 We extend \texttt{params\_\_} with a type corresponding to labels, \texttt{succ}, obtaining a new record type of parameters called \texttt{params\_}: 215 | OP1: Op1 → acc_a_reg p → acc_a_reg p → joint_instruction p globals 216 ... 217 \end{lstlisting} 218 Here, we see that the instruction \texttt{OP1} (a unary operation on the accumulator A) can be given quite a specific type, through the use of the \texttt{params\_\_} data structure. 219 220 Joint statements can be split into two subclasses: those who simply pass the flow of control onto their successor statement, and those that jump to a potentially remote location in the program. 221 Naturally, as some intermediate languages are graph based, and others linearised, the passing act of passing control on to the `successor' instruction can either be the act of following a graph edge in a control flow graph, or incrementing an index into a list. 222 We make a distinction between instructions that pass control onto their immediate successors, and those that jump elsewhere in the program, through the use of \texttt{succ}, denoting the immediate successor of the current instruction, in the \texttt{params\_} record described below. 218 223 \begin{lstlisting} 219 224 record params_: Type[1] ≝ … … 243 248 }. 244 249 \end{lstlisting} 250 Here, \texttt{resultT} and \texttt{resultT} typically are the (pseudo)registers that store the parameters and result of a function. 251 245 252 We further extend \texttt{params0} with a type for local variables in internal function calls: 246 253 \begin{lstlisting} … … 264 271 The first two `universe' fields are only used in the compilation process, for generating fresh names, and do not affect the semantics. 265 272 The rest of the fields affect both compilation and semantics. 266 In particular, we have parameterised result types, function parameter types and the type of local variables.273 In particular, we have a description of the result, parameters and the local variables of a function. 267 274 Note also that we have lifted the hypothesised \texttt{lookup} function from \texttt{params} into a dependent sigma type, which combines a label (the entry and exit points of the control flow graph or list) combined with a proof that the label is in the graph structure: 268 275 \begin{lstlisting} … … 308 315 framesT: Type[0]; 309 316 empty_framesT: framesT; 317 310 318 regsT: Type[0]; 311 319 empty_regsT: regsT; 320 312 321 call_args_for_main: call_args p; 313 322 call_dest_for_main: call_dest p; 323 314 324 succ_pc: succ p → address → res address; 325 315 326 greg_store_: generic_reg p → beval → regsT → res regsT; 316 327 greg_retrieve_: regsT → generic_reg p → res beval; … … 325 336 }. 326 337 \end{lstlisting} 327 For instance, \texttt{greg\_store\_} and \texttt{greg\_retrieve\_} store and retrieve values from a generic register, respectively. 338 Here, the fields \texttt{empty\_framesT}, \texttt{empty\_regsT}, \texttt{call\_args\_for\_main} and \texttt{call\_dest\_for\_main} are used for state initialisation. 339 340 The field \texttt{succ\_pc} takes an address, and a `successor' label, and returns the address of the instruction immediately succeeding the one at hand. 341 342 The fields \texttt{greg\_store\_} and \texttt{greg\_retrieve\_} store and retrieve values from a generic register, respectively. 328 343 Similarly, \texttt{pair\_reg\_move} implements the generic move instruction of the joint language. 329 344 Here \texttt{framesT} is the type of stack frames, with \texttt{empty\_framesT} an empty stack frame. … … 332 347 In particular, we need to know when the \texttt{main} function has finished executing. 333 348 But this is complicated, in C, by the fact that the \texttt{main} function is explicitly allowed to be recursive (disallowed in C++). 334 Therefore, to understand whether the exiting \texttt{main} function is really exiting, or just recursively calling itself, we need to remember the address at which \texttt{main} is located.349 Therefore, to understand whether the exiting \texttt{main} function is really exiting, or just recursively calling itself, we need to remember the address to which \texttt{main} will return control once the initial call to \texttt{main} has finished executing. 335 350 This is done with \texttt{call\_dest\_for\_main}, whereas \texttt{call\_args\_for\_main} holds the \texttt{main} function's arguments. 336 351 … … 361 376 }. 362 377 \end{lstlisting} 363 Here, \texttt{fetch\_statement} fetches the next statement to be executed, \texttt{save\_frame} and \texttt{pop\_frame} manipulate stack frames, \texttt{set\_result} saves the result of the function computation, and \texttt{exec\_extended} is a function that executes the extended statements, peculiar to each individual intermediate language. 378 Here, \texttt{fetch\_statement} fetches the next statement to be executed. 379 The fields \texttt{save\_frame} and \texttt{pop\_frame} manipulate stack frames. 380 In particular, \texttt{save\_frame} creates a new stack frame on the top of the stack, saving the destination and parameters of a function, and returning an updated state. 381 The field \texttt{pop\_frame} destructively pops a stack frame from the stack, returning an updated state. 382 Further, \texttt{set\_result} saves the result of the function computation, and \texttt{exec\_extended} is a function that executes the extended statements, peculiar to each individual intermediate language. 364 383 365 384 We bundle \texttt{params} and \texttt{sem\_params} together into a single record. … … 380 399 pc: address; 381 400 sp: pointer; 401 isp: pointer; 382 402 carry: beval; 383 403 regs: regsT ? p; … … 385 405 }. 386 406 \end{lstlisting} 387 Here \texttt{st\_frms} represent stack frames, \texttt{pc} the program counter, \texttt{sp} the stack pointer, \texttt{carry} the carry flag, \texttt{regs} the generic registers and \texttt{m} external RAM. 407 Here \texttt{st\_frms} represent stack frames, \texttt{pc} the program counter, \texttt{sp} the stack pointer, \texttt{isp} the internal stack pointer, \texttt{carry} the carry flag, \texttt{regs} the registers (hardware and pseudoregisters) and \texttt{m} external RAM. 408 Note that we have two stack pointers, as we have two stacks: the physical stack of the MCS-51 microprocessor, and an emulated stack in external RAM. 409 The MCS-51's own stack is minuscule, therefore it is usual to emulate a much larger, more useful stack in external RAM. 410 We require two stack pointers as the MCS-51's \texttt{PUSH} and \texttt{POP} instructions manipulate the physical stack, and not the emulated one. 411 388 412 We use the function \texttt{eval\_statement} to evaluate a single joint statement: 389 413 \begin{lstlisting} … … 397 421 Monads and their use are further discussed in Subsection~\ref{subsect.use.of.monads}. 398 422 Further, the function returns a new state, updated by the single step of execution of the program. 399 Finally, a \emph{trace} is also returned, which records the trace of cost labels that the program passes through, in order to calculate the cost model for the CerCo compiler.423 Finally, a \emph{trace} is also returned, which records externally observable `events', such as the calling of external functions and the emission of cost labels. 400 424 401 425 %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-% … … 439 463 This monadic infrastructure is shared between the frontend and backend languages. 440 464 441 In particular, an `IO' monad, signalling the emission or reading of data to some external location or memory address, is heavily used in the semantics of the intermediate languages. 442 Here, the monad's sequencing operation ensures that emissions and reads are maintained in the correct order. 443 Most functions in the intermediate language semantics fall into the IO monad. 444 In particular, we have already seen how the \texttt{eval\_statement} function of the joint language is monadic, with type: 465 In particular, an `IO' monad, signalling the emission of a cost label, or the calling of an external function, is heavily used in the semantics of the intermediate languages. 466 Here, the monad's sequencing operation ensures that cost label emissions and function calls are maintained in the correct order. 467 We have already seen how the \texttt{eval\_statement} function of the joint language is monadic, with type: 445 468 \begin{lstlisting} 446 469 definition eval_statement: … … 449 472 ... 450 473 \end{lstlisting} 451 If we inthe body of \texttt{eval\_statement}, we may also see how the monad sequences effects.474 If we examine the body of \texttt{eval\_statement}, we may also see how the monad sequences effects. 452 475 For instance, in the case for the \texttt{LOAD} statement, we have the following: 453 476 \begin{lstlisting} … … 486 509 We believe the sugaring for the monadic bind operation makes the program much more readable, and therefore easier to reason about. 487 510 In particular, note that the functions \texttt{dph\_retrieve}, \texttt{pointer\_of\_address}, \texttt{acca\_store} and \texttt{next} are all monadic. 488 The function \texttt{opt\_to\_res} is also --- this is a `utility' function that lifts an option type into the \texttt{IO} monad. 511 512 Note, however, that inside this monadic code, there is also another monad hiding. 513 The \texttt{res} monad signals failure, along with an error message. 514 The monad's sequencing operation ensures the order of error messages does not get rearranged. 515 The function \texttt{opt\_to\_res} lifts an option type into this monad, with an error message to be used in case of failure. 516 The \texttt{res} monad is then coerced into the \texttt{IO} monad, ensuring the whole code snippet typechecks. 489 517 490 518 \subsection{Memory models} … … 506 534 507 535 Right now, the two memory models are interfaced during the translation from RTLabs to RTL. 508 It is an open question whether we will unify the two memory models, using only the backend, bespoke memory model throughout the compiler, as the CompCert memory model seems to work fine for the fro tend, where such byte-by-byte copying is not needed.536 It is an open question whether we will unify the two memory models, using only the backend, bespoke memory model throughout the compiler, as the CompCert memory model seems to work fine for the frontend, where such byte-by-byte copying is not needed. 509 537 However, should we decide to port the frontend to the new memory model, it has been written in such an abstract way that doing so would be relatively straightforward. 510 538 … … 519 547 Closing these axioms should not be a problem. 520 548 521 Further, tailcalls to external functions are currently axiomatised.549 Most things related to external function calls are currently axiomatised. 522 550 This is due to there being a difficulty with how stackframes are handled with external function calls. 523 551 We leave this for further work, due to there being no pressing need to implement this feature at the present time. … … 548 576 Title & Description & O'Caml & Ratio \\ 549 577 \hline 578 \texttt{joint/Joint.ma} & Abstracted syntax for backend languages & N/A & N/A \\ 550 579 \texttt{RTLabs/syntax.ma} & The syntax of RTLabs & \texttt{RTLabs/RTLabs.mli} & 0.65 \\ 551 \texttt{joint/Joint.ma} & Abstracted syntax for backend languages & N/A & N/A \\552 580 \texttt{RTL/RTL.ma} & The syntax of RTL & \texttt{RTL/RTL.mli} & 1.85\tnote{b} \\ 553 581 \texttt{ERTL/ERTL.ma} & The syntax of ERTL & \texttt{ERTL/ERTL.mli} & 1.04\tnote{b} \\ … … 558 586 \begin{tablenotes} 559 587 \item[a] Includes \texttt{joint/Joint\_LTL\_LIN.ma} and \texttt{joint/Joint.ma}. 560 \item[b] Includes \texttt{joint/Joint.ma}. 588 \item[b] Includes \texttt{joint/Joint.ma}. \\ 589 Total lines of Matita code for the above files: 347. \\ 590 Total lines of O'Caml code for the above files: 616. \\ 591 Ration of total lines: 0.56. 561 592 \end{tablenotes} 562 593 \end{threeparttable} … … 569 600 These are computed with \texttt{wc -l}, a standard Unix tool. 570 601 602 Individual file's ratios are an over approximation, due to the fact that it's hard to relate an individual O'Caml file to the abstracted Matita code that has been spread across multiple files. 603 The ratio between total Matita code lines and total O'Caml code lines is more reflective of the compressed and abstracted state of the Matita translation. 604 571 605 Semantics specific files are presented in Table~\ref{table.semantics}. 572 606 \begin{landscape} … … 576 610 Title & Description & O'Caml & Ratio \\ 577 611 \hline 578 \texttt{RTLabs/semantics.ma} & Semantics of RTLabs & \texttt{RTLabs/RTLabsInterpret.ml} & 0.63 \\579 612 \texttt{joint/semantics.ma} & Semantics of the abstracted languages & N/A & N/A \\ 580 613 \texttt{joint/SemanticUtils.ma} & Generic utilities used in semantics `joint' languages & N/A & N/A \\ 614 \texttt{RTLabs/semantics.ma} & Semantics of RTLabs & \texttt{RTLabs/RTLabsInterpret.ml} & 0.63 \\ 581 615 \texttt{RTL/semantics.ma} & Semantics of RTL & \texttt{RTL/RTLInterpret.ml} & 1.88\tnote{a} \\ 582 616 \texttt{ERTL/semantics.ma} & Semantics of ERTL & \texttt{ERTL/ERTLInterpret.ml} & 1.22\tnote{a} \\ 617 \texttt{LTL/semantics.ma} & Semantics of LTL & \texttt{LTL/LTLInterpret.ml} & 1.25\tnote{c} \\ 583 618 \texttt{LIN/joint\_LTL\_LIN\_semantics.ma} & Semantics of the joint LTL-LIN language & N/A & N/A \\ 584 \texttt{LTL/semantics.ma} & Semantics of LTL & \texttt{LTL/LTLInterpret.ml} & 1.25\tnote{a}\tnote{b} \\ 585 \texttt{LIN/semantics.ma} & Semantics of LIN & \texttt{LIN/LINInterpret.ml} & 1.52\tnote{a}\tnote{b} 619 \texttt{LIN/semantics.ma} & Semantics of LIN & \texttt{LIN/LINInterpret.ml} & 1.52\tnote{c} 586 620 \end{tabular} 587 621 \begin{tablenotes} 588 622 \item{a} Includes \texttt{joint/semantics.ma} and \texttt{joint/SemanticUtils.ma}. 589 623 \item{b} Includes \texttt{joint/joint\_LTL\_LIN\_semantics.ma}. 624 \item{c} Includes \texttt{joint/semantics.ma}, \texttt{joint/SemanticUtils.ma} and \texttt{joint/joint\_LTL\_LIN\_semantics.ma}. \\ 625 Total lines of Matita code for the above files: 1125. \\ 626 Total lines of O'Caml code for the above files: 1978. \\ 627 Ration of total lines: 0.57. 590 628 \end{tablenotes} 591 629 \end{threeparttable}
Note: See TracChangeset
for help on using the changeset viewer.