# Changeset 1427 for Deliverables/D4.2-4.3/reports

Ignore:
Timestamp:
Oct 20, 2011, 3:07:31 PM (9 years ago)
Message:

 r1418 As a result, we see that the semantics of LIN and LTL are both instances of a single, more general language that is parametric in how the next instruction is fetched. Furthermore, any prospective proof that the semantics of LTL and LIN are identical is not almost trivial, saving a deal of work in Deliverable D4.4. Furthermore, any prospective proof that the semantics of LTL and LIN are identical is now almost trivial, saving a deal of work in Deliverable D4.4. %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-% \texttt{pair\_reg} & Various different move' instructions have been merged into a single move instruction in the joint language.  A value can either be moved to or from the accumulator in some languages, or moved to and from an arbitrary pseudoregister in others.  This type encodes how we should move data around the registers and accumulators. \\ \texttt{generic\_reg} & The representation of generic registers (i.e. those that are not devoted to a specific task). \\ \texttt{call\_args} & The number of arguments to a function.  For some languages this is irrelevant. \\ \texttt{call\_dest} & \\ \texttt{call\_args} & The actual arguments passed to a function.  For some languages this is simply the number of arguments passed to the function. \\ \texttt{call\_dest} & The destination of the function call. \\ \texttt{extend\_statements} & Instructions that are specific to a particular intermediate language, and which cannot be abstracted into the joint language. \end{tabular*} | COST_LABEL: costlabel → joint_instruction p globals ... \end{lstlisting} We extend \texttt{params\_\_} with a type corresponding to labels, \texttt{succ}, obtaining a new record type of parameters called \texttt{params\_}: | OP1: Op1 → acc_a_reg p → acc_a_reg p → joint_instruction p globals ... \end{lstlisting} Here, we see that the instruction \texttt{OP1} (a unary operation on the accumulator A) can be given quite a specific type, through the use of the \texttt{params\_\_} data structure. Joint statements can be split into two subclasses: those who simply pass the flow of control onto their successor statement, and those that jump to a potentially remote location in the program. Naturally, as some intermediate languages are graph based, and others linearised, the passing act of passing control on to the successor' instruction can either be the act of following a graph edge in a control flow graph, or incrementing an index into a list. We make a distinction between instructions that pass control onto their immediate successors, and those that jump elsewhere in the program, through the use of \texttt{succ}, denoting the immediate successor of the current instruction, in the \texttt{params\_} record described below. \begin{lstlisting} record params_: Type[1] ≝ }. \end{lstlisting} Here, \texttt{resultT} and \texttt{resultT} typically are the (pseudo)registers that store the parameters and result of a function. We further extend \texttt{params0} with a type for local variables in internal function calls: \begin{lstlisting} The first two universe' fields are only used in the compilation process, for generating fresh names, and do not affect the semantics. The rest of the fields affect both compilation and semantics. In particular, we have parameterised result types, function parameter types and the type of local variables. In particular, we have a description of the result, parameters and the local variables of a function. Note also that we have lifted the hypothesised \texttt{lookup} function from \texttt{params} into a dependent sigma type, which combines a label (the entry and exit points of the control flow graph or list) combined with a proof that the label is in the graph structure: \begin{lstlisting} framesT: Type[0]; empty_framesT: framesT; regsT: Type[0]; empty_regsT: regsT; call_args_for_main: call_args p; call_dest_for_main: call_dest p; succ_pc: succ p → address → res address; greg_store_: generic_reg p → beval → regsT → res regsT; greg_retrieve_: regsT → generic_reg p → res beval; }. \end{lstlisting} For instance, \texttt{greg\_store\_} and \texttt{greg\_retrieve\_} store and retrieve values from a generic register, respectively. Here, the fields \texttt{empty\_framesT}, \texttt{empty\_regsT}, \texttt{call\_args\_for\_main} and \texttt{call\_dest\_for\_main} are used for state initialisation. The field \texttt{succ\_pc} takes an address, and a successor' label, and returns the address of the instruction immediately succeeding the one at hand. The fields \texttt{greg\_store\_} and \texttt{greg\_retrieve\_} store and retrieve values from a generic register, respectively. Similarly, \texttt{pair\_reg\_move} implements the generic move instruction of the joint language. Here \texttt{framesT} is the type of stack frames, with \texttt{empty\_framesT} an empty stack frame. In particular, we need to know when the \texttt{main} function has finished executing. But this is complicated, in C, by the fact that the \texttt{main} function is explicitly allowed to be recursive (disallowed in C++). Therefore, to understand whether the exiting \texttt{main} function is really exiting, or just recursively calling itself, we need to remember the address at which \texttt{main} is located. Therefore, to understand whether the exiting \texttt{main} function is really exiting, or just recursively calling itself, we need to remember the address to which \texttt{main} will return control once the initial call to \texttt{main} has finished executing. This is done with \texttt{call\_dest\_for\_main}, whereas \texttt{call\_args\_for\_main} holds the \texttt{main} function's arguments. }. \end{lstlisting} Here, \texttt{fetch\_statement} fetches the next statement to be executed, \texttt{save\_frame} and \texttt{pop\_frame} manipulate stack frames, \texttt{set\_result} saves the result of the function computation, and \texttt{exec\_extended} is a function that executes the extended statements, peculiar to each individual intermediate language. Here, \texttt{fetch\_statement} fetches the next statement to be executed. The fields \texttt{save\_frame} and \texttt{pop\_frame} manipulate stack frames. In particular, \texttt{save\_frame} creates a new stack frame on the top of the stack, saving the destination and parameters of a function, and returning an updated state. The field \texttt{pop\_frame} destructively pops a stack frame from the stack, returning an updated state. Further, \texttt{set\_result} saves the result of the function computation, and \texttt{exec\_extended} is a function that executes the extended statements, peculiar to each individual intermediate language. We bundle \texttt{params} and \texttt{sem\_params} together into a single record. pc: address; sp: pointer; isp: pointer; carry: beval; regs: regsT ? p; }. \end{lstlisting} Here \texttt{st\_frms} represent stack frames, \texttt{pc} the program counter, \texttt{sp} the stack pointer, \texttt{carry} the carry flag, \texttt{regs} the generic registers and \texttt{m} external RAM. Here \texttt{st\_frms} represent stack frames, \texttt{pc} the program counter, \texttt{sp} the stack pointer, \texttt{isp} the internal stack pointer, \texttt{carry} the carry flag, \texttt{regs} the registers (hardware and pseudoregisters) and \texttt{m} external RAM. Note that we have two stack pointers, as we have two stacks: the physical stack of the MCS-51 microprocessor, and an emulated stack in external RAM. The MCS-51's own stack is minuscule, therefore it is usual to emulate a much larger, more useful stack in external RAM. We require two stack pointers as the MCS-51's \texttt{PUSH} and \texttt{POP} instructions manipulate the physical stack, and not the emulated one. We use the function \texttt{eval\_statement} to evaluate a single joint statement: \begin{lstlisting} Monads and their use are further discussed in Subsection~\ref{subsect.use.of.monads}. Further, the function returns a new state, updated by the single step of execution of the program. Finally, a \emph{trace} is also returned, which records the trace of cost labels that the program passes through, in order to calculate the cost model for the CerCo compiler. Finally, a \emph{trace} is also returned, which records externally observable events', such as the calling of external functions and the emission of cost labels. %-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-%-% This monadic infrastructure is shared between the frontend and backend languages. In particular, an IO' monad, signalling the emission or reading of data to some external location or memory address, is heavily used in the semantics of the intermediate languages. Here, the monad's sequencing operation ensures that emissions and reads are maintained in the correct order. Most functions in the intermediate language semantics fall into the IO monad. In particular, we have already seen how the \texttt{eval\_statement} function of the joint language is monadic, with type: In particular, an IO' monad, signalling the emission of a cost label, or the calling of an external function, is heavily used in the semantics of the intermediate languages. Here, the monad's sequencing operation ensures that cost label emissions and function calls are maintained in the correct order. We have already seen how the \texttt{eval\_statement} function of the joint language is monadic, with type: \begin{lstlisting} definition eval_statement: ... \end{lstlisting} If we in the body of \texttt{eval\_statement}, we may also see how the monad sequences effects. If we examine the body of \texttt{eval\_statement}, we may also see how the monad sequences effects. For instance, in the case for the \texttt{LOAD} statement, we have the following: \begin{lstlisting} We believe the sugaring for the monadic bind operation makes the program much more readable, and therefore easier to reason about. In particular, note that the functions \texttt{dph\_retrieve}, \texttt{pointer\_of\_address}, \texttt{acca\_store} and \texttt{next} are all monadic. The function \texttt{opt\_to\_res} is also --- this is a utility' function that lifts an option type into the \texttt{IO} monad. Note, however, that inside this monadic code, there is also another monad hiding. The \texttt{res} monad signals failure, along with an error message. The monad's sequencing operation ensures the order of error messages does not get rearranged. The function \texttt{opt\_to\_res} lifts an option type into this monad, with an error message to be used in case of failure. The \texttt{res} monad is then coerced into the \texttt{IO} monad, ensuring the whole code snippet typechecks. \subsection{Memory models} Right now, the two memory models are interfaced during the translation from RTLabs to RTL. It is an open question whether we will unify the two memory models, using only the backend, bespoke memory model throughout the compiler, as the CompCert memory model seems to work fine for the frotend, where such byte-by-byte copying is not needed. It is an open question whether we will unify the two memory models, using only the backend, bespoke memory model throughout the compiler, as the CompCert memory model seems to work fine for the frontend, where such byte-by-byte copying is not needed. However, should we decide to port the frontend to the new memory model, it has been written in such an abstract way that doing so would be relatively straightforward. Closing these axioms should not be a problem. Further, tailcalls to external functions are currently axiomatised. Most things related to external function calls are currently axiomatised. This is due to there being a difficulty with how stackframes are handled with external function calls. We leave this for further work, due to there being no pressing need to implement this feature at the present time. Title & Description & O'Caml & Ratio \\ \hline \texttt{joint/Joint.ma} & Abstracted syntax for backend languages & N/A & N/A \\ \texttt{RTLabs/syntax.ma} & The syntax of RTLabs & \texttt{RTLabs/RTLabs.mli} & 0.65 \\ \texttt{joint/Joint.ma} & Abstracted syntax for backend languages & N/A & N/A \\ \texttt{RTL/RTL.ma} & The syntax of RTL & \texttt{RTL/RTL.mli} & 1.85\tnote{b} \\ \texttt{ERTL/ERTL.ma} & The syntax of ERTL & \texttt{ERTL/ERTL.mli} & 1.04\tnote{b} \\ \begin{tablenotes} \item[a] Includes \texttt{joint/Joint\_LTL\_LIN.ma} and \texttt{joint/Joint.ma}. \item[b] Includes \texttt{joint/Joint.ma}. \item[b] Includes \texttt{joint/Joint.ma}. \\ Total lines of Matita code for the above files: 347. \\ Total lines of O'Caml code for the above files: 616. \\ Ration of total lines: 0.56. \end{tablenotes} \end{threeparttable} These are computed with \texttt{wc -l}, a standard Unix tool. Individual file's ratios are an over approximation, due to the fact that it's hard to relate an individual O'Caml file to the abstracted Matita code that has been spread across multiple files. The ratio between total Matita code lines and total O'Caml code lines is more reflective of the compressed and abstracted state of the Matita translation. Semantics specific files are presented in Table~\ref{table.semantics}. \begin{landscape} Title & Description & O'Caml & Ratio \\ \hline \texttt{RTLabs/semantics.ma} & Semantics of RTLabs & \texttt{RTLabs/RTLabsInterpret.ml} & 0.63 \\ \texttt{joint/semantics.ma} & Semantics of the abstracted languages & N/A & N/A  \\ \texttt{joint/SemanticUtils.ma} & Generic utilities used in semantics `joint' languages & N/A & N/A \\ \texttt{RTLabs/semantics.ma} & Semantics of RTLabs & \texttt{RTLabs/RTLabsInterpret.ml} & 0.63 \\ \texttt{RTL/semantics.ma} & Semantics of RTL & \texttt{RTL/RTLInterpret.ml} & 1.88\tnote{a} \\ \texttt{ERTL/semantics.ma} & Semantics of ERTL & \texttt{ERTL/ERTLInterpret.ml} & 1.22\tnote{a} \\ \texttt{LTL/semantics.ma} & Semantics of LTL & \texttt{LTL/LTLInterpret.ml} & 1.25\tnote{c} \\ \texttt{LIN/joint\_LTL\_LIN\_semantics.ma} & Semantics of the joint LTL-LIN language & N/A & N/A \\ \texttt{LTL/semantics.ma} & Semantics of LTL & \texttt{LTL/LTLInterpret.ml} & 1.25\tnote{a}\tnote{b} \\ \texttt{LIN/semantics.ma} & Semantics of LIN & \texttt{LIN/LINInterpret.ml} & 1.52\tnote{a}\tnote{b} \texttt{LIN/semantics.ma} & Semantics of LIN & \texttt{LIN/LINInterpret.ml} & 1.52\tnote{c} \end{tabular} \begin{tablenotes} \item{a} Includes \texttt{joint/semantics.ma} and \texttt{joint/SemanticUtils.ma}. \item{b} Includes \texttt{joint/joint\_LTL\_LIN\_semantics.ma}. \item{c} Includes \texttt{joint/semantics.ma}, \texttt{joint/SemanticUtils.ma} and \texttt{joint/joint\_LTL\_LIN\_semantics.ma}. \\ Total lines of Matita code for the above files: 1125. \\ Total lines of O'Caml code for the above files: 1978. \\ Ration of total lines: 0.57. \end{tablenotes} \end{threeparttable}