Changeset 3127

Timestamp:
Apr 12, 2013, 4:03:35 PM (6 years ago)
Message:

report on general proof

Location:
Deliverables/D4.4
Files:
\usepackage[utf8x]{inputenc}
\usepackage{listings}
\usepackage[all]{xy}
\newcommand{\unif}{\ensuremath{\stackrel{\scriptscriptstyle ?}{\equiv}}}
\newcommand{\founif}{\ensuremath{\stackrel{\scriptscriptstyle ?}{=}}}
\newcommand{\vect}[1]{\ensuremath{\stackrel{\to}{#1}}}
\newcommand{\sem}[1]{\ensuremath{[\![#1]\!]}}
\newcommand{\bsem}[2]{\ensuremath{[\![#1;\;#2]\!]}}
\renewcommand{\verb}{\lstinline}
\def\lstlanguagefiles{lst-grafite.tex}
\lstset{language=Grafite}
\usepackage{lscape}
\usepackage{stmaryrd}

\verb=src : joint_program p_in= (resp. \verb=out : joint_program p_out=), we would like to prove a statement of the following shape.
\begin{lstlisting}
theorem joint_correctness : ∀p_in,p_out : sem_graph_params.
∀prog : joint_program p_in.∀stack_size.
(joint_abstract_status (mk_prog_params p_out trans_prog stack_size))
R init_in init_out.
\end{lstlisting}
When proving this statement (for each concrete instance of language), we need to proceed by cases according to the classification of

In order to reach this goal, we first have to analyze whether there is a common way to perform language translation for each pass. After having defined and specified the translation machinery, we will explain how it is possible to use it in order to build such a layer. So this section is organized as follows: in the first part we explain the translation machinery, while in the second part we explain such a layer.
\subsection{Graph translation}
Since a program is just a collection of functions, the compositional approach suggests defining the translation of a program in terms of the way we translate each internal function.
Thus, if we let \verb=src_g_pars= and \verb=dst_g_pars= be the graph parameters of the source and the target program respectively, the aim is to write a Matita function that takes as input an object of type \verb=joint_closed_internal_function src_g_pars= together with additional information (explained in more detail later) and gives as output an object of type \verb=joint_closed_internal_function dst_g_pars= with some properties, which corresponds to the result of the translation (for the definition of \verb=joint_closed_internal_function=, see Deliverables 4.2 and 4.3). The signature of the definition is the following.
\begin{lstlisting}
definition b_graph_translate :
∀src_g_pars,dst_g_pars : graph_params.
∀globals: list ident.
∀data : bound_b_graph_translate_data src_g_pars dst_g_pars globals.
∀def_in : joint_closed_internal_function src_g_pars globals.
Σdef_out : joint_closed_internal_function dst_g_pars globals.
∃data',regs,f_lbls,f_regs.
bind_new_instantiates ?? data' data regs ∧
b_graph_translate_props … data' def_in def_out f_lbls f_regs.
........
\end{lstlisting}
Let us now discuss in detail which parameters have to be provided as input and what the output of the translation process is.
\subsubsection{Input requested by the translation process}
Clearly, \verb=b_graph_translate= takes as input the internal function of the source language we want to translate. But it also takes as input some additional information which is used to drive the translation process. This information is all contained in an instance of the following record.
\begin{lstlisting}
record b_graph_translate_data
(src, dst : graph_params)
(globals : list ident) : Type ≝
{ init_ret : call_dest dst
; init_params : paramsT dst
; init_stack_size : ℕ
; added_prologue : list (joint_seq dst globals)
; new_regs : list register
; f_step : label → joint_step src globals → bind_step_block dst globals
; f_fin : label → joint_fin_step src → bind_fin_block dst globals
; good_f_step :
  ∀l,s.bind_new_P' ??
  (λlocal_new_regs,block.let 〈pref, op, post〉 ≝ block in
   ∀l.
   let allowed_labels ≝ l :: step_labels … s in
   let allowed_registers ≝ new_regs @ local_new_regs @ step_registers … s in
   All (label → joint_seq ??)
    (λs'.step_labels_and_registers_in … allowed_labels allowed_registers
         (step_seq dst globals (s' l)))
    pref ∧
   step_labels_and_registers_in … allowed_labels allowed_registers (op l) ∧
   All (joint_seq ??)
    (step_labels_and_registers_in … allowed_labels allowed_registers)
    post)
  (f_step l s)
; good_f_fin :
  ∀l,s.bind_new_P' ??
  (λlocal_new_regs,block.let 〈pref, op〉 ≝ block in
   let allowed_labels ≝ l :: fin_step_labels … s in
   let allowed_registers ≝ new_regs @ local_new_regs @ fin_step_registers … s in
   All (joint_seq ??)
    (λs.step_labels_and_registers_in … allowed_labels allowed_registers s)
    pref ∧
   fin_step_labels_and_registers_in … allowed_labels allowed_registers op)
  (f_fin l s)
; f_step_on_cost :
  ∀l,c.f_step l (COST_LABEL … c) =
  bret ? (step_block ??) 〈[ ], λ_.COST_LABEL dst globals c, [ ]〉
; cost_in_f_step :
  ∀l,s,c.
  bind_new_P ??
  (λblock.∀l'.\snd (\fst block) l' = COST_LABEL dst globals c →
   s = COST_LABEL … c)
  (f_step l s)
}
\end{lstlisting}
We will now summarize what each field means and how it is used in the translation process. We will say that an identifier of a pseudo-register (resp. a code point) is {\em fresh} when it never appears in the code of the source function.
\begin{center}
\begin{tabular*}{\textwidth}{p{4cm}p{11cm}}
Field & Explanation \\
\hline
\texttt{init\_ret} & It tells how to fill the field \texttt{joint\_if\_result}, i.e.
it gives the translation of the result of the translated function.\\
\texttt{init\_params} & It tells how to fill the field \texttt{joint\_if\_params}, i.e. it gives the translation of the formal parameters of the translated function.\\
\texttt{init\_stack\_size} & It tells how to fill the field \texttt{joint\_if\_stacksize} of the translated function, which is used in order to deal with the cost model of stack usage. \\
\texttt{added\_prologue} & It is a list of sequential statements of the target language which is always added at the beginning of the translated function. \\
\texttt{new\_regs} & It is a list of identifiers of fresh pseudo-registers that are used by the statements in \texttt{added\_prologue}. \\
\texttt{f\_step} & It is a function that tells how to translate all {\em step statements}, i.e. statements admitting a syntactical successor. Statements of this kind are all sequential statements, cost-emission statements, call statements and conditional statements: in the two latter cases the syntactical successors are, respectively, the return address (at the end of a function call) and the code point of the instruction the execution flow jumps to when the guard of the conditional evaluates to false. \\
\texttt{f\_fin} & It is a function that tells how to translate all {\em final statements}, i.e. statements not admitting a syntactical successor. Statements of this kind are return statements, unconditional jump statements and tail calls.\\
\texttt{good\_f\_step} & It states that the translation of every step statement is well formed, in the sense that every identifier (pseudo-register or code point) used by the block of statements that translates a step statement either comes from the statement being translated or is fresh and generated by the universe field of the translated function (\verb=joint_if_luniverse= for code-point identifiers, \verb=joint_if_runiverse= for pseudo-register identifiers).
\\
\texttt{good\_f\_fin} & It states that the translation of every final statement is well formed. The condition is similar to the one given for step statements.\\
\texttt{f\_step\_on\_cost} and \texttt{cost\_in\_f\_step} & They restrict the translation of cost-emission statements: the translation of a cost-emission statement has to be the identity, i.e. it must be translated as the same cost-emission statement. Furthermore, the translation never introduces new cost-emission statements that do not correspond to a cost emission in the source.
\end{tabular*}
\end{center}
\subsubsection{Output given by the translation process}
Clearly, \verb=b_graph_translate= gives as output the translated internal function. Unfortunately, for what we are going to develop later, this is insufficient: we also need some information about the correspondence between the source internal function and the translated one. This information is given by the second component of the $\Sigma$-type returned by the translation process. We recall that the return type of \verb=b_graph_translate= is the following.
\begin{lstlisting}
Σdef_out : joint_closed_internal_function dst_g_pars globals.
∃data',regs,f_lbls,f_regs.
bind_new_instantiates ?? data' data regs ∧
b_graph_translate_props … data' def_in def_out f_lbls f_regs.
\end{lstlisting}
The correspondence between the source internal function and the translated one is made explicit by two functions: \verb=f_lbls : code_point src_g_pars → list (code_point dst_g_pars)= and \verb=f_regs : code_point src_g_pars → list (register)=.
To understand their meaning, we need to stress that the translation process translates every statement appearing in the code of the source internal function at a given code point $l$ into a block of statements in the code of the translated function, where the first instruction of the block has $l$ as code-point identifier and all the following instructions of the block have fresh code-point identifiers. Furthermore, the statements of this block may use fresh pseudo-register identifiers or (even worse) fresh code-point identifiers generated by the translation (we will see later that this is important when translating a call statement). We will use the two functions above to retrieve this information. \verb=f_lbls= takes as input an identifier $l$ of a code point in the source code and gives back the list of all fresh code-point identifiers generated by the translation of the source statement located at $l$. \verb=f_regs= takes as input an identifier $l$ of a code point in the source code and gives back the list of all fresh register identifiers generated by the translation of the source statement located at $l$. These properties of \verb=f_lbls= and \verb=f_regs=, together with some other ones, are formalized by the following propositional record.
\begin{lstlisting}
record b_graph_translate_props
(src_g_pars, dst_g_pars : graph_params)
(globals: list ident)
(data : b_graph_translate_data src_g_pars dst_g_pars globals)
(def_in : joint_closed_internal_function src_g_pars globals)
(def_out : joint_closed_internal_function dst_g_pars globals)
(f_lbls : label → list label)
(f_regs : label → list register) : Prop ≝
{ res_def_out_eq : joint_if_result … def_out = init_ret … data
; pars_def_out_eq : joint_if_params … def_out = init_params … data
; ss_def_out_eq : joint_if_stacksize … def_out = init_stack_size … data
; partition_lbls : partial_partition … f_lbls
; partition_regs : partial_partition … f_regs
; freshness_lbls :
  (∀l.All …
    (λlbl.¬fresh_for_univ … lbl (joint_if_luniverse … def_in) ∧
          fresh_for_univ … lbl (joint_if_luniverse … def_out)) (f_lbls l))
; freshness_regs :
  (∀l.All …
    (λreg.¬fresh_for_univ … reg (joint_if_runiverse … def_in) ∧
          fresh_for_univ … reg (joint_if_runiverse … def_out)) (f_regs l))
; freshness_data_regs :
  All … (λreg.¬fresh_for_univ … reg (joint_if_runiverse … def_in) ∧
              fresh_for_univ … reg (joint_if_runiverse … def_out))
        (new_regs … data)
; data_regs_disjoint : ∀l,r.r ∈ f_regs l → r ∈ new_regs … data → False
; multi_fetch_ok :
  ∀l,s.stmt_at … (joint_if_code … def_in) l = Some ? s →
  let lbls ≝ f_lbls l in
  let regs ≝ f_regs l in
  match s with
  [ sequential s' nxt ⇒
    let block ≝
      if not_emptyb … (added_prologue … data) ∧
         eq_identifier … (joint_if_entry … def_in) l then
        bret … [ ]
      else
        f_step … data l s' in
    l -(block, l::lbls, regs)-> nxt in joint_if_code … def_out
  | final s' ⇒
    l -(f_fin … data l s', l::lbls, regs)-> it in joint_if_code … def_out
  | FCOND abs _ _ _ ⇒ Ⓧabs
  ]
; prologue_ok :
  if not_emptyb … (added_prologue … data) then
    ∃lbl.¬fresh_for_univ … lbl (joint_if_luniverse … def_in) ∧
    joint_if_entry … def_out
    -(〈[ ],λ_.COST_LABEL … (get_first_costlabel … def_in),
       added_prologue … data〉, f_lbls … lbl)->
    joint_if_entry … def_in in joint_if_code … def_out
  else (joint_if_entry … def_out = joint_if_entry … def_in)
}.
\end{lstlisting}
We will now summarize the meaning of each field of the record, using the same approach followed in the previous subsection.
\begin{center}
\begin{tabular*}{\textwidth}{p{4cm}p{11cm}}
Field & Explanation \\
\hline
\texttt{res\_def\_out\_eq} & It states that the result of the translated function is the one specified in the corresponding field of the record of type \verb=b_graph_translate_data= provided as input. \\
\texttt{pars\_def\_out\_eq} & It states that the formal parameters of the translated function are the ones specified in the corresponding field of the record of type \verb=b_graph_translate_data= provided as input. \\
\texttt{ss\_def\_out\_eq} & It states that the \verb=joint_if_stacksize= field of the translated function is the one specified in the corresponding field of the record of type \verb=b_graph_translate_data= provided as input. \\
\texttt{partition\_lbls} & The lists of code-point identifiers generated by distinct code points of the source internal function are pairwise disjoint, and every fresh identifier appears at most once in its list.
\\
\texttt{partition\_regs} & The lists of pseudo-register identifiers generated by distinct code points of the source internal function are pairwise disjoint, and every fresh identifier appears at most once in its list. \\
\texttt{freshness\_lbls} & All code-point identifiers in the lists generated by a code point of the source internal function are fresh. \\
\texttt{freshness\_regs} & All pseudo-register identifiers in the lists generated by a code point of the source internal function are fresh. \\
\texttt{freshness\_data\_regs} & All pseudo-register identifiers in the field \texttt{new\_regs} of the record of type \verb=b_graph_translate_data= provided as input are fresh. \\
\texttt{data\_regs\_disjoint} & No pseudo-register identifier in the field \texttt{new\_regs} of the record of type \verb=b_graph_translate_data= provided as input appears in any list of pseudo-register identifiers generated by a code point of the source internal function.
\\
\texttt{multi\_fetch\_ok} & Given a statement $I$ and a code-point identifier $l$ of the source internal function such that $I$ is located at $l$, if the translation process translates $I$ into a statement block $\langle I_1,\ldots ,I_n\rangle$, then $f\_lbls(l) = [l_1,\ldots,l_{n-1}]$ and in the code of the translated function $I_1$ is located at $l$ with $l_1$ as syntactical successor, $I_2$ is located at $l_1$ with $l_2$ as syntactical successor, and so on, with the last statement $I_n$ located at $l_{n-1}$. Whether $I_n$ has a syntactical successor depends on whether $I$ is a step statement or not: in the former case the syntactical successor of $I_n$ is the syntactical successor of $I$, in the latter case $I_n$ is a final statement.\\
\texttt{prologue\_ok} & If the field \texttt{added\_prologue} of the record of type \verb=b_graph_translate_data= provided as input is empty, then the code-point identifier of the first instruction of the translated function is the same as that of the source internal function. Otherwise the two code points are different, and the first instruction of the translated function is a cost-emission statement followed by the instructions of \texttt{added\_prologue}; the last instruction of \texttt{added\_prologue} has as syntactical successor an identifier $l$, which is the identifier of the first instruction of the source internal function, and at $l$ we fetch a NOOP instruction.
\end{tabular*}
\end{center}
\subsection{A general correctness proof}
In order to prove our general result, we need to define the usual semantical (data) relation between states of the source and target language, and a call relation between states. We recall that two states are in call relation whenever a call statement is fetched at the current program counter of each state. These two relations have to satisfy some conditions, already explained at the beginning of this deliverable (see Section ??).
In this section we give some general conditions that these two relations have to satisfy in order to obtain the desired simulation result. We begin our analysis from the latter relation (the call one) and then we show how to relate it to a semantical relation satisfying some conditions that allow us to prove our general result.
\subsubsection{A standard calling relation}
Two states are in call relation whenever it is possible to fetch a call statement at the program counter given by the two states. We will exploit the properties of the translation explained in the previous subsection in order to define a standard calling relation. We recall that the translation of a given statement $I$ is a block of statements $b= \langle I_1,\ldots, I_n\rangle$ with $n \geq 1$. When $I$ is a call, we require that there is a unique $k \in [1,n]$ such that $I_k$ is a call statement (we will see the formalization of this condition in the following subsection, when we relate the calling relation to the semantical one). The idea behind a standard calling relation is to relate the code-point identifier of this $I_k$ to the code-point identifier of the statement $I$ in the code of the source internal function. We will see how to use the information provided by the translation process (in particular the function \verb=f_lbls=) in order to obtain this result. For technical reasons, we will actually compute the code point of $I$ starting from the code point of $I_k$. In order to explain how this machinery works, we need to look at the translation process in more detail. Given a step statement $I$, formally its translation is a triple $f\_step(I) = \langle pre,p,post \rangle$ such that $pre$ is a list of sequential statements called {\em preamble}, $p$ is a step statement (we call it {\em pivot}) and $post$ is again a list of sequential statements called {\em postamble}.
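To fix ideas, here is a small worked instance of the notions just introduced. The statement $I$, the code point $l$ and the sizes of the preamble and of the postamble are hypothetical, chosen only for illustration. Suppose that $I$ is located at $l$ in the source function and that
\[
f\_step(I) = \langle [s_1, s_2],\; p,\; [s_1'] \rangle ,
\]
so that the translation of $I$ is a block of $n = 4$ statements and, by \texttt{multi\_fetch\_ok}, $f\_lbls(l) = [l_1, l_2, l_3]$ for three fresh code points. The statements of the block are then laid out in the translated function as follows.
\begin{center}
\begin{tabular}{lllll}
statement & $s_1$ & $s_2$ & $p$ & $s_1'$ \\
\hline
code point & $l$ & $l_1$ & $l_2$ & $l_3$ \\
syntactical successor & $l_1$ & $l_2$ & $l_3$ & successor of $I$
\end{tabular}
\end{center}
Counting from $0$, the pivot $p$ occupies position $2$, which is the length of the preamble, in the list $l :: f\_lbls(l) = [l, l_1, l_2, l_3]$. If $I$ is a call statement whose translation has its call statement as pivot, then $k = 3$ and the code point of $I_k$ is $l_2$. With an empty preamble the pivot would instead sit at position $0$, i.e. at $l$ itself.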
When $pre = [s_1,\ldots,s_n]$ and $post = [s_1',\ldots,s_m']$, the block that translates $I$ is $\langle s_1,\ldots,s_n,p,s_1',\ldots,s_m'\rangle$. If $I$ is a final statement, then its translation has no postamble, i.e. it is a pair $f\_fin(I) = \langle pre,p\rangle$ where the pivot $p$ is a final statement. Given a statement $I$ at a code point $l$ in the source internal function, and given the pivot statement $p$ of the translation of $I$, located at code point $l'$ in the translated internal function, there is an easy way to relate $l$ and $l'$. If the preamble is empty, then by the properties of the translation process we have $l = l'$, while if the preamble is non-empty, then $l'$ is the $(n-1)$-th element of $f\_lbls(l)$ (counting from $0$), where $n \geq 1$ is the length of the preamble. The Matita definition computing the code points according to this specification is the following.
\begin{lstlisting}
definition sigma_label : ∀p_in,p_out : sem_graph_params.
joint_program p_in →
(ident → option ℕ) →
(∀globals.joint_closed_internal_function p_in globals
 →bound_b_graph_translate_data p_in p_out globals) →
(block → list register) →
lbl_funct_type → regs_funct_type →
block → label → option label ≝
λp_in,p_out,prog,stack_size,init,init_regs,f_lbls,f_regs,bl,searched.
! bl ← code_block_of_block bl ;
! ← res_to_opt … (fetch_internal_function …
                   (joint_globalenv p_in prog stack_size) bl);
! ← find ?? (joint_if_code ?? fn)
   (λlbl.λ_.match preamble_length … prog stack_size init init_regs f_regs bl lbl with
     [ None ⇒ false
     | Some n ⇒ match nth_opt ? n (lbl::(f_lbls bl lbl)) with
       [ None ⇒ false
       | Some x ⇒ eq_identifier … searched x
       ]
     ]);
return res.
\end{lstlisting}
This function takes as input all the information provided by the translation process (in particular the function $f\_lbls$), a function location and a code-point identifier $l$; it fetches the internal function of the source language at the corresponding location. Then it scans the code of the fetched function looking for a label $l'$ and a statement $I$ located at $l'$ such that either $l = l'$, in case the preamble of the translation of $I$ is empty, or $l$ is the $(n-1)$-th element of $f\_lbls(l')$ (counting from $0$), where $n \geq 1$ is the length of the preamble of the translation of $I$. The function $find$ is the procedure performing this search. If $preamble\_length$ is the function returning the length of the preamble and $nth\_opt$ is the function returning the $n$-th element of a list, then this condition can be summarized as follows: we are looking for a label $l'$ such that $l$ is the $n$-th element of the list $l' :: f\_lbls (l')$, where $n$ is the length of the preamble. We can prove that, starting from a code-point identifier of the translated internal function, whenever there exists a code-point identifier in the source internal function satisfying the above condition, it is unique. The properties \verb=partition_lbls= and \verb=freshness_lbls= provided by the translation process are crucial in the proof of this statement. We can wrap this function inside the definition of the desired relation between program counters in the following way. The conditional at the beginning deals with the pre-main case, which is translated without following the standard translation process explained in the previous section.
\begin{lstlisting}
definition sigma_pc_opt :
∀p_in,p_out : sem_graph_params.
joint_program p_in →
(ident → option ℕ) →
(∀globals.joint_closed_internal_function p_in globals
 →bound_b_graph_translate_data p_in p_out globals) →
(block → list register) →
lbl_funct_type → regs_funct_type →
program_counter → option program_counter ≝
λp_in,p_out,prog,stack_size,init,init_regs,f_lbls,f_regs,pc.
let target_point ≝ point_of_pc p_out pc in
if eqZb (block_id (pc_block pc)) (-1) then
  return pc
else
  ! source_point ← sigma_label p_in p_out prog stack_size init init_regs
                     f_lbls f_regs (pc_block pc) target_point;
  return pc_of_point p_in (pc_block pc) source_point.

definition sigma_stored_pc ≝
λp_in,p_out,prog,stack_size,init,init_regs,f_lbls,f_regs,pc.
match sigma_pc_opt p_in p_out prog stack_size init init_regs f_lbls f_regs pc with
[None ⇒ null_pc (pc_offset … pc) | Some x ⇒ x].
\end{lstlisting}
The main result about the program counter relation we have defined is the following. If we fetch a statement $I$ at a given program counter $pc$ in the source program, then there is a program counter $pc'$ in the target program which is in relation with $pc$ (i.e. $sigma\_stored\_pc(pc') = pc$) and the statement fetched at $pc'$ is the pivot statement of the translation of $I$. The formalization of this statement in Matita is the following.
\begin{lstlisting}
lemma fetch_statement_sigma_stored_pc :
∀p_in,p_out,prog,stack_sizes,
init,init_regs,f_lbls,f_regs,pc,f,fn,stmt.
b_graph_transform_program_props p_in p_out stack_sizes
  init prog init_regs f_lbls f_regs →
block_id … (pc_block pc) ≠ -1 →
let trans_prog ≝ b_graph_transform_program p_in p_out init prog in
fetch_statement p_in … (joint_globalenv p_in prog stack_sizes) pc =
return 〈f,fn,stmt〉 →
∃data.bind_instantiate ?? (init … fn) (init_regs (pc_block pc)) = return data ∧
match stmt with
[ sequential step nxt ⇒
  ∃step_block : step_block p_out (prog_names … trans_prog).
  bind_instantiate ??
    (f_step … data (point_of_pc p_in pc) step)
    (f_regs (pc_block pc) (point_of_pc p_in pc)) = return step_block ∧
  ∃pc'.sigma_stored_pc p_in p_out prog stack_sizes init
         init_regs f_lbls f_regs pc' = pc ∧
  ∃fn',nxt'.
  fetch_statement p_out … (joint_globalenv p_out trans_prog stack_sizes) pc' =
  if not_emptyb … (added_prologue … data) ∧
     eq_identifier … (point_of_pc p_in pc) (joint_if_entry … fn)
  then OK ?
  else OK ?
| final fin ⇒
  ∃fin_block.bind_instantiate ??
    (f_fin … data (point_of_pc p_in pc) fin)
    (f_regs (pc_block pc) (point_of_pc p_in pc)) = return fin_block ∧
  ∃pc'.sigma_stored_pc p_in p_out prog stack_sizes init
         init_regs f_lbls f_regs pc' = pc ∧
  ∃fn'.fetch_statement p_out …
         (joint_globalenv p_out trans_prog stack_sizes) pc'
       = return 〈f,fn',final ?? (\snd fin_block)〉
| FCOND abs _ _ _ ⇒ Ⓧabs
].
\end{lstlisting}
If we combine the statement above with the fact that the pivot statement of the translation of a call statement is always a call statement (which we will formalize in the following section), then we can define our standard calling relation in the following way.
\begin{lstlisting}
(λs1:Σs: (joint_abstract_status (mk_prog_params p_in ??)).as_classifier ? s cl_call.
 λs2:Σs:(joint_abstract_status (mk_prog_params p_out ??)).as_classifier ? s cl_call.
 pc ? s1 =
 sigma_stored_pc p_in p_out prog stack_sizes init init_regs f_lbls f_regs (pc ? s2)).
\end{lstlisting}
We stress that the call relation is always defined in this way for all joint languages, independently of the specific pass. The only condition we ask is that the pass uses the translation process explained in the previous section.
\subsubsection{The semantical relation}
The semantical relation between states is the classical relation used in forward simulation proofs. It relates the data of the states (e.g. registers, memory, etc.). We recall that the notion of state in joint languages is summarized in the following record.
\begin{lstlisting}
record state_pc (semp : sem_state_params) : Type ≝
{ st_no_pc :> state semp
; pc : program_counter
; last_pop : program_counter
}.
\end{lstlisting}
It consists of three fields: the field \verb=st_no_pc= contains all the data of the state (the content of the registers, of the memory and so on), the field \verb=pc= contains the current program counter, while the field \verb=last_pop= is the last calling address popped when executing a return instruction. The type of the semantical relation between states is the following.
\begin{lstlisting}
definition joint_state_pc_relation ≝
λP_in,P_out : sem_graph_params.state_pc P_in → state_pc P_out → Prop.
\end{lstlisting}
We would like to state some conditions that the semantical relation between states has to satisfy in order to get our simulation result. We would like this relation to be compositional to some extent. In particular, we would like it to depend strictly on the content of the field \verb=st_no_pc=, i.e. the field that actually contains the data of the state. So we also need a data relation, i.e. a relation of the following type.
\begin{lstlisting}
definition joint_state_relation ≝
λP_in,P_out.program_counter → state P_in → state P_out → Prop.
\end{lstlisting}
Notice that the data relation can depend on a specific program counter of the source. This is done to capture complex data relations like the ones in the ERTL to LTL pass, in which one needs to know where the data held in ERTL pseudo-registers are stored by the translation (either in a hardware register or in memory), and this information depends on the code point of the statement being translated. The compositionality requirement is expressed by the following conditions (which are part of a bigger record that we are going to introduce later).
\begin{lstlisting}
; fetch_ok_sigma_state_ok :
  ∀st1,st2,f,fn.
  st_rel st1 st2 →
  fetch_internal_function … (joint_globalenv P_in prog stack_sizes)
    (pc_block (pc … st1)) = return →
  st_no_pc_rel (pc … st1) (st_no_pc … st1) (st_no_pc … st2)
; fetch_ok_pc_ok :
  ∀st1,st2,f,fn.st_rel st1 st2 →
  fetch_internal_function … (joint_globalenv P_in prog stack_sizes)
    (pc_block (pc … st1)) = return →
  pc … st1 = pc … st2
; fetch_ok_sigma_last_pop_ok :
  ∀st1,st2,f,fn.st_rel st1 st2 →
  fetch_internal_function … (joint_globalenv P_in prog stack_sizes)
    (pc_block (pc … st1)) = return →
  (last_pop … st1) =
  sigma_stored_pc P_in P_out prog stack_sizes init init_regs
    f_lbls f_regs (last_pop … st2)
; st_rel_def :
  ∀st1,st2,pc,lp1,lp2,f,fn.
  fetch_internal_function … (joint_globalenv P_in prog stack_sizes)
    (pc_block pc) = return →
  st_no_pc_rel pc st1 st2 →
  lp1 = sigma_stored_pc P_in P_out prog stack_sizes init init_regs
          f_lbls f_regs lp2 →
  st_rel (mk_state_pc ? st1 pc lp1) (mk_state_pc ? st2 pc lp2)
\end{lstlisting}
Condition \texttt{fetch\_ok\_sigma\_state\_ok} postulates that two states that are in semantical relation also have their data fields in data relation. Condition \texttt{fetch\_ok\_pc\_ok} postulates that two states that are in semantical relation have the same program counter. This is due to the way the translation is performed: a statement $I$ at a code point $l$ in the source internal function is translated into a block of instructions in the translated internal function whose initial statement is at the same code point $l$. Condition \texttt{fetch\_ok\_sigma\_last\_pop\_ok} postulates that two states that are in semantical relation have their last popped calling addresses in call relation. Finally, \texttt{st\_rel\_def} postulates that two states having the same program counter, the last pop fields in call relation and the data fields in data relation are in semantical relation. Another important condition is that the pivot statement of the translation of a call statement is always a call statement.
This is important in order to obtain the correctness of the call relation and of the return relation between states. This condition is formalized by the following Matita code, and we call it \texttt{call\_is\_call}.
\begin{lstlisting}
; call_is_call :
  ∀f,fn,bl.
  fetch_internal_function … (joint_globalenv P_in prog stack_sizes) bl =
  return →
  ∀id,args,dest,lbl.
  bind_new_P' ??
   (λregs1.λdata.bind_new_P' ??
     (λregs2.λblp.
       ∀lbl.∃id',args',dest'.((\snd (\fst blp)) lbl) = CALL P_out ? id' args' dest')
     (f_step … data lbl (CALL P_in ? id args dest)))
   (init ? fn)
\end{lstlisting}
The conditions we are going to present now are standard semantical commutation lemmas that are commonly used when proving the correctness of the operational semantics of many imperative languages. We introduce some notation. We use $I, J, \ldots$ to range over statements, $l_1,\ldots,l_n$ to range over code-point identifiers, and $r_1,\ldots,r_n$ to range over register identifiers. We use $s_1,s_1',s_1'',\ldots$ to range over states of programs of the source language and $s_2,s_2',s_2'',\ldots$ to range over states of programs of the target language. We denote with $\simeq_{S}$, $\simeq_C$ and $\simeq_L$ the semantical relation, the call relation and the cost-label relation between states, respectively. These relations have been introduced at the beginning of this Deliverable (see Section ??). If $instr = [I_1,\ldots,I_n]$ is a list of instructions, then we write $s_i \stackrel{instr}{\longrightarrow} s_i'$ ($i \in [1,2]$) when $s_i'$ is the state resulting from the evaluation of the sequence of instructions $instr$ (performed in the order they appear in the list) starting from the initial state $s_i$. When $instr = [I]$ is a singleton, we omit the square brackets and write $s_i \stackrel{I}{\longrightarrow} s_i'$. We denote with $\pi_i$ ($i \in [1,t]$) the projection functions of $t$-tuples.
We denote with $f\_step$ and $f\_fin$ the translation functions of step statements and final statements, respectively. We recall that $f\_step$ gives a triple as output (a list of instructions called preamble, an instruction called pivot and a list of instructions called postamble), while $f\_fin$ gives a pair as output (a list of instructions called preamble and an instruction called pivot). Furthermore, we denote with $prologue$ the content of the field \texttt{added\_prologue} of the record provided as input to the translation process. Many commutation conditions can be depicted using diagrams. We will use them to give a pictorial view of the conditions we require in order to obtain the final correctness statement. Given the states $s_1,s_1',s_2,s_2'$ and the instructions $I,J_1,\ldots,J_k$, the following diagram
$$\xymatrix{
s_1 \ar@{->}[rr]^{I} \ar@{-}[d]^{\simeq_S} && s_1' \ar@{-}[d]^{\simeq_S}\\
s_2 \ar[rr]^{[J_1,\ldots,J_k]} && s_2'
}$$
depicts a situation in which $s_1 \stackrel{I}{\longrightarrow} s_1'$, $s_2 \stackrel{[J_1,\ldots,J_k]}{\longrightarrow} s_2'$, $s_1 \simeq_S s_2$ and $s_1' \simeq_S s_2'$.
\paragraph{Commutation of pre-main instructions.}
In order to get the commutation of pre-main instructions (i.e. instructions executed in states whose program counter has function location $-1$), we have to prove the following condition: for all $s_1,s_1',s_2$ such that $s_1 \stackrel{I}{\longrightarrow} s_1'$ and $s_1 \simeq_S s_2$, there exists an $s_2'$ such that $s_2 \stackrel{J}{\longrightarrow} s_2'$ and $s_1' \simeq_S s_2'$, i.e. such that the following diagram commutes.
$$\xymatrix{
s_1 \ar@{->}[rr]^{I} \ar@{-}[d]^{\simeq_S} && s_1' \ar@{-}[d]^{\simeq_S}\\
s_2 \ar[rr]^{J} && s_2'
}$$
The formalization of this statement in Matita is the following.
\begin{lstlisting}
; pre_main_ok :
  let trans_prog ≝ b_graph_transform_program P_in P_out init prog in
  ∀st1,st1' : joint_abstract_status (mk_prog_params P_in prog stack_sizes) .
    ∀st2 : joint_abstract_status (mk_prog_params P_out trans_prog stack_sizes).
    block_id … (pc_block (pc … st1)) = -1 →
    st_rel st1 st2 →
    as_label (joint_status P_in prog stack_sizes) st1 =
      as_label (joint_status P_out trans_prog stack_sizes) st2 ∧
    joint_classify … (mk_prog_params P_in prog stack_sizes) st1 =
      joint_classify … (mk_prog_params P_out trans_prog stack_sizes) st2 ∧
    (eval_state P_in … (joint_globalenv P_in prog stack_sizes) st1 = return st1' →
     ∃st2'. st_rel st1' st2' ∧
     eval_state P_out … (joint_globalenv P_out trans_prog stack_sizes) st2 =
       return st2')
\end{lstlisting}

\paragraph{Commutation of conditional jump.}
For all $s_1,s_1'$ and $s_2$ such that $s_1 \stackrel{COND \ r \ l}{\longrightarrow} s_1'$ and $s_1 \simeq_S s_2$ then
\begin{itemize}
\item there are $s_2^{fin}$ and $s_2'$ such that $s_2 \stackrel{\pi_1(f\_step(COND \ r \ l))}{\longrightarrow} s_2^{fin}$, $s_2^{fin} \stackrel{\pi_2(f\_step(COND \ r \ l))}{\longrightarrow} s_2'$ and $s_1' \simeq_S s_2'$, i.e. the following diagram commutes
$$\xymatrix{
s_1 \ar@{->}[rrrrrr]^{COND \ r \ l} \ar@{-}[d]^{\simeq_S} &&&&&& s_1' \ar@{-}[d]^{\simeq_S}\\
s_2 \ar[rrr]^{\pi_1(f\_step(COND \ r \ l))} &&& s_2^{fin} \ar[rrr]^{\pi_2(f\_step(COND \ r \ l))} &&& s_2'
}$$
\item $\pi_3(f\_step(COND \ r \ l))$ is empty, while $\pi_2(f\_step(COND \ r \ l)) = COND \ r' \ l'$ is a conditional jump such that $l = l'$.
\end{itemize}
This condition is formalized in Matita in the following way.
\begin{lstlisting}
; cond_commutation :
    let trans_prog ≝ b_graph_transform_program P_in P_out init prog in
    ∀st1 : joint_abstract_status (mk_prog_params P_in prog stack_sizes) .
    ∀st2 : joint_abstract_status (mk_prog_params P_out trans_prog stack_sizes).
    ∀f,fn,a,ltrue,lfalse,bv,b.
    block_id … (pc_block (pc … st1)) ≠ -1 →
    let cond ≝ (COND P_in ? a ltrue) in
    fetch_statement P_in … (joint_globalenv P_in prog stack_sizes) (pc … st1) =
      return 〈f, fn, sequential P_in ? cond lfalse〉 →
    acca_retrieve … P_in (st_no_pc … st1) a = return bv →
    bool_of_beval … bv = return b →
    st_rel st1 st2 →
    ∀t_fn.
    fetch_internal_function … (joint_globalenv P_out trans_prog stack_sizes)
      (pc_block (pc … st2)) = return 〈f,t_fn〉 →
    bind_new_P' ??
      (λregs1.λdata.bind_new_P' ??
        (λregs2.λblp.(\snd blp) = [ ] ∧
          ∀mid.
          stmt_at P_out … (joint_if_code ?? t_fn) mid =
            return sequential P_out ? ((\snd (\fst blp)) mid) lfalse→
          ∃st2_pre_mid_no_pc.
          repeat_eval_seq_no_pc ? (mk_prog_params P_out trans_prog stack_sizes) f
            (map_eval ?? (\fst (\fst blp)) mid) (st_no_pc ? st2) =
            return st2_pre_mid_no_pc ∧
          let new_pc ≝ if b then
                         (pc_of_point P_in (pc_block (pc … st1)) ltrue)
                       else
                         (pc_of_point P_in (pc_block (pc … st1)) lfalse) in
          st_no_pc_rel new_pc (st_no_pc … st1) (st2_pre_mid_no_pc) ∧
          ∃a'. ((\snd (\fst blp)) mid)  = COND P_out ? a' ltrue ∧
          ∃bv'. acca_retrieve … P_out st2_pre_mid_no_pc a' = return bv' ∧
          bool_of_beval … bv' = return b
        )  (f_step … data (point_of_pc P_in (pc … st1)) cond)
      ) (init ? fn)
\end{lstlisting}

\paragraph{Commutation of sequential statements.}
In the case of a sequential statement $I$, its translation $f\_step(I) = \langle pre , J , post \rangle$ is coerced into a list of sequential statements $pre$ \verb=@= $[J]$ \verb=@= $post$. Then we can state the condition in the following way. For all $s_1,s_1',s_2$ such that $s_1 \stackrel{I}{\longrightarrow} s_1'$ and $s_1 \simeq_S s_2$, there is $s_2'$ such that $s_2 \stackrel{f\_step(I)}{\longrightarrow} s_2'$ and $s_1' \simeq_S s_2'$, i.e. such that the following diagram commutes.
$$\xymatrix{
s_1 \ar@{->}[rr]^{I} \ar@{-}[d]^{\simeq_S} && s_1' \ar@{-}[d]^{\simeq_S}\\
s_2 \ar[rr]^{f\_step(I)} && s_2'
}$$
The formalization in Matita of the above statement is as follows.
\begin{lstlisting}
; seq_commutation :
    let trans_prog ≝ b_graph_transform_program P_in P_out init prog in
    ∀st1,st1' : joint_abstract_status (mk_prog_params P_in prog stack_sizes) .
    ∀st2 : joint_abstract_status (mk_prog_params P_out trans_prog stack_sizes).
    ∀f,fn,stmt,nxt.
    block_id … (pc_block (pc … st1)) ≠ -1 →
    let seq ≝ (step_seq P_in ?
    stmt) in
    fetch_statement P_in … (joint_globalenv P_in prog stack_sizes) (pc … st1) =
      return 〈f, fn, sequential P_in ? seq nxt〉 →
    eval_state P_in … (joint_globalenv P_in prog stack_sizes) st1 = return st1' →
    st_rel st1 st2 →
    ∀t_fn.
    fetch_internal_function … (joint_globalenv P_out trans_prog stack_sizes)
      (pc_block (pc … st2)) = return 〈f,t_fn〉 →
    bind_new_P' ??
      (λregs1.λdata.bind_new_P' ??
        (λregs2.λblp.
          ∃l : list (joint_seq P_out
                      (globals ? (mk_prog_params P_out trans_prog stack_sizes))).
          blp = (ensure_step_block ?? l) ∧
          ∃st2_fin_no_pc.
          repeat_eval_seq_no_pc ? (mk_prog_params P_out trans_prog stack_sizes) f
            l  (st_no_pc … st2)= return st2_fin_no_pc ∧
          st_no_pc_rel (pc … st1') (st_no_pc … st1') st2_fin_no_pc
        ) (f_step … data (point_of_pc P_in (pc … st1)) seq)
      ) (init ? fn)
\end{lstlisting}

\paragraph{Commutation of call statement.}
For all $s_1,s_1',s_2$ such that $s_1 \stackrel{CALL \ id \ arg \ dst}{\longrightarrow} s_1'$, $s_1 \simeq_S s_2$ and the statement fetched in the translated program at the program counter that is in call relation with the program counter of $s_1$ is $\pi_2(f\_step(CALL \ id \ arg \ dst)) = CALL \ id' \ arg' \ dst'$ for some $id',arg',dst'$, there are $s_2^{pre},s_2^{after},s_2'$ such that
\begin{itemize}
\item $s_2 \stackrel{\pi_1(f\_step(CALL \ id \ arg \ dst))}{\longrightarrow} s_2^{pre}$,
\item $s_2^{pre} \stackrel{CALL \ id' \ arg' \ dst'}{\longrightarrow} s_2^{after}$,
\item $s_2^{after} \stackrel{prologue}{\longrightarrow} s_2'$,
\item $s_1' \simeq_L s_2^{after}$ and $s_1' \simeq_S s_2'$.
\end{itemize}
The situation is depicted by the following diagram.
$$\xymatrix{
s_1 \ar@{-}[d]^{\simeq_S} \ar[rrrrrrrrrr]^{CALL \ id \ arg \ dst} &&&&&&&&&& s_1' \ar@{-}[d]^{\simeq_S} \\
s_2 \ar[rrrr]^{\pi_1(f\_step(CALL \ id \ arg \ dst))} &&&& s_2^{pre} \ar[rrr]^{CALL \ id' \ arg' \ dst'} &&& s_2^{after} \ar[rrr]^{prologue} &&& s_2'
}$$
The statement is formalized in Matita in the following way.
\begin{lstlisting}
; call_commutation :
    let trans_prog ≝ b_graph_transform_program P_in P_out init prog in
    ∀st1 : joint_abstract_status (mk_prog_params P_in prog stack_sizes) .
    ∀st2 : joint_abstract_status (mk_prog_params P_out trans_prog stack_sizes).
    ∀f,fn,id,arg,dest,nxt.
    block_id … (pc_block (pc … st1)) ≠ -1 →
    fetch_statement P_in … (joint_globalenv P_in prog stack_sizes) (pc … st1) =
      return 〈f, fn,  sequential P_in ? (CALL P_in ? id arg dest) nxt〉 →
    ∀bl.
    block_of_call P_in … (joint_globalenv P_in prog stack_sizes) id
      (st_no_pc … st1) = return bl →
    ∀f1,fn1.
    fetch_internal_function … (joint_globalenv P_in prog stack_sizes) bl =
      return 〈f1,fn1〉 →
    ∀st1_pre.
    save_frame … P_in (kind_of_call P_in id) dest st1 = return st1_pre →
    ∀n.stack_sizes f1 = return n →
    ∀st1'.
    setup_call ?? P_in n (joint_if_params … fn1) arg st1_pre = return st1' →
    st_rel st1 st2 →
    ∀t_fn1.
    fetch_internal_function … (joint_globalenv P_out trans_prog stack_sizes) bl =
      return 〈f1,t_fn1〉 →
    bind_new_P' ??
      (λregs1.λdata.
       bind_new_P' ??
         (λregs2.λblp.
          ∀pc',t_fn,id',arg',dest',nxt1.
          sigma_stored_pc P_in P_out prog stack_sizes init init_regs f_lbls f_regs pc' =
            (pc … st1) →
          fetch_statement P_out … (joint_globalenv P_out trans_prog stack_sizes) pc' =
            return 〈f,t_fn, sequential P_out ? ((\snd (\fst blp)) (point_of_pc P_out pc')) nxt1〉→
          ((\snd (\fst blp)) (point_of_pc P_out pc')) = (CALL P_out ? id' arg' dest') →
          ∃st2_pre_call.
          repeat_eval_seq_no_pc ? (mk_prog_params P_out trans_prog stack_sizes) f
            (map_eval ?? (\fst (\fst blp)) (point_of_pc P_out pc')) (st_no_pc ? st2) =
            return st2_pre_call ∧
          block_of_call P_out … (joint_globalenv P_out trans_prog stack_sizes) id'
            st2_pre_call = return bl ∧
          ∃st2_pre.
          save_frame … P_out (kind_of_call P_out id') dest'
            (mk_state_pc ? st2_pre_call pc' (last_pop … st2)) = return st2_pre ∧
          ∃st2_after_call.
          setup_call ?? P_out n (joint_if_params … t_fn1) arg' st2_pre =
            return st2_after_call ∧
          bind_new_P' ??
            (λregs11.λdata1.
             ∃st2'.
             repeat_eval_seq_no_pc ?
               (mk_prog_params P_out trans_prog stack_sizes) f1
               (added_prologue … data1)
               (increment_stack_usage P_out n st2_after_call) = return st2' ∧
             st_no_pc_rel (pc_of_point P_in bl (joint_if_entry … fn1))
               (increment_stack_usage P_in n st1') st2'
            )
            (init ? fn1)
         )
         (f_step … data (point_of_pc P_in (pc … st1)) (CALL P_in ? id arg dest))
      )
      (init ? fn)
\end{lstlisting}

\paragraph{Commutation of return statement.}
For all $s_1,s_1',s_2$ such that $s_1 \stackrel{RETURN}{\longrightarrow} s_1'$ and $s_1 \simeq_S s_2$, if $CALL \ id \ arg \ dst$ is the call statement that caused the function call ended by the current return (i.e. it is the statement whose code point identifier is the syntactical predecessor of the program counter of $s_1'$), then $\pi_2(f\_fin(RETURN)) = RETURN$ and there are $s_2^{pre},s_2^{after},s_2'$ such that $s_2 \stackrel{\pi_1(f\_fin(RETURN))}{\longrightarrow} s_2^{pre}$, $s_2^{pre} \stackrel{RETURN}{\longrightarrow} s_2^{after}$, $s_2^{after} \stackrel{\pi_3(f\_step(CALL \ id \ arg \ dst))}{\longrightarrow} s_2'$ and $s_1' \simeq_S s_2'$. The following diagram depicts the required situation.
$$\xymatrix{
s_1 \ar@{-}[d]^{\simeq_S} \ar[rrrrrrrrrrr]^{RETURN} &&&&&&&&&&& s_1' \ar@{-}[d]^{\simeq_S} \\
s_2 \ar[rrrr]^{\pi_1(f\_fin(RETURN))} &&&& s_2^{pre} \ar[rrr]^{RETURN} &&& s_2^{after} \ar[rrrr]^{\pi_3(f\_step(CALL \ id \ arg \ dst))} &&&& s_2'
}$$
The statement is formalized in Matita in the following way.
\begin{lstlisting}
; return_commutation :
    let trans_prog ≝ b_graph_transform_program P_in P_out init prog in
    ∀st1 : joint_abstract_status (mk_prog_params P_in prog stack_sizes) .
    ∀st2 : joint_abstract_status (mk_prog_params P_out trans_prog stack_sizes).
    ∀f,fn.
    block_id … (pc_block (pc … st1)) ≠ -1 →
    fetch_statement P_in … (joint_globalenv P_in prog stack_sizes) (pc … st1) =
      return 〈f, fn,  final P_in ? (RETURN …)〉 →
    ∀n. stack_sizes f = return n →
    let curr_ret ≝ joint_if_result … fn in
    ∀st_pop,pc_pop.
    pop_frame ?? P_in ?
      (joint_globalenv P_in prog stack_sizes) f curr_ret (st_no_pc … st1) =
      return 〈st_pop,pc_pop〉 →
    ∀nxt.∀f1,fn1,id,args,dest.
    fetch_statement P_in … (joint_globalenv P_in prog stack_sizes) pc_pop  =
      return 〈f1,fn1,sequential P_in … (CALL P_in ? id args dest) nxt〉 →
    st_rel st1 st2 →
    ∀t_fn.
    fetch_internal_function … (joint_globalenv P_out trans_prog stack_sizes)
      (pc_block (pc … st2)) = return 〈f,t_fn〉 →
    bind_new_P' ??
      (λregs1.λdata.
       bind_new_P' ??
         (λregs2.λblp.
          \snd blp = (RETURN …) ∧
          ∃st_fin.
          repeat_eval_seq_no_pc ? (mk_prog_params P_out trans_prog stack_sizes) f
            (\fst blp)  (st_no_pc … st2)= return st_fin ∧
          ∃t_st_pop,t_pc_pop.
          pop_frame ?? P_out ? (joint_globalenv P_out trans_prog stack_sizes) f
            (joint_if_result … t_fn) st_fin = return 〈t_st_pop,t_pc_pop〉 ∧
          sigma_stored_pc P_in P_out prog stack_sizes init init_regs f_lbls f_regs
            t_pc_pop = pc_pop ∧
          if eqZb (block_id (pc_block pc_pop)) (-1) then
            st_no_pc_rel (pc_of_point P_in (pc_block pc_pop) nxt)
              (decrement_stack_usage ? n st_pop)
              (decrement_stack_usage ? n t_st_pop) (*pre_main*)
          else
            bind_new_P' ??
              (λregs4.λdata1.
               bind_new_P' ??
                 (λregs3.λblp1.
                  ∃st2'.
                  repeat_eval_seq_no_pc ?
                    (mk_prog_params P_out trans_prog stack_sizes) f1
                    (\snd blp1) (decrement_stack_usage ? n t_st_pop) = return st2' ∧
                  st_no_pc_rel (pc_of_point P_in (pc_block pc_pop) nxt)
                    (decrement_stack_usage ? n st_pop) st2'
                 )
                 (f_step … data1 (point_of_pc P_in pc_pop) (CALL P_in ? id args dest))
              )
              (init ? fn1)
         )
         (f_fin … data (point_of_pc P_in (pc … st1)) (RETURN …))
      )
      (init ? fn)
\end{lstlisting}

\subsubsection{Conclusion}
Once we have provided a semantic relation among states that satisfies the conditions above, which correspond to the commutation lemmas commonly proved in a forward simulation proof, it is possible to prove the general theorem. All these conditions are collected in a propositional record called \verb=good_state_relation=. The statement we are able to prove has the following shape.
\begin{lstlisting}
theorem joint_correctness : ∀p_in,p_out : sem_graph_params.
  ∀prog : joint_program p_in.∀stack_size : ident → option ℕ.
  ∀init : (∀globals.joint_closed_internal_function p_in globals →
           bound_b_graph_translate_data p_in p_out globals).
  ∀init_regs : block → list register.∀f_lbls : lbl_funct_type.
  ∀f_regs : regs_funct_type.∀st_no_pc_rel : joint_state_relation p_in p_out.
  ∀st_rel : joint_state_pc_relation p_in p_out.
  good_state_relation p_in p_out prog stack_size init init_regs
    f_lbls f_regs st_no_pc_rel st_rel →
  let trans_prog ≝ b_graph_transform_program … init prog in
  ∀init_in.make_initial_state (mk_prog_params p_in prog stack_size) = OK ? init_in →
  ∃init_out.make_initial_state (mk_prog_params p_out trans_prog stack_size) =
    OK ? init_out ∧
  ∃ R. status_simulation_with_init
    (joint_abstract_status (mk_prog_params p_in prog stack_size))
    (joint_abstract_status (mk_prog_params p_out trans_prog stack_size))
    R init_in init_out.
\end{lstlisting}

The module formalizing the machinery described in this document consists of about 3000 lines of Matita code. We stress the fact that this machinery proves general properties that do not depend on the specific backend compiler pass. Without this layer one would have to prove these properties for each specific pass, multiplying the number of lines of Matita code by $n$, where $n$ is the number of backend passes.
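
As a rough illustration of this saving, consider the following back-of-the-envelope estimate. It is only a sketch under stated assumptions: the symbols $G$ and $c_i$ are introduced here for illustration and are not measured quantities. Let $G \approx 3000$ be the size in lines of the generic layer and let $c_i$ be the size of the pass-specific proof of \verb=good_state_relation= for the $i$-th backend pass. Then
$$
\underbrace{n \cdot G + \sum_{i=1}^{n} c_i}_{\mbox{without the generic layer}}
\;-\;
\underbrace{\Bigl(G + \sum_{i=1}^{n} c_i\Bigr)}_{\mbox{with the generic layer}}
\;=\; (n-1) \cdot G ,
$$
so, under these assumptions, the generic part is written once rather than $n$ times.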