Changeset 2373 for Papers


Ignore:
Timestamp:
Sep 28, 2012, 5:43:24 PM (7 years ago)
Author:
mulligan
Message:

Changes to the Italian-English

File:
1 edited

Legend:

Unmodified
Added
Removed
  • Papers/cpp-asm-2012/cpp-2012-asm.tex

    r2372 r2373  
    520520\subsection{Correctness for `well-behaved' assembly programs}
    521521\label{subsect.total.correctness.for.well.behaved.assembly.programs}
    522 Most assemblers can map a single pseudoinstruction into zero or more
    523 instructions, whose size (in bytes) is not independent of the expansion.
    524 Therefore, the assembly process always produces a map (that for us is just the
    525 policy) that associates to each assembly address \texttt{a} a code memory
    526 address~\texttt{policy(a)} where the instructions that correspond to the
    527 pseudoinstruction at \texttt{a} are located. And the map is not just a
    528 linear function, but depends on the local choices and global optimizations
    529 performed.
    530 
    531 During execution of an assembly code, addresses can be stored in memory
    532 locations or in the registers. Moreover, arithmetical operations can be
    533 applied to addresses, for example to compare them or to shift a function pointer
    534 to implement C switch statements. In order to show that the object code
    535 simulates
    536 the assembly code, we need to compute the processor status that corresponds
    537 to the assembly status. In particular, those \texttt{a} in memory that
    538 are used as data should be preserved as \texttt{a}, but those used as
    539 addresses should be changed into \texttt{policy(a)}. Moreover, every
    540 arithmetic operation should commute with \texttt{policy} in order for the
    541 semantics to be preserved.
    542 
    543 According to the previous observation, we can ask if it is possible at all
    544 for an assembler to preserve the semantics of an assembly program.
    545 The traditional approach to the verification of assemblers answers the question
    546 in the positive by restricting the semantics of assembly programs. In
    547 particular, the type of memory cells and registers is set to the disjoint union
    548 of data and symbolic addresses, an the semantics is always forced to consider
    549 all possible combinations of arguments (data vs data, data vs addresses, etc.),
    550 rejecting operations whose semantics cannot be preserved.
    551 
     522Most assemblers can map a single pseudoinstruction to zero or more machine instructions, whose size (in bytes) is not independent of the expansion.
     523The assembly process therefore always produces a map (which for us is just the policy) that associates to each assembly address \texttt{a} a code memory address~\texttt{policy(a)} where the instructions that correspond to the pseudoinstruction at \texttt{a} are located.
     524Ordinarily, the map is not just a linear function, but depends on the local choices and global optimisations performed.
     525
     526During execution of assembly code, addresses can be stored in memory locations or in the registers.
     527Moreover, arithmetical operations can be applied to addresses, for example to compare them or to shift a function pointer in order to implement C \texttt{switch} statements.
     528In order to show that the object code simulates the assembly code we must compute the processor status that corresponds to the assembly status.
     529In particular, those \texttt{a} in memory that are used as data should be preserved as \texttt{a}, but those used as addresses should be changed into \texttt{policy(a)}.
     530Moreover, every arithmetic operation should commute with \texttt{policy} in order for the semantics to be preserved.
     531
     532Following the previous observation, we can ask if it is possible at all for an assembler to preserve the semantics of an assembly program.
     533The traditional approach to the verification of assemblers answers the question in the affirmative by restricting the semantics of assembly programs.
     534In particular, the type of memory cells and registers is set to the disjoint union of data and symbolic addresses, and the semantics is always forced to consider all possible combinations of arguments (data vs.\ data, data vs. addresses, and so on), rejecting operations whose semantics cannot be preserved.
    552535\begin{displaymath}
    553536\mathtt{Mem} : \mathtt{Addr} \rightarrow \mathtt{Bytes} + \mathtt{Addr} \qquad \llbracket - \rrbracket : \mathtt{Instr} \rightarrow \mathtt{Mem} \rightarrow \mathtt{option\ Mem}
     
    560543                                                            \end{cases}
    561544\end{gather*}
    562 This approach has two main limitations. The first one is that it does not
    563 assign any semantics to interesting programs that could intentionally mangle
    564 addresses for malign (e.g. viruses) or benign (e.g. operating systems)
    565 purposes. The second is that it does not allow to share very well the
    566 semantics of assembly pseudoinstructions and object code instructions: only
    567 the \texttt{Byte-Byte} branch above can share the semantics with the
    568 object code \texttt{MUL}.
    569 
    570 In this paper we have already taken a different approach from
    571 Section~\ref{subsect.assembly.code.semantics}, where we have assigned a
    572 semantics to every assembly program by not distinguishing at all between
    573 data and symbolic addresses. Memory cells and registers always hold bytes,
    574 and symbolic labels are mapped to absolute addresses before execution.
    575 The consequence is that we do not expect that all assembly programs will have
    576 their semantics respected by object code. We call \emph{well behaved} those
    577 that do.
    578 Other consequences is that we can now reason over the semantics
    579 of programs that are not well behaved, and that we can handle well behavedness
    580 as an open predicate, recognizing more and more good behaviours as required.
    581 Of course, compilers that target our assembler will need to produce well
    582 behaved programs, which is usually the case by construction.
    583 
    584 The definition of well behavedness we employ employs a map to keep tracks
    585 of the memory locations and registers that hold addresses during execution
    586 of an assembly program. Therefore, the map acts as a sort of dynamic type
    587 system sitting atop memory. This approach seems similar to one taken by Tuch
    588 \emph{et al}~\cite{tuch:types:2007} for reasoning about low-level C code.
    589 
    590 The semantics of an assembly program is then augmented with a function
    591 that at each execution step updates the maps, signalling an error when
    592 the program performs an ill behaved operation. The actual computation is
    593 not performed by this mechanism, being already part of the assembly
    594 semantics.
     545This approach has two main limitations.
     546The first one is that it does not assign any semantics to interesting programs that could intentionally mangle addresses for malign (e.g. viruses) or benign (e.g. operating systems) purposes.
     547The second is that it does not allow one to adequately share the semantics of assembly pseudoinstructions and object code instructions: only the \texttt{Byte-Byte} branch above can share the semantics with the object code \texttt{MUL}.
     548
     549In this paper we have already taken a different approach from Sect.~\ref{subsect.assembly.code.semantics}, where we have assigned a semantics to every assembly program by not distinguishing at all between data and symbolic addresses.
     550Memory cells and registers always hold bytes, and symbolic labels are mapped to absolute addresses before execution.
     551Consequently we do not expect that all assembly programs will have their semantics respected by object code.
     552We call those programs that do \emph{well-behaved}.
     553Further, we can now reason over the semantics of programs that are not well-behaved, and that we can handle well-behavedness as an open predicate, recognising more and more good behaviours as required.
     554Naturally, compilers that target our assembler will need to produce well-behaved programs, which is usually the case by construction.
     555
     556The definition of well-behavedness we employ uses a map to keep track of the memory locations and registers that hold addresses during execution of an assembly program.
     557The map acts as a sort of dynamic typing system sitting atop memory.
     558This approach seems similar to one taken by Tuch \emph{et al}~\cite{tuch:types:2007} for reasoning about low-level C code.
     559
     560The semantics of an assembly program is then augmented with a function that at each execution step updates the map, signalling an error when the program performs an ill-behaved operation.
     561The actual computation is not performed by this mechanism, being already part of the assembly semantics.
    595562\begin{displaymath}
    596563\mathtt{AddrMap} : \mathtt{Addr} \rightarrow \{Data,Addr\} \qquad \llbracket - \rrbracket : \mathtt{Instr} \rightarrow \mathtt{AddrMap} \rightarrow \mathtt{option\ AddrMap}
     
    603570                                                            \end{cases}
    604571\end{gather*}
    605 
    606 In order to prove preservation of the semantics, we need to associate an
    607 objecto code status to each assembly pseudostatus. This operation is driven
    608 by the current \texttt{AddrMap}: if at address \texttt{a} the assembly
    609 level memory holds \texttt{d}, then if \texttt{AddrMap(a) = Data} the
    610 object code memory will hold \texttt{d} (data is preserved), otherwise
    611 it will hold \texttt{policy(d)}. If all the operations accepted by the
    612 address update map are well behaved, this is sufficient to show preservation
    613 of the semantics for those computation steps that are well behaved, i.e.
    614 such that the map update does not fail.
     572In order to prove the preservation of the semantics, we need to associate an object code status to each assembly pseudostatus.
     573This operation is driven by the current \texttt{AddrMap}: if at address \texttt{a} the assembly level memory holds \texttt{d}, then if \texttt{AddrMap(a) = Data} the object code memory will hold \texttt{d} (data is preserved), otherwise it will hold \texttt{policy(d)}.
     574If all the operations accepted by the address update map are well-behaved, this is sufficient to show preservation of the semantics for those computation steps that are well-behaved, i.e. such that the map update does not fail.
    615575
    616576XXXXXXXXXXXXXXXXX
Note: See TracChangeset for help on using the changeset viewer.