Changeset 2382

Oct 1, 2012, 11:55:26 AM (7 years ago)

Final version of executable semantics paper.

3 edited
1 moved


  • Papers/cpp-exec-2012/ccexec.tex

    r2381 r2382  
    6868the compiler is formalised in the Coq proof
    6969assistant % TODO: consider citation?
    70 and accompanied by correctness results.
     70and accompanied by correctness results, most notably proving that any
     71behaviour given to the assembly output must be an acceptable behaviour
     72for the C source program.
    7274To state these correctness properties requires giving some semantics to
    7577these semantics can weaken the overall theorems, potentially masking
    7678bugs in the compiler.  In particular, if no semantics is given to a
    77 legitimate C program the compiler is free to generate bad
     79legitimate C program then any behaviour is acceptable for the
     80generated assembly, and the compiler is free to generate bad
    7881code\footnote{By design, the compiler is also free to fail with an
    79   error for \emph{any} program, even if it is well-defined.}.
     82  error for \emph{any} program, even if it is well-defined.}.  This
     83corresponds to the C standard's notion of \emph{undefined behaviour},
     84but this notion is left implicit in the semantics, so there is a
     85danger of underdefining C.
    8187% TODO: clight paper also mentions executable semantics, in some detail
    117123  between the original deterministic and non-deterministic versions of
    118124  the semantics, and
    119 \item demonstrate that a mixture of formal and informal code can make
     125\item demonstrate that a mixture of formal Coq and informal OCaml code can make
    120126  testing more effective by working around known bugs with little
    121127  effort.
     129% TODO: consider moving out of footnote to make it less footnote heavy
     130The full development is available online as a modified version of CompCert\footnote{\url{}}.
    124132In Section~\ref{sec:compcert} we will give an overview of the relevant
    125133parts of the CompCert compiler.  In Section~\ref{sec:construction} we
    126134will discuss the construction of the executable semantics, its
    127 relationship to the inductively defined semantics from CompCert and
     135relationship to the inductively defined semantics from CompCert (both
     136formalised in Coq) and
    128137the additional OCaml code used to support testing.  This is followed by the
    129138results of testing the semantics in Section~\ref{sec:testing},
    130139including descriptions of the problems encountered.
    131 Finally, Section~\ref{sec:related} discusses related work.
     140Finally, Section~\ref{sec:related} discusses related work, including
     141an interpreter Leroy added to newer versions of CompCert following
     142this work.
    133144\section{Overview of the CompCert compiler}
    136147CompCert compiles a simplified but substantial subset of the C
    137 programming language to one of 3 different architectures (as of
     148programming language to one of three different architectures (as of
    138149version 1.8.2, which we refer to throughout unless stated otherwise).
    139150The main stages of the compiler, from an abstract syntax tree of the
    166177tree into the CompCert C language.
    168 Earlier version of CompCert did not feature CompCert C at all, but the
     179Earlier versions of CompCert did not feature CompCert C at all, but the
    169180less complex Clight language.  Clight lives on as the next
    170181intermediate language in the compiler.  Hence several publications on
    223234this for Clight as part of the CerCo project~\cite{EDI-INF-RR-1412}), but we do
    224235not expect that it would yield any worthwhile insights during testing
    225 to justify the effort involved.
     236to justify the effort involved arising from the extra non-determinism
     237from I/O and the coinductive reasoning for non-terminating programs.
    227239The existing definitions that we can reuse include the memory model
    334346which subexpressions can be evaluated in the current state (for
    335347example, only the guard of a conditional expression can be evaluated
    336 before the conditional expression itself).  We could execute the
     348before the conditional expression itself, whereas either subexpression
     349of $e_1$ \lstinline'+' $e_2$ can be reduced).  We could execute the
    337350non-deterministic semantics by collecting all of the possible
    338 reductions in a set, but we are primarily interested in testing the
     351reductions in a set (in the form of the possible successor states), but we are primarily interested in testing the
    339352semantics on well behaved C programs, and an existing theorem in
    340353CompCert shows that any program with a \emph{safe} non-deterministic
    349362Moreover, if we choose a function matching the deterministic semantics
    350363then we know that any well-defined C program which goes wrong is
    351 demonstrating a bug which affects \emph{both} sets of semantics.  Regardless
     364demonstrating a bug (in the form of missing behaviour) which affects \emph{both} sets of semantics.  Regardless
    352365of the choice of strategy, we can used failed executions to pinpoint
    353366rules in the semantics which concern us.  We will discuss the
    358 Finally, to match the non-deterministic semantics precisely we must
     371Finally, to match the non-deterministic semantics of \lstinline'estep' precisely we must
    359372also check that there are no \emph{stuck} subexpressions.  This
    360373detects subexpressions which cannot be reduced because they are
    370383the program undefined because of the potential division by zero.
    372 Informally, this condition states that \emph{any subexpression in an
     385Informally, the condition for \lstinline'estep' states that \emph{any subexpression in an
    373386  evaluation context is either a value, or has a further subexpression
    374387  that is reducible}.  This form suggests an inefficient executable
    383396The deterministic semantics does not enforce this condition because it
    384397commits to one particular evaluation, regardless of whether other
    385 evaluations may fail.  Thus we make the executable version of the
     398evaluations may fail.
     399This means that the relationship between the non-deterministic and
     400deterministic semantics is not a straightforward inclusion: the
     401deterministic semantics accepts some executions where the stuckness
     402check fails, but rejects the executions allowed by the
     403non-deterministic semantics under a different expression
     404reduction order.
     405  Thus we make the executable version of the
    386406check optional so that we may implement either semantics.
    390410We show the equivalence of the executable semantics to the original
    391 relational semantics in two parts.  First, we show that any successful
     411relational semantics in two parts, summarising the proofs from the formal Coq development.  First, we show that any successful
    392412execution of a step corresponds to some derivation in the original
    393413semantics, regardless of the strategy we choose.
    454474different way to non-deterministic executions: expressions with no
    455475side effects are dealt with by a big-step relation, and effectful
    456 expressions (where control flow such as the conditional operator is
    457 considered an effect) use a small-step relation as before.
    459 Nonetheless, we can encode the resulting evaluation order as a
     476expressions use a small-step relation as before.
     477Here `effectful' includes both changes to the state and to control flow such as the conditional operator.
     478  This
     479prioritisation of the effectful subexpressions is also what causes the
     480difference in evaluation order.  For example,
     481\lstinline[language=C]'x+(x=1)' when \lstinline'x' $\not = 1$ gives a different
     482answer with the second strategy because the assignment is evaluated
     485Despite the difference in construction, we can still encode the resulting evaluation order as a
    478504executable semantics will perform the same steps, reaching the same
    479505end state as the big-step.  Then using some lemmas to show that the
    480 \lstinline'leftcontext's from the deterministic semantics are found by
     506\lstinline'leftcontext's (the deterministic version of expression contexts) used in the semantics are found by
    481507our leftmost-innermost effectful subexpression strategy, we can show
    482508that any step of the deterministic semantics is performed by some
    490516  $t$.
     518Again, a precise version of the above argument is formalised in Coq.
    493520Together with the connections between the non-deterministic semantics
    556583Finally, we note that the execution of the semantics can be examined
    557584in the debugger packaged with OCaml when absolute certainty about the
    558 cause of a problem is required.
    560 The full development is available online as a modified version of CompCert\footnote{\url{}}.
     585cause of a problem is required.  Indeed, during the testing below the
     586debugger was used to retrieve extra information about the program
     587state that was not included in the interpreter's error message.
    588616There are two noteworthy aspects to this bug.  First, the compiler
    589 itself features the correct type check and works perfectly.  The
     617itself features the correct type check and works perfectly.  The error
     618only affects the specification of the compiler; this is extra
     619behaviour that has not been proved correct. The
    590620semantics can be easily modified to use the same check, and the only
    591621other change required is to \emph{remove} some steps from the proof
    711741The tests revealed that the semantics were missing some minor casting
    712742rules and that zero initialisation of file scope variables was not
    713 performed when initialisation the memory model.  Both were handled
     743performed when initialising the memory model.  Both were handled
    714744correctly by the compiler, in the latter case by the informal code
    715745that produces the assembler output.
    753783surprisingly tractable).  However, there is no easy way to mimic just
    754784the deterministic evaluation order, and no tricks in the driver code
    755 to make testing easier.
     785to make testing easier.  Together with the bug fixes, this makes
     786directly comparing test results with the current work infeasible, although
     787some brief testing with the gcc-torture suite behaved as expected.
     788The latest version of CompCert, 1.11, also features a more efficient
     789memory model, improving the interpreter's performance.
    757791% NB: if I change the formatting of Clight, change ClightTSO too
    773807language that are not currently supported by CompCert.  An interpreter
    774808called \texttt{kcc} is derived from the semantics and has been tested
    775 against the gcc-torture test suite that we reused.  These semantics
     809against the gcc-torture test suite that we reused.  They go further
     810and perform coverage analysis to gauge the proportion of the semantics
     811exercised by the tests. These semantics
    776812are not used for compiler verification, but are intended for studying
    777813(and ultimately verifying) the behaviour of C programs.
    810846We have shown that executable semantics can be useful enough for
     847the necessary task of
    811848validating semantics to justify retrofitting them to existing
    812849verified compilers, and in the case of CompCert found a real compiler
  • Papers/cpp-exec-2012/semantics.svg

    r2381 r2382  
    6060     inkscape:pageshadow="2"
    6161     inkscape:zoom="1.75"
    62      inkscape:cx="296.13107"
     62     inkscape:cx="224.41678"
    6363     inkscape:cy="675.69143"
    6464     inkscape:document-units="px"
    6565     inkscape:current-layer="layer1"
    6666     showgrid="false"
    67      inkscape:window-width="629"
    68      inkscape:window-height="503"
     67     inkscape:window-width="1600"
     68     inkscape:window-height="1125"
    6969     inkscape:window-x="0"
    70      inkscape:window-y="25"
    71      inkscape:window-maximized="0" />
     70     inkscape:window-y="24"
     71     inkscape:window-maximized="1" />
    7272  <metadata
    7373     id="metadata7">
    138138       xml:space="preserve"
    139139       style="font-size:10px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;fill:#000000;fill-opacity:1;stroke:none;font-family:LMSans10;-inkscape-font-specification:LMSans10"
    140        x="230.85715"
     140       x="232.27119"
    141141       y="328.93362"
    142142       id="text4988"><tspan
    143143         sodipodi:role="line"
    144144         id="tspan4990"
    145          x="230.85715"
     145         x="232.27119"
    146146         y="328.93362">(safe only)</tspan></text>
    147147    <text
    188188       xml:space="preserve"
    189189       style="font-size:10px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:center;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;font-family:LMSans10;-inkscape-font-specification:LMSans10"
    190        x="245.71429"
     190       x="246.71158"
    191191       y="395.21933"
    192192       id="text2836"><tspan
    193193         sodipodi:role="line"
    194194         id="tspan2838"
    195          x="245.71429"
     195         x="246.71158"
    196196         y="395.21933">(strategy)</tspan></text>
     197    <text
     198       xml:space="preserve"
     199       style="font-size:10px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;fill:#000000;fill-opacity:1;stroke:none;font-family:LMSans10;-inkscape-font-specification:LMSans10"
     200       x="209.14285"
     201       y="296.93362"
     202       id="text2838"><tspan
     203         sodipodi:role="line"
     204         id="tspan2840"
     205         x="209.14285"
     206         y="296.93362" /></text>
     207    <text
     208       xml:space="preserve"
     209       style="font-size:10px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;fill:#000000;fill-opacity:1;stroke:none;font-family:LMSans10;-inkscape-font-specification:LMSans10"
     210       x="149.71428"
     211       y="412.36218"
     212       id="text2842"><tspan
     213         sodipodi:role="line"
     214         id="tspan2844"
     215         x="149.71428"
     216         y="412.36218">Theorem 1</tspan></text>
     217    <text
     218       xml:space="preserve"
     219       style="font-size:10px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;fill:#000000;fill-opacity:1;stroke:none;font-family:LMSans10;-inkscape-font-specification:LMSans10"
     220       x="224.00002"
     221       y="383.21933"
     222       id="text2846"><tspan
     223         sodipodi:role="line"
     224         id="tspan2848"
     225         x="224.00002"
     226         y="383.21933">Theorem 2</tspan></text>
     227    <text
     228       xml:space="preserve"
     229       style="font-size:10px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;fill:#000000;fill-opacity:1;stroke:none;font-family:LMSans10;-inkscape-font-specification:LMSans10"
     230       x="343.59332"
     231       y="399.79074"
     232       id="text2850"><tspan
     233         sodipodi:role="line"
     234         id="tspan2852"
     235         x="343.59332"
     236         y="399.79074">Theorem 3</tspan></text>
     237    <text
     238       xml:space="preserve"
     239       style="font-size:10px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;fill:#000000;fill-opacity:1;stroke:none;font-family:LMSans10;-inkscape-font-specification:LMSans10"
     240       x="224.57143"
     241       y="315.79077"
     242       id="text2854"><tspan
     243         sodipodi:role="line"
     244         id="tspan2856"
     245         x="224.57143"
     246         y="315.79077">Existing result</tspan></text>
    197247  </g>
Note: See TracChangeset for help on using the changeset viewer.