source: Papers/cpp-asm-2012/reviews.txt @ 2377

Last change on this file since 2377 was 2377, checked in by sacerdot, 7 years ago

typo fixed

File size: 9.1 KB
Line 
14) How do you model I/O operations?
25) How do you plan to run the assembler?  Can you extract an executable assembler from the Matita implementation?
36) Right now, the proofs and specifications are mixed together in the Matita files.  Is there a way to isolate and quantify the definitions that you trust for correctness (the "trusted computing base")?
4
510) Section 3.4, lemma fetch_assembly_pseudo: the variable "instructions" is not in scope (did you mean "instrs"?).
6
711) Section 3.4: "Here, l is the number of machine code instructions".  "pc_plus_len" is computed by adding pc to l, so wouldn't "number of bytes" make more sense?
8
9----------------------- REVIEW 3 ---------------------
10
11The term "total correctness" used in Sec. 3.4 and 3.5 is confusing,
12since it has already been used in Hoare logic with standard meaning.
13
14
15----------------------- REVIEW 4 ---------------------
16  (i)   The presentation is needlessly complicated and difficult to
17        understand; while individual sentences are well-written, they
18        sometimes lack precision,
19
20  (ii)  The most delicate and interesting part of the problem seems to be
21        the translation of jump offsets, but the algorithm for this has been
22        submitted as a companion paper; it may have been better to submit a
23        single, self-contained paper with a higher concentration of
24        interesting results.
25
26  (iii) The notion of `well behaved' assembly programs is either not
27        described clearly enough, or not adequately developed. It seems to
28        relie on `run-time' checks for correct address manipulation. In
29        fact, these checks seem to be made at `emulation-time' (since the
30        generated code is not changed to implement the checks), and it is
31        not clear whether (or how) this covers all possible program
32        behaviours.
33
341. The title talks about an _optimising_ assembler, but the only
35   optimisation seems to be in calculating branch displacements and this
36   question is only partially resolved in the paper. Can assemblers be
37   expected to perform any other optimisations?
38
392. The abstract talks about "the efficient expansion of
40   psuedoinstructions---particularly jumps---is complex", but given that
41   this is the sole optimisation, it would seem more accurate to say
42   "...---namely jumps". Furthermore, the word "complex" is perhaps
43   overreaching.
44
453. Some parts of the text lack precision. Namely:
46
47   a. p.2: "what one hand giveth, the other taketh away"---it seems rather
48      to be a question of a straight-out engineering tradeoff in processor
49      choice. Furthermore, the MCS51 is memory constrained by the fact that
50      it is an 8-bit microcontroller; but there would seem to be sufficient
51      memory for the tasks it is expected to perform. This does not mean
52      that the calculation of jumping instructions is not a worthy problem,
53      but I question whether it is as critical as stated.
54
55   b. p.2: The last paragraph states that the proof is effected in "a
56      rather profound way", and that the proof of correctness may descend
57      into a "diabolical quagmire". In my opinion, these literary
58      flourishes waste space that could instead be used for more accurate
59      descriptions.
60
61   e. p.6: "The function assembly_1_psuedo_instruction ... is essentially
62      the composition of the two passes": is it or isn't it?
63
64   f. p.6: "... markedly different ways": well, all of the instructions jump
65      to one place or another, so they're not that different!
66
67   h. p.14: "this policy is the only possible policy _in theory_ that can
68      preseve the semantics..."; what does this mean exactly? It would seem
69      that the approach being presented is a counterexample to this
70      statement.
71
724. The question of the MCS51's precise and predictable timing model is quite
73   interesting, but not discussed thoroughly in the paper. The second page
74   talks about "timing characteristics" and I/O but it is not completely
75   clear what is meant. Choosing one branching strategy over another will
76   obviously change the cycle-level timing of a program. The paper does not
77   discuss this in detail, nor whether it is important or not. Apart from
78   that, how else could the timing be changed by choices in the assembler?
79
80   The paper talks about "costing" in the encompassing CerCo project, but
81   few precise details are given. In particular, whenever "cost" is used in
82   the paper, does it just refer to the number of cycles required to execute
83   a sequence of instructions? If so, why use the more abstract term?
84
855. The paper does not mention the issue of placing assembler subroutines in
86   memory? Is this not an issue? Could placement choices affect code size
87   (from jump instructions) significantly?
88
896. p.4: It seems to me that the paper would have been much easier to
90   understand had it included more "sign posting". For instance, the start of
91   section 3 might state that there will be two models: one where jump
92   destinations are specified using relative offsets, and another where they
93   are given as labels. And, in fact, that the whole point is to translate
94   accurately from one to the other. The paper might then summarise
95   reference [4], which is very nicely written, to motivate the issue of
96   calculating branch displacements.
97
98   It might also state the two other issues are the correctness of assembly
99   encoding/fetching, and a notion of "well behaved" programs that modify
100   addresses in "reasonable" ways.
101
1027. p.5: It would be nice to see the type of the "preamble"; is it just a
103   DATA section?
104
1058. The paper does not discuss the complete tool chain. Does the verified
106   development parse MCS51 assembler and spit out a binary image?
107
1089. p.5: Why is "Word" and not "Identifier" or "nat" used to index psuedo
109   instructions? Or maybe even a type synonym so there is no risk of
110   confusion with the type used to index instructions.
111
112   For instance, it would be nice if the statement "The function sigma maps
113   pseudo program counters to program counters" were better reflected in the
114   type of sigma.
115
11610. p.6: I did not really understand what the paragraph "In contrast to other
117    approaches, we do not perform any kind of symbolic execution..." was
118    trying to say.
119
12011. pp.7--9: this text talks about "program counters", when I think it
121    means "addresses" (i.e., values that the program counter may take).
122
12312. It is confusing that the variable called "policy" is only one part of
124    the concept "policy".
125
12613. p.8: I didn't understand why it is sometimes necessary to force the
127    expansion of jumps to the largest size possible.
128
12914. p.11: states that "we must detect (at run time) programs that manipulate
130    addresses in well behaved ways". But what does this mean exactly? Given
131    that the output program is not augmented with checks, it seems that the
132    authors mean "at emulation time". But how can this be exhaustive (for
133    all input values and for non-terminating executions)? Are there some
134    kind of necessary conditions for having a well-behaved program? Must
135    users provide a proof of well-behavedness for each program before they
136    can be sure that its assembly is correct? Isn't this rather a typing
137    issue? Is there any relation with the work of Greg Morriset et al on
138    Typed Assembly Language?
139
14015. The paper states that the work may be usable in the context of verified
141    operating systems kernels. This is a point worth discussing, but the
142    text could be more precise. The seL4 kernel requires memory protection
143    but the MCS51 has no MMU. Furthermore, the verification was specific to
144    the arm assembly language, where it would seem that the issue of branch
145    displacement is less significant.
146
147    Similarly, the paper discusses compilers like CompCert and the issue of
148    well-formedness. Again, this is a worthy point, but the discussion seems
149    to neglect the possibility that well formedness may be guaranteed by the
150    conversion from the high-level language, i.e. that all C programs may
151    compile into "well behaved" assembly programs. Is this true?
152
15316. The related work section touches on "timing properties of an assembly
154    program", but this issue is not treated in depth by the paper, and, in
155    fact, these sentences seem out of place.
156
15718. Research Questions
158
159   a. Did you consider "running" your model in parallel with a real or
160      emulated MCS-51 to validate its correctness?
161      (as in the work of Peter Sewell et al's work on TCP/IP and memory
162       weak models; where they compare their models against real
163       implementations.)
164
165   b. Rather than verifying a branch displacement algorithm, did you
166      consider showing refinement between the input (assembler with labels) and
167      the output (modified assembler with offsets)? That way, the algorithm
168      could use whatever strategy it liked and the results would still be
169      verified. From my understanding, this is one of the approaches taken
170      in Xavier Leroy et al's CompCert project.
171
172x. Minor typos / issues
173
174   c. p.4: It would be helpful if the abbreviated definitions of
175       preinstruction, instruction, and pseudoinstruction included more
176       elements in common with themselves and with the other examples
177       (notably DEC).
Note: See TracBrowser for help on using the repository browser.