# Changeset 2378

Timestamp:
Sep 28, 2012, 6:24:44 PM
Message:

...

Location:
Papers/cpp-asm-2012
Files:
2 edited

• ## Papers/cpp-asm-2012/cpp-2012-asm.tex

 r2377 We may compare our work to an `industrial grade' assembler for the MCS-51: SDCC~\cite{sdcc:2011}, the only open source C compiler that targets the MCS-51 instruction set. It appears that all pseudojumps in SDCC assembly are expanded to \texttt{LJMP} instructions, the worst possible jump expansion policy from an efficiency point of view. Note that this policy is the only possible policy \emph{in theory} that makes every assembly program well behaved, preserving its semantics during the assembly process. This comes at the expense of assembler completeness, as the generated program may be too large for code memory: there is a trade-off between the completeness of the assembler and the efficiency of the assembled program. The definition and proof of a terminating, correct jump expansion policy is described elsewhere~\cite{boender:correctness:2012}.
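The efficiency gap between the two policies mentioned above can be illustrated with a small sketch (ours, not the paper's Matita development). It contrasts an SDCC-style worst-case policy, which expands every pseudojump to a long jump, with a standard fixed-point shortening pass; the sizes and offset range are simplified MCS-51-like values (short jump: 2 bytes, signed 8-bit offset; long jump: 3 bytes), and all names are illustrative.

```python
# Illustrative sketch only: comparing an "expand everything to LJMP"
# policy with a fixed-point jump-shortening pass. Simplified sizes:
# short jump (SJMP-like) = 2 bytes, long jump (LJMP-like) = 3 bytes,
# every non-jump instruction = 1 byte.

SHORT, LONG = 2, 3  # jump instruction sizes in bytes

def layout(program, sizes):
    """Compute the byte address of each instruction for the chosen sizes."""
    addrs, pc = [], 0
    for size in sizes:
        addrs.append(pc)
        pc += size
    return addrs

def fits_short(src_addr, src_size, dst_addr):
    # A short jump's offset is relative to the *next* instruction
    # and must fit in a signed byte.
    off = dst_addr - (src_addr + src_size)
    return -128 <= off <= 127

def expand_all_long(program):
    """Worst-case policy: every pseudojump becomes a long jump."""
    return [LONG if op == "jmp" else 1 for op, _ in program]

def fixed_point_policy(program):
    """Start optimistic (all short) and grow jumps until stable.

    Growing a jump can only push targets further apart, so sizes grow
    monotonically and the iteration terminates."""
    sizes = [SHORT if op == "jmp" else 1 for op, _ in program]
    changed = True
    while changed:
        changed = False
        addrs = layout(program, sizes)
        for i, (op, target) in enumerate(program):
            if op == "jmp" and sizes[i] == SHORT:
                if not fits_short(addrs[i], sizes[i], addrs[target]):
                    sizes[i] = LONG
                    changed = True
    return sizes

# A tiny program: ("jmp", target_index) for pseudojumps, ("nop", None)
# otherwise. Both jumps here are within short range.
prog = [("jmp", 2), ("nop", None), ("jmp", 0)]
print(sum(expand_all_long(prog)))     # 3 + 1 + 3 = 7 bytes
print(sum(fixed_point_policy(prog)))  # 2 + 1 + 2 = 5 bytes
```

Even on this three-instruction program the worst-case policy wastes two bytes; on real code the gap compounds, which is why the trade-off against completeness matters.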
• ## Papers/cpp-asm-2012/reviews.txt

 r2377 6) Right now, the proofs and specifications are mixed together in the Matita files. Is there a way to isolate and quantify the definitions that you trust for correctness (the "trusted computing base")?

 10) Section 3.4, lemma fetch_assembly_pseudo: the variable "instructions" is not in scope (did you mean "instrs"?).

 11) Section 3.4: "Here, l is the number of machine code instructions". "pc_plus_len" is computed by adding pc to l, so wouldn't "number of bytes" make more sense?

 ----------------------- REVIEW 3 ---------------------

 The term "total correctness" used in Sec. 3.4 and 3.5 is confusing, since it has already been used in Hoare logic with a standard meaning.

 ----------------------- REVIEW 4 ---------------------

 (i)   The presentation is needlessly complicated and difficult to understand; while individual sentences are well-written, they sometimes lack precision.
 (ii)  The most delicate and interesting part of the problem seems to be the translation of jump offsets, but the algorithm for this has been submitted as a companion paper; it may have been better to submit a single, self-contained paper with a higher concentration of interesting results.
 (iii) The notion of `well behaved' assembly programs is either not described clearly enough, or not adequately developed. It seems to rely on `run-time' checks for correct address manipulation. In fact, these checks seem to be made at `emulation-time' (since the generated code is not changed to implement the checks), and it is not clear whether (or how) this covers all possible program behaviours.

 1. The title talks about an _optimising_ assembler, but the only optimisation seems to be in calculating branch displacements.

 3. Some parts of the text lack precision. Namely:
 a. p.2: "what one hand giveth, the other taketh away"---it seems rather to be a question of a straight-out engineering tradeoff in processor choice. Furthermore, the MCS51 is memory constrained by the fact that it is an 8-bit microcontroller; but there would seem to be sufficient memory for the tasks it is expected to perform. Memory may well be a constraint, but I question whether it is as critical as stated.
 b. p.2: The last paragraph states that the proof is effected in "a rather profound way", and that the proof of correctness may descend into a "diabolical quagmire". In my opinion, these literary flourishes waste space that could instead be used for more accurate descriptions.
 e. p.6: "The function assembly_1_psuedo_instruction ... is essentially the composition of the two passes": is it or isn't it?
 f. p.6: "... markedly different ways": well, all of the instructions jump to one place or another, so they're not that different!
 h. p.14: "this policy is the only possible policy _in theory_ that can preserve the semantics..."; what does this mean exactly? It would seem that the approach being presented is a counterexample to this statement.

 4. The question of the MCS51's precise and predictable timing model is quite interesting, but not discussed thoroughly in the paper. The second page talks about "timing characteristics" and I/O, but it is not completely clear what is meant. Choosing one branching strategy over another will obviously change the cycle-level timing of a program. The paper does not discuss this in detail, nor whether it is important or not. Apart from that, how else could the timing be changed by choices in the assembler? The paper talks about "costing" in the encompassing CerCo project, but few precise details are given. In particular, whenever "cost" is used in the paper, does it just refer to the number of cycles required to execute? What about memory? Is this not an issue? Could placement choices affect code size (from jump instructions) significantly?

 6. p.4: It seems to me that the paper would have been much easier to understand had it included more "sign posting". For instance, the start of section 3 might state that there will be two models: one where jump destinations are specified using relative offsets, and another where they are given as labels. And, in fact, that the whole point is to translate accurately from one to the other. The paper might then summarise reference [4], which is very nicely written, to motivate the issue of calculating branch displacements. It might also state that the two other issues are the correctness of assembly encoding/fetching, and a notion of "well behaved" programs that modify addresses in "reasonable" ways.

 7. p.5: It would be nice to see the type of the "preamble"; is it just a type of sigma?

 10. p.6: I did not really understand what the paragraph "In contrast to other approaches, we do not perform any kind of symbolic execution..." was trying to say. CSC: because the semantics is shared!

 11. pp.7--9: this text talks about "program counters", when I think it means "addresses" (i.e., values that the program counter may take).

 12. It is confusing that the variable called "policy" is only one part of the concept "policy".

 13. p.8: I didn't understand why it is sometimes necessary to force the expansion of jumps to the largest size possible.

 14. p.11: states that "we must detect (at run time) programs that manipulate addresses in well behaved ways". But what does this mean exactly? Given that the output program is not augmented with checks, it seems that the authors mean "at emulation time". But how can this be exhaustive (for all input values and for non-terminating executions)? Are there some kind of necessary conditions for having a well-behaved program? Must users provide a proof of well-behavedness for each program before they can be sure that its assembly is correct? Isn't this rather a typing issue? Is there any relation with the work of Greg Morrisett et al. on Typed Assembly Language?

 15. The paper states that the work may be usable in the context of verified conversion from the high-level language, i.e. that all C programs may compile into "well behaved" assembly programs. Is this true?

 16. The related work section touches on "timing properties of an assembly program", but this issue is not treated in depth by the paper, and, in fact, these sentences seem out of place.

 18. Research Questions elements in common with themselves and with the other examples (notably DEC). CSC: there is nothing in common; we could put MUL/DEC