| Ok, here are my comments and suggestions about the LLVM instruction set. |
| We should discuss some now, but can discuss many of them later, when we |
| revisit synchronization, type inference, and other issues. |
| (We have discussed some of the comments already.) |
| |
| |
| o We should consider eliminating the type annotation in cases where it is |
| essentially obvious from the instruction type, e.g., in br, it is obvious |
| that the first arg. should be a bool and the other args should be labels: |
| |
| br bool <cond>, label <iftrue>, label <iffalse> |
| |
| I think your point was that making all types explicit improves clarity |
| and readability. I agree to some extent, but it also comes at the cost |
| of verbosity. And when the types are obvious from people's experience |
| (e.g., in the br instruction), it doesn't seem to help as much. |
| |
| |
| o On reflection, I really like your idea of having the two different switch |
| types (even though they encode implementation techniques rather than |
| semantics). It should simplify building the CFG and my guess is it could |
| enable some significant optimizations, though we should think about which. |
| |
| |
| o In the lookup-indirect form of the switch, is there a reason not to make |
| the val-type uint? Most HLL switch statements (including Java and C++) |
| require that anyway. And it would also make the val-type uniform |
| in the two forms of the switch. |
| |
| I did see the switch-on-bool examples and, while cute, we can just use |
| the branch instructions in that particular case. |
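| |
| To make the two forms concrete, here is a rough Python model. It assumes |
| one form indexes a dense table of labels with an unsigned value and the |
| other searches a list of (value, label) pairs; the function names and the |
| use of strings for labels are made up for illustration. |
| |
```python
def switch_indexed(val, targets, default):
    """Table style: val is an unsigned index into a dense list of labels."""
    if 0 <= val < len(targets):
        return targets[val]
    return default


def switch_lookup(val, cases, default):
    """Lookup style: search (case_value, label) pairs for a matching value."""
    for case_value, label in cases:
        if case_value == val:
            return label
    return default
```
| |
| With a uint val-type in both forms, the two would differ only in how the |
| destination is found, not in which values they accept. |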
| |
| |
| o I agree with your comment that we don't need 'neg'. |
| |
| |
| o There's a trade-off with the cast instruction: |
| + it avoids having to define all the upcasts and downcasts that are |
| valid for the operands of each instruction (you probably have thought |
| of other benefits also) |
| - it could make the bytecode significantly larger because there could |
| be a lot of cast operations |
| |
| |
| o Making the second arg. to 'shl' a ubyte seems good enough to me. |
| 255 positions seem adequate for several generations of machines |
| and is more compact than uint. |
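| |
| As a quick check on that range, here is a small Python sketch. It assumes |
| a ubyte holds 0 through 255 and that only shift amounts smaller than the |
| operand width are useful; the helper names are hypothetical. |
| |
```python
UBYTE_MAX = 2**8 - 1  # largest value an unsigned 8-bit shift amount can hold


def max_useful_shift(bit_width):
    """Largest shift amount that does not shift every bit out of the value."""
    return bit_width - 1


def fits_in_ubyte(bit_width):
    """True if every useful shift amount for this operand width fits in a ubyte."""
    return max_useful_shift(bit_width) <= UBYTE_MAX
```
| |
| Under these assumptions, operand widths up to 256 bits are covered. |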
| |
| |
| o I still have some major concerns about including malloc and free in the |
| language (either as builtin functions or instructions). LLVM must be |
| able to represent code from many different languages. Languages such as |
| C, C++, Java, and Fortran 90 would not be able to use our malloc anyway |
| because each of them will want to provide a library implementation of it. |
| |
| This gets even worse when code from different languages is linked |
| into a single executable (which is fairly common in large apps). |
| Having a single malloc would just not suffice, and instead would simply |
| complicate the picture further because it adds an extra variant in |
| addition to the one each language provides. |
| |
| Instead, providing a default library version of malloc and free |
| (and perhaps a malloc_gc with garbage collection instead of free) |
| would make a good implementation available to anyone who wants it. |
| |
| I don't recall all your arguments in favor, so let's discuss this again, |
| and soon. |
| |
| |
| o 'alloca' on the other hand sounds like a good idea, and the |
| implementation seems fairly language-independent so it doesn't have the |
| problems with malloc listed above. |
| |
| |
| o About indirect call: |
| Your option #2 sounded good to me. I'm not sure I understand your |
| concern about an explicit 'icall' instruction? |
| |
| |
| o A pair of important synchronization instructions to think about: |
| load-linked |
| store-conditional |
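| |
| For reference, the usual way these two are used together is a retry loop. |
| The Python below is only a single-threaded toy model of that pattern; the |
| Cell class, its single reservation flag, and the interfere hook are |
| assumptions for illustration, not a proposed LLVM design. |
| |
```python
class Cell:
    """Toy one-word memory with a single load-linked reservation flag."""

    def __init__(self, value=0):
        self.value = value
        self.reserved = False

    def load_linked(self):
        # Read the value and start watching this location.
        self.reserved = True
        return self.value

    def store(self, value):
        # An ordinary store (e.g., from another thread) drops the reservation.
        self.value = value
        self.reserved = False

    def store_conditional(self, value):
        # Succeeds only if nothing wrote the cell since load_linked.
        if not self.reserved:
            return False
        self.value = value
        self.reserved = False
        return True


def atomic_increment(cell, interfere=None):
    """Increment the cell, retrying whenever the conditional store fails.

    Returns the number of attempts. The optional interfere callback is
    invoked once, between the first load-linked and its store-conditional,
    to simulate a competing write.
    """
    attempts = 0
    while True:
        attempts += 1
        old = cell.load_linked()
        if interfere is not None and attempts == 1:
            interfere(cell)
        if cell.store_conditional(old + 1):
            return attempts
```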
| |
| |
| o Other classes of instructions that are valuable for pipeline performance: |
| conditional-move |
| predicated instructions |
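| |
| The difference from an ordinary branch can be sketched in Python. This is |
| a toy model: select() stands in for a conditional move, and the mask trick |
| is just one way to pick a value without control flow. All names are |
| hypothetical. |
| |
```python
def max_with_branch(a, b):
    # Control flow depends on the comparison: a conditional branch.
    if a > b:
        return a
    return b


def select(cond, if_true, if_false):
    # Conditional-move style: both inputs are already computed and the
    # condition only picks one of them, with no branch in this function.
    mask = -int(bool(cond))  # all ones when cond is true, zero otherwise
    return (if_true & mask) | (if_false & ~mask)


def max_with_select(a, b):
    return select(a > b, a, b)
```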
| |
| |
| o I believe tail calls are relatively easy to identify; do you know why |
| .NET has a tailcall instruction? |
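| |
| One simple check, sketched in Python under strong assumptions: a call is |
| treated as a tail call when the very next instruction returns its result. |
| The tuple encoding of instructions is hypothetical, and a real compiler |
| would have to check more than this (e.g., that no stack storage of the |
| caller is still needed by the callee). |
| |
```python
def tail_call_indices(instructions):
    """Return indices of calls immediately followed by a return of their result.

    Each instruction is a hypothetical tuple:
      ("call", result_name, callee), ("ret", value_name), or ("op", result_name)
    """
    found = []
    for i in range(len(instructions) - 1):
        inst, nxt = instructions[i], instructions[i + 1]
        if inst[0] == "call" and nxt[0] == "ret" and nxt[1] == inst[1]:
            found.append(i)
    return found
```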
| |
| |
| o I agree that we need a static data space. Otherwise, emulating global |
| data gets unnecessarily complex. |
| |
| |
| o About explicit parallelism: |
| |
| We once talked about adding a symbolic thread-id field to each |
| instruction. (It could be optional so single-threaded codes are |
| not penalized.) This could map well to multi-threaded architectures |
| while providing easy ILP for single-threaded ones. But it is probably |
| too radical an idea to include in a base version of LLVM. Instead, it |
| could be a great topic for a separate study. |
| |
| What are the semantics of the IA64 stop bit? |
| |
| |
| |
| |
| o And finally, another thought about the syntax for arrays :-) |
| |
| Although this syntax: |
| array <dimension-list> of <type> |
| is verbose, it will be used only in the human-readable assembly code so |
| size should not matter. I think we should consider it because I find it |
| to be the clearest syntax. It could even make arrays of function |
| pointers somewhat readable. |
| |