| SUMMARY | |
| ------- | |
| We met to discuss the LLVM instruction format and bytecode representation: | |
| ISSUES RESOLVED | |
| --------------- | |
| 1. We decided that we shall use a flat namespace to represent our | |
| variables in SSA form, as opposed to having a two dimensional namespace | |
| of the original variable and the SSA instance subscript. | |
| ARGUMENT AGAINST: | |
| * A two dimensional namespace would be valuable when doing alias | |
| analysis because the extra information can help limit the scope of | |
| analysis. | |
| ARGUMENT FOR: | |
| * Including this information would require that all users of the LLVM | |
| bytecode would have to parse and handle it. This would slow down the | |
| common case and inflate the instruction representation with another | |
| infinite variable space. | |
| REASONING: | |
| * It was decided that because original variable sources could be | |
| reconstructed from SSA form in linear time, that it would be an | |
| unjustified expense for the common case to include the extra | |
| information for one optimization. Alias analysis itself is typically | |
| greater than linear in asymptotic complexity, so this extra analaysis | |
| would not affect the runtime of the optimization in a significant | |
| way. Additionally, this would be an unlikely optimization to do at | |
| runtime. | |
| IDEAS TO CONSIDER | |
| ----------------- | |
| 1. Including dominator information in the LLVM bytecode | |
| representation. This is one example of an analysis result that may be | |
| packaged with the bytecodes themselves. As a conceptual implementation | |
| idea, we could include an immediate dominator number for each basic block | |
| in the LLVM bytecode program. Basic blocks could be numbered according | |
| to the order of occurrence in the bytecode representation. | |
| 2. Including loop header and body information. This would facilitate | |
| detection of intervals and natural loops. | |
| UNRESOLVED ISSUES | |
| ----------------- | |
| 1. Will oSUIF provide enough of an infrastructure to support the research | |
| that we will be doing? We know that it has less than stellar | |
| performance, but hope that this will be of little importance for our | |
| static compiler. This could affect us if we decided to do some IP | |
| research. Also we do not yet understand the level of exception support | |
| currently implemented. | |
| 2. Should we consider the requirements of a direct hardware implementation | |
| of the LLVM when we design it? If so, several design issues should | |
| have their priorities shifted. The other option is to focus on a | |
| software layer interpreting the LLVM in all cases. | |
| 3. Should we use some form of packetized format to improve forward | |
| compatibility? For example, we could design the system to encode a | |
| packet type and length field before analysis information, to allow a | |
| runtime to skip information that it didn't understand in a bytecode | |
| stream. The obvious benefit would be for compatibility, the drawback | |
| is that it would tend to splinter that 'standard' LLVM definition. | |
| 4. Should we use fixed length instructions or variable length | |
| instructions? Fetching variable length instructions is expensive (for | |
| either hardware or software based LLVM runtimes), but we have several | |
| 'infinite' spaces that instructions operate in (SSA register numbers, | |
| type spaces, or packet length [if packets were implemented]). Several | |
| options were mentioned including: | |
| A. Using 16 or 32 bit numbers, which would be 'big enough' | |
| B. A scheme similar to how UTF-8 works, to encode infinite numbers | |
| while keeping small number small. | |
| C. Use something similar to Huffman encoding, so that the most common | |
| numbers are the smallest. | |
| -Chris | |