| SUMMARY |
| ------- |
| |
| We met to discuss the LLVM instruction format and bytecode representation: |
| |
| ISSUES RESOLVED |
| --------------- |
| |
| 1. We decided that we shall use a flat namespace to represent our |
| variables in SSA form, as opposed to having a two dimensional namespace |
| of the original variable and the SSA instance subscript. |
| |
| ARGUMENT AGAINST: |
| * A two dimensional namespace would be valuable when doing alias |
| analysis because the extra information can help limit the scope of |
| analysis. |
| |
| ARGUMENT FOR: |
| * Including this information would require that all users of the LLVM |
| bytecode would have to parse and handle it. This would slow down the |
| common case and inflate the instruction representation with another |
| infinite variable space. |
| |
| REASONING: |
| * It was decided that because original variable sources could be |
| reconstructed from SSA form in linear time, that it would be an |
| unjustified expense for the common case to include the extra |
| information for one optimization. Alias analysis itself is typically |
| greater than linear in asymptotic complexity, so this extra analaysis |
| would not affect the runtime of the optimization in a significant |
| way. Additionally, this would be an unlikely optimization to do at |
| runtime. |
| |
| |
| IDEAS TO CONSIDER |
| ----------------- |
| |
| 1. Including dominator information in the LLVM bytecode |
| representation. This is one example of an analysis result that may be |
| packaged with the bytecodes themselves. As a conceptual implementation |
| idea, we could include an immediate dominator number for each basic block |
| in the LLVM bytecode program. Basic blocks could be numbered according |
| to the order of occurrence in the bytecode representation. |
| |
| 2. Including loop header and body information. This would facilitate |
| detection of intervals and natural loops. |
| |
| UNRESOLVED ISSUES |
| ----------------- |
| |
| 1. Will oSUIF provide enough of an infrastructure to support the research |
| that we will be doing? We know that it has less than stellar |
| performance, but hope that this will be of little importance for our |
| static compiler. This could affect us if we decided to do some IP |
| research. Also we do not yet understand the level of exception support |
| currently implemented. |
| |
| 2. Should we consider the requirements of a direct hardware implementation |
| of the LLVM when we design it? If so, several design issues should |
| have their priorities shifted. The other option is to focus on a |
| software layer interpreting the LLVM in all cases. |
| |
| 3. Should we use some form of packetized format to improve forward |
| compatibility? For example, we could design the system to encode a |
| packet type and length field before analysis information, to allow a |
| runtime to skip information that it didn't understand in a bytecode |
| stream. The obvious benefit would be for compatibility, the drawback |
| is that it would tend to splinter that 'standard' LLVM definition. |
| |
| 4. Should we use fixed length instructions or variable length |
| instructions? Fetching variable length instructions is expensive (for |
| either hardware or software based LLVM runtimes), but we have several |
| 'infinite' spaces that instructions operate in (SSA register numbers, |
| type spaces, or packet length [if packets were implemented]). Several |
| options were mentioned including: |
| A. Using 16 or 32 bit numbers, which would be 'big enough' |
| B. A scheme similar to how UTF-8 works, to encode infinite numbers |
| while keeping small number small. |
| C. Use something similar to Huffman encoding, so that the most common |
| numbers are the smallest. |
| |
| -Chris |
| |