| //===-- README.txt - Notes for WebAssembly code gen -----------------------===// |
| |
| This WebAssembly backend is presently under development. |
| |
| The most notable feature that is not yet stable is the ".o" file format. |
| ".o" file support is needed for many common ways of using LLVM, such as |
| using it through "clang -c", so this backend is not yet considered widely |
| usable. However, this backend is usable within some language toolchain |
| packages: |
| |
| Emscripten provides a C/C++ compilation environment that includes standard |
| libraries, tools, and packaging for producing WebAssembly applications that |
| can run in browsers and other environments. For more information, see the |
| Emscripten documentation in general, and this page in particular: |
| |
| * https://github.com/kripken/emscripten/wiki/New-WebAssembly-Backend |
| |
| Rust provides WebAssembly support integrated into Cargo. There are two |
| main options: |
| - wasm32-unknown-unknown, which provides a relatively minimal environment |
| that has an emphasis on being "native" |
| - wasm32-unknown-emscripten, which uses Emscripten internally and |
| provides standard C/C++ libraries, filesystem emulation, GL and SDL |
| bindings |
| For more information, see: |
| * https://www.hellorust.com/ |
| |
| |
| This backend does not yet support debug info. Full DWARF support needs a |
| design for how DWARF should be represented in WebAssembly. Sourcemap support |
| has an existing design and some corresponding browser implementations, so it |
| only needs to be implemented in LLVM. |
| |
| Work-in-progress documentation for the ".o" file format is here: |
| |
| * https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md |
| |
| A corresponding linker implementation is also under development: |
| |
| * https://lld.llvm.org/WebAssembly.html |
| |
| For more information on WebAssembly itself, see the home page: |
| * https://webassembly.github.io/ |
| |
| The following documents contain some information on the semantics and binary |
| encoding of WebAssembly itself: |
| * https://github.com/WebAssembly/design/blob/master/Semantics.md |
| * https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md |
| |
| The backend is built, tested and archived on the following waterfall: |
| https://wasm-stat.us |
| |
| The backend's bringup is done in part by using the GCC torture test suite, since |
| it doesn't require C library support. Current known failures are listed in |
| known_gcc_test_failures.txt; all other tests should pass, and the waterfall |
| will turn red if they do not. Once most of these pass, further testing will use |
| LLVM's own test suite. The tests can be run locally using: |
| https://github.com/WebAssembly/waterfall/blob/master/src/compile_torture_tests.py |
| |
| Some notes on ways that the generated code could be improved follow: |
| |
| //===---------------------------------------------------------------------===// |
| |
| The br, br_if, and br_table instructions can sometimes carry a value on the |
| value stack across the jump. We should (a) model this, and (b) extend the |
| stackifier to utilize it. |
| |
| //===---------------------------------------------------------------------===// |
| |
| The min/max instructions aren't exactly a<b?a:b because of NaN and negative zero |
| behavior. The ARM target has the same kind of min/max instructions and has |
| implemented optimizations for them; we should do similar optimizations for |
| WebAssembly. |
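
To make the NaN and negative-zero difference concrete, here is a minimal
Python sketch. It is not backend code: select_min and wasm_fmin are
hypothetical helper names, and wasm_fmin models the f32.min/f64.min behavior
as understood here (a NaN operand yields NaN, and -0.0 orders below +0.0).

```python
import math

def select_min(a, b):
    # The naive pattern from the note: a < b ? a : b
    return a if a < b else b

def wasm_fmin(a, b):
    # Assumed model of WebAssembly's floating-point min:
    # any NaN operand produces NaN.
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == 0.0 and b == 0.0:
        # Both operands are zeros; prefer the one with the sign bit set.
        return a if math.copysign(1.0, a) < 0 else b
    return a if a < b else b

# Inputs on which the two definitions disagree.
for a, b in [(math.nan, 1.0), (-0.0, 0.0)]:
    print(a, b, select_min(a, b), wasm_fmin(a, b))
```

Any optimization that rewrites a select into a min (or the reverse) has to
show that neither operand can be NaN or a zero of the opposite sign, or that
the difference is not observable.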
| |
| //===---------------------------------------------------------------------===// |
| |
| AArch64 runs SeparateConstOffsetFromGEPPass, followed by EarlyCSE and LICM. |
| Would these be useful to run for WebAssembly too? Also, it has an option to |
| run SimplifyCFG after running the AtomicExpand pass. Would this be useful for |
| us too? |
| |
| //===---------------------------------------------------------------------===// |
| |
| Register stackification uses the VALUE_STACK physical register to impose |
| ordering dependencies on instructions with stack operands. This is pessimistic; |
| we should consider alternate ways to model stack dependencies. |
| |
| //===---------------------------------------------------------------------===// |
| |
| Lots of things could be done in WebAssemblyTargetTransformInfo.cpp. Similarly, |
| there are numerous optimization-related hooks that can be overridden in |
| WebAssemblyTargetLowering. |
| |
| //===---------------------------------------------------------------------===// |
| |
| Instead of the OptimizeReturned pass, we should consider preserving the |
| "returned" attribute through to MachineInstrs and extending the StoreResults |
| pass to do this optimization on calls too. That would also let the |
| WebAssemblyPeephole pass clean up dead defs for such calls, as it does for |
| stores. |
| |
| //===---------------------------------------------------------------------===// |
| |
| Consider implementing optimizeSelect, optimizeCompareInstr, optimizeCondBranch, |
| optimizeLoadInstr, and/or getMachineCombinerPatterns. |
| |
| //===---------------------------------------------------------------------===// |
| |
| Find a clean way to fix the problem which leads to the Shrink Wrapping pass |
| being run after the WebAssembly PEI pass. |
| |
| //===---------------------------------------------------------------------===// |
| |
| When setting multiple local variables to the same constant, we currently get |
| code like this: |
| |
| i32.const $4=, 0 |
| i32.const $3=, 0 |
| |
| It could be done with a smaller encoding like this: |
| |
| i32.const $push5=, 0 |
| tee_local $push6=, $4=, $pop5 |
| copy_local $3=, $pop6 |
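
A rough size estimate for the two sequences above can be sketched in Python.
This assumes that each listing lowers to i32.const, tee_local, and set_local
instructions with one-byte opcodes, a signed LEB128 constant, and unsigned
LEB128 local indices; the function names are hypothetical.

```python
def uleb128_size(value):
    # Bytes needed for a non-negative integer in unsigned LEB128.
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size

def sleb128_size(value):
    # Bytes needed for an integer in signed LEB128.
    size = 0
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift, so negative values converge to -1
        size += 1
        sign_bit = byte & 0x40
        if (value == 0 and not sign_bit) or (value == -1 and sign_bit):
            return size

def separate_consts_size(constant, local_indices):
    # One "i32.const C; set_local L" pair per local.
    return sum(1 + sleb128_size(constant) + 1 + uleb128_size(l)
               for l in local_indices)

def shared_const_size(constant, local_indices):
    # One "i32.const C", then a tee_local or set_local per local.
    return 1 + sleb128_size(constant) + sum(1 + uleb128_size(l)
                                            for l in local_indices)

print(separate_consts_size(0, [4, 3]), shared_const_size(0, [4, 3]))
```

Under these assumptions the saving grows with the encoded size of the constant
and with the number of locals that receive it.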
| |
| //===---------------------------------------------------------------------===// |
| |
| WebAssembly registers are implicitly initialized to zero. Explicit zeroing is |
| therefore often redundant and could be optimized away. |
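
As an illustration only, the following Python sketch drops redundant zero
stores from a toy instruction list. It assumes a single straight-line block at
function entry, represented as hypothetical (dest_local, value) pairs, and it
treats locals with an index below num_params as parameters, which do not start
at zero. A real pass would also have to account for control flow.

```python
def drop_redundant_zeroing(instrs, num_params):
    """Remove stores of 0 to locals that still hold their implicit zero.

    instrs: list of (dest_local, value) pairs; value is an int constant,
            or None when the stored value is unknown.
    """
    written = set()
    kept = []
    for dest, value in instrs:
        still_zero = dest >= num_params and dest not in written
        if value == 0 and still_zero:
            continue  # the local is already zero, so the store is redundant
        written.add(dest)
        kept.append((dest, value))
    return kept

example = [(2, 0), (0, 0), (3, 7), (3, 0), (2, 0)]
print(drop_redundant_zeroing(example, num_params=1))
```

In this example the stores to local 2 are dropped, while the store to
parameter 0 and the re-zeroing of local 3 after it was written are kept.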
| |
| //===---------------------------------------------------------------------===// |
| |
| Small indices may use smaller encodings than large indices. |
| WebAssemblyRegColoring and/or WebAssemblyRegRenumbering should sort registers |
| according to their usage frequency to maximize the usage of smaller encodings. |
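
The effect of frequency-based renumbering on encoded size can be sketched in
Python, assuming that register indices are encoded as unsigned LEB128. The
function names and the list-of-uses input are hypothetical, and the sketch
ignores constraints a real pass would face, such as parameters having fixed
indices.

```python
from collections import Counter

def uleb128_size(value):
    # Bytes needed for a non-negative integer in unsigned LEB128.
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size

def renumber_by_frequency(uses):
    # uses: one register index per use. Returns {old_index: new_index},
    # giving the most frequently used registers the smallest indices.
    counts = Counter(uses)
    # Ties are broken by the original index to keep the result deterministic.
    ordered = sorted(counts, key=lambda reg: (-counts[reg], reg))
    return {reg: new for new, reg in enumerate(ordered)}

def index_bytes(uses, mapping=None):
    # Total bytes spent on register indices, optionally after renumbering.
    return sum(uleb128_size(mapping[r] if mapping else r) for r in uses)

uses = [200] * 10 + [3] * 2
mapping = renumber_by_frequency(uses)
print(index_bytes(uses), index_bytes(uses, mapping))
```

Indices up to 127 fit in one byte, so the benefit only appears in functions
with more than 128 registers.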
| |
| //===---------------------------------------------------------------------===// |
| |
| Many cases of irreducible control flow could be transformed more optimally |
| than via the transform in WebAssemblyFixIrreducibleControlFlow.cpp. |
| |
| It may also be worthwhile to do transforms before register coloring, |
| particularly when duplicating code, to allow register coloring to be aware of |
| the duplication. |
| |
| //===---------------------------------------------------------------------===// |
| |
| WebAssemblyRegStackify could use AliasAnalysis to reorder loads and stores more |
| aggressively. |
| |
| //===---------------------------------------------------------------------===// |
| |
| WebAssemblyRegStackify is currently a greedy algorithm. This means that, for |
| example, a binary operator will stackify with its user before its operands. |
| However, if moving the binary operator to its user moves it to a place where |
| its operands can't be moved to, it would be better to leave it in place, or |
| perhaps move it up, so that it can stackify its operands. A binary operator |
| has two operands and one result, so in such cases there could be a net win by |
| preferring the operands. |
| |
| //===---------------------------------------------------------------------===// |
| |
| Instruction ordering has a significant influence on register stackification and |
| coloring. Consider experimenting with the MachineScheduler (enable via |
| enableMachineScheduler) and determine if it can be configured to schedule |
| instructions advantageously for this purpose. |
| |
| //===---------------------------------------------------------------------===// |
| |
| WebAssemblyRegStackify currently assumes that the stack must be empty after |
| an instruction with no return values; however, wasm doesn't actually require |
| this. WebAssemblyRegStackify could be extended, or possibly rewritten, to take |
| full advantage of what WebAssembly permits. |
| |
| //===---------------------------------------------------------------------===// |
| |
| Add support for mergeable sections in the Wasm writer, such as for strings and |
| floating-point constants. |
| |
| //===---------------------------------------------------------------------===// |
| |
| The function @dynamic_alloca_redzone in test/CodeGen/WebAssembly/userstack.ll |
| ends up with a tee_local in its prologue that has an unused result, requiring |
| an extra drop: |
| |
| get_global $push8=, 0 |
| tee_local $push9=, 1, $pop8 |
| drop $pop9 |
| [...] |
| |
| The prologue code initially thinks it needs an FP register, but later it |
| turns out to be unneeded. This could be addressed either by being more |
| clever about not inserting code for an FP in the first place, or by |
| optimizing away the copy later. |
| |
| //===---------------------------------------------------------------------===// |