Subzero - Fast code generator for PNaCl bitcode
===============================================

Design
------

See the accompanying DESIGN.rst file for a more detailed technical overview of
Subzero.

Building
--------

Subzero is set up to be built within the Native Client tree.  Follow the
`Developing PNaCl
<https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/developing-pnacl>`_
instructions, in particular the section on building PNaCl sources.  This will
prepare the necessary external headers and libraries that Subzero needs.
Checking out the Native Client project also gets the pre-built clang and LLVM
tools in ``native_client/../third_party/llvm-build/Release+Asserts/bin`` which
are used for building Subzero.

The Subzero source is in ``native_client/toolchain_build/src/subzero``.  From
within that directory, ``git checkout master && git pull`` to get the latest
version of Subzero source code.

The Makefile is designed to be used as part of the higher level LLVM build
system.  To build manually, use the ``Makefile.standalone``.  There are several
build configurations from the command line::

  make -f Makefile.standalone
  make -f Makefile.standalone DEBUG=1
  make -f Makefile.standalone NOASSERT=1
  make -f Makefile.standalone DEBUG=1 NOASSERT=1
  make -f Makefile.standalone MINIMAL=1
  make -f Makefile.standalone ASAN=1
  make -f Makefile.standalone TSAN=1

``DEBUG=1`` builds without optimizations and is good when running the translator
inside a debugger.  ``NOASSERT=1`` disables assertions and is the preferred
configuration for performance testing the translator.  ``MINIMAL=1`` attempts to
minimize the size of the translator by compiling out everything unnecessary.
``ASAN=1`` enables AddressSanitizer, and ``TSAN=1`` enables ThreadSanitizer.

The result of the ``make`` command is the target ``pnacl-sz`` in the current
directory.

Building within LLVM trunk
--------------------------

Subzero can also be built from within a standard LLVM trunk checkout.  Here is
an example of how it can be checked out and built::

  mkdir llvm-git
  cd llvm-git
  git clone http://llvm.org/git/llvm.git
  cd llvm/projects/
  git clone https://chromium.googlesource.com/native_client/pnacl-subzero
  cd ../..
  mkdir build
  cd build
  cmake -G Ninja ../llvm/
  ninja
  ./bin/pnacl-sz -version

This creates a default build of ``pnacl-sz``; currently any options such as
``DEBUG=1`` or ``MINIMAL=1`` have to be added manually.

``pnacl-sz``
------------

The ``pnacl-sz`` program parses a pexe or an LLVM bitcode file and translates it
into ICE (Subzero's intermediate representation).  It then invokes the ICE
translate method to lower it to target-specific machine code, optionally dumping
the intermediate representation at various stages of the translation.

The program can be run as follows::

  ../pnacl-sz ./path/to/<file>.pexe
  ../pnacl-sz ./tests_lit/pnacl-sz_tests/<file>.ll

At this time, ``pnacl-sz`` accepts a number of arguments, including the
following:

    ``-help`` -- Show available arguments and possible values.  (Note: this
    unfortunately also pulls in some LLVM-specific options that are reported but
    that Subzero doesn't use.)

    ``-notranslate`` -- Suppress the ICE translation phase, which is useful if
    ICE is missing some support.

    ``-target=<TARGET>`` -- Set the target architecture.  The default is x8632.
    Future targets include x8664, arm32, and arm64.

    ``-filetype=obj|asm|iasm`` -- Select the output file type.  ``obj`` is a
    native ELF file, ``asm`` is a textual assembly file, and ``iasm`` is a
    low-level textual assembly file demonstrating the integrated assembler.

    ``-O<LEVEL>`` -- Set the optimization level.  Valid levels are ``2``, ``1``,
    ``0``, ``-1``, and ``m1``.  Levels ``-1`` and ``m1`` are synonyms, and
    represent the minimum optimization and worst code quality, but fastest code
    generation.

    ``-verbose=<list>`` -- Set verbosity flags.  This argument allows a
    comma-separated list of values.  The default is ``none``, and the value
    ``inst,pred`` will roughly match the .ll bitcode file.  Of particular use
    are ``all``, ``most``, and ``none``.

    ``-o <FILE>`` -- Set the assembly output file name.  Default is stdout.

    ``-log <FILE>`` -- Set the file name for diagnostic output (whose level is
    controlled by ``-verbose``).  Default is stdout.

    ``-timing`` -- Dump some pass timing information after translating the input
    file.
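
As a rough sketch of how the arguments listed above combine, the following
POSIX shell fragment builds one possible command line.  The ``PNACL_SZ`` and
``INPUT`` variables and the file names ``out.s`` and ``translate.log`` are
placeholders chosen for illustration, not names that Subzero defines.  The
command is printed, and is run only if the translator and the input file are
both present::

  # Placeholder locations (assumptions): adjust to your checkout.
  PNACL_SZ="${PNACL_SZ:-./pnacl-sz}"
  INPUT="${INPUT:-./path/to/file.pexe}"

  # Keep the command in the positional parameters so that it is
  # defined once and can be both printed and executed.
  set -- "$PNACL_SZ" -target=x8632 -O2 -filetype=asm \
      -verbose=inst,pred -log translate.log -o out.s "$INPUT"

  # Show the assembled command line.
  echo "$*"

  # Run it only when the translator and the input are both available.
  if [ -x "$PNACL_SZ" ] && [ -f "$INPUT" ]; then
      "$@"
  else
      echo "pnacl-sz or input file not found; command not run" >&2
  fi

Sending the diagnostics to a separate file with ``-log`` keeps them out of the
assembly output, since both default to stdout.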

Running the test suite
----------------------

Subzero uses the LLVM ``lit`` testing tool for part of its test suite, which
lives in ``tests_lit``. To execute the test suite, first build Subzero, and then
run::

  make -f Makefile.standalone check-lit

There is also a suite of cross tests in the ``crosstest`` directory.  A cross
test takes a test bitcode file implementing some unit tests, and translates it
twice, once with Subzero and once with LLVM's known-good ``llc`` translator.
The Subzero-translated symbols are specially mangled to avoid multiple
definition errors from the linker.  Both translated versions are linked together
with a driver program that calls each version of each unit test with a variety
of interesting inputs and compares the results for equality.  The cross tests
are currently invoked by running::

  make -f Makefile.standalone check-xtest

Similarly, there is a suite of unit tests::

  make -f Makefile.standalone check-unit

A convenient way to run the lit, cross, and unit tests is::

  make -f Makefile.standalone check

Assembling ``pnacl-sz`` output as needed
----------------------------------------

``pnacl-sz`` can now produce a native ELF binary using ``-filetype=obj``.

``pnacl-sz`` can also produce textual assembly code in a structure suitable for
input to ``llvm-mc``, using ``-filetype=asm`` or ``-filetype=iasm``.  An object
file can then be produced using the command::

  llvm-mc -triple=i686 -filetype=obj -o=MyObj.o
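
The ``llvm-mc`` command above names no input file, so it reads the assembly
from standard input, and ``pnacl-sz`` writes its output to stdout by default.
The two can therefore be joined with a pipe.  The fragment below is a minimal
sketch of that; the ``PNACL_SZ``, ``INPUT``, and ``OBJ`` variables are
placeholders introduced here for illustration.  ``-verbose`` is left at its
default of ``none`` because diagnostics also go to stdout by default and would
otherwise end up in the assembler's input::

  # Placeholder locations (assumptions): adjust to your checkout.
  PNACL_SZ="${PNACL_SZ:-./pnacl-sz}"
  INPUT="${INPUT:-./path/to/file.pexe}"
  OBJ="${OBJ:-MyObj.o}"

  # Translate to textual assembly on stdout and assemble it from stdin,
  # but only if both tools and the input file can be found.
  if [ -x "$PNACL_SZ" ] && [ -f "$INPUT" ] \
      && command -v llvm-mc >/dev/null 2>&1; then
      "$PNACL_SZ" -filetype=asm "$INPUT" \
          | llvm-mc -triple=i686 -filetype=obj -o="$OBJ"
  else
      echo "pnacl-sz, llvm-mc, or input file not found; nothing assembled" >&2
  fi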

Building a translated binary
----------------------------

There is a helper script, ``pydir/szbuild.py``, that translates a finalized pexe
into a fully linked executable.  Run it with ``-help`` for extensive
documentation.

By default, ``szbuild.py`` builds an executable using only Subzero translation,
but it can also be used to produce hybrid Subzero/``llc`` binaries (``llc`` is
the name of the LLVM translator) for bisection-based debugging.  In bisection
debugging mode, the pexe is translated using both Subzero and ``llc``, and the
resulting object files are combined into a single executable using symbol
weakening and other linker tricks to control which Subzero symbols and which
``llc`` symbols take precedence.  This is controlled by the ``-include`` and
``-exclude`` arguments.  These can be used to rapidly find a single function
that Subzero translates incorrectly, leading to incorrect output.

There is another helper script, ``pydir/szbuild_spec2k.py``, that runs
``szbuild.py`` on one or more components of the Spec2K suite.  This assumes that
Spec2K is set up in the usual place in the Native Client tree, and the finalized
pexe files have been built.  (Note: for working with Spec2K and other pexes,
it's helpful to finalize the pexe using ``--no-strip-syms``, to preserve the
original function and global variable names.)

Status
------

Subzero currently fully supports the x86-32 architecture, for both native and
Native Client sandboxing modes.  The x86-64 architecture is also supported in
native mode only, and only for the x32 flavor because pointers and 32-bit
integers are indistinguishable in PNaCl bitcode.  Sandboxing support for x86-64
is in progress.  ARM and MIPS support is in progress.  Two optimization levels,
``-Om1`` and ``-O2``, are implemented.

The ``-Om1`` configuration is designed to be the simplest and fastest possible,
with a minimal set of passes and transformations.

* Simple phi lowering before target lowering, by generating temporaries and
  adding assignments to the end of predecessor blocks.

* Simple register allocation limited to pre-colored or infinite-weight
  Variables.

The ``-O2`` configuration is designed to use all optimizations available and
produce the best code.

* Address mode inference to leverage the complex x86 addressing modes.

* Compare/branch fusing based on liveness/last-use analysis.

* Global, linear-scan register allocation.

* Advanced phi lowering after target lowering and global register allocation,
  via edge splitting, topological sorting of the parallel moves, and final local
  register allocation.

* Stack slot coalescing to reduce frame size.

* Branch optimization to reduce the number of branches to the following block.