Subzero - Fast code generator for PNaCl bitcode
===============================================

Building
--------

Subzero is set up to be built within the Native Client tree. Follow the
`Developing PNaCl
<https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/developing-pnacl>`_
instructions, in particular the section on building PNaCl sources. This will
prepare the necessary external headers and libraries that Subzero needs.
Checking out the Native Client project also gets the pre-built clang and LLVM
tools in ``native_client/../third_party/llvm-build/Release+Asserts/bin`` which
are used for building Subzero.

The Subzero source is in ``native_client/toolchain_build/src/subzero``. From
within that directory, ``git checkout master && git pull`` to get the latest
version of Subzero source code.

The Makefile is designed to be used as part of the higher level LLVM build
system. To build manually, use the ``Makefile.standalone``. There are several
build configurations from the command line::

  make -f Makefile.standalone
  make -f Makefile.standalone DEBUG=1
  make -f Makefile.standalone NOASSERT=1
  make -f Makefile.standalone DEBUG=1 NOASSERT=1
  make -f Makefile.standalone MINIMAL=1

``DEBUG=1`` builds without optimizations and is good when running the translator
inside a debugger. ``NOASSERT=1`` disables assertions and is the preferred
configuration for performance testing the translator. ``MINIMAL=1`` attempts to
minimize the size of the translator by compiling out everything unnecessary.

The result of the ``make`` command is the target ``llvm2ice`` in the current
directory.

``llvm2ice``
------------

The ``llvm2ice`` program parses a pexe or an LLVM bitcode file and translates it
into ICE (Subzero's intermediate representation). It then invokes the ICE
translate method to lower it to target-specific machine code, optionally dumping
the intermediate representation at various stages of the translation.

The program can be run as follows::

  ../llvm2ice ./path/to/<file>.pexe
  ../llvm2ice ./tests_lit/llvm2ice_tests/<file>.ll

At this time, ``llvm2ice`` accepts a number of arguments, including the
following:

    ``-help`` -- Show available arguments and possible values. (Note: this
    unfortunately also pulls in some LLVM-specific options that are reported but
    that Subzero doesn't use.)

    ``-notranslate`` -- Suppress the ICE translation phase, which is useful if
    ICE is missing some support.

    ``-target=<TARGET>`` -- Set the target architecture. The default is x8632.
    Future targets include x8664, arm32, and arm64.

    ``-integrated-as=0|1`` -- Disable/enable the integrated assembler.

    ``-O<LEVEL>`` -- Set the optimization level. Valid levels are ``2``, ``1``,
    ``0``, ``-1``, and ``m1``. Levels ``-1`` and ``m1`` are synonyms, and
    represent the minimum optimization and worst code quality, but fastest code
    generation.

    ``-verbose=<list>`` -- Set verbosity flags. This argument allows a
    comma-separated list of values. The default is ``none``. The value
    ``inst,pred`` produces output that roughly matches the input .ll bitcode
    file. The values ``all`` and ``none`` are particularly useful.

    ``-o <FILE>`` -- Set the assembly output file name. Default is stdout.

    ``-log <FILE>`` -- Set the file name for diagnostic output (whose level is
    controlled by ``-verbose``). Default is stdout.

    ``-timing`` -- Dump some pass timing information after translating the input
    file.
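
As an illustration, several of the options above can be combined into a single
invocation. The following Python sketch only assembles an argument list from
the flags documented here. The helper name ``llvm2ice_args``, the file names,
and the assumption that flags may precede the input file are illustrative and
not part of Subzero::

  def llvm2ice_args(input_path, opt_level=None, target=None, verbose=None,
                    output=None, log=None, timing=False):
      """Return an argv list for llvm2ice built from the documented flags."""
      args = ["../llvm2ice"]
      if target is not None:
          args.append("-target=" + target)
      if opt_level is not None:
          args.append("-O" + opt_level)
      if verbose:
          # -verbose takes a comma-separated list of values.
          args.append("-verbose=" + ",".join(verbose))
      if output is not None:
          args += ["-o", output]
      if log is not None:
          args += ["-log", log]
      if timing:
          args.append("-timing")
      args.append(input_path)
      return args

  # Hypothetical input and output file names.
  cmd = llvm2ice_args("./path/to/hello.pexe", opt_level="2",
                      verbose=["inst", "pred"],
                      output="hello.s", log="hello.log")
  print(" ".join(cmd))

Passing the resulting list to ``subprocess.run`` would execute the translator,
assuming ``../llvm2ice`` has been built as described in the Building section.
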

Running the test suite
----------------------

Subzero uses the LLVM ``lit`` testing tool for part of its test suite, which
lives in ``tests_lit``. To execute the test suite, first build Subzero, and then
run::

  make -f Makefile.standalone check-lit

There is also a suite of cross tests in the ``crosstest`` directory. A cross
test takes a test bitcode file implementing some unit tests, and translates it
twice, once with Subzero and once with LLVM's known-good ``llc`` translator.
The Subzero-translated symbols are specially mangled to avoid multiple
definition errors from the linker. Both translated versions are linked together
with a driver program that calls each version of each unit test with a variety
of interesting inputs and compares the results for equality. The cross tests
are currently invoked by running the ``runtests.sh`` script.
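
The comparison performed by a cross test driver can be pictured with a small
Python analogy. This is only a sketch of the idea: the real cross tests link
native object code, and every name below (``cross_check``, ``add32_reference``,
``add32_buggy``, the input list) is hypothetical::

  MASK32 = 0xFFFFFFFF

  def cross_check(reference, candidate, inputs):
      """Call both versions on every input and collect any mismatches."""
      mismatches = []
      for args in inputs:
          expected = reference(*args)
          actual = candidate(*args)
          if expected != actual:
              mismatches.append((args, expected, actual))
      return mismatches

  def add32_reference(a, b):
      # Stands in for the known-good translation: 32-bit wrapping add.
      return (a + b) & MASK32

  def add32_buggy(a, b):
      # Stands in for a faulty translation: forgets to wrap on overflow.
      return a + b

  # Boundary values are the kind of "interesting inputs" a driver would try.
  interesting = [(0, 0), (1, MASK32), (0x7FFFFFFF, 1), (MASK32, MASK32)]
  failures = cross_check(add32_reference, add32_buggy, interesting)
  for args, expected, actual in failures:
      print(args, hex(expected), hex(actual))

An empty mismatch list corresponds to a passing cross test.
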

A convenient way to run both the lit tests and the cross tests is::

  make -f Makefile.standalone check

Assembling ``llvm2ice`` output
------------------------------

Currently ``llvm2ice`` produces textual assembly code in a structure suitable
for input to ``llvm-mc``. An object file can be produced using the command::

  llvm-mc -arch=x86 -filetype=obj -o=MyObj.o

In the future, the integrated assembler will directly produce ELF object files.

Building a translated binary
----------------------------

There is a helper script, ``pydir/szbuild.py``, that translates a finalized pexe
into a fully linked executable. Run it with ``-help`` for extensive
documentation.

By default, ``szbuild.py`` builds an executable using only Subzero translation,
but it can also be used to produce hybrid Subzero/``llc`` binaries (``llc`` is
the name of the LLVM translator) for bisection-based debugging. In bisection
debugging mode, the pexe is translated using both Subzero and ``llc``, and the
resulting object files are combined into a single executable using symbol
weakening and other linker tricks to control which Subzero symbols and which
``llc`` symbols take precedence. This is controlled by the ``-include`` and
``-exclude`` arguments. These can be used to rapidly find a single function
that Subzero translates incorrectly, leading to incorrect output.
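
The bisection idea can be sketched in a few lines of Python. This is not how
``szbuild.py`` is implemented or invoked. The callback ``works_with_subzero``
is a hypothetical stand-in for "build a hybrid binary in which only the given
functions come from Subzero, run it, and check the output", and the sketch
assumes exactly one function is translated incorrectly::

  def find_bad_function(functions, works_with_subzero):
      """Narrow a list of function names down to the single faulty one."""
      candidates = list(functions)
      while len(candidates) > 1:
          half = candidates[:len(candidates) // 2]
          if works_with_subzero(half):
              # This half is fine, so the culprit is in the other half.
              candidates = candidates[len(half):]
          else:
              candidates = half
      return candidates[0]

  # Simulated program in which the function "baz" is miscompiled.
  names = ["main", "foo", "bar", "baz", "qux"]
  culprit = find_bad_function(names, lambda subset: "baz" not in subset)
  print(culprit)

Each step halves the candidate set, so the number of hybrid builds grows only
logarithmically with the number of functions in the pexe.
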

There is another helper script, ``pydir/szbuild_spec2k.py``, that runs
``szbuild.py`` on one or more components of the Spec2K suite. This assumes that
Spec2K is set up in the usual place in the Native Client tree, and the finalized
pexe files have been built. (Note: for working with Spec2K and other pexes,
it's helpful to finalize the pexe using ``--no-strip-syms``, to preserve the
original function and global variable names.)

Status
------

Subzero currently translates only for the x86-32 architecture. Native Client
sandboxing is not yet implemented. Two optimization levels, ``-Om1`` and
``-O2``, are implemented.

The ``-Om1`` configuration is designed to be the simplest and fastest possible,
with a minimal set of passes and transformations.

* Simple Phi lowering before target lowering, by generating temporaries and
  adding assignments to the end of predecessor blocks.

* Simple register allocation limited to pre-colored and infinite-weight
  Variables.
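
The simple Phi lowering described above can be illustrated with a toy Python
model. The data layout (dictionaries of blocks, tuples for instructions), the
``_phi`` suffix for temporaries, and the function name ``lower_phis`` are
assumptions made for this sketch and do not reflect Subzero's actual data
structures::

  def lower_phis(blocks):
      """Replace phis with assignments through a per-phi temporary.

      blocks maps a block name to {"phis": [(dest, {pred: src})],
      "insts": [...]}. The last entry of "insts" is assumed to be the
      block's terminator.
      """
      lowered = {name: list(b["insts"]) for name, b in blocks.items()}
      for name, block in blocks.items():
          copies_in = []
          for dest, incoming in block["phis"]:
              temp = dest + "_phi"
              for pred, src in incoming.items():
                  # Add "temp = src" just before the predecessor's terminator.
                  insts = lowered[pred]
                  insts.insert(len(insts) - 1, ("assign", temp, src))
              # The phi itself becomes "dest = temp" at the top of the block.
              copies_in.append(("assign", dest, temp))
          lowered[name] = copies_in + lowered[name]
      return lowered

  example = {
      "entry": {"phis": [], "insts": [("br", "cond", "left", "right")]},
      "left": {"phis": [], "insts": [("assign", "x", "1"), ("jmp", "join")]},
      "right": {"phis": [], "insts": [("assign", "y", "2"), ("jmp", "join")]},
      "join": {"phis": [("a", {"left": "x", "right": "y"})],
               "insts": [("ret", "a")]},
  }
  lowered_example = lower_phis(example)
  for name, insts in lowered_example.items():
      print(name, insts)

Routing each value through a temporary keeps the lowering simple: the
assignments in the predecessors never overwrite a variable that another Phi in
the same block still needs to read.
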

The ``-O2`` configuration is designed to use all optimizations available and
produce the best code.

* Address mode inference to leverage the complex x86 addressing modes.

* Compare/branch fusing based on liveness/last-use analysis.

* Global, linear-scan register allocation.

* Advanced phi lowering after target lowering and global register allocation,
  via edge splitting, topological sorting of the parallel moves, and final local
  register allocation.

* Stack slot coalescing to reduce frame size.

* Branch optimization to reduce the number of branches to the following block.
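
To give a feel for the "topological sorting of the parallel moves" step in the
advanced phi lowering, here is a generic Python sketch of ordering a set of
parallel moves so that no source is overwritten before it is read. It is a
textbook-style approach, not Subzero's code: the function name
``sequentialize`` and the use of a single scratch location ``tmp`` to break
cycles are assumptions of this sketch::

  def sequentialize(moves, temp="tmp"):
      """Turn parallel moves {dest: src} into an ordered list of (dest, src)."""
      pending = {d: s for d, s in moves.items() if d != s}  # drop no-op moves
      ordered = []
      while pending:
          sources = set(pending.values())
          ready = [d for d in pending if d not in sources]
          if ready:
              # Nothing still needs to read this destination, so write it now.
              dest = ready[0]
              ordered.append((dest, pending.pop(dest)))
          else:
              # Only cycles remain. Save one destination's old value in the
              # scratch location and redirect its readers to the scratch copy.
              dest = next(iter(pending))
              ordered.append((temp, dest))
              for d in list(pending):
                  if pending[d] == dest:
                      pending[d] = temp
      return ordered

  # A swap of a and b, plus c reading the old value of a.
  sequence = sequentialize({"a": "b", "b": "a", "c": "a"})
  print(sequence)

Moves that form a chain are emitted in dependency order, and a cycle such as a
swap costs one extra move through the scratch location.
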