<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
				2	"http://www.w3.org/TR/html4/strict.dtd">
				3	<html>
				4	<head>
				5	<title>LLVM Link Time Optimization: Design and Implementation</title>
				6	<link rel="stylesheet" href="llvm.css" type="text/css">
				7	</head>
				8
				9	<h1>
				10	LLVM Link Time Optimization: Design and Implementation
				11	</h1>
				12
				13	<ul>
				14	<li><a href="#desc">Description</a></li>
				15	<li><a href="#design">Design Philosophy</a>
				16	<ul>
				17	<li><a href="#example1">Example of link time optimization</a></li>
				18	<li><a href="#alternative_approaches">Alternative Approaches</a></li>
				19	</ul></li>
<li><a href="#multiphase">Multi-phase communication between libLTO and linker</a>
				21	<ul>
				22	<li><a href="#phase1">Phase 1 : Read LLVM Bitcode Files</a></li>
				23	<li><a href="#phase2">Phase 2 : Symbol Resolution</a></li>
				24	<li><a href="#phase3">Phase 3 : Optimize Bitcode Files</a></li>
				25	<li><a href="#phase4">Phase 4 : Symbol Resolution after optimization</a></li>
				26	</ul></li>
				27	<li><a href="#lto">libLTO</a>
				28	<ul>
				29	<li><a href="#lto_module_t">lto_module_t</a></li>
				30	<li><a href="#lto_code_gen_t">lto_code_gen_t</a></li>
</ul></li>
				32	</ul>
				33
				34	<div class="doc_author">
				35	<p>Written by Devang Patel and Nick Kledzik</p>
				36	</div>
				37
				38	<!-- *********************************************************************** -->
				39	<h2>
				40	<a name="desc">Description</a>
				41	</h2>
				42	<!-- *********************************************************************** -->
				43
				44	<div>
				45	<p>
				46	LLVM features powerful intermodular optimizations which can be used at link
				47	time. Link Time Optimization (LTO) is another name for intermodular optimization
				48	when performed during the link stage. This document describes the interface
				49	and design between the LTO optimizer and the linker.</p>
				50	</div>
				51
				52	<!-- *********************************************************************** -->
				53	<h2>
				54	<a name="design">Design Philosophy</a>
				55	</h2>
				56	<!-- *********************************************************************** -->
				57
				58	<div>
				59	<p>
The LLVM Link Time Optimizer performs intermodular optimization transparently
within the compiler tool chain. Its main goal is to let the developer take
advantage of intermodular optimizations without making any significant changes
to the developer's makefiles or build system. This is achieved through tight
integration with the linker. In this model, the linker treats LLVM bitcode
files like native object files and allows mixing and matching among them. The
linker uses <a href="#lto">libLTO</a>, a shared object, to handle LLVM bitcode
files. This tight integration between the linker and the LLVM optimizer enables
optimizations that are not possible in other models. Because the linker tells
the optimizer which symbols are actually referenced, the optimizer does not
have to rely on conservative escape analysis.
				71	</p>
				72
				73	<!-- ======================================================================= -->
				74	<h3>
				75	<a name="example1">Example of link time optimization</a>
				76	</h3>
				77
				78	<div>
<p>The following example illustrates the advantages of LTO's integrated
approach and clean interface. This example requires a system linker which
supports LTO through the interface described in this document. Here,
clang transparently invokes the system linker.</p>
<ul>
<li>Input source file <tt>a.c</tt> is compiled into LLVM bitcode form.</li>
<li>Input source file <tt>main.c</tt> is compiled into native object code.</li>
</ul>
<pre class="doc_code">
--- a.h ---
extern int foo1(void);
extern void foo2(void);
extern void foo4(void);

--- a.c ---
#include "a.h"

static signed int i = 0;

void foo2(void) {
  i = -1;
}

static int foo3(void) {
  foo4();
  return 10;
}

int foo1(void) {
  int data = 0;

  if (i &lt; 0)
    data = foo3();

  data = data + 42;
  return data;
}

--- main.c ---
#include &lt;stdio.h&gt;
#include "a.h"

void foo4(void) {
  printf("Hi\n");
}

int main(void) {
  return foo1();
}

--- command lines ---
$ clang -emit-llvm -c a.c -o a.o   # &lt;-- a.o is LLVM bitcode file
$ clang -c main.c -o main.o        # &lt;-- main.o is native object file
$ clang a.o main.o -o main         # &lt;-- standard link command without any modifications
</pre>
				134
				135	<ul>
<li>In this example, the linker recognizes that <tt>foo2()</tt> is an
externally visible symbol defined in an LLVM bitcode file. The linker
completes its usual symbol resolution pass and finds that <tt>foo2()</tt>
is not used anywhere. The LLVM optimizer uses this information to
remove <tt>foo2()</tt>.</li>
<li>Once <tt>foo2()</tt> is removed, nothing writes to <tt>i</tt>, so the
optimizer recognizes that the condition <tt>i &lt; 0</tt> is always false, which
means <tt>foo3()</tt> is never called. Hence, the optimizer also removes
<tt>foo3()</tt>.</li>
<li>This, in turn, enables the linker to remove <tt>foo4()</tt>.</li>
				145	</ul>
				146
<p>This example illustrates the advantage of tight integration with the
linker. Here, the optimizer cannot remove <tt>foo3()</tt> without the
linker's input.</p>
				150
				151	</div>
				152
				153	<!-- ======================================================================= -->
				154	<h3>
				155	<a name="alternative_approaches">Alternative Approaches</a>
				156	</h3>
				157
				158	<div>
				159	<dl>
				160	<dt><b>Compiler driver invokes link time optimizer separately.</b></dt>
<dd>In this model the link time optimizer is not able to take advantage of
information collected during the linker's normal symbol resolution phase.
In the above example, the optimizer cannot remove <tt>foo2()</tt> without
the linker's input because it is externally visible. This in turn prohibits
the optimizer from removing <tt>foo3()</tt>.</dd>
<dt><b>Use a separate tool to collect symbol information from all object
files.</b></dt>
<dd>In this model, a new, separate tool or library replicates the linker's
capability to collect information for link time optimization. Not only is
this code duplication difficult to justify, but it also has several other
disadvantages. For example, linking semantics and linker features
differ from platform to platform. The new tool would therefore have to
support all such features and platforms in one super tool, or a separate
tool would be required per platform. This significantly and unnecessarily
increases the maintenance cost of the link time optimizer. This approach
also requires staying synchronized with linker developments on various
platforms, which is not the main focus of the link time optimizer. Finally,
this approach increases the end user's build time due to the duplication of
work done by this separate tool and the linker itself.
</dd>
				181	</dl>
				182	</div>
				183
				184	</div>
				185
				186	<!-- *********************************************************************** -->
				187	<h2>
				188	<a name="multiphase">Multi-phase communication between libLTO and linker</a>
				189	</h2>
				190
				191	<div>
<p>The linker collects information about symbol definitions and uses in
various link objects, and this information is more accurate than anything
collected by other tools during typical build cycles. The linker collects this
information by looking at the definitions and uses of symbols in native .o
files and using symbol visibility information. The linker also uses
user-supplied information, such as a list of exported symbols. The LLVM
optimizer collects control flow information and data flow information, and
knows much more about program structure from the optimizer's point of view.
Our goal is to take advantage of tight integration between the linker and
the optimizer by sharing this information during various linking phases.
				202	</p>
				203
				204	<!-- ======================================================================= -->
				205	<h3>
				206	<a name="phase1">Phase 1 : Read LLVM Bitcode Files</a>
				207	</h3>
				208
				209	<div>
<p>The linker first reads all object files in natural order and collects
symbol information. This includes native object files as well as LLVM bitcode
files. To minimize the cost to the linker in the case that all .o files
are native object files, the linker only calls <tt>lto_module_create()</tt>
when a supplied object file is found not to be a native object file. If
<tt>lto_module_create()</tt> reports that the file is an LLVM bitcode file,
the linker then iterates over the module using
<tt>lto_module_get_symbol_name()</tt> and
<tt>lto_module_get_symbol_attribute()</tt> to get all symbols defined and
referenced. This information is added to the linker's global symbol table.
				221	</p>
<p>The <tt>lto_*</tt> functions are all implemented in a shared object, libLTO.
This allows the LLVM LTO code to be updated independently of the linker tool.
On platforms that support it, the shared object is lazily loaded.
				225	</p>
				226	</div>
				227
				228	<!-- ======================================================================= -->
				229	<h3>
				230	<a name="phase2">Phase 2 : Symbol Resolution</a>
				231	</h3>
				232
				233	<div>
<p>In this stage, the linker resolves symbols using the global symbol table.
It may report undefined symbol errors, read archive members, replace
weak symbols, etc. The linker is able to do this seamlessly even though it
does not know the exact content of input LLVM bitcode files. If dead code
stripping is enabled then the linker collects the list of live symbols.
				239	</p>
				240	</div>
				241
				242	<!-- ======================================================================= -->
				243	<h3>
				244	<a name="phase3">Phase 3 : Optimize Bitcode Files</a>
				245	</h3>
				246	<div>
<p>After symbol resolution, the linker tells the LTO shared object which
symbols are needed by native object files. In the example above, the linker
uses <tt>lto_codegen_add_must_preserve_symbol()</tt> to report that only
<tt>foo1()</tt> is used by native object files. Next the linker invokes
the LLVM optimizer and code generators using <tt>lto_codegen_compile()</tt>,
which returns a native object file created by merging the LLVM bitcode files
and applying various optimization passes.
				254	</p>
				255	</div>
				256
				257	<!-- ======================================================================= -->
				258	<h3>
				259	<a name="phase4">Phase 4 : Symbol Resolution after optimization</a>
				260	</h3>
				261
				262	<div>
<p>In this phase, the linker reads the optimized native object file and
updates the internal global symbol table to reflect any changes. The linker
also collects information about any changes in the use of external symbols by
LLVM bitcode files. In the example above, the linker notes that
<tt>foo4()</tt> is no longer used. If dead code stripping is enabled then
the linker refreshes the live symbol information appropriately and performs
dead code stripping.</p>
				270	<p>After this phase, the linker continues linking as if it never saw LLVM
				271	bitcode files.</p>
				272	</div>
				273
				274	</div>
				275
				276	<!-- *********************************************************************** -->
				277	<h2>
				278	<a name="lto">libLTO</a>
				279	</h2>
				280
				281	<div>
<p><tt>libLTO</tt> is a shared object that is part of the LLVM tools, and
is intended for use by a linker. <tt>libLTO</tt> provides an abstract C
interface to use the LLVM interprocedural optimizer without exposing details
of LLVM's internals. The intention is to keep the interface as stable as
possible even as the LLVM optimizer continues to evolve. It should even
be possible for a completely different compilation technology to provide
a different libLTO that works with its own object files and the standard
linker tool.</p>
				290
				291	<!-- ======================================================================= -->
				292	<h3>
				293	<a name="lto_module_t">lto_module_t</a>
				294	</h3>
				295
				296	<div>
				297
				298	<p>A non-native object file is handled via an <tt>lto_module_t</tt>.
				299	The following functions allow the linker to check if a file (on disk
				300	or in a memory buffer) is a file which libLTO can process:</p>
				301
<pre class="doc_code">
lto_module_is_object_file(const char*)
lto_module_is_object_file_for_target(const char*, const char*)
lto_module_is_object_file_in_memory(const void*, size_t)
lto_module_is_object_file_in_memory_for_target(const void*, size_t, const char*)
</pre>
				308
<p>If the object file can be processed by libLTO, the linker creates an
<tt>lto_module_t</tt> by using one of</p>
				311
				312	<pre class="doc_code">
				313	lto_module_create(const char*)
				314	lto_module_create_from_memory(const void*, size_t)
				315	</pre>
				316
				317	<p>and when done, the handle is released via</p>
				318
				319	<pre class="doc_code">
				320	lto_module_dispose(lto_module_t)
				321	</pre>
				322
				323	<p>The linker can introspect the non-native object file by getting the number of
				324	symbols and getting the name and attributes of each symbol via:</p>
				325
				326	<pre class="doc_code">
				327	lto_module_get_num_symbols(lto_module_t)
				328	lto_module_get_symbol_name(lto_module_t, unsigned int)
				329	lto_module_get_symbol_attribute(lto_module_t, unsigned int)
				330	</pre>
				331
				332	<p>The attributes of a symbol include the alignment, visibility, and kind.</p>
				333	</div>
				334
				335	<!-- ======================================================================= -->
				336	<h3>
				337	<a name="lto_code_gen_t">lto_code_gen_t</a>
				338	</h3>
				339
				340	<div>
				341
<p>Once the linker has loaded each non-native object file into an
<tt>lto_module_t</tt>, it can ask libLTO to process them all and
generate a native object file. This is done in a couple of steps.
First, a code generator is created with:</p>
				346
				347	<pre class="doc_code">lto_codegen_create()</pre>
				348
				349	<p>Then, each non-native object file is added to the code generator with:</p>
				350
				351	<pre class="doc_code">
				352	lto_codegen_add_module(lto_code_gen_t, lto_module_t)
				353	</pre>
				354
<p>The linker then has the option of setting some codegen options. Whether or
not to generate DWARF debug info is set with:</p>

<pre class="doc_code">lto_codegen_set_debug_model(lto_code_gen_t, lto_debug_model)</pre>

<p>The kind of position independence is set with:</p>

<pre class="doc_code">lto_codegen_set_pic_model(lto_code_gen_t, lto_codegen_model)</pre>
				363
				364	<p>And each symbol that is referenced by a native object file or otherwise must
				365	not be optimized away is set with:</p>
				366
				367	<pre class="doc_code">
				368	lto_codegen_add_must_preserve_symbol(lto_code_gen_t, const char*)
				369	</pre>
				370
				371	<p>After all these settings are done, the linker requests that a native object
				372	file be created from the modules with the settings using:</p>
				373
<pre class="doc_code">lto_codegen_compile(lto_code_gen_t, size_t*)</pre>
				375
				376	<p>which returns a pointer to a buffer containing the generated native
				377	object file. The linker then parses that and links it with the rest
				378	of the native object files.</p>
				379
				380	</div>
				381
				382	</div>
				383
				384	<!-- *********************************************************************** -->
				385
				386	<hr>
				387	<address>
				388	<a href="http://jigsaw.w3.org/css-validator/check/referer"><img
				389	src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
				390	<a href="http://validator.w3.org/check/referer"><img
				391	src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
				392
				393	Devang Patel and Nick Kledzik<br>
				394	<a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br>
				395	Last modified: $Date: 2011-09-18 08:51:05 -0400 (Sun, 18 Sep 2011) $
				396	</address>
				397
				398	</body>
				399	</html>
				400