Blame - src/README.SIMD.rst - SwiftShader

Missing support
===============

* The PNaCl LLVM backend expands shufflevector operations into
  sequences of insertelement and extractelement operations. For
  instance::

    define <4 x i32> @shuffle(<4 x i32> %arg1, <4 x i32> %arg2) {
    entry:
      %res = shufflevector <4 x i32> %arg1, <4 x i32> %arg2, <4 x i32> <i32 4, i32 5, i32 0, i32 1>
      ret <4 x i32> %res
    }

  gets expanded into::

    define <4 x i32> @shuffle(<4 x i32> %arg1, <4 x i32> %arg2) {
    entry:
      %0 = extractelement <4 x i32> %arg2, i32 0
      %1 = insertelement <4 x i32> undef, i32 %0, i32 0
      %2 = extractelement <4 x i32> %arg2, i32 1
      %3 = insertelement <4 x i32> %1, i32 %2, i32 1
      %4 = extractelement <4 x i32> %arg1, i32 0
      %5 = insertelement <4 x i32> %3, i32 %4, i32 2
      %6 = extractelement <4 x i32> %arg1, i32 1
      %7 = insertelement <4 x i32> %5, i32 %6, i32 3
      ret <4 x i32> %7
    }

  Subzero should recognize these sequences and recombine them into
  shuffle operations where appropriate.

* Add support for vector constants in the backend. The current code
  materializes the vector constants it needs (e.g. for performing icmp
  on unsigned operands) using register operations, but it should
  instead load them from a constant pool when the register
  initialization is too complicated (such as in
  TargetX8632::makeVectorOfHighOrderBits()).

* [x86 specific] llvm-mc does not allow lea to take a mem128 memory
  operand when assembling x86-32 code. The current
  InstX8632Lea::emit() code uses Variable::asType() to convert any
  mem128 Variables into a compatible memory operand type. However, the
  emit code does not convert OperandX8632Mem operands, so if an
  OperandX8632Mem is passed to lea as mem128, the resulting code will
  not assemble. One way to fix this is to implement
  OperandX8632Mem::asType().

* [x86 specific] Lower shl on <4 x i32> using the float-conversion
  trick described at:
  http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100726/105087.html

* [x86 specific] Add support for using aligned mov operations
  (movaps). This will require passing alignment information to loads
  and stores.
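
The shufflevector item at the top of this list asks Subzero to recombine insertelement/extractelement chains into a single shuffle. The sketch below shows one way the mask could be recovered. It assumes a simplified model in which each extract/insert pair has already been summarized as "copy lane srcLane of operand srcOperand into lane dstLane". The ``LaneCopy`` struct and ``recoverShuffleMask`` function are hypothetical names for illustration and are not part of Subzero.

```cpp
#include <array>
#include <vector>

// Hypothetical summary of one extractelement/insertelement pair.
struct LaneCopy {
  int srcOperand; // 0 = first vector operand, 1 = second vector operand
  int srcLane;    // lane read by extractelement
  int dstLane;    // lane written by insertelement
};

const int kNumLanes = 4;

// Fills `mask` with the shufflevector mask equivalent to `chain`.
// Returns false if any index is out of range, or if the chain does not
// write every destination lane exactly once (this sketch deliberately
// rejects overwritten lanes instead of tracking the last writer).
bool recoverShuffleMask(const std::vector<LaneCopy> &chain,
                        std::array<int, kNumLanes> &mask) {
  mask.fill(-1);
  for (const LaneCopy &c : chain) {
    if (c.srcOperand < 0 || c.srcOperand > 1)
      return false;
    if (c.srcLane < 0 || c.srcLane >= kNumLanes)
      return false;
    if (c.dstLane < 0 || c.dstLane >= kNumLanes)
      return false;
    if (mask[c.dstLane] != -1)
      return false; // lane written twice
    // shufflevector numbers the lanes of the second operand after
    // those of the first, so operand 1 lane k has index kNumLanes + k.
    mask[c.dstLane] = c.srcOperand * kNumLanes + c.srcLane;
  }
  for (int m : mask)
    if (m == -1)
      return false; // some lane was never written
  return true;
}
```

For the expanded example in that item, the four copies (%arg2 lane 0 to lane 0, %arg2 lane 1 to lane 1, %arg1 lane 0 to lane 2, %arg1 lane 1 to lane 3) give back the original mask <4, 5, 0, 1>.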
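
The float-conversion lowering of shl mentioned in the list above is usually described as follows: build the float 2^n directly from its bit pattern, convert it to an integer, and multiply. The scalar model below is a sketch of that idea under stated assumptions, not a transcription of the linked patch. ``shlViaFloat`` and ``shlV4ViaFloat`` are hypothetical names, and shift amounts are assumed to lie in [0, 31].

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Computes x << n for n in [0, 31] using only a shift by a constant,
// a float-to-integer conversion, and a multiply.
// An IEEE-754 single stores its exponent starting at bit 23 with a bias
// of 127, and 0x3f800000 is the bit pattern of 1.0f (exponent 127), so
// adding n << 23 yields the bit pattern of 2^n.
inline std::uint32_t shlViaFloat(std::uint32_t x, std::uint32_t n) {
  std::uint32_t bits = (n << 23) + 0x3f800000u;
  float pow2;
  std::memcpy(&pow2, &bits, sizeof(pow2)); // reinterpret bits as float
  // 2^n is exactly representable and at most 2^31, so it fits uint32_t.
  std::uint32_t factor = static_cast<std::uint32_t>(pow2);
  return x * factor; // multiplication modulo 2^32 equals the shift
}

// Lane-wise model of the <4 x i32> case: each lane has its own amount.
inline std::array<std::uint32_t, 4>
shlV4ViaFloat(const std::array<std::uint32_t, 4> &x,
              const std::array<std::uint32_t, 4> &n) {
  std::array<std::uint32_t, 4> r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = shlViaFloat(x[i], n[i]);
  return r;
}
```

The appeal on x86 is that SSE2 can shift every lane by the same amount but has no shift with a separate amount per lane, while the shift by 23, the add, the conversion, and the multiply all have vector forms (the 32-bit lane multiply may itself need to be synthesized on plain SSE2).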
x86 SIMD Diversification
========================

* Vector "bitwise" operations have several variant instructions: the
  AND operation can be implemented with pand, andpd, or andps. This
  pattern also holds for ANDN, OR, and XOR.

* Vector "mov" instructions can be diversified (e.g. movdqu instead of
  movups) at the cost of a possible performance penalty.

* Scalar FP arithmetic can be diversified by performing the operations
  with the vector version of the instructions.
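
The bitwise case above amounts to choosing one of several equivalent opcodes each time an operation is emitted. The sketch below illustrates such a choice, assuming it is driven by a seeded pseudo-random generator. The ``BitwiseOp`` enum and ``pickBitwiseMnemonic`` function are hypothetical and not Subzero APIs; the mnemonics are the SSE/SSE2 instructions named above together with their ANDN, OR, and XOR counterparts.

```cpp
#include <random>
#include <string>

// Hypothetical enumeration of the vector bitwise operations.
enum class BitwiseOp { And = 0, Andn = 1, Or = 2, Xor = 3 };

// Returns one of the three equivalent mnemonics for `op`, chosen by
// `rng`. The same seed reproduces the same sequence of choices.
inline std::string pickBitwiseMnemonic(BitwiseOp op, std::mt19937 &rng) {
  static const char *const kVariants[4][3] = {
      {"pand", "andps", "andpd"},    // And
      {"pandn", "andnps", "andnpd"}, // Andn
      {"por", "orps", "orpd"},       // Or
      {"pxor", "xorps", "xorpd"},    // Xor
  };
  std::uniform_int_distribution<int> pick(0, 2);
  return kVariants[static_cast<int>(op)][pick(rng)];
}
```

Seeding the generator explicitly keeps the output reproducible for a given seed, which helps when debugging a diversified build.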