Don't make use of cvtps2dq in MSan builds

MemorySanitizer's instrumentation currently does not handle the
cvtps2dq intrinsic/instruction. It falls back to checking the source
operand for any uninitialized data. In shaders we can have conditional
code which causes only some SIMD lanes to be initialized, and only
those lanes are logically used in further computations. So it's valid
in these cases to perform operations that use cvtps2dq on a partially
initialized vector, but MSan will report it as an error.

This can be worked around by avoiding the use of cvtps2dq and instead
relying on lowerRoundInt(), which translates into the nearbyint
intrinsic followed by an FPToSI. MemorySanitizer handles the former in
the maybeHandleSimpleNomemIntrinsic() method, which copies the shadow
of the source vector to the destination vector (i.e. it propagates it
without checking it).

Note that cvtps2dq does not follow the same code path as nearbyint
because it has different types for the source and destination vector.
We could handle it explicitly by doing the same shadow propagation.
Considering that the workaround in Reactor simply results in using two
cheap instructions instead of one, it's not performance critical to
fix this in LLVM.

The rr::RoundIntClamped() intrinsic required a fix in the fallback
path: 0x80000000 was meant to represent -2147483648 but was instead
converted to a positive float value of 2147483648.0f. Explicitly
casting it to int first produces the desired negative integer value
before the conversion to float.

Bug: b/172238865
Change-Id: I4f07bb8cb6d25d914dab836f64510f8b2bad18ba
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/65608
Kokoro-Result: kokoro <noreply+kokoro@google.com>
Reviewed-by: Alexis Hétu <sugoi@google.com>
Tested-by: Nicolas Capens <nicolascapens@google.com>
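
As a rough illustration of the workaround described in the message, the following scalar sketch models the guard and the fallback path (round with nearbyint, then convert to a signed integer). It is not Reactor code: `kUseCvtps2dq` and `roundToIntFallback()` are hypothetical names introduced here, and the `__has_feature` fallback definition is an assumption for compilers that do not provide that macro.

```cpp
#include <cmath>
#include <cstdint>

// Assumption: compilers without __has_feature get a stub that reports
// every feature as absent, so the #if below still preprocesses.
#ifndef __has_feature
#	define __has_feature(x) 0
#endif

// Same condition as in the diff: take the cvtps2dq path only on x86
// and only when not building with MemorySanitizer.
#if(defined(__i386__) || defined(__x86_64__)) && !__has_feature(memory_sanitizer)
constexpr bool kUseCvtps2dq = true;  // hypothetical name
#else
constexpr bool kUseCvtps2dq = false;
#endif

// Scalar model of the fallback: nearbyint followed by a float-to-int
// conversion. nearbyint honors the current rounding mode, which is
// round-to-nearest-even unless the program changed it.
// The caller must keep x within the range of int32_t.
int32_t roundToIntFallback(float x)
{
	return static_cast<int32_t>(std::nearbyint(x));
}
```

Under the default rounding mode, `roundToIntFallback(2.5f)` and `roundToIntFallback(3.5f)` round to the nearest even integer, which matches what cvtps2dq does with the default MXCSR setting.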

commit: 1ca6698b903cab2091dd41a81d8fc4a445e378ff
author: Nicolas Capens <capn@google.com> Wed May 04 17:19:46 2022 -0400
committer: Nicolas Capens <nicolascapens@google.com> Sat May 07 03:58:17 2022 +0000
tree: 0c413a412e22bccb96c02636b5464b976603de81
parent: 74e34ab97aeb2a6b68d52a9e0112968b33b509ea
diff --git a/src/Reactor/LLVMReactor.cpp b/src/Reactor/LLVMReactor.cpp
index 77b9fbc..05002a6 100644
--- a/src/Reactor/LLVMReactor.cpp
+++ b/src/Reactor/LLVMReactor.cpp

@@ -2671,7 +2671,7 @@
 RValue<Int4> RoundInt(RValue<Float4> cast)
 {
 	RR_DEBUG_INFO_UPDATE_LOC();
-#if defined(__i386__) || defined(__x86_64__)
+#if(defined(__i386__) || defined(__x86_64__)) && !__has_feature(memory_sanitizer)
 	return x86::cvtps2dq(cast);
 #else
 	return As<Int4>(V(lowerRoundInt(V(cast.value()), T(Int4::type()))));
@@ -2683,7 +2683,7 @@
 	RR_DEBUG_INFO_UPDATE_LOC();
 
 // TODO(b/165000222): Check if fptosi_sat produces optimal code for x86 and ARM.
-#if defined(__i386__) || defined(__x86_64__)
+#if(defined(__i386__) || defined(__x86_64__)) && !__has_feature(memory_sanitizer)
 	// cvtps2dq produces 0x80000000, a negative value, for input larger than
 	// 2147483520.0, so clamp to 2147483520. Values less than -2147483520.0
 	// saturate to 0x80000000.
@@ -2698,7 +2698,7 @@
 	    jit->module.get(), llvm::Intrinsic::fptosi_sat, { T(Int4::type()), T(Float4::type()) });
 	return RValue<Int4>(V(jit->builder->CreateCall(fptosi_sat, { rounded })));
 #else
-	RValue<Float4> clamped = Max(Min(cast, Float4(0x7FFFFF80)), Float4(0x80000000));
+	RValue<Float4> clamped = Max(Min(cast, Float4(0x7FFFFF80)), Float4(static_cast<int>(0x80000000)));
 	return As<Int4>(V(lowerRoundInt(V(clamped.value()), T(Int4::type()))));
 #endif
 }
@@ -3591,6 +3591,8 @@
 
 RValue<Int4> cvtps2dq(RValue<Float4> val)
 {
+	ASSERT(!__has_feature(memory_sanitizer));  // TODO(b/172238865): Not correctly instrumented by MemorySanitizer.
+
 	return RValue<Int4>(createInstruction(llvm::Intrinsic::x86_sse2_cvtps2dq, val.value()));
 }
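
The RoundIntClamped() fallback fix in the diff above comes down to the type of the literal. The sketch below reproduces the two conversions with plain `float` instead of `Float4`. The function names are hypothetical, and it assumes a 32-bit `int` and a compiler where the out-of-range unsigned-to-int conversion wraps (implementation-defined before C++20, guaranteed from C++20 on).

```cpp
#include <cassert>

// Before the fix: 0x80000000 does not fit in a 32-bit int, so the
// literal has type unsigned int and converts to a positive float.
float lowerBoundBeforeFix()
{
	return static_cast<float>(0x80000000);
}

// After the fix: converting to int first yields -2147483648, which
// then converts to the intended negative float.
float lowerBoundAfterFix()
{
	return static_cast<float>(static_cast<int>(0x80000000));
}

// Both bounds are powers of two, so they are exactly representable
// as float and can be compared with ==.
void checkBounds()
{
	assert(lowerBoundBeforeFix() == 2147483648.0f);
	assert(lowerBoundAfterFix() == -2147483648.0f);
}
```

With the positive bound, `Max(Min(x, 2147483520.0f), 2147483648.0f)` would clamp every input to 2147483648.0f, which is why the lower bound has to be negative.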