Optimize sin/cos polynomial evaluation using FMA

This change conditionally uses Fused Multiply-Add operations to optimize
the approximate polynomial used to implement sin and cos operations.

The MulAdd() intrinsic evaluates to an FMA instruction when it is
available and deemed more efficient than individual multiplication and
addition.

Bug: b/216472189
Bug: b/169754022
Change-Id: I423425250b1d5489514683d63f3d5261f5b59dbb
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/63548
Kokoro-Result: kokoro <noreply+kokoro@google.com>
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Sean Risser <srisser@google.com>
diff --git a/src/Pipeline/ShaderCore.cpp b/src/Pipeline/ShaderCore.cpp
index c2ab391..68e7fcd 100644
--- a/src/Pipeline/ShaderCore.cpp
+++ b/src/Pipeline/ShaderCore.cpp
@@ -182,7 +182,7 @@
 	Float4 x2 = x * x;
-	return ((A * x2 + B) * x2 + C) * x;
+	return MulAdd(MulAdd(A, x2, B), x2, C) * x;
 }
 Float4 Sin(RValue<Float4> x)
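
For readers unfamiliar with the rewrite, the following is a minimal scalar sketch of the same Horner-style evaluation, not the SwiftShader code itself. It assumes that MulAdd(a, b, c) computes a * b + c, as the replaced expression implies. The scalar `MulAdd` helper, the function names, and the coefficients `A`, `B`, `C` (plain Taylor terms for sin) are hypothetical stand-ins; the actual constants in ShaderCore.cpp are not shown in this hunk.

```cpp
#include <cmath>

// Hypothetical scalar stand-in for the MulAdd() intrinsic: a * b + c.
// std::fma rounds once, after the multiply and add are combined.
inline float MulAdd(float a, float b, float c)
{
	return std::fma(a, b, c);
}

// Illustrative coefficients only (Taylor series of sin around 0).
// These are assumptions, not the constants used in ShaderCore.cpp.
constexpr float A = 1.0f / 120.0f;  // x^5 term
constexpr float B = -1.0f / 6.0f;   // x^3 term
constexpr float C = 1.0f;           // x^1 term

// Original form: separate multiplies and adds, each rounded individually.
inline float SinPolyPlain(float x)
{
	float x2 = x * x;
	return ((A * x2 + B) * x2 + C) * x;
}

// Rewritten form: each Horner step becomes one fused multiply-add.
inline float SinPolyFma(float x)
{
	float x2 = x * x;
	return MulAdd(MulAdd(A, x2, B), x2, C) * x;
}
```

Both functions evaluate the same polynomial. The fused form replaces two multiply/add pairs with two FMA operations, and on small inputs the results typically differ only in the last few bits because of the single rounding per step.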