Subzero: proper fix for assert(Dest->hasReg())
Reverted the hack fix from bb222a10622, and implemented a proper fix.
The fix in TargetX86Base<TraitsType>::lowerCast
(IceTargetLoweringX86BaseImpl.h) was suggested by Jim Stichnoth. The
original problem is that we're trying movd an i32 to an i4i32, and since
the target Variable is long-lived, it may not get a register allocated
for it, and when that happens, we end up tripping the assert in
InstX86Movd::emitIAS() that expects the destination to have been
allocated a register. The solution that Jim suggested, and is
implemented here, is to create a temporary, short-lived, variable to
first movd into, which should guarantee a register target, since
short-lived Variables usually get registers. Then we 'mov' the temporary
register Variable to Dest, which *should* support moving the i4i32
register operand to an i32 memory operand.
I said *should* above, because with the above fix, we now trip another
assert in InstX86Mov::emitIAS (Mov, not Movd). The reason being that it
doesn't actually support moving an i4i32 reg -> i32 memory operand. I
added support for this as well (IceInstX86BaseImpl.h). Note that this
assert only tripped when building with Om1 optimization level, since O2
ostensibly optimized out the mov call.
Bug: b/145529686
Change-Id: I3c998a3e308838123cb415fcbf9f277113ac7d28
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/39068
Kokoro-Presubmit: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nicolas Capens <nicolascapens@google.com>
Tested-by: Antonio Maiorano <amaiorano@google.com>
diff --git a/third_party/subzero/src/IceInstX86Base.h b/third_party/subzero/src/IceInstX86Base.h
index 6c140d7..4c62854 100644
--- a/third_party/subzero/src/IceInstX86Base.h
+++ b/third_party/subzero/src/IceInstX86Base.h
@@ -1249,10 +1249,6 @@
class InstX86Movd : public InstX86BaseUnaryopXmm<InstX86Base::Movd> {
public:
static InstX86Movd *create(Cfg *Func, Variable *Dest, Operand *Src) {
- // TODO(amaiorano): This fixes assert(Dest->hasReg()) in emitIAS when
- // there are no more registers left to allocate. Revisit this and fix it
- // the right way. See b/145529686.
- Dest->setMustHaveReg();
return new (Func->allocate<InstX86Movd>()) InstX86Movd(Func, Dest, Src);
}
diff --git a/third_party/subzero/src/IceInstX86BaseImpl.h b/third_party/subzero/src/IceInstX86BaseImpl.h
index fb9bba0..cf976ac 100644
--- a/third_party/subzero/src/IceInstX86BaseImpl.h
+++ b/third_party/subzero/src/IceInstX86BaseImpl.h
@@ -2314,6 +2314,12 @@
Assembler *Asm = Func->getAssembler<Assembler>();
Asm->movss(SrcTy, StackAddr, Traits::getEncodedXmm(SrcVar->getRegNum()));
return;
+ } else if (isVectorType(SrcTy)) {
+ // Src must be a register
+ const auto *SrcVar = llvm::cast<Variable>(Src);
+ assert(SrcVar->hasReg());
+ Assembler *Asm = Func->getAssembler<Assembler>();
+ Asm->movups(StackAddr, Traits::getEncodedXmm(SrcVar->getRegNum()));
} else {
// Src can be a register or immediate.
assert(isScalarIntegerType(SrcTy));
diff --git a/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h b/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h
index ebb49a5..5c312e1 100644
--- a/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h
+++ b/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h
@@ -3292,7 +3292,10 @@
// use v16i8 vectors.
assert(getFlags().getApplicationBinaryInterface() != ABI_PNaCl &&
"PNaCl only supports real 128-bit vectors");
- _movd(Dest, legalize(Src0, Legal_Reg | Legal_Mem));
+ Operand *Src0RM = legalize(Src0, Legal_Reg | Legal_Mem);
+ Variable *T = makeReg(DestTy);
+ _movd(T, Src0RM);
+ _mov(Dest, T);
} else {
_movp(Dest, legalizeToReg(Src0));
}