Subzero: proper fix for assert(Dest->hasReg())

Reverted the hack fix from bb222a10622, and implemented a proper fix.

The fix in TargetX86Base<TraitsType>::lowerCast
(IceTargetLoweringX86BaseImpl.h) was suggested by Jim Stichnoth. The
original problem is that we're trying movd an i32 to an i4i32, and since
the target Variable is long-lived, it may not get a register allocated
for it, and when that happens, we end up tripping the assert in
InstX86Movd::emitIAS() that expects the destination to have been
allocated a register. The solution that Jim suggested, and is
implemented here, is to create a temporary, short-lived, variable to
first movd into, which should guarantee a register target, since
short-lived Variables usually get registers. Then we 'mov' the temporary
register Variable to Dest, which *should* support moving the i4i32
register operand to an i32 memory operand.

I said *should* above, because with the above fix, we now trip another
assert in InstX86Mov::emitIAS (Mov, not Movd). The reason being that it
doesn't actually support moving an i4i32 reg -> i32 memory operand. I
added support for this as well (IceInstX86BaseImpl.h). Note that this
assert only tripped when building with Om1 optimization level, since O2
ostensibly optimized out the mov call.

Bug: b/145529686
Change-Id: I3c998a3e308838123cb415fcbf9f277113ac7d28
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/39068
Kokoro-Presubmit: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nicolas Capens <nicolascapens@google.com>
Tested-by: Antonio Maiorano <amaiorano@google.com>
diff --git a/third_party/subzero/src/IceInstX86Base.h b/third_party/subzero/src/IceInstX86Base.h
index 6c140d7..4c62854 100644
--- a/third_party/subzero/src/IceInstX86Base.h
+++ b/third_party/subzero/src/IceInstX86Base.h
@@ -1249,10 +1249,6 @@
   class InstX86Movd : public InstX86BaseUnaryopXmm<InstX86Base::Movd> {
   public:
     static InstX86Movd *create(Cfg *Func, Variable *Dest, Operand *Src) {
-      // TODO(amaiorano): This fixes assert(Dest->hasReg()) in emitIAS when
-      // there are no more registers left to allocate. Revisit this and fix it
-      // the right way. See b/145529686.
-      Dest->setMustHaveReg();
       return new (Func->allocate<InstX86Movd>()) InstX86Movd(Func, Dest, Src);
     }
 
diff --git a/third_party/subzero/src/IceInstX86BaseImpl.h b/third_party/subzero/src/IceInstX86BaseImpl.h
index fb9bba0..cf976ac 100644
--- a/third_party/subzero/src/IceInstX86BaseImpl.h
+++ b/third_party/subzero/src/IceInstX86BaseImpl.h
@@ -2314,6 +2314,12 @@
       Assembler *Asm = Func->getAssembler<Assembler>();
       Asm->movss(SrcTy, StackAddr, Traits::getEncodedXmm(SrcVar->getRegNum()));
       return;
+    } else if (isVectorType(SrcTy)) {
+      // Src must be a register
+      const auto *SrcVar = llvm::cast<Variable>(Src);
+      assert(SrcVar->hasReg());
+      Assembler *Asm = Func->getAssembler<Assembler>();
+      Asm->movups(StackAddr, Traits::getEncodedXmm(SrcVar->getRegNum()));
     } else {
       // Src can be a register or immediate.
       assert(isScalarIntegerType(SrcTy));
diff --git a/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h b/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h
index ebb49a5..5c312e1 100644
--- a/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h
+++ b/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h
@@ -3292,7 +3292,10 @@
         // use v16i8 vectors.
         assert(getFlags().getApplicationBinaryInterface() != ABI_PNaCl &&
                "PNaCl only supports real 128-bit vectors");
-        _movd(Dest, legalize(Src0, Legal_Reg | Legal_Mem));
+        Operand *Src0RM = legalize(Src0, Legal_Reg | Legal_Mem);
+        Variable *T = makeReg(DestTy);
+        _movd(T, Src0RM);
+        _mov(Dest, T);
       } else {
         _movp(Dest, legalizeToReg(Src0));
       }