Limit Subzero routine stack size to 512 KiB

Fuzz tests generate shaders with large arrays or very large numbers of
local variables, which can overflow the stack. Limit the stack memory
that generated routines are allowed to use.

Note that this change does not yet deal gracefully with routines that
exceed this limit. Code generation returns nullptr for them, so they
cause a null pointer dereference instead of a stack overflow.

The default stack size limit of 1 MiB at the Subzero level ensures that
excessive stack sizes are caught even when no explicit limit has been
set. At the Reactor level we reduce it to 512 KiB to prevent an actual
stack overflow on a 1 MiB stack, since earlier calls may already have
used part of the stack. Also, our legacy 'ASM' compiler for GLSL
allocates 4096 'registers' of 4 components for 128-bit SIMD, which
already requires 256 KiB (4096 x 4 x 16 bytes).
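
The sizing argument above can be checked with a small compile-time
sketch. All constant and function names below are hypothetical and only
illustrate the arithmetic; they do not appear in the SwiftShader or
Subzero sources.

```cpp
#include <cstddef>

// Hypothetical names, for illustration only.
constexpr std::size_t KiB = 1024;

// Legacy 'ASM' GLSL compiler: 4096 'registers' of 4 components each,
// where every component occupies one 128-bit (16-byte) SIMD vector.
constexpr std::size_t kRegisterCount = 4096;
constexpr std::size_t kComponentsPerRegister = 4;
constexpr std::size_t kBytesPerComponent = 128 / 8;

constexpr std::size_t legacyRegisterFileBytes()
{
	return kRegisterCount * kComponentsPerRegister * kBytesPerComponent;
}

// Limits named in this change.
constexpr std::size_t kSubzeroDefaultLimit = 1024 * KiB;  // 1 MiB default
constexpr std::size_t kReactorLimit = 512 * KiB;          // set by Reactor

static_assert(legacyRegisterFileBytes() == 256 * KiB,
              "legacy register file should need 256 KiB");
static_assert(legacyRegisterFileBytes() <= kReactorLimit,
              "legacy register file must fit under the Reactor limit");
static_assert(kReactorLimit < kSubzeroDefaultLimit,
              "Reactor limit must leave headroom on a 1 MiB stack");
```

Under these assumptions the legacy register file uses half of the
512 KiB Reactor limit, which in turn is half of the 1 MiB default.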

Bug: b/157555596
Change-Id: I474285eecc786496edffbaef29719ca0cdf03f7d
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/52329
Presubmit-Ready: Nicolas Capens <nicolascapens@google.com>
Kokoro-Result: kokoro <noreply+kokoro@google.com>
Reviewed-by: Antonio Maiorano <amaiorano@google.com>
Tested-by: Nicolas Capens <nicolascapens@google.com>
Commit-Queue: Nicolas Capens <nicolascapens@google.com>
diff --git a/src/Reactor/SubzeroReactor.cpp b/src/Reactor/SubzeroReactor.cpp
index 9b0e0f1..f17574b 100644
--- a/src/Reactor/SubzeroReactor.cpp
+++ b/src/Reactor/SubzeroReactor.cpp
@@ -64,7 +64,9 @@
 Ice::Cfg *createFunction(Ice::GlobalContext *context, Ice::Type returnType, const std::vector<Ice::Type> &paramTypes)
 {
 	uint32_t sequenceNumber = 0;
-	auto function = Ice::Cfg::create(context, sequenceNumber).release();
+	auto *function = Ice::Cfg::create(context, sequenceNumber).release();
+
+	function->setStackSizeLimit(512 * 1024);  // 512 KiB
 
 	Ice::CfgLocalAllocatorScope allocScope{ function };
 
@@ -1039,6 +1041,11 @@
 		}
 
 		currFunc->emitIAS();
+
+		if(currFunc->hasError())
+		{
+			return nullptr;
+		}
 	}
 
 	// Emit items
diff --git a/third_party/subzero/src/IceCfg.h b/third_party/subzero/src/IceCfg.h
index 3a3dcbf..da676be 100644
--- a/third_party/subzero/src/IceCfg.h
+++ b/third_party/subzero/src/IceCfg.h
@@ -274,6 +274,9 @@
   /// in the correct information once everything is known.
   void fixPhiNodes();
 
+  void setStackSizeLimit(uint32_t Limit) { StackSizeLimit = Limit; }
+  uint32_t getStackSizeLimit() const { return StackSizeLimit; }
+
 private:
   friend class CfgAllocatorTraits; // for Allocator access.
 
@@ -344,6 +347,7 @@
   /// should be called to avoid spurious validation failures.
   const CfgNode *CurrentNode = nullptr;
   CfgVector<Loop> LoopInfo;
+  uint32_t StackSizeLimit = 1 * 1024 * 1024; // 1 MiB
 
 public:
   static void TlsInit() { CfgAllocatorTraits::init(); }
diff --git a/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h b/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h
index 2b40b66..dda619b 100644
--- a/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h
+++ b/third_party/subzero/src/IceTargetLoweringX86BaseImpl.h
@@ -1202,6 +1202,11 @@
   SpillAreaSizeBytes = StackSize - StackOffset; // Adjust for alignment, if any
 
   if (SpillAreaSizeBytes) {
+    auto *Func = Node->getCfg();
+    if (SpillAreaSizeBytes > Func->getStackSizeLimit()) {
+      Func->setError("Stack size limit exceeded");
+    }
+
     emitStackProbe(SpillAreaSizeBytes);
 
     // Generate "sub stackptr, SpillAreaSizeBytes"