Threaded Queue submit with events and fences

This CL does three main things:
- It pushes the queue submit operation to its own thread
- It implements events
- It implements fences

Some details:
- Because there can be N asynchronous draw operations in flight and
  the fence must be signaled only after all of them have completed,
  fences have an add/done mechanism that defers signaling until the
  last draw operation is done.
- Device::waitForFences() detects large timeouts to avoid integer
  overflow when now + timeout would exceed the largest nanosecond
  value representable in a long long.
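
For readers of this message, the two details above can be sketched
roughly as follows. This is a minimal illustration under assumptions,
not the code in this change: SketchFence and timeoutWouldOverflow are
hypothetical names, and treating an overflowing timeout as an
unbounded wait is one plausible way to handle the case.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <limits>
#include <mutex>

// Hypothetical helper: true if nowNs + timeoutNs does not fit in a long long.
inline bool timeoutWouldOverflow(long long nowNs, uint64_t timeoutNs)
{
    if(nowNs < 0) { nowNs = 0; }  // keep the subtraction below well defined
    const long long remainingNs = std::numeric_limits<long long>::max() - nowNs;
    return timeoutNs > static_cast<uint64_t>(remainingNs);
}

// Hypothetical fence: signaled only when every added operation is done.
class SketchFence
{
public:
    // Register one pending operation before handing it to a worker.
    void add()
    {
        std::lock_guard<std::mutex> lock(mutex);
        ++pending;
    }

    // Mark one operation complete; the last one wakes all waiters.
    void done()
    {
        std::lock_guard<std::mutex> lock(mutex);
        assert(pending > 0);
        if(--pending == 0) { condition.notify_all(); }
    }

    bool signaled()
    {
        std::lock_guard<std::mutex> lock(mutex);
        return pending == 0;
    }

    // Returns true if the fence became signaled within timeoutNs nanoseconds.
    bool wait(uint64_t timeoutNs)
    {
        using Clock = std::chrono::steady_clock;
        std::unique_lock<std::mutex> lock(mutex);
        auto isSignaled = [this] { return pending == 0; };

        const long long nowNs = std::chrono::duration_cast<std::chrono::nanoseconds>(
            Clock::now().time_since_epoch()).count();
        if(timeoutWouldOverflow(nowNs, timeoutNs))
        {
            // Assumption: a deadline that cannot be represented is treated
            // as "wait forever" instead of computing now + timeout.
            condition.wait(lock, isSignaled);
            return true;
        }
        return condition.wait_for(
            lock, std::chrono::nanoseconds(static_cast<long long>(timeoutNs)), isSignaled);
    }

private:
    std::mutex mutex;
    std::condition_variable condition;
    int pending = 0;
};
```

In this sketch the submitting thread would call add() once per draw
operation before queuing it, and each worker would call done() when
its operation finishes, so a waiter wakes only after the final done().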

Bug: b/117835459

Change-Id: I2f02c3b4bb9d9ac9037909b02b0601e1bae15d21
Tests: dEQP-VK.synchronization.*
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/29769
Presubmit-Ready: Alexis Hétu <sugoi@google.com>
Reviewed-by: Ben Clayton <bclayton@google.com>
Reviewed-by: Nicolas Capens <nicolascapens@google.com>
Kokoro-Presubmit: kokoro <noreply+kokoro@google.com>
Tested-by: Alexis Hétu <sugoi@google.com>
diff --git a/src/Vulkan/VkDevice.hpp b/src/Vulkan/VkDevice.hpp
index 1da119d..2b129f1 100644
--- a/src/Vulkan/VkDevice.hpp
+++ b/src/Vulkan/VkDevice.hpp
@@ -44,8 +44,8 @@
 	static size_t ComputeRequiredAllocationSize(const CreateInfo* info);
 
 	VkQueue getQueue(uint32_t queueFamilyIndex, uint32_t queueIndex) const;
-	void waitForFences(uint32_t fenceCount, const VkFence* pFences, VkBool32 waitAll, uint64_t timeout);
-	void waitIdle();
+	VkResult waitForFences(uint32_t fenceCount, const VkFence* pFences, VkBool32 waitAll, uint64_t timeout);
+	VkResult waitIdle();
 	void getDescriptorSetLayoutSupport(const VkDescriptorSetLayoutCreateInfo* pCreateInfo,
 	                                   VkDescriptorSetLayoutSupport* pSupport) const;
 	VkPhysicalDevice getPhysicalDevice() const { return physicalDevice; }