Subzero: Ensure coroutines are same-thread.

Marl tasks now support a `marl::Task::Flags::SameThread` flag, which requests that the task run on the same thread that calls `marl::schedule()`. Schedule the coroutine task with this flag.

This substantially reduces scheduling overhead when bouncing between two tasks, because no other thread has to be woken. In the Coroutines/Fibonacci benchmarks below, run times drop by roughly 15.5x to 16x.

```
> go run ./third_party/marl/tools/cmd/benchdiff/main.go pre.json post.json
Delta                    | Test name                                | (A) pre.json    | (B) post.json
-15.53x -1m16.299750679s | Coroutines/Fibonacci/iterations:16777216 | 1m21.551049724s | 5.251299045s
-15.71x -1.206771937s    | Coroutines/Fibonacci/iterations:262144   | 1.288789249s    | 82.017312ms
-15.72x -9.624071378s    | Coroutines/Fibonacci/iterations:2097152  | 10.277884306s   | 653.812928ms
-15.77x -35.755µs        | Coroutines/Fibonacci/iterations:8        | 38.176µs        | 2.421µs
-15.77x -150.190662ms    | Coroutines/Fibonacci/iterations:32768    | 160.356788ms    | 10.166126ms
-15.79x -18.864275ms     | Coroutines/Fibonacci/iterations:4096     | 20.139344ms     | 1.275069ms
-15.93x -2.332202ms      | Coroutines/Fibonacci/iterations:512      | 2.488404ms      | 156.202µs
-15.96x -292.896µs       | Coroutines/Fibonacci/iterations:64       | 312.481µs       | 19.585µs
```

Bug: b/145754674
Change-Id: I0e014083e1dbc9f5cdf51e7abc378df6be22d805
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/42410
Tested-by: Ben Clayton <bclayton@google.com>
Kokoro-Presubmit: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nicolas Capens <nicolascapens@google.com>
diff --git a/src/Reactor/SubzeroReactor.cpp b/src/Reactor/SubzeroReactor.cpp
index df72f66..3221a7e 100644
--- a/src/Reactor/SubzeroReactor.cpp
+++ b/src/Reactor/SubzeroReactor.cpp
@@ -4760,7 +4760,7 @@
 		::getOrCreateScheduler().bind();
 	}
 
-	marl::schedule([=] {
+	auto run = [=] {
 		// Store handle in TLS so that the coroutine can grab it right away, before
 		// any fiber switch occurs.
 		coro::setHandleParam(coroData);
@@ -4770,7 +4770,8 @@
 		coroData->done.signal();        // coroutine is done.
 		coroData->suspended.signal();   // resume any blocking await() call.
 		coroData->terminated.signal();  // signal that the coroutine data is ready for freeing.
-	});
+	};
+	marl::schedule(marl::Task(run, marl::Task::Flags::SameThread));
 
 	coroData->suspended.wait();  // block until the first yield or coroutine end