Subzero: Emit functions and global initializers in a separate thread.

(This is a continuation of https://codereview.chromium.org/876083007/ .)

Emission is done in a separate thread when -threads=N with N>0 is specified.  This includes both functions and global initializers.

Emission is deterministic.  The parser assigns sequence numbers, and the emitter thread reassembles work units into their original order, regardless of the number of threads.
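
As an illustration of the reassembly step only, here is a minimal sketch of a reorder buffer keyed by sequence number. The names WorkItem, ReorderBuffer, and emitInOrder are hypothetical and are not part of this patch; the sketch also assumes sequence numbers start at 0 and have no gaps.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one unit of emission work (not a Subzero type).
struct WorkItem {
  uint32_t SequenceNumber; // assigned by the parser; assumed to start at 0
  std::string Payload;     // e.g. the emitted output for one function
};

// Buffers items that arrive out of order and releases them in
// ascending sequence-number order.
class ReorderBuffer {
public:
  // Accepts one item and returns every item that is now ready to be
  // emitted, in ascending sequence order (possibly none).
  std::vector<WorkItem> push(WorkItem Item) {
    assert(Item.SequenceNumber >= NextExpected && "item already emitted");
    uint32_t Seq = Item.SequenceNumber;
    Pending.emplace(Seq, std::move(Item));
    std::vector<WorkItem> Ready;
    // Drain the run of consecutive items starting at NextExpected.
    for (auto It = Pending.find(NextExpected); It != Pending.end();
         It = Pending.find(NextExpected)) {
      Ready.push_back(std::move(It->second));
      Pending.erase(It);
      ++NextExpected;
    }
    return Ready;
  }

private:
  uint32_t NextExpected = 0;            // next sequence number to emit
  std::map<uint32_t, WorkItem> Pending; // items waiting for predecessors
};

// Feeds items in arrival order and returns payloads in emission order.
std::vector<std::string> emitInOrder(std::vector<WorkItem> Arrivals) {
  ReorderBuffer Buffer;
  std::vector<std::string> Output;
  for (WorkItem &Item : Arrivals)
    for (WorkItem &Ready : Buffer.push(std::move(Item)))
      Output.push_back(std::move(Ready.Payload));
  return Output;
}
```

Whatever order the translator threads finish in, the output order depends only on the sequence numbers, which is what makes emission independent of the thread count.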

Dump output, however, is not reassembled, so its order is not intended to be deterministic.  As such, lit tests that test dump output (i.e., '-verbose inst') are explicitly run with -threads=0.

For -elf-writer and -ias=1, the translator thread invokes Cfg::emitIAS() and the assembler buffer is passed to the emitter thread.  For -ias=0, the translator thread passes the Cfg to the emitter thread, which then invokes Cfg::emit() to produce the textual asm.

Minor cleanup along the way:
  * Removed Flags from the Ice::Translator object and ctor, since it was redundant with Ctx->getFlags().
  * Cfg::getAssembler<> now defaults to Cfg::getAssembler<Assembler>, which is useful for just passing the assembler around.
  * Removed the redundant Ctx argument from TargetDataLowering::lowerConstants().
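
To illustrate the getAssembler<> default and the ownership handoff via releaseAssembler(), here is a small standalone sketch. FakeCfg and FakeTargetAssembler are hypothetical stand-ins, and this Assembler is a toy base class rather than the real Ice::Assembler; only the two accessor bodies mirror the ones in IceCfg.h.

```cpp
#include <memory>
#include <utility>

// Toy base class standing in for Ice::Assembler (assumption).
struct Assembler {
  virtual ~Assembler() = default;
};

// Hypothetical target-specific assembler.
struct FakeTargetAssembler : Assembler {
  int Marker = 32;
};

// Hypothetical holder that mimics the two Cfg accessors.
class FakeCfg {
public:
  explicit FakeCfg(std::unique_ptr<Assembler> Asm)
      : TargetAssembler(std::move(Asm)) {}

  // With the default template argument, getAssembler<>() yields the base
  // Assembler*; callers that know the concrete type can still name it.
  template <typename T = Assembler> T *getAssembler() const {
    return static_cast<T *>(TargetAssembler.get());
  }

  // Gives up ownership; the caller is now responsible for deletion.
  Assembler *releaseAssembler() { return TargetAssembler.release(); }

private:
  std::unique_ptr<Assembler> TargetAssembler;
};
```

In this sketch a caller would typically wrap the released pointer in a std::unique_ptr<Assembler> right away, so the buffer is still freed once the receiving side is done with it.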

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4075
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/916653004
diff --git a/src/IceCfg.h b/src/IceCfg.h
index 605dcf7..737bb64 100644
--- a/src/IceCfg.h
+++ b/src/IceCfg.h
@@ -30,8 +30,9 @@
 public:
   ~Cfg();
 
-  static std::unique_ptr<Cfg> create(GlobalContext *Ctx) {
-    return std::unique_ptr<Cfg>(new Cfg(Ctx));
+  static std::unique_ptr<Cfg> create(GlobalContext *Ctx,
+                                     uint32_t SequenceNumber) {
+    return std::unique_ptr<Cfg>(new Cfg(Ctx, SequenceNumber));
   }
   // Gets a pointer to the current thread's Cfg.
   static const Cfg *getCurrentCfg() { return ICE_TLS_GET_FIELD(CurrentCfg); }
@@ -45,6 +46,7 @@
   }
 
   GlobalContext *getContext() const { return Ctx; }
+  uint32_t getSequenceNumber() const { return SequenceNumber; }
 
   // Returns true if any of the specified options in the verbose mask
   // are set.  If the argument is omitted, it checks if any verbose
@@ -121,9 +123,10 @@
   TargetLowering *getTarget() const { return Target.get(); }
   VariablesMetadata *getVMetadata() const { return VMetadata.get(); }
   Liveness *getLiveness() const { return Live.get(); }
-  template <typename T> T *getAssembler() const {
+  template <typename T = Assembler> T *getAssembler() const {
     return static_cast<T *>(TargetAssembler.get());
   }
+  Assembler *releaseAssembler() { return TargetAssembler.release(); }
   bool hasComputedFrame() const;
   bool getFocusedTiming() const { return FocusedTiming; }
   void setFocusedTiming() { FocusedTiming = true; }
@@ -159,7 +162,8 @@
 
   void emit();
   void emitIAS();
-  void emitTextHeader(const IceString &MangledName);
+  static void emitTextHeader(const IceString &MangledName, GlobalContext *Ctx,
+                             const Assembler *Asm);
   void dump(const IceString &Message = "");
 
   // Allocate data of type T using the per-Cfg allocator.
@@ -181,9 +185,10 @@
   }
 
 private:
-  Cfg(GlobalContext *Ctx);
+  Cfg(GlobalContext *Ctx, uint32_t SequenceNumber);
 
   GlobalContext *Ctx;
+  uint32_t SequenceNumber; // output order for emission
   VerboseMask VMask;
   IceString FunctionName;
   Type ReturnType;