Add Om1 lowering with no optimizations.

This adds infrastructure for low-level x86-32 instructions and the target lowering patterns.

Practically no optimizations are performed.  Optimizations to be introduced later include liveness analysis, dead-code elimination, global linear-scan register allocation, linear-scan-based stack slot coalescing, and compare/branch fusing.  The one optimization already present is simple coalescing of stack slots for variables that are live only within a single basic block.
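
As a rough illustration of the single-block stack slot coalescing idea, here is a minimal sketch. It is not Subzero's actual code: the StackVar record, the assign_stack_slots function, and the convention that local_block is -1 for multi-block variables are all hypothetical, and alignment is ignored. Variables live in only one block are never live at the same time as single-block variables of another block, so the per-block areas can overlap.

```c
#include <stddef.h>

/* Hypothetical variable record (an assumption, not Subzero's data structure). */
typedef struct {
  int size;        /* bytes needed on the stack */
  int local_block; /* index of the only block where the variable is live,
                      or -1 if it is live across several blocks */
  int offset;      /* output: assigned offset within the frame */
} StackVar;

/* Assigns frame offsets and returns the total frame size in bytes.
   Multi-block variables each get a unique slot.  Single-block variables
   are packed per block into one shared area that starts right after the
   multi-block slots; areas of different blocks overlap. */
int assign_stack_slots(StackVar *vars, size_t num_vars, int num_blocks) {
  int global_size = 0;
  for (size_t i = 0; i < num_vars; ++i) {
    if (vars[i].local_block < 0) {
      vars[i].offset = global_size;
      global_size += vars[i].size;
    }
  }

  int max_local_size = 0;
  for (int block = 0; block < num_blocks; ++block) {
    int used = 0; /* bytes used by this block's single-block variables */
    for (size_t i = 0; i < num_vars; ++i) {
      if (vars[i].local_block == block) {
        vars[i].offset = global_size + used;
        used += vars[i].size;
      }
    }
    if (used > max_local_size)
      max_local_size = used;
  }

  /* The shared area only needs to fit the largest per-block layout. */
  return global_size + max_local_size;
}
```

With two 4-byte multi-block variables, a 4-byte and an 8-byte variable local to block 0, and a 4-byte variable local to block 1, this sketch yields a 20-byte frame instead of the 24 bytes that one slot per variable would need.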

There are also fairly comprehensive cross tests.  The testing infrastructure translates bitcode using both Subzero and llc, and a test harness calls both versions with a variety of "interesting" inputs and compares the results.  Specifically, Arithmetic, Icmp, Fcmp, and Cast instructions are tested this way across all PNaCl primitive types.
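
The cross-test pattern described above can be sketched roughly as follows. This is an illustrative sketch, not the harness in this change: BinOp, kInputs, cross_test, ref_sub, and buggy_sub are hypothetical names. Here ref_sub stands in for the llc-translated function and buggy_sub for a miscompiled Subzero translation, and the input list is just one plausible choice of boundary values.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical signature shared by both translations of the function. */
typedef uint32_t (*BinOp)(uint32_t, uint32_t);

/* Boundary values that tend to expose lowering bugs (an assumed list). */
static const uint32_t kInputs[] = {
    0u,      1u,      2u,      0x7fu,       0x80u,       0xffu,      0x100u,
    0x7fffu, 0x8000u, 0xffffu, 0x7fffffffu, 0x80000000u, 0xffffffffu};
#define NUM_INPUTS (sizeof(kInputs) / sizeof(kInputs[0]))

/* Calls both versions on every pair of inputs and returns the number of
   mismatches, optionally printing each one. */
unsigned cross_test(BinOp reference, BinOp candidate, int verbose) {
  unsigned failures = 0;
  for (size_t i = 0; i < NUM_INPUTS; ++i) {
    for (size_t j = 0; j < NUM_INPUTS; ++j) {
      uint32_t expected = reference(kInputs[i], kInputs[j]);
      uint32_t actual = candidate(kInputs[i], kInputs[j]);
      if (expected != actual) {
        ++failures;
        if (verbose)
          printf("Failure: a=%lu b=%lu ref=%lu got=%lu\n",
                 (unsigned long)kInputs[i], (unsigned long)kInputs[j],
                 (unsigned long)expected, (unsigned long)actual);
      }
    }
  }
  return failures;
}

/* Stand-in for the reference (llc) translation. */
uint32_t ref_sub(uint32_t a, uint32_t b) { return a - b; }

/* Stand-in for a miscompiled translation that drops the upper 16 bits
   of the second operand. */
uint32_t buggy_sub(uint32_t a, uint32_t b) { return a - (b & 0xffffu); }
```

Calling cross_test(ref_sub, ref_sub, 0) reports no failures, while cross_test(ref_sub, buggy_sub, 1) prints a line for each input pair whose second operand does not fit in 16 bits.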

BUG=
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/265703002
diff --git a/crosstest/simple_loop_main.c b/crosstest/simple_loop_main.c
new file mode 100644
index 0000000..5ff36b8
--- /dev/null
+++ b/crosstest/simple_loop_main.c
@@ -0,0 +1,29 @@
+/* crosstest.py --test=simple_loop.c --driver=simple_loop_main.c \
+   --prefix=Subzero_ --output=simple_loop */
+
+#include <stdio.h>
+
+int simple_loop(int *a, int n);
+int Subzero_simple_loop(int *a, int n);
+
+int main(int argc, char **argv) {
+  unsigned TotalTests = 0;
+  unsigned Passes = 0;
+  unsigned Failures = 0;
+  int a[100];
+  for (int i = 0; i < 100; ++i)
+    a[i] = i * 2 - 100;
+  for (int i = -2; i < 100; ++i) {
+    ++TotalTests;
+    int llc_result = simple_loop(a, i);
+    int sz_result = Subzero_simple_loop(a, i);
+    if (llc_result == sz_result) {
+      ++Passes;
+    } else {
+      ++Failures;
+      printf("Failure: i=%d, llc=%d, sz=%d\n", i, llc_result, sz_result);
+    }
+  }
+  printf("TotalTests=%u Passes=%u Failures=%u\n", TotalTests, Passes, Failures);
+  return Failures;
+}