Initial skeleton of Subzero.

This includes just enough code to build the high-level ICE IR and dump it back out again.  The script szdiff.py does a fuzzy diff of the input against the output for verification.  See the docstring in szdiff.py for a description of what the fuzziness covers.
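
For reviewers who want the gist of the fuzzy diff without reading the whole script, here is a minimal sketch of the same idea.  The regexes are copied from szdiff.py in this CL; the function names massage() and fuzzy_diff() are hypothetical and do not exist in the script, and the sketch is written for Python 3 (itertools.zip_longest) whereas szdiff.py itself targets python2 (izip_longest).  It leaves out running llvm2ice and only shows the line filtering and comparison.

```python
import itertools
import re

# Patterns copied from szdiff.py; everything else here is illustrative.
TAIL_CALL = re.compile(r' tail call ')
TRAILING_COMMENT = re.compile(r';.*')
SKIP_LINE = re.compile(r'^ *$|^declare|^@')
IGNORE_DIFF = re.compile('|'.join([
    r' -[0-9]',                       # negative constants
    r' (float|double) [-0-9]',        # FP constants
    r' (float|double) %\w+, [-0-9]',
    r' inttoptr ',                    # inttoptr pointer types
    r' ptrtoint ',                    # ptrtoint pointer types
]))


def massage(ll_lines):
    """Normalize input .ll lines before comparing (hypothetical helper)."""
    kept = []
    for line in ll_lines:
        # Treat tail calls as regular calls.
        line = TAIL_CALL.sub(' call ', line)
        # Drop trailing comments and whitespace.
        line = TRAILING_COMMENT.sub('', line).rstrip()
        # Skip blank lines, declarations, and global definitions.
        if not SKIP_LINE.search(line):
            kept.append(line)
    return kept


def fuzzy_diff(sz_lines, ll_lines):
    """Return (mismatches, ignored_count) (hypothetical helper).

    A differing pair is ignored when the massaged input line matches
    IGNORE_DIFF; otherwise it is reported as an (sz, ll) tuple.
    """
    mismatches = []
    ignored = 0
    for sz, ll in itertools.zip_longest(sz_lines, massage(ll_lines)):
        if sz == ll:
            continue
        if ll and IGNORE_DIFF.search(ll):
            ignored += 1
            continue
        mismatches.append((sz, ll))
    return mismatches, ignored
```

Note that the ignore pattern is only tested against the input line, so a line containing inttoptr, for example, is accepted regardless of what llvm2ice printed for it.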

Building llvm2ice requires LLVM headers, libs, and tools (e.g. FileCheck) to be present.  Their paths default to something like llvm_i686_linux_work/Release+Asserts/, based on the checked-out and built pnacl-llvm code; I'll try to find a way to detect the build configuration more automatically.

"make check" runs the lit tests.

This CL has under 2000 lines of "interesting" Ice*.{h,cpp} code, plus 600 lines of llvm2ice.cpp driver code, and the rest is tests.

Here is the high-level mapping of source files to functionality:

IceDefs.h, IceTypes.h, IceTypes.cpp:
Commonly used types and utilities.

IceCfg.h, IceCfg.cpp:
Operations at the function level.

IceCfgNode.h, IceCfgNode.cpp:
Operations on basic blocks (nodes).

IceInst.h, IceInst.cpp:
Operations on instructions.

IceOperand.h, IceOperand.cpp:
Operations on operands, such as stack locations, physical registers, and constants.

BUG= none
R=jfb@chromium.org

Review URL: https://codereview.chromium.org/205613002
diff --git a/szdiff.py b/szdiff.py
new file mode 100755
index 0000000..f2696e8
--- /dev/null
+++ b/szdiff.py
@@ -0,0 +1,86 @@
+#!/usr/bin/env python2
+
+import argparse
+import itertools
+import subprocess
+import re
+
+if __name__ == '__main__':
+    """Runs llvm2ice on an input .ll file, and compares the output
+    against the input.
+
+    Before comparing, the input file is massaged to remove comments,
+    blank lines, global variable definitions, external function
+    declarations, and possibly other patterns that llvm2ice does not
+    handle.
+
+    The output file and the massaged input file are compared line by
+    line for differences.  However, there is a regex defined such that
+    if the regex matches a line in the input file, that line and the
+    corresponding line in the output file are ignored.  This lets us
+    ignore minor differences such as inttoptr and ptrtoint, and
+    printing of floating-point constants.
+
+    On success, a one-line summary of the ignored diffs is printed.
+    On failure, each mismatch is printed as up to two lines, one
+    starting with 'SZ>' and one starting with 'LL>'.
+    """
+    desc = 'Compare llvm2ice output against bitcode input.'
+    argparser = argparse.ArgumentParser(description=desc)
+    argparser.add_argument(
+        'llfile', nargs='?', default='-',
+        type=argparse.FileType('r'), metavar='FILE',
+        help='Textual bitcode file [default stdin]')
+    argparser.add_argument(
+        '--llvm2ice', required=False, default='./llvm2ice', metavar='LLVM2ICE',
+        help='Path to llvm2ice driver program [default ./llvm2ice]')
+    args = argparser.parse_args()
+    bitcode = args.llfile.readlines()
+
+    # Run llvm2ice and collect its output lines into sz_out.
+    command = [args.llvm2ice, '-verbose', 'inst', '-notranslate', '-']
+    p = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
+    sz_out = p.communicate(input=''.join(bitcode))[0].splitlines()
+
+    # Filter certain lines and patterns from the input, and collect
+    # the remainder into llc_out.
+    llc_out = []
+    tail_call = re.compile(' tail call ')
+    trailing_comment = re.compile(';.*')
+    ignore_pattern = re.compile('^ *$|^declare|^@')
+    for line in bitcode:
+        # Convert tail call into regular (non-tail) call.
+        line = tail_call.sub(' call ', line)
+        # Remove trailing comments and spaces.
+        line = trailing_comment.sub('', line).rstrip()
+        # Ignore blank lines, forward declarations, and variable definitions.
+        if not ignore_pattern.search(line):
+            llc_out.append(line)
+
+    # Compare sz_out and llc_out line by line, but ignore pairs of
+    # lines where the llc line matches a certain pattern.
+    return_code = 0
+    lines_total = 0
+    lines_diff = 0
+    ignore_pattern = re.compile(
+        '|'.join([' -[0-9]',                # negative constants
+                  ' (float|double) [-0-9]', # FP constants
+                  r' (float|double) %\w+, [-0-9]',
+                  ' inttoptr ',             # inttoptr pointer types
+                  ' ptrtoint '              # ptrtoint pointer types
+                  ]))
+    for (sz_line, llc_line) in itertools.izip_longest(sz_out, llc_out):
+        lines_total += 1
+        if sz_line == llc_line:
+            continue
+        if llc_line and ignore_pattern.search(llc_line):
+            lines_diff += 1
+            continue
+        if sz_line: print 'SZ>' + sz_line
+        if llc_line: print 'LL>' + llc_line
+        return_code = 1
+
+    if return_code == 0:
+        message = 'Success (ignored %d diffs out of %d lines)'
+        print message % (lines_diff, lines_total)
+    exit(return_code)