Subzero: Update the lowering documentation.

BUG= none
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/846763002
diff --git a/LOWERING.rst b/LOWERING.rst
index 251e25c..d51cd46 100644
--- a/LOWERING.rst
+++ b/LOWERING.rst
@@ -18,7 +18,7 @@
 know whether a ``Variable`` operand will meet a target instruction's physical
 register requirement.
 
-To this end, ICE allows certain hints/directives:
+To this end, ICE allows certain directives:
 
     * ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some
       physical register (without specifying which particular one) from a
@@ -27,18 +27,15 @@
     * ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific
       physical register.
 
-    * ``Variable::setPreferredRegister()`` registers a preference for a physical
-      register based on another ``Variable``'s physical register assignment.
-
-These hints/directives are described below in more detail.  In most cases,
-though, they don't need to be explicity used, as the routines that create
-lowered instructions have reasonable defaults and simple options that control
-these hints/directives.
+These directives are described below in more detail.  In most cases, though,
+they don't need to be explicity used, as the routines that create lowered
+instructions have reasonable defaults and simple options that control these
+directives.
 
 The recommended ICE lowering strategy is to generate extra assignment
-instructions involving extra ``Variable`` temporaries, using the
-hints/directives to force suitable register assignments for the temporaries, and
-then let the global register allocator clean things up.
+instructions involving extra ``Variable`` temporaries, using the directives to
+force suitable register assignments for the temporaries, and then let the
+register allocator clean things up.
 
 Note: There is a spectrum of *implementation complexity* versus *translation
 speed* versus *code quality*.  This recommended strategy picks a point on the
@@ -47,8 +44,8 @@
 the fastest translation speed since extra instructions and operands are created
 up front and cleaned up at the end.
 
-Ensuring some physical register
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Ensuring a non-specific physical register
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 The x86 instruction::
 
@@ -71,71 +68,31 @@
 
 ``Cfg::makeVariable()`` generates a new temporary, and
 ``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of
-register allocation, thus guaranteeing it a physical register.
+register allocation, thus guaranteeing it a physical register (though leaving
+the particular physical register to be determined by the register allocator).
 
 The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently
 powerful to handle these details in most situations.  Its ``Dest`` argument is
-an in/out parameter.  If its input value is ``NULL``, then a new temporary
+an in/out parameter.  If its input value is ``nullptr``, then a new temporary
 variable is created, its type is set to the same type as the ``Src`` operand, it
 is given infinite register weight, and the new ``Variable`` is returned through
 the in/out parameter.  (This is in addition to the new temporary being the dest
 operand of the ``mov`` instruction.)  The simpler version of the above example
 is::
 
-    Variable *Reg = NULL;
+    Variable *Reg = nullptr;
     _mov(Reg, Src);
     _mov(Dst, Reg);
 
 Preferring another ``Variable``'s physical register
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-One problem with this example is that the register allocator usually just
-assigns the first available register to a live range.  If this instruction ends
-the live range of ``src``, this may lead to code like the following::
-
-    mov reg:eax, src:esi
-    mov dst:edi, reg:eax
-
-Since the first instruction happens to end the live range of ``src:esi``, it
-would be better to assign ``esi`` to ``reg``::
-
-    mov reg:esi, src:esi
-    mov dst:edi, reg:esi
-
-The first instruction, ``mov esi, esi``, is a redundant assignment and will
-ultimately be elided, leaving just ``mov edi, esi``.
-
-We can tell the register allocator to prefer the register assigned to a
-different ``Variable``, using ``Variable::setPreferredRegister()``::
-
-    Variable *Reg;
-    Reg = Func->makeVariable(Dst->getType());
-    Reg->setWeightInfinite();
-    Reg->setPreferredRegister(Src);
-    NewInst = InstX8632Mov::create(Func, Reg, Src);
-    NewInst = InstX8632Mov::create(Func, Dst, Reg);
-
-Or more simply::
-
-    Variable *Reg = NULL;
-    _mov(Reg, Src);
-    _mov(Dst, Reg);
-    Reg->setPreferredRegister(llvm::dyn_cast<Variable>(Src));
-
-The usefulness of ``setPreferredRegister()`` is tied into the implementation of
-the register allocator.  ICE uses linear-scan register allocation, which sorts
-live ranges by starting point and assigns registers in that order.  Using
-``B->setPreferredRegister(A)`` only helps when ``A`` has already been assigned a
-register by the time ``B`` is being considered.  For an assignment ``B=A``, this
-is usually a safe assumption because ``B``'s live range begins at this
-instruction but ``A``'s live range must have started earlier.  (There may be
-exceptions for variables that are no longer in SSA form.)  But
-``A->setPreferredRegister(B)`` is unlikely to help unless ``B`` has been
-precolored.  In summary, generally the best practice is to use a pattern like::
-
-    NewInst = InstX8632Mov::create(Func, Dst, Src);
-    Dst->setPreferredRegister(Src);
-    //Src->setPreferredRegister(Dst); -- unlikely to have any effect
+(An older version of ICE allowed the lowering code to provide a register
+allocation hint: if a physical register is to be assigned to one ``Variable``,
+then prefer a particular ``Variable``'s physical register if available.  This
+hint would be used to try to reduce the amount of register shuffling.
+Currently, the register allocator does this automatically through the
+``FindPreference`` logic.)
 
 Ensuring a specific physical register
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -159,83 +116,42 @@
 
 The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an
 optional ``RegNum`` argument to force a specific register assignment when the
-input ``Dest`` is ``NULL``.  As described above, passing in ``Dest=NULL`` causes
-a new temporary variable to be created with infinite register weight, and in
-addition the specific register is chosen.  The simpler version of the above
+input ``Dest`` is ``nullptr``.  As described above, passing in ``Dest=nullptr``
+causes a new temporary variable to be created with infinite register weight, and
+in addition the specific register is chosen.  The simpler version of the above
 example is::
 
-    Variable *Reg = NULL;
+    Variable *Reg = nullptr;
     _mov(Reg, Src, Reg_eax);
     _ret(Reg);
 
 Disabling live-range interference
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Another problem with the "``mov reg,src; mov dst,reg``" example happens when
-the instructions do *not* end the live range of ``src``.  In this case, the live
-ranges of ``reg`` and ``src`` interfere, so they can't get the same physical
-register despite the explicit preference.  However, ``reg`` is meant to be an
-alias of ``src`` so they needn't be considered to interfere with each other.
-This can be expressed via the second (bool) argument of
-``setPreferredRegister()``::
+(An older version of ICE allowed an overly strong preference for another
+``Variable``'s physical register even if their live ranges interfered.  This was
+risky, and currently the register allocator derives this automatically through
+the ``AllowOverlap`` logic.)
 
-    Variable *Reg;
-    Reg = Func->makeVariable(Dst->getType());
-    Reg->setWeightInfinite();
-    Reg->setPreferredRegister(Src, true);
-    NewInst = InstX8632Mov::create(Func, Reg, Src);
-    NewInst = InstX8632Mov::create(Func, Dst, Reg);
+Call instructions kill scratch registers
+----------------------------------------
 
-This should be used with caution and probably only for these short-live-range
-temporaries, otherwise the classic "lost copy" or "lost swap" problem may be
-encountered.
-
-Instructions with register side effects
----------------------------------------
-
-Some instructions produce unwanted results in other registers, or otherwise kill
-preexisting values in other registers.  For example, a ``call`` kills the
-scratch registers.  Also, the x86-32 ``idiv`` instruction produces the quotient
-in ``eax`` and the remainder in ``edx``, but generally only one of those is
-needed in the lowering.  It's important that the register allocator doesn't
-allocate that register to a live range that spans the instruction.
-
-ICE provides the ``InstFakeKill`` pseudo-instruction to mark such register
-kills.  For each of the instruction's source variables, a fake trivial live
-range is created that begins and ends in that instruction.  The ``InstFakeKill``
-instruction is inserted after the ``call`` instruction.  For example::
+A ``call`` instruction kills the values in all scratch registers, so it's
+important that the register allocator doesn't allocate a scratch register to a
+``Variable`` whose live range spans the ``call`` instruction.  ICE provides the
+``InstFakeKill`` pseudo-instruction to compactly mark such register kills.  For
+each scratch register, a fake trivial live range is created that begins and ends
+in that instruction.  The ``InstFakeKill`` instruction is inserted after the
+``call`` instruction.  For example::
 
     CallInst = InstX8632Call::create(Func, ... );
-    VarList KilledRegs;
-    KilledRegs.push_back(eax);
-    KilledRegs.push_back(ecx);
-    KilledRegs.push_back(edx);
-    NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
+    NewInst = InstFakeKill::create(Func, CallInst);
 
 The last argument to the ``InstFakeKill`` constructor links it to the previous
 call instruction, such that if its linked instruction is dead-code eliminated,
-the ``InstFakeKill`` instruction is eliminated as well.
-
-The killed register arguments need to be assigned a physical register via
-``Variable::setRegNum()`` for this to be effective.  To avoid a massive
-proliferation of ``Variable`` temporaries, the ``TargetLowering`` object caches
-one precolored ``Variable`` for each physical register::
-
-    CallInst = InstX8632Call::create(Func, ... );
-    VarList KilledRegs;
-    Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax);
-    Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx);
-    Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx);
-    KilledRegs.push_back(eax);
-    KilledRegs.push_back(ecx);
-    KilledRegs.push_back(edx);
-    NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
-
-On first glance, it may seem unnecessary to explicitly kill the register that
-returns the ``call`` return value.  However, if for some reason the ``call``
-result ends up being unused, dead-code elimination could remove dead assignments
-and incorrectly expose the return value register to a register allocation
-assignment spanning the call, which would be incorrect.
+the ``InstFakeKill`` instruction is eliminated as well.  The linked ``call``
+instruction could be to a target known to be free of side effects, and therefore
+safe to remove if its result is unused.
 
 Instructions producing multiple values
 --------------------------------------
@@ -244,7 +160,9 @@
 instructions produce more than one usable result.  For example, the x86-32
 ``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair.
 Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit
-result in the ``edx:eax`` register pair.
+result in the ``edx:eax`` register pair.  The x86-32 ``idiv`` instruction
+produces the quotient in ``eax`` and the remainder in ``edx``, though generally
+only one or the other is needed in the lowering.
 
 To support multi-dest instructions, ICE provides the ``InstFakeDef``
 pseudo-instruction, whose destination can be precolored to the appropriate
@@ -252,8 +170,7 @@
 ``edx:eax``::
 
     CallInst = InstX8632Call::create(Func, RegLow, ... );
-    ...
-    NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
+    NewInst = InstFakeKill::create(Func, CallInst);
     Variable *RegHigh = Func->makeVariable(IceType_i32);
     RegHigh->setRegNum(Reg_edx);
     NewInst = InstFakeDef::create(Func, RegHigh);
@@ -262,14 +179,14 @@
 ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be
 eliminated as well.
 
-Preventing dead-code elimination
---------------------------------
+Managing dead-code elimination
+------------------------------
 
-ICE instructions with a non-NULL ``Dest`` are subject to dead-code elimination.
-However, some instructions must not be eliminated in order to preserve side
-effects.  This applies to most function calls, volatile loads, and loads and
-integer divisions where the underlying language and runtime are relying on
-hardware exception handling.
+ICE instructions with a non-nullptr ``Dest`` are subject to dead-code
+elimination.  However, some instructions must not be eliminated in order to
+preserve side effects.  This applies to most function calls, volatile loads, and
+loads and integer divisions where the underlying language and runtime are
+relying on hardware exception handling.
 
 ICE facilitates this with the ``InstFakeUse`` pseudo-instruction.  This forces a
 use of its source ``Variable`` to keep that variable's definition alive.  Since
@@ -281,14 +198,7 @@
     Variable *Reg = Func->makeVariable(IceType_i32);
     Reg->setRegNum(Reg_eax);
     CallInst = InstX8632Call::create(Func, Reg, ... );
-    VarList KilledRegs;
-    Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax);
-    Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx);
-    Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx);
-    KilledRegs.push_back(eax);
-    KilledRegs.push_back(ecx);
-    KilledRegs.push_back(edx);
-    NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
+    NewInst = InstFakeKill::create(Func, CallInst);
     NewInst = InstFakeUse::create(Func, Reg);
     NewInst = InstX8632Mov::create(Func, Result, Reg);
 
@@ -301,7 +211,7 @@
 instruction.  Using pseudocode::
 
     t1:eax = call foo(arg1, ...)
-    InstFakeKill(eax, ecx, edx)
+    InstFakeKill  // eax, ecx, edx
     t2:edx = InstFakeDef(t1)
     v_result_low = t1
     v_result_high = t2