| Code Generation Notes for MSA |
| ============================= |
| |
| Intrinsics are lowered to SelectionDAG nodes where possible in order to enable |
| optimisation, reduce the size of the ISel matcher, and reduce repetition in |
| the implementation. In a small number of cases, this can cause different |
| (semantically equivalent) instructions to be used in place of the requested |
| instruction, even when no optimisation has taken place. |
| |
| Instructions |
| ============ |
| |
| This section describes any quirks of instruction selection for MSA. For |
| example, two instructions might be equally valid for some given IR and one is |
| chosen in preference to the other. |
| |
| bclri.b: |
| It is not possible to emit bclri.b since andi.b covers exactly the |
| same cases. andi.b should use fractionally less power than bclri.b in |
| most hardware implementations so it is used in preference to bclri.b. |
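
The equivalence behind this choice can be checked with a small model of a
single 8-bit lane. This is only a sketch: the helper names below are
hypothetical and are not LLVM or MSA APIs. It assumes bclri.b clears bit m of
each byte and andi.b ANDs each byte with an 8-bit immediate.

```python
# Hypothetical single-lane models; not part of LLVM.

def bclri_b(lane: int, m: int) -> int:
    """Clear bit m (0..7) of one 8-bit lane."""
    return lane & ~(1 << m) & 0xFF


def andi_b(lane: int, imm8: int) -> int:
    """AND one 8-bit lane with an 8-bit immediate."""
    return lane & imm8 & 0xFF


def andi_mask_for_bclri(m: int) -> int:
    """Immediate that makes andi.b behave like bclri.b with bit index m."""
    return ~(1 << m) & 0xFF


# Exhaustive check over every bit index and every lane value.
all_equal = all(
    bclri_b(lane, m) == andi_b(lane, andi_mask_for_bclri(m))
    for m in range(8)
    for lane in range(256)
)
```

Because every bclri.b immediate has a matching andi.b immediate, nothing is
lost by always selecting andi.b.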
| |
| vshf.w: |
| It is not possible to emit vshf.w when the shuffle description is |
| constant since shf.w covers exactly the same cases. shf.w is used |
| instead. It is also impossible for the shuffle description to be |
| unknown at compile-time due to the definition of shufflevector in |
| LLVM IR. |
| |
| vshf.[bhwd]: |
| When the shuffle description describes a splat operation, splat.[bhwd] |
| instructions will be selected instead of vshf.[bhwd]. Unlike the ilv* |
| and pck* instructions, this is matched from MipsISD::VSHF instead of |
| a special-case MipsISD node. |
| |
| ilvl.d, pckev.d: |
| It is not possible to emit ilvl.d or pckev.d since ilvev.d covers the |
| same shuffle. ilvev.d will be emitted instead. |
| |
| ilvr.d, ilvod.d, pckod.d: |
| It is not possible to emit ilvr.d or pckod.d since ilvod.d covers the |
| same shuffle. ilvod.d will be emitted instead. |
| |
| splat.[bhwd]: |
| The intrinsic will work as expected. However, unlike other intrinsics, |
| it lowers directly to MipsISD::VSHF instead of using common IR. |
| |
| splati.w: |
| It is not possible to emit splati.w since shf.w covers the same cases. |
| shf.w will be emitted instead. |
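
A small model shows why shf.w can stand in for splati.w. This is a sketch
under a stated assumption: shf.w is modelled as reading one 2-bit index per
result element from its 8-bit immediate, which is my reading of the
instruction and not something this file spells out. The function names are
hypothetical.

```python
# Hypothetical models of a 4 x 32-bit vector held as a Python list.

def shf_w(ws, imm8):
    """Result element i takes ws[k], where k is bits [2i+1:2i] of imm8."""
    return [ws[(imm8 >> (2 * i)) & 3] for i in range(4)]


def splati_w(ws, n):
    """Replicate element n of ws into all four result elements."""
    return [ws[n]] * 4


def shf_imm_for_splat(n):
    """shf.w immediate with index n repeated in all four 2-bit fields."""
    return n * 0x55


ws = [10, 20, 30, 40]
splat_matches = all(
    shf_w(ws, shf_imm_for_splat(n)) == splati_w(ws, n) for n in range(4)
)
```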
| |
| copy_s.w: |
| On MIPS32, the copy_u.w intrinsic will emit this instruction instead of |
| copy_u.w. This is semantically equivalent since the general-purpose |
| register file is 32 bits wide. |
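
The width argument can be illustrated with a model of copying one 32-bit
element into a general-purpose register. This is a sketch only: the function
names and the gpr_bits parameter are hypothetical, introduced to contrast a
32-bit register with a 64-bit one.

```python
# Hypothetical models; gpr_bits is the width of the destination register.

def copy_s_w(elem, gpr_bits):
    """Sign-extend a 32-bit element into a register of gpr_bits bits."""
    value = elem & 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret as a negative 32-bit value
    return value & ((1 << gpr_bits) - 1)


def copy_u_w(elem, gpr_bits):
    """Zero-extend a 32-bit element into a register of gpr_bits bits."""
    return (elem & 0xFFFFFFFF) & ((1 << gpr_bits) - 1)


samples = [0x00000000, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]

# With a 32-bit register there are no upper bits to fill, so both agree.
same_on_32 = all(copy_s_w(e, 32) == copy_u_w(e, 32) for e in samples)

# With a 64-bit register the two differ whenever bit 31 of the element is set.
differs_on_64 = copy_s_w(0x80000000, 64) != copy_u_w(0x80000000, 64)
```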
| |
| binsri.[bhwd], binsli.[bhwd]: |
| These two operations are equivalent to each other with the operands |
| swapped and condition inverted. The compiler may use either one as |
| appropriate. |
| Furthermore, the compiler may use bsel.v for some masks that do |
| not survive the legalization process (this is a bug and will be fixed). |
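
The operand-swap relationship can be sketched on a single 8-bit lane. The
model below assumes that an immediate m makes binsli.b copy the top m+1 bits
of ws into wd and makes binsri.b copy the bottom m+1 bits; that is my reading
of the instructions and not something stated in this file. The function names
are hypothetical.

```python
# Hypothetical single-lane (8-bit) models.

def binsli_b(wd, ws, m):
    """Copy the most significant m+1 bits of ws into wd; keep the rest."""
    mask = (0xFF << (8 - (m + 1))) & 0xFF
    return (ws & mask) | (wd & ~mask & 0xFF)


def binsri_b(wd, ws, m):
    """Copy the least significant m+1 bits of ws into wd; keep the rest."""
    mask = (1 << (m + 1)) - 1
    return (ws & mask) | (wd & ~mask & 0xFF)


# Swapping wd and ws and using the complementary bit count gives the same
# lane. m = 7 is excluded: binsli.b would then copy all 8 bits, and the
# matching binsri.b would have to copy none, which this model cannot express.
swap_equivalent = all(
    binsli_b(wd, ws, m) == binsri_b(ws, wd, 6 - m)
    for m in range(7)
    for wd in range(0, 256, 3)
    for ws in range(0, 256, 3)
)
```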
| |
| bmnz.v, bmz.v, bsel.v: |
| These three operations differ only in the operand that is tied to the |
| result and the order of the operands. |
| It is (currently) not possible to emit bmz.v or bsel.v since bmnz.v is |
| the same operation and will be emitted instead. |
| In future, the compiler may choose between these three instructions |
| according to register allocation. |
| These three operations can be very confusing so here is a mapping |
| between the instructions and the vselect node in one place: |
| bmz.v  wd, ws, wt/i8 -> (vselect wt/i8, wd, ws) |
| bmnz.v wd, ws, wt/i8 -> (vselect wt/i8, ws, wd) |
| bsel.v wd, ws, wt/i8 -> (vselect wd, wt/i8, ws) |
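
The mapping above can be turned into a small executable model to confirm that
the three instructions are one operation with the operands permuted. This is a
sketch: vectors are modelled as 128-bit Python integers, vselect is modelled
as a bitwise select, and all function names are hypothetical.

```python
MASK128 = (1 << 128) - 1


def vselect(cond, if_set, if_clear):
    """Bitwise select: take if_set where cond is 1, if_clear where it is 0."""
    return ((cond & if_set) | (~cond & if_clear)) & MASK128


# Direct transcriptions of the mapping in the text.
def bmz_v(wd, ws, wt):
    return vselect(wt, wd, ws)


def bmnz_v(wd, ws, wt):
    return vselect(wt, ws, wd)


def bsel_v(wd, ws, wt):
    return vselect(wd, wt, ws)


wd = 0x0123456789ABCDEF0123456789ABCDEF
ws = 0xFFFF0000FFFF0000FFFF0000FFFF0000
wt = 0x0F0F0F0F0F0F0F0FF0F0F0F0F0F0F0F0

# bmz.v is bmnz.v with wd and ws exchanged.
bmz_as_bmnz = bmz_v(wd, ws, wt) == bmnz_v(ws, wd, wt)

# bsel.v is bmnz.v with the mask moved from wd to the wt position.
bsel_as_bmnz = bsel_v(wd, ws, wt) == bmnz_v(ws, wt, wd)
```

In a real instruction wd is also the destination, so which form is usable
depends on which input may be overwritten; the model only checks the value
computed.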
| |
| bmnzi.b, bmzi.b: |
| Like their non-immediate counterparts, bmnzi.b and bmzi.b are the same |
| operation with the operands swapped. bmnzi.b will (currently) be emitted |
| for both cases. |
| |
| bseli.b: |
| Unlike the non-immediate versions, bseli.b is distinguishable from |
| bmnzi.b and bmzi.b and can be emitted. |