| //===---------------------------------------------------------------------===// |
| // Random ideas for the X86 backend: FP stack related stuff |
| //===---------------------------------------------------------------------===// |
| |
| //===---------------------------------------------------------------------===// |
| |
| Some targets (e.g. athlons) prefer freep to fstp ST(0): |
| http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00659.html |
| |
| //===---------------------------------------------------------------------===// |
| |
| This should use fiadd on chips where it is profitable: |
| double foo(double P, int *I) { return P+*I; } |
| |
| We have fiadd patterns now, but the following two have the same cost and |
| complexity. We need a way to specify that the latter is more profitable. |
| |
| def FpADD32m : FpI<(ops RFP:$dst, RFP:$src1, f32mem:$src2), OneArgFPRW, |
| [(set RFP:$dst, (fadd RFP:$src1, |
| (extloadf64f32 addr:$src2)))]>; |
| // ST(0) = ST(0) + [mem32] |
| |
| def FpIADD32m : FpI<(ops RFP:$dst, RFP:$src1, i32mem:$src2), OneArgFPRW, |
| [(set RFP:$dst, (fadd RFP:$src1, |
| (X86fild addr:$src2, i32)))]>; |
| // ST(0) = ST(0) + [mem32int] |
| |
| //===---------------------------------------------------------------------===// |
| |
| The FP stackifier should handle simple permutations to reduce the number of |
| shuffle instructions, e.g. turning: |
| |
|     fld P     ->     fld Q |
|     fld Q            fld P |
|     fxch |
| |
| or: |
| |
|     fxch      ->     fucomi |
|     fucomi           jl X |
|     jg X |
| |
| Ideas: |
| http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02410.html |
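
The first permutation above can be checked with a toy model of the FP stack.
This is only a sketch: the FPStack type and the fp_* helpers are hypothetical
names made up for illustration, not part of the stackifier. It models fld as a
push and fxch (no operand) as a swap of ST(0) and ST(1), and shows that both
instruction sequences leave the same stack contents.

```c
#include <assert.h>

enum { STACK_DEPTH = 8 };          /* the x87 register stack has 8 slots */

/* Hypothetical toy model: st[top - 1] plays the role of ST(0). */
typedef struct {
    double st[STACK_DEPTH];
    int top;
} FPStack;

/* fld: push a value, which becomes the new ST(0). */
void fp_fld(FPStack *s, double v) {
    assert(s->top < STACK_DEPTH);
    s->st[s->top++] = v;
}

/* fxch with no operand: swap ST(0) and ST(1). */
void fp_fxch(FPStack *s) {
    assert(s->top >= 2);
    double tmp = s->st[s->top - 1];
    s->st[s->top - 1] = s->st[s->top - 2];
    s->st[s->top - 2] = tmp;
}

/* Read ST(i) without changing the stack. */
double fp_st(const FPStack *s, int i) {
    assert(i >= 0 && i < s->top);
    return s->st[s->top - 1 - i];
}

/* Left-hand side: fld P; fld Q; fxch */
FPStack original_sequence(double P, double Q) {
    FPStack s = { {0}, 0 };
    fp_fld(&s, P);
    fp_fld(&s, Q);
    fp_fxch(&s);
    return s;
}

/* Right-hand side: fld Q; fld P (the fxch is gone) */
FPStack permuted_sequence(double P, double Q) {
    FPStack s = { {0}, 0 };
    fp_fld(&s, Q);
    fp_fld(&s, P);
    return s;
}

/* Returns 1 if both stacks have the same depth and contents. */
int same_stack(const FPStack *a, const FPStack *b) {
    if (a->top != b->top)
        return 0;
    for (int i = 0; i < a->top; ++i)
        if (a->st[i] != b->st[i])
            return 0;
    return 1;
}
```

Both sequences end with P in ST(0) and Q in ST(1), so reordering the loads
makes the fxch unnecessary.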
| |
| |
| //===---------------------------------------------------------------------===// |
| |
| Add a target-specific hook to the DAG combiner to handle SINT_TO_FP and |
| FP_TO_SINT when the source operand is already in memory. |
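
Possible test cases for this, as a sketch (the function names are hypothetical
and chosen only for illustration). In both functions the value being converted
is loaded straight from memory, which is the situation such a hook would look
for; on the x87 the integer case could, for example, be loaded and converted
in one step with fild.

```c
/* SINT_TO_FP whose integer source operand is a load from memory. */
double sint_to_fp_from_mem(const int *p) {
    return (double)*p;
}

/* FP_TO_SINT whose floating-point source operand is a load from memory.
   The C conversion truncates toward zero. */
int fp_to_sint_from_mem(const double *p) {
    return (int)*p;
}
```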
| |
| //===---------------------------------------------------------------------===// |
| |
| Open code rint, floor, ceil, trunc: |
| http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02006.html |
| http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02011.html |
| |
| Open code the sincos[f] libcall. |
| |
| //===---------------------------------------------------------------------===// |
| |
| None of the FPStack instructions are handled in |
| X86RegisterInfo::foldMemoryOperand, which prevents the spiller from |
| folding spill code into the instructions. |
| |
| //===---------------------------------------------------------------------===// |
| |
| Currently the x86 codegen isn't very good at mixing SSE and FPStack |
| code: |
| |
| unsigned int foo(double x) { return x; } |
| |
| foo: |
|         subl $20, %esp |
|         movsd 24(%esp), %xmm0 |
|         movsd %xmm0, 8(%esp) |
|         fldl 8(%esp) |
|         fisttpll (%esp) |
|         movl (%esp), %eax |
|         addl $20, %esp |
|         ret |
| |
| This just requires being smarter when custom expanding fptoui. |
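
One way to be smarter, sketched in C under stated assumptions: an unsigned
32-bit result can be built from signed 32-bit conversions only, which SSE2 can
do directly with cvttsd2si, so the value never needs to move to the FP stack.
This is an illustration of the idea, not a statement of what the custom
expansion does or should do. The function name is hypothetical, and the input
is assumed to satisfy 0 <= x < 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* double -> uint32_t using only signed 32-bit truncating conversions.
   Assumed precondition: 0 <= x < 2^32. */
uint32_t fptoui_via_signed(double x) {
    const double two31 = 2147483648.0;            /* 2^31 */
    assert(x >= 0.0 && x < 4294967296.0);

    /* Values below 2^31 already fit in a signed conversion. */
    if (x < two31)
        return (uint32_t)(int32_t)x;

    /* Otherwise bias into signed range, convert, then add 2^31 back
       as an integer. */
    return (uint32_t)(int32_t)(x - two31) + UINT32_C(0x80000000);
}
```

Subtracting 2^31 from a double in [2^31, 2^32) is exact, so the biased path
truncates the same way as the direct conversion would.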
| |
| //===---------------------------------------------------------------------===// |