| //===----------------------------------------------------------------------===// |
| // Representing sign/zero extension of function results |
| //===----------------------------------------------------------------------===// |
| |
| Mar 25, 2009 - Initial Revision |
| |
| Most ABIs specify that functions which return small integers do so in a |
| specific integer GPR. This is an efficient way to go, but raises the question: |
| if the returned value is smaller than the register, what do the high bits hold? |
| |
| There are three (interesting) possible answers: undefined, zero extended, or |
| sign extended. The number of bits in question depends on the data-type that |
| the front-end is referencing (typically i1/i8/i16/i32). |
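| |
| As a rough illustration of the two defined answers, the sketch below models a |
| 32-bit GPR as a uint32_t and computes what the register would hold if its low |
| N bits were zero extended or sign extended. The helper names are hypothetical |
| and the 32-bit register width is an assumption; nothing here is LLVM code. |

```c
#include <assert.h>
#include <stdint.h>

/* Zero extend the low `bits` bits of a 32-bit register image:
   every bit above the small value is forced to 0. */
uint32_t zext_low_bits(uint32_t reg, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    if (bits == 32)
        return reg;
    return reg & ((UINT32_C(1) << bits) - 1);
}

/* Sign extend the low `bits` bits of a 32-bit register image:
   every bit above the small value becomes a copy of bit (bits - 1). */
uint32_t sext_low_bits(uint32_t reg, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    if (bits == 32)
        return reg;
    uint32_t sign = UINT32_C(1) << (bits - 1);
    uint32_t low = reg & ((UINT32_C(1) << bits) - 1);
    /* Flip the sign bit, then subtract it; unsigned wraparound fills
       the high bits with ones when the sign bit was set. */
    return (low ^ sign) - sign;
}
```

| With the i16 value 0x8001 sitting under garbage high bits (0xDEAD8001), the |
| zero extended register image is 0x00008001 and the sign extended one is |
| 0xFFFF8001. In the "undefined" case the caller may not assume either. |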
| |
| Knowing the answer to this is important for two reasons: 1) we want to be able |
| to implement the ABI correctly. If we need to sign extend the result according |
| to the ABI, we really really do need to do this to preserve correctness. 2) |
| this information is often useful for optimization purposes, and we want the |
| mid-level optimizers to be able to process this (e.g. eliminate redundant |
| extensions). |
| |
| For example, let's pretend that X86 requires the callee to properly extend the |
| result of a return (I'm not sure this is the case, but the argument doesn't |
| depend on this). Given this, we should compile this: |
| |
| int a(); |
| short b() { return a(); } |
| |
| into: |
| |
| _b: |
| subl $12, %esp |
| call L_a$stub |
| addl $12, %esp |
| cwtl |
| ret |
| |
| An optimization example is that we should be able to eliminate the explicit |
| sign extension in this example: |
| |
| short y(); |
| int z() { |
| return ((int)y() << 16) >> 16; |
| } |
| |
| _z: |
| subl $12, %esp |
| call _y |
| ;; movswl %ax, %eax -> not needed because eax is already sext'd |
| addl $12, %esp |
| ret |
| |
| //===----------------------------------------------------------------------===// |
| // What we have right now. |
| //===----------------------------------------------------------------------===// |
| |
| Currently, these sorts of things are modelled by compiling a function to return |
| the small type, with a signext/zeroext marker attached. For example, we compile |
| z into: |
| |
| define i32 @z() nounwind { |
| entry: |
| %0 = tail call signext i16 (...)* @y() nounwind |
| %1 = sext i16 %0 to i32 |
| ret i32 %1 |
| } |
| |
| and b into: |
| |
| define signext i16 @b() nounwind { |
| entry: |
| %0 = tail call i32 (...)* @a() nounwind ; <i32> [#uses=1] |
| %retval12 = trunc i32 %0 to i16 ; <i16> [#uses=1] |
| ret i16 %retval12 |
| } |
| |
| This has some problems: |
| |
| 1) the actual precise semantics are really poorly defined (see PR3779). |
| 2) some targets might want the caller to extend, some might want the callee to |
| extend. |
| 3) the mid-level optimizer doesn't know the size of the GPR, so it doesn't know |
| that %0 is sign extended up to 32 bits here, and even if it did, it could not |
| eliminate the sext. |
| 4) the code generator has historically assumed that the result is extended to |
| i32, which is a problem on PIC16 (and is also probably wrong on Alpha and |
| other 64-bit targets). |
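| |
| To make the 64-bit concern concrete, the sketch below contrasts an i16 result |
| that was only extended to i32 with one extended to the full width of a 64-bit |
| GPR. The function names are hypothetical, and treating bits 32..63 as zero |
| after a 32-bit extension is an assumption made for illustration; what a real |
| target leaves there is target-dependent. |

```c
#include <stdint.h>

/* Sign extend an i16 only up to bit 31 and leave bits 32..63 clear
   (assumed here), as a code generator that stops at i32 would. */
uint64_t sext16_to_i32_in_64bit_reg(uint16_t v) {
    uint32_t sign = UINT32_C(0x8000);
    uint32_t wide = ((uint32_t)v ^ sign) - sign;
    return (uint64_t)wide;
}

/* Sign extend an i16 across the whole 64-bit register. */
uint64_t sext16_to_i64(uint16_t v) {
    uint64_t sign = UINT64_C(0x8000);
    return ((uint64_t)v ^ sign) - sign;
}
```

| For a non-negative value such as 0x7FFF the two agree, but for 0x8000 the |
| first yields 0x00000000FFFF8000 and the second 0xFFFFFFFFFFFF8000, so a caller |
| that expects a fully extended 64-bit register would read the wrong value. |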
| |
| //===----------------------------------------------------------------------===// |
| // The proposal |
| //===----------------------------------------------------------------------===// |
| |
| I suggest that we have the front-end fully lower out the ABI issues here to |
| LLVM IR. This makes it 100% explicit what is going on and means that there is |
| no cause for confusion. For example, the cases above should compile into: |
| |
| define i32 @z() nounwind { |
| entry: |
| %0 = tail call i32 (...)* @y() nounwind |
| %1 = trunc i32 %0 to i16 |
| %2 = sext i16 %1 to i32 |
| ret i32 %2 |
| } |
| define i32 @b() nounwind { |
| entry: |
| %0 = tail call i32 (...)* @a() nounwind |
| %retval12 = trunc i32 %0 to i16 |
| %tmp = sext i16 %retval12 to i32 |
| ret i32 %tmp |
| } |
| |
| In this model, no functions will return an i1/i8/i16 (and on an x86-64 target |
| that extends results to i64, no i32). This solves the ambiguity issue, allows us |
| to fully describe all possible ABIs, and now allows the optimizers to reason |
| about and eliminate these extensions. |
| |
| The one thing that is missing is the ability for the front-end and optimizer to |
| specify/infer the guarantees provided by the ABI to allow other optimizations. |
| For example, in the y/z case, since y is known to return a sign extended value, |
| the trunc/sext in z should be eliminable. |
| |
| This can be done by introducing new sext/zext attributes which mean "I know |
| that the result of the function is sign/zero extended at least N bits". Given |
| this, and given that it is attached to the y function, the mid-level optimizer |
| could easily eliminate the extensions etc with existing functionality. |
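| |
| The reasoning behind that elimination can be sketched in C. This is only a |
| model: it assumes a 32-bit result register and reads the attribute as "the |
| value is already a sign extension of its low N bits". Both helper names are |
| hypothetical and are not LLVM APIs. |

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the IR pair: trunc i32 -> iN followed by sext iN -> i32. */
uint32_t trunc_then_sext(uint32_t reg, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    if (bits == 32)
        return reg;
    uint32_t sign = UINT32_C(1) << (bits - 1);
    uint32_t low = reg & ((UINT32_C(1) << bits) - 1);
    return (low ^ sign) - sign;
}

/* Model of the guarantee the attribute would carry: bit (bits - 1)
   and every bit above it are identical, all zeros or all ones. */
bool is_sign_extended_from(uint32_t reg, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    uint32_t high = reg >> (bits - 1);
    return high == 0 || high == (UINT32_MAX >> (bits - 1));
}
```

| Whenever is_sign_extended_from(v, 16) holds, trunc_then_sext(v, 16) returns v |
| unchanged, which is why the trunc/sext pair in z is dead once the guarantee on |
| y is known. Without the guarantee (for example v = 0x00018000) the pair does |
| change the value and must stay. |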
| |
| The major disadvantage of doing this sort of thing is that it makes the ABI |
| lowering stuff even more explicit in the front-end, and that we would like to |
| eventually move to having the code generator do more of this work. However, |
| the sad truth of the matter is that a) this is unlikely to happen anytime in |
| the near future, and b) this is no worse than we have now with the existing |
| attributes. |
| |
| C compilers fundamentally have to reason about the target in many ways. |
| This is ugly and horrible, but a fact of life. |
| |