Meeting notes: Implementation idea: Exception Handling in C++/Java

The 5/18/01 meeting discussed ideas for implementing exceptions in LLVM.
We decided that the best solution requires a set of library calls provided by
the VM, as well as an extension to the LLVM function invocation syntax.

The LLVM function invocation instruction currently looks like this (ignoring
types):

  call func(arg1, arg2, arg3)

The extension discussed today adds an optional "with" clause that
associates a label with the call site. The new syntax looks like this:

  call func(arg1, arg2, arg3) with funcCleanup

The funcCleanup label always stays tightly associated with the call site (being
encoded directly into the call opcode itself), and should be used whenever
there is cleanup work that needs to be done for the current function if
an exception is thrown by func (or if we are in a try block).
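
As a rough source-level analogy only (not the proposed LLVM syntax), the effect
of "call func(...) with funcCleanup" resembles a C++ call wrapped in a
catch-all block that runs the cleanup and then rethrows. The names func,
callWithCleanup and the log strings below are hypothetical and exist only for
this sketch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical callee that may throw.
void func(bool fail) {
  if (fail) throw std::runtime_error("func failed");
}

// Rough analogue of "call func(...) with funcCleanup".
bool callWithCleanup(bool fail, std::string &log) {
  try {
    func(fail);
  } catch (...) {
    log += "funcCleanup;";   // stands in for the work done at funcCleanup
    throw;                   // propagate the exception to our caller
  }
  log += "normal;";          // mainline path: no cleanup label is visited
  return true;
}
```

The difference the notes care about is that the "with" label costs nothing on
the normal path, whereas how a source-level try block is implemented is left
to the compiler.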

To support this, the VM/Runtime provides the following simple library
functions (all syntax in this document is very abstract):

typedef struct { something } %frame;
  The VM must export a "frame type", which is an opaque structure used to
  implement the different types of stack walking that may be used by various
  language runtime libraries. We imagine that it would be typical to
  represent a frame with a PC and frame pointer pair, although that is not
  required.

%frame getStackCurrentFrame();
  Get a frame object for the current function. Note that if the current
  function was inlined into its caller, the "current" frame will belong to
  the "caller".

bool isFirstFrame(%frame f);
  Returns true if the specified frame is the top level (first activated) frame
  for this thread. For the main thread, this corresponds to the main()
  function; for a spawned thread, it corresponds to the thread function.

%frame getNextFrame(%frame f);
  Returns the previous frame on the stack, that is, the frame of the function
  that called f's function. This function is undefined if f satisfies the
  predicate isFirstFrame(f).

Label *getFrameLabel(%frame f);
  If a label was associated with f (by the "with" clause of its active call
  site), this function returns it. Otherwise, it returns a null pointer.

doNonLocalBranch(Label *L);
  Transfers control to the label L, which may belong to a frame other than the
  current one. At this point, it is not clear whether this should be a
  function or an intrinsic. It should probably be an intrinsic in LLVM, but
  we'll deal with this issue later.
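
To make the frame interface concrete, here is a minimal C++ sketch that runs
it over a simulated stack. Everything about the representation is an
assumption made for illustration: SimStack, Activation and countFrames are
hypothetical, labels are modelled as strings, and getStackCurrentFrame takes
the simulated stack as a parameter, which the interface in these notes does not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a code label; a real VM would use an address.
using Label = std::string;

// One simulated activation record.
struct Activation {
  std::string function;  // name of the active function
  const Label *label;    // label from the call's "with" clause, or nullptr
};

// Simulated stack for one thread: index 0 is the first activated frame.
struct SimStack {
  std::vector<Activation> activations;
};

// The "opaque" frame handle: here just a stack pointer plus an index.
struct Frame {
  const SimStack *stack;
  std::size_t index;
};

// Frame of the most recently activated function.
Frame getStackCurrentFrame(const SimStack &s) {
  assert(!s.activations.empty());
  return Frame{&s, s.activations.size() - 1};
}

bool isFirstFrame(Frame f) { return f.index == 0; }

// Undefined for the first frame in the notes; guarded by an assert here.
Frame getNextFrame(Frame f) {
  assert(!isFirstFrame(f));
  return Frame{f.stack, f.index - 1};
}

const Label *getFrameLabel(Frame f) {
  return f.stack->activations[f.index].label;
}

// Example walk: count the frames from the current frame to the first one.
std::size_t countFrames(const SimStack &s) {
  std::size_t n = 1;
  for (Frame f = getStackCurrentFrame(s); !isFirstFrame(f); f = getNextFrame(f))
    ++n;
  return n;
}
```

A runtime library written against these four functions never needs to know
how the VM actually lays out a frame, which is the point of keeping the frame
type opaque.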

Here is a motivating example that illustrates how these facilities could be
used to implement the C++ exception model:

void TestFunction(...) {
  A a; B b;
  foo();        // Any function call may throw
  bar();
  C c;

  try {
    D d;
    baz();
  } catch (int) {
    ...int Stuff...
    // execution continues after the try block: the exception is consumed
  } catch (double) {
    ...double stuff...
    throw;      // Exception is propagated
  }
}

This function would compile to approximately the following code (heavy
pseudo code follows):

Func:
  %a = alloca A
  A::A(%a)        // These ctors & dtors could throw, but we ignore this
  %b = alloca B   // minor detail for this example
  B::B(%b)

  call foo() with fooCleanup // An exception in foo is propagated to fooCleanup
  call bar() with barCleanup // An exception in bar is propagated to barCleanup

  %c = alloca C
  C::C(%c)
  %d = alloca D
  D::D(%d)
  call baz() with bazCleanup // An exception in baz is propagated to bazCleanup
  d->~D()
EndTry:           // This label corresponds to the end of the try block
  c->~C()         // These could also throw; this is also ignored
  b->~B()
  a->~A()
  return

Note that this is a very straightforward and literal translation: exactly
what we want for zero cost (when unused) exception handling. Especially on
platforms with many registers (e.g., the IA64), setjmp/longjmp style exception
handling is *very* impractical, because each setjmp must save the register
state even when no exception is ever thrown. Also, the "with" clauses describe
the control flow paths explicitly so that analysis is not adversely affected.

The fooCleanup and barCleanup labels, together with the TryCleanup label used
when an exception leaves the try block, are implemented as:

TryCleanup:          // Executed if an exception escapes the try block
  c->~C()
barCleanup:          // Executed if an exception escapes from bar()
  // fall through
fooCleanup:          // Executed if an exception escapes from foo()
  b->~B()
  a->~A()
  Exception *E = getThreadLocalException()
  call throw(E)      // Implemented by the C++ runtime, described below

This does the work one would expect. getThreadLocalException is a function
implemented by the C++ support library. It returns the current exception
object for the current thread. Note that we do not attempt to reuse the
destructor calls at the end of the mainline code, because performance of the
mainline code is critically important. Also, obviously fooCleanup and
barCleanup may be merged and one of them eliminated. This just shows how the
code generator would most likely emit code.
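
The fall-through layout of the cleanup labels can be mimicked in C++ with a
switch whose cases fall through in the same order. This is only a model of the
control flow: CleanupEntry, runCleanup and the action strings are hypothetical,
and "rethrow" merely stands in for the call to throw(E).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names for the three cleanup entry points in the listing.
enum class CleanupEntry { TryCleanup, BarCleanup, FooCleanup };

// Returns the actions performed when control enters at the given label.
std::vector<std::string> runCleanup(CleanupEntry entry) {
  std::vector<std::string> actions;
  switch (entry) {
  case CleanupEntry::TryCleanup:
    actions.push_back("~C");       // c exists only once the try is reached
    // fall through
  case CleanupEntry::BarCleanup:   // nothing extra to destroy
    // fall through
  case CleanupEntry::FooCleanup:
    actions.push_back("~B");
    actions.push_back("~A");
    actions.push_back("rethrow");  // stands in for "call throw(E)"
  }
  return actions;
}
```

Entering at BarCleanup and entering at FooCleanup yield identical actions,
which is why the notes observe that the two labels can be merged.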

The bazCleanup label is more interesting. Because the exception may be caught
by the try block, we must dispatch to its handler... but the handler does not
exist on the call stack (it does not have a VM Call->Label mapping installed),
so we must dispatch statically with a goto. The bazCleanup code thus appears
as:

bazCleanup:
  d->~D()    // destruct D as it goes out of scope when entering catch clauses
  goto TryHandler

In general, TryHandler is not the same as bazCleanup, because multiple
function calls could be made from the try block. In this case, trivial
optimization could merge the two basic blocks. TryHandler is the code
that actually determines the type of exception, based on the Exception object
itself. For this discussion, assume that the exception object contains *at
least*:

1. A pointer to the RTTI info for the contained object
2. A pointer to the dtor for the contained object
3. The contained object itself

Note that it is necessary to maintain #1 & #2 in the exception object itself
because objects without virtual function tables may be thrown (as in this
example). Assuming this, TryHandler would look something like this:

TryHandler:
  Exception *E = getThreadLocalException();
  switch (E->RTTIType) {
  case IntRTTIInfo:
    ...int Stuff...       // The action to perform from the catch block
    break;
  case DoubleRTTIInfo:
    ...double Stuff...    // The action to perform from the catch block
    goto TryCleanup       // This catch block rethrows the exception
    break;                // Redundant, eliminated by the optimizer
  default:
    goto TryCleanup       // Exception not caught, rethrow
  }

  // Exception was consumed
  if (E->dtor)
    E->dtor(E->object)    // Invoke the dtor on the object if it exists
  goto EndTry             // Continue mainline code...

And that is all there is to it.
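
The exception object and the TryHandler dispatch could be sketched in C++
roughly as follows. The struct layout, the Outcome enum and the tryHandler
function are assumptions for illustration; the notes only say that the object
holds at least the three listed fields. The sketch compares types for exact
equality with std::type_info and does not attempt base-class matching.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

// Hypothetical layout following the three fields listed above.
struct Exception {
  const std::type_info *rtti;  // 1. RTTI info for the contained object
  void (*dtor)(void *);        // 2. destructor, or nullptr if none is needed
  void *object;                // 3. the contained object
};

enum class Outcome { Consumed, Rethrown };

// Mirrors TryHandler: int is consumed, double and anything else is rethrown.
Outcome tryHandler(Exception &e, std::string &log) {
  if (*e.rtti == typeid(int)) {
    log += "int stuff;";
  } else if (*e.rtti == typeid(double)) {
    log += "double stuff;";
    return Outcome::Rethrown;      // corresponds to "goto TryCleanup"
  } else {
    return Outcome::Rethrown;      // exception not caught here
  }
  if (e.dtor) e.dtor(e.object);    // destroy the consumed exception object
  return Outcome::Consumed;        // corresponds to "goto EndTry"
}
```

Keeping the type and destructor pointers in the exception object is what lets
this work for an int or a double, neither of which carries a vtable.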

The throw(E) function would then be implemented like this (which may be
inlined into the caller through standard optimization):

function throw(Exception *E) {
  // Get the start of the stack trace...
  %frame %f = call getStackCurrentFrame()

  // Get the label information that corresponds to it
  label * %L = call getFrameLabel(%f)
  while (%L == 0 && !isFirstFrame(%f)) {
    // Loop until a cleanup handler is found
    %f = call getNextFrame(%f)
    %L = call getFrameLabel(%f)
  }

  if (%L != 0) {
    call setThreadLocalException(E)   // Allow handlers access to this...
    call doNonLocalBranch(%L)
  }
  // No handler found!
  call BlowUp()         // Ends up calling the terminate() method in use
}
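
The search loop in throw can be exercised on a simulated stack. In this sketch
SimFrame, Target and findHandler are hypothetical names, an empty string models
a null label, and the function returns the target instead of branching to it,
since a real doNonLocalBranch cannot be expressed in portable C++.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simulated frame: function name plus the label of its active call, if any.
struct SimFrame {
  std::string function;
  std::string label;  // empty string models "no label" (a null Label*)
};

// Result of the search: index of the frame owning the label, or -1.
struct Target {
  int frame;
  std::string label;
};

// stack[0] is the first activated frame; stack.back() is the current one.
Target findHandler(const std::vector<SimFrame> &stack) {
  assert(!stack.empty());
  int f = static_cast<int>(stack.size()) - 1;  // getStackCurrentFrame()
  std::string label = stack[f].label;          // getFrameLabel(f)
  while (label.empty() && f != 0) {            // f != 0 is !isFirstFrame(f)
    --f;                                       // getNextFrame(f)
    label = stack[f].label;
  }
  if (!label.empty())
    return Target{f, label};  // the real code would doNonLocalBranch here
  return Target{-1, ""};      // no handler: the real code calls BlowUp()
}
```

For a stack of main, TestFunction (with label bazCleanup), baz and throw, the
search skips the two innermost frames and stops at TestFunction.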

That's a brief rundown of how C++ exception handling could be implemented in
LLVM. Java would be very similar, except that its cleanup code is only used to
unlock synchronized blocks, not to destroy data. Also, it uses two stack
walks: a nondestructive walk that builds a stack trace, then a destructive
walk that unwinds the stack as shown here.

It would be trivial to get exception interoperability between C++ and Java.
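
The two Java stack walks mentioned above can be sketched on a simulated stack
as well. JFrame, UnwindResult and throwJavaStyle are hypothetical names, and
popping entries from a vector stands in for real stack unwinding.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct JFrame {
  std::string method;
  bool hasHandler;  // true if this frame's active call carries a label
};

struct UnwindResult {
  std::vector<std::string> trace;  // innermost method first
  bool handled;
};

// stack.back() is the innermost frame. Only phase 2 modifies the stack.
UnwindResult throwJavaStyle(std::vector<JFrame> &stack) {
  UnwindResult r{{}, false};
  // Phase 1: nondestructive walk that records the stack trace.
  for (auto it = stack.rbegin(); it != stack.rend(); ++it)
    r.trace.push_back(it->method);
  // Phase 2: destructive walk that pops frames until a handler is found.
  while (!stack.empty() && !stack.back().hasHandler)
    stack.pop_back();
  r.handled = !stack.empty();
  return r;
}
```

The trace has to be captured first because the frames it describes no longer
exist once the second walk has run.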