third_party/LLVM/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt - SwiftShader - Git at Google

 Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST)
 From: Chris Lattner <sabre@nondot.org>
 To: Vikram Adve <vadve@cs.uiuc.edu>
 Subject: Re: a few thoughts

 Okay... here are a few of my thoughts on this (it's good to know that we
 think so alike!):

 > 1. We need to be clear on our goals for the VM.  Do we want to emphasize
 >    portability and safety like the Java VM?  Or shall we focus on the
 >    architecture interface first (i.e., consider the code generation and
 >    processor issues), since the architecture interface question is also
 >    important for portable Java-type VMs?

 I forsee the architecture looking kinda like this: (which is completely
 subject to change)

 1. The VM code is NOT guaranteed safe in a java sense.  Doing so makes it
    basically impossible to support C like languages.  Besides that,
    certifying a register based language as safe at run time would be a
    pretty expensive operation to have to do.  Additionally, we would like
    to be able to statically eliminate many bounds checks in Java
    programs... for example.

  2. Instead, we can do the following (eventually):
    * Java bytecode is used as our "safe" representation (to avoid
      reinventing something that we don't add much value to).  When the
      user chooses to execute Java bytecodes directly (ie, not
      precompiled) the runtime compiler can do some very simple
      transformations (JIT style) to convert it into valid input for our
      VM.  Performance is not wonderful, but it works right.
    * The file is scheduled to be compiled (rigorously) at a later
      time.  This could be done by some background process or by a second
      processor in the system during idle time or something...
    * To keep things "safe" ie to enforce a sandbox on Java/foreign code,
      we could sign the generated VM code with a host specific private
      key.  Then before the code is executed/loaded, we can check to see if
      the trusted compiler generated the code.  This would be much quicker
      than having to validate consistency (especially if bounds checks have
      been removed, for example)

 >    This is important because the audiences for these two goals are very
 >    different.  Architects and many compiler people care much more about
 >    the second question.  The Java compiler and OS community care much more
 >    about the first one.

 3. By focusing on a more low level virtual machine, we have much more room
    for value add.  The nice safe "sandbox" VM can be provided as a layer
    on top of it.  It also lets us focus on the more interesting compilers
    related projects.

 > 2. Design issues to consider (an initial list that we should continue
 >    to modify).  Note that I'm not trying to suggest actual solutions here,
 >    but just various directions we can pursue:

 Understood.  :)

 >    a. A single-assignment VM, which we've both already been thinking
 >       about.

 Yup, I think that this makes a lot of sense.  I am still intrigued,
 however, by the prospect of a minimally allocated VM representation... I
 think that it could have definite advantages for certain applications
 (think very small machines, like PDAs).  I don't, however, think that our
 initial implementations should focus on this.  :)

 Here are some other auxiliary goals that I think we should consider:

 1. Primary goal: Support a high performance dynamic compilation
    system.  This means that we have an "ideal" division of labor between
    the runtime and static compilers.  Of course, the other goals of the
    system somewhat reduce the importance of this point (f.e. portability
    reduces performance, but hopefully not much)
 2. Portability to different processors.  Since we are most familiar with
    x86 and solaris, I think that these two are excellent candidates when
    we get that far...
 3. Support for all languages & styles of programming (general purpose
    VM).  This is the point that disallows java style bytecodes, where all
    array refs are checked for bounds, etc...
 4. Support linking between different language families.  For example, call
    C functions directly from Java without using the nasty/slow/gross JNI
    layer.  This involves several subpoints:
   A. Support for languages that require garbage collectors and integration
      with languages that don't.  As a base point, we could insist on
      always using a conservative GC, but implement free as a noop, f.e.

 >    b. A strongly-typed VM.  One question is do we need the types to be
 >       explicitly declared or should they be inferred by the dynamic
 >       compiler?

   B. This is kind of similar to another idea that I have: make OOP
      constructs (virtual function tables, class heirarchies, etc) explicit
      in the VM representation.  I believe that the number of additional
      constructs would be fairly low, but would give us lots of important
      information... something else that would/could be important is to
      have exceptions as first class types so that they would be handled in
      a uniform way for the entire VM... so that C functions can call Java
      functions for example...

 >    c. How do we get more high-level information into the VM while keeping
 >       to a low-level VM design?
 >       o  Explicit array references as operands?  An alternative is
 >          to have just an array type, and let the index computations be
 >          separate 3-operand instructions.

    C. In the model I was thinking of (subject to change of course), we
       would just have an array type (distinct from the pointer
       types).  This would allow us to have arbitrarily complex index
       expressions, while still distinguishing "load" from "Array load",
       for example.  Perhaps also, switch jump tables would be first class
       types as well?  This would allow better reasoning about the program.

 5. Support dynamic loading of code from various sources.  Already
    mentioned above was the example of loading java bytecodes, but we want
    to support dynamic loading of VM code as well.  This makes the job of
    the runtime compiler much more interesting:  it can do interprocedural
    optimizations that the static compiler can't do, because it doesn't
    have all of the required information (for example, inlining from
    shared libraries, etc...)

 6. Define a set of generally useful annotations to add to the VM
    representation.  For example, a function can be analysed to see if it
    has any sideeffects when run... also, the MOD/REF sets could be
    calculated, etc... we would have to determine what is reasonable.  This
    would generally be used to make IP optimizations cheaper for the
    runtime compiler...

 >       o  Explicit instructions to handle aliasing, e.g.s:
 >            -- an instruction to say "I speculate that these two values are not
 >               aliased, but check at runtime", like speculative execution in
 >             EPIC?
 >          -- or an instruction to check whether two values are aliased and
 >             execute different code depending on the answer, somewhat like
 >             predicated code in EPIC

 These are also very good points... if this can be determined at compile
 time.  I think that an epic style of representation (not the instruction
 packing, just the information presented) could be a very interesting model
 to use... more later...

 >         o  (This one is a difficult but powerful idea.)
 >          A "thread-id" field on every instruction that allows the static
 >          compiler to generate a set of parallel threads, and then have
 >          the runtime compiler and hardware do what they please with it.
 >          This has very powerful uses, but thread-id on every instruction
 >          is expensive in terms of instruction size and code size.
 >          We would need to compactly encode it somehow.

 Yes yes yes!  :)  I think it would be *VERY* useful to include this kind
 of information (which EPIC architectures *implicitly* encode.  The trend
 that we are seeing supports this greatly:

 1. Commodity processors are getting massive SIMD support:
    * Intel/Amd MMX/MMX2
    * AMD's 3Dnow!
    * Intel's SSE/SSE2
    * Sun's VIS
 2. SMP is becoming much more common, especially in the server space.
 3. Multiple processors on a die are right around the corner.

 If nothing else, not designing this in would severely limit our future
 expansion of the project...

 >          Also, this will require some reading on at least two other
 >          projects:
 >               -- Multiscalar architecture from Wisconsin
 >               -- Simultaneous multithreading architecture from Washington
 >
 >       o  Or forget all this and stick to a traditional instruction set?

 Heh... :)  Well, from a pure research point of view, it is almost more
 attactive to go with the most extreme/different ISA possible.  On one axis
 you get safety and conservatism, and on the other you get degree of
 influence that the results have.  Of course the problem with pure research
 is that often times there is no concrete product of the research... :)

 > BTW, on an unrelated note, after the meeting yesterday, I did remember
 > that you had suggested doing instruction scheduling on SSA form instead
 > of a dependence DAG earlier in the semester.  When we talked about
 > it yesterday, I didn't remember where the idea had come from but I
 > remembered later.  Just giving credit where its due...

 :) Thanks.

 > Perhaps you can save the above as a file under RCS so you and I can
 > continue to expand on this.

 I think it makes sense to do so when we get our ideas more formalized and
 bounce it back and forth a couple of times... then I'll do a more formal
 writeup of our goals and ideas.  Obviously our first implementation will
 not want to do all of the stuff that I pointed out above... be we will
 want to design the project so that we do not artificially limit ourselves
 at sometime in the future...

 Anyways, let me know what you think about these ideas... and if they sound
 reasonable...

 -Chris
	Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST)
	From: Chris Lattner <sabre@nondot.org>
	To: Vikram Adve <vadve@cs.uiuc.edu>
	Subject: Re: a few thoughts

	Okay... here are a few of my thoughts on this (it's good to know that we
	think so alike!):

	> 1. We need to be clear on our goals for the VM. Do we want to emphasize
	> portability and safety like the Java VM? Or shall we focus on the
	> architecture interface first (i.e., consider the code generation and
	> processor issues), since the architecture interface question is also
	> important for portable Java-type VMs?

	I forsee the architecture looking kinda like this: (which is completely
	subject to change)

	1. The VM code is NOT guaranteed safe in a java sense. Doing so makes it
	basically impossible to support C like languages. Besides that,
	certifying a register based language as safe at run time would be a
	pretty expensive operation to have to do. Additionally, we would like
	to be able to statically eliminate many bounds checks in Java
	programs... for example.

	2. Instead, we can do the following (eventually):
	* Java bytecode is used as our "safe" representation (to avoid
	reinventing something that we don't add much value to). When the
	user chooses to execute Java bytecodes directly (ie, not
	precompiled) the runtime compiler can do some very simple
	transformations (JIT style) to convert it into valid input for our
	VM. Performance is not wonderful, but it works right.
	* The file is scheduled to be compiled (rigorously) at a later
	time. This could be done by some background process or by a second
	processor in the system during idle time or something...
	* To keep things "safe" ie to enforce a sandbox on Java/foreign code,
	we could sign the generated VM code with a host specific private
	key. Then before the code is executed/loaded, we can check to see if
	the trusted compiler generated the code. This would be much quicker
	than having to validate consistency (especially if bounds checks have
	been removed, for example)

	> This is important because the audiences for these two goals are very
	> different. Architects and many compiler people care much more about
	> the second question. The Java compiler and OS community care much more
	> about the first one.

	3. By focusing on a more low level virtual machine, we have much more room
	for value add. The nice safe "sandbox" VM can be provided as a layer
	on top of it. It also lets us focus on the more interesting compilers
	related projects.

	> 2. Design issues to consider (an initial list that we should continue
	> to modify). Note that I'm not trying to suggest actual solutions here,
	> but just various directions we can pursue:

	Understood. :)

	> a. A single-assignment VM, which we've both already been thinking
	> about.

	Yup, I think that this makes a lot of sense. I am still intrigued,
	however, by the prospect of a minimally allocated VM representation... I
	think that it could have definite advantages for certain applications
	(think very small machines, like PDAs). I don't, however, think that our
	initial implementations should focus on this. :)

	Here are some other auxiliary goals that I think we should consider:

	1. Primary goal: Support a high performance dynamic compilation
	system. This means that we have an "ideal" division of labor between
	the runtime and static compilers. Of course, the other goals of the
	system somewhat reduce the importance of this point (f.e. portability
	reduces performance, but hopefully not much)
	2. Portability to different processors. Since we are most familiar with
	x86 and solaris, I think that these two are excellent candidates when
	we get that far...
	3. Support for all languages & styles of programming (general purpose
	VM). This is the point that disallows java style bytecodes, where all
	array refs are checked for bounds, etc...
	4. Support linking between different language families. For example, call
	C functions directly from Java without using the nasty/slow/gross JNI
	layer. This involves several subpoints:
	A. Support for languages that require garbage collectors and integration
	with languages that don't. As a base point, we could insist on
	always using a conservative GC, but implement free as a noop, f.e.

	> b. A strongly-typed VM. One question is do we need the types to be
	> explicitly declared or should they be inferred by the dynamic
	> compiler?

	B. This is kind of similar to another idea that I have: make OOP
	constructs (virtual function tables, class heirarchies, etc) explicit
	in the VM representation. I believe that the number of additional
	constructs would be fairly low, but would give us lots of important
	information... something else that would/could be important is to
	have exceptions as first class types so that they would be handled in
	a uniform way for the entire VM... so that C functions can call Java
	functions for example...

	> c. How do we get more high-level information into the VM while keeping
	> to a low-level VM design?
	> o Explicit array references as operands? An alternative is
	> to have just an array type, and let the index computations be
	> separate 3-operand instructions.

	C. In the model I was thinking of (subject to change of course), we
	would just have an array type (distinct from the pointer
	types). This would allow us to have arbitrarily complex index
	expressions, while still distinguishing "load" from "Array load",
	for example. Perhaps also, switch jump tables would be first class
	types as well? This would allow better reasoning about the program.

	5. Support dynamic loading of code from various sources. Already
	mentioned above was the example of loading java bytecodes, but we want
	to support dynamic loading of VM code as well. This makes the job of
	the runtime compiler much more interesting: it can do interprocedural
	optimizations that the static compiler can't do, because it doesn't
	have all of the required information (for example, inlining from
	shared libraries, etc...)

	6. Define a set of generally useful annotations to add to the VM
	representation. For example, a function can be analysed to see if it
	has any sideeffects when run... also, the MOD/REF sets could be
	calculated, etc... we would have to determine what is reasonable. This
	would generally be used to make IP optimizations cheaper for the
	runtime compiler...

	> o Explicit instructions to handle aliasing, e.g.s:
	> -- an instruction to say "I speculate that these two values are not
	> aliased, but check at runtime", like speculative execution in
	> EPIC?
	> -- or an instruction to check whether two values are aliased and
	> execute different code depending on the answer, somewhat like
	> predicated code in EPIC

	These are also very good points... if this can be determined at compile
	time. I think that an epic style of representation (not the instruction
	packing, just the information presented) could be a very interesting model
	to use... more later...

	> o (This one is a difficult but powerful idea.)
	> A "thread-id" field on every instruction that allows the static
	> compiler to generate a set of parallel threads, and then have
	> the runtime compiler and hardware do what they please with it.
	> This has very powerful uses, but thread-id on every instruction
	> is expensive in terms of instruction size and code size.
	> We would need to compactly encode it somehow.

	Yes yes yes! :) I think it would be VERY useful to include this kind
	of information (which EPIC architectures implicitly encode. The trend
	that we are seeing supports this greatly:

	1. Commodity processors are getting massive SIMD support:
	* Intel/Amd MMX/MMX2
	* AMD's 3Dnow!
	* Intel's SSE/SSE2
	* Sun's VIS
	2. SMP is becoming much more common, especially in the server space.
	3. Multiple processors on a die are right around the corner.

	If nothing else, not designing this in would severely limit our future
	expansion of the project...

	> Also, this will require some reading on at least two other
	> projects:
	> -- Multiscalar architecture from Wisconsin
	> -- Simultaneous multithreading architecture from Washington
	>
	> o Or forget all this and stick to a traditional instruction set?

	Heh... :) Well, from a pure research point of view, it is almost more
	attactive to go with the most extreme/different ISA possible. On one axis
	you get safety and conservatism, and on the other you get degree of
	influence that the results have. Of course the problem with pure research
	is that often times there is no concrete product of the research... :)

	> BTW, on an unrelated note, after the meeting yesterday, I did remember
	> that you had suggested doing instruction scheduling on SSA form instead
	> of a dependence DAG earlier in the semester. When we talked about
	> it yesterday, I didn't remember where the idea had come from but I
	> remembered later. Just giving credit where its due...

	:) Thanks.

	> Perhaps you can save the above as a file under RCS so you and I can
	> continue to expand on this.

	I think it makes sense to do so when we get our ideas more formalized and
	bounce it back and forth a couple of times... then I'll do a more formal
	writeup of our goals and ideas. Obviously our first implementation will
	not want to do all of the stuff that I pointed out above... be we will
	want to design the project so that we do not artificially limit ourselves
	at sometime in the future...

	Anyways, let me know what you think about these ideas... and if they sound
	reasonable...

	-Chris