The Enabling Optimization

Our optimization replaces each dynamic function call with a switch statement and a set of static function calls. Since existing compilers can analyze static functions calls, the compiler is in a better position to perform other optimizations after our optimization is applied. The replacement has the following steps:

Step 1.
Compute the class hierarchy to determine the set of functions that each virtual function call may invoke.

Step 2.
Replace each virtual function call with a switch statement and a set of static function calls. The switch statement uses the type of the receiving object to determine which function is called. The set of static calls used in the switch statement is determined by the following three rules:

Rule 1.
Based on the declared type of a pointer in C++, the set of functions that may be invoked by a virtual function call-site is limited. For example, consider a virtual function call to function f made though a pointer to class C. This call may result only in a call to the version of f declared in class C or any class derived from C.

Rule 2.
Classes that call the same virtual function are consolidated. For example, if B and C both derive from A, and B and C do not implement their own version of A::f, then A, B, and C all call A::f. Putting all three classes into one branch of the case statement shortens the code, and makes the compiler's jump table smaller.

Rule 3.
Any class that contains pure virtual functions can be eliminated from the switch statement. C++ forbids the creation of objects of a class that contains a pure virtual function; thus, there will never be a call to a function in such a class.

It is possible that, after applying these rules, only a single case remains. If so, we replace the switch statement with a single static call.

Example. Figure 2 shows the transformed functions Base::bar(), Sub1::bar(), and process() from Figure 1. Using Rule 2, the cases for Base and Sub2 in function process() are consolidated because Sub2 does not implement its own version of bar(). Also, since Sub1 has no subclasses, Rule 1 leaves a case statement with only one arm. After further optimization, this becomes the static call sub1::foo().

Like many interprocedural optimizations, ours is most effective when the entire source is analyzed at one time. Complete source allows the optimization to obtain complete information about class hierarchies and the use of virtual functions. It is possible to optimize procedures compiled separately, with some loss in the effectiveness. It is also possible to perform the optimization at link time; however, some local optimizations, such as constant propagation, that we hope to enable would be missed. An alternative is the use of a programming environment that incrementally computes summary information for large systems (often as a background task); thus, reducing the need to examine the entire program on each compile without sacrificing the effectiveness of the optimization. Such an environment for SELF programs is described in [8].


Copyright © 1995, 1996 Bradley M. Kuhn, David W. Binkley.

Verbatim copying and distribution of this entire paper is permitted in any medium, provided this notice is preserved.