The JVM is indeed generalized insofar as it provides a low-level, object-oriented, platform-independent assembler. However, most compilers today are not written to target a specific assembler directly. Usually, a higher-level IR is used to represent the program. Since the IR is simpler to work with than the assembler, ports of new languages are easier. For example, the GNU Compiler Collection, GCC, was one of the first compilers to use such an IR extensively. GCC now supports six different languages on the front-end, and dozens of architectures on the back-end [24].
As discussed in Chapter 3, perl does have its own IR. Chapter 4 discussed how this IR was used to generate Jasmin assembler directly. It was discovered, however, that perl's IR simply did not map well onto the JVM.
In hindsight, this is not surprising. The perl IR was not designed, as GCC's was, to ease the burden of creating new front-ends and back-ends. In fact, perl's IR was actually designed specifically to work with and depend on the PVM. Thus, it makes sense that using perl's IR to port to new architectures would be difficult.
Given this reality, the next step is to find a way to leverage the useful perl front-end IR so that it facilitates a port to the JVM. The solution proposed here is to use a second IR that is specifically designed to function with the JVM. A translator can then be written that converts perl's IR into this second IR.
This approach is easier because an IR designed to be general will have better facilities for implementing various language features. For example, features like lexically scoped variables and anonymous subroutines (i.e., lambda) are common in many languages. If the IR supports these features, translating from perl's IR to the new IR will be easier. Even for those features that are unique to Perl, a good IR would provide facilities (better than those provided on the bare JVM) to implement them.
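To make the idea of translating one IR into another concrete, the following is a minimal Java sketch. Every name in it (`SourceOp`, `TargetExpr`, `IrTranslator`, and the opcode strings) is invented for illustration; none of it is taken from perl's actual op tree or from any real target IR. It only shows the shape of the work: when the target IR already has a node for a feature such as a lexical variable, each source node maps onto it in a line or two.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical node types, invented for illustration only.

/** A node in a stand-in "source" IR: an opcode name, a payload, and children. */
final class SourceOp {
    final String name;
    final String payload;
    final List<SourceOp> kids = new ArrayList<>();

    SourceOp(String name, String payload, SourceOp... children) {
        this.name = name;
        this.payload = payload;
        this.kids.addAll(Arrays.asList(children));
    }
}

/** Nodes of a stand-in general-purpose "target" IR. */
interface TargetExpr {
    String show();
}

final class Literal implements TargetExpr {
    private final String text;
    Literal(String text) { this.text = text; }
    public String show() { return text; }
}

/** The target IR supports lexical variables directly. */
final class LexicalRef implements TargetExpr {
    private final String variable;
    LexicalRef(String variable) { this.variable = variable; }
    public String show() { return variable; }
}

final class LexicalSet implements TargetExpr {
    private final String variable;
    private final TargetExpr value;
    LexicalSet(String variable, TargetExpr value) {
        this.variable = variable;
        this.value = value;
    }
    public String show() { return "(set " + variable + " " + value.show() + ")"; }
}

final class IrTranslator {
    /** Recursively maps each source node onto the matching target node. */
    static TargetExpr translate(SourceOp op) {
        switch (op.name) {
            case "const":
                return new Literal(op.payload);
            case "lexical_read":
                return new LexicalRef(op.payload);
            case "lexical_assign":
                return new LexicalSet(op.payload, translate(op.kids.get(0)));
            default:
                // Features without a direct counterpart would need extra work here.
                throw new IllegalArgumentException("no mapping for op: " + op.name);
        }
    }
}
```

The `default` branch marks where features unique to the source language would need additional support built from whatever facilities the target IR offers.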
Kawa's IR can serve as this new IR. Originally designed for Scheme, Kawa's IR has been generalized to support basic features that are common in many dynamically typed, very high-level languages. It also has extensible parts. For example, the user of the IR can implement a class that controls variable binding lookup. Yet, the IR's object-oriented interface hides the details of how that variable binding lookup operates internally. This feature alone can help simplify one of Perl's most complex features: tied variables.
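The following Java sketch illustrates why a pluggable binding lookup helps with tied variables. It does not use Kawa's actual classes: `Binding`, `PlainBinding`, `TiedBinding`, `TieHandler`, and `UpperCaseHandler` are all hypothetical names chosen for this example. The assumption is only that code operating on a variable goes through one interface, so a tied variable can forward each read and write to a handler (in the spirit of Perl's FETCH and STORE methods) without the calling code changing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical names throughout; this is not Kawa's actual API.

/** What compiled code sees: a slot that can be read and written. */
interface Binding {
    Object get();
    void set(Object value);
}

/** An ordinary variable: the value lives in the binding itself. */
final class PlainBinding implements Binding {
    private Object value;
    public Object get() { return value; }
    public void set(Object value) { this.value = value; }
}

/** Stand-in for the object behind a Perl tied scalar (FETCH/STORE). */
interface TieHandler {
    Object fetch();
    void store(Object value);
}

/** A tied variable: every access is forwarded to the handler. */
final class TiedBinding implements Binding {
    private final TieHandler handler;
    TiedBinding(TieHandler handler) { this.handler = handler; }
    public Object get() { return handler.fetch(); }
    public void set(Object value) { handler.store(value); }
}

/** Example handler that upper-cases stored strings and logs each access. */
final class UpperCaseHandler implements TieHandler {
    final List<String> log = new ArrayList<>();
    private String current = "";

    public Object fetch() {
        log.add("FETCH");
        return current;
    }

    public void store(Object value) {
        log.add("STORE");
        current = String.valueOf(value).toUpperCase(Locale.ROOT);
    }
}

final class TiedBindingDemo {
    /** Code for "assign, then read back" does not care which Binding it gets. */
    static Object assignThenRead(Binding b, Object v) {
        b.set(v);
        return b.get();
    }

    public static void main(String[] args) {
        System.out.println(assignThenRead(new PlainBinding(), "camel"));
        System.out.println(assignThenRead(new TiedBinding(new UpperCaseHandler()), "camel"));
    }
}
```

In this sketch, `assignThenRead` is identical for both kinds of variable; only the `Binding` implementation differs, which is the property that keeps the tied case from leaking into the rest of the translation.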
Copyright © 2000, 2001 Bradley M. Kuhn.
Verbatim copying and distribution of this entire thesis is permitted in any medium, provided this notice is preserved.