Over the last few years, Java has rapidly become one of the most popular general purpose object-oriented (OO) programming languages. Java programs are compiled into class files which include type information and platform independent bytecode instructions. On a specific platform, a runtime system called a virtual machine loads and links class files then executes bytecode instructions. The virtual machine collaborates with the standard class libraries to provide key services to Java programs, including threads and synchronization, automatic memory management (garbage collection), safety features (array bound checks, null pointer detection, code verification), reflection, dynamic class loading, and more.3
Early Java virtual machines were simple bytecode interpreters. Soon, the quest for efficiency led to the addition of Just-In-Time compilers (JIT) to virtual machines, an idea formerly developed for other OO runtime systems like Smalltalk-80 and Self-91. In a few words, a just-in-time compiler works by compiling bytecodes to machine specific code on the first invocation of a method. JITs range from the very naive, that use templates to replace each bytecode with a fixed sequence of native code instructions (early versions of Kaffe did this), to the very sophisticated that perform register allocation, instruction scheduling and other scalar optimizations (e.g. [8,23,29,32]).
JITs face two major problems. First, they strive to generate good code in very little time, as compile time is lost to the running application. Second, the code of compiled method resides in memory; this augments the pressure on the memory manager and garbage collector. Recent virtual machines try to overcome these problems. The main trend is to use dynamic strategies to find hot execution paths, and only optimize these areas (e.g. [4,10,16]). HotSpot, for example, is a mixed interpreter and compiler environment. It only compiles and optimizes hot spots. Jalapeno[10,11], on the other hand, always compiles methods (naively at first), then uses adaptive online feedback to recompile and optimize hot methods. These techniques are particularly suited to virtual machines executing long running programs in server environments. The optimizer can be relatively slow and consist of a fully fledged optimizing compiler using intermediate representations and performing costly aggressive optimizations, as compile time will be amortized on the overall running time.
Our research complements these approaches by exploring opportunities for making the virtual machine execute efficiently. Rather than looking at fine grain techniques, like register allocation and instruction scheduling, we address the fundamental problem of data layout in a dynamic Java environment. While Java shares many properties with other object-oriented languages, the set of runtime constraints enforced by the verifier and the basic services provided for each object (hash code, locking) are unique. This leads us to revisit traditional data structures used in object-oriented runtime environments, and adapt them to fully take advantage of the properties of the Java runtime environment.
As a testbed for evaluating our proposed data structures and algorithms, we are designing and implementing SableVM, a standards conforming open-source virtual machine. Written in the C programming language, and depending on the POSIX application programming interface (API), it is meant as a small and portable interpreter4. It can be used as an experimental framework for extending the bytecode language. It can also be used as an efficient virtual machine for embedded systems, or as a profiling interpreter in a hybrid interpreter/just-in-time optimizing-compiler environment.
The remaining part of this document is structured as follows. Section 2, we state the contributions of this paper. In section 3, we give an overview of the SableVM framework. In section 4, we describe SableVM's threaded interpreter. In section 5, we introduce our classification of virtual machine memory. In section 6, we introduce our new layouts for object instances and virtual tables, and our improved thin locks. In section 7, we discuss our proposed experiments. Finally, in section 8, we present our conclusions.