SableVM's interpreter is a threaded interpreter. Pure bytecode interpreters suffer from expensive dispatch costs: on every iteration, the dispatch loop fetches the next bytecode, looks up the associated implementation address in a table (explicitly, or through a switch statement), then transfers control to that address. Direct threading reduces this overhead: in the executable code stream, each bytecode is replaced by the address of its associated implementation. In addition, each bytecode implementation ends with the code required to dispatch the next opcode. This is illustrated in figure 2. Direct threading eliminates the table lookup and the central dispatch loop (and with it the branch instruction back to the head of the loop). As these operations are expensive on modern processors, this technique has been shown to be quite effective [20,27].
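To make the difference concrete, the following minimal sketch contrasts the two dispatch styles on a hypothetical three-instruction stack machine. It is not SableVM's code: the opcodes, the `cell` type, and the function names are assumptions made for illustration, and the threaded version relies on the GNU C labels-as-values extension (supported by GCC and Clang), which is not part of ISO C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical three-instruction stack machine, for illustration only. */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Bytecode for: push 2; push 40; add; halt */
static const unsigned char demo_bytecode[] = {
    OP_PUSH, 2, OP_PUSH, 40, OP_ADD, OP_HALT
};

/* Baseline: switch-based dispatch. Each iteration fetches an opcode,
   selects the matching case, then branches back to the head of the loop. */
static intptr_t run_switch(const unsigned char *pc)
{
    intptr_t stack[16], *sp = stack;
    for (;;) {
        switch (*pc++) {
        case OP_PUSH: *sp++ = *pc++;          break;
        case OP_ADD:  sp--; sp[-1] += sp[0];  break;
        case OP_HALT: return sp[-1];
        default:      assert(!"unknown opcode"); return -1;
        }
    }
}

/* One cell of threaded code: an implementation address or an inline operand. */
typedef union { void *addr; intptr_t value; } cell;

/* Direct threading: the code stream holds implementation addresses, and
   every implementation ends by dispatching the next one itself. */
static intptr_t run_threaded(void)
{
    /* Threaded form of demo_bytecode, written out by hand here. */
    cell code[] = {
        { .addr = &&do_push }, { .value = 2 },
        { .addr = &&do_push }, { .value = 40 },
        { .addr = &&do_add },
        { .addr = &&do_halt },
    };
    cell *pc = code;
    intptr_t stack[16], *sp = stack;

#define NEXT() goto *((pc++)->addr)   /* no table lookup, no central loop */

    NEXT();
do_push:
    *sp++ = (pc++)->value;
    NEXT();
do_add:
    sp--;
    sp[-1] += sp[0];
    NEXT();
do_halt:
    return sp[-1];
#undef NEXT
}
```

Both `run_switch(demo_bytecode)` and `run_threaded()` evaluate 2 + 40 and return 42. In the threaded version, the only remaining dispatch work is the indirect jump at the end of each implementation.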
Method bodies are translated to threaded code on their first invocation. We take advantage of this translation to perform some optimizations. For example, we precompute
absolute branch destinations, we translate overloaded bytecodes such as
GET_FIELD to separate implementation addresses (GET_FIELD_INT, GET_FIELD_FLOAT, ...),
and we inline constant pool references to direct operand values.
This one-pass translation is much simpler than the translation done by even the most naive just-in-time compiler, as each bytecode maps to an address, not a variable-sized implementation. However, unlike a JIT, the threaded interpreter still pays the cost of an instruction dispatch for each bytecode. Piumarta has shown a technique to eliminate this overhead within a basic block using selective inlining in a portable manner, at the cost of additional memory6. SableVM implements this technique optionally through a compile-time flag, as it might not be appropriate for systems with little memory.
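The one-pass translation described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not SableVM's implementation: the bytecodes (`BC_LDC`, `BC_DEC`, `BC_JNZ`, `BC_HALT`), the `cell` union, and the functions `interpret`, `translate`, and `run_countdown` are all hypothetical, and the code again assumes the GNU C labels-as-values extension. It shows two of the optimizations mentioned: a constant pool reference is replaced by the constant itself, and a relative branch offset is replaced by an absolute address in the threaded code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bytecodes for illustration; not SableVM's instruction set. */
enum { BC_LDC, BC_DEC, BC_JNZ, BC_HALT, BC_COUNT };

/* A threaded-code cell: implementation address, inlined operand,
   or absolute branch destination. */
typedef union cell {
    void       *addr;
    intptr_t    value;
    union cell *target;
} cell;

/* Direct-threaded interpreter. When table is non-NULL it only exports the
   implementation addresses, since labels are visible only in this function. */
static intptr_t interpret(cell *pc, void **table)
{
    intptr_t stack[8], *sp = stack;

    if (table != NULL) {
        table[BC_LDC]  = &&do_push;
        table[BC_DEC]  = &&do_dec;
        table[BC_JNZ]  = &&do_jnz;
        table[BC_HALT] = &&do_halt;
        return 0;
    }

#define NEXT() goto *((pc++)->addr)
    NEXT();
do_push:                /* the constant was inlined by the translator */
    *sp++ = (pc++)->value;
    NEXT();
do_dec:
    sp[-1]--;
    NEXT();
do_jnz:                 /* the operand is already an absolute address */
    pc = (sp[-1] != 0) ? pc->target : pc + 1;
    NEXT();
do_halt:
    return sp[-1];
#undef NEXT
}

/* One-pass translation: every bytecode byte becomes exactly one cell,
   so cell indices equal bytecode indices and branches resolve directly. */
static void translate(const signed char *bc, int len,
                      const intptr_t *pool, cell *out)
{
    void *table[BC_COUNT];
    interpret(NULL, table);

    for (int i = 0; i < len; i++) {
        int op = bc[i];
        assert(op >= 0 && op < BC_COUNT);
        out[i].addr = table[op];
        switch (op) {
        case BC_LDC:    /* inline the constant pool entry as an operand */
            out[i + 1].value = pool[bc[i + 1]];
            i++;
            break;
        case BC_JNZ:    /* offset relative to the opcode -> absolute cell */
            out[i + 1].target = &out[i + bc[i + 1]];
            i++;
            break;
        default:
            break;
        }
    }
}

/* Usage: translate a small countdown loop, then execute it. */
static intptr_t run_countdown(intptr_t start)
{
    /* ldc #0; loop: dec; jnz loop; halt */
    const signed char bytecode[] = { BC_LDC, 0, BC_DEC, BC_JNZ, -1, BC_HALT };
    cell code[sizeof bytecode];

    assert(start > 0);
    translate(bytecode, (int)sizeof bytecode, &start, code);
    assert(code[4].target == &code[2]);   /* branch is now absolute */
    return interpret(code, NULL);
}
```

Because each bytecode maps to a single address-sized cell, the translator needs no second pass to fix up branch targets in this sketch; a JIT emitting variable-sized native code would have to track where each instruction lands.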