What should JITed bytecode do exactly?

Question

I'm working on a VM (and a scripting language for it) that I plan to add JIT compilation to. I'm only working on the "plumbing" right now, but I don't want the JIT compiler to be an afterthought. While I understand the fundamentals, I'm a bit confused about what exactly the JIT should do.

There are two ways I could think of it being implemented:

1. Translate the bytecode into "proper" x86 just like any compiler would, thus eliminating the interpreter/VM part.
2. Translate the bytecode into x86 that calls VM functions to tell it what to do, thus eliminating the interpreter's opcode decoding step and going straight to calling the internal VM functions.

The first method would be difficult to implement because it requires not only knowing how to build a full-blown compiler, but also being able to compile a high-level language that normally relies on VM functionality into native code.

The second method would be much simpler to implement, as you're not actually compiling the program; you're just dynamically generating x86 instructions that make a sequence of C function calls (into the internal VM) with the corresponding operands, in the same order that an interpreter would otherwise have had to "decode".
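To make the second method concrete, here is a rough sketch in Python rather than real x86 generation. The toy stack VM, its opcodes (PUSH, ADD, MUL), and all function names are hypothetical. It contrasts a decode-every-time interpreter loop with a one-time translation into a flat list of calls to the same internal VM functions, which is roughly what the generated x86 would amount to.

```python
# Toy stack VM; opcode names and layout are hypothetical, for illustration only.
PUSH, ADD, MUL = 0, 1, 2


class VM:
    def __init__(self):
        self.stack = []

    # "Internal VM functions" shared by both execution strategies.
    def op_push(self, value):
        self.stack.append(value)

    def op_add(self, _operand=None):
        b, a = self.stack.pop(), self.stack.pop()
        self.stack.append(a + b)

    def op_mul(self, _operand=None):
        b, a = self.stack.pop(), self.stack.pop()
        self.stack.append(a * b)


def interpret(vm, code):
    """Classic loop: every executed instruction pays for a decode step."""
    for opcode, operand in code:
        if opcode == PUSH:
            vm.op_push(operand)
        elif opcode == ADD:
            vm.op_add()
        elif opcode == MUL:
            vm.op_mul()
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return vm.stack[-1]


def predecode(code):
    """Method 2 analogue: decode once into (handler, operand) pairs."""
    handlers = {PUSH: VM.op_push, ADD: VM.op_add, MUL: VM.op_mul}
    return [(handlers[opcode], operand) for opcode, operand in code]


def run_predecoded(vm, calls):
    """Execution is now a flat sequence of calls with no opcode dispatch."""
    for handler, operand in calls:
        handler(vm, operand)
    return vm.stack[-1]


# Bytecode for (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None)]
print(interpret(VM(), program))
print(run_predecoded(VM(), predecode(program)))
```

Both paths do the same work inside the VM functions; the only thing removed is the per-instruction dispatch, which is why the expected gain from this approach alone is limited.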

However, while the second clearly seems more sane to implement, I'm not sure how much it would affect the performance of the program, for better or worse. What direction should I aim for? Any notable pros and cons?

score 2 · Answer 1 · answered Jul 25 '15 at 11:59

You're right that method 2 won't give you a huge performance boost over a simple bytecode interpreter loop. The real gains are to be made by using method 1. That said, method 1 isn't as hard these days as it used to be, as there are libraries that can help.

One interesting approach that should be relatively easy and might give good performance is to use LLVM to implement method 2, and use an existing LLVM-based compiler (e.g. clang) to compile the functions you're calling to LLVM IR rather than native code. Then run LLVM's optimization passes over the result, particularly the function inliner. This should produce reasonably good native code without much complexity.
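As a toy analogue of what the inliner buys you (this is plain Python, not LLVM, and every name in it is hypothetical), the sketch below pastes each handler's body directly into generated source instead of emitting a call, then hands the whole function to the host compiler as one unit. In the LLVM version the inlining happens on IR, after which later passes can optimize across instruction boundaries.

```python
# Hypothetical toy opcodes for a stack machine.
PUSH, ADD, MUL = 0, 1, 2

# Handler bodies as source templates; "inlining" pastes them in place of a call.
TEMPLATES = {
    PUSH: "stack.append({operand!r})",
    ADD: "b = stack.pop(); a = stack.pop(); stack.append(a + b)",
    MUL: "b = stack.pop(); a = stack.pop(); stack.append(a * b)",
}


def compile_inlined(code, name="jitted"):
    """Emit one function whose body is the inlined handlers, then compile it."""
    lines = [f"def {name}():", "    stack = []"]
    for opcode, operand in code:
        lines.append("    " + TEMPLATES[opcode].format(operand=operand))
    lines.append("    return stack[-1]")
    source = "\n".join(lines)

    namespace = {}
    exec(compile(source, f"<{name}>", "exec"), namespace)
    return namespace[name], source


# Bytecode for (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None)]
jitted, source = compile_inlined(program)
print(source)
print(jitted())
```

The generated function contains no calls back into per-opcode handlers, which is the property the inliner gives you when the handlers are available as IR instead of opaque native code.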

score 0 · Answer 2 · answered Jul 25 '15 at 13:06

I would recommend using an existing JIT library such as GCCJIT, libjit, LLVM, or asmjit. You could even consider translating your bytecode to C, compiling that C code into a shared library at runtime, and dlopen-ing that plugin.
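A minimal sketch of the translate-to-C idea, assuming a hypothetical toy stack bytecode (PUSH, ADD, MUL) and a made-up function name `bytecode_to_c`. It only produces the C source text; compiling and loading it is left out because that depends on the platform toolchain.

```python
# Hypothetical toy opcodes for a stack machine.
PUSH, ADD, MUL = 0, 1, 2


def bytecode_to_c(code, name="jitted"):
    """Translate toy bytecode into the source of one C function."""
    lines = [
        f"long {name}(void) {{",
        # The stack can never be deeper than the number of instructions.
        f"    long stack[{max(1, len(code))}];",
        "    int sp = 0;",
    ]
    for opcode, operand in code:
        if opcode == PUSH:
            lines.append(f"    stack[sp++] = {int(operand)}L;")
        elif opcode == ADD:
            lines.append("    sp--; stack[sp - 1] += stack[sp];")
        elif opcode == MUL:
            lines.append("    sp--; stack[sp - 1] *= stack[sp];")
        else:
            raise ValueError(f"unknown opcode {opcode}")
    lines.append("    return stack[sp - 1];")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Bytecode for (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None)]
c_source = bytecode_to_c(program)
print(c_source)
```

The resulting text would then be written to a file, built with something like `cc -shared -fPIC`, and loaded with `dlopen` and `dlsym`. The C compiler does the real optimization work, at the cost of noticeable compile latency.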

You'll need to understand some compilation and optimization techniques to do so (in particular because your bytecode might be semantically far from the internal representations needed by the JIT library that you'll use).
