As you might already know, compiling straight to native machine code is old-fashioned. Modern compilers compile to some intermediate format such as MSIL (for .NET) or bytecode (for Java). This intermediate language looks a bit like assembly code but, unlike assembly language, it preserves metadata and actually knows about objects, classes, interfaces, types, etc. When we want to run this intermediate language code we basically have three options:
- Buy/create a processor that understands the intermediate language (there actually are Java processors)
- Interpret the intermediate language with an interpreter (the old Java approach)
- Translate the intermediate language code into native machine code (the new Java and .NET approach)
The first two should be obvious; the third one should sound odd. If you're going to translate the IL (Intermediate Language) to native machine code in a separate step anyway, why not do it right away? Well, there are a couple of reasons for that:
- This ought to be a popular one: laziness of the compiler writer. Writing a compiler is hard enough, but if you also have to do optimizations in the compiler (you don't HAVE to, but people expect well-performing software) it gets even harder. And face it, you're not as smart as the people who wrote that great Intel C compiler. There will always be optimizations you won't be able to do. So why not have one single system that takes this IL (which is fairly easy to generate), applies the most absurd optimization algorithms (even at runtime) and executes it? You'd only need one such system per platform.
Well, this "system" is what we call the runtime environment: the CLR (Common Language Runtime) for .NET or the JVM (Java Virtual Machine) for Java.
- Manageability and security. The problem with native machine code is that you have no idea what it does. The OS loads the code into memory, puts the program counter at the start and gives it a go. You can never know what it is doing. With IL, the information about what the code does is preserved, so the runtime can check what it is about to do. If an "IL program" wants access to some resource, we can now fully control whether it may do that and under what circumstances (did the code come from a safe place? The internet? An intranet? A CD-ROM?). This is why we call MSIL and Java bytecode managed code.
- Platform independence. The IL is platform independent. If we want to run it on Linux, the runtime environment (CLR or JVM) can generate Linux-specific machine code. If you run it on a 64-bit Opteron processor, the runtime environment can optimize for that, transparently.
- New optimization opportunities. I've already mentioned this a bit, but because the runtime environment can see what the IL is actually doing while it runs, it can optimize the generated machine code for that. For example, if it notices that some routine is calling another routine an awful lot, maybe that call can be inlined, which would improve its performance (method calls are expensive). At compile time you only have a limited view of what will be happening at runtime. You can only guess whether some routine will call another often enough to make inlining worth it. Anyway, lots of optimization opportunities there. A funny effect of this is that a .NET or Java application can get faster the longer it runs. It is self-optimizing, how cool is that?
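To give a feel for that last point, here is a minimal sketch of how a runtime might notice that one routine calls another "an awful lot". Everything in it is made up for illustration: the class name `CallProfiler`, the routine names and the threshold of 1000 are assumptions, not values taken from the CLR, a JVM or Mono. A real runtime keeps far cheaper counters and weighs more than just the call count.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical profiler sketch; names and threshold are invented for illustration.
final class CallProfiler {
    // Assumed cut-off, not a real CLR/JVM/Mono value.
    static final int INLINE_THRESHOLD = 1000;

    // Number of observed calls per "caller->callee" pair.
    private final Map<String, Integer> callCounts = new HashMap<>();

    // The runtime would invoke this every time `caller` calls `callee`.
    void recordCall(String caller, String callee) {
        callCounts.merge(caller + "->" + callee, 1, Integer::sum);
    }

    // True once the call has been seen often enough to be worth inlining.
    boolean shouldInline(String caller, String callee) {
        return callCounts.getOrDefault(caller + "->" + callee, 0) >= INLINE_THRESHOLD;
    }

    public static void main(String[] args) {
        CallProfiler profiler = new CallProfiler();
        // A hot call inside a loop...
        for (int i = 0; i < 5000; i++) {
            profiler.recordCall("renderFrame", "clamp");
        }
        // ...and a call that happens exactly once.
        profiler.recordCall("main", "loadConfig");

        System.out.println(profiler.shouldInline("renderFrame", "clamp")); // true
        System.out.println(profiler.shouldInline("main", "loadConfig"));   // false
    }
}
```

A compiler that runs ahead of time has to guess which of those two calls is the hot one; the runtime can simply count.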
The process of compiling some IL format to native machine code "on the fly" is called JIT (just-in-time) compilation.
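The difference between interpreting IL and translating it can be sketched in a few lines. Note the assumptions: the tiny stack-based IL below (`PUSH_ARG`, `PUSH_CONST`, `ADD`, `MUL`) is invented for this example and is neither MSIL nor Java bytecode, and the "translation" builds composed Java lambdas rather than native machine code, which is what a real JIT would emit. The shape of the idea is the same though: decode the instructions once, then run the result as often as you like.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.IntUnaryOperator;

// Toy stack-based IL, invented for this sketch; it is not MSIL or Java bytecode.
final class ToyIl {
    enum Op { PUSH_ARG, PUSH_CONST, ADD, MUL }

    static final class Instr {
        final Op op;
        final int operand; // only used by PUSH_CONST
        Instr(Op op, int operand) { this.op = op; this.operand = operand; }
    }

    // Interpreting: every call decodes every instruction again.
    static int interpret(Instr[] code, int arg) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Instr in : code) {
            switch (in.op) {
                case PUSH_ARG:   stack.push(arg); break;
                case PUSH_CONST: stack.push(in.operand); break;
                case ADD:        stack.push(stack.pop() + stack.pop()); break;
                case MUL:        stack.push(stack.pop() * stack.pop()); break;
            }
        }
        return stack.pop();
    }

    // Translating: decode once, up front, and return something directly callable.
    // A real JIT would emit native machine code here; lambdas stand in for it.
    static IntUnaryOperator translate(Instr[] code) {
        Deque<IntUnaryOperator> stack = new ArrayDeque<>();
        for (Instr in : code) {
            switch (in.op) {
                case PUSH_ARG:
                    stack.push(x -> x);
                    break;
                case PUSH_CONST: {
                    final int c = in.operand;
                    stack.push(x -> c);
                    break;
                }
                case ADD: {
                    final IntUnaryOperator r = stack.pop(), l = stack.pop();
                    stack.push(x -> l.applyAsInt(x) + r.applyAsInt(x));
                    break;
                }
                case MUL: {
                    final IntUnaryOperator r = stack.pop(), l = stack.pop();
                    stack.push(x -> l.applyAsInt(x) * r.applyAsInt(x));
                    break;
                }
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // Computes (arg + 2) * 3.
        Instr[] code = {
            new Instr(Op.PUSH_ARG, 0),
            new Instr(Op.PUSH_CONST, 2),
            new Instr(Op.ADD, 0),
            new Instr(Op.PUSH_CONST, 3),
            new Instr(Op.MUL, 0),
        };
        IntUnaryOperator compiled = translate(code);
        System.out.println(interpret(code, 5));     // 21
        System.out.println(compiled.applyAsInt(5)); // 21
    }
}
```

Both paths give the same answer; the translated version just doesn't pay for the `switch` on every call.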
Got that? Then you're ready to learn about Mono's new JIT compiler ;).