Saturday, 30 July 2011

Java Virtual Machine

 


Overview of Java Virtual Machine (JVM) architecture. Source code is compiled down to Java bytecode. Any platform running a JVM can execute Java bytecode. Bytecode is verified, then interpreted or JIT-compiled for the native architecture. The Java APIs and JVM together make up the Java Runtime Environment (JRE).
A Java Virtual Machine (JVM) is a virtual machine capable of executing Java bytecode. Sun Microsystems states there are over 4.5 billion JVM-enabled devices.[1]

Overview

A Java Virtual Machine is a piece of software that is implemented on non-virtual hardware and on standard operating systems. A JVM provides an environment in which Java bytecode can be executed, enabling such features as automated exception handling, which provides "root-cause" debugging information for every software error (exception), independent of the source code. A JVM is distributed along with a set of standard class libraries that implement the Java application programming interface (API). The JVM and these class libraries, bundled together, form the Java Runtime Environment (JRE).
JVMs are available for many hardware and software platforms. The use of the same bytecode for all JVMs on all platforms allows Java to be described as a "compile once, run anywhere" programming language, as opposed to "write once, compile anywhere", which describes cross-platform compiled languages. Thus, the JVM is a crucial component of the Java platform.
Java bytecode is an intermediate language which is typically compiled from Java, but it can also be compiled from other programming languages. For example, Ada source code can be compiled to Java bytecode and executed on a JVM.
Oracle, the owner of Java, produces a JVM, but JVMs using the "Java" trademark may be developed by other companies as long as they adhere to the JVM specification published by Oracle and to related contractual obligations.

Execution environment

Java's execution environment is termed the Java Runtime Environment, or JRE.
Programs intended to run on a JVM must be compiled into a standardized portable binary format, which typically comes in the form of .class files. A program may consist of many classes in different files. For easier distribution of large programs, multiple class files may be packaged together in a .jar file (short for Java archive).
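As a rough illustration of the "standardized portable binary format", the sketch below reads the first few bytes of a class file. It relies on the class file layout defined in the JVM specification: a 4-byte magic number 0xCAFEBABE followed by a 2-byte minor and a 2-byte major version. The class name ClassFileHeader and the choice of java.lang.Object as the file to inspect are assumptions made purely for this example.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileHeader {

    /** Returns {magic, minorVersion, majorVersion} read from Object.class. */
    static int[] readHeader() {
        // Class files of the standard library can be read as ordinary resources.
        try (InputStream in = Object.class.getResourceAsStream("/java/lang/Object.class")) {
            if (in == null) {
                throw new IllegalStateException("Object.class is not readable as a resource");
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();           // u4 magic
            int minor = data.readUnsignedShort(); // u2 minor_version
            int major = data.readUnsignedShort(); // u2 major_version
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader();
        System.out.printf("magic = 0x%08X, class file version = %d.%d%n",
                header[0], header[2], header[1]);
    }
}
```

The magic number is the same on every platform, while the reported version depends on which Java release produced the class file.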
The Java application launcher, java, offers a standard way of executing Java code. Compare javaw.[2]
The JVM runtime executes .class or .jar files, emulating the JVM instruction set by interpreting it, or using a just-in-time compiler (JIT) such as Oracle's HotSpot. JIT compiling, not interpreting, is used in most JVMs today to achieve greater speed. There are also ahead-of-time compilers that enable developers to precompile class files into native code for particular platforms.
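Whether a particular JVM uses a JIT compiler can be inspected at run time. The following minimal sketch uses the standard java.lang.management API; the class name JitInfo and the wording of the fallback message are assumptions made for illustration, and the reported names differ between JVM implementations.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitInfo {

    /** Describes the JIT compiler of the running JVM, if it reports one. */
    static String describeJit() {
        // getCompilationMXBean() returns null when the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            return "no JIT compiler reported";
        }
        String description = jit.getName();
        if (jit.isCompilationTimeMonitoringSupported()) {
            description += " (" + jit.getTotalCompilationTime() + " ms spent compiling so far)";
        }
        return description;
    }

    public static void main(String[] args) {
        System.out.println("VM:  " + System.getProperty("java.vm.name"));
        System.out.println("JIT: " + describeJit());
    }
}
```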
Like most virtual machines, the Java Virtual Machine has a stack-based architecture akin to a microcontroller/microprocessor. However, the JVM also has low-level support for Java-like classes and methods, which amounts to a highly idiosyncratic memory model and capability-based architecture.
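To make "stack-based" concrete, here is a toy interpreter for a three-instruction stack machine. It is only a sketch: the opcodes PUSH, ADD and MUL and their encoding are invented for this example and are not the real JVM instruction set, although they behave much like the JVM's bipush, iadd and imul.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackMachineSketch {
    // Hypothetical opcodes for illustration only.
    static final int PUSH = 0; // followed by one operand: the value to push
    static final int ADD = 1;  // pops two values, pushes their sum
    static final int MUL = 2;  // pops two values, pushes their product

    /** Interprets the program and returns the value left on top of the stack. */
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter
        while (pc < code.length) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    stack.push(code[pc++]);
                    break;
                case ADD: {
                    int b = stack.pop();
                    int a = stack.pop();
                    stack.push(a + b);
                    break;
                }
                case MUL: {
                    int b = stack.pop();
                    int a = stack.pop();
                    stack.push(a * b);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unknown opcode " + op);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // (2 + 3) * 4
        int[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL };
        System.out.println(run(program)); // prints 20
    }
}
```

Instructions name no registers: operands are taken implicitly from the top of the stack, which keeps the instruction encoding compact.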

JVM languages

Although the JVM was primarily aimed at running compiled Java programs, many other languages can now run on top of it.[4] The JVM currently has no built-in support for dynamically typed languages: the existing JVM instruction set is statically typed,[5] although the JVM can be used to implement interpreters for dynamic languages. The JVM has limited support for dynamically modifying existing classes and methods; this currently works only in a debugging environment, where new classes and methods can be added dynamically. Built-in support for dynamic languages is currently planned for Java 7.[6]

Bytecode verifier

A basic philosophy of Java is that it is inherently "safe" from the standpoint that no user program can "crash" the host machine or otherwise interfere inappropriately with other operations on the host machine, and that it is possible to protect certain methods and data structures belonging to "trusted" code from access or corruption by "untrusted" code executing within the same JVM. Furthermore, common programmer errors that often lead to data corruption or unpredictable behavior such as accessing off the end of an array or using an uninitialized pointer are not allowed to occur. Several features of Java combine to provide this safety, including the class model, the garbage-collected heap, and the verifier.
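The two programmer errors mentioned above can be observed directly. In this small sketch (the class and method names are chosen only for the example), reading past the end of an array and dereferencing a null reference both raise well-defined exceptions instead of corrupting memory.

```java
public class RuntimeChecksDemo {

    /** Attempts to read one element past the end of an array. */
    static String readPastEnd() {
        int[] values = new int[3];
        try {
            int index = values.length; // one past the last valid index
            return "read " + values[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            return "ArrayIndexOutOfBoundsException";
        }
    }

    /** Attempts to call a method through a null reference. */
    static String useNullReference() {
        String text = null;
        try {
            return "length " + text.length();
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(readPastEnd());      // prints ArrayIndexOutOfBoundsException
        System.out.println(useNullReference()); // prints NullPointerException
    }
}
```

A local variable that was never assigned at all is rejected even earlier: the Java compiler refuses to compile a read of it, so only the null case can be shown at run time.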
The JVM verifies all bytecode before it is executed. This verification consists primarily of three types of checks:
  • Branches are always to valid locations
  • Data is always initialized and references are always type-safe
  • Access to "private" or "package private" data and methods is rigidly controlled.
The first two of these checks take place primarily during the "verification" step that occurs when a class is loaded and made eligible for use. The third is primarily performed dynamically, when data items or methods of a class are first accessed by another class.
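The dynamic nature of the third check can be approximated with reflection. This is a sketch under stated assumptions: Vault and AccessCheckDemo are hypothetical classes, and the reflection API is used only as a convenient way to trigger an access check at run time. For ordinary compiled code, a comparable check happens when the JVM resolves a field or method reference, and a violation is reported as an IllegalAccessError.

```java
import java.lang.reflect.Field;

// Hypothetical class holding private state.
class Vault {
    private int secret = 42;
}

public class AccessCheckDemo {

    /** Tries to read Vault's private field from a different class. */
    static String tryReadPrivateField() {
        try {
            Field field = Vault.class.getDeclaredField("secret");
            // setAccessible(true) is deliberately not called, so the
            // normal Java access rules are enforced on the next line.
            Object value = field.get(new Vault());
            return "read " + value;
        } catch (IllegalAccessException e) {
            return "IllegalAccessException";
        } catch (NoSuchFieldException e) {
            return "NoSuchFieldException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryReadPrivateField()); // prints IllegalAccessException
    }
}
```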
The verifier permits only some bytecode sequences in valid programs; for example, a jump (branch) instruction can only target an instruction within the same method. Furthermore, the verifier ensures that any given instruction operates on a fixed stack location,[7] allowing the JIT compiler to transform stack accesses into fixed register accesses. Because of this, the JVM's stack architecture does not imply a speed penalty for emulation on register-based architectures when a JIT compiler is used. Given the code-verified JVM architecture, it makes no difference to a JIT compiler whether it gets named imaginary registers or imaginary stack positions that must be allocated to the target architecture's registers. In fact, code verification makes the JVM different from a classic stack architecture, whose efficient emulation with a JIT compiler is more complicated and is typically carried out by a slower interpreter.
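The idea that fixed stack locations map onto registers can be sketched with a toy translator. The three opcodes below (PUSH, ADD, MUL) are invented for this example and are not real JVM bytecode, and the translator handles straight-line code only; a real verifier must additionally check that the stack depth agrees wherever branches merge. Because the stack depth before each instruction is known statically, stack slot n can simply be named register rn.

```java
import java.util.ArrayList;
import java.util.List;

public class StackToRegisterSketch {
    // Hypothetical opcodes for illustration only.
    static final int PUSH = 0; // followed by one operand: the value to push
    static final int ADD = 1;  // pops two values, pushes their sum
    static final int MUL = 2;  // pops two values, pushes their product

    /** Rewrites stack code as register assignments, one string per instruction. */
    static List<String> translate(int[] code) {
        List<String> out = new ArrayList<>();
        int depth = 0; // statically known stack depth before the current instruction
        int pc = 0;
        while (pc < code.length) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    // The pushed value lands in the slot just above the current top.
                    out.add("r" + depth + " = " + code[pc++]);
                    depth++;
                    break;
                case ADD:
                case MUL: {
                    if (depth < 2) {
                        throw new IllegalStateException("stack underflow at pc " + (pc - 1));
                    }
                    String symbol = (op == ADD) ? "+" : "*";
                    // Operands sit in the two topmost slots; the result replaces the lower one.
                    out.add("r" + (depth - 2) + " = r" + (depth - 2)
                            + " " + symbol + " r" + (depth - 1));
                    depth--;
                    break;
                }
                default:
                    throw new IllegalArgumentException("unknown opcode " + op);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // (2 + 3) * 4
        int[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL };
        for (String line : translate(program)) {
            System.out.println(line);
        }
        // prints:
        // r0 = 2
        // r1 = 3
        // r0 = r0 + r1
        // r1 = 4
        // r0 = r0 * r1
    }
}
```

No stack exists in the translated form: every access has become a reference to a fixed, named location that a register allocator can handle like any other variable.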
Code verification also ensures that arbitrary bit patterns cannot be used as addresses. Memory protection is achieved without the need for a memory management unit (MMU). Thus, the JVM is an efficient way of getting memory protection on simple architectures that lack an MMU. This is analogous to managed code in Microsoft's .NET Common Language Runtime, and conceptually similar to capability architectures such as the Plessey 250 and IBM System/38.
The original specification for the bytecode verifier used natural language that was "incomplete or incorrect in some respects." A number of attempts have been made to specify the JVM as a formal system. By doing this, the security of current JVM implementations can be analyzed more thoroughly, and potential security exploits prevented. It will also be possible to optimize the JVM by skipping unnecessary safety checks, if the application being run is proved to be safe.
