Java and Embedded Linux Team Up
Randy Rorden, NewMonics, August 31, 2000
The Java platform and the Linux operating system are appearing more frequently in embedded applications thanks to their unique benefits. A combined Java-Linux platform accelerates the design process, reduces costs, and shortens time-to-market. However, developers do need to be aware of the tradeoffs they face and how to make the right choices for a specific application.
Embedding Linux
Commercial distributions of the Linux operating system have been available from many sources for years; however, it's only recently that embedded Linux offerings have become available from companies such as Lineo, LynuxWorks, and MontaVista. Linux offers the following well-known advantages for embedded system applications:
No runtime royalties -- Distribution vendors charge for the value they add in putting together a packaged product or for an annual subscription to their distribution, but so far none are charging for runtimes.

Open source -- Linux is written and distributed under the GNU General Public License (GPL), which means that its source code is freely distributed and available to the public. With the consolidation of the RTOS market, an open source OS has the advantage of being bigger than any single vendor and more likely to survive changes in the industry.

Portability and APIs -- With POSIX-standard APIs, Linux provides source-code compatibility with Linux on other processors (x86, 68k, PowerPC, ARM, etc.) as well as with other POSIX-compliant operating systems.

Configurable kernel -- The Linux kernel is remarkably elastic. Developers can use kernel configuration utilities to add or remove services and functions, such as device drivers, file systems, and networking support. Most configurable features can be compiled into the kernel, compiled as a separate module that the kernel loads on the fly, or left out entirely.

Scalability -- A wide range of tools, utilities, and applications are available to customize the functionality of a Linux-based system. Embedded Linux vendors offer configuration tools and lightweight utilities suitable for small memory footprints.

Networking -- The Linux TCP/IP stack is under constant scrutiny for security and optimized for speed. Drivers, utilities, clients, and servers are available for just about every network function or protocol.

Drivers -- Due to Linux's popularity, independent developers and hardware vendors are developing and releasing Linux drivers with their products. Embedded Linux vendors are adding drivers specific to embedded devices, such as support for flash memories and wireless networking.
Enter Java
Although Java technology made a big hit in Web and enterprise computing, it has only recently made inroads into embedded systems. The Java platform was originally developed with embedded systems in mind, but Sun Microsystems shifted its efforts toward the Internet early on.
It's important to distinguish between the Java language and the Java platform. The Java language is an object-oriented programming language with syntax similar to C. The Java platform offers a run-time environment, which includes a Java Virtual Machine (JVM) to execute Java programs compiled into architecture-neutral byte codes. It also includes a library of standard Java classes that are available to all Java programs running on the platform.
Java technology's characteristics line up nicely with today's needs in embedded systems:
Object oriented -- Designed for object-oriented programming from the start, the Java language is cleaner, simpler, and easier to learn than C++. Object orientation helps developers partition programming tasks and create reusable software components, maximizing the investment in the original code and decreasing both time-to-market and development costs.

Portability and APIs -- The Java platform extends the concept of portability further by providing a virtual machine environment that runs any Java program without recompilation, regardless of the underlying processor and operating system. Java platform APIs provide a full-featured library with hundreds of classes for manipulating data and performing networking, display, and I/O tasks. Platform portability eases the task of building and testing a complete application on a desktop PC and then deploying the same object code on a target system.

Memory management -- Memory management in C and C++ is error-prone and difficult to debug, with memory "leaks" manifesting themselves only under conditions that are difficult to reproduce. The Java run-time environment avoids many of these problems by determining when objects are no longer in use and automatically reclaiming memory, a process known as garbage collection. Garbage collection reduces the bookkeeping responsibilities of application developers and eases the evolution, reuse, and integration of independently developed software components.

Threading support -- The Java language has a built-in notion of multithreading. A critical section of code can be marked synchronized, ensuring that any thread executing the marked code obtains an exclusive lock on the synchronized object. If another thread holds the lock, the new thread waits until the first thread exits the marked section.

Networking support -- Writing network-aware programs in the Java language is as easy as reading and writing files. The same InputStream and OutputStream classes used to access files are also used to send and receive data over network connections (see the sketch after this list). A simple Web server can be implemented in a few lines of Java code, and open-source third-party libraries are available to implement full-featured clients and servers.

Dynamic class loading -- The Java run-time environment enables unique extensibility through dynamic class loading. Base-level applications can be customized and enhanced later because the JVM allows new programs, represented as Java classes, to be loaded into a running application from disk, flash, or serial/network connections.
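As a quick illustration of the threading and networking points above, the sketch below shows a synchronized method that serializes access among threads and a single stream-based routine that reads text from either a file or a TCP connection. The file path, host name, and use of the daytime service are illustrative assumptions, not part of the original discussion.

import java.io.*;
import java.net.*;

public class StreamDemo {

    // Only one thread at a time may execute this method on a given
    // instance; "synchronized" gives the caller an exclusive lock.
    public synchronized void logLine(String line) {
        System.out.println(line);
    }

    // The same InputStream-based code path serves files and sockets.
    public void dump(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        String line;
        while ((line = reader.readLine()) != null) {
            logLine(line);
        }
    }

    public static void main(String[] args) throws IOException {
        StreamDemo demo = new StreamDemo();
        demo.dump(new FileInputStream("/etc/hostname"));     // file source
        Socket s = new Socket("timehost.example.com", 13);   // daytime service
        demo.dump(s.getInputStream());                       // network source
        s.close();
    }
}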
A Java-Linux Platform
Several additional benefits come of combining Java technology and the Linux operating system:
Hardware abstraction -- Java can be used to "wrap" classes around Linux driver interfaces to provide an abstraction layer to an application. For the simplest memory-mapped or polled I/O interfaces, Linux provides a mechanism for reading and writing physical memory and I/O ports by opening specially named files (/dev/mem and /dev/port). A Java program can open, seek, and read and write data in these files to manipulate memory and I/O locations (a minimal sketch follows this list). Although this solution isn't portable to non-Linux systems, the limitation might be acceptable for many applications.

Reduced footprint -- A Java-Linux platform uses built-in networking to reduce nonvolatile storage requirements. Linux network support and the dynamic class-loading feature of Java allow all or part of an application to be loaded on demand via a network connection to a server, using a simple socket or a filesystem client like NFS.

Tight integration -- The Linux bootstrap lets the system literally "boot to Java," loading the necessary kernel, driver, and networking components and launching the JVM to load and run application classes. Once the Linux kernel is configured and boots, the developer can focus entirely on developing the Java-based application.
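A minimal sketch of that wrapper idea follows, assuming a hypothetical register base address and a process with permission to open /dev/mem; error handling is pared down for clarity.

import java.io.*;

// Illustrative wrapper around /dev/mem for simple memory-mapped I/O.
public class MemoryMappedRegister {

    private final RandomAccessFile mem;
    private final long base;

    public MemoryMappedRegister(long baseAddress) throws IOException {
        mem = new RandomAccessFile("/dev/mem", "rw");
        base = baseAddress;
    }

    // Seek to the physical address and read one byte.
    public int readByte(long offset) throws IOException {
        mem.seek(base + offset);
        return mem.readUnsignedByte();
    }

    // Seek to the physical address and write one byte.
    public void writeByte(long offset, int value) throws IOException {
        mem.seek(base + offset);
        mem.writeByte(value);
    }

    public void close() throws IOException {
        mem.close();
    }
}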
Issues and Tradeoffs
Despite its advantages, a Java-Linux platform presents embedded system developers with a number of issues and tradeoffs:
Footprint
Although the Linux kernel is scalable, there are limits to footprint reduction. Built with a standard Red Hat 6.0 configuration utility, a basic kernel with networking support (drivers, TCP/IP stack, and an NFS client) can require up to 512 KB of non-volatile memory when compressed.
Additional Linux executables, the Java platform, and any needed application classes will further increase ROM requirements. An embedded JVM may take up to 256 KB, while standard class libraries often require another 1 MB or more of additional ROM. The java.awt graphics libraries, for instance, can add 1 to 2 MB more of classes.
A common way to reduce non-volatile storage needs in Linux is to use a compressed initial RAM disk image that automatically expands into RAM at boot time. For example, a 1.44 MB Linux boot floppy with a 512 KB compressed, network-capable kernel and a 512 KB compressed RAM disk image containing a JVM and the lang, io, net, and util libraries leaves about 400 KB for application classes.
RAM requirements depend on the application, of course, but an off-the-shelf Linux distribution won't run very well in less than 8 MB. Embedded Linux vendors are working to reduce both ROM and RAM requirements. Java RAM requirements largely depend on your performance choices, but if you use static linking tools to make Java byte codes executable from ROM, you can reduce RAM usage. Without static linking, classes must be linked and loaded into RAM at execution time.
Performance
Early Sun releases relied on a byte code interpreter that was slow compared to compiled C or C++. Despite significant speed improvements since then, a Java interpreter is still 10 to 20 times slower than C or C++.
To address this performance gap, some vendors have developed Just-in-Time (JIT) compilers (e.g., NewMonics' QuickPERC JIT) that compile Java byte codes into native code for the target processor on the fly as they are loaded into memory. JIT compilation boosts performance, but the compilation time may be too costly, especially if the code doesn't contain loops or only runs once. An Adaptive JIT compiler, such as Insignia Solutions', monitors execution and only compiles code that runs frequently. RAM footprint is higher with JIT compilation than with interpreted code because the native instructions are placed in RAM and are less compact than Java byte codes (native code is 4 to 8 times larger).
NewMonics' QuickPERC AOT and HP's TurboChai offer Ahead-of-Time (AOT) compilation, which generates native code that is statically linked with the runtime JVM. This method avoids runtime compilation at the expense of ROM footprint. An AOT-only solution (no interpreter, no JIT), often called a Java-to-native compiler, still requires a runtime library and is functionally equivalent to an AOT compiler that links the code it generates with a JVM.
The table below summarizes the relative speed, ROM footprint, and RAM footprint tradeoffs of various performance features available from embedded Java vendors.
By moving from interpreted to compiled code, JIT and AOT improve performance by as much as ten or more times. Adaptive JIT is faster than conventional JIT because it doesn't compile all classes, while AOT provides the highest speed by avoiding runtime compilation entirely.
Assuming the JVM and Java classes are stored in ROM, the JIT and adaptive JIT compiler options increase ROM footprint by the size of the runtime compiler itself with the adaptive JIT being more complex and therefore larger. An AOT compiler uses much more ROM to store the compiled native code.
If you use a static linking tool (like the NewMonics ROMizer), the RAM footprint requirement for interpreted code is relatively low. Conventional JIT compilers use the most RAM because all classes are compiled into RAM, while adaptive JIT compilers use less RAM by only compiling frequently used byte codes. AOT-compiled code uses very little RAM.
Memory Management
Although the Sun virtual machine specification defines the functional requirements of the JVM, it leaves operational details, such as memory management, to the implementer. The choice of memory management algorithm can significantly affect the behavior of the Java platform.
For example, a garbage collector that does not address memory fragmentation depends on expansion of the memory heap to accommodate an allocation request that is too large for any available memory segments. Once memory has fragmented enough and the heap has reached maximum size, an out of memory error will occur. In embedded systems where memory is limited, ignoring fragmentation is unacceptable.
How a garbage collector finds active objects in memory can also make a big difference. The mark and sweep algorithm used by most collectors starts from a set of known "root" pointers found in registers and variable stacks and traces them. It locates and marks the objects they reference as in-use, then traces pointers in those objects, and so on, until all active objects have been traced and marked. The remaining unmarked objects are "swept" into the free memory pool (see Diagram 1).
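As a rough illustration of that description, the toy collector below marks every object reachable from the roots and then sweeps the rest. The HeapObject and collector classes are invented for this sketch and bear no relation to any real JVM's internals.

import java.util.*;

class HeapObject {
    boolean marked;
    List references = new ArrayList();   // pointers held by this object
}

class MarkSweepCollector {
    List heap = new ArrayList();    // every allocated object
    List roots = new ArrayList();   // objects referenced from registers and stacks

    // Phase 1: trace from an object and mark everything reachable from it.
    void mark(HeapObject obj) {
        if (obj == null || obj.marked) {
            return;
        }
        obj.marked = true;
        for (Iterator it = obj.references.iterator(); it.hasNext(); ) {
            mark((HeapObject) it.next());
        }
    }

    // Phase 2: sweep unmarked objects into the free pool and reset marks.
    void collect() {
        for (Iterator it = roots.iterator(); it.hasNext(); ) {
            mark((HeapObject) it.next());
        }
        for (Iterator it = heap.iterator(); it.hasNext(); ) {
            HeapObject obj = (HeapObject) it.next();
            if (!obj.marked) {
                it.remove();          // unreachable: reclaim it
            } else {
                obj.marked = false;   // reachable: clear mark for the next pass
            }
        }
    }
}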
Problems arise based on how the collector treats variables. Conservative collectors don't always know which variables and fields are pointers and which are primitive types (i.e., integers), so they treat uncertain values as pointers. While this is a common shortcut, it has a serious drawback: if a non-pointer contains a value that happens to point to an inactive memory object, that object and the objects it references will not be collected. In effect, this causes a memory leak. It is particularly bad because the leak is data dependent and may not occur until a variable, such as an encoded date, hits a magic number. Applications that worked fine yesterday may get an out of memory exception today.
In memory-constrained environments, a precise defragmenting garbage collector is the only reliable choice. A precise collector knows which variables and fields contain pointers and which contain primitive types, so it collects all unused objects. A defragmenting collector coalesces unused memory segments to allow subsequent large allocations to succeed. With a precise defragmenting collector, memory utilization can be tested and predicted reliably. (The NewMonics PERC collector is one example of a precise defragmenting collector.)
Determinism
Determinism may be described as a real-time performance issue, but many applications that aren't considered "real-time" still have a timeliness requirement.
Linux can affect real-time determinism if a driver or the kernel disables interrupts while it waits for an event to occur. Embedded Linux developers are working on this issue, some by careful characterization of the kernel and critical drivers and others by providing a real-time environment inside Linux itself.
Java determinism issues relate to execution time and garbage collection. If the application's execution time needs to be predictable, the JVM must provide constant-time memory allocation and Java method invocation. Constant-time memory allocation requires maintenance of multiple size-ordered free lists rather than "walking" a free list to locate a memory segment that is large enough. An adaptive JIT compiler doesn't allow constant-time method invocation due to the unpredictability of runtime compilation. Alternatively, performing JIT compilation at class load time assures predictability.
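The size-ordered free-list idea can be sketched as follows; the power-of-two size classes and list-per-class layout here are illustrative assumptions, not a description of any particular JVM.

import java.util.LinkedList;

// Segregated free lists: one list per size class, so an allocation maps
// to a class index and is satisfied without walking a long free list.
public class SegregatedFreeLists {

    private static final int CLASSES = 16;   // 16 bytes up to 512 KB
    private final LinkedList[] freeLists = new LinkedList[CLASSES];

    public SegregatedFreeLists() {
        for (int i = 0; i < CLASSES; i++) {
            freeLists[i] = new LinkedList();
        }
    }

    // Map a requested size in bytes to the smallest class that fits it;
    // the loop is bounded by the fixed number of classes.
    private int classFor(int size) {
        int c = 0;
        int span = 16;
        while (span < size && c < CLASSES - 1) {
            span <<= 1;
            c++;
        }
        return c;
    }

    // Pop a free segment from the matching class (null if the class is empty).
    public Object allocate(int size) {
        LinkedList list = freeLists[classFor(size)];
        return list.isEmpty() ? null : list.removeFirst();
    }

    // Return a segment to its class's list.
    public void free(Object segment, int size) {
        freeLists[classFor(size)].addFirst(segment);
    }
}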
The garbage collector is another factor in Java determinism. If the mark and sweep algorithm is implemented in its most basic form, the collector cannot relinquish access to the memory heap until it has completed a full mark and sweep pass. If it stops midway through a pass, a Java thread might allocate a new object or modify a pointer, making it impossible for the collector to know if it can safely sweep an unmarked object into the free pool. The time required to complete a mark-and-sweep pass depends on memory usage, but it can delay scheduling of a periodic thread from tens of milliseconds to several seconds, based on the memory activity of other threads (see Diagram 2).
Diagram 2 depicts a scheduling latency test run on a common desktop JVM. The test has one Java thread allocating a large linked list in memory and then releasing it, iteratively, while a second thread checks the current time, sleeps for 100 milliseconds, and then checks the time again. The difference between actual and expected sleep time is plotted on a logarithmic scale. A deterministic Java implementation would show a flat line, indicating no variance from the expected sleep time. As you can see, the sleeping thread is often delayed by twice the expected time, typical of a classic stop and wait collector.
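A rough reconstruction of that test in Java might look like the sketch below; the list length, element size, and iteration count are guesses, since only the 100 millisecond sleep interval is specified above.

public class SleepJitterTest {

    public static void main(String[] args) throws InterruptedException {
        // Allocator thread: build a large linked list, drop it, repeat,
        // keeping the garbage collector busy.
        Thread churn = new Thread(new Runnable() {
            public void run() {
                while (true) {
                    java.util.LinkedList list = new java.util.LinkedList();
                    for (int i = 0; i < 100000; i++) {
                        list.add(new byte[64]);
                    }
                    list = null;   // the whole list becomes garbage at once
                }
            }
        });
        churn.setDaemon(true);
        churn.start();

        // Timer thread: sleep 100 ms and report how late the wakeup was.
        for (int i = 0; i < 100; i++) {
            long before = System.currentTimeMillis();
            Thread.sleep(100);
            long overshoot = System.currentTimeMillis() - before - 100;
            System.out.println("iteration " + i + ": +" + overshoot + " ms");
        }
    }
}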
An incremental garbage collector performs its task in short increments and suspends when a high-priority Java thread needs to run, thus accomplishing the collection process without scheduling latency.
Developers need to ask detailed questions when vendors talk about their collectors. Some claim they have an incremental collector, but in fact an "increment" is an entire mark and sweep pass. A concurrent collector runs in a separate process or thread from the JVM, but that may still mean that other Java threads must stop and wait while it finishes a collection pass. Others may say they have a background collector that doesn't affect real-time tasks, but they require that real-time tasks be written in C and executed outside of the Java run-time environment. A truly incremental garbage collector should provide deterministic behavior for Java programs, nothing less.
Tools
You need some basic tools to develop Java-Linux based embedded systems. The familiar gcc compiler tool set is a natural fit since it targets many different processor architectures. Graphical IDE front-ends to gcc and the gdb debugger are commercially available from Red Hat/Cygnus and Motorola/Metrowerks, or you can use the gcc tools to cross-compile for Linux targets from non-Linux hosts.
On the Java tools side, a Java IDE, which includes a Java-to-byte-code compiler and a Java source-code debugger that allows remote debugging, is essential. Commercial Java IDEs are available from many sources, but avoid tools that only debug your code on the host machine or only with a specific JVM. A JVM and debugger supporting the Java Debug Wire Protocol (JDWP) standard provide remote debugging over a network or serial connection to your target system.
The JVM vendor should provide tools that link and package the JVM and classes for storage and execution from ROM or flash memory. These tools statically link the packaged Java classes so their byte codes are executable from ROM. Alternatively, if a filesystem is available, some or all of the classes can be stored separately, usually in a ZIP or JAR archive format file.
Many embedded Java vendors provide tools to reduce footprint requirements by eliminating unused classes, fields, and methods. These "tree shakers" scan code execution paths and "shake out" anything that is never reached. However, they prevent subsequent dynamic loading of new application classes if those classes depend on code that was previously eliminated. If you want to use a tree shaker, look for one that lets you specify the core classes that should be left intact.
Lastly, embedded Java vendors may supply an AOT compiler for generating native code from Java source or class files to improve performance. The footprint tradeoffs are important to consider, so make sure you have the option of AOT-compiling selected classes for optimized performance and allowing other classes to be interpreted or JIT compiled.
Summary
As the complexity of the new generation of embedded devices increases and development schedules shorten, developers can take advantage of the benefits of a combined Java-Linux platform to enhance their productivity and reduce product time-to-market. With a good understanding of the tradeoffs available to them in footprint, performance, memory management, determinism and tools, developers can make the right choices for a successful project.
Author's bio: Randy Rorden serves as Vice President of Engineering at NewMonics, responsible for development, testing, documentation, and support of the Embedded PERC product family. Randy has 20 years' experience developing computer hardware and software products in various markets, including medical management, training, scientific workstations, multiuser UNIX systems, RAID disk arrays, PC networking, wireless messaging, and Internet communications. You may reach him at rrorden@newmonics.com.