java performance
by Glen McCluskey
Glen McCluskey is a consultant with 15 years of experience and has focused on programming languages since 1988. He specializes in Java and C++ performance, testing, and technical documentation.
<glenm@glenmccl.com>
It is interesting to look at the current state of Java performance. One way to do so is to devise some simple benchmark programs. Such programs can provide useful information, but they also have some limitations.
Sorting
The first benchmark program to consider is one that sorts a vector of numbers, using an O(N**2) algorithm. This program doesn't have any dependencies like I/O or use of library functions, and is valuable in measuring the actual execution speed of Java bytecodes. A similar program can be written in C++ for comparison. The Java program looks like this:
public class sort {
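As a rough illustration, a benchmark of this kind might be sketched as follows. The class name `SortSketch`, the descending fill pattern, and the timing code are assumptions rather than the original listing; the value of `N` matches the constant in the C++ version.

```java
// Hypothetical sketch of an O(N**2) sort benchmark; not the original listing.
public class SortSketch {
    static final int N = 25000; // same N as the C++ version in the text

    // Simple exchange sort: O(N**2) comparisons, no library calls.
    static void sort(int[] vec) {
        for (int i = 0; i < vec.length - 1; i++) {
            for (int j = i + 1; j < vec.length; j++) {
                if (vec[i] > vec[j]) {
                    int t = vec[i];
                    vec[i] = vec[j];
                    vec[j] = t;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] vec = new int[N];
        for (int i = 0; i < N; i++) {
            vec[i] = N - i; // descending input, an arbitrary choice
        }
        long start = System.currentTimeMillis();
        sort(vec);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("sorted " + N + " ints in " + elapsed + " ms");
    }
}
```

Because the inner loops touch only a local array, the measured time is dominated by bytecode execution rather than by I/O or library code.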
and its C++ counterpart:
const int N = 25000;
We will use three different compilers for this analysis:
enabling compiler optimization switches in each case. The running times of the above program are as follows:
The just-in-time compiler represents a huge advantage over interpretation and is nearly as fast as C++. But this example also illustrates one of the pitfalls of comparing the performance of two different languages. Suppose that I change the line in the Java sort that reads:
for (int i = 0; i < N; i++)
to:
for (int i = 0; i <= N; i++)
This off-by-one error will be caught when the Java program runs, but an equivalent error in the C++ program probably will not be. In other words, Java checks array subscripts at runtime, but C++ does not. The Java approach does more, but at some cost. Choosing which one is "right" depends a lot on your philosophy.
Copying Files
For the second example, consider copying files. The Java version is:
import java.io.*;
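A minimal sketch of a stream-based copy in this spirit is shown here. The class name `CopySketch`, the 8192-byte buffer, and the use of try-with-resources (a later Java feature) are assumptions, not the original listing.

```java
import java.io.*;

// Hypothetical sketch of a file-copy benchmark; not the original listing.
public class CopySketch {
    // Copies the file at path 'from' to path 'to' using a fixed-size buffer.
    static void copy(String from, String to) throws IOException {
        byte[] buf = new byte[8192]; // buffer size is an assumption
        try (FileInputStream in = new FileInputStream(from);
             FileOutputStream out = new FileOutputStream(to)) {
            int n;
            // read() returns the number of bytes read, or -1 at end of file
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java CopySketch <from> <to>");
            System.exit(1);
        }
        copy(args[0], args[1]);
    }
}
```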
and the C++ version:
#include <stdio.h>
We could have used system calls (read/write) in the C++ version, but it seems fairer to use fread/fwrite, in that the Java I/O library, even at its most fundamental level, is at least one layer removed from actual system calls, and so the C++ version should be as well. Running times for copying large (125 MB) uncached files are:
In other words, there's not really much of a difference between the approaches. We might expect this similarity, in that FileInputStream.read() and FileOutputStream.write() are native methods, meaning that they are implemented in terms of underlying functionality written in some other language, for example, C. That is, low-level Java I/O immediately translates into calls to underlying libraries, somewhat like fread/fwrite calls being implemented in terms of read/write system calls.
Tabulating Word Frequencies
The final example is one that uses higher-level library routines to tabulate word frequencies. Input is a file containing a list of words, one per line, and the output is a list of words and their frequencies. In Java, one way of implementing this is:
import java.io.*;
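One possible shape for such a program is sketched here. The class name `WordFreqSketch`, the choice of `HashMap` with `Map.merge` (both from later Java versions), and the output format are assumptions, not the original listing.

```java
import java.io.*;
import java.util.*;

// Hypothetical sketch of a word-frequency tabulator; not the original listing.
public class WordFreqSketch {
    // Reads one word per line and counts occurrences in a hash table.
    static Map<String, Integer> count(BufferedReader in) throws IOException {
        Map<String, Integer> freq = new HashMap<>();
        String line;
        while ((line = in.readLine()) != null) {
            freq.merge(line, 1, Integer::sum); // insert 1 or add 1 to the existing count
        }
        return freq;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WordFreqSketch <wordfile>");
            System.exit(1);
        }
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            for (Map.Entry<String, Integer> e : count(in).entrySet()) {
                System.out.println(e.getKey() + " " + e.getValue());
            }
        }
    }
}
```

Here the line-reading loop exercises high-level I/O and the map exercises the library's hash table, so the total time mixes several costs.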
and in C++, we would say:
#include <fstream>
In both cases, we use hash tables to accumulate the words and their frequencies. For a list of around 900K words, running times are:
When we compare languages with a more complex benchmark such as this one, which uses library features, it becomes harder to analyze the results and draw conclusions. For example, each program uses a hash-table data structure provided in the standard libraries. There are tradeoffs to be made in designing such structures, such as whether you're trying to optimize for speed or for space. If you're optimizing for speed, you might be more aggressive in growing the table size when the table starts to get full. One way of interpreting the relative times in this case is to look at this application as one that (1) uses low-level I/O, with similar performance across the three compilers, (2) uses high-level I/O (reading lines), which is affected by the issue of interpreter versus just-in-time versus native-code costs, and (3) uses hash-table data structures, with tradeoffs in representation and speed versus space.
Other Performance Areas
There are other performance areas we've not really looked at. For example, an important one in some contexts is invocation time. Using the above compilers, it takes about 15 milliseconds to invoke a null C++ program, and 325 milliseconds to invoke a null Java one. Another area that might be looked at is GUI performance.
Conclusion
The whole area of Java performance is changing rapidly. For example, Sun recently introduced its HotSpot compiler, and several vendors are working on native-code compilation for Java. Whether Java will ever be as fast as or faster than languages like C or C++ is hard to say; it might be better to ask "Faster at what?" or ask whether taking a slight performance hit in return for added functionality (such as array subscript checking) is a worthwhile tradeoff to make. We already make such tradeoffs in other areas today, for example in using high-level languages in preference to assembly language.
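The array subscript checking mentioned above can be seen in a few lines. This is a minimal sketch; the class name `BoundsSketch` and the helper `overrunCaught` are hypothetical names introduced for illustration.

```java
// Hypothetical sketch showing Java's runtime array subscript check.
public class BoundsSketch {
    // Runs a loop with the off-by-one bound (i <= n) and reports
    // whether the runtime caught the out-of-range subscript.
    static boolean overrunCaught(int n) {
        int[] vec = new int[n];
        try {
            for (int i = 0; i <= n; i++) { // deliberate off-by-one error
                vec[i] = i;
            }
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true; // vec[n] is out of range and is rejected at runtime
        }
    }

    public static void main(String[] args) {
        System.out.println("overrun caught: " + overrunCaught(10));
    }
}
```

The check that produces the exception runs on every subscript operation, which is the cost side of the tradeoff.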
Last changed: 18 Nov. 1999 mc