Thursday, September 1, 2011

java profiling is confused by i/o


the other day a collegue came to me with output from a java profiler that showed that we were spending way too much cpu time doing i/o. (in our case the i/o was both network and disk i/o.) the application in question should be i/o bound but wasn't. we could see that by running top, so he was using the profiler to figure out where the cpu time was being spent.

modern operating systems are great about putting tasks waiting on i/o to sleep so that the cpu could work on other tasks or just wait until the i/o is finished. (In the dark ages your computer would hang during i/o. it was pretty amazing when microsoft released an operating system that would allow you to keep working while you formatted your disks!)

profilers work by taking periodic snapshots (or samples) of what a process is doing. after taking many such samples it can figure out where most of the time is spent in the code. an alternate way of profiling, much more accurate but with more overhead, will put probes in at each call entrance and exit and calculate the running time of each invocation of the call.

my collegue knew all of this of course, which is why he was concerned that our i/o functions had too much overhead. i pointed out that past experience had taught me that the java profiler counts the time that a thread waits on i/o as running time. thus, you cannot trust the profilers output when you are doing a bunch of i/o.

ever the skeptic, my collegue pointed out that that would be stupid. the profiler can see that a thread was waiting for i/o and thus should not count the waiting time, therefore i had to be wrong.

to prove my point i wrote the following short program:

import java.net.*;
import java.util.*;
class t {
  public static void main(String a[]) throws Exception {
    new Thread() {
      public void run() {
        while(true) {
          try {
             String s = "";
             for(int i=0; i<1000; i++) { s=s+i; }
             Thread.sleep(1);
          } catch(Exception e) {}
       }
    }}.start();
    System.in.read();
  }
}

all this program does is startup a thread that burns cpu by building a string and then throwing it away.
the main thread, in the meanwhile, is just sitting there waiting for input from stdin. obviously, one thread is cpu bound while the other is i/o bound. so lets see what the java profiler tells us after we let our test program run for a while under the profiler and then killed:

breed@laflaca:~$ java -agentlib:hprof=cpu=samples t
^CDumping CPU usage by sampling running threads ... done.
breed@laflaca:~$ cat java.hprof.txt
JAVA PROFILE 1.0.1, created Wed Aug 31 11:45:20 2011
Header for -agentlib:hprof (or -Xrunhprof) ASCII Output (JDK 5.0 JVMTI based)
...
CPU SAMPLES BEGIN (total = 966) Wed Aug 31 11:48:21 2011
rank   self  accum   count trace method
   1 54.76% 54.76%     529 300025 java.io.FileInputStream.readBytes
   2 17.60% 72.36%     170 300040 java.lang.Integer.getChars
   3 11.08% 83.44%     107 300039 t$1.run

according to this more than half of the time is spent in the read call! of course virtually no CPU time is spent there since it is waiting on i/o. ok, maybe it is a problem with sampling. lets try it with probes:

breed@laflaca:~$ java -agentlib:hprof=cpu=times t
^CDumping CPU usage by timing methods ... done.
breed@laflaca:~$ cat java.hprof.txt
...
CPU TIME (ms) BEGIN (total = 4946) Wed Aug 31 11:50:46 2011
rank   self  accum   count trace method
   1 12.94% 12.94%   23859 301024 java.lang.AbstractStringBuilder.append
   2 12.68% 25.62%   23858 301028 java.lang.AbstractStringBuilder.append
   3  6.61% 32.23%   23858 301031 java.lang.String.<init>

much better the top three calls are all related to string procesing, which we would expect. actually string processing makes up the top 20 calls. awesome, so perhaps we just need to use times... but wait. the read call never returned. what happens if we give our program some input (by pressing enter) before we kill it?

breed@laflaca:~$ java -agentlib:hprof=cpu=times t
^CDumping CPU usage by timing methods ... done.
breed@laflaca:~$ cat java.hprof.txt
...
CPU TIME (ms) BEGIN (total = 8431) Wed Aug 31 11:54:38 2011
rank   self  accum   count trace method
   1 49.91% 49.91%       1 301067 java.io.FileInputStream.read
   2  6.59% 56.51%   19795 301028 java.lang.AbstractStringBuilder.append
   3  6.35% 62.85%   19796 301024 java.lang.AbstractStringBuilder.append

yep, there's the read again!

so, what's the moral of this story? careful when using java profilers. they can be useful tools, but when dealing with processes that do lots of i/o you may need to skip over some of the results to get to the real performance problems.