Sunday, September 27, 2009

Fixing the annoying Java 2-line logging format

java.util.logging is far from perfect, but for me the main reason not to use it is the hard-to-read default formatter, which spends a whole line on the date. There are more compact formatters - but you need to write code to install them, or edit the command line.

So every time I install a JVM, I change the logging formatter:

1. in $JAVA_HOME/jre/lib/logging.properties, replace

java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter

with

java.util.logging.ConsoleHandler.formatter = org.apache.juli.JdkLoggerFormatter

2. copy $TOMCAT_HOME/bin/tomcat-juli.jar to $JAVA_HOME/jre/lib/ext.
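For comparison, here is roughly what the "write code to install them" route looks like. This is only a minimal sketch: OneLineFormatter is a hypothetical class written for this example (it is not the Tomcat formatter and not part of the JDK), and the output layout is just one possible choice.

```java
import java.util.logging.Formatter;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical one-line formatter; not part of the JDK or Tomcat. */
class OneLineFormatter extends Formatter {
    @Override
    public String format(LogRecord record) {
        // %1$tT renders the record's millisecond timestamp as HH:MM:SS.
        return String.format("%1$tT %2$s %3$s: %4$s%n",
                record.getMillis(),
                record.getLevel().getName(),
                record.getLoggerName(),
                formatMessage(record)); // expands {0}-style parameters
    }

    /** Replaces the formatter on every handler attached to the root logger. */
    static void install() {
        for (Handler handler : Logger.getLogger("").getHandlers()) {
            handler.setFormatter(new OneLineFormatter());
        }
    }

    public static void main(String[] args) {
        install();
        Logger.getLogger("demo").info("single-line output");
    }
}
```

The catch is that install() has to run in every application, which is exactly why I prefer changing the JVM-wide default instead.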

Sunday, September 06, 2009

Java flush() - not what you would expect

What I expect: flush() on an output stream sends the data. Once flush() has completed, I expect the receiver to be able to read the data. The operating system and network may add their own delays - flush() is weaker than a commit, so there is no guarantee that the data has been completely received when flush() returns, or that it will ever be received - but it shouldn't be stuck in an intermediate Java buffer with no way out until close().

The OutputStream.flush() documentation seems to match: it 'forces any buffered output bytes to be written out' and clarifies that if any bytes have been buffered, they should be written to their "intended destination". It also clarifies that the OS may buffer further: the guarantee is only that the bytes "are passed to the operating system for writing", not that "they are actually written".

Let's take a few examples: SocketOutputStream and FileOutputStream do nothing in flush() because they don't buffer anything; write() passes the bytes straight to the OS.

BufferedOutputStream is the best example of what to do when buffering: its flush() sends all the buffered bytes to the next stream in the chain AND calls out.flush(), so the data goes to the 'intended destination' - the buffer is just an intermediary. It is well known that if you want any decent performance you should use BufferedOutputStream. Not so well known: you can use better buffers, for example one that doesn't reallocate and copy the byte[], but instead keeps a list of buffers.
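The well-behaved case is easy to check. In this small sketch a ByteArrayOutputStream stands in for the socket or file (that substitution, and the class name BufferedFlushDemo, are assumptions made for the example): nothing is visible downstream until flush(), and everything is visible right after it.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BufferedFlushDemo {
    /** Returns {bytes visible downstream before flush(), bytes visible after flush()}. */
    static int[] run() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for a socket or file
        try (BufferedOutputStream out = new BufferedOutputStream(sink, 8192)) {
            out.write("hello".getBytes(StandardCharsets.US_ASCII));
            int before = sink.size(); // the 5 bytes still sit in the 8 KB buffer
            out.flush();              // drains the buffer, then calls sink.flush()
            int after = sink.size();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = run();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```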

The biggest offender: DeflaterOutputStream. Its flush() just calls out.flush(), but keeps recent bytes in its buffer. Worse - the actual Deflater doesn't even support zlib flush.
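The stuck bytes can be measured. The sketch below assumes a newer JVM than the ones discussed here: Java 7 and later add a DeflaterOutputStream(OutputStream, boolean syncFlush) constructor, which makes flush() perform a zlib sync flush. The class name DeflaterFlushDemo and the ByteArrayOutputStream standing in for the network are assumptions made for the example. It writes a message, calls flush() without close(), and counts how many bytes a receiver could decode from what has arrived so far.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;

class DeflaterFlushDemo {
    /**
     * Writes msg, calls flush() without close(), and returns how many
     * bytes a receiver could decode from what reached the sink so far.
     */
    static int decodableAfterFlush(boolean syncFlush, byte[] msg) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the network
        byte[] wire;
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink, syncFlush)) {
            out.write(msg);
            out.flush();
            wire = sink.toByteArray(); // snapshot taken before close() finishes the stream
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        Inflater inflater = new Inflater();
        try {
            inflater.setInput(wire);
            byte[] decoded = new byte[msg.length + 16];
            int total = 0;
            while (total < decoded.length) {
                int n = inflater.inflate(decoded, total, decoded.length - total);
                if (n == 0) {
                    break; // the inflater needs more input than has arrived
                }
                total += n;
            }
            return total;
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] msg = "flush me to the receiver, please".getBytes(StandardCharsets.US_ASCII);
        System.out.println("default flush():  " + decodableAfterFlush(false, msg)
                + " of " + msg.length + " bytes decodable");
        System.out.println("syncFlush = true: " + decodableAfterFlush(true, msg)
                + " of " + msg.length + " bytes decodable");
    }
}
```

With the default constructor the receiver cannot recover the whole message until close(); with syncFlush set to true the whole message is decodable right after flush(). GZIPOutputStream has the same two-argument constructor on those JVMs.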

For Servlets, flush() does the right thing - the bytes go all the way to the net. Unless you're using compression and the servlet engine uses the Java Deflater - in which case flush() will not push bytes to the net; they get stuck in the deflater. GZIPOutputStream extends DeflaterOutputStream and has the same problem - Tomcat's coyote GzipOutputFilter and the example compression filter won't actually flush to the net.

The alternative for compressed output that honors flush() is a pure-Java, BSD-licensed library, which also has some good info about the flush() problem. You can also use JNI and libz - Harmony's Deflater is a good starting point - but Harmony has the same bug as the Sun VM.