Monday, January 11, 2010

ZLib memory overhead

I was testing some compression code, and the load tests were running out of memory. It looks like ZLib allocates about 256K per compressor. I tried this with both BEST_COMPRESSION and BEST_SPEED. Looking at Shallow Heap in the Eclipse heap analyzer, each compressor holds 2x32k short[] and 2x64k byte[], and since a short is 2 bytes that is four arrays of 64K each.
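The 256K figure matches the formula in the comments of zlib's zconf.h: deflate needs roughly (1 << (windowBits+2)) + (1 << (memLevel+9)) bytes. A quick sketch of that arithmetic follows; windowBits=15 and memLevel=8 are zlib's defaults, and it is an assumption on my part that the compressor in the test was running with them. The class and method names are made up for illustration.

```java
public class DeflateMemory {
    // Approximate deflate state size in bytes, per the formula in zlib's zconf.h.
    static long deflateBytes(int windowBits, int memLevel) {
        return (1L << (windowBits + 2)) + (1L << (memLevel + 9));
    }

    public static void main(String[] args) {
        // Assumed defaults: windowBits=15, memLevel=8.
        long bytes = deflateBytes(15, 8);
        System.out.println(bytes / 1024 + "K per compressor");
        // A smaller window and memLevel trade compression ratio for memory.
        System.out.println(deflateBytes(11, 5) / 1024 + "K with windowBits=11, memLevel=5");
    }
}
```

Note that the compression level does not appear in the formula, which fits with seeing the same arrays for both BEST_COMPRESSION and BEST_SPEED.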

With SPDY you need to keep the header compressor around for the entire kept-alive connection. HTTP connections only use gzip for individual requests - the context and memory can be reclaimed afterwards, so a kept-alive connection has a very small cost - it can be as low as just a socket in a selector, sometimes a few extra buffers. In tomcat-lite there is one 8k buffer associated with the connection - not hard to get rid of, but low priority.
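To make the difference concrete, here is a minimal sketch of a per-connection compressor using java.util.zip.Deflater. It is not SPDY's actual framing code and not what tomcat-lite does; the class name ConnectionCompressor and the sample header block are invented for illustration. The point is only that the Deflater (and its memory) has to live as long as the connection, because each header block is compressed against the history of the previous ones.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

public class ConnectionCompressor {
    // One compressor per connection: its window keeps earlier header blocks.
    private final Deflater deflater = new Deflater();
    private final byte[] scratch = new byte[4096];

    // Compresses one header block and sync-flushes so the peer can decode it now.
    public byte[] compressBlock(byte[] headers) {
        deflater.setInput(headers);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n;
        do {
            n = deflater.deflate(scratch, 0, scratch.length, Deflater.SYNC_FLUSH);
            out.write(scratch, 0, n);
        } while (n == scratch.length); // a full buffer means more output may be pending
        return out.toByteArray();
    }

    // Releases the native zlib memory; call when the connection closes.
    public void close() {
        deflater.end();
    }

    public static void main(String[] args) {
        byte[] headers = ("GET /index.html HTTP/1.1\r\nHost: example.org\r\n"
                + "User-Agent: sketch/1.0\r\nAccept: text/html\r\n\r\n")
                .getBytes(StandardCharsets.US_ASCII);
        ConnectionCompressor c = new ConnectionCompressor();
        System.out.println("first block:  " + c.compressBlock(headers).length + " bytes");
        System.out.println("second block: " + c.compressBlock(headers).length + " bytes");
        c.close();
    }
}
```

The second block should come out much smaller than the first, since it can refer back to the first one. That saving is what you pay the per-connection memory for.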


Wednesday, January 06, 2010

Heap dumps - looking at all objects and fields in a live system, without a debugger

Do you have a Java server - maybe in production - and you want to look at the value of some field? You can't attach a debugger, since that would require starting the server with various flags - but you can get a heap dump, which includes all objects, including the values of their fields.

The other use is to generate a heap dump before and after running a load test - and look at the garbage to evaluate how much work you put on the garbage collector and to find leaks. There are profilers that can get better results - but this is a pretty fast and free way to get much the same information, and you can use it against running systems.

Getting heap dumps:

 jmap -dump:format=b,file=heap3.bin PID_OF_JAVA_PROCESS

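Using code (in a servlet, etc.):
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

MBeanServer server = ManagementFactory.getPlatformMBeanServer();
// dumpHeap(String outputFile, boolean live): live=true dumps only reachable objects
server.invoke(new ObjectName("com.sun.management:type=HotSpotDiagnostic"),
              "dumpHeap",
              new Object[] {fileName, Boolean.TRUE},
              new String[] {String.class.getName(), "boolean"});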
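The same call can also be made through the typed com.sun.management.HotSpotDiagnosticMXBean interface, which avoids the string signature array. This is a sketch for HotSpot JVMs on Java 7 or later, not something I have run against every JDK; the HeapDumper class and the default file name are placeholders. As far as I know, dumpHeap refuses to overwrite an existing file, and newer JDKs insist on a .hprof extension, so the example uses one.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HeapDumper {
    // Writes a heap dump of the current JVM to target.
    // liveOnly=true dumps only reachable objects; the target file must not exist yet.
    static void dumpHeap(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toString(), liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file name, only for the sketch.
        Path target = Paths.get(args.length > 0 ? args[0] : "heap-sketch.hprof");
        dumpHeap(target, true);
        System.out.println("Heap dump written to " + target.toAbsolutePath());
    }
}
```

For the before/after comparison in a load test, passing liveOnly=false keeps the unreachable objects in the dump, which is what you want when you are looking at garbage.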

Looking at the data:


  Eclipse MemoryAnalyzer 

or 
  jhat heap3.bin 
  open http://localhost:7000



Saturday, January 02, 2010

Flush() in a compressed stream

I finally understood the difference between 'sync' and partial flush: http://www.bolet.org/~pornin/deflate-flush.html

Since the compressed output may not end on an 8-bit boundary, a flush has to pad, i.e. end the current compression block and start a new one. The interesting part is that a flush with Z_SYNC_FLUSH always ends the output with 0x00 0x00 0xFF 0xFF - and because those four bytes are predictable, protocols like PPP can strip them.
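A small way to see the marker from Java, assuming Java 7 or later where Deflater.deflate accepts a flush mode (Deflater.SYNC_FLUSH corresponds to zlib's Z_SYNC_FLUSH). The class and helper names are invented for the sketch, and it uses raw deflate (nowrap=true) so there is no zlib header in the way.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;

public class SyncFlushMarker {
    // Compresses input as raw deflate and ends with a sync flush (no stream end).
    static byte[] deflateWithSyncFlush(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            deflater.setInput(input);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            do {
                n = deflater.deflate(buf, 0, buf.length, Deflater.SYNC_FLUSH);
                out.write(buf, 0, n);
            } while (n == buf.length); // full buffer: call again for the rest
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    // True if data ends with the empty stored block 00 00 FF FF.
    static boolean endsWithSyncMarker(byte[] data) {
        int len = data.length;
        return len >= 4
                && data[len - 4] == 0x00 && data[len - 3] == 0x00
                && data[len - 2] == (byte) 0xFF && data[len - 1] == (byte) 0xFF;
    }

    // What a PPP-like protocol could do: drop the predictable 4 bytes before sending.
    // The receiver has to append them again before inflating.
    static byte[] stripSyncMarker(byte[] data) {
        return endsWithSyncMarker(data) ? Arrays.copyOf(data, data.length - 4) : data;
    }

    public static void main(String[] args) {
        byte[] out = deflateWithSyncFlush("hello hello hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println("ends with 00 00 FF FF: " + endsWithSyncMarker(out));
        System.out.println("bytes on the wire after stripping: " + stripSyncMarker(out).length);
    }
}
```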

Another good link: http://www.zlib.net/zlib_tech.html