Thursday, November 8, 2012

Performance Monitoring in Solaris

If you're unfortunate enough to use Solaris in production, you can at least console yourself with some of the excellent performance monitoring tools the OS supports. We recently had some performance issues which we needed to diagnose on a production server -- i.e. no profiler.

top == prstat

Your first port of call for CPU monitoring is normally top, which doesn't exist on Solaris (of course). Use prstat instead. It shows essentially the same information.

Individual Java Thread Monitoring

prstat has another trick -- it can break a process down into its individual threads. The -L flag reports one line per LWP (thread) and -m switches on microstate accounting. Use:
prstat -mvL -p <processid>
To get output like:
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID 
  2108 stuart   1.7 0.0 0.0 0.0 0.0  98 0.0 0.0   5   2   5   0 java/52
  2108 stuart   0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 100   0 100   0 java/18
  2108 stuart   0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  50   0  50   0 java/28
  2108 stuart   0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   5   0   5   0 java/21
  2108 stuart   0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   5   1   5   0 java/8
  2108 stuart   0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   1   0   1   0 java/20
  2108 stuart   0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   2   0   2   0 java/85
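
With dozens of threads this output gets hard to eyeball, so it can help to script the "which thread is busiest" step. The following is only a rough sketch in Python: it assumes the column layout shown above, and the names parse_prstat and pstack_command are made up for illustration. The sample text is a few rows copied from the output above.

```python
# Sketch: pick the busiest thread out of `prstat -mL` style output.
# Assumes the header and column layout shown in the post.
SAMPLE = """\
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
  2108 stuart   1.7 0.0 0.0 0.0 0.0  98 0.0 0.0   5   2   5   0 java/52
  2108 stuart   0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 100   0 100   0 java/18
  2108 stuart   0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   5   1   5   0 java/8
"""


def parse_prstat(text):
    """Return one dict per thread line; headers and summary lines are skipped."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "PID":
            header = fields  # prstat repeats the header on every refresh
            continue
        if header is None or len(fields) != len(header) or not fields[0].isdigit():
            continue  # e.g. the trailing "Total: ..." summary line
        row = dict(zip(header, fields))
        # The last column is "process/lwpid"; the LWP ID follows the slash.
        _, _, lwp = row["PROCESS/LWPID"].rpartition("/")
        rows.append({
            "pid": int(row["PID"]),
            "lwp": int(lwp),
            "usr": float(row["USR"]),
            "sys": float(row["SYS"]),
            "lck": float(row["LCK"]),
        })
    return rows


def pstack_command(row):
    """Build the pstack invocation for a single thread."""
    return "pstack {}/{}".format(row["pid"], row["lwp"])


rows = parse_prstat(SAMPLE)
# "Busiest" here means most time on CPU: user plus system percentage.
busiest = max(rows, key=lambda r: r["usr"] + r["sys"])
print(pstack_command(busiest))
```

Sorting on the lck value instead would surface the threads that spend their time waiting on locks, which is often the more interesting question for a Java process.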

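man prstat will explain what the columns mean. Briefly, USR and SYS are the percentage of time spent in user and system mode, LCK is time spent waiting on user locks, and SLP is time spent sleeping. The LWP ID of each thread is the number after the slash in the last column; pass it to pstack to get that thread's stack trace. In this case, I could run pstack 2108/52 to see what the top thread looks like: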
-----------------  lwp# 2108 / thread# 52  --------------------
 fc829624 * *com/sun/org/apache/xml/internal/dtm/ref/dom2dtm/DOM2DTM.nextNode()Z [compiled] 
 fc2c5bb8 * *com/sun/org/apache/xml/internal/dtm/ref/dom2dtm/DOM2DTM.getHandleFromNode(Lorg/w3c/dom/Node;)I [compiled] +50 (line 1341)
 fca4f1b8 * *com/sun/org/apache/xml/internal/dtm/ref/dom2dtm/DOM2DTM.getHandleOfNode(Lorg/w3c/dom/Node;)I [compiled] +89 (line 1409)
 fca4f1b8 * *com/sun/org/apache/xml/internal/dtm/ref/DTMManagerDefault.getDTMHandleFromNode(Lorg/w3c/dom/Node;)I+219 (line 1135)
 fca1088c * *com/sun/org/apache/xpath/internal/XPathContext.getDTMHandleFromNode(Lorg/w3c/dom/Node;)I [compiled] +6 (line 364)

IO Monitoring

This one is easy -- run iostat -xtc 1 10000 to get a report every second, 10,000 times. Each device gets its own line of statistics; of particular interest is the %b column, which is the percentage of time the disk was busy (i.e. had transactions in progress).
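
If you'd rather not stare at a scrolling terminal, the %b check can be scripted. Here is a rough Python sketch that finds the %b column by name in the header line and lists devices at or above a threshold. The sample text is made up for illustration (it is not output from our server), and the function name busy_devices and the 80% threshold are my own assumptions.

```python
# Sketch: flag busy disks in `iostat -x` style output.
# SAMPLE is invented illustrative data, not real measurements.
SAMPLE = """\
                  extended device statistics
device       r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
sd0          0.5    2.0    4.0   16.0  0.0  0.0    3.1   0   1
sd1        120.0   85.0 9600.0 6800.0  4.2  1.9   28.7  35  97
"""


def busy_devices(text, threshold=80.0):
    """Return (device, percent_busy) pairs whose %b is at or above threshold."""
    column = None
    busy = []
    for line in text.splitlines():
        fields = line.split()
        if "%b" in fields:
            # Header line: remember where %b sits rather than hard-coding it.
            column = fields.index("%b")
            continue
        if column is None or len(fields) <= column:
            continue  # banner lines, blank lines, anything before the header
        try:
            percent = float(fields[column])
        except ValueError:
            continue  # not a data row
        if percent >= threshold:
            busy.append((fields[0], percent))
    return busy


for device, percent in busy_devices(SAMPLE):
    print("{} is {:.0f}% busy".format(device, percent))
```

Looking the column up by name keeps the sketch working when extra columns follow %b, as they do when the tty and cpu figures are switched on.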

IO Monitoring per process

As far as I can tell, this isn't possible with the built-in tools. I believe DTrace can do it, for example via the iotop script from the DTraceToolkit, but I didn't have enough privileges on the machine to try it.