Other online diaries:
Fri, 07 Aug 2009
Tracking down disk accesses - 14:02
I started looking around to see how to track down the problem. I have used iostat in the past to give me some details about activity happening. However the problem with that is it does not tell you which process is doing things. Running top also is not any good as it does not identify io and it also does not show cumulative use/hits as iostat and similar tools do.
There is a python program called iotop I have not tried yet, also there are files in /proc now days you can laboriously work your way through to track down some of this information. However while looking around and reading some of the man pages I discovered the existence of pidstat. This program is fantastic. It can display accumulated disk,vm,cpu,thread information on a per process basis. This is a program I have wished I had for years.
So I ran pidstat -d 5 and watched to see what was writing to the disk so often. First I noticed the predictable kjournald. Rather than messing around trying to change commit interval for this I found there is a laptop-mode-tools package I should have had installed on my laptop. I have now installed it and enabled it to operate even when AC power is plugged in and now kjournald seems to be able to go for minutes at a time without needing to write to disk.
Next I noticed xulrunner-stub was writing often and causing the disk to spin up now it was spun down due to laptop_mode. This is firefox (or iceweasel in the debian case). I found details suggesting firefox 3.0.1 onward had an option to decrease the save/fsync regularity and that 3.5 and up should be even better. I installed 3.5.1 from debian experimental and found another page with 28 good firefox tips, one of which actually told me which about:config option to change to decrease the save/sync interval.
So the disk is not always spinning up or constantly accessing now, though there still appear to be a few culprits I could track down more information on in the pidstat output. Also I may want to play around with more proc settings such as /proc/sys/vm/dirty_expire_centisecs which can change how often pdflush sends stuff to disk and there are other suggestions around which if I think about may help too. Update: I also have since first writing this found a good Linux Journal article on what laptop mode does.
One of the reasons I am so excited about pidstat is it helps with my work a lot, if there is a problem with a mail server, or student login server or any number of other machines. Getting a read out of this information by process accumulated over time is really useful to work out what is causing issues and thus work on controlling and preventing problems.