Document filesystem/IO performance debugging tips
We've had a number of calls recently where we debugged performance issues that turned out to be caused by filesystem performance or access (a slow disk, networking issues, etc.). We should now know a few tips and tricks for identifying that IO is the issue.
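- Look for processes in the 'D' state (uninterruptible sleep). This usually means the process is waiting for IO. The STAT column often carries extra flags (for example 'D+' or 'Ds'), so match on the leading character rather than the exact string -
ps auxf | awk '$8 ~ /^D/'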
- Write 1000 small files to disk and see how long it takes. As part of this we will need to establish some baseline numbers. Off the top of my head, this took less than half a second on a VM on my MBP and 4-5 seconds on a customer's well-performing instance over NFS. We should also run this test on EFS to confirm it is a good tool for identifying IO issues -
time for i in {1..1000}; do echo 'test' > "test${i}.txt"; done
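To make these two checks easier to repeat on calls, they could be wrapped in a couple of shell functions. This is only a rough sketch: the function names `list_d_state` and `small_file_write_seconds` are hypothetical, the `iotest.XXXXXX` directory name is made up, and the timing assumes GNU `date` (for `%N`), so it will not work as-is with the stock `date` on macOS.

```shell
# Sketch only: function names, defaults, and the temp dir name are hypothetical.
# Assumes a Linux host with procps `ps` and GNU `date` (needed for %N).

# Print processes whose state starts with "D" (uninterruptible sleep,
# usually waiting on IO). Prefix matching also catches "D+", "Ds", etc.
list_d_state() {
  ps -eo stat,pid,comm | awk 'NR > 1 && $1 ~ /^D/'
}

# Write COUNT small files into a fresh temp directory under DIR,
# remove them again, and print the elapsed wall-clock seconds.
# Usage: small_file_write_seconds [DIR] [COUNT]
small_file_write_seconds() {
  local dir count tmp start end i
  dir="${1:-.}"
  count="${2:-1000}"
  tmp="$(mktemp -d "${dir%/}/iotest.XXXXXX")" || return 1

  start="$(date +%s.%N)"
  i=1
  while [ "$i" -le "$count" ]; do
    echo 'test' > "${tmp}/test${i}.txt"
    i=$((i + 1))
  done
  end="$(date +%s.%N)"

  # Clean up so repeated runs do not litter the filesystem under test.
  rm -rf -- "$tmp"

  awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}
```

For example, `small_file_write_seconds /path/to/nfs/mount` would time the write loop on that mount (the path here is a placeholder), and running the same function against local disk gives a number to compare against. Writing into a throwaway directory keeps the test files out of the current working directory.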
@stanhu @dstanley @lbot Can you please share your tips/tricks from any recent calls? Let's get them in one place and then we can work on documentation.
I wanted to get this issue open and start collecting these before we all forget what we've done.