Threaded force and stress
This PR improves the OpenMP threading of the force and stress computation. I looked at the Si63Ge-vc-relax test case: by decreasing the number of MPI ranks and increasing the number of OpenMP threads, I identified the force and stress routines that slowed down and enabled OpenMP threading in them.
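
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of change involved: a loop over atoms that accumulates a small tensor, threaded with an OpenMP reduction. It is illustrative only and is not taken from the routines touched by this PR. The function name `accumulate_stress`, the flat array layout, and the formula sigma[a][b] = sum_i r[i][a] * f[i][b] are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical kernel (not from this PR): accumulate a 3x3 tensor
 * sigma[3*a + b] = sum over atoms i of r[3*i + a] * f[3*i + b],
 * where r and f are flat arrays of length 3*nat. */
void accumulate_stress(size_t nat, const double *r, const double *f,
                       double sigma[9])
{
    double s[9] = {0.0};

    /* Each thread sums into a private copy of s; the copies are combined
     * at the end of the loop. Array-section reductions need OpenMP 4.5 or
     * later. Without OpenMP enabled the pragma is ignored and the loop
     * runs serially with the same result. */
    #pragma omp parallel for reduction(+ : s[:9])
    for (long i = 0; i < (long)nat; ++i) {
        for (int a = 0; a < 3; ++a) {
            for (int b = 0; b < 3; ++b) {
                s[3 * a + b] += r[3 * i + a] * f[3 * i + b];
            }
        }
    }

    /* Copy the reduced result to the caller's buffer. */
    for (int k = 0; k < 9; ++k) {
        sigma[k] = s[k];
    }
}
```

With a kernel like this, the thread count is set at run time through the standard `OMP_NUM_THREADS` environment variable, so the same binary can be timed with fewer MPI ranks and more threads per rank.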
PS: I noticed that the parallel build is failing due to an out-of-date dependency file.
make depend will need to be run before the release.