Multithreaded interface for assembling a `CoordinateMatrix`
Assembling sparse matrices, especially for Finite Element Methods, can take a substantial amount of time. For example, when run with 16 cores on a 60 x 60 x 60 element domain,
the example/helmholtz_3d_pml.cc driver spends more time in the sequential FEM matrix assembly than in the multithreaded factorization. A multithreaded interface to `CoordinateMatrix`
is therefore critical for parallel performance.
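
One possible shape for such an interface is sketched below, purely as an illustration: each thread appends element contributions to its own buffer of (row, column, value) triplets, and the buffers are concatenated after the threads join. The names `Triplet`, `AssembleLaplacian1D`, and `SumEntry` are hypothetical and are not part of the existing `CoordinateMatrix` API; a 1D Laplacian stands in for the Helmholtz element matrices to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical triplet type (an assumption for this sketch, not the
// library's actual CoordinateMatrix entry type).
struct Triplet {
  std::size_t row;
  std::size_t col;
  double value;
};

// Assembles the contributions of `num_elements` two-node 1D Laplacian
// elements using `num_threads` threads. Each thread writes only to its own
// buffer, so no locking is needed. Duplicate (row, col) entries at shared
// nodes are left unsummed, as is common for coordinate-format input.
std::vector<Triplet> AssembleLaplacian1D(std::size_t num_elements,
                                         std::size_t num_threads) {
  assert(num_threads > 0);
  std::vector<std::vector<Triplet>> local(num_threads);
  std::vector<std::thread> workers;
  workers.reserve(num_threads);
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([&local, t, num_elements, num_threads]() {
      // Contiguous block of elements owned by thread t.
      const std::size_t begin = t * num_elements / num_threads;
      const std::size_t end = (t + 1) * num_elements / num_threads;
      std::vector<Triplet>& buffer = local[t];
      buffer.reserve(4 * (end - begin));
      for (std::size_t e = begin; e < end; ++e) {
        // Local element matrix [[1, -1], [-1, 1]] on nodes e and e + 1.
        buffer.push_back({e, e, 1.0});
        buffer.push_back({e, e + 1, -1.0});
        buffer.push_back({e + 1, e, -1.0});
        buffer.push_back({e + 1, e + 1, 1.0});
      }
    });
  }
  for (std::thread& worker : workers) {
    worker.join();
  }

  // Sequential merge of the per-thread buffers into one triplet list.
  std::vector<Triplet> entries;
  entries.reserve(4 * num_elements);
  for (const std::vector<Triplet>& buffer : local) {
    entries.insert(entries.end(), buffer.begin(), buffer.end());
  }
  return entries;
}

// Sums all triplets at (row, col), mimicking the duplicate summation that a
// coordinate-format matrix would perform when it is finalized.
double SumEntry(const std::vector<Triplet>& entries, std::size_t row,
                std::size_t col) {
  double sum = 0.0;
  for (const Triplet& entry : entries) {
    if (entry.row == row && entry.col == col) {
      sum += entry.value;
    }
  }
  return sum;
}
```

Per-thread buffers avoid contention during assembly at the cost of a final sequential merge; whether that trade-off suits the real interface depends on how `CoordinateMatrix` stores and finalizes its entries.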