Add synchronize method to all devices.
This simplifies writing generic device code. Previously only the GPU
and SYCL tensor devices had a synchronize method, which is required
in testing to ensure all operations have completed. Added a dummy
method for the threadpool and default devices, which are synchronous
by default.
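
For illustration only, a minimal sketch of the kind of generic code this
enables. The types ImmediateDevice and DeferredDevice, the enqueue method,
and the add_on_device helper are hypothetical stand-ins invented for this
example and are not the library's actual device classes; the only point
is that a template can call synchronize() on any device without
special-casing the synchronous ones.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a device that runs work immediately,
// in the spirit of the default and threadpool devices described above.
struct ImmediateDevice {
  void enqueue(const std::function<void()>& work) const { work(); }
  // Dummy method: nothing is ever pending, so there is nothing to wait for.
  void synchronize() const {}
};

// Hypothetical stand-in for an asynchronous device (GPU-like):
// work is only queued, and runs when synchronize() is called.
struct DeferredDevice {
  mutable std::vector<std::function<void()>> pending;

  void enqueue(const std::function<void()>& work) const {
    pending.push_back(work);
  }

  void synchronize() const {
    for (auto& work : pending) work();
    pending.clear();
  }
};

// Generic code: the same template works for every device type because
// each one exposes synchronize(), even if it is a no-op.
template <typename Device>
int add_on_device(const Device& device, int a, int b) {
  int result = 0;
  device.enqueue([&result, a, b] { result = a + b; });
  device.synchronize();  // make sure the queued work has finished
  return result;
}
```

Without the no-op on the synchronous device, add_on_device would need a
separate overload or a type trait to decide whether to wait.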