maindims "chunking" option would be nice

For data with large arrays along maindims, the arrays are sometimes so large that loading everything into memory at once slows computations down a lot. Also, some computations are local: each cell's result depends on no other cells, or only on nearby cells.

In those cases, it is probably much faster to break up the maindims into chunks, load one chunk at a time, and combine results appropriately.

The idea is something like: cc(var, chunks=dict(xsize=10)) would do cc(var, slices=dict(x=slice(0, 10))), cc(var, slices=dict(x=slice(10, 20))), ... etc., and then join the results together.
This might make it easy to add an option to parallelize (e.g. via pc.TaskList).
It might also make it easy to add an option to save intermediate results, e.g. via arr.pc.save(f'{dst}/{basename}__{x}chunk_{chunknum}_of_{numchunks}'). In that case, one could call pc.xarray_mergeload(dst) to get the full result.
It might be super useful for VDEMs, which are huge arrays internally.
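
A rough sketch of the slice-then-join idea, using plain NumPy rather than the actual cc / slices machinery. The names chunk_slices and compute_chunked are hypothetical, and the in-memory `data` array only stands in for values that would really be loaded one chunk at a time:

```python
import numpy as np

def chunk_slices(length, size):
    """Return slices covering range(length) in pieces of at most `size` cells."""
    if size <= 0:
        raise ValueError("chunk size must be positive")
    return [slice(start, min(start + size, length))
            for start in range(0, length, size)]

def compute_chunked(compute, length, size, axis=0):
    """Call compute(slice) once per chunk, then join the results along `axis`."""
    pieces = [compute(s) for s in chunk_slices(length, size)]
    return np.concatenate(pieces, axis=axis)

# Stand-in for data on disk (assumption: real code would load each chunk separately).
data = np.arange(25, dtype=float)

def local_compute(s):
    # A purely local operation: each output cell depends only on the same input cell.
    return data[s] ** 2

result = compute_chunked(local_compute, length=data.size, size=10)

# The chunked result matches the all-at-once result for a local operation.
assert np.array_equal(result, data ** 2)
```

Because each chunk is computed independently, the list comprehension in compute_chunked is also the natural place to hand chunks to a task runner or to save each piece before joining.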

NOTE: need to think a bit about safety measures, e.g. chunking during FFTs should probably emit a warning or crash.
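
One possible shape for such a safety check, as a sketch only. NONLOCAL_OPS, ChunkingError, and check_chunking_allowed are all made-up names, and how the real code would know which operations are non-local is an open question:

```python
import warnings

# Hypothetical registry of operations that need the full array along the chunked dim.
NONLOCAL_OPS = {"fft", "ifft"}

class ChunkingError(ValueError):
    """Raised when chunking is requested for an operation that cannot be chunked."""

def check_chunking_allowed(op_name, chunks, mode="crash"):
    """Return True if chunking may proceed for this operation.

    For a non-local operation with chunking requested, either raise
    (mode="crash") or warn and return False (mode="warn"), so that the
    caller can fall back to an unchunked computation.
    """
    if not chunks or op_name not in NONLOCAL_OPS:
        return True
    message = f"{op_name!r} is non-local; chunking with {chunks} would give wrong results"
    if mode == "crash":
        raise ChunkingError(message)
    warnings.warn(message, stacklevel=2)
    return False
```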

NOTE: also want to think a bit about the stagger mesh in Bifrost, e.g. might want to pre-load any stagger vars across the full box to avoid repeatedly loading values at the edges of chunks (when slices are enabled, stagger needs to load a few extra values to get proper answers). A first-pass implementation can probably just ignore this part.
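
To illustrate why the extra edge values matter, here is a generic "halo" sketch. It is not the Bifrost stagger operator: np.gradient (a centered difference) just stands in for any computation that depends on nearby cells, and gradient_chunked is a hypothetical name:

```python
import numpy as np

def gradient_chunked(data, size, halo=1):
    """Chunked np.gradient along axis 0, loading `halo` extra cells per chunk edge."""
    n = data.shape[0]
    pieces = []
    for start in range(0, n, size):
        stop = min(start + size, n)
        lo = max(start - halo, 0)   # extend left, but not past the box edge
        hi = min(stop + halo, n)    # extend right, but not past the box edge
        padded = data[lo:hi]        # stand-in for loading the chunk plus its halo
        grad = np.gradient(padded)
        offset = start - lo
        # Keep only the cells that belong to this chunk; drop the halo cells.
        pieces.append(grad[offset:offset + (stop - start)])
    return np.concatenate(pieces)

data = np.sin(np.linspace(0.0, 3.0, 25))

# With a one-cell halo, the chunked result agrees with the full-array result.
assert np.allclose(gradient_chunked(data, size=10), np.gradient(data))
```

Each interior chunk edge causes `halo` cells to be loaded twice, which is the repeated loading that pre-loading stagger vars across the full box would avoid.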

Flagging as relevant for @jumasy to consider as well, as per our discussion today. Though I will probably be the one who sets this up in the code itself 😄