Estimating correlation times for reasonable error estimates
We currently calculate error estimates based on equations only suitable for uncorrelated data. However, we do not warn the user if his data is correlated. A proposed method is to store a time series array of an observable and estimate the correlation time using the method of pymbar. A skeleton code for the AnalysisBase
taken from MDAnalysis could look like:
import warnings
class AnalysisBase(object):
def _single_frame(self):
"""Single frame calculations.
Called after the trajectory is moved onto each new frame.
Returns
-------
observable : float
An observable of the current frame. Usd for estimating
the correlation tim.
"""
observable = 0.0
return observable
def run(self, start=None, stop=None, step=None, frames=None, verbose=None):
timeseries = np.zeros(self.n_frames)
for i, ts in enumerate(trajectory):
timeseries[i] = self._single_frame()
corr_frames = statisticalInefficiency(timeseries, fast=True)/2 - 1
if corr_frames < self.step:
warnings.warn("Correlation time of the observables seem to be "
"larger than your step size. Consider increasing "
f"your step size to {corr_frames + 1} to get a "
"reasonable error estimate.")
For analysis classes calculating more than one property the developer has to choose one appropriate which is used for estimating the correlation time.
Instead of introducing pymbar as a new dependency, I suggest we only take the statisticalInefficiency
function directly into MAICoS and
acknowledge pymbar accordingly. There should be no license clash since pymbar uses MIT.
[1] J. D. Chodera, W. C. Swope, J. W. Pitera, C. Seok, and K. A. Dill. Use of the weighted