pyemma.coordinates.covariance_lagged

pyemma.coordinates.covariance_lagged(data=None, c00=True, c0t=True, ctt=False, remove_constant_mean=None, remove_data_mean=False, reversible=False, bessel=True, lag=0, weights='empirical', stride=1, skip=0, chunksize=None, ncov_max=inf, column_selection=None, diag_only=False)

Compute lagged covariances between time series. If data is available as an array of size (TxN), where T is the number of time steps and N the number of dimensions, this function can compute lagged covariances like

$\begin{split}C_{00} &= X^T X \\ C_{0t} &= X^T Y \\ C_{tt} &= Y^T Y,\end{split}$

where X comprises the first T-lag time steps and Y the last T-lag time steps. More than one time series can be used, and the number of time steps may differ between time series.
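The definitions above can be sketched in plain NumPy. This is a minimal illustration of the formulas only, not the library's implementation: the helper name `lagged_covariances` is hypothetical, the matrices are left unnormalized exactly as written in the equations, and options such as mean removal, Bessel's correction, or re-weighting are not applied.

```python
import numpy as np


def lagged_covariances(trajs, lag):
    """Accumulate unnormalized C00, C0t and Ctt over a list of (T_i, N) arrays.

    Hypothetical helper for illustration; it is not part of PyEMMA.
    """
    n = trajs[0].shape[1]
    c00 = np.zeros((n, n))
    c0t = np.zeros((n, n))
    ctt = np.zeros((n, n))
    for traj in trajs:
        t = traj.shape[0]
        if t <= lag:
            continue  # too short to contribute any lagged pair
        x = traj[: t - lag]  # first T - lag time steps
        y = traj[lag:]       # last T - lag time steps
        c00 += x.T @ x
        c0t += x.T @ y
        ctt += y.T @ y
    return c00, c0t, ctt


# Two synthetic trajectories of different length, three dimensions each.
rng = np.random.default_rng(0)
trajs = [rng.normal(size=(100, 3)), rng.normal(size=(60, 3))]
c00, c0t, ctt = lagged_covariances(trajs, lag=5)
```

Each trajectory contributes its own T_i - lag pairs, so trajectories of different length can be combined by summing their contributions.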

Parameters

data (ndarray (T, d) or list of ndarray (T_i, d) or a reader created by source function) – array with the data, if available. When given, the covariances are immediately computed.
c00 (bool, optional, default=True) – compute instantaneous correlations over the first part of the data. If lag==0, use all of the data.
c0t (bool, optional, default=True) – compute lagged correlations. Does not work with lag==0.
ctt (bool, optional, default=False) – compute instantaneous correlations over the second part of the data. Does not work with lag==0.
remove_constant_mean (ndarray(N,), optional, default=None) – subtract a constant vector of mean values from the time series.
remove_data_mean (bool, optional, default=False) – subtract the sample mean from the time series (mean-free correlations).
reversible (bool, optional, default=False) – symmetrize correlations.
bessel (bool, optional, default=True) – use Bessel’s correction for correlations in order to use an unbiased estimator.
lag (int, optional, default=0) – lag time. A lag of 0 does not work with c0t=True or ctt=True.
weights (optional, default="empirical") –
Re-weighting strategy to be used in order to compute equilibrium covariances from non-equilibrium data.
- "empirical": no re-weighting
- "koopman": use the re-weighting procedure from Ref. 1
- weights: An object that computes re-weighting factors. It must possess a method
  weights(X) that accepts a trajectory X (np.ndarray(T, n)) and returns a vector of re-weighting factors (np.ndarray(T,)).
stride (int, optional, default = 1) – Use only every stride-th time step. By default, every time step is used.
skip (int, optional, default=0) – number of initial frames to skip in each trajectory.
chunksize (int, default=None) – Number of data frames to process at once. Choose a higher value to optimize thread usage and gain processing speed. If None is passed, the default value of the underlying reader/data source is used. Choose zero to disable chunking entirely.
ncov_max (int, default=infinity) – limit the memory usage of the algorithm from Ref. 2 to an amount that corresponds to ncov_max additional copies of each correlation matrix.
column_selection (ndarray(k, dtype=int) or None) – Indices of those columns that are to be computed. If None, all columns are computed.
diag_only (bool) – If True, the computation is restricted to the diagonal entries (autocorrelations) only.
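As an illustration of the custom `weights` option described above, the following sketch shows an object that satisfies the stated interface: a `weights(X)` method that takes a trajectory of shape (T, n) and returns a vector of shape (T,). The class `ExampleWeights`, its exponential form, and the helper `weighted_c0t` are hypothetical and chosen only for demonstration. Applying one factor per frame of X, as done here, is an assumption about how such weights enter the sums, not a description of the library's internals.

```python
import numpy as np


class ExampleWeights:
    """Hypothetical re-weighting object; the exponential form is illustrative only."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def weights(self, X):
        # Accepts a trajectory of shape (T, n), returns factors of shape (T,).
        X = np.asarray(X, dtype=float)
        return np.exp(-self.scale * X[:, 0] ** 2)


def weighted_c0t(traj, lag, weight_obj):
    """Unnormalized weighted lagged matrix sum_t w(x_t) x_t y_t^T (assumed form)."""
    t = traj.shape[0]
    x = traj[: t - lag]          # first T - lag time steps
    y = traj[lag:]               # last T - lag time steps
    w = weight_obj.weights(x)    # one factor per frame of x
    return (x * w[:, None]).T @ y


rng = np.random.default_rng(1)
traj = rng.normal(size=(200, 2))
c0t_weighted = weighted_c0t(traj, lag=10, weight_obj=ExampleWeights(scale=0.5))
```

With `scale=0` every factor equals 1, so the result reduces to the unweighted product X^T Y.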

Returns

lc

Return type

a LaggedCovariance object.

1: Wu, H., Nueske, F., Paul, F., Klus, S., Koltai, P., and Noe, F. 2016. Bias reduced variational approximation of molecular kinetics from short off-equilibrium simulations. J. Chem. Phys. (submitted)
2: Chan, T. F., Golub, G. H., and LeVeque, R. J. 1979. Updating formulae and pairwise algorithms for computing sample variances. Technical Report STAN-CS-79-773, Department of Computer Science, Stanford University.