pyemma.coordinates.covariance_lagged
pyemma.coordinates.covariance_lagged(data=None, c00=True, c0t=True, ctt=False, remove_constant_mean=None, remove_data_mean=False, reversible=False, bessel=True, lag=0, weights='empirical', stride=1, skip=0, chunksize=None, ncov_max=inf, column_selection=None, diag_only=False)
Compute lagged covariances between time series. If data is available as an array of size (T x N), where T is the number of time steps and N the number of dimensions, this function can compute lagged covariances like
\[\begin{split}C_{00} &= X^T X \\ C_{0t} &= X^T Y \\ C_{tt} &= Y^T Y,\end{split}\]
where X comprises the first T-lag time steps and Y the last T-lag time steps. More than one time series can be used, and the number of time steps may differ between time series.
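The formulas above can be sketched in plain NumPy. This is only an illustration of the definitions, not PyEMMA's implementation: the helper name `lagged_blocks` is hypothetical, and no mean removal, re-weighting, or normalization is applied.

```python
import numpy as np

def lagged_blocks(data, lag):
    """Return (C00, C0t, Ctt) for one (T, N) array, following the formulas above.

    Hypothetical helper for illustration; not part of PyEMMA.
    """
    T = data.shape[0]
    if not 0 < lag < T:
        raise ValueError("lag must satisfy 0 < lag < T")
    X = data[:T - lag]  # first T-lag time steps
    Y = data[lag:]      # last T-lag time steps
    return X.T @ X, X.T @ Y, Y.T @ Y

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 3))
C00, C0t, Ctt = lagged_blocks(data, lag=10)

# Several time series of different lengths: sum the per-trajectory contributions.
trajs = [rng.standard_normal((T_i, 3)) for T_i in (500, 800)]
C0t_total = sum(lagged_blocks(t, lag=10)[1] for t in trajs)
```

Each block is an N x N matrix. C00 and Ctt are symmetric by construction, while C0t is in general not symmetric.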
Parameters: - data (ndarray (T, d) or list of ndarray (T_i, d) or a reader created by the source function) – array with the data, if available. When given, the covariances are computed immediately.
- c00 (bool, optional, default=True) – compute instantaneous correlations over the first part of the data. If lag==0, use all of the data.
- c0t (bool, optional, default=True) – compute lagged correlations. Does not work with lag==0.
- ctt (bool, optional, default=False) – compute instantaneous correlations over the second part of the data. Does not work with lag==0.
- remove_constant_mean (ndarray(N,), optional, default=None) – subtract a constant vector of mean values from the time series.
- remove_data_mean (bool, optional, default=False) – subtract the sample mean from the time series (mean-free correlations).
- reversible (bool, optional, default=False) – symmetrize correlations.
- bessel (bool, optional, default=True) – apply Bessel’s correction to the correlations in order to obtain an unbiased estimator.
- lag (int, optional, default=0) – lag time. lag=0 does not work with c0t=True or ctt=True.
- weights (optional, default="empirical") – Re-weighting strategy to be used in order to compute equilibrium covariances from non-equilibrium data.
  - "empirical": no re-weighting
  - "koopman": use the re-weighting procedure from [1]
  - weights: an object that computes re-weighting factors. It must possess a method weights(X) that accepts a trajectory X (np.ndarray(T, n)) and returns a vector of re-weighting factors (np.ndarray(T,)).
- stride (int, optional, default = 1) – Use only every stride-th time step. By default, every time step is used.
- skip (int, optional, default=0) – number of initial frames to skip in each trajectory.
- chunksize (int, default=None) – Number of data frames to process at once. Choose a higher value to optimize thread usage and gain processing speed. If None is passed, the default value of the underlying reader/data source is used. Choose zero to disable chunking altogether.
- ncov_max (int, default=infinity) – limit the memory usage of the algorithm from [2] to an amount that corresponds to ncov_max additional copies of each correlation matrix
- column_selection (ndarray(k, dtype=int) or None) – Indices of those columns that are to be computed. If None, all columns are computed.
- diag_only (bool) – If True, the computation is restricted to the diagonal entries (autocorrelations) only.
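As a rough sketch of how a custom weights object, column_selection, and diag_only could fit together, consider the NumPy example below. It makes several assumptions: the class `ExponentialWeights` and the helper `weighted_c0t` are hypothetical names, the exponential weighting is arbitrary, each pair of frames is assumed to be weighted by the factor of its first frame, and column_selection is assumed to select columns of the result. PyEMMA's internals may differ.

```python
import numpy as np

class ExponentialWeights:
    """Hypothetical re-weighting object exposing weights(X) -> array of shape (T,)."""

    def __init__(self, rate):
        self.rate = rate

    def weights(self, X):
        # One non-negative factor per frame (illustrative choice only).
        return np.exp(-self.rate * np.arange(X.shape[0]))

def weighted_c0t(data, lag, weight_obj, column_selection=None, diag_only=False):
    """Weighted X^T Y for one (T, N) array; hypothetical helper, not PyEMMA code."""
    if not 0 < lag < data.shape[0]:
        raise ValueError("lag must satisfy 0 < lag < T")
    X, Y = data[:-lag], data[lag:]
    w = weight_obj.weights(X)  # shape (T - lag,)
    if diag_only:
        # Only the diagonal entries: sum_t w_t * x_t[i] * y_t[i]
        return np.sum(w[:, None] * X * Y, axis=0)
    if column_selection is not None:
        Y = Y[:, column_selection]  # compute only the selected columns
    return (w[:, None] * X).T @ Y

rng = np.random.default_rng(1)
data = rng.standard_normal((200, 4))
w = ExponentialWeights(rate=0.01)

full = weighted_c0t(data, 5, w)                                        # shape (4, 4)
cols = weighted_c0t(data, 5, w, column_selection=np.array([0, 2]))     # shape (4, 2)
diag = weighted_c0t(data, 5, w, diag_only=True)                        # shape (4,)
```

With rate=0.0 all factors equal one, so the result reduces to the unweighted X^T Y.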
Returns: lc
Return type: a LaggedCovariance object.

[1] Wu, H., Nueske, F., Paul, F., Klus, S., Koltai, P., and Noe, F. 2016. Bias reduced variational approximation of molecular kinetics from short off-equilibrium simulations. J. Chem. Phys. (submitted)
[2] Chan, T. F., Golub, G. H., LeVeque, R. J. 1979. Updating formulae and pairwise algorithms for computing sample variances. Technical Report STAN-CS-79-773, Department of Computer Science, Stanford University.