pyemma.coordinates.tica
pyemma.coordinates.tica(data=None, lag=10, dim=2, stride=1, force_eigenvalues_le_one=False)
Time-lagged independent component analysis (TICA).
TICA is a linear transformation method. In contrast to PCA, which finds coordinates of maximal variance, TICA finds coordinates of maximal autocorrelation at the given lag time. TICA is therefore useful for finding the slow components in a dataset, and is an excellent choice for transforming molecular dynamics data before clustering it to construct a Markov model. When the input data is the result of a Markov process (such as thermostatted molecular dynamics), TICA in fact finds an approximation to the eigenfunctions and eigenvalues of the underlying Markov operator [1].
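The contrast with PCA can be illustrated with a small NumPy experiment. This is only a sketch on synthetic data, not PyEMMA code: the helpers `ar1` and `autocorr`, the AR(1) coefficient and the noise amplitude are all assumptions chosen for illustration. One feature is uncorrelated in time but has a large variance, the other decorrelates slowly but has a smaller variance.

```python
import numpy as np


def ar1(a, n, rng):
    """AR(1) series x[t] = a * x[t-1] + noise; larger a means slower decay."""
    x = np.zeros(n)
    noise = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = a * x[t - 1] + noise[t]
    return x


def autocorr(y, lag):
    """Normalized autocorrelation of a 1-D series at the given lag."""
    y = y - y.mean()
    return float(y[:-lag] @ y[lag:] / (y @ y))


rng = np.random.default_rng(0)
T, lag = 20000, 10
slow = ar1(0.99, T, rng)                  # slowly decorrelating, modest variance
fast = 20.0 * rng.standard_normal(T)      # uncorrelated in time, large variance
X = np.column_stack([fast, slow])

# PCA: leading eigenvector of the covariance matrix (maximal variance).
Xc = X - X.mean(axis=0)
evals, evecs = np.linalg.eigh(Xc.T @ Xc / T)
pc1 = Xc @ evecs[:, -1]                   # eigh sorts eigenvalues in ascending order

pca_autocorr = autocorr(pc1, lag)         # autocorrelation along the PCA direction
slow_autocorr = autocorr(slow, lag)       # autocorrelation along the slow feature
print(pca_autocorr, slow_autocorr)
```

In this setup the first principal component follows the high-variance feature and carries little autocorrelation at the chosen lag, while the low-variance feature is the one a method that maximizes autocorrelation would select.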
This function estimates a TICA transformation from data. When input data is given as an argument, the estimation is carried out immediately, and the resulting object can be used to obtain eigenvalues and eigenvectors or to project input data onto the slowest TICA components. If no data is given, the returned object is an empty estimator that can be put into a pipeline() in order to use TICA in streaming mode.
Parameters:
- data (ndarray (T, d) or list of ndarray (T_i, d) or a reader created by source function) – array with the data, if available. When given, the TICA transformation is immediately computed and can be used to transform data.
- lag (int, optional, default = 10) – the lag time, in multiples of the input time step
- dim (int, optional, default = 2) – the number of dimensions (independent components) to project onto. A call to the map function reduces the d-dimensional input to only dim dimensions such that the data preserves the maximum possible autocorrelation amongst dim-dimensional linear projections.
- stride (int, optional, default = 1) – If set to 1, all input data will be used for estimation. Note that this can make the calculation very slow for large data sets. Since molecular dynamics data is usually correlated at short timescales, it is often sufficient to estimate transformations at a longer stride. Note that the stride option in the get_output() function of the returned object is independent, so you can parametrize at a long stride and still map all frames through the transformer.
- force_eigenvalues_le_one (boolean, optional, default = False) – Compute the covariance matrix and the time-lagged covariance matrix such that the generalized eigenvalues are guaranteed to be <= 1.
Returns: tica – Object for time-lagged independent component analysis (TICA). It contains the TICA eigenvalues and eigenvectors, and the projection of input data onto the dominant TICA components.
Return type: a TICA transformation object
Notes
Given a sequence of multivariate data \(X_t\) with mean \(\mu\) and a lag time \(\tau\), TICA computes the mean-free covariance and time-lagged covariance matrices:
\[\begin{split}C_0 &= (X_t - \mu)^T (X_t - \mu) \\ C_{\tau} &= (X_t - \mu)^T (X_{t + \tau} - \mu)\end{split}\]
and solves the eigenvalue problem
\[C_{\tau} r_i = C_0 \lambda_i r_i,\]
where \(r_i\) are the independent components and \(\lambda_i\) are their respective normalized time-autocorrelations. The eigenvalues are related to the relaxation timescales by
\[t_i = -\frac{\tau}{\ln |\lambda_i|}.\]
When used as a dimension reduction method, the input data is projected onto the dominant independent components.
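The steps above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not PyEMMA's implementation: the function name `tica_sketch`, the helper `ar1` and the synthetic test data are hypothetical, the covariances are averaged over frames, \(C_{\tau}\) is symmetrized so that the eigenvalues are real, and \(C_0\) is assumed to be positive definite.

```python
import numpy as np


def tica_sketch(X, lag, dim):
    """Illustrative TICA for one trajectory X of shape (T, d); not PyEMMA's code."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)              # remove the mean mu
    T = Xc.shape[0]
    # Mean-free covariance C_0 and time-lagged covariance C_tau.
    C0 = Xc.T @ Xc / T
    Ct = Xc[:-lag].T @ Xc[lag:] / (T - lag)
    Ct = 0.5 * (Ct + Ct.T)               # assumption: symmetrize so eigenvalues are real
    # Solve C_tau r_i = C_0 lambda_i r_i by whitening with C_0^{-1/2}
    # (assumes C_0 is positive definite, i.e. no redundant features).
    s, U = np.linalg.eigh(C0)
    W = U / np.sqrt(s)                   # W = U diag(s^{-1/2}), so W^T C_0 W = I
    lam, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(lam)[::-1]        # largest autocorrelation first
    lam, R = lam[order], W @ V[:, order]
    # Relaxation timescales t_i = -tau / ln|lambda_i|.
    timescales = -lag / np.log(np.abs(lam))
    # Project onto the dim dominant independent components.
    Y = Xc @ R[:, :dim]
    return lam, R, timescales, Y


def ar1(a, n, rng):
    """Hypothetical test signal: x[t] = a * x[t-1] + noise."""
    x = np.zeros(n)
    noise = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = a * x[t - 1] + noise[t]
    return x


rng = np.random.default_rng(1)
slow = ar1(0.99, 20000, rng)             # slow process, smaller variance
fast = 10.0 * ar1(0.5, 20000, rng)       # fast process, larger variance
X = np.column_stack([fast, slow])
lam, R, timescales, Y = tica_sketch(X, lag=5, dim=1)
print(lam, timescales)
```

For an AR(1) process with coefficient a, the autocorrelation at lag \(\tau\) is \(a^{\tau}\), so the leading eigenvalue here should be close to \(0.99^5\), and the single retained component should track the slow feature even though it has the smaller variance.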
TICA was originally introduced for signal processing in [2]. It was introduced to molecular dynamics, and as a method for the construction of Markov models, in [1] and [3]. It was shown in [1] that, when applied to molecular dynamics data, TICA yields an approximation to the eigenvalues and eigenvectors of the true underlying dynamics.
References
[1] Perez-Hernandez G, F Paul, T Giorgino, G De Fabritiis and F Noe. 2013. Identification of slow molecular order parameters for Markov model construction. J. Chem. Phys. 139, 015102. doi:10.1063/1.4811489
[2] L. Molgedey and H. G. Schuster. 1994. Separation of a mixture of independent signals using time delayed correlations. Phys. Rev. Lett. 72, 3634.
[3] Schwantes C, V S Pande. 2013. Improvements in Markov State Model Construction Reveal Many Non-Native Interactions in the Folding of NTL9. J. Chem. Theory Comput. 9, 2000-2009. doi:10.1021/ct300878a