pyemma.msm.estimation.tmatrix¶
-
pyemma.msm.estimation.
tmatrix
(C, reversible=False, mu=None, **kwargs)¶ Estimate the transition matrix from the given countmatrix.
Parameters: - C (numpy ndarray or scipy.sparse matrix) – Count matrix
- reversible (bool (optional)) – If True restrict the ensemble of transition matrices to those having a detailed balance symmetry otherwise the likelihood optimization is carried out over the whole space of stochastic matrices.
- mu (array_like) – The stationary distribution of the MLE transition matrix.
- **kwargs –
- = 1E-6 (eps) – Optional parameter with reversible = True and mu!=None. Regularization parameter for the interior point method. This value is added to the diagonal elements of C that are zero.
- Xinit ((M, M) ndarray) – Optional parameter with reversible = True. initial value for the matrix of absolute transition probabilities. Unless set otherwise, will use X = diag(pi) t, where T is a nonreversible transition matrix estimated from C, i.e. T_ij = c_ij / sum_k c_ik, and pi is its stationary distribution.
- = 1000000 (maxiter) – Optional parameter with reversible = True. maximum number of iterations before the method exits
- = 1e-8 (maxerr) – Optional parameter with reversible = True. convergence tolerance for transition matrix estimation. This specifies the maximum change of the Euclidean norm of relative stationary probabilities (\(x_i = \sum_k x_{ik}\)). The relative stationary probability changes \(e_i = (x_i^{(1)} - x_i^{(2)})/(x_i^{(1)} + x_i^{(2)})\) are used in order to track changes in small probabilities. The Euclidean norm of the change vector, \(|e_i|_2\), is compared to maxerr.
- = False (return_conv) – Optional parameter with reversible = True. If set to true, the stationary distribution is also returned
- = False – Optional parameter with reversible = True. If set to true, the likelihood history and the pi_change history is returned.
Returns: - P ((M, M) ndarray or scipy.sparse matrix) – The MLE transition matrix. P has the same data type (dense or sparse) as the input matrix C.
- The reversible estimator returns by default only P, but may also return
- (P,pi) or (P,lhist,pi_changes) or (P,pi,lhist,pi_changes) depending on the return settings
- P (ndarray (n,n)) – transition matrix. This is the only return for return_statdist = False, return_conv = False
- (pi) (ndarray (n)) – stationary distribution. Only returned if return_statdist = True
- (lhist) (ndarray (k)) – likelihood history. Has the length of the number of iterations needed. Only returned if return_conv = True
- (pi_changes) (ndarray (k)) – history of likelihood history. Has the length of the number of iterations needed. Only returned if return_conv = True
Notes
The transition matrix is a maximum likelihood estimate (MLE) of the probability distribution of transition matrices with parameters given by the count matrix.
References
[1] Prinz, J H, H Wu, M Sarich, B Keller, M Senne, M Held, J D Chodera, C Schuette and F Noe. 2011. Markov models of molecular kinetics: Generation and validation. J Chem Phys 134: 174105 [2] Bowman, G R, K A Beauchamp, G Boxer and V S Pande. 2009. Progress and challenges in the automated construction of Markov state models for full protein systems. J. Chem. Phys. 131: 124101 Examples
>>> from pyemma.msm.estimation import transition_matrix
>>> C = np.array([10, 1, 1], [2, 0, 3], [0, 1, 4]])
Non-reversible estimate
>>> T_nrev = transition_matrix(C) >>> T_nrev array([[ 0.83333333, 0.08333333, 0.08333333], [ 0.33333333, 0.16666667, 0.5 ], [ 0. , 0.2 , 0.8 ]])
Reversible estimate
>>> T_rev = transition_matrix(C) >>> T_rev array([[ 0.83333333, 0.10385552, 0.06281115], [ 0.29228896, 0.16666667, 0.54104437], [ 0.04925323, 0.15074676, 0.80000001]])
Reversible estimate with given stationary vector
>>> mu = np.array([0.7, 0.01, 0.29]) >>> T_mu = transition_matrix(C, reversible=True, mu=mu) >>> T_mu array([[ 0.94841372, 0.00534691, 0.04623938], [ 0.37428347, 0.12715063, 0.4985659 ], [ 0.11161229, 0.01719193, 0.87119578]])