pyemma.thermo.tram¶

pyemma.thermo.tram(ttrajs, dtrajs, bias, lag, unbiased_state=None, count_mode='sliding', connectivity='summed_count_matrix', maxiter=10000, maxerr=1e-15, save_convergence_info=0, dt_traj='1 step', connectivity_factor=1.0, nn=None, direct_space=False, N_dtram_accelerations=0, callback=None, init='mbar', init_maxiter=10000, init_maxerr=1e-08, equilibrium=None)¶

Transition-based reweighting analysis method

Parameters:	ttrajs (numpy.ndarray(T), or list of numpy.ndarray(T_i)) – A single discrete trajectory or a list of discrete trajectories. The integers are indexes in 0,...,num_therm_states-1 enumerating the thermodynamic states the trajectory is in at any time. dtrajs (numpy.ndarray(T) of int, or list of numpy.ndarray(T_i) of int) – A single discrete trajectory or a list of discrete trajectories. The integers are indexes in 0,...,num_conf_states-1 enumerating the num_conf_states Markov states or the bins the trajectory is in at any time. bias (numpy.ndarray(T, num_therm_states), or list of numpy.ndarray(T_i, num_therm_states)) – A single reduced bias energy trajectory or a list of reduced bias energy trajectories. For every simulation frame seen in trajectory i and time step t, btrajs[i][t, k] is the reduced bias energy of that frame evaluated in the k’th thermodynamic state (i.e. at the k’th umbrella/Hamiltonian/temperature) lag (int or list of int, optional, default=1) – Integer lag time at which transitions are counted. Providing a list of lag times will trigger one estimation per lag time. maxiter (int, optional, default=10000) – The maximum number of dTRAM iterations before the estimator exits unsuccessfully. maxerr (float, optional, default=1e-15) – Convergence criterion based on the maximal free energy change in a self-consistent iteration step. save_convergence_info (int, optional, default=0) – Every save_convergence_info iteration steps, store the actual increment and the actual loglikelihood; 0 means no storage. dt_traj (str, optional, default='1 step') – Description of the physical time corresponding to the lag. May be used by analysis algorithms such as plotting tools to pretty-print the axes. By default ‘1 step’, i.e. there is no physical time unit. Specify by a number, whitespace and unit. Permitted units are (* is an arbitrary string): ‘fs’, ‘femtosecond’ ‘ps’, ‘picosecond’ ‘ns’, ‘nanosecond’ ‘us’, ‘microsecond’ ‘ms’, ‘millisecond’ ‘s’, ‘second’ connectivity (str, optional, default='summed_count_matrix') – One of ‘summed_count_matrix’, ‘strong_in_every_ensemble’, ‘neighbors’, ‘post_hoc_RE’ or ‘BAR_variance’. Defines what should be considered a connected set in the joint space of conformations and thermodynamic ensembles. For details see thermotools.cset.compute_csets_TRAM. nn (int, optional, default=None) – Only needed if connectivity=’neighbors’ See thermotools.cset.compute_csets_TRAM. connectivity_factor (float, optional, default=1.0) – Only needed if connectivity=’post_hoc_RE’ or ‘BAR_variance’. Weakens the connectivity requirement, see thermotools.cset.compute_csets_TRAM. direct_space (bool, optional, default=False) – Whether to perform the self-consitent iteration with Boltzmann factors (direct space) or free energies (log-space). When analyzing data from multi-temperature simulations, direct-space is not recommended. N_dtram_accelerations (int, optional, default=0) – Convergence of TRAM can be speeded up by interleaving the updates in the self-consitent iteration with a dTRAM-like update step. N_dtram_accelerations says how many times the dTRAM-like update step should be applied in every iteration of the TRAM equations. Currently this is only effective if direct_space=True. init (str, optional, default=None) – Use a specific initialization for self-consistent iteration: None: use a hard-coded guess for free energies and Lagrangian multipliers ‘wham’: perform a short WHAM estimate to initialize the free energies init_maxiter (int, optional, default=10000) – The maximum number of self-consistent iterations during the initialization. init_maxerr (float, optional, default=1.0E-8) – Convergence criterion for the initialization.
Returns:	tram_estimators – A multi-ensemble Markov state model (for each given lag time) which consists of stationary and kinetic quantities at all temperatures/thermodynamic states.
Return type:	MEMM or list of MEMMs

Example

Umbrella sampling: Suppose we simulate in K umbrellas, centered at positions \(y_0,...,y_{K-1}\) with bias energies

\[b_k(x) = \frac{c_k}{2 \textrm{kT}} \cdot (x - y_k)^2\]

Suppose we have one simulation of length T in each umbrella, and they are ordered from 0 to K-1. We have discretized the x-coordinate into 100 bins. Then dtrajs and ttrajs should each be a list of \(K\) arrays. dtrajs would look for example like this:

[ (0, 0, 0, 0, 1, 1, 1, 0, 0, 0, ...),  (0, 1, 0, 1, 0, 1, 1, 0, 0, 1, ...), ... ]

where each array has length T, and is the sequence of bins (in the range 0 to 99) visited along the trajectory. ttrajs would look like this:

[ (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...),  (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...), ... ]

Because trajectory 1 stays in umbrella 1 (index 0), trajectory 2 stays in umbrella 2 (index 1), and so forth.

The bias would be a list of \(T \times K\) arrays which specify each frame’s bias energy in all thermodynamic states:

[ ((0, 1.7, 2.3, 6.1, ...), ...), ((0, 2.4, 3.1, 9,5, ...), ...), ... ]

Let us try the above example:

>>> from pyemma.thermo import tram
>>> import numpy as np
>>> ttrajs = [np.array([0,0,0,0,0,0,0]), np.array([1,1,1,1,1,1,1])]
>>> dtrajs = [np.array([0,0,0,0,1,1,1]), np.array([0,1,0,1,0,1,1])]
>>> bias = [np.array([[1,0],[1,0],[0,0],[0,0],[0,0],[0,0],[0,0]],dtype=np.float64), np.array([[1,0],[0,0],[0,0],[1,0],[0,0],[1,0],[1,0]],dtype=np.float64)]
>>> tram_obj = tram(ttrajs, dtrajs, bias, 1)
>>> tram_obj.log_likelihood() 
-29.111...
>>> tram_obj.count_matrices 
array([[[1 1]
        [0 4]]
       [[0 3]
        [2 1]]], dtype=int32)
>>> tram_obj.stationary_distribution 
array([ 0.38...  0.61...])

References

[1]	Wu, H. et al 2016 Multiensemble Markov models of molecular thermodynamics and kinetics Proc. Natl. Acad. Sci. USA 113 E3221–E3230