pyemma.thermo.dtram¶

pyemma.thermo.dtram(ttrajs, dtrajs, bias, lag, unbiased_state=None, count_mode='sliding', connectivity='reversible_pathways', maxiter=10000, maxerr=1e-15, save_convergence_info=0, dt_traj='1 step', init=None, init_maxiter=10000, init_maxerr=1e-08)¶

Discrete transition-based reweighting analysis method

Parameters

ttrajs (numpy.ndarray(T) of int, or list of numpy.ndarray(T_i) of int) – A single discrete trajectory or a list of discrete trajectories. The integers are indexes in 0,…,num_therm_states-1 enumerating the thermodynamic states the trajectory is in at any time.
dtrajs (numpy.ndarray(T) of int, or list of numpy.ndarray(T_i) of int) – A single discrete trajectory or a list of discrete trajectories. The integers are indexes in 0,…,num_conf_states-1 enumerating the num_conf_states Markov states or the bins the trajectory is in at any time.
bias (numpy.ndarray(shape=(num_therm_states, num_conf_states)) object) – bias_energies_full[j, i] is the bias energy in units of kT for each discrete state i at thermodynamic state j.
lag (int or list of int, optional, default=1) – Integer lag time at which transitions are counted. Providing a list of lag times will trigger one estimation per lag time.
unbiased_state (int, optional, default=None) – Index of the unbiased thermodynamic state or None if there is no unbiased data available.
count_mode (str, optional, default='sliding') –
Mode to obtain count matrices from discrete trajectories. Should be one of:
- ’sliding’a trajectory of length T will have \(T-\tau\) counts at time indexes
  \[(0 \rightarrow \tau), (1 \rightarrow \tau+1), ..., (T-\tau-1 \rightarrow T-1)\]
- ‘sample’a trajectory of length T will have \(T/\tau\) counts at time indexes
  \[(0 \rightarrow \tau), (\tau \rightarrow 2 \tau), ..., ((T/\tau-1) \tau \rightarrow T)\]
Currently only ‘sliding’ is supported.
connectivity (str, optional, default='reversible_pathways') –
One of ‘reversible_pathways’, ‘summed_count_matrix’ or None. Defines what should be considered a connected set in the joint (product) space of conformations and thermodynamic ensembles. * ‘reversible_pathways’ : requires that every state in the connected set

can be reached by following a pathway of reversible transitions. A reversible transition between two Markov states (within the same thermodynamic state k) is a pair of Markov states that belong to the same strongly connected component of the count matrix (from thermodynamic state k). A pathway of reversible transitions is a list of reversible transitions [(i_1, i_2), (i_2, i_3),…, (i_(N-2), i_(N-1)), (i_(N-1), i_N)]. The thermodynamic state where the reversible transitions happen, is ignored in constructing the reversible pathways. This is equivalent to assuming that two ensembles overlap at some Markov state whenever there exist frames from both ensembles in that Markov state.
- ’summed_count_matrix’ : all thermodynamic states are assumed to overlap. The connected set is then computed by summing the count matrices over all thermodynamic states and taking it’s largest strongly connected set. Not recommended!
- None : assume that everything is connected. For debugging.
For more details see pyemma.thermo.extensions.cset.compute_csets_dTRAM().
maxiter (int, optional, default=10000) – The maximum number of dTRAM iterations before the estimator exits unsuccessfully.
maxerr (float, optional, default=1e-15) – Convergence criterion based on the maximal free energy change in a self-consistent iteration step.
save_convergence_info (int, optional, default=0) – Every save_convergence_info iteration steps, store the actual increment and the actual loglikelihood; 0 means no storage.
dt_traj (str, optional, default='1 step') –
Description of the physical time corresponding to the lag. May be used by analysis algorithms such as plotting tools to pretty-print the axes. By default ‘1 step’, i.e. there is no physical time unit. Specify by a number, whitespace and unit. Permitted units are (* is an arbitrary string):

’fs’, ‘femtosecond*’

’ps’, ‘picosecond*’

’ns’, ‘nanosecond*’

’us’, ‘microsecond*’

’ms’, ‘millisecond*’

’s’, ‘second*’
init (str, optional, default=None) –
Use a specific initialization for self-consistent iteration:

None: use a hard-coded guess for free energies and Lagrangian multipliers

’wham’: perform a short WHAM estimate to initialize the free energies
init_maxiter (int, optional, default=10000) – The maximum number of self-consistent iterations during the initialization.
init_maxerr (float, optional, default=1.0E-8) – Convergence criterion for the initialization.

Returns

A multi-ensemble Markov state model (for each given lag time) which consists of stationary and kinetic quantities at all temperatures/thermodynamic states.

Return type

A MEMM object or list thereof

Example

Umbrella sampling: Suppose we simulate in K umbrellas, centered at positions \(y_0,...,y_{K-1}\) with bias energies

\[b_k(x) = \frac{c_k}{2 \textrm{kT}} \cdot (x - y_k)^2\]

Suppose we have one simulation of length T in each umbrella, and they are ordered from 0 to K-1. We have discretized the x-coordinate into 100 bins. Then dtrajs and ttrajs should each be a list of \(K\) arrays. dtrajs would look for example like this:

[ (0, 0, 0, 0, 1, 1, 1, 0, 0, 0, ...),  (0, 1, 0, 1, 0, 1, 1, 0, 0, 1, ...), ... ]

where each array has length T, and is the sequence of bins (in the range 0 to 99) visited along the trajectory. ttrajs would look like this:

[ (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...),  (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...), ... ]

Because trajectory 1 stays in umbrella 1 (index 0), trajectory 2 stays in umbrella 2 (index 1), and so forth. bias is a \(K \times n\) matrix with all reduced bias energies evaluated at all centers:

\[\begin{split}\left(\begin{array}{cccc} b_0(y_0) & b_0(y_1) & ... & b_0(y_{n-1}) \\ b_1(y_0) & b_1(y_1) & ... & b_1(y_{n-1}) \\ ... \\ b_{K-1}(y_0) & b_{K-1}(y_1) & ... & b_{K-1}(y_{n-1}) \end{array}\right)\end{split}\]

Let us try the above example:

>>> from pyemma.thermo import dtram
>>> import numpy as np
>>> ttrajs = [np.array([0,0,0,0,0,0,0,0,0,0]), np.array([1,1,1,1,1,1,1,1,1,1])]
>>> dtrajs = [np.array([0,0,0,0,1,1,1,0,0,0]), np.array([0,1,0,1,0,1,1,0,0,1])]
>>> bias = np.array([[0.0, 0.0], [0.5, 1.0]])
>>> dtram_obj = dtram(ttrajs, dtrajs, bias, 1)
>>> dtram_obj.log_likelihood() # doctest: +ELLIPSIS
-9.805...
>>> dtram_obj.count_matrices # doctest: +SKIP
array([[[5, 1],
        [1, 2]],
       [[1, 4],
        [3, 1]]], dtype=int32)
>>> dtram_obj.stationary_distribution # doctest: +ELLIPSIS
array([ 0.38...,  0.61...])

See MEMM for a full documentation.

class pyemma.thermo.models.memm.MEMM(models, f_therm, pi=None, f=None, label='ground state')¶

Coupled set of Markov state models at multiple thermodynamic states

Parameters

models (list of Model objects) – List of Model objects, e.g. StationaryModel or MSM objects, at the different thermodynamic states. This list may include the ground state, such that self.pi = self.models[0].pi holds. An example for that is data obtained from parallel tempering or replica-exchange, where the lowest simulated temperature is usually identical to the thermodynamic ground state. However, the list does not have to include the thermodynamic ground state. For example, when obtaining data from umbrella sampling, models might be the list of stationary models for n umbrellas (biased ensembles), while the thermodynamic ground state is the unbiased ensemble. In that case, self.pi would be different from any self.models[i].pi
f_therm (ndarray(k)) – free energies at the different thermodynamic states
pi (ndarray(n), default=None) – Stationary distribution of the thermodynamic ground state. If not already normalized, pi will be scaled to fulfill \(\sum_i \pi_i = 1\). If None, models[0].pi will be used
f (ndarray(n)) – Discrete-state free energies of the thermodynamic ground state.
label (str, default='ground state') – Human-readable description for the thermodynamic ground state or reference state of this multiensemble. May contain a temperature description, such as ‘300 K’ or a description of bias energy such as ‘unbiased’.

Methods

`expectation`(a)	Equilibrium expectation value of a given observable.
`get_model_params`([deep])	Get parameters for this model.
`load`(file_name[, model_name])	Loads a previously saved PyEMMA object from disk.
`meval`(f, args, *kw)	Evaluates the given function call for all models Returns the results of the calls in a list
`save`(file_name[, model_name, overwrite, …])	saves the current state of this object to given file and name.
`set_model_params`([models, f_therm, pi, f, label])	Call to set all basic model parameters.
`update_model_params`(**params)	Update given model parameter if they are set to specific values

Attributes

`active_set`	The active set of states on which all computations and estimations will be done.
`f`	The free energies (in units of kT) on the configuration states.
`f_full_state`
`free_energies`	The free energies (in units of kT) on the configuration states.
`free_energies_full_state`
`label`	Human-readable description for the thermodynamic state of this model.
`msm`	MSM of the unbiased thermodynamic state; only present when unbiased data available.
`nstates`	Number of active states on which all computations and estimations are done.
`nstates_full`	Size of the full set of states.
`pi`	The stationary distribution on the configuration states.
`pi_full_state`
`stationary_distribution`	The stationary distribution on the configuration states.
`stationary_distribution_full_state`
`unbiased_state`	Index of the unbiased thermodynamic state.

active_set¶: The active set of states on which all computations and estimations will be done.

expectation(a)¶

Equilibrium expectation value of a given observable.

Parameters: a ((M,) ndarray) – Observable vector
Returns: val – Equilibrium expectation value of the given observable
Return type: float

Notes

The equilibrium expectation value of an observable a is defined as follows

\[\mathbb{E}_{\mu}[a] = \sum_i \mu_i a_i\]

\(\mu=(\mu_i)\) is the stationary vector of the transition matrix \(T\).

f¶: The free energies (in units of kT) on the configuration states.

f_full_state¶

free_energies¶: The free energies (in units of kT) on the configuration states.

free_energies_full_state¶

get_model_params(deep=True)¶

Get parameters for this model.

Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

label¶: Human-readable description for the thermodynamic state of this model.

classmethod load(file_name, model_name='default')¶

Loads a previously saved PyEMMA object from disk.

Parameters

file_name (str or file like object (has to provide read method)) – The file like object tried to be read for a serialized object.
model_name (str, default='default') – if multiple models are contained in the file, these can be accessed by their name. Use pyemma.list_models() to get a representation of all stored models.

Returns

obj

Return type

the de-serialized object

meval(f, *args, **kw)¶: Evaluates the given function call for all models Returns the results of the calls in a list

msm¶: MSM of the unbiased thermodynamic state; only present when unbiased data available.

nstates¶: Number of active states on which all computations and estimations are done.

nstates_full¶: Size of the full set of states.

pi¶: The stationary distribution on the configuration states.

pi_full_state¶

save(file_name, model_name='default', overwrite=False, save_streaming_chain=False)¶

saves the current state of this object to given file and name.

Parameters

file_name (str) – path to desired output file
model_name (str, default='default') – creates a group named ‘model_name’ in the given file, which will contain all of the data. If the name already exists, and overwrite is False (default) will raise a RuntimeError.
overwrite (bool, default=False) – Should overwrite existing model names?
save_streaming_chain (boolean, default=False) – if True, the data_producer(s) of this object will also be saved in the given file.

Examples

>>> import pyemma, numpy as np
>>> from pyemma.util.contexts import named_temporary_file
>>> m = pyemma.msm.MSM(P=np.array([[0.1, 0.9], [0.9, 0.1]]))

>>> with named_temporary_file() as file: # doctest: +SKIP
...    m.save(file, 'simple') # doctest: +SKIP
...    inst_restored = pyemma.load(file, 'simple') # doctest: +SKIP
>>> np.testing.assert_equal(m.P, inst_restored.P) # doctest: +SKIP

set_model_params(models=None, f_therm=None, pi=None, f=None, label='ground state')¶

Call to set all basic model parameters.

Parameters

pi (ndarray(n)) – Stationary distribution. If not already normalized, pi will be scaled to fulfill \(\sum_i \pi_i = 1\). The free energies f will then be computed from pi via \(f_i = - \log(\pi_i)\).
f (ndarray(n)) – Discrete-state free energies. If normalized_f = True, a constant will be added to normalize the stationary distribution. Otherwise f is left as given. Then, pi will be computed from f via \(\pi_i = \exp(-f_i)\) and, if necessary, scaled to fulfill \(\sum_i \pi_i = 1\). If both (pi and f) are given, f takes precedence over pi.
normalize_energy (bool, default=True) – If parametrized by free energy f, normalize them such that \(\sum_i \pi_i = 1\), which is achieved by \(\log \sum_i \exp(-f_i) = 0\).
label (str, default=None) – Human-readable description for the thermodynamic state of this model. May contain a temperature description, such as ‘300 K’ or a description of bias energy such as ‘unbiased’ or ‘Umbrella 1’.

stationary_distribution¶: The stationary distribution on the configuration states.

stationary_distribution_full_state¶

unbiased_state¶: Index of the unbiased thermodynamic state.

update_model_params(**params)¶: Update given model parameter if they are set to specific values

References

1: Wu, H. et al 2014 Statistically optimal analysis of state-discretized trajectory data from multiple thermodynamic states J. Chem. Phys. 141, 214106