pyemma.msm.ImpliedTimescales¶
-
class
pyemma.msm.
ImpliedTimescales
(estimator, lags=None, nits=None, n_jobs=None, show_progress=True, only_timescales=False)¶ -
__init__
(estimator, lags=None, nits=None, n_jobs=None, show_progress=True, only_timescales=False)¶ Initialize self. See help(type(self)) for accurate signature.
Methods
__init__
(estimator[, lags, nits, n_jobs, …])Initialize self.
estimate
(X, **params)- param X
discrete trajectories
fit
(X[, y])Estimates parameters - for compatibility with sklearn.
get_params
([deep])Get parameters for this estimator.
get_sample_conf
([conf, process])Returns the confidence interval that contains alpha % of the sample data
get_sample_mean
([process])Returns the sample means of implied timescales.
get_sample_std
([process])Returns the standard error of implied timescales.
get_timescales
([process])Returns the implied timescale estimates
load
(file_name[, model_name])Loads a previously saved PyEMMA object from disk.
save
(file_name[, model_name, overwrite, …])saves the current state of this object to given file and name.
set_params
(**params)Set the parameters of this estimator.
Attributes
Returns the estimators for all lagtimes.
Returns the fraction of frames used to compute the count matrix at each lag time.
Return the list of lag times for which timescales were computed.
Return the list of lag times for which timescales were computed.
The logger for this class instance
The model estimated by this Estimator
Returns the models for all lagtimes.
Returns number of jobs/threads to use during assignment of data.
The name of this instance
Return the number of timescales.
Return the number of timescales.
Returns the sample means of implied timescales.
Returns the standard error of implied timescales.
Returns True if samples are available and thus sample means, standard errors and confidence intervals can be obtained
Returns the implied timescale estimates
-
estimate
(X, **params)¶ - Parameters
X (lists of integer arrays) – discrete trajectories
estimator (Estimator) – Estimator to be used for estimating timescales at each lag time.
lags (array-like with integers or None, optional) – integer lag times at which the implied timescales will be calculated. If set to None (default) as list of lagtimes will be automatically generated.
nits (int, optional) – maximum number of implied timescales to be computed and stored. If less timescales are available, nits will be set to a smaller value during estimation. None means the number of timescales will be automatically determined.
n_jobs (int, optional) – how many subprocesses to start to estimate the models for each lag time.
-
estimators
¶ Returns the estimators for all lagtimes.
-
fit
(X, y=None)¶ Estimates parameters - for compatibility with sklearn.
- Parameters
X (object) – A reference to the data from which the model will be estimated
- Returns
estimator – The estimator (self) with estimated model.
- Return type
object
-
fraction_of_frames
¶ Returns the fraction of frames used to compute the count matrix at each lag time. .. rubric:: Notes
In a list of discrete trajectories with varying lengths, the estimation at longer lag times will mean discarding some trajectories for which not even one count can be computed. This function returns the fraction of frames that was actually used in computing the count matrix.
Be aware: this fraction refers to the full count matrix, and not that of the largest connected set. Hence, the output is not necessarily the active fraction. For that, use the
activte_count_fraction
function of thepyemma.msm.MaximumLikelihoodMSM
class object or for HMM respectively.
-
get_params
(deep=True)¶ Get parameters for this estimator.
- Parameters
deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
mapping of string to any
-
get_sample_conf
(conf=0.95, process=None)¶ Returns the confidence interval that contains alpha % of the sample data
etc.
- Parameters
conf (float, default = 0.95) –
the confidence interval. Use:
conf = 0.6827 for 1-sigma confidence interval
conf = 0.9545 for 2-sigma confidence interval
conf = 0.9973 for 3-sigma confidence interval
- Returns
(L,R) – lower and upper timescales bounding the confidence interval
if process is None, will return two (l x k) arrays, where l is the number of lag times and k is the number of computed timescales.
if process is an integer, will return two (l)-arrays with the selected process time scale for every lag time
- Return type
(float[],float[]) or (float[][],float[][])
-
get_sample_mean
(process=None)¶ Returns the sample means of implied timescales. Only available if underlying estimator produces samples.
- Parameters
process (int or None, default = None) – index in [0:n-1] referring to the process whose timescale will be returned. By default, process = None and all computed process timescales will be returned.
- Returns
if process is None, will return a (l x k) array, where l is the number of lag times
and k is the number of computed timescales.
if process is an integer, will return a (l) array with the selected process time scale
for every lag time
-
get_sample_std
(process=None)¶ Returns the standard error of implied timescales. Only available if underlying estimator produces samples.
- Parameters
process (int or None, default = None) – index in [0:n-1] referring to the process whose timescale will be returned. By default, process = None and all computed process timescales will be returned.
- Returns
if process is None, will return a (l x k) array, where l is the number of lag times
and k is the number of computed timescales.
if process is an integer, will return a (l) array with the selected process time scale
for every lag time
-
get_timescales
(process=None)¶ Returns the implied timescale estimates
- Parameters
process (int or None, default = None) – index in [0:n-1] referring to the process whose timescale will be returned. By default, process = None and all computed process timescales will be returned.
- Returns
if process is None, will return a (l x k) array, where l is the number of lag times
and k is the number of computed timescales.
if process is an integer, will return a (l) array with the selected process time scale
for every lag time
-
lags
¶ Return the list of lag times for which timescales were computed.
-
lagtimes
¶ Return the list of lag times for which timescales were computed.
-
classmethod
load
(file_name, model_name='default')¶ Loads a previously saved PyEMMA object from disk.
- Parameters
file_name (str or file like object (has to provide read method)) – The file like object tried to be read for a serialized object.
model_name (str, default='default') – if multiple models are contained in the file, these can be accessed by their name. Use
pyemma.list_models()
to get a representation of all stored models.
- Returns
obj
- Return type
the de-serialized object
-
logger
¶ The logger for this class instance
-
model
¶ The model estimated by this Estimator
-
models
¶ Returns the models for all lagtimes.
-
n_jobs
¶ Returns number of jobs/threads to use during assignment of data.
- Returns
If None it will return the setting of ‘PYEMMA_NJOBS’ or
’SLURM_CPUS_ON_NODE’ environment variable. If none of these environment variables exist,
the number of processors /or cores is returned.
Notes
This setting will effectively be multiplied by the the number of threads used by NumPy for algorithms which use multiple processes. So take care if you choose this manually.
-
name
¶ The name of this instance
-
nits
¶ Return the number of timescales.
-
number_of_timescales
¶ Return the number of timescales.
-
sample_mean
¶ Returns the sample means of implied timescales. Need to generate the samples first, e.g. by calling bootstrap
- Returns
timescales – mean timescales for all processes and lag times. l is the number of lag times and k is the number of computed timescales.
- Return type
ndarray((l x k), dtype=float)
-
sample_std
¶ Returns the standard error of implied timescales. Only available if underlying estimator produces samples.
- Returns
timescales – standard deviations of timescales for all processes and lag times. l is the number of lag times and k is the number of computed timescales.
- Return type
ndarray((l x k), dtype=float)
-
samples_available
¶ Returns True if samples are available and thus sample means, standard errors and confidence intervals can be obtained
-
save
(file_name, model_name='default', overwrite=False, save_streaming_chain=False)¶ saves the current state of this object to given file and name.
- Parameters
file_name (str) – path to desired output file
model_name (str, default='default') – creates a group named ‘model_name’ in the given file, which will contain all of the data. If the name already exists, and overwrite is False (default) will raise a RuntimeError.
overwrite (bool, default=False) – Should overwrite existing model names?
save_streaming_chain (boolean, default=False) – if True, the data_producer(s) of this object will also be saved in the given file.
Examples
>>> import pyemma, numpy as np >>> from pyemma.util.contexts import named_temporary_file >>> m = pyemma.msm.MSM(P=np.array([[0.1, 0.9], [0.9, 0.1]]))
>>> with named_temporary_file() as file: # doctest: +SKIP ... m.save(file, 'simple') # doctest: +SKIP ... inst_restored = pyemma.load(file, 'simple') # doctest: +SKIP >>> np.testing.assert_equal(m.P, inst_restored.P) # doctest: +SKIP
-
set_params
(**params)¶ Set the parameters of this estimator. The method works on simple estimators as well as on nested objects (such as pipelines). The former have parameters of the form
<component>__<parameter>
so that it’s possible to update each component of a nested object. :returns: :rtype: self
-
timescales
¶ Returns the implied timescale estimates
- Returns
timescales – timescales for all processes and lag times. l is the number of lag times and k is the number of computed timescales.
- Return type
ndarray((l x k), dtype=float)
-