Model serialization

Most Estimators and Models in PyEMMA are serializable, meaning they can be saved to disk in an efficient file format. Every PyEMMA object that can be saved to disk provides a save method:

SerializableMixIn.save(file_name, model_name='default', overwrite=False, save_streaming_chain=False)

Saves the current state of this object to the given file under the given name.

Parameters:
  • file_name (str) – path to desired output file
  • model_name (str, default='default') – creates a group named ‘model_name’ in the given file, which will contain all of the data. If the name already exists and overwrite is False (the default), a RuntimeError is raised.
  • overwrite (bool, default=False) – whether to overwrite an existing model of the same name.
  • save_streaming_chain (bool, default=False) – if True, the data_producer(s) of this object will also be saved in the given file.

Examples

>>> import pyemma, numpy as np
>>> from pyemma.util.contexts import named_temporary_file
>>> m = pyemma.msm.MSM(P=np.array([[0.1, 0.9], [0.9, 0.1]]))
>>> with named_temporary_file() as file: # doctest: +SKIP
...    m.save(file, 'simple') # doctest: +SKIP
...    inst_restored = pyemma.load(file, 'simple') # doctest: +SKIP
>>> np.testing.assert_equal(m.P, inst_restored.P) # doctest: +SKIP
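
The overwrite behaviour described in the parameter list can be wrapped in a small helper. The following is a minimal sketch and not part of PyEMMA: save_if_absent is a hypothetical name, and the helper relies only on the documented behaviour that save raises a RuntimeError when the model name already exists and overwrite is False.

```python
def save_if_absent(obj, file_name, model_name='default'):
    """Save obj under model_name unless that name is already taken.

    Hypothetical helper (not part of PyEMMA). Returns True if the object
    was written and False if saving was refused.
    """
    try:
        obj.save(file_name, model_name=model_name, overwrite=False)
    except RuntimeError:
        # Documented behaviour: an existing name with overwrite=False raises
        # RuntimeError. Caveat: a RuntimeError raised for any other reason
        # is treated the same way by this sketch.
        return False
    return True
```

With an MSM m as in the example above, the first call for a given file and name would be expected to return True and a repeated call False. To replace a stored model deliberately, call save with overwrite=True instead.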

Use the load function to load a previously saved PyEMMA object. Since a file can contain multiple objects saved under different names, you can inspect a file with the pyemma.list_models() function to obtain the names that were used. There is also a command line utility, pyemma_list_models, to inspect these files quickly without having to write your own Python script.

pyemma.load(filename, model_name='default')

Loads a previously saved PyEMMA object from disk.

Parameters:
  • filename (str or file-like object providing a read method) – the file, or file-like object, from which the serialized object is read.
  • model_name (str, default='default') – if multiple models are contained in the file, these can be accessed by their name. Use pyemma.list_models() to get a representation of all stored models.
Returns: obj – the de-serialized object
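
Because one file can hold several named models, it can be convenient to load a chosen set of them in one go. The sketch below is an illustration under stated assumptions: load_named and its loader parameter are hypothetical and not part of PyEMMA. The loader defaults to pyemma.load and is exposed as a parameter only so that the function can be tried without PyEMMA installed.

```python
def load_named(file_name, model_names, loader=None):
    """Load several named models from one file into a dict.

    Hypothetical helper (not part of PyEMMA). `loader` must accept
    (file_name, model_name=...) like pyemma.load does.
    """
    if loader is None:
        import pyemma  # imported lazily; requires PyEMMA to be installed
        loader = pyemma.load
    # One load call per requested name; keys are the model names.
    return {name: loader(file_name, model_name=name) for name in model_names}
```

The names passed in would typically come from pyemma.list_models().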

pyemma.list_models(filename)

Lists all models in given filename.

Parameters:filename (str) – path to the file in which the models have been stored.
Returns:obj – A mapping from model name to a comprehensive description, like this: {model_name: {‘repr’: ‘string representation’, ‘created’: ‘human readable date’, …}}
Return type:dict
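
The returned mapping can be turned into a short overview, one line per stored model. This is a minimal sketch: describe_models is a hypothetical helper, and it assumes only the ‘repr’ and ‘created’ keys named in the description above. Any other keys the mapping may contain are ignored, and missing keys are shown as '?'.

```python
def describe_models(models):
    """Return one 'name: repr (created ...)' line per model, sorted by name.

    `models` is a mapping shaped like the return value of
    pyemma.list_models(). Hypothetical helper, not part of PyEMMA.
    """
    lines = []
    for name in sorted(models):
        info = models[name]
        lines.append('{}: {} (created {})'.format(
            name, info.get('repr', '?'), info.get('created', '?')))
    return lines
```

A typical call would pass the result of pyemma.list_models(filename) straight into describe_models and print the joined lines.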

Notes

We try our best to provide future compatibility for previously saved data. This means it should always be possible to load data with a newer version of the software. The reverse is not supported, however: a model saved by a newer version of PyEMMA cannot be loaded with an older version.