pyemma.coordinates.data.CustomFeature

class pyemma.coordinates.data.CustomFeature(fun, dim, description=None, fun_args=(), fun_kwargs=None)

A CustomFeature is the base class for user-defined features. To implement a new feature, derive from this class, compute the quantity of interest in the transform method, and return it as an ndarray.

If you have already defined a mapping function that should be called on each trajectory, you do not need to derive a class; you can simply pass the function to the constructor of this class.

Parameters:
  • fun (function) – will be invoked with the given fun_args and fun_kwargs when a trajectory is mapped
  • dim (int) – output dimension of the feature
  • fun_args (sequence, optional) – positional arguments passed to fun
  • fun_kwargs (dict, optional) – named arguments passed to fun

Notes

The function you pass in receives an mdtraj.Trajectory object as its first argument.
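
The calling convention can be tried out without PyEMMA installed. The sketch below assumes the function is invoked as fun(traj, *fun_args, **fun_kwargs), which is how the parameter descriptions read; scaled_coordinates, call_feature_function and fake_traj are hypothetical names used only for illustration. The stand-in object mimics just the xyz attribute of an mdtraj.Trajectory, an array of shape (n_frames, n_atoms, 3).

```python
from types import SimpleNamespace

import numpy as np


def scaled_coordinates(traj, factor, power=1):
    """Hypothetical feature function: raise the coordinates to a power, scale, flatten per frame."""
    n_frames = traj.xyz.shape[0]
    return (factor * traj.xyz ** power).reshape(n_frames, -1)


def call_feature_function(fun, traj, fun_args=(), fun_kwargs=None):
    """Assumed calling convention: trajectory first, then the extra arguments."""
    return fun(traj, *fun_args, **(fun_kwargs or {}))


# Stand-in for an mdtraj.Trajectory with 4 frames and 2 atoms.
fake_traj = SimpleNamespace(xyz=np.arange(24, dtype=float).reshape(4, 2, 3))

features = call_feature_function(
    scaled_coordinates, fake_traj, fun_args=(2.0,), fun_kwargs={"power": 2}
)
print(features.shape)  # one row per frame, 3 * 2 values per row
```

With the real class, the corresponding construction would presumably be CustomFeature(scaled_coordinates, dim=6, fun_args=(2.0,), fun_kwargs={'power': 2}), following the constructor signature shown at the top of this page.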

Examples

We define a feature that transforms all coordinates by \(x^2\):

>>> from pyemma.coordinates import source
>>> from pyemma.datasets import get_bpti_test_data
>>> inp = get_bpti_test_data()

Define a function which transforms the coordinates of the trajectory object. Note that you need to define the output dimension, which we pass directly in the feature construction. The trajectory contains 58 atoms, so the output dimension will be 3 * 58 = 174:

>>> from pyemma.coordinates.data import CustomFeature
>>> my_feature = CustomFeature(lambda x: (x.xyz**2).reshape(-1, 174), dim=174)
>>> reader = source(inp['trajs'][0], top=inp['top'])

Pass the feature to the featurizer and transform the data:

>>> reader.featurizer.add_custom_feature(my_feature)
>>> data = reader.get_output()
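
The value given as dim has to match the second axis of the array the function returns. The following NumPy-only sketch checks the reshape arithmetic from the example; it uses a random array in place of the real BPTI coordinates, and the frame count of 10 is an arbitrary assumption.

```python
import numpy as np

n_frames, n_atoms = 10, 58  # 58 atoms as in the example; the frame count is arbitrary

# Stand-in for traj.xyz, which has shape (n_frames, n_atoms, 3).
rng = np.random.default_rng(0)
xyz = rng.normal(size=(n_frames, n_atoms, 3))

# Same transformation as the lambda in the example above.
squared = (xyz ** 2).reshape(-1, 3 * n_atoms)
print(squared.shape)  # (10, 174)
```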

__init__(fun, dim, description=None, fun_args=(), fun_kwargs=None)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(fun, dim[, description, fun_args, …]) Initialize self.
describe()
load(file_name[, model_name]) Loads a previously saved PyEMMA object from disk.
save(file_name[, model_name, overwrite, …]) Saves the current state of this object to the given file and name.
transform(traj)

Attributes

dimension
top

classmethod load(file_name, model_name='default')

Loads a previously saved PyEMMA object from disk.

Parameters:
  • file_name (str or file-like object (has to provide a read method)) – The file or file-like object from which the serialized object is read.
  • model_name (str, default='default') – if multiple models are contained in the file, these can be accessed by their name. Use pyemma.list_models() to get a representation of all stored models.
Returns:
  • obj – the de-serialized object

save(file_name, model_name='default', overwrite=False, save_streaming_chain=False)

Saves the current state of this object to the given file and name.

Parameters:
  • file_name (str) – path to desired output file
  • model_name (str, default='default') – creates a group named ‘model_name’ in the given file, which will contain all of the data. If the name already exists and overwrite is False (the default), a RuntimeError is raised.
  • overwrite (bool, default=False) – whether to overwrite an existing model with the same name.
  • save_streaming_chain (boolean, default=False) – if True, the data_producer(s) of this object will also be saved in the given file.

Examples

>>> import pyemma, numpy as np
>>> from pyemma.util.contexts import named_temporary_file
>>> m = pyemma.msm.MSM(P=np.array([[0.1, 0.9], [0.9, 0.1]]))
>>> with named_temporary_file() as file: # doctest: +SKIP
...    m.save(file, 'simple') # doctest: +SKIP
...    inst_restored = pyemma.load(file, 'simple') # doctest: +SKIP
>>> np.testing.assert_equal(m.P, inst_restored.P) # doctest: +SKIP