pyemma.coordinates.data.FeatureReader
class pyemma.coordinates.data.FeatureReader(trajectories, topologyfile=None, chunksize=100, featurizer=None)
Reads features from MD data.
To select a feature, access the featurizer and call one of its feature-selecting methods (e.g. distances).
Parameters:
- trajectories (list of strings) – paths to trajectory files
- topologyfile (string) – path to topology file (e.g. pdb)
Examples
Iterator access:
>>> from pyemma.coordinates.data import FeatureReader
>>> reader = FeatureReader('mytraj.xtc', 'my_structure.pdb')
>>> chunks = []
>>> for itraj, X in reader:
...     chunks.append(X)
Extract backbone torsion angles of protein during feature reading:
>>> from pyemma.coordinates.data import FeatureReader
>>> reader = FeatureReader('mytraj.xtc', 'my_structure.pdb')
>>> reader.featurizer.add_backbone_torsions()
>>> X = reader.get_output()
__init__(trajectories, topologyfile=None, chunksize=100, featurizer=None)
Methods
- __init__(trajectories[, topologyfile, ...])
- describe() – Returns a description of this transformer
- dimension() – Returns the number of output dimensions
- get_output([dimensions, stride]) – Maps all input data of this transformer and returns it as an array or list of arrays.
- iterator([stride, lag]) – Returns an iterator that allows to access the transformed data.
- map(X)
- n_frames_total([stride]) – Returns the total number of frames, over all trajectories
- number_of_trajectories() – Returns the number of trajectories
- output_type() – By default transformers return single precision floats.
- parametrize([stride])
- trajectory_length(itraj[, stride]) – Returns the length of trajectory
- trajectory_lengths([stride]) – Returns the length of each trajectory
Attributes
- chunksize – chunksize defines how much data is being processed at once.
- data_producer – where the transformer obtains its data.
- in_memory – are results stored in memory?
chunksize
chunksize defines how much data is being processed at once.
data_producer
where the transformer obtains its data.
describe()
Returns a description of this transformer
Returns:
dimension()
Returns the number of output dimensions
Returns:
get_output(dimensions=slice(0, None, None), stride=1)
Maps all input data of this transformer and returns it as an array or list of arrays.
Parameters:
- dimensions (list-like of indexes or slice) – indices of the dimensions you would like to keep, default = all
- stride (int) – only take every n’th frame, default = 1
Returns: output – the mapped data, where T is the number of time steps of the input data or, if stride > 1, floor(T_in / stride), and d is the output dimension of this transformer. If the input consists of a list of trajectories, the output will also be a corresponding list of trajectories.
Return type: ndarray(T, d) or list of ndarray(T_i, d)
Notes
- This function may be RAM intensive if stride is too small or too many dimensions are selected.
- If the in_memory attribute is True, the results of this method are cached.
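The memory note above can be made concrete with a back-of-the-envelope estimate. The sketch below is not part of PyEMMA: estimate_output_bytes is a hypothetical helper, the trajectory lengths are made-up numbers, and it assumes 4 bytes per value (single precision) together with the floor(T_in / stride) frame count quoted in the Returns description.

```python
def estimate_output_bytes(lengths, n_dims, stride=1, bytes_per_value=4):
    """Rough size of the arrays get_output would hold (hypothetical helper).

    lengths         -- number of frames in each input trajectory
    n_dims          -- number of output dimensions that are kept
    bytes_per_value -- 4 for single precision floats (an assumption)
    """
    # Frame count per trajectory follows the floor(T_in / stride) formula
    # from the text; the library's exact rounding is not verified here.
    n_frames = sum(T // stride for T in lengths)
    return n_frames * n_dims * bytes_per_value


lengths = [100_000, 50_000]  # made-up trajectory lengths

# Keep every frame and 300 dimensions.
full = estimate_output_bytes(lengths, n_dims=300)

# Keep every 100th frame and only 3 dimensions.
thinned = estimate_output_bytes(lengths, n_dims=3, stride=100)

print(full, thinned)
```

The estimate grows linearly with both the number of frames kept and the number of dimensions selected, which is why a coarser stride or a narrower dimensions selection reduces the footprint.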
Example
plotting trajectories
>>> import pyemma.coordinates as coor
>>> import matplotlib.pyplot as plt
>>> %matplotlib inline  # only for ipython notebook
>>>
>>> tica = coor.tica()  # fill with some actual data!
>>> trajs = tica.get_output(dimensions=(0,), stride=100)
>>> for traj in trajs:
...     plt.figure()
...     plt.plot(traj[:, 0])
in_memory
are results stored in memory?
iterator(stride=1, lag=0)
Returns an iterator that allows access to the transformed data.
Parameters:
- stride (int) – Only transform every N’th frame, default = 1
- lag (int) – Configure the iterator such that it returns time-lagged data with a lag time of lag. If lag is used together with stride, striding is applied first and the strided trajectory is then shifted by lag steps, so the effective lag time is stride*lag.
Returns: iterator –
- If lag = 0, a call to the .next() method of this iterator returns the pair (itraj, X) : (int, ndarray(n, m)), where itraj is the input sequence number (e.g. trajectory index) and X is the transformed data; n = chunksize, or n < chunksize at the end of the input.
- If lag > 0, a call to the .next() method of this iterator returns the tuple (itraj, X, Y) : (int, ndarray(n, m), ndarray(p, m)), where itraj and X are the same as above and Y contains the time-lagged data.
Return type: a pyemma.coordinates.transform.TransformerIterator transformer iterator
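The chunking and lag semantics described for iterator can be illustrated with a small pure-Python stand-in. This is a sketch under assumptions, not the library's implementation: iter_chunks is a hypothetical generator, trajectories are plain lists whose entries are the original frame indices, and the way Y is cut (same chunk window shifted by lag, possibly shorter than X) is one plausible reading of the (n, m) / (p, m) shapes.

```python
def iter_chunks(trajs, chunksize=100, stride=1, lag=0):
    """Hypothetical stand-in for the documented iterator (not PyEMMA code)."""
    for itraj, traj in enumerate(trajs):
        strided = traj[::stride]  # striding is applied first
        for start in range(0, len(strided), chunksize):
            X = strided[start:start + chunksize]
            if lag == 0:
                yield itraj, X
            else:
                # Assumption: Y is the same window shifted by `lag` strided
                # steps; it may be shorter than X near the end of the input.
                Y = strided[start + lag:start + lag + chunksize]
                yield itraj, X, Y


# Two toy "trajectories"; each frame is labelled by its original index.
trajs = [list(range(10)), list(range(7))]

# lag = 0: pairs (itraj, X), the last chunk of each trajectory may be short.
chunks = list(iter_chunks(trajs, chunksize=4))
chunk_sizes = [(itraj, len(X)) for itraj, X in chunks]

# lag > 0 together with stride: triples (itraj, X, Y).
lagged = list(iter_chunks(trajs, chunksize=4, stride=2, lag=1))
itraj, X, Y = lagged[0]
offsets = {y - x for x, y in zip(X, Y)}  # distance in original frames

print(chunk_sizes)
print(X, Y, offsets)
```

Because the frames are labelled with their original indices, the offset between paired entries of X and Y shows the effective lag directly: with stride=2 and lag=1 every pair is 2 original frames apart, i.e. stride*lag.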
n_frames_total(stride=1)
Returns the total number of frames, over all trajectories
Parameters: stride – return value is the number of frames in trajectories when running through them with a step size of stride
Returns: the total number of frames, over all trajectories
number_of_trajectories()
Returns the number of trajectories
Returns: number of trajectories
output_type()
By default transformers return single precision floats.
trajectory_length(itraj, stride=1)
Returns the length of trajectory
Parameters:
- itraj – trajectory index
- stride – return value is the number of frames in trajectory when running through it with a step size of stride
Returns: length of trajectory
trajectory_lengths(stride=1)
Returns the length of each trajectory
Parameters: stride – return value is the number of frames in trajectories when running through them with a step size of stride
Returns: list containing length of each trajectory