pyemma.coordinates.save_trajs

pyemma.coordinates.save_trajs(traj_inp, indexes, prefix='set_', fmt=None, outfiles=None, inmemory=False, stride=1, verbose=False)

Saves sequences of frames as multiple trajectories.

Extracts a number of specified sequences of time/trajectory indexes from the input loader and saves them in a set of molecular dynamics trajectories. The output filenames are built as prefix + str(n) + '.' + fmt, where n counts the output trajectories and the extension fmt is either set by the user or determined from the input. Example: when the input is in dcd format and indexes is a list of length 3, the output will by default go to the files "set_1.dcd", "set_2.dcd", "set_3.dcd". If you want the files to be stored in a specific subfolder, specify the path as part of the prefix, e.g. prefix='~/macrostates/pcca_'
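The naming rule described above can be sketched in a few lines of plain Python. The helper expected_outfiles and its first argument are hypothetical and not part of PyEMMA; the numbering start is taken from the example above, and the exact numbering or padding may differ between versions, so treat the list returned by save_trajs as authoritative.

```python
def expected_outfiles(prefix, n_outputs, fmt, first=1):
    """Hypothetical helper: build names as prefix + str(n) + '.' + fmt.

    `first` is the number given to the first output file; 1 mirrors the
    example in the text and is an assumption, not a guarantee.
    """
    return [f"{prefix}{n}.{fmt}" for n in range(first, first + n_outputs)]


# Three index arrays, dcd input, default prefix
names = expected_outfiles("set_", 3, "dcd")
print(names)

# A prefix that carries a subfolder
print(expected_outfiles("macrostates/pcca_", 2, "xtc"))
```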

Parameters
  • traj_inp (pyemma.coordinates.data.feature_reader.FeatureReader) – A data source. Please use pyemma.coordinates.source() to construct it.

  • indexes (list of ndarray(T_i, 2)) – A list of N arrays, each of size (T_i x 2), for writing N trajectories of T_i time steps. Each row contains two indexes (i, t), where i is the index of the trajectory from the input and t is the index of the time step within the trajectory.

  • prefix (str, optional, default = set_) – output filename prefix. Can include an absolute or relative path name.

  • fmt (str, optional, default = None) – Output file format. By default, the file extension and format are determined from the input. If a different format is desired, specify the corresponding file extension here without a dot, e.g. “dcd” or “xtc”.

  • outfiles (list of str, optional, default = None) – A list of output filenames. When given, this will override the settings of prefix and fmt, and output will be written to these files.

  • inmemory (Boolean, default = False) – Instead of internally calling save_traj for every (T_i, 2) array in “indexes”, only one call is made. Internally, this generates a potentially large molecular trajectory object in memory that is subsequently sliced into the files of “outfiles”. This should be faster for large “indexes” arrays and large files, though it is quite memory intensive, and it is untested for large files. The case it is meant to optimize is avoiding streaming twice through a huge file for “indexes” of the type: indexes = [[1 4000000],[1 4000001]]

  • stride (integer, default is 1) – This parameter informs save_trajs() about the stride used in the indexes variable. Typically, the variable indexes contains frame indexes that match exactly the frames of the files contained in traj_inp.trajfiles. In certain situations that might not be the case, for example when a stride value != 1 was used when reading/featurizing/transforming/discretizing the files contained in traj_inp.trajfiles.

  • verbose (boolean, default is False) – Verbose output while looking for “indexes” in the “traj_inp.trajfiles”
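As an illustration of the indexes and stride parameters, the following minimal sketch builds a list of (T_i, 2) arrays with NumPy and shows how strided frame indexes relate to frames of the original files. The helper to_original_frames is hypothetical and rests on the assumption that frame t of data read with a given stride corresponds to frame t * stride in the file; when the stride argument is passed to save_trajs, this conversion should not be applied by hand as well.

```python
import numpy as np

# Output 0: five consecutive frames (10..14) of input trajectory 0, so T_0 = 5
seq_a = np.array([[0, t] for t in range(10, 15)])

# Output 1: frames picked from input trajectories 1 and 2, so T_1 = 3
seq_b = np.array([[1, 0], [2, 7], [1, 3]])

# N = 2 arrays, each of shape (T_i, 2) with rows (trajectory index, frame index)
indexes = [seq_a, seq_b]


def to_original_frames(index_array, stride):
    """Hypothetical helper: map rows (i, t) found on strided data to (i, t * stride).

    Assumes strided frame t is frame t * stride of the original file.
    Returns a new array and leaves the input untouched.
    """
    out = np.array(index_array, dtype=int, copy=True)
    out[:, 1] *= stride
    return out


# Indexes found on data that was read with stride=10
print(to_original_frames(seq_b, 10))
```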

Returns

outfiles – The list of absolute paths that the output files have been written to.

Return type

list of str
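A possible end-to-end call is sketched below. The wrapper export_sets and the helper frames_per_outfile are hypothetical names introduced for illustration, and the file names in the usage comment are placeholders. The sketch assumes that the returned paths are in the same order as the arrays in indexes, which is not stated explicitly above.

```python
def export_sets(traj_files, top_file, indexes, prefix="set_", fmt=None, stride=1):
    """Hypothetical wrapper: write one trajectory per index array.

    Returns the list of paths reported by save_trajs.
    """
    # Imported lazily so this sketch can be loaded without PyEMMA installed.
    import pyemma.coordinates as coor

    reader = coor.source(traj_files, top=top_file)
    return coor.save_trajs(reader, indexes, prefix=prefix, fmt=fmt, stride=stride)


def frames_per_outfile(indexes, outfiles):
    """Hypothetical helper: pair each written path with its requested frame count.

    Assumes outfiles[k] was written from indexes[k].
    """
    if len(indexes) != len(outfiles):
        raise ValueError("expected one output file per index array")
    return {path: len(idx) for path, idx in zip(outfiles, indexes)}


# Usage (placeholder file names, requires PyEMMA and real trajectory files):
# outfiles = export_sets(["run1.xtc", "run2.xtc"], "system.pdb", indexes, fmt="xtc")
# print(frames_per_outfile(indexes, outfiles))
```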