Coordinates package (pyemma.coordinates)

The coordinates package contains tools to select features from MD trajectories and to assign them to a discrete state space, which is later used in Markov modeling.

It supports reading from MD trajectories, comma-separated value (CSV) ASCII files, and NumPy arrays. The discretized trajectories are stored as NumPy arrays of integers.

User API

Data handling and IO

featurizer(topfile) Featurizer to select features from MD data.
load(trajfiles[, features, top, stride, ...]) Loads coordinate features into memory.
source(inp[, features, top, chunk_size]) Wraps input as data source for pipeline.
pipeline(stages[, run, stride, chunksize]) Data analysis pipeline.
discretizer(reader[, transform, cluster, ...]) Specialized pipeline: From trajectories to clustering.
save_traj(traj_inp, indexes, outfile[, ...]) Saves a sequence of frames as a single trajectory.
save_trajs(traj_inp, indexes[, prefix, fmt, ...]) Saves sequences of frames as multiple trajectories.

Transformations

pca([data, dim, stride]) Principal Component Analysis (PCA).
tica([data, lag, dim, stride, ...]) Time-lagged independent component analysis (TICA).
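To make the idea behind a PCA-style transformation concrete, here is a minimal standard-library sketch restricted to two-dimensional data. It is not PyEMMA's implementation, and the function name pca_2d is hypothetical. It centers the data, builds the 2x2 sample covariance matrix, and uses the closed-form solution for the principal axis of a symmetric 2x2 matrix.

```python
import math

def pca_2d(points):
    """Illustrative PCA for 2-D points (hypothetical helper, not part of PyEMMA).

    Returns the unit vector of the first principal axis, the variances along
    both principal axes, and the projection of each centered point onto the
    first axis.
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the sample covariance matrix (normalized by n - 1).
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Closed-form orientation of the major axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    axis = (math.cos(theta), math.sin(theta))

    # Closed-form eigenvalues: half the trace plus or minus a radius term.
    half_trace = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    variances = (half_trace + radius, half_trace - radius)

    projection = [x * axis[0] + y * axis[1] for x, y in centered]
    return axis, variances, projection

# Points lying exactly on the line y = x: all variance is along one axis.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
axis, variances, projection = pca_2d(points)
```

In this example the first principal axis points along the diagonal and the variance along the second axis vanishes, so keeping only the one-dimensional projection loses no information.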

Clustering Algorithms

cluster_kmeans([data, k, max_iter, stride, ...]) k-means clustering.
cluster_regspace([data, dmin, max_centers, ...]) Regular space clustering.
cluster_uniform_time([data, k, stride, metric]) Uniform time clustering.
assign_to_centers([data, centers, stride, ...]) Assigns data to the nearest cluster centers.
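As an illustration of how clustering turns continuous coordinates into a discrete trajectory, the following standard-library sketch combines a simple regular space clustering with nearest-center assignment. It is a sketch under stated assumptions, not PyEMMA's code. The names regspace_centers and assign_to_nearest are hypothetical. The rule that a frame becomes a new center when it lies farther than dmin from every existing center, and the choice to stop adding centers once max_centers is reached, are assumptions of this example.

```python
import math

def regspace_centers(frames, dmin, max_centers=1000):
    """Pick centers so that any two centers are more than dmin apart.

    Hypothetical helper: frames are visited in order, and a frame becomes a
    new center if it is farther than dmin from all centers chosen so far.
    """
    centers = []
    for frame in frames:
        if len(centers) >= max_centers:
            break  # assumption of this sketch: silently stop at the limit
        if all(math.dist(frame, c) > dmin for c in centers):
            centers.append(tuple(frame))
    return centers

def assign_to_nearest(frames, centers):
    """Map every frame to the index of its nearest center (Euclidean)."""
    dtraj = []
    for frame in frames:
        nearest = min(range(len(centers)),
                      key=lambda i: math.dist(frame, centers[i]))
        dtraj.append(nearest)
    return dtraj

# A toy 2-D "trajectory" visiting three well-separated regions.
traj = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0),
        (1.1, 0.1), (0.05, 0.05), (2.0, 0.0)]
centers = regspace_centers(traj, dmin=0.5)
dtraj = assign_to_nearest(traj, centers)
print(centers)  # [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(dtraj)    # [0, 0, 1, 1, 0, 2]
```

The resulting list of integers plays the role of a discretized trajectory: each entry is the index of the discrete state occupied at that frame.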

Classes

These classes encapsulate the more complex functionality of the coordinates package. You do not need to construct them yourself, because the user API functions above do that for you. See their documentation for how to extract features from them.

pipelines.Pipeline(chain[, chunksize, ...]) Data processing pipeline.
transform.PCA(output_dimension) Principal component analysis.
transform.TICA(lag, output_dimension[, ...]) Time-lagged independent component analysis (TICA).