Coordinates package (pyemma.coordinates)

The coordinates package contains tools to select features from MD trajectories and to assign them to a discrete state space, which is later used in Markov modeling.

It supports reading MD trajectories, comma-separated value (CSV) ASCII files, and NumPy arrays. The discretized trajectories are stored as NumPy arrays of integers.

User API

Trajectory input/output and featurization

featurizer(topfile)

Featurizer to select features from MD data.

load(trajfiles[, features, top, stride, …])

Loads coordinate features into memory.

source(inp[, features, top, chunksize])

Defines a trajectory data source.

combine_sources(sources[, chunksize])

Combines multiple data sources to stream from.

pipeline(stages[, run, stride, chunksize])

Data analysis pipeline.

discretizer(reader[, transform, cluster, …])

Specialized pipeline: From trajectories to clustering.

save_traj(traj_inp, indexes, outfile[, top, …])

Saves a sequence of frames as a single trajectory.

save_trajs(traj_inp, indexes[, prefix, fmt, …])

Saves sequences of frames as multiple trajectories.
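
Several of the functions above take chunksize and stride arguments because data sources are streamed rather than held in memory at once. The following is a minimal NumPy sketch of that idea, not PyEMMA's implementation: the helper names iter_chunks and streaming_mean are hypothetical, and the input is assumed to be a list of in-memory arrays standing in for trajectory files.

```python
import numpy as np

def iter_chunks(trajs, chunksize, stride=1):
    """Yield (traj_index, chunk) pairs; each chunk has at most `chunksize` frames.

    Hypothetical helper: `trajs` is a list of (T_i, d) arrays.
    """
    for i, traj in enumerate(trajs):
        strided = traj[::stride]  # keep every `stride`-th frame
        for start in range(0, len(strided), chunksize):
            yield i, strided[start:start + chunksize]

def streaming_mean(trajs, chunksize, stride=1):
    """Compute the per-feature mean while touching only one chunk at a time."""
    total, count = 0.0, 0
    for _, chunk in iter_chunks(trajs, chunksize, stride):
        total = total + chunk.sum(axis=0)
        count += len(chunk)
    return total / count

rng = np.random.default_rng(1)
trajs = [rng.normal(size=(250, 4)), rng.normal(size=(100, 4))]
mean = streaming_mean(trajs, chunksize=64)
```

The result should match the mean of the concatenated data, while peak memory use depends on chunksize rather than on the total number of frames.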

Covariance estimation

covariance_lagged([data, c00, c0t, ctt, …])

Compute lagged covariances between time series.
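
As an illustration of what the c00, c0t and ctt arguments refer to, here is a minimal NumPy sketch that computes instantaneous and time-lagged covariance matrices for a single time series. It is not the library's estimator: the function name lagged_covariances_sketch is hypothetical, each block of the time-shifted pair is centered by its own mean, and the matrices are normalized by the number of pairs.

```python
import numpy as np

def lagged_covariances_sketch(X, lag):
    """Return (c00, c0t, ctt) for one time series X of shape (T, d)."""
    X = np.asarray(X, dtype=float)
    if not 0 < lag < len(X):
        raise ValueError("lag must satisfy 0 < lag < number of frames")
    X0, Xt = X[:-lag], X[lag:]      # time-shifted pairs (x_t, x_{t+lag})
    X0 = X0 - X0.mean(axis=0)       # center each block by its own mean
    Xt = Xt - Xt.mean(axis=0)
    n = len(X0)                     # number of pairs (assumed normalization)
    c00 = X0.T @ X0 / n             # instantaneous covariance at time t
    c0t = X0.T @ Xt / n             # time-lagged covariance
    ctt = Xt.T @ Xt / n             # instantaneous covariance at time t + lag
    return c00, c0t, ctt

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
c00, c0t, ctt = lagged_covariances_sketch(X, lag=10)
```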

Coordinate and feature transformations

pca([data, dim, var_cutoff, stride, mean, …])

Principal Component Analysis (PCA).

tica([data, lag, dim, var_cutoff, …])

Time-lagged independent component analysis (TICA).

vamp([data, lag, dim, scaling, right, …])

Variational approach for Markov processes (VAMP) [1].
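
To make the difference between PCA and TICA concrete, the sketch below implements both with plain NumPy on a toy signal that has one slow, low-variance coordinate and one fast, high-variance coordinate. It is a simplified illustration under stated assumptions, not the library's code: the names pca_sketch and tica_sketch are hypothetical, the TICA variant symmetrizes the covariance estimates, and the instantaneous covariance is assumed to have full rank.

```python
import numpy as np

def pca_sketch(X, dim):
    """Project onto the `dim` directions of largest variance."""
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    cov = Xc.T @ Xc / len(Xc)
    evals, evecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:dim]
    return Xc @ evecs[:, order], evals[order]

def tica_sketch(X, lag, dim):
    """Project onto the `dim` directions of largest autocorrelation at `lag`."""
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    X0, Xt = Xc[:-lag], Xc[lag:]
    n = len(X0)
    c0 = (X0.T @ X0 + Xt.T @ Xt) / (2 * n)      # symmetrized instantaneous covariance
    ct = (X0.T @ Xt + Xt.T @ X0) / (2 * n)      # symmetrized time-lagged covariance
    s, U = np.linalg.eigh(c0)
    W = U / np.sqrt(s)                          # whitening, W.T @ c0 @ W = I (full rank assumed)
    lam, V = np.linalg.eigh(W.T @ ct @ W)
    order = np.argsort(lam)[::-1][:dim]
    return Xc @ (W @ V[:, order]), lam[order]

# Toy data: a slow AR(1) process with small variance and fast white noise with large variance.
rng = np.random.default_rng(0)
T = 5000
slow = np.zeros(T)
for t in range(1, T):
    slow[t] = 0.99 * slow[t - 1] + 0.1 * rng.normal()
fast = 3.0 * rng.normal(size=T)
X = np.column_stack([slow, fast])

Y_pca, variances = pca_sketch(X, dim=1)
Y_tica, lam = tica_sketch(X, lag=10, dim=1)
```

On this toy data the first principal component follows the fast, high-variance coordinate, whereas the first independent component follows the slow coordinate, which is the kind of coordinate usually of interest for Markov modeling.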

Clustering Algorithms

cluster_kmeans([data, k, max_iter, …])

k-means clustering

cluster_mini_batch_kmeans([data, k, …])

k-means clustering with mini-batch strategy

cluster_regspace([data, dmin, max_centers, …])

Regular space clustering

cluster_uniform_time([data, k, stride, …])

Uniform time clustering

assign_to_centers([data, centers, stride, …])

Assigns data to the nearest cluster centers
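
The following NumPy sketch shows the idea behind regular space clustering followed by assignment to the nearest center, which yields a discrete trajectory as an integer array. It is an illustration only and not the library's implementation: the function names regspace_centers_sketch and assign_to_nearest_sketch are hypothetical, Euclidean distance is assumed, and a frame becomes a new center when it is farther than dmin from every existing center.

```python
import numpy as np

def regspace_centers_sketch(X, dmin, max_centers=1000):
    """Pick centers so that all pairs of centers are more than `dmin` apart."""
    X = np.asarray(X, dtype=float)
    centers = [X[0]]                            # the first frame is always a center
    for x in X[1:]:
        d = np.linalg.norm(np.asarray(centers) - x, axis=1)
        if d.min() > dmin:                      # far from every existing center
            if len(centers) >= max_centers:
                raise RuntimeError("max_centers exceeded; consider a larger dmin")
            centers.append(x)
    return np.asarray(centers)

def assign_to_nearest_sketch(X, centers):
    """Return the index of the nearest center for each frame."""
    X = np.asarray(X, dtype=float)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 2))
centers = regspace_centers_sketch(X, dmin=1.0)
dtraj = assign_to_nearest_sketch(X, centers)    # discrete trajectory of integers
```

By construction, every frame lies within dmin of its assigned center, so dmin directly controls the spatial resolution of the discretization.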

Classes

Coordinate classes encapsulate complex functionality. You don’t need to construct these classes yourself, as the user API functions above do this for you. The documentation below describes how to extract features from them.

I/O and Featurization

data.MDFeaturizer(topfile, **kwargs)

Extracts features from MD trajectories.

data.CustomFeature(fun, dim[, description, …])

A CustomFeature is the base class for user-defined features.

Transformation estimators

transform.PCA([dim, var_cutoff, mean, …])

Principal component analysis.

transform.TICA(lag[, dim, var_cutoff, …])

Time-lagged independent component analysis (TICA)

transform.VAMP(lag[, dim, scaling, right, …])

Variational approach for Markov processes (VAMP)

Covariance estimation

estimation.covariance.LaggedCovariance([…])

Clustering algorithms

clustering.KmeansClustering(n_clusters[, …])

k-means clustering

clustering.MiniBatchKmeansClustering(n_clusters)

Mini-batch k-means clustering

clustering.RegularSpaceClustering(dmin[, …])

Regular space clustering

clustering.UniformTimeClustering([…])

Uniform time clustering

Transformers

data._base.transformer.StreamingTransformer([…])

Base class for pipelined Transformers.

pipelines.Pipeline(chain[, chunksize, …])

Data processing pipeline.

Discretization

clustering.AssignCenters(clustercenters[, …])

Assigns given (pre-calculated) cluster centers.