Runtime Configuration

You can change some of PyEMMA's runtime behaviour by setting configuration values in PyEMMA's config module. These values can be saved to disk so that they are applied every time the package is imported.

Examples

Change values

To access the config at runtime, e.g. to check whether progress bars are shown:

>>> from pyemma import config # doctest: +SKIP
>>> print(config.show_progress_bars) # doctest: +SKIP
True
>>> config.show_progress_bars = False # doctest: +SKIP
>>> print(config.show_progress_bars) # doctest: +SKIP
False
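If a setting should only change for part of a script, the assignment shown above can be wrapped so that the previous value is restored afterwards. The following is a minimal sketch: `temporary_setting` is a hypothetical helper and not part of PyEMMA, and it only assumes that the config object supports plain attribute reads and writes as in the example above. A `SimpleNamespace` stands in for `pyemma.config` so that the snippet runs without PyEMMA installed.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_setting(cfg, name, value):
    """Set cfg.<name> to value inside the with-block, then restore it."""
    previous = getattr(cfg, name)
    setattr(cfg, name, value)
    try:
        yield cfg
    finally:
        # Restore the old value even if the block raised an exception.
        setattr(cfg, name, previous)


# Stand-in for pyemma.config (assumption: same attribute-style access).
config = SimpleNamespace(show_progress_bars=True)

with temporary_setting(config, "show_progress_bars", False):
    inside = config.show_progress_bars  # False while the block runs

after = config.show_progress_bars  # True again afterwards
```

With PyEMMA installed, the same helper could presumably be called as `temporary_setting(pyemma.config, "show_progress_bars", False)`.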

Store your changes / Create a configuration directory

To create an editable configuration file, use the pyemma.config.save() method:

>>> from pyemma import config # doctest: +SKIP
>>> config.save('/tmp/pyemma_current.cfg') # doctest: +SKIP

This stores the current runtime configuration values in the given file. Note that these settings will not be used the next time PyEMMA starts unless you tell PyEMMA where the file is stored. To do so, set the environment variable “PYEMMA_CFG_DIR” to the directory that contains the config file.

  • For Linux/OSX this thread may be helpful.

  • For Windows have a look at this.
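The variable can also be set from within Python, as long as this happens before PyEMMA is imported. The sketch below assumes that PyEMMA reads `PYEMMA_CFG_DIR` when it is imported, which is what "the next start" above suggests but is not confirmed here. The directory name `pyemma_cfg` is a made-up example.

```python
import os
import tempfile

# Hypothetical directory that holds the saved configuration file.
cfg_dir = os.path.join(tempfile.gettempdir(), "pyemma_cfg")
os.makedirs(cfg_dir, exist_ok=True)

# Assumption: the variable has to be in the environment before the import.
os.environ["PYEMMA_CFG_DIR"] = cfg_dir

# import pyemma  # import only after the variable has been set
```

Setting `os.environ` only affects the current process and its child processes. For a permanent setting, define the variable in your shell profile or system settings instead.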

For details have a look at the brief documentation: https://docs.python.org/2/howto/logging.html

Default configuration file

Default settings are stored in a provided pyemma.cfg file, which is included in the Python package.

Configuration files

To configure the runtime behavior such as the logging system or other parameters, the configuration module reads several config files to build its final set of settings. It searches for the file ‘pyemma.cfg’ in several locations with different priorities:

  1. $CWD/pyemma.cfg

  2. $HOME/.pyemma/pyemma.cfg

  3. ~/pyemma.cfg

  4. $PYTHONPATH/pyemma/pyemma.cfg (always taken as default configuration file)

Note that you can also override the location of the configuration directory by setting the environment variable “PYEMMA_CFG_DIR” to a writeable path.

The default values are stored in the last file of the list above (the one shipped with the package) to ensure that they are always defined.

If no configuration file could be found, the defaults from the shipped package will apply.
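To make the search order concrete, the four locations listed above can be written down as a list of candidate paths. This is only an illustrative sketch and not PyEMMA's actual implementation: the function names and the `package_dir` argument are hypothetical, and how PyEMMA merges values from several files is not modelled.

```python
import os


def candidate_config_files(package_dir, name="pyemma.cfg"):
    """Return the documented search locations in the order listed above."""
    home = os.path.expanduser("~")
    return [
        os.path.join(os.getcwd(), name),      # 1. $CWD/pyemma.cfg
        os.path.join(home, ".pyemma", name),  # 2. $HOME/.pyemma/pyemma.cfg
        os.path.join(home, name),             # 3. ~/pyemma.cfg
        os.path.join(package_dir, name),      # 4. default file shipped with the package
    ]


def existing_config_files(package_dir):
    """Keep only those candidates that actually exist on disk."""
    return [p for p in candidate_config_files(package_dir) if os.path.isfile(p)]


# package_dir is a placeholder for the directory of the installed pyemma package.
paths = candidate_config_files(package_dir="/path/to/site-packages/pyemma")
```

Calling `existing_config_files(...)` with the real package directory shows which of the four files are present on your machine.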

Load a configuration file

In order to load a pre-saved configuration file, use the load() method:

>>> from pyemma import config # doctest: +SKIP
>>> config.load('pyemma_silent.cfg') # doctest: +SKIP

Configuration values

class pyemma.util._config.Config

Methods

keys()

valid configuration keys

load([filename])

load runtime configuration from given filename.

save([filename])

Saves the runtime configuration to disk.

Attributes

DEFAULT_CONFIG_DIR

DEFAULT_CONFIG_FILE_NAME

DEFAULT_LOGGING_FILE_NAME

cfg_dir

PyEMMA's configuration directory (e.g. ~/.pyemma)

check_version

Check for the latest release online.

coordinates_check_output

Enabling this option will check for invalid output (NaN, Inf) in pyemma.coordinates.

default_chunksize

default chunksize to use for coordinate transformations, only integers with suffix [k,m,g]

default_config_file

default config file living in PyEMMA package

default_logging_file

default logging configuration

logging_config

currently used logging configuration file.

mute

Switch this to True to tell PyEMMA not to show progress bars or log to the console.

show_config_notification

show_progress_bars

Show progress bars for heavy computations?

traj_info_max_entries

How many entries (files) the trajectory info cache can hold.

traj_info_max_size

Maximum trajectory info cache size in bytes.

use_trajectory_lengths_cache

Whether the trajectory info cache is used to remember attributes of trajectory files.

used_filenames

these filenames have been read to obtain basic configuration values.

DEFAULT_CONFIG_DIR = '/srv/public/pyemma-build-doc-with-recipe/home/.pyemma'

DEFAULT_CONFIG_FILE_NAME = 'pyemma.cfg'

DEFAULT_LOGGING_FILE_NAME = 'logging.yml'

cfg_dir

PyEMMA's configuration directory (e.g. ~/.pyemma)

check_version

Check for the latest release online.

Disable this if you have privacy concerns. We currently collect:

  • Python version

  • PyEMMA version

  • operating system

  • MAC address

See Legal Notices for further information.

coordinates_check_output

Enabling this option will check for invalid output (NaN, Inf) in pyemma.coordinates.

Notes

This setting is on by default as of PyEMMA version 2.5.5.

default_chunksize

default chunksize to use for coordinate transformations, only integers with suffix [k,m,g]

default_config_file

default config file living in PyEMMA package

default_logging_file

default logging configuration

keys()

valid configuration keys

load(filename=None)

load runtime configuration from given filename. If filename is None try to read from default file from default location.

logging_config

currently used logging configuration file. Cannot be changed during runtime.

mute

Switch this to True to tell PyEMMA not to show progress bars or log to the console.

save(filename=None)

Saves the runtime configuration to disk.

Parameters

filename (str or None, default=None) – writeable path to configuration filename. If None, use default location and filename.

show_config_notification

show_progress_bars

Show progress bars for heavy computations?

traj_info_max_entries

How many entries (files) the trajectory info cache can hold. The cache will forget the least recently used entries when this limit is hit.

traj_info_max_size

Maximum trajectory info cache size in bytes. The cache will forget the least recently used entries when this limit is hit.

use_trajectory_lengths_cache

Whether the trajectory info cache is used to remember attributes of trajectory files.

It is strongly recommended to use the cache, especially for XTC files, because this speeds up reader creation considerably.

used_filenames

these filenames have been read to obtain basic configuration values.

Parallel setup

Some algorithms of PyEMMA use parallel computing. On the one hand, there is parallelisation due to NumPy, which can use several threads to speed up raw NumPy computations. On the other hand, PyEMMA itself can start several threads and/or sub-processes (e.g. in clustering, MSM timescales computation etc.).

To limit the number of threads/processes started by PyEMMA, you can set the environment variable PYEMMA_NJOBS to an integer value. This setting can also be overridden by the n_jobs property of estimators that support it.

To set the number of threads utilized by NumPy you can set the environment variable OMP_NUM_THREADS to an integer value as well.

Note that this number will be multiplied by the setting for PYEMMA_NJOBS if the algorithm uses multiple processes, as each process will use the same number of OMP threads.
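As a rough illustration of this multiplication, the sketch below sets both variables and computes the resulting upper bound on concurrently running threads. The helper `worst_case_threads` is hypothetical and the values 4 and 2 are arbitrary examples. In practice such variables typically need to be set before NumPy and PyEMMA are first imported in order to take effect.

```python
import os


def worst_case_threads(n_jobs, omp_threads):
    """Upper bound on threads when each of n_jobs processes uses omp_threads."""
    if n_jobs < 1 or omp_threads < 1:
        raise ValueError("both values must be positive integers")
    return n_jobs * omp_threads


# Illustrative values only: 4 PyEMMA jobs with 2 OpenMP threads each.
os.environ["PYEMMA_NJOBS"] = "4"
os.environ["OMP_NUM_THREADS"] = "2"

total = worst_case_threads(
    n_jobs=int(os.environ["PYEMMA_NJOBS"]),
    omp_threads=int(os.environ["OMP_NUM_THREADS"]),
)
print(total)  # 8
```

Comparing this product with the number of available CPU cores gives a quick check for oversubscription.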

Setting these values too high will lead to bad performance due to the overhead of maintaining multiple threads and/or processes.

By default PYEMMA_NJOBS will be chosen automatically to suit your hardware setup, but in shared environments this can be sub-optimal.

For the popular SLURM cluster scheduler, we also respect the value of the environment variable SLURM_CPUS_ON_NODE and give it preference when PYEMMA_NJOBS is also set. So if you have chosen the number of CPUs for your cluster job, PyEMMA will automatically use the same number of threads.
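One possible reading of this precedence is sketched below. It is not PyEMMA's actual code: `resolve_n_jobs` is a hypothetical function, the fallback to the CPU count reported by the operating system is an assumption standing in for the automatic choice, and the per-estimator n_jobs override is not modelled.

```python
import os


def resolve_n_jobs(env=None):
    """Pick a job count: SLURM value first, then PYEMMA_NJOBS, then CPU count."""
    env = os.environ if env is None else env
    for key in ("SLURM_CPUS_ON_NODE", "PYEMMA_NJOBS"):
        value = env.get(key)
        if value is not None:
            return int(value)
    # Fallback (assumption): use the number of CPUs the OS reports.
    return os.cpu_count() or 1


# Both variables set: the SLURM value wins in this sketch.
n = resolve_n_jobs({"SLURM_CPUS_ON_NODE": "16", "PYEMMA_NJOBS": "4"})
print(n)  # 16
```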