Runtime Configuration¶
You can change some of PyEMMA's runtime behaviour by setting configuration values in PyEMMA's config module. These values can be saved to disk so that they are applied on every import of the package.
Examples¶
Change values¶
To access the configuration at runtime, e.g. to check or change whether progress bars are shown:
>>> from pyemma import config # doctest: +SKIP
>>> print(config.show_progress_bars) # doctest: +SKIP
True
>>> config.show_progress_bars = False # doctest: +SKIP
>>> print(config.show_progress_bars) # doctest: +SKIP
False
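If a setting should only be changed for part of a script, the assignment shown above can be wrapped in a small context manager. The following is a minimal sketch: `temporary_setting` is a hypothetical helper and not part of PyEMMA, and a `SimpleNamespace` stands in for `pyemma.config` so that the example runs without PyEMMA installed.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_setting(cfg, name, value):
    """Set cfg.<name> to value inside the with-block, then restore the old value."""
    old = getattr(cfg, name)
    setattr(cfg, name, value)
    try:
        yield cfg
    finally:
        # Restore the previous value even if the block raised an exception.
        setattr(cfg, name, old)


# Stand-in for `pyemma.config` (assumption: plain attribute access is all we need).
config = SimpleNamespace(show_progress_bars=True)

with temporary_setting(config, "show_progress_bars", False):
    inside = config.show_progress_bars  # value seen inside the block
after = config.show_progress_bars       # value after the block has exited
```

With PyEMMA installed, the same helper could presumably be called with the real `config` object in place of the stand-in.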
Store your changes / Create a configuration directory¶
To create an editable configuration file, use the pyemma.config.save()
method:
>>> from pyemma import config # doctest: +SKIP
>>> config.save('/tmp/pyemma_current.cfg') # doctest: +SKIP
This stores the current runtime configuration values in the given file. Note that these settings will not be used on the next start of PyEMMA unless you specify where the file is stored. To do so, set the environment variable “PYEMMA_CFG_DIR” to the directory containing the config file.
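The environment variable can also be set from within Python. The sketch below assumes that PyEMMA reads “PYEMMA_CFG_DIR” when the package is imported, so the variable is set beforehand, and that the file inside the directory should carry the default name pyemma.cfg. The directory name is a hypothetical choice.

```python
import os
import tempfile

# Hypothetical location; any writeable directory should work.
cfg_dir = os.path.join(tempfile.gettempdir(), "pyemma_cfg")
os.makedirs(cfg_dir, exist_ok=True)

# Assumption: the variable is evaluated at import time, so set it first.
os.environ["PYEMMA_CFG_DIR"] = cfg_dir

# Assumption: the config file in that directory uses the default file name.
cfg_file = os.path.join(cfg_dir, "pyemma.cfg")

# With PyEMMA installed, one would then continue with:
#   from pyemma import config
#   config.save(cfg_file)
```

Setting the variable in the shell profile or the job script has the same effect and also covers processes that are not started from this script.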
For details on configuring logging, see the Python logging HOWTO: https://docs.python.org/2/howto/logging.html
Default configuration file¶
Default settings are stored in a provided pyemma.cfg file, which is included in the Python package:
Configuration files¶
To configure the runtime behavior such as the logging system or other parameters, the configuration module reads several config files to build its final set of settings. It searches for the file ‘pyemma.cfg’ in several locations with different priorities:
- $CWD/pyemma.cfg
- $HOME/.pyemma/pyemma.cfg
- ~/pyemma.cfg
- $PYTHONPATH/pyemma/pyemma.cfg (always taken as default configuration file)
You can also override the location of the configuration directory by setting the environment variable “PYEMMA_CFG_DIR” to a writeable path.
The default values are stored in the last file of the list above (the one shipped with the package) to ensure that these values are always defined.
If no configuration file could be found, the defaults from the shipped package will apply.
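Layering several config files can be illustrated with the standard library's `configparser`, which reads files in order and lets later files override earlier ones. This is only a sketch of the general technique, not PyEMMA's actual loader: the section name `pyemma`, the file names, and the read order (defaults first, user file last) are assumptions made for the example.

```python
import configparser
import os
import tempfile


def read_layered(paths):
    """Read the given files in order; values in later files override earlier ones."""
    parser = configparser.ConfigParser()
    used = parser.read(paths)  # files that do not exist are silently skipped
    return parser, used


with tempfile.TemporaryDirectory() as tmp:
    defaults = os.path.join(tmp, "defaults.cfg")
    user = os.path.join(tmp, "pyemma.cfg")
    missing = os.path.join(tmp, "not_there.cfg")

    # Shipped defaults: every key is defined here.
    with open(defaults, "w") as fh:
        fh.write("[pyemma]\nshow_progress_bars = True\nmute = False\n")
    # User file: overrides only one key.
    with open(user, "w") as fh:
        fh.write("[pyemma]\nshow_progress_bars = False\n")

    parser, used = read_layered([defaults, missing, user])
    show = parser.getboolean("pyemma", "show_progress_bars")  # from the user file
    mute = parser.getboolean("pyemma", "mute")                # from the defaults
    n_used = len(used)                                        # files actually read
```

Reading the defaults first guarantees that every key has a value even when the user file sets only a few of them.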
Load a configuration file¶
In order to load a pre-saved configuration file, use the load()
method:
>>> from pyemma import config # doctest: +SKIP
>>> config.load('pyemma_silent.cfg') # doctest: +SKIP
Configuration values¶
class pyemma.util._config.Config¶
Methods
- keys(): valid configuration keys
- load([filename]): load runtime configuration from given filename.
- save([filename]): Saves the runtime configuration to disk.
Attributes
- DEFAULT_CONFIG_DIR
- DEFAULT_CONFIG_FILE_NAME
- DEFAULT_LOGGING_FILE_NAME
- cfg_dir: PyEMMA's configuration directory (e.g. ~/.pyemma)
- check_version: Check for the latest release online.
- coordinates_check_output: Enabling this option will check for invalid output (NaN, Inf) in pyemma.coordinates.
- default_chunksize: default chunksize to use for coordinate transformations, only integers with suffix [k,m,g]
- default_config_file: default config file living in PyEMMA package
- default_logging_file: default logging configuration
- logging_config: currently used logging configuration file.
- mute: Switch this to True to tell PyEMMA not to use progress bars and logging to console.
- show_config_notification
- show_progress_bars: Show progress bars for heavy computations?
- traj_info_max_entries: How many entries (files) the trajectory info cache can hold.
- traj_info_max_size: Maximum trajectory info cache size in bytes.
- use_trajectory_lengths_cache: Whether the trajectory info cache should be used to remember attributes of trajectory files.
- used_filenames: these filenames have been read to obtain basic configuration values.
-
DEFAULT_CONFIG_DIR
= '/home/mi/marscher/.pyemma'¶
-
DEFAULT_CONFIG_FILE_NAME
= 'pyemma.cfg'¶
-
DEFAULT_LOGGING_FILE_NAME
= 'logging.yml'¶
-
cfg_dir
¶ PyEMMA's configuration directory (e.g. ~/.pyemma)
-
check_version
¶ Check for the latest release online.
Disable this if you have privacy concerns. We currently collect:
- Python version
- PyEMMA version
- operating system
- MAC address
See Legal Notices for further information.
-
coordinates_check_output
¶ Enabling this option will check for invalid output (NaN, Inf) in pyemma.coordinates.
Notes
This setting is on by default as of PyEMMA version 2.5.5
-
default_chunksize
¶ default chunksize to use for coordinate transformations, only integers with suffix [k,m,g]
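To make the suffix notation concrete, the sketch below converts strings such as `256m` into plain integers. `parse_size` is a hypothetical helper and not PyEMMA's own parser. It assumes binary multiples (1k = 1024) and also accepts a value without a suffix; neither is confirmed by the description above.

```python
import re

# Assumption: suffixes denote binary multiples (1k = 1024).
_FACTORS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert '2k', '256m', '1g' or a plain integer into an int."""
    match = re.fullmatch(r"\s*(\d+)\s*([kmg]?)\s*", str(text).lower())
    if match is None:
        # Rejects floats, negative numbers and unknown suffixes.
        raise ValueError(f"invalid size: {text!r}")
    number, suffix = match.groups()
    return int(number) * _FACTORS[suffix]


two_k = parse_size("2k")
quarter_gig = parse_size("256m")
plain = parse_size(100)
```

Non-integer input such as `1.5m` raises a `ValueError`, in line with the "only integers" restriction.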
-
default_config_file
¶ default config file living in PyEMMA package
-
default_logging_file
¶ default logging configuration
-
keys
()¶ valid configuration keys
-
load
(filename=None)¶ Load the runtime configuration from the given filename. If filename is None, try to read the default file from the default location.
-
logging_config
¶ currently used logging configuration file. Cannot be changed during runtime.
-
mute
¶ Switch this to True to tell PyEMMA not to use progress bars and not to log to the console.
-
save
(filename=None)¶ Saves the runtime configuration to disk.
Parameters: filename (str or None, default=None) – writeable path to configuration filename. If None, use default location and filename.
-
show_config_notification
¶
-
show_progress_bars
¶ Show progress bars for heavy computations?
-
traj_info_max_entries
¶ How many entries (files) the trajectory info cache can hold. The cache will forget the least recently used entries when this limit is hit.
-
traj_info_max_size
¶ Maximum trajectory info cache size in bytes. The cache will forget the least recently used entries when this limit is hit.
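The "least recently used" eviction mentioned for both cache limits can be sketched with an `OrderedDict`. This is a generic illustration of the policy bounded by entry count, not PyEMMA's cache implementation; the class name `LRUCache` and the file names are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most max_entries items and evicts the least recently used one."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Mark the entry as most recently used.
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        # Drop entries from the "oldest" end until the limit is respected.
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)

    def keys(self):
        return list(self._data)


cache = LRUCache(max_entries=2)
cache.put("a.xtc", 100)
cache.put("b.xtc", 200)
cache.get("a.xtc")       # "a.xtc" is now the most recently used entry
cache.put("c.xtc", 300)  # limit hit: "b.xtc" is evicted
remaining = cache.keys()
```

A size-based limit works the same way, except that the loop compares the accumulated size in bytes against the limit instead of the number of entries.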
-
use_trajectory_lengths_cache
¶ Whether the trajectory info cache should be used to remember attributes of trajectory files.
It is strongly recommended to use the cache especially for XTC files, because this will speed up reader creation a lot.
-
used_filenames
¶ these filenames have been read to obtain basic configuration values.
Parallel setup¶
Some algorithms of PyEMMA use parallel computing. On the one hand, there is parallelisation due to NumPy, which can use several threads to speed up raw NumPy computations. On the other hand, PyEMMA itself can start several threads and/or sub-processes (e.g. in clustering, MSM timescales computation etc.).
To limit the number of threads/processes started by PyEMMA, you can set the environment variable PYEMMA_NJOBS to an integer value. This setting can also be overridden by the n_jobs property of the supported estimator.
To set the number of threads utilized by NumPy you can set the environment variable OMP_NUM_THREADS to an integer value as well.
Note that this number will be multiplied by the setting for PYEMMA_NJOBS if the algorithm uses multiple processes, as each process will use the same number of OMP threads.
Setting these values too high will lead to poor performance due to the overhead of maintaining multiple threads and/or processes.
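A quick way to spot oversubscription is to multiply both settings and compare the product with the number of available CPUs. The helpers below are hypothetical and only estimate the worst case; they assume an unset variable counts as 1, which is not necessarily what PyEMMA or NumPy do by default.

```python
import os


def worker_threads(environ=os.environ):
    """Worst-case thread count: processes times OMP threads per process."""
    # Assumption: an unset variable is treated as 1 for this estimate.
    n_jobs = int(environ.get("PYEMMA_NJOBS", 1))
    omp = int(environ.get("OMP_NUM_THREADS", 1))
    return n_jobs * omp


def is_oversubscribed(environ=os.environ, n_cpus=None):
    """True if the estimated thread count exceeds the number of CPUs."""
    n_cpus = n_cpus or os.cpu_count() or 1
    return worker_threads(environ) > n_cpus


settings = {"PYEMMA_NJOBS": "4", "OMP_NUM_THREADS": "2"}
total = worker_threads(settings)                  # 4 processes * 2 threads
too_many = is_oversubscribed(settings, n_cpus=4)  # 8 threads on 4 CPUs
```

Passing a plain dictionary instead of `os.environ` makes it easy to try out different combinations without touching the real environment.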
By default PYEMMA_NJOBS will be chosen automatically to suit your hardware setup, but in shared environments this can be sub-optimal.
For the popular SLURM cluster scheduler, we also respect the value of the environment variable SLURM_CPUS_ON_NODE and give it precedence, even if PYEMMA_NJOBS is also set. So if you have chosen the number of CPUs for your cluster job, PyEMMA will automatically use the same number of threads.
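The lookup order described in this section can be sketched as follows. `resolve_n_jobs` is a hypothetical helper, not PyEMMA's internal function. It assumes that SLURM_CPUS_ON_NODE wins over PYEMMA_NJOBS and that the automatic choice falls back to the CPU count reported by the operating system.

```python
import os


def resolve_n_jobs(environ=os.environ):
    """Pick the number of jobs from the environment, else from the hardware."""
    # Checked in order of assumed precedence: SLURM first, then PyEMMA's own variable.
    for name in ("SLURM_CPUS_ON_NODE", "PYEMMA_NJOBS"):
        value = environ.get(name)
        if value:
            return int(value)
    # Assumption: "chosen automatically" means the number of CPUs of the machine.
    return os.cpu_count() or 1


both = resolve_n_jobs({"SLURM_CPUS_ON_NODE": "8", "PYEMMA_NJOBS": "2"})
only_pyemma = resolve_n_jobs({"PYEMMA_NJOBS": "2"})
automatic = resolve_n_jobs({})
```

On a shared machine without SLURM, setting PYEMMA_NJOBS explicitly avoids the hardware fallback, which would otherwise count CPUs that other users may already be using.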