deeptrain.util package¶
Subpackages¶
Submodules¶
deeptrain.util.algorithms module¶
-
deeptrain.util.algorithms.ordered_shuffle(*args)¶ Shuffles each of the iterables the same way. Ex:
>>> ([1, 2, 3, 4], {'a': 5, 'b': 6, 'c': 7, 'd': 8})
>>> ([3, 4, 1, 2], {'c': 7, 'd': 8, 'a': 5, 'b': 6})
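The idea of shuffling several iterables the same way can be sketched by drawing one permutation and applying it to every input. This is an illustration, not DeepTrain's implementation: `ordered_shuffle_sketch` and its `seed` parameter are hypothetical, and the sketch assumes the inputs are lists or dicts of equal length.

```python
import random

def ordered_shuffle_sketch(*args, seed=None):
    """Shuffle lists/dicts of equal length with one shared permutation."""
    lengths = {len(a) for a in args}
    if len(lengths) > 1:
        raise ValueError("all inputs must have the same length")
    n = lengths.pop() if lengths else 0

    # One permutation, reused for every input.
    perm = list(range(n))
    random.Random(seed).shuffle(perm)

    out = []
    for a in args:
        if isinstance(a, dict):
            items = list(a.items())
            out.append({items[i][0]: items[i][1] for i in perm})
        else:
            out.append([a[i] for i in perm])
    return tuple(out)

nums, letters = ordered_shuffle_sketch(
    [1, 2, 3, 4], {'a': 5, 'b': 6, 'c': 7, 'd': 8}, seed=0)
```

Because both outputs follow the same permutation, the i-th value of `letters` always stays paired with the i-th element of `nums`.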
-
deeptrain.util.algorithms.nCk(n, k)¶ n-Choose-k
-
deeptrain.util.algorithms.builtin_or_npscalar(x, include_type_type=False)¶ Returns True if x is a builtin or a numpy scalar. Since
type is a builtin, but is a class rather than a literal, it’s omitted by default; set include_type_type=True to include it.
-
deeptrain.util.algorithms.obj_to_str(x, len_lim=200, drop_absname=False)¶ Converts
x to a string representation if it isn’t a builtin or numpy scalar. Trims string representation to
len_lim if x or type(x) have no __qualname__ or __name__ attributes. To drop packages and modules in an object’s name (package.subpackage.obj), pass drop_absname=True.
-
deeptrain.util.algorithms.deeplen(item)¶ Return total number of items in an arbitrarily nested iterable - excluding the iterables themselves.
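A rough sketch of counting items in a nested iterable while excluding the containers themselves. It is not the library's code: `deeplen_sketch` is a hypothetical name, only lists, tuples, and dict values are treated as containers, and strings count as single items.

```python
def deeplen_sketch(item):
    """Count leaf items in nested lists/tuples/dicts; containers are not counted."""
    if isinstance(item, dict):
        return sum(deeplen_sketch(v) for v in item.values())
    if isinstance(item, (list, tuple)):
        return sum(deeplen_sketch(x) for x in item)
    return 1  # anything else is a leaf

nested = [1, (2, 3), {'a': [4, 5], 'b': 6}, []]
total = deeplen_sketch(nested)
print(total)  # 6
```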
-
deeptrain.util.algorithms.deepget(obj, key=None, drop_keys=0)¶ Get an item from an arbitrarily nested iterable.
key is a list/tuple of access specifiers (indices or mapping (e.g. dict) keys); if a mapping is unordered (e.g. dict for Python <=3.5), retrieval isn’t consistent.
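Retrieval by a sequence of access specifiers can be sketched as repeated indexing. `deepget_sketch` is a hypothetical stand-in; it leaves out the `drop_keys` argument, whose behavior is not described here.

```python
def deepget_sketch(obj, key=None):
    """Follow a list/tuple of indices or mapping keys into a nested object."""
    if key is None:
        return obj
    for k in key:
        obj = obj[k]  # works for both sequence indices and dict keys
    return obj

data = {'a': [10, {'b': (1, 2, 3)}]}
value = deepget_sketch(data, ('a', 1, 'b', 2))
print(value)  # 3
```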
-
deeptrain.util.algorithms.deepmap(obj, fn)¶ Map
fn to items of an arbitrarily nested iterable, including iterables. See https://codereview.stackexchange.com/q/242369/210581 for an explanation.
-
deeptrain.util.algorithms.deep_isinstance(obj, cond)¶ Checks that items within an arbitrarily nested iterable meet
cond. Returns a list of bools; to assert that all elements meet cond, run all(deep_isinstance()).
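The "list of bools, then `all(...)`" pattern can be illustrated with a small stand-in. `deep_check_sketch` is hypothetical and checks only leaf items of lists, tuples, and dict values; whether the library also checks the containers themselves is not stated here.

```python
def deep_check_sketch(obj, cond):
    """Apply `cond` to every leaf of a nested list/tuple/dict; return the bools."""
    results = []

    def walk(x):
        if isinstance(x, dict):
            for v in x.values():
                walk(v)
        elif isinstance(x, (list, tuple)):
            for v in x:
                walk(v)
        else:
            results.append(bool(cond(x)))

    walk(obj)
    return results

is_number = lambda v: isinstance(v, (int, float))
flags = deep_check_sketch([1, (2.5, 'x'), {'k': 3}], is_number)
print(flags)       # [True, True, False, True]
print(all(flags))  # False
```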
deeptrain.util.configs module¶
Custom configurations for TrainGenerator() / DataGenerator()
go here. These will be defaulted to when pertinent configs aren’t passed
explicitly to __init__ (e.g. _PLOT_CFG for TrainGenerator.plot_configs).
deeptrain.util._default_configs module¶
! DO NOT MODIFY ! Used internally by classes to validate input arguments. Effective configurations are in configs.py. Can serve as user reference.
-
deeptrain.util._default_configs._DEFAULT_MODEL_NAME_CFG= {'best_key_metric': None, 'lr': '', 'optimizer': ''}¶ Configures
get_unique_model_name().
-
deeptrain.util._default_configs._DEFAULT_TRAINGEN_SAVESKIP_LIST= ['model', 'optimizer_state', 'callbacks', 'key_metric_fn', 'custom_metrics', 'metric_to_alias', 'alias_to_metric', 'name_process_key_fn', '_fit_fn', '_eval_fn', '_labels', '_preds', '_y_true', '_y_preds', '_labels_cache', '_preds_cache', '_sw_cache', '_imports', '_history_fig', '_val_max_set_name_chars', '_max_set_name_chars', '_inferred_batch_size', '_class_labels_cache', '_temp_history_empty', '_val_temp_history_empty', '_val_sw', '_set_num', '_val_set_num']¶ Configures
TrainGenerator.save().
-
deeptrain.util._default_configs._DEFAULT_TRAINGEN_LOADSKIP_LIST= ['{auto}', 'model_name', 'model_base_name', 'model_num', 'use_passed_dirs_over_loaded', 'logdir', '_init_callbacks_called']¶ Configures
TrainGenerator.load().
-
deeptrain.util._default_configs._DEFAULT_DATAGEN_SAVESKIP_LIST= ['batch', 'superbatch', 'labels', 'all_labels', '_group_batch', '_group_labels']¶ Configures
TrainGenerator.save().
-
deeptrain.util._default_configs._DEFAULT_DATAGEN_LOADSKIP_LIST= ['data_path', 'labels_path', 'superbatch_path', 'data_loader', 'set_nums_original', 'set_nums_to_process', 'superbatch_set_nums']¶ Configures
TrainGenerator.load().
-
deeptrain.util._default_configs._DEFAULT_MODEL_SAVE_KW= {'include_optimizer': True, 'save_format': None}¶ Configures
TrainGenerator.save().
-
deeptrain.util._default_configs._DEFAULT_MODEL_SAVE_WEIGHTS_KW= {'save_format': None}¶ Configures
TrainGenerator.save().
-
deeptrain.util._default_configs._DEFAULT_METRIC_PRINTSKIP_CFG= {'train': [], 'val': []}¶ Configures
TrainGenerator._print_train_progress() and TrainGenerator._print_val_progress().
-
deeptrain.util._default_configs._DEFAULT_METRIC_TO_ALIAS= {'accuracy': 'Acc', 'f1_score': 'F1', 'loss': 'Loss', 'mean_absolute_error': 'MAE', 'mean_squared_error': 'MSE', 'tnr': '0-Acc', 'tpr': '1-Acc'}¶ Configures
TrainGenerator._metric_name_to_alias()
-
deeptrain.util._default_configs._DEFAULT_ALIAS_TO_METRIC= {'acc': 'accuracy', 'cosine': 'cosine_similarity', 'f1': 'f1_score', 'f1-score': 'f1_score', 'kld': 'kullback_leibler_divergence', 'mae': 'mean_absolute_error', 'mape': 'mean_absolute_percentage_error', 'mse': 'mean_squared_error', 'msle': 'mean_squared_logarithmic_error'}¶ Configures
TrainGenerator._alias_to_metric_name()
-
deeptrain.util._default_configs._DEFAULT_NAME_PROCESS_KEY_FN(key, alias, attrs)¶ Used within
deeptrain.util.logging.get_unique_model_name().
-
deeptrain.util._default_configs._DEFAULT_DATAGEN_CFG= {'data_batch_shape': None, 'data_dtype': None, 'labels_batch_shape': None, 'labels_dtype': None, 'loadskip_list': ['data_path', 'labels_path', 'superbatch_path', 'data_loader', 'set_nums_original', 'set_nums_to_process', 'superbatch_set_nums'], 'saveskip_list': ['batch', 'superbatch', 'labels', 'all_labels', '_group_batch', '_group_labels'], 'shuffle_group_batches': False, 'shuffle_group_samples': False}¶ Default
DataGenerator() configurations. Used within DataGenerator._init_and_validate_kwargs() to check whether any of the args in **kwargs isn’t one of the keys in this dict (in which case it’s unused internally and will raise an exception).
- shuffle_group_batches: bool
- See DataGenerator._make_group_batch_and_labels().
- shuffle_group_samples: bool
- See DataGenerator._make_group_batch_and_labels().
- data_batch_shape: tuple[int]
- Predefined complete batch shape (batch_size, *) (i.e. (samples, *)) to be used by some loaders in DataLoader() (of data_loader).
- labels_batch_shape: tuple[int]
- data_batch_shape, but for labels_loader.
- data_dtype: str / dtype
- Used by some loaders in DataLoader() (of data_loader) to cast loaded data to dtype.
- labels_dtype: str / dtype
- data_dtype, but for labels_loader.
- loadskip_list: list[str]
- List of DataGenerator attribute names to skip from loading. Mainly for attributes that should change between different train sessions, e.g. set_nums_to_process, or shouldn’t have **kwargs values overridden by load.
- saveskip_list: list[str]
- List of DataGenerator attribute names to skip from saving. Used to exclude e.g. objects that cannot be pickled (e.g. model), are large and should be loaded separately (e.g. batch), or should be reinstantiated (e.g. _imports).
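The check described above, rejecting any keyword argument that is not a key of the defaults dict, might look roughly like the sketch below. `validate_kwargs_sketch` and the trimmed `DEFAULTS_SKETCH` dict are hypothetical, and raising `ValueError` is an assumption; the text only says an exception is raised.

```python
# Hypothetical, trimmed stand-in for the defaults dict.
DEFAULTS_SKETCH = {
    'shuffle_group_batches': False,
    'shuffle_group_samples': False,
    'data_dtype': None,
}

def validate_kwargs_sketch(kwargs, defaults):
    """Reject unknown keys, then fill the missing ones from `defaults`."""
    unknown = [k for k in kwargs if k not in defaults]
    if unknown:
        # Assumption: the exception type is not specified in the docs.
        raise ValueError("unexpected kwargs: " + ", ".join(unknown))
    return {**defaults, **kwargs}

cfg = validate_kwargs_sketch({'data_dtype': 'float32'}, DEFAULTS_SKETCH)
```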
-
deeptrain.util._default_configs._DEFAULT_PLOT_CFG¶ Configures
get_history_fig().
_DEFAULT_PLOT_CFG = {
'fig_kw': {'figsize': (12, 7)},
'0': {
'metrics': None,
'x_ticks': None,
'vhlines' :
{'v': '_hist_vlines',
'h': 1},
'mark_best_cfg': None,
'ylims' : (0, 2),
'legend_kw' : {'fontsize': 13},
'linewidth': [1.5, 1.5],
'linestyle': ['-', '-'],
'color' : None,
},
'1': {
'metrics': None,
'x_ticks': None,
'vhlines':
{'v': '_val_hist_vlines',
'h': .5},
'mark_best_cfg': None,
'ylims' : (0, 1),
'legend_kw' : {'fontsize': 13},
'linewidth': [1.5],
'linestyle': ['-'],
'color': None,
}
}
-
deeptrain.util._default_configs._DEFAULT_REPORT_CFG¶ Configures
generate_report().
_DEFAULT_REPORT_CFG = {
'model':
{},
'traingen':
{
'exclude':
['model', 'model_configs', 'logs_use_full_model_name',
'history_fig', 'plot_configs', 'max_checkpoints',
'history', 'val_history', 'temp_history', 'val_temp_history',
'name_process_key_fn', 'report_fontpath', 'model_name_configs',
'report_configs', 'datagen', 'val_datagen', 'logdir', 'logs_dir',
'best_models_dir', 'fit_fn', 'eval_fn', '_fit_fn', '_eval_fn',
'callbacks', '_cb_alias', '_passed_args', '_history_fig',
'_metrics_cached', 'metric_printskip_configs',
'_inferred_batch_size', 'plot_first_pane_max_vals', '_imports',
'iter_verbosity', '_max_set_name_chars', '_val_max_set_name_chars',
'metric_to_alias', 'alias_to_metric',
'*_has_', '*temp_history_empty',
],
'exclude_types':
[list, np.ndarray, '#best_subset_nums'],
},
('datagen', 'val_datagen'):
{
'exclude':
['batch', 'group_batch', 'labels', 'all_labels',
'batch_loaded', 'batch_exhausted', 'set_num', 'set_name',
'_set_names', 'set_nums_original', 'set_nums_to_process',
'superbatch_set_nums', 'data_loader', 'data_path',
'labels_loader', 'labels_path',
'saveskip_list', 'loadskip_list', '_path_attrs', 'preprocessor',
'*_ATTRS', '*superbatch', '*_filepaths', '*_filenames']
},
}
-
deeptrain.util._default_configs._DEFAULT_TRAINGEN_CFG¶ Default
TrainGenerator() configurations. Used within TrainGenerator._init_and_validate_kwargs() to check whether any of the args in **kwargs isn’t one of the keys in this dict (in which case it’s unused internally and will raise an exception).
- Parameters:
- dynamic_predict_threshold_min_max: tuple[float, float]
- Range of permitted values for dynamic_predict_threshold when setting it. Useful for constraining “best subset” search to discourage high binary classifier performance with extreme best thresholds (e.g. 0.99), which might do worse on larger validation sets.
- checkpoints_overwrite_duplicates: bool
- Default value of overwrite in checkpoint(). Controls whether checkpoint will overwrite files if they have the same name as the current checkpoint’s; if False, will make unique filenames by incrementing as ‘_v2’, ‘_v3’, etc.
- loss_weighted_slices_range: tuple[float, float]
- Passed as weight_range to _get_weighted_sample_weight(). A linear scaling of sample_weight when using slices. During training, this is used over pred_weighted_slices_range; during validation, the latter is used.
- pred_weighted_slices_range: tuple[float, float]
- Same as loss_weighted_slices_range, except it is used during validation to compute metrics from predictions, and not in scaling train-time sample_weight.
- logs_use_full_model_name: bool
- Whether to use model_name or a minimal name containing number of validations done + best key metric, within checkpoint().
- new_model_num: bool
- Used within get_unique_model_name(). If True, will set model_num to +1 the max number after "M" for directory names in logs_dir; e.g. if such a directory is "M15__Classifier", will use "M16", and set model_num = 16.
- dynamic_predict_threshold: float / None
- predict_threshold that is optimized during training to yield best key_metric. See _set_predict_threshold(), which is called by _get_val_history(), and _get_best_subset_val_history(). If None, will only use predict_threshold.
- plot_first_pane_max_vals: int
- Maximum number of validation metrics to plot, as set by _make_plot_configs_from_metrics(), for get_history_fig(). This is a setting for the default config maker (first method), which plots all train metrics in first pane.
- _val_max_set_name_chars: int
- Padding to use in TrainGenerator._print_iter_progress() to justify val_set_name when printing “Validating set val_set_name”; should be set to expected longest set name for vertical alignment. E.g. if '123', should set to 3, if '99', to 2.
- _max_set_name_chars: int
- Same as _val_max_set_name_chars, but for train_set_name.
- predict_threshold: float
- Binary classifier prediction threshold, above which to classify as '1', used in _compute_metrics(). If dynamic_predict_threshold and dynamic_predict_threshold_min_max are not None, it will be set equal to the former within bounds of the latter.
- best_subset_size: int >= 1 / None
- If not None, will search for best_subset_size number of batches yielding best validation performance, out of all validation batches (e.g. 5 of 10). Useful for model ensembling in specializing member models on different parts of data. See _get_best_subset_val_history().
- check_model_health: bool
- Whether to call TrainGenerator.check_health() at the end of validation in TrainGenerator._on_val_end(), which checks whether any model layers have zero/NaN weights. Very fast / inexpensive.
- max_one_best_save: bool
- Whether to keep only one set of save files (model weights, TrainGenerator state, etc.) in best_models_dir when saving best model via _save_best_model().
- max_checkpoints: int
- Maximum sets of checkpoint files (model weights, TrainGenerator state, etc.) to keep in logdir, when checkpointing via checkpoint().
- report_fontpath: str
- Path to font file for font to use in saving report (save_report()); defaults to consola, which yields nice vertical & horizontal alignment.
- model_base_name: str
- Name between "M{model_num}" and autogenerated string from model_configs, as "M{model_num}_{model_base_name}_*"; see get_unique_model_name().
- final_fig_dir: str / None
- Path to directory where to save latest metric history using full model_name, at most one per model_num. If None, won’t save such a figure (but will still save history for best model & checkpoint).
- loadskip_list: list[str]
- List of TrainGenerator attribute names to skip from loading. Mainly for attributes that should change between different train sessions, e.g. model_num, or shouldn’t have **kwargs values overridden by load.
- saveskip_list: list[str]
- List of TrainGenerator attribute names to skip from saving. Used to exclude e.g. objects that cannot be pickled (e.g. model), are large and should be loaded separately (e.g. batch), or should be reinstantiated (e.g. _imports).
- model_save_kw: dict / None
- Passed as kwargs to model.save():
  - overwrite: bool. Whether to overwrite existing file.
  - include_optimizer: bool. Whether to include optimizer weights.
  - save_format (tf.keras): str. Savefile format. If None, will internally default to 'h5' if using tf.keras else will drop.
  - others (tf.keras): see model.save().
- model_save_weights_kw: dict / None
- Passed as kwargs to model.save_weights(); same as model_save_kw, excluding include_optimizer.
- metric_to_alias: dict
- Dict mapping a metric name to its alias; current use is for controlling how metric names are printed (see TrainGenerator._print_progress()).
- alias_to_metric: dict
- Dict mapping a metric alias to its TF/Keras/DeepTrain name. If defined in TF/Keras, DeepTrain uses the same names - else, they’ll match function names in deeptrain.metrics.
- report_configs: dict
- Dict specifying generate_report() behavior; see the method for info.
- model_name_configs: dict
- Dict specifying get_unique_model_name() behavior; see the method for info. If 'best_key_metric' is None, will default to '__max' if max_is_best, else '__min', in _validate_traingen_configs(), within _validate_model_name_configs.
- name_process_key_fn: function
- Function used within get_unique_model_name(); see the method for info.
- metric_printskip_configs: dict
- Names of train/val metrics (and their values) to omit in printing progress via TrainGenerator._print_train_progress() and TrainGenerator._print_val_progress(). Ex:
>>> {'train': 'accuracy',
...  'val': ['f1_score', 'r2_score']}
>>> # So, if train metrics are ['loss', 'accuracy'], then will only print
... # 'loss' and its value.
_DEFAULT_TRAINGEN_CFG = dict(
dynamic_predict_threshold_min_max = None,
checkpoints_overwrite_duplicates = True,
loss_weighted_slices_range = None,
pred_weighted_slices_range = None,
logs_use_full_model_name = True,
new_model_num = True,
dynamic_predict_threshold = .5, # initial
plot_first_pane_max_vals = 2,
_val_max_set_name_chars = 2,
_max_set_name_chars = 3,
predict_threshold = 0.5,
best_subset_size = None,
check_model_health = True,
max_one_best_save = True,
max_checkpoints = 5,
report_fontpath = fontsdir + "consola.ttf",
model_base_name = "model",
final_fig_dir = None,
loadskip_list = _DEFAULT_TRAINGEN_LOADSKIP_LIST,
saveskip_list = _DEFAULT_TRAINGEN_SAVESKIP_LIST,
model_save_kw = _DEFAULT_MODEL_SAVE_KW,
model_save_weights_kw = _DEFAULT_MODEL_SAVE_WEIGHTS_KW,
metric_to_alias = _DEFAULT_METRIC_TO_ALIAS,
alias_to_metric = _DEFAULT_ALIAS_TO_METRIC,
report_configs = _DEFAULT_REPORT_CFG,
model_name_configs = _DEFAULT_MODEL_NAME_CFG,
name_process_key_fn = _DEFAULT_NAME_PROCESS_KEY_FN,
metric_printskip_configs = _DEFAULT_METRIC_PRINTSKIP_CFG,
)
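Of the settings above, new_model_num has the most mechanical description: take the largest number following "M" among the directory names in logs_dir and add one. A sketch of that rule follows. It is not DeepTrain's code: `next_model_num_sketch` is a hypothetical helper, it takes names as a list rather than reading logs_dir, and returning 0 when nothing matches is an assumption.

```python
import re

def next_model_num_sketch(dir_names):
    """Return 1 + the largest N among names that start with 'M<N>'."""
    nums = []
    for name in dir_names:
        match = re.match(r"M(\d+)", name)
        if match:
            nums.append(int(match.group(1)))
    # Assumption: start from 0 when no matching directory exists.
    return max(nums) + 1 if nums else 0

model_num = next_model_num_sketch(["M15__Classifier", "M3__AutoEncoder", "notes"])
print(model_num)  # 16
```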
deeptrain.util.data_loaders module¶
-
class
deeptrain.util.data_loaders.DataLoader(path, loader, dtype=None, batch_shape=None, base_name=None, ext=None, filepaths=None)¶ Bases: object
Loads data from files to feed DataGenerator(). Builtin methods for handling various data & file formats. Is set within DataGenerator._infer_and_set_info().
- Arguments:
- path: str
- Path to directory or file from which to load data. If file, or if directory contains one file that isn’t of “opposite path” (labels_path if path == data_path, and vice versa), then uses Dataset mode of operation (see below) - else Directory.
- loader: str / function / None
- Name of builtin function, or a custom function with input signature (self, set_num). Loads data from directory, or dataset file if _is_dataset. If None, defaults to a builtin as determined by load_fn() setter.
- dtype: str / dtype
- Dtype of data to load, required by some loaders (numpy-lz4f).
- batch_shape: tuple[int]
- Full batch shape of data to load, required by some loaders (numpy-lz4f).
- base_name: str
- Name common to all filenames in directory, used to delimit by set_num (e.g. data1.npy, data2.npy, etc).
- ext: str
- Extension of file(s) to load (e.g. .npy, .h5, etc).
- filepaths: list[str]
- Paths to files to load.
Builtin loaders: see _BUILTINS
Custom loaders:
Simplest option is to inherit DataLoader and override _get_loader() to return the custom loader; DataGenerator() will handle the rest. Fully custom ones require:
- __init__ with same input signature as DataLoader.__init__.
- load_fn method with (self, set_num) input signature, loading data from a directory / file
- _get_loader method with (self, loader) input signature, returning the custom loader function
- _path method with (self, set_num) input signature, returning path to file to load
- _get_set_nums method
- _is_dataset attribute
Modes of operation:
Directory: one batch / labels per file. Filename includes set_num and base_name.
Dataset: all batches / labels in one file, each batch accessed by set_num:
- Mapping (.h5): keys must be string integers.
- Numpy array: indexed directly, so shape must be (n_batches, batch_size, *), i.e. (batches, samples, *)
-
_BUILTINS= {'csv', 'hdf5', 'numpy', 'numpy-lz4f', 'numpy-memmap'}¶
-
load_fn¶ Loads data given
set_num. Is data_loader or labels_loader in DataGenerator(), set in DataGenerator._infer_and_set_info().
Setter: if loader is:
- None passed to __init__, will set to string (one of builtins) in _init_loader() based on ext, or to None if path is None.
- String, will match to a supported builtin.
- Function, will set to the function.
-
numpy_loader(set_num)¶ For numpy arrays (.npy).
-
hdf5_loader(set_num)¶ For hdf5 (.h5) files storing data one batch per file.
data_path in DataGenerator() must contain more than one non-labels ‘.h5’ file to default to this loader.
-
csv_loader(set_num)¶ For .csv files (e.g. pandas.DataFrame).
-
numpy_lz4f_loader(set_num)¶ For numpy arrays (.npy) compressed with
lz4framed; see preprocessing.numpy_to_lz4f(). self.data_dtype must be the original (save) dtype; if there’s a mismatch, data of wrong value or shape will be decoded.
Requires data_batch_shape / labels_batch_shape attribute to be set, as compressed representation omits shape info.
-
_get_set_nums()¶ Gets
set_nums_original for DataGenerator().
- _is_dataset: will fetch from the single file based on load_fn.
- Not _is_dataset: will fetch from filenames in path, delimiting with base_name. Ex: data1.npy, data2.npy, …, and base_name = 'data' --> set_nums = [1, 2, ...].
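For the directory case, extracting set numbers from filenames by delimiting with base_name can be sketched with a regular expression. `set_nums_from_filenames_sketch` is hypothetical; it assumes integer set numbers placed directly between base_name and the extension, and the sorting of the result is also an assumption.

```python
import re

def set_nums_from_filenames_sketch(filenames, base_name, ext):
    """Collect integers N from names of the form '<base_name><N><ext>'."""
    pattern = re.compile(re.escape(base_name) + r"(\d+)" + re.escape(ext) + r"$")
    nums = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            nums.append(int(match.group(1)))
    return sorted(nums)  # assumption: ascending order

files = ['data2.npy', 'data1.npy', 'labels.h5', 'data10.npy']
set_nums = set_nums_from_filenames_sketch(files, 'data', '.npy')
print(set_nums)  # [1, 2, 10]
```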
deeptrain.util.experimental module¶
-
deeptrain.util.experimental.deepcopy_v2(obj, item_fn=None, skip_flag=42069, debug_verbose=False)¶ Enables customized copying of a nested iterable, mediated by
item_fn.
-
deeptrain.util.experimental.extract_pickleable(obj, skip_flag=42069)¶ Given an arbitrarily nested dict / mapping, make its copy containing only objects that can be pickled. Excludes functions and class instances, even though most such can be pickled (# TODO). Utilizes
deepcopy_v2().
-
deeptrain.util.experimental.exclude_unpickleable(obj)¶ Given an arbitrarily nested dict / mapping, make its copy containing only objects that can be pickled. Excludes functions and class instances, even though most such can be pickled (# TODO). Utilizes
deep_isinstance().
deeptrain.util.logging module¶
-
deeptrain.util.logging.save_report(self, savepath=None)¶ Saves model,
TrainGenerator, 'datagen', and 'val_datagen' attributes and values as an image-text, text generated with generate_report().
Text font is set from TrainGenerator.report_fontpath, which defaults to a programming-style consolas for consistent vertical and horizontal alignment.
If savepath is None, will save a temp report in TrainGenerator.logdir + '_temp_model__report.png'.
-
deeptrain.util.logging.generate_report(self)¶ Generates
model, TrainGenerator(), datagen, and val_datagen reports according to report_configs.
Writes attributes and values in three columns of text, converted to and saved as an image. Extracts information from model_configs and vars() of TrainGenerator, datagen, and val_datagen. Useful for snapshotting key model, training, and data attributes for quick reference.
report_configs is structured as follows:
>>> {target_0:
...     {filter_spec_0:
...         [attr0, attr1, ...],
...     },
...     {filter_spec_1:
...         [attr0, attr1, ...],
...     },
... }
- target is one of: 'model', 'traingen', 'datagen', 'val_datagen'; it may be a tuple of these, in which case filter_spec applies to those included.
- filter_spec is one of: 'include', 'exclude', 'exclude_types', but cannot include 'include' and 'exclude' at once.
  - 'include': names of attributes to include in report; no other attribute will be included. Supports wildcards with '*' as leading character; e.g. '*_has_' will include all names starting with _has_.
  - 'exclude': names of attributes to exclude from report; all other attributes will be included. Also supports wildcards.
  - 'exclude_types': attribute types to exclude from report. Elements of this list cannot be strings, unless prepended with '#', which specifies an exception. E.g. if attr is dict, and dict is in the list, then it can be kept in report by including '#attr' in the list.
See _DEFAULT_REPORT_CFG for the default report_configs, containing every possible config case.
-
deeptrain.util.logging.get_unique_model_name(self, set_model_num=True)¶ Returns a unique model name, prepended by f"M{model_num}__{model_base_name}". If set_model_num, also sets model_num.
Name is generated by extracting info from model_configs according to model_name_configs (insertion-order sensitive), name_process_key_fn, and new_model_num.
- name_process_key_fn(key, alias, attrs): returns a string representation of key TrainGenerator attribute or model_configs key and its value. Below is a description of the default function, _DEFAULT_NAME_PROCESS_KEY_FN(), but custom implementations are supported:
  - key: name of attribute (and its value) to encode. Can get attribute of TrainGenerator object via '.', e.g. 'datagen.batch_size'.
  - alias: replaces key if not None
  - attrs: dict of object attribute-value pairs, where “object” is either TrainGenerator or an object that is its attribute.
Example: (using default name_process_key_fn)
>>> model_base_name == "AutoEncoder"
>>> model_num == 8
>>> model_name_configs == {"datagen.batch_size": "BS",
...                        "filters": None,
...                        "optimizer": "",
...                        "lr": "",
...                        "best_key_metric": "__max"}
>>> model_configs = {"conv_filters": [32, 64],
...                  "lr": 0.0002,
...                  "optimizer": tf.keras.optimizers.SGD}
>>> tg.best_key_metric == .97512  # TrainGenerator
>>> tg.datagen.batch_size == 32
...
... # will yield
>>> "M8__AutoEncoder-BS32-filters32_64-SGD-2e-4__max.975"
Note that if new_model_num is True, then will set to +1 the max number after "M" for directory names in logs_dir; e.g. if such a directory is "M15__Classifier", will use "M16", and set model_num = 16.
If an object is passed to model_configs, its .__name__ will be used as “value”; if this attribute is missing, will raise exception (default fn).
Note that “unique” doesn’t mean yielding a new name with each call to the function; for a name to be new, either a directory as described should have a higher "M{num}", or other sources of information must change values (e.g. TrainGenerator attributes, like best_key_metric).
-
deeptrain.util.logging.get_last_log(self, name, best=False)¶ Returns latest savefile path from
logdir (best=False) or best_models_dir (best=True). name is one of: 'report', 'state', 'weights', 'history', 'init_state'. 'init_state' ignores best (uses =False).
-
deeptrain.util.logging._log_init_state(self, kwargs={}, source_lognames='__main__', savedir=None, to_exclude=[], verbose=0)¶ Extract
self.__dict__ key-value pairs as string, ignoring funcs/methods or getting their source codes. May include kwargs passed to __init__ via kwargs, and execution script’s source code via '__main__' in source_lognames.
- Arguments:
- kwargs: dict
- kwargs passed to self’s __init__, in case they weren’t set to self or were changed later.
- source_lognames: list[str] / str
- Names of self method attributes to get source code of. If includes ‘__main__’, will get source code of execution script.
- savedir: str.
- Path to directory where to save logs. Saves a .json of self dict, and .txt of source codes (if any).
- to_exclude: list[str] / str
- Names of attributes to exclude from logging.
- verbose: bool / int[bool]
- Print save messages if successful.
deeptrain.util.misc module¶
-
deeptrain.util.misc.pass_on_error(fn, *args, **kwargs)¶
-
deeptrain.util.misc.try_except(try_fn, except_fn)¶
-
deeptrain.util.misc.argspec(obj)¶ Unreliable with wrapped functions.
-
deeptrain.util.misc.get_module_methods(module)¶
-
deeptrain.util.misc.capture_args(fn)¶ Capture bound method arguments without changing its input signature. Method must have a
**kwargs to append captured arguments to.
Non-literal types and objects will be converted to their string representation (or __qualname__ or __name__ if they possess it).
-
deeptrain.util.misc._init_optimizer(model, class_weights=None, input_as_labels=False, alias_to_metric_name_fn=None)¶ Instantiates optimizer (and maybe trainer), but does NOT train (update weights).
-
deeptrain.util.misc._make_plot_configs_from_metrics(self)¶ Makes default
plot_configs, building on configs._PLOT_CFG; see get_history_fig(). Validates some configs and tries to fill others.
Ensures every iterable config is of same len() as number of metrics in 'metrics', by extending last value of iterable to match the len. Ex:
>>> {'metrics': {'val': ['loss', 'accuracy', 'f1']},
...  'linestyle': ['--', '-'],  # -> ['--', '-', '-']
... }
Assigns colors to metrics based on a default cycling coloring scheme, with some predefined customs (look for _customs_map in source code).
Configures up to two plot panes, mediated by plot_first_pane_max_vals; if number of metrics in 'metrics' exceeds it, then a second pane is used. Can be used to configure how many metrics to draw in first pane; useful for managing clutter.
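The "extend the last value to match the number of metrics" rule can be sketched as follows. `extend_to_len_sketch` is a hypothetical helper, not the library function; it leaves lists that are already long enough unchanged, which is an assumption.

```python
def extend_to_len_sketch(values, n):
    """Repeat the last element of `values` until it has at least `n` items."""
    values = list(values)
    if not values:
        raise ValueError("cannot extend an empty config")
    # Assumption: longer-than-needed lists are returned as-is.
    return values + [values[-1]] * max(0, n - len(values))

metrics = ['loss', 'accuracy', 'f1']
linestyle = extend_to_len_sketch(['--', '-'], len(metrics))
print(linestyle)  # ['--', '-', '-']
```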
-
deeptrain.util.misc._validate_traingen_configs(self)¶ Ensures various attributes are properly configured, and attempts correction where possible.
-
deeptrain.util.misc.append_examples_dir_to_sys_path()¶ Enables utils.py to be imported for examples.
deeptrain.util.preprocessors module¶
-
class
deeptrain.util.preprocessors.Preprocessor¶ Bases: object
Abstract base class for preprocessors, outlining required methods for operability with DataGenerator().
The following attributes are “synched” with DataGenerator(): batch_loaded, batch_exhausted, slices_per_batch, slice_idx. Setter and getter are implemented to set and get these attributes from the preprocessor, so they are always the same for Preprocessor and DataGenerator.
-
process(batch, labels)¶ Required to implement; must return (batch, labels). Can apply arbitrary preprocessing steps, or return as-is. Is called within DataGenerator.get().
-
update_state()¶ Optional to implement; must involve setting batch_exhausted and batch_loaded attributes to True or False.
-
reset_state()¶ Optional to implement. Can be used to reset attributes specific to the preprocessor. Is called within DataGenerator.reset_state().
-
on_epoch_end(epoch)¶ Optional to implement. Can be used to do things at end of epoch. Is called within DataGenerator.on_epoch_end(), which is called by _on_epoch_end within TrainGenerator._train_postiter_processing() or TrainGenerator._val_postiter_processing().
-
_validate_configs()¶ Internal method to validate slices_per_batch in DataGenerator._set_preprocessor().
-
-
class
deeptrain.util.preprocessors.TimeseriesPreprocessor(window_size, slide_size=None, start_increments=None, loadskip_list=None)¶ Bases: deeptrain.util.preprocessors.Preprocessor
Stateful preprocessor breaking up batches into “windows”.
- Arguments:
- window_size: int
- Length of each window (dim1), or number of timesteps per slice.
- slide_size: int
- Number of timesteps by which to slide the window.
- start_increments: int
- Number of timesteps by which to increment each window when fetching.
- loadskip_list: dict / None
- Attributes to skip on TrainGenerator.load(). Defaults to ['start_increments', 'window_size', 'slide_size'].
A “slice” here is a “window”, and slices_per_batch is the number of such windows per batch.
Examples:
Each window in windows is from calling _next_window(); changing start & end requires calling update_state().
>>> batch.shape == (32, 100, 4)
...
>>> window_size, slide_size, start_increment = (25, 25, 0)
>>> slices_per_batch == 4
>>> windows == [batch[:, :25],     # slice_idx = 0 (window 1)
...             batch[:, 25:50],   # slice_idx = 1 (window 2)
...             batch[:, 50:75],   # slice_idx = 2 (window 3)
...             batch[:, 75:100]]  # slice_idx = 3 (window 4)
...
>>> window_size, slide_size, start_increment = (25, 25, 10)
>>> slices_per_batch == 3
>>> windows == [batch[:, 10:35],   # slice_idx = 0 (window 1)
...             batch[:, 35:60],   # slice_idx = 1 (window 2)
...             batch[:, 60:85]]   # slice_idx = 2 (window 3)
...
>>> window_size, slide_size, start_increment = (25, 10, 0)
>>> slices_per_batch == 8
>>> windows == [batch[:, :25],     # slice_idx = 0 (window 1)
...             batch[:, 10:35],   # slice_idx = 1 (...)
...             batch[:, 20:45],   # slice_idx = 2
...             batch[:, 30:55],   # slice_idx = 3
...             batch[:, 40:65],   # slice_idx = 4
...             batch[:, 50:75],   # slice_idx = 5
...             batch[:, 60:85],   # slice_idx = 6
...             batch[:, 70:95]]   # slice_idx = 7 (window 8)
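The three examples above follow one rule: start at start_increment, advance by slide_size, and stop once a full window no longer fits. The sketch below reproduces only that arithmetic; `window_bounds_sketch` is a hypothetical helper, not a method of TimeseriesPreprocessor, and it works on timestep counts rather than on actual batches.

```python
def window_bounds_sketch(n_timesteps, window_size, slide_size, start_increment=0):
    """Return (start, end) index pairs of all full windows along the time axis."""
    if slide_size <= 0:
        raise ValueError("slide_size must be positive")
    bounds = []
    start = start_increment
    while start + window_size <= n_timesteps:
        bounds.append((start, start + window_size))
        start += slide_size
    return bounds

# The three documented cases, for a batch with 100 timesteps:
case_a = window_bounds_sketch(100, 25, 25, 0)
case_b = window_bounds_sketch(100, 25, 25, 10)
case_c = window_bounds_sketch(100, 25, 10, 0)
print(len(case_a), len(case_b), len(case_c))  # 4 3 8
```

Here `len(...)` of each result plays the role of slices_per_batch, and each pair `(start, end)` corresponds to `batch[:, start:end]`.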
-
process(batch, labels)¶ Return next batch window, and unchanged labels.
-
_next_window(batch)¶ Fetches temporal slice according to window_size, slide_size, start_increment, and slice_idx; see TimeseriesPreprocessor() for examples.
-
reset_state()¶ Set slice_idx = 0.
-
on_epoch_end(epoch)¶ Update slices_per_batch, and start_increment if start_increments is not None.
-
update_state()¶ Increment slice_idx by 1; if slice_idx == slices_per_batch, set batch_exhausted = True, batch_loaded = False.
-
start_increment¶ Sliding window start increment; see help(TimeseriesPreprocessor).
-
class
deeptrain.util.preprocessors.GenericPreprocessor(loadskip_list=None)¶ Bases: deeptrain.util.preprocessors.Preprocessor
Minimal Preprocessor; does nothing to batch or labels, but maintains batch_exhausted and batch_loaded logic.
-
process(batch, labels)¶ Return batch and labels as-is.
-
deeptrain.util.saving module¶
-
deeptrain.util.saving._save_best_model(self, del_previous_best=None)¶ Saves
model, history fig, TrainGenerator state, and report if latest key metric is a new best (max/min, as per max_is_best). Also deletes previous best saves if max_one_best_save (as called by TrainGenerator._on_val_end(), and defaulting to it if None is passed).
-
deeptrain.util.saving.checkpoint(self, forced=False, overwrite=None)¶ Saves
TrainGenerator state (including both DataGenerators), report, history fig, and model (weights and, if configured to, optimizer state and architecture).
- Arguments:
- forced: bool
- If True, will checkpoint whether or not temp / unique checkpoint freq’s were met.
- overwrite: bool / None
- If None, set from checkpoints_overwrite_duplicates. If True, will overwrite existing checkpoint files if having same name as one generated with current checkpoint - else, will make unique names by incrementing ‘_v2’, ‘_v3’, etc.
Saves according to temp_checkpoint_freq and unique_checkpoint_freq. See _should_do(). See save() on how model and TrainGenerator state are saved. Additionally, will update model_name if best_key_metric was updated.
“Unique” checkpoint will generate save files with latest best_key_metric and _times_validated, and with full model_name if logs_use_full_model_name.
-
deeptrain.util.saving.save(self, savepath=None)¶ Save
TrainGenerator state (including both DataGenerators), and if configured to, model weights and optimizer state. Applies callbacks with stage='save' before saving to file.
- Arguments:
- savepath: str / None
- File path to save to. If None, will set to
logdir + '_temp_model__state.h5'. Internally, savepath is passed meaningfully by checkpoint() and _save_best_model().
Saving TrainGenerator state:
Configured with
saveskip_list, and DataGenerator.saveskip_list for datagen and val_datagen; any attribute not included in the lists will be saved (except objects that cannot be pickled, which will raise PickleError. Callables (e.g. functions) are excluded automatically).
Saving optimizer state:
Configured with
optimizer_save_configs, in the below structure; only one of 'include', 'exclude' can be set.
>>> {'include':  # optimizer attributes to include
...      ['weights', 'learning_rate']  # will ONLY save these
... }
>>> {'exclude':  # optimizer attributes to exclude
...      ['updates', 'epsilon']  # will save everything BUT these
... }
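How an 'include' / 'exclude' config might select attributes can be sketched as follows. select_optimizer_attrs and the attrs dict are hypothetical, written only to illustrate the rule that exactly one of the two keys may be set; raising ValueError when both are given is an assumption.

```python
def select_optimizer_attrs(attrs, configs):
    """Filter a dict of optimizer attributes by an include/exclude config."""
    if not configs:
        return dict(attrs)
    if 'include' in configs and 'exclude' in configs:
        raise ValueError("set only one of 'include', 'exclude'")
    if 'include' in configs:
        # Keep ONLY the listed attributes.
        return {k: v for k, v in attrs.items() if k in configs['include']}
    # Keep everything BUT the listed attributes.
    return {k: v for k, v in attrs.items()
            if k not in configs.get('exclude', [])}


attrs = {'weights': [0.1, 0.2], 'learning_rate': 1e-3, 'epsilon': 1e-7}
included = select_optimizer_attrs(attrs, {'include': ['weights', 'learning_rate']})
excluded = select_optimizer_attrs(attrs, {'exclude': ['updates', 'epsilon']})
```

Names listed under 'exclude' that are absent from the attributes (here 'updates') are simply ignored in this sketch.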
Note:
checkpoint() or _save_best_model() called from within TrainGenerator._on_val_end() will set _save_from_on_val_end=True, which will then set validation flags so as to not repeat the call to _on_val_end upon loading TrainGenerator.
-
deeptrain.util.saving._get_optimizer_state(self)¶ Get optimizer attributes to save, according to
optimizer_save_configs; helper method to save().
-
deeptrain.util.saving.load(self, filepath=None, passed_args=None)¶ Loads
TrainGenerator state (including both DataGenerators), and if configured to, model optimizer attributes, and instantiates optimizer (but not model architecture). Instantiates callbacks, and applies them with stage='load'. Preloads data from datagen and val_datagen.
- Arguments:
- filepath: str / None
File path to load from. If None:
- logdir is None, or is not a directory: raises ValueError.
- logdir is a directory: sets filepath to the latest file with name ending with '__state.h5'; if there is no such file, raises ValueError.
- passed_args: dict / None
- Passed within
TrainGenerator.__init__() as arguments given by user (not defaults) to __init__. Along with loadskip_list, mediates which attributes are loaded (see below).
Loading TrainGenerator state:
Configured with
loadskip_list, DataGenerator.loadskip_list and DataGenerator.preprocessor.loadskip_list for datagen and val_datagen; any attribute not included in the lists will be loaded. 'model' is always skipped from loading as part of pickled file, since it’s never saved via save().
- If loadskip_list == 'auto' or falsy (e.g. None), will default it to passed_args.
- If passed_args is falsy, defaults to [] (load all).
- If '{auto}' is in loadskip_list, then will append passed_args to loadskip_list, and pop '{auto}'.
- Will omit 'datagen' & 'val_datagen' from passed_args; only way to skip them is via self.loadskip_list.
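One way to read the four rules above is sketched below. resolve_loadskip_list is a hypothetical helper, not the library's function; it assumes loadskip_list is a list, 'auto', or None, and that only the keys of passed_args matter.

```python
def resolve_loadskip_list(loadskip_list, passed_args):
    """Return attribute names to skip when loading, per the rules above."""
    # Names the user passed explicitly, never including the two generators.
    passed = [k for k in (passed_args or {})
              if k not in ('datagen', 'val_datagen')]
    if loadskip_list == 'auto' or not loadskip_list:
        return passed  # empty list (load all) if nothing was passed
    resolved = list(loadskip_list)
    if '{auto}' in resolved:
        # Replace the placeholder with the explicitly passed names.
        resolved.remove('{auto}')
        resolved += [k for k in passed if k not in resolved]
    return resolved


auto_case = resolve_loadskip_list('auto', {'epochs': 5, 'datagen': object()})
empty_case = resolve_loadskip_list(None, None)
mixed_case = resolve_loadskip_list(['{auto}', 'model_name'], {'epochs': 5})
explicit_case = resolve_loadskip_list(['datagen'], {'epochs': 5})
```

In the last call, 'datagen' is skipped only because it was listed explicitly, which mirrors the final rule.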
Loading optimizer state:
Configured with
optimizer_load_configs, in the below structure; only one of 'include', 'exclude' can be set.
>>> {'include':  # optimizer attributes to include
...      ['weights', 'learning_rate']  # will ONLY load these
... }
>>> {'exclude':  # optimizer attributes to exclude
...      ['updates', 'epsilon']  # will load everything BUT these
... }
-
deeptrain.util.saving._load_optimizer_state(self)¶ Sets optimizer attributes from
self.optimizer_state, according to optimizer_load_configs; helper method to load(). Is called internally by load(). optimizer_state is set to None to free memory afterwards.
-
deeptrain.util.saving._save_history_fig(self, savepath=None)¶ Saves
_history_fig. Does nothing if _history_fig is falsy (e.g. None).
- Arguments:
- savepath: str / None
- Path to save figure to. If None, will set to
logdir + '_temp_model__hist.png'. If None and final_fig_dir is set, will use full model name instead.
If
final_fig_dir is set, will also save (at most one per model_num) figure there using full model name; if there are existing savefiles (.png) with same model_num, will delete them. Intended to document latest state of history.
deeptrain.util.searching module¶
Algorithms for searching combinations of various metric hyperparameters, like classification prediction threshold, and best-performing subset of batches.
Useful for e.g. tuning a classifier’s prediction threshold on validation data, or tracking classifier calibration.
-
deeptrain.util.searching.find_best_predict_threshold(labels, preds, metric_fn, search_interval=0.01, search_min_max=(0, 1), max_is_best=True, return_best_metric=False, threshold_preference=0.5, verbosity=0)¶ Finds best scalar prediction threshold for an arbitrary metric function.
- Arguments:
- labels: np.ndarray
- Labels. All samples must be along dim0.
- preds: np.ndarray
- Predictions. All samples must be along dim0.
- metric_fn: function
- Metric function with input signature
(labels, preds, predict_threshold).
- search_interval: float
- Amount by which to increment predict_threshold from min to max of search_min_max in search of best threshold.
- search_min_max: tuple[float]
- Search bounds for best predict_threshold.
- max_is_best: bool
- “Best” means maximum if True, else minimum (metric).
- return_best_metric: bool
- If True, will also return metric_fn evaluated at the found best threshold. Default False.
- threshold_preference: float
- If multiple thresholds yield the best metric found, select the one closest to this value.
- verbosity: int in (0, 1, 2)
- 1: print found best predict threshold and metric.
- 2: print a table of (threshold, metric, best metric) for every threshold computed.
- 0: don’t print anything.
- Returns:
- best_th: float
- Best prediction threshold
- best_metric: float
- Metric resulting from
best_th; only returned if return_best_metric is True (default False).
Finds best threshold by trying every threshold from min to max of
search_min_max, incremented by search_interval (grid search).
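The grid search described above can be sketched with NumPy as follows. This is an illustrative re-implementation under stated assumptions, not the library's code: grid_search_threshold and the toy accuracy metric are hypothetical names, and ties are resolved by distance to threshold_preference as the argument description suggests.

```python
import numpy as np


def grid_search_threshold(labels, preds, metric_fn, search_interval=0.01,
                          search_min_max=(0, 1), max_is_best=True,
                          threshold_preference=0.5):
    """Try every threshold on a regular grid; return (best_th, best_metric)."""
    lo, hi = search_min_max
    n_steps = int(round((hi - lo) / search_interval)) + 1
    thresholds = np.linspace(lo, hi, n_steps)
    scores = np.array([metric_fn(labels, preds, th) for th in thresholds])
    best = scores.max() if max_is_best else scores.min()
    # Among thresholds tied for best, prefer the one nearest the preference.
    candidates = thresholds[np.isclose(scores, best)]
    best_th = candidates[np.argmin(np.abs(candidates - threshold_preference))]
    return float(best_th), float(best)


def accuracy(labels, preds, predict_threshold):
    # Toy metric with the (labels, preds, predict_threshold) signature.
    return float(np.mean((preds > predict_threshold) == labels))


labels = np.array([0, 0, 1, 1])
preds = np.array([0.1, 0.4, 0.6, 0.9])
best_th, best_metric = grid_search_threshold(labels, preds, accuracy)
```

Here every threshold from 0.4 up to (but excluding) 0.6 classifies all four samples correctly, so the tie-break picks 0.5.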
-
deeptrain.util.searching.find_best_subset(labels_all, preds_all, metric_fn, search_interval=0.01, search_min_max=(0, 1), max_is_best=True, subset_size=5)¶ Finds subset of batches yielding the best
metric_fn.
- Arguments:
- labels_all: list[np.ndarray]
- Labels in
(batches, samples, *) format, as arranged within deeptrain.util.training.
- preds_all: list[np.ndarray]
- Predictions in (batches, samples, *) format, same as labels_all.
- metric_fn: function
- Metric function with input signature
(labels, preds, predict_threshold).
- search_interval: float
- Amount by which to increment predict_threshold from min to max of search_min_max in search of best threshold.
- search_min_max: tuple[float]
- Search bounds for best predict_threshold.
- max_is_best: bool
- “Best” means maximum if True, else minimum (metric).
- subset_size: int
- Size of best subset to find; must be
<= len(labels_all) (though == is pointless, since the subset would then be the entire set).
- Returns:
- best_batch_idxs: list[int]
- Indices of best batches, w.r.t. original labels_all & preds_all.
- best_th: float
- Prediction threshold yielding best score on the found best subset.
- best_metric: float
- Metric computed on the found best subset using best_th.
Uses progressive elimination, and is not guaranteed to find the true best subset. Algorithm:
1. Feed all labels & preds to metric_fn.
2. Remove the “best” scoring batch from the list of labels & preds.
3. Repeat 1-2 subset_size number of times.
-
deeptrain.util.searching.find_best_subset_from_history(metric, subset_size=5, max_is_best=True)¶ Finds subset of batches yielding the best metric, given pre-computed metrics. Simply orders metrics best-to-worst, and returns top
subset_size of them. Exact.
- Arguments:
- metric: list[float]
- List of pre-computed metrics, arranged as
(batches, slices), or (batches,); if former, will collapse slices as a mean.
- max_is_best: bool
- “Best” means maximum if True, else minimum (metric).
- Returns:
- list[int]: indices of best batches.
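Because the metrics are pre-computed, the selection reduces to a sort. The sketch below is a hypothetical stand-in (best_subset_from_history is not the library function) that collapses slices by their mean and returns the indices of the top subset_size batches.

```python
import numpy as np


def best_subset_from_history(metric, subset_size=5, max_is_best=True):
    """Order per-batch metrics best-to-worst and return the top indices."""
    # np.mean of a scalar is the scalar, so (batches,) input also works.
    per_batch = np.array([np.mean(m) for m in metric], dtype=float)
    order = np.argsort(per_batch, kind="stable")  # ascending
    if max_is_best:
        order = order[::-1]
    return [int(i) for i in order[:subset_size]]


# Four batches, two slices each; per-batch means are 0.6, 0.9, 0.3, 0.7.
metric = [[0.5, 0.7], [0.9, 0.9], [0.2, 0.4], [0.8, 0.6]]
top_max = best_subset_from_history(metric, subset_size=2)
top_min = best_subset_from_history(metric, subset_size=2, max_is_best=False)
```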
deeptrain.util.training module¶
-
deeptrain.util.training._update_temp_history(self, metrics, val=False)¶ Updates temporary history given
metrics. If using batch-slices, ensures entries are grouped appropriately.
- Gets train (val=False) or val (val=True) metrics, and validates that len(metrics) of the returned list matches the length of train_metrics or val_metrics.
- Validates that each element of metrics is numeric (e.g. float).
-
deeptrain.util.training.get_sample_weight(self, class_labels, val=False, slice_idx=None, force_unweighted=False)¶ Make
sample_weight to feed to model based on class_labels (and, if applicable, weighted slices).
- Arguments:
- class_labels: np.ndarray
Classification class labels to map sample weights according to
class_weights.
>>> class_weights = {0: 4, 1: 1, 2: 2.5}
>>> class_labels == [1, 1, 0, 1, 2]  # if
>>> sample_weight == [1, 1, 4, 1, 2.5]  # then
- val: bool
- Whether to use
class_weights (False) or val_class_weights (True). If using sliced weights, whether to use slices_per_batch of datagen (False) or val_datagen (True). Further, if True, will get weighted sample_weight if either of loss_weighted_slices_range or pred_weighted_slices_range is set (False requires the former to be set).
- slice_idx: int / None
- Index of slice to get
sample_weight for, if using slices. If None, will return all slices_per_batch number of sample_weight.
- force_unweighted: bool
- Get slice-unweighted
sample_weight regardless of slice usage; used internally by _get_weighted_sample_weight() to break recursion.
-
deeptrain.util.training._get_weighted_sample_weight(self, class_labels_all, val=False, weight_range=(0.5, 1.5), slice_idx=None)¶ Gets
slices_per_batch number of sample_weight, scaled linearly from min to max of weight_range, over slices_per_batch number of steps.
>>> weight_range == (0.5, 1.5)
>>> class_weights == {0: 1, 1: 5}  # val = False
>>> slices_per_batch == 3
>>> slice_idx == None  # get all
>>> class_labels_all == [[0, 0, 1], [0, 0, 1], [0, 1, 1]]
...
>>> [[0.5, 0.5, 2.5],
...  [1.0, 1.0, 5.0],
...  [1.5, 7.5, 7.5]]
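The example values can be reproduced with a few lines of NumPy. This is a sketch of the arithmetic only, assuming one row of class labels per slice; it does not call into the library.

```python
import numpy as np

weight_range = (0.5, 1.5)
class_weights = {0: 1, 1: 5}
slices_per_batch = 3
class_labels_all = [[0, 0, 1], [0, 0, 1], [0, 1, 1]]  # one row per slice

# Slice-unweighted sample weights: look up each label's class weight.
unweighted = np.array([[class_weights[c] for c in row]
                       for row in class_labels_all], dtype=float)

# One scale factor per slice, spaced linearly over weight_range.
slice_scales = np.linspace(*weight_range, slices_per_batch)

# Broadcast each slice's factor across that slice's samples.
weighted = unweighted * slice_scales[:, None]
```

The three slices are scaled by 0.5, 1.0 and 1.5 respectively, so later slices contribute more.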
-
deeptrain.util.training._set_predict_threshold(self, predict_threshold, for_current_iter=False)¶ Set
predict_threshold and maybe dynamic_predict_threshold.
-
deeptrain.util.training._get_val_history(self, for_current_iter=False)¶ Compute validation metrics from history (‘evaluate’-mode), or cache (‘predict’-mode).
for_current_iter is True when inside of validate() loop, fitting individual batches/slices, and is False in _on_val_end(), where metrics over the entire validation dataset are computed. In the latter, best subset is found (if applicable).
-
deeptrain.util.training._get_best_subset_val_history(self)¶ Returns history entry for best
best_subset_size number of validation batches, and sets best_subset_nums.
Ex: given 10 batches, a “best subset” of 5 is the set of 5 batches that yields the best (highest/lowest depending on max_is_best) key_metric. Useful for model ensembling in specializing member models on different parts of data.
-
deeptrain.util.training._compute_metric(self, data, metric_name=None, metric_fn=None)¶ Compute metric given labels, preds, and sample weights or prediction threshold where applicable - and metric name or function.
-
deeptrain.util.training._compute_metrics(self, labels_all_norm, preds_all_norm, sample_weight_all)¶ Computes metrics from labels, predictions, and sample weights, via
_compute_metric().
Iterates over metric names in val_metrics:
- name == 'loss': fetches loss name from model.loss, then function from deeptrain.util.metrics, computes metric, and adds model weight penalty loss (L1/L2). Loss from model.evaluate() may still differ, as no other regularizer loss is accounted for.
- name == key_metric: computes metric with key_metric_fn.
- name in custom_metrics: computes metric with custom_metrics[name] function.
- name none of the above: passes name and model.loss to _compute_metric().
Ensures computed metrics are scalars (numbers, instead of lists, tuples, etc).
-
deeptrain.util.training._unroll_into_samples(out_ndim, *arrs)¶ - Flatten samples, slices, and batches dims into one:
- (batches, slices, samples, *output_shape) -> (batches * slices * samples, *output_shape)
- (batches, samples, *output_shape) -> (batches * samples, *output_shape)
*arrs are standardized (fed after _transform_eval_data()), so the minimal case is (1, 1, *output_shape), which still correctly reshapes into (1, *output_shape). Cases:
>>> # (32, 1)       -> (32, 1)
... # (1, 32, 1)    -> (32, 1)
... # (1, 1, 32, 1) -> (32, 1)
... # (1, 3, 32, 1) -> (96, 1)
... # (2, 3, 32, 1) -> (192, 1)
-
deeptrain.util.training._transform_eval_data(self, labels_all, preds_all, sample_weight_all, class_labels_all, return_as_dict=True, unroll_into_samples=True)¶ Prepare data for feeding to metrics computing methods.
- Standardize labels and preds shapes to the expected (batches, *model.output_shape), or (batches, slices, *model.output_shape) if slices are used. See _validate_data_shapes().
- Standardize sample_weight and class_labels shapes. See _validate_class_data_shapes().
- Unroll data into samples (merge batches, slices, and samples dims). See _unroll_into_samples().
-
deeptrain.util.training._weighted_normalize_preds(self, preds_all)¶ Given batch-slices predictions, “weighs” binary (sigmoid) class predictions linearly according to
pred_weighted_slices_range, in slices_per_batch steps. In effect, 0-class predictions from later slices are weighted greater than those from earlier, and likewise for 1-class.
Norm logic: “0-class” is defined as predicting <0.5. 0.5 is subtracted from all predictions, so that 0-class preds are negative, and 1-class are positive. Preds are then scaled according to slice weights, so that greater weights correspond to more negative and more positive values. Preds are then shifted back to original [0, 1]. More negative values in shifted domain thus correspond to values closer to 0 in original domain, in a manner that weighs 0-class preds and 1-class preds equally.
-
deeptrain.util.training._validate_data_shapes(self, data, val=True, validate_n_slices=True, validate_last_dims_match_outs_shape=True, validate_equal_shapes=True)¶ Ensures
data entries are shaped (batches, *model.output_shape), or (batches, slices, *model.output_shape) if using slices.
- Validate batch_size, and that it’s common to every batch/slice.
- Validate slices_per_batch, and that it’s common to every batch/slice, if validate_n_slices.
- Arguments:
- data: dict[str: np.ndarray]
{'labels_all': labels_all, 'preds_all': preds_all}. Passed as self-naming dict to improve code readability in exception handling.
- val: bool
- Only relevant with validate_n_slices==True; if True, gets slices_per_batch from val_datagen; else, from datagen.
- validate_n_slices: bool
- (Default True) is set False when slice_idx is not None in _get_weighted_sample_weight, which occurs during validate() when processing individual batch-slices. slice_idx is None in _on_val_end().
- validate_last_dims_match_outs_shape: bool
- See _validate_class_data_shapes().
- validate_equal_shapes: bool
- See _validate_class_data_shapes().
-
deeptrain.util.training._validate_class_data_shapes(self, data, val=True, validate_n_slices=False)¶ Standardize
sample_weight and class_labels data. Same as _validate_data_shapes(), except skips two validations:
- _validate_last_dims_match_outs_shape; for class_labels, model output shapes can be same as input shapes as with autoencoders, but inputs can still have class labels, subjecting to sample_weight. sample_weight often won’t share model output shape, as with e.g. multiclass classification, where individual classes aren’t weighted.
- _equal_shapes; per above, data entries may not have equal shapes.
deeptrain.util._traingen_utils module¶
-
class
deeptrain.util._traingen_utils.TraingenUtils¶ Bases:
object
-
checkpoint(forced=False, overwrite=None)¶ Saves
TrainGenerator state (including both DataGenerators), report, history fig, and model (weights and, if configured to, optimizer state and architecture).
- Arguments:
- forced: bool
- If True, will checkpoint whether or not temp / unique checkpoint freq’s were met.
- overwrite: bool / None
- If None, set from
checkpoints_overwrite_duplicates. If True, will overwrite existing checkpoint files that have the same name as one generated by the current checkpoint; otherwise, will make names unique by appending ‘_v2’, ‘_v3’, etc.
Saves according to
temp_checkpoint_freq and unique_checkpoint_freq. See _should_do(). See save() on how model and TrainGenerator state are saved. Additionally, will update model_name if best_key_metric was updated.
“Unique” checkpoint will generate save files with latest best_key_metric and _times_validated, and with full model_name if logs_use_full_model_name.
-
save(savepath=None)¶ Save
TrainGenerator state (including both DataGenerators), and if configured to, model weights and optimizer state. Applies callbacks with stage='save' before saving to file.
- Arguments:
- savepath: str / None
- File path to save to. If None, will set to
logdir + '_temp_model__state.h5'. Internally, savepath is passed meaningfully by checkpoint() and _save_best_model().
Saving TrainGenerator state:
Configured with
saveskip_list, and DataGenerator.saveskip_list for datagen and val_datagen; any attribute not included in the lists will be saved (except objects that cannot be pickled, which will raise PickleError. Callables (e.g. functions) are excluded automatically).
Saving optimizer state:
Configured with
optimizer_save_configs, in the below structure; only one of 'include', 'exclude' can be set.
>>> {'include':  # optimizer attributes to include
...      ['weights', 'learning_rate']  # will ONLY save these
... }
>>> {'exclude':  # optimizer attributes to exclude
...      ['updates', 'epsilon']  # will save everything BUT these
... }
Note:
checkpoint() or _save_best_model() called from within TrainGenerator._on_val_end() will set _save_from_on_val_end=True, which will then set validation flags so as to not repeat the call to _on_val_end upon loading TrainGenerator.
-
load(filepath=None, passed_args=None)¶ Loads
TrainGenerator state (including both DataGenerators), and if configured to, model optimizer attributes, and instantiates optimizer (but not model architecture). Instantiates callbacks, and applies them with stage='load'. Preloads data from datagen and val_datagen.
- Arguments:
- filepath: str / None
File path to load from. If None:
- logdir is None, or is not a directory: raises ValueError.
- logdir is a directory: sets filepath to the latest file with name ending with '__state.h5'; if there is no such file, raises ValueError.
- passed_args: dict / None
- Passed within
TrainGenerator.__init__() as arguments given by user (not defaults) to __init__. Along with loadskip_list, mediates which attributes are loaded (see below).
Loading TrainGenerator state:
Configured with
loadskip_list, DataGenerator.loadskip_list and DataGenerator.preprocessor.loadskip_list for datagen and val_datagen; any attribute not included in the lists will be loaded. 'model' is always skipped from loading as part of pickled file, since it’s never saved via save().
- If loadskip_list == 'auto' or falsy (e.g. None), will default it to passed_args.
- If passed_args is falsy, defaults to [] (load all).
- If '{auto}' is in loadskip_list, then will append passed_args to loadskip_list, and pop '{auto}'.
- Will omit 'datagen' & 'val_datagen' from passed_args; only way to skip them is via self.loadskip_list.
Loading optimizer state:
Configured with
optimizer_load_configs, in the below structure; only one of 'include', 'exclude' can be set.
>>> {'include':  # optimizer attributes to include
...      ['weights', 'learning_rate']  # will ONLY load these
... }
>>> {'exclude':  # optimizer attributes to exclude
...      ['updates', 'epsilon']  # will load everything BUT these
... }
-
_save_best_model(del_previous_best=None)¶ Saves
model, history fig, TrainGenerator state, and report if latest key metric is a new best (max/min, as per max_is_best). Also deletes previous best saves if max_one_best_save is set (as passed by TrainGenerator._on_val_end(), and the default for del_previous_best if None is passed).
-
_get_optimizer_state()¶ Get optimizer attributes to save, according to
optimizer_save_configs; helper method to save().
-
_load_optimizer_state()¶ Sets optimizer attributes from
self.optimizer_state, according to optimizer_load_configs; helper method to load(). Is called internally by load(). optimizer_state is set to None to free memory afterwards.
-
_save_history_fig(savepath=None)¶ Saves
_history_fig. Does nothing if _history_fig is falsy (e.g. None).
- Arguments:
- savepath: str / None
- Path to save figure to. If None, will set to
logdir + '_temp_model__hist.png'. If None and final_fig_dir is set, will use full model name instead.
If
final_fig_dir is set, will also save (at most one per model_num) figure there using full model name; if there are existing savefiles (.png) with same model_num, will delete them. Intended to document latest state of history.
-
save_report(savepath=None)¶ Saves model,
TrainGenerator, 'datagen', and 'val_datagen' attributes and values as an image of text, with the text generated by generate_report().
Text font is set from TrainGenerator.report_fontpath, which defaults to a programming-style font (Consolas) for consistent vertical and horizontal alignment.
If savepath is None, will save a temp report in TrainGenerator.logdir + '_temp_model__report.png'.
-
generate_report()¶ Generates
model, TrainGenerator(), datagen, and val_datagen reports according to report_configs.
Writes attributes and values in three columns of text, converted to and saved as an image. Extracts information from model_configs and vars() of TrainGenerator, datagen, and val_datagen. Useful for snapshotting key model, training, and data attributes for quick reference.
report_configs is structured as follows:
>>> {target_0:
...      {filter_spec_0:
...           [attr0, attr1, ...],
...      },
...      {filter_spec_1:
...           [attr0, attr1, ...],
...      },
... }
target is one of: 'model', 'traingen', 'datagen', 'val_datagen'; it may be a tuple of these, in which case filter_spec applies to those included.
filter_spec is one of: 'include', 'exclude', 'exclude_types', but cannot include 'include' and 'exclude' at once.
- 'include': names of attributes to include in report; no other attribute will be included. Supports wildcards with '*' as leading character; e.g. '*_has_' will include all names starting with _has_.
- 'exclude': names of attributes to exclude from report; all other attributes will be included. Also supports wildcards.
- 'exclude_types': attribute types to exclude from report. Elements of this list cannot be string, unless prepended with '#', which specifies an exception. E.g. if attr is dict, and dict is in the list, then it can be kept in report by including '#attr' in the list.
See
_DEFAULT_REPORT_CFG for the default report_configs, containing every possible config case.
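The 'include' / 'exclude' name filters with leading-'*' wildcards could work roughly as in the sketch below. filter_report_attrs is a hypothetical helper, and treating '*prefix' as "name starts with prefix" follows the '*_has_' example above rather than any confirmed implementation; 'exclude_types' is not covered.

```python
def filter_report_attrs(names, include=None, exclude=None):
    """Filter attribute names by include/exclude specs with '*' wildcards."""
    if include and exclude:
        raise ValueError("cannot set both 'include' and 'exclude'")

    def matches(name, patterns):
        # '*prefix' matches any name starting with prefix; else exact match.
        return any(name.startswith(p[1:]) if p.startswith('*') else name == p
                   for p in patterns)

    if include:
        return [n for n in names if matches(n, include)]
    if exclude:
        return [n for n in names if not matches(n, exclude)]
    return list(names)


names = ['_has_trained', '_has_validated', 'epochs', 'batch_size']
kept = filter_report_attrs(names, include=['*_has_', 'epochs'])
dropped = filter_report_attrs(names, exclude=['*_has_'])
```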
-
get_unique_model_name(set_model_num=True)¶ Returns a unique model name, prefixed with
f"M{model_num}__{model_base_name}". If set_model_num, also sets model_num.
Name is generated by extracting info from model_configs according to model_name_configs (insertion-order sensitive), name_process_key_fn, and new_model_num.
name_process_key_fn(key, alias, attrs): returns a string representation of key (a TrainGenerator attribute or model_configs key) and its value. Below is a description of the default function, _DEFAULT_NAME_PROCESS_KEY_FN(), but custom implementations are supported:
- key: name of attribute (and its value) to encode. Can get attribute of TrainGenerator object via '.', e.g. 'datagen.batch_size'.
- alias: replaces key if not None.
- attrs: dict of object attribute-value pairs, where “object” is either TrainGenerator or an object that is its attribute.
Example: (using default
name_process_key_fn)
>>> model_base_name == "AutoEncoder"
>>> model_num == 8
>>> model_name_configs == {"datagen.batch_size": "BS",
...                        "filters": None,
...                        "optimizer": "",
...                        "lr": "",
...                        "best_key_metric": "__max"}
>>> model_configs = {"conv_filters": [32, 64],
...                  "lr": 0.0002,
...                  "optimizer": tf.keras.optimizers.SGD}
>>> tg.best_key_metric == .97512  # TrainGenerator
>>> tg.datagen.batch_size == 32
...
... # will yield
>>> "M8__AutoEncoder-BS32-filters32_64-SGD-2e-4__max.975"
Note that if
new_model_num is True, then model_num is set to one more than the max number after "M" among directory names in logs_dir; e.g. if such a directory is "M15__Classifier", will use "M16", and set model_num = 16.
If an object is passed to model_configs, its .__name__ will be used as “value”; if this attribute is missing, will raise exception (default fn).
Note that “unique” doesn’t mean yielding a new name with each call to the function; for a name to be new, either a directory as described should have a higher "M{num}", or other sources of information must change values (e.g. TrainGenerator attributes, like best_key_metric).
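The model-number increment can be illustrated on a plain list of directory names. next_model_num is a hypothetical helper; returning 0 when no matching directory exists is an assumption, since the text does not say what happens in that case.

```python
import re


def next_model_num(dir_names):
    """Return 1 + the largest N among names shaped like 'M<N>__...'."""
    nums = []
    for name in dir_names:
        match = re.match(r"M(\d+)__", name)
        if match:
            nums.append(int(match.group(1)))
    return max(nums) + 1 if nums else 0  # 0 for "no models yet" is assumed


model_num = next_model_num(["M15__Classifier", "M3__AutoEncoder", "misc"])
```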
-
get_last_log(name, best=False)¶ Returns latest savefile path from
logdir (best=False) or best_models_dir (best=True).
name is one of: 'report', 'state', 'weights', 'history', 'init_state'. 'init_state' ignores best (uses best=False).
-
_update_temp_history(metrics, val=False)¶ Updates temporary history given
metrics. If using batch-slices, ensures entries are grouped appropriately.
- Gets train (val=False) or val (val=True) metrics, and validates that len(metrics) of the returned list matches the length of train_metrics or val_metrics.
- Validates that each element of metrics is numeric (e.g. float).
-
get_sample_weight(class_labels, val=False, slice_idx=None, force_unweighted=False)¶ Make
sample_weight to feed to model based on class_labels (and, if applicable, weighted slices).
- Arguments:
- class_labels: np.ndarray
Classification class labels to map sample weights according to
class_weights.
>>> class_weights = {0: 4, 1: 1, 2: 2.5}
>>> class_labels == [1, 1, 0, 1, 2]  # if
>>> sample_weight == [1, 1, 4, 1, 2.5]  # then
- val: bool
- Whether to use
class_weights (False) or val_class_weights (True). If using sliced weights, whether to use slices_per_batch of datagen (False) or val_datagen (True). Further, if True, will get weighted sample_weight if either of loss_weighted_slices_range or pred_weighted_slices_range is set (False requires the former to be set).
- slice_idx: int / None
- Index of slice to get
sample_weight for, if using slices. If None, will return all slices_per_batch number of sample_weight.
- force_unweighted: bool
- Get slice-unweighted
sample_weight regardless of slice usage; used internally by _get_weighted_sample_weight() to break recursion.
-
_get_weighted_sample_weight(class_labels_all, val=False, weight_range=(0.5, 1.5), slice_idx=None)¶ Gets
slices_per_batch number of sample_weight, scaled linearly from min to max of weight_range, over slices_per_batch number of steps.
>>> weight_range == (0.5, 1.5)
>>> class_weights == {0: 1, 1: 5}  # val = False
>>> slices_per_batch == 3
>>> slice_idx == None  # get all
>>> class_labels_all == [[0, 0, 1], [0, 0, 1], [0, 1, 1]]
...
>>> [[0.5, 0.5, 2.5],
...  [1.0, 1.0, 5.0],
...  [1.5, 7.5, 7.5]]
-
_set_predict_threshold(predict_threshold, for_current_iter=False)¶ Set
predict_threshold and maybe dynamic_predict_threshold.
-
_get_val_history(for_current_iter=False)¶ Compute validation metrics from history (‘evaluate’-mode), or cache (‘predict’-mode).
for_current_iter is True when inside of validate() loop, fitting individual batches/slices, and is False in _on_val_end(), where metrics over the entire validation dataset are computed. In the latter, best subset is found (if applicable).
-
_get_best_subset_val_history()¶ Returns history entry for best
best_subset_size number of validation batches, and sets best_subset_nums.
Ex: given 10 batches, a “best subset” of 5 is the set of 5 batches that yields the best (highest/lowest depending on max_is_best) key_metric. Useful for model ensembling in specializing member models on different parts of data.
-
_compute_metric(data, metric_name=None, metric_fn=None)¶ Compute metric given labels, preds, and sample weights or prediction threshold where applicable - and metric name or function.
-
_compute_metrics(labels_all_norm, preds_all_norm, sample_weight_all)¶ Computes metrics from labels, predictions, and sample weights, via
_compute_metric().
Iterates over metric names in val_metrics:
- name == 'loss': fetches loss name from model.loss, then function from deeptrain.util.metrics, computes metric, and adds model weight penalty loss (L1/L2). Loss from model.evaluate() may still differ, as no other regularizer loss is accounted for.
- name == key_metric: computes metric with key_metric_fn.
- name in custom_metrics: computes metric with custom_metrics[name] function.
- name none of the above: passes name and model.loss to _compute_metric().
Ensures computed metrics are scalars (numbers, instead of lists, tuples, etc).
-
_transform_eval_data(labels_all, preds_all, sample_weight_all, class_labels_all, return_as_dict=True, unroll_into_samples=True)¶ Prepare data for feeding to metrics computing methods.
- Standardize labels and preds shapes to the expected (batches, *model.output_shape), or (batches, slices, *model.output_shape) if slices are used. See _validate_data_shapes().
- Standardize sample_weight and class_labels shapes. See _validate_class_data_shapes().
- Unroll data into samples (merge batches, slices, and samples dims). See _unroll_into_samples().
-
_weighted_normalize_preds(preds_all)¶ Given batch-slices predictions, “weighs” binary (sigmoid) class predictions linearly according to
pred_weighted_slices_range, in slices_per_batch steps. In effect, 0-class predictions from later slices are weighted greater than those from earlier, and likewise for 1-class.
Norm logic: “0-class” is defined as predicting <0.5. 0.5 is subtracted from all predictions, so that 0-class preds are negative, and 1-class are positive. Preds are then scaled according to slice weights, so that greater weights correspond to more negative and more positive values. Preds are then shifted back to original [0, 1]. More negative values in shifted domain thus correspond to values closer to 0 in original domain, in a manner that weighs 0-class preds and 1-class preds equally.
-
_validate_data_shapes(data, val=True, validate_n_slices=True, validate_last_dims_match_outs_shape=True, validate_equal_shapes=True)¶ Ensures
data entries are shaped (batches, *model.output_shape), or (batches, slices, *model.output_shape) if using slices.
- Validate batch_size, and that it’s common to every batch/slice.
- Validate slices_per_batch, and that it’s common to every batch/slice, if validate_n_slices.
- Arguments:
- data: dict[str: np.ndarray]
{'labels_all': labels_all, 'preds_all': preds_all}. Passed as self-naming dict to improve code readability in exception handling.
- val: bool
- Only relevant with validate_n_slices==True; if True, gets slices_per_batch from val_datagen; else, from datagen.
- validate_n_slices: bool
- (Default True) is set False when slice_idx is not None in _get_weighted_sample_weight, which occurs during validate() when processing individual batch-slices. slice_idx is None in _on_val_end().
- validate_last_dims_match_outs_shape: bool
- See _validate_class_data_shapes().
- validate_equal_shapes: bool
- See _validate_class_data_shapes().
-
_validate_class_data_shapes(data, val=True, validate_n_slices=False)¶ Standardize
sample_weight and class_labels data. Same as _validate_data_shapes(), except skips two validations:
- _validate_last_dims_match_outs_shape; for class_labels, model output shapes can be same as input shapes as with autoencoders, but inputs can still have class labels, subjecting to sample_weight. sample_weight often won’t share model output shape, as with e.g. multiclass classification, where individual classes aren’t weighted.
- _equal_shapes; per above, data entries may not have equal shapes.
-
get_history_fig(plot_configs=None, w=1, h=1)¶ Plots train / validation history according to
plot_configs.
- Arguments:
- plot_configs: dict / None
- See
_DEFAULT_PLOT_CFG. If None, defaults to TrainGenerator.plot_configs (which itself defaults to _PLOT_CFG in configs.py).
- w, h: float
- Scale figure width & height, respectively.
plot_configs is structured as follows:
>>> {'fig_kw': fig_kw,
...  '0': {reserved_name: value,
...        plt_kw: value},
...  '1': {reserved_name: value,
...        plt_kw: value},
...  ...}
- fig_kw: dict, passed to plt.subplots(**fig_kw).
- reserved_name: str, one of ('metrics', 'x_ticks', 'vhlines', 'mark_best_cfg', 'ylims', 'legend_kw'). Used to configure supported custom plot behavior (see “Builtin plot customs” below).
- plt_kw: str, name of kwarg to pass directly to plt.plot().
- value: depends on key; see default plot_configs in _DEFAULT_PLOT_CFG and misc._make_plot_configs_from_metrics().
Only
'metrics'and'x_ticks'keys are required for each dict - others have default values.Builtin plot customs: (
reserved_name)'metrics'(required): names of metrics to plot from histories, as{'train': train_metrics, 'val': val_metrics}(at least one metric name required, for only one of train/val - need to have “something” to plot).x_ticks'(required): x-coordinates of respective metrics, of samelen().'vhlines': dict[‘v’ / ‘h’: float]. vertical/horizontal lines; e.g.{'v': 10}will draw a vertical line at x = 10, and{'h': .5}at y = .5.'mark_best_cfg':{'train': metric_name}or{'val': metric_name}and (optional){'max_is_best: bool}pairs. Will mark plot to indicate a metric optimum (max (if'max_is_best', the default) or min).'ylims': y-limits of plot panes.'legend_kw': passed toplt.legend(); if None, no legend is drawn.
Defaults handling:
Keys and subkeys, where absent, will be filled from configs returned by misc._make_plot_configs_from_metrics().
- If plot pane '0' is lacking entirely, it’ll be copied from the defaults.
- If subkey 'color' in dict with key '0' is missing, will fill from defaults['0']['color'].
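The defaults handling above amounts to a two-level fill. The sketch below is an illustration only, not the library's code: fill_from_defaults is a hypothetical helper, and all values in defaults and user_cfg are placeholders.

```python
import copy

def fill_from_defaults(plot_configs, defaults):
    """Hypothetical helper: fill absent keys and subkeys from `defaults`."""
    filled = copy.deepcopy(plot_configs)  # leave the user's dict untouched
    for key, default_value in defaults.items():
        if key not in filled:
            # e.g. pane '0' lacking entirely -> copied from the defaults
            filled[key] = copy.deepcopy(default_value)
        elif isinstance(default_value, dict) and isinstance(filled[key], dict):
            # e.g. subkey 'color' missing in pane '0' -> defaults['0']['color']
            for subkey, sub_default in default_value.items():
                filled[key].setdefault(subkey, copy.deepcopy(sub_default))
    return filled

# placeholder values, chosen for illustration only
defaults = {
    'fig_kw': {'figsize': (12, 7)},
    '0': {'metrics': {'val': ['loss']}, 'color': ['orange'], 'linewidth': [1.5]},
}
user_cfg = {'0': {'metrics': {'val': ['loss']}, 'linewidth': [3]}}

filled = fill_from_defaults(user_cfg, defaults)
```

Values the user did set (here 'linewidth') are kept; only absent entries come from the defaults.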
Further info:
- Every key’s iterable value (list, etc) must be of same len as number of metrics in 'metrics'; this is ensured within cfg_fn.
- Metrics are plotted in order of insertion (at both dict and list level), so later metrics will carry over to additional plot panes if number of metrics exceeds plot_first_pane_max_vals; see cfg_fn.
- A convenient option is to change _PLOT_CFG in configs.py and pass plot_configs=None to TrainGenerator.__init__; this will internally call cfg_fn, which validates some configs and tries to fill what’s missing.
- Above, cfg_fn == misc._make_plot_configs_from_metrics()
-
compute_gradient_norm(input_data, labels, sample_weight=None, learning_phase=0, _id='*', mode='weights', norm_fn=(<ufunc 'sqrt'>, <ufunc 'square'>), scope='local')¶ Computes gradients w.r.t. layer weights or outputs per _id, and returns norm according to norm_fn and scope.
- Arguments:
- input_data: np.ndarray / list[np.ndarray] / supported formats
- Data w.r.t. which loss is to be computed for the gradient.
List of arrays for multi-input networks. “Supported formats”
is any valid input to
model. - labels: np.ndarray / list[np.ndarray] / supported formats
- Labels w.r.t. which loss is to be computed for the gradient.
- sample_weight: np.ndarray / list[np.ndarray] / supported formats
- kwarg to
model.fit(), etc., weighting individual sample losses. - learning_phase: bool / int[bool]
- 1: use model in train mode
- 0: use model in inference mode
- _id: str / int / list[str/int].
- int -> idx; str -> name
- idx: int. Index of layer to fetch, via model.layers[idx].
- name: str. Name of layer (full or substring) to be fetched. Returns earliest match if multiple found.
- list[str/int] -> treat each str element as name, int as idx.
Ex: ['gru', 2] gets (e.g.) weights of first layer with name substring ‘gru’, then of layer w/ idx 2.
- '*' (wildcard) -> get (e.g.) outputs of all layers (except input) with ‘output’ attribute.
- mode: str in (‘weights’, ‘outputs’, ‘gradients:weights’, ‘gradients:outputs’)
- Whether to fetch layer weights, outputs, or gradients (w.r.t. outputs or weights).
- norm_fn: (function, function) / function
- Norm function(s) to apply to gradients arrays when gathering.
(np.sqrt, np.square) for L2-norm, np.abs for L1-norm. Computed as: outer_fn(sum(inner_fn(x) for x in data)), where outer_fn, inner_fn = norm_fn if norm_fn is list/tuple, and inner_fn = norm_fn and outer_fn = lambda x: x otherwise.
- scope: str in (‘local’, ‘global’)
- Whether to apply norm_fn on individual gradient arrays ('local'), or on all of them taken together ('global').
- Returns:
- Gradient norm(s). List of float if scope == 'local' (norms of weights), else float (outer_fn(sum(sum(inner_fn(g)) for g in grads))).
TensorFlow optimizers do gradient clipping according to the clipnorm setting by comparing individual weights’ L2-norms against clipnorm, and rescaling if exceeding. These L2 norms can be obtained using norm_fn=(np.sqrt, np.square) with scope == 'local' and mode='weights'. See:
- tensorflow.python.keras.optimizer_v2.optimizer_v2._clip_gradients
- keras.optimizers.clip_norm
- tensorflow.python.ops.clip_ops.clip_by_norm
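The norm_fn and scope formulas above can be reproduced on plain NumPy arrays. The following is a minimal sketch under the documented formulas only; grad_norm is a hypothetical function and does not compute gradients from a model.

```python
import numpy as np

def grad_norm(grads, norm_fn=(np.sqrt, np.square), scope='local'):
    """Hypothetical re-statement of the documented norm formulas.

    `grads` is a list of already-computed gradient arrays.
    """
    if isinstance(norm_fn, (list, tuple)):
        outer_fn, inner_fn = norm_fn
    else:
        outer_fn, inner_fn = (lambda x: x), norm_fn
    if scope == 'local':
        # one norm per gradient array
        return [float(outer_fn(np.sum(inner_fn(g)))) for g in grads]
    # 'global': a single norm over all arrays taken together
    return float(outer_fn(sum(np.sum(inner_fn(g)) for g in grads)))

grads = [np.array([3., 4.]), np.array([[1., 2.], [2., 0.]])]
local_l2 = grad_norm(grads, scope='local')             # per-array L2 norms
global_l2 = grad_norm(grads, scope='global')           # L2 norm over all arrays
global_l1 = grad_norm(grads, norm_fn=np.abs, scope='global')
```

With scope='local' and the L2 pair, each entry is the quantity that would be compared against clipnorm for the corresponding array.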
-
gradient_norm_over_dataset(val=False, learning_phase=0, mode='weights', norm_fn=(<ufunc 'sqrt'>, <ufunc 'square'>), stat_fn=<function median>, n_iters=None, prog_freq=10, w=1, h=1)¶ Aggregates gradient norms over dataset, one iteration at a time. Useful for estimating the value of gradient clipping, clipnorm, to use. Plots a histogram of gathered data when finished. Also see compute_gradient_norm().
- Arguments:
- val: bool
- True: gather over val_datagen batches
- False: gather over datagen batches
- learning_phase: bool / int[bool]
- True: get gradients of model in train mode
- False: get gradients of model in inference mode
- mode: str in (‘weights’, ‘outputs’)
- Whether to get gradients with respect to layer weights or outputs.
- norm_fn: (function, function) / function
- Norm function(s) to apply to gradients arrays when gathering.
(np.sqrt, np.square) for L2-norm, np.abs for L1-norm. Computed as: outer_fn(sum(inner_fn(g) for g in grads)), where outer_fn, inner_fn = norm_fn if norm_fn is list/tuple, and inner_fn = norm_fn and outer_fn = lambda x: x otherwise.
- stat_fn: function
- Aggregate function to apply on computed norms. If np.mean, will gather mean of gradients; if np.median, the median, etc. Computed as: stat_fn(outer_fn(sum(inner_fn(g) for g in grads))).
- n_iters: int / None
- Number of expected iterations over entire dataset. Can be used to
iterate over subset of entire dataset. If None, will return upon
DataGenerator.all_data_exhausted. - prog_freq: int
- How often to print f'|{batch_idx}', and '.' otherwise, in terms of number of batches (not iterations, but are same if not using slices). E.g. 5: ....|5....|10....|15.
- w, h: float
- Scale figure width & height, respectively.
- Returns:
- grad_norms: np.ndarray
- Norms of gradients for every iteration.
Shape: (iters_processed, n_params), where n_params is number of gradient arrays whose norm stats were computed at each iteration.
- batches_processed: int
- Number of batches processed.
- iters_processed: int
- Number of iterations processed (if using e.g. 4 slices per batch,
will equal
4 * batches_processed).
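One way to turn the returned grad_norms into a clipnorm estimate is to look at an upper percentile. The sketch below uses synthetic data in place of a real return value, and the 95th percentile is an arbitrary illustrative choice, not a DeepTrain recommendation.

```python
import numpy as np

# Synthetic stand-in for `grad_norms`, shaped (iters_processed, n_params)
rng = np.random.default_rng(0)
grad_norms = rng.lognormal(mean=0.0, sigma=1.0, size=(200, 6))

# Per-array view: 95th percentile of each gradient array's norm over iterations
per_param_p95 = np.percentile(grad_norms, 95, axis=0)

# Single candidate threshold over all arrays and iterations
clipnorm_candidate = float(np.percentile(grad_norms, 95))

# Fraction of (iteration, array) norms that would exceed the candidate
frac_clipped = float(np.mean(grad_norms > clipnorm_candidate))
```

A lower percentile clips more aggressively; comparing per_param_p95 across arrays shows whether a single threshold suits all of them.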
-
gradient_sum_over_dataset(val=False, learning_phase=0, mode='weights', n_iters=None, prog_freq=10, plot_kw={})¶ Computes cumulative sum of gradients over dataset, one iteration at a time, preserving full array shapes. Useful for computing mean of gradients over dataset, or other aggregate metrics.
- Arguments:
- val: bool
- True: gather over val_datagen batches
- False: gather over datagen batches
- learning_phase: bool / int[bool]
- True: get gradients of model in train mode
- False: get gradients of model in inference mode
- mode: str in (‘weights’, ‘outputs’)
- Whether to get gradients with respect to layer weights or outputs.
- n_iters: int / None
- Number of expected iterations over entire dataset. Can be used to
iterate over subset of entire dataset. If None, will return upon
DataGenerator.all_data_exhausted. - prog_freq: int
- How often to print f'|{batch_idx}', and '.' otherwise, in terms of number of batches (not iterations, but are same if not using slices). E.g. 5: ....|5....|10....|15.
- plot_kw: dict
- Kwargs to pass to see_rnn.features_hist; defaults to {'share_xy': False, 'center_zero': True}.
- Returns:
- grad_sum: dict[str: np.ndarray]
- Gradient arrays summed over dataset. Structure: {name: array, name: array, ...}, where name is name of weight array or layer output.
- batches_processed: int
- Number of batches processed.
- iters_processed: int
- Number of iterations processed (if using e.g. 4 slices per batch,
will equal
4 * batches_processed).
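Since grad_sum preserves array shapes, the mean gradient over the dataset follows by dividing by iters_processed. A minimal sketch with made-up values; the weight names below are placeholders, not names DeepTrain is known to produce.

```python
import numpy as np

# Made-up stand-ins for the returned `grad_sum` and `iters_processed`
grad_sum = {
    'dense/kernel:0': np.array([[2., -4.], [6., 0.]]),
    'dense/bias:0': np.array([1., -1.]),
}
iters_processed = 4

# Mean gradient per weight array, same shapes as the inputs
grad_mean = {name: arr / iters_processed for name, arr in grad_sum.items()}

# Example aggregate: largest absolute mean-gradient entry per array
max_abs_mean = {name: float(np.abs(arr).max()) for name, arr in grad_mean.items()}
```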
-
_gather_over_dataset(gather_fn, val=False, n_iters=None, prog_freq=10)¶ Iterates over DataGenerator, applying gather_fn to every batch (or slice). Stops after n_iters, or when DataGenerator.all_data_exhausted if n_iters is None. Useful for monitoring quantities over the course of training or inference. gather_fn recursively updates data; as such, it can be used to append to a list, update a dictionary, operate on an array, etc. Review source code for exact logic.
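The gather pattern can be sketched without DeepTrain itself. Everything below is a stand-in: ToyDataGenerator only mimics the all_data_exhausted flag, and gather_over_dataset and the (data, batch) signature of gather_fn are assumptions, not the library's actual interface.

```python
class ToyDataGenerator:
    """Stand-in for DataGenerator; only mimics `all_data_exhausted`."""
    def __init__(self, batches):
        self._batches = list(batches)
        self._idx = 0
        self.all_data_exhausted = len(self._batches) == 0

    def get(self):
        batch = self._batches[self._idx]
        self._idx += 1
        if self._idx == len(self._batches):
            self.all_data_exhausted = True
        return batch

def gather_over_dataset(datagen, gather_fn, data, n_iters=None):
    """Hypothetical loop: feed each batch to `gather_fn`, which returns updated `data`."""
    iters = 0
    # the toy generator never resets, so exhaustion always stops the loop
    while not datagen.all_data_exhausted:
        if n_iters is not None and iters >= n_iters:
            break
        data = gather_fn(data, datagen.get())
        iters += 1
    return data, iters

# gather_fn appends a per-batch quantity to a list
append_sum = lambda acc, batch: acc + [sum(batch)]

full, n_full = gather_over_dataset(ToyDataGenerator([[1, 2], [3], [4, 5]]), append_sum, [])
part, n_part = gather_over_dataset(ToyDataGenerator([[1, 2], [3], [4, 5]]), append_sum, [],
                                   n_iters=2)
```

Because gather_fn receives and returns the accumulator, the same loop works for lists, dicts, or running array sums.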
-
interrupt_status() -> (<class 'bool'>, <class 'bool'>)¶ Prints whether TrainGenerator was interrupted (e.g. KeyboardInterrupt, or via exception) during train() and validate(). Returns bools (True for interrupted, else False) for each, as (train, val).
Not foolproof; user can set flags manually or via callbacks. For further assurance, check temp_history, val_temp_history, and cache attributes (e.g. _preds_cache), which are cleared at end of validate() by default; this method checks only flags: _train_loop_done, train_postiter_processed, _val_loop_done, _val_postiter_processed.
-
info()¶ Prints various useful TrainGenerator & DataGenerator attributes, and interrupt status.
-
_make_plot_configs_from_metrics()¶ Makes default plot_configs, building on configs._PLOT_CFG; see get_history_fig(). Validates some configs and tries to fill others.
Ensures every iterable config is of same len() as number of metrics in 'metrics', by extending last value of iterable to match the len. Ex:
>>> {'metrics': {'val': ['loss', 'accuracy', 'f1']},
...  'linestyle': ['--', '-'],  # -> ['--', '-', '-']
... }
Assigns colors to metrics based on a default cycling coloring scheme, with some predefined customs (look for _customs_map in source code).
Configures up to two plot panes, mediated by plot_first_pane_max_vals; if number of metrics in 'metrics' exceeds it, then a second pane is used. Can be used to configure how many metrics to draw in first pane; useful for managing clutter.
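The "extend last value" rule can be shown in a few lines. This is a sketch of the rule as described above, not the library's code; extend_to_len is a hypothetical helper.

```python
def extend_to_len(values, n):
    """Hypothetical helper: repeat the last element until `values` has length `n`."""
    values = list(values)
    if not values:
        raise ValueError("cannot extend an empty iterable")
    return values + [values[-1]] * max(0, n - len(values))

cfg = {'metrics': {'val': ['loss', 'accuracy', 'f1']},
       'linestyle': ['--', '-']}

# number of metrics across train/val
n_metrics = sum(len(names) for names in cfg['metrics'].values())
cfg['linestyle'] = extend_to_len(cfg['linestyle'], n_metrics)
```

Iterables that are already long enough are returned unchanged.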
-
_validate_traingen_configs()¶ Ensures various attributes are properly configured, and attempts correction where possible.
-