xplogger.parser.experiment package¶

Submodules¶

xplogger.parser.experiment.experiment module¶

Container for the experiment data.

class xplogger.parser.experiment.experiment.Experiment(configs: list[ConfigType], metrics: experiment_utils.ExperimentMetricType, info: Optional[experiment_utils.ExperimentInfoType] = None)[source]¶

Bases: object

property config¶: Access the config property.

log_to_wandb(wandb_config: dict[str, Any]) → LogBook[source]¶: Log the experiment to wandb.

process_metrics(metric_names: list[str], x_name: str, x_min: int, x_max: int, mode: str, drop_duplicates: bool, dropna: bool, verbose: bool) → dict[str, np.typing.NDArray[np.float32]][source]¶

Given a list of metric names, process the metrics for a given experiment.

Parameters

metric_names (list[str]) – Names of metrics to process.
x_name (str) – The column/metric with respect to which other metrics are tracked, for example steps or epochs.
x_min (int) – Filter out the experiment if the max value of x_name is less than or equal to x_min.
x_max (int) – Filter out metric values whose corresponding x_name value is greater than x_max.
mode (str) – Mode when selecting metrics. Recall that experiment.metrics is a dictionary mapping modes to dataframes.
drop_duplicates (bool) – Whether to drop duplicate values in the x_name column.
verbose (bool) – Whether to print additional information.

Returns

Dictionary mapping each metric name to a 1-dimensional numpy array of metric values.

Return type

dict[str, np.ndarray]
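
The x_min and x_max filtering described above can be sketched with plain Python lists. This is only an illustration of the documented semantics, not xplogger's implementation: the helper `filter_by_x` and the row layout are hypothetical, and the real method works on the dataframe stored under `mode`.

```python
from typing import Optional


def filter_by_x(
    rows: list[dict[str, float]], x_name: str, x_min: int, x_max: int
) -> Optional[list[dict[str, float]]]:
    """Hypothetical helper mirroring the documented x_min / x_max semantics."""
    # Drop the whole experiment if its largest x value is <= x_min.
    if max(row[x_name] for row in rows) <= x_min:
        return None
    # Otherwise keep only the rows whose x value does not exceed x_max.
    return [row for row in rows if row[x_name] <= x_max]


rows = [{"step": s, "loss": 1.0 / s} for s in range(1, 6)]
kept = filter_by_x(rows, x_name="step", x_min=2, x_max=3)
dropped = filter_by_x(rows, x_name="step", x_min=5, x_max=10)
```

Here `kept` retains steps 1 to 3, while `dropped` is `None` because the run never goes beyond step 5.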

serialize(dir_path: pathlib.Path) → None[source]¶

Serialize the experiment data and store at dir_path.

configs are stored as jsonl (since there are only a few configs per experiment) in a file called config.jsonl.
metrics are stored in [feather format](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_feather.html).
info is stored in the gzip format.
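
The layout for configs and info can be sketched with the standard library alone. This is a rough illustration under stated assumptions: `serialize_sketch` is a hypothetical helper, the file name `info.gz` is made up, and info is assumed to be JSON-serializable and written as gzip-compressed JSON. The feather step for metrics is left out because it needs pandas.

```python
import gzip
import json
import tempfile
from pathlib import Path
from typing import Any


def serialize_sketch(
    configs: list[dict[str, Any]], info: dict[str, Any], dir_path: Path
) -> None:
    """Hypothetical stand-in for the config and info parts of serialize."""
    dir_path.mkdir(parents=True, exist_ok=True)
    # One JSON document per line: the jsonl layout named in the docs.
    with open(dir_path / "config.jsonl", "w") as f:
        for config in configs:
            f.write(json.dumps(config) + "\n")
    # Assumption: info is gzip-compressed JSON under a made-up file name.
    with gzip.open(dir_path / "info.gz", "wt") as f:
        json.dump(info, f)


with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "experiment"
    serialize_sketch([{"lr": 0.1}, {"lr": 0.01}], {"host": "node-1"}, out_dir)
    config_lines = (out_dir / "config.jsonl").read_text().splitlines()
    with gzip.open(out_dir / "info.gz", "rt") as f:
        restored_info = json.load(f)
```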

xplogger.parser.experiment.experiment.ExperimentList¶: alias of xplogger.parser.experiment.experiment.ExperimentSequence

class xplogger.parser.experiment.experiment.ExperimentSequence(experiments: list[Experiment])[source]¶

Bases: collections.UserList

aggregate(aggregate_configs: Callable[[list[list[ConfigType]]], list[ConfigType]] = <function return_first_config>, aggregate_metrics: Callable[[list[experiment_utils.ExperimentMetricType]], experiment_utils.ExperimentMetricType] = <function concat_metrics>, aggregate_infos: Callable[[list[experiment_utils.ExperimentInfoType]], experiment_utils.ExperimentInfoType] = <function return_first_infos>) → Experiment [source]¶

Aggregate a sequence of experiments into a single experiment.

Parameters

aggregate_configs (Callable[ [list[list[ConfigType]]], list[ConfigType] ], optional) – Function to aggregate the configs. Defaults to experiment_utils.return_first_config.
aggregate_metrics (Callable[ [list[experiment_utils.ExperimentMetricType]], ExperimentMetricType ], optional) – Function to aggregate the metrics. Defaults to experiment_utils.concat_metrics.
aggregate_infos (Callable[ [list[experiment_utils.ExperimentInfoType]], ExperimentInfoType ], optional) – Function to aggregate the information. Defaults to experiment_utils.return_first_infos.

Returns

Aggregated Experiment.

Return type

Experiment
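
The role of the three pluggable callables can be sketched with a toy stand-in. Everything below is hypothetical: `ToyExperiment` replaces Experiment, metrics are plain lists rather than dataframes, and `first_config`, `concat_lists` and `first_info` only imitate the documented defaults.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyExperiment:
    """Hypothetical stand-in for Experiment; metrics are lists, not dataframes."""
    configs: list[dict[str, Any]]
    metrics: dict[str, list[float]]
    info: dict[str, Any] = field(default_factory=dict)


def first_config(config_lists):
    # Imitates return_first_config: first entry, or an empty list.
    return config_lists[0] if config_lists else []


def concat_lists(metric_list):
    # Imitates concat_metrics: append the values mode by mode.
    merged: dict[str, list[float]] = {}
    for metrics in metric_list:
        for mode, values in metrics.items():
            merged.setdefault(mode, []).extend(values)
    return merged


def first_info(info_list):
    # Imitates return_first_infos: first entry, or an empty info.
    return info_list[0] if info_list else {}


def aggregate(experiments, aggregate_configs=first_config,
              aggregate_metrics=concat_lists, aggregate_infos=first_info):
    # Each callable receives the list of that part across all experiments.
    return ToyExperiment(
        configs=aggregate_configs([e.configs for e in experiments]),
        metrics=aggregate_metrics([e.metrics for e in experiments]),
        info=aggregate_infos([e.info for e in experiments]),
    )


runs = [
    ToyExperiment([{"seed": 1}], {"train": [0.9, 0.7]}, {"host": "a"}),
    ToyExperiment([{"seed": 2}], {"train": [0.8, 0.6]}, {"host": "b"}),
]
merged = aggregate(runs)
```

Passing a different callable for any of the three arguments changes how that part is combined without touching the others.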

aggregate_metrics(**kwargs: Any) → dict[str, np.typing.NDArray[np.float32]][source]¶

Aggregate metrics across experiment sequences.

Given a list of metric names, aggregate the metrics across different experiments in an experiment sequence.

Parameters

metric_names (list[str]) – Names of metrics to aggregate.
x_name (str) – The column/metric with respect to which other metrics are tracked, for example steps or epochs. The aggregated values for this metric are also returned.
x_min (int) – Only those experiments are considered (during aggregation) where the max value of x_name is greater than or equal to x_min.
x_max (int) – When aggregating experiments, consider only metric values whose corresponding x_name value is less than or equal to x_max.
mode (str) – Mode when selecting metrics. Recall that experiment.metrics is a dictionary mapping modes to dataframes.
drop_duplicates (bool) – Whether to drop duplicate values in the x_name column.
verbose (bool) – Whether to print additional information.

Returns

Dictionary mapping each metric name to a 2-dimensional numpy array of metric values. The first dimension corresponds to the experiments and the second corresponds to metrics per experiment.

Return type

dict[str, np.ndarray]
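
The shape of the returned mapping can be sketched with nested lists in place of numpy arrays. This is an illustration only: `aggregate_metrics_sketch` and `make_run` are hypothetical helpers, and the sketch assumes every retained experiment yields the same number of values after filtering.

```python
def aggregate_metrics_sketch(experiments, metric_names, x_name, x_min, x_max):
    """Hypothetical helper: one row per retained experiment, per metric."""
    # The x_name column is aggregated alongside the requested metrics.
    stacked = {name: [] for name in [x_name, *metric_names]}
    for rows in experiments:
        # Skip experiments whose largest x value is below x_min.
        if max(row[x_name] for row in rows) < x_min:
            continue
        # Keep only values recorded at or before x_max.
        kept = [row for row in rows if row[x_name] <= x_max]
        for name in stacked:
            stacked[name].append([row[name] for row in kept])
    return stacked


def make_run(n_steps, scale):
    # Toy run: one row per step with a made-up loss value.
    return [{"step": s, "loss": scale / s} for s in range(1, n_steps + 1)]


runs = [make_run(4, 1.0), make_run(2, 1.0), make_run(5, 2.0)]
result = aggregate_metrics_sketch(runs, ["loss"], "step", x_min=3, x_max=3)
```

The second run stops at step 2, so it is skipped, and `result["loss"]` has two rows of three values each.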

filter(filter_fn: Callable[[xplogger.parser.experiment.experiment.Experiment], bool]) → ExperimentSequence [source]¶

Filter experiments in the sequence.

Parameters: filter_fn – Function to filter an experiment
Returns: A sequence of experiments for which the filter condition is true
Return type: ExperimentSequence
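
Because the class derives from collections.UserList, a filter that returns a new sequence of the same type might look like the sketch below. `ToySequence` is a hypothetical stand-in that holds plain config dicts instead of Experiment objects.

```python
from collections import UserList
from typing import Any, Callable


class ToySequence(UserList):
    """Hypothetical stand-in for ExperimentSequence."""

    def filter(self, filter_fn: Callable[[Any], bool]) -> "ToySequence":
        # Keep the items for which the predicate is true; the input is unchanged.
        return ToySequence([item for item in self.data if filter_fn(item)])


sequence = ToySequence([{"lr": 0.1}, {"lr": 0.01}, {"lr": 0.001}])
small_lr = sequence.filter(lambda config: config["lr"] < 0.05)
```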

get_param_groups(params_to_exclude: Iterable[str]) → tuple[ConfigType, dict[str, set[Any]]][source]¶

Return two groups of params, one which is fixed across the experiments and one which varies.

This function is useful when understanding the effect of different parameters on the model’s performance. One could plot the performance of the different experiments, as a function of the parameters that vary.

Parameters

params_to_exclude (Iterable[str]) – These parameters are not returned in either group. This is useful for ignoring parameters like time when the experiment was started since these parameters should not affect the performance. In absence of this argument, all such parameters will likely be returned with the group of varying parameters.

Returns

The first group/config contains the params which are fixed across the experiments. It maps these params to their default values, hence it should be a subset of any config. The second group/config contains the params which vary across the experiments. It maps these params to the set of values they take.

Return type

tuple[ConfigType, dict[str, set[Any]]]
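
The split into fixed and varying params can be sketched as follows. This is a simplified illustration, not the library's code: `param_groups_sketch` is a hypothetical helper that assumes flat configs with hashable values and the same keys in every config.

```python
from typing import Any, Iterable


def param_groups_sketch(
    configs: list[dict[str, Any]], params_to_exclude: Iterable[str]
) -> tuple[dict[str, Any], dict[str, set[Any]]]:
    """Hypothetical helper splitting params into fixed and varying groups."""
    excluded = set(params_to_exclude)
    # Collect the set of values each non-excluded param takes.
    values: dict[str, set[Any]] = {}
    for config in configs:
        for key, value in config.items():
            if key not in excluded:
                values.setdefault(key, set()).add(value)
    # A param with a single observed value is fixed; otherwise it varies.
    fixed = {k: next(iter(v)) for k, v in values.items() if len(v) == 1}
    varying = {k: v for k, v in values.items() if len(v) > 1}
    return fixed, varying


configs = [
    {"lr": 0.1, "model": "mlp", "start_time": "10:00"},
    {"lr": 0.01, "model": "mlp", "start_time": "10:05"},
]
fixed, varying = param_groups_sketch(configs, params_to_exclude=["start_time"])
```

Without the exclusion, `start_time` would land in the varying group even though it should not affect performance.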

groupby(group_fn: Callable[[Experiment], str]) → dict[str, ExperimentSequence][source]¶

Group experiments in the sequence.

Parameters: group_fn – Function to assign a string group id to the experiment
Returns: A dictionary mapping the string group id to a sequence of experiments
Return type: dict[str, ExperimentSequence]
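
Grouping by a string id can be sketched with plain dicts and lists. `groupby_sketch` is a hypothetical helper; in the library the values would be ExperimentSequence objects rather than lists of config dicts.

```python
from typing import Any, Callable


def groupby_sketch(
    items: list[Any], group_fn: Callable[[Any], str]
) -> dict[str, list[Any]]:
    """Hypothetical helper: bucket items by the id that group_fn assigns."""
    groups: dict[str, list[Any]] = {}
    for item in items:
        groups.setdefault(group_fn(item), []).append(item)
    return groups


configs = [
    {"model": "mlp", "seed": 1},
    {"model": "cnn", "seed": 1},
    {"model": "mlp", "seed": 2},
]
by_model = groupby_sketch(configs, lambda config: config["model"])
```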

class xplogger.parser.experiment.experiment.ExperimentSequenceDict(experiment_sequence_dict: dict[Any, ExperimentSequence])[source]¶

Bases: collections.UserDict

aggregate_metrics(return_all_metrics_with_same_length: bool = True, **kwargs: Any) → dict[str, np.typing.NDArray[np.float32]][source]¶

Aggregate metrics across experiment sequences.

Given a list of metric names, aggregate the metrics across different experiment sequences in a dictionary indexed by the metric name.

Parameters

get_experiment_name (Callable[[str], str]) – Function to map the given key to a name.
metric_names (list[str]) – Names of metrics to aggregate.
x_name (str) – The column/metric with respect to which other metrics are tracked, for example steps or epochs. The aggregated values for this metric are also returned.
mode (str) – Mode when selecting metrics. Recall that experiment.metrics is a dictionary mapping modes to dataframes.

Returns

Dictionary mapping each metric name to a 2-dimensional numpy array of metric values. The first dimension corresponds to the experiments and the second corresponds to metrics per experiment.

Return type

dict[str, np.typing.NDArray[np.float32]]

filter(filter_fn: Callable[[str, xplogger.parser.experiment.experiment.Experiment], bool]) → ExperimentSequenceDict [source]¶

Filter experiment sequences in the dict.

Parameters: filter_fn – Function to filter an experiment sequence
Returns: A dict of experiment sequences for which the filter condition is true
Return type: ExperimentSequenceDict
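
Per the signature, filter_fn receives two arguments, the key and the value stored under it. A sketch built on collections.UserDict is shown below. `ToySequenceDict` is a hypothetical stand-in, and its values are plain lists of losses rather than experiment sequences.

```python
from collections import UserDict
from typing import Any, Callable


class ToySequenceDict(UserDict):
    """Hypothetical stand-in for ExperimentSequenceDict."""

    def filter(self, filter_fn: Callable[[str, Any], bool]) -> "ToySequenceDict":
        # The predicate sees both the key and the value.
        return ToySequenceDict(
            {key: value for key, value in self.data.items() if filter_fn(key, value)}
        )


results = ToySequenceDict({"mlp": [0.9, 0.7], "cnn": [0.8], "rnn": []})
non_empty = results.filter(lambda key, losses: len(losses) > 0)
only_mlp = results.filter(lambda key, losses: key == "mlp")
```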

xplogger.parser.experiment.experiment.deserialize(dir_path: str) → xplogger.parser.experiment.experiment.Experiment [source]¶: Deserialize the experiment data stored at dir_path and return an Experiment object.

xplogger.parser.experiment.parser module¶

Implementation of Parser to parse experiment from the logs.

class xplogger.parser.experiment.parser.Parser(parse_config_line: Callable[[str], Optional[Dict[str, Any]]] = <function parse_json_and_match_value>, parse_metric_line: Callable[[str], Optional[Dict[str, Any]]] = <function parse_json_and_match_value>, parse_info_line: Callable[[str], Optional[Dict[str, Any]]] = <function parse_json>)[source]¶

Bases: xplogger.parser.base.Parser

Class to parse an experiment from the log dir.

parse(filepath_pattern: Union[str, pathlib.Path]) → xplogger.parser.experiment.experiment.Experiment [source]¶

Load one experiment from the log dir.

Parameters: filepath_pattern (Union[str, Path]) – filepath pattern to glob or instance of Path (directory) object.
Returns: Experiment
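
The three constructor arguments share the type Callable[[str], Optional[Dict[str, Any]]]: each takes one log line and returns either a dict or None. A callable with that shape could be sketched as below. `parse_json_line` is a hypothetical example, not one of the library's defaults, and the log lines are invented.

```python
import json
from typing import Any, Optional


def parse_json_line(line: str) -> Optional[dict[str, Any]]:
    """Hypothetical line parser: a dict for valid JSON objects, else None."""
    try:
        parsed = json.loads(line)
    except json.JSONDecodeError:
        return None
    # Only JSON objects count as log records in this sketch.
    return parsed if isinstance(parsed, dict) else None


log_lines = ['{"step": 1, "loss": 0.9}', "not json", '{"step": 2, "loss": 0.7}']
records = [r for r in map(parse_json_line, log_lines) if r is not None]
```

Returning None lets the caller skip lines that are malformed or irrelevant instead of raising.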

xplogger.parser.experiment.utils module¶

Utility functions to work with the experiment data.

xplogger.parser.experiment.utils.concat_metrics(metric_list: list[ExperimentMetricType]) → ExperimentMetricType[source]¶

Concatenate the metrics.

Parameters: metric_list (list[ExperimentMetricType]) –
Returns: ExperimentMetricType

xplogger.parser.experiment.utils.mean_metrics(metric_list: list[ExperimentMetricType]) → ExperimentMetricType[source]¶

Compute the mean of the metrics.

Parameters: metric_list (list[ExperimentMetricType]) –
Returns: ExperimentMetricType
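
One way to picture a mean across runs is an element-wise average per mode. The sketch below is an assumption about the intent, not the library's implementation: `mean_metrics_sketch` is hypothetical, uses lists in place of dataframes, and assumes every run has the same modes and the same number of values.

```python
from statistics import fmean


def mean_metrics_sketch(
    metric_list: list[dict[str, list[float]]]
) -> dict[str, list[float]]:
    """Hypothetical helper: element-wise mean across runs, per mode."""
    modes = metric_list[0].keys()
    return {
        # zip(*...) pairs up the values recorded at the same position in each run.
        mode: [fmean(column) for column in zip(*(m[mode] for m in metric_list))]
        for mode in modes
    }


run_a = {"train": [1.0, 3.0]}
run_b = {"train": [3.0, 5.0]}
averaged = mean_metrics_sketch([run_a, run_b])
```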

xplogger.parser.experiment.utils.return_first_config(config_lists: list[list[ConfigType]]) → list[ConfigType][source]¶

Return the first config list from a list of config lists. If the list is empty, return an empty list.

Parameters: config_lists (list[list[ConfigType]]) –
Returns: list[ConfigType]

xplogger.parser.experiment.utils.return_first_infos(info_list: list[ExperimentInfoType]) → ExperimentInfoType[source]¶

Return the first info from a list of infos. If the list is empty, return an empty info.

Parameters: info_list (list[ExperimentInfoType]) –
Returns: ExperimentInfoType

xplogger.parser.experiment.utils.sum_metrics(metric_list: list[ExperimentMetricType]) → ExperimentMetricType[source]¶

Compute the sum of the metrics.

Parameters: metric_list (list[ExperimentMetricType]) –
Returns: ExperimentMetricType

Module contents¶

Module to interact with the experiment data.