xplogger.parser.experiment package
Submodules
xplogger.parser.experiment.experiment module
Container for the experiment data.
- class xplogger.parser.experiment.experiment.Experiment(configs: list[ConfigType], metrics: experiment_utils.ExperimentMetricType, info: Optional[experiment_utils.ExperimentInfoType] = None)
Bases: object
- property config
Access the config property.
- process_metrics(metric_names: list[str], x_name: str, x_min: int, x_max: int, mode: str, drop_duplicates: bool, dropna: bool, verbose: bool) → dict[str, np.typing.NDArray[np.float32]]
Given a list of metric names, process the metrics for a given experiment.
- Parameters
metric_names (list[str]) – Names of metrics to process.
x_name (str) – The column/metric with respect to which other metrics are tracked, for example steps or epochs.
x_min (int) – Filter out the experiment if the max value of x_name is less than or equal to x_min.
x_max (int) – Filter out metric values where the corresponding value of x_name is greater than x_max.
mode (str) – Mode when selecting metrics. Recall that experiment.metrics is a dictionary mapping modes to dataframes.
drop_duplicates (bool) – Whether to drop duplicate values in the x_name column.
dropna (bool) – Whether to drop missing values.
verbose (bool) – Whether to print additional information.
- Returns
Dictionary mapping each metric name to a 1-dimensional numpy array of metric values.
- Return type
dict[str, np.ndarray]
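The filtering described by x_min, x_max and drop_duplicates can be illustrated with a minimal, standard-library-only sketch. This is not xplogger's implementation: select_metric_values is a hypothetical helper, plain lists of row dictionaries stand in for the dataframes, and returning the x_name column alongside the metrics is an assumption.

```python
from typing import Optional


def select_metric_values(
    rows: list[dict[str, float]],
    metric_names: list[str],
    x_name: str,
    x_min: int,
    x_max: int,
    drop_duplicates: bool = True,
) -> Optional[dict[str, list[float]]]:
    """Hypothetical stand-in for the filtering that the parameters describe."""
    if not rows:
        return None
    # x_min: skip the whole experiment if it never progressed past x_min.
    if max(row[x_name] for row in rows) <= x_min:
        return None
    # x_max: drop individual rows whose x value exceeds x_max.
    kept = [row for row in rows if row[x_name] <= x_max]
    if drop_duplicates:
        # Keep only the first row seen for each x value.
        seen = set()
        unique = []
        for row in kept:
            if row[x_name] not in seen:
                seen.add(row[x_name])
                unique.append(row)
        kept = unique
    # Assumption: the x column is returned together with the metrics.
    return {name: [row[name] for row in kept] for name in [x_name, *metric_names]}


rows = [
    {"step": 0, "loss": 1.0},
    {"step": 1, "loss": 0.8},
    {"step": 1, "loss": 0.8},
    {"step": 2, "loss": 0.5},
    {"step": 3, "loss": 0.4},
]
result = select_metric_values(rows, ["loss"], x_name="step", x_min=1, x_max=2)
skipped = select_metric_values(rows, ["loss"], x_name="step", x_min=3, x_max=10)
```

Here result keeps steps 0 to 2 with the duplicate step removed, while skipped is None because the run never goes beyond x_min.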
- serialize(dir_path: pathlib.Path) → None
Serialize the experiment data and store at dir_path.
configs are stored as jsonl (since there are only a few configs per experiment) in a file called config.jsonl.
metrics are stored in [feather format](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_feather.html).
info is stored in the gzip format.
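As a rough illustration of the config and info parts of this layout, the sketch below writes configs as JSON Lines and info as a gzip-compressed file using only the standard library. It is not the library's code: write_configs_and_info is a hypothetical helper, the file name info.gzip is assumed, and encoding the info payload as JSON is also an assumption (only the gzip container is documented). The feather step for metrics is left out because it needs pandas.

```python
import gzip
import json
import tempfile
from pathlib import Path


def write_configs_and_info(dir_path: Path, configs: list[dict], info: dict) -> None:
    """Hypothetical helper: write configs as JSON Lines and info as gzipped JSON."""
    dir_path.mkdir(parents=True, exist_ok=True)
    # One JSON document per line, in a file called config.jsonl.
    with open(dir_path / "config.jsonl", "w", encoding="utf-8") as f:
        for config in configs:
            f.write(json.dumps(config) + "\n")
    # Assumption: the payload is JSON and the file is named info.gzip.
    with gzip.open(dir_path / "info.gzip", "wt", encoding="utf-8") as f:
        json.dump(info, f)


with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "experiment"
    write_configs_and_info(out_dir, [{"lr": 0.1}, {"lr": 0.01}], {"status": "done"})
    # Read both files back to check the round trip.
    lines = (out_dir / "config.jsonl").read_text(encoding="utf-8").splitlines()
    configs_back = [json.loads(line) for line in lines]
    with gzip.open(out_dir / "info.gzip", "rt", encoding="utf-8") as f:
        info_back = json.load(f)
```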
- xplogger.parser.experiment.experiment.ExperimentList
alias of xplogger.parser.experiment.experiment.ExperimentSequence
- class xplogger.parser.experiment.experiment.ExperimentSequence(experiments: list[Experiment])
Bases: collections.UserList
- aggregate(aggregate_configs: Callable[[list[list[ConfigType]]], list[ConfigType]] = <function return_first_config>, aggregate_metrics: Callable[[list[experiment_utils.ExperimentMetricType]], experiment_utils.ExperimentMetricType] = <function concat_metrics>, aggregate_infos: Callable[[list[experiment_utils.ExperimentInfoType]], experiment_utils.ExperimentInfoType] = <function return_first_infos>) → Experiment
Aggregate a sequence of experiments into a single experiment.
- Parameters
aggregate_configs (Callable[ [list[list[ConfigType]]], list[ConfigType] ], optional) – Function to aggregate the configs. Defaults to experiment_utils.return_first_config.
aggregate_metrics (Callable[ [list[experiment_utils.ExperimentMetricType]], ExperimentMetricType ], optional) – Function to aggregate the metrics. Defaults to experiment_utils.concat_metrics.
aggregate_infos (Callable[ [list[experiment_utils.ExperimentInfoType]], ExperimentInfoType ], optional) – Function to aggregate the information. Defaults to experiment_utils.return_first_infos.
- Returns
Aggregated Experiment.
- Return type
Experiment
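The pattern of passing one aggregation callable per field can be sketched without the library. Everything below is a hypothetical stand-in: ToyExperiment replaces Experiment, metrics are modelled as a mapping from mode to a list of rows rather than dataframes, first_configs and concat_rows only mimic what the documented defaults are described as doing, and the info field is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Any, Callable

Config = dict[str, Any]
Metrics = dict[str, list[dict[str, Any]]]  # mode -> rows; stand-in for dataframes


@dataclass
class ToyExperiment:
    """Hypothetical, simplified stand-in for an experiment."""
    configs: list[Config]
    metrics: Metrics


def first_configs(config_lists: list[list[Config]]) -> list[Config]:
    # Return the first config list, or an empty list if there is none.
    return config_lists[0] if config_lists else []


def concat_rows(metric_list: list[Metrics]) -> Metrics:
    # Concatenate the rows of all experiments, mode by mode.
    merged: Metrics = {}
    for metrics in metric_list:
        for mode, rows in metrics.items():
            merged.setdefault(mode, []).extend(rows)
    return merged


def aggregate(
    experiments: list[ToyExperiment],
    aggregate_configs: Callable[[list[list[Config]]], list[Config]] = first_configs,
    aggregate_metrics: Callable[[list[Metrics]], Metrics] = concat_rows,
) -> ToyExperiment:
    # Each field is collected across experiments and reduced by its own callable.
    return ToyExperiment(
        configs=aggregate_configs([e.configs for e in experiments]),
        metrics=aggregate_metrics([e.metrics for e in experiments]),
    )


runs = [
    ToyExperiment([{"seed": 1}], {"train": [{"step": 0, "loss": 1.0}]}),
    ToyExperiment([{"seed": 2}], {"train": [{"step": 0, "loss": 0.9}]}),
]
merged = aggregate(runs)
```

Swapping in a different callable (for example, one that averages rows instead of concatenating them) changes the aggregation without touching the sequence itself, which is presumably why the method takes functions rather than flags.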
- aggregate_metrics(**kwargs: Any) → dict[str, np.typing.NDArray[np.float32]]
Aggregate metrics across experiment sequences.
Given a list of metric names, aggregate the metrics across different experiments in an experiment sequence.
- Parameters
metric_names (list[str]) – Names of metrics to aggregate.
x_name (str) – The column/metric with respect to which other metrics are tracked, for example steps or epochs. The aggregated values for this metric are also returned.
x_min (int) – Only those experiments are considered (during aggregation) where the max value of x_name is greater than or equal to x_min.
x_max (int) – When aggregating experiments, only metric values whose corresponding value of x_name is less than or equal to x_max are considered.
mode (str) – Mode when selecting metrics. Recall that experiment.metrics is a dictionary mapping modes to dataframes.
drop_duplicates (bool) – Whether to drop duplicate values in the x_name column.
verbose (bool) – Whether to print additional information.
- Returns
Dictionary mapping each metric name to a 2-dimensional numpy array of metric values. The first dimension corresponds to the experiments and the second corresponds to metrics per experiment.
- Return type
dict[str, np.ndarray]
- filter(filter_fn: Callable[[xplogger.parser.experiment.experiment.Experiment], bool]) → ExperimentSequence
Filter experiments in the sequence.
- Parameters
filter_fn – Function to filter an experiment
- Returns
A sequence of experiments for which the filter condition is true
- Return type
ExperimentSequence
- get_param_groups(params_to_exclude: Iterable[str]) → tuple[ConfigType, dict[str, set[Any]]]
Return two groups of params, one which is fixed across the experiments and one which varies.
This function is useful when understanding the effect of different parameters on the model’s performance. One could plot the performance of the different experiments, as a function of the parameters that vary.
- Parameters
params_to_exclude (Iterable[str]) – These parameters are not returned in either group. This is useful for ignoring parameters like time when the experiment was started since these parameters should not affect the performance. In absence of this argument, all such parameters will likely be returned with the group of varying parameters.
- Returns
The first group/config contains the params which are fixed across the experiments. It maps these params to their default values, hence it should be a subset of any config. The second group/config contains the params which vary across the experiments. It maps these params to the set of values they take.
- Return type
tuple[ConfigType, dict[str, set[Any]]]
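The split into fixed and varying parameters can be sketched as follows. This is an illustration of the idea, not the library's code: split_params is a hypothetical name, and the sketch assumes flat configs with hashable values where every config contains every key.

```python
from typing import Any, Iterable


def split_params(
    configs: list[dict[str, Any]], params_to_exclude: Iterable[str]
) -> tuple[dict[str, Any], dict[str, set[Any]]]:
    """Hypothetical helper: separate params that are fixed from those that vary."""
    excluded = set(params_to_exclude)
    # Collect the set of values each non-excluded param takes across configs.
    values: dict[str, set[Any]] = {}
    for config in configs:
        for key, value in config.items():
            if key not in excluded:
                values.setdefault(key, set()).add(value)
    # A param with a single observed value is fixed; otherwise it varies.
    fixed = {key: next(iter(vals)) for key, vals in values.items() if len(vals) == 1}
    varying = {key: vals for key, vals in values.items() if len(vals) > 1}
    return fixed, varying


configs = [
    {"lr": 0.1, "model": "mlp", "start_time": "t0"},
    {"lr": 0.01, "model": "mlp", "start_time": "t1"},
]
fixed, varying = split_params(configs, params_to_exclude=["start_time"])
```

Without the exclusion, start_time would land in the varying group even though it should not affect performance, which matches the motivation given for params_to_exclude.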
- groupby(group_fn: Callable[[Experiment], str]) → dict[str, ExperimentSequence]
Group experiments in the sequence.
- Parameters
group_fn – Function to assign a string group id to the experiment
- Returns
A dictionary mapping the string group id to a sequence of experiments
- Return type
dict[str, ExperimentSequence]
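Grouping by a string id returned from a user-supplied function can be sketched in a few lines. The snippet is only an illustration: group_by is a hypothetical helper, and plain config dictionaries stand in for Experiment objects.

```python
from collections import defaultdict
from typing import Any, Callable

ToyExperiment = dict[str, Any]  # hypothetical stand-in for an Experiment


def group_by(
    experiments: list[ToyExperiment], group_fn: Callable[[ToyExperiment], str]
) -> dict[str, list[ToyExperiment]]:
    """Hypothetical helper: bucket experiments by the id that group_fn returns."""
    groups: dict[str, list[ToyExperiment]] = defaultdict(list)
    for experiment in experiments:
        groups[group_fn(experiment)].append(experiment)
    return dict(groups)


experiments = [
    {"model": "mlp", "lr": 0.1},
    {"model": "cnn", "lr": 0.1},
    {"model": "mlp", "lr": 0.01},
]
groups = group_by(experiments, lambda e: f"model={e['model']}")
```

Each key of groups is a group id and each value holds the experiments assigned to it, in their original order.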
- class xplogger.parser.experiment.experiment.ExperimentSequenceDict(experiment_sequence_dict: dict[Any, ExperimentSequence])
Bases: collections.UserDict
- aggregate_metrics(return_all_metrics_with_same_length: bool = True, **kwargs: Any) → dict[str, np.typing.NDArray[np.float32]]
Aggregate metrics across experiment sequences.
Given a list of metric names, aggregate the metrics across different experiment sequences in a dictionary indexed by the metric name.
- Parameters
get_experiment_name (Callable[[str], str]) – Function to map the given key with a name.
metric_names (list[str]) – Names of metrics to aggregate.
x_name (str) – The column/metric with respect to which other metrics are tracked, for example steps or epochs. The aggregated values for this metric are also returned.
mode (str) – Mode when selecting metrics. Recall that experiment.metrics is a dictionary mapping modes to dataframes.
- Returns
Dictionary mapping each metric name to a 2-dimensional numpy array of metric values. The first dimension corresponds to the experiments and the second corresponds to metrics per experiment.
- Return type
dict[str, np.typing.NDArray[np.float32]]
- filter(filter_fn: Callable[[str, xplogger.parser.experiment.experiment.Experiment], bool]) → ExperimentSequenceDict
Filter experiment sequences in the dict.
- Parameters
filter_fn – Function to filter an experiment sequence
- Returns
A dict of sequence of experiments for which the filter condition is true
- Return type
ExperimentSequenceDict
- xplogger.parser.experiment.experiment.deserialize(dir_path: str) → xplogger.parser.experiment.experiment.Experiment
Deserialize the experiment data stored at dir_path and return an Experiment object.
xplogger.parser.experiment.parser module
Implementation of Parser to parse experiment from the logs.
- class xplogger.parser.experiment.parser.Parser(parse_config_line: Callable[[str], Optional[Dict[str, Any]]] = <function parse_json_and_match_value>, parse_metric_line: Callable[[str], Optional[Dict[str, Any]]] = <function parse_json_and_match_value>, parse_info_line: Callable[[str], Optional[Dict[str, Any]]] = <function parse_json>)
Bases: xplogger.parser.base.Parser
Class to parse an experiment from the log dir.
- parse(filepath_pattern: Union[str, pathlib.Path]) → xplogger.parser.experiment.experiment.Experiment
Load one experiment from the log dir.
- Parameters
filepath_pattern (Union[str, Path]) – filepath pattern to glob or instance of Path (directory) object.
- Returns
Experiment
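The Parser constructor takes three line-parsing callables, each of type Callable[[str], Optional[Dict[str, Any]]]. A custom function with that shape might look like the sketch below. parse_json_line is a hypothetical name, not part of xplogger, and treating a None result as "ignore this line" is an assumption suggested by the Optional return type rather than confirmed behaviour.

```python
import json
from typing import Any, Dict, Optional


def parse_json_line(line: str) -> Optional[Dict[str, Any]]:
    """Hypothetical line parser: return a dict for JSON-object lines, else None."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        # The line is not valid JSON (for example, free-form log output).
        return None
    # Only JSON objects map onto Dict[str, Any]; lists and scalars are rejected.
    return record if isinstance(record, dict) else None


parsed = parse_json_line('{"loss": 0.5, "step": 1}')
rejected = parse_json_line("epoch finished")
```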
xplogger.parser.experiment.utils module
Utility functions to work with the experiment data.
- xplogger.parser.experiment.utils.concat_metrics(metric_list: list[ExperimentMetricType]) → ExperimentMetricType
Concatenate the metrics.
- Parameters
metric_list (list[ExperimentMetricType]) –
- Returns
ExperimentMetricType
- xplogger.parser.experiment.utils.mean_metrics(metric_list: list[ExperimentMetricType]) → ExperimentMetricType
Compute the mean of the metrics.
- Parameters
metric_list (list[ExperimentMetricType]) –
- Returns
ExperimentMetricType
- xplogger.parser.experiment.utils.return_first_config(config_lists: list[list[ConfigType]]) → list[ConfigType]
Return the first config list from a list of config lists, or an empty list if there is none.
- Parameters
config_lists (list[list[ConfigType]]) –
- Returns
list[ConfigType]
Module contents
Module to interact with the experiment data.