ml_logger.parser package

Submodules

ml_logger.parser.base module

Base class that all parsers extend.

class ml_logger.parser.base.Parser(parse_line: Callable[[str], Optional[Dict[str, Any]]] = <function parse_json>)

Bases: abc.ABC

Base class that all parsers extend.

ml_logger.parser.config module

Implementation of Parser to parse config from logs.

class ml_logger.parser.config.Parser(parse_line: Callable[[str], Optional[Dict[str, Any]]] = <function parse_json_and_match_value>)

Bases: ml_logger.parser.log.Parser

Class to parse config from the logs.

ml_logger.parser.config.parse_json_and_match_value(line: str) → Optional[Dict[str, Any]]

Parse a line as a JSON log and check whether it is a valid config log.

ml_logger.parser.log module

Implementation of Parser to parse the logs.

class ml_logger.parser.log.Parser(parse_line: Callable[[str], Optional[Dict[str, Any]]] = <function parse_json>)

Bases: ml_logger.parser.base.Parser

Class to parse the log files.

parse(filepath_pattern: str) → Iterator[Dict[str, Any]]

Open a log file, parse its contents and return logs.

Parameters

filepath_pattern (str) – filepath pattern to glob

Returns

Iterator over the logs

Return type

Iterator[LogType]

Yields

Iterator[LogType] – Iterator over the logs
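As a rough illustration of the behaviour described above, the sketch below globs a pattern, reads each matching file line by line, and lazily yields every line that parses. It uses only the standard library and is not the ml_logger implementation: iter_logs and parse_json_line are hypothetical names, and both the sorted file order and the choice to skip unparseable lines are assumptions.

```python
import glob
import json
from typing import Any, Callable, Dict, Iterator, Optional

LogType = Dict[str, Any]


def parse_json_line(line: str) -> Optional[LogType]:
    """Hypothetical line parser: return a dict, or None if the line is not a JSON object."""
    try:
        log = json.loads(line)
    except json.JSONDecodeError:
        return None
    return log if isinstance(log, dict) else None


def iter_logs(
    filepath_pattern: str,
    parse_line: Callable[[str], Optional[LogType]] = parse_json_line,
) -> Iterator[LogType]:
    """Lazily yield parsed logs from every file matching the glob pattern."""
    # Sorting the matches is an assumption made here for reproducible ordering.
    for path in sorted(glob.glob(filepath_pattern)):
        with open(path) as f:
            for line in f:
                log = parse_line(line)
                if log is not None:  # skip lines the parser rejects
                    yield log
```

Because the function is a generator, no file is opened until the caller starts iterating, and only one line is held in memory at a time.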

parse_first_log(filepath_pattern: str) → Optional[Dict[str, Any]]

Return the first log from a file.

The method returns as soon as it finds the first log. Unlike the parse() method, it does not iterate over the entire log file, which saves both memory and time.

Parameters

filepath_pattern (str) – filepath pattern to glob

Returns

First instance of a log

Return type

LogType
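The early-return behaviour can be sketched with a generator and next(). This is a minimal standard-library sketch, not the library's code: first_log and iter_json_lines are hypothetical helpers that operate on any iterable of lines.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional


def iter_json_lines(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield each line that parses as a JSON object, skipping the rest."""
    for line in lines:
        try:
            log = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(log, dict):
            yield log


def first_log(lines: Iterable[str]) -> Optional[Dict[str, Any]]:
    """Return the first parsed log, or None if there is none.

    next() pulls a single item from the generator, so the lines after
    the first valid log are never read.
    """
    return next(iter_json_lines(lines), None)
```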

parse_last_log(filepath_pattern: str) → Optional[Dict[str, Any]]

Return the last log from a file.

Like the parse() method, it iterates over the entire log file, but it does not keep all the logs in memory, which saves memory.

Parameters

filepath_pattern (str) – filepath pattern to glob

Returns

Last instance of a log

Return type

LogType
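A minimal sketch of the memory-saving idea, under the assumption that only the most recently parsed log needs to be retained while scanning. The names last_log and iter_json_lines are hypothetical and the code is not taken from ml_logger.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional


def iter_json_lines(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield each line that parses as a JSON object, skipping the rest."""
    for line in lines:
        try:
            log = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(log, dict):
            yield log


def last_log(lines: Iterable[str]) -> Optional[Dict[str, Any]]:
    """Scan every line but hold on to only the latest parsed log."""
    last: Optional[Dict[str, Any]] = None
    for log in iter_json_lines(lines):
        last = log  # the previous log becomes garbage-collectable
    return last
```

Time is still linear in the size of the input, but memory stays constant regardless of how many logs the file contains.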

ml_logger.parser.log.parse_json_and_match_value(line: str, value: str) → Optional[Dict[str, Any]]

Parse a line as a JSON log and check whether it is a valid log.
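The documentation does not say which field is compared against value. The sketch below therefore assumes a hypothetical "type" key, exposed as a key parameter, and shows one plausible way to bind value with functools.partial so that the result fits the single-argument parse_line signature used by the Parser classes. The function names and the "metric" value are illustrative assumptions.

```python
import functools
import json
from typing import Any, Dict, Optional


def parse_json_and_match(
    line: str, value: str, key: str = "type"
) -> Optional[Dict[str, Any]]:
    """Return the parsed log only if log[key] == value, otherwise None.

    The "type" key is an assumption for illustration; the real field
    name is not given in this documentation.
    """
    try:
        log = json.loads(line)
    except json.JSONDecodeError:
        return None
    if isinstance(log, dict) and log.get(key) == value:
        return log
    return None


# Bind the value to get a Callable[[str], Optional[Dict[str, Any]]].
parse_metric_line = functools.partial(parse_json_and_match, value="metric")
```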

ml_logger.parser.metric module

Implementation of Parser to parse metrics from logs.

class ml_logger.parser.metric.Parser(parse_line: Callable[[str], Optional[Dict[str, Any]]] = <function parse_json_and_match_value>)

Bases: ml_logger.parser.log.Parser

Class to parse the metrics from the logs.

parse_as_df(filepath_pattern: str, group_metrics: Callable[[List[Dict[str, Any]]], Dict[str, List[Dict[str, Any]]]] = <function group_metrics>, aggregate_metrics: Callable[[List[Dict[str, Any]]], List[Dict[str, Any]]] = <function aggregate_metrics>) → Dict[str, pandas.core.frame.DataFrame]

Create a dict of (metric_name, dataframe).

This method (i) reads metrics from the filesystem, (ii) groups the metrics, (iii) aggregates all the metrics within each group, and (iv) converts the aggregated metrics into dataframes, returning a dictionary of dataframes.

Parameters
  • filepath_pattern (str) – filepath pattern to glob

  • group_metrics (Callable[[List[LogType]], Dict[str, List[LogType]]], optional) – Function to group a list of metrics into a dictionary of (key, list of grouped metrics). Defaults to group_metrics.

  • aggregate_metrics (Callable[[List[LogType]], List[LogType]], optional) – Function to aggregate a list of metrics. Defaults to aggregate_metrics.

ml_logger.parser.metric.aggregate_metrics(metrics: List[Dict[str, Any]]) → List[Dict[str, Any]]

Aggregate a list of metrics.

Parameters

metrics (List[MetricType]) – List of metrics to aggregate

Returns

List of aggregated metrics

Return type

List[MetricType]

ml_logger.parser.metric.group_metrics(metrics: List[Dict[str, Any]]) → Dict[str, List[Dict[str, Any]]]

Group a list of metrics.

Group a list of metrics into a dictionary of (key, list of grouped metrics).

Parameters

metrics (List[MetricType]) – List of metrics to group

Returns

Dictionary of (key, list of grouped metrics)

Return type

Dict[str, List[MetricType]]

ml_logger.parser.metric.metrics_to_df(metric_logs: List[Dict[str, Any]], group_metrics: Callable[[List[Dict[str, Any]]], Dict[str, List[Dict[str, Any]]]] = <function group_metrics>, aggregate_metrics: Callable[[List[Dict[str, Any]]], List[Dict[str, Any]]] = <function aggregate_metrics>) → Dict[str, pandas.core.frame.DataFrame]

Create a dict of (metric_name, dataframe).

This function (i) groups the metrics, (ii) aggregates all the metrics within each group, and (iii) converts the aggregated metrics into dataframes, returning a dictionary of dataframes.

Parameters
  • metric_logs (List[LogType]) – List of metrics

  • group_metrics (Callable[[List[LogType]], Dict[str, List[LogType]]], optional) – Function to group a list of metrics into a dictionary of (key, list of grouped metrics). Defaults to group_metrics.

  • aggregate_metrics (Callable[[List[LogType]], List[LogType]], optional) – Function to aggregate a list of metrics. Defaults to aggregate_metrics.

Returns

Dictionary of (metric_name, dataframe)

Return type

Dict[str, pd.DataFrame]
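To make the group-then-aggregate contract concrete, here is a sketch of two custom callables with the documented signatures, plus a small driver that applies them. Everything in it is hypothetical: the "mode", "epoch", and "loss" keys, the function names, and the averaging rule are illustrative choices rather than the library's defaults. The driver stops at lists of row dicts so that it runs with the standard library alone.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Callable, Dict, List

MetricType = Dict[str, Any]


def group_by_mode(metrics: List[MetricType]) -> Dict[str, List[MetricType]]:
    """Hypothetical grouping: bucket metrics by their 'mode' key."""
    groups: Dict[str, List[MetricType]] = defaultdict(list)
    for metric in metrics:
        groups[str(metric.get("mode", "default"))].append(metric)
    return dict(groups)


def mean_loss_per_epoch(metrics: List[MetricType]) -> List[MetricType]:
    """Hypothetical aggregation: average 'loss' over metrics sharing an 'epoch'."""
    by_epoch: Dict[int, List[float]] = defaultdict(list)
    for metric in metrics:
        by_epoch[metric["epoch"]].append(metric["loss"])
    return [{"epoch": e, "loss": mean(v)} for e, v in sorted(by_epoch.items())]


def metrics_to_records(
    metric_logs: List[MetricType],
    group_metrics: Callable[[List[MetricType]], Dict[str, List[MetricType]]] = group_by_mode,
    aggregate_metrics: Callable[[List[MetricType]], List[MetricType]] = mean_loss_per_epoch,
) -> Dict[str, List[MetricType]]:
    """Group the metrics, then aggregate within each group."""
    grouped = group_metrics(metric_logs)
    return {name: aggregate_metrics(group) for name, group in grouped.items()}
```

Each list of row dicts in the result could then be passed to pandas.DataFrame to obtain one dataframe per group.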

ml_logger.parser.metric.parse_json_and_match_value(line: str) → Optional[Dict[str, Any]]

Parse a line as a JSON log and check whether it is a valid metric log.

ml_logger.parser.utils module

Utility functions for the parser module.

ml_logger.parser.utils.compare_logs(first_log: Dict[str, Any], second_log: Dict[str, Any], verbose: bool = False) → Tuple[List[str], List[str], List[str]]

Compare two logs.

Return lists of the keys that are either missing from one of the logs or have different values in the two logs.

Parameters
  • first_log (LogType) – First Log

  • second_log (LogType) – Second Log

  • verbose (bool) – Defaults to False

Returns

Tuple of [list of keys with different values, list of keys with values missing in the first log, list of keys with values missing in the second log]

Return type

Tuple[List[str], List[str], List[str]]
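The three returned lists can be sketched with set operations on the dictionary keys. This is an illustrative stand-in, not the library's code: the name compare is hypothetical, the verbose flag is omitted, and sorting the keys is an assumption made so the output is deterministic.

```python
from typing import Any, Dict, List, Tuple


def compare(
    first_log: Dict[str, Any], second_log: Dict[str, Any]
) -> Tuple[List[str], List[str], List[str]]:
    """Return (keys with different values, keys missing in first, keys missing in second)."""
    shared = first_log.keys() & second_log.keys()
    different = sorted(k for k in shared if first_log[k] != second_log[k])
    missing_in_first = sorted(second_log.keys() - first_log.keys())
    missing_in_second = sorted(first_log.keys() - second_log.keys())
    return different, missing_in_first, missing_in_second
```

Nested logs are compared by equality of the whole nested value here; flattening both logs first would report differences at the level of individual nested keys.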

ml_logger.parser.utils.flatten_log(d: Dict[str, Any], parent_key: str = '', sep: str = '#') → Dict[str, Any]

Flatten a log using a separator.

Taken from https://stackoverflow.com/a/6027615/1353861

Parameters
  • d (LogType) – Log to flatten.

  • parent_key (str, optional) – Prefix prepended to every flattened key. Defaults to "".

  • sep (str, optional) – Separator placed between nested keys. Defaults to "#".

Returns

Flattened log

Return type

LogType
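A sketch of the recursive flattening approach, in the spirit of the Stack Overflow answer cited above. The function name flatten is a stand-in and the code is not copied from ml_logger. Nested mappings are walked recursively, and their keys are joined to the parent key with sep.

```python
from collections.abc import Mapping
from typing import Any, Dict


def flatten(d: Dict[str, Any], parent_key: str = "", sep: str = "#") -> Dict[str, Any]:
    """Flatten nested mappings into a single-level dict with joined keys."""
    items: Dict[str, Any] = {}
    for key, value in d.items():
        # Prefix the key with its parent path, unless we are at the top level.
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, Mapping):
            items.update(flatten(dict(value), new_key, sep))
        else:
            items[new_key] = value
    return items
```

For example, flatten({"model": {"lr": 0.1}}) produces a dict with the single key "model#lr". Lists and other non-mapping values are left untouched.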

ml_logger.parser.utils.parse_json(line: str) → Optional[Dict[str, Any]]

Parse a line as a JSON string.

Module contents