aiqclib.classify.step6_classify_dataset package

Submodules

aiqclib.classify.step6_classify_dataset.dataset_all module

This module defines the ClassifyAll class, a BuildModelBase subclass that applies previously trained classification models to test sets for multiple targets. It handles configuration, test-set preparation, and persistence of results.

class aiqclib.classify.step6_classify_dataset.dataset_all.ClassifyAll(config, test_sets=None)[source]

Bases: BuildModelBase

A subclass of BuildModelBase that runs trained classification models against the provided test sets for multiple targets. The build steps inherited from the base class are placeholders here, because no training takes place during classification.

This class sets its expected_class_name to "ClassifyAll", which must match the YAML configuration’s base_class if you intend to instantiate it within that framework.
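The name check described above can be pictured with a minimal sketch. Everything here is an assumption for illustration: BuildModelBaseSketch and ClassifyAllSketch are hypothetical stand-ins, and the real library may read base_class from its ConfigBase object and validate it differently.

```python
class BuildModelBaseSketch:
    """Hypothetical stand-in for BuildModelBase; not the library's code."""

    expected_class_name = None  # subclasses override this

    def __init__(self, config: dict):
        # Assumption: the parsed YAML exposes a "base_class" entry.
        base_class = config.get("base_class")
        if base_class != self.expected_class_name:
            raise ValueError(
                f"Configuration base_class {base_class!r} does not match "
                f"{self.expected_class_name!r}"
            )
        self.config = config


class ClassifyAllSketch(BuildModelBaseSketch):
    expected_class_name = "ClassifyAll"


# A matching base_class is accepted; any other value raises ValueError.
ok = ClassifyAllSketch({"base_class": "ClassifyAll"})
```

In this sketch a configuration written for ClassifyAllSuite would be rejected at construction time rather than failing later during testing.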

Parameters:
  • config (ConfigBase)

  • test_sets (Dict[str, DataFrame] | None)

build(target_name)[source]

Placeholder method; no training takes place during classification.

Parameters:

target_name (str) – The name of the target variable.

Return type:

None

build_final_model(target_name)[source]

Placeholder method; no training takes place during classification.

Parameters:

target_name (str) – The name of the target variable.

Return type:

None

default_file_names: Dict[str, str]

Default names for model files and test reports, with placeholders for the target name.
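As an illustration of placeholder-based file names, the sketch below fills a target name into each template. The template strings, the {target_name} placeholder syntax, the file extensions, and the target name "temp" are all assumptions; the actual defaults are not shown in this reference.

```python
# Hypothetical templates; the library's real defaults may differ.
default_file_names = {
    "model": "model_{target_name}.joblib",
    "result": "test_report_{target_name}.tsv",
}


def resolve_file_names(templates, target_name):
    """Fill the target name into every file-name template."""
    return {
        kind: template.format(target_name=target_name)
        for kind, template in templates.items()
    }


resolved = resolve_file_names(default_file_names, "temp")
```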

drop_cols

Columns to be dropped from the test set before passing to the base model.

expected_class_name: str = 'ClassifyAll'

model_file_names: Dict[str, str]

A dictionary mapping “model” to target-specific file paths, derived from configuration.

output_file_names: Dict[str, Dict[str, str]]

A dictionary mapping “model” or “result” to target-specific file paths, derived from configuration.

test(target_name)[source]

Test the model for the given target, storing the results in results.

This method performs the following steps:

  1. Retrieves the trained model from models[target_name].

  2. Resets the model’s model-scores table to ensure no data duplication from previous runs.

  3. Prepares the appropriate test set by dropping specified columns from test_sets[target_name] and attaches it to the base_model.

  4. Calls the base_model.test() method to generate predictions and reports.

  5. Stores the model-scores table in model_scores[target_name].

  6. Concatenates relevant original test set columns with the generated predictions and stores them in predictions[target_name].

  7. Stores the test report from the base model in reports[target_name].

Parameters:

target_name (str) – The target variable name, used to index both models and test_sets.

Return type:

None

test_cols

Columns to be selected from the original test set for final prediction output.

aiqclib.classify.step6_classify_dataset.dataset_all_suite module

This module provides the ClassifyAllSuite class, which extends BuildModelBase to facilitate the testing and evaluation of multiple classification models across various targets and machine learning methods. It automates the process of loading models, generating predictions, and aggregating results into unified datasets for comparative analysis.

class aiqclib.classify.step6_classify_dataset.dataset_all_suite.ClassifyAllSuite(config, test_sets=None)[source]

Bases: BuildModelBase

A subclass of BuildModelBase that orchestrates the evaluation and testing of classification models for multiple targets using multiple machine learning methods provided by a ModelSuite.

This class reads previously trained models (with composite keys) and aggregates test reports, predictions, and model-scores tables into single datasets per target by introducing a ‘method’ column.

Note

This class sets expected_class_name to "ClassifyAllSuite".

Parameters:
  • config (ConfigBase)

  • test_sets (Dict[str, DataFrame] | None)

build(target_name)[source]

Placeholder method; no training takes place during classification.

Parameters:

target_name (str) – The name of the target variable.

Return type:

None

build_final_model(target_name)[source]

Placeholder method; no training takes place during classification.

Parameters:

target_name (str) – The name of the target variable.

Return type:

None

create_metric_plots()[source]

Override parent method to call the multi-method metric plotter.

Return type:

None

expected_class_name: str = 'ClassifyAllSuite'

read_models()[source]

Read and restore each target’s models from disk for all methods in the suite, storing the loaded models in models.

Raises:

FileNotFoundError – If a model file path specified in model_file_names does not exist.

Return type:

None
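A minimal sketch of the loading behaviour described for read_models(), assuming models are stored as pickle files and keyed by an arbitrary name. The function read_models_sketch and the file names are hypothetical; the library's actual serialization format is not stated in this reference.

```python
import pickle
import tempfile
from pathlib import Path


def read_models_sketch(model_file_names):
    """Load every model file, failing early when a path is missing."""
    models = {}
    for key, file_name in model_file_names.items():
        path = Path(file_name)
        if not path.exists():
            raise FileNotFoundError(f"Model file not found for {key!r}: {path}")
        with path.open("rb") as handle:
            models[key] = pickle.load(handle)
    return models


# Round trip with a throwaway file standing in for a trained model.
with tempfile.TemporaryDirectory() as tmp:
    saved = Path(tmp) / "model_temp.pkl"
    saved.write_bytes(pickle.dumps({"weights": [0.1, 0.2]}))
    loaded = read_models_sketch({"temp": str(saved)})
```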

test(target_name)[source]

Test the models for the given target across all methods, appending a ‘method’ column and aggregating the results into single datasets.

Model output columns (class, score, etc.) are cast to Int64 and Float64 before concatenation so that the per-method tables share one schema; mismatched dtypes would otherwise cause Polars to raise a SchemaError.

Parameters:

target_name (str) – The name of the target variable to be tested.

Return type:

None

test_targets()[source]

Iterate over all targets, ensuring that models have been read/loaded for all configured methods before calling test().

Raises:

ValueError – If a target/method combination has no corresponding entry in models.

Return type:

None
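The pre-flight check in test_targets() might look like the sketch below. It assumes models is keyed by (target, method) tuples, which is only one possible form of the composite keys mentioned above; check_models_loaded and the target and method names are hypothetical.

```python
def check_models_loaded(targets, methods, models):
    """Raise ValueError if any target/method pair has no loaded model."""
    missing = [
        (target, method)
        for target in targets
        for method in methods
        if (target, method) not in models
    ]
    if missing:
        raise ValueError(
            f"No loaded model for target/method combinations: {missing}"
        )


# Assumption: composite keys are (target, method) tuples.
models = {("temp", "method_a"): object(), ("temp", "method_b"): object()}
check_models_loaded(["temp"], ["method_a", "method_b"], models)  # passes
```

Collecting all missing pairs before raising reports every gap in one error instead of stopping at the first.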