aiqclib.common.loader package

Submodules

aiqclib.common.loader.classify_loader module

This module provides factory functions to dynamically load and instantiate various dataset processing steps within a classification pipeline.

It uses a configuration object (aiqclib.common.config.dataset_config.DataSetConfig) to determine which specific implementation of a base class (e.g., aiqclib.prepare.step1_read_input.input_base.InputDataSetBase) should be loaded for each step. The module relies on several global registries to map class names from the configuration to their respective Python types.

aiqclib.common.loader.classify_loader.load_classify_step1_input_dataset(config)[source]

Instantiate an aiqclib.prepare.step1_read_input.input_base.InputDataSetBase-derived class based on the configuration.

Specifically:

  1. Fetches the class name from the config via DataSetConfig.get_base_class("input").

  2. Looks up the class in aiqclib.common.loader.classify_registry.INPUT_CLASSIFY_REGISTRY.

  3. Instantiates and returns the class.

Parameters:

config (DataSetConfig) – The dataset configuration object, which includes a base_class field under “input” in the YAML file.

Returns:

An instance of a class derived from aiqclib.prepare.step1_read_input.input_base.InputDataSetBase.

Return type:

InputDataSetBase
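The three steps above can be sketched as follows. This is a minimal illustration, not the library's implementation: InputBase, InputAll, FakeConfig, INPUT_REGISTRY and load_input are hypothetical stand-ins, and both the config-only constructor and the ValueError on an unknown name are assumptions.

```python
from typing import Dict, Type


class InputBase:
    """Stand-in for InputDataSetBase (hypothetical, simplified)."""

    def __init__(self, config: "FakeConfig") -> None:
        self.config = config


class InputAll(InputBase):
    """Stand-in for a concrete input class such as InputDataSetAll."""


class FakeConfig:
    """Hypothetical config object exposing only get_base_class()."""

    def __init__(self, base_classes: Dict[str, str]) -> None:
        self._base_classes = base_classes

    def get_base_class(self, step_name: str) -> str:
        return self._base_classes[step_name]


# Registry mapping configured class names to Python classes.
INPUT_REGISTRY: Dict[str, Type[InputBase]] = {"InputDataSetAll": InputAll}


def load_input(config: FakeConfig) -> InputBase:
    class_name = config.get_base_class("input")      # 1. name from config
    dataset_class = INPUT_REGISTRY.get(class_name)   # 2. registry lookup
    if dataset_class is None:
        # Assumed error handling; the real loader may behave differently.
        raise ValueError(f"Unknown input class: {class_name}")
    return dataset_class(config)                     # 3. instantiate


loaded_input = load_input(FakeConfig({"input": "InputDataSetAll"}))
print(type(loaded_input).__name__)  # prints "InputAll"
```

Because the class is resolved by name at call time, switching implementations only requires changing the configured base_class value, provided the new name is present in the registry.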

aiqclib.common.loader.classify_loader.load_classify_step2_summary_dataset(config, input_data=None)[source]

Instantiate an aiqclib.prepare.step2_calc_stats.summary_base.SummaryStatsBase-derived class based on the configuration.

Specifically:

  1. Fetches the class name from the config via DataSetConfig.get_base_class("summary").

  2. Looks up the class in aiqclib.common.loader.classify_registry.SUMMARY_CLASSIFY_REGISTRY.

  3. Instantiates and returns the class, optionally with an input dataset.

Parameters:
  • config (DataSetConfig) – The dataset configuration object referencing the “summary” step.

  • input_data (Optional[DataFrame]) – An optional Polars DataFrame for computing summary statistics.

Returns:

An instance of a class derived from aiqclib.prepare.step2_calc_stats.summary_base.SummaryStatsBase.

Return type:

SummaryStatsBase

aiqclib.common.loader.classify_loader.load_classify_step3_select_dataset(config, input_data=None)[source]

Instantiate an aiqclib.prepare.step3_select_profiles.select_base.ProfileSelectionBase-derived class based on the configuration.

Specifically:

  1. Fetches the class name from the config via DataSetConfig.get_base_class("select").

  2. Looks up the class in aiqclib.common.loader.classify_registry.SELECT_CLASSIFY_REGISTRY.

  3. Instantiates and returns the class, optionally with an input dataset.

Parameters:
  • config (DataSetConfig) – The dataset configuration object referencing the “select” step.

  • input_data (Optional[DataFrame]) – An optional Polars DataFrame for selecting profiles.

Returns:

An instance of a class derived from aiqclib.prepare.step3_select_profiles.select_base.ProfileSelectionBase.

Return type:

ProfileSelectionBase

aiqclib.common.loader.classify_loader.load_classify_step4_locate_dataset(config, input_data=None, selected_profiles=None)[source]

Instantiate an aiqclib.prepare.step4_select_rows.locate_base.LocatePositionBase-derived class based on the configuration.

Specifically:

  1. Fetches the class name from the config via DataSetConfig.get_base_class("locate").

  2. Looks up the class in aiqclib.common.loader.classify_registry.LOCATE_CLASSIFY_REGISTRY.

  3. Instantiates and returns the class, optionally with an input dataset and previously selected profiles.

Parameters:
  • config (DataSetConfig) – The dataset configuration object referencing the “locate” step.

  • input_data (Optional[DataFrame]) – An optional Polars DataFrame containing the data from which location-based subsetting occurs.

  • selected_profiles (Optional[DataFrame]) – An optional Polars DataFrame containing already selected profiles that might be used for filtering additional rows.

Returns:

An instance of a class derived from aiqclib.prepare.step4_select_rows.locate_base.LocatePositionBase.

Return type:

LocatePositionBase

aiqclib.common.loader.classify_loader.load_classify_step5_extract_dataset(config, input_data=None, selected_profiles=None, selected_rows=None, summary_stats=None)[source]

Instantiate an aiqclib.prepare.step5_extract_features.extract_base.ExtractFeatureBase-derived class based on the configuration.

Specifically:

  1. Fetches the class name from the config via DataSetConfig.get_base_class("extract").

  2. Looks up the class in aiqclib.common.loader.classify_registry.EXTRACT_CLASSIFY_REGISTRY.

  3. Instantiates and returns the class, optionally with various intermediate datasets.

Parameters:
  • config (DataSetConfig) – The dataset configuration object referencing the “extract” step.

  • input_data (Optional[DataFrame]) – An optional Polars DataFrame containing the data from which features will be extracted.

  • selected_profiles (Optional[DataFrame]) – An optional Polars DataFrame containing selected profiles, if relevant to feature extraction.

  • selected_rows (Optional[Dict[str, DataFrame]]) – An optional dictionary where keys are target variable names and values are Polars DataFrames identifying rows relevant to each.

  • summary_stats (Optional[DataFrame]) – An optional Polars DataFrame providing summary statistics that might be used for feature scaling or reference.

Returns:

An instance of a class derived from aiqclib.prepare.step5_extract_features.extract_base.ExtractFeatureBase.

Return type:

ExtractFeatureBase
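The optional intermediate datasets can be pictured as keyword arguments that the loader forwards to the constructor unchanged. The sketch below is illustrative only: ExtractBase, ExtractAll, EXTRACT_REGISTRY and load_extract are hypothetical stand-ins, the constructor signature is assumed to mirror the loader's parameters, a plain dict replaces DataSetConfig, lists of dicts replace Polars DataFrames to avoid a third-party dependency, and the target name "temp" is invented.

```python
from typing import Any, Dict, List, Optional, Type

# Stand-in for a Polars DataFrame so the sketch has no dependencies.
Table = List[Dict[str, Any]]


class ExtractBase:
    """Stand-in for ExtractFeatureBase (assumed constructor signature)."""

    def __init__(
        self,
        config: Dict[str, str],
        input_data: Optional[Table] = None,
        selected_profiles: Optional[Table] = None,
        selected_rows: Optional[Dict[str, Table]] = None,
        summary_stats: Optional[Table] = None,
    ) -> None:
        self.config = config
        self.input_data = input_data
        self.selected_profiles = selected_profiles
        self.selected_rows = selected_rows
        self.summary_stats = summary_stats


class ExtractAll(ExtractBase):
    """Stand-in for a concrete class such as ExtractDataSetAll."""


EXTRACT_REGISTRY: Dict[str, Type[ExtractBase]] = {"ExtractDataSetAll": ExtractAll}


def load_extract(
    config: Dict[str, str],
    input_data: Optional[Table] = None,
    selected_profiles: Optional[Table] = None,
    selected_rows: Optional[Dict[str, Table]] = None,
    summary_stats: Optional[Table] = None,
) -> ExtractBase:
    # Resolve the class by its configured name, then forward every dataset.
    extract_class = EXTRACT_REGISTRY[config["extract"]]
    return extract_class(
        config,
        input_data=input_data,
        selected_profiles=selected_profiles,
        selected_rows=selected_rows,
        summary_stats=summary_stats,
    )


# selected_rows maps each target name to the rows relevant to that target.
rows_by_target = {"temp": [{"profile_id": 1, "row": 10}]}
extractor = load_extract({"extract": "ExtractDataSetAll"}, selected_rows=rows_by_target)
print(sorted(extractor.selected_rows))  # prints "['temp']"
print(extractor.summary_stats is None)  # prints "True"
```

Datasets that are not supplied stay None, so how a missing input is handled is left to the instantiated class.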

aiqclib.common.loader.classify_loader.load_classify_step6_classify_dataset(config, test_sets=None)[source]

Instantiate an aiqclib.train.step4_build_model.build_model_base.BuildModelBase-derived class based on the configuration.

Specifically:

  1. Fetches the class name from the config via DataSetConfig.get_base_class("classify").

  2. Looks up the class in aiqclib.common.loader.classify_registry.CLASSIFY_CLASSIFY_REGISTRY.

  3. Instantiates and returns the class, optionally with test datasets.

Parameters:
  • config (DataSetConfig) – The dataset configuration object referencing the “classify” step.

  • test_sets (Optional[Dict[str, DataFrame]]) – An optional dictionary of test datasets where keys are names and values are Polars DataFrames.

Returns:

An instance of a class derived from aiqclib.train.step4_build_model.build_model_base.BuildModelBase.

Return type:

BuildModelBase

aiqclib.common.loader.classify_loader.load_classify_step7_concat_dataset(config, input_data=None, predictions=None)[source]

Instantiate an aiqclib.classify.step7_concat_datasets.concat_base.ConcatDatasetsBase-derived class based on the configuration.

Specifically:

  1. Fetches the class name from the config via DataSetConfig.get_base_class("concat").

  2. Looks up the class in aiqclib.common.loader.classify_registry.CLASSIFY_CONCAT_REGISTRY.

  3. Instantiates and returns the class, optionally with various intermediate datasets.

Parameters:
  • config (DataSetConfig) – The dataset configuration object referencing the “concat” step.

  • input_data (Optional[DataFrame]) – An optional Polars DataFrame representing the original input data.

  • predictions (Optional[Dict[str, DataFrame]]) – An optional dictionary of predictions, where keys are prediction set names and values are Polars DataFrames.

Returns:

An instance of a class derived from aiqclib.classify.step7_concat_datasets.concat_base.ConcatDatasetsBase.

Return type:

ConcatDatasetsBase

aiqclib.common.loader.classify_registry module

Module providing registry dictionaries that map dataset class names (str) to their corresponding Python classes. These registries enable dynamic loading of the correct class for each step of the classification pipeline.

aiqclib.common.loader.classify_registry.CLASSIFY_CLASSIFY_REGISTRY: Dict[str, Type[BuildModelBase]] = {'ClassifyAll': <class 'aiqclib.classify.step6_classify_dataset.dataset_all.ClassifyAll'>, 'ClassifyAllSuite': <class 'aiqclib.classify.step6_classify_dataset.dataset_all_suite.ClassifyAllSuite'>}

A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step6_classify_dataset tasks in the classification pipeline.

Type:

Dict[str, Type[BuildModelBase]]

aiqclib.common.loader.classify_registry.CLASSIFY_CONCAT_REGISTRY: Dict[str, Type[ConcatDatasetsBase]] = {'ConcatDataSetAll': <class 'aiqclib.classify.step7_concat_datasets.dataset_all.ConcatDataSetAll'>, 'ConcatDataSetSuite': <class 'aiqclib.classify.step7_concat_datasets.dataset_suite.ConcatDataSetSuite'>}

A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step7_concat_datasets tasks in the classification pipeline.

Type:

Dict[str, Type[ConcatDatasetsBase]]

aiqclib.common.loader.classify_registry.EXTRACT_CLASSIFY_REGISTRY: Dict[str, Type[ExtractFeatureBase]] = {'ExtractDataSetAll': <class 'aiqclib.classify.step5_extract_features.dataset_all.ExtractDataSetAll'>}

A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step5_extract_features tasks in the classification pipeline.

Type:

Dict[str, Type[ExtractFeatureBase]]

aiqclib.common.loader.classify_registry.INPUT_CLASSIFY_REGISTRY: Dict[str, Type[InputDataSetBase]] = {'InputDataSetAll': <class 'aiqclib.classify.step1_read_input.dataset_all.InputDataSetAll'>}

A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step1_read_input tasks in the classification pipeline.

Type:

Dict[str, Type[InputDataSetBase]]

aiqclib.common.loader.classify_registry.LOCATE_CLASSIFY_REGISTRY: Dict[str, Type[LocatePositionBase]] = {'LocateDataSetAll': <class 'aiqclib.classify.step4_select_rows.dataset_all.LocateDataSetAll'>}

A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step4_select_rows tasks in the classification pipeline.

Type:

Dict[str, Type[LocatePositionBase]]

aiqclib.common.loader.classify_registry.SELECT_CLASSIFY_REGISTRY: Dict[str, Type[ProfileSelectionBase]] = {'SelectDataSetAll': <class 'aiqclib.classify.step3_select_profiles.dataset_all.SelectDataSetAll'>}

A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step3_select_profiles tasks in the classification pipeline.

Type:

Dict[str, Type[ProfileSelectionBase]]

aiqclib.common.loader.classify_registry.SUMMARY_CLASSIFY_REGISTRY: Dict[str, Type[SummaryStatsBase]] = {'SummaryDataSetAll': <class 'aiqclib.classify.step2_calc_stats.dataset_all.SummaryDataSetAll'>}

A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step2_calc_stats tasks in the classification pipeline.

Type:

Dict[str, Type[SummaryStatsBase]]
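A registry of this shape is an ordinary dictionary, so it can be inspected and sanity-checked with plain Python. The following sketch uses hypothetical stand-ins (SummaryBase, SummaryAll, SUMMARY_REGISTRY) and a helper, check_registry, that is not part of aiqclib. It shows one way to confirm that every registered value is a subclass of the expected base.

```python
from typing import Dict, Type


class SummaryBase:
    """Stand-in for SummaryStatsBase."""


class SummaryAll(SummaryBase):
    """Stand-in for SummaryDataSetAll."""


# Same shape as the registries above: class name -> class object.
SUMMARY_REGISTRY: Dict[str, Type[SummaryBase]] = {"SummaryDataSetAll": SummaryAll}


def check_registry(registry: Dict[str, type], base: type) -> None:
    """Raise TypeError if any registered value is not a subclass of base."""
    for name, cls in registry.items():
        if not (isinstance(cls, type) and issubclass(cls, base)):
            raise TypeError(f"{name!r} does not map to a {base.__name__} subclass")


check_registry(SUMMARY_REGISTRY, SummaryBase)

# The keys are the names that a YAML base_class entry may use.
available_names = sorted(SUMMARY_REGISTRY)
print(available_names)  # prints "['SummaryDataSetAll']"
```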

aiqclib.common.loader.dataset_loader module

This module provides factory functions for loading and instantiating various dataset preparation steps within the aiqclib library. It uses a configuration object to determine the specific class to load for each step (e.g., input, summary, select) and retrieves it from a central registry.

Functions within this module facilitate the dynamic creation of dataset preparation objects based on predefined configurations, enabling a flexible and extensible data processing pipeline.

aiqclib.common.loader.dataset_loader.load_step1_input_dataset(config)[source]

Load an InputDataSetBase-derived class based on the configuration.

Uses the subclass name retrieved from YAML via config.get_base_class("input") to fetch the correct class from INPUT_DATASET_REGISTRY, then instantiates it.

Parameters:

config (DataSetConfig) – The dataset configuration object, which includes a base_class field under the “input” step in the YAML.

Returns:

An instantiated object that inherits from InputDataSetBase.

Return type:

InputDataSetBase

aiqclib.common.loader.dataset_loader.load_step2_summary_dataset(config, input_data=None)[source]

Load a SummaryStatsBase-derived class based on the configuration.

Uses the subclass name retrieved from YAML via config.get_base_class("summary") to fetch the correct class from SUMMARY_DATASET_REGISTRY, then instantiates it.

Parameters:
  • config (DataSetConfig) – The dataset configuration object, referencing the “summary” step.

  • input_data (Optional[DataFrame]) – A Polars DataFrame from which summary stats can be computed, defaults to None.

Returns:

An instantiated object that inherits from SummaryStatsBase.

Return type:

SummaryStatsBase

aiqclib.common.loader.dataset_loader.load_step3_select_dataset(config, input_data=None)[source]

Load a ProfileSelectionBase-derived class based on the configuration.

Uses the subclass name retrieved from YAML via config.get_base_class("select") to fetch the correct class from SELECT_DATASET_REGISTRY, then instantiates it.

Parameters:
  • config (DataSetConfig) – The dataset configuration object, referencing the “select” step.

  • input_data (Optional[DataFrame]) – A Polars DataFrame from which profiles can be selected, defaults to None.

Returns:

An instantiated object that inherits from ProfileSelectionBase.

Return type:

ProfileSelectionBase

aiqclib.common.loader.dataset_loader.load_step4_locate_dataset(config, input_data=None, selected_profiles=None)[source]

Load a LocatePositionBase-derived class based on the configuration.

Uses the subclass name retrieved from YAML via config.get_base_class("locate") to fetch the correct class from LOCATE_DATASET_REGISTRY, then instantiates it.

Parameters:
  • config (DataSetConfig) – The dataset configuration object, referencing the “locate” step.

  • input_data (Optional[DataFrame]) – A Polars DataFrame containing data from which locations can be derived, defaults to None.

  • selected_profiles (Optional[DataFrame]) – A Polars DataFrame representing pre-selected profiles, defaults to None.

Returns:

An instantiated object that inherits from LocatePositionBase.

Return type:

LocatePositionBase

aiqclib.common.loader.dataset_loader.load_step5_extract_dataset(config, input_data=None, selected_profiles=None, selected_rows=None, summary_stats=None)[source]

Load an ExtractFeatureBase-derived class based on the configuration.

Uses the subclass name retrieved from YAML via config.get_base_class("extract") to fetch the correct class from EXTRACT_DATASET_REGISTRY, then instantiates it.

Parameters:
  • config (DataSetConfig) – The dataset configuration object, referencing the “extract” step.

  • input_data (Optional[DataFrame]) – An optional Polars DataFrame containing data for extraction steps.

  • selected_profiles (Optional[DataFrame]) – A Polars DataFrame of selected profiles, if applicable.

  • selected_rows (Optional[Dict[str, DataFrame]]) – A dictionary mapping target names (str) to Polars DataFrames of rows to be processed. Defaults to None.

  • summary_stats (Optional[DataFrame]) – A Polars DataFrame containing summary stats for scaling or references.

Returns:

An instantiated object that inherits from ExtractFeatureBase.

Return type:

ExtractFeatureBase

aiqclib.common.loader.dataset_loader.load_step6_split_dataset(config, target_features=None)[source]

Load a SplitDataSetBase-derived class based on the configuration.

Uses the subclass name retrieved from YAML via config.get_base_class("split") to fetch the correct class from SPLIT_DATASET_REGISTRY, then instantiates it.

Parameters:
  • config (DataSetConfig) – The dataset configuration object, referencing the “split” step.

  • target_features (Optional[Dict[str, DataFrame]]) – A dictionary mapping target names (str) to Polars DataFrames containing features to be split into train/test sets or folds. Defaults to None.

Returns:

An instantiated object that inherits from SplitDataSetBase.

Return type:

SplitDataSetBase

aiqclib.common.loader.dataset_registry module

Module providing registry dictionaries that map dataset class names (str) to their corresponding Python classes. These registries enable dynamic loading of the correct class during each preparation step in the pipeline.

aiqclib.common.loader.dataset_registry.EXTRACT_DATASET_REGISTRY: Dict[str, Type[ExtractFeatureBase]] = {'ExtractDataSetA': <class 'aiqclib.prepare.step5_extract_features.dataset_a.ExtractDataSetA'>}

A registry mapping class names (used in YAML config) to their corresponding Python classes for step5_extract_features tasks.

Type:

Dict[str, Type[ExtractFeatureBase]]

aiqclib.common.loader.dataset_registry.INPUT_DATASET_REGISTRY: Dict[str, Type[InputDataSetBase]] = {'InputDataSetA': <class 'aiqclib.prepare.step1_read_input.dataset_a.InputDataSetA'>}

A registry mapping class names (used in YAML config) to their corresponding Python classes for step1_read_input tasks.

Type:

Dict[str, Type[InputDataSetBase]]

aiqclib.common.loader.dataset_registry.LOCATE_DATASET_REGISTRY: Dict[str, Type[LocatePositionBase]] = {'LocateDataSetA': <class 'aiqclib.prepare.step4_select_rows.dataset_a.LocateDataSetA'>, 'LocateDataSetAll': <class 'aiqclib.prepare.step4_select_rows.dataset_all.LocateDataSetAll'>}

A registry mapping class names (used in YAML config) to their corresponding Python classes for step4_select_rows tasks.

Type:

Dict[str, Type[LocatePositionBase]]

aiqclib.common.loader.dataset_registry.SELECT_DATASET_REGISTRY: Dict[str, Type[ProfileSelectionBase]] = {'SelectDataSetA': <class 'aiqclib.prepare.step3_select_profiles.dataset_a.SelectDataSetA'>, 'SelectDataSetAll': <class 'aiqclib.prepare.step3_select_profiles.dataset_all.SelectDataSetAll'>}

A registry mapping class names (used in YAML config) to their corresponding Python classes for step3_select_profiles tasks.

Type:

Dict[str, Type[ProfileSelectionBase]]

aiqclib.common.loader.dataset_registry.SPLIT_DATASET_REGISTRY: Dict[str, Type[SplitDataSetBase]] = {'SplitDataSetA': <class 'aiqclib.prepare.step6_split_dataset.dataset_a.SplitDataSetA'>, 'SplitDataSetAll': <class 'aiqclib.prepare.step6_split_dataset.dataset_all.SplitDataSetAll'>}

A registry mapping class names (used in YAML config) to their corresponding Python classes for step6_split_dataset tasks.

Type:

Dict[str, Type[SplitDataSetBase]]

aiqclib.common.loader.dataset_registry.SUMMARY_DATASET_REGISTRY: Dict[str, Type[SummaryStatsBase]] = {'SummaryDataSetA': <class 'aiqclib.prepare.step2_calc_stats.dataset_a.SummaryDataSetA'>}

A registry mapping class names (used in YAML config) to their corresponding Python classes for step2_calc_stats tasks.

Type:

Dict[str, Type[SummaryStatsBase]]

aiqclib.common.loader.feature_loader module

This module provides utilities for dynamically loading and instantiating feature extraction classes from a predefined registry. It serves as a central point for retrieving specific feature implementations based on configuration details, facilitating a modular approach to feature engineering.

aiqclib.common.loader.feature_loader.load_feature_class(target_name, feature_info, selected_profiles=None, filtered_input=None, selected_rows=None, summary_stats=None)[source]

Instantiate a feature extraction class using the specified feature registry.

This function retrieves the class name from the "feature" key within the provided feature_info dictionary. It then looks up the corresponding class in the global FEATURE_REGISTRY and instantiates it with the supplied parameters.

Parameters:
  • target_name (str) – The target variable or dataset name for which features will be extracted. This is typically a column name or a unique identifier for the data subset.

  • feature_info (Dict) – A dictionary describing the feature extraction procedure. This dictionary must at least include the key "feature" whose value is the string name of the feature class registered in FEATURE_REGISTRY.

  • selected_profiles (Optional[DataFrame]) – An optional Polars DataFrame containing specific profiles (e.g., sample IDs, experiment runs) relevant to the current feature extraction task. Defaults to None.

  • filtered_input (Optional[DataFrame]) – An optional Polars DataFrame already filtered to the relevant observations, available for merges or lookups during feature extraction. Defaults to None.

  • selected_rows (Optional[DataFrame]) – An optional Polars DataFrame containing the specific rows or observations that are the focus for this target’s feature calculation. Defaults to None.

  • summary_stats (Optional[DataFrame]) – An optional Polars DataFrame containing pre-computed summary statistics (e.g., mean, standard deviation) for potential use in scaling, normalization, or transformation steps during feature extraction. Defaults to None.

Returns:

An instance of the requested feature extraction class, which must inherit from FeatureBase.

Return type:

FeatureBase

Raises:

ValueError – If the "feature" key is missing from feature_info or if the specified feature class name is not found in FEATURE_REGISTRY.
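The lookup and its two failure modes can be sketched as below. This is an assumption-laden illustration rather than the library's code: FeatureBase, DayOfYear, FEATURES and load_feature are hypothetical stand-ins, and the constructor signature and error messages are invented.

```python
from typing import Any, Dict, Optional, Type


class FeatureBase:
    """Stand-in for the real FeatureBase (assumed constructor signature)."""

    def __init__(
        self,
        target_name: str,
        feature_info: Dict[str, Any],
        selected_profiles: Optional[Any] = None,
        filtered_input: Optional[Any] = None,
        selected_rows: Optional[Any] = None,
        summary_stats: Optional[Any] = None,
    ) -> None:
        self.target_name = target_name
        self.feature_info = feature_info
        self.selected_profiles = selected_profiles
        self.filtered_input = filtered_input
        self.selected_rows = selected_rows
        self.summary_stats = summary_stats


class DayOfYear(FeatureBase):
    """Stand-in for a registered feature class."""


FEATURES: Dict[str, Type[FeatureBase]] = {"day_of_year": DayOfYear}


def load_feature(target_name: str, feature_info: Dict[str, Any], **datasets: Any) -> FeatureBase:
    feature_name = feature_info.get("feature")
    if feature_name is None:
        # First failure mode: the "feature" key is absent.
        raise ValueError("feature_info must contain a 'feature' key")
    feature_class = FEATURES.get(feature_name)
    if feature_class is None:
        # Second failure mode: the name is not registered.
        raise ValueError(f"Unknown feature class: {feature_name}")
    return feature_class(target_name, feature_info, **datasets)


feature = load_feature("temp", {"feature": "day_of_year"})
print(type(feature).__name__)  # prints "DayOfYear"
```

Raising the same exception type for both cases matches the Raises entry above; the messages are what let a caller distinguish a malformed feature_info from an unregistered name.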

aiqclib.common.loader.feature_registry module

Module defining the global registry for feature classes.

This module provides FEATURE_REGISTRY, a central mapping of string identifiers to specific feature-extraction classes within the aiqclib pipeline. Each entry allows for dynamic loading and instantiation of feature generators based on configuration settings, facilitating the preparation of datasets by applying various data transformations and extractions.

aiqclib.common.loader.feature_registry.FEATURE_REGISTRY: Dict[str, Type[FeatureBase]] = {'basic_values': <class 'aiqclib.prepare.features.basic_values.BasicValues'>, 'day_of_year': <class 'aiqclib.prepare.features.day_of_year.DayOfYearFeat'>, 'flank_down': <class 'aiqclib.prepare.features.flank_down.FlankDown'>, 'flank_up': <class 'aiqclib.prepare.features.flank_up.FlankUp'>, 'location': <class 'aiqclib.prepare.features.location.LocationFeat'>, 'profile_summary_stats': <class 'aiqclib.prepare.features.profile_summary.ProfileSummaryStats'>}

A dictionary mapping feature identifiers (str) to classes that inherit from FeatureBase. These classes are dynamically loaded based on the “feature” key in a feature configuration dictionary.

Type:

Dict[str, Type[FeatureBase]]

aiqclib.common.loader.model_loader module

This module provides utility functions for loading and managing model classes based on configuration settings, typically used in a machine learning or data processing pipeline.

aiqclib.common.loader.model_loader.load_model_class(config)[source]

Retrieve and instantiate a model class for the “model” step from the provided configuration.

This function performs the following steps:

  1. Fetches the class name from the configuration using config.get_base_class("model").

  2. Looks up the corresponding class in the global MODEL_REGISTRY.

  3. Instantiates the found class with the given configuration object as an argument.

Parameters:

config (ConfigBase) – A configuration object that includes a “base_class” entry under the “model” step, specifying which model class to load. This object must implement the get_base_class method.

Returns:

An instantiated model object, which is an instance of a class inheriting from ModelBase.

Return type:

ModelBase

Raises:

ValueError – If the retrieved model class name is not found in the MODEL_REGISTRY.

aiqclib.common.loader.model_loader.load_model_class_with_class_name(config, class_name)[source]

Retrieve and instantiate a specific model class from the registry using its name.

This function looks up the model class by class_name in the global MODEL_REGISTRY and then instantiates it with the provided configuration object.

Parameters:
  • config (ConfigBase) – A configuration object that will be passed to the model class’s constructor upon instantiation.

  • class_name (str) – The string name of the model class to load from the registry. This name should correspond to a key in MODEL_REGISTRY.

Returns:

An instantiated model object, which is an instance of a class inheriting from ModelBase.

Return type:

ModelBase

Raises:

ValueError – If the specified class_name is not found in the MODEL_REGISTRY.

aiqclib.common.loader.model_registry module

This module provides a comprehensive registry of model classes that can be used during training or inference steps.

It aggregates a base single model registry from aiqclib.common.loader.single_model_registry.SINGLE_MODEL_REGISTRY and extends it with specific suite models, offering convenient aliases. Each key in the dictionary corresponds to a model name (string), and each value is the class constructor for that model, typically inheriting from aiqclib.common.base.model_base.ModelBase.

aiqclib.common.loader.model_registry.MODEL_REGISTRY: Dict[str, Type[ModelBase]] = {'DT': <class 'aiqclib.train.models.decision_tree.DecisionTree'>, 'DecisionTree': <class 'aiqclib.train.models.decision_tree.DecisionTree'>, 'GNB': <class 'aiqclib.train.models.gaussian_naive_bayes.GaussianNaiveBayes'>, 'GaussianNaiveBayes': <class 'aiqclib.train.models.gaussian_naive_bayes.GaussianNaiveBayes'>, 'KNN': <class 'aiqclib.train.models.k_nearest_neighbors.KNearestNeighbors'>, 'KNearestNeighbors': <class 'aiqclib.train.models.k_nearest_neighbors.KNearestNeighbors'>, 'LDA': <class 'aiqclib.train.models.linear_discriminant_analysis.LinearDiscriminantAnalysis'>, 'LinearDiscriminantAnalysis': <class 'aiqclib.train.models.linear_discriminant_analysis.LinearDiscriminantAnalysis'>, 'LogisticRegression': <class 'aiqclib.train.models.logistic_regression.LogisticRegression'>, 'Logit': <class 'aiqclib.train.models.logistic_regression.LogisticRegression'>, 'MLP': <class 'aiqclib.train.models.multilayer_perceptron.MultilayerPerceptron'>, 'MS': <class 'aiqclib.train.models.model_suite.ModelSuite'>, 'ModelSuite': <class 'aiqclib.train.models.model_suite.ModelSuite'>, 'MultilayerPerceptron': <class 'aiqclib.train.models.multilayer_perceptron.MultilayerPerceptron'>, 'RF': <class 'aiqclib.train.models.random_forest.RandomForest'>, 'RandomForest': <class 'aiqclib.train.models.random_forest.RandomForest'>, 'SVM': <class 'aiqclib.train.models.support_vector_machine.SupportVectorMachine'>, 'SupportVectorMachine': <class 'aiqclib.train.models.support_vector_machine.SupportVectorMachine'>, 'XGB': <class 'aiqclib.train.models.xgboost.XGBoost'>, 'XGBoost': <class 'aiqclib.train.models.xgboost.XGBoost'>}

A dictionary mapping model names to their corresponding Python classes.

This registry is initialized with models from SINGLE_MODEL_REGISTRY and then updated to include the ModelSuite under both “ModelSuite” and “MS” keys, providing convenient aliases.

The keys are strings (e.g., “XGBoost”, “ModelSuite”), and the values are class objects that inherit from ModelBase.

Type:

Dict[str, Type[ModelBase]]
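The composition described above, a base registry extended with aliased suite entries, can be sketched like this. All names here (the stand-in classes, SINGLE_MODELS, MODELS) are hypothetical. Whether the library copies the base mapping before updating it is not stated in this reference; the sketch assumes a copy.

```python
from typing import Dict, Type


class ModelBase:
    """Stand-in for aiqclib.common.base.model_base.ModelBase."""


class XGBoost(ModelBase):
    """Stand-in for a single-model class."""


class ModelSuite(ModelBase):
    """Stand-in for the suite model class."""


# Base registry: a full name and a short alias point to the same class.
SINGLE_MODELS: Dict[str, Type[ModelBase]] = {"XGBoost": XGBoost, "XGB": XGBoost}

# Copy first (an assumption) so the base mapping is not mutated,
# then add the suite model under both of its names.
MODELS: Dict[str, Type[ModelBase]] = dict(SINGLE_MODELS)
MODELS.update({"ModelSuite": ModelSuite, "MS": ModelSuite})

print(MODELS["MS"] is MODELS["ModelSuite"])  # prints "True"
print("MS" in SINGLE_MODELS)                 # prints "False"
```

Because each alias maps to the very same class object, a configuration may use either spelling and obtain an identical model type.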

aiqclib.common.loader.single_model_loader module

This module provides utility functions for loading and managing model classes based on configuration settings, typically used in a machine learning or data processing pipeline.

aiqclib.common.loader.single_model_loader.load_single_model_class(config)[source]

Retrieve and instantiate a model class for the “model” step from the provided configuration.

This function performs the following steps:

  1. Fetches the class name from the configuration using config.get_base_class("model").

  2. Looks up the corresponding class in the global SINGLE_MODEL_REGISTRY.

  3. Instantiates the found class with the given configuration object as an argument.

Parameters:

config (ConfigBase) – A configuration object that includes a “base_class” entry under the “model” step, specifying which model class to load. This object must implement the get_base_class method.

Returns:

An instantiated model object, which is an instance of a class inheriting from ModelBase.

Return type:

ModelBase

Raises:

ValueError – If the retrieved model class name is not found in the SINGLE_MODEL_REGISTRY.

aiqclib.common.loader.single_model_loader.load_single_model_class_with_class_name(config, class_name)[source]

Retrieve and instantiate a specific model class from the registry using its name.

This function looks up the specified model class name in the global SINGLE_MODEL_REGISTRY and then instantiates it with the provided configuration object.

Parameters:
  • config (ConfigBase) – The configuration object passed to the model class’s constructor upon instantiation. It is not used to look up the class name.

  • class_name (str) – The string name of the model class to retrieve and instantiate. This name must exist as a key in the SINGLE_MODEL_REGISTRY.

Returns:

An instantiated model object, which is an instance of a class inheriting from ModelBase.

Return type:

ModelBase

Raises:

ValueError – If the provided class_name is not found in the SINGLE_MODEL_REGISTRY.

aiqclib.common.loader.single_model_registry module

This module provides a registry of model classes that can be used during training or inference steps. Each key in the dictionary corresponds to a model name (string), and each value is the class constructor for that model.

aiqclib.common.loader.single_model_registry.SINGLE_MODEL_REGISTRY: Dict[str, Type[ModelBase]] = {'DT': <class 'aiqclib.train.models.decision_tree.DecisionTree'>, 'DecisionTree': <class 'aiqclib.train.models.decision_tree.DecisionTree'>, 'GNB': <class 'aiqclib.train.models.gaussian_naive_bayes.GaussianNaiveBayes'>, 'GaussianNaiveBayes': <class 'aiqclib.train.models.gaussian_naive_bayes.GaussianNaiveBayes'>, 'KNN': <class 'aiqclib.train.models.k_nearest_neighbors.KNearestNeighbors'>, 'KNearestNeighbors': <class 'aiqclib.train.models.k_nearest_neighbors.KNearestNeighbors'>, 'LDA': <class 'aiqclib.train.models.linear_discriminant_analysis.LinearDiscriminantAnalysis'>, 'LinearDiscriminantAnalysis': <class 'aiqclib.train.models.linear_discriminant_analysis.LinearDiscriminantAnalysis'>, 'LogisticRegression': <class 'aiqclib.train.models.logistic_regression.LogisticRegression'>, 'Logit': <class 'aiqclib.train.models.logistic_regression.LogisticRegression'>, 'MLP': <class 'aiqclib.train.models.multilayer_perceptron.MultilayerPerceptron'>, 'MS': <class 'aiqclib.train.models.model_suite.ModelSuite'>, 'ModelSuite': <class 'aiqclib.train.models.model_suite.ModelSuite'>, 'MultilayerPerceptron': <class 'aiqclib.train.models.multilayer_perceptron.MultilayerPerceptron'>, 'RF': <class 'aiqclib.train.models.random_forest.RandomForest'>, 'RandomForest': <class 'aiqclib.train.models.random_forest.RandomForest'>, 'SVM': <class 'aiqclib.train.models.support_vector_machine.SupportVectorMachine'>, 'SupportVectorMachine': <class 'aiqclib.train.models.support_vector_machine.SupportVectorMachine'>, 'XGB': <class 'aiqclib.train.models.xgboost.XGBoost'>, 'XGBoost': <class 'aiqclib.train.models.xgboost.XGBoost'>}

A dictionary mapping model names to their corresponding Python classes.

The keys are strings (e.g., “XGBoost”), and the values are class objects that inherit from aiqclib.common.base.model_base.ModelBase.

Type:

Dict[str, Type[ModelBase]]

aiqclib.common.loader.training_loader module

This module provides utility functions for loading and instantiating various training components, such as input training sets, model validation classes, and model build classes. It leverages a registry pattern and a TrainingConfig object to determine the specific class to load for each training step, promoting modularity and configurability in the training pipeline.

aiqclib.common.loader.training_loader.load_step1_input_training_set(config)[source]

Retrieve and instantiate an aiqclib.train.step1_read_input.input_base.InputTrainingSetBase subclass for the “input” step, based on the YAML configuration.

Steps:
  1. Extract the class name with TrainingConfig.get_base_class("input").

  2. Retrieve the corresponding class from aiqclib.common.loader.training_registry.INPUT_TRAINING_SET_REGISTRY.

  3. Instantiate the class and return it.

Parameters:

config (TrainingConfig) – The training configuration object containing a base_class entry under the “input” section.

Returns:

An instantiated object of a class that inherits from aiqclib.train.step1_read_input.input_base.InputTrainingSetBase.

Return type:

InputTrainingSetBase
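
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: StubTrainingConfig, the stand-in classes, and the assumption that the constructor accepts the config object are all hypothetical.

```python
from typing import Dict, Type


class StubTrainingConfig:
    """Hypothetical config exposing only get_base_class(step_name)."""

    def __init__(self, base_classes: Dict[str, str]) -> None:
        self._base_classes = base_classes

    def get_base_class(self, step_name: str) -> str:
        return self._base_classes[step_name]


class InputTrainingSetBase:
    """Stand-in base class; the real constructor signature may differ."""

    def __init__(self, config: StubTrainingConfig) -> None:
        self.config = config


class InputTrainingSetA(InputTrainingSetBase):
    """Stand-in for the registered 'InputTrainingSetA' implementation."""


INPUT_TRAINING_SET_REGISTRY: Dict[str, Type[InputTrainingSetBase]] = {
    "InputTrainingSetA": InputTrainingSetA,
}


def load_input_training_set(config: StubTrainingConfig) -> InputTrainingSetBase:
    class_name = config.get_base_class("input")    # 1. class name from config
    cls = INPUT_TRAINING_SET_REGISTRY[class_name]  # 2. registry lookup
    return cls(config)                             # 3. instantiate and return


config = StubTrainingConfig({"input": "InputTrainingSetA"})
dataset = load_input_training_set(config)
```

In this sketch an unregistered class name surfaces as a KeyError from the dictionary lookup; how the real loader reports that case is not documented here.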

aiqclib.common.loader.training_loader.load_step2_model_validation_class(config, training_sets=None)[source]

Retrieve and instantiate an aiqclib.train.step2_validate_model.validate_base.ValidationBase subclass for the “validate” step, based on the YAML configuration.

Steps:
  1. Extract the class name with TrainingConfig.get_base_class("validate").

  2. Retrieve the corresponding class from aiqclib.common.loader.training_registry.MODEL_VALIDATION_REGISTRY.

  3. Instantiate the class, optionally passing the provided training sets.

Parameters:
  • config (TrainingConfig) – The training configuration object referencing a base_class under the “validate” section.

  • training_sets (Optional[dict[str, DataFrame]]) – A dictionary of Polars DataFrames containing data for model validation. Defaults to None. Keys typically represent data categories (e.g., “train”, “test”).

Returns:

An instantiated object of a class that inherits from aiqclib.train.step2_validate_model.validate_base.ValidationBase.

Return type:

ValidationBase
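
To illustrate the optional training_sets argument, here is a small sketch that forwards it unchanged to the constructor. The names are stand-ins, plain lists take the place of Polars DataFrames, and the constructor signature is an assumption; load_validation is a hypothetical helper that takes the class name directly to keep the example short.

```python
from typing import Any, Dict, Optional, Type


class ValidationBase:
    """Stand-in base class; the real constructor signature may differ."""

    def __init__(
        self,
        config: Dict[str, Any],
        training_sets: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.config = config
        self.training_sets = training_sets


class KFoldValidation(ValidationBase):
    """Stand-in for the registered 'KFoldValidation' implementation."""


MODEL_VALIDATION_REGISTRY: Dict[str, Type[ValidationBase]] = {
    "KFoldValidation": KFoldValidation,
}


def load_validation(
    class_name: str,
    config: Dict[str, Any],
    training_sets: Optional[Dict[str, Any]] = None,
) -> ValidationBase:
    """Hypothetical helper: look up the class and pass training_sets through."""
    cls = MODEL_VALIDATION_REGISTRY[class_name]
    return cls(config, training_sets=training_sets)


# Without data: the object is created and can receive data later.
empty = load_validation("KFoldValidation", config={})

# With data: lists stand in for the Polars DataFrames described above.
filled = load_validation(
    "KFoldValidation",
    config={},
    training_sets={"train": [1, 2, 3], "test": [4]},
)
```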

aiqclib.common.loader.training_loader.load_step4_build_model_class(config, training_sets=None, test_sets=None)[source]

Retrieve and instantiate an aiqclib.train.step4_build_model.build_model_base.BuildModelBase subclass for the “build” step, based on the YAML configuration.

Steps:
  1. Extract the class name with TrainingConfig.get_base_class("build").

  2. Retrieve the corresponding class from aiqclib.common.loader.training_registry.BUILD_MODEL_REGISTRY.

  3. Instantiate the class, providing any training and test sets.

Parameters:
  • config (TrainingConfig) – The training configuration object referencing a base_class under the “build” section.

  • training_sets (Optional[dict[str, DataFrame]]) – A dictionary of Polars DataFrames of training data. Defaults to None. Keys typically represent data categories (e.g., “features”, “target”).

  • test_sets (Optional[dict[str, DataFrame]]) – A dictionary of Polars DataFrames of test data. Defaults to None. Keys typically represent data categories (e.g., “features”, “target”).

Returns:

An instantiated object of a class that inherits from aiqclib.train.step4_build_model.build_model_base.BuildModelBase.

Return type:

BuildModelBase
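
Because the validation and build loaders follow the same lookup-then-instantiate shape, the pattern can be expressed with one generic helper. The sketch below is an assumption-laden illustration: instantiate_from_registry is hypothetical, the classes are stand-ins, and lists replace Polars DataFrames.

```python
from typing import Any, Dict, Optional, Type, TypeVar

T = TypeVar("T")


def instantiate_from_registry(
    registry: Dict[str, Type[T]], class_name: str, *args: Any, **kwargs: Any
) -> T:
    """Hypothetical helper: resolve class_name in registry and construct it."""
    return registry[class_name](*args, **kwargs)


class BuildModelBase:
    """Stand-in base class; the real constructor signature may differ."""

    def __init__(
        self,
        config: Dict[str, Any],
        training_sets: Optional[Dict[str, Any]] = None,
        test_sets: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.config = config
        self.training_sets = training_sets
        self.test_sets = test_sets


class BuildModel(BuildModelBase):
    """Stand-in for the registered 'BuildModel' implementation."""


class BuildModelSuite(BuildModelBase):
    """Stand-in for the registered 'BuildModelSuite' implementation."""


BUILD_MODEL_REGISTRY: Dict[str, Type[BuildModelBase]] = {
    "BuildModel": BuildModel,
    "BuildModelSuite": BuildModelSuite,
}

builder = instantiate_from_registry(
    BUILD_MODEL_REGISTRY,
    "BuildModelSuite",
    {},  # placeholder for the training configuration
    training_sets={"features": [[0.1], [0.9]], "target": [0, 1]},
    test_sets={"features": [[0.5]], "target": [1]},
)
```

Switching the second argument to "BuildModel" selects the other implementation without touching the calling code.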

aiqclib.common.loader.training_registry module

This module provides centralized registries for various training components, including dataset readers, model validation strategies, and model-building classes. Each registry is a dictionary mapping string keys (typically from configuration files) to their corresponding Python class implementations, facilitating flexible and extensible model training workflows.

aiqclib.common.loader.training_registry.BUILD_MODEL_REGISTRY: Dict[str, Type[BuildModelBase]] = {'BuildModel': <class 'aiqclib.train.step4_build_model.build_model.BuildModel'>, 'BuildModelSuite': <class 'aiqclib.train.step4_build_model.build_model_suite.BuildModelSuite'>}

Registry mapping string keys to concrete implementations of BuildModelBase.

This dictionary enables the dynamic selection of model-building classes based on configuration settings, for example choosing between BuildModel and BuildModelSuite.

aiqclib.common.loader.training_registry.INPUT_TRAINING_SET_REGISTRY: Dict[str, Type[InputTrainingSetBase]] = {'InputTrainingSetA': <class 'aiqclib.train.step1_read_input.dataset_a.InputTrainingSetA'>}

Registry mapping string keys to concrete implementations of InputTrainingSetBase.

This dictionary facilitates the dynamic selection of dataset reading classes based on configuration settings, allowing for easy extension and customization of input data handling.
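
One way such a registry could be extended is by adding a new key that points at a custom subclass. The sketch below is hypothetical throughout: register_input_training_set and InputTrainingSetB do not exist in aiqclib, and whether the library intends its registries to be modified at run time is an assumption.

```python
from typing import Dict, Type


class InputTrainingSetBase:
    """Stand-in for the real base class."""


class InputTrainingSetA(InputTrainingSetBase):
    """Stand-in for the registered 'InputTrainingSetA' implementation."""


INPUT_TRAINING_SET_REGISTRY: Dict[str, Type[InputTrainingSetBase]] = {
    "InputTrainingSetA": InputTrainingSetA,
}


def register_input_training_set(
    name: str, cls: Type[InputTrainingSetBase]
) -> None:
    """Hypothetical helper: add cls under name after basic sanity checks."""
    if not issubclass(cls, InputTrainingSetBase):
        raise TypeError(f"{cls.__name__} must inherit from InputTrainingSetBase")
    if name in INPUT_TRAINING_SET_REGISTRY:
        raise ValueError(f"{name!r} is already registered")
    INPUT_TRAINING_SET_REGISTRY[name] = cls


class InputTrainingSetB(InputTrainingSetBase):
    """Hypothetical custom dataset reader."""


register_input_training_set("InputTrainingSetB", InputTrainingSetB)
```

After registration, a configuration whose “input” base_class is set to the new key would resolve to the custom class through the same lookup as the built-in entry.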

aiqclib.common.loader.training_registry.MODEL_VALIDATION_REGISTRY: Dict[str, Type[ValidationBase]] = {'KFoldValidation': <class 'aiqclib.train.step2_validate_model.kfold_validation.KFoldValidation'>, 'KFoldValidationSuite': <class 'aiqclib.train.step2_validate_model.kfold_validation_suite.KFoldValidationSuite'>}

Registry mapping string keys to concrete implementations of ValidationBase.

This dictionary allows for the dynamic selection of model validation strategies based on configuration settings, supporting various evaluation methodologies.