aiqclib.common.loader package
Submodules
aiqclib.common.loader.classify_loader module
This module provides factory functions to dynamically load and instantiate various dataset processing steps within a classification pipeline.
It uses a configuration object (aiqclib.common.config.dataset_config.DataSetConfig)
to determine which specific implementation of a base class (e.g., aiqclib.prepare.step1_read_input.input_base.InputDataSetBase)
should be loaded for each step. The module relies on several global registries
to map class names from the configuration to their respective Python types.
- aiqclib.common.loader.classify_loader.load_classify_step1_input_dataset(config)
Instantiate an aiqclib.prepare.step1_read_input.input_base.InputDataSetBase-derived class based on the configuration.
Specifically:
Fetches the class name from the config via DataSetConfig.get_base_class("input").
Looks up the class in aiqclib.common.loader.classify_registry.INPUT_CLASSIFY_REGISTRY.
Instantiates and returns the class.
- Parameters:
config (DataSetConfig) – The dataset configuration object, which includes a base_class field under “input” in the YAML file.
- Returns:
An instance of a class derived from aiqclib.prepare.step1_read_input.input_base.InputDataSetBase.
- Return type:
InputDataSetBase
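The three steps listed for this loader (fetch the name, look it up, instantiate) can be sketched in isolation. The snippet below is a minimal illustration, not the library's code: `StubConfig`, `InputBase`, `InputAll`, `INPUT_REGISTRY` and `load_input` are hypothetical stand-ins, and passing the config to the constructor is an assumption.

```python
from typing import Dict, Type


class StubConfig:
    """Hypothetical stand-in for DataSetConfig; only get_base_class is modeled."""

    def __init__(self, base_classes: Dict[str, str]) -> None:
        self._base_classes = base_classes

    def get_base_class(self, step_name: str) -> str:
        # Return the class name configured for the given step.
        return self._base_classes[step_name]


class InputBase:
    """Hypothetical stand-in for InputDataSetBase."""

    def __init__(self, config: StubConfig) -> None:
        self.config = config


class InputAll(InputBase):
    """Hypothetical concrete input step."""


# Hypothetical registry: class name as written in the config -> class object.
INPUT_REGISTRY: Dict[str, Type[InputBase]] = {"InputAll": InputAll}


def load_input(config: StubConfig) -> InputBase:
    class_name = config.get_base_class("input")  # 1. fetch the class name
    dataset_class = INPUT_REGISTRY[class_name]   # 2. look it up in the registry
    return dataset_class(config)                 # 3. instantiate and return


step1 = load_input(StubConfig({"input": "InputAll"}))
print(type(step1).__name__)  # prints: InputAll
```

Keeping the name-to-class mapping in a dictionary means a new implementation only needs a new registry entry; the loader function itself does not change.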
- aiqclib.common.loader.classify_loader.load_classify_step2_summary_dataset(config, input_data=None)
Instantiate an aiqclib.prepare.step2_calc_stats.summary_base.SummaryStatsBase-derived class based on the configuration.
Specifically:
Fetches the class name from the config via DataSetConfig.get_base_class("summary").
Looks up the class in aiqclib.common.loader.classify_registry.SUMMARY_CLASSIFY_REGISTRY.
Instantiates and returns the class, optionally with an input dataset.
- Parameters:
config (DataSetConfig) – The dataset configuration object referencing the “summary” step.
input_data (Optional[DataFrame]) – An optional Polars DataFrame for computing summary statistics.
- Returns:
An instance of a class derived from aiqclib.prepare.step2_calc_stats.summary_base.SummaryStatsBase.
- Return type:
SummaryStatsBase
- aiqclib.common.loader.classify_loader.load_classify_step3_select_dataset(config, input_data=None)
Instantiate an aiqclib.prepare.step3_select_profiles.select_base.ProfileSelectionBase-derived class based on the configuration.
Specifically:
Fetches the class name from the config via DataSetConfig.get_base_class("select").
Looks up the class in aiqclib.common.loader.classify_registry.SELECT_CLASSIFY_REGISTRY.
Instantiates and returns the class, optionally with an input dataset.
- Parameters:
config (DataSetConfig) – The dataset configuration object referencing the “select” step.
input_data (Optional[DataFrame]) – An optional Polars DataFrame for selecting profiles.
- Returns:
An instance of a class derived from aiqclib.prepare.step3_select_profiles.select_base.ProfileSelectionBase.
- Return type:
ProfileSelectionBase
- aiqclib.common.loader.classify_loader.load_classify_step4_locate_dataset(config, input_data=None, selected_profiles=None)
Instantiate an aiqclib.prepare.step4_select_rows.locate_base.LocatePositionBase-derived class based on the configuration.
Specifically:
Fetches the class name from the config via DataSetConfig.get_base_class("locate").
Looks up the class in aiqclib.common.loader.classify_registry.LOCATE_CLASSIFY_REGISTRY.
Instantiates and returns the class, optionally with an input dataset and previously selected profiles.
- Parameters:
config (DataSetConfig) – The dataset configuration object referencing the “locate” step.
input_data (Optional[DataFrame]) – An optional Polars DataFrame containing the data to be subset by location.
selected_profiles (Optional[DataFrame]) – An optional Polars DataFrame containing already selected profiles that might be used for filtering additional rows.
- Returns:
An instance of a class derived from aiqclib.prepare.step4_select_rows.locate_base.LocatePositionBase.
- Return type:
LocatePositionBase
- aiqclib.common.loader.classify_loader.load_classify_step5_extract_dataset(config, input_data=None, selected_profiles=None, selected_rows=None, summary_stats=None)
Instantiate an aiqclib.prepare.step5_extract_features.extract_base.ExtractFeatureBase-derived class based on the configuration.
Specifically:
Fetches the class name from the config via DataSetConfig.get_base_class("extract").
Looks up the class in aiqclib.common.loader.classify_registry.EXTRACT_CLASSIFY_REGISTRY.
Instantiates and returns the class, optionally with various intermediate datasets.
- Parameters:
config (DataSetConfig) – The dataset configuration object referencing the “extract” step.
input_data (Optional[DataFrame]) – An optional Polars DataFrame containing the data from which features will be extracted.
selected_profiles (Optional[DataFrame]) – An optional Polars DataFrame containing selected profiles, if relevant to feature extraction.
selected_rows (Optional[Dict[str,DataFrame]]) – An optional dictionary where keys are target variable names and values are Polars DataFrames identifying the rows relevant to each target.
summary_stats (Optional[DataFrame]) – An optional Polars DataFrame providing summary statistics that might be used for feature scaling or reference.
- Returns:
An instance of a class derived from aiqclib.prepare.step5_extract_features.extract_base.ExtractFeatureBase.
- Return type:
ExtractFeatureBase
- aiqclib.common.loader.classify_loader.load_classify_step6_classify_dataset(config, test_sets=None)
Instantiate an aiqclib.train.step4_build_model.build_model_base.BuildModelBase-derived class based on the configuration.
Specifically:
Fetches the class name from the config via DataSetConfig.get_base_class("classify").
Looks up the class in aiqclib.common.loader.classify_registry.CLASSIFY_CLASSIFY_REGISTRY.
Instantiates and returns the class, optionally with test datasets.
- Parameters:
config (DataSetConfig) – The dataset configuration object referencing the “classify” step.
test_sets (Optional[Dict[str,DataFrame]]) – An optional dictionary of test datasets where keys are names and values are Polars DataFrames.
- Returns:
An instance of a class derived from aiqclib.train.step4_build_model.build_model_base.BuildModelBase.
- Return type:
BuildModelBase
- aiqclib.common.loader.classify_loader.load_classify_step7_concat_dataset(config, input_data=None, predictions=None)
Instantiate an aiqclib.classify.step7_concat_datasets.concat_base.ConcatDatasetsBase-derived class based on the configuration.
Specifically:
Fetches the class name from the config via DataSetConfig.get_base_class("concat").
Looks up the class in aiqclib.common.loader.classify_registry.CLASSIFY_CONCAT_REGISTRY.
Instantiates and returns the class, optionally with the input data and predictions.
- Parameters:
config (DataSetConfig) – The dataset configuration object referencing the “concat” step.
input_data (Optional[DataFrame]) – An optional Polars DataFrame representing the original input data.
predictions (Optional[Dict[str,DataFrame]]) – An optional dictionary of predictions, where keys are prediction set names and values are Polars DataFrames.
- Returns:
An instance of a class derived from aiqclib.classify.step7_concat_datasets.concat_base.ConcatDatasetsBase.
- Return type:
ConcatDatasetsBase
aiqclib.common.loader.classify_registry module
Module providing registry dictionaries that map dataset class names (str) to their corresponding Python classes. These registries enable dynamic loading of the correct class during each preparation step in the pipeline.
- aiqclib.common.loader.classify_registry.CLASSIFY_CLASSIFY_REGISTRY: Dict[str, Type[BuildModelBase]] = {'ClassifyAll': <class 'aiqclib.classify.step6_classify_dataset.dataset_all.ClassifyAll'>, 'ClassifyAllSuite': <class 'aiqclib.classify.step6_classify_dataset.dataset_all_suite.ClassifyAllSuite'>}
A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step6_classify_dataset tasks in the classification pipeline.
- Type:
Dict[str, Type[BuildModelBase]]
- aiqclib.common.loader.classify_registry.CLASSIFY_CONCAT_REGISTRY: Dict[str, Type[ConcatDatasetsBase]] = {'ConcatDataSetAll': <class 'aiqclib.classify.step7_concat_datasets.dataset_all.ConcatDataSetAll'>, 'ConcatDataSetSuite': <class 'aiqclib.classify.step7_concat_datasets.dataset_suite.ConcatDataSetSuite'>}
A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step7_concat_datasets tasks in the classification pipeline.
- Type:
Dict[str, Type[ConcatDatasetsBase]]
- aiqclib.common.loader.classify_registry.EXTRACT_CLASSIFY_REGISTRY: Dict[str, Type[ExtractFeatureBase]] = {'ExtractDataSetAll': <class 'aiqclib.classify.step5_extract_features.dataset_all.ExtractDataSetAll'>}
A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step5_extract_features tasks in the classification pipeline.
- Type:
Dict[str, Type[ExtractFeatureBase]]
- aiqclib.common.loader.classify_registry.INPUT_CLASSIFY_REGISTRY: Dict[str, Type[InputDataSetBase]] = {'InputDataSetAll': <class 'aiqclib.classify.step1_read_input.dataset_all.InputDataSetAll'>}
A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step1_read_input tasks in the classification pipeline.
- Type:
Dict[str, Type[InputDataSetBase]]
- aiqclib.common.loader.classify_registry.LOCATE_CLASSIFY_REGISTRY: Dict[str, Type[LocatePositionBase]] = {'LocateDataSetAll': <class 'aiqclib.classify.step4_select_rows.dataset_all.LocateDataSetAll'>}
A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step4_select_rows tasks in the classification pipeline.
- Type:
Dict[str, Type[LocatePositionBase]]
- aiqclib.common.loader.classify_registry.SELECT_CLASSIFY_REGISTRY: Dict[str, Type[ProfileSelectionBase]] = {'SelectDataSetAll': <class 'aiqclib.classify.step3_select_profiles.dataset_all.SelectDataSetAll'>}
A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step3_select_profiles tasks in the classification pipeline.
- Type:
Dict[str, Type[ProfileSelectionBase]]
- aiqclib.common.loader.classify_registry.SUMMARY_CLASSIFY_REGISTRY: Dict[str, Type[SummaryStatsBase]] = {'SummaryDataSetAll': <class 'aiqclib.classify.step2_calc_stats.dataset_all.SummaryDataSetAll'>}
A registry mapping class names (as strings, typically from YAML configuration) to their corresponding Python classes for step2_calc_stats tasks in the classification pipeline.
- Type:
Dict[str, Type[SummaryStatsBase]]
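Because every registry is typed as a mapping from names to subclasses of one base class, a simple consistency check is possible. The sketch below is illustrative only: `SummaryBase`, `SummaryAll`, `SUMMARY_REGISTRY` and `invalid_entries` are hypothetical names, not part of aiqclib.

```python
from typing import Dict, List, Mapping, Type


class SummaryBase:
    """Hypothetical stand-in for SummaryStatsBase."""


class SummaryAll(SummaryBase):
    """Hypothetical concrete summary step."""


# Hypothetical registry in the same shape as the ones documented above.
SUMMARY_REGISTRY: Dict[str, Type[SummaryBase]] = {"SummaryAll": SummaryAll}


def invalid_entries(registry: Mapping[str, object], base: type) -> List[str]:
    """Return the keys whose value is not a subclass of ``base``."""
    return [
        name
        for name, cls in registry.items()
        if not (isinstance(cls, type) and issubclass(cls, base))
    ]


print(invalid_entries(SUMMARY_REGISTRY, SummaryBase))  # prints: []
print(invalid_entries({"Broken": dict}, SummaryBase))  # prints: ['Broken']
```

A check like this could run in a unit test so that a mistyped registry entry is caught before a configuration ever refers to it.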
aiqclib.common.loader.dataset_loader module
This module provides factory functions for loading and instantiating various dataset preparation steps within the aiqclib library. It uses a configuration object to determine the specific class to load for each step (e.g., input, summary, select) and retrieves it from a central registry.
Functions within this module facilitate the dynamic creation of dataset preparation objects based on predefined configurations, enabling a flexible and extensible data processing pipeline.
- aiqclib.common.loader.dataset_loader.load_step1_input_dataset(config)
Load an InputDataSetBase-derived class based on the configuration.
Uses the subclass name retrieved from YAML via config.get_base_class("input") to fetch the correct class from INPUT_DATASET_REGISTRY, then instantiates it.
- Parameters:
config (DataSetConfig) – The dataset configuration object, which includes a base_class field under the “input” step in the YAML.
- Returns:
An instantiated object that inherits from InputDataSetBase.
- Return type:
InputDataSetBase
- aiqclib.common.loader.dataset_loader.load_step2_summary_dataset(config, input_data=None)
Load a SummaryStatsBase-derived class based on the configuration.
Uses the subclass name retrieved from YAML via config.get_base_class("summary") to fetch the correct class from SUMMARY_DATASET_REGISTRY, then instantiates it.
- Parameters:
config (DataSetConfig) – The dataset configuration object, referencing the “summary” step.
input_data (Optional[DataFrame]) – A Polars DataFrame from which summary stats can be computed, defaults to None.
- Returns:
An instantiated object that inherits from SummaryStatsBase.
- Return type:
SummaryStatsBase
- aiqclib.common.loader.dataset_loader.load_step3_select_dataset(config, input_data=None)
Load a ProfileSelectionBase-derived class based on the configuration.
Uses the subclass name retrieved from YAML via config.get_base_class("select") to fetch the correct class from SELECT_DATASET_REGISTRY, then instantiates it.
- Parameters:
config (DataSetConfig) – The dataset configuration object, referencing the “select” step.
input_data (Optional[DataFrame]) – A Polars DataFrame from which profiles can be selected, defaults to None.
- Returns:
An instantiated object that inherits from ProfileSelectionBase.
- Return type:
ProfileSelectionBase
- aiqclib.common.loader.dataset_loader.load_step4_locate_dataset(config, input_data=None, selected_profiles=None)
Load a LocatePositionBase-derived class based on the configuration.
Uses the subclass name retrieved from YAML via config.get_base_class("locate") to fetch the correct class from LOCATE_DATASET_REGISTRY, then instantiates it.
- Parameters:
config (DataSetConfig) – The dataset configuration object, referencing the “locate” step.
input_data (Optional[DataFrame]) – A Polars DataFrame containing data from which locations can be derived, defaults to None.
selected_profiles (Optional[DataFrame]) – A Polars DataFrame representing pre-selected profiles, defaults to None.
- Returns:
An instantiated object that inherits from LocatePositionBase.
- Return type:
LocatePositionBase
- aiqclib.common.loader.dataset_loader.load_step5_extract_dataset(config, input_data=None, selected_profiles=None, selected_rows=None, summary_stats=None)
Load an ExtractFeatureBase-derived class based on the configuration.
Uses the subclass name retrieved from YAML via config.get_base_class("extract") to fetch the correct class from EXTRACT_DATASET_REGISTRY, then instantiates it.
- Parameters:
config (DataSetConfig) – The dataset configuration object, referencing the “extract” step.
input_data (Optional[DataFrame]) – A Polars DataFrame containing data for extraction steps, defaults to None.
selected_profiles (Optional[DataFrame]) – A Polars DataFrame of selected profiles, if applicable, defaults to None.
selected_rows (Optional[Dict[str,DataFrame]]) – A dictionary mapping target names (str) to Polars DataFrames of rows to be processed, defaults to None.
summary_stats (Optional[DataFrame]) – A Polars DataFrame containing summary stats for scaling or reference, defaults to None.
- Returns:
An instantiated object that inherits from ExtractFeatureBase.
- Return type:
ExtractFeatureBase
- aiqclib.common.loader.dataset_loader.load_step6_split_dataset(config, target_features=None)
Load a SplitDataSetBase-derived class based on the configuration.
Uses the subclass name retrieved from YAML via config.get_base_class("split") to fetch the correct class from SPLIT_DATASET_REGISTRY, then instantiates it.
- Parameters:
config (DataSetConfig) – The dataset configuration object, referencing the “split” step.
target_features (Optional[Dict[str,DataFrame]]) – A dictionary mapping target names (str) to Polars DataFrames containing features to be split into train/test sets or folds, defaults to None.
- Returns:
An instantiated object that inherits from SplitDataSetBase.
- Return type:
SplitDataSetBase
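Most loaders in this module accept optional intermediate datasets and hand them to the class they instantiate. A minimal sketch of that forwarding follows, under stated assumptions: a plain dictionary stands in for DataSetConfig, nested lists stand in for Polars DataFrames, and `SplitBase`, `SplitAll`, `SPLIT_REGISTRY` and `load_split` are hypothetical names.

```python
from typing import Any, Dict, Optional, Type


class SplitBase:
    """Hypothetical stand-in for SplitDataSetBase."""

    def __init__(
        self,
        config: Dict[str, Any],
        target_features: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.config = config
        # None means the data will be supplied later in the pipeline.
        self.target_features = target_features


class SplitAll(SplitBase):
    """Hypothetical concrete split step."""


SPLIT_REGISTRY: Dict[str, Type[SplitBase]] = {"SplitAll": SplitAll}


def load_split(
    config: Dict[str, Any],
    target_features: Optional[Dict[str, Any]] = None,
) -> SplitBase:
    # Assumed config layout: {"split": {"base_class": "<registry key>"}}.
    class_name = config["split"]["base_class"]
    split_class = SPLIT_REGISTRY[class_name]
    # Forward the optional data unchanged to the constructor.
    return split_class(config, target_features=target_features)


split_config = {"split": {"base_class": "SplitAll"}}
without_data = load_split(split_config)
with_data = load_split(split_config, target_features={"temp": [[1.0, 2.0]]})
```

Defaulting the data arguments to None lets the same loader serve both cases: building the step object up front, or building it once the upstream results exist.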
aiqclib.common.loader.dataset_registry module
Module providing registry dictionaries that map dataset class names (str) to their corresponding Python classes. These registries enable dynamic loading of the correct class during each preparation step in the pipeline.
- aiqclib.common.loader.dataset_registry.EXTRACT_DATASET_REGISTRY: Dict[str, Type[ExtractFeatureBase]] = {'ExtractDataSetA': <class 'aiqclib.prepare.step5_extract_features.dataset_a.ExtractDataSetA'>}
A registry mapping class names (used in YAML config) to their corresponding Python classes for step5_extract_features tasks.
- Type:
Dict[str, Type[ExtractFeatureBase]]
- aiqclib.common.loader.dataset_registry.INPUT_DATASET_REGISTRY: Dict[str, Type[InputDataSetBase]] = {'InputDataSetA': <class 'aiqclib.prepare.step1_read_input.dataset_a.InputDataSetA'>}
A registry mapping class names (used in YAML config) to their corresponding Python classes for step1_read_input tasks.
- Type:
Dict[str, Type[InputDataSetBase]]
- aiqclib.common.loader.dataset_registry.LOCATE_DATASET_REGISTRY: Dict[str, Type[LocatePositionBase]] = {'LocateDataSetA': <class 'aiqclib.prepare.step4_select_rows.dataset_a.LocateDataSetA'>, 'LocateDataSetAll': <class 'aiqclib.prepare.step4_select_rows.dataset_all.LocateDataSetAll'>}
A registry mapping class names (used in YAML config) to their corresponding Python classes for step4_select_rows tasks.
- Type:
Dict[str, Type[LocatePositionBase]]
- aiqclib.common.loader.dataset_registry.SELECT_DATASET_REGISTRY: Dict[str, Type[ProfileSelectionBase]] = {'SelectDataSetA': <class 'aiqclib.prepare.step3_select_profiles.dataset_a.SelectDataSetA'>, 'SelectDataSetAll': <class 'aiqclib.prepare.step3_select_profiles.dataset_all.SelectDataSetAll'>}
A registry mapping class names (used in YAML config) to their corresponding Python classes for step3_select_profiles tasks.
- Type:
Dict[str, Type[ProfileSelectionBase]]
- aiqclib.common.loader.dataset_registry.SPLIT_DATASET_REGISTRY: Dict[str, Type[SplitDataSetBase]] = {'SplitDataSetA': <class 'aiqclib.prepare.step6_split_dataset.dataset_a.SplitDataSetA'>, 'SplitDataSetAll': <class 'aiqclib.prepare.step6_split_dataset.dataset_all.SplitDataSetAll'>}
A registry mapping class names (used in YAML config) to their corresponding Python classes for step6_split_dataset tasks.
- Type:
Dict[str, Type[SplitDataSetBase]]
- aiqclib.common.loader.dataset_registry.SUMMARY_DATASET_REGISTRY: Dict[str, Type[SummaryStatsBase]] = {'SummaryDataSetA': <class 'aiqclib.prepare.step2_calc_stats.dataset_a.SummaryDataSetA'>}
A registry mapping class names (used in YAML config) to their corresponding Python classes for step2_calc_stats tasks.
- Type:
Dict[str, Type[SummaryStatsBase]]
aiqclib.common.loader.feature_loader module
This module provides utilities for dynamically loading and instantiating feature extraction classes from a predefined registry. It serves as a central point for retrieving specific feature implementations based on configuration details, facilitating a modular approach to feature engineering.
- aiqclib.common.loader.feature_loader.load_feature_class(target_name, feature_info, selected_profiles=None, filtered_input=None, selected_rows=None, summary_stats=None)
Instantiate a feature extraction class using the specified feature registry.
This function retrieves the class name from the "feature" key within the provided feature_info dictionary. It then looks up the corresponding class in the global FEATURE_REGISTRY and instantiates it with the supplied parameters.
- Parameters:
target_name (str) – The target variable or dataset name for which features will be extracted. This is typically a column name or a unique identifier for the data subset.
feature_info (Dict) – A dictionary describing the feature extraction procedure. It must include at least the key "feature", whose value is the string name of the feature class registered in FEATURE_REGISTRY.
selected_profiles (Optional[DataFrame]) – An optional Polars DataFrame containing specific profiles (e.g., sample IDs, experiment runs) relevant to the current feature extraction task. Defaults to None.
filtered_input (Optional[DataFrame]) – An optional Polars DataFrame containing data that has already been filtered down to relevant observations, for merging or lookups within the feature extraction process. Defaults to None.
selected_rows (Optional[DataFrame]) – An optional Polars DataFrame containing the specific rows or observations that are the focus of this target’s feature calculation. Defaults to None.
summary_stats (Optional[DataFrame]) – An optional Polars DataFrame containing pre-computed summary statistics (e.g., mean, standard deviation) for potential use in scaling, normalization, or transformation steps during feature extraction. Defaults to None.
- Returns:
An instance of the requested feature extraction class, which must inherit from FeatureBase.
- Return type:
FeatureBase
- Raises:
ValueError – If the "feature" key is missing from feature_info or if the specified feature class name is not found in FEATURE_REGISTRY.
aiqclib.common.loader.feature_registry module
Module defining the global registry for feature classes.
This module provides FEATURE_REGISTRY, a central mapping of string
identifiers to specific feature-extraction classes within the aiqclib
pipeline. Each entry allows for dynamic loading and instantiation of
feature generators based on configuration settings, facilitating the
preparation of datasets by applying various data transformations and
extractions.
- aiqclib.common.loader.feature_registry.FEATURE_REGISTRY: Dict[str, Type[FeatureBase]] = {'basic_values': <class 'aiqclib.prepare.features.basic_values.BasicValues'>, 'day_of_year': <class 'aiqclib.prepare.features.day_of_year.DayOfYearFeat'>, 'flank_down': <class 'aiqclib.prepare.features.flank_down.FlankDown'>, 'flank_up': <class 'aiqclib.prepare.features.flank_up.FlankUp'>, 'location': <class 'aiqclib.prepare.features.location.LocationFeat'>, 'profile_summary_stats': <class 'aiqclib.prepare.features.profile_summary.ProfileSummaryStats'>}
A dictionary mapping feature identifiers (str) to classes that inherit from FeatureBase. These classes are dynamically loaded based on the “feature” key in a feature configuration dictionary.
- Type:
Dict[str, Type[FeatureBase]]
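The lookup that load_feature_class performs against this registry, including the two documented ValueError cases, might look roughly like the sketch below. It is an illustration under assumptions rather than the library's implementation: `FeatureStub`, `DayOfYearStub`, `FEATURES` and `load_feature` are hypothetical, and the constructor signature is assumed to mirror the loader's parameters.

```python
from typing import Any, Dict, List, Optional, Type


class FeatureStub:
    """Hypothetical stand-in for FeatureBase."""

    def __init__(
        self,
        target_name: str,
        feature_info: Dict[str, Any],
        selected_profiles: Optional[Any] = None,
        filtered_input: Optional[Any] = None,
        selected_rows: Optional[Any] = None,
        summary_stats: Optional[Any] = None,
    ) -> None:
        self.target_name = target_name
        self.feature_info = feature_info
        self.summary_stats = summary_stats


class DayOfYearStub(FeatureStub):
    """Hypothetical concrete feature."""


# Hypothetical registry keyed by the value of feature_info["feature"].
FEATURES: Dict[str, Type[FeatureStub]] = {"day_of_year": DayOfYearStub}


def load_feature(
    target_name: str, feature_info: Dict[str, Any], **datasets: Any
) -> FeatureStub:
    class_name = feature_info.get("feature")
    if class_name is None:
        # Documented failure 1: the "feature" key is missing.
        raise ValueError("feature_info must contain a 'feature' key")
    feature_class = FEATURES.get(class_name)
    if feature_class is None:
        # Documented failure 2: the name is not registered.
        raise ValueError(f"Unknown feature class: {class_name!r}")
    return feature_class(target_name, feature_info, **datasets)


feature = load_feature("temp", {"feature": "day_of_year"}, summary_stats=None)

# Collect the messages raised for the two invalid inputs.
errors: List[str] = []
for bad_info in ({}, {"feature": "not_registered"}):
    try:
        load_feature("temp", bad_info)
    except ValueError as exc:
        errors.append(str(exc))
```

Raising ValueError for both cases gives callers a single exception type to handle when a configuration names a feature that cannot be built.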
aiqclib.common.loader.model_loader module
This module provides utility functions for loading and managing model classes based on configuration settings, typically used in a machine learning or data processing pipeline.
- aiqclib.common.loader.model_loader.load_model_class(config)
Retrieve and instantiate a model class for the “model” step from the provided configuration.
This function performs the following steps:
Fetches the class name from the configuration using config.get_base_class("model").
Looks up the corresponding class in the global MODEL_REGISTRY.
Instantiates the found class with the given configuration object as an argument.
- Parameters:
config (ConfigBase) – A configuration object that includes a “base_class” entry under the “model” step, specifying which model class to load. This object must implement the get_base_class method.
- Returns:
An instantiated model object, which is an instance of a class inheriting from ModelBase.
- Return type:
ModelBase
- Raises:
ValueError – If the retrieved model class name is not found in the MODEL_REGISTRY.
- aiqclib.common.loader.model_loader.load_model_class_with_class_name(config, class_name)
Retrieve and instantiate a specific model class from the registry using its name.
This function looks up the model class by class_name in the global MODEL_REGISTRY and then instantiates it with the provided configuration object.
- Parameters:
config (ConfigBase) – A configuration object that will be passed to the model class’s constructor upon instantiation.
class_name (str) – The string name of the model class to load from the registry. This name should correspond to a key in MODEL_REGISTRY.
- Returns:
An instantiated model object, which is an instance of a class inheriting from ModelBase.
- Return type:
ModelBase
- Raises:
ValueError – If the specified class_name is not found in the MODEL_REGISTRY.
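The two model loaders differ only in where the class name comes from: the configuration or an explicit argument. One plausible arrangement, shown here as a sketch, is for the config-driven function to delegate to the name-driven one. `ModelConfigStub`, `ModelStub`, `RandomForestStub`, `MODELS`, `load_model` and `load_model_by_name` are hypothetical names, and the delegation itself is an assumption about structure, not something this page states.

```python
from typing import Dict, Type


class ModelConfigStub:
    """Hypothetical stand-in for ConfigBase; only get_base_class is modeled."""

    def __init__(self, model_class_name: str) -> None:
        self._model_class_name = model_class_name

    def get_base_class(self, step_name: str) -> str:
        if step_name != "model":
            raise KeyError(step_name)
        return self._model_class_name


class ModelStub:
    """Hypothetical stand-in for ModelBase."""

    def __init__(self, config: ModelConfigStub) -> None:
        self.config = config


class RandomForestStub(ModelStub):
    """Hypothetical concrete model."""


# Full name and short alias point at the same class object.
MODELS: Dict[str, Type[ModelStub]] = {
    "RandomForest": RandomForestStub,
    "RF": RandomForestStub,
}


def load_model_by_name(config: ModelConfigStub, class_name: str) -> ModelStub:
    model_class = MODELS.get(class_name)
    if model_class is None:
        raise ValueError(f"Unknown model class: {class_name!r}")
    return model_class(config)


def load_model(config: ModelConfigStub) -> ModelStub:
    # Take the name from the config, then reuse the name-based loader.
    return load_model_by_name(config, config.get_base_class("model"))


model_config = ModelConfigStub("RF")
from_config = load_model(model_config)
by_name = load_model_by_name(model_config, "RandomForest")
```

The name-based variant is useful when one configuration object is shared by several models, so the class cannot be read from a single “model” entry.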
aiqclib.common.loader.model_registry module
This module provides a comprehensive registry of model classes that can be used during training or inference steps.
It aggregates a base single model registry from aiqclib.common.loader.single_model_registry.SINGLE_MODEL_REGISTRY
and extends it with specific suite models, offering convenient aliases.
Each key in the dictionary corresponds to a model name (string), and each value
is the class constructor for that model, typically inheriting from
aiqclib.common.base.model_base.ModelBase.
- aiqclib.common.loader.model_registry.MODEL_REGISTRY: Dict[str, Type[ModelBase]] = {'DT': <class 'aiqclib.train.models.decision_tree.DecisionTree'>, 'DecisionTree': <class 'aiqclib.train.models.decision_tree.DecisionTree'>, 'GNB': <class 'aiqclib.train.models.gaussian_naive_bayes.GaussianNaiveBayes'>, 'GaussianNaiveBayes': <class 'aiqclib.train.models.gaussian_naive_bayes.GaussianNaiveBayes'>, 'KNN': <class 'aiqclib.train.models.k_nearest_neighbors.KNearestNeighbors'>, 'KNearestNeighbors': <class 'aiqclib.train.models.k_nearest_neighbors.KNearestNeighbors'>, 'LDA': <class 'aiqclib.train.models.linear_discriminant_analysis.LinearDiscriminantAnalysis'>, 'LinearDiscriminantAnalysis': <class 'aiqclib.train.models.linear_discriminant_analysis.LinearDiscriminantAnalysis'>, 'LogisticRegression': <class 'aiqclib.train.models.logistic_regression.LogisticRegression'>, 'Logit': <class 'aiqclib.train.models.logistic_regression.LogisticRegression'>, 'MLP': <class 'aiqclib.train.models.multilayer_perceptron.MultilayerPerceptron'>, 'MS': <class 'aiqclib.train.models.model_suite.ModelSuite'>, 'ModelSuite': <class 'aiqclib.train.models.model_suite.ModelSuite'>, 'MultilayerPerceptron': <class 'aiqclib.train.models.multilayer_perceptron.MultilayerPerceptron'>, 'RF': <class 'aiqclib.train.models.random_forest.RandomForest'>, 'RandomForest': <class 'aiqclib.train.models.random_forest.RandomForest'>, 'SVM': <class 'aiqclib.train.models.support_vector_machine.SupportVectorMachine'>, 'SupportVectorMachine': <class 'aiqclib.train.models.support_vector_machine.SupportVectorMachine'>, 'XGB': <class 'aiqclib.train.models.xgboost.XGBoost'>, 'XGBoost': <class 'aiqclib.train.models.xgboost.XGBoost'>}
A dictionary mapping model names to their corresponding Python classes.
This registry is initialized with models from SINGLE_MODEL_REGISTRY and then updated to include the ModelSuite under both “ModelSuite” and “MS” keys, providing convenient aliases.
The keys are strings (e.g., “XGBoost”, “ModelSuite”), and the values are class objects that inherit from ModelBase.
- Type:
Dict[str, Type[ModelBase]]
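The aggregation described for this registry can be sketched with plain dictionaries. The classes and names below (`ModelStubBase`, `XGBoostStub`, `ModelSuiteStub`, `SINGLE_MODELS`, `ALL_MODELS`) are hypothetical. Copying the base mapping before adding the suite aliases is a choice made for this sketch; whether the library copies or updates in place is not stated here.

```python
from typing import Dict, Type


class ModelStubBase:
    """Hypothetical stand-in for ModelBase."""


class XGBoostStub(ModelStubBase):
    """Hypothetical single model."""


class ModelSuiteStub(ModelStubBase):
    """Hypothetical suite that wraps several single models."""


# Base registry: full name plus a short alias for the same class.
SINGLE_MODELS: Dict[str, Type[ModelStubBase]] = {
    "XGBoost": XGBoostStub,
    "XGB": XGBoostStub,
}

# Extended registry: a new dict built from the base entries plus suite aliases.
ALL_MODELS: Dict[str, Type[ModelStubBase]] = {
    **SINGLE_MODELS,
    "ModelSuite": ModelSuiteStub,
    "MS": ModelSuiteStub,
}

print(sorted(ALL_MODELS))  # prints: ['MS', 'ModelSuite', 'XGB', 'XGBoost']
print("MS" in SINGLE_MODELS)  # prints: False
```

Building a new dictionary keeps the two registries independent, so code that wants only single models is not handed a suite by accident.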
aiqclib.common.loader.single_model_loader module
This module provides utility functions for loading and managing model classes based on configuration settings, typically used in a machine learning or data processing pipeline.
- aiqclib.common.loader.single_model_loader.load_single_model_class(config)
Retrieve and instantiate a model class for the “model” step from the provided configuration.
This function performs the following steps:
Fetches the class name from the configuration using config.get_base_class("model").
Looks up the corresponding class in the global SINGLE_MODEL_REGISTRY.
Instantiates the found class with the given configuration object as an argument.
- Parameters:
config (ConfigBase) – A configuration object that includes a “base_class” entry under the “model” step, specifying which model class to load. This object must implement the get_base_class method.
- Returns:
An instantiated model object, which is an instance of a class inheriting from ModelBase.
- Return type:
ModelBase
- Raises:
ValueError – If the retrieved model class name is not found in the SINGLE_MODEL_REGISTRY.
- aiqclib.common.loader.single_model_loader.load_single_model_class_with_class_name(config, class_name)
Retrieve and instantiate a specific model class using a given configuration and class name.
This function looks up the specified model class name in the global SINGLE_MODEL_REGISTRY and then instantiates it with the provided configuration object.
- Parameters:
config (ConfigBase) – The configuration object to be passed to the model class constructor. Its get_base_class method is not used to select the class here; the object is passed directly to the constructor.
class_name (str) – The string name of the model class to retrieve and instantiate. This name must exist as a key in the SINGLE_MODEL_REGISTRY.
- Returns:
An instantiated model object, which is an instance of a class inheriting from ModelBase.
- Return type:
ModelBase
- Raises:
ValueError – If the provided class_name is not found in the SINGLE_MODEL_REGISTRY.
aiqclib.common.loader.single_model_registry module
This module provides a registry of model classes that can be used during training or inference steps. Each key in the dictionary corresponds to a model name (string), and each value is the class constructor for that model.
- aiqclib.common.loader.single_model_registry.SINGLE_MODEL_REGISTRY: Dict[str, Type[ModelBase]] = {'DT': <class 'aiqclib.train.models.decision_tree.DecisionTree'>, 'DecisionTree': <class 'aiqclib.train.models.decision_tree.DecisionTree'>, 'GNB': <class 'aiqclib.train.models.gaussian_naive_bayes.GaussianNaiveBayes'>, 'GaussianNaiveBayes': <class 'aiqclib.train.models.gaussian_naive_bayes.GaussianNaiveBayes'>, 'KNN': <class 'aiqclib.train.models.k_nearest_neighbors.KNearestNeighbors'>, 'KNearestNeighbors': <class 'aiqclib.train.models.k_nearest_neighbors.KNearestNeighbors'>, 'LDA': <class 'aiqclib.train.models.linear_discriminant_analysis.LinearDiscriminantAnalysis'>, 'LinearDiscriminantAnalysis': <class 'aiqclib.train.models.linear_discriminant_analysis.LinearDiscriminantAnalysis'>, 'LogisticRegression': <class 'aiqclib.train.models.logistic_regression.LogisticRegression'>, 'Logit': <class 'aiqclib.train.models.logistic_regression.LogisticRegression'>, 'MLP': <class 'aiqclib.train.models.multilayer_perceptron.MultilayerPerceptron'>, 'MS': <class 'aiqclib.train.models.model_suite.ModelSuite'>, 'ModelSuite': <class 'aiqclib.train.models.model_suite.ModelSuite'>, 'MultilayerPerceptron': <class 'aiqclib.train.models.multilayer_perceptron.MultilayerPerceptron'>, 'RF': <class 'aiqclib.train.models.random_forest.RandomForest'>, 'RandomForest': <class 'aiqclib.train.models.random_forest.RandomForest'>, 'SVM': <class 'aiqclib.train.models.support_vector_machine.SupportVectorMachine'>, 'SupportVectorMachine': <class 'aiqclib.train.models.support_vector_machine.SupportVectorMachine'>, 'XGB': <class 'aiqclib.train.models.xgboost.XGBoost'>, 'XGBoost': <class 'aiqclib.train.models.xgboost.XGBoost'>}
A dictionary mapping model names to their corresponding Python classes.
The keys are strings (e.g., “XGBoost”), and the values are class objects that inherit from aiqclib.common.base.model_base.ModelBase.
- Type:
Dict[str, Type[ModelBase]]
aiqclib.common.loader.training_loader module
This module provides utility functions for loading and instantiating various training components, such as input training sets, model validation classes, and model build classes. It leverages a registry pattern and a TrainingConfig object to determine the specific class to load for each training step, promoting modularity and configurability in the training pipeline.
- aiqclib.common.loader.training_loader.load_step1_input_training_set(config)
Retrieve and instantiate an aiqclib.train.step1_read_input.input_base.InputTrainingSetBase subclass for the “input” step, based on the YAML configuration.
- Steps:
Extract the class name with TrainingConfig.get_base_class("input").
Retrieve the corresponding class from aiqclib.common.loader.training_registry.INPUT_TRAINING_SET_REGISTRY.
Instantiate the class and return it.
- Parameters:
config (TrainingConfig) – The training configuration object containing a base_class entry under the “input” section.
- Returns:
An instantiated object of a class that inherits from aiqclib.train.step1_read_input.input_base.InputTrainingSetBase.
- Return type:
InputTrainingSetBase
- aiqclib.common.loader.training_loader.load_step2_model_validation_class(config, training_sets=None)
Retrieve and instantiate an aiqclib.train.step2_validate_model.validate_base.ValidationBase subclass for the “validate” step, based on the YAML configuration.
- Steps:
Extract the class name with TrainingConfig.get_base_class("validate").
Retrieve the corresponding class from aiqclib.common.loader.training_registry.MODEL_VALIDATION_REGISTRY.
Instantiate the class, optionally passing the provided training sets.
- Parameters:
config (TrainingConfig) – The training configuration object referencing a base_class under the “validate” section.
training_sets (Optional[dict[str,DataFrame]]) – A dictionary of Polars DataFrames containing data for model validation, defaults to None. Keys typically represent data categories (e.g., “train”, “test”).
- Returns:
An instantiated object of a class that inherits from aiqclib.train.step2_validate_model.validate_base.ValidationBase.
- Return type:
ValidationBase
- aiqclib.common.loader.training_loader.load_step4_build_model_class(config, training_sets=None, test_sets=None)
Retrieve and instantiate an aiqclib.train.step4_build_model.build_model_base.BuildModelBase subclass for the “build” step, based on the YAML configuration.
- Steps:
Extract the class name with TrainingConfig.get_base_class("build").
Retrieve the corresponding class from aiqclib.common.loader.training_registry.BUILD_MODEL_REGISTRY.
Instantiate the class, providing any training and test sets.
- Parameters:
config (TrainingConfig) – The training configuration object referencing a base_class under the “build” section.
training_sets (Optional[dict[str,DataFrame]]) – A dictionary of Polars DataFrames of training data, defaults to None. Keys typically represent data categories (e.g., “features”, “target”).
test_sets (Optional[dict[str,DataFrame]]) – A dictionary of Polars DataFrames of test data, defaults to None. Keys typically represent data categories (e.g., “features”, “target”).
- Returns:
An instantiated object of a class that inherits from aiqclib.train.step4_build_model.build_model_base.BuildModelBase.
- Return type:
BuildModelBase
aiqclib.common.loader.training_registry module
This module provides centralized registries for various training components, including dataset readers, model validation strategies, and model-building classes. Each registry is a dictionary mapping string keys (typically from configuration files) to their corresponding Python class implementations, facilitating flexible and extensible model training workflows.
- aiqclib.common.loader.training_registry.BUILD_MODEL_REGISTRY: Dict[str, Type[BuildModelBase]] = {'BuildModel': <class 'aiqclib.train.step4_build_model.build_model.BuildModel'>, 'BuildModelSuite': <class 'aiqclib.train.step4_build_model.build_model_suite.BuildModelSuite'>}
Registry mapping string keys to concrete implementations of BuildModelBase.
This dictionary enables the dynamic selection of model-building classes based on configuration settings, providing flexibility in choosing and configuring different model architectures.
- aiqclib.common.loader.training_registry.INPUT_TRAINING_SET_REGISTRY: Dict[str, Type[InputTrainingSetBase]] = {'InputTrainingSetA': <class 'aiqclib.train.step1_read_input.dataset_a.InputTrainingSetA'>}
Registry mapping string keys to concrete implementations of InputTrainingSetBase.
This dictionary facilitates the dynamic selection of dataset reading classes based on configuration settings, allowing for easy extension and customization of input data handling.
- aiqclib.common.loader.training_registry.MODEL_VALIDATION_REGISTRY: Dict[str, Type[ValidationBase]] = {'KFoldValidation': <class 'aiqclib.train.step2_validate_model.kfold_validation.KFoldValidation'>, 'KFoldValidationSuite': <class 'aiqclib.train.step2_validate_model.kfold_validation_suite.KFoldValidationSuite'>}
Registry mapping string keys to concrete implementations of ValidationBase.
This dictionary allows for the dynamic selection of model validation strategies based on configuration settings, supporting various evaluation methodologies.
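To show how a parsed configuration and the three training registries fit together, the sketch below resolves each step's class from a per-step table. The library exposes one loader function per step; collapsing them into a single helper is done here purely for illustration. All names (`InputSetStub`, `KFoldStub`, `BuildStub`, `STEP_REGISTRIES`, `training_config`, `resolve_step_class`) and the dictionary layout of the configuration are hypothetical.

```python
from typing import Any, Dict


class InputSetStub:
    """Hypothetical stand-in for an InputTrainingSetBase subclass."""


class KFoldStub:
    """Hypothetical stand-in for a ValidationBase subclass."""


class BuildStub:
    """Hypothetical stand-in for a BuildModelBase subclass."""


# One registry per training step, keyed by the step name used in the config.
STEP_REGISTRIES: Dict[str, Dict[str, type]] = {
    "input": {"InputSetStub": InputSetStub},
    "validate": {"KFoldStub": KFoldStub},
    "build": {"BuildStub": BuildStub},
}

# Assumed shape of a parsed YAML configuration.
training_config: Dict[str, Dict[str, Any]] = {
    "input": {"base_class": "InputSetStub"},
    "validate": {"base_class": "KFoldStub"},
    "build": {"base_class": "BuildStub"},
}


def resolve_step_class(step_name: str, config: Dict[str, Dict[str, Any]]) -> type:
    class_name = config[step_name]["base_class"]
    registry = STEP_REGISTRIES[step_name]
    if class_name not in registry:
        known = ", ".join(sorted(registry))
        raise ValueError(
            f"{class_name!r} is not registered for step {step_name!r}; known: {known}"
        )
    return registry[class_name]


resolved = {step: resolve_step_class(step, training_config) for step in training_config}
```

Listing the known keys in the error message makes a misspelled base_class in the YAML easier to diagnose.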