aiqclib.common.config package
Submodules
aiqclib.common.config.classify_config module
This module defines the ClassificationConfig class, a specialized configuration handler for managing dataset-related settings pertinent to machine learning classification tasks. It extends ConfigBase to provide structured access and resolution of various sub-configurations (e.g., target sets, feature sets, step class definitions) from YAML-based configuration files, simplifying the management of complex ML pipeline configurations.
- class aiqclib.common.config.classify_config.ClassificationConfig(config_file, auto_select=False)[source]
Bases: ConfigBase
A configuration class for retrieving and organizing dataset-related configurations specific to classification tasks.
Extends aiqclib.common.base.config_base.ConfigBase by adding logic to select datasets from YAML-based configuration files. The selected dataset references various sub-configurations (e.g., target sets, feature sets, and step class definitions). These references are resolved and stored within data.
- Parameters:
config_file (str)
auto_select (bool)
- expected_class_name: str = 'ClassificationConfig'
The class name expected by this configuration, checked by aiqclib.common.base.config_base.ConfigBase to validate that the configuration aligns with the YAML definition.
- select(dataset_name)[source]
Choose a dataset by name and load its sub-configuration items (e.g., target sets, feature sets) into data.
This method retrieves multiple related configurations by calling aiqclib.common.utils.config.get_config_item() on relevant sections of the YAML file. It expects that the initial self.data population from super().select contains references to these sub-configurations, which are then resolved.
- Parameters:
dataset_name (str) – The name (key) of the desired dataset in the YAML’s “classification_sets” dictionary.
- Raises:
KeyError – If dataset_name is not present in the “classification_sets” section of the YAML, or if a referenced sub-configuration name (e.g., “target_set” within the selected dataset) is not found in its corresponding top-level section (e.g., “target_sets”), or if any of the required sub-configuration keys (e.g., “target_set”, “feature_set”) are missing from the selected dataset configuration itself.
- Returns:
None
- Return type:
None
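The reference resolution described for select() can be pictured with plain dictionaries. The following is a minimal sketch, not the library's implementation: the function name resolve_dataset, the REFERENCE_SECTIONS mapping, and all sample names and values are hypothetical, and the parsed YAML is assumed to be a dictionary keyed by set name, as the description above suggests.

```python
from copy import deepcopy

# Hypothetical mapping from a reference key inside a dataset entry to the
# top-level section holding the named definitions (assumed layout).
REFERENCE_SECTIONS = {
    "target_set": "target_sets",
    "feature_set": "feature_sets",
}


def resolve_dataset(full_config, dataset_name, section="classification_sets"):
    """Return a copy of one dataset entry with its references resolved."""
    # Raises KeyError if the dataset name is not in the section.
    entry = deepcopy(full_config[section][dataset_name])
    for ref_key, top_section in REFERENCE_SECTIONS.items():
        # Raises KeyError if the reference key is missing from the entry,
        # or if the name it points to is absent from the top-level section.
        ref_name = entry[ref_key]
        entry[ref_key] = deepcopy(full_config[top_section][ref_name])
    return entry


# Made-up configuration content, already parsed into a dictionary.
config = {
    "target_sets": {"targets_a": {"variables": ["var_1", "var_2"]}},
    "feature_sets": {"features_a": {"features": ["feat_1"]}},
    "classification_sets": {
        "run_1": {"target_set": "targets_a", "feature_set": "features_a"},
    },
}

resolved = resolve_dataset(config, "run_1")
print(resolved["target_set"])
```

All three failure cases listed under Raises surface here as an ordinary KeyError from dictionary lookup, which is one plausible reason the documented method reports them with a single exception type.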
aiqclib.common.config.dataset_config module
This module defines the DataSetConfig class, a specialized configuration handler for managing dataset-specific settings within a larger YAML configuration structure.
It extends aiqclib.common.base.config_base.ConfigBase to provide
interfaces for selecting and resolving dataset-related configurations such as
target sets, feature sets, and step class definitions from a hierarchical
configuration file.
- class aiqclib.common.config.dataset_config.DataSetConfig(config_file, auto_select=False)[source]
Bases: ConfigBase
A configuration class that provides dataset-related configuration interfaces.
This class extends ConfigBase with handling for one or more dataset-specific YAML sections, mapping them to container dictionaries within data. The selected dataset name is used to look up configurations for target sets, feature sets, step classes, etc.
Note
expected_class_name must match the YAML’s base_class if instantiated directly.
- Parameters:
config_file (str)
auto_select (bool)
- expected_class_name: str = 'DataSetConfig'
The class name expected by the configuration. Used by ConfigBase to validate consistency with the YAML data.
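The consistency check between expected_class_name and the YAML's base_class might look roughly like the sketch below. It is an illustration under assumptions, not ConfigBase itself: the classes MiniConfigBase and MiniDataSetConfig are hypothetical stand-ins, base_class is assumed to be a top-level string in the parsed YAML, and the choice of ValueError is a guess.

```python
class MiniConfigBase:
    """Hypothetical stand-in for ConfigBase; not the library's code."""

    expected_class_name = None  # subclasses set this

    def __init__(self, parsed_yaml):
        # Assumption: the YAML carries a top-level "base_class" string.
        base_class = parsed_yaml.get("base_class")
        if base_class != self.expected_class_name:
            # Assumption: the real library may raise a different exception.
            raise ValueError(
                f"YAML base_class {base_class!r} does not match "
                f"expected {self.expected_class_name!r}"
            )
        self.full_config = parsed_yaml


class MiniDataSetConfig(MiniConfigBase):
    expected_class_name = "DataSetConfig"


# Matching base_class: construction succeeds.
ok = MiniDataSetConfig({"base_class": "DataSetConfig", "data_sets": {}})
print(ok.full_config["base_class"])
```

Keeping the expected name as a class attribute lets each subclass declare which YAML files it accepts without overriding the constructor.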
- select(dataset_name)[source]
Select a dataset entry by name from data_sets in the YAML config, then retrieve related configuration items (e.g., target_set, feature_set).
This method populates data with relevant sub-configurations by calling aiqclib.common.utils.config.get_config_item() on specified fields.
- Parameters:
dataset_name (str) – The key name of the dataset to select from the YAML.
- Raises:
KeyError – If the dataset name does not exist in the YAML’s data_sets dictionary.
- Return type:
None
aiqclib.common.config.training_config module
This module defines the TrainingConfig class, which is responsible for managing and accessing training-related configurations from a YAML file.
It extends aiqclib.common.base.config_base.ConfigBase to provide
structured access to dataset settings, including targets, step classes,
and step parameters, by resolving references within the configuration.
- class aiqclib.common.config.training_config.TrainingConfig(config_file, auto_select=False)[source]
Bases: ConfigBase
A configuration class providing interfaces for training dataset settings.
Inherits from ConfigBase with an expectation of working under the “training_sets” section in the YAML configuration. Leverages methods like select() to initialize and fetch subset configurations (e.g., target sets, step parameters).
Note
expected_class_name must match the YAML’s base_class property if you intend to instantiate this class directly from config.
- Parameters:
config_file (str)
auto_select (bool)
- expected_class_name: str = 'TrainingConfig'
The class name expected by ConfigBase for consistency checks when instantiating TrainingConfig from YAML.
- select(dataset_name)[source]
Select a named dataset from the training_sets configuration, retrieving nested configurations for targets, step classes, and step parameters.
After calling select(), sub-keys (target_set, step_class_set, etc.) are populated from their respective config dictionaries by resolving their references within the full configuration.
- Parameters:
dataset_name (str) – The key name of the dataset to select within data (which references the training_sets section).
- Raises:
KeyError – If dataset_name is not found within the training_sets dictionary.
- Return type:
None
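A typical calling pattern for these configuration classes is to construct the object, call select(), and then read data. The sketch below assumes only what is documented here (the constructor signature, select() returning None, the data attribute, and KeyError for an unknown name). The helper load_selected, the file name, and the dataset names are hypothetical, and a stand-in class is used so the example runs without the library or a YAML file.

```python
def load_selected(config_cls, config_file, dataset_name):
    """Build a config object and select one dataset; None if it is unknown."""
    config = config_cls(config_file)  # auto_select left at its default
    try:
        config.select(dataset_name)  # returns None; results land in .data
    except KeyError:
        return None
    return config.data


# With the library installed, config_cls would be the real class, e.g.:
#   from aiqclib.common.config.training_config import TrainingConfig
#   data = load_selected(TrainingConfig, "training.yaml", "my_training_set")
# ("training.yaml" and "my_training_set" are placeholder names.)


class FakeTrainingConfig:
    """Stand-in mirroring the documented interface, with made-up content."""

    def __init__(self, config_file, auto_select=False):
        self._sets = {"set_a": {"target_set": {"variables": ["var_1"]}}}
        self.data = None

    def select(self, dataset_name):
        # Dictionary lookup raises KeyError for an unknown dataset name.
        self.data = self._sets[dataset_name]


print(load_selected(FakeTrainingConfig, "unused.yaml", "set_a"))
```

Returning None for an unknown name is a choice made for this sketch; letting the KeyError propagate is equally reasonable when a missing dataset should stop the pipeline.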
aiqclib.common.config.yaml_schema module
Module providing YAML-based JSON schemas used to validate dataset, training, and classification configuration files. Each function returns a YAML string describing the structure and constraints for a specific configuration schema.
- aiqclib.common.config.yaml_schema.get_classification_config_schema()[source]
Retrieve the YAML-based JSON schema for classification configurations.
The returned schema requires certain objects and properties (e.g., path_info_sets, target_sets, feature_sets, etc.), each with nested type constraints and additional properties set to false when appropriate.
- Returns:
A YAML string representing the JSON schema for classification configurations.
- Return type:
str
- aiqclib.common.config.yaml_schema.get_data_set_config_schema()[source]
Retrieve the YAML-based JSON schema for dataset configurations.
The returned schema requires certain objects and properties (e.g., path_info_sets, target_sets, feature_sets, etc.), each with nested type constraints and additional properties set to false when appropriate.
- Returns:
A YAML string representing the JSON schema for dataset configurations.
- Return type:
str
- aiqclib.common.config.yaml_schema.get_training_config_schema()[source]
Retrieve the YAML-based JSON schema for training configurations.
The returned schema specifies required objects and properties under categories such as path_info_sets, target_sets, step_class_sets, step_param_sets, and training_sets. Additional properties are disallowed to ensure constraints remain strict.
- Returns:
A YAML string representing the JSON schema for training configurations.
- Return type:
str
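To make the effect of required properties and disallowed additional properties concrete, here is a small stdlib-only sketch. It handles only a tiny subset of JSON Schema (object type, required, additionalProperties: false, and string type) and is not a substitute for a real validator. The schema fragment is made up and shown in parsed dictionary form; the schemas returned by the functions above are larger and are delivered as YAML strings.

```python
def check(instance, schema, path="$"):
    """Return a list of error strings for a tiny subset of JSON Schema."""
    errors = []
    if schema.get("type") == "object":
        if not isinstance(instance, dict):
            return [f"{path}: expected object"]
        props = schema.get("properties", {})
        # Every name listed under "required" must be present.
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"{path}: missing required property '{name}'")
        # With additionalProperties false, unlisted names are rejected.
        if schema.get("additionalProperties") is False:
            for name in instance:
                if name not in props:
                    errors.append(f"{path}: unexpected property '{name}'")
        # Recurse into the properties that are present.
        for name, sub_schema in props.items():
            if name in instance:
                errors.extend(check(instance[name], sub_schema, f"{path}.{name}"))
    elif schema.get("type") == "string" and not isinstance(instance, str):
        errors.append(f"{path}: expected string")
    return errors


# Made-up schema fragment; section names follow the training schema text.
schema = {
    "type": "object",
    "required": ["target_sets", "training_sets"],
    "additionalProperties": False,
    "properties": {
        "target_sets": {"type": "object"},
        "training_sets": {"type": "object"},
    },
}

good = {"target_sets": {}, "training_sets": {}}
bad = {"target_sets": {}, "trainning_sets": {}}  # misspelled section name

print(check(good, schema))
print(check(bad, schema))
```

The misspelled key is reported twice, once as a missing required property and once as an unexpected one, which shows how strict schemas catch typos that a permissive schema would silently ignore.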
aiqclib.common.config.yaml_templates module
Module providing YAML templates for both dataset preparation and training configurations. These templates can be customized to fit various data pipeline requirements.
- aiqclib.common.config.yaml_templates.get_config_classify_set_full_template()[source]
Retrieve a YAML template string for classification configurations with normalization.
This template includes:
  - path_info_sets: specifying common, input, model, and concatenation paths.
  - target_sets: defining which variables to process and their flags.
  - summary_stats_sets: defining summary statistics.
  - feature_sets: listing named sets of feature extraction modules.
  - feature_param_sets: detailing parameters for each feature.
  - feature_stats_sets: detailing methods and stats for normalization.
  - step_class_sets: referencing classes for each classification step (e.g., input, summary, select, locate, extract, model, classify, concat).
  - step_param_sets: referencing parameters for the classification steps.
  - classification_sets: referencing specific dataset folders, files, and associated configuration sets (e.g., step_class_set, step_param_set).
- Returns:
A string containing the YAML template.
- Return type:
str
- aiqclib.common.config.yaml_templates.get_config_classify_set_template()[source]
Retrieve a YAML template string for classification configurations.
This template includes:
  - path_info_sets: specifying common, input, model, and concatenation paths.
  - target_sets: defining which variables to process and their flags.
  - summary_stats_sets: defining summary statistics.
  - feature_sets: listing named sets of feature extraction modules.
  - feature_param_sets: detailing parameters for each feature.
  - feature_stats_sets: detailing methods and stats for normalization.
  - step_class_sets: referencing classes for each classification step (e.g., input, summary, select, locate, extract, model, classify, concat).
  - step_param_sets: referencing parameters for the classification steps.
  - classification_sets: referencing specific dataset folders, files, and associated configuration sets (e.g., step_class_set, step_param_set).
- Returns:
A string containing the YAML template.
- Return type:
str
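After customizing a classification template, a quick check that every listed top-level section is still present can catch accidental deletions. The sketch below works on an already parsed configuration dictionary; the helper name missing_sections and the sample dictionary are hypothetical, while the section names are the ones listed for the classification templates.

```python
# Section names as listed for the classification templates.
CLASSIFY_SECTIONS = (
    "path_info_sets", "target_sets", "summary_stats_sets", "feature_sets",
    "feature_param_sets", "feature_stats_sets", "step_class_sets",
    "step_param_sets", "classification_sets",
)


def missing_sections(parsed_config, expected=CLASSIFY_SECTIONS):
    """Return the expected top-level sections absent from parsed_config."""
    return [name for name in expected if name not in parsed_config]


# Made-up, deliberately incomplete configuration.
partial = {"path_info_sets": {}, "target_sets": {}, "classification_sets": {}}
print(missing_sections(partial))
```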
- aiqclib.common.config.yaml_templates.get_config_data_set_all_template()[source]
Retrieve a YAML template string for dataset preparation configurations with ‘All’ step variants.
This template includes:
  - path_info_sets: specifying common, input, and split paths.
  - target_sets: defining which variables to process and their flags.
  - summary_stats_sets: defining summary statistics.
  - feature_sets: listing named sets of feature extraction modules.
  - feature_param_sets: detailing parameters for each feature.
  - feature_stats_sets: detailing methods and stats for normalization.
  - step_class_sets: referencing classes for each preparation step (e.g., input, summary, select, locate, extract, split) with ‘All’ variants.
  - step_param_sets: referencing parameters for the preparation steps with ‘All’ variants.
  - data_sets: referencing specific dataset folders, files, and associated configuration sets (e.g., step_class_set, step_param_set).
- Returns:
A string containing the YAML template.
- Return type:
str
- aiqclib.common.config.yaml_templates.get_config_data_set_full_template()[source]
Retrieve a YAML template string for dataset preparation configurations with normalization.
This template includes:
  - path_info_sets: specifying common, input, and split paths.
  - target_sets: defining which variables to process and their flags.
  - summary_stats_sets: defining summary statistics.
  - feature_sets: listing named sets of feature extraction modules.
  - feature_param_sets: detailing parameters for each feature.
  - feature_stats_sets: detailing methods and stats for normalization.
  - step_class_sets: referencing classes for each preparation step (e.g., input, summary, select, locate, extract, split).
  - step_param_sets: referencing parameters for the preparation steps.
  - data_sets: referencing specific dataset folders, files, and associated configuration sets (e.g., step_class_set, step_param_set).
- Returns:
A string containing the YAML template.
- Return type:
str
- aiqclib.common.config.yaml_templates.get_config_data_set_template()[source]
Retrieve a YAML template string for dataset preparation configurations.
This template includes:
  - path_info_sets: specifying common, input, and split paths.
  - target_sets: defining which variables to process and their flags.
  - summary_stats_sets: defining summary statistics.
  - feature_sets: listing named sets of feature extraction modules.
  - feature_param_sets: detailing parameters for each feature.
  - feature_stats_sets: detailing methods and stats for normalization.
  - step_class_sets: referencing classes for each preparation step (e.g., input, summary, select, locate, extract, split).
  - step_param_sets: referencing parameters for the preparation steps.
  - data_sets: referencing specific dataset folders, files, and associated configuration sets (e.g., step_class_set, step_param_set).
- Returns:
A string containing the YAML template.
- Return type:
str
- aiqclib.common.config.yaml_templates.get_config_train_set_template()[source]
Retrieve a YAML template string for training configurations.
This template includes:
  - path_info_sets: specifying common paths and subfolders for input, validate, and build.
  - target_sets: defining variables and associated flags for training.
  - step_class_sets: mapping each step (input, validate, model, build) to corresponding Python class names.
  - step_param_sets: detailing optional parameters for each training step.
  - training_sets: referencing specific dataset folders, the path_info used, the target set, and which step_class_set and step_param_set apply.
- Returns:
A string containing the YAML template.
- Return type:
str
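Since each template function returns a YAML string, a common first step is to write that string to a file and edit it from there. The sketch below assumes nothing beyond the documented return type; the helper write_template, the stand-in fake_template, and the file paths are hypothetical.

```python
from pathlib import Path
import tempfile


def write_template(get_template, destination):
    """Write the string returned by get_template() to destination."""
    destination = Path(destination)
    # Create missing parent folders so a fresh project layout works.
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_text(get_template(), encoding="utf-8")
    return destination


# With the library installed, pass one of the documented functions, e.g.:
#   from aiqclib.common.config.yaml_templates import get_config_train_set_template
#   write_template(get_config_train_set_template, "config/training.yaml")
# ("config/training.yaml" is a placeholder path.)


def fake_template():
    """Stand-in returning made-up YAML so the sketch runs on its own."""
    return "training_sets: {}\n"


with tempfile.TemporaryDirectory() as tmp:
    written = write_template(fake_template, Path(tmp) / "config" / "training.yaml")
    content = written.read_text(encoding="utf-8")

print(content)
```

Passing the function itself rather than its result keeps the helper usable with any of the six template getters.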