aiqclib.classify.step3_select_profiles package
Submodules
aiqclib.classify.step3_select_profiles.dataset_all module
This module defines the SelectDataSetAll class, a specialized profile selection mechanism within the aiqclib library. It is designed to select all available profiles from a given input dataset (typically Copernicus CTD data) and assign initial labels and identifiers for subsequent classification tasks.
- class aiqclib.classify.step3_select_profiles.dataset_all.SelectDataSetAll(config, input_data=None)[source]
Bases: ProfileSelectionBase
A subclass of ProfileSelectionBase that selects all profiles from Copernicus CTD data.
This class initializes a selection process where all input profiles are considered, and initial labels (e.g., ‘negative’) and unique identifiers are assigned before further processing or classification.
- Parameters:
config (ConfigBase)
input_data (DataFrame | None)
- default_file_name: str
Default file name to which selected profiles are written.
- expected_class_name: str = 'SelectDataSetAll'
- key_col_names: List[str]
Columns used as unique identifiers for grouping/merging (e.g., by platform or profile).
- label_profiles()[source]
Select and label positive and negative datasets before combining them into a single DataFrame in selected_profiles.
In this specific implementation, all profiles are initially selected and labeled as ‘negative’ (label 0) by calling select_all_profiles(). This method effectively serves as the entry point for the profile selection and initial labeling process.
- Return type:
None
- output_file_name: str
Full path for the output file, resolved via the config.
- select_all_profiles()[source]
Select all profiles from the input data and prepare them with initial labeling and unique identifiers.
This method processes the input_data to create a DataFrame of unique profiles. It adds the following columns:
neg_profile_id (uint32): Initialized to 0. This column can serve as a placeholder for later assignment of specific negative profile identifiers, though it is not a unique ID in this step.
label (uint32): Initialized to 0, indicating an unclassified or ‘negative’ profile in the context of subsequent classification.
profile_id (int): A unique 1-based row index assigned to each selected profile, serving as its primary identifier.
The resulting DataFrame is assigned to selected_profiles. All profiles are made unique based on their key columns (platform, profile number, timestamp, longitude, latitude) before profile_id is assigned.
- Return type:
None
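The steps described for select_all_profiles() can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not the library's implementation: the function name select_all_profiles_sketch, the KEY_COLS list, and the key column names (platform_code, profile_no, profile_timestamp) are hypothetical stand-ins for whatever key_col_names actually contains, and the sketch returns the DataFrame instead of assigning it to selected_profiles.

```python
import numpy as np
import pandas as pd

# Hypothetical key column names; the real ones come from key_col_names.
KEY_COLS = ["platform_code", "profile_no", "profile_timestamp", "longitude", "latitude"]


def select_all_profiles_sketch(input_data: pd.DataFrame) -> pd.DataFrame:
    """Return one row per unique profile with initial labels and identifiers."""
    # Reduce observation-level rows to one row per unique key combination.
    profiles = input_data[KEY_COLS].drop_duplicates().reset_index(drop=True)

    # Placeholder columns, both initialized to 0 as uint32.
    profiles["neg_profile_id"] = np.zeros(len(profiles), dtype="uint32")
    profiles["label"] = np.zeros(len(profiles), dtype="uint32")

    # 1-based row index assigned after de-duplication.
    profiles["profile_id"] = np.arange(1, len(profiles) + 1)
    return profiles


# Illustrative data: four observation rows belonging to three distinct profiles.
obs = pd.DataFrame(
    {
        "platform_code": ["A", "A", "A", "B"],
        "profile_no": [1, 1, 2, 1],
        "profile_timestamp": pd.to_datetime(
            ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-03"]
        ),
        "longitude": [10.0, 10.0, 11.0, 12.0],
        "latitude": [50.0, 50.0, 51.0, 52.0],
        "pres": [1.0, 2.0, 1.0, 1.0],
    }
)
selected = select_all_profiles_sketch(obs)
```

Assigning profile_id only after drop_duplicates() keeps the identifiers contiguous, which matches the ordering described above (uniqueness first, then the 1-based index).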