coco_pipe.descriptors.core

Descriptor extraction planner and execution pipeline.

This module owns the config-bound runtime orchestration for descriptor extraction. It does not implement family-specific descriptor math; instead it:

validates the explicit runtime inputs accepted by the module
instantiates enabled descriptor families from typed config
plans shared PSD computation for compatible PSD consumers
executes one observation batch at a time with controlled parallelism
merges aligned family outputs into one flat descriptor matrix
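The PSD planning step above can be pictured as grouping consumers by their PSD settings so that each distinct spectrum is estimated only once. The sketch below is illustrative and is not the module's planner: `PSDSpec`, `plan_psd_groups`, the family names, and the rule that "compatible" means "identical settings" are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PSDSpec:
    """Hypothetical PSD settings; the real planner's fields are not documented here."""

    method: str
    fmin: float
    fmax: float


def plan_psd_groups(consumers):
    """Group (family, spec) pairs so each distinct spec is computed once.

    Assumption: two consumers are compatible only when their specs are equal.
    """
    groups = defaultdict(list)
    for family, spec in consumers:
        groups[spec].append(family)
    # Dicts preserve insertion order, so groups follow first appearance.
    return [(spec, families) for spec, families in groups.items()]


# Hypothetical family names, used only to exercise the grouping.
consumers = [
    ("bands", PSDSpec("welch", 1.0, 45.0)),
    ("parametric", PSDSpec("welch", 1.0, 45.0)),
    ("ratios", PSDSpec("multitaper", 1.0, 45.0)),
]
psd_plan = plan_psd_groups(consumers)
```

With these inputs, the two Welch consumers share one group and the multitaper consumer gets its own, so two PSD estimates would serve three families.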

Author: Hamza Abdelhedi (hamza.abdelhedi@umontreal.ca)

Classes

DescriptorPipeline

Run config-driven descriptor extraction on explicit arrays.

Module Contents

class coco_pipe.descriptors.core.DescriptorPipeline(config)

Run config-driven descriptor extraction on explicit arrays.

Parameters:

config (DescriptorConfig or Mapping[str, Any]) – Typed descriptors configuration or a mapping accepted by DescriptorConfig.

Variables:

config (DescriptorConfig) – Parsed descriptors configuration.
extractors (list of BaseDescriptorExtractor) – Enabled family extractors in deterministic family order.
signal_extractors (list of BaseDescriptorExtractor) – Enabled non-PSD extractors that consume raw signal batches directly.
psd_groups (list of _PSDGroup) – Planned PSD reuse groups derived once from the enabled extractors.
family_order (list of str) – Deterministic family order used when merging batch-local outputs.

Notes

The pipeline is config-bound but runtime-stateless. Construction performs config parsing, corrected-band compatibility checks, and planner setup once. Each call to extract() then validates the explicit runtime inputs, executes the planned families, and returns one flat descriptor matrix plus any collected failures.
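As a rough illustration of how batch-local family outputs could be merged into one flat matrix in a fixed family order, consider the sketch below. It uses plain lists instead of NumPy arrays, and `merge_family_outputs` and the family and column names are hypothetical; the library's actual merge logic may differ.

```python
def merge_family_outputs(family_order, outputs):
    """Concatenate per-family blocks column-wise in a fixed family order.

    ``outputs`` maps family -> (column_names, rows), where ``rows`` holds one
    list of values per observation. Families absent from ``outputs`` are skipped.
    """
    present = [family for family in family_order if family in outputs]
    n_obs = {len(outputs[family][1]) for family in present}
    if len(n_obs) > 1:
        raise ValueError(f"Families disagree on observation count: {sorted(n_obs)}")

    names, families = [], []
    rows = [[] for _ in range(n_obs.pop())] if n_obs else []
    for family in present:
        cols, block = outputs[family]
        names.extend(cols)
        # The family token is carried alongside each column, never parsed back.
        families.extend([family] * len(cols))
        for merged_row, block_row in zip(rows, block):
            merged_row.extend(block_row)
    return rows, names, families


outputs = {
    "spectral": (["alpha", "beta"], [[1.0, 2.0], [3.0, 4.0]]),
    "complexity": (["entropy"], [[0.5], [0.7]]),
}
merged, feature, feature_family = merge_family_outputs(
    ["complexity", "spectral"], outputs
)
```

Because the column order is driven by `family_order` rather than by dictionary order or completion order, the resulting layout does not depend on which family finished first.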

config

extractors: list[coco_pipe.descriptors.extractors.base.BaseDescriptorExtractor] = []

signal_extractors

psd_groups = []

family_order

extract(X, ids=None, sfreq=None, channel_names=None)

Extract descriptors from explicit NumPy inputs.

Parameters:

X (np.ndarray) – Signal array with shape (n_obs, n_channels, n_times).
ids (sequence or np.ndarray, optional) – Observation identifiers aligned with X.
sfreq (float, optional) – Sampling frequency in Hertz. Required when enabled families depend on spectral estimates or spectral entropy.
channel_names (sequence of str or np.ndarray, optional) – Channel labels. Required for channel-resolved outputs.

Returns:

Flat container with dimensions ("obs", "feature"). X is the descriptor matrix, coords["feature"] holds the column names, and coords["feature_family"] holds the per-column family token (carried over from the extractors, not parsed). ids holds the observation identifiers, and meta holds failures and sfreq.

Return type:

DataContainer

Raises:

ValueError – If the explicit input contract is not satisfied.
ImportError – If an optional backend required by the enabled families is missing.
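The explicit input contract described above can be approximated by a small validator. This is a sketch under stated assumptions, not the checks that extract() actually runs: `validate_inputs` and its `needs_sfreq` flag are hypothetical names, and the real error messages are not documented here.

```python
import numpy as np


def validate_inputs(X, ids=None, sfreq=None, channel_names=None, needs_sfreq=False):
    """Check array shape and alignment of the optional inputs (illustrative)."""
    X = np.asarray(X)
    if X.ndim != 3:
        raise ValueError(
            f"X must have shape (n_obs, n_channels, n_times); got {X.shape}"
        )
    n_obs, n_channels, _ = X.shape
    if ids is not None and len(ids) != n_obs:
        raise ValueError(f"ids has {len(ids)} entries but X has {n_obs} observations")
    if channel_names is not None and len(channel_names) != n_channels:
        raise ValueError(
            f"channel_names has {len(channel_names)} entries but X has "
            f"{n_channels} channels"
        )
    # Assumption: the caller knows whether any enabled family needs sfreq.
    if needs_sfreq and (sfreq is None or sfreq <= 0):
        raise ValueError("sfreq must be a positive number for spectral families")
    return X


X_checked = validate_inputs(
    np.zeros((4, 2, 100)),
    ids=["a", "b", "c", "d"],
    sfreq=250.0,
    channel_names=["Fz", "Cz"],
    needs_sfreq=True,
)
```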

Notes

When runtime.on_error="warn", extraction still completes and stores the failures in meta["failures"] before emitting one aggregate warning at the pipeline level.

The returned row order always matches the input observation order.
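The "warn" behaviour described in these notes follows a common collect-then-warn pattern: record each failure, finish the remaining work, and emit a single warning at the end. The sketch below illustrates that pattern only. `run_families`, the failure record layout, and the "raise" alternative are assumptions for the example and are not taken from the library.

```python
import warnings


def run_families(families, batch, on_error="warn"):
    """Run each family callable, collecting failures instead of stopping."""
    results, failures = {}, []
    for name, func in families.items():
        try:
            results[name] = func(batch)
        except Exception as exc:
            if on_error == "raise":  # hypothetical alternative mode
                raise
            failures.append({"family": name, "error": repr(exc)})

    # Store failures first, then emit one aggregate warning.
    meta = {"failures": failures}
    if failures and on_error == "warn":
        warnings.warn(
            f"{len(failures)} descriptor family failure(s); see meta['failures'].",
            RuntimeWarning,
            stacklevel=2,
        )
    return results, meta


def broken(batch):
    raise RuntimeError("backend failed")


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    results, meta = run_families({"ok": sum, "broken": broken}, [1, 2, 3])
```

Here the successful family still produces its result, the failure is recorded in `meta["failures"]`, and exactly one warning is captured regardless of how many families fail.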

pool_channels(container, channel_groups)

Pool sensor-level descriptor columns into grouped channel outputs.

Parameters:

container (DataContainer) – Flat descriptor container produced by extract().
channel_groups (mapping of str to sequence of str) – Channel groups used to replace sensor-level descriptor columns with grouped "chgrp-..." outputs.

Returns:

New container with grouped channel features. ids and meta (including failures) are preserved; coords["feature"] and coords["feature_family"] reflect the pooled columns.

Return type:

DataContainer

Raises:

ValueError – If the container is malformed or if any requested group cannot be formed from the sensor-level descriptor columns.
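To make the pooling idea concrete, here is a minimal sketch that replaces sensor-level columns with grouped ones. Several details are assumptions rather than documented behaviour: the sensor-level naming scheme `<descriptor>_ch-<channel>`, the use of the mean as the pooling function, and the helper name `pool_columns`. Only the "chgrp-" prefix of the grouped outputs comes from the description above.

```python
from statistics import fmean


def pool_columns(feature_names, rows, channel_groups):
    """Average assumed "<descriptor>_ch-<channel>" columns into "chgrp-" columns."""
    # Index sensor-level columns as descriptor -> {channel: column index}.
    by_descriptor = {}
    for idx, name in enumerate(feature_names):
        descriptor, sep, channel = name.rpartition("_ch-")
        if sep:
            by_descriptor.setdefault(descriptor, {})[channel] = idx

    pooled_names, pooled_cols = [], []
    for descriptor, channel_to_idx in by_descriptor.items():
        for group, members in channel_groups.items():
            missing = [ch for ch in members if ch not in channel_to_idx]
            if missing or not members:
                raise ValueError(
                    f"Cannot form group {group!r} for {descriptor!r}: "
                    f"missing {missing}"
                )
            idxs = [channel_to_idx[ch] for ch in members]
            pooled_names.append(f"{descriptor}_chgrp-{group}")
            pooled_cols.append([fmean([row[i] for i in idxs]) for row in rows])

    # Columns that are not sensor-level pass through unchanged.
    keep = [i for i, name in enumerate(feature_names) if "_ch-" not in name]
    out_names = [feature_names[i] for i in keep] + pooled_names
    out_rows = [
        [row[i] for i in keep] + [col[r] for col in pooled_cols]
        for r, row in enumerate(rows)
    ]
    return out_names, out_rows


names = ["power_ch-Fz", "power_ch-Cz", "power_ch-Pz", "global_entropy"]
rows = [[1.0, 3.0, 5.0, 0.2], [2.0, 4.0, 6.0, 0.4]]
groups = {"frontal": ["Fz", "Cz"], "posterior": ["Pz"]}
out_names, out_rows = pool_columns(names, rows, groups)
```

As in the documented method, a group that cannot be formed from the available sensor-level columns raises ValueError instead of silently producing a partial average.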