coco_pipe.descriptors.tables#
Assemble descriptor extraction containers into epoch/subject feature tables.
The descriptor lifecycle is extract → reject → aggregate → merge:
- extract: coco_pipe.descriptors.core.DescriptorPipeline.extract() returns a flat ("obs", "feature") DataContainer.
- reject: epoch MAD-outlier rejection is just coco_pipe.io.quality.drop_epoch_outliers() on that container; mad_failures_from_qc() turns the dropped epochs into failure records.
- aggregate: build_descriptor_tables() builds the per-epoch table and the group-aggregated subject table (mean + extra grouped stats + optional band ratios) via aggregate() / aggregate_groups().
- merge: cross-shard concatenation lives in coco_pipe.descriptors.io.merge_descriptor_tables().
Project-specific concerns (BIDS grouping-key derivation, channel-group pooling, shard file layout, QC reports) stay with the caller.
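The reject step can be pictured with a small NumPy sketch of MAD-based outlier flagging. This is not the implementation of drop_epoch_outliers(): the helper name mad_outlier_mask, the per-epoch score array, the threshold of 3 scaled MADs, and the 1.4826 scaling constant are all assumptions made for illustration.

```python
import numpy as np


def mad_outlier_mask(values, n_mads=3.0):
    """Hypothetical helper: True where a value is an outlier by the MAD rule."""
    values = np.asarray(values, dtype=float)
    median = np.nanmedian(values)
    # Median absolute deviation around the median.
    mad = np.nanmedian(np.abs(values - median))
    if mad == 0:
        # No spread to compare against: flag nothing rather than everything.
        return np.zeros(values.shape, dtype=bool)
    # 1.4826 makes the MAD comparable to a standard deviation for normal data
    # (an assumption here, not a documented library setting).
    scaled_mad = 1.4826 * mad
    return np.abs(values - median) > n_mads * scaled_mad


# One illustrative summary score per epoch; the last epoch is clearly aberrant.
epoch_scores = np.array([1.0, 1.1, 0.9, 1.05, 9.0])
drop = mad_outlier_mask(epoch_scores)
kept = epoch_scores[~drop]
```

In this toy input only the last epoch is flagged; the flagged epochs are the ones that would then be reported as failure records.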
Author: Hamza Abdelhedi <hamza.abdelhedi@umontreal.ca>
Functions#
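| mad_failures_from_qc() | Turn epochs dropped by drop_epoch_outliers() into failure records. |
| add_aggregated_band_ratios() | Compute band-ratio columns from aggregated mean band-power features. |
| build_descriptor_tables() | Build epoch- and group-aggregated subject-level descriptor tables. |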
Module Contents#
- coco_pipe.descriptors.tables.mad_failures_from_qc(qc_result)#
Turn epochs dropped by drop_epoch_outliers() into failure records.
Mirrors the extractor failure schema (obs_id, obs_index, channel_index, channel_name, family, exception_type, message) so MAD drops flow into the same failure log as extraction failures. When rejection was made per descriptor group, one record is emitted per (group, epoch); otherwise one per dropped epoch.
- Parameters:
qc_result (coco_pipe.io.quality.QCResult | None) – The QCResult returned by coco_pipe.io.quality.drop_epoch_outliers() (None yields []).
- Return type:
- coco_pipe.descriptors.tables.add_aggregated_band_ratios(base_features_df, ratio_pairs, floor=0.0, prefixes=DEFAULT_RATIO_PREFIXES)#
Compute band-ratio columns from aggregated mean band-power features.
For each (numerator, denominator) band pair and each (input_prefix, output_prefix) in prefixes, divide every matching {input_prefix}{numerator}_{suffix} column by its {input_prefix}{denominator}_{suffix} counterpart. Denominators at or below floor yield NaN instead of an unstable division.
- Returns:
One column per emitted ratio (empty when nothing matched). Aligned to base_features_df’s row index.
- Return type:
- Parameters:
base_features_df (pandas.DataFrame)
ratio_pairs (collections.abc.Sequence[tuple[str, str]])
floor (float)
prefixes (collections.abc.Sequence[tuple[str, str]])
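The column-matching and floor behaviour described above can be sketched in plain pandas. This is an illustration under assumptions, not the library's code: the helper name band_ratio_sketch, the example prefix pair ("mean_", "ratio_"), and the output column pattern {output_prefix}{numerator}_{denominator}_{suffix} are hypothetical, since neither DEFAULT_RATIO_PREFIXES nor the real output naming is shown here.

```python
import numpy as np
import pandas as pd


def band_ratio_sketch(base, ratio_pairs, floor=0.0, prefixes=(("mean_", "ratio_"),)):
    """Hypothetical stand-in for the documented ratio logic."""
    out = {}
    for numerator, denominator in ratio_pairs:
        for in_prefix, out_prefix in prefixes:
            stem = f"{in_prefix}{numerator}_"
            for col in base.columns:
                if not col.startswith(stem):
                    continue
                suffix = col[len(stem):]
                den_col = f"{in_prefix}{denominator}_{suffix}"
                if den_col not in base.columns:
                    continue  # no matching denominator column, nothing to emit
                # Denominators at or below the floor become NaN, so the ratio is NaN.
                safe_den = base[den_col].where(base[den_col] > floor, np.nan)
                # Output naming is an assumption made for this sketch.
                out[f"{out_prefix}{numerator}_{denominator}_{suffix}"] = base[col] / safe_den
    # Empty when nothing matched, but still aligned to the input row index.
    return pd.DataFrame(out, index=base.index)


subject_means = pd.DataFrame(
    {"mean_theta_Fz": [4.0, 2.0], "mean_alpha_Fz": [2.0, 0.0]},
    index=["sub-01", "sub-02"],
)
ratios = band_ratio_sketch(subject_means, [("theta", "alpha")])
```

Here sub-02 has an alpha value equal to the floor, so its theta/alpha ratio comes out as NaN instead of an infinite or unstable value.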
- coco_pipe.descriptors.tables.build_descriptor_tables(container, metadata_df, group_by, id_col='obs_id', target_col=None, aggregation_groups=None, ratio_pairs=None, ratio_floor=0.0, ratio_prefixes=DEFAULT_RATIO_PREFIXES, min_count=1, on_insufficient='raise')#
Build epoch- and group-aggregated subject-level descriptor tables.
- Parameters:
container (coco_pipe.io.structures.DataContainer) – Flat ("obs", "feature") descriptor container from extract() (typically after epoch MAD rejection).
metadata_df (pandas.DataFrame) – One row per epoch, aligned with container.X. Must contain id_col; every other column is carried as an observation coordinate and, when constant within a group, into the subject table.
group_by (str) – Metadata column defining the aggregation groups (e.g. a recording id).
id_col (str) – Observation-id column in metadata_df (default "obs_id").
target_col (str | None) – Optional target column carried onto the subject table.
aggregation_groups (collections.abc.Sequence[collections.abc.Mapping[str, Any]] | None) – aggregate_groups specs producing the subject feature columns (each {"stats": ..., <selectors>}). Defaults to [{"stats": "mean"}] (mean of every feature). This is where median, IQR, and other subject-level stats are requested.
ratio_pairs (collections.abc.Sequence[tuple[str, str]] | None) – When given, band ratios computed from the aggregated mean features are appended via add_aggregated_band_ratios().
ratio_floor (float) – Only relevant when ratio_pairs is given; used for the band ratios appended via add_aggregated_band_ratios().
ratio_prefixes (collections.abc.Sequence[tuple[str, str]]) – Only relevant when ratio_pairs is given; used for the band ratios appended via add_aggregated_band_ratios().
min_count (int) – Forwarded, together with on_insufficient, to aggregate() and aggregate_groups.
on_insufficient (str) – Forwarded to aggregate() and aggregate_groups. With on_insufficient="warn", a group (or a single descriptor family within aggregate_groups) whose surviving rows are all-NaN emits NaN features instead of raising, so the subject is retained with whatever else is computable.
- Returns:
epoch_df, subject_df, epoch_feature_columns, and subject_feature_columns.
- Return type:
- Raises:
ValueError – If id_col or group_by is missing from metadata_df.
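As a rough picture of what the default aggregation amounts to, the following pandas sketch builds an epoch table and a group-mean subject table from plain DataFrames. It is a simplified stand-in, not build_descriptor_tables() itself: the helper name build_tables_sketch, the use of DataFrames instead of a DataContainer, and the reading of "constant within a group" as "constant within every group" are assumptions.

```python
import pandas as pd


def build_tables_sketch(features, metadata, group_by, id_col="obs_id"):
    """Hypothetical helper: epoch table plus group-mean subject table."""
    for required in (id_col, group_by):
        if required not in metadata.columns:
            raise ValueError(f"metadata is missing required column {required!r}")
    feature_cols = list(features.columns)
    # Epoch table: one row per epoch, metadata columns next to feature columns.
    epoch_df = pd.concat(
        [metadata.reset_index(drop=True), features.reset_index(drop=True)], axis=1
    )
    grouped = epoch_df.groupby(group_by, sort=True)
    # Default behaviour mirrored here: mean of every feature per group.
    subject_df = grouped[feature_cols].mean()
    # Carry only metadata columns that hold a single value inside every group.
    for col in metadata.columns:
        if col in (id_col, group_by):
            continue
        if (grouped[col].nunique(dropna=False) <= 1).all():
            subject_df[col] = grouped[col].first()
    return epoch_df, subject_df.reset_index()


features = pd.DataFrame({"alpha_Fz": [1.0, 3.0, 5.0], "theta_Fz": [2.0, 4.0, 6.0]})
metadata = pd.DataFrame(
    {
        "obs_id": ["e0", "e1", "e2"],
        "recording": ["r1", "r1", "r2"],
        "age": [30, 30, 41],
        "epoch_start": [0.0, 2.0, 0.0],
    }
)
epoch_df, subject_df = build_tables_sketch(features, metadata, group_by="recording")
```

In this toy input, age is constant within each recording and is carried onto the subject table, while epoch_start varies within r1 and is left out.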