coco_pipe.io.embeddings#

BIDS-compatible derivative I/O for foundation-model embeddings.

Functions#

embedding_sidecar_path(path)

Return the JSON sidecar paired with an embedding NPZ.

validate_embedding_derivative(path)

Validate arrays, shape consistency, and sidecar presence.

save_embedding_derivative(result, path[, metadata, ...])

Atomically write an embedding NPZ and matching JSON sidecar.

discover_embedding_derivatives(root[, model_key])

Discover valid embedding NPZ artifacts under a derivative root.

load_embedding_derivatives(paths[, representation, ...])

Load embedding artifacts into a 2-D DataContainer.

combined_embedding_table_path(derivative_root, ...[, ...])

Path to the merged per-(model, condition) embedding table.

load_combined_embedding_table(derivative_root, ...[, ...])

Load one merged per-(model, condition) embedding table as a 2-D container.

write_embedding_manifest(root, records)

Write a JSON run manifest indexing successful and failed artifacts.

write_embedding_dataset_description(root, name, ...[, ...])

Write a BIDS derivative dataset_description.json.

Module Contents#

coco_pipe.io.embeddings.embedding_sidecar_path(path)#

Return the JSON sidecar paired with an embedding NPZ.

Parameters:

path (str | pathlib.Path)

Return type:

pathlib.Path
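The exact naming rule is not stated on this page. As a minimal sketch, assuming the sidecar shares the NPZ file's stem and differs only in its `.json` suffix, the pairing could look like this (`sidecar_path_sketch` is a hypothetical stand-in, not the library function):

```python
from pathlib import Path


def sidecar_path_sketch(path):
    """Hypothetical stand-in: swap the final suffix for .json.

    Assumption: the sidecar sits next to the NPZ and shares its stem.
    """
    return Path(path).with_suffix(".json")


npz = Path("derivatives") / "sub-01" / "sub-01_task-rest_embeddings.npz"
print(sidecar_path_sketch(npz))
```

Accepting both `str` and `pathlib.Path` and always returning a `pathlib.Path` mirrors the documented parameter and return types.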

coco_pipe.io.embeddings.validate_embedding_derivative(path)#

Validate arrays, shape consistency, and sidecar presence.

Parameters:

path (str | pathlib.Path)

Return type:

dict[str, Any]

coco_pipe.io.embeddings.save_embedding_derivative(result, path, metadata=None, overwrite=False)#

Atomically write an embedding NPZ and matching JSON sidecar.

Parameters:

result

path

metadata

overwrite

Return type:

tuple[pathlib.Path, pathlib.Path]
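How the library achieves atomicity is not documented here. A common pattern, shown below as a sketch for the JSON sidecar only, is to write to a temporary file in the target directory and then rename it over the destination with `os.replace`. The helper names and the `FileExistsError` behavior for `overwrite=False` are assumptions made for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(target, payload):
    """Write payload to target via a temp file in the same directory."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the partial temp file so a failed write leaves no debris.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


def write_sidecar_sketch(path, metadata, overwrite=False):
    """Hypothetical sidecar writer; refusing to overwrite is an assumption."""
    path = Path(path)
    if path.exists() and not overwrite:
        raise FileExistsError(path)
    payload = json.dumps(metadata, indent=2, sort_keys=True).encode("utf-8")
    return atomic_write_bytes(path, payload)
```

With this pattern, readers see either the previous file or the complete new one, never a half-written file.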

coco_pipe.io.embeddings.discover_embedding_derivatives(root, model_key=None)#

Discover valid embedding NPZ artifacts under a derivative root.

Parameters:

root

model_key

Return type:

list[pathlib.Path]

coco_pipe.io.embeddings.load_embedding_derivatives(paths, representation='recording', aggregate_by=None, model_key=None)#

Load embedding artifacts into a 2-D DataContainer.

representation selects the row granularity: "epoch" yields one row per epoch, read from the on-disk window_embeddings array, and "recording" yields one row per recording, taken from the pooled recording_embedding. The coarser "subject" level is not available here because it pools across recordings; the merge step produces it.

Parameters:

paths

representation

aggregate_by

model_key

Return type:

coco_pipe.io.structures.DataContainer

coco_pipe.io.embeddings.combined_embedding_table_path(derivative_root, model_key, condition, representation='recording')#

Path to the merged per-(model, condition) embedding table.

Mirrors the combined/<model>_<condition>_<representation>_embeddings.parquet layout written by the merge step. representation is one of AGGREGATION_LEVELS.

Parameters:

derivative_root

model_key

condition

representation

Return type:

pathlib.Path
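The documented layout can be reproduced with a few lines of `pathlib`. This is a sketch, not the library code: `combined_table_path_sketch` is a hypothetical name, and the contents of `ASSUMED_LEVELS` are inferred from the levels mentioned on this page rather than read from the real `AGGREGATION_LEVELS` constant.

```python
from pathlib import Path

# Assumed contents; the real AGGREGATION_LEVELS constant may differ.
ASSUMED_LEVELS = ("epoch", "recording", "subject")


def combined_table_path_sketch(
    derivative_root, model_key, condition, representation="recording"
):
    """Build combined/<model>_<condition>_<representation>_embeddings.parquet."""
    if representation not in ASSUMED_LEVELS:
        raise ValueError(f"unknown representation: {representation!r}")
    name = f"{model_key}_{condition}_{representation}_embeddings.parquet"
    return Path(derivative_root) / "combined" / name


print(combined_table_path_sketch("derivatives", "modelA", "rest"))
```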

coco_pipe.io.embeddings.load_combined_embedding_table(derivative_root, model_key, condition, representation='recording', aggregate_by=None)#

Load one merged per-(model, condition) embedding table as a 2-D container.

This reads the single parquet file that the merge step already materialized, rather than rescanning and filtering every per-recording NPZ. The result is one table read per condition, the same access pattern descriptors use. representation is one of AGGREGATION_LEVELS. The table carries id columns (subject/session/run/condition/recording_id/model_key[/window_index]) plus embedding_* feature columns; the id columns become coords and the feature columns become X.

Parameters:

derivative_root

model_key

condition

representation

aggregate_by

Return type:

coco_pipe.io.structures.DataContainer
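The split between id columns and embedding_* feature columns can be illustrated without the library. The sketch below works on plain dictionaries instead of a parquet table or a DataContainer; the function names are hypothetical, and the id column list is copied from the description above.

```python
# Id columns as listed in the description; window_index is optional.
ID_COLUMNS = (
    "subject", "session", "run", "condition",
    "recording_id", "model_key", "window_index",
)


def split_columns_sketch(columns):
    """Separate id columns from embedding_* feature columns."""
    id_cols = [c for c in columns if c in ID_COLUMNS]
    feature_cols = [c for c in columns if c.startswith("embedding_")]
    return id_cols, feature_cols


def rows_to_coords_and_X(rows):
    """Turn row dicts into (coords, X): id columns vs. a 2-D feature list."""
    id_cols, feature_cols = split_columns_sketch(list(rows[0]))
    coords = {c: [row[c] for row in rows] for c in id_cols}
    X = [[row[c] for c in feature_cols] for row in rows]
    return coords, X


rows = [
    {"subject": "01", "recording_id": "r1", "embedding_0": 0.1, "embedding_1": 0.2},
    {"subject": "02", "recording_id": "r2", "embedding_0": 0.3, "embedding_1": 0.4},
]
coords, X = rows_to_coords_and_X(rows)
```

Each row of `X` lines up with the same position in every `coords` list, which is the property a 2-D container with per-row coordinates relies on.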

coco_pipe.io.embeddings.write_embedding_manifest(root, records)#

Write a JSON run manifest indexing successful and failed artifacts.

Parameters:

root

records

Return type:

pathlib.Path

coco_pipe.io.embeddings.write_embedding_dataset_description(root, name, bids_version, generated_by, source_datasets=None)#

Write a BIDS derivative dataset_description.json.

Parameters:

root

name

bids_version

generated_by

source_datasets

Return type:

pathlib.Path
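For orientation, a BIDS derivative dataset_description.json typically carries the keys Name, BIDSVersion, DatasetType, GeneratedBy, and optionally SourceDatasets. The sketch below shows one plausible mapping from this function's arguments to those keys; the mapping itself and the helper name are assumptions, not the library's implementation.

```python
import json
import tempfile
from pathlib import Path


def dataset_description_sketch(name, bids_version, generated_by, source_datasets=None):
    """Assemble a derivative dataset description as a plain dict (assumed mapping)."""
    description = {
        "Name": name,
        "BIDSVersion": bids_version,
        "DatasetType": "derivative",
        "GeneratedBy": list(generated_by),
    }
    if source_datasets:
        description["SourceDatasets"] = list(source_datasets)
    return description


# Write the file into a throwaway directory to show the on-disk result.
with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "dataset_description.json"
    payload = dataset_description_sketch(
        "example embeddings", "1.8.0", [{"Name": "coco_pipe"}]
    )
    target.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    written = json.loads(target.read_text(encoding="utf-8"))
```

The version string and the GeneratedBy entry above are placeholder values chosen for the example.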