coco_pipe.io.embeddings
=======================

.. py:module:: coco_pipe.io.embeddings

.. autoapi-nested-parse::

   BIDS-compatible derivative I/O for foundation-model embeddings.


Functions
---------

.. autoapisummary::

   coco_pipe.io.embeddings.embedding_sidecar_path
   coco_pipe.io.embeddings.validate_embedding_derivative
   coco_pipe.io.embeddings.save_embedding_derivative
   coco_pipe.io.embeddings.discover_embedding_derivatives
   coco_pipe.io.embeddings.load_embedding_derivatives
   coco_pipe.io.embeddings.combined_embedding_table_path
   coco_pipe.io.embeddings.load_combined_embedding_table
   coco_pipe.io.embeddings.write_embedding_manifest
   coco_pipe.io.embeddings.write_embedding_dataset_description


Module Contents
---------------

.. py:function:: embedding_sidecar_path(path)

   Return the JSON sidecar paired with an embedding NPZ.


.. py:function:: validate_embedding_derivative(path)

   Validate arrays, shape consistency, and sidecar presence.


.. py:function:: save_embedding_derivative(result, path, metadata = None, overwrite = False)

   Atomically write an embedding NPZ and matching JSON sidecar.


.. py:function:: discover_embedding_derivatives(root, model_key = None)

   Discover valid embedding NPZ artifacts under a derivative root.


.. py:function:: load_embedding_derivatives(paths, representation = 'recording', aggregate_by = None, model_key = None)

   Load embedding artifacts into a 2-D DataContainer.

   ``representation`` is ``"epoch"`` (one row per epoch, read from the on-disk
   ``window_embeddings`` array) or ``"recording"`` (the pooled
   ``recording_embedding``). A coarser ``"subject"`` level is produced by the
   merge step, not here (it pools across recordings).


.. py:function:: combined_embedding_table_path(derivative_root, model_key, condition, representation = 'recording')

   Path to the merged per-(model, condition) embedding table.

   Mirrors the ``combined/___embeddings.parquet`` layout written by the merge
   step. ``representation`` is one of :data:`~coco_pipe.io.AGGREGATION_LEVELS`.


.. py:function:: load_combined_embedding_table(derivative_root, model_key, condition, representation = 'recording', aggregate_by = None)

   Load one merged per-(model, condition) embedding table as a 2-D container.

   This reads the single parquet the merge step already materialized instead of
   rescanning every per-recording NPZ and filtering: one table read per
   condition, the same access pattern descriptors use. ``representation`` is one
   of :data:`~coco_pipe.io.AGGREGATION_LEVELS`.

   The table carries id columns
   (subject/session/run/condition/recording_id/model_key[/window_index]) plus
   ``embedding_*`` feature columns; the former become coords, the latter ``X``.


.. py:function:: write_embedding_manifest(root, records)

   Write a JSON run manifest indexing successful and failed artifacts.


.. py:function:: write_embedding_dataset_description(root, name, bids_version, generated_by, source_datasets = None)

   Write a BIDS derivative dataset_description.json.
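
The writers above are documented only by their signatures, so the following is a minimal standard-library sketch of what a BIDS derivative ``dataset_description.json`` writer with the same arguments could look like. It is not the ``coco_pipe`` implementation. The names ``write_json_atomically`` and ``sketch_dataset_description`` are hypothetical, and the temp-file-then-rename strategy is an assumption about how an atomic write might be done. The JSON keys (``Name``, ``BIDSVersion``, ``DatasetType``, ``GeneratedBy``, ``SourceDatasets``) follow the BIDS specification for derivative datasets.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(path, payload):
    """Hypothetical helper: write JSON so readers never see a partial file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2, sort_keys=True)
        # os.replace swaps the finished file into place in one step,
        # overwriting any existing file at ``path``.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if writing or renaming failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return path


def sketch_dataset_description(root, name, bids_version, generated_by,
                               source_datasets=None):
    """Hypothetical stand-in mirroring the documented argument list."""
    description = {
        "Name": name,
        "BIDSVersion": bids_version,
        # BIDS marks derivative datasets with this DatasetType value.
        "DatasetType": "derivative",
        # BIDS expects a list of objects, each with at least a "Name" key.
        "GeneratedBy": list(generated_by),
    }
    if source_datasets:
        description["SourceDatasets"] = list(source_datasets)
    return write_json_atomically(
        Path(root) / "dataset_description.json", description
    )


# Usage: write into a throwaway directory and read the file back.
with tempfile.TemporaryDirectory() as tmp:
    out = sketch_dataset_description(
        tmp, "example embeddings", "1.8.0", [{"Name": "coco_pipe"}]
    )
    print(json.loads(out.read_text(encoding="utf-8"))["DatasetType"])
```

Writing to a temporary file in the destination directory and renaming it afterwards means an interrupted run leaves either the previous file or no file, never a truncated one. The same pattern would apply to a JSON sidecar or run manifest, although how ``coco_pipe`` actually achieves atomicity is not stated here.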