coco_pipe.dim_reduction.config

Strict configuration models and reducer registry for the dim-reduction module.

This module defines:

  • canonical reducer names and lazy registry lookup

  • strict pydantic configs for each supported reducer

  • evaluation configuration with early validation for metric and ranking choices

The config layer follows the same explicit design as the rest of the module: exact method names, no aliasing, no compatibility wrappers, and no permissive extra fields.

Author: Hamza Abdelhedi (hamza.abdelhedi@umontreal.ca)

Attributes

METHODS

Canonical public method names accepted by the registry.

DEFAULT_EVAL_GROUP_COL

Default grouping column used when building post-hoc eval specs.

MISSING_EVAL_VALUES

Label/group values that are treated as missing during eval alignment.

Classes

BaseReducerConfig

Base configuration shared by all reducer configs.

StochasticReducerConfig

Mixin for reducers that expose a random seed.

PCAConfig

Configuration for PCA.

IncrementalPCAConfig

Configuration for Incremental PCA.

DaskPCAConfig

Configuration for Dask PCA.

DaskTruncatedSVDConfig

Configuration for Dask TruncatedSVD.

UMAPConfig

Configuration for UMAP.

TSNEConfig

Configuration for TSNE.

PacmapConfig

Configuration for Pacmap.

TrimapConfig

Configuration for Trimap.

PHATEConfig

Configuration for PHATE.

IsomapConfig

Configuration for Isomap.

LLEConfig

Configuration for LLE.

MDSConfig

Configuration for MDS.

SpectralEmbeddingConfig

Configuration for Spectral Embedding.

DMDConfig

Configuration for DMD.

TRCAConfig

Configuration for TRCA.

TopologicalAEConfig

Configuration for Topological Autoencoder.

IVISConfig

Configuration for IVIS.

ParametricUMAPConfig

Configuration for Parametric UMAP.

EvaluationConfig

Configuration for post-hoc evaluation and method comparison.

Functions

get_reducer_class(method)

Return the reducer class registered for one canonical method name.

parse_eval_specs(raw_specs, subject_col)

Parse raw eval spec input into a validated list of spec dicts.

Module Contents

coco_pipe.dim_reduction.config.METHODS = ('PCA', 'IncrementalPCA', 'DaskPCA', 'DaskTruncatedSVD', 'Isomap', 'LLE', 'MDS',...#
coco_pipe.dim_reduction.config.get_reducer_class(method)#

Return the reducer class registered for one canonical method name.

Parameters:

method (str) – Canonical public name of the reduction method.

Returns:

The reducer class (subclass of BaseReducer).

Return type:

class

Raises:
  • ValueError – If method is not one of the canonical names in METHODS.

  • ImportError – If the reducer backend cannot be imported.

Notes

Registry lookup is exact and case-sensitive. The dim-reduction module does not support aliasing or case normalization.

See also

METHODS

Canonical public method names accepted by the registry.

BaseReducerConfig

Base type for typed reducer configuration objects.

Examples

>>> cls = get_reducer_class("PCA")
>>> cls.__name__
'PCAReducer'
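
The exact, lazy lookup described in the Notes can be sketched with a table of import paths that is resolved only on request. This is an illustrative stand-in, not the module's implementation: `_REGISTRY`, `lookup`, and the stdlib classes used as placeholder backends are all assumptions.

```python
import importlib

# Hypothetical registry: canonical name -> (module path, class name).
# Stdlib classes stand in for real reducer backends.
_REGISTRY = {
    "OrderedDict": ("collections", "OrderedDict"),
    "Fraction": ("fractions", "Fraction"),
}


def lookup(method: str) -> type:
    """Resolve one canonical name; no aliasing, no case normalization."""
    try:
        module_path, class_name = _REGISTRY[method]
    except KeyError:
        raise ValueError(
            f"Unknown method {method!r}; expected one of {sorted(_REGISTRY)}"
        ) from None
    # The backend module is imported only when it is requested (lazy lookup).
    # A missing backend surfaces here as ImportError.
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


print(lookup("Fraction").__name__)
try:
    lookup("fraction")  # wrong case is rejected, not normalized
except ValueError as exc:
    print("rejected:", exc)
```

Keeping import paths rather than classes in the table means an optional backend is only imported for the method that actually needs it.
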
class coco_pipe.dim_reduction.config.BaseReducerConfig(/, **data)#

Bases: _StrictConfigModel

Base configuration shared by all reducer configs.

Notes

All reducer configs are strict. Unknown fields are rejected at parse time. Subclasses must expose a canonical method literal and may override to_reducer_kwargs() when the reducer constructor needs renamed fields.
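
The strictness rule can be illustrated with a stdlib-only stand-in. The sketch below is not the module's pydantic code: `StrictConfig`, its `parse` classmethod, and the default value are hypothetical. With pydantic, the same rejection is commonly obtained through `extra="forbid"` in the model config; whether `_StrictConfigModel` does exactly that is an assumption.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any


@dataclass
class StrictConfig:
    """Hypothetical stand-in for a strict config base (not pydantic)."""

    n_components: int = 2  # illustrative default, not the library's

    @classmethod
    def parse(cls, raw: dict[str, Any]) -> "StrictConfig":
        allowed = {f.name for f in fields(cls)}
        unknown = sorted(set(raw) - allowed)
        if unknown:
            # Mirrors "unknown fields are rejected at parse time".
            raise ValueError(f"Unknown fields for {cls.__name__}: {unknown}")
        return cls(**raw)

    def to_reducer_kwargs(self) -> dict[str, Any]:
        # Default behaviour in this sketch: pass fields through unchanged.
        return asdict(self)


cfg = StrictConfig.parse({"n_components": 3})
print(cfg.to_reducer_kwargs())  # {'n_components': 3}
try:
    StrictConfig.parse({"n_component": 3})  # typo in the field name
except ValueError as exc:
    print("rejected:", exc)
```

Rejecting unknown keys turns a misspelled option into an immediate error instead of a silently ignored setting.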

See also

get_reducer_class

Registry lookup for canonical method names.

EvaluationConfig

Post-hoc scoring and ranking configuration.

Parameters:

data (Any)

n_components: int = None#
to_reducer_kwargs()#

Return reducer keyword arguments for this config.

Return type:

dict[str, Any]

class coco_pipe.dim_reduction.config.StochasticReducerConfig(/, **data)#

Bases: _StrictConfigModel

Mixin for reducers that expose a random seed.

Parameters:

data (Any)

random_state: int | None = None#
class coco_pipe.dim_reduction.config.PCAConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for PCA.

Parameters:

data (Any)

method: Literal['PCA'] = 'PCA'#
whiten: bool = None#
svd_solver: str = None#
class coco_pipe.dim_reduction.config.IncrementalPCAConfig(/, **data)#

Bases: BaseReducerConfig

Configuration for Incremental PCA.

Parameters:

data (Any)

method: Literal['IncrementalPCA'] = 'IncrementalPCA'#
batch_size: int | None = None#
whiten: bool = None#
class coco_pipe.dim_reduction.config.DaskPCAConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for Dask PCA.

Parameters:

data (Any)

method: Literal['DaskPCA'] = 'DaskPCA'#
svd_solver: str = None#
class coco_pipe.dim_reduction.config.DaskTruncatedSVDConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for Dask TruncatedSVD.

Parameters:

data (Any)

method: Literal['DaskTruncatedSVD'] = 'DaskTruncatedSVD'#
algorithm: str = None#
class coco_pipe.dim_reduction.config.UMAPConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for UMAP.

Parameters:

data (Any)

method: Literal['UMAP'] = 'UMAP'#
n_neighbors: int = None#
min_dist: float = None#
metric: str = None#
n_epochs: int | None = None#
spread: float = None#
set_op_mix_ratio: float = None#
class coco_pipe.dim_reduction.config.TSNEConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for TSNE.

Parameters:

data (Any)

method: Literal['TSNE'] = 'TSNE'#
perplexity: float = None#
early_exaggeration: float = None#
learning_rate: float | str = None#
max_iter: int = None#
init: str = None#
class coco_pipe.dim_reduction.config.PacmapConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for Pacmap.

Parameters:

data (Any)

method: Literal['Pacmap'] = 'Pacmap'#
n_neighbors: int = None#
MN_ratio: float = None#
FP_ratio: float = None#
nn_backend: str = None#
init: str = None#
class coco_pipe.dim_reduction.config.TrimapConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for Trimap.

Parameters:

data (Any)

method: Literal['Trimap'] = 'Trimap'#
n_inliers: int = None#
n_outliers: int = None#
n_random: int = None#
class coco_pipe.dim_reduction.config.PHATEConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for PHATE.

Parameters:

data (Any)

method: Literal['PHATE'] = 'PHATE'#
knn: int = None#
decay: int = None#
t: int | str = None#
class coco_pipe.dim_reduction.config.IsomapConfig(/, **data)#

Bases: BaseReducerConfig

Configuration for Isomap.

Parameters:

data (Any)

method: Literal['Isomap'] = 'Isomap'#
n_neighbors: int = None#
metric: str = None#
p: int = None#
class coco_pipe.dim_reduction.config.LLEConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for LLE.

Parameters:

data (Any)

method: Literal['LLE'] = 'LLE'#
n_neighbors: int = None#
lle_method: str = None#
to_reducer_kwargs()#

Return reducer keyword arguments with sklearn-compatible names.

Return type:

dict[str, Any]
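
A minimal sketch of what such a rename might look like, assuming that the config's `lle_method` field corresponds to the `method` argument of scikit-learn's `LocallyLinearEmbedding` and that the canonical `method` literal is not forwarded. The helper `lle_reducer_kwargs` and the plain-dict input are hypothetical.

```python
from typing import Any


def lle_reducer_kwargs(config: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: map config fields onto sklearn-style names."""
    kwargs = dict(config)
    # The canonical name 'LLE' selects the reducer; in this sketch it is
    # not treated as a constructor argument.
    kwargs.pop("method", None)
    # Assumed rename: the config's `lle_method` becomes sklearn's `method`.
    if "lle_method" in kwargs:
        kwargs["method"] = kwargs.pop("lle_method")
    return kwargs


print(
    lle_reducer_kwargs(
        {
            "method": "LLE",
            "n_components": 2,
            "n_neighbors": 10,
            "lle_method": "modified",
        }
    )
)
# {'n_components': 2, 'n_neighbors': 10, 'method': 'modified'}
```

The separate field name avoids a clash between the canonical reducer name and the algorithm variant, which both would otherwise be called `method`.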

class coco_pipe.dim_reduction.config.MDSConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for MDS.

Parameters:

data (Any)

method: Literal['MDS'] = 'MDS'#
metric: bool = None#
n_init: int = None#
max_iter: int = None#
dissimilarity: str = None#
class coco_pipe.dim_reduction.config.SpectralEmbeddingConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for Spectral Embedding.

Parameters:

data (Any)

method: Literal['SpectralEmbedding'] = 'SpectralEmbedding'#
affinity: str = None#
gamma: float | None = None#
class coco_pipe.dim_reduction.config.DMDConfig(/, **data)#

Bases: BaseReducerConfig

Configuration for DMD.

Parameters:

data (Any)

method: Literal['DMD'] = 'DMD'#
force_transpose: bool = None#
tlsq_rank: int = None#
exact: bool = None#
opt: bool = None#
class coco_pipe.dim_reduction.config.TRCAConfig(/, **data)#

Bases: BaseReducerConfig

Configuration for TRCA.

Parameters:

data (Any)

method: Literal['TRCA'] = 'TRCA'#
sfreq: float = None#
filterbank: list | None = None#
class coco_pipe.dim_reduction.config.TopologicalAEConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for Topological Autoencoder.

Parameters:

data (Any)

method: Literal['TopologicalAE'] = 'TopologicalAE'#
hidden_dims: list[int] = None#
lam: float = None#
lr: float = None#
batch_size: int = None#
epochs: int = None#
device: str = None#
verbose: int = None#
class coco_pipe.dim_reduction.config.IVISConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for IVIS.

Parameters:

data (Any)

method: Literal['IVIS'] = 'IVIS'#
k: int = None#
model: str = None#
n_epochs_without_progress: int = None#
supervise_metric: str = None#
class coco_pipe.dim_reduction.config.ParametricUMAPConfig(/, **data)#

Bases: BaseReducerConfig, StochasticReducerConfig

Configuration for Parametric UMAP.

Parameters:

data (Any)

method: Literal['ParametricUMAP'] = 'ParametricUMAP'#
n_neighbors: int = None#
min_dist: float = None#
metric: str = None#
n_epochs: int | None = None#
batch_size: int = None#
verbose: bool = None#
class coco_pipe.dim_reduction.config.EvaluationConfig(/, **data)#

Bases: _StrictConfigModel

Configuration for post-hoc evaluation and method comparison.

Parameters:
  • metrics (list of str, optional) – Metric families to compute. Must use canonical evaluator metric names.

  • k_range (list of int, optional) – Neighborhood sizes used for standard structure-preservation metrics.

  • selection_metric (str, optional) – Primary ranking metric. Must be one of the supported ranking metrics and also appear in metrics.

  • selection_k (int, optional) – Neighborhood size used when ranking a k-scoped metric.

  • tie_breakers (list of str, optional) – Additional ranking metrics applied in order. Each value must also be present in metrics.

  • separation_method (str, default="centroid") – Separation definition used for trajectory separation scoring.

  • data (Any)

Notes

EvaluationConfig validates semantic consistency at parse time. Invalid metric names, duplicate entries, invalid separation methods, and ranking metrics that are not part of metrics all fail early.

See also

coco_pipe.dim_reduction.evaluation.core.evaluate_embedding

Pure evaluator that consumes these settings.

coco_pipe.dim_reduction.evaluation.core.MethodSelector

Post-hoc collector and ranker for scored reducers.

Examples

>>> config = EvaluationConfig(
...     metrics=["trustworthiness", "continuity"],
...     k_range=[5, 10],
...     selection_metric="trustworthiness",
...     selection_k=10,
...     tie_breakers=["continuity"],
... )
>>> config.selection_metric
'trustworthiness'
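
The consistency rules from the Notes can be sketched as plain checks. This is a hypothetical stand-in (`check_eval_settings` is not part of the module) and it covers only duplicates and ranking-metric membership; the sets of valid metric names and separation methods are not listed here, so they are left out.

```python
def check_eval_settings(metrics, selection_metric=None, tie_breakers=()):
    """Hypothetical checks mirroring the documented consistency rules."""
    if len(set(metrics)) != len(metrics):
        raise ValueError("metrics contains duplicate entries")
    if selection_metric is not None and selection_metric not in metrics:
        raise ValueError(
            f"selection_metric {selection_metric!r} is not in metrics"
        )
    missing = [m for m in tie_breakers if m not in metrics]
    if missing:
        raise ValueError(f"tie_breakers not in metrics: {missing}")


# A consistent configuration passes silently.
check_eval_settings(
    ["trustworthiness", "continuity"],
    selection_metric="trustworthiness",
    tie_breakers=["continuity"],
)

# Ranking on a metric that is never computed fails early.
try:
    check_eval_settings(["trustworthiness"], selection_metric="continuity")
except ValueError as exc:
    print("rejected:", exc)
```
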
metrics: list[str] = None#
k_range: list[int] = None#
selection_metric: str | None = None#
selection_k: int | None = None#
tie_breakers: list[str] = None#
separation_method: str = None#
to_score_kwargs()#

Return scoring keyword arguments for evaluate_embedding.

Maps the config’s evaluation fields onto the keyword arguments consumed by coco_pipe.dim_reduction.evaluation.core.evaluate_embedding() and by coco_pipe.dim_reduction.core.DimReduction.score(). The ranking fields (selection_metric, selection_k, tie_breakers) are not included in the result; they drive coco_pipe.dim_reduction.evaluation.core.MethodSelector.rank_methods() instead.

Returns:

Mapping with metrics, k_values, and separation_method.

Return type:

dict
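
The split between scoring and ranking fields can be sketched as follows. The helper `score_kwargs` and the plain-dict input are hypothetical, and the mapping of the `k_range` field onto the `k_values` key is an assumption inferred from the field and return descriptions.

```python
from typing import Any


def score_kwargs(config: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for the documented field mapping."""
    return {
        "metrics": list(config["metrics"]),
        # Assumed rename: the config field `k_range` feeds `k_values`.
        "k_values": list(config["k_range"]),
        "separation_method": config["separation_method"],
    }


settings = {
    "metrics": ["trustworthiness", "continuity"],
    "k_range": [5, 10],
    "selection_metric": "trustworthiness",  # ranking-only field
    "selection_k": 10,  # ranking-only field
    "tie_breakers": ["continuity"],  # ranking-only field
    "separation_method": "centroid",
}
print(sorted(score_kwargs(settings)))
# ['k_values', 'metrics', 'separation_method']
```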

coco_pipe.dim_reduction.config.DEFAULT_EVAL_GROUP_COL: str = 'patient_group_id'#

Default grouping column used when building post-hoc eval specs.

coco_pipe.dim_reduction.config.MISSING_EVAL_VALUES: frozenset[str]#

Label/group values that are treated as missing during eval alignment.

coco_pipe.dim_reduction.config.parse_eval_specs(raw_specs, subject_col)#

Parse raw eval spec input into a validated list of spec dicts.

raw_specs may be:

  • None, which returns an empty list

  • a list of spec dicts

  • a dict with an "evals" key whose value is the list

Each spec dict must have at least "name" and "target_col" keys. Optional keys: "group_col" (defaults to DEFAULT_EVAL_GROUP_COL), "filters" (list of {column, values} dicts), "label_map" (string→string mapping).

Parameters:
  • raw_specs (Any | None) – Raw YAML/JSON eval spec input.

  • subject_col (str) – Subject identifier column name (reserved for future alignment checks).

Returns:

Normalised eval spec dicts, ready for use in coco_pipe.dim_reduction.pipeline.run_eval().

Return type:

list of dict

Raises:

ValueError – On structural violations (wrong type, missing required keys, …).
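
The accepted input shapes and required keys can be sketched with a small stand-in. `normalise_specs` is hypothetical and deliberately partial: it ignores `subject_col`, does not validate `"filters"` or `"label_map"`, and its error messages are invented. The example spec values are illustrative only.

```python
from typing import Any

# Value documented above for DEFAULT_EVAL_GROUP_COL.
DEFAULT_GROUP_COL = "patient_group_id"


def normalise_specs(raw_specs: Any) -> list[dict[str, Any]]:
    """Hypothetical sketch of the accepted shapes and required keys."""
    if raw_specs is None:
        return []
    if isinstance(raw_specs, dict):
        if "evals" not in raw_specs:
            raise ValueError('dict input must have an "evals" key')
        raw_specs = raw_specs["evals"]
    if not isinstance(raw_specs, list):
        raise ValueError("eval specs must be a list of dicts")

    specs = []
    for i, spec in enumerate(raw_specs):
        if not isinstance(spec, dict):
            raise ValueError(f"spec {i} is not a dict")
        missing = [key for key in ("name", "target_col") if key not in spec]
        if missing:
            raise ValueError(f"spec {i} is missing required keys: {missing}")
        # Fill the documented default; other optional keys pass through.
        specs.append({"group_col": DEFAULT_GROUP_COL, **spec})
    return specs


print(normalise_specs(None))  # []
print(normalise_specs({"evals": [{"name": "dx", "target_col": "diagnosis"}]}))
# [{'group_col': 'patient_group_id', 'name': 'dx', 'target_col': 'diagnosis'}]
```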