coco_pipe.dim_reduction.interpret_features

coco_pipe.dim_reduction.interpret_features(X, *, X_emb=None, model=None, analyses=None, feature_names=None, method_name='embedding', n_repeats=5, random_state=None)

Run one or more feature interpretation analyses.

Parameters:
  • X (np.ndarray) – Original input data.

  • X_emb (np.ndarray, optional) – Explicit embedding used by correlation-based analysis.

  • model (Any, optional) – Fitted reducer or model used by importance analyses.

  • analyses (sequence of {"correlation", "perturbation", "gradient"}, optional) – Analyses to compute. None defaults to ("correlation",).

  • feature_names (sequence of str, optional) – Feature names aligned with the columns of X. Required when a requested analysis returns feature-keyed outputs.

  • method_name (str, default="embedding") – Display name written into the returned analysis records.

  • n_repeats (int, default=5) – Number of permutations per feature for perturbation importance.

  • random_state (int, optional) – Random seed for perturbation importance.

Returns:

Dictionary with keys:

  • analysis: nested analysis payloads

  • records: tidy analysis records as list[dict]

Return type:

dict

Raises:

ValueError – If a requested analysis is unsupported, is missing required inputs, or lacks required feature names.

Notes

This function is a pure interpretation backend for manager, report, or visualization workflows. It does not fit models, compute embeddings, or mutate reducer state.

See also

correlate_features

Feature-to-dimension interpretation from explicit embeddings.

perturbation_importance

Model-agnostic importance based on shuffled features.

gradient_importance

Encoder saliency for supported torch-based reducers.
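
The idea behind importance "based on shuffled features" can be sketched in plain NumPy. This is a minimal illustration and not the implementation of perturbation_importance: the function name permutation_importance_sketch, the FirstTwoColumns reducer, and the choice of mean squared embedding displacement as the score are all assumptions made for the example. The n_repeats and random_state arguments mirror the parameters documented above only by analogy.

```python
import numpy as np


def permutation_importance_sketch(model, X, n_repeats=5, random_state=None):
    """Score each feature by how far the embedding moves when it is shuffled.

    Hypothetical helper for illustration; the scoring rule (mean squared
    displacement of the embedding) is an assumption, not the library's.
    """
    rng = np.random.default_rng(random_state)
    baseline = model.transform(X)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        displacements = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle one column only, breaking its link to the other features.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            shifted = model.transform(X_perm)
            displacements.append(np.mean((shifted - baseline) ** 2))
        scores[j] = np.mean(displacements)
    return scores


class FirstTwoColumns:
    """Hypothetical reducer that keeps only the first two features."""

    def transform(self, X):
        return X[:, :2]


X = np.random.default_rng(0).normal(size=(50, 4))
scores = permutation_importance_sketch(
    FirstTwoColumns(), X, n_repeats=3, random_state=0
)
```

Because the toy reducer ignores the last two columns, shuffling them leaves the embedding unchanged and their scores are exactly zero, while the first two columns receive positive scores. The sketch only needs a fitted object exposing transform, which is what makes this kind of analysis model-agnostic.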

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import interpret_features
>>> class MockReducer:
...     def transform(self, X):
...         return X[:, :2]
>>> X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0]])
>>> X_emb = X[:, :2]
>>> result = interpret_features(
...     X,
...     X_emb=X_emb,
...     model=MockReducer(),
...     analyses=["correlation", "perturbation"],
...     feature_names=["f1", "f2"],
...     n_repeats=1,
...     random_state=0,
... )
>>> sorted(result)
['analysis', 'records']
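
For intuition about what a correlation-based analysis relates, the sketch below computes the Pearson correlation between every input feature and every embedding dimension for the same X and X_emb as in the example above. It is an illustration under stated assumptions rather than the behavior of correlate_features: the helper name correlate_sketch and the choice of Pearson correlation are hypothetical, and the structure of the library's returned payload is not shown here.

```python
import numpy as np


def correlate_sketch(X, X_emb):
    """Pearson correlation of each feature (rows) with each embedding
    dimension (columns). Hypothetical helper for illustration only."""
    Xc = X - X.mean(axis=0)
    Ec = X_emb - X_emb.mean(axis=0)
    cov = Xc.T @ Ec
    norms = np.outer(np.linalg.norm(Xc, axis=0), np.linalg.norm(Ec, axis=0))
    # Constant columns have zero norm; report zero correlation for them.
    return np.divide(cov, norms, out=np.zeros_like(cov), where=norms > 0)


X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0]])
X_emb = X[:, :2]
corr = correlate_sketch(X, X_emb)
# corr is approximately the 2x2 identity matrix: each feature matches
# "its" embedding dimension, and the two features are uncorrelated.
```

The result has shape (n_features, n_dimensions), so a row can be paired with an entry of feature_names when feature-keyed output is wanted.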