Advanced Topics
Custom Reducers
coco_pipe.dim_reduction is designed to be extended. To add a new reducer,
subclass BaseReducer, declare your
capabilities, implement fit / transform, and (optionally) register
the reducer with the method registry. The new reducer then works through
DimReduction with no further wiring.
—
1. Minimal Reducer
from sklearn.decomposition import PCA

from coco_pipe.dim_reduction import BaseReducer


class CustomPCAReducer(BaseReducer):
    @property
    def capabilities(self):
        caps = super().capabilities
        caps.update({"is_linear": True, "has_components": True})
        return caps

    def fit(self, X, y=None):
        self.model = PCA(n_components=self.n_components, **self.params)
        self.model.fit(X)
        return self

    def transform(self, X):
        return self.model.transform(X)
That’s enough to make the new reducer plug into the manager:
from coco_pipe.dim_reduction import DimReduction


class CustomPCAManager(DimReduction):
    pass  # only needed if you also want a typed config; see below


# X is your data array of shape (n_samples, n_features).
reducer = CustomPCAReducer(n_components=2)
embedding = reducer.fit_transform(X)
The manager itself is happy to wrap any BaseReducer instance once
registered.
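For readers without the package at hand, the same subclassing pattern can be sketched in a self-contained form. Everything below is a stand-in: `SketchBaseReducer` and `TopVarianceReducer` are hypothetical names, and the base class only mimics the pieces the example above relies on (`n_components`, `params`, `capabilities`, `fit_transform`). It is not the real `BaseReducer`.

```python
from statistics import pvariance


class SketchBaseReducer:
    """Hypothetical stand-in for a reducer base class (not coco_pipe's)."""

    def __init__(self, n_components=2, **params):
        self.n_components = n_components
        self.params = params

    @property
    def capabilities(self):
        # Return a fresh dict each time so subclasses can update it safely.
        return {"is_linear": False, "has_components": False, "input_ndim": 2}

    def fit(self, X, y=None):
        raise NotImplementedError

    def transform(self, X):
        raise NotImplementedError

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


class TopVarianceReducer(SketchBaseReducer):
    """Toy reducer: keep the n_components highest-variance columns."""

    @property
    def capabilities(self):
        caps = super().capabilities
        caps.update({"is_linear": True})
        return caps

    def fit(self, X, y=None):
        columns = list(zip(*X))  # transpose rows into columns
        variances = [pvariance(col) for col in columns]
        ranked = sorted(range(len(columns)), key=lambda j: variances[j], reverse=True)
        self.selected_ = sorted(ranked[: self.n_components])
        return self

    def transform(self, X):
        return [[row[j] for j in self.selected_] for row in X]


X = [[1.0, 10.0, 0.0], [1.0, 20.0, 5.0], [1.0, 30.0, 0.0], [1.0, 40.0, 5.0]]
reducer = TopVarianceReducer(n_components=2)
embedding = reducer.fit_transform(X)  # drops the constant first column
```

The point of the sketch is the shape of the contract: `capabilities` extends the parent's dictionary, `fit` stores learned state and returns `self`, and `transform` only reads that state.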
—
2. Capabilities Contract
The manager and evaluator inspect BaseReducer.capabilities to know what
your reducer can do. Common flags:
| Flag | Meaning |
|---|---|
| … | Set … |
| … | Set … |
| … | Set … |
| input_ndim | Expected input rank. Defaults to 2. |
| … | … |
Declaring capabilities accurately lets the evaluator skip incompatible metrics early and lets the viz layer enable / disable diagnostic plots.
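One way to picture what "skip incompatible metrics early" could look like is a sketch in which each metric lists the capability flags it needs. The metric names, the `METRIC_REQUIREMENTS` mapping, and `split_metrics` are hypothetical and do not describe the evaluator's actual implementation.

```python
# Hypothetical mapping: metric name -> capability flags it depends on.
METRIC_REQUIREMENTS = {
    "trustworthiness": [],
    "explained_variance": ["has_components"],
    "out_of_sample_error": ["has_transform"],
}


def split_metrics(capabilities, requirements=METRIC_REQUIREMENTS):
    """Partition metrics into runnable ones and skipped ones with a reason."""
    runnable, skipped = [], {}
    for metric, needed in requirements.items():
        # A flag that is absent from the dict is treated as unsupported.
        missing = [flag for flag in needed if not capabilities.get(flag, False)]
        if missing:
            skipped[metric] = "incompatible: missing " + ", ".join(missing)
        else:
            runnable.append(metric)
    return runnable, skipped


caps = {"is_linear": True, "has_components": True, "has_transform": False}
runnable, skipped = split_metrics(caps)
```

Under this scheme, a flag that is declared too optimistically lets a metric run and fail later, which is why accurate declarations matter.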
—
3. Non-Standard Input Shapes
If your reducer consumes (n_features, n_snapshots) (like DMD) or higher-rank
tensors, declare it:
class CustomDMDReducer(BaseReducer):
    @property
    def capabilities(self):
        caps = super().capabilities
        caps.update({
            "is_linear": True,
            "input_ndim": 2,
            "input_layout": "snapshots_x_features",
        })
        return caps
The manager’s _validate_input step will use these flags to fail fast on
shape mismatches.
—
4. Heavy Optional Dependencies
Keep heavy imports inside fit / transform so importing the reducer
module stays lightweight:
from coco_pipe.utils import import_optional_dependency


class CustomTorchReducer(BaseReducer):
    def fit(self, X, y=None):
        torch = import_optional_dependency(
            lambda: __import__("torch"),
            feature="CustomTorchReducer",
            dependency="torch",
            install_hint="pip install coco-pipe[topology]",
        )
        # ... build and train your torch model ...
        return self
import_optional_dependency raises a clear, actionable error if the
dependency is missing.
—
5. Registering a New Method
To use your reducer through DimReduction("MyMethod") rather than passing the
reducer instance directly, register it in the method registry. The registry is
intentionally module-local
(coco_pipe.dim_reduction.config._METHOD_REGISTRY), so a downstream package
should expose a helper that:
1. Imports the dotted path to the reducer class.
2. Adds "MyMethod": (module_path, "MyReducer") to the registry.
For a one-process script, prefer passing the reducer instance to
DimReduction directly.
—
6. Typed Config (Optional but Recommended)
Pair the reducer with a pydantic config so it benefits from strict validation:
from typing import Literal

from pydantic import Field

from coco_pipe.dim_reduction.config import (
    BaseReducerConfig,
    StochasticReducerConfig,
)


class CustomPCAConfig(BaseReducerConfig, StochasticReducerConfig):
    method: Literal["CustomPCA"] = "CustomPCA"
    whiten: bool = Field(False, description="Whiten projected components.")
Now you can do:
reducer = DimReduction(CustomPCAConfig(n_components=2, whiten=True))
—
7. Testing Checklist
Before treating a custom reducer as production-ready, verify:
- fit returns self and is idempotent on repeated calls with the same data + parameters.
- transform works on new samples when the method supports it (capabilities["has_transform"]).
- capabilities reflects reality, particularly input_ndim and has_components.
- DimReduction(reducer).score(embedding, X=X) produces a non-empty metric_records_ and no metrics report "reason: incompatible".
- If you registered a method name, DimReduction("MyMethod") round-trips through your config.
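The first three checks could be automated along these lines. This is a sketch: `check_reducer_contract` is a hypothetical helper, and `FirstColumnsReducer` is a toy stand-in so the block runs on its own; a real test would construct your reducer instead. The `score` and registry checks depend on `DimReduction` and are left out.

```python
class FirstColumnsReducer:
    """Toy stand-in reducer: keeps the first n_components columns."""

    def __init__(self, n_components=2):
        self.n_components = n_components

    @property
    def capabilities(self):
        return {"has_transform": True, "has_components": False, "input_ndim": 2}

    def fit(self, X, y=None):
        self.n_features_in_ = len(X[0])
        return self

    def transform(self, X):
        return [row[: self.n_components] for row in X]


def check_reducer_contract(make_reducer, X, X_new):
    """Assert the fit / transform / capabilities expectations from the checklist."""
    reducer = make_reducer()
    assert reducer.fit(X) is reducer, "fit must return self"

    # Idempotence: refitting on the same data gives the same embedding.
    # (With float outputs, compare with a tolerance instead of ==.)
    first = reducer.transform(X)
    second = reducer.fit(X).transform(X)
    assert first == second, "refit on identical data changed the embedding"

    # Out-of-sample transform, only when the reducer claims support.
    if reducer.capabilities.get("has_transform", False):
        out = reducer.transform(X_new)
        assert len(out) == len(X_new)
        assert all(len(row) == reducer.n_components for row in out)
    return True


X = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
X_new = [[0, 0, 0], [1, 1, 1]]
ok = check_reducer_contract(lambda: FirstColumnsReducer(n_components=2), X, X_new)
```

Passing a factory rather than an instance lets the same helper run against several parameter settings without state leaking between checks.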
Optional Dependencies
The dim-reduction module is built so that the base install does not require
any heavy ML libraries. Reducers that need torch, umap-learn,
pydmd, etc. import them lazily inside fit / transform. Optional
extras unlock specific reducer families.
—
1. What Stays Lightweight
import coco_pipe.dim_reduction only pulls in:
- numpy, scipy, pandas, scikit-learn
- pydantic (for configs)

The following submodules are also import-light:

- coco_pipe.io
- coco_pipe.report
- coco_pipe.viz (matplotlib + plotly; plotly is imported lazily)

So a script that uses only PCA / Isomap / TSNE / MDS / LLE / SpectralEmbedding doesn’t need any optional installs.
—
2. Extras and What They Unlock
| Extra | Unlocks |
|---|---|
| … | Umbrella extra: … |
| … | Same as … |
| … | … |
| … | … |
| … | … |
| … | … |
| … | … |
| [eeg] | MNE-Python; used by … |

Install with pip:
pip install coco-pipe[dim-red]
pip install coco-pipe[dask,topology]
pip install coco-pipe[neighbor,parametric-umap,ivis]
—
3. Choosing What to Install
- You only need PCA / Isomap / TSNE / MDS / LLE / SpectralEmbedding → base install. No extras needed.
- You need fast non-linear dimensionality reduction → [dim-red].
- Your data is on Dask arrays → [dask].
- You want a parametric reducer for transferring to new samples → [parametric-umap] (preferred) or [ivis].
- You want topology-regularized embeddings → [topology].
- You're working with EEG topographic maps → add [eeg] for MNE.

For exploration across many methods, [dim-red] is the most common starting
point.
—
4. Failure Mode When an Extra is Missing
Optional reducers are imported the first time you instantiate them. If the underlying library is missing, you’ll see a structured error:
ImportError: PHATE requires the 'phate' package.
Install with: pip install coco-pipe[dim-red]
The same pattern applies to interpretation backends:
gradient_importance() raises a similar error when torch is missing.
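If a script should degrade gracefully instead of stopping on that error, one option is to probe for the backing package before choosing a method. The sketch below uses only `importlib.util.find_spec` from the standard library; the `pick_method` helper and the method-to-module mapping are illustrative assumptions, not part of the package.

```python
from importlib.util import find_spec

# Illustrative mapping: method name -> top-level module it needs.
BACKING_MODULE = {
    "PHATE": "phate",
    "UMAP": "umap",
    "PCA": "sklearn",
}


def is_available(method, mapping=BACKING_MODULE):
    """True if the module backing `method` can be found without importing it."""
    return find_spec(mapping[method]) is not None


def pick_method(preferred, fallback, mapping=BACKING_MODULE):
    """Return `preferred` when its backend is installed, else `fallback`."""
    return preferred if is_available(preferred, mapping) else fallback


# Uses PHATE when the 'phate' package is installed, otherwise PCA.
method = pick_method("PHATE", fallback="PCA")
```

Because `find_spec` only locates the module, the probe stays cheap and does not trigger the heavy import itself.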
—
5. import_optional_dependency
The helper used internally is exposed for custom reducers:
from coco_pipe.utils import import_optional_dependency
torch = import_optional_dependency(
    lambda: __import__("torch"),
    feature="my_custom_reducer",
    dependency="torch",
    install_hint="pip install coco-pipe[topology]",
)
It centralizes the “raise a clear, actionable error when an optional dependency is missing” pattern. See Custom Reducers for how to use it inside a reducer's fit / transform.
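To make the pattern concrete, here is a minimal sketch of what a helper with this call shape might do internally. `require_optional` is a hypothetical re-implementation written for illustration; the real `import_optional_dependency` may differ in its checks and message format.

```python
from importlib import import_module


def require_optional(importer, *, feature, dependency, install_hint):
    """Run `importer`; on failure raise an ImportError that says how to fix it."""
    try:
        return importer()
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the '{dependency}' package. "
            f"Install with: {install_hint}"
        ) from exc


# Succeeds: json ships with Python.
json_mod = require_optional(
    lambda: import_module("json"),
    feature="demo",
    dependency="json",
    install_hint="(bundled with Python)",
)

# Fails with an actionable message: the module name below does not exist.
try:
    require_optional(
        lambda: import_module("not_a_real_module_xyz"),
        feature="my_custom_reducer",
        dependency="not_a_real_module_xyz",
        install_hint="pip install not-a-real-module-xyz",
    )
except ImportError as exc:
    message = str(exc)
```

Taking a callable instead of a module name defers the import until the helper runs, and chaining with `from exc` keeps the original traceback available for debugging.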
—
6. PaCMAP and Nearest-Neighbor Backends
PaCMAP supports multiple NN backends. The [neighbor] and [dim-red]
extras include faiss-cpu, so PaCMAP’s default
nn_backend="faiss" works out of the box on supported platforms. To force
a different backend:
reducer = DimReduction("Pacmap", n_components=2, nn_backend="annoy")
Recent PaCMAP versions accept "faiss", "annoy", and "voyager".
—
7. Recommended Install Profiles
| Profile | Install |
|---|---|
| Quick exploration | … |
| Standard scientific work | … |
| Distributed / out-of-core | … |
| Deep learning / parametric | … |
| Everything | … |