coco_pipe.dim_reduction.evaluation.MethodSelector
- class coco_pipe.dim_reduction.evaluation.MethodSelector(reducers)
Bases: object
Compare and rank already-scored dimensionality reduction methods.
MethodSelector is intentionally post-hoc. It does not fit reducers or compute embeddings. Each reducer must already be a scored coco_pipe.dim_reduction.DimReduction instance with cached metric_records_.
- Parameters:
reducers (dict or list of DimReduction) – Scored coco_pipe.dim_reduction.DimReduction objects to compare. Lists are converted to a method-keyed mapping using reducer.method.
- Variables:
See also
evaluate_embedding: Pure evaluator used upstream by DimReduction.score.
coco_pipe.dim_reduction.core.DimReduction.score: Scores a fitted reduction and populates the records consumed here.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import DimReduction
>>> from coco_pipe.dim_reduction.evaluation import MethodSelector
>>> X = np.random.RandomState(0).randn(30, 4)
>>> reducers = [
...     DimReduction("PCA", n_components=2),
...     DimReduction("Isomap", n_components=2, n_neighbors=5),
... ]
>>> for reducer in reducers:
...     embedding = reducer.fit_transform(X)
...     reducer.score(embedding, X=X, k_values=[5])
>>> selector = MethodSelector(reducers).collect()
>>> frame = selector.to_frame()
>>> not frame.empty
True
- classmethod from_records(records)
Create a selector directly from long-form metric records.
- Parameters:
- Return type:
- classmethod from_frame(frame)
Create a selector directly from a metric-record DataFrame.
- Parameters:
frame (DataFrame)
- Return type:
- collect()
Collect cached metric records from already-scored reducers.
- Returns:
The selector populated with comparison-ready metric records.
- Return type:
- Raises:
ValueError – If a reducer has not been scored yet.
See also
coco_pipe.dim_reduction.core.DimReduction.score: Populates the metric_records_ consumed by this method.
to_frame: Materialize the collected long-form records as a DataFrame.
Notes
collect() does not fit reducers or recompute evaluation metrics. It only gathers cached metric observations from reducers that were already scored explicitly.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import DimReduction
>>> from coco_pipe.dim_reduction.evaluation import MethodSelector
>>> X = np.random.RandomState(0).randn(20, 4)
>>> reducer = DimReduction("PCA", n_components=2)
>>> embedding = reducer.fit_transform(X)
>>> reducer.score(embedding, X=X, k_values=[5])
>>> selector = MethodSelector([reducer]).collect()
>>> len(selector.metric_records_) > 0
True
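The post-hoc contract described for collect() can be sketched without the library. The block below is a minimal illustration, not the library's implementation: `FakeReducer`, `as_method_mapping`, and `collect_records` are hypothetical names, and the record keys are assumed to follow the columns documented for to_frame(). It shows a list being keyed by `reducer.method` and a `ValueError` being raised when a reducer carries no cached records.

```python
class FakeReducer:
    """Hypothetical stand-in for a scored DimReduction object."""

    def __init__(self, method, metric_records=None):
        self.method = method
        # Only scored reducers carry cached records in this sketch.
        if metric_records is not None:
            self.metric_records_ = metric_records


def as_method_mapping(reducers):
    """Return a method-keyed mapping; lists are keyed by reducer.method."""
    if isinstance(reducers, dict):
        return dict(reducers)
    return {reducer.method: reducer for reducer in reducers}


def collect_records(reducers):
    """Gather cached records only; nothing is fitted or re-scored here."""
    collected = []
    for name, reducer in as_method_mapping(reducers).items():
        records = getattr(reducer, "metric_records_", None)
        if not records:
            raise ValueError(f"Reducer {name!r} has not been scored yet.")
        # Tag every record with the method name it belongs to.
        collected.extend(dict(record, method=name) for record in records)
    return collected


# Illustrative values only; they are not real evaluation results.
pca = FakeReducer(
    "PCA",
    [{"metric": "trustworthiness", "value": 0.9, "scope": "k", "scope_value": 5}],
)
unscored = FakeReducer("Isomap")
records = collect_records([pca])
```

Passing `unscored` alongside `pca` would raise `ValueError` in this sketch, mirroring the documented behaviour for reducers that were never scored.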
- to_frame()
Return the cached long-form metric table.
- Returns:
Tidy metric table with columns method, metric, value, scope, and scope_value.
- Return type:
Notes
This method only materializes a DataFrame at the public export boundary. Internally, MethodSelector stores metric records as plain Python dictionaries.
See also
collect: Gather cached metric records from scored reducers.
rank_methods: Rank reducers from the collected metric table.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import DimReduction
>>> from coco_pipe.dim_reduction.evaluation import MethodSelector
>>> X = np.random.RandomState(0).randn(20, 4)
>>> reducer = DimReduction("PCA", n_components=2)
>>> embedding = reducer.fit_transform(X)
>>> reducer.score(embedding, X=X, k_values=[5])
>>> frame = MethodSelector([reducer]).collect().to_frame()
>>> set(["method", "metric", "value"]).issubset(frame.columns)
True
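To make the long-form layout concrete, the sketch below builds a few records as plain dictionaries with the documented keys (method, metric, value, scope, scope_value) and pivots them into one entry per method. The numeric values, the `"k"` scope label, the `"continuity"` metric name, and the `pivot_records` helper are assumptions for illustration only; they are not taken from the library.

```python
from collections import defaultdict

# One dictionary per observation: the "tidy" or long-form layout.
records = [
    {"method": "PCA", "metric": "trustworthiness", "value": 0.91,
     "scope": "k", "scope_value": 5},
    {"method": "PCA", "metric": "continuity", "value": 0.88,
     "scope": "k", "scope_value": 5},
    {"method": "Isomap", "metric": "trustworthiness", "value": 0.94,
     "scope": "k", "scope_value": 5},
]


def pivot_records(records):
    """Group long-form records into {method: {(metric, scope_value): value}}."""
    wide = defaultdict(dict)
    for record in records:
        column = (record["metric"], record["scope_value"])
        wide[record["method"]][column] = record["value"]
    return dict(wide)


wide = pivot_records(records)
```

Keeping one observation per record means methods that were scored on different metrics or neighborhood sizes can still share a single table, which is what makes later filtering by metric and k straightforward.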
- rank_methods(selection_metric, *, selection_k=None, tie_breakers=None)
Rank methods using one primary metric and optional tie-breakers.
- Parameters:
- Returns:
Ranked comparison table. The first row is the best-scoring method under the requested ranking policy.
- Return type:
- Raises:
ValueError – If the requested metrics are unsupported, unavailable in the cached records, or missing the requested selection_k observations.
Notes
Ranking is based on mean metric values per method. For k-scoped metrics, selection_k restricts comparison to a single neighborhood size when requested.
See also
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import DimReduction
>>> from coco_pipe.dim_reduction.evaluation import MethodSelector
>>> X = np.random.RandomState(0).randn(20, 4)
>>> reducers = [DimReduction("PCA", n_components=2)]
>>> reducer = reducers[0]
>>> embedding = reducer.fit_transform(X)
>>> reducer.score(embedding, X=X, k_values=[5])
>>> ranked = (
...     MethodSelector(reducers)
...     .collect()
...     .rank_methods(
...         "trustworthiness",
...         selection_k=5,
...     )
... )
>>> ranked.iloc[0]["method"] == reducer.method
True
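The ranking policy in the Notes for rank_methods (mean metric value per method, optionally restricted to one neighborhood size) can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: `rank_by_mean` and its `higher_is_better` flag are hypothetical, tie-breakers are omitted, and the record values are made up to show how `selection_k` can change the ordering.

```python
from statistics import mean


def rank_by_mean(records, selection_metric, selection_k=None,
                 higher_is_better=True):
    """Rank methods by their mean value of one metric (hypothetical helper)."""
    per_method = {}
    for record in records:
        if record["metric"] != selection_metric:
            continue
        # Optionally keep only observations for a single neighborhood size.
        if selection_k is not None and record["scope_value"] != selection_k:
            continue
        per_method.setdefault(record["method"], []).append(record["value"])
    if not per_method:
        raise ValueError("No records match the requested metric and k.")
    means = ((method, mean(values)) for method, values in per_method.items())
    return sorted(means, key=lambda item: item[1], reverse=higher_is_better)


# Illustrative values only; they are not real evaluation results.
records = [
    {"method": "PCA", "metric": "trustworthiness", "value": 0.90,
     "scope": "k", "scope_value": 5},
    {"method": "PCA", "metric": "trustworthiness", "value": 0.80,
     "scope": "k", "scope_value": 10},
    {"method": "Isomap", "metric": "trustworthiness", "value": 0.85,
     "scope": "k", "scope_value": 5},
    {"method": "Isomap", "metric": "trustworthiness", "value": 0.95,
     "scope": "k", "scope_value": 10},
]

overall = rank_by_mean(records, "trustworthiness")
at_k5 = rank_by_mean(records, "trustworthiness", selection_k=5)
```

With these made-up numbers, averaging over both neighborhood sizes puts Isomap first, while restricting to k=5 puts PCA first, which is why fixing `selection_k` matters when methods were scored over several k values.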