coco_pipe.dim_reduction.evaluation.core
=======================================

.. py:module:: coco_pipe.dim_reduction.evaluation.core

.. autoapi-nested-parse::

   Evaluation Core
   ===============

   Pure evaluation orchestration for dimensionality-reduction workflows.

   This module contains the two public evaluation interfaces used by the
   dim-reduction stack:

   - ``evaluate_embedding(...)`` evaluates an explicit embedding and returns
     scalar metrics, scalar metadata, diagnostics, and tidy metric records.
   - ``MethodSelector`` compares and ranks multiple already-scored
     ``coco_pipe.dim_reduction.DimReduction`` objects without refitting or
     recomputing embeddings.

   The module is intentionally evaluation-only. It does not fit reducers,
   transform data, reconstruct 3D trajectory tensors from flat embeddings, or
   provide plotting methods. Reduction execution belongs to
   ``coco_pipe.dim_reduction.core.DimReduction`` and plotting belongs to
   ``coco_pipe.viz.dim_reduction``.

   Author: Hamza Abdelhedi (hamza.abdelhedi@umontreal.ca)


Classes
-------

.. autoapisummary::

   coco_pipe.dim_reduction.evaluation.core.MethodSelector


Functions
---------

.. autoapisummary::

   coco_pipe.dim_reduction.evaluation.core.evaluate_embedding


Module Contents
---------------

.. py:function:: evaluate_embedding(X_emb, X = None, method_name = 'embedding', metrics = None, labels = None, groups = None, times = None, quality_metadata = None, diagnostics = None, random_state = None, n_neighbors = 5, k_values = None, separation_method = None, config = None)

   Evaluate an already computed embedding.

   :param X_emb: Embedded data to evaluate.

                 - ``(n_samples, n_dims)`` triggers standard co-ranking and Shepard-style metrics.
                 - ``(n_trajectories, n_times, n_dims)`` triggers trajectory metrics.
   :type X_emb: np.ndarray
   :param X: Original data with shape ``(n_samples, n_features)``. Required when standard 2D metrics are requested.
   :type X: np.ndarray, optional
   :param method_name: Display name attached to tidy metric records.
   :type method_name: str, default="embedding"
   :param metrics: Metric selectors to compute. ``None`` computes all metrics available for the provided inputs.
   :type metrics: sequence of str, optional
   :param labels: Optional labels aligned with the embedding. Used by ``trajectory_separation`` for native 3D embeddings and by explicit supervised 2D metrics such as ``separation_logreg_balanced_accuracy`` when requested.
   :type labels: np.ndarray, optional
   :param groups: Optional grouping variable aligned with ``X_emb``. Required by ``separation_logreg_balanced_accuracy``.
   :type groups: np.ndarray, optional
   :param times: Optional trajectory time coordinates used for separation AUC integration when trajectory metrics are evaluated.
   :type times: np.ndarray, optional
   :param quality_metadata: Scalar quality metadata to attach to the evaluation payload.
   :type quality_metadata: dict, optional
   :param diagnostics: Precomputed diagnostics to carry through the evaluation payload.
   :type diagnostics: dict, optional
   :param random_state: Random state used for sampled Shepard distances.
   :type random_state: int, optional
   :param n_neighbors: Neighborhood size for single-score standard metrics.
   :type n_neighbors: int, default=5
   :param k_values: Neighborhood sizes for benchmark sweeps. Explicit values take precedence over ``config``.
   :type k_values: sequence of int, optional
   :param separation_method: Separation definition passed to ``trajectory_separation`` when trajectory labels are available. ``None`` defers to ``config`` and otherwise falls back to ``"centroid"``.
   :type separation_method: str, optional
   :param config: Typed evaluation configuration. Supplies ``metrics``, ``k_values`` (from ``config.k_range``), and ``separation_method`` for any of those left unset by an explicit argument. Explicit arguments always win.
   :type config: EvaluationConfig, optional

   :returns: Dictionary with these keys:

             - ``embedding`` : the evaluated embedding
             - ``metrics`` : scalar metric summaries
             - ``metadata`` : scalar descriptive metadata
             - ``diagnostics`` : array-like or structured diagnostics
             - ``records`` : tidy long-form metric records as ``list[dict]``
             - ``artifacts`` : copy of the diagnostics payload
   :rtype: dict

   :raises TypeError: If ``quality_metadata`` or ``diagnostics`` is not a dictionary.
   :raises ValueError: If ``X_emb`` is not 2D or 3D, or if standard 2D evaluation is requested without a compatible ``X``.

   .. rubric:: Notes

   This function is intentionally pure. It does not fit reducers, transform
   data, or inspect reducer internals. Callers are responsible for preparing
   ``X_emb`` and any optional metadata such as trajectory labels or times.

   .. seealso::

      :py:obj:`coco_pipe.dim_reduction.core.DimReduction.score`
          Manager-level wrapper that prepares inputs and stores the returned evaluation payload on a fitted ``coco_pipe.dim_reduction.DimReduction`` object.

      :py:obj:`MethodSelector`
          Post-hoc comparison and ranking across multiple scored reductions.

   .. rubric:: Examples

   Evaluate a standard 2D embedding:

   >>> import numpy as np
   >>> from coco_pipe.dim_reduction.evaluation.core import evaluate_embedding
   >>> X = np.random.RandomState(0).randn(20, 5)
   >>> X_emb = X[:, :2]
   >>> result = evaluate_embedding(X_emb, X=X, method_name="demo")
   >>> "metrics" in result and "records" in result
   True

   Evaluate a native trajectory embedding:

   >>> traj = np.random.RandomState(0).randn(4, 10, 2)
   >>> labels = np.array(["A", "A", "B", "B"])
   >>> result = evaluate_embedding(
   ...     traj,
   ...     method_name="traj",
   ...     metrics=["trajectory_speed", "trajectory_separation"],
   ...     labels=labels,
   ... )
   >>> "trajectory_speed_mean" in result["metrics"]
   True


.. py:class:: MethodSelector(reducers)

   Compare and rank already-scored dimensionality reduction methods.

   ``MethodSelector`` is intentionally post-hoc. It does not fit reducers or compute embeddings.
   Each reducer must already be a scored ``coco_pipe.dim_reduction.DimReduction`` instance with cached ``metric_records_``.

   :param reducers: Scored ``coco_pipe.dim_reduction.DimReduction`` objects to compare. Lists are converted to a method-keyed mapping using ``reducer.method``.
   :type reducers: dict or list of DimReduction

   :ivar reducers: Compared reductions keyed by method name.
   :vartype reducers: dict of str to DimReduction
   :ivar metric_records_: Cached long-form metric records populated by ``collect()``.
   :vartype metric_records_: list of dict

   .. seealso::

      :py:obj:`evaluate_embedding`
          Pure evaluator used upstream by ``DimReduction.score``.

      :py:obj:`coco_pipe.dim_reduction.core.DimReduction.score`
          Scores a fitted reduction and populates the records consumed here.

   .. rubric:: Examples

   >>> import numpy as np
   >>> from coco_pipe.dim_reduction import DimReduction
   >>> from coco_pipe.dim_reduction.evaluation.core import MethodSelector
   >>> X = np.random.RandomState(0).randn(30, 4)
   >>> reducers = [
   ...     DimReduction("PCA", n_components=2),
   ...     DimReduction("Isomap", n_components=2, n_neighbors=5),
   ... ]
   >>> for reducer in reducers:
   ...     embedding = reducer.fit_transform(X)
   ...     reducer.score(embedding, X=X, k_values=[5])
   >>> selector = MethodSelector(reducers).collect()
   >>> frame = selector.to_frame()
   >>> not frame.empty
   True


   .. py:attribute:: metric_records_
      :value: []


   .. py:method:: from_records(records)
      :classmethod:

      Create a selector directly from long-form metric records.


   .. py:method:: from_frame(frame)
      :classmethod:

      Create a selector directly from a metric-record DataFrame.


   .. py:method:: collect()

      Collect cached metric records from already-scored reducers.

      :returns: The selector populated with comparison-ready metric records.
      :rtype: MethodSelector

      :raises ValueError: If a reducer has not been scored yet.

      .. seealso::

         :py:obj:`coco_pipe.dim_reduction.core.DimReduction.score`
             Populates the ``metric_records_`` consumed by this method.

         :py:obj:`to_frame`
             Materialize the collected long-form records as a DataFrame.

      .. rubric:: Notes

      ``collect()`` does not fit reducers or recompute evaluation metrics. It
      only gathers cached metric observations from reducers that were already
      scored explicitly.

      .. rubric:: Examples

      >>> import numpy as np
      >>> from coco_pipe.dim_reduction import DimReduction
      >>> from coco_pipe.dim_reduction.evaluation.core import MethodSelector
      >>> X = np.random.RandomState(0).randn(20, 4)
      >>> reducer = DimReduction("PCA", n_components=2)
      >>> embedding = reducer.fit_transform(X)
      >>> reducer.score(embedding, X=X, k_values=[5])
      >>> selector = MethodSelector([reducer]).collect()
      >>> len(selector.metric_records_) > 0
      True


   .. py:method:: to_frame()

      Return the cached long-form metric table.

      :returns: Tidy metric table with columns ``method``, ``metric``, ``value``, ``scope``, and ``scope_value``.
      :rtype: pandas.DataFrame

      .. rubric:: Notes

      This method only materializes a DataFrame at the public export boundary.
      Internally, ``MethodSelector`` stores metric records as plain Python
      dictionaries.

      .. seealso::

         :py:obj:`collect`
             Gather cached metric records from scored reducers.

         :py:obj:`rank_methods`
             Rank reducers from the collected metric table.

      .. rubric:: Examples

      >>> import numpy as np
      >>> from coco_pipe.dim_reduction import DimReduction
      >>> from coco_pipe.dim_reduction.evaluation.core import MethodSelector
      >>> X = np.random.RandomState(0).randn(20, 4)
      >>> reducer = DimReduction("PCA", n_components=2)
      >>> embedding = reducer.fit_transform(X)
      >>> reducer.score(embedding, X=X, k_values=[5])
      >>> frame = MethodSelector([reducer]).collect().to_frame()
      >>> set(["method", "metric", "value"]).issubset(frame.columns)
      True


   .. py:method:: rank_methods(selection_metric, *, selection_k = None, tie_breakers = None)

      Rank methods using one primary metric and optional tie-breakers.

      :param selection_metric: Metric to optimize.
      :type selection_metric: str
      :param selection_k: Neighborhood size to compare for k-scoped metrics.
      :type selection_k: int, optional
      :param tie_breakers: Additional metrics used in order when primary values tie.
      :type tie_breakers: sequence of str, optional

      :returns: Ranked comparison table.
                The first row is the best-scoring method under the requested ranking policy.
      :rtype: pandas.DataFrame

      :raises ValueError: If the requested metrics are unsupported, unavailable in the cached records, or missing the requested ``selection_k`` observations.

      .. rubric:: Notes

      Ranking is based on mean metric values per method. When ``selection_k``
      is given, k-scoped metrics are compared at that single neighborhood size
      only.

      .. seealso::

         :py:obj:`collect`
             Gather cached metric observations before ranking.

         :py:obj:`to_frame`
             Inspect the underlying long-form metric observations directly.

         :py:obj:`coco_pipe.dim_reduction.core.DimReduction.score`
             Produces the metric records that feed into ranking.

      .. rubric:: Examples

      >>> import numpy as np
      >>> from coco_pipe.dim_reduction import DimReduction
      >>> from coco_pipe.dim_reduction.evaluation.core import MethodSelector
      >>> X = np.random.RandomState(0).randn(20, 4)
      >>> reducers = [DimReduction("PCA", n_components=2)]
      >>> reducer = reducers[0]
      >>> embedding = reducer.fit_transform(X)
      >>> reducer.score(embedding, X=X, k_values=[5])
      >>> ranked = (
      ...     MethodSelector(reducers)
      ...     .collect()
      ...     .rank_methods(
      ...         "trustworthiness",
      ...         selection_k=5,
      ...     )
      ... )
      >>> ranked.iloc[0]["method"] == reducer.method
      True
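
The ranking policy described in the Notes for ``rank_methods`` (mean metric value per method, optionally restricted to one neighborhood size) can be illustrated with plain record dictionaries. The sketch below is not the library's implementation. The helper name ``rank_by_mean``, the sample values, the ``"k"`` scope label, and the higher-is-better ordering are all assumptions made for illustration, and tie-breakers are left out.

```python
from collections import defaultdict

# Hypothetical tidy records using the documented column names.
# The values and the "k" scope label are made up for this sketch.
records = [
    {"method": "PCA", "metric": "trustworthiness", "value": 0.90,
     "scope": "k", "scope_value": 5},
    {"method": "PCA", "metric": "trustworthiness", "value": 0.70,
     "scope": "k", "scope_value": 10},
    {"method": "Isomap", "metric": "trustworthiness", "value": 0.80,
     "scope": "k", "scope_value": 5},
    {"method": "Isomap", "metric": "trustworthiness", "value": 0.95,
     "scope": "k", "scope_value": 10},
]


def rank_by_mean(records, selection_metric, selection_k=None):
    """Rank methods by mean metric value (assumes higher is better)."""
    values_per_method = defaultdict(list)
    for record in records:
        if record["metric"] != selection_metric:
            continue
        # When selection_k is given, keep only k-scoped rows at that k.
        if (
            selection_k is not None
            and record["scope"] == "k"
            and record["scope_value"] != selection_k
        ):
            continue
        values_per_method[record["method"]].append(record["value"])

    if not values_per_method:
        raise ValueError(f"No records found for metric {selection_metric!r}")

    means = {
        method: sum(values) / len(values)
        for method, values in values_per_method.items()
    }
    # Best-scoring method first.
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(rank_by_mean(records, "trustworthiness", selection_k=5))
    print(rank_by_mean(records, "trustworthiness"))
```

With these sample values, restricting to ``selection_k=5`` places PCA first, while averaging over both neighborhood sizes places Isomap first. The choice of ``selection_k`` can therefore change which method ranks best.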