Advanced Topics

Custom Estimators

coco_pipe.decoding is extensible. You can register any scikit-learn-compatible estimator with the registry to enable capability contracts, metric compatibility checks, and diagnostic reporting for your custom model.

—

1. Protocol Requirements

Any custom estimator used in coco_pipe.decoding must implement the DecoderEstimator protocol:

from coco_pipe.decoding.interfaces import DecoderEstimator

class MyCustomClassifier:
    def fit(self, X, y=None, **fit_params):
        # ... training logic
        return self

    def predict(self, X):
        # ... inference logic
        return y_pred

    def get_params(self, deep=True):
        return {}

    def set_params(self, **params):
        return self

assert isinstance(MyCustomClassifier(), DecoderEstimator)  # runtime check

For models that provide probability estimates, also implement predict_proba. For neural models with training diagnostics, implement the NeuralTrainable protocol (see Foundation Models).
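As a concrete illustration of the protocol above, the following is a minimal sketch of a nearest-centroid classifier that also provides predict_proba. The DecoderLike protocol is a local stand-in defined only so that the example runs without coco_pipe; it is an assumption about the shape of DecoderEstimator, not its actual definition. The softmax over negative distances is likewise an illustrative choice.

```python
from typing import Protocol, runtime_checkable

import numpy as np


@runtime_checkable
class DecoderLike(Protocol):
    """Hypothetical stand-in for DecoderEstimator (assumed method set)."""

    def fit(self, X, y=None, **fit_params): ...
    def predict(self, X): ...
    def get_params(self, deep=True): ...
    def set_params(self, **params): ...


class NearestCentroidClassifier:
    """Toy classifier: assigns each sample to the closest class centroid."""

    def __init__(self, temperature=1.0):
        self.temperature = temperature

    def fit(self, X, y=None, **fit_params):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def _distances(self, X):
        X = np.asarray(X, dtype=float)
        # Shape (n_samples, n_classes): Euclidean distance to each centroid.
        return np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)

    def predict_proba(self, X):
        # Softmax over negative distances; each row sums to 1.
        logits = -self._distances(X) / self.temperature
        logits -= logits.max(axis=1, keepdims=True)
        exp = np.exp(logits)
        return exp / exp.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmin(self._distances(X), axis=1)]

    def get_params(self, deep=True):
        return {"temperature": self.temperature}

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self


X_demo = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y_demo = np.array([0, 0, 1, 1])
clf = NearestCentroidClassifier().fit(X_demo, y_demo)
```

Note that a runtime isinstance check against a protocol only verifies that the methods exist, not that their signatures or return values are correct.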

—

2. Registering an Estimator

Register your estimator in ESTIMATOR_SPECS so it is discoverable by the capability checking system:

from coco_pipe.decoding._specs import ESTIMATOR_SPECS, EstimatorSpec

ESTIMATOR_SPECS["MyCustomClassifier"] = EstimatorSpec(
    name="MyCustomClassifier",          # matches the registry key and class name
    import_path="mypackage.models",     # module that exposes the class
    family="linear",                    # "linear", "tree_ensemble", "neural", ...
    task=("classification",),
    input_kinds=("tabular_2d",),        # or ("epoched",) for 3D temporal models
    supports_proba=True,
    supports_decision_function=False,
    supports_calibration=True,
    feature_selection=("univariate", "sfs"),  # () if unsupported
    importance=("coefficients",),       # or ("feature_importances",), () if none
    dependency_extra="core",
)

2.1 EstimatorSpec Fields

Field	Description
`name`	Must match the registry key and the class name.
`import_path`	Full dotted import path to the class.
`family`	`"classical"` \| `"tree_ensemble"` \| `"boosting"` \| `"neural"` \| `"dummy"`
`task`	Tuple of `"classification"` and/or `"regression"`, e.g. `("classification",)`.
`input_kinds`	`("tabular_2d",)` for 2D arrays; `("epoched",)` for 3D temporal.
`supports_proba` / `supports_decision_function`	Which prediction interfaces the model provides.
`feature_selection`	Tuple of supported methods, e.g. `("univariate", "sfs")`; `()` if none.
`supports_calibration`	Whether `CalibratedClassifierCV` can wrap it.
`importance`	Tuple of importance kinds, e.g. `("coefficients",)` / `("feature_importances",)`.
`default_search_space`	Optional dict of hyperparameter name → list of values for tuning.
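To make the role of these fields concrete, here is a minimal sketch of how a spec could drive a metric compatibility check. ToySpec, METRIC_REQUIREMENTS, and check_metric are hypothetical names invented for this illustration; the real EstimatorSpec and the checks performed by coco_pipe.decoding may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySpec:
    """Hypothetical, reduced stand-in for EstimatorSpec."""

    name: str
    task: tuple
    supports_proba: bool = False
    supports_decision_function: bool = False


# Assumed mapping: which prediction interface each metric needs.
METRIC_REQUIREMENTS = {
    "accuracy": "predict",
    "roc_auc": "score",    # needs probabilities or decision scores
    "log_loss": "proba",   # needs probabilities
}


def check_metric(spec, metric, task):
    """Return a list of human-readable problems; empty means compatible."""
    problems = []
    if task not in spec.task:
        problems.append(f"{spec.name} does not support task {task!r}")
    need = METRIC_REQUIREMENTS.get(metric)
    if need is None:
        problems.append(f"unknown metric {metric!r}")
    elif need == "proba" and not spec.supports_proba:
        problems.append(f"{metric} requires predict_proba")
    elif need == "score" and not (
        spec.supports_proba or spec.supports_decision_function
    ):
        problems.append(f"{metric} requires predict_proba or decision_function")
    return problems


spec = ToySpec(name="MyCustomClassifier", task=("classification",))
issues = check_metric(spec, "roc_auc", "classification")
```

Returning a list of problems rather than raising on the first one lets a caller report every incompatibility in a configuration at once.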

—

3. Using the Custom Estimator in an Experiment

After registration, use the estimator name in ClassicalModelConfig:

from coco_pipe.decoding.configs import ExperimentConfig, ClassicalModelConfig, CVConfig

config = ExperimentConfig(
    task="classification",
    models={
        "my_model": ClassicalModelConfig(
            estimator="MyCustomClassifier",
            params={"my_param": 1.0},
        )
    },
    metrics=["accuracy"],
    cv=CVConfig(strategy="stratified", n_splits=5),
)

# Experiment is assumed to be imported already; X and y are the feature array and labels.
result = Experiment(config).run(X, y)

—

4. Custom Feature Importances

If your custom model exposes feature importances under a non-standard attribute, set importance_attr in its spec to the name of that attribute:

import numpy as np

class MyCustomClassifier:
    def fit(self, X, y=None, **fit_params):
        # Store per-feature importances on the fitted instance.
        self.importances_ = self._compute_importances(X, y)
        return self

    def _compute_importances(self, X, y):
        # custom importance logic (placeholder: uniform importances)
        return np.ones(X.shape[1])

ESTIMATOR_SPECS["MyCustomClassifier"] = EstimatorSpec(
    ...,
    importance_attr="importances_",   # attribute name, read after .fit()
)

coco_pipe.decoding._engine.extract_feature_importances reads importance_attr via getattr(fitted_model, importance_attr) and handles 1D arrays (feature importance vectors) and 2D arrays (class-specific weights like coef_).
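The attribute lookup described above can be sketched as follows. This is not the library's implementation: extract_importances and FittedStub are hypothetical names, and reducing a 2D array to one value per feature by taking the mean absolute weight across classes is an assumption made for the example.

```python
import numpy as np


def extract_importances(fitted_model, importance_attr):
    """Read an importance attribute and reduce it to one value per feature."""
    values = np.asarray(getattr(fitted_model, importance_attr), dtype=float)
    if values.ndim == 1:
        # Already a per-feature vector, e.g. feature_importances_.
        return values
    if values.ndim == 2:
        # Assumed reduction: mean absolute weight across classes (rows),
        # as for a coef_ array of shape (n_classes, n_features).
        return np.abs(values).mean(axis=0)
    raise ValueError(f"expected a 1D or 2D array, got {values.ndim}D")


class FittedStub:
    """Hypothetical fitted model exposing both kinds of attribute."""

    importances_ = np.array([0.2, 0.5, 0.3])
    coef_ = np.array([[1.0, -2.0, 0.0], [-3.0, 0.0, 1.0]])


vector = extract_importances(FittedStub(), "importances_")
reduced = extract_importances(FittedStub(), "coef_")
```

Because the lookup goes through getattr, a misspelled importance_attr surfaces as an AttributeError only after fitting, so it is worth exercising the spec once on a small dataset.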

Reproducibility Architecture

coco_pipe.decoding is designed so that runs with the same configuration, data, and software environment produce identical results, apart from the known non-determinism sources listed at the end of this section. This section documents how random seeds are propagated, where they appear in the result schema, and how to validate reproducibility.

—

1. Seed Propagation via SeedSequence

Setting ExperimentConfig.random_state propagates derived seeds to every sub-component: pipeline stages receive fixed offsets from the master seed, and per-model seeds are spawned through NumPy's SeedSequence:

config = ExperimentConfig(
    task="classification",
    models={"lr": ClassicalModelConfig(estimator="LogisticRegression")},
    metrics=["accuracy"],
    cv=CVConfig(strategy="stratified", n_splits=5),
    random_state=42,    # master seed
)

Internally, Experiment._propagate_random_state() derives:

Component	Derived seed (offset from master)
`cv`	`master + 0`
`feature_selection`	`master + 1`
`tuning`	`master + 2`
`calibration`	`master + 3`
Per-model seeds	Spawned from `master + 4` via `SeedSequence.spawn`

The per-model seeds are ordered by model name (alphabetically) for determinism.

Note

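Changing the order of entries in the models dict does not change any model's seed, because seeds are assigned in alphabetical order of model name. Adding or removing a model can shift the alphabetical position of the models that sort after it, and may therefore change their seeds.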
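The derivation table above can be sketched with NumPy's SeedSequence. This is an illustration under stated assumptions, not the body of Experiment._propagate_random_state(): derive_seeds is a hypothetical helper, and converting each spawned child to an integer with generate_state(1) is an assumed detail.

```python
import numpy as np


def derive_seeds(master, model_names):
    """Derive component and per-model seeds from one master seed."""
    seeds = {
        "cv": master + 0,
        "feature_selection": master + 1,
        "tuning": master + 2,
        "calibration": master + 3,
    }
    # Sort names so that dict insertion order cannot affect the assignment.
    ordered = sorted(model_names)
    children = np.random.SeedSequence(master + 4).spawn(len(ordered))
    seeds["models"] = {
        name: int(child.generate_state(1)[0])
        for name, child in zip(ordered, children)
    }
    return seeds


seeds_ab = derive_seeds(42, ["lr", "svm"])
seeds_ba = derive_seeds(42, ["svm", "lr"])
seeds_abc = derive_seeds(42, ["lr", "svm", "xgb"])
```

In this sketch seeds_ab and seeds_ba are equal, and appending a name that sorts last ("xgb") leaves the seeds of "lr" and "svm" untouched, because each spawned child depends only on its index. A new name that sorts first would take index 0 and shift the others.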

—

2. What Is Seeded

The following stochastic components are seeded:

CV splitters: StratifiedKFold, StratifiedGroupKFold, KFold.
Hyperparameter search: RandomizedSearchCV uses tuning.random_state.
Model initialization: models with random_state parameters receive model-specific seeds.
Calibration: CalibratedClassifierCV uses calibration.random_state.
Bootstrap CI: get_bootstrap_confidence_intervals accepts random_state.
Permutation tests: ChanceAssessmentConfig.random_state seeds the null permutation engine.

Not seeded (intentionally):

Data loading and preprocessing outside the Experiment.run call.
Internal parallelism of MNE meta-estimators (joblib workers), whose scheduling order can vary between runs.
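To illustrate what accepting random_state means for one of the seeded components listed above, here is a minimal sketch of a seeded percentile bootstrap over per-fold scores. bootstrap_ci is a hypothetical function written for this example; it is not the signature or implementation of get_bootstrap_confidence_intervals.

```python
import numpy as np


def bootstrap_ci(scores, n_resamples=1000, alpha=0.05, random_state=None):
    """Percentile bootstrap confidence interval for the mean of `scores`."""
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(random_state)
    # Resample with replacement; each row is one bootstrap replicate.
    idx = rng.integers(0, scores.size, size=(n_resamples, scores.size))
    means = scores[idx].mean(axis=1)
    lower, upper = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


fold_scores = [0.81, 0.79, 0.84, 0.77, 0.82]
ci_first = bootstrap_ci(fold_scores, random_state=7)
ci_second = bootstrap_ci(fold_scores, random_state=7)
```

With the same random_state the two intervals are identical; with random_state=None each call draws fresh entropy and the interval bounds vary slightly between calls.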

—

3. Result Schema Provenance

Every ExperimentResult stores reproducibility metadata in result.meta["hardware_provenance"]:

print(result.meta["hardware_provenance"])
# {
#   "python_version": "3.11.12",
#   "sklearn_version": "1.6.1",
#   "numpy_version": "1.26.4",
#   "platform": "macOS-14.5",
#   "n_jobs": 1,
#   "timestamp": "2026-05-14T04:30:00Z",
# }

This provenance is captured by get_environment_info() from coco_pipe at the time of Experiment.run.
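For readers who want to record comparable metadata outside an experiment, the following is a minimal standard-library sketch that produces a dictionary with the same keys as the example above. collect_environment_info is a hypothetical helper and makes no claim about how get_environment_info() is implemented.

```python
import platform
from datetime import datetime, timezone
from importlib import metadata

# Output key -> installed distribution name to look up.
PACKAGES = {"sklearn_version": "scikit-learn", "numpy_version": "numpy"}


def collect_environment_info(n_jobs=1):
    """Gather interpreter, platform, and package versions into a dict."""
    info = {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "n_jobs": n_jobs,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    for key, distribution in PACKAGES.items():
        try:
            info[key] = metadata.version(distribution)
        except metadata.PackageNotFoundError:
            info[key] = None  # package not installed in this environment
    return info


env = collect_environment_info()
```

Recording None for a missing package, rather than raising, keeps the helper usable in minimal environments while still making the gap visible in the stored metadata.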

—

4. Validating Reproducibility

To verify that two runs produce identical results:

import pandas as pd

# Run A
result_a = Experiment(config).run(X, y, sample_metadata=meta)

# Run B (identical config and data)
result_b = Experiment(config).run(X, y, sample_metadata=meta)

scores_a = result_a.get_detailed_scores()
scores_b = result_b.get_detailed_scores()

pd.testing.assert_frame_equal(
    scores_a.sort_values(["Model", "Fold", "Metric"]).reset_index(drop=True),
    scores_b.sort_values(["Model", "Fold", "Metric"]).reset_index(drop=True),
    check_exact=True,  # compare floats exactly instead of within a tolerance
)
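When the two runs happen in different sessions or on different machines, a digest of the score values can be stored with one run and compared against the other later. A minimal sketch, assuming the scores are available as a numeric NumPy array; array_fingerprint is a hypothetical helper, not part of coco_pipe.

```python
import hashlib

import numpy as np


def array_fingerprint(values):
    """SHA-256 digest over the dtype, shape, and raw bytes of an array."""
    arr = np.ascontiguousarray(values)
    digest = hashlib.sha256()
    digest.update(str(arr.dtype).encode())
    digest.update(str(arr.shape).encode())
    digest.update(arr.tobytes())
    return digest.hexdigest()


scores_run_a = np.array([0.81, 0.79, 0.84])
scores_run_b = scores_run_a.copy()
scores_drifted = scores_run_a + 1e-12  # within tolerance, but not bit-identical
```

The digests of scores_run_a and scores_run_b match, while scores_drifted yields a different digest even though np.allclose would treat it as equal. This is the distinction between exact and tolerance-based comparison.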

—

5. Known Non-Determinism Sources

Some operations may produce slightly different results even with the same seed:

Parallel outer CV (n_jobs > 1): scikit-learn’s parallel backends can schedule workers in different orders between runs. For exact reproducibility, use n_jobs=1.
GPU operations (for foundation models with LoRA/QLoRA): some CUDA operations are non-deterministic unless torch.use_deterministic_algorithms(True) is set.
Third-party libraries that draw from Python's global random module (random.random()) or from OS entropy (os.urandom()), neither of which is controlled by ExperimentConfig.random_state.

For fully deterministic publication runs, set n_jobs=1 and document the exact library versions from result.meta["hardware_provenance"].
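As a complement to the settings above, global random number generators that third-party code may use can be seeded before a run. The sketch below is an assumption about how a user might do this; seed_everything is a hypothetical helper and is not provided by coco_pipe. The PyTorch calls are only made if PyTorch is installed.

```python
import os
import random

import numpy as np


def seed_everything(seed, deterministic_torch=True):
    """Seed global RNGs that third-party code may draw from."""
    random.seed(seed)
    np.random.seed(seed)  # legacy global NumPy RNG
    # Only affects subprocesses started after this point, not this process.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import torch
    except ImportError:
        return False  # PyTorch not installed; nothing more to do
    torch.manual_seed(seed)
    if deterministic_torch:
        # Raises at runtime for ops that lack a deterministic implementation.
        torch.use_deterministic_algorithms(True)
    return True


seed_everything(42)
first_draw = (random.random(), np.random.rand())
seed_everything(42)
second_draw = (random.random(), np.random.rand())
```

After reseeding, first_draw and second_draw are equal. Values obtained from os.urandom() cannot be made reproducible this way, since they come from the operating system's entropy source.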