coco_pipe.utils#

Shared package utilities.

This module holds small helpers that are not specific to one subpackage.

Attributes#

PACKAGE_VERSIONS

Functions#

stable_hash(value, *[, length])

Return a deterministic SHA-256 prefix for a JSON-compatible value.

import_optional_dependency(loader, feature, dependency[, install_hint])

Lazily import an optional dependency with clearer failure modes.

get_git_revision_hash([cwd])

Return the current short git hash, or "Unknown" when unavailable.

get_package_version(package_name)

Return an installed distribution version, or "Unknown".

get_environment_info([cwd, packages])

Capture runtime provenance metadata for reports and experiment results.

slug(value, *[, max_len])

Return a filesystem-safe slug from an arbitrary value.

resolve_n_jobs(n_jobs)

Resolve n_jobs to a concrete positive integer.

run_task_batch(tasks, worker_fn, max_workers)

Execute tasks with worker_fn, optionally in parallel.

Module Contents#

coco_pipe.utils.PACKAGE_VERSIONS: collections.abc.Mapping[str, str]#

coco_pipe.utils.stable_hash(value, *, length=64)#

Return a deterministic SHA-256 prefix for a JSON-compatible value.

Dictionaries are serialized with sorted keys and compact separators, so two dictionaries with the same contents hash identically regardless of key order. Values that are not directly JSON serializable, such as paths, fall back to str.

Parameters:
  • value (Any)

  • length (int)

Return type:

str
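
The serialization rules above can be sketched with the standard library alone. This is a hypothetical re-implementation for illustration, not the package's source: the name stable_hash_sketch and the choice of a hexadecimal digest are assumptions (the default length=64 matches the length of a full SHA-256 hex digest).

```python
import hashlib
import json
from pathlib import Path
from typing import Any


def stable_hash_sketch(value: Any, *, length: int = 64) -> str:
    """Illustrative stand-in for stable_hash (assumed behavior)."""
    # Sorted keys and compact separators make the payload independent of
    # dict insertion order; default=str covers values such as paths.
    payload = json.dumps(value, sort_keys=True, separators=(",", ":"), default=str)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    # Assumption: the "prefix" is taken from the hexadecimal digest.
    return digest[:length]


# Same contents, different key order: the hashes should match.
first = stable_hash_sketch({"b": 1, "a": Path("data/raw")}, length=12)
second = stable_hash_sketch({"a": Path("data/raw"), "b": 1}, length=12)
print(first == second)
```

A short prefix (for example length=12) is convenient for cache keys and file names, at the cost of a higher collision probability than the full digest.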

coco_pipe.utils.import_optional_dependency(loader, feature, dependency, install_hint=None)#

Lazily import an optional dependency with clearer failure modes.

Parameters:
  • loader (callable) – Zero-argument callable returning the imported dependency.

  • feature (str) – Feature or component name using the dependency.

  • dependency (str) – Human-readable dependency name.

  • install_hint (str, optional) – Installation hint shown on ImportError.

Returns:

Imported dependency returned by loader.

Return type:

Any

Raises:
  • ImportError – If the dependency is not installed.

  • RuntimeError – If the dependency is installed but fails during initialization.
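
One way to read the contract above is as a thin wrapper around the loader that separates "not installed" from "installed but broken". The sketch below is an assumption-laden illustration: the function name, the exact error messages, and the use of exception chaining are hypothetical and may differ from the real helper.

```python
from typing import Any, Callable, Optional


def import_optional_dependency_sketch(
    loader: Callable[[], Any],
    feature: str,
    dependency: str,
    install_hint: Optional[str] = None,
) -> Any:
    """Illustrative stand-in for import_optional_dependency."""
    try:
        return loader()
    except ImportError as exc:
        # Dependency is missing: re-raise with a message naming the feature.
        message = f"{feature} requires the optional dependency '{dependency}'."
        if install_hint:
            message += f" {install_hint}"
        raise ImportError(message) from exc
    except Exception as exc:
        # Dependency is present but failed while initializing.
        raise RuntimeError(
            f"'{dependency}' is installed but failed to initialize for {feature}."
        ) from exc


def _load_json():
    import json  # stands in for a real optional dependency
    return json


def _load_missing():
    import a_module_that_does_not_exist_xyz  # hypothetical missing package
    return a_module_that_does_not_exist_xyz


json_module = import_optional_dependency_sketch(_load_json, "report export", "json")

try:
    import_optional_dependency_sketch(
        _load_missing, "plotting", "missing-lib", install_hint="Try: pip install missing-lib"
    )
except ImportError as exc:
    error_message = str(exc)
    print(error_message)
```

Passing a zero-argument loader rather than a module name keeps the import itself lazy: nothing is imported until the feature is actually used.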

coco_pipe.utils.get_git_revision_hash(cwd=None)#

Return the current short git hash, or "Unknown" when unavailable.

Parameters:

cwd (str | os.PathLike[str] | None)

Return type:

str
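
A minimal sketch of the "short hash or Unknown" behavior, assuming the helper shells out to git rev-parse --short HEAD. The function name and the set of exceptions treated as "unavailable" are assumptions made for illustration.

```python
import subprocess


def get_git_revision_hash_sketch(cwd=None) -> str:
    """Illustrative stand-in for get_git_revision_hash."""
    try:
        output = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=cwd,
            stderr=subprocess.DEVNULL,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, cwd does not exist, or cwd is not a repository.
        return "Unknown"
    return output.decode("utf-8").strip() or "Unknown"


revision = get_git_revision_hash_sketch()
print(revision)
```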

coco_pipe.utils.get_package_version(package_name)#

Return an installed distribution version, or "Unknown".

Parameters:

package_name (str)

Return type:

str
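
The same fallback pattern can be sketched with importlib.metadata from the standard library. This assumes the helper looks up the installed distribution by name; the function name below is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def get_package_version_sketch(package_name: str) -> str:
    """Illustrative stand-in for get_package_version."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        # No installed distribution with that name.
        return "Unknown"


print(get_package_version_sketch("definitely-not-installed-pkg-xyz"))
```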

coco_pipe.utils.get_environment_info(cwd=None, packages=None)#

Capture runtime provenance metadata for reports and experiment results.

Parameters:
  • cwd (optional) – Defaults to None.

  • packages (optional) – Defaults to None.

Return type:

dict[str, Any]

coco_pipe.utils.slug(value, *, max_len=80)#

Return a filesystem-safe slug from an arbitrary value.

Collapses runs of non-alphanumeric characters (except ., _, =, -) into a single -, strips leading/trailing punctuation, and truncates at max_len.

Parameters:
  • value (object) – Any object; str(value) is used as the source text.

  • max_len (int) – Maximum character length of the returned slug.

Return type:

str
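
The collapsing rule described above can be approximated with a single regular expression. In this sketch the function name, the restriction to ASCII letters and digits, the exact set of stripped characters, and the order of stripping and truncation are all assumptions.

```python
import re


def slug_sketch(value: object, *, max_len: int = 80) -> str:
    """Illustrative stand-in for slug."""
    # Replace each run of characters outside [A-Za-z0-9._=-] with one hyphen.
    text = re.sub(r"[^A-Za-z0-9._=-]+", "-", str(value))
    # Assumption: "punctuation" means the characters '-', '.', '_' and '='.
    text = text.strip("-._=")
    # Assumption: truncation happens after stripping.
    return text[:max_len]


print(slug_sketch("lr=0.01 / batch size: 32"))  # lr=0.01-batch-size-32
```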

coco_pipe.utils.resolve_n_jobs(n_jobs)#

Resolve n_jobs to a concrete positive integer.

A value of -1 resolves to os.cpu_count(), with a floor of 1. Any other value must already be a positive integer; otherwise ValueError is raised.

Parameters:

n_jobs (int)

Return type:

int
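
The resolution rule can be sketched in a few lines. The function name and the error message are hypothetical; the fallback to 1 when os.cpu_count() returns None follows the "minimum 1" wording above.

```python
import os


def resolve_n_jobs_sketch(n_jobs: int) -> int:
    """Illustrative stand-in for resolve_n_jobs."""
    if n_jobs == -1:
        # os.cpu_count() may return None, so fall back to a single worker.
        return max(1, os.cpu_count() or 1)
    if not isinstance(n_jobs, int) or n_jobs < 1:
        raise ValueError(f"n_jobs must be -1 or a positive integer, got {n_jobs!r}")
    return n_jobs


print(resolve_n_jobs_sketch(4))  # 4
```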

coco_pipe.utils.run_task_batch(tasks, worker_fn, max_workers)#

Execute tasks with worker_fn, optionally in parallel.

When max_workers is 1, the tasks are run serially in the current process. For any larger value, joblib.Parallel() is used with n_jobs=min(max_workers, len(tasks)), so the pool size never exceeds the number of tasks.

Parameters:
  • tasks (collections.abc.Sequence[Any]) – Sequence of opaque task objects passed one-by-one to worker_fn.

  • worker_fn (collections.abc.Callable[[Any], Any]) – Single-argument callable that processes one task and returns a result.

  • max_workers (int) – Maximum number of parallel workers. Pass 1 for serial execution.

Returns:

Results in the same order as tasks.

Return type:

list
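
The serial-versus-parallel dispatch can be sketched as follows. To keep the example dependency-free, concurrent.futures.ThreadPoolExecutor is swapped in for joblib.Parallel, so this is not the documented backend; the function name and the handling of an empty task list are also assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Sequence


def run_task_batch_sketch(
    tasks: Sequence[Any],
    worker_fn: Callable[[Any], Any],
    max_workers: int,
) -> list:
    """Illustrative stand-in for run_task_batch (thread pool, not joblib)."""
    # Serial path; an empty batch is also handled here (assumption).
    if max_workers == 1 or len(tasks) == 0:
        return [worker_fn(task) for task in tasks]
    # Never start more workers than there are tasks.
    n_workers = min(max_workers, len(tasks))
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map yields results in the order of the input tasks.
        return list(pool.map(worker_fn, tasks))


def square(x: int) -> int:
    return x * x


print(run_task_batch_sketch([1, 2, 3], square, max_workers=2))  # [1, 4, 9]
```

Capping the worker count at len(tasks) avoids paying startup cost for workers that would have nothing to do.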