Module Template & API Contract

This page provides a copy-pasteable template for new pyCyto pipeline modules and documents the full API contract.

Full Template

from __future__ import annotations
from tqdm import tqdm
import dask.array as da
import numpy as np


class MyModule:
    """
    One-line description of what this module does.

    Longer description explaining the algorithm, key parameters,
    and any caveats.

    Args:
        param1 (float): Description and units if applicable.
        param2 (str): Description. Options: ``"a"``, ``"b"``.
        verbose (bool): Print progress messages. Default ``True``.

    Example:
        >>> module = MyModule(param1=1.0)
        >>> result = module({"image": my_image})
        >>> label = result["label"]
    """

    def __init__(
        self,
        param1: float = 1.0,
        param2: str = "a",
        verbose: bool = True,
    ) -> None:
        self.name = "MyModule"
        self.param1 = param1
        self.param2 = param2
        self.verbose = verbose

    def __call__(self, data: dict) -> dict:
        """
        Run the module.

        Args:
            data (dict): Must contain key ``"image"`` (Dask or NumPy array,
                shape ``(T, Y, X)`` or ``(Y, X)``).

        Returns:
            dict: Contains keys ``"image"`` (the input, passed through
                unchanged) and ``"label"`` (same spatial shape as input,
                integer dtype).

        Raises:
            KeyError: If ``"image"`` is not present in *data*.
            ValueError: If image shape is unsupported.
        """
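        if "image" not in data:
            raise KeyError("MyModule requires data['image']")

        image = data["image"]

        # Enforce the documented shape contract: (T, Y, X) or (Y, X)
        if image.ndim not in (2, 3):
            raise ValueError(
                f"MyModule expects shape (T, Y, X) or (Y, X), got {image.shape}"
            )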

        if self.verbose:
            tqdm.write(f"[{self.name}] param1={self.param1}")

        label = self._process(image)
        return {"image": image, "label": label}

    def _process(self, image):
        # implementation
        raise NotImplementedError
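
As an illustration of filling in the template, the sketch below implements a small, hypothetical module named ThresholdLabel (not part of pyCyto) that marks pixels above a threshold as foreground. The class name, the threshold parameter, and the binary labelling scheme are all assumptions made for the example; only the contract (a dict with "image" in, a dict with "image" and "label" out) comes from the template above.

```python
from __future__ import annotations

import numpy as np


class ThresholdLabel:
    """Hypothetical module: mark pixels above a threshold as foreground (1)."""

    def __init__(self, threshold: float = 0.5, verbose: bool = False) -> None:
        self.name = "ThresholdLabel"
        self.threshold = threshold
        self.verbose = verbose

    def __call__(self, data: dict) -> dict:
        if "image" not in data:
            raise KeyError("ThresholdLabel requires data['image']")
        image = data["image"]
        if image.ndim not in (2, 3):
            raise ValueError(f"Unsupported image shape: {image.shape}")
        if self.verbose:
            print(f"[{self.name}] threshold={self.threshold}")
        # Pass the input through and add the new key
        return {"image": image, "label": self._process(image)}

    def _process(self, image):
        # Elementwise comparison, then cast to the int32 label dtype
        return (image > self.threshold).astype(np.int32)


frames = np.array([[[0.1, 0.9], [0.7, 0.2]]])  # shape (T=1, Y=2, X=2)
result = ThresholdLabel(threshold=0.5)({"image": frames})
```

Because _process uses only elementwise comparison and astype, the same code should also accept a Dask array and stay lazy, although that path is not exercised here.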

Checklist for New Modules

  • [ ] Class in cyto/<stage>/my_module.py

  • [ ] Imported and re-exported in cyto/<stage>/__init__.py

  • [ ] Dependencies added to pixi.toml

  • [ ] Unit test in tests/test_<stage>.py

  • [ ] Google-style docstrings (Args/Returns/Raises, as in the template above) on the class and __call__

  • [ ] YAML example block in pipelines/

  • [ ] Entry in relevant notebook example

  • [ ] sbatch template in distributed/<stage>/ (for distributed stages)
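
For the unit-test item in the checklist above, a minimal sketch of what tests/test_<stage>.py might contain is shown below. It assumes a pytest-style runner that collects plain test_* functions, and it uses a hypothetical _StandInModule so the example is self-contained; in a real test file the module under test would be imported instead.

```python
from __future__ import annotations

import numpy as np


class _StandInModule:
    """Hypothetical stand-in; replace with the module under test."""

    def __call__(self, data: dict) -> dict:
        if "image" not in data:
            raise KeyError("requires data['image']")
        image = data["image"]
        # Trivial "segmentation": every pixel gets background label 0
        return {"image": image, "label": np.zeros(image.shape, dtype=np.int32)}


def test_label_matches_image_shape():
    image = np.random.default_rng(0).random((3, 8, 8))  # (T, Y, X)
    result = _StandInModule()({"image": image})
    assert result["label"].shape == image.shape
    assert np.issubdtype(result["label"].dtype, np.integer)


def test_input_image_is_passed_through():
    image = np.ones((8, 8))  # (Y, X)
    result = _StandInModule()({"image": image})
    assert result["image"] is image


def test_missing_image_raises_key_error():
    try:
        _StandInModule()({})
    except KeyError:
        return
    raise AssertionError("expected KeyError for missing 'image'")
```

The three tests mirror the contract in the template: output shape and dtype, pass-through of the input, and the documented KeyError.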

Dictionary Key Reference

Key          Type           Shape convention
-----------  -------------  -------------------------------
"image"      dask/np array  (T, Y, X) or (Y, X)
"label"      dask/np array  same spatial dims, int32
"dataframe"  pd.DataFrame   one row per detection
"network"    nx.Graph       nodes = cells, edges = contacts
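
The array conventions in the key reference can be checked mechanically. The helper below, check_data_dict, is a hypothetical name and not part of pyCyto; it is a sketch that covers only the "image" and "label" rows, assumes "same spatial dims" means the last two axes (Y, X) match, and leaves "dataframe" and "network" unchecked.

```python
from __future__ import annotations

import numpy as np


def check_data_dict(data: dict) -> list[str]:
    """Return a list of convention violations (empty if none were found)."""
    problems = []
    image = data.get("image")
    label = data.get("label")

    # "image": (T, Y, X) or (Y, X)
    if image is not None and image.ndim not in (2, 3):
        problems.append(f"image has {image.ndim} dims, expected 2 or 3")

    if label is not None:
        # "label": int32 per the key reference
        if label.dtype != np.int32:
            problems.append(f"label dtype is {label.dtype}, expected int32")
        # Assumption: spatial dims are the trailing (Y, X) axes
        if image is not None and label.shape[-2:] != image.shape[-2:]:
            problems.append(
                f"label spatial shape {label.shape[-2:]} "
                f"differs from image {image.shape[-2:]}"
            )
    return problems


good = {
    "image": np.zeros((2, 4, 4)),
    "label": np.zeros((2, 4, 4), dtype=np.int32),
}
bad = {
    "image": np.zeros((2, 4, 4)),
    "label": np.zeros((2, 4, 5), dtype=np.float64),
}
good_problems = check_data_dict(good)
bad_problems = check_data_dict(bad)
```

Returning a list of messages rather than raising on the first failure lets a caller report every violation in one pass; a module could call such a helper on its output during development.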