# Boilerplate Templates
Copy-pasteable templates for the most common extension patterns in pyCyto.
```mermaid
graph LR
    A[Define class in cyto/stage/] --> B[Register import in main.py]
    B --> C[Add to pipeline YAML]
    C --> D[Add resource spec YAML]
    D --> E[pixi install -e env]
    classDef templateStep fill:#0d7377,color:#fff,stroke:#0a5c60
    class A,B,C,D,E templateStep
```
## 1. Compute Node Class
Every pipeline stage is a callable class following the dict I/O contract. Copy the variant that matches your stage type.
### Preprocessing node (Image → Image)
```python
from tqdm import tqdm


class MyPreprocessing:
    def __init__(self, param_a=1.0, verbose=True):
        self.name = "MyPreprocessing"
        self.param_a = param_a
        self.verbose = verbose

    def __call__(self, data: dict) -> dict:
        image = data["image"]  # dask or numpy array (T, Y, X) or (T, Z, Y, X)
        if self.verbose:
            tqdm.write(f"[{self.name}] processing {image.shape}")
        result = image * self.param_a  # replace with your algorithm
        return {"image": result}
```
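The template can be exercised on its own before wiring it into a pipeline. A minimal sketch (the array shape and `param_a` value are illustrative; logging is omitted for brevity):

```python
import numpy as np

# Compact stand-in for the MyPreprocessing template above, demonstrating
# the dict-in / dict-out contract on a small (T, Y, X) array.
class MyPreprocessing:
    def __init__(self, param_a=1.0):
        self.param_a = param_a

    def __call__(self, data: dict) -> dict:
        return {"image": data["image"] * self.param_a}

stage = MyPreprocessing(param_a=2.0)
out = stage({"image": np.ones((3, 64, 64))})
# out["image"] keeps the input shape; values are scaled by param_a
```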
### Segmentation node (Image → Label)
```python
from tqdm import tqdm


class MySegmentation:
    def __init__(self, threshold=0.5, verbose=True):
        self.name = "MySegmentation"
        self.threshold = threshold
        self.verbose = verbose

    def __call__(self, data: dict) -> dict:
        image = data["image"]
        if self.verbose:
            tqdm.write(f"[{self.name}] segmenting {image.shape}")
        labels = (image > self.threshold).astype("uint32")  # replace with real model
        return {"label": labels}
```
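A quick standalone check that the output satisfies the Label contract (integer-typed mask). The input values are illustrative:

```python
import numpy as np

# Compact stand-in for the MySegmentation template above (logging omitted).
class MySegmentation:
    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def __call__(self, data: dict) -> dict:
        return {"label": (data["image"] > self.threshold).astype("uint32")}

seg = MySegmentation(threshold=0.5)
out = seg({"image": np.array([[0.2, 0.9], [0.6, 0.1]])})
# two pixels exceed the threshold; the mask dtype is uint32
```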
### Postprocessing node (Image + Label + DataFrame → any)
```python
from tqdm import tqdm


class MyAnalysis:
    def __init__(self, verbose=True):
        self.name = "MyAnalysis"
        self.verbose = verbose

    def __call__(self, data: dict) -> dict:
        image = data.get("image")
        label = data.get("label")
        df = data.get("dataframe")
        if self.verbose and df is not None:
            tqdm.write(f"[{self.name}] analysing {len(df)} detections")
        # your analysis here
        result_df = df.copy()
        return {"dataframe": result_df}
```
## 2. Pipeline YAML Snippet
Add a new stage to your pipeline YAML:
```yaml
pipeline:
  postprocessing:
    - name: MyAnalysis        # class name — must match Python class
      tag: MyAnalysisTag      # unique identifier; used in logs and resource config
      channels: [TCell]       # channel names to pass to this stage
      input_type: [image, label, feature]
      output_type: [feature]
      args:
        verbose: true         # passed as kwargs to __init__
      output: true            # write result to output_dir/postprocessing/<tag>/
```
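A hypothetical sketch of how a pipeline driver could turn such an entry into a stage instance; pyCyto's actual loader may differ. `STAGE_SPEC` mirrors the parsed YAML keys, and the registry dict stands in for whatever class-lookup mechanism the driver uses:

```python
# Parsed form of the YAML stage entry (only the keys used here).
STAGE_SPEC = {
    "name": "MyAnalysis",        # looked up in a class registry
    "tag": "MyAnalysisTag",
    "args": {"verbose": False},  # forwarded to __init__ as kwargs
}

class MyAnalysis:
    def __init__(self, verbose=True):
        self.verbose = verbose

# name -> class mapping; stands in for the driver's registration step
REGISTRY = {"MyAnalysis": MyAnalysis}

stage = REGISTRY[STAGE_SPEC["name"]](**STAGE_SPEC.get("args", {}))
```

This is why `name` must match the Python class exactly: it is the lookup key, while `tag` only identifies the run in logs and resource configs.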
## 3. Resource YAML Compute Spec Entries
Add a matching entry in `configs/distributed/pipeline-resources.yaml`:
```yaml
# ── CPU baremetal (no GPU needed) ─────────────────────────────────────────────
pipeline:
  postprocessing:
    MyAnalysisTag:                  # must match tag in pipeline YAML
      partition: short
      cpus-per-task: 4
      mem: 16G
      time: "01:00:00"
      batch_size: 50
      dependency: [Cellpose_TCell]
      dependency_type: afterok

    # ── GPU stage ─────────────────────────────────────────────────────────────
    MyGpuAnalysisTag:
      partition: gpu_short
      gres: gpu:a100-pcie-40gb:1
      cpus-per-task: 4
      mem: 32G
      time: "04:00:00"
      batch_size: 100
      dependency: singleton
      dependency_type: afterok

    # ── Apptainer container ───────────────────────────────────────────────────
    MyContainerTag:
      partition: gpu_short
      gres: gpu:a100-pcie-40gb:1
      cpus-per-task: 4
      mem: 32G
      container: containers/images/cyto-gpu.sif  # path to SIF
      dependency: [Cellpose_TCell]
      dependency_type: afterok
```
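Most of these keys map one-to-one onto long-form `sbatch` options (`--partition`, `--cpus-per-task`, `--mem`, `--time`, `--gres`). An illustration only; pyCyto's actual submitter may differ, and driver-level keys such as `batch_size`, `container`, and `dependency` need scheduler-specific handling, so they are skipped here:

```python
# Subset of a resource spec whose keys are literal sbatch long options.
SPEC = {
    "partition": "short",
    "cpus-per-task": 4,
    "mem": "16G",
    "time": "01:00:00",
}

def to_sbatch_args(spec: dict) -> list:
    # Render each key/value pair as a --key=value command-line flag.
    return [f"--{key}={value}" for key, value in spec.items()]

args = to_sbatch_args(SPEC)
```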
## 4. pixi.toml Feature Dependency Block
Add an optional package as a named pixi feature so it does not pollute the default environment:
```toml
# In pixi.toml — add a new feature section
[feature.myalgorithm.dependencies]
python = ">=3.10"
mypackage = ">=1.2"  # replace with real conda/PyPI package name

[feature.myalgorithm.pypi-dependencies]
my-pypi-package = ">=0.5"

# Wire the feature into an environment
[environments]
myalgorithm = { features = ["myalgorithm"], solve-group = "cpu" }
```
Install with `pixi install -e myalgorithm`.
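In the stage module itself, a guarded import keeps the default environment importable: the module loads without the feature installed and fails with an actionable message only when the stage is actually used. A sketch, reusing the placeholder `mypackage` name from the feature above:

```python
# Guarded import: mypackage is only present in the myalgorithm environment.
try:
    import mypackage  # provided by the myalgorithm pixi feature
    HAS_MYPACKAGE = True
except ImportError:
    HAS_MYPACKAGE = False

class MyAlgorithmStage:
    def __init__(self):
        if not HAS_MYPACKAGE:
            raise ImportError(
                "MyAlgorithmStage requires 'mypackage'; "
                "install it with: pixi install -e myalgorithm"
            )
```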
## 5. Provenance-Aware Output Writer Stub
For modules that write files directly (rather than returning a dict):
```python
import json
from datetime import datetime, timezone
from pathlib import Path

from tqdm import tqdm


def write_provenance(output_dir: Path, params: dict, git_sha: str = "") -> None:
    """Write a JSON provenance sidecar next to module outputs."""
    record = {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "git_sha": git_sha,
        "params": params,
    }
    (output_dir / "provenance.json").write_text(json.dumps(record, indent=2))


class MyFileWriter:
    def __init__(self, output_dir: str, verbose=True):
        self.name = "MyFileWriter"
        self.output_dir = Path(output_dir)
        self.verbose = verbose

    def __call__(self, data: dict) -> dict:
        self.output_dir.mkdir(parents=True, exist_ok=True)
        df = data["dataframe"]
        out_path = self.output_dir / "results.csv"
        df.to_csv(out_path, index=False)
        write_provenance(self.output_dir, params={"module": self.name})
        if self.verbose:
            tqdm.write(f"[{self.name}] wrote {out_path}")
        return {"dataframe": df}
```
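A round-trip check of the sidecar. The temporary directory and git SHA are illustrative; the function is repeated from the stub above so the snippet runs standalone:

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

def write_provenance(output_dir: Path, params: dict, git_sha: str = "") -> None:
    """Write a JSON provenance sidecar next to module outputs."""
    record = {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "git_sha": git_sha,
        "params": params,
    }
    (output_dir / "provenance.json").write_text(json.dumps(record, indent=2))

with tempfile.TemporaryDirectory() as tmp:
    write_provenance(Path(tmp), params={"module": "MyFileWriter"}, git_sha="abc1234")
    # Read the sidecar back to confirm the record is valid JSON.
    record = json.loads((Path(tmp) / "provenance.json").read_text())
```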
## See Also
- Module Template — annotated full-class template with checklist
- Plugin Integration — integration patterns (JVM, GPU, CLI, container)
- Docker to Apptainer — build and run SIF on HPC