Developer Guide

pyCyto is built around a single architectural principle: every pipeline stage is a stateless Python callable that maps a dictionary to a dictionary. This page explains how to extend the pipeline, how the compute node model works, and how to package and containerize new modules.

The Compute Node Model

Each pipeline stage is a self-contained unit of computation:

input_dict  ──►  Module(params)  ──►  output_dict

The module receives all inputs through a dictionary, performs computation, and returns outputs through a dictionary. No global state, no file I/O inside the module class itself (I/O is handled by the pipeline orchestrator). This design means the same module runs identically whether it is called:

  • interactively in a notebook

  • via cyto --pipeline my_pipeline.yaml

  • inside a SLURM sbatch job

  • inside a container (Docker or Apptainer)

Dictionary key contract

Stage            Input key(s)             Output key(s)
Preprocessing    "image"                  "image"
Segmentation     "image" / "label"        "label"
Tabulation       "image" / "label"        "dataframe"
Tracking         "dataframe"              "dataframe"
Contact          "label" / "dataframe"    "network", "dataframe"
Postprocessing   any                      arbitrary

Always use these exact string keys. The pipeline YAML router uses them to wire stages together.
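Under this contract, the orchestrator can wire stages together by threading one dictionary through each callable and merging the outputs. A minimal sketch — the `Invert` and `Threshold` classes here are hypothetical stand-ins, not actual pyCyto modules:

```python
import numpy as np

# Hypothetical stages used only to illustrate the key contract.
class Invert:
    def __call__(self, data: dict) -> dict:
        return {"image": 255 - data["image"]}       # Preprocessing: "image" -> "image"

class Threshold:
    def __init__(self, level=128):
        self.level = level

    def __call__(self, data: dict) -> dict:
        # Segmentation: "image" -> "label"
        return {"label": (data["image"] > self.level).astype(np.uint8)}

data = {"image": np.array([[0, 200], [100, 255]], dtype=np.uint8)}
for stage in (Invert(), Threshold(level=128)):
    data.update(stage(data))                        # merge each stage's output keys

print(sorted(data))                                 # → ['image', 'label']
```

Because every stage reads and writes the same standardized keys, the loop above never needs stage-specific glue code.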


Module Template

All pipeline modules follow this class structure:

from tqdm import tqdm

class MyModule:
    def __init__(self, param1=default_value, verbose=True) -> None:
        """
        Short description.

        Args:
            param1: Description.
            verbose (bool): Enable progress logging.
        """
        self.name = "MyModule"
        self.param1 = param1
        self.verbose = verbose

    def __call__(self, data: dict) -> dict:
        """
        Process data.

        Args:
            data (dict): Input with standardized keys.

        Returns:
            dict: Output with standardized keys.
        """
        image = data["image"]

        if self.verbose:
            tqdm.write(f"[{self.name}] processing ...")

        result = _my_algorithm(image, self.param1)
        return {"image": result}

Rules:

  1. Use tqdm.write() for logging — not print() — so progress bars are not broken.

  2. Prefer Dask arrays over NumPy for inputs/outputs so lazy evaluation propagates.

  3. Do not open files inside __call__. File paths belong in __init__ or in the orchestrator.

  4. Raise ValueError (not AssertionError) for invalid inputs.
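Rule 4 in practice — a sketch of explicit input validation at the top of __call__. The `SafeModule` class is illustrative; only the key names come from the contract above:

```python
import numpy as np

class SafeModule:
    name = "SafeModule"

    def __call__(self, data: dict) -> dict:
        # Raise ValueError (not AssertionError): assertions vanish under
        # `python -O`, and ValueError carries a user-facing message.
        if "image" not in data:
            raise ValueError(f"[{self.name}] missing required key 'image'")
        image = data["image"]
        if getattr(image, "ndim", 0) < 2:
            raise ValueError(f"[{self.name}] expected an array with ndim >= 2")
        return {"image": image}
```

Validating up front keeps error messages tied to the stage that received bad data rather than surfacing deep inside the algorithm.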

Adding to the pipeline

  1. Place the module in the appropriate cyto/<stage>/ subpackage.

  2. Add an import in cyto/<stage>/__init__.py.

  3. Add dependencies to pixi.toml ([dependencies] for universal, [feature.<env>.dependencies] for optional).

  4. Add a YAML block in an example pipeline under pipelines/.

  5. For distributed execution, add a corresponding sbatch template in distributed/<stage>/.
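For step 4, a stage block in an example pipeline might look like the following. The stage name, parameter names, and file path are illustrative — match them to your module and to the repository's actual template:

```yaml
# pipelines/example_pipeline.yaml (illustrative fragment)
- stage: preprocessing
  module: MyModule          # class exported from cyto/preprocessing/__init__.py
  params:
    param1: 0.5
    verbose: true
```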


Plugin Integration Guide

Some analysis steps rely on external tools (Fiji/TrackMate, pyclesperanto, ANTs). The integration pattern is the same in each case: wrap the external tool in a module class that satisfies the dictionary contract.

Baremetal integration

The external tool is called directly via Python bindings or subprocess:

# TrackMate via PyImageJ
import imagej

fiji_dir = None  # optional: path to a local Fiji installation
ij = imagej.init(fiji_dir or 'sc.fiji:fiji', headless=True)

Container integration

For tools that have conflicting dependencies, run them inside a container and pass data via temporary files or shared memory:

import subprocess, tempfile
import numpy as np

with tempfile.NamedTemporaryFile(suffix=".npy") as f:
    np.save(f.name, data["image"])  # hand the array to the container via a temp file
    subprocess.run(
        ["apptainer", "exec", "--nv", "tool.sif", "python", "run_tool.py", f.name],
        check=True,  # surface failures from the containerized tool
    )

SLURM job integration

For tools that need their own SLURM job (e.g. multi-GPU), write a batch script template and submit via the distributed/ orchestrator:

#!/bin/bash
# distributed/tracking/batch_trackmate.sbatch
#SBATCH --partition=short
#SBATCH --cpus-per-task=8
#SBATCH --mem=64G
pixi run -e imagej python distributed/tracking/batch_trackmate.py "$@"

Docker → Apptainer Migration

HPC clusters typically prohibit Docker (requires root). Use Apptainer (formerly Singularity) instead.

Converting a Docker image to Apptainer SIF

# Build Docker image locally first
docker build -f containers/Dockerfile -t cyto-gpu:latest .

# Convert to SIF (can also pull directly from Docker Hub)
apptainer build containers/images/cyto-gpu.sif docker-daemon://cyto-gpu:latest

# Or build directly from a definition file (no Docker required)
apptainer build containers/images/cyto-gpu.sif containers/apptainer/cyto-gpu.def

Running with Apptainer

# GPU-enabled run
apptainer exec --nv containers/images/cyto-gpu.sif \
    pixi run -e gpu python scripts/benchmark/run_benchmark.py

# Interactive shell
apptainer shell --nv containers/images/cyto-gpu.sif

Build script

# Automated build (from repo root)
bash containers/apptainer/build.sh

The SIF path is configured via gpu_sif in scripts/benchmark/config/benchmark.def.toml. Set it in your benchmark.user.toml once built.


Dependency Management (pixi.toml)

⚠️ Add all new dependencies to pixi.toml, not to a requirements file or env YAML.

# Always-required (in default env)
[dependencies]
scipy = ">=1.10"

# PyPI-only dependency
[pypi-dependencies]
my-package = ">=1.0"

# Optional: only in the cellpose feature env
[feature.cellpose.pypi-dependencies]
cellpose = ">=3.0"

After editing pixi.toml, run pixi install (or pixi install -e <env>) to rebuild. Run pixi run pytest tests/ to verify.


Dask Support

Prefer Dask arrays over NumPy for large datasets. Dask enables lazy evaluation — data is only read/computed when explicitly requested via .compute().

import dask.array as da

# Lazy allocation — no chunk is materialized until .compute()
arr = da.zeros((10000, 2048, 2048), chunks=(1, 2048, 2048))

# Only compute what you need
result = arr[0:10].compute()

Avoid calling .compute() inside module __call__ unless strictly necessary — let the orchestrator decide when to materialize.
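Rule 2 of the module template follows the same idea: if a module transforms chunks with `dask.array.map_blocks` and never calls `.compute()`, laziness propagates through the whole pipeline. A sketch — the `LazyThreshold` class is illustrative, not an actual pyCyto module:

```python
import numpy as np
import dask.array as da

class LazyThreshold:
    """Illustrative module: per-chunk thresholding that stays lazy."""
    def __init__(self, level=0.5):
        self.level = level

    def __call__(self, data: dict) -> dict:
        image = data["image"]
        # map_blocks only extends the task graph; nothing runs here
        label = image.map_blocks(lambda b: (b > self.level).astype(np.uint8))
        return {"label": label}

stack = da.random.random((8, 64, 64), chunks=(1, 64, 64))   # lazy 3D stack
label = LazyThreshold(0.5)({"image": stack})["label"]       # still lazy
frame0 = label[0].compute()                                 # materialize one frame only
```

The orchestrator (or the user) then decides when, and how much, to materialize.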


Testing

All new modules should have unit tests in tests/:

# Run all tests
pixi run pytest tests/

# Run a specific test file
pixi run pytest tests/test_preprocessing.py -v

Tests must not require GPU access unless guarded by a skip:

import pytest

torch = pytest.importorskip("torch")  # skip if PyTorch is not available
requires_gpu = pytest.mark.skipif(not torch.cuda.is_available(), reason="GPU not available")

Documentation

Build the Sphinx docs locally:

cd doc/
pixi run make html
# Open: doc/_build/html/index.html

All new public classes and functions must have NumPy-style docstrings. Sphinx autodoc picks them up automatically.


Documentation Impact Classification

When opening a pull request or committing changes, classify the docs impact so reviewers know what to update:

Classification   Meaning                                                         Required docs action
none             Internal implementation only; no user-visible behavior change   No docs update required
minor            Adds a parameter, changes a default, or fixes a bug             Update the relevant docstring and YAML example
major            New module, new stage type, new interface, or breaking change   Update API reference, pipeline.md stage section, and cross-links

Minimum required updates for major docs-impact changes

  1. Canonical page update — add or update the relevant section in doc/source/ (pipeline stage, setup step, etc.)

  2. API docstring — NumPy-style docstring on the class and all public methods

  3. YAML example — add an annotated YAML snippet to configs/pipelines/pipeline.template.yaml with a comment

  4. Cross-link check — verify all pages that reference the changed module or path still resolve correctly (run make html and check for warnings)

  5. Changelog entry — add a bullet to the relevant OpenSpec change tasks file if the change is tracked there

Cross-change alignment

Before merging, check with team members working on overlapping areas. If two people are editing the same doc page or API module at the same time, coordinate to avoid contradictory guidance or overwritten work.

Pre-merge checklist

Before opening a pull request for any major docs-impact change, verify:

  • [ ] Docs impact classified: is this none / minor / major?

  • [ ] Docstrings added or updated (NumPy style) on any new or changed public API

  • [ ] YAML example updated in configs/pipelines/pipeline.template.yaml if a new module was added

  • [ ] cd doc && pixi run make html passes with 0 new warnings

  • [ ] Cross-links work: no 404s in the rendered HTML for pages you touched