# Developer Guide for Cytotoxicity Pipeline

The Cytotoxicity pipeline is a modular, CLI-based workflow designed for flexibility and reusability. The workflow consists of the following main steps:

1. **Preprocessing**: Transforms raw images into processed images.
   *Input*: image → *Output*: image
2. **Segmentation**: Converts images to segmentation labels, or refines existing labels.
   *Input*: image → label / label → label
3. **Tabulation**: Converts images or labels into a sparse tabular format for downstream analysis.
   *Input*: image → dataframe / label → dataframe
4. **Tracking**: Tracks objects over time using either dense (label-based) or sparse (feature point-based) methods.
   *Input*: label → dataframe / dataframe → dataframe
5. **Analysis**: Performs analysis and visualization using any combination of images, labels, or tabular data.
   *Input*: dataframe → dataframe / arbitrary outputs

Each step is implemented as an independent module, allowing easy customization and extension of the pipeline. Each workflow step accepts only specific input-output type pairs. To enhance module reusability, data is passed between modules as a Python dictionary, e.g. preprocessing receives a dict input with the key "image" and returns its output in the same format. For the individual steps, check the table below for the input-output dictionary key pairs:

| Step          | Input Key                   | Output Key  |
|---------------|-----------------------------|-------------|
| Preprocessing | "image"                     | "image"     |
| Segmentation  | "image"/"label"             | "label"     |
| Tabulation    | "image"/"label"             | "dataframe" |
| Tracking      | "label"/"dataframe"         | "dataframe" |
| Analysis      | "image"/"label"/"dataframe" | arbitrary   |

In short, follow this template class for a modular workflow step:

```python
from typing import Any

from tqdm import tqdm


class MySegmentationClass(object):
    def __init__(self, arg1=(0, 1, 2), arg2="foo", verbose=True) -> None:
        """Function documentation comes here.

        Args:
            arg1 (tuple or int): Tuple or integer argument input
            arg2 (str): String input
            verbose (bool): Turn on or off the processing printout
        """
        self.name = "MySegmentationClass"
        self.arg1 = arg1
        self.arg2 = arg2
        self.verbose = verbose

    def __call__(self, data) -> Any:
        image = data["image"]
        if self.verbose:
            tqdm.write("Class args: [{},{}]".format(self.arg1, self.arg2))
        # some processing here ...
        label = awesome_segmentation(image)  # placeholder for the actual segmentation
        return {"image": image, "label": label}
```

A full example of such a class can be found in [../preprocessing/normalization.py](../preprocessing/normalization.py).

⚠️ **Important**: Add your package dependencies to [requirements.txt](../requirements.txt) ⚠️

⚠️ Add notes to [README.md](../README.md) and/or [./doc/setup.md](./setup.md) when necessary, particularly for conda/mamba-specific dependencies ⚠️

To load the class back into the main function, you only need to add the corresponding import at the header and edit the pipeline YAML file.

## Custom Intermediate Output

TODO

## Dask Support

For better big-data management, we recommend using [Dask arrays](https://docs.dask.org/en/stable/array.html) rather than NumPy, though in some cases CuPy or plain NumPy may do the job.
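Below is a minimal sketch of how a module might hand its `"image"` payload over as a Dask array; the array shape, the chunk sizes, and the reuse of the `MySegmentationClass` template above are illustrative assumptions, not part of the pipeline API.

```python
# Minimal sketch: wrapping a step's "image" payload in a Dask array.
# Shapes, chunk sizes, and the commented class usage are assumptions.
import numpy as np
import dask.array as da

# A raw (time, y, x) image stack as a NumPy array.
image_np = np.random.rand(16, 1024, 1024)

# Wrap it as a Dask array, chunked per time point, so downstream
# modules can process it lazily one frame at a time.
image = da.from_array(image_np, chunks=(1, 1024, 1024))

# The dict interface stays the same; only the array type changes.
data = {"image": image}
# data = MySegmentationClass(verbose=False)(data)  # hypothetical usage

# Operations only build a task graph; nothing runs until .compute().
normalized = (image - image.mean()) / image.std()
print(normalized.compute().shape)  # (16, 1024, 1024)
```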
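Chunking along the time axis keeps per-frame memory bounded, which is usually the practical bottleneck for long time-lapse stacks; in general, pick chunk sizes that match how your module iterates over the data.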