cyto.utils¶
General utilities: label conversion, kinematics, segmentation cache, notebook config loading.
- cyto.utils.config.load_db_config(configs_dir: Path | None = None) dict[source]¶
Load the PostgreSQL database configuration for pyCyto.
Searches upward from the current working directory (or
configs_dir) for configs/db.def.toml, then overlays configs/db.user.toml if present. The user file must supply at minimum database.password.- Parameters:
configs_dir (Path, optional) – Explicit path to the
configs/ directory. If None, walks upward from Path.cwd() until a directory containing db.def.toml is found.- Returns:
- Merged database config with keys
host, port, dbname, user, password, and optionally an admin sub-dict.
- Return type:
dict
- Raises:
FileNotFoundError – If
db.def.toml cannot be located, or if db.user.toml does not exist (credentials are required at runtime).
Example:
from cyto.utils import load_db_config
from urllib.parse import quote_plus

db = load_db_config()
conn_str = (
    f"postgresql+psycopg2://{quote_plus(db['user'])}:"
    f"{quote_plus(db['password'])}@{db['host']}:{db['port']}/{db['dbname']}"
)
- cyto.utils.config.load_notebook_config(notebooks_dir: Path | None = None) dict[source]¶
Load the DataOps path configuration for notebooks.
Searches upward from the current working directory (or
notebooks_dir) for config.def.toml, then overlays config.user.toml if present. Returns the merged configuration dict.- Parameters:
notebooks_dir (Path, optional) – Explicit path to the
notebooks/ directory containing config.def.toml. If None, the function walks upward from Path.cwd() until the file is found.- Returns:
- Merged TOML config — keys match the sections in
config.def.toml (paths, dataset, datasets). Also injects _meta.notebooks_dir and _meta.output_root (a resolved Path) for convenience.
- Return type:
dict
- Raises:
FileNotFoundError – If
config.def.toml cannot be located.
- DataOps retention tiers (documented in notebooks/config.def.toml):
Tier 1 — Ceph SoT: raw snapshots + promoted final artifacts (permanent)
Tier 2 — output_root: figures, tables, metadata JSON (retained until promoted)
Tier 3 — scratch_root: large intermediate TIFFs/arrays (volatile, delete after promotion)
Tier 4 — Ephemeral: SLURM logs, tmp; auto-rotated
Example:
from cyto.utils import load_notebook_config
from pathlib import Path

cfg = load_notebook_config()
DATA_ROOT = Path(cfg["paths"]["data_root"])
SCRATCH_ROOT = Path(cfg["paths"]["scratch_root"])
OUTPUT_ROOT = Path(cfg["paths"]["output_root"])  # small results
SNAPSHOT_ID = cfg["dataset"]["snapshot_id"]
- cyto.utils.label_to_table.extract_segment_features(image, label, frame, relabel=False, offset=0, channel='', spacing=[1, 1])[source]¶
- cyto.utils.label_to_table.label_to_sparse(label, image=None, spacing=[1, 1], channel_name='', processes=1)[source]¶
- cyto.utils.kinematics.cal_kinematics(tracks, x_col='x', y_col='y', frame_col='frame', track_id_col='track_id', chuck_size=2000, verbose=False)[source]¶
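The signature above suggests per-track kinematics computed over (x, y, frame) columns grouped by track ID. The helper below is a hypothetical, standalone sketch of the most basic such quantity — frame-to-frame displacement — mirroring cal_kinematics' column parameters; it is not the library's implementation.

```python
import numpy as np
import pandas as pd

def per_track_displacement(tracks: pd.DataFrame,
                           x_col: str = "x", y_col: str = "y",
                           frame_col: str = "frame",
                           track_id_col: str = "track_id") -> pd.DataFrame:
    """Frame-to-frame displacement per track (hypothetical helper)."""
    # Sort so that diff() runs in temporal order within each track.
    tracks = tracks.sort_values([track_id_col, frame_col]).copy()
    dx = tracks.groupby(track_id_col)[x_col].diff()
    dy = tracks.groupby(track_id_col)[y_col].diff()
    # Euclidean step length; first frame of each track is NaN (no predecessor).
    tracks["displacement"] = np.hypot(dx, dy)
    return tracks
```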
- cyto.utils.kinematics.compute_msd_track_vectorized(track_data)[source]¶
Compute time-lagged MSD for all lags at once for a single track.
Replaces calling compute_msd_time_lag in a loop: O(T²) with vectorised NumPy operations per lag instead of O(T³) with pure-Python nested loops.
- Parameters:
track_data (pd.DataFrame) – Single-track data with ‘frame’ and ‘displacement squared’ columns.
- Returns:
(time_lags np.ndarray, msd_values np.ndarray)
- Return type:
tuple[np.ndarray, np.ndarray]
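The vectorised idea can be sketched in a few lines. Note this is a standalone illustration working directly from x/y coordinates (the actual function consumes a precomputed 'displacement squared' column, per the docstring); column names here are assumptions.

```python
import numpy as np
import pandas as pd

def msd_all_lags(track: pd.DataFrame) -> tuple[np.ndarray, np.ndarray]:
    """Time-lagged MSD for every lag of one track: vectorised per lag, O(T^2) total."""
    xy = track[["x", "y"]].to_numpy()
    n = len(xy)
    lags = np.arange(1, n)
    msd = np.array([
        # Mean squared displacement over all pairs separated by `lag` frames.
        np.mean(np.sum((xy[lag:] - xy[:-lag]) ** 2, axis=1))
        for lag in lags
    ])
    return lags, msd
```

For a track moving at constant speed v, this recovers the ballistic scaling MSD(τ) = (vτ)².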
cyto.utils.seg_cache¶
Pickle-based segmentation cache utilities.
Cache files are named segmentation_frame_{idx:04d}_{cell_type}.pkl
and live in a persistent directory that is shared across runs and ignored
by git (notebooks/**/cache/).
Both the batch script and the interactive notebook import from here so the cache format is defined in one place.
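The naming convention above is simple enough to reproduce; the sketch below shows what get_cache_filename presumably computes, purely as an illustration of the documented pattern (it is not the library code):

```python
from pathlib import Path

def cache_filename(cache_dir, frame_idx: int, cell_type: str) -> Path:
    # Canonical name per the convention: segmentation_frame_{idx:04d}_{cell_type}.pkl
    return Path(cache_dir) / f"segmentation_frame_{frame_idx:04d}_{cell_type}.pkl"
```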
- cyto.utils.seg_cache.cache_exists(cache_dir, frame_idx, cell_type)[source]¶
Return True if a (non-failed) cache file exists for this frame.
- cyto.utils.seg_cache.get_cache_filename(cache_dir, frame_idx, cell_type)[source]¶
Return the canonical cache file path for a frame and cell type.
- cyto.utils.seg_cache.load_segmentation_cache(cache_dir, frame_idx, cell_type)[source]¶
Load segmentation results from a pkl file.