Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 50 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# ZedProfiler

[![Coverage](https://img.shields.io/badge/coverage-87%25-green)](#quality-gates)
[![Coverage](https://img.shields.io/badge/coverage-99%25-brightgreen)](#quality-gates)

CPU-first 3D image feature extraction toolkit for high-content and high-throughput image-based profiling.

Expand All @@ -18,12 +18,55 @@ just env
## Quick usage (API shape)

```python
from zedprofiler import colocalization # promoted convenience import
from zedprofiler.featurization import texture # stays in feature namespace

# Implementation is scaffolded in PR 1 and will be completed in module PRs.
result = colocalization.compute()
texture_result = texture.compute()
import os
import pathlib
import pandas as pd

from zedprofiler.IO.loading_classes import ImageSetConfig, ImageSetLoader, ObjectLoader
from zedprofiler.featurization.areasizeshape import compute as compute_areasizeshape

image_set_path = pathlib.Path(
os.path.expanduser(
"~/mnt/bandicoot/NF1_organoid_data/data/NF0014_T1/zstack_images/C2-1/"
)
).resolve(strict=True)
mask_set_path = pathlib.Path(
os.path.expanduser(
"~/mnt/bandicoot/NF1_organoid_data/data/NF0014_T1/segmentation_masks/C2-1/"
)
).resolve(strict=True)
image_set_config = ImageSetConfig(
image_set_name="test_set",
raw_image_key_name=["AGP"],
mask_key_name=["Nuclei"],
)
image_set_loader = ImageSetLoader(
image_set_path=image_set_path,
mask_set_path=mask_set_path,
anisotropy_spacing=(1,0.1,0.1),
channel_mapping={
"DNA": 405,
"ER": 488,
"AGP": 555,
"Mito": 640,
"Organoid": "organoid_",
"Cell": "cell_",
"Nuclei": "nuclei_",
"Cytoplasm": "cytoplasm_",
},
config=image_set_config
)

object_loader = ObjectLoader(
image_set_loader=image_set_loader,
channel_name=image_set_config.raw_image_key_name[0],
compartment_name=image_set_config.mask_key_name[0],
)
area_dict_df = pd.DataFrame(compute_areasizeshape(
image_set_loader=image_set_loader,
object_loader=object_loader,
))
area_dict_df
```

## Data Contract
Expand Down
5 changes: 5 additions & 0 deletions ROADMAP.md
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,11 @@ The roadmap is intended to be a living document and may be updated as needed.

- [ ] Changelog, semantic tag, PyPI publish automation, README install updates.

### Phase 4: Post-release improvements (PR 14+)

14. Add the ability for loading of images from array instead of file
file paths for increased flexibility in data sources and testing.

## Verification Gates

- [ ] Run full unit and integration tests on Linux with coverage >=85%.
Expand Down
3 changes: 3 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,10 @@ classifiers = [
dependencies = [
"fire>=0.7.1",
"jinja2>=3.1.6",
"numpy>=1.26",
"pandas>=3.0.2",
"pyarrow>=16",
"scikit-image>=0.24",
]
scripts.ZedProfiler = "ZedProfiler.cli:trigger"

Expand Down
9 changes: 8 additions & 1 deletion src/zedprofiler/featurization/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,14 @@
modules to be promoted at the package top-level.
"""

from . import areasizeshape, colocalization, granularity, intensity, neighbors, texture
from . import (
areasizeshape,
colocalization,
granularity,
intensity,
neighbors,
texture,
)

__all__ = [
"areasizeshape",
Expand Down
166 changes: 162 additions & 4 deletions src/zedprofiler/featurization/areasizeshape.py
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this code is very clean! Though, I highly recommend more code comments here. There are instances where area and volume are used interchangeably, which is not always clear, and I find the section on labels unclear about why we are avoiding 0 as a label (especially if it is 0-indexed).

Original file line number Diff line number Diff line change
@@ -1,10 +1,168 @@
"""Area/size/shape featurization module scaffold."""
"""Area, size, and shape features for 3D objects."""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered renaming to Volume or Surface area to be more specific towards 3D featurization? This is more of a nit-pick as it is pretty understandable with the current naming.


from __future__ import annotations

from collections.abc import Sequence
from importlib import import_module
from typing import Protocol

import numpy as np

from zedprofiler.exceptions import ZedProfilerError


def compute() -> dict[str, list[float]]:
"""Placeholder for area/size/shape computation implementation."""
raise ZedProfilerError("areasizeshape.compute is not implemented yet")
class SupportsImageSetLoader(Protocol):
"""Minimal image-set loader interface required by this module."""

anisotropy_spacing: tuple[float, float, float]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend adding comment here on what anisotropy spacing is. Someone new to 3D might not have knowledge of this already and it is probably safer to be more explicit here.

Suggested change
anisotropy_spacing: tuple[float, float, float]
# Pixel size in x,y,z space?
anisotropy_spacing: tuple[float, float, float]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is this loading in an image-set or is this setting the spacing distances for 3D? The name SupportsImageSetLoader makes it sound like it is taking in the image path and loading it in without any other context.



class SupportsObjectLoader(Protocol):
"""Minimal object loader interface required by this module."""

label_image: np.ndarray
object_ids: Sequence[int]

Comment on lines +20 to +25
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is label_image? As an array that sounds like it will be the FOV since we usually call FOVs images, but for AreaShape that FOV only matters for segmentation. In this case, is label_image actually the label_mask from the segmentation module?


def _empty_feature_result() -> dict[str, list[float]]:
"""Return deterministic empty output schema for area/size/shape features."""
return {
"object_id": [],
"Volume": [],
"CenterX": [],
"CenterY": [],
"CenterZ": [],
"BboxVolume": [],
"MinX": [],
"MaxX": [],
"MinY": [],
"MaxY": [],
"MinZ": [],
"MaxZ": [],
"Extent": [],
"EulerNumber": [],
"EquivalentDiameter": [],
"SurfaceArea": [],
}


def compute(
image_set_loader: SupportsImageSetLoader | None = None,
object_loader: SupportsObjectLoader | None = None,
) -> dict[str, list[float]]:
"""Compute area/size/shape features for one object loader.

This supports two invocation modes:
- no arguments: returns an empty deterministic schema so dispatchers can
call the function without crashing.
- both loaders provided: executes feature extraction.
"""
if image_set_loader is None and object_loader is None:
return _empty_feature_result()
if image_set_loader is None or object_loader is None:
raise ZedProfilerError(
"areasizeshape.compute requires both image_set_loader and "
"object_loader for execution."
)

return measure_3D_area_size_shape(
image_set_loader=image_set_loader,
object_loader=object_loader,
)


def _get_skimage_measure() -> object:
"""Return `skimage.measure` or raise a user-facing dependency error."""
try:
return import_module("skimage.measure")
except ImportError as exc:
raise ZedProfilerError(
"areasizeshape requires scikit-image for area/size/shape computation."
) from exc


def calculate_surface_area(
label_object: np.ndarray,
props: dict[str, np.ndarray],
spacing: tuple[float, float, float],
) -> float:
"""Calculate surface area for one labeled object using marching cubes."""
measure = _get_skimage_measure()

volume = label_object[
max(props["bbox-0"][0], 0) : min(props["bbox-3"][0], label_object.shape[0]),
max(props["bbox-1"][0], 0) : min(props["bbox-4"][0], label_object.shape[1]),
max(props["bbox-2"][0], 0) : min(props["bbox-5"][0], label_object.shape[2]),
]
volume_truths = volume > 0
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this here to avoid any object with a volume = 0?

verts, faces, _normals, _values = measure.marching_cubes(
volume_truths,
method="lewiner",
spacing=spacing,
level=0,
)
return measure.mesh_surface_area(verts, faces)


def measure_3D_area_size_shape(
image_set_loader: SupportsImageSetLoader,
object_loader: SupportsObjectLoader,
) -> dict[str, list[float]]:
"""Measure area/size/shape features for each non-zero label object."""
measure = _get_skimage_measure()

label_object = object_loader.label_image
spacing = image_set_loader.anisotropy_spacing
unique_objects = object_loader.object_ids

features_to_record = _empty_feature_result()

desired_properties = [
"area",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't area not a 3D feature? Should this be volume?

"bbox",
"centroid",
"bbox_area",
"extent",
"euler_number",
"equivalent_diameter",
]
for label in unique_objects:
if label == 0:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this label ID or what is 0 in this case?

continue
subset_lab_object = label_object.copy()
subset_lab_object[subset_lab_object != label] = 0
Comment on lines +132 to +133
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused by subset_lab_object. I can see it is a copy of label_object, so what makes it a subset in this case? Also, should it be subet_label_object?

props = measure.regionprops_table(
subset_lab_object,
properties=desired_properties,
)

features_to_record["object_id"].append(label)
features_to_record["Volume"].append(props["area"].item())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might get a bit confusing to interchange volume and area here. Recommend sticking to one convention if possible.

features_to_record["CenterX"].append(props["centroid-2"].item())
features_to_record["CenterY"].append(props["centroid-1"].item())
features_to_record["CenterZ"].append(props["centroid-0"].item())
features_to_record["BboxVolume"].append(props["bbox_area"].item())
features_to_record["MinX"].append(props["bbox-2"].item())
features_to_record["MaxX"].append(props["bbox-5"].item())
features_to_record["MinY"].append(props["bbox-1"].item())
features_to_record["MaxY"].append(props["bbox-4"].item())
features_to_record["MinZ"].append(props["bbox-0"].item())
features_to_record["MaxZ"].append(props["bbox-3"].item())
features_to_record["Extent"].append(props["extent"].item())
features_to_record["EulerNumber"].append(props["euler_number"].item())
features_to_record["EquivalentDiameter"].append(
props["equivalent_diameter"].item()
)

try:
features_to_record["SurfaceArea"].append(
calculate_surface_area(
label_object=label_object,
props=props,
spacing=spacing,
)
)
except (RuntimeError, ValueError):
features_to_record["SurfaceArea"].append(np.nan)

return features_to_record
Loading
Loading