Remove the need for imzml and poslog files for core cropping #28

Open
1 of 3 tasks
alex-l-kong opened this issue Dec 2, 2024 · 0 comments

Relevant background

The existing pipeline requires the specification of a poslog file for core cropping. This can be difficult because:

  1. If multiple TMAs are combined, the poslogs have to be specified in the order of acquisition
    a. This won't be an issue after the pyTDFSDK workflow is merged, but the other points still stand
  2. Users have to specify the full path to each poslog
  3. Having to load in the imzml file again will be difficult, especially once the core cropping step gets separated into a different notebook

The ark-analysis repo already uses the labeling algorithms provided by scikit-image to accomplish this without the need for an external coordinate list. A similar workflow can be added to the maldi-pipeline repo.

Design overview

After the core list has been created using TSAI, the skimage.measure.label function can be used to automatically segment out the cores and their corresponding coordinates. For each unique label, which corresponds to one core, we can then extract that core's coordinates from the labeled array and match them to the core centroid found in the TSAI list.

Note that the "blank centroid" issue still needs to be addressed, although the existing logic can be ported over to this new workflow.
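As a rough illustration of the labeling step described above, the following minimal sketch runs skimage.measure.label on a toy mask and collects the pixel coordinates of each labeled component. The mask contents and the core_coords name are made up for illustration; a real glycan mask would be loaded from disk instead.

```python
import numpy as np
from skimage.measure import label, regionprops

# Toy binary mask with two separate "cores" (hypothetical data).
mask = np.zeros((6, 8), dtype=np.uint8)
mask[1:3, 1:3] = 255  # first core
mask[3:5, 5:7] = 255  # second core

# connectivity=1 joins only pixels that share an edge (4-connectivity in 2D).
labeled = label(mask, connectivity=1, background=0)

# Map each component label to an (N, 2) array of its (row, col) pixels.
core_coords = {region.label: region.coords for region in regionprops(labeled)}

for core_label, coords in core_coords.items():
    print(core_label, len(coords))
```

Each key of core_coords is one candidate core. Matching those keys to the TSAI centroid names is a separate step.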

Code mockup

The following code written by @kxleow provides a template we can use:

import json
import os
from pathlib import Path
from typing import Dict, Tuple, Union

import numpy as np
from alpineer.io_utils import validate_paths
from skimage.io import imread, imsave
from skimage.measure import label, regionprops


def match_centroids_to_cores(
    glycan_crop_save_dir: Union[str, Path],
    glycan_mask_path: Union[str, Path],
    centroid_core_mapping: Union[str, Path],
) -> None:
    validate_paths([glycan_mask_path, centroid_core_mapping])

    # Accept str or Path, and create the output directory if it is missing
    glycan_crop_save_dir = Path(glycan_crop_save_dir)
    os.makedirs(glycan_crop_save_dir, exist_ok=True)

    # Step 1: Load the mask and label its connected components (4-connectivity)
    glycan_mask: np.ndarray = imread(glycan_mask_path)
    labeled_image: np.ndarray = label(glycan_mask, connectivity=1, background=0)

    # Count the components
    num_labels: int = int(labeled_image.max())

    # Get properties of labeled regions
    regions: list = regionprops(labeled_image)

    # Step 2: Load the JSON file
    with open(centroid_core_mapping, "r") as infile:
        centroid_data: dict = json.load(infile)

    # Prepare a mapping of FOV coordinates to their names
    # (relies on centerPointPixels listing its values in x, y order)
    fov_mapping: Dict[Tuple, str] = {
        tuple(core["centerPointPixels"].values()): core["name"]
        for core in centroid_data["fovs"]
    }

    # Step 3: Match FOV coordinates to regions
    # NOTE: we should think about optimizing this section, since iterating through several coordinates adds extra complexity
    component_to_fov: Dict[int, str] = {}
    for region in regions:
        # Get the region's coordinates: an (N, 2) array of (row, col) pixels
        coords: np.ndarray = region.coords

        for coord in coords:
            x, y = coord[1], coord[0]  # skimage stores (row, col), so swap to (x, y)
            if (x, y) in fov_mapping:
                component_to_fov[region.label] = fov_mapping[(x, y)]
                break  # Stop searching once we map this region
        else:
            # Runs only when no pixel in this region matched a centroid
            # TODO: add logic to handle the "blank centroid" case
            pass

    # Create a renamed labeled image excluding unmatched regions
    renamed_image: np.ndarray = np.zeros_like(labeled_image, dtype=int)
    for region in regions:
        if region.label in component_to_fov:
            renamed_image[labeled_image == region.label] = region.label

    # Step 4: Save individual FOVs as TIFFs
    for region_label, fov_name in component_to_fov.items():
        # Create a binary mask for the specific region
        fov_mask: np.ndarray = (labeled_image == region_label).astype(np.uint8) * 255

        # Save the binary mask as a TIFF
        output_path: Path = glycan_crop_save_dir / f"{fov_name}.tiff"
        imsave(output_path, fov_mask)
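One possible answer to the NOTE in the mockup about optimizing the matching step: instead of scanning every pixel of every region, index the labeled image directly at each centroid, which costs one lookup per core. The sketch below is only an illustration under stated assumptions. The function name match_by_centroid_lookup, the FOV names, and the toy array are hypothetical, and centroids are assumed to be integer (x, y) pixel positions. Plain nested lists stand in for the labeled array so the example has no dependencies; the same indexing works on a NumPy array returned by skimage.measure.label.

```python
from typing import Dict, List, Sequence, Tuple


def match_by_centroid_lookup(
    labeled_image: Sequence[Sequence[int]],
    fov_mapping: Dict[Tuple[int, int], str],
) -> Tuple[Dict[int, str], List[str]]:
    """Map component labels to FOV names by indexing at each centroid.

    Returns the label-to-name mapping plus the names of centroids that
    landed on background (label 0) or outside the image.
    """
    n_rows = len(labeled_image)
    n_cols = len(labeled_image[0]) if n_rows else 0

    component_to_fov: Dict[int, str] = {}
    unmatched: List[str] = []
    for (x, y), fov_name in fov_mapping.items():
        in_bounds = 0 <= y < n_rows and 0 <= x < n_cols
        # Rows are indexed by y and columns by x, matching (row, col) order.
        component = int(labeled_image[y][x]) if in_bounds else 0
        if component == 0:
            # Centroid is not on any labeled component; leave it for
            # whatever blank-centroid handling the pipeline applies.
            unmatched.append(fov_name)
        else:
            component_to_fov[component] = fov_name
    return component_to_fov, unmatched


# Toy labeled image: 0 is background, 1 and 2 are two cores.
labeled = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
]
fov_mapping = {(1, 1): "R1C1", (5, 2): "R1C2", (3, 0): "R1C3"}

component_to_fov, unmatched = match_by_centroid_lookup(labeled, fov_mapping)
print(component_to_fov)  # {1: 'R1C1', 2: 'R1C2'}
print(unmatched)  # ['R1C3']
```

Centroids that fall on background are returned separately rather than silently dropped, so the existing "blank centroid" logic could be applied to just those names.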

Required inputs

Same as before, except without the need for a poslog or an imzml file.

Output files

Same as before.

Timeline
Give a rough estimate for how long you think the project will take. In general, it's better to be too conservative rather than too optimistic.

  • A couple days
  • A week
  • Multiple weeks. For large projects, make sure to agree on a plan that isn't just a single monster PR at the end.

Estimated date when a fully implemented version will be ready for review:

Before the winter closure

Estimated date when the finalized project will be merged in:

Just after the winter closure

@alex-l-kong alex-l-kong self-assigned this Dec 2, 2024