Relevant background
The existing pipeline requires the specification of a poslog file for core cropping. This can be difficult because:
1. If multiple TMAs are combined, the poslogs have to be specified in the order of acquisition
   a. This won't be an issue after the pyTDFSDK workflow is merged, but the other points still stand
2. Users have to specify the full path to each poslog
3. Having to load in the imzml file again will be difficult, especially once the core cropping step gets separated into a different notebook
The ark-analysis repo already uses labeling algorithms provided by scikit-image to accomplish this without the need for an external coordinate list. A similar workflow can be added to the maldi-pipeline repo.
Design overview
After the core list has been created using TSAI, the skimage.measure.label function can be used to automatically segment out the cores and their corresponding coordinates. For each unique label, which corresponds to a core, we can then extract its coordinates from the array and match them to the core centroid found in the TSAI list.
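As a quick illustration of the labeling behavior (a minimal sketch on a toy array, not the actual glycan mask), skimage.measure.label assigns a unique integer to each connected component of the foreground:

import numpy as np
from skimage.measure import label

# Hypothetical mask with two disconnected "cores"
mask = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
])
labeled = label(mask, connectivity=1, background=0)
print(labeled)
# [[1 1 0 0 0]
#  [1 1 0 2 2]
#  [0 0 0 2 2]]
print(labeled.max())  # 2 components, one per core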
Note that the "blank centroid" issue still needs to be addressed, although the existing logic can be ported over to this new workflow.
Code mockup
The following code written by @kxleow provides a template we can use:
import json
import os
from pathlib import Path
from typing import Dict, List, Tuple, Union

import numpy as np
from alpineer.io_utils import validate_paths
from skimage.io import imread, imsave
from skimage.measure import label, regionprops


def match_centroids_to_cores(
    glycan_crop_save_dir: Union[str, Path],
    glycan_mask_path: Union[str, Path],
    centroid_core_mapping: Union[str, Path]
) -> None:
    validate_paths([glycan_mask_path, centroid_core_mapping])
    if not os.path.exists(glycan_crop_save_dir):
        os.makedirs(glycan_crop_save_dir)

    glycan_mask: np.ndarray = imread(glycan_mask_path)

    # Label connected components (4-connectivity)
    labeled_image: np.ndarray = label(glycan_mask, connectivity=1, background=0)

    # Count the components
    num_labels: int = labeled_image.max()

    # Get properties of labeled regions (a list of RegionProperties objects)
    regions = regionprops(labeled_image)

    # Step 2: Load the JSON file
    with open(centroid_core_mapping, "r") as infile:
        centroid_data: dict = json.load(infile)

    # Prepare a mapping of FOV coordinates to their names
    fov_mapping: Dict[Tuple, str] = {
        tuple(core["centerPointPixels"].values()): core["name"]
        for core in centroid_data["fovs"]
    }

    # Step 3: Match FOV coordinates to regions
    # NOTE: we should think about optimizing this section, since iterating through
    # several coordinates adds extra complexity (see the sketch after this mockup)
    component_to_fov: Dict[int, str] = {}
    for region in regions:
        # Get the region's coordinates: an (N, 2) array of (row, col) pixels
        coords = region.coords
        for coord in coords:
            x, y = coord[1], coord[0]  # regionprops coords are (row, col)
            if (x, y) in fov_mapping:
                component_to_fov[region.label] = fov_mapping[(x, y)]
                break  # Stop searching once we map this region
        else:
            # TODO: add logic to handle the "blank centroid" case
            pass

    # Create a renamed labeled image excluding unmatched regions
    renamed_image: np.ndarray = np.zeros_like(labeled_image, dtype=int)
    for region in regions:
        if region.label in component_to_fov:
            renamed_image[labeled_image == region.label] = region.label

    # Step 4: Save individual FOVs as TIFFs
    for region_label, fov_name in component_to_fov.items():
        # Create a binary mask for the specific region
        fov_mask: np.ndarray = (labeled_image == region_label).astype(np.uint8) * 255

        # Save the binary mask as a TIFF
        output_path: Path = Path(glycan_crop_save_dir) / f"{fov_name}.tiff"
        imsave(output_path, fov_mask)
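Regarding the optimization NOTE in Step 3: rather than scanning every pixel of every region for a centroid hit, one option is to invert the lookup and index labeled_image directly at each centroid, which is linear in the number of centroids instead of the number of foreground pixels. A minimal sketch, assuming centerPointPixels stores x/y pixel coordinates that land inside their core (match_centroids_by_indexing is a hypothetical helper, not existing pipeline code):

def match_centroids_by_indexing(
    labeled_image: np.ndarray, centroid_data: dict
) -> Dict[int, str]:
    component_to_fov: Dict[int, str] = {}
    for core in centroid_data["fovs"]:
        # Assumes centerPointPixels has "x" (col) and "y" (row) keys
        x = core["centerPointPixels"]["x"]
        y = core["centerPointPixels"]["y"]
        region_label = int(labeled_image[y, x])  # numpy indexing is (row, col)
        if region_label != 0:
            component_to_fov[region_label] = core["name"]
        # else: centroid fell on background, i.e. the "blank centroid" case
    return component_to_fov

Labels that never appear in the returned mapping are the unmatched regions, so the same exclusion logic in the renamed_image step would still apply.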
Required inputs
Same as before, except without the need for a poslog or an imzml file.
Output files
Same as before
Timeline
Give a rough estimate for how long you think the project will take. In general, it's better to be too conservative rather than too optimistic.
- A couple days
- A week
- Multiple weeks. For large projects, make sure to agree on a plan that isn't just a single monster PR at the end.
Estimated date when a fully implemented version will be ready for review:
Before the winter closure
Estimated date when the finalized project will be merged in:
Just after the winter closure