
IPPM graph refactor #391

Open · caiw wants to merge 65 commits into main from ippm-graph-refactor
Conversation

caiw (Member) commented Oct 25, 2024

Main changes

Note

I'M SORRY that this is such a big diff. I'm aware it makes it hard to review. If I had had more time I'd have written a shorter letter, etc.

Also, I'm sorry that I have renamed some classes/files in a way which prevents GitHub from showing diffs properly. In particular, builder.py and IPPMBuilder are more or less replaced by graph.py and IPPMGraph. data_tools.py was something of a collection of miscellaneous functionality, which has been moved into other classes.

Little fixes

A few issues I created and fixed in the course of the refactor.

Unblocks

Issues removed from scope, but which can now be tackled, hopefully more easily.

@caiw added labels ⚙️ refactor (changes which relate to code structure rather than functionality) and IPPM generation on Oct 25, 2024
@caiw caiw self-assigned this Oct 25, 2024
@caiw force-pushed the ippm-graph-refactor branch from f91409f to b949c97 on November 1, 2024 13:19
@caiw force-pushed the ippm-graph-refactor branch from cd9e56b to 26d14ec on November 1, 2024 16:14
caiw (Member, Author) commented Jan 11, 2025

Down to 22 failing tests from 36 :)

caiw (Member, Author) commented Jan 11, 2025

And all the changes are improvements: fixing things, adding new tests, improving comments. Testing rules! Thanks AGAIN to @anirudh1666 for setting up such a comprehensive suite for this code. 🙏

@caiw caiw requested review from anirudh1666 and neukym January 17, 2025 21:30
@caiw caiw marked this pull request as ready for review January 17, 2025 23:16
@caiw added the 💪 enhancement (new feature or request) label on Jan 17, 2025
caiw (Member, Author) commented Jan 18, 2025

Note to self, for speed: instead of clearing tens of thousands of points, simply filter down to the few you want to keep. It'll be much faster, and uses nearly the same numpy code!
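A minimal sketch of the idea in the note above: rather than deleting the many points you do not want, build a boolean mask of the few you do want and index with it once. The array names and threshold here are illustrative, not from the PR.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for tens of thousands of (latency, log p-value) points.
latencies = rng.uniform(0.0, 0.8, size=50_000)
logp_values = rng.uniform(-50.0, 0.0, size=50_000)

# Keep only the points more significant than an (illustrative) threshold.
threshold = -35.0
keep = logp_values < threshold  # boolean mask of points to retain

# One vectorised fancy-indexing pass, instead of repeated deletions.
filtered_latencies = latencies[keep]
filtered_logp = logp_values[keep]
```

Boolean-mask indexing copies only the surviving points in a single pass, whereas repeatedly removing points (e.g. with `np.delete`) copies the whole array each time.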

neukym (Member) commented Jan 18, 2025

Amazing. @anirudh1666 - can you take a first look?

caiw (Member, Author) commented Jan 18, 2025

Oof... it's really slow. I did some optimisations this morning, but it's still noticeably slower than the original code. Something to work on with proper profiling, though I do have a couple more ideas.

Also, I confirmed (by running the demo) that the failing tests are due to the denoising not working as intended. I'm sure it's an easy fix, as I was only changing data structures, not algorithms; probably just something silly. Still, sorry to be giving you broken code to read, @anirudh1666, but I'd really value your input at this stage.

expression_set = denoising_strategy.denoise(expression_set)

# Build the graph
self._graphs: dict[str, IPPMGraph] = dict()
Collaborator comment:

nit: For clarity, it may help to place self._graphs near the other attribute definitions. Usually the constructor contains all of a class's attributes in one place, for easy reference.
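A short sketch of the convention the reviewer is suggesting: declare every attribute in `__init__`, in one place, even if some are only populated later. `IPPMGraph` is the class from this PR; the surrounding class name and second attribute are illustrative assumptions only.

```python
from __future__ import annotations  # lets the IPPMGraph annotation stay unevaluated


class IPPM:
    """Illustrative container class; only the attribute-declaration style matters."""

    def __init__(self) -> None:
        # All attributes declared up front, in one place, for easy reference.
        self._graphs: dict[str, IPPMGraph] = dict()  # populated after denoising
        self._built: bool = False                    # flipped once graphs are built
```

Declaring attributes together in the constructor gives a reader a single place to see the full state of an instance, rather than discovering attributes scattered through later methods.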

@@ -5,26 +5,24 @@
from __future__ import annotations
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of densify_data_block in this file?

should_merge_hemis: bool = False,
should_exclude_insignificant: bool = True,
should_shuffle: bool = True,
should_normalise: bool,
Collaborator comment:

It might be worthwhile to add default values for some of these parameters. It comes down to which parameters you think will have high variance across runs and which will remain constant; the constant ones can be made optional with default values.
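A hedged sketch of what that could look like. The flag names (`should_merge_hemis`, `should_exclude_insignificant`, `should_shuffle`, `should_normalise`, etc.) appear in the diff and comments; gathering them into this one keyword-only signature, and the particular defaults, are illustrative assumptions.

```python
def configure_denoiser(
    *,
    should_merge_hemis: bool = False,
    should_exclude_insignificant: bool = True,
    should_shuffle: bool = True,              # constant across most runs
    should_normalise: bool = False,           # rarely varied; defaulted off
    should_cluster_only_latency: bool = False,
    should_max_pool: bool = False,
) -> dict[str, bool]:
    """Collect denoising flags, defaulting the ones that rarely vary."""
    return {
        "should_merge_hemis": should_merge_hemis,
        "should_exclude_insignificant": should_exclude_insignificant,
        "should_shuffle": should_shuffle,
        "should_normalise": should_normalise,
        "should_cluster_only_latency": should_cluster_only_latency,
        "should_max_pool": should_max_pool,
    }
```

Keyword-only parameters (`*,`) make call sites self-documenting, and callers only need to spell out the flags they are actually varying.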

Collaborator comment:

I think should_normalise, should_cluster_only_latency and should_max_pool will be False for most runs, unless someone wants to experiment.

should_shuffle will be True. As for the threshold for significance, I'm not sure, although estimating it from the data seems better than an absolute threshold (i.e. exclude_points_above_n_sigma rather than exclude_logp_vals_above).
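A small sketch of the data-driven threshold idea in the comment above, as opposed to a fixed absolute cut-off. The function name and the convention that more-negative log p-values are more significant are assumptions for illustration; `exclude_points_above_n_sigma` itself is only named, not defined, in the thread.

```python
import numpy as np


def sigma_threshold(logp_vals: np.ndarray, n_sigma: float = 3.0) -> float:
    """Estimate a significance cut-off from the data itself.

    Returns the value n_sigma standard deviations below the mean log p-value
    (more negative = more significant), instead of a fixed absolute threshold.
    """
    return float(np.mean(logp_vals) - n_sigma * np.std(logp_vals))


# Illustrative data: log p-values roughly centred on -10 with spread 5.
rng = np.random.default_rng(1)
logp_vals = rng.normal(loc=-10.0, scale=5.0, size=10_000)

threshold = sigma_threshold(logp_vals, n_sigma=3.0)
significant = logp_vals[logp_vals < threshold]
```

Because the cut-off scales with the spread of the data, the same `n_sigma` setting adapts across datasets where a hard-coded log-p threshold would not.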
