IPPM graph refactor #391
…xes tests of IPPMGraph) - Fix a bug in generating the serial sequence for complex CTLs
Down to 22 failing tests from 36 :)
And all the changes are improving things, adding new tests, improving comments. Testing rules! Thanks AGAIN to @anirudh1666 for setting up such a comprehensive suite for this code. 🙏
Note to self to speed things up: instead of clearing tens of thousands of points, simply filter the few you want to keep. It'll be much faster, and will use nearly the same numpy code!
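A minimal sketch of that idea, not the code in this PR: the function names, the fill value and the array sizes are all made up for illustration. It only shows that the two approaches give the same result; whether filtering is actually faster here would still need profiling.

```python
import numpy as np


def clear_unwanted(values: np.ndarray, unwanted: np.ndarray, fill: float = 0.0) -> np.ndarray:
    """Clearing approach: copy everything, then overwrite the many unwanted entries."""
    out = values.copy()
    out[unwanted] = fill
    return out


def keep_wanted(values: np.ndarray, wanted: np.ndarray, fill: float = 0.0) -> np.ndarray:
    """Filtering approach: start from an all-`fill` array, then copy over the few wanted entries."""
    out = np.full_like(values, fill)
    out[wanted] = values[wanted]
    return out


rng = np.random.default_rng(0)
values = rng.normal(size=50_000)                          # hypothetical data
wanted = np.array([3, 17, 4_242])                         # the few points to keep
unwanted = np.setdiff1d(np.arange(values.size), wanted)   # everything else

# Both routes produce the same array; only the amount of indexing work differs.
assert np.array_equal(clear_unwanted(values, unwanted), keep_wanted(values, wanted))
```

The indexing in both functions is nearly identical, which is the "nearly the same numpy code" point: only the index set changes from the large complement to the small set being kept.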
Amazing. @anirudh1666 - can you take a first look?
Oof.. it's really slow. I did some optimisations this morning but it's still noticeably slower than the original code - something to work on with proper profiling, though I do have a couple more ideas. Also I confirmed (by running the demo) that the failing tests are a function of the denoising not working as intended. I'm sure it's an easy fix as I was only changing data structures, not algorithms - probably just something silly. Still sorry to be giving you broken code to read @anirudh1666 - but I'd still really value your input at this stage.
expression_set = denoising_strategy.denoise(expression_set)

# Build the graph
self._graphs: dict[str, IPPMGraph] = dict()
nit: for clarity, it may help to place self._graphs near the definition in the comments. Usually the constructor contains all of the attributes for a class in one place for easy reference
@@ -5,26 +5,24 @@
from __future__ import annotations
What is the purpose of densify_data_block in this file?
should_merge_hemis: bool = False,
should_exclude_insignificant: bool = True,
should_shuffle: bool = True,
should_normalise: bool,
It might be worthwhile to add default values for these parameters. It comes down to which parameters you expect to vary a lot across runs and which will remain constant. The constant ones can be given default values and made optional.
I think should_normalise, should_cluster_only_latency and should_max_pool will be False for most runs unless someone wants to experiment.
should_shuffle will be True. As for the significance threshold, I'm not sure, although estimating it from the data seems better than an absolute threshold (exclude_points_above_n_sigma > exclude_logp_vals_above).
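To make the suggestion concrete, here is a rough sketch under stated assumptions: DenoisingOptions is a hypothetical container, not a class from this PR, and the value 5.0 is only a placeholder. The flags expected to stay constant get the defaults proposed above, while the threshold is left without a default because no value has been agreed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DenoisingOptions:
    """Hypothetical options container, for illustration only."""

    # No default: the significance threshold is still undecided in this thread,
    # so the caller has to choose it explicitly.
    exclude_points_above_n_sigma: float
    # Expected to be constant across most runs, so these get defaults.
    should_shuffle: bool = True
    should_normalise: bool = False
    should_cluster_only_latency: bool = False
    should_max_pool: bool = False


# A typical run only has to state the threshold (5.0 is a placeholder value).
typical = DenoisingOptions(exclude_points_above_n_sigma=5.0)

# An experimental run overrides just the flag it cares about.
experiment = DenoisingOptions(exclude_points_above_n_sigma=5.0, should_max_pool=True)
```

Fields without defaults have to come before fields with defaults in a dataclass, which is why the threshold is listed first.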
Main changes
ExpressionSets #404: Try to avoid creating new datastructures which look similar to existing ones, where existing ones can be used.
SensorExpressionSets #402 for free
plot.expression_plot and ippm.stem_plot #400 for free
ExpressionSets, and making it so that any ExpressionSet can produce an IPPM graph, we kind of get this one for free too.
General compartmentalisation of functionality.
networkx.DiGraph.
self.var = self.func(...), and self.func itself mutates self.var before returning a value which replaces the mutation, it is not always clear which assignments to self.var will end up "sticking" without following out to the calling code.
Note
I'M SORRY that this is such a big diff. I'm aware it makes it hard to review. If I had had more time I'd have written a shorter letter, etc.
Also I'm sorry that I have renamed some classes/files in a way which prevents GitHub from showing diffs properly. In particular builder.py and IPPMBuilder are more or less replaced by grapy.py and IPPMGraph. data_tools.py was somewhat a collection of miscellaneous functionality which has been moved into other classes.
Little fixes
A few issues I created and fixed in the course of the refactor.
None supplied as the first or second element #420 as part of a merge-conflict resolution
Unblocks
Issues removed from scope, but which can now be tackled, hopefully more easily.