Cluster contamination when there are fast firing events #845

Open
wufan0504 opened this issue Jan 2, 2025 · 11 comments

@wufan0504

Describe the issue:

Hi KS team,

I have recordings with some interesting physiological events where many cells spontaneously fire in synchrony. These events can last anywhere from 0.5 to 5 seconds. You can see from the zoomed-out view of the raw data that even the RMS increases significantly during the event. My guess is that many distant spikes contribute to this and that they are not isolatable, which is fine. My issue is that during events like this, KS4 tends to merge very good, high-amplitude spikes with a lot of background "mua" (see the raster plot, which shows the big spikes leading the event mixed with many much smaller "mua" spikes during the event; the raster includes every unit detected by KS4). If I restrict the view to the units KS4 labels "good", most of the clusters do not pass. This is detrimental because a significant number of good spikes/clusters get thrown out for being contaminated during the events. I tried raising the global and learned thresholds (Th), but these "mua" spikes during events are still detected.

Any thoughts or advice would be greatly appreciated.

image image image

Reproduce the bug:

No response

Error message:

No response

Version information:

Kilosort version 4.0.20
Python version 3.9.19
Windows-10-10.0.22631-SP0 AMD64
Using CUDA device: NVIDIA GeForce RTX 4060 Ti 8.00GB

@wufan0504 wufan0504 changed the title BUG: <Please replace this text with a comprehensive title> Cluster contamination when there are fast firing events Jan 2, 2025
@wufan0504
Author

kilosort4.log

Here is the ks4 log. I mainly tried tweaking the thresholds, with/without CAR but found no improvement.

@jacobpennington
Collaborator

Can you please include some screenshots from Phy (or whatever you're using to look through the sorting results) that demonstrate the clustering issues you're seeing?

@wufan0504
Author

hi Jacob

Thank you for the quick response. This problem is best shown in the TraceView in Phy, but I used Neuroscope2 to show something similar that is, IMO, a bit clearer. In the other Phy views, I basically get a lot of clusters labeled "mua", and the waveform view, for example, only plots a subset of spikes, the majority of which are the smaller, contaminating "spikes". So it's hard to tell that these mua clusters contain many good spikes as well.

But in the TraceView (see the screenshot above, where the raster plot shows the KS4-sorted clusters), the blue cluster on channel 36, for example, includes a few very large-amplitude, good spikes, but also a ton of much smaller "spikes" during the event, which has a notable low-frequency field deflection. This is true for other channels as well: the obviously good spikes get mixed with overwhelmingly many obviously bad spikes. As a result, KS4 labels all of the clusters I showed in the screenshots as "mua", so I'm left with very few good clusters for these recordings. The events occur during less than 1% of the recording, yet they affect spikes throughout the entire session.

What I was hoping to accomplish by tweaking the parameters was to stop KS4 from merging the very large, good spikes with the small ones during the events, or better yet to not detect the smaller "spikes" at all. I have tried tuning the thresholds to some very large values, but the resulting clusters still contained both the big and the small spikes.

Does this make sense? Let me know if I should provide any specific views from Phy.

thank you very much in advance

@jacobpennington
Collaborator

Thanks, one other question for now: which values did you try for Th_universal and Th_learned? I see in the log you uploaded that they're not the default values, and you said you tried some different ones.

@wufan0504
Author

I always changed both values, from the default [7, 7] to [20, 20].

@jacobpennington
Collaborator

The default values are [9,8]. Did you try sorting with the defaults at least once?

@wufan0504
Author

wufan0504 commented Jan 3, 2025

Hmm, then I guess not. I will try. Though, having experimented with such a wide range of thresholds, I had concluded that they were not the solution I had hoped for.

Here are some Phy TraceViews to illustrate the problem. The first and second images highlight all "good" units, within and outside of the event. You should see that many big spikes are missed.

image
image

The third image shows a "mua"-labeled unit that fires many good spikes (the biggest ones within the event) and many more outside the event, but it is contaminated by lots of small "mua spikes" because of the rare events. There are many such examples. The view shows a very short time window; if I zoom out, there are many more small, bad spikes mixed into this unit.

image

Please ignore the channel mapping shown in the Phy TraceView. I think there is a bug, but that's a separate issue. I am sure that KS used the correct mapping, and the other tools I used showed the correct map of sorted waveforms.

@wufan0504
Author

Sorry, another piece of info here. The difference between the template view and the raw/mean waveform view of the same unit (92) is striking. The template has its largest waveform on ch 38, which is also what the ClusterView indicates, and it is a positive spike. But in the TraceView, I can see a lot of good spikes with their largest waveform on ch 36, and they are negative spikes.

image

@wufan0504
Author

Just tried Th = [9, 8], and the results are very similar.

@jacobpennington
Collaborator

I will look at the other examples some more, but to your last reply: that is an mua unit with very few spikes. Differences in mean waveform vs template are not surprising for those cases.

@jacobpennington
Collaborator

jacobpennington commented Jan 15, 2025

It does look like there may simply be too many closely-timed spikes during those high firing rate events. The sorting algorithm relies on iteratively subtracting detected spikes within a small window and then repeating spike detection on the residual to detect spikes that are close together in time and space, and that process becomes less effective when the density of spikes is very high.
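
To make the "subtract and re-detect" idea concrete, here is a toy, single-channel sketch in plain Python. It is only an illustration of the general peeling approach, not Kilosort's actual implementation: the function names, the three-sample template, the threshold, and the `max_passes` cap are all made up for the example.

```python
def add_spike(trace, template, t, amp):
    """Add a scaled copy of the template to the trace at offset t (in place)."""
    for k, v in enumerate(template):
        trace[t + k] += amp * v


def peel_spikes(trace, template, threshold, max_passes):
    """Toy peeling loop: detect non-overlapping matches, subtract them, repeat."""
    residual = list(trace)
    m = len(template)
    norm = sum(v * v for v in template)
    found = []
    for _ in range(max_passes):
        # Least-squares amplitude of the template at every offset of the residual.
        scores = [
            sum(residual[t + k] * template[k] for k in range(m)) / norm
            for t in range(len(residual) - m + 1)
        ]
        # Within one pass, keep only the strongest matches that do not overlap.
        picked = []
        for t in sorted(range(len(scores)), key=lambda i: -scores[i]):
            if scores[t] < threshold:
                break
            if all(abs(t - p) >= m for p in picked):
                picked.append(t)
        if not picked:
            break
        # Subtract the detected spikes so the next pass sees what they were hiding.
        for t in picked:
            found.append((t, scores[t]))
            for k in range(m):
                residual[t + k] -= scores[t] * template[k]
    return sorted(found), residual


template = [-1.0, -3.0, -1.0]
trace = [0.0] * 20
add_spike(trace, template, 5, 10.0)  # large spike
add_spike(trace, template, 7, 4.0)   # smaller spike overlapping its tail

one_pass, _ = peel_spikes(trace, template, threshold=3.0, max_passes=1)
many_passes, _ = peel_spikes(trace, template, threshold=3.0, max_passes=5)
print([t for t, _ in one_pass])
print([t for t, _ in many_passes])
```

In this toy trace the smaller spike overlaps the large one, so it is only picked up on a second pass over the residual. With many overlapping spikes packed into a short window, the number of passes needed grows, and the fitted amplitudes become less reliable because each subtraction is slightly off.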

One thing you could try is increasing the new max_peels parameter (upgrade to Kilosort v4.0.24 or pull the latest changes). This will increase the iteration limit for matching pursuit, which may help detect those additional spikes. I would continue using the default values for the detection thresholds.
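
As a rough sketch of what that could look like from Python: the snippet below builds a settings dictionary with the default thresholds mentioned above and a raised `max_peels`. The value 200, the helper names `build_settings` and `sort_recording`, and the arguments passed to them are placeholders for illustration, not recommendations. The `run_kilosort(settings=..., probe_name=...)` call follows the usage shown in the Kilosort4 README as I understand it, so check it against the version you have installed.

```python
def build_settings(n_chan_bin, max_peels=200):
    """Settings dict: default detection thresholds plus a raised peel limit.

    max_peels=200 is an arbitrary illustrative value, not a tuned one.
    """
    return {
        "n_chan_bin": n_chan_bin,  # total channels in the binary file
        "Th_universal": 9,         # default, per the discussion above
        "Th_learned": 8,           # default, per the discussion above
        "max_peels": max_peels,    # requires Kilosort >= 4.0.24
    }


def sort_recording(data_dir, probe_name, n_chan_bin, max_peels=200):
    """Hypothetical wrapper: run Kilosort4 on one recording with these settings."""
    # Imported here so the settings helper can be used without Kilosort installed.
    from kilosort import run_kilosort

    settings = build_settings(n_chan_bin, max_peels=max_peels)
    settings["data_dir"] = data_dir
    return run_kilosort(settings=settings, probe_name=probe_name)


settings = build_settings(n_chan_bin=64)
print(settings)
```

Comparing the results for a couple of `max_peels` values on the same recording, with everything else held fixed, should show whether the extra iterations actually change which spikes get assigned during the events.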
