Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

calc_rdm_movie can change the type of the observation descriptors #345

Open
caiw opened this issue Sep 19, 2023 · 2 comments
Open

calc_rdm_movie can change the type of the observation descriptors #345

caiw opened this issue Sep 19, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@caiw
Copy link
Member

caiw commented Sep 19, 2023

Describe the bug
If you pass a dictionary whose values are numpy.int64 to calc_rdm_movie's obs_descriptors argument, the resultant RDMs object will have .pattern_descriptors value types of numpy.float64 (which annoyingly changes they way they print).

To Reproduce
Following the Temporal RSA demo, we have the following set (among other variables):

with Path(path.abspath(""), "data", "TemporalSampleData", "meg_sample_data.pkl").open("rb") as d:
    meg_sample_data = pickle.load(d)
# ...
condition_indices = meg_sample_data["cond_idx"]
print(type(condition_indices[0]))  # <class 'numpy.int64'>

Later on in the demo, we create a TemporalDataset and run calc_rdm_movie:

observ_des = {"conditions": condition_indices}
# ...
dataset = TemporalDataset(measurements, descriptors=des, obs_descriptors=observ_des, channel_descriptors=chan_des, time_descriptors=time_des)
print(dataset.obs_descriptors["conditions"][:5])  # [1, 1, 1, 1, 1]
print(type(dataset.obs_descriptors["conditions"][0]))  # <class 'numpy.int64'>

so no funny business there.

Following further down we run calc_rdm_movie:

rdms_data = calc_rdm_movie(dataset, method="correlation", descriptor="conditions")
print(rdms_data.pattern_descriptors["conditions"])  # [1.0, 2.0, 3.0, 4.0]
print(type(rdms_data.pattern_descriptors["conditions"][0]))  # <class 'numpy.float64'>

Expected behavior
The final two prints should look like:

print(rdms_data.pattern_descriptors["conditions"])  # [1, 2, 3, 4]
print(type(rdms_data.pattern_descriptors["conditions"][0]))  # <class 'numpy.int64'>

Versions
Please report:

  • the version of rsatoolbox you used (pip freeze | grep rsatoolbox)
    • 0.1.3
  • the version of python you used (python --version)
    • 3.11.5
@caiw caiw added the bug Something isn't working label Sep 19, 2023
@caiw
Copy link
Member Author

caiw commented Sep 19, 2023

Ok, digging further, the culprit here is TemporalDataset.convert_to_dataset, which is called by calc_rdm_movie, at least in v0.1.3. Around line 735 it repeatedly calls np.concatenate, starting with an empty list, thereby converting the dtype to numpy.float64.

The solution would probably be to do something like detect and reset the datatype, but TemporalDataset.convert_to_dataset is now deprecated, so not sure how far to pursue this.

@JasperVanDenBosch
Copy link
Contributor

We also more generally need to discuss descriptor datatypes I think. Many methods (naturally) make certain assumptions about them, which don't always hold up because we are quite permissive right now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants