Different Data Size #11

Open
sean0042 opened this issue Aug 28, 2022 · 2 comments
Comments

@sean0042
Hi.
I ran mp/mp.py, but the data statistics differ from your results.
My train, valid, and test sizes are (33954, 4908, 9822); the original sizes are (33997, 4918, 9830).

My MIMIC-III version is 1.4 (the latest). I think the difference comes from the pandas version.
Could you tell me which pandas version your environment uses?
Thank you
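
For reference, the gap between the two sets of split sizes quoted above can be quantified with a few lines of Python. This is only a sketch of the arithmetic: the numbers are copied from this comment, and the function name `split_gaps` is a hypothetical helper that is not part of the repository.

```python
# Split sizes as reported in this issue (train, valid, test).
observed = {"train": 33954, "valid": 4908, "test": 9822}
original = {"train": 33997, "valid": 4918, "test": 9830}


def split_gaps(observed, original):
    """Return, per split, (missing row count, missing fraction of the original)."""
    gaps = {}
    for name, expected in original.items():
        missing = expected - observed[name]
        gaps[name] = (missing, missing / expected)
    return gaps


gaps = split_gaps(observed, original)
for name, (missing, fraction) in gaps.items():
    print(f"{name}: {missing} fewer rows ({fraction:.2%} of the original split)")
```

Every split is smaller than the original by well under one percent, so the difference involves a small number of admissions rather than a whole subset of the data.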

@bvanaken
Owner

Hi,
we have made some adjustments to the mortality prediction task in the meantime, because some cases remained in which the patient's death was described in the notes. I guess the difference comes from these changes.

If you want to replicate the original dataset, you can run the mp.py script as committed on 2021/09/05: https://github.com/bvanaken/clinical-outcome-prediction/commits/master/tasks/mp/mp.py

Best
Betty

@rochanaph
Just in case anyone has the same issue of not getting any data subset when running the code on the latest version: I downgraded all packages in requirements.txt to versions released before 2021/09 to make it work. (I haven't checked the data proportions, though.)

numpy==1.21.0 pandas==1.3.2 nltk==3.6.2
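
To confirm that an environment actually matches the pins above, a small check along these lines may help. It is a minimal sketch that uses only the standard-library `importlib.metadata` module (Python 3.8+); the names `PINS` and `check_pins` are hypothetical and do not come from the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins quoted in the comment above.
PINS = {"numpy": "1.21.0", "pandas": "1.3.2", "nltk": "3.6.2"}


def check_pins(pins, get_version=version):
    """Map each package to (pinned version, installed version or None, match flag)."""
    report = {}
    for package, pinned in pins.items():
        try:
            installed = get_version(package)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            installed = None
        report[package] = (pinned, installed, installed == pinned)
    return report


if __name__ == "__main__":
    for package, (pinned, installed, ok) in check_pins(PINS).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{package}: pinned {pinned}, installed {installed} [{status}]")
```

Running this before mp.py shows at a glance whether a newer pandas or numpy is still installed in the environment.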
