kaggle datasets download not working #484

Closed
junkoda opened this issue Jul 1, 2023 · 7 comments

junkoda commented Jul 1, 2023

There seems to be a simple bug in kaggle datasets download in version 1.5.15. It worked with 1.5.13 but is broken in 1.5.14 and 1.5.15.

$ kaggle datasets download -d shashwatraman/contrails-images-ash-color
time data 'Mon, 05 Jun 2023 10:54:39 GMT' does not match format '%a, %d %b %Y %X %Z'

The command is the one given by the Copy API Command menu in the top-right corner of the dataset page.

The command works with 1.5.13.

$ pip install kaggle==1.5.13
$ kaggle datasets download -d shashwatraman/contrails-images-ash-color

Version 1.5.14 had an additional bug: it does not recognize the -d option. Without -d, it fails with the same time data error.

This does not depend on the dataset. I tried one more dataset path and got the same error.

Contributor

Philmod commented Jul 5, 2023

I wasn't able to reproduce the bug

(Screenshot attached: 2023-07-05 at 13 03 31)

What OS are you using?

Author

junkoda commented Jul 7, 2023

Thanks for the reply. I have investigated a bit more.

The error occurs at:

remote_date = datetime.strptime(response.headers['Last-Modified'],
                                  '%a, %d %b %Y %X %Z')
# with response.headers['Last-Modified'] = 'Mon, 05 Jun 2023 10:54:39 GMT'

in kaggle_api_extended.py.

The problem is that the code depends on the locale (or other settings) when parsing the datetime. In my environment, '%X' requires AM/PM:

from datetime import datetime
import locale
datetime.strptime('10:54:39', '%X') # => Error
datetime.strptime('10:54:39 AM', '%X') # => OK

locale.getlocale()  # => ('en_US', 'UTF-8')
datetime.now().strftime('%X')  # => '05:16:31 PM'

  • OS: Ubuntu 18.04.6 LTS
  • Python 3.10.10
  • System Locale: en_US.UTF-8

On my other computer (macOS Ventura), getlocale() returns (None, None) and the kaggle download works with no error.
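To compare the two machines, it may help to check directly whether the header value parses under a given LC_TIME setting. The following is only a diagnostic sketch: the helper name parses_under_lc_time is hypothetical and is not part of the kaggle package. It reuses the format string quoted above and assumes the locale name passed in is installed on the system (locale.setlocale raises locale.Error otherwise).

```python
import locale
from datetime import datetime

# Header value and format string as quoted earlier in this thread.
HTTP_DATE = 'Mon, 05 Jun 2023 10:54:39 GMT'
LOCALE_DEPENDENT_FORMAT = '%a, %d %b %Y %X %Z'


def parses_under_lc_time(lc_time_name):
    """Return True if HTTP_DATE parses while LC_TIME is lc_time_name.

    Hypothetical helper for diagnosis only. The previous LC_TIME
    setting is restored before returning.
    """
    previous = locale.setlocale(locale.LC_TIME)  # query the current setting
    try:
        locale.setlocale(locale.LC_TIME, lc_time_name)
        datetime.strptime(HTTP_DATE, LOCALE_DEPENDENT_FORMAT)
        return True
    except ValueError:
        # strptime raises ValueError when the text does not match the format.
        return False
    finally:
        locale.setlocale(locale.LC_TIME, previous)


# The "C" locale is always available and uses English names and a 24-hour %X.
print(parses_under_lc_time('C'))
```

Calling it with the name of the system locale (for example 'en_US.UTF-8', if installed) should show whether that locale is what breaks the parse.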

Weekday and month names might also fail to parse in other languages.

from datetime import datetime
import locale

remote_date = datetime.strptime('04 Jul', '%d %b')  # OK for me

locale.setlocale(locale.LC_ALL, 'it_IT.UTF-8')
remote_date = datetime.strptime('04 Jul', '%d %b')  # Error: July is "luglio" in Italian

Since I am not familiar with locales, I'll leave the proper locale-independent solution to the professionals. Thanks.
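For reference, one locale-independent option in the standard library is email.utils.parsedate_to_datetime, which parses RFC-style dates with fixed English day and month names. This is a minimal sketch under that assumption, not necessarily what the maintainers did; the helper name parse_last_modified is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_last_modified(header_value: str) -> datetime:
    """Parse a Last-Modified header value without consulting the locale.

    Hypothetical helper. Returns a timezone-aware datetime in UTC.
    """
    parsed = parsedate_to_datetime(header_value)
    if parsed.tzinfo is None:
        # Values with a "-0000" zone come back naive; treat them as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


remote_date = parse_last_modified('Mon, 05 Jun 2023 10:54:39 GMT')
print(remote_date.isoformat())
```

Note that the result is timezone-aware, whereas the strptime call above yields a naive datetime, so any code comparing it against naive timestamps would need adjusting.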

Contributor

Philmod commented Jul 17, 2023

Internal bug: http://b/291578234

Contributor

Philmod commented Jul 17, 2023

It should be fixed in https://github.com/Kaggle/kaggle-api/releases/tag/1.5.16. Let me know if that works for you now.

Philmod closed this as completed Jul 17, 2023
Author

junkoda commented Jul 18, 2023

Thanks! Yes, datasets download worked with 1.5.16.


Karesto commented Jul 31, 2024

Bumping this because the error is back for me.

Author

junkoda commented Aug 1, 2024

Thanks. Good to see that someone other than me is experiencing this error. I've opened another issue:
#523
The en_US case was fixed, but other languages (locales) were not.
