-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/scryfall data #12
Conversation
…-Deckbox into feature/scryfallData
…hat deckbox uses the front face name, removed art cards from scryfall query
…-and-removal-rules-for-alt-cards
…removal-rules-for-alt-cards Added new replacement rules, added ignore rules for dualfaced card that deckbox uses the front face name, removed art cards from scryfall query
…ards, updated regex for cases with multiple groups of parenthesis
…x and tcgplay use different columns to differentiate between the original release and the 2021 release. physically the 2021 release does not include a planeswalker symbol in the lower left (and is much less valued sadly)
…nfig as i should have done originally
…t_of_cards_scanned_in_tcgplayer more dual face cards to ignore from scryfall, added mapping for bab c…
…aced / split cards
src/tcg-to-deckbox.py
Outdated
with requests.get(deckbox_request_url) as deckbox_response: | ||
|
||
# If we are not redirected to a new page, then we should only use the front face name | ||
if deckbox_request_url == deckbox_response.url: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if deckbox_response.status_code == 302?
Api seems to react to the user-agent header tho, as in if I use the base curl user-agent (experimentation located at https://gist.github.com/rkonell/27c093b4703cf2a7d7b354409fb1776b) it actually returns a 404, but if I provide it with any other user-agent then I get the expected 302. But having to be user-agent aware to know what to expect seems brittle(is anything user-agent related NOT brittle?).
With that in mind however, we shouldn't bother even following the redirects and instead only test the status code of the response itself, by passing an optional parameter requests exposes:
with requests.get(deckbox_request_url, allow_redirect=False) as deckbox_response:
if deckbox_response.status == 200:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, need to be a little bit trickier than that. I tested with curl, so the user agent might change things, but I found out that valid cards redirect too, if they have multiple printings. It redirects to one of the printings. That's why I checked the response URL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh interesting find... what card if you don't mind? I'd like to review the un-redirected response for clues. I iterated through each dual-faced card to get the response for each and got 200s, 301s, and 302s, which the 301s i wasn't expecting. I'm hoping maybe we can identify some response header that's shared amongst the variations (Found 0, Found 1, Found many).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(sorry if there was message spam here.. i had wrote out a comment and it displayed oddly on my end so i deleted)
Ran into an issue with Lantern Bearer:
https://deckbox.org/mtg/Lantern%20Bearer%20/%20Lanterns%27%20Lift?printing=61146
url includes just a single slash as the name delimiter so it always fails comparison. It's odd because it rejects a single slash when imported via csv, needs the double. I'm guessing there is a defect between the webserver and their db. I'm just manually adjusting for now.
Another oddity I noticed with that card is it includes a query parameter in the response url. Haven't had that issue with the other dual faced cards... but should strip query parameters from the response url prior to comparing in case there are others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That query parameter is exactly the issue I was talking about, I just hit it with older cards, such as Delver of Secrets, which has seen reprints. The single slash is annoying. I posted about the API issues but I haven't gotten a response yet. For now, I can check double slash, and if that fails, I can also check single slash
"Begin: Download '%s', page %s of results" % (urllib.parse.unquote(uri), page) | ||
) | ||
try: | ||
with requests.get(uri) as scryfall_response: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we track the etag header (append to the file name or add it as a top level key in the file), and pass it back in the "If-None-Match" header on the GET request, the api will respond with 304 Not Modified if the resulting content hasn't updated on their end. If not modified we'll be able to short circuit refreshing the scryfall data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I can look into this.
src/tcg-to-deckbox.py
Outdated
if deckbox_request_url == deckbox_response.url: | ||
print( | ||
"Dual name for '%s' found on deckbox, the dual name will be used for the import." | ||
% (scryfall_data[row["Name"]], scryfall_data[row["Name"]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Format string has a single specifier but 2 arguments. Removing the second argument cleared it up. Only issue that I hit when uploading another set of cards this evening.
No description provided.