shot_chart_detail table, simplify GenericRequester api, change default behavior to load current season, SQLite support (kinda), README update #39

Merged
merged 4 commits into from
Aug 2, 2021

mpope9
Owner

@mpope9 mpope9 commented Aug 1, 2021

Lots of stuff.

We are fetching the shot_chart_detail data in a silly manner. We fetch all games for all players, because that is the path of least resistance, request-wise (we have to make fewer total requests). BUT, this means that there will be game_ids that aren't in the game table. That breaks the foreign key constraint. The easiest way around this is building a temp table, inserting all the shot_chart_detail rows into it w/ an index on game_id (and no foreign key constraint), then inserting into the real table, filtering on the game ids. I feel like this isn't overkill.
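
A minimal sketch of the staging approach described above, assuming a cut-down schema. It uses the standard-library `sqlite3` module rather than the project's Peewee models so that it runs on its own; the column names, the `shot_chart_detail_temp` table name, and the sample ids are made up for illustration.

```python
import sqlite3

def load_shots(game_ids, fetched_shots):
    """Stage fetched shots in a temp table, then copy only rows whose game is known."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the FK
    conn.executescript("""
        CREATE TABLE game (game_id TEXT PRIMARY KEY);
        CREATE TABLE shot_chart_detail (
            shot_id INTEGER PRIMARY KEY,
            game_id TEXT NOT NULL REFERENCES game (game_id)
        );
        -- Staging table: indexed on game_id, but no foreign key constraint.
        CREATE TEMPORARY TABLE shot_chart_detail_temp (
            shot_id INTEGER PRIMARY KEY,
            game_id TEXT NOT NULL
        );
        CREATE INDEX shot_temp_game_idx ON shot_chart_detail_temp (game_id);
    """)
    with conn:
        conn.executemany("INSERT INTO game (game_id) VALUES (?)",
                         [(g,) for g in game_ids])
        conn.executemany(
            "INSERT INTO shot_chart_detail_temp (shot_id, game_id) VALUES (?, ?)",
            fetched_shots)
        # Only rows whose game_id exists in game reach the constrained table.
        conn.execute("""
            INSERT INTO shot_chart_detail (shot_id, game_id)
            SELECT shot_id, game_id
            FROM shot_chart_detail_temp
            WHERE game_id IN (SELECT game_id FROM game)
        """)
    rows = conn.execute(
        "SELECT shot_id, game_id FROM shot_chart_detail ORDER BY shot_id").fetchall()
    conn.close()
    return rows

# The third shot belongs to a game that was never loaded, so it is filtered out.
loaded = load_shots(["g1", "g2"], [(1, "g1"), (2, "g2"), (3, "g_other_season")])
print(loaded)  # [(1, 'g1'), (2, 'g2')]
```

Inserting the third row straight into `shot_chart_detail` would raise an integrity error here, which is the failure the staging table avoids.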

Still finalizing whether we want to use a left join or subquery to insert from the temp table into the shot_chart_detail. 🤷
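
For comparison, here is a rough sketch of the two candidate statements side by side. The schema and table names are assumptions for illustration, and it uses `sqlite3` rather than Peewee. Note that a left join only filters if it is followed by an `IS NOT NULL` check on the joined table, which makes it behave like an inner join.

```python
import sqlite3

SETUP = """
    CREATE TABLE game (game_id TEXT PRIMARY KEY);
    CREATE TABLE shot_chart_detail (shot_id INTEGER PRIMARY KEY, game_id TEXT NOT NULL);
    CREATE TEMPORARY TABLE shot_chart_detail_temp (shot_id INTEGER PRIMARY KEY, game_id TEXT NOT NULL);
    INSERT INTO game (game_id) VALUES ('g1'), ('g2');
    INSERT INTO shot_chart_detail_temp (shot_id, game_id)
    VALUES (1, 'g1'), (2, 'g2'), (3, 'g_missing');
"""

# Variant A: filter with a subquery on the game table.
SUBQUERY_INSERT = """
    INSERT INTO shot_chart_detail (shot_id, game_id)
    SELECT shot_id, game_id
    FROM shot_chart_detail_temp
    WHERE game_id IN (SELECT game_id FROM game)
"""

# Variant B: left join, then discard rows with no matching game.
LEFT_JOIN_INSERT = """
    INSERT INTO shot_chart_detail (shot_id, game_id)
    SELECT t.shot_id, t.game_id
    FROM shot_chart_detail_temp AS t
    LEFT JOIN game AS g ON g.game_id = t.game_id
    WHERE g.game_id IS NOT NULL
"""

def load_with(insert_sql):
    """Run one variant against a fresh in-memory database and return the result."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    conn.execute(insert_sql)
    rows = conn.execute(
        "SELECT shot_id, game_id FROM shot_chart_detail ORDER BY shot_id").fetchall()
    conn.close()
    return rows

print(load_with(SUBQUERY_INSERT) == load_with(LEFT_JOIN_INSERT))  # True
```

Because `game_id` is unique in `game`, the join cannot duplicate rows, so both variants insert the same set. Which one is faster would depend on the database and its planner.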

SQLite kinda works; we just need to fully define the non-unique PKs better, since it throws an error without this. Issue here: #41

ALSO added transactions around the bulk inserts. Peewee recommends it and I think it makes sense. I don't really see a loss of performance, but I've only been testing with 2020-21.
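
To illustrate what wrapping a bulk insert in a transaction buys, here is a small sketch using the standard-library `sqlite3` module (in Peewee the equivalent wrapper is the `db.atomic()` context manager). The table layout and the `bulk_insert` helper are assumptions for the example, not the code in this PR.

```python
import sqlite3

def bulk_insert(conn, rows):
    """Insert all rows in a single transaction: all of them land, or none do."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT INTO shot_chart_detail (shot_id, game_id) VALUES (?, ?)", rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shot_chart_detail (shot_id INTEGER PRIMARY KEY, game_id TEXT NOT NULL)")

bulk_insert(conn, [(1, "g1"), (2, "g1")])

try:
    # The second row reuses primary key 1, so the whole batch is rolled back,
    # including the row with shot_id 3 that was inserted just before it.
    bulk_insert(conn, [(3, "g2"), (1, "g2")])
except sqlite3.IntegrityError:
    pass

count = conn.execute("SELECT COUNT(*) FROM shot_chart_detail").fetchone()[0]
print(count)  # 2
```

Besides atomicity, a single transaction avoids a commit per row, which is usually where the bulk-insert speedup comes from.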

1996-97 seems broken, not sure if the NBA changed their API, or what. SO, the default behavior is now to load the current season only. Still need to add an all option, but I might break that into a second PR because this one is getting big. Issue here: #37
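
As a sketch of how a "current season" default could be derived in the `2020-21` format used above: the helper names are hypothetical, and the October rollover month is an assumption rather than something taken from this PR.

```python
import datetime

def season_label(start_year):
    """Format a season by its starting year, e.g. 2020 -> '2020-21'."""
    return f"{start_year}-{(start_year + 1) % 100:02d}"

def current_season(today, rollover_month=10):
    """Return the season label for a date.

    Assumption: a new season starts in `rollover_month` (October by default).
    Dates before that month still belong to the season that began the
    previous calendar year.
    """
    start_year = today.year if today.month >= rollover_month else today.year - 1
    return season_label(start_year)

print(season_label(1996))                          # 1996-97
print(current_season(datetime.date(2021, 8, 1)))   # 2020-21
```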

Also need to add some resiliency and logging around the failed requests. Issue for that here: #38

@mpope9 mpope9 mentioned this pull request Aug 1, 2021

class ShotChartDetailRequester(GenericRequester):

    shot_char_detail_url = "https://stats.nba.com/stats/shotchartdetail"

Typo - short_chart_detail_url

Owner Author

We both typo'd, will change though.

lol

def finalize(self):
    """
    This function finishes loading shot_chart_detail by inserting all valid
    records from the temp table into the main table, then dropping the temp

Where is the call to drop the temp table?

Owner Author

Good eye, but a table created with TEMPORARY will be dropped at the end of the session. That is defined in the Meta section of the Model.

Owner Author

I'll fix the comment.

Ahh, that's very cool.
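
The session-scoped behavior of TEMPORARY tables discussed in this thread can be checked with a small sketch like the one below. It uses SQLite through the standard-library `sqlite3` module, with made-up table names; other databases scope temporary tables in a similar way, but the exact rules are worth checking for whichever one is in use. On the Peewee side, the Meta setting referred to is presumably the `temporary` option.

```python
import os
import sqlite3
import tempfile

def table_names(conn):
    """List regular tables (sqlite_master) and temp tables (sqlite_temp_master)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "UNION ALL "
        "SELECT name FROM sqlite_temp_master WHERE type = 'table'").fetchall()
    return sorted(name for (name,) in rows)

# A file-backed database, so the regular table survives closing the connection.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

first = sqlite3.connect(path)
first.execute("CREATE TABLE shot_chart_detail (shot_id INTEGER PRIMARY KEY)")
first.execute("CREATE TEMPORARY TABLE shot_chart_detail_temp (shot_id INTEGER PRIMARY KEY)")
first.commit()
during = table_names(first)
first.close()  # ends the session; the temp table goes away with it

second = sqlite3.connect(path)
after = table_names(second)
second.close()

print(during)  # ['shot_chart_detail', 'shot_chart_detail_temp']
print(after)   # ['shot_chart_detail']
```

No explicit `DROP TABLE` is issued anywhere, which matches the point made above: nothing has to drop the temp table by hand.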

@avadhanij

this means that there will be game_ids that aren't in the game table.

How does this happen? Is it because maybe only a certain season is loaded into the DB?

@mpope9
Owner Author

mpope9 commented Aug 1, 2021

this means that there will be game_ids that aren't in the game table.

How does this happen? Is it because maybe only a certain season is loaded into the DB?

Yes, exactly. We're now only loading the most recent season, and since this fetches all games for a player, there is a good chance some of those games don't exist in the game table yet.

@avadhanij

avadhanij commented Aug 1, 2021

So why not fetch only the shot data for the requested seasons? Seems like the API has a season parameter?

Edit: Ahh I get it, reduce number of requests. Makes sense

@mpope9
Owner Author

mpope9 commented Aug 2, 2021

So why not fetch only the shot data for the requested seasons? Seems like the API has a season parameter?

Edit: Ahh I get it, reduce number of requests. Makes sense

Ah yeah, actually I think the endpoint requires both team_id and player_id to function properly. Just feeding it season_id will not return a proper response (I believe; it took a few iterations to get this one right). And a nice side effect is fewer requests.
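
A rough sketch of the request-count trade-off mentioned above. The parameter names `player_id`, `team_id`, and `season_id` are taken from this comment and are placeholders, not necessarily the query parameters the endpoint actually uses; the ids are invented.

```python
from itertools import product

def requests_per_pair(player_teams):
    """One request per (player_id, team_id) pair, covering every season at once."""
    return [{"player_id": p, "team_id": t} for p, t in sorted(player_teams)]

def requests_per_season(player_teams, seasons):
    """One request per (player_id, team_id, season_id) combination."""
    return [
        {"player_id": p, "team_id": t, "season_id": s}
        for (p, t), s in product(sorted(player_teams), seasons)
    ]

# Hypothetical ids, for illustration only.
pairs = {(101, 1), (102, 1), (103, 2)}
seasons = ["2019-20", "2020-21"]

print(len(requests_per_pair(pairs)))             # 3
print(len(requests_per_season(pairs, seasons)))  # 6
```

The per-pair approach returns shots from seasons that are not loaded, which is why the temp-table filter on game ids is needed afterwards.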

@mpope9 mpope9 merged commit a3bf092 into master Aug 2, 2021
@mpope9 mpope9 deleted the shot_chart_detail branch August 2, 2021 00:15