-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
shot_chart_detail table, simplify GenericRequester api, change default behavior to load current season, SQLite support (kinda), README update #39
Conversation
stats/shot_chart_detail.py
Outdated
|
||
class ShotChartDetailRequester(GenericRequester): | ||
|
||
shot_char_detail_url = "https://stats.nba.com/stats/shotchartdetail" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo - short_chart_detail_url
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We both typo', will change though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol
stats/shot_chart_detail.py
Outdated
def finalize(self): | ||
""" | ||
This function finishes loading shot_chart_detail by inserting all valid | ||
records from the temp table into the main table, then dropping the temp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the call to drop the temp table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good eye, but this uses a table specified with TEMPORARY
will be dropped at the end of the session. That is defined in the Meta
section of the Model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll fix the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, that's very cool.
How does this happen? Is it because maybe only a certain season is loaded into the DB? |
Yes exactly. We're now only loading the most recent season, and since this fetches all games for a player there is a good chance they don't exist yet. |
So why not fetch only the shot data for the requested seasons? Seems like the API has a season parameter? Edit: Ahh I get it, reduce number of requests. Makes sense |
Ah yeah acutally I think the endpoint requires both |
Lots of stuff.
We are fetching the
shot_chart_detail
data in a silly manner. We fetch all games for all players, because that is the path of least resistant, request wise (have to make less total requests). BUT, this means that there will begame_id
s that aren't in the game table. That breaks the foriegn key constraint. Easiest way around this is building a temp table, inserting all theshot_chart_detail
into there w/ an index ongame_id
(and not foreign key constraint), then inserting into the real table, filtering on the game ids. I feel like this isn't overkill.Still finalizing whether we want to use a left join or subquery to insert from the temp table into the
shot_chart_detail
. 🤷SQLite kinda works, we just need to fully define the non-unique PKs better, it throws an error without this. Issue here: #41
ALSO added xactions around the bulk inserting, PeeWee recommends it and I think it makes sense. I don't really see a loss of performance, but I've only been testing with 2020-21.
1996-97 seems broken, not sure if the NBA changed their API, or what. SO, the default behavior is now to load the current season only. Still need to add an
all
option, but I might break that into a second PR because this one is getting big. Issue here: #37Also need to add some resiliency and logging around the failed requests. issue for that here: #38