Facebook test client WIP, campaigns stream only #231

Merged · bhtowles merged 14 commits into master from qa/api-eval-TDL-7577 · Nov 15, 2023

Conversation

bhtowles (Contributor)

Description of change

Related to TDL-7577, to support the pagination test.
Facebook test client WIP, campaigns stream only

Manual QA steps

Risks

Rollback steps

  • revert this branch

bhtowles added the 'testing' label (QA work. No src code changes.) on Oct 24, 2023
tests/test_client.py
self.api_version = 'v18.0'
self.account_id = os.getenv('TAP_FACEBOOK_ACCOUNT_ID')
self.access_token = os.getenv('TAP_FACEBOOK_ACCESS_TOKEN')
self.account_url = self.base_url + self.api_version +'/act_{}'.format(self.account_id)
Contributor:

We should try to move away from .format() calls in new code. Every modern Python version supports f-strings, and any version that doesn't is deprecated.

Suggested change
self.account_url = self.base_url + self.api_version +'/act_{}'.format(self.account_id)
self.account_url = f"{self.base_url}/{self.api_version}/act_{self.account_id}"

This also lets us decide that no piece of the URL should have a trailing or leading slash; whatever code builds the full URL can add the slashes for us.

Contributor:

Also, if we never separate 'https://graph.facebook.com/' from 'v18.0', then I think it's safe to say that the base_url is 'https://graph.facebook.com/v18.0'.

  1. It makes the "build a full URL" lines shorter
  2. It makes it harder to deviate from the one API version this client uses
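
A minimal sketch of that idea, with assumed constant names (the PR ultimately kept the version separate):

# Pin the API version once, inside the base URL.
BASE_URL = 'https://graph.facebook.com/v18.0'

# Building a full URL is now a one-liner, and every request
# necessarily goes through the same API version.
account_url = f'{BASE_URL}/act_{account_id}'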

Contributor (Author):

Not sure we should tie the client to one API version, so I updated accordingly.


def get_account_objects(self, stream):
    assert stream in self.stream_endpoint_map.keys(), \
        f'Endpoint undefiend for specified stream: {stream}'
Contributor:

Suggested change
f'Endpoint undefiend for specified stream: {stream}'
f'Endpoint undefined for specified stream: {stream}'

Contributor (Author):

Updated

Comment on lines 63 to 65
response = requests.get(url, params)
LOGGER.info(f"Returning get response: {response}")
return response.json()
Contributor:

The best practice for the requests library is:

response = requests.get(url, params=params)
response.raise_for_status()
return response.json()
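
For context, raise_for_status() turns any 4xx/5xx response into a requests.HTTPError, so a bad request fails loudly instead of handing back an error payload as if it were data. A self-contained sketch of the pattern (url and params are placeholders, not values from this PR):

import requests

def get_json(url, params):
    # Raises requests.HTTPError on any 4xx/5xx status
    # instead of returning the error body as JSON.
    response = requests.get(url, params=params)
    response.raise_for_status()
    return response.json()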


Contributor (Author):

Added the raise_for_status.

Contributor (Author):

Does this mean my status code check in the pagination test is now redundant?

Contributor:

I don't think so. This client is just for the CRUD portion, right?

And the tap is still the thing we are testing to see if pagination is working?

response = fb_client.get_account_objects(stream)

number_of_records = len(response['data'])
if number_of_records >= limit and response.get('paging', {}).get('next'):

Reviewer:

I would think that for pagination to occur, the number of records cannot be equal to the limit. If the limit is 100 and there are exactly 100 records, it won't go to a second page.

Reviewer:

I'm not clear on what response.get('paging', {}).get('next') does.

Contributor (Author):

It appears that the limit sent in the GET request controls the number of records returned per page, so I don't expect to get more than 'limit' records back in a single response. I guess we should change this from ">=" to "==". The response.get('paging', {}).get('next') call verifies that there is another page of data still left to get.
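
A short sketch of the check being described, reusing the names from the test snippet above (fb_client, stream, and limit come from the test's setup):

response = fb_client.get_account_objects(stream)
number_of_records = len(response['data'])

# 'paging' may be absent on the last page, so default to an empty dict;
# a truthy 'next' URL means the API still has at least one more page.
has_next_page = bool(response.get('paging', {}).get('next'))
if number_of_records == limit and has_next_page:
    pass  # enough data already exists to exercise pagination for this stream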

Reviewer:

This logic should be in get_account_objects. When the if statement is true, we just continue to the next stream. Are we assuming that because there is a next page, we have another record and are over the limit?

Comment on lines 78 to 89
def get_account_objects(self, stream):
    assert stream in self.stream_endpoint_map.keys(), \
        f'Endpoint undefined for specified stream: {stream}'
    endpoint = self.stream_endpoint_map[stream]
    url = self.account_url + endpoint
    params = {'access_token': self.access_token,
              'limit': 100}
    LOGGER.info(f"Getting url: {url}")
    response = requests.get(url, params)
    response.raise_for_status()
    LOGGER.info(f"Returning get response: {response}")
    return response.json()

Reviewer:

It looks like this will only make one request, and will not continue until we either get to the end of the records or have enough to be sure we don't need more data. Should this be in a loop until we have a certain number of records, or until there are no more records?

Also, I don't see where we set a start date. What if all the records are ones that we wouldn't get in the sync? I'm not sure how the API works, but we should make sure we are getting the correct set of data and continue to paginate correctly ourselves. Do we need to specify a sort order, for example?
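
One possible shape for that loop, sketched under assumptions: the helper name and the min_records parameter are hypothetical, and the cursor handling relies on the Graph API convention that paging.next is a complete URL:

import requests

def get_all_account_objects(self, stream, min_records=None):
    url = self.account_url + self.stream_endpoint_map[stream]
    params = {'access_token': self.access_token, 'limit': 100}
    records = []
    while url:
        response = requests.get(url, params=params)
        response.raise_for_status()
        page = response.json()
        records.extend(page.get('data', []))
        # Stop early once the test has enough records.
        if min_records and len(records) >= min_records:
            break
        # 'next' is a fully qualified URL that already carries the
        # access token and cursor, so later requests need no params.
        url = page.get('paging', {}).get('next')
        params = None
    return records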

Contributor (Author):

Good point. I had not considered how limit, start_date, and order could change things. In general, the idea was that only one request for 'limit' items would be needed, and we could inspect the response to see whether it had 'limit' records and a next page of data, to know if there was enough there to paginate. But this assumes the limit is always 100 and the start date is always whatever the API default is. Once the start date differs from that, order will also matter. I will look into passing 'limit' and a start date to the request, and see if there is a way to dictate order.

Contributor (Author):

Updated to pass limit and date range to the get query.
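
A hedged sketch of what that could look like: 'time_range' is how several Graph API edges accept a date window, but the exact parameter names here are assumptions, not lines from the diff:

import json

params = {
    'access_token': self.access_token,
    'limit': limit,  # page size under test
    'time_range': json.dumps({'since': start_date, 'until': end_date}),
}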

LOGGER.info(f"Returning post response: {response}")
return response

def generate_post_params(self, stream):

Reviewer:

This is more than enough for now if we have random PK values. For an all-fields test we would need to expand this, and it could get messy. Might need to think about that more in the future, but it's not necessary now.

Also, something to think about for the future (and maybe for the pagination test): should we store and return the data that we get and create, so we can compare it to what the sync gets and make sure it is getting the correct data? I'm not sure our standard pagination test worries about that, but probably not, since the data is unknown ahead of time.
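
For illustration, one hedged way to produce collision-free values for name-like fields; this helper is hypothetical and not part of the diff:

import random
import string
from datetime import datetime, timezone

def random_name(prefix):
    # The timestamp keeps created test objects traceable; the random
    # suffix keeps repeated runs from colliding on unique fields.
    stamp = datetime.now(timezone.utc).strftime('%Y%m%dT%H%M%S')
    suffix = ''.join(random.choices(string.ascii_lowercase, k=6))
    return f'{prefix}_{stamp}_{suffix}'

# e.g. random_name('qa_campaign') -> 'qa_campaign_20231115T221400_kqzwfa'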

Contributor (Author):

Noted.

@HarrisonMarcRose left a comment:

I do have some comments it would be good to implement, but even without fixing those, this is good. Love that we are making progress on this.

…te BaseCase page_size(), update self.start_date pattern
bhtowles merged commit add18f1 into master on Nov 15, 2023
2 checks passed
bhtowles deleted the qa/api-eval-TDL-7577 branch on November 15, 2023 at 22:14