TADA_DataRetrieval updates for sf option, tribal options, big data options #566

Open: 40 commits to be merged into base: develop

mbrousil
Contributor

Hi there!

This PR covers a few large changes to the data retrieval process based on conversations I have had with Cristina and Hillary:

Overview

  1. Addition of {sf} methods to allow users to query WQP data using {sf} objects
  2. Addition of options to allow tribal lands to be more directly queried using TADA_DataRetrieval
  3. New function, TADA_TribalOptions, to assist users with identifying and querying tribal lands
  4. Folding the processes in TADA_BigDataRetrieval into TADA_DataRetrieval and removing TADA_BigDataRetrieval to avoid confusion
  5. Adding progress bars to large data pulls, a user prompt to confirm the download, silencing of {dataRetrieval} messages together with error handling for HTTP errors, and a vignette update

Additional info

  1. {sf} methods use the aoi_sf argument and largely begin here. The function first checks what data are available for the bounding box of the {sf} object provided, then uses only the MonitoringLocationIdentifiers inside the {sf} object when running the full query
  2. Tribal land queries use the tribal_area_type and tribe_name_parcel arguments and are handled alongside {sf} because they use this EPA spatial data. Both tribal_area_type and tribe_name_parcel are required for a tribal query. {sf} and tribal arguments can't be used at the same time (this raises an error), and if geographic info such as statecode is provided in addition to either {sf} or tribal arguments, a warning is returned
  3. tribal_area_type refers to one of the EMEF/Tribal MapServer layers. tribe_name_parcel refers to either TRIBE_NAME or PARCEL_NO entries from that layer. The TADA_TribalOptions function is included to help users see TRIBE_NAME/PARCEL_NO options available to them and check punctuation, etc.
  4. TADA_BigDataHelper is now used to handle "big" data requests within TADA_DataRetrieval. By default this is triggered with maxrecs = 250000 and maxsites = 300.
    1. Two progress bars are included inside TADA_BigDataHelper
    2. The ask_user function is used to confirm that the user wants to download the dataset after the number of records is determined
    3. In general, the messages from {dataRetrieval} are now silenced because they returned a lot of information that hid what we considered more useful information from TADA_DataRetrieval. We've made sure to include checks for HTTP errors, which are then communicated back to the user
    4. Additional info now in vignette 1 to explain the new {sf}, tribal, and big data functionality
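
To make the argument rules in item 2 concrete, here is a minimal base R sketch. It is not the code in this PR: `check_spatial_args` is a hypothetical helper, the use of NULL defaults for every argument is an assumption made for the illustration, and the messages are placeholders. It only mirrors the described behaviour: both tribal arguments must be given together, {sf} and tribal arguments cannot be combined, and an extra geographic filter such as statecode produces a warning.

```r
# Hypothetical helper, NOT part of the package. It only mirrors the
# argument rules described in this PR. NULL defaults are an assumption.
check_spatial_args <- function(aoi_sf = NULL,
                               tribal_area_type = NULL,
                               tribe_name_parcel = NULL,
                               statecode = NULL) {
  has_sf <- !is.null(aoi_sf)
  has_type <- !is.null(tribal_area_type)
  has_name <- !is.null(tribe_name_parcel)

  # Both tribal arguments are required together
  if (xor(has_type, has_name)) {
    stop("tribal_area_type and tribe_name_parcel must be supplied together.")
  }
  has_tribal <- has_type && has_name

  # {sf} and tribal arguments are mutually exclusive
  if (has_sf && has_tribal) {
    stop("Supply either aoi_sf or the tribal arguments, not both.")
  }

  # Other geographic filters alongside a spatial query trigger a warning
  if ((has_sf || has_tribal) && !is.null(statecode)) {
    warning("statecode was supplied in addition to a spatial argument.")
  }

  # Report which query path would be taken
  if (has_sf) "sf" else if (has_tribal) "tribal" else "standard"
}

# Placeholder values only; real layer and tribe names come from
# TADA_TribalOptions
check_spatial_args(tribal_area_type = "some layer",
                   tribe_name_parcel = "some name")
```

Returning a short label for the query path keeps the sketch easy to check by hand; the real function presumably goes on to build and run the query instead.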

A few notes:

  • I left NULL as the default for the aoi_sf argument instead of "null" because the character version didn't work properly
  • I had hoped to work on issues related to character length limits in queries, as discussed with Cristina, but ran out of time
  • From my tests it didn't seem like the way that data are indexed by calendar date affected query speed
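
On the NULL default for aoi_sf: the notes above do not say why the character version failed, so the following is only a sketch of one general R behaviour that may be relevant. `has_aoi` is a hypothetical helper, and a plain data frame stands in for an {sf} object so that the example needs no extra packages.

```r
# Hypothetical helper for illustration only; not the PR's code.
has_aoi <- function(aoi_sf = NULL) {
  # is.null() always returns a single TRUE or FALSE
  !is.null(aoi_sf)
}

# A plain data frame stands in for an {sf} object here
aoi <- data.frame(x = c(1, 2))

has_aoi()     # no area of interest supplied
has_aoi(aoi)  # area of interest supplied

# By contrast, `aoi == "null"` is evaluated element by element and
# returns a logical matrix, which is not a single condition for if()
dim(aoi == "null")
```

With a NULL default the "was this argument supplied?" check stays a single logical value regardless of what kind of object the user passes in.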

Please let me know if I can provide any other info on any of this! For example, I left out the speed test results to avoid overwhelming this description. Thanks for your help.

Closes #361, closes #427, closes #345, closes #159

@mbrousil mbrousil marked this pull request as draft January 29, 2025 00:47
@mbrousil mbrousil marked this pull request as ready for review January 29, 2025 00:48