Let's update to drop extra data in AirNow so only include hourly data that is paired #286

Open
blychs opened this issue Oct 7, 2024 · 9 comments
Assignees
Labels
enhancement New feature or request

Comments

@blychs
Collaborator

blychs commented Oct 7, 2024

Hi all,
The pairing from the docs_examples save_and_read has strange, hard-to-understand timestamps.
Here is what we get (image below).
Is this the intended behaviour, @rschwant? It looks strange to me, especially the fact that it is not consistent: the time differences are 15 minutes twice, then half an hour, and then the pattern repeats (e.g., 12:00:00, 12:15:00, 12:30:00, 13:00:00).
image

Cheers
Pablo

@rschwant
Collaborator

rschwant commented Oct 7, 2024

I believe this is just because most AirNow data is reported hourly, but some stations report more frequently, including at the 15-minute marks.

@zmoon, we are using this tool in monet: monet.util.combinetool.combine_da_to_df

I forgot: does this directly pair the model and obs data at each common time provided (i.e., at each hour, with no time interpolation), or is there also a time interpolation for data that is provided more frequently?

@blychs
Collaborator Author

blychs commented Oct 7, 2024

OK, that's definitely the case. I am checking the downloaded AirNow data, and the times follow the same (IMHO, strange) pattern: two 15-minute steps followed by a half-hour step.

@zmoon
Collaborator

zmoon commented Oct 7, 2024

Yes, sorry for the confusion. The pairing is obs-first: it looks for model data at the same times that the obs have, while keeping all obs times, so obs data not on the hour are usually useless at best. In some of the MM obs data prep tools I drop data not on the hour, but for AirNow I didn't bother at the time, since it is mostly on the hour. We should update the AirNow data prep tool (and the example file) to do this too.
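To illustrate the obs-first behaviour described above, here is a minimal sketch using a plain pandas left join. It is not the implementation of monet.util.combinetool.combine_da_to_df; the column names and values are made up for illustration, and only the timestamps mirror the pattern reported in this issue.

```python
import pandas as pd

# Hypothetical obs: hourly data plus off-hour stamps (12:15 and 12:30)
obs = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2024-10-07 12:00",
                "2024-10-07 12:15",
                "2024-10-07 12:30",
                "2024-10-07 13:00",
            ]
        ),
        "obs_o3": [30.0, 41.0, 38.0, 32.0],  # made-up values
    }
)

# Hypothetical hourly model output
model = pd.DataFrame(
    {
        "time": pd.to_datetime(["2024-10-07 12:00", "2024-10-07 13:00"]),
        "model_o3": [28.0, 31.0],  # made-up values
    }
)

# A left join keeps every obs time; model values exist only where
# the obs time matches a model time exactly (no time interpolation).
paired = obs.merge(model, on="time", how="left")
print(paired)
```

In this sketch the 12:15 and 12:30 rows survive the pairing but have no model value, which is why such rows show up in the paired output without being useful.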

About the AirNow data, Kathmandu sites ['KM1010001', 'KM1010002'], for example, are on the :15 because of their +05:45 UTC offset. Other sites not on the UTC hour include US Dept of State India, Nepal, Sri Lanka, and Burma sites, with a +05:30 UTC offset. These locations aren't relevant for most (if not all) of our examples. AirNow data are left-labelled hourly averages.
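The offset arithmetic can be checked with the standard library alone. The sketch below assumes that a site reports on its local hour; the helper name, the date, and the local hour are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def utc_minute_of_local_hour(offset_hours, offset_minutes=0):
    """Return the minute past the UTC hour for a stamp on the local hour."""
    local_tz = timezone(timedelta(hours=offset_hours, minutes=offset_minutes))
    # 18:00 local time on an arbitrary date
    local_time = datetime(2024, 10, 7, 18, 0, tzinfo=local_tz)
    return local_time.astimezone(timezone.utc).minute


print(utc_minute_of_local_hour(5, 45))  # +05:45 offset -> 15
print(utc_minute_of_local_hour(5, 30))  # +05:30 offset -> 30
print(utc_minute_of_local_hour(-7))     # whole-hour offset -> 0
```

Combined with the on-the-hour data from other sites, stamps at :15 and :30 give exactly the 12:00, 12:15, 12:30, 13:00 pattern reported at the top of this issue.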

@blychs
Collaborator Author

blychs commented Oct 7, 2024

Should I close this issue? Or would you rather have it open until the updates to the prep tool are done?

@zmoon
Collaborator

zmoon commented Oct 7, 2024

@rschwant with the additional notes about the sites I added, would you still want to keep them in the example data file?

@rschwant
Collaborator

rschwant commented Oct 7, 2024

If the off-hour data will still be there when people download the AirNow data, then let's leave it in the example, because we can use this as an opportunity to explain how the tool works. But if we want to update the command line tool, like you did for other datasets, to drop these non-standard times (they would require extra effort for us to use anyway), then that might be best.

@zmoon
Collaborator

zmoon commented Oct 7, 2024

I think we should update the command line tool to drop them, but have it be optional, like you suggested today. And at that time we can also update the example dataset.
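As a rough idea of what an optional drop could look like, here is a sketch in pandas. The function name drop_off_hour, the on_hour_only flag, and the column names are hypothetical and are not part of the existing command line tool; the second site ID and the values are placeholders.

```python
import pandas as pd


def drop_off_hour(df, time_col="time", on_hour_only=True):
    """Optionally keep only rows whose timestamp falls exactly on the hour.

    Hypothetical helper for illustration; not an existing tool function.
    """
    if not on_hour_only:
        return df
    t = df[time_col]
    on_hour = (t.dt.minute == 0) & (t.dt.second == 0) & (t.dt.microsecond == 0)
    return df.loc[on_hour].reset_index(drop=True)


obs = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2024-10-07 12:00",
                "2024-10-07 12:15",
                "2024-10-07 12:30",
                "2024-10-07 13:00",
            ]
        ),
        # 'KM1010001' is from this thread; 'XX0000000' is a placeholder ID
        "siteid": ["XX0000000", "KM1010001", "XX0000000", "XX0000000"],
        "obs": [30.0, 41.0, 38.0, 32.0],  # made-up values
    }
)

hourly = drop_off_hour(obs)
unchanged = drop_off_hour(obs, on_hour_only=False)
print(hourly["time"].dt.strftime("%H:%M").tolist())
```

Keeping the flag on the function rather than filtering unconditionally leaves the off-hour rows available to anyone who wants to handle them separately.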

@rschwant
Collaborator

rschwant commented Oct 7, 2024

I agree. Do you have time to do this today? If not, I'll add it as an issue that we will do after the tutorial.

@zmoon
Collaborator

zmoon commented Oct 7, 2024

Let's plan to do it after. It could take some time to do it carefully (checking for same results), and I want to give others the same option (this would at least include AQS). And this way we could combine with the NaN-fill change and just update the files once.

@rschwant rschwant changed the title from "Strange time frequency in the pairing" to "Let's update to drop extra data in AirNow so only include hourly data that is paired" Oct 7, 2024
@rschwant rschwant added the enhancement New feature or request label Oct 7, 2024
@rschwant rschwant moved this to Todo in MELODIES MONET Oct 7, 2024
Projects
Status: Todo
Development

No branches or pull requests

3 participants