Thank you so much for getting us started on this! Just to clarify your description above: are you saying that the examples that existed before you started were not appropriate for hubEvals, or that the ones you put together are not appropriate?
-
One thing we could also consider for example data is separating them out into a data package, a bit like the Long Term Ecological Research program (LTER) Network's This would allow us to mix individual table data stored as rda in Actually I think this is really the way to go, and I'm surprised it only just occurred to me, given that I really rate the above-mentioned packages!
-
I'm bringing broader discussion from this thread back here. I see three options for how to create and package up example data:
Option 1 has the advantage of setting up a consistent, standardized example that will be familiar to package users who are looking at help files across different packages in the ecosystem. On the other hand, it seems likely that any single example data set that we might set up in option 1 would be "too complex" to be useful as an example for every purpose, and so almost every help file example would end up doing some filtering on it. For example:
If we think including these kinds of filters throughout the documentation is too clunky, options 2 or 3 are indicated. I don't love either of these options, but of them, I prefer option 2 (packages are expected to maintain their own example data, pulling subsets from the upstream example hub repos). I think I could be persuaded to go with either option 1 or 2.
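The filtering pattern discussed above can be sketched in base R. This is only an illustration: the `example_outputs` data frame and all of its values are hypothetical, and the column names are assumed to follow the usual hubverse model output layout (`model_id`, `output_type`, `output_type_id`, `value`). It is not any actual example data set.

```r
# Hypothetical stand-in for a shared example data set that mixes output types.
# All values here are made up for illustration.
example_outputs <- data.frame(
  model_id = rep(c("team1-modela", "team2-modelb"), each = 4),
  location = "US",
  output_type = rep(c("mean", "quantile", "quantile", "quantile"), times = 2),
  output_type_id = rep(c(NA, 0.25, 0.5, 0.75), times = 2),
  value = c(10, 8, 10, 12, 14, 11, 13, 16)
)

# A help-file example that only handles quantiles would first subset the data.
quantile_outputs <- subset(example_outputs, output_type == "quantile")

# A different example might only need the median from each model.
median_outputs <- subset(quantile_outputs, output_type_id == 0.5)
print(median_outputs)
```

With a single shared data set, each help file would carry one or two lines like the `subset()` calls here before getting to the function it is actually documenting.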
-
I like @elray1 's suggestion above for option 2. Trying to flesh out how we would operationalize this a bit more:
-
Following up on discussion in our hubverse dev meeting on Feb 28, in this comment I will try to outline what it would look like to use a hubverse data package (option 1 in my comment above). This does not represent a decision; I am just trying to articulate the option clearly to facilitate decision making about how we want to organize things. A working name for the package is
In brief, the idea is that the package would make three examples of hub data available:
I'll describe these examples and their use cases in more detail in the following subsections.
Example 1: full file structure
Data structures
In the package repository, the hub will be located in the
Example uses in other hubverse packages
With this in place, here's an example of what downstream use could look like in
Example 2: data objects for an example forecast hub
Data structures
The proposal is to have approximately 3 data objects that contain model outputs and target data derived from the A first data object might be called
This data set would have columns like the following:
A second data object might be called
One or more additional data objects would include observed target values. We have not yet established the precise format for this object or these objects. If a single data object, we might call it
Example uses in other hubverse packages
The hubEnsembles package includes the
Planned functionality in the
[Note: On one hand, these filter statements feel kind of distracting. On the other hand, they may serve a useful documentation purpose, calling the reader's attention to the fact that these functions accept subsets of]
Example 3: data objects for an example scenario modeling hub
The setup and use cases here would be very similar to example 2, but we'd like to provide examples of scenario model outputs and targets.
-
I am closing this discussion in favor of new, more focused discussions over on the shiny new hubExamples repository.
-
This post collects ideas that we've discussed before in non-centralized locations.
Goal
It would be nice to have an example hub with all output types, suitable for use in examples and documentation throughout the hubverse. There are several advantages to this:
Existing work
We have existing examples in the example simple forecast hub and the example complex scenario hub. The main issue I see with these examples is that they do not have all output types. We would like to be able to demonstrate functionality with examples of all output types.
A second issue related to the complex scenario hub (which includes more output types than the simple forecast hub) is that it is slightly less natural to use those data to demonstrate standard forecast evaluation methods, as evaluating scenario projections carefully is more complex than evaluating forecasts. So I would like to find another example to use for hubEvals.
Example complex forecast hub
I've started putting examples together in the example complex forecast hub. It still needs some documentation describing what's in there.
Output types included
Use of example data in hubverse documentation and packages
It seems like these data are candidates for use in the following places:
For hubData (currently hubUtils), we want to demonstrate a `connect_hub() |> collect()` workflow, so for that package it may be helpful to work with an actual copy of a full hub setup.
For other packages, it seems like we can bypass that part of the workflow and assume the user has data frames of model output data and target data (if relevant). For those purposes, we could include those data as data objects in the package. To facilitate creation of those data objects, we could mirror the example complex forecast hub to an S3 Bucket, allowing scripts that create the data objects in a particular package to run without requiring a local clone of the example hub in a specific location. This would make development easier.
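As a rough sketch of the "data objects in the package" idea, the base R snippet below writes a data frame to an .rda file and loads it back. The object name `example_model_outputs` and its values are hypothetical, and a temporary directory stands in for a package's data/ folder. In a real package this step would more likely live in a data-raw script and call `usethis::use_data()`, with the data frame collected from the mirrored hub rather than typed in by hand.

```r
# Toy model output data; in practice this would be collected from the example
# hub (for instance via connect_hub() |> collect()). Values are made up.
example_model_outputs <- data.frame(
  model_id = c("team1-modela", "team2-modelb"),
  output_type = "mean",
  output_type_id = NA,
  value = c(10.5, 12.25)
)

# Write the object to an .rda file, as a package's data/ directory would hold.
# A temporary directory stands in for the package's data/ folder here.
rda_path <- file.path(tempdir(), "example_model_outputs.rda")
save(example_model_outputs, file = rda_path)

# Load it back into a separate environment to confirm the round trip.
loaded <- new.env()
load(rda_path, envir = loaded)
stopifnot(identical(loaded$example_model_outputs, example_model_outputs))
```

Keeping the creation script separate from the stored object means the object can be regenerated whenever the upstream example hub changes.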