-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TimeSeriesDataset: support multiple aggregation methods #369
TimeSeriesDataset: support multiple aggregation methods #369
Conversation
Codecov Report
@@ Coverage Diff @@
## master #369 +/- ##
=========================================
Coverage ? 89.32%
=========================================
Files ? 48
Lines ? 2211
Branches ? 0
=========================================
Hits ? 1975
Misses ? 236
Partials ? 0
Continue to review full report at Codecov.
|
5e07a8b
to
717ca0e
Compare
I like the idea, but I think the PR is WIP? |
717ca0e
to
0e58bbe
Compare
0e58bbe
to
7f6a539
Compare
I think it is pretty good to go. Except I have not updated the documentation, I will do that as well.
|
This PR does also not make sure that it actually works to do this in a workflow. I guess it makes sense to wait for #368 for that. It involves some dirty stuff, especially since the grafana dashboards will probably not work with several aggregation methods. For now the full workflow works when selecting a single aggregation method (e.g. "max" instead of default "mean"), but the "multiple aggregation methods" is only useful when using the dataset locally. |
In general, what do you think about using
or if a single agg:
Might result in more complex changes? |
I like it. I am a bit uncertain how big of an API change it is. Maybe not that big? At least it has a nicer consistency between the single and multiple aggregation methods. |
cc817ff
to
cb53c9e
Compare
So, now it is as follows: It uses multi-level dataframes when there are several aggregation methods, but the old-style single level ones in case of a single aggregation method. This is to make it backwards compatible, and make it a smaller PR. Two things:
|
Seems reasonable to me, I've added #382 so we can address using the multi level column dataframes and thus support multiple agg function throughout the workflow later on. For the example, I think adding it in the https://github.com/equinor/gordo-test-project might be a good place? |
cb53c9e
to
e3c8c4b
Compare
157d1db
to
98e4b2c
Compare
a56c45f
to
2e84c08
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebase, mahn.
Now the TimeSeriesDataset supports an optional argument `aggregation_methods` with default value `mean`. It can be either a single aggregation method or a list of aggregation methods. Any aggregation method supported by pandas is supported. If multiple aggregation methods are provided then the resulting dataframe contains a multi-level column index with the tag-name as the top level, and the aggregation method as the second level. If a single aggregation method is provided then only the first level (tag-name) is used.
Also changed the dates to some dates with more interesting values when comparing max/min.
2e84c08
to
ccb4562
Compare
Now the TimeSeriesDataset supports an optional argument
aggregation_methods
with default value
mean
. It can be either a single aggregation method ora list of aggregation methods. Any aggregation method supported by pandas
is supported. If multiple aggregation methods are provided then the resulting
dataframe contains a multi-level column index with the tag-name as the top
level, and the aggregation method as the second level. If a single aggregation
method is provided then only the first level (tag-name) is used.
This closes #278