-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CORE-40] Add cross-billing project spend report API #3107
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initial feedback!
core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala
Outdated
Show resolved
Hide resolved
|
||
pathPrefix(Segment) { projectId => | ||
pathEnd { | ||
pathPrefix("spendReport") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file must've been hit by scalafmt
for some reason; adding in this new spendReport
route is all I did
@@ -417,6 +405,19 @@ trait ApiServiceSpec | |||
samDAO | |||
) | |||
|
|||
val spendReportingBigQueryService = bigQueryServiceFactory.getServiceFromJson("json", GoogleProject("test-project")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just moved after the creation of workspaceServiceConstructor
so I can pass it in as a parameter
...main/scala/org/broadinstitute/dsde/rawls/dataaccess/slick/RawlsBillingProjectComponent.scala
Show resolved
Hide resolved
List( | ||
SpendReportingForDateRange( | ||
totalCost, | ||
"0", // Ignoring credits for now; do we want to include them? | ||
currency, | ||
Option(start), | ||
Option(end), | ||
workspace = Some(workspaceName), | ||
googleProjectId = Some(GoogleProject(projectId.value)) | ||
), | ||
SpendReportingForDateRange( | ||
computeCost, | ||
"0", | ||
currency, | ||
Option(start), | ||
Option(end), | ||
workspace = Some(workspaceName), | ||
googleProjectId = Some(GoogleProject(projectId.value)), | ||
category = Some(TerraSpendCategories.Compute) | ||
), | ||
SpendReportingForDateRange( | ||
storageCost, | ||
"0", | ||
currency, | ||
Option(start), | ||
Option(end), | ||
workspace = Some(workspaceName), | ||
googleProjectId = Some(GoogleProject(projectId.value)), | ||
category = Some(TerraSpendCategories.Storage) | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might be able to clean this up a bit, like:
val baseSpend = SpendReportingForDateRange(
"0",
"0", // Ignoring credits for now; do we want to include them?
currency,
Option(start),
Option(end),
workspace = Some(workspaceName),
googleProjectId = Some(GoogleProject(projectId.value))
)
// Each row gets a summary of compute, storage, and total
List(
baseSpend.copy(cost = totalCost),
baseSpend.copy(cost = computeCost, category = Some(TerraSpendCategories.Compute)),
baseSpend.copy(cost = storageCost, category = Some(TerraSpendCategories.Storage)),
)
core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala
Outdated
Show resolved
Hide resolved
action: SamResourceAction, | ||
ctx: RawlsRequestContext | ||
): Future[Seq[SamUserResource]] = | ||
resourceTypeName match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The content of this mock method is mostly copied from listUserResources
, with just the action included. Not sure if we need anything better
core/src/main/scala/org/broadinstitute/dsde/rawls/dataaccess/HttpSamDAO.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala
Outdated
Show resolved
Hide resolved
val callback = new SamApiCallback[ListResourcesV2200Response]("listResourcesV2") | ||
|
||
resourcesApi(ctx).listResourcesV2Async( | ||
/* format = */ "hierarchical", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* format = */ "hierarchical", | |
/* format = */ "flat", |
hierarchical
will not return actions that are included in roles
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, with hierarchical I get a policies
object like so:
"policies": [
{
"actions": [],
"inherited": false,
"isPublic": false,
"policy": "owner",
"roles": [
{
"actions": [
"read_spend_report"
],
"role": "owner"
}
]
}
]
but with flat
it's just:
"policies": [
{
"inherited": false,
"isPublic": false,
"policy": "owner"
}
]
Looks like I want hierarchical?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm, this does not seem right, I will look at what sam is doing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a full resource in the response for flat
looks like
{
"actions": [
"read_spend_report"
],
"authDomainGroups": [],
"missingAuthDomainGroups": [],
"policies": [
{
"inherited": false,
"isPublic": false,
"policy": "owner"
}
],
"resourceId": "eaacdc66-9ac5-4dd9-aeae-1a2c98f1eda8",
"resourceType": "spend-profile",
"roles": [
"owner"
]
}
The actions are not included in the policies. The is probably what it means to be flat.
The full resource for a hierarchical response looks like
{
"authDomainGroups": [],
"missingAuthDomainGroups": [],
"policies": [
{
"actions": [],
"inherited": false,
"isPublic": false,
"policy": "owner",
"roles": [
{
"actions": [
"read_spend_report"
],
"role": "owner"
}
]
}
],
"resourceId": "7900ff92-a1f0-4c3e-b051-e549e2211e32",
"resourceType": "spend-profile"
}
The actions are both at policy and role level. Also note there could be multiple policies that contain the action. Flat is really what you want because it hides all the details about how user has the action on the resource.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this further, flat vs hierarchical does not matter because nothing is used of the response other than the resource id. But hierarchical is much harder to use correctly. So I would either have this listResourcesWithActions
function return only a list of resource ids or switch to flat
core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala
Outdated
Show resolved
Hide resolved
# Conflicts: # core/src/test/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingServiceSpec.scala
@@ -11,7 +11,10 @@ import org.broadinstitute.dsde.rawls.coordination.{DataSourceAccess, Uncoordinat | |||
import org.broadinstitute.dsde.rawls.dataaccess._ | |||
import org.broadinstitute.dsde.rawls.dataaccess.slick.{TestDriverComponent, WorkflowRecord} | |||
import org.broadinstitute.dsde.rawls.expressions.{BoundOutputExpression, OutputExpression} | |||
import org.broadinstitute.dsde.rawls.jobexec.SubmissionMonitorActor.{ExecutionServiceStatusResponse, StatusCheckComplete} | |||
import org.broadinstitute.dsde.rawls.jobexec.SubmissionMonitorActor.{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't make any changes to this file, it just somehow got mangled by scalafmt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thoughts inline!
core/src/main/scala/org/broadinstitute/dsde/rawls/dataaccess/HttpSamDAO.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/broadinstitute/dsde/rawls/dataaccess/slick/WorkspaceComponent.scala
Show resolved
Hide resolved
core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala
Show resolved
Hide resolved
getRoundedNumericValue("credits").toString, | ||
currencyCode.toString, | ||
category = Option(TerraSpendCategories.Compute) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each of these SpendReportingForDateRange
objects include the same value for getRoundedNumericValue("credits")
. Is this triple-counting credits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On closer inspection, I think it shows the same credits three times, one for each category, but only counts it once in the total. Still less than ideal
core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala
Show resolved
Hide resolved
val summary = SpendReportingForDateRange( | ||
total.toString, | ||
total_credits.toString, | ||
currency.toString, // TODO: what to do about combined summary for currencies? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the same question! Maybe for this version of the report we need to throw an error if we find multiple currencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it shouldn't throw an error - for a single billing project it should be a single currency, but since this covers multiple billing projects it seems plausible that there might be different currencies. So I think ideally we would either not have a summary, or have a summary per currency? Not really sure what the best thing to do here is but I don't think it should be error throwing. We could potentially do that for a first pass and deal with it later on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support throwing an error for now and in the future, we can choose provided multiple aggregated reports for each currency [{UsdAggregate}, {EuroAggregate},...etc}]
| project.id AS project_id, | ||
| project.name AS project_name, | ||
| currency, | ||
| SUM(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) c), 0)) as credits, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do credits
also have a description like "Cloud Storage" or "Compute Engine" we could use to categorize them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know and I don't have the permissions to figure it out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do see a type
column defined for the credits
record. If this CASE … WHEN
doesn't work out, we could look at the type
column. I think we'll really just need to test this empirically and see what results it gives.
query = getAllUserWorkspaceQuery(billingMap, pageSize, offset) | ||
queryJob = setUpAllUserWorkspaceQuery(query, start, end) | ||
|
||
job: Job <- bigQueryService.use(_.runJob(queryJob)).unsafeToFuture().map(_.waitFor()) | ||
_ = logSpendQueryStats(job.getStatistics[JobStatistics.QueryStatistics]) | ||
result = job.getQueryResults() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would be a great place to add tracing: let's get timing and visibility into how the BigQuery query is performing. Might be easiest to pull this logic out into a separate, traced, method
complete { | ||
spendReportingConstructor(ctx).getSpendForAllWorkspaces( | ||
startDate, | ||
endDate.plusDays(1).minusMillis(1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fancy!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a Scala expert, but I have added a few optional comments for consideration. I tested the API locally, and everything seems to work well. Nice job!
| SUM(CASE WHEN spend_category = 'Compute' THEN category_cost ELSE 0 END) AS compute_cost, | ||
| SUM(CASE WHEN spend_category = 'Other' THEN category_cost ELSE 0 END) AS other_cost, | ||
| currency, | ||
| SUM(credits) as credits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess to David's suggestion above we can split the credits by category similar to the category_costs
| SUM(credits) as credits | |
| SUM(CASE WHEN spend_category = 'Storage' THEN credits ELSE 0 END) AS storage_credits, | |
| SUM(CASE WHEN spend_category = 'Compute' THEN credits ELSE 0 END) AS compute_credits, | |
| SUM(CASE WHEN spend_category = 'Other' THEN credits ELSE 0 END) AS other_credits, |
val summary = SpendReportingForDateRange( | ||
total.toString, | ||
total_credits.toString, | ||
currency.toString, // TODO: what to do about combined summary for currencies? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support throwing an error for now and in the future, we can choose provided multiple aggregated reports for each currency [{UsdAggregate}, {EuroAggregate},...etc}]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one last question about public workspaces and a note about accuracy of credits.
Since I'll be out, I'm giving this a proactive thumb. Since it's an API, we can release it and iterate on it before actually releasing user-facing features.
/* policies = */ util.List.of(), | ||
/* roles = */ util.List.of(), | ||
/* actions = */ util.List.of(action.value), | ||
/* includePublic = */ false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will the includePublic = false
prevent seeing spend reports for public workspaces, even if the calling user is an owner? As in, will our support/comms team be unable to see spend for our public workspaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'll change it to true.
| project.id AS project_id, | ||
| project.name AS project_name, | ||
| currency, | ||
| SUM(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) c), 0)) as credits, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do see a type
column defined for the credits
record. If this CASE … WHEN
doesn't work out, we could look at the type
column. I think we'll really just need to test this empirically and see what results it gives.
https://broadworkbench.atlassian.net/browse/CORE-40
Adds a new spend reporting API that gets a spend report for each workspace the user has owner access to. Returns a
SpendReportingResult
, with each workspace broken down by category, and a summary for the total of all workspaces. Includes offset and pageSize parameters for paginating.In the case of workspaces with non-UUID resourceIds or billing projects without billing accounts, they are simply dismissed without error and not included in the report.
PR checklist
model/
, then you should publish a new officialrawls-model
and updaterawls-model
in Orchestration's dependencies.