[CORE-40] Add cross-billing project spend report API #3107

calypsomatic · 2024-10-31T20:08:26Z

https://broadworkbench.atlassian.net/browse/CORE-40

Adds a new spend reporting API that gets a spend report for each workspace the user has owner access to. Returns a SpendReportingResult, with each workspace broken down by category, and a summary for the total of all workspaces. Includes offset and pageSize parameters for paginating.
Workspaces with non-UUID resourceIds and billing projects without billing accounts are silently skipped: they raise no error and are omitted from the report.
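The skip-without-error behavior and the offset/pageSize parameters described above could be sketched roughly as follows. This is a minimal illustration, not the PR's code: `WorkspaceRef`, `validWorkspaces`, and `page` are hypothetical names, and the in-memory paging is for illustration only (the PR passes `pageSize` and `offset` to its query builder instead).

```scala
import java.util.UUID
import scala.util.Try

object SkipInvalidSketch {
  // Hypothetical stand-in for a workspace resource and its billing project's account.
  final case class WorkspaceRef(resourceId: String, billingAccount: Option[String])

  // Keep only workspaces whose resourceId parses as a UUID and whose billing
  // project has a billing account; everything else is dropped without error.
  def validWorkspaces(all: Seq[WorkspaceRef]): Seq[WorkspaceRef] =
    all.filter(w => Try(UUID.fromString(w.resourceId)).isSuccess && w.billingAccount.isDefined)

  // In-memory offset/pageSize paging, for illustration only.
  def page[A](items: Seq[A], offset: Int, pageSize: Int): Seq[A] =
    items.slice(offset, offset + pageSize)
}
```

Note that `UUID.fromString` is fairly lenient about what it accepts, so a stricter format check may be preferable if the ids need to be canonical.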

PR checklist

Include the JIRA issue number in the PR description and title
Make sure Swagger is updated if API changes
- ...and Orchestration's Swagger too!
If you changed anything in model/, then you should publish a new official rawls-model and update rawls-model in Orchestration's dependencies.
Get two thumbsworth of PR review
Verify all tests go green, including CI tests
Squash commits and merge to develop (branches are automatically deleted after merging)
Inform other teams of any substantial changes via Slack and/or email

davidangb

initial feedback!

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

calypsomatic · 2024-11-08T21:36:34Z

core/src/main/scala/org/broadinstitute/dsde/rawls/webservice/BillingApiServiceV2.scala

-
-        pathPrefix(Segment) { projectId =>
-          pathEnd {
+        pathPrefix("spendReport") {


This file must've been hit by scalafmt for some reason; adding in this new spendReport route is all I did

calypsomatic · 2024-11-08T21:37:32Z

core/src/test/scala/org/broadinstitute/dsde/rawls/webservice/ApiServiceSpec.scala

@@ -417,6 +405,19 @@ trait ApiServiceSpec
                                  samDAO
      )

+    val spendReportingBigQueryService = bigQueryServiceFactory.getServiceFromJson("json", GoogleProject("test-project"))


This is just moved after the creation of workspaceServiceConstructor so I can pass it in as a parameter

...main/scala/org/broadinstitute/dsde/rawls/dataaccess/slick/RawlsBillingProjectComponent.scala

davidangb · 2024-11-15T17:27:52Z

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

+      List(
+        SpendReportingForDateRange(
+          totalCost,
+          "0", // Ignoring credits for now; do we want to include them?
+          currency,
+          Option(start),
+          Option(end),
+          workspace = Some(workspaceName),
+          googleProjectId = Some(GoogleProject(projectId.value))
+        ),
+        SpendReportingForDateRange(
+          computeCost,
+          "0",
+          currency,
+          Option(start),
+          Option(end),
+          workspace = Some(workspaceName),
+          googleProjectId = Some(GoogleProject(projectId.value)),
+          category = Some(TerraSpendCategories.Compute)
+        ),
+        SpendReportingForDateRange(
+          storageCost,
+          "0",
+          currency,
+          Option(start),
+          Option(end),
+          workspace = Some(workspaceName),
+          googleProjectId = Some(GoogleProject(projectId.value)),
+          category = Some(TerraSpendCategories.Storage)
+        )
+      )


You might be able to clean this up a bit, like:

val baseSpend = SpendReportingForDateRange(
  "0",
  "0", // Ignoring credits for now; do we want to include them?
  currency,
  Option(start),
  Option(end),
  workspace = Some(workspaceName),
  googleProjectId = Some(GoogleProject(projectId.value))
)
// Each row gets a summary of compute, storage, and total
List(
  baseSpend.copy(cost = totalCost),
  baseSpend.copy(cost = computeCost, category = Some(TerraSpendCategories.Compute)),
  baseSpend.copy(cost = storageCost, category = Some(TerraSpendCategories.Storage)),
)

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

calypsomatic · 2024-11-19T20:52:08Z

core/src/test/scala/org/broadinstitute/dsde/rawls/mock/MockSamDAO.scala

+                                        action: SamResourceAction,
+                                        ctx: RawlsRequestContext
+  ): Future[Seq[SamUserResource]] =
+    resourceTypeName match {


The content of this mock method is mostly copied from listUserResources, with just the action included. Not sure if we need anything better

core/src/main/scala/org/broadinstitute/dsde/rawls/dataaccess/HttpSamDAO.scala

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

dvoet · 2024-11-21T15:11:38Z

core/src/main/scala/org/broadinstitute/dsde/rawls/dataaccess/HttpSamDAO.scala

+      val callback = new SamApiCallback[ListResourcesV2200Response]("listResourcesV2")
+
+      resourcesApi(ctx).listResourcesV2Async(
+        /* format = */ "hierarchical",


Suggested change

/* format = */ "hierarchical",

/* format = */ "flat",

hierarchical will not return actions that are included in roles

Hm, with hierarchical I get a policies object like so:

"policies": [
  {
    "actions": [],
    "inherited": false,
    "isPublic": false,
    "policy": "owner",
    "roles": [
      { "actions": [ "read_spend_report" ], "role": "owner" }
    ]
  }
]

but with flat it's just:

"policies": [
  { "inherited": false, "isPublic": false, "policy": "owner" }
]

Looks like I want hierarchical?

hmmm, this does not seem right, I will look at what sam is doing

a full resource in the response for flat looks like

{
  "actions": [ "read_spend_report" ],
  "authDomainGroups": [],
  "missingAuthDomainGroups": [],
  "policies": [
    { "inherited": false, "isPublic": false, "policy": "owner" }
  ],
  "resourceId": "eaacdc66-9ac5-4dd9-aeae-1a2c98f1eda8",
  "resourceType": "spend-profile",
  "roles": [ "owner" ]
}

The actions are not included in the policies. This is probably what it means to be flat.

The full resource for a hierarchical response looks like

{
  "authDomainGroups": [],
  "missingAuthDomainGroups": [],
  "policies": [
    {
      "actions": [],
      "inherited": false,
      "isPublic": false,
      "policy": "owner",
      "roles": [
        { "actions": [ "read_spend_report" ], "role": "owner" }
      ]
    }
  ],
  "resourceId": "7900ff92-a1f0-4c3e-b051-e549e2211e32",
  "resourceType": "spend-profile"
}

The actions appear at both the policy and the role level. Also note there could be multiple policies that contain the action. Flat is really what you want because it hides all the details about how the user has the action on the resource.

Looking at this further, flat vs hierarchical does not matter because nothing is used of the response other than the resource id. But hierarchical is much harder to use correctly. So I would either have this listResourcesWithActions function return only a list of resource ids or switch to flat
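To illustrate why the hierarchical shape is harder to use correctly, here is a minimal sketch that models both response shapes with simplified case classes. The type and function names (`FlatResource`, `HierarchicalResource`, `hasActionFlat`, and so on) are hypothetical and only mirror the JSON quoted in this thread; they are not Sam client types.

```scala
object SamShapesSketch {
  // Simplified, hypothetical model of a resource in a "flat" response.
  final case class FlatResource(resourceId: String, actions: Set[String], roles: Set[String])

  // Simplified, hypothetical model of a resource in a "hierarchical" response.
  final case class Role(role: String, actions: Set[String])
  final case class Policy(policy: String, actions: Set[String], roles: Seq[Role])
  final case class HierarchicalResource(resourceId: String, policies: Seq[Policy])

  // Flat: the action is listed directly on the resource.
  def hasActionFlat(r: FlatResource, action: String): Boolean =
    r.actions.contains(action)

  // Hierarchical: the action may sit on any policy, or on any role within any policy.
  def hasActionHierarchical(r: HierarchicalResource, action: String): Boolean =
    r.policies.exists(p => p.actions.contains(action) || p.roles.exists(_.actions.contains(action)))

  // If only the id is used downstream, the response can be reduced to ids.
  def resourceIds(rs: Seq[FlatResource]): Seq[String] = rs.map(_.resourceId)
}
```

A caller that only checks policy-level actions on the hierarchical shape would miss `read_spend_report` in the example above, since it is granted through the `owner` role.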

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

# Conflicts: # core/src/test/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingServiceSpec.scala

calypsomatic · 2024-11-25T20:46:30Z

core/src/test/scala/org/broadinstitute/dsde/rawls/jobexec/SubmissionMonitorSpec.scala

@@ -11,7 +11,10 @@ import org.broadinstitute.dsde.rawls.coordination.{DataSourceAccess, Uncoordinat
 import org.broadinstitute.dsde.rawls.dataaccess._
 import org.broadinstitute.dsde.rawls.dataaccess.slick.{TestDriverComponent, WorkflowRecord}
 import org.broadinstitute.dsde.rawls.expressions.{BoundOutputExpression, OutputExpression}
-import org.broadinstitute.dsde.rawls.jobexec.SubmissionMonitorActor.{ExecutionServiceStatusResponse, StatusCheckComplete}
+import org.broadinstitute.dsde.rawls.jobexec.SubmissionMonitorActor.{


I didn't make any changes to this file, it just somehow got mangled by scalafmt

davidangb

thoughts inline!

core/src/main/scala/org/broadinstitute/dsde/rawls/dataaccess/HttpSamDAO.scala

core/src/main/scala/org/broadinstitute/dsde/rawls/dataaccess/slick/WorkspaceComponent.scala

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

davidangb · 2024-11-25T22:13:05Z

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

+          getRoundedNumericValue("credits").toString,
+          currencyCode.toString,
+          category = Option(TerraSpendCategories.Compute)
+        )


Each of these SpendReportingForDateRange objects includes the same value for getRoundedNumericValue("credits"). Is this triple-counting credits?

Good point.

On closer inspection, I think it shows the same credits three times, one for each category, but only counts it once in the total. Still less than ideal

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

davidangb · 2024-11-25T22:16:26Z

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

+    val summary = SpendReportingForDateRange(
+      total.toString,
+      total_credits.toString,
+      currency.toString, // TODO: what to do about combined summary for currencies?


I have the same question! Maybe for this version of the report we need to throw an error if we find multiple currencies.

I feel like it shouldn't throw an error - for a single billing project it should be a single currency, but since this covers multiple billing projects it seems plausible that there might be different currencies. So I think ideally we would either not have a summary, or have a summary per currency? Not really sure what the best thing to do here is but I don't think it should be error throwing. We could potentially do that for a first pass and deal with it later on?

I support throwing an error for now; in the future, we can choose to provide multiple aggregated reports, one per currency: [{UsdAggregate}, {EuroAggregate}, ...]
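The two options discussed in this thread (fail on mixed currencies now, or produce one summary per currency later) could be sketched as below. This is a hedged illustration with hypothetical types (`WorkspaceSpend`, `Summary`) and helper names; it does not use the PR's `SpendReportingForDateRange` model.

```scala
object CurrencySummarySketch {
  // Hypothetical, simplified row: one workspace total in one currency.
  final case class WorkspaceSpend(workspace: String, cost: BigDecimal, credits: BigDecimal, currency: String)
  final case class Summary(cost: BigDecimal, credits: BigDecimal, currency: String)

  // Option A: a single summary, reported as an error if the rows mix currencies.
  def singleSummary(rows: Seq[WorkspaceSpend]): Either[String, Summary] =
    rows.map(_.currency).distinct match {
      case Seq(currency) => Right(Summary(rows.map(_.cost).sum, rows.map(_.credits).sum, currency))
      case Seq()         => Left("no spend rows to summarize")
      case many          => Left(s"cannot summarize across currencies: ${many.mkString(", ")}")
    }

  // Option B: one summary per currency.
  def summariesByCurrency(rows: Seq[WorkspaceSpend]): Map[String, Summary] =
    rows.groupBy(_.currency).map { case (currency, rs) =>
      currency -> Summary(rs.map(_.cost).sum, rs.map(_.credits).sum, currency)
    }
}
```

Option A keeps the response shape unchanged for the first release, while Option B would require the summary field to become a collection.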

davidangb · 2024-11-25T22:19:28Z

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

+                       |    project.id AS project_id,
+                       |    project.name AS project_name,
+                       |    currency,
+                       |    SUM(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) c), 0)) as credits,


do credits also have a description like "Cloud Storage" or "Compute Engine" we could use to categorize them?

I don't know and I don't have the permissions to figure it out

I do see a type column defined for the credits record. If this CASE … WHEN doesn't work out, we could look at the type column. I think we'll really just need to test this empirically and see what results it gives.

davidangb · 2024-11-25T22:23:52Z

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

+        query = getAllUserWorkspaceQuery(billingMap, pageSize, offset)
+        queryJob = setUpAllUserWorkspaceQuery(query, start, end)
+
+        job: Job <- bigQueryService.use(_.runJob(queryJob)).unsafeToFuture().map(_.waitFor())
+        _ = logSpendQueryStats(job.getStatistics[JobStatistics.QueryStatistics])
+        result = job.getQueryResults()


this would be a great place to add tracing: let's get timing and visibility into how the BigQuery query is performing. Might be easiest to pull this logic out into a separate, traced, method
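One way to get timing around the BigQuery call is to pull it into its own method and wrap that method in a small helper. The sketch below is an assumption-laden illustration: `timed` and its `record` callback are hypothetical, and a real implementation would more likely open a span with whatever tracing library the service already uses.

```scala
import scala.concurrent.{ExecutionContext, Future}

object TimedSketch {
  // Hypothetical helper: runs `body`, then reports the elapsed milliseconds to
  // `record` whether the Future succeeds or fails. A tracing span would take
  // the place of `record` in a real service.
  def timed[A](name: String, record: (String, Long) => Unit)(body: => Future[A])(implicit
    ec: ExecutionContext
  ): Future[A] = {
    val start = System.nanoTime()
    body.andThen { case _ => record(name, (System.nanoTime() - start) / 1000000L) }
  }
}
```

Because `andThen` completes the returned Future only after the callback has run, the measurement is recorded before downstream code sees the result.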

davidangb · 2024-11-25T22:25:52Z

core/src/main/scala/org/broadinstitute/dsde/rawls/webservice/BillingApiServiceV2.scala

+                complete {
+                  spendReportingConstructor(ctx).getSpendForAllWorkspaces(
+                    startDate,
+                    endDate.plusDays(1).minusMillis(1),


kevinmarete

I am not a Scala expert, but I have added a few optional comments for consideration. I tested the API locally, and everything seems to work well. Nice job!

kevinmarete · 2024-11-26T16:12:22Z

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

+       |  SUM(CASE WHEN spend_category = 'Compute' THEN category_cost ELSE 0 END) AS compute_cost,
+       |  SUM(CASE WHEN spend_category = 'Other' THEN category_cost ELSE 0 END) AS other_cost,
+       |  currency,
+       |  SUM(credits) as credits


Following David's suggestion above, I guess we can split the credits by category, similar to the category_cost columns

Suggested change

| SUM(credits) as credits

| SUM(CASE WHEN spend_category = 'Storage' THEN credits ELSE 0 END) AS storage_credits,

| SUM(CASE WHEN spend_category = 'Compute' THEN credits ELSE 0 END) AS compute_credits,

| SUM(CASE WHEN spend_category = 'Other' THEN credits ELSE 0 END) AS other_credits,
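An in-memory analogue of the suggested `CASE WHEN` split shows the intended effect: each row's credits land in exactly one category bucket, so the buckets add up to the total instead of repeating it. This is a sketch with a hypothetical `BillingRow` type; whether the billing export attributes credits to rows in a way that makes this split meaningful is an assumption that, as noted elsewhere in this review, needs empirical testing.

```scala
object CreditsByCategorySketch {
  // Hypothetical, simplified billing row: one line item tagged with a spend category.
  final case class BillingRow(spendCategory: String, cost: BigDecimal, credits: BigDecimal)

  // Analogue of SUM(CASE WHEN spend_category = X THEN credits ELSE 0 END).
  def creditsFor(rows: Seq[BillingRow], category: String): BigDecimal =
    rows.collect { case r if r.spendCategory == category => r.credits }.sum

  // Analogue of SUM(credits) over all rows.
  def totalCredits(rows: Seq[BillingRow]): BigDecimal =
    rows.map(_.credits).sum
}
```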

kevinmarete · 2024-11-26T16:16:47Z

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

+    val summary = SpendReportingForDateRange(
+      total.toString,
+      total_credits.toString,
+      currency.toString, // TODO: what to do about combined summary for currencies?


I support throwing an error for now; in the future, we can choose to provide multiple aggregated reports, one per currency: [{UsdAggregate}, {EuroAggregate}, ...]

davidangb

one last question about public workspaces and a note about accuracy of credits.

Since I'll be out, I'm giving this a proactive thumb. Since it's an API, we can release it and iterate on it before actually releasing user-facing features.

davidangb · 2024-11-26T20:39:22Z

core/src/main/scala/org/broadinstitute/dsde/rawls/dataaccess/HttpSamDAO.scala

+        /* policies = */ util.List.of(),
+        /* roles = */ util.List.of(),
+        /* actions = */ util.List.of(action.value),
+        /* includePublic = */ false,


will the includePublic = false prevent seeing spend reports for public workspaces, even if the calling user is an owner? As in, will our support/comms team be unable to see spend for our public workspaces?

Good point, I'll change it to true.

davidangb · 2024-11-26T20:48:18Z

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala

+                       |    project.id AS project_id,
+                       |    project.name AS project_name,
+                       |    currency,
+                       |    SUM(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) c), 0)) as credits,


I do see a type column defined for the credits record. If this CASE … WHEN doesn't work out, we could look at the type column. I think we'll really just need to test this empirically and see what results it gives.

calypsomatic added 6 commits October 31, 2024 16:07

add getownerworkspaces

c0ef330

update getOwnerWorkspaces, start to add getBilling

1a396b9

interim getBillingForWorkspaces

3a4dc82

update getbillingforworkspaces to get what we need

56cf2bc

update getbilling and query

cc2c3bd

remove unnecessary changes

2adf61b

davidangb reviewed Nov 6, 2024

calypsomatic added 4 commits November 7, 2024 11:09

update getbillingworkspaces to be one sql query

775616e

start putting it all together

414207c

add new spendreport api

c5c736a

query roughly working

81f843b

calypsomatic commented Nov 8, 2024

Merge branch 'develop' into core-40-spend

0895f80

davidangb reviewed Nov 15, 2024

calypsomatic and others added 5 commits November 18, 2024 15:49

update return value of cross billing spend report

a023bbb

update how to extract spend report results

0c6915b

add test skeleton

9a8694b

get workspaces/bps from sam instead

554256b

Merge branch 'develop' into core-40-spend

9085110

calypsomatic commented Nov 19, 2024

some cleanup

8f52a84

calypsomatic changed the title ~~add getownerworkspaces~~ [CORE-40] Add cross-billing project spend report API Nov 19, 2024

davidangb reviewed Nov 20, 2024

core/src/main/scala/org/broadinstitute/dsde/rawls/dataaccess/HttpSamDAO.scala (outdated)

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala (outdated)

dvoet reviewed Nov 21, 2024

Merge branch 'develop' into core-40-spend

b5fa11f

dvoet reviewed Nov 21, 2024

core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala (outdated)

calypsomatic added 3 commits November 21, 2024 14:16

fix unit tests

9fd6bc0

move logic around and expand test

9ad4c7e

compare spend report permission to workspace ownership

b07bbb1

calypsomatic and others added 9 commits November 22, 2024 09:27

check own action on workspaces

a0e9897

# Conflicts: # core/src/test/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingServiceSpec.scala

fix rebase

9da40c0

get workspaces by spend report action

6a03274

Merge branch 'develop' into core-40-spend

2200b62

add pagination

e1e31a1

include credits and deal with edge cases

e4a13db

start to add trace

85ad0f2

Merge branch 'develop' into core-40-spend

c6d966f

scalafmt

0c4cbe1

calypsomatic commented Nov 25, 2024

calypsomatic marked this pull request as ready for review November 25, 2024 20:47

calypsomatic requested a review from a team as a code owner November 25, 2024 20:47

calypsomatic requested review from dvoet, kevinmarete and davidangb and removed request for a team November 25, 2024 20:47

davidangb reviewed Nov 25, 2024

kevinmarete approved these changes Nov 26, 2024

calypsomatic added 2 commits November 26, 2024 14:39

pr comments round 1

7d9f379

pr comments round 2

20dfd8c

davidangb approved these changes Nov 26, 2024

calypsomatic and others added 4 commits November 27, 2024 09:36

Merge branch 'develop' into core-40-spend

cdc0b53

remove extraneous todo

a6b900f

include public workspaces

e079f56

hierarchical to flat

780974f

dvoet approved these changes Dec 4, 2024

whoops

9dcfd0f

calypsomatic merged commit 6fafd0c into develop Dec 4, 2024
29 checks passed

calypsomatic deleted the core-40-spend branch December 4, 2024 16:24

sam-schu mentioned this pull request Dec 5, 2024

[AN-226] Enable submitting workflows to Cromwell's GCP Batch backend #3141

Merged

8 tasks
