Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DCJ-386] Register Azure Managed Resource Group with SAM #1697

Draft
wants to merge 9 commits into
base: develop
Choose a base branch
from

Conversation

fboulnois
Copy link
Contributor

@fboulnois fboulnois commented Jun 3, 2024

Addresses

https://broadworkbench.atlassian.net/browse/DCJ-386

Summary

This adds the step to register the Azure managed resource group in SAM stairway step with unit tests.

@fboulnois fboulnois requested review from a team as code owners June 3, 2024 14:51
@fboulnois fboulnois requested review from rushtong and okotsopoulos and removed request for a team June 3, 2024 14:51
@fboulnois fboulnois force-pushed the fb-dcj-386-register-managed-resource-group-dummy branch from 91829b7 to 4f259e0 Compare June 3, 2024 17:32
@fboulnois fboulnois changed the title [DCJ-386] Add dummy step with unit tests [DCJ-386] Register Azure Managed Resource Group with SAM Jun 4, 2024
Copy link
Contributor

@rushtong rushtong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me. Not sure how to debug the test failure, but once that is sorted out, 👍🏽

Copy link
Contributor

@okotsopoulos okotsopoulos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work on this so far! Happy to discuss more, maybe in mobbing this afternoon?


public void registerManagedResourceGroup(
BillingProfileRequestModel request, AuthenticatedUserRequest user) {
String billingProfileId = request.getBillingAccountId();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A BillingProfileRequestModel only has a billingAccountId supplied when a GCP billing profile is being created -- it's the ID of the GCP billing account to wrap. For Azure billing profile creation, it is unspecified. I suspect this is why your AzureIntegrationTests are failing with this error:

https://scans.gradle.com/s/da5olvhqbaima/tests/overview?outcome=FAILED

java.lang.AssertionError: Error validating azure billing profile create.  Got response: bio.terra.integration.DataRepoResponse@193cb0e9[response=bio.terra.integration.ObjectOrErrorResponse@5de6abf3[statusCode=500 INTERNAL_SERVER_ERROR,locationHeader=Optional.empty,errorObject=Optional[class ErrorModel { |  
-- | --
  | message: Message: Missing the required parameter 'billingProfileId' when calling createManagedResourceGroup(Async)

What this endpoint is expecting is the ID of the Sam spend-profile resource created in this flight. I believe that it needs the Sam resource to already exist. Do you know where that Sam resource creation takes place in the flight?

Copy link
Contributor Author

@fboulnois fboulnois Jun 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateProfileAuthzIamStep seems to ultimately call samResourceApi.createResourceV2(IamResourceType.SPEND_PROFILE.toString(), req);. req uses request.getId().toString() as the profileId. However, .getId() is deprecated, so this is a bit confusing to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it's a bit confusing!

Here's the past PR where we deprecated this ID in the request, with more context on the history: #1370

The first step in the billing profile creation flight will generate a UUID for the sam-profile and stores it in the request object if one doesn't already exist:

UUID profileId = request.getId();
if (profileId == null) {
request.setId(UUID.randomUUID());
}

But the ID is only supplied to Sam to create the resource in Sam later in the flight, which looks to happen in CreateProfileAuthzIamStep.

Comment on lines 33 to 37
@Override
public StepResult undoStep(FlightContext flightContext) throws InterruptedException {
// Registering the Managed Resource Group has no side effects to clean up
return StepResult.getStepResultSuccess();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There actually is a side effect to clean up: you can call Sam to delete the MRG registration. Here's that endpoint:

https://sam.dsde-prod.broadinstitute.org/#/Azure/deleteManagedResourceGroup

But you could ask in #dsp-identiteam if deleting the Sam profile resource would also delete any MRG registration associated with that billing profile. If that's the case, then a no-op undoStep here would be acceptable, because undoing the step which deletes the Sam profile resource would take care of this clean-up for us.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a larger problem with making thedeleteManagedResourceGroup call, which is that it must be called before deleting the billing profile in Sam (according to Doug). This means that the cleanup to CreateProfileAuthzIamStep will cause conflicts with Managed Resource Group undo step.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a problem, but we do need to change where this step runs.

The createManagedResourceGroup endpoint must be called after creating the billing profile, and deleteManagedResourceGroup must be called before deleting the billing profile. So, this step must run after CreateProfileAuthzIamStep.

Then when a flight is moving forward:

  • CreateProfileAuthzIamStep.doStep creates the profile
  • CreateProfileManagedResourceGroupStep.doStep creates the association between profile and MRG coords

If a flight is rolled back, it walks backwards through the steps to undo:

  • CreateProfileManagedResourceGroupStep.undoStep deletes the association between profile and MRG coords
  • CreateProfileAuthzIamStep.undoStep deletes the profile

new ManagedResourceGroupCoordinates()
.tenantId(request.getTenantId().toString())
.subscriptionId(request.getSubscriptionId().toString())
.managedResourceGroupName(request.getResourceGroupName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had left this remark in the ticket:

The format of the MRG coordinates that TDR billing profile creation expects from the caller may not match exactly the format expected by the Sam endpoint.

I was specifically wondering if TDR's BillingProfileRequestModel.resourceGroupName is the expected value for managedResourceGroupName. In TDR Azure architecture, the user creates a Resource Group in which to deploy their TDR managed app. The TDR managed app manages a new Managed Resource Group. I think TDR billing profile creation takes in the user-created Resource Group name, but the Sam endpoint takes in Managed Resource Group name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not really sure how to address this issue. The request doesn't seem to contain the "real" Managed Resource Group name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Azure service has the following:

var properties = objectMapper.valueToTree(applicationDeployment.properties());
var managedResourceGroupId = properties.get(MANAGED_RESOURCE_GROUP_ID_KEY)

I'm not sure if this is safe to access from the running steps, however.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are on the right track. The request body may not contain the MRG name, but this is something TDR can obtain through interactions with the Azure SDK… and the snippet you found shows that there is precedence elsewhere in the code for doing just that.

(BTW -- where possible to link to snippets from the existing codebase, that would be much appreciated to make sure that we're looking at the same thing, and leave a trail of breadcrumbs for those perusing this PR in the future.)

The snippet you shared looks to come from AzureApplicationDeploymentService.newApplicationDeployment which is called only in AzureApplicationDeploymentService.getOrRegisterApplicationDeployment.

getOrRegisterApplicationDeployment is presently only called when creating datasets and snapshots, as part of getting or creating their Azure storage accounts.

I may be missing historical reasoning, but it makes more sense to me to register an application deployment as part of creating an Azure billing profile, as a step within the flight that can also delete the application deployment registration in its undoStep definition. Then the remainder of the flight can rely on having the MRG name available.

Any future attempts to create billing profiles from the same MRG would fail fast at billing profile creation rather than later on at dataset creation, which seems to be the desired effect:

throw new MismatchedBillingProfilesException(
"Cannot reuse existing application deployment "
+ applicationResource.getAzureApplicationDeploymentName()
+ " from profile "
+ applicationResource.getProfileId()
+ " with a different profile "
+ billingProfile.getId());

@Override
public StepResult doStep(FlightContext flightContext)
throws InterruptedException, RetryException {
profileService.registerManagedResourceGroup(request, user);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had left this question in the ticket:

What is the Sam endpoint’s behavior when a billing profile already exists for the MRG coordinates?

Looking at the documented responses, it will return a 409: https://sam.dsde-prod.broadinstitute.org/#/Azure/createManagedResourceGroup

Flagging this as I'm not sure if this will end up proving problematic in the current setup for our AzureIntegrationTests. Let's revisit that after the next round of changes.

Copy link

sonarqubecloud bot commented Jun 7, 2024

@fboulnois fboulnois marked this pull request as draft June 14, 2024 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants