Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic when adding secondaryAllocationIds to existing NAT Gateway #2971

Closed
barryrobison opened this issue Nov 13, 2023 · 8 comments
Closed

Panic when adding secondaryAllocationIds to existing NAT Gateway #2971

barryrobison opened this issue Nov 13, 2023 · 8 comments
Assignees
Labels
impact/panic This bug represents a panic or unexpected crash kind/bug Some behavior is incorrect or out of spec p1 A bug severe enough to be the next item assigned to an engineer resolution/fixed This issue was fixed

Comments

@barryrobison
Copy link

barryrobison commented Nov 13, 2023

What happened?

We are attempting to add additional EIPs to our existing NAT Gateways. Adding this causes pulumi to panic during a preview.

  ~ aws:ec2/natGateway:NatGateway: (update)
            [id=nat-0d956f392212314be]
            [urn=urn:pulumi:networking-dev-ap-southeast-2::networking::networking$aws:ec2/natGateway:NatGateway::main-vpc-nat-1]
            [provider=urn:pulumi:networking-dev-ap-southeast-2::networking::pulumi:providers:aws::networking-dev-main::b2a1187a-d9f4-4a79-a67b-f5abc56a5db7]
          ~ secondaryAllocationIds: [
              + [0]: "eipalloc-05f44fxxxxx"
            ]
panic: fatal: An assertion has failed: Expected diff to not require deletion or replacement during Update of urn:pulumi:networking-dev-ap-southeast-2::networking::networking$aws:ec2/natGateway:NatGateway::main-vpc-nat-1

This operation should not cause a deletion or replacement of an existing NAT GW.

Example

new aws.ec2.NatGateway(
                `${name}-vpc-nat-${i}`,
                {
                  allocationId: elasticIPs[i].allocationId,
                  secondaryAllocationIds: ["eipalloc-05f44f1e5xxxxx"],
                  subnetId,
                  tags,
                },
                { provider, parent: this },
              )

Output of pulumi about

CLI          
Version      3.86.0
Go Version   go1.21.1
Go Compiler  gc

Plugins
NAME    VERSION
aws     6.8.0
docker  3.6.1
nodejs  unknown

Host     
OS       debian
Version  11.7
Arch     aarch64

This project is written in nodejs: executable='/usr/local/bin/node' version='v18.18.0'

Current Stack: networking-dev-ap-southeast-2

TYPE                                                 URN
pulumi:pulumi:Stack                                  urn:pulumi:networking-dev-ap-southeast-2::networking::pulumi:pulumi:Stack::networking-networking-dev-ap-southeast-2

Found no pending operations associated with networking-dev-ap-southeast-2

Backend        
Name           993e6d61911f
URL            s3://xxxxx/
User           root
Organizations  
Token type     personal

Dependencies:
NAME                              VERSION
@pulumi/aws                       6.8.0
@pulumi/awsx                      0.40.1
@pulumi/pulumi                    3.93.0
axios                             1.6.0
eslint                            8.52.0
eslint-plugin-import              2.29.0
handlebars                        4.7.8
lint-staged                       15.0.2
typescript                        5.2.2
@types/node                       20.8.10
@typescript-eslint/eslint-plugin  6.9.1
@typescript-eslint/parser         6.9.1
eslint-config-prettier            9.0.0
prettier                          3.0.3

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

@barryrobison barryrobison added kind/bug Some behavior is incorrect or out of spec needs-triage Needs attention from the triage team labels Nov 13, 2023
@mikhailshilkov
Copy link
Member

Thank you for reporting this issue @barryrobison

No root cause yet, but quick notes so far: the error is coming from TF bridge Update: https://github.com/pulumi/pulumi-terraform-bridge/blob/f217804896b3dfa0c48627686de58f5e624a9c18/pkg/tfbridge/provider.go#L980

The code hasn't changed for a long while. A similar issue that was closed in the past: pulumi/pulumi-terraform-bridge#296

@mikhailshilkov
Copy link
Member

@barryrobison Can you describe which other resources I need to create to repro the issue? How is elasticIPs defined? Any chance for a standalone repro that I could just run locally?

@mikhailshilkov mikhailshilkov added awaiting-feedback Blocked on input from the author and removed needs-triage Needs attention from the triage team labels Nov 13, 2023
@barryrobison
Copy link
Author

@barryrobison Can you describe which other resources I need to create to repro the issue? How is elasticIPs defined? Any chance for a standalone repro that I could just run locally?

Hi Mikhail - here is a minimal repro:

https://github.com/barryrobison/pulumi-issue-repro/tree/main/aws-2971

You'll need to provide your own subnet and EIP allocation ids but hopefully this is sufficient. Thanks!

@pulumi-bot pulumi-bot added needs-triage Needs attention from the triage team and removed awaiting-feedback Blocked on input from the author labels Nov 13, 2023
@mikhailshilkov
Copy link
Member

Thank you @barryrobison. I got the issue reproduced with the following isolated program:

import * as aws from "@pulumi/aws";

const defaultVpc = aws.ec2.getVpcOutput({default: true});
const defaultVpcSubnets = aws.ec2.getSubnetsOutput({
    filters: [
        {name: "vpc-id", values: [defaultVpc.id]},
    ],
});

const one = new aws.ec2.Eip("one", {
  domain: "vpc",
});

const two = new aws.ec2.Eip("two", {
  domain: "vpc",
});

new aws.ec2.NatGateway("ng",
  {
      allocationId: one.allocationId,
      // after a first run to create the NAT Gateway, uncomment
      // the following line to reproduce a terraform panic
      // secondaryAllocationIds: [two.allocationId],
      subnetId: defaultVpcSubnets.ids[0],
  },
);

Then, on the second update I get the same panic.

@mikhailshilkov mikhailshilkov added impact/panic This bug represents a panic or unexpected crash p1 A bug severe enough to be the next item assigned to an engineer and removed needs-triage Needs attention from the triage team labels Nov 14, 2023
@iwahbe iwahbe self-assigned this Nov 14, 2023
@iwahbe
Copy link
Member

iwahbe commented Nov 14, 2023

Digging in, I have confirmed that this is a provider bug. @barryrobison Thanks for raising the issue and @mikhailshilkov thanks for building the self-contained repro. It made chasing this down much faster.


This is another bug caused by makeDetailedDiff missing computed inputs, in this case that a new computed input requires a replace. As such, the fix will come in a qualified bridge release.

iwahbe added a commit to pulumi/pulumi-terraform-bridge that referenced this issue Nov 15, 2023
This will fix (upon adoption)
pulumi/pulumi-aws#2971.

The design discussion for this PR is
https://pulumi.slack.com/archives/C02FXTZEZ6W/p1699989102568939.

### Changes

If there is a replace on the underlying TF object (as indicated by
`diff.RequiresNew() || diff.Destroy()`) but no associated replace in the
wire return (`pulumirpc.DiffResponse.Replaces`), then we set
`pulumirpc.DiffResponse.Replaces = pulumirpc.DiffResponse_DIFF_SOME` and
insert `__meta` as needing a replace.
@barryrobison
Copy link
Author

This is another bug caused by makeDetailedDiff missing computed inputs, in this case that a new computed input requires a replace. As such, the fix will come in a qualified bridge release.

Thanks Ian and Mikhail!

Just to clarify, adding the Secondary EIPs to an existing NAT Gateway should not cause a replace.

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ec2-natgateway.html

I can't find anything definitive in the Terraform docs about how it handles a change of this property..

Thanks!

@iwahbe
Copy link
Member

iwahbe commented Nov 15, 2023

I don't see it documented anywhere, but TF handles it the same way that we do, with a replace. I checked by translating the Pulumi program into HCL: https://gist.github.com/iwahbe/d11ab2a962a69394741150e8ce0e7694.

t0yv0 added a commit that referenced this issue Nov 15, 2023
[Upstream v5.25.0 release
notes](https://github.com/hashicorp/terraform-provider-aws/releases/tag/v5.25.0)

Fixes #2983, #2904, #2971, #2900

- [x] Rebuild eks.Cluster patches; upstream moved to AWS SDK v2 for Go,
patches needed updates as well
- [x] Fix pulumi/pulumi-terraform-bridge#1523
in the bridge
- [x] Update bridge to include
pulumi/pulumi-terraform-bridge#1521 and
pulumi/pulumi-terraform-bridge#1520 fixes
affecting P1s in pulumi-aws
- [x] Build a Pulumi test for EKS Cluster add-on removal -> turns out
the property is a no-op, not needed
@t0yv0
Copy link
Member

t0yv0 commented Nov 15, 2023

Fixed in v6.9.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
impact/panic This bug represents a panic or unexpected crash kind/bug Some behavior is incorrect or out of spec p1 A bug severe enough to be the next item assigned to an engineer resolution/fixed This issue was fixed
Projects
None yet
Development

No branches or pull requests

5 participants