Add Packer Build job for Windows AMI #6064
Add a job to create Windows AMIs in the PyTorch AWS Account. Issue: #5992 Signed-off-by: Thanh Ha <[email protected]>
Hi @zxiiro, this looks good. It would be nice to see it in action.
@atalman You can see it in action here: https://github.com/pytorch/test-infra/actions/workflows/build-windows-ami.yml Since this PR is not yet merged, we cannot use the web UI to trigger it, but I've been triggering it manually using the GitHub CLI.
One issue I've been having with this PR, though: even though it is able to create the AMI successfully, the packer command usually hangs at the very end, either after it prints:
In this case the job fails and no AMI is created. Or when it prints:
In this case the job times out and fails; however, when I check in AWS the AMI is there, so this case effectively passes despite the failure. I'm not sure why this happens. I've only seen it pass successfully twice, where everything was green.
Upon further inspection, it looks like the EBS volume snapshots are taking a very long time to create in AWS. I wonder if we can tell packer not to wait for AMI creation to complete and just end the job once it gets to that point, since the EBS volume snapshots seem like something that could finish in the background.
I just saw that the snapshots have completed and the AMI state is now available, yet the packer command is still hanging, waiting for it to complete.
I think we can likely proceed with merging this, since this is a manually triggered job and it does everything we need it to do despite this somewhat annoying packer hang. This gets us where we want to go, and we can cancel the workflow after checking that the AMI is available in AWS.
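The "check that the AMI is available, then cancel the workflow" step could in principle be scripted rather than done by hand. The following is only a rough sketch, not something from this PR: `wait_for_ami` and its parameters are hypothetical names, and `get_state` is an assumed caller-supplied function that returns the image's current state string (for example by wrapping an EC2 `describe_images` lookup). The set of failure states is likewise an assumption about which states make further waiting pointless.

```python
import time
from typing import Callable

# Assumed set of image states after which waiting longer is pointless.
TERMINAL_FAILURE_STATES = {"failed", "error", "invalid", "deregistered"}


def wait_for_ami(
    get_state: Callable[[], str],
    timeout_s: float = 3600.0,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_state() until the AMI is 'available'.

    Returns True once the state is 'available', and False if a terminal
    failure state is seen or the timeout elapses first.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "available":
            return True
        if state in TERMINAL_FAILURE_STATES:
            return False
        if clock() >= deadline:
            return False
        # Still pending: wait before asking again.
        sleep(interval_s)
```

A `True` result would be the signal that it is safe to cancel the hung workflow run; `sleep` and `clock` are injectable only so the loop can be exercised without real waiting.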
lgtm