Sagemaker Endpoint suddenly disappearing #3239
Unanswered
danielcavalli asked this question in Help
Replies: 0 comments
Hello, I'm currently using SageMaker to host a custom ML model deployed to two accounts: homologation and production. Both endpoints use the same entry point code and were deployed on the same day. The homologation endpoint suddenly disappeared on June 28th, leaving no trace besides the last HealthCheck ping in CloudWatch. I searched the CloudTrail logs to see what could have happened and found nothing out of the ordinary: the endpoint was deployed and that was it. There was no delete command coming from anywhere.
I assumed it was a bug and promptly redeployed the model on June 29th, expecting it wouldn't happen again. However, on July 3rd the endpoint vanished without a trace again. Same thing: no delete, no update, and no renaming of anything in CloudTrail. The only proof that it was ever running is the CloudWatch logs and the CreateEndpoint entry in CloudTrail.
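For anyone who wants to reproduce the CloudTrail search described above, here is a minimal sketch, assuming boto3 and credentials for the affected account. The function name `find_endpoint_events`, the endpoint name `my-endpoint`, and the dates in `main()` are placeholders of mine, not values from the real deployment. It relies on the CloudTrail `lookup_events` call, which only covers management events from the last 90 days in the region the client is created for.

```python
from datetime import datetime, timezone

# SageMaker API calls that create, change, or remove an endpoint.
ENDPOINT_EVENTS = ("CreateEndpoint", "UpdateEndpoint", "DeleteEndpoint")


def find_endpoint_events(cloudtrail, endpoint_name, start, end):
    """Return (time, event name, username) for events mentioning endpoint_name.

    `cloudtrail` is any object exposing a boto3-style lookup_events method.
    """
    matches = []
    for event_name in ENDPOINT_EVENTS:
        # lookup_events accepts a single lookup attribute per call.
        kwargs = {
            "LookupAttributes": [
                {"AttributeKey": "EventName", "AttributeValue": event_name}
            ],
            "StartTime": start,
            "EndTime": end,
        }
        while True:
            page = cloudtrail.lookup_events(**kwargs)
            for event in page.get("Events", []):
                # CloudTrailEvent is the raw JSON record as a string.
                if endpoint_name in event.get("CloudTrailEvent", ""):
                    matches.append(
                        (event["EventTime"], event["EventName"], event.get("Username"))
                    )
            token = page.get("NextToken")
            if not token:
                break
            kwargs["NextToken"] = token
    return sorted(matches, key=lambda m: m[0])


def main():
    # Imported here so the helper above can be used without boto3 installed.
    import boto3

    client = boto3.client("cloudtrail")
    # Placeholder time window and endpoint name; adjust to the real values.
    start = datetime(2022, 6, 1, tzinfo=timezone.utc)
    end = datetime(2022, 7, 10, tzinfo=timezone.utc)
    for when, name, user in find_endpoint_events(client, "my-endpoint", start, end):
        print(when.isoformat(), name, user)
```

Calling `main()` against the homologation account should list every create, update, and delete call that mentions the endpoint. If only CreateEndpoint entries come back, that matches what I saw in the console.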
I'm posting this here rather than as an issue because I'd like to understand the problem better and get ideas about what evidence to collect, so that it can either be resolved or opened as a proper issue.
Some more information:
Any thoughts on what that could be?