502 bad gateway? #2384
Replies: 17 comments
-
Actually, the 502 error was already there when running predictor.predict(test) before deploy. But my model performed well on my own machine and was saved exactly the same way as in https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/
-
Hi @Xixiong-Guo, if you are following the example from https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/, it's likely that you've used the wrong model class. For framework versions 1.11 and above, we've split the TensorFlow container into separate training and serving containers. For deploying the model, please use this class instead: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/serving.py#L121
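For reference, a minimal deployment sketch using that serving Model class. The S3 path, role lookup, framework version, and instance type below are placeholders, not values from this thread:

```python
import sagemaker
from sagemaker.tensorflow.serving import Model

# Placeholder S3 location of the packaged SavedModel (model.tar.gz).
model = Model(
    model_data='s3://my-bucket/models/model.tar.gz',
    role=sagemaker.get_execution_role(),
    framework_version='1.12',
)

# Deploys onto the dedicated sagemaker-tensorflow-serving container
# rather than the legacy combined training/serving image.
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.large')
```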
-
Hi @ChuyangDeng, thanks for your reply. I did encounter this problem, and after I found this reference (https://sagemaker.readthedocs.io/en/stable/using_tf.html#deploying-directly-from-model-artifacts), I changed to the serving Model class described there.
I guess this should not be the reason for the 502 error now? Thanks!
-
Hi @Xixiong-Guo, how did you tar the model? When you tar your model, please make sure the top-level directory inside the archive is a numeric version number, e.g.:
$ ls -al
00000123  # version number (not model name)
-
Hi @ChuyangDeng, my tar.gz currently looks like this: ... You mean the directory should look like this instead: ...? Thanks.
-
Yes, SageMaker expects the model to be extracted directly under the "/opt/ml/<model_name>/" directory inside the container. The sagemaker-tensorflow-serving container will look for the model version directly under "<model_name>/". So your tar structure should be: model.tar.gz containing 1/saved_model.pb
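As a concrete illustration, a minimal packaging sketch, assuming the SavedModel was exported locally to a directory such as export/Servo/1/ (the local path is hypothetical; what matters is that the numeric version directory ends up at the top of the archive):

```python
import tarfile

# Hypothetical local export directory produced by the Keras/TF SavedModel export.
export_dir = 'export/Servo/1'

with tarfile.open('model.tar.gz', 'w:gz') as tar:
    # arcname='1' places saved_model.pb (and variables/) under "1/" inside
    # the archive, which is the layout the serving container looks for.
    tar.add(export_dir, arcname='1')
```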
-
I've been having the same issue after following the same examples. I've also checked my tar and used the serving model. The CloudWatch log is as follows, from the moment I invoke the endpoint until it goes back to the regular pinging. (I used ...)
-
For me this ended up being an issue with the shape of the input. I was uploading an individual sample, but the endpoint expects a batch, so I needed to make my input one layer deeper (as described here). Could this be happening for you as well, @Xixiong-Guo?
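To make the shape difference concrete, here is a hedged sketch; the feature values are made up, and `predictor` is assumed to be the endpoint deployed earlier in this thread:

```python
import numpy as np

# Made-up single sample with 4 features, shape (4,).
sample = np.array([0.1, 0.2, 0.3, 0.4])

# Sending the bare sample gives TensorFlow Serving no batch dimension:
# predictor.predict(sample)                          # shape (4,) -- this is what failed here

# Wrapping it in a list adds the batch dimension the endpoint expects:
prediction = predictor.predict([sample.tolist()])    # shape (1, 4)
```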
-
Hi @Sbrikky, did you encounter the same 502 issue?
-
@Xixiong-Guo Yes, I had the exact same error in my notebook as the one you posted, so I didn't bother posting it again.
That suggested that maybe there was something wrong with the shape of the request. Why on earth it ends up being thrown as a 502, I have no clue.
-
Hi @Sbrikky, got it. In your case, was there any difference in the error info when you tried predict(input) versus predict([input.tolist()])?
-
When I use predict([input.tolist()]) it works and I get a prediction back. No 502.
-
Hi @Xixiong-Guo, it looks like you are using csv_serializer. Note here (https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/predictor.py#L325) that the serializer will serialize your input row by row, delimited by ",", if you are passing a Python list: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/predictor.py#L363
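A quick way to sanity-check this locally is to call the serializer directly and look at the payload it produces before invoking the endpoint (the values below are made up, and the exact behavior depends on the SDK version referenced above):

```python
from sagemaker.predictor import csv_serializer

# Made-up feature vectors, just to inspect the wire format.
single = [0.1, 0.2, 0.3]
batch = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

print(csv_serializer(single))   # one comma-delimited row
print(csv_serializer(batch))    # one row per inner list, newline-separated
```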
-
Hi all, I'm hitting the same response: 502 Bad Gateway (nginx/1.16.1).
-
For me this ended up being an issue with the directory structure of the saved model. So your tar structure should be the one described above, with the numeric version directory at the top level of the archive.
-
Following https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/, I tried to save two different models (a sentiment analysis model and a simple regression model) trained with TensorFlow + Keras and uploaded them to SageMaker, but encountered the same 502 error, which is seldom reported here or on Stack Overflow. Any thoughts?
import boto3

runtime = boto3.client('sagemaker-runtime')
# Flatten the padded sequence into a single CSV row.
Body_review = ','.join([str(val) for val in padded_pred]).encode('utf-8')
response = runtime.invoke_endpoint(EndpointName=predictor.endpoint,
                                   ContentType='text/csv',
                                   Body=Body_review)
An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (502) from model with message "502 Bad Gateway (nginx/1.16.1)".
I searched CloudWatch and found this:
2020/05/10 15:53:27 [error] 35#35: *187 connect() failed (111: Connection refused) while connecting to upstream, client: 10.32.0.1, server: , request: "POST /invocations HTTP/1.1", subrequest: "/v1/models/export:predict", upstream: "http://127.0.0.1:27001/v1/models/export:predict", host: "model.aws.local:8080"
I tried another regression model (trained outside SageMaker, saved, uploaded to S3, and deployed on SageMaker, following https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/).
Still the same issue when using the predictor:
from sagemaker.predictor import csv_serializer
predictor.content_type = 'text/csv'
predictor.serializer = csv_serializer
Y_pred = predictor.predict(test.tolist())
Error:
---------------------------------------------------------------------------
ModelError                                Traceback (most recent call last)
<ipython-input> in <module>()
      4 predictor.serializer = csv_serializer
      5
----> 6 Y_pred = predictor.predict(test)

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model)
    108
    109         request_args = self._create_request_args(data, initial_args, target_model)
--> 110         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    111         return self._handle_response(response)
    112

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    314                     "%s() only accepts keyword arguments." % py_operation_name)
    315                 # The "self" in this scope is referring to the BaseClient.
--> 316                 return self._make_api_call(operation_name, kwargs)
    317
    318             _api_call.__name__ = str(py_operation_name)

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    624             error_code = parsed_response.get("Error", {}).get("Code")
    625             error_class = self.exceptions.from_code(error_code)
--> 626             raise error_class(parsed_response, operation_name)
    627         else:
    628             return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (502) from model with message "502 Bad Gateway
nginx/1.16.1".