Specifying a code location in ProcessingStep should be optional #4904
Comments
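@system123 can you try creating the step this way:

    import os

    from sagemaker.processing import ProcessingInput, ProcessingOutput, ScriptProcessor
    from sagemaker.workflow.steps import ProcessingStep

    # ScriptProcessor runs the supplied script with the given command inside the image.
    preprocessing_job = ScriptProcessor(
        image_uri=processing_image_uri,
        command=["python3"],
        role=role_pipeline,
        instance_type=instance_type,
        instance_count=instance_count,
        sagemaker_session=pipeline_session,
    )

    # With a pipeline session, run() returns step arguments instead of starting a job.
    step_args_preprocessing = preprocessing_job.run(
        code=os.path.join(BASE_DIR, "preprocess.py"),
        inputs=[
            ProcessingInput(...)
        ],
        outputs=[
            ProcessingOutput(...)
        ],
    )

    step_preprocessing = ProcessingStep(
        name="PreprocessingStep",
        step_args=step_args_preprocessing,
    )

Hope that solves the issue.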
Hi @system123, thanks for reaching out! I have received an internal customer ticket on the same topic and responded to that. Not sure if that was from you, so replying here as well.

The ScriptProcessor, as its name suggests, is for the use case of supplying a custom script or code. That's why its run() method requires the code argument. However, there is a more general class, Processor, for which you don't need to supply the code. Instead, you supply an image URI, which can be your custom image. This class works with ProcessingStep as well. See the example below:
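A minimal sketch of such a step, not the maintainer's original snippet. It assumes the base sagemaker.processing.Processor class, a pipeline session, and a container image that already holds the processing code. The helper name build_container_only_step and its default instance settings are hypothetical.

```python
def build_container_only_step(
    image_uri,
    role,
    pipeline_session,
    entrypoint=None,
    inputs=None,
    outputs=None,
    instance_type="ml.m5.xlarge",  # assumed default, adjust as needed
    instance_count=1,
    step_name="PreprocessingStep",
):
    """Build a ProcessingStep that runs whatever the image already contains.

    Note that there is no `code` parameter here: nothing is uploaded to the
    container, so the image (or the optional `entrypoint`) decides what runs.
    """
    # Imported lazily so this module can be loaded without the SDK installed.
    from sagemaker.processing import Processor
    from sagemaker.workflow.steps import ProcessingStep

    processor = Processor(
        image_uri=image_uri,
        role=role,
        instance_type=instance_type,
        instance_count=instance_count,
        entrypoint=entrypoint,  # e.g. ["python3", "/opt/app/preprocess.py"] (hypothetical path)
        sagemaker_session=pipeline_session,
    )

    # With a pipeline session, run() returns step arguments rather than
    # launching a processing job immediately.
    step_args = processor.run(inputs=inputs, outputs=outputs)

    return ProcessingStep(name=step_name, step_args=step_args)
```

The resulting step can then be passed to a pipeline definition in the same way as a step built from a ScriptProcessor.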
Hope this can help.
Hi @system123, please kindly let us know if this issue is resolved.
@mollyheamazon sorry for the delay. Yes, the solution provided by @qidewenwhen solved the issue.
Describe the bug
When defining a ProcessingStep using the Python SDK, the pipeline compiler complains if the code= argument is not specified. However, the SDK documentation and code have code=None as a default (which is invalid), and the AWS documentation for processing steps states that the code parameter may be None if the code already exists in the container. In this case the ScriptProcessor already contains the code, and defines how to execute it through the command= parameter.

To reproduce
Defining a processing step without a code argument will cause an error.

Expected behavior
If a ScriptProcessor is used which is based upon a custom image, the command should just be run directly. No specific code needs to be uploaded or pulled into the container. The expected behaviour can currently be obtained with the SDK by pointing code to any dummy file on S3 or the local machine. This file is then pushed to the container, but the command specified by the ScriptProcessor is still executed.

Screenshots or logs
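The dummy-file workaround described under Expected behavior might look like the sketch below. It is only an illustration: the helper names make_placeholder_script and build_step_with_placeholder and the file name placeholder.py are hypothetical, and script_processor is assumed to be a ScriptProcessor already configured with a custom image and a pipeline session.

```python
import tempfile
from pathlib import Path


def make_placeholder_script(directory=None):
    """Create a near-empty Python file whose only job is to satisfy `code=`."""
    target_dir = Path(directory) if directory else Path(tempfile.mkdtemp())
    path = target_dir / "placeholder.py"
    path.write_text("# Placeholder: the container image already holds the real code.\n")
    return str(path)


def build_step_with_placeholder(script_processor, step_name="PreprocessingStep"):
    """Pass the placeholder as `code` so the pipeline compiler accepts the step."""
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.workflow.steps import ProcessingStep

    # The placeholder is uploaded alongside the job, as the report describes.
    step_args = script_processor.run(code=make_placeholder_script())
    return ProcessingStep(name=step_name, step_args=step_args)
```

This keeps the ScriptProcessor in place at the cost of uploading a file that serves no purpose, which is the overhead the report asks to remove.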