Add a field to limit the size of uploading content #1701
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
Added a limit of 4MB to non-Blob content, through the `OCI_PAYLOAD_MAX_SIZE` setting, to protect | ||
against OOM DoS attacks. | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# Limit the size of Manifests and Signatures | ||
|
||
By default, Pulp is configured to block the synchronization of non-Blob content (Manifests, | ||
Signatures, etc.) if they exceed a 4MB size limit. A use case for this feature is to avoid | ||
OOM DoS attacks when synchronizing remote repositories with malicious or compromised container | ||
images. | ||
To define a different limit, use the following setting: | ||
``` | ||
OCI_PAYLOAD_MAX_SIZE=<bytes> | ||
``` | ||
|
||
For example, to raise the limit to 10MB: | ||
``` | ||
OCI_PAYLOAD_MAX_SIZE=10_000_000 | ||
``` | ||
lubosmj marked this conversation as resolved.
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,22 +5,85 @@ | |
|
||
from aiohttp.client_exceptions import ClientResponseError | ||
from collections import namedtuple | ||
from django.conf import settings | ||
from logging import getLogger | ||
from urllib import parse | ||
|
||
from pulpcore.plugin.download import DownloaderFactory, HttpDownloader | ||
|
||
from pulp_container.constants import V2_ACCEPT_HEADERS | ||
from pulp_container.constants import ( | ||
CONTENT_TYPE_WITHOUT_SIZE_RESTRICTION, | ||
MEGABYTE, | ||
V2_ACCEPT_HEADERS, | ||
) | ||
|
||
log = getLogger(__name__) | ||
|
||
HeadResult = namedtuple( | ||
"HeadResult", | ||
["status_code", "path", "artifact_attributes", "url", "headers"], | ||
) | ||
DownloadResult = namedtuple("DownloadResult", ["url", "artifact_attributes", "path", "headers"]) | ||
|
||
|
||
class PayloadTooLarge(ClientResponseError): | ||
"""Client exceeded the max allowed payload size.""" | ||
|
||
|
||
class ValidateResourceSizeMixin: | ||
async def _handle_response(self, response): | ||
""" | ||
Override the HttpDownloader method to limit the size of the response body. | ||
Handle the aiohttp response by writing it to disk and calculating digests. | ||
Args: | ||
response (aiohttp.ClientResponse): The response to handle. | ||
Returns: | ||
DownloadResult: Contains information about the result. See the DownloadResult docs for | ||
more information. | ||
""" | ||
if self.headers_ready_callback: | ||
await self.headers_ready_callback(response.headers) | ||
total_size = 0 | ||
while True: | ||
chunk = await response.content.read(MEGABYTE) | ||
total_size += len(chunk) | ||
max_body_size = self._get_max_allowed_resource_size(response) | ||
if max_body_size and total_size > max_body_size: | ||
self._ensure_no_broken_file() | ||
raise PayloadTooLarge( | ||
status=413, | ||
message="manifest invalid", | ||
request_info=response.request_info, | ||
history=response.history, | ||
) | ||
if not chunk: | ||
await self.finalize() | ||
break # the download is done | ||
await self.handle_data(chunk) | ||
return DownloadResult( | ||
path=self.path, | ||
artifact_attributes=self.artifact_attributes, | ||
url=self.url, | ||
headers=response.headers, | ||
) | ||
|
||
def _get_max_allowed_resource_size(self, response): | ||
""" | ||
Returns the maximum allowed size for non-blob artifacts. | ||
""" | ||
|
||
# content_type is defined by aiohttp based on the definition of the content-type header. | ||
# When it is not set, aiohttp defines it as "application/octet-stream" | ||
# note: http content-type header can be manipulated, making it easy to bypass this | ||
# size restriction, but checking the manifest content is also not a feasible solution | ||
# because we would need to first download it. | ||
Comment on lines +75 to +79
😭 So we actually need to depend on this optional header, meaning we will never be able to guarantee that no manifest larger than 4MB is synced into Pulp... Why try to support it, then? I am still not sold on this restriction.
Also, it is somewhat unfortunate that we rely on the downloader to determine the type of the data, as its sole responsibility should be downloading data without being aware of the types. Shifting this check to the sync pipeline seems to be a slightly better solution after all. However, this would mean that we download, e.g., 1GB of data and then throw it away. Perhaps that is fine. I am sorry for dragging you away.
Resurrecting this comment: #1701 (comment)
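To make the trade-off in this thread concrete, here is a minimal, self-contained sketch of a streaming size check. It is not the PR's code: `FakeBody`, `TooLarge`, `read_capped`, `CHUNK_SIZE`, and `UNRESTRICTED_TYPES` are hypothetical names, and the contents of `UNRESTRICTED_TYPES` are an assumption, since `CONTENT_TYPE_WITHOUT_SIZE_RESTRICTION` is not shown in this diff.

```python
import asyncio

CHUNK_SIZE = 1024 * 1024  # hypothetical read size, not the PR's MEGABYTE constant
MAX_SIZE = 4_000_000  # mirrors the OCI_PAYLOAD_MAX_SIZE default in this PR
# Assumption: stand-in for CONTENT_TYPE_WITHOUT_SIZE_RESTRICTION, whose
# real contents are not shown in this diff.
UNRESTRICTED_TYPES = {"application/octet-stream"}


class TooLarge(Exception):
    """Raised when a size-restricted body exceeds the limit."""


class FakeBody:
    """In-memory stand-in for an HTTP response body with an async read()."""

    def __init__(self, payload):
        self._payload = payload
        self._pos = 0

    async def read(self, n):
        # Return at most n bytes; an empty result signals end of stream.
        chunk = self._payload[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk


def max_allowed_size(content_type, limit=MAX_SIZE):
    """Return None (no limit) for exempt content types, else the byte limit."""
    return None if content_type in UNRESTRICTED_TYPES else limit


async def read_capped(body, content_type, limit=MAX_SIZE):
    """Read the body chunk by chunk, aborting once the limit is exceeded."""
    cap = max_allowed_size(content_type, limit)
    total = 0
    chunks = []
    while True:
        chunk = await body.read(CHUNK_SIZE)
        if not chunk:
            break  # the download is done
        total += len(chunk)
        if cap is not None and total > cap:
            raise TooLarge(f"read {total} bytes, limit is {cap}")
        chunks.append(chunk)
    return b"".join(chunks)
```

Because the limit is chosen from the declared content type, a server that labels an oversized manifest with an exempt type is never capped in this sketch, which is the bypass the code comment above warns about. The upside is that at most one extra chunk is read before an oversized, correctly labeled payload is rejected.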
||
if response.content_type in CONTENT_TYPE_WITHOUT_SIZE_RESTRICTION: | ||
return None | ||
|
||
return settings["OCI_PAYLOAD_MAX_SIZE"] | ||
|
||
|
||
class RegistryAuthHttpDownloader(HttpDownloader): | ||
class RegistryAuthHttpDownloader(ValidateResourceSizeMixin, HttpDownloader): | ||
""" | ||
Custom Downloader that automatically handles Token Based and Basic Authentication. | ||
|
||
|
@@ -193,7 +256,7 @@ async def _handle_head_response(self, response): | |
) | ||
|
||
|
||
class NoAuthSignatureDownloader(HttpDownloader): | ||
class NoAuthSignatureDownloader(ValidateResourceSizeMixin, HttpDownloader): | ||
"""A downloader class suited for signature downloads.""" | ||
|
||
def raise_for_status(self, response): | ||
|
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -8,3 +8,8 @@ | |||||||||||
|
||||||||||||
# The number of allowed threads to sign manifests in parallel | ||||||||||||
MAX_PARALLEL_SIGNING_TASKS = 10 | ||||||||||||
|
||||||||||||
# Set max payload size for non-blob container artifacts (manifests, signatures, etc). | ||||||||||||
# This limit is also valid for docker manifests, but we will use the OCI_ prefix | ||||||||||||
# (instead of ARTIFACT_) to avoid confusion with pulpcore artifacts. | ||||||||||||
OCI_PAYLOAD_MAX_SIZE = 4_000_000 | ||||||||||||
Comment on lines +13 to +15
Is it 4MB?
ref: https://github.com/distribution/distribution/blob/d0eebf3af4fc1d5c0287e5af61147403ccb78ec2/registry/handlers/manifests.go#L29, https://github.com/containers/image/blob/8d792a4a930c36ae3228061531cca0958ba4fe0a/internal/iolimits/iolimits.go#L20 |
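One possible reading of this question is decimal versus binary megabytes; that reading is an assumption, since the suggested change itself is not visible here. The arithmetic below compares the `4_000_000` default from this PR with a 4 MiB limit.

```python
DECIMAL_4MB = 4 * 1000 * 1000  # 4 MB in SI units, the value used in this PR
BINARY_4MIB = 4 * 1024 * 1024  # 4 MiB, equivalently 4 << 20

print(DECIMAL_4MB)                # 4000000
print(BINARY_4MIB)                # 4194304
print(BINARY_4MIB - DECIMAL_4MB)  # 194304
```

Under that reading, a payload between 4,000,000 and 4,194,304 bytes would be rejected by the decimal limit but accepted by a binary one.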
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -19,6 +19,7 @@ | |||
SIGNATURE_TYPE, | ||||
V2_ACCEPT_HEADERS, | ||||
) | ||||
from pulp_container.app.downloaders import PayloadTooLarge | ||||
from pulp_container.app.models import ( | ||||
Blob, | ||||
BlobManifest, | ||||
|
@@ -62,7 +63,12 @@ def __init__(self, remote, signed_only): | |||
|
||||
async def _download_manifest_data(self, manifest_url): | ||||
downloader = self.remote.get_downloader(url=manifest_url) | ||||
response = await downloader.run(extra_data={"headers": V2_ACCEPT_HEADERS}) | ||||
try: | ||||
response = await downloader.run(extra_data={"headers": V2_ACCEPT_HEADERS}) | ||||
except PayloadTooLarge as e: | ||||
git-hyagi marked this conversation as resolved.
|
||||
log.warning(e.message + ": max size limit exceeded!") | ||||
raise RuntimeError("Manifest max size limit exceeded.") | ||||
Does this result in a failed task or a skipped manifest list? I thought we agreed on skipping the manifest list, didn't we?
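The difference between the two outcomes asked about here can be sketched in isolation. This is a simplified illustration, not Pulp's sync pipeline: `download`, `sync_failing`, and `sync_skipping` are hypothetical helpers, and `PayloadTooLarge` is a plain stand-in for the PR's exception.

```python
import logging

log = logging.getLogger(__name__)


class PayloadTooLarge(Exception):
    """Stand-in for the PR's exception, simplified for this sketch."""


def download(url, oversized):
    """Hypothetical downloader: fails for URLs listed in `oversized`."""
    if url in oversized:
        raise PayloadTooLarge(url)
    return f"manifest for {url}"


def sync_failing(urls, oversized):
    """Abort the whole run on the first oversized manifest (re-raise)."""
    results = []
    for url in urls:
        try:
            results.append(download(url, oversized))
        except PayloadTooLarge as exc:
            raise RuntimeError("Manifest max size limit exceeded.") from exc
    return results


def sync_skipping(urls, oversized):
    """Log a warning and continue with the remaining manifests."""
    results = []
    for url in urls:
        try:
            results.append(download(url, oversized))
        except PayloadTooLarge:
            log.warning("Skipping %s: max size limit exceeded", url)
            continue
    return results
```

Re-raising as `RuntimeError`, as the diff above does, corresponds to `sync_failing` in this sketch; the skip behavior the reviewer describes corresponds to `sync_skipping`.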
||||
|
||||
with open(response.path, "rb") as content_file: | ||||
raw_bytes_data = content_file.read() | ||||
response.artifact_attributes["file"] = response.path | ||||
|
@@ -542,6 +548,12 @@ async def create_signatures(self, man_dc, signature_source): | |||
"{} is not accessible, can't sync an image signature. " | ||||
"Error: {} {}".format(signature_url, exc.status, exc.message) | ||||
) | ||||
except PayloadTooLarge as e: | ||||
git-hyagi marked this conversation as resolved.
|
||||
log.warning( | ||||
"Failed to sync signature {}. Error: {}".format(signature_url, e.args[0]) | ||||
) | ||||
signature_counter += 1 | ||||
|
||||
continue | ||||
|
||||
with open(signature_download_result.path, "rb") as f: | ||||
signature_raw = f.read() | ||||
|
Let's try to be more explicit.