Simplify s3 functions: create_s3_bucket. #4566

Open
wants to merge 62 commits into
base: master
062da74
Simplify create s3 bucket.
DailyDreaming Aug 16, 2023
1142681
Remove old function.
DailyDreaming Aug 16, 2023
25ab299
Resolve one more mk bucket location.
DailyDreaming Aug 16, 2023
5bf34bc
Simplify tagging.
DailyDreaming Aug 17, 2023
cd47aa1
Typing.
DailyDreaming Aug 17, 2023
b39d561
Typing.
DailyDreaming Aug 17, 2023
5428f5e
Merge branch 'master' into issues/4088-mv-s3-functions-mk-bucket
adamnovak Aug 17, 2023
5b25a35
Simplify s3 functions: delete_s3_bucket. (#4567)
DailyDreaming Aug 31, 2023
7646a5d
Merge branch 'master' into issues/4088-mv-s3-functions-mk-bucket
adamnovak Aug 31, 2023
095ba36
Rebase.
DailyDreaming Apr 30, 2024
29a2ecc
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 1, 2024
8870d67
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 1, 2024
db585b7
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 1, 2024
2bb9092
Missing import.
DailyDreaming May 1, 2024
cb9e474
Another missing import.
DailyDreaming May 2, 2024
0e08474
Merge branch 'master' into issues/4088-mv-s3-functions-mk-bucket
DailyDreaming May 2, 2024
3ca63ed
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 2, 2024
3e21094
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 6, 2024
7f4f0d6
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 6, 2024
a4a395e
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 6, 2024
0a950bf
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 6, 2024
03941e9
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 6, 2024
3323e08
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 7, 2024
9341962
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 9, 2024
c0eea1b
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 14, 2024
59b4b8a
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 14, 2024
52be927
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 14, 2024
2242ace
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 17, 2024
8c3a876
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 17, 2024
a9c62db
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 18, 2024
f779127
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 20, 2024
d13fc62
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 20, 2024
cc6ab3c
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 21, 2024
fbdc80a
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 21, 2024
392064b
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 21, 2024
74d8ee8
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 23, 2024
004f139
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 24, 2024
455da2a
Merge branch 'master' into issues/4088-mv-s3-functions-mk-bucket
DailyDreaming Jun 6, 2024
53dcd42
Remove cruft.
DailyDreaming Jun 6, 2024
f6876ad
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 6, 2024
797624d
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 10, 2024
b468005
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 10, 2024
04c47e2
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 10, 2024
77303d6
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 11, 2024
0c48203
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 11, 2024
8e35f61
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 11, 2024
56b6ad6
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 19, 2024
ed97974
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 19, 2024
90d54a8
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 24, 2024
21a221e
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 25, 2024
39f9c4c
Rebase from master.
DailyDreaming Jun 28, 2024
21bbf5e
Patch test_utils.py.
DailyDreaming Jun 28, 2024
8b3c897
Merge branch 'master' into issues/4088-mv-s3-functions-mk-bucket
DailyDreaming Aug 19, 2024
6be49c0
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 19, 2024
85558eb
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 19, 2024
5ac40e0
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 21, 2024
b9efc4a
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 22, 2024
cc6be4b
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 22, 2024
e89e6ac
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 22, 2024
9a3f53a
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 27, 2024
68954a2
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 27, 2024
3aaf340
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Sep 3, 2024
3 changes: 2 additions & 1 deletion contrib/admin/cleanup_aws_resources.py
@@ -22,7 +22,8 @@

from src.toil.lib import aws
from src.toil.lib.aws import session
from src.toil.lib.aws.utils import delete_iam_role, delete_iam_instance_profile, delete_s3_bucket, delete_sdb_domain
from src.toil.lib.aws.utils import delete_iam_role, delete_iam_instance_profile, delete_sdb_domain
from src.toil.lib.aws.s3 import delete_s3_bucket
from src.toil.lib.generatedEC2Lists import regionDict

# put us-west-2 first as our default test region; that way anything with a universal region shows there
45 changes: 12 additions & 33 deletions src/toil/jobStores/aws/jobStore.py
@@ -37,10 +37,17 @@
from urllib.parse import ParseResult, parse_qs, urlencode, urlsplit, urlunsplit

from botocore.exceptions import ClientError
from mypy_boto3_s3.service_resource import Bucket
from mypy_boto3_sdb import SimpleDBClient
from mypy_boto3_sdb.type_defs import ReplaceableItemTypeDef, ReplaceableAttributeTypeDef, SelectResultTypeDef, ItemTypeDef, AttributeTypeDef, DeletableItemTypeDef, UpdateConditionTypeDef

from toil.lib.aws.utils import flatten_tags, enable_public_objects
import toil.lib.encryption as encryption
from toil.fileStores import FileID
from toil.job import Job, JobDescription
from toil.lib.aws import tags_from_env
from toil.lib.aws.s3 import create_s3_bucket, delete_s3_bucket

from toil.jobStores.abstractJobStore import (
    AbstractJobStore,
    ConcurrentFileModificationException,
@@ -62,12 +69,10 @@
    uploadFromPath,
)
from toil.jobStores.utils import ReadablePipe, ReadableTransformingPipe, WritablePipe
from toil.lib.aws import build_tag_dict_from_env
from toil.lib.aws.session import establish_boto3_session
from toil.lib.aws.utils import (
    NoBucketLocationError,
    boto3_pager,
    create_s3_bucket,
    enable_public_objects,
    flatten_tags,
    get_bucket_region,
@@ -77,6 +82,7 @@
    retry_s3,
    retryable_s3_errors,
)

from toil.lib.compatibility import compat_bytes
from toil.lib.ec2nodes import EC2Regions
from toil.lib.exceptions import panic
@@ -834,9 +840,7 @@ def bucket_retry_predicate(error):
bucketExisted = False
logger.debug("Bucket '%s' does not exist.", bucket_name)
if create:
bucket = create_s3_bucket(
self.s3_resource, bucket_name, self.region
)
bucket = create_s3_bucket(self.s3_resource, bucket_name, self.region)
# Wait until the bucket exists before checking the region and adding tags
bucket.wait_until_exists()

@@ -845,11 +849,10 @@ def bucket_retry_predicate(error):
# produce an S3ResponseError with code
# NoSuchBucket. We let that kick us back up to the
# main retry loop.
assert (
get_bucket_region(bucket_name) == self.region
assert (get_bucket_region(bucket_name) == self.region
), f"bucket_name: {bucket_name}, {get_bucket_region(bucket_name)} != {self.region}"

tags = build_tag_dict_from_env()
tags = tags_from_env()

if tags:
flat_tags = flatten_tags(tags)
@@ -1742,7 +1745,7 @@ def destroy(self):
# TODO: Add other failure cases to be ignored here.
self._registered = None
if self.files_bucket is not None:
self._delete_bucket(self.files_bucket)
delete_s3_bucket(s3_resource=s3_boto3_resource, bucket_name=self.files_bucket.name)
self.files_bucket = None
for name in 'files_domain_name', 'jobs_domain_name':
domainName = getattr(self, name)
@@ -1760,30 +1763,6 @@ def _delete_domain(self, domainName):
if not no_such_sdb_domain(e):
raise

    @staticmethod
    def _delete_bucket(bucket):
        """
        :param bucket: S3.Bucket
        """
        for attempt in retry_s3():
            with attempt:
                try:
                    uploads = s3_boto3_client.list_multipart_uploads(Bucket=bucket.name).get('Uploads')
                    if uploads:
                        for u in uploads:
                            s3_boto3_client.abort_multipart_upload(Bucket=bucket.name,
                                                                   Key=u["Key"],
                                                                   UploadId=u["UploadId"])

                    bucket.objects.all().delete()
                    bucket.object_versions.delete()
                    bucket.delete()
                except s3_boto3_client.exceptions.NoSuchBucket:
                    pass
                except ClientError as e:
                    if get_error_status(e) != 404:
                        raise


aRepr = reprlib.Repr()
aRepr.maxstring = 38 # so UUIDs don't get truncated (36 for UUID plus 2 for quotes)
26 changes: 11 additions & 15 deletions src/toil/lib/aws/__init__.py
@@ -176,29 +176,25 @@ def file_begins_with(path, prefix):
except (URLError, socket.timeout, HTTPException):
return False


def running_on_ecs() -> bool:
    """
    Return True if we are currently running on Amazon ECS, and false otherwise.
    """
    # We only care about relatively current ECS
    return 'ECS_CONTAINER_METADATA_URI_V4' in os.environ

def build_tag_dict_from_env(environment: MutableMapping[str, str] = os.environ) -> Dict[str, str]:
    tags = dict()
    owner_tag = environment.get('TOIL_OWNER_TAG')

def tags_from_env() -> Dict[str, str]:
    try:
        tags = json.loads(os.environ.get('TOIL_AWS_TAGS', '{}'))
    except json.decoder.JSONDecodeError:
        logger.error('TOIL_AWS_TAGS must be in JSON format: {"key" : "value", ...}')
        exit(1)

    # TODO: Remove TOIL_OWNER_TAG and only use TOIL_AWS_TAGS .
    owner_tag = os.environ.get('TOIL_OWNER_TAG')
    if owner_tag:
        tags.update({'Owner': owner_tag})

    user_tags = environment.get('TOIL_AWS_TAGS')
    if user_tags:
        try:
            json_user_tags = json.loads(user_tags)
            if isinstance(json_user_tags, dict):
                tags.update(json.loads(user_tags))
            else:
                logger.error('TOIL_AWS_TAGS must be in JSON format: {"key" : "value", ...}')
                exit(1)
        except json.decoder.JSONDecodeError:
            logger.error('TOIL_AWS_TAGS must be in JSON format: {"key" : "value", ...}')
            exit(1)
    return tags
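
Review note: the removed `build_tag_dict_from_env` checked `isinstance(json_user_tags, dict)` before merging, while the new `tags_from_env` passes whatever `json.loads` returns straight through. A value such as `TOIL_AWS_TAGS='[1, 2]'` would therefore be returned as a list, or fail on `tags.update(...)` when `TOIL_OWNER_TAG` is set. The following is only a sketch of one way to keep that check. The name `tags_from_env_checked`, the `environ` parameter, and raising `ValueError` instead of calling `exit(1)` are assumptions made for illustration and are not part of this PR.

```python
import json
import os
from typing import Dict, Mapping

# Same message text the PR logs when TOIL_AWS_TAGS cannot be used.
_FORMAT_HINT = 'TOIL_AWS_TAGS must be in JSON format: {"key" : "value", ...}'


def tags_from_env_checked(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Hypothetical variant of tags_from_env() that rejects non-object JSON."""
    try:
        tags = json.loads(environ.get('TOIL_AWS_TAGS', '{}'))
    except json.JSONDecodeError as e:
        # Assumption: raise instead of exit(1) so callers and tests can react.
        raise ValueError(_FORMAT_HINT) from e
    if not isinstance(tags, dict):
        # Valid JSON that is not an object (list, string, number) is rejected too.
        raise ValueError(_FORMAT_HINT)

    # Same precedence as the new code in this PR: Owner is applied last.
    owner_tag = environ.get('TOIL_OWNER_TAG')
    if owner_tag:
        tags['Owner'] = owner_tag
    return tags
```

Taking the environment as a parameter, as the removed function did, keeps the helper testable without patching `os.environ`.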
145 changes: 145 additions & 0 deletions src/toil/lib/aws/s3.py
@@ -0,0 +1,145 @@
# Copyright (C) 2015-2023 Regents of the University of California
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
import sys

from typing import (Any,
                    Dict,
                    List,
                    Optional,
                    Union,
                    cast)

from toil.lib.retry import retry, get_error_status
from toil.lib.misc import printq
from . import tags_from_env
from toil.lib.aws.utils import enable_public_objects, flatten_tags

if sys.version_info >= (3, 8):
    from typing import Literal
else:
    from typing_extensions import Literal

try:
    from boto.exception import BotoServerError, S3ResponseError
    from botocore.exceptions import ClientError
    from mypy_boto3_iam import IAMClient, IAMServiceResource
    from mypy_boto3_s3 import S3Client, S3ServiceResource
    from mypy_boto3_s3.literals import BucketLocationConstraintType
    from mypy_boto3_s3.service_resource import Bucket, Object
    from mypy_boto3_sdb import SimpleDBClient
except ImportError:
    BotoServerError = Exception  # type: ignore
    S3ResponseError = Exception  # type: ignore
    ClientError = Exception  # type: ignore
    # AWS/boto extra is not installed


logger = logging.getLogger(__name__)


@retry(errors=[BotoServerError, S3ResponseError, ClientError])
def create_s3_bucket(
    s3_resource: "S3ServiceResource",
    bucket_name: str,
    region: Union["BucketLocationConstraintType", Literal["us-east-1"]],
    tags: Optional[Dict[str, str]] = None,
    public: bool = True
) -> "Bucket":
    """
    Create an AWS S3 bucket, using the given Boto3 S3 session, with the
    given name, in the given region.

    Supports the us-east-1 region, where bucket creation is special.

    *ALL* S3 bucket creation should use this function.
    """
    logger.debug("Creating bucket '%s' in region %s.", bucket_name, region)
    if region == "us-east-1":  # see https://github.com/boto/boto3/issues/125
        bucket = s3_resource.create_bucket(Bucket=bucket_name)
    else:
        bucket = s3_resource.create_bucket(
            Bucket=bucket_name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    # wait until the bucket exists before adding tags
    bucket.wait_until_exists()

    tags = tags_from_env() if tags is None else tags
    bucket_tagging = s3_resource.BucketTagging(bucket_name)
    bucket_tagging.put(Tagging={'TagSet': flatten_tags(tags)})  # type: ignore

    # enabling public objects is the historical default
    if public:
        enable_public_objects(bucket_name)

    return bucket
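
Review note: the two branches in `create_s3_bucket` differ only in whether `CreateBucketConfiguration` is passed, and the tags are sent to S3 as a list of `Key`/`Value` pairs. The sketch below isolates both steps as pure functions so they can be checked without AWS credentials. `create_bucket_kwargs` and `to_tag_set` are hypothetical names introduced here, and `to_tag_set` is only assumed to mirror what `flatten_tags` produces.

```python
from typing import Any, Dict, List


def create_bucket_kwargs(bucket_name: str, region: str) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for create_bucket()."""
    kwargs: Dict[str, Any] = {"Bucket": bucket_name}
    # us-east-1 is the special case: no LocationConstraint is sent for it
    # (see https://github.com/boto/boto3/issues/125, cited in the diff).
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def to_tag_set(tags: Dict[str, str]) -> List[Dict[str, str]]:
    """Assumed stand-in for flatten_tags(): dict -> S3 TagSet list."""
    return [{"Key": key, "Value": value} for key, value in tags.items()]
```

With helpers like these, the creation call would reduce to `s3_resource.create_bucket(**create_bucket_kwargs(bucket_name, region))`, which leaves a single call site to wrap in retries.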


@retry(errors=[BotoServerError, S3ResponseError, ClientError])
def delete_s3_bucket(
    s3_resource: "S3ServiceResource",
    bucket_name: str,
    quiet: bool = True
) -> None:
    """
    Delete the bucket with 'bucket_name'.

    Note: 'quiet' is False when used for a clean up utility script (contrib/admin/cleanup_aws_resources.py)
    that prints progress rather than logging. Logging should be used for all other internal Toil usage.
    """
    assert isinstance(bucket_name, str), f'{bucket_name} is not a string ({type(bucket_name)}).'
    logger.debug("Deleting bucket '%s'.", bucket_name)
    printq(f'\n * Deleting s3 bucket: {bucket_name}\n\n', quiet)

    s3_client = s3_resource.meta.client

    try:
        for u in s3_client.list_multipart_uploads(Bucket=bucket_name).get('Uploads', []):
            s3_client.abort_multipart_upload(
                Bucket=bucket_name,
                Key=u["Key"],
                UploadId=u["UploadId"]
            )

        paginator = s3_client.get_paginator('list_object_versions')
        for response in paginator.paginate(Bucket=bucket_name):
            # Versions and delete markers can both go in here to be deleted.
            # They both have Key and VersionId, but there's no shared base type
            # defined for them in the stubs to express that. See
            # <https://github.com/vemel/mypy_boto3_builder/issues/123>. So we
            # have to do gymnastics to get them into the same list.
            to_delete: List[Dict[str, Any]] = cast(List[Dict[str, Any]], response.get('Versions', [])) + \
                                              cast(List[Dict[str, Any]], response.get('DeleteMarkers', []))
            for entry in to_delete:
                printq(f"    Deleting {entry['Key']} version {entry['VersionId']}", quiet)
                s3_client.delete_object(
                    Bucket=bucket_name,
                    Key=entry['Key'],
                    VersionId=entry['VersionId']
                )
        bucket = s3_resource.Bucket(bucket_name)
        bucket.objects.all().delete()
        bucket.object_versions.delete()
        bucket.delete()
        printq(f'\n * Deleted s3 bucket successfully: {bucket_name}\n\n', quiet)
        logger.debug("Deleted s3 bucket successfully '%s'.", bucket_name)
    except s3_client.exceptions.NoSuchBucket:
        printq(f'\n * S3 bucket no longer exists: {bucket_name}\n\n', quiet)
        logger.debug("S3 bucket no longer exists '%s'.", bucket_name)
    except ClientError as e:
        if get_error_status(e) != 404:
            raise
        printq(f'\n * S3 bucket no longer exists: {bucket_name}\n\n', quiet)
        logger.debug("S3 bucket no longer exists '%s'.", bucket_name)
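
Review note: the comment in `delete_s3_bucket` explains that object versions and delete markers both carry `Key` and `VersionId` and are merged into one list per page. A minimal sketch of that merge as a standalone generator follows. `iter_versions_to_delete` is a hypothetical name, and the page dictionaries in the usage note are hand-written stand-ins for `list_object_versions` responses rather than real output.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def iter_versions_to_delete(pages: Iterable[Dict[str, Any]]) -> Iterator[Tuple[str, str]]:
    """Yield (Key, VersionId) for every version and delete marker in the pages."""
    for page in pages:
        # Either key may be absent from a page, so default each to an empty list.
        entries = page.get("Versions", []) + page.get("DeleteMarkers", [])
        for entry in entries:
            yield entry["Key"], entry["VersionId"]
```

In the function above, `pages` would be `paginator.paginate(Bucket=bucket_name)`. For a quick check it can be a literal such as `[{"Versions": [{"Key": "a", "VersionId": "1"}]}, {"DeleteMarkers": [{"Key": "a", "VersionId": "2"}]}]`.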
6 changes: 5 additions & 1 deletion src/toil/lib/aws/session.py
@@ -15,6 +15,7 @@
import logging
import os
import threading

from typing import (
    TYPE_CHECKING,
    Dict,
@@ -30,6 +31,8 @@
import boto3
import boto3.resources.base
import botocore

from typing import Dict, Optional, Tuple, cast, Union, Literal, overload, TypeVar
from boto3 import Session
from botocore.client import Config
from botocore.session import get_session
@@ -61,6 +64,7 @@
# initializing Boto3 (or Boto2) things at a time.
_init_lock = threading.RLock()


def _new_boto3_session(region_name: Optional[str] = None) -> Session:
    """
    This is the One True Place where new Boto3 sessions should be made, and
@@ -81,6 +85,7 @@ def _new_boto3_session(region_name: Optional[str] = None) -> Session:

    return Session(botocore_session=botocore_session, region_name=region_name, profile_name=os.environ.get("TOIL_AWS_PROFILE", None))


class AWSConnectionManager:
    """
    Class that represents a connection to AWS. Caches Boto 3 and Boto 2 objects
@@ -234,7 +239,6 @@ def client(
        config: Optional[Config] = None,
    ) -> "AutoScalingClient": ...


    def client(self, region: Optional[str], service_name: Literal["ec2", "iam", "s3", "sts", "sdb", "autoscaling"], endpoint_url: Optional[str] = None,
               config: Optional[Config] = None) -> botocore.client.BaseClient:
        """