RF: Copy get_key_url from datalad.support.s3 #143

Merged (1 commit) on Jun 6, 2024
25 changes: 24 additions & 1 deletion datalad_crawler/nodes/s3.py
@@ -18,17 +18,40 @@

 from datalad.utils import updated
 from datalad.dochelpers import exc_str
-from datalad.support.s3 import get_key_url
 from datalad.support.network import iso8601_to_epoch
 from datalad.downloaders.providers import Providers
 from datalad.downloaders.s3 import S3Downloader
 from datalad.support.exceptions import TargetFileAbsent
+from datalad.support.network import urlquote
 from ..dbs.versions import SingleVersionDB

 from logging import getLogger
 lgr = getLogger('datalad.crawl.s3')


+def get_key_url(e, schema='http', versioned=True):
+    """Generate an s3:// or http:// url given a key
+    if versioned url is requested but version_id is None, no versionId suffix
+    will be added
+    """
+    # Copied from datalad.support.s3, which is removing support for boto
+    #
+    # TODO: here we would need to encode the name since urlquote actually
+    # can't do that on its own... but then we should get a copy of the thing
+    # so we could still do the .format....
+    # ... = e.name.encode('utf-8')  # unicode isn't advised in URLs
+    e.name_urlquoted = urlquote(e.name)
+    if schema == 'http':
+        fmt = "http://{e.bucket.name}.s3.amazonaws.com/{e.name_urlquoted}"
+    elif schema == 's3':
+        fmt = "s3://{e.bucket.name}/{e.name_urlquoted}"
+    else:
+        raise ValueError(schema)
+    if versioned and e.version_id is not None:
+        fmt += "?versionId={e.version_id}"
+    return fmt.format(e=e)
+
+
 def get_version_for_key(k, fmt='0.0.%Y%m%d'):
     """Given a key return a version it identifies to be used for tagging

Codecov / codecov/patch warnings for datalad_crawler/nodes/s3.py:
Added line #L47 (the 's3' schema format string) was not covered by tests.
Added line #L49 (raise ValueError(schema)) was not covered by tests.

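The behaviour of the copied helper, including the two branches Codecov flagged as uncovered, can be sketched without boto or datalad installed. This is an illustration under stated assumptions, not the project's test code: `urllib.parse.quote` stands in for `datalad.support.network.urlquote` (assumed here to quote the same way for simple names), and `make_key` with its `example-bucket` default is a hypothetical stub that only mimics the `name`, `version_id`, and `bucket.name` attributes the function reads.

```python
from types import SimpleNamespace
from urllib.parse import quote  # assumption: stand-in for datalad's urlquote


def get_key_url(e, schema='http', versioned=True):
    """Same logic as the copied helper, with quote() swapped in for urlquote()."""
    e.name_urlquoted = quote(e.name)
    if schema == 'http':
        fmt = "http://{e.bucket.name}.s3.amazonaws.com/{e.name_urlquoted}"
    elif schema == 's3':
        fmt = "s3://{e.bucket.name}/{e.name_urlquoted}"
    else:
        raise ValueError(schema)
    # The versionId suffix is added only when a version is actually known
    if versioned and e.version_id is not None:
        fmt += "?versionId={e.version_id}"
    return fmt.format(e=e)


def make_key(name, version_id=None, bucket="example-bucket"):
    """Hypothetical stub exposing only the attributes get_key_url reads."""
    return SimpleNamespace(name=name, version_id=version_id,
                           bucket=SimpleNamespace(name=bucket))


http_url = get_key_url(make_key("dir/file 1.txt", "abc123"))
s3_url = get_key_url(make_key("dir/file 1.txt", "abc123"), schema='s3')
no_version = get_key_url(make_key("dir/file 1.txt"), schema='s3')

try:
    get_key_url(make_key("dir/file 1.txt"), schema='ftp')
    bad_schema_rejected = False
except ValueError:
    bad_schema_rejected = True
```

Note that the function stores `name_urlquoted` on the key object it receives, so callers that reuse key objects will see that extra attribute afterwards.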