filter subdir out of filename when finding .conda component files #83
https://github.com/conda/conda-index/tree/main/tests/archives has some archives that have been stripped of their package data (so they are nice and small); there are similar ones in conda-build. The script to make these from normal "has data" archives is simple but doesn't appear to be in the repository.
I think I can just copy and rename files that exist in the package cache. I'll work on that.
As an Anaconda employee I've looked into editing anaconda.org to not include the prefix, but it was not entirely clear how to do so. (It would have been string-based.)
Thanks for looking into it! It is an annoying little detail, but hopefully this hacky workaround will be good enough. As I said, if it makes sense to filter the subdir out of the extracted directory name, I can do that too. I kind of think that would happen in conda-package-handling, though. I also think it's more intuitive not to do that filtering, because matching the filename to the extracted folder name is pretty well-established behavior.
@@ -125,6 +126,9 @@ def stream_conda_component(

    zf = zipfile.ZipFile(fileobj or filename)
    file_id, _, _ = os.path.basename(filename).rpartition(".")
    # this substitution compensates for web downloads from anaconda.org having
    # the platform as a prefix
    file_id = re.sub("^(osx|linux|win|noarch)(-.+?)?_", "", file_id)
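To make the effect of that substitution concrete, here is a minimal standalone sketch that applies the same pattern to a few filenames. The helper name `component_file_id` and the sample filenames are illustrative assumptions, not part of conda-package-streaming.

```python
import os
import re

# Same pattern as in the diff above: an optional "-<arch>" after the platform
# name, followed by an underscore.
SUBDIR_PREFIX = re.compile(r"^(osx|linux|win|noarch)(-.+?)?_")


def component_file_id(filename: str) -> str:
    """Hypothetical helper: derive the id used to name the inner components.

    Drops the final extension, then strips a leading subdir prefix such as
    "linux-64_" that a web download from anaconda.org may have added.
    """
    file_id, _, _ = os.path.basename(filename).rpartition(".")
    return SUBDIR_PREFIX.sub("", file_id)


if __name__ == "__main__":
    for name in (
        "linux-64_zlib-1.2.13-h5eee18b_1.conda",  # prefixed web download
        "noarch_foo-1.0-py_0.conda",              # platform without an arch part
        "zlib-1.2.13-h5eee18b_1.conda",           # already unprefixed
        "win_inet_pton-1.1.0-py_0.conda",         # name that merely looks prefixed
    ):
        print(name, "->", component_file_id(name))
```

One limitation of matching on the front of the filename: a package whose own name begins with one of these platform words followed by an underscore (the `win_inet_pton` sample) is trimmed as well, so the resulting id would no longer match the component files inside the archive.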
I see we don't have access to DEFAULT_SUBDIRS here, since we don't depend on conda.
Another way to address this issue would be in conda-package-handling's _extract function. It would need to remove the subdir prefix from filename before passing it to stream_conda_component.
Here's my take on it as a conda-package-streaming change.
Yeah, I like that endswith instead of startswith approach. The original idea with the conda format was to not necessarily be specific to .zst files, but I think that falls into YAGNI. Would it be worth an extra step to also try to strip off the file extension?
No, I read that part of the conda specification as being enthusiastic about libarchive. Which is fine, but we aren't using libarchive anymore.
superseded by #85; closing
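For readers following the exchange above, one way to avoid depending on the outer filename at all is to look inside the archive and pick the member by its component prefix and `.tar.zst` suffix. The sketch below is only an illustration of that idea under stated assumptions; it is not the code from the linked change, and `find_component_member` and `make_fake_conda` are hypothetical names.

```python
import io
import zipfile


def find_component_member(zf: zipfile.ZipFile, component: str,
                          suffix: str = ".tar.zst") -> str:
    """Hypothetical helper: return the archive member for "info" or "pkg".

    Matches on the member's own name ("<component>-....tar.zst"), so a subdir
    prefix on the outer .conda filename makes no difference.
    """
    matches = [
        name
        for name in zf.namelist()
        if name.startswith(f"{component}-") and name.endswith(suffix)
    ]
    if len(matches) != 1:
        raise LookupError(f"expected one {component!r} member, found {matches}")
    return matches[0]


def make_fake_conda(file_id: str) -> io.BytesIO:
    """Build an in-memory zip with the .conda member layout (empty payloads)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("metadata.json", '{"conda_pkg_format_version": 2}')
        zf.writestr(f"info-{file_id}.tar.zst", b"")
        zf.writestr(f"pkg-{file_id}.tar.zst", b"")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # The outer filename never enters the lookup, prefixed or not.
    with zipfile.ZipFile(make_fake_conda("zlib-1.2.13-h5eee18b_1")) as zf:
        print(find_component_member(zf, "info"))
        print(find_component_member(zf, "pkg"))
```

Hard-coding the `.tar.zst` suffix follows the YAGNI remark above; it would need revisiting if components were ever compressed differently.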
Description
Closes conda/conda-package-handling#230
This just makes the component finder a tiny bit smarter in that it ignores common platform patterns when finding these files.
The extracted folder still has these prefixes present, matching the input filename. If you'd prefer that the prefix be stripped from that as well, I can look into it.
I did look at writing a test for this, but it seems like this project doesn't contain testing tarballs/.conda packages, relying instead on using what is already present in the local package cache. If you'd like me to include a test, would it be OK to include a package file in this repo's test folder?