Standardize git_repository_url for better parsing #622

Open
aliceinwire opened this issue Jan 11, 2025 · 3 comments

@aliceinwire
Member

kci-dev currently matches repository git configurations against the kcidb git_repository_url.
The problem is that there is no way for kci-dev to know whether the URL has been saved with git://, http://, or https://.
In some rare cases the URL could also contain a guest username and password.
I propose standardizing git_repository_url in kcidb on the currently most used protocol, https://, without any authentication. This could easily be done by the result committer cleaning the URL before sending it, or even directly by kcidb with something similar to what kci-dev is doing: kernelci/kci-dev#76 (comment)

ref PR: kernelci/kci-dev#76
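
To make the proposal concrete, here is a minimal sketch of the kind of cleanup described above. It is not the code from kernelci/kci-dev#76; the function name `to_https_without_auth` and the example.org URLs are hypothetical, and it assumes the host serves the same path over HTTPS, which is not guaranteed. SCP-style addresses such as `user@host:path` are left untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https_without_auth(url: str) -> str:
    """Rewrite a git://, http:// or https:// URL to https:// without credentials.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # Leave anything we do not recognize (ssh://, SCP-style, local paths) as-is.
    if parts.scheme not in ("git", "http", "https") or not parts.hostname:
        return url
    # parts.hostname excludes any "user:password@" prefix and is lowercased.
    netloc = parts.hostname
    # A port is only meaningful to keep if the URL was already HTTPS.
    if parts.scheme == "https" and parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))


print(to_https_without_auth("git://example.org/pub/scm/linux.git"))
print(to_https_without_auth("http://guest:secret@example.org/pub/scm/linux.git"))
```

Both calls print the same https:// URL, which is the property kci-dev needs when matching a local remote against git_repository_url.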

@spbnick
Collaborator

spbnick commented Jan 13, 2025

Thank you for the report, @aliceinwire!

We standardized the URL quite a while ago: https://github.com/kernelci/kcidb-io/blob/308e85f914b687a5544c41bec7c86a752dec8949/kcidb_io/schema/v04_05.py#L210-L224

Basically, the description there is trying to say: "send us the shortest possible HTTPS URL" (that is, e.g. if credentials are not needed, drop them, use the shortest path, etc.). And "If that's not available, send us the shortest Git URL". That should cover everything (unless we really need unencrypted HTTP). The problem, of course, is getting everyone to comply correctly.
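
As a rough illustration of how a submitter could check a URL against the rule as paraphrased above, here is a small lint sketch. It is based only on the summary in this comment, not on the schema text itself; `lint_repo_url` and its messages are hypothetical, and it cannot tell whether credentials or git:// are genuinely required, so it only flags them for review.

```python
from typing import List
from urllib.parse import urlsplit


def lint_repo_url(url: str) -> List[str]:
    """Return review warnings for a repository URL (hypothetical helper)."""
    parts = urlsplit(url)
    warnings = []
    if parts.scheme == "git":
        # Allowed by the rule, but only as a fallback.
        warnings.append("git:// is acceptable only if https:// is unavailable")
    elif parts.scheme != "https":
        warnings.append(f"unexpected scheme {parts.scheme!r}; https:// is preferred")
    if parts.username or parts.password:
        warnings.append("credentials present; drop them unless access requires them")
    return warnings


for candidate in ("https://example.org/repo.git",
                  "git://example.org/repo.git",
                  "http://guest:secret@example.org/repo.git"):
    print(candidate, lint_repo_url(candidate))
```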

@aliceinwire
Member Author

aliceinwire commented Jan 14, 2025

Does kcidb benefit in any way from knowing the repository protocol?
Having some URLs with git:// and some with https:// can be misleading, as most repositories offer both. Why not have kcidb internally sanitize each URL by defaulting to https?

@spbnick
Collaborator

spbnick commented Jan 14, 2025

I would love to have all of them be HTTPS, but we had to add an exception in case it is not available. One of the maintainer repos is only available over git://. KCIDB doesn't benefit from knowing the repo protocol, except for having a URL that can actually be used.

"Having some URLs with git:// and some with https:// can be misleading, as most repositories offer both."

That's why we have the "Use git://, only if https:// is unavailable" rule. The rules are aimed at producing preferred and unique repo URLs. There's still a discrepancy in the use of the trailing slash, but we can deal with it later.
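
Until the trailing-slash discrepancy is resolved, a client comparing URLs could work around it on its own side. A minimal sketch, assuming a trailing slash never distinguishes two different repositories; `repo_url_key` and `same_repo` are hypothetical names, not part of kcidb or kci-dev:

```python
def repo_url_key(url: str) -> str:
    """Comparison key that ignores any trailing slashes (hypothetical helper)."""
    return url.rstrip("/")


def same_repo(a: str, b: str) -> bool:
    """True if two URLs differ at most by trailing slashes."""
    return repo_url_key(a) == repo_url_key(b)


print(same_repo("https://example.org/repo.git/", "https://example.org/repo.git"))
```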
