Deal with duplicated repos
Closed · opened 4 years ago by max.mehl · 9 comments
Git repos usually have multiple URLs, e.g.
Can we make sure that we actually check only one instance of this project, and therefore issue only one badge? And can we do this for all kinds of source forges?
Title changed from "How to deal with duplicated repos" to "How to deal with duplicated repos?" 4 years ago
This syntax does not work.
I agree that the rest probably needs some fixing. I would suggest rewriting any incoming request to the format "git://github.com/fsfe/reuse-tool", but I am not sure whether all platforms support that syntax. This is the kind of thing that, once implemented, can cost someone an hour of their time: something doesn't work, they can't figure out why, and it turns out their URLs are being rewritten.
How about a dropdown of supported schemes, e.g. only http, https, and git, so people know what to enter? The rest of the URL is probably always the same (except the .git suffix) and could be checked for duplicates in the backend.
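A minimal sketch of such a backend duplicate check, assuming two registrations count as the same repository when they match after the scheme and the .git suffix are stripped. The name normalize_repo_url is hypothetical and not taken from the project's code, and the port is ignored for simplicity.

```python
from urllib.parse import urlsplit


def normalize_repo_url(url: str) -> str:
    """Return "host/path" without scheme, trailing slash, or .git suffix.

    Hypothetical helper: the comparison key is an assumption,
    not the project's actual rule for detecting duplicates.
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()  # host names are case-insensitive
    path = parts.path.rstrip("/")          # paths stay case-sensitive
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return f"{host}{path}"


# All three spellings collapse to the same key.
urls = [
    "git://github.com/fsfe/reuse-tool",
    "https://github.com/fsfe/reuse-tool.git",
    "http://GitHub.com/fsfe/reuse-tool/",
]
keys = {normalize_repo_url(u) for u in urls}
```

With a key like this, a second registration could be rejected, or merged into the first, whenever its key already exists in the database.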
How to deal with duplicated repos?to Deal with duplicated repos 4 years ago
Are we actually sure we want to forbid re-registering with a different scheme? What if, for example, somebody registers http://git.acme.com/foo/bar and later decides to completely switch the server from http to https?
What's the damage for us when multiple URLs are registered, when only one of them will actually be queried?
I just had another idea: we could store just the URL without the scheme, and when it comes to checking, we try "git", "https" and "http" (in a fixed, TBD order) and take the first one that works. This would even implicitly solve the issue of repositories changing the supported access scheme.
Maybe that costs us a few seconds when linting the repositories not supporting our first choice, but that runs in an asynchronous queue anyway.
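The idea could be sketched roughly as follows, assuming the stored value is the URL without its scheme and that some probe function reports whether a full URL is reachable. Both first_working_url and the probe callback are hypothetical names, and the order shown is only the one listed above, which is still to be decided.

```python
from typing import Callable, Optional, Sequence

# Order as listed in the proposal; the final order is still TBD.
SCHEMES: Sequence[str] = ("git", "https", "http")


def first_working_url(
    bare_url: str,
    probe: Callable[[str], bool],
    schemes: Sequence[str] = SCHEMES,
) -> Optional[str]:
    """Try each scheme in order and return the first URL the probe accepts.

    bare_url is the stored URL without a scheme, e.g. "git.acme.com/foo/bar".
    Returns None if no scheme works.
    """
    for scheme in schemes:
        candidate = f"{scheme}://{bare_url}"
        if probe(candidate):
            return candidate
    return None


# Example with a stand-in probe for a server that only speaks https.
only_https = lambda url: url.startswith("https://")
chosen = first_working_url("git.acme.com/foo/bar", only_https)
```

Because the scheme is resolved at check time, a server that later switches from http to https would keep working without re-registration.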
Resources, I am afraid. I would prefer to lint the Linux kernel just once per commit (at least the primary repo)...
Yes, that could be a viable solution!
@carmenbianca what do you think about the proposal to just try git, https, and http and take the first that works? What do you think would be the best order to try?
@reinhard This seems to work for me, in that order. There is probably some weird server out there that behaves differently based on protocol, but it's probably fine. The order git -> https -> http seems fine.
@carmenbianca instead of opening 3 ssh connections to the reuse lint server for the 3 tries, would it be smarter to improve the reuse-lint-repo script and make it accept the URL without protocol and try all 3 variants within a single run of the script?
@carmenbianca please forget the above question. The API does a git ls-remote on the repository and can remember which of the protocols worked before starting the remote lint.
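For illustration, a probe based on git ls-remote might look like the sketch below. It is not the API's actual code: the function name ls_remote_ok, the timeout value, and the injectable run parameter (there only so the function can be exercised without network access) are all assumptions.

```python
import os
import subprocess


def ls_remote_ok(url: str, run=subprocess.run, timeout: float = 30.0) -> bool:
    """Return True if `git ls-remote <url>` exits successfully.

    Hypothetical helper; `run` defaults to subprocess.run and can be
    replaced by a stand-in for testing.
    """
    # GIT_TERMINAL_PROMPT=0 keeps git from asking for credentials interactively.
    env = {**os.environ, "GIT_TERMINAL_PROMPT": "0"}
    try:
        result = run(
            ["git", "ls-remote", url],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            env=env,
            timeout=timeout,
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Timed out, or git is not installed or not executable.
        return False
    return result.returncode == 0
```

The URL for which this returns True could then be stored alongside the repository, so the remote lint is started only once, with a protocol already known to work.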