Skip to content
This repository has been archived by the owner on Sep 9, 2020. It is now read-only.

bitbucket hangs on git query to hg repo declared ambiguously #2092

Closed
kevinburke opened this issue Jan 22, 2019 · 5 comments
Closed

bitbucket hangs on git query to hg repo declared ambiguously #2092

kevinburke opened this issue Jan 22, 2019 · 5 comments

Comments

@kevinburke
Copy link
Collaborator

kevinburke commented Jan 22, 2019

If you declare a repo source as eg "bitbucket.org/mattfarina/testhgrepo" it can be either a Git or Hg repo. We guess it is Git in the bitbucketDeducer in gps/deduce.go, but if we guess incorrectly, we can end up issuing a git ls-remote call to a Mercurial repository and hanging forever.

There is this (wise) TODO in the code

	// TODO(sdboyer) resolve scm ambiguity if needed by querying bitbucket's REST API

In the meantime I think we should just return an error with a clear message instead of hanging forever.

This issue is the reason the test suite currently panics with a timeout.

Updates #2091
Updates #2071

@theckman
Copy link
Collaborator

theckman commented Jan 22, 2019

@kevinburke Is your suggested path to have this function return an error instead of a possibly bad remote SCM, or to wrap the git ls-remote in a timeout to cause a failure? I assume the latter.

Looking in to it more it seems like git ls-remote doesn't have the ability to set a connection timeout parameter.

theckman added a commit that referenced this issue Jan 22, 2019
There are cases that can result in `dep` trying to use `git ls-remote` to talk
to a mercurial repository, which can have some really adverse effects due to
poor connection handling characteristics of `git ls-remote`. This tries to
improve that situation with a timeout.

If the git command cannot connect to the remote SCM, it takes 2 minutes for the
connection to time out if it's unable to establish the TCP connection. It then
tries the next IP, which will also take 2 minutes to time out, and so on and so
forth.

If `dep` hands `git ls-remote` a Bitbucket repository that's `hg` backed, `git
ls-remote` fails to establish the connection and retries all IPs until finally
failing. If there are 6 IPs returned, this will take 12 minutes. If you're
running a `dep` ensure that's trying to do a lot of different actions, this can
add up very quickly.

Looking in to the `git ls-remote` manpage there does not seem to be a
straightforward way to configure it to fail a bit more aggressively (such as
after 2 failed attempts with a 15 second timeout).

This change wraps calls to `git ls-remote` in a `context.WithTimeout` that gives
it 30 seconds to run. This is a completely arbitrary value and may need tuning,
or even to become configurable.

Updates #2092
@theckman
Copy link
Collaborator

@kevinburke separate from that linked PR above (#2093), I'm thinking of tackling the TODO of hitting the Bitbucket REST API to just make this not make any guesses.

I'm thinking of writing a gps/internal/bitbucket subpackage that queries the Bitbucket API and returns the SCM of the project. I really only want one field from the JSON[1], and could easily accept a BITBUCKET_API_TOKEN environment variable to alleviate any issues with unauthenticated API rate limits.

I'd need to dig in to callers of deduceSource() to know if they cache the results, or if they'd try to resolve it multiple times. The internal package could also intercept requests to get the SCM and serve it from an in-memory cache of previous resolutions.

What are your thoughts?

[1] https://developer.atlassian.com/bitbucket/api/2/reference/resource/repositories/%7Busername%7D/%7Brepo_slug%7D#get

@kevinburke
Copy link
Collaborator Author

I am also wondering what the real world consequences would be of just erroring if you specify this in a vague way. What percentage of BB repos are specified this way with no extra hints, and what percentage of them are Git vs Mercurial?

@kevinburke
Copy link
Collaborator Author

What you described about a Bitbucket API seems fine... we can optimize it after the fact but just adding query support would be helpful.

@kevinburke
Copy link
Collaborator Author

Going to close this out, I think it's fixed now.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants