Linear scan of a repository with lots (200+) of tags causes pull IO timeout #614
As part of this investigation, I discovered that "docker pull foo/bar:sometag" will incur a hit to the API endpoint that lists all tags for 'foo/bar'. This seems a bit wasteful. @dmp42 - as I'm not too familiar with the details of 'docker pull', perhaps you know offhand whether this is indeed unnecessary work and whether it's worth filing a bug in docker?
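For context, a minimal sketch of the two v1 endpoints involved (the registry host and repository below are placeholders, and the pull behaviour is assumed from the observation above rather than verified against the docker client source):

```python
# Illustrative only: contrasts the tag-listing endpoint (served by reading
# every tag file from backend storage) with the single-tag lookup that a
# pull of one tag conceptually needs.
import requests

REGISTRY = "https://registry.example.com"   # placeholder host
REPO = "foo/bar"

# Endpoint the pull appears to hit: lists ALL tags, O(number of tags) work
# on the registry/storage side.
all_tags = requests.get(f"{REGISTRY}/v1/repositories/{REPO}/tags").json()

# Endpoint that resolves just one tag to an image id: a single read.
image_id = requests.get(f"{REGISTRY}/v1/repositories/{REPO}/tags/sometag").json()
```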
@bshi I don't think reporting this is worth the effort - energy IMO is better focused on registry v2 development.
On Tue, Oct 21, 2014 at 07:06:37PM -0700, Bo Shi wrote:
I haven't looked at the client-side code (at least recently enough to
One of the terribly inefficient things right now is that we do not only
Driver will still need to provide an efficient
Skimmed the discussion in #643 - it seems like you guys are aware and thinking about the issue of unbounded looping over driver interface methods. One other concern is the underlying storage consistency model and what the registry expects of the drivers. S3 doesn't even have consistent (:P) consistency models across S3 regions.
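To make the "unbounded looping over driver interface methods" concern concrete, here is a minimal sketch; the driver interface, method names, and the repositories/&lt;repo&gt;/tag_&lt;name&gt; path layout are assumptions for illustration, not the registry's actual code:

```python
# Hypothetical storage-driver interface backed by S3/GCS.
class Driver:
    def list_directory(self, path):   # one LIST request to the backend
        raise NotImplementedError

    def get_content(self, path):      # one GET request to the backend
        raise NotImplementedError

def list_tags(driver, repo):
    """Return {tag_name: image_id} by walking the repo's tag files."""
    tags = {}
    for key in driver.list_directory(f"repositories/{repo}/"):  # 1 LIST
        name = key.rsplit("/", 1)[-1]
        if name.startswith("tag_"):
            # One round trip per tag: with 200+ tags on a high-latency
            # backend, this serial loop is what blows the pull timeout.
            tags[name[len("tag_"):]] = driver.get_content(key)
    return tags
```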
Consistency... we think about it, a lot :-) cc @stevvooe
@bshi Consistency and coordination are definitely something we are thinking about. Unfortunately, many of the storage backends lack consistency and don't have any kind of transactional coordination. The new registry will likely require some sort of coordination layer to mitigate that. Watch out for upcoming proposals, as your input will be appreciated.
On Thu, Nov 06, 2014 at 10:48:58AM -0800, Stephen Day wrote:
I'd just keep the mutable (i.e. not content-addressable) stuff in a
Just as a fun data point, I started to run into this problem right around 1700 tags for a repo, backed with S3. We are doing continuous builds, so I am able to work around it by periodically deleting large repos and re-pushing as needed.
I have a similar problem, @bshi @dmp42. My storage backend is s3-ceph. Has this problem been solved yet? It's urgent! I suspect the culprit is the "/v1/repositories/repo/tags" API: it uses python gevent to pull every tag file from the storage backend, read it, and return it to docker, and that takes far too much time.
This was originally reported at GoogleCloudPlatform#22
It seems that when performing "docker pull foo/bar:sometag", the registry performs a linear scan of ALL tags in "foo/bar". When backed by object storage systems like GCS, this can take a long time. It has broken image distribution for us.
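As a rough back-of-envelope illustration (the per-request latency below is an assumption, not a measurement), serial per-tag reads add up quickly:

```python
# Assumed ~50 ms per object-storage GET; real GCS/S3 latencies vary.
per_get_s = 0.05

for n_tags in (200, 1700):
    print(f"{n_tags} tags -> ~{n_tags * per_get_s:.0f}s of serial tag reads")
# 200 tags  -> ~10s
# 1700 tags -> ~85s, comfortably past a client-side pull IO timeout
```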