I've set up an AWS ECR pull-through cache for the Docker Hub registry.
Say it is available under: 123.dkr.ecr.eu-central-1.amazonaws.com/docker.
Now, after authenticating using:
aws ecr get-login-password ... | docker login 123.dkr.ecr.eu-central-1.amazonaws.com -u AWS --password-stdin
I can do something like:
docker pull 123.dkr.ecr.eu-central-1.amazonaws.com/docker/library/busybox:latest
This all makes sense, but my question is how this works under the hood, because if I make an HTTP request using:
curl -v -H "Authorization: Basic $(echo -n AWS:$(aws ecr get-login-password ...) | base64 -w 0)" https://123.dkr.ecr.eu-central-1.amazonaws.com/v2/docker/library/busybox/manifests/latest
I get
{"errors":[{"code":"MANIFEST_UNKNOWN","message":"Requested image not found"}]}
More precisely, the request fails the first time, but after I run docker pull (and can see that the image has been cached in ECR), the same request starts working.
I've read https://github.com/opencontainers/distribution-spec/blob/v1.0.1/spec.md and still don't understand this. What does the official Docker client do differently from my HTTP request that makes ECR actually pull the image and start caching it?
Is it related to Docker authentication? If so, how does Docker signal exactly which tag it will request later?
Once the image is cached, the normal protocol appears to work. I didn't actually download all the blobs, but the manifests endpoint at least returns a proper response (HTTP 200 with the layers). Requesting the same URL before the image is cached, however, returns HTTP 404. I've also tried waiting a bit, with no luck: my request never seems to trigger ECR to download the image, even after adding User-Agent headers, changing the request method to HEAD, and so on.
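To make the reproduction concrete, here is a minimal sketch of the check I am running, written as small shell helpers. The function names (manifest_url, basic_auth_header, manifest_status) and the ECR_PASSWORD variable are hypothetical and exist only for illustration. The URL layout follows the manifests endpoint from the distribution spec, and the registry host is the placeholder used above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the ECR_PASSWORD variable are hypothetical.

# Build the manifest URL in the form used by the distribution spec:
#   https://<registry>/v2/<repository>/manifests/<reference>
manifest_url() {
  local registry="$1" repository="$2" reference="$3"
  printf 'https://%s/v2/%s/manifests/%s\n' "$registry" "$repository" "$reference"
}

# Build an HTTP Basic Authorization header from a user name and a password.
basic_auth_header() {
  local user="$1" password="$2"
  printf 'Authorization: Basic %s\n' \
    "$(printf '%s:%s' "$user" "$password" | base64 | tr -d '\n')"
}

# Print only the HTTP status code of a GET on the manifests endpoint.
# Expects the output of `aws ecr get-login-password ...` in ECR_PASSWORD.
manifest_status() {
  local registry="$1" repository="$2" reference="$3"
  if [ -z "${ECR_PASSWORD:-}" ]; then
    echo 'ECR_PASSWORD is not set' >&2
    return 1
  fi
  curl -s -o /dev/null -w '%{http_code}\n' \
    -H "$(basic_auth_header AWS "$ECR_PASSWORD")" \
    "$(manifest_url "$registry" "$repository" "$reference")"
}

# Usage (performs real network requests, so it is left commented out):
#   ECR_PASSWORD="$(aws ecr get-login-password ...)"
#   manifest_status 123.dkr.ecr.eu-central-1.amazonaws.com docker/library/busybox latest
#   docker pull 123.dkr.ecr.eu-central-1.amazonaws.com/docker/library/busybox:latest
#   manifest_status 123.dkr.ecr.eu-central-1.amazonaws.com docker/library/busybox latest
```

In my case the first manifest_status call reports 404 and the one after docker pull reports 200, which is exactly the behaviour I am trying to understand.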