5

The lower bound on Docker image size is very small: under 1 MB (busybox) or even zero bytes (scratch).

Now, for research purposes, I need to reference the biggest Docker image on Docker Hub.

How can I find it?

mohan08p
Ta Mu

1 Answer

5

If you're looking for a master list of all images and their sizes, a kind of 'statistics' page, Docker Hub doesn't have one. You'll need to use its HTTP API to:

  1. Get all the repositories, stepping through the paginated results. API reference: https://docs.docker.com/registry/spec/api/#listing-repositories
  2. Pull the Image Manifest for each repo. API reference: https://docs.docker.com/registry/spec/api/#pulling-an-image-manifest
  3. Store those manifests in a local database so you can sort them by image size.
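
As a rough sketch of step 3 only, assuming the sizes have already been extracted from the manifests, something like the following would let you sort by size with SQLite. The table name, column names, and function names here are my own illustrative choices, not anything defined by Docker.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a SQLite database with a hypothetical image_sizes table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_sizes ("
        " repo TEXT NOT NULL,"
        " tag TEXT NOT NULL,"
        " size_bytes INTEGER NOT NULL,"
        " PRIMARY KEY (repo, tag))"
    )
    return conn


def record_size(conn, repo, tag, size_bytes):
    """Insert one size, overwriting any earlier value for the same repo and tag."""
    conn.execute(
        "INSERT OR REPLACE INTO image_sizes (repo, tag, size_bytes)"
        " VALUES (?, ?, ?)",
        (repo, tag, size_bytes),
    )
    conn.commit()


def largest_images(conn, limit=10):
    """Return up to `limit` rows of (repo, tag, size_bytes), biggest first."""
    cur = conn.execute(
        "SELECT repo, tag, size_bytes FROM image_sizes"
        " ORDER BY size_bytes DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()
```

Keeping only the repo, tag, and total size per row, rather than the whole manifest, is one way to keep the database small; whether that is enough depends on what else you want to analyze.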

You'll specifically need to pull the newer Image Manifest Version 2, Schema 2, which apparently not all repos support yet. This newer manifest includes the image sizes: https://docs.docker.com/registry/spec/manifest-v2-2/
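
To illustrate how the size comes out of such a manifest, here is a minimal sketch that assumes the manifest has already been fetched and parsed from JSON into a Python dict. A Schema 2 manifest has a `layers` array whose entries each carry a `size` in bytes; the helper name `compressed_size` is hypothetical. Note that this sum is the size of the layer blobs as stored in the registry (typically compressed), so it will generally be smaller than the image's size on disk after pulling.

```python
def compressed_size(manifest):
    """Sum the layer sizes of a parsed Schema 2 image manifest.

    Returns None when the document is not a Schema 2 image manifest,
    for example an older Schema 1 manifest or a manifest list, neither
    of which has a top-level "layers" array with sizes.
    """
    if manifest.get("schemaVersion") != 2 or "layers" not in manifest:
        return None
    return sum(layer["size"] for layer in manifest["layers"])


# Hand-written example manifest; digests are elided and sizes are made up.
example = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "config": {"size": 7023},
    "layers": [
        {"size": 32654},
        {"size": 16724},
        {"size": 73109},
    ],
}

total = compressed_size(example)  # 32654 + 16724 + 73109
```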

All in all, this is much faster than actually downloading every image from Docker Hub, but it will still take a long time and a lot of processing. The database holding all of this might get rather large and difficult to work with, so I'm not sure how far you want to pursue this route for a simple research project. That's up to you.

You may want to take @Vish's advice and simply go for obvious large images such as the Microsoft or Java ones.

BoomShadow