
In many blog posts, and in general opinion, there is a rule of thumb that says "one process per container".

Why does this rule exist? Why not run ntp, nginx, uwsgi and other processes in a single container that needs all of those processes in order to work?

Blog posts mentioning this rule:

Evgeny Zislis

5 Answers


Let's forget the high-level architectural and philosophical arguments for a moment. While there may be some edge cases where multiple functions in a single container make sense, there are very practical reasons why you may want to consider following "one function per container" as a rule of thumb:

  • Scaling containers horizontally is much easier if the container is isolated to a single function. Need another apache container? Spin one up somewhere else (see the Python sketch after this list). However, if my apache container also has my DB, cron and other pieces shoehorned in, this complicates things.
  • Having a single function per container allows the container to be easily re-used for other projects or purposes.
  • It also makes the component more portable and predictable: devs can pull down a single component from production to troubleshoot locally, rather than an entire application environment.
  • Patching/upgrades (both the OS and the application) can be done in a more isolated and controlled manner. Juggling multiple bits-and-bobs in your container not only makes for larger images, but also ties these components together. Why have to shut down application X and Y just to upgrade Z?
    • Above also holds true for code deployments and rollbacks.
  • Splitting functions out to multiple containers allows more flexibility from a security and isolation perspective. You may want (or require) services to be isolated on the network level -- either physically or within overlay networks -- to maintain a strong security posture or comply with things like PCI.
  • There are other, more minor factors, such as dealing with stdout/stderr and sending logs to the container log, keeping containers as ephemeral as possible, etc.
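
For the scaling point above, here is a minimal sketch, assuming the Docker SDK for Python (pip install docker) and a local daemon; the nginx:alpine image and the container names are purely illustrative. Because the image does exactly one thing, adding capacity is just running more copies of it:

    import docker

    client = docker.from_env()  # talks to the local Docker daemon

    # Three identical, single-function web containers; each one can be added,
    # removed or rescheduled without touching a database, cron jobs, etc.
    for i in range(3):
        client.containers.run(
            "nginx:alpine",           # single-purpose image (web server only)
            name=f"web-{i}",          # illustrative names
            detach=True,
            ports={"80/tcp": None},   # let Docker pick a free host port
        )

    print([c.name for c in client.containers.list()])

If the same image also carried the DB and cron jobs, every extra replica would drag those along too, and "spin one up somewhere else" stops being a one-liner.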

Note that I'm saying function, not process; that older language is outdated. The official Docker documentation has moved away from saying "one process" and now recommends "one concern" per container.

Jon

Having slain a "two processes" container a few days ago, here are the pain points that made me use two containers instead of a Python script that started two processes:

  1. Docker is good at recognizing crashed containers. It can't do that when the main process looks fine but some other process died a gruesome death. Sure, you can monitor your processes manually, but why reimplement that? (A short sketch of this failure mode follows the list.)

  2. docker logs gets a lot less useful when multiple processes are spewing their logs to the console. Again, you can write the process name to the logs, but docker can do that, too.

  3. Testing and reasoning about a container gets a lot harder.
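
A rough sketch of that first pain point (not the container I actually ran; the commands and worker.py are made up): a hand-rolled entrypoint that starts two processes. Docker only watches the entrypoint, so as long as the loop below keeps running, docker ps reports the container as Up even after the worker has died.

    import subprocess
    import time

    # This script is the container's main process (PID 1).
    web = subprocess.Popen(["python", "-m", "http.server", "8000"])
    worker = subprocess.Popen(["python", "worker.py"])  # hypothetical second process

    # Both children inherit stdout, so their output interleaves in docker logs (point 2).
    while True:
        time.sleep(5)
        # Popen.poll() returns None while the child is still running.
        if worker.poll() is not None:
            # The worker died, but this script is still alive, so Docker keeps
            # reporting the container as Up; restarting or alerting is now on you.
            print(f"worker exited with code {worker.returncode}; web is still up")
            break

With one process per container, that monitoring comes for free from Docker's restart policies, docker ps and docker logs.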

codeforester
Christian Sauer

The recommendation comes from the goal and design of operating-system-level virtualization.

Containers have been designed to isolate a process from the others by giving it its own userspace and filesystem.

This is the logical evolution of chroot, which provided an isolated filesystem; the next step was isolating processes from each other to avoid memory overwrites and to allow multiple processes to use the same resource (e.g. TCP port 8080) without conflicts.

The main interest of a container is to package the libraries needed by the process without worrying about version conflicts. If you run multiple processes needing two versions of the same library in the same userspace and filesystem, you'd have to tweak at least LD_LIBRARY_PATH for each process so the proper library is found first, and some libraries can't be tweaked this way because their path is hard-coded in the executable at compile time; see this SO question for more details.
At the network level, you'll have to configure each process to avoid using the same ports.
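
A small, runnable illustration of the port conflict, using plain Python sockets: two processes (simulated here by two bind calls) sharing one network namespace cannot both claim TCP 8080, whereas two separate containers each get their own namespace and both binds succeed.

    import socket

    def bind_8080():
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.bind(("0.0.0.0", 8080))   # same address, same network namespace
        s.listen()
        return s

    first = bind_8080()              # succeeds
    try:
        second = bind_8080()         # fails: "Address already in use"
    except OSError as err:
        print(f"second bind failed: {err}")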

Running multiple processes in the same container requires some heavy tweaking and, at the end of the day, defeats the purpose of isolation. If you're OK with running multiple processes within the same userspace, sharing the same filesystem and network resources, then why not run them on the host itself?

Here is a non-exhaustive list of the heavy tweaking and pitfalls I can think of:

  • Handling the logs

    Whether logs go to a mounted volume or are interleaved on stdout, they need some management. If you use a mounted volume, your container should have its own "place" on the host, or two identical containers will fight over the same resource. If you interleave on stdout to take advantage of docker logs, analysis can become a nightmare when the sources can't be identified easily.

  • Beware of zombie processes

    If one of the processes in a container crashes, supervisord may not be able to clean up the children left in a zombie state, and the host init will never inherit them. Once you've exhausted the number of available PIDs (2^22, so roughly 4 million), a bunch of things will fail (see the reaper sketch after this list).

  • Separation of concerns

    If you run two separate things in the same container, like an Apache server and Logstash, that may ease log handling, but you have to shut down Apache to update Logstash. (In reality, you should use Docker's logging driver.)

    Will it be a graceful stop that waits for the current sessions to end, or not? If it's a graceful stop, it may take some time and slow down rolling out the new version. If you kill it, you'll impact users for the sake of a log shipper, and that should be avoided IMHO.
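
On the zombie point, this is roughly the reaping work a real init does for you, and what supervisors like tini or docker run --init exist for; the children here are just short-lived sleep commands standing in for real services:

    import os
    import signal
    import subprocess
    import time

    def reap_children(signum, frame):
        # Collect every child that has already exited; otherwise each dead
        # child lingers as a zombie and keeps its PID allocated.
        while True:
            try:
                pid, _status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:   # no children left at all
                return
            if pid == 0:                # children exist but none have exited yet
                return

    signal.signal(signal.SIGCHLD, reap_children)

    # Stand-ins for real services inside a multi-process container.
    children = [subprocess.Popen(["sleep", "2"]) for _ in range(3)]

    time.sleep(5)   # by now all three have exited and been reaped, not zombified
    print("children reaped; no zombies left behind")

If nothing in the container does this, exited children pile up as zombies, each holding a PID, until the limit mentioned above is reached.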

Finally, when you have multiple processes you're reproducing an OS, and in that case using hardware virtualization sounds more in line with the need.

codeforester
Tensibai

As in most cases, it's not all-or-nothing. The guidance of "one process per container" stems from the idea that containers should serve a distinct purpose. For example, a container should not be both a web application and a Redis server.

There are cases where it makes sense to run multiple processes in a single container, as long as both processes support a single, modular function.

Dave Swersky

I'll call each process a service here: one container, roughly one service. If any of my services fails, I only spin up that particular container, and within seconds everything is up again, so there are no dependencies between services. It is best practice to keep your container image under 200 MB, and at most 500 MB (Windows native containers are an exception, at more than 2 GB); otherwise it becomes similar to a virtual machine, not exactly, but performance suffers. Also take into consideration a few parameters such as scaling, how to make your services resilient, auto-deployment, etc.

And it's purely your call how you build your architectural patterns, such as microservices in a polyglot environment, using the container technology that best suits your environment and automates things for you.

mohan08p