
The problem I am having is that Kafka does not use the environment variables passed in from docker-compose. It just uses the default server.properties file. I know this is by design (or lack thereof), but why bother passing in the environment variables if they aren't going to be used? I have scoured the web as well as other Docker image sources, and it looks like Confluent ships code that reads the environment variables and then builds a custom server.properties file. Seems like a huge duplication of effort! Am I missing something?
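As far as I can tell, the convention those images follow is purely mechanical: drop the KAFKA_ prefix, lowercase, and turn underscores into dots. So, if my understanding is right:

KAFKA_NODE_ID=1             becomes  node.id=1
KAFKA_PROCESS_ROLES=broker  becomes  process.roles=broker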

Goal: I want to pass basic config vars from docker-compose.yml to my custom Docker image (it needs to be rhel8-based) and have Kafka use that config. For example: cluster ID, node ID, storage path, ports, just the basics needed to run Kafka. Why is this so hard? What am I doing wrong?

I've managed to take our RHEL8 ubi8-jdk21 minimal docker image and create a myproject/kafka image from it:

In the Dockerfile, I download the latest Kafka release, then extract it to e.g. /opt/kafka:

FROM myproject/ubi-openjdk21:ubi8
...

RUN curl -fL "${DOWNLOAD_URL}" -o "kafka_${KAFKA_VERSION_LONG}.tgz" \
 && curl -fL "${VERIFY_URL}" -o "kafka_${KAFKA_VERSION_LONG}.tgz.sha512" \
 && sha512sum "kafka_${KAFKA_VERSION_LONG}.tgz" > zip.checksum \
 && sha512sum -c zip.checksum > checksum.result

#TODO finish the checksum validation process: verify against the downloaded
# .sha512 file (zip.checksum is self-generated, so -c always reports OK) and
# grep for OK in the result file; see the sketch after this Dockerfile

RUN tar -xzf "kafka_${KAFKA_VERSION_LONG}.tgz" \
 && mv "kafka_${KAFKA_VERSION_LONG}" kafka

...
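A sketch of how I expect to finish that TODO, assuming Apache's .sha512 files use the gpg --print-md layout ("filename: GROUPED UPPERCASE HEX") rather than the plain sha512sum format:

# compare our computed digest with the published one
RUN computed=$(sha512sum "kafka_${KAFKA_VERSION_LONG}.tgz" | awk '{print $1}') \
 && published=$(cut -d: -f2 "kafka_${KAFKA_VERSION_LONG}.tgz.sha512" | tr -d ' \n' | tr '[:upper:]' '[:lower:]') \
 && [ "$computed" = "$published" ] || { echo "checksum mismatch" >&2; exit 1; }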

Then, in the entrypoint.sh script, I know I need to call:

./bin/kafka-storage.sh format --standalone -t "$CLUSTER_ID" -c config/server.properties
./bin/kafka-server-start.sh config/server.properties
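Fleshed out only slightly, the entrypoint I have in mind looks like this (a sketch; it assumes Kafka is installed at /opt/kafka and that CLUSTER_ID arrives via the environment):

#!/usr/bin/env bash
set -euo pipefail
cd /opt/kafka

# --ignore-formatted makes restarts idempotent: format is a no-op when the
# storage dir already holds this cluster's metadata
./bin/kafka-storage.sh format --ignore-formatted -t "$CLUSTER_ID" -c config/server.properties

# exec so the JVM becomes PID 1 and receives docker stop's SIGTERM
exec ./bin/kafka-server-start.sh config/server.properties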

I am attempting to reuse the cluster setup from the Apache Kafka image on Docker Hub: https://hub.docker.com/r/apache/kafka

Sample docker-compose.yml:

services:
  controller-1:
    image: apache/kafka:latest
    container_name: controller-1
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: controller
      KAFKA_LISTENERS: CONTROLLER://:9093
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@controller-1:9093,2@controller-2:9093,3@controller-3:9093
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0

  controller-2:
    image: apache/kafka:latest
    container_name: controller-2
    environment:
      KAFKA_NODE_ID: 2
      KAFKA_PROCESS_ROLES: controller
      KAFKA_LISTENERS: CONTROLLER://:9093
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@controller-1:9093,2@controller-2:9093,3@controller-3:9093
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0

  controller-3:
    image: apache/kafka:latest
    container_name: controller-3
    environment:
      KAFKA_NODE_ID: 3
      KAFKA_PROCESS_ROLES: controller
      KAFKA_LISTENERS: CONTROLLER://:9093
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@controller-1:9093,2@controller-2:9093,3@controller-3:9093
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0

  broker-1:
    image: apache/kafka:latest
    container_name: broker-1
    ports:
      - 29092:9092
    environment:
      KAFKA_NODE_ID: 4
      KAFKA_PROCESS_ROLES: broker
      KAFKA_LISTENERS: 'PLAINTEXT://:19092,PLAINTEXT_HOST://:9092'
      KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker-1:19092,PLAINTEXT_HOST://localhost:29092'
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@controller-1:9093,2@controller-2:9093,3@controller-3:9093
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
    depends_on:
      - controller-1
      - controller-2
      - controller-3

... (see link for the full file)

But I need to pass the storage dir and cluster ID in. I tweaked the docker-compose.yml:

...

  broker-3:
    image: myproject/kafka:${CONTAINER_VERSION}
    container_name: broker-3
    restart: on-failure:5
    ports:
      - 49192:9092
    environment:
      KAFKA_NODE_ID: 6
      KAFKA_LOG_DIRS: '/storage_dir'
      KAFKA_PROCESS_ROLES: broker
      ...
      CLUSTER_ID: '${CLUSTER_ID}'
    depends_on:
      - controller-1
      - controller-2
      - controller-3
    volumes:
      - kafka_data_6:/storage_dir

volumes:
  kafka_data_1:
  kafka_data_2:
  ...
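Compose fills in the ${CONTAINER_VERSION} and ${CLUSTER_ID} interpolations from an .env file next to docker-compose.yml; mine is along these lines (placeholder values):

# .env
CONTAINER_VERSION=1.0.0
# generate once with: bin/kafka-storage.sh random-uuid
CLUSTER_ID=REPLACE_WITH_GENERATED_UUID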

1 Answer


I couldn't find what I was looking for; I don't think it exists. So, basically, I am parsing the environment and writing the Kafka config myself in entrypoint.sh:

...

STANDALONE=

# overwrite the default config on every boot
rm -f config/server.properties

# loop through all environment variables starting with KAFKA_
for var in "${!KAFKA_@}"; do
    # make lowercase
    svar=${var,,}
    # get rid of the "kafka_" prefix
    svar=${svar/kafka_/}
    # change underscores to dots
    svar=${svar//_/.}

    # controller nodes get the --standalone format flag; broker nodes do not (??)
    if [[ $var == "KAFKA_PROCESS_ROLES" && ${!var} == "controller" ]]; then
        STANDALONE=--standalone
    fi

    # debug
    printf '%s=%s\n' "$svar" "${!var}"

    # append to the generated config file
    printf '%s=%s\n' "$svar" "${!var}" >> config/server.properties
done

./bin/kafka-storage.sh format $STANDALONE -t "$CLUSTER_ID" -c config/server.properties
exec ./bin/kafka-server-start.sh config/server.properties
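For the broker-3 service above, the loop generates a config/server.properties along these lines (order follows the environment, so it may vary):

node.id=6
log.dirs=/storage_dir
process.roles=broker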

My understanding of Kafka is basic, and my understanding of Docker is only a bit better, so I may be doing all kinds of things wrong here, as evidenced by the fact that my new Kafka cluster seems unstable! But at least it is kinda up and running. I will update this as I make progress.