3

The default value of log.dirs in server.properties is /tmp/kafka-logs. /tmp is usually avoided for anything important, yet this is where Kafka stores messages and offsets!

Is there any particular reason why one should not choose /var/log/kafka-logs or /opt/kafka-logs instead?

NOTE - Assume /tmp and /var/log are on the same file-system type.

Divs
  • 131

2 Answers

3

You'll always find me placing files in standard directories or as close to them as possible.

The reason for this is so that future admins can find them later -- because very often that future admin is me!

Consider logs, for instance, since that's what you have brought up. I would create a subdirectory in /var/log to store these, such as /var/log/kafka. The directory /var/log is where most admins will go first to look for logs for any package. Apache's default of /tmp/kafka-logs is pretty senseless, as you've already discovered. Cloudera's default log directory /var/log/kafka makes much more sense.

If it turns out that you need to mount a disk partition to store logs, you don't have to change the log directory; instead, you can mount the new disk space directly at /var/log/kafka.
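As a rough illustration of auditing this setting, here is a minimal Python sketch that reads the text of a server.properties file and flags any data directory still living under /tmp. It assumes the broker setting is spelled `log.dirs` (a comma-separated list, with `log.dir` as a single-directory fallback); the function names and the sample file contents are hypothetical, and the parser is deliberately simplified compared with the full Java properties format.

```python
from pathlib import PurePosixPath


def parse_log_dirs(properties_text):
    """Return the directories configured under log.dirs (or log.dir as a fallback).

    Simplified parser: handles only `key=value` lines and skips comments.
    """
    values = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    setting = values.get("log.dirs") or values.get("log.dir") or ""
    return [d.strip() for d in setting.split(",") if d.strip()]


def dirs_under_tmp(dirs):
    """Return the entries whose first path component is /tmp."""
    return [d for d in dirs if PurePosixPath(d).parts[:2] == ("/", "tmp")]


# Hypothetical excerpt of a server.properties file, for illustration only.
SAMPLE = """\
# broker settings
broker.id=0
log.dirs=/tmp/kafka-logs,/var/log/kafka
"""

if __name__ == "__main__":
    for d in dirs_under_tmp(parse_log_dirs(SAMPLE)):
        print(f"warning: Kafka data directory under /tmp: {d}")
```

Running a check like this against the real file before starting a broker is one way to catch a forgotten default.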

And /opt is intended for large third-party packages; it's not where I expect to find most things. There are few standards or conventions for anything in this directory, so things could end up difficult to find.

Michael Hampton
  • 252,907
2

The best place is a separate, dedicated partition that holds all of the Kafka data in one place and serves no other purpose for the OS or other installed packages.
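To check whether a data directory really sits on its own mount, something like the following Python sketch could be used. It relies only on the standard library's `os.path.ismount`; the helper names and the example path /var/lib/kafka-data are hypothetical, chosen purely for illustration.

```python
import os


def find_mount_point(path):
    """Walk up from path until a mount point is reached and return it."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


def has_dedicated_mount(data_dir):
    """True if data_dir is itself a mount point rather than a plain subdirectory."""
    return find_mount_point(data_dir) == os.path.realpath(data_dir)


if __name__ == "__main__":
    data_dir = "/var/lib/kafka-data"  # hypothetical location
    print(data_dir, "is mounted under", find_mount_point(data_dir))
    print("dedicated mount:", has_dedicated_mount(data_dir))
```

This only tells you that the directory is a mount point; whether anything else also writes to that partition still has to be checked by hand.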

/var/log/ is also handled by other programs, such as logrotate, so it is not really a safe place for your Kafka data.

/opt/ should not hold any program data, only additional installed software.

Depending on what kind of messages you store, you might want to limit how much other software can interact with that data or cause problems for it.

Alex H
  • 1,824