
I'm setting up a new Ceph cluster for testing purposes using virtual machines on Hyper-V, all running Debian 12 (Bookworm). Here's the current setup:

4 VMs, each with:

  • 40GB disk for the system
  • 1000GB disk for Ceph (clean, no partitions or filesystem)

Hostnames and Network

Each VM has a static IP and a hostname, set with (one per VM):

sudo hostnamectl set-hostname adm001
sudo hostnamectl set-hostname osd001
sudo hostnamectl set-hostname osd002
sudo hostnamectl set-hostname osd003

All hosts share the following /etc/hosts entries:

192.168.155.10  adm001
192.168.155.21  osd001
192.168.155.22  osd002
192.168.155.23  osd003
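
For completeness, name resolution between the nodes can be sanity-checked like this (osd002 is just an example target, the same works for the other names):

getent hosts osd002      # should print 192.168.155.22  osd002
ping -c 1 osd002         # confirms basic reachability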

Installed packages on all nodes:

sudo apt -y install podman lvm2
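
The 1000GB data disks can also be checked on each OSD node before bootstrapping; the device name /dev/sdb below is only an assumption and may differ per VM:

lsblk                          # the 1000GB Ceph disk should appear with no partitions
sudo wipefs --no-act /dev/sdb  # reports leftover filesystem/RAID signatures without erasing anything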

Ceph Installation

On adm001, I installed Ceph 19.2.1:

CEPH_RELEASE=19.2.1
curl --silent --remote-name --location https://download.ceph.com/rpm-${CEPH_RELEASE}/el9/noarch/cephadm
chmod +x cephadm
./cephadm add-repo --release squid
./cephadm install
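
As a quick sanity check that the install worked (just a verification step, not part of the official instructions):

which cephadm     # should point to the system-installed binary
cephadm version   # should report 19.2.1 (squid)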

Then I bootstrapped the cluster:

cephadm bootstrap --mon-ip=192.168.155.10 \
  --apply-spec=initial-config-primary-cluster.yaml \
  --initial-dashboard-password='niceone' \
  --dashboard-password-noupdate \
  --ssh-user=admin
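
For reference, the cluster SSH key (ceph.pub) also has to be authorized for the admin user on osd001-osd003 so cephadm can reach them and add the hosts from the spec; the standard distribution step looks roughly like this (the key path is the cephadm default, adjust if yours differs):

for host in osd001 osd002 osd003; do
  ssh-copy-id -f -i /etc/ceph/ceph.pub admin@$host
done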

The initial-config-primary-cluster.yaml file defines the cluster topology (included below if needed). After about 20 minutes, the installation completes successfully. I can access the Ceph Dashboard, and the cluster looks healthy:

ceph status

Output:

cluster:
  id: ...
  health: HEALTH_OK

services:
  mon: 2 daemons, quorum adm001,osd001
  mgr: adm001 (active)
  osd: 3 osds: 3 up, 3 in

data:
  pools: 2 pools, 33 pgs
  usage: 146 MiB used, 3.0 TiB / 3.0 TiB avail
  pgs:   33 active+clean

io:
  client: 3.0 KiB/s rd, 1 op/s rd
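
The orchestrator view can be cross-checked with standard queries like:

ceph orch host ls   # all four hosts with their labels (mon, mgr, osd, nvme, ...)
ceph orch ps        # all running daemons and which host they landed on
ceph osd tree       # the three OSDs spread across osd001-osd003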

NVMe-oF Configuration (via Dashboard)

  • Created a pool named pool1 and assigned rbd as its application.
  • Set up the NVMe/TCP gateway on osd002 and osd003.
  • Created a Subsystem: nqn.2001-07.com.ceph:1742891486659
  • Created a Namespace with image name nvme_ns_image:1742891396750, size 1 TB, in pool1.
  • Created a Listener on osd002 at port 4420.
  • Added an Initiator (host NQN from the Linux client), and also tried allowing all hosts.
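
To verify the gateway side of this, the checks I can think of are the following (pool, image, and port are the ones from the steps above; the ss command is meant to be run on osd002):

ceph orch ls nvmeof           # the gateway service should show as running
ceph orch ps | grep nvmeof    # one nvmeof daemon each on osd002 and osd003
rbd -p pool1 ls               # the namespace image created above should be listed
ss -tlnp | grep 4420          # on osd002: something should be listening on the NVMe/TCP port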

Client-Side (Linux)

On the Linux client:

sudo apt install nvme-cli
modprobe nvme-tcp

nvme connect -t tcp -a 192.168.155.22 -n "nqn.2001-07.com.ceph:1742891486659"

Output:

message: connecting to device: nvme0

But:

  • lsblk shows no new device
  • nvme list only shows the header, no devices
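
A few more client-side checks that might narrow this down (address and port are the ones from the listener above):

sudo nvme discover -t tcp -a 192.168.155.22 -s 4420   # should list the subsystem NQN if the gateway responds
sudo nvme list-subsys                                  # shows whether a controller is actually connected
sudo dmesg | grep -i nvme                              # kernel messages about the connection and namespace discovery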

Client-Side (Windows)

I also tried using the StarWind NVMe-oF Initiator on Windows. The connection succeeds, but no usable device is listed.

Question: What might I be missing in the Ceph NVMe-oF setup? Why is no device appearing on either the Linux or Windows client? The image and listener seem to be set up correctly, and the connection doesn't error out — but no usable device is visible.

Any hints, suggestions, or troubleshooting steps are highly appreciated!

initial-config-primary-cluster.yaml:

service_type: host
hostname: adm001
addr: 192.168.155.10
location:
  root: default
  datacenter: DC1
  rack: HVHost1
labels:
  - mon
  - mgr
  - admin
---

service_type: host
hostname: osd001
addr: 192.168.155.21
location:
  root: default
  datacenter: DC1
  rack: HVHost1
labels:
  - osd
  - mon
---
service_type: host
hostname: osd002
addr: 192.168.155.22
location:
  root: default
  datacenter: DC1
  rack: HVHost1
labels:
  - osd
  - nvme
---
service_type: host
hostname: osd003
addr: 192.168.155.23
location:
  root: default
  datacenter: DC1
  rack: HVHost1
labels:
  - osd
  - nvme
---
service_type: mon
placement:
  label: "mon"
---
service_type: mgr
placement:
  label: "mgr"
---
service_type: osd
service_id: default_drive_group
spec:
  data_devices:
    all: true
placement:
  label: "osd"

