I recently began learning about Ceph and set up my own Ansible scripts to deploy a Ceph cluster (yes, I'm aware that cephadm-ansible exists, but I want to get comfortable managing Ceph).
Initially, I provisioned the default rgw service and then tried to create another zone with the pools I wanted and make it the default, but that didn't work so well (the service seemed to run, but was inaccessible): https://github.com/Magnitus-/ansible-playbooks/blob/23385216d078251939a6ad03f197d3aad9a79516/roles/ceph/rgw/templates/setup_rgw_service.sh.j2
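For context, the zone juggling in that script boils down to something of this shape (the zone name here is just a placeholder, the linked script has the exact commands):

radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=my-zone --default
radosgw-admin zone default --rgw-zone=my-zone
radosgw-admin period update --commit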
I deleted the new zone, removed the rgw service and cleaned up all of its pools except for .rgw.root, because the documentation warned me not to delete that one (in retrospect, I probably should have removed it).
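For reference, the zone removal itself is just this (placeholder zone name again), with the pools then removed one at a time with ceph osd pool rm, the same way as in the cleanup plan at the end of this post:

radosgw-admin zone delete --rgw-zone=my-zone
radosgw-admin period update --commit    # only relevant if a realm/period was actually in play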
I then re-provisioned the rgw service, but this time I pre-created the default pools with the settings I wanted before doing so: https://github.com/Magnitus-/ansible-playbooks/blob/main/roles/ceph/rgw/templates/setup_rgw_pools.sh.j2 https://github.com/Magnitus-/ansible-playbooks/blob/main/roles/ceph/rgw/templates/setup_rgw_service.sh.j2
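The pre-creation is basically of this form (the EC profile name, PG counts and EC parameters below are made up for illustration, the real values are in the linked script):

ceph osd erasure-code-profile set my-ec-profile k=2 m=1
ceph osd pool create default.rgw.buckets.data 128 128 erasure my-ec-profile
ceph osd pool create default.rgw.buckets.index 32 32 replicated
ceph osd pool application enable default.rgw.buckets.data rgw
ceph osd pool application enable default.rgw.buckets.index rgw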
And that seemed to work very well. However, when I list the users with radosgw-admin user list, I do get the list, but it's preceded by this:
2024-02-05T13:20:40.999+0000 7f37fd330a40 0 failed reading obj info from .rgw.root:realms.c08fb4e1-502c-42f2-98b9-63202f161420: (2) No such file or directory
2024-02-05T13:20:40.999+0000 7f37fd330a40 0 failed reading obj info from .rgw.root:realms.c08fb4e1-502c-42f2-98b9-63202f161420: (2) No such file or directory
2024-02-05T13:20:41.003+0000 7f37fd330a40 0 failed reading obj info from .rgw.root:realms.c08fb4e1-502c-42f2-98b9-63202f161420: (2) No such file or directory
2024-02-05T13:20:41.075+0000 7f37fd330a40 0 failed reading obj info from .rgw.root:realms.c08fb4e1-502c-42f2-98b9-63202f161420: (2) No such file or directory
I'm guessing I have some corruption left over from my previous setup. It hasn't affected me so far (well, creating a user with read-only access on some buckets really feels like pulling teeth, but I have the same problem on a fresh virtualized test Ceph cluster), but I feel like I should clean house before it becomes a problem, so I want to re-provision a fresh rgw service (no previous metadata, nothing).
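Presumably the stale reference can be seen directly with something like this (this is just how I'd go about confirming the guess):

rados -p .rgw.root ls
radosgw-admin realm list
radosgw-admin zonegroup list
radosgw-admin zone list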
First, I want to migrate the data within the same Ceph cluster (I don't have the disk capacity outside my Ceph cluster, I don't want to pay cloud egress fees and I don't want to re-download everything). Ideally, I'd provision another rgw service that doesn't use .rgw.root and rclone to it, but I get the feeling that might be a tall order. Instead, I guess I'll figure out how to set up CephFS, mount a volume and rclone my buckets to it (I have enough capacity in my Ceph cluster to duplicate the data).
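My rough plan for that part looks like this (the filesystem name, cephx user and rclone remote are placeholders, and I'm assuming the kernel client with the cluster conf and keyring already in /etc/ceph):

ceph fs volume create backupfs
ceph fs authorize backupfs client.backup / rw
mount -t ceph :/ /mnt/backupfs -o name=backup,fs=backupfs
rclone sync my-rgw:my-bucket /mnt/backupfs/my-bucket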
Then, to get a squeaky clean rgw service, I guess I'll remove it like last time and clean all its pools, but this time I'll also clean up .rgw.root, and then I'll be good? No more ghosts from the past?
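Concretely, my understanding is that the cleanup would look something like this (assuming the default pool names, that the service is managed by the cephadm orchestrator, and that pool deletion is temporarily allowed on the monitors; corrections welcome):

ceph config set mon mon_allow_pool_delete true
ceph orch rm rgw.main    # or whatever the rgw service is actually named
for pool in .rgw.root default.rgw.control default.rgw.meta default.rgw.log default.rgw.buckets.index default.rgw.buckets.non-ec default.rgw.buckets.data; do
    ceph osd pool rm "$pool" "$pool" --yes-i-really-really-mean-it
done
ceph config set mon mon_allow_pool_delete false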