I have a 4-node Harvester HCI cluster up and running. One node is equipped with dedicated hardware (a GPU) that is not present on the other 3 nodes. Rancher is installed as the management solution to allow automatic deployment of Kubernetes clusters on the available hardware.
For the question in this post, I created a cluster configuration using Rancher defining:
one machine pool with 3 nodes for the control plane
one machine pool, also with 3 nodes, as worker nodes. These workers should handle all non-GPU workloads and are also responsible for managing the Longhorn storage. Longhorn is configured to deploy itself only on nodes that have the node.longhorn.io/create-default-disk=true label, and the nodes in this machine pool are labelled accordingly.
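Concretely, the label on the storage workers looks like this (an excerpt of a Node object; the node name is a placeholder, not one of my actual node names):

```yaml
# Excerpt of a worker Node object in the storage pool.
# "worker-1" is a placeholder name for illustration.
apiVersion: v1
kind: Node
metadata:
  name: worker-1
  labels:
    # Tells Longhorn to create its default disk on this node.
    node.longhorn.io/create-default-disk: "true"
```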
As a result of this configuration, 6 VMs are created on the 3 Harvester nodes and assigned the corresponding Kubernetes roles as expected.
The problem appears when I change the current cluster configuration by adding a machine pool specifically for GPU workloads. To achieve this,
I added the label gpu.present=true to the Harvester node that has the GPU hardware installed.
I added a node scheduling rule (via the Rancher GUI) to the machine pool in charge of the GPU hardware. This rule defines:
priority: required
key: gpu.present
operator: in
value: true
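As far as I understand, Rancher translates this GUI rule into a node affinity term on the generated VM, roughly like the following (a sketch of the assumed shape, not copied from the actual generated spec):

```yaml
# Assumed shape of the scheduling rule as node affinity in the VM spec.
affinity:
  nodeAffinity:
    # "required" priority in the GUI maps to a hard scheduling requirement.
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: gpu.present
              operator: In
              values:
                - "true"
```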
When Rancher tries to create the VM for the GPU machine pool on the HCI cluster, I can see it appear in the cluster's virtual machine overview.
Unfortunately, the VM never becomes ready: it is stuck in an Unschedulable state. The reported reason is: 0/4 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.
As the error indicates, Rancher is not able to bind a PVC on the Longhorn datastore to serve as the storage device for the VM. I suspect this is because the hardware node (with GPU support) is not participating as a storage node in Longhorn. On the other hand, in a Kubernetes cluster it should be possible for a workload to use a PVC backed by nodes other than the one where that workload is deployed (or where the VM should be running, in this case). Something must be missing in the configuration of the HCI node and/or the GPU machine pool specified in Rancher...
Any suggestions about what I am missing or alternative approaches would be greatly appreciated.
Thanks for reading and your feedback.