Ceph CSI driver deployment failing

Hi all

The following is being executed on a Kubespray-deployed cluster, i.e. K8s 1.34.

Ceph is from a Proxmox cluster v9.0.1.1

I’m following: Ceph Storage Integration with Kubernetes using Ceph CSI | by Satish Patel | Medium

vi ceph-csi-rbd-values.yaml
csiConfig:
  - clusterID: "32a09c66-a6c3-4329-8339-15510d2ea9e0"
    monitors:
      - "172.16.10.51:6789"
      - "172.16.10.52:6789"
      - "172.16.10.53:6789"
provisioner:
  name: provisioner
  replicaCount: 2

Above values come from :

$ ceph fsid
$ ceph mon dump

nc -vc 172.16.10.51 6789
Connection to 172.16.10.51 port 6789 [tcp/smc-https] succeeded!
ceph v027 (followed by binary handshake bytes) ^C
helm install --namespace ceph-csi ceph-csi --values ceph-csi-rbd-values.yaml ./
NAME: ceph-csi
LAST DEPLOYED: Fri Nov 14 11:09:45 2025
NAMESPACE: ceph-csi
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Examples on how to configure a storage class and start using the driver are here:
https://github.com/ceph/ceph-csi/tree/devel/examples/rbd
helm status ceph-csi -n ceph-csi
NAME: ceph-csi
LAST DEPLOYED: Fri Nov 14 11:09:45 2025
NAMESPACE: ceph-csi
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Examples on how to configure a storage class and start using the driver are here:
https://github.com/ceph/ceph-csi/tree/devel/examples/rbd
kubectl rollout status deployment -n ceph-csi
error: deployment "ceph-csi-ceph-csi-rbd-provisioner" exceeded its progress deadline
kubectl -n ceph-csi logs deploy/ceph-csi-ceph-csi-rbd-provisioner
Found 2 pods, using pod/ceph-csi-ceph-csi-rbd-provisioner-9756f44b-xvzxh
Defaulted container "csi-rbdplugin" out of: csi-rbdplugin, csi-provisioner, csi-resizer, csi-snapshotter, csi-attacher, csi-rbdplugin-controller, liveness-prometheus
Fatal glibc error: CPU does not support x86-64-v2

This is all running on Intel i5/32GB boxes.

kubectl get nodes
NAME    STATUS   ROLES           AGE     VERSION
node1   Ready    control-plane   3d20h   v1.31.2
node2   Ready    control-plane   3d20h   v1.31.2
node3   Ready    control-plane   3d20h   v1.31.2
node4   Ready    <none>          3d20h   v1.34.1
node5   Ready    <none>          3d20h   v1.34.1
node6   Ready    <none>          3d20h   v1.34.1
kubectl get all
NAME                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   172.233.0.1   <none>        443/TCP   3d21h
kubectl version
Client Version: v1.34.1
Kustomize Version: v5.7.1
Server Version: v1.34.1
kubectl get pods -n ceph-csi
NAME                                               READY   STATUS             RESTARTS         AGE
ceph-csi-ceph-csi-rbd-nodeplugin-8krqb             0/3     CrashLoopBackOff   26 (4m28s ago)   29m
ceph-csi-ceph-csi-rbd-nodeplugin-ln7hd             1/3     CrashLoopBackOff   27 (4m28s ago)   29m
ceph-csi-ceph-csi-rbd-nodeplugin-v2fl4             0/3     CrashLoopBackOff   26 (92s ago)     29m
ceph-csi-ceph-csi-rbd-provisioner-9756f44b-rk4hl   0/7     CrashLoopBackOff   56 (44s ago)     29m
ceph-csi-ceph-csi-rbd-provisioner-9756f44b-xvzxh   0/7     CrashLoopBackOff   59 (118s ago)    29m
kubectl logs -n ceph-csi ceph-csi-ceph-csi-rbd-nodeplugin-8krqb
Defaulted container "csi-rbdplugin" out of: csi-rbdplugin, driver-registrar, liveness-prometheus
Fatal glibc error: CPU does not support x86-64-v2
kubectl logs -n ceph-csi ceph-csi-ceph-csi-rbd-provisioner-9756f44b-xvzxh
Defaulted container "csi-rbdplugin" out of: csi-rbdplugin, csi-provisioner, csi-resizer, csi-snapshotter, csi-attacher, csi-rbdplugin-controller, liveness-prometheus
Fatal glibc error: CPU does not support x86-64-v2

Anyone able to guide me on what the problem could be?
G

@Hutch-45Drives would be the best resource.

I don’t know much about Ceph, but maybe the following will help? You may need to downgrade the driver version, or adjust the VM settings depending on whether the i5/32GB boxes are physical hardware or VMs.

If the below is irrelevant I apologize, I just know responses here can be slow sometimes.

============================================================

The issue you are encountering, based on the logs, is a CPU microarchitecture compatibility problem.

The deployment is failing with the critical error:

Fatal glibc error: CPU does not support x86-64-v2

This means the container image for the Ceph CSI driver (specifically the csi-rbdplugin) was compiled with optimizations (using a newer version of glibc) that require the x86-64-v2 instruction set. Your node CPUs, which are described as older generation Intel i5s, do not expose this required instruction set.
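To confirm on a node (or inside a VM) whether the CPU actually exposes x86-64-v2, you can check /proc/cpuinfo for the flags that level requires. A minimal sketch (flag names as Linux reports them: `pni` is SSE3, `cx16` is CMPXCHG16B):

```shell
# x86-64-v2 = baseline x86-64 plus: SSE3 (pni), SSSE3, SSE4.1,
# SSE4.2, POPCNT, CMPXCHG16B (cx16), LAHF/SAHF (lahf_lm).
required="pni ssse3 sse4_1 sse4_2 popcnt cx16 lahf_lm"
missing=""
for f in $required; do
    # -w matches the whole flag token in the cpuinfo flags line
    grep -qw "$f" /proc/cpuinfo || missing="$missing $f"
done
if [ -z "$missing" ]; then
    verdict="supported"
else
    verdict="missing:$missing"
fi
echo "x86-64-v2: $verdict"
```

Run this inside the Kubernetes node; with a kvm64 CPU model it will report several of these flags as missing, which is exactly what trips the glibc check.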

Here are the most effective ways to fix this issue, categorized by whether your Kubernetes nodes are running on bare metal or as Virtual Machines (VMs).


1. Software Fix: Downgrade the Ceph CSI Image (Recommended)

The quickest and most common fix is to deploy an older version of the Ceph CSI driver that uses container images compiled for a more generic (less optimized) x86-64 architecture.

You are likely using the latest image tag (e.g., v3.15.0 or newer). Try installing a slightly older, stable version of the Helm chart:

  1. Find a compatible tag: The requirement for x86-64-v2 started appearing in newer releases of container images (including those used by Ceph CSI) around late 2024/early 2025. You should target a version from the stable branch just before this change. A safe bet is the v3.14.x series, or even older if needed.
  2. Redeploy with a specific version:
    helm uninstall ceph-csi -n ceph-csi
    helm install --namespace ceph-csi ceph-csi ceph-csi/ceph-csi-rbd --version 3.14.2 --values ceph-csi-rbd-values.yaml
    
    (Note: Replace 3.14.2 with the older version you want to test. --version only takes effect when installing from a chart repository; if you install from a local chart directory (./), check out the matching chart version instead.)

If you need to stick to the newest Helm chart, you can manually override the image tag in your ceph-csi-rbd-values.yaml file by setting the image tags for the RBD plugins to a known working older tag, such as a v3.13.0 or v3.14.0 version of the quay.io/cephcsi/cephcsi image.
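The manual override could look roughly like this; a sketch only, as the key layout follows recent ceph-csi-rbd chart versions, so confirm the exact keys with `helm show values` against the chart you are deploying:

```yaml
# ceph-csi-rbd-values.yaml (excerpt, assumed key layout):
# pin the cephcsi image to an older tag built for baseline x86-64.
nodeplugin:
  plugin:
    image:
      repository: quay.io/cephcsi/cephcsi
      tag: v3.13.0   # example tag; verify it exists and works on your CPUs
```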


2. Virtualization Fix: Update CPU Emulation (If Applicable)

If your Kubernetes nodes are running as Virtual Machines on a hypervisor (like Proxmox, which you mentioned is running Ceph), the hypervisor might be emulating an old or generic CPU that hides the necessary instruction sets from the VM.

In your virtualization platform (e.g., Proxmox, ESXi, KVM):

  1. Stop the Kubernetes Node VMs.
  2. Change the CPU Type: Edit the VM settings and change the CPU model from a generic type (like kvm64, qemu64, or an old Intel model) to one of the following:
    • Host Passthrough (or just Host): This exposes the host CPU’s full capabilities to the VM, which should include x86-64-v2 features. This is the most reliable fix if the host CPU supports it.
    • A newer specific CPU model: Choose a model that is known to support x86-64-v2 (e.g., Intel Nehalem or newer; roughly any CPU from 2009 onward).

Once the CPU setting is updated, start the VMs and the Ceph CSI pods should initialize successfully.
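On Proxmox specifically, the same change can be made from the host CLI; a sketch, where `101` is a placeholder VM ID:

```shell
# On the Proxmox host, show the VM's current CPU model:
qm config 101 | grep ^cpu
# Full passthrough of the host CPU (best performance, but ties the
# VM to hosts with identical CPUs for live migration):
qm set 101 --cpu host
# Or a generic model that still guarantees x86-64-v2 (Proxmox VE 8+):
qm set 101 --cpu x86-64-v2-AES
```

The new CPU type takes effect on the next full VM stop/start, not on a reboot from inside the guest.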

1 Like

Hi there

Thank you for the above. My K8s/Kubespray nodes run as VMs on my Proxmox cluster; as you hinted, they are currently defined with the kvm64 CPU type.

If I go to my base Proxmox host, I see the following.

Now to figure out what to change my K8s VMs to; going to try x86-64-v3.

Well, x86-64-v3 was definitely not liked; reconfigured to x86-64-v2 and the VMs are starting.

Let me get this back up and then re-attempt the deployment.

G

Awesome, that fixed the problem, thank you; continuing with the rest of the deployment.

G

1 Like

… making progress.

kubectl get StorageClass
NAME                    PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
ceph-rbd-sc (default)   rbd.csi.ceph.com               Delete          Immediate              true                   7m12s
local-storage           kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  3d21h
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-ceph-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: ceph-rbd-sc
kubectl get pvc
NAME          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
my-ceph-pvc   Pending                                      ceph-rbd-sc    <unset>                 85s


And the Pending PVC then makes the test pod deployment fail.

cat <<EOF > create-pod-with-pvc.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ceph-pod-pvc
spec:
  containers:
  - name:  ceph-pod-pvc
    image: busybox
    command: ["sleep", "infinity"]
    volumeMounts:
    - mountPath: /mnt/ceph_rbd
      name: volume
  volumes:
  - name: volume
    persistentVolumeClaim:
      claimName: my-ceph-pvc
EOF

If I go onto my hosting node:

ceph osd pool stats k8s-pool
pool k8s-pool id 10
  nothing is going on
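For anyone else stuck at a Pending PVC while the CSI pods are healthy, the PVC's events and the provisioner sidecar's logs usually name the cause; a sketch, using the resource names from this thread:

```shell
# The Events section at the bottom names the provisioning failure:
kubectl describe pvc my-ceph-pvc
# The csi-provisioner sidecar logs the RPC errors from the driver:
kubectl -n ceph-csi logs deploy/ceph-csi-ceph-csi-rbd-provisioner \
    -c csi-provisioner --tail=50
```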

G

Knock knock @Hutch-45Drives, may I bother you? :wink:

G

RESOLVED.

Did not update the clusterID when I created the StorageClass.
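For reference, the clusterID in the StorageClass has to match the one in csiConfig. A sketch of a working definition (the secret names are assumptions based on the usual ceph-csi examples; the pool name is from this thread):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: "32a09c66-a6c3-4329-8339-15510d2ea9e0"   # must match csiConfig
  pool: k8s-pool
  imageFeatures: layering
  # Secret names/namespaces below are assumptions; adjust to your setup:
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi
  csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi
reclaimPolicy: Delete
allowVolumeExpansion: true
```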

Got the PVC created and Bound.

onwards we go.

G

Hey, sorry for the delay, but I’m glad you were able to get this resolved with a simple update of the ID

Hehehe, well, got the single RBD driver working.

It all came apart when I tried to install the CephFS driver, be that into the same namespace or into a separate namespace.

Which is why I’m back here. Hope you can maybe help with a chart deployment/YAML to deploy both RBD and CephFS as StorageClasses; I’m thinking a single ceph-csi namespace makes sense,

using their own secrets.
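A sketch of one way that could look with the upstream charts (the repo URL and chart names are from the ceph-csi project; the two values files are assumptions, each carrying its own csiConfig and secrets):

```shell
helm repo add ceph-csi https://ceph.github.io/csi-charts
helm repo update
kubectl create namespace ceph-csi
# Two releases, same namespace, separate values (and thus secrets):
helm install ceph-csi-rbd    ceph-csi/ceph-csi-rbd    -n ceph-csi -f rbd-values.yaml
helm install ceph-csi-cephfs ceph-csi/ceph-csi-cephfs -n ceph-csi -f cephfs-values.yaml
```

The two drivers register under different names (rbd.csi.ceph.com and cephfs.csi.ceph.com), so they can coexist in one namespace as long as the release names differ.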

G

We do not have much experience with K8s and the CSI driver, as it’s not something we typically deploy.

Could you share some of the errors you are getting?

Hi there

Tagged you in a second thread specifically re the CephFS deployment, with all the code and the error.

G

For anyone interested: deploying Ceph RBD and CephFS into separate namespaces.

G