Hi all. Simple enough question, but it seems the answer isn't that simple.
Reposting as a separate thread (the original was about getting RBD working).
I've opted to install the RBD driver into namespace ceph-csi-rbd, and CephFS completely separately, in namespace ceph-csi-cephfs.
Got my namespaces created and the secret configured.
Got the RBD driver deployed and working.
I have my pool and the fsName. The issue: the second I do the helm install, the pods go into a crashloop.
Note: even if I remove the entire RBD deployment, I get the same results as below.
Below are my YAML files; does anyone see anything wrong? I'm also posting the log output from my crashing pod.
The CNCF says this is not their problem, I can't find a dedicated Ceph forum, and my K8s cluster is based on Kubespray, whose maintainers also say it's not a K8s problem, so go somewhere else.
One of the earlier errors pointed at a metrics port conflict, but I can't understand why: the RBD driver runs in its own namespace, and port 8080 in that case is a container port, not a host port, so it should not conflict with the CephFS metrics, which also use 8080 as a container port.
Also, changing this implies editing the chart YAML, which I don't think is the best route.
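In case it helps anyone suggest a fix: if the metrics port really were the culprit, I'd expect to be able to move it through chart values rather than editing templates. Something like this in the values file is what I'd try (key names assumed from the upstream ceph-csi-cephfs chart's values.yaml, untested; please verify against v3.15.0):

```yaml
# Assumed value keys from the ceph-csi-cephfs chart -- verify against
# charts/ceph-csi-cephfs/values.yaml in v3.15.0 before relying on this.
nodeplugin:
  httpMetrics:
    enabled: true
    containerPort: 8091   # move the liveness metrics off 8080
provisioner:
  httpMetrics:
    enabled: true
    containerPort: 8090
```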
G
Errors
kubectl logs -n ceph-csi-cephfs ceph-csi-cephfs-nodeplugin-jtmq8
Defaulted container "csi-cephfsplugin" out of: csi-cephfsplugin, driver-registrar, liveness-prometheus
I1121 13:29:45.221001 3349672 cephcsi.go:204] Driver version: canary and Git version: 6b1c8a9b598405e804a62c94397890f0382d1c8d
I1121 13:29:45.222292 3349672 maxprocs.go:48] maxprocs: Leaving GOMAXPROCS=2: CPU quota undefined
I1121 13:29:45.222352 3349672 cephcsi.go:291] Initial PID limit is set to 2267
I1121 13:29:45.222387 3349672 cephcsi.go:297] Reconfigured PID limit to -1 (max)
I1121 13:29:45.222420 3349672 cephcsi.go:245] Starting driver type: cephfs with name: cephfs.csi.ceph.com
I1121 13:29:45.252194 3349672 volumemounter.go:81] loaded mounter: kernel
I1121 13:29:45.311729 3349672 volumemounter.go:92] loaded mounter: fuse
I1121 13:29:45.393975 3349672 mount_linux.go:324] Detected umount with safe ‘not mounted’ behavior
I1121 13:29:45.394403 3349672 server.go:123] listening for CSI-Addons requests on address: &net.UnixAddr{Name:"/tmp/csi-addons.sock", Net:"unix"}
I1121 13:29:45.401872 3349672 server.go:131] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}
I1121 13:29:46.398619 3349672 utils.go:341] ID: 1 GRPC call: /csi.v1.Identity/GetPluginInfo
I1121 13:29:46.398696 3349672 utils.go:342] ID: 1 GRPC request: {}
I1121 13:29:46.398712 3349672 identityserver-default.go:40] ID: 1 Using default GetPluginInfo
I1121 13:29:46.398917 3349672 utils.go:348] ID: 1 GRPC response: {"name":"cephfs.csi.ceph.com","vendor_version":"canary"}
I1121 13:29:46.874140 3349672 utils.go:341] ID: 2 GRPC call: /csi.v1.Node/NodeGetInfo
I1121 13:29:46.874157 3349672 utils.go:342] ID: 2 GRPC request: {}
I1121 13:29:46.874163 3349672 nodeserver-default.go:45] ID: 2 Using default NodeGetInfo
I1121 13:29:46.874259 3349672 utils.go:348] ID: 2 GRPC response: {"accessible_topology":{},"node_id":"node6"}
This is immediately after running the helm install command.
Deployment Notes/steps
Cloning ceph-csi repo
git clone https://github.com/ceph/ceph-csi.git
cd ceph-csi
git checkout v3.15.0
cd charts/ceph-csi-cephfs
We use one file to create both the RBD and CephFS namespaces.
cat <<EOF > ceph-ns.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: ceph-csi-rbd
---
apiVersion: v1
kind: Namespace
metadata:
  name: ceph-csi-cephfs
EOF
Create the namespaces on Kubernetes
kubectl apply -f ceph-ns.yaml
On CEPH cluster
Execute the following on your CEPH cluster
ceph fsid
ceph mon dump
On Kube cluster
Now we need a chart values file containing the CEPH cluster ID and mon node list from above.
Copy the values from above into ceph-cephfs-values.yaml
cat <<EOF > ceph-cephfs-values.yaml
csiConfig:
  - clusterID: "32a09c66-a6c3-4329-8339-15510d2ea9e0"
    monitors:
      - "172.16.10.51:6789"
      - "172.16.10.52:6789"
      - "172.16.10.53:6789"
provisioner:
  name: provisioner
  replicaCount: 2
EOF
Let’s install Ceph CSI chart on Kubernetes
helm install -n ceph-csi-cephfs ceph-csi-cephfs --values ceph-cephfs-values.yaml ./
THE ABOVE IS WHERE IT FAILS
Check CSI Installation Status
helm status ceph-csi-cephfs -n ceph-csi-cephfs
kubectl rollout status deployment -n ceph-csi-cephfs
kubectl get pods -n ceph-csi-cephfs -o wide
On CEPH cluster
Create CephFS filesystem (if not already exists)
To mount volumes on Kubernetes from external Ceph storage, a CephFS filesystem needs to be created first. Once you have a filesystem, you can create multiple storage classes, each configured to use a unique subvolume.
ceph fs volume create cephfs
Inspecting
List all CephFS filesystems
ceph fs volume ls
[
    {
        "name": "cephfs"
    }
]
Using the traditional command, list the filesystems; use the name: value for the ceph auth user create and for fsName,
and use the data pools value for the pool in the storage class.
ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
Get detailed info about a specific filesystem
ceph fs get cephfs
Show filesystem status
ceph fs status cephfs
Create CephFS user with appropriate permissions
Cleanup
ceph auth del client.k8s-cephfs-user
ceph auth ls | grep k8s-cephfs-user
Create
ceph auth get-or-create client.k8s-cephfs-user \
mon 'profile rbd, allow r' \
mgr 'profile rbd, allow r' \
mds 'allow rw fsname=cephfs' \
osd 'profile cephfs, allow rw pool=cephfs_data'
or
Create (note: this over-assigns permissions!)
ceph auth get-or-create client.k8s-cephfs-user \
mon 'profile rbd, allow *' \
mgr 'profile rbd, allow *' \
mds 'allow * fsname=cephfs' \
osd 'profile cephfs, allow * pool=cephfs_data'
or
Modify (note: this over-assigns permissions!)
ceph auth caps client.k8s-cephfs-user \
mon 'allow r' \
mgr 'allow rw' \
mds 'allow rw' \
osd 'profile cephfs, allow rw pool=cephfs_data'
Inspect
ceph auth get client.k8s-cephfs-user
[client.k8s-cephfs-user]
    key = AQASQB9pEcMbBBAAOjG0ond+gdQ8aJOAUHLBuw==
    caps mds = "allow rw fsname=cephfs"
    caps mgr = "profile rbd, allow r"
    caps mon = "profile rbd, allow r"
    caps osd = "profile cephfs, allow rw pool=cephfs_data"
Encode the key from above into a base64 value
echo -n "AQASQB9pEcMbBBAAOjG0ond+gdQ8aJOAUHLBuw==" | base64 -w 0
QVFBU1FCOXBFY01iQkJBQU9qRzBvbmQrZ2RROGFKT0FVSExCdXc9PQ==
or
Get clear key and pipe directly into base64 value
ceph auth get-key client.k8s-cephfs-user | tr -d '\n' | base64
QVFBU1FCOXBFY01iQkJBQU9qRzBvbmQrZ2RROGFKT0FVSExCdXc9PQ==
Base64 encode the username
echo "k8s-cephfs-user" | tr -d '\n' | base64
azhzLWNlcGhmcy11c2Vy
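One pitfall I hit while preparing these (my own note, not from any docs): a plain echo appends a trailing newline, and if that newline gets base64-encoded into the secret, the stored username/key no longer match what Ceph expects. That's why the tr -d '\n' (or printf / echo -n) matters:

```shell
# echo appends a trailing newline; if it gets encoded, the secret
# value no longer matches the real username.
printf '%s' "k8s-cephfs-user" | base64
# -> azhzLWNlcGhmcy11c2Vy (correct)
echo "k8s-cephfs-user" | base64
# -> azhzLWNlcGhmcy11c2VyCg== ("Cg==" is the encoded newline)
```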
On Kube cluster
Copy/add the base64-encoded values from above into admin-secret.yaml.
Note: we're using one admin-secret.yaml to create both our RBD and CephFS secrets.
See the RBD section to retrieve the RBD user credentials.
cat > admin-secret.yaml << EOF
apiVersion: v1
kind: Secret
metadata:
  name: csi-admin-secret
  namespace: ceph-csi-cephfs
type: Opaque
data:
  userID: azhzLWNlcGhmcy11c2Vy
  userKey: QVFCUmNoMXArRy9oREJBQXNsK2lmRHZLRjBnVjBURkduRUgrdEE9PQ==
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-admin-secret
  namespace: ceph-csi-rbd
type: kubernetes.io/rbd
data:
  userID: azhzLXJiZC11c2Vy
  userKey: QVFDN2JoMXBnMXFSQ1JBQWgyZkwrWmZhT1V0anlNZDhxOUlPNUE9PQ==
EOF
Create the admin secret in each namespace
kubectl apply -f admin-secret.yaml
Create CephFS StorageClass
kubectl apply -f ceph-cephfs-sc.yaml
Verify Build
kubectl get sc
Let's try to create a PVC
kubectl apply -f create-cephfs-pod-with-pvc.yaml
kubectl get pvc -n ceph-csi-cephfs
kubectl describe pvc/cephfs-pvc -n ceph-csi-cephfs
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
  namespace: ceph-csi-cephfs
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-cephfs-sc
---
apiVersion: v1
kind: Pod
metadata:
  name: rbd-app
  namespace: ceph-csi-cephfs
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "echo 'Hello CEPHFS' > /mnt/cephfs/data.txt && sleep 3600"]
      volumeMounts:
        - mountPath: /mnt/cephfs
          name: cephfs-mnt
  volumes:
    - name: cephfs-mnt
      persistentVolumeClaim:
        claimName: cephfs-pvc
Uninstall/Cleanup CephFS CSI
kubectl delete -f create-cephfs-pod-with-pvc.yaml
kubectl delete -f ceph-cephfs-sc.yaml
helm uninstall ceph-csi-cephfs -n ceph-csi-cephfs
kubectl delete -f admin-secret.yaml
kubectl delete namespace ceph-csi-cephfs
helm list -n ceph-csi-cephfs
For reference, here is my RBD deployment in its own namespace:
kubectl get all -n ceph-csi-rbd -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/ceph-csi-rbd-nodeplugin-ds48k 3/3 Running 0 44h 172.16.30.74 node4 <none> <none>
pod/ceph-csi-rbd-nodeplugin-ns2hm 3/3 Running 0 44h 172.16.30.75 node5 <none> <none>
pod/ceph-csi-rbd-nodeplugin-wjnp7 3/3 Running 0 44h 172.16.30.76 node6 <none> <none>
pod/ceph-csi-rbd-provisioner-6b6f6f5c5f-dgz8w 7/7 Running 0 44h 172.233.75.31 node6 <none> <none>
pod/ceph-csi-rbd-provisioner-6b6f6f5c5f-vk899 7/7 Running 0 44h 172.233.97.174 node5 <none> <none>
pod/rbd-app 1/1 Running 44 (41m ago) 44h 172.233.75.33 node6 <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/ceph-csi-rbd-nodeplugin-http-metrics ClusterIP 172.233.11.135 <none> 8080/TCP 44h app=ceph-csi-rbd,component=nodeplugin,release=ceph-csi-rbd
service/ceph-csi-rbd-provisioner-http-metrics ClusterIP 172.233.45.178 <none> 8080/TCP 44h app=ceph-csi-rbd,component=provisioner,release=ceph-csi-rbd
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
daemonset.apps/ceph-csi-rbd-nodeplugin 3 3 3 3 3 <none> 44h csi-rbdplugin,driver-registrar,liveness-prometheus quay.io/cephcsi/cephcsi:canary,registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.15.0,quay.io/cephcsi/cephcsi:canary app=ceph-csi-rbd,component=nodeplugin,release=ceph-csi-rbd
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/ceph-csi-rbd-provisioner 2/2 2 2 44h csi-rbdplugin,csi-provisioner,csi-resizer,csi-snapshotter,csi-attacher,csi-rbdplugin-controller,liveness-prometheus quay.io/cephcsi/cephcsi:canary,registry.k8s.io/sig-storage/csi-provisioner:v6.0.0,registry.k8s.io/sig-storage/csi-resizer:v2.0.0,registry.k8s.io/sig-storage/csi-snapshotter:v8.4.0,registry.k8s.io/sig-storage/csi-attacher:v4.10.0,quay.io/cephcsi/cephcsi:canary,quay.io/cephcsi/cephcsi:canary app=ceph-csi-rbd,component=provisioner,release=ceph-csi-rbd
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
replicaset.apps/ceph-csi-rbd-provisioner-6b6f6f5c5f 2 2 2 44h csi-rbdplugin,csi-provisioner,csi-resizer,csi-snapshotter,csi-attacher,csi-rbdplugin-controller,liveness-prometheus quay.io/cephcsi/cephcsi:canary,registry.k8s.io/sig-storage/csi-provisioner:v6.0.0,registry.k8s.io/sig-storage/csi-resizer:v2.0.0,registry.k8s.io/sig-storage/csi-snapshotter:v8.4.0,registry.k8s.io/sig-storage/csi-attacher:v4.10.0,quay.io/cephcsi/cephcsi:canary,quay.io/cephcsi/cephcsi:canary app=ceph-csi-rbd,component=provisioner,pod-template-hash=6b6f6f5c5f,release=ceph-csi-rbd