Rook can provide block storage; before using it, you need to create the corresponding StorageClass and use the CephBlockPool CRD to create the matching CR (a storage pool in Ceph). Creating an RBD pool in Rook was covered in the previous article, ROOK 03: Creating an RBD Pool; the next step is to create the corresponding StorageClass.
Creating a StorageClass
A block-storage StorageClass can be backed by either of two pool types: a replicated pool or an erasure-coded pool.
StorageClass Based on a Replicated Pool
The project file deploy/examples/csi/rbd/storageclass.yaml provides an example of a StorageClass backed by a replicated pool:
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  # clusterID is the namespace where the rook cluster is running
  clusterID: rook-ceph
  # Ceph pool into which the RBD image shall be created
  pool: replicapool
  # (optional) mapOptions is a comma-separated list of map options.
  # For krbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
  # For nbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
  # mapOptions: lock_on_read,queue_depth=1024
  # (optional) unmapOptions is a comma-separated list of unmap options.
  # For krbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
  # For nbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
  # unmapOptions: force
  # RBD image format. Defaults to "2".
  imageFormat: "2"
  # RBD image features
  # Available for imageFormat: "2". Older releases of CSI RBD
  # support only the `layering` feature. The Linux kernel (KRBD) supports the
  # full complement of features as of 5.4
  # `layering` alone corresponds to Ceph's bitfield value of "2";
  # `layering` + `fast-diff` + `object-map` + `deep-flatten` + `exclusive-lock` together
  # correspond to Ceph's OR'd bitfield value of "63". Here we use
  # a symbolic, comma-separated format:
  # For 5.4 or later kernels:
  # imageFeatures: layering,fast-diff,object-map,deep-flatten,exclusive-lock
  # For 5.3 or earlier kernels:
  imageFeatures: layering
  # The secrets contain Ceph admin credentials.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, csi-provisioner
  # will set default as `ext4`. Note that `xfs` is not recommended due to potential deadlock
  # in hyperconverged settings where the volume is mounted on the same node as the osds.
  csi.storage.k8s.io/fstype: ext4
# Delete the rbd volume when a PVC is deleted
reclaimPolicy: Delete
# Optional, if you want to add dynamic resize for PVC.
# For now only ext3, ext4, xfs resize support provided, like in Kubernetes itself.
allowVolumeExpansion: true
In the example above, reclaimPolicy can be set to Delete or Retain:

- Delete: the PV is deleted when the PVC is deleted.
- Retain: the PV is kept when the PVC is deleted; the underlying rbd image must then be removed manually with rbd rm (see the sketch after this list).
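For a retained volume, a minimal cleanup sketch using the Rook toolbox might look like this (the image name is illustrative; list the pool first to find the real one):

[vagrant@master01 ~]$ kubectl -n rook-ceph exec deploy/rook-ceph-tools -- rbd -p replicapool ls
[vagrant@master01 ~]$ kubectl -n rook-ceph exec deploy/rook-ceph-tools -- rbd rm replicapool/csi-vol-xxxxxxxx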
Create the StorageClass from this example and verify it:
[vagrant@master01 rbd]$ kubectl apply -f storageclass.yaml
cephblockpool.ceph.rook.io/replicapool configured
storageclass.storage.k8s.io/rook-ceph-block created
[vagrant@master01 rbd]$ kubectl get sc
NAME              PROVISIONER                  RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
rook-ceph-block   rook-ceph.rbd.csi.ceph.com   Delete          Immediate           true                   11s
Deploying an Application for Verification
Create a MySQL application with persistent storage, for example:
apiVersion: v1
kind: Service
metadata:
  name: wordpress-mysql
  labels:
    app: wordpress
spec:
  ports:
    - port: 3306
  selector:
    app: wordpress
    tier: mysql
  clusterIP: None
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
  labels:
    app: wordpress
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress-mysql
  labels:
    app: wordpress
    tier: mysql
spec:
  selector:
    matchLabels:
      app: wordpress
      tier: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: wordpress
        tier: mysql
    spec:
      containers:
        - image: mysql:5.6
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: imxcai.com
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-persistent-storage
          persistentVolumeClaim:
            claimName: mysql-pv-claim
Apply and verify:
[vagrant@master01 examples]$ kubectl apply -f mysql.yaml
service/wordpress-mysql created
persistentvolumeclaim/mysql-pv-claim created
deployment.apps/wordpress-mysql created
[vagrant@master01 examples]$ kubectl get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      VOLUMEATTRIBUTESCLASS   AGE
mysql-pv-claim   Bound    pvc-1459dadf-b9c7-404f-8438-296ce856b7aa   20Gi       RWO            rook-ceph-block   <unset>                 4s
[vagrant@master01 examples]$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                    STORAGECLASS      VOLUMEATTRIBUTESCLASS   REASON   AGE
pvc-1459dadf-b9c7-404f-8438-296ce856b7aa   20Gi       RWO            Delete           Bound    default/mysql-pv-claim   rook-ceph-block   <unset>                          12m
Once the Pod is running, inspect the mounts inside it; the corresponding rbd block device is visible:
[vagrant@master01 examples]$ kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
wordpress-mysql-946474f8f-lxwtj   1/1     Running   0          53s
[vagrant@master01 examples]$ kubectl exec -it wordpress-mysql-946474f8f-lxwtj -- /bin/df -h
Filesystem                 Size  Used Avail Use% Mounted on
overlay                    125G   10G  115G   8% /
tmpfs                       64M     0   64M   0% /dev
shm                         64M     0   64M   0% /dev/shm
tmpfs                      1.6G   13M  1.5G   1% /etc/hostname
/dev/mapper/centos9s-root  125G   10G  115G   8% /etc/hosts
/dev/rbd0                   20G  116M   20G   1% /var/lib/mysql
tmpfs                      7.5G   12K  7.5G   1% /run/secrets/kubernetes.io/serviceaccount
devtmpfs                   4.0M     0  4.0M   0% /proc/keys
By default the volume is formatted with the ext4 filesystem:
[vagrant@master01 examples]$ kubectl exec -it wordpress-mysql-946474f8f-lxwtj -- /bin/mount | grep rbd0
/dev/rbd0 on /var/lib/mysql type ext4 (rw,relatime,stripe=16)
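If you have shell access to the worker node that hosts the Pod, the kernel RBD mapping can also be confirmed from sysfs (a sketch; the node name and device index 0 are assumptions):

[vagrant@worker01 ~]$ lsblk /dev/rbd0
[vagrant@worker01 ~]$ cat /sys/bus/rbd/devices/0/name   # prints the name of the backing rbd image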
The corresponding rbd image in the replicapool pool:
[vagrant@master01 examples]$ kubectl exec -it rook-ceph-tools-66b77b8df5-x97q4 -n rook-ceph -- /bin/rbd -p replicapool ls
csi-vol-470aadee-1094-4955-9159-77cd4c465af7
[vagrant@master01 examples]$ kubectl exec -it rook-ceph-tools-66b77b8df5-x97q4 -n rook-ceph -- /bin/rbd -p replicapool info csi-vol-470aadee-1094-4955-9159-77cd4c465af7
rbd image 'csi-vol-470aadee-1094-4955-9159-77cd4c465af7':
        size 20 GiB in 5120 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 1e8e0c8116a37
        block_name_prefix: rbd_data.1e8e0c8116a37
        format: 2
        features: layering
        op_features:
        flags:
        create_timestamp: Sun Mar  3 05:25:32 2024
        access_timestamp: Sun Mar  3 05:25:32 2024
        modify_timestamp: Sun Mar  3 05:25:32 2024
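The mapping from a PV back to its rbd image can also be read from the PV object itself, since the CSI driver records the image name in the volume attributes (a sketch using the PV name from the output above):

[vagrant@master01 examples]$ kubectl get pv pvc-1459dadf-b9c7-404f-8438-296ce856b7aa \
    -o jsonpath='{.spec.csi.volumeAttributes.imageName}'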
StorageClass Based on an Erasure-Coded Pool
A StorageClass backed by an erasure-coded pool requires two pools: a replicated pool that stores the image metadata, and an erasure-coded pool that stores the image data.
The example provided in the project's deploy/examples/csi/rbd/storageclass-ec.yaml:
#################################################################################################################
# Create a storage class with a data pool that uses erasure coding for a production environment.
# A metadata pool is created with replication enabled. A minimum of 3 nodes with OSDs are required in this
# example since the default failureDomain is host.
# kubectl create -f storageclass-ec.yaml
#################################################################################################################
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicated-metadata-pool
  namespace: rook-ceph # namespace:cluster
spec:
  replicated:
    size: 2
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: ec-data-pool
  namespace: rook-ceph # namespace:cluster
spec:
  # Make sure you have enough nodes and OSDs running bluestore to support the replica size or erasure code chunks.
  # For the below settings, you need at least 3 OSDs on different nodes (because the `failureDomain` is `host` by default).
  erasureCoded:
    dataChunks: 2
    codingChunks: 1
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block-ec
provisioner: rook-ceph.rbd.csi.ceph.com # csi-provisioner-name
parameters:
  # clusterID is the namespace where the rook cluster is running
  # If you change this namespace, also change the namespace below where the secret namespaces are defined
  clusterID: rook-ceph # namespace:cluster
  # If you want to use erasure coded pool with RBD, you need to create
  # two pools. one erasure coded and one replicated.
  # You need to specify the replicated pool here in the `pool` parameter, it is
  # used for the metadata of the images.
  # The erasure coded pool must be set as the `dataPool` parameter below.
  dataPool: ec-data-pool
  pool: replicated-metadata-pool
  # (optional) mapOptions is a comma-separated list of map options.
  # For krbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
  # For nbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
  # mapOptions: lock_on_read,queue_depth=1024
  # (optional) unmapOptions is a comma-separated list of unmap options.
  # For krbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
  # For nbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
  # unmapOptions: force
  # RBD image format. Defaults to "2".
  imageFormat: "2"
  # RBD image features, equivalent to OR'd bitfield value: 63
  # Available for imageFormat: "2". Older releases of CSI RBD
  # support only the `layering` feature. The Linux kernel (KRBD) supports the
  # full feature complement as of 5.4
  # imageFeatures: layering,fast-diff,object-map,deep-flatten,exclusive-lock
  imageFeatures: layering
  # The secrets contain Ceph admin credentials. These are generated automatically by the operator
  # in the same namespace as the cluster.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph # namespace:cluster
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph # namespace:cluster
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph # namespace:cluster
  # Specify the filesystem type of the volume. If not specified, csi-provisioner
  # will set default as `ext4`.
  csi.storage.k8s.io/fstype: ext4
  # uncomment the following to use rbd-nbd as mounter on supported nodes
  # **IMPORTANT**: CephCSI v3.4.0 onwards a volume healer functionality is added to reattach
  # the PVC to application pod if nodeplugin pod restart.
  # Its still in Alpha support. Therefore, this option is not recommended for production use.
  # mounter: rbd-nbd
allowVolumeExpansion: true
reclaimPolicy: Delete
Apply it and inspect the created StorageClasses:
[vagrant@master01 rbd]$ kubectl apply -f storageclass-ec.yaml
cephblockpool.ceph.rook.io/replicated-metadata-pool unchanged
cephblockpool.ceph.rook.io/ec-data-pool unchanged
storageclass.storage.k8s.io/rook-ceph-block-ec created
[vagrant@master01 rbd]$ kubectl get sc
NAME                 PROVISIONER                  RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
rook-ceph-block      rook-ceph.rbd.csi.ceph.com   Delete          Immediate           true                   14m
rook-ceph-block-ec   rook-ceph.rbd.csi.ceph.com   Delete          Immediate           true                   17s
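Consuming the erasure-coded class from a PVC works exactly as before; only storageClassName changes. A minimal sketch (the claim name is illustrative):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ec-block-claim   # illustrative name
spec:
  storageClassName: rook-ceph-block-ec
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi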
Providing Unformatted Block Storage
In the previous examples, the rbd image is formatted as ext4 by default; you can also switch to the xfs filesystem by changing csi.storage.k8s.io/fstype under parameters in the StorageClass, as sketched below.
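A sketch of the relevant fragment, assuming the rest of the StorageClass stays as in the replicated-pool example above:

parameters:
  # ... other parameters as in the earlier example ...
  csi.storage.k8s.io/fstype: xfs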
If the application is a virtual machine that needs an unformatted block device, declare the volume mode as volumeMode: Block when defining the PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
  labels:
    app: wordpress
spec:
  storageClassName: rook-ceph-block
  volumeMode: Block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
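A Pod then consumes the raw device through volumeDevices instead of volumeMounts (a minimal sketch; the Pod name, image, and device path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: block-consumer   # illustrative name
spec:
  containers:
    - name: app
      image: busybox     # illustrative image
      command: ["sleep", "infinity"]
      volumeDevices:
        # the unformatted device appears inside the container at this path
        - name: data
          devicePath: /dev/xvda
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: mysql-pv-claim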