Installing and Using Ceph

Published 2023-06-16, updated 2023-06-20

Environment

CentOS 7, kernel 6.3.8-1.el7.elrepo.x86_64
Python-3.10.12
Docker-ce 20.10.9
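Ceph version 17.2.6 (the cephadm script release; the bootstrap log in this document pulls the quay.io/ceph/ceph:v15 image)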

Server environment

Server	IP	Spec	Role
`ceph-node-1`	10.111.30.100	CentOS 7, kernel 6.3.8-1, 2 cores, 3 GB RAM, 50 GB disk	`cephadm` node, `monitor daemon`
`ceph-node-2`	10.111.30.110	CentOS 7, kernel 6.3.8-1, 2 cores, 5 GB RAM, 50 GB disk
`ceph-node-3`	10.111.30.120	CentOS 7, kernel 6.3.8-1, 2 cores, 5 GB RAM, 50 GB disk

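Installation

This document installs the Ceph cluster with cephadm. cephadm first installs the first monitor daemon on the first node of the cluster, and the IP address that this monitor daemon uses to communicate with the cluster must be specified at installation time. ^[3]

Dependencies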

Python 3
Systemd
Docker
Time synchronization (such as chrony or NTP)
LVM2 for provisioning storage devices

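Configure the hostname of every cluster node in advance, and install Python 3 and Docker. chrony is installed automatically during cluster installation to handle time synchronization.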
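The dependency list can be checked on each node before going further. The following is a minimal sketch, not part of cephadm: `missing_deps` is a hypothetical helper name, and the mapping of dependencies to executable names (`python3`, `systemctl`, `docker`, `chronyc`, `lvcreate`) is an assumption.

```shell
# Hypothetical helper: print every command from the argument list that is
# not found on PATH. Prints nothing when all of them are present.
missing_deps() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Assumed executable names for the dependencies listed above.
# Run on each node before bootstrapping or adding it:
#   missing_deps python3 systemctl docker chronyc lvcreate
```

An empty result suggests the node is ready. `cephadm bootstrap` performs its own host check as well, so this only surfaces problems earlier.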

Configure the firewall on each node so that the nodes can reach one another over the network.
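One possible way to open traffic between the nodes, assuming the hosts run firewalld (the original setup does not say which firewall is used). `trust_peers_cmds` is a hypothetical helper that only prints the `firewall-cmd` invocations so they can be reviewed first.

```shell
# Hypothetical helper: print (not run) firewalld commands that put each
# peer address into the "trusted" zone, then reload the firewall.
trust_peers_cmds() {
    for ip in "$@"; do
        printf 'firewall-cmd --permanent --zone=trusted --add-source=%s\n' "$ip"
    done
    printf 'firewall-cmd --reload\n'
}

# Review the output, then pipe it to sh as root on every node:
#   trust_peers_cmds 10.111.30.100 10.111.30.110 10.111.30.120 | sh
```

Trusting whole peer addresses is coarse. A stricter setup would open only the ports the cluster actually listens on.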

Installing cephadm

Use curl to fetch the latest release of cephadm ^[1]

CEPH_RELEASE=17.2.6
curl --silent --remote-name --location https://download.ceph.com/rpm-${CEPH_RELEASE}/el9/noarch/cephadm
chmod +x cephadm

Install cephadm onto the host system. No repo is provided for the latest release on CentOS 7, so the octopus repo is added instead.

./cephadm add-repo --release octopus

rpm --import 'https://download.ceph.com/keys/release.asc'

./cephadm install

Check the path of the installed cephadm command

$ which cephadm
/usr/sbin/cephadm

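Bootstrapping the first cluster node with cephadm bootstrap

cephadm bootstrap initializes the first node of the cluster and installs the cluster's first monitor daemon. The IP address used for cluster communication must be specified. Run the following command ^[3]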

$ cephadm bootstrap --mon-ip 10.111.30.100
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: e2b9a77e-0c23-11ee-9e9d-000c29687fa4
Verifying IP 10.111.30.100 port 3300 ...
Verifying IP 10.111.30.100 port 6789 ...
Mon IP 10.111.30.100 is in CIDR network 10.111.30.0/24
Pulling container image quay.io/ceph/ceph:v15...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network...
Creating mgr...
Verifying port 9283 ...
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Wrote config to /etc/ceph/ceph.conf
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/10)...
mgr not available, waiting (2/10)...
mgr not available, waiting (3/10)...
mgr not available, waiting (4/10)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 5...
Mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to to /etc/ceph/ceph.pub
Adding key to root@localhost's authorized_keys...
Adding host ceph-node-1...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Enabling mgr prometheus module...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 13...
Mgr epoch 13 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:

	     URL: https://ceph-node-1:8443/
	    User: admin
	Password: 5sn85szxmw

You can access the Ceph CLI with:

	sudo /usr/sbin/cephadm shell --fsid e2b9a77e-0c23-11ee-9e9d-000c29687fa4 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Please consider enabling telemetry to help improve Ceph:

	ceph telemetry on

For more information see:

	https://docs.ceph.com/docs/master/mgr/telemetry/

Bootstrap complete.

The cephadm bootstrap --mon-ip 10.111.30.100 command performs the following operations ^[4]

Creates the first monitor and manager daemons on the host
Generates an SSH key pair for the cluster and adds the public key to the root user's /root/.ssh/authorized_keys file
Writes the public key to /etc/ceph/ceph.pub
Generates a minimal cluster configuration file and writes it to /etc/ceph/ceph.conf
Writes the secret key with administrator privileges to /etc/ceph/ceph.client.admin.keyring
Adds the _admin label to the bootstrap host. By default, any host with this label also gets a copy of /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring
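The files named in the list above can be verified after bootstrap. A minimal sketch, where `missing_bootstrap_files` is a hypothetical helper and the three file names are taken from the list:

```shell
# Hypothetical helper: print the bootstrap artifacts missing from a
# directory (defaults to /etc/ceph, the location named in the list above).
missing_bootstrap_files() {
    dir="${1:-/etc/ceph}"
    for f in ceph.pub ceph.conf ceph.client.admin.keyring; do
        [ -f "$dir/$f" ] || printf '%s\n' "$dir/$f"
    done
}

# Usage on the bootstrap host:
#   missing_bootstrap_files
```

No output means all three files are in place.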

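The cephadm bootstrap --mon-ip 10.111.30.100 command does not install any Ceph packages on the host. After the cluster is initialized, cephadm pulls Docker images on the node and starts the corresponding Docker containers to run the services, including ceph, Prometheus, Grafana, and alertmanager.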

$ docker ps -a
CONTAINER ID   IMAGE                                      COMMAND                  CREATED          STATUS          PORTS     NAMES
48657ea7b022   quay.io/ceph/ceph-grafana:6.7.4            "/bin/sh -c 'grafana…"   22 minutes ago   Up 22 minutes             ceph-e2b9a77e-0c23-11ee-9e9d-000c29687fa4-grafana.ceph-node-1
cc704589a17f   quay.io/prometheus/alertmanager:v0.20.0    "/bin/alertmanager -…"   22 minutes ago   Up 22 minutes             ceph-e2b9a77e-0c23-11ee-9e9d-000c29687fa4-alertmanager.ceph-node-1
86f95aba68c0   quay.io/prometheus/prometheus:v2.18.1      "/bin/prometheus --c…"   22 minutes ago   Up 22 minutes             ceph-e2b9a77e-0c23-11ee-9e9d-000c29687fa4-prometheus.ceph-node-1
9b6801c15353   quay.io/prometheus/node-exporter:v0.18.1   "/bin/node_exporter …"   23 minutes ago   Up 23 minutes             ceph-e2b9a77e-0c23-11ee-9e9d-000c29687fa4-node-exporter.ceph-node-1
459bf96f7646   quay.io/ceph/ceph:v15                      "/usr/bin/ceph-crash…"   29 minutes ago   Up 29 minutes             ceph-e2b9a77e-0c23-11ee-9e9d-000c29687fa4-crash.ceph-node-1
0399340209dc   quay.io/ceph/ceph:v15                      "/usr/bin/ceph-mgr -…"   31 minutes ago   Up 31 minutes             ceph-e2b9a77e-0c23-11ee-9e9d-000c29687fa4-mgr.ceph-node-1.bswaqn
93b3f483d33b   quay.io/ceph/ceph:v15                      "/usr/bin/ceph-mon -…"   31 minutes ago   Up 31 minutes             ceph-e2b9a77e-0c23-11ee-9e9d-000c29687fa4-mon.ceph-node-1

The containers use the Docker host network type, so they share the root network namespace with the host.

The listening ports are as follows:

$ netstat -anutp | grep -v "ESTABLISHED"
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 10.111.30.100:3300      0.0.0.0:*               LISTEN      77870/ceph-mon      
tcp        0      0 0.0.0.0:6800            0.0.0.0:*               LISTEN      78076/ceph-mgr      
tcp        0      0 0.0.0.0:6801            0.0.0.0:*               LISTEN      78076/ceph-mgr      
tcp        0      0 10.111.30.100:6789      0.0.0.0:*               LISTEN      77870/ceph-mon      
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1066/sshd           
tcp6       0      0 :::3000                 :::*                    LISTEN      84486/grafana-serve 
tcp6       0      0 :::9100                 :::*                    LISTEN      83277/node_exporter 
tcp6       0      0 :::9093                 :::*                    LISTEN      84235/alertmanager  
tcp6       0      0 :::9094                 :::*                    LISTEN      84235/alertmanager  
tcp6       0      0 :::9095                 :::*                    LISTEN      83602/prometheus    
tcp6       0      0 :::8443                 :::*                    LISTEN      78076/ceph-mgr      
tcp6       0      0 :::22                   :::*                    LISTEN      1066/sshd           
tcp6       0      0 :::9283                 :::*                    LISTEN      78076/ceph-mgr      
udp        0      0 127.0.0.1:323           0.0.0.0:*                           76627/chronyd       
udp6       0      0 :::9094                 :::*                                84235/alertmanager  
udp6       0      0 ::1:323                 :::*                                76627/chronyd

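After a successful deployment, log in to the Ceph Dashboard at https://ceph-node-1:8443/. The username and password appear in the output of cephadm bootstrap --mon-ip 10.111.30.100, and the password must be changed at first login.

After a successful deployment, the following commands show the cluster status and let you operate and manage the cluster.

Run cephadm shell. This command starts a container from the quay.io/ceph/ceph image and opens a bash session inside it. The container mounts the configuration files under /etc/ceph/ on the host to connect to the cluster. ^[5]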

$ cephadm shell
Inferring fsid e2b9a77e-0c23-11ee-9e9d-000c29687fa4
Inferring config /var/lib/ceph/e2b9a77e-0c23-11ee-9e9d-000c29687fa4/mon.ceph-node-1/config
Using recent ceph image quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
[ceph: root@ceph-node-1 /]# 
[ceph: root@ceph-node-1 /]# 
[ceph: root@ceph-node-1 /]# ceph status
  cluster:
    id:     e2b9a77e-0c23-11ee-9e9d-000c29687fa4
    health: HEALTH_WARN
            Reduced data availability: 1 pg inactive
            OSD count 0 < osd_pool_default_size 3
 
  services:
    mon: 1 daemons, quorum ceph-node-1 (age 2d)
    mgr: ceph-node-1.bswaqn(active, since 2d)
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     100.000% pgs unknown
             1 unknown
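For scripting, the health code can be pulled out of the status text shown above. A minimal sketch: `cluster_health` is a hypothetical helper, and it assumes the `health:` line keeps the layout shown in this output.

```shell
# Hypothetical helper: read `ceph status` text on stdin and print the
# health code (for example HEALTH_OK or HEALTH_WARN).
cluster_health() {
    awk '$1 == "health:" { print $2; exit }'
}

# Inside `cephadm shell`:
#   ceph status | cluster_health
```

HEALTH_WARN is expected at this stage, since no OSDs have been added yet.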

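Adding nodes to the cluster

A node being added must satisfy the dependencies, otherwise the add operation fails ^[6]

On the first node, run the following commands to install the cluster's SSH public key into /root/.ssh/authorized_keys on each new node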
ssh-copy-id -f -i /etc/ceph/ceph.pub root@10.111.30.110
ssh-copy-id -f -i /etc/ceph/ceph.pub root@10.111.30.120

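Add the new nodes to the cluster. The --labels _admin option copies the key with administrator privileges on the first node (/etc/ceph/ceph.client.admin.keyring) to the newly added host, which makes it convenient to use cephadm shell on that host later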

[ceph: root@ceph-node-1 /]# ceph orch host add ceph-node-2 10.111.30.110 --labels _admin
Added host 'ceph-node-2'
[ceph: root@ceph-node-1 /]# ceph orch host add ceph-node-3 10.111.30.120 --labels _admin
Added host 'ceph-node-3'
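With more than a couple of nodes, the two steps above repeat per host. The sketch below only prints the commands used in this section. `add_host_cmds` and its `hostname=ip` argument format are hypothetical, chosen for illustration.

```shell
# Hypothetical helper: for each "hostname=ip" argument, print the two
# commands used above: install the cluster SSH key, then add the host.
add_host_cmds() {
    for pair in "$@"; do
        host="${pair%%=*}"
        ip="${pair#*=}"
        printf 'ssh-copy-id -f -i /etc/ceph/ceph.pub root@%s\n' "$ip"
        printf 'ceph orch host add %s %s --labels _admin\n' "$host" "$ip"
    done
}

# Usage:
#   add_host_cmds ceph-node-2=10.111.30.110 ceph-node-3=10.111.30.120
```

The `ssh-copy-id` lines are meant for the host shell of the first node, while the `ceph orch host add` lines are meant for the `cephadm shell` prompt, as in the transcript above.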

Check the cluster status after adding the nodes

[ceph: root@ceph-node-1 /]# ceph status
  cluster:
    id:     bcff3e7c-0f2f-11ee-afdf-000c29687fa4
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3
 
  services:
    mon: 3 daemons, quorum ceph-node-1,ceph-node-3,ceph-node-2 (age 7m)
    mgr: ceph-node-1.dfbrag(active, since 8m), standbys: ceph-node-2.lnnzmd
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

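Removing a node from the cluster

To remove a node from the cluster, first remove all daemons from the node by running the following command ^[7]

[ceph: root@ceph-node-1 /]# ceph orch host drain <host>

This command adds the _no_schedule label to the node, and all OSDs on the node are scheduled for removal. Use the following command to check the progress of the OSD removal (migration)

[ceph: root@ceph-node-1 /]# ceph orch osd rm status

Use the following command to check whether any daemons remain on the node

[ceph: root@ceph-node-1 /]# ceph orch ps <host>

Once all daemons have been removed, run the following command to delete the node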

[ceph: root@ceph-node-1 /]# ceph orch host rm <host>
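Deciding when a drained node is safe to remove comes down to counting its remaining daemons. A minimal sketch: `daemons_on_host` is a hypothetical helper, and it assumes the `ceph orch ps` column layout shown elsewhere in this document (header row first, HOST in the second column).

```shell
# Hypothetical helper: read `ceph orch ps` output on stdin and count the
# daemon rows whose HOST column (second field) equals the given host.
daemons_on_host() {
    awk -v host="$1" 'NR > 1 && $2 == host { n++ } END { print n + 0 }'
}

# Inside `cephadm shell`, after draining the host:
#   ceph orch ps | daemons_on_host ceph-node-3
```

A count of 0 suggests the node can be removed with `ceph orch host rm`.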

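Adding storage

A storage device must meet the following conditions before it can join the cluster as an OSD ^[8]

It must be a raw device with no partitions
It must not have any LVM state
It must not contain a file system
It must not contain a Ceph BlueStore OSD
It must be larger than 5 GB

In this example, a 20 GB raw disk was attached to each node for OSD storage. To add every available storage device (disk) to the cluster, use the following approach

Run the following command to add all available storage devices on the nodes as OSDs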

[ceph: root@ceph-node-1 /]# ceph orch apply osd --all-available-devices
Scheduled osd.all-available-devices update...

Alternatively, add a specific disk on a specific node with the following commands ^[8]

[ceph: root@ceph-node-1 /]# lsblk 
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0  100G  0 disk 
|-sda1            8:1    0    1G  0 part /rootfs/boot
`-sda2            8:2    0   99G  0 part 
  |-centos-root 253:0    0   50G  0 lvm  /rootfs
  |-centos-swap 253:1    0    2G  0 lvm  [SWAP]
  `-centos-home 253:2    0   47G  0 lvm  /rootfs/home
sdb               8:16   0   20G  0 disk

[ceph: root@ceph-node-1 /]# ceph orch device ls
HOST         PATH      TYPE  DEVICE ID   SIZE  AVAILABLE  REFRESHED  REJECT REASONS  
ceph-node-1  /dev/sdb  hdd              21.4G  Yes        18m ago                    
ceph-node-2  /dev/sdb  hdd              21.4G  Yes        23m ago                    
ceph-node-3  /dev/sdb  hdd              21.4G  Yes        22m ago


[ceph: root@ceph-node-1 /]# ceph orch daemon add osd ceph-node-1:/dev/sdb
Created osd(s) 0 on host 'ceph-node-1'

[ceph: root@ceph-node-1 /]# ceph orch daemon add osd ceph-node-2:/dev/sdb
Created osd(s) 1 on host 'ceph-node-2'
[ceph: root@ceph-node-1 /]# ceph orch daemon add osd ceph-node-3:/dev/sdb
Created osd(s) 2 on host 'ceph-node-3'
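The per-device commands above can be derived from the device listing. This is a rough sketch under stated assumptions: `osd_add_cmds` is a hypothetical helper, and it relies on the `ceph orch device ls` layout shown above (HOST and PATH as the first two columns, the literal word `Yes` in the AVAILABLE column). Other Ceph releases may format this table differently.

```shell
# Hypothetical helper: read `ceph orch device ls` output on stdin and print
# one `ceph orch daemon add osd` command per device marked available.
# Assumes HOST and PATH are the first two columns and that the AVAILABLE
# column holds the literal word "Yes", as in the listing above.
osd_add_cmds() {
    awk 'NR > 1 {
        for (i = 3; i <= NF; i++)
            if ($i == "Yes") {
                printf "ceph orch daemon add osd %s:%s\n", $1, $2
                break
            }
    }'
}

# Inside `cephadm shell`:
#   ceph orch device ls | osd_add_cmds
```

The commands are printed rather than executed, so the device list can be reviewed before any disk is consumed.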


[ceph: root@ceph-node-1 /]# ceph status
  cluster:
    id:     bcff3e7c-0f2f-11ee-afdf-000c29687fa4
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum ceph-node-1,ceph-node-3,ceph-node-2 (age 22m)
    mgr: ceph-node-1.dfbrag(active, since 22m), standbys: ceph-node-2.lnnzmd
    osd: 3 osds: 3 up (since 50s), 3 in (since 70s)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 449 KiB
    usage:   62 MiB used, 60 GiB / 60 GiB avail
    pgs:     1 active+clean

Common commands

Viewing node information

List all nodes in the cluster

[ceph: root@ceph-node-1 /]# ceph orch host ls
HOST         ADDR           LABELS  STATUS  
ceph-node-1  ceph-node-1                    
ceph-node-2  10.111.30.110                  
ceph-node-3  10.111.30.120

Viewing components running in the cluster

Show the status of all components (daemons)

[ceph: root@ceph-node-1 /]# ceph orch ps
NAME                       HOST         STATUS         REFRESHED  AGE  VERSION  IMAGE NAME                                IMAGE ID      CONTAINER ID  
alertmanager.ceph-node-1   ceph-node-1  running (42m)  2m ago     2d   0.20.0   quay.io/prometheus/alertmanager:v0.20.0   0881eb8f169f  d14854792fd0  
crash.ceph-node-1          ceph-node-1  running (2d)   2m ago     2d   15.2.17  quay.io/ceph/ceph:v15                     93146564743f  459bf96f7646  
crash.ceph-node-2          ceph-node-2  running (43m)  2m ago     43m  15.2.17  quay.io/ceph/ceph:v15                     93146564743f  a6fcd737a77e  
crash.ceph-node-3          ceph-node-3  running (5h)   2m ago     5h   15.2.17  quay.io/ceph/ceph:v15                     93146564743f  100dfe6be1c6  
grafana.ceph-node-1        ceph-node-1  running (2d)   2m ago     2d   6.7.4    quay.io/ceph/ceph-grafana:6.7.4           557c83e11646  48657ea7b022  
mgr.ceph-node-1.bswaqn     ceph-node-1  running (2d)   2m ago     2d   15.2.17  quay.io/ceph/ceph:v15                     93146564743f  0399340209dc  
mgr.ceph-node-2.jmyqrh     ceph-node-2  running (42m)  2m ago     42m  15.2.17  quay.io/ceph/ceph:v15                     93146564743f  52bf4fd6dc85  
mon.ceph-node-1            ceph-node-1  running (2d)   2m ago     2d   15.2.17  quay.io/ceph/ceph:v15                     93146564743f  93b3f483d33b  
mon.ceph-node-2            ceph-node-2  running (42m)  2m ago     42m  15.2.17  quay.io/ceph/ceph:v15                     93146564743f  429a884250d0  
mon.ceph-node-3            ceph-node-3  running (5h)   2m ago     5h   15.2.17  quay.io/ceph/ceph:v15                     93146564743f  c1a83b532057  
node-exporter.ceph-node-1  ceph-node-1  running (2d)   2m ago     2d   0.18.1   quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  9b6801c15353  
node-exporter.ceph-node-2  ceph-node-2  running (42m)  2m ago     42m  0.18.1   quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  a4a1551d8512  
node-exporter.ceph-node-3  ceph-node-3  running (5h)   2m ago     5h   0.18.1   quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  807e73c74634  
prometheus.ceph-node-1     ceph-node-1  running (42m)  2m ago     2d   2.18.1   quay.io/prometheus/prometheus:v2.18.1     de242295e225  51f52e64da3c

Show the status of one type of component

[ceph: root@ceph-node-1 /]# ceph orch ps --daemon-type mon
NAME             HOST         STATUS        REFRESHED  AGE  VERSION  IMAGE NAME             IMAGE ID      CONTAINER ID  
mon.ceph-node-1  ceph-node-1  running (2d)  7m ago     2d   15.2.17  quay.io/ceph/ceph:v15  93146564743f  93b3f483d33b  
mon.ceph-node-3  ceph-node-3  running (3h)  7m ago     3h   15.2.17  quay.io/ceph/ceph:v15  93146564743f  c1a83b532057

Listing service status

List the status of all services

[ceph: root@ceph-node-1 /]# ceph orch ls
NAME           RUNNING  REFRESHED  AGE  PLACEMENT                                    IMAGE NAME                                IMAGE ID      
alertmanager       1/1  116s ago   2d   count:1                                      quay.io/prometheus/alertmanager:v0.20.0   0881eb8f169f  
crash              3/3  117s ago   2d   *                                            quay.io/ceph/ceph:v15                     93146564743f  
grafana            1/1  116s ago   2d   count:1                                      quay.io/ceph/ceph-grafana:6.7.4           557c83e11646  
mgr                2/2  117s ago   2d   count:2                                      quay.io/ceph/ceph:v15                     93146564743f  
mon                3/3  117s ago   97m  ceph-node-1;ceph-node-2;ceph-node-3;count:3  quay.io/ceph/ceph:v15                     93146564743f  
node-exporter      3/3  117s ago   2d   *                                            quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  
prometheus         1/1  116s ago   2d   count:1                                      quay.io/prometheus/prometheus:v2.18.1     de242295e225
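The RUNNING column pairs the number of running daemons with the number expected, so services that are not fully up can be filtered out of this listing. A minimal sketch: `degraded_services` is a hypothetical helper, and it assumes RUNNING stays the second column in `running/expected` form.

```shell
# Hypothetical helper: read `ceph orch ls` output on stdin and print the
# services whose RUNNING column ("running/expected") is not fully satisfied.
degraded_services() {
    awk 'NR > 1 {
        split($2, c, "/")
        if (c[1] != c[2]) print $1, $2
    }'
}

# Inside `cephadm shell`:
#   ceph orch ls | degraded_services
```

For the listing above this prints nothing, since every service shows matching counts.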

List the status of a single service

[ceph: root@ceph-node-1 /]# ceph orch ls mon   
NAME  RUNNING  REFRESHED  AGE  PLACEMENT                                    IMAGE NAME             IMAGE ID      
mon       3/3  4m ago     99m  ceph-node-1;ceph-node-2;ceph-node-3;count:3  quay.io/ceph/ceph:v15  93146564743f

[ceph: root@ceph-node-1 /]# ceph orch ls mgr
NAME  RUNNING  REFRESHED  AGE  PLACEMENT  IMAGE NAME             IMAGE ID      
mgr       2/2  5m ago     2d   count:2    quay.io/ceph/ceph:v15  93146564743f

Common cluster operations

Setting the number of Mons and Mgrs in the cluster

By default a Ceph cluster starts 5 Mons and 2 Mgrs. To change this, use the following command

[ceph: root@ceph-node-1 /]# ceph orch apply mon --placement="3 ceph-node-1 ceph-node-2 ceph-node-3"
Scheduled mon update...

Removing all daemons of the cluster

./cephadm rm-cluster --fsid e2b9a77e-0c23-11ee-9e9d-000c29687fa4 --force
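The fsid passed to rm-cluster does not have to be copied by hand, since it is also stored in the cluster configuration file. A minimal sketch: `conf_fsid` is a hypothetical helper, and it assumes ceph.conf contains a line of the form `fsid = <uuid>`.

```shell
# Hypothetical helper: print the fsid recorded in a ceph.conf file
# (defaults to /etc/ceph/ceph.conf). Assumes a line of the form
# "fsid = <uuid>", optionally indented.
conf_fsid() {
    sed -n 's/^[[:space:]]*fsid[[:space:]]*=[[:space:]]*//p' "${1:-/etc/ceph/ceph.conf}" | head -n 1
}

# Usage on the host (destructive, double-check the printed fsid first):
#   conf_fsid
#   ./cephadm rm-cluster --fsid "$(conf_fsid)" --force
```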

Common errors

No module named '_ssl'

After installing cephadm, running the command fails with an error

$ ./cephadm 
Traceback (most recent call last):
  File "/root/./cephadm", line 27, in <module>
    import ssl
  File "/usr/local/python3/lib/python3.10/ssl.py", line 99, in <module>
    import _ssl             # if we can't import it, let the error propagate
ModuleNotFoundError: No module named '_ssl'

Cause and solution
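The traceback shows a Python installed under /usr/local/python3 whose ssl module cannot load its _ssl extension. This usually indicates a Python built without usable OpenSSL headers and libraries, although the original notes do not record the cause for this host. The following sketch checks whether an interpreter can import ssl. `python_has_ssl` is a hypothetical helper, and the interpreter path in the usage line is an assumption inferred from the traceback.

```shell
# Hypothetical helper: succeed only if the given Python interpreter can
# import the ssl module; on success print the OpenSSL version it links to.
python_has_ssl() {
    "${1:-python3}" -c 'import ssl; print(ssl.OPENSSL_VERSION)' 2>/dev/null
}

# Usage (the interpreter path is an assumption based on the traceback):
#   python_has_ssl /usr/local/python3/bin/python3 || echo "ssl module unavailable"
```

If the check fails, rebuilding Python against a suitable OpenSSL is the usual remedy. Python 3.10 requires OpenSSL 1.1.1 or newer.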

Ceph does not support pacific or later for this version of this linux distro and therefore cannot add a repo for it

After downloading cephadm, adding the repo in order to install cephadm onto the host fails with an error

$ ./cephadm add-repo --release quincy
ERROR: Ceph does not support pacific or later for this version of this linux distro and therefore cannot add a repo for it

Install the octopus repo instead. Releases after octopus do not provide a repo for CentOS 7.

$ ./cephadm add-repo --release octopus
Writing repo to /etc/yum.repos.d/ceph.repo...
Enabling EPEL...
Completed adding repo.

Failed command: yum install -y cephadm