Ceph Kraken 11.2.0 Deployment Notes

2017-04-06

This post records the full process of deploying Ceph Kraken in a test environment. In production we mainly use Ceph's RBD block storage as Kubernetes volumes, and Ceph's RGW object storage as the object store for various services, so both usage scenarios are covered at the end.

Environment Preparation

192.168.61.41 node1 - admin-node, deploy-node, mon, osd.0
192.168.61.42 node2 - mon, osd.1
192.168.61.43 node3 - mon, osd.2

Configure the Ceph yum repository on node1 in /etc/yum.repos.d/ceph.repo, choosing the kraken URL according to the GET PACKAGES page:

[ceph-noarch]
name=Ceph noarch packages
baseurl=https://download.ceph.com/rpm-kraken/el7/noarch
enabled=1
priority=2
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc
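
Note that the priority=2 setting only takes effect if the yum priorities plugin is installed; on CentOS 7 that is typically:

sudo yum install -y yum-plugin-priorities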

Install ceph-deploy:

yum install ceph-deploy

Create a deployment user sdsceph on every node, set a password, and make sure the user has passwordless sudo:

useradd -d /home/sdsceph -m sdsceph
passwd sdsceph

echo "sdsceph ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/sdsceph
sudo chmod 0440 /etc/sudoers.d/sdsceph

Disable requiretty so that the sdsceph user does not need a controlling terminal on each node: run visudo, find Defaults requiretty and change it to Defaults:sdsceph !requiretty.
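
After the change, the relevant line in visudo reads:

# was: Defaults    requiretty
Defaults:sdsceph !requiretty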

Set up passwordless SSH from the sdsceph user on node1 to all nodes; press Enter at the prompts to keep the passphrase empty:

su sdsceph
ssh-keygen

Copy the key to node1, node2 and node3:

ssh-copy-id sdsceph@node1
ssh-copy-id sdsceph@node2
ssh-copy-id sdsceph@node3

Edit ~/.ssh/config on node1 so that when no user is specified, logins to node2 and node3 use the sdsceph user:

Host node1
   Hostname node1
   User sdsceph
Host node2
   Hostname node2
   User sdsceph
Host node3
   Hostname node3
   User sdsceph
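
A quick way to verify the passwordless setup from node1 (each command should print the remote hostname without asking for a password):

ssh node2 hostname
ssh node3 hostname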

Cluster Initialization

Create the Cluster

ceph-deploy generates several configuration files while creating the cluster, so first create a directory ceph-cluster on node1 and cd into it:

su sdsceph
mkdir ~/ceph-cluster
cd ~/ceph-cluster

Running ceph-deploy new {initial-monitor-node(s)} creates a cluster named ceph, with a MON on each of node1 through node3:

ceph-deploy new node1 node2 node3

The following three files are generated in the ceph-cluster directory:

ceph.conf  ceph-deploy-ceph.log  ceph.mon.keyring
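
For reference, the generated ceph.conf should look roughly like the following in this environment (the fsid is a UUID randomly generated by ceph-deploy):

[global]
fsid = <randomly generated UUID>
mon_initial_members = node1, node2, node3
mon_host = 192.168.61.41,192.168.61.42,192.168.61.43
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx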

Install the Ceph Packages

Run the following from node1 to install ceph on all nodes:

su sdsceph
cd ~/ceph-cluster
ceph-deploy install node1 node2 node3 --release=kraken

Configure and Start the MON Nodes

Initialize and start the Ceph MON nodes and gather all the keys:

ceph-deploy mon create-initial

On success, the following keyrings appear in the current directory:

ceph.bootstrap-mds.keyring
ceph.bootstrap-osd.keyring
ceph.bootstrap-rgw.keyring
ceph.client.admin.keyring

Running ps -ef | grep ceph-mon on each node shows that the ceph-mon process has started.

Next, enable Ceph MON at boot by running the following on each node:

sudo systemctl enable ceph-mon.target
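
To double-check the service on an individual node, for example node1, note that the per-daemon unit is named after the host:

sudo systemctl status ceph-mon@node1
sudo systemctl is-enabled ceph-mon.target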

Add and Start the OSD Nodes

In this test environment, each of node1 through node3 has an unpartitioned, unformatted raw disk /dev/sdb; we will use these disks to create the OSDs.

sudo fdisk -l

Disk /dev/sdb: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Initialize (zap) the disks:

ceph-deploy disk zap node1:sdb
ceph-deploy disk zap node2:sdb
ceph-deploy disk zap node3:sdb

Prepare the OSDs, keeping data and journal on the same disk:

ceph-deploy osd prepare node1:sdb
ceph-deploy osd prepare node2:sdb
ceph-deploy osd prepare node3:sdb

sudo fdisk -l

Disk /dev/sdb: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt


#         Start          End    Size  Type            Name
 1     10487808    209715166     95G  unknown         ceph data
 2         2048     10487807      5G  unknown         ceph journal

sudo df -h | grep sdb
/dev/sdb1                 95G   33M   95G   1% /var/lib/ceph/tmp/mnt.nyHIPm

Activate the OSDs:

ceph-deploy osd activate node1:sdb1:sdb2
ceph-deploy osd activate node2:sdb1:sdb2
ceph-deploy osd activate node3:sdb1:sdb2

Run the following on each node to enable the Ceph OSDs at boot:

sudo systemctl enable ceph-osd.target
sudo systemctl enable ceph.target
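
The per-OSD units can be checked in the same way; on node1, for example, where the OSD id is 0:

sudo systemctl status ceph-osd@0
sudo systemctl is-enabled ceph-osd.target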

Check Cluster Status

Distribute the admin keyring by copying it to each node, so that the ceph CLI can be run without specifying the monitor address and ceph.client.admin.keyring every time:

ceph-deploy admin node1 node2 node3

Grant read permission on ceph.client.admin.keyring:

sudo chmod +r /etc/ceph/ceph.client.admin.keyring

Check the status of the OSD nodes in the cluster:

ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.27809 root default
-2 0.09270     host node1
 0 0.09270         osd.0       up  1.00000          1.00000
-3 0.09270     host node2
 1 0.09270         osd.1       up  1.00000          1.00000
-4 0.09270     host node3
 2 0.09270         osd.2       up  1.00000          1.00000

Check the cluster health:

ceph health
HEALTH_OK

Using Ceph RBD Block Storage

Newer Ceph releases enable many image features by default when an RBD image is created, and all of these features require kernel support; the CentOS 7 kernel supports only some of them, so mapping an image into the kernel fails with the error RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable". In the earlier post Ceph块存储之RBD we manually disabled the unsupported features on each individual image before mapping it. This time, to disable these features globally so that they are already turned off when an rbd image is created, we edit ceph.conf directly and add:

rbd_default_features = 1
  • The rbd_default_features = 1 setting above configures the default features; the value 1 is the integer value of the bit corresponding to the layering feature.

Next, push ceph.conf to /etc/ceph/ceph.conf on each node:

ceph-deploy --overwrite-conf config push node1 node2 node3

Our current use of Ceph RBD is to create a dedicated pool, create RBD images under it, and use them as Persistent Volumes for the Kubernetes cluster. See the earlier posts for details.
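
As a rough sketch of that workflow, something like the following can be used (the pool and image names here are only illustrative, not the ones used in production):

# Create a dedicated pool and an RBD image to back a Kubernetes PV
ceph osd pool create kube 128 128
rbd create kube/pv0001 --size 10240

# With rbd_default_features = 1 only the layering feature should be listed
rbd info kube/pv0001 | grep features

# Mapping the image on the CentOS 7 kernel should now succeed
sudo rbd map kube/pv0001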

Using Ceph RGW Object Storage

Next, we continue by deploying the Ceph RGW service in the test environment, using the civetweb frontend; for the nginx-based approach, see the earlier post Ceph对象存储之RGW.

192.168.61.41 node1 - admin-node, deploy-node, mon, osd.0, rgw
192.168.61.42 node2 - mon, osd.1, rgw
192.168.61.43 node3 - mon, osd.2, rgw

Install the Ceph RGW package on each RGW node:

ceph-deploy install --rgw  --release=kraken node1 node2 node3

The following error is reported:

file /etc/yum.repos.d/ceph.repo from install of ceph-release-1-1.el7.noarch conflicts with file from package ceph-release-1-1.el7.noarch

Remove the conflicting package:

yum remove ceph-release-1-1.el7.noarch

Reinstall:

ceph-deploy install --rgw  --release=kraken node1 node2 node3

Create and start the RGW instances:

ceph-deploy rgw create node1 node2 node3

Run the following on each node to enable Ceph RGW at boot:

sudo systemctl enable ceph-radosgw.target
sudo systemctl enable ceph.target

The RGW service starts on each node and listens on port 7480 by default:

sudo netstat -nltp | grep radosgw
tcp        0      0 0.0.0.0:7480            0.0.0.0:*               LISTEN      1066/radosgw

curl node1:7480
<?xml version="1.0" encoding="UTF-8"?>
<ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
	<Owner>
		<ID>anonymous</ID>
		<DisplayName></DisplayName>
	</Owner>
	<Buckets></Buckets>
</ListAllMyBucketsResult>

Create an S3 user:

radosgw-admin user create --uid=oper --display-name=oper --email=oper@oper.com

{
    "user_id": "oper",
    "display_name": "oper",
    "email": "oper@oper.com",
    "suspended": 0,
    "max_buckets": 1000,
    "auid": 0,
    "subusers": [],
    "keys": [
        {
            "user": "oper",
            "access_key": "JGD1S199DEMTQVMP435P",
            "secret_key": "iaw2K9BHowvvyrFBGRUTrNJgw2E9eE7qZLcIO7vJ"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw"
}
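
If the access_key and secret_key are needed again later, the user details can be re-displayed with:

radosgw-admin user info --uid=oper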

Install s3cmd on node1:

sudo yum install -y s3cmd

Next, configure s3cmd with the Access Key and Secret Key of the oper user created above:

s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: JGD1S199DEMTQVMP435P
Secret Key: iaw2K9BHowvvyrFBGRUTrNJgw2E9eE7qZLcIO7vJ
Default Region [US]:

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]: No

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:

New settings:
  Access Key: JGD1S199DEMTQVMP435P
  Secret Key: iaw2K9BHowvvyrFBGRUTrNJgw2E9eE7qZLcIO7vJ
  Default Region: US
  Encryption password:
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: False
  HTTP Proxy server name:
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] n

Save settings? [y/N] y
Configuration saved to '/home/sdsceph/.s3cfg'

Accessing data in S3 buckets usually relies on domain names, and some S3 clients need the bucket resources associated with a concrete domain before the S3 service can be used normally, so a wildcard DNS environment may need to be set up. For the s3cmd client on node1 we do not use wildcard DNS here. Once s3cmd has been configured, a .s3cfg file is generated in the sdsceph user's home directory; find the following lines in it:

host_base = s3.amazonaws.com
host_bucket = %(bucket)s.s3.amazonaws.com

Change them to:

host_base = node1:7480
host_bucket = node1:7480/%(bucket)

Below is a quick test of using the S3 service:

Create a bucket:

s3cmd mb s3://mybucket
Bucket 's3://mybucket/' created

Upload an object:

s3cmd put hello.txt s3://mybucket
upload: 'hello.txt' -> 's3://mybucket/hello.txt'  [1 of 1]
 12 of 12   100% in    1s     6.96 B/s  done

Download an object:

cd /tmp
s3cmd get s3://mybucket/hello.txt

Upload an object and make it publicly readable:

s3cmd put --acl-public hello.txt s3://mybucket/a/b/helloworld.txt
upload: 'hello.txt' -> 's3://mybucket/a/b/helloworld.txt'  [1 of 1]
 12 of 12   100% in    0s   148.96 B/s  done
Public URL of the object is: http://node1:7480/mybucket/a/b/helloworld.txt

curl http://node3:7480/mybucket/a/b/helloworld.txt
hello

List the objects:

s3cmd ls -r s3://mybucket
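
To clean up after this test, the uploaded objects and the bucket can be removed (the bucket must be empty before s3cmd rb will delete it):

s3cmd del s3://mybucket/hello.txt
s3cmd del s3://mybucket/a/b/helloworld.txt
s3cmd rb s3://mybucket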
