Installing Kubernetes 1.9 with kubeadm
📅 2017-12-16
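kubeadm is the official Kubernetes tool for quickly installing a Kubernetes cluster. It is updated with every Kubernetes release, and each release adjusts some of its cluster-configuration practices, so experimenting with kubeadm is a good way to learn the project's latest recommended practices for cluster configuration.
The official Kubernetes documentation changes quickly. The Kubernetes 1.9 document "Using kubeadm to Create a Cluster" states that the main features of kubeadm 1.9 are now in beta and will reach GA in 2018, which means kubeadm is getting closer and closer to being usable in production.
Our stable production Kubernetes clusters are, of course, highly available clusters deployed from binaries with ansible. We are trying out kubeadm in Kubernetes 1.9 here to follow the official best practices for cluster initialization and configuration, and to further improve our ansible deployment scripts.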
1. Preparation
1.1 System configuration
Before installing, make the following preparations. The two CentOS 7.4 hosts are:
cat /etc/hosts
192.168.61.11 node1
192.168.61.12 node2
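If the firewall is enabled on the hosts, the ports required by the Kubernetes components must be opened; see the "Check required ports" section of Installing kubeadm. For simplicity, we disable the firewall on every node:
systemctl stop firewalld
systemctl disable firewalld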
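Disable SELinux. setenforce 0 takes effect immediately; the setting in /etc/selinux/config applies from the next reboot:
setenforce 0
vi /etc/selinux/config
SELINUX=disabled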
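The edit to /etc/selinux/config can also be scripted instead of made in vi. The following is a minimal sketch, assuming GNU sed (as shipped with CentOS 7). The function name disable_selinux_in_config and the scratch file are illustrative and not part of the original steps; the sketch runs against a temporary file so that nothing on the host changes:

```shell
#!/usr/bin/env bash
# Sketch: set SELINUX=disabled in an SELinux config file without an editor.
# disable_selinux_in_config and demo_config are hypothetical names.
set -euo pipefail

disable_selinux_in_config() {
  local config_file="$1"
  # Rewrite only the active SELINUX= line; comments and SELINUXTYPE= are left alone.
  sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config_file"
}

# Demonstrate on a scratch file rather than the real /etc/selinux/config.
demo_config="$(mktemp)"
printf '%s\n' \
  '# SELINUX= can take one of: enforcing, permissive, disabled' \
  'SELINUX=enforcing' \
  'SELINUXTYPE=targeted' > "$demo_config"

disable_selinux_in_config "$demo_config"
cat "$demo_config"
```

To apply it for real, the same function could be pointed at /etc/selinux/config as root.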
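Create the file /etc/sysctl.d/k8s.conf with the following content:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
Run sysctl -p /etc/sysctl.d/k8s.conf to apply the change. If sysctl reports that these keys do not exist, load the br_netfilter kernel module first (modprobe br_netfilter).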
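Before applying the file, it can be useful to check that both keys are really set to 1. This is a small sketch of such a check; the function name check_bridge_sysctl is an assumption for illustration, and the demo uses a temporary file rather than /etc/sysctl.d/k8s.conf:

```shell
#!/usr/bin/env bash
# Sketch: verify that a sysctl file sets both bridge-netfilter keys to 1.
# check_bridge_sysctl and demo_conf are hypothetical names.
set -euo pipefail

check_bridge_sysctl() {
  local conf_file="$1" key
  for key in net.bridge.bridge-nf-call-ip6tables net.bridge.bridge-nf-call-iptables; do
    # Allow optional whitespace around "=", as in the file shown above.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*1[[:space:]]*$" "$conf_file"; then
      echo "missing or not 1: ${key}"
      return 1
    fi
  done
  echo "ok"
}

demo_conf="$(mktemp)"
printf '%s\n' \
  'net.bridge.bridge-nf-call-ip6tables = 1' \
  'net.bridge.bridge-nf-call-iptables = 1' > "$demo_conf"

check_bridge_sysctl "$demo_conf"
```

On a node, the function would be called with /etc/sysctl.d/k8s.conf as its argument.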
1.2 Installing Docker
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo
List the available Docker versions:
yum list docker-ce.x86_64 --showduplicates | sort -r
docker-ce.x86_64 17.09.0.ce-1.el7.centos docker-ce-stable
docker-ce.x86_64 17.06.2.ce-1.el7.centos docker-ce-stable
docker-ce.x86_64 17.06.1.ce-1.el7.centos docker-ce-stable
docker-ce.x86_64 17.06.0.ce-1.el7.centos docker-ce-stable
docker-ce.x86_64 17.03.2.ce-1.el7.centos docker-ce-stable
docker-ce.x86_64 17.03.1.ce-1.el7.centos docker-ce-stable
docker-ce.x86_64 17.03.0.ce-1.el7.centos docker-ce-stable
Kubernetes 1.8 was validated against Docker 1.11.2, 1.12.6, 1.13.1 and 17.03, so we install Docker 17.03.2 on every node.
yum makecache fast

yum install -y --setopt=obsoletes=0 \
    docker-ce-17.03.2.ce-1.el7.centos \
    docker-ce-selinux-17.03.2.ce-1.el7.centos

systemctl start docker
systemctl enable docker
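Starting with version 1.13, Docker changed its default firewall rules and set the default policy of the FORWARD chain in the iptables filter table to DROP. This prevents Pods on different Nodes from communicating in a Kubernetes cluster. Run the following command on every Docker node:
iptables -P FORWARD ACCEPT
To make this persistent, add the command to Docker's systemd unit file as an ExecStartPost entry:
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
systemctl daemon-reload
systemctl restart docker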
2. Installing kubeadm and kubelet
Install kubeadm and kubelet on every node:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
        https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
Check whether https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64 is reachable. If it is not, you will need to go through a proxy.
curl https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
yum makecache fast
yum install -y kubelet kubeadm kubectl

...
Installed:
  kubeadm.x86_64 0:1.9.0-0 kubectl.x86_64 0:1.9.0-0 kubelet.x86_64 0:1.9.0-0

Dependency Installed:
  kubernetes-cni.x86_64 0:0.6.0-0 socat.x86_64 0:1.7.3.2-2.el7
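- The install output shows that two dependencies, kubernetes-cni and socat, were installed as well:
  * The CNI version that official Kubernetes 1.9 depends on has been upgraded to 0.6.0
  * socat is a dependency of kubelet
- In our earlier Kubernetes 1.6 high-availability cluster deployment we installed these two dependencies by hand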
The Kubernetes documentation describes this kubelet startup flag:
--cgroup-driver string Driver that the kubelet uses to manipulate cgroups on the host.
                       Possible values: 'cgroupfs', 'systemd' (default "cgroupfs")
The default is cgroupfs, but the 10-kubeadm.conf file generated when kubelet and kubeadm are installed with yum changes this value to systemd.
Look at kubelet's /etc/systemd/system/kubelet.service.d/10-kubeadm.conf file, which contains the following:
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
Print the Docker information with docker info:
docker info
......
Server Version: 17.03.2-ce
......
Cgroup Driver: cgroupfs
This shows that Docker 17.03 uses cgroupfs as its Cgroup Driver.
We therefore change Docker's cgroup driver on every node to match kubelet's: edit or create /etc/docker/daemon.json and add the following:
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
Restart Docker:
systemctl restart docker
systemctl status docker
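To confirm that Docker and kubelet agree on the cgroup driver, the two values can be extracted and compared. The sketch below is an illustration only: the function names docker_cgroup_driver and kubelet_cgroup_driver are assumptions, and the sample inputs stand in for the real docker info output and the 10-kubeadm.conf file:

```shell
#!/usr/bin/env bash
# Sketch: compare the cgroup driver reported by Docker with the one set for kubelet.
# Function names and sample inputs are hypothetical.
set -euo pipefail

# Reads `docker info` output on stdin and prints the "Cgroup Driver" value.
docker_cgroup_driver() {
  awk -F': *' '/^ *Cgroup Driver:/ {print $2}'
}

# Reads a kubelet drop-in such as 10-kubeadm.conf on stdin and prints
# the value passed to --cgroup-driver.
kubelet_cgroup_driver() {
  sed -n 's/.*--cgroup-driver=\([a-z]*\).*/\1/p'
}

# Sample inputs; on a node, pipe in `docker info` and the drop-in file instead.
docker_driver="$(printf 'Server Version: 17.03.2-ce\nCgroup Driver: systemd\n' | docker_cgroup_driver)"
kubelet_driver="$(printf 'Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"\n' | kubelet_cgroup_driver)"

if [ "$docker_driver" = "$kubelet_driver" ]; then
  echo "cgroup drivers match: $docker_driver"
else
  echo "cgroup driver mismatch: docker=$docker_driver kubelet=$kubelet_driver"
fi
```

On a node the inputs would come from docker info and from /etc/systemd/system/kubelet.service.d/10-kubeadm.conf.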
Enable the kubelet service to start at boot on every node:
systemctl enable kubelet.service
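Starting with Kubernetes 1.8, swap must be turned off on the system; otherwise kubelet will not start under the default configuration. This restriction can be lifted with the kubelet startup flag --fail-swap-on=false.
To turn off swap:
swapoff -a
Edit /etc/fstab and comment out the swap mount entry, then use free -m to confirm that swap is off.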
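Commenting out the swap entry in /etc/fstab can be done with sed as well. This is a minimal sketch assuming GNU sed; the function name comment_out_swap and the sample fstab lines are illustrative, and the demo works on a temporary file rather than the real /etc/fstab:

```shell
#!/usr/bin/env bash
# Sketch: comment out active fstab entries whose filesystem type is "swap".
# comment_out_swap and demo_fstab are hypothetical names.
set -euo pipefail

comment_out_swap() {
  local fstab_file="$1"
  # Skip lines that are already comments; prefix "#" when the third field is "swap".
  sed -i -E '/^[[:space:]]*#/!s/^([^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$fstab_file"
}

# Sample entries in the usual fstab layout: device, mount point, type, options, dump, pass.
demo_fstab="$(mktemp)"
printf '%s\n' \
  '/dev/mapper/centos-root / xfs defaults 0 0' \
  '/dev/mapper/centos-swap swap swap defaults 0 0' > "$demo_fstab"

comment_out_swap "$demo_fstab"
cat "$demo_fstab"
```

Running the function a second time leaves the file unchanged, because commented lines are skipped.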
To adjust the swappiness parameter, add the following line to /etc/sysctl.d/k8s.conf:
vm.swappiness=0
Run sysctl -p /etc/sysctl.d/k8s.conf to apply the change.
Because the two hosts used for this test also run other services, turning off swap might affect them, so here we instead lift the restriction with the kubelet startup flag --fail-swap-on=false. Edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and add:
Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false"
systemctl daemon-reload
3. Initializing the cluster with kubeadm init
Next, initialize the cluster with kubeadm. We choose node1 as the Master Node and run the following command on node1:
kubeadm init \
  --kubernetes-version=v1.9.0 \
  --pod-network-cidr=10.244.0.0/16 \
  --apiserver-advertise-address=192.168.61.11
Because we use flannel as the Pod network add-on, the command above specifies --pod-network-cidr=10.244.0.0/16.
The command failed with the following error:
[init] Using Kubernetes version: v1.9.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
    [WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Some fatal errors occurred:
    [ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
The output contains one warning, crictl not found in system path, and one error, running with swap on is not supported. Please disable swap. Because we already changed the kubelet startup flags above, we add the --ignore-preflight-errors=Swap flag to ignore this error and run the command again.
kubeadm init \
  --kubernetes-version=v1.9.0 \
  --pod-network-cidr=10.244.0.0/16 \
  --apiserver-advertise-address=192.168.61.11 \
  --ignore-preflight-errors=Swap
[init] Using Kubernetes version: v1.9.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
    [WARNING Swap]: running with swap on is not supported. Please disable swap
    [WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Starting the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [node1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.61.11]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 115.502952 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node node1 as master by adding a label and a taint
[markmaster] Master node1 tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: 227d9b.9c1772a7ab45942d
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: kube-dns
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join --token 227d9b.9c1772a7ab45942d 192.168.61.11:6443 --discovery-token-ca-cert-hash sha256:e2aa90853cc97410904adc5d58fbb52f4377ad091a0a807bed0dc69e37107151
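The above is the complete output of the initialization. Note that although kubeadm 1.9 is still in beta, the kubeadm init output no longer contains the warning that kubeadm 1.8 used to print: [kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.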
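The key points are:
- RBAC has been stable since Kubernetes 1.8, and kubeadm 1.9 enables it by default
- Next, the certificates and the related kubeconfig files are generated. This is what we already do in our Kubernetes 1.6 high-availability cluster deployment, and nothing here looks new so far
- Record the generated token; it is needed later when adding nodes to the cluster with kubeadm join
- The following commands configure kubectl access to the cluster for a regular user:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
- Finally, the output gives the command for joining nodes to the cluster:
  kubeadm join --token 227d9b.9c1772a7ab45942d 192.168.61.11:6443 --discovery-token-ca-cert-hash sha256:e2aa90853cc97410904adc5d58fbb52f4377ad091a0a807bed0dc69e37107151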
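If the join command is lost, the value for --discovery-token-ca-cert-hash can be recomputed from the cluster CA certificate, which kubeadm writes to /etc/kubernetes/pki/ca.crt on the master. The sketch below wraps the openssl pipeline given in the kubeadm documentation in a function; the function name ca_cert_hash is an assumption, and a throwaway self-signed certificate stands in for the real CA so that the example can run anywhere openssl is installed:

```shell
#!/usr/bin/env bash
# Sketch: compute the sha256 hash of a CA certificate's public key, in the form
# expected by --discovery-token-ca-cert-hash. ca_cert_hash is a hypothetical name.
set -euo pipefail

ca_cert_hash() {
  local ca_cert="$1"
  openssl x509 -pubkey -noout -in "$ca_cert" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}

# Stand-in for /etc/kubernetes/pki/ca.crt: a short-lived self-signed RSA certificate.
work_dir="$(mktemp -d)"
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj '/CN=demo-ca' \
  -keyout "$work_dir/ca.key" -out "$work_dir/ca.crt" 2>/dev/null

echo "sha256:$(ca_cert_hash "$work_dir/ca.crt")"
```

On the master, calling ca_cert_hash /etc/kubernetes/pki/ca.crt should yield the hex string that follows sha256: in the join command.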
Check the cluster status:
kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health": "true"}
Confirm that every component is in the Healthy state.
If cluster initialization runs into problems, clean up with the following commands:
kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
4. Installing the Pod Network
Next, install the flannel network add-on:
mkdir -p ~/k8s/
cd ~/k8s
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml
clusterrole "flannel" created
clusterrolebinding "flannel" created
serviceaccount "flannel" created
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created
Note that the flannel image in this kube-flannel.yml is version 0.9.1: quay.io/coreos/flannel:v0.9.1-amd64
If a Node has more than one network interface, see flannel issues 39701: for now the --iface flag must be used in kube-flannel.yml to specify the name of the host's cluster-internal network interface, otherwise DNS resolution may fail. Download kube-flannel.yml locally and add --iface=<iface-name> to the flanneld startup arguments:
......
containers:
- name: kube-flannel
  image: quay.io/coreos/flannel:v0.9.1-amd64
  command:
  - /opt/bin/flanneld
  args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=eth1
......
Use kubectl get pod --all-namespaces -o wide to make sure all Pods are in the Running state.
kubectl get pod --all-namespaces -o wide
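On a cluster with many Pods it can help to list only those that are not yet Running. The following sketch filters the kubectl output by its STATUS column; the function name pods_not_running is an assumption, and the Pod names in the sample output are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: list Pods whose STATUS is not Running.
# pods_not_running is a hypothetical name; the sample Pod names are invented.
set -euo pipefail

# Reads `kubectl get pod --all-namespaces` output on stdin and prints
# "namespace/name status" for every Pod that is not Running.
# With --all-namespaces the columns are NAMESPACE NAME READY STATUS ...
pods_not_running() {
  awk 'NR > 1 && $4 != "Running" {print $1 "/" $2, $4}'
}

sample_output='NAMESPACE     NAME                    READY   STATUS              RESTARTS   AGE
kube-system   etcd-node1              1/1     Running             0          5m
kube-system   kube-dns-example        0/3     ContainerCreating   0          5m
kube-system   kube-flannel-example    1/1     Running             0          1m'

printf '%s\n' "$sample_output" | pods_not_running
```

On the master the input would come from kubectl get pod --all-namespaces -o wide; empty output means every Pod is Running.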
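5. Letting the master node run workloads
In a cluster initialized with kubeadm, Pods are not scheduled onto the Master Node for security reasons; in other words, the Master Node does not take part in the workload.
Since this is a test environment, the following command can be used to let the Master Node run workloads:
kubectl taint nodes node1 node-role.kubernetes.io/master-
node "node1" untainted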
6. Testing DNS
kubectl run curl --image=radial/busyboxplus:curl -i --tty
If you don't see a command prompt, try pressing enter.
[ root@curl-2716574283-xr8zd:/ ]$
Once inside, run nslookup kubernetes.default to confirm that name resolution works:
nslookup kubernetes.default
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
7. Adding a Node to the Kubernetes cluster
Next we add the host node2 to the Kubernetes cluster. Because we also lifted the swap restriction in the kubelet startup flags on node2, the --ignore-preflight-errors=Swap flag is needed here as well.
Run on node2:
kubeadm join --token 227d9b.9c1772a7ab45942d 192.168.61.11:6443 --discovery-token-ca-cert-hash sha256:e2aa90853cc97410904adc5d58fbb52f4377ad091a0a807bed0dc69e37107151 \
  --ignore-preflight-errors=Swap
[preflight] Running pre-flight checks.
    [WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
    [WARNING Swap]: running with swap on is not supported. Please disable swap
    [WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "192.168.61.11:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.61.11:6443"
[discovery] Requesting info from "https://192.168.61.11:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.61.11:6443"
[discovery] Successfully established connection with API Server "192.168.61.11:6443"

This node has joined the cluster:
* Certificate signing request was sent to master and a response
  was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.
node2 joined the cluster without any trouble. Now run the following command on the master node to list the nodes in the cluster:
kubectl get nodes
NAME      STATUS    ROLES     AGE       VERSION
node1     Ready     master    26m       v1.9.0
node2     Ready     <none>    2m        v1.9.0
Removing a Node from the cluster
To remove the Node node2 from the cluster, run the following commands.
On the master node:
kubectl drain node2 --delete-local-data --force --ignore-daemonsets
kubectl delete node node2
On node2:
kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
8. Deploying the dashboard add-on
Note that the current dashboard version is already 1.8.1.
Also note that dashboard has reorganized the source directory layout of its deployment files:
mkdir -p ~/k8s/
cd ~/k8s
wget https://raw.githubusercontent.com/kubernetes/dashboard/v1.8.1/src/deploy/recommended/kubernetes-dashboard.yaml
kubectl create -f kubernetes-dashboard.yaml
The ServiceAccount kubernetes-dashboard in kubernetes-dashboard.yaml has only relatively limited permissions, so we create a ServiceAccount named kubernetes-dashboard-admin and grant it cluster admin permissions. Create kubernetes-dashboard-admin.rbac.yaml:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-admin
  namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard-admin
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard-admin
  namespace: kube-system
kubectl create -f kubernetes-dashboard-admin.rbac.yaml
serviceaccount "kubernetes-dashboard-admin" created
clusterrolebinding "kubernetes-dashboard-admin" created
Look up the token of kubernetes-dashboard-admin:
kubectl -n kube-system get secret | grep kubernetes-dashboard-admin
kubernetes-dashboard-admin-token-pfss5 kubernetes.io/service-account-token 3 14s

kubectl describe -n kube-system secret/kubernetes-dashboard-admin-token-pfss5
Name:         kubernetes-dashboard-admin-token-pfss5
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name=kubernetes-dashboard-admin
              kubernetes.io/service-account.uid=1029250a-ad76-11e7-9a1d-08002778b8a1

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1025 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC1hZG1pbi10b2tlbi1wZnNzNSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC1hZG1pbiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjEwMjkyNTBhLWFkNzYtMTFlNy05YTFkLTA4MDAyNzc4YjhhMSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTprdWJlcm5ldGVzLWRhc2hib2FyZC1hZG1pbiJ9.Bs6h65aFCFkEKBO_h4muoIK3XdTcfik-pNM351VogBJD_pk5grM1PEWdsCXpR45r8zUOTpGM-h8kDwgOXwy2i8a5RjbUTzD3OQbPJXqa1wBk0ABkmqTuw-3PWMRg_Du8zuFEPdKDFQyWxiYhUi_v638G-R5RdZD_xeJAXmKyPkB3VsqWVegoIVTaNboYkw6cgvMa-4b7IjoN9T1fFlWCTZI8BFXbM8ICOoYMsOIJr3tVFf7d6oVNGYqaCk42QL_2TfB6xMKLYER9XDh753-_FDVE5ENtY5YagD3T_s44o0Ewara4P9C3hYRKdJNLxv7qDbwPl3bVFH3HXbsSxxF3TQ
Use the token above to log in at the dashboard login window.
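The two lookup steps (find the secret name, then describe it) can be chained so that the generated suffix does not have to be copied by hand. This is a sketch under stated assumptions: the function name admin_token_secret_name is made up, and the sample listing, including the default-token-abcde row, only imitates kubectl output:

```shell
#!/usr/bin/env bash
# Sketch: pick the token secret of kubernetes-dashboard-admin from a secret listing.
# admin_token_secret_name is a hypothetical name; default-token-abcde is invented.
set -euo pipefail

# Reads `kubectl -n kube-system get secret` output on stdin and prints the name
# of the first secret that starts with kubernetes-dashboard-admin-token-.
admin_token_secret_name() {
  awk '!found && $1 ~ /^kubernetes-dashboard-admin-token-/ {print $1; found = 1}'
}

sample_output='NAME                                     TYPE                                  DATA   AGE
default-token-abcde                      kubernetes.io/service-account-token   3      1h
kubernetes-dashboard-admin-token-pfss5   kubernetes.io/service-account-token   3      14s'

printf '%s\n' "$sample_output" | admin_token_secret_name

# On the master the two steps could then be combined as:
#   kubectl -n kube-system describe secret \
#     "$(kubectl -n kube-system get secret | admin_token_secret_name)"
```

Matching on the first column avoids picking up other secrets that merely mention the account name elsewhere in the line.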
9. Deploying the heapster add-on
Next, install Heapster to add usage statistics and monitoring to the cluster and to provide charts for the Dashboard. We use InfluxDB as Heapster's backend storage. Start the deployment:
mkdir -p ~/k8s/heapster
cd ~/k8s/heapster
wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/grafana.yaml
wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml
wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml
wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml

kubectl create -f ./
Finally, confirm that all pods are in the Running state, then open the Dashboard; the cluster usage statistics are shown there as charts.
Docker images involved in this installation:
gcr.io/google_containers/kube-proxy-amd64:v1.9.0
gcr.io/google_containers/kube-apiserver-amd64:v1.9.0
gcr.io/google_containers/kube-controller-manager-amd64:v1.9.0
gcr.io/google_containers/kube-scheduler-amd64:v1.9.0
quay.io/coreos/flannel:v0.9.1-amd64
gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.7
gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.7
gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.7
gcr.io/google_containers/etcd-amd64:3.1.10
gcr.io/google_containers/pause-amd64:3.0

gcr.io/google_containers/kubernetes-dashboard-amd64:v1.8.1
gcr.io/google_containers/heapster-influxdb-amd64:v1.3.3
gcr.io/google_containers/heapster-grafana-amd64:v4.4.3
gcr.io/google_containers/heapster-amd64:v1.4.2