Deploying a Kubernetes 1.6 High-Availability Cluster with Ansible
2017-06-06
We have already used Ansible to deploy etcd and Docker in the new environment; next we use Ansible to deploy a Kubernetes 1.6 cluster. Since using Ansible itself is no longer an issue, this post focuses on a plain description of the concrete deployment steps and the pitfalls hit along the way. I wrote "Kubernetes 1.6 HA Cluster Deployment" a while ago, and the manual procedure in that post served as the main reference when writing the Ansible roles for deploying Kubernetes this time.
Environment Preparation #
System Configuration #
Disable SELinux on every Kubernetes node, and create /etc/sysctl.d/k8s.conf on each node with the following content:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
{% raw %}
- name: copy sysctl k8s.conf
  copy:
    src: k8s.conf
    dest: /etc/sysctl.d

- name: sysctl k8s.conf
  sysctl:
    name: "{{ item }}"
    value: 1
    sysctl_file: /etc/sysctl.d/k8s.conf
  with_items:
    - net.bridge.bridge-nf-call-iptables
    - net.bridge.bridge-nf-call-ip6tables

- name: config disable selinux
  lineinfile:
    path: /etc/selinux/config
    regexp: '^SELINUX='
    line: 'SELINUX=disabled'

- name: disable selinux
  selinux:
    state: disabled
{% endraw %}
Deploying the etcd HA Cluster #
The groundwork was laid earlier; simply reuse "Deploying an etcd 3.2 HA Cluster with Ansible".
Installing Docker on Each Node #
This is where the main pitfall was. The previous manually deployed environment (Kubernetes 1.6 HA Cluster Deployment) used Docker 1.12, following the official documentation's recommendation:
Kubernetes 1.6 has not been tested or validated against Docker 1.13 or the latest Docker 17.03; the version officially recommended for Kubernetes is Docker 1.12.
For this new environment, however, the latest Docker CE 17.03 had already been installed with Ansible earlier. After writing the complete set of Ansible roles and playbooks for deploying Kubernetes and running them with ansible-playbook, the whole installation went through in one pass, but testing revealed a problem: Pods on different Nodes could not reach each other over the network.
The cause was eventually found in these two issues:
- https://github.com/kubernetes/kubernetes/issues/40182
- https://github.com/kubernetes/kubernetes/issues/40761
Starting with version 1.13, Docker changed its default firewall rules and sets the policy of the FORWARD chain in the iptables filter table to DROP, which breaks communication between Pods on different Nodes in a Kubernetes cluster. There are two ways to fix this:
- Run the following on every node:
sudo iptables -P FORWARD ACCEPT
- Pass --iptables=false to dockerd, or set "iptables": false in its configuration file /etc/docker/daemon.json (a sketch of this option follows the list).
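For the second option, a task along these lines could be used instead. This is only a sketch: it assumes a `restart docker` handler exists in the role and that /etc/docker/daemon.json contains no other settings that would be overwritten.
# Sketch of the daemon.json alternative (assumes a "restart docker" handler and
# that daemon.json holds no other options that must be preserved).
- name: disable dockerd iptables manipulation
  copy:
    dest: /etc/docker/daemon.json
    content: |
      {
        "iptables": false
      }
    owner: root
    group: root
    mode: 0644
  notify: restart docker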
The first approach is used here, by adding the following task to the role that deploys the Kubernetes Nodes:
{% raw %}
- name: config filter FORWARD chain for pod networks
  iptables:
    table: filter
    chain: FORWARD
    policy: ACCEPT
{% endraw %}
Users, Certificates and Binary Installation #
In the previous manually deployed environment (Kubernetes 1.6 HA Cluster Deployment) the Kubernetes core components ran as root. The environment being built now is more formal, so a dedicated system user and group are created to run these core components, along with the Kubernetes configuration and log directories.
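A minimal sketch of these user/group and directory tasks, assuming the user and group are both named kube and that kube_conf_dir, kube_cert_dir and kube_log_dir are defined as role variables (the names are illustrative, not the original role's):
{% raw %}
# Illustrative sketch: variable names and paths are assumptions.
- name: create kube group
  group:
    name: kube
    state: present
    system: yes

- name: create kube user
  user:
    name: kube
    group: kube
    shell: /sbin/nologin
    system: yes
    createhome: no

- name: create kubernetes config, cert and log directories
  file:
    path: "{{ item }}"
    state: directory
    owner: kube
    group: kube
    mode: 0755
  with_items:
    - "{{ kube_conf_dir }}"
    - "{{ kube_cert_dir }}"
    - "{{ kube_log_dir }}"
{% endraw %}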
The next step is to generate the SSL certificates and keys the cluster needs and distribute them to each node:
- kube_ca_cert_file: CA certificate
- kube_ca_key_file: CA private key
- kube_apiserver_cert_file: apiserver certificate
- kube_apiserver_key_file: apiserver private key
- kube_admin_cert_file: certificate for the kubernetes-admin client user
- kube_admin_key_file: private key for the kubernetes-admin client user
- kube_controller_manager_cert_file: controller-manager client certificate
- kube_controller_manager_key_file: controller-manager client private key
- kube_scheduler_cert_file: scheduler client certificate
- kube_scheduler_key_file: scheduler client private key
{% raw %}
- name: gen certs on the first master server
  command:
    "{{ kube_cert_dir }}/make-ca-cert.sh"
  args:
    creates: "{{ kube_cert_dir }}/ca.key"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"
  environment:
    NODE_IPS: "{% for host in groups['k8s-master'] %}{{ hostvars[host]['k8s_master_address'] }}{% if not loop.last %},{% endif %}{% endfor %}"
    NODE_DNS: "{{ groups['k8s-master']|join(',') }}"
    CERT_DIR: "{{ kube_cert_dir }}"
    CERT_GROUP: kube

- name: slurp kube certs
  slurp:
    src: "{{ item }}"
  register: pki_certs
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"
  with_items:
    - "{{ kube_ca_cert_file }}"
    - "{{ kube_ca_key_file }}"
    - "{{ kube_admin_cert_file }}"
    - "{{ kube_admin_key_file }}"
    - "{{ kube_apiserver_cert_file }}"
    - "{{ kube_apiserver_key_file }}"
    - "{{ kube_controller_manager_cert_file }}"
    - "{{ kube_controller_manager_key_file }}"
    - "{{ kube_scheduler_cert_file }}"
    - "{{ kube_scheduler_key_file }}"

- name: copy kube certs to other node servers
  copy:
    dest: "{{ item.item }}"
    content: "{{ item.content | b64decode }}"
    owner: kube
    group: kube
    mode: 0400
  with_items: "{{ pki_certs.results }}"
  when: inventory_hostname != groups['k8s-master'][0]
{% endraw %}
Next, download and install the Kubernetes component binaries on every node (a sketch of the download-and-install tasks follows the list):
- kube-apiserver
- kube-controller-manager
- kube-scheduler
- kubelet
- kube-proxy
- kubectl
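A minimal sketch of such tasks, assuming kube_download_url, kube_download_dir and kube_bin_dir as variable names (not the original role's); the tarball could also be downloaded once and distributed with fetch/copy, as the CNI tasks below do:
{% raw %}
# Illustrative sketch only: the URL and variable names are assumptions.
- name: download kubernetes server binaries
  get_url:
    url: "{{ kube_download_url }}/kubernetes-server-linux-amd64.tar.gz"
    dest: "{{ kube_download_dir }}"
    validate_certs: no
    timeout: 20

- name: extract kubernetes server binaries
  unarchive:
    src: "{{ kube_download_dir }}/kubernetes-server-linux-amd64.tar.gz"
    dest: "{{ kube_download_dir }}"
    remote_src: yes

- name: install kubernetes binaries
  copy:
    src: "{{ kube_download_dir }}/kubernetes/server/bin/{{ item }}"
    dest: "{{ kube_bin_dir }}"
    remote_src: yes
    owner: kube
    group: kube
    mode: 0755
  with_items:
    - kube-apiserver
    - kube-controller-manager
    - kube-scheduler
    - kubelet
    - kube-proxy
    - kubectl
{% endraw %}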
Master Cluster Deployment #
The Master cluster consists of three Master nodes, each running the three core components kube-apiserver, kube-controller-manager and kube-scheduler. The three kube-apiserver instances serve traffic simultaneously, with a highly available load balancer in front of them acting as the kube-apiserver address. kube-controller-manager and kube-scheduler also run three instances each, but only one instance is active at any time, chosen by leader election.
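As an illustration of such a load balancer, a TCP load-balancing configuration might look like the following haproxy snippet; haproxy is only one possible choice here, and the backend addresses are placeholders:
# Sketch of a TCP load balancer for the three kube-apiservers
# (haproxy assumed; addresses are placeholders).
frontend kube-apiserver
    bind *:6443
    mode tcp
    default_backend kube-apiserver-backend

backend kube-apiserver-backend
    mode tcp
    balance roundrobin
    option tcp-check
    server master1 192.168.1.11:6443 check
    server master2 192.168.1.12:6443 check
    server master3 192.168.1.13:6443 check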
apiserver #
Deploying the apiserver is straightforward. Because our etcd cluster has SSL enabled, the apiserver flags must point at the etcd client certificates. Only the systemd unit template file is listed below:
{% raw %}
[Unit]
Description=kube-apiserver
After=network.target
After=etcd.service

[Service]
User=kube
EnvironmentFile=-/etc/kubernetes/apiserver
ExecStart={{ kube_bin_dir }}/kube-apiserver \
  --logtostderr=true \
  --v=0 \
  --advertise-address={{ k8s_master_address }} \
  --bind-address={{ k8s_master_address }} \
  --secure-port=6443 \
  --insecure-port=0 \
  --allow-privileged=true \
  --etcd-servers={{ kube_etcd_servers }} \
  --etcd-cafile={{ kube_etcd_ca_file }} \
  --etcd-certfile={{ kube_etcd_cert_file }} \
  --etcd-keyfile={{ kube_etcd_key_file }} \
  --storage-backend=etcd3 \
  --service-cluster-ip-range=10.96.0.0/12 \
  --tls-cert-file={{ kube_apiserver_cert_file }} \
  --tls-private-key-file={{ kube_apiserver_key_file }} \
  --client-ca-file={{ kube_ca_cert_file }} \
  --service-account-key-file={{ kube_ca_key_file }} \
  --experimental-bootstrap-token-auth=true \
  --apiserver-count=3 \
  --enable-swagger-ui=true \
  --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,ResourceQuota,DefaultTolerationSeconds \
  --authorization-mode=RBAC \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=3 \
  --audit-log-maxsize=100 \
  --audit-log-path={{ kube_log_dir }}/audit.log
Restart=on-failure
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
{% endraw %}
controller-manager #
Deploying the controller-manager is also straightforward; only the systemd unit template file is listed below:
{% raw %}
[Unit]
Description=kube-controller-manager
After=network.target
After=kube-apiserver.service

[Service]
EnvironmentFile=-/etc/kubernetes/controller-manager
ExecStart={{ kube_bin_dir }}/kube-controller-manager \
  --logtostderr=true \
  --v=0 \
  --master={{ kube_apiserver_lb_address }} \
  --kubeconfig={{ kube_controller_manager_kubeconfig_file }} \
  --cluster-name=kubernetes \
  --cluster-signing-cert-file={{ kube_ca_cert_file }} \
  --cluster-signing-key-file={{ kube_ca_key_file }} \
  --service-account-private-key-file={{ kube_ca_key_file }} \
  --root-ca-file={{ kube_ca_cert_file }} \
  --insecure-experimental-approve-all-kubelet-csrs-for-group=system:bootstrappers \
  --use-service-account-credentials=true \
  --service-cluster-ip-range=10.96.0.0/12 \
  --cluster-cidr=10.244.0.0/16 \
  --allocate-node-cidrs=true \
  --leader-elect=true \
  --controllers=*,bootstrapsigner,tokencleaner
Restart=on-failure
Type=simple
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
{% endraw %}
- The --kubeconfig file here is generated from the controller-manager client certificate and key; a sketch of generating such a kubeconfig follows.
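Such a kubeconfig can be generated with kubectl config. The sketch below uses placeholder paths and a placeholder load-balancer address; the same pattern applies to the scheduler and kubelet kubeconfig files mentioned later:
# Sketch: build a kubeconfig from a client cert/key (paths and the address are placeholders).
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/pki/ca.crt \
  --embed-certs=true \
  --server=https://<apiserver-lb-address>:6443 \
  --kubeconfig=/etc/kubernetes/controller-manager.conf

kubectl config set-credentials system:kube-controller-manager \
  --client-certificate=/etc/kubernetes/pki/controller-manager.crt \
  --client-key=/etc/kubernetes/pki/controller-manager.key \
  --embed-certs=true \
  --kubeconfig=/etc/kubernetes/controller-manager.conf

kubectl config set-context system:kube-controller-manager@kubernetes \
  --cluster=kubernetes \
  --user=system:kube-controller-manager \
  --kubeconfig=/etc/kubernetes/controller-manager.conf

kubectl config use-context system:kube-controller-manager@kubernetes \
  --kubeconfig=/etc/kubernetes/controller-manager.conf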
scheduler #
Deploying the scheduler is also straightforward; only the systemd unit template file is listed below:
{% raw %}
[Unit]
Description=kube-scheduler
After=network.target
After=kube-apiserver.service

[Service]
EnvironmentFile=-/etc/kubernetes/scheduler
ExecStart={{ kube_bin_dir }}/kube-scheduler \
  --logtostderr=true \
  --v=0 \
  --master={{ kube_apiserver_lb_address }} \
  --kubeconfig={{ kube_scheduler_kubeconfig_file }} \
  --leader-elect=true
Restart=on-failure
Type=simple
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
{% endraw %}
- The --kubeconfig file here is generated from the scheduler client certificate and key, following the same pattern shown above.
Node Deployment #
cni #
{% raw %}
---

- name: create cni download dir
  file:
    path: "{{ kube_cni_download_dir }}"
    state: directory
  delegate_to: "{{ groups['k8s-node'][0] }}"
  run_once: true

- name: check whether cni downloaded on the first node
  stat:
    path: "{{ kube_cni_download_dir }}/{{ kube_cni_release }}"
  register: kube_cni_downloaded_check
  delegate_to: "{{ groups['k8s-node'][0] }}"
  run_once: true

- name: download cni on the first node
  get_url:
    url: "{{ kube_cni_download_url }}"
    dest: "{{ kube_cni_download_dir }}"
    validate_certs: no
    timeout: 20
  register: download_cni
  delegate_to: "{{ groups['k8s-node'][0] }}"
  run_once: true
  when: not kube_cni_downloaded_check.stat.exists

- name: check whether cni tar extracted on the first node
  stat:
    path: "{{ kube_cni_download_dir }}/cnitool"
  register: kube_cni_release_tar_check
  delegate_to: "{{ groups['k8s-node'][0] }}"
  run_once: true

- name: extract cni tar file
  unarchive:
    src: "{{ kube_cni_download_dir }}/{{ kube_cni_release }}"
    dest: "{{ kube_cni_download_dir }}"
    remote_src: yes
  run_once: true
  delegate_to: "{{ groups['k8s-node'][0] }}"
  when: not kube_cni_release_tar_check.stat.exists

- name: fetch cni binary from the first node
  fetch:
    src: "{{ kube_cni_download_dir }}/{{ item }}"
    dest: "tmp/k8s-cni/{{ item }}"
    flat: yes
  run_once: true
  delegate_to: "{{ groups['k8s-node'][0] }}"
  with_items:
    - bridge
    - cnitool
    - dhcp
    - flannel
    - host-local
    - ipvlan
    - loopback
    - macvlan
    - noop
    - ptp
    - tuning

- name: create cni bin and conf dir
  file:
    path: "{{ item }}"
    state: directory
    owner: kube
    group: kube
    mode: 0751
    recurse: yes
  with_items:
    - "{{ kube_cni_conf_dir }}"
    - "{{ kube_cni_bin_dir }}"

- name: copy cni binary
  copy:
    src: "tmp/k8s-cni/{{ item }}"
    dest: "{{ kube_cni_bin_dir }}"
    owner: kube
    group: kube
    mode: 0751
  with_items:
    - bridge
    - cnitool
    - dhcp
    - flannel
    - host-local
    - ipvlan
    - loopback
    - macvlan
    - noop
    - ptp
    - tuning
{% endraw %}
kubelet #
Deploying the kubelet is also straightforward; only the systemd unit template file is listed below:
{% raw %}
[Unit]
Description=kubelet
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory={{ kube_kubelet_data_dir }}
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart={{ kube_bin_dir }}/kubelet \
  --logtostderr=true \
  --v=0 \
  --address={{ k8s_node_address }} \
  --api-servers={{ kube_apiserver_lb_address }} \
  --cluster-dns=10.96.0.10 \
  --cluster-domain=cluster.local \
  --kubeconfig={{ kube_kubelet_kubeconfig_file }} \
  --require-kubeconfig=true \
  --pod-manifest-path={{ kube_pod_manifest_dir }} \
  --allow-privileged=true \
  --authorization-mode=Webhook \
  --client-ca-file={{ kube_ca_cert_file }} \
  --network-plugin=cni \
  --cni-conf-dir={{ kube_cni_conf_dir }} \
  --cni-bin-dir={{ kube_cni_bin_dir }}
Restart=on-failure

[Install]
WantedBy=multi-user.target
{% endraw %}
- The --kubeconfig file here is generated from the kubelet client certificate and key, following the same pattern shown above.
kube-proxy #
Deploying kube-proxy is also straightforward.
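A minimal sketch of a kube-proxy systemd unit template is shown below; the variable names (in particular {{ kube_proxy_kubeconfig_file }}) and the exact flag set are assumptions rather than the original template:
{% raw %}
# Illustrative sketch of a kube-proxy unit; variable names and flags are assumptions.
[Unit]
Description=kube-proxy
After=network.target

[Service]
EnvironmentFile=-/etc/kubernetes/proxy
ExecStart={{ kube_bin_dir }}/kube-proxy \
  --logtostderr=true \
  --v=0 \
  --bind-address={{ k8s_node_address }} \
  --kubeconfig={{ kube_proxy_kubeconfig_file }} \
  --cluster-cidr=10.244.0.0/16
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
{% endraw %}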
Pod Network Plugin: flannel #
flannel runs in the Kubernetes cluster as a DaemonSet. Because our etcd cluster has TLS authentication enabled, we first save the etcd TLS certificates into a Kubernetes Secret so that the flannel containers can access etcd.
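Inside the kube-flannel.yml.j2 template, the DaemonSet then mounts that Secret into the flanneld container and points flanneld at the TLS files. The fragment below is only an illustrative sketch; the image tag, the mount path and the file names inside etcd-tls-secret are assumptions:
# Fragment of a kube-flannel DaemonSet pod spec (sketch; names and paths are assumptions).
containers:
- name: kube-flannel
  image: quay.io/coreos/flannel:v0.7.1-amd64
  command:
  - /opt/bin/flanneld
  - --ip-masq
  - --etcd-endpoints=https://etcd1:2379,https://etcd2:2379,https://etcd3:2379
  - --etcd-cafile=/etc/etcd/ssl/ca.pem
  - --etcd-certfile=/etc/etcd/ssl/etcd.pem
  - --etcd-keyfile=/etc/etcd/ssl/etcd-key.pem
  volumeMounts:
  - name: etcd-tls
    mountPath: /etc/etcd/ssl
    readOnly: true
volumes:
- name: etcd-tls
  secret:
    secretName: etcd-tls-secret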
{% raw %}
- name: delete etcd client cert secret
  command: "{{ kube_bin_dir }}/kubectl delete secret etcd-tls-secret \
            -n kube-system"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"

- name: create etcd client cert secret
  command: "{{ kube_bin_dir }}/kubectl create secret generic etcd-tls-secret \
            --from-file={{ kube_etcd_cert_file }} \
            --from-file={{ kube_etcd_key_file }} \
            --from-file={{ kube_etcd_ca_file }} \
            -n kube-system"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"

- name: copy kube-flannel-rbac.yml
  copy:
    src: kube-flannel-rbac.yml
    dest: "{{ ansible_temp_dir }}"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"

- name: apply kube-flannel-rbac.yml
  command: "{{ kube_bin_dir }}/kubectl apply -f {{ ansible_temp_dir }}/kube-flannel-rbac.yml"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"

- name: create kube-flannel.yml
  template:
    src: kube-flannel.yml.j2
    dest: "{{ ansible_temp_dir }}/kube-flannel.yml"

- name: apply kube-flannel.yml
  command: "{{ kube_bin_dir }}/kubectl apply -f {{ ansible_temp_dir }}/kube-flannel.yml"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"
{% endraw %}
Note that we now use Ansible's command module to invoke kubectl for creating resources in the Kubernetes cluster, rather than Ansible's kubernetes module. This is because we enabled mutual SSL authentication on the Kubernetes APIServer, and Ansible's kubernetes module does not yet support accessing the APIServer this way.
After flannel is installed, first confirm that the flannel DaemonSet Pod on every node is in the Running state. Then deploy a three-replica nginx to the cluster to verify that cross-node Pod-to-Pod networking works:
kubectl run nginx --replicas=3 --image=nginx --port=80
From one node, curl the nginx Pod IPs on the other two nodes to confirm connectivity.
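For example (the Pod IP shown is a placeholder):
# List the nginx Pods with their IPs, then curl a Pod running on another node.
kubectl get pods -o wide -l run=nginx
curl http://10.244.2.5    # placeholder Pod IP on a different node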
kube-dns #
The kube-dns add-on also runs on the Kubernetes cluster itself.
{% raw %}
---

- name: copy kube dns yml
  copy:
    src: "kube-dns/{{ item }}"
    dest: "{{ ansible_temp_dir }}"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"
  with_items:
    - kubedns-cm.yaml
    - kubedns-sa.yaml
    - kubedns-controller.yaml
    - kubedns-svc.yaml

- name: apply kube dns yml
  command: "{{ kube_bin_dir }}/kubectl apply -f {{ ansible_temp_dir }}/{{ item }}"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"
  with_items:
    - kubedns-cm.yaml
    - kubedns-sa.yaml
    - kubedns-controller.yaml
    - kubedns-svc.yaml

- name: scale kube dns
  command: "{{ kube_bin_dir }}/kubectl --namespace=kube-system scale deployment kube-dns --replicas=3"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"
{% endraw %}
The yml files can be taken from section 4.5 of Kubernetes 1.6 HA Cluster Deployment.
Once the kube-dns Pods are up, test that DNS resolution works:
kubectl run curl --image=radial/busyboxplus:curl -i --tty

nslookup kubernetes.default
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
dashboard and heapster Add-ons #
{% raw %}
---

- name: copy dashboard and heapster yml
  copy:
    src: "dashboard/{{ item }}"
    dest: "{{ ansible_temp_dir }}"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"
  with_items:
    - kubernetes-dashboard.yaml
    - heapster-rbac.yaml
    - heapster.yaml
    - influxdb.yaml
    - grafana.yaml

- name: apply dashboard and heapster yml
  command: "{{ kube_bin_dir }}/kubectl apply -f {{ ansible_temp_dir }}/{{ item }}"
  run_once: true
  delegate_to: "{{ groups['k8s-master'][0] }}"
  with_items:
    - kubernetes-dashboard.yaml
    - heapster-rbac.yaml
    - heapster.yaml
    - influxdb.yaml
    - grafana.yaml
{% endraw %}