Environment Preparation

192.168.61.41 node1
192.168.61.42 node2
192.168.61.43 node3

Installing Docker 1.12

Kubernetes 1.6 has not yet been tested and validated against Docker 1.13 or the newer Docker 17.03, so here we install Docker 1.12, the version officially recommended by Kubernetes.

yum install -y yum-utils

yum-config-manager \
    --add-repo \
    https://docs.docker.com/v1.13/engine/installation/linux/repo_files/centos/docker.repo

yum makecache fast

Check the available versions:

yum list docker-engine.x86_64  --showduplicates |sort -r
docker-engine.x86_64             1.13.1-1.el7.centos                 docker-main
docker-engine.x86_64             1.12.6-1.el7.centos                 docker-main
docker-engine.x86_64             1.11.2-1.el7.centos                 docker-main

Install 1.12.6:

yum install -y docker-engine-1.12.6

systemctl start docker
systemctl enable docker

System Configuration

Per the Limitations section of the official document Installing Kubernetes on Linux with kubeadm, apply the following settings on each node:

Create the file /etc/sysctl.d/k8s.conf with the following content:

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

Run sysctl -p /etc/sysctl.d/k8s.conf to apply the changes.
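The two steps above can be sketched as a small script. This is a sketch only: it writes to a temporary file so it is safe to try anywhere; on a real node the target path is /etc/sysctl.d/k8s.conf, and on some kernels the br_netfilter module must be loaded (modprobe br_netfilter) before the net.bridge.* keys exist.

```shell
# Write the bridge-netfilter settings.
# On a real node you would use TARGET=/etc/sysctl.d/k8s.conf instead of a temp file.
TARGET=$(mktemp)
cat > "$TARGET" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

# On a real node, apply the settings with:
#   sysctl -p "$TARGET"
cat "$TARGET"
```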

Set each node's hostname in /etc/hostname, and in /etc/hosts map each hostname to the IP of a non-loopback network interface:

192.168.61.41 node1
192.168.61.42 node2
192.168.61.43 node3
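This step can be sketched as follows. The script writes to temporary stand-ins for /etc/hostname and /etc/hosts so it can be tried safely; on a real node you would target the real files (or use hostnamectl set-hostname), and set NODE_NAME to that node's own name.

```shell
# Stand-ins for the real files; on a node: HOSTNAME_FILE=/etc/hostname, HOSTS_FILE=/etc/hosts.
NODE_NAME=node1
HOSTNAME_FILE=$(mktemp)
HOSTS_FILE=$(mktemp)

# Set this node's hostname (on a real node, `hostnamectl set-hostname "$NODE_NAME"` also works).
echo "$NODE_NAME" > "$HOSTNAME_FILE"

# Append the cluster entries only if they are not present yet (idempotent).
grep -q '192.168.61.41 node1' "$HOSTS_FILE" || cat >> "$HOSTS_FILE" <<'EOF'
192.168.61.41 node1
192.168.61.42 node2
192.168.61.43 node3
EOF
```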

Installing kubeadm and kubelet

Install kubeadm and kubelet on each node:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://yum.kubernetes.io/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
        https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF

Test whether http://yum.kubernetes.io/repos/kubernetes-el7-x86_64 is reachable; if it is not, you will need a way around network restrictions (e.g. a proxy):

curl http://yum.kubernetes.io/repos/kubernetes-el7-x86_64

Check the latest versions of kubeadm, kubelet, kubectl, and kubernetes-cni:

yum list kubeadm  --showduplicates |sort -r
kubeadm.x86_64                        1.6.1-0                        kubernetes
kubeadm.x86_64                        1.6.0-0                        kubernetes

yum list kubelet  --showduplicates |sort -r
kubelet.x86_64                        1.6.1-0                        kubernetes
kubelet.x86_64                        1.6.0-0                        kubernetes
kubelet.x86_64                        1.5.4-0                        kubernetes

yum list kubectl  --showduplicates |sort -r
kubectl.x86_64                        1.6.1-0                        kubernetes
kubectl.x86_64                        1.6.0-0                        kubernetes
kubectl.x86_64                        1.5.4-0                        kubernetes

yum list kubernetes-cni  --showduplicates |sort -r
kubernetes-cni.x86_64                 0.5.1-0                        kubernetes

kubeadm and kubelet are already at 1.6.1, exactly the version we want, so install them directly:

setenforce 0

yum install -y kubelet kubeadm kubectl kubernetes-cni
...
Installed:
  kubeadm.x86_64 0:1.6.1-0               kubectl.x86_64 0:1.6.1-0        kubelet.x86_64 0:1.6.1-0
  kubernetes-cni.x86_64 0:0.5.1-0

Dependency Installed:
  ebtables.x86_64 0:2.0.10-15.el7                       socat.x86_64 0:1.7.2.2-5.el7

Complete!

systemctl enable kubelet.service

Initializing the Cluster

Next, use kubeadm to initialize the cluster. We choose node1 as the master node; run the following command on node1:

kubeadm init --kubernetes-version=v1.6.1 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.61.41

Note that the kubeadm init flags --kubernetes-version and --apiserver-advertise-address have changed: in kubeadm 1.6.0-0.alpha they were --use-kubernetes-version and --api-advertise-addresses.

Because we chose flannel as the Pod network add-on, the command above specifies --pod-network-cidr=10.244.0.0/16. If initialization runs into problems, clean up with the following commands and then initialize again:

kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/

On success, kubeadm init prints the following:

kubeadm init --kubernetes-version=v1.6.1 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.61.41
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.6.1
[init] Using Authorization mode: RBAC
[preflight] Running pre-flight checks
[preflight] Starting the kubelet service
[certificates] Generated CA certificate and key.
[certificates] Generated API server certificate and key.
[certificates] API Server serving cert is signed for DNS names [node0 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.61.41]
[certificates] Generated API server kubelet client certificate and key.
[certificates] Generated service account token signing key and public key.
[certificates] Generated front-proxy CA certificate and key.
[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 14.583864 seconds
[apiclient] Waiting for at least one node to register
[apiclient] First node has registered after 6.008990 seconds
[token] Using token: e7986d.e440de5882342711
[apiconfig] Created RBAC rules
[addons] Created essential addon: kube-proxy
[addons] Created essential addon: kube-dns

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run (as a regular user):

  sudo cp /etc/kubernetes/admin.conf $HOME/
  sudo chown $(id -u):$(id -g) $HOME/admin.conf
  export KUBECONFIG=$HOME/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  http://kubernetes.io/docs/admin/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join --token e7986d.e440de5882342711 192.168.61.41:6443

The master node is now initialized. In a cluster built with kubeadm, the core master components kube-apiserver, kube-scheduler, and kube-controller-manager run as static Pods:

ls /etc/kubernetes/manifests/
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml

In the /etc/kubernetes/manifests/ directory you can see the manifest files for kube-apiserver, kube-scheduler, and kube-controller-manager. The cluster's persistent store, etcd, also runs as a single-instance static Pod; later we will switch it to an etcd cluster, but for now we leave it as-is.

Take a look at the content of kube-apiserver.yaml:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    .......
    - --insecure-port=0

Note the kube-apiserver option --insecure-port=0: in a cluster initialized by kubeadm 1.6, kube-apiserver does not listen on the default insecure HTTP port 8080. As a result, kubectl get nodes fails with The connection to the server localhost:8080 was refused - did you specify the right host or port?

Checking kube-apiserver's listening ports shows that it only listens on HTTPS port 6443:

netstat -nltp | grep apiserver
tcp6       0      0 :::6443                 :::*                    LISTEN      9831/kube-apiserver

To let kubectl talk to the apiserver, append the following environment variable to ~/.bash_profile:

export KUBECONFIG=/etc/kubernetes/admin.conf

source ~/.bash_profile

kubectl now works on the master node. Check the nodes currently in the cluster:

kubectl get nodes
NAME      STATUS     AGE       VERSION
node0     NotReady   3m        v1.6.1

Installing the Pod Network

Next, install the flannel network add-on:

kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml
kubectl apply -f  https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
serviceaccount "flannel" created
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created

If a node has more than one network interface, then per flannel issues 39701 you currently need to use the --iface flag in kube-flannel.yml to specify the name of the host's internal-network interface; otherwise DNS resolution may fail. Download kube-flannel.yml locally and add --iface=&lt;iface-name&gt; to the flanneld start arguments:

......
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
......
containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.7.0-amd64
        command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=eth1" ]
......

Use kubectl get pod --all-namespaces -o wide to make sure all Pods are in the Running state.

kubectl get pod --all-namespaces -o wide

Allowing the Master Node to Run Workloads

In a cluster initialized with kubeadm, Pods are not scheduled onto the master node for security reasons; in other words, the master node carries no workloads.

Since this is a test environment, we can let the master node take workloads with the following command:

kubectl taint nodes --all node-role.kubernetes.io/master-

Testing DNS

kubectl run curl --image=radial/busyboxplus:curl -i --tty
Waiting for pod default/curl-2421989462-vldmp to be running, status is Pending, pod ready: false
Waiting for pod default/curl-2421989462-vldmp to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
[ root@curl-2421989462-vldmp:/ ]$

Once inside, run nslookup kubernetes.default to confirm that resolution works:

[ root@curl-2421989462-vldmp:/ ]$ nslookup kubernetes.default
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local

Once the test passes, delete the curl deployment:

kubectl delete deploy curl

Adding Nodes to the Cluster

Next, join node2 and node3 to the cluster by running the following on node2 and node3 respectively:

kubeadm join --token e7986d.e440de5882342711 192.168.61.41:6443
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "192.168.61.41:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.61.41:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://192.168.61.41:6443"
[discovery] Successfully established connection with API Server "192.168.61.41:6443"
[bootstrap] Detected server version: v1.6.1
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"

Node join complete:
* Certificate signing request sent to master and response
  received.
* Kubelet informed of new secure connection details.

Run 'kubectl get nodes' on the master to see this machine join.

Check the nodes in the cluster:

kubectl get nodes
NAME      STATUS    AGE       VERSION
node1     Ready     12m       v1.6.1
node2     Ready     4m        v1.6.1
node3     Ready     2m        v1.6.1

Installing the Dashboard Add-on

wget https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml

kubectl create -f kubernetes-dashboard.yaml

Accessing the dashboard at http://NodeIp:NodePort, the browser shows the following error:

User "system:serviceaccount:kube-system:default" cannot list statefulsets.apps in the namespace "default". (get statefulsets.apps)

This is because starting with Kubernetes 1.6 the API Server has RBAC authorization enabled, and the current kubernetes-dashboard.yaml does not define an authorized ServiceAccount, so its requests to the API Server are rejected.

Following https://github.com/kubernetes/dashboard/issues/1803, as a temporary workaround we grant the cluster-admin role to system:serviceaccount:kube-system:default.

Create dashboard-rbac.yaml, binding system:serviceaccount:kube-system:default to the ClusterRole cluster-admin:

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: kube-system

kubectl create -f dashboard-rbac.yaml

Running Heapster in the Cluster

Next, install Heapster to add usage statistics and monitoring to the cluster, and charts to the Dashboard.

Download the latest Heapster release onto one of the cluster's nodes:

wget https://github.com/kubernetes/heapster/archive/v1.3.0.tar.gz

Use InfluxDB as Heapster's storage backend and start the deployment; it will pull the related images along the way, including gcr.io/google_containers/heapster_grafana:v2.6.0-2:

tar -zxvf v1.3.0.tar.gz
cd heapster-1.3.0/deploy/kube-config/influxdb

kubectl create -f ./
deployment "monitoring-grafana" created
service "monitoring-grafana" created
deployment "heapster" created
service "heapster" created
deployment "monitoring-influxdb" created
service "monitoring-influxdb" created

Finally, confirm that all Pods are in the Running state, then open the Dashboard: the cluster's usage statistics are now displayed as charts.

References