Table of Contents

  • 1. Deploying a three-node (shared control-plane/worker) highly available k8s cluster with kubeadm
    • 1.1 Environment planning
      • 1.1.1 Lab architecture diagram
      • 1.1.2 System version notes
      • 1.1.3 Basic environment information
      • 1.1.4 k8s CIDR allocation
    • 1.2 Base installation and optimization
      • 1.2.1 System information check
      • 1.2.2 Static IP address configuration
      • 1.2.3 Configure hostnames
      • 1.2.4 Configure the /etc/hosts file
      • 1.2.5 Disable SELinux
      • 1.2.6 Configure passwordless SSH trust
      • 1.2.7 Disable swap
      • 1.2.8 Disable firewalld
      • 1.2.9 Disable NetworkManager
      • 1.2.10 Set resource limits
      • 1.2.11 Configure time synchronization
      • 1.2.12 Configure China-mainland mirrors
      • 1.2.13 Upgrade the kernel
      • 1.2.14 Install base tools
      • 1.2.15 Configure kernel modules and parameters
      • 1.2.16 Install the container runtime
    • 1.3 Cluster installation and configuration
      • 1.3.1 Install kubeadm, kubelet and kubectl
      • 1.3.2 Install the HA components: keepalived and nginx
      • 1.3.3 Initialize the first master node
      • 1.3.4 Join the other nodes as additional masters
      • 1.3.5 Install the calico network plugin
      • 1.3.6 Remove the taints
      • 1.3.7 Configure etcd high availability
    • 1.4 Add-on installation and optimization
      • 1.4.1 kubectl command completion
      • 1.4.2 Install metrics-server
      • 1.4.3 Install the Dashboard UI
    • 1.5 Cluster verification
      • 1.5.1 Node verification
      • 1.5.2 Pod verification
      • 1.5.3 k8s CIDR verification
      • 1.5.4 Resource creation verification
      • 1.5.5 Pod DNS resolution verification
      • 1.5.6 Node access verification
      • 1.5.7 Pod-to-pod communication

1. Deploying a three-node (shared control-plane/worker) highly available k8s cluster with kubeadm

1.1 Environment planning

1.1.1 Lab architecture diagram

1.1.2 System version notes

OS version: CentOS Linux release 7.9.2009 (Core)

Initial kernel version: 3.10.0-1160.71.1.el7.x86_64

Hardware: 2 vCPU, 2 GB RAM, 150 GB disk

Filesystem: xfs

Network: outbound internet access

k8s version: 1.25.9

1.1.3 Basic environment information

K8s cluster role | IP address | Hostname | Components
Control node 1 (worker node 1) | 192.168.204.10 | k8s-001 | apiserver, controller-manager, scheduler, etcd, kube-proxy, container runtime, keepalived, nginx, calico, coredns, kubelet
Control node 2 (worker node 2) | 192.168.204.11 | k8s-002 | apiserver, controller-manager, scheduler, etcd, kube-proxy, container runtime, keepalived, nginx, calico, coredns, kubelet
Control node 3 (worker node 3) | 192.168.204.12 | k8s-003 | apiserver, controller-manager, scheduler, etcd, kube-proxy, container runtime, calico, coredns, kubelet
VIP address | 192.168.204.13 (k8s-vip) | |

1.1.4 k8s CIDR allocation

  • service CIDR: 10.165.0.0/16

  • pod CIDR: 10.166.0.0/16

1.2 Base installation and optimization

Unless otherwise noted, run every step on all three nodes.

1.2.1 System information check

Check the OS release and kernel version

cat /etc/redhat-release ; uname -r


1.2.2 Static IP address configuration

Each server must use a static IP address; it must not change.

grep -E 'BOOTPROTO|IPADDR' /etc/sysconfig/network-scripts/ifcfg-ens32
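
For reference, a minimal sketch of a static configuration for this lab. The interface name ens32 and the addresses come from the plan above; GATEWAY and DNS1 are assumptions, adjust them to your network:

# /etc/sysconfig/network-scripts/ifcfg-ens32 on k8s-001 (example)
TYPE=Ethernet
BOOTPROTO=static        # no DHCP
NAME=ens32
DEVICE=ens32
ONBOOT=yes
IPADDR=192.168.204.10   # 192.168.204.11 / .12 on the other nodes
NETMASK=255.255.255.0
GATEWAY=192.168.204.2   # assumed gateway, adjust
DNS1=223.5.5.5          # assumed DNS, adjust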


1.2.3 Configure hostnames

Set each host's hostname according to the plan.

hostnamectl set-hostname k8s-001    # on 192.168.204.10
hostnamectl set-hostname k8s-002    # on 192.168.204.11
hostnamectl set-hostname k8s-003    # on 192.168.204.12

1.2.4 Configure the /etc/hosts file

cat >> /etc/hosts <<EOF
192.168.204.10 k8s-001
192.168.204.11 k8s-002
192.168.204.12 k8s-003
192.168.204.13 k8s-vip
EOF

1.2.5 Disable SELinux

setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config

1.2.6 Configure passwordless SSH trust

### Only k8s-001 needs passwordless SSH to all three nodes
ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ''
ssh-copy-id k8s-001
ssh-copy-id k8s-002
ssh-copy-id k8s-003

1.2.7 Disable swap

This is mandatory.

swapoff -a
sed --in-place=.bak 's/.*swap.*/#&/g' /etc/fstab
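
A quick check (not in the original) that swap is really off:

free -h | grep -i swap      # should show 0B total / 0B used
grep swap /etc/fstab        # the swap entry should now be commented out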

1.2.8 Disable firewalld

systemctl stop firewalld ; systemctl disable firewalld

1.2.9 Disable NetworkManager

systemctl stop NetworkManager; systemctl disable NetworkManager

1.2.10 Set resource limits

The stock /etc/security/limits.conf contains no active parameter lines.

cat >> /etc/security/limits.conf << EOF
* soft nofile 65536
* hard nofile 131072
* soft nproc 65535
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited
EOF

1.2.11 Configure time synchronization

### Write chrony.conf
cat > /etc/chrony.conf << EOF
server ntp.aliyun.com iburst
stratumweight 0
driftfile /var/lib/chrony/drift
rtcsync
makestep 10 3
bindcmdaddress 127.0.0.1
bindcmdaddress ::1
keyfile /etc/chrony.keys
commandkey 1
generatecommandkey
logchange 0.5
logdir /var/log/chrony
EOF
### Restart the service
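
The restart command itself is cut off in the original; a reasonable completion, plus a verification step, would be:

systemctl restart chronyd && systemctl enable chronyd
chronyc sources -v          # ntp.aliyun.com should be listed as a source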

1.2.12 Configure China-mainland mirrors

## CentOS 7 base and EPEL repos
mkdir /etc/yum.repos.d.bak && mv /etc/yum.repos.d/*.repo /etc/yum.repos.d.bak
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo
curl -o /etc/yum.repos.d/epel.repo https://mirrors.aliyun.com/repo/epel-7.repo
yum clean all && yum makecache
## Docker repo
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sed -i 's+download.docker.com+mirrors.aliyun.com/docker-ce+' /etc/yum.repos.d/docker-ce.repo
yum makecache fast
## Kubernetes repo
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

1.2.13 Upgrade the kernel

### Update packages first (excluding the kernel)
yum update --exclude=kernel* -y
### Download the kernel (4.19+ is recommended; the stock kernel also works)
curl -o kernel-ml-4.19.12-1.el7.elrepo.x86_64.rpm http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-4.19.12-1.el7.elrepo.x86_64.rpm
curl -o kernel-ml-devel-4.19.12-1.el7.elrepo.x86_64.rpm http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-devel-4.19.12-1.el7.elrepo.x86_64.rpm
### Install the kernel (only these 2 rpm packages are in the current directory)
yum localinstall -y *.rpm
### Change the default boot entry
grub2-set-default 0
grub2-mkconfig -o /etc/grub2.cfg
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
### Check that the new kernel is the default
grubby --default-kernel
### Reboot the server
reboot
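
After the reboot, a quick sanity check (my addition) that the new kernel is actually running:

uname -r        # expect 4.19.12-1.el7.elrepo.x86_64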

1.2.14 Install base tools

Install a set of commonly used tools.

yum install -y device-mapper-persistent-data net-tools nfs-utils jq psmisc git lrzsz gcc gcc-c++ make cmake libxml2-devel openssl-devel curl curl-devel unzip sudo libaio-devel wget vim ncurses-devel autoconf automake zlib-devel python-devel epel-release openssh-server socat ipvsadm conntrack telnet ipset sysstat libseccomp 

1.2.15 Configure kernel modules and parameters

### Modules to load
cat > /etc/modules-load.d/k8s.conf << EOF
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
overlay
br_netfilter
EOF
## Load them automatically at boot
systemctl enable systemd-modules-load.service --now
## Kernel parameter tuning
cat > /etc/sysctl.d/k8s.conf << EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
net.ipv4.conf.all.route_localnet = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
## Apply
sysctl --system
### Reboot, then check that the modules loaded correctly
reboot
lsmod | grep --color=auto -e ip_vs -e nf_conntrack

1.2.16 Install the container runtime

Install both the Docker engine and containerd.

Install containerd

### Install
yum install -y containerd.io-1.6.6
### Generate the default config file
containerd config default > /etc/containerd/config.toml
### Modify the config file
sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
sed -i 's#sandbox_image = "k8s.gcr.io/pause:3.6"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.7"#g' /etc/containerd/config.toml
### Configure a registry mirror
sed -i 's#config_path = ""#config_path = "/etc/containerd/certs.d"#g' /etc/containerd/config.toml
mkdir -p /etc/containerd/certs.d/docker.io
cat > /etc/containerd/certs.d/docker.io/hosts.toml << EOF
server = "https://registry-1.docker.io"
[host."https://xpd691zc.mirror.aliyuncs.com"]
  capabilities = ["pull", "resolve", "push"]
EOF
## Start and enable
systemctl daemon-reload ; systemctl enable containerd --now

Install crictl

### Download the binary package
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.25.0/crictl-v1.25.0-linux-amd64.tar.gz
### Unpack
tar -xf crictl-v1.25.0-linux-amd64.tar.gz
### Move it into place
mv crictl /usr/local/bin/
### Configure
cat > /etc/crictl.yaml << EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
### Restart containerd to pick it up
systemctl restart containerd
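
A small check (my addition) that crictl can talk to containerd through the configured endpoint:

crictl version      # should print the client version and containerd's runtime version
crictl images       # an empty list is fine at this stage; an error means the endpoint is wrong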

Install docker

yum install docker-ce -y
# Configure a registry mirror
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://xpd691zc.mirror.aliyuncs.com"]
}
EOF
# Start the service
systemctl enable docker --now
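
A quick verification (my addition) that docker is up and the mirror was picked up:

systemctl is-active docker
docker info | grep -A1 'Registry Mirrors'   # should show the aliyuncs mirror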

Take a VM snapshot of the nodes at this point.

1.3 Cluster installation and configuration

1.3.1 Install kubeadm, kubelet and kubectl

### List the available versions
yum list kubeadm.x86_64 --showduplicates | sort -r
### Install version 1.25.9
yum install kubeadm-1.25.9 kubelet-1.25.9 kubectl-1.25.9 -y
### Enable kubelet
### Its status will be abnormal at this point; that is expected
systemctl enable kubelet --now

1.3.2 Install the HA components: keepalived and nginx

Install the packages

yum install nginx keepalived nginx-mod-stream -y

Configuration files

### /etc/nginx/nginx.conf, identical on all three nodes
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;
events {
    worker_connections 1024;
}
stream {
    log_format main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log /var/log/nginx/k8s-access.log main;
    upstream kube-apiserver {
        server 192.168.204.10:6443 weight=5 max_fails=3 fail_timeout=30s;
        server 192.168.204.11:6443 weight=5 max_fails=3 fail_timeout=30s;
        server 192.168.204.12:6443 weight=5 max_fails=3 fail_timeout=30s;
    }
    server {
        listen 16443;
        proxy_pass kube-apiserver;
    }
}
http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    server {
        listen 80 default_server;
        server_name _;
        location / {
        }
    }
}

### /etc/keepalived/keepalived.conf
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_MASTER
}
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1 {
    state MASTER
    interface ens32              # mind the NIC name
    virtual_router_id 51
    priority 100                 # 100 on the primary; 90 and 80 on the other two nodes
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.204.13/24
    }
    track_script {
        check_nginx
    }
}

### Contents of /etc/keepalived/check_nginx.sh
#!/bin/bash
#
counter=$(ps -ef | grep nginx | grep sbin | egrep -cv "grep|$$")
if [ $counter -eq 0 ]; then
    service nginx start
    sleep 2
    counter=$(ps -ef | grep nginx | grep sbin | egrep -cv "grep|$$")
    if [ $counter -eq 0 ]; then
        service keepalived stop
    fi
fi

## Make the script executable
chmod +x /etc/keepalived/check_nginx.sh

Start and verify

nginx -t
systemctl daemon-reload
systemctl enable nginx keepalived --now
ip -4 a
## Stop nginx on the primary node and check whether the VIP fails over and then recovers
systemctl stop nginx
## Expected result: the VIP fails over normally and returns to the first node once nginx is back

1.3.3 Initialize the first master node

# On all three nodes, point the tooling at the container runtime
crictl config runtime-endpoint /run/containerd/containerd.sock
# On k8s-001, generate the config for initializing the first master node
kubeadm config print init-defaults > k8s-config.yaml
# After generating the file, modify the points marked below
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
#localAPIEndpoint:                    ## commented out
#  advertiseAddress: 192.168.204.10   ## commented out
#  bindPort: 6443                     ## commented out
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
#  name: node                         ## commented out
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers   # switch to a domestic mirror
kind: ClusterConfiguration
kubernetesVersion: 1.25.9                    # match the installed version
controlPlaneEndpoint: 192.168.204.13:16443   # added: the VIP address
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.165.0.0/16               # added: the planned service CIDR
  podSubnet: 10.166.0.0/16                   # added: the planned pod CIDR
scheduler: {}
# added content
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
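
Optionally (not part of the original flow), the control-plane images can be pre-pulled with the same config file before running init, which surfaces registry problems early:

### Optional pre-check on k8s-001
kubeadm config images list --config=k8s-config.yaml
kubeadm config images pull --config=k8s-config.yaml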

Initialize

### Run on k8s-001
kubeadm init --config=k8s-config.yaml --ignore-preflight-errors=SystemVerification

The output after a successful initialization

Configure kubectl as the output instructs

### On k8s-001
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

1.3.4 Join the other nodes as additional masters

Copy the certificates

### Run on the other 2 nodes
cd /root && mkdir -p /etc/kubernetes/pki/etcd && mkdir -p ~/.kube/
### Copy the certificates from k8s-001 to k8s-002
scp /etc/kubernetes/pki/ca.crt k8s-002:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key k8s-002:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key k8s-002:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub k8s-002:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt k8s-002:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key k8s-002:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt k8s-002:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key k8s-002:/etc/kubernetes/pki/etcd/
### Copy the certificates from k8s-001 to k8s-003
scp /etc/kubernetes/pki/ca.crt k8s-003:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key k8s-003:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key k8s-003:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub k8s-003:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt k8s-003:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key k8s-003:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt k8s-003:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key k8s-003:/etc/kubernetes/pki/etcd/

Join the cluster

### Based on the output of the first master's initialization, the command to join as a master is
kubeadm join 192.168.204.13:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:02e4ac4dac8089fd8b15f164fa079b450050e1ec238f58a11338411789100bf0 \
    --control-plane
### Run it on the other 2 nodes, and remember to configure kubectl afterwards as well
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

What if the join token has expired?

kubeadm token create --print-join-command

kubeadm init phase upload-certs --upload-certs
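
As a sketch of how the two outputs fit together (the token, hash and certificate key below are placeholders, not real values): the first command prints a fresh worker join command, the second prints a new certificate key, and joining another control-plane node combines them:

# kubeadm join 192.168.204.13:16443 --token <new-token> \
#     --discovery-token-ca-cert-hash sha256:<hash> \
#     --control-plane --certificate-key <certificate-key>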

Check the cluster state

### Run on any node
kubectl get nodes

1.3.5 Install the calico network plugin

### Run on k8s-001
mkdir /opt/k8s-yaml
cd /opt/k8s-yaml
### The files and image archive were prepared in advance: calico.tar.gz and calico.yaml
### Copy the image archive to the other nodes
scp calico.tar.gz k8s-002:/root
scp calico.tar.gz k8s-003:/root
### Import the images on each node in the corresponding location
ctr -n=k8s.io images import calico.tar.gz
### Apply on k8s-001
kubectl apply -f calico.yaml
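
A quick status check (my addition; the label selectors assume the stock calico.yaml manifest):

kubectl get pods -n kube-system -l k8s-app=calico-node -o wide
kubectl get pods -n kube-system -l k8s-app=calico-kube-controllers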

Check the cluster status again

kubectl get nodes

1.3.6 Remove the taints

By default, master nodes do not schedule non-system pods; removing the taint lets workloads run on them.

View the taints

kubectl describe node | grep -B 3 Taints

Remove the taints

kubectl taint node -l node-role.kubernetes.io/control-plane= node-role.kubernetes.io/control-plane:NoSchedule-

Verify

Before the taint was removed, this pod stayed in a failed/pending state.

After removing the taint, it becomes healthy.

Inspecting the resource shows the effect.

1.3.7 Configure etcd high availability

Edit /etc/kubernetes/manifests/etcd.yaml on every node

## On every node, make the --initial-cluster entry list all three members; after the change the k8s components seem to restart automatically
--initial-cluster=k8s-003=https://192.168.204.12:2380,k8s-002=https://192.168.204.11:2380,k8s-001=https://192.168.204.10:2380
## Then restart kubelet
systemctl restart kubelet

Verify

docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.4-0 etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt member list

Output like the following means the configuration is correct.

docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.4-0 etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints=https://192.168.204.10:2379,https://192.168.204.11:2379,https://192.168.204.12:2379 endpoint health --cluster

1.4 Add-on installation and optimization

1.4.1 kubectl command completion

### Run on any node
yum install bash-completion
source /usr/share/bash-completion/bash_completion
bash
type _init_completion
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl > /dev/null
source ~/.bashrc
bash

1.4.2 Install metrics-server

metrics-server project page: https://github.com/kubernetes-sigs/metrics-server#readme

Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes' built-in autoscaling pipelines.

Metrics Server collects resource metrics from the kubelets and exposes them through the Metrics API in the Kubernetes apiserver, for use by the Horizontal Pod Autoscaler and the Vertical Pod Autoscaler. The Metrics API can also be queried with kubectl top, which makes it easier to debug autoscaling pipelines.

Prepare the yaml file

##### metrics-server.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --kubelet-insecure-tls
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        image: registry.cn-hangzhou.aliyuncs.com/lfy_k8s_images/metrics-server:v0.4.3
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          periodSeconds: 10
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100

Deploy it

### Run on k8s-001
kubectl apply -f metrics-server.yaml

Check the status

kubectl get pod -n kube-system

Verify

kubectl top nodes
kubectl top pods

1.4.3 Install the Dashboard UI

Prepare the yaml file

You can also download it directly: https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml

  • The upstream images for this service are hard to pull, so I mirrored them to an Alibaba Cloud registry (a public repository)

  • The two images needed are:

    • kubernetesui/metrics-scraper:v1.0.8
    • kubernetesui/dashboard:v2.7.0
  • I replaced them with copies on Alibaba Cloud:

    • registry.cn-hangzhou.aliyuncs.com/k8s_whale_images/dashboard:v2.7.0
    • registry.cn-hangzhou.aliyuncs.com/k8s_whale_images/metrics-scraper:v1.0.8
# We also need to change how the Service is exposed so that it can be reached from outside the cluster
# Set the Service type in the manifest to NodePort
# around line 32 of recommended.yaml
type: NodePort
# Then swap the images
sed -i 's#kubernetesui/metrics-scraper:v1.0.8#registry.cn-hangzhou.aliyuncs.com/k8s_whale_images/metrics-scraper:v1.0.8#g' recommended.yaml
sed -i 's#kubernetesui/dashboard:v2.7.0#registry.cn-hangzhou.aliyuncs.com/k8s_whale_images/dashboard:v2.7.0#g' recommended.yaml
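
Alternatively (a sketch, not in the original), the same exposure change can be applied after deployment with kubectl patch; the service name, ports and namespace assume the stock recommended.yaml, and nodePort 32190 matches the port used later:

kubectl -n kubernetes-dashboard patch svc kubernetes-dashboard \
  -p '{"spec":{"type":"NodePort","ports":[{"port":443,"targetPort":8443,"nodePort":32190}]}}'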

Apply it

kubectl apply -f recommended.yaml

Check the status

kubectl get pod -A

Configure access

### dashboard-admin.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard

Apply it

kubectl delete -f dashboard-admin.yaml
kubectl create -f dashboard-admin.yaml

Generate a login token

This version seems to behave differently from earlier ones; the login token is obtained as follows.

  • Create a service account

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard

  • Bind the cluster role

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard

  • Generate the token
kubectl -n kubernetes-dashboard create token admin-user

Log in and verify

## Check the exposed port
kubectl get svc -n kubernetes-dashboard
## It is port 32190 on the host

Try it in a browser: https://192.168.204.13:32190/ (the VIP address works too)

Enter the token generated above to log in.

1.5 Cluster verification

1.5.1 Node verification

kubectl get nodes

  • All nodes are in the Ready state
  • The reported version matches what was installed

1.5.2 Pod verification

kubectl get pods -A

  • The status must be Running
  • Readiness should show 1/1 (the ready count equals the total count)
  • Don't worry too much about the restart count; rebooting the VMs also increases it

1.5.3 k8s CIDR verification

kubectl get svc
kubectl get pod -A -owide

  • Check that the actual CIDRs match the plan and that there are no conflicts

  • Pods that show a host IP address do so because they run with host networking

1.5.4 Resource creation verification

### We use a debug-tools image from a domestic registry for this and the later checks; prepare a yaml file like this
apiVersion: apps/v1
kind: Deployment
metadata:
  name: debug-tools
  labels:
    app: debug-tools
spec:
  replicas: 3
  selector:
    matchLabels:
      app: debug-tools
  template:
    metadata:
      labels:
        app: debug-tools
    spec:
      containers:
      - name: debug-tools
        image: registry.cn-hangzhou.aliyuncs.com/k8s_whale_images/debug-tools:latest
        command: ["/bin/sh","-c","sleep 3600"]

Apply it

kubectl apply -f demo.yaml

Check that it deployed correctly

  • Replica count
  • Pod status
  • Inside the containers

Check that deleting the resource works

kubectl delete -f demo.yaml

  • Given the limited VM resources this can be a bit slow; what matters is that resources can be created and deleted normally

1.5.5 Pod DNS resolution verification

Verify that services resolve correctly

# Verify from one of the pods created above
# Same namespace (run inside the container)
nslookup kubernetes
# Across namespaces
nslookup kube-dns.kube-system

  • As long as an IP address is returned, it is fine

1.5.6 Node access verification

# The hosts must be able to reach the kubernetes service on port 443 and kube-dns on port 53
# Run on every node
curl -k https://10.165.0.1:443
curl 10.165.0.10:53

  • Getting a response means communication is working

1.5.7 Pod-to-pod communication

Verify both pods on the same node and pods on different nodes.

Prepare two yaml files

The first one

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2

The second one

apiVersion: apps/v1
kind: Deployment
metadata:
  name: debug-tools
  namespace: kube-system
  labels:
    app: debug-tools
spec:
  replicas: 1
  selector:
    matchLabels:
      app: debug-tools
  template:
    metadata:
      labels:
        app: debug-tools
    spec:
      containers:
      - name: debug-tools
        image: registry.cn-hangzhou.aliyuncs.com/k8s_whale_images/debug-tools:latest
        command: ["/bin/sh","-c","sleep 3600"]

Run them

  • We now have two pods running, in different namespaces

Verify pod-to-pod communication

Check each pod's IP address

Then actually ping to verify:

# Across namespaces
kubectl exec -it debug-tools-7b466cf8dc-xmhl7 -n kube-system -- ping 10.166.6.10

Within the same namespace, as sketched below
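
The original omits the exact command here; a sketch with a placeholder pod name and IP, using the debug-tools pod and the nginx pod that both live in the default namespace:

kubectl get pod -o wide                                   # note the nginx pod's IP
kubectl exec -it <debug-tools-pod-name> -- ping <nginx-pod-ip>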

With that, the whole process can be considered reasonably complete.