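Preface

Zero-downtime releases with K8s + Spring Boot: health checks + rolling updates + graceful shutdown + autoscaling + Prometheus monitoring + externalized configuration (image reuse)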

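Configuring health checks

  • Health check types: readiness probe + liveness probe
  • Probe mechanisms: exec (run a command inside the container), tcpSocket (check a port), httpGet (call an HTTP endpoint)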

Application side

Spring Boot basics are not covered here; this hands-on project is a good reference:

https://github.com/javastacks/spring-boot-best-practice

Project dependency, pom.xml

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>

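Define the port, path, and exposed endpoints in application.yaml

    management:
      server:
        port: 50000                # dedicated management port
      endpoint:                    # enable the health probes
        health:
          probes:
            enabled: true
      endpoints:
        web:
          base-path: /actuator     # context path for the endpoints
          exposure:
            include: health        # endpoints to expose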

This exposes two endpoints, /actuator/health/readiness and /actuator/health/liveness, reachable at:

http://127.0.0.1:50000/actuator/health/readiness
http://127.0.0.1:50000/actuator/health/liveness

Operations side

K8s deployment template, deployment.yaml

    apiVersion: apps/v1
    kind: Deployment
    spec:
      template:
        spec:
          containers:
          - name: {APP_NAME}
            image: {IMAGE_URL}
            imagePullPolicy: Always
            ports:
            - containerPort: {APP_PORT}
            - name: management-port
              containerPort: 50000         # application management port
            readinessProbe:                # readiness probe
              httpGet:
                path: /actuator/health/readiness
                port: management-port
              initialDelaySeconds: 30      # delay before the first probe
              periodSeconds: 10            # interval between probes
              timeoutSeconds: 1            # probe timeout
              successThreshold: 1          # consecutive successes to count as healthy
              failureThreshold: 6          # consecutive failures to count as unhealthy
            livenessProbe:                 # liveness probe
              httpGet:
                path: /actuator/health/liveness
                port: management-port
              initialDelaySeconds: 30      # delay before the first probe
              periodSeconds: 10            # interval between probes
              timeoutSeconds: 1            # probe timeout
              successThreshold: 1          # consecutive successes to count as healthy
              failureThreshold: 6          # consecutive failures to count as unhealthy
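
One way to read the probe settings is as a time budget: roughly how long a container that never becomes healthy is tolerated before the probe gives up. The sketch below is only an approximation (it ignores probe timeouts and kubelet scheduling jitter), and the class and method names are hypothetical, chosen for illustration.

```java
public final class ProbeBudget {
    private ProbeBudget() {}

    /**
     * Approximate seconds from container start until the probe has failed
     * failureThreshold times in a row, assuming every probe fails.
     */
    public static int approxSecondsUntilFailure(int initialDelaySeconds,
                                                int periodSeconds,
                                                int failureThreshold) {
        return initialDelaySeconds + periodSeconds * failureThreshold;
    }

    public static void main(String[] args) {
        // Values taken from the probe settings in the template.
        System.out.println(approxSecondsUntilFailure(30, 10, 6)); // prints 90
    }
}
```

With these values, an application that needs much longer than about 90 seconds to start risks being restarted by the liveness probe before it is ever ready, so initialDelaySeconds or failureThreshold would need to be raised.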

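Rolling updates

K8s rolling update strategy. Zero-downtime releases depend on health checks being in place, so that traffic is only routed to Pods that are ready.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: {APP_NAME}
      labels:
        app: {APP_NAME}
    spec:
      selector:
        matchLabels:
          app: {APP_NAME}
      replicas: {REPLICAS}        # number of Pod replicas
      strategy:
        type: RollingUpdate       # rolling update strategy
        rollingUpdate:
          maxSurge: 1             # max Pods above the desired replica count during the update
          maxUnavailable: 1       # max Pods that may be unavailable during the update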
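
To make the effect of maxSurge and maxUnavailable concrete, the following sketch computes the bounds a rollout stays within. It handles absolute values only (K8s also accepts percentages), and the class and method names are hypothetical.

```java
public final class RolloutBounds {
    private RolloutBounds() {}

    /** Upper bound on the total number of Pods during a rolling update. */
    public static int maxTotalPods(int replicas, int maxSurge) {
        return replicas + maxSurge;
    }

    /** Lower bound on the number of available Pods during a rolling update. */
    public static int minAvailablePods(int replicas, int maxUnavailable) {
        return Math.max(0, replicas - maxUnavailable);
    }

    public static void main(String[] args) {
        int replicas = 3; // example value for {REPLICAS}
        System.out.println(maxTotalPods(replicas, 1));     // prints 4
        System.out.println(minAvailablePods(replicas, 1)); // prints 2
    }
}
```

Note that with a single replica and maxUnavailable: 1 the lower bound is zero available Pods, so maxUnavailable: 0 is the safer setting when the goal is zero downtime.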

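Graceful shutdown

In K8s, implement application-level graceful shutdown before relying on rolling updates; otherwise a rollout can still disrupt traffic. The application should finish in-flight work, stop its threads, and release its connections before the process exits.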
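
As a rough illustration of "stop threads, then exit", the sketch below drains a plain JDK thread pool with a timeout. It is not how Spring Boot implements graceful shutdown internally; it only shows the general pattern, and the class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public final class GracefulStop {
    private GracefulStop() {}

    /**
     * Stops accepting new tasks and waits up to timeoutSeconds for running
     * tasks to finish. Returns true if the pool drained in time; otherwise
     * interrupts the remaining tasks and returns false.
     */
    public static boolean drain(ExecutorService pool, long timeoutSeconds) {
        pool.shutdown(); // reject new work, let in-flight tasks continue
        try {
            if (pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        pool.shutdownNow(); // timeout or interruption: force the rest to stop
        return false;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> {
            try {
                Thread.sleep(200); // simulated in-flight request
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("drained cleanly: " + drain(pool, 30));
    }
}
```

The timeout plays the same role as a shutdown deadline: work that finishes in time completes normally, and anything still running afterwards is interrupted.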

Application side

Project dependency, pom.xml

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>

Define the port, path, and exposed endpoints in application.yaml

    spring:
      application:
        name:
      profiles:
        active: @profileActive@
      lifecycle:
        timeout-per-shutdown-phase: 30s     # shutdown phase timeout; after 30s the application stops regardless
    server:
      port: 8080
      shutdown: graceful                    # default is IMMEDIATE (stop at once); GRACEFUL enables graceful shutdown
    management:
      server:
        port: 50000                         # dedicated management port
      endpoint:                             # enable the shutdown and health endpoints
        shutdown:
          enabled: true
        health:
          probes:
            enabled: true
      endpoints:
        web:
          base-path: /actuator              # context path for the endpoints
          exposure:
            include: health,shutdown        # endpoints to expose

This exposes the /actuator/shutdown endpoint, which can be called as follows:

curl -X POST 127.0.0.1:50000/actuator/shutdown

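Operations side

Make sure the Dockerfile template installs curl; otherwise the curl command is not available inside the container.

    FROM openjdk:8-jdk-alpine

    # Build arguments
    ARG JAR_FILE
    ARG WORK_PATH="/app"
    ARG EXPOSE_PORT=8080

    # Environment variables
    ENV JAVA_OPTS="" \
        JAR_FILE=${JAR_FILE}

    # Set the time zone
    RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo 'Asia/Shanghai' >/etc/timezone

    # Switch to a mirror and install curl
    RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories \
        && apk add --no-cache curl

    # Copy the jar from the Maven target directory into the image
    COPY target/$JAR_FILE $WORK_PATH/

    # Set the working directory
    WORKDIR $WORK_PATH

    # Port exposed to the outside
    EXPOSE $EXPOSE_PORT

    # Run the application as the container's main process
    ENTRYPOINT exec java $JAVA_OPTS -jar $JAR_FILE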

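K8s deployment template, deployment.yaml

Note: in testing, the preStop hook turned out to be optional for Java projects. With server.shutdown: graceful, Spring Boot already shuts down gracefully when it receives the SIGTERM that K8s sends on Pod termination.

If you do use the hook, the image must include curl, and the application management port (50000) must not be exposed to the public internet.

    apiVersion: apps/v1
    kind: Deployment
    spec:
      template:
        spec:
          containers:
          - name: {APP_NAME}
            image: {IMAGE_URL}
            imagePullPolicy: Always
            ports:
            - containerPort: {APP_PORT}
            - containerPort: 50000
            lifecycle:
              preStop:       # preStop hook
                exec:
                  command: ["curl", "-XPOST", "127.0.0.1:50000/actuator/shutdown"]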

Autoscaling

After setting resource requests and limits on the Pod, create an HPA.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: {APP_NAME}
      labels:
        app: {APP_NAME}
    spec:
      template:
        spec:
          containers:
          - name: {APP_NAME}
            image: {IMAGE_URL}
            imagePullPolicy: Always
            resources:                     # container resource management
              limits:                      # upper bound on resource usage
                cpu: 0.5
                memory: 1Gi
              requests:                    # minimum guaranteed resources, used for scheduling
                cpu: 0.15
                memory: 300Mi
    ---
    kind: HorizontalPodAutoscaler          # autoscaling controller
    apiVersion: autoscaling/v2beta2        # use autoscaling/v2 on K8s 1.23+; v2beta2 was removed in 1.26
    metadata:
      name: {APP_NAME}
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: {APP_NAME}
      minReplicas: {REPLICAS}              # scaling range
      maxReplicas: 6
      metrics:
        - type: Resource
          resource:
            name: cpu                      # resource metric to scale on
            target:
              type: Utilization
              averageUtilization: 50
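
To see how the HPA reacts to load, the sketch below applies the scaling formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the min/max range. It is a simplification: the real controller also applies a tolerance band and stabilization windows, which are ignored here. The class and method names are hypothetical. Utilization is measured against the container's CPU request, not its limit.

```java
public final class HpaEstimate {
    private HpaEstimate() {}

    /**
     * Estimates the replica count the HPA would aim for, given the current
     * replica count and the average CPU utilization (percent of request).
     */
    public static int desiredReplicas(int currentReplicas,
                                      int currentUtilization,
                                      int targetUtilization,
                                      int minReplicas,
                                      int maxReplicas) {
        int raw = (int) Math.ceil(
                currentReplicas * (double) currentUtilization / targetUtilization);
        // Clamp to the range configured by minReplicas and maxReplicas.
        return Math.max(minReplicas, Math.min(maxReplicas, raw));
    }

    public static void main(String[] args) {
        // Target of 50% and maxReplicas of 6 as in the template;
        // minReplicas of 2 is an example value for {REPLICAS}.
        System.out.println(desiredReplicas(2, 100, 50, 2, 6)); // prints 4
        System.out.println(desiredReplicas(5, 90, 50, 2, 6));  // prints 6 (capped)
    }
}
```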

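Prometheus integration

Application side

Project dependencies, pom.xml

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>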

Define the port, path, and exposed endpoints in application.yaml

    management:
      server:
        port: 50000                         # dedicated management port
      metrics:
        tags:
          application: ${spring.application.name}
      endpoints:
        web:
          base-path: /actuator              # context path for the endpoints
          exposure:
            include: metrics,prometheus     # endpoints to expose

This exposes the /actuator/metrics and /actuator/prometheus endpoints, reachable at:

http://127.0.0.1:50000/actuator/metrics
http://127.0.0.1:50000/actuator/prometheus

Operations side

deployment.yaml

    apiVersion: apps/v1
    kind: Deployment
    spec:
      template:
        metadata:
          annotations:
            prometheus.io/port: "50000"
            prometheus.io/path: /actuator/prometheus  # assigned in the pipeline
            prometheus.io/scrape: "true"              # pod-based service discovery

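Externalized configuration

Approach: mount an external configuration file through a ConfigMap and start the application with the matching profile activated.

Benefits: configuration lives outside the image, which avoids leaking sensitive information, and the same image can be reused across environments, which speeds up delivery.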

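Generate the ConfigMap from a file

    # Generate the YAML file with a client-side dry run
    kubectl create cm -n <namespace> <configmap-name> --from-file=application-test.yaml --dry-run=client -o yaml > configmap.yaml
    # Apply the update
    kubectl apply -f configmap.yaml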

Mount the ConfigMap and select the active profile

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: {APP_NAME}
      labels:
        app: {APP_NAME}
    spec:
      template:
        spec:
          containers:
          - name: {APP_NAME}
            image: {IMAGE_URL}
            imagePullPolicy: Always
            env:
              - name: SPRING_PROFILES_ACTIVE   # active profile
                value: test
            volumeMounts:                      # mount the ConfigMap
            - name: conf
              mountPath: "/app/config"         # config/ under the Dockerfile's working directory
              readOnly: true
          volumes:
          - name: conf
            configMap:
              name: {APP_NAME}

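Consolidated configuration

Application side

Project dependencies, pom.xml

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>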

Define the port, path, and exposed endpoints in application.yaml

    spring:
      application:
        name: project-sample
      profiles:
        active: @profileActive@
      lifecycle:
        timeout-per-shutdown-phase: 30s     # shutdown phase timeout; after 30s the application stops regardless
    server:
      port: 8080
      shutdown: graceful                    # default is IMMEDIATE (stop at once); GRACEFUL enables graceful shutdown
    management:
      server:
        port: 50000                         # dedicated management port
      metrics:
        tags:
          application: ${spring.application.name}
      endpoint:                             # enable the shutdown and health endpoints
        shutdown:
          enabled: true
        health:
          probes:
            enabled: true
      endpoints:
        web:
          base-path: /actuator              # context path for the endpoints
          exposure:
            include: health,shutdown,metrics,prometheus

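Operations side

Make sure the Dockerfile template installs curl; otherwise the curl command is not available inside the container.

    FROM openjdk:8-jdk-alpine

    # Build arguments
    ARG JAR_FILE
    ARG WORK_PATH="/app"
    ARG EXPOSE_PORT=8080

    # Environment variables
    ENV JAVA_OPTS="" \
        JAR_FILE=${JAR_FILE}

    # Set the time zone
    RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo 'Asia/Shanghai' >/etc/timezone

    # Switch to a mirror and install curl
    RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories \
        && apk add --no-cache curl

    # Copy the jar from the Maven target directory into the image
    COPY target/$JAR_FILE $WORK_PATH/

    # Set the working directory
    WORKDIR $WORK_PATH

    # Port exposed to the outside
    EXPOSE $EXPOSE_PORT

    # Run the application as the container's main process
    ENTRYPOINT exec java $JAVA_OPTS -jar $JAR_FILE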

K8s deployment template, deployment.yaml

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: {APP_NAME}
      labels:
        app: {APP_NAME}
    spec:
      selector:
        matchLabels:
          app: {APP_NAME}
      replicas: {REPLICAS}                            # number of Pod replicas
      strategy:
        type: RollingUpdate                           # rolling update strategy
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 0
      template:
        metadata:
          name: {APP_NAME}
          labels:
            app: {APP_NAME}
          annotations:
            timestamp: {TIMESTAMP}
            prometheus.io/port: "50000"               # cannot be assigned dynamically
            prometheus.io/path: /actuator/prometheus
            prometheus.io/scrape: "true"              # pod-based service discovery
        spec:
          affinity:                                   # scheduling policy: spread across hosts / availability zones
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 100
                podAffinityTerm:
                  labelSelector:
                    matchExpressions:
                    - key: app
                      operator: In
                      values:
                      - {APP_NAME}
                  topologyKey: "kubernetes.io/hostname" # use "topology.kubernetes.io/zone" for multiple zones
          terminationGracePeriodSeconds: 30             # grace period for termination
          containers:
          - name: {APP_NAME}
            image: {IMAGE_URL}
            imagePullPolicy: Always
            ports:
            - containerPort: {APP_PORT}
            - name: management-port
              containerPort: 50000         # application management port
            readinessProbe:                # readiness probe
              httpGet:
                path: /actuator/health/readiness
                port: management-port
              initialDelaySeconds: 30      # delay before the first probe
              periodSeconds: 10            # interval between probes
              timeoutSeconds: 1            # probe timeout
              successThreshold: 1          # consecutive successes to count as healthy
              failureThreshold: 9          # consecutive failures to count as unhealthy
            livenessProbe:                 # liveness probe
              httpGet:
                path: /actuator/health/liveness
                port: management-port
              initialDelaySeconds: 30      # delay before the first probe
              periodSeconds: 10            # interval between probes
              timeoutSeconds: 1            # probe timeout
              successThreshold: 1          # consecutive successes to count as healthy
              failureThreshold: 6          # consecutive failures to count as unhealthy
            resources:                     # container resource management
              limits:                      # upper bound on resource usage
                cpu: 0.5
                memory: 1Gi
              requests:                    # minimum guaranteed resources, used for scheduling
                cpu: 0.1
                memory: 200Mi
            env:
              - name: TZ
                value: Asia/Shanghai
    ---
    kind: HorizontalPodAutoscaler          # autoscaling controller
    apiVersion: autoscaling/v2beta2        # use autoscaling/v2 on K8s 1.23+; v2beta2 was removed in 1.26
    metadata:
      name: {APP_NAME}
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: {APP_NAME}
      minReplicas: {REPLICAS}              # scaling range
      maxReplicas: 6
      metrics:
        - type: Resource
          resource:
            name: cpu                      # resource metric to scale on
            target:
              type: Utilization
              averageUtilization: 50
