Table of Contents
- 2.1 metrics-server Overview
- 2.2 metrics-server Architecture
- 2.3 metrics-server Deployment
- 2.4 metrics-server API Testing
- 3.1 HPA Overview
- 3.2 HPA in Practice
1. Monitoring Architecture Overview #
Kubernetes monitoring metrics broadly fall into two categories: core metrics and custom metrics. Core metrics are the stable, built-in Kubernetes metrics; they were originally collected by Heapster and are now provided by metrics-server. Custom metrics extend the core set with richer data such as application-level metrics; they are integrated with the Kubernetes API through the Aggregator, and today the mainstream implementation is Prometheus.
What the metrics are used for:
- kubectl top: view CPU and memory usage of nodes and pods
- kubernetes-dashboard: node and pod resource monitoring in the console
- Horizontal Pod Autoscaler: horizontal autoscaling of workloads
- Scheduler: input for scheduling decisions
2. metrics-server Architecture and Installation #
2.1 metrics-server Overview #
Metrics Server is a cluster-wide aggregator of resource usage data. Resource metrics are used by components like kubectl top and the Horizontal Pod Autoscaler to scale workloads. To autoscale based upon a custom metric, you need to use the Prometheus Adapter.
In other words, metrics-server is a cluster-level collector of resource metrics:
- provides query interfaces for basic resources such as CPU and memory;
- registers its API with kube-apiserver through the Kubernetes aggregator;
- exposes the data externally through the Metrics API;
- custom metrics additionally require Prometheus.
The Metrics API
- /nodes: metrics for all nodes, resource kind NodeMetrics
- /nodes/<node_name>: metrics for a specific node
- /namespaces/{namespace}/pods: metrics for all pods in a namespace
- /namespaces/{namespace}/pods/{pod}: metrics for a specific pod, resource kind PodMetrics
In the future the API is expected to support aggregations such as max, min and 95th percentile, as well as custom time windows such as 1h, 1d or 1w.
2.2 metrics-server Architecture #
(figure: Kubernetes monitoring architecture diagram)
The monitoring architecture has two parts: core monitoring (the white part of the figure) and custom monitoring (the blue part of the figure).
1. Core monitoring
- the kubelet on each node collects resource and usage estimates
- metrics-server collects the data but does not store it
- metrics-server exposes the data through the Metrics API
- core metrics are consumed by HPA, kubectl top, the scheduler and the dashboard
2. Custom monitoring
- custom metrics include both monitoring metrics and service metrics
- an agent on every node reports to a cluster monitoring agent such as Prometheus
- the cluster monitoring agent converts the collected monitoring and service metrics through an API adapter into a form the apiserver can serve
- HPA uses custom metrics for richer autoscaling, which requires an additional translation step through the HPA adapter API
2.3 metrics-server Deployment #
1. Fetch the metrics-server installation files. There are two flavors, 1.7 and 1.8+: install the 1.7 manifests on Kubernetes 1.7 and the 1.8+ manifests on Kubernetes 1.8 or later.
[root@node-1 ~]# git clone https://github.com/kubernetes-sigs/metrics-server.git
2. Deploy metrics-server using the 1.8+ manifests.
[root@node-1 metrics-server]# kubectl apply -f deploy/1.8+/
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
The core manifest is metrics-server-deployment.yaml, which runs metrics-server in the cluster as a Deployment. The image k8s.gcr.io/metrics-server-amd64:v0.3.6 needs to be pulled in advance. The manifest looks like this:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.6
        args:
          - --cert-dir=/tmp
          - --secure-port=4443
        ports:
        - name: main-port
          containerPort: 4443
          protocol: TCP
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        imagePullPolicy: Always
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
      nodeSelector:
        beta.kubernetes.io/os: linux
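If the nodes cannot pull from k8s.gcr.io directly, a common workaround is to pull the image from a reachable mirror registry and retag it locally. A minimal sketch, where <mirror-registry> is only a placeholder and must be replaced with a registry that is actually reachable from your environment:

docker pull <mirror-registry>/metrics-server-amd64:v0.3.6        # placeholder mirror path
docker tag <mirror-registry>/metrics-server-amd64:v0.3.6 k8s.gcr.io/metrics-server-amd64:v0.3.6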
3. Check the metrics-server deployment; the metrics-server Pod should be running.
[root@node-1 1.8+]# kubectl get deployments metrics-server -n kube-system
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
metrics-server   1/1     1            1           2m49s
[root@node-1 1.8+]# kubectl get pods -n kube-system metrics-server-67db467b7b-5xf8x
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-67db467b7b-5xf8x   1/1     Running   0          3m
At this point metrics-server is not actually usable yet: kubectl top node returns an error like Error from server (NotFound): nodemetrics.metrics.k8s.io "node-1" not found. The metrics-server pod log shows the following:
[root@node-1 1.8+]# kubectl logs metrics-server-67db467b7b-5xf8x -n kube-system -f
I1230 11:34:10.905500 1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I1230 11:34:11.527346 1 secure_serving.go:116] Serving securely on [::]:4443
E1230 11:35:11.552067 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:node-1: unable to fetch metrics from Kubelet node-1 (node-1): Get https://node-1:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup node-1 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:node-3: unable to fetch metrics from Kubelet node-3 (node-3): Get https://node-3:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup node-3 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:node-2: unable to fetch metrics from Kubelet node-2 (node-2): Get https://node-2:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup node-2 on 10.96.0.10:53: no such host]
4. The error above says the pod cannot resolve the node hostnames through DNS. This can be fixed either by defining a hosts file inside the pod or by telling metrics-server to prefer node IP addresses. Modify the metrics-server Deployment accordingly and re-apply it (a sketch of the change follows).
(figure: modified metrics-server Deployment)
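One way to make metrics-server contact kubelets by IP instead of hostname is to add the --kubelet-preferred-address-types argument. A minimal sketch of the container args after the edit, assuming the rest of the Deployment stays as shown above:

        args:
          - --cert-dir=/tmp
          - --secure-port=4443
          - --kubelet-preferred-address-types=InternalIP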
5. After re-applying the manifest a new pod is created; its log now shows a different error:
[root@node-1 1.8+]# kubectl logs metrics-server-f54f5d6bf-s42rc -n kube-system -f
I1230 11:45:26.615547 1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I1230 11:45:27.043723 1 secure_serving.go:116] Serving securely on [::]:4443

E1230 11:46:27.065274 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:node-2: unable to fetch metrics from Kubelet node-2 (10.254.100.102): Get https://10.254.100.102:10250/stats/summary?only_cpu_and_memory=true: x509: cannot validate certificate for 10.254.100.102 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:node-1: unable to fetch metrics from Kubelet node-1 (10.254.100.101): Get https://10.254.100.101:10250/stats/summary?only_cpu_and_memory=true: x509: cannot validate certificate for 10.254.100.101 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:node-3: unable to fetch metrics from Kubelet node-3 (10.254.100.103): Get https://10.254.100.103:10250/stats/summary?only_cpu_and_memory=true: x509: cannot validate certificate for 10.254.100.103 because it doesn't contain any IP SANs]
6. Modify the metrics-server Deployment again and add the --kubelet-insecure-tls argument (sketch below).
(figure: metrics-server Deployment with --kubelet-insecure-tls added)
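A minimal sketch of the resulting args block, keeping the address-type change from the previous step:

        args:
          - --cert-dir=/tmp
          - --secure-port=4443
          - --kubelet-preferred-address-types=InternalIP
          - --kubelet-insecure-tls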
After re-deploying once more there are no errors; within a few minutes data is reported to metrics-server and can be verified with kubectl top.
2.4 metrics-server API Testing #
1. Installing metrics-server adds a metrics.k8s.io/v1beta1 API group, which is registered with the apiserver through the Aggregator.
(figure: metrics.k8s.io/v1beta1 API group registered with the apiserver)
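The registration can also be confirmed from the command line; the exact output depends on your cluster:

[root@node-1 ~]# kubectl api-versions | grep metrics.k8s.io
[root@node-1 ~]# kubectl get apiservices v1beta1.metrics.k8s.io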
2. Use kubectl top node to see node CPU and memory utilization:
[root@node-1 1.8+]# kubectl top node
NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-1   110m         5%     4127Mi          53%
node-2   53m          5%     1066Mi          61%
node-3   34m          3%     1002Mi          57%
3. Use kubectl top pods to see pod CPU and memory usage:
[root@node-1 1.8+]# kubectl top pods
NAME                                   CPU(cores)   MEMORY(bytes)
haproxy-1-686c67b997-kw8pp             0m           1Mi
haproxy-2-689b4f897-7cwmf              0m           1Mi
haproxy-ingress-demo-5d487d4fc-5pgjt   0m           1Mi
haproxy-ingress-demo-5d487d4fc-pst2q   0m           1Mi
haproxy-ingress-demo-5d487d4fc-sr8tm   0m           1Mi
ingress-demo-d77bdf4df-7kwbj           0m           1Mi
ingress-demo-d77bdf4df-7x6jn           0m           1Mi
ingress-demo-d77bdf4df-hr88b           0m           1Mi
ingress-demo-d77bdf4df-wc22k           0m           1Mi
service-1-7b66bf758f-xj9jh             0m           2Mi
service-2-7c7444684d-w9cv9             1m           3Mi
4. Besides the command line, the monitoring data can also be retrieved directly from metrics-server through the API. The available endpoints are:
- http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes
- http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes/<node_name>
- http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/pods
- http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/namespaces/<namespace>/pods/<pod_name>
The following steps exercise these endpoints:
a. Start a kubectl proxy to the apiserver; by default it listens on 127.0.0.1:8001.
[root@node-1 ~]# kubectl proxy
Starting to serve on 127.0.0.1:8001

b. Query the node list; the usage field of every node contains cpu and memory.
[root@node-1 ~]# curl http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1167  100  1167    0     0   393k      0 --:--:-- --:--:-- --:--:--  569k
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
  },
  "items": [
    {
      "metadata": {
        "name": "node-3",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node-3",
        "creationTimestamp": "2019-12-30T14:23:00Z"
      },
      "timestamp": "2019-12-30T14:22:07Z",
      "window": "30s",
      "usage": {
        "cpu": "32868032n",
        "memory": "1027108Ki"
      }
    },
    {
      "metadata": {
        "name": "node-1",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node-1",
        "creationTimestamp": "2019-12-30T14:23:00Z"
      },
      "timestamp": "2019-12-30T14:22:07Z",
      "window": "30s",
      "usage": {
        "cpu": "108639556n",
        "memory": "4305356Ki"
      }
    },
    {
      "metadata": {
        "name": "node-2",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node-2",
        "creationTimestamp": "2019-12-30T14:23:00Z"
      },
      "timestamp": "2019-12-30T14:22:12Z",
      "window": "30s",
      "usage": {
        "cpu": "47607386n",
        "memory": "1119960Ki"
      }
    }
  ]
}

c. Query a specific node to get that node's metrics.
[root@node-1 ~]# curl http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes/node-2
{
  "kind": "NodeMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "node-2",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node-2",
    "creationTimestamp": "2019-12-30T14:24:39Z"
  },
  "timestamp": "2019-12-30T14:24:12Z",
  "window": "30s",
  "usage": {
    "cpu": "43027609n",
    "memory": "1120168Ki"
  }
}

d. List metrics for all pods.
curl http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/pods

e. Query a specific pod's metrics.
[root@node-1 ~]# curl http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/haproxy-ingress-demo-5d487d4fc-sr8tm
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "haproxy-ingress-demo-5d487d4fc-sr8tm",
    "namespace": "default",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/haproxy-ingress-demo-5d487d4fc-sr8tm",
    "creationTimestamp": "2019-12-30T14:36:30Z"
  },
  "timestamp": "2019-12-30T14:36:13Z",
  "window": "30s",
  "containers": [
    {
      "name": "haproxy-ingress-demo",
      "usage": {
        "cpu": "0",
        "memory": "1428Ki"
      }
    }
  ]
}
5. The same data can also be fetched with kubectl get --raw, for example for node-3:
[root@node-1 ~]# kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes/node-3 | jq .
{
  "kind": "NodeMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "node-3",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node-3",
    "creationTimestamp": "2019-12-30T14:44:46Z"
  },
  "timestamp": "2019-12-30T14:44:09Z",
  "window": "30s",
  "usage": {
    "cpu": "35650151n",
    "memory": "1026820Ki"
  }
}
Other similar endpoints are:
- kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes (all nodes)
- kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes/<node_name> (a specific node)
- kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods (all pods)
- kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods/haproxy-ingress-demo-5d487d4fc-sr8tm (a specific pod)
3. HPA Horizontal Autoscaling #
3.1 HPA Overview #
The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics). Note that Horizontal Pod Autoscaling does not apply to objects that can’t be scaled, for example, DaemonSets.
(figure: HPA control loop)
HPA, the Horizontal Pod Autoscaler, scales Pods horizontally: based on observed resource usage it increases or decreases the number of Pod replicas. Its mechanism:
- HPA depends on a monitoring component and queries monitoring data, for example through the Metrics API
- HPA is a second-level controller built on top of replica controllers such as Deployments, ReplicaSets and StatefulSets
- HPA comes in two API versions, v1 and v2alpha1, depending on the metrics it consumes
- HPA v1 uses core resource metrics such as CPU and memory utilization, obtained from the metrics-server API
- HPA v2 uses custom metrics, typically obtained through Prometheus
- HPA periodically adjusts the replica count; the check interval is defined by horizontal-pod-autoscaler-sync-period, 15s by default
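The desired replica count follows the algorithm documented for the HPA controller:

desiredReplicas = ceil( currentReplicas * currentMetricValue / desiredMetricValue )

For example, with 2 replicas, a CPU target of 80% and a measured utilization of 200% of requests, the controller asks for ceil(2 * 200 / 80) = 5 replicas, bounded by minReplicas and maxReplicas.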
3.2 HPA in Practice #
The following demonstrates HPA in action: first create a Deployment, then define an HPA policy that scales out when CPU utilization exceeds 80% of the requested CPU.
1. Create the Deployment.
[root@node-1 ~]# kubectl run hpa-demo --image=nginx:1.7.9 --port=80 --replicas=1 --expose=true --requests="cpu=200m,memory=64Mi"

[root@node-1 ~]# kubectl get deployments hpa-demo -o yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2019-12-31T01:43:24Z"
  generation: 1
  labels:
    run: hpa-demo
  name: hpa-demo
  namespace: default
  resourceVersion: "14451208"
  selfLink: /apis/extensions/v1beta1/namespaces/default/deployments/hpa-demo
  uid: 3b0f29e8-8606-4e52-8f5b-6c960d396136
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      run: hpa-demo
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: hpa-demo
    spec:
      containers:
      - image: nginx:1.7.9
        imagePullPolicy: IfNotPresent
        name: hpa-demo
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          requests:
            cpu: 200m
            memory: 64Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2019-12-31T01:43:25Z"
    lastUpdateTime: "2019-12-31T01:43:25Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2019-12-31T01:43:24Z"
    lastUpdateTime: "2019-12-31T01:43:25Z"
    message: ReplicaSet "hpa-demo-755bdd875c" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
2. Create the HPA object: scale on CPU with at least 2 and at most 5 Pods. targetCPUUtilizationPercentage is the actual CPU usage expressed as a percentage of the requested CPU.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo
spec:
  maxReplicas: 5
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo
  targetCPUUtilizationPercentage: 80
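Roughly the same policy can also be created imperatively instead of from a manifest; the steps below, however, apply the manifest shown above:

[root@node-1 ~]# kubectl autoscale deployment hpa-demo --min=2 --max=5 --cpu-percent=80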
3. Apply the HPA rule and inspect it. Since the policy requires at least 2 replicas and the Deployment currently has only 1, a scale-up is needed; the events show the replica count being raised to 2.
[root@node-1 ~]# kubectl apply -f hpa-demo.yaml
horizontalpodautoscaler.autoscaling/hpa-demo created

#list the HPAs
[root@node-1 ~]# kubectl get horizontalpodautoscalers.autoscaling
NAME       REFERENCE             TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
hpa-demo   Deployment/hpa-demo   <unknown>/80%   2         5         0          7s

#describe the HPA
[root@node-1 ~]# kubectl describe horizontalpodautoscalers.autoscaling hpa-demo
Name: hpa-demo
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
  {"apiVersion":"autoscaling/v1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-demo","namespace":"default"},"spe...
CreationTimestamp: Tue, 31 Dec 2019 09:52:51 +0800
Reference: Deployment/hpa-demo
Metrics: ( current / target )
  resource cpu on pods (as a percentage of request): <unknown> / 80%
Min replicas: 2
Max replicas: 5
Deployment pods: 1 current / 2 desired
Conditions:
  Type Status Reason Message
  ---- ------ ------ -------
  AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 2
Events:
  Type Reason Age From Message
  ---- ------ ---- ---- -------
  Normal SuccessfulRescale 1s horizontal-pod-autoscaler New size: 2; reason: Current number of replicas below Spec.MinReplicas   #scaled to 2 replicas to satisfy MinReplicas
4. Check the Deployment to confirm that the scale-up has reached the HPA minimum.
[root@node-1 ~]# kubectl get deployments hpa-demo --show-labels
NAME       READY   UP-TO-DATE   AVAILABLE   AGE   LABELS
hpa-demo   2/2     2            2           94m   run=hpa-demo

[root@node-1 ~]# kubectl get pods -l run=hpa-demo
NAME                        READY   STATUS    RESTARTS   AGE
hpa-demo-5fcd9c757d-7q4td   1/1     Running   0          5m10s
hpa-demo-5fcd9c757d-cq6k6   1/1     Running   0          10m
5. When load grows and CPU utilization rises, HPA automatically adds Pods. The following CPU stress test demonstrates the scale-out.
[root@node-1 ~]# kubectl exec -it hpa-demo-5fcd9c757d-cq6k6 /bin/bash
root@hpa-demo-5fcd9c757d-cq6k6:/# dd if=/dev/zero of=/dev/null

#describe the HPA again: it reports a scale-out with the reason "cpu resource utilization (percentage of request) above target", i.e. CPU usage exceeded the configured percentage of requests
[root@node-1 ~]# kubectl describe horizontalpodautoscalers.autoscaling hpa-demo
Name: hpa-demo
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
  {"apiVersion":"autoscaling/v1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-demo","namespace":"default"},"spe...
CreationTimestamp: Tue, 31 Dec 2019 09:52:51 +0800
Reference: Deployment/hpa-demo
Metrics: ( current / target )
  resource cpu on pods (as a percentage of request): 99% (199m) / 80%
Min replicas: 2
Max replicas: 5
Deployment pods: 5 current / 5 desired
Conditions:
  Type Status Reason Message
  ---- ------ ------ -------
  AbleToScale True ReadyForNewScale recommended size matches current size
  ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited True TooManyReplicas the desired replica count is more than the maximum replica count
Events:
  Type Reason Age From Message
  ---- ------ ---- ---- -------
  Normal SuccessfulRescale 8m2s horizontal-pod-autoscaler New size: 4; reason: cpu resource utilization (percentage of request) above target

#check the replica count to confirm the scale-out; it has grown to 5
[root@node-1 ~]# kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
hpa-demo-5fcd9c757d-7q4td   1/1     Running   0          16m
hpa-demo-5fcd9c757d-cq6k6   1/1     Running   0          21m
hpa-demo-5fcd9c757d-jmb6w   1/1     Running   0          16m
hpa-demo-5fcd9c757d-lpxk8   1/1     Running   0          16m
hpa-demo-5fcd9c757d-zs6cg   1/1     Running   0          21m
6. Stop the CPU stress test; HPA automatically scales the replica count back down until the policy is satisfied.
[root@node-1 ~]# kubectl describe horizontalpodautoscalers.autoscaling hpa-demo
Name: hpa-demo
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
  {"apiVersion":"autoscaling/v1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-demo","namespace":"default"},"spe...
CreationTimestamp: Tue, 31 Dec 2019 09:52:51 +0800
Reference: Deployment/hpa-demo
Metrics: ( current / target )
  resource cpu on pods (as a percentage of request): 0% (0) / 80%
Min replicas: 2
Max replicas: 5
Deployment pods: 2 current / 2 desired
Conditions:
  Type Status Reason Message
  ---- ------ ------ -------
  AbleToScale True ReadyForNewScale recommended size matches current size
  ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited True TooFewReplicas the desired replica count is increasing faster than the maximum scale rate
Events:
  Type Reason Age From Message
  ---- ------ ---- ---- -------
  Normal SuccessfulRescale 18m horizontal-pod-autoscaler New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal SuccessfulRescale 113s horizontal-pod-autoscaler New size: 2; reason: All metrics below target   #scaled back down to 2 replicas

#confirm the replica count; it is back at the minimum of 2
[root@node-1 ~]# kubectl get pods -l run=hpa-demo
NAME                        READY   STATUS    RESTARTS   AGE
hpa-demo-5fcd9c757d-cq6k6   1/1     Running   0          24m
hpa-demo-5fcd9c757d-zs6cg   1/1     Running   0          24m
The example above shows that HPA can scale horizontally based on the monitoring data exposed through the metrics-server API, growing and shrinking the workload with CPU usage to keep the service available. HPA v1 can only scale on CPU utilization relative to requests, which is fairly limited; richer behavior comes with HPA v2, which builds on several APIs:
- metrics.k8s.io: resource metrics provided by metrics-server, i.e. CPU and memory for nodes and pods;
- custom.metrics.k8s.io: custom metrics, integrated with kube-apiserver through an adapter, e.g. for Prometheus;
- external.metrics.k8s.io: external metrics, similar to custom metrics and likewise integrated through an adapter.
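As an illustration only (not part of this deployment), an HPA consuming a custom per-pod metric through custom.metrics.k8s.io might look like the sketch below; it assumes a Prometheus adapter is already installed and exposes a hypothetical metric named http_requests_per_second:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo-custom
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # hypothetical metric exposed by the adapter
      target:
        type: AverageValue
        averageValue: "100"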
References #
Resource metrics pipeline: https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/
Official deployment guide: https://github.com/kubernetes-sigs/metrics-server