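Starting with version 1.1, Kubernetes added a controller named Horizontal Pod Autoscaler (HPA), which automatically scales the number of Pods based on CPU utilization. The HPA controller periodically checks the resource metrics of the target Pods at the interval defined by the kube-controller-manager startup flag --horizontal-pod-autoscaler-sync-period on the Master (default 15s), compares them against the scaling conditions in the HPA resource object, and adjusts the Pod replica count when the conditions are met. In early versions, Kubernetes could only autoscale based on Pod CPU utilization, with the CPU data coming from the Heapster component. Starting with version 1.6, Kubernetes introduced an HPA mechanism based on application-defined custom metrics, which gradually matured after version 1.9.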
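A Metrics Server in the cluster (Heapster or a custom Metrics Server) continuously collects metrics from all Pod replicas. The HPA controller fetches this data through the Metrics Server's API (the Heapster API or an aggregated API), applies the user-defined scaling rules, and computes the target Pod replica count. When the target replica count differs from the current one, the HPA controller issues a scale operation to the Pods' replica controller (Deployment, RC, or ReplicaSet) to adjust the replica count and complete the scaling. As shown in the figure below: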
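Starting with version 1.11, Kubernetes deprecated Heapster-based collection of Pod CPU utilization and moved fully to Metrics Server for data collection. The collected Pod metrics are made available to the HPA controller through aggregated APIs: metrics.k8s.io (served by Metrics Server), plus custom.metrics.k8s.io and external.metrics.k8s.io (served by metrics adapters).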
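The replica calculation described above can be illustrated with a small sketch. It assumes the rule given in the Kubernetes HPA documentation, desired = ceil(currentReplicas * currentMetric / targetMetric), with the default tolerance of 0.1 and clamping to the min/max replica bounds. The function name and parameters are hypothetical, and the real controller applies further safeguards (missing metrics, not-ready Pods, scale-up limits) that are left out here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified sketch of the documented HPA replica rule (not the controller's code)."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        # Multiply before dividing to avoid avoidable floating-point drift.
        desired = math.ceil(current_replicas * current_utilization / target_utilization)
    # Never go outside the bounds configured on the HPA object.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # One Pod at 35% utilization against a 10% target, bounds 1..5.
    print(desired_replicas(1, 35, 10, 1, 5))
```

With these inputs the raw result is ceil(3.5) = 4; a larger overshoot would be capped at the configured maximum.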
Next, create a Deployment:
apiVersion: apps/v1  # extensions/v1beta1 no longer serves Deployments as of Kubernetes 1.16
kind: Deployment
metadata:
  name: mty-production-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mty-production-api
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: mty-production-api
    spec:
      containers:
      - image: harbor.ysmty.com:19999/onair/mty-production-api:202007151447-3.5.2-b9a7f09
        imagePullPolicy: IfNotPresent
        name: mty-production-api
        resources:
          limits:
            cpu: 4
            memory: 4Gi
          requests:
            cpu: 100m
            memory: 128Mi
        volumeMounts:
        - mountPath: /usr/local/mty-production-api/logs
          name: log-pv
          subPath: mty-production-api
      imagePullSecrets:
      - name: mima
      restartPolicy: Always
      volumes:
      - name: log-pv
        persistentVolumeClaim:
          claimName: log-pv
Apply this YAML file and the Deployment's Pod starts up; at this point only one Pod should be running. Next, use HPA to scale out dynamically based on CPU:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo
  namespace: default
spec:
  maxReplicas: 5
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mty-production-api
  targetCPUUtilizationPercentage: 10
status:
  currentReplicas: 1
  desiredReplicas: 0
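The targetCPUUtilizationPercentage value is measured against the container's CPU request, not its limit. With the 100m request from the Deployment above, a 10% target means an average of about 10m of CPU per Pod. The sketch below shows that arithmetic under simplifying assumptions: the helper names are hypothetical, every Pod has the same request, and only plain-core ("4", "0.5") and millicore ("100m") quantities are handled.

```python
def parse_cpu_millicores(quantity):
    """Convert a CPU quantity such as '100m', '4' or '0.5' to millicores."""
    quantity = str(quantity).strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # Plain numbers are whole or fractional cores.
    return int(round(float(quantity) * 1000))


def average_utilization_percent(usages, request):
    """Average CPU usage across Pods as an integer percentage of the per-Pod request."""
    request_m = parse_cpu_millicores(request)
    usage_m = [parse_cpu_millicores(u) for u in usages]
    # Total usage over total requested CPU, truncated to a whole percent.
    return (100 * sum(usage_m)) // (request_m * len(usage_m))


if __name__ == "__main__":
    # Five Pods each using 8m of CPU against a 100m request.
    print(average_utilization_percent(["8m"] * 5, "100m"))
```

Because the request here is only 100m while the limit is 4 cores, even light traffic pushes utilization well past 10%, which is why this demo scales out so readily.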
Once that is done, apply the YAML file and check the HPA resource:
# kubectl get hpa
NAME       REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-demo   Deployment/mty-production-api   8%/10%    1         5         5          28m
Run a quick test with a simple load-testing tool:
ab -n 10000 -c 10 http://172.17.58.255:8080/api/healthy/check
Then check the number of Pods again:
# kubectl get pod | grep mty-production-api
mty-production-api-596dfc85c4-599xj   1/1   Running   0   28m
mty-production-api-596dfc85c4-922p4   1/1   Running   0   27m
mty-production-api-596dfc85c4-b6zcx   1/1   Running   0   27m
mty-production-api-596dfc85c4-cqdz2   1/1   Running   0   12d
mty-production-api-596dfc85c4-fmk5w   1/1   Running   0   27m
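As you can see, four additional Pods have been started (five in total), which shows that the HPA has taken effect. Check the HPA details: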
# kubectl describe hpa hpa-demo
Name:                                                  hpa-demo
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           kubectl.kubernetes.io/last-applied-configuration:
                                                         {"apiVersion":"autoscaling/v1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-demo","namespace":"default"},"spe...
CreationTimestamp:                                     Mon, 03 Aug 2020 23:20:50 +0800
Reference:                                             Deployment/mty-production-api
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  8% (8m) / 10%
Min replicas:                                          1
Max replicas:                                          5
Deployment pods:                                       5 current / 5 desired
Conditions:
  Type            Status  Reason               Message
  ----            ------  ------               -------
  AbleToScale     True    ScaleDownStabilized  recent recommendations were higher than current one, applying the highest recent recommendation
  ScalingActive   True    ValidMetricFound     the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooManyReplicas      the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  29m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  28m   horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  28m   horizontal-pod-autoscaler  New size: 5; reason: cpu resource utilization (percentage of request) above target
Once the load test stops, the Pod count should drop back to one after a short delay.
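The delay before scaling down matches the ScaleDownStabilized condition in the describe output: the controller keeps recent replica recommendations and applies the highest one within a stabilization window, which defaults to 5 minutes (kube-controller-manager flag --horizontal-pod-autoscaler-downscale-stabilization). The following is a rough sketch of that idea only; the function name, the history format, and the sample timestamps are assumptions made for illustration.

```python
def stabilized_recommendation(history, now, current_recommendation, window_seconds=300):
    """Return the highest recommendation seen within the stabilization window.

    history: list of (timestamp_seconds, recommended_replicas) pairs.
    """
    recent = [replicas for ts, replicas in history if now - ts <= window_seconds]
    # The newest recommendation always takes part in the comparison.
    return max(recent + [current_recommendation])


if __name__ == "__main__":
    # Hypothetical history: load stopped shortly before t=200.
    history = [(0, 5), (60, 5), (200, 1)]
    print(stabilized_recommendation(history, now=250, current_recommendation=1))
    print(stabilized_recommendation(history, now=400, current_recommendation=1))
```

In this sample the count stays at 5 while the older recommendations are still inside the window, and falls to 1 once they have aged out.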