Scaling Business Applications Dynamically with HPA
Introduction to the HPA Controller
When system load gets too high, we can scale a workload's Pods with the following command:
$ kubectl -n luffy scale deployment eladmin-web --replicas=2
But that is a manual operation. In a real project we want the cluster to sense load changes and scale automatically. Kubernetes provides a resource object for exactly this: Horizontal Pod Autoscaling, or HPA for short.
Basic principle: the HPA controller monitors and analyzes the load of all Pods managed by the target controller, and adjusts the replica count when needed.
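Concretely, it compares the observed metric value against the target value using the scaling formula documented upstream; a worked example with illustrative numbers:
desiredReplicas = ceil( currentReplicas * currentMetricValue / desiredMetricValue )
# e.g. 2 replicas averaging 100% CPU against an 80% target:
# ceil(2 * 100 / 80) = ceil(2.5) = 3 replicas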
Metrics Server
...
Metrics Server collects metrics from the Summary API, exposed by the Kubelet on each node.
Metrics Server is registered in the main API server through the Kubernetes aggregator, which was introduced in Kubernetes 1.7.
...
Installation
Official repository: https://github.com/kubernetes-sigs/metrics-server
Depending on your cluster setup, you may also need to change flags passed to the Metrics Server container. Most useful flags:
--kubelet-preferred-address-types
- The priority of node address types used when determining an address for connecting to a particular node (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
--kubelet-insecure-tls
- Do not verify the CA of serving certificates presented by Kubelets. For testing purposes only.
--requestheader-client-ca-file
- Specify a root certificate bundle for verifying client certificates on incoming requests.
$ wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.1/components.yaml
Modify the args:
# add - --kubelet-insecure-tls
...
containers:
- args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  image: bitnami/metrics-server:0.6.1
  imagePullPolicy: IfNotPresent
...
Run the installation:
$ kubectl apply -f components.yaml
$ kubectl -n kube-system get pods
$ kubectl top nodes
HPA in Practice
There are two versions of the HPA API (you can check which ones your cluster serves with the command below):
- autoscaling/v1: scaling based on CPU metrics only; the stable version
- autoscaling/v2: supports scaling based on CPU, memory, or user-defined custom metrics
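To confirm what is available before writing a manifest:
# Output varies by cluster version
$ kubectl api-versions | grep autoscaling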
Dynamic scaling based on CPU and memory
Create the HPA object:
# Option 1
$ cat hpa-eladmin-web.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-eladmin-web
  namespace: luffy
spec:
  maxReplicas: 3
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: eladmin-web
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
# Option 2
$ kubectl -n luffy autoscale deployment eladmin-web --cpu-percent=80 --min=1 --max=3
> The Deployment must declare resource requests; without them the HPA cannot obtain utilization data and dynamic scaling will not work.
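If eladmin-web does not declare requests yet, one way to add them is kubectl set resources; the values below are placeholders, pick ones that fit your workload:
$ kubectl -n luffy set resources deployment eladmin-web --requests=cpu=100m,memory=128Mi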
Verify:
$ yum -y install httpd-tools
$ kubectl -n luffy get svc eladmin-web
eladmin-web ClusterIP 10.109.94.95 <none> 80/TCP 2d18h
# To see the effect sooner, scale down to 1 replica first
$ kubectl -n luffy scale deploy eladmin-web --replicas=1
# Simulate 1000 concurrent users making 100,000 requests to the page
$ ab -n 100000 -c 1000 http://10.109.94.95:80/
$ kubectl -n luffy get hpa
$ kubectl -n luffy get pods
After the load drops, scale-down waits out a stabilization window (5 minutes by default), which can be tuned via the following controller-manager flag:
--horizontal-pod-autoscaler-downscale-stabilization
The value for this option is a duration that specifies how long the autoscaler has to wait before another downscale operation can be performed after the current one has completed. The default value is 5 minutes (5m0s).
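On a kubeadm cluster the flag goes into the controller-manager's static Pod manifest; the path below is the kubeadm default, and the kubelet restarts the Pod automatically once the file is saved:
$ vi /etc/kubernetes/manifests/kube-controller-manager.yaml
# add under the kube-controller-manager command, e.g.:
#   - --horizontal-pod-autoscaler-downscale-stabilization=10m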
You can also set behavior on each individual HPA to control its scaleDown and scaleUp behavior; a sketch follows below.
Scaling down is a gradual process: once the current scaling operation completes, the autoscaler waits out the interval before the next one. Dropping from 3 replicas to 1, for example, takes roughly 2 * 5min = 10 minutes.
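A minimal behavior sketch to merge into the HPA spec above; the window and policy values are illustrative, not recommendations:
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60   # wait 60s instead of the default 300s
      policies:
      - type: Pods
        value: 1           # remove at most 1 Pod ...
        periodSeconds: 60  # ... per 60-second period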
Dynamic scaling based on custom metrics
Besides autoscaling on CPU and memory, we can also scale on custom monitoring metrics. For that we need the Prometheus Adapter: Prometheus monitors application load and all kinds of cluster metrics, and the Prometheus Adapter lets us build scaling policies from the metrics Prometheus collects. These metrics are exposed through the APIServer, so HPA resource objects can consume them directly.
Architecture diagram:
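Once the adapter exposes a metric through the custom metrics API, the HPA references it with a Pods-type metric. A minimal sketch, assuming a hypothetical http_requests_per_second metric published by the adapter for the eladmin-web Pods:
metrics:
- type: Pods
  pods:
    metric:
      name: http_requests_per_second  # hypothetical metric name exposed via the adapter
    target:
      type: AverageValue
      averageValue: "10"              # scale out when the per-Pod average exceeds 10 req/s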
Implementation Principles
How Pod metrics are collected
- Kubernetes below 1.8: heapster, fully deprecated in version 1.11
- Kubernetes 1.8 and above: metrics-server
Starting with version 1.8, upstream introduced the Metrics API concept, and metrics-server is the official implementation of it: it fetches metrics from the kubelet and replaces the earlier heapster.
Metrics Server exposes the monitoring data through the standard Kubernetes API. For example, to fetch the metrics of a particular Pod:
https://172.21.65.226:6443/apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>
# https://172.21.65.226:6443/api/v1/namespaces/luffy/pods?limit=500
With metrics-server installed in the cluster, the endpoint above returns basic Pod monitoring data, e.g.:
# Get the monitoring data of the eladmin-api pod
$ kubectl -n luffy get pod
eladmin-api-6b5d9664d8-gj87w 1/1 Running 0 5d19h
# The request URL is therefore:
https://172.21.65.226:6443/apis/metrics.k8s.io/v1beta1/namespaces/luffy/pods/eladmin-api-6b5d9664d8-gj87w
# Calling the apiserver requires authentication and authorization. In an earlier chapter we created a serviceaccount named luffy-pods-admin in the luffy namespace and granted it permissions on pods there, so we can authenticate with that serviceaccount's token
$ kubectl -n luffy create token luffy-pods-admin
eyJhbGciOiJSUzI1NiIsImtpZCI6IktJeWtHUFlydXZ2ZncxQVNxUlZyWHhCTkkwb01IbjNKUnFwZ18wUUxkVGcifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjY4MjI4Mzg5LCJpYXQiOjE2NjgyMjQ3ODksImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJsdWZmeSIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJsdWZmeS1wb2RzLWFkbWluIiwidWlkIjoiNjAzYWEzMDYtNDljZi00Y2UxLWI1OGYtMGNmMjUzYTI4YmY2In19LCJuYmYiOjE2NjgyMjQ3ODksInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpsdWZmeTpsdWZmeS1wb2RzLWFkbWluIn0.HywSHWpsq9yHraUsPwehpaJKWWZFIQY2xE3mZBcVbbXA6ZerEjBzy1q_7VwonNMyUDo_kgK_vM0CVWKDFbESXqG-tbYd6LLD_PNvL8WyEZsKfiK2LYBiTNOAXnUqReUNt9XD_oHVoaEfeEpIO1WPONnmdcOLl_OBa7sdWFH8iT42hVufOHjYELJrOF8PG741BvtMuAIohYwFgO76G8dTWEaOYCX9Rg8n9jJQTqhMvm1fvW6c0V558q63e3oi7OFFR_V90dg4PYbBMAVMrKrrGAogEPnBhHFiY8YKF9lECzTyoVIxOphyvS5M9noU6G_W3-0w7gsMYGrXQVw7xlywNg
$ curl -k -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IktJeWtHUFlydXZ2ZncxQVNxUlZyWHhCTkkwb01IbjNKUnFwZ18wUUxkVGcifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjY4MjI4Mzg5LCJpYXQiOjE2NjgyMjQ3ODksImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJsdWZmeSIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJsdWZmeS1wb2RzLWFkbWluIiwidWlkIjoiNjAzYWEzMDYtNDljZi00Y2UxLWI1OGYtMGNmMjUzYTI4YmY2In19LCJuYmYiOjE2NjgyMjQ3ODksInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpsdWZmeTpsdWZmeS1wb2RzLWFkbWluIn0.HywSHWpsq9yHraUsPwehpaJKWWZFIQY2xE3mZBcVbbXA6ZerEjBzy1q_7VwonNMyUDo_kgK_vM0CVWKDFbESXqG-tbYd6LLD_PNvL8WyEZsKfiK2LYBiTNOAXnUqReUNt9XD_oHVoaEfeEpIO1WPONnmdcOLl_OBa7sdWFH8iT42hVufOHjYELJrOF8PG741BvtMuAIohYwFgO76G8dTWEaOYCX9Rg8n9jJQTqhMvm1fvW6c0V558q63e3oi7OFFR_V90dg4PYbBMAVMrKrrGAogEPnBhHFiY8YKF9lECzTyoVIxOphyvS5M9noU6G_W3-0w7gsMYGrXQVw7xlywNg" https://172.21.65.226:6443/apis/metrics.k8s.io/v1beta1/namespaces/luffy/pods/eladmin-api-6b5d9664d8-gj87w
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "eladmin-api-6b5d9664d8-gj87w",
    "namespace": "luffy",
    "creationTimestamp": "2022-11-12T03:47:03Z",
    "labels": {
      "app": "eladmin-api",
      "pod-template-hash": "6b5d9664d8"
    }
  },
  "timestamp": "2022-11-12T03:46:47Z",
  "window": "19.07s",
  "containers": [
    {
      "name": "eladmin-api",
      "usage": {
        "cpu": "2082398n",
        "memory": "3795872Ki"
      }
    }
  ]
}
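Alternatively, kubectl can issue the same request through the API server without handling tokens by hand:
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/luffy/pods/eladmin-api-6b5d9664d8-gj87w"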
The current collection pipeline:
Metric collection in the kubelet
Whether heapster or metrics-server, both merely relay and aggregate data: each calls the kubelet's API to fetch it. Inside the kubelet, the component that actually collects the metrics is the cAdvisor module. You can fetch monitoring data from port 10250 on a node:
- Kubelet Summary metrics: https://127.0.0.1:10250/metrics, exposes node and pod summary data
- Cadvisor metrics: https://127.0.0.1:10250/metrics/cadvisor, exposes container-level data
Example call:
$ kubectl -n kube-system create token metrics-server
eyJhbGciOiJSUzI1NiIsImtpZCI6IktJeWtHUFlydXZ2ZncxQVNxUlZyWHhCTkkwb01IbjNKUnFwZ18wUUxkVGcifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjY4MjI4ODY0LCJpYXQiOjE2NjgyMjUyNjQsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJtZXRyaWNzLXNlcnZlciIsInVpZCI6IjE4MDJmNGVkLWFjMDYtNDcwYS04YmExLWM0MDJjODgwMjI2OCJ9fSwibmJmIjoxNjY4MjI1MjY0LCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06bWV0cmljcy1zZXJ2ZXIifQ.Wz3kRbwoCJr2v_Gh5gG9oTMcZOPAgUPcSCmS-Nv7TB7a0aiYzSAqlH2PIqDt1wsv_bmNlpg2QifuDF1lnd5FmBmPCsY-zJX-uW-MStAQaCjfV8iBDmlThIP20srpmMf-z6JzkAlIF7JTTeuv03AkZg50FVKvu_2Zk-lm9gEwUhWgXns3oSnhTCWFHh2rOwnNq3IwcypTrRGWBEt5e9BQ6HWWMiCkkZ0WfgATXAAzYsmzRMIp2ZXntoqYEJGLEwgqJNLPVFpCdSn_C3Ft_2Mnfex84uSH0SL08fD5KP23SibWcPyOHIsSrsuPqA03JF-XW_JjRQyhOZsadSjvTBSSMQ
$ curl -k -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IktJeWtHUFlydXZ2ZncxQVNxUlZyWHhCTkkwb01IbjNKUnFwZ18wUUxkVGcifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjY4MjI4ODY0LCJpYXQiOjE2NjgyMjUyNjQsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJtZXRyaWNzLXNlcnZlciIsInVpZCI6IjE4MDJmNGVkLWFjMDYtNDcwYS04YmExLWM0MDJjODgwMjI2OCJ9fSwibmJmIjoxNjY4MjI1MjY0LCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06bWV0cmljcy1zZXJ2ZXIifQ.Wz3kRbwoCJr2v_Gh5gG9oTMcZOPAgUPcSCmS-Nv7TB7a0aiYzSAqlH2PIqDt1wsv_bmNlpg2QifuDF1lnd5FmBmPCsY-zJX-uW-MStAQaCjfV8iBDmlThIP20srpmMf-z6JzkAlIF7JTTeuv03AkZg50FVKvu_2Zk-lm9gEwUhWgXns3oSnhTCWFHh2rOwnNq3IwcypTrRGWBEt5e9BQ6HWWMiCkkZ0WfgATXAAzYsmzRMIp2ZXntoqYEJGLEwgqJNLPVFpCdSn_C3Ft_2Mnfex84uSH0SL08fD5KP23SibWcPyOHIsSrsuPqA03JF-XW_JjRQyhOZsadSjvTBSSMQ" https://localhost:10250/metrics
Although the kubelet provides the metrics endpoints, the actual monitoring logic is handled by its built-in cAdvisor module. cAdvisor used to ship as a standalone component; starting with Kubernetes 1.12, its dedicated port was removed and all monitoring data is served through the kubelet's API.
When cAdvisor gathers metrics, it actually calls the runc/libcontainer library, and libcontainer in turn is a wrapper around cgroup files. In other words, cAdvisor is also just a forwarder: its data comes from cgroup files.
The values in the cgroup files are the ultimate source of the monitoring data.
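You can verify this on a node by reading the files directly. A sketch assuming cgroup v1 with the cgroupfs driver; the exact layout varies by cgroup driver, QoS class, and Kubernetes version, and <pod-uid>/<container-id> are placeholders:
# Per-container memory usage straight from the cgroup filesystem:
$ cat /sys/fs/cgroup/memory/kubepods/burstable/pod<pod-uid>/<container-id>/memory.usage_in_bytes
# Cumulative CPU time consumed, in nanoseconds:
$ cat /sys/fs/cgroup/cpuacct/kubepods/burstable/pod<pod-uid>/<container-id>/cpuacct.usage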
Metrics data flow:
Question: Metrics Server is a standalone service that only implements its own internal API, so how is it exposed in the standard Kubernetes API format?
The kube-aggregator and the Metrics Server implementation
kube-aggregator is an extension mechanism for the apiserver's API: it lets developers write their own service and register it into the Kubernetes API, i.e. an extension API.
Let's look at how metrics-server uses it:
$ kubectl get apiservice
NAME SERVICE AVAILABLE
v1beta1.metrics.k8s.io kube-system/metrics-server True
$ kubectl get apiservice v1beta1.metrics.k8s.io -oyaml
...
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
    port: 443
  version: v1beta1
  versionPriority: 100
...
# This registers a proxy path in the apiserver:
# proxyPath := "/apis/" + apiService.Spec.Group + "/" + apiService.Spec.Version
# https://172.21.65.226:6443/apis/metrics.k8s.io/v1beta1
# => https://metrics-server:443/apis/metrics.k8s.io/v1beta1
$ kubectl -n kube-system get svc metrics-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
metrics-server ClusterIP 10.106.62.186 <none> 443/TCP 13h
$ kubectl -n luffy create token luffy-pods-admin
eyJhbGciOiJSUzI1NiIsImtpZCI6IktJeWtHUFlydXZ2ZncxQVNxUlZyWHhCTkkwb01IbjNKUnFwZ18wUUxkVGcifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjY4MjI5OTMyLCJpYXQiOjE2NjgyMjYzMzIsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJsdWZmeSIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJsdWZmeS1wb2RzLWFkbWluIiwidWlkIjoiNjAzYWEzMDYtNDljZi00Y2UxLWI1OGYtMGNmMjUzYTI4YmY2In19LCJuYmYiOjE2NjgyMjYzMzIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpsdWZmeTpsdWZmeS1wb2RzLWFkbWluIn0.LeKFaXBeaAsyHd06JFc6exenNtfuBadjUj3YtqgHiBMmhNds4zmB9_ysvI4p04W6Kov37YTZg4WSj_AqpsplfCVVg8h-3kAQfPC6cHG5oBN-VUsMC80lu-MZsBTI4C5in7WgddFyFqFMxXL_-TdpguYohOJ6NC90z3IGLCKy8pS5mOCUA34o1_9x5P3JM5e--R-NIbwZmdESkfHejiaENCau_cwP2L2lxmU364kppSrcX_kGLybT7nMV-Bg6Q_-pt0JZVtP5C5NZUuLN0Mtmsd9me8LJFyPDX4fkWtXwZraqiRDx_OTgbckIwQAIDseFEu8ikVBYZ2p6qPCvtIgsMQ
$ curl -k -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IktJeWtHUFlydXZ2ZncxQVNxUlZyWHhCTkkwb01IbjNKUnFwZ18wUUxkVGcifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjY4MjI5OTMyLCJpYXQiOjE2NjgyMjYzMzIsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJsdWZmeSIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJsdWZmeS1wb2RzLWFkbWluIiwidWlkIjoiNjAzYWEzMDYtNDljZi00Y2UxLWI1OGYtMGNmMjUzYTI4YmY2In19LCJuYmYiOjE2NjgyMjYzMzIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpsdWZmeTpsdWZmeS1wb2RzLWFkbWluIn0.LeKFaXBeaAsyHd06JFc6exenNtfuBadjUj3YtqgHiBMmhNds4zmB9_ysvI4p04W6Kov37YTZg4WSj_AqpsplfCVVg8h-3kAQfPC6cHG5oBN-VUsMC80lu-MZsBTI4C5in7WgddFyFqFMxXL_-TdpguYohOJ6NC90z3IGLCKy8pS5mOCUA34o1_9x5P3JM5e--R-NIbwZmdESkfHejiaENCau_cwP2L2lxmU364kppSrcX_kGLybT7nMV-Bg6Q_-pt0JZVtP5C5NZUuLN0Mtmsd9me8LJFyPDX4fkWtXwZraqiRDx_OTgbckIwQAIDseFEu8ikVBYZ2p6qPCvtIgsMQ" https://10.106.62.186/apis/metrics.k8s.io/v1beta1/namespaces/luffy/pods/eladmin-api-6b5d9664d8-gj87w
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "eladmin-api-6b5d9664d8-gj87w",
    "namespace": "luffy",
    "creationTimestamp": "2022-11-12T04:12:27Z",
    "labels": {
      "app": "eladmin-api",
      "pod-template-hash": "6b5d9664d8"
    }
  },
  "timestamp": "2022-11-12T04:12:19Z",
  "window": "15.425s",
  "containers": [
    {
      "name": "eladmin-api",
      "usage": {
        "cpu": "1569859n",
        "memory": "3800032Ki"
      }
    }
  ]
}