通用指标示例程序部署

为了演示效果，我们新建一个deployment来模拟业务应用。

$ cat custom-metrics-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-metrics-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metrics-demo
  template:
    metadata:
      labels:
        app: custom-metrics-demo
    spec:
      containers:
      - name: custom-metrics-demo
        image: luxas/autoscale-demo:v0.1.2
        resources:
          limits:
            cpu: 50m
          requests:
            cpu: 50m

部署：

$ kubectl apply -f custom-metrics-demo.yaml

$ kubectl get po -o wide
custom-metrics-demo-95b5bc949-xpppl   1/1     Running   0          65s   10.244.1.194

$ curl 10.244.1.194:8080/metrics
# HELP http_requests_total The amount of requests served by the server in total
# TYPE http_requests_total counter
http_requests_total 2

注册为Prometheus的target：

$ cat custom-metrics-demo-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-demo
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
spec:
  ports:
  - port: 80
    targetPort: 8080
    name: http
  selector:
    app: custom-metrics-demo
  type: ClusterIP

自动注册为Prometheus的采集Targets。

通常web类的应用，会把每秒钟的请求数作为业务伸缩的指标依据。

实践：

使用案例应用custom-metrics-demo，如果custom-metrics-demo最近1分钟内每秒钟的请求数超过10次，则自动扩充业务应用的副本数。

配置自定义指标

告诉Adapter去采集转换哪些指标，Adapter支持转换的指标，才可以作为HPA的依据

配置HPA规则

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metrics-demo
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 10

Adapter配置自定义指标

思考：

前面讲CPU的平均使用率的采集，其实是通过node_cpu_seconds_total指标计算得到的。

  ^
  │   . . . . . . . . . . . . . . . . .   . .   node_cpu{cpu="cpu0",mode="idle"}
  │     . . . . . . . . . . . . . . . . . . .   node_cpu{cpu="cpu0",mode="system"}
  │     . . . . . . . . . .   . . . . . . . .   node_load1{}
  │     . . . . . . . . . . . . . . . .   . .   node_cpu_seconds_total{...}
  v
    \<------------------ 时间 ----------------\>

同样，如果想获得每个业务应用最近1分钟内每秒的访问次数，也是根据总数来做计算，因此，需要使用业务自定义指标http_requests_total，配合rate方法即可获取每秒钟的请求数。

rate(http_requests_total[2m])

# 如查询有多条数据，需做汇聚，需要使用sum
sum(rate(http_requests_total[2m])) by(kubernetes_pod_name)

自定义指标可以配置多个，因此，需要将规则使用数组来配置
```
rules:
- {}
```
告诉Adapter，哪些自定义指标可以使用
```
rules:
- seriesQuery: 'http_requests_total{}'
```
seriesQuery是PromQL语句，和直接用http_requests_total查询到的结果一样，凡是seriesQuery可以查询到的指标，都可以用作自定义指标
告诉Adapter，指标中的标签和k8s中的资源对象的关联关系
```
rules:
- seriesQuery: 'http_requests_total{}'
  resources:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
```
hpa 拿着k8s里的namepace和pod名称，来查询adaptor，adaptor去查询Prometheus的时候根据resources的适配来转换，namepace=default, pod=front-app-xxxx, kubernetes_namespace="default"

我们查询到的可用指标格式为：
```
http_requests_total{instance="10.244.2.140:8080", job="kubernetes-sd-endpoints", kubernetes_name="custom-metrics-demo", kubernetes_namespace="default", kubernetes_pod_name="front-app-df5fc79dd-rmzr6", namespace="default", pod="front-app-df5fc79dd-rmzr6"}
```
由于HPA在调用Adapter接口的时候，告诉Adapter的是查询哪个命名空间下的哪个Pod的指标，因此，Adapter在去查询的时候，需要做一层适配转换（因为并不是每个prometheus查询到的结果中都是叫做kubernetes_namespace和kubernetes_pod_name）

/apis/custom.metrics.k8s.io/v1beta2/namespaces/default/pods/xxx/http_requests_total

指定自定义的指标名称，供HPA配置使用

rules:
- seriesQuery: 'http_requests_total{}'
  resources:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name:
    as: "http_requests_per_second"

因为Adapter转换完之后的指标含义为：每秒钟的请求数。因此提供指标名称，该配置根据正则表达式做了匹配替换，转换完后的指标名称为：http_requests_per_second，HPA规则中可以直接配置该名称。

告诉Adapter如何获取最终的自定义指标值

    rules:
    - seriesQuery: 'http_requests_total{}'
      resources:
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        as: "http_requests_per_second"
      metricsQuery: 'sum(rate(\<\<.Series>\>{\<\<.LabelMatchers>\>}[2m])) by (\<\<.GroupBy>\>)'

我们最终期望的写法可能是这样：

sum(rate(http_requests_total{kubernetes_namespace="default",kubernetes_pod_name="xxxx"}[2m])) by (kubernetes_pod_name)

但是Adapter提供了更简单的写法：

   sum(rate(\<\<.Series>\>{\<\<.LabelMatchers>\>}[2m])) by (\<\<.GroupBy>\>)

Series: 指标名称
- LabelMatchers: 指标查询的label
- GroupBy: 结果分组，针对HPA过来的查询，都会匹配成kubernetes_pod_name

更新Adapter的配置：

$ vi custom-metrics-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitor
data:
  config.yaml: |
    rules:
    - seriesQuery: 'http_requests_total'
      seriesFilters: []
      resources:
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        as: "http_requests_per_second"
      metricsQuery: (sum(rate(\<\<.Series>\>{\<\<.LabelMatchers>\>}[2m])) by (\<\\<.GroupBy>\\>))

需要更新configmap并重启adapter服务：

$ kubectl apply -f custom-metrics-configmap.yaml

$ kubectl -n monitor delete po custom-metrics-apiserver-c689ff947-zp8gq

实际中，hpa会去对如下地址发起请求，获取数据：

$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta2/namespaces/default/pods/custom-metrics-demo-776f497584-5slf9/http_requests_per_second |jq
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta2",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta2/namespaces/default/pods/custom-metrics-demo-776f497584-5slf9/http_requests_per_second"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "custom-metrics-demo-776f497584-5slf9",
        "apiVersion": "/v1"
      },
      "metric": {
        "name": "http_requests_per_second",
        "selector": null
      },
      "timestamp": "2022-12-03T10:59:56Z",
      "value": "33m"
    }
  ]
}

其中33m等于0.033，即当前指标查询每秒钟请求数为0.033次.

我们发现有两个可用的resources，引用官方的一段解释：

Notice that we get an entry for both "pods" and "namespaces" -- the adapter exposes the metric on each resource that we've associated the metric with (and all namespaced resources must be associated with a namespace), and will fill in the \<\<.GroupBy>\> section with the appropriate label depending on which we ask for.

We can now connect to $KUBERNETES/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/nginx_vts_server_requests_per_second, and we should see

https://github.com/DirectXMan12/k8s-prometheus-adapter/blob/master/docs/config-walkthrough.md

https://github.com/DirectXMan12/k8s-prometheus-adapter/blob/master/docs/config.md

通用指标示例程序部署​

Adapter配置自定义指标​

通用指标示例程序部署

Adapter配置自定义指标