Prometheus Service Discovery Mechanisms


By default Prometheus uses a pull model: it periodically scrapes metrics from target hosts. Every scrape target therefore has to expose an HTTP endpoint through which the corresponding metrics can be collected.

With this approach the set of scrape targets is fixed by configuration — the various jobs defined in scrape_config — so new services cannot be detected automatically. Whenever a new service is added, the Prometheus configuration has to be edited by hand. Dynamic service discovery solves this: through service discovery, Prometheus can query the list of targets that need to be monitored and then poll those targets for metrics.

There are several ways to obtain targets: static configuration and service discovery. The service discovery mechanisms include (a short sketch follows the list):

  • kubernetes_sd_configs #Kubernetes service discovery: lets Prometheus dynamically discover monitored targets inside Kubernetes
  • static_configs #static discovery: targets listed directly in the Prometheus configuration file (the default)
  • dns_sd_configs #DNS-based service discovery
  • consul_sd_configs #Consul service discovery: dynamically discovers targets from Consul
  • file_sd_configs #file-based service discovery from a specified file
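
As a hedged sketch of two of these methods, a scrape_configs fragment combining static and file-based discovery might look like the following (job names, addresses, and file paths are illustrative, not taken from the original):

scrape_configs:
# Static discovery: targets are listed directly in the configuration file
- job_name: 'static-demo'
  static_configs:
  - targets: ['192.168.1.10:9100', '192.168.1.11:9100']
# File-based discovery: Prometheus re-reads the listed files periodically and
# picks up target changes without a restart
- job_name: 'file-demo'
  file_sd_configs:
  - files:
    - /etc/prometheus/targets/*.json
    refresh_interval: 1m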

1. kubernetes_sd_configs

Prometheus's relabeling feature is very powerful: before a target instance is scraped, its metadata labels can be rewritten dynamically, adding or overwriting labels on the fly.
After Prometheus loads targets successfully, every target instance carries some metadata labels. The defaults include:
__address__: the target's address in <host>:<port> form
__scheme__: the scheme of the target's address, HTTP or HTTPS
__metrics_path__: the path used to scrape the target

Basics: what relabeling is for

relabel_configs: applied before scraping (for example, to redefine metadata labels before data is collected). It can add labels, keep only specific targets, or filter targets out.

metric_relabel_configs: applied to metrics that have already been scraped, for a final round of relabeling and filtering.

config -> relabel (relabel_configs) -> scrape -> relabel (metric_relabel_configs) -> TSDB
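
For example, a job could drop a high-cardinality metric after scraping with metric_relabel_configs. A minimal sketch (the job name, target address, and metric name are made up for illustration):

scrape_configs:
- job_name: 'example-app'
  static_configs:
  - targets: ['10.0.0.5:8080']
  metric_relabel_configs:
  # Drop every series whose metric name matches the regex, after scraping
  - source_labels: [__name__]
    regex: 'go_gc_duration_seconds.*'
    action: drop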

    - job_name: 'kubernetes-apiserver'
      kubernetes_sd_configs:        # service discovery via kubernetes_sd_configs
      - role: endpoints             # discover endpoints objects
      scheme: https                 # scheme used to scrape the discovered targets
      tls_config:                   # certificate configuration
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt        # CA certificate path inside the container
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token # token path inside the container
      relabel_configs:              # relabeling configuration
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep                # action defines what the relabel rule does; several actions are supported
        regex: default;kubernetes;https   # keep only the kubernetes service in the default namespace on the https port
Label fields
source_labels: the source labels, i.e. the label names before relabeling
target_label: the new label name produced by the action
regex: a regular expression matched against the source label values
replacement: the value written to target_label, which can reference regex capture groups
Actions in detail

https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config

replace: replace a label value; regex is matched against the source labels' values, and replacement can reference the matched capture groups (see the sketch after this list)
keep: scrape only the instances that match regex; target instances whose source_labels do not match regex are dropped, i.e. only matching instances are collected

drop: do not scrape instances that match regex; target instances whose source_labels match regex are dropped, i.e. only non-matching instances are collected

labelmap: match regex against all label names, then copy the matched labels' values to new label names built from the replacement capture-group references (${1}, ${2}, ...)
labelkeep: match regex against all label names; labels that do not match are removed from the label set
labeldrop: match regex against all label names; labels that do match are removed from the label set
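
A hedged sketch of replace and labelmap working together; the rules mirror the kubernetes-node scrape job used in the Prometheus Server configuration later on this page, so the label names are standard kubernetes_sd metadata:

      relabel_configs:
      # replace: rewrite the scrape address from the kubelet port 10250 to node-exporter's 9100
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      # labelmap: copy every Kubernetes node label onto the target, dropping the long metadata prefix
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)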

Supported discovery target types (roles)

https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config

  • node
  • service
  • pod
  • endpoints
  • endpointslice #endpoints sharded into EndpointSlice objects
  • ingress
API server metrics

How to monitor the API server

The metrics can be scraped directly through the Kubernetes service:

      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https # match the default namespace, the service named kubernetes, and the https port; matching targets are kept. The source_labels act as the keys and regex as the values.

# Label replacement is done as follows:
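      # A hedged sketch of such a replace rule: it copies the namespace metadata
      # label into a friendlier kubernetes_namespace label, matching the
      # kubernetes-service-endpoints job later on this page.
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace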

This post takes a closer look at the principles and practice of Prometheus service discovery, to help you understand and configure automated monitoring.

Common Prometheus Monitoring Deployments on K8s


Deploying with K8s

cAdvisor Deployment

# Pull the image
docker pull gcr.io/cadvisor/cadvisor:v0.39.2

# Tag it for the private registry
docker tag gcr.io/cadvisor/cadvisor:v0.39.2 harbor.images.com/test/cadvisor:v0.39.2

# Push the image
docker push harbor.images.com/test/cadvisor:v0.39.2

# Create the namespace
kubectl create ns monitoring

Write the YAML

vim daemonset-deploy-cadvisor.yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: cAdvisor
  template:
    metadata:
      labels:
        app: cAdvisor
    spec:
      tolerations:                     # tolerate the master NoSchedule taint
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      hostNetwork: true
      restartPolicy: Always            # restart policy
      containers:
      - name: cadvisor
        image: harbor.images.com/test/cadvisor:v0.39.2
        imagePullPolicy: IfNotPresent  # image pull policy
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: root
          mountPath: /rootfs
        - name: run
          mountPath: /var/run
        - name: sys
          mountPath: /sys
        - name: docker
          mountPath: /var/lib/docker
      volumes:
      - name: root
        hostPath:
          path: /
      - name: run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /var/lib/docker

Apply and check

kubectl create -f daemonset-deploy-cadvisor.yaml

# Check the status
kubectl get pod -n monitoring -owide
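
Since cAdvisor uses hostNetwork and listens on port 8080, the metrics endpoint can optionally be spot-checked directly on a node; the IP below is only an example and is not part of the original steps:

# Quick sanity check of the cAdvisor metrics endpoint (replace with a real node IP)
curl -s http://192.168.1.10:8080/metrics | head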

node-exporter Deployment

# Pull the image
docker pull quay.io/prometheus/node-exporter:v1.3.1

# Tag it for the private registry
docker tag quay.io/prometheus/node-exporter:v1.3.1 harbor.images.com/test/node-exporter:v1.3.1

# Push the image
docker push harbor.images.com/test/node-exporter:v1.3.1

Write the YAML

vim daemonset-deploy-node-exporter.yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
  labels:
    k8s-app: node-exporter
spec:
  selector:
    matchLabels:
      k8s-app: node-exporter
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      containers:
      - image: quay.io/prometheus/node-exporter:v1.3.1
        #- image: prom/node-exporter:v1.3.1
        imagePullPolicy: IfNotPresent
        name: prometheus-node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100
          protocol: TCP
          name: metrics
        volumeMounts:
        - mountPath: /host/proc
          name: proc
        - mountPath: /host/sys
          name: sys
        - mountPath: /host
          name: rootfs
        args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys
      - name: rootfs
        hostPath:
          path: /
      hostNetwork: true
      hostPID: true
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: http
    port: 9100
    nodePort: 9100
    protocol: TCP
  selector:
    k8s-app: node-exporter

Apply and verify

kubectl create -f daemonset-deploy-node-exporter.yaml

# Verify
kubectl get pod -n monitoring -owide
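
node-exporter also binds port 9100 on each host, so it can optionally be spot-checked the same way (example node IP):

# Confirm node-exporter is serving metrics on the host port
curl -s http://192.168.1.10:9100/metrics | head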

Prometheus Server Deployment

vim prometheus-cfg.yaml

---
kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 10s
      evaluation_interval: 1m
    scrape_configs:
    - job_name: 'kubernetes-node'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: 'kubernetes-node-cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name

    - job_name: 'kubernetes-apiserver'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
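
The ConfigMap has to exist before the Deployment below can mount it; assuming the file written above, create it first:

kubectl apply -f prometheus-cfg.yaml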

Pull the image and prepare the node

# Install Prometheus on node3
mkdir -p /data/prometheusdata
chmod 777 /data/prometheusdata # prepare the data directory
docker pull prom/prometheus:v2.31.2

# Create the monitoring service account
kubectl create serviceaccount monitor -n monitoring

# Grant permissions to the account
kubectl create clusterrolebinding monitor-clusterrolebinding -n monitoring --clusterrole=cluster-admin --serviceaccount=monitoring:monitor

Write the deployment YAML

vim prometheus-deployment.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitoring
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
      component: server
    #matchExpressions:
    #- {key: app, operator: In, values: [prometheus]}
    #- {key: component, operator: In, values: [server]}
  template:
    metadata:
      labels:
        app: prometheus
        component: server
      annotations:
        prometheus.io/scrape: 'false'
    spec:
      nodeName: k8s-slave3           # hostname of node3; make sure the master can resolve it via hosts entries
      serviceAccountName: monitor
      containers:
      - name: prometheus
        image: prom/prometheus:v2.31.2
        imagePullPolicy: IfNotPresent
        command:
        - prometheus
        - --config.file=/etc/prometheus/prometheus.yml
        - --storage.tsdb.path=/prometheus
        - --storage.tsdb.retention=720h
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/prometheus/prometheus.yml
          name: prometheus-config
          subPath: prometheus.yml
        - mountPath: /prometheus/
          name: prometheus-storage-volume
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-config
          items:
          - key: prometheus.yml
            path: prometheus.yml
            mode: 0644
      - name: prometheus-storage-volume
        hostPath:
          path: /data/prometheusdata
          type: Directory

Apply and verify

kubectl create -f prometheus-deployment.yaml

# Verify
kubectl get pod -n monitoring

Prometheus svc

vim prometheus-svc.yaml

---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
spec:
  type: NodePort
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 30090
    protocol: TCP
  selector:
    app: prometheus
    component: server

Apply and verify

kubectl create -f prometheus-svc.yaml

# Verify
kubectl get svc -n monitoring
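
With the NodePort service in place, the Prometheus web UI should be reachable on any node at port 30090. As a quick check (example node IP; /-/healthy is Prometheus's built-in health endpoint), then open Status -> Targets in the UI to confirm the node, cAdvisor, and node-exporter targets are up:

# Health check against the NodePort (replace with a real node IP)
curl -s http://192.168.1.10:30090/-/healthy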