Article collection API (How do you check the data volume under each metric in a Prometheus scrape job and filter it?)

优采云 Published: 2022-01-22 12:25

How to filter unneeded metrics in Prometheus

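When Prometheus scrapes its targets, a single job may contain dozens or even hundreds of metrics, and each metric can carry a very large number of series. In production we may actually use only a few dozen of those metrics. The ones we never use are still scraped and stored, which makes them a major source of wasted resources in the deployment.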

At that point it is worth filtering the metrics in the Prometheus scrape job.

How to query the data volume under each metric

    # Show the 50 metric/job pairs with the most series
    topk(50, count by (__name__, job)({__name__=~".+"}))

    # Total number of series in Prometheus
    sum(count by (__name__, job)({__name__=~".+"}))

Filtering metrics in a Prometheus scrape job

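Below is an example for the cAdvisor scrape job. It currently uses the drop action under metric_relabel_configs to discard unneeded metrics (which does not feel especially convenient).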

    - job_name: kubernetes-cadvisor
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      metric_relabel_configs:
      - action: replace
        source_labels: [id]
        regex: '^/machine\.slice/machine-rkt\\x2d([^\\]+)\\.+/([^/]+)\.service$'
        target_label: rkt_container_name
        replacement: '${2}-${1}'
      - action: replace
        source_labels: [id]
        regex: '^/system\.slice/(.+)\.service$'
        target_label: systemd_service_name
        replacement: '${1}'
      # Drop the container_network_tcp_usage_total metric
      - action: drop
        source_labels: [__name__]
        regex: 'container_network_tcp_usage_total'
      # Drop the container_tasks_state metric
      - action: drop
        source_labels: [__name__]
        regex: 'container_tasks_state'
      # Drop the container_network_udp_usage_total metric
      - action: drop
        source_labels: [__name__]
        regex: 'container_network_udp_usage_total'
      # Drop the container_memory_failures_total metric
      - action: drop
        source_labels: [__name__]
        regex: 'container_memory_failures_total'
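The four separate drop rules in the cAdvisor job can probably be folded into one. The fragment below is a minimal sketch rather than the author's configuration: it relies on the fact that Prometheus anchors relabel regexes at both ends, so an alternation matches only the exact metric names listed. It is meant to replace the four drop entries under metric_relabel_configs; the other relabel rules stay as they are.

```yaml
metric_relabel_configs:
# One rule instead of four. The regex is fully anchored by Prometheus,
# so only these exact metric names are dropped.
- action: drop
  source_labels: [__name__]
  regex: 'container_(network_tcp_usage_total|network_udp_usage_total|tasks_state|memory_failures_total)'
```

The inverse approach, action: keep with an allowlist of metric names, is also possible, but it discards every metric that is not listed, so it needs more care before being rolled out.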

Filtering metrics for jobs configured through ServiceMonitor in prometheus-operator

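When a monitoring environment is deployed with prometheus-operator, many of the scrape jobs are defined through ServiceMonitor resources. A drop action can be configured in a ServiceMonitor as well, to discard metrics.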

    serviceMonitors:
    - name: foundation-prometheus
      namespaceSelector:
        matchNames:
        - monitoring
      selector:
        matchLabels:
          cluster: foundation
      endpoints:
      - port: foundation-port
        honorLabels: true
        path: /federate
        params:
          'match[]':
          - '{__name__=~".+"}'
        # Drop the container_memory_failures_total metric
        metricRelabelings:
        - action: drop
          # ServiceMonitor relabeling fields are camelCase (sourceLabels, not source_labels)
          sourceLabels: [__name__]
          regex: 'container_memory_failures_total'
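The serviceMonitors fragment looks like a Helm values entry (an assumption; the source does not say which chart it belongs to). As a rough sketch, the same settings written as a standalone ServiceMonitor manifest might look like the following. The metadata.namespace value and the second metric name in the regex are illustrative assumptions. Note that the ServiceMonitor resource spells the relabeling field sourceLabels, in camelCase.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: foundation-prometheus
  namespace: monitoring          # assumption: namespace holding the ServiceMonitor itself
spec:
  namespaceSelector:
    matchNames:
    - monitoring
  selector:
    matchLabels:
      cluster: foundation
  endpoints:
  - port: foundation-port
    honorLabels: true
    path: /federate
    params:
      'match[]':
      - '{__name__=~".+"}'
    metricRelabelings:
    # Applied after the scrape, before ingestion; the regex is fully anchored.
    - action: drop
      sourceLabels: [__name__]
      regex: 'container_(memory_failures_total|tasks_state)'
```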

Disabling unneeded metrics in each exporter

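The best approach is to control what each exporter exposes before Prometheus ever scrapes it, so that only the metrics we actually need to monitor are served to Prometheus. Take node_exporter as an example: its official documentation describes the --no-collector.<name> flags, which disable the collectors whose metrics do not need to be scraped.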
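As a hedged illustration of the node_exporter case, the pod spec fragment below passes --no-collector.<name> flags as container arguments. The image tag and the choice of collectors (wifi, hwmon) are assumptions made for the example, not taken from the original article. The available collector names depend on the node_exporter version, so check them against its documentation first.

```yaml
# Fragment of a pod spec (for example inside a DaemonSet template).
containers:
- name: node-exporter
  image: quay.io/prometheus/node-exporter:v1.3.1   # assumption: any recent release
  args:
  # Each flag disables one collector, so its metrics are never exposed.
  - --no-collector.wifi
  - --no-collector.hwmon
```

Disabling a collector at the exporter saves work on both sides: the exporter no longer gathers the data, and Prometheus never has to scrape and then drop it.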
