Questions tagged [prometheus]

For questions about Prometheus, an open-source systems monitoring and alerting toolkit.

About Prometheus (from prometheus.io):

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open source project and maintained independently of any company. To emphasize this and clarify the project's governance structure, Prometheus joined the Cloud Native Computing Foundation in 2016 as the second hosted project after Kubernetes.

86 questions
14 votes, 3 answers

How do I troubleshoot missing data in my Prometheus database?

I have been gradually integrating Prometheus into my monitoring workflows, in order to gather detailed metrics about running infrastructure. During this, I have noticed that I often run into a peculiar issue: sometimes an exporter that Prometheus is…
Sander
11 votes, 2 answers

How to calculate disk space required by Prometheus v2.2?

We are trying to calculate the storage requirements but are unable to find the values needed to do the calculation for our version of Prometheus (v2.2). The Prometheus (v2.2) storage documentation gives this simple formula: needed_disk_space =…
MCoetzee
10 votes, 3 answers

Prometheus Alertmanager - how to silence all alerts for a given period during maintenance?

One work scenario I can't currently cover is a maintenance mode, in which all alerts received from Prometheus are ignored. I want to be able to set it through the UI for a given period, until maintenance finishes. One way to do…
9 votes, 1 answer

How to calculate burn rate for SLOs?

I've read the Google SRE book a few times, but I need some clarification on exactly how to set up the burn rate and on how long it will take to trigger an alert. Most of my questions are specifically about this section of the book:…
BlueChips23
9 votes, 1 answer

Why is Prometheus not a good choice for data with high cardinality?

I have a background in relational databases and am new to Prometheus. I wonder why Prometheus is not a good choice for high-cardinality data. Why do I need to use low-cardinality data? It's the exact opposite of SQL DBs. What are the technical reasons for…
Sybil
8 votes, 2 answers

Prometheus alert CPUThrottlingHigh raised but monitoring does not show it

I have installed Prometheus to monitor my installation, and it frequently raises alerts about CPU throttling. The Prometheus alert rule that identifies this alert is: alert: CPUThrottlingHigh expr: 100 * sum by(container_name, pod_name,…
jobou
6 votes, 2 answers

Back-filling Prometheus (and related system) metrics?

Projects like Thanos, which are based on Prometheus storage formats, seem to say that it is impossible to back-fill data in a Prometheus-based system. Can anyone explain why, and whether there are any possibilities for changing this? Lack of ability to…
John Humphreys
5 votes, 1 answer

How fast does Prometheus data grow?

This may sound like a vague question, so I will provide some context below. The basic question is: What parameters describe the growth in size of a Prometheus database over time? To tie the question down: When will the Prometheus time series data…
Bruce Becker
5 votes, 4 answers

How to find interdependencies between pods in a Kubernetes cluster?

Two Pods run in a Kubernetes cluster. One is a simple WordPress application and the other a MySQL database. The WordPress Pod communicates with the MySQL database. The aim is to find dependencies between Pods. Is there any kubectl command or any…
avishkar
4 votes, 1 answer

Does Prometheus expose the Horizontal Pod Autoscaler's "Current CPU Utilization" as shown in the Kubernetes dashboard?

In the Kubernetes dashboard, I can see the following information for an HPA: Min Replicas: 3 Max Replicas: 11 Target CPU Utilization: 80% Status Current Replicas: 3 Desired Replicas: 3 Current CPU Utilization: 10% Last Scaled: 5 days However, I…
Darragh
4 votes, 0 answers

Thanos or Cortex - What handles very large scale (say, hundreds of millions of time series) better?

We're looking at a new metrics solution and are attempting to build it in house. So, my question is: Which can scale more effectively and with less pain, Thanos or Cortex? I understand the general differences between the two. I'm just looking to…
John Humphreys
4 votes, 3 answers

What is the Prometheus and Grafana ideal setup?

I am just wondering: if I have many environments monitored via Prometheus, what is the best configuration? Security and efficiency are already important for on-premise installations, but they are much more important and unavoidable when…
gervais.b
4 votes, 1 answer

Using Prometheus to monitor Spring Boot Applications in Kubernetes Cluster

I have Spring Boot powered microservices deployed in my local Kubernetes cluster. The microservices use Micrometer and the Prometheus registry, but due to our company policy the actuator is available on another port: 8080 for "business" http…
Mark Bramnik
4 votes, 0 answers

Grafana sometimes can't resolve Prometheus hostname

Scenario: I deploy Grafana and Prometheus onto an EKS cluster (AWS K8s service). If I use the Prometheus service's FQDN (prometheus-server.monitoring.svc.cluster.local) as the data source, Grafana sometimes fails to load the data, as in the image below. If I…
Tran Triet
3 votes, 1 answer

How to get hostname from node-exporter in Docker Swarm?

I have set up my monitoring with the cAdvisor, node-exporter, Prometheus, Grafana stack in my clustered environment using Docker Swarm. What is the easiest way to get my actual hostnames picked up by node-exporter, so I can configure my dashboards to…
Max N.