Prometheus (software)
{{short description|Application used for event monitoring and alerting}}
{{Primary sources|date=January 2021}}
{{Infobox software
| name = Prometheus
| logo = Prometheus_software logo.svg
| released = {{Start date and age|2012|11|24|df=yes}}
| latest release version = v3.2.1[https://github.com/prometheus/prometheus/releases/latest Latest release at GitHub]
| latest release date = {{release date and age|2025|2|26|df=yes}}
| programming language = Go
| operating system = Cross-platform
| genre = Time series database
| license = Apache License 2.0
| website = {{URL|https://prometheus.io}}
| repo = {{URL|https://github.com/prometheus/prometheus}}
}}
Prometheus is a free software application used for event monitoring and alerting.{{cite web|url=https://prometheus.io/docs/introduction/overview/|title=Overview|website=prometheus.io}} It records metrics in a time series database that allows for high dimensionality, collects them using an HTTP pull model, and supports flexible queries and real-time alerting.{{cite book|author=James Turnbull|title=Monitoring with Prometheus|url=https://books.google.com/books?id=EtlfDwAAQBAJ|date=12 June 2018|publisher=Turnbull Press|isbn=978-0-9888202-8-9}}{{cite web|url = https://prometheus.io/|title = Prometheus: From metrics to insight. Power your metrics and alerting with a leading open-source monitoring solution|accessdate = December 26, 2018}} The project is written in Go and licensed under the Apache License 2.0, with source code available on GitHub.{{cite web |url=https://github.com/prometheus |title=Prometheus |website=GitHub |accessdate=December 26, 2018}}
History
Prometheus was developed at SoundCloud starting in 2012,{{cite book|author=Brian Brazil|title=Prometheus: Up & Running: Infrastructure and Application Performance Monitoring|url=https://books.google.com/books?id=QW1jDwAAQBAJ|date=9 July 2018|publisher=O'Reilly Media|isbn=978-1-4920-3409-4|page=3}} when the company discovered that its existing metrics and monitoring tools (using StatsD and Graphite) were insufficient for its needs. Specifically, the company identified requirements that Prometheus was built to meet, including a multi-dimensional data model, operational simplicity, scalable data collection, and a powerful query language, all in a single tool.{{cite web|url = https://developers.soundcloud.com/blog/prometheus-monitoring-at-soundcloud|title = Prometheus: Monitoring at SoundCloud|last1 = Volz|first1 = Julius|last2 = Rabenstein|first2 = Björn|date = January 26, 2015|publisher = SoundCloud}} The project was open-source from the beginning and began to be used by Boxever and Docker users as well, despite not being explicitly announced.{{cite web|url = http://5pi.de/2015/01/26/monitor-docker-containers-with-prometheus/|title = Monitor Docker Containers with Prometheus|date = January 26, 2015|publisher = 5π Consulting|access-date = December 26, 2018|archive-date = January 3, 2019|archive-url = https://web.archive.org/web/20190103181809/http://5pi.de/2015/01/26/monitor-docker-containers-with-prometheus/|url-status = dead}} Prometheus was inspired by the monitoring tool Borgmon used at Google.{{cite book|url=http://shop.oreilly.com/product/0636920041528.do |title=Site Reliability Engineering: How Google Runs Production Systems |first1=Niall |last1=Murphy |first2=Betsy |last2=Beyer |first3=Chris |last3=Jones |first4=Jennifer |last4=Petoff |publisher=O'Reilly Media |year=2016 |isbn=978-1491929124 |quote=Even though Borgmon remains internal to Google, the idea of treating time-series data as a data source for generating alerts is now accessible to everyone through those open source tools like Prometheus ... }}{{cite web |first=Julius |last=Volz |url=https://www.youtube.com/watch?v=4Pr-z8-r1eo |title=PromCon 2017: Conference Recap |date=4 September 2017 |publisher= |via=YouTube |quote=I joined SoundCloud back in 2012 coming from Google...we didn't yet have any monitoring tools that that works with this kind of dynamic environment. We were kind of missing the way Google did its monitoring for its own internal cluster scheduler and we were very inspired by that and finally decided to build our own open-source solution.}}
By 2013, Prometheus was introduced for production monitoring at SoundCloud. The official public announcement was made in January 2015.
In May 2016, the Cloud Native Computing Foundation (CNCF) accepted Prometheus as its second incubated project, after Kubernetes.{{cite web|url = https://www.cncf.io/announcement/2016/05/09/cloud-native-computing-foundation-accepts-prometheus-as-second-hosted-project/|title = Cloud Native Computing Foundation Accepts Prometheus as Second Hosted Project|date = May 9, 2016|publisher = Cloud Native Computing Foundation|accessdate = December 26, 2018}} In August 2018, the CNCF announced that the Prometheus project had graduated.{{cite web |last=Evans |first=Kristen |date=August 9, 2018 |title=Cloud Native Computing Foundation Announces Prometheus Graduation |url=https://www.cncf.io/announcement/2018/08/09/prometheus-graduates/ |accessdate=December 26, 2018}}
= Versions =
Prometheus 1.0 was released in July 2016.{{cite web |date=July 18, 2016 |title=Prometheus 1.0 Is Here |url=https://www.cncf.io/blog/2016/07/18/prometheus-1-0-is-here/ |accessdate=December 26, 2018 |publisher=Cloud Native Computing Foundation}} Subsequent versions were released through 2016 and 2017, leading to Prometheus 2.0 in November 2017.{{cite web |date=November 8, 2017 |title=New Features in Prometheus 2.0.0 |url=https://www.robustperception.io/new-features-in-prometheus-2-0-0 |accessdate=December 26, 2018 |publisher=Robust Perception}}
Architecture
A typical monitoring platform with Prometheus is composed of multiple tools:{{citation needed|date=January 2019}}
- Multiple exporters typically run on the monitored host to export local metrics.
- Prometheus to centralize and store the metrics.
- Alertmanager{{cite web | url=https://github.com/prometheus/alertmanager | title=Alertmanager | website=GitHub | date=17 May 2022 }} to trigger alerts based on those metrics.
- Grafana to produce dashboards.
- PromQL is the query language used to create dashboards and alerts.
= Data storage format =
Prometheus data is stored in the form of metrics, with each metric having a name that is used for referencing and querying it. Each metric can be drilled down by an arbitrary number of key=value pairs (labels). Labels can include information on the data source (which server the data is coming from) and other application-specific breakdown information such as the HTTP status code (for metrics related to HTTP responses), query method (GET versus POST), endpoint, etc. The ability to specify an arbitrary list of labels and to query based on these in real time is why Prometheus' data model is called multi-dimensional.{{cite web|url = https://prometheus.io/docs/concepts/data_model/|title = Data model|publisher = Prometheus|accessdate = December 26, 2018}}
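The multi-dimensional model described above can be illustrated with a minimal Python sketch. It is not Prometheus code: the metric name <code>http_requests_total</code>, the label values, and the helper names <code>Series</code>, <code>matches</code>, and <code>select</code> are hypothetical, chosen only to show how a metric name plus key=value labels identifies a series and how label equality can be used to filter.

```python
from typing import Dict, List, NamedTuple


class Series(NamedTuple):
    """One time series: a metric name plus a set of key=value labels."""
    name: str
    labels: Dict[str, str]


def matches(series: Series, name: str, selector: Dict[str, str]) -> bool:
    """Return True if the series has the given name and every selector label."""
    return series.name == name and all(
        series.labels.get(key) == value for key, value in selector.items()
    )


# Hypothetical series for one metric, broken down along three label dimensions.
SERIES: List[Series] = [
    Series("http_requests_total", {"instance": "web-1", "method": "GET", "status": "200"}),
    Series("http_requests_total", {"instance": "web-1", "method": "POST", "status": "500"}),
    Series("http_requests_total", {"instance": "web-2", "method": "GET", "status": "200"}),
]


def select(name: str, **selector: str) -> List[Series]:
    """Select every series of a metric whose labels equal the selector."""
    return [s for s in SERIES if matches(s, name, selector)]
```

In this sketch, <code>select("http_requests_total", method="GET")</code> plays roughly the role of the PromQL selector <code>http_requests_total{method="GET"}</code>: any combination of labels can be used as a filter without being declared in advance.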
Prometheus stores data locally on disk, which allows fast writes and fast querying. Metrics can also be written to remote storage.{{cite web|url=https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage|title=Integrations - Prometheus|website=prometheus.io}}
= Data collection =
Prometheus collects data in the form of time series. The time series are built through a pull model: the Prometheus server queries a list of data sources (sometimes called exporters) at a specific polling frequency. Each of the data sources serves the current values of the metrics for that data source at the endpoint queried by Prometheus. The Prometheus server then aggregates data across the data sources. Prometheus has a number of mechanisms to automatically discover resources that should be used as data sources.{{cite web|url = http://www.openwebit.com/c/prometheus-collects-metrics-provides-alerting-and-graphs-web-ui/|title = Prometheus: Collects metrics, provides alerting and graphs web UI|date = March 18, 2017|accessdate = December 26, 2018}}
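What a data source serves on each scrape can be sketched as follows. The Python example below parses a small payload in the style of the Prometheus text exposition format. It is a simplified illustration under stated assumptions: the payload, the metric names, and the function <code>parse_scrape</code> are hypothetical; the HTTP request itself is omitted so the example stays self-contained; and label escaping and optional timestamps are not handled.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical payload that a data source might serve at its metrics endpoint.
SAMPLE_PAYLOAD = """\
# HELP http_requests_total Total HTTP requests handled.
# TYPE http_requests_total counter
http_requests_total{method="GET",status="200"} 1027
http_requests_total{method="POST",status="500"} 3
process_open_fds 42
"""

# metric_name, optional {labels}, then the sample value.
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_PAIR = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_scrape(payload: str) -> List[Sample]:
    """Turn one scrape payload into (name, labels, value) samples."""
    samples: List[Sample] = []
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_PAIR.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Repeating such a scrape at the polling interval and attaching the scrape time to each value yields the time series that the server stores.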
= PromQL =
Prometheus provides its own query language, PromQL (Prometheus Query Language), that lets users select and aggregate data. PromQL is designed to work with a time series database and therefore provides time-related query functionality. Examples include the {{tt|rate()}} function, the instant vector, and the range vector, which can provide many samples for each queried time series.{{cite web|url = https://prometheus.io/docs/prometheus/latest/querying/basics/|title = Querying Prometheus| accessdate = November 4, 2019}} Prometheus has four clearly defined metric types around which the PromQL components revolve. The four types are:{{Cite web |last= |title=Metric types |url=https://prometheus.io/docs/concepts/metric_types/ |access-date=2024-06-29 |website=prometheus.io |language=en}}
- Gauge
- Counter
- Histogram
- Summary
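How a counter, a range vector, and {{tt|rate()}} relate can be illustrated with a simplified Python sketch. It is not the PromQL implementation: the function name <code>simple_rate</code> is hypothetical, and the sketch ignores the extrapolation to window boundaries that the real function performs. It shows only the core idea of summing increases across the samples in a window, treating any drop in value as a counter reset, and dividing by the elapsed time.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Per-second increase of a counter over the samples of one range."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a rate")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            increase += curr - prev
        else:
            # The counter dropped, so assume it was reset to zero and has
            # since counted up to the current value.
            increase += curr
    return increase / elapsed
```

A gauge, by contrast, can go up and down freely, so a drop in its value carries no reset meaning and a rate computed this way would not be appropriate.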
== Example code ==
{{sxhl|2=promql|1=
# A metric with label filtering
go_gc_duration_seconds{instance="localhost:9090", job="alertmanager"}

# Aggregation operators
sum by (app, proc) (
  instance_memory_limit_bytes - instance_memory_usage_bytes
) / 1024 / 1024
}}
= Alerts and monitoring =
Alerting rules can be configured in Prometheus; each rule specifies a condition that must hold for a given duration before the alert triggers. When alerts trigger, they are forwarded to the Alertmanager service. Alertmanager can include logic to silence alerts and also to forward them to email, Slack, or notification services such as PagerDuty.{{cite web|url=https://medium.com/@abhishekbhardwaj510/alertmanager-integration-in-prometheus-197e03bfabdf |title=AlertManager Integration with Prometheus |last=Dubey| first=Abhishek |date=March 25, 2018 |access-date=December 26, 2018}} Some other messaging systems, such as Microsoft Teams,{{cite web|url = https://medium.com/@Danuka_Praneeth/generating-alerts-from-prometheus-dc66522ecbe5|title = Alerting for Cloud-native Applications with Prometheus |last=Danuka |first=Praneeth |date=March 8, 2020 |access-date = October 18, 2020}} can be configured using the Alertmanager Webhook Receiver as a mechanism for external integrations.{{cite web | url=https://prometheus.io/docs/operating/integrations/#alertmanager-webhook-receiver | title=Integrations | Prometheus }} The Prometheus Alerts application can also be used to receive alerts directly on Android devices, without requiring any target configuration in Alertmanager.{{cite web | url=https://play.google.com/store/apps/details?id=com.khafan.prometheusalerts | title=Prometheus alerts - Apps on Google Play }}
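The requirement that a condition hold for a given duration before an alert triggers can be sketched as a small state machine. The Python example below is an illustration only: the class <code>AlertRule</code>, its parameters, and the threshold condition are hypothetical, while the state names follow the inactive, pending, and firing states that Prometheus reports for alerts.

```python
from typing import List, Optional, Tuple


class AlertRule:
    """Tracks one alert through the inactive, pending, and firing states."""

    def __init__(self, threshold: float, hold_seconds: float) -> None:
        self.threshold = threshold        # condition: value > threshold
        self.hold_seconds = hold_seconds  # how long the condition must hold
        self.active_since: Optional[float] = None

    def evaluate(self, now: float, value: float) -> str:
        """Evaluate the rule at time `now` and return the resulting state."""
        if value <= self.threshold:
            self.active_since = None  # condition broken: start over
            return "inactive"
        if self.active_since is None:
            self.active_since = now   # condition just became true
        if now - self.active_since >= self.hold_seconds:
            return "firing"           # would be forwarded to Alertmanager
        return "pending"


def run(rule: AlertRule, observations: List[Tuple[float, float]]) -> List[str]:
    """Evaluate the rule against (timestamp, value) observations in order."""
    return [rule.evaluate(now, value) for now, value in observations]
```

The pending state prevents short spikes from producing notifications: only a condition that remains true across consecutive evaluations for the whole duration reaches the firing state.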
= Time series database =
Prometheus has its own implementation of a time series database, in which it stores recent data (1-3 hours of data by default) in a combination of memory{{Cite web |date=2020-09-19 |title=Prometheus TSDB (Part 1): The Head Block {{!}} Ganesh Vernekar |url=https://ganeshvernekar.com/blog/prometheus-tsdb-the-head-block/ |access-date=2025-01-17 |website=ganeshvernekar.com |language=en}} and mmap-ed files from disk,{{Cite web |date=2020-10-02 |title=Prometheus TSDB (Part 3): Memory Mapping of Head Chunks from Disk {{!}} Ganesh Vernekar |url=https://ganeshvernekar.com/blog/prometheus-tsdb-mmapping-head-chunks-from-disk/ |access-date=2025-01-17 |website=ganeshvernekar.com |language=en}} and persists older data in the form of blocks{{Cite web |date=2020-10-18 |title=Prometheus TSDB (Part 4): Persistent Block and its Index {{!}} Ganesh Vernekar |url=https://ganeshvernekar.com/blog/prometheus-tsdb-persistent-block-and-its-index/ |access-date=2025-01-17 |website=ganeshvernekar.com |language=en}} with an inverted index. An inverted index is well suited to the Prometheus data format and querying patterns.{{Cite web |date=2021-01-04 |title=Prometheus TSDB (Part 5): Queries {{!}} Ganesh Vernekar |url=https://ganeshvernekar.com/blog/prometheus-tsdb-queries/ |access-date=2025-01-17 |website=ganeshvernekar.com |language=en}} As part of background maintenance, smaller blocks are merged into bigger blocks in a process called compaction,{{Cite web |date=2021-07-27 |title=Prometheus TSDB (Part 6): Compaction and Retention {{!}} Ganesh Vernekar |url=https://ganeshvernekar.com/blog/prometheus-tsdb-compaction-and-retention/ |access-date=2025-01-17 |website=ganeshvernekar.com |language=en}} which improves query efficiency by leaving fewer blocks to read.
Prometheus also uses a write-ahead log (WAL){{Cite web |date=2020-09-26 |title=Prometheus TSDB (Part 2): WAL and Checkpoint {{!}} Ganesh Vernekar |url=https://ganeshvernekar.com/blog/prometheus-tsdb-wal-and-checkpoint/ |access-date=2025-01-17 |website=ganeshvernekar.com |language=en}} to provide durability against crashes.
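Why an inverted index suits label-based queries can be shown with a minimal Python sketch. It is an in-memory illustration, not the on-disk block format: the class <code>InvertedIndex</code>, its method names, and the series identifiers are hypothetical. Each label pair maps to the set of series that carry it, and a query with several label matchers is answered by intersecting those sets. The metric name is treated as the label <code>__name__</code>, following the Prometheus convention.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class InvertedIndex:
    """Maps each (label name, label value) pair to the series that carry it."""

    def __init__(self) -> None:
        self._postings: Dict[Tuple[str, str], Set[int]] = defaultdict(set)

    def add(self, series_id: int, labels: Dict[str, str]) -> None:
        """Register a series under every one of its label pairs."""
        for pair in labels.items():
            self._postings[pair].add(series_id)

    def lookup(self, matchers: Dict[str, str]) -> List[int]:
        """Return the sorted IDs of series that match every label pair."""
        if not matchers:
            return []
        candidate_sets = [self._postings.get(pair, set()) for pair in matchers.items()]
        return sorted(set.intersection(*candidate_sets))
```

Because each lookup touches only the entries for the requested label pairs, the cost of a query depends on the size of those sets rather than on the total number of series stored.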
= Dashboards =
Prometheus is not intended as a full-fledged dashboarding tool. Although it can graph specific queries, it needs to be connected to Grafana to generate dashboards; this has been cited as a disadvantage due to the additional setup complexity.{{cite web|url = https://jaxenter.com/prometheus-monitoring-pros-cons-136019.html|title = Prometheus monitoring: Pros and cons|date = July 28, 2017|last = Ryckbosch|first = Frederick|accessdate = December 26, 2018}}
= Interoperability =
Prometheus favors white-box monitoring. Applications are encouraged to publish (export) internal metrics to be collected periodically by Prometheus.{{cite web |url=https://prometheus.io/docs/practices/instrumentation/|title=Instrumentation - Prometheus|last=Prometheus|website=prometheus.io}} Exporters and agents are available to provide metrics for various applications.{{cite web |url=https://prometheus.io/docs/instrumenting/exporters/ |title=Exporters |website=prometheus.io}} To ease the transition from other systems, Prometheus supports several monitoring and administration protocols: Graphite, StatsD, SNMP, JMX, and CollectD.
Prometheus focuses on the availability of the platform and basic operations.{{cite web|url=https://prometheus.io/|title=Prometheus - Monitoring system & time series database|last=Prometheus|website=prometheus.io}} The metrics are typically stored for a few weeks. For long-term storage, the metrics can be streamed to remote storage.
= Standardization into OpenMetrics =
There is an effort to promote the Prometheus exposition format into a standard known as OpenMetrics.{{cite web |url=https://github.com/OpenObservability/OpenMetrics |website=GitHub |title=OpenMetrics|date=2018-11-13 }} Some products have adopted the format: InfluxData's TICK suite,{{cite web |url=https://github.com/influxdata/telegraf/tree/master/plugins/inputs/prometheus |title=Telegraf from InfluxData |website=GitHub |date=2018-12-25 }} InfluxDB, Google Cloud Platform,{{cite web |url=https://cloudplatform.googleblog.com/2018/05/Announcing-Stackdriver-Kubernetes-Monitoring-Comprehensive-Kubernetes-observability-from-the-start.html |title=Announcing Stackdriver Kubernetes Monitoring}} DataDog,{{cite web |url=https://docs.datadoghq.com/agent/prometheus/ |title=DataDogHQ}} and New Relic.{{cite web |title=Send Prometheus metric data to New Relic {{!}} New Relic Documentation |url=https://docs.newrelic.com/docs/infrastructure/prometheus-integrations/get-started/send-prometheus-metric-data-new-relic/ |website=docs.newrelic.com |access-date=16 April 2025}}{{cite web |title=Configure Prometheus OpenMetrics integrations {{!}} New Relic Documentation |url=https://docs.newrelic.com/docs/infrastructure/prometheus-integrations/install-configure-openmetrics/configure-prometheus-openmetrics-integrations/ |website=docs.newrelic.com |access-date=16 April 2025}}
See also
{{Portal|Free and open-source software}}
References
{{reflist}}
Further reading
- {{Cite book|title=Monitoring Docker: Monitor your Docker containers and their apps using various native and third-party tools with the help of this exclusive guide!|last=McKendrick|first=Russ|isbn=9781785885501|location=Birmingham, UK|oclc=933610431|date = 2015-12-15}}
- {{Cite book|title=Kubernetes for Developers: Use Kubernetes to develop, test, and deploy your applications with the help of containers|last=Heck|first=Joseph|date=2018|publisher=Packt Publishing|isbn=978-1788830607|location=[S.l.]|oclc=1031909876}}
- {{Cite book|title=Designing distributed systems : patterns and paradigms for scalable, reliable services|author=Burns, Brendan|isbn=9781491983614|edition=First|location=Sebastopol, CA|oclc=1023861580|date = 2018-02-20}}
- {{Cite book|title=Cloud Native Programming with Golang: Develop microservice-based high performance web apps for the cloud with Go|last=Helmich|first=Martin|date=2017|publisher=Packt Publishing|others=Andrawos, Mina; Snoeck, Jelmer|isbn=9781787127968|location=Birmingham|oclc=1020029257}}
- {{Cite book|title=Hybrid cloud for architects : build robust hybrid cloud solutions using AWS and OpenStack|last=Shrivastwa|first=Alok|isbn=9781788627986|location=Birmingham, UK|oclc=1028641698|date = 2018-02-23}}
- {{cite book|title=Native Docker Clustering with Swarm |first=Chanwit |last=Kaewkasi |year=2016 |publisher=Packt Publishing, Limited |isbn=978-1786469755}}
External links
- {{YouTube|rT4fJNbfe14|Prometheus: The Documentary}}
Category:Software using the Apache license