- PDF Link: cheatsheet-prometheus-A4.pdf, Category: tools
- Blog URL: https://cheatsheet.dennyzhang.com/cheatsheet-prometheus-A4
- Related posts: Nagios CheatSheet, Kubectl CheatSheet, #denny-cheatsheets
Name | Command |
---|---|
Run prometheus server with docker | docker run -p 9090:9090 prom/prometheus , http://localhost:9090/graph , http://localhost:9090/metrics |
Run cadvisor to get local containers’ metrics | docker run -v /var/run:/var/run -v /sys:/sys -p 8080:8080 google/cadvisor , http://localhost:8080/metrics |
Query metrics via the HTTP API instead of the web console | curl 'http://localhost:9090/api/v1/query?query=container_memory_usage_bytes' |
List all alerts in Alertmanager | curl http://localhost:9093/api/v2/alerts (the v1 API was removed in Alertmanager 0.27) |
Prometheus tech stack footprint | prometheus(350MB RAM), node-exporter(10MB), kube-state-metrics(20MB), alertmanager(15MB), grafana(30MB) |
Example of client libraries | Link: prometheus-python-example.py |
Prometheus Online Demo | Live demo from CloudAlchemy |
Prometheus Config file | /etc/prometheus/prometheus.yml Sections in conf: global , rule_files , scrape_configs |
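
The API query row above can also be driven from a script. The following is a minimal sketch using only the Python standard library: it builds the instant-query URL and unpacks a response. The helper names `build_query_url` and `extract_samples` are hypothetical, and `SAMPLE_RESPONSE` is a hand-written stand-in shaped like an instant-vector response, not real server output.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression."""
    # urlencode escapes characters such as { } " and = that a shell
    # would otherwise interpret or that are not valid in a raw URL.
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def extract_samples(response_text):
    """Turn an instant-query JSON response into (labels, value) pairs."""
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error')}")
    # Each result carries a label set and a [timestamp, "value"] pair.
    return [(r["metric"], float(r["value"][1])) for r in body["data"]["result"]]


# Hand-written sample in the shape of an instant-vector response
# (illustrative values only).
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "container_memory_usage_bytes", "id": "/"},
             "value": [1700000000.0, "123456"]},
        ],
    },
})

url = build_query_url("http://localhost:9090", "container_memory_usage_bytes")
samples = extract_samples(SAMPLE_RESPONSE)
```

Against a live server, the text passed to `extract_samples` would come from fetching `url`, for example with `urllib.request.urlopen(url).read()`.
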
Name | Summary |
---|---|
Prometheus server | Scrapes and stores time series data. It mainly uses a pull model rather than push. |
Special-purpose exporters | Get metrics from all kinds of services, e.g., Node Exporter, Blackbox Exporter, SNMP Exporter, JMX Exporter, etc. |
Client libraries | Instrument application code. |
Alertmanager | Handle alerts. |
Pushgateway | Supports short-lived jobs by persisting the most recent metrics pushed by batch jobs. |
Reference | Link: Exporters And Integrations, Link: Default port allocations |
Name | Summary |
---|---|
Counter | Only goes up (and resets to zero on restart); counts something, e.g., the number of requests served, tasks completed, or errors. |
Gauge | Goes up and down; a snapshot of state, e.g., temperature or current memory usage. |
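Summary | Samples observations and reports their count, sum, and quantiles over a sliding time window. e.g., rate(http_request_duration_seconds_sum[5m]) / rate(http_request_duration_seconds_count[5m]) gives the average request duration |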
Histogram | It samples observations and counts them in configurable buckets. |
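
To make the difference between these metric types concrete, here is a toy sketch in plain Python. The classes below are hypothetical illustrations of the semantics only; they are not the API of any Prometheus client library, and the bucket bounds are arbitrary.

```python
import bisect


class Counter:
    """Monotonically increasing value; only a process restart resets it."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only go up")
        self.value += amount


class Gauge:
    """Snapshot of current state; may go up or down."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class Histogram:
    """Counts observations in buckets keyed by upper bound (le)."""

    def __init__(self, upper_bounds):
        # A catch-all +Inf bucket is always present.
        self.upper_bounds = sorted(upper_bounds) + [float("inf")]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        # First bucket whose upper bound is >= value (le is inclusive).
        index = bisect.bisect_left(self.upper_bounds, value)
        self.bucket_counts[index] += 1
        self.sum += value
        self.count += 1

    def cumulative(self):
        """Return (le, count) pairs, cumulative as in *_bucket series."""
        total, out = 0, []
        for bound, n in zip(self.upper_bounds, self.bucket_counts):
            total += n
            out.append((bound, total))
        return out


requests_served = Counter()
requests_served.inc()
requests_served.inc(2)

latency = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.1, 0.3, 2.0):
    latency.observe(seconds)
```

The histogram keeps cumulative bucket counts plus a running sum and count, which is what lets quantiles and averages be derived later at query time.
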
Name | Summary |
---|---|
Target | A target is the definition of an object to scrape. |
Job | A collection of targets with the same purpose. |
Instance | A label that uniquely identifies a target in a job. |
Exporter | Exposes metrics by converting a non-Prometheus format into one that Prometheus supports. |
Collector | A part of an exporter that represents a set of metrics. |
Handler | |
Rule |
Name | Command | Sample Metrics |
---|---|---|
cadvisor | http://$node_ip:10255/metrics/cadvisor | Link: cadvisor-sample.txt |
node-exporter | http://$node_ip:9100/metrics | Link: node-exporter-sample.txt |
kubelet | http://$kubelet_ip:10255/metrics | Link: kubelet-sample.txt |
kube-dns | http://$kube_dns_addon_ip:10054/metrics | Link: kube-dns-sample.txt |
kube-state-metrics http-metric | http://$kube_state_metric_svc:8080/metrics | Link: kube-state-metrics-http-sample.txt |
kube-state-metrics telemetry | http://$kube_state_metric_svc:8081/metrics | Link: kube-state-metrics-telemetry-sample.txt |
apiserver | https://$api_server:443/metrics |
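
Each endpoint above serves metrics in the Prometheus text exposition format. The sketch below is a minimal, simplified parser for that format, written with the standard library only. `parse_metrics` is a hypothetical helper, `SAMPLE_TEXT` uses illustrative values, and escape sequences inside label values are left as-is rather than decoded.

```python
import re

# Matches one name="value" label pair, allowing escaped quotes in the value.
LABEL_PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and "# HELP" / "# TYPE" comments
        if "{" in line:
            name, rest = line.split("{", 1)
            label_text, _, tail = rest.rpartition("}")
            labels = dict(LABEL_PAIR.findall(label_text))
        else:
            name, _, tail = line.partition(" ")
            labels = {}
        # tail is "value" or "value timestamp"; float() accepts NaN and +Inf.
        value = float(tail.split()[0])
        samples.append((name.strip(), labels, value))
    return samples


# Illustrative input; the values are made up.
SAMPLE_TEXT = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
node_load1 0.42
"""

parsed = parse_metrics(SAMPLE_TEXT)
```
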
Name | Command |
---|---|
Reference | Link: query |
Find metric by name+job+group | somemetric{job="prometheus",group="canary"} |
Per-second rate of GET requests with status code 200, over the last minute | rate(apiserver_request_count{verb="GET", code="200"}[1m]) |
Average network traffic received per second over the last minute | rate(node_network_receive_bytes_total[1m]) |
topk query | Link: query-topk.txt |
join | |
cut | |
slice | |
count | |
predict | |
sum | |
min | |
max | |
avg |
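
The `rate()` queries above turn a counter into a per-second average over a time window. The following is a rough sketch of that idea in Python, assuming samples arrive as `(timestamp_seconds, value)` pairs. `simple_rate` is a hypothetical helper and a simplification: it accounts for counter resets but omits the window-boundary extrapolation that PromQL's `rate()` applies.

```python
def simple_rate(samples):
    """Per-second increase of a counter over (timestamp, value) samples.

    Simplified: handles counter resets, but does not extrapolate to the
    edges of the window as PromQL's rate() does.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, so the whole
        # current value counts as new increase.
        increase += (curr - prev) if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# One scrape every 15 seconds; the counter resets between t=30 and t=45.
scrapes = [(0, 100), (15, 130), (30, 160), (45, 10), (60, 40)]
per_second = simple_rate(scrapes)
```

Treating a drop as a restart is why a counter reset does not show up as a negative rate.
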
Name | Command |
---|---|
How full will the disks be in 4 hours? | |
Which services are the top 5 users of CPU? | |
What’s the 95th percentile latency in EU datacenter? |
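
Questions like these map onto the PromQL functions `predict_linear`, `topk`, and `histogram_quantile`. As a hedged illustration of the first, the sketch below fits a least-squares line through recent samples and extends it four hours ahead. The function is a hypothetical stand-in, the sample values are made up, and the metric name in the comment is an assumption about what node-exporter exposes in a given setup.

```python
def predict_linear(samples, seconds_ahead):
    """Least-squares line through (timestamp, value) samples, evaluated
    seconds_ahead past the last sample.

    Simplified: PromQL evaluates relative to the query time instead.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = cov / var
    intercept = mean_v - slope * mean_t
    return intercept + slope * (samples[-1][0] + seconds_ahead)


# Roughly the question asked by a query such as
#   predict_linear(node_filesystem_avail_bytes[1h], 4 * 3600)
# (metric name assumed). Free space in GiB, sampled every 30 minutes.
free_gib = [(0, 100.0), (1800, 99.5), (3600, 99.0)]
in_four_hours = predict_linear(free_gib, 4 * 3600)
```

A prediction at or below zero suggests the disk would fill within the horizon, which is the usual basis for an alert on this kind of query.
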
License: Code is licensed under MIT License.
https://povilasv.me/prometheus-tracking-request-duration/