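Tip: Sidecars are sometimes also called sidekicks.
prom/mysqld-exporter image, and named the container tornado-db-exp. We specified the database connection details with the DATA_SOURCE_NAME environment variable, which uses the DSN format to carry the connection and credential details for the MySQL server.
kubectl exec -ti <pod> -- /usr/bin/mysql -p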
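As a rough illustration of the sidecar described above, the exporter container might be declared along these lines. The container name, image, environment variable name, and port 9104 come from the text; the user name, password, and host in the DSN are placeholders invented for this sketch.

```yaml
# Sketch of the exporter sidecar inside the MySQL pod spec.
# The DSN credentials and host are placeholders, not values from the text.
containers:
  - name: tornado-db-exp
    image: prom/mysqld-exporter
    env:
      - name: DATA_SOURCE_NAME
        # DSN form: user:password@(host:port)/
        value: "exporter:CHANGE_ME@(127.0.0.1:3306)/"
    ports:
      - containerPort: 9104   # port the exporter serves metrics on
```

Because the sidecar shares the pod's network namespace, pointing the DSN at the loopback address is one plausible choice. Note that mysqld_exporter releases from 0.15.0 onward are reported to have dropped the DATA_SOURCE_NAME variable, so an untagged image may need an older, pinned tag or a different connection mechanism.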
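prometheus.io/scrape tells Prometheus to scrape this service; prometheus.io/port tells it which port to scrape. We specify the port because we want Prometheus to reach the MySQL Exporter on port 9104 rather than the MySQL server directly.
relabel_configs: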
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
  action: keep
  regex: true
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
  action: replace
  target_label: __address__
  regex: ([^:]+)(?::\d+)?;(\d+)
  replacement: $1:$2
……
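The port from the prometheus.io/port annotation is written into the __address__ label, so the job scrapes that port. Service discovery will then start collecting these MySQL metrics.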
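To tie the annotations and the relabeling rules together, here is a minimal sketch of a Service that would be picked up, followed by a walk-through of how its address is rewritten. The Service name is assumed to match the kubernetes_name="tornado-db" label used in the alert rules; the selector label and the IP address are purely illustrative.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: tornado-db                   # assumed name
  annotations:
    prometheus.io/scrape: "true"     # matched by the keep rule (regex: true)
    prometheus.io/port: "9104"       # MySQL Exporter port, not MySQL's own port
spec:
  selector:
    app: tornado-db                  # hypothetical pod label
  ports:
    - name: mysql
      port: 3306
# Relabeling walk-through for one discovered target (illustrative IP):
#   __address__ = 10.0.0.5:3306, port annotation = 9104
#   source label values joined with the default ";" separator: 10.0.0.5:3306;9104
#   regex ([^:]+)(?::\d+)?;(\d+) captures $1 = 10.0.0.5 and $2 = 9104
#   replacement $1:$2 sets __address__ = 10.0.0.5:9104
```

Annotation values must be strings in Kubernetes, which is why both are quoted.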
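Warning: Measuring MySQL performance is hard, especially when tracking signals such as latency, and the picture varies widely with the application and server configuration. These rules give you a starting point, not a definitive answer.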
- alert: MySQLHighSlowQuerysHigh
  expr: rate(mysql_global_status_slow_queries[2m]) > 5
  labels:
    severity: warning
  annotations:
    summary: MySQL Slow query rate is exceeded on {{ $labels.instance }} for {{ $labels.kubernetes_name }}
groups:
- name: mysql_rules
  rules:
  - record: mysql:write_requests:rate2m
    expr: sum(rate(mysql_global_status_commands_total{command=~"insert|update|delete"}[2m])) without (command)
  - record: mysql:select_requests:rate2m
    expr: sum(rate(mysql_global_status_commands_total{command="select"}[2m]))
  - record: mysql:total_requests:rate2m
    expr: rate(mysql_global_status_commands_total[2m])
- alert: MySQLAbortedConnectionsHigh
  expr: rate(mysql_global_status_aborted_connects[2m]) > 5
  labels:
    severity: warning
  annotations:
    summary: MySQL Aborted connection rate is exceeded on {{ $labels.instance }} for {{ $labels.kubernetes_name }}
- alert: TornadoDBServerDown
  expr: mysql_up{kubernetes_name="tornado-db"} == 0
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: MySQL Server {{ $labels.instance }} is down!
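One way to check that an alert such as TornadoDBServerDown behaves as intended is a promtool rule unit test. The sketch below assumes the rule lives inside a rule group in a file called mysql_alerts.yml; both file names and the instance address are hypothetical.

```yaml
# mysql_alerts_test.yml (hypothetical name)
# Run with: promtool test rules mysql_alerts_test.yml
rule_files:
  - mysql_alerts.yml          # hypothetical file containing the TornadoDBServerDown rule
evaluation_interval: 1m
tests:
  - interval: 1m
    input_series:
      # mysql_up stays at 0 for 16 samples (0m through 15m).
      - series: 'mysql_up{kubernetes_name="tornado-db", instance="10.0.0.5:9104"}'
        values: '0+0x15'
    alert_rule_test:
      # The 10m "for" window has elapsed, so the alert is expected to be firing.
      - eval_time: 11m
        alertname: TornadoDBServerDown
        exp_alerts:
          - exp_labels:
              severity: critical
              kubernetes_name: tornado-db
              instance: 10.0.0.5:9104
            exp_annotations:
              summary: MySQL Server 10.0.0.5:9104 is down!
```

The expected labels combine the rule's own labels with those of the series that triggered it, and the summary is the template with the instance label filled in.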
apiVersion: apps/v1beta2
kind: Deployment
- name: redis-exporter
  image: oliver006/redis_exporter:latest
  env:
  - name: REDIS_ADDR
    value: redis://tornado-redis:6379
  - name: REDIS_PASSWORD
    value: tornadoapi
  ports:
  - containerPort: 9121
- alert: TornadoRedisCacheMissesHigh
  expr: redis_keyspace_misses_total / (redis_keyspace_hits_total + redis_keyspace_misses_total) > 0.8
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: Redis Server {{ $labels.instance }} Cache Misses are high.
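The keyspace counters are cumulative since the Redis server started, so a ratio of the raw counters reflects the lifetime average rather than recent behavior. A windowed variant, sketched here as an alternative and not taken from the text, compares per-second rates instead. The alert name and the 5m window are assumptions; the 0.8 threshold and the 10m duration are carried over from the rule above.

```yaml
- alert: TornadoRedisCacheMissRatioHigh   # hypothetical name
  # Share of lookups that missed over the last 5 minutes.
  expr: |
    rate(redis_keyspace_misses_total[5m])
      /
    (rate(redis_keyspace_hits_total[5m]) + rate(redis_keyspace_misses_total[5m]))
      > 0.8
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: Redis Server {{ $labels.instance }} cache miss ratio is high.
```

When there is no lookup traffic at all, the division yields NaN, the comparison drops the sample, and the alert stays quiet.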
- alert: TornadoRedisServerDown
  expr: redis_up{kubernetes_name="tornado-redis"} == 0
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: Redis Server {{ $labels.instance }} is down!
(defproject tornado-api "0.1.0-SNAPSHOT"
  :description "Example Clojure REST service for AoM"
  :url "http://artofmonitoring.com"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [compojure "1.1.1"]
                 [ring/ring-json "0.1.2"]
                 [ring/ring-jetty-adapter "1.3.1"]
                 [ring-logger-timbre "0.7.5"]
                 [com.taoensso/timbre "4.2.1"]
                 [c3p0/c3p0 "0.9.1.2"]
                 [org.clojure/java.jdbc "0.4.2"]
                 [mysql/mysql-connector-java "5.1.38"]
                 [com.taoensso/carmine "2.12.2"]
                 [cheshire "4.0.3"]
                 [clj-statsd "0.3.11"]]
  :plugins [[lein-ring "0.7.3"]]
  :main tornado-api.handler
  :ring {:handler tornado-api.handler/app}
  :profiles {
    :dev {:dependencies [[ring-mock "0.1.3"]]}
    :uberjar {:aot :all}})
(defn buy-item [item]
  (let [id (uuid)]
    (sql/db-do-commands db-config
      (let [item (assoc item "id" id)]
        (sql/insert! db-config :items item)
        (statsd/gauge (str statsd-prefix "item.bought.total") (item "price"))))
    (wcar* (car/ping)
           (car/set id (item "title")))
    (get-item id)))
(prometheus/set (registry :tornado/up) 1)
(def app
  (-> (handler/api app-routes)
      (middleware/wrap-json-body)
      (middleware/wrap-json-response)
      (ring/wrap-metrics registry {:path "/metrics"})))
- record: tornado:request_latency_seconds:avg
  expr: http_request_latency_seconds_sum{status="200"} / http_request_latency_seconds_count{status="200"}
- alert: TornadoRequestLatencyHigh
  expr: histogram_quantile(0.9, rate(http_request_latency_seconds_bucket{kubernetes_name="tornado-api"}[5m])) > 0.05
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: API Server {{ $labels.instance }} latency is over 0.05.
- alert: TornadoAPIServerDown
  expr: tornado_up{kubernetes_name="tornado-api"} != 1
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: API Server {{ $labels.instance }} is down!
- alert: TornadoAPIServerGone
  expr: absent(tornado_up{kubernetes_name="tornado-api"})
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: No Tornado API servers are reporting!
    description: Werner Heisenberg says - there is no uncertainty about the Tornado API server being gone.