query processing would load too many samples into memory in query execution

在 Grafana 中有的 dashboard 只要时间范围选择稍微长一些，dashboard 就展示就会失败

query processing would load too many samples into memory in query execution

由于 PromQL 语句可能会载入大量的 metrics 数据，导致 Prometheus 内存以及 CPU 消耗超标，所以 Prometheus 提供了相关命令行参数，防止复杂的查询耗光资源

–query.timeout=2m

Maximum time a query may take before being aborted.

–query.max-concurrency=20

Maximum number of queries executed concurrently.

–query.max-samples=50000000

Maximum number of samples a single query can load into memory. Note that

queries will fail if they try to load more samples than this into memory,

so this also limits the number of samples a query can return.

Prometheus 时序数据模型参见 Data model | Prometheus

Prometheus fundamentally stores all data as time series:

streams of timestamped values belonging to the same metric and

the same set of labeled dimensions.

每 <metric name>{<label name>=<label value>, ...} 值对应一 time series 。

如下两条 metric 由于 user_agent label 值不一样，从而属于两个 time series

nginx_http_response_count_total{request_uri="/index.html",method="GET",status="200",user_agent="Dalvik/1.6.0"}

nginx_http_response_count_total{request_uri="/index.html",method="GET",status="200",user_agent="Dalvik/2.1.0"}

当某些 label 取值较多的情况下，会导致 time series 过多，导致无法展示。

可以打开 http://prometheus.yourcompany.com/graph 实际执行一下查询语句，看一下查询性能， time series 过多时查询最近一个小时

Load time: 21119ms

Resolution: 14s

Total time series: 18435

做一个简单的计算（采用默认的 scrape_interval 值 15s ）

60*60/15*18435=4424400

假设 time series 数不变，最多只支持查询 11.3（50000000/4424400） 小时数据。