本文主要介绍 CTSDB 中常用查询及关键字的使用,并给出简单 CURL 示例。所有示例使用的 metric 结构如下所示:
{"ctsdb_test" : {"tags" : {"region" : "string"},"time" : {"name" : "timestamp","format" : "epoch_millis"},"fields" : {"cpuUsage" : "float","diskUsage" : "string","dcpuUsage" : "integer"},"options" : {"expire_day" : 7,"refresh_interval" : "10s","number_of_shards" : 5,"indexed_fields" : "cpuUsage"}}}
组合查询
这里的组合查询即可以用于单查询也可用于复合查询。查询体中的 query 关键字通过 Query DSL(Domain Specific Language)来定义查询条件。
下文会先介绍如何构造过滤条件,再介绍怎样组合过滤条件和如何处理返回结果集。
常用过滤条件
1. Range
Range 即区间查询,Range 查询支持的字段类型包括 string、long、integer、short、double、float、date。Range 查询可包含的参数如下表所示:
参数名称 | 描述 |
gte | 大于或等于 |
gt | 大于 |
lte | 小于或等于 |
lt | 小于 |
说明
时间范围查询的 CURL 示例:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"range": {"timestamp": {"gte": "01/01/2018","lte": "03/01/2018","format": "MM/dd/yyyy","time_zone":"+08:00"}}}}'
数字范围查询 CURL 示例:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"range": {"cpuUsage": {"gte": 1.0,"lte": 10.0}}}}'
2. Terms
Terms 关键字用于查询时匹配特定字段,其值需要用中括号包裹。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"terms": {"region": ["sh", "bj"]}}}'
过滤条件组合
复合查询常使用 Bool 关键字来组合多个查询条件,常见的用于 Bool 查询的组合关键字有 filter(类似于 AND)、must_not(类似于 NOT)、should(类似于 OR)。为提高查询性能,请务必加上 time 字段的 range 查询,且 time 字段,不论查询时以何种方式写入,返回值统一为 epoch_millis 格式。query-bool-filter 的组合形式可显著提升查询性能,请务必采用这种查询方式。
1. AND 条件 CURL 示例说明
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"bool": {"filter": [{"range": {"timestamp": {"format": "yyyy-MM-dd HH:mm:ss","gte": "2017-11-06 23:00:00","lt": "2018-03-06 23:05:00","time_zone":"+08:00"}}},{"terms": {"region": ["sh"]}},{"terms": {"cpuUsage": ["2.0"]}}]}},"docvalue_fields": ["cpuUsage","timestamp"]}'
说明
此查询条件类似于
timestamp>='2017-11-06 23:00:00' AND timestamp<'2018-03-06 23:05:00' AND region=sh AND cpuUsage=2.0
。2. OR 条件 CURL 示例说明
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"bool": {"filter": [{"range": {"timestamp": {"format": "yyyy-MM-dd HH:mm:ss","gte": "2017-11-06 23:00:00","lt": "2018-11-06 23:05:00","time_zone":"+08:00"}}},{"term": {"region": "gz"}}],"should": [{"terms": {"cpuUsage": ["2.0"]}},{"terms": {"cpuUsage": ["2.5"]}}],"minimum_should_match": 1}},"docvalue_fields": ["cpuUsage","timestamp"]}'
说明
此查询条件类似于
timestamp>='2017-11-06 23:00:00' AND timestamp<'2018-11-06 23:05:00' AND region='gz' AND (cpuUsage=2.0 or cpuUsage=2.5)
。minimum_should_match 参数的意义在于设置 cpuUsage=2.0 和 cpuUsage=2.5 至少匹配的个数,系统默认 minimum_should_match 为0。3. NOT 条件 CURL 示例说明
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"bool": {"filter": [{"range": {"timestamp": {"format": "yyyy-MM-dd HH:mm:ss","gte": "2017-11-06 23:00:00","lt": "2018-11-06 23:05:00","time_zone":"+08:00"}}},{"terms": {"region": ["gz"]}}],"must_not": [{"terms": {"cpuUsage": ["2.0"]}}]}},"docvalue_fields": ["cpuUsage","timestamp"]}'
说明
此查询条件类似于
timestamp>=2017-11-06 23:00:00 AND timestamp<2018-11-06 23:05:00 AND region='gz' AND cpuUsage !=2.0
。返回结果集处理
1. From/Size
通过设置 From 和 Size 关键字可对查询结果进行分页。From 关键字定义了查询结果中第一条数据的偏移量,Size 关键字配置返回结果的最大条数。From 系统默认为 0,Size 系统默认为 10,From 与 Size 的总和系统默认不能超过 65536。若想要更大的返回结果,请参考本文的 scroll 关键字查询。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"from": 0,"size": 5,"query": {"bool": {"filter": {"range": {"timestamp": {"gte": "01/01/2018","lte": "03/01/2018","format": "dd/MM/yyyy","time_zone":"+08:00"}}},"must_not": {"terms": {"region": ["bj"]}}}}}'
2. Scroll
Scroll 可以理解为关系型数据库里的 cursor,因此 scroll 并不适合用来做实时搜索,而更适用于后台批处理任务。
可以把 scroll 分为初始化和遍历两个阶段,初始化时将所有符合搜索条件的搜索结果缓存起来,类似于快照,在遍历时,从这个快照里取数据。注意,在 Scroll 初始化后对 metric 的插入、删除、更新都不会影响遍历结果。
初始化阶段可通过 size 关键字指定返回结果集的大小,size 默认值为10,最大值为65536;初始化阶段可通过 query 关键字指定查询条件;初始化阶段可通过 docvalue_fields 关键字指定返回字段。初始化阶段和遍历阶段都返回
_scroll_id
字段,后续遍历可以通过指定前一次遍历返回的 _scroll_id
来进行新的遍历,直到返回结果为空;初始化和遍历两个阶段都可以通过指定 scroll 参数来设定遍历的上下文环境保留时间,过期后 _scroll_id
将变得无效。时间格式如下表所示:格式 | 描述 |
d | days |
h | hours |
m | minutes |
s | seconds |
ms | milliseconds |
micros | microseconds |
nanos | nanoseconds |
CURL 示例说明:
scroll 初始化:
curl -u root:le201909 -H 'Content-Type:application/json' -X POST 172.xx.xx.4:9201/ctsdb_test/_search?scroll=1m -d'{"size":5,"query": {"bool": {"filter": [{"terms": {"region": ["gz"]}}]}},"docvalue_fields": ["cpuUsage","region","timestamp"]}'
scroll 初始化返回:
{"_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAADrOFFm5YSEhnMjdnUWNPcndHS1k5Wjc3bHcAAAAAAAz_1RZiRkZTcGp4dFRXR18xMGtzSmhEUFJRAAAAAAAP5vQWOXFOR29lc0hROHFWMmFGTkVmSkxmZw==","took": 10641,"timed_out": false,"_shards": {"total": 3,"successful": 3,"skipped": 0,"failed": 0},"hits": {"total": {"value":1592072666,"relation":"eq"},"max_score": 0.65708643,"hits": [{"_index": "ctsdb_test@0_-1","_type": "doc","_id": "oyylNU0U65cZjByyt7sW_JmPPgAACY4Bh0","_score": 0.65708643,"_routing": "354d14eb","fields": {"region": ["gz"],"cpuUsage": ["2.0"],"timestamp": [1509909300000]}},{"_index": "ctsdb_test@0_-1","_type": "doc","_id": "oyykFN0yd1d9NDPfzjRdrJ8whQAACYsBqc","_score": 0.65708643,"_routing": "14dd3277","fields": {"region": ["gz"],"cpuUsage": ["1.8"],"timestamp": [1509908340000]}},{"_index": "ctsdb_test@0_-1","_type": "doc","_id": "oyylHLp9jl_3sF4N2rnh67h4SgAABHIBso","_score": 0.65708643,"_routing": "1cba7d8e","fields": {"region": ["gz"],"cpuUsage": ["2.5"],"timestamp": [1509909720000]}},{"_index": "ctsdb_test@0_-1","_type": "doc","_id": "oyylH2JsOKnFHGUinQ7jM-ZwkgAAAvcBso","_score": 0.65708643,"_routing": "1f626c38","fields": {"region": ["gz"],"cpuUsage": ["2.1"],"timestamp": [1509909720000]}},{"_index": "ctsdb_test@0_-1","_type": "doc","_id": "oyylHLp9jl_3sF4N2rnh67h4SgAABGsBso","_score": 0.65708643,"_routing": "1cba7d8e","fields": {"region": ["gz"],"cpuUsage": ["2.0"],"timestamp": [1509909720000]}}]}}
scroll 遍历:
curl -u root:le201909 -H 'Content-Type:application/json' -X POST 172.xx.xx.4:9201/_search/scroll -d'{"scroll" : "1m","scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAADrOFFm5YSEhnMjdnUWNPcndHS1k5Wjc3bHcAAAAAAAz_1RZiRkZTcGp4dFRXR18xMGtzSmhEUFJRAAAAAAAP5vQWOXFOR29lc0hROHFWMmFGTkVmSkxmZw=="}'
说明
此请求中的 scroll_id 是 scroll 初始化返回的
_scroll_id
值。而下一次遍历则需要将 scroll_id 参数调整为上一次遍历返回的 _scroll_id
值,即每次请求中的 scroll_id 参数是上一次请求返回的 _scroll_id
值,直到返回结果为空,遍历结束。两次遍历返回的
_scroll_id
值可能相同,无法利用 _scroll_id
进行指定页跳转。3. Sort
Sort 关键字主要用于对查询结果进行排序。排序方式有 asc 和 desc 两种,CTSDB 对于用户自定义字段的默认排序方式为 asc。排序模式有 min、max、sum、avg、median。其中 sum、avg、median 只适用于类型为 array 并存储数字的字段。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"bool": {"must": {"range": {"timestamp": {"gte": "01/01/2018","lte": "03/01/2018","format": "MM/dd/yyyy","time_zone":"+08:00"}}}}},"sort": [{"cpuUsage": {"order": "asc","mode": "min"}},{"timestamp": {"order": "asc"}},"diskUsage"]}'
4. docvalue_fields
docvalue_fields 关键词指定需要返回的字段名称,需要以数组的形式指定。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"terms": {"region": ["sh", "bj"]}},"docvalue_fields": ["timestamp", "cpuUsage"]}'
聚合查询
agg 关键字主要用于构造聚合查询。用户可到返回的 aggregations 字段取聚合结果。聚合返回字段说明详见下表。如果只关注聚合结果,请在查询时设置 size 参数为0。
字段名称 | 描述 |
hits | 返回匹配的查询结果,其中 total 字段表示参与聚合的数据条数。里面的 hits 字段为数组结构,若无指定,则包含所有查询结果的前十条。hits 数组里的每个结果中包含 _index (查询涉及到的 子表),若查询中指定了 docvalue_fields,则会返回 fields字段具体指明每个字段的值。 |
took | 整个查询耗费的毫秒数。 |
_shards | 参与查询的分片数。其中 total 标识总分片数,successful 标识执行成功的数量,failed 标识执行失败的数量,skipped 标识跳过执行的分片数。 |
timed_out | 查询是否超时,取值有 false 和 true。 |
aggregations | 聚合返回结果。 |
下面列举几种常用的聚合方式。
普通聚合
普通聚合需要指定聚合名称、聚合方式(常用的聚合方式有 min、max、avg、value_count、sum 等)和聚合所作用的字段。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"size":0,"query": {"terms": {"region": ["sh", "bj"]}},"aggs": {"myname": {"max": {"field": "cpuUsage"}}}}'
说明
上例中对字段 cpuUsage 做 max 聚合(用户也可指定 min、avg 等),返回结果取别名 myname(用户可任意指定该名称)。
返回结果:
{"took": 1,"timed_out": false,"_shards": {"total": 20,"successful": 20,"skipped": 0,"failed": 0},"hits": {"total": {"value":7,"relation":"eq"},"max_score": 0,"hits": []},"aggregations": {"myname": {"value": 4}}}
terms 聚合
terms 聚合主要用于查询某个字段的所有唯一值以及该值的个数,用户可指定唯一值返回时的排序规则,以及返回结果数,并且可对参与聚合的数据字段进行模糊或者精确匹配,详情请参考如下示例,使用 filter_path 参数可定制返回结果字段,使用方法请参考 批量查询数据。
CURL 示例说明:
请求:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'{"aggs": {"myname": {"terms": {"field":"region"}}}}'
返回:
{"aggregations": {"myname": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "sh","doc_count": 10},{"key": "Motor_sports","doc_count": 6},{"key": "gz","doc_count": 3},{"key": "bj","doc_count": 2},{"key": "cd","doc_count": 2},{"key": "Winter_sports","doc_count": 1},{"key": "water_sports","doc_count": 1}]}}}
说明
上例表示查看 ctsdb_test 的 region 字段中所有的唯一值及其出现的次数。分析返回字段 aggregations 中的 buckets 字段发现,region 字段共有7种类型的值,分别是 sh、Motor_sports、gz、bj、cd、Winter_sports、water_sports, 并且返回字段通过 doc_count 指明了每种值的个数。
用户可通过 size 字段指定返回的唯一值的个数,例如某字段名为 region,其唯一值有7个,可通过设置 size 字段为5指定只返回前5个,具体请参考示例。
请求:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'{"aggs": {"myname": {"terms": {"field":"region","size":5}}}}'
返回:
{"aggregations": {"myname": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 2,"buckets": [{"key": "sh","doc_count": 10},{"key": "Motor_sports","doc_count": 6},{"key": "gz","doc_count": 3},{"key": "bj","doc_count": 2},{"key": "cd","doc_count": 2}]}}}
用户可对返回结果进行排序,详情请参考如下示例。
1.请求(返回结果按照唯一值的个数降序排列):
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'{"aggs": {"myname": {"terms": {"field":"region","order":{"_count":"desc"}}}}}'
返回:
{"aggregations": {"myname": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "sh","doc_count": 10},{"key": "Motor_sports","doc_count": 6},{"key": "gz","doc_count": 3},{"key": "bj","doc_count": 2},{"key": "cd","doc_count": 2},{"key": "Winter_sports","doc_count": 1},{"key": "water_sports","doc_count": 1}]}}}
2.请求(返回结果按照唯一值的字母升序进行排列):
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'{"aggs": {"myname": {"terms": {"field":"region","order":{"_term":"asc"}}}}}'
返回:
{"aggregations": {"myname": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "Motor_sports","doc_count": 6},{"key": "Winter_sports","doc_count": 1},{"key": "bj","doc_count": 2},{"key": "cd","doc_count": 2},{"key": "gz","doc_count": 3},{"key": "sh","doc_count": 10},{"key": "water_sports","doc_count": 1}]}}}
用户可设置正则表达式对参与 terms 聚合的数据字段进行模糊匹配或者指定字段进行精确匹配,详情请参考示例。
模糊匹配示例:
请求(只针对 region 字段中有 sport 并且不是以
water_
开头的数据进行聚合并返回结果):curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'{"aggs": {"myname": {"terms": {"field":"region","include" : ".*sport.*","exclude" : "water_.*"}}}}'
返回:
{"aggregations": {"myname": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "Motor_sports","doc_count": 6},{"key": "Winter_sports","doc_count": 1}]}}}
精确匹配示例
请求(region_zone 指定值为"sh","bj","cd","gz"的 region 字段进行聚合,而 region_sports 指定值不为"sh","bj","cd","gz"的 region 字段进行聚合)
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'{"aggs" : {"region_zone" : {"terms" : {"field" : "region","include" : ["sh", "bj","cd","gz"]}},"region_sports" : {"terms" : {"field" : "region","exclude" : ["sh", "bj","cd","gz"]}}}}'
返回:
{"aggregations": {"region_sport": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "Motor_sports","doc_count": 6},{"key": "Winter_sports","doc_count": 1},{"key": "water_sports","doc_count": 1}]},"region_zone": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "sh","doc_count": 10},{"key": "gz","doc_count": 3},{"key": "bj","doc_count": 2},{"key": "cd","doc_count": 2}]}}}
Date Histogram 聚合
Date Histogram 主要对日期做直方图聚合。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"terms": {"region": ["sh", "bj"]}},"aggs": {"time_1h_agg": {"date_histogram": {"field": "timestamp","interval": "1h"},"aggs": {"avgCpuUsage": {"avg": {"field": "cpuUsage"}}}}}}'
返回:
{"took": 5,"timed_out": false,"_shards": {"total": 20,"successful": 20,"skipped": 0,"failed": 0},"hits": {"total": {"value":6,"relation":"eq"},"max_score": 0.074107975,"hits": [{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2QtGR5xcjRaw2ETf-","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2QtGR5xcjRaw2ETf_","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2Q5Xr5xcjRaw2ETgA","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2Q5Xr5xcjRaw2ETgB","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2RGfF5xcjRaw2ETgC","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2RGfF5xcjRaw2ETgD","_score": 0.074107975,"_routing": "sh"}]},"aggregations": {"time_1h_agg": {"buckets": [{"key_as_string": "1520222400","key": 1520222400000,"doc_count": 1,"avgCpuUsage": {"value": 2.5}},{"key_as_string": "1520226000","key": 1520226000000,"doc_count": 0,"avgCpuUsage": {"value": null}},{"key_as_string": "1520229600","key": 1520229600000,"doc_count": 0,"avgCpuUsage": {"value": null}},{"key_as_string": "1520233200","key": 1520233200000,"doc_count": 0,"avgCpuUsage": {"value": null}},{"key_as_string": "1520236800","key": 1520236800000,"doc_count": 0,"avgCpuUsage": {"value": null}},{"key_as_string": "1520240400","key": 1520240400000,"doc_count": 0,"avgCpuUsage": {"value": null}},{"key_as_string": "1520244000","key": 1520244000000,"doc_count": 0,"avgCpuUsage": {"value": null}},{"key_as_string": "1520247600","key": 1520247600000,"doc_count": 1,"avgCpuUsage": {"value": 2}},{"key_as_string": "1520251200","key": 1520251200000,"doc_count": 4,"avgCpuUsage": {"value": 2.25}}]}}}
说明
上例是对字段 cpuUsage 进行粒度为1小时的 date_histogram 聚合。返回结果中总的聚合名称为 time_1h_agg(用户可任意指定该名称),每个时间分段内的聚合名称为 avgCpuUsage(用户可任意指定该名称)。interval 字段可选的时间粒度有 year、quarter、 month、week、day、hour、minute、second,用户也可将时间粒度用时间单位来表示,例如1y代表1year,1h代表1hour。系统不支持小数粒度的时间单位,例如1.5h您需要转换为90min。
Percentiles 聚合
Percentiles 聚合即百分位聚合。百分位可以任意指定,系统默认的百分位是 1、5、25、50、75、95、99,可自行选择其他数值。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"terms": {"region": ["sh", "bj"]}},"aggs":{"myname":{"percentiles":{"field": "cpuUsage","percents": [1,25,50,70,99]}}}}'
返回:
{"took": 18,"timed_out": false,"_shards": {"total": 20,"successful": 20,"skipped": 0,"failed": 0},"hits": {"total": {"value":6,"relation":"eq"},"max_score": 0.074107975,"hits": [{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2QtGR5xcjRaw2ETf-","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2QtGR5xcjRaw2ETf_","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2Q5Xr5xcjRaw2ETgA","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2Q5Xr5xcjRaw2ETgB","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2RGfF5xcjRaw2ETgC","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2RGfF5xcjRaw2ETgD","_score": 0.074107975,"_routing": "sh"}]},"aggregations": {"myname": {"values": {"1.0": 2,"25.0": 2,"50.0": 2.25,"70.0": 2.5,"99.0": 2.5}}}}
说明
上例中是对字段 cpuUsage 做 percentiles 聚合,选取的百分位为 1,25,50,70,99。聚合返回结果的别名是 myname(用户可任意指定该名称)。
Cardinality 聚合
Cardinality 聚合主要用于统计去重后的数量。默认情况下,当聚合结果小于等于3000时,Cardinality 返回的结果是精确值,大于3000时,为近似值。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'{"query": {"terms": {"region": ["sh", "bj"]}},"aggs":{"myname":{"cardinality":{"field": "cpuUsage"}}}}'
返回:
{"took": 15,"timed_out": false,"_shards": {"total": 20,"successful": 20,"skipped": 0,"failed": 0},"hits": {"total": {"value":6,"relation":"eq"},"max_score": 0.074107975,"hits": [{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2QtGR5xcjRaw2ETf-","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2QtGR5xcjRaw2ETf_","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2Q5Xr5xcjRaw2ETgA","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2Q5Xr5xcjRaw2ETgB","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2RGfF5xcjRaw2ETgC","_score": 0.074107975,"_routing": "sh"},{"_index": "ctsdb_test@1520092800000_3","_type": "doc","_id": "AWH2RGfF5xcjRaw2ETgD","_score": 0.074107975,"_routing": "sh"}]},"aggregations": {"myname": {"value": 2}}}
说明
上例的聚合是对字段 cpuUsage 做 cardinality 聚合,返回的聚合结果取别名 myname(用户可任意指定该名称)。