druid 为各种场景提供了丰富的查询类型. 查询由各种 JSON 属性组合而成, 不同类型的查询, JSON 属性不同, 下面主要介绍常用的查询类型.
- 1. Components
- Datasources
一个数据源等价于 druid 表. 此外, 一个查询也可以作为数据源, 提供类似于子查询的功能. 查询数据源目前只支持 GroupBy 查询
Table Data Source
最常用的类型
- {
- "queryType": "scan",
- "dataSource": {
- "type": "table",
- "name": "wikipedia3"
- },
- "resultFormat": "list",
- "columns": [
- "page",
- "countryName",
- "cityName",
- "countryIsoCode"
- ],
- "intervals": ["2016-06-27/2016-06-28"],
- "limit": 5
- }
- Union Data Source
联合数据源中的数据源必须拥有相同的 schema. 联合查询只能被发送给 broker/router 节点, 不支持直接发送到历史节点
- {
- "type": "union",
- "dataSources": ["<string_value1>", "<string_value2>", "<string_value3>", ... ]
- }
- Filters
- Selector filter
等价于 sql 的
- where countryIsoCode = 'US'
- {
- "queryType": "scan",
- "dataSource": {
- "type": "table",
- "name": "wikipedia3"
- },
- "filter": {
- "type": "selector",
- "dimension": "countryIsoCode",
- "value": "US"
- },
- "resultFormat": "list",
- "columns": [
- "page",
- "countryName",
- "cityName"
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ],
- "limit": 5
- }
- Column Comparison filter
等价于 sql 的
- where countryName = cityName
- {
- "queryType": "scan",
- "dataSource": {
- "type": "table",
- "name": "wikipedia3"
- },
- "filter": {
- "type": "columnComparison",
- "dimensions": [
- "countryName",
- "cityName"
- ]
- },
- "resultFormat": "list",
- "columns": [
- "page",
- "countryName",
- "cityName"
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ],
- "limit": 5
- }
- Regular expression filter
正则表达式, 支持标准的 java 正则表达式, 下面的查询表示 countryIsoCode 以 U 开头
- {
- "queryType": "scan",
- "dataSource": {
- "type": "table",
- "name": "wikipedia3"
- },
- "filter": {
- "type": "regex",
- "dimension": "countryIsoCode",
- "pattern": "^U"
- },
- "resultFormat": "list",
- "columns": [
- "page",
- "countryName",
- "cityName",
- "countryIsoCode"
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ],
- "limit": 5
- }
- Logical expression filters
支持 and or not, 下面的等价于
- where countryIsoCode = 'US' and cityName = 'New York'
- {
- "queryType": "scan",
- "dataSource": {
- "type": "table",
- "name": "wikipedia3"
- },
- "filter": {
- "type": "and",
- "fields": [
- {
- "type": "selector",
- "dimension": "countryIsoCode",
- "value": "US"
- },
- {
- "type": "selector",
- "dimension": "cityName",
- "value": "New York"
- }
- ]
- },
- "resultFormat": "list",
- "columns": [
- "page",
- "countryName",
- "cityName",
- "countryIsoCode"
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ],
- "limit": 5
- }
- JavaScript filter
- {
- "queryType": "scan",
- "dataSource": {
- "type": "table",
- "name": "wikipedia3"
- },
- "filter": {
- "type": "javascript",
- "dimension" : "countryIsoCode",
- "function" : "function(value) { return (value =='US'|| value =='CN') }"
- },
- "resultFormat": "list",
- "columns": [
- "page",
- "countryName",
- "cityName",
- "countryIsoCode"
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ],
- "limit": 5
- }
- Search filter
用于部分字符串匹配, 如下面的表示包含 foo, 并且对大小写不敏感
- {
- "filter": {
- "type": "search",
- "dimension": "product",
- "query": {
- "type": "insensitive_contains",
- "value": "foo"
- }
- }
- }
- In filter
等价于
- where countryIsoCode in ('US', 'CN')
- {
- "queryType": "scan",
- "dataSource": {
- "type": "table",
- "name": "wikipedia3"
- },
- "filter": {
- "type": "in",
- "dimension" : "countryIsoCode",
- "values": ["US", "CN"]
- },
- "resultFormat": "list",
- "columns": [
- "page",
- "countryName",
- "cityName",
- "countryIsoCode"
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ],
- "limit": 5
- }
- Like filter
等价于
- where countryIsoCode like '%U'
- {
- "queryType": "scan",
- "dataSource": {
- "type": "table",
- "name": "wikipedia3"
- },
- "filter": {
- "type": "like",
- "dimension" : "countryIsoCode",
- "pattern": "%U"
- },
- "resultFormat": "list",
- "columns": [
- "page",
- "countryName",
- "cityName",
- "countryIsoCode"
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ],
- "limit": 5
- }
- Bound filter
等价于
- "CN" < countryIsoCode < "US"
- {
- "queryType": "scan",
- "dataSource": {
- "type": "table",
- "name": "wikipedia3"
- },
- "filter": {
- "type": "bound",
- "dimension" : "countryIsoCode",
- "lower": "CN",
- "lowerStrict": true,
- "upper": "US" ,
- "ordering": "numeric",
- "upperStrict": true,
- "ordering": "lexicographic"
- },
- "resultFormat": "list",
- "columns": [
- "page",
- "countryName",
- "cityName",
- "countryIsoCode"
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ],
- "limit": 5
- }
- Aggregations
- Count aggregator
- select
- page,
- count(*) as num
- from wikipedia3
- where "__time" BETWEEN TIMESTAMP '2016-06-27 00:00:00' AND TIMESTAMP '2016-06-28 00:00:00'
- group by page
- order by num desc
- limit 5
- {
- "queryType": "topN",
- "dataSource": "wikipedia3",
- "dimension": "page",
- "threshold": 5,
- "metric": "num",
- "granularity": "all",
- "aggregations": [
- {
- "type": "count",
- "name": "num"
- }
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ]
- }
- Sum aggregators
- longSum,doubleSum,floatSum
- select
- page,
- sum(delta) as num
- from wikipedia3
- where "__time" BETWEEN TIMESTAMP '2016-06-27 00:00:00' AND TIMESTAMP '2016-06-28 00:00:00'
- group by page
- order by page asc
- limit 5
- {
- "queryType": "topN",
- "dataSource": "wikipedia3",
- "dimension": "page",
- "threshold": 5,
- "metric": "num",
- "granularity": "all",
- "aggregations": [
- {
- "type": "longSum",
- "name": "num",
- "fieldName" : "delta"
- }
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ]
- }
- Min / Max aggregators
doubleMin,doubleMax,floatMin,floatMax,longMin,longMax
- select
- page,
- max(delta) as num
- from wikipedia3
- where "__time" BETWEEN TIMESTAMP '2016-06-27 00:00:00' AND TIMESTAMP '2016-06-28 00:00:00'
- group by page
- order by page asc
- limit 5
- {
- "queryType": "topN",
- "dataSource": "wikipedia3",
- "dimension": "page",
- "threshold": 5,
- "metric": "num",
- "granularity": "all",
- "aggregations": [
- {
- "type": "longMax",
- "name": "num",
- "fieldName" : "delta"
- }
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ]
- }
- First / Last aggregator
不能在数据摄入的时候使用, 只能用于查询
Last: 最大时间戳对应的数据, 0 if no row exist;First 最小时间戳对应的数据, 0 if no row exist
- JavaScript aggregator
- Post Aggregations
对 Aggregations 的结果进行二次加工并输出, 最终的结果既包含 Aggregations 的结果也包含 Post Aggregations 的结果
2. Timeseries
统计一段时间内的汇总数据
- SELECT count(*) as num,
- sum(added)
- FROM wikipedia
- WHERE "__time" BETWEEN TIMESTAMP '2016-06-27 00:00:00' AND TIMESTAMP '2016-06-27 23:59:59'
- {
- "queryType": "timeseries",
- "dataSource": "wikipedia3",
- "granularity": "all",
- "aggregations": [
- { "type": "count", "name": "count" },
- { "type": "longSum", "name": "added", "fieldName": "added" }
- ],
- "intervals": [ "2016-06-27/2016-06-28" ]
- }
- 3. TopN
返回前 N 条数据, 并可以按照 metric 排序, 可以支持维度, 但只有一个
- SELECT
- page,
- sum(added) as num
- FROM wikipedia
- WHERE "__time" BETWEEN TIMESTAMP '2016-06-27 00:00:00' AND TIMESTAMP '2016-06-27 23:59:59'
- group by page
- order by num desc
- limit 5
- {
- "queryType": "topN",
- "dataSource": "wikipedia3",
- "dimension": "page",
- "threshold": 5,
- "metric": "added",
- "granularity": "all",
- "aggregations": [
- {
- "type": "doubleSum",
- "name": "added",
- "fieldName": "added"
- }
- ],
- "intervals": [ "2016-06-27/2016-06-28" ]
- }
- 4. GroupBy
能对指定的多个维度分组, 也支持对指定的维度排序, 也支持 limit, 但是性能比 TopN 和 Timeseries 要差很多
- SELECT
- page,
- countryName,
- sum(added) as num,
- sum(delta) as num2
- FROM wikipedia
- WHERE "__time" BETWEEN TIMESTAMP '2016-06-27 00:00:00' AND TIMESTAMP '2016-06-27 23:59:59'
- group by page,countryName
- order by num desc
- limit 5
- {
- "queryType": "groupBy",
- "dataSource": "wikipedia3",
- "granularity": "all",
- "dimensions": [
- "page",
- "countryName"
- ],
- "limitSpec": {
- "type": "default",
- "limit": 5,
- "columns": [
- {
- "dimension": "added",
- "direction": "descending",
- "dimensionOrder": {
- "type": "numeric"
- }
- }
- ]
- },
- "aggregations": [
- {
- "type": "longSum",
- "name": "added",
- "fieldName": "added"
- },
- {
- "type": "longSum",
- "name": "delta",
- "fieldName": "delta"
- }
- ],
- "intervals": [
- "2016-06-27/2016-06-28"
- ]
- }
- 5. Search
类似于 like 操作, 可以查询多个维度列, 不支持聚合
- SELECT
- page,
- countryName
- FROM wikipedia
- WHERE "__time" BETWEEN TIMESTAMP '2016-06-27 00:00:00' AND TIMESTAMP '2016-06-27 23:59:59'
- and page like '%C' or countryName like '%C'
- limit 5
- {
- "queryType": "search",
- "dataSource": "wikipedia3",
- "granularity": "all",
- "dimensions": [
- "page",
- "countryName"
- ],
- "query": {
- "type": "insensitive_contains",
- "value": "C"
- },
- "sort" : {
- "type": "lexicographic"
- },
- "limit": 5,
- "intervals": [
- "2016-06-27/2016-06-28"
- ]
- }
- 6. Select
查数据, 不支持聚合, 但支持分页, 排序
- SELECT
- *
- FROM wikipedia
- WHERE "__time" BETWEEN TIMESTAMP '2016-06-27 00:00:00' AND TIMESTAMP '2016-06-27 23:59:59'
- limit 0,5
- {
- "queryType": "select",
- "dataSource": "wikipedia3",
- "granularity": "all",
- "dimensions":[],
- "metrics":[],
- "pagingSpec":{"pagingIdentifiers": {}, "threshold":5},
- "intervals": [
- "2016-06-27/2016-06-28"
- ]
- }
- 7. Scan
类似于 Select, 但不支持分页, 但是如果没有分页需求, 推荐使用这个, 性能比 Select 好
- SELECT
- page,countryName
- FROM wikipedia
- WHERE "__time" BETWEEN TIMESTAMP '2016-06-27 00:00:00' AND TIMESTAMP '2016-06-27 23:59:59'
- limit 5
- {
- "queryType": "scan",
- "dataSource": "wikipedia3",
- "resultFormat": "list",
- "columns":["page","countryName"],
- "intervals": [
- "2016-06-27/2016-06-28"
- ],
- "batchSize":20480,
- "limit":5
- }
来源: https://www.cnblogs.com/xujieyang4j/p/9153443.html