Elasticsearch基本聚合查询

Elasticsearch基本聚合查询

es聚合查询

ES提供了强大的聚合分析功能,按照操作上细化,可以主要分为四种,如下表所示:

聚合方式        解释  
Bucket      Aggregation     一些满足特定条件的文档的集合  
Metric      Aggregation     一些数学计算,可以对文档字段统计分析  
Pipeline    Aggregation     对其他的聚合结果进行二次聚合  
Metrix      Aggregation     支持对多个字段的操作并提供一个结果矩阵  

这些只是理论意义上的细化,在实际的应用过程中,我们并没有说针对那种场景使用那种聚合分析。都是为了满足我们的业务,在实现的过程中同时会使用到多种聚合的方式。

一、四种聚合方式

1.1 Bucket(分桶)

分桶就是将具有某一类共同特征的数据归为一类,然后求其总数,例如: 男女、公司同一工作岗位的员工、商品高中低档等。在对数据分桶后还可以进一步分桶,例如:0~ 20岁男性、21~50岁男性、50岁以上男性;同一工作岗位男性、女性;高档商品好评、中评、差评的商品。

1.2、Metric(计算)

计算具有一类特征的数据的统计值,例如平均值、最大值、最小值等。

1.3、Pipeline(管道)

pipeline与Linux操作系统中的管道操作(将上一步操作的结果作为下一步操作的数据源)类似。即将上一次聚合操作的结果作为下一次聚合操作的数据源。

1.4、Metrix(矩阵)

矩阵就是同时可以支持多值的输出,例如对分桶的数据同时求平均、最大、最小值;

二、具体的案例

首先在ES中插入一批的测试数据,在插入测试数据之前先定义mapping。

2.1、mapping的定义

PUT employee
{
  "mappings": {
    "properties": {
      "id": {
        "type": "integer"
      },
      "name": {
        "type": "keyword"
      },
      "job": {
        "type": "keyword"
      },
      "age": {
        "type": "integer"
      },
      "gender": {
        "type": "keyword"
      }
    }
  }
}

2.2、插入数据

PUT employee/_bulk
{"index": {"_id": 1}}
{"id": 1, "name": "Bob", "job": "java", "age": 21, "sal": 8000, "gender": "male"}
{"index": {"_id": 2}}
{"id": 2, "name": "Rod", "job": "html", "age": 31, "sal": 18000, "gender": "female"}
{"index": {"_id": 3}}
{"id": 3, "name": "Gaving", "job": "java", "age": 24, "sal": 12000, "gender": "male"}
{"index": {"_id": 4}}
{"id": 4, "name": "King", "job": "dba", "age": 26, "sal": 15000, "gender": "female"}
{"index": {"_id": 5}}
{"id": 5, "name": "Jonhson", "job": "dba", "age": 29, "sal": 16000, "gender": "male"}
{"index": {"_id": 6}}
{"id": 6, "name": "Douge", "job": "java", "age": 41, "sal": 20000, "gender": "female"}
{"index": {"_id": 7}}
{"id": 7, "name": "cutting", "job": "dba", "age": 27, "sal": 7000, "gender": "male"}
{"index": {"_id": 8}}
{"id": 8, "name": "Bona", "job": "html", "age": 22, "sal": 14000, "gender": "female"}
{"index": {"_id": 9}}
{"id": 9, "name": "Shyon", "job": "dba", "age": 20, "sal": 19000, "gender": "female"}
{"index": {"_id": 10}}
{"id": 10, "name": "James", "job": "html", "age": 18, "sal": 22000, "gender": "male"}
{"index": {"_id": 11}}
{"id": 11, "name": "Golsling", "job": "java", "age": 32, "sal": 23000, "gender": "female"}
{"index": {"_id": 12}}
{"id": 12, "name": "Lily", "job": "java", "age": 24, "sal": 2000, "gender": "male"}
{"index": {"_id": 13}}
{"id": 13, "name": "Jack", "job": "html", "age": 23, "sal": 3000, "gender": "female"}
{"index": {"_id": 14}}
{"id": 14, "name": "Rose", "job": "java", "age": 36, "sal": 6000, "gender": "female"}
{"index": {"_id": 15}}
{"id": 15, "name": "Will", "job": "dba", "age": 38, "sal": 4500, "gender": "male"}
{"index": {"_id": 16}}
{"id": 16, "name": "smith", "job": "java", "age": 32, "sal": 23000, "gender": "male"}

数据说明:插入的数据为员工信息,name是员工的姓名,job是员工的工种,age为员工的年龄,sal为员工的薪水,gender为员工的性别。

2.3、聚合查询

查询工种的数量

GET employee/_search
{
  "size": 0, 
  "aggs": {
    "job_category_count": {
      "cardinality": {
        "field": "job"
      }
    }
  }
}

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 16,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "job_category_count": {
            "value": 3
        }
    }
}

查询不同工种分桶信息

{
  "size": 0, 
  "aggs": {
    "job_category_count": {
      "terms": {
        "field": "job"
      }
    }
  }
}

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 16,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "job_category_count": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "java",
                    "doc_count": 7
                },
                {
                    "key": "dba",
                    "doc_count": 5
                },
                {
                    "key": "html",
                    "doc_count": 4
                }
            ]
        }
    }
}

查询不同工种的员工的数量,并查询每个工种最大年龄的员工信息。

GET employee/_search
{
    "size": 0,
    "aggs": {
        "job_analysis": {
            "terms": {
                "field": "job"
            },
            "aggs": {
                "age_top_1": {
                    "top_hits": {
                        "size": 1,
                        "sort": [
                            {
                                "age": {
                                    "order": "desc"
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
}

{
    "took": 75,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 16,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "job_analysis": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "java",
                    "doc_count": 7,
                    "age_top_1": {
                        "hits": {
                            "total": {
                                "value": 7,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "employee",
                                    "_type": "_doc",
                                    "_id": "6",
                                    "_score": null,
                                    "_source": {
                                        "id": 6,
                                        "name": "Douge",
                                        "job": "java",
                                        "age": 41,
                                        "sal": 20000,
                                        "gender": "female"
                                    },
                                    "sort": [
                                        41
                                    ]
                                }
                            ]
                        }
                    }
                },
                {
                    "key": "dba",
                    "doc_count": 5,
                    "age_top_1": {
                        "hits": {
                            "total": {
                                "value": 5,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "employee",
                                    "_type": "_doc",
                                    "_id": "15",
                                    "_score": null,
                                    "_source": {
                                        "id": 15,
                                        "name": "Will",
                                        "job": "dba",
                                        "age": 38,
                                        "sal": 4500,
                                        "gender": "male"
                                    },
                                    "sort": [
                                        38
                                    ]
                                }
                            ]
                        }
                    }
                },
                {
                    "key": "html",
                    "doc_count": 4,
                    "age_top_1": {
                        "hits": {
                            "total": {
                                "value": 4,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "employee",
                                    "_type": "_doc",
                                    "_id": "2",
                                    "_score": null,
                                    "_source": {
                                        "id": 2,
                                        "name": "Rod",
                                        "job": "html",
                                        "age": 31,
                                        "sal": 18000,
                                        "gender": "female"
                                    },
                                    "sort": [
                                        31
                                    ]
                                }
                            ]
                        }
                    }
                }
            ]
        }
    }
}

查询工资范围在 0~5000, 5001~8000, 8001~12000, 12001~18000, 18001+ 员工的人数

GET employee/_search
{
    "size": 0,
    "aggs": {
        "sal_range_info": {
            "range": {
                "field": "sal",
                "ranges": [
                    {
                        "to": 5000
                    },
                    {
                        "from": 5001,
                        "to": 8000
                    },
                    {
                        "from": 8001,
                        "to": 12000
                    },
                    {
                        "from": 12001,
                        "to": 18000
                    },
                    {
                        "from": 18001
                    }
                ]
            }
        }
    }
}

{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 16,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "sal_range_info": {
            "buckets": [
                {
                    "key": "*-5000.0",
                    "to": 5000.0,
                    "doc_count": 3
                },
                {
                    "key": "5001.0-8000.0",
                    "from": 5001.0,
                    "to": 8000.0,
                    "doc_count": 2
                },
                {
                    "key": "8001.0-12000.0",
                    "from": 8001.0,
                    "to": 12000.0,
                    "doc_count": 0
                },
                {
                    "key": "12001.0-18000.0",
                    "from": 12001.0,
                    "to": 18000.0,
                    "doc_count": 3
                },
                {
                    "key": "18001.0-*",
                    "from": 18001.0,
                    "doc_count": 5
                }
            ]
        }
    }
}

以每5000为一个区间,查询工资在对应范围内的员工的数量

GET employee/_search
{
    "size": 0,
    "aggs": {
        "sal_histogram": {
            "histogram": {
                "field": "sal",
                "interval": 5000,
                "extended_bounds": {
                    "min": 0,
                    "max": 25000
                }
            }
        }
    }
}

{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 16,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "sal_histogram": {
            "buckets": [
                {
                    "key": 0.0,
                    "doc_count": 3
                },
                {
                    "key": 5000.0,
                    "doc_count": 3
                },
                {
                    "key": 10000.0,
                    "doc_count": 2
                },
                {
                    "key": 15000.0,
                    "doc_count": 4
                },
                {
                    "key": 20000.0,
                    "doc_count": 4
                },
                {
                    "key": 25000.0,
                    "doc_count": 0
                }
            ]
        }
    }
}

查询每个工种的数量,以及不同工种的工资统计信息

GET employee/_search
{
  "size": 0, 
  "aggs": {
    "job_and_salary_info": {
      "terms": {
        "field": "job"
      },
      "aggs": {
        "sal_info": {
          "stats": {
            "field": "sal"
          }
        }
      }
    }
  }
}

{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 16,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "job_and_salary_info": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "java",
                    "doc_count": 7,
                    "sal_info": {
                        "count": 7,
                        "min": 2000.0,
                        "max": 23000.0,
                        "avg": 13428.57142857143,
                        "sum": 94000.0
                    }
                },
                {
                    "key": "dba",
                    "doc_count": 5,
                    "sal_info": {
                        "count": 5,
                        "min": 4500.0,
                        "max": 19000.0,
                        "avg": 12300.0,
                        "sum": 61500.0
                    }
                },
                {
                    "key": "html",
                    "doc_count": 4,
                    "sal_info": {
                        "count": 4,
                        "min": 3000.0,
                        "max": 22000.0,
                        "avg": 14250.0,
                        "sum": 57000.0
                    }
                }
            ]
        }
    }
}

不同工种下男女员工的数量,以及男女员工的薪资信息

GET employee/_search
{
    "size": 0,
    "aggs": {
        "job_gender_sal_info": {
            "terms": {
                "field": "job"
            },
            "aggs": {
                "gender_info": {
                    "terms": {
                        "field": "gender"
                    },
                    "aggs": {
                        "sal_info": {
                            "stats": {
                                "field": "sal"
                            }
                        }
                    }
                }
            }
        }
    }
}

{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 16,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "job_gender_sal_info": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "java",
                    "doc_count": 7,
                    "gender_info": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "male",
                                "doc_count": 4,
                                "sal_info": {
                                    "count": 4,
                                    "min": 2000.0,
                                    "max": 23000.0,
                                    "avg": 11250.0,
                                    "sum": 45000.0
                                }
                            },
                            {
                                "key": "female",
                                "doc_count": 3,
                                "sal_info": {
                                    "count": 3,
                                    "min": 6000.0,
                                    "max": 23000.0,
                                    "avg": 16333.333333333334,
                                    "sum": 49000.0
                                }
                            }
                        ]
                    }
                },
                {
                    "key": "dba",
                    "doc_count": 5,
                    "gender_info": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "male",
                                "doc_count": 3,
                                "sal_info": {
                                    "count": 3,
                                    "min": 4500.0,
                                    "max": 16000.0,
                                    "avg": 9166.666666666666,
                                    "sum": 27500.0
                                }
                            },
                            {
                                "key": "female",
                                "doc_count": 2,
                                "sal_info": {
                                    "count": 2,
                                    "min": 15000.0,
                                    "max": 19000.0,
                                    "avg": 17000.0,
                                    "sum": 34000.0
                                }
                            }
                        ]
                    }
                },
                {
                    "key": "html",
                    "doc_count": 4,
                    "gender_info": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "female",
                                "doc_count": 3,
                                "sal_info": {
                                    "count": 3,
                                    "min": 3000.0,
                                    "max": 18000.0,
                                    "avg": 11666.666666666666,
                                    "sum": 35000.0
                                }
                            },
                            {
                                "key": "male",
                                "doc_count": 1,
                                "sal_info": {
                                    "count": 1,
                                    "min": 22000.0,
                                    "max": 22000.0,
                                    "avg": 22000.0,
                                    "sum": 22000.0
                                }
                            }
                        ]
                    }
                }
            ]
        }
    }
}

查询平均工资最低的部门的平均工资,以及最低工资

GET employee/_search
{
    "size": 0,
    "aggs": {
        "jobs": {
            "terms": {
                "field": "job"
            },
            "aggs": {
                "sal_info": {
                    "avg": {
                        "field": "sal"
                    }
                }
            }
        },
        "min_avg_sal": {
            "max_bucket": {
                "buckets_path": "jobs>sal_info"
            }
        }
    }
}

{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 16,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "jobs": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "java",
                    "doc_count": 7,
                    "sal_info": {
                        "value": 13428.57142857143
                    }
                },
                {
                    "key": "dba",
                    "doc_count": 5,
                    "sal_info": {
                        "value": 12300.0
                    }
                },
                {
                    "key": "html",
                    "doc_count": 4,
                    "sal_info": {
                        "value": 14250.0
                    }
                }
            ]
        },
        "min_avg_sal": {
            "value": 14250.0,
            "keys": [
                "html"
            ]
        }
    }
}

三、ES自带航空数据案例

查询到达各目的地的航班的数量

GET kibana_sample_data_flights/_search
{
    "size": 0,
    "aggs": {
        "dest_info": {
            "terms": {
                "field": "DestCountry"
            }
        }
    }
}

查询到达各航班的的数量,以及票价的最大值,平均值

GET kibana_sample_data_flights/_search
{
  "size": 0, 
  "aggs": {
    "dest_info": {
      "terms": {
        "field": "DestCountry"
      },
      "aggs": {
        "max_ticket_price": {
          "max": {
            "field": "AvgTicketPrice"
          }
        },
        "avg_ticket_price": {
          "avg": {
            "field": "AvgTicketPrice"
          }
        }
      }
    }
  }
}

查询到达各航班的的数量,以及票价的聚合信息以及天气的基本信息

GET kibana_sample_data_flights/_search
{
  "size": 0, 
  "aggs": {
    "dest_info": {
      "terms": {
        "field": "DestCountry"
      },
      "aggs": {
        "ticket_info": {
          "stats": {
            "field": "AvgTicketPrice"
          }
        },
        "weather_info": {
          "terms": {
            "field": "DestWeather"
          }
        }
      }
    }
  }
}

评论

暂无

添加新评论