[Introduction to Elasticsearch] A Quick Guide to Elasticsearch Search Queries: Bool Query

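Goal

This article introduces the Bool query, one of the Search queries that make up Elasticsearch's basic operations. The opening sections, which describe Elasticsearch and prepare the data, repeat the content of "[Introduction to Elasticsearch] A Quick Guide to Elasticsearch Search Queries: Basic Queries (match, match_all)", so feel free to skip them.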

    • CRUD
    • Search
        • query
            • Basic queries
            • Term queries
            • Bool queries
                • must
                • should
                • must_not
                • filter
        • aggregation

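What is Elasticsearch?

Elasticsearch is an open-source full-text search engine written in Java, built on Apache Lucene, and developed by Elastic.
It lets you quickly find, among the documents you have indexed, those that contain a target word.
You normally operate it through its RESTful API.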

Versions

Elasticsearch version: 7.6.2
Kibana version: 7.6.2
Note: the search queries are run from Kibana's Dev Tools for convenience.

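Environment setup

The environment was prepared as described in the following posts.
[Introduction to Elasticsearch] Environment Setup
[Introduction to Elasticsearch] Environment Setup (Windows)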

Preparation

We will prepare the sample data.

E-commerce data

GET kibana_sample_data_ecommerce/_search

If the response contains hits, the data is in place.

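Custom data

In Kibana, switch to Dev Tools and run the following requests to create an index containing five documents.
(Incidentally, the text of these documents comes from the official Elasticsearch documentation. It is worth reading if you have time.)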

PUT my_index/_doc/1
{
  "title": "Elasticsearch",
  "content": "Elasticsearch is the distributed search and analytics engine at the heart of the Elastic Stack. Logstash and Beats facilitate collecting, aggregating, and enriching your data and storing it in Elasticsearch. Kibana enables you to interactively explore, visualize, and share insights into your data and manage and monitor the stack. Elasticsearch is where the indexing, search, and analysis magic happens."
}

PUT my_index/_doc/2
{
  "title": "Kibana",
  "content": "Elasticsearch is the distributed search and analytics engine at the heart of the Elastic Stack. Logstash and Beats facilitate collecting, aggregating, and enriching your data and storing it in Elasticsearch. Kibana enables you to interactively explore, visualize, and share insights into your data and manage and monitor the stack. Elasticsearch is where the indexing, search, and analysis magic happens."
}

PUT my_index/_doc/3
{
  "title": "Elasticsearch",
  "content": "While not every problem is a search problem, Elasticsearch offers speed and flexibility to handle data in a wide variety of use cases:"
}

PUT my_index/_doc/4
{
  "title": "Kibana",
  "content": "While not every problem is a search problem, Elasticsearch offers speed and flexibility to handle data in a wide variety of use cases:"
}

PUT my_index/_doc/5
{
  "title": "Elastic Stack",
  "content": "While not every problem is a search problem, Elasticsearch offers speed and flexibility to handle data in a wide variety of use cases:"
}

Run the following query to confirm that the index was created.

GET my_index/_search
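
As an optional extra check, the count API reports how many documents the index holds. This is a small sketch that is not part of the original walkthrough; after the five PUT requests above, the response would be expected to report a count of 5 once the index has refreshed.

```console
# returns the number of documents in my_index instead of the documents themselves
GET my_index/_count
```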

Once that is confirmed, let's try the Bool query.

Query introduction

Bool query

A Bool query combines basic queries into a single compound query.

Basic syntax

GET my_index/_search
{
  "query": {
    "bool": {
      "must": [ <basic query>, <basic query>, ... ],
      "should": [ <basic query>, <basic query>, ... ],
      "must_not": [ <basic query>, <basic query>, ... ],
      "filter": [ <basic query>, <basic query>, ... ]
    }
  }
}

As shown above, the four clauses must, should, must_not, and filter can be combined freely inside the bool clause.
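
To make the combination concrete, here is a minimal sketch that uses all four clauses at once against the my_index sample documents. The search terms are my own choice for illustration and do not come from the official documentation.

```console
# must:     content has to mention "elasticsearch" (contributes to the score)
# should:   a title matching "kibana" is optional, but raises the score
# must_not: drop any document whose content mentions "problem"
# filter:   content has to mention "logstash" (no effect on the score)
GET my_index/_search
{
  "query": {
    "bool": {
      "must":     [ {"match": {"content": "elasticsearch"}} ],
      "should":   [ {"match": {"title": "kibana"}} ],
      "must_not": [ {"match": {"content": "problem"}} ],
      "filter":   [ {"match": {"content": "logstash"}} ]
    }
  }
}
```

With the five sample documents this should leave documents 1 and 2, and document 2 would be expected to rank first because it also satisfies the optional should clause.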

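must clause

The must clause specifies conditions that a document is required to satisfy.
If several basic queries are specified, all of them must match.

Example request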

GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"title": "Elasticsearch"}},
        {"match": {"content": "problem"}}
      ]
    }
  }
}

Response

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.7706487,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.7706487,
        "_source" : {
          "title" : "Elasticsearch",
          "content" : "While not every problem is a search problem, Elasticsearch offers speed and flexibility to handle data in a wide variety of use cases:"
        }
      }
    ]
  }
}

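The response shows that the single hit matches both Elasticsearch in title and problem in content.

should clause

When should is the only clause in the bool query and several basic queries are specified in it, a document matches if it satisfies any one of them.
In other words, it behaves like an OR search.

Example request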

GET my_index/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"title": "elasticsearch"}},
        {"match": {"content": "problem"}}
      ]
    }
  }
}

Response

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 1.7706487,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.7706487,
        "_source" : {
          "title" : "Elasticsearch",
          "content" : "While not every problem is a search problem, Elasticsearch offers speed and flexibility to handle data in a wide variety of use cases:"
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.9395274,
        "_source" : {
          "title" : "Elasticsearch",
          "content" : "Elasticsearch is the distributed search and analytics engine at the heart of the Elastic Stack. Logstash and Beats facilitate collecting, aggregating, and enriching your data and storing it in Elasticsearch. Kibana enables you to interactively explore, visualize, and share insights into your data and manage and monitor the stack. Elasticsearch is where the indexing, search, and analysis magic happens."
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.8311213,
        "_source" : {
          "title" : "Kibana",
          "content" : "While not every problem is a search problem, Elasticsearch offers speed and flexibility to handle data in a wide variety of use cases:"
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 0.8311213,
        "_source" : {
          "title" : "Elastic Stack",
          "content" : "While not every problem is a search problem, Elasticsearch offers speed and flexibility to handle data in a wide variety of use cases:"
        }
      }
    ]
  }
}

The response shows that every document matching either Elasticsearch in title or problem in content was retrieved.
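
One caveat is worth a sketch: the OR behaviour applies when should is the only clause. When a must or filter clause is present in the same bool query, the should clauses become optional by default and only influence the score. The request below is my own illustration using the sample index.

```console
# every sample document mentions "elasticsearch" in content, so must matches all five
# the should clause is optional here: it only adds to the score of matching titles
GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"content": "elasticsearch"}}
      ],
      "should": [
        {"match": {"title": "kibana"}}
      ]
    }
  }
}
```

All five documents should still be returned; the should clause does not remove the ones titled Elasticsearch or Elastic Stack.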

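In addition, the minimum_should_match parameter introduced in "[Introduction to Elasticsearch] A Quick Guide to Elasticsearch Search Queries: Basic Queries (match, match_all)" can be used here as well.
In a should clause it means "at least N of the specified conditions must match".

It can be used as follows.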

GET my_index/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"title": "elasticsearch"}},
        {"match": {"content": "problem"}},
        {"match": {"content": "data"}}
      ],
      "minimum_should_match": 2
    }
  }
}
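
minimum_should_match is also a way to make a should clause mandatory when it sits next to must. The following sketch, with terms chosen by me for illustration, reads as "content mentions elasticsearch AND the title matches kibana or stack".

```console
# must narrows nothing here (all five contents mention "elasticsearch");
# minimum_should_match: 1 requires at least one of the two title conditions
GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"content": "elasticsearch"}}
      ],
      "should": [
        {"match": {"title": "kibana"}},
        {"match": {"title": "stack"}}
      ],
      "minimum_should_match": 1
    }
  }
}
```

Against the sample data this would be expected to return documents 2, 4, and 5.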

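must_not clause

Documents that match any of the specified basic queries are excluded from the search results. In other words, it is the negation of the should clause.

Example request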

GET my_index/_search
{
  "query": {
    "bool": {
      "must_not": [
        {"match": {"title": "elasticsearch"}},
        {"match": {"content": "problem"}}
      ]
    }
  }
}

Response

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.0,
        "_source" : {
          "title" : "Kibana",
          "content" : "Elasticsearch is the distributed search and analytics engine at the heart of the Elastic Stack. Logstash and Beats facilitate collecting, aggregating, and enriching your data and storing it in Elasticsearch. Kibana enables you to interactively explore, visualize, and share insights into your data and manage and monitor the stack. Elasticsearch is where the indexing, search, and analysis magic happens."
        }
      }
    ]
  }
}

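The response shows that the only hit matches neither Elasticsearch in title nor problem in content. Set beside the result of the should request (documents 1, 3, 4, and 5), this hit (document 2) completes the five documents in the index.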
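
must_not is often paired with a positive clause. As a small sketch (my own example on the sample index), the request below asks for titles matching elasticsearch while excluding any document whose content mentions problem.

```console
# documents 1 and 3 have the title "Elasticsearch";
# document 3 mentions "problem" in content, so must_not removes it
GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"title": "elasticsearch"}}
      ],
      "must_not": [
        {"match": {"content": "problem"}}
      ]
    }
  }
}
```

Only document 1 should remain, and unlike the must_not-only request it should carry a non-zero score, because the must clause is scored.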

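filter clause

Finally, the filter clause.
It behaves a little differently from the clauses introduced so far.
The must and should clauses return a score based on relevance to the search conditions.
The filter clause only reports whether a document matches or not, and must_not likewise does not contribute to the score.
(Because of this difference, the two are classified as "query context" and "filter context".)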
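
The difference between the two contexts is easy to observe on the small sample index. The sketch below (my own example) places a match query inside filter; the same query inside must would produce relevance scores.

```console
# match query in filter context: documents are selected, but no relevance score is computed
GET my_index/_search
{
  "query": {
    "bool": {
      "filter": [
        {"match": {"title": "kibana"}}
      ]
    }
  }
}
```

Documents 2 and 4 should come back, each with a _score of 0.0, since the query contains no scoring clause.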

Example request

GET kibana_sample_data_ecommerce/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"products.product_name": "shirt"}}
      ],
      "filter": [
        {"range": 
          {
            "taxful_total_price": {
              "gte": 10,
              "lte": 20
            } 
          }
        }
      ]
    }
  }
}

Response

...
 {
        "_index" : "kibana_sample_data_ecommerce",
        "_type" : "_doc",
        "_id" : "3p4NgHMBFnhBnJdY3gCV",
        "_score" : 1.8671052,
        "_source" : {
          "category" : [
            "Men's Clothing"
          ],
          "currency" : "EUR",
          "customer_first_name" : "Sultan Al",
          "customer_full_name" : "Sultan Al Richards",
          "customer_gender" : "MALE",
          "customer_id" : 19,
          "customer_last_name" : "Richards",
          "customer_phone" : "",
          "day_of_week" : "Saturday",
          "day_of_week_i" : 5,
          "email" : "sultan al@richards-family.zzz",
          "manufacturer" : [
            "Elitelligence",
            "Low Tide Media"
          ],
          "order_date" : "2020-08-01T02:42:43+00:00",
          "order_id" : 581482,
          "products" : [
            {
              "base_price" : 7.99,
              "discount_percentage" : 0,
              "quantity" : 1,
              "manufacturer" : "Elitelligence",
              "tax_amount" : 0,
              "product_id" : 11389,
              "category" : "Men's Clothing",
              "sku" : "ZO0562105621",
              "taxless_price" : 7.99,
              "unit_discount_amount" : 0,
              "min_price" : 3.6,
              "_id" : "sold_product_581482_11389",
              "discount_amount" : 0,
              "created_on" : "2016-12-24T02:42:43+00:00",
              "product_name" : "Basic T-shirt - green",
              "price" : 7.99,
              "taxful_price" : 7.99,
              "base_unit_price" : 7.99
            },
            {
              "base_price" : 11.99,
              "discount_percentage" : 0,
              "quantity" : 1,
              "manufacturer" : "Low Tide Media",
              "tax_amount" : 0,
              "product_id" : 17390,
              "category" : "Men's Clothing",
              "sku" : "ZO0438604386",
              "taxless_price" : 11.99,
              "unit_discount_amount" : 0,
              "min_price" : 5.4,
              "_id" : "sold_product_581482_17390",
              "discount_amount" : 0,
              "created_on" : "2016-12-24T02:42:43+00:00",
              "product_name" : "Print T-shirt - multicoloured",
              "price" : 11.99,
              "taxful_price" : 11.99,
              "base_unit_price" : 11.99
            }
          ],
          "sku" : [
            "ZO0562105621",
            "ZO0438604386"
          ],
          "taxful_total_price" : 19.98,
          "taxless_total_price" : 19.98,
          "total_quantity" : 2,
          "total_unique_products" : 2,
          "type" : "order",
          "user" : "sultan",
          "geoip" : {
            "country_iso_code" : "AE",
            "location" : {
              "lon" : 54.4,
              "lat" : 24.5
            },
            "region_name" : "Abu Dhabi",
            "continent_name" : "Asia",
            "city_name" : "Abu Dhabi"
          }
        }
      }
...

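The above is a search result for "orders with a total between 10 and 20 euros that include a shirt". The filter narrows the orders to those whose taxful_total_price is between 10 and 20, and the must clause searches them for products whose products.product_name field matches shirt.

Using filter brings the benefit of the query cache.
When you want to narrow the scope of a search rather than rank results by score, putting the condition in filter can improve performance.
Elasticsearch can cache the results of filters, so a filter that has already been evaluated can return quickly on later searches.
Do give it a try.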
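
Conditions that narrow the scope without ranking, such as exact values and ranges, are natural candidates for filter, and several of them can be stacked. The sketch below adds a term filter next to the range filter. It assumes that the currency field of the sample data is mapped as keyword, which is worth checking with GET kibana_sample_data_ecommerce/_mapping before relying on it.

```console
# both filter conditions must hold; neither affects the score
# assumption: "currency" is a keyword field, so term matches the exact value "EUR"
GET kibana_sample_data_ecommerce/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"products.product_name": "shirt"}}
      ],
      "filter": [
        {"term": {"currency": "EUR"}},
        {"range": {"taxful_total_price": {"gte": 10, "lte": 20}}}
      ]
    }
  }
}
```

Because the filters do not score, the ranking is determined by the match on products.product_name alone.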

That concludes the explanation of the Bool query.

Recommended learning materials

    • はじめてのElasticsearch
    • はじめてのKibana