[Elasticsearch Series, Part 4] Basic Data Operations on Tencent Cloud ES

Vicwan

Published on 2020-04-21 01:40:49

This article is included in the column 腾讯云迁云技术团队专栏 (Tencent Cloud Migration Technical Team).

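Note: The sample code in this tutorial was written for Tencent Cloud Elasticsearch 7.x and has not been verified against other versions. For other versions, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/docs-index_.html

1. Create an index

The operations below are run from Dev Tools in the Kibana console. The following example creates an index named product_info and specifies its number of shards, number of replicas, and field mappings.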

PUT /product_info
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "mappings": {
      "properties": {
        "productName": {"type": "text","analyzer": "ik_smart"},
        "annual_rate":{"type":"keyword"},
        "describe": {"type": "text","analyzer": "ik_smart"}
      }
  }
}

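Note: Elasticsearch 7.0.0 and later remove the type definition from mappings; earlier versions continue to support it. For details, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.3/removal-of-types.html#_what_are_mapping_types

If a type is specified in the mappings on Elasticsearch 7.0.0 or later, the request fails with the error "type": "mapper_parsing_exception".

On success, the following response is returned: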

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "product_info"
}
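The same request can also be issued outside Kibana. The following is a minimal Python sketch that only builds the HTTP request and does not send it. The endpoint `http://localhost:9200` and the helper name `build_create_index_request` are assumptions for illustration; a real Tencent Cloud ES cluster would use its own address and, typically, authentication.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with the address of your own cluster.
ES_URL = "http://localhost:9200"


def build_create_index_request(index, shards, replicas, properties):
    """Build (but do not send) the PUT request that creates an index."""
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        # Typeless mapping: "properties" sits directly under "mappings" in 7.x.
        "mappings": {"properties": properties},
    }
    return urllib.request.Request(
        url=f"{ES_URL}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_index_request(
    "product_info",
    shards=5,
    replicas=1,
    properties={
        "productName": {"type": "text", "analyzer": "ik_smart"},
        "annual_rate": {"type": "keyword"},
        "describe": {"type": "text", "analyzer": "ik_smart"},
    },
)
print(request.get_method(), request.full_url)
```

Sending is left to the caller, for example with `urllib.request.urlopen(request)`.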

2. View the index mapping

GET /product_info/_mapping

The response is as follows:

{
  "product_info" : {
    "mappings" : {
      "dynamic_templates" : [
        {
          "message_full" : {
            "match" : "message_full",
            "mapping" : {
              "fields" : {
                "keyword" : {
                  "ignore_above" : 2048,
                  "type" : "keyword"
                }
              },
              "type" : "text"
            }
          }
        },
        {
          "message" : {
            "match" : "message",
            "mapping" : {
              "type" : "text"
            }
          }
        },
        {
          "strings" : {
            "match_mapping_type" : "string",
            "mapping" : {
              "type" : "keyword"
            }
          }
        }
      ],
      "properties" : {
        "annual_rate" : {
          "type" : "keyword"
        },
        "describe" : {
          "type" : "text",
          "analyzer" : "ik_smart"
        },
        "productName" : {
          "type" : "text",
          "analyzer" : "ik_smart"
        }
      }
    }
  }
}

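3. View the index settings

This request returns settings such as the refresh interval, the translog sync interval, and the number of shards and replicas.

GET /product_info/_settings

The response is as follows: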

{
  "product_info" : {
    "settings" : {
      "index" : {
        "refresh_interval" : "30s",
        "number_of_shards" : "5",
        "translog" : {
          "sync_interval" : "5s",
          "durability" : "async"
        },
        "provided_name" : "product_info",
        "max_result_window" : "65536",
        "creation_date" : "1584345951007",
        "unassigned" : {
          "node_left" : {
            "delayed_timeout" : "5m"
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "D4qOmxP7RPqEwvyYzdjKTg",
        "version" : {
          "created" : "7050199"
        }
      }
    }
  }
}
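Every value in the settings response is a string, including the numeric ones. The sketch below shows one way to decode two of them in Python. It assumes that creation_date is epoch milliseconds and that version.created encodes the version as major, minor, and revision in decimal positions (so 7050199 reads as 7.5.1); the second point is an assumption about Elasticsearch's internal version ID format, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def decode_index_settings(index_settings):
    """Decode creation time and version from the "index" settings object."""
    created_ms = int(index_settings["creation_date"])
    # Drop the millisecond part to keep the conversion exact.
    created_at = datetime.fromtimestamp(created_ms // 1000, tz=timezone.utc)

    # Assumed layout of the version id: MMmmrrbb (major, minor, revision, build).
    version_id = int(index_settings["version"]["created"])
    major = version_id // 1_000_000
    minor = (version_id // 10_000) % 100
    revision = (version_id // 100) % 100

    return {
        "created_at": created_at,
        "version": f"{major}.{minor}.{revision}",
        "shards": int(index_settings["number_of_shards"]),
        "replicas": int(index_settings["number_of_replicas"]),
    }


# Values copied from the response above.
decoded = decode_index_settings({
    "creation_date": "1584345951007",
    "number_of_shards": "5",
    "number_of_replicas": "1",
    "version": {"created": "7050199"},
})
print(decoded["created_at"].isoformat(), decoded["version"])
```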

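4. Create documents and bulk-insert data

Elasticsearch also provides the _bulk API for executing multiple operations in a single request. This matters because it completes many operations as quickly as possible with as few network round trips as possible. The official _bulk API documentation is at: https://www.elastic.co/guide/en/elasticsearch/reference/6.2/docs-bulk.html

In the Kibana console, run the following command to create the documents, inserting the data in bulk: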

POST /product_info/_bulk?pretty
{"index":{}}
{"productName":"大健康天天理财","annual_rate":"3.2200%","describe":"180天定期理财，最低20000起投，收益稳定，可以自助选择消息推送"}
{"index":{}}
{"productName":"西部通宝","annual_rate":"3.1100%","describe":"90天定投产品，最低10000起投，每天收益到账消息推送"}
{"index":{}}
{"productName":"安详畜牧产业","annual_rate":"3.3500%","describe":"270天定投产品，最低40000起投，每天收益立即到账消息推送"}
{"index":{}}
{"productName":"5G设备采购月月盈","annual_rate":"3.1200%","describe":"90天定投产品，最低12000起投，每天收益到账消息推送"}
{"index":{}}
{"productName":"新能源动力理财","annual rate":"3.0100%","describe":"30天定投产品推荐，最低8000起投，每天收益会消息推送"}
{"index":{}}
{"productName":"微贷赚","annual_rate":"2.7500%","describe":"热门短期产品，3天短期，无须任何手续费用，最低500起投，通过短信提示获取收益消息"}

If the response contains "errors" : false, all of the data was inserted successfully.
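The bulk body is newline-delimited JSON: one action line followed by one document line, with a trailing newline after the last line. A minimal Python sketch of building such a body and of collecting failed items from a response is shown below. The helper names are hypothetical, and the response shape (an "items" list whose entries carry an "error" object on failure) is what the _bulk API returns as far as I know.

```python
import json


def build_bulk_body(documents):
    """Return an NDJSON bulk body that indexes each document with an auto-generated ID."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {}}))            # action line
        lines.append(json.dumps(doc, ensure_ascii=False))  # document line
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def failed_items(bulk_response):
    """Collect (action, id, error) for every item that reports an error."""
    failures = []
    for item in bulk_response.get("items", []):
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result.get("_id"), result["error"]))
    return failures


body = build_bulk_body([
    {"productName": "西部通宝", "annual_rate": "3.1100%"},
    {"productName": "微贷赚", "annual_rate": "2.7500%"},
])
print(body, end="")
```

Checking `failed_items` as well as the top-level "errors" flag tells you which individual operations need to be retried.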

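5. Full-text search

In the Kibana console, run the following command to search for products whose description matches 每天收益到账消息推送. The match query analyzes the search text and ranks documents by how well their describe field matches the resulting terms, so documents that match only some of the terms are returned as well.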

GET /product_info/_search?pretty
{
  "query": {
    "match": {
      "describe": "每天收益到账消息推送"
    }
  }
}

On a successful search, the response is as follows:

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : 2.7453551,
    "hits" : [
      {
        "_index" : "product_info",
        "_type" : "_doc",
        "_id" : "O42F4nABqaO0HpggcFaF",
        "_score" : 2.7453551,
        "_source" : {
          "productName" : "西部通宝",
          "annual_rate" : "3.1100%",
          "describe" : "90天定投产品，最低10000起投，每天收益到账消息推送"
        }
      },
      {
        "_index" : "product_info",
        "_type" : "_doc",
        "_id" : "PY2F4nABqaO0HpggcFaF",
        "_score" : 2.7453551,
        "_source" : {
          "productName" : "5G设备采购月月盈",
          "annual_rate" : "3.1200%",
          "describe" : "90天定投产品，最低12000起投，每天收益到账消息推送"
        }
      },
      {
        "_index" : "product_info",
        "_type" : "_doc",
        "_id" : "PI2F4nABqaO0HpggcFaF",
        "_score" : 1.7260926,
        "_source" : {
          "productName" : "安详畜牧产业",
          "annual_rate" : "3.3500%",
          "describe" : "270天定投产品，最低40000起投，每天收益立即到账消息推送"
        }
      },
      {
        "_index" : "product_info",
        "_type" : "_doc",
        "_id" : "Po2F4nABqaO0HpggcFaF",
        "_score" : 1.1507283,
        "_source" : {
          "productName" : "新能源动力理财",
          "annual rate" : "3.0100%",
          "describe" : "30天定投产品推荐，最低8000起投，每天收益会消息推送"
        }
      },
      {
        "_index" : "product_info",
        "_type" : "_doc",
        "_id" : "Oo2F4nABqaO0HpggcFaF",
        "_score" : 0.5885149,
        "_source" : {
          "productName" : "大健康天天理财",
          "annual_rate" : "3.2200%",
          "describe" : "180天定期理财，最低20000起投，收益稳定，可以自助选择消息推送"
        }
      },
      {
        "_index" : "product_info",
        "_type" : "_doc",
        "_id" : "P42F4nABqaO0HpggcFaF",
        "_score" : 0.19024058,
        "_source" : {
          "productName" : "微贷赚",
          "annual_rate" : "2.7500%",
          "describe" : "热门短期产品，3天短期，无须任何手续费用，最低500起投，通过短信提示获取收益消息"
        }
      }
    ]
  }
}
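When consuming a response like this in code, note that in 7.x hits.total is an object with value and relation rather than a plain number. The following Python sketch (the function name is hypothetical, and the sample is trimmed from the response above) extracts the total and a list of product names with their scores.

```python
def summarize_hits(search_response):
    """Return (total, [(productName, score), ...]) from a search response."""
    hits = search_response["hits"]
    total = hits["total"]["value"]  # in 7.x "total" is {"value": ..., "relation": ...}
    rows = [
        (hit["_source"].get("productName"), hit["_score"])
        for hit in hits["hits"]
    ]
    return total, rows


# Trimmed copy of the response above (first two hits only).
response = {
    "hits": {
        "total": {"value": 6, "relation": "eq"},
        "max_score": 2.7453551,
        "hits": [
            {"_id": "O42F4nABqaO0HpggcFaF", "_score": 2.7453551,
             "_source": {"productName": "西部通宝"}},
            {"_id": "PY2F4nABqaO0HpggcFaF", "_score": 2.7453551,
             "_source": {"productName": "5G设备采购月月盈"}},
        ],
    }
}
total, rows = summarize_hits(response)
print(total, rows)
```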

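6. Search by condition

In the Kibana console, run the following command to search for products whose annual rate is between 3.0000% and 3.1300%. Because annual_rate is mapped as keyword, the bounds are compared as strings (lexicographically), not as numbers.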

GET /product_info/_search?pretty
{
  "query": {
    "range": {
      "annual_rate": {
        "gte": "3.0000%",
        "lte": "3.1300%"
      }
    }
  }
}

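On success, the response is as follows. The product 新能源动力理财 (3.0100%) does not appear because its line in the bulk request used the field name "annual rate" (with a space) instead of annual_rate, so it has no annual_rate value to match: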

{
  "took" : 20,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "product_info",
        "_type" : "_doc",
        "_id" : "m-Jp4nABSXi5kvvXIeNL",
        "_score" : 1.0,
        "_source" : {
          "productName" : "5G设备采购月月盈",
          "annual_rate" : "3.1200%",
          "describe" : "90天定投产品，最低12000起投，每天收益到账消息推送"
        }
      },
      {
        "_index" : "product_info",
        "_type" : "_doc",
        "_id" : "meJp4nABSXi5kvvXIeNL",
        "_score" : 1.0,
        "_source" : {
          "productName" : "西部通宝",
          "annual_rate" : "3.1100%",
          "describe" : "90天定投产品，最低10000起投，每天收益到账消息推送"
        }
      }
    ]
  }
}
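The string comparison used for keyword ranges works here only because every rate has the same width. The Python sketch below mimics it with plain string comparison (which matches term order for ASCII values) and contrasts it with a numeric comparison. The function names and the wider ranges are made up for illustration.

```python
def in_keyword_range(value, gte, lte):
    """Mimic a range query on a keyword field: lexicographic string comparison."""
    return gte <= value <= lte


def in_numeric_range(value, gte, lte):
    """Compare the same values as numbers after stripping the percent sign."""
    def to_number(text):
        return float(text.rstrip("%"))
    return to_number(gte) <= to_number(value) <= to_number(lte)


# Same-width values: both comparisons agree.
print(in_keyword_range("3.1100%", "3.0000%", "3.1300%"))   # True
print(in_numeric_range("3.1100%", "3.0000%", "3.1300%"))   # True

# Different widths: "3..." sorts after "20...", so the string range misses it.
print(in_keyword_range("3.1100%", "2.0000%", "20.0000%"))  # False
print(in_numeric_range("3.1100%", "2.0000%", "20.0000%"))  # True
```

If numeric range semantics are needed, one option is to store the rate in a numeric field type instead of keyword.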

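7. Update a document

Elasticsearch does not actually update a document in place. Instead, it deletes the old document and indexes a new one that immediately replaces it. Elasticsearch can also update multiple documents that match a query (like a SQL UPDATE-WHERE statement); for details, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/6.2/docs-update-by-query.html

The following example updates a document by ID: it changes the describe field to "280天定期理财，最低20000起投，收益稳定，可以自助选择消息推送" and adds a test field at the same time: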

POST /product_info/_update/heVx5nABSXi5kvvX-C_K?pretty
{
  "doc":{"productName":"大健康天天理财","annual_rate":"3.2200%","describe":"280天定期理财，最低20000起投，收益稳定，可以自助选择消息推送","test":111}
}

After a successful update, the response is as follows:

{
  "_index" : "product_info",
  "_type" : "_doc",
  "_id" : "heVx5nABSXi5kvvX-C_K",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}
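The effect of a partial update with "doc" can be pictured as a merge into the stored _source followed by a reindex under a new version. The Python sketch below models that behavior in memory only. The function names are hypothetical, and the "noop" branch reflects my understanding that an update which changes nothing is skipped by default.

```python
import copy


def merge_partial_doc(source, partial):
    """Merge a partial document into a copy of the stored source.

    Nested objects are merged recursively; all other values are replaced.
    """
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial_doc(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def apply_update(stored, partial):
    """Return the new stored state after a partial update."""
    merged = merge_partial_doc(stored["_source"], partial)
    if merged == stored["_source"]:
        # Nothing changed: the version stays the same.
        return {**stored, "result": "noop"}
    return {"_version": stored["_version"] + 1, "_source": merged, "result": "updated"}


stored = {
    "_version": 1,
    "_source": {
        "productName": "大健康天天理财",
        "annual_rate": "3.2200%",
        "describe": "180天定期理财，最低20000起投，收益稳定，可以自助选择消息推送",
    },
}
updated = apply_update(stored, {
    "describe": "280天定期理财，最低20000起投，收益稳定，可以自助选择消息推送",
    "test": 111,
})
print(updated["_version"], updated["result"], sorted(updated["_source"]))
```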

8. Delete a document

In the Kibana console, run the following command to delete a document by its ID:

DELETE /product_info/_doc/heVx5nABSXi5kvvX-C_K?pretty

After a successful deletion, the response is as follows:

{
  "_index" : "product_info",
  "_type" : "_doc",
  "_id" : "heVx5nABSXi5kvvX-C_K",
  "_version" : 3,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 1
}

9. Delete an index

In the Kibana console, run the following command to delete the product_info index:

DELETE /product_info?pretty

After a successful deletion, the response is as follows:

{
  "acknowledged" : true
}

10. List all indices

GET _cat/indices

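11. The Tencent Cloud ES default index template and how to adjust it

1) About the default template

An index template is a predefined template that is applied automatically when a new index is created. It mainly contains index settings, mappings, and the template priority. Tencent Cloud ES provides a default index template when a cluster is created. You can view it in Dev Tools in Kibana with the command GET _template/default@template. The default template is shown below with notes on some of its settings; you can adjust these settings as needed.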

{
  "default@template": {
    "order": 1, // template priority; a larger value means a higher priority
    "index_patterns": [ // indices that the template applies to
      "*"
    ],
    "settings": {
      "index": {
        "max_result_window": "65536", // maximum result window; a query whose window exceeds this size fails with a "Result window is too large" error, in which case this value must be increased
        "routing": {
          "allocation": {
            "include": {
              "temperature": "hot"
            }
          }
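        },
        "refresh_interval": "30s", // index refresh interval; indexed documents become searchable only after this interval. Lower it if you need near-real-time search, but too small a value hurts write performance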
        "unassigned": {
          "node_left": {
            "delayed_timeout": "5m"
          }
        },
        "translog": {
          "sync_interval": "5s", // translog sync-to-disk interval; too small a value hurts write performance
          "durability": "async" 
        },
        "number_of_replicas": "1" // number of replica shards
      }
    },
    "mappings": {
      "_default_": {
        "_all": {
          "enabled": false // recommended to keep disabled; the _all field combines all other fields into one large string, which uses more disk space and slows down writes
        },
        "dynamic_templates": [ // dynamic templates
          {
            "message_full": { // dynamically map a field named message_full as both text and keyword
              "match": "message_full",
              "mapping": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 2048
                  }
                }
              }
            }
          },
          {
            "message": { // dynamically map a field named message as text
              "match": "message",
              "mapping": {
                "type": "text"
              }
            }
          },
          {
            "strings": { // dynamically map string fields as keyword
              "match_mapping_type": "string",
              "mapping": {
                "type": "keyword"
              }
            }
          }
        ]
      }
    },
    "aliases": {}
  }
}
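The three dynamic templates above are checked in order, and the first one that matches a new field decides its mapping. The Python sketch below models that lookup. It is a simplification: field-name patterns are approximated with `fnmatchcase`, type detection covers only a few JSON types, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase

# The dynamic templates from the default template above, in the same order.
DYNAMIC_TEMPLATES = [
    {"message_full": {"match": "message_full", "mapping": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 2048}}}}},
    {"message": {"match": "message", "mapping": {"type": "text"}}},
    {"strings": {"match_mapping_type": "string", "mapping": {"type": "keyword"}}},
]


def detect_type(value):
    """Rough stand-in for the JSON type detection done for dynamic mapping."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    return None


def resolve_dynamic_mapping(field_name, value):
    """Return (template_name, mapping) of the first matching dynamic template."""
    detected = detect_type(value)
    for template in DYNAMIC_TEMPLATES:
        for name, rule in template.items():
            if "match" in rule and not fnmatchcase(field_name, rule["match"]):
                continue
            if "match_mapping_type" in rule and rule["match_mapping_type"] != detected:
                continue
            return name, rule["mapping"]
    return None, None  # no template applies; default dynamic mapping is used


print(resolve_dynamic_mapping("message", "hello"))
print(resolve_dynamic_mapping("annual rate", "3.0100%"))
print(resolve_dynamic_mapping("test", 111))
```

Under this model, a new string field such as "annual rate" falls through to the strings template and is mapped as keyword, while the numeric test field matches no template.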

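2) Adjusting the template

In Dev Tools in Kibana, you can define your own index template with the command PUT _template/my_template. To override settings from the default index template, set the order of your template to a value greater than the order of the default template.

Note: Index templates are applied only when an index is created, so adjusting a template does not affect existing indices.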

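a) Adjust the number of primary shards

In Elasticsearch 5.6.4 and 6.4.3, an index has 5 primary shards by default. For workloads with small data volumes and many indices, consider reducing the number of primary shards to ease the heap memory pressure caused by index metadata. You can adjust the number of primary shards with a template like the following: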

{
  "index_patterns": ["*"],
  "order": 2, // make sure the order value in the template is greater than 1
  "settings": {
    "index": {
      "number_of_shards": 1
    }
  }
}

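b) Adjust field types

The default template dynamically maps string fields to keyword so that not all text data is full-text indexed. Depending on your needs, you can map specific string fields to text so that they are full-text indexed: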

{
  "index_patterns": ["*"],
  "order": 2, // make sure the order value in the template is greater than 1
  "mappings": {
    "properties": {
      "field_name": {
        "type": "text"
      }
    }
  }
}

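c) Other scenarios

For example, if you want indexed documents to become searchable after 10s, and you want this to apply to all search-* indices, you can create a template like the following: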

{
  "index_patterns": ["search-*"],
  "order": 2, // make sure the order value in the template is greater than 1
  "settings": {
    "index": {
      "refresh_interval": "10s"
    }
  }
}
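How order resolves overlapping templates can be sketched as a merge: all templates whose patterns match the new index name are applied from the lowest order to the highest, so higher-order values win. The Python model below is an in-memory illustration only. The function names are hypothetical, pattern matching is approximated with `fnmatchcase`, and only a subset of the default template's settings is included.

```python
from fnmatch import fnmatchcase


def deep_merge(base, override):
    """Return a new dict where values from override win over base, recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def effective_settings(index_name, templates):
    """Merge the settings of all matching templates in ascending order."""
    matching = [
        t for t in templates
        if any(fnmatchcase(index_name, pattern) for pattern in t["index_patterns"])
    ]
    result = {}
    for template in sorted(matching, key=lambda t: t.get("order", 0)):
        result = deep_merge(result, template.get("settings", {}))
    return result


templates = [
    {"index_patterns": ["*"], "order": 1,  # subset of the default template
     "settings": {"index": {"refresh_interval": "30s", "number_of_replicas": "1"}}},
    {"index_patterns": ["search-*"], "order": 2,  # the custom template from c)
     "settings": {"index": {"refresh_interval": "10s"}}},
]

print(effective_settings("search-logs", templates))
print(effective_settings("product_info", templates))
```

In this model a search-* index keeps number_of_replicas from the default template and takes refresh_interval from the custom one, while other indices are unaffected.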

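Original content statement: This article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.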
