Elasticsearch Service Special Topic: Flexible Single-Field Support for a Variable Number of Vectors

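This article covers the essentials of nested search; see this document for more details.
With the nested type, ES lets a single field hold an array of vectors whose length can vary from document to document. This is very useful in practice; the video retrieval example below shows how.
Background
Suppose we need to run vector search over videos. The challenge is that each video yields a different number of images after frame extraction. A nested field handles this case.
For example, a video record has the fields id, title, content, and image. The image field stores the vector data of the images extracted from the video's frames; each video has n images, and n is not fixed.
Create the index
When creating the mappings, declare image as a nested field: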
// id is a keyword field; title and content are text fields
// image.num is the image number, for example the frame index (optional)
// image.emb is the embedding of one image; conceptually image = [image1_emb, image2_emb, image3_emb, ..., imagen_emb]

PUT /image_embeddings
{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "title": {
        "type": "text"
      },
      "content": {
        "type": "text"
      },
      "image": {
        "type": "nested",
        "properties": {
          "num": {"type": "keyword"},
          "emb": {
            "type": "dense_vector",
            "dims": 5,
            "index_options": {
              "type": "int8_hnsw"
            },
            "similarity": "cosine"
          }
        }
      }
    }
  }
}
Write data
Wrap the vectors of one video in an array, one nested object per vector, as follows:
POST /image_embeddings/_doc/1
{
  "id": "book_001",
  "title": "晴天",
  "content": "刮风这天，我试着握着你手",
  "image": [
    {"num": "0", "emb": [0.1,0.2,0.3,0.4,0.5]},
    {"num": "1", "emb": [0.6,0.7,0.8,0.9,1.0]},
    {"num": "2", "emb": [0.2,0.3,0.4,0.5,0.6]}
  ]
}

POST /image_embeddings/_doc/2
{
  "id": "book_002",
  "title": "一路向北",
  "content": "一路向北，我试着握着你手",
  "image": [
    {"num": "0", "emb": [0.1,0.2,0.3,0.4,0.5]},
    {"num": "1", "emb": [0.6,0.7,0.8,0.9,1.0]}
  ]
}

POST /image_embeddings/_doc/3
{
  "id": "book_003",
  "title": "双截棍",
  "content": "哼哼哈嘿",
  "image": [
    {"num": "0", "emb": [0.1,0.2,0.3,0.4,0.5]},
    {"num": "1", "emb": [0.6,0.7,0.8,0.9,1.0]},
    {"num": "2", "emb": [0.1,0.2,0.3,0.4,0.5]},
    {"num": "3", "emb": [0.6,0.7,0.8,0.9,1.0]},
    {"num": "4", "emb": [0.1,0.2,0.3,0.4,0.5]},
    {"num": "5", "emb": [0.6,0.7,0.8,0.9,1.0]}
  ]
}
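When documents are generated by a program, the request body shown above can be assembled from however many frame vectors a video has. The following is a minimal Python sketch, not part of the original tutorial: `build_video_doc` and `EMB_DIMS` are hypothetical names, and the only value taken from the mapping is `dims: 5`. It builds the JSON body only; sending it to the cluster is left out.

```python
import json

EMB_DIMS = 5  # assumption: must match "dims" in the image.emb mapping


def build_video_doc(video_id, title, content, frame_embeddings):
    """Build a document body whose `image` field holds one nested object per frame."""
    image = []
    for num, emb in enumerate(frame_embeddings):
        if len(emb) != EMB_DIMS:
            raise ValueError(f"frame {num}: expected {EMB_DIMS} dims, got {len(emb)}")
        # num is stored as a string because the mapping declares it as keyword
        image.append({"num": str(num), "emb": list(emb)})
    return {"id": video_id, "title": title, "content": content, "image": image}


doc = build_video_doc(
    "book_002",
    "一路向北",
    "一路向北，我试着握着你手",
    [[0.1, 0.2, 0.3, 0.4, 0.5], [0.6, 0.7, 0.8, 0.9, 1.0]],
)
print(json.dumps(doc, ensure_ascii=False, indent=2))
```

Checking the vector length before indexing is a design choice of this sketch: it surfaces a dimension mismatch on the client side instead of as a rejected write.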
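Run a vector search
The query is written the same way as the hybrid search described earlier. For a nested field, the highest score among a document's nested vectors is used as that document's score.
GET /image_embeddings/_search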
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "retriever": {
            "knn": {
              "field": "image.emb",
              "query_vector": [0.1, 0.2, 0.3,0.4,0.5],
              "k": 5,
              "num_candidates": 50
            }
          },
          "weight": 0.8
        },
        {
          "retriever": {
            "standard": {
              "query": {
                "match": {
                  "title": "晴天"
                }
              }
            }
          },
          "weight": 0.2
        }
      ],
      "rank_window_size": 50,
      "rank_constant": 20        
    }
  }
}
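To make the "max" rule concrete, the sketch below scores the three frame vectors of document 1 against the query vector and keeps the best one. It is an offline illustration in plain Python, not how Elasticsearch computes scores internally. It assumes the cosine-to-score mapping `_score = (1 + cosine) / 2`, ignores `int8_hnsw` quantization error and the approximate nature of HNSW, and uses hypothetical function names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nested_max_score(query, frames):
    """Return (best_score, best_frame_index) over a document's nested vectors."""
    # assumption: score = (1 + cosine) / 2, applied to each nested vector
    scores = [(1 + cosine(query, emb)) / 2 for emb in frames]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], best


query = [0.1, 0.2, 0.3, 0.4, 0.5]
doc1_frames = [
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [0.6, 0.7, 0.8, 0.9, 1.0],
    [0.2, 0.3, 0.4, 0.5, 0.6],
]
score, frame = nested_max_score(query, doc1_frames)
print(f"best frame: {frame}, score: {score:.4f}")
```

Frame 0 is identical to the query vector, so it gives the highest similarity and determines the document's vector score. In the query above, that score only decides the document's rank inside the knn retriever; the final ordering comes from the RRF fusion with the text match on title.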