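Note:
When you create a Collection, you specify the vector index type (for example, HNSW) and the similarity metric. Vectors stored in the database are indexed using the specified index type. At search time, the database uses that index together with the selected similarity metric to match vectors, so that target vectors are returned quickly and efficiently. For details, see /collection/create.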
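To make the role of the similarity metric concrete, the following sketch ranks a few vectors with a brute-force scan. It assumes cosine similarity as the configured metric, which is an assumption (a collection may be created with a different metric), although the scores in the sample responses in this document are consistent with it. `cosine_similarity` and `top_k` are illustrative helper names, not part of the VectorDB API. An index such as HNSW exists to avoid this kind of full scan on large collections.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, docs, k):
    """Brute-force Top K: score every stored vector, then sort by score."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

# Rounded versions of the sample vectors used in the examples in this document.
docs = {
    "0001": [0.2123, 0.23, 0.213],
    "0002": [0.2123, 0.22, 0.213],
    "0003": [0.2123, 0.21, 0.213],
}

# Search with the vector of document 0001 as the query (assumed metric: cosine).
ranked = top_k(docs["0001"], docs, k=3)
for doc_id, score in ranked:
    print(doc_id, round(score, 6))
```

The document whose vector is the query itself ranks first with a score of about 1, and the remaining documents follow in order of decreasing similarity.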
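Main features
Tencent Cloud VectorDB covers a broad range of similarity search scenarios. You can choose a search method based on the characteristics of your stored data and your use case.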
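Database type | Supported similarity search methods |
Base Database | Specify a multi-dimensional vector and retrieve the Top K documents most similar to it. Specify an id (Document ID) and retrieve the Top K documents most similar to the vector of that id. With the Embedding feature, input raw text and retrieve the Top K documents most similar to that text. Specify an id, a multi-dimensional vector, or raw text together with a Filter expression on scalar fields to retrieve similar documents that also match the filter. Run batch similarity searches: input multiple vectors or multiple ids and retrieve the Top K results for each vector or id separately. |
AI Database | Input text and search for similar text content within a file specified by file id (DocumentSet ID) or file name. Specify multiple file ids (DocumentSet ID) or file names to search several files in a single batch request, up to 20 files. Filter file information with a Filter expression on scalar fields. Specify multiple file ids (DocumentSet ID) or file names together with a Filter expression on scalar fields to filter across multiple files. |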
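The Base Database search modes listed above differ mainly in which query field the request body carries. The sketch below builds such a body using the field names that appear in the request examples in this document (`vectors`, `documentIds`, `embeddingItems`, `filter`, `params.ef`, `retrieveVector`, `limit`). The helper `build_search_body` is hypothetical, and its rule that exactly one query field must be given is a simplification of this sketch rather than a documented constraint.

```python
def build_search_body(database, collection, *, vectors=None, document_ids=None,
                      embedding_items=None, filter_expr=None,
                      limit=3, ef=200, retrieve_vector=False):
    """Build a /document/search request body for one query mode (hypothetical helper)."""
    candidates = (
        ("vectors", vectors),                # search by raw vectors
        ("documentIds", document_ids),       # search by Document ID
        ("embeddingItems", embedding_items), # search by raw text (Embedding)
    )
    provided = [(key, value) for key, value in candidates if value is not None]
    if len(provided) != 1:
        # Simplification of this sketch: require exactly one query mode.
        raise ValueError("pass exactly one of vectors, document_ids, embedding_items")
    key, value = provided[0]
    search = {
        key: list(value),  # several entries make it a batch search
        "params": {"ef": ef},
        "retrieveVector": retrieve_vector,
        "limit": limit,
    }
    if filter_expr is not None:
        search["filter"] = filter_expr  # scalar-field Filter expression
    return {"database": database, "collection": collection, "search": search}

# Batch search by three Document IDs.
by_ids = build_search_body("db-test", "book-vector",
                           document_ids=["0001", "0002", "0003"],
                           retrieve_vector=True)

# Vector search combined with a scalar filter.
by_vector = build_search_body("db-test", "book-vector",
                              vectors=[[0.3123, 0.43, 0.213]],
                              filter_expr='bookName in ("三国演义","西游记")',
                              retrieve_vector=True)
```

Serializing either dictionary with `json.dumps` yields a payload of the same shape as the `-d` argument in the curl examples.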
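Examples
This section gives brief request examples for similarity search so that you can see how the feature works. For similarity search on a Base database, see the API documentation for /document/search. For an AI database, see /ai/documentSet/search.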
Similarity search by ID
curl -i -X POST \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer account=root&api_key=A5VOgsMpGWJhUI0WmUbY********************' \
  http://10.0.X.X:80/document/search \
  -d '{"database": "db-test","collection": "book-vector","search": {"documentIds": ["0001","0002","0003"],"params": {"ef": 200},"retrieveVector": true,"limit": 3}}'
On success, the following is returned:
{"code": 0,"msg": "operation success","documents": [[{"id": "0001","vector": [0.21230000257492066,0.23000000417232514,0.21299999952316285],"score": 1.0000001192092896,"bookName": "三国演义","author": "罗贯中","page": 21},{"id": "0002","vector": [0.21230000257492066,0.2199999988079071,0.21299999952316285],"score": 0.9997729063034058,"bookName": "西游记","page": 22,"author": "吴承恩"},{"id": "0003","vector": [0.21230000257492066,0.20999999344348908,0.21299999952316285],"score": 0.9990617036819458,"author": "曹雪芹","page": 23,"bookName": "红楼梦"}],[{"id": "0002","vector": [0.21230000257492066,0.2199999988079071,0.21299999952316285],"score": 1.000000238418579,"bookName": "西游记","page": 22,"author": "吴承恩"},{"id": "0001","vector": [0.21230000257492066,0.23000000417232514,0.21299999952316285],"score": 0.9997729063034058,"author": "罗贯中","bookName": "三国演义","page": 21},{"id": "0003","vector": [0.21230000257492066,0.20999999344348908,0.21299999952316285],"score": 0.9997580051422119,"author": "曹雪芹","bookName": "红楼梦","page": 23}],[{"id": "0003","vector": [0.21230000257492066,0.20999999344348908,0.21299999952316285],"score": 1.0,"bookName": "红楼梦","page": 23,"author": "曹雪芹"},{"id": "0002","vector": [0.21230000257492066,0.2199999988079071,0.21299999952316285],"score": 0.9997580051422119,"author": "吴承恩","bookName": "西游记","page": 22},{"id": "0001","vector": [0.21230000257492066,0.23000000417232514,0.21299999952316285],"score": 0.9990617036819458,"bookName": "三国演义","author": "罗贯中","page": 21}]]}
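The response to a batch search holds one result list per query inside `documents`. The sketch below pairs each query ID with its hits. It assumes that the result lists come back in the same order as the queried IDs, which matches the sample response above but is not stated explicitly here, and it treats any non-zero `code` as a failure, which is also an assumption. `hits_per_query` is an illustrative name, and the response text is a trimmed copy of the sample.

```python
import json

# Trimmed copy of the sample response: ids and scores only, first two queries.
response_text = """
{"code": 0, "msg": "operation success", "documents": [
  [{"id": "0001", "score": 1.0000001192092896},
   {"id": "0002", "score": 0.9997729063034058}],
  [{"id": "0002", "score": 1.000000238418579},
   {"id": "0001", "score": 0.9997729063034058}]
]}
"""

def hits_per_query(response, query_ids):
    """Map each queried ID to its list of (document id, score) hits."""
    if response["code"] != 0:
        # Assumption: a non-zero code signals an error described by "msg".
        raise RuntimeError(response["msg"])
    return {
        query_id: [(doc["id"], doc["score"]) for doc in hits]
        for query_id, hits in zip(query_ids, response["documents"])
    }

results = hits_per_query(json.loads(response_text), ["0001", "0002"])
for query_id, hits in results.items():
    print(query_id, "->", [doc_id for doc_id, _ in hits])
```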
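Similarity search by a specified vector
Note:
The following example cannot be copied and run as-is. As with creating a database, you must replace api_key=4jpv6gzQTpq1Ev6iz2DUgAbv**************** and 10.0.X.X with your actual values before running it on a CVM.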
curl -i -X POST \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer account=root&api_key=A5VOgsMpGWJhUI0WmUbY********************' \
  http://10.0.X.X:80/document/search \
  -d '{
    "database": "db-test",
    "collection": "book-vector",
    "search": {
      "vectors": [
        [0.3123, 0.43, 0.213]
      ],
      "params": {
        "ef": 200
      },
      "filter": "bookName in (\"三国演义\",\"西游记\")",
      "retrieveVector": true,
      "limit": 3
    }
  }'
The search returns the following:
{"code": 0,"msg": "operation success","documents": [[{"id": "0001","vector": [0.21230000257492066,0.23000000417232514,0.21299999952316285],"score": 0.9714228510856628,"page": 21,"author": "罗贯中","bookName": "三国演义"},{"id": "0002","vector": [0.21230000257492066,0.2199999988079071,0.21299999952316285],"score": 0.9668837785720825,"bookName": "西游记","author": "吴承恩","page": 22}]]}
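The `filter` value in the request above is a string that itself contains double quotes, so those quotes must be escaped once the string is embedded in JSON. The sketch below builds an `in` expression of the same shape and lets `json.dumps` handle the escaping. `in_filter` is a hypothetical helper, and rejecting values that contain double quotes is a simplification of this sketch, since quoting rules inside Filter expressions are not covered here.

```python
import json

def in_filter(field, values):
    """Build a Filter expression of the form: field in ("a","b")."""
    for value in values:
        if '"' in value:
            # Simplification: quoting inside Filter values is out of scope here.
            raise ValueError("values containing double quotes are not handled")
    quoted = ",".join(f'"{value}"' for value in values)
    return f"{field} in ({quoted})"

expr = in_filter("bookName", ["三国演义", "西游记"])

# json.dumps escapes the inner quotes; ensure_ascii=False keeps the titles readable.
encoded = json.dumps({"search": {"filter": expr}}, ensure_ascii=False)
print(encoded)
```

Building the payload from a dictionary in this way avoids hand-writing the backslash escapes that the curl form requires.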
Similarity search by text
curl -i -X POST \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer account=root&api_key=A5VOgsMpGWJhUI0WmUbY********************' \
  http://10.0.X.X:80/document/search \
  -d '{"database": "db-test","collection": "book-emb","search": {"embeddingItems": ["天下大势,分久必合,合久必分"],"limit": 3,"params": {"ef": 200},"retrieveVector": false,"outputFields": ["id","author","text","bookName"]}}'
The search returns the following:
{"code": 0,"msg": "operation success","documents": [[{"id": "0001","score": 0.9792740345001221,"author": "罗贯中","bookName": "三国演义","text": "话说天下大势,分久必合,合久必分。"},{"id": "0002","score": 0.7909859418869019,"text": "混沌未分天地乱,茫茫渺渺无人间。","bookName": "西游记","author": "吴承恩"},{"id": "0003","score": 0.6858994364738464,"author": "曹雪芹","bookName": "红楼梦","text": "甄士隐梦幻识通灵,贾雨村风尘怀闺秀。"}]]}
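For readers who call the API from a program instead of curl, the text search request can be sketched with Python's standard library as follows. The header layout and body mirror the curl examples in this document. `build_request` is an illustrative helper, and `YOUR_API_KEY` and `10.0.X.X` are placeholders that must be replaced with real values. The sketch only constructs the request and does not send it.

```python
import json
import urllib.request

def build_request(base_url, api_key, body, account="root"):
    """Build a POST request for /document/search (illustrative helper)."""
    return urllib.request.Request(
        url=f"{base_url}/document/search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer account={account}&api_key={api_key}",
        },
        method="POST",
    )

body = {
    "database": "db-test",
    "collection": "book-emb",
    "search": {
        "embeddingItems": ["天下大势,分久必合,合久必分"],
        "limit": 3,
        "params": {"ef": 200},
        "retrieveVector": False,
        "outputFields": ["id", "author", "text", "bookName"],
    },
}

# Placeholders: replace the address and API key with your actual values.
request = build_request("http://10.0.X.X:80", "YOUR_API_KEY", body)

# To send it from a machine that can reach the instance:
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```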