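Project page: https://github.com/olivere/esdiff (the project has been archived, so newer Elasticsearch releases may not be supported; use it with care).
The esdiff tool iterates over two indices in Elasticsearch 5.x, 6.x, or 7.x and performs a diff between the documents in those indices.
It does so by scrolling over both indices. To get a stable sort order, it sorts by _id by default (_uid in ES 5.x).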
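The reason a stable sort order matters can be illustrated with a small merge-style comparison. The following is only a sketch of the general technique, not esdiff's actual implementation: the function name diff_sorted, the lowercase mode strings, and the content of document 2 are assumptions made for illustration, and the ids are compared as plain strings.

```python
from typing import Dict, Iterable, Iterator, Tuple

Doc = Tuple[str, Dict]  # (id, source); an illustrative shape, not esdiff's type


def diff_sorted(src: Iterable[Doc], dst: Iterable[Doc]) -> Iterator[Tuple[str, str]]:
    """Walk two id-sorted streams in lockstep and yield (mode, id) pairs."""
    src_it, dst_it = iter(src), iter(dst)
    s = next(src_it, None)
    d = next(dst_it, None)
    while s is not None or d is not None:
        if d is None or (s is not None and s[0] < d[0]):
            # The id exists only in the source.
            yield ("deleted", s[0])
            s = next(src_it, None)
        elif s is None or d[0] < s[0]:
            # The id exists only in the destination.
            yield ("created", d[0])
            d = next(dst_it, None)
        else:
            # Same id on both sides: compare the document bodies.
            yield ("unchanged" if s[1] == d[1] else "updated", s[0])
            s = next(src_it, None)
            d = next(dst_it, None)


# Sample data loosely modelled on the seed documents shown later in this
# article; the body of document 2 is a made-up placeholder.
SRC = [
    ("1", {"message": "Welcome to Golang", "user": "olivere"}),
    ("2", {"message": "placeholder"}),
    ("3", {"message": "Playing the piano is fun as well"}),
]
DST = [
    ("1", {"message": "Welcome to Golang", "user": "olivere"}),
    ("3", {"message": "Playing the guitar is fun as well"}),
    ("4", {"message": "Climbed that mountain", "user": "sandrae"}),
]

if __name__ == "__main__":
    for mode, doc_id in diff_sorted(SRC, DST):
        print(mode, doc_id)
```

Because both streams arrive in the same order, each side only needs to be read once and neither index has to be held in memory, which is what makes a scroll-based approach workable for large indices.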
You need Go 1.11 or later to compile. Note that the go install ...@latest form shown next requires Go 1.16 or later.
go install github.com/olivere/esdiff@latest
First, we need to set up the Elasticsearch clusters used for testing and then seed a few documents.
$ mkdir -p data
# Create an Elasticsearch 5.x cluster on http://localhost:19200
# Create an Elasticsearch 6.x cluster on http://localhost:29200
# Create an Elasticsearch 7.x cluster on http://localhost:39200
# Increase your docker memory limit (6.0GiB) in Docker App > Preferences > Advanced.
$ docker-compose up -d
Creating esdiff_elasticsearch5_1 ... done
Creating esdiff_elasticsearch6_1 ... done
Creating esdiff_elasticsearch7_1 ... done
# Check docker containers
$ docker-compose ps
Name Command State Ports
----------------------------------------------------------------------------------------------------
esdiff_elasticsearch5_1 /bin/bash bin/es-docker Up 0.0.0.0:19200->9200/tcp, 9300/tcp
esdiff_elasticsearch6_1 /usr/local/bin/docker-entr ... Up 0.0.0.0:29200->9200/tcp, 9300/tcp
esdiff_elasticsearch7_1 /usr/local/bin/docker-entr ... Up 0.0.0.0:39200->9200/tcp, 9300/tcp
# Check docker container logs
$ docker-compose logs -f elasticsearch5
Attaching to esdiff_elasticsearch5_1
elasticsearch5_1 | [2019-07-02T14:17:33,351][WARN ][o.e.b.JNANatives ] Unable to lock JVM Memory: error=12, reason=Cannot allocate memory
elasticsearch5_1 | [2019-07-02T14:17:33,355][WARN ][o.e.b.JNANatives ] This can result in part of the JVM being swapped out.
elasticsearch5_1 | [2019-07-02T14:17:33,355][WARN ][o.e.b.JNANatives ] Increase RLIMIT_MEMLOCK, soft limit: 83968000, hard limit: 83968000
elasticsearch5_1 | [2019-07-02T14:17:33,356][WARN ][o.e.b.JNANatives ] These can be adjusted by modifying /etc/security/limits.conf, for example:
elasticsearch5_1 | # allow user 'elasticsearch' mlockall
........
# Add some documents
$ ./seed/01.sh
# Compile
$ go build
Let's do a simple diff.
Comparing an index with itself on the same cluster should return only unchanged documents:
$ ./esdiff -u=true 'http://localhost:19200/index01/tweet' 'http://localhost:19200/index01/tweet'
Unchanged 1
Unchanged 2
Unchanged 3
The following example returns the differences between the indices in ES 5.x and ES 6.x:
$ ./esdiff -u=true 'http://localhost:19200/index01/tweet' 'http://localhost:29200/index01/_doc'
Unchanged 1
Deleted 2
Updated 3 {*diff.Document}.Source["message"]:
-: "Playing the piano is fun as well"
+: "Playing the guitar is fun as well"
Created 4 {*diff.Document}:
-: (*diff.Document)(nil)
+: &diff.Document{ID: "4", Source: map[string]interface {}{"message": "Climbed that mountain", "user": "sandrae"}}
ES 5.x and ES 7.x, with different documents:
$ ./esdiff -u=true 'http://localhost:19200/index01/tweet' 'http://localhost:39200/index01/_doc'
Unchanged 1
Deleted 2
Updated 3 {*diff.Document}.Source["message"]:
-: "Playing the piano is fun as well"
+: "Playing the flute, oh boy"
Created 5 {*diff.Document}:
-: (*diff.Document)(nil)
+: &diff.Document{ID: "5", Source: map[string]interface {}{"message": "Ran that marathon", "user": "sandrae"}}
Note that you can pass additional options to filter by the modes you are interested in. For example, to also see all unchanged documents but hide the deleted ones, use -u=true -d=false:
$ ./esdiff -u=true -d=false 'http://localhost:19200/index01/tweet' 'http://localhost:29200/index01/_doc'
Unchanged 1
Updated 3 {*diff.Document}.Source["message"]:
-: "Playing the piano is fun as well"
+: "Playing the guitar is fun as well"
Created 4 {*diff.Document}:
-: (*diff.Document)(nil)
+: &diff.Document{ID: "4", Source: map[string]interface {}{"message": "Climbed that mountain", "user": "sandrae"}}
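The effect of these flags can be pictured as a simple filter over the diff results. The sketch below is an illustration only: the function should_print and the mapping from flags to modes (-a to created, -c to updated, -d to deleted, -u to unchanged) are assumptions inferred from the help text and the sample output, and the defaults mirror the ones listed under -h.

```python
from typing import List, Tuple


def should_print(mode: str, added: bool = True, changed: bool = True,
                 deleted: bool = True, unchanged: bool = False) -> bool:
    """Return True if a result with the given mode passes the flag filter."""
    enabled = {
        "created": added,        # assumed to correspond to -a
        "updated": changed,      # assumed to correspond to -c
        "deleted": deleted,      # assumed to correspond to -d
        "unchanged": unchanged,  # assumed to correspond to -u
    }
    # Unknown modes are dropped rather than guessed at.
    return enabled.get(mode, False)


# Results of the ES 5.x vs. ES 6.x comparison shown above, as (mode, id).
RESULTS: List[Tuple[str, str]] = [
    ("unchanged", "1"),
    ("deleted", "2"),
    ("updated", "3"),
    ("created", "4"),
]

if __name__ == "__main__":
    # Equivalent in spirit to -u=true -d=false.
    for mode, doc_id in RESULTS:
        if should_print(mode, unchanged=True, deleted=False):
            print(mode, doc_id)
```

With unchanged=True and deleted=False, documents 1, 3, and 4 remain, which matches the output listed above.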
You can also use JSON as the output format. Combined with jq (and other jq-related tools such as jiq), this is quite powerful.
$ ./esdiff -o=json 'http://localhost:29200/index01/_doc' 'http://localhost:39200/index01/_doc' | jq 'select(.mode | contains("deleted"))'
{
"mode": "deleted",
"_id": "4",
"src": {
"_id": "4",
"_source": {
"message": "Climbed that mountain",
"user": "sandrae"
}
},
"dst": null
}
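If jq is not available, the same kind of post-processing can be done in a few lines of Python. This is a minimal sketch that assumes esdiff writes one JSON object per line with "mode" and "_id" keys, as the sample records in this article suggest; the function name summarize is made up for illustration, and the two sample lines are taken from different runs shown in this article.

```python
import json
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def summarize(lines: Iterable[str]) -> Tuple[Counter, Dict[str, List[str]]]:
    """Count records per mode and collect the document ids for each mode."""
    counts: Counter = Counter()
    ids_by_mode: Dict[str, List[str]] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        mode = record["mode"]
        counts[mode] += 1
        ids_by_mode.setdefault(mode, []).append(record["_id"])
    return counts, ids_by_mode


# Sample input built from records shown in this article.
SAMPLE_LINES = [
    '{"mode":"deleted","_id":"4","src":{"_id":"4","_source":'
    '{"message":"Climbed that mountain","user":"sandrae"}},"dst":null}',
    '{"mode":"deleted","_id":"1","src":{"_id":"1","_source":'
    '{"message":"Welcome to Golang","user":"olivere"}},"dst":null}',
]

if __name__ == "__main__":
    counts, ids_by_mode = summarize(SAMPLE_LINES)
    for mode, count in counts.items():
        print(mode, count, ids_by_mode[mode])
```

To use it on real output, the lines could be read from sys.stdin instead of SAMPLE_LINES, provided the one-object-per-line assumption holds for your esdiff build.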
You can also pass a query to filter the source and/or the destination, using the -sf and -df arguments respectively:
$ ./esdiff -o=json -sf='{"term":{"user":"olivere"}}' 'http://localhost:29200/index01/_doc' 'http://localhost:19200/index01/_doc'
{"mode":"deleted","_id":"1","src":{"_id":"1","_source":{"message":"Welcome to Golang","user":"olivere"}},"dst":null}
Use -h to display all options:
$ ./esdiff -h
General usage:
esdiff [flags] <source-url> <destination-url>
General flags:
-a Print added docs (default true)
-c Print changed docs (default true)
-d Print deleted docs (default true)
-df string
Raw query for filtering the destination, e.g. {"term":{"name.keyword":"Oliver"}}
-dsort string
Field to sort the destination, e.g. "id" or "-id" (prepend with - for descending)
-exclude string
Raw source filter for excluding certain fields from the source, e.g. "hash_value,sub.*"
-include string
Raw source filter for including certain fields from the source, e.g. "obj.*"
-o string
Output format, e.g. json
-sf string
Raw query for filtering the source, e.g. {"term":{"user":"olivere"}}
-size int
Batch size (default 100)
-ssort string
Field to sort the source, e.g. "id" or "-id" (prepend with - for descending)
-u Print unchanged docs
-replace-with string
Replace the id in the document with the unique field you need from the source,e.g. "unique_key"
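Original-content notice: this article was published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, contact cloudcommunity@tencent.com to request removal.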