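Curator has been around in the community for a long time. It helps you manage your indices and fits many scenarios. This article looks at the tool from two angles. The first is the operations engineer's: using Curator for routine index maintenance such as force merge, close, delete, and scheduled index snapshots. The second is the architect's: using Curator to implement hot/cold separation, so that hot and cold Elasticsearch data migrate between nodes automatically.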
Linux version | Elasticsearch version | Curator version |
---|---|---|
Redhat 7.6 | Elasticsearch 7.2 | curator 5.8.3 |

Hot node | Warm node | Cold node |
---|---|---|
192.168.248.116:9200 | 192.168.248.117:9200 | 192.168.248.115:9200 |
mkdir -p /appdata/curator-5.8.3 && cd /appdata/curator-5.8.3
wget https://packages.elastic.co/curator/5/centos/7/Packages/elasticsearch-curator-5.8.3-1.x86_64.rpm && yum install -y ./elasticsearch-curator-5.8.3-1.x86_64.rpm
Curator is now installed. Next comes the main part: the configuration file and the action files.
cd /appdata/curator-5.8.3
vim curator.yml
######################################
client:
  hosts: ["192.168.248.115:9200"]
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  aws_key:
  aws_secret_key:
  aws_region:
  ssl_no_validate: False
  http_auth: elastic:xxx
  timeout: 30
  master_only: False
logging:
  loglevel: INFO
  logfile: /appdata/curator-5.8.3/logs/log.log
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
#########################################
I will only cover two of these parameters; the rest keep their default values.
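The `logfile` setting above points into a `logs` directory that none of the commands in this article create, so it is worth creating it and checking the connection before writing any action files. The following is a minimal sketch, not part of the original setup: `CURATOR_HOME` and `prepare_and_check` are names chosen for this example, and it assumes the `curator_cli` command installed alongside `curator` is on the `PATH`.

```shell
#!/usr/bin/env bash
# Sketch: create the log directory used by curator.yml, then list the indices
# Curator can see. CURATOR_HOME and prepare_and_check are illustrative names.

CURATOR_HOME="${CURATOR_HOME:-/appdata/curator-5.8.3}"

prepare_and_check() {
    # The logfile path in curator.yml lives under this directory.
    mkdir -p "$CURATOR_HOME/logs" || return 1
    # show_indices only reads from the cluster; it changes nothing.
    curator_cli --config "$CURATOR_HOME/curator.yml" show_indices
}
```

Calling `prepare_and_check` should print the index names if the host and credentials in `curator.yml` are correct.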
cd /appdata/curator-5.8.3 && mkdir actions && cd actions
vim forcemerge.yml
########################
actions:
  1:
    action: forcemerge
    description: >-
      forceMerge log_ prefixed indices older than 10 days (based on index
      name) to 1 segment per shard. Delay 120 seconds between each
      forceMerge operation to allow the cluster to quiesce. Skip indices that
      have already been forcemerged to the minimum number of segments to avoid
      reprocessing.
    options:
      max_num_segments: 1
      delay: 120
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    # Keep only indices whose name starts with log_
    - filtertype: pattern
      kind: prefix
      value: log_
    # Keep only indices whose name carries a date older than 10 days
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 10
    # Drop indices that are already merged down to 1 segment per shard
    - filtertype: forcemerged
      max_num_segments: 1
      exclude: True
######################
cd /appdata/curator-5.8.3/actions/
vim close.yml
########################
actions:
  1:
    action: close
    description: >-
      Close indices older than 7 days (based on index name) for log_
      prefixed indices.
    options:
      delete_aliases: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 7
########################
cd /appdata/curator-5.8.3/actions/
vim delete.yml
######################################
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days (based on index name) for log_
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_
    # No "exclude: true" here: setting it would invert the filter and
    # delete every index that is NOT older than 30 days.
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
######################################
cd /appdata/curator-5.8.3/actions/
vim snapshot.yml
---
# Remember, leave a key empty if there is no value. None will be a string,
# not a Python "NoneType"
#
# disable_action must be False for the action to run.
actions:
  1:
    action: snapshot
    description: >-
      Snapshot logstash- prefixed indices older than 1 day (based on index
      creation_date) with the default snapshot name pattern of
      'curator-%Y%m%d%H%M%S'. Wait for the snapshot to complete. Do not skip
      the repository filesystem access check. Use the other options to create
      the snapshot.
    options:
      # Required: set this to the name of your registered snapshot repository
      repository:
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name:
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 1
Scheduling the jobs with cron
0 1 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/delete.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/delete.log 2>&1
0 2 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/close.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/close.log 2>&1
0 3 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/forcemerge.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/forcemerge.log 2>&1
0 4 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/snapshot.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/snapshot.log 2>&1
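Because the delete and close actions are destructive, it can help to run every action file with Curator's `--dry-run` flag before enabling the cron entries above; in that mode Curator only logs what it would have done. The sketch below assumes the directory layout used in this article; `CURATOR_BIN`, `CURATOR_HOME`, and `dry_run_actions` are names introduced for the example.

```shell
#!/usr/bin/env bash
# Sketch: run every action file with --dry-run so Curator only logs what it
# would do. CURATOR_BIN, CURATOR_HOME and dry_run_actions are illustrative names.

CURATOR_BIN="${CURATOR_BIN:-/usr/bin/curator}"
CURATOR_HOME="${CURATOR_HOME:-/appdata/curator-5.8.3}"

dry_run_actions() {
    local action_file status=0
    for action_file in "$CURATOR_HOME"/actions/*.yml; do
        [ -e "$action_file" ] || continue   # glob matched nothing
        echo "== dry run: $action_file"
        "$CURATOR_BIN" --dry-run --config "$CURATOR_HOME/curator.yml" "$action_file" || status=1
    done
    return "$status"
}
```

After calling `dry_run_actions`, the log file configured in `curator.yml` should list the indices each action would have touched, which makes it easy to spot a filter that matches too much.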
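As the architecture diagram shows, an ES cluster is made up of Master nodes, Coordinating nodes, Ingest nodes, and Data nodes.
With the hot/cold architecture introduced, let's look at how to implement the Data node part. Following the diagram, the Data nodes are split into three types, hot, warm, and cold, which hold data from the most recent 3 days, data that is 3-15 days old, and data that is 16-30 days old, respectively.
Assume indices are named log_transaction_YYYY-MM-DD. They are then distributed across the data nodes as follows:
Node type | log_transaction_YYYY-MM-DD |
---|---|
Hot | Data from the most recent 3 days |
Warm | Data 3-15 days old |
Cold | Data 16-30 days old |
Archived to NBU or HDFS | Data older than 30 days |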
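To make the tiering rule concrete, the following sketch classifies an index by the date in its name, using the boundaries from the table. It is only an illustration of the rule, not something Curator needs: `tier_for_index` is a hypothetical helper, the reference date is passed in explicitly, and GNU `date` (as shipped with Red Hat) is assumed.

```shell
#!/usr/bin/env bash
# Sketch: classify an index into a tier from the date in its name.
# Assumes GNU date and names such as log_transaction_2024-01-31.
# tier_for_index is an illustrative helper, not a Curator feature.

tier_for_index() {
    local index="$1" today="$2"
    local day="${index##*_}"                 # text after the last underscore
    local index_s today_s age
    index_s=$(date -u -d "$day" +%s) || return 1
    today_s=$(date -u -d "$today" +%s) || return 1
    age=$(( (today_s - index_s) / 86400 ))   # whole days between the two dates
    if   (( age < 3 ));   then echo hot
    elif (( age <= 15 )); then echo warm
    elif (( age <= 30 )); then echo cold
    else                       echo archive
    fi
}
```

For example, `tier_for_index log_transaction_2024-01-28 2024-01-31` treats the index as 3 days old and places it in the warm tier.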
1. Hot to Warm: writing the action file
cd /appdata/curator-5.8.3/actions/
vim Allocation_Warm.yml
actions:
  1:
    action: allocation
    description: "Apply shard allocation filtering rules to the specified indices,Hot to Warm"
    options:
      key: box_type
      value: warm
      allocation_type: require
      wait_for_completion: true
      timeout_override:
      continue_if_exception: false
      disable_action: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_transaction_
    # Older than 3 days: leaves the hot tier
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 3
    # Younger than 15 days: keeps this action from pulling indices that
    # already moved to cold back onto the warm nodes
    - filtertype: age
      source: name
      direction: younger
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 15
2. Warm to Cold: writing the action file
cd /appdata/curator-5.8.3/actions/
vim Allocation_Cold.yml
actions:
  1:
    action: allocation
    description: "Apply shard allocation filtering rules to the specified indices,Warm to Cold"
    options:
      key: box_type
      value: cold
      allocation_type: require
      wait_for_completion: true
      timeout_override:
      continue_if_exception: false
      disable_action: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_transaction_
    # Older than 15 days: leaves the warm tier
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 15
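The allocation actions above work by setting `index.routing.allocation.require.box_type` on the matching indices, so they only have an effect if each data node was started with a matching `box_type` node attribute (for example `node.attr.box_type: warm` in that node's `elasticsearch.yml`). The article does not show that step, so treat it as an assumption about the cluster. The sketch below checks both sides through the REST API; `ES_URL` and the two function names are chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch: verify node attributes and index routing for the hot/warm/cold setup.
# ES_URL, list_box_types and index_box_type are illustrative names.
# Add -u user:password to the curl calls if the cluster requires authentication.

ES_URL="${ES_URL:-http://192.168.248.115:9200}"

# Print "<node> <box_type>" for every node that defines the box_type attribute.
list_box_types() {
    curl -fsS "$ES_URL/_cat/nodeattrs?h=node,attr,value" \
        | awk '$2 == "box_type" {print $1, $3}'
}

# Show which box_type an index is currently required to be allocated on.
index_box_type() {
    curl -fsS "$ES_URL/$1/_settings/index.routing.allocation.require.box_type?flat_settings=true"
}
```

If `list_box_types` prints nothing, no node carries the attribute, and shards of an index that requires `box_type: warm` or `box_type: cold` would have nowhere to be allocated.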
3. Deleting data older than 30 days
cd /appdata/curator-5.8.3/actions/
vim delete.yml
######################################
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days (based on index name) for log_transaction_
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_transaction_
    # No "exclude: true" here: setting it would invert the filter and
    # delete every index that is NOT older than 30 days.
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 30
4. Scheduling the jobs with cron
0 1 * * * /usr/bin/curator /appdata/curator-5.8.3/actions/Allocation_Warm.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/Allocation_Warm.log 2>&1
0 3 * * * /usr/bin/curator /appdata/curator-5.8.3/actions/Allocation_Cold.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/Allocation_Cold.log 2>&1
0 5 * * * /usr/bin/curator /appdata/curator-5.8.3/actions/delete.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/delete.log 2>&1
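This article does not cover how to archive data older than 30 days, but that part is also easy to implement. My own setup works like this: 1. use Curator to take snapshots of data older than 25 days; 2. run a daily crontab job that checks whether the snapshot succeeded and, if it did, automatically deletes the data through delete.yml. If you want to know how to set up the backup environment, see the article 《Elasticsearch基于nfs的备份环境搭建》 ("Setting up an NFS-based backup environment for Elasticsearch").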
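The check-then-delete step described above could look roughly like the sketch below. It is not the author's script: the function names, `ES_URL`, and the repository and snapshot arguments are all illustrative, and the success test simply looks for `"state": "SUCCESS"` in the response of `GET _snapshot/<repository>/<snapshot>` for a single snapshot.

```shell
#!/usr/bin/env bash
# Sketch: delete old indices only after a snapshot is confirmed successful.
# ES_URL, CURATOR_HOME and both function names are illustrative.

ES_URL="${ES_URL:-http://192.168.248.115:9200}"
CURATOR_HOME="${CURATOR_HOME:-/appdata/curator-5.8.3}"

# Succeeds if the JSON on stdin (the response for ONE snapshot) reports
# "state": "SUCCESS".
snapshot_succeeded() {
    grep -Eq '"state"[[:space:]]*:[[:space:]]*"SUCCESS"'
}

# Usage: delete_if_snapshot_ok <repository> <snapshot-name>
delete_if_snapshot_ok() {
    local repo="$1" snapshot="$2"
    # Add -u user:password to curl if the cluster requires authentication.
    if curl -fsS "$ES_URL/_snapshot/$repo/$snapshot" | snapshot_succeeded; then
        /usr/bin/curator --config "$CURATOR_HOME/curator.yml" "$CURATOR_HOME/actions/delete.yml"
    else
        echo "snapshot '$snapshot' not confirmed as SUCCESS; skipping delete" >&2
        return 1
    fi
}
```

If the request fails or the snapshot is in any other state, the function returns non-zero and leaves the indices in place, so a failed backup never leads to data loss.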
https://www.elastic.co/guide/en/elasticsearch/client/curator/5.8/installation.html
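Note: if you have any questions or suggestions, please send them to 13580480392@163.com. I will reply promptly. Thank you for your support!
Original content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, please contact cloudcommunity@tencent.com for removal.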