前言: 单位有套年代久远的Greenplum集群,接手后一直没怎么处理(也不知道怎么处理)。最近业务系统频繁出现访问卡顿,于是便想着搭建一套环境用于测试
Greenplum是一款基于分布式架构的开源数据库;采用无共享(no shareing)的MPP架构(每个数据节点拥有独立的CPU、IO和内存等资源);其具有良好的线性扩展能力,具有高效的并行运算、并行存储特性。拥有独特的高效的ORCA优化器。非常适合用于PB数据量级的存储、处理和实时分析能力。Greenplum是基于PostgreSQL数据库发展而来,本质上是多个PostgreSQL面向磁盘的数据库实例一起工作形成的一个紧密结合的数据库管理系统(DBMS),同时支持涵盖OLTP型业务混合负载,数据节点和主节点均可设计备份节点,进而提供数据库的高可用性。
Greenplum由Master节点<协调节点>和Segment节点<数据节点>以及高速网络层三部分组成
本次采用的是4台机器组成的集群,操作系统版本为Centos7.9
服务器IP地址 | 节点名称 |
---|---|
192.168.124.175 | minio1 |
192.168.124.176 | minio2 |
192.168.124.177 | minio3 |
192.168.124.178 | minio4 |
其中175为master节点,176为standby节点,175-178共4台为segment节点
systemctl stop firewalld
systemctl disable firewalld
sed -ri '/^SELINUX=/cSELINUX=disabled' /etc/selinux/config
setenforce 0 (临时关闭)
192.168.124.175 minio1
192.168.124.176 minio2
192.168.124.177 minio3
192.168.124.178 minio4
直接生成需要安装的依赖包命令
rpm -q apr apr-util bash bzip2 curl krb5 libcurl libevent libxml2 libyaml zlib openldap openssh-client openssl openssl-libs perl readline rsync R sed tar zip krb5-devel | grep "is not installed" | awk '{print $2}' | xargs echo yum install -y
共享内存:
kernel.shmall = ( _PHYS_PAGES / 2)
kernel.shmmax = ( _PHYS_PAGES / 2) * PAGE_SIZE
通过以下命令计算
echo $(expr $(getconf \_PHYS\_PAGES) / 2)
echo $(expr $(getconf \_PHYS\_PAGES) / 2 \\* $(getconf PAGE\_SIZE))
segment主机内存
vm.overcommit\_memory 系统使用该参数来确定可以为进程分配多少内存。对于GP数据库,此参数应设置为2。
vm.overcommit\_ratio 以为进程分配内的百分比,其余部分留给操作系统。默认值为50。建议设置95(根据主机内存大小自行设置)
系统内存
对于内存超过 64GB 的主机系统,建议采用以下设置:
vm.dirty\_background\_ratio = 0
vm.dirty\_ratio = 0
vm.dirty\_background\_bytes = 1610612736 # 1.5GB
vm.dirty\_bytes = 4294967296 # 4GB
对于内存为 64GB 或更少的主机系统,删除vm.dirty\_background\_bytes
和vm.dirty\_bytes
,并将两个ratio参数设置为以下值:
vm.dirty\_background\_ratio = 3
vm.dirty\_ratio = 10
增加vm.min_free_kbytes以确保PF_MEMALLOC轻松满足来自网络和存储驱动程序的请求。这对于具有大量系统内存的系统尤其重要。这些系统的默认值通常太低
awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min\_free\_kbytes =", $2 \* .03;}' /proc/meminfo >> /etc/sysctl.conf
IP Fragmentation设置
net.ipv4.ipfrag\_high\_thresh = 41943040
net.ipv4.ipfrag\_low\_thresh = 31457280
net.ipv4.ipfrag\_time = 60
systectl.conf完整参数配置
#共享内存设置
kernel.shmall = 485033
kernel.shmmax = 1986695168 # 这里的俩个参数为该节点第一步分别计算出的数据
kernel.shmmni = 4096
# See Segment Host Memory
vm.overcommit\_memory = 2
vm.overcommit\_ratio = 90
# 系统内存
vm.dirty\_background\_ratio = 3
vm.dirty\_ratio = 10
vm.dirty\_expire\_centisecs = 500
vm.dirty\_writeback\_centisecs = 100
vm.zone\_reclaim\_mode = 0
vm.min\_free\_kbytes = 116408
# See Port Settings 端口设定
net.ipv4.ip\_local\_port\_range = 10000 65535
kernel.sem = 500 2048000 200 4096
kernel.sysrq = 1
kernel.core\_uses\_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp\_syncookies = 1
net.ipv4.conf.default.accept\_source\_route = 0
net.ipv4.tcp\_max\_syn\_backlog = 4096
net.ipv4.conf.all.arp\_filter = 1
net.core.netdev\_max\_backlog = 10000
net.core.rmem\_max = 2097152
net.core.wmem\_max = 2097152
vm.swappiness = 10
sysctl -p
使配置生效
在文件中设置以下参数/etc/security/limits.conf:
\* soft nofile 524288
\* hard nofile 524288
\* soft nproc 131072
\* hard nproc 131072
在rhel系的系统中,/etc/security/limits.d/
目录下的配置文件,会覆盖/etc/security/limits.conf
文件中的配置。设置完可重新登陆系统,使用ulimit -a
查看配置是否生效
/etc/sysctl.conf
添加以下内容并加载内核 sysctl -p
kernel.core\_pattern=/var/core/core.%h.%t
/etc/security/limits.conf
添加以下内容:
\* soft core unlimited
I/O预读
#获取预读值
/sbin/blockdev --getra <devname>
#设置预读值
/sbin/blockdev --setra 16384 /dev/sda
# 永久设置
vim /etc/rc.d/rc.local
/sbin/blockdev --setra 16384 /dev/sda
chmod +x /etc/rc.d/rc.local
I/O调度
grubby --update-kernel=ALL --args="elevator=deadline"
列出所有内核参数: grubby --info=ALL
vim /etc/rc.local //添加如下内容
if test -f /sys/kernel/mm/transparent\_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent\_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent\_hugepage/defrag; then
echo never > /sys/kernel/mm/transparent\_hugepage/defrag
fi
# 在GRUB\_CMDLINE\_LINUX加入选项 transparent\_hugepage=never
vim etc/default/grub
GRUB\_TIMEOUT=5
GRUB\_DISTRIBUTOR="$(sed 's, release .\*$,,g' /etc/system-release)"
GRUB\_DEFAULT=saved
GRUB\_DISABLE\_SUBMENU=true
GRUB\_TERMINAL\_OUTPUT="console"
GRUB\_CMDLINE\_LINUX="rd.lvm.lv=fedora/swap rd.lvm.lv=fedora/root rhgb quiet transparent\_hugepage=never"
GRUB\_DISABLE\_RECOVERY="true"
# 根据启动方式,重新生成grub配置文件
grub2-mkconfig -o /boot/grub2/grub.cfg //bios
grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg //uefi
#重启
reboot
# 确认结果,显示为never
cat /sys/kernel/mm/transparent\_hugepage/enabled
cat /sys/kernel/mm/transparent\_hugepage/defrag
调大ssh连接数
vim /etc/ssh/sshd\_config
MaxSessions 200
MaxStartups 100:30:1000
重启ssh:systemctl restart sshd
vim /etc/systemd/logind.conf
RemoveIPC=no
systemctl restart systemd-logind.service
service systemd-logind restart
echo $LANG
# 如果不是en\_US.UTF-8的,执行下面语句
localectl set-locale LANG=en\_US.UTF-8
除特殊指出的步骤,以下操作都在master节点上进行
在每个节点上创建gpadmin用户,确保gpadmin用户在每个主机上具有相同的用户 ID (uid) 和组 ID (gid) 编号
groupadd -g 1530 gpadmin
useradd -u 1530 gpadmin -r -m -g gpadmin
passwd gpadmin
su - gpadmin
ssh-keygen -t rsa
给gpadmin设置用户具有root权限,方便后期加sudo执行root权限的命令
visudo
gpadmin ALL=(ALL) NOPASSWD: ALL
在所有节点安装数据软件,并更改数据库目录为gpadmin用户
sudo yum install ./greenplum-db-<version>-<platform>.rpm
sudo chown -R gpadmin:gpadmin /usr/local/greenplum\*
sudo chgrp -R gpadmin /usr/local/greenplum\*
gpadmin用户登陆mastter主机
source /usr/local/greenplum-db-<version>/greenplum\_path.sh
ssh-copy-id minio1
ssh-copy-id minio2
ssh-copy-id minio3
ssh-copy-id minio4
创建主机列表文件
vim hostfile\_exkeys
minio1
minio2
minio3
minio4
互免密登陆
gpssh-exkeys -f hostfile\_exkeys
gpadmin用户登陆master节点,运行下面命令,能正常显示各节点的gp目录和权限
gpssh -f hostfile\_exkeys -e 'ls -l /usr/local/greenplum-db-<version>'
vim /home/gpadmin/.bashrc
source /usr/local/greenplum-db/greenplum\_path.sh
export PGPORT=5432
export MASTER\_DATA\_DIRECTORY=/gpdata/master/gpseg-1
export PGDATABASE=zndb
source /home/gpadmin/.bashrc
master
mkdir -p /gpdata/master
chown gpadmin:gpadmin /gpdata/master
standby master
source /usr/local/greenplum-db/greenplum\_path.sh
mkdir -p /gpdata/master
chown gpadmin:gpadmin /gpdata/master
segment节点
source /usr/local/greenplum-db/greenplum\_path.sh
mkdir -p /gpdata/primary
mkdir -p /gpdata/mirror
chown -R gpadmin:gpadmin /gpdata/\*
以gpadmin用户登陆master,并设置环境变量
su - gpadmin
source /usr/local/greenplum-db/greenplum\_path.sh
vim hostfile\_gpinitsystem
minio1
minio2
minio3
minio4
复制示例文件
cp $GPHOME/docs/cli\_help/gpconfigs/gpinitsystem\_config /home/gpadmin/gpconfigs/gpinitsystem\_config
编辑配置文件
vim gpinitsystem\_config
# FILE NAME: gpinitsystem\_config
# Configuration file needed by the gpinitsystem
################################################
#### REQUIRED PARAMETERS
################################################
#### Name of this Greenplum system enclosed in quotes.
ARRAY\_NAME="Greenplum Data Platform"
#### Naming convention for utility-generated data directories.
SEG\_PREFIX=gpseg
#### Base number by which primary segment port numbers
#### are calculated.
PORT\_BASE=16000
#### File system location(s) where primary segment data directories
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA\_DIRECTORY=(/gpdata/primary /gpdata/primary)
#### OS-configured hostname or IP address of the master host.
MASTER\_HOSTNAME=minio1
#### File system location where the master data directory
#### will be created.
MASTER\_DIRECTORY=/gpdata/master
#### Port number for the master instance.
MASTER\_PORT=5432
#### Shell utility used to connect to remote hosts.
TRUSTED\_SHELL=ssh
#### Maximum log file segments between automatic WAL checkpoints.
CHECK\_POINT\_SEGMENTS=8
#### Default server-side character set encoding.
ENCODING=UNICODE
################################################
#### OPTIONAL MIRROR PARAMETERS
################################################
#### Base number by which mirror segment port numbers
#### are calculated.
MIRROR\_PORT\_BASE=17000
#### File system location(s) where mirror segment data directories
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the
#### DATA\_DIRECTORY parameter.
declare -a MIRROR\_DATA\_DIRECTORY=(/gpdata/mirror /gpdata/mirror)
################################################
#### OTHER OPTIONAL PARAMETERS
################################################
#### Create a database of this name after initialization.
DATABASE\_NAME=zndb
#### Specify the location of the host address file here instead of
#### with the -h option of gpinitsystem.
#MACHINE\_LIST\_FILE=/home/gpadmin/gpconfigs/hostfile\_gpinitsystem
gpinitsystem -c gpinitsystem\_config -h hostfile\_gpinitsystem -s minio2 --mirror-mode=spread
如果初始化发生错误,建议查看日志,路径:~/gpAdminLogs/gpinitsystem\_202XXXX.log
如果存在需要清理后再重新安装的,请检查是否存在回退脚本~/gpAdminLogs/backout\_gpinitsystem\_<user>\_<timestamp>
,执行后再重新运行初始化脚本;不存在该脚本,可以删除创建的数据目录后重新初始化
gpconfig -s TimeZone
gpconfig -c TimeZone -v 'Asia/Shanghai'
修改master节点pg_hba.conf文件,添加允许远程登陆权限
vim /gpdata/master/gpseg-1/pg\_hba.conf
...
host all all 192.168.124.0/24 md5
# 重新加载配置文件
gpstop -u
至此,Greenplum的安装已经完成。以上过程有不足之处,请参考官方文档
修改密码
psql -d postgres
alter user gpadmin with password 'gpadmin';
dbeaver登陆数据库
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。