一、简介
二、环境架构
三、快速部署1主2从流复制环境
3.1、申请主机
3.2、主库操作
3.2.1、主库放开防火墙
3.2.2、主库配置归档
3.2.3、主库创建复制用户
3.3、从库操作
3.3.1、在从库对主库进行备份
3.3.2、还原从库1
3.3.3、还原从库2
3.3.4、修改2个从库的primary_conninfo参数
3.3.5、启动从库
3.4、查询复制状态
3.5、测试主从复制
四、部署和使用pgpool-II和pgpoolAdmin工具
4.1、快速部署pgpool-II和pgpoolAdmin工具
4.2、使用pgpool-II实现读写分离+负载均衡
4.2.1、添加被监控主机
4.3、使用pgpoolAdmin管理pgpool-II
4.4、测试pgpool-II的读写分离+负载均衡+内存查询缓存功能
4.4.1、先关闭内存查询缓存
4.4.2、测试读写分离+负载均衡
4.4.3、开启内存查询缓存
之前发布过2篇有关PG中主从流复制的文章:
有关pgpool-II的相关文档:
但是,之前的文章中没有介绍有关pgpoolAdmin的知识,这款针对pgpool开发的web界面工具,还是比较实用的。
pgpool项目还提供了一个使用PHP写的 Web管理工具,称为“pgpoolAdmin ",该Web管理工具可以以 Web界面方式实现pgpool-II 的配置。
pgpoolAdmin下载:https://pgpool.net/mediawiki/index.php/Downloads
pgpoolAdmin官方文档:https://www.pgpool.net/docs/pgpoolAdmin/index_en.html
The pgpool Administration Tool is management tool of pgpool. It is possible to monitor, start, stop pgpool and change setting for pgpool.
PgpoolAdmin是管理pgpool的WEB界面工具,可以对pgpool进行监控,启停服务,更改设置。
先来一张图:
今天我们就来介绍一下pgpool-II和pgpoolAdmin的使用。
👉 注意: 本环境中的192.168.66.35或192.168.1.35均指的是同一台宿主机。 上图中的所有环境均在该宿主机中安装。
-- 拉取镜像
docker pull postgres:12
-- 创建专用网络
docker network create --subnet=172.72.6.0/24 pg-network
-- 创建宿主机相关映射路径
mkdir -p /docker_data/pg/lhrpg64302/data
mkdir -p /docker_data/pg/lhrpg64303/data
mkdir -p /docker_data/pg/lhrpg64304/data
-- 主库
docker rm -f lhrpg64302
rm -rf /docker_data/pg/lhrpg64302/data
docker run -d --name lhrpg64302 -h lhrpg64302 \
-p 64302:5432 --net=pg-network --ip 172.72.6.2 \
-v /docker_data/pg/lhrpg64302/data:/var/lib/postgresql/data \
-v /docker_data/pg/lhrpg64302/bk:/bk \
-e POSTGRES_PASSWORD=lhr \
-e TZ=Asia/Shanghai \
postgres:12
-- 从库1
docker rm -f lhrpg64303
rm -rf /docker_data/pg/lhrpg64303/data
rm -rf /docker_data/pg/lhrpg64303/bk
docker run -d --name lhrpg64303 -h lhrpg64303 \
-p 64303:5432 --net=pg-network --ip 172.72.6.3 \
-v /docker_data/pg/lhrpg64303/data:/var/lib/postgresql/data \
-v /docker_data/pg/lhrpg64303/bk:/bk \
-e POSTGRES_PASSWORD=lhr \
-e TZ=Asia/Shanghai \
postgres:12
-- 从库2
docker rm -f lhrpg64304
rm -rf /docker_data/pg/lhrpg64304/data
rm -rf /docker_data/pg/lhrpg64304/bk
docker run -d --name lhrpg64304 -h lhrpg64304 \
-p 64304:5432 --net=pg-network --ip 172.72.6.4 \
-v /docker_data/pg/lhrpg64304/data:/var/lib/postgresql/data \
-v /docker_data/pg/lhrpg64304/bk:/bk \
-e POSTGRES_PASSWORD=lhr \
-e TZ=Asia/Shanghai \
postgres:12
-- 远程登录
psql -U postgres -h 192.168.66.35 -p 64302
psql -U postgres -h 192.168.66.35 -p 64303
psql -U postgres -h 192.168.66.35 -p 64304
cat << EOF > /docker_data/pg/lhrpg64302/data/pg_hba.conf
# TYPE DATABASE USER ADDRESS METHOD
local all all trust
host all all 127.0.0.1/32 trust
host all all 0.0.0.0/0 md5
host replication all 0.0.0.0/0 md5
EOF
👉 注意添加replication
docker exec -it lhrpg64302 bash
-- 该路径也需要在从库创建
mkdir -p /postgresql/archive
chown -R postgres.postgres /postgresql/archive
cat >> /var/lib/postgresql/data/postgresql.conf <<"EOF"
wal_level='replica'
archive_mode='on'
archive_command='test ! -f /postgresql/archive/%f && cp %p /postgresql/archive/%f'
restore_command='cp /postgresql/archive/%f %p'
max_wal_senders=10
wal_keep_segments=256
wal_sender_timeout=60s
EOF
-- 重启
docker restart lhrpg64302
-- 或:
/usr/lib/postgresql/12/bin/pg_ctl restart -D /var/lib/postgresql/data/
psql -U postgres -h 192.168.66.35 -p 64302
select * from pg_settings where name in ('wal_level','archive_mode','archive_command');
-- 切换归档
select pg_switch_wal();
执行结果:
postgres=# select * from pg_settings where name in ('wal_level','archive_mode','archive_command');
name | setting | unit | category | short_desc | extra_desc | context | vartype | source | min_val | max_val | enumvals | boot_val | reset_val | sourcefile | sourceline | pending_restart
-----------------+------------------------------------------------------------------+------+-----------------------------+-------------------------------------------------------------------+------------+------------+---------+--------------------+---------+---------+---------------------------+----------+------------------------------------------------------------------+------------------------------------------+------------+-----------------
archive_command | test ! -f /postgresql/archive/%f && cp %p /postgresql/archive/%f | | Write-Ahead Log / Archiving | Sets the shell command that will be called to archive a WAL file. | | sighup | string | configuration file | | | | | test ! -f /postgresql/archive/%f && cp %p /postgresql/archive/%f | /var/lib/postgresql/data/postgresql.conf | 753 | f
archive_mode | on | | Write-Ahead Log / Archiving | Allows archiving of WAL files using archive_command. | | postmaster | enum | configuration file | | | {always,on,off} | off | on | /var/lib/postgresql/data/postgresql.conf | 752 | f
wal_level | replica | | Write-Ahead Log / Settings | Set the level of information written to the WAL. | | postmaster | enum | configuration file | | | {minimal,replica,logical} | replica | replica | /var/lib/postgresql/data/postgresql.conf | 751 | f
(3 rows)
postgres=# select * from pg_stat_get_archiver();
-[ RECORD 1 ]------+-----------------------------------------
archived_count | 8
last_archived_wal | 000000010000000000000006.00000028.backup
last_archived_time | 2021-04-22 11:42:54.049649+00
failed_count | 0
last_failed_wal |
last_failed_time |
stats_reset | 2021-04-22 11:35:55.727069+00
postgres=# select pg_switch_wal();
-[ RECORD 1 ]-+----------
pg_switch_wal | 0/7015058
postgres=# select * from pg_stat_get_archiver();
-[ RECORD 1 ]------+------------------------------
archived_count | 9
last_archived_wal | 000000010000000000000007
last_archived_time | 2021-04-23 01:00:30.076916+00
failed_count | 0
last_failed_wal |
last_failed_time |
stats_reset | 2021-04-22 11:35:55.727069+00
-- 切换归档前
root@lhrpg64302:/# ps -ef|grep post
postgres 1 0 0 01:28 ? 00:00:00 postgres
postgres 26 1 0 01:28 ? 00:00:00 postgres: checkpointer
postgres 27 1 0 01:28 ? 00:00:00 postgres: background writer
postgres 28 1 0 01:28 ? 00:00:00 postgres: walwriter
postgres 29 1 0 01:28 ? 00:00:00 postgres: autovacuum launcher
postgres 30 1 0 01:28 ? 00:00:00 postgres: archiver
postgres 31 1 0 01:28 ? 00:00:00 postgres: stats collector
postgres 32 1 0 01:28 ? 00:00:00 postgres: logical replication launcher
postgres 33 1 0 01:29 ? 00:00:00 postgres: postgres postgres 172.72.6.1(6884) idle
root 40 34 0 01:29 pts/0 00:00:00 grep post
root@lhrpg64302:/# cd /postgresql/archive/
root@lhrpg64302:/postgresql/archive# ls -l
total 0
-- 切换归档
postgres=# select pg_switch_wal();
pg_switch_wal
---------------
0/1645528
(1 row)
-- 切换归档后
root@lhrpg64302:/postgresql/archive# ls -l
total 16384
-rw------- 1 postgres postgres 16777216 Apr 23 01:30 000000010000000000000001
root@lhrpg64302:/postgresql/archive# ps -ef|grep post
postgres 1 0 0 01:28 ? 00:00:00 postgres
postgres 26 1 0 01:28 ? 00:00:00 postgres: checkpointer
postgres 27 1 0 01:28 ? 00:00:00 postgres: background writer
postgres 28 1 0 01:28 ? 00:00:00 postgres: walwriter
postgres 29 1 0 01:28 ? 00:00:00 postgres: autovacuum launcher
postgres 30 1 0 01:28 ? 00:00:00 postgres: archiver last was 000000010000000000000001
postgres 31 1 0 01:28 ? 00:00:00 postgres: stats collector
postgres 32 1 0 01:28 ? 00:00:00 postgres: logical replication launcher
postgres 33 1 0 01:29 ? 00:00:00 postgres: postgres postgres 172.72.6.1(6884) idle
root 47 34 0 01:30 pts/0 00:00:00 grep post
Specifies the maximum number of concurrent connections from standby servers or streaming base backup clients (i.e., the maximum number of simultaneously running WAL sender processes). The default is zero, meaning replication is disabled. WAL sender processes count towards the total number of connections, so the parameter cannot be set higher than max_connections. This parameter can only be set at server start. wal_level must be set to archive or hot_standby to allow connections from standby servers.
也就是说,这个参数是在主机上设置的,是从机连接到主机的并发连接数之总和,所以这个参数是个正整型。默认值是0,也即默认没有流复制功能。该并发连接数从进程上看,就是各个wal sender进程数之和,可以通过ps -ef|grep senders来查看,所以该值不能超过系统的最大连接数(max_connections,该BUG在9.1.5被修复),可以允许超过实际的流复制用户数。该参数更改需要重启DB,比如我只配了一个从机:
[postgres@ndb2 database]$ ps -ef|grep sender postgres 21257 21247 0 20:57 ? 00:00:00 postgres: wal sender process repuser 192.25.10.71(46161) streaming 0/4018ED8 postgres 22193 20949 0 23:02 pts/0 00:00:00 grep sender
表示保留多少个WAL文件。如果源库业务较繁忙,那么应该相应的增加这个值。
在PG13中,wal_keep_segments 已经取消,改用 wal_keep_size
中断那些停止活动超过指定毫秒数的复制连接。这对发送服务器检测一个后备机崩溃或网络中断有用。设置为0将禁用该超时机制。这个参数只能在postgresql.conf文件中或在服务器命令行上设置。默认值是 60 秒。
create role replhr login encrypted password 'lhr' replication;
👉 创建用户需要加上replication选项。
这里,我们在第一个从库上进行备份即可。
docker exec -it lhrpg64303 bash
mkdir -p /bk
chown postgres:postgres /bk
su - postgres
pg_basebackup -h 172.72.6.2 -p 5432 -U replhr -l bk20210422 -F p -P -R -D /bk
执行完成后,会产生文件standby.signal,如下:
root@lhrpg64303:/# mkdir -p /bk
root@lhrpg64303:/# chown postgres:postgres /bk
root@lhrpg64303:/#
root@lhrpg64303:/# su - postgres
postgres@lhrpg64303:~$
postgres@lhrpg64303:~$ pg_basebackup -h 172.72.6.2 -p 5432 -U replhr -l bk20210422 -F p -P -R -D /postgresql/pgdata
Password:
24560/24560 kB (100%), 1/1 tablespace
postgres@lhrpg64303:~$ cd /postgresql/pgdata/
postgres@lhrpg64303:/postgresql/pgdata$ ll
-bash: ll: command not found
postgres@lhrpg64303:/postgresql/pgdata$ ls -l
total 116
-rw------- 1 postgres postgres 3 Apr 22 10:52 PG_VERSION
-rw------- 1 postgres postgres 209 Apr 22 10:52 backup_label
drwx------ 5 postgres postgres 4096 Apr 22 10:52 base
drwx------ 2 postgres postgres 4096 Apr 22 10:52 global
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_commit_ts
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_dynshmem
-rw-r--r-- 1 postgres postgres 243 Apr 22 10:52 pg_hba.conf
-rw------- 1 postgres postgres 1636 Apr 22 10:52 pg_ident.conf
drwx------ 4 postgres postgres 4096 Apr 22 10:52 pg_logical
drwx------ 4 postgres postgres 4096 Apr 22 10:52 pg_multixact
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_notify
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_replslot
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_serial
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_snapshots
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_stat
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_stat_tmp
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_subtrans
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_tblspc
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_twophase
drwx------ 3 postgres postgres 4096 Apr 22 10:52 pg_wal
drwx------ 2 postgres postgres 4096 Apr 22 10:52 pg_xact
-rw------- 1 postgres postgres 255 Apr 22 10:52 postgresql.auto.conf
-rw------- 1 postgres postgres 26756 Apr 22 10:52 postgresql.conf
-rw------- 1 postgres postgres 0 Apr 22 10:52 standby.signal
postgres@lhrpg64303:/postgresql/pgdata$
👉 在PG12之前,-R备份结束之后会自动生成recovery.conf文件,用来做流复制判断主从同步的信息。但是从PG12开始,这个文件已经不需要了。只需要在参数文件postgresql.conf中配置primary_conninfo参数即可。
-- 关闭从库1,删除从库的数据文件,并且将备份文件覆盖从库的数据文件
docker stop lhrpg64303
rm -rf /docker_data/pg/lhrpg64303/data/*
cp -r /docker_data/pg/lhrpg64303/bk/* /docker_data/pg/lhrpg64303/data/
-- 关闭从库2,删除从库的数据文件,并且将备份文件覆盖从库的数据文件
docker stop lhrpg64304
rm -rf /docker_data/pg/lhrpg64304/data/*
cp -r /docker_data/pg/lhrpg64303/bk/* /docker_data/pg/lhrpg64304/data/
-- 从库1
cat >> /docker_data/pg/lhrpg64303/data/postgresql.conf <<"EOF"
primary_conninfo = 'host=172.72.6.2 port=5432 user=replhr password=lhr'
EOF
-- 从库2
cat >> /docker_data/pg/lhrpg64304/data/postgresql.conf <<"EOF"
primary_conninfo = 'host=172.72.6.2 port=5432 user=replhr password=lhr'
EOF
docker start lhrpg64303 lhrpg64304
主库进程:
root@lhrpg64302:/# ps -ef|grep post
postgres 1 0 0 11:35 ? 00:00:00 postgres
postgres 26 1 0 11:35 ? 00:00:00 postgres: checkpointer
postgres 27 1 0 11:35 ? 00:00:00 postgres: background writer
postgres 28 1 0 11:35 ? 00:00:00 postgres: walwriter
postgres 29 1 0 11:35 ? 00:00:00 postgres: autovacuum launcher
postgres 30 1 0 11:35 ? 00:00:00 postgres: archiver last was 000000010000000000000006.00000028.backup
postgres 31 1 0 11:35 ? 00:00:00 postgres: stats collector
postgres 32 1 0 11:35 ? 00:00:00 postgres: logical replication launcher
postgres 33 1 0 11:35 ? 00:00:00 postgres: postgres postgres 172.72.6.1(52776) idle
postgres 129 1 0 11:48 ? 00:00:00 postgres: walsender replhr 172.72.6.3(40056) streaming 0/7000148
从库进程:
root@lhrpg64303:/# ps -ef|grep post
postgres 1 0 0 11:48 ? 00:00:00 postgres
postgres 26 1 0 11:48 ? 00:00:00 postgres: startup recovering 000000010000000000000007
postgres 27 1 0 11:48 ? 00:00:00 postgres: checkpointer
postgres 28 1 0 11:48 ? 00:00:00 postgres: background writer
postgres 29 1 0 11:48 ? 00:00:00 postgres: stats collector
postgres 30 1 0 11:48 ? 00:00:00 postgres: walreceiver streaming 0/7000148
postgres 31 1 0 11:48 ? 00:00:00 postgres: postgres postgres 172.72.6.1(54413) idle
-- 主库查看wal日志发送状态
select * from pg_stat_replication;
-- 从库查看wal日志接收状态
select * from pg_stat_wal_receiver;
-- 也可以通过该名称查看
pg_controldata | grep state
-- 也可以查看这个,主库是f代表false ;备库是t,代表true
select pg_is_in_recovery();
主库查询复制状态:
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-----+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
33 | 16384 | replhr | walreceiver | 172.72.6.3 | | 33806 | 2021-04-23 14:51:18.044699+08 | | streaming | 0/6000060 | 0/6000060 | 0/6000060 | 0/6000060 | | | | 0 | async | 2021-04-23 15:23:31.27773+08
75 | 16384 | replhr | walreceiver | 172.72.6.4 | | 47540 | 2021-04-23 15:23:14.795969+08 | | streaming | 0/6000060 | 0/6000060 | 0/6000060 | 0/6000060 | | | | 0 | async | 2021-04-23 15:23:34.927623+08
(2 rows)
postgres=# select client_addr,state,sync_state from pg_stat_replication;
client_addr | state | sync_state
-------------+-----------+------------
172.72.6.3 | streaming | async
172.72.6.4 | streaming | async
(2 rows)
postgres=#
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
pg_stat_replication是一个视图,主要用于监控PG流复制情况。在这个系统视图中每个记录只代表一个slave。因此,可以看到谁处于连接状态,在做什么任务。pg_stat_replication也是检查slave是否处于连接状态的一个好方法。
每个字段代码的含义:
• pid: 这代表负责流连接的wal_sender进程的进程ID。例如“postgres: walsender replhr 172.72.6.3(40056) streaming 0/7000148”。
• usesysid: 每个内部用户都有一个独一无二的编号。该系统的工作原理很像UNIX。usesysid 是 (PostgreSQL) 用户连接到系统的唯一标识符。
• usename: (不是用户名, 注意少了 r),它存储与用户相关的 usesysid 的名字。这是客户端放入到连接字符串中的东西。
• application_name:这是同步复制的通常设置。它可以通过连接字符串传递到master。
• client_addr: 它会告诉您流连接从何而来。它拥有客户端的IP地址。
• client_hostname: 除了客户端的IP,您还可以这样做,通过它的主机名来标识客户端。您可以通过master上的postgresql.conf中的log_hostname启用DNS反向查找。
• client_port: 这是客户端用来和WALsender进行通信使用的TPC端口号。如果不本地UNIX套接字被使用了将显示-1。
• backend_start: 它告诉我们slave什么时间创建了流连接。
• state: 此列告诉我们数据的连接状态。如果事情按计划进行,它应该包含流信息。
• sent_lsn:这代表发送到连接的最后的事务日志的位置。已经通过网络发送了多少WAL?
• write_lsn: 这是写到standby系统磁盘上最后的事务日志位置。已向操作系统发送了多少WAL?( 尚未 flushing)
• flush_lsn: 这是被刷新到standby系统的最后位置。(这里注意写和刷新之间的区别。写并不意味着刷新 。)已经有多少WAL已 flush 到磁盘?
• replay_lsn: 这是slave上重放的最后的事务日志位置。已重放了多少WAL,因此对查询可见?
• sync_priority: 这个字段是唯一和同步复制相关的。每次同步复制将会选择一个优先权 —sync_priority—会告诉您选择了那个优先权。
• sync_state: 最后您会看到slave在哪个状态。这个状态可以是async, sync, or potential。当有一个带有较高优先权的同步slave时,PostgreSQL会把slave 标记为 potential。
人们经常说 pg_stat_replication 视图是primary 端的,这是不对的。该视图的作用是揭示有关wal sender 进程的信息。换句话说:如果你正在运行级联复制,该视图意味着在 secondary 复制到其他slaves 的时候, secondary 端的 pg_stat_replication 上的也会显示entries ( 条目 )
从库查询wal日志接收状态:
postgres=# select * from pg_stat_wal_receiver;
-[ RECORD 1 ]---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid | 30
status | streaming
receive_start_lsn | 0/4000000
receive_start_tli | 1
received_lsn | 0/4000060
received_tli | 1
last_msg_send_time | 2021-04-23 14:33:12.462989+08
last_msg_receipt_time | 2021-04-23 14:33:12.463126+08
latest_end_lsn | 0/4000060
latest_end_time | 2021-04-23 14:32:42.441224+08
slot_name |
sender_host | 172.72.6.2
sender_port | 5432
conninfo | user=replhr password=******** dbname=replication host=172.72.6.2 port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any
也可以通过pg_controldata来查询主备库的状态,主要看字段“Database cluster state”的值:
root@lhrpg64302:/# pg_controldata
pg_control version number: 1201
Catalog version number: 201909212
Database system identifier: 6954163543229509670
Database cluster state: in production
pg_control last modified: Fri 23 Apr 2021 09:54:48 AM CST
Latest checkpoint location: 0/5000098
Latest checkpoint's REDO location: 0/5000060
Latest checkpoint's REDO WAL file: 000000010000000000000005
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:488
Latest checkpoint's NextOID: 24576
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 480
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 488
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Fri 23 Apr 2021 09:54:48 AM CST
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: off
max_connections setting: 100
max_worker_processes setting: 8
max_wal_senders setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
Mock authentication nonce: 6006c7a6310b7d3904a3a1cf6b6ca96accce7a702df332c33e1c3fbc210e3623
root@lhrpg64302:/# pg_controldata | grep state
Database cluster state: in production
从库:
root@lhrpg64303:/var/lib/postgresql/data# pg_controldata
pg_control version number: 1201
Catalog version number: 201909212
Database system identifier: 6954163543229509670
Database cluster state: in archive recovery
pg_control last modified: Fri 23 Apr 2021 09:55:28 AM CST
Latest checkpoint location: 0/5000098
Latest checkpoint's REDO location: 0/5000060
Latest checkpoint's REDO WAL file: 000000010000000000000005
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:488
Latest checkpoint's NextOID: 24576
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 480
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 488
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Fri 23 Apr 2021 09:54:48 AM CST
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 0/5000148
Min recovery ending loc's timeline: 1
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: off
max_connections setting: 100
max_worker_processes setting: 8
max_wal_senders setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
Mock authentication nonce: 6006c7a6310b7d3904a3a1cf6b6ca96accce7a702df332c33e1c3fbc210e3623
root@lhrpg64303:/var/lib/postgresql/data# pg_controldata | grep state
Database cluster state: in archive recovery
select pg_is_in_recovery();--主库是f代表false ;备库是t,代表true
-- 主库新增表
postgres=# \c sbtest
psql (13.2, server 12.2 (Debian 12.2-2.pgdg100+1))
You are now connected to database "sbtest" as user "postgres".
sbtest=# select count(*) from sbtest1;
count | 10
sbtest=# select count(*) from sbtest11;
ERROR: relation "sbtest11" does not exist
LINE 1: select count(*) from sbtest11;
^
sbtest=# create table sbtest11 as select * from sbtest1;
SELECT 10
sbtest=# select count(*) from sbtest11;
count | 10
sbtest=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
-- 从库查询
sbtest=# select count(*) from sbtest11;
count
-------
10
(1 row)
sbtest=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
sbtest=# create database lhrdb;
ERROR: cannot execute CREATE DATABASE in a read-only transaction
-- 主库插入
sbtest=# insert into sbtest11 select * from sbtest11;
INSERT 0 10
sbtest=#
sbtest=# select count(*) from sbtest11;
count
-------
20
(1 row)
-- 从库查询
sbtest=# select count(*) from sbtest11;
count
-------
20
(1 row)
可以看到,主从是同步的。
由于安装和配置比较繁琐,我们本文主要着重使用,所以,这里省略安装和配置过程,具体可以参考:【DB宝61】PostgreSQL使用Pgpool-II实现读写分离+负载均衡
我们可以直接使用麦老师配置好的镜像,该镜像集成了pgpool-II和pgpoolAdmin工具,如下所示:
docker rm -f lhrpgpool
docker run -d --name lhrpgpool -h lhrpgpool \
--net=pg-network --ip 172.72.6.66 \
-p 19999:9999 -p 19898:9898 -p 180:80 -p 13389:3389 \
-v /sys/fs/cgroup:/sys/fs/cgroup \
--privileged=true lhrbest/lhrpgpool:4.2.2 \
/usr/sbin/init
docker network connect bridge lhrpgpool
docker restart lhrpgpool
docker exec -it lhrpgpool bash
-- lhrbest/lhrpgpool:4.2.2 只安装了pgpool
-- lhrbest/lhrpgpool:4.2.2_01 安装了pgpool并做了配置
-- lhrbest/lhrpgpool:4.2.2_02 安装了pgpool和pgpoolAdmin并做了配置
启动容器后,我们就拥有了pgpool-II和pgpoolAdmin工具。
[root@lhrpgpool /]# systemctl status pgpool
● pgpool.service - Pgpool-II
Loaded: loaded (/usr/lib/systemd/system/pgpool.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2021-06-24 15:06:22 CST; 9min ago
Main PID: 181 (pgpool)
CGroup: /docker/d45b808bb68bd3fa21db1881b6fc82a1ae194abc88fcbb6b8667875d9257ac54/system.slice/pgpool.service
├─181 /postgresql/pgpool/bin/pgpool -f /postgresql/pgpool/etc/pgpool.conf -n
├─218 pgpool: wait for connection request
├─219 pgpool: wait for connection request
├─220 pgpool: wait for connection request
├─221 pgpool: wait for connection request
├─222 pgpool: wait for connection request
├─223 pgpool: wait for connection request
├─224 pgpool: wait for connection request
├─225 pgpool: wait for connection request
├─226 pgpool: wait for connection request
├─227 pgpool: wait for connection request
├─229 pgpool: wait for connection request
├─230 pgpool: wait for connection request
├─231 pgpool: wait for connection request
├─235 pgpool: wait for connection request
├─236 pgpool: wait for connection request
├─237 pgpool: wait for connection request
├─239 pgpool: wait for connection request
├─240 pgpool: wait for connection request
├─241 pgpool: wait for connection request
├─243 pgpool: wait for connection request
├─245 pgpool: wait for connection request
├─246 pgpool: wait for connection request
├─249 pgpool: wait for connection request
├─251 pgpool: PCP: wait for connection request
├─252 pgpool: worker process
├─767 pgpool: wait for connection request
├─784 pgpool: wait for connection request
├─829 pgpool: wait for connection request
├─840 pgpool: wait for connection request
├─841 pgpool: wait for connection request
├─845 pgpool: wait for connection request
├─846 pgpool: wait for connection request
├─915 pgpool: wait for connection request
└─916 pgpool: wait for connection request
Jun 24 15:15:16 lhrpgpool pgpool[251]: [150-1] 2021-06-24 15:15:16: pid 251: LOG: PCP process with pid: 911 exits with status 0
Jun 24 15:15:16 lhrpgpool pgpool[251]: [151-1] 2021-06-24 15:15:16: pid 251: LOG: forked new pcp worker, pid=913 socket=8
Jun 24 15:15:16 lhrpgpool pgpool[251]: [152-1] 2021-06-24 15:15:16: pid 251: LOG: PCP process with pid: 913 exit with SUCCESS.
Jun 24 15:15:16 lhrpgpool pgpool[251]: [153-1] 2021-06-24 15:15:16: pid 251: LOG: PCP process with pid: 913 exits with status 0
Jun 24 15:15:16 lhrpgpool pgpool[774]: [20-1] 2021-06-24 15:15:16: pid 774: FATAL: pgpool is not accepting any new connections
Jun 24 15:15:16 lhrpgpool pgpool[181]: [49-1] 2021-06-24 15:15:16: pid 181: LOG: child process with pid: 774 exits with status 256
Jun 24 15:15:16 lhrpgpool pgpool[181]: [50-1] 2021-06-24 15:15:16: pid 181: LOG: fork a new child process with pid: 915
Jun 24 15:15:16 lhrpgpool pgpool[773]: [18-1] 2021-06-24 15:15:16: pid 773: FATAL: pgpool is not accepting any new connections
Jun 24 15:15:16 lhrpgpool pgpool[181]: [51-1] 2021-06-24 15:15:16: pid 181: LOG: child process with pid: 773 exits with status 256
Jun 24 15:15:16 lhrpgpool pgpool[181]: [52-1] 2021-06-24 15:15:16: pid 181: LOG: fork a new child process with pid: 916
vi /postgresql/pgpool/etc/pgpool.conf
# - Backend Connection Settings -
backend_hostname0 = '172.72.6.2'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/postgresql/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_application_name0 = 'lhrpg64302'
backend_hostname1 = '172.72.6.3'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/postgresql/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_application_name1 = 'lhrpg64303'
backend_hostname2 = '172.72.6.4'
backend_port2 = 5432
backend_weight2 = 1
backend_data_directory2 = '/var/lib/postgresql/data'
backend_flag2 = 'ALLOW_TO_FAILOVER'
backend_application_name2 = 'lhrpg64304'
👉 注意: 写只会发送到172.72.6.2,而读会均衡的发送到172.72.6.2、172.72.6.3和172.72.6.4上。 如果想读只发送到172.72.6.3和172.72.6.4上,那么需要修改172.72.6.2的权重backend_weight0 = 0
配置完成后,重新启动pgpool服务:
systemctl restart pgpool
[root@lhrpgpool /]# systemctl enable httpd
Created symlink from /etc/systemd/system/multi-user.target.wants/httpd.service to /usr/lib/systemd/system/httpd.service.
[root@lhrpgpool /]#
[root@lhrpgpool /]# systemctl start httpd
[root@lhrpgpool /]# systemctl status httpd
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2021-06-24 15:11:58 CST; 10s ago
Docs: man:httpd(8)
man:apachectl(8)
Main PID: 602 (httpd)
Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec"
CGroup: /docker/d45b808bb68bd3fa21db1881b6fc82a1ae194abc88fcbb6b8667875d9257ac54/system.slice/httpd.service
├─602 /usr/sbin/httpd -DFOREGROUND
├─603 /usr/sbin/httpd -DFOREGROUND
├─604 /usr/sbin/httpd -DFOREGROUND
├─605 /usr/sbin/httpd -DFOREGROUND
├─606 /usr/sbin/httpd -DFOREGROUND
└─607 /usr/sbin/httpd -DFOREGROUND
Jun 24 15:11:58 lhrpgpool systemd[1]: Starting The Apache HTTP Server...
Jun 24 15:11:58 lhrpgpool httpd[602]: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.2. Set the 'ServerName' directive globally to suppress this message
Jun 24 15:11:58 lhrpgpool systemd[1]: Started The Apache HTTP Server.
Web登录地址:
用户名和密码:pgpooladmin/lhr
若要容器内登陆,则需要登陆到Linux的桌面中,我们通过远程桌面登录:
image-20210624160349941
image-20210624160420852
用户名和密码:root/lhr
image-20210624160513763
打开Firefox浏览器,输入如下地址:http://127.0.0.1/admin/login.php
image-20210624160605074
也可以登陆。
登录后可以看到的界面:
每个参数有详细的说明,也可以参考:
我们先关闭内存查询缓存功能,一会再来开启这个功能。
vi /postgresql/pgpool/etc/pgpool.conf
memory_cache_enabled = off
配置完成后,重新启动pgpool服务:
systemctl restart pgpool
测试过程:
-- 开3个窗口,密码为lhr
psql -U postgres -h 192.168.66.35 -p 19999 -d sbtest
create table test(id int);
insert into test values(1);
select * from test;
show pool_nodes;
show pool_backend_stats;
-- 获取pgPool_II 版本信息 show pool_version; -- 查看配置文件 show pool_status; -- 获取节点信息 show pool_nodes; -- 获取pgPool-II 进程信息 show pool_processes; -- 获取pgPool-II 所有的连接池信息 show pool_pools; -- 命令执行统计信息 show pool_backend_stats; -- 健康统计信息 show pool_health_check_stats; -- 缓存统计信息 show pool_cache;
C:\Users\lhrxxt>psql -U postgres -h 192.168.66.35 -p 19999 -d sbtest
Password for user postgres:
psql (13.3, server 12.2 (Debian 12.2-2.pgdg100+1))
Type "help" for help.
sbtest=#
sbtest=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
0 | 172.72.6.2 | 5432 | up | 0.333333 | primary | 0 | false | 0 | | | 2021-06-24 15:38:35
1 | 172.72.6.3 | 5432 | up | 0.333333 | standby | 0 | false | 0 | | | 2021-06-24 15:38:35
2 | 172.72.6.4 | 5432 | up | 0.333333 | standby | 0 | true | 0 | | | 2021-06-24 15:38:35
(3 rows)
sbtest=# show pool_backend_stats;
node_id | hostname | port | status | role | select_cnt | insert_cnt | update_cnt | delete_cnt | ddl_cnt | other_cnt | panic_cnt | fatal_cnt | error_cnt
---------+------------+------+--------+---------+------------+------------+------------+------------+---------+-----------+-----------+-----------+-----------
0 | 172.72.6.2 | 5432 | up | primary | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1 | 172.72.6.3 | 5432 | up | standby | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
2 | 172.72.6.4 | 5432 | up | standby | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
(3 rows)
sbtest=#
sbtest=# create table test(id int);
CREATE TABLE
sbtest=# insert into test values(1);
INSERT 0 1
sbtest=# select * from test;
id
----
1
(1 row)
sbtest=# show pool_backend_stats;
node_id | hostname | port | status | role | select_cnt | insert_cnt | update_cnt | delete_cnt | ddl_cnt | other_cnt | panic_cnt | fatal_cnt | error_cnt
---------+------------+------+--------+---------+------------+------------+------------+------------+---------+-----------+-----------+-----------+-----------
0 | 172.72.6.2 | 5432 | up | primary | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0
1 | 172.72.6.3 | 5432 | up | standby | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
2 | 172.72.6.4 | 5432 | up | standby | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
(3 rows)
sbtest=# select * from test;
id
----
1
(1 row)
sbtest=# select * from test;
id
----
1
(1 row)
sbtest=# show pool_backend_stats;
node_id | hostname | port | status | role | select_cnt | insert_cnt | update_cnt | delete_cnt | ddl_cnt | other_cnt | panic_cnt | fatal_cnt | error_cnt
---------+------------+------+--------+---------+------------+------------+------------+------------+---------+-----------+-----------+-----------+-----------
0 | 172.72.6.2 | 5432 | up | primary | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0
1 | 172.72.6.3 | 5432 | up | standby | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
2 | 172.72.6.4 | 5432 | up | standby | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
(3 rows)
使用shell来测试负载均衡:
[pgsql@lhrpgpool ~]$ for i in $(seq 1 20); do psql -U postgres -h 127.0.0.1 -p 9999 -d sbtest -c 'SELECT inet_server_addr()'; done | egrep '172.'
172.72.6.4
172.72.6.2
172.72.6.4
172.72.6.4
172.72.6.4
172.72.6.3
172.72.6.3
172.72.6.3
172.72.6.4
172.72.6.3
172.72.6.2
172.72.6.4
172.72.6.3
172.72.6.4
172.72.6.4
172.72.6.2
172.72.6.3
172.72.6.4
172.72.6.2
172.72.6.2
sbtest=# show pool_backend_stats;
node_id | hostname | port | status | role | select_cnt | insert_cnt | update_cnt | delete_cnt | ddl_cnt | other_cnt | panic_cnt | fatal_cnt | error_cnt
---------+------------+------+--------+---------+------------+------------+------------+------------+---------+-----------+-----------+-----------+-----------
0 | 172.72.6.2 | 5432 | up | primary | 23 | 1 | 0 | 0 | 1 | 61 | 0 | 0 | 0
1 | 172.72.6.3 | 5432 | up | standby | 18 | 0 | 0 | 0 | 0 | 18 | 0 | 0 | 0
2 | 172.72.6.4 | 5432 | up | standby | 28 | 0 | 0 | 0 | 0 | 26 | 0 | 0 | 1
(3 rows)
可以看到,172.72.6.2、172.72.6.3和172.72.6.4服务器是均衡的读的。
下面将参数文件/postgresql/pgpool/etc/pgpool.conf中的backend_weight0修改为0,再重新启动pgpool后进行测试:
[pgsql@lhrpgpool ~]$ for i in $(seq 1 50); do psql -U postgres -h 127.0.0.1 -p 9999 -d sbtest -c 'SELECT inet_server_addr()'; done | egrep '172.'
172.72.6.3
172.72.6.3
172.72.6.4
..............
172.72.6.3
172.72.6.3
172.72.6.3
172.72.6.4
172.72.6.4
172.72.6.4
sbtest=# show pool_backend_stats;
node_id | hostname | port | status | role | select_cnt | insert_cnt | update_cnt | delete_cnt | ddl_cnt | other_cnt | panic_cnt | fatal_cnt | error_cnt
---------+------------+------+--------+---------+------------+------------+------------+------------+---------+-----------+-----------+-----------+-----------
0 | 172.72.6.2 | 5432 | up | primary | 0 | 0 | 0 | 0 | 0 | 50 | 0 | 0 | 0
1 | 172.72.6.3 | 5432 | up | standby | 30 | 0 | 0 | 0 | 0 | 30 | 0 | 0 | 0
2 | 172.72.6.4 | 5432 | up | standby | 20 | 0 | 0 | 0 | 0 | 20 | 0 | 0 | 0
(3 rows)
可以看到,所有的读全部发送到了172.72.6.3和172.72.6.4上了。
https://www.pgpool.net/docs/latest/en/html/runtime-in-memory-query-cache.html
https://www.pgpool.net/docs/pgpool-II-3.2.5/pgpool-zh_cn.html#memqcache
你可以在任何模式中使用基于内存的查询缓存。它不同于以上的查询缓存,因为基于内存的查询缓存会快很多,因为缓存存储于内存中。另外,如果缓存事小了,你不需要重启 pgpool-II 因为相关的表已经得到更新了。
基于内存的缓存保存 SELECT 语句(以及它绑定的参数,如果 SELECT 是一个扩展的查询)以及对应的数据。如果是相同的 SELECT 语句,则直接返回缓存的值。因为不再有 SQL 分析或者到 PostgreSQL 的调用,实际上它会非常快。
其他方面,它会比较慢,因为它增加了一些负载用于缓存。另外,当一个表被更新,pgpool 自动删除相关的表的缓存。因此,在有很多更新的系统中,性能会降低。如果 cache_hit_ratio 低于 70%(可以查询show pool_cache;),建议你关闭基于内存的缓存。
修改参数memory_cache_enabled=on 就会开启查询缓存功能。另外,需要设置memqcache_oiddir路径,创建目录/var/log/pgpool/oiddir,并赋权。
vi /postgresql/pgpool/etc/pgpool.conf
memory_cache_enabled = on
mkdir -p /var/log/pgpool/oiddir
chown -R pgsql.pgsql /var/log/pgpool/oiddir
配置完成后,重新启动pgpool服务:
systemctl restart pgpool
接下来,测试缓存功能:
sbtest=# show pool_backend_stats;
node_id | hostname | port | status | role | select_cnt | insert_cnt | update_cnt | delete_cnt | ddl_cnt | other_cnt | panic_cnt | fatal_cnt | error_cnt
---------+------------+------+--------+---------+------------+------------+------------+------------+---------+-----------+-----------+-----------+-----------
0 | 172.72.6.2 | 5432 | up | primary | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1 | 172.72.6.3 | 5432 | up | standby | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
2 | 172.72.6.4 | 5432 | up | standby | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
(3 rows)
[pgsql@lhrpgpool ~]$ for i in $(seq 1 50); do psql -U postgres -h 127.0.0.1 -p 9999 -d sbtest -c 'SELECT * from test'; done | egrep 'row' | wc -l
50
sbtest=# show pool_backend_stats;
node_id | hostname | port | status | role | select_cnt | insert_cnt | update_cnt | delete_cnt | ddl_cnt | other_cnt | panic_cnt | fatal_cnt | error_cnt
---------+------------+------+--------+---------+------------+------------+------------+------------+---------+-----------+-----------+-----------+-----------
0 | 172.72.6.2 | 5432 | up | primary | 0 | 0 | 0 | 0 | 0 | 50 | 0 | 0 | 0
1 | 172.72.6.3 | 5432 | up | standby | 0 | 0 | 0 | 0 | 0 | 22 | 0 | 0 | 0
2 | 172.72.6.4 | 5432 | up | standby | 1 | 0 | 0 | 0 | 0 | 28 | 0 | 0 | 0
(3 rows)
sbtest=# delete from test;
DELETE 1
sbtest=# select * from test;
id
----
(0 rows)
sbtest=# show pool_health_check_stats;
node_id | hostname | port | status | role | last_status_change | total_count | success_count | fail_count | skip_count | retry_count | average_retry_count | max_retry_count | max_duration | min_duration | average_duration | last_health_check | last_successful_health_check | last_skip_health_check | last_failed_health_check
---------+------------+------+--------+---------+---------------------+-------------+---------------+------------+------------+-------------+---------------------+-----------------+--------------+--------------+------------------+---------------------+------------------------------+------------------------+--------------------------
0 | 172.72.6.2 | 5432 | up | primary | 2021-06-24 16:20:08 | 57 | 57 | 0 | 0 | 0 | 0.000000 | 0 | 8 | 7 | 7.035088 | 2021-06-24 16:29:29 | 2021-06-24 16:29:29 | |
1 | 172.72.6.3 | 5432 | up | standby | 2021-06-24 16:20:08 | 57 | 57 | 0 | 0 | 0 | 0.000000 | 0 | 7 | 7 | 7.000000 | 2021-06-24 16:29:29 | 2021-06-24 16:29:29 | |
2 | 172.72.6.4 | 5432 | up | standby | 2021-06-24 16:20:08 | 57 | 57 | 0 | 0 | 0 | 0.000000 | 0 | 8 | 7 | 7.017544 | 2021-06-24 16:29:29 | 2021-06-24 16:29:29 | |
(3 rows)
可以看到,我执行了50次的SELECT * from test
,但其实记录的只有1条(select_cnt),其实数据是从缓存中拿到的。
本文结束。