前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
圈层
工具
发布
首页
学习
活动
专区
圈层
工具
MCP广场
社区首页 >专栏 >使用pgCompare比对不同pg的数据差异

使用pgCompare比对不同pg的数据差异

原创
作者头像
保持热爱奔赴山海
发布于 2024-06-28 02:55:34
发布于 2024-06-28 02:55:34
5190
举报
文章被收录于专栏:数据库相关数据库相关

官方项目地址: https://github.com/CrunchyData/pgCompare

限制

代码语言:txt
AI代码解释
复制
    日期/时间戳仅与秒精度(DDMMYYYYYYHH24MISS)进行比较。
    不支持的数据类型:blob、long、longraw、byta。
    执行跨平台比较时数据类型布尔值的限制。
    待比较的表必须有主键(没有主键会在比对的时候被自动跳过,日志中提示 Table xx has no Primary Key, skipping reconciliation)
    如果target的行比source的多,则不会被报告出来

前置条件

代码语言:txt
AI代码解释
复制
    Java version 21 or higher.
    Maven 3.9 or higher.
    Postgres version 15 or higher (to use for the pgCompare Data Compare repository).
    Necessary JDBC drivers (Postgres and Oracle currently supported).

准备2个pg实例

代码语言:txt
AI代码解释
复制
我这里用的是单机版的pg15和pg16
pg15作为source端, pg16作为target端。
create database db1;
\c db1;
然后在db1里创建一些表,并写入测试数据。

编译安装

代码语言:txt
AI代码解释
复制
cd /root/pgCompare/pgCompare-v.0.2.0

export JAVA_HOME=/usr/local/software/jdk-22.0.1
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
mvn --version    
Apache Maven 3.9.1 (2e178502fcdbffc201671fb2537d0cb4b4cc58f8)
Maven home: /usr/local/software/maven
Java version: 22.0.1, vendor: Oracle Corporation, runtime: /home/software/jdk-22.0.1
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "6.8.9-1.el7.elrepo.x86_64", arch: "amd64", family: "unix"


mvn clean install

准备配置文件

代码语言:txt
AI代码解释
复制
cd target
cp ../pgcompare.properties.sample /root/pgcompare.properties

修改后的配置文件

代码语言:txt
AI代码解释
复制
cat pgcompare.properties | egrep -v '^#'

batch-fetch-size = 2000  # 设置从源或目标数据库检索行的获取大小
batch-commit-size = 2000  # 提交大小控制并发插入到 dc_source/dc_target 暂存表中的数组大小和行数
batch-progress-report-size = 1000000  # 定义 mod 中用于报告进度的行数
loader-threads = 2  # 设置将数据加载到临时表中的线程数。默认值为 4。设置为 0 可禁用加载器线程
message-queue-size = 100  # 加载线程使用的消息队列的大小(nbr 个消息)。默认值为 100
number-cast: notation     # 定义哈希函数中数字的转换方式(符号|标准)。默认为符号(科学计数法)
observer-throttle = true   # 设置为 true 或 false,指示加载器线程暂停并等待观察器线程赶上来,然后再继续将更多数据加载到暂存表中。
observer-throttle-size = 2000000  # 加载器线程休眠并等待观察器线程清除之前加载的行数
observer-vacuum = true  # 设置为 true 或 false,指示观察者是否在检查点期间对暂存表执行真空清理
log-destination = stdout  # 设置临时暂存表的并行工作器数量。默认值为 0

log-level = INFO  # 测试期间,建议改为 DEBUG
database-sort = true

repo-host=192.168.31.181
repo-port=5432
repo-dbname=pgcompare
repo-user=dts
repo-password=dts
repo-schema=pgcompare
repo-sslmode: disable

source-name=pg15
source-type=postgres
source-host=192.168.31.181
source-port=5432
source-dbname=db1
source-user=dts
source-password=dts
source-database-hash=true
source-sslmode=disable

target-name=pg16
target-type=postgres
target-host=192.168.31.181
target-port=5436
target-dbname=db1
target-user=dts
target-password=dts
target-database-hash=true
target-sslmode=disable

在repo-host的pg上创建存储库

代码语言:txt
AI代码解释
复制
create database pgcompare;

调整下pgcompare的repo-user的隔离级别

代码语言:txt
AI代码解释
复制
alter user dts set transaction_isolation='read committed';

TIPS:如果使用默认的RR隔离级别,在执行后续的 java -jar pgcompare.jar --batch=0 会报如下的错误
[2024-06-28 09:32:20] [SEVERE ] [dbCommon ] Error executing simple update (UPDATE dc_result SET source_cnt=source_cnt+? WHERE cid=?):  ERROR: could not serialize access due to concurrent update 

初始化存储库

代码语言:txt
AI代码解释
复制
java -jar pgcompare.jar --init

[2024-06-27 19:18:55] [INFO   ] [main                    ] Starting - rid: 1719487134959 
[2024-06-27 19:18:55] [INFO   ] [main                    ] Version: 0.2.0 
[2024-06-27 19:18:55] [INFO   ] [main                    ] Batch Number: 0 
[2024-06-27 19:18:55] [INFO   ] [main                    ] Recheck Out of Sync: false 
[2024-06-27 19:18:55] [INFO   ] [main                    ] Parameters:  
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-port=5436 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   observer-vacuum=true 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-port=5432 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-user=dts 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-type=postgres 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   batch-commit-size=2000 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   stage-table-parallel=0 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-name=pg15 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   message-queue-size=100 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-sslmode=disable 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-dbname=db1 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-sslmode=disable 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   batch-fetch-size=2000 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-type=postgres 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   database-sort=true 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-dbname=pgcompare 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-host=192.168.31.181 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-port=5432 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   loader-threads=2 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-schema=pgcompare 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   config-file=pgcompare.properties 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-user=dts 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   batch-progress-report-size=1000000 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-dbname=db1 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   observer-throttle-size=2000000 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   observer-throttle=true 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-database-hash=true 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   number-cast=notation 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-host=192.168.31.181 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-database-hash=true 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   log-level=INFO 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   log-destination=stdout 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-user=dts 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-sslmode=disable 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-name=pg16 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-host=192.168.31.181 
[2024-06-27 19:18:55] [INFO   ] [main                    ] Connecting to repository database 

查看存储库

代码语言:txt
AI代码解释
复制
pgcompare=# \dn
		List of schemas
   Name    |       Owner       
-----------+-------------------
 pgcompare | dts
 public    | pg_database_owner
(2 rows)

pgcompare=# set search_path ='pgcompare';
SET
pgcompare=# \dt
			  List of relations
  Schema   |       Name       | Type  | Owner 
-----------+------------------+-------+-------
 pgcompare | dc_object        | table | dts
 pgcompare | dc_result        | table | dts
 pgcompare | dc_source        | table | dts
 pgcompare | dc_table         | table | dts
 pgcompare | dc_table_history | table | dts
 pgcompare | dc_target        | table | dts
(6 rows)

自动tables注册

代码语言:txt
AI代码解释
复制
java -jar pgcompare.jar --discovery public  # 注意这里public指的是schema

[2024-06-27 19:21:54] [INFO   ] [main                    ] Starting - rid: 1719487314311 
[2024-06-27 19:21:54] [INFO   ] [main                    ] Version: 0.2.0 
[2024-06-27 19:21:54] [INFO   ] [main                    ] Batch Number: 0 
[2024-06-27 19:21:54] [INFO   ] [main                    ] Recheck Out of Sync: false 
[2024-06-27 19:21:54] [INFO   ] [main                    ] Parameters:  
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-port=5436 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   observer-vacuum=true 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-port=5432 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-user=dts 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-type=postgres 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   batch-commit-size=2000 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   stage-table-parallel=0 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-name=pg15 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   message-queue-size=100 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-sslmode=disable 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-dbname=db1 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-sslmode=disable 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   batch-fetch-size=2000 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-type=postgres 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   database-sort=true 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-dbname=pgcompare 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-host=192.168.31.181 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-port=5432 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   loader-threads=2 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-schema=pgcompare 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   config-file=pgcompare.properties 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-user=dts 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   batch-progress-report-size=1000000 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-dbname=db1 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   observer-throttle-size=2000000 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   observer-throttle=true 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-database-hash=true 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   number-cast=notation 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-host=192.168.31.181 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-database-hash=true 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   log-level=INFO 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   log-destination=stdout 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-user=dts 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-sslmode=disable 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-name=pg16 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-host=192.168.31.181 
[2024-06-27 19:21:54] [INFO   ] [main                    ] Connecting to repository database 
[2024-06-27 19:21:55] [INFO   ] [main                    ] Connecting to source database 
[2024-06-27 19:21:55] [INFO   ] [main                    ] Connecting to target database 
[2024-06-27 19:21:55] [INFO   ] [main                    ] Performaning table discovery for schema: db1 
[2024-06-27 19:21:55] [INFO   ] [main                    ] Shutting down 

查看pg存储库,可以看到已经采集到表信息了

(它会把source和target的表都采集进来)

代码语言:txt
AI代码解释
复制
pgcompare=# select * from dc_table;
 tid | source_schema | source_table | target_schema | target_table | batch_nbr | parallel_degree | mod_column | status | table_filter | column_map 
-----+---------------+--------------+---------------+--------------+-----------+-----------------+------------+--------+--------------+------------
  19 | public        | t1           | public        | t1           |         1 |               1 | NULL       | ready  | NULL         | NULL
  20 | public        | t2           | public        | t2           |         1 |               1 | NULL       | ready  | NULL         | NULL
  21 | public        | t3           | public        | t3           |         1 |               1 | NULL       | ready  | NULL         | NULL
  22 | public        | t5           | public        | t5           |         1 |               1 | NULL       | ready  | NULL         | NULL
  23 | public        | tb           | public        | tb           |         1 |               1 | NULL       | ready  | NULL         | NULL
  24 | public        | ttt          | public        | ttt          |         1 |               1 | NULL       | ready  | NULL         | NULL
(6 rows)

执行数据比较

代码语言:txt
AI代码解释
复制
java -jar pgcompare.jar --batch=0   # 批次号的默认值为 0表示所有批次
	
[2024-06-28 10:14:01] [INFO   ] [main                    ] Starting - rid: 1719540841891 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Version: 0.2.0 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Batch Number: 0 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Recheck Out of Sync: false 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Parameters:  
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-port=5436 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   observer-vacuum=true 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-port=5432 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-user=dts 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-type=postgres 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   batch-commit-size=2000 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   stage-table-parallel=0 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-name=pg15 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   message-queue-size=100 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-sslmode=disable 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-dbname=db1 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-sslmode=disable 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   batch-fetch-size=2000 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-type=postgres 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   database-sort=true 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-dbname=pgcompare 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-host=192.168.31.181 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-port=5432 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   loader-threads=2 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-schema=pgcompare 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   config-file=pgcompare.properties 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-user=dts 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   batch-progress-report-size=1000000 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-dbname=db1 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   observer-throttle-size=2000000 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   observer-throttle=true 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-database-hash=true 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   number-cast=notation 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-host=192.168.31.181 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-database-hash=true 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   log-level=DEBUG 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   log-destination=stdout 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-user=dts 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-sslmode=disable 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-name=pg16 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-host=192.168.31.181 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Connecting to source database 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Connecting to target database 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Start reconciliation 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Clearing data compare findings 
[2024-06-28 10:14:02] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t1 
[2024-06-28 10:14:02] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t1 
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Source Columns:  
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Target Columns:  
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Source Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(b,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"b": ' || b||'}' pk, md5(concat_ws('','0')) FROM public.t1 WHERE 1=1  
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Target Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(b,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"b": ' || b||'}' pk, md5(concat_ws('','0')) FROM public.t1 WHERE 1=1  
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Starting compare hash threads 
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Creating data compare staging tables 
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Starting compare thread 0 
[2024-06-28 10:14:02] [INFO   ] [Observer-c23-t0         ] Starting reconcile observer 
[2024-06-28 10:14:02] [INFO   ] [Observer-c23-t0         ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-source-c23-t0 ] Start database reconcile thread 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-target-c23-t0 ] Start database reconcile thread 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-source-c23-t0 ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-target-c23-t0 ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [loader-source-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:02] [INFO   ] [loader-target-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:02] [INFO   ] [loader-source-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:02] [INFO   ] [loader-source-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [loader-target-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:02] [INFO   ] [loader-target-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [loader-target-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [loader-source-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-source-c23-t0 ] Connecting to source database 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-target-c23-t0 ] Connecting to target database 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-target-c23-t0 ] Complete. Total rows loaded: 9 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-source-c23-t0 ] Complete. Total rows loaded: 6 
[2024-06-28 10:14:03] [INFO   ] [loader-source-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:03] [INFO   ] [loader-target-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:03] [INFO   ] [loader-target-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:03] [INFO   ] [loader-source-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:04] [INFO   ] [ReconcileController     ] Waiting for hash threads to complete 
[2024-06-28 10:14:04] [INFO   ] [ReconcileController     ] Waiting for reconcile threads to complete 
[2024-06-28 10:14:04] [INFO   ] [Observer-c23-t0         ] Matched 6 rows 
[2024-06-28 10:14:08] [INFO   ] [Observer-c23-t0         ] Staging table cleanup 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Analyzing: Step 1 of 3 - Missing on Source 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Analyzing: Step 2 of 3 - Missing on Target 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Analyzing: Step 3 of 3 - Not Equal 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Reconciliation Complete: Table = t1; Status = out-of-sync ; Equal = 6; Not Equal = 0; Missing Source = 3; Missing Target = 0 
[2024-06-28 10:14:08] [INFO   ] [main                    ] Start reconciliation 
[2024-06-28 10:14:08] [INFO   ] [main                    ] Clearing data compare findings 
[2024-06-28 10:14:08] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t2 
[2024-06-28 10:14:08] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t2 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Source Columns:  
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Target Columns:  
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Source Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(b,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"b": ' || b||'}' pk, md5(concat_ws('','0')) FROM public.t2 WHERE 1=1  
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Target Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(b,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"b": ' || b||'}' pk, md5(concat_ws('','0')) FROM public.t2 WHERE 1=1  
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Starting compare hash threads 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Creating data compare staging tables 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Starting compare thread 0 
[2024-06-28 10:14:08] [INFO   ] [Observer-c24-t0         ] Starting reconcile observer 
[2024-06-28 10:14:08] [INFO   ] [loader-source-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-target-c24-t0 ] Start database reconcile thread 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-source-c24-t0 ] Start database reconcile thread 
[2024-06-28 10:14:08] [INFO   ] [Observer-c24-t0         ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [loader-target-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:08] [INFO   ] [loader-target-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [loader-source-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [loader-source-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:08] [INFO   ] [loader-target-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:08] [INFO   ] [loader-target-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-target-c24-t0 ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-source-c24-t0 ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [loader-source-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-target-c24-t0 ] Connecting to target database 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-source-c24-t0 ] Connecting to source database 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-target-c24-t0 ] Complete. Total rows loaded: 2 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-source-c24-t0 ] Complete. Total rows loaded: 3 
[2024-06-28 10:14:09] [INFO   ] [loader-target-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:09] [INFO   ] [loader-source-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:09] [INFO   ] [loader-target-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:09] [INFO   ] [loader-source-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:10] [INFO   ] [ReconcileController     ] Waiting for hash threads to complete 
[2024-06-28 10:14:10] [INFO   ] [ReconcileController     ] Waiting for reconcile threads to complete 
[2024-06-28 10:14:10] [INFO   ] [Observer-c24-t0         ] Matched 2 rows 
[2024-06-28 10:14:14] [INFO   ] [Observer-c24-t0         ] Staging table cleanup 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Analyzing: Step 1 of 3 - Missing on Source 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Analyzing: Step 2 of 3 - Missing on Target 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Analyzing: Step 3 of 3 - Not Equal 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Reconciliation Complete: Table = t2 ; Status = out-of-sync ; Equal = 2; Not Equal = 0; Missing Source = 0; Missing Target = 1 
[2024-06-28 10:14:14] [INFO   ] [main                    ] Start reconciliation 
[2024-06-28 10:14:14] [INFO   ] [main                    ] Clearing data compare findings 
[2024-06-28 10:14:14] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t3 
[2024-06-28 10:14:14] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t3 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Source Columns: address,remark 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Target Columns: address,remark 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Source Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(id,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"id": ' || id||'}' pk, md5(concat_ws('',coalesce(address::text,' ')||coalesce(remark::text,' '))) FROM public.t3 WHERE 1=1  
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Target Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(id,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"id": ' || id||'}' pk, md5(concat_ws('',coalesce(address::text,' ')||coalesce(remark::text,' '))) FROM public.t3 WHERE 1=1  
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Starting compare hash threads 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Creating data compare staging tables 
[2024-06-28 10:14:15] [INFO   ] [ReconcileController     ] Starting compare thread 0 
[2024-06-28 10:14:15] [INFO   ] [Observer-c25-t0         ] Starting reconcile observer 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-target-c25-t0 ] Start database reconcile thread 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-source-c25-t0 ] Start database reconcile thread 
[2024-06-28 10:14:15] [INFO   ] [loader-source-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:15] [INFO   ] [Observer-c25-t0         ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [loader-target-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:15] [INFO   ] [loader-target-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [loader-source-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:15] [INFO   ] [loader-source-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-source-c25-t0 ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [loader-target-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:15] [INFO   ] [loader-source-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-target-c25-t0 ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [loader-target-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-target-c25-t0 ] Connecting to target database 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-source-c25-t0 ] Connecting to source database 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-target-c25-t0 ] Complete. Total rows loaded: 6 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-source-c25-t0 ] Complete. Total rows loaded: 5 
[2024-06-28 10:14:16] [INFO   ] [loader-target-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:16] [INFO   ] [loader-source-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:16] [INFO   ] [loader-source-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:16] [INFO   ] [loader-target-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:17] [INFO   ] [ReconcileController     ] Waiting for hash threads to complete 
[2024-06-28 10:14:17] [INFO   ] [ReconcileController     ] Waiting for reconcile threads to complete 
[2024-06-28 10:14:17] [INFO   ] [Observer-c25-t0         ] Matched 4 rows 
[2024-06-28 10:14:21] [INFO   ] [Observer-c25-t0         ] Staging table cleanup 
[2024-06-28 10:14:21] [INFO   ] [ReconcileController     ] Analyzing: Step 1 of 3 - Missing on Source 
[2024-06-28 10:14:21] [INFO   ] [ReconcileController     ] Analyzing: Step 2 of 3 - Missing on Target 
[2024-06-28 10:14:21] [INFO   ] [ReconcileController     ] Analyzing: Step 3 of 3 - Not Equal 
[2024-06-28 10:14:21] [INFO   ] [ReconcileController     ] Reconciliation Complete: Table = t3 ; Status = out-of-sync ; Equal = 4; Not Equal = 0; Missing Source = 2; Missing Target = 1 
[2024-06-28 10:14:21] [INFO   ] [main                    ] Processed 3 tables 
[2024-06-28 10:14:21] [INFO   ] [main                    ] Table Summary: Table = t1 ; Status = out-of-sync ; Equal = 6; Not Equal = 0; Missing Source = 3; Missing Target = 0 
[2024-06-28 10:14:21] [INFO   ] [main                    ] Table Summary: Table = t2 ; Status = out-of-sync ; Equal = 2; Not Equal = 0; Missing Source = 0; Missing Target = 1 
[2024-06-28 10:14:21] [INFO   ] [main                    ] Table Summary: Table = t3 ; Status = out-of-sync ; Equal = 4; Not Equal = 0; Missing Source = 2; Missing Target = 1 
[2024-06-28 10:14:21] [INFO   ] [main                    ] Run Summary:  Elapsed Time (seconds) = 19; Total Rows Processed = 19; Total Out-of-Sync = 7; Through-put (rows/per second

查看pg存储库的结果

代码语言:txt
AI代码解释
复制
	pgcompare=# select * from dc_result;
	 cid |      rid      | table_name |   status    |          compare_dt           | equal_cnt | missing_source_cnt | missing_target_cnt | not_equal_cnt | source_cnt | target_cnt 
	-----+---------------+------------+-------------+-------------------------------+-----------+--------------------+--------------------+---------------+------------+------------
	  11 | 1719540297159 | t1         | out-of-sync | 2024-06-28 10:04:58.066141+08 |         6 |                  3 |                  0 |             0 |          6 |          9
	  12 | 1719540297159 | t2         | out-of-sync | 2024-06-28 10:05:04.188957+08 |         2 |                  0 |                  1 |             0 |          3 |          2

官方给出的查询的SQL语句

代码语言:txt
AI代码解释
复制

1、上次运行的结果
	WITH mr AS (SELECT max(rid) rid FROM dc_result)
	SELECT compare_dt, table_name, status, source_cnt total_cnt, equal_cnt, not_equal_cnt, missing_source_cnt+missing_target_cnt missing_cnt
	FROM dc_result r
		 JOIN mr ON (mr.rid=r.rid)
	ORDER BY table_name;

结构
pgcompare-# ORDER BY table_name;
          compare_dt           | table_name |   status    | total_cnt | equal_cnt | not_equal_cnt | missing_cnt 
-------------------------------+------------+-------------+-----------+-----------+---------------+-------------
 2024-06-28 10:37:59.375011+08 | t1         | out-of-sync |         6 |         6 |             0 |           3
 2024-06-28 10:38:05.552199+08 | t2         | out-of-sync |         3 |         2 |             0 |           1
 2024-06-28 10:38:11.672654+08 | t3         | in-sync     |         2 |         2 |             0 |           0
 2024-06-28 10:38:17.797341+08 | t4         | skipped     |         0 |         0 |             0 |           0
(4 rows)


2、不同步的行(注意:如果target的行比source的多,则不会被报告出来)
	SELECT coalesce(s.table_name,t.table_name) table_name,
	   coalesce(s.batch_nbr, t.batch_nbr) batch_nbr,
       coalesce(s.thread_nbr,t.thread_nbr) thread_nbr,
       CASE WHEN s.compare_result='n' THEN 'out-of-sync'
            WHEN s.compare_result='m' THEN 'missing target'
            WHEN t.compare_result='m' THEN 'missing source'
            ELSE 'unknown'
       END compare_result,
       coalesce(s.pk,t.pk) primary_key       
	FROM dc_source s
		 CROSS JOIN dc_target t;

pgcompare-#  CROSS JOIN dc_target t;
 table_name | batch_nbr | thread_nbr | compare_result | primary_key 
------------+-----------+------------+----------------+-------------
 t2         |         1 |          0 | missing target | {"b": 567}
 t2         |         1 |          0 | missing target | {"b": 567}
 t2         |         1 |          0 | missing target | {"b": 567}
(3 rows)

如果发现差异,请使用--check选项运行比较:

代码语言:txt
AI代码解释
复制
java -jar pgcompare.jar --batch=0 --check

当初始比较期间事务可能正在进行时,此重新检查过程非常有用。
重新检查仅检查已标记为存在差异的行。如果行仍然不匹配,则会报告详细信息。否则,行将被清除并标记为同步。

其它:

代码语言:txt
AI代码解释
复制
如果在执行完pgcompare后,数据库里面又增加或者减少了表,则需要重新执行 下面的操作:

0、清空pgcompare下面的各个表(清掉后便于查看最新数据,不清的话则需要根据compare_dt时间戳来判断是哪一次执行的比对操作)
	truncate table pgcompare.dc_object        ;
	truncate table pgcompare.dc_result        ;
	truncate table pgcompare.dc_source        ;
	truncate table pgcompare.dc_table         ;
	truncate table pgcompare.dc_table_history ;
	truncate table pgcompare.dc_target        ;


1、java -jar pgcompare.jar --discovery public


2、查看pg存储库的是否采集到表信息
	pgcompare=# select * from dc_table;
	 tid | source_schema | source_table | target_schema | target_table | batch_nbr | parallel_degree | mod_column | status | table_filter | column_map 
	-----+---------------+--------------+---------------+--------------+-----------+-----------------+------------+--------+--------------+------------
	   6 | public        | t1           | public        | t1           |         1 |               1 | NULL       | ready  | NULL         | NULL
	   7 | public        | t2           | public        | t2           |         1 |               1 | NULL       | ready  | NULL         | NULL
	   8 | public        | t3           | public        | t3           |         1 |               1 | NULL       | ready  | NULL         | NULL
	(3 rows)


3、java -jar pgcompare.jar --batch=0
	
	
4、查看比对的结果
	pgcompare=# select * from dc_result;
	 cid |      rid      | table_name |   status    |          compare_dt           | equal_cnt | missing_source_cnt | missing_target_cnt | not_equal_cnt | source_cnt | target_cnt 
	-----+---------------+------------+-------------+-------------------------------+-----------+--------------------+--------------------+---------------+------------+------------
	  17 | 1719540698085 | t1         | out-of-sync | 2024-06-28 10:11:39.071168+08 |         6 |                  3 |                  0 |             0 |          6 |          9
	  18 | 1719540698085 | t2         | out-of-sync | 2024-06-28 10:11:45.231873+08 |         2 |                  0 |                  1 |             0 |          3 |          2
	  19 | 1719540698085 | t3         | in-sync     | 2024-06-28 10:11:51.378574+08 |         3 |                  0 |                  0 |             0 |          3 |          3
	(3 rows)


5、找出不同步的行(注意:如果target的行比source的多,则不会被报告出来)
		SELECT coalesce(s.table_name,t.table_name) table_name,
	   coalesce(s.batch_nbr, t.batch_nbr) batch_nbr,
       coalesce(s.thread_nbr,t.thread_nbr) thread_nbr,
       CASE WHEN s.compare_result='n' THEN 'out-of-sync'
            WHEN s.compare_result='m' THEN 'missing target'
            WHEN t.compare_result='m' THEN 'missing source'
            ELSE 'unknown'
       END compare_result,
       coalesce(s.pk,t.pk) primary_key       
		FROM dc_source s
		 CROSS JOIN dc_target t;

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

评论
登录后参与评论
暂无评论
推荐阅读
编辑精选文章
换一批
C#宿舍系统数据库
Source Server : workplace Source Server Version : 50610 Source Host : 127.0.0.1:3306 Source Database : dormdb
张哥编程
2024/12/21
1180
pgcopydb的使用
https://pgcopydb.readthedocs.io/en/latest/
保持热爱奔赴山海
2024/06/27
3740
Linux系统之last命令的基本使用
文章名《Spring Cloud Alibaba + Dubbo 搭建一个微服务架构》 作者:王二蛋
江湖有缘
2024/11/28
2960
Linux系统之last命令的基本使用
【AI驱动TDSQL-C Serverless数据库技术实战营】结合AI进行电商数据分析
在去年TDSQL-C serverless,其实已经有写过文章介绍过,并且做过相关压测及实验,今天这篇文章主要是结合目前大火的AI大模型相关,继续去对该产品实验。
大侠之运维
2024/09/09
1610
oracle resetlogs后goldengate如何继续工作
最近有朋友在群里面咨询oracle resetlogs后,goldengate无法继续抽取日志,如何能够继续工作.
徐靖
2020/08/05
7790
oracle resetlogs后goldengate如何继续工作
Oracle数据库工程师手记:备份恢复双城记(三)
本文作者系大连健哥, POSTGRESQL、ORACLE 数据库资深从业人员、IT 技术的深度爱好者。相信科学改变人类、技术创造未来。个人主页:https://www.cnblogs.com/gaojian/,经其本人授权发布。
SQLplusDB
2022/08/19
3440
【GitHub日报】22-10-10 brew、minio、vite、seaweedfs、dbeaver 等8款App今日上新
最新发布版本:[Homebrew/brew] Release 3.6.5 - 3.6.5
程序员小助手
2022/12/20
6600
sqlmap的使用介绍
一、sqlmap获取目标 1.sql注入介绍 所谓SQL注入,就是通过把SQL命令插入到web表单提交或输入域名或页面请求的查询字符串,最终达到欺骗服务器执行恶意的SQL命令。 具体来说,它是利用现有
XRSec
2022/02/11
3K0
YAML 语言教程与使用案例
YAML 是 “YAML Ain’t a Markup Language”(YAML 不是一种标记语言)的递归缩写。在开发的这种语言时,YAML 的意思其实是:”Yet Another Markup Language”(仍是一种标记语言)。
踏歌行
2020/10/15
3.5K0
YAML 语言教程与使用案例
mysql数据实时同步到Elasticsearch
业务需要把mysql的数据实时同步到ES,实现低延迟的检索到ES中的数据或者进行其它数据分析处理。本文给出以同步mysql binlog的方式实时同步数据到ES的思路, 实践并验证该方式的可行性,以供参考。
bellen
2018/06/08
19.2K0
mysql数据实时同步到Elasticsearch
国产数据库|TiDB 5.0 快速体验
TiDB 是平凯星辰 PingCAP 公司自主设计、研发的开源分布式关系型数据库,是一款同时支持在线事务处理与在线分析处理 (Hybrid Transactional and Analytical Processing, HTAP) 的融合型分布式数据库产品,具备水平扩容或者缩容、金融级高可用、实时 HTAP、云原生的分布式数据库、兼容 MySQL 5.7 协议和 MySQL 生态等重要特性。更是墨天轮排行榜第一的国产数据库,可点击此处查看本月国产数据库排行榜。TiDB 的目标是为用户提供一站式 OLTP (Online Transactional Processing)、OLAP (Online Analytical Processing)、HTAP 解决方案。TiDB 适合高可用、强一致要求较高、数据规模较大等各种应用场景。今年四月份发布了 5.0 版本,六月二十四日又发布了 TiDB 5.1 版本,核心架构图如下:
JiekeXu之路
2021/07/30
1.5K0
国产数据库|TiDB 5.0 快速体验
MySQL高可用之MHA集群部署
很多小伙伴反映说网上的MHA教程甚至收费的课程里的MHA教程都存在坑,不少教程只是搭建完成了,是否真的能在主库宕机时自动切换不得而知,鉴于此情况,简单写了一个MHA集群的搭建步骤。由于搭建的次数较多,没踩到过多的坑(坏笑),所以没有写太多的排坑方法,如果小伙伴们在部署的过程中遇到问题可以和我沟通,文中如有问题欢迎斧正。
俊才
2020/05/26
1.1K0
MySQL高可用之MHA集群部署
Sqoop-1.4.7-部署与常见案例
该文章是基于 Hadoop2.7.6_01_部署 、 Hive-1.2.1_01_安装部署 进行的
踏歌行
2020/10/15
9020
Sqoop-1.4.7-部署与常见案例
【DB宝91】PG高可用之主从流复制+keepalived 的高可用
通过keepalived 来实现 PostgreSQL 数据库的主从自动切换,以达到高可用。当主节点宕机时,从节点可自动切换为主节点,继续对外提供服务。
AiDBA宝典
2022/02/23
3K0
【DB宝91】PG高可用之主从流复制+keepalived 的高可用
OGG实验:喂奶间隔数据表通过OGG配置同步
我之前在《使用SQL计算宝宝每次吃奶的时间间隔(数据保障篇)》中提到数据实时同步的方案,其中有一种是数据表通过OGG进行同步,当时没有详细展开测试,只给了之前学习OGG时的配置示例。由于之前很少接触OGG的管理运维,最近在进行实际测试时遇到些问题,虽然不难,但也是初学者常遇到的问题,所以记录下这个过程。
Alfred Zhao
2020/02/18
1.3K0
OGG实验:喂奶间隔数据表通过OGG配置同步
Sqoop1.4.5 导入 hive IOException running import job: java.io.IOException: Hive exited with status 1
hive.HiveImport: Exception in thread "main" Java.lang.NoSuchMethodError: org.apache.thrift.EncodingUtils.setBit(BIZ)B
星哥玩云
2022/07/03
9640
Database(Mysql)发版控制二
一.Installation & Configration 二.Advanced Usage of the Liquibase 三.Frequently Questions
全栈程序员站长
2022/07/08
7890
相关推荐
C#宿舍系统数据库
更多 >
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档