
collector在尝试采集replication、replication_slot指标时失败了,报错信息如下:
time=2025-09-05T09:34:16.375+08:00 level=ERROR source=collector.go:207 msg="collector failed" name=replication duration_seconds=0.0711601 err="ERROR: Function pg_last_wal_receive_lsn() does not exist. (SQLSTATE 42883)"time=2025-09-05T09:34:16.674+08:00 level=INFO source=namespace.go:235 msg="error finding namespace" err="Error running query on database \"113.44.80.136:8000\": pg_stat_replication ERROR: Function pg_last_wal_receive_lsn() does not exist. (SQLSTATE 42883)"time=2025-09-05T09:34:16.711+08:00 level=INFO source=namespace.go:235 msg="error finding namespace" err="Error running query on database \"113.44.80.136:8000\": pg_replication_slots ERROR: Function pg_last_wal_receive_lsn() does not exist. (SQLSTATE 42883)"time=2025-09-05T09:34:16.755+08:00 level=ERROR source=collector.go:207 msg="collector failed" name=replication_slot duration_seconds=0.4515532 err="ERROR: Function pg_last_wal_receive_lsn() does not exist. (SQLSTATE 42883)"time=2025-09-11T15:33:59.609+08:00 level=ERROR source=gaussdb_exporter.go:684 msg="error scraping dsn" err="queryNamespaceMappings errors encountered, namespace: pg_stat_replication error: Error running query on database \"113.44.80.136:8000\": pg_stat_replication ERROR: Function pg_last_wal_receive_lsn() does not exist. (SQLSTATE 42883), namespace: pg_replication_slots error: Error running query on database \"113.44.80.136:8000\": pg_replication_slots ERROR: Function pg_last_wal_receive_lsn() does not exist. (SQLSTATE 42883)" dsn="gaussdb://root:PASSWORD_REMOVED@113.44.80.136:8000/circle_test?sslmode=disable"相关SQL:
SELECT
*,
(CASE pg_is_in_recovery () WHEN 't' THEN pg_last_wal_receive_lsn () ELSE pg_current_wal_lsn () END) AS pg_current_wal_lsn,
(
CASE
pg_is_in_recovery ()
WHEN 't' THEN
pg_wal_lsn_diff (pg_last_wal_receive_lsn (), PG_LSN ('0/0')) :: FLOAT
ELSE
pg_wal_lsn_diff (pg_current_wal_lsn (), PG_LSN ('0/0')) :: FLOAT
END
) AS pg_current_wal_lsn_bytes,
(
CASE
pg_is_in_recovery ()
WHEN 't' THEN
pg_wal_lsn_diff (pg_last_wal_receive_lsn (), replay_lsn) :: FLOAT
ELSE
pg_wal_lsn_diff (pg_current_wal_lsn (), replay_lsn) :: FLOAT
END
) AS pg_wal_lsn_diff
FROM
pg_stat_replication;
SELECT
slot_name,
DATABASE,
active,
(CASE pg_is_in_recovery () WHEN 't' THEN pg_wal_lsn_diff (pg_last_wal_receive_lsn (), restart_lsn) ELSE pg_wal_lsn_diff (pg_current_wal_lsn (), restart_lsn) END) AS pg_wal_lsn_diff
FROM
pg_replication_slots;
SELECT
slot_name,
slot_type,
CASE
WHEN pg_is_in_recovery () THEN
pg_last_wal_receive_lsn () - '0/0'
ELSE
pg_current_wal_lsn () - '0/0'
END AS current_wal_lsn,
0 AS confirmed_flush_lsn,
active
FROM
pg_replication_slots;错误提示很明确:SQL查询中引用了名为 pg_last_wal_receive_lsn() 的函数,但该函数在数据库系统中不存在。
GaussDB版本是:gaussdb (GaussDB Kernel 505.2.1 build ff07bff6) compiled at 2024-12-27 09:22:42 commit 10161 last mr 21504 release 采集工具是:Prometheus gaussdb_exporter
从报错信息和背景来看,核心问题是GaussDB 与 PostgreSQL 在复制相关系统函数上存在差异,导致监控工具(gaussdb_exporter)使用的 PostgreSQL 风格函数在 GaussDB 中不被支持。
pg_last_wal_receive_lsn()是 PostgreSQL 特有函数
pg_last_xlog_receive_lsn(),因 PostgreSQL 10 将 XLOG 重命名为 WAL)。
pg_last_wal_receive_lsn()。这是典型的 “兼容性差异”——GaussDB 保留了 PostgreSQL 的核心语法,但在部分系统函数(尤其是与底层存储、复制相关的)上做了定制化实现。
GaussDB 通过系统视图和自有函数提供复制相关信息,需修改监控查询语句,替换pg_last_wal_receive_lsn()为 GaussDB 支持的方式。以下是具体改造方案:
需求(原 PostgreSQL 函数) | GaussDB 替代方案 | 说明 |
|---|---|---|
判断是否为备库(pg_is_in_recovery()) | 仍可使用pg_is_in_recovery()(GaussDB 兼容该函数) | 返回t表示备库,f表示主库 |
备库最后接收的 LSN(pg_last_wal_receive_lsn()) | 从pg_stat_replication视图的receive_lsn字段获取(主库视角);或备库通过pg_stat_get_wal_receive_lsn()函数(部分版本支持) | GaussDB 中,备库的接收 LSN 直接记录在复制状态视图中,无需单独函数计算 |
主库当前 WAL 位置(pg_current_wal_lsn()) | 使用pg_current_wal_lsn()(GaussDB 兼容)或pg_stat_get_wal_current_lsn()函数 | 主库当前写入的 WAL 位置,与 PostgreSQL 用法一致 |
针对您提供的 3 条 SQL,替换pg_last_wal_receive_lsn()为 GaussDB 支持的方式:
pg_stat_replication查询)原 SQL 问题:备库判断分支使用了pg_last_wal_receive_lsn(),GaussDB 不支持。
改造后:
SELECT
*,
-- 替换备库LSN获取方式:主库用pg_current_wal_lsn(),备库从pg_stat_replication取receive_lsn
(CASE pg_is_in_recovery ()
WHEN 't' THEN (SELECT receive_lsn FROM pg_stat_replication LIMIT 1) -- 备库场景
ELSE pg_current_wal_lsn ()
END) AS pg_current_wal_lsn,
(
CASE
pg_is_in_recovery ()
WHEN 't' THEN
-- 备库:计算receive_lsn与0/0的差值
pg_wal_lsn_diff ((SELECT receive_lsn FROM pg_stat_replication LIMIT 1), PG_LSN ('0/0')) :: FLOAT
ELSE
-- 主库:沿用原逻辑
pg_wal_lsn_diff (pg_current_wal_lsn (), PG_LSN ('0/0')) :: FLOAT
END
) AS pg_current_wal_lsn_bytes,
(
CASE
pg_is_in_recovery ()
WHEN 't' THEN
-- 备库:用receive_lsn与replay_lsn计算差值
pg_wal_lsn_diff ((SELECT receive_lsn FROM pg_stat_replication LIMIT 1), replay_lsn) :: FLOAT
ELSE
-- 主库:沿用原逻辑
pg_wal_lsn_diff (pg_current_wal_lsn (), replay_lsn) :: FLOAT
END
) AS pg_wal_lsn_diff
FROM
pg_stat_replication;
pg_replication_slots查询)改造后:
SELECT
slot_name,
database,
active,
(CASE pg_is_in_recovery ()
WHEN 't' THEN
-- 备库:用receive_lsn计算差值
pg_wal_lsn_diff ((SELECT receive_lsn FROM pg_stat_replication LIMIT 1), restart_lsn)
ELSE
-- 主库:沿用原逻辑
pg_wal_lsn_diff (pg_current_wal_lsn (), restart_lsn)
END) AS pg_wal_lsn_diff
FROM
pg_replication_slots;
pg_replication_slots补充查询)SELECT
slot_name,
slot_type,
CASE
WHEN pg_is_in_recovery () THEN
-- 备库:用receive_lsn与0/0的差值
pg_wal_lsn_diff ((SELECT receive_lsn FROM pg_stat_replication LIMIT 1), PG_LSN ('0/0'))
ELSE
-- 主库:沿用原逻辑
pg_wal_lsn_diff (pg_current_wal_lsn (), PG_LSN ('0/0'))
END AS current_wal_lsn,
0 AS confirmed_flush_lsn,
active
FROM
pg_replication_slots;
执行以下语句确认 GaussDB 支持的字段和函数,确保改造有效:
-- 1. 查看pg_stat_replication视图结构(确认是否有receive_lsn字段)
\d pg_stat_replication;
-- 2. 检查备库LSN相关函数(部分GaussDB版本可能提供)
SELECT pg_stat_get_wal_receive_lsn(); -- 若返回值,则可替代子查询
-- 3. 确认主库当前LSN函数
SELECT pg_current_wal_lsn();将修改后的 SQL 更新到 gaussdb_exporter 的查询模板中(通常在queries.yaml或代码内置的 SQL 字符串中),重新部署 exporter 即可。
pg_last_wal_receive_lsn()是 PostgreSQL 9.6 + 引入的(替换了 9.5 及之前的pg_last_xlog_receive_lsn())。因此,从基础版本兼容性来看,GaussDB 不支持该函数符合其版本定位。
pg_stat_replication(主库视图)、pg_stat_slave_replication(备库视图,部分版本有)等视图暴露状态,而非独立函数。这也是改造时优先使用视图字段的原因。
问题根因是GaussDB 不支持 PostgreSQL 的pg_last_wal_receive_lsn()函数,需通过查询系统视图(如pg_stat_replication的receive_lsn字段)替代。核心解决方案是:
pg_stat_replication等视图的字段存在性,确保 LSN 计算逻辑正确;原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。