导语 :目前生产环境中主从延迟常见于RO实例及备库中,由于历史原因目前部分用户的备库开启了只读服务,后续这块我们会逐步推荐客户使用RO实例。
接一线同时反馈,用户的RO实例延迟很大,达到1000多秒。
1.登上客户RO实例,通过执行show full processlist 查看确认是否有执行时长比较久的Select查询。
MySQL [(none)]> pager grep -v Sleep
PAGER set to 'grep -v Sleep'
MySQL [(none)]> show processlist;
+-----------+-------------+--------------+--------+---------+-------+----------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----------+-------------+--------------+--------+---------+-------+----------------------------------+------------------+
| 196776913 | system user | | NULL | Connect | 7198 | Waiting for master to send event | NULL |
| 196776914 | system user | | NULL | Connect | 12957 | Reading event from the relay log | NULL |
| 197039676 | tencentroot | localhost | NULL | Query | 0 | init | show processlist |
+-----------+-------------+--------------+--------+---------+-------+----------------------------------+------------------+
69 rows in set (0.00 sec)
MySQL [(none)]> show processlist;
+-----------+-------------+--------------+--------+---------+-------+----------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----------+-------------+--------------+--------+---------+-------+----------------------------------+------------------+
| 196776913 | system user | | NULL | Connect | 7200 | Waiting for master to send event | NULL |
| 196776914 | system user | | NULL | Connect | 12958 | Reading event from the relay log | NULL |
| 197039676 | tencentroot | localhost | NULL | Query | 0 | init | show processlist |
+-----------+-------------+--------------+--------+---------+-------+----------------------------------+------------------+
70 rows in set (0.00 sec)
这里看并没有发现执行时长的Select。
2.执行show slave status\G ,查看当前slave ,sql_thread执行到哪里.
MySQL [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.53.176.236
Master_User: tencentroot
Master_Port: 20133
Connect_Retry: 60
Master_Log_File: mysql-bin.000614
Read_Master_Log_Pos: 291987919
Relay_Log_File: relay-bin.001810
Relay_Log_Pos: 786595678
Relay_Master_Log_File: mysql-bin.000609 //当前Slave SQL_thread 执行事件所对应主库日志的名称
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 786595596////当前Slave SQL_thread 执行事件所对应主库日志的位置偏移量
Relay_Log_Space: 5661425134
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 13108
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 87411
Master_UUID: 89421293-dd22-11e6-b4d7-6c0b84d538f9
Master_Info_File: /data1/mysql_root/data/20141/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Reading event from the relay log
...........
此时保存记录,在slave 上获取的主库日志名(mysql-bin.000609) 和位置偏移量(786595596)。
3.登录主库,根据步骤2中获取的信息,查看用户业务当时执行的什么操作。
MySQL [(none)]> show binlog events in 'mysql-bin.000609' from 786595596 limit 100;
+------------------+-----------+-------------+-----------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+-----------+-------------+-----------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| mysql-bin.000609 | 786595596 | Gtid | 87411 | 786595640 | SET @@SESSION.GTID_NEXT= '89421293-dd22-11e6-b4d7-6c0b84d538f9:1688021788' |
| mysql-bin.000609 | 786595640 | Query | 87411 | 786595710 | BEGIN |
| mysql-bin.000609 | 786595710 | Table_map | 87411 | 786595786 | table_id: 698 (ultrax.pre_common_credit_log) |
| mysql-bin.000609 | 786595786 | Delete_rows | 87411 | 786603968 | table_id: 698 |
| mysql-bin.000609 | 786603968 | Delete_rows | 87411 | 786612150 | table_id: 698 |
| mysql-bin.000609 | 786612150 | Delete_rows | 87411 | 786620332 | table_id: 698 |
| mysql-bin.000609 | 786620332 | Delete_rows | 87411 | 786628514 | table_id: 698 |
| mysql-bin.000609 | 786628514 | Delete_rows | 87411 | 786636696 | table_id: 698 |
| mysql-bin.000609 | 786636696 | Delete_rows | 87411 | 786644878 | table_id: 698 |
| mysql-bin.000609 | 786644878 | Delete_rows | 87411 | 786653060 | table_id: 698 |
| mysql-bin.000609 | 786653060 | Delete_rows | 87411 | 786661242 | table_id: 698 |
| mysql-bin.000609 | 786661242 | Delete_rows | 87411 | 786669424 | table_id: 698 |
| mysql-bin.000609 | 786669424 | Delete_rows | 87411 | 786677606 | table_id: 698 |
| mysql-bin.000609 | 786677606 | Delete_rows | 87411 | 786685788 | table_id: 698
..........................................
这里看执行是一个delete操作。
查看表信息
MySQL [ultrax]> show table status like 'pre_common_credit_log'\G
*************************** 1. row ***************************
Name: pre_common_credit_log
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 513486861
Avg_row_length: 78
Data_length: 40310423552
Max_data_length: 0
Index_length: 65080279040
Data_free: 84828749824
Auto_increment: NULL
Create_time: 2017-01-18 17:50:19
Update_time: NULL
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.00 sec)
这里看到该表的数据量已经5亿多。
CREATE TABLE `pre_common_credit_log` (
`uid` int(10) unsigned NOT NULL DEFAULT '0',
`operation` char(3) NOT NULL DEFAULT '',
`relatedid` int(10) unsigned NOT NULL,
`dateline` int(10) unsigned NOT NULL,
`extcredits1` int(10) NOT NULL,
`extcredits2` int(10) NOT NULL,
`extcredits3` int(10) NOT NULL,
`extcredits4` int(10) NOT NULL,
`extcredits5` int(10) NOT NULL,
`extcredits6` int(10) NOT NULL,
`extcredits7` int(10) NOT NULL,
`extcredits8` int(10) NOT NULL,
KEY `uid` (`uid`),
KEY `operation` (`operation`),
KEY `relatedid` (`relatedid`),
KEY `dateline` (`dateline`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
MySQL [ultrax]> show index from ultrax.pre_common_credit_log;
+-----------------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| pre_common_credit_log | 1 | uid | 1 | uid | A | 17116232 | NULL | NULL | | BTREE | | |
| pre_common_credit_log | 1 | operation | 1 | operation | A | 13116 | NULL | NULL | | BTREE | | |
| pre_common_credit_log | 1 | relatedid | 1 | relatedid | A | 155696 | NULL | NULL | | BTREE | | |
| pre_common_credit_log | 1 | dateline | 1 | dateline | A | 32092936 | NULL | NULL | | BTREE | | |
+-----------------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)
这里看表只有普通index,木有主键和唯一index,且index的Cardinality 也不高,表的数据量已经在5亿多,index 效率会很差。由于slave sql_thread是串行执行,再加上index效率问题,slave 不能够及时将接收的日志执行完毕,故引发主从延迟。
1.反馈用户此种业务基于主键或唯一性较高的index删除。
2.主从环境中,update、delete 操作时一定要留意当前表中是否有主键或唯一索引。若无, 可将自增列做为主键。如果没有主键的话,update、delete将会全表扫描,特别是在binlog row模式的时候,延迟将会很大。本文中binlog 中以row模
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。