MySQL多源复制故障修复, 重新初始化一个channel的复制

MySQL多源复制故障修复, 重新初始化一个channel的复制

2017-03-04
Mysql

MySQL 5.7数据库多源复制拓扑如下:

1M1 --------+
2           |
3           |
4           v
5M2 ------->S

M1和M2为两个主数据库实例,S为从数据库实例,建立了从M1到S和从M2到S的多源复制。

S上执行:

  1SHOW SLAVE STATUS \G
  2*************************** 1. row ***************************
  3               Slave_IO_State: Waiting for master to send event
  4                  Master_Host: x.x.x.x
  5                  Master_User: repl
  6                  Master_Port: 3306
  7                Connect_Retry: 60
  8              Master_Log_File: binlog.000106
  9          Read_Master_Log_Pos: 15847815
 10               Relay_Log_File: relay-bin-m1.000003
 11                Relay_Log_Pos: 15684092
 12        Relay_Master_Log_File: binlog.000106
 13             Slave_IO_Running: Yes
 14            Slave_SQL_Running: Yes
 15              Replicate_Do_DB: 
 16          Replicate_Ignore_DB: 
 17           Replicate_Do_Table: 
 18       Replicate_Ignore_Table: 
 19      Replicate_Wild_Do_Table: 
 20  Replicate_Wild_Ignore_Table: sys.%,performance_schema.%,mysql.%,information_schema.%
 21                   Last_Errno: 0
 22                   Last_Error: 
 23                 Skip_Counter: 0
 24          Exec_Master_Log_Pos: 15847815
 25              Relay_Log_Space: 15684290
 26              Until_Condition: None
 27               Until_Log_File: 
 28                Until_Log_Pos: 0
 29           Master_SSL_Allowed: No
 30           Master_SSL_CA_File: 
 31           Master_SSL_CA_Path: 
 32              Master_SSL_Cert: 
 33            Master_SSL_Cipher: 
 34               Master_SSL_Key: 
 35        Seconds_Behind_Master: 0
 36Master_SSL_Verify_Server_Cert: No
 37                Last_IO_Errno: 0
 38                Last_IO_Error: 
 39               Last_SQL_Errno: 0
 40               Last_SQL_Error: 
 41  Replicate_Ignore_Server_Ids: 
 42             Master_Server_Id: 86
 43                  Master_UUID: @17070e4d-a4bc-49b6-88d4-cbbc610abc2b
 44             Master_Info_File: mysql.slave_master_info
 45                    SQL_Delay: 0
 46          SQL_Remaining_Delay: NULL
 47      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
 48           Master_Retry_Count: 86400
 49                  Master_Bind: 
 50      Last_IO_Error_Timestamp: 
 51     Last_SQL_Error_Timestamp: 
 52               Master_SSL_Crl: 
 53           Master_SSL_Crlpath: 
 54           Retrieved_Gtid_Set: 134db0a3-61a4-4f96-8574-a3ec77045987:68811801-68821751
 55            Executed_Gtid_Set: 30d1ce13-e8b0-43ae-b989-667aa0394fb8:1-240,
 56f7123455-4085-4e24-a028-e0cff16d1cc6:1-8409014,
 57f0c67103-9183-46c3-b34f-1b7fac7c0f51:1,
 5817070e4d-a4bc-49b6-88d4-cbbc610abc2b:1,
 59bb09bbce-c153-4af2-ab6f-85e70b74a13c:1-3,
 60cc96b6de-d317-4881-9a86-9c9b690cb8cc:1-15,
 61178dde0c-492c-4365-96e2-57008f2bd232:1-524521756,
 62da0e0547-a1ca-4cae-8db9-f3a90b3a996a:1,
 63134db0a3-61a4-4f96-8574-a3ec77045987:1-68821751
 64                Auto_Position: 1
 65         Replicate_Rewrite_DB: 
 66                 Channel_Name: m1
 67           Master_TLS_Version: 
 68*************************** 2. row ***************************
 69               Slave_IO_State: Waiting for master to send event
 70                  Master_Host: x.x.x.x
 71                  Master_User: repl
 72                  Master_Port: 3306
 73                Connect_Retry: 60
 74              Master_Log_File: binlog.002584
 75          Read_Master_Log_Pos: 127954993
 76               Relay_Log_File: relay-bin-m2.003944
 77                Relay_Log_Pos: 127955152
 78        Relay_Master_Log_File: binlog.002584
 79             Slave_IO_Running: Yes
 80            Slave_SQL_Running: Yes
 81              Replicate_Do_DB: 
 82          Replicate_Ignore_DB: 
 83           Replicate_Do_Table: 
 84       Replicate_Ignore_Table: 
 85      Replicate_Wild_Do_Table: 
 86  Replicate_Wild_Ignore_Table: sys.%,performance_schema.%,mysql.%,information_schema.%
 87                   Last_Errno: 0
 88                   Last_Error: 
 89                 Skip_Counter: 0
 90          Exec_Master_Log_Pos: 127954993
 91              Relay_Log_Space: 127955430
 92              Until_Condition: None
 93               Until_Log_File: 
 94                Until_Log_Pos: 0
 95           Master_SSL_Allowed: No
 96           Master_SSL_CA_File: 
 97           Master_SSL_CA_Path: 
 98              Master_SSL_Cert: 
 99            Master_SSL_Cipher: 
100               Master_SSL_Key: 
101        Seconds_Behind_Master: 0
102Master_SSL_Verify_Server_Cert: No
103                Last_IO_Errno: 0
104                Last_IO_Error: 
105               Last_SQL_Errno: 0
106               Last_SQL_Error: 
107  Replicate_Ignore_Server_Ids: 
108             Master_Server_Id: 69
109                  Master_UUID: c8822a1d-d39c-11e9-8301-fa163e13923c
110             Master_Info_File: mysql.slave_master_info
111                    SQL_Delay: 0
112          SQL_Remaining_Delay: NULL
113      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
114           Master_Retry_Count: 86400
115                  Master_Bind: 
116      Last_IO_Error_Timestamp: 
117     Last_SQL_Error_Timestamp: 
118               Master_SSL_Crl: 
119           Master_SSL_Crlpath: 
120           Retrieved_Gtid_Set: 178dde0c-492c-4365-96e2-57008f2bd232:524093802-524521756
121            Executed_Gtid_Set: 30d1ce13-e8b0-43ae-b989-667aa0394fb8:1-240,
122f7123455-4085-4e24-a028-e0cff16d1cc6:1-8409014,
123f0c67103-9183-46c3-b34f-1b7fac7c0f51:1,
12417070e4d-a4bc-49b6-88d4-cbbc610abc2b:1,
125bb09bbce-c153-4af2-ab6f-85e70b74a13c:1-3,
126cc96b6de-d317-4881-9a86-9c9b690cb8cc:1-15,
127178dde0c-492c-4365-96e2-57008f2bd232:1-524521756,
128da0e0547-a1ca-4cae-8db9-f3a90b3a996a:1,
129134db0a3-61a4-4f96-8574-a3ec77045987:1-68821751
130                Auto_Position: 1
131         Replicate_Rewrite_DB: 
132                 Channel_Name: m2
133           Master_TLS_Version: 
1342 rows in set (0.00 sec)

多源复制(Multi-source replication)是指在一个数据库中从多个源数据库中复制数据到同一个目标数据库的过程。要在 MySQL 中实现多源复制,需要在目标数据库中创建多个复制通道(replication channel),每个通道对应一个源数据库。每个通道都需要有独立的复制配置,包括源数据库的连接信息、复制过滤规则等。

上面的命令输出中可以看出从M1到S复制的channel名字为m1,从M2到S复制的channel名字为m2。

实践中遇到了问题,M1到S的复制出错,需要重做M1到S的复制。

  1. 在S上操作,停掉S上所有的SLAVE CHANNEL
1STOP SLAVE;
  1. 在S上操作,查看并记录CHANNEL的Executed_Gtid_Set
 1SHOW SLAVE STATUS \G
 2
 3...
 4Executed_Gtid_Set: 30d1ce13-e8b0-43ae-b989-667aa0394fb8:1-240,
 5f7123455-4085-4e24-a028-e0cff16d1cc6:1-8409014,
 6f0c67103-9183-46c3-b34f-1b7fac7c0f51:1,
 717070e4d-a4bc-49b6-88d4-cbbc610abc2b:1,
 8bb09bbce-c153-4af2-ab6f-85e70b74a13c:1-3,
 9cc96b6de-d317-4881-9a86-9c9b690cb8cc:1-15,
10178dde0c-492c-4365-96e2-57008f2bd232:1-524521756,
11da0e0547-a1ca-4cae-8db9-f3a90b3a996a:1,
12134db0a3-61a4-4f96-8574-a3ec77045987:1-68821751
13...

注意从Retrieved_Gtid_Set字段来确认在Executed_Gtid_Set中:

134db0a3-61a4-4f96-8574-a3ec77045987:1-68821751对应的是channel m1, 178dde0c-492c-4365-96e2-57008f2bd232:1-524521756对应的是channel m2。

已经停掉了所有的slave,而要重做m1,因此channel m2是最重要的,一定要记录好。

  1. 在M1上操作,导出M1中需要被复制的数据库

具体要导出哪些数据库还是要导出全部数据库,这个由具体业务和复制规则决定。另外注意要加上--set-gtid-purged=ON参数

1mysqldump --single-transaction --default-character-set=utf8mb4 --master-data=2 --set-gtid-purged=ON -vv -E -R \
2--databases db1 db2 db3 db4 \
3-uroot -p > db.sql

将上面导出的db.sql拷贝到S上。

  1. 在S上操作,使用RESET MASTER清空二进制日志文件

RESET MASTER用于清空二进制日志文件并重新开始记录新的二进制日志。这个命令通常用于在主从复制中,当需要重新开始复制时使用。

1RESET MASTER;

执行完RESET MASTER后,查询@@GLOBAL.GTID_PURGED变量返回为空。

1SELECT @@GLOBAL.GTID_PURGED;
  1. 在S上操作,将db.sql导入到S中。
1\. db.sql

此时查看@@GLOBAL.GTID_PURGED:

1SELECT @@GLOBAL.GTID_PURGED;
2
3....
4134db0a3-61a4-4f96-8574-a3ec77045987:1-69811800

注意记录上面查询中原来channel m1使用的134db0a3-61a4-4f96-8574-a3ec77045987:xxxxxxx新的事件编号(event number)这里是1-69811800。

  1. 在S上操作,重新设置@@GLOBAL.GTID_PURGED

在S上操作, RESET MASTER用于清空二进制日志文件(第5步导入数据产生的)并重新开始记录新的二进制日志。。

1RESET MASTER;

使用第2步中记录的Executed_Gtid_Set,并将channel m1使用的134db0a3-61a4-4f96-8574-a3ec77045987:的事件编号修改为第5步中记录的1-69811800。重新设置@@GLOBAL.GTID_PURGED

1SET @@GLOBAL.GTID_PURGED='30d1ce13-e8b0-43ae-b989-667aa0394fb8:1-240,f7123455-4085-4e24-a028-e0cff16d1cc6:1-8409014,f0c67103-9183-46c3-b34f-1b7fac7c0f51:1,17070e4d-a4bc-49b6-88d4-cbbc610abc2b:1,bb09bbce-c153-4af2-ab6f-85e70b74a13c:1-3,cc96b6de-d317-4881-9a86-9c9b690cb8cc:1-15,178dde0c-492c-4365-96e2-57008f2bd232:1-524521756,da0e0547-a1ca-4cae-8db9-f3a90b3a996a:1,134db0a3-61a4-4f96-8574-a3ec77045987:1-69811800';
  1. 在S上操作,重新启动各个SLAVE CHANNEL
1START SLAVE;

或单独启动各个CHANNEL:

1START SLAVE FOR CHANNEL 'm1';
2START SLAVE FOR CHANNEL 'm2';

最后使用SHOW SLAVE STATUS \G观察各个CHANNEL是否正常,重点关注Seconds_Behind_Master, Retrieved_Gtid_Set, Executed_Gtid_Set等字段的变化。

© 2024 青蛙小白
comments powered by Disqus