Version: 24.12.01

Troubleshoot diverged GTIDs in MySQL

The MySQL data service in PDS handles most pod crashes and outages automatically. For example, instances can fail over and rejoin the cluster on their own after a reboot. In some cases, however, a pod is unable to reboot the cluster after an outage and keeps failing with the following error:

The instance `instance-a` has an incompatible Global Transaction Identifier (GTID) set with the seed instance `instance-b` (GTIDs diverged). If you wish to proceed, the `force` option must be explicitly set.

This means that the instances cannot agree on which of them should become the new primary, because the data on those instances has diverged.
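Diverged here means that each instance has executed transactions the other has not. The following is a minimal sketch of how this could be confirmed with MySQL's built-in `GTID_SUBTRACT()` function and the `gtid_executed` system variable. The helper names (`gtid_executed`, `gtid_only_in`, `gtids_diverged`) are hypothetical, and connection options are deliberately left out, so add whichever ones your environment needs.

```shell
# Hypothetical helpers; not part of PDS or MySQL.
# Connection options (--host, --user, --password) are omitted on purpose.

# Print the executed GTID set of the instance on a single line.
gtid_executed() {
  mysql --batch --raw --skip-column-names \
    --execute='SELECT @@GLOBAL.gtid_executed' | tr -d '\n'
}

# Print the GTIDs that are in set $1 but not in set $2.
# The arguments are pasted into the SQL text, so pass only GTID sets.
gtid_only_in() {
  mysql --batch --raw --skip-column-names \
    --execute="SELECT GTID_SUBTRACT('$1', '$2')" | tr -d '\n'
}

# Succeed when each set contains transactions that the other set lacks.
gtids_diverged() {
  [ -n "$(gtid_only_in "$1" "$2")" ] && [ -n "$(gtid_only_in "$2" "$1")" ]
}
```

One possible use is to run `gtid_executed` inside the mysql container of each pod, save the two results, and then run `gtids_diverged "$set_a" "$set_b" && echo "GTIDs diverged"` from either instance. The output of `gtid_only_in` shows which transactions exist on only one side.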

To troubleshoot this issue:

  1. Review the GTIDs in the binary logs of the instances and choose the instance that contains the latest or most appropriate changes to continue with. You can inspect the transactions on the instances by:

    • opening a shell into the mysql container of the pods

    • using MySQL tools such as mysql and mysqlbinlog

  2. Once you have selected the instance to use as the seed, force a reboot of the cluster by running the following commands inside the mysql container of the selected pod:

    seed_instance=$(hostname -f)
    mysqlsh --host=$seed_instance --user=innodb-config --password=$password -- dba reboot-cluster-from-complete-outage --force --primary=$seed_instance:3306

  3. Check the cluster status and wait for the cluster to recover:

    mysqlsh --host=$seed_instance --user=innodb-config --password=$password -- cluster status
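Waiting for recovery can be scripted instead of re-running the status command by hand. The sketch below polls a command until its output contains an overall `"status": "OK"` entry. The helper name `wait_for_ok` is hypothetical, and the check assumes that the status command prints JSON with that field, which you should verify against the output in your environment.

```shell
# Hypothetical helper; not part of PDS or MySQL Shell.
# Poll a status command until its output reports an overall "OK" status.
#   $1: maximum number of attempts
#   $2: seconds to sleep between attempts
#   remaining arguments: the command to run
wait_for_ok() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Assumption: the command prints JSON containing "status": "OK".
    if "$@" 2>/dev/null | grep -Eq '"status": *"OK"'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_ok 30 10 mysqlsh --host=$seed_instance --user=innodb-config --password=$password -- cluster status` would try up to 30 times, 10 seconds apart. Only an exact `OK` ends the wait; other values such as `OK_NO_TOLERANCE` or `NO_QUORUM` keep it polling, so loosen the pattern if a degraded but running cluster is acceptable for your case.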

If the cluster does not become healthy, or if some nodes do not come online, continue by:

  • removing the failing instances:

    mysqlsh ... -- cluster remove-instance <other_instance>

  • and re-adding the instances:

    mysqlsh ... -- cluster add-instance <other_instance> --recoveryMethod=clone

See restoring and rebooting a cluster for more information.