Monday, June 24, 2013

Fun with Bugs #13 - MySQL replication and two-way communication

I hope you have noticed this already, but in case you missed it, please read this post by Matt Lord and check any bug at http://bugs.mysql.com. As soon as you log in with your Oracle account, you can vote for bugs and feature requests! I hope that eventually somebody will publish "Top N Most Wanted" lists of fixes based on the number of users who clicked this great "Affects Me" button.

If you plan to use this new feature to express your needs while you have the chance, why not start with replication-related bugs in the latest and greatest MySQL 5.6.12? Here is my "Top 10" list (starting with the most recently reported):
  1. Bug #69444 - do not assume that replication in MySQL 5.6 is magically crash-safe in all cases. With DDL and MTS (multi-threaded slave, in this context), starting replication from the proper position may still be a problem sometimes.
  2. Bug #69369 - when GTIDs and MTS are used, the slave's SQL thread just stops when binlog rotation happens on the slave (related to binlog group commit, it seems).
  3. Bug #69341 - semi-sync replication is slow when many clients make changes on the master. There is a preliminary patch in the bug report, so you may want to check it (no public feedback from the original bug reporter so far).
  4. Bug #69135 - formally this is probably just a documentation issue, but still: don't forget to set sync_master_info=1 when master_info_repository=TABLE, and maybe more, if you want replication to be really crash-safe in 5.6. Please check the comments in the bug report carefully.
  5. Bug #69097 - mysqld scans all binary logs on crash recovery. This is really serious and may be considered a performance regression of a kind. I am surprised that there have been no public comments since April 30, and I really hope 5.6.13 is going to fix this bug.
  6. Bug #69096 - the GTID_NEXT_LIST session variable is not visible. As a result, there is no way to recover from Bug #69045, so make sure you use MySQL 5.6.12, not an older version.
  7. Bug #69095 - replication in 5.6 (including 5.6.12) may break when GTIDs are enabled and the master changes from SBR to RBR. The bug is still "Open", and a similar problem could happen with 5.5.31, it seems, but make sure you review this case if you plan to switch to GTIDs in 5.6.
  8. Bug #69059 - for many real-life use cases it's not possible to shut down and restart the entire database topology simultaneously in order to enable GTIDs. So how do you start using this feature in production while you upgrade from an older 5.x.y to 5.6? There has been no public answer since April 24 to this question from Facebook, unfortunately.
  9. Bug #68953 - some binlog write errors, namely those originating in MYSQL_BIN_LOG::do_write_cache, are silently ignored. This may be considered a regression compared to 5.5 (as new code is affected) and, in any case, is not good.
  10. Bug #68892 - invalid use of the GRANT command breaks replication. Surely a DBA can break it in some other way, but writing something to the binary log that just cannot be executed is wrong in any case, I think.
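As a rough illustration of item 4 (Bug #69135), an option-file fragment for a 5.6 slave might look like the sketch below. Only the first two settings come from the list above; the relay_log_* lines are an assumption about what "maybe more" could include, not something taken from the bug report, so check the comments there and the manual before relying on them.

```ini
[mysqld]
# Named in item 4 above (Bug #69135)
master_info_repository    = TABLE
sync_master_info          = 1

# Assumption: settings commonly paired with the above for crash-safe slaves
relay_log_info_repository = TABLE
relay_log_recovery        = ON
```

Note that even with settings like these, the other items in this list (for example, the DDL and MTS case in item 1) still apply.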
Let's stop at this stage. The goal of this issue was not to provide a complete list of replication bugs in 5.6.12, but rather to give yet another partial answer to the question of what users upgrading to MySQL 5.6 should care about. To put it simply: do not assume that concurrency improvements, MTS, GTIDs, crash-safe replication and other new replication features just work well together by default.

The previous partial answer was given here. A few more issues, on InnoDB, installation or upgrade/downgrade problems, and PERFORMANCE_SCHEMA, are still needed to get the entire picture...

1 comment:

  1. A couple of recent bug reports to add here:

    http://bugs.mysql.com/bug.php?id=69618 - some GTID values may not work as expected in a debug build because of an assertion (and lead to an overflow otherwise)

    http://bugs.mysql.com/bug.php?id=69574 - Slave crashes when applying row-based binlog entries in cascading replication...
