Friday, August 9, 2013

Fun with Bugs #22 - Some Bug Reports You Should Not Miss

Yet another user installed MySQL 5.5.32 yesterday and ended up with a server that cannot start... It's really easy to help in this case: just downgrade to 5.5.31, or upgrade to 5.5.33 if you can. Why did the problem happen during the upgrade? Because of a regression, Bug #69623.
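A check like this is easy to automate before an upgrade. Below is a minimal Python sketch, assuming you keep your own list of versions to avoid; `KNOWN_BAD`, `parse_version` and `is_known_bad` are hypothetical names made up for illustration, and the list holds only the one version this post names as affected.

```python
import re

# Hypothetical, hand-maintained list of versions to avoid.
# 5.5.32 is the only version this post names (regression Bug #69623).
KNOWN_BAD = {(5, 5, 32)}


def parse_version(text):
    """Turn a string such as '5.5.32-log' into a tuple like (5, 5, 32)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return tuple(int(part) for part in match.groups())


def is_known_bad(text):
    """Return True if the version is on the hand-maintained list."""
    return parse_version(text) in KNOWN_BAD


# The version string would normally come from SELECT VERSION().
for candidate in ("5.5.31", "5.5.32-log", "5.5.33"):
    print(candidate, "avoid" if is_known_bad(candidate) else "not on the list")
```

A version that is "not on the list" is not guaranteed to be safe, of course. The list is only as good as your reading of the bugs database.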

This case, easily solved during a quick chat, reminded me of the problem of bugs in production. Nobody expects any sane DBA to review every new bug report, but some of them should not be missed, at least when upgrading to a newer version. Regression bugs fall into this category (I see 15 of them reported for MySQL 5.6 GA versions and still "Verified", and that is only from a search for the "regression" tag, which is not always used...), as do bugs in new features that are enabled by default or always present. Let me list a few more for 5.6.13:
  • Bug #69325 - "MySQL uses significantly more memory for ALTER TABLE than expected". Imagine you try to use more partitions than usual because MySQL 5.6 allows it, and plan to enjoy fast ALTER, maybe while adding some indexes... only to end up swapping like crazy with everything hanging. Surprise...
  • Imagine you use replication in MySQL 5.6, with status stored in tables:

    mysql> show variables like '%info%';
    +---------------------------+----------------+
    | Variable_name             | Value          |
    +---------------------------+----------------+
    | master_info_repository    | TABLE          |
    | relay_log_info_file       | relay-log.info |
    | relay_log_info_repository | TABLE          |
    | sync_master_info          | 10000          |
    | sync_relay_log_info       | 10000          |
    +---------------------------+----------------+
    5 rows in set (0.02 sec)

    and just run CHANGE MASTER from time to time (or some tool does it for you without your knowing) and then restart your server. You know what? You may easily end up with:
  • Bug #69825 - "InnoDB: Assertion failure in thread ... in file lock0wait.cc line 297", or
  • Bug #69898 - "change_master() invokes ha_innobase::truncate() in a DML transaction", actually with the same assertion as above, and then upon restart...
  • Bug #69907 - "Error(1030): Got error -1 from storage engine", and maybe no way to start up even with innodb_force_recovery...
Why is that so? Probably, at least in part, because you blindly trusted the GA status of Oracle's MySQL 5.6 and did not care to monitor bug reports... I'll speculate about possible reasons in some other post.
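To see whether a server uses the replication setup described above, you can inspect the same variables programmatically. Here is a minimal Python sketch; `table_repositories` is a hypothetical helper, not part of MySQL or of any tool, and the input dictionary simply repeats the SHOW VARIABLES output shown earlier. It only flags the configuration; it does not prove that a server is, or is not, affected by these bugs.

```python
# Hypothetical helper, for illustration only.
# Variables that select where replication status is stored.
REPOSITORY_VARIABLES = ("master_info_repository", "relay_log_info_repository")


def table_repositories(variables):
    """Return the repository variables whose value is TABLE."""
    return [
        name
        for name in REPOSITORY_VARIABLES
        if str(variables.get(name, "")).upper() == "TABLE"
    ]


# Values copied from the SHOW VARIABLES output earlier in this post.
variables = {
    "master_info_repository": "TABLE",
    "relay_log_info_file": "relay-log.info",
    "relay_log_info_repository": "TABLE",
    "sync_master_info": "10000",
    "sync_relay_log_info": "10000",
}

flagged = table_repositories(variables)
if flagged:
    print("Replication status is stored in tables:", ", ".join(flagged))
    print("Worth reading Bug #69825, Bug #69898 and Bug #69907 before CHANGE MASTER.")
else:
    print("Replication status is not stored in tables.")
```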

Is there any way to prevent this kind of trouble? Nobody can guarantee bug-free releases, unfortunately, but monitoring the bugs database for new bugs, or at least following other sources that do monitor it, like this blog or my Facebook page, gives you a notably better chance of avoiding unexpected trouble. So, take care...
