Saturday, April 27, 2019

Fun with Bugs #84 - On Some Public Bugs Fixed in MySQL 5.7.26

Oracle released minor MySQL Server versions in all supported branches on April 25, 2019. MySQL 5.7.26 is just one of them, but recently I prefer to ignore MySQL 8 releases (after checking that I can build them from source code at least somewhere, even if it takes 18G+ of disk space and that they work in basic tests), as there are more chances for MySQL 5.7 bug fixes to affect me (and customers I care about) directly.

So, in this yet another boring blog post (that would never be a reason for any award) I plan to concentrate on bugs reported in public MySQL bugs database and fixed in MySQL 5.7.26. As usual I name bug reporters explicitly and give links to their remaining currently active bug reports, if any. This time the list is short enough, so I do not even split it by categories:
  • Bug #93164 - "Memory leak in innochecksum utility detected by ASan". This bug was reported by Yura Sorokin from Percona, who also had contributed a patch (for some reason this is not mentioned in the official release notes).
  • Bug #90402 - "innodb async io error handling in io_event". Wei Zhao found yet another case when wrong data type was used in the code and I/O error was not handled, and this could lead even to crashes. He had submitted a patch.
  • Bug #89126 - "create table panic on innobase_parse_hint_from_comment". Nice bug report with a patch from Yan Huang. Note also detailed analysis and test case provided by Marcelo Altmann in the comment. It's a great example of cooperation of all sides: Oracle MySQL developers, bugs verification team, bug reporter and other community users.
  • Bug #92241 - "alter partitioned table add auto_increment diff result depending on algorithm". Yet another great finding from Shane Bester himself!
  • Bug #94247 - "Contribution: Fix fractional timeout values used with WAIT_FOR_EXECUTED_GTI ...". This bug report was created based on pull request from Dirkjan Bussink, who had suggested a patch to fix the problem. Note the comment from Shlomi Noach that refers to Bug #94311 (still private).
  • Bug #85158 - "heartbeats/fakerotate cause a forced sync_master_info". Note MTR test case contributed by Sveta Smirnova and code analysis in a comment from Vlad Lesin (both from Percona at that time) in this bug report from Trey Raymond.
  • Bug #92690 - "Group Replication split brain with faulty network". I do not care about group replication (I have enough Galera in my life instead), but I could not skip this report by Przemyslaw Malkowski from Percona, with detailed steps on how to reproduce. Note comments from other community members. Yet another case to show that good bug reports attract community feedback and are fixed relatively fast.
  • Bug #93750 - "Escaping of column names for GRANT statements does not persist in binary logs". Clear and simple bug report from Andrii Ustymenko. I wonder why it was not found by internal testing/QA. Quick test shows that MariaDB 10.3.7, for example, is not affected:
    c:\Program Files\MariaDB 10.3\bin>mysql -uroot -proot -P3316 test
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 9
    Server version: 10.3.7-MariaDB-log binary distribution

    Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

    MariaDB [test]> create table t_from(id int primary key, `from` int, c1 int);
    Query OK, 0 rows affected (0.582 sec)

    MariaDB [test]> create user 'user01'@'%' identified by 'user01';
    Query OK, 0 rows affected (0.003 sec)

    MariaDB [test]> grant select (`id`,`from`) on `test`.`t_from` to 'user01'@'%';
    Query OK, 0 rows affected (0.054 sec)

    MariaDB [test]> show master status;
    | File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
    | pc-PC-bin.000007 |      852 |              |                  |
    1 row in set (0.030 sec)

    MariaDB [test]> show binlog events in 'pc-PC-bin.000007';
    | Log_name         | Pos | Event_type        | Server_id | End_log_pos | Info
    | pc-PC-bin.000007 |   4 | Format_desc       |         1 |         256 | Server
    ver: 10.3.7-MariaDB-log, Binlog ver: 4                             |
    | pc-PC-bin.000007 | 256 | Gtid_list         |         1 |         299 | [0-1-42
    ]                                                                  |
    | pc-PC-bin.000007 | 299 | Binlog_checkpoint |         1 |         342 | pc-PC-b
    in.000007                                                          |
    | pc-PC-bin.000007 | 708 | Query             |         1 |         852 | use `te
    st`; grant select (`id`,`from`) on `test`.`t_from` to 'user01'@'%' |

    9 rows in set (0.123 sec)
  • Bug #73936 - "If the storage engine supports RBR, unsafe SQL statementes end up in binlog". Nice bug report with MTR test case by Santosh Praneeth Banda. Note that last comment about the fix mentions only MySQL 8.0.15, not a single work about the fix in MySQL 5.7.26 (or anything about MySQL 5.6.x while the bug was reported for 5.6).
  • Bug #93341 - "Check for tirpc needs improvement". The need for improvement of CMake check was noted by Terje Røsten.
  • Bug #91803 - "mysqladmin shutdown does not wait for MySQL to shut down anymore". This regression bug (without a "regression" tag) was reported by Christian Roser.
  • Bug #91541 - ""Flush status" statement adds twice to global values ". Yura Sorokin contributed a detailed anlysis, MTR test case and a patch in this bug reported by Carlos Tutte.
  • Bug #90351 - "GLOBAL STATUS variables drift after rollback". Zsolt Parragi contibuted a patch to this bug found and reported by Iwo P. For some reason this contribution is not highlighted in the release notes.
  • Bug #81441 - "Warning about localhost when using skip-name-resolve". One of many bug reports from Monty Solomon in which he (and other community members like Jean-François Gagné) had to spend a lot of efforts and fight with a member of bugs verification team to get the bug accepted as a real code bug and then get it fixed in all versions affected.
  • Bug #90902 - "Select Query With Complex Joins Leaks File Handles". This bug was reported by James Wilson. I still wonder if MySQL 5.6 was affected. Bug reports says nothing about this (while I expect all supported GA versions to be checked when the bug is verified, and the results of such check clearly documented).
The future looks bright for MySQL 5.7
To summarize:
  1. Consider upgrade to 5.7.26 if you use complex joins, partitioned tables with auto_increment columns or rely on InnoDB or replication a lot.
  2. It's good to see crashing bugs that do not end up as hidden/"security", maybe because they are reported with patches...
  3. It's good to see examples of cooperation of several community users contributing to the same bug report!
  4. Percona engineers contribute a lot to MySQL, both in form of bug reports, patches and by helping other community users to make their point and get their bugs fixed fast.
  5. There are still things to improve in a way Oracle egnineers handle bugs verification, IMHO.
  6.  It's also a bit strange to see only one optimizer-related fix in this release. It means that either MySQL optimizer is already near perfect and there are no bugs to fix (check yourself, but I see 123 bugs here), or that nobody cares that much about MySQL optimizer in 5.7 these days.
  7. It seems for some bugs fixed in previous MySQL 8.0.x minor release there is no extra check/updates in public comments about the versions with the fix when it is released in MySQL 5.6 or 5.7.

Sunday, April 14, 2019

Fun with Bugs #83 - On MySQL Bug Reports I am Subscribed to, Part XIX

I have not much yet to say on a popular topic of upgrading everything to MySQL 8, so let me just continue reviewing public MySQL bug reports that I've subscribed to recently. After my previous post at least one bug, Bug #94747, got enough comments and clarifications (up to specific commit that introduced this regression pointed out by Daniel Black!) to have it re-classified and verified as InnoDB code bug. So, I see good reasons to continue attracting wide public attention to selected MySQL bugs - this helps to make MySQL better eventually.

As usual, I start from the oldest bug reports:
  • Bug #94758 - "record with REC_INFO_MIN_REC_FLAG is not the min record on non-leaf page". It was reported by a well known person, Zhai Weixiang, who contributed a lot to MySQL code and quality. This time he added a function to the code to prove his point and show that data may be stored in an unexpected order on the root node of InnoDB table. For this very reason (Oracle's code modified to show the problem) this report was marked as "Not a Bug". This is weird, one may prove the point by checking memory with gdb if needed (or maybe by checking data pages on disk as well), without any code modifications.
  • Bug #94775 - "Innodb_row_lock_current_waits status variable incorrect values on idle server". If you read this bug report by Uday Sitaram you can find out a statement that some status variables, like Innodb_row_lock_current_waits, are designed to be "fuzzy", so no matter what value you may see it's probably not a bug. Very enlightening!
  • Bug #94777 - "Question about the redo log write_ahead_buffer". One may argue that public bugs database is not a proper place to ask questions, but in this case Chen Zongzhi actually proved that MySQL 8.0 works better and started up some discussion that reveal probably a real bug (see comments starting from this one, "[5 Apr 15:59] Inaam Rana "). So, even if "Not a Bug" status is correct for the original finding, it seems there is something to study and we have a hope this study happens elsewhere (although I'd prefer to see this or new public bug report for this "Verified").
  • Bug #94797 - "Auto_increment values may decrease when adding a generated column". I can not reproduce this problem reported by Fengchun Hua with MariaDB 10.1.x. My related comments in the bug remain hidden and I've already agreed not to make any such comments in the bugs database. So, for now we have a "Verified" bug in MySQL 5.7.
  • Bug #94800 - "Lost connection (for Debug version) or wrong result (for release version)". According to my tests, MariaDB 10.3.7 is not affected by this bug reported by Weidong Yu, who had also suggested a fix. See also his Bug #94802 - "The behavior between insert stmt and "prepare stmt and execute stmt" different ". (MariaDB 10.3.7 is also not affected).
  • Bug #94803 - "rpl sql_thread may broken due to XAER_RMFAIL error for unfinished xa transaction". This bug reported by Dennis Gao is verified based on code review, but we still do not know if any major version besides 5.7 is affected.
  • Bug #94814 - "slave replication lock wait timeout because of wrong trx order in binlog file". Yet another case when XA transactions may break replication was found by Zhenghu Wen. The bug is still "Open" and I am really interested to see it properly processed soon.
  • Bug #94816 - "Alter table results in foreign key error that appears to drop referenced table". From reading this report I conclude that MySQL 5.7.25 (and Percona Server 5.7.25-28, for that matter) is affected (src table disappears) and this was verified, but still the bug ends up as "Can't repeat" (?) with a statement that there is a fix in MySQL 8.0 that can not be back ported. This is rally weird, as we have plenty of bugs NOT affecting 8.0 but verified as valid 5.7.x bugs. Moreover, I've verified that in case of MySQL 8.0.x ref table just can not be created:
    mysql> create table ref (
        -> a_id int unsigned not null,
        -> b_id int unsigned not null,
        -> constraint FK_ref_a_b foreign key (b_id,a_id) references src (b_id,a_id)
        -> ) engine=InnoDB;
    ERROR 1822 (HY000): Failed to add the foreign key constraint. Missing index for constraint 'FK_ref_a_b' in the referenced table 'src'
    But it means the test case does not apply to 8.0 "as is", that MySQL 8.0 is not affected, but from the above it's not obvious if there is a fix to back port at all. As a next step I tried essentially the same test case on MariaDB 10.3 and ended up with a crash that I've reported as MDEV-19250. So, this bug report that was not even accepted by Oracle MySQL team ended up as a source of a useful check and bug report for MariaDB.
  • Bug #94835 - "debug-assert while restarting server post install component". This is a classical Percona style bug report from Krunal Bauskar. Percona engineers carefully work on debug builds and find many unique new bugs that way.
  • Bug #94850 - "Not able to import partitioned tablespace older than 8.0.14". This regression bug (for cases when lower_case_table_names=1) was reported by Sean Ren.
  • Bug #94858 - "Deletion count incorrect when rows deleted through multi-hop foreign keys". I've checked that MariaDB 10.3 is also affected by this bug reported by Sawyer Knoblich.
  • Bug #94862 - "MySQL optimizer scan full index for max() on indexed column." Nice bug report from Seunguck Lee. As one can easily check MariaDB is not affected:
    MariaDB [test]> explain select max(fd2) from test;
    | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
    |    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
    1 row in set (0,001 sec)

    MariaDB [test]> explain select get_timestamp(max(fd2)) from test;
    | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
    |    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
    1 row in set (0,001 sec)

    MariaDB [test]> select version();
    | version()       |
    | 10.3.14-MariaDB |
    1 row in set (0,000 sec)
  • Bug #94881 - "slave replication lock wait timeout because of supremum record". I fail to understand why this bug report from Zhenghu Wen ended up as "Closed". There is a detailed enough code analsys, but no test case to just copy/paste. The problem happens only with XA transactions and it's not clear if recent MySQL 5.7.25 is also affected. It means the bug can be in "Need Feedback" or even "Can't Repeat", but I see zero reasons to close it at the moment. Looks very wrong to me.
  • Bug #94903 - "optimizer chooses inefficient plan for order by + limit in subquery". It seems recently a lot of efforts from both bug reporter (Василий Лукьянчиков in this case) and even Oracle developer (Guilhem Bichot in this case) may be needed to force proper processing of the real bug.
It may take more than one dram of a good single malt to keep up with recent style of MySQL bugs processing...
* * *
To summarize:
  1. Attracting public attention of MySQL community users (via blog posts in this series or by any other means) to some MySQL bugs still helps to get them processed properly.
  2. Oracle MySQL engineers who work on bugs continue to refuse further processing of some valid bug reports based on formal and not entirely correct assumptions. In some cases I clearly miss checks for possible regressions vs older versions.
  3. As I already stated, Oracle does not seem to care much about bugs in XA transactions and possible replication problems they may cause.
  4. I encourage community users to share their findings and concerns in public MySQL bugs database. Even if they end up as "Not a Bug", they may still start useful discussions and fixes.
  5. By the way, my comment about the related discussion in MariaDB MDEV-15641 is still private in Bug #94610. This is unfortunate.

Thursday, April 4, 2019

Fun with Bugs #82 - On MySQL Bug Reports I am Subscribed to, Part XVIII

I've got few comments to my post on references to MariaDB in MySQL bug reports (not in the blog, but via social media and in personal messages), and all but one comments from current and former colleagues whose opinion I value a lot confirmed that this really looks like a kind of attempt to advertise MariaDB. So, from now on I'll try to keep my findings on how tests shared by MySQL bug reporters work in MariaDB for myself, MariaDB JIRA and this blog (where I can and will advertise whatever makes sense to me), and avoid adding them to MySQL bug reports.

That said, I still think that it's normal to share links to MariaDB bug reports that add something useful (like patches, explanations or better test cases), and I keep insisting that this kind of feedback should not be hidden. Yes, I want to mention Bug #94610 (and related MDEV-15641) again, as a clear example of censorship that is not reasonable and should not be tolerated.

In the meantime, since my previous post in this series I've subscribed to 30 or so new MySQL bug reports. Some of them are listed below, started from the oldest. This time I am not going to exclude "inactive" reports that were not accepted by Oracle MySQL engineers as valid:
  • Bug #94629 - "no variable can skip a single channel error in mysql replication". This is a request to add support for per-channel options to skip N transactions or specific errors. It is not accepted ("Not a Bug") just because one can stop replication on all channels and start on one to skip transaction(s) there, then resume replication for all channels. Do you really think this is a right and only way to process such a report?
  • Bug #94647 - "Memory leak in MEMORY table by glibc". This is also not a bug because one ca use something like malloc-lib=jemalloc with mysqld_safe or Environment="LD_PRELOAD=/path/to/jemalloc" with systemd services. There might be some cost related to that in older versions... Note that similar MDEV-14050 is still open.
  • Bug #94655 - "Some GIS function do not use spatial index anymore". yet another regression vs MySQL 5.7 reported by Cedric Tabin. It ended up verified as feature request without a regression tag...
  • Bug #94664 - "Binlog related deadlock leads to all incoming connection choked.". This report from Yanmin Qiao ended up as a duplicate of  Bug #92108 - "Deadlock by concurrent show binlogs, pfs session_variables table & binlog purge" (fixed in MySQL 5.7.25+, thanks Sveta Smirnova for the hint). See also Bug #91941.
  • Bug #94665 - "enabling undo-tablespace encryption doesn't mark tablespace encryption flag". Nice finding by Krunal Bauskar from Percona.
  • Bug #94699 - "Mysql deadlock and bugcheck on aarch64 under stress test". Bug report with a patch contributed by Cai Yibo. The fix is included in upcoming MySQL 8.0.17 and the bug is already closed.
  • Bug #94709 - "Regression behavior for full text index". This regression was reported by Carlos Tutte and properly verified (with regression tag added and all versions checked) by Umesh Shastry. See also detailed analysis of possible reason in the comment from Nikolai Ikhalainen.
  • Bug #94723 - "Incorrect simple query result with func result and FROM table column in where". Michal Vrabel found this interesting case when MySQL 8.0.215 returns wrong results. I've checked the test case on MariaDB 10.3.7 and it is not affected. Feel free to consider this check and statement my lame attempt to advertise MariaDB. I don't mind.
  • Bug #94730 - "Kill slave may cause start slave to report an error.". This bug was declared a duplicate of a nice Bug #93397 - "Replication does not start if restart MySQL after init without start slave." reported by Jean-François Gagné earlier. Both bugs were reported for MySQL 5.7.x, but I do not see any public attempt to verify if MySQL 5.6 or 8.0 is also affected. In the past it was required to check/verify bug on all GA versions supported if the test case applies. Nowadays this approach is not followed way too often, even when bug reporter cared enough to provide MTR test case.
  • Bug #94737 - "MySQL uses composite hash index when not possible and returns wrong result". Yet another optimizer bug was reported by Simon Banaan. Again, MariaDB 10.3.7 is NOT affected. I can freely and happily state this here if it's inappropriate to state so in the bug report itself. By the way, other MySQL versions were probably not checked. Also, unlike Oracle engineer who verified the bug, I do not hesitate to copy/paste the entire results of my testing here:
    MariaDB [test]> show create table tmp_projectdays_4\G*************************** 1. row ***************************
           Table: tmp_projectdays_4
    Create Table: CREATE TABLE `tmp_projectdays_4` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `project` int(11) NOT NULL,
      `datum` date NOT NULL,
      `voorkomen` tinyint(1) NOT NULL DEFAULT 1,
      `tijden` tinyint(1) NOT NULL DEFAULT 0,
      `personeel` tinyint(1) NOT NULL DEFAULT 0,
      `transport` tinyint(1) NOT NULL DEFAULT 0,
      `materiaal` tinyint(1) NOT NULL DEFAULT 0,
      `materiaaluit` tinyint(1) NOT NULL DEFAULT 0,
      `materiaalin` tinyint(1) NOT NULL DEFAULT 0,
      `voertuigen` varchar(1024) DEFAULT '',
      `medewerkers` varchar(1024) DEFAULT '',
      `personeel_nodig` int(11) DEFAULT 0,
      `personeel_gepland` int(11) DEFAULT 0,
      `voertuigen_nodig` int(11) DEFAULT 0,
      `voertuigen_gepland` int(11) DEFAULT 0,
      `created` datetime DEFAULT NULL,
      `modified` datetime DEFAULT NULL,
      `creator` int(11) DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `project` (`project`,`datum`) USING HASH
    1 row in set (0.001 sec)

    MariaDB [test]> explain SELECT COUNT(1) FROM `tmp_projectdays_4` WHERE `project`

    | id   | select_type | table             | type | possible_keys | key  | key_len
     | ref  | rows | Extra       |
    |    1 | SIMPLE      | tmp_projectdays_4 | ALL  | project       | NULL | NULL
     | NULL | 2544 | Using where |
    1 row in set (0.004 sec)

    MariaDB [test]> SELECT COUNT(1) FROM `tmp_projectdays_4` WHERE `project` IN(1540

    | COUNT(1) |
    |     2544 |
    1 row in set (0.025 sec)

    MariaDB [test]> select version();
    | version()          |
    | 10.3.7-MariaDB-log |
    1 row in set (0.021 sec)
    When the job was done properly I see no reasons NOT to share the results.
  • Bug #94747 - "4GB Limit on large_pages shared memory set-up". My former colleague Nikolai Ikhalainen from Percona noted this nice undocumented "feature" (Had I forgotten to advertise Percona recently? Sorry about that...) He proved with a C program that one can create shared memory segments on Linux large than 4GB, one just had to use proper data type, unsigned long integer, in MySQL's code. Still, this report ended up as non-critical bug in "MySQL Server: Documentation" category, or even maybe a feature request internally. What a shame!
Spring in Paris is nice, as this photo made 3 years ago proves. The way MySQL bug reports are handled this spring is not any nice in some cases.
To summarize:
  1. It seems recently the fact that there is some limited workaround already published somewhere is a good enough reason NOT to accept valid feature request. Noted.
  2. Regression bugs (reports about drop in performance or problem that had not happened with older version but happens with some recent) are still not marked with regression tag sometimes. Moreover, clear performance regressions in MySQL 8.0.x vs MySQL 5.7.x may end up as just feature requests... A request to "Make MySQL Great Again" maybe?
  3. MySQL engineers who verify bugs often do not care to check all major versions and/or share the results of their tests. This is unfortunate.
  4. Some bugs are not classified properly upon verification. The fact that wrong data type is used is anything but severity 3 documentation problem, really.