Saturday, February 22, 2020

Fun with Bugs #94 - On MySQL Bug Reports I am Subscribed to, Part XXVIII

I may get a chance to speak about proper bugs processing for open source projects later this year, so I have to keep reviewing recent MySQL bugs to be ready for that. In my previous post in this series I listed some interesting MySQL bug reports created in December, 2019. Time to move on to January, 2020! Belated Happy New Year of cool MySQL Bugs!

As usual I mostly care about InnoDB, replication and optimizer bugs and explicitly mention bug reporter by name and give link to his other active reports (if any). I also pick up examples of proper (or improper) reporter and Oracle engineers attitudes. Here is the list:
  • Bug #98103 - "unexpected behavior while logging an aborted query in the slow query log".  Query that was killed while waiting for the table metadata lock is not only get logged, but also lock wait time is saved as query execution time. I'd like to highlight how bug reporter, Pranay Motupalli, used gdb to study what really happens in the code in this case. Perfect bug report!
  • Bug #98113 - "Crash possible when load & unload a connection handler". The (quite obvious) bug was verified based on code review, but only after some efforts were spent by Oracle engineer on denial to accept the problem and its importance. This bug was reported by Fangxin Flou.
  • Bug #98132 - "Analyze table leads to empty statistics during online rebuild DDL ". Nice addition to my collections! This bug with a nice and clear test case was reported by Albert Hu, who also suggested a fix.
  • Bug #98139 - "Committing a XA transaction causes a wrong sequence of events in binlog". This bug reported by Dehao Wang was verified as a "documentation" one, but I doubt documenting current behavior properly is an acceptable fix. Bug reporter suggested to commit in the binary log first, for example. Current implementation that allows users to commit/rollback a XA transaction by using another connection if the former connection is closed or killed, is risky. A lot of arguing happened in comments in the process, and my comment asking for a clear quote from the manual:
    Would you be so kind to share some text from this page you mentioned:

    or any other fine MySQL 8 manual page stating that XA COMMIT is NOT supported when executed from session/connection/thread other than those prepared the XA transaction? I am doing something wrong probably, but I can not find such text anywhere.
    was hidden. Let's see what happens to this bug report next.
  • Bug #98211 - "Auto increment value didn't reset correctly.". Not sure what this bug reported by Zhao Jianwei has to do with "Data Types", IMHO it's more about DDL or data dictionary. Again, some sarcastic comments from Community users were needed to put work on this bug back on track...
  • Bug #98220 - "with log_slow_extra=on Errno: info not getting updated correctly for error". This bug was reported by lalit Choudhary from Percona.
  • Bug #98227 - "innodb_stats_method='nulls_ignored' and persistent stats get wrong cardinalities". I think category is wrong for this bug. It's a but in InnoDB's persistent statistics implementation, one of many. The bug was reported by Agustín G from Percona.
  • Bug #98231 - "show index from a partition table gets a wrong cardinality value". Yet another by report by Albert Hu. that ended up as a "documentation" bug for now, even though older MySQL versions provided better cardinality estimations than MySQL 8.0 in this case (so this is a regression of a kind). I hope the bug will be re-classified and properly processed later.
  • Bug #98238 - "I_S.KEY_COLUMN_USAGE is very slow". I am surprised to see such a bug in MySQL 8. According to the bug reporter, Manuel Mausz, this is also a kind of regression comparing to older MySQL version, where these queries used to run faster. Surely, no "regression" tag in this case was added.
  • Bug #98284 - "Low sysbench score in the case of a large number of connections". This notable performance regression of MySQL 8 vs 5.7 was reported by zanye zjy. perf profiling pointed out towards ppoll() where a lot of time is spent. There is a fix suggested by Fangxin Flou (to use poll() instead), but the bug is still "Open".
  • Bug #98287 - "Explanation of hash joins is inconsistent across EXPLAIN formats". This bug was reported by Saverio M and ended up marked as a duplicate of Bug #97299 fixed in upcoming 8.0.20. Use EXPLAIN FORMAT=TREE in the meantime to see proper information about hash joins usage in the plan.
  • Bug #98288 - "xa commit crash lead mysql replication error". This bug report from Phoenix Zhang (who also suggested a patch) was declared a duplicate of Bug #76233 - "XA prepare is logged ahead of engine prepare" (that I've already discussed among other XA transactions bugs here).
  • Bug #98324 - "Deadlocks more frequent since version 5.7.26". Nice regression bug report by Przemyslaw Malkowski from Percona, with additional test provided later by Stephen Wei . Interestingly enough, test results shared by Umesh Shastry show that MySQL 8.0.19 is affected in the same way as 5.7.26+, but 8.0.19 is NOT listed as one of versions affected. This is a mistake to fix, along with missing regression tag.
  • Bug #98427 - "InnoDB FullText AUX Tables are broken in 8.0". Yet another regression in MySQL 8 was found by Satya Bodapati. Change in default collation for utf8mb4 character set caused this it seems. InnoDB FULLTEXT search was far from perfect anyway...
The are clouds in the sky of MySQL bugs processing.
To summarize:
  1.  Still too much time and efforts are sometimes spent on arguing with bug reporter instead of accepting and processing bugs properly. This is unfortunate.
  2. Sometimes bugs are wrongly classified when verified (documentation vs code bug, wrong category, wrong severity, not all affected versions are listed, ignoring regression etc). This is also unfortunate.
  3. Percona engineers still help to make MySQL better.
  4. There are some fixes in upcoming MySQL 8.0.20 that I am waiting for :)
  5. XA transactions in MySQL are badly broken (they are not atomic in storage engine + binary log) and hardly safe to use in reality.

Saturday, February 8, 2020

Fun with Bugs #93 - On MySQL Bug Reports I am Subscribed to, Part XXVII

No matter what I write and present about dynamic tracing, blog posts about MySQL bugs are more popular based on statistics. So, to make more readers happy, I'd like to continue my review of interesting bugs reported in November with this post on bugs reported during December, 2019.

As usual, I'll try to concentrate on bug reports related to InnoDB, replication and optimizer, but some other categories also got my attention:
  • Bug #97911 - "[FATAL] Semaphore wait has lasted > 600 seconds. We intentionally crash the serv...". This bug got marked as a duplicate of other, older long semaphore wait bug (in "Can't repeat" status!) without much analysis. I think all Oracle engineers who added comments to that bug missed one interesting point:
    ... [ERROR] [MY-012872] [InnoDB] Semaphore wait has lasted > 39052544 seconds.
    even though bug reporter highlighted it in a comment. Reported wait time is a problem and is surely a bug, no matter how to reproduce the long wait itself and what is its root cause!
  • Bug #97913 - "Undo logs growing during partitioning ALTER queries". This bug (affecting only MySQL 5.7.x) was reported by Przemyslaw Malkowski from Percona, who also presented useful examples of monitoring queries to the information_schema.innodb_metrics and performance_schema. Check also comments that may explain why 8.0 is not affected in a similar way.
  • Bug #97935 - "Memory leak in client connection using information_schema". It took some efforts (starting from but not limited to Valgrind Massif profiling of heap memory usage) and time for Daniel Nichter to prove the point and get this bug "Verified". It is also not clear if MySQL 8 is also affected.
  • Bug #97950 - "buf_read_page_handle_error can trigger assert failure". Bug reporter, Shu Lin, tried his best to make the point. It's clear enough how to repeat this, and one could use one of documented test synchonisation methods if gdb is too much for bug verification. I do not think this bug was handled properly or got the level of attention it truly deserved.
  • Bug #97966 - "XA COMMIT in another session will not write binlog event". This bug was reported by Lujun Wang and immediately verified, but again with no documented check if MySQL 8 is affected. This happens too often, unfortunately.
  • Bug #97971 - "Roles not handling column level privileges correctly; Can SELECT, but not UPDATE". Clear and simple bug report with a workaround from Travis Bement. It was immediately verified.
  • Bug #98014 - "Lossy implicit conversion in conditional breaks ONLY_FULL_GROUP_BY". Yet another case of (IMHO) improper bug processing. The argument presented (from the manual):
    "MySQL 5.7.5 and later also permits a nonaggregate column not named in a GROUP BY clause when ONLY_FULL_GROUP_BY SQL mode is enabled, provided that this column is limited to a single value"
    does not apply, as "single value" for = 0 is NOT selected, we have multiple Host values matching it due to conversion. This is how proper version (guess what it is) works:
    mysql> SELECT User, Host, COUNT(*) FROM mysql.user WHERE Host = 0 GROUP BY 1;
    ERROR 1055 (42000): 'mysql.user.Host' isn't in GROUP BY
    mysql> select @@sql_mode;
    | @@sql_mode         |
    1 row in set (0.001 sec)

    mysql> set session sql_mode='';
    Query OK, 0 rows affected (0.029 sec)

    mysql> SELECT User, Host, COUNT(*) FROM mysql.user WHERE Host = 0 GROUP BY 1;+---------------+-----------+----------+
    | User          | Host      | COUNT(*) |
    | data_engineer |           |        1 |
    | en            | localhost |        1 |
    | ro1           |           |        1 |
    | ro2           |           |        1 |
    | role-1        |           |        1 |
    | root          | ::1       |        3 |
    | user1         | %         |        1 |
    13 rows in set, 17 warnings (0.003 sec)
    I think this bug reported by Joshua Varner must be verified.
  • Bug #98046 - "Inconsistent behavior while logging a killed query in the slow query log". Bug reporter, Pranay Motupalli, provided a clear test case and a detailed analysis, including the gdb debugging session that proves the point. Nice bug report.
  • Bug #98055 - "MySQL Optimizer Bug not picking right index". Both the bug reporter (Sudheer Gadipathi) and engineer who verified the bug stated that MySQL 8.0.x is similarly affected (UNIQUE key is preferred for the partitioned table, even though there is a better non-unique index). But 8.0.x is NOT listed in the "Version:" filed. Weird.
  • Bug #98068 - "SELECT FOR UPDATE not-exist table in PERFORMANCE SCHEMA reports confusing error". This is a funny (but still a regression) bug report by William ZHANG. Proper versions work like this:
    mysql> select database();+--------------------+
    | database()         |
    | performance_schema |
    1 row in set (0.001 sec)

    mysql> select * from not_exist_table;
    ERROR 1146 (42S02): Table 'performance_schema.not_exist_table' doesn't exist
    mysql> select * from not_exist_table for update;
    ERROR 1146 (42S02): Table 'performance_schema.not_exist_table' doesn't exist
  • Bug #98072 - "innochecksum summary shows blob pages as other type of page for 8.0 tables". The bug was reported by SERGEY KUZMICHEV. This time "regression" tag is missing, even though it's clearly stated that MySQL 5.7 worked differently. This is from the proper version:
    ================PAGE TYPE SUMMARY==============
           1        Index page
           0        Undo log page
           1        Inode page
           0        Insert buffer free list page
         508        Freshly allocated page
           1        Insert buffer bitmap
           0        System page
           0        Transaction system page
           1        File Space Header
           0        Extent descriptor page
          64        BLOB page
           0        Compressed BLOB page
           0        Page compressed page
           0        Page compressed encrypted page
           0        Other type of page

  • Bug #98083 - "Restarting the computer when deleting the database will cause directory residues". One would expect that MySQL 8 with a data dictionary should have some means to figure out the remaining database directory for a dropped database upon startup (as it stores information about databases elsewhere) and do proper cleanup. I think this bug reported by Jinming Liao must be verified and fixed. There is no "... manual creation or deletion of tables or databases..." involved in this case.
  • Bug #98091 - "InnoDB does not initialize raw disk partitions". As simple as that and both 5.7.29 and 8.0.219 are surely affected. It was not always the case, I've used raw devices myself with older MySQL versions, so this bug reported by Saverio M is a regression. Still, "regression" tag is missing.
That's all bugs reported in December, 2019 that I cared to subscribe to  and mention here. Next time I'll check bugs reported in January, 2020. There are at least 16 in my list already, so stay tuned.

Follow the links in this post to get more details about profiling and creating off-CPU FlameGraphs for MySQL. This post is devoted to bugs, though :)

To summarize:
  1. I am happy to see bug reports from people whom I never noticed before. MySQL Community is alive.
  2. Some flexibility in following common sense based bugs verification procedures is still visible. Bugs reported for 5.7 are not checked on 8.0 (or the results of this check are not documented in public), nobody cares to read what bug reporter says carefully or go extra mile, "regression" tag not added, and so on. 
  3. Probably at this stage my writings are mostly ignored by Oracle's decision makers. But I keep watching them all anyway.