Wednesday, May 1, 2019

Fun with Bugs #85 - On MySQL Bug Reports I am Subscribed to, Part XX

We have a public holiday here today, and it has been raining for the third day in a row, so I hardly have anything better to do than write yet another review of the public MySQL bug reports that I've subscribed to recently.

I am not sure these reviews are considered useful by anyone but a few of my readers, but I am still going to try, in the hope of ending up with some useful conclusions. Last time I stopped at Bug #94903, so let me continue with the next bug in my list:
  • Bug #94912 - "O_DIRECT_NO_FSYNC possible write hole". In this bug report Janet Campbell shared some concerns related to the way the O_DIRECT_NO_FSYNC (and O_DIRECT) settings for innodb_flush_method work. Check the comments, including those by Sunny Bains, where he agrees that "...this will cause problems where the redo and data are on separate devices.". Useful reading for anyone interested in InnoDB internals or using the innodb_dedicated_server setting in MySQL 8.0.14+.
  • Bug #94971 - "Incorrect key file error during log apply table stage in online DDL". Monty Solomon reported yet another case where an "online" ALTER for an InnoDB table fails in a weird way. The bug is still "Open" and there is no clear test case to just copy/paste, but both the problem and the potential solutions (make sure you have a "big enough" innodb_online_alter_log_max_size, or better, use the pt-online-schema-change or gh-ost tools) were already discussed here.
  • Bug #94973 - "Wrong result with subquery in where clause and order by". Yet another wrong results bug with a subquery on MySQL 5.7.25 was reported by Andreas Kohlbecker. We can only guess whether MySQL 8 is also affected (MariaDB 10.3.7 is not, based on my test results shared below), as the Oracle engineer who verified the bug had NOT cared to check, or to share the results of this check. What can be easier than running this (slightly modified) test case on every MySQL major version and copy/pasting the results:
    MariaDB [test]> CREATE TABLE `ReferenceB` (
        ->   `id` int(11) NOT NULL,
        ->   `bitField` bit(1) NOT NULL,
        ->   `refType` varchar(255) NOT NULL,
        ->   `externalLink` longtext,
        ->   PRIMARY KEY (`id`)
        -> );
    Query OK, 0 rows affected (0.170 sec)

    MariaDB [test]> INSERT INTO ReferenceB (id, bitField, refType, externalLink) VALUES(1, 0, 'JOU', NULL);
    Query OK, 1 row affected (0.027 sec)

    MariaDB [test]> INSERT INTO ReferenceB (id, bitField, refType, externalLink) VALUES(2, 0, 'JOU', NULL);
    Query OK, 1 row affected (0.002 sec)

    MariaDB [test]> SELECT hex(bitField) from ReferenceB  where id in (select id as
    y0_ from ReferenceB  where refType='JOU') order by externalLink asc;
    +---------------+
    | hex(bitField) |
    +---------------+
    | 0             |
    | 0             |
    +---------------+
    2 rows in set (0.028 sec)
    But we do not see anything like that in the bug report... This is sad.
  • Bug #94994 - "Memory leak detect on temptable storage engine". Yet another memory leak (found with ASan) reported by Zhao Jianwei, who had also suggested a patch.
  • Bug #95008 - "applying binary log doesn't work with blackhole engine tables". This bug was reported by Thomas Benkert. It seems that row-based events cannot be applied to a BLACKHOLE table, and this prevents some nice recovery tricks from working.
  • Bug #95020 - "select no rows return but check profile process Creating sort index". Interesting finding from cui jacky. I can reproduce this with MariaDB as well. It seems we either have to define some new stage or define "Creating sort index" better than in the current manual. This:
    The thread is processing a SELECT that is resolved using an internal temporary table.
    is plain wrong in the case shown in the bug report IMHO.
  • Bug #95040 - "Duplicately remove locks from lock_sys->prdt_page_hash in btr_compress". One of those rare cases when Zhai Weixiang does not provide the patch, just suggests the fix based on code review :)
  • Bug #95045 - "Data Truncation error occurred on a write of column 0Data was 0 bytes long and". This really weird regression bug in MySQL 8.0.14+ was reported by Adarshdeep Cheema. MariaDB 10.3 is surely not affected.
  • Bug #95049 - "Modified rows are not locked after rolling back to savepoint". The bug reporter, John Lin, found that the fine MySQL manual does not describe the real current implementation. Surprise!
  • Bug #95058 - "Index not used for column with IS TRUE or IS FALSE operators". Take extra care when using BOOLEAN columns in MySQL. As noted by Monty Solomon, the proper index is NOT used when you check BOOLEAN values the way the manual suggests, using IS TRUE or IS FALSE conditions. Roy Lyseng explained how such queries are treated internally, but surely there is a better way. MariaDB 10.3.7 is also affected, unfortunately.
  • Bug #95064 - "slave server may has gaps in Executed_Gtid_Set when a special case happen ". Nice bug report from yoga yoga, who had also contributed a patch. Parallel slave can easily get out of sync with master in case of lock wait timeout and failed retries. Again, we do NOT see any check if MySQL 8 is affected, unfortunately.
  • Bug #95065 - "Strange memory management when using full-text indexes". We all know that InnoDB FULLTEXT indexes implementation is far from perfect. Now, thanks to Yura Sorokin, we know also about a verified memory leak bug there that may lead to OOM killing of MySQL server.
  • Bug #95070 - "INSERT .. VALUES ( .., (SELECT ..), ..) takes shared lock with READ-COMMITTED". Seunguck Lee found yet another case of InnoDB locking behavior that MySQL manual does not explain. The bug is still "Open" for some reason.
  • Bug #95115 - "mysqld deadlock of all client threads originating from 3-way deadlock". It took some effort from the bug reporter, Sandeep Dube, and other community users (mostly Jacek Cencek) to attract proper attention to this bug from the proper Oracle developer, Dmitry Lenev, until it ended up "Verified" based on code review. We still cannot be sure if MySQL 8 is also affected.
That's all for now. I have a few more new bug reports that I monitor, but I do not plan to continue with this kind of review in this blog over the next few months. I hope I'll soon get a reason to write a different kind of post, with a more in-depth study of various topics...

In any case you may follow me on Twitter for anything related to recent interesting or wrongly handled MySQL bug reports.

This view of Chelsea from our apartment at Chelsea Cloisters reminds me that last year I spent spring holiday season properly - no time was devoted to MySQL bugs :)
To summarize:
  1. Do not use the O_DIRECT_NO_FSYNC value for innodb_flush_method if your redo logs are located on a different device than your data files. Just don't.
  2. Some Oracle engineers who process bugs still do not care to check if all supported major versions are affected and/or share the results of such checks in public.
  3. There are still many details of InnoDB locking to study, document properly and maybe fix.
  4. I am really concerned with the state of MySQL optimizer. We see all kinds of weird bugs (including regressions) and very few fixes in each maintenance release.
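
Point 1 above is easy to turn into a quick sanity check. What follows is only a minimal Python sketch under my own assumptions, not anything taken from the bug report: the function names are hypothetical, and comparing st_dev only tells you whether two directories are on the same mounted filesystem, which is a rough stand-in for "the same device" (two partitions of one disk will show up as different).

```python
import os


def risky_flush_setup(flush_method, data_dev, redo_dev):
    """Return True when O_DIRECT_NO_FSYNC is combined with redo logs and
    data files that report different device ids (hypothetical helper)."""
    if flush_method.upper() != "O_DIRECT_NO_FSYNC":
        return False
    return data_dev != redo_dev


def check_paths(flush_method, datadir, redo_dir):
    """Compare the st_dev of the two directories and return a warning
    string, or None when the combination looks fine.

    Assumption: both paths exist and are readable by the current user.
    """
    data_dev = os.stat(datadir).st_dev
    redo_dev = os.stat(redo_dir).st_dev
    if risky_flush_setup(flush_method, data_dev, redo_dev):
        return ("innodb_flush_method=O_DIRECT_NO_FSYNC while redo logs and "
                "data files are on different devices, see Bug #94912")
    return None
```

The three inputs can be read from a running server with SELECT @@innodb_flush_method, @@datadir, @@innodb_log_group_home_dir; and note that a relative innodb_log_group_home_dir (the default is ./) is relative to the data directory, so resolve it before passing it to check_paths().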
