Saturday, August 10, 2019

Fun with Bugs #88 - On MySQL Bug Reports I am Subscribed to, Part XXII

It's Saturday night. I have a week of vacation ahead that I am going to spend at home, working on things I usually do not have time for. I already did something useful (created a couple of test cases for MariaDB bugs here and there, among other things), so why not to get some fun and continue my previous review of recent interesting MySQL bug report...

So, here is the list of 15 more MySQL community bug reports that I've subscribed to back in May and June 2019:
  • Bug #95491 - "The unused wake up in simulator AIO of function reserve_slot". I am impressed by the fact that these days many people just study InnoDB code, find problematic parts, improve them, test and prove the improvement, and submit patches to Oracle. Chen Zongzhi, who reported this bug, is one of such great people.
  • Bug #95496 - "Replication of memory tables". As pointed out by Iwo P, MySQL manual describes what happens to MEMORY tables replicated from master to slave in details, including the case of master restart. But one detail is missing - when slave with MEMORY tables restarts, even with super_read_only set, it also generates even to delete data from the table into its own binary log, with GTID, and this leads to problems in case of failover.
  • Bug #95507 - "innodb_stats_method is not honored when innodb_stats_persistent=ON". This bug was reported by my colleague Sergei Petrunia. See also related MDEV-19574  (still "Open").
  • Bug #95547 - "MySQL restarts automatically after ABORT_SERVER due to disk full". I am not sure this is a replication bug, more a problem of too primitive mysqld_safe or systemd startup scripts that do not check free space on disk or reason why mysqld process committed suicide. Anyway, this report by Przemyslaw Malkowski from Percona was accepted as a replication bug.
  • Bug #95582 - "DDL using bulk load is very slow under long flush_list". As found and proved by Fungo Wang, due to optimized away redo log writing for bulk load, to make sure such DDL is crash safe, dirtied pages must be flushed to disk synchronously. Currently this is done inefficiently, as shown in this bug report. See also a year old (and still "Open") Bug #92099 - "provide an option to record redo log for bulk loading" from the same reporter, where he asks for a way to get redo logging back (but still benefit from other "online" ALTER features). MariaDB has that option after the fix of MDEV-16809.
  • Bug #95612 - "replication breakage due to Can not lock user management caches for processing". Simon Mudd noted that in some cases simple DROP USER IF EXISTS ... breaks parallel replication in MySQL 8.0.16. Good to know that START SLAVE allows to continue :)
  • Bug #95698 - "JSON columns are returned with binary charset". This bug was reported by Daniel Book. Note that according to the original issue, DBD::MariaDB (I had no idea this exists!) works properly with MariaDB 10.3 or 10.4. I had not checked.
  • Bug #95734 - "Failing SET statement in a stored procedure changes variables". This really weird and unexpected bug was reported by Przemysław Skibiński.
  • Bug #95863 - "MTS ordered commit may cause deadlock ". Great finding by Lalit Choudhary from Percona!
  • Bug #95895 - "Shutdown takes long time when table uses AUTOINC and FOREIGN KEY constraints". It was reported by Satya Bodapati from Percona, who later contributed as patch. See also his related Bug #95898 - "InnoDB releases memory of table only on shutdown if table has FK constraint". I truly hope to see these fixed in next minor releases of MySQL. Note that originally bug (PS-5639) was reported by Sveta Smirnova.
  • Bug #95928 - "InnoDB crashes on compressed and encrypted table when changing FS block size". Make sure to check and write original filesystem block size somewhere... This bug was reported by Sergei Glushchenko from Percona.
  • Bug #95929 - "Duplicate entry for key 'PRIMARY' when querying information_schema.TABLES". This nice feature of new data dictionary of MySQL 8.0.16 under concurrent load was found by Manuel Rigger.
  • Bug #95934 - "innodb_flush_log_at_trx_commit = 0 get performance regression on 8 core machine". If I had access to 8 cores box, I'd try to check this sysbench-based complete test case immediately. I should do this on my quadcore box at least, next week. But here we see some 12 days just wasted and then bug report from Chen Zongzh ended up "Analyzing", till today, with no further feedback.
  • Bug #95954 - "CAST of negative function return value to UNSIGNED malfunctions with BIGINT". Yet another weird bug found by Manuel Rigger. There are patches (separate for 5.6, 5.7 and 8.0) contributed by Oleksandr Peresypkin.
Montjuïc is a place to enjoy nice view of Barcelona and forget about MySQL bugs. So, I could miss some interesting ones at the end of May, 2019...
To summarize:
  1. Percona engineers still contribute a lot of useful and interesting MySQL bug reports, often - with patches.
  2. I am impressed by quality of MySQL bug reports in 2019. Often we see detailed analysis, proper complete test cases and patches.
  3. There is a recent fork of DBD::mysql for MariaDB (and MySQL), DBD::MariaDB!
  4. I badly need some new hardware for running benchmarks and tests from recent bug reports...
  5. I should spend more time in May in Barcelona :)

Sunday, August 4, 2019

Fun with Bugs #87 - On MySQL Bug Reports I am Subscribed to, Part XXI

After a 3 months long break I'd like to continue reviewing MySQL bug reports that I am subscribed to. This issue is devoted to bug reports I've considered interesting to follow in May, 2019:
  • Bug #95215 - "Memory lifetime of variables between check and update incorrectly managed". As demonstrated by Manuel Ung, there is a problem with all InnoDB MYSQL_SYSVAR_STR variables that can be dynamically updated. Valgrind allows to highlight it.
  • Bug #95218 - "Virtual generated column altered unexpectedly when table definition changed". This weird bug (that does not seem to be repeatable on MariaDB 10.3.7 with proper test case modifications like removing NOT NULL and collation settings from virtual column) was reported by Joseph Choi. Unfortunately we do not see any documented attempt to check if MySQL 8.0.x is also affected. My quick test shows MySQL 8.0.17 is NOT affected, but I'd prefer to see check copy/pasted as a public comment to the bug.
  • Bug #95230 - "SELECT ... FOR UPDATE on a gap in repeatable read should be exclusive lock". There are more chances to get a deadlock with InnoDB than one might expect... I doubt this report from Domas Mituzas is a feature request. It took him some extra efforts to insist on the point and get it verified even as S4.
  • Bug #95231 - "LOCK=SHARED rejected contrary to specification". This bug report from Monty Solomon ended up as a documentation request. The documentation and the implementation are not aligned, and it was decided NOT to change the parser to match documented syntax. But why it is still "Verified" then? Should it take months to correct the fine manual?
  • Bug #95232 - "The text of error message 1846 and the online DDL doc table should be updated". Yet another bug report from Monty Solomon. Some (but not ALL) partition specific ALTER TABLE operations do not yet support LOCK clause.
  • Bug #95233 - "check constraint doesn't consider IF function that returns boolean a boolean fun". As pointed out by Daniel Black, IF() function in a check constraint isn't considered a boolean type. He had contributed a patch to fix this, but based on comments it's not clear if it's going to be accepted and used "as is". The following test shows that MariaDB 10.3 is not affected:
    C:\Program Files\MariaDB 10.3\bin>mysql -uroot -proot -P3316 test
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 9
    Server version: 10.3.7-MariaDB-log mariadb.org binary distribution

    Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    MariaDB [test]> create table t1 (source enum('comment','post') NOT NULL, comment_id int unsigned, post_id int unsigned);
    Query OK, 0 rows affected (0.751 sec)
    MariaDB [test]> alter table t1 add check(IF(source = 'comment', comment_id IS NOT NULL AND post_id IS NULL, post_id IS NOT NULL AND comment_id IS NULL));
    Query OK, 0 rows affected (1.239 sec)
    Records: 0  Duplicates: 0  Warnings: 0
  • Bug #95235 - "ABRT:Can't generate a unique log-filename binlog.(1-999), while rotating the bin". Yet another bug report from Daniel Black. When MySQL 8.0.16 is built with gcc 9.0.x abort is triggered in the MTR suite on the binlog.binlog_restart_server_with_exhausted_index_value test.
  • Bug #95249 - "stop slave permanently blocked". This bug was reported by Wei Zhao, who had contributed a patch.
  • Bug #95256 - "MySQL 8.0.16 SYSTEM USER can be changed by DML". MySQL 8.0.16 had introduced an new privilege, SYSTEM_USER. MySQL manual actually says:
    "The protection against modification by regular accounts that is afforded to system accounts by the SYSTEM_USER privilege does not apply to regular accounts that have privileges on the mysql system schema and thus can directly modify the grant tables in that schema. For full protection, do not grant mysql schema privileges to regular accounts."
    But the report that a user with a privilege to execute DML on the mysql.GLOBAL_GRANTS table from Zhao Jianwei was accepted and verified. I hope Oracle engineers will finally make up their mind and decide either to fix this or to close this report as "Not a bug". I've subscribed in a hope for some fun around this decision making.
  • Bug #95269 - "binlog_row_image=minimal causes assertion failure". This assertion failure happens in debug build when one of standard MTR test cases, rpl.rpl_gis_ddl or rpl.rpl_window_functions is executed with --binlog-row-image=minimal option. In such cases I always wonder what is the reason for a failure NOT to be noted by Oracle MySQL QA and somehow fixed before Community users notice it? Either they don't run tests on debug builds with all possible combinations, or do not care to fix such failures (and thus should suffer from known failures in other test runs). I do not like any of these options, honestly. The bug was reported by Song Libing.
  • Bug #95272 - "Potential InnoDB SPATIAL INDEX corruption during root page split". This bug was reported by Albert Hu based on Valgrind report when running the test innodb.instant_alter. Do they run MTR tests under on Valgrind or ASan builds in Oracle? I assume they do, but then why Community users are reporting such cases first? Note that related MariaDB's bug, MDEV-13942, is fixed in 10.2.24+ and 10.3.15+.
  • Bug #95285 - "InnoDB: Page [page id: space=1337, page number=39] still fixed or dirty". This assertion failure that happens during normal shutdown was reported by LUCA TRUFFARELLI. There are chances that this is a regression bug (without a regression tag), as it does not happen for reporter on MySQL 5.7.21.
  • Bug #95319 - "SHOW SLAVE HOST coalesces certain server_id's into one". This bug was reported by Lalit Choudhary from Percona based on original findings by Glyn Astill.
  • Bug #95416 - "ZERO Date is both NULL and NOT NULL". This funny bug report was submitted Morgan Tocker. Manual actually explains that it's intended behavior (MariaDB 10.3.7 works the same way as MySQL), but it's still funny and unexpected, and the bug report remains "Verified".
  • Bug #95478 - "CREATE TABLE LIKE does not honour ROW_FORMAT." I'd like to add "...when it was not defined explicitly for the original table". The problem was reported by Jean-François Gagné and ended up as a verified feature request. See also this my post on the details of where row_format is stored and is not stored for InnoDB tables...
  • Bug #95484 - "EXCHANGE PARTITION works wrong/werid with different ROW_FORMAT". Another bug report by Jean-François Gagné related to the previous one. He had shown that it's actually possible to get partitions with different row formats in the same InnoDB table in MySQL 5.7.26, but not in the most natural way. It seems the problem may be fixed in 5.7.27 (by the fix for another, internally reported bug), but the bug remains "Verified".
There are some more bugs reported in May 2019 that I was interested in, but let me stop for now. Later in May I've got a chance to spend some days off in Barcelona, without any single MySQL bug report opened for day.

I like this view of Barcelona way more than any MySQL bugs review, including this one.
To summarize:
  1. Oracle engineers who process bugs still sometimes do not care to check if all supported major versions are affected and/or share the results of such checks in public. Instead, some of them care to argue about severity of the bug report, test case details etc.
  2. We still see bug reports that originates from existing, well known MTR test cases runs under Valgrind or in debug builds with some non-default options set. I do not have any good reason in mind to explain why these are NOT reported by Oracle's internal QA first.
  3. Surely some regression bugs still get verified without the regression tag added.
I truly hope my talk "Problems with Oracle's Way of MySQL Bugs Database Maintenance" will be accepted for Percona Live Europe 2019 conference (at least as a lightning talk) and I'll get another chance to speak about the problems highlighted above, and more. There are some "metabugs" in the way Oracle handles MySQL bug report, and these should be discussed and fixed, for the benefits of MySQL quality and all MySQL users and customers.