Sunday, April 14, 2019

Fun with Bugs #83 - On MySQL Bug Reports I am Subscribed to, Part XIX

I do not have much to say yet on the popular topic of upgrading everything to MySQL 8, so let me just continue reviewing public MySQL bug reports that I've subscribed to recently. After my previous post at least one bug, Bug #94747, got enough comments and clarifications (up to the specific commit that introduced this regression, pointed out by Daniel Black!) to have it re-classified and verified as an InnoDB code bug. So, I see good reasons to continue attracting wide public attention to selected MySQL bugs - this helps to make MySQL better eventually.

As usual, I start from the oldest bug reports:
  • Bug #94758 - "record with REC_INFO_MIN_REC_FLAG is not the min record on non-leaf page". It was reported by a well known person, Zhai Weixiang, who contributed a lot to MySQL code and quality. This time he added a function to the code to prove his point and show that data may be stored in an unexpected order on the root node of an InnoDB table. For this very reason (Oracle's code modified to show the problem) this report was marked as "Not a Bug". This is weird: one may prove the point by checking memory with gdb if needed (or maybe by checking data pages on disk as well), without any code modifications.
  • Bug #94775 - "Innodb_row_lock_current_waits status variable incorrect values on idle server". If you read this bug report by Uday Sitaram you can find out a statement that some status variables, like Innodb_row_lock_current_waits, are designed to be "fuzzy", so no matter what value you may see it's probably not a bug. Very enlightening!
  • Bug #94777 - "Question about the redo log write_ahead_buffer". One may argue that a public bugs database is not a proper place to ask questions, but in this case Chen Zongzhi actually proved that MySQL 8.0 works better and started a discussion that probably reveals a real bug (see comments starting from this one, "[5 Apr 15:59] Inaam Rana"). So, even if "Not a Bug" status is correct for the original finding, it seems there is something to study, and we have a hope that this study happens elsewhere (although I'd prefer to see this or a new public bug report for this "Verified").
  • Bug #94797 - "Auto_increment values may decrease when adding a generated column". I can not reproduce this problem reported by Fengchun Hua with MariaDB 10.1.x. My related comments in the bug remain hidden and I've already agreed not to make any such comments in the bugs database. So, for now we have a "Verified" bug in MySQL 5.7.
  • Bug #94800 - "Lost connection (for Debug version) or wrong result (for release version)". According to my tests, MariaDB 10.3.7 is not affected by this bug reported by Weidong Yu, who had also suggested a fix. See also his Bug #94802 - "The behavior between insert stmt and "prepare stmt and execute stmt" different ". (MariaDB 10.3.7 is also not affected).
  • Bug #94803 - "rpl sql_thread may broken due to XAER_RMFAIL error for unfinished xa transaction". This bug reported by Dennis Gao is verified based on code review, but we still do not know if any major version besides 5.7 is affected.
  • Bug #94814 - "slave replication lock wait timeout because of wrong trx order in binlog file". Yet another case when XA transactions may break replication was found by Zhenghu Wen. The bug is still "Open" and I am really interested to see it properly processed soon.
  • Bug #94816 - "Alter table results in foreign key error that appears to drop referenced table". From reading this report I conclude that MySQL 5.7.25 (and Percona Server 5.7.25-28, for that matter) is affected (src table disappears) and this was verified, but still the bug ends up as "Can't repeat" (?) with a statement that there is a fix in MySQL 8.0 that cannot be backported. This is really weird, as we have plenty of bugs NOT affecting 8.0 but verified as valid 5.7.x bugs. Moreover, I've verified that in case of MySQL 8.0.x the ref table just cannot be created:
    mysql> create table ref (
        -> a_id int unsigned not null,
        -> b_id int unsigned not null,
        -> constraint FK_ref_a_b foreign key (b_id,a_id) references src (b_id,a_id)
        -> ) engine=InnoDB;
    ERROR 1822 (HY000): Failed to add the foreign key constraint. Missing index for constraint 'FK_ref_a_b' in the referenced table 'src'
    So the test case does not apply to 8.0 "as is" and in that sense MySQL 8.0 is not affected, but from the above it's not obvious if there is a fix to backport at all. As a next step I tried essentially the same test case on MariaDB 10.3 and ended up with a crash that I've reported as MDEV-19250. So, this bug report that was not even accepted by Oracle MySQL team ended up as a source of a useful check and bug report for MariaDB.
  • Bug #94835 - "debug-assert while restarting server post install component". This is a classical Percona style bug report from Krunal Bauskar. Percona engineers carefully work on debug builds and find many unique new bugs that way.
  • Bug #94850 - "Not able to import partitioned tablespace older than 8.0.14". This regression bug (for cases when lower_case_table_names=1) was reported by Sean Ren.
  • Bug #94858 - "Deletion count incorrect when rows deleted through multi-hop foreign keys". I've checked that MariaDB 10.3 is also affected by this bug reported by Sawyer Knoblich.
  • Bug #94862 - "MySQL optimizer scan full index for max() on indexed column." Nice bug report from Seunguck Lee. As one can easily check MariaDB is not affected:
    MariaDB [test]> explain select max(fd2) from test;
    | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
    |    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
    1 row in set (0,001 sec)

    MariaDB [test]> explain select get_timestamp(max(fd2)) from test;
    | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
    |    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
    1 row in set (0,001 sec)

    MariaDB [test]> select version();
    | version()       |
    | 10.3.14-MariaDB |
    1 row in set (0,000 sec)
  • Bug #94881 - "slave replication lock wait timeout because of supremum record". I fail to understand why this bug report from Zhenghu Wen ended up as "Closed". There is a detailed enough code analysis, but no test case to just copy/paste. The problem happens only with XA transactions and it's not clear if recent MySQL 5.7.25 is also affected. It means the bug can be in "Need Feedback" or even "Can't Repeat", but I see zero reasons to close it at the moment. Looks very wrong to me.
  • Bug #94903 - "optimizer chooses inefficient plan for order by + limit in subquery". It seems that recently a lot of effort from both the bug reporter (Василий Лукьянчиков in this case) and even an Oracle developer (Guilhem Bichot in this case) may be needed to force proper processing of a real bug.
It may take more than one dram of a good single malt to keep up with recent style of MySQL bugs processing...
* * *
To summarize:
  1. Attracting public attention of MySQL community users (via blog posts in this series or by any other means) to some MySQL bugs still helps to get them processed properly.
  2. Oracle MySQL engineers who work on bugs continue to refuse further processing of some valid bug reports based on formal and not entirely correct assumptions. In some cases I clearly miss checks for possible regressions vs older versions.
  3. As I already stated, Oracle does not seem to care much about bugs in XA transactions and possible replication problems they may cause.
  4. I encourage community users to share their findings and concerns in public MySQL bugs database. Even if they end up as "Not a Bug", they may still start useful discussions and fixes.
  5. By the way, my comment about the related discussion in MariaDB MDEV-15641 is still private in Bug #94610. This is unfortunate.

Thursday, April 4, 2019

Fun with Bugs #82 - On MySQL Bug Reports I am Subscribed to, Part XVIII

I've got a few comments on my post on references to MariaDB in MySQL bug reports (not in the blog, but via social media and in personal messages), and all but one of the comments from current and former colleagues whose opinion I value a lot confirmed that this really looks like a kind of attempt to advertise MariaDB. So, from now on I'll try to keep my findings on how tests shared by MySQL bug reporters work in MariaDB to myself, MariaDB JIRA and this blog (where I can and will advertise whatever makes sense to me), and avoid adding them to MySQL bug reports.

That said, I still think that it's normal to share links to MariaDB bug reports that add something useful (like patches, explanations or better test cases), and I keep insisting that this kind of feedback should not be hidden. Yes, I want to mention Bug #94610 (and related MDEV-15641) again, as a clear example of censorship that is not reasonable and should not be tolerated.

In the meantime, since my previous post in this series I've subscribed to 30 or so new MySQL bug reports. Some of them are listed below, starting from the oldest. This time I am not going to exclude "inactive" reports that were not accepted by Oracle MySQL engineers as valid:
  • Bug #94629 - "no variable can skip a single channel error in mysql replication". This is a request to add support for per-channel options to skip N transactions or specific errors. It is not accepted ("Not a Bug") just because one can stop replication on all channels and start on one to skip transaction(s) there, then resume replication for all channels. Do you really think this is a right and only way to process such a report?
  • Bug #94647 - "Memory leak in MEMORY table by glibc". This is also not a bug because one can use something like malloc-lib=jemalloc with mysqld_safe or Environment="LD_PRELOAD=/path/to/jemalloc" with systemd services. There might be some cost related to that in older versions... Note that similar MDEV-14050 is still open.
  • Bug #94655 - "Some GIS function do not use spatial index anymore". Yet another regression vs MySQL 5.7 reported by Cedric Tabin. It ended up verified as a feature request without a regression tag...
  • Bug #94664 - "Binlog related deadlock leads to all incoming connection choked.". This report from Yanmin Qiao ended up as a duplicate of  Bug #92108 - "Deadlock by concurrent show binlogs, pfs session_variables table & binlog purge" (fixed in MySQL 5.7.25+, thanks Sveta Smirnova for the hint). See also Bug #91941.
  • Bug #94665 - "enabling undo-tablespace encryption doesn't mark tablespace encryption flag". Nice finding by Krunal Bauskar from Percona.
  • Bug #94699 - "Mysql deadlock and bugcheck on aarch64 under stress test". Bug report with a patch contributed by Cai Yibo. The fix is included in upcoming MySQL 8.0.17 and the bug is already closed.
  • Bug #94709 - "Regression behavior for full text index". This regression was reported by Carlos Tutte and properly verified (with regression tag added and all versions checked) by Umesh Shastry. See also detailed analysis of possible reason in the comment from Nikolai Ikhalainen.
  • Bug #94723 - "Incorrect simple query result with func result and FROM table column in where". Michal Vrabel found this interesting case when MySQL 8.0.15 returns wrong results. I've checked the test case on MariaDB 10.3.7 and it is not affected. Feel free to consider this check and statement my lame attempt to advertise MariaDB. I don't mind.
  • Bug #94730 - "Kill slave may cause start slave to report an error.". This bug was declared a duplicate of a nice Bug #93397 - "Replication does not start if restart MySQL after init without start slave." reported by Jean-François Gagné earlier. Both bugs were reported for MySQL 5.7.x, but I do not see any public attempt to verify if MySQL 5.6 or 8.0 is also affected. In the past it was required to check/verify bug on all GA versions supported if the test case applies. Nowadays this approach is not followed way too often, even when bug reporter cared enough to provide MTR test case.
  • Bug #94737 - "MySQL uses composite hash index when not possible and returns wrong result". Yet another optimizer bug was reported by Simon Banaan. Again, MariaDB 10.3.7 is NOT affected. I can freely and happily state this here if it's inappropriate to state so in the bug report itself. By the way, other MySQL versions were probably not checked. Also, unlike Oracle engineer who verified the bug, I do not hesitate to copy/paste the entire results of my testing here:
    MariaDB [test]> show create table tmp_projectdays_4\G
    *************************** 1. row ***************************
           Table: tmp_projectdays_4
    Create Table: CREATE TABLE `tmp_projectdays_4` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `project` int(11) NOT NULL,
      `datum` date NOT NULL,
      `voorkomen` tinyint(1) NOT NULL DEFAULT 1,
      `tijden` tinyint(1) NOT NULL DEFAULT 0,
      `personeel` tinyint(1) NOT NULL DEFAULT 0,
      `transport` tinyint(1) NOT NULL DEFAULT 0,
      `materiaal` tinyint(1) NOT NULL DEFAULT 0,
      `materiaaluit` tinyint(1) NOT NULL DEFAULT 0,
      `materiaalin` tinyint(1) NOT NULL DEFAULT 0,
      `voertuigen` varchar(1024) DEFAULT '',
      `medewerkers` varchar(1024) DEFAULT '',
      `personeel_nodig` int(11) DEFAULT 0,
      `personeel_gepland` int(11) DEFAULT 0,
      `voertuigen_nodig` int(11) DEFAULT 0,
      `voertuigen_gepland` int(11) DEFAULT 0,
      `created` datetime DEFAULT NULL,
      `modified` datetime DEFAULT NULL,
      `creator` int(11) DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `project` (`project`,`datum`) USING HASH
    1 row in set (0.001 sec)

    MariaDB [test]> explain SELECT COUNT(1) FROM `tmp_projectdays_4` WHERE `project`

    | id   | select_type | table             | type | possible_keys | key  | key_len | ref  | rows | Extra       |
    |    1 | SIMPLE      | tmp_projectdays_4 | ALL  | project       | NULL | NULL    | NULL | 2544 | Using where |
    1 row in set (0.004 sec)

    MariaDB [test]> SELECT COUNT(1) FROM `tmp_projectdays_4` WHERE `project` IN(1540

    | COUNT(1) |
    |     2544 |
    1 row in set (0.025 sec)

    MariaDB [test]> select version();
    | version()          |
    | 10.3.7-MariaDB-log |
    1 row in set (0.021 sec)
    When the job was done properly I see no reasons NOT to share the results.
  • Bug #94747 - "4GB Limit on large_pages shared memory set-up". My former colleague Nikolai Ikhalainen from Percona noted this nice undocumented "feature" (Had I forgotten to advertise Percona recently? Sorry about that...) He proved with a C program that one can create shared memory segments on Linux larger than 4GB, one just had to use the proper data type, unsigned long integer, in MySQL's code. Still, this report ended up as a non-critical bug in "MySQL Server: Documentation" category, or even maybe a feature request internally. What a shame!
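For the record, the workaround suggested in Bug #94629 above can be sketched like this. It is only a sketch under my assumptions: the channel name 'ch1' is hypothetical, MySQL 5.7 statement syntax is used, and replication is not GTID-based (sql_slave_skip_counter cannot be set when GTID mode is ON):

```sql
-- Stop replication threads for ALL channels
STOP SLAVE;

-- Request skipping of one event group (N = 1) by the next START SLAVE
SET GLOBAL sql_slave_skip_counter = 1;

-- Start only the problematic channel ('ch1' is a hypothetical name),
-- so that this channel and no other one consumes the counter
START SLAVE FOR CHANNEL 'ch1';

-- Resume replication for the remaining channels
START SLAVE;
```

Note that every other channel stays stopped while this is being done, and this is exactly why a per-channel option was requested.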
Spring in Paris is nice, as this photo made 3 years ago proves. The way MySQL bug reports are handled this spring is not as nice in some cases.
To summarize:
  1. It seems that recently the fact that some limited workaround is already published somewhere is a good enough reason NOT to accept a valid feature request. Noted.
  2. Regression bugs (reports about drop in performance or problem that had not happened with older version but happens with some recent) are still not marked with regression tag sometimes. Moreover, clear performance regressions in MySQL 8.0.x vs MySQL 5.7.x may end up as just feature requests... A request to "Make MySQL Great Again" maybe?
  3. MySQL engineers who verify bugs often do not care to check all major versions and/or share the results of their tests. This is unfortunate.
  4. Some bugs are not classified properly upon verification. The fact that wrong data type is used is anything but severity 3 documentation problem, really.

Saturday, March 30, 2019

On References to MariaDB and MariaDB Bugs (MDEVs) in MySQL Bug Reports

Recently I noted that some of my comments to public MySQL bug reports got hidden by somebody from Oracle with privileges to do so. I was not able to find out who did that and when, as this information is not communicated to bug subscribers (this may change if my feature request, Bug #94807 - "Subscriber should be notified when comment is made private", is eventually implemented).

When it happened for the first time I thought it was probably non-intentional. When it happened for a second time I complained with a tweet that got few likes and zero comments. Recently this happened again and yet another tweet had not got much attention, but at least I've got a comment via Bug #94797 that my comment there (where I checked test case on MariaDB version I had at hand to find out it's not affected, something I often do for bugs mentioned in my blog posts here) was hidden as irrelevant and "an attempt to advertise MariaDB".

Snow hides everything, good and bad, dog shit, holes in the road and autumn flowers... Do we really want information provided in comments to public MySQL bugs to get hidden just because someone once decided it's "bad"?
I really wonder if any of my readers think that I advertise MariaDB with my public posts or public comments anywhere or specifically in MySQL bug reports?

I'd also like to share here, where no one besides me can hide or delete comments (I hope), what was hidden in the case that caused me to tweet about censorship I have to deal with. In Bug #94610 - "Server stalls because ALTER TABLE on partitioned table holds dict mutex" that ended up as "Not a Bug" (not even a duplicate of a verified Bug #83435 - "ALTER TABLE is very slow when using PARTITIONED table" it referred to and extended with a global mutex usage highlighted and impact explained), I've added the following comment:
"[12 Mar 7:30] Valeriy Kravchuk
Not only it stalls, but if it stalls for long enough time it will crash :)

Useful related reading is here:
The comment was hidden very soon. Now, if you check that link, you'll see a confirmed, unresolved MariaDB bug report. I mostly had this comment to the MDEV-15641 in mind, where my colleague and well known InnoDB developer Marko Mäkelä stated:
"The row_log_table_apply() is actually invoked while holding both dict_sys->mutex and dict_operation_lock. If there is a lot of log to apply, this may actually cause InnoDB to crash."
I may be mistaken in linking these two bug reports together, but isn't highlighting the possibility of a crash due to a long semaphore wait important to understand the impact of the bug report and triage it properly? What wrong may MySQL users and bug report readers see if they follow the link to the MariaDB bug I considered relevant? What was advertised by this comment that is harmful or useless for the MySQL Community?

I was even more surprised by these recent actions on my comments because in the past I had never noted similar approach. Check the following bug reports, for example (I searched for those with "MDEV" and "Kravchuk" in them to get these):
  • Bug #80919 - "MySQL Crashes when Droping Indexes - Long semaphore wait". In this bug report (real bug fixed in 5.7.22) I've added a comment that refers to MDEV-14637. The comment still remains public and, IMHO, is still useful. Providing this link helped to get proper attention to the bug, so it was re-opened and got comments from Oracle engineers finally. Was it an attempt to advertise MariaDB? How is this case different from my comment in Bug #94610 quoted above?
  • Bug #84185 - "Not all "Statements writing to a table with an auto-increment..." are unsafe". I reported this "upstream" MySQL bug based on MDEV-10170 - "Misleading "Statements writing to a table with an auto-increment column after selecting from another table are unsafe" on DELETE ... SELECT", previously found by my colleague Hartmut Holzgraefe. I've also added a link to the "upstream" MySQL bug report to that MDEV. Does anybody in MySQL or MariaDB user communities think that such cross-references are useless, harmful or may be considered as an "attempt to advertise competitor" if any of the vendors fixes the bug first?
  • Bug #48392 - "mysql_upgrade improperly escapes passwords with single quotes". I verified this bug in 2009 while working for MySQL at Sun, and it still remains "Verified" (I had not re-checked if it's still repeatable with current MySQL versions). Then in 2013 community user added a comment referring to the MariaDB bug, MDEV-4664 - "mysql_upgrade crashes if root's password contains an apostrophe/single quotation mark" that was fixed later, in 2015. This comment still remains public and is useful!
So, have my comments that mention MDEVs or MariaDB in general become so irrelevant and MariaDB-advertising recently compared to the previous ones? What exact community standards or rules do they break? Is it now forbidden for any user of the MySQL bugs database to mention MariaDB or bugs in it, use MariaDB in tests to make some point and share the results in public in the MySQL bugs database, or is the problem with me personally doing this?

I'd be happy to read explanations or opinions from MySQL community users and my former Oracle colleagues in comments to this blog post.

Saturday, March 9, 2019

Fun with Bugs #81 - On MySQL Bug Reports I am Subscribed to, Part XVII

Two weeks passed since my previous review of public MySQL bug reports I consider interesting enough to subscribe to them. Over this period I picked up a dozen or so new public bug reports that I'd like to briefly review today.

Here is my recent subscriptions list, starting from the oldest bug reports:
  • Bug #94431 - "Can't upgrade from 5.7 to 8.0 if any database have a hyphen in their name". It seems one actually needs a database like that created in MySQL 5.6 with at least one InnoDB table having FULLTEXT index to hit the problem. Great finding by Phil Murray. Note that after several unsuccessful attempts by others the bug was eventually reproduced and verified by Jesper Wisborg Krogh. Let's hope we'll see it fixed in MySQL 8.0.16.
  • Bug #94435 - "mysql command hangs up and cosume CPU almost 100%". It was reported by Masaaki HIROSE, whose previous related/similar Bug #94219 - "libmysqlclient enters and infinite loop and consume CPU usage 100%" ended up as "Not a bug" (wrongly, IMHO, as nobody cared enough to reproduce the steps instead of commenting on their correctness and checking something else). Bug reporter had not only insisted and provided all the details, but also tried to analyze the reasons of the bug and provided links to other potentially related bug reports (Bug #88428 - "mysql_real_query hangs with EINTR errno (using YASSL)" and Bug #92394 - "libmysqlclient enters infinite loop after signal (race condition)"). Great job and nice to see the bug "Verified" eventually.
  • Bug #94441 - "empty ibuf aio reads in innodb status". This regression vs MySQL 5.6 was noted by Nikolai Ikhalainen from Percona. MariaDB 10.3.7 is also affected, unfortunately:
    I/O thread 9 state: native aio handle (write thread)
    Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
     ibuf aio reads:, log i/o's:, sync i/o's:
    Pending flushes (fsync) log: 0; buffer pool: 0
    1344 OS file reads, 133 OS file writes, 2 OS fsyncs
  • Bug #94448 - "Rewrite LOG_BLOCK_FIRST_REC_GROUP during recovery may be dangerous.". Yet another MySQL 8 regression (not marked with "regression" tag) was found by Kang Wang.
  • Bug #94476 - "mysql semisync replication stuck with master in Waiting to finalize termination". It has "Need feedback" status at the moment. I've subscribed to this report from Shirish Keshava Murthy mostly to find out how a report that may look like a free support request will be processed by Oracle engineers. Pure curiosity, for now.
  • Bug #94504 - "AIO::s_log seems useless". This problem was reported by Yuhui Wang. It's a regression in a sense that part of the code is no longer needed (and seems not to be used) in MySQL 8, but still remains.
  • Bug #94541 - "Assertion on import via Transportable Tablespace". This bug reported by  Daniël van Eeden was verified based on code review and some internal discussion. We do not know if any other version besides 5.7.25 is affected, though. The assertion itself:
    InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == btr_pcur_get_block(cursor)->
    does not seem to be unique. We can find it in MDEV-18455 also (in other context).
  • Bug #94543 - "MySQL does not compile with protobuf 3.7.0". I care about build/compiling bugs historically, as I mostly use MySQL binaries that I built myself from GitHub source. So, I've immediately subscribed to this bug report from Laurynas Biveinis.
  • Bug #94548 - "Optimizer error evaluating JSON_Extract". This bug was reported by Dave Pullin. From my quick test it seems MariaDB 10.3.7 is also affected. Error message is different in the failing case, but the point is the same - the function is not evaluated if the column from derived table that is built using the function is not referenced in the SELECT list. This optimization is questionable and may lead to hidden "bombs" in the application code.
  • Bug #94550 - "generated columns referring to current_timestamp fail". I tried to check simple test case in this bug report by Mario Beck on MariaDB 10.3.7, but it does not seem to accept NOT NULL constraint for generated stored columns at all:
    MariaDB [test]> CREATE TABLE `t2` (
        ->   `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
        ->   `content` varchar(42) DEFAULT NULL,
        ->   `bucket` tinyint(4) GENERATED ALWAYS AS ((floor((to_seconds(`created_at
    `) / 10)) % 3)) STORED NOT NULL);
    ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
    corresponds to your MariaDB server version for the right syntax to use near 'NOT
     NULL)' at line 4
    I do not see this option in the formal syntax described here either. But in case of MariaDB we can actually make sure the generated column is never NULL by adding a CHECK constraint like this:
    MariaDB [test]> CREATE TABLE `t2` (
        ->   `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
        ->   `content` varchar(42) DEFAULT NULL,
        ->   `bucket` tinyint(4) GENERATED ALWAYS AS ((floor((to_seconds(`created_at`) / 10)) % 3)) STORED);
    Query OK, 0 rows affected (0.434 sec)

    MariaDB [test]> INSERT INTO t2 (content) VALUES ("taraaaa");
    Query OK, 1 row affected (0.070 sec)

    MariaDB [test]> alter table t2 add constraint cnn CHECK (`bucket` is NOT NULL);
    Query OK, 1 row affected (1.159 sec)
    Records: 1  Duplicates: 0  Warnings: 0

    MariaDB [test]> INSERT INTO t2 (content) VALUES ("tarabbb");
    Query OK, 1 row affected (0.029 sec)

    MariaDB [test]> INSERT INTO t2 (content) VALUES ("");
    Query OK, 1 row affected (0.043 sec)

    MariaDB [test]> select * from t2;
    | created_at          | content | bucket |
    | 2019-03-09 17:28:03 | taraaaa |      0 |
    | 2019-03-09 17:29:43 | tarabbb |      1 |
    | 2019-03-09 17:29:50 |         |      2 |
    3 rows in set (0.002 sec)

    MariaDB [test]> show create table t2\G
    *************************** 1. row ***************************
           Table: t2
    Create Table: CREATE TABLE `t2` (
      `created_at` timestamp NOT NULL DEFAULT current_timestamp(),
      `content` varchar(42) DEFAULT NULL,
      `bucket` tinyint(4) GENERATED ALWAYS AS (floor(to_seconds(`created_at`) / 10)
    MOD 3) STORED,
      CONSTRAINT `cnn` CHECK (`bucket` is not null)

    1 row in set (0.011 sec)
    So, maybe after all we can state that MariaDB is NOT affected.
  • Bug #94552 - "innodb.virtual_basic fails when valgrind is enabled". I still wonder if anyone in Oracle runs the MTR test suite on Valgrind-enabled builds (-DWITH_VALGRIND=1 cmake option) at least in the process of official release (and if they check the failures). It seems not to be the case based on this bug report from Manuel Ung.
  • Bug #94553 - "Crash in trx_undo_rec_copy". Bernardo Perez noted that as a side effect of still "Verified" Bug #82734 - "trx_undo_rec_copy needlessly relies on buffer pool page alignment" (that affects both MySQL 5.7 and 8.0) we may get crashes while working with generated columns. I hope to see them both fixed soon, but for now Bug #94553 has status "Need Feedback", probably in a hope to get a repeatable test case. I'll watch it carefully.
  • Bug #94560 - "record comparison in spatial index non-leaf rtree node seems incorrect". I doubt spatial indexes of InnoDB are widely used, and I have no doubts there are many bugs waiting to be discovered in this area. This specific bug was reported by Jie Zhou who had also suggested a fix.
  • Bug #94610 - "Server stalls because ALTER TABLE on partitioned table holds dict mutex". My former colleague Justin Swanhart reported this bug just yesterday, so no wonder it is not verified yet. It refers to a well known verified old Bug #83435 - "ALTER TABLE is very slow when using PARTITIONED table"  (that I've also subscribed to immediately) from Roel Van de Paar, affecting both MySQL 5.6 and 5.7. I hope to see this bug verified and fixed soon, as recently I see this kind of state for main thread:
    Main thread process no. 3185, id 140434206619392, state: enforcing dict cache limit
    too often in INNODB STATUS outputs to my liking...
As you could note, I still try to check (at least in some cases) if MariaDB is also affected by the same problem. I think it's a useful check both for me (as I work mostly with MariaDB as a support engineer) and for the reader (to know if switching to MariaDB may help in any way or if there are any chances for MariaDB engineers to contribute anything useful, like a fix).

"Hove, actually". For years residents of Hove used this humorous reply when asked if they live in Brighton... "Regression, actually" is what I want to say (seriously) about every other MySQL bug report I subscribe to... So, you see Hove and many regression bugs above!
To summarize:
  1. Sometimes Oracle engineers demonstrate proper collective effort to understand and carefully verify public bug reports. Good to know they are not ready to give up fast!
  2. I have to copy-paste this item from my previous post. As the list above proves, Oracle engineers still do not use "regression" tag when setting "Verified" status for obviously regression bugs. I think bug reporters should care then to always set it when they report regression of any kind.
  3. It seems there are no regular MTR test runs for Valgrind builds performed by Oracle engineers, or maybe they just ignore failures.

Saturday, February 23, 2019

Fun with Bugs #80 - On MySQL Bug Reports I am Subscribed to, Part XVI

Today I'd like to continue my review of public MySQL bug reports with a list of some bugs I've subscribed to over the last 3 weeks. It's already long enough and includes nice cases to check and share. Note that I usually subscribe to a bug either because it directly affects me or customers I work with, or I consider it technically interesting (so I mostly care about InnoDB, replication, partitioning and optimizer bugs), or it's a "metabug" - a problem in the way a public bug report is handled by Oracle engineers. These are my interests related to MySQL bugs.

As usual, I start with the oldest bugs and try to mention bug reporters by name with links to their other reports whenever this may give something useful to a reader. I try to check if MariaDB is also affected in some cases. Check also my summary comments at the end of this blog post.
  • Bug #94148 - "Unnecessary Shared lock on parent table During UPDATE on a child table". In this bug report Uday Varagani reasonably pointed out that formally there is no need to lock the parent row when a column NOT included in the foreign key gets updated. This happens though when this column is included into the index used to support the foreign key constraint. IMHO it's a reasonable feature request and both Trey Raymond and Sveta Smirnova tried their best to highlight this, but this report now has a "Need Feedback" status with a request to explain the new algorithm suggested. It's simple - "Stop it", check that the column changed is NOT the one the foreign key is defined on, even if it's in the same index... I see no reason NOT to verify this as a reasonable feature request. Is it a new policy that every feature request should come with details on how to implement it? I truly doubt.
  • Bug #94224 - "[5.6] Optimizer reconsiders index based on index definition order, not value". Domas Mituzas found yet another case (see also Bug #36817 - "Non optimal index choice, depending on index creation order" from Jocelyn Fournier, the bug I verified more than 10 years ago) when in MySQL order of index definition matters more for optimizer than anything else.  My quick check shows that MariaDB 10.3.7 is not affected:
    MariaDB [test]> explain select distinct b from t1 where c not in (0) and d > 0;
    | id   | select_type | table | type  | possible_keys | key                | key_len | ref  | rows | Extra       |
    |    1 | SIMPLE      | t1    | index | NULL          | non_covering_index | 9       | NULL |    1 | Using where |
    1 row in set (0.002 sec)

    MariaDB [test]> alter table t1 add index covering_index (b, c, d);
    Query OK, 0 rows affected (0.149 sec)
    Records: 0  Duplicates: 0  Warnings: 0

    MariaDB [test]> explain select distinct b from t1 where c not in (0) and d > 0;
    | id   | select_type | table | type  | possible_keys | key            | key_len | ref  | rows | Extra                    |
    |    1 | SIMPLE      | t1    | index | NULL          | covering_index | 14      | NULL |    1 | Using where; Using index |
    1 row in set (0.025 sec)
    Fortunately MySQL 8 is no longer affected. Unfortunately we do not see a public comment showing the results of testing on MySQL 5.7 (or any version, for that matter) from the engineer who verified the bug. I already pointed out in my previous blog post that this "metabug" is becoming popular.
  • Bug #94243 - "WL#9508 introduced non-idiomatic potentially-broken C macros". Laurynas Biveinis from Percona found new code that in an ideal world would not pass any serious code review.
  • Bug #94251 - "Aggregate function result is dependent by window is defined directly or as named". This bug was reported by Владислав Сокол. From what I see:
    MariaDB [test]> WITH RECURSIVE cte AS (
        -> SELECT 1 num
        -> UNION ALL
        -> SELECT num+1 FROM cte WHERE num < 5
        -> )
        -> SELECT num, COUNT(*) OVER (frame) cnt_named, COUNT(*) OVER (ORDER BY num
    DESC) cnt_direct
        -> FROM cte
        -> WINDOW frame AS (ORDER BY num DESC);
    | num  | cnt_named | cnt_direct |
    |    1 |         5 |          5 |
    |    2 |         4 |          4 |
    |    3 |         3 |          3 |
    |    4 |         2 |          2 |
    |    5 |         1 |          1 |
    5 rows in set (0.117 sec)

    MariaDB [test]> WITH RECURSIVE cte AS (
        -> SELECT 1 num
        -> UNION ALL
        -> SELECT num+1 FROM cte WHERE num < 5
        -> )
        -> SELECT num, COUNT(*) OVER (frame) cnt_named, COUNT(*) OVER (ORDER BY num
    DESC) cnt_direct
        -> FROM cte
        -> WINDOW frame AS (ORDER BY num DESC)
        -> ORDER BY num desc;
    | num  | cnt_named | cnt_direct |
    |    5 |         1 |          1 |
    |    4 |         2 |          2 |
    |    3 |         3 |          3 |
    |    2 |         4 |          4 |
    |    1 |         5 |          5 |
    5 rows in set (0.003 sec)
    MariaDB 10.3.7 is NOT affected.
  • Bug #94283 - "MySQL 8.0.15 is slower than MySQL 5.7.25". Percona's CTO Vadim Tkachenko reported that MySQL 8.0.15 is notably slower than 5.7.25 on a simple oltp_read_write sysbench test. He has recently written a separate blog post about this, with more details. There is one detail to clarify based on today's comment from Peter Zaitsev (was the same default character set used?), but as my dear friend Sinisa Milivojevic verified the bug without any questions, requests or his own test outputs shared, we can assume that Oracle officially accepted this performance regression (even though "regression" tag was not set).

    Check also later Bug #94387 - "MySQL 8.0.15 is slower than MySQL 5.7.25 in read only workloads", yet another performance regression report from Vadim, where he found that on read only (sysbench oltp_point_select) all in memory workloads MySQL 8.0.15 may also be slower than MySQL 5.7.25.
  • Bug #94302 - "reset master could not break dump thread in some cases". This bug was reported by Ashe Sun. This is definitely a corner case, as it happens only when the master is still writing to the very first binary log. We can not find out from public comments in the bug report if any other versions besides 5.7.x are affected. This is yet another "metabug" - during my days in Oracle's MySQL bugs verification team we had to check on all versions still supported and present the results explicitly.
  • Bug #94319 - "Format_description_log_event::write can cause segfaults". Nice bug report by Manuel Ung from Facebook.
  • Bug #94330 - "Test for possible compressed failures before upgrade?". Change of zlib version starting from MySQL 5.7.24 means that some operations for InnoDB tables with ROW_FORMAT=COMPRESSED that previously worked may start to fail. In this report Monty Solomon asks for some way to determine if there will be a problem with existing compressed tables before upgrading to 5.7.24. The bug is still "Open".
  • Bug #94338 - "Dirty read-like behavior in READ COMMITTED transaction". Bug reporter, Masaki Oguro, stated that MySQL 8 is not affected (only 5.6 and 5.7) and the bug is verified on these versions, so we should assume it's really the case. But I miss public comment showing the result of testing on recent MySQL 8.0.15.
  • Bug #94340 - "backwards incompatible changes in 8.0: Error number: 3747". Simon Mudd complains about incompatible change in 8.0.13 that does not allow slave to easily switch from SBR to RBR without restart (and was not clearly documented as a change in behavior). Make sure to read all comments.
  • Bug #94370 - "Performance regression of btr_cur_prefetch_siblings". Nice bug report with a patch from Zhai Weixiang.
  • Bug #94383 - "simple ALTER cause unnecessary InnoDB index rebuilds, 5.7.23 or later 5.7 rlses". In this bug report Mikhail Izioumtchenko presented the detailed analysis and suggested diagnostics patches to show what really happens and why. This bug is also a regression of a kind, so while testing results are presented, I still think that it could be processed better according to the good old rules I have in mind.
  • Bug #94394 - "Absence of mysql.user leads to auto-apply of --skip-grant-tables". Great finding by Ceri Williams from Percona. Sveta Smirnova provided a separate MTR test case and clarified the impact of the bug. Surely this is also a regression compared to MySQL 5.7, as there you can not start MySQL if the mysql.user table is missing. I leave it to a reader to decide if there is any security-related impact of this bug...
  • Bug #94396 - "Error message too broad: The used command is not allowed with this MySQL version". This bug was reported by my former colleague in Percona Support, famous Bill Karwin. Informative error messages matter for good user experience.
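For Bug #94251 above, the expected numbers can be cross-checked without any server. The following is a minimal Python sketch, not MySQL or MariaDB code (the function name is made up for illustration). It models COUNT(*) OVER (ORDER BY num DESC) under the default frame that applies when ORDER BY is present and no frame is given, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. A named window and an inline one with the same definition should both match this model:

```python
def running_count_desc(nums):
    """Model COUNT(*) OVER (ORDER BY num DESC) with the default frame.

    With ORDER BY and no explicit frame, the frame is
    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so every row
    counts all rows that sort before it plus its peers (equal values).
    """
    return {n: sum(1 for m in nums if m >= n) for n in nums}


# Same data as the recursive CTE in the report: num = 1..5
nums = [1, 2, 3, 4, 5]
counts = running_count_desc(nums)
for n in sorted(nums, reverse=True):
    print(n, counts[n])
```

The pairs printed agree with the cnt_named and cnt_direct columns in the MariaDB 10.3.7 output above, whatever order the rows are returned in.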
We rely on MySQL in the same way as those guys on top of the dolphin pyramid on this strange monument in some court somewhere at the Lanes. A reliable foundation matters, so regressions should better be avoided.
To summarize:
  1. Looks like it's time for Oracle to spend some efforts to make MySQL 8 great again, by fixing some of the bugs mentioned above, especially performance regressions vs MySQL 5.7 found recently by Vadim Tkachenko from Percona.
  2. Oracle continues to introduce backward-incompatible changes in behavior in minor MySQL 8.0.x releases at GA stage. This is not really good for any production environment.
  3. Asking bug reporters to provide "the basics of such a new algorithm" when they complain that current one is wrong or not optimal is a new word in bugs processing!
  4. When I joined MySQL bugs verification team in 2005 we've set up a culture of bugs processing that included, among other things, presenting in a public comment any successful or unsuccessful attempt to verify the bug, by copy-pasting all commands and statements used along with the outputs, whenever possible and with enough context to show what was really checked. I've learned this approach from Oracle's Tom Kyte over the previous 10 years when I followed him closely. I used to think it's standard for more than a decade already, a kind of my (and not only my) "heritage". It's sad to see this approach is no longer followed by many Oracle engineers who process bugs, in too many cases.
  5. Oracle engineers still do not use "regression" tag when setting "Verified" status for obviously regression bugs. I think bug reporters should care then to always set it when they report regression of any kind.
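
As for "the basics of such a new algorithm" requested in Bug #94148, the check suggested above fits in a few lines. This is a hypothetical Python sketch of the decision only, not InnoDB code: the function and column names are invented for illustration, and it assumes the server knows which columns an UPDATE actually changes.

```python
def parent_lock_needed(changed_columns, fk_columns):
    """Return True only if the UPDATE changes a column of the foreign key.

    Columns that merely share an index with the foreign key columns
    do not count (this is the behavior requested in the bug report,
    not what the server did at the time).
    """
    return not set(changed_columns).isdisjoint(fk_columns)


# Hypothetical child table: FOREIGN KEY (parent_id), index on (parent_id, status)
fk_columns = {"parent_id"}
print(parent_lock_needed({"status"}, fk_columns))     # status is only in the index
print(parent_lock_needed({"parent_id"}, fk_columns))  # the FK column itself changes
```

Only the second call asks for a lock on the parent row. The real implementation would surely have more corner cases to consider, but the idea itself needs no further explanation.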

Saturday, February 16, 2019

Fun with Bugs #79 - On MySQL Bug Reports I am Subscribed to, Part XV

More than 3 weeks passed since my previous review of public MySQL bug reports I am subscribed to, so it's time to present some of the bugs I've considered interesting in January, 2019.

As usual, I'll review them starting from the oldest and try to summarize my feelings about these bugs at the end of this post. Here they are:
  • Bug #93806 - "Document error about ON DUPLICATE KEY UPDATE". Years pass, but fine MySQL manual still does not explain some cases of InnoDB locking properly. Xiaobin Lin found yet another case that it does not explain properly. Or, maybe, the manual is correct and the problem in the implementation? MariaDB 10.3.7 shows the same behavior.
  • Bug #93827 - "dict_index_has_desc() is not efficient". Yet another bug report from Zhai Weixiang. I see 50 still active bug reports from him! Maybe Oracle should send some nice T-shirts to top N most productive bug reporters?
  • Bug #93845 - "Optimizer choose wrong index, sorting index instead of filtering index". Yet another bug report of a known class, this time from Daniele Renda. It's a good example of optimizer trace usage to make a point. Note also that using ANALYZE ... UPDATE HISTOGRAMS does not help. As a side note, implementation of optimizer trace for MariaDB is finally in progress and should be done for upcoming 10.4. See MDEV-6111 for the details if you care.
  • Bug #93875 - "mysqldump per-table dump is slow since 5.7 on instances with many tables". This performance regression bug (that was "verified" without adding the regression tag) was reported by Nikolai Ikhalainen from Percona. This bug report is a nice example of using Docker to create easily repeatable test cases for bug reports.
  • Bug #93878 - "innodb_status_output fails to restore to old value". This great bug report from Yuhui Wang not only describes 3 cases when InnoDB status is printed to the error log automatically, but also shows that in one of these cases, when we can not find a free block in the buffer pool in 20 loops, this printing is not stopped after the problem is resolved, and provides a patch that resolves the problem. See also his nice Bug #94065 - "MySQL fails to startup when setting persist variable" with detailed analysis of the problem.
  • Bug #93917 - "Wrong binlog entry for BLOB on a blackhole intermediary master". Nice corner case was found by Sveta Smirnova from Percona. With her 52 "Verified" bug reports at the moment she also deserves a T-shirt from Oracle as one of top bug reporters!
  • Bug #93922 - "UNION ALL very slow with SUM(0)". This weird bug was found and reported by Sergio Paternoster. He had to spend notable efforts to see this bug "Verified"...
  • Bug #93948 - "XID inconsistency on master-slave with CTAS". Krunal Bauskar from Percona noted this inconsistency in XID generation on slave vs master. Let's wait and check if it ends up as "Not a bug".
  • Bug #93957 - "slave_compressed_protocol doesn't work with semi-sync replication in MySQL-5.7". This bug report from Pavel Katiushyn also looks like a regression, as a similar bug was fixed in an older 5.7.x release. But I do not see any public comment with a verification attempt either in recent 5.7 or in recent 8.0 (where the older bug also had to be fixed). So, the bug is "verified", but the real impact and versions affected are not clear.
  • Bug #93963 - "Slow query log doesn't log a slow CREATE INDEX with admin statements enabled". This clear and properly tagged regression vs MySQL 5.7 was reported by Jeremy Smyth.
  • Bug #93986 - "Transactions in serializable mode are not actually serializable". I've subscribed to this bug report mostly for (expected) fun of reading further comments. It's still "Need feedback", but single comment so far is worth reading.
  • Bug #94121 - "Enable hardware CRC32 under Valgrind". Laurynas Biveinis from Percona also provided a patch for this 8 years old problem.
  • Bug #94130 - "XA COMMIT may lead replication broken". Yet another proof that XA transactions implementation is broken in MySQL. This time from Phoenix Zhang and in semi-sync replication case.
This photo reminds me of the current state of MySQL bugs processing in Oracle - it seems there is no clear and straightforward way to follow. Everything is fuzzy these days...

There are few more bugs reported in January, 2019 that I am watching, but their status is not yet clearly defined, so I decided to skip them in this review.

To summarize:
  1. Oracle engineers who process bugs still do not add regression tag to many regression bugs. This is a shame, really. If I were their boss I'd make this a policy and one of important KPI values to monitor.
  2. In some cases bugs get verified immediately without any demonstrated attempt to show how the check was performed, while in other cases poor bug reporters have to fight hard to re-make their point and get a real check done. It seems these days good old approaches to bugs verification are not followed strictly by some Oracle engineers.

Saturday, February 9, 2019

On my Favorite FOSDEM 2019 MySQL, MariaDB and Friends Devroom Talks

This year I had not only spoken about MySQL bugs reporting at FOSDEM, but spent almost the entire day listening at MySQL, MariaDB and Friends Devroom. I missed only one talk, on ProxySQL, (to get some water, drink a bottle of famous Belgian beer and chat with my former colleague in MySQL support team, Geert, whom I had not seen for a decade). So, for the first time out of my 4 FOSDEM visits I've got a first hand impression about the entire set of talks in the devroom that I want to share today, while I still remember my feelings.

Most of the talks have both slides and videos already uploaded on site, so you can check them and make your own conclusions, but my top 5 favorite talks (that have both videos and slides already available to community) were the following:

  • "Un-split brain (aka Move Back in Time) MySQL", by Shlomi Noach. You can find slides at SlideShare.

    This was a replacement talk that was really interesting and had proper style for FOSDEM. It was mostly a nice background story of creation of the gh-mysql-rewind tool, a shell script that uses MariaDB's mysqlbinlog --flashback option and MySQL GTIDs and allows one to "rewind" the row-based binary log to roll back transactions to some previous point in time. The tool should become available to community soon, maybe as a part of orchestrator. I was impressed how one can successfully use 49 slides for a 20-minute talk. That's far beyond my current presentation skills...
  • "Test complex database systems in a laptop with dbdeployer", by Giuseppe Maxia. You can find slides at SlideShare.

    I've already built and used dbdeployer, as described in my blog post, so I was really interested in the talk. Giuseppe was able not only to show 45 slides over 20 minutes and explain all the reasons behind re-implementing MySQL-Sandbox in Go, but also run a live demo where dozens of sandbox instances were created and used. Very impressive!
  • "MySQL and the CAP theorem: relevance & misconceptions", second great talk and show by Shlomi Noach. You can find slides at SlideShare.

    The "CAP theorem" says that a distributed database system (like any kind of MySQL replication setup) can only have 2 of the 3 features: (atomic) Consistency, (high) Availability and Partition Tolerance. This can be proved mathematically, but Shlomi not only defined the terms and conditions needed to present the formal proof, but also explained that they are far from the real production objectives of any engineer or DBA (like 99.95% availability). He showed typical MySQL setups (from simple async master-slave replication to Galera, group replication and even Vitess) and proved that formally they all are neither consistent nor available from that formal CAP theorem point of view, while, as we all know, they are practically useful and work (and with some effort, proxies on top etc. can be made both highly available and highly consistent for practical purposes). So, the CAP theorem neither represents real production systems nor meets their real requirements. We've also got some kind of explanation of why async master-master or circular replication are still popular... All that in 48 slides, with links, and presented in 20 minutes! Greatest short MySQL-related talk I've ever attended.
  • "TiDB: Distributed, horizontally scalable, MySQL compatible", by Morgan Tocker. You can find slides at SlideShare.

    It was probably the first time when I listened to Morgan, even though we worked together for a long time. I liked his way of explaining the architecture of this yet another database system speaking MySQL protocol and reasons to create it. If you are interested in performance of this system, check this blog post.
  • "MySQL 8.0 Document Store: How to Mix NoSQL & SQL in MySQL 8.0", by Frédéric Descamps. You can find slides (70!) at SlideShare.

    LeFred managed to get me somewhat interested in MySQL Shell and new JSON functions in MySQL, way more than ever before. It's even more surprising given that his talk was the last one and we had already spent 8+ hours listening before he started. His simple step by step explanation of how one may get the best of both SQL, ACID and NoSQL (JSON, "MongoDB") worlds, if needed, in a single database management system, was impressive. Also this talk probably caused the longest discussion and the largest number of questions from those remaining attendees.

    He was also one of two "hosts" and "managers" of the devroom, so I am really thankful to him for his efforts year after year to make MySQL devroom at FOSDEM great!
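
To illustrate the "rewind" idea from Shlomi's first talk above: when full row images are logged, every row event carries enough data to be undone, so a series of changes can be rolled back by replaying inverted events in reverse order. Below is a toy Python sketch of that idea only. It is not the gh-mysql-rewind script and not the mysqlbinlog --flashback implementation; the event layout (tuples of kind, before image, after image) and all names are assumptions made for this example.

```python
def invert(event):
    """Invert one row event: INSERT <-> DELETE, UPDATE swaps its images."""
    kind, before, after = event
    if kind == "insert":
        return ("delete", after, None)
    if kind == "delete":
        return ("insert", None, before)
    if kind == "update":
        return ("update", after, before)
    raise ValueError(f"unknown event kind: {kind}")


def rewind(events):
    """Build the undo stream: inverted events, newest first."""
    return [invert(e) for e in reversed(events)]


def apply_events(table, events):
    """Apply row events to a dict keyed by primary key (toy storage)."""
    for kind, before, after in events:
        if kind in ("delete", "update"):
            del table[before["id"]]
        if kind in ("insert", "update"):
            table[after["id"]] = after
    return table


# Hypothetical log of row events against an initially empty table
log = [
    ("insert", None, {"id": 1, "v": "a"}),
    ("update", {"id": 1, "v": "a"}, {"id": 1, "v": "b"}),
    ("insert", None, {"id": 2, "v": "c"}),
    ("delete", {"id": 2, "v": "c"}, None),
]
table = apply_events({}, log)             # state after the logged changes
table = apply_events(table, rewind(log))  # replay the undo stream
print(table)
```

Replaying the undo stream brings the toy table back to its initial empty state. The real tool additionally has to decide, based on GTIDs, which part of the binary log to rewind.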
There were more good talks, but I had to pick up few that already have slides shared and those of a kind that I personally prefer to listen to at FOSDEM. This year I also missed few people whom I like to see and talk to at FOSDEM, namely Mark Callaghan and Jean-François Gagné.

The only photo I made with my Nokia dumb phone this year in Brussels, on my way to FOSDEM on February 2. We've got snow and rain that morning, nice for anyone who had to walk 5 kilometers to the ULB campus.
Overall, based on my experience this year, it still makes a lot of sense to visit FOSDEM for anyone interested in MySQL. You can hardly find so many good, different MySQL-related talks in a single day at any other conference.