Showing posts with label DDL. Show all posts

Saturday, January 18, 2020

Fun with Bugs #92 - On MySQL Bug Reports I am Subscribed to, Part XXVI

I'd like to continue reviewing MySQL bug reports from Community users that I considered interesting and subscribed to. Unlike in the previous post in this series, I am not going to check test cases on any competitor product, but will use only recently released MySQL 5.7.29 and 8.0.19 for checks, if any. This time I'll concentrate on bugs reported in November 2019.

As usual, I mostly care about optimizer, InnoDB and replication related bugs. Here is the list:
  • Bug #97476 - "Range optimizer skips rows". This bug reported by Ilya Raudsepp looks like a clear regression in MySQL 8.0.x compared to MySQL 5.7.x at least. I get the following correct results with 5.7.29:
    mysql> SELECT t.id
        -> FROM Test t
        -> JOIN (
        ->     SELECT item_id, MAX(created_at) AS created_at
        ->     FROM Test t
        ->     WHERE (platform_id = 2) AND (item_id IN (3,2,111)) AND (type = 'Default')
        ->     GROUP BY item_id
        -> ) t2 ON
        ->   t.item_id = t2.item_id
        ->   AND t.created_at = t2.created_at
        ->   AND t.type = 'Default'
        -> WHERE t.platform_id = 2;
    +----+
    | id |
    +----+
    |  6 |
    |  3 |
    |  5 |
    +----+
    3 rows in set (0,03 sec)

    mysql> select version();
    +-----------+
    | version() |
    +-----------+
    | 5.7.29    |
    +-----------+
    1 row in set (0,02 sec)
  • Bug #97531 - "5.7 replication breakage with syntax error with GRANT management". This tricky bug reported by Simon Mudd applies also to MySQL 8.0.x. It is closed as fixed, but the fix did not make it into the recent 5.7.29 and 8.0.19 releases, so you'll have to wait a few more months.
  • Bug #97552 - "Regression: LEFT JOIN with Impossible ON condition performs slowly". Yet another optimizer regression in MySQL 8 (compared to 5.7.x) that is fixed only in MySQL 8.0.20+. The bug was reported by Fredric Johansson.
  • Bug #97648 - "Bug in order by clause in union clause". Yet another regression (at least from the user's point of view) in recent MySQL 5.7.x and 8.0.x compared to 5.6.x. This time without a "regression" tag. The bug was reported by Andrei Mart.
  • Bug #97662 - "MySQL v8.0.18 FIPS mode is no longer supported". According to Ryan L, MySQL 8.0.18+ is no longer supporting ssl_fips_mode=STRICT, as OpenSSL 1.1.1 is not FIPS-compatible and MySQL Server must be compiled using OpenSSL 1.1.1 or higher. That's interesting. Check also this link.
  • Bug #97682 - "Handler fails to trigger on Error 1049 or SQLSTATE 42000 or plain sqlexception". This regression (comparing to MySQL 5.7) was reported by Jericho Rivera. It is fixed in MySQL 8.0.20. The patch was provided by Kamil Holubicki.
  • Bug #97692 - "Querying information_schema.TABLES issue". I do not see any documented attempt to check on MySQL 8.0, so I had to add a comment to the bug report. From what I see, in MySQL 8.0.19 we still get different (empty) result from the second query, but at least now we have a warning:
    mysql> SELECT ts.TABLE_SCHEMA
        -> FROM information_schema.TABLES ts
        -> WHERE ts.TABLE_TYPE ='VIEW'
        -> AND ts.TABLE_SCHEMA NOT IN ('sys')
        -> AND ts.TABLE_COMMENT LIKE '%invalid%';
    +--------------+
    | TABLE_SCHEMA |
    +--------------+
    | test         |
    +--------------+
    1 row in set, 1 warning (0,00 sec)

    mysql> show warnings\G
    *************************** 1. row ***************************
      Level: Warning
       Code: 1356
    Message: View 'test.v' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them
    1 row in set (0,00 sec)

    mysql> select version();
    +-----------+
    | version() |
    +-----------+
    | 8.0.19    |
    +-----------+
    1 row in set (0,00 sec)
    The bug was reported by Vinicius Malvestio Grippa.
  • Bug #97693 - "ALTER USER user IDENTIFIED BY 'password' broken by invalid authentication_string". The bug was reported by Nikolai Ikhalainen. MySQL 8.0.19 is still affected.
  • Bug #97694 - "MySQL 8.0.18 fails on STOP SLAVE/START SLAVE stress test". For some reason I do not see any documented attempt to verify this on MySQL 5.7 also. The bug was reported by Przemysław Skibiński, who also suggested a fix.
  • Bug #97734 - "Document the correct method to stop slaving with MTS without a warning or error". I can only agree with this request from Buchan Milne. Please, do :)
  • Bug #97735 - "ALTER USER IF EXISTS ... WITH_MAX_USER_CONNECTIONS 9999 not applied correctly". Yet another bug report by Simon Mudd in this list. For some reason, again, I do not see any documented attempt to verify the bug on MySQL 8.0.x, while there is no clear reason to think it is not affected.
  • Bug #97742 - "bad item ref from correlated subquery to outer distinct table". This bug was reported by Song Zhibai, who also contributed a patch. Based on further comments from Øystein Grøvlen and these results:
    mysql> EXPLAIN SELECT f3 FROM t1 HAVING (SELECT 1 FROM t2 HAVING f2 LIMIT 1);
    +----+--------------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
    | id | select_type        | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
    +----+--------------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
    |  1 | PRIMARY            | t1    | NULL       | ALL   | NULL          | NULL    | NULL    | NULL |    3 |   100.00 | NULL        |
    |  2 | DEPENDENT SUBQUERY | t2    | NULL       | index | NULL          | PRIMARY | 4       | NULL |    1 |   100.00 | Using index |
    +----+--------------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
    2 rows in set, 2 warnings (0,00 sec)

    mysql> show warnings\G
    *************************** 1. row ***************************
      Level: Note
       Code: 1276
    Message: Field or reference 'f2' of SELECT #2 was resolved in SELECT #1
    *************************** 2. row ***************************
      Level: Note
       Code: 1003
    Message: /* select#1 */ select `test`.`t1`.`f3` AS `f3` from `test`.`t1` having (/* select#2 */ select 1 from `test`.`t2` having `test`.`t1`.`f2` limit 1)
    2 rows in set (0,00 sec)

    mysql> select version();
    +-----------+
    | version() |
    +-----------+
    | 5.7.29    |
    +-----------+
    1 row in set (0,00 sec)
    I'd say that MySQL 5.7.x is also affected, but for some reason nobody documented any attempt to verify it there. So, I've added a comment.
  • Bug #97777 - "separate global variables (from hot variables) using linker script (ELF)". Beautiful bug report from Daniel Black, with a lot of details, perf and readelf outputs, and a patch contributed. See also his Bug #97822 - "buf_page_get_gen buf_pool->stat.n_page_gets++ is a cpu waste", with perf analysis down to a single assembler instruction level and a fix suggested.
  • Bug #97825 - "dd_mdl_acquire in dd_table_open with dict_sys->mutex hold may cause deadlock". Here I am really puzzled by no visible attempt to check the arguments of bug reporter, Dave Do, who tried to perform lock order analysis by code review. All we see as a result is this:
    "Lock order could be different, but it is irrelevant, since these are locks on totally different levels and can't, in themselves, cause any deadlock."
    What a great argument! Not a bug, surely... We trust you.
"What bugs are you talking about? I have no bugs, neither does MySQL 8!"
To summarize:
  1. MySQL 8 introduces some optimizer (and some other) regressions. They seem to be fixed fast enough, but I wonder why only Community users were able to find them, not Oracle's QA...
  2. MySQL 8.0.19 is surely great, but I see many serious bugs fixed only in 8.0.20+.
  3. Percona, Booking and Facebook engineers still continue contributing high quality bug reports, comments/verification details and patches. Oracle is lucky to have such nice partners in making MySQL better.
  4. I still see problems with following proper verification procedures and documenting the results. Too often the bug reported for 8.0.x is NOT checked on 5.7.x as well, regression tag is not set, and so on. Sometimes reports are closed as "Not a bug" without any attempt to follow the analysis provided or prove the point. This is sad and wrong.

Saturday, August 31, 2019

Fun with Bugs #89 - On MySQL Bug Reports I am Subscribed to, Part XXIII

I have to celebrate the anniversary of my last day in Oracle (that was 7 years ago!) somehow, and I think writing yet another blog post about Oracle MySQL bugs is a good way to do this. I am actually surprised (and happy) that public bugs database is still alive, maintained and considered important in Oracle, and I know who in Oracle was working hard all these years for this to happen!

In my previous post in this series I stopped at Bug #95954 and did not complete the review of interesting MySQL bug reports that I subscribed to in June 2019. So, below I start with the next bug in my list, complete the review for June and cover some bugs reported in July. There were many.
  • Bug #95957 - "IN operator issue when comparing signed column and the column cast to unsigned". This bug was reported by Manuel Rigger. As far as I can see, MariaDB 10.3 is not affected:
    MariaDB [test]> CREATE TABLE t0(c0 INT);
    Query OK, 0 rows affected (0.518 sec)

    MariaDB [test]> INSERT INTO t0(c0) VALUES(-1);
    Query OK, 1 row affected (0.196 sec)

    MariaDB [test]> SELECT t0.c0 IN (1, CAST(t0.c0 AS UNSIGNED)) from t0;
    +---------------------------------------+
    | t0.c0 IN (1, CAST(t0.c0 AS UNSIGNED)) |
    +---------------------------------------+
    |                                     0 |
    +---------------------------------------+
    1 row in set, 1 warning (0.207 sec)

    MariaDB [test]> show warnings\G
    *************************** 1. row ***************************
      Level: Note
       Code: 1105
    Message: Cast to unsigned converted negative integer to it's positive complement

    1 row in set (0.013 sec)
  • Bug #96001 - "No warning when creating foreign key in MyISAM tables". I am really surprised that this "documented feature" was accepted as an S3 bug when reported by Przemyslaw Malkowski from Percona. But this happened. I also prefer NOT to have unsupported syntax just ignored silently and would like to see a warning (or error in strict mode).
  • Bug #96002 - "'variable log_bin_trust_function_creators' -variable is "hidden"." Consistency is important, as well as correct documentation. So I was happy to see this bug report from Peter Laursen. 75(!) of his bug reports are still active, by the way, some were reported more than 10 years ago. I wonder if anyone is going to check (if not fix) them any time soon.
  • Bug #96079 - "large_tests.innodb_innochecksum_3gb test failing with debug build." Yet another MTR test failure reported by Lalit Choudhary from Percona.
  • Bug #96100 - "Generated column cause a heap-use-after-free error". Probably ASan builds are not tested as carefully in Oracle as they are by some community members, like Zkong Kong who reported this bug. Otherwise they would mark this bug report as a duplicate of some known internal bug.
  • Bug #96108 - "To run mtr "innodb.log_flush_order" MySql Server would be always core down". I've listed the bug as an example showing that even though code modification is needed to reproduce the crash (reported by Juncai Meng) literally, it was accepted and "Verified". In other reports this sometimes does not happen, and the point is that it is surely not a rule carved in stone in Oracle to NOT accept bugs if the test case involves code modification. Remember that and fight for your reports if needed.
  • Bug #96128 - "Doc: documentation is inaccurate when InnoDB starts with innodb_read_only". Correct manual matters a lot, so nice to have it corrected in this case by Calvin Sun.
  • Bug #96134 - "Please provide control functions for the IO Thread." I'd be also happy to see a way to control replication I/O thread progress and read logs only up to some position or GTID, per channel. Thanks Jean-François Gagné for this nice feature request.
  • Bug #96142 - "Inconsistent error on slave for Update event on table with non-exists partition". Yet another bug report from Lalit Choudhary. Good to see multiple versions checked.
  • Bug #96148 - "using Invisible Index when slave apply EVENT". This bug was reported by Songlei Wang. Consistency matters, so if the index is invisible it should remain invisible for the replication SQL thread as well. See also his other report, Bug #96150 - "'show slave status' show the Inaccurate Last_IO_Error message".
  • Bug #96167 - "Many header files now missing from devel package". As noted by Manuel Ung, now it is impossible to build plugins and UDFs unless users download the source tree, and then copy the headers to the appropriate places. Packaging is hard.
  • Bug #96178 - "mysqldump leaks memory when selected tables are dumped with --order-by-primary". Abhinav Sharma proposed a simple MTR test case to run on ASan build, and suggested a fix. Very nice bug report. Unfortunately I do not see any statements about the results of checking MySQL 8.0.x.
  • Bug #96192 - "Possible race condition with binlog-transaction-dependency-tracking". Bug reporter, Herman Lee, complained about one place in the code where a race condition may happen even after the fix for one MySQL bug. Does it really matter for bug verification if he found more places? I fail to see a reason to keep the bug in "Need Feedback" status, when code review is enough to confirm there is a problem in that one clearly identified part of the code.
  • Bug #96196 - "performance_schema_accounts_size and p_s_hosts_size limited by 16384". Nice bug that can be confirmed by code review or just by opening many connections. It was reported by Nikolai Ikhalainen from Percona. Autoscaling is broken/limited in this case.
  • Bug #96340 - "Slow startup for mysql 8.0 with many tables due to the tablespace files scan". This is actually a regression compared to 5.7, but it's visible on slow disks. It's nice to see useful discussion, explanations and a patch suggested (by Sunny Bains) in this bug report created by Lalit Choudhary.
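To make the complaint in Bug #96001 above more concrete, here is a minimal sketch of what "ignored silently" means. The table names fk_parent and fk_child are hypothetical (they are not from the bug report), and I assume a MySQL 5.7 or 8.0 server where MyISAM is available:

```sql
-- Hypothetical tables for illustration only; not taken from Bug #96001.
CREATE TABLE fk_parent (
  id INT PRIMARY KEY
) ENGINE=MyISAM;

-- MyISAM does not support foreign keys: the FOREIGN KEY clause below
-- is parsed and then ignored by the server.
CREATE TABLE fk_child (
  id INT PRIMARY KEY,
  parent_id INT,
  FOREIGN KEY (parent_id) REFERENCES fk_parent (id)
) ENGINE=MyISAM;

-- This is where one would hope to see a warning about the ignored clause.
SHOW WARNINGS;

-- No constraint is enforced, so a row pointing to a missing parent is accepted.
INSERT INTO fk_child (id, parent_id) VALUES (1, 42);

-- Check what was actually stored in the table definition.
SHOW CREATE TABLE fk_child\G
```

The last statement uses the \G terminator of the mysql command line client, as in the other outputs in this post.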
I started this summer in beautiful Barcelona. This blog post ends it for me. One of the good changes that leaving Oracle seven years ago introduced in my life is a real freedom to work from anywhere and travel as often as I really want, both for work and for fun.
To summarize:
  1. I am happy to see MySQL public bugs database still up and widely used, even though I have not worked on it directly for 7 years already. It's a key service and medium for cooperation for the entire MySQL Community! Just check how it happens in Bug #96340.
  2. Consistency matters.
  3. I still see cases when the time is wasted at bugs verification stage.
  4. MySQL fine manual still has details explained incorrectly, and this is unfortunate.
  5. Sometimes I wonder why Percona engineers and other MySQL Community users manage to find even MTR test failures faster than anyone in Oracle cares to report and fix them. They all know magic (like actually running all tests on debug builds and checking the results, maybe). Or, maybe, they care?
* * *
Shameless self-promotion at the end. First time since 2015 I am going to attend Percona Live conference and speak there. Ticket prices increase on September 1, so using code CMESPEAK-VALERII you’ll get the best deal right now.

Sunday, August 4, 2019

Fun with Bugs #87 - On MySQL Bug Reports I am Subscribed to, Part XXI

After a 3 months long break I'd like to continue reviewing MySQL bug reports that I am subscribed to. This issue is devoted to bug reports I've considered interesting to follow in May, 2019:
  • Bug #95215 - "Memory lifetime of variables between check and update incorrectly managed". As demonstrated by Manuel Ung, there is a problem with all InnoDB MYSQL_SYSVAR_STR variables that can be dynamically updated. Valgrind helps to highlight it.
  • Bug #95218 - "Virtual generated column altered unexpectedly when table definition changed". This weird bug (that does not seem to be repeatable on MariaDB 10.3.7 with proper test case modifications like removing NOT NULL and collation settings from virtual column) was reported by Joseph Choi. Unfortunately we do not see any documented attempt to check if MySQL 8.0.x is also affected. My quick test shows MySQL 8.0.17 is NOT affected, but I'd prefer to see check copy/pasted as a public comment to the bug.
  • Bug #95230 - "SELECT ... FOR UPDATE on a gap in repeatable read should be exclusive lock". There are more chances to get a deadlock with InnoDB than one might expect... I doubt this report from Domas Mituzas is a feature request. It took him some extra efforts to insist on the point and get it verified even as S4.
  • Bug #95231 - "LOCK=SHARED rejected contrary to specification". This bug report from Monty Solomon ended up as a documentation request. The documentation and the implementation are not aligned, and it was decided NOT to change the parser to match documented syntax. But why is it still "Verified" then? Should it take months to correct the fine manual?
  • Bug #95232 - "The text of error message 1846 and the online DDL doc table should be updated". Yet another bug report from Monty Solomon. Some (but not ALL) partition specific ALTER TABLE operations do not yet support LOCK clause.
  • Bug #95233 - "check constraint doesn't consider IF function that returns boolean a boolean fun". As pointed out by Daniel Black, IF() function in a check constraint isn't considered a boolean type. He had contributed a patch to fix this, but based on comments it's not clear if it's going to be accepted and used "as is". The following test shows that MariaDB 10.3 is not affected:
    C:\Program Files\MariaDB 10.3\bin>mysql -uroot -proot -P3316 test
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 9
    Server version: 10.3.7-MariaDB-log mariadb.org binary distribution

    Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    MariaDB [test]> create table t1 (source enum('comment','post') NOT NULL, comment_id int unsigned, post_id int unsigned);
    Query OK, 0 rows affected (0.751 sec)
    MariaDB [test]> alter table t1 add check(IF(source = 'comment', comment_id IS NOT NULL AND post_id IS NULL, post_id IS NOT NULL AND comment_id IS NULL));
    Query OK, 0 rows affected (1.239 sec)
    Records: 0  Duplicates: 0  Warnings: 0
  • Bug #95235 - "ABRT:Can't generate a unique log-filename binlog.(1-999), while rotating the bin". Yet another bug report from Daniel Black. When MySQL 8.0.16 is built with gcc 9.0.x abort is triggered in the MTR suite on the binlog.binlog_restart_server_with_exhausted_index_value test.
  • Bug #95249 - "stop slave permanently blocked". This bug was reported by Wei Zhao, who had contributed a patch.
  • Bug #95256 - "MySQL 8.0.16 SYSTEM USER can be changed by DML". MySQL 8.0.16 introduced a new privilege, SYSTEM_USER. MySQL manual actually says:
    "The protection against modification by regular accounts that is afforded to system accounts by the SYSTEM_USER privilege does not apply to regular accounts that have privileges on the mysql system schema and thus can directly modify the grant tables in that schema. For full protection, do not grant mysql schema privileges to regular accounts."
    But this report from Zhao Jianwei, about a user with a privilege to execute DML on the mysql.GLOBAL_GRANTS table, was accepted and verified. I hope Oracle engineers will finally make up their mind and decide either to fix this or to close this report as "Not a bug". I've subscribed in a hope for some fun around this decision making.
  • Bug #95269 - "binlog_row_image=minimal causes assertion failure". This assertion failure happens in debug build when one of standard MTR test cases, rpl.rpl_gis_ddl or rpl.rpl_window_functions is executed with --binlog-row-image=minimal option. In such cases I always wonder what is the reason for a failure NOT to be noted by Oracle MySQL QA and somehow fixed before Community users notice it? Either they don't run tests on debug builds with all possible combinations, or do not care to fix such failures (and thus should suffer from known failures in other test runs). I do not like any of these options, honestly. The bug was reported by Song Libing.
  • Bug #95272 - "Potential InnoDB SPATIAL INDEX corruption during root page split". This bug was reported by Albert Hu based on a Valgrind report when running the test innodb.instant_alter. Do they run MTR tests under Valgrind or on ASan builds in Oracle? I assume they do, but then why are Community users reporting such cases first? Note that the related MariaDB bug, MDEV-13942, is fixed in 10.2.24+ and 10.3.15+.
  • Bug #95285 - "InnoDB: Page [page id: space=1337, page number=39] still fixed or dirty". This assertion failure that happens during normal shutdown was reported by LUCA TRUFFARELLI. There are chances that this is a regression bug (without a regression tag), as it does not happen for reporter on MySQL 5.7.21.
  • Bug #95319 - "SHOW SLAVE HOST coalesces certain server_id's into one". This bug was reported by Lalit Choudhary from Percona based on original findings by Glyn Astill.
  • Bug #95416 - "ZERO Date is both NULL and NOT NULL". This funny bug report was submitted by Morgan Tocker. Manual actually explains that it's intended behavior (MariaDB 10.3.7 works the same way as MySQL), but it's still funny and unexpected, and the bug report remains "Verified".
  • Bug #95478 - "CREATE TABLE LIKE does not honour ROW_FORMAT." I'd like to add "...when it was not defined explicitly for the original table". The problem was reported by Jean-François Gagné and ended up as a verified feature request. See also my post on the details of where row_format is stored and is not stored for InnoDB tables...
  • Bug #95484 - "EXCHANGE PARTITION works wrong/werid with different ROW_FORMAT". Another bug report by Jean-François Gagné related to the previous one. He showed that it's actually possible to get partitions with different row formats in the same InnoDB table in MySQL 5.7.26, but not in the most natural way. It seems the problem may be fixed in 5.7.27 (by the fix for another, internally reported bug), but the bug remains "Verified".
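For those who have never seen the zero date behavior from Bug #95416 above, here is a minimal sketch one can try. The table name t_zero is hypothetical and the statements are mine, not copied from the bug report. I assume a session where zero dates are allowed, so sql_mode is cleared first:

```sql
-- Allow zero dates in this session only (recent default sql_mode values
-- include NO_ZERO_DATE and strict mode, which would reject the INSERT).
SET SESSION sql_mode = '';

-- Hypothetical table for illustration.
CREATE TABLE t_zero (d DATE NOT NULL);
INSERT INTO t_zero VALUES ('0000-00-00');

-- The manual documents that, for DATE columns declared NOT NULL,
-- IS NULL finds the special '0000-00-00' value.
SELECT COUNT(*) AS zero_is_null FROM t_zero WHERE d IS NULL;

-- Compare with the opposite predicate on the same single row.
SELECT COUNT(*) AS zero_is_not_null FROM t_zero WHERE d IS NOT NULL;
```

I do not quote any expected output here, so please run it on your own version and compare the two counts.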
There are some more bugs reported in May 2019 that I was interested in, but let me stop for now. Later in May I got a chance to spend some days off in Barcelona, without a single MySQL bug report opened for days.

I like this view of Barcelona way more than any MySQL bugs review, including this one.
To summarize:
  1. Oracle engineers who process bugs still sometimes do not care to check if all supported major versions are affected and/or share the results of such checks in public. Instead, some of them care to argue about severity of the bug report, test case details etc.
  2. We still see bug reports that originate from existing, well known MTR test case runs under Valgrind or in debug builds with some non-default options set. I do not have any good reason in mind to explain why these are NOT reported by Oracle's internal QA first.
  3. Surely some regression bugs still get verified without the regression tag added.
I truly hope my talk "Problems with Oracle's Way of MySQL Bugs Database Maintenance" will be accepted for Percona Live Europe 2019 conference (at least as a lightning talk) and I'll get another chance to speak about the problems highlighted above, and more. There are some "metabugs" in the way Oracle handles MySQL bug reports, and these should be discussed and fixed, for the benefit of MySQL quality and all MySQL users and customers.

Saturday, March 30, 2019

On References to MariaDB and MariaDB Bugs (MDEVs) in MySQL Bug Reports

Recently I noted that some of my comments to public MySQL bug reports got hidden by somebody from Oracle with privileges to do so. I was not able to find out who did that and when, as this information is not communicated to bug subscribers (this may change if my feature request, Bug #94807 - "Subscriber should be notified when comment is made private", is eventually implemented).

When it happened for the first time I thought it was probably non-intentional. When it happened for a second time I complained with a tweet that got few likes and zero comments. Recently this happened again and yet another tweet had not got much attention, but at least I've got a comment via Bug #94797 that my comment there (where I checked test case on MariaDB version I had at hand to find out it's not affected, something I often do for bugs mentioned in my blog posts here) was hidden as irrelevant and "an attempt to advertise MariaDB".

Snow hides everything, good and bad, dog shit, holes in the road and autumn flowers... Do we really want information provided in comments to public MySQL bugs to get hidden just because someone once decided it's "bad"?
I really wonder if any of my readers think that I advertise MariaDB with my public posts or public comments anywhere or specifically in MySQL bug reports?

I'd also like to share here, where no one besides me can hide or delete comments (I hope), what was hidden in the case that caused me to tweet about censorship I have to deal with. In Bug #94610 - "Server stalls because ALTER TABLE on partitioned table holds dict mutex" that ended up as "Not a Bug" (not even a duplicate of a verified Bug #83435 - "ALTER TABLE is very slow when using PARTITIONED table" it referred to and extended with a global mutex usage highlighted and impact explained), I've added the following comment:
"[12 Mar 7:30] Valeriy Kravchuk
Not only it stalls, but if it stalls for long enough time it will crash :)

Useful related reading is here: https://jira.mariadb.org/browse/MDEV-15641
"
The comment was hidden very soon. Now, if you check that link, you'll see a confirmed, unresolved MariaDB bug report. I mostly had this comment to the MDEV-15641 in mind, where my colleague and well-known InnoDB developer Marko Mäkelä stated:
"The row_log_table_apply() is actually invoked while holding both dict_sys->mutex and dict_operation_lock. If there is a lot of log to apply, this may actually cause InnoDB to crash."
I may be mistaken in linking these two bug reports together, but isn't highlighting the possibility of a crash due to a long semaphore wait important to understand the impact of the bug report and triage it properly? What wrong may MySQL users and bug report readers see if they follow the link to the MariaDB bug I considered relevant? What was advertised by this comment that is harmful or useless for MySQL Community?

I was even more surprised by these recent actions on my comments because in the past I had never noted a similar approach. Check the following bug reports, for example (I searched for those with "MDEV" and "Kravchuk" in them to get these):
  • Bug #80919 - "MySQL Crashes when Droping Indexes - Long semaphore wait". In this bug report (real bug fixed in 5.7.22) I've added a comment that refers to MDEV-14637. The comment still remains public and, IMHO, is still useful. Providing this link helped to get proper attention to the bug, so it was re-opened and got comments from Oracle engineers finally. Was it an attempt to advertise MariaDB? How is this case different from my comment in Bug #94610 quoted above?
  • Bug #84185 - "Not all "Statements writing to a table with an auto-increment..." are unsafe". I reported this "upstream" MySQL bug based on MDEV-10170 - "Misleading "Statements writing to a table with an auto-increment column after selecting from another table are unsafe" on DELETE ... SELECT", previously found by my colleague Hartmut Holzgraefe. I've also added a link to the "upstream" MySQL bug report to that MDEV. Does anybody in MySQL or MariaDB user communities think that such cross-references are useless, harmful or may be considered as an "attempt to advertise competitor" if any of the vendors fixes the bug first?
  • Bug #48392 - "mysql_upgrade improperly escapes passwords with single quotes". I verified this bug in 2009 while working for MySQL at Sun, and it still remains "Verified" (I had not re-checked if it's still repeatable with current MySQL versions). Then in 2013 community user added a comment referring to the MariaDB bug, MDEV-4664 - "mysql_upgrade crashes if root's password contains an apostrophe/single quotation mark" that was fixed later, in 2015. This comment still remains public and is useful!
So, have my comments that mention MDEVs or MariaDB in general become so irrelevant and MariaDB-advertising recently, compared to the previous ones? What exact community standards or rules do they break? Is it now forbidden for any user of MySQL bugs database to mention MariaDB or bugs in it, use MariaDB in tests to make some point and share the results in public in MySQL bugs database, or is the problem with me personally doing this?

I'd be happy to read explanations or opinions from MySQL community users and my former Oracle colleagues in comments to this blog post.

Tuesday, July 24, 2018

On Some Problematic Oracle MySQL Server Features

In one of my previous posts I stated that in Oracle's MySQL server some old enough features remain half-baked, not well tested, not properly integrated with each other, and not documented properly. It's time to prove this statement.

I should highlight from the very beginning that most of the features I am going to list are not that much improved by other vendors. But they at least have an option of providing other, fully supported storage engines that may overcome the problems in these features, while Oracle's trend to get rid of most engines but InnoDB makes MySQL users more seriously affected by any problems related to InnoDB.

The Royal Pavilion in Brighton looks nice from the outside and is based on some great engineering decisions, but the decorations had never been completed, some interiors were ruined and never restored, and the building was used for too many different purposes over years.
The list of problematic MySQL server features includes (but is not limited to) the following:
  • InnoDB's data compression

    Classical InnoDB compression (row_format=compressed) has limited efficiency and does not get any attention from developers recently. Transparent page compression for InnoDB seems to be originally more like a proof of concept in MySQL that may not work well in production on commodity hardware and filesystems, and was not integrated with backup tools.
  • Partitioning

    Bugs reported for this feature by MySQL Community do not get proper attention. DDL against partitioned tables and partition pruning do not work the way DBAs may expect. We still miss parallel processing for partitioned tables (even though proof of concept for parallel DDL and some kinds of SELECTs was ready and working 10 years ago). Lack of careful testing of partitioning integration with other features is also visible.
  • InnoDB's FULLTEXT indexes
    This feature appeared in MySQL 5.6, but 5 years later there are still all kinds of serious bugs in it, from wrong results to hangs, debug assertions and crashes. There are performance regressions and missing features comparing to MyISAM FULLTEXT indexes, and this makes the idea to use InnoDB for everything even more problematic. Current implementation is not designed to work with really large tables and result sets. DBAs should expect problems during routine maintenance activities, like ALTERing tables or dumps and restores when any table with InnoDB FULLTEXT index is involved.

  • InnoDB's "online" DDL implementation
    It is not really "online" in too many important practical cases and senses. Replication ignores LOCK=NONE and the slave starts to apply "concurrent" DML only after commit, and this may lead to a huge replication lag. The entire table is rebuilt (data are (re-)written) too often, in place or by creating a copy. One recent improvement in MySQL 8, "instant ADD COLUMN", was actually contributed by Community. The size of the "online log" (that is kept in memory and in a temporary file), created per table altered or index created, depends on the concurrent DML workload and is hard to predict. For most practical purposes the good old pt-online-schema-change or gh-ost tools work better.

  • InnoDB's persistent optimizer statistics

    Automatic statistics recalculation does not work as expected, and to get proper statistics explicit ANALYZE TABLE calls are still needed. The implementation is complicated and introduced separate implicit transactions (in dirty reads mode) against statistics tables. Bugs in the implementation do not seem to get proper priority and are not fixed.
I listed only those features I recently studied in some detail in my previous blog posts. I've included the main problems with each feature according to my older posts. Click on the links in the list above to find the details.

The Royal Pavilion of InnoDB in MySQL is beautiful from the outside (and somewhere inside), but is far from being completed, and some historical design decisions do not seem to have been improved over the years. We are lucky that it is still used and works nicely for many current purposes, but there are too many dark corners and background threads there where even Oracle engineers rarely look, and even fewer of them improve anything there...

Friday, April 20, 2018

Fun with Bugs #67 - On Some Public Bugs Fixed in MySQL 8.0.11 GA

I stopped reviewing MySQL Release Notes for quite some time, but major GA releases of MySQL do not happen often, so I decided to make an exception and write about some bugs from Community users fixed in MySQL 8.0.11 GA.

I'll start with good news about MySQL 8.0.11 GA! You can get the sources at GitHub, and I had no problems building them on Fedora 27 on my good old QuadCore box, using the following cmake command line:
[openxs@fc23 mysql-server]$ cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_CONFIG=mysql_release -DFEATURE_SET=community -DWITH_EMBEDDED_SERVER=OFF -DDOWNLOAD_BOOST=1 -DWITH_BOOST=/home/openxs/boost -DENABLE_DOWNLOADS=1 -DWITH_UNIT_TESTS=OFF -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/8.0
...
[openxs@fc23 mysql-server]$ time make -j 4
...

[100%] Built target mysqld

real    33m52.791s
user    105m47.475s
sys     8m19.018s
Compared to my previous experience, I had a minor problem with unit tests, so I just skipped them with the -DWITH_UNIT_TESTS=OFF option. There is no problem running the resulting binaries, unless you try to use a data directory from an older 8.0.x. Then you'll end up with:
2018-04-19T15:36:35.165841Z 1 [ERROR] [MY-011092] [Server] Upgrading the data dictionary from dictionary version '80004' is not supported.
2018-04-19T15:36:35.166239Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2018-04-19T15:36:35.166310Z 0 [ERROR] [MY-010119] [Server] Aborting
I had to remove the data directory and initialize it from scratch (it was a testing instance anyway, last used for real while I worked on this presentation):
[openxs@fc23 8.0]$ rm -rf data/*
[openxs@fc23 8.0]$ bin/mysqld --no-defaults --initialize-insecure --port=3308 --socket=/tmp/mysql.sock --basedir=/home/openxs/dbs/8.0 --datadir=/home/openxs/dbs/8.0/data --skip-log-bin
2018-04-19T15:43:55.324606Z 0 [Warning] [MY-010139] [Server] Changed limits: max_open_files: 1024 (requested 8161)
2018-04-19T15:43:55.324726Z 0 [Warning] [MY-010142] [Server] Changed limits: table_open_cache: 431 (requested 4000)
2018-04-19T15:43:55.325147Z 0 [System] [MY-013169] [Server] /home/openxs/dbs/8.0/bin/mysqld (mysqld 8.0.11) initializing of server in progress as process 20034
2018-04-19T15:44:14.438776Z 4 [Warning] [MY-010453] [Server] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.
2018-04-19T15:44:29.625227Z 0 [System] [MY-013170] [Server] /home/openxs/dbs/8.0/bin/mysqld (mysqld 8.0.11) initializing of server has completed
[openxs@fc23 8.0]$ bin/mysqld_safe --no-defaults --port=3308 --socket=/tmp/mysql.sock --basedir=/home/openxs/dbs/8.0 --datadir=/home/openxs/dbs/8.0/data --skip-log-bin &
[1] 20080
[openxs@fc23 8.0]$ 2018-04-19T15:44:58.224816Z mysqld_safe Logging to '/home/openxs/dbs/8.0/data/fc23.err'.
2018-04-19T15:44:58.271255Z mysqld_safe Starting mysqld daemon with databases from /home/openxs/dbs/8.0/data

[openxs@fc23 8.0]$ bin/mysql -uroot --socket=/tmp/mysql.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 8.0.11 MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 8.0.11                       |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| tls_version             | TLSv1,TLSv1.1,TLSv1.2        |
| version                 | 8.0.11                       |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
| version_compile_zlib    | 1.2.11                       |
+-------------------------+------------------------------+
9 rows in set (0.00 sec)
So, you can build MySQL 8.0.11 right now and start using it to make your own conclusions about this release.

I still do not care about NoSQL, JSON, new cool features etc. You'll see megabytes of text about these by the end of 2018. I am going to concentrate mostly on InnoDB, replication bugs and a few others:
  • I am happy to start with Bug #89509 - "Valgrind error on innodb.blob_page_reserve, bundled zlib", reported by Laurynas Biveinis. See also his Bug #89597 - "Valgrind reporting memory leak on MTR test main.validate_password_component" and Bug #89433 - "NULL dereference in dd::tables::DD_properties:unchecked_get". Percona engineers spent a lot of effort recently testing MySQL 8.0.x and reporting the bugs they noted. I think Oracle should explicitly admit the impact of Percona's QA efforts on the quality of this GA release.
  • Bug #89127 - "Optimize trx_rw_is_active() by tracking the lowest active transaction id". This bug was reported by Zhai Weixiang, who also suggested a patch.
  • Bug #89129 - "create table+DML on innodb_ddl_log table=crash in lock0lock.cc:7414:release_lock". This bug was reported by Ramana Yeruva. Tables were made protected and DDL and DML operations on these tables are no longer permitted.
  • Bug #89087 - "Assertion `key->flags & 1' failed". This debug assertion (related to the way PRIMARY key was created based on UNIQUE one) was reported by Roel Van de Paar for 5.7.21, but we see the fix documented only for 8.0.x.
  • Bug #87827 - "Performance regression in "create table" speed and scalability in 8.0.3". It was reported by Alexander Rubin from Percona.
  • Bug #87812 - "Concurrent DDL operation in progress even after acquiring backup lock". Nice bug report from Debarun Banerjee.
  • Bug #87532 - "Replay log record cause mysqld crash during online DDL". I am happy to see improvements in the "online ALTER" implementation that cover all GA versions, not just 8.0. I am also happy to see Oracle engineers (Ohm Hong in this case) reporting bugs in public!
  • Bug #88272 - "Assertion `new_value >= 0' failed.". Yet another debug assertion found by Roel Van de Paar, this time related to GTIDs and XA transactions. Check also his Bug #88262 - "ERROR 1598 (HY000): Binary logging not possible + abort".
  • Bug #84415 - "slave don't report Seconds_Behind_Master when running slave_parallel_workers > 0". Yet another contribution from Percona engineers. This bug was reported by Marcelo Altmann and patches were provided by Robert Golebiowski. This bug is also fixed in MySQL 5.7.22.
  • Bug #89793 - "INFORMATION_SCHEMA.STATISTICS field type change". Unexpected change in early 8.0.x versions was noted and reported by Mark Guinness.
  • Bug #89584 - "5.7->8.0 upgrade crash with default-time-zone set". Nice to see this bug (reported by Shane Bester) fixed in GA release.
  • Bug #89487 - "ALTER TABLE hangs in "Waiting for tablespace metadata lock" state". This regression bug was reported by Sveta Smirnova.
  • Bug #89324 - "main.comment_column2 fails with compression". This regression was noted and reported by Manuel Ung.
  • Bug #89122 - "Severe performance regression in server bootstrap". I am really happy to see this bug reported by Georgi Kodinov fixed. I noted it as soon as I started testing 8.0.x (see a duplicate by Roel Van de Paar, Bug #89444) and it was very annoying. I've already checked (see above) that the problem is gone!
  • Bug #89038 - "Add new column to 'mysql.routines' to accommodate the Polygot project". So, Oracle is planning to support stored programs in different languages! Thank you, Sivert Sørumgård, for reporting this in public! See also his Bug #89035 - "Reject LCTN changing after --initialize".
  • Bug #87836 - "XA COMMIT/ROLLBACK rejected by non-autocommit session with no active transaction". It would be sad if this bug is not fixed in MySQL 5.7.x, where it was originally found by Wei Zhao.
  • Bug #87708 - "MDL for column statistics is not properly reflected in P_S.METADATA_LOCKS". It was reported by Erik Frøseth.
  • Bug #85997 - "inplace alter table with foreign keys causes table definition mismatch". This bug was reported by Magnus Blåudd.
  • Bug #85561 - "Users can be assigned non-existing roles as default". Nice to see this bug reported by Giuseppe Maxia fixed in GA release.
  • Bug #33004 - "integer constants casted to bigints by unions". This bug was reported by Domas Mituzas more than 10 years ago!
Now I have to stop, as I found a private bug in the release notes, Bug #89512. Based on the description:
"Window function row-buffer handling has been refactored to reduce the number of handler reads by 25%. (Bug #89512, Bug #27484133)"
I truly do not get why it remains private (or why it was reported in public for such a "sensitive" matter), so I'd better stop.

MySQL 8 is GA, finally! There are a lot more fixes there that I have not mentioned above. I am sure there are even more bugs to find. So, happy hunting!



Sunday, January 28, 2018

On InnoDB's Online DDL

I am completing my preparations for the upcoming FOSDEM talk, and one of the last things I wanted to do as part of them is a quick review of known bugs and problems in the current (as in MySQL 5.7 GA) implementation of so-called "online" DDL for InnoDB tables.

In my previous post I already shared my view on another important InnoDB feature, persistent statistics. Unlike that, I do not really hate online DDL. I just try to avoid it if possible and use tools like pt-online-schema-change or gh-ost instead. This is not because it is poorly documented (the documentation is quite detailed, although there are still things to clarify) or does not work as designed. It is mostly because the term "online" is a bit misleading if we understand it as "not blocking", that is, as "the application can keep reading from and writing to the table being changed without being blocked or affected" (in reality it is more like "less blocking" or "blocking for shorter periods of time", faster and in-place, sometimes), and because it does not work the way one might expect in any kind of replication setup.

To be more specific:
  • Replication ignores LOCK=NONE :) Slave will only start to apply "concurrent" DML after commit, and this leads to a huge replication lag.
  • In too many cases the entire table is rebuilt (data are (re-)written), in place or by creating a copy. Notable writes while running ALTER TABLE are really required only if we are introducing stricter constraints (and even then we can just validate the table, return an error if some row does not satisfy the new constraint, for example because a value is too long to fit, and change only metadata if all rows are OK) or adding new indexes (which in any case cannot be used until they are built).
  • The online log has to be kept (in memory or in a temporary file). There is one such log file for each index being created or table being altered. The manual says:
    "This log file stores data inserted, updated, or deleted in the table during the DDL operation. The temporary log file is extended when needed by the value of innodb_sort_buffer_size, up to the maximum specified by innodb_online_alter_log_max_size. If a temporary log file exceeds the upper size limit, the ALTER TABLE operation fails and all uncommitted concurrent DML operations are rolled back. Thus, a large value for this option allows more DML to happen during an online DDL operation, but also extends the period of time at the end of the DDL operation when the table is locked to apply the data from the log."
    The problem is that the size depends on the concurrent DML workload and is hard to predict. Note also "when the table is locked" above to understand how "online" this really is...
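To get a feeling for the limits quoted from the manual above, here is a rough back-of-envelope sketch in shell arithmetic. It assumes the MySQL 5.7 default values of innodb_sort_buffer_size (1M) and innodb_online_alter_log_max_size (128M). The rate at which concurrent DML fills the log is a made-up number used for illustration only. In real life you would have to measure it, and that is exactly the hard part.

```shell
# Assumed MySQL 5.7 defaults; check SHOW VARIABLES on your own server.
sort_buffer_size=1048576      # innodb_sort_buffer_size, 1 MiB per log extension
log_max_size=134217728        # innodb_online_alter_log_max_size, 128 MiB

# Hypothetical workload: concurrent DML adds 512 KiB to the online log per second.
dml_bytes_per_sec=524288

# How many times the log can be extended before hitting the limit.
extensions=$(( log_max_size / sort_buffer_size ))

# How long the ALTER TABLE may run under this workload before it fails.
seconds_until_full=$(( log_max_size / dml_bytes_per_sec ))

echo "extensions=$extensions seconds_until_full=$seconds_until_full"
```

With these assumed numbers, any online ALTER TABLE that runs for longer than a few minutes would fail unless innodb_online_alter_log_max_size is increased, and a larger log in turn means a longer locked phase at the end to apply it.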
There are also bugs, and I'd like to discuss some of them:
  • Bug #82997, "Online DDL fails with". There are not enough public details to be sure with what exactly, but maybe the problems (several are reported) happen when the altered table has a generated column. If this is really so, the bug may be fixed in MySQL 5.7.19+.
  • Bug #73196, "Allow ALTER TABLE to run concurrently on master and slave". I cannot put this better than Andrew Morgan did in this verified feature request:
    "With online ALTER TABLE it is possible for the DDL operation to run for many hours while still processing DML on that same table. The ALTER TABLE is not started on the slave until after it has completed on the master and it will again take many hours to run on the slave. While the DDL runs on the slave, it is not possible for it to process any transactions which followed the ALTER TABLE on the master as they may be dependent on the changes that were made to the table's schema. This means that the slave will lag the master by many hours while the ALTER TABLE runs and then while it catches up on the backlog of DML sent from the master while that was happening."
    Both pt-osc and gh-ost resolve this problem, as they take replication topology into account and can throttle changes on the master if needed. See also this documentation request by Daniël van Eeden, Bug #77619, that lists more limitations of "online" DDL, and check how it helped to clarify them here.
  • Bug #67286, "InnoDB Online DDL hangs". It ended up as "Not a bug", but there is a good explanation of exclusive metadata lock set by the "online" ALTER in the comments:
    "The final (short) phase of ALTER where the internal data dictionary is updated requires exclusive access. That's why the ALTER was blocked by the active transaction having a shared lock on the table."
    I once studied a similar (and even simpler) case in great detail with gdb, see this blog post. I clearly saw an MDL_EXCLUSIVE lock request for a simple ALTER TABLE ... STATS_AUTO_RECALC=1 that (according to the manual) "permits concurrent DML". Another manual page clarifies:
    "In most cases, an online DDL operation on a table waits for currently executing transactions that are accessing the table to commit or roll back because it requires exclusive access to the table for a brief period while the DDL statement is being prepared. Likewise, the online DDL operation requires exclusive access to the table for a brief time before finishing. Thus, an online DDL statement also waits for transactions that are started while the DDL is in progress to commit or roll back before completing."
    Dear MySQL Oracle developers, just remove "In most cases" (or clarify it), and this would be fair enough!
  • Bug #84004, "Manual misses details on MDL locks set and released for online ALTER TABLE". That's a documentation request I filed after spending some time tracing metadata lock usage in gdb. My request is simple (typos corrected):
    "Describe all kinds of metadata locks used by MySQL, their interactions and order of acquisition and release for most important SQL statements, including (but not limited to) all kinds of online ALTER TABLE statements for InnoDB tables."
  • Bug #68498, "can online ddl for innodb be more online?". This report by Mark Callaghan that refers to this detailed study is still "Verified". Based on the comments to that blog post, it is "enough online", but the details of implementation were not clearly documented at the moment. Check for the details and clarifications in the comments!
  • Bug #72109, "Avoid table rebuild when adding or removing of auto_increment settings". The bug report from Simon Mudd is still "Verified".
  • Bug #57583, "fast index create not used during "alter table foo engine=innodb"". The bug is still "Verified" and I can not tell from the manual if this is implemented in MySQL 5.7 or not.
  • Bug #83557, "Can't use LOCK=NONE to drop columns in table with virtual columns" - nice "Verified" bug report by Monty Solomon.
  • Bug #70790, "ALTER TABLE REBUILD PARTITION SHOULD NOT PREVENT DML IN UNAFFECTED PARTITIONS". My former colleague at Oracle, Arnaud Adant, simply asked to provide proper and reasonable support of online DDL for partitioned tables. This bug report is still "Verified", but at least we have a separate manual page now that explains the details and limitations of online DDL with partitioned tables (most of Arnaud's requests are still NOT implemented).
  • Bug #81819, "ALTER TABLE...LOCK=NONE is not allowed when FULLTEXT INDEX exists". As Marko Mäkelä explains in the last comment of this "Verified" feature request:
    "However, LOCK=NONE is never supported when a FULLTEXT INDEX exists on the table. Similarly, LOCK=NONE is not supported when SPATIAL INDEX (introduced in MySQL 5.7) exist. Speaking as the author of WL#6255 which implemented ALTER TABLE...LOCK=NONE for InnoDB B-tree indexes in MySQL 5.6, I share the bug reporter's disappointment."
To summarize, online DDL in MySQL 5.7 is surely far above and beyond "fast index creation", but there is still a lot of room for improvement. A real solution (one that allows ALTER TABLE to run fast and without unnecessary changes/writes to data in way more cases) may come with a real data dictionary in MySQL and support for multiple table versions there, or from ideas like those implemented in MDEV-11369, "Instant add column for InnoDB", and expressed in MDEV-11424, "Instant ALTER TABLE of failure-free record format changes". Until all that is implemented I'd prefer to rely on good old tools like pt-osc.
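As a minimal sketch of what relying on pt-osc may look like, the shell snippet below only builds and prints a pt-online-schema-change command line, it does not run it. The database name (test), table name (t1) and column (c2) are hypothetical, and the lag threshold of 5 seconds is an arbitrary choice. The options themselves (--alter, --max-lag, --check-interval, --dry-run) are regular pt-online-schema-change options.

```shell
# Hypothetical names: database "test", table "t1", new column "c2".
# --max-lag 5        pause the data copy while any replica lags more than 5 seconds
# --check-interval 1 re-check the replica lag every second while paused
# --dry-run          create and alter the new table only, copy no data
ptosc_cmd='pt-online-schema-change --alter "ADD COLUMN c2 INT" --max-lag 5 --check-interval 1 --dry-run D=test,t=t1'

# Print the command for review instead of executing it.
echo "$ptosc_cmd"
```

After reviewing the dry run on a test instance, --dry-run would be replaced with --execute to actually perform the change.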

In any case, we are talking about backward-incompatible changes to the way MySQL works and stores data now.