Showing posts with label fulltext. Show all posts

Saturday, May 30, 2020

Fun with Bugs #99 - On MySQL Bug Reports I am Subscribed to, Part XXXIII

In my previous post in this series I've commented on some interesting MySQL bug reports that were added during the second half of April. Time to move on to bugs reported in May, 2020, as we are quickly approaching MySQL Bug #100000 and I want to write a separate post for this great achievement :)

Here is the list:
  • Bug #99432 - "Improving memory barrier during rseg allocation". Nice contribution by my former colleague in Percona, Krunal Bauskar, who now works on making MySQL better for ARM processors. According to his findings, the use of a relaxed memory model improves performance on ARM by up to 2%. See also yet another bug report with a contribution that matters for ARM, Bug #99556 - "Avoid sequentially consistent atomics for atomic counters" (contributed by Sergey Glushchenko from Percona).
  • Bug #99444 - "New HASH JOIN order problem". One should not expect and rely on any specific order unless explicit ORDER BY is used, so formally this report by Gabor Aron is "Not a Bug". I put it into this list as several other community members helped him a lot in understanding why results with HASH_JOIN optimization in newer versions are still valid and what are the ways to get the results with the desired ordering efficiently. Guilhem Bichot, for instance, suggested two different ways, using window function and lateral table. Useful reading in any case!
  • Bug #99458 - "i_s_fts_index_cache_fill_one_index() is not protect by the lock". Looks like even crashes are possible as a result, based on comments. Nice finding by Haiqing Sun.
  • Bug #99459 - "SQL run with GROUP_MIN_MAX may infinite loop and never return". After some discussion around the validity and severity of bug reports where test case involved adding DEBUG_SYNC() to show the problem in a predictable way, this great bug report by Ze Yang was verified. All MySQL GA versions are affected, including 8.0.20! As a side note, I'd prefer NOT to read such discussions any more. They are wasting time of all parties involved.
  • Bug #99499 - "Incorrect result when constant equailty expression is used in LEFT JOIN condition". This bug that affects MySQL 5.7.x only (it was fixed in MySQL 8.0.17+ and in 5.6 code was different) was reported by Marcos Albe from Percona. 
  • Bug #99504 - "Generated column incorrect on INSERT when based on column w/ expression DEFAULT". Several problems are highlighted in the complex enough test case submitted by Brad Lanier.
  • Bug #99582 - "Reduce logging of new doublewrite buffer initialization which is confusing". 180 lines or so are added when --log-error-verbosity is set to 3. As a workaround one can add:
    log-error-suppression-list="MY-011950"
    to the [mysqld] section of the .cnf file. This problem was reported by Simon Mudd. Make sure to read all comments.
  • Bug #99591 - "Option --tc-heuristic-recover documentation wrong, missing details". In reality it does not work with more than one XA-capable engine installed. I wish the fine manual documented the reality, not the good intentions of the past. This documentation request was added by Sami Ahlroos.
  • Bug #99593 - "Performance issues in 8.0.20". It seems to be yet another TempTable engine problem that caused a regression compared to MySQL 5.7. At least this:

    SET GLOBAL internal_tmp_mem_storage_engine=MEMORY;

    is a workaround. The bug (a duplicate of internal Bug #30562964) was reported by billy noah and is fixed in upcoming MySQL 8.0.21.
  • Bug #99601 - "Broken Performance using EXIST function, increasing execution time each loop". This regression bug (without tag, but who cares...) in MySQL 8.0 was reported by Ronny Görner and minimal test case demonstrating that the problem is actually with function call was contributed by Shane Bester.
  • Bug #99643 - "innobase_commit_by_xid/innobase_rollback_by_xid is not thread safe". This bug was reported by Zhai Weixiang, who had also suggested the fix in the code.
  • Bug #99717 - "Performance regression of parallel count". Great bug report with code analysis and ready to use MTR test case from Ze Yang. Sunny Bains already confirmed that the problematic code is going to be removed.
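
The point made for Bug #99444 above (no ordering guarantee without ORDER BY) can be sketched as follows. This is only an illustration: the tables t1 and t2 are hypothetical and are not taken from the bug report, and the join column is left unindexed on purpose so that a hash join is a likely plan in recent 8.0 versions.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE t1 (id INT PRIMARY KEY, val INT);
CREATE TABLE t2 (id INT PRIMARY KEY, t1_id INT);   -- no index on t1_id

-- No ORDER BY: the row order depends on the execution plan
-- (nested loop vs. hash join) and must not be relied upon.
SELECT t1.id, t2.id
  FROM t1 JOIN t2 ON t2.t1_id = t1.id;

-- Explicit ORDER BY: the only way to get a guaranteed order.
SELECT t1.id, t2.id
  FROM t1 JOIN t2 ON t2.t1_id = t1.id
 ORDER BY t1.id, t2.id;
```

If the first query happened to return rows "in order" on an older version, that was a side effect of the plan used there, not something the server promised.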

To summarize:
  1. I am happy to see Oracle engineers explaining to community bug reporters the reasons and possible solutions for the problems they hit that are not actually caused by any bug in MySQL. I tried to do this as well, whenever possible, while working on MySQL bugs...
  2. We can still find speculations that if the bug is repeatable only by adding DEBUG_SYNC() or similar debug lines, then it can not be verified or gets lower severity... IMHO this is nonsense, as there are many high severity verified real bug reports where this method is used to demonstrate the problem clearly. Just stop it!



Saturday, February 22, 2020

Fun with Bugs #94 - On MySQL Bug Reports I am Subscribed to, Part XXVIII

I may get a chance to speak about proper bugs processing for open source projects later this year, so I have to keep reviewing recent MySQL bugs to be ready for that. In my previous post in this series I listed some interesting MySQL bug reports created in December, 2019. Time to move on to January, 2020! Belated Happy New Year of cool MySQL Bugs!

As usual I mostly care about InnoDB, replication and optimizer bugs and explicitly mention bug reporter by name and give link to his other active reports (if any). I also pick up examples of proper (or improper) attitudes of reporters and Oracle engineers. Here is the list:
  • Bug #98103 - "unexpected behavior while logging an aborted query in the slow query log". A query that was killed while waiting for the table metadata lock not only gets logged, but its lock wait time is also saved as query execution time. I'd like to highlight how bug reporter, Pranay Motupalli, used gdb to study what really happens in the code in this case. Perfect bug report!
  • Bug #98113 - "Crash possible when load & unload a connection handler". The (quite obvious) bug was verified based on code review, but only after an Oracle engineer spent some effort on refusing to accept the problem and its importance. This bug was reported by Fangxin Flou.
  • Bug #98132 - "Analyze table leads to empty statistics during online rebuild DDL ". Nice addition to my collections! This bug with a nice and clear test case was reported by Albert Hu, who also suggested a fix.
  • Bug #98139 - "Committing a XA transaction causes a wrong sequence of events in binlog". This bug reported by Dehao Wang was verified as a "documentation" one, but I doubt documenting current behavior properly is an acceptable fix. Bug reporter suggested to commit in the binary log first, for example. Current implementation that allows users to commit/rollback a XA transaction by using another connection if the former connection is closed or killed, is risky. A lot of arguing happened in comments in the process, and my comment asking for a clear quote from the manual:
    Would you be so kind to share some text from this page you mentioned:

    https://dev.mysql.com/doc/refman/8.0/en/xa.html

    or any other fine MySQL 8 manual page stating that XA COMMIT is NOT supported when executed from session/connection/thread other than those prepared the XA transaction? I am doing something wrong probably, but I can not find such text anywhere.
    was hidden. Let's see what happens to this bug report next.
  • Bug #98211 - "Auto increment value didn't reset correctly.". Not sure what this bug reported by Zhao Jianwei has to do with "Data Types", IMHO it's more about DDL or data dictionary. Again, some sarcastic comments from Community users were needed to put work on this bug back on track...
  • Bug #98220 - "with log_slow_extra=on Errno: info not getting updated correctly for error". This bug was reported by lalit Choudhary from Percona.
  • Bug #98227 - "innodb_stats_method='nulls_ignored' and persistent stats get wrong cardinalities". I think category is wrong for this bug. It's a bug in InnoDB's persistent statistics implementation, one of many. The bug was reported by Agustín G from Percona.
  • Bug #98231 - "show index from a partition table gets a wrong cardinality value". Yet another bug report by Albert Hu that ended up as a "documentation" bug for now, even though older MySQL versions provided better cardinality estimations than MySQL 8.0 in this case (so this is a regression of a kind). I hope the bug will be re-classified and properly processed later.
  • Bug #98238 - "I_S.KEY_COLUMN_USAGE is very slow". I am surprised to see such a bug in MySQL 8. According to the bug reporter, Manuel Mausz, this is also a kind of regression compared to older MySQL versions, where these queries used to run faster. Surely, no "regression" tag in this case was added.
  • Bug #98284 - "Low sysbench score in the case of a large number of connections". This notable performance regression of MySQL 8 vs 5.7 was reported by zanye zjy. perf profiling pointed out towards ppoll() where a lot of time is spent. There is a fix suggested by Fangxin Flou (to use poll() instead), but the bug is still "Open".
  • Bug #98287 - "Explanation of hash joins is inconsistent across EXPLAIN formats". This bug was reported by Saverio M and ended up marked as a duplicate of Bug #97299 fixed in upcoming 8.0.20. Use EXPLAIN FORMAT=TREE in the meantime to see proper information about hash joins usage in the plan.
  • Bug #98288 - "xa commit crash lead mysql replication error". This bug report from Phoenix Zhang (who also suggested a patch) was declared a duplicate of Bug #76233 - "XA prepare is logged ahead of engine prepare" (that I've already discussed among other XA transactions bugs here).
  • Bug #98324 - "Deadlocks more frequent since version 5.7.26". Nice regression bug report by Przemyslaw Malkowski from Percona, with additional test provided later by Stephen Wei. Interestingly enough, test results shared by Umesh Shastry show that MySQL 8.0.19 is affected in the same way as 5.7.26+, but 8.0.19 is NOT listed as one of versions affected. This is a mistake to fix, along with missing regression tag.
  • Bug #98427 - "InnoDB FullText AUX Tables are broken in 8.0". Yet another regression in MySQL 8 was found by Satya Bodapati. It seems the change in default collation for the utf8mb4 character set caused this. InnoDB FULLTEXT search was far from perfect anyway...
There are clouds in the sky of MySQL bugs processing.
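
As a small illustration of the hint given for Bug #98287 above, this is how the TREE format is requested. The tables t1 and t2 are hypothetical (with no index on t2.t1_id, so that a hash join is a candidate plan), and the exact output depends on the server version, so none is shown here.

```sql
-- Hypothetical tables; FORMAT=TREE is the format that reports hash joins
-- properly in the versions discussed above.
EXPLAIN FORMAT=TREE
SELECT t1.id, t2.id
  FROM t1 JOIN t2 ON t2.t1_id = t1.id;
```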
To summarize:
  1. Still too much time and effort is sometimes spent on arguing with bug reporter instead of accepting and processing bugs properly. This is unfortunate.
  2. Sometimes bugs are wrongly classified when verified (documentation vs code bug, wrong category, wrong severity, not all affected versions are listed, ignoring regression etc). This is also unfortunate.
  3. Percona engineers still help to make MySQL better.
  4. There are some fixes in upcoming MySQL 8.0.20 that I am waiting for :)
  5. XA transactions in MySQL are badly broken (they are not atomic in storage engine + binary log) and hardly safe to use in reality.
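
To make the scenario from Bug #98139 more concrete, here is a minimal sketch of finishing a prepared XA transaction from a different connection. The table t1 and the xid 'trx1' are hypothetical, and the sketch assumes a server version (5.7.7 or newer, as far as I can tell from the manual) where a prepared XA transaction survives the disconnect of the session that prepared it.

```sql
-- Connection 1: prepare an XA transaction, then disconnect (or get killed).
XA START 'trx1';
INSERT INTO t1 (id) VALUES (1);   -- t1 is a hypothetical InnoDB table
XA END 'trx1';
XA PREPARE 'trx1';
-- ... connection 1 is closed here ...

-- Connection 2: find the prepared transaction and finish it.
XA RECOVER;          -- lists xids of transactions in the PREPARED state
XA COMMIT 'trx1';    -- or: XA ROLLBACK 'trx1';
```

This is the sequence where the order of events in the storage engine and in the binary log matters.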

Thursday, April 4, 2019

Fun with Bugs #82 - On MySQL Bug Reports I am Subscribed to, Part XVIII

I've got a few comments to my post on references to MariaDB in MySQL bug reports (not in the blog, but via social media and in personal messages), and all but one of the comments from current and former colleagues whose opinion I value a lot confirmed that this really looks like a kind of attempt to advertise MariaDB. So, from now on I'll try to keep my findings on how tests shared by MySQL bug reporters work in MariaDB for myself, MariaDB JIRA and this blog (where I can and will advertise whatever makes sense to me), and avoid adding them to MySQL bug reports.

That said, I still think that it's normal to share links to MariaDB bug reports that add something useful (like patches, explanations or better test cases), and I keep insisting that this kind of feedback should not be hidden. Yes, I want to mention Bug #94610 (and related MDEV-15641) again, as a clear example of censorship that is not reasonable and should not be tolerated.

In the meantime, since my previous post in this series I've subscribed to 30 or so new MySQL bug reports. Some of them are listed below, starting from the oldest. This time I am not going to exclude "inactive" reports that were not accepted by Oracle MySQL engineers as valid:
  • Bug #94629 - "no variable can skip a single channel error in mysql replication". This is a request to add support for per-channel options to skip N transactions or specific errors. It is not accepted ("Not a Bug") just because one can stop replication on all channels and start on one to skip transaction(s) there, then resume replication for all channels. Do you really think this is a right and only way to process such a report?
  • Bug #94647 - "Memory leak in MEMORY table by glibc". This is also not a bug because one can use something like malloc-lib=jemalloc with mysqld_safe or Environment="LD_PRELOAD=/path/to/jemalloc" with systemd services. There might be some cost related to that in older versions... Note that similar MDEV-14050 is still open.
  • Bug #94655 - "Some GIS function do not use spatial index anymore". Yet another regression vs MySQL 5.7 reported by Cedric Tabin. It ended up verified as feature request without a regression tag...
  • Bug #94664 - "Binlog related deadlock leads to all incoming connection choked.". This report from Yanmin Qiao ended up as a duplicate of  Bug #92108 - "Deadlock by concurrent show binlogs, pfs session_variables table & binlog purge" (fixed in MySQL 5.7.25+, thanks Sveta Smirnova for the hint). See also Bug #91941.
  • Bug #94665 - "enabling undo-tablespace encryption doesn't mark tablespace encryption flag". Nice finding by Krunal Bauskar from Percona.
  • Bug #94699 - "Mysql deadlock and bugcheck on aarch64 under stress test". Bug report with a patch contributed by Cai Yibo. The fix is included in upcoming MySQL 8.0.17 and the bug is already closed.
  • Bug #94709 - "Regression behavior for full text index". This regression was reported by Carlos Tutte and properly verified (with regression tag added and all versions checked) by Umesh Shastry. See also detailed analysis of possible reason in the comment from Nikolai Ikhalainen.
  • Bug #94723 - "Incorrect simple query result with func result and FROM table column in where". Michal Vrabel found this interesting case when MySQL 8.0.15 returns wrong results. I've checked the test case on MariaDB 10.3.7 and it is not affected. Feel free to consider this check and statement my lame attempt to advertise MariaDB. I don't mind.
  • Bug #94730 - "Kill slave may cause start slave to report an error.". This bug was declared a duplicate of a nice Bug #93397 - "Replication does not start if restart MySQL after init without start slave." reported by Jean-François Gagné earlier. Both bugs were reported for MySQL 5.7.x, but I do not see any public attempt to verify if MySQL 5.6 or 8.0 is also affected. In the past it was required to check/verify bug on all GA versions supported if the test case applies. Nowadays this approach is too often not followed, even when bug reporter cared enough to provide MTR test case.
  • Bug #94737 - "MySQL uses composite hash index when not possible and returns wrong result". Yet another optimizer bug was reported by Simon Banaan. Again, MariaDB 10.3.7 is NOT affected. I can freely and happily state this here if it's inappropriate to state so in the bug report itself. By the way, other MySQL versions were probably not checked. Also, unlike Oracle engineer who verified the bug, I do not hesitate to copy/paste the entire results of my testing here:
    MariaDB [test]> show create table tmp_projectdays_4\G
    *************************** 1. row ***************************
           Table: tmp_projectdays_4
    Create Table: CREATE TABLE `tmp_projectdays_4` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `project` int(11) NOT NULL,
      `datum` date NOT NULL,
      `voorkomen` tinyint(1) NOT NULL DEFAULT 1,
      `tijden` tinyint(1) NOT NULL DEFAULT 0,
      `personeel` tinyint(1) NOT NULL DEFAULT 0,
      `transport` tinyint(1) NOT NULL DEFAULT 0,
      `materiaal` tinyint(1) NOT NULL DEFAULT 0,
      `materiaaluit` tinyint(1) NOT NULL DEFAULT 0,
      `materiaalin` tinyint(1) NOT NULL DEFAULT 0,
      `voertuigen` varchar(1024) DEFAULT '',
      `medewerkers` varchar(1024) DEFAULT '',
      `personeel_nodig` int(11) DEFAULT 0,
      `personeel_gepland` int(11) DEFAULT 0,
      `voertuigen_nodig` int(11) DEFAULT 0,
      `voertuigen_gepland` int(11) DEFAULT 0,
      `created` datetime DEFAULT NULL,
      `modified` datetime DEFAULT NULL,
      `creator` int(11) DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `project` (`project`,`datum`) USING HASH
    ) ENGINE=MEMORY AUTO_INCREMENT=2545 DEFAULT CHARSET=utf8mb4
    1 row in set (0.001 sec)

    MariaDB [test]> explain SELECT COUNT(1) FROM `tmp_projectdays_4` WHERE `project`
     IN(15409,15911,15929,15936,16004,16005,16007,16029,16031,16052,16054,16040,1248
    5,15892,16035,16060,16066,16093,16057,16027,15988,15440,15996,11457,15232,15704,
    12512,12508,14896,15594,16039,14997,16058,14436,16006,15761,15536,16016,16019,11
    237,13332,16037,14015,15537,15369,15756,12038,14327,13673,11393,14377,15983,1251
    4,12511,13585,12732,14139,14141,12503,15727,15531,15746,15773,15207,13675,15676,
    15663,10412,13677,15528,15530,10032,15535,15693,15532,15533,15534,15529,16056,16
    064,16070,15994,15918,16045,16073,16074,16077,16069,16022,16081,15862,16048,1606
    2,15610,15421,16001,15896,15004,15881,15882,15883,15884,15886,16065,15814,16076,
    16085,16174,15463,15873,15874,15880,15636,16092,15909,16078,15923,16026,16047,16
    094,16111,15914,15919,16041,16063,16068,15971,16080,15961,16038,16096,16127,1564
    1,13295,16146,15762,15811,15937,16150,16152,14438,16086,16156,15593,16147,15910,
    16106,16107,16161,16132,16095,16137,16072,16097,16110,16114,16162,16166,16175,16
    176,16178,15473,16160,15958,16036,16042,16115,16165,16167,16170,16177,16185,1582
    3,16190,16169,15989,16194,16116,16131,16157,16192,16197,16203,16193,16050,16180,
    16209,15522,16148,16205,16201,15990,16158,16216,16033,15974,16112,16133,16181,16
    188,16189,16212,16238,16241,16183,15640,15638,16087,16088,16129,16186,16164,1610
    8,15985,16244,15991,15763,16049,15999,16104,16208,13976,16122,15924,16046,16242,
    16151,16117,16187);

    +------+-------------+-------------------+------+---------------+------+---------+------+------+-------------+
    | id   | select_type | table             | type | possible_keys | key  | key_len | ref  | rows | Extra       |
    +------+-------------+-------------------+------+---------------+------+---------+------+------+-------------+
    |    1 | SIMPLE      | tmp_projectdays_4 | ALL  | project       | NULL | NULL    | NULL | 2544 | Using where |
    +------+-------------+-------------------+------+---------------+------+---------+------+------+-------------+
    1 row in set (0.004 sec)

    MariaDB [test]> SELECT COUNT(1) FROM `tmp_projectdays_4` WHERE `project` IN(1540
    9,15911,15929,15936,16004,16005,16007,16029,16031,16052,16054,16040,12485,15892,
    16035,16060,16066,16093,16057,16027,15988,15440,15996,11457,15232,15704,12512,12
    508,14896,15594,16039,14997,16058,14436,16006,15761,15536,16016,16019,11237,1333
    2,16037,14015,15537,15369,15756,12038,14327,13673,11393,14377,15983,12514,12511,
    13585,12732,14139,14141,12503,15727,15531,15746,15773,15207,13675,15676,15663,10
    412,13677,15528,15530,10032,15535,15693,15532,15533,15534,15529,16056,16064,1607
    0,15994,15918,16045,16073,16074,16077,16069,16022,16081,15862,16048,16062,15610,
    15421,16001,15896,15004,15881,15882,15883,15884,15886,16065,15814,16076,16085,16
    174,15463,15873,15874,15880,15636,16092,15909,16078,15923,16026,16047,16094,1611
    1,15914,15919,16041,16063,16068,15971,16080,15961,16038,16096,16127,15641,13295,
    16146,15762,15811,15937,16150,16152,14438,16086,16156,15593,16147,15910,16106,16
    107,16161,16132,16095,16137,16072,16097,16110,16114,16162,16166,16175,16176,1617
    8,15473,16160,15958,16036,16042,16115,16165,16167,16170,16177,16185,15823,16190,
    16169,15989,16194,16116,16131,16157,16192,16197,16203,16193,16050,16180,16209,15
    522,16148,16205,16201,15990,16158,16216,16033,15974,16112,16133,16181,16188,1618
    9,16212,16238,16241,16183,15640,15638,16087,16088,16129,16186,16164,16108,15985,
    16244,15991,15763,16049,15999,16104,16208,13976,16122,15924,16046,16242,16151,16
    117,16187);

    +----------+
    | COUNT(1) |
    +----------+
    |     2544 |
    +----------+
    1 row in set (0.025 sec)

    MariaDB [test]> select version();
    +--------------------+
    | version()          |
    +--------------------+
    | 10.3.7-MariaDB-log |
    +--------------------+
    1 row in set (0.021 sec)
    When the job was done properly I see no reasons NOT to share the results.
  • Bug #94747 - "4GB Limit on large_pages shared memory set-up". My former colleague Nikolai Ikhalainen from Percona noted this nice undocumented "feature" (Had I forgotten to advertise Percona recently? Sorry about that...) He proved with a C program that one can create shared memory segments on Linux larger than 4GB, one just has to use the proper data type, unsigned long integer, in MySQL's code. Still, this report ended up as non-critical bug in "MySQL Server: Documentation" category, or even maybe a feature request internally. What a shame!
Spring in Paris is nice, as this photo made 3 years ago proves. The way MySQL bug reports are handled this spring is not as nice in some cases.
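
For reference, the workaround that was used to reject Bug #94629 above may look like the following sketch. The channel name 'channel_1' is hypothetical, and the sketch assumes replication without GTIDs, where sql_slave_skip_counter can be used at all.

```sql
-- Stop replication on ALL channels first.
STOP SLAVE;

-- Ask the next started SQL thread to skip one event group.
SET GLOBAL sql_slave_skip_counter = 1;

-- Start only the channel where the problematic transaction is
-- ('channel_1' is a hypothetical channel name).
START SLAVE FOR CHANNEL 'channel_1';

-- Check that the channel moved on, then resume all other channels.
SHOW SLAVE STATUS FOR CHANNEL 'channel_1';
START SLAVE;
```

Note that every channel is stopped for a while just to skip something in one of them, which is exactly why a per-channel option was requested.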
To summarize:
  1. It seems recently the fact that there is some limited workaround already published somewhere is a good enough reason NOT to accept valid feature request. Noted.
  2. Regression bugs (reports about drop in performance or problem that had not happened with older version but happens with some recent) are still not marked with regression tag sometimes. Moreover, clear performance regressions in MySQL 8.0.x vs MySQL 5.7.x may end up as just feature requests... A request to "Make MySQL Great Again" maybe?
  3. MySQL engineers who verify bugs often do not care to check all major versions and/or share the results of their tests. This is unfortunate.
  4. Some bugs are not classified properly upon verification. The fact that wrong data type is used is anything but severity 3 documentation problem, really.

Saturday, December 15, 2018

Fun with Bugs #75 - On MySQL Bug Reports I am Subscribed to, Part XII

From the lack of comments to my previous post it seems everything is clear with ERROR 1213 in different kinds and forks of MySQL. I may still write a post or two about MyRocks or TokuDB deadlocks one day, but let's get back to my main topic of MySQL bugs. Today I continue my series of posts about community bug reports I am subscribed to with a review of bugs reported in November, 2018, starting from the oldest and skipping those MySQL 8 regression ones I've already commented on. I also skip documentation bugs that should be a topic for a separate post one day (to better illustrate these statements of mine).

These are the most interesting bug reports from Community members in November 2018:
  • Bug #93139 - "mysqldump temporary views missing definer". This bug reported by Nikolai Ikhalainen from Percona looks like a regression (that can appear in a bit unusual case of missing root user) in all versions starting from 5.6. There is no regression tag, surely. Also for some reason I do not see 8.0.x as affected version, while from the text it seems MySQL 8 is also affected.
  • Bug #93165 - "Memory leak in sync_latch_meta_init() after mysqld shutdown detected by ASan". This bug was reported by Yura Sorokin from Percona, who also made important statement in his last comment (that I totally agree with):
    "In commit https://github.com/mysql/mysql-server/commit/e93e8db42d89154b37f63772ce68c1efda637609 you literally made 14 MTR test cases ignore ALL memory problems detected by ASan, not only those which you consider 'OK' when you terminate the process with the call to 'exit()'. In other words, new memory leaks introduced in FUTURE commits may not be detected because of those changes. Address Sanitizer is a very powerful tool and its coverage should be constantly extending rather than shrinking."
  • Bug #93196 - "DD crashes on assert if ha_commit_trans() returns error". It seems Vlad Lesin from Percona spent notable time testing everything related to new MySQL 8 data dictionary (maybe while Percona worked on their Percona Server for MySQL 8.0 that should have MyRocks also supported, should be able to provide native partitioning and proper integration with data dictionary). See also his Bug #93250 - "the result of tc_log->commit() is ignored in trans_commit_stmt()".
  • Bug #93241 - "Query against full text index with ORDER BY silently fails". Nice finding by Jonathan Balinski, with detailed test cases and comments added by Shane Bester. One more confirmation that FULLTEXT indexes in InnoDB are still problematic.
  • Bug #93276 - "Crash when calling mysql_real_connect() in loop". Nice regression in C API (since 8.0.4!) noted by Reggie Burnett and still not fixed.
  • Bug #93321 - "Assertion `rc == TYPE_OK' failed". Last but not least, yet another debug assertion (and error in non-debug build) found in MySQL 8.0.13 by Roel Van de Paar from Percona. You already know where QA for MySQL happens to large extent, don't you?
  • Bug #93361 - "memory/performance_schema/table_handles have memory leak!". It's currently in "Need Feedback" status and may end up as not a bug, but I've never seen 9G of memory used for just one Performance Schema table so far. It's impressive.
  • Bug #93365 - "Query on performance_schema.data_locks causes replication issues". Probably the first case when it was proved that query to some Performance Schema table may block some important server activity. Nice finding by Daniël van Eeden.
  • Bug #93395 - "ALTER USER succeeds on master but fails on slave." Yet another way to break replication was found by Jean-François Gagné. See also his Bug #93397 - "Replication does not start if restart MySQL after init without start slave."
  • Bug #93423 - "binlog_row_image=full not always honored for binlog_format=MIXED". For some reason this bug (with a clear test case) reported by James Lawrie is still "Open".
  • Bug #93430 - "Inconsistent output of SHOW STATUS LIKE 'Handler_read_key';". This weird inconsistency was found by Przemysław Skibiński from Percona.
Thinking about the future of MySQL 8 somewhere in Greenwich...
To summarize this review:
  1. I obviously pay a lot of attention to bug reports from Percona engineers.
  2. It seems memory problems detected by ASan in some MTR test cases are deliberately ignored instead of being properly fixed.
  3. There are still many surprises waiting for early adopters of MySQL 8.0 GA :) 
That's all I have to say about specific MySQL bugs in 2018. Next "Fun with Bugs" post, if any, will appear only next year. I am already subscribed to 11 bugs reported in December 2018. Stay tuned!

Tuesday, July 24, 2018

On Some Problematic Oracle MySQL Server Features

In one of my previous posts I stated that in Oracle's MySQL server some old enough features remain half-baked, not well tested, not properly integrated with each other, and not documented properly. It's time to prove this statement.

I should highlight from the very beginning that most of the features I am going to list are not that much improved by other vendors. But they at least have an option of providing other, fully supported storage engines that may overcome the problems in these features, while Oracle's trend to get rid of most engines but InnoDB makes MySQL users more seriously affected by any problems related to InnoDB.

The Royal Pavilion in Brighton looks nice from the outside and is based on some great engineering decisions, but the decorations had never been completed, some interiors were ruined and never restored, and the building was used for too many different purposes over years.
The list of problematic MySQL server features includes (but is not limited to) the following:
  • InnoDB's data compression

    Classical InnoDB compression (row_format=compressed) has limited efficiency and does not get any attention from developers recently. Transparent page compression for InnoDB seems to be originally more like a proof of concept in MySQL that may not work well in production on commodity hardware and filesystems, and was not integrated with backup tools.
  • Partitioning

    Bugs reported for this feature by MySQL Community do not get proper attention. DDL against partitioned tables and partition pruning do not work the way DBAs may expect. We still miss parallel processing for partitioned tables (even though proof of concept for parallel DDL and some kinds of SELECTs was ready and working 10 years ago). Lack of careful testing of partitioning integration with other features is also visible.
  • InnoDB's FULLTEXT indexes
    This feature appeared in MySQL 5.6, but 5 years later there are still all kinds of serious bugs in it, from wrong results to hangs, debug assertions and crashes. There are performance regressions and missing features compared to MyISAM FULLTEXT indexes, and this makes the idea to use InnoDB for everything even more problematic. Current implementation is not designed to work with really large tables and result sets. DBAs should expect problems during routine maintenance activities, like ALTERing tables or dumps and restores when any table with InnoDB FULLTEXT index is involved.

  • InnoDB's "online" DDL implementation
    It is not really "online" in too many important practical cases and senses. Replication ignores LOCK=NONE and slave starts to apply "concurrent" DML only after commit, and this may lead to a huge replication lag. The entire table is rebuilt (data are (re-)written) too often, in place or by creating a copy. One recent improvement in MySQL 8, "instant ADD COLUMN", was actually contributed by Community. The size of the "online log" (that is kept in memory and in temporary file) created per table altered or index created, depends on concurrent DML workload and is hard to predict. For most practical purposes good old pt-online-schema-change or gh-ost tools work better.

  • InnoDB's persistent optimizer statistics

    Automatic statistics recalculation does not work as expected, and to get proper statistics explicit ANALYZE TABLE calls are still needed. The implementation is complicated and introduced separate implicit transactions (in dirty reads mode) against statistics tables. Bugs in the implementation do not seem to get proper priority and are not fixed.
I listed only those features I recently studied in some details in my previous blog posts. I've included main problems with each feature according to my older posts. Click on the links in the list above to find the details.
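
Two of the items above, persistent statistics and "online" DDL, can be sketched with a few statements. This is a minimal illustration on a hypothetical table test.t1 with a column val, not a test case from any of the bug reports.

```sql
-- Persistent statistics: recalculate explicitly, then inspect what was stored.
ANALYZE TABLE test.t1;

SELECT n_rows, clustered_index_size, last_update
  FROM mysql.innodb_table_stats
 WHERE database_name = 'test' AND table_name = 't1';

SELECT index_name, stat_name, stat_value, sample_size
  FROM mysql.innodb_index_stats
 WHERE database_name = 'test' AND table_name = 't1';

-- "Online" DDL: request an in-place, non-locking change explicitly, so that
-- the statement fails with an error instead of silently using a more
-- restrictive algorithm or lock level.
ALTER TABLE test.t1 ADD INDEX idx_val (val), ALGORITHM=INPLACE, LOCK=NONE;
```

Comparing last_update and n_rows before and after the explicit ANALYZE TABLE is a simple way to see whether automatic recalculation did anything on its own.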

The Royal Pavilion of InnoDB in MySQL is beautiful from the outside (and somewhere inside), but is far from being completed, and some historical design decisions do not seem to be improved over years. We are lucky that it is still used and works nice for many current purposes, but there are too many dark corners and background threads there where even Oracle engineers rarely look and even fewer work on improving them...

Sunday, March 4, 2018

On InnoDB's FULLTEXT Indexes

I had recently written about InnoDB features that I try to avoid by all means if not hate: "online" DDL and persistent optimizer statistics. Time to add one more to the list - FULLTEXT indexes.

This feature had a lot of problems when initially introduced in MySQL 5.6. There was a nice series of blog posts about the initial experience with it by my colleague from Percona (at that time): part I, part II, and part III. Many of the problems mentioned there have been resolved or properly documented since then, but even more were discovered. So, InnoDB FULLTEXT indexes may be used, with care, when MyISAM or other engines/means to add fulltext search are not an option. The list of bugs that are still important and must be taken into account is presented below.

What forced me to get back to this feature recently and hate it sincerely is one customer issue that led to this bug report: MDEV-14773 - "ALTER TABLE ... MODIFY COLUMN ... hangs for InnoDB table with FULLTEXT index". Note that I have to refer to the MariaDB bug report here, as the related upstream Bug #88844 is hidden from community (probably considered a shame, if not a security problem)! The bug is simple: if one applies any ALTER to an InnoDB table with a FULLTEXT index, even one not related to that index and its columns in any way, chances are high that this ALTER may cause a kind of hang/infinite loop/conflict between the thread that tries to drop the temporary table used by ALTER, as one of the last steps, and the FTS background optimize thread. Similar to the other two problematic features, new background threads were introduced, and their cooperation with other threads in InnoDB seems to be not that well designed/implemented.

There are many other bugs to take into account if you ever plan to add any single FULLTEXT index to your InnoDB table. Here is the list of the most important ones, mostly still "Verified" or open and ignored, that I collected during one of the calm night shifts this week:
  • Bug #78048 - "INNODB Full text Case sensitive not working". This bug was fixed only recently, in MySQL 5.6.39, 5.7.21, and 8.0.4.
  • Bug #83776 - "InnoDB FULLTEXT search returns incorrect result for operators on ignored words". Still "Verified" on all GA versions and 8.0.x.
  • Bug #76210 - "InnoDB FULLTEXT index returns wrong results for key/value pair documents". This bug was reported by Justin Swanhart 3 years ago, quickly verified and then seems to be ignored.
  • Bug #86036 - "InnoDB FULLTEXT index has too strict innodb_ft_result_cache_limit max limit". I reported this bug 10 months ago, and it was immediately "Verified". It seems FULLTEXT indexes are hardly useful in general for large InnoDB tables because of this limitation.
  • Bug #78977 - "Enable InnoDB fulltext index to use generated FTS_DOC_ID column". This is a feature request (still "Open") to get rid of this well known limitation/specific column.
  • Bug #86460 - "Deleted DOCID are not maintained during OPTIMIZE of InnoDB FULLTEXT tables". If you want to get rid of deleted DOC_IDs in INNODB_FT_DELETED, better just run ALTER TABLE ... ENGINE=InnoDB.
  • Bug #75763 - "InnoDB FULLTEXT index reduces insert performance by up to 6x on JSON docs". Yet another verified bug report by Justin Swanhart.
  • Bug #69762 - "InnoDB fulltext match against in boolean mode misses results on join". Let me quote the last comment there:
    "Since innodb doesn't support fulltext search on columns without fulltext index, and it is very complicated to support search on columns in multiple fulltext indexes in optimizer, it won't be fixed.

    We admit it's a point innodb fulltext is not compatible with myisam."
  • Bug #85880 - "Fulltext query is too slow when each ngram token match a lot of documents". This bug is still "Open".
  • Bug #78485 - "Fulltext search with char * produces a syntax error with InnoDB". Yet another verified regression compared to MyISAM FULLTEXT indexes. Nobody has cared for 2.5 years.
  • Bug #80432 - "No results in fulltext search for top level domain in domain part of email". It ended up as "Won't fix", but at least a workaround was provided by an Oracle developer.
  • Bug #81819 - "ALTER TABLE...LOCK=NONE is not allowed when FULLTEXT INDEX exists". Online ALTER just does not work for tables with FULLTEXT indexes. This is a serious limitation.
  • Bug #72132 - "Auxiliary tables for InnoDB FTS indexes are always created in shared tablespace". This bug report of mine was fixed in 5.6.20+ and 5.7.5+, but the fact that this regression was not noted for a long time internally says a lot about the way the feature was developed and maintained.
  • Bug #83560 - "InnoDB FTS - output from mysqldump extremely slow and blocks unrelated inserts". I have yet to check the metadata locks set when a table with a FULLTEXT index is used in various SQL statements, but from this "Verified" report it is clear that just loading a dump of a table with FULLTEXT indexes may be too slow for any large table.
  • Bug #71551 - "ft_boolean_syntax has no impact on InnoDB FTS". Yet another inconsistency with MyISAM FULLTEXT indexes that was reported 4 years ago and "Verified", but has been ignored since then.
  • Bug #83741 - "InnoDB: Failing assertion: lock->magic_n == 22643". Surely, debug assertions can be ignored, but in most cases they are in the code for a good reason. This failure was reported by Roel Van de Paar from Percona.
  • Bug #83397 - "INSERT INTO ... SELECT FROM ... fails if source has > 65535 rows on FTS". This "Verified" bug alone, reported by Daniël van Eeden, makes InnoDB FULLTEXT indexes hardly usable in production for large tables.
  • Bug #80296 - "FTS query exceeds result cache limit". The bug is "Closed" silently (by the bug reporter maybe, Monty Solomon?), but users report that recent enough versions like 5.6.35 and 5.7.17 are still affected. See also Bug #82971 (no fix for MySQL 5.6.x for sure).
  • Bug #85876 - "Fulltext search can not find word which contains "," or "."". Still "Verified" after 11 months.
  • Bug #68987 - "MySQL crash with InnoDB assertion failure in file pars0pars.cc". Crash was reported in MySQL 5.6.10, not repeatable. Then (different?) assertion failure was reported in debug builds only in MySQL 5.6.21+, and verified. Not sure what's going on with this bug report...
  • Bug #83398 - "Slow and unexpected explain output on FTS". The fact that EXPLAIN may be slow when the table with FULLTEXT index is involved is now documented, so this report by Daniël van Eeden is closed.
  • Bug #81930 - "incorrect result with InnoDB FTS and subquery". This bug report about wrong results by Sergei Golubchik from MariaDB was immediately "Verified", but ignored since that time.
  • Bug #80347 - "mysqldump backup restore fails due to invalid FTS_DOC_ID (Error 182 and 1030)". There is a workaround based on mydumper/myloader at least...
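
As an illustration of the workaround mentioned for Bug #86460 above, here is a minimal sketch. The schema `test` and the table `articles` are hypothetical names made up for this example; `innodb_ft_aux_table` and `INFORMATION_SCHEMA.INNODB_FT_DELETED` are the standard server variable and table.

```sql
-- Hypothetical table in a schema assumed to be named "test".
CREATE TABLE articles (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  body TEXT,
  FULLTEXT KEY ft_body (body)
) ENGINE=InnoDB;

-- Point the INFORMATION_SCHEMA.INNODB_FT_* tables at this table
-- (the format is db_name/table_name; SET GLOBAL needs proper privileges).
SET GLOBAL innodb_ft_aux_table = 'test/articles';

-- Number of document IDs marked as deleted but still kept around.
SELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_FT_DELETED;

-- Full rebuild of the table (and its FULLTEXT index).
ALTER TABLE articles ENGINE=InnoDB;

-- Check again after the rebuild.
SELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_FT_DELETED;
```

Keep in mind that such a rebuild is itself an ALTER on a table with a FULLTEXT index, so it is better tried on a copy of the data first.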
To summarize, InnoDB FULLTEXT indexes are one of the most problematic InnoDB features for any production use because:
  • There are all kinds of serious bugs, from wrong results to hangs, debug assertions and crashes, that do not seem to get any internal priority and stay "Verified" for years.
  • There are performance regressions and missing features compared to MyISAM FULLTEXT indexes, so migration may cause problems.
  • InnoDB FULLTEXT indexes are not designed to work with really large tables/result sets.
  • You should expect problems during routine DBA activities, like ALTERing tables or dumps and restores when any table with InnoDB FULLTEXT index is involved. 
If you still plan/have to use them, please make sure to use the latest MySQL version, check the list above carefully, and test/check the results of fulltext searches and routine DBA operations like altering the table. You may get a lot of surprises. Consider alternatives like Sphinx seriously.
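
A very rough sanity check of fulltext search results may look like the sketch below. It assumes a hypothetical table `articles` with a `body` column covered by a FULLTEXT index; the search word is arbitrary.

```sql
-- Hypothetical table articles(id, body) with a FULLTEXT index on body.
-- Rows found via the FULLTEXT index:
SELECT COUNT(*) FROM articles
 WHERE MATCH(body) AGAINST('optimizer' IN BOOLEAN MODE);

-- Rows found by a brute-force scan, for comparison:
SELECT COUNT(*) FROM articles
 WHERE body LIKE '%optimizer%';
```

The two counts do not have to be equal (LIKE matches substrings, while the FULLTEXT search is subject to tokenization, stopwords and minimum token size), but a large unexplained difference is a good reason to study the bug reports above in more detail.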