My "Fun with Bugs" series of blog posts about interesting or badly handled MySQL bug reports ended more than 7 months ago. The time had come for that. But I honestly miss that media for bitching about something wrong in MySQL once in a while...
The year of COVID-19 pandemic with video conferences that replaced normal offline ones forced me to start recording videos, and I even used to like the process so much that I am working on my wife's channel content as a hobby. I've created my own channel as well, for uploading some draft/bad/long/extended videos recorded in the process of work on my online talks:
Being fluent enough with recording videos at home (using different software tools and cameras) and publishing them at YouTube, I wonder now if it makes sense to turn this activity into a regular and MySQL-concentrated one? My next talk(s) will be submitted to Percona Live 2021, but it means they may go live only in May, and I'd like to be on screens earlier than that.
So, I wonder should I maybe have a regular video recordings shared, let's say, once a week every Tuesday, up to 5 minutes long at most, and devoted to some MySQL-related topics? What topics would you like me to cover? Would you mind if it will be a five minutes talk about a recent MySQL bug report or few of them, either interesting in general or badly handled by my former Oracle colleagues? Something else to better spend megabytes of video on? Leave it to younger and more attractive and experienced speakers? Keep writing here or stop bitching about the bugs once and for good?
I am waiting for your comments in this blog post and in social media that I'll share it till March 5, 2021. Then I'll decide on how to proceed with this regular YouTube videos idea.
I delayed this post #100 in the "Fun with Bugs" series for few weeks - the previous one was published 4 weeks ago. The idea was to make it the last one, and for this I needed something to celebrate. Two days ago proper event happened, we have MySQL Bug #100000 reported! Here it is:
Bug #100000 - "Provide an index hint that only affects the choice of index for NL join". This nice feature request was added by former optimizer developer in MySQL, Øystein Grøvlen. Hundreds of other feature requests are waiting for the attention both from the MySQL Verification Team and from developers, so good to see a feature request getting the number that nobody ever forget!
Actually Øystein Grøvlen created several interesting bug report during that day:
Bug #99994 - "Index range scan is chosen where table scan takes 40% less time". Clear and simple bug report that relies on the world sample database.
Bug #99995 - "Histogram is not used for filtering estimate when index is disabled".
Bug #99996 - "Prefer histogram over index statistics when eq_range_index_dive_limit is exceeded". This was verified as a feature request.
Bug #99997 - "Range estimates are usually off by a factor of 2 for large ranges". It was declared a duplicate of older bug report I am also subscribed to, Bug #73386 - "For ranges, innodb doubles estimates, or caps estimates to half the table". See also this MariaDB bug report, MDEV-19424 - "InnoDB's records_in_range estimates are capped at about 50%", and links from it for a lot of related discussions. Let's wait and see what vendor resolves this faster... The other report, Bug #99998 - "For large ranges, the range estimate will never exceed 50%", is probably also a duplicate of the same old bug.
Bug #99999 - "EXPLAIN FORMAT=TREE does not show cost/rows for semijoin materialization". Yet another nice and clear bug report.
So, hardly anyone else had a chance to get that #100000 filed. As far as I can see, all these were reported during a very short period of 2 minutes, from "26 Jun 7:57" till "26 Jun 7:58"! Not sure how to do this without some automation or at least all the details ready for quick copy/pasting!
Now back to some older bugs I've subscribed to over last 4 weeks:
Bug #99791 - "MySQL 8 orphaned table due to unchecked non-existent row format check." As reported by Marc Reilly, tables created in versions < MySQL 8 which use row_format COMPRESSED or REDUNDANT, where row_format is not set explicitly in the Table DDL allow users to create un-prefixed indexes on fields which exceed the maximum column size of 767 bytes. Upgrading to MySQL 8 do nothing with these tables, but as soon as new index is added and reboot happens, such a table becomes inaccessible. What a surprise!
Bug #99794 - "MySQL 57 client is inefficient at bulkloads/binlog replay". In this bug report Marc Reilly basically asks to back port the fix from MySQL 8.0.13.
Bug #99805 - "mysql async client is incomplete". There is no way to determine file descriptor state (should it block on read or write), so it is impossible to use it in asynchronous contexts without busy looping. This bug report by Domas Mituzas was used in one discussion as an argument that MySQL bugs database still gets proper attention from MySQL engineers. It's truly so.
Bug #99892 - "initialize with innodb_page_size=4096 gets "Specified key was too long" errors". This is a regression vs 5.7 (without a tag). As Mark Callaghan found out, one can not initialize MySQL 8 instance without errors with such a small innodb_page_size.
Bug #99924 - "The record per key value from InnoDB is not suitable when n_diff is zero". As reported by Ze Yang, due to lack of locking when server reads the innodb_rec_per_key, the n_diff value may be 0 (not set) while the table->stat_n_rows is > 0. As a result (see great comment by Øystein Grøvlen), if a table object is opened during the recalculation of statistics, the rec_per_key for a column/index may be quite misleading. It will be interpreted as all rows have the same value, and the index will probably not be chosen for any non-covering scans. There is a patch suggested (to set rec_per_key to 1 or 10 in such case), as well as other suggestion to set the value REC_PER_KEY_UNKNOWN. Useful reading!
Bug #99933 - "In-memory hash join will only use two-thirds of join buffer". Yet another bug report related to hash joins from Øystein Grøvlen, with a fix suggested. See also his Bug #99934 - "Hash join adds columns to the hash table that is not needed." There is a lot of work ahead to improve the implementation of this new feature in MySQL 8.
Bug #99935 - "innodb_doublewrite_files is not correct when innodb_buffer_pool_size > 1G". Just 2 files are created instead of 16 according to the manual. This bug was reported by Satya Bodapati.
Bug #99943 - "Hash join does not work for Semijoin and Antijoin". This bug report from Tibor Korocz was "Verified", but later comments suggest that it's more like wrong expectations/interpretation of cases when the feature has to be used (it is supposed to be used instead of BNL, but not instead of semijoin materialization and subquery materialization). Let's wait and see how it ends up...
Bug #99966 - "Switching to use NUMA-SMART Counter for statistical counters". Great bug report from Krunal Bauskar, with a patch suggested. I hope to get a NUMA system one day myself to understand the challenges and performance problems there better.
So, that's it, my very last post in the "Fun with Bugs" series that started more than 7 years ago. The series where I listed most of the interesting bug reports I keep an eye on, since Bug #2. It was a long way with a lot of fun and a lot of (rarely appreciated) work in the process, but now my watch has ended. I am not going to try to micro manage MySQL bugs processing any more and finally let the MySQL Verification Team do their job without my regular attention. Good luck!
Percona had recently started to blog about bugs, so I am sure they will keep an eye and share lists of important bugs on a regular basis. They should really care more than I do these days.
In my previous post in this series I've commented on some interesting MySQL bug reports that were added during the second half of April. Time to move on to bugs reported in May, 2020, as we are quickly approaching MySQL Bug #100000 soon and I want to write a separate post for this great achievement :)
Here is the list:
Bug #99432 - "Improving memory barrier during rseg allocation". Nice contribution by my former colleague in Percona, Krunal Bauskar, who now works on making MySQL better for ARM processors. According to his findings, the use of a relaxed memory model improves performance on ARM by up to 2%. See also yet another bug report with a contribution that matters for ARM, Bug #99556 - "Avoid sequentially consistent atomics for atomic counters" (contributed by Sergey Glushchenko from Percona).
Bug #99444 - "New HASH JOIN order problem". One should not expect and rely on any specific order unless explicit ORDER BY is used, so formally this report by Gabor Aron is "Not a Bug". I put it into this list as several other community members helped him a lot in understanding why results with HASH_JOIN optimization in newer versions are still valid and what are the ways to get the results with the desired ordering efficiently. Guilhem Bichot, for instance, suggested two different ways, using window function and lateral table. Useful reading in any case!
Bug #99458 - "i_s_fts_index_cache_fill_one_index() is not protect by the lock". Looks like even crashes are possible as a result, based on comments. Nice finding by Haiqing Sun.
Bug #99459 - "SQL run with GROUP_MIN_MAX may infinite loop and never return". After some discussion around the validity and severity of bug reports where test case involved adding DEBUG_SYNC() to show the problem in a predictable way, this great bug report by Ze Yang was verified. All MySQL GA versions are affected, including 8.0.20! As a side note, I'd prefer NOT to read such discussions any more. They are wasting time of all parties involved.
Bug #99499 - "Incorrect result when constant equailty expression is used in LEFT JOIN condition". This bug that affects MySQL 5.7.x only (it was fixed in MySQL 8.0.17+ and in 5.6 code was different) was reported by Marcos Albe from Percona.
Bug #99504 - "Generated column incorrect on INSERT when based on column w/ expression DEFAULT". Several problems are highlighted in the complex enough test case submitted by Brad Lanier.
Bug #99582 - "Reduce logging of new doublewrite buffer initialization which is confusing". 180 lines or so are added when --log-error-verbosity is set to 3. As a workaround one can add:
log-error-suppression-list="MY-011950"
to the [mysqld] section of the .cnf file. This problem was reported by Simon Mudd. Make sure to read all comments.
Bug #99591 - "Option --tc-heuristic-recover documentation wrong, missing details". In reality it does not work with more than one XA-capable engine installed. I wish fine manual documents the reality, not the good intentions of the past. This documentation request was added by Sami Ahlroos.
Bug #99593 - "Performance issues in 8.0.20". It seems to be yet another TempTable engine problem that caused regression comparing to MySQL 5.7. At least this:
SET GLOBAL internal_tmp_mem_storage_engine=MEMORY;
is a workaround. The bug (a duplicate of internal Bug #30562964) was reported by billy noah and is fixed in upcoming MySQL 8.0.21.
Bug #99601 - "Broken Performance using EXIST function, increasing execution time each loop". This regression bug (without tag, but who cares...) in MySQL 8.0 was reported by Ronny Görner and minimal test case demonstrating that the problem is actually with function call was contributed by
Shane Bester.
Bug #99643 - "innobase_commit_by_xid/innobase_rollback_by_xid is not thread safe". This bug was reported by Zhai Weixiang, who had also suggested the fix in the code.
Bug #99717 - "Performance regression of parallel count". Great bug report with code analysis and ready to use MTR test case from Ze Yang. Sunny Bains already confirmed that the problematic code is going to be removed.
To summarize:
I am happy to see Oracle engineers explaining community bug reporters the reasons and possible solutions for the problems they hit that are not actually caused by any bug in MySQL. I tried to do this as well, whenever possible, while working on MySQL bugs...
We can still find speculations that if the bug is repeatable only by adding DEBUG_SYNC() or similar debug lines, then it can not be verified or gets lower severity... IMHO this is nonsense, as there are many high severity verified real bug reports where this method is used to demonstrate the problem clearly. Just stop it!
There are many things to write about MySQL this week. It turned 25 years old, to begin with! Quite successful the first ever Percona Live ONLINE 24 hour conference also happened this week, and I've presented my talk there...
But this is a blog of former MySQL Entomologist, so when I have nothing ready to share about something exciting or immediately useful, I write about MySQL bugs. This is the case today as well. I need more time to think about MySQL history and write down the details to complement my presentation etc. Previous post in this series appeared a month ago, so let me present a yet another list of InnoDB, optimizer, replication and few other bugs reported by MySQL Community users since April 18, 2020 and before May, 2020:
Bug #99295 - "InnoDB in-memory estimate is wrong after an index is created". As Øystein Grøvlen found out, even though the entire newly added covering index is in the buffer pool, the buf_stat_per_index_t::get() function in MySQL 8 estimates just 1 page is in memory, and as a result the index is not used.
Bug #99326 - "undo truncation might still not be crash safe". After nice explanations of what may happen from
Kevin Lewis:
"After and internal discussion with Sunny Bains, I think I understand the concern better. Let's assume that a redo log is so large that it contains redo entries for all 512 Space IDs of an undo tablespace that is being truncated too often. In other words, even though each truncate removes old pages from the buffer pool and flushes newly created pages, it does not actually cause a checkpoint for each truncation like it did in 5.7. So the redo log can possibly contain records for more than 512 space IDs.
There is a worklog tested and pushed to the 8.0.21 release branch that fixes this highly unlikely possibility.
As part of WL#11819, we keep a count of the number of truncations that have happened between checkpoints. So if there is more than (512 / 8) truncations between checkpoints, then no more truncations can happen on that undo space until the next checkpoint happens."
this (regression vs 5.7) bug report by Zanye Zjy was closed as "Not a Bug". This is a totally wrong status and handling for such a case.
Bug #99354 - "Nondeterministic stored function returns incorrect results". This interesting bug was reported by Jacob Chafik. If anyone cares, it is still repeatable on 8.0.20 (and not repeatable on MariaDB 10.4.x):
mysql> SELECT IF(COUNT(t1.id) > 0, "Success", "Failure") "Result" FROM t1 WHERE 1 IN (SELECT t1_inner.id FROM t1 t1_inner INNER JOIN t2 WHERE f1(t2.id) > 0); +---------+ | Result | +---------+ | Failure | +---------+ 1 row in set (0,04 sec)
mysql> SELECT IF(COUNT(t1.id) > 0, "Success", "Failure") "Result"FROM t1 WHERE 1 IN (SELECT t1_inner.id FROM t1 t1_inner INNER JOIN t2 WHERE t2.id > 0); +---------+ | Result | +---------+ | Success | +---------+ 1 row in set (0,00 sec)
mysql> explain SELECT IF(COUNT(t1.id) > 0, "Success", "Failure") "Result" FROM t 1 WHERE 1 IN (SELECT t1_inner.id FROM t1 t1_inner INNER JOIN t2 WHERE f1(t2.id) > 0); +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+ | 1 | SIMPLE | t1_inner | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where | | 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | FirstMatch | | 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Using where; Using join buffer (hash join) | +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+ 3 rows in set, 1 warning (0,00 sec)
mysql> explain SELECT IF(COUNT(t1.id) > 0, "Success", "Failure") "Result" FROM t1 WHERE 1 IN (SELECT t1_inner.id FROM t1 t1_inner INNER JOIN t2 WHERE t2.id > 0); +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------------------------+ | 1 | SIMPLE | t1_inner | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where | | 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where; FirstMatch | | 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Using join buffer (hash join) | +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------------------------+ 3 rows in set, 1 warning (0,00 sec)
mysql> select version(); +-----------+ | version() | +-----------+ | 8.0.20 | +-----------+ 1 row in set (0,00 sec)
Wrong results should not be produced for logically equivalent queries, even if plans are different. Unfortunately this regression (since 5.7.5+) bug had not got the "regression" tag.
Bug #99359 - "Order by in group_concat with prepare statement returns unexpected results". Two executions of the same statement with the user variable having the same value assigned should either both work or fail with error message, it's a matter of consistency. This is not the case in a situation described in the bug report from Feng Liyuan.
Bug #99363 - "Innodb_data_pending_fsyncs shows incorrect value". In this bug report SeongUck Lee clearly shown that the problem happens and gave some hints based on source code review what could cause this regression in 8.0 comparing to 5.7. Moreover, after adding debug assertion he managed to show how it is hit and what wrong values are produced, in gdb. Still this report for now ended up as "Not a Bug" (until a test case is presented, reportedly. What a shame!
Bug #99377 - "Assertion `thd->get_transaction()->is_empty(Transaction_ctx::STMT)' failed in ". Surely dropping and re-creating the mysql.general_log table is a corner case, but even in debug builds it should end up with some proper error message, not just assertion failure. This bug was reported by
Roel Van de Paar.
Bug #99398 - "Data in wrong row on left join". This regression bug in MySQL 8.0.20 reported by Soner Sayakci is already fixed in upcoming MySQL 8.0.21. Good news.
Bug #99412 - "Threads_running becomes scalability bottleneck on multi-node NUMA topologies". Nice bug report from Sergey Glushchenko, with a patch contributed. perf annotate was used to demonstrate the problem. Good to know Percona engineers keep working to make MySQL perform better!
Bug #99413 - "Constant propagation get the wrong result when mix with the different collations". Again, this bug report from Wj Huang was closed as 'Not a Bug" based mostly on the results from 8.0.21+ that is not yet released. This is either a "Duplicate" if the exact fix that helped in 8.0.21+ can be isolated, or "Can't repeat", but setting it to "Not a Bug" seems wrong to me.
Bug #99414 - "Where exists performance 8.0.16 onwards". This optimizer regression bug in 8.0.16+ (comparing to 5.7) was reported by Jonathan Taylor. See also a comment by Øystein Grøvlen about the tool available at https://github.com/ogrovlen/opttrace to pre-process and get a condensed trace for the join optimizer, that helps to understand what's going on.
Let me stop for now. Many more interesting bugs were reported in May, so stay tuned!
Happy Birthday, MySQL and Sakila!
To summarize:
Now (regression!) bugs are sometimes closed as "Not a Bug" after explanation what the problem is and statement that there is a fix in a version not yet released. Not as a "Duplicate", but "Not a Bug", after confirming the problem. This is a new level of wrong bugs handling!
Same with a bug where bug reporter is still working on a repeatable test case, after showing the problem happened with some evidence, "Not a Bug".
Oracle does a good job recently in making sure MySQL Server bugs are not hanging around without reaction and are processed fast, but looks like this is partially achieved by lowering standards of bugs processing that were established over last 15+ years. Closing reports by all means as soon as possible as "Not a Bug" is not a way to go. This is sad.
Time for the next blog post about MySQL bugs! The previous one covered some bugs I considered interesting in March, this one will be about community bug reports that I've picked up since March 26, 2020. I'd better review bug fixed in MySQL 5.7.30 instead of this, but it is still not released, even though we know it must get some important security fixes based on "Oracle Critical Patch Update Advisory - April 2020" recently published.
As usual, I am mostly interested in replication, InnoDB, optimizer and few more categories. Here is the list:
Bug #99082 - "Problem with replication: XA transaction, temp tables and row based binlog". As noted by Libor Laichmann, creating temporary tables inside XA transaction leads to broken replication in MySQL 5.7.29, with somewhat misleading error message. This is unfortunate. We do not see any evidence, yet, if 5.6 and 8.0 branches are similarly affected.
Bug #99094 - "coredump when install information schema plugin". Bug reporter, Lou Shuai, tried to create simple plugin for the information_schema and it crashed MySQL server 8.0.19 upon installation attempt. Bot the plugin source code and patch for the bug were contributed.
Bug #99100 - "GROUP BY will throw table is full when temptable memory allocation exceed limit". More fun with TempTable storage engine that I blamed previously. This bug was reported by Dean Zhou, who had performed detailed analysis in gdb and suggested a fix. It took some efforts for the bug reporter to get it "Verified" as a regression bug in MySQL 8.0.19 (without a "regression" tag, surely).
Bug #99101 - "SELECT .. LOCK IN SHARE MODE requires LOCK TABLES grant". Simple regression in MySQL 8.0.11+ (or incompatible change in behavior, if you prefer) vs MySQL 5.7 was found and reported by Matthew Boehm.
Bug #99136 - "TempTable wastes 1MB for each connection in thread cache". Nikolai Ikhalainen demonstrated this additional memory usage in MySQL 8.0.16 comparing to 5.7 quite clearly.
Bug #99174 - "Prepare transaction won't rollback if server crash after online ddl prepare stage". This bug was reported by Zhang Xiaojian. Additional debugging code (one DBUG_EXECUTE_IF statement) was added to the source to demonstrate the problem easily and it caused some questionable arguments of a kind:
"First of all, changing our server code in order to cause a bug, can not be considered a repeatable test case."
But the bug reporter provided more details and clear test case, and had not agreed with the above. The bug was soon "Verified", even though no attempts to check (or explain) if 5.7 may be similarly affected were made in public, so we still have to wonder if this is a regression.
Bug #99180 - "Accessing freed memory in perfschema when aggregating status vars". Let me just quote Manuel Ung:
"When aggregate_thread_status is called for other threads, it's possible for that thread to have exited and freed the THD between the time we check that the thread was valid, until the time we call get_thd_status_var."
Ironically, he had to add some conditional my_sleep() call to server code to get a repeatable test case, and this was NOT a problem for a good Oracle engineer to verify the bug immediately. There are still bugs in my beloved Performance Schema. Who could imagine that?
Bug #99200 - "CREATE USER get stuck on SHOW PROCESSLIST and ps.threads on slave". So, slave may disclose some sensitive information. As demonstrated by Marcelo Altmann, if a query had been rewritten by the parser because it contains sensitive information, it won't be cleaned up when slave's SQL thread applies it, making it visible in SHOW PROCESSLIST and performance_schema.threads. Both 5.7.29 and 8.0.19 are affected. Bug reporter had contributed fixes for both versions. See also another bug in Performance Schema that he reported and contributed a fix for, Bug #99204 - "performance_schema threads table PROCESSLIST_INFO incorrect behaviour".
Bug #99205 - "xa prepare write binlog while execute failed". Then XA PREPARE from the binary log is executed on slave and... we are in troubles. The bug was reported by Phoenix Zhang. Unfortunately it is still not clear from this verified bug report if MySQL 5.6 and 5.7 are similarly affected (probably they are).
Bug #99206 - "lock_wait_timeout is taking twice time while adding new partition in table". Nice finding by Lalit Choudhary. MySQL 8.0.19 is not affected by this bug.
Bug #99244 - "bad performance in hash join because join with no join condition between tables". This optimizer bug (wrong join order when new hash join optimization is used) was reported by Chunyang Xu. But note also a simple test case and perf outputs contributed later by Shane Bester.
Bug #99257 - "Inconsistent output and wrong ORDER BY Sorting for query with LIMIT". Yet another 5.7-only optimizer bug found by Lalit Choudhary. My quick test presented in a comment shows that MySQL 5.6.27 produced correct results, so this is a regression bug.
Bug #99273 - "Item_ref in Having Clause points to invalid outer table field". This may lead to wrong results for simple enough queries. This regression bug that affects both MySQL 5.7.29 and 8.0.19 was reported by Shanshan Ying.
Bug #99286 - "Concurrent update cause crash in row_search_mvcc". This great bug report with code analysis, fix suggested and a test case with detailed instructions was created by Zkong Kong. It was promptly verified, but I still miss any documented attempt to check (by running the test case or by code analysis if the bug applies only to 5.7.29 or MySQL 8.0.x is also potentially affected. For now I'll try to remember this assertion line:
InnoDB: Assertion failure in thread 47758491551488 in file rem0rec.cc line 586
and "crash with row_search_mvcc in backtrace". Who know when I hit something similar and what fork/version it will be...
Rainy birthday at the seaside in Cap-d'Ail. Almost 15 years of my 50 were spent checking new MySQL bug reports almost every day.
To summarize:
We still see many regression bugs in recent versions of MySQL 5.7.x and 8.0.x, often without "regression" tag.
There are cases when the bug is verified, but there are no clearly documented checks if all GA versions are affected.
XA transactions are (and had always been) a real disaster for modern MySQL versions in replication environments - all kinds of replication breakage and inconsistencies are possible.
Check "MySQL Bug Reporter Hall of Fame" if you want to know who from MySQL Community had contributed a lot of bug reports over last 10 years.
My weekdays are still busy even during this lockdown period, but weekend is a perfect time for yet another blog post about MySQL bugs I considered interesting. Very few followers read my posts on other topics anyway, especially if they have the "M....DB" word...
So, here is a new list of optimizer, replication, InnoDB and few other bugs in GA versions of MySQL that I've subscribed to since the end of February, 2020:
Bug #98675 - "LIMIT clause in derived table garbles EXPLAIN Note". It's not a big deal and in specific case presented by Øystein Grøvlen optimizer works properly (note text aside), but some clarifications to the note and/or documentation would still help. Comments are useful to read and shows a great example of proper cooperation from Oracle side.
Bug #98719 - "Parallel CREATE TABLE statement binlogged in a wrong order". This bug report from Song Libing shows that Oracle engineers readily accept and verify bug reports where some code modification (adding debug sync point, for example) is needed to demonstrate the problem with a repeatable, deterministic test case. I've seen other bug reports where code modification was considered a problem and bug was not verified immediately as a result. Note that this bug seem to be not repeatable on MySQL 8.0.19 and this was explicitly tested, with test results shared in public. Good job, Umesh Shastry!
Bug #98734 - "Same digest for different queries if we use integer value in ORDER BY caluse". Different column numbers in ORDER BY often cause totally different execution plans, so assuming these queries are the same ("ORDER BY ?") for the purpose of digesting is misleading. Moreover, as noted by Lalit Choudhary from Percona, with column names instead of numbers such queries are not considered the same, so Performance Schema has to be more consistent.
Bug #98739 - "TempTable storage engine slow if mmap is required". Take care when large temporary tables are used for your queries in MySQL 8.0.x. By default (without temptable_use_mmap = OFF) when the table is larger than temptable_max_ram you may notice a very slow query execution. Nice finding by Marcelo Altmann. This is not the only related performance regression I've seen reported recently. Looks like TempTable storage engine problems is a real main topic of this post!
Bug #98750 - "SHOW FIELDS command regression on MySQL 8.0". This performance regression bug was reported by Juan Arruti. See also similar Bug #92195 - "checking permissions 90 time" that was wrongly(!) declared "Not a bug" back in 2018 just because of the way the problem was demonstrated. This time Przemyslaw Malkowski helped to make the point based on Performance Schema instrumentation, so there was no other option but to accept this as a real performance regression bug. Take care if you use SHOW statements with MySQL 8!
Bug #98782 - "Using TempTable engine for GROUP BY is slower than using MEMORY engine". In this bug report Øystein Grøvlen demonstrated that MEMORY engine is about 10% faster for temporary tables in MySQL 8. Make sure to use internal_tmp_mem_storage_engine=MEMORY if you care about performance.
Bug #98869 - "Temp ibt tablespace truncation at disconnection stuck InnoDB under large BP". Bug reporter, Fungo Wang, used different methods to show the performance impact of the regression probably introduced by WL#11613. From pt-pmp to perf and other OS level tools. Make sure to check all comments that point out to other bugs and problems.
Bug #98974 - "InnoDB temp table could hurt InnoDB perf badly". Yet another bug report by Fungo Wang. This time it took a lot of time and efforts from the bug reporter and many MySQL Community members (including me) to get this bug properly processed, even though it started with a detailed source code analysis, properly described test case, stack traces analysis and perf profiling details shared. The ideal bug report got far from ideal treatment, unfortunately.
Bug #98976 - "Case of referenced column of foreign key not corrected on import". Tim Düsterhus found that in versions with the fix for Bug #88718 - "Foreign key is always in lower case" the case is not always correct in the output of subsequent mysqldump.
Bug #98980 - "A state inside mysql_real_connect_nonblocking still blocks". This bug report by Jay Edgar was verified surprisingly fast. This is why it ended up in my list.
Bug #98990 - "avg_count_reset for monitor set owner is always NULL in I_S.INNODB_METRICS ". In this case Fungo Wang had not only found a bug, but also provided a patch that was accepted by Oracle after signing the OCA.
Bug #99006 - "GTID Synchronisation speed when doing START SLAVE". Simon Mudd noted that in some cases with GTIDs START SLAVE may take a lot of time:
and what's worse nothing is logged to the error log in the process to show the reason, progress or anything useful. Probably it's expected that with many binary logs finding proper GTID in them takes time, but some feedback would be useful. The bug does not have any clear public test case and is still under analysis.
Bug #99010 - "The mtr case --send command not use ps-protocol when run with --ps-protocol ". Does not sound like a big deal for anyone but developers who write MTR test cases, but in this report Ze Yang had provided source code analysis and quite detailed how to repeat steps and still got useless requests for additional feedback and nothing else till today. This is, again, unfortunate.
Bug #99039 - "PSI thread info not getting updated after statement execution". Who could imagine that regression bugs may be introduced even into Performance Schema? But Pranay Motupalli found one introduced into 8.0.18, 5.7.28+ and 5.6.47+ by this commit! I hope to see it fixed soon.
Bug #99051 - "XA commit may do engine commit before MYSQL_BIN_LOG::ordered_commit". XA transactions is one of my favorite weak areas of MySQL. In this bug report Dennis GAO described a case when XA COMMIT operation may do a engine commit before binlog flush on Ubuntu 18.04. He had contributed the fix based on the assumption that the best way is to ensure the sequence in the plugin_hash, so that the binlog plugin should be always before all the transnational engine plugins. I only hope that one day a long list of XA bugs will be fixed in one of forks if not in MySQL itself.
This is my favorite point of view this year.
To summarize:
I see some really good examples of bugs verification by Oracle engineers recently. All supported versions are tested, nobody tries to argue against the bug reporter approach used to demonstrate the problem, even if it includes source code modifications. Looks really promising.
Even ideal bug reports sometimes are not processed properly without extra efforts from MySQL Community, unfortunately.
Percona engineers still contribute a lot to MySQL with both new bug reports and useful comments and clarifications. I hope Oracle appreciates that.
Looks like code review and regression testing in Oracle still may benefit from some improvements, as we still see new regression bugs...
With conferences cancelled or postponed and people forced to stay at home due to COVID-19 wide spreading, what can be better than to give my readers a new list of MySQL bugs to check? Useful reading should help! So today I continue my previous post with a quick review of bugs I've subscribed to in February, 2020, while things were still going as usual for most of us...
Here is the list of InnoDB, replication, optimizer and some other bugs to note while working with recent MySQL 5.7.x and 8.0.19 releases:
Bug #98473 - "group replication will be block after lock table". This problem report by phoenix Zhang was declared not a bug recently. Looks like for group_replication_consistency= BEFORE_AND_AFTER it is expected to get nodes blocked if one of them executed LOCK TABLE ... WRITE and another tried to insert some rows into that table. Check last comment by Nuno Carvalho for more details. Having multiple nodes that change or block data in clusters is always fun. See also Bug #98643 - "group replication will be block primary node shutdown" from the same bug reporter. Analysis is still in progress for it
Bug #98498 - "P_S view status_by_thread does not reflect correct results". This bug was reported by Oli Sennhauser (my former colleague in MySQL, founder of FromDual). Really weird results in the output.
Bug #98501 - "Exchanging partition with tables does not update SDI correctly". Having a data dictionary is cool and useful, but the information there should be consistent in all places/forms where it is stored. Fungo Wang found and reported cases when it is wrong in the individual .ibd files for partitions or imported tablespaces. After some arguing about backups tools supported, the bug was verified.
Bug #98511 - "OPTIMIZE TABLE on myisam can increase table size (~2x) and reduce performance". This funny bug looks like a regression in MySQL 8.0.x comparing to 5.7. I doubt Oracle is going to fix anything for MyISAM, but maybe the regression still matters for them. As it often happens, this bug reported by Pete Dishman was verified without adding a regression tag.
Bug #98520 - "Suboptimal implementations for some load/store functions for little-endian arch". Alexey Kopytov identified some remaining performance problems in low level functions like int6store() and uint6korr() in MySQL 8.0 for platforms like x86_64 or ARM64. He request to optimize those functions for little-endian architectures by providing specialized implementations, as it was done in MySQL 8.0.x for many other similar and more widely used functions.
Bug #98530 - "crash when inplace encryption resumes makes tablespace unusable". This bug (with nice MTR test case) was reported by Satya Bodapati from Percona. See also his another bug report, Bug #98537 - "inplace encryption resume thread doesn't update DD properly".
Bug #98546 - "Transient indexes statistics are updated in foreground causing performance issue". As noted by Charly Batista, if persistent statistics is not enabled InnoDB checks if 1/16 rows of the table have been changed and if this was the case, it calls the dict_stats_update function in the foreground. Moreover, it does not only degrade performance while recalculating stats in the foreground thread, but also sets RW_X_LATCH for the table in the process to serialize all access. So, use persistent statistics, but do not let it be automatically recalculated (and here I explained why).
Bug #98616 - "XA PREPARE/XA COMMIT/XA ROLLBACK lost if mysql crash just after binlog flush". Yet another problem with XA transactions support in MySQL. Dennis Gao provided a patch and kindly explained that currently:
"When mysql start-up recover, the innodb engine will only recover the transaction in prepared state, which means the undo->state of the recovered transaction must be TRX_UNDO_PREPARED (check trx_rollback_resurrected in trx0roll.cc). So if a "xa prepare" transaction only flush binlog, it will just be rollback during start-up recover and lost."
Bug #98624 - "Can't connect to MySQL after establishing around 65536 connections". Yet another great finding by Fungo Wang. Basically, this is a bug in the MDL subsystem, the scalability of MDL is limited to 65536 users, due to the pins number limitation (LF_PINBOX_MAX_PINS = 65536) of the LF_HASH it employs.
Bug #98639 - "Redundant row-level locking for secondary index". I think this simple case pointed out by Sergei Muraviev is a yet another case of excessive locking where improvement is possible. I do not agree with simple "Not a bug" status.
Bug #98642 - "CONSISTENT SNAPSHOT CAN BE CORRUPTED BY OTHER TRANSACTIONS". This is a really serious bug found by Erwan MAS. It took me some efforts to make sure this bug report was properly processed and verified (my comment in the bug report is hidden, but that does not matter as long as it was really treated seriously). Looks like MySQL 8.0.x may be not affected, but recent 5.7.x and 5.6.x versions are affected for sure. Take care!
Bug #98651 - "Inserting into temporary table from function breaks replication in MIXED mode". This regression bug in MySQL 8 (vs 5.7.29) was reported by
Alexey Gavrilov, who had create a separate GitHub repository for the test case.
Bug #98665 - "replication broken on blackhole node if binlog_rows_query_log_events on master". This bug was reported by Zhenghu Wen. Take care if you use BLACKHOLE tables on slaves. Both MySQL 8.0.19 and 5.7.29 are affected.
Bug #98673 - "Allow hints to reference query block by system name". I like optimizer hints implementation in MySQL. This feature request to make them even more useful and less confusing in case of complex queries. As Kaiwang CHen put it:
"Note that the query blocks are internally identified with a number
(SELECT_LEX::select_number), with which system names are defined. That
system name could be explored to refer to any query block in the query."
I do not see a patch in the bug report, but there was a plan to provide it.
I have few more bugs in my list for February, but let's continue with them next time. We have many boring weeks ahead it seems...
Frida is always searching for something... I do the same with MySQL bugs.
To summarize:
Group replication still has a lot of problems to resolve before becoming a really mature solution. Ask Galera developers :)
There are regressions in MySQL 8.0.19.
Percona and Alibaba engineers still help to make MySQL better.
I still do not see a consistent use of "regression" tag for verified regression bugs. This is unfortunate.
InnoDB locking still needs more attention.
There is too much Docker usage in the industry for my linking...
I may get a chance to speak about proper bugs processing for open source projects later this year, so I have to keep reviewing recent MySQL bugs to be ready for that. In my previous post in this series I listed some interesting MySQL bug reports created in December, 2019. Time to move on to January, 2020! Belated Happy New Year of cool MySQL Bugs!
As usual I mostly care about InnoDB, replication and optimizer bugs and explicitly mention bug reporter by name and give link to his other active reports (if any). I also pick up examples of proper (or improper) reporter and Oracle engineers attitudes. Here is the list:
Bug #98103 - "unexpected behavior while logging an aborted query in the slow query log". Query that was killed while waiting for the table metadata lock is not only get logged, but also lock wait time is saved as query execution time. I'd like to highlight how bug reporter, Pranay Motupalli, used gdb to study what really happens in the code in this case. Perfect bug report!
Bug #98113 - "Crash possible when load & unload a connection handler". The (quite obvious) bug was verified based on code review, but only after some efforts were spent by Oracle engineer on denial to accept the problem and its importance. This bug was reported by Fangxin Flou.
Bug #98132 - "Analyze table leads to empty statistics during online rebuild DDL ". Nice addition to mycollections! This bug with a nice and clear test case was reported by Albert Hu, who also suggested a fix.
Bug #98139 - "Committing a XA transaction causes a wrong sequence of events in binlog". This bug reported by Dehao Wang was verified as a "documentation" one, but I doubt documenting current behavior properly is an acceptable fix. Bug reporter suggested to commit in the binary log first, for example. Current implementation that allows users to commit/rollback a XA transaction by using another connection if the former connection is closed or killed, is risky. A lot of arguing happened in comments in the process, and my comment asking for a clear quote from the manual:
Would you be so kind to share some text from this page you mentioned:
or any other fine MySQL 8 manual page stating that XA COMMIT is NOT supported when executed from session/connection/thread other than those prepared the XA transaction? I am doing something wrong probably, but I can not find such text anywhere.
was hidden. Let's see what happens to this bug report next.
Bug #98211 - "Auto increment value didn't reset correctly.". Not sure what this bug reported by Zhao Jianwei has to do with "Data Types", IMHO it's more about DDL or data dictionary. Again, some sarcastic comments from Community users were needed to put work on this bug back on track...
Bug #98220 - "with log_slow_extra=on Errno: info not getting updated correctly for error". This bug was reported by lalit Choudhary from Percona.
Bug #98227 - "innodb_stats_method='nulls_ignored' and persistent stats get wrong cardinalities". I think category is wrong for this bug. It's a but in InnoDB's persistent statistics implementation, one of many. The bug was reported by Agustín G from Percona.
Bug #98231 - "show index from a partition table gets a wrong cardinality value". Yet another by report by Albert Hu. that ended up as a "documentation" bug for now, even though older MySQL versions provided better cardinality estimations than MySQL 8.0 in this case (so this is a regression of a kind). I hope the bug will be re-classified and properly processed later.
Bug #98238 - "I_S.KEY_COLUMN_USAGE is very slow". I am surprised to see such a bug in MySQL 8. According to the bug reporter, Manuel Mausz, this is also a kind of regression comparing to older MySQL version, where these queries used to run faster. Surely, no "regression" tag in this case was added.
Bug #98284 - "Low sysbench score in the case of a large number of connections". This notable performance regression of MySQL 8 vs 5.7 was reported by zanye zjy. perf profiling pointed out towards ppoll() where a lot of time is spent. There is a fix suggested by Fangxin Flou (to use poll() instead), but the bug is still "Open".
Bug #98287 - "Explanation of hash joins is inconsistent across EXPLAIN formats". This bug was reported by Saverio M and ended up marked as a duplicate of Bug #97299 fixed in upcoming 8.0.20. Use EXPLAIN FORMAT=TREE in the meantime to see proper information about hash joins usage in the plan.
Bug #98288 - "xa commit crash lead mysql replication error". This bug report from Phoenix Zhang (who also suggested a patch) was declared a duplicate of Bug #76233 - "XA prepare is logged ahead of engine prepare" (that I've already discussed among other XA transactions bugs here).
Bug #98324 - "Deadlocks more frequent since version 5.7.26". Nice regression bug report by Przemyslaw Malkowski from Percona, with additional test provided later by Stephen Wei . Interestingly enough, test results shared by Umesh Shastry show that MySQL 8.0.19 is affected in the same way as 5.7.26+, but 8.0.19 is NOT listed as one of versions affected. This is a mistake to fix, along with missing regression tag.
Bug #98427 - "InnoDB FullText AUX Tables are broken in 8.0". Yet another regression in MySQL 8 was found by Satya Bodapati. Change in default collation for utf8mb4 character set caused this it seems. InnoDB FULLTEXT search was far from perfect anyway...
The are clouds in the sky of MySQL bugs processing.
To summarize:
Still too much time and efforts are sometimes spent on arguing with bug reporter instead of accepting and processing bugs properly. This is unfortunate.
Sometimes bugs are wrongly classified when verified (documentation vs code bug, wrong category, wrong severity, not all affected versions are listed, ignoring regression etc). This is also unfortunate.
Percona engineers still help to make MySQL better.
There are some fixes in upcoming MySQL 8.0.20 that I am waiting for :)
XA transactions in MySQL are badly broken (they are not atomic in storage engine + binary log) and hardly safe to use in reality.
No matter what I write and present about dynamic tracing, blog posts about MySQL bugs are more popular based on statistics. So, to make more readers happy, I'd like to continue my review of interesting bugs reported in November with this post on bugs reported during December, 2019.
As usual, I'll try to concentrate on bug reports related to InnoDB, replication and optimizer, but some other categories also got my attention:
Bug #97911 - "[FATAL] Semaphore wait has lasted > 600 seconds. We intentionally crash the serv...". This bug got marked as a duplicate of other, older long semaphore wait bug (in "Can't repeat" status!) without much analysis. I think all Oracle engineers who added comments to that bug missed one interesting point:
even though bug reporter highlighted it in a comment. Reported wait time is a problem and is surely a bug, no matter how to reproduce the long wait itself and what is its root cause!
Bug #97913 - "Undo logs growing during partitioning ALTER queries". This bug (affecting only MySQL 5.7.x) was reported by Przemyslaw Malkowski from Percona, who also presented useful examples of monitoring queries to the information_schema.innodb_metrics and performance_schema. Check also comments that may explain why 8.0 is not affected in a similar way.
Bug #97935 - "Memory leak in client connection using information_schema". It took some efforts (starting from but not limited to Valgrind Massif profiling of heap memory usage) and time for Daniel Nichter to prove the point and get this bug "Verified". It is also not clear if MySQL 8 is also affected.
Bug #97950 - "buf_read_page_handle_error can trigger assert failure". Bug reporter, Shu Lin, tried his best to make the point. It's clear enough how to repeat this, and one could use one of documented test synchonisation methods if gdb is too much for bug verification. I do not think this bug was handled properly or got the level of attention it truly deserved.
Bug #97966 - "XA COMMIT in another session will not write binlog event". This bug was reported by Lujun Wang and immediately verified, but again with no documented check if MySQL 8 is affected. This happens too often, unfortunately.
Bug #97971 - "Roles not handling column level privileges correctly; Can SELECT, but not UPDATE". Clear and simple bug report with a workaround from Travis Bement. It was immediately verified.
Bug #98014 - "Lossy implicit conversion in conditional breaks ONLY_FULL_GROUP_BY". Yet another case of (IMHO) improper bug processing. The argument presented (from the manual):
"MySQL 5.7.5 and later also permits a nonaggregate column not
named in a GROUP BY clause when
ONLY_FULL_GROUP_BY SQL mode is
enabled, provided that this column is limited to a single value"
does not apply, as "single value" for = 0 is NOT selected, we have multiple Host values matching it due to conversion. This is how proper version (guess what it is) works:
mysql> SELECT User, Host, COUNT(*) FROM mysql.user WHERE Host = 0 GROUP BY 1; ERROR 1055 (42000): 'mysql.user.Host' isn't in GROUP BY mysql> select @@sql_mode; +--------------------+ | @@sql_mode | +--------------------+ | ONLY_FULL_GROUP_BY | +--------------------+ 1 row in set (0.001 sec)
I think this bug reported by Joshua Varner must be verified.
Bug #98046 - "Inconsistent behavior while logging a killed query in the slow query log". Bug reporter, Pranay Motupalli, provided a clear test case and a detailed analysis, including the gdb debugging session that proves the point. Nice bug report.
Bug #98055 - "MySQL Optimizer Bug not picking right index". Both the bug reporter (Sudheer Gadipathi) and engineer who verified the bug stated that MySQL 8.0.x is similarly affected (UNIQUE key is preferred for the partitioned table, even though there is a better non-unique index). But 8.0.x is NOT listed in the "Version:" filed. Weird.
Bug #98068 - "SELECT FOR UPDATE not-exist table in PERFORMANCE SCHEMA reports confusing error". This is a funny (but still a regression) bug report by William ZHANG. Proper versions work like this:
mysql> select database();+--------------------+ | database() | +--------------------+ | performance_schema | +--------------------+ 1 row in set (0.001 sec)
mysql> select * from not_exist_table; ERROR 1146 (42S02): Table 'performance_schema.not_exist_table' doesn't exist mysql> select * from not_exist_table for update; ERROR 1146 (42S02): Table 'performance_schema.not_exist_table' doesn't exist
Bug #98072 - "innochecksum summary shows blob pages as other type of page for 8.0 tables". The bug was reported by SERGEY KUZMICHEV. This time "regression" tag is missing, even though it's clearly stated that MySQL 5.7 worked differently. This is from the proper version:
... File::..\data\test\blob_test.ibd ================PAGE TYPE SUMMARY============== #PAGE_COUNT PAGE_TYPE =============================================== 1 Index page 0 Undo log page 1 Inode page 0 Insert buffer free list page 508 Freshly allocated page 1 Insert buffer bitmap 0 System page 0 Transaction system page 1 File Space Header 0 Extent descriptor page 64 BLOB page 0 Compressed BLOB page 0 Page compressed page 0 Page compressed encrypted page 0 Other type of page ...
Bug #98083 - "Restarting the computer when deleting the database will cause directory residues". One would expect that MySQL 8 with a data dictionary should have some means to figure out the remaining database directory for a dropped database upon startup (as it stores information about databases elsewhere) and do proper cleanup. I think this bug reported by Jinming Liao must be verified and fixed. There is no "... manual creation or deletion of tables or databases..." involved in this case.
Bug #98091 - "InnoDB does not initialize raw disk partitions". As simple as that and both 5.7.29 and 8.0.219 are surely affected. It was not always the case, I've used raw devices myself with older MySQL versions, so this bug reported by Saverio M is a regression. Still, "regression" tag is missing.
That's all bugs reported in December, 2019 that I cared to subscribe to and mention here. Next time I'll check bugs reported in January, 2020. There are at least 16 in my list already, so stay tuned.
Follow the links in this post to get more details about profiling and creating off-CPU FlameGraphs for MySQL. This post is devoted to bugs, though :)
To summarize:
I am happy to see bug reports from people whom I never noticed before. MySQL Community is alive.
Some flexibility in following common sense based bugs verification procedures is still visible. Bugs reported for 5.7 are not checked on 8.0 (or the results of this check are not documented in public), nobody cares to read what bug reporter says carefully or go extra mile, "regression" tag not added, and so on.
Probably at this stage my writings are mostly ignored by Oracle's decision makers. But I keep watching them all anyway.