Saturday, February 16, 2019

Fun with Bugs #79 - On MySQL Bug Reports I am Subscribed to, Part XV

More than 3 weeks passed since my previous review of public MySQL bug reports I am subscribed to, so it's time to present some of the bugs I've considered interesting in January, 2019.

As usual, I'll review them starting from the oldest and try to summarize my feelings about these bugs at the end of this post. Here they are:
  • Bug #93806 - "Document error about ON DUPLICATE KEY UPDATE". Years pass, but fine MySQL manual still does not explain some cases of InnoDB locking properly. Xiaobin Lin found yet another case that it does not explain properly. Or, maybe, the manual is correct and the problem in the implementation? MariaDB 10.3.7 shows the same behavior.
  • Bug #93827 - "dict_index_has_desc() is not efficient". Yet another bug report from Zhai Weixiang. I see 50 still active bug reports from him! Maybe Oracle should send some nice T-shirts to top N most productive bug reporters?
  • Bug #93845 - "Optimizer choose wrong index, sorting index instead of filtering index". yet another bug report of a known class, this time from Daniele Renda. It's good example of optimizer trace usage to make a point. Note also that using ANALYZE ... UPDATE HISTOGRAMS does not help. As a side note, implementation of optimizer trace for MariaDB is finally in progress and should be done for upcoming 10.4. See MDEV-6111 for the details if you care.
  • Bug #93875 - "mysqldump per-table dump is slow since 5.7 on instances with many tables". This performance regression bug (that was "verified" without adding the regression tag) was reported by Nikolai Ikhalainen from Percona. This bug report is a nice example of using Docker to create easily repeatable test cases for bug reports.
  • Bug #93878 - "innodb_status_output fails to restore to old value". This great bug report from Yuhui Wang  not only describes 3 cases when InnoDB status is printed to the error log automatically, but also shows that in one of these cases, when we can not found free block in the buffer pool in 20 loops, this printing is not stopped after the problem is resolved, and provides a patch that resolves the problem. See also his nice Bug #94065 - "MySQL fails to startup when setting persist variable" with detailed analysis of the problem.
  • Bug #93917 - "Wrong binlog entry for BLOB on a blackhole intermediary master". Nice corner case was found by Sveta Smirnova from Percona. With her 52 "Verified" bug reports at the moment she also deserves a T-shirt from Oracle as one of top bug reporters!
  • Bug #93922 - "UNION ALL very slow with SUM(0)". This weird bug was found and reported by Sergio Paternoster. He had to spend notable efforts to see this bug "Verified"...
  • Bug #93948 - "XID inconsistency on master-slave with CTAS". Krunal Bauskar from Percona noted this inconsistency in XID generation on slave vs master. Let's wait and check if it ends up as "Not a bug".
  • Bug #93957 - "slave_compressed_protocol doesn't work with semi-sync replication in MySQL-5.7". This bug report from Pavel Katiushyn also looks like a regression, as similar bug was fixed in older 5.7.x release. But I do not see any public comment with verification attempt neither in recent 5.7, nor in recent 8.0 (where older bug also had to be fixed). So, the bug is "verified", but the real impact and versions affected are not clear.
  • Bug #93963 - "Slow query log doesn't log a slow CREATE INDEX with admin statements enabled". This clear and properly tagged regression vs MySQL 5.7 was reported by Jeremy Smyth.
  • Bug #93986 - "Transactions in serializable mode are not actually serializable". I've subscribed to this bug report mostly for (expected) fun of reading further comments. It's still "Need feedback", but single comment so far is worth reading.
  • Bug #94121 - "Enable hardware CRC32 under Valgrind". Laurynas Biveinis from Percona also provided a patch for this 8 years old problem.
  • Bug #94130 - "XA COMMIT may lead replication broken". Yet another proof that XA transactions implementation is broken in MySQL. This time from Phoenix Zhang and in semi-sync replication case.
This photo reminds me current state of MySQL bugs processing in Oracle - it seems there is no clear and straightforward way to follow. Everything is fuzzy these days...

There are few more bugs reported in January, 2019 that I am watching, but their status is not yet clearly defined, so I decided to skip them in this review.

To summarize:
  1.  Oracle engineers who process bugs still do not add regression tag to many regression bugs. This is a shame, really. If I were their boss I'd make this a policy and one of important KPI values to monitor.
  2. In some cases bugs get verified immediately without any demonstrated attempt to show how the check was performed, while in other cases poor bug reporters have to fight hard to re-make their point and get a real check done. It seems these days good old approaches to bugs verification are not followed strictly by some Oracle engineers.

Saturday, February 9, 2019

On my Favorite FOSDEM 2019 MySQL, MariaDB and Friends Devroom Talks

This year I had not only spoken about MySQL bugs reporting at FOSDEM, but spent almost the entire day listening at MySQL, MariaDB and Friends Devroom. I missed only one talk, on ProxySQL, (to get some water, drink a bottle of famous Belgian beer and chat with my former colleague in MySQL support team, Geert, whom I had not seen for a decade). So, for the first time out of my 4 FOSDEM visits I've got a first hand impression about the entire set of talks in the devroom that I want to share today, while I still remember my feelings.

Most of the talks have both slides and videos already uploaded on site, so you can check them and make your own conclusions, but my top 5 favorite talks (that have both videos and slides already available to community) were the following:

  • "Un-split brain (aka Move Back in Time) MySQL", by Shlomi Noach. You can find slides at SlideShare.

    This was a replacement talk that was really interesting and had proper style for FOSDEM. It was mostly a nice background story of creation of the gh-mysql-rewind tool, a shell script that uses MariaDB's mysqlbinlog --flashback option and MySQL GTIDs and allows to "rewind" row-based binary log to roll back transactions to some previous point in time. The tool should become available to community soon, maybe as a part of orchestrator. I was impressed how one can successfully use 49 slides for 20 minutes talk. That's far beyond my current presentation skills...
  • "Test complex database systems in a laptop with dbdeployer", by Giuseppe Maxia. You can find slides at SlideShare.

    I've already built and used dbdeployer, as described in my blog post, so I was really interested in the talk. Giuseppe was able not only to show 45 slides over 20 minutes and explain all the reasons behind re-implementing MySQL-Sandbox in Go, but also run a live demo where dozens of sandbox instances were created and used. Very impressive!
  • "MySQL and the CAP theorem: relevance & misconceptions", second great talk and show by Shlomi Noach. You can find slides at SlideShare.

    The "CAP theorem" says is a concept that a distributed database system (like any kind of MySQL replication setup) can only have 2 of the 3 features: (atomic) Consistency, (high) Availability and Partition Tolerance. This can be proved mathematically, but Shlomi had not only defined terms and conditions to present the formal proof, but also explained that they are far from real production objectives of any engineer or DBA (like 99.95% of Availability). He had shown typical MySQL setups (from simple async master-slave replication to Galera, group replication and even Vitess) and proved that formally they all are neither consistent nor available from that formal CAP theorem point of view, while, as we all know, they are practically useful and work (and with some efforts, proxies on top etc can be made both highly available and highly consistent for practical purposes). So, CAP theorem is neither representing real production systems, nor meeting their real requirements. We've also got some kind of explanation of why async master-master or circular replication are still popular... All that in 48 slides, with links, and presented in 20 minutes! Greatest short MySQL-related talk I've ever attended.
  • "TiDB: Distributed, horizontally scalable, MySQL compatible", by Morgan Tocker. You can find slides at SlideShare.

    It was probably the first time when I listened to Morgan, even though we worked together for a long time. I liked his way of explaining the architecture of this yet another database system speaking MySQL protocol and reasons to create it. If you are interested in performance of this system, check this blog post.
  • "MySQL 8.0 Document Store: How to Mix NoSQL & SQL in MySQL 8.0", by Frédéric Descamps. You can find slides (70!) at SlideShare.

    LeFred managed to get me somewhat interested in MySQL Shell and new JSON functions in MySQL, way more than ever before. It's even more surprising that hist talk was the last one and we already spent 8+ hours listening before he started. Simple step by step explanation of how one may get the best of both SQL, ACID and NoSQL (JSON, "MongoDB") worlds, if needed, in a single database management syste, was impressive. Also this talk probably caused the longest discussion and the largest number of questions from those remaining attendees.

    He was also one of two "hosts" and "managers" of the devroom, so I am really thankful him for hist efforts year after year to make MySQL devroom at FOSDEM great!
There were more good talks, but I had to pick up few that already have slides shared and those of a kind that I personally prefer to listen to at FOSDEM. This year I also missed few people whom I like to see and talk to at FOSDEM, namely Mark Callaghan and Jean-François Gagné.

The only photo I made with my Nokia dumb phone this year in Brussels, on my way to FOSDEM on February 2. We've got snow and rain that morning, nice for anyone who had to walk 5 kilometers to the ULB campus.
Overall, based on my experience this year, it still makes a lot of sense to visit FOSDEM for anyone interested in MySQL. You can hardly find so many good, different MySQL-related talks per just one single day on any other conference.