Saturday, March 9, 2019

Fun with Bugs #81 - On MySQL Bug Reports I am Subscribed to, Part XVII

Two weeks passed since my previous review of public MySQL bug reports I consider interesting enough to subscribe to them. Over this period I picked up a dozen or so new public bug reports that I'd like to briefly review today.

Here is my recent subscriptions list, starting from the oldest bug reports:
  • Bug #94431 - "Can't upgrade from 5.7 to 8.0 if any database have a hyphen in their name". It seems one actually needs a database like that created in MySQL 5.6 with at least one InnoDB table having FULLTEXT index to hit the problem. Great finding by Phil Murray. Note that after several unsuccessful attempts by others the bug was eventually reproduced and verified by Jesper Wisborg Krogh. Let's hope we'll see it fixed in MySQL 8.0.16.
  • Bug #94435 - "mysql command hangs up and cosume CPU almost 100%". It was reported by Masaaki HIROSE, whose previous related/similar Bug #94219 - "libmysqlclient enters and infinite loop and consume CPU usage 100%" ended up as "Not a bug" (wrongly, IMHO, as nobody cared enough to reproduce the steps instead of commenting on their correctness and checking something else). Bug reporter had not only insisted and provided all the details, but also tried to analyze the reasons of the bug and provided links to other potentially related bug reports (Bug #88428 - "mysql_real_query hangs with EINTR errno (using YASSL)" and Bug #92394 - "libmysqlclient enters infinite loop after signal (race condition)"). Great job and nice to see the bug "Verified" eventually.
  • Bug #94441 - "empty ibuf aio reads in innodb status". This regression vs MySQL 5.6 was noted by Nikolai Ikhalainen from Percona. MariaDB 10.3.7 is also affected, unfortunately:
    ...
    I/O thread 9 state: native aio handle (write thread)
    Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
     ibuf aio reads:, log i/o's:, sync i/o's:Pending flushes (fsync) log: 0; buffer pool: 0
    1344 OS file reads, 133 OS file writes, 2 OS fsyncs
    ...
  • Bug #94448 - "Rewrite LOG_BLOCK_FIRST_REC_GROUP during recovery may be dangerous.". Yet another MySQL 8 regression (not marked with "regression" tag) was found by Kang Wang.
  • Bug #94476 - "mysql semisync replication stuck with master in Waiting to finalize termination". It has "Need feedback" status at the moment. I've subscribed to this report from Shirish Keshava Murthy mostly to find out how a report that may look like a free support request will be processed by Oracle engineers. Pure curiosity, for now.
  • Bug #94504 - "AIO::s_log seems useless". This problem was reported by Yuhui Wang. It's a regression in a sense that part of the code is no longer needed (and seems not to be used) in MySQL 8, but still remains.
  • Bug #94541 - "Assertion on import via Transportable Tablespace". This bug reported by  Daniël van Eeden was verified based on code review and some internal discussion. We do not know if any other version besides 5.7.25 is affected, though. The assertion itself:
    InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == btr_pcur_get_block(cursor)->page.id.page_no()
    does not seem to be unique. We can find it in MDEV-18455 also (in other context).
  • Bug #94543 - "MySQL does not compile with protobuf 3.7.0". I care about build/compiling bugs historically, as I mostly use MySQL binaries that I built myself from GitHub source. So, I've immediately subscribed to this bug report from Laurynas Biveinis.
  • Bug #94548 - "Optimizer error evaluating JSON_Extract". This bug was reported by Dave Pullin. From my quick test it seems MariaDB 10.3.7 is also affected. Error message is different in the failing case, but the point is the same - the function is not evaluated if the column from derived table that is built using the function is not referenced in the SELECT list. This optimization is questionable and may lead to hidden "bombs" in the application code.
  • Bug #94550 - "generated columns referring to current_timestamp fail". I tried to check simple test case in this bug report by Mario Beck on MariaDB 10.3.7, but it does not seem to accept NOT NULL constraint for generated stored columns at all:
    MariaDB [test]> CREATE TABLE `t2` (
        ->   `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
        ->   `content` varchar(42) DEFAULT NULL,
        ->   `bucket` tinyint(4) GENERATED ALWAYS AS ((floor((to_seconds(`created_at
    `) / 10)) % 3)) STORED NOT NULL);
    ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
    corresponds to your MariaDB server version for the right syntax to use near 'NOT
     NULL)' at line 4
    I do not see this option in formal syntax described here as well. But in case of MariaDB we can actually make sure the generated column is never NULL by adding CHECK constraint like this:
    MariaDB [test]> CREATE TABLE `t2` (    ->   `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
        ->   `content` varchar(42) DEFAULT NULL,
        ->   `bucket` tinyint(4) GENERATED ALWAYS AS ((floor((to_seconds(`created_at`) / 10)) % 3)) STORED);
    Query OK, 0 rows affected (0.434 sec)

    MariaDB [test]> INSERT INTO t2 (content) VALUES ("taraaaa");
    Query OK, 1 row affected (0.070 sec)

    MariaDB [test]> alter table t2 add constraint cnn CHECK (`bucket` is NOT NULL);
    Query OK, 1 row affected (1.159 sec)
    Records: 1  Duplicates: 0  Warnings: 0

    MariaDB [test]> INSERT INTO t2 (content) VALUES ("tarabbb");
    Query OK, 1 row affected (0.029 sec)

    MariaDB [test]> INSERT INTO t2 (content) VALUES ("");
    Query OK, 1 row affected (0.043 sec)

    MariaDB [test]> select * from t2;
    +---------------------+---------+--------+
    | created_at          | content | bucket |
    +---------------------+---------+--------+
    | 2019-03-09 17:28:03 | taraaaa |      0 |
    | 2019-03-09 17:29:43 | tarabbb |      1 |
    | 2019-03-09 17:29:50 |         |      2 |
    +---------------------+---------+--------+
    3 rows in set (0.002 sec)

    MariaDB [test]> show create table t2\G*************************** 1. row ***************************
           Table: t2
    Create Table: CREATE TABLE `t2` (
      `created_at` timestamp NOT NULL DEFAULT current_timestamp(),
      `content` varchar(42) DEFAULT NULL,
      `bucket` tinyint(4) GENERATED ALWAYS AS (floor(to_seconds(`created_at`) / 10)
    MOD 3) STORED,
      CONSTRAINT `cnn` CHECK (`bucket` is not null)

    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    1 row in set (0.011 sec)
    So, maybe after all we can state that MariaDB is NOT affected.
  • Bug #94552 - "innodb.virtual_basic fails when valgrind is enabled". I still wonder if anyone in Oracle runs MTR test suite on Valgrind-enabled (-DWITH_VALGRIND=1 cmake option) at least in the process of official release (and if they check the failures). It seems not to be the case based on this bug report from Manuel Ung.
  • Bug #94553 - "Crash in trx_undo_rec_copy". Bernardo Perez noted that as a side effect of still "Verified" Bug #82734 - "trx_undo_rec_copy needlessly relies on buffer pool page alignment" (that affects both MySQL 5.7 and 8.0) we may get crashes while working with generated columns. I hope to see them both fixed soon, but for now Bug #94553 has status "Need Feedback", probably in a hope to get a repeatable test case. I'll watch it carefully.
  • Bug #94560 - "record comparison in spatial index non-leaf rtree node seems incorrect". I doubt spatial indexes of InnoDB are widely used, and I have no doubts there are many bugs waiting to be discovered in this area. This specific bug was reported by Jie Zhou who had also suggested a fix.
  • Bug #94610 - "Server stalls because ALTER TABLE on partitioned table holds dict mutex". My former colleague Justin Swanhart reported this bug just yesterday, so no wonder it is not verified yet. It refers to a well known verified old Bug #83435 - "ALTER TABLE is very slow when using PARTITIONED table"  (that I've also subscribed to immediately) from Roel Van de Paar, affecting both MySQL 5.6 and 5.7. I hope to see this bug verified and fixed soon, as recently I see this kind of state for main thread:
    Main thread process no. 3185, id 140434206619392, state: enforcing dict cache limit
    too often in INNODB STATUS outputs to my liking...
As you could note, I still try to check (at least in some cases) if MariaDB is also affected by the same problem. I think it's a useful check both for me (as I work mostly with MariaDB as a support engineer) and for the reader (to know if switching to MariaDB may help in any way or if there are any chances for MariaDB engineers to contribute anything useful, like a fix).

"Hove, actually". For years residents of Hove used this humorous reply when they live in Brighton... "Regression, actually" is what I want to say (seriously) about every other MySQL bug report I subscribe to... So, you see Hove and many regression bugs above!
To summarize:
  1. Sometimes Oracle engineers demonstrate proper collective effort to understand and carefully verify public bug reports. Good to know they are not ready to give up fast!
  2. I have to copy-paste this item from my previous post. As the list above proves, Oracle engineers still do not use "regression" tag when setting "Verified" status for obviously regression bugs. I think bug reporters should care then to always set it when they report regression of any kind.
  3. It seems there no regular MTR test runs for Valgrind builds performed by Oracle engineers, or maybe they just ignore failures.

Saturday, February 23, 2019

Fun with Bugs #80 - On MySQL Bug Reports I am Subscribed to, Part XVI

Today I'd like to continue my review of public MySQL bug reports with a list of some bugs I've subscribed to over last 3 weeks. It's already long enough and includes nice cases to check and share. Note that I usually subscribe to a bug either because it directly affects me or customers I work with, or I consider it technically interesting (so I mostly care about InnoDB, replication, partitioning and optimizer bugs), or it's a "metabug" - a problem in the way public bug report is handled by Oracle engineers. These are my interests related to MySQL bugs.

As usual, I start with the oldest bugs and try to mention bug reporters by name with links to their other reports whenever this may give something useful to a reader. I try to check if MariaDB is also affected in some cases. Check also my summary comments at the end of this blog post.
  • Bug #94148 - "Unnecessary Shared lock on parent table During UPDATE on a child table". In this bug report Uday Varagani reasonably pointed out that formally there is no need to lock parent row when column NOT included in the foreign key gets updated. This happens though when this column is included into the index used to support foreign key constraint. IMHO it's a reasonable feature request and both Trey Raymond and Sveta Smirnova tried their best to  highlight this, but this report now has a "Need Feedback" status with a request to explain new algorithm suggested. It's simple - "Stop it", check that column changed is NOT the one foreign key is defined on, even if it's in the same index...I see no reason NOT to verify this as a reasonable feature request. Is it a new policy that every feature request should come with details on how to implement it? I truly doubt.
  • Bug #94224 - "[5.6] Optimizer reconsiders index based on index definition order, not value". Domas Mituzas found yet another case (see also Bug #36817 - "Non optimal index choice, depending on index creation order" from Jocelyn Fournier, the bug I verified more than 10 years ago) when in MySQL order of index definition matters more for optimizer than anything else.  My quick check shows that MariaDB 10.3.7 is not affected:
    MariaDB [test]> explain select distinct b from t1 where c not in (0) and d > 0;+------+-------------+-------+-------+---------------+--------------------+---------+------+------+-------------+| id   | select_type | table | type  | possible_keys | key            | key_len
    | ref  | rows | Extra                    |
    +------+-------------+-------+-------+---------------+--------------------+--------+------+------+-------------+
    |    1 | SIMPLE      | t1    | index | NULL          | non_covering_index | 9    | NULL |    1 | Using where |
    +------+-------------+-------+-------+---------------+--------------------+---------+------+------+-------------+
    1 row in set (0.002 sec)

    MariaDB [test]> alter table t1 add index covering_index (b, c, d);
    Query OK, 0 rows affected (0.149 sec)
    Records: 0  Duplicates: 0  Warnings: 0

    MariaDB [test]> explain select distinct b from t1 where c not in (0) and d > 0;
    +------+-------------+-------+-------+---------------+----------------+---------+------+------+--------------------------+
    | id   | select_type | table | type  | possible_keys | key            | key_len
    | ref  | rows | Extra                    |
    +------+-------------+-------+-------+---------------+----------------+---------+------+------+--------------------------+
    |    1 | SIMPLE      | t1    | index | NULL          | covering_index | 14
    | NULL |    1 | Using where; Using index |
    +------+-------------+-------+-------+---------------+----------------+---------+------+------+--------------------------+
    1 row in set (0.025 sec)
    Fortunately MySQL 8 is no longer affected. Unfortunately we do not see a public comment showing the results of testing on MySQL 5.7 (or any version, for that matter), from engineer who verified the bug. I already pointed out that this "metabug" becomes popular in my previous blog post.
  • Bug #94243 - "WL#9508 introduced non-idiomatic potentially-broken C macros". Laurynas Biveinis from Percona found new code that in ideal world wound not pass any serious code review.
  • Bug #94251 - "Aggregate function result is dependent by window is defined directly or as named". This bug was reported by Владислав Сокол. From what I see:
    MariaDB [test]> WITH RECURSIVE cte AS (
        -> SELECT 1 num
        -> UNION ALL
        -> SELECT num+1 FROM cte WHERE num < 5
        -> )
        -> SELECT num, COUNT(*) OVER (frame) cnt_named, COUNT(*) OVER (ORDER BY num
    DESC) cnt_direct
        -> FROM cte
        -> WINDOW frame AS (ORDER BY num DESC);
    +------+-----------+------------+
    | num  | cnt_named | cnt_direct |
    +------+-----------+------------+
    |    1 |         5 |          5 |
    |    2 |         4 |          4 |
    |    3 |         3 |          3 |
    |    4 |         2 |          2 |
    |    5 |         1 |          1 |
    +------+-----------+------------+
    5 rows in set (0.117 sec)

    MariaDB [test]> WITH RECURSIVE cte AS (
        -> SELECT 1 num
        -> UNION ALL
        -> SELECT num+1 FROM cte WHERE num < 5
        -> )
        -> SELECT num, COUNT(*) OVER (frame) cnt_named, COUNT(*) OVER (ORDER BY num
    DESC) cnt_direct
        -> FROM cte
        -> WINDOW frame AS (ORDER BY num DESC)
        -> ORDER BY num desc;
    +------+-----------+------------+
    | num  | cnt_named | cnt_direct |
    +------+-----------+------------+
    |    5 |         1 |          1 |
    |    4 |         2 |          2 |
    |    3 |         3 |          3 |
    |    2 |         4 |          4 |
    |    1 |         5 |          5 |
    +------+-----------+------------+
    5 rows in set (0.003 sec)
    MariaDB 10.3.7 is NOT affected.
  • Bug #94283 - "MySQL 8.0.15 is slower than MySQL 5.7.25". Percona's CTO Vadim Tkachenko reported that MySQL 8.0.15 is notably slower than 5.7.25 on a simple oltp_read_write sysbench test. He had recently written a separate blog post about this, with more details.There is one detail to clarify based on today's comment from Peter Zaitsev (was the same default character set used), but as my dear friend Sinisa Milivojevic verified the bug without any questions, requests or his own test outputs shared, we can assume that Oracle officially accepted this performance regression (even though "regression" tag was not set).

    Check also later Bug #94387 - "MySQL 8.0.15 is slower than MySQL 5.7.25 in read only workloads", yet another performance regression report from Vadim, where he found that on read only (sysbench oltp_point_select) all in memory workloads MySQL 8.0.15 may also be slower than MySQL 5.7.25.
  • Bug #94302 - "reset master could not break dump thread in some cases". This bug was reported by Ashe Sun. This is definitely a corner case, as it happens only master is still writing to the very first binary log. We can not find out from public comments in the bug report if any other versions besides 5.7.x are affected. This is yet another "metabug" - during my days in Oracle's MySQL bugs verification team we had to check on all versions still supported and present the results explicitly.
  • Bug #94319 - "Format_description_log_event::write can cause segfaults". Nice bug report by Manuel Ung from Facebook.
  • Bug #94330 - "Test for possible compressed failures before upgrade?". Change of zlib version starting from MySQL 5.7.24 means that some operations for InnoDB tables with ROW_FORMAT=COMPRESSED that previously worked may start to fail. In this report Monty Solomon asks for some way to determine if there will be a problem with existing compressed tables before upgrading to 5.7.24. The bug is still "Open".
  • Bug #94338 - "Dirty read-like behavior in READ COMMITTED transaction". Bug reporter, Masaki Oguro, stated that MySQL 8 is not affected (only 5.6 and 5.7) and the bug is verified on these versions, so we should assume it's really the case. But I miss public comment showing the result of testing on recent MySQL 8.0.15.
  • Bug #94340 - "backwards incompatible changes in 8.0: Error number: 3747". Simon Mudd complains about incompatible change in 8.0.13 that does not allow slave to easily switch from SBR to RBR without restart (and was not clearly documented as a change in behavior). Make sure to read all comments.
  • Bug #94370 - "Performance regression of btr_cur_prefetch_siblings". Nice bug report with a patch from Zhai Weixiang.
  • Bug #94383 - "simple ALTER cause unnecessary InnoDB index rebuilds, 5.7.23 or later 5.7 rlses". In this bug report Mikhail Izioumtchenko presented the detailed analysis and suggested diagnostics patches to show what really happens and why. This bug is also a regression of a kind, so while testing results are presented, I still think that it could be processed better according to the good old rules I have in mind.
  • Bug #94394 - "Absence of mysql.user leads to auto-apply of --skip-grant-tables". Great finding by Ceri Williams from Percona. Sveta Smirnova provided a separate MTR test case and clarified the impact of the bug. Surely this is also a regression comparing to MySQL 5.7, as there you can not start MySQL if mysql.user table is missing. I leave it to a reader to decide if there is any security-related impact of this bug...
  • Bug #94396 - "Error message too broad: The used command is not allowed with this MySQL version". This bug was reported by my former colleague in Percona Support, famous Bill Karwin. Informative error messages matter for good user experience.
We rely on MySQL in a same way as that guys on top of dolphins pyramid on this strange monument in some court somewhere at the Lanes. Reliable foundation matters, so regressions should better be avoided.
To summarize:
  1. Looks like it's time for Oracle to spend some efforts to make MySQL 8 great again, by fixing some of the bugs mentioned above, especially performance regressions vs MySQL 5.7 found recently by Vadim Tkachenko from Percona.
  2. Oracle continues to introduce backward-incompatible changes in behavior in minor MySQL 8.0.x releases at GA stage. This is not really good for any production environment.
  3. Asking bug reporters to provide "the basics of such a new algorithm" when they complain that current one is wrong or not optimal is a new word in bugs processing!
  4. When I joined MySQL bugs verification team in 2005 we've set up a culture of bugs processing that included, among other things, presenting in a public comment any successful or unsuccessful attempt to verify the bug, by copy-pasting all commands and statements used along with the outputs, whenever possible and with enough context to show what was really checked. I've studied this approach from Oracle's Tom Kyte over the previous 10 years when I followed him closely. I used to think it's standard for more than a decade already, a kind of my (and not only my) "heritage". It's sad to see this approach is no longer followed by many Oracle engineers who process bugs, in too many cases.
  5. Oracle engineers still do not use "regression" tag when setting "Verified" status for obviously regression bugs. I think bug reporters should care then to always set it when they report regression of any kind.

Saturday, February 16, 2019

Fun with Bugs #79 - On MySQL Bug Reports I am Subscribed to, Part XV

More than 3 weeks passed since my previous review of public MySQL bug reports I am subscribed to, so it's time to present some of the bugs I've considered interesting in January, 2019.

As usual, I'll review them starting from the oldest and try to summarize my feelings about these bugs at the end of this post. Here they are:
  • Bug #93806 - "Document error about ON DUPLICATE KEY UPDATE". Years pass, but fine MySQL manual still does not explain some cases of InnoDB locking properly. Xiaobin Lin found yet another case that it does not explain properly. Or, maybe, the manual is correct and the problem in the implementation? MariaDB 10.3.7 shows the same behavior.
  • Bug #93827 - "dict_index_has_desc() is not efficient". Yet another bug report from Zhai Weixiang. I see 50 still active bug reports from him! Maybe Oracle should send some nice T-shirts to top N most productive bug reporters?
  • Bug #93845 - "Optimizer choose wrong index, sorting index instead of filtering index". yet another bug report of a known class, this time from Daniele Renda. It's good example of optimizer trace usage to make a point. Note also that using ANALYZE ... UPDATE HISTOGRAMS does not help. As a side note, implementation of optimizer trace for MariaDB is finally in progress and should be done for upcoming 10.4. See MDEV-6111 for the details if you care.
  • Bug #93875 - "mysqldump per-table dump is slow since 5.7 on instances with many tables". This performance regression bug (that was "verified" without adding the regression tag) was reported by Nikolai Ikhalainen from Percona. This bug report is a nice example of using Docker to create easily repeatable test cases for bug reports.
  • Bug #93878 - "innodb_status_output fails to restore to old value". This great bug report from Yuhui Wang  not only describes 3 cases when InnoDB status is printed to the error log automatically, but also shows that in one of these cases, when we can not found free block in the buffer pool in 20 loops, this printing is not stopped after the problem is resolved, and provides a patch that resolves the problem. See also his nice Bug #94065 - "MySQL fails to startup when setting persist variable" with detailed analysis of the problem.
  • Bug #93917 - "Wrong binlog entry for BLOB on a blackhole intermediary master". Nice corner case was found by Sveta Smirnova from Percona. With her 52 "Verified" bug reports at the moment she also deserves a T-shirt from Oracle as one of top bug reporters!
  • Bug #93922 - "UNION ALL very slow with SUM(0)". This weird bug was found and reported by Sergio Paternoster. He had to spend notable efforts to see this bug "Verified"...
  • Bug #93948 - "XID inconsistency on master-slave with CTAS". Krunal Bauskar from Percona noted this inconsistency in XID generation on slave vs master. Let's wait and check if it ends up as "Not a bug".
  • Bug #93957 - "slave_compressed_protocol doesn't work with semi-sync replication in MySQL-5.7". This bug report from Pavel Katiushyn also looks like a regression, as similar bug was fixed in older 5.7.x release. But I do not see any public comment with verification attempt neither in recent 5.7, nor in recent 8.0 (where older bug also had to be fixed). So, the bug is "verified", but the real impact and versions affected are not clear.
  • Bug #93963 - "Slow query log doesn't log a slow CREATE INDEX with admin statements enabled". This clear and properly tagged regression vs MySQL 5.7 was reported by Jeremy Smyth.
  • Bug #93986 - "Transactions in serializable mode are not actually serializable". I've subscribed to this bug report mostly for (expected) fun of reading further comments. It's still "Need feedback", but single comment so far is worth reading.
  • Bug #94121 - "Enable hardware CRC32 under Valgrind". Laurynas Biveinis from Percona also provided a patch for this 8 years old problem.
  • Bug #94130 - "XA COMMIT may lead replication broken". Yet another proof that XA transactions implementation is broken in MySQL. This time from Phoenix Zhang and in semi-sync replication case.
This photo reminds me current state of MySQL bugs processing in Oracle - it seems there is no clear and straightforward way to follow. Everything is fuzzy these days...

There are few more bugs reported in January, 2019 that I am watching, but their status is not yet clearly defined, so I decided to skip them in this review.

To summarize:
  1.  Oracle engineers who process bugs still do not add regression tag to many regression bugs. This is a shame, really. If I were their boss I'd make this a policy and one of important KPI values to monitor.
  2. In some cases bugs get verified immediately without any demonstrated attempt to show how the check was performed, while in other cases poor bug reporters have to fight hard to re-make their point and get a real check done. It seems these days good old approaches to bugs verification are not followed strictly by some Oracle engineers.

Saturday, February 9, 2019

On my Favorite FOSDEM 2019 MySQL, MariaDB and Friends Devroom Talks

This year I had not only spoken about MySQL bugs reporting at FOSDEM, but spent almost the entire day listening at MySQL, MariaDB and Friends Devroom. I missed only one talk, on ProxySQL, (to get some water, drink a bottle of famous Belgian beer and chat with my former colleague in MySQL support team, Geert, whom I had not seen for a decade). So, for the first time out of my 4 FOSDEM visits I've got a first hand impression about the entire set of talks in the devroom that I want to share today, while I still remember my feelings.

Most of the talks have both slides and videos already uploaded on site, so you can check them and make your own conclusions, but my top 5 favorite talks (that have both videos and slides already available to community) were the following:

  • "Un-split brain (aka Move Back in Time) MySQL", by Shlomi Noach. You can find slides at SlideShare.

    This was a replacement talk that was really interesting and had proper style for FOSDEM. It was mostly a nice background story of creation of the gh-mysql-rewind tool, a shell script that uses MariaDB's mysqlbinlog --flashback option and MySQL GTIDs and allows to "rewind" row-based binary log to roll back transactions to some previous point in time. The tool should become available to community soon, maybe as a part of orchestrator. I was impressed how one can successfully use 49 slides for 20 minutes talk. That's far beyond my current presentation skills...
  • "Test complex database systems in a laptop with dbdeployer", by Giuseppe Maxia. You can find slides at SlideShare.

    I've already built and used dbdeployer, as described in my blog post, so I was really interested in the talk. Giuseppe was able not only to show 45 slides over 20 minutes and explain all the reasons behind re-implementing MySQL-Sandbox in Go, but also run a live demo where dozens of sandbox instances were created and used. Very impressive!
  • "MySQL and the CAP theorem: relevance & misconceptions", second great talk and show by Shlomi Noach. You can find slides at SlideShare.

    The "CAP theorem" says is a concept that a distributed database system (like any kind of MySQL replication setup) can only have 2 of the 3 features: (atomic) Consistency, (high) Availability and Partition Tolerance. This can be proved mathematically, but Shlomi had not only defined terms and conditions to present the formal proof, but also explained that they are far from real production objectives of any engineer or DBA (like 99.95% of Availability). He had shown typical MySQL setups (from simple async master-slave replication to Galera, group replication and even Vitess) and proved that formally they all are neither consistent nor available from that formal CAP theorem point of view, while, as we all know, they are practically useful and work (and with some efforts, proxies on top etc can be made both highly available and highly consistent for practical purposes). So, CAP theorem is neither representing real production systems, nor meeting their real requirements. We've also got some kind of explanation of why async master-master or circular replication are still popular... All that in 48 slides, with links, and presented in 20 minutes! Greatest short MySQL-related talk I've ever attended.
  • "TiDB: Distributed, horizontally scalable, MySQL compatible", by Morgan Tocker. You can find slides at SlideShare.

    It was probably the first time when I listened to Morgan, even though we worked together for a long time. I liked his way of explaining the architecture of this yet another database system speaking MySQL protocol and reasons to create it. If you are interested in performance of this system, check this blog post.
  • "MySQL 8.0 Document Store: How to Mix NoSQL & SQL in MySQL 8.0", by Frédéric Descamps. You can find slides (70!) at SlideShare.

    LeFred managed to get me somewhat interested in MySQL Shell and new JSON functions in MySQL, way more than ever before. It's even more surprising that hist talk was the last one and we already spent 8+ hours listening before he started. Simple step by step explanation of how one may get the best of both SQL, ACID and NoSQL (JSON, "MongoDB") worlds, if needed, in a single database management syste, was impressive. Also this talk probably caused the longest discussion and the largest number of questions from those remaining attendees.

    He was also one of two "hosts" and "managers" of the devroom, so I am really thankful him for hist efforts year after year to make MySQL devroom at FOSDEM great!
There were more good talks, but I had to pick up few that already have slides shared and those of a kind that I personally prefer to listen to at FOSDEM. This year I also missed few people whom I like to see and talk to at FOSDEM, namely Mark Callaghan and Jean-François Gagné.

The only photo I made with my Nokia dumb phone this year in Brussels, on my way to FOSDEM on February 2. We've got snow and rain that morning, nice for anyone who had to walk 5 kilometers to the ULB campus.
Overall, based on my experience this year, it still makes a lot of sense to visit FOSDEM for anyone interested in MySQL. You can hardly find so many good, different MySQL-related talks per just one single day on any other conference.

Saturday, January 26, 2019

Fun with Bugs #78 - On Some Public Bugs Fixed in MySQL 5.7.25

Today I'd like to continue my tradition of ignoring MySQL 8 (after all, I can not even build 8.0.14 any more on my Ubuntu 14.04, it's not supported suddenly because of old gcc version) and, of all MySQL server versions released by Oracle this week, concentrate on bugs reported in public bugs database and fixed in the latest minor release of MySQL 5.7 branch, 5.7.25.

This time there is only one InnoDB community-reported bug fixed, Buig #87423 - "os0file.cc assertion failed 'offset > 0' in os_file_io_complete", from Vasily Nemkov. See also it's duplicate, Bug #88956, by Aidan Diffey. Assertion failure seems to happen with 32-bit binaries running on 64-bit OS (both bugs are reported on Linux) and is related to infamous ulint type usage in the code, but for some reason release notes and closing comment mention Windows:
"An assertion was raised when attempting to write to a tablespace file greater than 4GB in size on a 64-bit Windows system. The failure was due to a narrowing cast."
Usually there are some bugs around flowers. But sometimes we can enjoy flowers without bugs...
There is a long enough list of replication bugs fixed:
  • Bug #92132 - "secure-file-priv breaks LOAD DATA INFILE replication in statement mode on 5.7.23". This regression bug (that had not got "regression" tag!) was reported by Nicolai Plum.
  • Bug #91941 - "Deadlock during purge_logs_before_date". Great bug report from Nikolai Ikhalainen. Note also that some hints by Jean-François Gagné and proper gdb backtrace analysis by Oracle engineers, Shane Bester and Dmitry Lenev, were needed to force proper processing of this bug. See also related Bug #92108 - "Deadlock by concurrent show binlogs, pfs session_variables table & binlog purge", from Shashank Sahni.
  • Bug #91548 - "LOCK_grant and LOCK_open can deadlock on a gtid slave". Great example of Shane Bester's regular work on bugs.
  • Bug #90640 - "`head->variables.gtid_next.type != UNDEFINED_GTID' ". One of those cases when assertion failure in debug build highlights real problem in the code. Great finding by Roel Van de Paar.
  • Bug #87832 - "Relay_Log_Space is inaccurate and leaks". Nice bug report by Manuel Ung. Related variable could be changed concurrently without any locking and get randomly wrong values.
  • Bug #84752 - "Multi-Slave Replication Fail: bogus data in log event". This bug was reported by Gonzalo Miguel Arruti . Eventually it seems internally reported Bug#22252394 - SLAVE I/O THREAD MAY STOP WHEN BINLOG ROTATES highlighted the real problem. Good to have this case fixed!
  • Bug #83003 - "Using temporary tables on slaves increases GTID sequence number". I am so happy to this bug reported by my former colleague Ovais Tariq fixed in 5.7! Patches provided by Laurynas Biveinis from Percona are finally backported. It's so sad to see slaves out of sync with master when GTIDs are used... For some reason this bug is not listed as replication-related and patches contributed are not mentioned in the release notes.
Some other fixes are also interesting to check:
  • Bug #92131  - "ASan: Direct leak of 272 byte(s) in main.mysqlpump_partial_bkp MTR test case". I mentioned this bug reported by Yura Sorokin from Percona in the past. Good to see it fixed. I truly hope Oracle is testing ASan builds regularly with MTR and we'll not see such reports from community any more. For now efforts of Percona engineers help, as we can see also from Bug #90238 - "Comparison of uninitailized memory in log_in_use" with patches from Zsolt Parragi and Laurynas Biveinis.
  • Bug #92049 - "bogus data when ordering results from variables_by_thread". Thanks to Shane Bester from Oracle (who still reports MySQL bugs in public), we have one bug less in otherwise near perfect Performance Schema.
  • Bug #90742 = "SIGHUP cause mysql server crash.". It was reported by Seunguck Lee. Good to see S1 crashing bug not blindly classified as "security" and hidden forever. Note also great analysis there provided by Jean-François Gagné! It would be impossible for any hidden bug to get this kind of useful feedback.
  • Bug #89214 - "The SELECT will deadlock in the stored procedure, if the result set is empty.". It took some time and efforts for bug reporter (two pg) to prove the point, confirm that the problem happens only with prepared statements and force proper processing. This regression bug (MySQL 5.6 was not affected) still does not have "regression" tag.
That's all bugs I wanted to mention today. To summarize:
  1. I'd surely consider immediate upgrade to 5.7.25 in any environment where replication is used.
  2. It seems Oracle engineers who process bugs still have no habit of adding "regression" tag when obviously needed. I'll ask bug reporters to do this themselves during my FOSDEM talk about MySQL bugs.
  3. Percona still helps Oracle to make MySQL better with proper QA efforts and patches.
  4. Most of the bugs mentioned here affected MySQL 8 as well, so from this post you know about many important fixes that happened in MySQL 8.0.14 :) 

Sunday, January 20, 2019

Fun with Bugs #77 - On MySQL Bug Reports I am Subscribed to, Part XIV

Slides for my talk about MySQL bugs at FOSDEM 2019 MySQL, MariaDB and Friends Devroon are ready, support customers decided not to break anything badly on weekend, so I have some free time for blogging.  As usual, when I do not have any better idea or useful recent real life experience to share I write about MySQL bugs.

Today I'd like to continue my review of interesting MySQL bug reports added by Community members in December, 2018. I'll review them starting from the oldest:
  • Bug #93701 - "Assertion `maybe_null' failed |Item_func_concat::val_str(String*)". It's debug assertion, not a big deal (S6 bug), so why should anyone care about this report from Ramesh Sivaraman? Because last time same assertion failure ended up as serious enough Bug #83115 fixed in 8.0.11.
  • Bug #93708 - "Page Cleaner will sleep for long time if clock changes". In this bug report from Marcelo Altmann I was mostly impressed by arguing started by Oracle engineer on bug severity (documentation request, feature request vs real bug). IMHO it was a waste of time.
  • Bug #93728 - "mysqld crash after using alter table with SPATIAL key". Nice finding by Rui Xu. Unfortunately MariaDB 10.3.7 is also affected. At least in the error log I also see this error message for each SPATIAL index of the table:
    2019-01-20 19:26:15 9 [ERROR] InnoDB: Record in index `idx2` of table `test`.`tab` was not found on update: TUPLE (info_bits=0, 2 fields): {[32]      $@      $@
          $@      $@(0x0000000000000400000000000000040000000000000004000000000000000
    400),[4]    (0x00000001)} at: COMPACT RECORD(info_bits=0, 1 fields): {[8]infimum
     (0x090E06090D050D00)}
  • Bug #93734 - "MySQL 8.0 is 36 times slower than MySQL 5.7". I like everything about this bug reported by Vadim Tkachenko, CTO of Percona. I like synopsis that sounds cool, missing exact MySQL 8.0.x version (but I suspect 8.0.13), impact of innodb_flush_log_at_trx_commit, suggested fix and the fact that the bug is still "Open" and nobody from Oracle paid any attention to the report from CTO of one of key Oracle partners and supporters of MySQL 8. Wonderful!
  • Bug #93737 - "mysqlbinlog mermory used grows unstoped". I had not tried to reproduce it (and nobody had it seems, as the bug is still "Open"), but this report by Chandler Bong sounded interesting. Let's see what it may end up with...
  • Bug #93746 - "MySQL Crashes with deadlock at Mutex DICT_SYS created dict0dict.cc:1172". Long semaphore waits involving DICT_SYS mutex and leading to watchdog thread killing MySQL are common. Bug processing is still in progress, but I subscribed to this bug by Anton Ravich mostly to follow another bug he refers to as a possibly related, Bug #80919. That bug is declared a duplicate of some bug in Oracle's internal bugs database, without bug number mentioned. This is not the first time I see such am artistic way to process community bugs (that I consider totally wrong and unacceptable).
  • Bug #93760 - "InnoDB got assert failure while figuring out space id at startup". Fungo Wang found this crash while running MTR test case innodb.innodb_redo_debug_1 with debug binaries. Now I wonder if Oracle QA runs MTR tests with debug binaries at all and do they care to check failures (or rely on Percona for this activity)? Any comments?
  • Bug #93761 - "optimize table cause the slave MTS deadlock!". Sounds interesting and serious, but nobody cared to check this bug report. It's "Open" without comments for 3 weeks already.
  • Bug #93767 - "INSTALL COMPONENT fails - handled segfault during installation nothing in log". I am surprised that somebody outside of Oracle already uses new MySQL 8.0 server components feature. My former colleague, Justin Swanhart, tried but failed so far. Let's see how this report may end up. For now Oracle engineer stated that Justin tried to do something this framework was not designed for, see nice last comment there.
  • Bug #93777 - "the Originator of mysql.event is not correct". Pay attention to this bug found by Dennis Gao if you use events in replication environment.
  • Bug #93779 - "dml in table with trigger to other Table map in binlog lead to slave sql stopped". Work on this bug report from Mohamed Atef is still in progress. Bug reported cared to add a lot of details in recent comments, so I hope to see this bug processed soon.
I've also included a couple of bugs reported in January, 2019 in the list above, but I hope you don't mind.
Today I'd like to remember those sunny days in Venice in September and those pesky MySQL bugs reported this winter...
To summarize:
  1. Some bug reports stay "Open" for weeks without any good reason recently.
  2. The shameful practice of referring to bugs in Oracle's internal bugs database without giving a number (that we see in release notes when the bug is fixed and can look for in commit messages at GitHub) should be stopped! Next time I'll blame engineer(s) who this by name, here and in social media. Check how this is properly done by Shane Bester.
  3. I wonder if Oracle QA runs MTR tests cases on debug builds on a regula basis and checks the results, or this is now a job for Percona mostly?
  4. I should write a separate blog post about bugs related to SPATIAL indexes implementation in InnoDB. There should be dragons...
P.S. Don't forget to click on names of bug reporters to see the entire list of still active MySQL bug reports each of them created.

Tuesday, January 15, 2019

Using dbdeployer With MariaDB Server

Some time ago I've noted that one of the tools I use for testing various MySQL and MariaDB cases and to reproduce potential bugs, MySQL-Sandbox, is not updated any more. It turned out that active development switched to its port in Go called dbdeployer. You can find detailed information about dbdeployer and reasons behind developing it provided by its author, Giuseppe Maxia, here and there. See also this post at Percona blog for some quick review of its main features. One of the points of dbdeployer (and reasons to use Go) is that it is built once (per platform supported) somewhere and then binaries are downloaded from GitHub and used everywhere, without any problems with dependencies etc.

I've added checking dbdeployer to my long ToDo list, as I planned to use it (if not MySQL Sandbox) for some tests and posts related to resolving typical practical problems with MariaDB GTID-based replication. Yesterday I've allocated some time to finally try it and, as usual, I've started with building it from source (as for me databases-related software that I can not build from source on my test systems is not any attractive as something new to study and use). I was immediately surprised by the lack of instructions on how to do this at GitHub, no Makefile of any kind etc. All I was able to find is build.sh script. Correction: just check README.md on how to build it properly, as Giuseppe Maxia explained in the comment.

Good, regular structure is important for deployment
Fortunately this is not the first project written in Go that I try to build (or change somehow and then build). The first one was this replication manager (that has proper build instructions in docs). So, I though I knew what to do. I've installed missing golang package on my netbook with Ubuntu 14.04 that I had at hand and tried the following typical steps:
openxs@ao756:~/go$ export GOPATH=$HOME/go
openxs@ao756:~/go$ echo $GOPATH
/home/openxs/go
openxs@ao756:~/go$ go get github.com/datacharmer/dbdeployer
# github.com/datacharmer/dbdeployer/common
src/github.com/datacharmer/dbdeployer/common/strutils.go:170: undefined: sort.Slice
...
That was a bit surprising, but quick Google search shown that this could be caused by outdated (pre-1.8) version of golang package. So, dbdeployer requires golang 1.8 or newer and there was no such package for my good old Ubuntu (it has some 1.2.x only). One day I'll upgrade it, but so far I am OK with 14.04 for all other testing purposes, so I had to give up on the idea to build from source temporary. 

Today during few free minutes I've retried on my good old desktop box with Fedora 27 (where I surely built some Go project(s) successfully):
[openxs@fc23 go]$ uname -a
Linux fc23 4.18.19-100.fc27.x86_64 #1 SMP Wed Nov 14 22:04:34 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[openxs@fc23 ~]$ ls go
pkg  src
[openxs@fc23 ~]$ echo $GOPATH

[openxs@fc23 ~]$ export GOPATH=$HOME/go
[openxs@fc23 ~]$ cd go
[openxs@fc23 go]$ go versiongo version go1.9.7 linux/amd64
This environment should work for build, so I've proceeded with:
[openxs@fc23 go]$ go get github.com/datacharmer/dbdeployer
[openxs@fc23 go]$ ls src/github.com/datacharmer/   jmoiron/       nsf/           tanji/
go-sql-driver/ mattn/         ogier/
[openxs@fc23 go]$ ls src/github.com/datacharmer/dbdeployer/
abbreviations/ compare/       docs/          mkreadme/      test/
.build/        concurrent/    .git/          rest/          unpack/
cmd/           cookbook/      .github/       sandbox/       vendor/
common/        defaults/      globals/       scripts/
Now let's try that scripts/build.sh with linux as a parameter, as it's a way to build Linux binaries based on what I found:
[openxs@fc23 go]$ MKDOCS=1 src/github.com/datacharmer/dbdeployer/scripts/build.sh linux
+ env GOOS=linux GOARCH=386 go build --tags docs -o dbdeployer-1.17.0-docs.linux .
+ env GOOS=linux GOARCH=amd64 go build -o sort_versions.linux sort_versions.go
/home/openxs/go/src/github.com/datacharmer/dbdeployer
-rwxrwxr-x. 1 openxs openxs 8.1M Jan 14 10:27 dbdeployer-1.17.0-docs.linux
-rw-rw-r--. 1 openxs openxs 3.0M Jan 14 10:27 dbdeployer-1.17.0-docs.linux.tar.gz
[openxs@fc23 go]$ ls
bin  pkg  src
[openxs@fc23 go]$ ls bin
dbdeployer
[openxs@fc23 go]$ bin/dbdeployer --version
dbdeployer version 1.17.0
Now we know how to build dbdeployer from source, if needed. If some dependencies are missing you'll be informed and similar go get ... command should allow to install it.

I was somewhat surprised to see MariaDB NOT mentioned at all in README.md. It says:
"DBdeployer is a tool that deploys MySQL database servers easily."
while good old MySQL-Sandbox also mentions MariaDB explicitly:
"This package is a sandbox for testing features under any version of MySQL from 3.23 to 8.0 (and any version of MariaDB.)"
So, my idea was to double check that dbdeployer is both MySQL-Sandbox compatible and MariaDB compatible (it is). I have several sandboxes already created in the past. I also have MariaDB 10.2.21 .tar.gz binaries that I want to use with dbdeployer for further testing:
[openxs@fc23 go]$ ls ~/sandboxes/
clear_all        rsandbox_mariadb-10_0_19  send_kill_all  test_replication
plugin.conf      rsandbox_mariadb-10_1_12  start_all      use_all
restart_all      rsandbox_mysql-8_0_12     status_all
rsandbox_8_0_12  sandbox_action            stop_all
[openxs@fc23 go]$ ls ~/*.tar.gz
/home/openxs/galera-25.3.22-x86_64.tar.gz
/home/openxs/galera-25.3.24-x86_64.tar.gz
/home/openxs/galera-25.3.25-glibc_214-x86_64.tar.gz
/home/openxs/mariadb-10.2.12-linux-x86_64.tar.gz
/home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz
With dbdeployer one has to unpack .tar.gz first with dbdeployer unpack command. So, I tried it immediately:
[openxs@fc23 go]$ bin/dbdeployer unpack ~/mariadb-10.2.21-linux-x86_64.tar.gz
directory '/home/openxs/opt/mysql' not found
You should create it or provide an alternate base directory using --sandbox-binary
It seems the tool now wants to use ~/opt/mysql as a directory to unpack to, while MySQL_Sandbox silently used ~:
[openxs@fc23 go]$ ls ~ | grep 8.0
8.0.12
I made a lame try to force it to use ~, but failed for the reason I was too lazy to study:
[openxs@fc23 go]$ bin/dbdeployer --sandbox-binary=/home/openxs unpack /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gzUnpacking tarball /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz to $HOME/10.2.21
.........100.........200....&tar.Header{Name:"mariadb-10.2.21-linux-x86_64/mysql-test/mysql-test-run", Mode:511, Uid:1021, Gid:1004, Size:0, ModTime:time.Time{wall:0x0, ext:63681810892, loc:(*time.Location)(0xa47aa0)}, Typeflag:0x32, Linkname:"./mysql-test-run.pl", Uname:"dbart", Gname:"my", Devmajor:0, Devminor:0, AccessTime:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}, ChangeTime:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}, Xattrs:map[string]string(nil)}
#ERROR: symlink ./mysql-test-run.pl mariadb-10.2.21-linux-x86_64/mysql-test/mysql-test-run: file exists
I just created ~/opt/mysql and proceeded with default configuration. After unpack step completed successfully I've proceeded with deploy step to create new replication sandbox:
[openxs@fc23 go]$ bin/dbdeployer unpack /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz
Unpacking tarball /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz to $HOME/opt/mysql/10.2.21
.........100.........200.........300.........400.........500.........600.........700.........800... ...
Renaming directory /home/openxs/opt/mysql/mariadb-10.2.21-linux-x86_64 to /home/openxs/opt/mysql/10.2.21

[openxs@fc23 go]$ bin/dbdeployer deploy replication 10.2.21
Installing and starting master
. sandbox server started
Installing and starting slave1
. sandbox server started
Installing and starting slave2
. sandbox server started
$HOME/sandboxes/rsandbox_10_2_21/initialize_slaves
initializing slave 1
initializing slave 2
Replication directory installed in $HOME/sandboxes/rsandbox_10_2_21
run 'dbdeployer usage multiple' for basic instructions'
We have access to nice enough documentation:
[openxs@fc23 go]$ bin/dbdeployer usage multiple
 USING MULTIPLE SERVER SANDBOX
On a replication sandbox, you have the same commands (run "dbdeployer usage single"),
with an "_all" suffix, meaning that you propagate the command to all the members.
Then you have "./m" as a shortcut to use the master, "./s1" and "./s2" to access
the slaves (and "s3", "s4" ... if you define more).

In group sandboxes without a master slave relationship (group replication and
multiple sandboxes) the nodes can be accessed by ./n1, ./n2, ./n3, and so on.

start_all    [options] > starts all nodes
status_all             > get the status of all nodes
restart_all  [options] > restarts all nodes
stop_all               > stops all nodes
use_all         "SQL"  > runs a SQL statement in all nodes
use_all_masters "SQL"  > runs a SQL statement in all masters
use_all_slaves "SQL"   > runs a SQL statement in all slaves
clear_all              > stops all nodes and removes all data
m                      > invokes MySQL client in the master
s1, s2, n1, n2         > invokes MySQL client in slave 1, 2, node 1, 2


The scripts "check_slaves" or "check_nodes" give the status of replication in the sandbox.
Typical sandbox directory (with some differences like use_all_slaves etc) is created in ~/sandboxes/ and shortcut commands work as expected:
[openxs@fc23 go]$ cd ~/sandboxes/rsandbox_10_2_21/
[openxs@fc23 rsandbox_10_2_21]$ ls
check_slaves       n2           sbdescription.json  test_sb_all
clear_all          node1        send_kill_all       use_all
initialize_slaves  node2        start_all           use_all_masters
m                  restart_all  status_all          use_all_slaves
master             s1           stop_all
n1                 s2           test_replication

[openxs@fc23 rsandbox_10_2_21]$ ls master/
add_option    init_db         restart             show_binlog    status   use
clear         load_grants     sbdescription.json  show_log       stop
data          my              sb_include          show_relaylog  test_sb
grants.mysql  my.sandbox.cnf  send_kill           start          tmp

[openxs@fc23 rsandbox_10_2_21]$ ls ../rsandbox_mariadb-10_1_12/
check_slaves             m       node1        s2             test_replication
clear_all                master  node2        send_kill_all  use_all
connection.json          n1      README       start_all
default_connection.json  n2      restart_all  status_all
initialize_slaves        n3      s1           stop_all

[openxs@fc23 rsandbox_10_2_21]$ ls ../rsandbox_mariadb-10_1_12/master/
add_option       default_connection.json  my              send_kill      tmp
change_paths     grants_5_7_6.mysql       mycli           show_binlog    use
change_ports     grants.mysql             my.sandbox.cnf  show_relaylog  USING
clear            json_in_db               proxy_start     start
connection.json  load_grants              README          status
data             msb                      restart         stop

[openxs@fc23 rsandbox_10_2_21]$ ./status_all
REPLICATION  /home/openxs/sandboxes/rsandbox_10_2_21
master : master on  -  port     23322 (23322)
node1 : node1 on  -  port       23323 (23323)
node2 : node2 on  -  port       23324 (23324)
[openxs@fc23 rsandbox_10_2_21]$ ./use_all "show variables like 'gtid%'"
# master
Variable_name   Value
gtid_binlog_pos 0-100-12
gtid_binlog_state       0-100-12
gtid_current_pos        0-100-12
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos
gtid_strict_mode        OFF
# server: 1
Variable_name   Value
gtid_binlog_pos
gtid_binlog_state
gtid_current_pos        0-100-12
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-12
gtid_strict_mode        OFF
# server: 2
Variable_name   Value
gtid_binlog_pos
gtid_binlog_state
gtid_current_pos        0-100-12
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-12
gtid_strict_mode        OFF
[openxs@fc23 rsandbox_10_2_21]$ ./m
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 14
Server version: 10.2.21-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

master [localhost:23322] {msandbox} ((none)) > show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 |     2835 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

master [localhost:23322] {msandbox} ((none)) > show variables like 'gtid%';
+------------------------+----------+
| Variable_name          | Value    |
+------------------------+----------+
| gtid_binlog_pos        | 0-100-12 |
| gtid_binlog_state      | 0-100-12 |
| gtid_current_pos       | 0-100-12 |
| gtid_domain_id         | 0        |
| gtid_ignore_duplicates | OFF      |
| gtid_seq_no            | 0        |
| gtid_slave_pos         |          |
| gtid_strict_mode       | OFF      |
+------------------------+----------+
8 rows in set (0.00 sec)

master [localhost:23322] {msandbox} ((none)) > exit
Bye
For my further tests I needed slaves to have log_slave_updates enabled and gtid_strict_mode=ON. So, I've added these settings to my.sandbox.cnf in node1 and node2 subdirectories for both slaves and restarted them:
[openxs@fc23 rsandbox_10_2_21]$ ./restart_all# executing 'stop' on /home/openxs/sandboxes/rsandbox_10_2_21
stop /home/openxs/sandboxes/rsandbox_10_2_21/node1
stop /home/openxs/sandboxes/rsandbox_10_2_21/node2
stop /home/openxs/sandboxes/rsandbox_10_2_21/master
# executing 'start' on /home/openxs/sandboxes/rsandbox_10_2_21
executing 'start' on master
. sandbox server started
executing 'start' on slave 1
. sandbox server started
executing 'start' on slave 2
. sandbox server started
I need a table to play with and I want to check that slaves are in sync:
[openxs@fc23 rsandbox_10_2_21]$ ./m
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 12
Server version: 10.2.21-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

master [localhost:23322] {msandbox} ((none)) > use test
Database changed
master [localhost:23322] {msandbox} (test) > create table t1(id int primary key, c1 int);
Query OK, 0 rows affected (0.17 sec)

master [localhost:23322] {msandbox} (test) > exit
Bye
[openxs@fc23 rsandbox_10_2_21]$ ./use_all "show variables like 'gtid%'"
# master
Variable_name   Value
gtid_binlog_pos 0-100-13
gtid_binlog_state       0-100-13
gtid_current_pos        0-100-13
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos
gtid_strict_mode        OFF
# server: 1
Variable_name   Value
gtid_binlog_pos 0-100-13
gtid_binlog_state       0-100-13
gtid_current_pos        0-100-13
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-13
gtid_strict_mode        ON
# server: 2
Variable_name   Value
gtid_binlog_pos 0-100-13
gtid_binlog_state       0-100-13
gtid_current_pos        0-100-13
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-13
gtid_strict_mode        ON
[openxs@fc23 rsandbox_10_2_21]$
Note the value of gtid_current_pos on master and gtid_slave_pos on each slave. They are the same and slaves are in sync. If you want to find out more about the format of GTIDs in MariaDB or all that gtid% server variables, please, check this KB article.

* * *

To summarize, dbdeployer is a nice port of MySQL-Sandbox into Go, with some additional features. It can be easily built from source if you have golang version 1.8 or newer (or just downloaded if you have not). Sandboxes created with dbdeployer may co-exists with older sandboxes in the same default directory (but .tar.gz files are unpacked into different directory by default). It still works well with MariaDB. I am going to use replication sandboxes built with it for some further testing of various real life use cases and problems of MariaDB's GTIDs implementation (that may be presented in further posts).

Sunday, January 13, 2019

Understanding Status of MariaDB Server JIRA Issues

In my previous blog post on MariaDB's JIRA for MySQL users who are familiar with MySQL bugs database (but may be new to JIRA) I've presented some details about statuses that JIRA issues may have. There is no one to one correspondence with MySQL bug's statuses that I once described in details here. In case of MariaDB Server bugs ("JIRA issues") one may have to check not only "Status" field, but also "Resolution" filed and even "Labels" field to quickly understand what is the real status and what MariaDB engineers decided or are waiting for. So, I think some additional clarifications may help MySQL users who check or report MariaDB bugs as well.

Let me present details of this statuses correspondence in a simple table, where the first column contains MySQL's bug status, while 3 other columns contain the content of corresponding MariaDB Server JIRA issue's fields, "Status", "Resolution" and "Labels". There is also "Comment" column with some explanation on what else is usually done in JIRA issue when it gets this set of values defining its status or what this may mean in MySQL bugs database etc. Most important MySQL bug statuses are taken from this my post (there are more of them, but others are rarely used, especially when real work on bugs was moved into internal bugs database by Oracle, or were removed since that post as it happened to "To be fixed later").

MySQL Bug StatusMariaDB JIRA StatusMariaDB JIRA ResolutionMariaDB JIRA LabelComment
OpenOPENUnresolvedTypical status for just reported bug
ClosedCLOSEDFixedYou should see list of versions that got the fix in the Fix Version/s field
DuplicateCLOSEDDuplicateSo, in MariaDB it's "closed as a duplicate"
AnalyzingOPENUnresolvedUsually bug is assigned when some engineer is working on it, including analysis stage
VerifiedCONFIRMEDUnresolvedCONFIRMED bugs are usually assigned in JIRA while in MySQL "Verified" bugs are usually unassigned
Won't fixCLOSEDWon't FixUsually remains assigned
Can't repeatCLOSEDCannot reproduceUnlike in MySQL, usually means that both engineer and bug reporter are not able to reproduce this
No FeedbackCLOSEDIncompleteneed_feedbackAs in MySQL, bug should stay with "need_feedback" label for some time before it's closed as incomplete
Need FeedbackOPENUnresolvedneed_feedbackUsually in the last comment in the bug you can find out what kind of feedback is required. No automatic setting to "No Feedback" in 30 days
Not a BugCLOSEDNot a Bug 
UnsupportedCLOSEDWon't FixThere is no special "Unsupported" status in MariaDB. Most likely when there is a reason NOT to fix it's stated in the comment.

In the table above you can click on some links to see the list of MariaDB bugs with the status discussed in the table row. This is how I am going to use this post from now on, as a quick search starting point :) It will also be mentioned on one of slides of my upcoming FOSDEM 2019 talk.