Saturday, February 16, 2019

Fun with Bugs #79 - On MySQL Bug Reports I am Subscribed to, Part XV

More than 3 weeks passed since my previous review of public MySQL bug reports I am subscribed to, so it's time to present some of the bugs I've considered interesting in January, 2019.

As usual, I'll review them starting from the oldest and try to summarize my feelings about these bugs at the end of this post. Here they are:
  • Bug #93806 - "Document error about ON DUPLICATE KEY UPDATE". Years pass, but fine MySQL manual still does not explain some cases of InnoDB locking properly. Xiaobin Lin found yet another case that it does not explain properly. Or, maybe, the manual is correct and the problem in the implementation? MariaDB 10.3.7 shows the same behavior.
  • Bug #93827 - "dict_index_has_desc() is not efficient". Yet another bug report from Zhai Weixiang. I see 50 still active bug reports from him! Maybe Oracle should send some nice T-shirts to top N most productive bug reporters?
  • Bug #93845 - "Optimizer choose wrong index, sorting index instead of filtering index". yet another bug report of a known class, this time from Daniele Renda. It's good example of optimizer trace usage to make a point. Note also that using ANALYZE ... UPDATE HISTOGRAMS does not help. As a side note, implementation of optimizer trace for MariaDB is finally in progress and should be done for upcoming 10.4. See MDEV-6111 for the details if you care.
  • Bug #93875 - "mysqldump per-table dump is slow since 5.7 on instances with many tables". This performance regression bug (that was "verified" without adding the regression tag) was reported by Nikolai Ikhalainen from Percona. This bug report is a nice example of using Docker to create easily repeatable test cases for bug reports.
  • Bug #93878 - "innodb_status_output fails to restore to old value". This great bug report from Yuhui Wang  not only describes 3 cases when InnoDB status is printed to the error log automatically, but also shows that in one of these cases, when we can not found free block in the buffer pool in 20 loops, this printing is not stopped after the problem is resolved, and provides a patch that resolves the problem. See also his nice Bug #94065 - "MySQL fails to startup when setting persist variable" with detailed analysis of the problem.
  • Bug #93917 - "Wrong binlog entry for BLOB on a blackhole intermediary master". Nice corner case was found by Sveta Smirnova from Percona. With her 52 "Verified" bug reports at the moment she also deserves a T-shirt from Oracle as one of top bug reporters!
  • Bug #93922 - "UNION ALL very slow with SUM(0)". This weird bug was found and reported by Sergio Paternoster. He had to spend notable efforts to see this bug "Verified"...
  • Bug #93948 - "XID inconsistency on master-slave with CTAS". Krunal Bauskar from Percona noted this inconsistency in XID generation on slave vs master. Let's wait and check if it ends up as "Not a bug".
  • Bug #93957 - "slave_compressed_protocol doesn't work with semi-sync replication in MySQL-5.7". This bug report from Pavel Katiushyn also looks like a regression, as similar bug was fixed in older 5.7.x release. But I do not see any public comment with verification attempt neither in recent 5.7, nor in recent 8.0 (where older bug also had to be fixed). So, the bug is "verified", but the real impact and versions affected are not clear.
  • Bug #93963 - "Slow query log doesn't log a slow CREATE INDEX with admin statements enabled". This clear and properly tagged regression vs MySQL 5.7 was reported by Jeremy Smyth.
  • Bug #93986 - "Transactions in serializable mode are not actually serializable". I've subscribed to this bug report mostly for (expected) fun of reading further comments. It's still "Need feedback", but single comment so far is worth reading.
  • Bug #94121 - "Enable hardware CRC32 under Valgrind". Laurynas Biveinis from Percona also provided a patch for this 8 years old problem.
  • Bug #94130 - "XA COMMIT may lead replication broken". Yet another proof that XA transactions implementation is broken in MySQL. This time from Phoenix Zhang and in semi-sync replication case.
This photo reminds me current state of MySQL bugs processing in Oracle - it seems there is no clear and straightforward way to follow. Everything is fuzzy these days...

There are few more bugs reported in January, 2019 that I am watching, but their status is not yet clearly defined, so I decided to skip them in this review.

To summarize:
  1.  Oracle engineers who process bugs still do not add regression tag to many regression bugs. This is a shame, really. If I were their boss I'd make this a policy and one of important KPI values to monitor.
  2. In some cases bugs get verified immediately without any demonstrated attempt to show how the check was performed, while in other cases poor bug reporters have to fight hard to re-make their point and get a real check done. It seems these days good old approaches to bugs verification are not followed strictly by some Oracle engineers.

Saturday, February 9, 2019

On my Favorite FOSDEM 2019 MySQL, MariaDB and Friends Devroom Talks

This year I had not only spoken about MySQL bugs reporting at FOSDEM, but spent almost the entire day listening at MySQL, MariaDB and Friends Devroom. I missed only one talk, on ProxySQL, (to get some water, drink a bottle of famous Belgian beer and chat with my former colleague in MySQL support team, Geert, whom I had not seen for a decade). So, for the first time out of my 4 FOSDEM visits I've got a first hand impression about the entire set of talks in the devroom that I want to share today, while I still remember my feelings.

Most of the talks have both slides and videos already uploaded on site, so you can check them and make your own conclusions, but my top 5 favorite talks (that have both videos and slides already available to community) were the following:

  • "Un-split brain (aka Move Back in Time) MySQL", by Shlomi Noach. You can find slides at SlideShare.

    This was a replacement talk that was really interesting and had proper style for FOSDEM. It was mostly a nice background story of creation of the gh-mysql-rewind tool, a shell script that uses MariaDB's mysqlbinlog --flashback option and MySQL GTIDs and allows to "rewind" row-based binary log to roll back transactions to some previous point in time. The tool should become available to community soon, maybe as a part of orchestrator. I was impressed how one can successfully use 49 slides for 20 minutes talk. That's far beyond my current presentation skills...
  • "Test complex database systems in a laptop with dbdeployer", by Giuseppe Maxia. You can find slides at SlideShare.

    I've already built and used dbdeployer, as described in my blog post, so I was really interested in the talk. Giuseppe was able not only to show 45 slides over 20 minutes and explain all the reasons behind re-implementing MySQL-Sandbox in Go, but also run a live demo where dozens of sandbox instances were created and used. Very impressive!
  • "MySQL and the CAP theorem: relevance & misconceptions", second great talk and show by Shlomi Noach. You can find slides at SlideShare.

    The "CAP theorem" says is a concept that a distributed database system (like any kind of MySQL replication setup) can only have 2 of the 3 features: (atomic) Consistency, (high) Availability and Partition Tolerance. This can be proved mathematically, but Shlomi had not only defined terms and conditions to present the formal proof, but also explained that they are far from real production objectives of any engineer or DBA (like 99.95% of Availability). He had shown typical MySQL setups (from simple async master-slave replication to Galera, group replication and even Vitess) and proved that formally they all are neither consistent nor available from that formal CAP theorem point of view, while, as we all know, they are practically useful and work (and with some efforts, proxies on top etc can be made both highly available and highly consistent for practical purposes). So, CAP theorem is neither representing real production systems, nor meeting their real requirements. We've also got some kind of explanation of why async master-master or circular replication are still popular... All that in 48 slides, with links, and presented in 20 minutes! Greatest short MySQL-related talk I've ever attended.
  • "TiDB: Distributed, horizontally scalable, MySQL compatible", by Morgan Tocker. You can find slides at SlideShare.

    It was probably the first time when I listened to Morgan, even though we worked together for a long time. I liked his way of explaining the architecture of this yet another database system speaking MySQL protocol and reasons to create it. If you are interested in performance of this system, check this blog post.
  • "MySQL 8.0 Document Store: How to Mix NoSQL & SQL in MySQL 8.0", by Frédéric Descamps. You can find slides (70!) at SlideShare.

    LeFred managed to get me somewhat interested in MySQL Shell and new JSON functions in MySQL, way more than ever before. It's even more surprising that hist talk was the last one and we already spent 8+ hours listening before he started. Simple step by step explanation of how one may get the best of both SQL, ACID and NoSQL (JSON, "MongoDB") worlds, if needed, in a single database management syste, was impressive. Also this talk probably caused the longest discussion and the largest number of questions from those remaining attendees.

    He was also one of two "hosts" and "managers" of the devroom, so I am really thankful him for hist efforts year after year to make MySQL devroom at FOSDEM great!
There were more good talks, but I had to pick up few that already have slides shared and those of a kind that I personally prefer to listen to at FOSDEM. This year I also missed few people whom I like to see and talk to at FOSDEM, namely Mark Callaghan and Jean-François Gagné.

The only photo I made with my Nokia dumb phone this year in Brussels, on my way to FOSDEM on February 2. We've got snow and rain that morning, nice for anyone who had to walk 5 kilometers to the ULB campus.
Overall, based on my experience this year, it still makes a lot of sense to visit FOSDEM for anyone interested in MySQL. You can hardly find so many good, different MySQL-related talks per just one single day on any other conference.

Saturday, January 26, 2019

Fun with Bugs #78 - On Some Public Bugs Fixed in MySQL 5.7.25

Today I'd like to continue my tradition of ignoring MySQL 8 (after all, I can not even build 8.0.14 any more on my Ubuntu 14.04, it's not supported suddenly because of old gcc version) and, of all MySQL server versions released by Oracle this week, concentrate on bugs reported in public bugs database and fixed in the latest minor release of MySQL 5.7 branch, 5.7.25.

This time there is only one InnoDB community-reported bug fixed, Buig #87423 - "os0file.cc assertion failed 'offset > 0' in os_file_io_complete", from Vasily Nemkov. See also it's duplicate, Bug #88956, by Aidan Diffey. Assertion failure seems to happen with 32-bit binaries running on 64-bit OS (both bugs are reported on Linux) and is related to infamous ulint type usage in the code, but for some reason release notes and closing comment mention Windows:
"An assertion was raised when attempting to write to a tablespace file greater than 4GB in size on a 64-bit Windows system. The failure was due to a narrowing cast."
Usually there are some bugs around flowers. But sometimes we can enjoy flowers without bugs...
There is a long enough list of replication bugs fixed:
  • Bug #92132 - "secure-file-priv breaks LOAD DATA INFILE replication in statement mode on 5.7.23". This regression bug (that had not got "regression" tag!) was reported by Nicolai Plum.
  • Bug #91941 - "Deadlock during purge_logs_before_date". Great bug report from Nikolai Ikhalainen. Note also that some hints by Jean-François Gagné and proper gdb backtrace analysis by Oracle engineers, Shane Bester and Dmitry Lenev, were needed to force proper processing of this bug. See also related Bug #92108 - "Deadlock by concurrent show binlogs, pfs session_variables table & binlog purge", from Shashank Sahni.
  • Bug #91548 - "LOCK_grant and LOCK_open can deadlock on a gtid slave". Great example of Shane Bester's regular work on bugs.
  • Bug #90640 - "`head->variables.gtid_next.type != UNDEFINED_GTID' ". One of those cases when assertion failure in debug build highlights real problem in the code. Great finding by Roel Van de Paar.
  • Bug #87832 - "Relay_Log_Space is inaccurate and leaks". Nice bug report by Manuel Ung. Related variable could be changed concurrently without any locking and get randomly wrong values.
  • Bug #84752 - "Multi-Slave Replication Fail: bogus data in log event". This bug was reported by Gonzalo Miguel Arruti . Eventually it seems internally reported Bug#22252394 - SLAVE I/O THREAD MAY STOP WHEN BINLOG ROTATES highlighted the real problem. Good to have this case fixed!
  • Bug #83003 - "Using temporary tables on slaves increases GTID sequence number". I am so happy to this bug reported by my former colleague Ovais Tariq fixed in 5.7! Patches provided by Laurynas Biveinis from Percona are finally backported. It's so sad to see slaves out of sync with master when GTIDs are used... For some reason this bug is not listed as replication-related and patches contributed are not mentioned in the release notes.
Some other fixes are also interesting to check:
  • Bug #92131  - "ASan: Direct leak of 272 byte(s) in main.mysqlpump_partial_bkp MTR test case". I mentioned this bug reported by Yura Sorokin from Percona in the past. Good to see it fixed. I truly hope Oracle is testing ASan builds regularly with MTR and we'll not see such reports from community any more. For now efforts of Percona engineers help, as we can see also from Bug #90238 - "Comparison of uninitailized memory in log_in_use" with patches from Zsolt Parragi and Laurynas Biveinis.
  • Bug #92049 - "bogus data when ordering results from variables_by_thread". Thanks to Shane Bester from Oracle (who still reports MySQL bugs in public), we have one bug less in otherwise near perfect Performance Schema.
  • Bug #90742 = "SIGHUP cause mysql server crash.". It was reported by Seunguck Lee. Good to see S1 crashing bug not blindly classified as "security" and hidden forever. Note also great analysis there provided by Jean-François Gagné! It would be impossible for any hidden bug to get this kind of useful feedback.
  • Bug #89214 - "The SELECT will deadlock in the stored procedure, if the result set is empty.". It took some time and efforts for bug reporter (two pg) to prove the point, confirm that the problem happens only with prepared statements and force proper processing. This regression bug (MySQL 5.6 was not affected) still does not have "regression" tag.
That's all bugs I wanted to mention today. To summarize:
  1. I'd surely consider immediate upgrade to 5.7.25 in any environment where replication is used.
  2. It seems Oracle engineers who process bugs still have no habit of adding "regression" tag when obviously needed. I'll ask bug reporters to do this themselves during my FOSDEM talk about MySQL bugs.
  3. Percona still helps Oracle to make MySQL better with proper QA efforts and patches.
  4. Most of the bugs mentioned here affected MySQL 8 as well, so from this post you know about many important fixes that happened in MySQL 8.0.14 :) 

Sunday, January 20, 2019

Fun with Bugs #77 - On MySQL Bug Reports I am Subscribed to, Part XIV

Slides for my talk about MySQL bugs at FOSDEM 2019 MySQL, MariaDB and Friends Devroon are ready, support customers decided not to break anything badly on weekend, so I have some free time for blogging.  As usual, when I do not have any better idea or useful recent real life experience to share I write about MySQL bugs.

Today I'd like to continue my review of interesting MySQL bug reports added by Community members in December, 2018. I'll review them starting from the oldest:
  • Bug #93701 - "Assertion `maybe_null' failed |Item_func_concat::val_str(String*)". It's debug assertion, not a big deal (S6 bug), so why should anyone care about this report from Ramesh Sivaraman? Because last time same assertion failure ended up as serious enough Bug #83115 fixed in 8.0.11.
  • Bug #93708 - "Page Cleaner will sleep for long time if clock changes". In this bug report from Marcelo Altmann I was mostly impressed by arguing started by Oracle engineer on bug severity (documentation request, feature request vs real bug). IMHO it was a waste of time.
  • Bug #93728 - "mysqld crash after using alter table with SPATIAL key". Nice finding by Rui Xu. Unfortunately MariaDB 10.3.7 is also affected. At least in the error log I also see this error message for each SPATIAL index of the table:
    2019-01-20 19:26:15 9 [ERROR] InnoDB: Record in index `idx2` of table `test`.`tab` was not found on update: TUPLE (info_bits=0, 2 fields): {[32]      $@      $@
          $@      $@(0x0000000000000400000000000000040000000000000004000000000000000
    400),[4]    (0x00000001)} at: COMPACT RECORD(info_bits=0, 1 fields): {[8]infimum
     (0x090E06090D050D00)}
  • Bug #93734 - "MySQL 8.0 is 36 times slower than MySQL 5.7". I like everything about this bug reported by Vadim Tkachenko, CTO of Percona. I like synopsis that sounds cool, missing exact MySQL 8.0.x version (but I suspect 8.0.13), impact of innodb_flush_log_at_trx_commit, suggested fix and the fact that the bug is still "Open" and nobody from Oracle paid any attention to the report from CTO of one of key Oracle partners and supporters of MySQL 8. Wonderful!
  • Bug #93737 - "mysqlbinlog mermory used grows unstoped". I had not tried to reproduce it (and nobody had it seems, as the bug is still "Open"), but this report by Chandler Bong sounded interesting. Let's see what it may end up with...
  • Bug #93746 - "MySQL Crashes with deadlock at Mutex DICT_SYS created dict0dict.cc:1172". Long semaphore waits involving DICT_SYS mutex and leading to watchdog thread killing MySQL are common. Bug processing is still in progress, but I subscribed to this bug by Anton Ravich mostly to follow another bug he refers to as a possibly related, Bug #80919. That bug is declared a duplicate of some bug in Oracle's internal bugs database, without bug number mentioned. This is not the first time I see such am artistic way to process community bugs (that I consider totally wrong and unacceptable).
  • Bug #93760 - "InnoDB got assert failure while figuring out space id at startup". Fungo Wang found this crash while running MTR test case innodb.innodb_redo_debug_1 with debug binaries. Now I wonder if Oracle QA runs MTR tests with debug binaries at all and do they care to check failures (or rely on Percona for this activity)? Any comments?
  • Bug #93761 - "optimize table cause the slave MTS deadlock!". Sounds interesting and serious, but nobody cared to check this bug report. It's "Open" without comments for 3 weeks already.
  • Bug #93767 - "INSTALL COMPONENT fails - handled segfault during installation nothing in log". I am surprised that somebody outside of Oracle already uses new MySQL 8.0 server components feature. My former colleague, Justin Swanhart, tried but failed so far. Let's see how this report may end up. For now Oracle engineer stated that Justin tried to do something this framework was not designed for, see nice last comment there.
  • Bug #93777 - "the Originator of mysql.event is not correct". Pay attention to this bug found by Dennis Gao if you use events in replication environment.
  • Bug #93779 - "dml in table with trigger to other Table map in binlog lead to slave sql stopped". Work on this bug report from Mohamed Atef is still in progress. Bug reported cared to add a lot of details in recent comments, so I hope to see this bug processed soon.
I've also included a couple of bugs reported in January, 2019 in the list above, but I hope you don't mind.
Today I'd like to remember those sunny days in Venice in September and those pesky MySQL bugs reported this winter...
To summarize:
  1. Some bug reports stay "Open" for weeks without any good reason recently.
  2. The shameful practice of referring to bugs in Oracle's internal bugs database without giving a number (that we see in release notes when the bug is fixed and can look for in commit messages at GitHub) should be stopped! Next time I'll blame engineer(s) who this by name, here and in social media. Check how this is properly done by Shane Bester.
  3. I wonder if Oracle QA runs MTR tests cases on debug builds on a regula basis and checks the results, or this is now a job for Percona mostly?
  4. I should write a separate blog post about bugs related to SPATIAL indexes implementation in InnoDB. There should be dragons...
P.S. Don't forget to click on names of bug reporters to see the entire list of still active MySQL bug reports each of them created.

Tuesday, January 15, 2019

Using dbdeployer With MariaDB Server

Some time ago I've noted that one of the tools I use for testing various MySQL and MariaDB cases and to reproduce potential bugs, MySQL-Sandbox, is not updated any more. It turned out that active development switched to its port in Go called dbdeployer. You can find detailed information about dbdeployer and reasons behind developing it provided by its author, Giuseppe Maxia, here and there. See also this post at Percona blog for some quick review of its main features. One of the points of dbdeployer (and reasons to use Go) is that it is built once (per platform supported) somewhere and then binaries are downloaded from GitHub and used everywhere, without any problems with dependencies etc.

I've added checking dbdeployer to my long ToDo list, as I planned to use it (if not MySQL Sandbox) for some tests and posts related to resolving typical practical problems with MariaDB GTID-based replication. Yesterday I've allocated some time to finally try it and, as usual, I've started with building it from source (as for me databases-related software that I can not build from source on my test systems is not any attractive as something new to study and use). I was immediately surprised by the lack of instructions on how to do this at GitHub, no Makefile of any kind etc. All I was able to find is build.sh script. Correction: just check README.md on how to build it properly, as Giuseppe Maxia explained in the comment.

Good, regular structure is important for deployment
Fortunately this is not the first project written in Go that I try to build (or change somehow and then build). The first one was this replication manager (that has proper build instructions in docs). So, I though I knew what to do. I've installed missing golang package on my netbook with Ubuntu 14.04 that I had at hand and tried the following typical steps:
openxs@ao756:~/go$ export GOPATH=$HOME/go
openxs@ao756:~/go$ echo $GOPATH
/home/openxs/go
openxs@ao756:~/go$ go get github.com/datacharmer/dbdeployer
# github.com/datacharmer/dbdeployer/common
src/github.com/datacharmer/dbdeployer/common/strutils.go:170: undefined: sort.Slice
...
That was a bit surprising, but quick Google search shown that this could be caused by outdated (pre-1.8) version of golang package. So, dbdeployer requires golang 1.8 or newer and there was no such package for my good old Ubuntu (it has some 1.2.x only). One day I'll upgrade it, but so far I am OK with 14.04 for all other testing purposes, so I had to give up on the idea to build from source temporary. 

Today during few free minutes I've retried on my good old desktop box with Fedora 27 (where I surely built some Go project(s) successfully):
[openxs@fc23 go]$ uname -a
Linux fc23 4.18.19-100.fc27.x86_64 #1 SMP Wed Nov 14 22:04:34 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[openxs@fc23 ~]$ ls go
pkg  src
[openxs@fc23 ~]$ echo $GOPATH

[openxs@fc23 ~]$ export GOPATH=$HOME/go
[openxs@fc23 ~]$ cd go
[openxs@fc23 go]$ go versiongo version go1.9.7 linux/amd64
This environment should work for build, so I've proceeded with:
[openxs@fc23 go]$ go get github.com/datacharmer/dbdeployer
[openxs@fc23 go]$ ls src/github.com/datacharmer/   jmoiron/       nsf/           tanji/
go-sql-driver/ mattn/         ogier/
[openxs@fc23 go]$ ls src/github.com/datacharmer/dbdeployer/
abbreviations/ compare/       docs/          mkreadme/      test/
.build/        concurrent/    .git/          rest/          unpack/
cmd/           cookbook/      .github/       sandbox/       vendor/
common/        defaults/      globals/       scripts/
Now let's try that scripts/build.sh with linux as a parameter, as it's a way to build Linux binaries based on what I found:
[openxs@fc23 go]$ MKDOCS=1 src/github.com/datacharmer/dbdeployer/scripts/build.sh linux
+ env GOOS=linux GOARCH=386 go build --tags docs -o dbdeployer-1.17.0-docs.linux .
+ env GOOS=linux GOARCH=amd64 go build -o sort_versions.linux sort_versions.go
/home/openxs/go/src/github.com/datacharmer/dbdeployer
-rwxrwxr-x. 1 openxs openxs 8.1M Jan 14 10:27 dbdeployer-1.17.0-docs.linux
-rw-rw-r--. 1 openxs openxs 3.0M Jan 14 10:27 dbdeployer-1.17.0-docs.linux.tar.gz
[openxs@fc23 go]$ ls
bin  pkg  src
[openxs@fc23 go]$ ls bin
dbdeployer
[openxs@fc23 go]$ bin/dbdeployer --version
dbdeployer version 1.17.0
Now we know how to build dbdeployer from source, if needed. If some dependencies are missing you'll be informed and similar go get ... command should allow to install it.

I was somewhat surprised to see MariaDB NOT mentioned at all in README.md. It says:
"DBdeployer is a tool that deploys MySQL database servers easily."
while good old MySQL-Sandbox also mentions MariaDB explicitly:
"This package is a sandbox for testing features under any version of MySQL from 3.23 to 8.0 (and any version of MariaDB.)"
So, my idea was to double check that dbdeployer is both MySQL-Sandbox compatible and MariaDB compatible (it is). I have several sandboxes already created in the past. I also have MariaDB 10.2.21 .tar.gz binaries that I want to use with dbdeployer for further testing:
[openxs@fc23 go]$ ls ~/sandboxes/
clear_all        rsandbox_mariadb-10_0_19  send_kill_all  test_replication
plugin.conf      rsandbox_mariadb-10_1_12  start_all      use_all
restart_all      rsandbox_mysql-8_0_12     status_all
rsandbox_8_0_12  sandbox_action            stop_all
[openxs@fc23 go]$ ls ~/*.tar.gz
/home/openxs/galera-25.3.22-x86_64.tar.gz
/home/openxs/galera-25.3.24-x86_64.tar.gz
/home/openxs/galera-25.3.25-glibc_214-x86_64.tar.gz
/home/openxs/mariadb-10.2.12-linux-x86_64.tar.gz
/home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz
With dbdeployer one has to unpack .tar.gz first with dbdeployer unpack command. So, I tried it immediately:
[openxs@fc23 go]$ bin/dbdeployer unpack ~/mariadb-10.2.21-linux-x86_64.tar.gz
directory '/home/openxs/opt/mysql' not found
You should create it or provide an alternate base directory using --sandbox-binary
It seems the tool now wants to use ~/opt/mysql as a directory to unpack to, while MySQL_Sandbox silently used ~:
[openxs@fc23 go]$ ls ~ | grep 8.0
8.0.12
I made a lame try to force it to use ~, but failed for the reason I was too lazy to study:
[openxs@fc23 go]$ bin/dbdeployer --sandbox-binary=/home/openxs unpack /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gzUnpacking tarball /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz to $HOME/10.2.21
.........100.........200....&tar.Header{Name:"mariadb-10.2.21-linux-x86_64/mysql-test/mysql-test-run", Mode:511, Uid:1021, Gid:1004, Size:0, ModTime:time.Time{wall:0x0, ext:63681810892, loc:(*time.Location)(0xa47aa0)}, Typeflag:0x32, Linkname:"./mysql-test-run.pl", Uname:"dbart", Gname:"my", Devmajor:0, Devminor:0, AccessTime:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}, ChangeTime:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}, Xattrs:map[string]string(nil)}
#ERROR: symlink ./mysql-test-run.pl mariadb-10.2.21-linux-x86_64/mysql-test/mysql-test-run: file exists
I just created ~/opt/mysql and proceeded with default configuration. After unpack step completed successfully I've proceeded with deploy step to create new replication sandbox:
[openxs@fc23 go]$ bin/dbdeployer unpack /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz
Unpacking tarball /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz to $HOME/opt/mysql/10.2.21
.........100.........200.........300.........400.........500.........600.........700.........800... ...
Renaming directory /home/openxs/opt/mysql/mariadb-10.2.21-linux-x86_64 to /home/openxs/opt/mysql/10.2.21

[openxs@fc23 go]$ bin/dbdeployer deploy replication 10.2.21
Installing and starting master
. sandbox server started
Installing and starting slave1
. sandbox server started
Installing and starting slave2
. sandbox server started
$HOME/sandboxes/rsandbox_10_2_21/initialize_slaves
initializing slave 1
initializing slave 2
Replication directory installed in $HOME/sandboxes/rsandbox_10_2_21
run 'dbdeployer usage multiple' for basic instructions'
We have access to nice enough documentation:
[openxs@fc23 go]$ bin/dbdeployer usage multiple
 USING MULTIPLE SERVER SANDBOX
On a replication sandbox, you have the same commands (run "dbdeployer usage single"),
with an "_all" suffix, meaning that you propagate the command to all the members.
Then you have "./m" as a shortcut to use the master, "./s1" and "./s2" to access
the slaves (and "s3", "s4" ... if you define more).

In group sandboxes without a master slave relationship (group replication and
multiple sandboxes) the nodes can be accessed by ./n1, ./n2, ./n3, and so on.

start_all    [options] > starts all nodes
status_all             > get the status of all nodes
restart_all  [options] > restarts all nodes
stop_all               > stops all nodes
use_all         "SQL"  > runs a SQL statement in all nodes
use_all_masters "SQL"  > runs a SQL statement in all masters
use_all_slaves "SQL"   > runs a SQL statement in all slaves
clear_all              > stops all nodes and removes all data
m                      > invokes MySQL client in the master
s1, s2, n1, n2         > invokes MySQL client in slave 1, 2, node 1, 2


The scripts "check_slaves" or "check_nodes" give the status of replication in the sandbox.
Typical sandbox directory (with some differences like use_all_slaves etc) is created in ~/sandboxes/ and shortcut commands work as expected:
[openxs@fc23 go]$ cd ~/sandboxes/rsandbox_10_2_21/
[openxs@fc23 rsandbox_10_2_21]$ ls
check_slaves       n2           sbdescription.json  test_sb_all
clear_all          node1        send_kill_all       use_all
initialize_slaves  node2        start_all           use_all_masters
m                  restart_all  status_all          use_all_slaves
master             s1           stop_all
n1                 s2           test_replication

[openxs@fc23 rsandbox_10_2_21]$ ls master/
add_option    init_db         restart             show_binlog    status   use
clear         load_grants     sbdescription.json  show_log       stop
data          my              sb_include          show_relaylog  test_sb
grants.mysql  my.sandbox.cnf  send_kill           start          tmp

[openxs@fc23 rsandbox_10_2_21]$ ls ../rsandbox_mariadb-10_1_12/
check_slaves             m       node1        s2             test_replication
clear_all                master  node2        send_kill_all  use_all
connection.json          n1      README       start_all
default_connection.json  n2      restart_all  status_all
initialize_slaves        n3      s1           stop_all

[openxs@fc23 rsandbox_10_2_21]$ ls ../rsandbox_mariadb-10_1_12/master/
add_option       default_connection.json  my              send_kill      tmp
change_paths     grants_5_7_6.mysql       mycli           show_binlog    use
change_ports     grants.mysql             my.sandbox.cnf  show_relaylog  USING
clear            json_in_db               proxy_start     start
connection.json  load_grants              README          status
data             msb                      restart         stop

[openxs@fc23 rsandbox_10_2_21]$ ./status_all
REPLICATION  /home/openxs/sandboxes/rsandbox_10_2_21
master : master on  -  port     23322 (23322)
node1 : node1 on  -  port       23323 (23323)
node2 : node2 on  -  port       23324 (23324)
[openxs@fc23 rsandbox_10_2_21]$ ./use_all "show variables like 'gtid%'"
# master
Variable_name   Value
gtid_binlog_pos 0-100-12
gtid_binlog_state       0-100-12
gtid_current_pos        0-100-12
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos
gtid_strict_mode        OFF
# server: 1
Variable_name   Value
gtid_binlog_pos
gtid_binlog_state
gtid_current_pos        0-100-12
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-12
gtid_strict_mode        OFF
# server: 2
Variable_name   Value
gtid_binlog_pos
gtid_binlog_state
gtid_current_pos        0-100-12
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-12
gtid_strict_mode        OFF
[openxs@fc23 rsandbox_10_2_21]$ ./m
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 14
Server version: 10.2.21-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

master [localhost:23322] {msandbox} ((none)) > show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 |     2835 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

master [localhost:23322] {msandbox} ((none)) > show variables like 'gtid%';
+------------------------+----------+
| Variable_name          | Value    |
+------------------------+----------+
| gtid_binlog_pos        | 0-100-12 |
| gtid_binlog_state      | 0-100-12 |
| gtid_current_pos       | 0-100-12 |
| gtid_domain_id         | 0        |
| gtid_ignore_duplicates | OFF      |
| gtid_seq_no            | 0        |
| gtid_slave_pos         |          |
| gtid_strict_mode       | OFF      |
+------------------------+----------+
8 rows in set (0.00 sec)

master [localhost:23322] {msandbox} ((none)) > exit
Bye
For my further tests I needed slaves to have log_slave_updates enabled and gtid_strict_mode=ON. So, I've added these settings to my.sandbox.cnf in node1 and node2 subdirectories for both slaves and restarted them:
[openxs@fc23 rsandbox_10_2_21]$ ./restart_all# executing 'stop' on /home/openxs/sandboxes/rsandbox_10_2_21
stop /home/openxs/sandboxes/rsandbox_10_2_21/node1
stop /home/openxs/sandboxes/rsandbox_10_2_21/node2
stop /home/openxs/sandboxes/rsandbox_10_2_21/master
# executing 'start' on /home/openxs/sandboxes/rsandbox_10_2_21
executing 'start' on master
. sandbox server started
executing 'start' on slave 1
. sandbox server started
executing 'start' on slave 2
. sandbox server started
I need a table to play with and I want to check that slaves are in sync:
[openxs@fc23 rsandbox_10_2_21]$ ./m
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 12
Server version: 10.2.21-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

master [localhost:23322] {msandbox} ((none)) > use test
Database changed
master [localhost:23322] {msandbox} (test) > create table t1(id int primary key, c1 int);
Query OK, 0 rows affected (0.17 sec)

master [localhost:23322] {msandbox} (test) > exit
Bye
[openxs@fc23 rsandbox_10_2_21]$ ./use_all "show variables like 'gtid%'"
# master
Variable_name   Value
gtid_binlog_pos 0-100-13
gtid_binlog_state       0-100-13
gtid_current_pos        0-100-13
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos
gtid_strict_mode        OFF
# server: 1
Variable_name   Value
gtid_binlog_pos 0-100-13
gtid_binlog_state       0-100-13
gtid_current_pos        0-100-13
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-13
gtid_strict_mode        ON
# server: 2
Variable_name   Value
gtid_binlog_pos 0-100-13
gtid_binlog_state       0-100-13
gtid_current_pos        0-100-13
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-13
gtid_strict_mode        ON
[openxs@fc23 rsandbox_10_2_21]$
Note the value of gtid_current_pos on master and gtid_slave_pos on each slave. They are the same and slaves are in sync. If you want to find out more about the format of GTIDs in MariaDB or all that gtid% server variables, please, check this KB article.

* * *

To summarize, dbdeployer is a nice port of MySQL-Sandbox into Go, with some additional features. It can be easily built from source if you have golang version 1.8 or newer (or just downloaded if you have not). Sandboxes created with dbdeployer may co-exists with older sandboxes in the same default directory (but .tar.gz files are unpacked into different directory by default). It still works well with MariaDB. I am going to use replication sandboxes built with it for some further testing of various real life use cases and problems of MariaDB's GTIDs implementation (that may be presented in further posts).

Sunday, January 13, 2019

Understanding Status of MariaDB Server JIRA Issues

In my previous blog post on MariaDB's JIRA for MySQL users who are familiar with MySQL bugs database (but may be new to JIRA) I've presented some details about statuses that JIRA issues may have. There is no one to one correspondence with MySQL bug's statuses that I once described in details here. In case of MariaDB Server bugs ("JIRA issues") one may have to check not only "Status" field, but also "Resolution" filed and even "Labels" field to quickly understand what is the real status and what MariaDB engineers decided or are waiting for. So, I think some additional clarifications may help MySQL users who check or report MariaDB bugs as well.

Let me present details of this statuses correspondence in a simple table, where the first column contains MySQL's bug status, while 3 other columns contain the content of corresponding MariaDB Server JIRA issue's fields, "Status", "Resolution" and "Labels". There is also "Comment" column with some explanation on what else is usually done in JIRA issue when it gets this set of values defining its status or what this may mean in MySQL bugs database etc. Most important MySQL bug statuses are taken from this my post (there are more of them, but others are rarely used, especially when real work on bugs was moved into internal bugs database by Oracle, or were removed since that post as it happened to "To be fixed later").

MySQL Bug StatusMariaDB JIRA StatusMariaDB JIRA ResolutionMariaDB JIRA LabelComment
OpenOPENUnresolvedTypical status for just reported bug
ClosedCLOSEDFixedYou should see list of versions that got the fix in the Fix Version/s field
DuplicateCLOSEDDuplicateSo, in MariaDB it's "closed as a duplicate"
AnalyzingOPENUnresolvedUsually bug is assigned when some engineer is working on it, including analysis stage
VerifiedCONFIRMEDUnresolvedCONFIRMED bugs are usually assigned in JIRA while in MySQL "Verified" bugs are usually unassigned
Won't fixCLOSEDWon't FixUsually remains assigned
Can't repeatCLOSEDCannot reproduceUnlike in MySQL, usually means that both engineer and bug reporter are not able to reproduce this
No FeedbackCLOSEDIncompleteneed_feedbackAs in MySQL, bug should stay with "need_feedback" label for some time before it's closed as incomplete
Need FeedbackOPENUnresolvedneed_feedbackUsually in the last comment in the bug you can find out what kind of feedback is required. No automatic setting to "No Feedback" in 30 days
Not a BugCLOSEDNot a Bug 
UnsupportedCLOSEDWon't FixThere is no special "Unsupported" status in MariaDB. Most likely when there is a reason NOT to fix it's stated in the comment.

In the table above you can click on some links to see the list of MariaDB bugs with the status discussed in the table row. This is how I am going to use this post from now on, as a quick search starting point :) It will also be mentioned on one of slides of my upcoming FOSDEM 2019 talk.

Sunday, January 6, 2019

Fun with Bugs #76 - On MySQL Bug Reports I am Subscribed to, Part XIII

Holidays season is almost over here, so it's time to get back to my main topic of MySQL bugs. Proper MySQL bug reporting will be a topic of my FOSDEM 2019 talk in less than 4 weeks (and few slides with recent examples of bugs are not yet ready), so I have to concentrate on bugs.

Last time in this series I reviewed some interesting bug reports filed in November, 2018. Time to move on and proceed with bugs reported in December, 2018, as I've subscribed to 27 or so of them. As usual, I'll review them briefly starting from the oldest and try to check if MariaDB 10.3 is also affected when the bug report is about common features:
  • Bug #93440 - "Noop UPDATE query is logged to binlog after read_only flag is set". Nice corner case found by Artem Danilov. super_read_only, even if set to ON successfully, may not prevent from committing and advancing GTID value.
  • Bug #93450 - "mysqldump does not wrap SET NAMES into mysql-extension comment". This is a regression bug in MySQL 8.0 that may break compatibility with 3rd party tools not aware of MySQL extensions. This bug was reported by Mattias Jonsson.
  • Bug #93451 - "The table comment is cut down on selecting with ORDER BY". Nice regression in MySQL 8. As one can easily check, MariaDB 10.3.x and older MySQL versions are not affected.
  • Bug #93491 - "Optimizer does not correctly consider attached conditions in planning". Clear and useful bug report from Morgan Tocker.
  • Bug #93544 - "SHOW BINLOG EVENTS FROM <bad offset> is not diagnosed". Yet another regression bug in MySQL 8 found by Laurynas Biveinis from Percona. MariaDB 10.3 does not accept bad offsets:
    MariaDB [test]> show binlog events from 14 limit 1;
    ERROR 1220 (HY000): Error when executing command SHOW BINLOG EVENTS: Wrong offset or I/O error
    MariaDB [test]> show binlog events limit 4;
    +------------------+-----+-------------------+-----------+-------------+--------
    ---------------------------------------+
    | Log_name         | Pos | Event_type        | Server_id | End_log_pos | Info
                                           |
    +------------------+-----+-------------------+-----------+-------------+--------
    ---------------------------------------+
    | pc-PC-bin.000001 |   4 | Format_desc       |         1 |         256 | Server
    ver: 10.3.7-MariaDB-log, Binlog ver: 4 |
    | pc-PC-bin.000001 | 256 | Gtid_list         |         1 |         285 | []
                                           |
    | pc-PC-bin.000001 | 285 | Binlog_checkpoint |         1 |         328 | pc-PC-b
    in.000001                              |
    | pc-PC-bin.000001 | 328 | Gtid              |         1 |         370 | GTID 0-
    1-1                                    |
    +------------------+-----+-------------------+-----------+-------------+--------
    ---------------------------------------+
    4 rows in set (0.002 sec)

    MariaDB [test]> show binlog events from 256 limit 1;
    +------------------+-----+------------+-----------+-------------+------+
    | Log_name         | Pos | Event_type | Server_id | End_log_pos | Info |
    +------------------+-----+------------+-----------+-------------+------+
    | pc-PC-bin.000001 | 256 | Gtid_list  |         1 |         285 | []   |
    +------------------+-----+------------+-----------+-------------+------+
    1 row in set (0.002 sec)
  • Bug #93572 - "parallel workers+slave_preserve_commit_order+flushtables with read lock deadlock". I subscribed to it as it's just yet another example of improper handling of useful bug reports, as already discussed in my post "Problems with Oracle's Way of MySQL Bugs Database Maintenance". I think Ashe Sun's point is clear and suggestions like "don't do it" have nothing to do with proper bugs processing.
  • Bug #93587 - "Error when creating a table with long partition names". Nice regression bug in MySQL 8 comparing to 5.7 was found by Sergei Glushchenko from Percona.

    MariaDB 10.3.7 on Windows also fails with error message that is not clear:
    ERROR 1005 (HY000): Can't create table `mc5noglq9ofy7ym76z1t758ztptj6iplvsldhmse
    xt63mlvhcpew4dnu2opqdrre`.`th6edxfx5d1u5blb3i50ln5dfo415jirp9xkuc0h9o2ionkql3iom
    fyw4zvocfpp` (errno: 168 "Unknown (generic) error from engine")
    In the error log I see:
    2019-01-06 19:36:46 10 [ERROR] InnoDB: Operating system error number 3 in a file operation.
    2019-01-06 19:36:46 10 [ERROR] InnoDB: The error means the system cannot find the path specified.
    2019-01-06 19:36:46 10 [ERROR] InnoDB: File .\mc5noglq9ofy7ym76z1t758ztptj6iplvs
    ldhmsext63mlvhcpew4dnu2opqdrre\th6edxfx5d1u5blb3i50ln5dfo415jirp9xkuc0h9o2ionkql
    3iomfyw4zvocfpp#p#o8w7066agxadomywht89twmbjomtfdmdc74wj7iupkd75lvu1enov1j008sjbk
    kf#sp#ywkq987ztkdj33zbmlw526153x86vxl4x44r15spf8jqs92665mt0qi6bsnkazy5.ibd: 'cre
    ate' returned OS error 203.
    2019-01-06 19:36:46 10 [ERROR] InnoDB: Cannot create file '.\mc5noglq9ofy7ym76z1
    t758ztptj6iplvsldhmsext63mlvhcpew4dnu2opqdrre\th6edxfx5d1u5blb3i50ln5dfo415jirp9
    xkuc0h9o2ionkql3iomfyw4zvocfpp#p#o8w7066agxadomywht89twmbjomtfdmdc74wj7iupkd75lv
    u1enov1j008sjbkkf#sp#ywkq987ztkdj33zbmlw526153x86vxl4x44r15spf8jqs92665mt0qi6bsn
    kazy5.ibd'
  • Bug #93600 - "Setting out of range fractional part produces incorrect timestamps". After some arguing this bug reported by Evgeny Firsov was "Verified". In MariaDB 10.3 truncation happens:
    MariaDB [test]> SET SESSION TIMESTAMP=1.9999996;
    Query OK, 0 rows affected (0.039 sec)

    MariaDB [test]> SELECT CURRENT_TIMESTAMP(6);
    +----------------------------+
    | CURRENT_TIMESTAMP(6)       |
    +----------------------------+
    | 1970-01-01 02:00:01.999999 |
    +----------------------------+
    1 row in set (0.010 sec)

    MariaDB [test]> CREATE TABLE t1( ts TIMESTAMP(6), dt DATETIME(6) );
    Query OK, 0 rows affected (0.387 sec)

    MariaDB [test]> INSERT INTO t1 values (CURRENT_TIMESTAMP(6), CURRENT_TIMESTAMP(6));
    Query OK, 1 row affected (0.079 sec)

    MariaDB [test]> select * from t1;
    +----------------------------+----------------------------+
    | ts                         | dt                         |
    +----------------------------+----------------------------+
    | 1970-01-01 02:00:01.999999 | 1970-01-01 02:00:01.999999 |
    +----------------------------+----------------------------+
    1 row in set (0.016 sec)
  • Bug #93603 - "Memory access error with alter table character change." This bug was reported by Ramesh Sivaraman from Percona QA. I've subscribed mostly to find out how bug reports with new severity level (S6) are going to be processed and fixed. See also his Bug #93701 - "Assertion `maybe_null' failed |Item_func_concat::val_str(String*)".

    I've subscribed to S7 Bug #93617 - "Conditional jump or depends on uninitialized value(s) in Field_num::Field_num" from Laurynas Biveinis for similar reason.
  • Bug #93649 - "STOP SLAVE SQL_THREAD deadlocks if done while holding LOCK INSTANCE FOR BACKUP". New MySQL 8 feature, LOCK INSTANCE FOR BACKUP statement, is an attempt to introduce backup locks to MySQL. Sergei Glushchenko found that it may cause deadlocks. I am surprised that the bug is still "Open", since December 18, 2018.
  • Bug #93683 - "Got error 155 when reading table './test/t1'". I am not sure if this message in the error log noted by Roel Van de Paar from Percona is a bug or a problem. MariaDB 10.3.7 produces the same error message. Let's find out, so far this report is "Verified".
  • Bug #93684 - "mysql innodb dump restore slows down after upgrade mysql 5.7 to 8.0". Florian Kopp reported this potential notable performance regression of MySQL 8.0.13 vs 5.7, but it is still not verified. I am not sure how this report may end up, but it's not the first report about performance regressions in MySQL 8 :)

One can hardly find any bugs in this winter forest. But they are hiding and will affect everyone there one day, in spring...
To summarize:
  1. Some regression bugs are still not marked with "regression" tag.
  2. Some MySQL bug reports are still handled wrongly, with a trend of wasting bug reporter time on irrelevant clarifications and claims the problem is not clear, when there is a good enough explanation of the test case.
  3. Percona engineers still contribute a lot of MySQL 8 QA, by reporting numerous bugs. No wonder with their first Percona Server for MySQL 8 GA release happened in December...
I still have a dozen or so December 2018 bug reports (mostly "Open" at the moment) to review one day, so stay tuned!