Sunday, January 20, 2019

Fun with Bugs #77 - On MySQL Bug Reports I am Subscribed to, Part XIV

Slides for my talk about MySQL bugs at FOSDEM 2019 MySQL, MariaDB and Friends Devroon are ready, support customers decided not to break anything badly on weekend, so I have some free time for blogging.  As usual, when I do not have any better idea or useful recent real life experience to share I write about MySQL bugs.

Today I'd like to continue my review of interesting MySQL bug reports added by Community members in December, 2018. I'll review them starting from the oldest:
  • Bug #93701 - "Assertion `maybe_null' failed |Item_func_concat::val_str(String*)". It's debug assertion, not a big deal (S6 bug), so why should anyone care about this report from Ramesh Sivaraman? Because last time same assertion failure ended up as serious enough Bug #83115 fixed in 8.0.11.
  • Bug #93708 - "Page Cleaner will sleep for long time if clock changes". In this bug report from Marcelo Altmann I was mostly impressed by arguing started by Oracle engineer on bug severity (documentation request, feature request vs real bug). IMHO it was a waste of time.
  • Bug #93728 - "mysqld crash after using alter table with SPATIAL key". Nice finding by Rui Xu. Unfortunately MariaDB 10.3.7 is also affected. At least in the error log I also see this error message for each SPATIAL index of the table:
    2019-01-20 19:26:15 9 [ERROR] InnoDB: Record in index `idx2` of table `test`.`tab` was not found on update: TUPLE (info_bits=0, 2 fields): {[32]      $@      $@
          $@      $@(0x0000000000000400000000000000040000000000000004000000000000000
    400),[4]    (0x00000001)} at: COMPACT RECORD(info_bits=0, 1 fields): {[8]infimum
  • Bug #93734 - "MySQL 8.0 is 36 times slower than MySQL 5.7". I like everything about this bug reported by Vadim Tkachenko, CTO of Percona. I like synopsis that sounds cool, missing exact MySQL 8.0.x version (but I suspect 8.0.13), impact of innodb_flush_log_at_trx_commit, suggested fix and the fact that the bug is still "Open" and nobody from Oracle paid any attention to the report from CTO of one of key Oracle partners and supporters of MySQL 8. Wonderful!
  • Bug #93737 - "mysqlbinlog mermory used grows unstoped". I had not tried to reproduce it (and nobody had it seems, as the bug is still "Open"), but this report by Chandler Bong sounded interesting. Let's see what it may end up with...
  • Bug #93746 - "MySQL Crashes with deadlock at Mutex DICT_SYS created". Long semaphore waits involving DICT_SYS mutex and leading to watchdog thread killing MySQL are common. Bug processing is still in progress, but I subscribed to this bug by Anton Ravich mostly to follow another bug he refers to as a possibly related, Bug #80919. That bug is declared a duplicate of some bug in Oracle's internal bugs database, without bug number mentioned. This is not the first time I see such am artistic way to process community bugs (that I consider totally wrong and unacceptable).
  • Bug #93760 - "InnoDB got assert failure while figuring out space id at startup". Fungo Wang found this crash while running MTR test case innodb.innodb_redo_debug_1 with debug binaries. Now I wonder if Oracle QA runs MTR tests with debug binaries at all and do they care to check failures (or rely on Percona for this activity)? Any comments?
  • Bug #93761 - "optimize table cause the slave MTS deadlock!". Sounds interesting and serious, but nobody cared to check this bug report. It's "Open" without comments for 3 weeks already.
  • Bug #93767 - "INSTALL COMPONENT fails - handled segfault during installation nothing in log". I am surprised that somebody outside of Oracle already uses new MySQL 8.0 server components feature. My former colleague, Justin Swanhart, tried but failed so far. Let's see how this report may end up. For now Oracle engineer stated that Justin tried to do something this framework was not designed for, see nice last comment there.
  • Bug #93777 - "the Originator of mysql.event is not correct". Pay attention to this bug found by Dennis Gao if you use events in replication environment.
  • Bug #93779 - "dml in table with trigger to other Table map in binlog lead to slave sql stopped". Work on this bug report from Mohamed Atef is still in progress. Bug reported cared to add a lot of details in recent comments, so I hope to see this bug processed soon.
I've also included a couple of bugs reported in January, 2019 in the list above, but I hope you don't mind.
Today I'd like to remember those sunny days in Venice in September and those pesky MySQL bugs reported this winter...
To summarize:
  1. Some bug reports stay "Open" for weeks without any good reason recently.
  2. The shameful practice of referring to bugs in Oracle's internal bugs database without giving a number (that we see in release notes when the bug is fixed and can look for in commit messages at GitHub) should be stopped! Next time I'll blame engineer(s) who this by name, here and in social media. Check how this is properly done by Shane Bester.
  3. I wonder if Oracle QA runs MTR tests cases on debug builds on a regula basis and checks the results, or this is now a job for Percona mostly?
  4. I should write a separate blog post about bugs related to SPATIAL indexes implementation in InnoDB. There should be dragons...
P.S. Don't forget to click on names of bug reporters to see the entire list of still active MySQL bug reports each of them created.

Tuesday, January 15, 2019

Using dbdeployer With MariaDB Server

Some time ago I've noted that one of the tools I use for testing various MySQL and MariaDB cases and to reproduce potential bugs, MySQL-Sandbox, is not updated any more. It turned out that active development switched to its port in Go called dbdeployer. You can find detailed information about dbdeployer and reasons behind developing it provided by its author, Giuseppe Maxia, here and there. See also this post at Percona blog for some quick review of its main features. One of the points of dbdeployer (and reasons to use Go) is that it is built once (per platform supported) somewhere and then binaries are downloaded from GitHub and used everywhere, without any problems with dependencies etc.

I've added checking dbdeployer to my long ToDo list, as I planned to use it (if not MySQL Sandbox) for some tests and posts related to resolving typical practical problems with MariaDB GTID-based replication. Yesterday I've allocated some time to finally try it and, as usual, I've started with building it from source (as for me databases-related software that I can not build from source on my test systems is not any attractive as something new to study and use). I was immediately surprised by the lack of instructions on how to do this at GitHub, no Makefile of any kind etc. All I was able to find is script. Correction: just check on how to build it properly, as Giuseppe Maxia explained in the comment.

Good, regular structure is important for deployment
Fortunately this is not the first project written in Go that I try to build (or change somehow and then build). The first one was this replication manager (that has proper build instructions in docs). So, I though I knew what to do. I've installed missing golang package on my netbook with Ubuntu 14.04 that I had at hand and tried the following typical steps:
openxs@ao756:~/go$ export GOPATH=$HOME/go
openxs@ao756:~/go$ echo $GOPATH
openxs@ao756:~/go$ go get
src/ undefined: sort.Slice
That was a bit surprising, but quick Google search shown that this could be caused by outdated (pre-1.8) version of golang package. So, dbdeployer requires golang 1.8 or newer and there was no such package for my good old Ubuntu (it has some 1.2.x only). One day I'll upgrade it, but so far I am OK with 14.04 for all other testing purposes, so I had to give up on the idea to build from source temporary. 

Today during few free minutes I've retried on my good old desktop box with Fedora 27 (where I surely built some Go project(s) successfully):
[openxs@fc23 go]$ uname -a
Linux fc23 4.18.19-100.fc27.x86_64 #1 SMP Wed Nov 14 22:04:34 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[openxs@fc23 ~]$ ls go
pkg  src
[openxs@fc23 ~]$ echo $GOPATH

[openxs@fc23 ~]$ export GOPATH=$HOME/go
[openxs@fc23 ~]$ cd go
[openxs@fc23 go]$ go versiongo version go1.9.7 linux/amd64
This environment should work for build, so I've proceeded with:
[openxs@fc23 go]$ go get
[openxs@fc23 go]$ ls src/   jmoiron/       nsf/           tanji/
go-sql-driver/ mattn/         ogier/
[openxs@fc23 go]$ ls src/
abbreviations/ compare/       docs/          mkreadme/      test/
.build/        concurrent/    .git/          rest/          unpack/
cmd/           cookbook/      .github/       sandbox/       vendor/
common/        defaults/      globals/       scripts/
Now let's try that scripts/ with linux as a parameter, as it's a way to build Linux binaries based on what I found:
[openxs@fc23 go]$ MKDOCS=1 src/ linux
+ env GOOS=linux GOARCH=386 go build --tags docs -o dbdeployer-1.17.0-docs.linux .
+ env GOOS=linux GOARCH=amd64 go build -o sort_versions.linux sort_versions.go
-rwxrwxr-x. 1 openxs openxs 8.1M Jan 14 10:27 dbdeployer-1.17.0-docs.linux
-rw-rw-r--. 1 openxs openxs 3.0M Jan 14 10:27 dbdeployer-1.17.0-docs.linux.tar.gz
[openxs@fc23 go]$ ls
bin  pkg  src
[openxs@fc23 go]$ ls bin
[openxs@fc23 go]$ bin/dbdeployer --version
dbdeployer version 1.17.0
Now we know how to build dbdeployer from source, if needed. If some dependencies are missing you'll be informed and similar go get ... command should allow to install it.

I was somewhat surprised to see MariaDB NOT mentioned at all in It says:
"DBdeployer is a tool that deploys MySQL database servers easily."
while good old MySQL-Sandbox also mentions MariaDB explicitly:
"This package is a sandbox for testing features under any version of MySQL from 3.23 to 8.0 (and any version of MariaDB.)"
So, my idea was to double check that dbdeployer is both MySQL-Sandbox compatible and MariaDB compatible (it is). I have several sandboxes already created in the past. I also have MariaDB 10.2.21 .tar.gz binaries that I want to use with dbdeployer for further testing:
[openxs@fc23 go]$ ls ~/sandboxes/
clear_all        rsandbox_mariadb-10_0_19  send_kill_all  test_replication
plugin.conf      rsandbox_mariadb-10_1_12  start_all      use_all
restart_all      rsandbox_mysql-8_0_12     status_all
rsandbox_8_0_12  sandbox_action            stop_all
[openxs@fc23 go]$ ls ~/*.tar.gz
With dbdeployer one has to unpack .tar.gz first with dbdeployer unpack command. So, I tried it immediately:
[openxs@fc23 go]$ bin/dbdeployer unpack ~/mariadb-10.2.21-linux-x86_64.tar.gz
directory '/home/openxs/opt/mysql' not found
You should create it or provide an alternate base directory using --sandbox-binary
It seems the tool now wants to use ~/opt/mysql as a directory to unpack to, while MySQL_Sandbox silently used ~:
[openxs@fc23 go]$ ls ~ | grep 8.0
I made a lame try to force it to use ~, but failed for the reason I was too lazy to study:
[openxs@fc23 go]$ bin/dbdeployer --sandbox-binary=/home/openxs unpack /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gzUnpacking tarball /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz to $HOME/10.2.21
.........100.........200....&tar.Header{Name:"mariadb-10.2.21-linux-x86_64/mysql-test/mysql-test-run", Mode:511, Uid:1021, Gid:1004, Size:0, ModTime:time.Time{wall:0x0, ext:63681810892, loc:(*time.Location)(0xa47aa0)}, Typeflag:0x32, Linkname:"./", Uname:"dbart", Gname:"my", Devmajor:0, Devminor:0, AccessTime:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}, ChangeTime:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}, Xattrs:map[string]string(nil)}
#ERROR: symlink ./ mariadb-10.2.21-linux-x86_64/mysql-test/mysql-test-run: file exists
I just created ~/opt/mysql and proceeded with default configuration. After unpack step completed successfully I've proceeded with deploy step to create new replication sandbox:
[openxs@fc23 go]$ bin/dbdeployer unpack /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz
Unpacking tarball /home/openxs/mariadb-10.2.21-linux-x86_64.tar.gz to $HOME/opt/mysql/10.2.21
.........100.........200.........300.........400.........500.........600.........700.........800... ...
Renaming directory /home/openxs/opt/mysql/mariadb-10.2.21-linux-x86_64 to /home/openxs/opt/mysql/10.2.21

[openxs@fc23 go]$ bin/dbdeployer deploy replication 10.2.21
Installing and starting master
. sandbox server started
Installing and starting slave1
. sandbox server started
Installing and starting slave2
. sandbox server started
initializing slave 1
initializing slave 2
Replication directory installed in $HOME/sandboxes/rsandbox_10_2_21
run 'dbdeployer usage multiple' for basic instructions'
We have access to nice enough documentation:
[openxs@fc23 go]$ bin/dbdeployer usage multiple
On a replication sandbox, you have the same commands (run "dbdeployer usage single"),
with an "_all" suffix, meaning that you propagate the command to all the members.
Then you have "./m" as a shortcut to use the master, "./s1" and "./s2" to access
the slaves (and "s3", "s4" ... if you define more).

In group sandboxes without a master slave relationship (group replication and
multiple sandboxes) the nodes can be accessed by ./n1, ./n2, ./n3, and so on.

start_all    [options] > starts all nodes
status_all             > get the status of all nodes
restart_all  [options] > restarts all nodes
stop_all               > stops all nodes
use_all         "SQL"  > runs a SQL statement in all nodes
use_all_masters "SQL"  > runs a SQL statement in all masters
use_all_slaves "SQL"   > runs a SQL statement in all slaves
clear_all              > stops all nodes and removes all data
m                      > invokes MySQL client in the master
s1, s2, n1, n2         > invokes MySQL client in slave 1, 2, node 1, 2

The scripts "check_slaves" or "check_nodes" give the status of replication in the sandbox.
Typical sandbox directory (with some differences like use_all_slaves etc) is created in ~/sandboxes/ and shortcut commands work as expected:
[openxs@fc23 go]$ cd ~/sandboxes/rsandbox_10_2_21/
[openxs@fc23 rsandbox_10_2_21]$ ls
check_slaves       n2           sbdescription.json  test_sb_all
clear_all          node1        send_kill_all       use_all
initialize_slaves  node2        start_all           use_all_masters
m                  restart_all  status_all          use_all_slaves
master             s1           stop_all
n1                 s2           test_replication

[openxs@fc23 rsandbox_10_2_21]$ ls master/
add_option    init_db         restart             show_binlog    status   use
clear         load_grants     sbdescription.json  show_log       stop
data          my              sb_include          show_relaylog  test_sb
grants.mysql  my.sandbox.cnf  send_kill           start          tmp

[openxs@fc23 rsandbox_10_2_21]$ ls ../rsandbox_mariadb-10_1_12/
check_slaves             m       node1        s2             test_replication
clear_all                master  node2        send_kill_all  use_all
connection.json          n1      README       start_all
default_connection.json  n2      restart_all  status_all
initialize_slaves        n3      s1           stop_all

[openxs@fc23 rsandbox_10_2_21]$ ls ../rsandbox_mariadb-10_1_12/master/
add_option       default_connection.json  my              send_kill      tmp
change_paths     grants_5_7_6.mysql       mycli           show_binlog    use
change_ports     grants.mysql             my.sandbox.cnf  show_relaylog  USING
clear            json_in_db               proxy_start     start
connection.json  load_grants              README          status
data             msb                      restart         stop

[openxs@fc23 rsandbox_10_2_21]$ ./status_all
REPLICATION  /home/openxs/sandboxes/rsandbox_10_2_21
master : master on  -  port     23322 (23322)
node1 : node1 on  -  port       23323 (23323)
node2 : node2 on  -  port       23324 (23324)
[openxs@fc23 rsandbox_10_2_21]$ ./use_all "show variables like 'gtid%'"
# master
Variable_name   Value
gtid_binlog_pos 0-100-12
gtid_binlog_state       0-100-12
gtid_current_pos        0-100-12
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_strict_mode        OFF
# server: 1
Variable_name   Value
gtid_current_pos        0-100-12
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-12
gtid_strict_mode        OFF
# server: 2
Variable_name   Value
gtid_current_pos        0-100-12
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-12
gtid_strict_mode        OFF
[openxs@fc23 rsandbox_10_2_21]$ ./m
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 14
Server version: 10.2.21-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

master [localhost:23322] {msandbox} ((none)) > show master status;
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
| mysql-bin.000001 |     2835 |              |                  |
1 row in set (0.00 sec)

master [localhost:23322] {msandbox} ((none)) > show variables like 'gtid%';
| Variable_name          | Value    |
| gtid_binlog_pos        | 0-100-12 |
| gtid_binlog_state      | 0-100-12 |
| gtid_current_pos       | 0-100-12 |
| gtid_domain_id         | 0        |
| gtid_ignore_duplicates | OFF      |
| gtid_seq_no            | 0        |
| gtid_slave_pos         |          |
| gtid_strict_mode       | OFF      |
8 rows in set (0.00 sec)

master [localhost:23322] {msandbox} ((none)) > exit
For my further tests I needed slaves to have log_slave_updates enabled and gtid_strict_mode=ON. So, I've added these settings to my.sandbox.cnf in node1 and node2 subdirectories for both slaves and restarted them:
[openxs@fc23 rsandbox_10_2_21]$ ./restart_all# executing 'stop' on /home/openxs/sandboxes/rsandbox_10_2_21
stop /home/openxs/sandboxes/rsandbox_10_2_21/node1
stop /home/openxs/sandboxes/rsandbox_10_2_21/node2
stop /home/openxs/sandboxes/rsandbox_10_2_21/master
# executing 'start' on /home/openxs/sandboxes/rsandbox_10_2_21
executing 'start' on master
. sandbox server started
executing 'start' on slave 1
. sandbox server started
executing 'start' on slave 2
. sandbox server started
I need a table to play with and I want to check that slaves are in sync:
[openxs@fc23 rsandbox_10_2_21]$ ./m
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 12
Server version: 10.2.21-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

master [localhost:23322] {msandbox} ((none)) > use test
Database changed
master [localhost:23322] {msandbox} (test) > create table t1(id int primary key, c1 int);
Query OK, 0 rows affected (0.17 sec)

master [localhost:23322] {msandbox} (test) > exit
[openxs@fc23 rsandbox_10_2_21]$ ./use_all "show variables like 'gtid%'"
# master
Variable_name   Value
gtid_binlog_pos 0-100-13
gtid_binlog_state       0-100-13
gtid_current_pos        0-100-13
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_strict_mode        OFF
# server: 1
Variable_name   Value
gtid_binlog_pos 0-100-13
gtid_binlog_state       0-100-13
gtid_current_pos        0-100-13
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-13
gtid_strict_mode        ON
# server: 2
Variable_name   Value
gtid_binlog_pos 0-100-13
gtid_binlog_state       0-100-13
gtid_current_pos        0-100-13
gtid_domain_id  0
gtid_ignore_duplicates  OFF
gtid_seq_no     0
gtid_slave_pos  0-100-13
gtid_strict_mode        ON
[openxs@fc23 rsandbox_10_2_21]$
Note the value of gtid_current_pos on master and gtid_slave_pos on each slave. They are the same and slaves are in sync. If you want to find out more about the format of GTIDs in MariaDB or all that gtid% server variables, please, check this KB article.

* * *

To summarize, dbdeployer is a nice port of MySQL-Sandbox into Go, with some additional features. It can be easily built from source if you have golang version 1.8 or newer (or just downloaded if you have not). Sandboxes created with dbdeployer may co-exists with older sandboxes in the same default directory (but .tar.gz files are unpacked into different directory by default). It still works well with MariaDB. I am going to use replication sandboxes built with it for some further testing of various real life use cases and problems of MariaDB's GTIDs implementation (that may be presented in further posts).

Sunday, January 13, 2019

Understanding Status of MariaDB Server JIRA Issues

In my previous blog post on MariaDB's JIRA for MySQL users who are familiar with MySQL bugs database (but may be new to JIRA) I've presented some details about statuses that JIRA issues may have. There is no one to one correspondence with MySQL bug's statuses that I once described in details here. In case of MariaDB Server bugs ("JIRA issues") one may have to check not only "Status" field, but also "Resolution" filed and even "Labels" field to quickly understand what is the real status and what MariaDB engineers decided or are waiting for. So, I think some additional clarifications may help MySQL users who check or report MariaDB bugs as well.

Let me present details of this statuses correspondence in a simple table, where the first column contains MySQL's bug status, while 3 other columns contain the content of corresponding MariaDB Server JIRA issue's fields, "Status", "Resolution" and "Labels". There is also "Comment" column with some explanation on what else is usually done in JIRA issue when it gets this set of values defining its status or what this may mean in MySQL bugs database etc. Most important MySQL bug statuses are taken from this my post (there are more of them, but others are rarely used, especially when real work on bugs was moved into internal bugs database by Oracle, or were removed since that post as it happened to "To be fixed later").

MySQL Bug StatusMariaDB JIRA StatusMariaDB JIRA ResolutionMariaDB JIRA LabelComment
OpenOPENUnresolvedTypical status for just reported bug
ClosedCLOSEDFixedYou should see list of versions that got the fix in the Fix Version/s field
DuplicateCLOSEDDuplicateSo, in MariaDB it's "closed as a duplicate"
AnalyzingOPENUnresolvedUsually bug is assigned when some engineer is working on it, including analysis stage
VerifiedCONFIRMEDUnresolvedCONFIRMED bugs are usually assigned in JIRA while in MySQL "Verified" bugs are usually unassigned
Won't fixCLOSEDWon't FixUsually remains assigned
Can't repeatCLOSEDCannot reproduceUnlike in MySQL, usually means that both engineer and bug reporter are not able to reproduce this
No FeedbackCLOSEDIncompleteneed_feedbackAs in MySQL, bug should stay with "need_feedback" label for some time before it's closed as incomplete
Need FeedbackOPENUnresolvedneed_feedbackUsually in the last comment in the bug you can find out what kind of feedback is required. No automatic setting to "No Feedback" in 30 days
Not a BugCLOSEDNot a Bug 
UnsupportedCLOSEDWon't FixThere is no special "Unsupported" status in MariaDB. Most likely when there is a reason NOT to fix it's stated in the comment.

In the table above you can click on some links to see the list of MariaDB bugs with the status discussed in the table row. This is how I am going to use this post from now on, as a quick search starting point :) It will also be mentioned on one of slides of my upcoming FOSDEM 2019 talk.

Sunday, January 6, 2019

Fun with Bugs #76 - On MySQL Bug Reports I am Subscribed to, Part XIII

Holidays season is almost over here, so it's time to get back to my main topic of MySQL bugs. Proper MySQL bug reporting will be a topic of my FOSDEM 2019 talk in less than 4 weeks (and few slides with recent examples of bugs are not yet ready), so I have to concentrate on bugs.

Last time in this series I reviewed some interesting bug reports filed in November, 2018. Time to move on and proceed with bugs reported in December, 2018, as I've subscribed to 27 or so of them. As usual, I'll review them briefly starting from the oldest and try to check if MariaDB 10.3 is also affected when the bug report is about common features:
  • Bug #93440 - "Noop UPDATE query is logged to binlog after read_only flag is set". Nice corner case found by Artem Danilov. super_read_only, even if set to ON successfully, may not prevent from committing and advancing GTID value.
  • Bug #93450 - "mysqldump does not wrap SET NAMES into mysql-extension comment". This is a regression bug in MySQL 8.0 that may break compatibility with 3rd party tools not aware of MySQL extensions. This bug was reported by Mattias Jonsson.
  • Bug #93451 - "The table comment is cut down on selecting with ORDER BY". Nice regression in MySQL 8. As one can easily check, MariaDB 10.3.x and older MySQL versions are not affected.
  • Bug #93491 - "Optimizer does not correctly consider attached conditions in planning". Clear and useful bug report from Morgan Tocker.
  • Bug #93544 - "SHOW BINLOG EVENTS FROM <bad offset> is not diagnosed". Yet another regression bug in MySQL 8 found by Laurynas Biveinis from Percona. MariaDB 10.3 does not accept bad offsets:
    MariaDB [test]> show binlog events from 14 limit 1;
    ERROR 1220 (HY000): Error when executing command SHOW BINLOG EVENTS: Wrong offset or I/O error
    MariaDB [test]> show binlog events limit 4;
    | Log_name         | Pos | Event_type        | Server_id | End_log_pos | Info
    | pc-PC-bin.000001 |   4 | Format_desc       |         1 |         256 | Server
    ver: 10.3.7-MariaDB-log, Binlog ver: 4 |
    | pc-PC-bin.000001 | 256 | Gtid_list         |         1 |         285 | []
    | pc-PC-bin.000001 | 285 | Binlog_checkpoint |         1 |         328 | pc-PC-b
    in.000001                              |
    | pc-PC-bin.000001 | 328 | Gtid              |         1 |         370 | GTID 0-
    1-1                                    |
    4 rows in set (0.002 sec)

    MariaDB [test]> show binlog events from 256 limit 1;
    | Log_name         | Pos | Event_type | Server_id | End_log_pos | Info |
    | pc-PC-bin.000001 | 256 | Gtid_list  |         1 |         285 | []   |
    1 row in set (0.002 sec)
  • Bug #93572 - "parallel workers+slave_preserve_commit_order+flushtables with read lock deadlock". I subscribed to it as it's just yet another example of improper handling of useful bug reports, as already discussed in my post "Problems with Oracle's Way of MySQL Bugs Database Maintenance". I think Ashe Sun's point is clear and suggestions like "don't do it" have nothing to do with proper bugs processing.
  • Bug #93587 - "Error when creating a table with long partition names". Nice regression bug in MySQL 8 comparing to 5.7 was found by Sergei Glushchenko from Percona.

    MariaDB 10.3.7 on Windows also fails with error message that is not clear:
    ERROR 1005 (HY000): Can't create table `mc5noglq9ofy7ym76z1t758ztptj6iplvsldhmse
    fyw4zvocfpp` (errno: 168 "Unknown (generic) error from engine")
    In the error log I see:
    2019-01-06 19:36:46 10 [ERROR] InnoDB: Operating system error number 3 in a file operation.
    2019-01-06 19:36:46 10 [ERROR] InnoDB: The error means the system cannot find the path specified.
    2019-01-06 19:36:46 10 [ERROR] InnoDB: File .\mc5noglq9ofy7ym76z1t758ztptj6iplvs
    kf#sp#ywkq987ztkdj33zbmlw526153x86vxl4x44r15spf8jqs92665mt0qi6bsnkazy5.ibd: 'cre
    ate' returned OS error 203.
    2019-01-06 19:36:46 10 [ERROR] InnoDB: Cannot create file '.\mc5noglq9ofy7ym76z1
  • Bug #93600 - "Setting out of range fractional part produces incorrect timestamps". After some arguing this bug reported by Evgeny Firsov was "Verified". In MariaDB 10.3 truncation happens:
    MariaDB [test]> SET SESSION TIMESTAMP=1.9999996;
    Query OK, 0 rows affected (0.039 sec)

    | CURRENT_TIMESTAMP(6)       |
    | 1970-01-01 02:00:01.999999 |
    1 row in set (0.010 sec)

    MariaDB [test]> CREATE TABLE t1( ts TIMESTAMP(6), dt DATETIME(6) );
    Query OK, 0 rows affected (0.387 sec)

    Query OK, 1 row affected (0.079 sec)

    MariaDB [test]> select * from t1;
    | ts                         | dt                         |
    | 1970-01-01 02:00:01.999999 | 1970-01-01 02:00:01.999999 |
    1 row in set (0.016 sec)
  • Bug #93603 - "Memory access error with alter table character change." This bug was reported by Ramesh Sivaraman from Percona QA. I've subscribed mostly to find out how bug reports with new severity level (S6) are going to be processed and fixed. See also his Bug #93701 - "Assertion `maybe_null' failed |Item_func_concat::val_str(String*)".

    I've subscribed to S7 Bug #93617 - "Conditional jump or depends on uninitialized value(s) in Field_num::Field_num" from Laurynas Biveinis for similar reason.
  • Bug #93649 - "STOP SLAVE SQL_THREAD deadlocks if done while holding LOCK INSTANCE FOR BACKUP". New MySQL 8 feature, LOCK INSTANCE FOR BACKUP statement, is an attempt to introduce backup locks to MySQL. Sergei Glushchenko found that it may cause deadlocks. I am surprised that the bug is still "Open", since December 18, 2018.
  • Bug #93683 - "Got error 155 when reading table './test/t1'". I am not sure if this message in the error log noted by Roel Van de Paar from Percona is a bug or a problem. MariaDB 10.3.7 produces the same error message. Let's find out, so far this report is "Verified".
  • Bug #93684 - "mysql innodb dump restore slows down after upgrade mysql 5.7 to 8.0". Florian Kopp reported this potential notable performance regression of MySQL 8.0.13 vs 5.7, but it is still not verified. I am not sure how this report may end up, but it's not the first report about performance regressions in MySQL 8 :)

One can hardly find any bugs in this winter forest. But they are hiding and will affect everyone there one day, in spring...
To summarize:
  1. Some regression bugs are still not marked with "regression" tag.
  2. Some MySQL bug reports are still handled wrongly, with a trend of wasting bug reporter time on irrelevant clarifications and claims the problem is not clear, when there is a good enough explanation of the test case.
  3. Percona engineers still contribute a lot of MySQL 8 QA, by reporting numerous bugs. No wonder with their first Percona Server for MySQL 8 GA release happened in December...
I still have a dozen or so December 2018 bug reports (mostly "Open" at the moment) to review one day, so stay tuned!

Sunday, December 30, 2018


These days several kinds and forks of MySQL are widely used, and while I promised not to write about MySQL bugs till the end of 2018, I think it makes sense to try to explain basic details about bug reporting for at least one of vendors that use JIRA instances as a public bug tracking systems. I work for MariaDB Corporation and it would be natural for me to write about MariaDB's JIRA that I use every day.

As a side note, Percona also switched to JIRA some time ago, and many of the JIRA-specific details described below (that are different comparing to good old apply to Percona bugs tracking system as well.

Why would MariaDB bugs be interesting to an average MySQL community member who does not use MariaDB at all most of the time? One of the reasons is that some MySQL bugs are also reported (as "upstream") to MariaDB and they may be fixed there well before they are fixed in MySQL. Consider MDEV-15953 - "Alter InnoDB Partitioned Table Moves Files (which were originally not in the datadir) to the datadir" (reported by Chris Calender) as an example. It was fixed in MariaDB 5 months ago, while corresponding Bug #78164 is still "Verified" and got no visible attention for more that 3 years. The fix is 12 rows added in two files (test case aside), so theoretically can be easily used to modify upstream MySQL by an interested and determined MySQL user who already compiles MySQL from the source code (for whatever reason), if the problem is important in their environment.

Another reason is related to the fact that work on new features of MariaDB server and connectors is performed in an open manner at all stages. You can see current plans, discussions and decision making on new features happening in real time, in JIRA. Existing problems (that often affect both MySQL and MariaDB) are presented and analyzed, and reading related comments may be useful to understand current limitations of MySQL and decide if at some stage switching to MariaDB or using some related patches may help in your production environment. There is no need to wait for some lab preview release. You can also add comments on design decisions and question them before it's too late. Great example of such a useful to read (for anyone interested in InnoDB) feature request and work in progress is MDEV-11424 - "Instant ALTER TABLE of failure-free record format changes".

Yet another reason to use MariaDB JIRA and follow some bug reports and feature requests there is to find some insights on how MySQL, its components (like optimizer) and storage engines (like InnoDB) really work. Consider my Bug #82127 - "Deadlock with 3 concurrent DELETEs by UNIQUE key". This bug was originally reported to Percona as lp:1598822 (as it was first noticed with Percona's XtraDB emgine) and ended up in their JIRA as PS-3479 (still "New"). In MySQL bugs database it got "Verified" after some discussions. Eventually I gave up waiting for "upstream" to make any progress on it and reported it as MDEV-10962. In that MariaDB bug report you can find explanations of the behavior noticed, multiple comments and ideas on the root case and on how to improve locking behavior in this case, links to other related bugs etc. It's a useful reading. Moreover, we see that there are plans to improve/fix this in MariaDB 10.4.

I also like to check some problematic and interesting test cases, no matter in what bugs database it was reported, on both MariaDB Server, Percona Server and MySQL Server, as long as it's about some common features. But may be it's so because I work with all these as a support engineer.

Anyway, one day following MariaDB Server bugs may help some MySQL DBA to do the job better. So, I suggest all MySQL users to check MariaDB's JIRA from time to time. Some basic details about the differences comparing to MySQL's bugs database are presented below.

First thing to notice in case of MariaDB's JIRA is a domain name. It's, so bug tracking system formally "belongs" to MariaDB Foundation - non-profit entity that supports continuity and open collaboration in the MariaDB ecosystem. Both MariaDB Foundation employees, MariaDB Corporation employees, developers working for partners (like Codership) and community members (like Olivier Bertrand, author of CONNECT storage engine I had written about here) work on source code (and bugs processing and fixing) together, at GitHub. Different users have different roles and privileges in JIRA, surely. But there is no other, "internal" bugs database in MariaDB Corporation. All work or bugs and features, time reporting, code review process, as well as release planning happen (or at least is visible) in an open manner, in JIRA.

Even if you do not have JIRA account, you still can see Jira Road Map, release plans and statuses. You can see all public comments and history of changes for each bug. If you create and log in into your account (this is needed to report new bugs, vote for them or watch them and get email notifications about any changes, obviously) you'll see also more details on bugs, like links to GitHub commits and pull requests related to the bug.

Unlike MySQL bugs database where bugs are split into "Categories" (where both "MySQL Server: Information schema" and "MySQL Workbench" are categories more or less of the same level) but are numbered sequentially over all categories, JIRA instances usually support "Projects", with separate "name" and sequential numbering of bugs per project.

At the moment there are 17 or so projects in MariaDB JIRA, of them the following public ones are most interesting for MySQL community users, I think:
Let's consider one MariaDB Server bug for example:

Unlike in MySQL bugs database, JIRA issues have "Type". For our case it's important that feature requests usually end up as "Task" vs "Bug" as a type for a bug. Some projects in MariaDB JIRA may also support a separate "New Feature" type to differentiate features from tasks not related to creating new code. In MySQL separate severity (S4, "Feature request") is used.

MariaDB JIRA issues have priorities from the following list:
  • Trivial
  • Minor
  • Major
  • Critical
  • Blocker
By default MariaDB bugs are filed with intermediate, "Major" priority. Priority may be changed by the bug reporter or by JIRA users (mostly developers) who work on the bug, it often changes with time (priority may increase if more users are affected, or if the fix does not happen for long enough time etc, or decrease when the problem can be workarounded somehow for affected users). Usually a bug with "Blocker" priority means there should be no next minor release for any major version listed in "Fix Version/s" without the fix.

There are many fields in MySQL bugs database to define priority of the fix (including "Priority" itself), but only "Severity" is visible to public. Usually "Severity" of the MySQL bug does NOT change with time (if only before it's "Verified").

It is normal to list all/many versions affected by the bug in JIRA in "Affected Version/s". If the bug is fixed, in "Fix Version/s" you can find the exact list of all minor MariaDB Server versions that got the fix.

Each JIRA issue has a "Status" and "Resolution". In MySQL bugs database there is just "Status" for both. Valid statuses are:
  • OPEN - this is usually a bug that is just reported or is not yet in the process of fixing.
  • CONFIRMED - this status means that some developer checked bug report and confirmed it's really a bug and it's clear how to reproduce it based on the information already present in the report. More or less this status matches "Verified" MySQL bug. But unlike in MySQL, even "Open" bug may be assigned to a developer to further work on it.
  • CLOSED - the bug is resolved somehow. See the content of the "Resolution" filed for details on how it was resolved.
  • STALLED - this is a real bug and some work on it was performed, but nobody actively works on it now.
  • IN PROGRESS - assignee is currently working on the fix for the bug.
  • IN REVIEW - assignee is currently reviewing the fix for the bug.
The following values are possible for "Resolution" field:
  • Unresolved - every bug that is not "CLOSED" is "Unresolved".
  • Fixed - every bug that was fixed with some change to the source code. If you log in to JIRA you should be able to find links to GitHub commit(s) with the fix in the "Fixed" JIRA issue.
  • Won't Fix - the problem is real, but it was decided not to fix it (as it's expected or may be too hard to fix). Consider my MDEV-15213 - "UPDATEs are slow after instant ADD COLUMN" as one of examples.
  • Duplicate - there is another bug report about the same problem. You can find link to it in the JIRA issue.
  • Incomplete - there is no way to reproduce or understand the problem based on the information provided. See MDEV-17808 for example.
  • Cannot Reproduce - bug reporter himself can not reproduce the problem any more, even after following the same steps that caused the problem before. See MDEV-17667 for example.
  • Not a Bug - the problem described is not a result of any bug. Everything works as designed and probably some misunderstanding caused bug reporter to think it was a bug. See MDEV-17790 as a typical example.
  • Done - this is used for completed tasks (like MDEV-17886) or bugs related to some 3rd party stored engine where the fix is done, but it's up to MariaDB to merge/use fixed version of the engine (like MDEV-17212).
  • Won't Do - it was decided NOT to do the task. See MDEV-16418 as one of examples.
In MySQL there are separate bug statuses for (most of) these. There are some tiny differences for the way some statuses like "Cannot reproduce" are applied by those who process bugs in MySQL vs MariaDB though.

Explanations above should be enough for any MySQL bugs database user to start using MariaDB's JIRA efficiently, I think. But I am open to any followup questions and I am considering separate blog posts explaining the life cycle of a MariaDB Server bug and some tips on efficient search in MariaDB JIRA.

Saturday, December 22, 2018

How to Get Details About MyRocks Deadlocks in MariaDB and Percona Server

In my previous post on ERROR 1213 I noted that Percona Server does not support the SHOW ENGINE ROCKSDB TRANSACTION STATUS statement to get deadlock details in "text" form. I've got some clarifications in my related feature request, PS-5114. So I decided to write this followup post and show what is the way to get deadlock details for the ROCKSDB tables in current versions of MariaDB and Percona Server.

First of all, I'd like to check MariaDB's implementation of MyRocks. For this I'll re-create deadlock scenario from that my post with MariaDB 10.3.12 I have at hand. We should start with installing ROCKSDB plugin according to this KB article:
openxs@ao756:~/dbs/maria10.3$ bin/mysql --no-defaults --socket=/tmp/mariadb.sock -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.3.12-MariaDB Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [test]> install soname 'ha_rocksdb';
Query OK, 0 rows affected (36,338 sec)

MariaDB [test]> show engines;
| Engine             | Support | Comment                                                                          | Transactions | XA   | Savepoints |
| ROCKSDB            | YES     | RocksDB storage engine                                                           | YES          | YES  | YES        |
| CONNECT            | YES     | Management of External Data (SQL/NOSQL/MED), including many file formats         | NO           | NO   | NO         |
| Aria               | YES     | Crash-safe tables with MyISAM heritage                                           | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                               | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                   | YES          | NO   | YES        |
10 rows in set (8,753 sec)

MariaDB [test]> show plugins;
| Name                          | Status   | Type               | Library       | License |
| binlog                        | ACTIVE   | STORAGE ENGINE     | NULL          | GPL     |
| mysql_native_password         | ACTIVE   | AUTHENTICATION     | NULL          | GPL     |
| mysql_old_password            | ACTIVE   | AUTHENTICATION     | NULL          | GPL     |
| wsrep                         | ACTIVE   | STORAGE ENGINE     | NULL          | GPL     |
| CONNECT                       | ACTIVE   | STORAGE ENGINE     | | GPL     |
| ROCKSDB                       | ACTIVE   | STORAGE ENGINE     | | GPL     |
| ROCKSDB_LOCKS                 | ACTIVE   | INFORMATION SCHEMA | | GPL     |
| ROCKSDB_TRX                   | ACTIVE   | INFORMATION SCHEMA | | GPL     |
68 rows in set (2,451 sec)
Note that in MariaDB just one simple INSTALL SONAME ... statement is enough to get ROCKSDB with all related plugins loaded. Do not mind time to execute statements - I am running them on a netbook that is busy compiling Percona Server 8.0.13 from GitHub concurrently, to post something about it later :)

Now, let me re-create the same deadlock scenario:

MariaDB [test]> create table t1(id int, c1 int, primary key(id)) engine=rocksdb;
Query OK, 0 rows affected (4,163 sec)

MariaDB [test]> insert into t1 values (1,1), (2,2);
Query OK, 2 rows affected (0,641 sec)
Records: 2  Duplicates: 0  Warnings: 0

MariaDB [test]> set global rocksdb_lock_wait_timeout=50;
Query OK, 0 rows affected (0,644 sec)

MariaDB [test]> set global rocksdb_deadlock_detect=ON;
Query OK, 0 rows affected (0,037 sec)

MariaDB [test]> show global variables like 'rocksdb%deadlock%';
| Variable_name                 | Value |
| rocksdb_deadlock_detect       | ON    |
| rocksdb_deadlock_detect_depth | 50    |
| rocksdb_max_latest_deadlocks  | 5     |
3 rows in set (0,022 sec)
We need two sessions. In the first one:
MariaDB [test]> select connection_id();
| connection_id() |
|              11 |
1 row in set (0,117 sec)

MariaDB [test]> start transaction;
Query OK, 0 rows affected (0,000 sec)

MariaDB [test]> select * from t1 where id = 1 for update;
| id | c1   |
|  1 |    1 |
1 row in set (0,081 sec)

In the second:
MariaDB [test]> select connection_id();
| connection_id() |
|              12 |
1 row in set (0,000 sec)

MariaDB [test]> start transaction;
Query OK, 0 rows affected (0,000 sec)

MariaDB [test]> select * from t1 where id = 2 for update;
| id | c1   |
|  2 |    2 |
1 row in set (0,001 sec)
Back in the first:

MariaDB [test]> select * from t1 where id=2 for update;
It hangs waiting for incompatible lock. In the second:

MariaDB [test]> select * from t1 where id=1 for update;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Now, can we get any details about the deadlock that just happened using upstream SHOW ENGINE statement? Let's try:
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'TRANSACTION STATUS' at line 1
Does not work, same as in Percona Server 5.7.x. Here is a related MariaDB task: MDEV-13859 - "Add SHOW ENGINE ROCKSDB TRANSACTION STATUS or its equivalent?". Still "Open" and without any target version set, not even 10.4.

The idea behind NOT supporting this statement by Percona, according to comments I've got in  PS-5114, is to rely on tables in the information_schema. That's what we have in the related tables, rocksdb_locks, rocksdb_trx and rocksdb_deadlock after deadlock above was detected:
MariaDB [test]> select * from information_schema.rocksdb_deadlock;
|           0 | 1545481878 |              6 | default | 0000010080000002 | EXCLUSIVE | PRIMARY    | test.t1    |           0 |
|           0 | 1545481878 |              7 | default | 0000010080000001 | EXCLUSIVE | PRIMARY    | test.t1    |           1 |
2 rows in set (0,078 sec)

MariaDB [test]> select * from information_schema.rocksdb_trx;
|              6 | STARTED |      |           0 |          2 |          50 |             |                        0 |              0 |            0 |         0 |                      1 |                    0 |        11 |       |
1 row in set (0,001 sec)

MariaDB [test]> select * from information_schema.rocksdb_locks;
|                0 |              6 | 0000010080000002 | X    |
|                0 |              6 | 0000010080000001 | X    |
2 rows in set (0,025 sec)
I was not able to find any really good documentation about these tables (I checked here, there and more), especially rocksdb_deadlock that is totally undocumented, so let me try to speculate and explain my ideas on how they are supposed to work and be used together. Information about up to rocksdb_max_latest_deadlocks is stored in the rocksdb_deadlock table, each deadlock is identified by deadlock_id and in case of MariaDB you can find out when it happened using the timestamp column that is a UNIX timestamp:
MariaDB [test]> select distinct deadlock_id, from_unixtime(timestamp) from information_schema.rocksdb_deadlock;+-------------+--------------------------+
| deadlock_id | from_unixtime(timestamp) |
|           0 | 2018-12-22 14:31:18      |
1 row in set (0,137 sec)

For each deadlock you have a row per lock wait for each transaction involved, identified by transaction_id. You can see for what key value (waited_key) in that index (index_name) of what table (table_name) the transaction was waited. Victim (transaction that was rolled back to prevent deadlock) of deadlock detection is identified by the value 1 in the rolled_back column. This is all the information that persists for any notable time, and I don't like it that much, as we can not see what lock(s) transactions had at the moment. We can guess conflicting lock based on what was the waiting_key, but I'd prefer InnoDB way of showing this clearly with all the details.

If you hurry up and query the rocksdb_trx table fast enough, you can get more details about those transaction(s) involved in deadlock that are NOT rolled back (and not yet committed). Join by the transaction_id column, obviously, to get the details up to current running query and processlist connection id (rocksdb_trx.thread_id column) involved. 

If you hurry up to query rocksdb_locks table also by the transaction_id of still active transaction, you can get a list of locks it holds and then guess which one was a blocking lock. If you are not fast enough and transaction is gone you have just to assume there was some blocking lock. One day gap locks may be added, and some lock would become not good enough guess.

I miss these rocks at Beaulieu-sur-Mer. MyRocks in all implementations but Facebook's, misses one useful way to get the details about deadlock(s) that happened in the past.

To summarize, while storing information about configurable number of last deadlocks in the table seems to be a good idea, in case of ROCKSDB in both Percona and MariaDB servers (as soon as all transactions involved in deadlock are completed one way or the other) we miss some details (like thread ids for sessions involved, exact locks that each transaction held etc) comparing to the text output provided by the original upstream statement (and SHOW ENGINE INNODB STATUS\G, surely). Even if we are lucky to query all tables in time, we still probably miss lock waits table (like innodb_lock_waits) and any built in way to store information in the error log about deadlocks that happened (and all locks involved). 

Note also lack of consistency in naming (rocksdb_locks, plural vs  rocksdb_deadlock, singular, in case of MariaDB), rocksdb_deadlock.lock_type with value EXCLUSIVE vs rocksdb_locks.mode with value X etc, and and very limited documentation available. In my opinion current state is unacceptable if we hope to see wide use of MyRocks by community users and DBAs outside of Facebook.