Showing posts with label rocksdb. Show all posts
Showing posts with label rocksdb. Show all posts

Thursday, January 27, 2022

First steps with MariaDB Server and DTrace on macOS

FOSDEM 2022 is going to happen on the next weekend and I am still missing blog posts supporting my unusual upcoming talk there, this time devoted to building and using MariaDB server on macOS. So tonight I am going to contribute something new to refer to during my talk or followup questions. I'll try to document my way to build MariaDB Server from GitHub source on old (early 2015) MacBook Air running macOS 10.13.6 High Sierra. I am also going to show why one may want to use macOS there instead of installing Linux and having way more usual environment on the same decent hardware.

I've inherited this Air from my daughter in September, 2021 after she had got a new, M1-based, and initially used it mostly for Zoom calls and web-browsing. Soon I recalled that I've used MacBook for more than 3 years in the past while working for Sun and Oracle, and it was my main working platform not only for content consumption, emails and virtual machines. It was regularly used for bugs verification, MySQL builds and tests of all kinds. It would be a waste of capable OS and good hardware (still more powerful formally than all my other machines but Fedora 33 desktop) NOT to try to use it properly.

That's why after few software upgrades to end up with more recent 10.13 minor release:

Yuliyas-Air:maria10.6 Valerii$ uname -a
Darwin Yuliyas-Air 17.7.0 Darwin Kernel Version 17.7.0: Mon Aug 31 22:11:23 PDT 2020; root:xnu-4570.71.82.6~1/RELEASE_X86_64 x86_64

I read the manual, registered as a developer, downloaded and installed proper version of XCode and decided to proceed with MacPorts.

Then I updated ports tree and proceed by the manual, with installing git and other surely needed tools and packages:

sudo port install git cmake jemalloc judy openssl boost gnutls

That was just the beginning, and eventually, with dependencies, port updates, problems and workarounds, I ended up like this (and that works for building current code of 10.1 to 10.8 for sure, with some ports maybe not needed or used for other builds, but who cares):

Yuliyas-Air:maria10.8 Valerii$ port installed
The following ports are currently installed:
  autoconf @2.71_1 (active)
  automake @1.16.5_0 (active)
  bison @3.8.2_0
  bison @3.8.2_2 (active)
  bison-runtime @3.8.2_0 (active)
  boehmgc @8.0.6_0 (active)
  boost @1.76_0 (active)
  boost171 @1.71.0_3+no_single+no_static+python39 (active)
  boost176 @1.76.0_2+no_single+no_static+python39
  boost176 @1.76.0_3+no_single+no_static+python39 (active)
  bzip2 @1.0.8_0 (active)
  cmake @3.21.4_0
  cmake @3.22.1_0 (active)
  curl @7.80.0_0+ssl (active)
  curl-ca-bundle @7.80.0_0
  curl-ca-bundle @7.80.0_1 (active)
  cyrus-sasl2 @2.1.27_5+kerberos (active)
  db48 @4.8.30_4 (active)
  expat @2.4.1_0
  expat @2.4.2_0
  expat @2.4.3_0 (active)
  gdb @11.1_0 (active)
  gdbm @1.22_0 (active)
  gettext @0.21_0 (active)
  gettext-runtime @0.21_0 (active)
  gettext-tools-libs @0.21_0 (active)
  git @2.34.1_1+credential_osxkeychain+diff_highlight+doc+pcre+perl5_28
  git @2.34.1_2+credential_osxkeychain+diff_highlight+doc+pcre+perl5_28 (active)
  gmp @6.2.1_0 (active)
  gnutls @3.6.16_1
  gnutls @3.6.16_2 (active)
  icu @67.1_4 (active)
  jemalloc @5.2.1_1 (active)
  judy @1.0.5_1 (active)
  kerberos5 @1.19.2_1 (active)
  libarchive @3.5.2_1 (active)
  libb2 @0.98.1_1 (active)
  libcomerr @1.45.6_0 (active)
  libcxx @5.0.1_4 (active)
  libedit @20210910-3.1_1 (active)
  libevent @2.1.12_1 (active)
  libffi @3.4.2_2 (active)
  libiconv @1.16_1 (active)
  libidn @1.38_0 (active)
  libidn2 @2.3.2_0
  libidn2 @2.3.2_1 (active)
  libpsl @0.21.1-20210726_2 (active)
  libtasn1 @4.18.0_0 (active)
  libtextstyle @0.21_0 (active)
  libtool @2.4.6_13 (active)
  libunistring @0.9.10_0
  libunistring @1.0_0 (active)
  libuv @1.42.0_1
  libuv @1.43.0_0 (active)
  libxml2 @2.9.12_1 (active)
  libxslt @1.1.34_6 (active)
  lmdb @0.9.29_0 (active)
  luajit @2.1.0-beta3_5 (active)
  lz4 @1.9.3_1 (active)
  lzma @4.65_1 (active)
  lzo2 @2.10_0 (active)
  m4 @1.4.19_1 (active)
  mariadb @5.5.68_0 (active)
  mariadb-server @5.5.68_0 (active)
  mysql57 @5.7.36_1 (active)
  mysql_select @0.1.2_4 (active)
  ncurses @6.3_0 (active)
  nettle @3.7.3_0 (active)
  openssl @3_1
  openssl @3_2 (active)
  openssl3 @3.0.0_6+legacy
  openssl3 @3.0.1_0+legacy (active)
  openssl11 @1.1.1l_5 (active)
  p5-dbd-mysql @4.50.0_0 (active)
  p5.28-authen-sasl @2.160.0_0 (active)
  p5.28-cgi @4.530.0_0 (active)
  p5.28-clone @0.450.0_0 (active)
  p5.28-dbd-mysql @4.50.0_0+mysql57 (active)
  p5.28-dbi @1.643.0_0 (active)
  p5.28-digest-hmac @1.40.0_0 (active)
  p5.28-digest-sha1 @2.130.0_4 (active)
  p5.28-encode @3.160.0_0 (active)
  p5.28-encode-locale @1.50.0_0 (active)
  p5.28-error @0.170.290_0 (active)
  p5.28-gssapi @0.280.0_3 (active)
  p5.28-html-parser @3.760.0_0 (active)
  p5.28-html-tagset @3.200.0_4 (active)
  p5.28-http-date @6.50.0_0 (active)
  p5.28-http-message @6.350.0_0 (active)
  p5.28-io-html @1.4.0_0 (active)
  p5.28-io-socket-ssl @2.72.0_0
  p5.28-io-socket-ssl @2.73.0_0 (active)
  p5.28-lwp-mediatypes @6.40.0_0 (active)
  p5.28-mozilla-ca @20211001_0 (active)
  p5.28-net-libidn @0.120.0_5 (active)
  p5.28-net-smtp-ssl @1.40.0_0 (active)
  p5.28-net-ssleay @1.900.0_4 (active)
  p5.28-term-readkey @2.380.0_0 (active)
  p5.28-time-local @1.300.0_0 (active)
  p5.28-timedate @2.330.0_0 (active)
  p5.28-uri @5.100.0_0 (active)
  p11-kit @0.24.0_1 (active)
  pcre2 @10.39_0 (active)
  perl5.28 @5.28.3_4 (active)
  perl5.30 @5.30.3_3 (active)
  pkgconfig @0.29.2_0 (active)
  popt @1.18_1 (active)
  python3_select @0.0_2 (active)
  python39 @3.9.9_0+lto+optimizations
  python39 @3.9.10_0+lto+optimizations (active)
  python_select @0.3_9 (active)
  readline @8.1.000_0 (active)
  readline-5 @5.2.014_2 (active)
  rsync @3.2.3_1 (active)
  sqlite3 @3.37.0_0
  sqlite3 @3.37.1_0
  sqlite3 @3.37.2_0 (active)
  sysbench @1.0.20_0 (active)
  tcp_wrappers @20_4 (active)
  texinfo @6.8_0 (active)
  umem @1.0.1_1 (active)
  xxhashlib @0.8.1_0
  xxhashlib @0.8.1_1 (active)
  xz @5.2.5_0 (active)
  zlib @1.2.11_0 (active)
  zstd @1.5.0_0
  zstd @1.5.1_0 (active)
Yuliyas-Air:maria10.8 Valerii$

This is surely not the minimal needed set of ports. I've highlighted a couple (like openssl11) that were really needed to build and install 10.8 successfully eventually.

Then I cloned the code with usual steps and ended up with this:

Yuliyas-Air:server Valerii$ git log -1
commit c1cef1afa9962544de4840c9a796ae0a9b5e92e6 (HEAD -> 10.8, origin/bb-10.8-wlad, origin/bb-10.8-release, origin/HEAD, origin/10.8)
Merge: db2013787d2 9d93b51effd
Author: Vladislav Vaintroub <wlad@mariadb.com>
Date:   Wed Jan 26 13:57:00 2022 +0100

    Merge remote-tracking branch 'origin/bb-10.8-wlad' into 10.8
Yuliyas-Air:server Valerii$ git submodule update --init --recursive
Submodule path 'libmariadb': checked out 'ddb031b6a1d8b6e26a0f10f454dc1453a48a6ca8'
Yuliyas-Air:server Valerii$ cd buildtmp/
Yuliyas-Air:buildtmp Valerii$ rm -rf *
Yuliyas-Air:buildtmp Valerii$ cmake .. -DCMAKE_INSTALL_PREFIX=/Users/Valerii/dbs/maria10.8 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_CONFIG=mysql_release -DFEATURE_SET=community -DWITH_EMBEDDED_SERVER=OFF -DPLUGIN_TOKUDB=NO -DWITH_SSL=/opt/local/libexec/openssl11 -DENABLE_DTRACE=1
...
-- The following OPTIONAL packages have not been found:

 * PMEM
 * Snappy

-- Configuring done
CMake Warning (dev):
  Policy CMP0042 is not set: MACOSX_RPATH is enabled by default.  Run "cmake
  --help-policy CMP0042" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  MACOSX_RPATH is not specified for the following targets:

   libmariadb

This warning is for project developers.  Use -Wno-dev to suppress it.

-- Generating done
-- Build files have been written to: /Users/Valerii/git/server/buildtmp
Yuliyas-Air:buildtmp Valerii$

Note that I've used somewhat nontrivial cmake command line for out of source build, and the output above was not from the clean state, but from the state after 10.1 to 10.7 where all checked out and built successfully, one by one, with problems found and resolved (more on that on my slides and during the talk).

Two key options in that command line are -DWITH_SSL=/opt/local/libexec/openssl11 to use supported OpenSSL version 1.1 (even though for 10.8 default version 3 should already work too) and -DENABLE_DTRACE=1 to enable static DTrace probes in the code of MariaDB that are still there. The rest are typical for my builds and blog posts.

Then I proceeded with usual make to end up with the first problem:

Yuliyas-Air:buildtmp Valerii$ make
...
[ 46%] Built target heap
[ 46%] Building CXX object storage/innobase/CMakeFiles/innobase.dir/btr/btr0btr.cc.o
In file included from /Users/Valerii/git/server/storage/innobase/btr/btr0btr.cc:28:
In file included from /Users/Valerii/git/server/storage/innobase/include/btr0btr.h:31:
In file included from /Users/Valerii/git/server/storage/innobase/include/dict0dict.h:32:
In file included from /Users/Valerii/git/server/storage/innobase/include/dict0mem.h:45:
In file included from /Users/Valerii/git/server/storage/innobase/include/buf0buf.h:33:
/Users/Valerii/git/server/storage/innobase/include/fil0fil.h:1497:11: error:
      'asm goto' constructs are not supported yet
  __asm__ goto("lock btsl $31, %0\t\njnc %l1" : : "m" (n_pending)
          ^
1 error generated.
make[2]: *** [storage/innobase/CMakeFiles/innobase.dir/btr/btr0btr.cc.o] Error 1
make[1]: *** [storage/innobase/CMakeFiles/innobase.dir/all] Error 2
make: *** [all] Error 2

caused by the fact that clang 10.0.0 from XCode: 

Yuliyas-Air:maria10.8 Valerii$ clang --version
Apple LLVM version 10.0.0 (clang-1000.10.44.4)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

does NOT support asm goto used in MariaDB Server code since 10.6 (while it was supposed to support it). I've applied a lame fix (see MDEV-27402 for more details and final diff later) and proceeded:

Yuliyas-Air:buildtmp Valerii$ make
...

[ 97%] Building CXX object client/CMakeFiles/mariadb.dir/mysql.cc.o
/Users/Valerii/git/server/client/mysql.cc:2853:59: error: expected expression
  rl_attempted_completion_function= (rl_completion_func_t*)&new_mysql_co...
                                                          ^
/Users/Valerii/git/server/client/mysql.cc:2853:38: error: use of undeclared
      identifier 'rl_completion_func_t'; did you mean 'rl_completion_matches'?
  rl_attempted_completion_function= (rl_completion_func_t*)&new_mysql_co...
                                     ^~~~~~~~~~~~~~~~~~~~
                                     rl_completion_matches
/usr/include/editline/readline.h:202:16: note: 'rl_completion_matches' declared
      here
char           **rl_completion_matches(const char *, rl_compentry_func_t *);
                 ^
/Users/Valerii/git/server/client/mysql.cc:2854:33: error: assigning to
      'Function *' (aka 'int (*)(const char *, int)') from incompatible type
      'rl_compentry_func_t *' (aka 'char *(*)(const char *, int)'): different
      return type ('int' vs 'char *')
  rl_completion_entry_function= (rl_compentry_func_t*)&no_completion;
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/Valerii/git/server/client/mysql.cc:2856:3: error: no matching function
      for call to 'rl_add_defun'
  rl_add_defun("magic-space", (rl_command_func_t *)&fake_magic_space, -1);
  ^~~~~~~~~~~~
/usr/include/editline/readline.h:195:7: note: candidate function not viable: no
      known conversion from 'rl_command_func_t *' (aka 'int (*)(int, int)') to
      'Function *' (aka 'int (*)(const char *, int)') for 2nd argument
int              rl_add_defun(const char *, Function *, int);
                 ^
4 errors generated.
make[2]: *** [client/CMakeFiles/mariadb.dir/mysql.cc.o] Error 1
make[1]: *** [client/CMakeFiles/mariadb.dir/all] Error 2
make: *** [all] Error 2
Yuliyas-Air:buildtmp Valerii$

This readline-related problem was also reported (see MDEV-27579) and I fixed it with a lame patch, and eventually I was able to build successfully:

...

[100%] Linking C executable wsrep_check_version
[100%] Built target wsrep_check_version
Yuliyas-Air:buildtmp Valerii$ make install && make clean
...
-- Installing: /Users/Valerii/dbs/maria10.8/share/aclocal/mysql.m4
-- Installing: /Users/Valerii/dbs/maria10.8/support-files/mysql.server
Yuliyas-Air:buildtmp Valerii$ echo $?
0

The final diff is like this:

Yuliyas-Air:server Valerii$ git diff -u
diff --git a/client/mysql.cc b/client/mysql.cc
index 6612b273d17..902589f2e83 100644
--- a/client/mysql.cc
+++ b/client/mysql.cc
@@ -2849,7 +2849,7 @@ static void initialize_readline ()
   rl_terminal_name= getenv("TERM");

   /* Tell the completer that we want a crack first. */
-#if defined(USE_NEW_READLINE_INTERFACE)
+#if defined(USE_NEW_READLINE_INTERFACE) && !defined(__APPLE_CC__)
   rl_attempted_completion_function= (rl_completion_func_t*)&new_mysql_completion;
   rl_completion_entry_function= (rl_compentry_func_t*)&no_completion;

@@ -2859,7 +2859,9 @@ static void initialize_readline ()
   setlocale(LC_ALL,""); /* so as libedit use isprint */
 #endif
   rl_attempted_completion_function= (CPPFunction*)&new_mysql_completion;
+#if !defined(__APPLE_CC__)
   rl_completion_entry_function= &no_completion;
+#endif
   rl_add_defun("magic-space", (Function*)&fake_magic_space, -1);
 #else
   rl_attempted_completion_function= (CPPFunction*)&new_mysql_completion;
diff --git a/storage/innobase/include/fil0fil.h b/storage/innobase/include/fil0fil.h
index 34a53746b42..a795313116f 100644
--- a/storage/innobase/include/fil0fil.h
+++ b/storage/innobase/include/fil0fil.h
@@ -1489,7 +1489,7 @@ inline void fil_space_t::reacquire()
 inline bool fil_space_t::set_stopping_check()
 {
   mysql_mutex_assert_owner(&fil_system.mutex);
-#if defined __clang_major__ && __clang_major__ < 10
+#if (defined __clang_major__ && __clang_major__ < 10) || defined __APPLE_CC__
   /* Only clang-10 introduced support for asm goto */
   return n_pending.fetch_or(STOPPING, std::memory_order_relaxed) & STOPPING;
 #elif defined __GNUC__ && (defined __i386__ || defined __x86_64__)
diff --git a/storage/rocksdb/rocksdb b/storage/rocksdb/rocksdb
--- a/storage/rocksdb/rocksdb
+++ b/storage/rocksdb/rocksdb
@@ -1 +1 @@
-Subproject commit bba5e7bc21093d7cfa765e1280a7c4fdcd284288
+Subproject commit bba5e7bc21093d7cfa765e1280a7c4fdcd284288-dirty
Yuliyas-Air:server Valerii$

The last diff in rocksdb submodule is related to the "missing zstd headers" problem I fined with yet another lame patch while building 10.7, see MDEV-27619

After usual mysql_install_db and startup, I've ended up with a shiny new MariaDB 10.8.0 up and running on macOS 10.13.6:

Yuliyas-Air:maria10.8 Valerii$ bin/mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.8.0-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show variables like 'version%';
+-------------------------+------------------------------------------+
| Variable_name           | Value                                    |
+-------------------------+------------------------------------------+
| version                 | 10.8.0-MariaDB                           |
| version_comment         | MariaDB Server                           |
| version_compile_machine | x86_64                                   |
| version_compile_os      | osx10.13                                 |
| version_malloc_library  | system                                   |
| version_source_revision | c1cef1afa9962544de4840c9a796ae0a9b5e92e6 |
| version_ssl_library     | OpenSSL 1.1.1l  24 Aug 2021              |
+-------------------------+------------------------------------------+
7 rows in set (0.001 sec)

I even checked few MTR test suites, and many tests pass:

Yuliyas-Air:mysql-test Valerii$ ./mtr --suite=rocksdb
Logging: ./mtr  --suite=rocksdb
VS config:
vardir: /Users/Valerii/dbs/maria10.8/mysql-test/var
Checking leftover processes...
 - found old pid 89161 in 'mysqld.1.pid', killing it...
   process did not exist!
Removing old var directory...
Creating var directory '/Users/Valerii/dbs/maria10.8/mysql-test/var'...
Checking supported features...
MariaDB Version 10.8.0-MariaDB
 - SSL connections supported
Using suites: rocksdb
Collecting tests...
Installing system database...
...
rocksdb.index_merge_rocksdb 'write_committed' [ pass ]    911
rocksdb.shutdown 'write_prepared'        [ pass ]   1895
rocksdb.index_merge_rocksdb 'write_prepared' [ pass ]    923
rocksdb.mariadb_misc_binlog 'write_committed' [ pass ]     43
rocksdb.mariadb_misc_binlog 'write_prepared' [ pass ]     43
...
rocksdb.issue495 'write_committed'       [ pass ]     33
rocksdb.partition 'write_committed'      [ fail ]
        Test ended at 2022-01-27 22:41:15

CURRENT_TEST: rocksdb.partition
mysqltest: At line 67: query 'ALTER TABLE t1 REBUILD PARTITION p0, p1' failed: ER_METADATA_INCONSISTENCY (4064): Table 'test.t1#P#p0#TMP#' does not exist, but metadata information exists inside MyRocks. This is a sign of data inconsistency. Please check if './test/t1#P#p0#TMP#.frm' exists, and try to restore it if it does not exist.

The result from queries just before the failure was:
< snip >
DROP TABLE IF EXISTS employees_hash_1;
DROP TABLE IF EXISTS t1_hash;
DROP TABLE IF EXISTS employees_linear_hash;
DROP TABLE IF EXISTS t1_linear_hash;
DROP TABLE IF EXISTS k1;
DROP TABLE IF EXISTS k2;
DROP TABLE IF EXISTS tm1;
DROP TABLE IF EXISTS tk;
DROP TABLE IF EXISTS ts;
DROP TABLE IF EXISTS ts_1;
DROP TABLE IF EXISTS ts_3;
DROP TABLE IF EXISTS ts_4;
DROP TABLE IF EXISTS ts_5;
DROP TABLE IF EXISTS trb3;
DROP TABLE IF EXISTS tr;
DROP TABLE IF EXISTS members_3;
DROP TABLE IF EXISTS clients;
DROP TABLE IF EXISTS clients_lk;
DROP TABLE IF EXISTS trb1;
CREATE TABLE t1 (i INT, j INT, k INT, PRIMARY KEY (i)) ENGINE = ROCKSDB PARTITION BY KEY(i) PARTITIONS 4;

More results from queries before failure can be found in /Users/Valerii/dbs/maria10.8/mysql-test/var/log/partition.log

 - saving '/Users/Valerii/dbs/maria10.8/mysql-test/var/log/rocksdb.partition-write_committed/' to '/Users/Valerii/dbs/maria10.8/mysql-test/var/log/rocksdb.partition-write_committed/'

Only  148  of 555 completed.
--------------------------------------------------------------------------
The servers were restarted 23 times
Spent 154.726 of 206 seconds executing testcases

Failure: Failed 1/27 tests, 96.30% were successful.

Failing test(s): rocksdb.partition

The log files in var/log may give you some hint of what went wrong.

If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html

69 tests were skipped, 2 by the test itself.

mysql-test-run: *** ERROR: there were failing test cases

But the real reason to build MariaDB server on macOS (other than "because I can") was not even RocksDB testing, but this:

Yuliyas-Air:mysql-test Valerii$ dtrace
Usage: dtrace [-aACeFHlqSvVwZ] [-arch i386|x86_64] [-b bufsz] [-c cmd] [-D name[=def]]
        [-I path] [-L path] [-o output] [-p pid] [-s script] [-U name]
        [-x opt[=val]]

        [-P provider [[ predicate ] action ]]
        [-m [ provider: ] module [[ predicate ] action ]]
        [-f [[ provider: ] module: ] func [[ predicate ] action ]]
        [-n [[[ provider: ] module: ] func: ] name [[ predicate ] action ]]
        [-i probe-id [[ predicate ] action ]] [ args ... ]

        predicate -> '/' D-expression '/'
           action -> '{' D-statements '}'

        -arch Generate programs and Mach-O files for the specified architecture

        -a  claim anonymous tracing state
        -A  generate plist(5) entries for anonymous tracing
        -b  set trace buffer size
        -c  run specified command and exit upon its completion
        -C  run cpp(1) preprocessor on script files
        -D  define symbol when invoking preprocessor
        -e  exit after compiling request but prior to enabling probes
        -f  enable or list probes matching the specified function name
        -F  coalesce trace output by function
        -h  generate a header file with definitions for static probes
        -H  print included files when invoking preprocessor
        -i  enable or list probes matching the specified probe id
        -I  add include directory to preprocessor search path
        -l  list probes matching specified criteria
        -L  add library directory to library search path
        -m  enable or list probes matching the specified module name
        -n  enable or list probes matching the specified probe name
        -o  set output file
        -p  grab specified process-ID and cache its symbol tables
        -P  enable or list probes matching the specified provider name
        -q  set quiet mode (only output explicitly traced data)
        -s  enable or list probes according to the specified D script
        -S  print D compiler intermediate code
        -U  undefine symbol when invoking preprocessor
        -v  set verbose mode (report stability attributes, arguments)
        -V  report DTrace API version
        -w  permit destructive actions
        -W  wait for specified process and exit upon its completion
        -x  enable or modify compiler and tracing options
        -Z  permit probe descriptions that match zero probes
Yuliyas-Air:mysql-test Valerii$

I remember how cool dtrace was since workin for Sun and after last re-install of Fedora I ended up with my Illumos VM gone and no recent FreeBSD VM anyway,. so basically there was no dtrace at hand until I've got this MacBook.

So, unlike MySQL 8.0, MariaDB server still contains USDT (DTrace probes) and I've built my version with them enabled. Let's quickly check how they can be used. There are few sample D files in the source:

Yuliyas-Air:server Valerii$ ls support-files/dtrace/
locktime.d                      query-rowops.d
query-execandqc.d               query-time.d
query-filesort-time.d           statement-time.d
query-network-time.d            statement-type-aggregate.d
query-parse-time.d
Yuliyas-Air:server Valerii$ cat support-files/dtrace/query-time.d
#!/usr/sbin/dtrace -s
#
# Copyright (c) 2009 Sun Microsystems, Inc.
# Use is subject to license terms.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; version 2 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA
#
# Shows basic query execution time, who execute the query, and on what database

#pragma D option quiet

dtrace:::BEGIN
{
   printf("%-20s %-20s %-40s %-9s\n", "Who", "Database", "Query", "Time(ms)");
}

mysql*:::query-start
{
   self->query = copyinstr(arg0);
   self->connid = arg1;
   self->db    = copyinstr(arg2);
   self->who   = strjoin(copyinstr(arg3),strjoin("@",copyinstr(arg4)));
   self->querystart = timestamp;
}

mysql*:::query-done
{
   printf("%-20s %-20s %-40s %-9d\n",self->who,self->db,self->query,
          (timestamp - self->querystart) / 1000000);
}
Yuliyas-Air:server Valerii$

We can try to run it (with yet another hack to do that I leave to the curious reader who is really going to try this) and then execute some queries against MariaDB from another terminal:

Yuliyas-Air:server Valerii$ sudo support-files/dtrace/query-time.d
dtrace: system integrity protection is on, some features will not be available

Who                  Database             Query                                    Time(ms)
Valerii@localhost                         select @@version_comment limit 1         0
Valerii@localhost                         select 1+1                               0
Valerii@localhost                         select sleep(4)                          4004
Valerii@localhost                         select @@version_comment limit 1         0
Valerii@localhost                         create database sbtest                   1
Valerii@localhost    sbtest               CREATE TABLE sbtest1(
  id INTEGER NOT NULL AUTO_INCREMENT,
  k INTEGER DEFAULT '0' NOT NULL,
  c CHAR(120) DEFAULT '' NOT NULL,
  pad CHAR(60) DEFAULT '' NOT NULL,
  PRIMARY KEY (id)
) /*! ENGINE = innodb */ 91
Valerii@localhost    sbtest               INSERT INTO sbtest1(k, c, pad) VALUES(366941, '31451373586-15688153734-79729593694-96509299839-83724898275-86711833539-78981337422-35049690573-51724173961-87474696253', '98996621624-36689827414-04092488557-09587706818-65008859162'),(277750, '21472970079-7 181
Valerii@localhost    sbtest               INSERT INTO sbtest1(k, c, pad) VALUES(124021, '80697810288-90543941719-80227288793-55278810422-59841440561-49369413842-83550451066-12907725305-62036548401-86959403176', '65708342793-83311865079-53224065384-18645733125-16333693298'),(660496, '07381386584-5 31
Valerii@localhost    sbtest               INSERT INTO sbtest1(k, c, pad) VALUES(425080, '48883413333-17783399741-03981526516-97596354402-27141206678-83563692683-30244461835-25263435890-49140039573-28211133426', '81560227417-96691828090-72817141653-15106797886-43970285630'),(322421, '68618246702-7 33
...

to end up with a lightweight general query log with query execution time repported in milliseconds. Note some statement from, yes, sysbench test preapre :)

Nice view. macOS is also nice and useful platform to run MariaDB server for fun :)

Thta's probably enough for a simgle post. Now go and get that from your MySQL 8 on any OS! USDTs rule!

To summarize:

  1. It is surely possible to build even latest ang greates still alpha MariaDB 10.8 on macOS even as old as 10.13.6, and get usable result.
  2. One of the main benefits of macOS is DTrace support.
  3. It is still possible to use a limited set of USDTs once added to MySQL and now gone in all current versions of MariaDB server.
  4. That rocksdb.partition test failure is to be studied an maybe reported as a MDEV
  5. DTrace is surely more capable than wehat those examples in the source code show. More DTrace-related posts are coming this year. Stay tuned!

Saturday, December 22, 2018

How to Get Details About MyRocks Deadlocks in MariaDB and Percona Server

In my previous post on ERROR 1213 I noted that Percona Server does not support the SHOW ENGINE ROCKSDB TRANSACTION STATUS statement to get deadlock details in "text" form. I've got some clarifications in my related feature request, PS-5114. So I decided to write this followup post and show what is the way to get deadlock details for the ROCKSDB tables in current versions of MariaDB and Percona Server.

First of all, I'd like to check MariaDB's implementation of MyRocks. For this I'll re-create deadlock scenario from that my post with MariaDB 10.3.12 I have at hand. We should start with installing ROCKSDB plugin according to this KB article:
openxs@ao756:~/dbs/maria10.3$ bin/mysql --no-defaults --socket=/tmp/mariadb.sock -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.3.12-MariaDB Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.


MariaDB [test]> install soname 'ha_rocksdb';
Query OK, 0 rows affected (36,338 sec)

MariaDB [test]> show engines;
+--------------------+---------+----------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                          | Transactions | XA   | Savepoints |
+--------------------+---------+----------------------------------------------------------------------------------+--------------+------+------------+
| ROCKSDB            | YES     | RocksDB storage engine                                                           | YES          | YES  | YES        |
...
| CONNECT            | YES     | Management of External Data (SQL/NOSQL/MED), including many file formats         | NO           | NO   | NO         |
| Aria               | YES     | Crash-safe tables with MyISAM heritage                                           | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                               | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                   | YES          | NO   | YES        |
+--------------------+---------+----------------------------------------------------------------------------------+--------------+------+------------+
10 rows in set (8,753 sec)

MariaDB [test]> show plugins;
+-------------------------------+----------+--------------------+---------------+---------+
| Name                          | Status   | Type               | Library       | License |
+-------------------------------+----------+--------------------+---------------+---------+
| binlog                        | ACTIVE   | STORAGE ENGINE     | NULL          | GPL     |
| mysql_native_password         | ACTIVE   | AUTHENTICATION     | NULL          | GPL     |
| mysql_old_password            | ACTIVE   | AUTHENTICATION     | NULL          | GPL     |
| wsrep                         | ACTIVE   | STORAGE ENGINE     | NULL          | GPL     |
...
| CONNECT                       | ACTIVE   | STORAGE ENGINE     | ha_connect.so | GPL     |
| ROCKSDB                       | ACTIVE   | STORAGE ENGINE     | ha_rocksdb.so | GPL     |
...
| ROCKSDB_LOCKS                 | ACTIVE   | INFORMATION SCHEMA | ha_rocksdb.so | GPL     |
| ROCKSDB_TRX                   | ACTIVE   | INFORMATION SCHEMA | ha_rocksdb.so | GPL     |
| ROCKSDB_DEADLOCK              | ACTIVE   | INFORMATION SCHEMA | ha_rocksdb.so | GPL     |
+-------------------------------+----------+--------------------+---------------+---------+
68 rows in set (2,451 sec)
Note that in MariaDB just one simple INSTALL SONAME ... statement is enough to get ROCKSDB with all related plugins loaded. Do not mind time to execute statements - I am running them on a netbook that is busy compiling Percona Server 8.0.13 from GitHub concurrently, to post something about it later :)

Now, let me re-create the same deadlock scenario:

MariaDB [test]> create table t1(id int, c1 int, primary key(id)) engine=rocksdb;
Query OK, 0 rows affected (4,163 sec)

MariaDB [test]> insert into t1 values (1,1), (2,2);
Query OK, 2 rows affected (0,641 sec)
Records: 2  Duplicates: 0  Warnings: 0

MariaDB [test]> set global rocksdb_lock_wait_timeout=50;
Query OK, 0 rows affected (0,644 sec)

MariaDB [test]> set global rocksdb_deadlock_detect=ON;
Query OK, 0 rows affected (0,037 sec)

MariaDB [test]> show global variables like 'rocksdb%deadlock%';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| rocksdb_deadlock_detect       | ON    |
| rocksdb_deadlock_detect_depth | 50    |
| rocksdb_max_latest_deadlocks  | 5     |
+-------------------------------+-------+
3 rows in set (0,022 sec)
We need two sessions. In the first one:
MariaDB [test]> select connection_id();
+-----------------+
| connection_id() |
+-----------------+
|              11 |
+-----------------+
1 row in set (0,117 sec)

MariaDB [test]> start transaction;
Query OK, 0 rows affected (0,000 sec)

MariaDB [test]> select * from t1 where id = 1 for update;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
+----+------+
1 row in set (0,081 sec)

In the second:
MariaDB [test]> select connection_id();
+-----------------+
| connection_id() |
+-----------------+
|              12 |
+-----------------+
1 row in set (0,000 sec)

MariaDB [test]> start transaction;
Query OK, 0 rows affected (0,000 sec)

MariaDB [test]> select * from t1 where id = 2 for update;
+----+------+
| id | c1   |
+----+------+
|  2 |    2 |
+----+------+
1 row in set (0,001 sec)
Back in the first:

MariaDB [test]> select * from t1 where id=2 for update;
It hangs waiting for incompatible lock. In the second:

MariaDB [test]> select * from t1 where id=1 for update;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Now, can we get any details about the deadlock that just happened using upstream SHOW ENGINE statement? Let's try:
MariaDB [test]> SHOW ENGINE ROCKSDB TRANSACTION STATUS\G
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'TRANSACTION STATUS' at line 1
Does not work, same as in Percona Server 5.7.x. Here is a related MariaDB task: MDEV-13859 - "Add SHOW ENGINE ROCKSDB TRANSACTION STATUS or its equivalent?". Still "Open" and without any target version set, not even 10.4.

The idea behind NOT supporting this statement by Percona, according to comments I've got in  PS-5114, is to rely on tables in the information_schema. That's what we have in the related tables, rocksdb_locks, rocksdb_trx and rocksdb_deadlock after deadlock above was detected:
MariaDB [test]> select * from information_schema.rocksdb_deadlock;
+-------------+------------+----------------+---------+------------------+-----------+------------+------------+-------------+
| DEADLOCK_ID | TIMESTAMP  | TRANSACTION_ID | CF_NAME | WAITING_KEY      | LOCK_TYPE | INDEX_NAME | TABLE_NAME | ROLLED_BACK |
+-------------+------------+----------------+---------+------------------+-----------+------------+------------+-------------+
|           0 | 1545481878 |              6 | default | 0000010080000002 | EXCLUSIVE | PRIMARY    | test.t1    |           0 |
|           0 | 1545481878 |              7 | default | 0000010080000001 | EXCLUSIVE | PRIMARY    | test.t1    |           1 |
+-------------+------------+----------------+---------+------------------+-----------+------------+------------+-------------+
2 rows in set (0,078 sec)

MariaDB [test]> select * from information_schema.rocksdb_trx;
+----------------+---------+------+-------------+------------+-------------+-------------+--------------------------+----------------+--------------+-----------+------------------------+----------------------+-----------+-------+
| TRANSACTION_ID | STATE   | NAME | WRITE_COUNT | LOCK_COUNT | TIMEOUT_SEC | WAITING_KEY | WAITING_COLUMN_FAMILY_ID | IS_REPLICATION | SKIP_TRX_API | READ_ONLY | HAS_DEADLOCK_DETECTION | NUM_ONGOING_BULKLOAD | THREAD_ID | QUERY |
+----------------+---------+------+-------------+------------+-------------+-------------+--------------------------+----------------+--------------+-----------+------------------------+----------------------+-----------+-------+
|              6 | STARTED |      |           0 |          2 |          50 |             |                        0 |              0 |            0 |         0 |                      1 |                    0 |        11 |       |
+----------------+---------+------+-------------+------------+-------------+-------------+--------------------------+----------------+--------------+-----------+------------------------+----------------------+-----------+-------+
1 row in set (0,001 sec)

MariaDB [test]> select * from information_schema.rocksdb_locks;
+------------------+----------------+------------------+------+
| COLUMN_FAMILY_ID | TRANSACTION_ID | KEY              | MODE |
+------------------+----------------+------------------+------+
|                0 |              6 | 0000010080000002 | X    |
|                0 |              6 | 0000010080000001 | X    |
+------------------+----------------+------------------+------+
2 rows in set (0,025 sec)
I was not able to find any really good documentation about these tables (I checked here, there and more), especially rocksdb_deadlock that is totally undocumented, so let me try to speculate and explain my ideas on how they are supposed to work and be used together. Information about up to rocksdb_max_latest_deadlocks is stored in the rocksdb_deadlock table, each deadlock is identified by deadlock_id and in case of MariaDB you can find out when it happened using the timestamp column that is a UNIX timestamp:
MariaDB [test]> select distinct deadlock_id, from_unixtime(timestamp) from information_schema.rocksdb_deadlock;+-------------+--------------------------+
| deadlock_id | from_unixtime(timestamp) |
+-------------+--------------------------+
|           0 | 2018-12-22 14:31:18      |
+-------------+--------------------------+
1 row in set (0,137 sec)

For each deadlock you have a row per lock wait for each transaction involved, identified by transaction_id. You can see for what key value (waited_key) in that index (index_name) of what table (table_name) the transaction was waited. Victim (transaction that was rolled back to prevent deadlock) of deadlock detection is identified by the value 1 in the rolled_back column. This is all the information that persists for any notable time, and I don't like it that much, as we can not see what lock(s) transactions had at the moment. We can guess conflicting lock based on what was the waiting_key, but I'd prefer InnoDB way of showing this clearly with all the details.

If you hurry up and query the rocksdb_trx table fast enough, you can get more details about those transaction(s) involved in deadlock that are NOT rolled back (and not yet committed). Join by the transaction_id column, obviously, to get the details up to current running query and processlist connection id (rocksdb_trx.thread_id column) involved. 

If you hurry up to query rocksdb_locks table also by the transaction_id of still active transaction, you can get a list of locks it holds and then guess which one was a blocking lock. If you are not fast enough and transaction is gone you have just to assume there was some blocking lock. One day gap locks may be added, and some lock would become not good enough guess.

I miss these rocks at Beaulieu-sur-Mer. MyRocks in all implementations but Facebook's, misses one useful way to get the details about deadlock(s) that happened in the past.

To summarize, while storing information about configurable number of last deadlocks in the table seems to be a good idea, in case of ROCKSDB in both Percona and MariaDB servers (as soon as all transactions involved in deadlock are completed one way or the other) we miss some details (like thread ids for sessions involved, exact locks that each transaction held etc) comparing to the text output provided by the original upstream statement (and SHOW ENGINE INNODB STATUS\G, surely). Even if we are lucky to query all tables in time, we still probably miss lock waits table (like innodb_lock_waits) and any built in way to store information in the error log about deadlocks that happened (and all locks involved). 

Note also lack of consistency in naming (rocksdb_locks, plural vs  rocksdb_deadlock, singular, in case of MariaDB), rocksdb_deadlock.lock_type with value EXCLUSIVE vs rocksdb_locks.mode with value X etc, and and very limited documentation available. In my opinion current state is unacceptable if we hope to see wide use of MyRocks by community users and DBAs outside of Facebook.

Saturday, December 8, 2018

What May Cause MySQL ERROR 1213

Probably all of us, MySQL users, DBAs and developers had seen error 1213 more than once, in one context or the other:
mysql> select * from t1;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
The first thing that comes to mind in this case is: "OK, we have InnoDB deadlock, let's check the details", followed by the SHOW ENGINE INNODB STATUS check, like this:
mysql> show engine innodb status\G
*************************** 1. row ***************************
  Type: InnoDB
  Name:
Status:
=====================================
2018-12-08 17:41:11 0x7f2f8b8db700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 12 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 59 srv_active, 0 srv_shutdown, 14824 srv_idle
srv_master_thread log flush and writes: 14882
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 326
OS WAIT ARRAY INFO: signal count 200
RW-shared spins 0, rounds 396, OS waits 195
RW-excl spins 0, rounds 120, OS waits 4
RW-sx spins 0, rounds 0, OS waits 0
Spin rounds per wait: 396.00 RW-shared, 120.00 RW-excl, 0.00 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 14960
Purge done for trx's n:o < 14954 undo n:o < 0 state: running but idle
History list length 28
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421316960193880, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421316960192752, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
--------
FILE I/O
--------
...
Now, what if you get the output like the one above? Without any LATEST DETECTED DEADLOCK section? I've seen people wondering how is it even possible and trying to find some suspicious bug somewhere...

Do not be in a hurry - time to recall that there are actually at least 4 quite common reasons to get error 2013 in modern (5.5+) version of MySQL, MariaDB and Friends:
  1. InnoDB deadlock happened
  2. Metadata deadlock happened
  3. If you are lucky enough to use Galera cluster, Galera conflict happened
  4. Deadlock happened in some other storage engine (for example, MyRocks)
I am not lucky enough to use MySQL's group replication yet, but I know that conflicts there are also possible. I am just not sure if error 1213 is also reported in that case. Feel free to check with a test case similar to the one I've used for Galera below.

I also suspect deadlocks with other engines are also possible. As a bonus point, I'll demonstrate the deadlock with MyRocks also.

Let's reproduce these 3 cases one by one and check how to get more information on them. In all cases it's enough to have at most 2 InnoDB tables with just two rows:
mysql> show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `id` int(11) NOT NULL,
  `c1` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> show create table t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `id` int(11) NOT NULL,
  `c1` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)

mysql> select * from t2;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)
We'll need two sessions, surely.

InnoDB Deadlock

With InnoDB and tables above it's really easy to end up with a deadlock. In the first session execute the following:
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1 where id = 1 for update;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
+----+------+
1 row in set (0.00 sec)
In the second session execute:
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1 where id = 2 for update;
+----+------+
| id | c1   |
+----+------+
|  2 |    2 |
+----+------+
1 row in set (0.02 sec)
Now in the first session try to access the row with id=2 asking for incompatible lock:
mysql> select * from t1 where id = 2 for update;
This statement hangs waiting for a lock (up to innodb_lock_wait_timeout seconds). Try to access the row with id=1 asking for incompatible lock in the second session, and you'll get the deadlock error:
mysql> select * from t1 where id = 1 for update;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
at this moment SELECT in the first transaction returns data:
+----+------+
| id | c1   |
+----+------+
|  2 |    2 |
+----+------+
1 row in set (5.84 sec)
It's that simple, one table and two rows is enough. We can get the details in the output of SHOW ENGINE INNODB STATUS:
...
------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-12-08 18:32:59 0x7f2f8b8db700
*** (1) TRANSACTION:
TRANSACTION 15002, ACTIVE 202 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 8, OS thread handle 139842181244672, query id 8545 localhost root statistics
select * from t1 where id = 2 for update
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 94 page no 3 n bits 72 index PRIMARY of table `test`.`t1` trx id 15002 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 15003, ACTIVE 143 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 9, OS thread handle 139842181510912, query id 8546 localhost root statistics
select * from t1 where id = 1 for update
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 94 page no 3 n bits 72 index PRIMARY of table `test`.`t1` trx id 15003 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 94 page no 3 n bits 72 index PRIMARY of table `test`.`t1` trx id 15003 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (2)
------------
TRANSACTIONS
------------
...
In the case above I've used Percona Server  5.7.24-26 (why not). Details of output may vary depending on version (and bugs it has :).  If you use MariaDB 5.5+, in case of InnoDB deadlock special innodb_deadlocks status variable is also incremented.

Metadata Deadlock

Unlike with InnoDB deadlocks, chances that you've seen deadlocks with metadata locks involved are low. One may spend notable time trying to reproduce such a deadlock, but (as usual) quck check of MySQL bugs database may help to find an easy to reproduce case. I mean Bug #65890 - "Deadlock that is not a deadlock with transaction and lock tables".

So, let's try the following scenario with two sessions and out InnoDB tables, t1 and t2. In one session:
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t2 for update;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)
In another session:
mysql> lock tables t1 write, t2 write;
It hangs, waiting as long as lock_wait_timeout. We can check what happens with metadata locks using performance_schema.metadata_locks table (as we use MySQL or Percona Server 5.7+, more on setup, alternatives for MariaDB etc here and there). In the first session:
mysql> select * from performance_schema.metadata_locks;
+-------------+--------------------+----------------+-----------------------+----------------------+---------------+-------------+--------+-----------------+----------------+
| OBJECT_TYPE | OBJECT_SCHEMA      | OBJECT_NAME    | OBJECT_INSTANCE_BEGIN | LOCK_TYPE            | LOCK_DURATION | LOCK_STATUS | SOURCE | OWNER_THREAD_ID | OWNER_EVENT_ID |
+-------------+--------------------+----------------+-----------------------+----------------------+---------------+-------------+--------+-----------------+----------------+
| TABLE       | test               | t2             |       139841686765904 | SHARED_WRITE         | TRANSACTION   | GRANTED     |        |              45 |           2850 || GLOBAL      | NULL               | NULL           |       139841688088672 | INTENTION_EXCLUSIVE  | STATEMENT     | GRANTED     |        |              46 |            205 |
| SCHEMA      | test               | NULL           |       139841688088912 | INTENTION_EXCLUSIVE  | TRANSACTION   | GRANTED     |        |              46 |            205 |
| TABLE       | test               | t1             |       139841688088992 | SHARED_NO_READ_WRITE | TRANSACTION   | GRANTED     |        |              46 |            207 |
| TABLE       | test               | t2             |       139841688089072 | SHARED_NO_READ_WRITE | TRANSACTION   | PENDING     |        |              46 |            208 |
| TABLE       | performance_schema | metadata_locks |       139841686219040 | SHARED_READ          | TRANSACTION   | GRANTED     |        |              45 |           3003 |
+-------------+--------------------+----------------+-----------------------+----------------------+---------------+-------------+--------+-----------------+----------------+
6 rows in set (0.00 sec)
As soon as we try this in the first session:
mysql> select * from t1;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
we get the same deadlock error 1213 and LOCK TABLES in the second session completes. We can find nothing about this deadlock in the output of SHOW ENGINE INNODB STATUS (as shared at the beginning of this post). I am also not aware about any status variables to count metadata deadlocks.

You can find some useful information about metadata deadlocks in the manual.

Galera Conflict

For simplicity I'll use MariaDB 10.1.x and simple 2 nodes setup on the same box as I described here. I'll start first node as a new cluster and create tables for this test:
openxs@ao756:~/dbs/maria10.1$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode1.cnf --wsrep-new-cluster &
[1] 13022
openxs@ao756:~/dbs/maria10.1$ 181208 20:40:52 mysqld_safe Logging to '/tmp/mysql-node1.err'.
181208 20:40:52 mysqld_safe Starting mysqld daemon with databases from /home/openxs/galera/node1

openxs@ao756:~/dbs/maria10.1$ bin/mysql  --socket=/tmp/mysql-node1.sock test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.34-MariaDB Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [test]> drop table t1, t2;
ERROR 1051 (42S02): Unknown table 'test.t2'
MariaDB [test]> create table t1(id int, c1 int, primary key(id));
Query OK, 0 rows affected (0.29 sec)

MariaDB [test]> create table t2(id int, c1 int, primary key(id));
Query OK, 0 rows affected (0.22 sec)

MariaDB [test]> insert into t1 values (1,1), (2,2);
Query OK, 2 rows affected (0.07 sec)
Records: 2  Duplicates: 0  Warnings: 0

MariaDB [test]> insert into t2 values (1,1), (2,2);
Query OK, 2 rows affected (0.18 sec)
Records: 2  Duplicates: 0  Warnings: 0
Then I'll start second node, make sure it joined the cluster and has the same data:
openxs@ao756:~/dbs/maria10.1$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode2.cnf &
[2] 15110
openxs@ao756:~/dbs/maria10.1$ 181208 20:46:11 mysqld_safe Logging to '/tmp/mysql-node2.err'.
181208 20:46:11 mysqld_safe Starting mysqld daemon with databases from /home/openxs/galera/node2

openxs@ao756:~/dbs/maria10.1$ bin/mysql --socket=/tmp/mysql-node2.sock test     Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.34-MariaDB Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [test]> show status like 'wsrep_cluster%';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 4                                    |
| wsrep_cluster_size       | 2                                    |
| wsrep_cluster_state_uuid | b1d227b1-0211-11e6-8ce0-3644ad2b03dc |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
4 rows in set (0.04 sec)

MariaDB [test]> select * from t2;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.02 sec)
Now we are ready to try to provoke Galera conflict. For this we have to try to update the same data in transactions on two different nodes. In one session connected to node1:
MariaDB [test]> select @@wsrep_node_name;
+-------------------+
| @@wsrep_node_name |
+-------------------+
| node1             |
+-------------------+
1 row in set (0.00 sec)

MariaDB [test]> start transaction;
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> update test.t1 set c1 = 5 where id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0


In another session connected to other node:

MariaDB [test]> select @@wsrep_node_name;
+-------------------+
| @@wsrep_node_name |
+-------------------+
| node2             |
+-------------------+
1 row in set (0.00 sec)

MariaDB [test]> start transaction;
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> update test.t1 set c1 = 6 where id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
Now in the first we can COMMIT successfully:
MariaDB [test]> commit;
Query OK, 0 rows affected (0.12 sec)
But if we try to COMMIT in the second:
MariaDB [test]> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
We get that same error 1213 about the deadlock. Surely you'll see nothing about this deadlock in INNODB STATUS output, as it was NOT an InnoDB deadlock, but Galera conflict. Check these status variables on the node2:
MariaDB [test]> show status like 'wsrep_local%';
+----------------------------+--------------------------------------+
| Variable_name              | Value                                |
+----------------------------+--------------------------------------+
| wsrep_local_bf_aborts      | 1                                    |
| wsrep_local_cached_downto  | 75                                   |
| wsrep_local_cert_failures  | 0                                    |
| wsrep_local_commits        | 0                                    |
| wsrep_local_index          | 0                                    |
| wsrep_local_recv_queue     | 0                                    |
| wsrep_local_recv_queue_avg | 0.000000                             |
| wsrep_local_recv_queue_max | 1                                    |
| wsrep_local_recv_queue_min | 0                                    |
| wsrep_local_replays        | 0                                    |
| wsrep_local_send_queue     | 0                                    |
| wsrep_local_send_queue_avg | 0.000000                             |
| wsrep_local_send_queue_max | 1                                    |
| wsrep_local_send_queue_min | 0                                    |
| wsrep_local_state          | 4                                    |
| wsrep_local_state_comment  | Synced                               |
| wsrep_local_state_uuid     | b1d227b1-0211-11e6-8ce0-3644ad2b03dc |
+----------------------------+--------------------------------------+
17 rows in set (0.01 sec)
If wsrep_local_bf_aborts > 0, you had conflicts and local transaction was rolled back to prevent them. We can see that remote one wins, on node2:
MariaDB [test]> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    5 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)
To summarize, in Galera "first commit wins" and local transaction involved in conflict is always a looser. You can get a lot of information about conflicts in the error log if you enable conflict logging features through wsrep_log_conflicts and cert.log_conflicts. See this fine manual for details.

MyRocks Deadlock

We can easily check how deadlocks are processed by MyRocks by just loading the plugin for the engine, converting tables to MyRocks and trying the same InnoDB scenario with the same Percona Server we used initially. But first, if you use Percona binaries you have to install a separate package:
openxs@ao756:~$ dpkg -l | grep rocksdb
openxs@ao756:~$ sudo apt-get install percona-server-rocksdb-5.7
[sudo] password for openxs:
Reading package lists... Done
Building dependency tree
...
Unpacking percona-server-rocksdb-5.7 (5.7.24-26-1.trusty) ...
Setting up percona-server-rocksdb-5.7 (5.7.24-26-1.trusty) ...


 * This release of Percona Server is distributed with RocksDB storage engine.
 * Run the following script to enable the RocksDB storage engine in Percona Server:

        ps-admin --enable-rocksdb -u <mysql_admin_user> -p[mysql_admin_pass] [-S <socket>] [-h <host> -P <port>]
Percona's manual has a lot more details and relies on separate ps-admin script, but basically you have to INSTALL PLUGINs like this (check script's code):
mysql> INSTALL PLUGIN ROCKSDB SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.86 sec)

mysql> INSTALL PLUGIN ROCKSDB_CFSTATS SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.06 sec)

mysql> INSTALL PLUGIN ROCKSDB_DBSTATS SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.08 sec)

mysql> INSTALL PLUGIN ROCKSDB_PERF_CONTEXT SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.05 sec)

mysql> INSTALL PLUGIN ROCKSDB_PERF_CONTEXT_GLOBAL SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.06 sec)

mysql> INSTALL PLUGIN ROCKSDB_CF_OPTIONS SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.05 sec)

mysql> INSTALL PLUGIN ROCKSDB_GLOBAL_INFO SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.05 sec)

mysql> INSTALL PLUGIN ROCKSDB_COMPACTION_STATS SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.05 sec)

mysql> INSTALL PLUGIN ROCKSDB_DDL SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.06 sec)

mysql> INSTALL PLUGIN ROCKSDB_INDEX_FILE_MAP SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.05 sec)

mysql> INSTALL PLUGIN ROCKSDB_LOCKS SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.05 sec)

mysql> INSTALL PLUGIN ROCKSDB_TRX SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.05 sec)

mysql> INSTALL PLUGIN ROCKSDB_DEADLOCK SONAME 'ha_rocksdb.so';
Query OK, 0 rows affected (0.06 sec)
Then check that the engine is there and convert tables:
mysql> show engines;
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                    | Transactions | XA   | Savepoints |
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
| ROCKSDB            | YES     | RocksDB storage engine                                                     | YES          | YES  | YES        |
...

mysql> alter table t1 engine=rocksdb;
Query OK, 2 rows affected (0.64 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> alter table t2 engine=rocksdb;
Query OK, 2 rows affected (0.58 sec)
Records: 2  Duplicates: 0  Warnings: 0
Now we are ready to try the same InnoDB scenario. Just note that lock wait timeout for MyRocks is defined by the rocksdb_lock_wait_timeout that is small by default, 1 second, do you have have to increase it first. You also have to set rocksdb_deadlock_detect to ON (as it's OFF by default):
mysql> set global rocksdb_lock_wait_timeout=50;
Query OK, 0 rows affected (0.00 sec)

mysql> set global rocksdb_deadlock_detect=ON;
Query OK, 0 rows affected (0.00 sec)

mysql> \r
Connection id:    14
Current database: test

mysql> start transaction;
Query OK, 0 rows affected (0.02 sec)

mysql> select * from t1 where id = 1 for update;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
+----+------+
1 row in set (0.00 sec)
Then in the second session:
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1 where id = 2 for update;
+----+------+
| id | c1   |
+----+------+
|  2 |    2 |
+----+------+
1 row in set (0.00 sec)
In the first:
mysql> select * from t1 where id = 2 for update;
and in the second we can get deadlock error:
mysql> select * from t1 where id = 1 for update;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql> show global status like '%deadlock%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| rocksdb_row_lock_deadlocks | 1     |
+----------------------------+-------+
1 row in set (0.00 sec)
Note that MyRocks has status variable to count deadlocks. Note that Percona Server still does NOT seem to support SHOW ENGINE ROCKSDB TRANSACTION STATUS statement available upstream:
mysql> show engine rocksdb transaction status\G
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'transaction status' at line 1
I was not able to find a bug about this (sorry if I missed it), and just reported new task to Percona's JIRA: PS-5114 - "Add support for SHOW ENGINE ROCKSDB TRANSACTION STATUS".

That's probably more than enough for single blog post (that is mostly NOT about bugs). One day I'll refresh my knowledge of MyRocks etc and maybe write more about deadlocks troubleshooting there.

Do not be surprised if you can not find anything in INNODB STATUS when you get error 1213, just proceed with further steps. There are other reasons to explore. Venice hides a lot early in the morning...

To summarize, do not be surprised that after you got MySQL error 1213 you see no information about recent InnoDB deadlock - there are at least 3 more reasons for this error to be reported, as explained above. You should know your configuration and use several other commands and sources of information to pinpoint what exactly happened and why.