Showing posts with label mtr. Show all posts

Thursday, January 27, 2022

First steps with MariaDB Server and DTrace on macOS

FOSDEM 2022 takes place next weekend, and I am still missing blog posts to support my unusual upcoming talk there, this time devoted to building and using MariaDB Server on macOS. So tonight I am going to contribute something new to refer to during the talk or in follow-up questions. I'll try to document how I build MariaDB Server from GitHub source on an old (early 2015) MacBook Air running macOS 10.13.6 High Sierra. I am also going to show why one may want to use macOS there instead of installing Linux and getting a far more familiar environment on the same decent hardware.

I inherited this Air from my daughter in September 2021, after she got a new, M1-based one, and initially used it mostly for Zoom calls and web browsing. Soon I recalled that I had used a MacBook for more than 3 years in the past while working for Sun and Oracle, and that it had been my main working platform, not only for content consumption, emails and virtual machines. It was regularly used for bug verification, MySQL builds and tests of all kinds. It would be a waste of a capable OS and good hardware (formally still more powerful than all my other machines except the Fedora 33 desktop) NOT to try to use it properly.

That's why, after a few software upgrades to end up with a more recent 10.13 minor release:

Yuliyas-Air:maria10.6 Valerii$ uname -a
Darwin Yuliyas-Air 17.7.0 Darwin Kernel Version 17.7.0: Mon Aug 31 22:11:23 PDT 2020; root:xnu-4570.71.82.6~1/RELEASE_X86_64 x86_64

I read the manual, registered as a developer, downloaded and installed the proper version of Xcode, and decided to proceed with MacPorts.

Then I updated the ports tree and proceeded by the manual, installing git and other surely needed tools and packages:

sudo port install git cmake jemalloc judy openssl boost gnutls

That was just the beginning, and eventually, with dependencies, port updates, problems and workarounds, I ended up like this (and that works for building current code of 10.1 to 10.8 for sure, with some ports maybe not needed or used for other builds, but who cares):

Yuliyas-Air:maria10.8 Valerii$ port installed
The following ports are currently installed:
  autoconf @2.71_1 (active)
  automake @1.16.5_0 (active)
  bison @3.8.2_0
  bison @3.8.2_2 (active)
  bison-runtime @3.8.2_0 (active)
  boehmgc @8.0.6_0 (active)
  boost @1.76_0 (active)
  boost171 @1.71.0_3+no_single+no_static+python39 (active)
  boost176 @1.76.0_2+no_single+no_static+python39
  boost176 @1.76.0_3+no_single+no_static+python39 (active)
  bzip2 @1.0.8_0 (active)
  cmake @3.21.4_0
  cmake @3.22.1_0 (active)
  curl @7.80.0_0+ssl (active)
  curl-ca-bundle @7.80.0_0
  curl-ca-bundle @7.80.0_1 (active)
  cyrus-sasl2 @2.1.27_5+kerberos (active)
  db48 @4.8.30_4 (active)
  expat @2.4.1_0
  expat @2.4.2_0
  expat @2.4.3_0 (active)
  gdb @11.1_0 (active)
  gdbm @1.22_0 (active)
  gettext @0.21_0 (active)
  gettext-runtime @0.21_0 (active)
  gettext-tools-libs @0.21_0 (active)
  git @2.34.1_1+credential_osxkeychain+diff_highlight+doc+pcre+perl5_28
  git @2.34.1_2+credential_osxkeychain+diff_highlight+doc+pcre+perl5_28 (active)
  gmp @6.2.1_0 (active)
  gnutls @3.6.16_1
  gnutls @3.6.16_2 (active)
  icu @67.1_4 (active)
  jemalloc @5.2.1_1 (active)
  judy @1.0.5_1 (active)
  kerberos5 @1.19.2_1 (active)
  libarchive @3.5.2_1 (active)
  libb2 @0.98.1_1 (active)
  libcomerr @1.45.6_0 (active)
  libcxx @5.0.1_4 (active)
  libedit @20210910-3.1_1 (active)
  libevent @2.1.12_1 (active)
  libffi @3.4.2_2 (active)
  libiconv @1.16_1 (active)
  libidn @1.38_0 (active)
  libidn2 @2.3.2_0
  libidn2 @2.3.2_1 (active)
  libpsl @0.21.1-20210726_2 (active)
  libtasn1 @4.18.0_0 (active)
  libtextstyle @0.21_0 (active)
  libtool @2.4.6_13 (active)
  libunistring @0.9.10_0
  libunistring @1.0_0 (active)
  libuv @1.42.0_1
  libuv @1.43.0_0 (active)
  libxml2 @2.9.12_1 (active)
  libxslt @1.1.34_6 (active)
  lmdb @0.9.29_0 (active)
  luajit @2.1.0-beta3_5 (active)
  lz4 @1.9.3_1 (active)
  lzma @4.65_1 (active)
  lzo2 @2.10_0 (active)
  m4 @1.4.19_1 (active)
  mariadb @5.5.68_0 (active)
  mariadb-server @5.5.68_0 (active)
  mysql57 @5.7.36_1 (active)
  mysql_select @0.1.2_4 (active)
  ncurses @6.3_0 (active)
  nettle @3.7.3_0 (active)
  openssl @3_1
  openssl @3_2 (active)
  openssl3 @3.0.0_6+legacy
  openssl3 @3.0.1_0+legacy (active)
  openssl11 @1.1.1l_5 (active)
  p5-dbd-mysql @4.50.0_0 (active)
  p5.28-authen-sasl @2.160.0_0 (active)
  p5.28-cgi @4.530.0_0 (active)
  p5.28-clone @0.450.0_0 (active)
  p5.28-dbd-mysql @4.50.0_0+mysql57 (active)
  p5.28-dbi @1.643.0_0 (active)
  p5.28-digest-hmac @1.40.0_0 (active)
  p5.28-digest-sha1 @2.130.0_4 (active)
  p5.28-encode @3.160.0_0 (active)
  p5.28-encode-locale @1.50.0_0 (active)
  p5.28-error @0.170.290_0 (active)
  p5.28-gssapi @0.280.0_3 (active)
  p5.28-html-parser @3.760.0_0 (active)
  p5.28-html-tagset @3.200.0_4 (active)
  p5.28-http-date @6.50.0_0 (active)
  p5.28-http-message @6.350.0_0 (active)
  p5.28-io-html @1.4.0_0 (active)
  p5.28-io-socket-ssl @2.72.0_0
  p5.28-io-socket-ssl @2.73.0_0 (active)
  p5.28-lwp-mediatypes @6.40.0_0 (active)
  p5.28-mozilla-ca @20211001_0 (active)
  p5.28-net-libidn @0.120.0_5 (active)
  p5.28-net-smtp-ssl @1.40.0_0 (active)
  p5.28-net-ssleay @1.900.0_4 (active)
  p5.28-term-readkey @2.380.0_0 (active)
  p5.28-time-local @1.300.0_0 (active)
  p5.28-timedate @2.330.0_0 (active)
  p5.28-uri @5.100.0_0 (active)
  p11-kit @0.24.0_1 (active)
  pcre2 @10.39_0 (active)
  perl5.28 @5.28.3_4 (active)
  perl5.30 @5.30.3_3 (active)
  pkgconfig @0.29.2_0 (active)
  popt @1.18_1 (active)
  python3_select @0.0_2 (active)
  python39 @3.9.9_0+lto+optimizations
  python39 @3.9.10_0+lto+optimizations (active)
  python_select @0.3_9 (active)
  readline @8.1.000_0 (active)
  readline-5 @5.2.014_2 (active)
  rsync @3.2.3_1 (active)
  sqlite3 @3.37.0_0
  sqlite3 @3.37.1_0
  sqlite3 @3.37.2_0 (active)
  sysbench @1.0.20_0 (active)
  tcp_wrappers @20_4 (active)
  texinfo @6.8_0 (active)
  umem @1.0.1_1 (active)
  xxhashlib @0.8.1_0
  xxhashlib @0.8.1_1 (active)
  xz @5.2.5_0 (active)
  zlib @1.2.11_0 (active)
  zstd @1.5.0_0
  zstd @1.5.1_0 (active)
Yuliyas-Air:maria10.8 Valerii$

This is surely not the minimal needed set of ports. A couple of them (like openssl11) were really needed to build and install 10.8 successfully in the end.
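To see which versions are really in use, the listing can be reduced to the active ports only. A minimal sketch, assuming the `port installed` output format shown above; the function name `active_ports` is made up for this illustration and is not part of MacPorts:

```shell
#!/bin/sh
# active_ports: read `port installed` output on stdin and print
# "name version" for every line that ends with "(active)".
# Hypothetical helper, not a MacPorts command.
active_ports() {
  awk '$NF == "(active)" { print $1, $2 }'
}

# Demo on a few lines copied from the listing above.
active_ports <<'EOF'
The following ports are currently installed:
  bison @3.8.2_0
  bison @3.8.2_2 (active)
  openssl11 @1.1.1l_5 (active)
EOF
```

On the real machine it would be used as `port installed | active_ports`. Inactive leftovers like the older bison are simply dropped from the output.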

Then I cloned the code with usual steps and ended up with this:

Yuliyas-Air:server Valerii$ git log -1
commit c1cef1afa9962544de4840c9a796ae0a9b5e92e6 (HEAD -> 10.8, origin/bb-10.8-wlad, origin/bb-10.8-release, origin/HEAD, origin/10.8)
Merge: db2013787d2 9d93b51effd
Author: Vladislav Vaintroub <wlad@mariadb.com>
Date:   Wed Jan 26 13:57:00 2022 +0100

    Merge remote-tracking branch 'origin/bb-10.8-wlad' into 10.8
Yuliyas-Air:server Valerii$ git submodule update --init --recursive
Submodule path 'libmariadb': checked out 'ddb031b6a1d8b6e26a0f10f454dc1453a48a6ca8'
Yuliyas-Air:server Valerii$ cd buildtmp/
Yuliyas-Air:buildtmp Valerii$ rm -rf *
Yuliyas-Air:buildtmp Valerii$ cmake .. -DCMAKE_INSTALL_PREFIX=/Users/Valerii/dbs/maria10.8 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_CONFIG=mysql_release -DFEATURE_SET=community -DWITH_EMBEDDED_SERVER=OFF -DPLUGIN_TOKUDB=NO -DWITH_SSL=/opt/local/libexec/openssl11 -DENABLE_DTRACE=1
...
-- The following OPTIONAL packages have not been found:

 * PMEM
 * Snappy

-- Configuring done
CMake Warning (dev):
  Policy CMP0042 is not set: MACOSX_RPATH is enabled by default.  Run "cmake
  --help-policy CMP0042" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  MACOSX_RPATH is not specified for the following targets:

   libmariadb

This warning is for project developers.  Use -Wno-dev to suppress it.

-- Generating done
-- Build files have been written to: /Users/Valerii/git/server/buildtmp
Yuliyas-Air:buildtmp Valerii$

Note that I've used a somewhat nontrivial cmake command line for an out-of-source build, and the output above was not from a clean state, but from the state after 10.1 to 10.7 were all checked out and built successfully, one by one, with problems found and resolved (more on that in my slides and during the talk).

Two key options in that command line are -DWITH_SSL=/opt/local/libexec/openssl11, to use the supported OpenSSL version 1.1 (even though for 10.8 the default version 3 should already work too), and -DENABLE_DTRACE=1, to enable the static DTrace probes that are still there in the MariaDB code. The rest are typical for my builds and blog posts.

Then I proceeded with the usual make, only to hit the first problem:
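The same configuration can be kept in a small script so that the two key options stay visible between builds. This is only a sketch: the function name and its two parameters are my illustration, while the option values are exactly the ones from the command line above.

```shell
#!/bin/sh
# mariadb_cmake_args PREFIX SSL_DIR
# Print, one per line, the cmake options used for the build above.
# PREFIX and SSL_DIR are parameters of this hypothetical helper.
mariadb_cmake_args() {
  prefix=$1
  ssl_dir=$2
  printf '%s\n' \
    "-DCMAKE_INSTALL_PREFIX=$prefix" \
    "-DCMAKE_BUILD_TYPE=RelWithDebInfo" \
    "-DBUILD_CONFIG=mysql_release" \
    "-DFEATURE_SET=community" \
    "-DWITH_EMBEDDED_SERVER=OFF" \
    "-DPLUGIN_TOKUDB=NO" \
    "-DWITH_SSL=$ssl_dir" \
    "-DENABLE_DTRACE=1"
}

# Show what would be passed to cmake for the 10.8 build.
mariadb_cmake_args /Users/Valerii/dbs/maria10.8 /opt/local/libexec/openssl11
```

From the build directory one could then run `cmake .. $(mariadb_cmake_args /Users/Valerii/dbs/maria10.8 /opt/local/libexec/openssl11)`, which is fine as long as the paths contain no spaces.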

Yuliyas-Air:buildtmp Valerii$ make
...
[ 46%] Built target heap
[ 46%] Building CXX object storage/innobase/CMakeFiles/innobase.dir/btr/btr0btr.cc.o
In file included from /Users/Valerii/git/server/storage/innobase/btr/btr0btr.cc:28:
In file included from /Users/Valerii/git/server/storage/innobase/include/btr0btr.h:31:
In file included from /Users/Valerii/git/server/storage/innobase/include/dict0dict.h:32:
In file included from /Users/Valerii/git/server/storage/innobase/include/dict0mem.h:45:
In file included from /Users/Valerii/git/server/storage/innobase/include/buf0buf.h:33:
/Users/Valerii/git/server/storage/innobase/include/fil0fil.h:1497:11: error:
      'asm goto' constructs are not supported yet
  __asm__ goto("lock btsl $31, %0\t\njnc %l1" : : "m" (n_pending)
          ^
1 error generated.
make[2]: *** [storage/innobase/CMakeFiles/innobase.dir/btr/btr0btr.cc.o] Error 1
make[1]: *** [storage/innobase/CMakeFiles/innobase.dir/all] Error 2
make: *** [all] Error 2

caused by the fact that clang 10.0.0 from Xcode:

Yuliyas-Air:maria10.8 Valerii$ clang --version
Apple LLVM version 10.0.0 (clang-1000.10.44.4)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

does NOT support asm goto used in MariaDB Server code since 10.6 (while it was supposed to support it). I've applied a lame fix (see MDEV-27402 for more details and final diff later) and proceeded:

Yuliyas-Air:buildtmp Valerii$ make
...

[ 97%] Building CXX object client/CMakeFiles/mariadb.dir/mysql.cc.o
/Users/Valerii/git/server/client/mysql.cc:2853:59: error: expected expression
  rl_attempted_completion_function= (rl_completion_func_t*)&new_mysql_co...
                                                          ^
/Users/Valerii/git/server/client/mysql.cc:2853:38: error: use of undeclared
      identifier 'rl_completion_func_t'; did you mean 'rl_completion_matches'?
  rl_attempted_completion_function= (rl_completion_func_t*)&new_mysql_co...
                                     ^~~~~~~~~~~~~~~~~~~~
                                     rl_completion_matches
/usr/include/editline/readline.h:202:16: note: 'rl_completion_matches' declared
      here
char           **rl_completion_matches(const char *, rl_compentry_func_t *);
                 ^
/Users/Valerii/git/server/client/mysql.cc:2854:33: error: assigning to
      'Function *' (aka 'int (*)(const char *, int)') from incompatible type
      'rl_compentry_func_t *' (aka 'char *(*)(const char *, int)'): different
      return type ('int' vs 'char *')
  rl_completion_entry_function= (rl_compentry_func_t*)&no_completion;
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/Valerii/git/server/client/mysql.cc:2856:3: error: no matching function
      for call to 'rl_add_defun'
  rl_add_defun("magic-space", (rl_command_func_t *)&fake_magic_space, -1);
  ^~~~~~~~~~~~
/usr/include/editline/readline.h:195:7: note: candidate function not viable: no
      known conversion from 'rl_command_func_t *' (aka 'int (*)(int, int)') to
      'Function *' (aka 'int (*)(const char *, int)') for 2nd argument
int              rl_add_defun(const char *, Function *, int);
                 ^
4 errors generated.
make[2]: *** [client/CMakeFiles/mariadb.dir/mysql.cc.o] Error 1
make[1]: *** [client/CMakeFiles/mariadb.dir/all] Error 2
make: *** [all] Error 2
Yuliyas-Air:buildtmp Valerii$

This readline-related problem was also reported (see MDEV-27579) and I fixed it with a lame patch, and eventually I was able to build successfully:

...

[100%] Linking C executable wsrep_check_version
[100%] Built target wsrep_check_version
Yuliyas-Air:buildtmp Valerii$ make install && make clean
...
-- Installing: /Users/Valerii/dbs/maria10.8/share/aclocal/mysql.m4
-- Installing: /Users/Valerii/dbs/maria10.8/support-files/mysql.server
Yuliyas-Air:buildtmp Valerii$ echo $?
0

The final diff is like this:

Yuliyas-Air:server Valerii$ git diff -u
diff --git a/client/mysql.cc b/client/mysql.cc
index 6612b273d17..902589f2e83 100644
--- a/client/mysql.cc
+++ b/client/mysql.cc
@@ -2849,7 +2849,7 @@ static void initialize_readline ()
   rl_terminal_name= getenv("TERM");

   /* Tell the completer that we want a crack first. */
-#if defined(USE_NEW_READLINE_INTERFACE)
+#if defined(USE_NEW_READLINE_INTERFACE) && !defined(__APPLE_CC__)
   rl_attempted_completion_function= (rl_completion_func_t*)&new_mysql_completion;
   rl_completion_entry_function= (rl_compentry_func_t*)&no_completion;

@@ -2859,7 +2859,9 @@ static void initialize_readline ()
   setlocale(LC_ALL,""); /* so as libedit use isprint */
 #endif
   rl_attempted_completion_function= (CPPFunction*)&new_mysql_completion;
+#if !defined(__APPLE_CC__)
   rl_completion_entry_function= &no_completion;
+#endif
   rl_add_defun("magic-space", (Function*)&fake_magic_space, -1);
 #else
   rl_attempted_completion_function= (CPPFunction*)&new_mysql_completion;
diff --git a/storage/innobase/include/fil0fil.h b/storage/innobase/include/fil0fil.h
index 34a53746b42..a795313116f 100644
--- a/storage/innobase/include/fil0fil.h
+++ b/storage/innobase/include/fil0fil.h
@@ -1489,7 +1489,7 @@ inline void fil_space_t::reacquire()
 inline bool fil_space_t::set_stopping_check()
 {
   mysql_mutex_assert_owner(&fil_system.mutex);
-#if defined __clang_major__ && __clang_major__ < 10
+#if (defined __clang_major__ && __clang_major__ < 10) || defined __APPLE_CC__
   /* Only clang-10 introduced support for asm goto */
   return n_pending.fetch_or(STOPPING, std::memory_order_relaxed) & STOPPING;
 #elif defined __GNUC__ && (defined __i386__ || defined __x86_64__)
diff --git a/storage/rocksdb/rocksdb b/storage/rocksdb/rocksdb
--- a/storage/rocksdb/rocksdb
+++ b/storage/rocksdb/rocksdb
@@ -1 +1 @@
-Subproject commit bba5e7bc21093d7cfa765e1280a7c4fdcd284288
+Subproject commit bba5e7bc21093d7cfa765e1280a7c4fdcd284288-dirty
Yuliyas-Air:server Valerii$

The last diff in the rocksdb submodule is related to the "missing zstd headers" problem I fixed with yet another lame patch while building 10.7, see MDEV-27619.
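Coming back to the asm goto problem: whether a given compiler accepts the construct can be checked in advance by feeding it a tiny test program. The sketch below is my illustration and not what the MariaDB build system does; the function name and the test program are assumptions.

```shell
#!/bin/sh
# supports_asm_goto [COMPILER]
# Return 0 if COMPILER (default: cc) accepts a minimal "asm goto" statement.
# Hypothetical helper; it relies on -x c and -fsyntax-only, which both
# clang and gcc understand, and reads the test program from stdin.
supports_asm_goto() {
  "${1:-cc}" -x c -fsyntax-only - >/dev/null 2>&1 <<'EOF'
int main(void)
{
  __asm__ goto("" : : : : done);
done:
  return 0;
}
EOF
}

if supports_asm_goto clang; then
  echo "clang: asm goto is supported"
else
  echo "clang: asm goto is NOT supported (or clang is not installed)"
fi
```

With the Apple clang shown earlier I would expect the second message, given the compiler error quoted above.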

After the usual mysql_install_db and startup, I ended up with a shiny new MariaDB 10.8.0 up and running on macOS 10.13.6:

Yuliyas-Air:maria10.8 Valerii$ bin/mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.8.0-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show variables like 'version%';
+-------------------------+------------------------------------------+
| Variable_name           | Value                                    |
+-------------------------+------------------------------------------+
| version                 | 10.8.0-MariaDB                           |
| version_comment         | MariaDB Server                           |
| version_compile_machine | x86_64                                   |
| version_compile_os      | osx10.13                                 |
| version_malloc_library  | system                                   |
| version_source_revision | c1cef1afa9962544de4840c9a796ae0a9b5e92e6 |
| version_ssl_library     | OpenSSL 1.1.1l  24 Aug 2021              |
+-------------------------+------------------------------------------+
7 rows in set (0.001 sec)

I even checked a few MTR test suites, and many tests pass:

Yuliyas-Air:mysql-test Valerii$ ./mtr --suite=rocksdb
Logging: ./mtr  --suite=rocksdb
VS config:
vardir: /Users/Valerii/dbs/maria10.8/mysql-test/var
Checking leftover processes...
 - found old pid 89161 in 'mysqld.1.pid', killing it...
   process did not exist!
Removing old var directory...
Creating var directory '/Users/Valerii/dbs/maria10.8/mysql-test/var'...
Checking supported features...
MariaDB Version 10.8.0-MariaDB
 - SSL connections supported
Using suites: rocksdb
Collecting tests...
Installing system database...
...
rocksdb.index_merge_rocksdb 'write_committed' [ pass ]    911
rocksdb.shutdown 'write_prepared'        [ pass ]   1895
rocksdb.index_merge_rocksdb 'write_prepared' [ pass ]    923
rocksdb.mariadb_misc_binlog 'write_committed' [ pass ]     43
rocksdb.mariadb_misc_binlog 'write_prepared' [ pass ]     43
...
rocksdb.issue495 'write_committed'       [ pass ]     33
rocksdb.partition 'write_committed'      [ fail ]
        Test ended at 2022-01-27 22:41:15

CURRENT_TEST: rocksdb.partition
mysqltest: At line 67: query 'ALTER TABLE t1 REBUILD PARTITION p0, p1' failed: ER_METADATA_INCONSISTENCY (4064): Table 'test.t1#P#p0#TMP#' does not exist, but metadata information exists inside MyRocks. This is a sign of data inconsistency. Please check if './test/t1#P#p0#TMP#.frm' exists, and try to restore it if it does not exist.

The result from queries just before the failure was:
< snip >
DROP TABLE IF EXISTS employees_hash_1;
DROP TABLE IF EXISTS t1_hash;
DROP TABLE IF EXISTS employees_linear_hash;
DROP TABLE IF EXISTS t1_linear_hash;
DROP TABLE IF EXISTS k1;
DROP TABLE IF EXISTS k2;
DROP TABLE IF EXISTS tm1;
DROP TABLE IF EXISTS tk;
DROP TABLE IF EXISTS ts;
DROP TABLE IF EXISTS ts_1;
DROP TABLE IF EXISTS ts_3;
DROP TABLE IF EXISTS ts_4;
DROP TABLE IF EXISTS ts_5;
DROP TABLE IF EXISTS trb3;
DROP TABLE IF EXISTS tr;
DROP TABLE IF EXISTS members_3;
DROP TABLE IF EXISTS clients;
DROP TABLE IF EXISTS clients_lk;
DROP TABLE IF EXISTS trb1;
CREATE TABLE t1 (i INT, j INT, k INT, PRIMARY KEY (i)) ENGINE = ROCKSDB PARTITION BY KEY(i) PARTITIONS 4;

More results from queries before failure can be found in /Users/Valerii/dbs/maria10.8/mysql-test/var/log/partition.log

 - saving '/Users/Valerii/dbs/maria10.8/mysql-test/var/log/rocksdb.partition-write_committed/' to '/Users/Valerii/dbs/maria10.8/mysql-test/var/log/rocksdb.partition-write_committed/'

Only  148  of 555 completed.
--------------------------------------------------------------------------
The servers were restarted 23 times
Spent 154.726 of 206 seconds executing testcases

Failure: Failed 1/27 tests, 96.30% were successful.

Failing test(s): rocksdb.partition

The log files in var/log may give you some hint of what went wrong.

If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html

69 tests were skipped, 2 by the test itself.

mysql-test-run: *** ERROR: there were failing test cases
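With a long run like this it helps to reduce the output to the counts and the names of the failed tests. A minimal sketch, assuming result lines in the format shown above; `mtr_summary` is a made-up helper and not part of mtr:

```shell
#!/bin/sh
# mtr_summary: read mtr output on stdin, count "[ pass ]" and "[ fail ]"
# result lines, and list the names of the failed tests.
# Hypothetical helper, not part of mysql-test-run.
mtr_summary() {
  awk '
    /\[ pass \]/ { pass++ }
    /\[ fail \]/ { fail++; failed = failed " " $1 }
    END {
      printf "pass=%d fail=%d\n", pass, fail
      if (fail) printf "failed:%s\n", failed
    }'
}

# Demo on lines copied from the run above.
mtr_summary <<'EOF'
rocksdb.shutdown 'write_prepared'        [ pass ]   1895
rocksdb.issue495 'write_committed'       [ pass ]     33
rocksdb.partition 'write_committed'      [ fail ]
EOF
```

For the three sample lines this prints `pass=2 fail=1` followed by `failed: rocksdb.partition`. On a real run it would be used as `./mtr --suite=rocksdb | mtr_summary`.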

But the real reason to build MariaDB server on macOS (other than "because I can") was not even RocksDB testing, but this:

Yuliyas-Air:mysql-test Valerii$ dtrace
Usage: dtrace [-aACeFHlqSvVwZ] [-arch i386|x86_64] [-b bufsz] [-c cmd] [-D name[=def]]
        [-I path] [-L path] [-o output] [-p pid] [-s script] [-U name]
        [-x opt[=val]]

        [-P provider [[ predicate ] action ]]
        [-m [ provider: ] module [[ predicate ] action ]]
        [-f [[ provider: ] module: ] func [[ predicate ] action ]]
        [-n [[[ provider: ] module: ] func: ] name [[ predicate ] action ]]
        [-i probe-id [[ predicate ] action ]] [ args ... ]

        predicate -> '/' D-expression '/'
           action -> '{' D-statements '}'

        -arch Generate programs and Mach-O files for the specified architecture

        -a  claim anonymous tracing state
        -A  generate plist(5) entries for anonymous tracing
        -b  set trace buffer size
        -c  run specified command and exit upon its completion
        -C  run cpp(1) preprocessor on script files
        -D  define symbol when invoking preprocessor
        -e  exit after compiling request but prior to enabling probes
        -f  enable or list probes matching the specified function name
        -F  coalesce trace output by function
        -h  generate a header file with definitions for static probes
        -H  print included files when invoking preprocessor
        -i  enable or list probes matching the specified probe id
        -I  add include directory to preprocessor search path
        -l  list probes matching specified criteria
        -L  add library directory to library search path
        -m  enable or list probes matching the specified module name
        -n  enable or list probes matching the specified probe name
        -o  set output file
        -p  grab specified process-ID and cache its symbol tables
        -P  enable or list probes matching the specified provider name
        -q  set quiet mode (only output explicitly traced data)
        -s  enable or list probes according to the specified D script
        -S  print D compiler intermediate code
        -U  undefine symbol when invoking preprocessor
        -v  set verbose mode (report stability attributes, arguments)
        -V  report DTrace API version
        -w  permit destructive actions
        -W  wait for specified process and exit upon its completion
        -x  enable or modify compiler and tracing options
        -Z  permit probe descriptions that match zero probes
Yuliyas-Air:mysql-test Valerii$

I remember how cool dtrace was from my time working for Sun. After the last re-install of Fedora I ended up with my Illumos VM gone and no recent FreeBSD VM either, so basically there was no dtrace at hand until I got this MacBook.

So, unlike MySQL 8.0, MariaDB server still contains USDT (DTrace probes) and I've built my version with them enabled. Let's quickly check how they can be used. There are a few sample D files in the source:

Yuliyas-Air:server Valerii$ ls support-files/dtrace/
locktime.d                      query-rowops.d
query-execandqc.d               query-time.d
query-filesort-time.d           statement-time.d
query-network-time.d            statement-type-aggregate.d
query-parse-time.d
Yuliyas-Air:server Valerii$ cat support-files/dtrace/query-time.d
#!/usr/sbin/dtrace -s
#
# Copyright (c) 2009 Sun Microsystems, Inc.
# Use is subject to license terms.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; version 2 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA
#
# Shows basic query execution time, who execute the query, and on what database

#pragma D option quiet

dtrace:::BEGIN
{
   printf("%-20s %-20s %-40s %-9s\n", "Who", "Database", "Query", "Time(ms)");
}

mysql*:::query-start
{
   self->query = copyinstr(arg0);
   self->connid = arg1;
   self->db    = copyinstr(arg2);
   self->who   = strjoin(copyinstr(arg3),strjoin("@",copyinstr(arg4)));
   self->querystart = timestamp;
}

mysql*:::query-done
{
   printf("%-20s %-20s %-40s %-9d\n",self->who,self->db,self->query,
          (timestamp - self->querystart) / 1000000);
}
Yuliyas-Air:server Valerii$

We can try to run it (with yet another hack to do that I leave to the curious reader who is really going to try this) and then execute some queries against MariaDB from another terminal:

Yuliyas-Air:server Valerii$ sudo support-files/dtrace/query-time.d
dtrace: system integrity protection is on, some features will not be available

Who                  Database             Query                                    Time(ms)
Valerii@localhost                         select @@version_comment limit 1         0
Valerii@localhost                         select 1+1                               0
Valerii@localhost                         select sleep(4)                          4004
Valerii@localhost                         select @@version_comment limit 1         0
Valerii@localhost                         create database sbtest                   1
Valerii@localhost    sbtest               CREATE TABLE sbtest1(
  id INTEGER NOT NULL AUTO_INCREMENT,
  k INTEGER DEFAULT '0' NOT NULL,
  c CHAR(120) DEFAULT '' NOT NULL,
  pad CHAR(60) DEFAULT '' NOT NULL,
  PRIMARY KEY (id)
) /*! ENGINE = innodb */ 91
Valerii@localhost    sbtest               INSERT INTO sbtest1(k, c, pad) VALUES(366941, '31451373586-15688153734-79729593694-96509299839-83724898275-86711833539-78981337422-35049690573-51724173961-87474696253', '98996621624-36689827414-04092488557-09587706818-65008859162'),(277750, '21472970079-7 181
Valerii@localhost    sbtest               INSERT INTO sbtest1(k, c, pad) VALUES(124021, '80697810288-90543941719-80227288793-55278810422-59841440561-49369413842-83550451066-12907725305-62036548401-86959403176', '65708342793-83311865079-53224065384-18645733125-16333693298'),(660496, '07381386584-5 31
Valerii@localhost    sbtest               INSERT INTO sbtest1(k, c, pad) VALUES(425080, '48883413333-17783399741-03981526516-97596354402-27141206678-83563692683-30244461835-25263435890-49140039573-28211133426', '81560227417-96691828090-72817141653-15106797886-43970285630'),(322421, '68618246702-7 33
...

to end up with a lightweight general query log with query execution time reported in milliseconds. Note some statements from, yes, sysbench test prepare :)

Nice view. macOS is also a nice and useful platform to run MariaDB server for fun :)

That's probably enough for a single post. Now go and get that from your MySQL 8 on any OS! USDTs rule!
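Since the time is always the last column, such a log is easy to filter for slow queries. A minimal sketch, assuming every query fits on one output line (multi-line statements like the CREATE TABLE above would be missed); `slow_queries` is a made-up helper name:

```shell
#!/bin/sh
# slow_queries MIN_MS: read query-time.d output on stdin and print only the
# lines whose last field (the Time(ms) column) is a number >= MIN_MS.
# Hypothetical helper; header, blank and non-numeric lines are skipped.
slow_queries() {
  awk -v min="$1" '$NF ~ /^[0-9]+$/ && $NF + 0 >= min + 0'
}

# Demo on lines copied from the output above.
slow_queries 1000 <<'EOF'
Who                  Database             Query                                    Time(ms)
Valerii@localhost                         select 1+1                               0
Valerii@localhost                         select sleep(4)                          4004
EOF
```

For the sample above only the `select sleep(4)` line is printed.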

To summarize:

  1. It is surely possible to build even the latest and greatest, still alpha, MariaDB 10.8 on macOS even as old as 10.13.6, and get a usable result.
  2. One of the main benefits of macOS is DTrace support.
  3. It is still possible to use the limited set of USDTs once added to MySQL (and since removed from MySQL 8.0) in current versions of MariaDB server.
  4. That rocksdb.partition test failure is to be studied and maybe reported as an MDEV.
  5. DTrace is surely more capable than what those examples in the source code show. More DTrace-related posts are coming this year. Stay tuned!

Saturday, April 4, 2020

Fun with Bugs #96 - On MySQL Bug Reports I am Subscribed to, Part XXX

My weekdays are still busy even during this lockdown period, but weekend is a perfect time for yet another blog post about MySQL bugs I considered interesting. Very few followers read my posts on other topics anyway, especially if they have the "M....DB" word...

So, here is a new list of optimizer, replication, InnoDB and a few other bugs in GA versions of MySQL that I've subscribed to since the end of February, 2020:
  • Bug #98675 - "LIMIT clause in derived table garbles EXPLAIN Note". It's not a big deal, and in the specific case presented by Øystein Grøvlen the optimizer works properly (note text aside), but some clarifications to the note and/or documentation would still help. The comments are useful to read and show a great example of proper cooperation from the Oracle side.
  • Bug #98719 - "Parallel CREATE TABLE statement binlogged in a wrong order". This bug report from Song Libing shows that Oracle engineers readily accept and verify bug reports where some code modification (adding debug sync point, for example) is needed to demonstrate the problem with a repeatable, deterministic test case. I've seen other bug reports where code modification was considered a problem and bug was not verified immediately as a result. Note that this bug seems to be not repeatable on MySQL 8.0.19 and this was explicitly tested, with test results shared in public. Good job, Umesh Shastry!
  • Bug #98734 - "Same digest for different queries if we use integer value in ORDER BY caluse". Different column numbers in ORDER BY often cause totally different execution plans, so assuming these queries are the same ("ORDER BY ?") for the purpose of digesting is misleading. Moreover, as noted by Lalit Choudhary from Percona, with column names instead of numbers such queries are not considered the same, so Performance Schema has to be more consistent.
  • Bug #98739 - "TempTable storage engine slow if mmap is required". Take care when large temporary tables are used for your queries in MySQL 8.0.x. By default (without temptable_use_mmap = OFF) when the table is larger than temptable_max_ram you may notice a very slow query execution. Nice finding by Marcelo Altmann. This is not the only related performance regression I've seen reported recently. Looks like TempTable storage engine problems are the real main topic of this post!
  • Bug #98750 - "SHOW FIELDS command regression on MySQL 8.0". This performance regression bug was reported by Juan Arruti. See also similar Bug #92195 - "checking permissions 90 time" that was wrongly(!) declared "Not a bug" back in 2018 just because of the way the problem was demonstrated. This time Przemyslaw Malkowski helped to make the point based on Performance Schema instrumentation, so there was no other option but to accept this as a real performance regression bug. Take care if you use SHOW statements with MySQL 8!
  • Bug #98782 - "Using TempTable engine for GROUP BY is slower than using MEMORY engine". In this bug report Øystein Grøvlen demonstrated that MEMORY engine is about 10% faster for temporary tables in MySQL 8. Make sure to use internal_tmp_mem_storage_engine=MEMORY if you care about performance.
  • Bug #98869 - "Temp ibt tablespace truncation at disconnection stuck InnoDB under large BP". Bug reporter, Fungo Wang, used different methods, from pt-pmp to perf and other OS-level tools, to show the performance impact of the regression probably introduced by WL#11613. Make sure to check all comments that point to other bugs and problems.
  • Bug #98974 - "InnoDB temp table could hurt InnoDB perf badly". Yet another bug report by Fungo Wang. This time it took a lot of time and effort from the bug reporter and many MySQL Community members (including me) to get this bug properly processed, even though it started with a detailed source code analysis, properly described test case, stack traces analysis and perf profiling details shared. The ideal bug report got far from ideal treatment, unfortunately.
  • Bug #98976 - "Case of referenced column of foreign key not corrected on import". Tim Düsterhus found that in versions with the fix for Bug #88718 - "Foreign key is always in lower case" the case is not always correct in the output of subsequent mysqldump.
  • Bug #98980 - "A state inside mysql_real_connect_nonblocking still blocks". This bug report by Jay Edgar was verified surprisingly fast. This is why it ended up in my list.
  • Bug #98990 - "avg_count_reset for monitor set owner is always NULL in I_S.INNODB_METRICS ". In this case Fungo Wang had not only found a bug, but also provided a patch that was accepted by Oracle after signing the OCA.
  • Bug #99006 - "GTID Synchronisation speed when doing START SLAVE". Simon Mudd noted that in some cases with GTIDs START SLAVE may take a lot of time:
    root@myhost [(none)]> start slave;
    Query OK, 0 rows affected (3 min 17.74 sec)
    and what's worse nothing is logged to the error log in the process to show the reason, progress or anything useful. Probably it's expected that with many binary logs finding proper GTID in them takes time, but some feedback would be useful. The bug does not have any clear public test case and is still under analysis.
  • Bug #99010 - "The mtr case --send command not use ps-protocol when run with --ps-protocol ". Does not sound like a big deal for anyone but developers who write MTR test cases, but in this report Ze Yang had provided source code analysis and quite detailed "how to repeat" steps, and still got useless requests for additional feedback and nothing else till today. This is, again, unfortunate.
  • Bug #99039 - "PSI thread info not getting updated after statement execution". Who could imagine that regression bugs may be introduced even into Performance Schema? But Pranay Motupalli found one introduced into 8.0.18, 5.7.28+ and 5.6.47+ by this commit! I hope to see it fixed soon.
  • Bug #99051 - "XA commit may do engine commit before MYSQL_BIN_LOG::ordered_commit". XA transactions are one of my favorite weak areas of MySQL. In this bug report Dennis GAO described a case when an XA COMMIT operation may do an engine commit before the binlog flush on Ubuntu 18.04. He had contributed the fix based on the assumption that the best way is to ensure the sequence in the plugin_hash, so that the binlog plugin should always be before all the transactional engine plugins. I only hope that one day a long list of XA bugs will be fixed in one of the forks if not in MySQL itself.
This is my favorite point of view this year.
To summarize:
  1. I see some really good examples of bugs verification by Oracle engineers recently. All supported versions are tested, nobody tries to argue against the bug reporter approach used to demonstrate the problem, even if it includes source code modifications. Looks really promising.
  2. Even ideal bug reports sometimes are not processed properly without extra efforts from MySQL Community, unfortunately.
  3. Percona engineers still contribute a lot to MySQL with both new bug reports and useful comments and clarifications. I hope Oracle appreciates that.
  4. Looks like code review and regression testing in Oracle still may benefit from some improvements, as we still see new regression bugs...

Sunday, August 4, 2019

Fun with Bugs #87 - On MySQL Bug Reports I am Subscribed to, Part XXI

After a 3 months long break I'd like to continue reviewing MySQL bug reports that I am subscribed to. This issue is devoted to bug reports I've considered interesting to follow in May, 2019:
  • Bug #95215 - "Memory lifetime of variables between check and update incorrectly managed". As demonstrated by Manuel Ung, there is a problem with all InnoDB MYSQL_SYSVAR_STR variables that can be dynamically updated. Valgrind helps to highlight it.
  • Bug #95218 - "Virtual generated column altered unexpectedly when table definition changed". This weird bug (that does not seem to be repeatable on MariaDB 10.3.7 with proper test case modifications like removing NOT NULL and collation settings from virtual column) was reported by Joseph Choi. Unfortunately we do not see any documented attempt to check if MySQL 8.0.x is also affected. My quick test shows MySQL 8.0.17 is NOT affected, but I'd prefer to see the check copy/pasted as a public comment to the bug.
  • Bug #95230 - "SELECT ... FOR UPDATE on a gap in repeatable read should be exclusive lock". There are more chances to get a deadlock with InnoDB than one might expect... I doubt this report from Domas Mituzas is a feature request. It took him some extra effort to insist on the point and get it verified even as S4.
  • Bug #95231 - "LOCK=SHARED rejected contrary to specification". This bug report from Monty Solomon ended up as a documentation request. The documentation and the implementation are not aligned, and it was decided NOT to change the parser to match documented syntax. But why is it still "Verified" then? Should it take months to correct the fine manual?
  • Bug #95232 - "The text of error message 1846 and the online DDL doc table should be updated". Yet another bug report from Monty Solomon. Some (but not ALL) partition specific ALTER TABLE operations do not yet support LOCK clause.
  • Bug #95233 - "check constraint doesn't consider IF function that returns boolean a boolean fun". As pointed out by Daniel Black, IF() function in a check constraint isn't considered a boolean type. He had contributed a patch to fix this, but based on comments it's not clear if it's going to be accepted and used "as is". The following test shows that MariaDB 10.3 is not affected:
    C:\Program Files\MariaDB 10.3\bin>mysql -uroot -proot -P3316 test
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 9
    Server version: 10.3.7-MariaDB-log mariadb.org binary distribution

    Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    MariaDB [test]> create table t1 (source enum('comment','post') NOT NULL, comment_id int unsigned, post_id int unsigned);
    Query OK, 0 rows affected (0.751 sec)
    MariaDB [test]> alter table t1 add check(IF(source = 'comment', comment_id IS NOT NULL AND post_id IS NULL, post_id IS NOT NULL AND comment_id IS NULL));
    Query OK, 0 rows affected (1.239 sec)
    Records: 0  Duplicates: 0  Warnings: 0
  • Bug #95235 - "ABRT:Can't generate a unique log-filename binlog.(1-999), while rotating the bin". Yet another bug report from Daniel Black. When MySQL 8.0.16 is built with gcc 9.0.x, an abort is triggered in the MTR suite on the binlog.binlog_restart_server_with_exhausted_index_value test.
  • Bug #95249 - "stop slave permanently blocked". This bug was reported by Wei Zhao, who had contributed a patch.
  • Bug #95256 - "MySQL 8.0.16 SYSTEM USER can be changed by DML". MySQL 8.0.16 had introduced a new privilege, SYSTEM_USER. MySQL manual actually says:
    "The protection against modification by regular accounts that is afforded to system accounts by the SYSTEM_USER privilege does not apply to regular accounts that have privileges on the mysql system schema and thus can directly modify the grant tables in that schema. For full protection, do not grant mysql schema privileges to regular accounts."
    But the report from Zhao Jianwei, about what a user with a privilege to execute DML on the mysql.GLOBAL_GRANTS table can do, was accepted and verified. I hope Oracle engineers will finally make up their mind and decide either to fix this or to close this report as "Not a bug". I've subscribed in the hope of some fun around this decision making.
  • Bug #95269 - "binlog_row_image=minimal causes assertion failure". This assertion failure happens in debug build when one of standard MTR test cases, rpl.rpl_gis_ddl or rpl.rpl_window_functions is executed with --binlog-row-image=minimal option. In such cases I always wonder what is the reason for a failure NOT to be noted by Oracle MySQL QA and somehow fixed before Community users notice it? Either they don't run tests on debug builds with all possible combinations, or do not care to fix such failures (and thus should suffer from known failures in other test runs). I do not like any of these options, honestly. The bug was reported by Song Libing.
  • Bug #95272 - "Potential InnoDB SPATIAL INDEX corruption during root page split". This bug was reported by Albert Hu based on a Valgrind report when running the test innodb.instant_alter. Do they run MTR tests under Valgrind or on ASan builds in Oracle? I assume they do, but then why are Community users reporting such cases first? Note that the related MariaDB bug, MDEV-13942, is fixed in 10.2.24+ and 10.3.15+.
  • Bug #95285 - "InnoDB: Page [page id: space=1337, page number=39] still fixed or dirty". This assertion failure that happens during normal shutdown was reported by LUCA TRUFFARELLI. There are chances that this is a regression bug (without a regression tag), as it does not happen for the reporter on MySQL 5.7.21.
  • Bug #95319 - "SHOW SLAVE HOST coalesces certain server_id's into one". This bug was reported by Lalit Choudhary from Percona based on original findings by Glyn Astill.
  • Bug #95416 - "ZERO Date is both NULL and NOT NULL". This funny bug report was submitted by Morgan Tocker. The manual actually explains that it's intended behavior (MariaDB 10.3.7 works the same way as MySQL), but it's still funny and unexpected, and the bug report remains "Verified".
  • Bug #95478 - "CREATE TABLE LIKE does not honour ROW_FORMAT." I'd like to add "...when it was not defined explicitly for the original table". The problem was reported by Jean-François Gagné and ended up as a verified feature request. See also my post on the details of where row_format is stored and is not stored for InnoDB tables...
  • Bug #95484 - "EXCHANGE PARTITION works wrong/werid with different ROW_FORMAT". Another bug report by Jean-François Gagné related to the previous one. He had shown that it's actually possible to get partitions with different row formats in the same InnoDB table in MySQL 5.7.26, but not in the most natural way. It seems the problem may be fixed in 5.7.27 (by the fix for another, internally reported bug), but the bug remains "Verified".
There are some more bugs reported in May 2019 that I was interested in, but let me stop for now. Later in May I got a chance to spend some days off in Barcelona, without a single MySQL bug report opened for days.

I like this view of Barcelona way more than any MySQL bugs review, including this one.
To summarize:
  1. Oracle engineers who process bugs still sometimes do not care to check if all supported major versions are affected and/or share the results of such checks in public. Instead, some of them care to argue about severity of the bug report, test case details etc.
  2. We still see bug reports that originate from runs of existing, well known MTR test cases under Valgrind or in debug builds with some non-default options set. I do not have any good reason in mind to explain why these are NOT reported by Oracle's internal QA first.
  3. Surely some regression bugs still get verified without the regression tag added.
I truly hope my talk "Problems with Oracle's Way of MySQL Bugs Database Maintenance" will be accepted for Percona Live Europe 2019 conference (at least as a lightning talk) and I'll get another chance to speak about the problems highlighted above, and more. There are some "metabugs" in the way Oracle handles MySQL bug reports, and these should be discussed and fixed, for the benefit of MySQL quality and all MySQL users and customers.

Sunday, May 19, 2019

MySQL Support Engineer's Chronicles, Issue #9

My previous post from this series was published more than 1.5 years ago. I had never planned to stop writing about my everyday work on a regular basis, but sometimes it's not easy to pick something really interesting for a wider MySQL audience, and when in doubt I always prefer to write about MySQL bugs...

In any case, any long way starts from the first step, so I decided to write one post in this series per week and try to summarize in it whatever findings, questions, discussions, bugs and links I've collected over the week. My work experience differs week after week, so some of these posts may be boring or less useful, but I still want to try to create them on a regular basis.

I was working on an (upcoming) blog post (inspired by one customer issue) on the impact of the innodb_default_row_format setting on importing tablespaces (and related checking of the row format really used in both .frm and .ibd files) and found the FSP Header description in this old post by Jeremy Cole useful for further checks in the InnoDB source code. MySQL manual is not very informative (and MariaDB KB page is just wrong/incomplete) when describing flags for the table or tablespace, unfortunately, so I've reported MDEV-19523 to get this improved.

If you ever wonder what MariaDB plans to do with InnoDB in the future, please, check MDEV-11633 among other sources.

This week we in Support got a customer (on MySQL 8.0.x) complaining that they can not start the server any more on Windows 10 after moving datadir to another drive. Check this blog post by my colleague Nil on the reason, explanations and way to fix/prevent this from happening. This is one of those cases when MySQL Forums give a useful hint.

If you build MariaDB (and MySQL) from source on a regular basis (as I do), you may wonder at times how to disable some storage engine plugin at build time (for example, NOT to be affected by some temporary bugs in it when you do not really need it for testing or production use). Save this as a hint:
-DPLUGIN_MROONGA=NO
This is what you have to add to the cmake command line to prevent building Mroonga, for example. The same approach applies to TokuDB etc. See this KB page also for more details.

I never noted before that "Explain Analyzer" service exists at mariadb.org, but it seems some customers use it and even prefer to share its output instead of plain text EXPLAIN. Just copy/paste any EXPLAIN ...\G there and decide if the result is useful. For Support purposes and queries accessing less than 10 tables or so I'd prefer usual text output.

Yet another public service at mariadb.org I noted this week by pure chance is "MariaDB CI" page with buildbot status and ways to check what is building now, what failed etc. MariaDB Foundation works in a true open manner at all levels.

If you ever care to find out which exact versions of MariaDB (or MySQL) contain a specific commit, you can do so with the git tag --contains commit_hash command.
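
To make that hint concrete, here is a minimal sketch that runs in a throwaway repository rather than in a real MariaDB or MySQL clone. The tag names (v1.0, v1.1, v1.2) and commit messages are made up for illustration; in a real source tree you would run only the git tag --contains line with the hash you care about.

```shell
# Demo in a temporary repository; tags and commits here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Identity is passed inline so the demo does not depend on global git config.
commit() {
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}

commit "first change"
git tag v1.0
commit "the fix"
fix=$(git rev-parse HEAD)      # the commit we want to track
git tag v1.1
commit "later change"
git tag v1.2

# List every tag whose history includes the fix.
containing=$(git tag --contains "$fix")
echo "$containing"
```

Tags created before the commit (v1.0 here) are not listed, so the first tag in the output is the earliest release that includes the change.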

I still do not care about Kubernetes at all, but it seems customers start to use it in production, so here is the hint for myself on how to run a specific command in a running container:
kubectl exec -it <pod> --container <container> -- vi grastate.dat
I may have to write or speak about some details of MySQL and MariaDB architecture soon, so I was looking for related pictures and texts. I found useful details in the following places:
If you are interested in different storage engines and efficiency of indexing, check this blog post by Mark Callaghan

Last but not least, I've nominated the following bugs:
  • Bug #95269 - "binlog_row_image=minimal causes assertion failure". I really wonder why this combination was missed in any regular testing of debug builds (that I hope Oracle does).
  • Bug #90681 - "MySQL 8.0 fails to install and start from Oracle .debs on debian 9 x86_64". It seems proper documentation is missing for users to know what conflicting packages to remove, what paths to clean up (if any) etc. Maybe this is no longer a concern (I do not use Oracle .deb packages, so I don't know), but in any case having this bug just "Open" helps nobody.
  • Bug #87312 - "Test main.events_time_zone is fundamentally unstable". It's even more strange to see this bug report about unstable test case "Open" for more than 2 years. Is it really hard to run MTR many times or check the code and improve, or just agree to disable it?
  • Bug #95411 - "LATERAL produces wrong results (values instead of NULLs) on 8.0.16". This regression bug in optimizer of 8.0.16 (vs 8.0.14) leads to wrong results, but so far nobody cared to verify it (even though it has simple and clear "How to repeat" instructions). This is sad.
for bug of the day on Twitter this week. I've also participated in a discussion there. As a result I ended up reading some recent MEB 8.0 manual pages (and more here). MySQL Enterprise Backup really provides a lot of potentially useful options that mariabackup may benefit from one day...

I spent the first two weeks of May properly last year, on vacation in the UK. Battersea Park here.
That's more or less all I had written down for further review this week that I am ready to share. Stay tuned for what may come up next week!

Sunday, November 4, 2018

On New Severity Levels for MySQL Bugs

Four weeks ago while working on a blog post about half baked XA transactions feature of MySQL server I've noted that there are new severity levels added by Oracle for MySQL bug reports. Previously we had 5 levels:

  • S1 (Critical) - mostly for all kinds of crashes, DoS attack vectors, data corruptions etc
  • S2 (Serious) - mostly for wrong results bugs, broken replication etc
  • S3 (Non-critical) - all kinds of minor but annoying bugs, from unexpected results in some corner cases to misleading or wrong error messages, inefficient or unclear code etc
  • S4 (Feature requests) - anything that should work or be implemented based on common sense, but is not documented in the manual and was not required by the original specification or implementation of some feature.
  • S5 (Performance) - everything works as expected and documented, but the resulting performance is bad or less than expected. Something does not scale well, doesn't return results fast enough in some cases, or could be made faster on some specific platform using some different code or library. This severity level was also probably added at Oracle times, at least it was not there in 2005 when I started to work on MySQL bugs.

Informal descriptions above are mine and may be incorrect or different from definitions Oracle engineers currently use. I tried to search for Oracle definitions that apply to MySQL, but was not able to find anything immediately useful (any help with public URL is appreciated). 

In general, severity is defined as the degree of impact a bug has on the operation or use of some software, so less severity assumes less impact on common MySQL operations. One may also expect that bugs with higher severity are fixed first (have higher internal priority). It may not be that simple (and was not during my days in MySQL, when many more inputs were taken into account while setting priority for the bug fix), but it's a valid assumption for any community member.

By default when searching for bugs you got all bugs of severity levels S1, S2, S3 and S5. You had to specifically care to get feature requests in search results while using bugs database search interface.

If you try to search bugs today, you'll see two more severity levels added, S6 (Debug Builds) and S7 (Test Cases):

Now we have 7 Severity levels for MySQL bug reports
S6 severity level seems to be used for assertion failures and other bugs that affect only debug builds and can not be reproduced literally with non-debug binaries. S7 severity level is probably used for bug reports about failing MTR test cases, assuming that failure does NOT show a regression in MySQL software, but rather some non-determinism, platform dependencies, timing assumptions or other defects of the test case itself.

By default bug reports with these severity levels are NOT included in search (they are not considered "Production Bugs"). So, one has to care to see them. This, as well as the normal common sense based assumption that lower severity eventually means lower priority for the fix, caused some concerns. It would be great for somebody from Oracle to explain the intended use and reasons for introduction of these severity levels with some more text than a single tweet, to clarify possible FUD people may have. If applied formally, these new severity values may lead to low priority for quite important problems. Most debug assertions are in the code for really good reason, as many weird things (up to crashes and data corruption) may happen in non-debug binaries somewhere later in cases when a debug-only assertion fails.

I was surprised to find out that at the moment we have 67 active S6 bug reports, and 32 active S7 bug reports. The latter list obviously includes reports that should not be S7, like Bug #92274 - "Question of mysql 8.0 sysbench oltp_point_select test" that is obviously about a performance regression noted in MySQL 8 (vs MySQL 5.7) by the bug reporter.

Any comments from Oracle colleagues on the reasons to introduce new severity levels, their formal definitions and impact on community bug reports processing are greatly appreciated.

Saturday, September 10, 2016

Fun with Bugs #45 - On Some Bugs Fixed in MySQL 5.7.15

Oracle released MySQL 5.7.15 recently, earlier than expected. The reason for this "unexpected" release is not clear to me, but it could happen because of a couple of security related internal bug reports that got fixed:

  • "It was possible to write log files ending with .ini or .cnf that later could be parsed as option files. The general query log and slow query log can no longer be written to a file ending with .ini or .cnf. (Bug #24388753)
  • Privilege escalation was possible by exploiting the way REPAIR TABLE used temporary files. (Bug #24388746)"
Let me concentrate on the most important fixes to bugs and problems reported by Community users. First of all, in MySQL 5.7.15 one can just turn off InnoDB deadlock detection using the new innodb_deadlock_detect dynamic server variable. Domas had explained the positive effect of this more than 6 years ago in his post. Some improvements to the way deadlock detection worked in MySQL happened as part of the fix for Bug #49047 a long time ago, but this time Oracle just implemented a way to disable the check and rely on the InnoDB lock wait timeout instead.

Other InnoDB-related fixes to problems reported in public bugs database include:
  • Bug #82073 - "Crash with InnoDB Encryption, 5.7.13, FusionIO & innodb_flush_method=O_DIRECT". It was reported by my colleague from MariaDB, Chris Calender, and verified by another colleague of mine from MariaDB, Jan Lindström. Probably the Bugs Verification Team in Oracle just had no access to proper hardware to verify this.
  • Bug #79378 - "buf_block_align() makes incorrect assumptions about chunk size". This bug was reported by Alexey Kopytov, who had provided a patch.
There were several fixes to replication-related bugs:
  • Bug #81675 - "mysqlbinlog does not free the existing connection before opening new remote one". It was reported by Laurynas Biveinis from Percona, who had also provided a patch, and verified by Umesh.
  • Bug #80881 - "MTR: binlog test suite failed to cleanup (contribution)". This fix to the binlog test suite was contributed by Daniel Black and verified by Umesh.
  • Bug #79867 - "unnecessary using temporary for update". This bug was reported by Zhang Yingqiang, who had also contributed a patch (that was not used after all, according to the comment from Oracle developer). It was verified by Umesh.
Some more bugs from other categories were also fixed:
  • Bug #82125 - "@@basedir sysvar value not normalized if set through the command line/INI file". It was reported by Georgi Kodinov from Oracle. It's funny that there is a typo in the release notes when this fix is described (pay attention to slashes):
    "If the basedir system variable was set at server startup from the command line or option file, the value was not normalized (on Windows, / was not replaced with /)"
  • Bug #82097 is private. I can not say anything about it in addition to this:
    "kevent statement timer subsystem deinitialization was revised to avoid a mysqld hang during shutdown on OS X 10.12."
    I can repeat, though, my usual statement that in most cases making bugs private is a wrong thing to do. I feel personally insulted every time I see that a fixed bug report remains private.
  • Bug #81666 - "The MYSQL_SERVER define not defined du to spelling error in plugin.cmake". It was reported by Magnus Blåudd, who had also provided a patch.
  • Bug #81587 - "Combining ALTER operations triggers table rebuild". This bug was reported by Daniël van Eeden and verified by Umesh.
  • Bug #68972 - "Can't find temporary table". This bug (that could happen in a stored procedure or when prepared statements are used) was reported by Cyril Scetbon and verified by Miguel Solorzano.
  • Bug #82019 - "Is client library supposed to retry EINTR indefinitely or not". It was reported by Laurynas Biveinis from Percona, who had also contributed patches later. This bug was verified formally by Sinisa Milivojevic.
To summarize, you should consider upgrading to MySQL 5.7.15 for sure if you use FusionIO or want to be able to disable InnoDB deadlock detection entirely, or if you consider security-related fixes in this release really important (I don't). Otherwise just check other fixes that could impact you positively, or just wait for 5.7.16...

Saturday, June 15, 2013

Fun with Bugs #10 - recently reported bugs affecting MySQL 5.6.12

MySQL 5.6.12 has been available to the community for more than a week already, so people started to test and use it. And, no wonder, new bug reports started to appear. Let's concentrate on them in this issue.

I'd like to start with a funny one.  Bug #69413 had scared some of my Facebook readers to death, as we see kernel mutex mentioned clearly in the release notes for 5.6.12. What, kernel mutex comes back again? No, it's just a result of null merge and, probably, copy/paste from the release notes for 5.5.32.

It seems recent bug reports for 5.6.12 are mostly related to small details that may not be of any importance to a typical user. For example, Bug #69419, reported by my colleague almost immediately after the release, questions the way mtr is used in the release process. A change related to the fix for another bug had broken a few tests, but it seems the tests were neither updated nor temporarily disabled. This is strange, at best, and can mean many things (from a simple mistake to "nobody cares", to a switch to some other tools for internal regression testing).

"Nobody cares" does NOT apply though, as during this week Shane Bester had reported 2 public bugs related to potential performance improvements possible in 5.6.12. Check Bug #69420 and Bug #69422. Looks like he tries to find and eliminate reasons for even less than smallest slowdown in benchmarks.

He is not the only one. Check Bug #69451. Even the smallest chunk of redundant code can not hide these days from careful users...

One topic for bug reports is ages old: MySQL still does not use proper data types for integers in many parts of the code. Bug #69431 from Shane is one of the recent examples. Bug #69469 (that is more or less a duplicate of Bug #69249 reported for 5.6.11 a month ago) is another one, but related to a new feature introduced in 5.6. It seems that the topic is valid for new code as much as for the older code that Monty and Sinisa were reviewing a decade ago. Let's hope that for MySQL 5.7 GA a review of the entire code base is planned, with the aim to find and fix all problems of this kind (among others).

Unfortunately it's not only about minor and cosmetic things. If you use raw devices with InnoDB and plan to upgrade to 5.6, check Bug #69424. It's not yet verified, and the previous bug of this kind, Bug #68860, was set to "Not a bug" in two days... But, well, how should one upgrade with an existing raw device containing data, when the code of the srv_file_check_mode() function clearly says:

/*********************************************************************//**
Check if a file can be opened in read-write mode.
@return true if it doesn't exist or can be opened in rw mode. */
static
bool
srv_file_check_mode(
/*================*/
        const char*     name)           /*!< in: filename to check */
{
        os_file_stat_t  stat;

        memset(&stat, 0x0, sizeof(stat));

        dberr_t         err = os_file_get_status(name, &stat, true);

        if (err == DB_FAIL) {

                ib_logf(IB_LOG_LEVEL_ERROR,
                        "os_file_get_status() failed on '%s'. Can't determine "
                        "file permissions", name);

                return(false);
 

        } else if (err == DB_SUCCESS) {

                /* Note: stat.rw_perm is only valid of files */

                if (stat.type == OS_FILE_TYPE_FILE) {
                        if (!stat.rw_perm) {

                                ib_logf(IB_LOG_LEVEL_ERROR,
                                        "%s can't be opened in %s mode",
                                        name,
                                        srv_read_only_mode
                                        ? "read" : "read-write");

                                return(false);
                        }
                } else {
                        /* Not a regular file, bail out. */

                        ib_logf(IB_LOG_LEVEL_ERROR,
                                "'%s' not a regular file.", name);

                        return(false);
                }
        } else {

                /* This is OK. If the file create fails on RO media, there
                is nothing we can do. */

                ut_a(err == DB_NOT_FOUND);
        }

        return(true);
}


That is, if the file is not a regular file we unconditionally return false, and as soon as this function returns false, in all places where it is used we just assume an error. (I have to check this myself eventually as I have no raw device at hand for an immediate test, but code like this is not present in MySQL 5.5, so it seems the good old manual page just can not be used any more.)
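
The same three outcomes can be illustrated at the shell level. This is only an analogy under my own assumptions, not InnoDB code: check_data_file is a hypothetical helper that mirrors the branches of the function above using the standard test operators, and /dev/null (a character device) stands in for a raw device that I do not have at hand.

```shell
# Hypothetical helper mirroring the branches of srv_file_check_mode();
# it is a sketch of the logic, not a reimplementation of InnoDB's checks.
check_data_file() {
    path=$1
    if [ ! -e "$path" ]; then
        # Like the DB_NOT_FOUND branch: a missing file is fine, it may be created later.
        echo "ok: does not exist yet"
    elif [ -f "$path" ]; then
        # Regular file: accepted only if it can be read and written.
        if [ -r "$path" ] && [ -w "$path" ]; then
            echo "ok: regular file, read-write"
        else
            echo "error: can't be opened in read-write mode"
        fi
    else
        # Block and character devices (that is, raw devices) always end up here.
        echo "error: not a regular file"
    fi
}

check_data_file /dev/null
```

Any device node takes the last branch no matter what permissions it has, which is exactly why an existing raw device with data looks like a problem for this check.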

It seems Oracle MySQL engineers should pay more attention to testing upgrade procedures (and reading community bug reports). Even if eventually this may not be the case, currently community QA efforts (and public bugs database) are still important and sometimes lead to findings that seem new and unexpected to Oracle MySQL engineers.



Another serious enough bug among those recently reported and verified, Bug #69444, is related to replication. It seems replication is not really crash safe when a DDL statement is involved. Potentially, when a crash happens at the "wrong" time, the DDL is going to be executed again upon slave restart.

That's all for now. MySQL 5.6.12 is going to be the best release ever for 6+ more weeks it seems, so we all have plenty of time to check it and contribute to public bugs database...