Saturday, July 30, 2016

Fun with Bugs #44 - Community Bugs Fixed in MySQL 5.7.14

MySQL 5.7.14 was officially released yesterday. So, it's time to check what bugs reported by MySQL Community in public were fixed in this release. Some of these bugs are presented below.

As usual, let me start with InnoDB. The following bugs were fixed there:
  • Bug #80296 - "FTS query exceeds result cache limit". It was reported (for 5.6, but I do not see new 5.6 release notes yet) by Monty Solomon and verified by Umesh.
  • Bug #80304 - "generated columns don't work with foreign key actions". It was reported by Guilhem Bichot based on test case by Peter Gulutzan presented here.As most community bug reports during last 2-3 years, it was verified by Umesh.
  • Bug #80298 - "Full-Text queries with additional secondary index gives NULL or Zero rows", was reported by Ray Lambe and verified by Umesh.
  • Bug #76728 - "reduce lock_sys->mutex contention for transaction that only contains SELECT". This old bug report by Zhai Weixiang (who had provided a patch) was verified by Sinisa Milivojevic.
  • Bug #80083 - "Setting innodb_monitor_enable to ALL does not enable all monitors". It was reported by Davi Arnaut and verified by Miguel Solorzano.
  • Bug #79772 - "Foreign key not allowed when a virtual index exists". It was reported and verified by Jesper wisborg Krogh from Oracle.
There are many more bugs fixed in InnoDB, but all of them were reported in internal Oracle's bugs database by Oracle employees. I do not like this trend.

Now, let's check replication bugs that were fixed:
  • Bug #79324 - "Slave is ~10x slower to execute set of statements compared to master RBR", was reported by Serge Grachov and verified by Umesh.
  • Bug #62008 - "read-only option does not allow inserts/updates on temporary tables". This bug was reported long time ago by Ivan Stanojevic and verified by me when I worked in Oracle.It's really good to see it fixed now!
Some bugs were fixed in Performance_Schema (who could imagine it has bugs...), but they were either reported internally or remain private, like Bug #81464. Just take into account that SELECT from some P_S tables could crash server before 5.7.14, based on release notes...

This time I see several build-related bugs fixed, like these:
  • Bug #81274 - "Add support for Solaris Studio 12.5 aka 5.14". It was reported (and probably fixed) by Tor Didriksen.
  • Bug #81593 - "adapt to gcc 5.3 on solaris". It was also reported and fixed by Tor Didriksen. personally I am happy to see that Oracle still cares about Solaris and related software. Historical sentiments...
  • Bug #80996 - "correct make_pair for c++11 (contribution)". This fix was contributed by Daniel Black
  • Bug #80371 - "MySQL fails to build with new default mode in GCC6". It was reported by Terje Røsten.
The last but not the least, I also have to mention this bug in audit (and, thus, query rewrite) plugins, Bug #81298 - "query rewrite plugin suffers scalability issues". It was reported by Vadim Tkachenko and verified by Sinisa Milivojevic. This is a great improvement.

To summarize, I see reasons to upgrade for those who rely a lot on FTS indexes in InnoDB, replication, audit plugins and Performance_schema. I had not even tried to build 5.7.14 from source yet, so I do not have any personal experience to share.

Sunday, June 5, 2016

Fun with Bugs #43 - Bugs Fixed in MySQL 5.7.13

For some reason I do not see numerous public announcements about recent MySQL 5.7.13 release, even though it happened a couple of days ago formally. Maybe that's because we do not have any really "big" new features in this release (new ->> operator for introduced though for those who, unlike me, care about JSON). Still there are many fixes for bugs previously reported by Community (or Oracle engineers, but still in public bugs database), and I'd like to discuss some of them.

Let me start with the bug that was NOT fixed though, Bug #81093, "mysqld_multi not included in mysql-community-server package", reported by Georgi Iovchev and verified by Umesh. For this bug we had not got any fix, but explanation of this kind:
"On platforms for which systemd support is installed, systemd has the capability of managing multiple MySQL instances. For details, see Configuring Multiple MySQL Instances Using systemd. Consequently, mysqld_multi and mysqld_multi.server are not installed because they are unnecessary."
I still think that removing familiar scripts like mysqld_safe or mysqld_multi from systems where they are not used by default to start services upon OS startup is a mistake that has to be corrected.

I'd like to highlight Bug #80299, "DEFAULT is not deterministic and should not be allowed in generated columns", also. It was reported by Roy Lyseng and verified by Umesh.The fix is not those bug's synopsis suggested. Instead, ALTER TABLE operations that change the definition of a column now cause a table rebuild if any generated column expression uses DEFAULT().

Now let's concentrate on InnoDB bugs fixed:
  • Bug #80898 - "Replication stops after transaction is rolled back asynchronously in master". It was reported by Debarun Banerjee (who probably works on group replication feature in Oracle). I wonder how this fix may influence Galera-based cluster, especially PXC 5.7 from Percona that may become available as GA some day. Something to check later.
  • Bug #80772 - "Excessive memory used in memory/innodb/os0file starting 5.7.8". This regression bug was reported by Morgan Tocker and verified by Umesh.
  • Bug #80761 - "Server crashed in mutex_create due to invalid option". It was reported by Allen Lai. It's not clear who could verify it, but the problem is fixed now.
  • Bug #80708 - "Problem importing encrypted tablespace". It was reported by my former colleague Sergei Glushchenko from Percona and verified by Umesh.
  • Bug #80669 - "Failing assert: fil_space_get(table->space) != __null in row0quiesce.cc line 724", was also reported by Percona employee and great, hard working QA engineer, Ramesh Sivaraman. As most of the bugs from Community, it was immediately (in 44 minutes!) verified by Umesh.
  • Bug #80667 - "innobase_get_computed_value(dtuple_t const*, dict_v_col_t const*, dict_index_t c". It was reported based on failing test case runs by Viswanatham Gudipati, who works for Oracle.
  • Bug #80327 - "InnoDB: Failing assertion: !(rec_get_info_bits(rec2, comp) & 0x10UL)", was reported by Erlend Dahl from Oracle based on numerous failures of innodb_gis.rtree_purge test. I am really happy to see Oracle engineers reporting test failures in public, even when they happen to 5.8 (a three that we, Community, may not have access to at the moment). This is exactly what I was fighting for since I started this blog in 2012.
  • Bug #80261 - "Invalid InnoDB FTS Doc ID during INSERT". This bug was reported by Monty Solomon and promptly verified by Umesh. The bug is fixed, but release notes do not mention it explicitly. There are bugs even in release notes, we should live with that.
  • Bug #80182 - "Unsupported extension error for new features while 'create table like'". This bug was reported by yet another great QA engineer and my former colleague in Percona, Shahriyar Rzayev. As usual, it was promptly verified by Umesh.
  • Bug #79330 - "DROP TABLESPACE fails for missing general tablespace *.ibd file", was reported by Daniel Price who works for Oracle and verified by Marko Mäkelä.
  • Bug #78761 - "Unexpected deadlock with innodb_autoinc_lock_mode=0". It was reported by Yiftach Kaplan and verified by Miguel Solorzano.
  • Bug #77011 - Hanging "System Lock" when executing "flush table ... for export", was reported by Vincent Meng and verified by Umesh. It is also fixed in MySQL 5.6.31.
Now let's proceed with replication bugs that were fixed:
  • Bug #78999 - "Injected query log events do not set explicit_defaults_ts, read it on applying", was reported by great Laurynas Biveinis from Percona and verified by Umesh. It was a regression introduced by the fix for Bug #72794.
  • Bug #78995 - "Binlog_sender::read_event may access freed event buffer". Yet another bug reported by Laurynas Biveinis and immediately verified by Umesh. Percona engineers did a great QA job while working on Percona Server 5.7 GA release, and we see that Oracle fixes their bugs reported back then...
  • Bug #78467 - "mysqldump --dump-slave does not work with multi-source". It was reported by Sven Sandberg. I think that bugs of this kind should not even appear with proper design for new features and, in any case, they have to be fixed before GA release, so that it is at least feature complete. But what do I know...
  • Bug #77496 - "Replication position lost after crash on MTS configured slave". It was reported by Simon Mudd almost a year ago for MySQL 5.6, but is fixed only recently (also in 5.6.31). The bug was verified by Umesh.
Honestly, I am scared with numerous problems we still see in new replication features that appeared in MySQL 5.6, and thinking about Oracle's group replication development on top of that makes me scared even more. It will be us, Support engineers, and poor production DBAs who are forced to deal with numerous features not working well together or breaking usual work habits here and there...
There we not so many bugs reported in public that were fixed in other areas. I'd like to highlight the following:
  • Bug #78512 - "/var/log/mysqld.log has incorrect permissions after installing server from repo", Bug #81390 - "mariadb galera not replaced with community packages during upgrade on fc24", and Bug #79377 - "service start failed to create database if it doesn't exists(ubuntu1510)". They were all reported by Ramana Yeruva who seem to work for Oracle.
  • Bug #81139 - "Missing break statement in mysql_read_default_options()". It was reported by Georgi Kodinov from Oracle.
  • Bug #81014 - "DELETE from joined tables with WHERE using derived table fails with error 1093". This regression bug in optimizer was reported by Jim Parks, who seem to work for Oracle.
  • Bug #80935 - "Dev library header files not on default include path". It was reported by Lars Tangvald who works for Oracle.
  • Bug #80748 - "mysql5.6 does not build on solaris12". This bug was reported by Tor Didriksen from Oracle. Fixed in MySQL 5.6.31 also. I have to check if this fix really help my builds on OpenIndiana VM one day...
  • Bug #80526 - "LEFT OUTER JOIN returns incorrect results on the outer side". It was reported by Attila Rózsár and verified by Miguel Solorzano
  • Bug #80461 - "Accessing uninitialized memory inside strings/decimal.c ull2dec line ~1071". This bug was reported by Sergey Sprogis and verified by Umesh.
  • Bug #80451 - "mysqld --initialize does not support the keyring_file_data option". It was reported and verified by Jesper Krogh from Oracle.
  • Bug #80333 - "Valgrind: Conditional jump or move in Rewriter::do_refresh", was reported by Roel Van de Paar from Percona and verified by Umesh.
  • Bug #80231 - "Column name in having, which is contained in select list '*', produces error", was reported by Su Dylan and formally verified by Sinisa Milivojevic.
  • Bug #80089 - "Memory leak from open files limit", was reported by Manuel Ung and verified by Umesh. This bug is also fixed in MySQL 5.6.31.
  • Bug #79591 - "select distinct not returning distinct rows in 5.7...". This regression bug was reported by Shane Bester and probably verified (and fixed) by Øystein Grøvlen.
  • Bug #79396 - "null pointer dereference", was reported by Павел Гусев and verified by Sinisa Milivojevic. It is fixed in MySQL 5.6.31 also. It would be nice to see Oracle using AppChecker static analyzer for MySQL code...
  • Bug #78254 - "After running mysql_upgrade proxies_priv user columns are not updated to 32". It was reported by Robert Gołębiowski and verified by Umesh well before 5.7 became GA, but fixed only recently. I really wonder from prevented fixing this bug in 5.7.9...
  • Bug #74977 - "Cryptic error when failing to unload a dynamic library", was reported by  Martin Hansson back in 2014 and fixed only recently (also in MySQL 5.6.31). 
  • Bug #74636 - "mysqld_multi misleading when my_print_defaults is not found", was reported by Andrii Nikitin from Oracle back in 2014 and fixed also in MySQL 5.6.31 recently.
  • Bug #72230 - "Undesirable MySQL yum repo packaging dependencies". Yet another bug reported by Simon Mudd more than 2 years ago and verified by Umesh. it is fixed only in 5.7.13.
  • The last but not the least, Bug #71783, "mysqldump silently quits when it encounters an error.", was reported by Shane Bester in 2014, and was recently fixed in MySQL 5.7.13 and 5.6.31.            
To summarize, there are many bug fixes of all kinds in MySQL 5.7.13. Tomorrow I'll surely try to build it from sources on all VMs and platforms I have.

What I consider suspicious is the fact that aside from Oracle engineers, my former Percona colleagues (who had released their own versions of Percona Server and Xtrabackup based on MySQL 5.7) and few well known community members like Simon Mudd, there are very few bug reports from Community users for MySQL 5.7 that are fixed. Not sure what it means...

Saturday, April 30, 2016

Fun with Bugs #42 - Bugs Fixed in MySQL 5.7.12

MySQL 5.7.12 was released more than 2 weeks ago. New features introduced there in a form of "rapid plugins" are widely discussed, but I am more interested in bugs reported by MySQL Community users that are fixed there. Unfortunately I do not see MySQL Community Release Notes by Morgan (like this) for quite a some time, so I have to continue describing key bug fixes and name people who reported and verified bugs in my "Fun with Bugs" series.

As usual, let's start with InnoDB bugs fixed:
  • Bug #80070 - "allocated_size and file_size differ if create general tablespace outside datadir". It was reported by my former colleague from Percona Shahriyar Rzayev and verified by Bogdan Kecman. Nice to see more people from Oracle involved in processing community bug reports!
  • Bug #79185 - "Innodb freeze running REPLACE statements". This bug (that affected many users, also on versions 5.5.x and 5.6.x, and was a kind of a regression) was reported by Will Bryant and verified (probably) and fixed by Shaohua Wang. The fix is also included into versions 5.5.49 and 5.6.30.
  • Bug #73816 - ''MySQL instance stalling “doing SYNC index”". It was reported by Denis Jedig and a lot of additional evidence was provided by my former colleague Aurimas Mikalauskas. This bug was fixed (and probably verified) by Shaohua Wang.
  • Bug #79200 - "InnoDB: "data directory" option of create table fails with pwrite() OS error 22", is a widely noted regression (I've seen customer issue with a potentially related MariaDB problem this week). This bug was reported by Frank Ullrich and verified by Bogdan Kecman. It is also fixed in MySQL 5.6.30.
  • Bug #79725 - "Check algorithm=innodb on crc32 checksum mismatch before crc32(big endian)". This bug was created to track the patch contributed by Daniel Black at GitHub. It was verified by Umesh.
Next, let's review replication bugs fixed in 5.7.12:
  • Bug #79504 - "STOP SLAVE IO THREAD prints wrong LOST CONNECTION message in error log file". It was reported by Venkatesh Duggirala.
  • Bug #78722 - "Relay log info currently_executing_gtid is not properly initialized or protected". This bug was reported by Pedro Gomes. It contains a nice simple test case and fix suggested.
  • Bug #78445 is private. So, I can only quote the release notes:
    "RESET SLAVE ALL could delete a channel even when master_pos_wait and wait_until_sql_thread_after_gtid were still waiting for binlog to be applied. This could cause a MySQL server exit when the functions tried to access the channel that was deleted. Now, a channel reference counter was added that is increased if the channel should not be deleted when the functions are running. RESET SLAVE ALL will wait for no reference, and then it will delete the channel."
    I am not sure this crash is a "security" bug of any kind, but what do I know...
  • Bug #78352 - "Slow startup of 5.7.x slave with relay_log_recovery = ON and many relay logs". I reported it based on regression comparing to 5.6.x reported by a customer of Percona, and verified by Umesh. Nice to see it fixed, as it was really annoying for almost anyone who upgraded production replication setup to 5.7.
  • Bug #78133 - "Slave_worker::write_info() incorrect DBUG_ENTER (contribution)". This bug was created to track the patch contributed by Stewart Smith at GitHub. It was verified by Umesh.
  • Bug #77740 - "silent failure to start if mysql.gtids_executed gets HA_ERR_LOCK_WAIT_TIMEOUT ". It was reported and verified by Shane Bester.
  • Bug #77237 - "Multi-threaded slave log spamming on failure". This bug was reported by Davi Arnaut and verified by Umesh. Fix is also included in MySQL 5.6.30.
  • Bug #78963 - "super_read_only aborts STOP SLAVE if relay_log_info_repository=TABLE, dbg crash". It was reported by my former colleague in Percona Laurynas Biveinis and verified by Umesh. Check also related Bug #79328 - "super_read_only broken as a server option".
  • Bug #77684 - "DROP TABLE IF EXISTS may brake replication if slave has replication filters". This bug was reported by my former colleague in Percona Fernando Laudares Camargos for MySQL 5.6.x and verified by Umesh. MySQL 5.6.30 also got this fixed.
We all remember that Performance Schema is perfect and the next greatest thing after sliced bread, but sometimes bugs are noted even there. Check Bug #79934 - "i_perfschema.table_leak random result failure" reported and verified by Magnus Blåudd. Another example is Bug #79784 - "update setup_instruments do not affect the global mutex/rwlock/cond" reported by Zhang Yingqiang and verified by Umesh. The later, IMHO, is related to or a super set of my good old report, Bug #68097 - "Manual does not explain that some P_S instruments must be enabled at startup" that remains open as a feature request (after some changes in the manual) for more than 3 years already. I truly hope 5.7.12 fixed this for a general case - it's truly important to be able to enable instruments dynamically if we expect Performance Schema to be used as a main tool for troubleshooting.

I'd also want to highlight a couple of fixes related to optimizer:
  • Bug #77209 - "Update may use index merge without any reason (increasing chances for deadlock)". It was reported and verified by my former colleagues from Oracle, Andrii Nikitin. MySQL 5.6.30 also includes the fix.
  • Bug #72858 - "EXPLAIN .. SELECT .. FOR UPDATE takes locks". This bug was reported by my former colleague in Percona (and, I hope, my colleague again soon) Justin Swanhart, who has a birthday today. Happy Birthday to you, Justin! The bug was verified by Umesh and is also fixed in MySQL 5.6.30. Justin had reported another bug fixed in 5.7.12, Bug #69375 - "LOAD DATA INFILE claims to be holding 'System Lock' in processlist".
Several more bugs reported by community were also fixed, but they were in the areas (or for platforms) I am not particularly interested in.

To summarize, MySQL 5.7.12 contains important bug fixes in replication and InnoDB and it makes sense to consider upgrade even if you do not care about any "rapid plugins", X protocol, encryption of data at rest, MySQL Keyring and other "cool" new shiny features.

Wednesday, April 27, 2016

Building MaxScale 1.4.2 from GitHub on Fedora 23

MariaDB MaxScale is mentioned in many blog posts recently. It's Application of the Year 2016 after all! I'd like to test it, follow posts like this etc, all that on my favorite and readily available testing platforms that are now Ubuntu of all kinds and, surely, Fedora 23 (on my wife's workstation, the most powerful hardware at hand).

My old habits force me to build open source software I test from source, and I do not want to even discuss the topic of "MaxScale binaries availability" that was quite "popular" some time ago. So, after building MaxScale 1.4.1 on CentOS 6.7 back on March 31, 2016 (mostly just following MariaDB KB article on the topic) using libmysqld.a from MariaDB 10.0.23, this morning I decided to check new branch, 1.4.2, and build it on Fedora 23, following that same KB article (that unfortunately does not even mention Fedora after the fix to MXS-248). Thing is, Fedora is not officially supported as a platform for MaxScale 1.4.x, but why should we, those who can build things from source for testing purposes, care about this?

I started with cloning MaxScale:

git clone https://github.com/mariadb-corporation/MaxScale.git
cd MaxScale
and then:

[openxs@fc23 MaxScale]$ git branch -r
...
  origin/HEAD -> origin/develop
...
  origin/release-1.4.2
...
I remember spending enough time fighting with develop branch while building on CentOS 6.7, mostly with sqlite-related things it contained, so this time I proceed immediately to the branch I want to build:

[openxs@fc23 MaxScale]$ git checkout release-1.4.2
Branch release-1.4.2 set up to track remote branch release-1.4.2 from origin.
Switched to a new branch 'release-1.4.2'
[openxs@fc23 MaxScale]$ git branch
  develop
* release-1.4.2

[openxs@fc23 MaxScale]$ mkdir build
[openxs@fc23 MaxScale]$ cd build

Last two steps originate from the KB article. We are almost ready for building, but what about the prerequisites? I've collected all the packages required for CentOS in that article and tried to install them all:
[openxs@fc23 build]$ sudo yum install mariadb-devel mariadb-embedded-devel libedit-devel gcc gcc-c++ ncurses-devel bison flex glibc-devel cmake libgcc perl make libtool openssl-devel libaio libaio-devel librabbitmq-devel libcurl-devel pcre-devel rpm-build[sudo] password for openxs:
Yum command has been deprecated, redirecting to '/usr/bin/dnf install mariadb-devel mariadb-embedded-devel libedit-devel gcc gcc-c++ ncurses-devel bison flex glibc-devel cmake libgcc perl make libtool openssl-devel libaio libaio-devel librabbitmq-devel libcurl-devel pcre-devel rpm-build'.
See 'man dnf' and 'man yum2dnf' for more information.
To transfer transaction metadata from yum to DNF, run:
'dnf install python-dnf-plugins-extras-migrate && dnf-2 migrate'

Last metadata expiration check: 0:26:04 ago on Wed Apr 27 10:43:24 2016.
Package gcc-5.3.1-6.fc23.x86_64 is already installed, skipping.
...
Package pcre-devel-8.38-7.fc23.x86_64 is already installed, skipping.
Dependencies resolved.
================================================================================
 Package                  Arch     Version                      Repository
                                                                           Size
================================================================================
Installing:
 autoconf                 noarch   2.69-21.fc23                 fedora    709 k
 automake                 noarch   1.15-4.fc23                  fedora    695 k
 dwz                      x86_64   0.12-1.fc23                  fedora    106 k
 flex                     x86_64   2.5.39-2.fc23                fedora    328 k
 ghc-srpm-macros          noarch   1.4.2-2.fc23                 fedora    8.2 k
 gnat-srpm-macros         noarch   2-1.fc23                     fedora    8.4 k
 go-srpm-macros           noarch   2-3.fc23                     fedora    8.0 k
 libcurl-devel            x86_64   7.43.0-6.fc23                updates   590 k
 libedit-devel            x86_64   3.1-13.20150325cvs.fc23      fedora     34 k
 librabbitmq              x86_64   0.8.0-1.fc23                 updates    43 k
 librabbitmq-devel        x86_64   0.8.0-1.fc23                 updates    52 k
 libtool                  x86_64   2.4.6-8.fc23                 updates   707 k
 mariadb-common           x86_64   1:10.0.23-1.fc23             updates    74 k
 mariadb-config           x86_64   1:10.0.23-1.fc23             updates    25 k
 mariadb-devel            x86_64   1:10.0.23-1.fc23             updates   869 k
 mariadb-embedded         x86_64   1:10.0.23-1.fc23             updates   4.0 M
 mariadb-embedded-devel   x86_64   1:10.0.23-1.fc23             updates   8.3 M
 mariadb-errmsg           x86_64   1:10.0.23-1.fc23             updates   199 k
 mariadb-libs             x86_64   1:10.0.23-1.fc23             updates   637 k
 ocaml-srpm-macros        noarch   2-3.fc23                     fedora    8.1 k
 patch                    x86_64   2.7.5-2.fc23                 fedora    123 k
 perl-Thread-Queue        noarch   3.07-1.fc23                  updates    22 k
 perl-generators          noarch   1.06-1.fc23                  updates    15 k
 perl-srpm-macros         noarch   1-17.fc23                    fedora    9.7 k
 python-srpm-macros       noarch   3-7.fc23                     updates   8.1 k
 redhat-rpm-config        noarch   36-1.fc23.1                  updates    59 k
 rpm-build                x86_64   4.13.0-0.rc1.13.fc23         updates   137 k

Transaction Summary
================================================================================
Install  27 Packages

Total download size: 18 M
Installed size: 64 M
Is this ok [y/N]: Y

...

Complete!
Now, let's try simple approach:

[openxs@fc23 build]$ cmake ..
...
-- MySQL version: 10.0.23
-- MySQL provider: MariaDB
-- Looking for pcre_stack_guard in MYSQL_EMBEDDED_LIBRARIES_STATIC-NOTFOUND
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
MYSQL_EMBEDDED_LIBRARIES_STATIC
    linked by target "cmTC_2494a" in directory /home/openxs/git/MaxScale/build/CMakeFiles/CMakeTmp

CMake Error: Internal CMake error, TryCompile configure of cmake failed
-- Looking for pcre_stack_guard in MYSQL_EMBEDDED_LIBRARIES_STATIC-NOTFOUND - not found
-- PCRE libs: /usr/lib64/libpcre.so
-- PCRE include directory: /usr/include
-- Embedded mysqld does not have pcre_stack_guard, linking with system pcre.
CMake Error at cmake/FindMySQL.cmake:115 (message):
  Library not found: libmysqld.  If your install of MySQL is in a non-default
  location, please provide the location with -DMYSQL_EMBEDDED_LIBRARIES=<path
  to library>
Call Stack (most recent call first):
  CMakeLists.txt:37 (find_package)


-- Configuring incomplete, errors occurred!
See also "/home/openxs/git/MaxScale/build/CMakeFiles/CMakeOutput.log".
See also "/home/openxs/git/MaxScale/build/CMakeFiles/CMakeError.log".

Failure, cmake can not find libmysqld.a it seems. Let me try to find it:

[openxs@fc23 build]$ sudo find / -name libmysqld.a 2>/dev/null
/home/openxs/git/percona-xtrabackup/libmysqld/libmysqld.a
/home/openxs/dbs/5.7/lib/libmysqld.a
/home/openxs/dbs/p5.6/lib/libmysqld.a
/home/openxs/dbs/fb56/lib/libmysqld.a
/home/openxs/10.1.12/lib/libmysqld.a
That's all, even though I installed all packages that looked as required based on the article! I have the library in many places (in my own builds and even in sandbox with MariaDB 10.1.12), but it's not installed where expected. Some more desperate tries (installing MariaDB server with sudo yum install mariadb-server, searches for package that provides libmysqld.a etc), chat with engineers of MariaDB and I've ended up with the fact that my packages are from Fedora (not MariaDB) and they just do not include the static library. Looks like a bug in Fedora packaging, if you ask me.

I was not ready to add MariaDB's repository at the moment (to get MariaDB-devel etc, something KB article also suggests for supported platforms), so I decided that it would be fair just to build current MariaDB 10.1.13 from source and use everything needed from there. Last time I built 10.2 branch, so I had to check out 10.1 first:
[openxs@fc23 server]$ git checkout 10.1
Switched to branch '10.1'
Your branch is behind 'origin/10.1' by 2 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
[openxs@fc23 server]$ git pull
Updating 1cf852d..071ae30
Fast-forward
 client/mysqlbinlog.cc                    | 523 ++++++++++++++++++++++---------
 mysql-test/r/mysqlbinlog_raw_mode.result | 274 ++++++++++++++++
 mysql-test/t/mysqlbinlog_raw_mode.test   | 387 +++++++++++++++++++++++
 sql/sql_priv.h                           |   3 +-
 storage/innobase/dict/dict0boot.cc       |  20 +-
 storage/xtradb/dict/dict0boot.cc         |  20 +-
 6 files changed, 1062 insertions(+), 165 deletions(-)
 create mode 100644 mysql-test/r/mysqlbinlog_raw_mode.result
 create mode 100644 mysql-test/t/mysqlbinlog_raw_mode.test
 Then I've executed the following while in server directory:

make clean
rm CMakeCache.txt
cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SSL=system -DWITH_ZLIB=bundled -DMYSQL_MAINTAINER_MODE=0 -DENABLED_LOCAL_INFILE=1 -DWITH_JEMALLOC=system -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON -DWITH_EMBEDDED_SERVER=ON -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/maria10.1

make

make install && make clean
Note that I've explicitly asked to build embedded server. I checked that the library is in the location I need:

[openxs@fc23 server]$ sudo find / -name libmysqld.a 2>/dev/null
/home/openxs/git/percona-xtrabackup/libmysqld/libmysqld.a
/home/openxs/dbs/maria10.1/lib/libmysqld.a
/home/openxs/dbs/5.7/lib/libmysqld.a
/home/openxs/dbs/p5.6/lib/libmysqld.a
/home/openxs/dbs/fb56/lib/libmysqld.a
/home/openxs/10.1.12/lib/libmysqld.a
Then I moved back to MaxScale/build directory and explicitly pointed out the location of headers, library and messages that I want to use with MaxScale:

[openxs@fc23 build]$ cmake .. -DMYSQL_EMBEDDED_INCLUDE_DIR=/home/openxs/dbs/maria10.1/include/mysql -DMYSQL_EMBEDDED_LIBRARIES=/home/openxs/dbs/maria10.1/lib/libmysqld.a -DERRMSG=/home/openxs/dbs/maria10.1/share/english/errmsg.sys -DCMAKE_INSTALL_PREFIX=/home/openxs/maxscale -DWITH_MAXSCALE_CNF=N
...
-- Build files have been written to: /home/openxs/git/MaxScale/build

[openxs@fc23 build]$ make
...
[ 95%] [BISON][ruleparser] Building parser with bison 3.0.4
ruleparser.y:34.1-13: warning: deprecated directive, use Б-?%name-prefixБ-? [-Wdeprecated]
 %name-prefix="dbfw_yy"
 ^^^^^^^^^^^^^
[ 96%] Building C object server/modules/filter/dbfwfilter/CMakeFiles/dbfwfilter.dir/ruleparser.c.o
[ 96%] Building C object server/modules/filter/dbfwfilter/CMakeFiles/dbfwfilter.dir/token.c.o
[ 97%] Linking C shared library libdbfwfilter.so
[ 97%] Built target dbfwfilter
Scanning dependencies of target maxadmin
[ 98%] Building C object client/CMakeFiles/maxadmin.dir/maxadmin.c.o
[ 98%] Linking C executable maxadmin
[100%] Built target maxadmin

It seems build completed without problems this time. We can try to test it (some tests do fail):







[openxs@fc23 build]$ make testcore
...
 1/22 Test  #1: Internal-TestQueryClassifier .....***Exception: Other  0.35 sec
      Start  2: Internal-CanonicalQuery
 2/22 Test  #2: Internal-CanonicalQuery ..........***Failed    0.25 sec
      Start  3: Internal-CanonicalQuerySelect
 3/22 Test  #3: Internal-CanonicalQuerySelect ....***Failed    0.04 sec
      Start  4: Internal-CanonicalQueryAlter
 4/22 Test  #4: Internal-CanonicalQueryAlter .....***Failed    0.04 sec
      Start  5: Internal-CanonicalQueryComment
 5/22 Test  #5: Internal-CanonicalQueryComment ...***Failed    0.04 sec
      Start  6: Internal-TestAdminUsers
 6/22 Test  #6: Internal-TestAdminUsers ..........   Passed    0.44 sec
      Start  7: Internal-TestBuffer
 7/22 Test  #7: Internal-TestBuffer ..............   Passed    0.01 sec
      Start  8: Internal-TestDCB
 8/22 Test  #8: Internal-TestDCB .................   Passed    0.01 sec
      Start  9: Internal-TestFilter
 9/22 Test  #9: Internal-TestFilter ..............   Passed    0.03 sec
...
(As a side note, make install in my case had NOT installed anything to /home/openxs/maxscale, something to deal with later, as on CentOS 6.7 it worked...)


In any case, I now have binaries to work with, of version 1.4.2:

[openxs@fc23 build]$ ls bin/
maxadmin  maxbinlogcheck  maxkeys  maxpasswd  maxscale
[openxs@fc23 build]$ bin/maxscale --version


MariaDB Corporation MaxScale 1.4.2      Wed Apr 27 13:24:01 2016
------------------------------------------------------
MaxScale 1.4.2

[openxs@fc23 build]$ bin/maxadmin --version
bin/maxadmin Version 1.4.2
To be continued one day... Stay tuned!




Sunday, April 24, 2016

Building MariaDB 10.1.x and Galera from Source for Multiple Node Cluster Testing Setup

My Facebook followers probably noted that I quit from Percona some time ago and work for MariaDB since March 1, 2016. I changed the company, but neither the job role (I am still a Support Engineer), nor the approach to do my job. I still prefer to test everything I suggest to customers and I usually use software I build from source myself for these tests.

While I try to avoid all kinds of clusters as much as possible for 15 years or so already (it does not matter if it's Oracle RAC, MySQL Cluster or Percona XtraDB Cluster, all of them), it's really hard to avoid Galera clusters while working for MariaDB. One of the reasons for this is that Galera, starting from MariaDB 10.1, can be easily "enabled"/used with any MariaDB 10.1.x instance, any time (at least when we speak about official binaries or those properly built - they are all "Galera ready"). Most of MariaDB customers do use Galera or can try to use it any time, so I have to be ready to test something Galera-specific any moment.

For simple cases I decided to use a setup with several (2 to begin with) cluster nodes on one box. This approach is described in the manual for Percona XtraDB Cluster and was also used by my former colleague Fernando Laudares for his blog post and many real life related tests.

So, I decided to proceed with the mix of ideas from the sources above and MariaDB's KB article on building Galera from source. As I decided to do this on my wife's Fedora 23 workstation, I checked this KB article for some details also. It lists prerequisites (boost-devel check-devel glibc-devel openssl-devel scons) and some of these packages (like scons in one of my cases) could be missing even on a system previosly used for builds for all kinds of MySQL related software. You can find something missing and fix the problem at later stage, but reading and following the manual or KB articles may help to save some time otherwise spent on trial and error.

I've started with making directories in my home directory (/home/openxs) for this Galera related testing setup, like these:
[openxs@fc23 ~]$ mkdir galera
[openxs@fc23 ~]$ cd galera
[openxs@fc23 galera]$ mkdir node1[openxs@fc23 galera]$ mkdir node2
[openxs@fc23 galera]$ mkdir node3
[openxs@fc23 galera]$ ls
node1  node2  node3
I plan to use 3 nodes one day, but for this blog post I'll set up only 2, to have the smallest possible and simplest cluster as a proof of concept.

Then I proceeded with cloning Galera from Codership's GitHub (this is supposed to be the latest and greatest). I changed current directory to my usual git repository and executed git clone https://github.com/codership/galera.git. When this command completed I've got a subdirectory named galera.

In that directory, assuming that all prerequisites are installed, to build current Galera library version it's enough to execute simple script while in galera directory, ./scripts/build.sh. I ended up with the following:
[openxs@fc23 galera]$ ls -l libgalera_smm.so
-rwxrwxr-x. 1 openxs openxs 40204824 Mar 31 12:21 libgalera_smm.so
[openxs@fc23 galera]$ file libgalera_smm.so
libgalera_smm.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=11457fa9fd69dabe617708c0dd288b218255a886, not stripped

[openxs@fc23 galera]$ pwd
/home/openxs/git/galera
[openxs@fc23 galera]$ cp libgalera_smm.so ~/galera/
and copied the library to the target directory for my testing setup (that should NOT conflict with whatever software I may have installed later from packages).

Now, time to build MariaDB properly to let it use Galera if needed. I already had recent (at the moment) 10.1.13 in the server subdirectory of my git repository. I've executed the following commands then:

[openxs@fc23 server]$ cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SSL=system -DWITH_ZLIB=bundled -DMYSQL_MAINTAINER_MODE=0 -DENABLED_LOCAL_INFILE=1 -DWITH_JEMALLOC=system -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/maria10.1
-- Running cmake version 3.4.1
-- MariaDB 10.1.13
...

[openxs@fc23 server]$ time make -j 4...
real    9m28.164s
user    32m43.960s
sys     2m45.637s
This was my usual command line to build MariaDB 10.x with only 2 extra options added: -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON.After make completed, I've executed make install && make clean and was ready to use my shiny new Galera-ready MariaDB 10.1.13.

To take into account the directories I am going to use for my cluster nodes and make sure they can start and communicate as separate mysqld instances, I have to create configuration files for them. I've changed working directory to /home/openxs/dbs/mariadb10.1 and started with this configuration file for the first node:

[openxs@fc23 maria10.1]$ cat /home/openxs/galera/mynode1.cnf                    
[mysqld]
datadir=/home/openxs/galera/node1
port=3306
socket=/tmp/mysql-node1.sock
pid-file=/tmp/mysql-node1.pid
log-error=/tmp/mysql-node1.err
binlog_format=ROW
innodb_autoinc_lock_mode=2

wsrep_on=ON # this is important for 10.1!
wsrep_provider=/home/openxs/galera/libgalera_smm.so
wsrep_cluster_name = singlebox
wsrep_node_name = node1
# wsrep_cluster_address=gcomm://
wsrep_cluster_address=gcomm://127.0.0.1:4567,127.0.0.1:5020?pc.wait_prim=no
It's one of the shortest possible. I had to specify unique datadir, error log location, pid file, port and socket for the instance, set binlog format and point out Galera library location, set cluster name and node name. With proper planning I was able to specify wsrep_cluster_address referring to all other nodes properly, but for initial setup of the first node I can have it "empty" as commented out in the above, so that we start as a new cluster node. There is one essential setting for MariaDB 10.1.x that is not needed for "cluster-specific" instances like Percona XtraDB Cluster or older 10.0.x Galera packages from MariaDB (where it's ON by default). This is wsrep_on=ON. Without it MariaDB works as normal, non-cluster instance and ignores anything cluster-related. You can save a lot of time in case of upgrade to 10.1.x if you put it in your configuration file explicitly right now, no matter what the version is used.

Then I copied and modified configuration file for the second node:
[openxs@fc23 maria10.1]$ cp /home/openxs/galera/mynode1.cnf /home/openxs/galera/mynode2.cnf
[openxs@fc23 maria10.1]$ vi /home/openxs/galera/mynode2.cnf                     

[openxs@fc23 maria10.1]$ cat /home/openxs/galera/mynode2.cnf                   
[mysqld]
datadir=/home/openxs/galera/node2
port=3307
socket=/tmp/mysql-node2.sock
pid-file=/tmp/mysql-node2.pid
log-error=/tmp/mysql-node2.err
binlog_format=ROW
innodb_autoinc_lock_mode=2

wsrep_on=ON # this is important for 10.1!wsrep_provider=/home/openxs/galera/libgalera_smm.so
wsrep_cluster_name = singlebox
wsrep_node_name = node2
wsrep_cluster_address=gcomm://127.0.0.1:4567,127.0.0.1:5020

?pc.wait_prim=nowsrep_provider_options = "base_port=5020;"
Note that while Galera node uses 4 ports, I specified only 2 unique ones explicitly, port for MySQL clients and base port for all Galera-related communication like IST and SST, with base_port setting. Note also how I referred to all cluster nodes with wsrep_cluster_address - this same value can be used for the configuration file of the first node actually. We can just start it as the first node of a new cluster (see below).

Now we have configuration files for 2 nodes ready (we can always add node3 later in the same way). But before starting new cluster we have to install system databases. For node1 it was performed in the following way:
[openxs@fc23 maria10.1]$ scripts/mysql_install_db --defaults-file=/home/openxs/galera/mynode1.cnf
Installing MariaDB/MySQL system tables in '/home/openxs/galera/node1' ...
2016-03-31 12:51:34 139766046820480 [Note] ./bin/mysqld (mysqld 10.1.13-MariaDB) starting as process 28297 ...
...

[openxs@fc23 maria10.1]$ ls -l /home/openxs/galera/node1
-rw-rw----. 1 openxs openxs    16384 Mar 31 12:51 aria_log.00000001
-rw-rw----. 1 openxs openxs       52 Mar 31 12:51 aria_log_control
-rw-rw----. 1 openxs openxs 12582912 Mar 31 12:51 ibdata1
-rw-rw----. 1 openxs openxs 50331648 Mar 31 12:51 ib_logfile0
-rw-rw----. 1 openxs openxs 50331648 Mar 31 12:51 ib_logfile1
drwx------. 2 openxs openxs     4096 Mar 31 12:51 mysql
drwx------. 2 openxs openxs     4096 Mar 31 12:51 performance_schema
drwx------. 2 openxs openxs     4096 Mar 31 12:51 test
Then I started node1 as a new cluster:
[openxs@fc23 maria10.1]$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode1.cnf --wsrep-new-cluster &
and created a table, t1, with some data in it. After that I repeated installation of system tables etc for node2, just referencing proper configuration file, and started node2 that was supposed to join the cluster:
openxs@fc23 maria10.1]$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode2.cnf &
Let's check if we do have both instances running and communicating in Galera cluster:
[openxs@fc23 maria10.1]$ tail /tmp/mysql-node2.err                             
2016-03-31 13:40:29 139627414767744 [Note] WSREP: Signalling provider to continue.
2016-03-31 13:40:29 139627414767744 [Note] WSREP: SST received: c91d17b6-f72b-11e5-95de-96e95167f593:0
2016-03-31 13:40:29 139627117668096 [Note] WSREP: 1.0 (node2): State transfer from 0.0 (node1) complete.
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Shifting JOINER -> JOINED (TO: 0)
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Member 1.0 (node2) synced with group.
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2016-03-31 13:40:29 139627414452992 [Note] WSREP: Synchronized with group, ready for connections
2016-03-31 13:40:29 139627414452992 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-31 13:40:29 139627414767744 [Note] /home/openxs/dbs/maria10.1/bin/mysqld: ready for connections.
Version: '10.1.13-MariaDB'  socket: '/tmp/mysql-node2.sock'  port: 3307  Source distribution

[openxs@fc23 maria10.1]$ tail /tmp/mysql-node1.err
2016-03-31 13:40:27 140071390934784 [Note] WSREP: Provider resumed.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: 0.0 (node1): State transfer to 1.0 (node2) complete.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 0)
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Member 0.0 (node1) synced with group.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2016-03-31 13:40:27 140072429247232 [Note] WSREP: Synchronized with group, ready for connections
2016-03-31 13:40:27 140072429247232 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-31 13:40:27 140072141715200 [Note] WSREP: (c91c99ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
2016-03-31 13:40:29 140072133322496 [Note] WSREP: 1.0 (node2): State transfer from 0.0 (node1) complete.
2016-03-31 13:40:29 140072133322496 [Note] WSREP: Member 1.0 (node2) synced with group.
Familiar messages (unfortunately...) that prove we had a second node joined and performed state transfer from the first one. Now it's time to connect and test how cluster works. This is what I had after node1 started and table with some data created there, but before node2 started:
[openxs@fc23 maria10.1]$ bin/mysql -uroot --socket=/tmp/mysql-node1.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.1.13-MariaDB Source distribution

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show variables like 'wsrep_cluster%';
+-----------------------+-------------------------------------------------------+
| Variable_name         | Value                                                 |
+-----------------------+-------------------------------------------------------+
| wsrep_cluster_address | gcomm://127.0.0.1:4567,127.0.0.1:5020?pc.wait_prim=no |
| wsrep_cluster_name    | singlebox                                             |
+-----------------------+-------------------------------------------------------+
2 rows in set (0.00 sec)

MariaDB [(none)]> show status like 'wsrep_cluster%';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 1                                    |
| wsrep_cluster_size       | 1                                    |
| wsrep_cluster_state_uuid | c91d17b6-f72b-11e5-95de-96e95167f593 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
4 rows in set (0.00 sec)

MariaDB [(none)]> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [test]> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)
Then, when node2 joined the cluster, I checked that the data we've added on node1 are there:

[openxs@fc23 maria10.1]$ bin/mysql -uroot --socket=/tmp/mysql-node2.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.13-MariaDB Source distribution

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [test]> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)

MariaDB [test]> show status like 'wsrep_cluster%'; +--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 2                                    |
| wsrep_cluster_size       | 2                                    |
| wsrep_cluster_state_uuid | c91d17b6-f72b-11e5-95de-96e95167f593 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
4 rows in set (0.01 sec)
So, the first basic test with the Galera cluster of 2 nodes (both running on the same box) built from current source of Galera and MariaDB 10.1.x on Fedora 23 is completed successfully. I plan to play with it more in the future, use current xtrabackup built from source for SST and so on, and create blog posts about these steps and any interesting tests in this setup. Stay tuned.

From the dates above you can conclude that it took me 3 weeks to publish this post. That's because I was busy with the company meeting in Berlin and some usual Support work, and was not sure is it really a good idea for me to write any post with "Galera" or "MariaDB" words used in it even once...



Monday, February 29, 2016

Exploring Metadata Locks with gdb - Studying Simple Case on Percona Server 5.7

I had forgotten the topic of studying metadata locks with gdb for quite a some time. Now that I've upgraded to Percona Server 5.7 and enjoy the remaining of my last free day as an independent MySQL Support Engineer doing nothing but some blogging, I think it's time to get back to gdb interactive sessions and try to answer the question I've got from customer back in Percona but had no chance to answer.

In a bit rewritten way, it sounded as follows:
"Do ALTER TABLE ... STATS_AUTO_RECALC=1" and the reverse ALTER
TABLE...STATS_AUTO_RECALC=default block select, update, delete, and insert on the table?
"
There are two ways to answer this question. First, we can just try to set up a simple test with 3 sessions and simple table, like this:
mysql> create table ti(id int primary key, c1 int, key(c1)) engine=InnoDB stats_auto_recalc=default;Query OK, 0 rows affected (0.16 sec)

mysql> insert into ti values (1,1), (2,2);
Query OK, 2 rows affected (0.09 sec)
Records: 2  Duplicates: 0  Warnings: 0
Now, in one session we can SELECT from the table in transaction that is not completed:
mysql> start transaction;
Query OK, 0 rows affected (0.01 sec)

mysql> select * from ti;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)
and then in another session run the ALTER TABLE in question:
mysql> alter table ti stats_auto_recalc=1;
to find out in the first session that it actually hangs waiting on some metadata lock:
mysql> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
| Id | User | Host      | db   | Command | Time | State                           | Info                               | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
|  3 | root | localhost | test | Query   |   15 | Waiting for table metadata lock | alter table ti stats_auto_recalc=1 |         0 |             0 ||  4 | root | localhost | test | Query   |    0 | starting                        | show processlist                   |         0 |             0 |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
2 rows in set (0.00 sec)
So, one of metadata locks set by ALTER TABLE even in this very simple case (when we try to modify the way of statistics recalculation for InnoDB table) even in 5.7 is too strict, so that it conflicts with one of metadata locks set by simple SELECT for the duration of the entire transaction. What's more interesting, now in third session we can try to run whatever statements we want to check (those that were "compatible" with SELECT at metadata locks level) and see them also blocked, like this:
mysql> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
| Id | User | Host      | db   | Command | Time | State                           | Info                               | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
|  3 | root | localhost | test | Query   |   52 | Waiting for table metadata lock | alter table ti stats_auto_recalc=1 |         0 |             0 |
|  4 | root | localhost | test | Query   |    0 | starting                        | show processlist                   |         0 |             0 |
|  5 | root | localhost | test | Query   |    4 | Waiting for table metadata lock | select * from ti                   |         0 |             0 |+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
3 rows in set (0.00 sec)
This way we can already conclude that ALTER TABLE of this kind does block even SELECT from the table! Good that it runs fast (there is no need to make a table copy), so will not block for a long time, but if ALTER itself is blocked by some transaction (like in my case) - we are still in trouble!

That simple test would give the answer to customer that he could be satisfied with. But I know that ALTER TABLE sets many metadata locks at different stages and while for this trivial one it may not matter much, I still wanted to find some time to attach gdb and see what locks are requested and when/in what order.

I had not done this with 5.7, so tried the same approach as in the initial post in this series (please, check it carefully for the details from the source code, like types of the locks etc), setting breakpoint on MDL_request::init, but it seems this method does NOT exist in MySQL 5.7. So, I ended up with the following breakpoints set:
(gdb) b MDL_context::acquire_lockBreakpoint 1 at 0xbf9083: file /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/sql/mdl.cc, line 3506.
(gdb) b lock_rec_lock
Breakpoint 2 at 0xf24ffd: file /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/storage/innobase/lock/lock0lock.cc, line 2343.
(gdb) c
Continuing.
With these breakpoints in place, I've executed alter table ti stats_auto_recalc=1; and ended up with the following in gdb:
Breakpoint 1, MDL_context::acquire_lock (this=0x7fc1e27ec0c8,
    mdl_request=0x7fc1b2df95b0, lock_wait_timeout=31536000)
    at /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/sql/mdl.cc:3506
3506      set_timespec(&abs_timeout, lock_wait_timeout);
(gdb) p mdl_request
$1 = (MDL_request *) 0x7fc1b2df95b0
(gdb) p *mdl_request
$2 = {type = MDL_INTENTION_EXCLUSIVE, duration = MDL_STATEMENT,
  next_in_list = 0x7fc1e27f7f70, prev_in_list = 0x7fc1b2df9778, ticket = 0x0,
  key = {m_length = 3, m_db_name_length = 0,
    m_ptr = "\000\000\000\000L\274q\324\301...
We see something not that much useful immediately, some metadata lock request at the weakest possible level for the duration of the statement. Just to see how we can see all of requests in the same "batch", let's do the following in gdb:
(gdb) p *(mdl_request->next_in_list)
$5 = {type = MDL_INTENTION_EXCLUSIVE, duration = MDL_TRANSACTION,
  next_in_list = 0x7fc1e27f7d70, prev_in_list = 0x7fc1b2df95b8, ticket = 0x0,
  key = {m_length = 7, m_db_name_length = 4,
    m_ptr = "\002test", '\000' <repeats 175 times>"\377, \377\377\037\000\000\000\000\001", '\000' <repeats 15 times>, "\001", '\000' <repeats 31 times>, "\a", '\000' <repeats 15 times>, "\004\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\001\001\000\001", '\000' <repeats 44 times>, "\001", '\000' <repeats 69 times>, static m_namespace_to_wait_state_name = {{m_key = 111,
...
So, the next one is also the weakest, but is set for the duration of transaction at the database (named "test") level. We can see the next after it as well:
(gdb) p *(mdl_request->next_in_list->next_in_list)
$6 = {type = MDL_SHARED_UPGRADABLE, duration = MDL_TRANSACTION,
  next_in_list = 0x0, prev_in_list = 0x7fc1e27f7f78, ticket = 0x0, key = {
    m_length = 9, m_db_name_length = 4,
    m_ptr = "\003test\000ti", '\000' <repeats 378 times>,
    static m_namespace_to_wait_state_name = {{m_key = 111,
...

(gdb) c
Continuing.
Here we can see a stronger metadata lock request for the duration of transaction on the test.ti table. The next_in_list pointer is NULL, so we are done with this "batch" of requests. If we continue, we'll see the same locks requested one by one while hitting breakpoint:
Breakpoint 1, MDL_context::acquire_lock (this=0x7fc1e27ec0c8,
    mdl_request=0x7fc1e27f7f70, lock_wait_timeout=31536000)
    at /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/sql/mdl.cc:3506
3506      set_timespec(&abs_timeout, lock_wait_timeout);

(gdb) p *mdl_request
$7 = {type = MDL_INTENTION_EXCLUSIVE, duration = MDL_TRANSACTION,
  next_in_list = 0x7fc1e27f7d70, prev_in_list = 0x7fc1b2df95b8, ticket = 0x0,
  key = {m_length = 7, m_db_name_length = 4,
    m_ptr = "\002test", '\000' <repeats 175 times>"\377, \377\377\037\000\000\000\000\001", '\000' <repeats 15 times>, "\001", '\000' <repeats 31 times>, "\a",
...

(gdb) p *(mdl_request->next_in_list)
$8 = {type = MDL_SHARED_UPGRADABLE, duration = MDL_TRANSACTION,
  next_in_list = 0x0, prev_in_list = 0x7fc1e27f7f78, ticket = 0x0, key = {
    m_length = 9, m_db_name_length = 4,
    m_ptr = "\003test\000ti", '\000' <repeats 378 times>,
    static m_namespace_to_wait_state_name = {{m_key = 111,
...

(gdb) c
Continuing.

Breakpoint 1, MDL_context::acquire_lock (this=0x7fc1e27ec0c8,
    mdl_request=0x7fc1e27f7d70, lock_wait_timeout=31536000)
    at /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/sql/mdl.cc:3506
3506      set_timespec(&abs_timeout, lock_wait_timeout);

(gdb) p *mdl_request
$9 = {type = MDL_SHARED_UPGRADABLE, duration = MDL_TRANSACTION,
  next_in_list = 0x0, prev_in_list = 0x7fc1e27f7f78, ticket = 0x0, key = {
    m_length = 9, m_db_name_length = 4,
    m_ptr = "\003test\000ti", '\000' <repeats 378 times>,
    static m_namespace_to_wait_state_name = {{m_key = 111,
...

(gdb) c
Continuing.
So, we have two ways to study the locks and new one in this post is to check the list of the lock requests in a single mdl_request list of pointers.

Eventually we also get the following breakpoint hit:
(gdb) c
Continuing.

Breakpoint 1, MDL_context::acquire_lock (this=0x7fc1e27ec0c8,
    mdl_request=0x7fc1b2df98a0, lock_wait_timeout=31536000)
    at /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/sql/mdl.cc:3506
3506      set_timespec(&abs_timeout, lock_wait_timeout);
(gdb) p *mdl_request
$15 = {type = MDL_EXCLUSIVE, duration = MDL_TRANSACTION,
  next_in_list = 0xd044f7, prev_in_list = 0x7fc1b2dfa914, ticket = 0x0, key = {
    m_length = 9, m_db_name_length = 4,
    m_ptr = "\003test\000ti\000_3\000\001\000\000\000\003\000\000\000<\240V\001\
...
Now, this is a real problem for any concurrent activity with the table, it's the strongest MDL_EXCLUSIVE lock requested on test.ti table that is hold (when obtained) till the end of ALTER statement...

To summarize this post:
  1. We see that we still can use gdb to study metadata locks in MySQL (or Percona Server) 5.7.x, but the breakpoint should be set on MDL_context::acquire_lock (at least until I'll find time to study changes in the source code for any alternatives).
  2. When breakpoint is hit you can study the content of structure pointed by the mdl_request parameter to find out what kind of lock is requested.
  3. At table level ALTER TABLE ... STATS_AUTO_RECALC=1 in MySQL 5.7 starts with weak enough MDL_SHARED_UPGRADABLE lock, but then eventually we end up with the request for the strongest MDL_EXCLUSIVE lock that is either blocked by any concurrent access to the table until transaction there is completed, and/or blocks everything (including SELECT from the table) until ALTER finally completes.
Next time in this series I plan to check what happens with metadata locks for partitioned tables and how much maintenance of individual partitions may affect other partitions. Other idea would be to check what metadata locks are set when statistics for the InnoDB table is recalculated automatically when 10% of data changes. Stay tuned!

Friday, February 26, 2016

On Upgrades to Percona Server 5.7 GA: Ubuntu 14.04 Case

After upgrading Percona Server 5.6.28 (and "full stack" of Percona software) to 5.7.10-3 GA on CentOS 6 and building it from Git Source on Fedora Core 23 yesterday, I've decided to spend some time upgrading Percona Server on my netbook that plays a role of "home build/test/VMs server" and runs under Ubuntu 14.04:
openxs@ao756:~$ uname -a
Linux ao756 3.13.0-77-generic #121-Ubuntu SMP Wed Jan 20 10:50:42 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
openxs@ao756:~$ cat /etc/issue
Ubuntu 14.04.4 LTS \n \l
It had the following set of Percona packages (as a result of some tests with Percona XtraDB Cluster I had to use sometime while working there) and mysqld process running after OS startup:
openxs@ao756:~$ dpkg -l | grep -i percona | grep ii
ii  libmysqlclient18                                      1:5.5.35-rel33.0-611.raring                         amd64        Percona Server database client library
ii  libmysqlclient18-dev                                  1:5.5.35-rel33.0-611.raring                         amd64        Percona Server database development files
ii  libperconaserverclient18                              5.5.47-rel37.7-1.trusty                             amd64        Percona Server database client library
ii  libperconaserverclient18.1                            5.6.28-76.1-1.trusty                                amd64        Percona Server database client library
ii  percona-server-5.6-dbg                                5.6.28-76.1-1.trusty                                amd64        Debugging package for Percona Server
ii  percona-server-client-5.6                             5.6.28-76.1-1.trusty                                amd64        Percona Server database client binaries
ii  percona-server-common-5.5                             5.5.47-rel37.7-1.trusty                             amd64        Percona Server database common files (e.g. /etc/mysql/my.cnf)
ii  percona-server-common-5.6                             5.6.28-76.1-1.trusty                                amd64        Percona Server database common files (e.g. /etc/mysql/my.cnf)
ii  percona-server-server-5.6                             5.6.28-76.1-1.trusty                                amd64        Percona Server database server binaries
ii  percona-toolkit                                       2.2.16-1                                            all          Advanced MySQL and system command-line tools
ii  percona-xtrabackup                                    2.3.3-1.trusty                                      amd64        Open source backup tool for InnoDB and XtraDB
ii  percona-xtradb-cluster-common-5.6                     5.6.28-25.14-1.trusty                               amd64        Percona XtraDB Cluster database common files (e.g. /etc/mysql/my.cnf)
ii  percona-xtradb-cluster-galera-3.x                     3.14-1.trusty                                       amd64        Galera components of Percona XtraDB Cluster

openxs@ao756:~$ ps aux | grep mysqld
root      1283  0.0  0.0   4440   756 ?        S    10:35   0:00 /bin/sh /usr/bin/mysqld_safe
mysql     1505  0.1 13.0 895592 505516 ?       Sl   10:35   0:01 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/lib/mysql/ao756.err --pid-file=/var/lib/mysql/ao756.pid
openxs    2938  0.0  0.0  14640   972 pts/4    S+   10:55   0:00 grep --color=auto mysqld
 This time I decided to follow the manual closely, so I checked for all usual/possible locations of configuration files, stopped MySQL server first and then installed the packages for 5.7 (I've noted TokuDB package was NOT installed, so I've added it to the list, as TokuDB storage engine is one of the reasons to use Percona Server 5.7 instead of well tested and known upstream MySQL 5.7):
openxs@ao756:~$ ls -l /etc/mysql/
total 16
drwxr-xr-x 2 root root 4096 тра 22  2014 conf.d
-rw------- 1 root root  333 тра 22  2014 debian.cnf
-rwxr-xr-x 1 root root 1285 січ  9 02:24 debian-start
-rw-r--r-- 1 root root  120 січ 12 13:03 my.cnf

openxs@ao756:~$ ls -l /etc/mysql/conf.d/
total 0

openxs@ao756:~$ ls -l /etc/my.cnf
-rw-r--r-- 1 root root 180 лис 17 14:22 /etc/my.cnf

openxs@ao756:~$ sudo service mysql stop
[sudo] password for openxs:
 * Stopping MySQL (Percona Server) mysqld                                [ OK ]
openxs@ao756:~$ ps aux | grep mysqld
openxs    3102  0.0  0.0  14636   968 pts/4    S+   11:01   0:00 grep --color=auto mysqld

openxs@ao756:~$ sudo apt-get install percona-server-server-5.7 percona-server-tokudb-5.7Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libperconaserverclient18.1 percona-server-common-5.6
  percona-xtradb-cluster-galera-3.x socat
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
  libmecab2 percona-server-client-5.7 percona-server-common-5.7
The following packages will be REMOVED:
  percona-server-5.6-dbg percona-server-client-5.6 percona-server-server-5.6
The following NEW packages will be installed:
  libmecab2 percona-server-client-5.7 percona-server-common-5.7
  percona-server-server-5.7 percona-server-tokudb-5.7
0 upgraded, 5 newly installed, 3 to remove and 3 not upgraded.
Need to get 21,6 MB of archives.
After this operation, 166 MB disk space will be freed.
...
Setting up libmecab2 (0.996-1.1) ...
Setting up percona-server-server-5.7 (5.7.10-3-1.trusty) ...
Installing new version of config file /etc/default/mysql ...
Installing new version of config file /etc/init.d/mysql ...
...
.................... * Percona Server 5.7.10-3 did not start. Please check logs for more details.
Processing triggers for ureadahead (0.100.0-16) ...
Setting up percona-server-tokudb-5.7 (5.7.10-3-1.trusty) ...


 * This release of Percona Server is distributed with TokuDB storage engine.
 * Run the following script to enable the TokuDB storage engine in Percona Server:

        ps_tokudb_admin --enable -u <mysql_admin_user> -p[mysql_admin_pass] [-S <socket>] [-h <host> -P <port>]

 * See http://www.percona.com/doc/percona-server/5.7/tokudb/tokudb_installation.html for more installation details

 * See http://www.percona.com/doc/percona-server/5.7/tokudb/tokudb_intro.html for an introduction to TokuDB


Processing triggers for libc-bin (2.19-0ubuntu6.7) ...
openxs@ao756:~$ ps aux | grep mysqldopenxs   17729  0.0  0.0  14636   972 pts/4    S+   11:20   0:00 grep --color=auto mysqld
So, even though the server was stopped (by the manual) at the moment of installation, there was an attempt to start server in the process that failed after painful 900 seconds of waiting... I've noted that new /etc/init.d/mysql script was installed and decided to check it (starting it manually under bash -x etc, reading and so on). While 900 seconds startup and 300 second shutdown default timeouts were visible in the code, it was not clear what's wrong. I suspected Percona Server Bug #1490897 again (it was a problem for me on Ubuntu 15.04), but here usual init.d approach to startup was used and none of the configuration files I checked had a single pid_file setting. Attempts to start using the script directly also ended up with 900 dots printed:
openxs@ao756:~$ sudo /etc/init.d/mysql start
No directory, logging in with HOME=/
...
Getting a bit tired of all these, I've checked that at least one way to start server up works, without defaults:
openxs@ao756:~$ sudo mysqld_safe --no-defaults &
[1] 21048
openxs@ao756:~$ 160226 11:51:32 mysqld_safe Adding '/usr/lib/x86_64-linux-gnu/libjemalloc.so.1' to LD_PRELOAD for mysqld
160226 11:51:32 mysqld_safe Logging to '/var/lib/mysql/ao756.err'.
160226 11:51:32 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

openxs@ao756:~$ ps aux | grep mysqld
root     21048  0.0  0.0  74060  2204 pts/4    S    11:51   0:00 sudo mysqld_safe --no-defaults
root     21049  0.0  0.0   4440   784 pts/4    S    11:51   0:00 /bin/sh /usr/bin/mysqld_safe --no-defaults
mysql    21124  1.8 14.8 1042020 573708 pts/4  Sl   11:51   0:00 /usr/sbin/mysqld --no-defaults --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/lib/mysql/ao756.err --pid-file=/var/lib/mysql/ao756.pid
openxs   21154  0.0  0.0  14636   972 pts/4    S+   11:51   0:00 grep --color=auto mysqld

openxs@ao756:~$ mysql -uroot -proot test
mysql: [Warning] Using a password on the command line interface can be insecure.
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.10-3 Percona Server (GPL), Release '3', Revision '63dafaf'

Copyright (c) 2009-2015 Percona LLC and/or its affiliates
Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>
 So, the problem was surely with some settings in the configuration files and/or init.d script. One of the problems was actually something to expect from 5.7, based on the messages in the error log:
2016-02-26T09:54:47.927969Z 0 [ERROR] You have enabled the binary log, but you haven't provided the mandatory server-id. Please refer to the proper server start-up parameters documentation
2016-02-26T09:54:47.928010Z 0 [ERROR] Aborting

2016-02-26T09:54:47.928052Z 0 [Note] Binlog end
2016-02-26T09:54:47.928131Z 0 [Note] /usr/sbin/mysqld: Shutdown complete
and taking into account the content of /etc/my.cnf at the moment:
openxs@ao756:~$ cat /etc/my.cnf[mysqld_safe]
pid_file=/var/lib/mysql/ao756.pid
[mysql]
connect_timeout=2
[mysqld]
innodb_file_format=Barracuda
table_open_cache=1156
table_open_cache_instances=1
log_bin
gtid_mode=ON
enforce_gtid_consistency
log_slave_updates
 So, I've added mandatory (for my case) server_id=N setting to /etc/my.cnf and tried to start again, to find out error messages that I've seen too many times recently and this year:
2016-02-26T09:56:47.576196Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2016-02-26T09:56:47.666859Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2016-02-26T09:56:47.668162Z 0 [Warning] InnoDB: Retry attempts for reading partial data failed.
2016-02-26T09:56:47.668218Z 0 [ERROR] InnoDB: Tried to read 2048 bytes at offset 0 was only able to read02016-02-26T09:56:47.668238Z 0 [ERROR] InnoDB: Operating system error number 22 in a file operation.
2016-02-26T09:56:47.668263Z 0 [ERROR] InnoDB: Error number 22 means 'Invalid argument'
2016-02-26T09:56:47.668274Z 0 [Note] InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html
2016-02-26T09:56:47.668287Z 0 [ERROR] InnoDB: File (unknown): 'read' returned OS error 122. Cannot continue operation2016-02-26T09:56:47.668298Z 0 [ERROR] InnoDB: Cannot continue operation.
I know these messages by heart already thanks to Percona Server Bug #1131949. One of my configuration files had to have innodb_flush_method set to ALL_O_DIRECT:
openxs@ao756:~$ cat /etc/mysql/my.cnf
[mysqld]
innodb_flush_method = ALL_O_DIRECT
innodb_track_changed_pages = 1
innodb_log_file_size                    = 2G
After replacing it with usual O_DIRECT I was able to finally start the server as expected and execute mysql_upgrade:
openxs@ao756:~$ sudo mysqld_safe &
[1] 21711
openxs@ao756:~$ 160226 11:58:33 mysqld_safe Adding '/usr/lib/x86_64-linux-gnu/libjemalloc.so.1' to LD_PRELOAD for mysqld
160226 11:58:33 mysqld_safe Logging to '/var/lib/mysql/ao756.err'.
160226 11:58:33 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

openxs@ao756:~$ sudo mysql_upgrade -uroot -proot
mysql_upgrade: [Warning] Using a password on the command line interface can be insecure.
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
...
Repairing tables
mysql.proxies_priv                                 OK
Upgrade process completed successfully.
Checking if update is needed.

openxs@ao756:~$ ps aux | grep mysqld
root     21711  0.0  0.0  74060  2200 pts/4    S    11:58   0:00 sudo mysqld_safe
root     21712  0.0  0.0   4440   760 pts/4    S    11:58   0:00 /bin/sh /usr/bin/mysqld_safe
mysql    21938  6.8 15.8 1066860 610576 pts/4  Sl   11:58   0:12 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/lib/mysql/ao756.err --pid-file=/var/lib/mysql/ao756.pid
openxs   22292  0.0  0.0  14636   972 pts/4    S+   12:01   0:00 grep --color=auto mysqld
Moreover, now it can be started as a service also (even though the output looks unusual):
openxs@ao756:~$ sudo service mysql start
No directory, logging in with HOME=/
....
 * Percona Server 5.7.10-3 is started
Next step was to enable TokuDB based on instructions I've already seen:
openxs@ao756:~$ sudo ps_tokudb_admin --enable -uroot -proot
Checking if Percona Server is running with jemalloc enabled...
INFO: Percona Server is running with jemalloc enabled.

Checking transparent huge pages status on the system...
INFO: Transparent huge pages are enabled (should be disabled).

Checking if thp-setting=never option is already set in config file...
INFO: Option thp-setting=never is not set in the config file.
      (needed only if THP is not disabled permanently on the system)

Checking TokuDB engine plugin status...
INFO: TokuDB engine plugin is not installed.

Disabling transparent huge pages for the current session...
INFO: Successfully disabled transparent huge pages for this session.

Adding thp-setting=never option into /etc/my.cnf
INFO: Successfully added thp-setting=never option into /etc/my.cnf

Installing TokuDB engine...
INFO: Successfully installed TokuDB engine plugin.
and create my first TokuDB table on Percona Server 5.7:
openxs@ao756:~$ mysql -uroot -proot testmysql: [Warning] Using a password on the command line interface can be insecure.
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.10-3-log Percona Server (GPL), Release '3', Revision '63dafaf'

Copyright (c) 2009-2015 Percona LLC and/or its affiliates
Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table tt(id int primary key) engine=tokudb;
Query OK, 0 rows affected (0.40 sec)

mysql> show create table tt\G
*************************** 1. row ***************************
       Table: tt
Create Table: CREATE TABLE `tt` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=TokuDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
As I want to check "full stack", I definitely need xtrabackup version 2.4 that is not installed by default (note the unusual package name for .deb packages, server has dot in the name before minor version):
openxs@ao756:~$ sudo apt-get install percona-xtrabackup-24
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libperconaserverclient18.1 percona-server-common-5.6
  percona-xtradb-cluster-galera-3.x socat
Use 'apt-get autoremove' to remove them.
The following packages will be REMOVED:
  percona-xtrabackup
The following NEW packages will be installed:
  percona-xtrabackup-24
0 upgraded, 1 newly installed, 1 to remove and 3 not upgraded.
Need to get 6 084 kB of archives.
After this operation, 10,3 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://repo.percona.com/apt/ trusty/main percona-xtrabackup-24 amd64 2.4.1-1.trusty [6 084 kB]
Fetched 6 084 kB in 8s (691 kB/s)
(Reading database ... 801961 files and directories currently installed.)
Removing percona-xtrabackup (2.3.3-1.trusty) ...
Processing triggers for man-db (2.6.7.1-1ubuntu1) ...
Selecting previously unselected package percona-xtrabackup-24.
(Reading database ... 801947 files and directories currently installed.)
Preparing to unpack .../percona-xtrabackup-24_2.4.1-1.trusty_amd64.deb ...
Unpacking percona-xtrabackup-24 (2.4.1-1.trusty) ...
Processing triggers for man-db (2.6.7.1-1ubuntu1) ...
Setting up percona-xtrabackup-24 (2.4.1-1.trusty) ...

openxs@ao756:~$ xtrabackup --version
xtrabackup version 2.4.1 based on MySQL server 5.7.10 Linux (x86_64) (revision id: a2dc9d4)
It seems at least in my case xtrabackup was not able to find the socket by default, but eventually it worked:
openxs@ao756:~$ sudo xtrabackup --user=root --password=root --backup
160226 12:08:45  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup' as 'root'  (using password: YES).
160226 12:08:45  version_check Connected to MySQL server
160226 12:08:45  version_check Executing a version check against the server...
160226 12:08:47  version_check Done.
160226 12:08:47 Connecting to MySQL server host: localhost, user: root, password: set, port: 0, socket: (null)
Failed to connect to MySQL server: Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2).

openxs@ao756:~$ mysql -uroot -proot test -e "show variables like 'socket%'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+-----------------------------+
| Variable_name | Value                       |
+---------------+-----------------------------+
| socket        | /var/run/mysqld/mysqld.sock |
+---------------+-----------------------------+

openxs@ao756:~$ sudo xtrabackup --user=root --password=root --socket=/var/run/mysqld/mysqld.sock  --backup
...
160226 12:11:43 Finished backing up non-InnoDB tables and files
160226 12:11:43 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
xtrabackup: The latest check point (for incremental): '7481732856'
xtrabackup: Stopping log copying thread.
.160226 12:11:43 >> log scanned up to (7481732865)

160226 12:11:43 Executing UNLOCK TABLES
160226 12:11:43 All tables unlocked
160226 12:11:43 [00] Copying ib_buffer_pool to /home/openxs/xtrabackup_backupfiles/ib_buffer_pool
160226 12:11:43 [00]        ...done
160226 12:11:43 Backup created in directory '/home/openxs/xtrabackup_backupfiles/'
160226 12:11:43 [00] Writing backup-my.cnf
160226 12:11:43 [00]        ...done
160226 12:11:43 [00] Writing xtrabackup_info
160226 12:11:43 [00]        ...done
xtrabackup: Transaction log of lsn (7481732815) to (7481732865) was copied.
160226 12:11:44 completed OK!
I've also installed debug package to have symbols while working with gdb, and checked that pt-mysql-summary works:
openxs@ao756:~$ sudo apt-get install percona-server-5.7-dbg
...
Fetched 160 MB in 59s (2 702 kB/s)
Selecting previously unselected package percona-server-5.7-dbg.
(Reading database ... 801962 files and directories currently installed.)
Preparing to unpack .../percona-server-5.7-dbg_5.7.10-3-1.trusty_amd64.deb ...
Unpacking percona-server-5.7-dbg (5.7.10-3-1.trusty) ...
Setting up percona-server-5.7-dbg (5.7.10-3-1.trusty) ...
openxs@ao756:~$ pt-mysql-summary
# Percona Toolkit MySQL Summary Report #######################
              System time | 2016-02-26 10:17:09 UTC (local TZ: EET +0200)
# Instances ##################################################
  Port  Data Directory             Nice OOM Socket
  ===== ========================== ==== === ======
        /var/lib/mysql             0    0
# MySQL Executable ###########################################
       Path to executable | /usr/sbin/mysqld
              Has symbols | No
# Report On Port 3306 ########################################
                     User | @localhost
                     Time | 2016-02-26 12:17:08 (EET)
                 Hostname | ao756
                  Version | 5.7.10-3-log Percona Server (GPL), Release '3', Revision '63dafaf'
                 Built On | debian-linux-gnu x86_64
                  Started | 2016-02-26 12:02 (up 0+00:14:11)
                Databases | 2
                  Datadir | /var/lib/mysql/
                Processes | 1 connected, 1 running
              Replication | Is not a slave, has 0 slaves connected
                  Pidfile | /var/lib/mysql/ao756.pid (does not exist)...
As a last note, service can not be stopped properly:
openxs@ao756:~$ sudo service mysql stop
 * Percona Server 5.7.10-3 is already stopped
openxs@ao756:~$ ps aux | grep mysqld
mysql    24382  0.1  0.0   4440   760 ?        S    12:21   0:00 /bin/sh /usr/bin/mysqld_safe
mysql    24643  2.9  5.6 1234144 218252 ?      Sl   12:21   0:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --log-error=/var/lib/mysql/ao756.err --pid-file=/var/lib/mysql/ao756.pid
openxs   24756  0.0  0.0  14636   972 pts/4    S+   12:21   0:00 grep --color=auto mysqld
so one has to use mysqladmin:
openxs@ao756:~$ sudo mysqladmin -uroot -proot shutdownmysqladmin: [Warning] Using a password on the command line interface can be insecure.

openxs@ao756:~$ ps aux | grep mysqld
openxs   24765  0.0  0.0  14636   972 pts/4    S+   12:21   0:00 grep --color=auto mysqld
To summarize my experience of a less naive user of Ubuntu 14.04 (the one who already upgraded to Percona Server 5.7.10-3 on other Linux versions), one should note the following:

  1. Please, check all your configuration files carefully for any possible incompatibility with 5.7 before attempting upgrade.
  2. Make sure ALL_O_DIRECT value is NOT used for innodb_flush_method. Check this my Facebook post for more details and discussion with Laurynas about the bugs involved (Percona Server Bug #1529885 and Percona Server Bug #1550404) and manual statements related.
  3. If you need to use xtrabackup with Percona Server 5.7, you have to install percona-xtrabackup-24 explicitly and may have to specify --socket option explicitly.
  4. I think /etc/init.d/mysql script from Percona Server 5.7.10-3 is somewhat broken on Ubuntu 14.04. Something to study in more details later.
I hope this step by step guide will help some of you to upgrade faster and provide some troubleshooting hints in case of problems.