Saturday, November 18, 2017

How lsof Utility May Help MySQL DBAs

While working in Support, I noticed that probably at least once a week I have to use or mention lsof utility in some context. This week, for example, we had a customer trying to find out if his mysqld process running is linked with tcmalloc library. He started it different ways, using LD_PRELOAD directly and --malloc-lib option of mysqld_safe script etc, but wanted to verify that his attempts really worked as expected. My immediate comment in the internal chat was: "Just let them run lsof -p `pidof mysqld` | grep mall and check!" My MariaDB 10.2 instance uses jemalloc and this can be checked exactly the same way:
openxs@ao756:~/dbs/maria10.2$ ps aux | grep mysqld...
openxs    4619  0.0  0.0   4452   804 pts/2    S    17:02   0:00 /bin/sh bin/mysqld_safe --no-defaults --port=3308 --malloc-lib=/usr/lib/x86_64-linux-gnu/
openxs    4734  0.5  2.9 876368 115156 pts/2   Sl   17:02   0:00 /home/openxs/dbs/maria10.2/bin/mysqld --no-defaults --basedir=/home/openxs/dbs/maria10.2 --datadir=/home/openxs/dbs/maria10.2/data --plugin-dir=/home/openxs/dbs/maria10.2/lib/plugin --log-error=/home/openxs/dbs/maria10.2/data/ao756.err --port=3308
openxs    5391  0.0  0.0  14652   964 pts/2    S+   17:05   0:00 grep --color=auto mysqld
openxs@ao756:~/dbs/maria10.2$ lsof -p 4734 | grep mall
mysqld  4734 openxs  mem    REG              252,2    219776 12058822 /usr/lib/x86_64-linux-gnu/
I think it's time to summarize most important use cases for lsof utility for MySQL DBAs. I am going to show different cases when it can be useful based on public MySQL bug reports. 

As one can read in the manual, lsof "lists on its standard output file information about files opened by processes". In one of the simplest possible calls presented above, we just pass PID of the process after -p option and get list of open files for this process. This includes shared libraries the process uses. By default the following format of the output is used:
openxs@ao756:~/dbs/maria10.2$ lsof -p 4734 | more
COMMAND  PID   USER   FD   TYPE             DEVICE  SIZE/OFF     NODE NAMEmysqld  4734 openxs  cwd    DIR              252,2      4096 29638597 /home/openxs/dbs/maria10.2/data
mysqld  4734 openxs  rtd    DIR              252,2      4096        2 /
mysqld  4734 openxs  txt    REG              252,2 147257671 29514843 /home/openxs/dbs/maria10.2/bin/mysqld
mysqld  4734 openxs  mem    REG              252,2     31792  1311130 /lib/x86_64-linux-gnu/
mysqld  4734 openxs  mem    REG              252,2 101270905 29241175 /home/openxs/dbs/maria10.2/lib/plugin/

mysqld  4734 openxs  DEL    REG               0,11             443265 /[aio]
mysqld  4734 openxs    0r   CHR                1,3       0t0     1050 /dev/null
mysqld  4734 openxs    1w   REG              252,2     52623  5255961 /home/openxs/dbs/maria10.2/data/.rocksdb/LOG
mysqld  4734 openxs    2w   REG              252,2    458880 29647192 /home/openxs/dbs/maria10.2/data/ao756.err
mysqld  4734 openxs    3r   DIR              252,2      4096  5255249 /home/openxs/dbs/maria10.2/data/.rocksdb
mysqld  4734 openxs  451u  IPv6             443558       0t0      TCP *:3308 (LISTEN)
mysqld  4734 openxs  452u  unix 0x0000000000000000       0t0   443559 /tmp/mysql.sock
mysqld  4734 openxs  470u   REG              252,2         0 29756970 /home/openxs/dbs/maria10.2/data/mysql/event.MYD
mysqld  4734 openxs  471u   REG              252,2         0 29647195 /home/openxs/dbs/maria10.2/data/
Most columns have obvious meaning, so let me concentrate on the few. FD should be numeric file descriptor, and it is for normal files. In this case it is also followed by a letter describing the mode under which the file is open (r for read, w for write and u for update). There may be one more letter describing a type of lock applied to the file. But we can see values without any single digit in the output above, so obviously some special values can be present there, like cwd for current working directory, rtd for root directory, txt for program text (code and data) or mem for memory mapped file, etc.

TYPE column is also interesting and may have plenty of values (as there are many types of files in Linux). REG means regular file, DIR is, obviously, a directory. Note also unix for a socket and IPv6 for the TCP port mysqld process listens to.

In SIZE/OFF column for normal files we usually see their size in bytes. Values for offset in file are usually prefixed with 0t if the value is decimal, or 0x if it's hex. NAME is obviously a fully specified file name (with symbolic links resolved). Some more details about the output format are discussed in the following examples.

Another usual way to use lsof is to pass a file name and get details about processes that have it opened, like this:
openxs@ao756:~/dbs/maria10.2$ lsof /tmp/mysql.sock
mysqld  4734 openxs  452u  unix 0x0000000000000000      0t0 443559 /tmp/mysql.sock
openxs@ao756:~/dbs/maria10.2$ lsof /home/openxs/dbs/maria10.2
mysqld_sa  4619 openxs  cwd    DIR  252,2     4096 29235594 /home/openxs/dbs/maria10.2
lsof      14354 openxs  cwd    DIR  252,2     4096 29235594 /home/openxs/dbs/maria10.2
lsof      14355 openxs  cwd    DIR  252,2     4096 29235594 /home/openxs/dbs/maria10.2
bash      29244 openxs  cwd    DIR  252,2     4096 29235594 /home/openxs/dbs/maria10.2
In this case we see that /home/openxs/dbs/maria10.2 is used as a current working directory by 4 processes. Usually this kind of check is used when we can not unmount some directory, but it may be also useful in context of MySQL when you get error messages that some file is already used by other process. In the first example above I was checking what process could use /tmp/mysql.sock file.

Now, with the above details on basic usage in mind, let's check several recent enough MySQL bug reports that demonstrate typical and more advanced usage of lsof:
  • Bug #66237 - "Temporary files created by binary log cache are not purged after transaction commit". My former colleague and mentor from Percona, Miguel Angel Nieto (who recently joined a dark side of MongoDB employees) used lsof to show numerous files with names ML* created and left (until connection is closed) by mysqld process in /tmp directory (tmpdir to be precize) of a server with binary logging enabled, when transaction size was larger that binlog cache size. The bug is fixed in 5.6.17+ and 5.7.2+. It shows us a usual way of creating temporary files by MySQL server:
    # lsof -p 6112|grep ML
    mysqld  6112 root   38u   REG                7,0  106594304      18 /tmp/MLjw4ecJ (deleted)
    mysqld  6112 root   39u   REG                7,0  237314310      17 /tmp/MLwdWDGW (deleted)
    Notice (deleted) above. This is a result of immediate call to unlink() when temporary files are created. Check this in the source code, as well as my_delete() implementation.
  • Bug #82870 - "mysqld opens too many descriptors for slow query log". This bug (that is still "Verified") was opened by my former colleague Sveta Smirnova (now in Percona). Basically, mysqld opens too many descriptors for slow query log (and general query log) if it is turned ON and OFF while concurrent sessions are running. lsof allowed to see multiple descriptors created for the same file, until eventually open_files_limit is hit.
  • Bug #83434 - "Select statement with partition selection against MyISAM table opens all partitions". This bug (later declared a duplicate of older one and, eventually, a documented, even if unexpected, behavior by design) was opened by my colleague from MariaDB Geoff Montee. lsof utility helped to show that all partitions are actually opened by the mysqld process in this case.
  • Bug #74145 - "FLUSH LOGS improperly disables the logging if the log file cannot be accessed". This bug (still "Verified") was reported by Jean Weisbuch. Here we can see how lsof was used to find out if slow log is open after FLUSH. The logging has to be disabled, but MySQL continue to lie that it is enabled. I remember many cases when lsof also helped to find out where the error log (file with descriptor 2w) is really located/redirected to.
  • Bug #77752 - "bind-address wrongly prefers IPv4 over IPv6". This was not a bug (more like a configuration issue), but see how lsof -i is used by Daniël van Eeden to find out what process listens to a specific port, and does it listen to IPv4 or IPv6 address.
  • Bug #87589 - "Documentation incorrectly states that LOAD DATA LOCAL INFILE does not use tmpdir". In this "Verified" bug report Geoff Montee used lsof to show that temporary files are really created in tmpdir, not in /tmp (OS temporary directory). This is how you can find out when MySQL manual lies...
  • Bug #77519 - "Reported location of Innodb Merge Temp File is wrong". One more bug from Daniël van Eeden, this time "Verified". By calling lsof +L1 during an online alter table, he demonstrated that two temp files are created in tmpdir instead of in the datadir (as described by the manual), while events_waits_history_long table in performance_schema seems to claim it waited ion temporary file in the datadir. Note that in other his bug report, Bug #76225, fixed since 5.7.9 and 5.8.0, he had also shown ML* binlog cache files created that were not instrumented by performance_schema.
  • Bug #75706 - "alter table import tablespace creates a temporary table". This bug report by BJ Quinn is formally still "Verified", but according to my former colleague Przemyslaw Malkowski from Percona in recent 5.6.x and 5.7.x versions lsof does NOT show temporary table created. Time to re-verify this bug maybe, if the decision is made on how to implement this?
  • Bug #83717 - "Manual does not explain when ddl_log.log file is deleted and how large it can be". My own bug report where lsof was used to show that the ddl_log.log file remains open even after online ALTER completes. Manual is clear about this now.
To summarize, lsof may help MySQL DBA to find out:
  • what dynamic libraries are really used by the mysqld process
  • where the error log and other logs are really located
  • what other process may have some file, port or socket opened that is needed for current MySQL instance
  • why you may hit open_files_limit or use all free space in some filesystem unexpectedly
  • where all kinds of temporary files are created during specific operations
  • how MySQL really works with files, ports and sockets
It also allows to find MySQL bugs and clarify missing details in MySQL manual.

Saturday, November 11, 2017

Fun with Bugs #57 - On MySQL Bug Reports I am Subscribed to, Part I

I've decided to stop reviewing MySQL Release Notes in this series, but it does not mean that I am not interested in MySQL bugs any more. At the moment I am subscribed to 91 active MySQL bugs reported by other MySQL users, and in this blog post I am going to present 15 of them, the most recently reported ones. I'd really want to see them fixed or at least properly processed as soon as possible.

In some cases I am going to add my speculations on how the bug had better be handled, or maybe highlight some important details about it. It is not my job any more to process/"verify" any community bug reports for any kind of MySQL, but I did that for many years and I've spent more than 5 years "on the other side", being a member of Community, so in some cases I let myself to share some strong opinion on what may be done differently from the Oracle side.

As a side note, I started to subscribe to MySQL bugs mostly after I left Oracle, as before that I got email notification about each and every change in every MySQL bug report ever created...

Here is the list, starting from the most recent ones:
  • Bug #88422 - "MySQL 5.7 innodb purge thread get oldest readview could block other transaction". It is one of that bug reports without a test case from reporter. It is tempting to set it to "Verified" just "based on code review", as the code in 5.7 is quite obviously shows both holding the trx_sys->mutex and linear complexity of search depending on number of read views in the worst case (when most of them are closed):
    Get the oldest (active) view in the system.
    @return oldest view if found or NULL */

    MVCC::get_oldest_view() const
            ReadView*       view;


            for (view = UT_LIST_GET_LAST(m_views);
                 view != NULL;
                 view = UT_LIST_GET_PREV(m_view_list, view)) {

                    if (!view->is_closed()) {

    But probably current Oracle bugs verification rules do not let to just mark it as verified. After all, somebody will have to create a test case... So, my dear old friend Sinisa Milivojevic decided to try to force bug reporter to provide a test case instead of spending some time trying to create one himself. I am not going to blame him for that, why to try the easy way :) But I consider this his statement in the comment dated [10 Nov 16:21]:
    "... 5.7 methods holds no mutex what so ever..."
    a bit wrong, as we can see the mutex is acquired when get_oldest_view() method is called:
    MVCC::clone_oldest_view(ReadView* view)

            ReadView*       oldest_view = get_oldest_view();

            if (oldest_view == NULL) {
  • Bug #88381 - "Predicate cannot be pushed down "past" window function". Here bug reporter had provided enough hints for a test case. One can probably just check 'Handler%' status variables before and after query execution to come to the conclusion. Moreover, it seems Oracle developer,  Dag Wanvik, accepted this as a known limitation, but the bug still remains "Open" and nobody knows if it was copied to the internal bugs database, got prioritized and if any work on this is planned any time soon. We shell see. You may also want to monitor MDEV-10855.
  • Bug #88373 - "Renaming a column breaks replication from 5.7 to 8.0 because of impl. collation". This bug was quickly verified by Umesh Shastry. I expect a lot of "fun" for users upgrading to MySQL 8.0 when it becomes GA, especially in replication setups.
  • Bug #88328 - "Performance degradation with the slave_parallel_workers increase". There is no test case, just some general description and ideas about the root case when semi-sync replication is used. I expect this bug to stay "Open" for a long time, as it is a topic for a good research and blog posts like this one, that is, a work for real expert!
  • Bug #88223 - "Replication with no tmpdir space and InnoDB as tmp_storage_engine can break". Here we have clear and simple test case from Sveta Smirnova (no wonder, she also worked at bugs verification team in MySQL, Sun and Oracle). I hope Umesh will verify it soon. As a side note, it is explained (in the comments) elsewhere that InnoDB as internal_tmp_disk_storage_engine may not be the best possible option. We do not have this variable and do not plan to support InnoDB for internal temporary tables in MariaDB 10.2+.
  • Bug #88220 - "compressing and uncompressing InnoDB tables seems to be inconsistent". See also other, older bug reports mentioned there that are duplicates/closely related, but were not getting proper attention.
  • Bug #88150 - "'Undo log record is too big.' error occurring in very narrow range of str length". It was reported by my colleague Geoff Montee and is already fixed in recent versions of MariaDB (see MDEV-14051 for the details and some nice examples of gdb usage by a developer)!
  • Bug #88127 - "Index not used for 'order by' query with utf8mb4 character set". Here I am just curious when bugs like that would be caught up by Oracle QA before any public releases.
  • Bug #88071 - "An arresting Performance degradation when set sort_buffer_size=32M". here the test case is clear - just run sysbench oltp test at high concurrency with different values of sort_buffer_size. Still, Sinisa Milivojevic decided to explain when RAM limit may play a role instead of just showing how it works great (if it does) on any server with enough RAM... Let's see how this attempt to force bug reporter to work/explain more may end up...
  • Bug #87947 - "Optimizer chooses ref over range when access when range access is faster". Nice example of a case when optimizer trace may be really useful. Øystein Grøvlen kindly explained that "range access and ref access are not comparable costs". I wish we get better cost model for such cases in MySQL one day.
  • Bug #87837 - "MySQL 8 does not start after upgrade to 8.03". It is expected actually, and even somewhat documented in the release notes that MySQL 8.0.3 is not compatible to any older version. So, it is more like MySQL Installer (that I do not care much about) bug, but I still subscribed to it as yet another source of potential fun during further upgrade attempts.
  • Bug #87716 - "SELECT FOR UPDATE with BETWEEN AND gets row lock excessively". I think I already studied once why with IN() rows are locked differently by InnoDB comparing to BETWEEN that selects the same rows. But I'd like to know what's the Oracle's take on this, and I'd like to study this specific test case in details one day as well.
  • Bug #87670 - "Force index for group by is not always honored". Clear and simple test case, so no wonder it was immediately verified.
  • Bug #87621 - "Huge InnoDB slowdown when selecting strings without indexes ". I'd like to check with perf one day where the time is spent mostly during this test. For now I think this is a result of the way "long" data are stored on separate pages in InnoDB. What;'s interesting here is also a test case where R is used to generate data set.
  • Bug #87589 - "Documentation incorrectly states that LOAD DATA LOCAL INFILE does not use tmpdir". This was yet another report from my colleague Geoff Montee. lsof is your friend, maybe I have to talk about it one day at FOSDEM (call for papers is still open :) I like to find and follow bugs and missing details in MySQL manual, maybe because I would never be able to contribute to it as a writer directly...

So, this list shows my typical recent interests related to MySQL bugs - mostly InnoDB, optimizer, replication problems, fine manual and just some fun details like the way some Oracle engineers try to avoid working extra hard while processing bugs... I am also happy to know that in some cases MariaDB is able to deliver fixes faster.

Thursday, October 19, 2017

Fun with Bugs #56 - On Some Public Bugs Fixed in MySQL 5.7.20

While MySQL 8.0.x hardly has much impact on my regular work, recent MySQL 5.7.20 release is something to check carefully. MySQL 5.7 is widely used in production, as a base for Percona Server 5.7, some features may be merged into MariaDB 10.x etc. So, here is my review of some community reported bugs that were fixed in recently released MySQL 5.7.20, based on the release notes.

Usually I start with InnoDB bug fixes, but in 5.7.20 several related fixes were made only to bugs reported internally. So, this time I have to start with partitioning:
  • Bug #86255 - First one to write about, and the bug report is private... Second one (Bug #76418) is also private. All we have is this:
    "Partitioning: In certain cases when fetching heap records a partition ID could be set to zero. (Bug #86255, Bug #26034430)"

    "Partitioning: Queries involving NULL were not always handled correctly on tables that were partitioned by LIST. (Bug #76418, Bug #20748521)"
That's all, folks. I fail to understand why bugs with such a description can stay private after they are fixed. I have to admit: I do not get it. Moreover, I am not going even to try any longer. Lists with one item look stupid, but hiding such bugs is not much better, IMHO.

Lucky I am, there were several bug fixes related to replication:
  • Bug #85352 - "Replication regression with RBR and partitioned tables". This regression bug (comparing to 5.5.x) was reported by Juan Arruti and immediately verified by Umesh Shastry. I do not know why it is listed as InnoDB in the release notes, as it hardly can be about native InnoDB partitioning, based on versions affected.
  • Bug #86224 - "mysqlrplsync errors out because slave is using --super-read-only option". It was reported by Richard Morris and verified by Bogdan Kecman.
  • Bug #84731 - "Group Replication: mysql client connections hang during group replication start", was reported by Kenny Gryp and verified by Umesh Shastry. Another bug reported by Kenny was also fixed in 5.7.20, Bug #84798 - "Group Replication can use some verbosity in the error log".
  • Bug #86250 - "thd->ha_data[ht_arg->slot].ha_ptr_backup == __null || (thd->is_binlog_applier()". This debug assertion was reported by Roel Van de Paar and verified by Umesh Shastry
  • Bug #85639 - "XA transactions are 'unsafe' for RPL using SBR". It was reported by João Gramacho.
  • Bug #86288 - "mysqlbinlog read-from-remote-server not honoring rewrite_db filtering", was reported by Julie Hergert.
Other bug fixes not to miss are:
  • Bug #85622 - "5.6: wrong results with materialization, views". It was reported by Shane Bester. Even though it was initially stated that only 5.6 is affected, release notes say there was a fix for 5.7.20 and even 8.0.3.
  • Bug #82992 - "Some warnings appear in dump from mysqldump". This funny bug was found by Nong LO and verified by Sinisa Milivojevic.
  • Bug #81714 - "mysqldump get_view_structure does not free MYSQL_RES in one error path". Thisbug was reported by Laurynas Biveinis, but it was noticed and patched by Yura Sorokin, also from Percona.
  • Bug #83950 - "LOAD DATA INFILE fails with an escape character followed by a multi-byte one", was reported by yours truly and verified by Umesh Shastry. Unfortunately the bug report itself does NOT say explicitly what versions had got the fix.
  • Bug #79596 - "If client killed after ROLLBACK TO SAVEPOINT previous stmts committed". This regression and potential data corruption bug was reported by Sveta Smirnova, verified by Umesh Shastry and studied at the source code level by Zhai Weixiang. Nice to see it fixed!
That's all, few build and packaging related bugs aside.

This was my very last attempt to do a detailed review of bug reports from MySQL Community based on official Release Notes. With private bugs and very few fixes for things coming from the public bugs database in general, it seems to make zero sense now to continue these. Authors of patches contributed are properly mentioned by Oracle, and we all know who verify community bug reports... One Twitter message would be enough to fit everything I have to say, and any real details should better be checked in git commits.

It's time to get back to the roots of this series and maybe write about bugs just opened, bugs not properly handled or just funny ones. I think it helped a lot back in 2013 to make MySQL 5.6 a release that was commonly accepted as a "good one". It's not fun any more and not much useful to report Oracle achievements in public bugs fixing, so I'd better switch to their problems.

Saturday, September 2, 2017

MySQL Support Engineer's Chronicles, Issue #8

This week is special and full of anniversaries for me. This week 5 years ago I left Oracle behind and joined Percona... Same week 5 years ago I had written something about MySQL in this blog for the first time in my life. 5 years ago I've created my Facebook account that I actively (ab)use for discussing work-related issues. So, to summarize, it's a five years anniversary of my coming out as a MySQL Entomologist, somebody who writes and speaks about MySQL and bugs in MySQL in public! These 5 years were mostly awesome.

I decided to celebrate with yet another post in this relatively new series and summarize in short what interesting things I studied, noticed or had to work on this week while providing support to customers of all kinds of MySQL.

This week started for me with the need to find out why mariabackup fails on Windows, for one of customers. If you missed it, MariaDB created a tool for online backup based on Percona's XtraBackup that supports few additional features (like data at rest encryption of MariaDB Server) and works on Windows as well, included it into MariaDB server and even declared it "Stable" as of MariaDB 10.1.26. In the process of working on that problem I had to use procmon tool, based on this KB article. The root cause of the problem was NTFS compression used for the target directory (see MDEV-13691 by Vladislav Vaintroub , who forces lazy me to improve my rudimentary Windows user skills from time to time, for some related details). So, better do not use NTFS compression of the backup destination if you need to back up big enough (50G+) tables. I really enjoyed working with procmon that helped to find out what could cause (somewhat random) "error 22" failures.

I was (positively) surprised to find out there there is a MariaDB KB article on such a specific topic as troubleshooting on Windows. Besides this one, I had to use the following KB articles while working with customers this week:
and found something new (for me) there. I never cared to find out what join_cache_level is used for previously, for example.

Besides mariabackup, this week I had to discuss various problems related to backing up TokuDB tables, so you should expect my blog posts related to this topic soon.

My colleague Geoff Montee published a useful post this week, "Automatically Dropping Old Partitions in MySQL and MariaDB". Make sure to check comments and his followup. Geoff had also reported nice Bug #87589 - "Documentation incorrectly states that LOAD DATA LOCAL INFILE does not use tmpdir", that is still "Open" for some reason.

During such a great week I had to report some MySQL bug, so I did it. Check Bug #87624 - "Wrong statement-based binary logging for CONCAT() in stored procedure". I've seen all these "Illegal mix of collations..." way too often over years.

Other bugs that attracted my attention were:
  • Bug #84108 - "Mysql 5.7.14 hangs and kills itself during drop database statement". This bug should probably become "Open" and properly processed, as current "Duplicate" status is questionable, at best. Arjen Lentz attracted mine (and not only mine) attention to this old enough and improperly handled bug report.
  • Bug #87619 - "InnoDB partition table will lock into the near record as a condition in the use". Nice to see this regression bug "Verified" after all. It seems native partitioning in MySQL 5.7 came with some cost of extra/different locking.
Time to stop writing and prepare for further celebrations, fun and anniversaries. Stay tuned!

Sunday, August 20, 2017

Fun with Bugs #55 - On Some Public Bugs Fixed in MySQL 8.0.2

I do not care much about MySQL 8.0.x at the moment, as it's far from being GA and is work in progress. It is not yet used by customers whom I have to support. But I know about many interesting changes and improvements there that, eventually, are going to influence all main forks and kinds of MySQL. So, it would not be wise to ignore MySQL 8.0.c entirely even for me.

For this post I decided to briefly check what community reported bugs were fixed in the recent release, 8.0.2, based on release notes. For me it's a measure of community interest in MySQL 8.0.x and Oracle's interest in further working with MySQL Community. I ended up with the following, short enough list of bug fixes in the categories I usually care about (InnoDB, partitioning, replication and optimizer):
  • The very first InnoDB bug mentioned in the release notes, Bug #85043, is private. I fail to see any valid reason for a bug in the version currently under development and not declared GA to remain private after the fix is released. If only it affects GA versions, and this is the case. The bug is fixed in 5.7.19 as well, as you can see in my previous post.
  • Another bug that is related to InnoDB and optimizer, is Bug #81031. It was also fixed in MySQL 5.6.37 and 5.7.19.
  • Bug #84038 - "Errors when restarting MySQL after FLUSH TABLES FOR EXPORT, RENAME and DROP", was also fixed in MySQL 5.7.19. I am actually surprised that as this stage we still have older InnoDB internal data dictionary tables in MySQL 8.0.x.
  • Group replication related Bug #85667, Bug #85047, Bug #84728  and Bug #84733 were also listed as fixed in MySQL 5.7.19.
  • Same situation with normal async replication bugs: Bug #83184, Bug #82283, Bug #81232, Bug #77406 etc. It's expected to see fixes applied to the oldest version affected and then fixes merged to newer versions.
  • The first really unique fix in 8.0.2 that I found was Bug #85639 - "XA transactions are 'unsafe' for RPL using SBR". It was reported by João Gramacho (who probably works for Oracle) originally for MySQL 5.7 and is going to be fixed in MySQL 5.7.20 also.
  • Replication-related Bug #85739 is still private. Release notes say:
    "Issuing SHOW SLAVE STATUS FOR CHANNEL 'group_replication_recovery' following a restart of a server using group replication led to an unplanned shutdown."
  • Yet another private replication bugs fixed in 8.0.2 are: Bug #85405, Bug #85084, Bug #84646Bug #84471, Bug #82467 and Bug #80368. I do not know who reported them, what versions are affected (but I suspect .5.7.x also) and why are they remaining private after being fixed in 8.0.2.
  • The first 8.0.x specific public bug report I've found in the release notes was reported by Andrey Hristov and verified by Umesh Shastry. It is Bug #85937 - "Unchecked read after allocated buffer".
  • Bug #86120 - "Upgrading a MySQL instance with a table name of >64 chars results in breakage", was reported by Daniël van Eeden and verified by Umesh Shastry. I expect a lot of problem reports when users starts to upgrade to 8.0... See also Bug #84889 - "MYSQL 8.0.1 - MYSQLD ERRORLOG UPGRADE ERRORS AT SERVER START LIVE UPGRADE", by Susan Koerner.
  • Bug #85994 - "Out-of-bounds read in fix_paths", was reported by Laurynas Biveinis and verified by Umesh Shastry. Percona seems to care a lot about improving MySQL 8.0 (as well as other Oracle MySQL GA versions). See also Bug #85678 by Laurynas and Bug #85059 by Roel Van de Paar.
  • Jon Olav Hauglid seems to care about the new data dictionary code quality, so he had reported related public bugs: Bug #85811, Bug #85800, and Bug #83473.
  • Bug #85704 - "mysql 8.0.x crashes when old-style trigger misses the "created" line in .TRG", was reported by Shane Bester. Workaround was also suggested by Jesper Krogh. Shane also reported this nice regression Bug #83019 - "queries in "show processlist" oscillate with constant times higher each day", that is now fixed in 8.0.2.
  • Bug #85614 - "alter table fails when default character set changes to utf8mb4", was reported by Tor Didriksen. He had also reported Bug #85224 - "Illegal mix of collations for time/varchar".
  • Bug #85518 - "Distinct operations on temp tables allocate too little memory for sort keys", was reported by Steinar Gunderson. He had also reported Bug #85487 - "num_tmp_files in filesort optimizer trace is nonsensical".
  • Bug #85179 - "Assert in sql/ virtual String* Field_varstring::val_str", was reported by Matthias Leich.
I skipped several bugs that are fixed also in older versions. Many of them were already discussed in my posts. I also skipped all build/compilation/packaging bugs for now.

To summarize, while total number of public bug reports fixed in MySQL 8.0.2 is notable, many of these bugs were reported by few Oracle engineers who are still brave enough to report bugs in public. From Community, it seems only mostly Percona and engineers do care to check MySQL 8.0.x at this early stage and report bugs. I am especially concerned with the number of private bug reports mentioned in the release notes of 8.0.2...

Saturday, August 12, 2017

More on Studying MySQL Hashes in gdb, and How P_S Code May Help

I have to get back to the topic of checking user variables in gdb to clarify few more details. In his comment Shane Bester kindly noted that calling functions defined in MySQL code is not going to work when core dump is studied. So, I ended up with a need to check what does the my_hash_element() function I've used really do, to be ready to repeat that step by step manually. Surely I could skip that and use Python and Shane himself did, but structures of HASH type are widely used in MySQL, so I'd better know how to investigate them manually than blindly use existing code.

Quick search with grep for my_hash_element shows:
[root@centos mysql-server]# grep -rn my_hash_element *
include/hash.h:94:uchar *my_hash_element(HASH *hash, ulong idx);
mysys/hash.c:734:uchar *my_hash_element(HASH *hash, ulong idx)plugin/keyring/      if(store_key_in_buffer(reinterpret_cast<const IKey *>(my_hash_element(keys_hash, i)),
plugin/version_token/  while ((token_obj= (version_token_st *) my_hash_element(&version_tokens_hash, i)))
plugin/version_token/    while ((token_obj= (version_token_st *) my_hash_element(&version_tokens_hash, i)))
sql/    TABLE_SHARE *share= (TABLE_SHARE *)my_hash_element(&table_def_cache, idx);
sql/        share= (TABLE_SHARE*) my_hash_element(&table_def_cache, idx);
sql/    udf_func *udf=(udf_func*) my_hash_element(&udf_hash,idx);
sql/      (Table_cache_element*) my_hash_element(&m_cache, idx);
sql/rpl_gtid.h:2123:        Node *node= (Node *)my_hash_element(hash, i);
sql/rpl_gtid.h:2274:          node= (Node *)my_hash_element(hash, node_index);
sql/    entry *e= (entry *)my_hash_element(&m_table_ids, i);
sql/    SLAVE_INFO* si = (SLAVE_INFO*) my_hash_element(&slave_list, i);
sql-common/client.c:3245:        LEX_STRING *attr= (LEX_STRING *) my_hash_element(attrs, idx);
storage/perfschema/    sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));storage/ndb/include/util/HashMap.hpp:155:    Entry* entry = (Entry*)my_hash_element(&m_hash, (ulong)i);
storage/ndb/include/util/HashMap.hpp:169:    Entry* entry = (Entry*)my_hash_element((HASH*)&m_hash, (ulong)i);
[root@centos mysql-server]#
That is, HASH structure is used everywhere in MySQL, from keyring to UDFs and table cache, to replication and NDB Cluster, with everything in between. If I can navigate to each HASH element and dump/print it, I can better understand a lot of code, if needed. If anyone cares, HASH is defined in a very simple way in include/hash.h:
typedef struct st_hash {
  size_t key_offset,key_length;         /* Length of key if const length */
  size_t blength;
  ulong records;
  uint flags;
  DYNAMIC_ARRAY array;                          /* Place for hash_keys */
  my_hash_get_key get_key;
  void (*free)(void *);
  CHARSET_INFO *charset;
  my_hash_function hash_function;
  PSI_memory_key m_psi_key;
It relies on DYNAMIC_ARRAY to store keys.

The code of the my_hash_element function in mysys/hash.c is very simple:
uchar *my_hash_element(HASH *hash, ulong idx)
  if (idx < hash->records)
    return dynamic_element(&hash->array,idx,HASH_LINK*)->data;
  return 0;
Quick search for dynamic_element shows that it's actually a macro:
[root@centos mysql-server]# grep -rn dynamic_element *
client/mysqldump.c:1608:    my_err= dynamic_element(&ignore_error, i, uint *);
extra/comp_err.c:471:      tmp= dynamic_element(&tmp_error->msg, i, struct message*);
extra/comp_err.c:692:    tmp= dynamic_element(&err->msg, i, struct message*);
extra/comp_err.c:803:  first= dynamic_element(&err->msg, 0, struct message*);
include/my_sys.h:769:#define dynamic_element(array,array_index,type) \mysys/hash.c:126:  HASH_LINK *data= dynamic_element(&hash->array, 0, HASH_LINK*);
that is defined in include/my_sys.h as follows:
#define dynamic_element(array,array_index,type) \
  ((type)((array)->buffer) +(array_index))
So, now it's clear what to do in gdb, having in mind what array do we use. Let me start the session, find a thread I am interested in and try to check elements one by one:
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fc3a037b700 (LWP 3061))]#0  0x00007fc3d2cb3383 in poll () from /lib64/
(gdb) p do_command::thd->m_thread_id
$1 = 5
(gdb) p do_command::thd->user_vars
$2 = {key_offset = 0, key_length = 0, blength = 4, records = 3, flags = 0,
  array = {buffer = 0x7fc3ba2b3560 "\377\377\377\377", elements = 3,
    max_element = 16, alloc_increment = 32, size_of_element = 16,
    m_psi_key = 38},
  get_key = 0xc63630 <get_var_key(user_var_entry*, size_t*, my_bool)>,
  free = 0xc636c0 <free_user_var(user_var_entry*)>, charset = 0x1ded740,
  hash_function = 0xeb6990 <cset_hash_sort_adapter>, m_psi_key = 38}
(gdb) set $uvars=&(do_command::thd->user_vars)
(gdb) p $uvars
$3 = (HASH *) 0x7fc3b9fad280
(gdb) p &($uvars->array)
$5 = (DYNAMIC_ARRAY *) 0x7fc3b9fad2a8
(gdb) p ((HASH_LINK*)((&($uvars->array))->buffer) + (0))
$6 = (HASH_LINK *) 0x7fc3ba2b3560
(gdb) p ((HASH_LINK*)((&($uvars->array))->buffer) + (0))->data
$7 = (uchar *) 0x7fc3b9fc80e0 "H\201\374\271\303\177"
(gdb) p (user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (0))->data)
$8 = (user_var_entry *) 0x7fc3b9fc80e0
(gdb) p *(user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (0))->data)
$9 = {static extra_size = 8, m_ptr = 0x7fc3b9fc8148 "bbb", m_length = 3,
  m_type = STRING_RESULT, m_owner = 0x7fc3b9fad000, m_catalog = {
    str = 0x100000000 <Address 0x100000000 out of bounds>,
    length = 416611827727}, entry_name = {m_str = 0x7fc3b9fc8150 "a",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 25,
  used_query_id = 25, unsigned_flag = false}
(gdb) p *(user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (2))->data)
$11 = {static extra_size = 8, m_ptr = 0x7fc3b9e6e220 "\002", m_length = 64,
  m_type = DECIMAL_RESULT, m_owner = 0x7fc3b9fad000, m_catalog = {str = 0x0,
    length = 0}, entry_name = {m_str = 0x7fc3b9fc8290 "c", m_length = 1},
  collation = {collation = 0x1ded740, derivation = DERIVATION_IMPLICIT,
    repertoire = 3}, update_query_id = 25, used_query_id = 25,
  unsigned_flag = false}
I tried to highlight important details above. With gdb variables it's a matter of proper type casts and dereferencing. In general, I was printing content of item (user variable in this case) with index N as *(user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (N))->data).

Now, back to printing the variables.  Let's see how this is done in performance_schema, in storage/perfschema/
     74   for (;;)
     75   {
     76     sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));
     77     if (sql_uvar == NULL)
     78       break;
     98     /* Copy VARIABLE_NAME */
     99     const char *name= sql_uvar->entry_name.ptr();
    100     size_t name_length= sql_uvar->entry_name.length();
    101     DBUG_ASSERT(name_length <= sizeof(pfs_uvar.m_name));
    102     pfs_uvar.m_name.make_row(name, name_length);
    104     /* Copy VARIABLE_VALUE */
    105     my_bool null_value;
    106     String *str_value;
    107     String str_buffer;
    108     uint decimals= 0;
    109     str_value= sql_uvar->val_str(& null_value, & str_buffer, decimals);    110     if (str_value != NULL)
    111     {
    112       pfs_uvar.m_value.make_row(str_value->ptr(), str_value->length());
    113     }
    114     else
    115     {
    116       pfs_uvar.m_value.make_row(NULL, 0);
    117     }
    119     index++;    120   }

So, there we check elements by index until there is no element with such index, and apply the val_str() function of the class. While debugging live server we can do the same, but if we care to see how it works step by step, here is the code from sql/
String *user_var_entry::val_str(my_bool *null_value, String *str,
                                uint decimals) const
  if ((*null_value= (m_ptr == 0)))
    return (String*) 0;

  switch (m_type) {
    str->set_real(*(double*) m_ptr, decimals, collation.collation);    break;
  case INT_RESULT:
    if (!unsigned_flag)
      str->set(*(longlong*) m_ptr, collation.collation);    else
      str->set(*(ulonglong*) m_ptr, collation.collation);    break;
    str_set_decimal((my_decimal *) m_ptr, str, collation.collation);
    if (str->copy(m_ptr, m_length, collation.collation))
      str= 0;                                   // EOM error
  case ROW_RESULT:
    DBUG_ASSERT(1);                             // Impossible
For INT_RESULT and REAL_RESULT it's all clear, and Shane did essentially the same in his Python code. For strings we have to copy proper items into a zero terminated string or use methods of String class if we debug on a live server to get the entire string data. For DECIMAL_RESULT I checked the implementation of str_set_decimal() that relies on decimal2string() eventually that... looks somewhat complicated (check yourself in strings/decimal.c). So, I'd better to just print my_decimal structure in gdb, for any practical purposes, instead of re-implementing this function in Python.

To summarize, HASH structure is widely used in MySQL and it is easy to dump any of these hashes in gdb, item by item, in the same way as, for example, Performance Schema in MySQL 5.7 does this of user variables. Getting string representation of my_decimal "manually" is complicated.

Monday, August 7, 2017

How to Find Values of User Variables With gdb

In his comment to my announcement of the previous post, Shane Bester kindly suggested to consider pretty printing the information about user variables from gdb. I tried to do this tonight, after a long working day, while working with the same Percona server 5.7.x on CentOS 6.9, and found several interesting details to share even before getting to the pretty printing part (that I'd surely try to avoid doing with Python anyway, as I am lazy and not in a mood to use that programming language for a decade already). So, I decided to share them in a separate post.

To begin with, I checked if actually using gdb for this task is needed for anyone else but those poor souls who has to study core dumps after the crash. This time I do not want Mark Leith to step in with a comment that I had better use <some cool P_S feature> (instead of attaching gdb to properly working production server), so I did my homework...

One can surely use the user_variables_by_thread table in the Performance Schema to get these details if server still works, P_S is enabled and it's possible to connect to it to run SQL queries. There is one minor problem (the type of user variable is missing there), that I was happy to report immediately as Bug #87341. By the way, I had found old feature request, Bug #20307, that I verified 11 years ago and that should be just properly closed because of this nice feature of current MySQL.

So, there is (small) reason to use gdb even before any crashes, and I did it immediately after settings values for few user variables in connection with id 16 with the following statements:
set @a := 'aaa', @b := 254, @c := sysdate();
set @e = 10.5;
I also want to show how one may use (relatively) new column in the threads table of Performance Schema while studying live server with gdb. This was also a hint from Mark Leith in a comment on my older post, to check THREAD_OS_ID column:
mysql> select * from threads\G
*************************** 27. row ***************************
          THREAD_ID: 42
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
   PROCESSLIST_HOST: localhost
               ROLE: NULL
            HISTORY: YES
       THREAD_OS_ID: 2224
My first naive attempt to check the details about user variables with gdb ended up as follows:
(gdb) thread 3
[Switching to thread 3 (Thread 0x7fc5997bc700 (LWP 2224))]#0  0x00007fc5cc299383 in poll () from /lib64/
(gdb) p do_command::thd->m_thread_id
$1 = 16
(gdb) p do_command::thd->user_vars
$2 = {key_offset = 0, key_length = 0, blength = 8, records = 4, flags = 0,
  array = {buffer = 0x7fc5aabf1560 "\001", elements = 4, max_element = 16,
    alloc_increment = 32, size_of_element = 16, m_psi_key = 38},
  get_key = 0xc63630 <get_var_key(user_var_entry*, size_t*, my_bool)>,
  free = 0xc636c0 <free_user_var(user_var_entry*)>, charset = 0x1ded740,
  hash_function = 0xeb6990 <cset_hash_sort_adapter>, m_psi_key = 38}
I see 4 elements for the thread I am interested in. Looks good. I also assume from the above that they are of user_var_entry type. So, given some buffer, I try to cast it to this type and check (that was stupid, especially further pointer increment attempt - if that were an array of structures imagine how long would it take to find some variable by name!):
(gdb) p do_command::thd->user_vars->array->buffer
$3 = (uchar *) 0x7fc5aabf1560 "\001"
(gdb) p (user_var_entry *)(do_command::thd->user_vars->array->buffer)
$4 = (user_var_entry *) 0x7fc5aabf1560
(gdb) p sizeof(user_var_entry)
$5 = 104
(gdb) set $vars = (user_var_entry *)(do_command::thd->user_vars->array->buffer)
(gdb) p $vars
$6 = (user_var_entry *) 0x7fc5aabf1560
(gdb) p *($vars)
$7 = {static extra_size = 8, m_ptr = 0x1 <Address 0x1 out of bounds>,
  m_length = 140486683628224, m_type = 4294967295
, m_owner = 0x7fc59adf90e0,
  m_catalog = {str = 0xffffffff <Address 0xffffffff out of bounds>,
    length = 140486683628064}, entry_name = {
    m_str = 0xffffffff <Address 0xffffffff out of bounds>,
    m_length = 140486683627904}, collation = {collation = 0x0,
    derivation = DERIVATION_EXPLICIT, repertoire = 0}, update_query_id = 0,
  used_query_id = 0, unsigned_flag = false}
(gdb) set $vars = $vars + 1
(gdb) p *($vars)
$8 = {static extra_size = 8, m_ptr = 0x0, m_length = 0,
  m_type = STRING_RESULT, m_owner = 0x0, m_catalog = {str = 0x0, length = 0},
  entry_name = {m_str = 0x0, m_length = 0}, collation = {collation = 0x0,
    derivation = DERIVATION_EXPLICIT, repertoire = 0}, update_query_id = 0,
  used_query_id = 0, unsigned_flag = false}
So, while eventually I've see something expected (STRING_RESULT as type) and pointer trick "worked", the content I get is obviously a bullshit - addresses are out of bounds or zero etc. This leads us nowhere at best.

I had to dig into the code to see how this structure is actually used. This is easy with grep:
[root@centos openxs]# grep -rn user_vars git/mysql-server/*
git/mysql-server/sql/  HASH *hash= & thd->user_vars;
git/mysql-server/sql/    entry= (user_var_entry*)my_hash_search(&thd->user_vars,...
git/mysql-server/storage/perfschema/    sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));
git/mysql-server/sql/  my_hash_init(&user_vars, system_charset_info, USER_VARS_HASH_SIZE, 0, 0,
git/mysql-server/sql/  my_hash_init(&user_vars, system_charset_info, USER_VARS_HASH_SIZE, 0, 0,
git/mysql-server/sql/    (user_var_entry*) my_hash_search(&m_thd->user_vars, (uchar*) name.str,
git/mysql-server/sql/sql_class.h:1576:  HASH    user_vars;                     // hash for user variables...
git/mysql-server/storage/perfschema/    sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));[root@centos openxs]#
So, user_vars is actually a HASH, and we see functions to find the item in this hash by name, my_hash_search(), and to get item by index, my_hash_element(). I've immediately proceed to use the latter, as I wanted to see all variables:
(gdb) thread 10
[Switching to thread 10 (Thread 0x7fc5997bc700 (LWP 2224))]#0  0x00007fc5cc299383 in poll () from /lib64/
(gdb) p do_command::thd->m_thread_id
$1 = 16
(gdb) p do_command::thd->user_vars$2 = {key_offset = 0, key_length = 0, blength = 8, records = 4, flags = 0,
  array = {buffer = 0x7fc5aabf1560 "\001", elements = 4, max_element = 16,
    alloc_increment = 32, size_of_element = 16, m_psi_key = 38},
  get_key = 0xc63630 <get_var_key(user_var_entry*, size_t*, my_bool)>,
  free = 0xc636c0 <free_user_var(user_var_entry*)>, charset = 0x1ded740,
  hash_function = 0xeb6990 <cset_hash_sort_adapter>, m_psi_key = 38}
(gdb) p my_hash_element(&(do_command::thd->user_vars), 1)
$3 = (uchar *) 0x7fc59adf90e0 "H\221ъ \305\177"
(gdb) set $uvar = (user_var_entry *)(my_hash_element(&(do_command::thd->user_vars), 1))
[New Thread 0x7fc59946f700 (LWP 2447)]
[Thread 0x7fc59973a700 (LWP 2396) exited]
[New Thread 0x7fc59932a700 (LWP 2448)]
(gdb) p $uvar
$4 = (user_var_entry *) 0x7fc59adf90e0
(gdb) p *$uvar
$5 = {static extra_size = 8, m_ptr = 0x7fc59adf9148 "aaa", m_length = 3,
  m_type = STRING_RESULT
, m_owner = 0x7fc59accc000, m_catalog = {
    str = 0x41272c275441434e <Address 0x41272c275441434e out of bounds>,
    length = 6075168230325441358}, entry_name = {m_str = 0x7fc59adf9150 "a",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 430,
  used_query_id = 426, unsigned_flag = false}
(gdb) set $uvar = (user_var_entry *)(my_hash_element(&(do_command::thd->user_vars), 3))
[Thread 0x7fc599636700 (LWP 2400) exited]
(gdb) p *$uvar
$7 = {static extra_size = 8, m_ptr = 0x7fc59adf91e8 "\376", m_length = 8,
  m_type = INT_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
    str = 0x655f79625f726573 <Address 0x655f79625f726573 out of bounds>,
    length = 7881702179129419126}, entry_name = {m_str = 0x7fc59adf91f0 "b",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 430,
  used_query_id = 426, unsigned_flag = false}
(gdb) set $uvar = (user_var_entry *)(my_hash_element(&(do_command::thd->user_vars), 4))
(gdb) p *$uvar
Cannot access memory at address 0x0
We see that this experimenting was performed on a server under some load, we see threads created and gone, we are able to get proper details by index, but should take extra case to check addresses before accessing them, and should note that items are numbered starting from zero, not 1 (there is no item 4, we get NULL pointer). If anyone cares, item with index 0 was this:
$8 = {static extra_size = 8, m_ptr = 0x7fc5b23b41c0 "\002", m_length = 64,
  m_type = DECIMAL_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
    str = 0x7665602e60616d65 <Address 0x7665602e60616d65 out of bounds>,
    length = 7593481698565975653}, entry_name = {m_str = 0x7fc59adf9330 "e",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 444,
  used_query_id = 444, unsigned_flag = false}
As this was a hash table the order of items is determined by hash, not by the order of assignments to variables.

Now, if I know the name of the variable I can access it in the same way as it's done in the server code (one fo examples found by grep):
    entry= (user_var_entry*)my_hash_search(&thd->user_vars,
Basically, we need to pass variable name and the length of this name:
(gdb) set $uvar=(user_var_entry *)(my_hash_search(&(do_command::thd->user_vars), "e", strlen("e")))
[Thread 0x7fc5995b4700 (LWP 2848) exited]
(gdb) p *$uvar
$1 = {static extra_size = 8, m_ptr = 0x7fc5b23b41c0 "\002", m_length = 64,
  m_type = DECIMAL_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
    str = 0x7665602e60616d65 <Address 0x7665602e60616d65 out of bounds>,
    length = 7593481698565975653}, entry_name = {m_str = 0x7fc59adf9330 "e",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 444,
  used_query_id = 444, unsigned_flag = false}

(gdb) p *((my_decimal *)$uvar->m_ptr)
$4 = {<st_decimal_t> = {intg = 2, frac = 1, len = 9, sign = 0 '\000',
    buf = 0x7fc5b23b41d8}, buffer = {10, 500000000, 0, 0, 0, 0, 257, 0,
I highlighted some parts that eventually should help to pretty print decimal values. With just strings it's also not that trivial:
$5 = {static extra_size = 8, m_ptr = 0x7fc59adf9148 "aaa", m_length = 3,
  m_type = STRING_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
    str = 0x41272c275441434e <Address 0x41272c275441434e out of bounds>,
    length = 6075168230325441358}, entry_name = {m_str = 0x7fc59adf9150 "a",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 430,
  used_query_id = 426, unsigned_flag = false}
(gdb) p *$uvar->m_ptr
$6 = 97 'a'
We can not simply print the pointer in a hope, as this is not a null-terminated string.

I am too tired to create a useful summary right now, but you can see that essentially it's easy (even if not trivial) to get all the details about the user variables defined in any session with gdb, both from working process or from core dump. Enjoy!

Saturday, August 5, 2017

How to Find Values of Session Variables With gdb

Usually I make notes while working on customer issues, reading the Slack channels or even scrolling across my Facebook news feed. Some of them are of "ToDo" kind, and this week while watching how my colleagues work on a very complicated crash/bug I noted that one of the problems they were discussing was how to find out from the core dump if the session behind the crashing thread had mrr=ON in the optimizer_switch.

I had already written more than once how to check MySQL threads one by one in gdb, depending on version and your real goal. For this post I decided to concentrate on checking the values of session variables and, specifically, individual switchable optimizations. This is actually easy, comparing to checking MySQL plugin variables...

I've used Percona 5.7.x and executed set session optimizer_switch='mrr=off'; in one of sessions. I checked threads one by one in gdb until I find the one I need, where thd variable is defined in some frame:
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f7171fbc700 (LWP 2186))]#0  0x00007f71a4be3383 in poll () from /lib64/
(gdb) p do_command::thd->m_thread_id
$1 = 7
Local session variables are kept in the field called variables in the THD structure:
(gdb) p do_command::thd->variables
$2 = {dynamic_variables_version = 61,
  dynamic_variables_ptr = 0x7f717359c820 "\001", dynamic_variables_head = 384,
  dynamic_variables_size = 392, dynamic_variables_allocs = 0x7f71735f9040,
  max_heap_table_size = 16777216, tmp_table_size = 16777216,
  long_query_time = 0, end_markers_in_json = 0 '\000',
  optimizer_switch = 523711, optimizer_trace = 0,
  optimizer_trace_features = 15, optimizer_trace_offset = -1,
  optimizer_trace_limit = 1, optimizer_trace_max_mem_size = 16384,
  sql_mode = 1075838976, option_bits = 2147748608,

(gdb) p do_command::thd->variables->optimizer_switch
$3 = 523711
(gdb) p global_system_variables->optimizer_switch
$9 = 523775
Now, this is surely cool, but how to interpret the value I see, that must be some bitmask? I've printed the value of global variable also (it's in global_system_variables), so from comparing the values I can assume mrr setting is controlled by bit 64 in this bitmask. But we have to check this in the code.

Quick search with grep helps to find out that individual switches are defined in the sql/sql_const.h header file as follows:
/* @@optimizer_switch flags. These must be in sync with optimizer_switch_typelib */
#define OPTIMIZER_SWITCH_INDEX_MERGE               (1ULL << 0)
/** If this is off, MRR is never used. */
#define OPTIMIZER_SWITCH_MRR                       (1ULL << 6)/**
   If OPTIMIZER_SWITCH_MRR is on and this is on, MRR is used depending on a
   cost-based choice ("automatic"). If OPTIMIZER_SWITCH_MRR is on and this is
   off, MRR is "forced" (i.e. used as long as the storage engine is capable of
   doing it).
#define OPTIMIZER_SWITCH_MRR_COST_BASED            (1ULL << 7)
#define OPTIMIZER_SWITCH_BNL                       (1ULL << 8)
#define OPTIMIZER_SWITCH_BKA                       (1ULL << 9)
#define OPTIMIZER_SWITCH_SEMIJOIN                  (1ULL << 11)
#define OPTIMIZER_SWITCH_LOOSE_SCAN                (1ULL << 12)
#define OPTIMIZER_SWITCH_FIRSTMATCH                (1ULL << 13)
#define OPTIMIZER_SWITCH_DUPSWEEDOUT               (1ULL << 14)
#define OPTIMIZER_SWITCH_DERIVED_MERGE             (1ULL << 18)
#define OPTIMIZER_SWITCH_LAST                      (1ULL << 19)
So, to check if at session level we have mrr set, we should do a bitwise AND with 1<<6:
(gdb) thread 7
[Switching to thread 7 (Thread 0x7f7171e77700 (LWP 2257))]#0  0x00007f71a4be3383 in poll () from /lib64/
(gdb) p do_command::thd->variables->optimizer_switch
$1 = 523775
(gdb) p do_command::thd->variables->optimizer_switch & (1 << 6)
$2 = 64
(gdb) p /t do_command::thd->variables->optimizer_switch
$6 = 1111111110111111111
Note that if you know bit positions for individual switches by heart printing with /t modifier may show you all bits set quickly.

Saturday, July 29, 2017

How to Find Processlist Thread id in gdb

I was involved in a discussion on some complex MySQL-related problem where we had to study backtraces of all threads in gdb (produced by the thread apply all bt command if you ever forgets this) in a hope to find out why MySQL hangs. In the process the question appeared on how to find the thread id for each thread to match it against previous collected outputs of SHOW PROCESSLIST. and SHOW ENGINE INNODB STATUS.

I assumed I know the answer, as I had to find this out recently enough for this blog post (and before that for the real customer case). The idea is simple. Find a frame where function has a parameter of THD * type (usually named thd), like this:
#10 0x0000000000cb47fe in do_command (thd=0x7f32512b7000)
    at /usr/src/debug/percona-server-5.7.18-15/percona-server-5.7.18-15/sql/
and check thread_id item of this structure.

In that my blog post it looked as simple as just referring to thd of do_command's frame without even checking much:
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f7f5ce02b00 (LWP 9232))]
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
238     ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S: No such file or directory.
(gdb) p do_command::thd->thread_id
$9 = 14
I prefer to double check my suggestions before making them, so I immediately tried this with my CentOS 6.9 VM running recent Percona Server 5.7.x by default since that times when I worked at Percona:
[root@centos ~]# gdb -p `pidof mysqld`
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-92.el6)
Loaded symbols for /usr/lib64/mysql/plugin/
0x00007f550ad35383 in poll () from /lib64/
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.el6_9.2.x86_64 jemalloc-3.6.0-1.el6.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64 libaio-0.3.107-10.el6.x86_64 libcom_err-1.41.12-23.el6.x86_64 libgcc-4.4.7-18.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 libstdc++-4.4.7-18.el6.x86_64 nss-softokn-freebl-3.14.3-23.3.el6_8.x86_64 numactl-2.0.9-2.el6.x86_64 openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f550d2b2820 (LWP 1978))]#0  0x00007f550ad35383 in poll () from /lib64/
(gdb) p do_command::thd->thread_id
No frame is currently executing in block do_command(THD*).
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f54d837b700 (LWP 2183))]#0  0x00007f550ad35383 in poll () from /lib64/
(gdb) p do_command::thd->thread_id
Cannot take address of method thread_id.
(gdb) call do_command::thd->thread_id()
Cannot evaluate function -- may be inlined
As you can see I started to check threads one by one and apply that good old trick. Thread 1 had no frame executing do_command(), but I did not gave up and proceeded to the next thread, as I knew I had at least one active connection (I checked the output of SHOW PROCESSLIST). There I had a surprise, no way to get thread_id of thd. I used tab completion, so I know that thread_id (variable or method) exists, but attempt to call it also failed as you can see.

This is a problem with using gdb-based "tricks" over the code that evolves/changes in time. Last time I used p do_command::thd->thread_id it was for MariaDB 10.1.x probably, and the item was there. But in MySQL 5.7 (and all forks based on it) there were many code changes, so we should be ready to changes in unexpected places.

I had not added more comments on finding thread id to that discussion, made a note to myself and then, later, decided to check the source code of MySQL 5.7 (I did not have Percona 5.7 one at hand, but they hardly differs in such basic details) to find out what had changed in the THD structure so that thread_id is not longer just a variable. I expect to see the structure defined in sql/sql_class.h from the past, but grep will help to find this out even if it's no longer the case:
[root@centos mysql-server]# grep -n "class THD" sql/*.h
sql/debug_sync.h:27:class THD;
sql/derror.h:24:class THD;
sql/event_data_objects.h:40:class THD;
sql/sql_class.h:1412:class THD :public MDL_context_owner,
sql/sql_class.h:4175:    raise_error() or raise_warning() methods provided by class THD.
sql/sql_cmd.h:25:class THD;
 I found the following there:
class THD :public MDL_context_owner,
           public Query_arena,
           public Open_tables_state
  my_thread_id  m_thread_id;public:
    Assign a value to m_thread_id by calling
  void set_new_thread_id();
  my_thread_id thread_id() const { return m_thread_id; }...
So, in MySQL 5.7 thread_id() is, indeed, a method that was inlined, and essentially it returns private m_thread_id item. Benefits of C++... I had highlighted Global_THD_manager singleton also as during my next gdb sessions I had found out that simple global list of threads is also gone and in 5.7 everything is done via that Global_THD_manager. This is a topic for some other post, though.

At least now I know what to do in gdb:
(gdb) thread 7[Switching to thread 7 (Thread 0x7f54d8236700 (LWP 2275))]#0  0x00007f550ad35383 in poll () from /lib64/
(gdb) p do_command::thd->m_thread_id
$1 = 86
(gdb) p do_command::thd->m_main_security_ctx
$3 = {m_user = {m_ptr = 0x7f5500fdaf90 "myuser", m_length = 6,
    m_charset = 0x1ded640, m_alloced_length = 8, m_is_alloced = true},
  m_host = {m_ptr = 0x7f54d98ab090 "localhost", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 16, m_is_alloced = true},
  m_ip = {m_ptr = 0x7f54f0eb0210 "", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 16, m_is_alloced = true},
  m_host_or_ip = {m_ptr = 0x7f54d98ab090 "localhost", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 0, m_is_alloced = false},
  m_external_user = {m_ptr = 0x15167ab "", m_length = 0,
    m_charset = 0x1ded640, m_alloced_length = 0, m_is_alloced = false},
  m_priv_user = "myuser", '\000' <repeats 89 times>, m_priv_user_length = 6,
  m_proxy_user = '\000' <repeats 161 times>, m_proxy_user_length = 0,
  m_priv_host = "localhost", '\000' <repeats 51 times>,
  m_priv_host_length = 9, m_master_access = 1589248, m_db_access = 0,
  m_password_expired = false}
So, I know that specific thread  7 was for a session with Id 86 in the output of SHOW PROCESSLIST, and (from m_main_security_ctx, also a new name for old things in 5.7) I know it was a session of myuser connecting locally.

To summarize, there were notable changes in MySQL 5.7 in THD structure and threads management-related code in general, so make sure to re-check your "old gdb tricks" when you start working with 5.7. Reading the code helps.

Unfortunately (for gdb beginners like me) a lot of C++ approaches were introduced, including singletons, iterators based on templates instead of simple double linked lists etc, so one has to work hard to adapt to these. I hope to discuss some of my further findings and new "C++ specific" and "MySQL 5.7 specific" approaches studying MySQL in gdb in my upcoming posts.

Sunday, July 23, 2017

Why Thread May Hang in "Waiting for table level lock" State - Part I

Last time I had to use gdb to check what's going on in MySQL server and found something useful with it to share in the blog post it was April 2017, and I miss this kind of experience already. So, today I decided to try it again in a hope to get some insights in cases when other tools may not apply or may not be as helpful as one could expect. Here is the long enough story, based on recent customer issue I worked on this week.
* * *
Had you seen anything like this output of SHOW PROCESSLIST statement:

Id User Host db Command Time State 
Info Progress
28 someuser01 xx.xx.xx.xx:39644 somedb001 Sleep 247  
NULL 0.000
29 someuser01 xx.xx.xx.yy:44100 somedb001 Query 276 
Waiting for table level lock DELETE FROM t1 WHERE (some_id = 'NNNNN') AND ... 0.000
33 someuser01 somedb001 Query 275 
Waiting for table level lock DELETE FROM t2 WHERE (some_id = 'XXXXX') 0.000
38 someuser01 xx.xx.xx.xx:57055 somedb001 Query 246 
Waiting for table level lock DELETE FROM t3 WHERE (some_id in (123456)) AND ...  0.000
recently? That is, many threads accessing InnoDB(!) tables and hanging in the "Waiting for table level lock" state for a long time without any obvious reason in the SHOW PROCESSLIST or SHOW ENGINE INNODB STATUS?

I've seen it this week, along with a statement from customer that the server is basically unusable. Unlike in many other cases I had not found any obvious problem from the outputs mentioned above (other than high concurrency and normal InnoDB row lock waits etc for other sessions). There were no unexpected table level locks reported by InnoDB, and the longest running transactions were those in that waiting state.

Unfortunately fine MySQL manual is not very helpful when describing this thread state:
"Waiting for lock_type lock
The server is waiting to acquire a THR_LOCK lock or a lock from the metadata locking subsystem, where lock_type indicates the type of lock.
This state indicates a wait for a THR_LOCK:
  • Waiting for table level lock
These states indicate a wait for a metadata lock:
  • Waiting for event metadata lock
  • Waiting for global read lock
What that "THR_LOCK" should tell the reader is beyond me (and probably most of MySQL users). Fortunately, I know from experience that this state usually means that thread is trying to access some MyISAM table. The tables mentioned in currently running statements were all InnoDB, though. Server had performance_schema enabled, but only with default instrumentation and consumers defined, so I had no chance to get anything related to metadata locks or much hope to see all previous statement executed by each thread anyway. I surely insisted on checking the source code etc, but this rarely gives immediate results.

So, to check my assumption and try to prove MyISAM tables are involved, and get some hint what sessions they were locked by, I suggested to run mysqladmin debug, the command that is rarely used these days, with the output that Oracle explicitly refused to document properly (see my Bug #71346)! I do know what to expect there (from the days when MyISAM was widely used), and when in doubts I can always check the source code.

The mysqladmin debug command outputs a lot of information into the error log. Among other things I found the following section:

Thread database.table_name          Locked/Waiting        Lock_type

928     somedb001.somewhat_seq      Waiting - write       High priority write lock
12940   somedb001.somewhat_seq      Locked - write        High priority write lock
where somedb001.somewhat_seq surely was the MyISAM table. There were actually many tables mentioned as locked, with waits on them, and names ending with _seq suffix. One can now check what MySQL thread with id=12940 is doing at all levels, where it comes from, should it be killed etc.

Now it became more clear what's going on. Probably application developers tried to implement sequences with MyISAM tables and, probably, triggers to use them when data are not provided for the columns! The idea is quite popular and described in many blog posts. Check this old blog post by Peter Zaitsev, for example. This assumption was confirmed, and essentially they did something like I did in the following deliberately primitive example:
-- this is our sequence table
mysql> show create table misam\G
*************************** 1. row ***************************
       Table: misam
Create Table: CREATE TABLE `misam` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
1 row in set (0.00 sec)

mysql> create table inno like misam;
Query OK, 0 rows affected (0.30 sec)

mysql> alter table inno engine=InnoDB;
Query OK, 0 rows affected (2.73 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table inno\G
*************************** 1. row ***************************
       Table: inno
Create Table: CREATE TABLE `inno` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
1 row in set (0.03 sec)

mysql> select * from misam;
Empty set (0.00 sec)

-- let's set up the first value for the sequence
mysql> insert into misam values(1);
Query OK, 1 row affected (0.05 sec)

-- now let's create trigger to insert the value from sequencemysql> delimiter //
mysql> create trigger tbi before insert on inno
    -> for each row
    -> begin
    -> if ifnull(,0) = 0 then
    ->   update misam set id=last_insert_id(id+1);
    ->   set = last_insert_id();
    -> end if;
    -> end
    -> //

Query OK, 0 rows affected (0.25 sec)

mysql> delimiter ;
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into inno values(0);
Query OK, 1 row affected (0.20 sec)

mysql> insert into inno values(0);
Query OK, 1 row affected (0.10 sec)

mysql> select * from inno;
| id |
|  2 |
|  3 |
2 rows in set (0.00 sec)

mysql> select * from misam;
| id |
|  3 |
1 row in set (0.00 sec)
So, this is how it is supposed to work, and, you know, it works from two concurrent sessions without any locking noticed etc.

This is the end of the real story, for now. We have a lot of things to discuss there with customer, including how to find out previous commands executed in each sessions, what useful details we can get from the performance_schema etc. The root cause was clear, but we still have to find out why the sessions are spending so much time holding the blocking locks on MyISAM tables and what we can do about that, or with architecture based on such implementation of sequences (that I kindly ask you to avoid whenever possible! Use auto_increment values, please, if you care about concurrency.) I expect more blog posts eventually inspired by that real story, but now it's time to move to more generic technical details.
* * *
I have two problems with this story if I try to approach it in a more generic way.

First, do we have any other way to see MyISAM table level locks, besides mysqladmin debug command? (Oracle kindly plans to replace with something, let me quote Shane Bester's comment the bug that originates from hist internal feature request: "We need to replace it with appropriate performance_schema or information_schema tables.")

Second, how exactly to reproduce the situation customer reported? My initial attempts to get such a status with the trigger in place failed, I was not able to capture threads spending any notable time in "Waiting for table level lock" state even with concurrent single row inserts and the trigger above in place. I was thinking about explicit LOCK TABLES misam WRITE etc, but I doubt real code does use that. Fortunately, the following INSERT running from one session:
mysql> insert into inno values(sleep(100));
Query OK, 1 row affected (1 min 40.20 sec)
allowed me to run INSERT in the other session that got blocked:
mysql> insert into inno values(0);
Query OK, 1 row affected (1 min 36.36 sec)
and while it was blocked I've seen the thread state I wanted in the output of the SHOW PROCESSLIST:
mysql> show processlist;+----+------+-----------+------+---------+------+------------------------------+-------------------------------------+-----------+---------------+
| Id | User | Host      | db   | Command | Time | State                        | Info                                | Rows_sent | Rows_examined |
| 16 | root | localhost | test | Query   |   17 | User sleep                   | insert into inno values(sleep(100)) |         0 |             0 |
| 17 | root | localhost | test | Query   |   13 | Waiting for table level lock | insert into inno values(0)          |         0 |             0 |
| 22 | root | localhost | test | Query   |    0 | starting                     | show processlist                    |         0 |             0 |
3 rows in set (0.00 sec)
So, I know how exactly to reproduce the problem and what it can be caused by. Any slow running single INSERT statement (caused by something complex executed in trigger, some slow function used in the list of values inserted, or, maybe just INSERT ... SELECT ... from the big table) will give us the desired thread state.

Coming back to the first generic problem mentioned above, is there any way besides running mysqladmin debug and checking the output vs processlist etc, to identify the thread that holds MyISAM table level locks? One should expect performance_schema to help with this, at least as of 5.7 (I've used Percona Server 5.7.18-15 on my Ubuntu 14.04 netbook for testing today, while the original problem was on MariaDB 10.2.x. Percona is for fun, while MariaDB is for real work...). Indeed, we have table_handles there, and until recently I ignored its existence (it's new in 5.7, and I am not even sure if MariaDB 10.2.x has it already).

So, I reproduced the problem again and got the following there:
mysql> select * from performance_schema.table_handles;
| TABLE       | test          | t           |       140392772351024 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | t1          |       140392772352560 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | ttime       |       140392772355632 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | misam       |       140392772358704 |              55 |             10 | WRITE         | WRITE EXTERNAL |
| TABLE       | test          | misam       |       140392896981552 |              54 |             10 | WRITE         | WRITE EXTERNAL |
| TABLE       | test          | inno        |       140392772361776 |              55 |             10 | NULL          | WRITE EXTERNAL |
| TABLE       | test          | inno        |       140392896983088 |              54 |             10 | NULL          | WRITE EXTERNAL |
| TABLE       | test          | inno        |       140392826836016 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | misam       |       140392826837552 |               0 |              0 | NULL          | NULL           |
9 rows in set (0.00 sec)
What I am supposed to do with the above to find out the MySQL thread id of the blocking session? I am supposed to join to performance_schema.threads table, looking for thread_id value that is the same as owner_thread_id above, 54 and 55. I get the following, note that processlist_id is what you are looking for in the processlist:
mysql> select * from performance_schema.threads where thread_id in (54, 55)\G
*************************** 1. row ***************************
          THREAD_ID: 54
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
   PROCESSLIST_HOST: localhost
               ROLE: NULL
            HISTORY: YES
       THREAD_OS_ID: 32252
*************************** 2. row ***************************
          THREAD_ID: 55
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
   PROCESSLIST_HOST: localhost
               ROLE: NULL
            HISTORY: YES
       THREAD_OS_ID: 27844
2 rows in set (0.10 sec)
I do not see a way to distinguish lock wait from actually holding the lock. All I know at the moment is that there is engine-level lock on misam table (and inno table, for that matter). If the MySQL thread id is NOT for the thread that is "Waiting for table level lock", then it must be the thread that holds the lock! Some smart join with proper WHERE clause would let me to find out what I need directly. Maybe in one of the next parts I'll even present it, but writing it from the top of my head in critical situation is above my current skills related to performance_schema.
* * *
Now, what those of us should do who do not have performance_schema enabled, or have to use version before 5.7? Or those with access to gdb and spare 5 minutes?

Surely we should attach gdb to the mysqld process and, if in doubts, read some parts of the source code to know where to start. I started with the following fragments of code that were easy to find (as long as you know that COM_DEBUG command is actually sent by mysqladmin debug to server):
openxs@ao756:~/git/percona-server$ grep -rni com_debug *
include/mysql/plugin_audit.h.pp:160:  COM_DEBUG,
include/mysql.h.pp:47:  COM_DEBUG,
include/my_command.h:39:  COM_DEBUG,
libmysql/libmysql.c:899:  DBUG_RETURN(simple_command(mysql,COM_DEBUG,0,0,0));
rapid/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task_debug.h:62:  "[XCOM_DEBUG] ",
sql/  case COM_DEBUG:
# from the line highlighted above...  case COM_DEBUG:
    if (check_global_access(thd, SUPER_ACL))
      break;                                    /* purecov: inspected */
    mysql_print_status();    query_logger.general_log_print(thd, command, NullS);

openxs@ao756:~/git/percona-server$ grep -rni mysql_print_status *
sql/sql_test.h:40:void mysql_print_status();
sql/ "sql_test.h"         // mysql_print_status
sql/    mysql_print_status();
sql/ mysql_print_status()sql/ "sql_test.h"     // mysql_print_status
sql/        mysql_print_status();   // Print some debug info

# from the line highlighted above...

void mysql_print_status()
  char current_dir[FN_REFLEN];
  STATUS_VAR current_global_status_var;

  printf("\nStatus information:\n\n");
  (void) my_getwd(current_dir, sizeof(current_dir),MYF(0));
  printf("Current dir: %s\n", current_dir);
  printf("Running threads: %u  Stack size: %ld\n",
         (long) my_thread_stack_size);
  thr_print_locks();                            // Write some debug info
  printf("\nTable status:\n\
# finally, let's find the code of display_table_locks...openxs@ao756:~/git/percona-server$ grep -rni display_table_locks *
sql/ void display_table_locks(void)sql/  display_table_locks();
sql/  { &key_memory_locked_thread_list, "display_table_locks", PSI_FLAG_THREAD},

static void display_table_locks(void)
  LIST *list;
  Saved_locks_array saved_table_locks(key_memory_locked_thread_list);
  saved_table_locks.reserve(table_cache_manager.cached_tables() + 20);

  for (list= thr_lock_thread_list; list; list= list_rest(list))
    THR_LOCK *lock=(THR_LOCK*) list->data;
    push_locks_into_array(&saved_table_locks, lock->, FALSE,
                          "Locked - write");
    push_locks_into_array(&saved_table_locks, lock->, TRUE,
                          "Waiting - write");
    push_locks_into_array(&saved_table_locks, lock->, FALSE,
                          "Locked - read");
    push_locks_into_array(&saved_table_locks, lock->, TRUE,
                          "Waiting - read");
That's enough to start. We'll have to finally study what THR_LOCK is. There is a global (double) linked lists of all them, thr_lock_thread_list, and this is where we can get details from, in a way similar (let's hope) to those server does while processing COM_DEBUG command for us. So, let's attach gdb and let's fun begin:
openxs@ao756:~/git/percona-server$ sudo gdb -p 23841
[sudo] password for openxs:
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Loaded symbols for /lib/x86_64-linux-gnu/
0x00007fafd93d9c5d in poll () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) set $list=(LIST *)thr_lock_thread_list
(gdb) set $lock=(THR_LOCK*) $list->data
(gdb) p *($lock)
$1 = {list = {prev = 0x0, next = 0x7fafd2fa4780, data = 0x7fafbd545c80},
  mutex = {m_mutex = {__data = {__lock = 0, __count = 0, __owner = 0,
        __nusers = 0, __kind = 3, __spins = 0, __elision = 0, __list = {
          __prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\003", '\000' <repeats 22 times>,
      __align = 0}, m_psi = 0x7fafd1f97c80}, read_wait = {data = 0x0,
    last = 0x7fafbd545cc8}, read = {data = 0x0, last = 0x7fafbd545cd8},
  write_wait = {data = 0x0, last = 0x7fafbd545ce8}, write = {
    data = 0x7fafbd69f188, last = 0x7fafbd69f190}
, write_lock_count = 0,
  read_no_write_count = 0, get_status = 0xf84910 <mi_get_status>,
  copy_status = 0xf84a70 <mi_copy_status>,
  update_status = 0xf84970 <mi_update_status>,
  restore_status = 0xf84a50 <mi_restore_status>,
  check_status = 0xf84a80 <mi_check_status>}
In the above we started from the first item in the list. If needed we can move on to the next with set $list=(LIST *)($list->next) etc. We then interpret $list->data as a (THD_LOCK*). There we can see read, read_wait, write and write_wait structures. One of them will have data item that is not 0x0. This way we can identify is it a lock or lock wait, and what kind of lock is it. In my case i was looking for a write lock, and here it is, the very first one. So, I continue:
(gdb) p *($lock)->
$2 = {owner = 0x7fafbd468d38, next = 0x0, prev = 0x7fafbd545cf8,
  lock = 0x7fafbd545c80, cond = 0x0, type = TL_WRITE,
  status_param = 0x7fafbd69ee20, debug_print_param = 0x7fafbd515020,
  m_psi = 0x7fafa4c06cc0}
We can check the owner field:
(gdb) p *($lock)->
$3 = {thread_id = 16, suspend = 0x7fafbd469370}
and thread_id there is what we were looking for, the MySQL thread id of the blocking thread. If we want to get the details about the table locks we can do this as follows, by studying debug_print_param in data:
(gdb) set $table=(TABLE *) &(*($lock)->>debug_print_param)
(gdb) p $table
$4 = (TABLE *) 0x7fafbd515020
(gdb) p *($table)
$5 = {s = 0x7fafbd4d7030, file = 0x7fafbd535230, next = 0x0, prev = 0x0,
  cache_next = 0x0, cache_prev = 0x7fafc4df05a0, in_use = 0x7fafbd468000,
  field = 0x7fafbd4d7440, hidden_field_count = 0, record = {
    0x7fafbd4d7430 "\377", 0x7fafbd4d7438 "\n"}, write_row_record = 0x0,
  insert_values = 0x0, covering_keys = {map = 0}, quick_keys = {map = 0},
  merge_keys = {map = 0}, possible_quick_keys = {map = 0},
  keys_in_use_for_query = {map = 0}, keys_in_use_for_group_by = {map = 0},
  keys_in_use_for_order_by = {map = 0}, key_info = 0x7fafbd4d7508,
  next_number_field = 0x0, found_next_number_field = 0x0, vfield = 0x0,
  hash_field = 0x0, fts_doc_id_field = 0x0, triggers = 0x0,
  pos_in_table_list = 0x7fafbd47a9b8, pos_in_locked_tables = 0x7fafbd4e5030,
  group = 0x0, alias = 0x7fafbbbad120 "misam",
  null_flags = 0x7fafbd4d7430 "\377", bitmap_init_value = 0x0, def_read_set = {
    bitmap = 0x7fafbd4d75a8, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75a8, mutex = 0x0}, def_write_set = {
    bitmap = 0x7fafbd4d75ac, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75ac, mutex = 0x0}, tmp_set = {
    bitmap = 0x7fafbd4d75b0, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75b0, mutex = 0x0}, cond_set = {
    bitmap = 0x7fafbd4d75b4, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75b4, mutex = 0x0},
  def_fields_set_during_insert = {bitmap = 0x7fafbd4d75b8, n_bits = 1,
    last_word_mask = 4294967294, last_word_ptr = 0x7fafbd4d75b8, mutex = 0x0},
---Type <return> to continue, or q <return> to quit---
  read_set = 0x7fafbd515128, write_set = 0x7fafbd515148,
  fields_set_during_insert = 0x7fafbd5151a8, query_id = 0, quick_rows = {
    0 <repeats 64 times>}, const_key_parts = {0 <repeats 64 times>},
  quick_key_parts = {0 <repeats 64 times>}, quick_n_ranges = {
    0 <repeats 64 times>}, quick_condition_rows = 0, lock_position = 0,
  lock_data_start = 0, lock_count = 1, temp_pool_slot = 0, db_stat = 39,
  current_lock = 1, nullable = 0 '\000', null_row = 0 '\000',
  status = 3 '\003', copy_blobs = 0 '\000', force_index = 0 '\000',
  force_index_order = 0 '\000', force_index_group = 0 '\000',
  distinct = 0 '\000', const_table = 0 '\000', no_rows = 0 '\000',
  key_read = 0 '\000', no_keyread = 0 '\000', locked_by_logger = 0 '\000',
  no_replicate = 0 '\000', locked_by_name = 0 '\000',
  fulltext_searched = 0 '\000', no_cache = 0 '\000',
  open_by_handler = 0 '\000', auto_increment_field_not_null = 0 '\000',
  insert_or_update = 0 '\000', alias_name_used = 0 '\000',
  get_fields_in_item_tree = 0 '\000', m_needs_reopen = 0 '\000',
  created = true, max_keys = 0, reginfo = {join_tab = 0x0, qep_tab = 0x0,
    lock_type = TL_WRITE, not_exists_optimize = false,
    impossible_range = false}, mem_root = {free = 0x7fafbd4d7420,
    used = 0x7fafbd535220, pre_alloc = 0x0, min_malloc = 32, block_size = 992,
    block_num = 6, first_block_usage = 0, max_capacity = 0,
    allocated_size = 2248, error_for_capacity_exceeded = 0 '\000',
    error_handler = 0xc4b240 <sql_alloc_error_handler()>, m_psi_key = 106},
---Type <return> to continue, or q <return> to quit---
  blob_storage = 0x0, grant = {grant_table = 0x0, version = 0,
    privilege = 536870911, m_internal = {m_schema_lookup_done = true,
      m_schema_access = 0x0, m_table_lookup_done = true,
      m_table_access = 0x0}}, sort = {filesort_buffer = {m_next_rec_ptr = 0x0,
      m_rawmem = 0x0, m_record_pointers = 0x0, m_sort_keys = 0x0,
      m_num_records = 0, m_record_length = 0, m_sort_length = 0,
      m_size_in_bytes = 0, m_idx = 0}, io_cache = 0x0, merge_chunks = {
      m_array = 0x0, m_size = 0}, addon_fields = 0x0,
    sorted_result_in_fsbuf = false, sorted_result = 0x0,
    sorted_result_end = 0x0, found_records = 0}, part_info = 0x0,
  all_partitions_pruned_away = false, mdl_ticket = 0x7fafbd4c4f10,
  m_cost_model = {m_cost_model_server = 0x0, m_se_cost_constants = 0x0,
    m_table = 0x0}, should_binlog_drop_if_temp_flag = false}
We already see table alias, but more details are in the s filed (of TABLE_SHARE * type):
(gdb) p $table->s$26 = (TABLE_SHARE *) 0x7fafbd4d7030
(gdb) p $table->s->table_cache_key
$27 = {str = 0x7fafbd4d73b8 "test", length = 11}
(gdb) p $table->s->db
$30 = {str = 0x7fafbd4d73b8 "test", length = 4}
(gdb) p $table->s->path
$31 = {str = 0x7fafbd4d73c8 "./test/misam", length = 12}
(gdb) p $table->s->table_name
$32 = {str = 0x7fafbd4d73bd "misam", length = 5}
# to remind you, this is how to get the thread id directly
(gdb) p (*($lock)->
$37 = 16
I skilled some of my tests and intermediate results. To summarize, for me it took only a bit more time to search the code and try commands in gdb than to find out that metadata_locks does not help and one has to use table_handles in performance_schema. Next time in gdb I'll do it instantly, and the number of key strokes is far less :)

Surely, for production environments I'll have to write, test and note somewhere proper queries to the performance_schema. For next attempts with gdb and, maybe, FOSDEM talks I'll have to read the code to remember key details about the data structures used above... More posts to come, stay tuned!