Saturday, July 29, 2017

How to Find Processlist Thread id in gdb

I was involved in a discussion on some complex MySQL-related problem where we had to study backtraces of all threads in gdb (produced by the thread apply all bt command if you ever forgets this) in a hope to find out why MySQL hangs. In the process the question appeared on how to find the thread id for each thread to match it against previous collected outputs of SHOW PROCESSLIST. and SHOW ENGINE INNODB STATUS.

I assumed I know the answer, as I had to find this out recently enough for this blog post (and before that for the real customer case). The idea is simple. Find a frame where function has a parameter of THD * type (usually named thd), like this:
#10 0x0000000000cb47fe in do_command (thd=0x7f32512b7000)
    at /usr/src/debug/percona-server-5.7.18-15/percona-server-5.7.18-15/sql/sql_parse.cc:960
and check thread_id item of this structure.

In that my blog post it looked as simple as just referring to thd of do_command's frame without even checking much:
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f7f5ce02b00 (LWP 9232))]
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
238     ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S: No such file or directory.
(gdb) p do_command::thd->thread_id
$9 = 14
I prefer to double check my suggestions before making them, so I immediately tried this with my CentOS 6.9 VM running recent Percona Server 5.7.x by default since that times when I worked at Percona:
[root@centos ~]# gdb -p `pidof mysqld`
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-92.el6)
...
Loaded symbols for /usr/lib64/mysql/plugin/tokudb_backup.so
0x00007f550ad35383 in poll () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.el6_9.2.x86_64 jemalloc-3.6.0-1.el6.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64 libaio-0.3.107-10.el6.x86_64 libcom_err-1.41.12-23.el6.x86_64 libgcc-4.4.7-18.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 libstdc++-4.4.7-18.el6.x86_64 nss-softokn-freebl-3.14.3-23.3.el6_8.x86_64 numactl-2.0.9-2.el6.x86_64 openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f550d2b2820 (LWP 1978))]#0  0x00007f550ad35383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->thread_id
No frame is currently executing in block do_command(THD*).
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f54d837b700 (LWP 2183))]#0  0x00007f550ad35383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->thread_id
Cannot take address of method thread_id.
(gdb) call do_command::thd->thread_id()
Cannot evaluate function -- may be inlined
As you can see I started to check threads one by one and apply that good old trick. Thread 1 had no frame executing do_command(), but I did not gave up and proceeded to the next thread, as I knew I had at least one active connection (I checked the output of SHOW PROCESSLIST). There I had a surprise, no way to get thread_id of thd. I used tab completion, so I know that thread_id (variable or method) exists, but attempt to call it also failed as you can see.

This is a problem with using gdb-based "tricks" over the code that evolves/changes in time. Last time I used p do_command::thd->thread_id it was for MariaDB 10.1.x probably, and the item was there. But in MySQL 5.7 (and all forks based on it) there were many code changes, so we should be ready to changes in unexpected places.

I had not added more comments on finding thread id to that discussion, made a note to myself and then, later, decided to check the source code of MySQL 5.7 (I did not have Percona 5.7 one at hand, but they hardly differs in such basic details) to find out what had changed in the THD structure so that thread_id is not longer just a variable. I expect to see the structure defined in sql/sql_class.h from the past, but grep will help to find this out even if it's no longer the case:
[root@centos mysql-server]# grep -n "class THD" sql/*.h
sql/debug_sync.h:27:class THD;
sql/derror.h:24:class THD;
sql/event_data_objects.h:40:class THD;
...
sql/sql_class.h:1412:class THD :public MDL_context_owner,
sql/sql_class.h:4175:    raise_error() or raise_warning() methods provided by class THD.
sql/sql_cmd.h:25:class THD;
...
 I found the following there:
class THD :public MDL_context_owner,
           public Query_arena,
           public Open_tables_state
{
...
private:
  my_thread_id  m_thread_id;public:
...
  /**
    Assign a value to m_thread_id by calling
    Global_THD_manager::get_new_thread_id().
  */
  void set_new_thread_id();
  my_thread_id thread_id() const { return m_thread_id; }...
So, in MySQL 5.7 thread_id() is, indeed, a method that was inlined, and essentially it returns private m_thread_id item. Benefits of C++... I had highlighted Global_THD_manager singleton also as during my next gdb sessions I had found out that simple global list of threads is also gone and in 5.7 everything is done via that Global_THD_manager. This is a topic for some other post, though.

At least now I know what to do in gdb:
...
(gdb) thread 7[Switching to thread 7 (Thread 0x7f54d8236700 (LWP 2275))]#0  0x00007f550ad35383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->m_thread_id
$1 = 86
(gdb) p do_command::thd->m_main_security_ctx
$3 = {m_user = {m_ptr = 0x7f5500fdaf90 "myuser", m_length = 6,
    m_charset = 0x1ded640, m_alloced_length = 8, m_is_alloced = true},
  m_host = {m_ptr = 0x7f54d98ab090 "localhost", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 16, m_is_alloced = true},
  m_ip = {m_ptr = 0x7f54f0eb0210 "127.0.0.1", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 16, m_is_alloced = true},
  m_host_or_ip = {m_ptr = 0x7f54d98ab090 "localhost", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 0, m_is_alloced = false},
  m_external_user = {m_ptr = 0x15167ab "", m_length = 0,
    m_charset = 0x1ded640, m_alloced_length = 0, m_is_alloced = false},
  m_priv_user = "myuser", '\000' <repeats 89 times>, m_priv_user_length = 6,
  m_proxy_user = '\000' <repeats 161 times>, m_proxy_user_length = 0,
  m_priv_host = "localhost", '\000' <repeats 51 times>,
  m_priv_host_length = 9, m_master_access = 1589248, m_db_access = 0,
  m_password_expired = false}
...
So, I know that specific thread  7 was for a session with Id 86 in the output of SHOW PROCESSLIST, and (from m_main_security_ctx, also a new name for old things in 5.7) I know it was a session of myuser connecting locally.

To summarize, there were notable changes in MySQL 5.7 in THD structure and threads management-related code in general, so make sure to re-check your "old gdb tricks" when you start working with 5.7. Reading the code helps.

Unfortunately (for gdb beginners like me) a lot of C++ approaches were introduced, including singletons, iterators based on templates instead of simple double linked lists etc, so one has to work hard to adapt to these. I hope to discuss some of my further findings and new "C++ specific" and "MySQL 5.7 specific" approaches studying MySQL in gdb in my upcoming posts.

3 comments:

  1. Maybe start including getting this kind of data instead of just using a SHOW now:

    mysql> select name, thread_id, processlist_id, thread_os_id from performance_schema.threads;
    +----------------------------------------+-----------+----------------+--------------+
    | name | thread_id | processlist_id | thread_os_id |
    +----------------------------------------+-----------+----------------+--------------+
    | thread/sql/main | 1 | NULL | 7260 |
    | thread/sql/thread_timer_notifier | 2 | NULL | 4632 |
    | thread/innodb/io_ibuf_thread | 3 | NULL | 5904 |
    | thread/innodb/io_log_thread | 4 | NULL | 6724 |
    | thread/innodb/io_read_thread | 5 | NULL | 6352 |
    | thread/innodb/io_read_thread | 6 | NULL | 3800 |
    | thread/innodb/io_read_thread | 7 | NULL | 4636 |
    | thread/innodb/io_read_thread | 8 | NULL | 7924 |
    | thread/innodb/io_write_thread | 9 | NULL | 7680 |
    | thread/innodb/io_write_thread | 10 | NULL | 7444 |
    | thread/innodb/io_write_thread | 11 | NULL | 7764 |
    | thread/innodb/io_write_thread | 12 | NULL | 6356 |
    | thread/innodb/page_cleaner_thread | 13 | NULL | 7440 |
    | thread/innodb/srv_lock_timeout_thread | 16 | NULL | 8568 |
    | thread/innodb/srv_error_monitor_thread | 17 | NULL | 8572 |
    | thread/innodb/srv_monitor_thread | 18 | NULL | 8576 |
    | thread/innodb/srv_master_thread | 19 | NULL | 8580 |
    | thread/sql/background | 20 | NULL | 8584 |
    | thread/innodb/srv_purge_thread | 21 | NULL | 8584 |
    | thread/sql/background | 22 | NULL | 8588 |
    | thread/sql/background | 23 | NULL | 8596 |
    | thread/sql/background | 24 | NULL | 8592 |
    | thread/innodb/buf_dump_thread | 25 | NULL | 8600 |
    | thread/innodb/dict_stats_thread | 26 | NULL | 8604 |
    | thread/sql/compress_gtid_table | 27 | 1 | 8704 |
    | thread/sql/con_sockets | 28 | NULL | 8708 |
    | thread/sql/one_connection | 30 | 3 | 8864 |
    +----------------------------------------+-----------+----------------+--------------+
    27 rows in set (0.00 sec)

    ReplyDelete
    Replies
    1. Yes, it makes sense to use such a SELECT in case of MySQL 5.7 (I do not see thread_os_id in 5.6). Explanation in https://dev.mysql.com/doc/refman/5.7/en/threads-table.html is useful.

      For the post factum analysis of core dumps one still would need to know how to identify all the details about the MySQL thread for each OS thread she sees.

      If performance_schema was enabled, all these and more information should be in related memory buffers in that core dump. Now, I have an idea for some future post, if I'll be able to find it and access with gdb...

      Delete
  2. It is still important to know how to get the info in gdb also, for when we inspect core files ;-)

    ReplyDelete