Monday, January 25, 2021

Checking User Threads and Temporary Tables With gdb in MariaDB 10.4+, Step By Step

There have been no posts about gdb tricks in this blog for a long time. This is surely unusual, but I had not done anything fancy with gdb for more than a year. Today I finally got a chance to find something new in the code and answer yet another question based on code review and some basic gdb commands.

The question was how to find out which temporary tables, if any, were created by some connection. There is no way to do this in MariaDB 10.4; see MDEV-12459 for some related plans (and I_S.INNODB_TEMP_TABLE_INFO of MySQL, which appeared in MariaDB for a short time only). My immediate answer was that this is surely stored somewhere in the THD structure and I just have to find it (and a way to work with that information) using code review and/or gdb.

The first step was easy. I know that THD is defined in sql/sql_class.h, and there I see:

class THD: public THD_count, /* this must be first */
           public Statement,
           /*
             This is to track items changed during execution of a prepared
             statement/stored procedure. It's created by
             nocheck_register_item_tree_change() in memory root of THD,
             and freed in rollback_item_tree_changes().
             For conventional execution it's always empty.
           */
           public Item_change_list,
           public MDL_context_owner,
           public Open_tables_state
...

Temporary tables surely must be somewhere in that Open_tables_state. In the same file we can find the following:

class Open_tables_state
{
public:
  /**
    As part of class THD, this member is set during execution
    of a prepared statement. When it is set, it is used
    by the locking subsystem to report a change in table metadata.
    When Open_tables_state part of THD is reset to open
    a system or INFORMATION_SCHEMA table, the member is cleared
    to avoid spurious ER_NEED_REPREPARE errors -- system and
    INFORMATION_SCHEMA tables are not subject to metadata version
    tracking.
    @sa check_and_update_table_version()
  */
  Reprepare_observer *m_reprepare_observer;

  /**
    List of regular tables in use by this thread. Contains temporary and
    base tables that were opened with @see open_tables().
  */
  TABLE *open_tables;

  /**
    A list of temporary tables used by this thread. This includes
    user-level temporary tables, created with CREATE TEMPORARY TABLE,
    and internal temporary tables, created, e.g., to resolve a SELECT,
    or for an intermediate table used in ALTER.
  */
  All_tmp_tables_list *temporary_tables;

...

With this information I am ready to dive into a gdb session. I have MariaDB 10.4.18 at hand and create a couple of temporary tables in the connection with id 9:

MariaDB [test]> select version(), connection_id(), current_user();
+-----------------+-----------------+------------------+
| version()       | connection_id() | current_user()   |
+-----------------+-----------------+------------------+
| 10.4.18-MariaDB |               9 | openxs@localhost |
+-----------------+-----------------+------------------+
1 row in set (0,000 sec)

MariaDB [test]> create temporary table mytemp(c1 int, c2 varchar(100));
Query OK, 0 rows affected (0,034 sec)

MariaDB [test]> create temporary table mytemp2(id int, c2 int) engine=MyISAM;
Query OK, 0 rows affected (0,001 sec)

MariaDB [test]> show processlist;
+----+-------------+-----------+------+---------+------+--------------------------+------------------+----------+
| Id | User        | Host      | db   | Command | Time | State                    | Info             | Progress |
+----+-------------+-----------+------+---------+------+--------------------------+------------------+----------+
|  3 | system user |           | NULL | Daemon  | NULL | InnoDB purge worker      | NULL             |    0.000 |
|  4 | system user |           | NULL | Daemon  | NULL | InnoDB purge worker      | NULL             |    0.000 |
|  1 | system user |           | NULL | Daemon  | NULL | InnoDB purge worker      | NULL             |    0.000 |
|  2 | system user |           | NULL | Daemon  | NULL | InnoDB purge coordinator | NULL             |    0.000 |
|  5 | system user |           | NULL | Daemon  | NULL | InnoDB shutdown handler  | NULL             |    0.000 |
|  9 | openxs      | localhost | test | Query   |    0 | Init                     | show processlist |    0.000 |
+----+-------------+-----------+------+---------+------+--------------------------+------------------+----------+
6 rows in set (0,000 sec)

Now I attach gdb and immediately try to check what's inside the temporary_tables field, using the do_command frame where thd is present:

openxs@ao756:~$ sudo gdb -p `pidof mysqld`
[sudo] password for openxs:
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 26620
[New LWP 26621]
... 28 more LWPs were here
[New LWP 26658]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f59fb69080d in poll () at ../sysdeps/unix/syscall-template.S:84
84      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) thread 31
[Switching to thread 31 (Thread 0x7f59dd6ca700 (LWP 26658))]
#0  0x00007f59fb69080d in poll () at ../sysdeps/unix/syscall-template.S:84
84      in ../sysdeps/unix/syscall-template.S
(gdb) p do_command::thd->thread_id
$1 = 9
(gdb) p do_command::thd->temporary_tables
$2 = (All_tmp_tables_list *) 0x5606b575cd28
(gdb) p *do_command::thd->temporary_tables
$3 = {<I_P_List_null_counter> = {<No data fields>}, <I_P_List_no_push_back<TMP_TABLE_SHARE>> = {<No data fields>}, m_first = 0x5606b60dfa38}
(gdb) p *do_command::thd->temporary_tables->m_first
$4 = {<TABLE_SHARE> = {table_category = TABLE_CATEGORY_TEMPORARY, name_hash = {
      key_offset = 0, key_length = 0, blength = 0, records = 0, flags = 0,
      array = {buffer = 0x0, elements = 0, max_element = 0,
        alloc_increment = 0, size_of_element = 0, malloc_flags = 0},
      get_key = 0x0, hash_function = 0x0, free = 0x0, charset = 0x0},
    mem_root = {free = 0x5606b60e3cf8, used = 0x5606b60e40e8, pre_alloc = 0x0,
      min_malloc = 32, block_size = 985, total_alloc = 2880, block_num = 6,
      first_block_usage = 0,
      error_handler = 0x5606b1e6bac0 <sql_alloc_error_handler()>,
      name = 0x5606b2538d63 "tmp_table_share"}, keynames = {count = 0,
      name = 0x0, type_names = 0x5606b60e3d78, type_lengths = 0x5606b60e3d94},
    fieldnames = {count = 2, name = 0x0, type_names = 0x5606b60e3d60,
      type_lengths = 0x5606b60e3d88}, intervals = 0x0, LOCK_ha_data = {
      m_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
          __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0,
            __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0},
      m_psi = 0x0}, LOCK_share = {m_mutex = {__data = {__lock = 0,
          __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0,
          __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
        __size = '\000' <repeats 39 times>, __align = 0}, m_psi = 0x0},
    tdc = 0x0, tabledef_version = {
      str = 0x5606b60e3d10 "~A\370\275_4\021К°Ё\364\267\342\023=\275",
      length = 16}, option_list = 0x0, option_struct = 0x0,
---Type <return> to continue, or q <return> to quit---field = 0x5606b60e3dQuit

You may be wondering why I jumped to Thread 31 immediately, and how I knew that it corresponds to the connection with thread_id 9, as I verified with a later print. It was not pure luck: I knew I was the only user and just jumped to the last thread in order of creation. There is a better way for the general case: navigating over a "list" of threads that must exist somewhere, as SHOW PROCESSLIST must have an easy way to get them all. We'll get back to that important task later in this post.

Now, in the temporary_tables->m_first field we have a table share, with a lot of details we may need. We can try to see some of those that were originally requested:

(gdb) p do_command::thd->temporary_tables->m_first->table_name
$5 = {str = 0x5606b60dfeef "mytemp2", length = 7}
(gdb) p do_command::thd->temporary_tables->m_first->table_name.str
$6 = 0x5606b60dfeef "mytemp2"
(gdb) p do_command::thd->temporary_tables->m_first->path
$7 = {str = 0x5606b60dfed8 "/tmp/#sql67fc_9_3", length = 17}
(gdb) p do_command::thd->temporary_tables->m_first->path.str
$8 = 0x5606b60dfed8 "/tmp/#sql67fc_9_3"
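
The path printed above can also be taken apart. The following is a minimal Python sketch, assuming (this is not stated in the post) that the file name has the layout "#sql<server pid>_<thread id>_<counter>" with all three parts in hexadecimal. In this session the assumption at least fits: 0x67fc is 26620, the pid gdb attached to, and 9 is the connection id. The helper name decode_tmp_path is made up for illustration.

```python
import os
import re

# Assumed layout of a temporary table file name:
# "#sql<pid hex>_<thread id hex>_<counter hex>" (an assumption, see above).
TMP_NAME = re.compile(r"^#sql([0-9a-f]+)_([0-9a-f]+)_([0-9a-f]+)$")


def decode_tmp_path(path):
    """Split a temporary table path into directory, pid, thread id and counter."""
    directory, name = os.path.split(path)
    match = TMP_NAME.match(name)
    if match is None:
        raise ValueError("not a recognized temporary table name: %r" % name)
    # Every numeric part is parsed as hexadecimal under the stated assumption.
    pid, thread_id, counter = (int(part, 16) for part in match.groups())
    return {"dir": directory, "pid": pid, "thread_id": thread_id, "counter": counter}


info = decode_tmp_path("/tmp/#sql67fc_9_3")
print(info)  # {'dir': '/tmp', 'pid': 26620, 'thread_id': 9, 'counter': 3}
```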

So, I can get as many details as are present in, or can be reached from, the TABLE_SHARE structure. I see them immediately for the last temporary table I've created in that session. But what about the others? There might be many of them. I expected some kind of linked list or array, but the type information presented above gave me no real hint. Where is the next or previous item? The type name hints at a list, but that's all:

(gdb) p *do_command::thd->temporary_tables
$3 = {<I_P_List_null_counter> = {<No data fields>}, <I_P_List_no_push_back<TMP_TABLE_SHARE>> = {<No data fields>}, m_first = 0x5606b60dfa38}

The type, <I_P_List_no_push_back<TMP_TABLE_SHARE>>, looks like some template class instantiated with TMP_TABLE_SHARE, and I can find the source code:

template <typename T> class I_P_List_no_push_back;


/**
   Intrusive parameterized list.
   Unlike I_List does not require its elements to be descendant of ilink
   class and therefore allows them to participate in several such lists
   simultaneously.
   Unlike List is doubly-linked list and thus supports efficient deletion
   of element without iterator.
   @param T  Type of elements which will belong to list.
   @param B  Class which via its methods specifies which members
             of T should be used for participating in this list.
             Here is typical layout of such class:
             struct B
             {
               static inline T **next_ptr(T *el)
               {
                 return &el->next;
               }
               static inline T ***prev_ptr(T *el)
               {
                 return &el->prev;
               }
             };
   @param C  Policy class specifying how counting of elements in the list
             should be done. Instance of this class is also used as a place
             where information about number of list elements is stored.
             @sa I_P_List_null_counter, I_P_List_counter
   @param I  Policy class specifying whether I_P_List should support
             efficient push_back() operation. Instance of this class
             is used as place where we store information to support
             this operation.
             @sa I_P_List_no_push_back, I_P_List_fast_push_back.
*/

template <typename T, typename B,
          typename C = I_P_List_null_counter,
          typename I = I_P_List_no_push_back<T> >
class I_P_List : public C, public I
{
  T *m_first;
...

but I got lost in all this C++ stuff. Luckily I asked in the Engineering channel and got a hint that "I" in the name means "Intrusive" and that the base type T is supposed to include pointers to the next and previous items. Moreover, in the case of TMP_TABLE_SHARE they are named tmp_next and tmp_prev. I had to read the entire structure, as next and prev had not worked for me...

With this hint it was easy to proceed:

(gdb) p do_command::thd->temporary_tables->m_first->tmp_next
$12 = (TMP_TABLE_SHARE *) 0x5606b60df558
(gdb) set $t = do_command::thd->temporary_tables->m_first
(gdb) p $t
$13 = (TMP_TABLE_SHARE *) 0x5606b60dfa38
(gdb) p $t->table_name.str
$14 = 0x5606b60dfeef "mytemp2"
(gdb) set $t = $t->tmp_next
(gdb) p $t
$15 = (TMP_TABLE_SHARE *) 0x5606b60df558
(gdb) p $t->table_name.str
$16 = 0x5606b60dfa0f "mytemp"
(gdb) set $t = $t->tmp_next
(gdb) p $t
$17 = (TMP_TABLE_SHARE *) 0x0

The idea is to iterate while $t is not zero, starting from temporary_tables->m_first. You can surely put it into a Python loop for automation. One day I'll do this too. For now I am happy to be able to list all temporary tables with all the details manually, with gdb commands.
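
The loop described above can be sketched in plain Python. This is only a model that runs outside gdb: FakeShare is a hypothetical stand-in for TMP_TABLE_SHARE that keeps just the table name and the tmp_next link, and walk is an illustrative helper, not anything from the MariaDB source or the gdb API.

```python
class FakeShare:
    """Hypothetical stand-in for TMP_TABLE_SHARE: the element carries its own link."""

    def __init__(self, table_name, tmp_next=None):
        self.table_name = table_name
        self.tmp_next = tmp_next


def walk(first, next_of, is_null=lambda node: node is None):
    """Yield every node of a null-terminated intrusive list, starting at first."""
    node = first
    while not is_null(node):
        yield node
        node = next_of(node)


# Mirror of the session above: mytemp2 comes first, mytemp follows, then NULL.
mytemp = FakeShare("mytemp")
mytemp2 = FakeShare("mytemp2", tmp_next=mytemp)

names = [share.table_name for share in walk(mytemp2, lambda s: s.tmp_next)]
print(names)  # ['mytemp2', 'mytemp']
```

Inside gdb's Python the same walk could probably be fed real values: first would come from gdb.parse_and_eval("do_command::thd->temporary_tables->m_first"), next_of would be lambda n: n["tmp_next"], and is_null would be lambda n: int(n) == 0. That part is untested here and needs the same thread and frame context as the manual session.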

The remaining question is: how to iterate over user threads in this MariaDB version? There is no global threads variable any more:

(gdb) p threads
No symbol "threads" in current context.

No surprise, this had changed in MySQL 5.7+ too.

Here I also used a hint from a way more experienced colleague, Sergei Golubchik. That's what we have now:

(gdb) p server_threads
$18 = {threads = {<base_ilist> = {first = 0x5606b60b6d28, last = {
        _vptr.ilink = 0x5606b2d03a38 <vtable for ilink+16>,
        prev = 0x7f59a80009b8, next = 0x0}}, <No data fields>}, lock = {
    m_rwlock = {__data = {__lock = 0, __nr_readers = 0, __readers_wakeup = 0,
        __writer_wakeup = 0, __nr_readers_queued = 0, __nr_writers_queued = 0,
        __writer = 0, __shared = 0, __rwelision = 0 '\000',
        __pad1 = "\000\000\000\000\000\000", __pad2 = 0, __flags = 0},
      __size = '\000' <repeats 55 times>, __align = 0}, m_psi = 0x0}}
(gdb) ptype server_threads
type = class THD_list {
  private:
    I_List<THD> threads;
    mysql_rwlock_t lock;

  public:
    void init();
    void destroy();
    void insert(THD *);
    void erase(THD *);
    int iterate<std::vector<unsigned long long> >(my_bool (*)(THD *,
    std::vector<unsigned long long> *), std::vector<unsigned long long> *);
}
(gdb) p server_threads.threads
$19 = {<base_ilist> = {first = 0x5606b60b6d28, last = {
      _vptr.ilink = 0x5606b2d03a38 <vtable for ilink+16>,
      prev = 0x7f59a80009b8, next = 0x0}}, <No data fields>}

From that I had to proceed myself. I already know what "I" means in these templates, so I expect to find the next pointer somewhere if I start from first:

(gdb) p server_threads.threads
$19 = {<base_ilist> = {first = 0x5606b60b6d28, last = {
      _vptr.ilink = 0x5606b2d03a38 <vtable for ilink+16>,
      prev = 0x7f59a80009b8, next = 0x0}}, <No data fields>}
(gdb) p server_threads.threads.first
$20 = (ilink *) 0x5606b60b6d28
(gdb) p *server_threads.threads.first
$21 = {_vptr.ilink = 0x5606b2d08f80 <vtable for THD+16>,
  prev = 0x5606b2ed1de0 <server_threads>, next = 0x7f59980009a8}
(gdb) set $thd = (THD *)server_threads.threads.first
(gdb) p $thd->thread_id
$22 = 9

This was the initialization part, now let's check some more and iterate:

(gdb) p $thd->proc_info
$23 = 0x5606b2521cd9 "Reset for next command"
(gdb) set $thd = (THD *)$thd->next
(gdb) p $thd->thread_id
$24 = 5
(gdb) p $thd->proc_info
$25 = 0x5606b267a145 "InnoDB shutdown handler"
(gdb) set $thd = (THD *)$thd->next
(gdb) p $thd->thread_id
$26 = 2
(gdb) p $thd->main_security_ctx.user
$27 = 0x0
(gdb) p $thd->proc_info
$28 = 0x5606b26a3da9 "InnoDB purge coordinator"
(gdb) set $thd = (THD *)$thd->next
(gdb) p $thd->thread_id
$29 = 1
(gdb) p $thd->proc_info
$30 = 0x5606b26a3e20 "InnoDB purge worker"
(gdb) set $thd = (THD *)$thd->next
(gdb) p $thd->thread_id
$31 = 4
(gdb) p $thd->proc_info
$32 = 0x5606b26a3e20 "InnoDB purge worker"
(gdb) set $thd = (THD *)$thd->next
(gdb) p $thd->thread_id
$33 = 3
(gdb) p $thd->proc_info
$34 = 0x5606b26a3e20 "InnoDB purge worker"
(gdb) set $thd = (THD *)$thd->next
(gdb) p $thd->thread_id
$35 = 1095216660735
(gdb) p $thd->proc_info
$36 = 0x0
(gdb) set $thd = (THD *)$thd->next
(gdb) p $thd
$37 = (THD *) 0x0

The idea of iteration is also clear: we move to $thd->next if it's not zero. What we see matches the SHOW PROCESSLIST output with the exception of the last thread, which has a zero proc_info. It is some "sentinel" that is not present in the PROCESSLIST. One day I'll figure out why it is so and automate checking all threads based on Python code of this kind, suggested by Shane Bester. Tonight I am just happy to document what I recently found, as all details related to gdb usage do change with time and as new versions are released.
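
The thread iteration can be modeled the same way. A hedged sketch follows: in the output above the last member of the list prints with "vtable for ilink" rather than "vtable for THD", so the final element reached by the manual loop may be the list's own embedded last link and not a real THD. That is an assumption, not something confirmed in this post. Under it, the loop should stop when it reaches that link instead of casting it to THD. All class and function names below are hypothetical stand-ins, and the thread ids and states are the ones printed in the session.

```python
class Link:
    """Hypothetical stand-in for ilink: only the forward pointer matters here."""

    def __init__(self):
        self.next = None


class FakeTHD(Link):
    """Hypothetical stand-in for THD with the two fields printed in the session."""

    def __init__(self, thread_id, proc_info):
        super().__init__()
        self.thread_id = thread_id
        self.proc_info = proc_info


class FakeThreadList:
    """Model of the list: a first pointer plus an embedded last sentinel link."""

    def __init__(self, thds):
        self.last = Link()          # sentinel, not a THD; its next stays None
        self.first = self.last
        for thd in reversed(thds):  # link the elements in the given order
            thd.next = self.first
            self.first = thd


def real_threads(thread_list):
    """Yield list elements, stopping at the sentinel instead of treating it as a THD."""
    node = thread_list.first
    while node is not thread_list.last:
        yield node
        node = node.next


def find_thread(thread_list, thread_id):
    """Return the element with the given thread_id, or None if there is none."""
    for thd in real_threads(thread_list):
        if thd.thread_id == thread_id:
            return thd
    return None


threads = FakeThreadList([
    FakeTHD(9, "Reset for next command"),
    FakeTHD(5, "InnoDB shutdown handler"),
    FakeTHD(2, "InnoDB purge coordinator"),
    FakeTHD(1, "InnoDB purge worker"),
    FakeTHD(4, "InnoDB purge worker"),
    FakeTHD(3, "InnoDB purge worker"),
])

ids = [thd.thread_id for thd in real_threads(threads)]
print(ids)  # [9, 5, 2, 1, 4, 3]
```

In a real gdb session the equivalent check would be to test $thd->next for zero before printing fields of $thd, so the element whose next is 0x0 is skipped.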

Free travel and digging into the code in gdb with a specific goal in mind: I miss these activities lately.

* * *

To summarize:

  1. It's relatively easy to find out all the details about every temporary table of any kind created in any MariaDB server user thread, in gdb.
  2. It's still fun to work on MariaDB, as you can promptly get help from developers no matter what crazy questions you may ask.
  3. Changes towards more modern C++ may initially make debugging in gdb more difficult for those unaware of the details of class implementation and design.
