In his
comment to my
announcement of the
previous post,
Shane Bester kindly suggested to consider pretty printing the information about
user variables from
gdb. I tried to do this tonight, after a long working day, while working with the same Percona server 5.7.x on CentOS 6.9, and found several interesting details to share even before getting to the pretty printing part (that I'd surely try to avoid doing with Python anyway, as I am lazy and not in a mood to use that programming language for a decade already). So, I decided to share them in a separate post.
To begin with, I checked if actually using
gdb for this task is needed for anyone else but those poor souls who has to study core dumps after the crash. This time I do not want
Mark Leith to step in with a comment that I had better use
<some cool P_S feature> (instead of attaching
gdb to properly working production server), so I did my homework...
One can surely use the
user_variables_by_thread table in the Performance Schema to get these details if server still works, P_S is enabled and it's possible to connect to it to run SQL queries. There is one minor problem (the type of user variable is missing there), that I was happy to report immediately as
Bug #87341. By the way, I had found old feature request,
Bug #20307, that I verified 11 years ago and that should be just properly closed because of this nice feature of current MySQL.
So, there is (small) reason to use
gdb even before any crashes, and I did it immediately after settings values for few user variables in connection with id 16 with the following statements:
set @a := 'aaa', @b := 254, @c := sysdate();
set @e = 10.5;
I also want to show how one may use (relatively) new column in the
threads table of Performance Schema while studying live server with
gdb. This was also a hint from
Mark Leith in a comment on my older post, to check
THREAD_OS_ID column:
mysql> select * from threads\G
...
*************************** 27. row ***************************
THREAD_ID: 42
NAME: thread/sql/one_connection
TYPE: FOREGROUND
PROCESSLIST_ID: 16
PROCESSLIST_USER: root
PROCESSLIST_HOST: localhost
PROCESSLIST_DB: test
PROCESSLIST_COMMAND: Sleep
PROCESSLIST_TIME: 195
PROCESSLIST_STATE: NULL
PROCESSLIST_INFO: NULL
PARENT_THREAD_ID: NULL
ROLE: NULL
INSTRUMENTED: YES
HISTORY: YES
CONNECTION_TYPE: Socket
THREAD_OS_ID: 2224
...
My first naive attempt to check the details about user variables with
gdb ended up as follows:
(gdb) thread 3
[Switching to thread 3 (Thread 0x7fc5997bc700 (LWP 2224))]#0 0x00007fc5cc299383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->m_thread_id
$1 = 16
(gdb) p do_command::thd->user_vars
$2 = {key_offset = 0, key_length = 0, blength = 8, records = 4, flags = 0,
array = {buffer = 0x7fc5aabf1560 "\001", elements = 4, max_element = 16,
alloc_increment = 32, size_of_element = 16, m_psi_key = 38},
get_key = 0xc63630 <get_var_key(user_var_entry*, size_t*, my_bool)>,
free = 0xc636c0 <free_user_var(user_var_entry*)>, charset = 0x1ded740,
hash_function = 0xeb6990 <cset_hash_sort_adapter>, m_psi_key = 38}
I see 4 elements for the thread I am interested in. Looks good. I also assume from the above that they are of
user_var_entry type. So, given some
buffer, I try to cast it to this type and check (that was stupid, especially further pointer increment attempt - if that were an array of structures imagine how long would it take to find some variable by name!):
(gdb) p do_command::thd->user_vars->array->buffer
$3 = (uchar *) 0x7fc5aabf1560 "\001"
(gdb) p (user_var_entry *)(do_command::thd->user_vars->array->buffer)
$4 = (user_var_entry *) 0x7fc5aabf1560
(gdb) p sizeof(user_var_entry)
$5 = 104
(gdb) set $vars = (user_var_entry *)(do_command::thd->user_vars->array->buffer)
(gdb) p $vars
$6 = (user_var_entry *) 0x7fc5aabf1560
(gdb) p *($vars)
$7 = {static extra_size = 8, m_ptr = 0x1 <Address 0x1 out of bounds>,
m_length = 140486683628224, m_type = 4294967295, m_owner = 0x7fc59adf90e0,
m_catalog = {str = 0xffffffff <Address 0xffffffff out of bounds>,
length = 140486683628064}, entry_name = {
m_str = 0xffffffff <Address 0xffffffff out of bounds>,
m_length = 140486683627904}, collation = {collation = 0x0,
derivation = DERIVATION_EXPLICIT, repertoire = 0}, update_query_id = 0,
used_query_id = 0, unsigned_flag = false}
(gdb) set $vars = $vars + 1
(gdb) p *($vars)
$8 = {static extra_size = 8, m_ptr = 0x0, m_length = 0,
m_type = STRING_RESULT, m_owner = 0x0, m_catalog = {str = 0x0, length = 0},
entry_name = {m_str = 0x0, m_length = 0}, collation = {collation = 0x0,
derivation = DERIVATION_EXPLICIT, repertoire = 0}, update_query_id = 0,
used_query_id = 0, unsigned_flag = false}
...
So, while eventually I've see something expected (STRING_RESULT as type) and pointer trick "worked", the content I get is obviously a bullshit - addresses are out of bounds or zero etc. This leads us nowhere at best.
I had to dig into the code to see how this structure is actually used. This is easy with
grep:
[root@centos openxs]# grep -rn user_vars git/mysql-server/*
git/mysql-server/sql/item_func.cc:6017: HASH *hash= & thd->user_vars;
...
git/mysql-server/sql/sql_prepare.cc:2182: entry= (user_var_entry*)my_hash_search(&thd->user_vars,...
git/mysql-server/storage/perfschema/table_uvar_by_thread.cc:76: sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));
...
git/mysql-server/sql/sql_class.cc:1209: my_hash_init(&user_vars, system_charset_info, USER_VARS_HASH_SIZE, 0, 0,
git/mysql-server/sql/sql_class.cc:1650: my_hash_init(&user_vars, system_charset_info, USER_VARS_HASH_SIZE, 0, 0,
...
git/mysql-server/sql/rpl_binlog_sender.cc:654: (user_var_entry*) my_hash_search(&m_thd->user_vars, (uchar*) name.str,
...
git/mysql-server/sql/sql_class.h:1576: HASH user_vars; // hash for user variables...
git/mysql-server/storage/perfschema/table_uvar_by_thread.cc:76: sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));[root@centos openxs]#
So, user_vars is actually a
HASH, and we see functions to find the item in this hash by name,
my_hash_search(), and to get item by index,
my_hash_element(). I've immediately proceed to use the latter, as I wanted to see all variables:
...
(gdb) thread 10
[Switching to thread 10 (Thread 0x7fc5997bc700 (LWP 2224))]#0 0x00007fc5cc299383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->m_thread_id
$1 = 16
(gdb) p do_command::thd->user_vars$2 = {key_offset = 0, key_length = 0, blength = 8, records = 4, flags = 0,
array = {buffer = 0x7fc5aabf1560 "\001", elements = 4, max_element = 16,
alloc_increment = 32, size_of_element = 16, m_psi_key = 38},
get_key = 0xc63630 <get_var_key(user_var_entry*, size_t*, my_bool)>,
free = 0xc636c0 <free_user_var(user_var_entry*)>, charset = 0x1ded740,
hash_function = 0xeb6990 <cset_hash_sort_adapter>, m_psi_key = 38}
(gdb) p my_hash_element(&(do_command::thd->user_vars), 1)
$3 = (uchar *) 0x7fc59adf90e0 "H\221ъ \305\177"
(gdb) set $uvar = (user_var_entry *)(my_hash_element(&(do_command::thd->user_vars), 1))
[New Thread 0x7fc59946f700 (LWP 2447)]
[Thread 0x7fc59973a700 (LWP 2396) exited]
[New Thread 0x7fc59932a700 (LWP 2448)]
(gdb) p $uvar
$4 = (user_var_entry *) 0x7fc59adf90e0
(gdb) p *$uvar
$5 = {static extra_size = 8, m_ptr = 0x7fc59adf9148 "aaa", m_length = 3,
m_type = STRING_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
str = 0x41272c275441434e <Address 0x41272c275441434e out of bounds>,
length = 6075168230325441358}, entry_name = {m_str = 0x7fc59adf9150 "a",
m_length = 1}, collation = {collation = 0x1ded740,
derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 430,
used_query_id = 426, unsigned_flag = false}
...
(gdb) set $uvar = (user_var_entry *)(my_hash_element(&(do_command::thd->user_vars), 3))
[Thread 0x7fc599636700 (LWP 2400) exited]
(gdb) p *$uvar
$7 = {static extra_size = 8, m_ptr = 0x7fc59adf91e8 "\376", m_length = 8,
m_type = INT_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
str = 0x655f79625f726573 <Address 0x655f79625f726573 out of bounds>,
length = 7881702179129419126}, entry_name = {m_str = 0x7fc59adf91f0 "b",
m_length = 1}, collation = {collation = 0x1ded740,
derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 430,
used_query_id = 426, unsigned_flag = false}
(gdb) set $uvar = (user_var_entry *)(my_hash_element(&(do_command::thd->user_vars), 4))
(gdb) p *$uvar
Cannot access memory at address 0x0
We see that this experimenting was performed on a server under some load, we see threads created and gone, we are able to get proper details by index, but should take extra case to check addresses before accessing them, and should note that items are numbered starting from zero, not 1 (there is no item
4, we get NULL pointer). If anyone cares, item with index 0 was this:
$8 = {static extra_size = 8, m_ptr = 0x7fc5b23b41c0 "\002", m_length = 64,
m_type = DECIMAL_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
str = 0x7665602e60616d65 <Address 0x7665602e60616d65 out of bounds>,
length = 7593481698565975653}, entry_name = {m_str = 0x7fc59adf9330 "e",
m_length = 1}, collation = {collation = 0x1ded740,
derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 444,
used_query_id = 444, unsigned_flag = false}
As this was a hash table the order of items is determined by hash, not by the order of assignments to variables.
Now, if I know the name of the variable I can access it in the same way as it's done in the server code (one fo examples found by
grep):
entry= (user_var_entry*)my_hash_search(&thd->user_vars,
(uchar*)lex->prepared_stmt_code.str,
lex->prepared_stmt_code.length);
Basically, we need to pass variable name and the length of this name:
(gdb) set $uvar=(user_var_entry *)(my_hash_search(&(do_command::thd->user_vars), "e", strlen("e")))
[Thread 0x7fc5995b4700 (LWP 2848) exited]
(gdb) p *$uvar
$1 = {static extra_size = 8, m_ptr = 0x7fc5b23b41c0 "\002", m_length = 64,
m_type = DECIMAL_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
str = 0x7665602e60616d65 <Address 0x7665602e60616d65 out of bounds>,
length = 7593481698565975653}, entry_name = {m_str = 0x7fc59adf9330 "e",
m_length = 1}, collation = {collation = 0x1ded740,
derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 444,
used_query_id = 444, unsigned_flag = false}
...
(gdb) p *((my_decimal *)$uvar->m_ptr)
$4 = {<st_decimal_t> = {intg = 2, frac = 1, len = 9, sign = 0 '\000',
buf = 0x7fc5b23b41d8}, buffer = {10, 500000000, 0, 0, 0, 0, 257, 0,
-1697445128}}
I highlighted some parts that eventually should help to pretty print decimal values. With just strings it's also not that trivial:
...
$5 = {static extra_size = 8, m_ptr = 0x7fc59adf9148 "aaa", m_length = 3,
m_type = STRING_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
str = 0x41272c275441434e <Address 0x41272c275441434e out of bounds>,
length = 6075168230325441358}, entry_name = {m_str = 0x7fc59adf9150 "a",
m_length = 1}, collation = {collation = 0x1ded740,
derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 430,
used_query_id = 426, unsigned_flag = false}
(gdb) p *$uvar->m_ptr
$6 = 97 'a'
We can not simply print the pointer in a hope, as this is not a null-terminated string.
I am too tired to create a useful summary right now, but you can see that essentially it's easy (even if not trivial) to get all the details about the user variables defined in any session with
gdb, both from working process or from core dump. Enjoy!