Sunday, July 12, 2020

Dynamic Tracing of C++ Class Member Functions with perf

I've started to write posts about dynamic tracing on Linux last year, and basic examples presented here and there make it look like in general with perf it's already easy to add dynamic probe for entry and return for any function, or even for every other line  of code inside the function (that perf probe -x <path to binary> --line <function name> shows). Numerous examples of adding probes to do_command(), dispatch_command() in MySQL or MariaDB code (or even malloc() provided by the libraries) etc kind of illustrated if not proved that.

It turned out that when one tries to use this method in a more generic case against MySQL or MariaDB code things may get way more complicated. The reason is that the code these days is mostly C++ and one often has to trace class member functions, not just plain global C functions. The problem can be easily demonstrated this way:

openxs@ao756:~/dbs/maria10.3$ ps aux | grep dbs
openxs   30377  0.8  3.2 1849476 125008 pts/18 Sl   14:03   0:00 /home/openxs/dbs/maria10.3/bin/mysqld --no-defaults --basedir=/home/openxs/dbs/maria10.3 --datadir=/home/openxs/dbs/maria10.3/data --plugin-dir=/home/openxs/dbs/maria10.3/lib/plugin --log-error=/home/openxs/dbs/maria10.3/data/ao756.err --pid-file=ao756.pid --socket=/tmp/mariadb.sock --port=3309
...
openxs@ao756:~$ perf version
perf version 4.4.219

So, I have a nice MariaDB 10.3.x server built from GitHub source as usual, up and running on my netbook/"bedroom test server" with Ubuntu 16.04.6 LTS (GNU/Linux 4.4.0-179-generic x86_64). Let's assume I want to add a probe to some method that I clearly see and can use in gdb, like this:
openxs@ao756:~/dbs/maria10.3$ sudo gdb -p 30377
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
...
[New LWP 30447]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fd8b1b1f80d in poll () at ../sysdeps/unix/syscall-template.S:84
84      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) b ha_heap::records_in_range
Breakpoint 1 at 0x55f7d3aeccf0: file /home/openxs/git/server/storage/heap/ha_heap.cc, line 586.
(gdb)
So, gdb kindly understands method names and one may expect the same from perf. Now if I quit gdb and try to add probe:
openxs@ao756:~/dbs/maria10.3$ perf probe -x /home/openxs/dbs/maria10.3/bin/mysqld ha_heap::records_in_range
Semantic error :There is non-digit char in line number.
  Error: Command Parse Error.
openxs@ao756:~/dbs/maria10.3$
The reason is that perf probe syntax
PROBE SYNTAX
       Probe points are defined by following syntax.

           1) Define event based on function name
            [EVENT=]FUNC[@SRC][:RLN|+OFFS|%return|;PTN] [ARG ...]

           2) Define event based on source file with line number
            [EVENT=]SRC:ALN [ARG ...]

           3) Define event based on source file with lazy pattern
            [EVENT=]SRC;PTN [ARG ...]

...
uses the : character for line numbers, and this conflicts with the C++ scope syntax. It could probably infer that :: is not a line number, but nobody has written that yet, as of version 4.4.219.

Let's try to find out if there is still a way to add the probe. First, let's check if perf sees any similar functions, using the --funcs option:
openxs@ao756:~/dbs/maria10.3$ perf probe -x /home/openxs/dbs/maria10.3/bin/mysqld --funcs | grep records_in_range
ha_heap::records_in_range
ha_innobase::records_in_range
ha_maria::records_in_range
ha_myisam::records_in_range
ha_myisammrg::records_in_range
ha_partition::records_in_range
ha_seq::records_in_range
handler::records_in_range
hp_rb_records_in_range
maria_records_in_range
mi_records_in_range
myrg_records_in_range
openxs@ao756:~/dbs/maria10.3$ perf probe -x /home/openxs/dbs/maria10.3/bin/mysqld --funcs --no-demangle | grep records_in_range
hp_rb_records_in_range
maria_records_in_range
mi_records_in_range
myrg_records_in_range
openxs@ao756:~/dbs/maria10.3$
In theory --no-demangle means that demangling is disabled:
       --demangle
           Demangle application symbols. --no-demangle is also available for
           disabling demangling.
and I expected to see the original mangled name to refer to. But I see demangled name as one of functions I can use (while I can NOT), but do not see mangled name at all. This looks inconsistent and would be really unfortunate if true. I am sure the function exists and is visible. Quick search in Google gave a hint in this nice post, there is a filter that disables showing names starting with '_' by default:
--filter=FILTER
(Only for --vars and --funcs) Set filter. FILTER is a combination of glob pattern, see FILTER PATTERN for detail. Default FILTER is "!k???tab_* & !crc_*" for --vars, and "!_*" for --funcs. If several filters are specified, only the last filter is used.
So, let's try non-default filter:
openxs@ao756:~/dbs/maria10.3$ perf probe -x /home/openxs/dbs/maria10.3/bin/mysqld --funcs --no-demangle --filter '*' | grep records_in_range
_ZN11ha_innobase16records_in_rangeEjP12st_key_rangeS1_
_ZN12ha_myisammrg16records_in_rangeEjP12st_key_rangeS1_
_ZN12ha_partition16records_in_rangeEjP12st_key_rangeS1_
_ZN6ha_seq16records_in_rangeEjP12st_key_rangeS1_
_ZN7ha_heap16records_in_rangeEjP12st_key_rangeS1_
_ZN7handler16records_in_rangeEjP12st_key_rangeS1_
_ZN8ha_maria16records_in_rangeEjP12st_key_rangeS1_
_ZN9ha_myisam16records_in_rangeEjP12st_key_rangeS1_
hp_rb_records_in_range
maria_records_in_range
mi_records_in_range
myrg_records_in_range
openxs@ao756:~/dbs/maria10.3$
The mangles name we are looking for is _ZN7ha_heap16records_in_rangeEjP12st_key_rangeS1_. I'd expect it should work, as we had to use mangled names even for functions that are not members of any class with ftrace. So, let's try (I need sudo on Ubuntu for adding probe to work):
openxs@ao756:~/dbs/maria10.3$ sudo perf probe -x /home/openxs/dbs/maria10.3/bin/mysqld _ZN7ha_heap16records_in_rangeEjP12st_key_rangeS1_
[sudo] password for openxs:
Probe point '_ZN7ha_heap16records_in_rangeEjP12st_key_rangeS1_' not found.
  Error: Failed to add events.
openxs@ao756:~/dbs/maria10.3$ sudo perf probe --no-demangle -x /home/openxs/dbs/maria10.3/bin/mysqld _ZN7ha_heap16records_in_rangeEjP12st_key_rangeS1_
Added new event:
  probe_mysqld:_ZN7ha_heap16records_in_rangeEjP12st_key_rangeS1_ (on _ZN7ha_heap16records_in_rangeEjP12st_key_rangeS1_ in /home/openxs/dbs/maria10.3/bin/mysqld)

You can now use it in all perf tools, such as:

        perf record -e probe_mysqld:_ZN7ha_heap16records_in_rangeEjP12st_key_rangeS1_ -aR sleep 1

openxs@ao756:~/dbs/maria10.3$
As you can see, adding the mangled function name and --no-demangle option worked. I can add a probe at function exit to print the returned value as well:
openxs@ao756:~/dbs/maria10.3$ sudo perf probe --no-demangle -x /home/openxs/dbs/maria10.3/bin/mysqld --add ha_heap_records_in_range_ret='_ZN7ha_heap16records_in_rangeEjP12st_key_rangeS1_%return records=$retval:u32'
Added new event:
  probe_mysqld:ha_heap_records_in_range_ret (on _ZN7ha_heap16records_in_rangeEjP12st_key_rangeS1_%return in /home/openxs/dbs/maria10.3/bin/mysqld with records=$retval:u32)

You can now use it in all perf tools, such as:

        perf record -e probe_mysqld:ha_heap_records_in_range_ret -aR sleep 1

openxs@ao756:~/dbs/maria10.3$
As you can see I had to give an explicit another name to the exit probe and I tried to see the return value. I also had to find out what data type should be used for the return value (it's ulong or u32 in perms of perf/ftrace, see the source code). My perf is not smart enough to infer this.

Now if I try to record the probe:
openxs@ao756:~/dbs/maria10.3$ sudo perf record -e probe_mysqld:ha_heap_records_in_range_ret -aR sleep 1000
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.904 MB perf.data (2 samples) ]

openxs@ao756:~/dbs/maria10.3$
while doing something with MEMORY table in another session:
openxs@ao756:~/dbs/maria10.3$ bin/mysql -uroot --socket=/tmp/mariadb.sock test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.3.24-MariaDB Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [test]> show create table theap\G
*************************** 1. row ***************************
       Table: theap
Create Table: CREATE TABLE `theap` (
  `id` int(11) DEFAULT NULL,
  `c1` int(11) DEFAULT NULL,
  KEY `id` (`id`),
  KEY `c1` (`c1`) USING BTREE
) ENGINE=MEMORY DEFAULT CHARSET=latin1
1 row in set (0,000 sec)

MariaDB [test]> select * from theap;
+------+------+
| id   | c1   |
+------+------+
|    1 |    1 |
|    2 |    2 |
|    3 |    3 |
+------+------+
3 rows in set (0,001 sec)

MariaDB [test]> explain select * from theap where c1 between 2 and 3;
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | theap | range | c1            | c1   | 5       | NULL |    1 | Using where |
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0,001 sec)

MariaDB [test]> select * from theap where c1 between 2 and 3;
+------+------+
| id   | c1   |
+------+------+
|    2 |    2 |
|    3 |    3 |
+------+------+
2 rows in set (0,001 sec)
I can see that it worked:
openxs@ao756:~/dbs/maria10.3$ sudo perf script > /tmp/trace.txt
openxs@ao756:~/dbs/maria10.3$ cat /tmp/trace.txt
          mysqld 31547 [001] 326219.193101: probe_mysqld:ha_heap_records_in_range_ret: (55f7d3aeccf0 <- 55f7d35df8da) records=0x1
          mysqld 31547 [001] 326224.706375: probe_mysqld:ha_heap_records_in_range_ret: (55f7d3aeccf0 <- 55f7d35df8da) records=0x1

openxs@ao756:~/dbs/maria10.3$
It works as expected! You may want to find out why it returned the value we see (1 in hex) etc, but basically it works,

Never give up! I want Sam to be remembered for this attitude...
To summarize:
  1. Never give up on dynamic tracing, it works!
  2. With your version of perf you may have to use mangled names for C++ class member functions and --no-demangle option to define the probe.
  3. You may have to quote the probe while adding, if it has some special characters like ':'. Read the man perf-probe manual carefully also.
  4. You may have to study the source code to print the arguments or return values properly.
  5. There are other options to deal with C++ class memeber functions. You can refer to them by source file and line number, or even by address that you find in objdump etc.
  6. I hope one day tools like bpftrace will let us use the real, not mangled names of functions. I have to check on Fedora 31 if it's already the case.
  7. I feel myself cool when some experienced developer asks me something about perf and I can provide an answer, even if not immediately :)

2 comments:

  1. Hi Valerii,

    Very nice post, thanks!


    The return value of records_in_range is 64-bit in most versions:

    virtual ha_rows records_in_range(uint inx, const key_range *min_key,
    const key_range *max_key,
    page_range *res)

    (gdb) p sizeof(ha_rows)
    $2 = 8

    I guess the above script worked because the machine is little-endian.

    Other than that, everything worked for me on Ubuntu 18.04.4

    ReplyDelete
    Replies
    1. I think I tried u64 and had not seen what I've expected, and so moved on to quick check of source (maybe misinterpreted it). Maybe I made some mistake...

      Delete