Saturday, January 25, 2020

Dynamic Tracing of MariaDB Server With ftrace - Basic Example

Today I'd like to continue my series of blog posts about dynamic tracing of MySQL server (or Percona Server for MySQL, or MariaDB, or basically any other software, whatever you prefer) on Linux. This post ideally had to start it, as the approach discussed here is truly dynamic and suitable production servers under high load (unlike gdb tricks, most of the time), works more or less the same way starting from Linux 2.6.27, is "always there" at your fingertips (if you are root, but this is a requirement in a general case for all kinds of dynamic tracing), and does not even strictly require -debuginfo packages with symbolic information for basic usage. I mean ftrace.

To prove that this interface that appeared in 2008 is still usable, let me add the same kind of probe for tracing dispatch_command() function and printing SQL statement that somebody tries to execute (so we can avoid enabling general query log etc) with pure ftrace interface, without much explanations or background (that is presented below). In this case I use recent MariaDB Server 10.3.22 built from source on my Ubuntu 16.04 netbook (because I do not care about being shared in the feed of Planet MySQL anymore in any case).

You just have to do the following:
  1. Check that dynamic tracing with ftrace is possible:
    openxs@ao756:~$ mount | grep tracefs
    tracefs on /sys/kernel/debug/tracing type tracefs (rw,relatime)
    If you do not see tracefs mounted, mount it with the following command (as root):
    mount -t tracefs nodev /sys/kernel/tracing
  2. Find the address of (mangled) dispatch_command() function in the mysqld binary:
    openxs@ao756:~$ objdump -T /home/openxs/dbs/maria10.3/bin/mysqld | grep dispatch_command
    g    DF .text  000000000000236e  Base        _Z16dispatch_command19enum_server_commandP3THDPcjbb
    The address you need (in many cases) is the first word in the output, in hex. So, 0x0000000000587b90
  3. If you want to print values of parameters passed, you need to know how they are passed by the compiler used on the architecture used. Let me share the quote that applies to x86_64:
    "First six arguments are in rdi, rsi, rdx, rcx, r8d, r9d; remaining arguments are on the stack."
    In my case (see this post or details below, or check the source code yourself if in doubts) we need to print third argument, so it must be in the rdx (64-bit) register. For ftrace we have to use just %dx, trust me for now, I tried ;)
  4. Check probe syntax (see the picture below for quick reference) and come up with something like this to add a probe named "dc". Note that on Ubuntu 16.04 this had not worked via sudo directly, I had to become root via "sudo su -" (weird, something to check one day):
    root@ao756:~# echo 'p:dc /home/openxs/dbs/maria10.3/bin/mysqld:0x0000000000587b90 query=+0(%dx):string' > /sys/kernel/debug/tracing/uprobe_events
    All these happens while MariaDB server is running and procerssing queries, in a truly dynamic manner. There is no impact at all so far. Creators of MariaDB never added anything ot the code (or to the build process) to make this work. It's enough to have a binary that runs of Linux kernel that supports ftrace.
  5. Enable tracing for the probe added (this is when magic happens) and tracing in general:
    root@ao756:~# echo 1 > /sys/kernel/debug/tracing/events/uprobes/dc/enable
    root@ao756:~# echo 1 > /sys/kernel/debug/tracing/tracing_on
  6. Now assuming you run the following:
    root@ao756:~# mysql -uroot --socket=/tmp/mariadb.sock
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 92
    Server version: 5.5.5-10.3.22-MariaDB Source distribution

    Copyright (c) 2009-2019 Percona LLC and/or its affiliates
    Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

    mysql> select 1+1;
    | 1+1 |
    |   2 |
    1 row in set (0.00 sec)

    mysql> select version();
    | version()       |
    | 10.3.22-MariaDB |
    1 row in set (0.00 sec)
    you may get the trace as follows, for example:
    root@ao756:~# cat /sys/kernel/debug/tracing/trace
    # tracer: nop
    # entries-in-buffer/entries-written: 9851/9851   #P:2
    #                              _-----=> irqs-off
    #                             / _----=> need-resched
    #                            | / _---=> hardirq/softirq
    #                            || / _--=> preempt-depth
    #                            ||| /     delay
    #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
    #              | |       |   ||||       |         |
              python-21369 [001] d... 209044.826580: : sys_wr: 0000000000000008
    (I had show the top of the output to give hint about format), or:
    root@ao756:~# cat /sys/kernel/debug/tracing/trace_pipe
              mysqld-1082  [000] d... 273258.971401: dc: (0x560d8a20fb90) query="select @@version_comment limit 1"
              mysqld-1082  [000] d... 273263.187839: dc: (0x560d8a20fb90) query="select 1+1"
              mysqld-1082  [001] d... 273269.128542: dc: (0x560d8a20fb90) query="select version()"
  7. If you want to stop this basic tracing, run these:
    root@ao756:~# echo 0 > /sys/kernel/debug/tracing/events/uprobes/dc/enable
    root@ao756:~# echo 0 > /sys/kernel/debug/tracing/tracing_on
So, with 7 simple steps we can enable dynamic tracing and see SQL statements executed and time when this happened.
The magic behind ftrace and steps that I really had to perform when I tried to apply this approach for the very first time will be presented in the next posts. Stay tuned! 

1 comment:

  1. Beautiful tracing series so far. I really enjoying these (despite the post to comment lag time) and reading Brendan Greggs's 2019 BPF Performance Tools (e-)book that complements this quite well.