Sunday, January 22, 2017

oprofile Basics for MySQL Profiling

In my previous post I've presented a list of bugs reported by famous MySQL developers, DBAs, users and support engineers who used oprofile to find the problem or root cause of the problem they studied. I assume that some of my readers would like to follow their steps and try to use this tool while working on MySQL performance problems. Bug reports I mentioned contain outputs and, rarely, just some commands used. That's why I decided to write a kind of a basic "Howto" document. The tool does have a detailed enough manual, but few basic steps for quick use of oprofile to study MySQL-related problems should still help. At least I had always considered this old post by Oli a useful reference for myself and Googled for it more than once.

These days users of oprofile may need an updated quick reference or Howto guide, as starting from version 0.9.8 the tool started to change, new commands were introduced and, starting from version 1.0.0, familiar commands like opcontrol are not only deprecated, but removed from the tool (and it means that pt-stalk's --collect-oprofile option just does not work any more). For this post I decided to use version 0.9.9 on my good old Ubuntu 14.04 netbook. This should, theoretically, allow me to demonstrate both "old" and "new" way to use the tool.

So, I am using the following system for the purposes of this blog post:
openxs@ao756:~/dbs/maria10.1$ uname -a
Linux ao756 3.13.0-100-generic #147-Ubuntu SMP Tue Oct 18 16:48:51 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
openxs@ao756:~/dbs/maria10.1$ cat /etc/issueUbuntu 14.04.5 LTS \n \l

openxs@ao756:~/dbs/maria10.1$ dpkg -l | grep -i oprofile
ii  oprofile                                              0.9.9-0ubuntu8                                      amd64        system-wide profiler for Linux systems
openxs@ao756:~/dbs/maria10.1$ bin/mysql -uroot -e"show variables like '%version%'"
| Variable_name           | Value            |
| innodb_version          | 5.6.31-77.0      |
| protocol_version        | 10               |
| slave_type_conversions  |                  |
| version                 | 10.1.18-MariaDB  |
| version_comment         | MariaDB Server   |
| version_compile_machine | x86_64           |
| version_compile_os      | Linux            |
| version_malloc_library  | bundled jemalloc |
| version_ssl_library     | YaSSL 2.3.9b     |
| wsrep_patch_version     | wsrep_25.16      |
openxs@ao756:~/dbs/maria10.1$ ps aux | grep mysqld
openxs   25568  0.0  0.0   4440   784 pts/10   S    19:00   0:00 /bin/sh bin/mysqld_safe --no-defaults
openxs   25631  1.1  2.4 751552 93596 pts/10   Sl   19:00   0:22 /home/openxs/dbs/maria10.1/bin/mysqld --no-defaults --basedir=/home/openxs/dbs/maria10.1 --datadir=/home/openxs/dbs/maria10.1/data --plugin-dir=/home/openxs/dbs/maria10.1/lib/plugin --log-error=/home/openxs/dbs/maria10.1/data/ao756.err
openxs   26158  0.0  0.0  14636   968 pts/10   S+   19:31   0:00 grep --color=auto mysqld
I assume you already have the oprofile package installed. I use MariaDB 10.1.18 built from source on this box (I work for MariaDB after all).

Basic steps to use the tool for studying some problematic period of MySQL server are the following:
  1. We check version used and, if we want to use older version of the tool and daemon-based approach, we initiate and start the process of samples collection using opcontrol tool:

    openxs@ao756:~/dbs/maria10.1$ opcontrol --version
    opcontrol: oprofile 0.9.9 compiled on Apr  4 2014 23:34:07
    openxs@ao756:~/dbs/maria10.1$ opcontrol --init
    Kernel support not available, missing opcontrol --init as root ?
    openxs@ao756:~/dbs/maria10.1$ sudo opcontrol --init
    [sudo] password for openxs:
    openxs@ao756:~/dbs/maria10.1$ sudo opcontrol --start --no-vmlinux
    ATTENTION: Use of opcontrol is discouraged.  Please see the man page for operf.
    Using default event: CPU_CLK_UNHALTED:100000:0:1:1
    Using 2.6+ OProfile kernel interface.
    Using log file /var/lib/oprofile/samples/oprofiled.log
    Daemon started.
    Profiler running.

    Note that superuser permissions are needed for most opcontrol commands. I've added --no-vmlinux command as I do not want to profile the kernel itself.
  2. Now we can run with samples collection activated for some time, until the period we want to study ends. For this post I use a primitive single thread "benchmark" test from the Bug #39630 I once reported and worked on for a long time:

    openxs@ao756:~/dbs/maria10.1$ time bin/mysql -uroot -e'select benchmark(500000000, 2*2)'
    | benchmark(500000000, 2*2) |
    |                         0 |

    real    0m20.209s
    user    0m0.009s
    sys     0m0.004s

    This was a deliberately very simple use case to study, just to check that we get meaningful and expected outputs.
  3. When the period we want to study ends, we should stop collecting samples and dump the results:

    openxs@ao756:~/dbs/maria10.1$ sudo opcontrol --stop
    Stopping profiling.
    openxs@ao756:~/dbs/maria10.1$ sudo opcontrol --dump

    Theoretically we can have several sample-collecting "sessions" that we can save under unique "names" etc, but for now I want to concentrate on the most basic usage.
  4. Now we have to use opreport tool to check the results:

    openxs@ao756:~/dbs/maria10.1$ sudo opreport --demangle=smart --symbols --merge tgid `pwd`/bin/mysqld | head -12
    Using /var/lib/oprofile/samples/ for samples directory.
    warning: /no-vmlinux could not be found.
    warning: [vdso] (tgid:25631 range:0x7ffc6fac3000-0x7ffc6fac5000) could not be found.
    CPU: Intel Sandy Bridge microarchitecture, speed 1.5e+06 MHz (estimated)
    Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
    samples  %        image name               symbol name
    83358    28.0632  mysqld                   Item_func_mul::int_op()

    58838    19.8083  mysqld                   Item_func_benchmark::val_int()
    55420    18.6576  mysqld                   Item_hybrid_func::result_type() const
    52001    17.5066  mysqld                   Item_func_hybrid_field_type::val_int()
    25931     8.7299  mysqld                   Item_int::val_int()
    10861     3.6564  mysqld                   Type_handler_int_result::cmp_type() const
    10104     3.4016  mysqld                   Type_handler_int_result::result_type() const
    460       0.1549  no-vmlinux               /no-vmlinux
    12        0.0040  mysqld                   sync_array_print_long_waits(unsigned long*, void const**)

    We provided --symbols option to list per-symbol information instead of a binary image summary, --demangle=smart option to get readable names of C++ functions, and --merge tgid option to merge profiles by thread group ID (for all threads of the same multi-threaded process, like mysqld). We also had to provide a full path to the mysqld binary to restrict symbol summary only to the specific executable image, mysqld process from MariaDB 10.1.18.When mysqld from PATH or just from some standard package is used, $(which mysqld) is used instead in the command line.
Basically that's all, we clearly see that most of the time was spent on integers multiplication (as one had to expect). We may have to use --deinit, --reset or --shutdown options of opcontrol if we do not plan any further profiling.

Now, what if we use oprofile version 1.0.0 or newer (as it happened to me on Fedora 25 or Ubuntu 16.04)? opcontrol tool just does not exist then, as well as oprofiled daemon I see running now:

openxs@ao756:~/dbs/maria10.1$ ps aux | grep oprofiled
root     27281  0.0  0.2  14036  7852 ?        S    20:22   0:00 /usr/bin/oprofiled --session-dir=/var/lib/oprofile --separate-lib=1 --separate-kernel=1 --separate-thread=1 --separate-cpu=0 --events=CPU_CLK_UNHALTED:60:0:100000:0:1:1, --no-vmlinux
openxs   27446  0.0  0.0  14636   972 pts/10   S+   20:41   0:00 grep --color=auto oprofiled

Basic steps are the following, based on operf tool (surely without oprofiled running):

openxs@ao756:~/dbs/maria10.1$ operf -p `pidof mysqld` &
[2] 4359
openxs@ao756:~/dbs/maria10.1$ operf: Press Ctl-c or 'kill -SIGINT 4359' to stop profiling
operf: Profiler started

openxs@ao756:~/dbs/maria10.1$ time bin/mysql -uroot -e'select benchmark(500000000, 2*2)'
| benchmark(500000000, 2*2) |
|                         0 |

real    0m19.772s
user    0m0.006s
sys     0m0.005s
openxs@ao756:~/dbs/maria10.1$ kill -SIGINT 4359
Profiling done.

[2]+  Done                    operf -p `pidof mysqld`
openxs@ao756:~/dbs/maria10.1$ opreport --demangle=smart --symbols --merge tgid /home/openxs/dbs/maria10.1/bin/mysqld
Using /home/openxs/dbs/maria10.1/oprofile_data/samples/ for samples directory.
warning: /no-vmlinux could not be found.
warning: [vdso] (tgid:4270 range:0x7ffebbf38000-0x7ffebbf39fff) could not be found.
CPU: Intel Sandy Bridge microarchitecture, speed 1.5e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name               symbol name
132      70.5882  no-vmlinux               /no-vmlinux
13        6.9519             memset
10        5.3476  mysqld                   os_aio_linux_handle(unsigned long, fil_node_t**, void**, unsigned long*, unsigned long*)
9         4.8128  mysqld                   sync_array_print_long_waits(unsigned long*, void const**)
3         1.6043       __pthread_mutex_cond_lock
2         1.0695       pthread_self
2         1.0695  mysqld                   btr_defragment_thread
2         1.0695  mysqld                   srv_error_monitor_thread
1         0.5348  [vdso] (tgid:4270 range:0x7ffebbf38000-0x7ffebbf39fff) [vdso] (tgid:4270 range:0x7ffebbf38000-0x7ffebbf39fff)
1         0.5348          /lib/x86_64-linux-gnu/
1         0.5348             __libc_disable_asynccancel
1         0.5348             __select_nocancel
1         0.5348             fflush
1         0.5348       pthread_cond_timedwait@@GLIBC_2.3.2
1         0.5348  mysqld                   buf_flush_page_cleaner_thread
1         0.5348  mysqld                   buf_pool_get_oldest_modification()
1         0.5348  mysqld                   log_checkpoint(unsigned long, unsigned long, unsigned long)
1         0.5348  mysqld                   mutex_enter_nowait_func(ib_mutex_t*, char const*, unsigned long)
1         0.5348  mysqld                   srv_check_activity(unsigned long, unsigned long)
1         0.5348  mysqld                   srv_master_evict_from_table_cache(unsigned long)
1         0.5348  mysqld                   srv_master_thread
1         0.5348  mysqld                   ut_delay(unsigned long)

Note that, theoretically, operf allows to profile separate commands or user's own process (as I did with -p `pidof mysqld` option) as non-privileged user. I had not used sudo above. Also, we just need one command now and running operf is stopped with -SIGINT signal when we are done with profiling.

The results I've got from opreport in this case, though, do not look 100% trustful to me though. Not because of that warning from [vdso], we had similar (although for other addresses) with classic approach as well (and you can find some related information in the FAQ), but because I see totally different profile than before under basically the same load. I see activity by background functions, but nothing related to multiplication of integers or benhcmark() function. This is suspicious and, until further tests on Fedora 25, I'd prefer NOT to trust to operf from oprofile version 0.9.9 on Ubuntu 14.04...

Next I plan to check how operf in the latest oprofile version works on Fedora 25 and (maybe) Ubuntu 16.04, apply profiler to a more interesting use case(s), show more options including --callgraph option, and try to compare recent oprofile to perf profiler (that I currently trust more and find somewhat easier to use for MySQL-related tasks). Stay tuned!


  1. does this mean 70% of the time is spent in the kernel? for arithmetic? Doesn't make sense...

  2. Yes, that result does not make sense to me. Today I treid to use operf from oprofile 1.1.0 on Fedora 25 and was able to get a reasonable profile (againsdt MariaDB Columnstore, but who cares):

    [openxs@fc23 maria10.1]$ sudo opreport --demangle=smart --symbols --merge tgid /usr/local/mariadb/columnstore/mysql//bin/mysqld | head -10
    [sudo] password for openxs:
    Using /home/openxs/dbs/maria10.1/oprofile_data/samples/ for samples directory.
    warning: /kvm could not be found.
    CPU: Core 2, speed 2499 MHz (estimated)
    Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
    samples % image name symbol name
    87250 26.3115 mysqld Item_func_mul::int_op()
    71403 21.5326 mysqld Type_handler_int_result::result_type() const
    61213 18.4597 mysqld Item_int::val_int()
    45623 13.7583 mysqld Item_func_hybrid_field_type::val_int()
    41265 12.4441 mysqld Item_func_benchmark::val_int()
    13228 3.9891 mysqld Item_hybrid_func::result_type() const
    10621 3.2029 mysqld Type_handler_int_result::cmp_type() const
    [openxs@fc23 maria10.1]$

    But only for system-wide profiling. Nothing useful is produced with -p option.

    For reference, the commands used (based on history) were:

    [openxs@fc23 maria10.1]$ fc -l 705
    705 sudo operf -s &
    706 time bin/mysql --socket=/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock -uroot -e'select benchmark(500000000, 2*2)'
    707 sudo kill -SIGINT 4226
    710 ps aux | grep mysqld
    711 sudo opreport --demangle=smart --symbols --merge tgid /usr/local/mariadb/columnstore/mysql//bin/mysqld
    712 sudo opreport --demangle=smart --symbols --merge tgid /usr/local/mariadb/columnstore/mysql//bin/mysqld | head -10

  3. Valeriy, great post, thanks. It seemed simple enough so you inspired me to create this branch, which should allow pt-stalk to continue working once 1.0.0 becomes widely used, provided someone good at QA tests it ;) :

    1. Well done, Nando! I planned to create similar patch myself, but busy with other parts/blog posts of upcoming FOSDEM 2017 presentation on profilers...