Sunday, October 3, 2021

bpftrace as a code/function coverage tool for MariaDB server

I am going to speak about bpftrace again soon, this time at MariaDB Server Fest 2021. Among other useful applications of bpftrace, I mention using it as a code coverage (or, more precisely, function coverage) tool to check whether some test executes all (or specific) functions in the MariaDB server source code. In this blog post I am going to present some related tests in more detail.

For testing I use recent enough versions of MariaDB server and bpftrace, both built from GitHub source code on my Ubuntu 20.04 "home server":

openxs@ao756:~/dbs/maria10.6$ bin/mariadbd --version
bin/mariadbd  Ver 10.6.5-MariaDB for Linux on x86_64 (MariaDB Server)
openxs@ao756:~/dbs/maria10.6$ bpftrace -V
bpftrace v0.13.0-120-gc671
openxs@ao756:~/dbs/maria10.6$ cat /etc/lsb-release

The first idea I tried, as a naive function coverage approach, was to attach a probe that prints the function name for every function in the mariadbd binary:

openxs@ao756:~/dbs/maria10.6$ sudo bpftrace -e 'uprobe:/home/openxs/dbs/maria10.6/bin/mariadbd:* { printf("%s\n", func); }'
ERROR: Can't attach to 34765 probes because it exceeds the current limit of 512 probes.
You can increase the limit through the BPFTRACE_MAX_PROBES environment variable, but BE CAREFUL since a high number of probes attached can cause your system to crash.
openxs@ao756:~/dbs/maria10.6$ bpftrace --help  2>&1 | grep MAX_PROBES
    BPFTRACE_MAX_PROBES         [default: 512] max number of probes
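
As a side note, bpftrace can list the probes a pattern would match without attaching them, via the -l option, which is a cheap way to count the candidate functions first. A hedged sketch, with /bin/bash standing in for bin/mariadbd and a guard in case bpftrace is not installed:

```shell
# Listing uprobes only inspects the binary's symbol table, so no probe
# limit applies. /bin/bash is a stand-in for bin/mariadbd here.
if command -v bpftrace >/dev/null 2>&1; then
    bpftrace -l 'uprobe:/bin/bash:*' | wc -l
else
    echo "bpftrace not installed, skipping"
fi
```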

This way I found out that bpftrace sees 34765 different functions in this binary, but by default it can attach at most 512 probes per invocation. So, let me increase BPFTRACE_MAX_PROBES, as suggested (I don't mind a system crash, it's a testing box anyway):

openxs@ao756:~/dbs/maria10.6$ sudo BPFTRACE_MAX_PROBES=35000 bpftrace -e 'uprobe:/home/openxs/dbs/maria10.6/bin/mariadbd:* { printf("%s\n", func); }'
Attaching 34765 probes...
bpf: Failed to load program: Too many open files
processed 17 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 0

bpf: Failed to load program: Too many open files
processed 17 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 0

ERROR: Error loading program: uprobe:/home/openxs/dbs/maria10.6/bin/mariadbd:thd_client_ip (try -v)
Segmentation fault
openxs@ao756:~/dbs/maria10.6$ ulimit -n

This time I crashed bpftrace, and before that it hit the limit on the number of open files, which is quite small by default. What if I increase that limit? Let me try:

openxs@ao756:~/dbs/maria10.6$ ulimit -n 40000
openxs@ao756:~/dbs/maria10.6$ ulimit -n
openxs@ao756:~/dbs/maria10.6$ sudo BPFTRACE_MAX_PROBES=35000 bpftrace -e 'uprobe:/home/openxs/dbs/maria10.6/bin/mariadbd:* { printf("%s\n", func); }'
Attaching 34765 probes...
ERROR: Offset outside the function bounds ('register_tm_clones' size is 0)
Segmentation fault

Still a crash, but with a different error message before it. It looks like some functions are not traceable for reasons I have yet to understand. The segmentation fault may still be related to the number of probes to create, but we can get the same error with far fewer probes (and for a different function):

openxs@ao756:~/dbs/maria10.6$ sudo BPFTRACE_MAX_PROBES=40000 bpftrace -e 'uprobe:/home/openxs/dbs/maria10.6/bin/mariadbd:*do* { printf("%s\n", func); }'
Attaching 1076 probes...
ERROR: Offset outside the function bounds ('__do_global_dtors_aux' size is 0)

The problematic function above is actually added by the compiler for calling the destructors of static objects (if we trust this source); it's not even from the MariaDB server code itself. The register_tm_clones function we hit before is even more mysterious (read this). It is somehow related to the transactional memory model in C++. It looks like one day I'll just have to create a list of all the traceable and important functions in the MariaDB server source code and add an explicit probe for each. Naive attempts to trace everything do not work.
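
Such a list of traceable functions can be sketched with standard binutils: uprobes need symbols with a nonzero size, and register_tm_clones and __do_global_dtors_aux are exactly the zero-size kind. A hedged sketch, where a tiny C program stands in for the real mariadbd binary:

```shell
# Build a throwaway binary; in practice you would point nm at bin/mariadbd.
cat > /tmp/demo.c <<'EOF'
int work(void) { return 42; }
int main(void) { return work(); }
EOF
gcc -o /tmp/demo /tmp/demo.c

# nm -S prints the size column only for nonzero-size symbols, so requiring
# four fields skips zero-size helpers like register_tm_clones.
nm -S --defined-only /tmp/demo \
    | awk '$3 ~ /^[Tt]$/ && NF == 4 { print $4 }' \
    | sort
```

The resulting names (work, main and possibly a few runtime helpers) could then be turned into explicit uprobe definitions. Note that nm needs symbols, so a stripped mariadbd would require the separate debug info.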

But we surely can trace a lot of known useful functions, as in the following example, where I add a probe for each function whose name matches 'ha_*' and count how many times each was called during a sysbench test:

openxs@ao756:~/dbs/maria10.6$ sudo BPFTRACE_MAX_PROBES=1000 bpftrace -e 'uprobe:/home/openxs/dbs/maria10.6/bin/mariadbd:ha_* { @cnt[func] += 1; }' > /tmp/funcs2.txt
[sudo] password for openxs:

The test that I executed in another shell was the following (numbers do not really matter):

openxs@ao756:~/dbs/maria10.6$ sysbench oltp_read_write --mysql-socket=/tmp/mariadb.sock --mysql-user=openxs --tables=2 --table-size=1000 --threads=2 run --time=1600 --report-interval=10
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 2
Report intermediate results every 10 second(s)
Initializing random number generator from current time

Initializing worker threads...

Threads started!

[ 10s ] thds: 2 tps: 74.96 qps: 1502.59 (r/w/o: 1052.20/300.26/150.13) lat (ms,95%): 44.17 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 2 tps: 70.31 qps: 1404.51 (r/w/o: 983.07/280.82/140.61) lat (ms,95%): 36.24 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 2 tps: 82.50 qps: 1650.50 (r/w/o: 1155.30/330.20/165.00) lat (ms,95%): 40.37 err/s: 0.00 reconn/s: 0.00
[ 40s ] thds: 2 tps: 82.60 qps: 1652.60 (r/w/o: 1156.80/330.60/165.20) lat (ms,95%): 33.12 err/s: 0.00 reconn/s: 0.00
[ 50s ] thds: 2 tps: 50.20 qps: 1004.48 (r/w/o: 703.29/200.80/100.40) lat (ms,95%): 89.16 err/s: 0.00 reconn/s: 0.00
[ 60s ] thds: 2 tps: 74.80 qps: 1494.61 (r/w/o: 1045.91/299.20/149.50) lat (ms,95%): 44.98 err/s: 0.00 reconn/s: 0.00
[ 70s ] thds: 2 tps: 82.70 qps: 1654.31 (r/w/o: 1158.21/330.60/165.50) lat (ms,95%): 33.72 err/s: 0.00 reconn/s: 0.00
[ 80s ] thds: 2 tps: 81.60 qps: 1632.61 (r/w/o: 1142.80/326.60/163.20) lat (ms,95%): 44.17 err/s: 0.00 reconn/s: 0.00

I let it run for a minute or so, and now I can check which ha_* functions of the MariaDB server were called, and how many times:

openxs@ao756:~/dbs/maria10.6$ cat /tmp/funcs2.txt
Attaching 953 probes...

@cnt[ha_rollback_trans(THD*, bool)]: 2
@cnt[ha_close_connection(THD*)]: 2
@cnt[ha_heap::open(char const*, int, unsigned int)]: 6296
@cnt[ha_heap::close()]: 6298
@cnt[ha_heap::update_key_stats()]: 6320
@cnt[ha_heap::drop_table(char const*)]: 6353
@cnt[ha_lock_engine(THD*, handlerton const*)]: 6524
@cnt[ha_innobase::delete_row(unsigned char const*)]: 6551
@cnt[ha_innobase::rnd_init(bool)]: 6602
@cnt[ha_innobase::innobase_lock_autoinc()]: 6604
@cnt[ha_innobase::write_row(unsigned char const*)]: 6604
@cnt[ha_heap::table_flags() const]: 11746
@cnt[ha_heap::info(unsigned int)]: 12606
@cnt[ha_check_if_updates_are_ignored(THD*, handlerton*, char const*)]: 13094
@cnt[ha_innobase::update_row(unsigned char const*, unsigned char const*)]: 13122
@cnt[ha_heap::~ha_heap()]: 18819
@cnt[ha_heap::rnd_init(bool)]: 18865
@cnt[ha_innobase::unlock_row()]: 26252
@cnt[ha_innobase::records_in_range(unsigned int, st_key_range const*, st_key_range const*, st_page_range*)]: 26320
@cnt[ha_innobase::try_semi_consistent_read(bool)]: 26413
@cnt[ha_innobase::referenced_by_foreign_key()]: 26432
@cnt[ha_innobase::was_semi_consistent_read()]: 33022
@cnt[ha_innobase::multi_range_read_explain_info(unsigned int, char*, unsigned long)]: 46221
@cnt[ha_innobase::read_time(unsigned int, unsigned int, unsigned long long)]: 46228
@cnt[ha_innobase::multi_range_read_info_const(unsigned int, st_range_seq_if*, void*, unsigned int, unsigned int*, unsigned int*, Cost_estimate*)]: 46230
@cnt[ha_innobase::multi_range_read_init(st_range_seq_if*, void*, unsigned int, unsigned int, st_handler_buffer*)]: 46232
@cnt[ha_heap::extra(ha_extra_function)]: 50393
@cnt[ha_innobase::estimate_rows_upper_bound()]: 52828
@cnt[ha_innobase::scan_time()]: 72638
@cnt[ha_innobase::lock_count() const]: 107072
@cnt[ha_innobase::index_init(unsigned int, bool)]: 111372
@cnt[ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function)]: 111432
@cnt[ha_innobase::info(unsigned int)]: 112253
@cnt[ha_innobase::info_low(unsigned int, bool)]: 112271
@cnt[ha_innobase::store_lock(THD*, st_thr_lock_data**, thr_lock_type)]: 118088
@cnt[ha_innobase::change_active_index(unsigned int)]: 118700
@cnt[ha_innobase::reset()]: 118934
@cnt[ha_commit_trans(THD*, bool)]: 123967
@cnt[ha_commit_one_phase(THD*, bool)]: 124096
@cnt[ha_check_and_coalesce_trx_read_only(THD*, Ha_trx_info*, bool)]: 124363
@cnt[ha_innobase::column_bitmaps_signal()]: 191279
@cnt[ha_innobase::table_flags() const]: 213982
@cnt[ha_innobase::build_template(bool)]: 230402
@cnt[ha_innobase::external_lock(THD*, int)]: 236446
@cnt[ha_innobase::rnd_end()]: 237608
@cnt[ha_innobase::extra(ha_extra_function)]: 415923
@cnt[ha_innobase::index_flags(unsigned int, unsigned int, bool) const]: 451710
@cnt[ha_heap::write_row(unsigned char const*)]: 628369
@cnt[ha_heap::position(unsigned char const*)]: 629184
@cnt[ha_heap::rnd_pos(unsigned char*, unsigned char*)]: 629341
@cnt[ha_heap::rnd_next(unsigned char*)]: 635061
@cnt[ha_innobase::position(unsigned char const*)]: 660590
@cnt[ha_innobase::index_next(unsigned char*)]: 2621915
@cnt[ha_innobase::multi_range_read_next(void**)]: 2708026

I think this is really awesome! I can trace 900+ functions with a bpftrace one-liner and get the @cnt associative array printed at the end, with the (demangled!) function name as the key and the number of times the function was called during the test as the value, automatically ordered by increasing value!

bpftrace may help MariaDB DBAs and developers in many cases!

This is already useful for finding out whether any of the traced functions was called at all. I can run several bpftrace command lines concurrently, each tracing a different batch of functions, and then summarize their outputs; I can measure the time spent in each function, get stack traces if needed, and so on. I hope you already agree that the idea of using bpftrace as a function coverage tool may work even for software as complex as MariaDB server.
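
Summarizing several such outputs is easy with standard text tools. A hedged awk sketch (the file names are illustrative) that merges the @cnt lines from multiple runs and sums the per-function counts:

```shell
# Each line looks like "@cnt[func_signature]: N"; strip the @cnt[...] wrapper
# and accumulate N per function name across all input files.
awk -F': ' '/^@cnt\[/ {
    name = $1
    sub(/^@cnt\[/, "", name)
    sub(/\]$/, "", name)
    total[name] += $NF
} END {
    for (f in total) printf "%10d  %s\n", total[f], f
}' /tmp/funcs*.txt | sort -n
```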

I'll demonstrate this during my upcoming talk. See you there on Wednesday, October 6!