Thursday, February 10, 2022

How to Summarize gdb Backtrace with pt-pmp (and Flamegraph)

This is going to be more like a note to myself and my readers than a real blog post. But still I'd like o document the trick that I am applying for years already, in its most convenient form. 

Assuming you have a backtrace (or full backtrace) created from a core file or by attaching gdb to a live (and maybe hanging) process with many threads, like MySQL or MariaDB server, what is the best way to summarize it quickly, to understand what most of the threads are doing/hanging at? Like this huge backtrace with hundreds of threads:

openxs@ao756:~$ ls -l /tmp/backtrace1.txt
-rw-rw-r-- 1 openxs openxs 2817054 лют 10 17:02 /tmp/backtrace1.txt
openxs@ao756:~$ grep LWP /tmp/backtrace1.txt | wc -l
1915

Here it is. You have to download pt-pmp from Percona Toolkit. Then source the pt-pmp script and rely on the aggregate_stacktrace function from it, quite advanced in summarizing stack traces:

openxs@ao756:~$ which pt-pmp
/usr/bin/pt-pmp
openxs@ao756:~$ . /usr/bin/pt-pmp
openxs@ao756:~$ cat /tmp/backtrace1.txt | aggregate_stacktrace > /tmp/pmp1.txt 
openxs@ao756:~$ ls -l /tmp/pmp1.txt
-rw-rw-r-- 1 openxs openxs 34174 лют 10 18:07 /tmp/pmp1.txt
openxs@ao756:~$ head -5 /tmp/pmp1.txt
    598 poll(libc.so.6),vio_io_wait(viosocket.c:945),vio_socket_io_wait(viosocket.c:108),vio_read(viosocket.c:184),my_real_read(net_serv.cc:892),my_net_read_packet_reallen(net_serv.cc:1162),my_net_read_packet(net_serv.cc:1146),do_command(sql_parse.cc:1262),do_handle_one_connection(sql_connect.cc:1336),handle_one_connection(sql_connect.cc:1241),pfs_spawn_thread(pfs.cc:1862),start_thread(libpthread.so.0),clone(libc.so.6)
     82 pthread_cond_wait,wait(os0event.cc:158),wait_low(os0event.cc:158),os_event_wait_low(os0event.cc:158),sync_array_wait_event(sync0arr.cc:471),rw_lock_s_lock_spin(sync0rw.cc:373),rw_lock_s_lock_func(sync0rw.ic:290),pfs_rw_lock_s_lock_func(sync0rw.ic:290),buf_page_get_gen(buf0buf.cc:4905),btr_cur_search_to_nth_level(btr0cur.cc:1243),btr_pcur_open_low(btr0pcur.ic:467),row_ins_scan_sec_index_for_duplicate(btr0pcur.ic:467),row_ins_sec_index_entry_low(btr0pcur.ic:467),row_ins_sec_index_entry(row0ins.cc:3251),row_ins_index_entry(row0ins.cc:3297),row_ins_index_entry_step(row0ins.cc:3297),row_ins(row0ins.cc:3297),row_ins_step(row0ins.cc:3297),row_insert_for_mysql(row0mysql.cc:1414),ha_innobase::write_row(ha_innodb.cc:8231),handler::ha_write_row(handler.cc:6089),write_record(sql_insert.cc:1941),mysql_insert(sql_insert.cc:1066),mysql_execute_command(sql_parse.cc:4170),mysql_parse(sql_parse.cc:7760),dispatch_command(sql_parse.cc:1832),do_command(sql_parse.cc:1386),do_handle_one_connection(sql_connect.cc:1336),handle_one_connection(sql_connect.cc:1241),pfs_spawn_thread(pfs.cc:1862),start_thread(libpthread.so.0),clone(libc.so.6)
     55 pthread_cond_wait,wait(os0event.cc:158),wait_low(os0event.cc:158),os_event_wait_low(os0event.cc:158),lock_wait_suspend_thread(lock0wait.cc:347),row_mysql_handle_errors(row0mysql.cc:741),row_insert_for_mysql(row0mysql.cc:1428),ha_innobase::write_row(ha_innodb.cc:8231),handler::ha_write_row(handler.cc:6089),write_record(sql_insert.cc:1941),mysql_insert(sql_insert.cc:1066),mysql_execute_command(sql_parse.cc:4170),mysql_parse(sql_parse.cc:7760),dispatch_command(sql_parse.cc:1832),do_command(sql_parse.cc:1386),do_handle_one_connection(sql_connect.cc:1336),handle_one_connection(sql_connect.cc:1241),pfs_spawn_thread(pfs.cc:1862),start_thread(libpthread.so.0),clone(libc.so.6)
     38 pthread_cond_wait,wait(os0event.cc:158),wait_low(os0event.cc:158),os_event_wait_low(os0event.cc:158),sync_array_wait_event(sync0arr.cc:471),rw_lock_x_lock_func(sync0rw.cc:733),pfs_rw_lock_x_lock_func(sync0rw.ic:544),buf_page_get_gen(buf0buf.cc:4918),btr_cur_search_to_nth_level(btr0cur.cc:1243),row_ins_sec_index_entry_low(row0ins.cc:2946),row_ins_sec_index_entry(row0ins.cc:3251),row_ins_index_entry(row0ins.cc:3297),row_ins_index_entry_step(row0ins.cc:3297),row_ins(row0ins.cc:3297),row_ins_step(row0ins.cc:3297),row_insert_for_mysql(row0mysql.cc:1414),ha_innobase::write_row(ha_innodb.cc:8231),handler::ha_write_row(handler.cc:6089),write_record(sql_insert.cc:1941),mysql_insert(sql_insert.cc:1066),mysql_execute_command(sql_parse.cc:4170),mysql_parse(sql_parse.cc:7760),dispatch_command(sql_parse.cc:1832),do_command(sql_parse.cc:1386),do_handle_one_connection(sql_connect.cc:1336),handle_one_connection(sql_connect.cc:1241),pfs_spawn_thread(pfs.cc:1862),start_thread(libpthread.so.0),clone(libc.so.6)
     32 pthread_cond_wait,wait(os0event.cc:158),wait_low(os0event.cc:158),os_event_wait_low(os0event.cc:158),sync_array_wait_event(sync0arr.cc:471),rw_lock_x_lock_func(sync0rw.cc:733),pfs_rw_lock_x_lock_func(sync0rw.ic:544),buf_page_get_gen(buf0buf.cc:4918),btr_cur_search_to_nth_level(btr0cur.cc:1243),row_ins_sec_index_entry_low(row0ins.cc:3040),row_ins_sec_index_entry(row0ins.cc:3251),row_ins_index_entry(row0ins.cc:3297),row_ins_index_entry_step(row0ins.cc:3297),row_ins(row0ins.cc:3297),row_ins_step(row0ins.cc:3297),row_insert_for_mysql(row0mysql.cc:1414),ha_innobase::write_row(ha_innodb.cc:8231),handler::ha_write_row(handler.cc:6089),write_record(sql_insert.cc:1941),mysql_insert(sql_insert.cc:1066),mysql_execute_command(sql_parse.cc:4170),mysql_parse(sql_parse.cc:7760),dispatch_command(sql_parse.cc:1832),do_command(sql_parse.cc:1386),do_handle_one_connection(sql_connect.cc:1336),handle_one_connection(sql_connect.cc:1241),pfs_spawn_thread(pfs.cc:1862),start_thread(libpthread.so.0),clone(libc.so.6)

If a summary like above is not clear enough, we can surely go one step further and create a proper flame graph based on the output above:

openxs@ao756:~$ cat /tmp/pmp1.txt | awk '{print $2, $1}' | sed -e 's/,/;/g' | ~/git/FlameGraph/flamegraph.pl --countname="threads" --reverse - >/tmp/pmp1.svg  

Then with some creative zooming and search we can concentrate on waits:

Frames with "wait" substring in the function name are highlighted

That's all. As simple as it looks. Quick overview of the backtrace before you start digging there.

Friday, February 4, 2022

How to Add Probe Inside a Function and Access Local Variables with bpftrace

During several my talks on bpftrace I mentioned that it is possible to add probe to "every other" line of the code, same as with perf, but had never demonstrated how to do this. Looks like it's time to show both this and access to local variables declared inside the function we trace, as I am going to speak about bpftrace again at FOSDEM (and have to share something new and cool).

Consider the following debugging session with latest MariaDB server 10.8 built from GitHub source (as usual, for my own work I use only my own builds from current source) on Ubuntu 20.04:

openxs@ao756:~/dbs/maria10.8$ sudo gdb -p `pidof mariadbd`
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 200650
[New LWP 200652]
[New LWP 200653]
[New LWP 200654]
[New LWP 200655]
[New LWP 200656]
[New LWP 200659]
[New LWP 200662]
[New LWP 200663]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
--Type <RET> for more, q to quit, c to continue without paging--
0x00007f7b8424caff in __GI___poll (fds=0x557960733d38, nfds=3,
    timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
29      ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) b do_command
Breakpoint 1 at 0x55795e867570: file /home/openxs/git/server/sql/sql_parse.cc, line 1197.
(gdb) c
Continuing.

Basically I've attached to running mariadbd process and set a breakpoint on do_command(), and then let it continue working. In another terminal I've connected to server and got the breakpoint hit immediately:

[New Thread 0x7f7b5dbf0700 (LWP 200680)]
[New Thread 0x7f7b6c084700 (LWP 200797)]
[Switching to Thread 0x7f7b6c084700 (LWP 200797)]

Thread 11 "mariadbd" hit Breakpoint 1, do_command (thd=0x7f7b4c001738,
    blocking=blocking@entry=true)
    at /home/openxs/git/server/sql/sql_parse.cc:1197
1197    {
(gdb) bt
#0  do_command (thd=0x7f7b4c001738, blocking=blocking@entry=true)
    at /home/openxs/git/server/sql/sql_parse.cc:1197
#1  0x000055795e97df07 in do_handle_one_connection (connect=<optimized out>,
    put_in_cache=true) at /home/openxs/git/server/sql/sql_connect.cc:1418
#2  0x000055795e97e23d in handle_one_connection (arg=arg@entry=0x55796079f798)
    at /home/openxs/git/server/sql/sql_connect.cc:1312
#3  0x000055795ecc7e8d in pfs_spawn_thread (arg=0x557960733dd8)
    at /home/openxs/git/server/storage/perfschema/pfs.cc:2201
#4  0x00007f7b84686609 in start_thread (arg=<optimized out>)
    at pthread_create.c:477
#5  0x00007f7b84259293 in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

From this state in gdb we can trace execution step by step:

(gdb) n
[New Thread 0x7f7b5ebf2700 (LWP 200800)]
1213      if (thd->async_state.m_state == thd_async_state::enum_async_state::RESUMED)
(gdb) n
1229      thd->lex->current_select= 0;
(gdb) n
1237      if (!thd->skip_wait_timeout)
(gdb) n
1238        my_net_set_read_timeout(net, thd->get_net_wait_timeout());
(gdb) n
1241      thd->clear_error(1);
(gdb) n
1243      net_new_transaction(net);
(gdb) n
1246      thd->start_bytes_received= thd->status_var.bytes_received;
(gdb) n
1264      packet_length= my_net_read_packet(net, 1);
(gdb) n
1266      if (unlikely(packet_length == packet_error))
(gdb) n
1300      packet= (char*) net->read_pos;
(gdb) n
1309      if (packet_length == 0)                       /* safety */
(gdb) n
1316      packet[packet_length]= '\0';                  /* safety */
(gdb) n
1319      command= fetch_command(thd, packet);
(gdb) p packet
$1 = 0x7f7b4c008d08 "\003select @@version_comment limit 1"
(gdb) q
A debugging session is active.

        Inferior 1 [process 200650] will be detached.

Quit anyway? (y or n) y
Detaching from program: /home/openxs/dbs/maria10.8/bin/mariadbd, process 200650
[Inferior 1 (process 200650) detached]

In that session above I moved to the next step until I've got to the place in the code where local variable packet got a value, zero-terminated string. I know from the previous experience that it's value is the SQL statement to be executed (in most cases). So, I printed it to find out that mysql command line client starts with finding the server version comment to output.

Question is: can we do the same with bpftrace, attache the probe to the instruction at/around line 1316 of the sql/sql_parse.cc and print the value of the local variable named packet as a string? 

It should be possible, as we can do this with perf or even raw ftrace probes. This is how we can find all "tracable" lines in the code of do_command() function with perf:

openxs@ao756:~/dbs/maria10.8$ sudo perf probe -x /home/openxs/dbs/maria10.8/bin/mysqld --line do_command
<do_command@/home/openxs/git/server/sql/sql_parse.cc:0>
      0  dispatch_command_return do_command(THD *thd, bool blocking)
         {
      2    dispatch_command_return return_value;
           char *packet= 0;
           ulong packet_length;
           NET *net= &thd->net;
           enum enum_server_command command;
           DBUG_ENTER("do_command");

         #ifdef WITH_WSREP
           DBUG_ASSERT(!thd->async_state.pending_ops() ||
                       (WSREP(thd) &&
                        thd->wsrep_trx().state() == wsrep::transaction::s_abort>
         #else
           DBUG_ASSERT(!thd->async_state.pending_ops());
         #endif

           if (thd->async_state.m_state == thd_async_state::enum_async_state::R>
           {
             /*
              Resuming previously suspended command.
              Restore the state
             */
     23      command = thd->async_state.m_command;
     24      packet = thd->async_state.m_packet.str;
             packet_length = (ulong)thd->async_state.m_packet.length;
             goto resume;
           }

...
    104    packet= (char*) net->read_pos;
           /*
             'packet_length' contains length of data, as it was stored in packet
             header. In case of malformed header, my_net_read returns zero.
             If packet_length is not zero, my_net_read ensures that the returned
             number of bytes was actually read from network.
             There is also an extra safety measure in my_net_read:
             it sets packet[packet_length]= 0, but only for non-zero packets.
           */
    113    if (packet_length == 0)                       /* safety */
           {
             /* Initialize with COM_SLEEP packet */
    116      packet[0]= (uchar) COM_SLEEP;
    117      packet_length= 1;
           }
           /* Do not rely on my_net_read, extra safety against programming erro>
    120    packet[packet_length]= '\0';                  /* safety */


    123    command= fetch_command(thd, packet);
...
    221    DBUG_ASSERT(thd->m_digest == NULL);
           DBUG_ASSERT(thd->m_statement_psi == NULL);
         #ifdef WITH_WSREP
           if (packet_length != packet_error)
           {
             /* there was a command to process, and before_command() has been c>
             if (unlikely(wsrep_service_started))
    228        wsrep_after_command_after_result(thd);
           }
         #endif /* WITH_WSREP */
           DBUG_RETURN(return_value);

openxs@ao756:~/dbs/maria10.8$

So, basically we need a probe on line 120 to match the place where I printed the value in gdb. Let's create it with perf:

openxs@ao756:~/dbs/maria10.8$ sudo perf probe -x /home/openxs/dbs/maria10.8/bin/mariadbd 'do_command:120 packet:string'
Added new event:
  probe_mariadbd:do_command_L120 (on do_command:120 in /home/openxs/dbs/maria10.8/bin/mariadbd with packet:string)

You can now use it in all perf tools, such as:

        perf record -e probe_mariadbd:do_command_L120 -aR sleep 1

Now, we can check what user probe was really created by that command line:

openxs@ao756:~/dbs/maria10.8$ sudo cat /sys/kernel/tracing/uprobe_events
p:probe_mariadbd/do_command_L120 /home/openxs/dbs/maria10.8/bin/mariadbd:0x00000000007b761f packet_string=+0(%r10):string

So, the probe is attached to some address, 0x7b761f, in the mariadb binary. Looks like we refer to the register, r10, while trying to access local variable, packet.

Now let's delete this probe (we'll add it back with bpftrace soon) and try to make sense of that address:

openxs@ao756:~/dbs/maria10.8$ sudo perf probe --del 'probe_mariadbd:do_command*'
Removed event: probe_mariadbd:do_command_L120
openxs@ao756:~/dbs/maria10.8$ objdump -tT /home/openxs/dbs/maria10.8/bin/mariadbd | grep do_command
00000000007b7570 g     F .text  00000000000007e9              _Z10do_commandP3THDb
00000000007b7570 g    DF .text  00000000000007e9  Base        _Z10do_commandP3THDb

Note that based on the above the entry address of the do_command() function in the binary is 0x7b7570, a bit smaller than 0x7b761f. The difference is 0x7b761f - 0x7b7570 = 0xaf or 175 in decimal. 

Can we come up to the address for the probe  without perf? Yes, we can do it by checking the assembly code of the function, for example, in gdb:

openxs@ao756:~/dbs/maria10.8$ ps aux | grep 'mariadbd ' | grep -v grep
openxs    200650  0.0  2.4 1341032 94976 ?       Sl   січ30   0:05 /home/openxs/dbs/maria10.8/bin/mariadbd --no-defaults --basedir=/home/openxs/dbs/maria10.8 --datadir=/home/openxs/dbs/maria10.8/data --plugin-dir=/home/openxs/dbs/maria10.8/lib/plugin --log-error=/home/openxs/dbs/maria10.8/data/ao756.err --pid-file=ao756.pid --socket=/tmp/mariadb.sock --port=3309
openxs@ao756:~/dbs/maria10.8$ sudo gdb -p `pidof mariadbd`
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 200650
[New LWP 200652]
[New LWP 200653]
[New LWP 200654]
[New LWP 200655]
[New LWP 200662]
[New LWP 200663]
[New LWP 204349]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
--Type <RET> for more, q to quit, c to continue without paging--
0x00007f7b8424caff in __GI___poll (fds=0x557960733d38, nfds=3,
    timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
29      ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) disass /m do_command
Dump of assembler code for function do_command(THD*, bool):
1142    static bool wsrep_command_no_result(char command)

1143    {
1144      return (command == COM_STMT_PREPARE          ||
   0x000055795e8679d0 <+1120>:  cmp    $0x16,%al
   0x000055795e8679d2 <+1122>:  je     0x55795e8679dc <do_command(THD*, bool)+1132>
   0x000055795e8679d4 <+1124>:  cmp    $0x1c,%al
   0x000055795e8679d6 <+1126>:  jne    0x55795e867b50 <do_command(THD*, bool)+1504>
   0x000055795e8679dc <+1132>:  mov    $0x1,%esi

1145              command == COM_STMT_FETCH            ||
1146              command == COM_STMT_SEND_LONG_DATA   ||
1147              command == COM_STMT_CLOSE);
1148    }
1149    #endif /* WITH_WSREP */
1150    #ifndef EMBEDDED_LIBRARY
1151    static enum enum_server_command fetch_command(THD *thd, char *packet)

1152    {
1153      enum enum_server_command
--Type <RET> for more, q to quit, c to continue without paging--
   0x000055795e867628 <+184>:   movzbl %al,%r15d

1154        command= (enum enum_server_command) (uchar) packet[0];
1155      DBUG_ENTER("fetch_command");

1156
1157      if (command >= COM_END ||
   0x000055795e86762c <+188>:   lea    -0x20(%r15),%edx
   0x000055795e867630 <+192>:   cmp    $0xd9,%edx
   0x000055795e867636 <+198>:   jbe    0x55795e8676f0 <do_command(THD*, bool)+384>
   0x000055795e86763c <+204>:   cmp    $0xff,%al
   0x000055795e86763e <+206>:   je     0x55795e8676f0 <do_command(THD*, bool)+384>

1158          (command >= COM_MDB_GAP_BEG && command <= COM_MDB_GAP_END))
1159        command= COM_END;                           // Wrong command

1160
1161      DBUG_PRINT("info",("Command on %s = %d (%s)",

1162                         vio_description(thd->net.vio), command,
1163                         command_name[command].str));
--Type <RET> for more, q to quit, c to continue without paging--
1164      DBUG_RETURN(command);

1165    }
1166
1167    /**
1168      Read one command from connection and execute it (query or simple command).
1169      This function is to be used by different schedulers (one-thread-per-connection,
1170      pool-of-threads)
1171
1172      For profiling to work, it must never be called recursively.
1173
1174      @param thd - client connection context
1175
1176      @param blocking - wait for command to finish.
1177                        if false (nonblocking), then the function might
1178                        return when command is "half-finished", with
1179                        DISPATCH_COMMAND_WOULDBLOCK.
1180                        Currenly, this can *only* happen when using
1181                        threadpool. The command will resume, after all outstanding
1182                        async operations (i.e group commit) finish.
--Type <RET> for more, q to quit, c to continue without paging--
1183                        Threadpool scheduler takes care of "resume".
1184
1185      @retval
1186        DISPATCH_COMMAND_SUCCESS - success
1187      @retval
1188        DISPATCH_COMMAND_CLOSE_CONNECTION  request of THD shutdown
1189              (s. dispatch_command() description)
1190      @retval
1191        DISPATCH_COMMAND_WOULDBLOCK - need to wait for asyncronous operations
1192                                      to finish. Only returned if parameter
1193                                      'blocking' is false.
1194    */
1195
1196    dispatch_command_return do_command(THD *thd, bool blocking)
1197    {
   0x000055795e867570 <+0>:     endbr64

1198      dispatch_command_return return_value;

1199      char *packet= 0;

1200      ulong packet_length;
--Type <RET> for more, q to quit, c to continue without paging--

1201      NET *net= &thd->net;

1202      enum enum_server_command command;

1203      DBUG_ENTER("do_command");

1204
1205    #ifdef WITH_WSREP
1206      DBUG_ASSERT(!thd->async_state.pending_ops() ||

1207                  (WSREP(thd) &&
1208                   thd->wsrep_trx().state() == wsrep::transaction::s_aborted));
1209    #else
1210      DBUG_ASSERT(!thd->async_state.pending_ops());
1211    #endif
1212
1213      if (thd->async_state.m_state == thd_async_state::enum_async_state::RESUMED)
   0x000055795e867574 <+4>:     push   %rbp
   0x000055795e867575 <+5>:     mov    %rsp,%rbp
   0x000055795e867578 <+8>:     push   %r15
--Type <RET> for more, q to quit, c to continue without paging--
   0x000055795e86757a <+10>:    push   %r14
   0x000055795e86757c <+12>:    push   %r13
   0x000055795e86757e <+14>:    mov    %esi,%r13d
   0x000055795e867581 <+17>:    push   %r12
   0x000055795e867583 <+19>:    push   %rbx
   0x000055795e867584 <+20>:    mov    %rdi,%rbx
   0x000055795e867587 <+23>:    sub    $0x78,%rsp
   0x000055795e86758b <+27>:    cmpl   $0x2,0x6270(%rdi)
   0x000055795e867592 <+34>:    je     0x55795e867770 <do_command(THD*, bool)+512>
   0x000055795e867598 <+40>:    mov    0x58(%rdi),%rax
   0x000055795e86759c <+44>:    cmpb   $0x0,0x279e(%rdi)
   0x000055795e8675a3 <+51>:    lea    0x290(%rdi),%r14

1214      {
1215        /*
1216         Resuming previously suspended command.
1217         Restore the state
1218        */
1219        command = thd->async_state.m_command;
   0x000055795e867770 <+512>:   mov    0x6278(%rdi),%r10
   0x000055795e867777 <+519>:   mov    0x6280(%rdi),%r12
   0x000055795e86777e <+526>:   movzbl %sil,%r8d
--Type <RET> for more, q to quit, c to continue without paging--
   0x000055795e867782 <+530>:   mov    %rdi,%rsi
   0x000055795e867785 <+533>:   mov    0x6274(%rdi),%r15d

1220        packet = thd->async_state.m_packet.str;

1221        packet_length = (ulong)thd->async_state.m_packet.length;

1222        goto resume;

1223      }
1224
1225      /*
1226        indicator of uninitialized lex => normal flow of errors handling
1227        (see my_message_sql)
1228      */
1229      thd->lex->current_select= 0;
   0x000055795e8675aa <+58>:    movq   $0x0,0xd38(%rax)

1230
1231      /*
1232        This thread will do a blocking read from the client which
1233        will be interrupted when the next command is received from
1234        the client, the connection is closed or "net_wait_timeout"
--Type <RET> for more, q to quit, c to continue without paging--
1235        number of seconds has passed.
1236      */
1237      if (!thd->skip_wait_timeout)
   0x000055795e8675b5 <+69>:    je     0x55795e867710 <do_command(THD*, bool)+416>

1238        my_net_set_read_timeout(net, thd->get_net_wait_timeout());
1239
1240      /* Errors and diagnostics are cleared once here before query */
1241      thd->clear_error(1);
1242
1243      net_new_transaction(net);
   0x000055795e8675dd <+109>:   mov    0x13f0(%rbx),%rax
   0x000055795e8675e4 <+116>:   mov    $0x1,%esi
   0x000055795e8675e9 <+121>:   mov    %r14,%rdi
   0x000055795e8675ec <+124>:   movl   $0x0,0x2f0(%rbx)

1244
1245      /* Save for user statistics */
1246      thd->start_bytes_received= thd->status_var.bytes_received;
   0x000055795e8675f6 <+134>:   mov    %rax,0x4040(%rbx)

1247
--Type <RET> for more, q to quit, c to continue without paging--
1248      /*
1249        Synchronization point for testing of KILL_CONNECTION.
1250        This sync point can wait here, to simulate slow code execution
1251        between the last test of thd->killed and blocking in read().
1252
1253        The goal of this test is to verify that a connection does not
1254        hang, if it is killed at this point of execution.
1255        (Bug#37780 - main.kill fails randomly)
1256
1257        Note that the sync point wait itself will be terminated by a
1258        kill. In this case it consumes a condition broadcast, but does
1259        not change anything else. The consumed broadcast should not
1260        matter here, because the read/recv() below doesn't use it.
1261      */
1262      DEBUG_SYNC(thd, "before_do_command_net_read");

1263
1264      packet_length= my_net_read_packet(net, 1);
   0x000055795e8675fd <+141>:   callq  0x55795ebda8a0 <my_net_read_packet(NET*, my_bool)>
   0x000055795e867602 <+146>:   mov    %rax,%r12

1265
--Type <RET> for more, q to quit, c to continue without paging--
1266      if (unlikely(packet_length == packet_error))
   0x000055795e867605 <+149>:   cmp    $0xffffffffffffffff,%rax
   0x000055795e867609 <+153>:   je     0x55795e8678d0 <do_command(THD*, bool)+864>

1267      {
1268        DBUG_PRINT("info",("Got error %d reading command from socket %s",

1269                           net->error,
1270                           vio_description(net->vio)));
1271
1272        /* Instrument this broken statement as "statement/com/error" */
1273        thd->m_statement_psi= MYSQL_REFINE_STATEMENT(thd->m_statement_psi,
   0x000055795e8678d0 <+864>:   mov    0x3b78(%rbx),%rdi

1274                                                     com_statement_info[COM_END].
1275                                                     m_key);
1276
1277
1278        /* Check if we can continue without closing the connection */
1279
1280        /* The error must be set. */
--Type <RET> for more, q to quit, c to continue without paging--
1281        DBUG_ASSERT(thd->is_error());

1282        thd->protocol->end_statement();
   0x000055795e8678ee <+894>:   mov    0x558(%rbx),%rdi
   0x000055795e8678f5 <+901>:   callq  0x55795e79f7e0 <Protocol::end_statement()>

1283
1284        /* Mark the statement completed. */
1285        MYSQL_END_STATEMENT(thd->m_statement_psi, thd->get_stmt_da());
1286        thd->m_statement_psi= NULL;
   0x000055795e86791b <+939>:   cmpb   $0x3,0x324(%rbx)
   0x000055795e867922 <+946>:   mov    $0x1,%r13d
   0x000055795e867928 <+952>:   movq   $0x0,0x3b78(%rbx)

1287        thd->m_digest= NULL;
   0x000055795e867933 <+963>:   movq   $0x0,0x3b30(%rbx)

1288
1289        if (net->error != 3)
   0x000055795e86793e <+974>:   jne    0x55795e86794a <do_command(THD*, bool)+986>

--Type <RET> for more, q to quit, c to continue without paging--
1290        {
1291          return_value= DISPATCH_COMMAND_CLOSE_CONNECTION;     // We have to close it.
1292          goto out;
1293        }
1294
1295        net->error= 0;
   0x000055795e867940 <+976>:   movb   $0x0,0x324(%rbx)

1296        return_value= DISPATCH_COMMAND_SUCCESS;

1297        goto out;
   0x000055795e867947 <+983>:   xor    %r13d,%r13d

1298      }
1299
1300      packet= (char*) net->read_pos;
   0x000055795e86760f <+159>:   mov    0x2b0(%rbx),%r10

1301      /*
1302        'packet_length' contains length of data, as it was stored in packet
1303        header. In case of malformed header, my_net_read returns zero.
1304        If packet_length is not zero, my_net_read ensures that the returned
--Type <RET> for more, q to quit, c to continue without paging--
1305        number of bytes was actually read from network.
1306        There is also an extra safety measure in my_net_read:
1307        it sets packet[packet_length]= 0, but only for non-zero packets.
1308      */
1309      if (packet_length == 0)                       /* safety */
   0x000055795e867616 <+166>:   test   %rax,%rax
   0x000055795e867619 <+169>:   je     0x55795e8676e0 <do_command(THD*, bool)+368>

1310      {
1311        /* Initialize with COM_SLEEP packet */
1312        packet[0]= (uchar) COM_SLEEP;
   0x000055795e8676e0 <+368>:   movb   $0x0,(%r10)

1313        packet_length= 1;
   0x000055795e8676e4 <+372>:   mov    $0x1,%r12d
   0x000055795e8676ea <+378>:   jmpq   0x55795e86761f <do_command(THD*, bool)+175>
   0x000055795e8676ef <+383>:   nop

1314      }
1315      /* Do not rely on my_net_read, extra safety against programming errors. */
--Type <RET> for more, q to quit, c to continue without paging--
1316      packet[packet_length]= '\0';                  /* safety */
   0x000055795e86761f <+175>:   movb   $0x0,(%r10,%r12,1)

1317
1318
1319      command= fetch_command(thd, packet);
   0x000055795e867624 <+180>:   movzbl (%r10),%eax

1320
1321    #ifdef WITH_WSREP
1322      DEBUG_SYNC(thd, "wsrep_before_before_command");

1323      /*
1324        If this command does not return a result, then we
1325        instruct wsrep_before_command() to skip result handling.
1326        This causes BF aborted transaction to roll back but keep
1327        the error state until next command which is able to return
1328        a result to the client.
1329      */
1330      if (unlikely(wsrep_service_started) &&
   0x000055795e867644 <+212>:   lea    0x1862105(%rip),%rcx        # 0x5579600c9750 <wsrep_service_started>
   0x000055795e86764b <+219>:   cmpb   $0x0,(%rcx)
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
(gdb) quit
A debugging session is active.

        Inferior 1 [process 200650] will be detached.

Quit anyway? (y or n) y
Detaching from program: /home/openxs/dbs/maria10.8/bin/mariadbd, process 200650
[Inferior 1 (process 200650) detached]

At this stage we can stop debugging and quit from gdb. We see that line 1316 in the code (were packet is properly formed and ready for printing) matches (decimal) offset 175 from the beginning of the function. Now we can add that to the start of the function from objdump to find the address for the probe, 0x7b761f.

The only remaining detail is how to print the content of the register r10 as a zero terminated string. Just read fine bpftrace manual for this, and you can come up with the following:

openxs@ao756:~/dbs/maria10.8$ cd ~/git/bpftrace/
openxs@ao756:~/git/bpftrace$ cd build/
openxs@ao756:~/git/bpftrace/build$ sudo src/bpftrace -e 'uprobe:/home/openxs/dbs/maria10.8/bin/mariadbd:0x00000000007b761f { printf("Function: %s packet: %s\n", func, str(reg("r10"))); }'
[sudo] password for openxs:
Attaching 1 probe...
Function: do_command(THD*, bool) packet: select @@version_comment limit 1
Function: do_command(THD*, bool) packet: select 1 + 1
^C

This is what I've got from the probe while in another terminal I've connected and executed the following:

openxs@ao756:~/dbs/maria10.8$ bin/mysql
ERROR 2002 (HY000): Can't connect to local server through socket '/tmp/mysql.sock' (2)
openxs@ao756:~/dbs/maria10.8$ bin/mysql --socket=/tmp/mariadb.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.8.1-MariaDB Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> select 1 + 1;
+-------+
| 1 + 1 |
+-------+
|     2 |
+-------+
1 row in set (0.000 sec)

So, as long as you know the address of instruction inside the function code, you can attach a probe to it. You can safely experiment, as if the address is surely wrong:

openxs@ao756:~/git/bpftrace/build$ sudo src/bpftrace -e 'uprobe:/home/openxs/dbs/maria10.8/bin/mariadbd:0x00000000007b761e { printf("Function: %s packet: %s\n", func, str(reg("r10"))); }'
Attaching 1 probe...
ERROR: Could not add uprobe into middle of instruction: /home/openxs/dbs/maria10.8/bin/mariadbd:_Z10do_commandP3THDb+174
openxs@ao756:~/git/bpftrace/build$ sudo src/bpftrace -e 'uprobe:/home/openxs/dbs/maria10.8/bin/mariadbd:0x00000000008b761f { printf("Function: %s packet: %s\n", func, str(reg("r10"))); }'
Attaching 1 probe...
ERROR: Could not add uprobe into middle of instruction: /home/openxs/dbs/maria10.8/bin/mariadbd:_ZN14partition_info25set_up_default_partitionsEP3THDP7handlerP14HA_CREATE_INFOj+799

the tool will tell you and do no harm.


Nice conbination of old Tower and new buildings in the City of London. Combining bpftrace with good old gdb is also nice.

To summarize:

  1.  It is possible to add a user probe to every assembly insturction inside the function with bpftrace (or perf, or ftrace, as long as you know what you are doing).
  2. It is possible to access CPU registers in the bpftrace probes.
  3. One can surely access, print etc local varaibles inside functions, as long as the address or register where it is stored is determined.
  4. It is surely easier and faster to do all the above with some knowledge of gdb and assembly language for your CPU.
  5. It is still fun to try to solve more complex problems with bpftrace, so more posts are expected. Stay tuned!


Sunday, January 30, 2022

DTrace Basics and Tracing Every MariaDB Server Function Call

In my previous post I've explained how to build MariaDB Server 10.8 from GitHub source on macOS. This kind of (totally unsupported by both MariaDB Corporation and Foundation) build may be used to study new interesting testing, troubleshooting and performance problem solving options that DTrace provides.

To start adding custom probes and doing something non-trivial one has to understand (or, better, master) some basics of the language used. Code in the sample support-files/dtrace/query-time.d program looked clear and familiar enough (at least for those who've seen bpftrace probes). It consists of several probes like this:

mysql*:::query-start
{
   self->query = copyinstr(arg0);
   self->connid = arg1;
   self->db    = copyinstr(arg2);
   self->who   = strjoin(copyinstr(arg3),strjoin("@",copyinstr(arg4)));
   self->querystart = timestamp;
}

As we can read in the manual, the format for the probe specification is provider:module:function:name. An empty component in a probe specification matches anything. You can also describe probes with a pattern matching syntax that is similar to the syntax that is used by shell. 

In the case above  (USDT) provider name should start with mysql, we do not care about module and function (actually we maybe have USDT with the name in different functions), and we refer to the probe by name. 

If you are wondering to see other kinds of specific probes, you can check dtrace -l output:

Yuliyas-Air:~ Valerii$ sudo dtrace -l | head -20
dtrace: system integrity protection is on, some features will not be available

   ID   PROVIDER            MODULE                          FUNCTION NAME
    1     dtrace                                                     BEGIN
    2     dtrace                                                     END
    3     dtrace                                                     ERROR
...
   42    profile                                                     tick-1000
   43    profile                                                     tick-5000
   44  mach_trap                                        kern_invalid entry
   45  mach_trap                                        kern_invalid return
   46  mach_trap                    _kernelrpc_mach_vm_allocate_trap entry
Yuliyas-Air:~ Valerii$ sudo dtrace -l | tail -4
dtrace: system integrity protection is on, some features will not be available

254755   pid82428          mariadbd            do_command(THD*, bool) 5e4
254917   pid82428 libsystem_malloc.dylib                            malloc return
467101 security_exception639          csparser _ZN8Security7CFErrorC2Ev [Security::CFError::CFError()] throw-cf
467102 security_exception639          csparser _ZN8Security9UnixErrorC2Eib [Security::UnixError::UnixError(int, bool)] throw-unix

We should study different providers that DTrace supports:

  • dtrace - few special probes: BEGIN, END and ERROR.
  • profile - this provider includes probes that are associated with an interrupt that fires at some regular, specified time interval .profile-N probes basically fire N times per second on every CPU. tick-N probes fire on only one CPU per interval.
  • syscall - enables you to trace every system call entry and return.
  • pid* - enables tracing of any user process, as specified by its pid. It basically allows to trace entries (with entry probe name) and returns (with return probe name) to every user function, among other things.
  • ...

There are more providers actually, but those mentioned above allow to start reading and writing DTrace scripts for MariaDB server tracing and profiling, so let's stop listing them for now.

As we can see from this output 

Yuliyas-Air:~ Valerii$ dtrace
Usage: dtrace [-aACeFHlqSvVwZ] [-arch i386|x86_64] [-b bufsz] [-c cmd] [-D name[=def]]
        [-I path] [-L path] [-o output] [-p pid] [-s script] [-U name]
        [-x opt[=val]]

        [-P provider [[ predicate ] action ]]
        [-m [ provider: ] module [[ predicate ] action ]]
        [-f [[ provider: ] module: ] func [[ predicate ] action ]]
        [-n [[[ provider: ] module: ] func: ] name [[ predicate ] action ]]
        [-i probe-id [[ predicate ] action ]] [ args ... ]

        predicate -> '/' D-expression '/'
           action -> '{' D-statements '}'
...

for each probe we can define optional predicate (in slashes), expression that must evaluate to true (non-zero integer value) when probe fires for optional action (in curly brackets) to be executed. Action is a sequence of DTrace statements separated by semicolons. Note that DTrace does NOT support loops or complex conditional execution in the actions, for good reasons discussed elsewhere. We may also have optional arguments for the probe.

In our query-start probe above there is no predicate, so the action is executed unconditionally when we hit it. We use some built-in functions (like copyinsrt()) and variables, like arg0, arg1 and timestamp. We'll discuss them in details in some other post, but arg0 ... arg9 obviously refer to the first ten arguments of the probe and timestamp is the current value of a nanosecond timestamp counter.

In the action we see several assignments to self->something. These are thread-local variables that share a common name in your D code but refer to separate data storage associated with each operating system thread.

Now we can try to add some probes. Given this information:

Yuliyas-Air:~ Valerii$ ps aux | grep mariadbd
Valerii          37341   0.0  0.0  4267752    700 s003  S+   10:47AM   0:00.01 grep mariadbd
Valerii          82428   0.0  0.1  4834244   5960   ??  S    Wed09PM 116:47.57 /Users/Valerii/dbs/maria10.8/bin/mariadbd --no-defaults --basedir=/Users/Valerii/dbs/maria10.8 --datadir=/Users/Valerii/dbs/maria10.8/data --plugin-dir=/Users/Valerii/dbs/maria10.8/lib/plugin --log-error=/Users/Valerii/dbs/maria10.8/data/Yuliyas-Air.err --pid-file=Yuliyas-Air.pid

and knowing that function name portion of the current probe's description is avaiable ad probefunc built-in variable, we can try to trace the enter into the every tracable function in this MariaDB server process:

Yuliyas-Air:~ Valerii$ sudo dtrace -n 'pid82428:::entry { printf ("\nEnter %s", probefunc); }' | more
dtrace: system integrity protection is on, some features will not be available

dtrace: description 'pid82428:::entry ' matched 240404 probes
CPU     ID                    FUNCTION:NAME
  1 128445            cerror_nocancel:entry

Enter cerror_nocancel
  1 129499        _pthread_testcancel:entry
Enter _pthread_testcancel
  1 128444                    __error:entry
Enter __error
  1 129505    _pthread_cond_updateval:entry
Enter _pthread_cond_updateval
  1 129429         pthread_mutex_lock:entry
Enter pthread_mutex_lock
  1  36852                  my_hrtime:entry
Enter my_hrtime
  1 122312              clock_gettime:entry
Enter clock_gettime
  1 122657               gettimeofday:entry
Enter gettimeofday
  1 128412    __commpage_gettimeofday:entry
Enter __commpage_gettimeofday
  1 128413 __commpage_gettimeofday_internal:entry
Enter __commpage_gettimeofday_internal
  1 128343         mach_absolute_time:entry
Enter mach_absolute_time
:

I've used faimiliar C-style printf() function to print some custom string including the name of the function when we hit the probe. Note that dtrace attached probes to 240404 different functions without any problem, in default configuration. We immediately started to gte the output that shows function calls happening in background threads all the time. I've used more to be able to copy-paste small portion of the output and then stopped tracing with Ctrl+C after quiting from more.

Note that by default for each hit of the probe we get probe identifier, CPU that it fired on, function and probe name reported. Add -q option to suppress this and see only what we explicitly print in the probe:

Yuliyas-Air:~ Valerii$ sudo dtrace -q -n 'pid82428:::entry { printf ("\nEnter %s", probefunc); }' | more
dtrace: system integrity protection is on, some features will not be available


Enter cerror_nocancel
Enter _pthread_testcancel
Enter __error
Enter _pthread_cond_updateval
Enter pthread_mutex_lock
Enter my_hrtime
Enter clock_gettime
Enter gettimeofday
Enter __commpage_gettimeofday
Enter __commpage_gettimeofday_internal
...

If we want to trace all processes with "mariadbd" as the name of the executable, we can add predicate to the probe that does not specify pid at all:

Yuliyas-Air:~ Valerii$ sudo dtrace -q -n ':::entry / execname == "mariadbd" / { printf ("\nEnter %s", probefunc); }' | more
dtrace: system integrity protection is on, some features will not be available


Enter cerror_nocancel
Enter _pthread_testcancel
Enter __error
Enter _pthread_cond_updateval
Enter pthread_mutex_lock
Enter my_hrtime
Enter clock_gettime
Enter gettimeofday
...

So, we can produce a raw trace (to be stored or processed by some user level code outside of DTrace). But what if we would like to do some more advanced processing in the probe itself and share only the results? For example, we run some MariaDB tests and would like to know how many times specific function in the code was executed (if at all) in the process?

For this we have to use aggregations and rely on aggregate functions to do the job in kernel space. For now it's enough to know that @name[key] is the aggregation name indexed by the key (you can use several keys separated by commas) and count() aggregation function simly counts the number of times it was called. 

Another inportant detail to study to create useful DTrace scripts is scripts arguments substitution. When called with -s option dtrace prowides a set of built-in macro variables to the script, with nsame starting with $. Of them $0 ... $9 have the same meenings as in shell, so $1 is the first argument of the script.

With the above taken into account, we can use the following basic DTrace script to count function executions per name in any user process (quiet option has the same effect as -q):

Yuliyas-Air:~ Valerii$ cat /tmp/codecoverage.d
#!/usr/sbin/dtrace -s
#pragma D option quiet
pid$1::$2:entry {
        @calls[probefunc] = count();
}

This script substitutes the first argument for the process id in the pid provider and the second argument for the function name in the probe, thus we create one or more probes. The aggregation's collected results are printed by default when dtrace script ends, but there is also the printa() function to do this explicitly in the probes if needed. Let's try to apply this to MariaDB server 10.8 running as process 82428 and trace all functions (hence '*' as the explicite second argument):

Yuliyas-Air:~ Valerii$ sudo /tmp/codecoverage.d 82428 '*' >/tmp/res.txt
dtrace: system integrity protection is on, some features will not be available

^C
Yuliyas-Air:~ Valerii$ ls -l /tmp/res.txt
-rw-r--r--  1 Valerii  wheel  58772 Jan 28 20:53 /tmp/res.txt
Yuliyas-Air:~ Valerii$ head -10 /tmp/res.txt


  ACL_USER* find_by_username_or_anon<ACL_USER>(ACL_USER*, unsigned long, char const*, char const*, char const*, char const*, char                1
  ACL_USER::copy(st_mem_root*)                                      1
  Apc_target::init(st_mysql_mutex*)                                 1
  Binary_string::copy(char const*, unsigned long)                   1
  CONNECT::create_thd(THD*)                                         1
  Create_func_arg1::create_func(THD*, st_mysql_const_lex_string*, List<Item>*)                1
  Create_func_sleep::create_1_arg(THD*, Item*)                      1
  Current_schema_tracker::update(THD*, set_var*)                    1
Yuliyas-Air:~ Valerii$ tail -10 /tmp/res.txt
  buf_page_get_low(page_id_t, unsigned long, unsigned long, buf_block_t*, unsigned long, mtr_t*, dberr_t*, bool)            33895
  pthread_self                                                  36422
  mtr_t::memcpy_low(buf_block_t const&, unsigned short, void const*, unsigned long)            44143
  unsigned char* mtr_t::log_write<(unsigned char)48>(page_id_t, buf_page_t const*, unsigned long, bool, unsigned long)            44167
  mach_boottime_usec                                            45007
  mtr_t::modify(buf_block_t const&)                             45609
  pthread_mutex_unlock                                          56530
  pthread_mutex_lock                                            57295
  pthread_cond_broadcast                                        64197
  _platform_memmove$VARIANT$Haswell                             72767
Yuliyas-Air:~ Valerii$ cat /tmp/res.txt | grep do_command
  do_command(THD*, bool)                                            4

I've executed just several statements (4 if you check do_command() executions) over a short enough period of time, but some functions like pthread_mutex_lock were executed thousands of times in the background. Now we have a lame code coverage testing method that works at MariaDB server scale (unlike with bpftrace on Linux). There are performance implications (huge impact when I tried to do the same while running sysbench test with many concurrent threads) and limitations of internal buffer sizes that may cause some probes to be skipped etc, but basically it works!

DTrace is almost as beautiful as Côte d'Azur :)

 To summarize:

  1. We checked basic building blocks of simple DTrace one liners and scripts in some details, including probe structure, main providers, some built-in variables, thread-local variables referenced via self-> and aggregations.
  2. It is possible to add probes to each and every function in MariaDB server 10.8 and do both raw tracing and aggregation in the kernel context. Lame code coverage scripts do work.
  3. It is possible to create parametrized, generic scripts with command line arguments and other macros substituted.
  4. DTrace language is similar to what we see later implemented in bpftrace. After some checks of the manual one can easily switch from one tool to the other for basic tasks.
  5. Built-in variablers and functions may be named differently than in bpftrace.
  6. There are many more details to cover and interesting scripts to use with MariaDB server, so stay tuned!

Thursday, January 27, 2022

First steps with MariaDB Server and DTrace on macOS

FOSDEM 2022 is going to happen on the next weekend and I am still missing blog posts supporting my unusual upcoming talk there, this time devoted to building and using MariaDB server on macOS. So tonight I am going to contribute something new to refer to during my talk or followup questions. I'll try to document my way to build MariaDB Server from GitHub source on old (early 2015) MacBook Air running macOS 10.13.6 High Sierra. I am also going to show why one may want to use macOS there instead of installing Linux and having way more usual environment on the same decent hardware.

I've inherited this Air from my daughter in September, 2021 after she had got a new, M1-based, and initially used it mostly for Zoom calls and web-browsing. Soon I recalled that I've used MacBook for more than 3 years in the past while working for Sun and Oracle, and it was my main working platform not only for content consumption, emails and virtual machines. It was regularly used for bugs verification, MySQL builds and tests of all kinds. It would be a waste of capable OS and good hardware (still more powerful formally than all my other machines but Fedora 33 desktop) NOT to try to use it properly.

That's why after few software upgrades to end up with more recent 10.13 minor release:

Yuliyas-Air:maria10.6 Valerii$ uname -a
Darwin Yuliyas-Air 17.7.0 Darwin Kernel Version 17.7.0: Mon Aug 31 22:11:23 PDT 2020; root:xnu-4570.71.82.6~1/RELEASE_X86_64 x86_64

I read the manual, registered as a developer, downloaded and installed proper version of XCode and decided to proceed with MacPorts.

Then I updated ports tree and proceed by the manual, with installing git and other surely needed tools and packages:

sudo port install git cmake jemalloc judy openssl boost gnutls

That was just the beginning, and eventually, with dependencies, port updates, problems and workarounds, I ended up like this (and that works for building current code of 10.1 to 10.8 for sure, with some ports maybe not needed or used for other builds, but who cares):

Yuliyas-Air:maria10.8 Valerii$ port installed
The following ports are currently installed:
  autoconf @2.71_1 (active)
  automake @1.16.5_0 (active)
  bison @3.8.2_0
  bison @3.8.2_2 (active)
  bison-runtime @3.8.2_0 (active)
  boehmgc @8.0.6_0 (active)
  boost @1.76_0 (active)
  boost171 @1.71.0_3+no_single+no_static+python39 (active)
  boost176 @1.76.0_2+no_single+no_static+python39
  boost176 @1.76.0_3+no_single+no_static+python39 (active)
  bzip2 @1.0.8_0 (active)
  cmake @3.21.4_0
  cmake @3.22.1_0 (active)
  curl @7.80.0_0+ssl (active)
  curl-ca-bundle @7.80.0_0
  curl-ca-bundle @7.80.0_1 (active)
  cyrus-sasl2 @2.1.27_5+kerberos (active)
  db48 @4.8.30_4 (active)
  expat @2.4.1_0
  expat @2.4.2_0
  expat @2.4.3_0 (active)
  gdb @11.1_0 (active)
  gdbm @1.22_0 (active)
  gettext @0.21_0 (active)
  gettext-runtime @0.21_0 (active)
  gettext-tools-libs @0.21_0 (active)
  git @2.34.1_1+credential_osxkeychain+diff_highlight+doc+pcre+perl5_28
  git @2.34.1_2+credential_osxkeychain+diff_highlight+doc+pcre+perl5_28 (active)
  gmp @6.2.1_0 (active)
  gnutls @3.6.16_1
  gnutls @3.6.16_2 (active)
  icu @67.1_4 (active)
  jemalloc @5.2.1_1 (active)
  judy @1.0.5_1 (active)
  kerberos5 @1.19.2_1 (active)
  libarchive @3.5.2_1 (active)
  libb2 @0.98.1_1 (active)
  libcomerr @1.45.6_0 (active)
  libcxx @5.0.1_4 (active)
  libedit @20210910-3.1_1 (active)
  libevent @2.1.12_1 (active)
  libffi @3.4.2_2 (active)
  libiconv @1.16_1 (active)
  libidn @1.38_0 (active)
  libidn2 @2.3.2_0
  libidn2 @2.3.2_1 (active)
  libpsl @0.21.1-20210726_2 (active)
  libtasn1 @4.18.0_0 (active)
  libtextstyle @0.21_0 (active)
  libtool @2.4.6_13 (active)
  libunistring @0.9.10_0
  libunistring @1.0_0 (active)
  libuv @1.42.0_1
  libuv @1.43.0_0 (active)
  libxml2 @2.9.12_1 (active)
  libxslt @1.1.34_6 (active)
  lmdb @0.9.29_0 (active)
  luajit @2.1.0-beta3_5 (active)
  lz4 @1.9.3_1 (active)
  lzma @4.65_1 (active)
  lzo2 @2.10_0 (active)
  m4 @1.4.19_1 (active)
  mariadb @5.5.68_0 (active)
  mariadb-server @5.5.68_0 (active)
  mysql57 @5.7.36_1 (active)
  mysql_select @0.1.2_4 (active)
  ncurses @6.3_0 (active)
  nettle @3.7.3_0 (active)
  openssl @3_1
  openssl @3_2 (active)
  openssl3 @3.0.0_6+legacy
  openssl3 @3.0.1_0+legacy (active)
  openssl11 @1.1.1l_5 (active)
  p5-dbd-mysql @4.50.0_0 (active)
  p5.28-authen-sasl @2.160.0_0 (active)
  p5.28-cgi @4.530.0_0 (active)
  p5.28-clone @0.450.0_0 (active)
  p5.28-dbd-mysql @4.50.0_0+mysql57 (active)
  p5.28-dbi @1.643.0_0 (active)
  p5.28-digest-hmac @1.40.0_0 (active)
  p5.28-digest-sha1 @2.130.0_4 (active)
  p5.28-encode @3.160.0_0 (active)
  p5.28-encode-locale @1.50.0_0 (active)
  p5.28-error @0.170.290_0 (active)
  p5.28-gssapi @0.280.0_3 (active)
  p5.28-html-parser @3.760.0_0 (active)
  p5.28-html-tagset @3.200.0_4 (active)
  p5.28-http-date @6.50.0_0 (active)
  p5.28-http-message @6.350.0_0 (active)
  p5.28-io-html @1.4.0_0 (active)
  p5.28-io-socket-ssl @2.72.0_0
  p5.28-io-socket-ssl @2.73.0_0 (active)
  p5.28-lwp-mediatypes @6.40.0_0 (active)
  p5.28-mozilla-ca @20211001_0 (active)
  p5.28-net-libidn @0.120.0_5 (active)
  p5.28-net-smtp-ssl @1.40.0_0 (active)
  p5.28-net-ssleay @1.900.0_4 (active)
  p5.28-term-readkey @2.380.0_0 (active)
  p5.28-time-local @1.300.0_0 (active)
  p5.28-timedate @2.330.0_0 (active)
  p5.28-uri @5.100.0_0 (active)
  p11-kit @0.24.0_1 (active)
  pcre2 @10.39_0 (active)
  perl5.28 @5.28.3_4 (active)
  perl5.30 @5.30.3_3 (active)
  pkgconfig @0.29.2_0 (active)
  popt @1.18_1 (active)
  python3_select @0.0_2 (active)
  python39 @3.9.9_0+lto+optimizations
  python39 @3.9.10_0+lto+optimizations (active)
  python_select @0.3_9 (active)
  readline @8.1.000_0 (active)
  readline-5 @5.2.014_2 (active)
  rsync @3.2.3_1 (active)
  sqlite3 @3.37.0_0
  sqlite3 @3.37.1_0
  sqlite3 @3.37.2_0 (active)
  sysbench @1.0.20_0 (active)
  tcp_wrappers @20_4 (active)
  texinfo @6.8_0 (active)
  umem @1.0.1_1 (active)
  xxhashlib @0.8.1_0
  xxhashlib @0.8.1_1 (active)
  xz @5.2.5_0 (active)
  zlib @1.2.11_0 (active)
  zstd @1.5.0_0
  zstd @1.5.1_0 (active)
Yuliyas-Air:maria10.8 Valerii$

This is surely not the minimal needed set of ports. I've highlighted a couple (like openssl11) that were really needed to build and install 10.8 successfully eventually.

Then I cloned the code with usual steps and ended up with this:

Yuliyas-Air:server Valerii$ git log -1
commit c1cef1afa9962544de4840c9a796ae0a9b5e92e6 (HEAD -> 10.8, origin/bb-10.8-wlad, origin/bb-10.8-release, origin/HEAD, origin/10.8)
Merge: db2013787d2 9d93b51effd
Author: Vladislav Vaintroub <wlad@mariadb.com>
Date:   Wed Jan 26 13:57:00 2022 +0100

    Merge remote-tracking branch 'origin/bb-10.8-wlad' into 10.8
Yuliyas-Air:server Valerii$ git submodule update --init --recursive
Submodule path 'libmariadb': checked out 'ddb031b6a1d8b6e26a0f10f454dc1453a48a6ca8'
Yuliyas-Air:server Valerii$ cd buildtmp/
Yuliyas-Air:buildtmp Valerii$ rm -rf *
Yuliyas-Air:buildtmp Valerii$ cmake .. -DCMAKE_INSTALL_PREFIX=/Users/Valerii/dbs/maria10.8 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_CONFIG=mysql_release -DFEATURE_SET=community -DWITH_EMBEDDED_SERVER=OFF -DPLUGIN_TOKUDB=NO -DWITH_SSL=/opt/local/libexec/openssl11 -DENABLE_DTRACE=1
...
-- The following OPTIONAL packages have not been found:

 * PMEM
 * Snappy

-- Configuring done
CMake Warning (dev):
  Policy CMP0042 is not set: MACOSX_RPATH is enabled by default.  Run "cmake
  --help-policy CMP0042" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  MACOSX_RPATH is not specified for the following targets:

   libmariadb

This warning is for project developers.  Use -Wno-dev to suppress it.

-- Generating done
-- Build files have been written to: /Users/Valerii/git/server/buildtmp
Yuliyas-Air:buildtmp Valerii$

Note that I've used somewhat nontrivial cmake command line for out of source build, and the output above was not from the clean state, but from the state after 10.1 to 10.7 where all checked out and built successfully, one by one, with problems found and resolved (more on that on my slides and during the talk).

Two key options in that command line are -DWITH_SSL=/opt/local/libexec/openssl11 to use supported OpenSSL version 1.1 (even though for 10.8 default version 3 should already work too) and -DENABLE_DTRACE=1 to enable static DTrace probes in the code of MariaDB that are still there. The rest are typical for my builds and blog posts.

Then I proceeded with usual make to end up with the first problem:

Yuliyas-Air:buildtmp Valerii$ make
...
[ 46%] Built target heap
[ 46%] Building CXX object storage/innobase/CMakeFiles/innobase.dir/btr/btr0btr.cc.o
In file included from /Users/Valerii/git/server/storage/innobase/btr/btr0btr.cc:28:
In file included from /Users/Valerii/git/server/storage/innobase/include/btr0btr.h:31:
In file included from /Users/Valerii/git/server/storage/innobase/include/dict0dict.h:32:
In file included from /Users/Valerii/git/server/storage/innobase/include/dict0mem.h:45:
In file included from /Users/Valerii/git/server/storage/innobase/include/buf0buf.h:33:
/Users/Valerii/git/server/storage/innobase/include/fil0fil.h:1497:11: error:
      'asm goto' constructs are not supported yet
  __asm__ goto("lock btsl $31, %0\t\njnc %l1" : : "m" (n_pending)
          ^
1 error generated.
make[2]: *** [storage/innobase/CMakeFiles/innobase.dir/btr/btr0btr.cc.o] Error 1
make[1]: *** [storage/innobase/CMakeFiles/innobase.dir/all] Error 2
make: *** [all] Error 2

caused by the fact that clang 10.0.0 from XCode: 

Yuliyas-Air:maria10.8 Valerii$ clang --version
Apple LLVM version 10.0.0 (clang-1000.10.44.4)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

does NOT support asm goto used in MariaDB Server code since 10.6 (while it was supposed to support it). I've applied a lame fix (see MDEV-27402 for more details and final diff later) and proceeded:

Yuliyas-Air:buildtmp Valerii$ make
...

[ 97%] Building CXX object client/CMakeFiles/mariadb.dir/mysql.cc.o
/Users/Valerii/git/server/client/mysql.cc:2853:59: error: expected expression
  rl_attempted_completion_function= (rl_completion_func_t*)&new_mysql_co...
                                                          ^
/Users/Valerii/git/server/client/mysql.cc:2853:38: error: use of undeclared
      identifier 'rl_completion_func_t'; did you mean 'rl_completion_matches'?
  rl_attempted_completion_function= (rl_completion_func_t*)&new_mysql_co...
                                     ^~~~~~~~~~~~~~~~~~~~
                                     rl_completion_matches
/usr/include/editline/readline.h:202:16: note: 'rl_completion_matches' declared
      here
char           **rl_completion_matches(const char *, rl_compentry_func_t *);
                 ^
/Users/Valerii/git/server/client/mysql.cc:2854:33: error: assigning to
      'Function *' (aka 'int (*)(const char *, int)') from incompatible type
      'rl_compentry_func_t *' (aka 'char *(*)(const char *, int)'): different
      return type ('int' vs 'char *')
  rl_completion_entry_function= (rl_compentry_func_t*)&no_completion;
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/Valerii/git/server/client/mysql.cc:2856:3: error: no matching function
      for call to 'rl_add_defun'
  rl_add_defun("magic-space", (rl_command_func_t *)&fake_magic_space, -1);
  ^~~~~~~~~~~~
/usr/include/editline/readline.h:195:7: note: candidate function not viable: no
      known conversion from 'rl_command_func_t *' (aka 'int (*)(int, int)') to
      'Function *' (aka 'int (*)(const char *, int)') for 2nd argument
int              rl_add_defun(const char *, Function *, int);
                 ^
4 errors generated.
make[2]: *** [client/CMakeFiles/mariadb.dir/mysql.cc.o] Error 1
make[1]: *** [client/CMakeFiles/mariadb.dir/all] Error 2
make: *** [all] Error 2
Yuliyas-Air:buildtmp Valerii$

This readline-related problem was also reported (see MDEV-27579) and I fixed it with a lame patch, and eventually I was able to build successfully:

...

[100%] Linking C executable wsrep_check_version
[100%] Built target wsrep_check_version
Yuliyas-Air:buildtmp Valerii$ make install && make clean
...
-- Installing: /Users/Valerii/dbs/maria10.8/share/aclocal/mysql.m4
-- Installing: /Users/Valerii/dbs/maria10.8/support-files/mysql.server
Yuliyas-Air:buildtmp Valerii$ echo $?
0

The final diff is like this:

Yuliyas-Air:server Valerii$ git diff -u
diff --git a/client/mysql.cc b/client/mysql.cc
index 6612b273d17..902589f2e83 100644
--- a/client/mysql.cc
+++ b/client/mysql.cc
@@ -2849,7 +2849,7 @@ static void initialize_readline ()
   rl_terminal_name= getenv("TERM");

   /* Tell the completer that we want a crack first. */
-#if defined(USE_NEW_READLINE_INTERFACE)
+#if defined(USE_NEW_READLINE_INTERFACE) && !defined(__APPLE_CC__)
   rl_attempted_completion_function= (rl_completion_func_t*)&new_mysql_completion;
   rl_completion_entry_function= (rl_compentry_func_t*)&no_completion;

@@ -2859,7 +2859,9 @@ static void initialize_readline ()
   setlocale(LC_ALL,""); /* so as libedit use isprint */
 #endif
   rl_attempted_completion_function= (CPPFunction*)&new_mysql_completion;
+#if !defined(__APPLE_CC__)
   rl_completion_entry_function= &no_completion;
+#endif
   rl_add_defun("magic-space", (Function*)&fake_magic_space, -1);
 #else
   rl_attempted_completion_function= (CPPFunction*)&new_mysql_completion;
diff --git a/storage/innobase/include/fil0fil.h b/storage/innobase/include/fil0fil.h
index 34a53746b42..a795313116f 100644
--- a/storage/innobase/include/fil0fil.h
+++ b/storage/innobase/include/fil0fil.h
@@ -1489,7 +1489,7 @@ inline void fil_space_t::reacquire()
 inline bool fil_space_t::set_stopping_check()
 {
   mysql_mutex_assert_owner(&fil_system.mutex);
-#if defined __clang_major__ && __clang_major__ < 10
+#if (defined __clang_major__ && __clang_major__ < 10) || defined __APPLE_CC__
   /* Only clang-10 introduced support for asm goto */
   return n_pending.fetch_or(STOPPING, std::memory_order_relaxed) & STOPPING;
 #elif defined __GNUC__ && (defined __i386__ || defined __x86_64__)
diff --git a/storage/rocksdb/rocksdb b/storage/rocksdb/rocksdb
--- a/storage/rocksdb/rocksdb
+++ b/storage/rocksdb/rocksdb
@@ -1 +1 @@
-Subproject commit bba5e7bc21093d7cfa765e1280a7c4fdcd284288
+Subproject commit bba5e7bc21093d7cfa765e1280a7c4fdcd284288-dirty
Yuliyas-Air:server Valerii$

The last diff in rocksdb submodule is related to the "missing zstd headers" problem I fined with yet another lame patch while building 10.7, see MDEV-27619

After usual mysql_install_db and startup, I've ended up with a shiny new MariaDB 10.8.0 up and running on macOS 10.13.6:

Yuliyas-Air:maria10.8 Valerii$ bin/mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.8.0-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show variables like 'version%';
+-------------------------+------------------------------------------+
| Variable_name           | Value                                    |
+-------------------------+------------------------------------------+
| version                 | 10.8.0-MariaDB                           |
| version_comment         | MariaDB Server                           |
| version_compile_machine | x86_64                                   |
| version_compile_os      | osx10.13                                 |
| version_malloc_library  | system                                   |
| version_source_revision | c1cef1afa9962544de4840c9a796ae0a9b5e92e6 |
| version_ssl_library     | OpenSSL 1.1.1l  24 Aug 2021              |
+-------------------------+------------------------------------------+
7 rows in set (0.001 sec)

I even checked few MTR test suites, and many tests pass:

Yuliyas-Air:mysql-test Valerii$ ./mtr --suite=rocksdb
Logging: ./mtr  --suite=rocksdb
VS config:
vardir: /Users/Valerii/dbs/maria10.8/mysql-test/var
Checking leftover processes...
 - found old pid 89161 in 'mysqld.1.pid', killing it...
   process did not exist!
Removing old var directory...
Creating var directory '/Users/Valerii/dbs/maria10.8/mysql-test/var'...
Checking supported features...
MariaDB Version 10.8.0-MariaDB
 - SSL connections supported
Using suites: rocksdb
Collecting tests...
Installing system database...
...
rocksdb.index_merge_rocksdb 'write_committed' [ pass ]    911
rocksdb.shutdown 'write_prepared'        [ pass ]   1895
rocksdb.index_merge_rocksdb 'write_prepared' [ pass ]    923
rocksdb.mariadb_misc_binlog 'write_committed' [ pass ]     43
rocksdb.mariadb_misc_binlog 'write_prepared' [ pass ]     43
...
rocksdb.issue495 'write_committed'       [ pass ]     33
rocksdb.partition 'write_committed'      [ fail ]
        Test ended at 2022-01-27 22:41:15

CURRENT_TEST: rocksdb.partition
mysqltest: At line 67: query 'ALTER TABLE t1 REBUILD PARTITION p0, p1' failed: ER_METADATA_INCONSISTENCY (4064): Table 'test.t1#P#p0#TMP#' does not exist, but metadata information exists inside MyRocks. This is a sign of data inconsistency. Please check if './test/t1#P#p0#TMP#.frm' exists, and try to restore it if it does not exist.

The result from queries just before the failure was:
< snip >
DROP TABLE IF EXISTS employees_hash_1;
DROP TABLE IF EXISTS t1_hash;
DROP TABLE IF EXISTS employees_linear_hash;
DROP TABLE IF EXISTS t1_linear_hash;
DROP TABLE IF EXISTS k1;
DROP TABLE IF EXISTS k2;
DROP TABLE IF EXISTS tm1;
DROP TABLE IF EXISTS tk;
DROP TABLE IF EXISTS ts;
DROP TABLE IF EXISTS ts_1;
DROP TABLE IF EXISTS ts_3;
DROP TABLE IF EXISTS ts_4;
DROP TABLE IF EXISTS ts_5;
DROP TABLE IF EXISTS trb3;
DROP TABLE IF EXISTS tr;
DROP TABLE IF EXISTS members_3;
DROP TABLE IF EXISTS clients;
DROP TABLE IF EXISTS clients_lk;
DROP TABLE IF EXISTS trb1;
CREATE TABLE t1 (i INT, j INT, k INT, PRIMARY KEY (i)) ENGINE = ROCKSDB PARTITION BY KEY(i) PARTITIONS 4;

More results from queries before failure can be found in /Users/Valerii/dbs/maria10.8/mysql-test/var/log/partition.log

 - saving '/Users/Valerii/dbs/maria10.8/mysql-test/var/log/rocksdb.partition-write_committed/' to '/Users/Valerii/dbs/maria10.8/mysql-test/var/log/rocksdb.partition-write_committed/'

Only  148  of 555 completed.
--------------------------------------------------------------------------
The servers were restarted 23 times
Spent 154.726 of 206 seconds executing testcases

Failure: Failed 1/27 tests, 96.30% were successful.

Failing test(s): rocksdb.partition

The log files in var/log may give you some hint of what went wrong.

If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html

69 tests were skipped, 2 by the test itself.

mysql-test-run: *** ERROR: there were failing test cases

But the real reason to build MariaDB server on macOS (other than "because I can") was not even RocksDB testing, but this:

Yuliyas-Air:mysql-test Valerii$ dtrace
Usage: dtrace [-aACeFHlqSvVwZ] [-arch i386|x86_64] [-b bufsz] [-c cmd] [-D name[=def]]
        [-I path] [-L path] [-o output] [-p pid] [-s script] [-U name]
        [-x opt[=val]]

        [-P provider [[ predicate ] action ]]
        [-m [ provider: ] module [[ predicate ] action ]]
        [-f [[ provider: ] module: ] func [[ predicate ] action ]]
        [-n [[[ provider: ] module: ] func: ] name [[ predicate ] action ]]
        [-i probe-id [[ predicate ] action ]] [ args ... ]

        predicate -> '/' D-expression '/'
           action -> '{' D-statements '}'

        -arch Generate programs and Mach-O files for the specified architecture

        -a  claim anonymous tracing state
        -A  generate plist(5) entries for anonymous tracing
        -b  set trace buffer size
        -c  run specified command and exit upon its completion
        -C  run cpp(1) preprocessor on script files
        -D  define symbol when invoking preprocessor
        -e  exit after compiling request but prior to enabling probes
        -f  enable or list probes matching the specified function name
        -F  coalesce trace output by function
        -h  generate a header file with definitions for static probes
        -H  print included files when invoking preprocessor
        -i  enable or list probes matching the specified probe id
        -I  add include directory to preprocessor search path
        -l  list probes matching specified criteria
        -L  add library directory to library search path
        -m  enable or list probes matching the specified module name
        -n  enable or list probes matching the specified probe name
        -o  set output file
        -p  grab specified process-ID and cache its symbol tables
        -P  enable or list probes matching the specified provider name
        -q  set quiet mode (only output explicitly traced data)
        -s  enable or list probes according to the specified D script
        -S  print D compiler intermediate code
        -U  undefine symbol when invoking preprocessor
        -v  set verbose mode (report stability attributes, arguments)
        -V  report DTrace API version
        -w  permit destructive actions
        -W  wait for specified process and exit upon its completion
        -x  enable or modify compiler and tracing options
        -Z  permit probe descriptions that match zero probes
Yuliyas-Air:mysql-test Valerii$

I remember how cool dtrace was since workin for Sun and after last re-install of Fedora I ended up with my Illumos VM gone and no recent FreeBSD VM anyway,. so basically there was no dtrace at hand until I've got this MacBook.

So, unlike MySQL 8.0, MariaDB server still contains USDT (DTrace probes) and I've built my version with them enabled. Let's quickly check how they can be used. There are few sample D files in the source:

Yuliyas-Air:server Valerii$ ls support-files/dtrace/
locktime.d                      query-rowops.d
query-execandqc.d               query-time.d
query-filesort-time.d           statement-time.d
query-network-time.d            statement-type-aggregate.d
query-parse-time.d
Yuliyas-Air:server Valerii$ cat support-files/dtrace/query-time.d
#!/usr/sbin/dtrace -s
#
# Copyright (c) 2009 Sun Microsystems, Inc.
# Use is subject to license terms.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; version 2 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA
#
# Shows basic query execution time, who execute the query, and on what database

#pragma D option quiet

dtrace:::BEGIN
{
   printf("%-20s %-20s %-40s %-9s\n", "Who", "Database", "Query", "Time(ms)");
}

mysql*:::query-start
{
   self->query = copyinstr(arg0);
   self->connid = arg1;
   self->db    = copyinstr(arg2);
   self->who   = strjoin(copyinstr(arg3),strjoin("@",copyinstr(arg4)));
   self->querystart = timestamp;
}

mysql*:::query-done
{
   printf("%-20s %-20s %-40s %-9d\n",self->who,self->db,self->query,
          (timestamp - self->querystart) / 1000000);
}
Yuliyas-Air:server Valerii$

We can try to run it (with yet another hack to do that I leave to the curious reader who is really going to try this) and then execute some queries against MariaDB from another terminal:

Yuliyas-Air:server Valerii$ sudo support-files/dtrace/query-time.d
dtrace: system integrity protection is on, some features will not be available

Who                  Database             Query                                    Time(ms)
Valerii@localhost                         select @@version_comment limit 1         0
Valerii@localhost                         select 1+1                               0
Valerii@localhost                         select sleep(4)                          4004
Valerii@localhost                         select @@version_comment limit 1         0
Valerii@localhost                         create database sbtest                   1
Valerii@localhost    sbtest               CREATE TABLE sbtest1(
  id INTEGER NOT NULL AUTO_INCREMENT,
  k INTEGER DEFAULT '0' NOT NULL,
  c CHAR(120) DEFAULT '' NOT NULL,
  pad CHAR(60) DEFAULT '' NOT NULL,
  PRIMARY KEY (id)
) /*! ENGINE = innodb */ 91
Valerii@localhost    sbtest               INSERT INTO sbtest1(k, c, pad) VALUES(366941, '31451373586-15688153734-79729593694-96509299839-83724898275-86711833539-78981337422-35049690573-51724173961-87474696253', '98996621624-36689827414-04092488557-09587706818-65008859162'),(277750, '21472970079-7 181
Valerii@localhost    sbtest               INSERT INTO sbtest1(k, c, pad) VALUES(124021, '80697810288-90543941719-80227288793-55278810422-59841440561-49369413842-83550451066-12907725305-62036548401-86959403176', '65708342793-83311865079-53224065384-18645733125-16333693298'),(660496, '07381386584-5 31
Valerii@localhost    sbtest               INSERT INTO sbtest1(k, c, pad) VALUES(425080, '48883413333-17783399741-03981526516-97596354402-27141206678-83563692683-30244461835-25263435890-49140039573-28211133426', '81560227417-96691828090-72817141653-15106797886-43970285630'),(322421, '68618246702-7 33
...

to end up with a lightweight general query log with query execution time repported in milliseconds. Note some statement from, yes, sysbench test preapre :)

Nice view. macOS is also nice and useful platform to run MariaDB server for fun :)

Thta's probably enough for a simgle post. Now go and get that from your MySQL 8 on any OS! USDTs rule!

To summarize:

  1. It is surely possible to build even latest ang greates still alpha MariaDB 10.8 on macOS even as old as 10.13.6, and get usable result.
  2. One of the main benefits of macOS is DTrace support.
  3. It is still possible to use a limited set of USDTs once added to MySQL and now gone in all current versions of MariaDB server.
  4. That rocksdb.partition test failure is to be studied an maybe reported as a MDEV
  5. DTrace is surely more capable than wehat those examples in the source code show. More DTrace-related posts are coming this year. Stay tuned!