Blog of (former?) MySQL Entomologist: 2020

Sunday, November 29, 2020

Installing the Latest MariaDB from the Repository on Debian 10 and Downgrading to Older Minor Version

I had not written any new blog posts here since this one for quite a some time, for two reasons. First of all, I have another project to spend time on, not related to any software at all. Not that it is very successful so far, but I enjoy the process and the results... I was also a bit disappointed by the lack of reactions to some previous posts I considered really useful like this one on BCC tools or the other one on tracing the mutex locks.

Anyway, I write posts here mostly for myself to be used as references later, as it was proved by the sad experience over last 30 years or so that I can forget the solutions of both minor and serious problems I once successfully resolved... So I am going to document one of tests of this week when I had to downgrade MariaDB to some previous version on Debian 10 due to some regression bug. Yes, shit happens and there are regression bugs reported once in a while for MariaDB, all kindly marked with the "regression" label after checks.

I like to use Docker for such tests, so this time I used debian:buster from the official images, pulled and started bash there. I tried to follow this fine MariaDB KB article that is mostly correct, but miss some small details. So, this is what I did:

1. Starting fresh and executing the repository configuration script

openxs@ao756:~$ sudo docker run -it debian:buster bash
root@8af6489c0df8:/# curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash
bash: curl: command not found

I had to know better, We need to update and install curl package first:

root@8af6489c0df8:/# apt-get update
...
root@8af6489c0df8:/# apt-get install curl
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
ca-certificates krb5-locales libcurl4 libgssapi-krb5-2 libk5crypto3
libkeyutils1 libkrb5-3 libkrb5support0 libldap-2.4-2 libldap-common
libnghttp2-14 libpsl5 librtmp1 libsasl2-2 libsasl2-modules
libsasl2-modules-db libssh2-1 libssl1.1 openssl publicsuffix
Suggested packages:
krb5-doc krb5-user libsasl2-modules-gssapi-mit
| libsasl2-modules-gssapi-heimdal libsasl2-modules-ldap libsasl2-modules-otp
libsasl2-modules-sql
The following NEW packages will be installed:
ca-certificates curl krb5-locales libcurl4 libgssapi-krb5-2 libk5crypto3
libkeyutils1 libkrb5-3 libkrb5support0 libldap-2.4-2 libldap-common
libnghttp2-14 libpsl5 librtmp1 libsasl2-2 libsasl2-modules
libsasl2-modules-db libssh2-1 libssl1.1 openssl publicsuffix
0 upgraded, 21 newly installed, 0 to remove and 0 not upgraded.
Need to get 5010 kB of archives.
After this operation, 11.9 MB of additional disk space will be used.
Do you want to continue? [Y/n]
...
done.
root@8af6489c0df8:/# curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash
[error] The following package is needed by the script, but not installed:
apt-transport-https
Please install and rerun the script.

OK, so let's install this one too:

root@8af6489c0df8:/# apt-get install apt-transport-https
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
apt-transport-https
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 149 kB of archives.
After this operation, 156 kB of additional disk space will be used.
Get:1 http://deb.debian.org/debian buster/main amd64 apt-transport-https all 1.8.2.1 [149 kB]
Fetched 149 kB in 0s (848 kB/s)
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package apt-transport-https.
(Reading database ... 7160 files and directories currently installed.)
Preparing to unpack .../apt-transport-https_1.8.2.1_all.deb ...
Unpacking apt-transport-https (1.8.2.1) ...
Setting up apt-transport-https (1.8.2.1) ...

root@8af6489c0df8:/# curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash
[info] Repository file successfully written to /etc/apt/sources.list.d/mariadb.list
[info] Adding trusted package signing keys...
[info] Running apt-get update...
[info] Done adding trusted package signing keys

Now we are ready to continue with adding the repository for MariaDB 10.5.

2. Adding the repository

KB says we should start with adding software-properties-common package:

root@8af6489c0df8:/# apt-get install software-properties-common
Reading package lists... Done
...
0 upgraded, 71 newly installed, 0 to remove and 0 not upgraded.
Need to get 31.6 MB of archives.
After this operation, 139 MB of additional disk space will be used.
Do you want to continue? [Y/n]
...
Setting up software-properties-common (0.96.20.2-2) ...
Processing triggers for systemd (241-7~deb10u4) ...
Processing triggers for libc-bin (2.28-10) ...
Processing triggers for dbus (1.12.20-0+deb10u1) ...
root@8af6489c0df8:/#

Now we can add the repository:

root@8af6489c0df8:/# add-apt-repository 'deb [arch=amd64,arm64,ppc64el] http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.4/debian buster main'

root@8af6489c0df8:/# apt-get update
Hit:1 http://security.debian.org/debian-security buster/updates InRelease
Hit:2 http://deb.debian.org/debian buster InRelease
Hit:3 http://deb.debian.org/debian buster-updates InRelease
Get:5 http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.4/debian buster InRelease [4635 B]
Hit:4 https://downloads.mariadb.com/MariaDB/mariadb-10.5/repo/debian buster InRelease
Hit:6 https://downloads.mariadb.com/Tools/debian buster InRelease
Get:7 https://dlm.mariadb.com/repo/maxscale/latest/debian buster InRelease [3515 B]
Get:8 http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.4/debian buster/main amd64 Packages [29.1 kB]
Get:9 http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.4/debian buster/main ppc64el Packages [19.7 kB]
Get:10 http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.4/debian buster/main arm64 Packages [19.8 kB]
Fetched 76.7 kB in 2s (50.7 kB/s)
Reading package lists... Done

Note that I asked for 10.4 and got 10.5 in the outputs and on one of the next steps. One day I'll figure out why it was so...

3. Importing the MariaDB GPG public key

KB articles says I have to install dirmngr first starting from Debian 9, so I did it:

root@8af6489c0df8:/# apt-get install dirmngr
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
gnupg gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client gpg-wks-server
gpgconf gpgsm libassuan0 libksba8 libnpth0 pinentry-curses
Suggested packages:
dbus-user-session pinentry-gnome3 tor parcimonie xloadimage scdaemon
pinentry-doc
The following NEW packages will be installed:
dirmngr gnupg gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client
gpg-wks-server gpgconf gpgsm libassuan0 libksba8 libnpth0 pinentry-curses
0 upgraded, 14 newly installed, 0 to remove and 0 not upgraded.
Need to get 7089 kB of archives.
After this operation, 14.9 MB of additional disk space will be used.
Do you want to continue? [Y/n]
...
Setting up gnupg (2.2.12-1+deb10u1) ...
Processing triggers for libc-bin (2.28-10) ...
root@8af6489c0df8:/#

Then I called apt-key as suggested by the KB article, with the key fingerprint listed there:

root@8af6489c0df8:/# apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xF1656F24C74CD1D8
Executing: /tmp/apt-key-gpghome.h3pkEOHaEm/gpg.1.sh --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xF1656F24C74CD1D8
...

4. Installing MariaDB packages with apt-get

Finally I can install what I need:

root@8af6489c0df8:/# apt-get install mariadb-server galera-4 mariadb-client libmariadb3 mariadb-backup mariadb-common
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
gawk libaio1 libcgi-fast-perl libcgi-pm-perl libdbd-mariadb-perl libdbi-perl
libencode-locale-perl libfcgi-perl libgdbm-compat4 libgdbm6 libgpm2
libhtml-parser-perl libhtml-tagset-perl libhtml-template-perl
libhttp-date-perl libhttp-message-perl libio-html-perl
liblwp-mediatypes-perl libmpfr6 libncurses6 libpcre2-8-0 libperl5.28
libpopt0 libprocps7 libreadline5 libsigsegv2 libterm-readkey-perl
libtimedate-perl liburi-perl libwrap0 lsof mariadb-client-10.5
mariadb-client-core-10.5 mariadb-server-10.5 mariadb-server-core-10.5
mysql-common netbase perl perl-modules-5.28 procps psmisc rsync socat
Suggested packages:
gawk-doc libclone-perl libmldbm-perl libnet-daemon-perl
libsql-statement-perl gdbm-l10n gpm libdata-dump-perl
libipc-sharedcache-perl libwww-perl mailx mariadb-test netcat-openbsd
perl-doc libterm-readline-gnu-perl | libterm-readline-perl-perl make
libb-debug-perl liblocale-codes-perl openssh-client openssh-server
The following NEW packages will be installed:
galera-4 gawk libaio1 libcgi-fast-perl libcgi-pm-perl libdbd-mariadb-perl
libdbi-perl libencode-locale-perl libfcgi-perl libgdbm-compat4 libgdbm6
libgpm2 libhtml-parser-perl libhtml-tagset-perl libhtml-template-perl
libhttp-date-perl libhttp-message-perl libio-html-perl
liblwp-mediatypes-perl libmariadb3 libmpfr6 libncurses6 libpcre2-8-0
libperl5.28 libpopt0 libprocps7 libreadline5 libsigsegv2
libterm-readkey-perl libtimedate-perl liburi-perl libwrap0 lsof
mariadb-backup mariadb-client mariadb-client-10.5 mariadb-client-core-10.5
mariadb-common mariadb-server mariadb-server-10.5 mariadb-server-core-10.5
mysql-common netbase perl perl-modules-5.28 procps psmisc rsync socat
0 upgraded, 49 newly installed, 0 to remove and 0 not upgraded.
Need to get 44.7 MB of archives.
After this operation, 302 MB of additional disk space will be used.
Do you want to continue? [Y/n]
...

and start the service (called mariadb, as 10.5 is installed):

root@8af6489c0df8:/# service mariadb start
[ ok ] Starting MariaDB database server: mariadbd.
root@8af6489c0df8:/# mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 12
Server version: 10.5.8-MariaDB-1:10.5.8+maria~buster mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]>

So, this is how installation from the repository is done. Now what if we need to downgrade to some older minor version, let's say, to 10.5.6?

5. Downgrading to specific minor release

MariaDB KB article explains the process. For this you can create a repository with the URL hard-coded to that specific minor release. You can get these URLs from the MariaDB Foundation's archives. I tried:

root@8af6489c0df8:/# add-apt-repository 'deb [arch=amd64,arm64,ppc64el] http://archive.mariadb.org/mariadb-10.5.6/repo/debian/ buster main'

root@8af6489c0df8:/# apt-get update
Hit:1 http://deb.debian.org/debian buster InRelease
Hit:2 http://security.debian.org/debian-security buster/updates InRelease
Hit:3 http://deb.debian.org/debian buster-updates InRelease
Hit:6 http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.4/debian buster InRelease
Get:4 https://archive.mariadb.org/mariadb-10.5.6/repo/debian buster InRelease [3154 B]
Hit:5 https://downloads.mariadb.com/MariaDB/mariadb-10.5/repo/debian buster InRelease
Hit:7 https://downloads.mariadb.com/Tools/debian buster InRelease
Get:8 https://dlm.mariadb.com/repo/maxscale/latest/debian buster InRelease [3515 B]
Get:9 https://archive.mariadb.org/mariadb-10.5.6/repo/debian buster/main amd64 Packages [36.0 kB]
Fetched 42.7 kB in 2s (23.6 kB/s)
Reading package lists... Done
N: Skipping acquire of configured file 'main/binary-ppc64el/Packages' as repository 'http://archive.mariadb.org/mariadb-10.5.6/repo/debian buster InRelease' doesn't support architecture 'ppc64el'
N: Skipping acquire of configured file 'main/binary-arm64/Packages' as repository 'http://archive.mariadb.org/mariadb-10.5.6/repo/debian buster InRelease' doesn't support architecture 'arm64'

and from the messages above it seems the repository is taken into account during update. The repository is added to the list:

root@8af6489c0df8:/# cat /etc/apt/sources.list
# deb http://snapshot.debian.org/archive/debian/20201117T000000Z buster main
deb http://deb.debian.org/debian buster main
# deb http://snapshot.debian.org/archive/debian-security/20201117T000000Z buster/updates main
deb http://security.debian.org/debian-security buster/updates main
# deb http://snapshot.debian.org/archive/debian/20201117T000000Z buster-updates main
deb http://deb.debian.org/debian buster-updates main
deb [arch=ppc64el,arm64,amd64] http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.4/debian buster main
# deb-src [arch=ppc64el,arm64,amd64] http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.4/debian buster main
deb [arch=amd64,ppc64el,arm64] http://archive.mariadb.org/mariadb-10.5.6/repo/debian/ buster main
# deb-src [arch=amd64,ppc64el,arm64] http://archive.mariadb.org/mariadb-10.5.6/repo/debian/ buster main
root@8af6489c0df8:/#

but upgrade suggested nothing to do:

root@8af6489c0df8:/# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

The problem is that this added repository has no priority over existing one. You can find some related details on how to fix that in Debian Wiki, while I've got a hint from smart customer actually. I had to create a file in /etc/apt/preferences.d/ for that, to pin thew repository on top for all packages it provides:

root@8af6489c0df8:/# cat /etc/apt/preferences.d/mariadb.pref
Package: *
Pin: origin archive.mariadb.org
Pin-Priority: 1001
root@8af6489c0df8:/#

Now we can downgrade:

root@8af6489c0df8:/# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following packages will be DOWNGRADED:
galera-4 libmariadb3 mariadb-backup mariadb-client mariadb-client-10.5
mariadb-client-core-10.5 mariadb-common mariadb-server mariadb-server-10.5
mariadb-server-core-10.5 mysql-common
0 upgraded, 0 newly installed, 11 downgraded, 0 to remove and 0 not upgraded.
Need to get 32.4 MB of archives.
After this operation, 158 kB of additional disk space will be used.
Do you want to continue? [Y/n]
...
Setting up galera-4 (26.4.5-buster) ...
Setting up mysql-common (1:10.5.6+maria~buster) ...
Setting up mariadb-common (1:10.5.6+maria~buster) ...
Setting up libmariadb3:amd64 (1:10.5.6+maria~buster) ...
Setting up mariadb-server-core-10.5 (1:10.5.6+maria~buster) ...
Setting up mariadb-client-core-10.5 (1:10.5.6+maria~buster) ...
Setting up mariadb-backup (1:10.5.6+maria~buster) ...
Setting up mariadb-client-10.5 (1:10.5.6+maria~buster) ...
Setting up mariadb-client (1:10.5.6+maria~buster) ...
Setting up mariadb-server-10.5 (1:10.5.6+maria~buster) ...
Installing new version of config file /etc/logrotate.d/mysql-server ...
debconf: unable to initialize frontend: Dialog
debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. at /usr/share/perl5/Debconf/FrontEnd/Dialog.pm line 76.)
debconf: falling back to frontend: Readline
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of stop.
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of start.
Setting up mariadb-server (1:10.5.6+maria~buster) ...
Processing triggers for systemd (241-7~deb10u4) ...
Processing triggers for libc-bin (2.28-10) ...

Note that Galera library is also downgraded. You may want to avoid that and list packages with priority in a less generic way.

Now we can restart the service and check that downgrade really happened as expected:

root@8af6489c0df8:/# service mariadb restart
[ ok ] Stopping MariaDB database server: mariadbd.
[ ok ] Starting MariaDB database server: mariadbd.
root@8af6489c0df8:/# mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 11
Server version: 10.5.6-MariaDB-1:10.5.6+maria~buster mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show variables like '%version%';
+-----------------------------------+------------------------------------------+
| Variable_name                     | Value                                    |
+-----------------------------------+------------------------------------------+
| in_predicate_conversion_threshold | 1000                                     |
| innodb_version                    | 10.5.6                                   |
| protocol_version                  | 10                                       |
| slave_type_conversions            |                                          |
| system_versioning_alter_history   | ERROR                                    |
| system_versioning_asof            | DEFAULT                                  |
| tls_version                       | TLSv1.1,TLSv1.2,TLSv1.3                  |
| version                           | 10.5.6-MariaDB-1:10.5.6+maria~buster     |
| version_comment                   | mariadb.org binary distribution          |
| version_compile_machine           | x86_64                                   |
| version_compile_os                | debian-linux-gnu                         |
| version_malloc_library            | system                                   |
| version_source_revision           | 5b8ab1934a10966336e66751bc13fc66923b02f6 |
| version_ssl_library               | OpenSSL 1.1.1d 10 Sep 2019              |
| wsrep_patch_version               | wsrep_26.22                              |
+-----------------------------------+------------------------------------------+
15 rows in set (0.002 sec)

So, the roblem is resolved, and all steps are documented for me to find them later online easily. I rarely use packages, as I prefer to build myself from GitHub sources or at least rely on .tar.gz binaries and tools like MySQL Sandbox for my tests, so this excercise was really needed.

Now you know what I had to work on this week. In the video above you can check what I had for breakfast, if you are interested :)

To summarize:

MariaDB KB has a lot of details on installation and downgrade, but some of them are still missing.
In case of .deb pakages installed from the MariaDB repositories one has to pin specific packages to the repositories providing older versions (like those from http://archive.mariadb.org/) and to set higher priority for this repository in some .pref file in the /etc/apt/preferences.d/ directory.
Docker is useful for testing installation steps and anything in a clean environment. It's easy to miss some step otherwise.
Those who are interested in what exact regression bug forced me to consider and document downgrade procesude can just ask in comments :)

Sunday, October 11, 2020

Stored Procedures Instrumentation in MariaDB 10.5

MariaDB 10.5 added a lot of instrumentation around stored procedures, functions and events along the lines of MySQL WL#5766. In this blog post I'll try to check how it works and provide some details that are still missing in the MariaDB Knowledge Base.

So, in frames of porting most of Performance Schema from MySQL 5.7 to MariaDB 10.5 four new types of Performance Schema objects were added:

MariaDB [performance_schema]> select distinct(object_type) from setup_objects;
+-------------+
| object_type |
+-------------+
| EVENT       |
| FUNCTION    |
| PROCEDURE   |
| TABLE       |
| TRIGGER     |
+-------------+
5 rows in set (0,001 sec)

MariaDB [performance_schema]> select * from setup_objects where object_type != 'TABLE' and object_schema not in ('mysql', 'performance_schema', 'information_schema');
+-------------+---------------+-------------+---------+-------+
| OBJECT_TYPE | OBJECT_SCHEMA | OBJECT_NAME | ENABLED | TIMED |
+-------------+---------------+-------------+---------+-------+
| EVENT       | %             | %           | YES     | YES   |
| FUNCTION    | %             | %           | YES     | YES   |
| PROCEDURE   | %             | %           | YES     | YES   |
| TRIGGER     | %             | %           | YES     | YES   |
+-------------+---------------+-------------+---------+-------+
4 rows in set (0,001 sec)

MariaDB [performance_schema]> select version();
+----------------+
| version()      |
+----------------+
| 10.5.7-MariaDB |
+----------------+
1 row in set (0,000 sec)

As you can see from the above, these objects are enabled and timed by default in all databases besides system ones. Additional 20 instruments were also added, enabled and timed by default:

MariaDB [performance_schema]> select * from setup_instruments where name like 'statement/sp/%' or name like 'statement/scheduler%';
+---------------------------------+---------+-------+
| NAME                            | ENABLED | TIMED |
+---------------------------------+---------+-------+
| statement/sp/stmt               | YES     | YES   |
| statement/sp/set                | YES     | YES   |
| statement/sp/set_trigger_field | YES     | YES   |
| statement/sp/jump               | YES     | YES   |
| statement/sp/jump_if_not        | YES     | YES   |
| statement/sp/freturn            | YES     | YES   |
| statement/sp/preturn            | YES     | YES   |
| statement/sp/hpush_jump         | YES     | YES   |
| statement/sp/hpop               | YES     | YES   |
| statement/sp/hreturn            | YES     | YES   |
| statement/sp/cpush              | YES     | YES   |
| statement/sp/cpop               | YES     | YES   |
| statement/sp/copen              | YES     | YES   |
| statement/sp/cclose             | YES     | YES   |
| statement/sp/cfetch             | YES     | YES   |
| statement/sp/agg_cfetch         | YES     | YES   |
| statement/sp/cursor_copy_struct | YES     | YES   |
| statement/sp/error              | YES     | YES   |
| statement/sp/set_case_expr      | YES     | YES   |
| statement/scheduler/event       | YES     | YES   |
+---------------------------------+---------+-------+
20 rows in set (0,002 sec)

In the above I do not see any direct match to most of statements used in stored procedures. You may be wondering what statement/sp/jump_if_not is, for example. These instruments are representing instructions of the low level sp_instr language used to implement the semantics of flow control statements and exception handlers. See more details about them in the MySQL Source Code documentation. As I use a non-debug build, attempt to see these instructions from the procedure using SHOW PROCEDURE CODE statement surely failed:

MariaDB [performance_schema]> show procedure code sbtest.p_sbtest1;
ERROR 1289 (HY000): The 'SHOW PROCEDURE|FUNCTION CODE' feature is disabled; you need MariaDB built with '--with-debug' to have it working

The statement/sp/stmt instrument represents usual DDL or DML statement that is executed "as is".

To check how these new instrument work I deci8ded to add a trigger calling a primitive stored procedure to the table created by sysbench:

MariaDB [sbtest]> show create table sbtest1\G
*************************** 1. row ***************************
Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`k` int(11) NOT NULL DEFAULT 0,
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=latin1
1 row in set (0,001 sec)

MariaDB [sbtest]> set sql_mode='ORACLE';
Query OK, 0 rows affected (0,000 sec)

MariaDB [sbtest]> delimiter //
MariaDB [sbtest]> create procedure p_sbtest1(id int) as begin select k into @k from sbtest1 t where t.id = id; set @i := 0; for i in 1..round(@k/1000)+1 loop set @i := @i + 1; end loop; end;//
Query OK, 0 rows affected (0,058 sec)

MariaDB [sbtest]> create trigger tr1 after update on sbtest1 for each row call p_sbtest1(new.id);//
Query OK, 0 rows affected (0,086 sec)

MariaDB [sbtest]> delimiter ;

This is a primitive procedure that runs some loop to add delay. For fun I've used ORACLE mode and for loop inherited from PL/SQL that is really nice.

With all these in place I ran the following simple sysbench oltp_update_index.lua test for 50 seconds:

openxs@ao756:~/dbs/maria10.5$ sysbench --table-size=1000000 --threads=4 --time=50 --mysql-socket=/tmp/mariadb105.sock --mysql-user=openxs --mysql-db=sbtest --report-interval=5 /usr/share/sysbench/oltp_update_index.lua run
sysbench 1.1.0-faaff4f (using bundled LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 4
Report intermediate results every 5 second(s)
Initializing random number generator from current time

Initializing worker threads...

Threads started!

[ 5s ] thds: 4 tps: 130.61 qps: 130.61 (r/w/o: 0.00/130.61/0.00) lat (ms,95%): 55.82 err/s: 0.00 reconn/s: 0.00
...

and while it was running checked current stored procedure-related statements multiple times with the following SELECT:

MariaDB [performance_schema]> select * from events_statements_current where event_name like 'statement/sp/%'\G
Empty set (0,001 sec)

In all cases I've got zero rows. That's because related consumer is not enabled by default:

MariaDB [performance_schema]> select * from setup_consumers;
+----------------------------------+---------+
| NAME                             | ENABLED |
+----------------------------------+---------+
| events_stages_current            | NO      |
| events_stages_history            | NO      |
| events_stages_history_long       | NO      |
| events_statements_current        | NO      |
| events_statements_history        | NO      |
| events_statements_history_long   | NO      |
| events_transactions_current      | NO      |
| events_transactions_history      | NO      |
| events_transactions_history_long | NO      |
| events_waits_current             | NO      |
| events_waits_history             | NO      |
| events_waits_history_long        | NO      |
| global_instrumentation           | YES     |
| thread_instrumentation           | YES     |
| statements_digest                | YES     |
+----------------------------------+---------+
15 rows in set (0,001 sec)

I've enables all of them with update setup_consumers set enabled = 'YES'; and retried the test:

MariaDB [performance_schema]> select * from events_statements_current where event_name like 'statement/sp/%'\G
...
*************************** 2. row ***************************
              THREAD_ID: 54
               EVENT_ID: 1375298
           END_EVENT_ID: NULL
             EVENT_NAME: statement/sp/set
                 SOURCE:
            TIMER_START: 1751798753952000
              TIMER_END: 1751800198407000
             TIMER_WAIT: 1444455000
              LOCK_TIME: 0
               SQL_TEXT: NULL
                 DIGEST: NULL
            DIGEST_TEXT: NULL
         CURRENT_SCHEMA: sbtest
            OBJECT_TYPE: PROCEDURE
          OBJECT_SCHEMA: sbtest
            OBJECT_NAME: p_sbtest1
OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
...
      NESTING_EVENT_ID: 1373590
     NESTING_EVENT_TYPE: STATEMENT
    NESTING_EVENT_LEVEL: 2
*************************** 3. row ***************************
              THREAD_ID: 56
               EVENT_ID: 1335724
           END_EVENT_ID: NULL
             EVENT_NAME: statement/sp/stmt
                 SOURCE:
            TIMER_START: 1751791595187000
              TIMER_END: 1751800213219000
             TIMER_WAIT: 8618032000
              LOCK_TIME: 0
               SQL_TEXT: call p_sbtest1(new.id)
                 DIGEST: NULL
            DIGEST_TEXT: NULL
         CURRENT_SCHEMA: sbtest
            OBJECT_TYPE: TRIGGER
          OBJECT_SCHEMA: sbtest
            OBJECT_NAME: tr1
OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
...
      NESTING_EVENT_ID: 1335719
     NESTING_EVENT_TYPE: STATEMENT
    NESTING_EVENT_LEVEL: 1
*************************** 4. row ***************************
              THREAD_ID: 56
               EVENT_ID: 1337567
           END_EVENT_ID: NULL
             EVENT_NAME: statement/sp/stmt
                 SOURCE:
            TIMER_START: 1751799809228000
              TIMER_END: 1751800221029000
             TIMER_WAIT: 411801000
              LOCK_TIME: 0
               SQL_TEXT: SET @i := @i + 1
                 DIGEST: NULL
            DIGEST_TEXT: NULL
         CURRENT_SCHEMA: sbtest
            OBJECT_TYPE: PROCEDURE
          OBJECT_SCHEMA: sbtest
            OBJECT_NAME: p_sbtest1
OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: NULL
           MESSAGE_TEXT: NULL
                 ERRORS: 0
...
      NESTING_EVENT_ID: 1335724
     NESTING_EVENT_TYPE: STATEMENT
    NESTING_EVENT_LEVEL: 2
...
*************************** 6. row ***************************
              THREAD_ID: 57
               EVENT_ID: 1342865
           END_EVENT_ID: NULL
             EVENT_NAME: statement/sp/jump_if_not
                 SOURCE:
            TIMER_START: 1751800236059000
              TIMER_END: 1751800238311000
             TIMER_WAIT: 2252000
              LOCK_TIME: 0
               SQL_TEXT: NULL
                 DIGEST: NULL
            DIGEST_TEXT: NULL
         CURRENT_SCHEMA: sbtest
            OBJECT_TYPE: PROCEDURE
          OBJECT_SCHEMA: sbtest
            OBJECT_NAME: p_sbtest1
OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: NULL
           MESSAGE_TEXT: NULL
                 ERRORS: 0
               WARNINGS: 0
          ROWS_AFFECTED: 0
              ROWS_SENT: 0
          ROWS_EXAMINED: 0
CREATED_TMP_DISK_TABLES: 0
     CREATED_TMP_TABLES: 0
       SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
           SELECT_RANGE: 0
     SELECT_RANGE_CHECK: 0
            SELECT_SCAN: 0
      SORT_MERGE_PASSES: 0
             SORT_RANGE: 0
              SORT_ROWS: 0
              SORT_SCAN: 0
          NO_INDEX_USED: 0
     NO_GOOD_INDEX_USED: 0
       NESTING_EVENT_ID: 1342859
     NESTING_EVENT_TYPE: STATEMENT
    NESTING_EVENT_LEVEL: 2
6 rows in set (0,001 sec)

I highlighted some of the interesting column values above. We see SOURCE is always empty (like in MDEV-23827 I've reported while working on the previous post). Looks like this is a common problem for all kinds of instrumentation and I suspect it may have something to do with compile time adding of PSI keys discussed in MDEV-22841. We also see statements from both trigger and procedure, with different NESTING_EVENT_LEVEL. So, basically this new instrumentation work, even though on non-debug builds matching the stored procedure statements reported to flow control statements may become non-trivial. Let's hope most of the time spent is actually spent on nested DML and DDL statements and not on flow control itself.

Additionally there is a new summary table, events_statements_summary_by_program:

MariaDB [performance_schema]> select * from events_statements_summary_by_program\G
*************************** 1. row ***************************
                OBJECT_TYPE: TRIGGER
              OBJECT_SCHEMA: sbtest
                OBJECT_NAME: tr1
                 COUNT_STAR: 16858
             SUM_TIMER_WAIT: 171716322309000
             MIN_TIMER_WAIT: 0
             AVG_TIMER_WAIT: 10186043000
             MAX_TIMER_WAIT: 69867594000
           COUNT_STATEMENTS: 9193
        SUM_STATEMENTS_WAIT: 107504946033000
        MIN_STATEMENTS_WAIT: 4378538000
        AVG_STATEMENTS_WAIT: 11694217000
        MAX_STATEMENTS_WAIT: 69849730000
              SUM_LOCK_TIME: 0
                 SUM_ERRORS: 0
               SUM_WARNINGS: 0
          SUM_ROWS_AFFECTED: 0
              SUM_ROWS_SENT: 0
          SUM_ROWS_EXAMINED: 0
SUM_CREATED_TMP_DISK_TABLES: 0
     SUM_CREATED_TMP_TABLES: 0
       SUM_SELECT_FULL_JOIN: 0
SUM_SELECT_FULL_RANGE_JOIN: 0
           SUM_SELECT_RANGE: 0
     SUM_SELECT_RANGE_CHECK: 0
            SUM_SELECT_SCAN: 0
      SUM_SORT_MERGE_PASSES: 0
             SUM_SORT_RANGE: 0
              SUM_SORT_ROWS: 0
              SUM_SORT_SCAN: 0
          SUM_NO_INDEX_USED: 0
     SUM_NO_GOOD_INDEX_USED: 0
*************************** 2. row ***************************
                OBJECT_TYPE: PROCEDURE
              OBJECT_SCHEMA: sbtest
                OBJECT_NAME: p_sbtest1
                 COUNT_STAR: 16858
             SUM_TIMER_WAIT: 170865294923000
             MIN_TIMER_WAIT: 0
             AVG_TIMER_WAIT: 10135561000
             MAX_TIMER_WAIT: 69774831000
           COUNT_STATEMENTS: 18368282
        SUM_STATEMENTS_WAIT: 70946124125000
        MIN_STATEMENTS_WAIT: 112000
        AVG_STATEMENTS_WAIT: 3862000
        MAX_STATEMENTS_WAIT: 26757028000
              SUM_LOCK_TIME: 0
                 SUM_ERRORS: 0
               SUM_WARNINGS: 0
          SUM_ROWS_AFFECTED: 0
              SUM_ROWS_SENT: 0
          SUM_ROWS_EXAMINED: 8686
SUM_CREATED_TMP_DISK_TABLES: 0
     SUM_CREATED_TMP_TABLES: 0
       SUM_SELECT_FULL_JOIN: 0
SUM_SELECT_FULL_RANGE_JOIN: 0
           SUM_SELECT_RANGE: 0
     SUM_SELECT_RANGE_CHECK: 0
            SUM_SELECT_SCAN: 0
      SUM_SORT_MERGE_PASSES: 0
             SUM_SORT_RANGE: 0
              SUM_SORT_ROWS: 0
              SUM_SORT_SCAN: 0
          SUM_NO_INDEX_USED: 0
     SUM_NO_GOOD_INDEX_USED: 0
2 rows in set (0,001 sec)

and there we have the entire new section of 4 columns:

        SUM_STATEMENTS_WAIT: 70946124125000
        MIN_STATEMENTS_WAIT: 112000
        AVG_STATEMENTS_WAIT: 3862000
        MAX_STATEMENTS_WAIT: 26757028000

for time spent on executing individual statements within procedure or trigger. We can subtract SUM_STATEMENTS_WAIT from SUM_WAIT to find out how much time was spent on "flow control" overhead inside the stored procedure.

We now have more insights into the internal working of all this machinery...

To summarize, MariaDB 10.5, among other things, added instrumentation for stored procedures, functions, triggers and events. It is enabled by default for all non-system databases and is ready to use if you enable related consumers. I hope lack of official documentation at the moment will not prevent users from checking most time consuming stored program units and details of their internal work via Performance Schema. They may easily find out that they are actually bad for performance...

Sunday, September 27, 2020

Metadata Locks Instrumentation in MariaDB 10.5

There are different ways to study metadata locks in MySQL and MariaDB, as I once described in details. Until recently MariaDB had not provided the performance_schema.metadata_locks table, but it was finally added in 10.5. So, now you can easily get the details in a way that became the easiest and most well known since MySQL 5.7. You just have to make sure performance_schema is enabled at startup:

openxs@ao756:~/dbs/maria10.5$ bin/mysqld_safe --no-defaults --port=3311 --socket=/tmp/mariadb105.sock --performance_schema=1 &
[1] 23626
openxs@ao756:~/dbs/maria10.5$ 200927 18:26:41 mysqld_safe Logging to '/home/openxs/dbs/maria10.5/data/ao756.err'.
200927 18:26:41 mysqld_safe Starting mariadbd daemon with databases from /home/openxs/dbs/maria10.5/data

openxs@ao756:~/dbs/maria10.5$ bin/mysql --socket=/tmp/mariadb105.sock
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.5.6-MariaDB Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> select @@performance_schema;
+----------------------+
| @@performance_schema |
+----------------------+
|                    1 |
+----------------------+
1 row in set (0,000 sec)

MariaDB [(none)]> use performance_schema
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [performance_schema]> show tables like '%metadata%';
+-------------------------------------------+
| Tables_in_performance_schema (%metadata%) |
+-------------------------------------------+
| metadata_locks                            |
+-------------------------------------------+
1 row in set (0,001 sec)

MariaDB [performance_schema]> select * from setup_instruments where name like '%metadata%';
+---------------------------------------------------------+---------+-------+
| NAME                                                    | ENABLED | TIMED |
+---------------------------------------------------------+---------+-------+
| wait/io/file/csv/metadata                               | YES     | YES   |
| stage/sql/Waiting for schema metadata lock              | NO      | NO    |
| stage/sql/Waiting for table metadata lock               | NO      | NO    |
| stage/sql/Waiting for stored function metadata lock     | NO      | NO    |
| stage/sql/Waiting for stored procedure metadata lock    | NO      | NO    |
| stage/sql/Waiting for stored package body metadata lock | NO      | NO    |
| stage/sql/Waiting for trigger metadata lock             | NO      | NO    |
| stage/sql/Waiting for event metadata lock               | NO      | NO    |
| memory/performance_schema/metadata_locks                | YES     | NO    |
| wait/lock/metadata/sql/mdl                              | NO      | NO    |
+---------------------------------------------------------+---------+-------+
10 rows in set (0,034 sec)

So, the metadata_locks table is there and we just need to enable wait/lock/metadata/sql/mdl instrument:

MariaDB [performance_schema]> update setup_instruments set enabled = 'YES', timed='YES' where name like 'wait%mdl';
Query OK, 1 row affected (0,030 sec)
Rows matched: 1 Changed: 1 Warnings: 0

MariaDB [performance_schema]> select * from metadata_locks\G
*************************** 1. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: performance_schema
          OBJECT_NAME: metadata_locks
OBJECT_INSTANCE_BEGIN: 140032981253664
            LOCK_TYPE: SHARED_READ
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE:
      OWNER_THREAD_ID: 13
       OWNER_EVENT_ID: 1
1 row in set (0,001 sec)

MariaDB [performance_schema]> select connection_id();
+-----------------+
| connection_id() |
+-----------------+
|               3 |
+-----------------+
1 row in set (0,000 sec)

MariaDB [performance_schema]> select * from threads where processlist_id = 3\G
*************************** 1. row ***************************
          THREAD_ID: 13
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
     PROCESSLIST_ID: 3
   PROCESSLIST_USER: openxs
   PROCESSLIST_HOST: localhost
     PROCESSLIST_DB: performance_schema
PROCESSLIST_COMMAND: Query
   PROCESSLIST_TIME: 0
PROCESSLIST_STATE: Sending data
   PROCESSLIST_INFO: select * from threads where processlist_id = 3
   PARENT_THREAD_ID: 1
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: Socket
       THREAD_OS_ID: 23719
1 row in set (0,018 sec)

The output above shows that one has to join to threads table on thread_id = metadata_locks.owner_thread_id to get the details for thread that set metadata lock:

MariaDB [performance_schema]> SELECT OBJECT_TYPE, OBJECT_SCHEMA, OBJECT_NAME, LOCK_TYPE, LOCK_STATUS, THREAD_ID, PROCESSLIST_ID, PROCESSLIST_INFO FROM performance_schema.metadata_locks INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID WHERE PROCESSLIST_ID <> CONNECTION_ID()\G
*************************** 1. row ***************************
     OBJECT_TYPE: BACKUP
   OBJECT_SCHEMA: NULL
     OBJECT_NAME: NULL
       LOCK_TYPE: BACKUP_DDL
     LOCK_STATUS: PENDING
       THREAD_ID: 50
PROCESSLIST_ID: 21
PROCESSLIST_INFO: drop database test
1 row in set (0,001 sec)

In the cases above we see the lock, SHARED_READ, on the table we access, performance_schema.metadata_locks, set by thread that reads from the table. Then we see how to avoid seeing locks set by current connection and how to get current statement (or other SHOW PROCESSLIST details) for the thread associated with the lock.

I noted that SOURCE column in the metadata_locks table is empty. It seems to be the case for other locks too, like those caused by sysbench test running:

MariaDB [performance_schema]> select * from metadata_locks limit 5\G
*************************** 1. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: sbtest
          OBJECT_NAME: sbtest1
OBJECT_INSTANCE_BEGIN: 140032787904800
            LOCK_TYPE: SHARED_WRITE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE:
      OWNER_THREAD_ID: 34
       OWNER_EVENT_ID: 1
*************************** 2. row ***************************
          OBJECT_TYPE: BACKUP
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140032778203712
            LOCK_TYPE: BACKUP_TRANS_DML
        LOCK_DURATION: STATEMENT
          LOCK_STATUS: GRANTED
               SOURCE:
      OWNER_THREAD_ID: 34
       OWNER_EVENT_ID: 1
*************************** 3. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: sbtest
          OBJECT_NAME: sbtest1
OBJECT_INSTANCE_BEGIN: 140033249782768
            LOCK_TYPE: SHARED_WRITE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE:
      OWNER_THREAD_ID: 33
       OWNER_EVENT_ID: 1
*************************** 4. row ***************************
          OBJECT_TYPE: BACKUP
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140033248057584
            LOCK_TYPE: BACKUP_TRANS_DML
        LOCK_DURATION: STATEMENT
          LOCK_STATUS: GRANTED
               SOURCE:
      OWNER_THREAD_ID: 33
       OWNER_EVENT_ID: 1
*************************** 5. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: sbtest
          OBJECT_NAME: sbtest1
OBJECT_INSTANCE_BEGIN: 140033449398688
            LOCK_TYPE: SHARED_WRITE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE:
      OWNER_THREAD_ID: 32
       OWNER_EVENT_ID: 1
5 rows in set (0,001 sec)

MariaDB [performance_schema]>

I do not think that it is normal, as in MySQL 8.0.21, for example, I still see the reference to the line of source code where the lock is set:

openxs@ao756:~/dbs/8.0$ bin/mysql --socket=/tmp/mysql8.sock -uroot performance_schema
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.21 Source distribution

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from metadata_locks\G
*************************** 1. row ***************************
          OBJECT_TYPE: GLOBAL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
          COLUMN_NAME: NULL
OBJECT_INSTANCE_BEGIN: 139972651847648
            LOCK_TYPE: SHARED
        LOCK_DURATION: EXPLICIT
          LOCK_STATUS: GRANTED
               SOURCE: lock.cc:1033
      OWNER_THREAD_ID: 49
       OWNER_EVENT_ID: 112
*************************** 2. row ***************************
          OBJECT_TYPE: COMMIT
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
          COLUMN_NAME: NULL
OBJECT_INSTANCE_BEGIN: 139972651711440
            LOCK_TYPE: SHARED
        LOCK_DURATION: EXPLICIT
          LOCK_STATUS: GRANTED
               SOURCE: lock.cc:1108
      OWNER_THREAD_ID: 49
       OWNER_EVENT_ID: 112
...

I reported this immediately as a bug, MDEV-23827 - "performance_schema.metadata_locks.source column is always empty".

The details in this table that are not covered by the MySQL manual require some investigation. You probably noted BACKUP_TRANS_DML as a lock type above. Another example (that you may get while mariabackup a.k.a mariadb-backup in 10.5 is running):

MariaDB [performance_schema]> select * from metadata_locks\G
*************************** 1. row ***************************
          OBJECT_TYPE: BACKUP
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140032980286816
            LOCK_TYPE: BACKUP_START
        LOCK_DURATION: EXPLICIT
          LOCK_STATUS: GRANTED
               SOURCE:
      OWNER_THREAD_ID: 44
       OWNER_EVENT_ID: 1
...

I've investigated similar details in the past for MySQL and Percona Server, by code review and some gdb tests. I'd expect to see these values documented somewhere in the mdl.h file in the source code. Here is what we have:

openxs@ao756:~/git/server$ grep -rn BACKUP_TRANS_DML *
mysql-test/main/mdl.result:13:MDL_BACKUP_TRANS_DML      Backup lock
mysql-test/main/mdl.result:19:MDL_BACKUP_TRANS_DML      Backup lock
sql/mdl.h:295:#define MDL_BACKUP_TRANS_DML enum_mdl_type(8)
sql/mdl.cc:134: { C_STRING_WITH_LEN("MDL_BACKUP_TRANS_DML") },
sql/mdl.cc:490:               MDL_BIT(MDL_BACKUP_TRANS_DML)));
sql/mdl.cc:1633: MDL_BIT(MDL_BACKUP_DML) | MDL_BIT(MDL_BACKUP_TRANS_DML) | MDL_BIT(MDL_BACKUP_SYS_DML) | MDL_BIT(MDL_BACKUP_DDL) | MDL_BIT(MDL_BACKUP_ALTER_COPY),
sql/mdl.cc:1634: MDL_BIT(MDL_BACKUP_DML) | MDL_BIT(MDL_BACKUP_TRANS_DML) | MDL_BIT(MDL_BACKUP_SYS_DML) | MDL_BIT(MDL_BACKUP_DDL) | MDL_BIT(MDL_BACKUP_ALTER_COPY) | MDL_BIT(MDL_BACKUP_COMMIT),
sql/sql_base.cc:2048:      mdl_type= MDL_BACKUP_TRANS_DML;
storage/perfschema/table_helper.cc:636:    case MDL_BACKUP_TRANS_DML:
storage/perfschema/table_helper.cc:637:      PFS_engine_table::set_field_varchar_utf8(f, STRING_WITH_LEN("BACKUP_TRANS_DML"));
openxs@ao756:~/git/server$

This is what we have in sql/mdl.h in MariaDB 10.5:

    258 /** Backup locks */
    259
    260 /**
    261   Block concurrent backup
    262 */
    263 #define MDL_BACKUP_START enum_mdl_type(0)
    264 /**
    265    Block new write requests to non transactional tables
    266 */
    267 #define MDL_BACKUP_FLUSH enum_mdl_type(1)
    268 /**
    269    In addition to previous locks, blocks running requests to non trans tables
    270    Used to wait until all DML usage of on trans tables are finished
    271 */
    272 #define MDL_BACKUP_WAIT_FLUSH enum_mdl_type(2)
    273 /**
    274    In addition to previous locks, blocks new DDL's from starting
    275 */
    276 #define MDL_BACKUP_WAIT_DDL enum_mdl_type(3)
    277 /**
    278    In addition to previous locks, blocks commits
    279 */
    280 #define MDL_BACKUP_WAIT_COMMIT enum_mdl_type(4)
    281
    282 /**
    283   Blocks (or is blocked by) statements that intend to modify data. Acquired
    284   before commit lock by FLUSH TABLES WITH READ LOCK.
    285 */
    286 #define MDL_BACKUP_FTWRL1 enum_mdl_type(5)
    287
    288 /**
    289   Blocks (or is blocked by) commits. Acquired after global read lock by
    290   FLUSH TABLES WITH READ LOCK.
    291 */
    292 #define MDL_BACKUP_FTWRL2 enum_mdl_type(6)
    293
    294 #define MDL_BACKUP_DML enum_mdl_type(7)
    295 #define MDL_BACKUP_TRANS_DML enum_mdl_type(8)
    296 #define MDL_BACKUP_SYS_DML enum_mdl_type(9)
...

I see creative re-use of the values from enum_mdl_type.

The last bug not the least, this is how the global read lock and the metadata lock wait look like:

MariaDB [performance_schema]> flush tables with read lock;
Query OK, 0 rows affected (0,090 sec)

MariaDB [performance_schema]> select * from metadata_locks\G
*************************** 1. row ***************************
          OBJECT_TYPE: BACKUP
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140033115590992
            LOCK_TYPE: BACKUP_FTWRL1
        LOCK_DURATION: EXPLICIT
          LOCK_STATUS: GRANTED
               SOURCE:
      OWNER_THREAD_ID: 56
       OWNER_EVENT_ID: 1
...
MariaDB [performance_schema]> select * from metadata_locks\G
*************************** 1. row ***************************
          OBJECT_TYPE: BACKUP
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140033115590992
            LOCK_TYPE: BACKUP_FTWRL1
        LOCK_DURATION: EXPLICIT
          LOCK_STATUS: GRANTED
               SOURCE:
      OWNER_THREAD_ID: 56
       OWNER_EVENT_ID: 1
*************************** 2. row ***************************
          OBJECT_TYPE: BACKUP
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140032979653216
            LOCK_TYPE: BACKUP_DDL
        LOCK_DURATION: STATEMENT
          LOCK_STATUS: PENDING
               SOURCE:
      OWNER_THREAD_ID: 50
       OWNER_EVENT_ID: 1
*************************** 3. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: performance_schema
          OBJECT_NAME: metadata_locks
OBJECT_INSTANCE_BEGIN: 140033113928032
            LOCK_TYPE: SHARED_READ
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE:
      OWNER_THREAD_ID: 56
       OWNER_EVENT_ID: 1
3 rows in set (0,000 sec)

MariaDB [performance_schema]> select * from information_schema.metadata_lock_info\G
*************************** 1. row ***************************
    THREAD_ID: 23
    LOCK_MODE: MDL_BACKUP_FTWRL2
LOCK_DURATION: NULL
    LOCK_TYPE: Backup lock
TABLE_SCHEMA:
   TABLE_NAME:
1 row in set (0,001 sec)

The last statement above shows that the old way to check metadata locks, via metadata_lock_info plugin, had provided much less details. We could not even distinguish lock from lock wait (pending lock) with it!

Are there any locks on that bridge? Go figure...

Finally, even with metadata_locks table in place, finding the details about the exact blocking lock may be not trivial. See this blog post for some idea on how to do this based on assumption that blocking lock is surely set earlier than pending one.

Sunday, September 20, 2020

Dynamic Tracing of pthread_mutex_lock in MariaDB - First Steps

I've got several comments and questions during my session on dynamic tracing at MariaDB Server Fest 2020. One of them was from a MariaDB developer at Slack and sounded as follows:

"Can perf profile contention on pthread_mutex_lock?"

I surely stated that it can (it can profile anything), but tried to clarify what exactly is needed whenever we hit the probe. The replies where the following:

"ideally, collect stack traces of mariadbd, and sort them in descending order by time spent between pthread_mutex_lock and next pthread_mutex_unlock."

and:

"less ideally, just collect stacktraces and sort in descending order by frequency"

I am not sure about the former (I am yet to try to trace more than one probe at a time with perf and find out how to aggregate the result properly). It is surely doable (in a more efficient way) with bcc tools if one writes proper program, and probably even with bpftrace. I am going to show how to do this in the next posts one day... But the question was about perf.

Fortunately I can easily show how to do the latter. For the primitive test I've started MariaDB 10.5 built from GitHub source on my Ubuntu 16.04 netbook with the following options:

openxs@ao756:~/dbs/maria10.5$ bin/mysqld_safe --no-defaults --port=3311 --socket=/tmp/mariadb105.sock --performance_schema=1 --performance-schema-instrument='memory/%=ON' --thread-handling='pool-of-threads' &

and used the following sysbench test as a load that I expected to cause some mutex waits:

openxs@ao756:~/dbs/maria10.5$ sysbench --table-size=1000000 --threads=34 --time=300 --report-interval=5 --mysql-socket=/tmp/mariadb105.sock --mysql-user=openxs --mysql-db=sbtest /usr/share/sysbench/oltp_read_only.lua run

First step is to find out what exact pthread library is used:

openxs@ao756:~/dbs/maria10.5$ ldd bin/mariadbd
        linux-vdso.so.1 => (0x00007ffc03437000)
        libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f585d0c6000)
        liblz4.so.1 => /usr/lib/x86_64-linux-gnu/liblz4.so.1 (0x00007f585ceae000)
        liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f585cc8c000)
        libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f585ca7c000)
        libsnappy.so.1 => /usr/lib/x86_64-linux-gnu/libsnappy.so.1 (0x00007f585c874000)
        libaio.so.1 => /lib/x86_64-linux-gnu/libaio.so.1 (0x00007f585c672000)
        libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007f585c467000)
        libssl.so.1.0.0 => /lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00007f585c1ff000)
        libcrypto.so.1.0.0 => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00007f585bdba000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f585bb9d000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f585b999000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f585b617000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f585b30e000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f585b0f8000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f585ad2e000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f585f4e6000)

Adding the probe after that is trivial:

openxs@ao756:~/dbs/maria10.5$ sudo perf probe -x /lib/x86_64-linux-gnu/libpthread.so.0 pthread_mutex_lock
Added new event:
probe_libpthread:pthread_mutex_lock (on pthread_mutex_lock in /lib/x86_64-linux-gnu/libpthread-2.23.so)

You can now use it in all perf tools, such as:

perf record -e probe_libpthread:pthread_mutex_lock -aR sleep 1

I had not checked yet what are the arguments and not tried to record mutex address or anything besides the fact of the call to this function. Then I started sysbench and while it worked:

openxs@ao756:~/dbs/maria10.5$ sysbench --table-size=1000000 --threads=34 --time=300 --report-interval=5 --mysql-socket=/tmp/mariadb105.sock --mysql-user=openxs --mysql-db=sbtest /usr/share/sysbench/oltp_read_only.lua run
sysbench 1.1.0-faaff4f (using bundled LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 34
Report intermediate results every 5 second(s)
Initializing random number generator from current time

Initializing worker threads...

Threads started!

[ 5s ] thds: 34 tps: 564.25 qps: 9079.14 (r/w/o: 7944.45/0.00/1134.69) lat (ms,95%): 99.33 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 34 tps: 570.68 qps: 9136.92 (r/w/o: 7995.56/0.00/1141.37) lat (ms,95%): 104.84 err/s: 0.00 reconn/s: 0.00
[ 15s ] thds: 34 tps: 585.57 qps: 9345.87 (r/w/o: 8175.33/0.00/1170.53) lat (ms,95%): 80.03 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 34 tps: 588.03 qps: 9409.23 (r/w/o: 8232.98/0.00/1176.25) lat (ms,95%): 97.55 err/s: 0.00 reconn/s: 0.00
[ 25s ] thds: 34 tps: 587.39 qps: 9399.31 (r/w/o: 8224.92/0.00/1174.39) lat (ms,95%): 97.55 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 34 tps: 584.80 qps: 9360.32 (r/w/o: 8190.33/0.00/1169.99) lat (ms,95%): 77.19 err/s: 0.00 reconn/s: 0.00
[ 35s ] thds: 34 tps: 573.02 qps: 9173.87 (r/w/o: 8028.24/0.00/1145.63) lat (ms,95%): 84.47 err/s: 0.00 reconn/s: 0.00
[ 40s ] thds: 34 tps: 572.60 qps: 9160.00 (r/w/o: 8014.00/0.00/1146.00) lat (ms,95%): 84.47 err/s: 0.00 reconn/s: 0.00
[ 45s ] thds: 34 tps: 529.54 qps: 8468.10 (r/w/o: 7409.02/0.00/1059.09) lat (ms,95%): 101.13 err/s: 0.00 reconn/s: 0.00
[ 50s ] thds: 34 tps: 393.04 qps: 6276.27 (r/w/o: 5491.38/0.00/784.88) lat (ms,95%): 121.08 err/s: 0.00 reconn/s: 0.00
[ 55s ] thds: 34 tps: 382.60 qps: 6134.80 (r/w/o: 5369.20/0.00/765.60) lat (ms,95%): 110.66 err/s: 0.00 reconn/s: 0.00
[ 60s ] thds: 34 tps: 386.36 qps: 6210.71 (r/w/o: 5436.79/0.00/773.91) lat (ms,95%): 134.90 err/s: 0.00 reconn/s: 0.00
[ 65s ] thds: 34 tps: 397.01 qps: 6320.60 (r/w/o: 5528.37/0.00/792.23) lat (ms,95%): 153.02 err/s: 0.00 reconn/s: 0.00
[ 70s ] thds: 34 tps: 419.01 qps: 6699.60 (r/w/o: 5861.98/0.00/837.63) lat (ms,95%): 248.83 err/s: 0.00 reconn/s: 0.00
...

I've tried to run perf record for the mariadbd process as follows, with -g option:

openxs@ao756:~/dbs/maria10.5$ sudo perf record -e probe_libpthread:pthread_mutex_lock -g -p`pidof mariadbd` sleep 30
^C[ perf record: Woken up 735 times to write data ]
[ perf record: Captured and wrote 183.516 MB perf.data (1150509 samples) ]

I had noty set the frequency and with the default one we see notable drop of QPS from sysbench and huge amount of data collected in less than 30 seconds (highlighted). So we can not speak about really small impact with such an approach. In production case I'd have to play with sampling frequency and time to run for sure.

Raw results can be checked with perf script:

openxs@ao756:~/dbs/maria10.5$ sudo perf script | more
mariadbd 9621 [000] 30901.858515: probe_libpthread:pthread_mutex_lock: (7f28dd88fd40)
                    9d40 pthread_mutex_lock (/lib/x86_64-linux-gnu/libpthread-2.23.so)
                  6a9233 close_thread_tables (/home/openxs/dbs/maria10.5/bin/mariadbd)
                  70b87c mysql_execute_command (/home/openxs/dbs/maria10.5/bin/mariadbd)
                  72a0ae Prepared_statement::execute (/home/openxs/dbs/maria10.5/bin/mariadbd)
                  72a546 Prepared_statement::execute_loop (/home/openxs/dbs/maria10.5/bin/mariadbd)
                  72af58 mysql_stmt_execute_common (/home/openxs/dbs/maria10.5/bin/mariadbd)
                  72b0b5 mysqld_stmt_execute (/home/openxs/dbs/maria10.5/bin/mariadbd)
                  7090d8 dispatch_command (/home/openxs/dbs/maria10.5/bin/mariadbd)
                  70777c do_command (/home/openxs/dbs/maria10.5/bin/mariadbd)
                  887c40 tp_callback (/home/openxs/dbs/maria10.5/bin/mariadbd)
                  a76650 worker_main (/home/openxs/dbs/maria10.5/bin/mariadbd)
                  b31911 pfs_spawn_thread (/home/openxs/dbs/maria10.5/bin/mariadbd)
                    76ba start_thread (/lib/x86_64-linux-gnu/libpthread-2.23.so)
--More--

Now we can process the results collected. The easiest way is to use perf report:

openxs@ao756:~/dbs/maria10.5$ sudo perf report -g >/tmp/perf_mutex.txt

In the resulting file we can see the following:

openxs@ao756:~/dbs/maria10.5$ cat /tmp/perf_mutex.txt | head -100
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 1M of event 'probe_libpthread:pthread_mutex_lock'
# Event count (approx.): 1150509
#
# Children      Self Command   Shared Object        Symbol
# ........ ........ ........ ................... ......................................................
#
   100.00%   100.00% mariadbd libpthread-2.23.so   [.] pthread_mutex_lock
            |
            |--99.98%-- start_thread
            |          |
            |          |--99.97%-- pfs_spawn_thread
            |          |          |
            |          |          |--99.95%-- worker_main
            |          |          |          |
            |          |          |          |--86.17%-- tp_callback
            |          |          |          |          do_command
            |          |          |          |          dispatch_command
            |          |          |          |          |
            |          |          |          |          |--46.56%-- mysqld_stmt_execute
            |          |          |          |          |          mysql_stmt_execute_common
            |          |          |          |          |          Prepared_statement::execute_loop
            |          |          |          |          |          Prepared_statement::execute
            |          |          |          |          |          |
            |          |          |          |          |          |--27.70%-- mysql_execute_command
            |          |          |          |          |          |          |
            |          |          |          |          |          |          |--17.60%-- close_thread_tables
            |          |          |          |          |          |          |          |
            |          |          |          |          |          |          |          |--8.80%-- close_thread_table
            |          |          |          |          |          |          |          |          pthread_mutex_lock
            |          |          |          |          |          |          |          |
            |          |          |          |          |          |          |           --8.80%-- pthread_mutex_lock
            |          |          |          |          |          |          |
            |          |          |          |          |          |          |--9.47%-- execute_sqlcom_select
            |          |          |          |          |          |          |          |
            |          |          |          |          |          |          |          |--9.46%-- open_and_lock_tables
            |          |          |          |          |          |          |          |          open_tables
            |          |          |          |          |          |          |          |          open_table
            |          |          |          |          |          |          |          |          |
            |          |          |          |          |          |          |          |          |--8.80%-- pthread_mutex_lock
            |          |          |          |          |          |          |          |          |
            |          |          |          |          |          |          |          |          |--0.63%-- open_table_get_mdl_lock
            |          |          |          |          |          |          |          |          |          MDL_context::acquire_lock
            |          |          |          |          |          |          |          |          |          MDL_context::try_acquire_lock_impl
            |          |          |          |          |          |          |          |          |          MDL_map::find_or_insert
            |          |          |          |          |          |          |          |          |          pthread_mutex_lock
            |          |          |          |          |          |          |          |          |
            |          |          |          |          |          |          |          |           --0.03%-- tdc_acquire_share
            |          |          |          |          |          |          |          |                     pthread_mutex_lock
            |          |          |          |          |          |          |          |
            |          |          |          |          |          |          |           --0.01%-- handle_select
            |          |          |          |          |          |          |                     mysql_select
            |          |          |          |          |          |          |                     JOIN::optimize
            |          |          |          |          |          |          |                     JOIN::optimize_inner
            |          |          |          |          |          |          |                     |
            |          |          |          |          |          |          |                     |--0.01%-- join_read_const_table
            |          |          |          |          |          |          |                     |          join_read_const
            |          |          |          |          |          |          |                     |          handler::ha_index_read_idx_map
            |          |          |          |          |          |          |                     |          handler::index_read_idx_map
            |          |          |          |          |          |          |                     |          ha_innobase::index_read
            |          |          |          |          |          |          |                     |          row_search_mvcc
            |          |          |          |          |          |          |                     |          |
            |          |          |          |          |          |          |                     |          |--0.01%-- btr_cur_search_to_nth_level_func
            |          |          |          |          |          |          |                     |          |          buf_page_get_gen
            |          |          |          |          |          |          |                     |          |          buf_page_get_low
            |          |          |          |          |          |          |                     |          |          buf_read_page
            |          |          |          |          |          |          |                     |          |          buf_read_page_low
            |          |          |          |          |          |          |                     |          |          pthread_mutex_lock
            |          |          |          |          |          |          |                     |          |
            |          |          |          |          |          |          |                     |           --0.00%-- btr_pcur_move_to_next_page
            |          |          |          |          |          |          |                     |                     buf_page_get_gen
            |          |          |          |          |          |          |                     |                     buf_page_get_low
            |          |          |          |          |          |          |                     |                     buf_read_page
            |          |          |          |          |          |          |                     |                     buf_read_page_low
            |          |          |          |          |          |          |                     |                     pthread_mutex_lock
            |          |          |          |          |          |          |                     |
            |          |          |          |          |          |          |                      --0.01%-- SQL_SELECT::test_quick_select
            |          |          |          |          |          |          |                                get_key_scans_params
            |          |          |          |          |          |          |                                DsMrr_impl::dsmrr_info_const
            |          |          |          |          |          |          |                                handler::multi_range_read_info_const
            |          |          |          |          |          |          |                                ha_innobase::records_in_range
            |          |          |          |          |          |          |                                btr_estimate_n_rows_in_range_low
            |          |          |          |          |          |          |                                |
            |          |          |          |          |          |          |                                |--0.01%-- btr_cur_search_to_nth_level_func
            |          |          |          |          |          |          |                                |          buf_page_get_gen
            |          |          |          |          |          |          |                                |          buf_page_get_low
            |          |          |          |          |          |          |                                |          buf_read_page
            |          |          |          |          |          |          |                                |          buf_read_page_low
            |          |          |          |          |          |          |                                |          pthread_mutex_lock
            |          |          |          |          |          |          |                                |
            |          |          |          |          |          |          |                                 --0.00%-- buf_page_get_gen
            |          |          |          |          |          |          |                                           buf_page_get_low
            |          |          |          |          |          |          |                                           buf_read_page
            |          |          |          |          |          |          |                                           buf_read_page_low
            |          |          |          |          |          |          |                                           pthread_mutex_lock
            |          |          |          |          |          |          |
...

but the result is so large that you can not clearly see the full picture even with the smallest font set. One would need another way to visualize the data, for example, as a flame graph:

openxs@ao756:~/dbs/maria10.5$ sudo perf script | ~/git/FlameGraph/stackcollapse-perf.pl > /tmp/perf-folded.txt
openxs@ao756:~/dbs/maria10.5$ ~/git/FlameGraph/flamegraph.pl /tmp/perf-folded.txt >/tmp/mutex.svg

The resulting flame graph looks as follows:

Flame Graph for the pthread_mutex_lock calls in MariaDB server.

It makes it easier to see were most of waits originate from.

We can also check the /tmp/perf-folded.txt file created in the process:

openxs@ao756:~/dbs/maria10.5$ head -10 /tmp/perf-folded.txt
mariadbd;[libstdc++.so.6.0.21];tpool::thread_pool_generic::worker_main;pthread_mutex_lock 82
mariadbd;[libstdc++.so.6.0.21];tpool::thread_pool_generic::worker_main;tpool::task::execute;pthread_mutex_lock 41
mariadbd;[libstdc++.so.6.0.21];tpool::thread_pool_generic::worker_main;tpool::task::execute;tpool::thread_pool_generic::timer_generic::execute;pthread_mutex_lock 82
mariadbd;[libstdc++.so.6.0.21];tpool::thread_pool_generic::worker_main;tpool::task::execute;tpool::thread_pool_generic::timer_generic::execute;srv_error_monitor_task;pthread_mutex_lock 19
mariadbd;[libstdc++.so.6.0.21];tpool::thread_pool_generic::worker_main;tpool::task::execute;tpool::thread_pool_generic::timer_generic::execute;srv_master_callback;pthread_mutex_lock 18
mariadbd;start_thread;buf_flush_page_cleaner;buf_flush_lists;buf_flush_do_batch;pthread_mutex_lock 46
mariadbd;start_thread;buf_flush_page_cleaner;pthread_mutex_lock 64
mariadbd;start_thread;pfs_spawn_thread;ma_checkpoint_background;pthread_mutex_lock 1
mariadbd;start_thread;pfs_spawn_thread;ma_checkpoint_background;translog_get_horizon;pthread_mutex_lock 1
mariadbd;start_thread;pfs_spawn_thread;pthread_mutex_lock 112
openxs@ao756:~/dbs/maria10.5$ cat /tmp/perf-folded.txt | sort -r -n -k 1,1 | more
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispat
ch_command;THD::reset_killed;pthread_mutex_lock 231467
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispat
ch_command;pthread_mutex_lock 115734
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispat
ch_command;mysqld_stmt_execute;mysql_stmt_execute_common;Prepared_statement::exe
cute_loop;Prepared_statement::execute;THD::set_statement;pthread_mutex_lock 2169
99
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispat
ch_command;mysqld_stmt_execute;mysql_stmt_execute_common;Prepared_statement::exe
cute_loop;Prepared_statement::execute;mysql_execute_command;MDL_context::release
_locks_stored_before;MDL_context::release_lock;MDL_lock::remove_ticket;pthread_m
utex_lock 7232
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispat
ch_command;mysqld_stmt_execute;mysql_stmt_execute_common;Prepared_statement::exe
cute_loop;Prepared_statement::execute;mysql_execute_command;execute_sqlcom_selec
t;open_and_lock_tables;open_tables;open_table;tdc_acquire_share;pthread_mutex_lo
ck 301
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispat
ch_command;mysqld_stmt_execute;mysql_stmt_execute_common;Prepared_statement::exe
cute_loop;Prepared_statement::execute;mysql_execute_command;execute_sqlcom_selec
t;open_and_lock_tables;open_tables;open_table;pthread_mutex_lock 101268
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispat

to find out that it contains unique stack traces as the first field and number of such stack traces as the second. So, and answer the real question from a developer I have sort the result properly:

openxs@ao756:~/dbs/maria10.5$ cat /tmp/perf-folded.txt | sort -r -n -k 2 | head -5
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispatch_command;THD::reset_killed;pthread_mutex_lock 231467
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispatch_command;mysqld_stmt_execute;mysql_stmt_execute_common;Prepared_statement::execute_loop;Prepared_statement::execute;THD::set_statement;pthread_mutex_lock 216999
mariadbd;start_thread;pfs_spawn_thread;worker_main;pthread_mutex_lock 158530
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispatch_command;pthread_mutex_lock 115734
mariadbd;start_thread;pfs_spawn_thread;worker_main;tp_callback;do_command;dispatch_command;delete_explain_query;pthread_mutex_lock 101269
openxs@ao756:~/dbs/maria10.5$

To summarize, perf (augmented with some post processing tools) does allow to collect stacktraces leading to some probe and sort in descending order by frequency.

Finally, this is my presentation that inspired the question:

You can watch the session recorded if you missed it..