Sunday, May 31, 2020

Dynamic Tracing of Memory Allocations in MySQL: First Steps with perf

I often have to deal with customers claiming there is a memory leak in MySQL affecting their production servers. To prove the leak is real and show where it happens usually running with tcmalloc and relying to it's heap profiler, or even more intrusive, running under Valgrind Massif is needed. This is normal when the test case demonstrating the leak is isolated and we prove a leak for some bug report. But hardly anyone in production in a normal situation would agree to do this, as performance impact is very serious.

So, recently I decided to check if dynamic tracing (along the lines of this page by Brendan Gregg, with Flame Graphs to show suspicious code path) is a more reasonable alternative. As I have to work with different Linux versions, often as old as CentOS 6, I can not rely on bcc tools for this. I mentioned more than once that in internal discussion that adding dynamic probe for malloc() (and other memory allocating functions, if you prefer) with perf may already be a good start and give some data to study. So, time to try this approach myself (so that I am ready to suggest it to customers with all the details at hand) and continue my series of posts on dynamic tracing. It can not be all the time about bugs, readers needs solutions as well :)

For this post I decided to work with latest and greatest MySQL 8.0.20 that I have built from source on my good old almost irrelevant Ubuntu 16.04 netbook:
openxs@ao756:~$ uname -a
Linux ao756 4.4.0-178-generic #208-Ubuntu SMP Sun Apr 5 23:45:10 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
openxs@ao756:~$ cd dbs/8.0
openxs@ao756:~/dbs/8.0$ bin/mysql -uroot --socket=/tmp/mysql8.sock -e "show variables like '%version%'"
+--------------------------+-----------------------+
| Variable_name            | Value                 |
+--------------------------+-----------------------+
| immediate_server_version | 999999                |
| innodb_version           | 8.0.20                |
| original_server_version  | 999999                |
| protocol_version         | 10                    |
| slave_type_conversions   |                       |
| tls_version              | TLSv1,TLSv1.1,TLSv1.2 |
| version                  | 8.0.20                |
| version_comment          | Source distribution   |
| version_compile_machine  | x86_64                |
| version_compile_os       | Linux                 |
| version_compile_zlib     | 1.2.11                |
+--------------------------+-----------------------+
openxs@ao756:~/dbs/8.0$
The reason is simple. Looks like I have more readers while writing about MySQL, not some fork of it :) I hope this tend will change soon.

First of all, I have to check what memory allocation library is used (the details of adding probe and even some features available to probe may be different for jemalloc, for example):
openxs@ao756:~/dbs/8.0$ ldd bin/mysqld
        linux-vdso.so.1 =>  (0x00007ffe8afb6000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fc9bdfe2000)
        libprotobuf-lite.so.3.6.1 => /home/openxs/dbs/8.0/bin/../lib/private/libprotobuf-lite.so.3.6.1 (0x00007fc9bdd8d000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fc9bdb85000)
        libssl.so.1.0.0 => /lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00007fc9bd91c000)
        libcrypto.so.1.0.0 => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00007fc9bd4d7000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fc9bd2d3000)
        libaio.so.1 => /lib/x86_64-linux-gnu/libaio.so.1 (0x00007fc9bd0d1000)
        libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007fc9bcec6000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fc9bcb44000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fc9bc83b000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fc9bc625000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc9bc25b000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fc9be1ff000)
openxs@ao756:~/dbs/8.0$
I do not see tcmalloc or jemalloc in the output above, but libc is linked dynamically. So, malloc() implementation would likely come from it (/lib/x86_64-linux-gnu/libc.so.6 binary, to be specific). Let me add the probe (see this older post for more details, some details about probes syntax presented here are also useful):
openxs@ao756:~$ sudo perf probe -x /lib/x86_64-linux-gnu/libc.so.6 'malloc size=%di:s64'
[sudo] password for openxs:
Added new events:
  probe_libc:malloc    (on malloc in /lib/x86_64-linux-gnu/libc-2.23.so with size=%di:s64)
  probe_libc:malloc_1  (on malloc in /lib/x86_64-linux-gnu/libc-2.23.so with size=%di:s64)

You can now use it in all perf tools, such as:

        perf record -e probe_libc:malloc_1 -aR sleep 1

One of the first things I want to be able to find out is how many bytes are allocated by malloc(). In case of libc malloc() on this Ubuntu I could just use the fist and only argument name (you'd be surprised what it really is, try to find out!) or just $params (see man perf-probe for details).  But in a general case (like with jemalloc on the same system) I should better refer to it based on the way it is passed - via %DI (not short form, NOT %EAI) register on x86 systems. This is why you see size=%di:s64 in the probe definition. Note also that to get real value as integer I had to use s64 type (for signed 64-bit integer, even though I;d expect the argument to be unsigned). By default and as unsigned we get hex value, and one day I'd like to be able to count also bytes allocated, not just calls.

I wanted to add return probe for the same function, in a hope to store (one day) the pointer and then maybe check (with other probe) if it is ever explicitly freed. But in my case it is not possible to get the return value:
openxs@ao756:~$ sudo perf probe -x  /lib/x86_64-linux-gnu/libc.so.6 malloc_ret='malloc%return $retval'
Failed to find "__libc_malloc%return",
 because __libc_malloc is an inlined function and has no return point.
Added new event:
  probe_libc:malloc_ret (on malloc%return in /lib/x86_64-linux-gnu/libc-2.23.so with $retval)

You can now use it in all perf tools, such as:

        perf record -e probe_libc:malloc_ret -aR sleep 1
I am yet to check the reason, but according to the above the function is inlined and thus does not have a return point. Sounds weird. Note that for jemalloc I was able to add similar probe successfully and get the pointer value traced.

So, what do I get when I try to trace this probe in perf? Let me show. If I run some stupid test surely allocating memory in one shell:
openxs@ao756:~/dbs/8.0$ bin/mysqlslap -uroot --socket=/tmp/mysql8.sock --concurrency=12 --create-schema=test --no-drop --number-of_queries=300000 --iterations=2 --query='select @a := repeat("a", 1000);'
While in the other I'll try to record some samples:
openxs@ao756:~$ sudo perf record -e 'probe_libc:malloc' -aRg sleep 10
[sudo] password for openxs:
Lowering default frequency rate to 3250.
Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
[ perf record: Woken up 710 times to write data ]
[ perf record: Captured and wrote 179.217 MB perf.data (1017163 samples) ]
Then I'll get the output like this from perf script:
openxs@ao756:~$ sudo perf script | grep mysqld | more
...
mysqld   632 [001] 218853.089892: probe_libc:malloc: (7fc7312b7130) size=120
                  a18de2 std::__detail::_Hashtable_alloc<Malloc_allocator<std::_
_detail::_Hash_node<TABLE_LIST*, true> > >::_M_allocate_buckets (/home/openxs/db
s/8.0/bin/mysqld)
                  a19061 lock_table_names (/home/openxs/dbs/8.0/bin/mysqld)
                  a1adc8 open_tables (/home/openxs/dbs/8.0/bin/mysqld)
                  a1b576 open_tables_for_query (/home/openxs/dbs/8.0/bin/mysqld)
                  ae7313 Sql_cmd_dml::prepare (/home/openxs/dbs/8.0/bin/mysqld)
                  aef86f Sql_cmd_dml::execute (/home/openxs/dbs/8.0/bin/mysqld)
                  a92b0a mysql_execute_command (/home/openxs/dbs/8.0/bin/mysqld)
                  a97824 mysql_parse (/home/openxs/dbs/8.0/bin/mysqld)
                  a99605 dispatch_command (/home/openxs/dbs/8.0/bin/mysqld)
                  a9abd0 do_command (/home/openxs/dbs/8.0/bin/mysqld)
                  bba6b0 handle_connection (/home/openxs/dbs/8.0/bin/mysqld)
                 20d0355 pfs_spawn_thread (/home/openxs/dbs/8.0/bin/mysqld)
...
mysqld  2795 [000] 218853.091016: probe_libc:malloc: (7fc7312b7130) size=1040                  bb7ed3 String::real_alloc (/home/openxs/dbs/8.0/bin/mysqld)
                  dda12f Item_func_repeat::val_str (/home/openxs/dbs/8.0/bin/mys
qld)
                  da5272 Item_func_set_user_var::check (/home/openxs/dbs/8.0/bin
/mysqld)
                  da5874 Item_func_set_user_var::val_str (/home/openxs/dbs/8.0/b
in/mysqld)
                  d1d1cf Item::send (/home/openxs/dbs/8.0/bin/mysqld)
                  a23d29 THD::send_result_set_row (/home/openxs/dbs/8.0/bin/mysq
ld)
                  eb86c1 Query_result_send::send_data (/home/openxs/dbs/8.0/bin/
mysqld)
                  b55524 SELECT_LEX_UNIT::ExecuteIteratorQuery (/home/openxs/dbs
/8.0/bin/mysqld)
                  b5566e SELECT_LEX_UNIT::execute (/home/openxs/dbs/8.0/bin/mysq
ld)
                  ae8f52 Sql_cmd_dml::execute_inner (/home/openxs/dbs/8.0/bin/my
sqld)
                  aefc37 Sql_cmd_dml::execute (/home/openxs/dbs/8.0/bin/mysqld)
                  a92b0a mysql_execute_command (/home/openxs/dbs/8.0/bin/mysqld)
                  a97824 mysql_parse (/home/openxs/dbs/8.0/bin/mysqld)
                  a99605 dispatch_command (/home/openxs/dbs/8.0/bin/mysqld)
...
Note the following:
  • I tried to trace all calls to malloc() from glibc, for all processes, with stack traces. As a result I've got 179 MB of raw data in perf.data over just 10 seconds, with default frequency of sampling! This is too much for any pracrtical purpose, so make sure to set smaller frequency (like -F99) or performance impact may be too seriois. With newer kernels and eBPF this impact may be notably reduced.
  • I've got both allocation size reported (and may get a good use for it one day), and stack traces, showing where the allocation is coming from! This is exactly what developers need to check if it is really a possible leak.
Now what I can I do with the above? I can use perf report (sudo perf report > /tmp/report.txt) and get the call graphs in text:
openxs@ao756:~$ cat /tmp/report.txt | more                                     
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 556  of event 'probe_libc:malloc'
# Event count (approx.): 4935574
#
# Children      Self  Command          Shared Object        Symbol

# ........  ........  ...............  ...................  ....................
................................................................................
................................................................................
..........
#
...
   87.56%    87.56%  mysqlslap        libc-2.23.so         [.] malloc
                  |
                  ---start_thread
                     run_task
                     |
                     |--86.77%-- mysql_real_query
                     |          cli_read_query_result
                     |          read_com_query_metadata
                     |          cli_read_metadata_ex
                     |          MEM_ROOT::AllocSlow
                     |          MEM_ROOT::AllocBlock
                     |          malloc
                     |
                      --0.79%-- mysql_store_result
                                cli_read_rows
                                MEM_ROOT::AllocSlow
                                MEM_ROOT::AllocBlock
                                malloc
...
    10.18%     0.00%  mysqld           mysqld               [.] do_command
                     |
                     ---do_command
                        dispatch_command
                        |
                        |--9.00%-- mysql_parse
                        |          |
                        |          |--7.88%-- mysql_execute_command
                        |          |          Sql_cmd_dml::execute
                        |          |          |
                        |          |          |--6.64%-- Sql_cmd_dml::prepare
                        |          |          |          open_tables_for_query
                        |          |          |          open_tables
                        |          |          |          lock_table_names
                        |          |          |          |
                        |          |          |          |--6.04%-- std::__detai
l::_Hashtable_alloc<Malloc_allocator<std::__detail::_Hash_node<TABLE_LIST*, true
> > >::_M_allocate_buckets
                        |          |          |          |          malloc
                        |          |          |          |
                        |          |          |           --0.59%-- get_and_lock
_tablespace_names
                        |          |          |                     std::__detai
l::_Hashtable_alloc<Malloc_allocator<std::__detail::_Hash_node<std::__cxx11::bas
ic_string<char, std::char_traits<char>, std::allocator<char> >, true> > >::_M_al
locate_buckets
                        |          |          |                     malloc
                        |          |          |
                        |          |           --1.24%-- Sql_cmd_dml::execute_inner
                        |          |                     SELECT_LEX_UNIT::execute
                        |          |                     SELECT_LEX_UNIT::ExecuteIteratorQuery
                        |          |                     Query_result_send::send_data
                        |          |                     THD::send_result_set_row
                        |          |                     Item::send
                        |          |                     |
                        |          |                     |--0.66%-- Item_func_set_user_var::val_str
                        |          |                     |          Item_func_set_user_var::check
                        |          |                     |          Item_func_repeat::val_str
                        |          |                     |          String::real_alloc
                        |          |                     |          malloc
                        |          |                     |
                        |          |                      --0.58%-- user_var_entry::val_str
                        |          |                                String::copy
                        |          |                                String::real_alloc
                        |          |                                malloc
                        |          |
                        |           --1.12%-- parse_sql
                        |                     |
                        |                     |--0.66%-- THD::sql_parser
                        |                     |          MYSQLparse
                        |                     |          THD::raise_condition
                        |                     |          Diagnostics_area::push_warning
                        |                     |          MEM_ROOT::AllocSlow
                        |                     |          MEM_ROOT::AllocBlock
                        |                     |          malloc
                        |                     |
                        |                      --0.46%-- Diagnostics_area::copy_sql_conditions_from_da
                        |                                Diagnostics_area::push_warning
                        |                                Diagnostics_area::push_warning
                        |                                MEM_ROOT::AllocSlow
                        |                                MEM_ROOT::AllocBlock
                        |                                malloc
                        |
                         --1.18%-- THD::swap_rewritten_query
                                   String::mem_realloc
                                   malloc
...
I can also create a flame graph of number of malloc() calls per function/sequence of calls. I had to modify steps from the original source a big, to come up with the following:
openxs@ao756:~$ sudo perf script > out.stack
openxs@ao756:~$ git/FlameGraph/stackcollapse-perf.pl < out.stack | git/FlameGraph/flamegraph.pl --color=mem --title='malloc( Fale Graph' --countname="calls" -- > malloc.svg
For this test I've included allocations for all the processes and had to use stackcollapse-perf.pl script, as stack traces are coming in perf format.

The result (actually a part of it, just for the mysqld process, copy/pasted as .png static file) may look as follows:

Sample malloc() calls flame graph build from one of tests.
Now you can see where most malloc() calls happen, and this is the first step in checking where leaks may come from if we really suspect them. Quite a lot of fun for a simple probe added with perf, on any Linux system younger than 10 years!

Next time I am going to apply this approach to some real life use case that looks like a leak and see if anything useful, not just fun, can be derived from it. For now I know it's doable and I am sure about the steps (documented here and more in my background notes) to trace malloc() with perf dynamic probe.

Saturday, May 30, 2020

Fun with Bugs #99 - On MySQL Bug Reports I am Subscribed to, Part XXXIII

In my previous post in this series I've commented on some interesting MySQL bug reports that were added during the second half of April. Time to move on to bugs reported in May, 2020, as we are quickly approaching MySQL Bug #100000 soon and I want to write a separate post for this great achievement :)

Here is the list:
  • Bug #99432 - "Improving memory barrier during rseg allocation". Nice contribution by my former colleague in Percona, Krunal Bauskar, who now works on making MySQL better for ARM processors. According to his findings, the use of a relaxed memory model improves performance on ARM by up to 2%. See also yet another bug report with a contribution that matters for ARM, Bug #99556 - "Avoid sequentially consistent atomics for atomic counters" (contributed by Sergey Glushchenko from Percona).
  • Bug #99444 - "New HASH JOIN order problem". One should not expect and rely on any specific order unless explicit ORDER BY is used, so formally this report by Gabor Aron is "Not a Bug". I put it into this list as several other community members helped him a lot in understanding why results with HASH_JOIN optimization in newer versions are still valid and what are the ways to get the results with the desired ordering efficiently. Guilhem Bichot, for instance, suggested two different ways, using window function and lateral table. Useful reading in any case!
  • Bug #99458 - "i_s_fts_index_cache_fill_one_index() is not protect by the lock". Looks like even crashes are possible as a result, based on comments. Nice finding by Haiqing Sun.
  • Bug #99459 - "SQL run with GROUP_MIN_MAX may infinite loop and never return". After some discussion around the validity and severity of bug reports where test case involved adding DEBUG_SYNC() to show the problem in a predictable way, this great bug report by Ze Yang was verified. All MySQL GA versions are affected, including 8.0.20! As a side note, I'd prefer NOT to read such discussions any more. They are wasting time of all parties involved.
  • Bug #99499 - "Incorrect result when constant equailty expression is used in LEFT JOIN condition". This bug that affects MySQL 5.7.x only (it was fixed in MySQL 8.0.17+ and in 5.6 code was different) was reported by Marcos Albe from Percona. 
  • Bug #99504 - "Generated column incorrect on INSERT when based on column w/ expression DEFAULT". Several problems are highlighted in the complex enough test case submitted by Brad Lanier.
  • Bug #99582 - "Reduce logging of new doublewrite buffer initialization which is confusing". 180 lines or so are added when --log-error-verbosity is set to 3. As a workaround one can add:
    log-error-suppression-list="MY-011950"
    to the [mysqld] section of the .cnf file. This problem was reported by Simon Mudd. Make sure to read all comments.
  • Bug #99591 - "Option --tc-heuristic-recover documentation wrong, missing details". In reality it does not work with more than one XA-capable engine installed. I wish fine manual documents the reality, not the good intentions of the past. This documentation request was added by Sami Ahlroos.
  • Bug #99593 - "Performance issues in 8.0.20". It seems to be yet another TempTable engine problem that caused regression comparing to MySQL 5.7. At least this:

    • SET GLOBAL internal_tmp_mem_storage_engine=MEMORY;

    is a workaround. The bug (a duplicate of internal Bug #30562964) was reported by billy noah and is fixed in upcoming MySQL 8.0.21.
  • Bug #99601 - "Broken Performance using EXIST function, increasing execution time each loop". This regression bug (without tag, but who cares...) in MySQL 8.0 was reported by Ronny Görner and minimal test case demonstrating that the problem is actually with function call was contributed by Shane Bester.
  • Bug #99643 - "innobase_commit_by_xid/innobase_rollback_by_xid is not thread safe". This bug was reported by Zhai Weixiang, who had also suggested the fix in the code.
  • Bug #99717 - "Performance regression of parallel count". Great bug report with code analysis and ready to use MTR test case from Ze Yang. Sunny Bains already confirmed that the problematic code is going to be removed.

To summarize:
  1. I am happy to see Oracle engineers explaining community bug reporters the reasons and possible solutions for the problems they hit that are not actually caused by any bug in MySQL. I tried to do this as well, whenever possible, while working on MySQL bugs...
  2. We can still find speculations that if the bug is repeatable only by adding DEBUG_SYNC() or similar debug lines, then it can not be verified or gets lower severity... IMHO this is nonsense, as there are many high severity verified real bug reports where this method is used to demonstrate the problem clearly. Just stop it!



Sunday, May 24, 2020

Fun with Bugs #98 - On MySQL Bug Reports I am Subscribed to, Part XXXII

There are many things to write about MySQL this week. It turned 25 years old, to begin with! Quite successful the first ever Percona Live ONLINE 24 hour conference also happened this week, and I've presented my talk there...

But this is a blog of former MySQL Entomologist, so when I have nothing ready to share about something exciting or immediately useful, I write about MySQL bugs. This is the case today as well. I need more time to think about MySQL history and write down the details to complement my presentation etc. Previous post in this series appeared a month ago, so let me present a yet another list of InnoDB, optimizer, replication and few other bugs reported by MySQL Community users since April 18, 2020 and before May, 2020:
  • Bug #99295 - "InnoDB in-memory estimate is wrong after an index is created". As Øystein Grøvlen found out, even though the entire newly added covering index is in the buffer pool, the buf_stat_per_index_t::get() function in MySQL 8 estimates just 1 page is in memory, and as a result the index is not used.
  • Bug #99326 - "undo truncation might still not be crash safe". After nice explanations of what may happen from Kevin Lewis:
    "After and internal discussion with Sunny Bains, I think I understand the concern better. Let's assume that a redo log is so large that it contains redo entries for all 512 Space IDs of an undo tablespace that is being truncated too often. In other words, even though each truncate removes old pages from the buffer pool and flushes newly created pages, it does not actually cause a checkpoint for each truncation like it did in 5.7. So the redo log can possibly contain records for more than 512 space IDs.

    There is a worklog tested and pushed to the 8.0.21 release branch that fixes this highly unlikely possibility.

    As part of WL#11819, we keep a count of the number of truncations that have happened between checkpoints. So if there is more than (512 / 8) truncations between checkpoints, then no more truncations can happen on that undo space until the next checkpoint happens."
    this (regression vs 5.7) bug report by Zanye Zjy was closed as "Not a Bug". This is a totally wrong status and handling for such a case. 
  • Bug #99354 - "Nondeterministic stored function returns incorrect results". This interesting bug was reported by Jacob Chafik. If anyone cares, it is still repeatable on 8.0.20 (and not repeatable on MariaDB 10.4.x):
    mysql> SELECT IF(COUNT(t1.id) > 0, "Success", "Failure") "Result" FROM t1 WHERE 1 IN (SELECT t1_inner.id FROM t1 t1_inner INNER JOIN t2 WHERE f1(t2.id) > 0);
    +---------+
    | Result  |
    +---------+
    | Failure |
    +---------+
    1 row in set (0,04 sec)

    mysql> SELECT IF(COUNT(t1.id) > 0, "Success", "Failure") "Result" FROM t1 WHERE 1 IN (SELECT t1_inner.id FROM t1 t1_inner INNER JOIN t2 WHERE t2.id > 0);
    +---------+
    | Result  |
    +---------+
    | Success |
    +---------+
    1 row in set (0,00 sec)

    mysql> explain SELECT IF(COUNT(t1.id) > 0, "Success", "Failure") "Result" FROM t
    1 WHERE 1 IN (SELECT t1_inner.id FROM t1 t1_inner INNER JOIN t2 WHERE f1(t2.id)
    > 0);

    +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
    | id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                      |
    +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
    |  1 | SIMPLE      | t1_inner | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    2 |    50.00 | Using where                                |
    |  1 | SIMPLE      | t2       | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | FirstMatch                                 |
    |  1 | SIMPLE      | t1       | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where; Using join buffer (hash join) |
    +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
    3 rows in set, 1 warning (0,00 sec)

    mysql> explain SELECT IF(COUNT(t1.id) > 0, "Success", "Failure") "Result" FROM t1 WHERE 1 IN (SELECT t1_inner.id FROM t1 t1_inner INNER JOIN t2 WHERE t2.id > 0);
    +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------------------------+
    | id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                         |
    +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------------------------+
    |  1 | SIMPLE      | t1_inner | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    2 |    50.00 | Using where                   |
    |  1 | SIMPLE      | t2       | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    2 |    50.00 | Using where; FirstMatch       |
    |  1 | SIMPLE      | t1       | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using join buffer (hash join) |
    +----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------------------------+
    3 rows in set, 1 warning (0,00 sec)

    mysql> select version();
    +-----------+
    | version() |
    +-----------+
    | 8.0.20    |
    +-----------+
    1 row in set (0,00 sec)
    Wrong results should not be produced for logically equivalent queries, even if plans are different. Unfortunately this regression (since 5.7.5+) bug had not got the "regression" tag.
  • Bug #99359 - "Order by in group_concat with prepare statement returns unexpected results". Two executions of the same statement with the user variable having the same value assigned should either both work or fail with error message, it's a matter of consistency. This is not the case in a situation described in the bug report from Feng Liyuan.
  • Bug #99363 - "Innodb_data_pending_fsyncs shows incorrect value". In this bug report SeongUck Lee clearly shown that the problem happens and gave some hints based on source code review what could cause this regression in 8.0 comparing to 5.7. Moreover, after adding debug assertion he managed to show how it is hit and what wrong values are produced, in gdb. Still this report for now ended up as "Not a Bug" (until a test case is presented, reportedly. What a shame!
  • Bug #99377 - "Assertion `thd->get_transaction()->is_empty(Transaction_ctx::STMT)' failed in ". Surely dropping and re-creating the mysql.general_log table is a corner case, but even in debug builds it should end up with some proper error message, not just assertion failure. This bug was reported by Roel Van de Paar.
  • Bug #99381 - "ORDER_BY Index-level optimizer hint implies NO_JOIN_INDEX for second table". The bug reporter, Jesper Wisborg Krogh, my former colleague in MySQL Support, knows a lot about MySQL queries optimization. Make sure to read his last book, "MySQL 8 Query Performance Tuning: A Systematic Method for Improving Execution Speeds". This time he found a problem with hints.
  • Bug #99398 - "Data in wrong row on left join". This regression bug in MySQL 8.0.20 reported by Soner Sayakci is already fixed in upcoming MySQL 8.0.21. Good news.
  • Bug #99412 - "Threads_running becomes scalability bottleneck on multi-node NUMA topologies". Nice bug report from Sergey Glushchenko, with a patch contributed. perf annotate was used to demonstrate the problem. Good to know Percona engineers keep working to make MySQL perform better!
  • Bug #99413 - "Constant propagation get the wrong result when mix with the different collations". Again, this bug report from Wj Huang was closed as 'Not a Bug" based mostly on the results from 8.0.21+ that is not yet released. This is either a "Duplicate" if the exact fix that helped in 8.0.21+ can be isolated, or "Can't repeat", but setting it to "Not a Bug" seems wrong to me.
  • Bug #99414 - "Where exists performance 8.0.16 onwards". This optimizer regression bug in 8.0.16+ (comparing to 5.7) was reported by Jonathan Taylor. See also a comment by Øystein Grøvlen about the tool available at https://github.com/ogrovlen/opttrace to pre-process and get a condensed trace for the join optimizer, that helps to understand what's going on.
Let me stop for now. Many more interesting bugs were reported in May, so stay tuned!

Happy Birthday, MySQL and Sakila!
To summarize:
  1.  Now (regression!) bugs are sometimes closed as "Not a Bug" after explanation what the problem is and statement that there is a fix in a version not yet released. Not as a "Duplicate", but "Not a Bug", after confirming the problem. This is a new level of wrong bugs handling!
  2. Same with a bug where bug reporter is still working on a repeatable test case, after showing the problem happened with some evidence, "Not a Bug".
  3. Oracle does a good job recently in making sure MySQL Server bugs are not hanging around without reaction and are processed fast, but looks like this is partially achieved by lowering standards of bugs processing that were established over last 15+ years. Closing reports by all means as soon as possible as "Not a Bug" is not a way to go. This is sad.

Saturday, April 25, 2020

Fun with Bugs #97 - On MySQL Bug Reports I am Subscribed to, Part XXXI

Time for the next blog post about MySQL bugs! The previous one covered some bugs I considered interesting in March, this one will be about community bug reports that I've picked up since March 26, 2020. I'd better review bug fixed in MySQL 5.7.30 instead of this, but it is still not released, even though we know it must get some important security fixes based on "Oracle Critical Patch Update Advisory - April 2020" recently published.

As usual, I am mostly interested in replication, InnoDB, optimizer and few more categories. Here is the list:
  • Bug #99082 - "Problem with replication: XA transaction, temp tables and row based binlog". As noted by Libor Laichmann, creating temporary tables inside XA transaction leads to broken replication in MySQL 5.7.29, with somewhat misleading error message. This is unfortunate. We do not see any evidence, yet, if 5.6 and 8.0 branches are similarly affected.
  • Bug #99094 - "coredump when install information schema plugin". Bug reporter, Lou Shuai, tried to create simple plugin for the information_schema and it crashed MySQL server 8.0.19 upon installation attempt. Bot the plugin source code and patch for the bug were contributed.
  • Bug #99100 - "GROUP BY will throw table is full when temptable memory allocation exceed limit". More fun with TempTable storage engine that I blamed previously. This bug was reported by Dean Zhou, who had performed detailed analysis in gdb and suggested a fix. It took some efforts for the bug reporter to get it "Verified" as a regression bug in MySQL 8.0.19 (without a "regression" tag, surely).
  • Bug #99101 - "SELECT .. LOCK IN SHARE MODE requires LOCK TABLES grant". Simple regression in MySQL 8.0.11+ (or incompatible change in behavior, if you prefer) vs MySQL 5.7 was found and reported by Matthew Boehm.
  • Bug #99136 - "TempTable wastes 1MB for each connection in thread cache". Nikolai Ikhalainen demonstrated this additional memory usage in MySQL 8.0.16 comparing to 5.7 quite clearly.
  • Bug #99174 - "Prepare transaction won't rollback if server crash after online ddl prepare stage". This bug was reported by Zhang Xiaojian. Additional debugging code (one DBUG_EXECUTE_IF statement) was added to the source to demonstrate the problem easily and it caused some questionable arguments of a kind:
    "First of all, changing our server code in order to cause a bug, can not be considered a repeatable test case."
    But the bug reporter provided more details and clear test case, and had not agreed with the above. The bug was soon "Verified", even though no attempts to check (or explain) if 5.7 may be similarly affected were made in public, so we still have to wonder if this is a regression.
  • Bug #99180 - "Accessing freed memory in perfschema when aggregating status vars". Let me just quote Manuel Ung:
    "When aggregate_thread_status is called for other threads, it's possible for that thread to have exited and freed the THD between the time we check that the thread was valid, until the time we call get_thd_status_var."
    Ironically, he had to add some conditional my_sleep() call to server code to get a repeatable test case, and this was NOT a problem for a good Oracle engineer to verify the bug immediately. There are still bugs in my beloved Performance Schema. Who could imagine that?
  • Bug #99200 - "CREATE USER get stuck on SHOW PROCESSLIST and ps.threads on slave". So, slave may disclose some sensitive information. As demonstrated by Marcelo Altmann, if a query had been rewritten by the parser because it contains sensitive information, it won't be cleaned up when slave's SQL thread applies it, making it visible in SHOW PROCESSLIST and performance_schema.threads. Both 5.7.29 and 8.0.19 are affected. Bug reporter had contributed fixes for both versions. See also another bug in Performance Schema that he reported and contributed a fix for, Bug #99204 - "performance_schema threads table PROCESSLIST_INFO incorrect behaviour".
  • Bug #99205 - "xa prepare write binlog while execute failed". Then XA PREPARE from the binary log is executed on slave and... we are in troubles. The bug was reported by Phoenix Zhang. Unfortunately it is still not clear from this verified bug report if MySQL 5.6 and 5.7 are similarly affected (probably they are).
  • Bug #99206 - "lock_wait_timeout is taking twice time while adding new partition in table". Nice finding by Lalit Choudhary. MySQL 8.0.19 is not affected by this bug.
  • Bug #99244 - "bad performance in hash join because join with no join condition between tables". This optimizer bug (wrong join order when new hash join optimization is used) was reported by Chunyang Xu. But note also a simple test case and perf outputs contributed later by Shane Bester.
  • Bug #99257 - "Inconsistent output and wrong ORDER BY Sorting for query with LIMIT". Yet another 5.7-only optimizer bug found by Lalit Choudhary. My quick test presented in a comment shows that MySQL 5.6.27 produced correct results, so this is a regression bug.
  • Bug #99273 - "Item_ref in Having Clause points to invalid outer table field". This may lead to wrong results for simple enough queries. This regression bug that affects both MySQL 5.7.29 and 8.0.19 was reported by Shanshan Ying.
  • Bug #99286 - "Concurrent update cause crash in row_search_mvcc". This great bug report with code analysis, fix suggested and a test case with detailed instructions was created by Zkong Kong. It was promptly verified, but I still miss any documented attempt to check (by running the test case or by code analysis if the bug applies only to 5.7.29 or MySQL 8.0.x is also potentially affected. For now I'll try to remember this assertion line:
    InnoDB: Assertion failure in thread 47758491551488 in file rem0rec.cc line 586
    and "crash with row_search_mvcc in backtrace". Who know when I hit something similar and what fork/version it will be...
Rainy birthday at the seaside in Cap-d'Ail. Almost 15 years of my 50 were spent checking new MySQL bug reports almost every day.
To summarize:
  1. We still see many regression bugs in recent versions of MySQL 5.7.x and 8.0.x, often without "regression" tag.
  2. There are cases when the bug is verified, but there are no clearly documented checks if all GA versions are affected.
  3. XA transactions are (and had always been) a real disaster for modern MySQL versions in replication environments - all kinds of replication breakage and inconsistencies are possible.
  4. Check "MySQL Bug Reporter Hall of Fame" if you want to know who from MySQL Community had contributed a lot of bug reports over last 10 years.

Saturday, April 18, 2020

What mysql_upgrade really does in MariaDB, Part I

Recently Monty explained that with MariaDB in a general case (backup, proper shutdown, storage engines incompatibilities, Galera, async replication, and maybe few bugs and corner cases aside) it should be possible to easily and directly upgrade from one major version to the other, skipping any number of intermediate major versions in between.

Specifically, the idea was the there is no need to run mysql_upgrade from each intermediate release (10.0, 10.1, 10.2, 10.3 and 10.4) if your goal is to eventually upgrade from MariaDB (or MySQL) 5.5 to MariaDB 10.4, and that mysql_upgrade is designed and coded in a way to take all possible differences for the version it is coming from into account. It supposedly includes and does all intermediate changes needed for all versions since the first release of the tool, and skips errors (that may appear only if some specific change was already done and is not strictly needed). He added a (new) KB article explaining this also.

This was a kind of surprise for me, as until now MySQL manual, for example, clearly says:
"Upgrade that skips versions is not supported. For example, upgrading directly from MySQL 5.5 to 5.7 is not supported."
As a support engineer I always highlighted the need to upgrade step by step, one major version at a time, and check how the entire setup and, most importantly, applications work after each step. I prefer to know from the very beginning in what exact version the problem appeared during upgrade. The problem may be of any kind, and of them corrupted data and different execution plan for some important query are probably the worst if we do not find them early...

But without such checks and in cases when Galera or async replication is not involved and we do not know for sure that there is incompatible change introduced that is NOT fixed by mysql_upgrade, upgrading step by step may, indeed, be considered mostly a waste of time.

I do not trust statements, even if they come from Monty himself, without checking them. So I tried to search MariaDB and MySQL bugs databases for still active bugs that happens when major version is skipped (but mysql_upgrade from the target final version is properly executed and returned no errors). I found no such bugs so far.

Next steps before I agree with the idea of skipping major versions as a valid and supported way for some cases (when proper applications testing is not possible in the process for whatever reason and no replication and HA expectations are directly involved) are to try such upgrades for my testing instances and check what mysql_upgrade really does.

Today, for the purpose of this blog post, I tried (again, as I did such things in the past) to upgrade MySQL 5.5.54 I have at hand among numerous other versions directly to MariaDB 10.4.13 (current from GitHub), and check what happens. This is a very basic test and my hope, honestly, was that it may show some problem, as mysql.user implementation details, for example, are very different.

So, I started with initializing new 5.5 data directory (to clean up 10.4 databases I had there based on previous upgrade tests with more real data):
openxs@ao756:~/dbs/5.5$ rm -rf data/
openxs@ao756:~/dbs/5.5$ scripts/mysql_install_db --no-defaults
Installing MySQL system tables...
200418 12:59:37 [Note] Ignoring --secure-file-priv value as server is running with --bootstrap.
200418 12:59:37 [Note] ./bin/mysqld (mysqld 5.5.54) starting as process 28166 ...
OK
Filling help tables...
200418 12:59:39 [Note] Ignoring --secure-file-priv value as server is running with --bootstrap.
200418 12:59:39 [Note] ./bin/mysqld (mysqld 5.5.54) starting as process 28172 ...
OK

To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system

PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:

./bin/mysqladmin -u root password 'new-password'
./bin/mysqladmin -u root -h ao756 password 'new-password'

Alternatively you can run:
./bin/mysql_secure_installation

which will also give you the option of removing the test
databases and anonymous user created by default.  This is
strongly recommended for production servers.

See the manual for more instructions.

You can start the MySQL daemon with:
cd . ; ./bin/mysqld_safe &

You can test the MySQL daemon with mysql-test-run.pl
cd ./mysql-test ; perl mysql-test-run.pl

Please report any problems at http://bugs.mysql.com/

openxs@ao756:~/dbs/5.5$ ls data
mysql  performance_schema  test
Then to start MySQL 5.5 and create a simple table:
openxs@ao756:~/dbs/5.5$ bin/mysqld_safe --no-defaults --port=3309 --socket=/tmp/mysql55.sock &
[1] 28615
openxs@ao756:~/dbs/5.5$ 200418 13:00:22 mysqld_safe Logging to './data/ao756.err'.
200418 13:00:22 mysqld_safe Starting mysqld daemon with databases from ./data

openxs@ao756:~/dbs/5.5$ bin/mysql -uroot test --socket=/tmp/mysql55.sock       
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.54 MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table t1(id int auto_increment primary key, c1 timestamp default current_timestamp(), c2 varchar(20));
Query OK, 0 rows affected (0.09 sec)

mysql> insert into t1(c2) values ('abc'), ('def');Query OK, 2 rows affected (0.04 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from t1;
+----+---------------------+------+
| id | c1                  | c2   |
+----+---------------------+------+
|  1 | 2020-04-18 13:02:47 | abc  |
|  2 | 2020-04-18 13:02:47 | def  |
+----+---------------------+------+
2 rows in set (0.00 sec)

mysql> show tables from mysql like 'innodb%';
Empty set (0.00 sec)

mysql> exit
Bye
As you can see, I added some data and checked that there are no mysql.innodb% tables (that store InnoDB persistent statistics) in 5.5.54. Then I proceeded to shutdown (without setting innodb_fast_shutdown = 0 explicitly, a mistake that could cost me a lot of troubles if the server was under load) and started 10.4.13 with the same datadir. As simple as that:
openxs@ao756:~/dbs/5.5$ bin/mysqladmin -uroot shutdown --socket=/tmp/mysql55.sock
200418 13:03:37 mysqld_safe mysqld from pid file ./data/ao756.pid ended
[1]+  Done                    bin/mysqld_safe --no-defaults --port=3309 --socket=/tmp/mysql55.sock
openxs@ao756:~/dbs/5.5$ cd ../maria10.4/
openxs@ao756:~/dbs/maria10.4$ bin/mysqld_safe --no-defaults --datadir=/home/openxs/dbs/5.5/data --general_log=1 --port=3309 --socket=/tmp/mariadb.sock &       
[1] 30832
openxs@ao756:~/dbs/maria10.4$ 200418 13:04:50 mysqld_safe Logging to '/home/openxs/dbs/5.5/data/ao756.err'.
200418 13:04:50 mysqld_safe Starting mysqld daemon with databases from /home/openxs/dbs/5.5/data

openxs@ao756:~/dbs/maria10.4$ bin/mysql -uroot test --socket=/tmp/mariadb.sock
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.4.13-MariaDB-log Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [test]> show tables from mysql like 'innodb%';
Empty set (0,001 sec)

MariaDB [test]> select * from t1;+----+---------------------+------+
| id | c1                  | c2   |
+----+---------------------+------+
|  1 | 2020-04-18 13:02:47 | abc  |
|  2 | 2020-04-18 13:02:47 | def  |
+----+---------------------+------+
2 rows in set (0,001 sec)

MariaDB [test]> show variables like 'general%';
+------------------+-----------+
| Variable_name    | Value     |
+------------------+-----------+
| general_log      | ON        |
| general_log_file | ao756.log |
+------------------+-----------+
2 rows in set (0,003 sec)

MariaDB [test]> exit
Bye
Note that I just started 10.4 with default settings (and general log enables, so that I know what SQL statements were executed), and there is no obvious problem to start. I can even read the data from the table, even though InnoDB persistent statistics tables do not exist.

Surely I had plenty of suspicious messages in the error log:
openxs@ao756:~/dbs/maria10.4$ tail -200 ~/dbs/5.5/data/ao756.err
...
200418 13:00:25 [Note] ./bin/mysqld: ready for connections.
Version: '5.5.54'  socket: '/tmp/mysql55.sock'  port: 3309  MySQL Community Server (GPL)
200418 13:03:35 [Note] ./bin/mysqld: Normal shutdown

200418 13:03:35 [Note] Event Scheduler: Purging the queue. 0 events
200418 13:03:35  InnoDB: Starting shutdown...
200418 13:03:37  InnoDB: Shutdown completed; log sequence number 1599156
200418 13:03:37 [Note] ./bin/mysqld: Shutdown complete

200418 13:03:37 mysqld_safe mysqld from pid file ./data/ao756.pid ended
2020-04-18 13:04:51 0 [Note] InnoDB: Using Linux native AIO
2020-04-18 13:04:51 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2020-04-18 13:04:51 0 [Note] InnoDB: Uses event mutexes
2020-04-18 13:04:51 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2020-04-18 13:04:51 0 [Note] InnoDB: Number of pools: 1
2020-04-18 13:04:51 0 [Note] InnoDB: Using SSE2 crc32 instructions
2020-04-18 13:04:51 0 [Note] mysqld: O_TMPFILE is not supported on /tmp (disabling future attempts)
2020-04-18 13:04:51 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2020-04-18 13:04:51 0 [Note] InnoDB: Completed initialization of buffer pool
2020-04-18 13:04:51 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2020-04-18 13:04:51 0 [Note] InnoDB: Upgrading redo log: 2*50331648 bytes; LSN=1599156
2020-04-18 13:04:51 0 [Note] InnoDB: Starting to delete and rewrite log files.
2020-04-18 13:04:51 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 50331648 bytes
2020-04-18 13:04:51 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 50331648 bytes
2020-04-18 13:04:51 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
2020-04-18 13:04:51 0 [Note] InnoDB: New log files created, LSN=1599156
2020-04-18 13:04:51 0 [Note] InnoDB: Creating tablespace and datafile system tables.
2020-04-18 13:04:51 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
2020-04-18 13:04:51 0 [Note] InnoDB: Creating sys_virtual system tables.
2020-04-18 13:04:51 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2020-04-18 13:04:51 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2020-04-18 13:04:51 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2020-04-18 13:04:51 0 [Note] InnoDB: Waiting for purge to start
2020-04-18 13:04:51 0 [Note] InnoDB: 10.4.13 started; log sequence number 1599156; transaction id 777
2020-04-18 13:04:51 0 [Note] InnoDB: Loading buffer pool(s) from /home/openxs/dbs/5.5/data/ib_buffer_pool
2020-04-18 13:04:51 0 [Note] InnoDB: Cannot open '/home/openxs/dbs/5.5/data/ib_buffer_pool' for reading: No such file or directory
2020-04-18 13:04:51 0 [Note] Plugin 'FEEDBACK' is disabled.
2020-04-18 13:04:51 0 [Note] Server socket created on IP: '::'.
2020-04-18 13:04:51 0 [ERROR] Missing system table mysql.roles_mapping; please run mysql_upgrade to create it
2020-04-18 13:04:51 0 [ERROR] Incorrect definition of table mysql.event: expected column 'sql_mode' at position 14 to have type set('REAL_AS_FLOAT','PIPES_AS_CONCAT','ANSI_QUOTES','IGNORE_SPACE','IGNORE_BAD_TABLE_OPTIONS','ONLY_FULL_GROUP_BY','NO_UNSIGNED_SUBTRACTION','NO_DIR_IN_CREATE','POSTGRESQL','ORACLE','MSSQL','DB2','MAXDB','NO_KEY_OPTIONS','NO_TABLE_OPTIONS','NO_FIELD_OPTIONS','MYSQL323','MYSQL40','ANSI','NO_AUTO_VALUE_ON_ZERO','NO_BACKSLASH_ESCAPES','STRICT_TRANS_TABLES','STRICT_ALL_TABLES','NO_ZERO_IN_DATE','NO_ZERO_DATE','INVALID_DATES','ERROR_FOR_DIVISION_BY_ZERO','TRADITIONAL','NO_AUTO_CREATE_USER','HIGH_NOT_PRECEDENCE','NO_ENGINE_SUBSTITUTION','PAD_CHAR_TO_FULL_LENGTH','EMPTY_STRING_IS_NULL','SIMULTANEOUS_ASSIGNMENT'), found type set('REAL_AS_FLOAT','PIPES_AS_CONCAT','ANSI_QUOTES','IGNORE_SPACE','NOT_USED','ONLY_FULL_GROUP_BY','NO_UNSIGNED_SUBTRACTION','NO_DIR_IN_CREATE','POSTGRESQL','ORACLE','MSSQL','DB2','MAXDB','NO_KEY_OPTIONS','NO_TABLE_OPTIONS','NO_FIELD_OPTIONS','MYSQL323','MYSQL40','ANSI','NO_AUTO_VALUE_ON_ZERO','N...
2020-04-18 13:04:51 0 [ERROR] mysqld: Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.
2020-04-18 13:04:51 6 [Warning] Failed to load slave replication state from table mysql.gtid_slave_pos: 1146: Table 'mysql.gtid_slave_pos' doesn't exist

2020-04-18 13:04:52 0 [Note] Reading of all Master_info entries succeeded
2020-04-18 13:04:52 0 [Note] Added new Master_info '' to hash table
2020-04-18 13:04:52 0 [Note] /home/openxs/dbs/maria10.4/bin/mysqld: ready for connections.
Version: '10.4.13-MariaDB-log'  socket: '/tmp/mariadb.sock'  port: 3309  Source distribution
2020-04-18 13:05:11 8 [ERROR] InnoDB: Table `mysql`.`innodb_table_stats` not found.
There are notes, warnings and errors to pay attention to! Every other day I get error log from some customer with messages like:
2020-04-18 13:05:11 8 [ERROR] InnoDB: Table `mysql`.`innodb_table_stats` not found.
and others similar to those above, and it means that they had never run mysql_upgrade at all after some major version upgrade!

Next step was to run mysql_upgrade as some of the messages suggest:
openxs@ao756:~/dbs/maria10.4$ bin/mysql_upgrade -uroot test --socket=/tmp/mariadb.sock
MariaDB upgrade detected
Phase 1/7: Checking and upgrading mysql database
Processing databases
mysql
mysql.columns_priv                                 OK
mysql.db                                           OK
mysql.event                                        OK
mysql.func                                         OK
mysql.help_category                                OK
mysql.help_keyword                                 OK
mysql.help_relation                                OK
mysql.help_topic                                   OK
mysql.host                                         OK
mysql.ndb_binlog_index                             OK
mysql.plugin                                       OK
mysql.proc                                         OK
mysql.procs_priv                                   OK
mysql.proxies_priv                                 OK
mysql.servers                                      OK
mysql.tables_priv                                  OK
mysql.time_zone                                    OK
mysql.time_zone_leap_second                        OK
mysql.time_zone_name                               OK
mysql.time_zone_transition                         OK
mysql.time_zone_transition_type                    OK
mysql.user                                         OK
Upgrading from a version before MariaDB-10.1
Phase 2/7: Installing used storage engines
Checking for tables with unknown storage engine
Phase 3/7: Fixing views from mysql
Phase 4/7: Running 'mysql_fix_privilege_tables'
Phase 5/7: Fixing table and database names
Phase 6/7: Checking and upgrading tables
Processing databases
information_schema
performance_schema
test
test.t1                                            OK
Phase 7/7: Running 'FLUSH PRIVILEGES'
OK
The output is not that long for the case of only one additional table created, but the process took few seconds of time. I have missing tables now:
openxs@ao756:~/dbs/maria10.4$ bin/mysql --no-defaults -uroot --socket=/tmp/mariadb.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 24
Server version: 10.4.13-MariaDB-log Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show tables from mysql like 'innodb%';
+---------------------------+
| Tables_in_mysql (innodb%) |
+---------------------------+
| innodb_index_stats        |
| innodb_table_stats        |
+---------------------------+
2 rows in set (0,001 sec)
The final upgrade step would be to restart and check the error log after restart:
openxs@ao756:~/dbs/maria10.4$ bin/mysql --no-defaults -uroot --socket=/tmp/mariadb.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.4.13-MariaDB Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> select * from t1;
ERROR 1046 (3D000): No database selected
MariaDB [(none)]> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [test]> select * from t1;
+----+---------------------+------+
| id | c1                  | c2   |
+----+---------------------+------+
|  1 | 2020-04-18 13:02:47 | abc  |
|  2 | 2020-04-18 13:02:47 | def  |
+----+---------------------+------+
2 rows in set (0,002 sec)
To confirm that running mysql_upgrade really helped, let me check the error log content since last startup:
openxs@ao756:~/dbs/maria10.4$ tail -80 ~/dbs/5.5/data/ao756.err
...
2020-04-18 13:04:52 0 [Note] /home/openxs/dbs/maria10.4/bin/mysqld: ready for connections.
Version: '10.4.13-MariaDB-log'  socket: '/tmp/mariadb.sock'  port: 3309  Source distribution
2020-04-18 13:05:11 8 [ERROR] InnoDB: Table `mysql`.`innodb_table_stats` not found.
2020-04-18 15:21:01 0 [Note] /home/openxs/dbs/maria10.4/bin/mysqld (initiated by: root[root] @ localhost []): Normal shutdown
2020-04-18 15:21:01 0 [Note] Event Scheduler: Purging the queue. 0 events
2020-04-18 15:21:01 0 [Note] InnoDB: FTS optimize thread exiting.
2020-04-18 15:21:01 0 [Note] InnoDB: Starting shutdown...
2020-04-18 15:21:01 0 [Note] InnoDB: Dumping buffer pool(s) to /home/openxs/dbs/5.5/data/ib_buffer_pool
2020-04-18 15:21:01 0 [Note] InnoDB: Buffer pool(s) dump completed at 200418 15:21:01
2020-04-18 15:21:03 0 [Note] InnoDB: Shutdown completed; log sequence number 1617395; transaction id 791
2020-04-18 15:21:03 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2020-04-18 15:21:03 0 [Note] /home/openxs/dbs/maria10.4/bin/mysqld: Shutdown complete

2020-04-18 15:21:13 0 [Note] InnoDB: Using Linux native AIO
2020-04-18 15:21:13 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2020-04-18 15:21:13 0 [Note] InnoDB: Uses event mutexes
2020-04-18 15:21:13 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2020-04-18 15:21:13 0 [Note] InnoDB: Number of pools: 1
2020-04-18 15:21:13 0 [Note] InnoDB: Using SSE2 crc32 instructions
2020-04-18 15:21:13 0 [Note] mysqld: O_TMPFILE is not supported on /tmp (disabling future attempts)
2020-04-18 15:21:13 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2020-04-18 15:21:13 0 [Note] InnoDB: Completed initialization of buffer pool
2020-04-18 15:21:13 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2020-04-18 15:21:13 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
2020-04-18 15:21:13 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2020-04-18 15:21:13 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2020-04-18 15:21:13 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2020-04-18 15:21:13 0 [Note] InnoDB: Waiting for purge to start
2020-04-18 15:21:13 0 [Note] InnoDB: 10.4.13 started; log sequence number 1617395; transaction id 791
2020-04-18 15:21:13 0 [Note] InnoDB: Loading buffer pool(s) from /home/openxs/dbs/5.5/data/ib_buffer_pool
2020-04-18 15:21:13 0 [Note] InnoDB: Buffer pool(s) load completed at 200418 15:21:13
2020-04-18 15:21:13 0 [Note] Plugin 'FEEDBACK' is disabled.
2020-04-18 15:21:13 0 [Note] Server socket created on IP: '::'.
2020-04-18 15:21:13 0 [Note] Reading of all Master_info entries succeeded
2020-04-18 15:21:13 0 [Note] Added new Master_info '' to hash table
2020-04-18 15:21:13 0 [Note] /home/openxs/dbs/maria10.4/bin/mysqld: ready for connections.
Version: '10.4.13-MariaDB'  socket: '/tmp/mariadb.sock'  port: 3310  Source distribution
openxs@ao756:~/dbs/maria10.4$
I see just notes, no errors or warnings above after the restart. It means I can continue with applications testing and consider upgrade successful so far. Do NOT assume you are done with upgrade until you see such a clean startup messages log!

Now let me check the general query log generated:
openxs@ao756:~/dbs/maria10.4$ cat ~/dbs/5.5/data/ao756.log | more
/home/openxs/dbs/maria10.4/bin/mysqld, Version: 10.4.13-MariaDB-log (Source dist
ribution). started with:
Tcp port: 3309  Unix socket: /tmp/mariadb.sock
Time                Id Command  Argument
200418 13:05:11      8 Connect  root@localhost as anonymous on test
                     8 Query    show databases
                     8 Query    show tables
                     8 Field List       t1
                     8 Query    select @@version_comment limit 1
200418 13:05:14      8 Query    show tables from mysql like 'innodb%'
200418 13:05:19      8 Query    select * from t1
200418 13:05:33      8 Query    show variables like 'general%'
200418 13:05:39      8 Quit
200418 13:05:49      9 Connect  root@localhost as anonymous on mysql
                     9 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                     9 Query    show variables like 'datadir'
                     9 Quit
                    10 Connect  root@localhost as anonymous on mysql
                    10 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                    10 Query    show variables like 'version'
                    10 Quit
                    11 Connect  root@localhost as anonymous on mysql
                    11 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                    11 Query    show create table mysql.event
                    11 Quit
Connection 8 above was my user session. Starting from 9, we see queries executed by mysql_upgrade (and maybe other utilities it may call, more on them in the next part.. We see thjat it gets the datadir and current serverr version, and in all cases by default it disables writing changes to the binary log or replicating them to other Galera nodes, if any. Let me continue:
                    12 Connect  root@localhost as anonymous on
                    12 Query    SET WSREP_ON=0
                    12 Query    SET SQL_LOG_BIN=0
                    12 Init DB  mysql
                    12 Query    SHOW /*!50002 FULL*/ TABLES
                    12 Query    CHECK TABLE `columns_priv`  FOR UPGRADE
                    12 Query    CHECK TABLE `db`  FOR UPGRADE
                    12 Query    CHECK TABLE `event`  FOR UPGRADE
...
                    12 Query    CHECK TABLE `user`  FOR UPGRADE
                    12 Quit
                    13 Connect  root@localhost as anonymous on mysql
                    13 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                    13 Query    show create table mysql.user
                    13 Quit
                    14 Connect  root@localhost as anonymous on mysql
                    14 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                    14 Query    SELECT DISTINCT LOWER(engine) AS c1 FROM information_schema.tables WHERE table_comment LIKE 'Unknown storage engine%' ORDER BY c1
                    14 Quit
We see running CHECK TABLE ... FOR UPGRADE for each table that exists in the mysql database and then checking structure of the mysql.user table and looking for the list of unknown storage engines. 

Then we get lists of tables in each real database and proceed with changes to table structures in the mysql database:
                    15 Connect  root@localhost as anonymous on
                    15 Query    SET WSREP_ON=0
                    15 Query    SET SQL_LOG_BIN=0
                    15 Query    SHOW DATABASES
                    15 Init DB  mysql
                    15 Query    SHOW /*!50002 FULL*/ TABLES                    15 Init DB  test
                    15 Query    SHOW /*!50002 FULL*/ TABLES
                    15 Quit
200418 13:05:50     16 Connect  root@localhost as anonymous on mysql
                    16 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                    16 Query    set sql_mode=''
                    16 Query    set storage_engine=Aria
                    16 Query    set enforce_storage_engine=NULL
                    16 Query    set @have_innodb= (select count(engine) from inf
ormation_schema.engines where engine='INNODB' and support != 'NO')
                    16 Query    ALTER TABLE user add File_priv enum('N','Y') COL
LATE utf8_general_ci DEFAULT 'N' NOT NULL
                    16 Query    SET @hadGrantPriv:=0
                    16 Query    SELECT @hadGrantPriv:=1 FROM user WHERE Grant_pr
iv IS NOT NULL
                    16 Query    ALTER TABLE user add Grant_priv enum('N','Y') CO
LLATE utf8_general_ci DEFAULT 'N' NOT NULL,
                 add References_priv enum('N','Y') COLLATE utf8_general_ci DEFAU
LT 'N' NOT NULL,
                 add Index_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N
' NOT NULL,
                 add Alter_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N
' NOT NULL
                    16 Query    ALTER TABLE db add Grant_priv enum('N','Y') COLL
ATE utf8_general_ci DEFAULT 'N' NOT NULL,
               add References_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT
 'N' NOT NULL,
               add Index_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N'
NOT NULL,
               add Alter_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N'
NOT NULL
                    16 Query    UPDATE user SET Grant_priv=File_priv,References_
priv=Create_priv,Index_priv=Create_priv,Alter_priv=Create_priv WHERE @hadGrantPriv = 0
                    16 Query    UPDATE db SET References_priv=Create_priv,Index_
priv=Create_priv,Alter_priv=Create_priv WHERE @hadGrantPriv = 0
                    16 Query    ALTER TABLE user
ADD ssl_type enum('','ANY','X509', 'SPECIFIED') DEFAULT '' NOT NULL,
ADD ssl_cipher BLOB NOT NULL,
ADD x509_issuer BLOB NOT NULL,
ADD x509_subject BLOB NOT NULL
                    16 Query    ALTER TABLE user MODIFY ssl_type enum('','ANY','
X509', 'SPECIFIED') DEFAULT '' NOT NULL
                    16 Query    ALTER TABLE tables_priv
  ADD KEY Grantor (Grantor)
                    16 Query    ALTER TABLE tables_priv
  MODIFY Host char(60) NOT NULL default '',
  MODIFY Db char(64) NOT NULL default '',
  MODIFY User char(80) binary NOT NULL default '',
  MODIFY Table_name char(64) NOT NULL default '',
  MODIFY Grantor char(141) COLLATE utf8_bin NOT NULL default '',
  ENGINE=Aria,
  CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin
                    16 Query    ALTER TABLE tables_priv
  MODIFY Column_priv set('Select','Insert','Update','References')
    COLLATE utf8_general_ci DEFAULT '' NOT NULL,
  MODIFY Table_priv set('Select','Insert','Update','Delete','Create',
                        'Drop','Grant','References','Index','Alter',
                        'Create View','Show view','Trigger','Delete versioning r
ows')
    COLLATE utf8_general_ci DEFAULT '' NOT NULL,
  COMMENT='Table privileges'
200418 13:05:51     16 Query    ALTER TABLE columns_priv
  CHANGE Type Column_priv set('Select','Insert','Update','References')
    COLLATE utf8_general_ci DEFAULT '' NOT NULL
                    16 Query    ALTER TABLE columns_priv
  MODIFY Host char(60) NOT NULL default '',
  MODIFY Db char(64) NOT NULL default '',
  MODIFY User char(80) binary NOT NULL default '',
  MODIFY Table_name char(64) NOT NULL default '',
  MODIFY Column_name char(64) NOT NULL default '',
  ENGINE=Aria,
  CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin,
  COMMENT='Column privileges'
                    16 Query    ALTER TABLE columns_priv
  MODIFY Column_priv set('Select','Insert','Update','References')
    COLLATE utf8_general_ci DEFAULT '' NOT NULL
200418 13:05:52     16 Query    ALTER TABLE func add type enum ('function','aggregate') COLLATE utf8_general_ci NOT NULL
                    16 Query    SET @hadShowDbPriv:=0
                    16 Query    SELECT @hadShowDbPriv:=1 FROM user WHERE Show_db_priv IS NOT NULL
                    16 Query    ALTER TABLE user
ADD Show_db_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N' NOT NULL AFTER Alter_priv,
ADD Super_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N' NOT NULL AFTER
Show_db_priv,
ADD Create_tmp_table_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N' NOT
NULL AFTER Super_priv,
ADD Lock_tables_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N' NOT NULL
AFTER Create_tmp_table_priv,
...
Among other things we switch tables to Aria engine as this is one of the ideas of MariaDB 10.4. We also care about proper authentication plugin for the case of upgrade from MySQL:
...
                    16 Query    ALTER TABLE user
  ADD Password char(41) character set latin1 collate latin1_bin NOT NULL default
 '' AFTER User
                    16 Query    UPDATE user
  SET plugin='unix_socket' WHERE plugin='auth_socket'
                    16 Query    DELETE FROM plugin
  WHERE name='auth_socket'
...

A lot of changes are done by that connection 16, they are all coming mostly from one script that we'll check in the next part.

There are statements to create the tables needed as well:
...
  CREATE TABLE IF NOT EXISTS global_priv (Host char(60) binary DEFAULT '', User char(80) binary DEFAULT '', Priv JSON NOT NULL DEFAULT '{}' CHECK(JSON_VALID(Priv)), PRIMARY KEY Host (Host,User)) engine=Aria transactional=1 CHARACTER SET utf8 COLLATE utf8_bin comment='Users and global privileges'
  SELECT Host, User, JSON_COMPACT(JSON_OBJECT('access',
                             1*('Y'=Select_priv)+
                             2*('Y'=Insert_priv)+
                             4*('Y'=Update_priv)+
                             8*('Y'=Delete_priv)+
                            16*('Y'=Create_priv)+
                            32*('Y'=Drop_priv)+
                            64*('Y'=Reload_priv)+
                           128*('Y'=Shutdown_priv)+
                           256*('Y'=Process_priv)+
                           512*('Y'=File_priv)+
                          1024*('Y'=Grant_priv)+
                          2048*('Y'=References_priv)+
                          4096*('Y'=Index_priv)+
                          8192*('Y'=Alter_priv)+
                         16384*('Y'=Show_db_priv)+
                         32768*('Y'=Super_priv)+
                         65536*('Y'=Create_tmp_table_priv)+
                        131072*('Y'=Lock_tables_priv)+
                        262144*('Y'=Execute_priv)+
                        524288*('Y'=Repl_slave_priv)+
                       1048576*('Y'=Repl_client_priv)+
                       2097152*('Y'=Create_view_priv)+
                       4194304*('Y'=Show_view_priv)+
                       8388608*('Y'=Create_routine_priv)+
                      16777216*('Y'=Alter_routine_priv)+
                      33554432*('Y'=Create_user_priv)+
                      67108864*('Y'=Event_priv)+
                     134217728*('Y'=Trigger_priv)+
                     268435456*('Y'=Create_tablespace_priv)+
                     536870912*('Y'=Delete_history_priv),
                    'ssl_type', ssl_type-1,
                    'ssl_cipher', ssl_cipher,
                    'x509_issuer', x509_issuer,
                    'x509_subject', x509_subject,
                    'max_questions', max_questions,
                    'max_updates', max_updates,
                    'max_connections', max_connections,
                    'max_user_connections', max_user_connections,
                    'max_statement_time', max_statement_time,
                    'plugin', if(plugin>'',plugin,if(length(password)=16,'mysql_
old_password','mysql_native_password')),
                    'authentication_string', if(plugin>'' and authentication_str
ing>'',authentication_string,password),
                    'password_last_changed', if(password_expired='Y', 0, UNIX_TI
MESTAMP(password_last_changed)),
                    'password_lifetime', ifnull(password_lifetime, -1),
                    'account_locked', 'Y'=account_locked,
                    'default_role', default_role,
                    'is_role', 'Y'=is_role)) as Priv
  FROM user;
  DROP TABLE user;
END IF
200418 13:06:41     16 Query    set sql_mode=''
                    16 Query    set @orig_storage_engine=@@storage_engine
                    16 Query    set storage_engine=Aria
                    16 Query    set system_versioning_alter_history=keep
                    16 Query    set @have_innodb= (select count(engine) from inf
ormation_schema.engines where engine='INNODB' and support != 'NO')
                    16 Query    SET @innodb_or_aria=IF(@have_innodb <> 0, 'InnoD
B', 'Aria')
                    16 Query    CREATE TABLE IF NOT EXISTS db (   Host char(60)
binary DEFAULT '' NOT NULL, Db char(64) binary DEFAULT '' NOT NULL, User char(80
) binary DEFAULT '' NOT NULL, Select_priv enum('N','Y') COLLATE utf8_general_ci
DEFAULT 'N' NOT NULL, Insert_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT
'N' NOT NULL, Update_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N' NOT
NULL, Delete_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N' NOT NULL, Cr
eate_priv enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N' NOT NULL, Drop_priv
...


Note that CREATE TABLE IF NOT EXISTS syntax is used, so if ALTERs failed before as some table was missing at a later step it is created with proper structure for 10.4.

Here is the infamous mysql.user view creation later:
...
                    16 Query    set @had_db_table= @@warning_count != 0
                    16 Query    CREATE TABLE IF NOT EXISTS global_priv (Host cha
r(60) binary DEFAULT '', User char(80) binary DEFAULT '', Priv JSON NOT NULL DEF
AULT '{}' CHECK(JSON_VALID(Priv)), PRIMARY KEY Host (Host,User)) engine=Aria tra
nsactional=1 CHARACTER SET utf8 COLLATE utf8_bin comment='Users and global privi
leges'
                    16 Query    CREATE DEFINER=root@localhost SQL SECURITY DEFIN
ER VIEW IF NOT EXISTS user AS SELECT
  Host,
  User,
  IF(JSON_VALUE(Priv, '$.plugin') IN ('mysql_native_password', 'mysql_old_passwo
rd'), IFNULL(JSON_VALUE(Priv, '$.authentication_string'), ''), '') AS Password,
  IF(JSON_VALUE(Priv, '$.access') &         1, 'Y', 'N') AS Select_priv,
  IF(JSON_VALUE(Priv, '$.access') &         2, 'Y', 'N') AS Insert_priv,
  IF(JSON_VALUE(Priv, '$.access') &         4, 'Y', 'N') AS Update_priv,
  IF(JSON_VALUE(Priv, '$.access') &         8, 'Y', 'N') AS Delete_priv,
  IF(JSON_VALUE(Priv, '$.access') &        16, 'Y', 'N') AS Create_priv,
  IF(JSON_VALUE(Priv, '$.access') &        32, 'Y', 'N') AS Drop_priv,
  IF(JSON_VALUE(Priv, '$.access') &        64, 'Y', 'N') AS Reload_priv,
  IF(JSON_VALUE(Priv, '$.access') &       128, 'Y', 'N') AS Shutdown_priv,
  IF(JSON_VALUE(Priv, '$.access') &       256, 'Y', 'N') AS Process_priv,
  IF(JSON_VALUE(Priv, '$.access') &       512, 'Y', 'N') AS File_priv,
  IF(JSON_VALUE(Priv, '$.access') &      1024, 'Y', 'N') AS Grant_priv,
  IF(JSON_VALUE(Priv, '$.access') &      2048, 'Y', 'N') AS References_priv,
  IF(JSON_VALUE(Priv, '$.access') &      4096, 'Y', 'N') AS Index_priv,
  IF(JSON_VALUE(Priv, '$.access') &      8192, 'Y', 'N') AS Alter_priv,
  IF(JSON_VALUE(Priv, '$.access') &     16384, 'Y', 'N') AS Show_db_priv,
  IF(JSON_VALUE(Priv, '$.access') &     32768, 'Y', 'N') AS Super_priv,
  IF(JSON_VALUE(Priv, '$.access') &     65536, 'Y', 'N') AS Create_tmp_table_pri
v,
  IF(JSON_VALUE(Priv, '$.access') &    131072, 'Y', 'N') AS Lock_tables_priv,
  IF(JSON_VALUE(Priv, '$.access') &    262144, 'Y', 'N') AS Execute_priv,
  IF(JSON_VALUE(Priv, '$.access') &    524288, 'Y', 'N') AS Repl_slave_priv,
  IF(JSON_VALUE(Priv, '$.access') &   1048576, 'Y', 'N') AS Repl_client_priv,
  IF(JSON_VALUE(Priv, '$.access') &   2097152, 'Y', 'N') AS Create_view_priv,
  IF(JSON_VALUE(Priv, '$.access') &   4194304, 'Y', 'N') AS Show_view_priv,
  IF(JSON_VALUE(Priv, '$.access') &   8388608, 'Y', 'N') AS Create_routine_priv,
  IF(JSON_VALUE(Priv, '$.access') &  16777216, 'Y', 'N') AS Alter_routine_priv,
  IF(JSON_VALUE(Priv, '$.access') &  33554432, 'Y', 'N') AS Create_user_priv,
  IF(JSON_VALUE(Priv, '$.access') &  67108864, 'Y', 'N') AS Event_priv,
  IF(JSON_VALUE(Priv, '$.access') & 134217728, 'Y', 'N') AS Trigger_priv,
  IF(JSON_VALUE(Priv, '$.access') & 268435456, 'Y', 'N') AS Create_tablespace_pr
iv,
  IF(JSON_VALUE(Priv, '$.access') & 536870912, 'Y', 'N') AS Delete_history_priv,
  ELT(IFNULL(JSON_VALUE(Priv, '$.ssl_type'), 0) + 1, '', 'ANY','X509', 'SPECIFIE
D') AS ssl_type,
  IFNULL(JSON_VALUE(Priv, '$.ssl_cipher'), '') AS ssl_cipher,
  IFNULL(JSON_VALUE(Priv, '$.x509_issuer'), '') AS x509_issuer,
  IFNULL(JSON_VALUE(Priv, '$.x509_subject'), '') AS x509_subject,
  CAST(IFNULL(JSON_VALUE(Priv, '$.max_questions'), 0) AS UNSIGNED) AS max_questi
ons,
  CAST(IFNULL(JSON_VALUE(Priv, '$.max_updates'), 0) AS UNSIGNED) AS max_updates,
  CAST(IFNULL(JSON_VALUE(Priv, '$.max_connections'), 0) AS UNSIGNED) AS max_conn
ections,
  CAST(IFNULL(JSON_VALUE(Priv, '$.max_user_connections'), 0) AS SIGNED) AS max_u
ser_connections,
  IFNULL(JSON_VALUE(Priv, '$.plugin'), '') AS plugin,
  IFNULL(JSON_VALUE(Priv, '$.authentication_string'), '') AS authentication_stri
ng,
  'N' AS password_expired,
  ELT(IFNULL(JSON_VALUE(Priv, '$.is_role'), 0) + 1, 'N', 'Y') AS is_role,
  IFNULL(JSON_VALUE(Priv, '$.default_role'), '') AS default_role,
  CAST(IFNULL(JSON_VALUE(Priv, '$.max_statement_time'), 0.0) AS DECIMAL(12,6)) A
S max_statement_time
  FROM global_priv
                    16 Query    set @had_user_table= @@warning_count != 0
                    16 Query    CREATE TABLE IF NOT EXISTS roles_mapping ( Host
char(60) binary DEFAULT '' NOT NULL, User char(80) binary DEFAULT '' NOT NULL, Role char(80) binary DEFAULT '' NOT NULL, Admin_option enum('N','Y') COLLATE utf8_general_ci DEFAULT 'N' NOT NULL, UNIQUE (Host, User, Role)) engine=Aria transactional=1 CHARACTER SET utf8 COLLATE utf8_bin comment='Granted roles'
...
Finally, of those details I am interested in, later in the general query log we see missing statistics tables creation, also for engine-independent persistent statistics:
...
                    16 Query    SET @create_innodb_table_stats="CREATE TABLE IF
NOT EXISTS innodb_table_stats (
        database_name                   VARCHAR(64) NOT NULL,
        table_name                      VARCHAR(199) NOT NULL,
        last_update                     TIMESTAMP NOT NULL DEFAULT CURRENT_TIMES
TAMP ON UPDATE CURRENT_TIMESTAMP,
        n_rows                          BIGINT UNSIGNED NOT NULL,
        clustered_index_size            BIGINT UNSIGNED NOT NULL,
        sum_of_other_index_sizes        BIGINT UNSIGNED NOT NULL,
        PRIMARY KEY (database_name, table_name)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0"
                    16 Query    SET @create_innodb_index_stats="CREATE TABLE IF
NOT EXISTS innodb_index_stats (
        database_name                   VARCHAR(64) NOT NULL,
        table_name                      VARCHAR(199) NOT NULL,
        index_name                      VARCHAR(64) NOT NULL,
        last_update                     TIMESTAMP NOT NULL DEFAULT CURRENT_TIMES
TAMP ON UPDATE CURRENT_TIMESTAMP,
        /* there are at least:
        stat_name='size'
        stat_name='n_leaf_pages'
        stat_name='n_diff_pfx%' */
        stat_name                       VARCHAR(64) NOT NULL,
        stat_value                      BIGINT UNSIGNED NOT NULL,
        sample_size                     BIGINT UNSIGNED,
        stat_description                VARCHAR(1024) NOT NULL,
        PRIMARY KEY (database_name, table_name, index_name, stat_name)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0"
...
                    16 Query    SET @str=IF(@have_innodb <> 0, @create_innodb_ta
ble_stats, "SET @dummy = 0")
                    16 Query    PREPARE stmt FROM @str
                    16 Prepare  CREATE TABLE IF NOT EXISTS innodb_table_stats (
        database_name                   VARCHAR(64) NOT NULL,
        table_name                      VARCHAR(199) NOT NULL,
        last_update                     TIMESTAMP NOT NULL DEFAULT CURRENT_TIMES
TAMP ON UPDATE CURRENT_TIMESTAMP,
        n_rows                          BIGINT UNSIGNED NOT NULL,
        clustered_index_size            BIGINT UNSIGNED NOT NULL,
        sum_of_other_index_sizes        BIGINT UNSIGNED NOT NULL,
        PRIMARY KEY (database_name, table_name)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0
                    16 Query    EXECUTE stmt
                    16 Execute  CREATE TABLE IF NOT EXISTS innodb_table_stats (
        database_name                   VARCHAR(64) NOT NULL,
        table_name                      VARCHAR(199) NOT NULL,
        last_update                     TIMESTAMP NOT NULL DEFAULT CURRENT_TIMES
TAMP ON UPDATE CURRENT_TIMESTAMP,
        n_rows                          BIGINT UNSIGNED NOT NULL,
        clustered_index_size            BIGINT UNSIGNED NOT NULL,
        sum_of_other_index_sizes        BIGINT UNSIGNED NOT NULL,
        PRIMARY KEY (database_name, table_name)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0
                    16 Query    DROP PREPARE stmt
                    16 Query    SET @str=IF(@have_innodb <> 0, @create_innodb_in
dex_stats, "SET @dummy = 0")
                    16 Query    PREPARE stmt FROM @str
                    16 Prepare  CREATE TABLE IF NOT EXISTS innodb_index_stats (
        database_name                   VARCHAR(64) NOT NULL,
        table_name                      VARCHAR(199) NOT NULL,
        index_name                      VARCHAR(64) NOT NULL,
        last_update                     TIMESTAMP NOT NULL DEFAULT CURRENT_TIMES
TAMP ON UPDATE CURRENT_TIMESTAMP,
        /* there are at least:
        stat_name='size'
        stat_name='n_leaf_pages'
        stat_name='n_diff_pfx%' */
        stat_name                       VARCHAR(64) NOT NULL,
        stat_value                      BIGINT UNSIGNED NOT NULL,
        sample_size                     BIGINT UNSIGNED,
        stat_description                VARCHAR(1024) NOT NULL,
        PRIMARY KEY (database_name, table_name, index_name, stat_name)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0
                    16 Query    EXECUTE stmt
200418 13:06:42     16 Execute  CREATE TABLE IF NOT EXISTS innodb_index_stats (
        database_name                   VARCHAR(64) NOT NULL,
        table_name                      VARCHAR(199) NOT NULL,
        index_name                      VARCHAR(64) NOT NULL,
        last_update                     TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
        /* there are at least:
        stat_name='size'
        stat_name='n_leaf_pages'
        stat_name='n_diff_pfx%' */
        stat_name                       VARCHAR(64) NOT NULL,
        stat_value                      BIGINT UNSIGNED NOT NULL,
        sample_size                     BIGINT UNSIGNED,
        stat_description                VARCHAR(1024) NOT NULL,
        PRIMARY KEY (database_name, table_name, index_name, stat_name)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0
                    16 Query    DROP PREPARE stmt
...
                    16 Query    CREATE TABLE IF NOT EXISTS table_stats (db_name
varchar(64) NOT NULL, table_name varchar(64) NOT NULL, cardinality bigint(21) un
signed DEFAULT NULL, PRIMARY KEY (db_name,table_name) ) engine=Aria transactional=0 CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Tables'
                    16 Query    CREATE TABLE IF NOT EXISTS column_stats (db_name
 varchar(64) NOT NULL, table_name varchar(64) NOT NULL, column_name varchar(64)
NOT NULL, min_value varbinary(255) DEFAULT NULL, max_value varbinary(255) DEFAULT NULL, nulls_ratio decimal(12,4) DEFAULT NULL, avg_length decimal(12,4) DEFAULT
 NULL, avg_frequency decimal(12,4) DEFAULT NULL, hist_size tinyint unsigned, his
t_type enum('SINGLE_PREC_HB','DOUBLE_PREC_HB'), histogram varbinary(255), PRIMARY KEY (db_name,table_name,column_name) ) engine=Aria transactional=0 CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Columns'
                    16 Query    CREATE TABLE IF NOT EXISTS index_stats (db_name
varchar(64) NOT NULL, table_name varchar(64) NOT NULL, index_name varchar(64) NOT NULL, prefix_arity int(11) unsigned NOT NULL, avg_frequency decimal(12,4) DEFAULT NULL, PRIMARY KEY (db_name,table_name,index_name,prefix_arity) ) engine=Aria
 transactional=0 CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Indexes'
...
So, one way ot the other, with direct SQL or with prepared statements etc, either by changing structure multiple times or by adding missing tables we should end up with the content MariaDB 10.4 really needed.

Final steps for connection 16 are:
...
                    16 Query    SET @broken_routines = (select count(*) from mysql.proc where db='performance_schema')
                    16 Query    SET @broken_events = (select count(*) from mysql.event where db='performance_schema')
                    16 Query    SET @broken_pfs= (select @broken_tables + @broken_views + @broken_routines + @broken_events)
                    16 Query    SET @cmd= "DROP DATABASE IF EXISTS performance schema"
                    16 Query    SET @str = IF(@broken_pfs = 0, @cmd, 'SET @du = 0')
                    16 Query    PREPARE stmt FROM @str
                    16 Prepare  DROP DATABASE IF EXISTS performance_schema
                    16 Query    EXECUTE stmt
                    16 Execute  DROP DATABASE IF EXISTS performance_schema
                    16 Query    DROP PREPARE stmt
                    16 Query    SET @cmd= "CREATE DATABASE performance_schema character set utf8"
                    16 Query    SET @str = IF(@broken_pfs = 0, @cmd, 'SET @du = 0')
                    16 Query    PREPARE stmt FROM @str
                    16 Prepare  CREATE DATABASE performance_schema character set utf8
                    16 Query    EXECUTE stmt
                    16 Execute  CREATE DATABASE performance_schema character set utf8
                    16 Query    DROP PREPARE stmt
                    16 Quit
So, it creates performance_schema if needed. Then we see the following:
                    17 Connect  root@localhost as anonymous on
                    17 Query    SET WSREP_ON=0
                    17 Query    SET SQL_LOG_BIN=0
                    17 Query    SHOW DATABASES
                    17 Init DB  mysql
                    17 Query    SHOW /*!50002 FULL*/ TABLES
                    17 Init DB  test
                    17 Query    SHOW /*!50002 FULL*/ TABLES
                    17 Quit
                    18 Connect  root@localhost as anonymous on
                    18 Query    SET WSREP_ON=0
                    18 Query    SET SQL_LOG_BIN=0
                    18 Query    SHOW DATABASES
                    18 Init DB  test
                    18 Query    SHOW /*!50002 FULL*/ TABLES
                    18 Query    CHECK TABLE `t1`  FOR UPGRADE
                    18 Quit
                    19 Connect  root@localhost as anonymous on mysql
                    19 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                    19 Query    SELECT COUNT(*) AS c1 FROM mysql.slave_master_info
                    19 Quit
                    20 Connect  root@localhost as anonymous on mysql
                    20 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                    20 Query    SELECT COUNT(*) AS c1 FROM mysql.slave_relay_log_info
                    20 Quit
                    21 Connect  root@localhost as anonymous on mysql
                    21 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                    21 Query    FLUSH PRIVILEGES
                    21 Quit
                    22 Connect  root@localhost as anonymous on mysql
                    22 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                    22 Query    show variables like 'datadir'
                    22 Quit
                    23 Connect  root@localhost as anonymous on mysql
                    23 Query    SET SQL_LOG_BIN=0, WSREP_ON=OFF
                    23 Query    show variables like 'datadir'
                    23 Quit
200418 15:19:34     24 Connect  root@localhost as anonymous on
                    24 Query    select @@version_comment limit 1
200418 15:19:38     24 Query    show tables from mysql like 'innodb%'
200418 15:21:01     24 Query    shutdown
That is, we check every table in every other database for upgrade, so some finalr steps and eventually execute FLUSH PRIVILEGES.

Theoretically we can continue to work then, as we did in connection 24, but the only reasonable action when mysql_upgrade was executed without errors would be to shut down and restart, as we actually did.

After restart we should have 10.4 up, running and ready to use, with all new features, colum,ns and tables avaiable and existing data upgraded to 10.4 "expectations".

Sunset of the step by step major MariaDB major versions upgrade era...
To summarize:
  1. Running mysql_upgrade after any major version upgrade is a must. MariaDB may work for years without that, but you may end up with all kinds of troubles, including bad performance for queries due to missing/non-usable statistcis etc. Do not ignore related messages in the error log!
  2. I was not able to find any still active bug reports about simple cases when direct upgrade skipping some major version(s) fails while the same upgrade done step by step does not fail.
  3. Missing tables etc are really created at later stages of the process, no matter what was the version we upgrade from. In SQL code at least I see no checks for what the older version was. Looks like SQL is executed no matter if we get the error, and we just continue to the next statement that is going to fix the problem, if any. Something similar happens to applying binary log DML events from the earlier position than needed - if we ignore errors and apply them all, eventually we should end up in consistent state.
  4. The SQL code executed, with checks etc, is complex enough. mysql_upgrade tries to deal with many details and fix many things, so one can not exclude bugs in the process withouth extensive testing.
  5. It seems to be safe to run mysql_upgrade multiple times even if it was already run and has nothing to do. Some waste of time and no other harm expected. SQL is written that way, with all that IF NOT EXISTS and other checks.
  6. Assuming that CHECK TABLE ... FOR UPGRADE from MariaDB 10.4 is able to find and fix all problems in tables originating from MariaDB or MySQL version x.y.z so that the result is usable, the process of direct upgrade skipping intermediate major versions may even work well, as Monty stated, with a set of assumptions and restrictions presented above. One still has to check the code and bugs to be sure about that part.