Blog of (former?) MySQL Entomologist: manual

Showing posts with label manual. Show all posts

Sunday, May 26, 2019

On Importing InnoDB Tablespaces and Row Formats

Let me start with a short summary and then proceed with a long story, code snippets, hexdumps, links and awk functions converted from the source code of MariaDB server. This blog post can be summarized as follows:

One can find row_format used to create table explicitly in the .frm file (or the outputs of SHOW CREATE TABLE or SHOW TABLE STATUS). Internals manual may help to find out where is it stored and source code reading helps to find the way to interpret the values.
For InnoDB tables created without specifying the row_format explicitly neither logical backup nor .frm file itself contains the information about the row format used. There are 4 of them (Redundant, Compact, Dynamic and Compressed). The one used implicitly is defined by current value of the innodb_default_row_format that may change dynamically.
At the .ibd file level there is no (easy) way to distinguish Redundant from Compact, this detail should come from elsewhere. If the source table's row format had NOT changed you can find it from the information_schema.innodb_sys_tables (or innodb_tables in case of MySQL 8), or from the output of SHOW TABL STATUS.
There is an easy enough way to check tablespace level flags in the .ibd file (sample awk functions/script are presented below) and this helps to find out that the row format was Compressed or Dynamic.
So far in basic cases (encryption etc aside) individual .ibd files for InnoDB tables from MariaDB (even 10.3.x) and MySQL 8.0.x are compatible enough.
You have to take all the above into account while importing individual tables to do partial restore or copy/move tablespaces from one database to the other.
Some useful additional reading and links may be found in MariaDB bug reports MDEV-19523 and MDEV-15049. Yes, reading MariaDB MDEVs may help MySQL DBAs to understand some things better!

Now the real story.

I miss London, so I am going to be there on June 13 to partcipate in Open Databases Meetup. Should I speak about importing InnoDB tablespaces there?

* * *

This is a long enough blog post about a "research" I had to make while working in Support recently. It all started with a question like this in a support issue earlier in May:

"Is it somehow possible to extract ROW_FORMAT used from a table in a backup in XtraBackup format?"

The context was importing tablespace for InnoDB table and error 1808, "Schema mismatch", and customer had a hope to find out proper format without attempts to import, in some way that can be scripted easily. When one tries to import .ibd file with a format that does not match .frm file or data dictionary content, she gets a very clear message in MariaDB (that still presents all thee details) due to the fix in MDEV-16851, but the idea was to avoid trial and error path entirely.

There were several ideas on how to proceed. Given the .frm, one could use mysqlfrm utility (you can still find MySQL Utilities that are only under Sustaining Support by Oracle here) to get full CREATE TABLE from the .frm. But I was sure that just checking ROW_FORMAT should be easier than that. (Later test of latest mysqlfrm I could get running on Fedora 29 proved that it was a good idea to avoid it due to some problems I may write about one day.) Fine MySQL Internals Manual clearly describes .frm file format and shows that at offset 0x28 in the header section we have row_type encoded as one byte:

0028 1 00 create_info->row_type

Quick search in source code ended up with the following defined in sql/handler.h (links refer to MariaDB code, but the idea is clear and same for MySQL as well):

enum row_type { ROW_TYPE_NOT_USED=-1, ROW_TYPE_DEFAULT, ROW_TYPE_FIXED,
ROW_TYPE_DYNAMIC, ROW_TYPE_COMPRESSED,
ROW_TYPE_REDUNDANT, ROW_TYPE_COMPACT, ROW_TYPE_PAGE };

The rest looked clear at the moment. We should see decimal values from 2 to 5 at offset 0x28 (decimal 40) from the beginning of the .frm file representing row formats supported by InnoDB. I quickly created a set of tables with different row formats:

MariaDB [test]> create table ti1(id int primary key, c1 int) engine=InnoDB row_format=redundant;
Query OK, 0 rows affected (0.147 sec)

MariaDB [test]> create table ti2(id int primary key, c1 int) engine=InnoDB row_format=compact;
Query OK, 0 rows affected (0.145 sec)

MariaDB [test]> create table ti3(id int primary key, c1 int) engine=InnoDB row_format=dynamic;
Query OK, 0 rows affected (0.149 sec)

MariaDB [test]> create table ti4(id int primary key, c1 int) engine=InnoDB row_format=compressed;
Query OK, 0 rows affected (0.130 sec)

MariaDB [test]> create table ti5(id int primary key, c1 int) engine=InnoDB;
Query OK, 0 rows affected (0.144 sec)
MariaDB [test]> insert into ti5 values(5,5);
Query OK, 1 row affected (0.027 sec)

and checked the content of the .frm files with hexdump:

[openxs@fc29 maria10.3]$ hexdump -C data/test/ti1.frm | more00000000 fe 01 0a 0c 12 00 56 00 01 00 b2 03 00 00 f9 01 |......V.........|
00000010 09 00 00 00 00 00 00 00 00 00 00 02 21 00 08 00 |............!...|
00000020 00 05 00 00 00 00 08 00 04 00 00 00 00 00 00 f9 |................|
...

As you can see, we see expected value 04 for ROW_TYPE_REDUNDANT of the table ti1. After that it's easy to come up with some command line to just show numeric row format, like this:

[openxs@fc29 server]$ hexdump --skip 40 --length=1 ~/dbs/maria10.3/data/test/ti1.frm | awk '{print $2}'
0004

or even better:

[openxs@fc29 maria10.3]$ hexdump -C data/test/ti1.frm | awk '/00000020/ {print $10}'
04
[openxs@fc29 maria10.3]$ hexdump -C data/test/ti2.frm | awk '/00000020/ {print $10}'
05
[openxs@fc29 maria10.3]$ hexdump -C data/test/ti3.frm | awk '/00000020/ {print $10}'
02
[openxs@fc29 maria10.3]$ hexdump -C data/test/ti4.frm | awk '/00000020/ {print $10}'
03
[openxs@fc29 maria10.3]$ hexdump -C data/test/ti5.frm | awk '/00000020/ {print $10}'
00

But in real customer case there was no problem with tables created with explicit row_format set (assuming the correct .frm was in place). The problem was with table like ti5 above, those created with the default row format:

MariaDB [test]> show variables like 'innodb%format';
+---------------------------+---------+
| Variable_name | Value |
+---------------------------+---------+
| innodb_default_row_format | dynamic |
| innodb_file_format | |
+---------------------------+---------+
2 rows in set (0.001 sec)

In .frm file (and in SHOW CREATE TABLE output) the format is NOT set, it's default, 0 (or 0x00 in hex). The problem happens when we try to import such a table into an instance with different innodb_default_row_format. Consider the following test case:

[openxs@fc29 maria10.3]$ bin/mysql --socket=/tmp/mariadb.sock -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.3.15-MariaDB Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [test]> create database test2;
Query OK, 1 row affected (0.000 sec)

MariaDB [test]> use test2
Database changed
MariaDB [test2]> set global innodb_default_row_format=compact;
Query OK, 0 rows affected (0.000 sec)

MariaDB [test2]> create table ti0(id int primary key, c1 int) engine=InnoDB; Query OK, 0 rows affected (0.165 sec)

MariaDB [test2]> show create table ti0\G
*************************** 1. row ***************************
Table: ti0
Create Table: CREATE TABLE `ti0` (
`id` int(11) NOT NULL,
`c1` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.000 sec)

MariaDB [test2]> \! hexdump -C data/test2/ti0.frm | awk '/00000020/ {print $10}'
00

In this test we create a new table, ti0, in other database, test2, that has ROW_TYPE_DEFAULT (0) in the .frm file, same as the test.ti5 table created above. But if we try to import t5 tablespace by first exporting it properly in another session:

MariaDB [test]> flush tables ti5 for export;
Query OK, 0 rows affected (0.001 sec)

and then discarding original test2.t0 tablespace, copying .ibd and .cfg files (with proper renaming) and running ALTER TABLE ... IMPORT TABLESPACE:

MariaDB [test2]> alter table ti0 discard tablespace;Query OK, 0 rows affected (0.058 sec)

MariaDB [test2]> \! cp data/test/ti5.cfg data/test2/ti0.cfg
MariaDB [test2]> \! cp data/test/ti5.ibd data/test2/ti0.ibd
MariaDB [test2]> alter table ti0 import tablespace;
ERROR 1808 (HY000): Schema mismatch (Table flags don't match, server table has 0x1 and the meta-data file has 0x21; .cfg file uses ROW_FORMAT=DYNAMIC)

we fail with error 1808 (that has all the details about the original's table row format, DYNAMIC, and hex information about some flags in hex that are different). We failed because now innodb_default_row_format is different, it's COMPACT!

We can not fool the target server by removing (or not copying) non-mandatory .cfg file:

MariaDB [test2]> \! rm data/test2/ti0.cfg
MariaDB [test2]> alter table ti0 import tablespace;
ERROR 1808 (HY000): Schema mismatch (Expected FSP_SPACE_FLAGS=0x0, .ibd file contains 0x21.)

Now we see a bit different text, but the same error 1808. Real row format of InnoDB table is stored somewhere in .ibd file. As you can guess, copying .frm file (as it may when we copy back files from Xtrabackup- or mariabackup-based backup to do partial restore) also does not help - the files had the same row_format anyway and we verified that. So, real row format of InnoDB table is stored somewhere in InnoDB (data dictionary). When it does not match the one we see in .ibd file we get error 1808.

How to resolve this error? There are two ideas to explore (assuming we found the real format in .ibd file somehow):

Try to create target table with proper row_format and then import.
Set innodb_default_row_format properly and create target table without explicit row format set, and then import.

The first one works, as one can find out (but will end up with different .frm file than the original table had, surely). Check these:

MariaDB [test2]> select * from test.ti5;
+----+------+
| id | c1 |
+----+------+
| 5 | 5 |
+----+------+
1 row in set (0,001 sec)

MariaDB [test2]> alter table ti0 discard tablespace;
Query OK, 0 rows affected (0,066 sec)

MariaDB [test2]> \! cp data/test/ti5.ibd data/test2/ti0.ibd
MariaDB [test2]> alter table ti0 import tablespace;
ERROR 1808 (HY000): Schema mismatch (Expected FSP_SPACE_FLAGS=0x0, .ibd file contains 0x21.)
MariaDB [test2]> \! cp data/test/ti5.cfg data/test2/ti0.cfg
MariaDB [test2]> alter table ti0 import tablespace;
ERROR 1808 (HY000): Schema mismatch (Table flags don't match, server table has 0x1 and the meta-data file has 0x21; .cfg file uses ROW_FORMAT=DYNAMIC)
MariaDB [test2]> drop table ti0;
Query OK, 0 rows affected (0,168 sec)

So, if you care to understand the flags (we'll work on that below) or care to copy .cfg file as well, you surely can get the row format of the table. Now let's re-create ti0 with explicitly defined Dynamic row format and try to import again:

MariaDB [test2]> create table ti0(id int primary key, c1 int) engine=InnoDB row_format=Dynamic;
Query OK, 0 rows affected (0,241 sec)

MariaDB [test2]> alter table ti0 discard tablespace;
Query OK, 0 rows affected (0,071 sec)

MariaDB [test2]> \! cp data/test/ti5.ibd data/test2/ti0.ibd
MariaDB [test2]> alter table ti0 import tablespace;
Query OK, 0 rows affected, 1 warning (0,407 sec)

MariaDB [test2]> show warnings\G
*************************** 1. row ***************************
Level: Warning
   Code: 1810
Message: IO Read error: (2, No such file or directory) Error opening './test2/ti0.cfg', will attempt to import without schema verification
1 row in set (0,000 sec)

MariaDB [test2]> select * from ti0;
+----+------+
| id | c1   |
+----+------+
| 5 |    5 |
+----+------+
1 row in set (0,001 sec)

We see that copying .cfg file is not really mandatory and that explicit setting of ROW_FORMAT (assuming that .frm file is NOT copied) works.

The second idea also surely works (and customer in his trial and error attempts just tried with all possible formats until import was successful). Lucky from the first error we'll know the original format used for sure:

MariaDB [test2]> drop table ti0;
Query OK, 0 rows affected (0.084 sec)

MariaDB [test2]> set global innodb_default_row_format=dynamic;
Query OK, 0 rows affected (0.000 sec)

MariaDB [test2]> create table ti0(id int primary key, c1 int) engine=InnoDB;    Query OK, 0 rows affected (0.171 sec)

MariaDB [test2]> alter table ti0 discard tablespace;
Query OK, 0 rows affected (0.049 sec)

MariaDB [test2]> \! cp data/test/ti5.cfg data/test2/ti0.cfg
MariaDB [test2]> \! cp data/test/ti5.ibd data/test2/ti0.ibd
MariaDB [test2]> alter table ti0 import tablespace;
Query OK, 0 rows affected (0.307 sec)

MariaDB [test2]> select * from ti0;
+----+------+
| id | c1   |
+----+------+
| 5 |    5 |
+----+------+
1 row in set (0.000 sec)

MariaDB [test2]> show create table ti0\G
*************************** 1. row ***************************
       Table: ti0
Create Table: CREATE TABLE `ti0` (
`id` int(11) NOT NULL,
`c1` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.000 sec)

Now we can proceed with UNLOCK TABLES in that another session where we flushed test.ti5 for export.

How could we find out the row format to use without trial and error, now that we know in one specific case .frm file (or even CREATE TABLE statement shown by server or mysqldump) misses it?

First of all we could try to save this information (select @@innodb_default_file_format) alone with the backup. But that would show the value of this variable at the moment of asking, and it could be different when specific table was created. Does not work in general case.

We could use SHOW TABLE STATUS also, as follows:

MariaDB [test]> show create table ti5\G
*************************** 1. row ***************************
       Table: ti5
Create Table: CREATE TABLE `ti5` (
`id` int(11) NOT NULL,
`c1` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.001 sec)

MariaDB [test]> show table status like 'ti5'\G
*************************** 1. row ***************************
            Name: ti5
          Engine: InnoDB
         Version: 10
      Row_format: Dynamic
...

In the example above that table was created without setting row_format explicitly, but we see the real one used in the output of SHOW TABLE STATUS. So, if we cared enough, this kind of output could be saved when the data were backed up or exported.

Then we could try to get it for each table from the InnoDB data dictionary of the system we get .ibd files from. In older MySQL versions we'd have to dig into the real data dictionary tables on disk probably, but in any recent MySQL (up to 5.7, 8.0 may be somewhat different due to a new data dictionary) or MariaDB we have a convenient, SQL-based way to get this information. There are two INFORMATION_SCHEMA tables to consider: INNODB_SYS_TABLESPACES and INNODB_SYS_TABLES. The first one is not good enough, as it considers Compact and Redundant row formats the same (even though fine MySQL Manual does not say this):

MariaDB [test]> select * from information_schema.innodb_sys_tablespaces where name like '%ti%';
+-------+----------------------------+------+----------------------+-----------+---------------+------------+---------------+-----------+----------------+
| SPACE | NAME                       | FLAG | ROW_FORMAT           | PAGE_SIZE | ZIP_PAGE_SIZE | SPACE_TYPE | FS_BLOCK_SIZE | FILE_SIZE | ALLOCATED_SIZE |
+-------+----------------------------+------+----------------------+-----------+---------------+------------+---------------+-----------+----------------+
|     3 | mysql/transaction_registry |   33 | Dynamic              |     16384 |         16384 | Single     |          4096 |    147456 |         147456 |
|     4 | mysql/gtid_slave_pos       |   33 | Dynamic              |     16384 |         16384 | Single     |          4096 |     98304 |          98304 |
|     6 | test/ti1                   |    0 | Compact or Redundant |     16384 |         16384 | Single     |          4096 |     98304 |          98304 |
|     7 | test/ti2                   |    0 | Compact or Redundant |     16384 |         16384 | Single     |          4096 |     98304 |          98304 ||     8 | test/ti3                   |   33 | Dynamic              |     16384 |         16384 | Single     |          4096 |     98304 |          98304 |
|     9 | test/ti4                   |   41 | Compressed           |     16384 |          8192 | Single     |          4096 |     65536 |          65536 |
|    10 | test/ti5                   |   33 | Dynamic              |     16384 |         16384 | Single     |          4096 |     98304 |          98304 |
+-------+----------------------------+------+----------------------+-----------+---------------+------------+---------------+-----------+----------------+
7 rows in set (0.000 sec)

The second one works perfectly:

MariaDB [test2]> select * from information_schema.innodb_sys_tables where name like '%ti%';
+----------+----------------------------+------+--------+-------+------------+---------------+------------+
| TABLE_ID | NAME                       | FLAG | N_COLS | SPACE | ROW_FORMAT | ZIP_PAGE_SIZE | SPACE_TYPE |
+----------+----------------------------+------+--------+-------+------------+---------------+------------+
|       19 | mysql/gtid_slave_pos       |   33 |      7 |     4 | Dynamic    |             0 | Single     |
|       18 | mysql/transaction_registry |   33 |      8 |     3 | Dynamic    |             0 | Single     |
|       21 | test/ti1                   |    0 |      5 |     6 | Redundant |             0 | Single     |
|       22 | test/ti2                   |    1 |      5 |     7 | Compact    |             0 | Single     ||       23 | test/ti3                   |   33 |      5 |     8 | Dynamic    |             0 | Single     |
|       24 | test/ti4                   |   41 |      5 |     9 | Compressed |          8192 | Single     |
|       25 | test/ti5                   |   33 |      5 |    10 | Dynamic    |             0 | Single     |
|       26 | test2/ti0                  |    1 |      5 |    11 | Compact    |             0 | Single     |
+----------+----------------------------+------+--------+-------+------------+---------------+------------+
8 rows in set (0.000 sec)

In the table above I was wondering about the exact values in FLAG column (note 33, 0x21 in hex, looks familiar from the error message in previous examples). MySQL Manual says just this:

"A numeric value that represents bit-level information about tablespace format and storage characteristics."

MariaDB's KB page is now way more detailed after my bug report, MDEV-19523, was closed. See the link for the details, or check the code of the i_s_dict_fill_sys_tables() function if you want to interpret the data properly:

/**********************************************************************//**
Populate information_schema.innodb_sys_tables table with information
from SYS_TABLES.
@return 0 on success */
static
int
i_s_dict_fill_sys_tables(
/*=====================*/
   THD*       thd,       /*!< in: thread */
   dict_table_t*   table,       /*!< in: table */
   TABLE*       table_to_fill)   /*!< in/out: fill this table */
{
   Field**          fields;
   ulint           compact = DICT_TF_GET_COMPACT(table->flags);
   ulint           atomic_blobs = DICT_TF_HAS_ATOMIC_BLOBS(
                               table->flags);
   const ulint zip_size = dict_tf_get_zip_size(table->flags);
   const char*       row_format;

   if (!compact) {
       row_format = "Redundant";
   } else if (!atomic_blobs) {
       row_format = "Compact";
   } else if (DICT_TF_GET_ZIP_SSIZE(table->flags)) {
       row_format = "Compressed";
   } else {
       row_format = "Dynamic";
   }
...

Another part of the code shows how the checks above are performed:

#define DICT_TF_GET_COMPACT(flags) \

        ((flags & DICT_TF_MASK_COMPACT) \

        >> DICT_TF_POS_COMPACT)

/** Return the value of the ZIP_SSIZE field */

#define DICT_TF_GET_ZIP_SSIZE(flags) \

        ((flags & DICT_TF_MASK_ZIP_SSIZE) \

        >> DICT_TF_POS_ZIP_SSIZE)

/** Return the value of the ATOMIC_BLOBS field */

#define DICT_TF_HAS_ATOMIC_BLOBS(flags) \

        ((flags & DICT_TF_MASK_ATOMIC_BLOBS) \

        >> DICT_TF_POS_ATOMIC_BLOBS)

...

We miss masks and flags to double check (in the same storage/innobase/include/dict0mem.h file):

/** Width of the COMPACT flag */
#define DICT_TF_WIDTH_COMPACT        1

/** Width of the ZIP_SSIZE flag */
#define DICT_TF_WIDTH_ZIP_SSIZE        4

/** Width of the ATOMIC_BLOBS flag. The ROW_FORMAT=REDUNDANT and
ROW_FORMAT=COMPACT broke up BLOB and TEXT fields, storing the first 768 bytes
in the clustered index. ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED
store the whole blob or text field off-page atomically.
Secondary indexes are created from this external data using row_ext_t
to cache the BLOB prefixes. */
#define DICT_TF_WIDTH_ATOMIC_BLOBS    1

...

/** Zero relative shift position of the COMPACT field */
#define DICT_TF_POS_COMPACT        0
/** Zero relative shift position of the ZIP_SSIZE field */
#define DICT_TF_POS_ZIP_SSIZE        (DICT_TF_POS_COMPACT        \
                    + DICT_TF_WIDTH_COMPACT)
/** Zero relative shift position of the ATOMIC_BLOBS field */
#define DICT_TF_POS_ATOMIC_BLOBS    (DICT_TF_POS_ZIP_SSIZE        \
+ DICT_TF_WIDTH_ZIP_SSIZE)

If we make some basic math we can find out that DICT_TF_POS_ZIP_SSIZE is 1 and DICT_TF_POS_ATOMIC_BLOBS is 5, etc. The masks are defined as:

/** Bit mask of the COMPACT field */
#define DICT_TF_MASK_COMPACT                \
        ((~(~0U << DICT_TF_WIDTH_COMPACT))    \
        << DICT_TF_POS_COMPACT)
/** Bit mask of the ZIP_SSIZE field */
#define DICT_TF_MASK_ZIP_SSIZE                \
        ((~(~0U << DICT_TF_WIDTH_ZIP_SSIZE))    \
        << DICT_TF_POS_ZIP_SSIZE)
/** Bit mask of the ATOMIC_BLOBS field */
#define DICT_TF_MASK_ATOMIC_BLOBS            \
        ((~(~0U << DICT_TF_WIDTH_ATOMIC_BLOBS))    \
        << DICT_TF_POS_ATOMIC_BLOBS)

Basically we have what we need now, bit positions and masks. We can create a function to return a row format based on decimal value of falgs. Consider this primitive awk example:

openxs@ao756:~/dbs/maria10.3$ awk '
> function DICT_TF_GET_COMPACT(flags) {
>   return rshift(and(flags, DICT_TF_MASK_COMPACT), DICT_TF_POS_COMPACT);
> }
>
> function DICT_TF_GET_ZIP_SSIZE(flags)
> {
>   return rshift(and(flags, DICT_TF_MASK_ZIP_SSIZE), DICT_TF_POS_ZIP_SSIZE);
> }
>
> function DICT_TF_HAS_ATOMIC_BLOBS(flags)
> {
>   return rshift(and(flags, DICT_TF_MASK_ATOMIC_BLOBS), DICT_TF_POS_ATOMIC_BLOBS);
> }
>
> function innodb_row_format(flags)
> {
>     compact = DICT_TF_GET_COMPACT(flags);
>     atomic_blobs = DICT_TF_HAS_ATOMIC_BLOBS(flags);
>
>     if (!compact) {
>         row_format = "Redundant";
>     } else if (!atomic_blobs) {
>         row_format = "Compact";
>     } else if (DICT_TF_GET_ZIP_SSIZE(flags)) {
>         row_format = "Compressed";
>     } else {
>         row_format = "Dynamic";
>     }
>     return row_format;
> }
>
> BEGIN {
> DICT_TF_WIDTH_COMPACT=1;
> DICT_TF_WIDTH_ZIP_SSIZE=4;
> DICT_TF_WIDTH_ATOMIC_BLOBS=1;
>
> DICT_TF_POS_COMPACT=0;
> DICT_TF_POS_ZIP_SSIZE=DICT_TF_POS_COMPACT + DICT_TF_WIDTH_COMPACT;
> DICT_TF_POS_ATOMIC_BLOBS=DICT_TF_POS_ZIP_SSIZE + DICT_TF_WIDTH_ZIP_SSIZE;
>
> DICT_TF_MASK_COMPACT=lshift(compl(lshift(compl(0), DICT_TF_WIDTH_COMPACT)),DICT_TF_POS_COMPACT);
> DICT_TF_MASK_ZIP_SSIZE=lshift(compl(lshift(compl(0), DICT_TF_WIDTH_ZIP_SSIZE)),DICT_TF_POS_ZIP_SSIZE);
> DICT_TF_MASK_ATOMIC_BLOBS=lshift(compl(lshift(compl(0), DICT_TF_WIDTH_ATOMIC_BLOBS)),DICT_TF_POS_ATOMIC_BLOBS);
>
> print innodb_row_format(0), innodb_row_format(1), innodb_row_format(33), innodb_row_format(41);
> }'
Redundant Compact Dynamic Compressed
openxs@ao756:~/dbs/maria10.3$

So, we know how to get format based on decimal values of flags. The remaining subtask is to find out where are the flags in the .ibd file. Instead of digging into the code (server/storage/innobase/include/fsp0fsp.h etc) one can just check this great blog post by Jeremy Cole to find out that flags are at bytes 54-57, 16 bytes offset after FIL header that is 38 bytes long (4 bytes starting from hex offset 0x36) in the .ibd file. These bytes are highlighted with bold below:

[openxs@fc29 maria10.3]$ hexdump -C data/test/ti2.ibd | more
00000000 5d 4f 09 aa 00 00 00 00 00 00 00 00 00 00 00 00 |]O..............|
00000010 00 00 00 00 00 19 11 ee 00 08 00 00 00 00 00 00 |................|
00000020 00 00 00 00 00 07 00 00 00 07 00 00 00 00 00 00 |................|
00000030 00 06 00 00 00 40 00 00 00 00 00 00 00 04 00 00 |.....@..........|
...

[openxs@fc29 maria10.3]$ hexdump -C data/test/ti4.ibd | more
00000000 6c cd 19 15 00 00 00 00 00 00 00 00 00 00 00 00 |l...............|
00000010 00 00 00 00 00 19 44 9f 00 08 00 00 00 00 00 00 |......D.........|
00000020 00 00 00 00 00 09 00 00 00 09 00 00 00 00 00 00 |................|
00000030 00 06 00 00 00 40 00 00 00 29 00 00 00 04 00 00 |.....@...)......|
...

[openxs@fc29 maria10.3]$ hexdump -C data/test/ti5.ibd | more00000000 d8 21 6d 2e 00 00 00 00 00 00 00 00 00 00 00 00 |.!m.............|
00000010 00 00 00 00 00 19 62 9d 00 08 00 00 00 00 00 00 |......b.........|
00000020 00 00 00 00 00 0a 00 00 00 0a 00 00 00 00 00 00 |................|
00000030 00 06 00 00 00 40 00 00 00 21 00 00 00 04 00 00 |.....@...!......|
...

As you can see we have hex values 0x00, 0x29 (41 decimal), 0x21 (33 decimal) etc, and, theoretically, we can find out the exact row_format used (and other details) from that, based on the information presented above. For row format we need just one byte and we can get it as follows in hex:

openxs@ao756:~/dbs/maria10.3$ hexdump -C data/test/t*.ibd | awk '/00000030/ {print $11}'
21
openxs@ao756:~/dbs/maria10.3$ hexdump -C data/test/t*.ibd | awk '/00000030/ {flags=strtonum("0x"$11); print flags;}'
33

To use the awk function defined above we need to convert hex to decimal, hence a small trick with strtonum() function. Now, let me put it all together and show that we can apply this MySQL as well (I checked MariaDB code mostly in the process). Let me create same tables ti1 ... ti5 in MySQL 8.0.x:

openxs@ao756:~/dbs/8.0$ bin/mysqld_safe --no-defaults --basedir=/home/openxs/dbs/8.0 --datadir=/home/openxs/dbs/8.0/data --port=3308 --socket=/tmp/mysql8.sock &
[1] 31790
openxs@ao756:~/dbs/8.0$ 2019-05-26T10:55:18.274601Z mysqld_safe Logging to '/home/openxs/dbs/8.0/data/ao756.err'.
2019-05-26T10:55:18.353458Z mysqld_safe Starting mysqld daemon with databases from /home/openxs/dbs/8.0/data

openxs@ao756:~/dbs/8.0$ bin/mysql --socket=/tmp/mysql8.sock -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.13 Source distribution

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select @@innodb_default_row_format;
+-----------------------------+
| @@innodb_default_row_format |
+-----------------------------+
| dynamic                     |
+-----------------------------+
1 row in set (0,00 sec)

mysql> create table ti1(id int primary key, c1 int) engine=InnoDB row_format=redundant;
Query OK, 0 rows affected (0,65 sec)

mysql> create table ti2(id int primary key, c1 int) engine=InnoDB row_format=compact;
Query OK, 0 rows affected (0,44 sec)

mysql> create table ti3(id int primary key, c1 int) engine=InnoDB row_format=dynamic;
Query OK, 0 rows affected (0,51 sec)

mysql> create table ti4(id int primary key, c1 int) engine=InnoDB row_format=compressed;
Query OK, 0 rows affected (0,68 sec)

mysql> create table ti5(id int primary key, c1 int) engine=InnoDB;
Query OK, 0 rows affected (0,59 sec)

mysql> select * from information_schema.innodb_sys_tables where name like 'test/ti%';
ERROR 1109 (42S02): Unknown table 'INNODB_SYS_TABLES' in information_schema
mysql> select * from information_schema.innodb_tables where name like 'test/ti%';
+----------+----------+------+--------+-------+------------+---------------+------------+--------------+
| TABLE_ID | NAME     | FLAG | N_COLS | SPACE | ROW_FORMAT | ZIP_PAGE_SIZE | SPACE_TYPE | INSTANT_COLS |
+----------+----------+------+--------+-------+------------+---------------+------------+--------------+
|     1158 | test/ti1 |    0 |      5 |     6 | Redundant |             0 | Single     |            0 |
|     1159 | test/ti2 |    1 |      5 |     7 | Compact    |             0 | Single     |            0 |
|     1160 | test/ti3 |   33 |      5 |     8 | Dynamic    |             0 | Single     |            0 |
|     1161 | test/ti4 |   41 |      5 |     9 | Compressed |          8192 | Single     |            0 |
|     1162 | test/ti5 |   33 |      5 |    10 | Dynamic    |             0 | Single     |            0 |
+----------+----------+------+--------+-------+------------+---------------+------------+--------------+
5 rows in set (0,03 sec)

Now let's combine some shell and awk together:

openxs@ao756:~/dbs/8.0$ for file in `ls data/test/ti*.ibd`
> do
> echo $file
> hexdump -C $file | awk '
> function DICT_TF_GET_COMPACT(flags) {
>   return rshift(and(flags, DICT_TF_MASK_COMPACT), DICT_TF_POS_COMPACT);
> }
>
> function DICT_TF_GET_ZIP_SSIZE(flags)
> {
>   return rshift(and(flags, DICT_TF_MASK_ZIP_SSIZE), DICT_TF_POS_ZIP_SSIZE);
> }
>
> function DICT_TF_HAS_ATOMIC_BLOBS(flags)
> {
>   return rshift(and(flags, DICT_TF_MASK_ATOMIC_BLOBS), DICT_TF_POS_ATOMIC_BLOBS);
> }
>
> function innodb_row_format(flags)
> {
>     compact = DICT_TF_GET_COMPACT(flags);
>     atomic_blobs = DICT_TF_HAS_ATOMIC_BLOBS(flags);
>
>     if (!compact) {
>         row_format = "Redundant";
>     } else if (!atomic_blobs) {
>         row_format = "Compact";
>     } else if (DICT_TF_GET_ZIP_SSIZE(flags)) {
>         row_format = "Compressed";
>     } else {
>         row_format = "Dynamic";
>     }
>     return row_format;
> }
>
> BEGIN {
> DICT_TF_WIDTH_COMPACT=1;
> DICT_TF_WIDTH_ZIP_SSIZE=4;
> DICT_TF_WIDTH_ATOMIC_BLOBS=1;
>
> DICT_TF_POS_COMPACT=0;
> DICT_TF_POS_ZIP_SSIZE=DICT_TF_POS_COMPACT + DICT_TF_WIDTH_COMPACT;
> DICT_TF_POS_ATOMIC_BLOBS=DICT_TF_POS_ZIP_SSIZE + DICT_TF_WIDTH_ZIP_SSIZE;
>
> DICT_TF_MASK_COMPACT=lshift(compl(lshift(compl(0), DICT_TF_WIDTH_COMPACT)),DICT_TF_POS_COMPACT);
> DICT_TF_MASK_ZIP_SSIZE=lshift(compl(lshift(compl(0), DICT_TF_WIDTH_ZIP_SSIZE)),DICT_TF_POS_ZIP_SSIZE);
> DICT_TF_MASK_ATOMIC_BLOBS=lshift(compl(lshift(compl(0), DICT_TF_WIDTH_ATOMIC_BLOBS)),DICT_TF_POS_ATOMIC_BLOBS);
> }
> /00000030/ {flags=strtonum("0x"$11); print innodb_row_format(flags);}'
> done
data/test/ti1.ibd
Redundant
data/test/ti2.ibd
Redundant
data/test/ti3.ibd
Dynamic
data/test/ti4.ibd
Compressed
data/test/ti5.ibd
Dynamic
openxs@ao756:~/dbs/8.0$

So, we proved that there is a way (based on some code analysis and scripting) to find out the exact row format that was used to create InnoDB table based solely on the .ibd file and nothing else, in all cases but one! If you are reading carefully you noted Redundant printed for ti2.ibd as well, we've seen the same in the INNODB_SYS_TABLESPACES table. Flags in the tablespace are same for both Redundant and Compact row formats, see this part of the code also. It seems to be one of the reasons why .cfg file may be needed when we export tablespace is exactly this.

One day I'll find out and create a followup post. Too much core reading for my limited abilities today...

Saturday, August 18, 2018

On Fine MySQL Manual

Today I am going to provide some details on the last item in my list of problems with Oracle's way of MySQL server development, maintenance of MySQL Manual. I stated that:

"MySQL Manual still have many details missing and is not fixed fast enough.
Moreover, it is not open source..."

Let me explain the above:

MySQL Reference Manual is not open source. It used to be built from DocBook XML sources. Probably that's still the case. But you can not find the source code in open repositories (please, correct me if I am wrong, I tried to search...) That's because it is NOT open source. It says this clearly in Preface:
"This documentation is NOT distributed under a GPL license. Use of this documentation is subject to the following terms:
...
Any other use, such as any dissemination of printed copies or use of this documentation, in whole or in part, in another publication, requires the prior written consent from an authorized representative of Oracle. Oracle and/or its affiliates reserve any and all rights to this documentation not expressly granted above."
It was NOT Oracle who closed the source (as far as I remember, the manual was not GPL even in 2004, when I started to care about MySQL in general). But Oracle had a chance to change the license and set up some better contribution process for MySQL Community users, with many good writers among them. They decided not do this, so creators of forks and derived software have to duplicate efforts and rewrite everything themselves, and the only real way to help make the manual better is to report bugs.
Quality of new documentation is not improved much. We, MySQL Community users, have to report bugs in the manual, as it still has some details missing or documented wrongly. Let me illustrate this with recent 10 or so documentation requests users made (I skipped reports for features I do not care about for now, like group replication):
- Bug #91955 - "8.0 broke plugin API for C". According to the comment from Oracle developer in this bug reported by Laurynas Biveinis, writing plugins in C is not supported any more... But this change is NOT properly documented.
- Bug #91781 - "Add version matrix in "Chapter 2 Keywords and Reserved Words". Good idea in the manual is not consistently implemented.
- Bug #91743 - "performance_schema.data_locks not working correctly". Nice feature was added to Performance Schema (now we can check and study InnoDB locks with SQL!), but it is still not completely and properly documented.
- Bug #91654 - "Handling an Unexpected Halt of a Replication Slave" Documentation is uncertain". This is also an example of improper bug handling - when I worked in Oracle newer bugs were usually declared duplicates of older ones. Here the opposite decision was made, even though both reports were from the same user, Uday Varagani, who explicitly asked to change the decision. Obviously, documentation requests do not get any extra care comparing to other kinds of bugs, quite to the contrary...
- Bug #91648 - "Numeric literals must be a number between 0 and 9999". Surely ports with numbers larger than 9999 can be used.
- Bug #91642 - "Document how protocol version 4.0 and 4.1 differ on how strings are terminated". As noted by Rene' Cannao', comments in the code are still sometimes more useful than documentation.
- Bug #91549 - "MDL lock queue order seems to depend on table names". Make sure to read last comment in this nice request by Shane Bester. Dmitry Lenev provides some details on priorities of MDL locks in it. There are still cases when bugs and documentation requests document some details better than fine MySQL Manual!
- Bug #90997 - "Handling an Unexpected Halt of a Replication Slave" manual page is wrong". Sveta Smirnova highlighted a totally misleading statement in the manual.
- Bug #90935 - "Modify relay_log_recovery Documentation". Simple documentation request stays "Open" for 3 months. Definitely processing documentation requests is not a highest priority for Oracle engineers.
- Bug #90680 - "dragnet logging: document how to use / find error_symbol codes". Even if request comes from a customer or otherwise well known bug reporter, like Simon Mudd, and it's related to some totally new cool feature of MySQL 8, it can wait for the fix for months...
You can make your own conclusions from the above. But I do not see any good trends in the way new documentation is created or documentation requests are processed recently. Same problems as 4 years ago (see more on that in a side note below).
Older documentation requests get even less attention than recent ones sometimes, even though they may highlight problems with software itself, not the MySQL Manual. Let me illustrate this with a few bugs I reported:
- Bug #83640 - "Locks set by DELETE statement on already deleted record". I explicitly asked to
  "...document in the manual what locks this kind of DELETE sets when it encountered a record already marked as deleted, and why"
  This still had NOT happened. Moreover, eventually both MySQL and MariaDB engineers decided that current implementation of locking for this case is wrong and can be improved, so this report ended up as InnoDB bug. Check related MDEV-11215 for more details.
- Bug #73413 - "Manual does not explain MTS implementation in details". This is one of my favorite documentation requests. I've got a suggestion to explain what I want to see documented, in details. I tried my best, but if you want to get more details, read this blog.
- Bug #72368 - "Empty/zero statistics for imported tablespace until explicit ANALYZE TABLE". This is a bug in the way persistent statistics (the feature I truly hate) in InnoDB is re-estimated "automatically". But until the bug is fixed, I asked to document current implementation (and problems with it). So, where current implementation is properly documented? If only in the bug reports...
- Bug #71294 - "Manual page for P_S.FILE_INSTANCES table does not explain '~' in FILE_NAME". They pretend they do not get the point:
  "Per Marc Alff, the Performance Schema is simply reporting what it gets from runtime. If runtime is reporting nonexistent filenames, that's a server issue.
  
  Recategorizing this as a server bug."
- Bug #71293 - "Manual page for P_S.FILE_INSTANCES table does not explain EVENT_NAME values". Now they are waiting for the new DOCUMENTATION column in the setup_instruments table to be filled in with something by server developers... The code is the documentation? OK, bus as we know from the experience (see slides 44-46 here) chances to get a bug in Performance Schema fixed fast are even less than to see it properly and completely documented...

There are more problems with MySQL documentation (not only reference manual), but at the moment I consider 3 highlighted and illustrated above the most serious.

Regent's Canal is nice. If I only knew how to operate the locks there... MySQL Manual also misses some information about locks.

As a side note, it's not the first time I write about MySQL Manual. You can find some useful details in the following posts:

"What's wrong with MySQL Manual". In 2014, after spending few weeks reporting up to 5 documentation bugs per day, I thought that, essentially, there is nothing much wrong with it - it's fine, well indexed by Google and has meaningful human-readable URLs. Few problems listed were lack of careful readers (I tried my best to fix that), limited attention to old documentation requests, some pages with not so much useful content and lack of "How to" documents. The later I also tried to fix to some small extent in this blog, see howto tagged posts. The real fix came mostly from Percona's blog, though.
I have a special "missing manual" tag for blog posts that mention at least one bug in the manual.
I tried to use "missing_manual" tag consistently for my own documentation requests. Last year I shared a detailed enough review of the bugs with that tag that were still active.

As a yet another side note, I tried once to create a kind of community driven "missing manual" project, but failed. Writing manual from scratch is a (hard) full time job, while my full time job is providing support to users of MySQL, MariaDB and Percona software...

That said, I also wanted to join MySQL's documentation team in the past, but this was not possible at least because I am not a native speaker. If anything changed in this regard, I am still open to a job offer of this kind. My conditions for an acceptable offer from Oracle are known to all interested parties and they include (but are not limited to) at least 4 times the salary I had before I quit (it was almost 6 years ago) and working for Oracle's office in Paris (because in Oracle office you ere employed by formally matters a lot for your happiness and success as employee).

In case of no offer in upcoming week, I'll just continue to write my annoying but hopefully somewhat useful blog posts (until MySQL Reference Manual becomes true open source project ;)

Sunday, July 9, 2017

MySQL Support Engineer's Chronicles, Issue #7

This week in Support was busy enough for me. Among other things I had to study all possible reasons (other than obvious query cache impact) for queries hanging in "query end" status and noted Bug #80652 related to binlog group commit and fixed in MySQL 5.7.17+ and 8.0.1+ only. The case I had to review was related to Galera though, and I suggest you to note that "query end" may be related to Galera replication stall. Studying this path further soon brought lp:1197771 - "Cluster stalls while distributing transaction" to my attention again, so I asked about proper status for it on Facebook. As it happens way too often recently, I've got few 'Likes" but no further comments, neither from Percona nor from Codership. It seems good old times of my efficient bugs escalations via Facebook had gone...

Does it mean I am going to stop writing and sharing posts about bugs on Facebook? Definitely it does NOT! One of the recent reasons is this post about Bug #86902 that, after being shared by me, got a quick comment and clarification by Sveta Smirnova that the problem (performance problem when binary logging is turned OFF and you use more than one transactional storage engine concurrently) is actually know, - it was reported as Bug #80870 (that is still "Open", what a shame).

So, even if some Oracle or Percona employees will continue to ignore new bugs reports or my Facebook (re-)posts about MySQL bugs, at least bug reporter gets more chances to get help from Community, additional references and clarifications. So, stay tuned - I'll keep writing about MySQL bugs in this blog and, on a more regular basis and probably for a wider audience, - on Facebook. As popular Mahatma Gandhi's quote says:

"First they ignore you, then they laugh at you, then they fight you, then you win."

In my case with public MySQL bugs and the way they are processed by Oracle it was not strictly in this order, and I am yet to see people laughing at me for this reason, but I think I already won as since September 2012 public MySQL bugs database still exists and several Oracle engineers still pay attention to community bug reports there.

I had reported one bug myself this week, Bug #86930 - "Wrong results for queries with row constructors and information_schema". It was quickly verified by Miguel Solorzano. It's a funny bug (with workarounds), but not as scary as few other bugs recently reported and described in this post by Jean-François Gagné.Note, that as he stated, there are more than just that 2 bugs he mentioned (Bug #86926 and Bug #86927). Everybody in Oracle who should know that already knows it, and I know it. I keep stating that the implementation of InnoDB persistent statistics gives us more troubles than benefits. Last time I cared and complained that much it was about Bug #70629 - "InnoDB updated rec_per_key[] statistics not published to the optimizer enough often". But, trust me, this recent set of problems Jean-François found is way more serious.

I'd like to share few references I found extremely useful this week while working on Support issues this week. First of all, if you set up SSL connections for MySQL, take time to review this old post by my former colleague from Percona Roman Vynar, and another one by Matthew Boehm. This may save you some time and debugging efforts. If you care about MariaDB ColumnStore performance (I have to), check this KB article. If you want to replicate data to the table with somewhat different column types on slave, make sure to read the manual. Finally, if you care enough to keep SELinux enabled and use Galera clusters with any non-default ports or directories, make sure to read this blog post.

I've noted a couple of serious decisions made by Oracle in regard to MySQL. First of the is to discontinue work on MySQL Internals manual and integrate it some how into the main manual. You can find related text added recently to some bug reports, like Bug #67989 - "MySQL Internals documentation missing 5.6 binlog protocol parts" by my colleague Andrew Hutchings:

"No more updates are made to the MySQL Internals documentation, because it's in the process of being replaced by https://dev.mysql.com/doc/dev/mysql-server/latest/."

Sounds promising, isn't it?

The other one, that is going to be accepted way better, IMHO, is to discontinue work on query cache. Check Bug #86046 - "FROM sub query is cached by mistake" by Meiji Kimura, for example:

"[7 Jul 9:09] Erlend Dahl

MySQL will no longer invest in the query cache, see:
http://mysqlserverteam.com/mysql-8-0-retiring-support-for-the-query-cache/"

So, that were some of the links I've noted during this week. Some of the topics mentioned above may be continued and presented in more details in my further posts.

Sunday, March 9, 2014

What's wrong with MySQL Manual

I think that MySQL Manual is one of the reasons why MySQL became and still remains popular. I find a lot of useful information (at least references to share with customers, if not real insights) there even after 9+ years of working with MySQL every day and with all numerous articles and blog posts on most important topics available now. I still have MySQL manual page open at every browser instance on every laptop I use on a regular basis.

It's simply great, well indexed by Google and has meaningful human-readable URLs, so one can even guess them for the topics he need. I have http://dev.mysql.com/doc/refman/5.6/en/merge-storage-engine.html open right now and looking at it I clearly understand without any search that if I need a reference for SELECT syntax in 5.5 I have just replace "5.6" with "5.5" and "merge-storage-engine" with "select" to get there. It works that way and many users even had created some nice marcos or shortcuts for their browser to end up at the proper manual page after typing just the topic they are interested it.

These being said, today I'd like to write about problems with MySQL manual that I care about. Some of them are not new and are hard to fix/avoid, others I consider recent enough. Today I'd like to discuss 4 of them:

MySQL Manual does NOT have enough careful readers who are ready to report problems or missing details.
Old documentation bug reports seem to be ignored without any obvious good reason.
Some documentation pages are useless with their current content.
"How to" kind of official MySQL documentation is missing, and this seems intentional.

The most important problem for MySQL Manual is the fact that it seems it has not enough careful readers who are interested in making it better. Great MySQL Documentation team works hard on improving documentation, but even now, in Oracle, with all the funds theoretically available and almost unlimited resources in general, it seems the team is hardly can keep up with documenting new features and releases in time, so they do not have either time or free pairs of eyes to do real "QA" for the manual other than proofreading maybe.

Also, it's easy to understand that authors who had written the text are not the best people to read it later with a hope to find something useful or some mistake there. They deal with their "ideas" and may have too many things assumed in mind comparing to average user/reader of the manual. They are also very experienced MySQL users and even developers in the past. I had written a book myself long time ago and translated dozens of technical books into Russian, so I know the feeling - for you, as a writer, your text is clear, useful and may be just great. While it is NOT, especially for readers with different technical or cultural background.

As a result we have manual pages that are less than clear and has a lot of useful things missing or improperly documented. When I started to read the manual carefully some time ago (after I stopped writing about MySQL bugs at Facebook I needed some other thing to do to keep my mind in peace) I had found out that I can easily report a bug (if not 5 or more) for any page I open. I see missing details or wrong statements everywhere. I had even "invented" separate "missing manual" tag to classify some of the documentation bugs I've reported. It's so easy to find these problems and report them as bugs in "Server: Documentation" category...

So, MySQL manual has bugs, but not so many it seems (I see 86 active ones at the moment). This is NOT because it's ideal, but more because too few people care to read the manual carefully and report anything they consider wrong or missing. Actually, besides MySQL Support engineers and few other Oracle employees, few of my colleagues at Percona and myself, I know only 3 Community members who report documentation bugs recently:

Daniel van Eeden. He is a well known Community bug reporter and used to be great Oracle customer (at least this is how I remember him from the days when I worked in Oracle MySQL Support). Out of his 78 active bug reports I see 11 for MySQL Server manual, but he also reports bugs in documentation for Oracle's "commercial" MySQL software, like MySQL Enterprise Backup.
Peter Laursen. He is also a well known and productive MySQL bugs reporter for a decade probably. Out of his 83 active bug reports 5 are related to MySQL Manual. They are not very recent though.
Federico Razzoli. He had started to report himself only recently because of privacy-related issues with information requested by Oracle to obtain SSO account for bugs reporting. Before that he asked me to report bugs sometimes, and I did. So, now of his 5 active bugs 2 are related to MySQL Manual.

So, just few reporters who care and, at the same time, problems and deficiencies easy to find everywhere. This is a main problem, IMHO, and the only way to get a hope to solve it is to start reading MySQL Manual and reporting bugs/problems noted! Surely, it's just the first step, somebody has to write/provide real missing content or fixes. If you can suggest some better text, please, do it. If you do not want to wait there are probably alternative ways: you have to document MySQL related things elsewhere. For example, this is what Federico did when he noted that no way to deal with materialized views is documented in usual sources of MySQL-related information. I plan to discuss alternatives to MySQL Manual later. Let's concentrate on what to do to make it better for now.

Other important problem of the manual related to bugs is the fact that some of the bugs reported are NOT fixed in time. Let me share my favorite example here, my Bug #68097. It was reported almost 14 months ago and asked to actually properly document PERFORMANCE_SCHEMA behavior that was intended and explained long time ago - the only way to enable instrumentation for (InnoDB) rwlocks and mutexes is upon server startup. If these instruments where not enabled at startup they are just not available, otherwise you can disable them and enable them back dynamically. I fail to understand what's so hard to write exactly this, as we can find statements about this from Oracle engineers in many places.

As a result, MySQL Manual stays misleading for years! By the way, the oldest documentation bug I see in "Verified" status, Bug #30538, remains in this state without further comments since October 19, 2010. Based on recent comments it may be even somehow fixed, but nobody cares to change status then. And this is for a bug reported back in 2007 by famous customer (probably Google at that time), Mark Callaghan. Obviously we see not enough care about older documentation requests. My recent ones are usually fixed fast though.

Some pages of MySQL Manual exist for years, but their content is almost useless without looking into the code or checking with developers. Great examples can be found in documents devoted to NDB Cluster internals, for example, this page (it's one of many): https://dev.mysql.com/doc/ndbapi/en/ndb-internals-dump-command-2401.html. Nobody even tried to explain the output here, interpretation of the fields, use cases for this DUMP command, nothing. Just few words and example of output that I can probably get myself. How this page is useful to anybody?

Unfortunately, the problem is NOT limited to NDB Cluster internals (that I am happy to honestly not care about at all at the moment). Same happens even to InnoDB. Check http://dev.mysql.com/doc/refman/5.6/en/innodb-monitors.html and tell me where it explains the details about any lock waits related information you can find out in TRANSACTIONS section of INNODB STATUS output. I see only this text:

"If this section reports lock waits, your applications might have lock contention. The output can also help to trace the reasons for transaction deadlocks."

That's all! Really. It does not even explain how "next-key lock" looks in the output vs "record lock"... As a result, one can not explain the way to troubleshoot even simple deadlock based on the manual or, if she tries, she can NOT prove the points with anything but experiments and quotes from the source code. And all that happens when InnoDB lock model does NOT change for decade(s) and source code seems commented in details.

The fourth problem, the lack of official "How to" or "FAQ" documents, is closely related to the above. Not only MySQL Manual sometimes misses a lot of details, you can not get them from http://dev.mysql.com/doc/ (the place to look for official MySQL guides) at all! How to read the output of SHOW ENGINE INNODB STATUS? Now to use PERFORMANCE_SCHEMA to troubleshoot different types of performance problems? You can find answers (of different quality, level of details and age) in great blog posts, books and presentations, maybe in Oracle's knowledge base articles (if you are lucky to be their customer and good in search there) but NOT in official MySQL documentation.

I do not consider this normal or acceptable, especially for the open source software. Check FreeBSD Documentation Project, for example, or Linux Documentation Project to get some ideas of how other open source software approaches the problem.

Surely, adding a collection of "How to" guides probably requires input from Community, but why at least not to try to collect links to all the good existing documents of this kind somewhere at http://dev.mysql.com? Having third party documents of this kind with "officially approved" status after a review by Oracle's MySQL Documentation and/or Support teams would help a lot, I think.

That's all I can complain about at the moment. One day I'll try to continue...

Friday, July 26, 2013

Fun with Bugs #19 - waiting for MySQL 5.6.13 and some real fun?

I feel like MySQL 5.6.12 was released ages ago, while in reality it was on June 3, less than 2 months ago. No wonder I feel so, after writing several posts about bugs fixed and not fixed in it... Anyway, we still have to wait for MySQL 5.6.13 for a week or even two probably and in the meantime I decided to write new post for this series based on good old idea of making a digest of my recent bugs-related posts at Facebook. I know, it's boring and annoying (same as waiting for the release of 5.6.13).

Let's start with Bug #69846 - "ICP does not work on UNIQUE indexes". Based on my quick tests presented there I'd say that ICP (index condition pushdown) actually does not work for InnoDB clustered key (being it explicit PRIMARY KEY or just the first UNIQUE KEY), and this is probably by design, as such index is "equal" to InnoDB table itself. I'd still prefer to see this explained in the manual. Bug is still "Open", so Oracle engineers may have different opinion.

Another bug made my day actually, it's Bug #69842 from Domas. It does not matter that it is "Open" or that doublewrite buffer does not work efficiently with InnoDB page sizes smaller than 16K... Question is how to create proper Latin-based term for writing something 3, 5 ("quintuple-writing") or 17 times...

Optimizer in MySQL had always been a source of endless fun. It seems that for some cases it worked properly last time in 5.1 (if not 5.0). Check Bug #69833 for a recent example, "Bad interaction between MIN/MAX and "HAVING SUM(DISTINCT)": wrong results". Yes, it's a weird corner case with not much real life use, but still it's a regression bug. In some areas MySQL 5.1 was better than any 5.5 or 5.6 GA version from Oracle so far :)

Bug #62578 is still affecting customers who use MySQL 5.5.x. Recent checks show that 5.6.12 is NOT affected (neither Sinisa nor me were able to repeat it on 5.6.12), but mysql client of 5.5.32 still crashes when you reside terminal (or putty) window it runs in. Fix is well know and probably found its way to 5.6 long time ago, but why it is NOT in 5.5? Yes, I still remember talks about a policy to fix bugs in older GA versions only when customers ask and there is no risk or high efforts involved, but isn't this the case for this bug?

I've spent a lot of time trying to find similar known bug report for Bug #69825 reported by my colleague, but failed. It seems new problem, at least in public bugs database. Sounds simple, "InnoDB: Assertion failure in thread ... in file lock0wait.cc line 297", affects 5.6.12 and the only repeatable test case so far is based on customer confidential data (as far as I know), but still... Why it is still "Open" and got no comments? Had everybody in Oracle decided that GTID-based crash safe replication with information stored in InnoDB tables in mysql database just always work, so it must be some bug reporter fault? It's not the case, dear colleagues, it's not the case...

Now something simple and easy to verify by code review, Bug #69827. "Hardcoded libdir in cmake", so that lib64 on Linux is used only for x86_64. Who cares I've asked? But even if we do not take 64-bit PowerPC into account any more, some sources claim 64-bit ARM-based servers are coming soon... Should be easy fix actually (hacked in by many users already probably).

If you are going to upgrade to MySQL 5.6 and had forgotten that configuration file in basedir (if present) overrides settings in /etc/my.cnf, check Bug #68643 - "sql_mode is unkind in my.cnf created by mysql_install_db". Yes, as it silently includes STRICT_TRANS_TABLES. It's just a feature request for now, so hardly is going to change any time soon (if at all). Just take care about this, I warned you.

Week started with funny bugs mostly. This one is open for more than 7 years and still scare people, Bug #18256. If session is killed you still may see messages like this in your mysql command line client:

ERROR 1053 (08S01) at line 1: Server shutdown in progress

Do not panic, please. Server may be perfectly alive, just your session no longer works.

On the weekend I still had fun with the manual and friends who are scared by Oracle requests for personal information. Check Bug #69805 I've reported (or http://falseisnotnull.wordpress.com/2013/07/21/mysql-not-found-errors-precedence-docs-bug/ for the whole story) and note that fine manual does not describe what was really implemented correctly. Let them now decide who was wrong, manual (I hope) or implementation of the feature as designed. Error handling in stored routines is still to complex and far from perfect even in MySQL 5.6 if you ask me, at least comparing to good old Oracle PL/SQL (this is what I really miss since 2005 sometimes...)

That's all for now. I've spent this week digging around and trying to fix things and protect strangers from dangers like this:

Hope this helps somebody.