Sunday, January 28, 2018

On InnoDB's Online DDL

I am completing my preparations for the upcoming FOSDEM talk, and one of the last things I wanted to do as part of them is a quick review of known bugs and problems in the current (as in MySQL 5.7 GA) implementation of so-called "online" DDL for InnoDB tables.

In my previous post I already shared my view on another important InnoDB feature, persistent statistics. Unlike that, I do not really hate online DDL. I just try to avoid it if possible and use tools like pt-online-schema-change or gh-ost instead. This is not because it is poorly documented (the documentation is quite detailed, though there are still things to clarify) or does not work as designed. It is mostly because the term "online" is a bit misleading if we understand it as "not blocking", that is, read and write access to the table being changed remains available and the application is not affected. In practice it is more like "less blocking" or "blocking for shorter periods of time", and only sometimes faster and in-place. The other reason is that it does not work the way one might expect in any kind of replication setup.

To be more specific:
  • Replication ignores LOCK=NONE :) The slave starts to apply the "concurrent" DML only after the ALTER TABLE itself has been applied there, and this leads to a huge replication lag.
  • In too many cases the entire table is rebuilt (data are rewritten), in place or by creating a copy. Yet substantial writes while running ALTER TABLE are really required only in two cases: when we introduce stricter constraints (and even then we could just validate the table, return an error if some row does not satisfy the new constraint, for example because a value is too long to fit, and change only the metadata if all rows are OK), or when we add new indexes (which cannot be used until they are built anyway).
  • The online log has to be kept (in memory or in a temporary file). There is one such log file for each index being created or table being altered. The manual says:
    "This log file stores data inserted, updated, or deleted in the table during the DDL operation. The temporary log file is extended when needed by the value of innodb_sort_buffer_size, up to the maximum specified by innodb_online_alter_log_max_size. If a temporary log file exceeds the upper size limit, the ALTER TABLE operation fails and all uncommitted concurrent DML operations are rolled back. Thus, a large value for this option allows more DML to happen during an online DDL operation, but also extends the period of time at the end of the DDL operation when the table is locked to apply the data from the log."
    The problem is that the size depends on the concurrent DML workload and is hard to predict. Note also "when the table is locked" above to understand how "online" this really is...
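
As a rough illustration of the points above, here is a minimal sketch that assumes a hypothetical table t1 with a column c1 (both names are made up for the example). Stating ALGORITHM and LOCK explicitly makes the statement fail with an error when InnoDB cannot perform the change in place without blocking DML, instead of quietly falling back to a more blocking method. The online log limit can also be checked and raised before a long ALTER on a write-heavy table. The 1 GiB value is arbitrary, not a recommendation.

```sql
-- Hypothetical table t1 and column c1; the names are illustrative only.
-- Ask explicitly for an in-place change that permits concurrent DML.
-- If InnoDB cannot honor both clauses, the statement fails with an error.
ALTER TABLE t1 ADD INDEX idx_c1 (c1), ALGORITHM=INPLACE, LOCK=NONE;

-- Check the current upper limit (in bytes) for the temporary online log.
SHOW GLOBAL VARIABLES LIKE 'innodb_online_alter_log_max_size';

-- Raise the limit before a long-running ALTER on a busy table.
-- 1073741824 bytes (1 GiB) is an arbitrary example value.
SET GLOBAL innodb_online_alter_log_max_size = 1073741824;
```

A larger limit only makes a failure from log overflow less likely. As the manual excerpt says, it also means more logged changes to apply while the table is locked at the end of the operation.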
There are also bugs, and I'd like to discuss some of them:
  • Bug #82997, "Online DDL fails with". There are not enough public details to be sure with what exactly, but maybe the problems (several are reported) happen when the altered table has a generated column. If this is really so, the bug may be fixed in MySQL 5.7.19+.
  • Bug #73196, "Allow ALTER TABLE to run concurrently on master and slave". I cannot put this better than Andrew Morgan did in this verified feature request:
    "With online ALTER TABLE it is possible for the DDL operation to run for many hours while still processing DML on that same table. The ALTER TABLE is not started on the slave until after it has completed on the master and it will again take many hours to run on the slave. While the DDL runs on the slave, it is not possible for it to process any transactions which followed the ALTER TABLE on the master as they may be dependent on the changes that were made to the table's schema. This means that the slave will lag the master by many hours while the ALTER TABLE runs and then while it catches up on the backlog of DML sent from the master while that was happening."
    Both pt-osc and gh-ost resolve this problem, as they take the replication topology into account and can throttle changes on the master if needed. See also this documentation request by Daniël van Eeden, Bug #77619, which lists more limitations of "online" DDL, and check how it helped to clarify them here.
  • Bug #67286, "InnoDB Online DDL hangs". It ended up as "Not a bug", but there is a good explanation of the exclusive metadata lock set by the "online" ALTER in the comments:
    "The final (short) phase of ALTER where the internal data dictionary is updated requires exclusive access. That's why the ALTER was blocked by the active transaction having a shared lock on the table."
    I once studied a similar (and even simpler) case in a lot of detail with gdb, see this blog post. I clearly saw an MDL_EXCLUSIVE lock request for a simple ALTER TABLE ... STATS_AUTO_RECALC=1 that (according to the manual) "permits concurrent DML". Another manual page clarifies:
    "In most cases, an online DDL operation on a table waits for currently executing transactions that are accessing the table to commit or roll back because it requires exclusive access to the table for a brief period while the DDL statement is being prepared. Likewise, the online DDL operation requires exclusive access to the table for a brief time before finishing. Thus, an online DDL statement also waits for transactions that are started while the DDL is in progress to commit or roll back before completing."
    Dear MySQL Oracle developers, just remove "In most cases" (or clarify it), and this would be fair enough!
  • Bug #84004, "Manual misses details on MDL locks set and released for online ALTER TABLE". That's a documentation request I filed after spending some time tracing metadata lock usage in gdb. My request is simple (typos corrected):
    "Describe all kinds of metadata locks used by MySQL, their interactions and order of acquisition and release for most important SQL statements, including (but not limited to) all kinds of online ALTER TABLE statements for InnoDB tables."
  • Bug #68498, "can online ddl for innodb be more online?". This report by Mark Callaghan, which refers to this detailed study, is still "Verified". Based on the comments to that blog post, it is "online enough", but the implementation details were not clearly documented at the time. Check the comments for details and clarifications!
  • Bug #72109, "Avoid table rebuild when adding or removing of auto_increment settings". The bug report from Simon Mudd is still "Verified".
  • Bug #57583, "fast index create not used during "alter table foo engine=innodb"". The bug is still "Verified" and I cannot tell from the manual whether this is implemented in MySQL 5.7 or not.
  • Bug #83557, "Can't use LOCK=NONE to drop columns in table with virtual columns" - nice "Verified" bug report by Monty Solomon.
  • Bug #70790, "ALTER TABLE REBUILD PARTITION SHOULD NOT PREVENT DML IN UNAFFECTED PARTITIONS". My former colleague at Oracle, Arnaud Adant, simply asked for proper and reasonable support of online DDL for partitioned tables. This bug report is still "Verified", but at least we now have a separate manual page that explains the details and limitations of online DDL with partitioned tables (most of Arnaud's requests are still NOT implemented).
  • Bug #81819, "ALTER TABLE...LOCK=NONE is not allowed when FULLTEXT INDEX exists". As Marko Mäkelä explains in the last comment of this "Verified" feature request:
    "However, LOCK=NONE is never supported when a FULLTEXT INDEX exists on the table. Similarly, LOCK=NONE is not supported when SPATIAL INDEX (introduced in MySQL 5.7) exist. Speaking as the author of WL#6255 which implemented ALTER TABLE...LOCK=NONE for InnoDB B-tree indexes in MySQL 5.6, I share the bug reporter's disappointment."
To summarize, online DDL in MySQL 5.7 is surely far above and beyond "fast index creation", but there is still a lot of room for improvement. A real solution (one that allows ALTER TABLE to run fast and without unnecessary changes/writes to data in many more cases) may come with a real data dictionary in MySQL and support for multiple table versions there, or from ideas like those implemented in MDEV-11369, "Instant add column for InnoDB", and expressed in MDEV-11424, "Instant ALTER TABLE of failure-free record format changes". Until all of that is implemented, I'd prefer to rely on good old tools like pt-osc.

In any case, we are talking about backward-incompatible changes to the way MySQL works and stores data now.
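
The metadata lock wait discussed under Bug #67286 and Bug #84004 above is easy to reproduce. The following is a minimal sketch, assuming MySQL 5.7, a hypothetical InnoDB table test.t1 and three client sessions. The comments mark where each statement is meant to run.

```sql
-- Any session: enable the metadata lock instrument
-- (it is disabled by default in MySQL 5.7).
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES', TIMED = 'YES'
 WHERE NAME = 'wait/lock/metadata/sql/mdl';

-- Session 1: read the table inside a transaction and leave it open.
START TRANSACTION;
SELECT COUNT(*) FROM test.t1;

-- Session 2: even this metadata-only change now has to wait, until
-- session 1 ends its transaction or lock_wait_timeout expires.
ALTER TABLE test.t1 STATS_AUTO_RECALC=1;

-- Session 3: list granted and pending metadata locks on the table.
SELECT OBJECT_TYPE, OBJECT_SCHEMA, OBJECT_NAME,
       LOCK_TYPE, LOCK_DURATION, LOCK_STATUS, OWNER_THREAD_ID
  FROM performance_schema.metadata_locks
 WHERE OBJECT_SCHEMA = 'test' AND OBJECT_NAME = 't1';

-- Session 1: end the transaction so that the ALTER can complete.
COMMIT;
```

While session 2 waits, SHOW PROCESSLIST should report it in the "Waiting for table metadata lock" state.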

1 comment:

  1. Thank you for a great summary.

    I am disappointed that the manual buries the detail about locking the table while applying logged changes. This isn't online.