Wednesday, January 6, 2016

Exploring Metadata Locks with gdb - Double Checking the Initial Results

Some results in my initial post in this series led me to questions that I'll try to answer here. First of all, I noted that SELECT from a single table ended up with just one metadata lock request:

(gdb) b MDL_request::init
Breakpoint 1 at 0x648f13: file /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc, line 1266.
Breakpoint 2 at 0x648e70: file /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc, line 1245.
warning: Multiple breakpoints were set.
Use the "delete" command to delete unwanted breakpoints.
(gdb) c
Continuing.
[Switching to Thread 0x7ff224c9f700 (LWP 2017)]

Breakpoint 2, MDL_request::init (this=0x7ff1fbe425a8,
    mdl_namespace=MDL_key::TABLE, db_arg=0x7ff1fbe421c8 "test",
    name_arg=0x7ff1fbe421d0 "t1", mdl_type_arg=MDL_SHARED_READ,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {


An MDL_SHARED_READ lock on the table is expected; we read from it, after all. Let's check with a backtrace where this request happens:






(gdb) bt
#0  MDL_request::init (this=0x7ff1fbe425a8, mdl_namespace=MDL_key::TABLE,
    db_arg=0x7ff1fbe421c8 "test", name_arg=0x7ff1fbe421d0 "t1",
    mdl_type_arg=MDL_SHARED_READ, mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x00000000006d3033 in st_select_lex::add_table_to_list (this=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:7479
#2  0x000000000077eb6d in MYSQLparse (YYTHD=0x7ff1fbb65000)
    at /var/lib/jenkins/jobs/percona-server-5.6-source-tarballs/workspace/sql/sql_yacc.yy:10888
#3  0x00000000006decef in parse_sql (thd=0x7ff1fbb65000,
    parser_state=0x7ff224c9e130, creation_ctx=0x0)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:9005
#4  0x00000000006df1b1 in mysql_parse (thd=0x7ff1fbb65000, rawbuf=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:6878
#5  0x00000000006e0d1f in dispatch_command (command=<value optimized out>,
    thd=0x7ff1fbb65000, packet=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:1442
#6  0x00000000006ad692 in do_handle_one_connection (thd_arg=Unhandled dwarf expression opcode 0xf3
)
...


So, the request happens as soon as we note a table to read from while parsing the query text. What surprised me initially was the absence of any other MDL lock requests - nothing at the schema level, for example. I expected metadata locks to prevent dropping the schema while I read from some table in it, but I had not seen any schema-level lock for this.

To get more details, and to make sure I was not missing anything by studying a corner case, I added a row to the table and ran the same SELECT in an explicit transaction; then (after the SELECT completed, but with the transaction still active) I tried to DROP DATABASE from a second session. (This test will also let us see what metadata locks are requested for DROP DATABASE, by the way.)
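In SQL terms the scenario looked like this (a sketch; the exact statements are my assumption based on the description above):

-- session 1
START TRANSACTION;
SELECT * FROM test.t1;
-- keep the transaction open

-- session 2 (the one traced in gdb)
DROP DATABASE test;

With the breakpoint on MDL_request::init still set, I got the following in gdb: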

[New Thread 0x7ff224c5e700 (LWP 2048)]
[Switching to Thread 0x7ff224c5e700 (LWP 2048)]

Breakpoint 2, MDL_request::init (this=0x7ff224c5a040,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff224c5a040, mdl_namespace=MDL_key::GLOBAL,
    db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x00000000007d1ed7 in lock_schema_name (thd=0x7ff1fbbb5000,
    db=0x7ff1e581d0b0 "test")
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/lock.cc:794
#2  0x00000000006b03d4 in mysql_rm_db (thd=0x7ff1fbbb5000,
    db=0x7ff1e581d0b0 "test", if_exists=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_db.cc:787
...

---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a1f0,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a3a0,
    mdl_namespace=MDL_key::SCHEMA, db_arg=0x7ff1e581d0b0 "test",
    name_arg=0xba2ae4 "", mdl_type_arg=MDL_EXCLUSIVE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff224c5a3a0, mdl_namespace=MDL_key::SCHEMA,
    db_arg=0x7ff1e581d0b0 "test", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_EXCLUSIVE, mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x00000000007d1fb2 in lock_schema_name (thd=0x7ff1fbbb5000,
    db=0x7ff1e581d0b0 "test")
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/lock.cc:801
#2  0x00000000006b03d4 in mysql_rm_db (thd=0x7ff1fbbb5000,
    db=0x7ff1e581d0b0 "test", if_exists=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_db.cc:787
#3  0x00000000006daaa2 in mysql_execute_command (thd=0x7ff1fbbb5000)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:4371
#4  0x00000000006df518 in mysql_parse (thd=0x7ff1fbbb5000, rawbuf=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:6972
...

---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff1e58313b0,
    mdl_namespace=MDL_key::TABLE, db_arg=0x7ff1e5831570 "test",
    name_arg=0x7ff1e5831575 "t1", mdl_type_arg=MDL_EXCLUSIVE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff1e58313b0, mdl_namespace=MDL_key::TABLE,
    db_arg=0x7ff1e5831570 "test", name_arg=0x7ff1e5831575 "t1",
    mdl_type_arg=MDL_EXCLUSIVE, mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x00000000006b0857 in find_db_tables_and_rm_known_files (
    thd=0x7ff1fbbb5000, db=0x7ff1e581d0b0 "test", if_exists=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_db.cc:1108
#2  mysql_rm_db (thd=0x7ff1fbbb5000, db=0x7ff1e581d0b0 "test", if_exists=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_db.cc:812
...


Now it's clear what happens: eventually we have to drop the table while dropping the schema, and here we surely hit the blocking metadata lock that our SELECT had set! Let's continue:

---Type <return> to continue, or q <return> to quit---q
Quit

(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff1e581ece0,
    mdl_namespace=MDL_key::SCHEMA, db_arg=0x7ff1e5831570 "test",
    name_arg=0xba2ae4 "", mdl_type_arg=MDL_INTENTION_EXCLUSIVE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff1e581ece0, mdl_namespace=MDL_key::SCHEMA,
    db_arg=0x7ff1e5831570 "test", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x000000000068da14 in lock_table_names (thd=0x7ff1fbbb5000,
    tables_start=0x7ff1e5831010, tables_end=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:5033
#2  0x00000000006b05df in mysql_rm_db (thd=0x7ff1fbbb5000,
    db=0x7ff1e581d0b0 "test", if_exists=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_db.cc:834
...


So, while our request for an MDL_EXCLUSIVE lock on the table surely could not be satisfied, the server continued with some further requests:

---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a1f0,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a3a0,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.


and only at this stage did it hang (waiting for the transaction where the SELECT happened to complete). As soon as I committed there:

Breakpoint 2, MDL_request::init (this=0x7ff224c5a260,
    mdl_namespace=MDL_key::TABLE, db_arg=0xb925ef "mysql",
    name_arg=0xb94906 "proc", mdl_type_arg=MDL_SHARED_READ,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a360,
    mdl_namespace=MDL_key::TABLE, db_arg=0xb925ef "mysql",
    name_arg=0xb94906 "proc", mdl_type_arg=MDL_SHARED_WRITE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {




So, we tried to read and then maybe change the mysql.proc table (to drop any procedures created in this database; honestly, I was not sure whether there were any at that moment).
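Whether there actually are any stored routines in a schema can be checked directly, since mysql.proc is an ordinary table in 5.6 (a sketch):

SELECT db, name, type FROM mysql.proc WHERE db = 'test';

Next: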

(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c59cb0,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c599c0,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a250,
    mdl_namespace=MDL_key::TABLE, db_arg=0xb925ef "mysql",
    name_arg=0xc111a8 "event", mdl_type_arg=MDL_SHARED_WRITE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.




That is, after a couple more lock requests we obviously try to delete the events defined in this database from the mysql.event table.
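Again, whether any events are defined in the schema is easy to check (a sketch):

SELECT db, name FROM mysql.event WHERE db = 'test';

Let's continue: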

Breakpoint 2, MDL_request::init (this=0x7ff224c59900,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff224c59900, mdl_namespace=MDL_key::GLOBAL,
    db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x000000000068c13a in open_table (thd=0x7ff1fbbb5000,
    table_list=0x7ff224c59eb0, ot_ctx=0x7ff224c59c10)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:2895
#2  0x0000000000694585 in open_and_process_table (thd=0x7ff1fbbb5000,
    start=0x7ff224c59e48, counter=0x7ff224c59e50, flags=2048,
    prelocking_strategy=0x7ff224c59ea0)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:4797
#3  open_tables (thd=0x7ff1fbbb5000, start=0x7ff224c59e48,
    counter=0x7ff224c59e50, flags=2048, prelocking_strategy=0x7ff224c59ea0)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:5304
#4  0x0000000000694d74 in open_and_lock_tables (thd=0x7ff1fbbb5000,
    tables=0x7ff224c59eb0, derived=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:5960


#5  0x000000000085039d in open_and_lock_tables (thd=0x7ff1fbbb5000, lock_type=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.h:477
#6  Event_db_repository::open_event_table (thd=0x7ff1fbbb5000, lock_type=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/event_db_repository.cc:619
#7  0x0000000000850bfc in Event_db_repository::drop_schema_events (this=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/event_db_repository.cc:1011
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c59610,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff224c59610, mdl_namespace=MDL_key::BACKUP,
    db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x00000000007d1dfe in Global_backup_lock::acquire_protection (
    this=0x7ff1fbbb6a08, thd=0x7ff1fbbb5000, duration=MDL_STATEMENT,
    lock_wait_timeout=31536000)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/lock.cc:1225
#2  0x000000000068ce20 in open_table (thd=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:3258
#3  0x0000000000694585 in open_and_process_table (thd=0x7ff1fbbb5000,
    start=0x7ff224c59e48, counter=0x7ff224c59e50, flags=2048,
    prelocking_strategy=0x7ff224c59ea0)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:4797
#4  open_tables (thd=0x7ff1fbbb5000, start=0x7ff224c59e48,
    counter=0x7ff224c59e50, flags=2048, prelocking_strategy=0x7ff224c59ea0)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:5304
#5  0x0000000000694d74 in open_and_lock_tables (thd=0x7ff1fbbb5000,
    tables=0x7ff224c59eb0, derived=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:5960

...
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5ab00,
    mdl_namespace=MDL_key::BINLOG, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_EXPLICIT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

We ended up writing to the binary log, and DROP DATABASE completed.

Let's summarize this experience. First of all, it seems that in the current implementation there is no need to set metadata locks at the schema level to prevent dropping the database we read from with SELECT: a table-level metadata lock will eventually block the DROP DATABASE request. If we have many tables in the database, the blocking lock may be hit not on the very first one, so some tables might get dropped before we are blocked. This is something to study later. I'd really prefer DROP DATABASE to be atomic and not even start if it cannot complete at the moment. See Bug #79610 as well.
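One way to study this later might be a setup like the following (a sketch, not verified here; the object names are made up):

-- session 1
CREATE DATABASE d1;
CREATE TABLE d1.t0 (id INT) ENGINE=InnoDB;
CREATE TABLE d1.t1 (id INT) ENGINE=InnoDB;
START TRANSACTION;
SELECT * FROM d1.t1;

-- session 2
DROP DATABASE d1;  -- blocks on the metadata lock held on d1.t1

-- session 3: check whether d1.t0 is still there while the DROP waits
SHOW TABLES IN d1;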

We see table-level metadata lock requests for the mysql.proc and mysql.event tables when DROP DATABASE is executed. This is expected and reasonable - we can NOT just drop the database (schema) if there are stored functions, procedures or events defined in it; they have to be removed as well.

We also noted that even when a metadata lock request cannot be satisfied immediately, processing may continue for some time before the thread actually stops, and then it resumes when we finally get the required metadata lock.

This is the code to study in sql/lock.cc:

 760 /**
 761   Obtain an exclusive metadata lock on a schema name.
 762
 763   @param thd         Thread handle.
 764   @param db          The database name.
 765
 766   This function cannot be called while holding LOCK_open mutex.
 767   To avoid deadlocks, we do not try to obtain exclusive metadata
 768   locks in LOCK TABLES mode, since in this mode there may be
 769   other metadata locks already taken by the current connection,
 770   and we must not wait for MDL locks while holding locks.
 771
 772   @retval FALSE  Success.
 773   @retval TRUE   Failure: we're in LOCK TABLES mode, or out of memory,
 774                  or this connection was killed.
 775 */
 776
 777 bool lock_schema_name(THD *thd, const char *db)
 778 {
 779   MDL_request_list mdl_requests;
 780   MDL_request global_request;
 781   MDL_request backup_request;
 782   MDL_request mdl_request;
 783
 784   if (thd->locked_tables_mode)
 785   {
 786     my_message(ER_LOCK_OR_ACTIVE_TRANSACTION,
 787                ER(ER_LOCK_OR_ACTIVE_TRANSACTION), MYF(0));
 788     return TRUE;
 789   }
 790
 791   if (thd->global_read_lock.can_acquire_protection())
 792     return TRUE;
 793   global_request.init(MDL_key::GLOBAL, "", "", MDL_INTENTION_EXCLUSIVE,
 794                       MDL_STATEMENT);
 795
 796   if (thd->backup_tables_lock.abort_if_acquired())
 797     return true;
 798   thd->backup_tables_lock.init_protection_request(&backup_request,
 799                                                   MDL_STATEMENT);
 800
 801   mdl_request.init(MDL_key::SCHEMA, db, "", MDL_EXCLUSIVE, MDL_TRANSACTION);
 802
 803   mdl_requests.push_front(&mdl_request);
 804   mdl_requests.push_front(&backup_request);
 805   mdl_requests.push_front(&global_request);
 806
 807   if (thd->mdl_context.acquire_locks(&mdl_requests,
 808                                      thd->variables.lock_wait_timeout))
 809     return TRUE;
 810
 811   DEBUG_SYNC(thd, "after_wait_locked_schema_name");
 812   return FALSE;


From this code we see that metadata locks are actually acquired in batches(!). This explains why we did not stop at our first "blocked" request immediately, and it complicates tracing in gdb. We also see that there is an mdl_context object in the THD structure, which we may use later to study pending requests per thread. One day I'll start setting breakpoints on MDL_context::acquire_locks as well.
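Something along these lines should do (a sketch, not tried in this session; the parameter name mdl_requests is taken from the call in the listing above and may differ between versions):

(gdb) b MDL_context::acquire_locks
(gdb) c
...
(gdb) bt
(gdb) print *mdl_requests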

It's also more or less clear what the MDL_key::GLOBAL namespace is used for - this is the metadata lock requested by the FLUSH TABLES WITH READ LOCK global lock, and that's why an intention-exclusive lock in this namespace is requested for any "write" operation, even before we try to perform it. If I execute FLUSH TABLES WITH READ LOCK, I get the following in my gdb session with the breakpoint set:

[Switching to Thread 0x7ff224c9f700 (LWP 2017)]

Breakpoint 2, MDL_request::init (this=0x7ff224c9c9f0,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_SHARED, mdl_duration_arg=MDL_EXPLICIT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c9c9f0,
    mdl_namespace=MDL_key::COMMIT, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_SHARED, mdl_duration_arg=MDL_EXPLICIT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.


and then the statement completes. On UNLOCK TABLES, by the way, nothing happens at the MDL level - the breakpoint does not fire, as the locks taken above (with MDL_EXPLICIT duration) are simply released.
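To see this from the SQL side, one can repeat our DROP DATABASE experiment while FLUSH TABLES WITH READ LOCK is active in another session (a sketch): the MDL_INTENTION_EXCLUSIVE requests in the GLOBAL namespace that we traced above are incompatible with the MDL_SHARED lock held here.

-- session 1
FLUSH TABLES WITH READ LOCK;

-- session 2
DROP DATABASE test;  -- blocks, typically shown as 'Waiting for global read lock'

-- session 1, later
UNLOCK TABLES;       -- lets the DROP DATABASE proceed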

The last topic I'd like to discuss in this post is metadata locks with the namespace defined like this: mdl_namespace=MDL_key::BACKUP - "backup locks". The manual for MySQL 5.7 does not mention them, and they do not exist in the code there at all:

[openxs@centos ~]$ grep -rn MDL_key::BACKUP ~/git/mysql-server/*
[openxs@centos ~]$ echo $?
1

In MySQL we see the following in mdl.h:

  enum enum_mdl_namespace { GLOBAL=0,
                            TABLESPACE,
                            SCHEMA,
                            TABLE,
                            FUNCTION,
                            PROCEDURE,
                            TRIGGER,
                            EVENT,
                            COMMIT,
                            USER_LEVEL_LOCK,
                            LOCKING_SERVICE,
                            /* This should be the last ! */
                            NAMESPACE_END };


So, backup locks are a feature unique to Percona Server, explained in its manual, and they are implemented with MDL locks. As soon as we run LOCK TABLES FOR BACKUP, we see the following in the gdb session:

Breakpoint 2, MDL_request::init (this=0x7ff224c9ca90,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_SHARED, mdl_duration_arg=MDL_EXPLICIT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {

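In SQL terms, while such a backup lock is held, the DROP DATABASE from our earlier experiment would have to wait, since its MDL_INTENTION_EXCLUSIVE requests in the BACKUP namespace conflict with this MDL_SHARED lock (a sketch):

-- session 1 (Percona Server)
LOCK TABLES FOR BACKUP;

-- session 2
DROP DATABASE test;  -- waits until session 1 runs UNLOCK TABLES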

This reminded me of the difference Percona Server can make, so next time I'll try to re-check my results with plain MySQL 5.6 as well. Stay tuned!

5 comments:

  1. Hi Valeriy

    I came across your blog post about metadata locks on Percona's site. Could you elaborate on it a little bit more? I'm looking for techniques to minimize metadata lock time during ALTERs.

    Am I right that if I put ", LOCK=EXCLUSIVE" on my ALTER statement, it will get the lock much earlier compared with LOCK=DEFAULT? I have a table that gets accessed frequently and all transactions on it are pretty short, but for some reason, when I try to update it, the ALTER query hangs on a metadata lock for a long time. Does it make any sense at all?

    Sorry if my question is out of the scope of your blog post. I would appreciate any guidance, maybe in the form of another post about metadata locks? )

    Replies
    1. With LOCK=EXCLUSIVE I expect to see an MDL_EXCLUSIVE lock set on the table at the very beginning. I agree that checking the influence of the LOCK clause is a good topic to continue this series of posts.

      As for your real problem of blocked ALTERs, I'd like to get more details, like the text of a typical ALTER that gets blocked (to check whether it is or can be executed "online"), the exact MySQL version used, and the output of SHOW PROCESSLIST while the ALTER is waiting on the metadata lock.

  2. MySQL version - 5.7.10.

    A typical ALTER is the simplest ADD COLUMN. You can check it in the SHOW PROCESSLIST output here - https://gist.github.com/dnaumenko/7b8c0ab88df0aa1e15da. It eventually took ~1000 seconds to complete, even though the target table is not that big (291194 rows). Our devops insist on adding LOCK=EXCLUSIVE to make it run faster during releases, but I'm not sure it can help in any way. A related question - can LOCK=NONE mitigate the metadata locks, and what can eventually go wrong if we use it?

    Replies
    1. There is no way to mitigate (or skip) metadata locking. LOCK=NONE is just a quick way to check if some ALTER will be blocking or not:

      mysql> alter table tt1 add column c2 varchar(100), lock=NONE;
      Query OK, 0 rows affected (0.64 sec)
      Records: 0 Duplicates: 0 Warnings: 0

      mysql> alter table tt1 modify column c2 varchar(120), lock=NONE;
      ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: Cannot change column type INPLACE. Try LOCK=SHARED.

      So, if it's "not blocking reads and writes", it will proceed (and still request the metadata locks that are needed - for example, surely any ALTER must prevent DROP TABLE from running until it's completed). If it would block reads or writes, you'll get an error message and nothing will be locked.

      In your case the ALTER has to wait for a metadata lock that is held by some other session that probably did a SELECT from the table and then did NOT commit explicitly:

      827 | Waiting for table metadata lock | ALTER TABLE target ADD COLUMN advertiser_id int(10) unsigned DEFAULT NULL

      As you are on 5.7 you can easily identify the blocking session using P_S tables and queries like those mentioned in my blog post:

      https://www.percona.com/blog/2015/04/07/more-on-transactional-mysql-metadata-locks/
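
      Here is a sketch of the kind of query I mean (it assumes the wait/lock/metadata/sql/mdl instrument is enabled in performance_schema; column names are as in 5.7):

      SELECT ml.object_schema, ml.object_name, ml.lock_type, ml.lock_status,
             t.processlist_id, t.processlist_info
        FROM performance_schema.metadata_locks ml
        JOIN performance_schema.threads t ON t.thread_id = ml.owner_thread_id;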

      Most likely the real problem for ALTER is some concurrent sessions that do NOT complete fast enough.

    2. Thanks. I will check the metadata locks table next time. Maybe there are some hanging sessions left that I didn't notice.
