Summary
To set up InnoDB mutex waits monitoring via PERFORMANCE_SCHEMA in general with MySQL 5.6.9 you should do at least the following:- Start MySQL server with all mutex related instruments enabled at startup (performance_schema=ON by default on recent 5.6.x), like this:
mysqld_safe --performance_schema_instrument='wait/synch/mutex/innodb/%=on' & - Connect to server and set up proper consumers explicitly, like this:
UPDATE performance_schema.setup_consumers SET enabled = 'YES' WHERE name like 'events_waits%'; - Run your problematic load.
- Check waits using whatever tables you need, like this:
mysql> select event_name, count_star, sum_timer_wait from performance_schema.events_waits_summary_global_by_event_name where event_name like 'wait/synch/mutex/innodb%' and count_star > 0 order by sum_timer_wait desc limit 5;
+--------------------------------------------+------------+----------------+
| event_name | count_star | sum_timer_wait |
+--------------------------------------------+------------+----------------+
| wait/synch/mutex/innodb/os_mutex | 50236 | 2018647953 |
| wait/synch/mutex/innodb/rw_lock_list_mutex | 8299 | 316652455 |
| wait/synch/mutex/innodb/mutex_list_mutex | 8378 | 300689235 |
| wait/synch/mutex/innodb/fil_system_mutex | 1173 | 56603575 |
| wait/synch/mutex/innodb/buf_pool_mutex | 818 | 34652947 |
+--------------------------------------------+------------+----------------+
5 rows in set (0.00 sec)
Whole story
Let's assume you study some InnoDB performance or scalability problem (like Bug #68079) and would like to know what exact mutex waits happened in the process and how much time was spent on them. Surely you can use INNODB STATUS to see some really bad waits (if you are unlucky) and global wait-related counters, or SHOW ENGINE INNODB MUTEX to see waits that were long enough to force OS context switch:
mysql> show engine innodb mutex;
+--------+-------------------------+-----------------+
| Type | Name | Status |
+--------+-------------------------+-----------------+
| InnoDB | log0log.cc:737 | os_waits=6 |
| InnoDB | buf0buf.cc:1225 | os_waits=3 |
| InnoDB | combined buf0buf.cc:975 | os_waits=117391 |
| InnoDB | log0log.cc:799 | os_waits=124 |
+--------+-------------------------+-----------------+
4 rows in set (0.00 sec)
The output above is enough to understand what mutex waits were long enough to cause OS waits (context switches), but it does not show how many waits happened in total or how much time InnoDB has to wait. You can try to find out is there any way to get more details and manual can give you a hint that debug binaries will show you enough details... but debug binaries may have different scalability properties and, after all, who uses debug binaries in production :)
For this reason or for other (like a lot of buzz everywhere or, again, fine manual) you eventually will figure out that now one should use PERFORMANCE_SCHEMA to study performance-related problems. It is as easy as setting performance_schema option to ON/1 one may conclude based on the manual. Let's try:
[valerii.kravchuk@cisco1 mysql-5.6.9-rc-linux-glibc2.5-x86_64]$ bin/mysqld_safe --no-defaults --performance-schema=ON &
[1] 31089
and then run our problematic test case with mutex waits. Now, when it is finished finally:
[valerii.kravchuk@cisco1 mysql-5.6.9-rc-linux-glibc2.5-x86_64]$ bin/mysqlslap -uroot --concurrency=9 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
Average number of seconds to run all queries: 7.230 seconds
Minimum number of seconds to run all queries: 6.906 seconds
Maximum number of seconds to run all queries: 7.997 seconds
Number of clients running queries: 9
Average number of queries per client: 111
You then try to check the waits, finally, why else you spenht so much time reading about PERFORMANCE_SCHEMA (and improvements to it in 5.6) everywhere:
[valerii.kravchuk@cisco1 mysql-5.6.9-rc-linux-glibc2.5-x86_64]$ bin/mysql -uroot test
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 92
Server version: 5.6.9-rc MySQL Community Server (GPL)
Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select event_name, count_star, sum_timer_wait from performance_schema.events_waits_summary_global_by_event_name where event_name like 'wait/synch/mutex/innodb%' and count_star > 0 order by sum_timer_wait desc limit 5;
Empty set (0.00 sec)
Ups, no waits? How is this possible if SHOW ENGINE INNODB MUTEX shows increased counts? "Ah", (after reading the manual about instrumentation) one may say, we need to enable instruments and timing for them! Like this:
mysql> UPDATE performance_schema.setup_instruments SET enabled = 'YES', timed = 'YES' WHERE name LIKE 'wait/synch/mutex/innodb/%';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 45 Changed: 0 Warnings: 0
That's because by default not so many are enabled, none of those related to InnoDB mutex waits:
mysql> select * from performance_schema.setup_instruments where name like 'wait/synch/mutex/innodb%';
+-------------------------------------------------------+---------+-------+
| NAME | ENABLED | TIMED |
+-------------------------------------------------------+---------+-------+
| wait/synch/mutex/innodb/commit_threads_m | NO | NO |
| wait/synch/mutex/innodb/commit_cond_mutex | NO | NO |
| wait/synch/mutex/innodb/innobase_share_mutex | NO | NO |
| wait/synch/mutex/innodb/autoinc_mutex | NO | NO |
| wait/synch/mutex/innodb/buf_pool_mutex | NO | NO |
| wait/synch/mutex/innodb/buf_pool_zip_mutex | NO | NO |
...
+-------------------------------------------------------+---------+-------+
45 rows in set (0.19 sec)
Yet another fine MySQL 5.6 manual page clearly says that we can enable them dynamically:
"Modifications to any of these tables affect monitoring immediately, with the exception ofSo, we try our problematic test case again, check waits as explained above and... get zero rows again. After some more reading of the pages already mentioned somebody as ignorant as me starts to write posts at Facebook, get hints from nice former colleagues like Mark Leith (who actually had written a lot about the topic and influence PERFORMANCE_SCHEMA design and implementation even more) and finally notes that consumers are NOT set (and manual explains they should be):setup_actors
. Modifications tosetup_actors
affect only foreground threads created thereafter."
mysql> select * from performance_schema.setup_consumers; +--------------------------------+---------+
| NAME | ENABLED |
+--------------------------------+---------+
| events_stages_current | NO |
| events_stages_history | NO |
| events_stages_history_long | NO |
| events_statements_current | YES |
| events_statements_history | NO |
| events_statements_history_long | NO |
| events_waits_current | NO |
| events_waits_history | NO |
| events_waits_history_long | NO |
| global_instrumentation | YES |
| thread_instrumentation | YES |
| statements_digest | YES |
+--------------------------------+---------+
12 rows in set (0.00 sec)
So, let's fix this and make a quick test if it works at all (before running long one again):
mysql> UPDATE performance_schema.setup_consumers SET enabled = 'YES' WHERE name like 'events_waits%';
Query OK, 3 rows affected (0.00 sec)
Rows matched: 3 Changed: 3 Warnings: 0
mysql> drop table t1;
Query OK, 0 rows affected (1.08 sec)
mysql> create table test.t1 (i int) engine = innodb; Query OK, 0 rows affected (1.48 sec)
mysql> insert into test.t1 values (1), (2), (3); Query OK, 3 rows affected (0.36 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select event_name, count_star, sum_timer_wait from performance_schema.events_waits_summary_global_by_event_name where event_name like 'wait/synch/mutex/innodb%' and count_star > 0 order by sum_timer_wait desc limit 5;
+----------------------------------------+------------+----------------+
| event_name | count_star | sum_timer_wait |
+----------------------------------------+------------+----------------+
| wait/synch/mutex/innodb/trx_mutex | 268 | 14017381 |
| wait/synch/mutex/innodb/trx_undo_mutex | 21 | 1690647 |
+----------------------------------------+------------+----------------+
2 rows in set (0.00 sec)
Yes, it works. Let's start test again and finally get all the benefits of improved MySQL 5.6 diagnostics:
mysql> exit
Bye
[valerii.kravchuk@cisco1 mysql-5.6.9-rc-linux-glibc2.5-x86_64]$ bin/mysqlslap -uroot --concurrency=12 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
Average number of seconds to run all queries: 7.330 seconds
Minimum number of seconds to run all queries: 6.926 seconds
Maximum number of seconds to run all queries: 7.692 seconds
Number of clients running queries: 12
Average number of queries per client: 83
[valerii.kravchuk@cisco1 mysql-5.6.9-rc-linux-glibc2.5-x86_64]$ bin/mysql -uroot test
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 306
Server version: 5.6.9-rc MySQL Community Server (GPL)
Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select event_name, count_star, sum_timer_wait from performance_schema.events_waits_summary_global_by_event_name where event_name like 'wait/synch/mutex/innodb%' and count_star > 0 order by sum_timer_wait desc limit 5;
+----------------------------------------+------------+----------------+
| event_name | count_star | sum_timer_wait |
+----------------------------------------+------------+----------------+
| wait/synch/mutex/innodb/trx_mutex | 268 | 14017381 |
| wait/synch/mutex/innodb/trx_undo_mutex | 21 | 1690647 |
+----------------------------------------+------------+----------------+
2 rows in set (0.00 sec)
Now, where are my expected (at least buffer pool related, if somebody had forgotten) mutex waits?
Lucky I am, Mark Leith was reading my Facebook at the moment, so I've got a hint that some instruments can be enabled only at startup (try to find the list of them in the manual one day...) and, since 5.6.4, there are command line options to do that.
So, I have to restart server (imagine you are in production!) to get it working finally the way I've expected:
[valerii.kravchuk@cisco1 mysql-5.6.9-rc-linux-glibc2.5-x86_64]$ bin/mysqld_safe --no-defaults --performance_schema=ON --performance_schema_instrument='wait/synch/mutex/innodb/%=on' &
[1] 32432
[valerii.kravchuk@cisco1 mysql-5.6.9-rc-linux-glibc2.5-x86_64]$ 130115 11:11:23 mysqld_safe Logging to '/home/valerii.kravchuk/mysql-5.6.9-rc-linux-glibc2.5-x86_64/data/cisco1.office.percona.com.err'.
130115 11:11:23 mysqld_safe Starting mysqld daemon with databases from /home/valerii.kravchuk/mysql-5.6.9-rc-linux-glibc2.5-x86_64/data
[valerii.kravchuk@cisco1 mysql-5.6.9-rc-linux-glibc2.5-x86_64]$ bin/mysql -uroot test
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.9-rc MySQL Community Server (GPL)
Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> UPDATE performance_schema.setup_consumers SET enabled = 'YES' WHERE name like 'events_waits%';
Query OK, 3 rows affected (0.00 sec)
Rows matched: 3 Changed: 3 Warnings: 0
mysql> exit
Bye
[valerii.kravchuk@cisco1 mysql-5.6.9-rc-linux-glibc2.5-x86_64]$ bin/mysqlslap -uroot --concurrency=12 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
Average number of seconds to run all queries: 7.640 seconds
Minimum number of seconds to run all queries: 6.874 seconds
Maximum number of seconds to run all queries: 7.930 seconds
Number of clients running queries: 12
Average number of queries per client: 83
[valerii.kravchuk@cisco1 mysql-5.6.9-rc-linux-glibc2.5-x86_64]$ bin/mysql -uroot test
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 123
Server version: 5.6.9-rc MySQL Community Server (GPL)
Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select event_name, count_star, sum_timer_wait from performance_schema.events_waits_summary_global_by_event_name where event_name like 'wait/synch/mutex/innodb%' and count_star > 0 order by sum_timer_wait desc limit 5;
+--------------------------------------------+------------+----------------+
| event_name | count_star | sum_timer_wait |
+--------------------------------------------+------------+----------------+
| wait/synch/mutex/innodb/os_mutex | 244191 | 32147377962 |
| wait/synch/mutex/innodb/trx_sys_mutex | 30180 | 4228391496 |
| wait/synch/mutex/innodb/buf_pool_mutex | 4812 | 757141973 |
| wait/synch/mutex/innodb/mutex_list_mutex | 8870 | 380601032 |
| wait/synch/mutex/innodb/rw_lock_list_mutex | 8304 | 363065843 |
+--------------------------------------------+------------+----------------+
5 rows in set (0.00 sec)
Then I decided to write this blog post with a quick "howto" summary on top. I hope it is useful for readers. This is the end of story for now.
Conclusions
- It's great to have Oracle engineers reading your Facebook posts and replying (especially in a real time). Thank you, Mark Leith!
- PERFORMANCE_SCHEMA in MySQL 5.6 can be used to study InnoDB mutex waits
- MySQL 5.,6 manual is not clear enough about steps and details about instruments involved to do the study explained above.
But I wonder, what kind of conclusion
ReplyDeleteyou can make from
wait/synch/mutex/innodb/os_mutex
mutes is the most problematic?
No conclusion at the moment :) Was too busy getting at least something from P_S.
ReplyDeleteThx for showing getting started with P_S is a little complicated. Beside showing already some quirks you missed performance_schema is ON by default (changed from 5.5 to 5.6). So you don't have to set it to enable P_S :)
ReplyDeleteThey wan't us to read a lot of docs ;)
thx
erkules
Thank you, Erkules, for this correction. I just remember that we have to set it ON explicitly from MySQL 5.5 where I tried to use it last time. Even more surprise then that some (important) instruments still have to be enabled upon startup.
ReplyDeleteVadim, os_mutex is surely something that scan be ignored and made less important with tuning. In any case, for the purpose and topic of this blog post details of benchmark or mistakes in investigation do not matter.
ReplyDeleteIn any case, http://bugs.mysql.com/bug.php?id=68079 is now verified and, based on comments, accepted by Oracle as a valid and important use case to study.
I have tried Performance Schema in MySQL 5.6, and met lots of problems too. Thanks for sharing.
ReplyDeleteBesides, in this test, i think os_mutex is meaningless, we can't get some conclusion from this mutex waits.
Here is my bug report (already verified) about missing details on when and how important instruments can be enabled: http://bugs.mysql.com/bug.php?id=68097
ReplyDelete