Wednesday, February 8, 2017

More on MyRocks Performance for Bug #68079 Case

My previous post on MyRocks was intended to provide some background details for a couple of slides for my FOSDEM talk on profiling MySQL. The results and performance demonstrated by MyRocks vs InnoDB from MySQL 5.7.17 were not really important to show how perf helps to understand where the time was spent while executing of one specific query vs the other (with the same results, but different plan), but they still caused a lot of comments from people who care, so I decided to double check and clarify few more details.

First of all, it was noted that one may get the same execution plan for both queries (even in INNER JOIN case we may end up with filesort and access only via PRIMARY keys), like this:
# bin/mysql test -e 'explain select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category\G'
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: task
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 96
ref: NULL
rows: 8092
Extra: Using index; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: incident
type: eq_ref
possible_keys: PRIMARY,incident_category
key: PRIMARY
key_len: 96
ref: test.task.sys_id
rows: 1
Extra: NULL
Indeed, I've ended up with this plan easily, sometimes, after executing ANALYZE for the tables involved. But this was not always the case (here we see for InnoDB) and next ANALYZE or OPTIMIZE may change the plan:

mysql> analyze table task;
+-----------+---------+----------+----------+
| Table     | Op      | Msg_type | Msg_text |
+-----------+---------+----------+----------+
| test.task | analyze | status   | OK       |
+-----------+---------+----------+----------+
1 row in set (0.06 sec)

mysql> analyze table incident;
+---------------+---------+----------+----------+
| Table         | Op      | Msg_type | Msg_text |
+---------------+---------+----------+----------+
| test.incident | analyze | status   | OK       |
+---------------+---------+----------+----------+
1 row in set (0.15 sec)

mysql> show table status like 'task'\G
*************************** 1. row ***************************
           Name: task
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 8452
 Avg_row_length: 60
    Data_length: 507904
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2017-01-31 12:00:38
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

mysql> show table status like 'incident'\G
*************************** 1. row ***************************
           Name: incident
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 7786
 Avg_row_length: 204
    Data_length: 1589248
Max_data_length: 0
   Index_length: 507904
      Data_free: 4194304
 Auto_increment: NULL
    Create_time: 2017-01-31 12:00:35
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

mysql> explain select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category;
+----+-------------+----------+------------+--------+---------------------------+-------------------+---------+----------------------+------+----------+-------------+
| id | select_type | table    | partitions | type   | possible_keys             | key               | key_len | ref                  | rows | filtered | Extra       |
+----+-------------+----------+------------+--------+---------------------------+-------------------+---------+----------------------+------+----------+-------------+
|  1 | SIMPLE      | incident | NULL       | index  | PRIMARY,incident_category | incident_category | 123     | NULL                 | 7786 |   100.00 | Using index |
|  1 | SIMPLE      | task     | NULL       | eq_ref | PRIMARY                   | PRIMARY           | 96      | test.incident.sys_id |    1 |   100.00 | Using index |
+----+-------------+----------+------------+--------+---------------------------+-------------------+---------+----------------------+------+----------+-------------+
2 rows in set, 1 warning (0.02 sec)
Surely I can always force the plan needed, this way or that:

mysql> explain select count(*), category from task inner join incident force index(primary) on task.sys_id=incident.sys_id group by incident.category;
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
| id | select_type | table    | partitions | type   | possible_keys             | key     | key_len | ref              | rows | filtered | Extra                                        |
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | task     | NULL       | index  | PRIMARY                   | PRIMARY | 96      | NULL             | 8452 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | incident | NULL       | eq_ref | PRIMARY,incident_category | PRIMARY | 96      | test.task.sys_id |    1 |   100.00 | NULL                                         |
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
mysql> explain select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category;
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
| id | select_type | table    | partitions | type   | possible_keys             | key     | key_len | ref              | rows | filtered | Extra                                        |
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | task     | NULL       | index  | PRIMARY                   | PRIMARY | 96      | NULL             | 8452 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | incident | NULL       | eq_ref | PRIMARY,incident_category | PRIMARY | 96      | test.task.sys_id |    1 |   100.00 | NULL                                         |
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
I was interested more in comparing scalability and performance of these different ways to execute the same query, just because usually it's considered good to have covering secondary index used instead of creating temporary table and filesort for this kind of queries... The reality shows this is not always the case.

I've got suggestion to maybe tune something for MyRocks case or use jemalloc library. I am open to any tuning suggestions from real experts based on the fact that these are default settings in my build:

[openxs@fc23 fb56]$ bin/mysql -uroot -e"show variables like 'rocks%'"
+-------------------------------------------------+----------------------+
| Variable_name                                   | Value                |
+-------------------------------------------------+----------------------+
| rocksdb_access_hint_on_compaction_start         | 1                    |
| rocksdb_advise_random_on_open                   | ON                   |
| rocksdb_allow_concurrent_memtable_write         | OFF                  |
| rocksdb_allow_mmap_reads                        | OFF                  |
| rocksdb_allow_mmap_writes                       | OFF                  |
| rocksdb_background_sync                         | OFF                  |
| rocksdb_base_background_compactions             | 1                    |
| rocksdb_block_cache_size                        | 536870912            || rocksdb_block_restart_interval                  | 16                   |
| rocksdb_block_size                              | 4096                 || rocksdb_block_size_deviation                    | 10                   |
| rocksdb_bulk_load                               | OFF                  |
| rocksdb_bulk_load_size                          | 1000                 |
| rocksdb_bytes_per_sync                          | 0                    |
| rocksdb_cache_index_and_filter_blocks           | ON                   |
| rocksdb_checksums_pct                           | 100                  |
| rocksdb_collect_sst_properties                  | ON                   |
| rocksdb_commit_in_the_middle                    | OFF                  |
| rocksdb_compact_cf                              |                      |
| rocksdb_compaction_readahead_size               | 0                    |
| rocksdb_compaction_sequential_deletes           | 0                    |
| rocksdb_compaction_sequential_deletes_count_sd  | OFF                  |
| rocksdb_compaction_sequential_deletes_file_size | 0                    |
| rocksdb_compaction_sequential_deletes_window    | 0                    |
| rocksdb_create_checkpoint                       |                      |
| rocksdb_create_if_missing                       | ON                   |
| rocksdb_create_missing_column_families          | OFF                  |
| rocksdb_datadir                                 | ./.rocksdb           |
| rocksdb_db_write_buffer_size                    | 0                    |
| rocksdb_deadlock_detect                         | OFF                  |
| rocksdb_debug_optimizer_no_zero_cardinality     | ON                   |
| rocksdb_default_cf_options                      |                      |
| rocksdb_delete_obsolete_files_period_micros     | 21600000000          |
| rocksdb_disabledatasync                         | OFF                  |
| rocksdb_enable_2pc                              | ON                   |
| rocksdb_enable_bulk_load_api                    | ON                   |
| rocksdb_enable_thread_tracking                  | OFF                  |
| rocksdb_enable_write_thread_adaptive_yield      | OFF                  |
| rocksdb_error_if_exists                         | OFF                  |
| rocksdb_flush_memtable_on_analyze               | ON                   |
| rocksdb_force_flush_memtable_now                | OFF                  |
| rocksdb_force_index_records_in_range            | 0                    |
| rocksdb_hash_index_allow_collision              | ON                   |
| rocksdb_index_type                              | kBinarySearch        |
| rocksdb_info_log_level                          | error_level          |
| rocksdb_is_fd_close_on_exec                     | ON                   |
| rocksdb_keep_log_file_num                       | 1000                 |
| rocksdb_lock_scanned_rows                       | OFF                  |
| rocksdb_lock_wait_timeout                       | 1                    |
| rocksdb_log_file_time_to_roll                   | 0                    |
| rocksdb_manifest_preallocation_size             | 4194304              |
| rocksdb_max_background_compactions              | 1                    |
| rocksdb_max_background_flushes                  | 1                    |
| rocksdb_max_log_file_size                       | 0                    |
| rocksdb_max_manifest_file_size                  | 18446744073709551615 |
| rocksdb_max_open_files                          | -1                   |
| rocksdb_max_row_locks                           | 1073741824           |
| rocksdb_max_subcompactions                      | 1                    |
| rocksdb_max_total_wal_size                      | 0                    |
| rocksdb_merge_buf_size                          | 67108864             |
| rocksdb_merge_combine_read_size                 | 1073741824           |
| rocksdb_new_table_reader_for_compaction_inputs  | OFF                  |
| rocksdb_no_block_cache                          | OFF                  |
| rocksdb_override_cf_options                     |                      |
| rocksdb_paranoid_checks                         | ON                   |
| rocksdb_pause_background_work                   | OFF                  |
| rocksdb_perf_context_level                      | 0                    |
| rocksdb_persistent_cache_path                   |                      |
| rocksdb_persistent_cache_size                   | 0                    |
| rocksdb_pin_l0_filter_and_index_blocks_in_cache | ON                   |
| rocksdb_print_snapshot_conflict_queries         | OFF                  |
| rocksdb_rate_limiter_bytes_per_sec              | 0                    |
| rocksdb_read_free_rpl_tables                    |                      |
| rocksdb_records_in_range                        | 0                    |
| rocksdb_seconds_between_stat_computes           | 3600                 |
| rocksdb_signal_drop_index_thread                | OFF                  |
| rocksdb_skip_bloom_filter_on_read               | OFF                  |
| rocksdb_skip_fill_cache                         | OFF                  |
| rocksdb_skip_unique_check_tables                | .*                   |
| rocksdb_stats_dump_period_sec                   | 600                  |
| rocksdb_store_row_debug_checksums               | OFF                  |
| rocksdb_strict_collation_check                  | ON                   |
| rocksdb_strict_collation_exceptions             |                      |
| rocksdb_table_cache_numshardbits                | 6                    |
| rocksdb_table_stats_sampling_pct                | 10                   |
| rocksdb_tmpdir                                  |                      |
| rocksdb_trace_sst_api                           | OFF                  |
| rocksdb_unsafe_for_binlog                       | OFF                  |
| rocksdb_use_adaptive_mutex                      | OFF                  |
| rocksdb_use_direct_reads                        | OFF                  |
| rocksdb_use_direct_writes                       | OFF                  |
| rocksdb_use_fsync                               | OFF                  |
| rocksdb_validate_tables                         | 1                    |
| rocksdb_verify_row_debug_checksums              | OFF                  |
| rocksdb_wal_bytes_per_sync                      | 0                    |
| rocksdb_wal_dir                                 |                      |
| rocksdb_wal_recovery_mode                       | 2                    |
| rocksdb_wal_size_limit_mb                       | 0                    |
| rocksdb_wal_ttl_seconds                         | 0                    |
| rocksdb_whole_key_filtering                     | ON                   |
| rocksdb_write_disable_wal                       | OFF                  |
| rocksdb_write_ignore_missing_column_families    | OFF                  |
| rocksdb_write_sync                              | OFF                  |
+-------------------------------------------------+----------------------+
and hardware details of my QuadCore box with HDD were shared previously. I tend to think no tuning is needed for such a small tables and corner read-only use case (if maybe READ COMMITTED isolation level, that I plan to study separately, as initial attempts had not shown much difference).

Just for completeness, this is the minimal configuration file I've used and the commit I've built from (after creating a new clone yesterday, this time the default branch was fb-mysql-5.6.35):
[openxs@fc23 fb56]$ cat ~/fb56.cnf
[mysqld]
rocksdb
default-storage-engine=rocksdb
skip-innodb
default-tmp-storage-engine=MyISAM

log-bin
binlog-format=ROW

#transaction-isolation=READ-COMMITTED
#rocksdb_max_open_files=-1
#rocksdb_block_cache_size=128M

[openxs@fc23 mysql-5.6]$ git log -1commit 95dd650dcfb07b91971b20cbac82d61f244a05a9
Author: Gunnar Kudrjavets <gunnarku@fb.com>
Date:   Fri Feb 3 10:07:25 2017 -0800
...
You can guess that I tried to add some more options along the lines of the fine manual, but they had not changes things much to better, so I've commented them out.

So, with this fresh build based on MySQL 5.6.35 I've decided to study the influence of jemalloc for both queries with 4, 8 and 16 threads (most interesting settings for the QuadCore). I've used the --malloc-lib option for mysqld_safe to force use of jemalloc, and checked pmap ouput for mysqld process for jemalloc entries to confirm that it was really preloaded:



[openxs@fc23 fb56]$ ps aux | grep mysqld
openxs   20444  0.0  0.0 119164  3264 pts/0    S    11:49   0:00 /bin/sh bin/mysqld_safe --defaults-file=/home/openxs/fb56.cnf --malloc-lib=/usr/lib64/libjemalloc.so
openxs   20630  0.8  0.2 170704 24284 pts/0    Sl   11:49   0:00 /home/openxs/dbs/fb56/bin/mysqld --defaults-file=/home/openxs/fb56.cnf --basedir=/home/openxs/dbs/fb56 --datadir=/home/openxs/dbs/fb56/data --plugin-dir=/home/openxs/dbs/fb56/lib/plugin --log-error=/home/openxs/dbs/fb56/data/fc23.err --pid-file=/home/openxs/dbs/fb56/data/fc23.pid
openxs   20750  0.0  0.0 118520   904 pts/0    S+   11:49   0:00 grep --color=auto mysqld

[openxs@fc23 fb56]$ sudo pmap 20630 | grep jemalloc
00007fb8637db000    268K r-x-- libjemalloc.so.2
00007fb86381e000   2044K ----- libjemalloc.so.2
00007fb863a1d000     12K r---- libjemalloc.so.2
00007fb863a20000      4K rw--- libjemalloc.so.2
So, here are the raw results of mysqlslap runs with jemalloc preloaded after I made sure queries use different plans (STRAIGHT_JOIN uses PRIMARY keys and filesort, while INNER JOIN uses covering secondary index and avoids filesort):

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 19.695 seconds
        Minimum number of seconds to run all queries: 19.627 seconds
        Maximum number of seconds to run all queries: 19.769 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 19.332 seconds
        Minimum number of seconds to run all queries: 19.306 seconds
        Maximum number of seconds to run all queries: 19.349 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 19.276 seconds
        Minimum number of seconds to run all queries: 19.225 seconds
        Maximum number of seconds to run all queries: 19.309 seconds
        Number of clients running queries: 16
        Average number of queries per client: 62

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 18.553 seconds
        Minimum number of seconds to run all queries: 18.336 seconds
        Maximum number of seconds to run all queries: 18.748 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 18.227 seconds
        Minimum number of seconds to run all queries: 18.198 seconds
        Maximum number of seconds to run all queries: 18.269 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 17.941 seconds
        Minimum number of seconds to run all queries: 17.886 seconds
        Maximum number of seconds to run all queries: 18.000 seconds
        Number of clients running queries: 16
        Average number of queries per client: 62


Now, the same but with default memory allocator picked up (no jemalloc):

[openxs@fc23 fb56]$ ps aux | grep mysqld
openxs   27696  0.0  0.0 119164  3372 pts/0    S    13:50   0:00 /bin/sh bin/mysqld_safe --defaults-file=/home/openxs/fb56.cnf
openxs   27834  0.3  0.2 607012 23096 pts/0    Sl   13:50   0:00 ./bin/mysqld --defaults-file=/home/openxs/fb56.cnf --basedir=. --datadir=./data --plugin-dir=./lib/plugin --log-error=./data/fc23.err --pid-file=./data/fc23.pid
openxs   28029  0.0  0.0 118520   880 pts/0    S+   13:50   0:00 grep --color=auto mysqld
[openxs@fc23 fb56]$ pmap 27834 | grep jemalloc
[openxs@fc23 fb56]$

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 20.207 seconds
        Minimum number of seconds to run all queries: 19.833 seconds
        Maximum number of seconds to run all queries: 20.543 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 19.746 seconds
        Minimum number of seconds to run all queries: 19.500 seconds
        Maximum number of seconds to run all queries: 20.402 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 19.150 seconds
        Minimum number of seconds to run all queries: 19.084 seconds
        Maximum number of seconds to run all queries: 19.292 seconds
        Number of clients running queries: 16
        Average number of queries per client: 62

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 18.692 seconds
        Minimum number of seconds to run all queries: 18.516 seconds
        Maximum number of seconds to run all queries: 18.889 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 18.290 seconds
        Minimum number of seconds to run all queries: 18.250 seconds
        Maximum number of seconds to run all queries: 18.342 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 18.155 seconds
        Minimum number of seconds to run all queries: 18.066 seconds
        Maximum number of seconds to run all queries: 18.397 seconds
        Number of clients running queries: 16
        Average number of queries per client: 62


Finally, this is how it looks like on a chart:


I can make the following conclusions:
  1. Using jemalloc helps to get a bit better throughput for both queries (I tend to consider the INNER case with 16 threads a fluctuation for now), but not much.
  2. New build based on fb-mysql-5.6.35 branch as of yesterday's morning demonstrates worse performance for STRAIGHT_JOIN case (access via PRIMARY keys + filesort) comparing to the previous build from January 30, 2017 based on webscalesql-5.6.27.75 branch. I am yet to find out why is it so, as INNER JOIN case performs a bit better.
  3. I am open to any MyRocks runing suggestions for my old QuadCore box, as for now MySQL 5.7.17 performs notably better on this same hardware and data in the tables, with --no-defaults. No wonder, a lot of efforts were spent by Oracle on Bug #68079, while all I've got so far for MyRocks is public "Can't repeat" of a kind, plus simple tuning advices with limited impact for my case.
I am sure I am doing something wrong with MyRocks, just tell me what exactly...

7 comments:

  1. I've tested with the following commit today:

    [openxs@fc23 mysql-5.6]$ git log -1
    commit 04a32957d785e4546535c754cef9f9768e5c9d0f
    Author: Anirban Rahut
    Date: Wed Feb 8 11:09:31 2017 -0800
    ...

    and the results are way better:

    [openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
    Benchmark
    Average number of seconds to run all queries: 16.015 seconds
    Minimum number of seconds to run all queries: 15.923 seconds
    Maximum number of seconds to run all queries: 16.222 seconds
    Number of clients running queries: 16
    Average number of queries per client: 62

    [openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
    Benchmark
    Average number of seconds to run all queries: 16.191 seconds
    Minimum number of seconds to run all queries: 16.047 seconds
    Maximum number of seconds to run all queries: 16.389 seconds
    Number of clients running queries: 8
    Average number of queries per client: 125

    [openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
    Benchmark
    Average number of seconds to run all queries: 16.213 seconds
    Minimum number of seconds to run all queries: 16.054 seconds
    Maximum number of seconds to run all queries: 16.741 seconds
    Number of clients running queries: 4
    Average number of queries per client: 250

    [openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
    Benchmark
    Average number of seconds to run all queries: 16.760 seconds
    Minimum number of seconds to run all queries: 16.689 seconds
    Maximum number of seconds to run all queries: 16.952 seconds
    Number of clients running queries: 16
    Average number of queries per client: 62

    [openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
    Benchmark
    Average number of seconds to run all queries: 16.937 seconds
    Minimum number of seconds to run all queries: 16.814 seconds
    Maximum number of seconds to run all queries: 17.044 seconds
    Number of clients running queries: 8
    Average number of queries per client: 125

    [openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
    Benchmark
    Average number of seconds to run all queries: 16.923 seconds
    Minimum number of seconds to run all queries: 16.727 seconds
    Maximum number of seconds to run all queries: 17.514 seconds
    Number of clients running queries: 4
    Average number of queries per client: 250

    I've checked with READ COMMITTED isolation level also, and got no positive impact at all.

    ReplyDelete
  2. The benefit from jemalloc is less fragmentation (mysqld RSS is smaller) and less mutex contention. The mutex contention benefit is probably smaller on your 4-core box than on the larger servers that I use. To see the benefit from a smaller RSS you should run a workload that will cause churn in the block cache. If this test doesn't do that because all data fits in the block cache then you won't see it. Example blog posts:
    http://smalldatum.blogspot.com/2015/10/myrocks-versus-allocators-glibc.html
    http://smalldatum.blogspot.com/2015/10/myrocks-versus-allocators-glibc.html
    http://smalldatum.blogspot.com/2015/06/insert-benchmark-for-mongodb-memory.html

    READ COMMITTED is a win for MyRocks for a read-write workload with some data contention. This workload is read only.

    Until a few days ago, MyRocks was stuck on a RocksDB revision that had a perf bug known to the RocksDB team. But the MyRocks team didn't know about it. I lost time rediscovering that bug, perhaps you did too. Sorry.

    ReplyDelete
    Replies
    1. Good that I had not reported a regression bug yesterday, I was almost ready to do that...

      Delete
  3. Lost the first part of my comment. Can you summarize the problem here? I appreciate the details, but now I am lost. You reported that MyRocks runs the query in ~16 seconds with a recent build. Is the problem that InnoDB runs it much faster?

    ReplyDelete
    Replies
    1. Yes, the problem now is mostly a notable performance difference comparing to InnoDB from MySQL 5.7.17:

      [openxs@fc23 5.7]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
      Benchmark
      Average number of seconds to run all queries: 12.120 seconds
      Minimum number of seconds to run all queries: 11.968 seconds
      Maximum number of seconds to run all queries: 12.346 seconds
      Number of clients running queries: 4
      Average number of queries per client: 250

      [openxs@fc23 5.7]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
      Benchmark
      Average number of seconds to run all queries: 12.025 seconds
      Minimum number of seconds to run all queries: 11.961 seconds
      Maximum number of seconds to run all queries: 12.103 seconds
      Number of clients running queries: 8
      Average number of queries per client: 125

      [openxs@fc23 5.7]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
      Benchmark
      Average number of seconds to run all queries: 11.926 seconds
      Minimum number of seconds to run all queries: 11.875 seconds
      Maximum number of seconds to run all queries: 11.972 seconds
      Number of clients running queries: 16
      Average number of queries per client: 62

      For 4-16 threads on my 4 cores I see 11-12 seconds with InnoDB, more or less for both plans. If I got your comments right you were able to make MyRocks perform better than InnoDB on this use case, and I wonder how any why it could be so...

      Originally (for my FOSDEM talk) the problem was mostly to try to understand why on the same version the query that uses primary keys, temporary and filesort runs notably faster at high concurrency than the one using covering secondary index. In my today's test with recent MyRocks the difference is almost gone, so not a big deal any more...

      Delete
    2. tl;dr -- InnoDB is faster on my home server, MyRocks is faster on my work server. Work server has more & faster cores. Don't know why yet.

      Using scripts at https://github.com/mdcallag/mytools/tree/master/bench/b1 and two servers:
      * work - patched gcc-4.9, 48-cores, lots of RAM
      * home - ubuntu 16.04, gcc-5.4, 8gb RAM, core i3 5th gen -- 2 real cores, 4 with HT on, i3-5010U CPU @ 2.10GHz

      I used builds from Feb 9 and Feb 10, they differ by 1 commit...
      commit f3019b567aff85bbf923e04f3d7faf12b200fc33
      Author: Tian Xia
      Date: Thu Feb 9 11:58:35 2017 -0800

      I also have a few results from a patched MyRocks build on November 21.

      Results here -> https://gist.github.com/mdcallag/6aa16fce647cd8d93efea7f7d9aa10af

      Delete
    3. Note that I checked with 5.7.17 in case of InnoDB. Probably checking with InnoDB from 5.6.35 makes sense as well.

      Delete