Thursday, February 16, 2017

Fun with Bugs #48 - Group Replication Bugs and Missing Features as of MySQL 5.7.17

It seems recent post on Group Replication by Vadim caused an interesting discussion on Facebook. I am NOT going to continue it here, but decided to present some facts, specifically, list of public bug reports and feature requests for Group Replication (mostly "Verified", that is, accepted by Oracle engineers as valid) as of MySQL 5.7.17 (where the feature is promoted as GA), with just few comments to some of the bugs.

The goal is to double check this post when next Oracle MySQL 5.7.x release appears, to find out how much Oracle carews to fix the problems already identified by MySQL Community.

So, here are the bugs and feature requests in MySQL Server: Group Replication category (I've included only those reported for 5.7.17 into this post), as of today:
  • Bug #84315 - "add mysql group replication bootstrap option to init script". Fair enough, we have a way to boottrap a cluster for a node in all implementations of Galera cluster. This feature request is still "Open".
  • Bug #84329 - "Some Class B private address is not permitted automatically". Manual seems to say it should be.
  • Bug #84337 - "Configure mgr with out root user encounter an error". Seems to be a duplicate of Bug #82687, that is, known for quite a some time before GA.
  • Bug #84367 - "START GROUP_REPLICATION should be possible without error". You get error if it was already started. Not a big deal, but some optional clause like IF NOT STARTED may help (in a similar way to DROP TABLE ... IF EXISTS) to code scripts. Still "Open".
  • Bug #84710 - "group replication does not respect localhost IP". may be a problem for some MySQL Sandbox setups in a hope to test group replication.
  • Bug #84727 - "GR: Partitioned Node Should Get Updated Status and not accept writes". Writes on partitioned node are accepted and hang (forever?). As a side note, I think Kenny Gryp deserves a special recognition as an early adopter of Group Replication feature who cared to report many problems noted in the process.
  • Bug #84728 - "Not Possible To Avoid MySQL From Starting When GR Cannot Start". We need this fixed to avoid split brain situations by default.
  • Bug #84729 - "Group Replication: Block Reads On Partitioned Nodes". In Galera reads are blocked by default when node is not considered a member of cluster.
  • Bug #84730 - "Feature: Troubleshoot Group Replication Transaction Rollbacks". You can get a lot of information about conflicts in case of Galera. Sorry that I have to compare, but when discussing the ways of dealing with common problems in cluster environments related to MySQL one can not ignore existing solutions (NDB clusters and Galera clusters), so I just picked up a (somewhat similar) technology that I know and used (a bit). I think readers will do the same, try to base their conclusions on known examples.
  • Bug #84731 - "Group Replication: mysql client connections hang during group replication start". There is no reason to hang more than needed. We should just make sure reads and writes are NOT accepted until the node is a member of cluster and in sync.
  • Bug #84733 - "Cannot Start GR with super_read_only=1". But this setting may be needed to make sure that there is only one master node in cluster, no matter what happens ot them...
  • Bug #84773 - "Flow control slows down throughput while a node join the cluster". Let me quote: "This is done this way by design, that is, request group to slow down while the member is fetching and applying data.
  • Bug #84774 - "Performance drop every 60 seconds". Now this sounds like a performance problem to work on, maybe by adding some tuning options.
  • Bug #84784 - "Group Replication nodes do not rejoin cluster after network connectivity issues". It would be really nice for nodes to try to re-join the cluster in case of short term connectivity issues. Galera nodes do not give up that fast. The bug is still not verified.
  • Bug #84785  - "Prevent Large Transactions in Group Replication". Galera somehow allows to limit transaction size. Not that there were no related bugs, but still options exist.
  • Bug #84792 - "Idle GR Cluster: Member CPU hog". Not yet verified, but it seems in some cases node can use a notable share of CPU time for no clear/good reason.
  • Bug #84794 - "Cannot kill query inside GR". Weel, you can do STOP GROUP_REPLICATION, but then it can be dangerous, because...
  • Bug #84795 - "STOP GROUP_REPLICATION sets super_read_only=off" - the node with stopped replication may allow to change the data...
  • Bug #84798 - "Group Replication can use some verbosity in the error log". Galera cluster nodes are too verbose, one gets kilometers of log records about everything, anything and nothing. Still, better to skip some usual outputs in the logs than get no evidence at all on what was going on...
  • Bug #84900 - "Getting inconsistent result on different nodes". Now, this is really unfortunate (for a "cluster") and somewhat similar problem was reported by Vadim before, see Bug #82481 and was supposed to be fixed. Anyway, the inconsistency is repatable and looks somewhat scary.
  • Bug #84901 - "Can't execute transaction on second node". Best practice is to write on one and only one node (see Bug #83218). An attempt to write on the other node may fail...
Now, make your own conclusions about the maturity of Group Replication in MySQL 5.7.17. I manage to avoid it so far, as I try to keep myself as far from any kinds of "clusters" as possible... Had not worked well with Galera, unfortunately - I have to read its verbose logs on a daily basis.

Wednesday, February 8, 2017

More on MyRocks Performance for Bug #68079 Case

My previous post on MyRocks was intended to provide some background details for a couple of slides for my FOSDEM talk on profiling MySQL. The results and performance demonstrated by MyRocks vs InnoDB from MySQL 5.7.17 were not really important to show how perf helps to understand where the time was spent while executing of one specific query vs the other (with the same results, but different plan), but they still caused a lot of comments from people who care, so I decided to double check and clarify few more details.

First of all, it was noted that one may get the same execution plan for both queries (even in INNER JOIN case we may end up with filesort and access only via PRIMARY keys), like this:
# bin/mysql test -e 'explain select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category\G'
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: task
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 96
ref: NULL
rows: 8092
Extra: Using index; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: incident
type: eq_ref
possible_keys: PRIMARY,incident_category
key: PRIMARY
key_len: 96
ref: test.task.sys_id
rows: 1
Extra: NULL
Indeed, I've ended up with this plan easily, sometimes, after executing ANALYZE for the tables involved. But this was not always the case (here we see for InnoDB) and next ANALYZE or OPTIMIZE may change the plan:

mysql> analyze table task;
+-----------+---------+----------+----------+
| Table     | Op      | Msg_type | Msg_text |
+-----------+---------+----------+----------+
| test.task | analyze | status   | OK       |
+-----------+---------+----------+----------+
1 row in set (0.06 sec)

mysql> analyze table incident;
+---------------+---------+----------+----------+
| Table         | Op      | Msg_type | Msg_text |
+---------------+---------+----------+----------+
| test.incident | analyze | status   | OK       |
+---------------+---------+----------+----------+
1 row in set (0.15 sec)

mysql> show table status like 'task'\G
*************************** 1. row ***************************
           Name: task
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 8452
 Avg_row_length: 60
    Data_length: 507904
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2017-01-31 12:00:38
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

mysql> show table status like 'incident'\G
*************************** 1. row ***************************
           Name: incident
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 7786
 Avg_row_length: 204
    Data_length: 1589248
Max_data_length: 0
   Index_length: 507904
      Data_free: 4194304
 Auto_increment: NULL
    Create_time: 2017-01-31 12:00:35
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

mysql> explain select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category;
+----+-------------+----------+------------+--------+---------------------------+-------------------+---------+----------------------+------+----------+-------------+
| id | select_type | table    | partitions | type   | possible_keys             | key               | key_len | ref                  | rows | filtered | Extra       |
+----+-------------+----------+------------+--------+---------------------------+-------------------+---------+----------------------+------+----------+-------------+
|  1 | SIMPLE      | incident | NULL       | index  | PRIMARY,incident_category | incident_category | 123     | NULL                 | 7786 |   100.00 | Using index |
|  1 | SIMPLE      | task     | NULL       | eq_ref | PRIMARY                   | PRIMARY           | 96      | test.incident.sys_id |    1 |   100.00 | Using index |
+----+-------------+----------+------------+--------+---------------------------+-------------------+---------+----------------------+------+----------+-------------+
2 rows in set, 1 warning (0.02 sec)
Surely I can always force the plan needed, this way or that:

mysql> explain select count(*), category from task inner join incident force index(primary) on task.sys_id=incident.sys_id group by incident.category;
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
| id | select_type | table    | partitions | type   | possible_keys             | key     | key_len | ref              | rows | filtered | Extra                                        |
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | task     | NULL       | index  | PRIMARY                   | PRIMARY | 96      | NULL             | 8452 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | incident | NULL       | eq_ref | PRIMARY,incident_category | PRIMARY | 96      | test.task.sys_id |    1 |   100.00 | NULL                                         |
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
mysql> explain select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category;
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
| id | select_type | table    | partitions | type   | possible_keys             | key     | key_len | ref              | rows | filtered | Extra                                        |
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | task     | NULL       | index  | PRIMARY                   | PRIMARY | 96      | NULL             | 8452 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | incident | NULL       | eq_ref | PRIMARY,incident_category | PRIMARY | 96      | test.task.sys_id |    1 |   100.00 | NULL                                         |
+----+-------------+----------+------------+--------+---------------------------+---------+---------+------------------+------+----------+----------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
I was interested more in comparing scalability and performance of these different ways to execute the same query, just because usually it's considered good to have covering secondary index used instead of creating temporary table and filesort for this kind of queries... The reality shows this is not always the case.

I've got suggestion to maybe tune something for MyRocks case or use jemalloc library. I am open to any tuning suggestions from real experts based on the fact that these are default settings in my build:

[openxs@fc23 fb56]$ bin/mysql -uroot -e"show variables like 'rocks%'"
+-------------------------------------------------+----------------------+
| Variable_name                                   | Value                |
+-------------------------------------------------+----------------------+
| rocksdb_access_hint_on_compaction_start         | 1                    |
| rocksdb_advise_random_on_open                   | ON                   |
| rocksdb_allow_concurrent_memtable_write         | OFF                  |
| rocksdb_allow_mmap_reads                        | OFF                  |
| rocksdb_allow_mmap_writes                       | OFF                  |
| rocksdb_background_sync                         | OFF                  |
| rocksdb_base_background_compactions             | 1                    |
| rocksdb_block_cache_size                        | 536870912            || rocksdb_block_restart_interval                  | 16                   |
| rocksdb_block_size                              | 4096                 || rocksdb_block_size_deviation                    | 10                   |
| rocksdb_bulk_load                               | OFF                  |
| rocksdb_bulk_load_size                          | 1000                 |
| rocksdb_bytes_per_sync                          | 0                    |
| rocksdb_cache_index_and_filter_blocks           | ON                   |
| rocksdb_checksums_pct                           | 100                  |
| rocksdb_collect_sst_properties                  | ON                   |
| rocksdb_commit_in_the_middle                    | OFF                  |
| rocksdb_compact_cf                              |                      |
| rocksdb_compaction_readahead_size               | 0                    |
| rocksdb_compaction_sequential_deletes           | 0                    |
| rocksdb_compaction_sequential_deletes_count_sd  | OFF                  |
| rocksdb_compaction_sequential_deletes_file_size | 0                    |
| rocksdb_compaction_sequential_deletes_window    | 0                    |
| rocksdb_create_checkpoint                       |                      |
| rocksdb_create_if_missing                       | ON                   |
| rocksdb_create_missing_column_families          | OFF                  |
| rocksdb_datadir                                 | ./.rocksdb           |
| rocksdb_db_write_buffer_size                    | 0                    |
| rocksdb_deadlock_detect                         | OFF                  |
| rocksdb_debug_optimizer_no_zero_cardinality     | ON                   |
| rocksdb_default_cf_options                      |                      |
| rocksdb_delete_obsolete_files_period_micros     | 21600000000          |
| rocksdb_disabledatasync                         | OFF                  |
| rocksdb_enable_2pc                              | ON                   |
| rocksdb_enable_bulk_load_api                    | ON                   |
| rocksdb_enable_thread_tracking                  | OFF                  |
| rocksdb_enable_write_thread_adaptive_yield      | OFF                  |
| rocksdb_error_if_exists                         | OFF                  |
| rocksdb_flush_memtable_on_analyze               | ON                   |
| rocksdb_force_flush_memtable_now                | OFF                  |
| rocksdb_force_index_records_in_range            | 0                    |
| rocksdb_hash_index_allow_collision              | ON                   |
| rocksdb_index_type                              | kBinarySearch        |
| rocksdb_info_log_level                          | error_level          |
| rocksdb_is_fd_close_on_exec                     | ON                   |
| rocksdb_keep_log_file_num                       | 1000                 |
| rocksdb_lock_scanned_rows                       | OFF                  |
| rocksdb_lock_wait_timeout                       | 1                    |
| rocksdb_log_file_time_to_roll                   | 0                    |
| rocksdb_manifest_preallocation_size             | 4194304              |
| rocksdb_max_background_compactions              | 1                    |
| rocksdb_max_background_flushes                  | 1                    |
| rocksdb_max_log_file_size                       | 0                    |
| rocksdb_max_manifest_file_size                  | 18446744073709551615 |
| rocksdb_max_open_files                          | -1                   |
| rocksdb_max_row_locks                           | 1073741824           |
| rocksdb_max_subcompactions                      | 1                    |
| rocksdb_max_total_wal_size                      | 0                    |
| rocksdb_merge_buf_size                          | 67108864             |
| rocksdb_merge_combine_read_size                 | 1073741824           |
| rocksdb_new_table_reader_for_compaction_inputs  | OFF                  |
| rocksdb_no_block_cache                          | OFF                  |
| rocksdb_override_cf_options                     |                      |
| rocksdb_paranoid_checks                         | ON                   |
| rocksdb_pause_background_work                   | OFF                  |
| rocksdb_perf_context_level                      | 0                    |
| rocksdb_persistent_cache_path                   |                      |
| rocksdb_persistent_cache_size                   | 0                    |
| rocksdb_pin_l0_filter_and_index_blocks_in_cache | ON                   |
| rocksdb_print_snapshot_conflict_queries         | OFF                  |
| rocksdb_rate_limiter_bytes_per_sec              | 0                    |
| rocksdb_read_free_rpl_tables                    |                      |
| rocksdb_records_in_range                        | 0                    |
| rocksdb_seconds_between_stat_computes           | 3600                 |
| rocksdb_signal_drop_index_thread                | OFF                  |
| rocksdb_skip_bloom_filter_on_read               | OFF                  |
| rocksdb_skip_fill_cache                         | OFF                  |
| rocksdb_skip_unique_check_tables                | .*                   |
| rocksdb_stats_dump_period_sec                   | 600                  |
| rocksdb_store_row_debug_checksums               | OFF                  |
| rocksdb_strict_collation_check                  | ON                   |
| rocksdb_strict_collation_exceptions             |                      |
| rocksdb_table_cache_numshardbits                | 6                    |
| rocksdb_table_stats_sampling_pct                | 10                   |
| rocksdb_tmpdir                                  |                      |
| rocksdb_trace_sst_api                           | OFF                  |
| rocksdb_unsafe_for_binlog                       | OFF                  |
| rocksdb_use_adaptive_mutex                      | OFF                  |
| rocksdb_use_direct_reads                        | OFF                  |
| rocksdb_use_direct_writes                       | OFF                  |
| rocksdb_use_fsync                               | OFF                  |
| rocksdb_validate_tables                         | 1                    |
| rocksdb_verify_row_debug_checksums              | OFF                  |
| rocksdb_wal_bytes_per_sync                      | 0                    |
| rocksdb_wal_dir                                 |                      |
| rocksdb_wal_recovery_mode                       | 2                    |
| rocksdb_wal_size_limit_mb                       | 0                    |
| rocksdb_wal_ttl_seconds                         | 0                    |
| rocksdb_whole_key_filtering                     | ON                   |
| rocksdb_write_disable_wal                       | OFF                  |
| rocksdb_write_ignore_missing_column_families    | OFF                  |
| rocksdb_write_sync                              | OFF                  |
+-------------------------------------------------+----------------------+
and hardware details of my QuadCore box with HDD were shared previously. I tend to think no tuning is needed for such a small tables and corner read-only use case (if maybe READ COMMITTED isolation level, that I plan to study separately, as initial attempts had not shown much difference).

Just for completeness, this is the minimal configuration file I've used and the commit I've built from (after creating a new clone yesterday, this time the default branch was fb-mysql-5.6.35):
[openxs@fc23 fb56]$ cat ~/fb56.cnf
[mysqld]
rocksdb
default-storage-engine=rocksdb
skip-innodb
default-tmp-storage-engine=MyISAM

log-bin
binlog-format=ROW

#transaction-isolation=READ-COMMITTED
#rocksdb_max_open_files=-1
#rocksdb_block_cache_size=128M

[openxs@fc23 mysql-5.6]$ git log -1commit 95dd650dcfb07b91971b20cbac82d61f244a05a9
Author: Gunnar Kudrjavets <gunnarku@fb.com>
Date:   Fri Feb 3 10:07:25 2017 -0800
...
You can guess that I tried to add some more options along the lines of the fine manual, but they had not changes things much to better, so I've commented them out.

So, with this fresh build based on MySQL 5.6.35 I've decided to study the influence of jemalloc for both queries with 4, 8 and 16 threads (most interesting settings for the QuadCore). I've used the --malloc-lib option for mysqld_safe to force use of jemalloc, and checked pmap ouput for mysqld process for jemalloc entries to confirm that it was really preloaded:



[openxs@fc23 fb56]$ ps aux | grep mysqld
openxs   20444  0.0  0.0 119164  3264 pts/0    S    11:49   0:00 /bin/sh bin/mysqld_safe --defaults-file=/home/openxs/fb56.cnf --malloc-lib=/usr/lib64/libjemalloc.so
openxs   20630  0.8  0.2 170704 24284 pts/0    Sl   11:49   0:00 /home/openxs/dbs/fb56/bin/mysqld --defaults-file=/home/openxs/fb56.cnf --basedir=/home/openxs/dbs/fb56 --datadir=/home/openxs/dbs/fb56/data --plugin-dir=/home/openxs/dbs/fb56/lib/plugin --log-error=/home/openxs/dbs/fb56/data/fc23.err --pid-file=/home/openxs/dbs/fb56/data/fc23.pid
openxs   20750  0.0  0.0 118520   904 pts/0    S+   11:49   0:00 grep --color=auto mysqld

[openxs@fc23 fb56]$ sudo pmap 20630 | grep jemalloc
00007fb8637db000    268K r-x-- libjemalloc.so.2
00007fb86381e000   2044K ----- libjemalloc.so.2
00007fb863a1d000     12K r---- libjemalloc.so.2
00007fb863a20000      4K rw--- libjemalloc.so.2
So, here are the raw results of mysqlslap runs with jemalloc preloaded after I made sure queries use different plans (STRAIGHT_JOIN uses PRIMARY keys and filesort, while INNER JOIN uses covering secondary index and avoids filesort):

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 19.695 seconds
        Minimum number of seconds to run all queries: 19.627 seconds
        Maximum number of seconds to run all queries: 19.769 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 19.332 seconds
        Minimum number of seconds to run all queries: 19.306 seconds
        Maximum number of seconds to run all queries: 19.349 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 19.276 seconds
        Minimum number of seconds to run all queries: 19.225 seconds
        Maximum number of seconds to run all queries: 19.309 seconds
        Number of clients running queries: 16
        Average number of queries per client: 62

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 18.553 seconds
        Minimum number of seconds to run all queries: 18.336 seconds
        Maximum number of seconds to run all queries: 18.748 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 18.227 seconds
        Minimum number of seconds to run all queries: 18.198 seconds
        Maximum number of seconds to run all queries: 18.269 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 17.941 seconds
        Minimum number of seconds to run all queries: 17.886 seconds
        Maximum number of seconds to run all queries: 18.000 seconds
        Number of clients running queries: 16
        Average number of queries per client: 62


Now, the same but with default memory allocator picked up (no jemalloc):

[openxs@fc23 fb56]$ ps aux | grep mysqld
openxs   27696  0.0  0.0 119164  3372 pts/0    S    13:50   0:00 /bin/sh bin/mysqld_safe --defaults-file=/home/openxs/fb56.cnf
openxs   27834  0.3  0.2 607012 23096 pts/0    Sl   13:50   0:00 ./bin/mysqld --defaults-file=/home/openxs/fb56.cnf --basedir=. --datadir=./data --plugin-dir=./lib/plugin --log-error=./data/fc23.err --pid-file=./data/fc23.pid
openxs   28029  0.0  0.0 118520   880 pts/0    S+   13:50   0:00 grep --color=auto mysqld
[openxs@fc23 fb56]$ pmap 27834 | grep jemalloc
[openxs@fc23 fb56]$

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 20.207 seconds
        Minimum number of seconds to run all queries: 19.833 seconds
        Maximum number of seconds to run all queries: 20.543 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 19.746 seconds
        Minimum number of seconds to run all queries: 19.500 seconds
        Maximum number of seconds to run all queries: 20.402 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 19.150 seconds
        Minimum number of seconds to run all queries: 19.084 seconds
        Maximum number of seconds to run all queries: 19.292 seconds
        Number of clients running queries: 16
        Average number of queries per client: 62

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 18.692 seconds
        Minimum number of seconds to run all queries: 18.516 seconds
        Maximum number of seconds to run all queries: 18.889 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 18.290 seconds
        Minimum number of seconds to run all queries: 18.250 seconds
        Maximum number of seconds to run all queries: 18.342 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

[openxs@fc23 fb56]$ bin/mysqlslap -uroot --iterations=10 --concurrency=16 --create-schema=test --no-drop --number-of-queries=1000 --query='select count(*), category from task straight_join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 18.155 seconds
        Minimum number of seconds to run all queries: 18.066 seconds
        Maximum number of seconds to run all queries: 18.397 seconds
        Number of clients running queries: 16
        Average number of queries per client: 62


Finally, this is how it looks like on a chart:


I can make the following conclusions:
  1. Using jemalloc helps to get a bit better throughput for both queries (I tend to consider the INNER case with 16 threads a fluctuation for now), but not much.
  2. New build based on fb-mysql-5.6.35 branch as of yesterday's morning demonstrates worse performance for STRAIGHT_JOIN case (access via PRIMARY keys + filesort) comparing to the previous build from January 30, 2017 based on webscalesql-5.6.27.75 branch. I am yet to find out why is it so, as INNER JOIN case performs a bit better.
  3. I am open to any MyRocks runing suggestions for my old QuadCore box, as for now MySQL 5.7.17 performs notably better on this same hardware and data in the tables, with --no-defaults. No wonder, a lot of efforts were spent by Oracle on Bug #68079, while all I've got so far for MyRocks is public "Can't repeat" of a kind, plus simple tuning advices with limited impact for my case.
I am sure I am doing something wrong with MyRocks, just tell me what exactly...