Sunday, October 21, 2018

On Some Recent MySQL Optimizer Bugs

Yet another topic I missed in my blog post on problematic MySQL features back in July is MySQL optimizer. Unlike with XA transactions, it was done in purpose, as known bugs, limitations and deficiencies of MySQL optimizer is a topic for a series of blog posts if not a separate blog. At the moment the list of known active bug reports in optimize category consists of 380(!) items (mostly "Verified"), aside from feature requests and bugs not considered "production" ones by current "rules" of MySQL public bugs database. I try to check optimizer bugs often in my posts, I reported many of them, but I am still not ready to cover this topic entirely.

What I can do in frames of one blog post is a quick review of some "Verified" optimizer bugs reported over last year. I'll present them one by one in a list, with some comments (mostly related to my checks of the same test case with MariaDB 10.3.7 that I have at hand) and, hopefully, some conclusions about current state of MySQL optimizer.

I'll try to shed some light on current state of MySQL optimizer, but it's huge and dark area, with many details hidden...
So, here is the list, starting from most recently reported bugs:
  • Bug #92654 - "GROUP BY fails for queries when a temporary table is involved". This bug affects recent MySQL 8.0.12 and 5.7.23, but does not affect MariaDB 10.3, for example, from what I see:
    MariaDB [test]> insert into domain_tree values (1), (2), (3);
    Query OK, 3 rows affected (0.080 sec)
    Records: 3  Duplicates: 0  Warnings: 0

    MariaDB [test]> insert into host_connection_info values (1), (3);
    Query OK, 2 rows affected (0.054 sec)
    Records: 2  Duplicates: 0  Warnings: 0

    MariaDB [test]> SELECT
        ->   COUNT(1),
        ->   host_connection_status.connection_time
        -> FROM
        ->   (SELECT id
        ->    FROM domain_tree) AS hosts_with_status
        ->   LEFT OUTER JOIN
        ->   (SELECT
        ->      domain_id,
        ->      'recent' AS connection_time
        ->    FROM
        ->      host_connection_info) AS host_connection_status
        ->     ON hosts_with_status.id = host_connection_status.domain_id
        -> GROUP BY host_connection_status.connection_time;
    +----------+-----------------+
    | COUNT(1) | connection_time |
    +----------+-----------------+
    |        1 | NULL            |
    |        2 | recent          |
    +----------+-----------------+
    2 rows in set (0.003 sec)
  • Bug #92524 - "Left join with datetime join condition produces wrong results". The bug was reported by Wei Zhao, who contributed a patch. Again, MariaDB 10.3 is not affected:
    MariaDB [test]> select B.* from h1 left join g B on h1.a=B.a where B.d=str_to_date('99991231',"%Y%m%d") and h1.a=1;
    +---+---------------------+
    | a | d                   |
    +---+---------------------+
    | 1 | 9999-12-31 00:00:00 |
    +---+---------------------+
    1 row in set (0.151 sec)

    MariaDB [test]> select B.* from h1 left join g B on h1.a=B.a and B.d=str_to_date
    ('99991231',"%Y%m%d") where h1.a=1;
    +---+---------------------+
    | a | d                   |
    +---+---------------------+
    | 1 | 9999-12-31 00:00:00 |
    +---+---------------------+
    1 row in set (0.002 sec)
  • Bug #92466 - "Case function error on randomly generated values". See also related older Bug #86624 - "Subquery's RAND() column re-evaluated at every reference". These are either regressions comparing to MySQL 5.6 (and MariaDB), or unclear and weird change in behavior that can be workarounded with some tricks (suggested by Oracle developers) to force materialization of derived table. Essentially, result depends on execution plan - what else could we dream about?
  • Bug #92421 - "Queries with views and operations over local variables don't use indexes". Yet another case when MySQL 5.6 worked differently. As Roy Lyseng explained in comments:
    "... this is due to a deliberate choice that was taken when rewriting derived tables and views in 5.7: When a user variable was assigned a value in a query block, merging of derived tables was disabled.
    ...
    In 8.0, you can override this with a merge hint: /*+ merge(v_test) */, but this is unfortunately not implemented in 5.7.
    "
  • Bug #92209 - "AVG(YEAR(datetime_field)) makes an error result because of overflow". All recent MySQL versions and MariaDB 10.3.7 are affected.
  • Bug #92020 - "Introduce new SQL mode rejecting queries with results depending on query plan". Great feature request by Sveta Smirnova that shows current state of optimizer development properly. We need a feature for MySQL to stop accepting queries that may return different results depending on the execution plan. So, current MySQL considers different results when different execution plans are used normal! Sveta refers to her Bug #91878 - "Wrong results with optimizer_switch='derived_merge=ON';" as an example. MariaDB 10.3 is NOT affected by that bug.
  • Bug #91418 - "derived_merge causing incorrect results with distinct subquery and uuid()". From what I see in my tests, MariaDB 10.3.7 produce wrong results with derived_merge both ON and OFF, unfortunately.
  • Bug #91139 - "use index dives less often". In MySQL 5.7+ the default value of eq_range_index_dive_limit increased from to 10 to 200, and this may negatively affect performance. As Mark Callaghan noted, when there is only one possible index exists optimizer doesn't need to evaluate the query to figure out how to evaluate the query.
  • Bug #90847 - "Query returns wrong data if order by is present". This is definitely a corner case, but still. MariaDB 10.3 returns correct result in my tests.
  • Bug #90398 - "Duplicate entry for key '<group_key>' error". I can not reproduce the last public test case on MariaDB 10.3.
  • Bug #89419 - "Incorrect use of std::max". It was reported based on code analysis by Zsolt Parragi. See also Bug #90853 - "warning: taking the max of a value and unsigned zero is always equal to the other value [-Wmax-unsigned-zero]". Proper compiler detects this.
  • Bug #89410 - "delete from ...where not exists with table alias creates an ERROR 1064 (42000)". MariaDB 10.3 is also affected. Both Oracle and PostrgeSQL accepts the syntax, while in MySQL and MariaDB we can use multi-table delete syntax-based workaround as suggested by Roy Lyseng.
  • Bug #89367 - "Storing result in a variable(UDV) causes query on a view to use derived tables", was reported by Jaime Sicam. This is a kind of regression in MySQL 5.7. MariaDB 10.3 and MySQL 8.0 are not affected. Let me quote a comment by Roy Lyseng:
    "In 5.7, we added a heuristic so that queries that assign user variables are by default materialized and not merged. However, we should have let the ALGORITHM=MERGE override this decision. This is a bug."
  • Bug #89182 - "Optimizer unable to ignore index part entering less optimal query plan". Nice report from Przemyslaw Malkowski. One of many case when "ref" vs "range" decision seems to be wrong based on costs. Looks like optimizer still have parts that are heuristics/rules based and/or do not take costs into account properly.
  • Bug #89149 - "SELECT DISTINCT on multiple TEXT columns is slow". Yet another regression in MySQL 5.7+.
That's all optimizer bugs reported in 2018 and still "Verified" that I wanted to discuss.

From the list above I can conclude the following:
  1. There are many simple enough cases when queries provide wrong results or get not optimal execution plans in MySQL. For many of them MariaDB's optimizer does a better job.
  2. Behavior of optimizer for some popular use cases changed after MySQL 5.6, so take extra care to check queries and their results after upgrade to MySQL 5.7+.
  3. derived_merge optimization seems to cause a lot of problems for users in MySQL 5.7 and 8.0.
  4. It seems optimizer developers care enough to comment on bugs, suggest workarounds and explain decisions made.

No comments:

Post a Comment