Saturday, September 28, 2013

Fun with Bugs #24 - PERFORMANCE_SCHEMA

It seems that one of my session proposals is accepted for Percona Live London 2013, so I have to prepare myself to speak about PERFORMANCE_SCHEMA new features and problems in MySQL 5.6. Bugs are going to be discussed, among other things. Let's check current active bugs (and some "Not a bug"s) related to PERFORMANCE_SCHEMA in this issue.

I'd like to start with Bug #68514 that got some attention this week again, in despite of its "Not a bug" formal status. Detailed instrumentation comes with a cost, and to reduce high CPU cost (reported as Bug #67736 by Domas at 5.6 RC stage) it was decided to allocate memory in bigger batches. It seems notable (I'd say unexpectedly high) amount of memory may be allocated for performance counters if you have max_connections > 1500 (or table_open_cache > 10000, or table_definition_cache > 10000, or open_files_limit > 30000). So, this is something to take into account if you plan to use MySQL 5.6 in a hope to "scale up" your instance (as performance_schema=1 there by default and thus extra memory is allocated by default).

Speaking about "Verified" bugs, there are only 7 of them now for MySQL 5.6. So it's easy to just list them all:
  • Bug #68413 - "performance_schema overhead is at least 10%". It can be more, but it took a lot of time and efforts from Mark Callaghan and me to force Oracle engineers to accept this claim as valid. Sometimes I feel that public bug reports for PERFORMANCE_SCHEMA from users are considered more as an insult than a useful contribution from community (as it happens with even minor InnoDB problem reports, for example). Anyway, we managed to prove the overhead can really be that big, and later Oracle's own performance guru Dimitri Kravtchuk confirmed in his Bug #70018 that for some use cases just having PERFORMANCE_SCHEMA enabled may cost you 30% decrease of QPS, even if you do not try to use it in any way.
  • Comparing to the above, Bug #69727 reported by Todd Farmer is a minor issue. It seems setting instruments for P_S as server startup options is not so obvious in some cases.
  • Bug #69782 - "Old files not being removed from performance_schema.file_instances ". This may eventually become a problem for MySQL instances that create binary logs or relay logs often.
  • Bug #69915 - "statement/com/Query counter doesn't increment". Yet another bug report from Todd Farmer, who uses P_S and blogs about it a lot. No wonder he finds bugs in the process. This one sounds simple, so I don't really understand why it is still "Verified".
  • Bug #70025 - "Update on P_S setup_consumers and threads through JOIN only updates first row". Minor enough problem was found by yet another my former colleague in Oracle, Jesper Krogh. Jesper also often writes about P_S in his blog and does us a favor by reporting bugs in public bugs database. The most recent (at the moment) still "Verified" P_S bug is also from him: Bug #70028 - "P_S threads.INSTRUMENTED not set according to setup_actors".
 As you can conclude from the above, PERFORMANCE_SCHEMA is created by Oracle engineers for their own use and they use it a lot. So, it is doomed to be useful, especially now, when MySQL 5.6 is becoming widely used by Oracle customers and MySQL community users in production. We can rely on Facebook in this area - as they already use P_S, they'll do their best to force Oracle MySQL engineers to make it as efficient as possible eventually.

It's also clear from the above that one of the main problems of PERFORMANCE_SCHEMA is its name. Users try to name it as P_S, PS, PFS - whatever abbreviation is used, it's better than the original name (even if it comes from some SQL standard that nobody really cares about...). Along with INFORMATION_SCHEMA it makes me wanting to see Oracle's good old public synonyms implemented in MySQL some day... Before that it seems that the only sane way to use P_S on a regular basis in interactive command line client is via ps_helper, created by the godfather of P_S, Mark Leith.

7 comments:

  1. Cool nugget of information in one of those posts that shows something that I would never have though to try in a million years:

    SHOW ENGINE PERFORMANCE_SCHEMA STATUS

    ReplyDelete
  2. The author of 'ps_helper' is Mark Leith.

    The author of the PERFORMANCE_SCHEMA itself, is Marc Alff (myself).

    I am sure you have your own opinions on bugs themselves, but at the very minimum I would expect credits to be attributed correctly.

    -- Marc Alff

    ReplyDelete
  3. A godfather isn't a parent. You are the father Marc. A godfather is a beloved friend that helps with upbringing.

    ReplyDelete
  4. English is not my native language, but I think that "godfather" is correct term to define the role of Mark Leith in the process. I never claimed that he is the author of PERFORMANCE_SCHEMA.

    ReplyDelete
  5. > Before that it seems that the only sane way to use P_S on a regular basis in interactive command line client is via ps_helper, created by the godfather of P_S, Mark Leith.

    PS_HELPER is great, but P_S in 5.6 can be used even without it. You should read Marc Allf's blog or, if you prefer Russian language, download slides from my seminar at Devconf 2013 (See my LJ user info). You will find a lot of examples there.

    ReplyDelete
  6. Hi,

    The name isn't from any SQL standard. As the worklog task description says:
    "Proposals for the schema name were DYNAMIC (but that is a reserved word in standard SQL), PERFORMANCE_SCHEMA (Jim Winstead's suggestion), PERFORMANCE (Peter's preference), and DBTrace because it sounds like DTrace. There was a vote (see dev-private thread "WL#2360 Vote: schema name and table name"). The winner was PERFORMANCE_SCHEMA."
    http://dev.mysql.com/worklog/task/?id=2360
    I was responsible for writing that and all related worklog tasks at the time, but if you want to blame a godfather, try Jim.

    Peter Gulutzan
    ocelot.ca/blog

    ReplyDelete
  7. Thank you for clarification, Peter. I'd call you a "godmather" of PERFORMANCE_SCHEMA, but it seems I should just shut up if I do not want to be misinterpreted again.

    Surely PERFORMANCE_SCHEMA is useful "as is", and I've used it myself more than once. It's just painful to type the name every time and table names in it are also too verbose. I am spoiled by spending too much time with V$SQL and V$WAIT_EVENT in my life before MySQL...

    ReplyDelete