Sunday, January 20, 2013

Fun with Bugs, Issue #2, January 2013

Looks like Issue #1 was popular enough, judging by the number of reviews, so let me continue and tell you what kinds of fun I had with MySQL bug reports between January 12 and today. Once again, this is a kind of digest of my bug-related posts on Facebook during this period.

I'd like to start with the latest open bug at the moment, Bug #68127. Looks like one can easily crash MySQL 5.6.9 by running a few concurrent SELECT * FROM threads queries against PERFORMANCE_SCHEMA. As PERFORMANCE_SCHEMA is enabled by default in 5.6.9 (now even such an ignorant person as me knows this, thanks to readers of my previous posts), this is a serious problem IMHO. Compared to this, missing or unclear information in the manual, like the cases I've reported in Bug #68097 and Bug #68089, is so minor...
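To make "a few concurrent queries" concrete, here is a minimal sketch of such a load in Python. It is not the test case from the bug report, just an illustration of the general shape. The `connect` argument is a hypothetical callable that returns a DB-API 2.0 connection from whatever driver you use, and the worker and iteration counts are arbitrary. Run something like this only against a test server.

```python
import threading

# The query named in the text, fully qualified with the schema name.
QUERY = "SELECT * FROM performance_schema.threads"


def run_concurrent_selects(connect, workers=4, iterations=100):
    """Run QUERY from several threads at once.

    `connect` is a hypothetical zero-argument callable returning a
    DB-API 2.0 connection. Returns (completed_queries, errors).
    """
    completed = 0
    errors = []
    lock = threading.Lock()

    def worker():
        nonlocal completed
        conn = connect()  # one connection per thread, never shared
        try:
            cur = conn.cursor()
            for _ in range(iterations):
                cur.execute(QUERY)
                cur.fetchall()
                with lock:
                    completed += 1
        except Exception as exc:  # e.g. lost connection if the server dies
            with lock:
                errors.append(exc)
        finally:
            conn.close()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return completed, errors
```

If the server survives, `errors` stays empty and the first value equals `workers * iterations`; a crash would typically show up as connection errors collected in `errors`.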

Now, let me scroll my Timeline back and check how the period started. Looks like it was mostly devoted to MySQL 5.6, and this is natural - it's approaching GA and should be the best release ever. That's why it's so sad to see regression bugs or incomplete features in it. Immediately before my previous "Fun with Bugs" post, my former colleague Sveta Smirnova published her review of the 18 most important new troubleshooting features of MySQL 5.6. I've taken a quick look and ended up with a list of related bugs:
  • Bug #67830 - EXPLAIN for DELETE and UPDATE does not work as expected, still just "Verified"
  • Bug #57514 - delayed replication may not work as expected, still just "Verified"
  • Bug #67236 - problems with new host_cache table. This one is already "Closed".
Then there was a regression bug in the optimizer, again in 5.6, Bug #68046. It came just in time for an internal discussion we had at Percona on optimization of long IN lists. Make sure you check it if you are going to upgrade to 5.6 now or soon after GA and you use long IN lists, out of old habit or previous experience, as a solution for some query performance problems.
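If you want to compare how your own long IN lists are handled before and after an upgrade, a small helper like the one below may be handy. It is only a sketch and not the test case from the bug report: the table and column names are hypothetical, and the `%s` placeholder style is an assumption that matches common Python MySQL drivers. It builds an EXPLAIN statement with an IN list of any length, so you can run it on both versions and compare the plans.

```python
def explain_in_list(table, column, values):
    """Build (sql, params) for EXPLAIN of a SELECT with a long IN list.

    `table` and `column` are hypothetical identifiers and are inserted
    into the statement as-is, so pass trusted names only. The values
    themselves go through driver placeholders.
    """
    params = list(values)
    if not params:
        raise ValueError("IN list must not be empty")
    placeholders = ", ".join(["%s"] * len(params))
    sql = f"EXPLAIN SELECT * FROM {table} WHERE {column} IN ({placeholders})"
    return sql, params


# Usage sketch: a 1000-element list for a hypothetical table t1.
sql, params = explain_in_list("t1", "id", range(1000))
# cursor.execute(sql, params)  # with a cursor from your driver of choice
```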

MySQL 5.6 will be under increasing attention, so I'd really appreciate it if even less severe bugs like Bug #68118 got proper attention from Oracle engineers and were formally verified fast. Especially in cases like the one above, when old utilities with familiar names are rewritten (in Perl?) and now work differently than in current GA versions.

The optimizer has been a topic of my interest for years, so it was hard to skip a bug that, it seems, does NOT affect 5.6, only 5.1.46+ and 5.5. I mean this one, Bug #68072. With proper indexes in place you can get wrong results for some queries when the timediff() function is involved. What's funny is that without the index the result is correct. It's always great to choose performance vs correct results.

I have to admit that Oracle engineers did a great job processing bugs during this period. The number of open bugs is now almost as low as during my last day at Oracle, and they really help to pinpoint cases where experienced engineers make mistakes while studying a problem and report bugs that are "false positives". It may take time to argue in such cases, but public discussion and double checking benefit all sides involved. See Bug #68075 as a great example of this. Even if a bug still remains "Open", like Bug #68077, it's great to be able to read the analysis or opinion of Oracle engineers about the problem and potential solutions.

If you care about InnoDB scalability and want to see a case where MySQL 5.6 is better than 5.5, understand why that is so, and judge whether MySQL 5.6 is really that good and scalable in real life, take a look at Bug #68079, especially at the comments from Dimitri Kravtchuk there. I expect a detailed study (if not a solution) from him soon, published on his great blog. Thanks to my former colleague Arnaud Adant, the bug was not only formally verified quickly; he also made sure that Oracle's leading performance tuning expert is aware of it.

This was yet another great example of Oracle's care about the public bugs database and the problems reported there. They can do a lot. We, community users, should just remind them about bugs they miss for whatever reason. One short post about Bug #67124, and now it is "Verified" and gets a chance to have resources allocated and priority set, so that more improvements will be done sooner.

I know Oracle does not care about other storage engines as much as about InnoDB, but still it would be nice to have some cases, like Bug #68086, at least clarified in the manual if not fixed. An even lower level of care is demonstrated for bugs related to tools or connectors. And while I can consider these bugs funny (as in the case of Bug #67994), they may not be easy to fix. This makes it even more important to have them formally verified, especially if verification itself looks easy.

Since Issue #1 in this series I've got several notes from my new readers about problematic bug reports. Bug #66819 was one of them (surely it is about InnoDB, it's all about InnoDB these days). My colleague Alexey Kopytov studied the root causes of the infamous Bug #61104 in detail there, and the bug was closed as fixed in versions 5.1.67, 5.5.29, 5.6.8 and 5.7.0. Weixiang Zhai questioned that based on code review, and then it was proved that not ALL cases are fixed. It seems Oracle will work on the remaining ones within a new, internal bug report, but in the meantime you should know that the change buffer is still not entirely safe in 5.5.29, and it makes sense to keep applying the same workaround as suggested in Bug #61104.


As you know, some bugs affect only debug builds. Does this mean they are not important to fix? No, they prevent proper testing for other bugs, so they also have to be escalated. Check Bug #68116 from my colleague Laurynas, for example. I hope it will not end up "Unsupported" only because his stack trace is from Percona Server...

By the way, MySQL 5.7 is already a work in progress. We, in the MySQL community, do not have access to the source code yet (as there has been no official release), but I see 5.7 mentioned in bug reports more often now. Check this case, Bug #63178. I think that it is NOT acceptable to just close the bug with the fix only in 5.7.0, without any comments on why the problem was not fixed in earlier versions and without access to the source code of the fix. But who am I to suggest anything... On the positive side, the fact that somebody cared to close the bug and write 5.7.0 there probably means that there will be a formal release of 5.7.x soon. Let's hope...
 
I'd like to finish this post in a way similar to the previous one: if you have a valid support contract with Oracle, ask them what Bug #65664 is about and whether it is really fixed in MySQL 5.1.66. I have reasons to think it is not, unfortunately. Find out for yourself - this is your benefit as a customer after all, to be able to demand a definite reply. I can only assume or guess (or check the code) at the moment.

To be continued...
