Monday, August 12, 2013

MySQL Bugs Verification - Is It Really Simple?

While Sveta and others have already explained what it really means to "Verify" (or "Confirm", in Launchpad/Percona's terms) a bug in MySQL software, and why this step in a bug's life cycle is important, we still often read complaints about too much time being taken to verify a bug even when it comes with a clearly repeatable test case that can just be copy/pasted, like Bug #69985 or the notably more serious Bug #69990. Moreover, I often make comments of this kind myself...

So, it seems there is still a need for a clear explanation of all the steps that may be involved in verifying a MySQL or Percona Server bug (let's take the server as the most complicated case). First of all, both Oracle and Percona require engineers who process bugs to check them on the latest releases and/or source code builds of all versions/branches that are currently GA and fully supported, and on all development versions. For an Oracle MySQL bug this means checking recent builds from the current source code of 5.1, 5.5, 5.6 and 5.7 and, for many kinds of bugs (like packaging ones), also the official binaries of 5.1.71, 5.5.33, 5.6.13 and 5.7.1 provided by Oracle for the platform. Some bugs are clearly OS-specific based on the description alone, but others may require checks on Linux, Windows or even Solaris and FreeBSD (if they are NOT repeatable on Linux). Note that there are also different versions of Windows and different Linux distributions, both 32-bit and 64-bit, and sometimes all these details matter. So, if an engineer did not bother to check everywhere before setting a bug to "Verified", the bug may come back to him with a request from development for some more checks...
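To get a feeling for how quickly these checks multiply, here is a minimal sketch that just enumerates the combinations mentioned above. The version and platform names come from the paragraph; the function name, its parameters and the idea that every build is checked on every platform in both 32-bit and 64-bit variants are my assumptions for illustration, not a description of any real internal tool.

```python
from itertools import product

# Lists taken from the text above; the exact matrix a real team uses
# is an assumption made for this sketch.
BRANCHES = ["5.1", "5.5", "5.6", "5.7"]
OFFICIAL_BINARIES = ["5.1.71", "5.5.33", "5.6.13", "5.7.1"]
PLATFORMS = ["Linux", "Windows", "Solaris", "FreeBSD"]
WORD_SIZES = ["32-bit", "64-bit"]


def verification_matrix(check_binaries=False, platforms=("Linux",)):
    """Return every (build, platform, word size) combination to check."""
    builds = [f"source build of {b}" for b in BRANCHES]
    if check_binaries:
        # Packaging-style bugs also need the official binaries.
        builds += [f"official binary {v}" for v in OFFICIAL_BINARIES]
    return list(product(builds, platforms, WORD_SIZES))


# Worst case: a packaging bug that has to be checked on every platform.
matrix = verification_matrix(check_binaries=True, platforms=PLATFORMS)
print(len(matrix))  # 64
```

Even the "easy" case (source builds only, Linux only) is already 8 separate checks in this sketch.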

On top of that, from time to time MySQL developers and QA start to care about regression bugs more than usual and, as a result, ask engineers who process bugs to try to pinpoint the exact release in which a supposed regression appeared. Sometimes this means checking the 3 previous official releases, sometimes it means a separate detailed study that only really brave people like Shane (who probably has all releases since 3.23.5x installed and ready to start anyway) do... Note that all this is for a bug with a clear test case, while many bug reports are not that obvious.
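When many releases are installed side by side, pinpointing the first affected one can be done as a binary search instead of checking every release in order. The sketch below is only an illustration of that idea: `first_affected_release` and `is_affected` are hypothetical names, and it assumes that once the bug appears it stays in every later release (which is not always true in real life).

```python
def first_affected_release(releases, is_affected):
    """Return the earliest release for which is_affected() is True, or None.

    Assumes `releases` is ordered from oldest to newest and that the bug,
    once introduced, is present in every later release.
    """
    lo, hi = 0, len(releases)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_affected(releases[mid]):
            hi = mid          # bug is here, look at older releases
        else:
            lo = mid + 1      # bug is not here yet, look at newer ones
    return releases[lo] if lo < len(releases) else None


# Made-up example: pretend the test case fails starting from 5.6.12.
releases = ["5.6.10", "5.6.11", "5.6.12", "5.6.13"]
affected = {"5.6.12", "5.6.13"}   # illustrative data, not a real bug
print(first_affected_release(releases, lambda r: r in affected))  # 5.6.12
```

In practice `is_affected` would start the given server version and run the test case from the bug report, which is exactly the slow part.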

Surely, a lot of checks can be automated (as Sveta explained) using smart setups, MTR, MySQL sandboxes and some shell scripting. But then even a small bug/problem in the scripts may lead to bugs NOT being checked on some important version (as happened with MySQL 5.6 at the pre-GA stage, just because it was no longer mysql-trunk but mysql-5.6, while the scripts remained the same). And then you know what happens - people like me notice this problem and Pandora's box is opened...
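A cheap guard against this kind of silent gap is to make the automation compare the versions it actually covers with the versions it is supposed to cover. Here is a minimal sketch of such a check. The function name, the expected version set and the branch-to-version mapping are all assumptions for illustration (only mysql-trunk and the 5.6 branch are mentioned above; the other branch names are guesses).

```python
# Versions that must be covered, per the list of supported versions above.
EXPECTED_VERSIONS = {"5.1", "5.5", "5.6", "5.7"}


def missing_versions(script_branches, branch_to_version):
    """Return expected versions that none of the script's branches build."""
    covered = {
        branch_to_version[b] for b in script_branches if b in branch_to_version
    }
    return sorted(EXPECTED_VERSIONS - covered)


# Hypothetical state after 5.6 got its own branch: trunk now means 5.7,
# but the script still pulls the same three branches as before.
branch_to_version = {
    "mysql-5.1": "5.1",
    "mysql-5.5": "5.5",
    "mysql-5.6": "5.6",
    "mysql-trunk": "5.7",
}
old_script_branches = ["mysql-5.1", "mysql-5.5", "mysql-trunk"]
print(missing_versions(old_script_branches, branch_to_version))  # ['5.6']
```

The check itself is trivial; the point is that somebody has to keep the mapping up to date, or at least make the script complain loudly when the list comes back non-empty.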

On top of that, every "Verified" MySQL bug in Oracle should be copied to the internal bugs database. At this step one has to run a script (that may have bugs too) from a Web form, providing the MySQL bug number, then check the copied bug in the internal bugs database, set the proper status for it, and make sure it got the proper category (as not every category at http://bugs.mysql.com is supported in Oracle's internal bugs database), got the proper priority and ended up assigned to the developer lead who can really take care of it. Some bugs should obviously be escalated immediately, and there was a separate procedure for this... At least this was the case a year ago.

So, even if I was sometimes able to "Verify" a bug within 15 minutes of it being reported, proper verification of even the simplest server bug usually takes more than that, even if it was noticed immediately by an engineer who had nothing more important to do at the moment. By the way, bug processing is hardly a 24x7 service in any company, so we should NOT expect some engineer to really monitor all incoming bugs in real time on Sunday.

Sounds like a really over-complicated procedure, doesn't it? Is it any different in Percona? Yes, it's easier here, as there is no need to copy from one bugs database to the other - everything is in one place. We also do not release binaries for Windows, FreeBSD or Solaris, and thus usually do not care much about bugs on these platforms unless they are repeatable on Linux. But on the other hand Percona provides repositories of RPM and .deb packages, and thus some bugs have to be checked on all recent major releases of RHEL/CentOS, Debian and Ubuntu (these are the officially fully supported platforms here). I also have to check on all major versions of Percona Server: 5.1, 5.5 and 5.6 at the moment. On top of that, if a bug is not clearly related to a Percona-specific feature, we have to check the upstream MySQL version and, if it is affected, report an upstream bug and link it to the Percona Server bug. So, again, we in Percona often end up working with two bugs databases, doing a lot of copy/pasting and following other annoying procedures... So, do not expect a Percona Server bug to be "Confirmed" in a matter of minutes or days, even if it comes with a simple test case repeatable on a recent version. So it goes.

The summary is simple: please respect the hard and often boring work of engineers who process bugs, and give them some time before complaining (this is a reminder for myself as well).

If you care a lot about a bug, you should probably just open a support request with the vendor (I hope you have a support subscription, if you really care that much?) and then use your power as a customer to make things happen. If it does not work this way - tell me and let me open Pandora's box (or a can of worms, if you prefer) on Facebook...
