Sunday, February 9, 2014

On responsible bugs reporting

Let me start with the questions about responsible MySQL bug reporting that I'd like to see discussed, and then present the history behind them.

Assuming that you, my dear reader from the MySQL Community, noticed or found some simple sequence of SQL statements that, when executed by an authenticated MySQL user explicitly having all the privileges needed to execute these statements, crashes some version of your favorite MySQL fork, please answer the following questions:
  1. Do you consider this kind of bug a "security vulnerability"?
  2. Should you share a complete test case on any public site (MySQL bugs database, Facebook, your personal blog, anywhere)?
  3. Should you share just a description of the possible "attack vector", as Oracle does when they publish security bug fixes?
  4. Should you share just a stack trace or failed assertion information, without any details on how to get it?
  5. Should you officially inform the vendor of your MySQL fork about it in a private email (secalert_us@oracle.com in the case of Oracle)?
  6. Should you try to have it added to the CVE database and, if you should, what is the best way to do this?
  7. Should you try to use any informal, but still private, ways to share this information with MySQL vendors and other interested parties?
  8. What is the best way to do "responsible bugs reporting" in cases like this?



Now, the history behind these questions. Usually when I have to deal with any MySQL issue not obviously caused by a user mistake or misconfiguration, I suspect that the problem may be related to some (known or yet unknown, fixed or still active) MySQL bug. At the earliest stage of investigation I search the MySQL bugs database for error messages and codes, keywords, stack trace entries or assertions the user noted. This approach has worked well enough for me for more than 8 years, as even if the problem is not related to any bug, similar reports may give explanations, workarounds or other hints on possible reasons.

To be able to search bugs faster and have more things readily available "off the top of my head" I try to read new public bug reports whenever they appear. Sometimes I send comments or hints. Often enough I see simple test cases there to check. And I do try to run these test cases in the hope of seeing the problem more clearly or finding out that it is not repeatable for me (and then trying to understand why later). I do this during my free time and while I work; I do this for fun and when colleagues ask me for some assistance or, specifically, for any bug that may be related to their problem at hand.

I used to share my findings immediately on Facebook by posting URLs to bugs with my comments, but this activity is now stopped at least till PLMCE 2014, in the hope of finding some better way of cooperating on bug fixing and as a part of one simple experiment for my talk proposed for that conference (that was not selected by the committee anyway). So, since the end of November 2013 I share (rarely) only some questions or test cases from time to time, and ask colleagues to try them on their favorite MySQL, MariaDB or Percona Server versions, if they are also curious.

This week, during my usual morning review of new bug reports, I noticed an "Open" bug with a simple test case, just a couple of UPDATEs for some InnoDB table having a single row of data, that, according to the bug reporter, crashed the recently released MySQL 5.6.16 (that I blogged about just a day before) and 5.5.36. Surely I immediately tried this test case, and it crashed my test instances because of an assertion failure. I waited for some time to see that the bug was still open, informed a few of my colleagues hanging around on IRC about the new crashing bug, waited for some more time and... posted the test case in a status update on Facebook. This post got some comments that suspected a crash as a result, but not so many. I have maybe 200 friends there and most of them are actually current or former colleagues, MySQL engineers or software developers who work on MySQL and other databases at Oracle, HP, SkySQL, Percona and a few other companies. My idea was to make them all immediately aware of the potential problem. I noticed that very soon after this post the bug report had "disappeared": it was probably marked as Private and, I hope, verified and given proper attention by Oracle engineers.

Next morning, while quickly checking recent emails, I noticed a message from a former colleague, who now works at Oracle, asking me to remove that Facebook post, as explicitly publishing a way to crash MySQL is not "responsible reporting": I had to give Oracle a chance and time to fix this, as a potential "security vulnerability".

I immediately deleted the post purely out of respect for the person who asked, but internally I had doubts about what was actually the proper thing to do in this case, and I even felt insulted for a while by this (yet another) attempt to shut me up and stop me from discussing MySQL bugs in public (not that this really happens, but I have feelings like that from time to time). So, I tried to discuss this with my colleagues and found out that some of them do think the bug is actually a potential security vulnerability and thus had to be private. Others (like me) do not believe in any "security through obscurity" practice, especially for open source software, where any interested person can just read the code, find the assertion there and then work out when and why it can be hit, based on code analysis.


The formal Oracle policy in this case is well known: "If you are not a customer or partner, please email secalert_us@oracle.com with your discovery". Customers should just open a Support Request.

I was explicitly against applying this policy to MySQL (I suggested creating public bug reports in my New Year 2013 wishes to Oracle customers back in 2012), among other reasons because "security bugs" and "security issues" have more chances of being forgotten internally at Oracle: they get less visibility (even as they get higher priority), and very few people at Oracle can access, escalate or otherwise monitor them.

Surely, this policy cannot be enforced on community members and individuals like me. In the worst case, if somebody discloses details of a security vulnerability, she will not get credit, as: "In order to receive credit, security researchers must follow responsible disclosure practices".


That's the whole story so far. In my case it was not me who found the bug, so why should I even care about credit? My action drew immediate attention from the Oracle side, and thus I can be sure this specific bug will not be forgotten and will be fixed as soon as possible (by the way, MySQL 5.5.35 is also affected by the same assertion failure, so this is NOT a recent regression and those who use older versions in production are potentially affected). I am sure my post was also noticed by "security officers" in MariaDB Foundation and Percona, so in case these forks are also affected we now have a chance to get fixes there as well, no matter when and how Oracle fixes the problem.

Still, this case and related private discussions led me to the questions at the beginning of this blog post. I'd like to get your opinions on them.

8 comments:

  1. It has been discussed many times whether an easy way to crash the server by a non-privileged user is a "security issue" or not. Oracle themselves are not consistent in this respect (in bugs.mysql) and you weren't either when you were there. I also remember Shane Bester posted quite a lot of crash reports with single statements not even accessing a table. I don't know if those reports are still open, but they were for a long time.

    The reason why an easily reproducible crash may be considered a security issue is "shared hosting", where MySQL, as everybody knows, is the preferred database (even though it is not really fit for it IMO - it lacks sufficient user-based quota settings, for instance). But if this is the case, any query spinning the server up to 100% CPU for several minutes also is. It is effectively 'denial of service' by one user against other users. But who can prevent an inexperienced user from executing a full cartesian JOIN from his phpMyAdmin access to the database anyway? Or prevent him from writing and executing a routine that enters an infinite loop?

    In non-shared hosting most of this can be prevented by securing against SQL injections, proper use of user privileges etc.

    I think the Oracle person you contacted overreacted - at least if your FB posts were not shared fully publicly - and only to 'friends' (what a ridiculous term, BTW!). I have experienced inconsistency myself with "privatization" of bugs - and in a few cases I had the impression that the real reason was that the bug was too embarrassing for somebody and Oracle/Sun/MySQL and it was not a real security concern. In other words: I suspect it was more concern for "self_protection" than "protection of users" that led to "privatization" in those cases.

    As a general reply I would find it relevant to share such bugs with anyone who 1) can understand the underlying reason 2) can do something about it 3) has any legit reason to know about it - but at the same time try to avoid sharing with persons who only have an interest in using it as an exploit for bothering other people.

    But I realize it is not easy to accomplish that balance!

  2. Yes, I am a human and thus I also was inconsistent more than once for sure in my decisions...

    My main point is: following formal Oracle policy that is public for cases like the one discussed is hardly useful for the community and hardly makes sense for open source software.

    On the other hand, writing everywhere in public about yet another crash may attract the attention of people with malicious intentions, who can then try to break all kinds of shared MySQL hosting providers using one more hint (on top of the other ways they already know, a few of them mentioned in your comment, Peter).

    Also it was interesting to get an email on being more responsible from a company that has not only missed a simple crash for a long time (that is, probably saves on QA efforts), but also does not monitor bugs closely enough in 24x7 mode (or at least as closely as I do, having another job to spend time on) and, what's more important, has too many assertion failures in the code for cases that can now probably be handled more gracefully (just kill the statement that is causing the assertion and mark the table as "broken" maybe). All this on top of no user limits on CPU/memory or disk use...

  3. The "All crashes to be hidden" policy reminds me of a guy who once explained why he wasn't using blinkers on the road. He said that he was afraid of road-racketeers, and by not using blinkers he was making sure they don't know when to make their move to cause the intentional collision. I would never have believed that somebody could be so stupid if I hadn't heard it myself. I think for any brain it's obvious that the only people who are *not* affected by his behavior are those road-racketeers because they need much more precision than blinkers provide to make the right move. All innocent people are in danger, though.

    People with malicious intentions do not need *many* ways to cause harm, they need only one that will always work. And among those, a crash is a quite inefficient way for DoS on a shared server. Server goes down, a few queries fail, server gets restarted automatically, all is up again. Big deal. If one does it several times, you know it is fairly easy to figure out which connection causes a crash and who it belongs to (much easier than to figure out the actual scenario). If the server is watched at all, in no time the user can be warned or blocked. Besides, crashes get fixed and the villain will have to find another one. Legitimate ways to do the same are much better and more reliable.

    Hiding bugs has nothing to do with users' safety or with reason in general. It is a policy which originated outside MySQL and was simply enforced on it at some point, and users' interests is a rather questionable excuse.

    I suppose you communicate with customers (or at least communicate with those who communicate with customers). How many times did it happen that crashes in their server were actually caused by a repeating malicious attack? As opposed to -- how many times did it happen that crashes were caused by their own or their clients' software which unwillingly triggered a bug of the server -- in which case, if you already knew about a similar bug, or could google it easily, you would give them good advice right away on how to tweak the query or server options to avoid further problems until it's fixed; or, if you couldn't find the known similar bug, you could spend days or weeks trying to figure out what's going on, while the server would keep crashing.

    And that's customers who can at least expect that you'll do this research for them. What about community users? Would it not be so much simpler if they could actually look through existing bugs and see which one they are likely to have encountered? Instead, it all works backwards. The only thing they can do is to report a new one, because old ones are not searchable. If they happen to report an original one, they will be asked for a lot of information which they don't necessarily have, in order to reproduce, which they aren't necessarily able to. If they report a bug for an older version, they will be first asked to upgrade, just to see the problem still exists, without being pointed at an old fixed bug that might affect them (as if it's so easy to upgrade a production server on a whim). If they report something already known, their report will be closed as a duplicate without them being able to watch the progress of the original one because it's either private or internal. So, how is any of this in the best interest of users?

    Replies
    1. One of those cases when the comment is much better than the original post!

      I agree with almost everything above, even though my sympathy for poor community users without a clearly repeatable test who are asked to check the recent MySQL version first is notably lower...

    2. I think you misunderstood my point in regard to the old versions.

      It's perfectly fine if you say, "Yes, there was a bug like that, see link, already fixed, please upgrade". But it is *not* okay to say "your version is rather old, please upgrade, even though you cannot see fixed bugs because they are hidden and you cannot know if your upgrade effort should do you any good at all; and you also cannot search current open bugs for known regressions that will affect you after upgrade much worse than the bug you are trying to get rid of. But please upgrade anyway, just because."

      Many people cannot "check the new version" just for the sake of it; they don't have a test farm to play with, or time to do it. Yes, we might not owe anything to unpaid users, but everyone who reported a reasonable bug has already made some effort for the benefit of the community, and the least they can expect back is some respect. Hiding bugs from them and their bugs from other people (unless they asked to) is anything but respect.

    3. I agree 100% with your statement that hiding bugs is disrespectful to bug reporters and the Community in general. The bug reporter tried to add some value for the benefit of all other users, but it was hidden. Moreover, even bugs fixed in all affected versions often remain hidden just because they are forgotten or as yet another form of "care" about users (explained as "some poor soul may still use 5.5.0 or even 3.23.xx, so we cannot make test cases public to protect them").

      Now, back to upgrade requests. It does not make sense to ask for a check on a recent version when there are clear steps on how to reproduce the bug in the report. But when there is no test case or any details to start searching for known bugs, just some random crash that happened once in a while and was then reported, further steps depend on how much each side cares about this. If a person who processes bugs cares a lot and feels there is some serious problem behind it, she may spend days and months working on test cases. If, on the other hand, the bug reporter really cares, she can spend a few minutes on an upgrade (that may be useful and planned/postponed for many reasons anyway) or on setting up some (virtual) machine to run a slave of a recent version and add it to the existing setup. These days it should not take weeks and months.

      But this ("responsible processing of community bug reports") is off topic here. The questions are: should bug reporters or community members who care about MySQL bugs hide known cases of crashes or, instead, write about them everywhere?

      My position is clear: we should NOT try to hide anything. Oracle (or vendors of other MySQL forks) may try to do this if they want, but why should we? So, IMHO, all bugs should be reported in public and explained to all community members who care to read.

    4. (sorry for the removed comment, just had some interface problems)

      I think we should follow the common practice for *real* security vulnerabilities, involving unauthorized access and such. It is all about pros and cons; for regular crashes, benefits of hiding them just don't outweigh disadvantages. For security holes, it's a different story.

  4. This comment has been removed by the author.
