Friday, July 6, 2018

Problems of Oracle's MySQL as an Open Source Product

In my previous summary blog post I listed 5 problems I see with the way Oracle handles MySQL server development. The first of them was that "Oracle does not develop MySQL server in a true open source way" and this is actually what I started my draft of that entire blog post with. Now it's time to get into details, as so far there was mostly fun around this and statements that MariaDB also could do better in the related Twitter discussion I had.

So, let me explain what forces me to think that Oracle is treating MySQL somewhat wrong for the open source product.

Nice pathway on this photo, but it's not straight and it's not clear where it goes. Same with MySQL development...
We get MySQL source code updated at GitHub only when (or, as it often happened in the past, some time after) the official release of new version happens. You can see, for example, that MySQL 8.0 source code at GitHub was actually last time updated on April 3, 2018, while MySQL 8.0.11 GA was released officially on April 19, 2018 (and that's when new code became really available in public repository). We do not see any code changes later than April 3, while it's clear that there are bug fixes already implemented for MySQL 8.0.12 (see Bug #90523 - "[MySQL 8.0 GA Release Build] InnoDB Assertion: (capacity & (capacity - 1)) == 0", for example. There is an easy way to crash official MySQL 8.0.11 binaries upon startup, fixed back before April 30, with some description of the fix even, but no source code of the fix is published) and 8.0.13 even (see Bug #90999 - "Bad usage of ppoll in libmysql"). With Oracle's approach to sharing the source code, we can not see the fixes that are already made long time ago, apply them, test them or comment on them. This is fundamentally wrong, IMHO, for any open source software.

In other projects we usually can see the code as soon as it is pushed to the branch (check MariaDB if you care, last change few hours ago at the moment). Main branches may have more strict rules for updating, but in general we see fixes as they happen, not only when new official release happens.
Side note: if you see that Bug #90523 became private after I mentioned it here, that's another wrong thing they often do. More on the in the next post, on community bug reports handling by Oracle...
Interesting enough, when the fix comes from community we can usually see the patch. This happened to the Bug #90999 mentioned above - we have a fix provided by Facebook and one can see the patch in Bug #91067 - "Contribution by Facebook: Do not use sigmask in ppoll for client libraries". When somebody makes pull request, patch source is visible. But one can never be sure if it's the final patch and had it passed all the usual QA tests and reviews, or what happens to pull requests closed because developer had not signed the agreement...

If the fix is developed by Oracle you'll see the code changed only with/after the official release. Moreover, it would be on you to identify the exact commit(s) that introduced the fix. For a long time Laurynas Biveinis from Percona cared to add comments about the exact commit that fixed the bug to public bug reports (see Bug #77689 - "mysql_execute_command SQLCOM_UNLOCK_TABLES redundant trans_check_state check?" as one of examples). Community members have to work hard to "reverse engineer" Oracle's fixes and link them back to details of real problems (community bug reports) they were intended to resolve!

Compare this to a typical changelog of MariaDB that leads you directly to commits and code changes.

What's even worse, Oracle started a practice to publish only part of their changes made for the release. Some tests, those for "security" bugs, are NOT published even if we assume they exist or even can be 100% sure they exist.

My recent enough favorite example is the "The CREATE TABLE of death" bug reported by Jean-François Gagné. If you follow his blog post and links in it you can find out all the details, including the test case that is public in MariaDB. With this public information you can go and crash any affected older MySQL versions. Bug reporter did everything to inform affected vendors properly, and responsible vendors disclosed the test (after they fixed the problem)!

Now, try to find similar test in public GitHub tree of Oracle MySQL. I tried to find it literally, try to find references to somewhat related public bug numbers etc, but failed. If you know better and can identify the related public test at GitHub, please, add a comment and correct me!

To summarize, this is what I am mostly concerned about:
  1. Public source code is updated only with the releases. There are no feature-specific code branches, development branches, just nothing public until the official release.
  2. Oracle does not provide any details about commits and their relations to bugs fixed in the release notes or anywhere else outside GitHub. One has to go study the source code to make his own conclusions.
  3. Oracle does not share some of test cases in their commits. So, some test cases remain non-public and we can only guess (based on code analysis) what was the real intention of the fix. This applies to security bugs and who knows to what else.
I would not go into other potential problems (I've heard about some others from developers, for example, related to code refactoring Oracle does) or more details. The above is enough for me to state that Oracle do wrong things with the way they publish source code and threat MySQL as open source product.

All the problems mentioned above were introduced by Oracle, these never happened in MySQL AB or Sun. MariaDB and Percona servers may have their own problems, but the above do NOT apply to them, so I state that other vendors develop MySQL forks and related projects differently, and still are in business and doing well!

Sunday, July 1, 2018

What's Right and What's Wrong With Oracle's Way of MySQL Server Development

Recently it's quite common to state that "Oracle's Acquisition Was Actually the Best Thing to Happen to MySQL". I am not going to argue with that - Oracle proved over years that they are committed to continue active development of this great open source RDBMS, and they have invested a lot into making it better and implementing features that were missed or became important recently. Unlike Sun Microsystems, they seem to clearly know what to do with this software to make it more popular and make money on it.

Among the right things Oracle does for MySQL server development I'd like to highlight the following:
  1. MySQL server development continues, with new features added, most popular OSes supported, regular releases happened and source code still published at GitHub under GPL license.
  2. Oracle continues to maintain public MySQL bugs database and fix bugs reported there.
  3. Oracle accepts external contributions to MySQL server under clear conditions. They acknowledge contributions in public. The release notes, in particular, mention authors of each community-provided patch.
  4. Oracle works hard on improving performance and scalability of MySQL server.
  5. Oracle tries to provide good background for their new MySQL designs (check new InnoDB redo logging, for example).
  6. Oracle cooperates with MySQL Community. They organize their own community events and participate in numerous related conferences, including (but not limited to) the biggest Percona Live ones. Oracle engineers speak and write about their work in progress. Oracle seems to actively support some open source tools that work with MySQL server, like ProxySQL.
  7. Oracle still keeps and maintains pluggable storage engine architecture and plugin APIs, even though their own development is recently mostly related to InnoDB storage engine.
  8. Oracle still maintains and improves public MySQL Manual.
So, Oracle is doing good with MySQL, and dozens of their customers, community members and MySQL experts keep stating this all the time. But, as a former and current "MySQL Entomologist" (somebody who worked on processing MySQL bug reports from community and reported MySQL bugs for 13 years), I clearly see problems with the way Oracle handles MySQL server development. I write and speak about these problems in public since the end of 2012 or so, and would like to summarize them in this post.
MySQL's future is bright, but there are some clouds


Here is the list of problems I see:
  1. Oracle does not develop MySQL server in a true open source way.
  2. Oracle does not care enough to maintain public bugs database properly.
  3. Some older MySQL features remain half-backed, not well tested, not properly integrated with each other and new features, and not documented properly, for years.
    In general, Oracle's focus seem to be more on new developments and cool features for MySQL (with some of them got ignored and going nowhere with time).
  4. Oracle's internal QA efforts still seem to be somewhat limited.
    We get regression bugs, ASAN failures, debug assertions, crashes, test failures etc in the official releases, and Oracle MySQL still relies a lot on QA by MySQL Community (while not highlighting this fact that much in public).
  5. MySQL Manual still have many details missing and is not fixed fast enough.
    Moreover, it is not open source, so there is no other way for community to fix or improve it other than add comments or report documentation bugs, and wait.
In the upcoming weeks I am going to explain each of these items in a separate post, with some links to my older blog posts, MySQL server bug reports and other sources that should illustrate my points. In the meantime I am open for comments from those who disagree with the theses presented above.