Saturday, August 18, 2018

On Fine MySQL Manual

Today I am going to provide some details on the last item in my list of problems with Oracle's way of MySQL server development, maintenance of MySQL Manual. I stated that:
"MySQL Manual still have many details missing and is not fixed fast enough.
Moreover, it is not open source...
"
Let me explain the above:
  1. MySQL Reference Manual is not open source. It used to be built from DocBook XML sources. Probably that's still the case. But you can not find the source code in open repositories (please, correct me if I am wrong, I tried to search...) That's because it is NOT open source. It says this clearly in Preface:
    "This documentation is NOT distributed under a GPL license. Use of this documentation is subject to the following terms:
    ...
    Any other use, such as any dissemination of printed copies or use of this documentation, in whole or in part, in another publication, requires the prior written consent from an authorized representative of Oracle. Oracle and/or its affiliates reserve any and all rights to this documentation not expressly granted above.
    "
    It was NOT Oracle who closed the source (as far as I remember, the manual was not GPL even in 2004, when I started to care about MySQL in general). But Oracle had a chance to change the license and set up some better contribution process for MySQL Community users, with many good writers among them. They decided not do this, so creators of forks and derived software have to duplicate efforts and rewrite everything themselves, and the only real way to help make the manual better is to report bugs.
  2. Quality of new documentation is not improved much. We, MySQL Community users, have to report bugs in the manual, as it still has some details missing or documented wrongly. Let me illustrate this with recent 10 or so documentation requests users made (I skipped reports for features I do not care about for now, like group replication):
    • Bug #91955 - "8.0 broke plugin API for C". According to the comment from Oracle developer in this bug reported by Laurynas Biveinis, writing plugins in C is not supported any more... But this change is NOT properly documented.
    • Bug #91781 - "Add version matrix in "Chapter 2 Keywords and Reserved Words". Good idea in the manual is not consistently implemented.
    • Bug #91743 - "performance_schema.data_locks not working correctly". Nice feature was added to Performance Schema (now we can check and study InnoDB locks with SQL!), but it is still not completely and properly documented.
    • Bug #91654 - "Handling an Unexpected Halt of a Replication Slave" Documentation is uncertain". This is also an example of improper bug handling - when I worked in Oracle newer bugs were usually declared duplicates of older ones. Here the opposite decision was made, even though both reports were from the same user, Uday Varagani, who explicitly asked to change the decision. Obviously, documentation requests do not get any extra care comparing to other kinds of bugs, quite to the contrary...
    • Bug #91648 - "Numeric literals must be a number between 0 and 9999". Surely ports with numbers larger than 9999 can be used.
    • Bug #91642 - "Document how protocol version 4.0 and 4.1 differ on how strings are terminated". As noted by Rene' Cannao', comments in the code are still sometimes more useful than documentation.
    • Bug #91549 - "MDL lock queue order seems to depend on table names". Make sure to read last comment in this nice request by Shane Bester. Dmitry Lenev provides some details on priorities of MDL locks in it. There are still cases when bugs and documentation requests document some details better than fine MySQL Manual!
    • Bug #90997 - "Handling an Unexpected Halt of a Replication Slave" manual page is wrong". Sveta Smirnova highlighted a totally misleading statement in the manual.
    • Bug #90935 - "Modify relay_log_recovery Documentation". Simple documentation request stays "Open" for 3 months. Definitely processing documentation requests is not a highest priority for Oracle engineers.
    • Bug #90680 - "dragnet logging: document how to use / find error_symbol codes". Even if request comes from a customer or otherwise well known bug reporter, like Simon Mudd, and it's related to some totally new cool feature of MySQL 8, it can wait for the fix for months...
    You can make your own conclusions from the above. But I do not see any good trends in the way new documentation is created or documentation requests are processed recently. Same problems as 4 years ago (see more on that in a side note below).
  3. Older documentation requests get even less attention than recent ones sometimes, even though they may highlight problems with software itself, not the MySQL Manual. Let me illustrate this with a few bugs I reported:
    • Bug #83640 - "Locks set by DELETE statement on already deleted record". I explicitly asked to
      "...document in the manual what locks this kind of DELETE sets when it encountered a record already marked as deleted, and why"
      This still had NOT happened. Moreover, eventually both MySQL and MariaDB engineers decided that current implementation of locking for this case is wrong and can be improved, so this report ended up as InnoDB bug. Check related  MDEV-11215 for more details.
    • Bug #73413 - "Manual does not explain MTS implementation in details". This is one of my favorite documentation requests. I've got a suggestion to explain what I want to see documented, in details. I tried my best, but if you want to get more details, read this blog.
    • Bug #72368 - "Empty/zero statistics for imported tablespace until explicit ANALYZE TABLE". This is a bug in the way persistent statistics (the feature I truly hate) in InnoDB is re-estimated "automatically". But until the bug is fixed, I asked to document current implementation (and problems with it). So, where current implementation is properly documented? If only in the bug reports...
    • Bug #71294 - "Manual page for P_S.FILE_INSTANCES table does not explain '~' in FILE_NAME".  They pretend they do not get the point:
      "Per Marc Alff, the Performance Schema is simply reporting what it gets from runtime. If runtime is reporting nonexistent filenames, that's a server issue.

      Recategorizing this as a server bug.
      "
    • Bug #71293 - "Manual page for P_S.FILE_INSTANCES table does not explain EVENT_NAME values". Now they are waiting for the new DOCUMENTATION column in the setup_instruments table to be filled in with something by server developers... The code is the documentation? OK, bus as we know from the experience (see slides 44-46 here) chances to get a bug in Performance Schema fixed fast are even less than to see it properly and completely documented...
There are more problems with MySQL documentation (not only reference manual), but at the moment I consider 3 highlighted and illustrated above the most serious.

Regent's Canal is nice. If I only knew how to operate the locks there... MySQL Manual also misses some information about locks.
As a side note, it's not the first time I write about MySQL Manual. You can find some useful details in the following posts:
  • "What's wrong with MySQL Manual". In 2014, after spending few weeks reporting up to 5 documentation bugs per day, I thought that, essentially, there is nothing much wrong with it - it's fine, well indexed by Google and has meaningful human-readable URLs. Few problems listed were lack of careful readers (I tried my best to fix that), limited attention to old documentation requests, some pages with not so much useful content and lack of "How to" documents. The later I also tried to fix to some small extent in this blog, see howto tagged posts. The real fix came mostly from Percona's blog, though.
  • I have a special "missing manual" tag for blog posts that mention at least one bug in the manual.
  • I tried to use "missing_manual" tag consistently for my own documentation requests. Last year I shared a detailed enough review of the bugs with that tag that were still active.
As a yet another side note, I tried once to create a kind of community driven "missing manual" project, but failed. Writing manual from scratch is a (hard) full time job, while my full time job is providing support to users of MySQL, MariaDB and Percona software...

That said, I also wanted to join MySQL's documentation team in the past, but this was not possible at least because I am not a native speaker. If anything changed in this regard, I am still open to a job offer of this kind. My conditions for an acceptable offer from Oracle are known to all interested parties and they include (but are not limited to) at least 4 times the salary I had before I quit (it was almost 6 years ago) and working for Oracle's office in Paris (because in Oracle office you ere employed by formally matters a lot for your happiness and success as employee).

In case of no offer in upcoming week, I'll just continue to write my annoying but hopefully somewhat useful blog posts (until MySQL Reference Manual becomes true open source project ;)

8 comments:

  1. Another example of documentation bug that is verified for a long time: Bug#85977 - he doc. of slave-parallel-type=LOGICAL_CLOCK wrongly reference Group Commit ([1]).

    [1]: https://bugs.mysql.com/bug.php?id=85977

    This is probably somehow related to your Bug#73413 Valeriy.

    ReplyDelete
  2. And another: Bug#75393 - Clarify the way the table mysql.gtid_executed is updated ([1]). But this one is still only opened.

    [1]: https://bugs.mysql.com/bug.php?id=75393

    ReplyDelete
  3. Another: Bug#75394 - Improve documentation of binlog_order_commits ([1]).

    [1]: https://bugs.mysql.com/bug.php?id=75394

    ReplyDelete
  4. Even though we have many example where documentation bugs are not treated in a satisfactory way, I want to point-out my recent very satisfactory interaction about a documentation bug: Bug#92012 - Wrong default for thread_cache_size in "How MySQL Uses Threads for Client Cnx".([1]). This documentation bug was verified in less than 18 hours and fixed in less than 30 hours. Great job by Umesh Shastry and Paul Dubois.

    [1]: https://bugs.mysql.com/bug.php?id=92012

    Another older documentation bug where everything went well is the following: Bug#73038 - Better document (or fix) relay_log_recovery ([2]).

    [2]: https://bugs.mysql.com/bug.php?id=73038

    ReplyDelete
    Replies
    1. Documentation bugs are usually verified really fast, and Umesh just rocks in his role, no doubts.

      Delete
    2. But proper and fast fixing of some bugs does not make cases like https://bugs.mysql.com/bug.php?id=68097 better.

      Delete
  5. While I tend to agree with the points made above and a couple of other recent posts you've made I wonder if the way you are getting your message across well have the desired affect. Oracle is a big, slow moving secret animal that doesn't like being poked. Or so it seems. I wonder ifif how should not be trying to entice them in the direction we want them to go by asking them to describe what they can about their processes and thinking, point out the issues we see and the direction we'd like to see things go and ask how as an individual or the community we can help to achieve these 'better' objectives. Forcing them in the direction we want won't work. However, one to one conversions have lead me to believe that they are aware of these issues, so we just need to see how we can help them internally move to make things happen. This may be a slow process but I'd guess that if pursued by many of us in the community it may take us where we want to go.

    ReplyDelete
    Replies
    1. I know from my own experience (recall 5.6 GA early days in 2013) that regular poking about mishandled bugs used to work and help, immediately. It lasted for a long time and made MySQL 5.6 and newer versions much better.

      This approach may not work that well any more (maybe because I was forced to stop applying it in 2013 and they got relaxed after that), but it worked. There are plenty of people in MySQL Community who tried and are still trying other approaches, and I wish them good luck. But I'll continue to highlight problems with bugs processing in public whenever I notice them.

      Delete