Sunday, June 25, 2017

My First Steps with MariaDB ColumnStore

This is a "Howto" kind of post, and some speculations and historical references aside, it will show you how to build MariaDB ColumnStore (current version 1.0.9) from GitHub source, how to install and configure it for basic usage, as well as how to resolve some minor problems you may get in the process. Keep reading and eventually you'll get the real Howto below :)

* * *

I try not to care about any software issues besides good old MySQL, InnoDB storage engine internals, query optimization and some MyRocks or a little bit of Galera clusters. But as I work in MariaDB Support it is not possible to ignore other things production DBAs use and care about these days. So, recently I noted increased number of issues related to MariaDB ColumnStore and had to step in to resolve some of them. To do this I had first to make sure this software is "supportable by me", that is, I can build it from source on my own hardware (so that I can apply patches if needed and is not dependent of any packaging problems), install, start and stop, do some basic things with it and know what to do to troubleshoot non-trivial problems.

Actually I tried to check if it is "supportable by me" long time ago. First during the company meeting in Berlin back in April 2016 I was presented with some basic installation and usage steps, then a year ago I tried to build one of the very first public releases of ColumnStore, and this ended up with MCOL-142  that was later fixed by the team. Starting from version 1.0.6 and my successful build of MariaDB ColumnStore based on GitHub instructions I was sure that I can supported it if needed.

This week I found myself in a situation that I do need to install MariaDB ColumnStore to help customer, but I had only my Ubuntu 14.04 netbook at hand. Based on my records, I had few minor problems building the engine itself there, as of version 1.0.7:

...
-- Performing Test INLINE - Failed
-- Could NOT find LibXml2 (missing:  LIBXML2_LIBRARIES LIBXML2_INCLUDE_DIR)
CMake Error at CMakeLists.txt:91 (MESSAGE):
  Could not find a usable libxml2 development environment!


-- Configuring incomplete, errors occurred!
See also "/home/openxs/git/mariadb-columnstore-engine/CMakeFiles/CMakeOutput.log".
See also "/home/openxs/git/mariadb-columnstore-engine/CMakeFiles/CMakeError.log".

sudo apt-get install build-essential automake libboost-all-dev bison cmake libncurses5-dev libreadline-dev libperl-dev libssl-dev libxml2-dev libkrb5-dev flex
...
Setting up libboost-wave-dev:amd64 (1.54.0.1ubuntu1) ...
Setting up libboost-all-dev (1.54.0.1ubuntu1) ...
Setting up libperl-dev (5.18.2-2ubuntu1.1) ...
Setting up libhwloc-plugins (1.8-1ubuntu1.14.04.1) ...
Processing triggers for libc-bin (2.19-0ubuntu6.9) ...
openxs@ao756:~/git/mariadb-columnstore-engine$
openxs@ao756:~/git/mariadb-columnstore-engine$ cmake . -DSERVER_BUILD_INCLUDE_DIR=../mariadb-columnstore-server/include -DSERVER_SOURCE_ROOT_DIR=../mariadb-columnstore-server
-- Running cmake version 2.8.12.2
-- MariaDB-Columnstore 1.0.7
-- Found LibXml2: /usr/lib/x86_64-linux-gnu/libxml2.so (found version "2.9.1")
-- Boost version: 1.54.0
-- Found the following Boost libraries:
--   system
--   filesystem
--   thread
--   regex
--   date_time
SERVER_BUILD_INCLUDE_DIR = /home/openxs/git/mariadb-columnstore-engine/../mariadb-columnstore-server/include
SERVER_SOURCE_ROOT_DIR = /home/openxs/git/mariadb-columnstore-engine/../mariadb-columnstore-server
-- Configuring done
-- Generating done
-- Build files have been written to: /home/openxs/git/mariadb-columnstore-engine
openxs@ao756:~/git/mariadb-columnstore-engine$

...
[100%] Built target cpimport
[100%] Building CXX object writeengine/server/CMakeFiles/WriteEngineServer.dir/we_getfilesizes.cpp.o
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp: In static member function ‘static int WriteEngine::WE_GetFileSizes::processFileName(messageqcpp::ByteStream&, std::string&, int)’:
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp:275:19: warning: ‘fileSize’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     bs << fileSize;
                   ^
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp:276:29: warning: ‘compressedFileSize’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     bs << compressedFileSize;
                             ^
Linking CXX executable WriteEngineServer
[100%] Built target WriteEngineServer
openxs@ao756:~/git/mariadb-columnstore-engine$ echo $?
0
openxs@ao756:~/git/mariadb-columnstore-engine$ make install
...
[100%] Built target cpimport
Install the project...
-- Install configuration: "RELWITHDEBINFO"
CMake Error at cmake_install.cmake:44 (FILE):
  file cannot create directory: /usr/local/mariadb/columnstore.  Maybe need
  administrative privileges.


make: *** [install] Error 1
So, first I had some missing build dependencies, had to read the manual and install them. Then both server and engine, as of version 1.0.7, were built successfully, but I had a problem during installation. The reason is clear, ColumnStore used to insist on installing into the /usr/local/mariadb/columnstore only, and I insisted on getting it elsewhere (as I usually do for my source code builds, and failed, so had to give up and use sudo). I got it and moved on at that times. If you care, proper steps for installing into any directory as a non-user are described in KB now.

So, based on the previous attempts and my success on Fedora 23, I decided to get recent 1.0.9 version from GitHub and build it on this Ubuntu 14.04 netbook. I followed current  GitHub instructions and few of my older notes (hence some extra options and not just cmake ., but seems these are truly optional and everything works as described):
...
openxs@ao756:~/git/mariadb-columnstore-server$ cd ../mariadb-columnstore-engine/openxs@ao756:~/git/mariadb-columnstore-engine$ cmake . -DSERVER_BUILD_INCLUDE_DIR=../mariadb-columnstore-server/include -DSERVER_SOURCE_ROOT_DIR=../mariadb-columnstore-server
-- Running cmake version 2.8.12.2
-- MariaDB-Columnstore 1.0.9
-- Boost version: 1.54.0
-- Found the following Boost libraries:
--   system
--   filesystem
--   thread
--   regex
--   date_time
SERVER_BUILD_INCLUDE_DIR = /home/openxs/git/mariadb-columnstore-engine/../mariadb-columnstore-server/include
SERVER_SOURCE_ROOT_DIR = /home/openxs/git/mariadb-columnstore-engine/../mariadb-columnstore-server
-- Configuring done
-- Generating done
-- Build files have been written to: /home/openxs/git/mariadb-columnstore-engine
openxs@ao756:~/git/mariadb-columnstore-engine$ time make -j 2
...
[100%] Built target cpimport
[100%] Building CXX object writeengine/server/CMakeFiles/WriteEngineServer.dir/we_getfilesizes.cpp.o
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp: In static member function -?static int WriteEngine::WE_GetFileSizes::processFileName(messageqcpp::ByteStream&, std::string&, int)Б-?:
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp:275:19: warning: Б-?fileSizeБ-? may be used uninitialized in this function [-Wmaybe-uninitialized]
     bs << fileSize;
                   ^
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp:276:29: warning: Б-?compressedFileSizeБ-? may be used uninitialized in this function [-Wmaybe-uninitialized]
     bs << compressedFileSize;
                             ^
Linking CXX executable WriteEngineServer
[100%] Built target WriteEngineServer

real    43m48.955s
user    80m37.430s
sys     4m17.819s

openxs@ao756:~/git/mariadb-columnstore-engine$ sudo make install
...
-- Installing: /usr/local/mariadb/columnstore/bin/cpimport
-- Set runtime path of "/usr/local/mariadb/columnstore/bin/cpimport" to "/usr/local/mariadb/columnstore/lib"

openxs@ao756:~/git/mariadb-columnstore-engine$ cd /usr/local/mariadb/columnstore/bin/
Next step is to run post-install (as root or via sudo in my case):
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./post-install
...
There is literally nothing interesting in the output and everything should just work. Now time to configure settings for MariaDB ColumnStore and run postConfigure script (again via sudo):
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./postConfigure

This is the MariaDB ColumnStore System Configuration and Installation tool.
It will Configure the MariaDB ColumnStore System and will perform a Package
Installation of all of the Servers within the System that is being configured.

IMPORTANT: This tool should only be run on the Parent OAM Module
           which is a Performance Module, preferred Module #1

Prompting instructions:

        Press 'enter' to accept a value in (), if available or
        Enter one of the options within [], if available, or
        Enter a new value


===== Setup System Server Type Configuration =====

There are 2 options when configuring the System Server Type: single and multi

  'single'  - Single-Server install is used when there will only be 1 server configured
              on the system. It can also be used for production systems, if the plan is
              to stay single-server.

  'multi'   - Multi-Server install is used when you want to configure multiple servers now or
              in the future. With Multi-Server install, you can still configure just 1 server
              now and add on addition servers/modules in the future.

Select the type of System Server install [1=single, 2=multi] (2) > 1

Performing the Single Server Install.
Enter System Name (columnstore-1) >

===== Setup Storage Configuration =====


----- Setup Performance Module DBRoot Data Storage Mount Configuration -----

There are 2 options when configuring the storage: internal or external

  'internal' -    This is specified when a local disk is used for the DBRoot storage.
                  High Availability Server Failover is not Supported in this mode

  'external' -    This is specified when the DBRoot directories are mounted.
                  High Availability Server Failover is Supported in this mode.

Select the type of Data Storage [1=internal, 2=external] (1) >


Enter the list (Nx,Ny,Nz) or range (Nx-Nz) of DBRoot IDs assigned to module 'pm1' (1) >
The MariaDB ColumnStore port of '3306' is already in-use on local server
Enter a different port number > 3307


===== Performing Configuration Setup and MariaDB ColumnStore Startup =====

NOTE: Setting 'NumBlocksPct' to 50%
      Setting 'TotalUmMemory' to 25% of total memory.

Running the MariaDB ColumnStore setup scripts

post-mysqld-install Successfully Completed
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock' (2 "No such file or directory")
Error running post-mysql-install, /tmp/post-mysql-install.log
Exiting...
openxs@ao756:/usr/local/mariadb/columnstore/bin$
The process is quite clear and I installed a simple test configuration on a single server, with internal storage used (you may find interesting details on how data are stored in KB articles and in my previous post), single Performance Module (PM) and single DBRoot. The process and details are well described here. It was noted that I have another running instance listening to port 3306 (it is Percona Server that I have on this netbook since my days working for Percona). I've picked up port 3307 for MariaDB instance. Now, it seems post-mysqld-install part was executed successfully and then after waiting for some time there was some failure while connecting to MySQL. I decided that it does not matter (I had some failures on Fedora 23, but eventually it worked, and I was in hurry and proceeded to the KB article on mscadmin and basic usage. I've noted that ColumnStore is NOT started and tried to start it:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./mcsadmin startsystem
startsystem   Sat Jun 24 20:24:54 2017
startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules


   System being started, please wait...........
**** startSystem Failed : check log files
openxs@ao756:/usr/local/mariadb/columnstore/bin$
So, I've got my first real trouble with MariaDB ColumnStore and I have no other option than troubleshoot it. Let's check the content of the error log of MariaDB server first:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ tail ../mysql/db/ao756.err
tail: cannot open Б-?../mysql/db/ao756.errБ-? for reading: Permission denied
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo tail ../mysql/db/ao756.err
2017-06-24 20:25:14 139795698476992 [Note] InnoDB: Highest supported file format is Barracuda.
2017-06-24 20:25:14 139795698476992 [Note] InnoDB: starting tracking changed pages from LSN 1616918
2017-06-24 20:25:14 139795698476992 [Note] InnoDB: 128 rollback segment(s) are active.
2017-06-24 20:25:14 139795698476992 [Note] InnoDB: Waiting for purge to start
2017-06-24 20:25:14 139795698476992 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.35-80.0 started; log sequence number 1616918
2017-06-24 20:25:14 139795698476992 [Note] Plugin 'FEEDBACK' is disabled.
2017-06-24 20:25:14 139795698476992 [ERROR] /usr/local/mariadb/columnstore/mysql//bin/mysqld: unknown variable 'gtid_mode=ON'
2017-06-24 20:25:14 139795698476992 [ERROR] Aborting

170624 20:25:17 mysqld_safe mysqld from pid file /usr/local/mariadb/columnstore/mysql/db/ao756.pid ended
So, the problem is in incompatible setting in some my.cnf file used by default (this setting was used for Percona Server I have installed from packages). After fixing the problem (I had to proceed so went for as dirty fix of commenting out settings I've found in /etc/my,.cnf) it was easy too start ColumnStore:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo vi /etc/my.cnf
openxs@ao756:/usr/local/mariadb/columnstore/bin$ grep gtid /etc/my.cnf#gtid_mode=ON
#enforce_gtid_consistency
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./mcsadmin shutdownsystem

shutdownsystem   Sat Jun 24 20:42:21 2017

This command stops the processing of applications on all Modules within the MariaDB ColumnStore System

   Checking for active transactions
           Do you want to proceed: (y or n) [n]: y

   Stopping System...
   Successful stop of System

   Shutting Down System...
   Successful shutdown of System

openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./mcsadmin startsystem
startsystem   Sat Jun 24 20:43:15 2017
startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules


   System being started, please wait..................
   Successful start of System

openxs@ao756:/usr/local/mariadb/columnstore/bin$ ps aux | grep column
root      6993  0.0  0.0  15336  1484 pts/2    S    20:43   0:00 /bin/bash /usr/local/mariadb/columnstore/bin/run.sh /usr/local/mariadb/columnstore/bin/ProcMon
root      6994  7.1  0.4 1214156 15808 pts/2   Sl   20:43   0:05 /usr/local/mariadb/columnstore/bin/ProcMon
root      7162  0.0  0.0   4456   788 pts/2    S    20:43   0:00 /bin/sh /usr/local/mariadb/columnstore/mysql//bin/mysqld_safe --datadir=/usr/local/mariadb/columnstore/mysql/db --pid-file=/usr/local/mariadb/columnstore/mysql/db/ao756.pid --ledir=/usr/local/mariadb/columnstore/mysql//bin
mysql     7380  0.6  4.1 1170124 160416 pts/2  Sl   20:43   0:00 /usr/local/mariadb/columnstore/mysql//bin/mysqld --basedir=/usr/local/mariadb/columnstore/mysql/ --datadir=/usr/local/mariadb/columnstore/mysql/db --plugin-dir=/usr/local/mariadb/columnstore/mysql/lib/plugin --user=mysql --log-error=/usr/local/mariadb/columnstore/mysql/db/ao756.err --pid-file=/usr/local/mariadb/columnstore/mysql/db/ao756.pid --socket=/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock --port=3307
root      7454  0.0  0.2 506564 10564 pts/2    Sl   20:43   0:00 /usr/local/mariadb/columnstore/bin/controllernode fg
root      7475  0.0  0.2 353284 10176 pts/2    Sl   20:43   0:00 /usr/local/mariadb/columnstore/bin/ServerMonitor
root      7502  0.0  0.4 177956 17984 pts/2    Sl   20:43   0:00 /usr/local/mariadb/columnstore/bin/workernode DBRM_Worker1 fg
openxs    7951  0.0  0.0  14652   960 pts/2    S+   20:44   0:00 grep --color=auto column
openxs@ao756:/usr/local/mariadb/columnstore/bin$
 So, it seems the system started as expected, a lot of processes are running, but when I tried to create my first ColumnStore table I've got a message that the storage engine in not supported! Same for InfiniDB.

I'll skip some troubleshooting steps and ideas I had, and just state that:

You should make sure none of MySQL configuration files that are read by default should have any incompatible settings at the moment you run postConfigure script.

You can surely re-run postConfigure when everything is fixed, and this is a proper output for completed, correct attempt:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./postConfigure

This is the MariaDB ColumnStore System Configuration and Installation tool.
It will Configure the MariaDB ColumnStore System and will perform a Package
Installation of all of the Servers within the System that is being configured.

IMPORTANT: This tool should only be run on the Parent OAM Module
           which is a Performance Module, preferred Module #1

Prompting instructions:

        Press 'enter' to accept a value in (), if available or
        Enter one of the options within [], if available, or
        Enter a new value


===== Setup System Server Type Configuration =====

There are 2 options when configuring the System Server Type: single and multi

  'single'  - Single-Server install is used when there will only be 1 server configured
              on the system. It can also be used for production systems, if the plan is
              to stay single-server.

  'multi'   - Multi-Server install is used when you want to configure multiple servers now or
              in the future. With Multi-Server install, you can still configure just 1 server
              now and add on addition servers/modules in the future.

Select the type of System Server install [1=single, 2=multi] (1) >

Performing the Single Server Install.
Enter System Name (columnstore-1) >

===== Setup Storage Configuration =====


----- Setup Performance Module DBRoot Data Storage Mount Configuration -----

There are 2 options when configuring the storage: internal or external

  'internal' -    This is specified when a local disk is used for the DBRoot storage.
                  High Availability Server Failover is not Supported in this mode

  'external' -    This is specified when the DBRoot directories are mounted.
                  High Availability Server Failover is Supported in this mode.

Select the type of Data Storage [1=internal, 2=external] (1) >

Enter the list (Nx,Ny,Nz) or range (Nx-Nz) of DBRoot IDs assigned to module 'pm1' (1) >


===== Performing Configuration Setup and MariaDB ColumnStore Startup =====

NOTE: Setting 'NumBlocksPct' to 50%
      Setting 'TotalUmMemory' to 25% of total memory.

Running the MariaDB ColumnStore setup scripts

post-mysqld-install Successfully Completed
post-mysql-install Successfully Completed

Starting MariaDB Columnstore Database Platform

MariaDB ColumnStore Database Platform Starting, please wait ..... DONE

System Catalog Successfull Created

MariaDB ColumnStore Install Successfully Completed, System is Active

Enter the following command to define MariaDB ColumnStore Alias Commands

. /usr/local/mariadb/columnstore/bin/columnstoreAlias

Enter 'mcsmysql' to access the MariaDB ColumnStore SQL console
Enter 'mcsadmin' to access the MariaDB ColumnStore Admin console

openxs@ao756:/usr/local/mariadb/columnstore/bin$

Node new messages at the end. If you do NOT see any of them do not blindly assume that everything is working as expected. Also note that the tool remembereds previous chopices.

Now I've got access to the engine:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ ../mysql/bin/mysql -uroot test --host=127.0.0.1 --port=3307
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.23-MariaDB Columnstore 1.0.9-1
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [test]> show engines;
+--------------------+---------+--------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                          | Transactions | XA   | Savepoints |
+--------------------+---------+--------------------------------------------------------------------------------------------------+--------------+------+------------+
| Columnstore        | YES     | Columnstore storage engine                                                                       | YES          | NO   | NO         |
| CSV                | YES     | CSV storage engine                                                                               | NO           | NO   | NO         |
| MyISAM             | YES     | MyISAM storage engine                                                                            | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                        | NO           | NO   | NO         |
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                            | NO           | NO   | NO         |
| InfiniDB           | YES     | Columnstore storage engine (deprecated: use columnstore)                                         | YES          | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                   | YES          | NO   | YES        |
| Aria               | YES     | Crash-safe tables with MyISAM heritage                                                           | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Percona-XtraDB, Supports transactions, row-level locking, foreign keys and encryption for tables | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                               | NO           | NO   | NO         |
+--------------------+---------+--------------------------------------------------------------------------------------------------+--------------+------+------------+
10 rows in set (0.00 sec)

MariaDB [test]> create table taa(id int, c1 int, c2 date, c3 int) engine=columnstore;
Query OK, 0 rows affected (3.81 sec)

MariaDB [test]> insert into taa values(1,1,now(),1), (2,2,NULL,2),(3,NULL,NULL,3);
Query OK, 3 rows affected, 1 warning (1.15 sec)
Records: 3  Duplicates: 0  Warnings: 1

MariaDB [test]> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1265
Message: Data truncated for column 'c2' at row 1
1 row in set (0.00 sec)

MariaDB [test]> select * from taa;
+------+------+------------+------+
| id   | c1   | c2         | c3   |
+------+------+------------+------+
|    1 |    1 | 2017-06-24 |    1 |
|    2 |    2 | NULL       |    2 |
|    3 | NULL | NULL       |    3 |
+------+------+------------+------+
3 rows in set (0.15 sec)

MariaDB [test]> create table tab like taa;
ERROR 1178 (42000): The storage engine for the table doesn't support The syntax or the data type(s) is not supported by Columnstore. Please check the Columnstore syntax guide for supported syntax or data types.MariaDB [test]> create table tab engine=columnstore select * from taa;
ERROR 1178 (42000): The storage engine for the table doesn't support The syntax or the data type(s) is not supported by Columnstore. Please check the Columnstore syntax guide for supported syntax or data types.MariaDB [test]> create table tab(id int, c1 int, c2 date, c3 int) engine=columnstore;
Query OK, 0 rows affected (3.40 sec)

MariaDB [test]> alter table taa engine=InnoDB;
ERROR 1030 (HY000): Got error 1815 "Unknown error 1815" from storage engine Columnstore
MariaDB [test]>
Note some limitation we have in ColumnStore comparing to "normal" MariaDB it is based on. Some things do not work, some unique error messages and bugs are present, surely. You can read about current known limitations here.

My real goal was to check if NULL values are properly loaded by the cpimport tool, so I tried:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ ./cpimport -e 1 -s '\t' -n1 test tab
Locale is : C
Column delimiter : \t

Using table OID 3005 as the default JOB ID
Input file(s) will be read from : STDIN
Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T211331_S295214_Job_3005.xml
I/O error : Permission denied
I/O error : Permission denied
2017-06-24 21:13:31 (16045) ERR  :  file /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T211331_S295214_Job_3005.xml does not exist [1055]
Error in loading job information;  The File does not exist. ; cpimport.bin is terminating.
openxs@ao756:/usr/local/mariadb/columnstore/bin$

openxs@ao756:/usr/local/mariadb/columnstore/bin$ cat /tmp/taa.txt | sudo ./cpimport -e 1 -s '\t' -n1 test tab
Locale is : C
Column delimiter : \t

Using table OID 3005 as the default JOB ID
Input file(s) will be read from : STDIN
Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T211615_S997504_Job_3005.xml
Log file for this job: /usr/local/mariadb/columnstore/data/bulk/log/Job_3005.log
2017-06-24 21:16:16 (16582) INFO : successfully loaded job file /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T211615_S997504_Job_3005.xml
2017-06-24 21:16:16 (16582) INFO : Job file loaded, run time for this step : 0.0409038 seconds
2017-06-24 21:16:16 (16582) INFO : PreProcessing check starts
2017-06-24 21:16:16 (16582) INFO : PreProcessing check completed
2017-06-24 21:16:16 (16582) INFO : preProcess completed, run time for this step : 0.209377 seconds
2017-06-24 21:16:16 (16582) INFO : No of Read Threads Spawned = 1
2017-06-24 21:16:16 (16582) INFO : No of Parse Threads Spawned = 3
2017-06-24 21:16:16 (16582) INFO : Reading input from STDIN to import into table test.tab...
2017-06-24 21:16:16 (16582) INFO : For table test.tab: 4 rows processed and 4 rows inserted.
2017-06-24 21:16:16 (16582) WARN : Column test.tab.c2; Number of invalid dates replaced with zero value : 1
2017-06-24 21:16:17 (16582) INFO : Bulk load completed, total run time : 1.28894 seconds

openxs@ao756:/usr/local/mariadb/columnstore/bin$
The problem is I have the error for a date column, and the error is caused by the first line in the file with column titles, as produced by mysql command line client by default. One has to add -N option to get proper data loaded:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ ../mysql/bin/mysql -uroot test --host=127.0.0.1 --port=3307 -N -e"select * from taa" >/tmp/taa.txt
openxs@ao756:/usr/local/mariadb/columnstore/bin$ cat /tmp/taa.txt
1 1 2017-06-24 1

2 2 NULL 2
3 NULL NULL 3
openxs@ao756:/usr/local/mariadb/columnstore/bin$ cat /tmp/taa.txt | sudo ./cpimport -e 1 -s '\t' -n1 test tab
Locale is : C
Column delimiter : \t

Using table OID 3005 as the default JOB ID
Input file(s) will be read from : STDIN
Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T212114_S554015_Job_3005.xml
Log file for this job: /usr/local/mariadb/columnstore/data/bulk/log/Job_3005.log
2017-06-24 21:21:14 (18107) INFO : successfully loaded job file /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T212114_S554015_Job_3005.xml
2017-06-24 21:21:14 (18107) INFO : Job file loaded, run time for this step : 0.0412772 seconds
2017-06-24 21:21:14 (18107) INFO : PreProcessing check starts
2017-06-24 21:21:14 (18107) INFO : PreProcessing check completed
2017-06-24 21:21:14 (18107) INFO : preProcess completed, run time for this step : 0.205652 seconds
2017-06-24 21:21:14 (18107) INFO : No of Read Threads Spawned = 1
2017-06-24 21:21:14 (18107) INFO : No of Parse Threads Spawned = 3
2017-06-24 21:21:14 (18107) INFO : Reading input from STDIN to import into table test.tab...
2017-06-24 21:21:15 (18107) INFO : For table test.tab: 3 rows processed and 3 rows inserted.
2017-06-24 21:21:15 (18107) INFO : Bulk load completed, total run time : 1.28646 seconds
openxs@ao756:/usr/local/mariadb/columnstore/bin$ ../mysql/bin/mysql -uroot test --host=127.0.0.1 --port=3307 -N -e"select * from tab"
+------+------+------------+------+
|    1 |    1 | 2017-06-24 |    1 |
|    2 |    2 | NULL       |    2 |
|    3 | NULL | NULL       |    3 |
+------+------+------------+------+
To sunmmarize:
  1. These days one can easliy build, install and configure current version of MariaDB ColumnStore from GitHub sources.
  2. As long as you are reding KB artciles and GitHub documentation carefully, there is no problem to install and configure even non-default environment, where other MySQL versions co-exist.
  3. Youi should make sure node of the configuration files that are read by default contains any incompatible setting (MySQL 5.7-specific ones may fit into that calregory, or anything about GTID-based replication in MySQL).
  4. cpimport is a cool tol for paralles/fast data loads into MariaDB ColumnStore, but it does assume that input is a CSV or tab-separated file, and it tries to load all rows.
  5. Read error messages carefully.
  6. Make sure you read this article if you want to use ColumnStore.

Sunday, June 18, 2017

MySQL Support Engineer's Chronicles, Issue #6

Previous post in series was published almost 4 months ago, but I do not plan to end it. So, let me quickly discuss some of problems I worked on or was interested in so far in June, and provide some useful links.

Back on June 2 I had to find out what exact files are created by MariaDB's ColumnStore when I create a table in this storage engine. Actually in recent versions one can check the tables in the INFORMATION_SCHEMA, but if still wonders why are all these directories with numbers in the names (/usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/193.dir/000.dir/FILE000.cdf), please, check also this storage architecture review and take into account the following hint by the famous LinuxJedi:
partA: oid>>24
partB: (oid&0x00ff0000)>>16
partC: (oid&0x0000ff00) >> 8
partD: oid&0x000000ff
partE: partition
fName: segment
Those numbers in the directory names are decimal integers, 3 digits, zero-padded. oid is object_id.

If you plan to try to work with MariaDB ColumnStore, please, make sure to check also this KB article, MariaDB ColumnStore System Usage.

Later I had to work on some out-of-memory issues, and found this article on how to get the details on transparent huge pages usage on RedHat systems very useful.

No matter how much I hate clusters, I have to deal at least with Galera-based ones. One of the problems to deal with, recently, was about many threads "hanging" in "wsrep in pre-commit stage" status. In the process of investigation I had found this bug report, lp:1197771, and I am still wondering what is the proper status for it ("Fix Committed" to nothing does not impress me). Check also this discussion.

The most important discussions for MySQL Community, IMHO, happened around June 9 in relation to Bug #86607, "InnoDB crashed when master thread evict dict_table_t object". Great finding and detailed analysis by Alibaba's engineer, Jianwei Zhao, as well as his detailed MTR test case (that worked on a bit modified source code with a couple of lines like os_thread_sleep(2000000); added here and there), were all rejected by my deal old friend Sinisa... It caused a lot of reactions and comments on Facebook. Further references to magic "Oracle policies"  had not helped. What really helped is a real verification by Umesh Shastry later. As many colleagues noted, one had just to try, and test fails even on a non-modified code, quite often. It's quite possible that MariaDB will fix this bug faster, so you may want to monitor this related bug report, MDEV-13051. To summarize, it's not about any policies, but all about the proper care and attitude, and had always been like that...

Another great by to follow, Bug #86215, was reported back in May by Mark Callaghan. He spent a lot of time checking how well MySQL performs, historically, on different versions, for 4.0 to 8.0. Recent findings are summarized in this blog post. Based on the results, it seems that for most tests huge drop in performance happened in MySQL 5.7, and so far version 8.0.x had made few improvements on top of that. Check also this summary of tests and related blog posts. I wonder what would be the results of analysis for this bug and related posts by the great Dimitri Kravtchuk, whom the bug is assigned to currently.

Those who may have to work with MariaDB or MySQL on Windows and deal with crashes or hangs, may find this Micsoft's article useful. It explains how to get call stacks for threads with Visual Studio 2017. Thanks to Vladislav Vaintroub, for the link. I badly need new box with Windows 10 it seems, to check all this and other cool stuff, like shadow backups...

MariaDB's MaxScale 2.1 GA version had appeared recently, with a new cache filter feature. You can find the documentation here and there, and some comparison of it's initial implementation with ProxySQL's cache in this great blog post.

Yet another MySQL bug, reported by my colleague Hartmut Holzgraefe long time ago, attracted my attention while working on issues. This was Bug #76793, "Different row size limitations in Anaconda and Barracuda file formats". It was verified more than 2 years ago, but still nothing happens, neither in the code, nor in the manual to describe and explain the differences. Sad, but true.

The last but not the least, Support team of MariaDB is hiring now. We need one more engineer for our APAC team. If you want to work on complex and interesting issues from reasonable customers with properly set expectations, in a wisely managed team, and get a chance to work with me for 3-4 hours on a daily basis (yes, I am around/working during some APAC hours often enough, at least this summer) - please, get in touch and send your CV!