Sunday, April 24, 2016

Building MariaDB 10.1.x and Galera from Source for Multiple Node Cluster Testing Setup

My Facebook followers probably noticed that I left Percona some time ago and have been working for MariaDB since March 1, 2016. I changed the company, but neither my job role (I am still a Support Engineer) nor my approach to doing the job. I still prefer to test everything I suggest to customers, and for these tests I usually use software I build from source myself.

While I have tried to avoid all kinds of clusters as much as possible for 15 years or so (it does not matter whether it's Oracle RAC, MySQL Cluster or Percona XtraDB Cluster), it's really hard to avoid Galera clusters while working for MariaDB. One of the reasons is that, starting from MariaDB 10.1, Galera can easily be "enabled"/used with any MariaDB 10.1.x instance at any time (at least when we speak about official binaries, or binaries properly built from source - they are all "Galera ready"). Most MariaDB customers either use Galera already or can try it at any time, so I have to be ready to test something Galera-specific at any moment.

For simple cases I decided to use a setup with several cluster nodes (2 to begin with) on one box. This approach is described in the manual for Percona XtraDB Cluster and was also used by my former colleague Fernando Laudares for his blog post and many real-life tests.

So, I decided to proceed with a mix of ideas from the sources above and MariaDB's KB article on building Galera from source. As I decided to do this on my wife's Fedora 23 workstation, I also checked this KB article for some details. It lists the prerequisites (boost-devel, check-devel, glibc-devel, openssl-devel, scons), and some of these packages (like scons, in one of my cases) may be missing even on a system previously used to build all kinds of MySQL-related software. You can always discover something missing and fix the problem at a later stage, but reading and following the manual or KB articles may save time otherwise spent on trial and error.
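On Fedora 23 the prerequisites listed in the KB article can be installed in one go, roughly like this (a sketch of the obvious command, in case it saves somebody a minute):

[openxs@fc23 ~]$ sudo dnf install boost-devel check-devel glibc-devel openssl-devel scons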

I started by creating directories in my home directory (/home/openxs) for this Galera-related testing setup, like these:
[openxs@fc23 ~]$ mkdir galera
[openxs@fc23 ~]$ cd galera
[openxs@fc23 galera]$ mkdir node1
[openxs@fc23 galera]$ mkdir node2
[openxs@fc23 galera]$ mkdir node3
[openxs@fc23 galera]$ ls
node1  node2  node3
I plan to use 3 nodes one day, but for this blog post I'll set up only 2, to have the smallest possible and simplest cluster as a proof of concept.

Then I proceeded with cloning Galera from Codership's GitHub (this is supposed to be the latest and greatest). I changed the current directory to my usual git repository and executed git clone https://github.com/codership/galera.git. When this command completed, I had a subdirectory named galera.
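For reference, the clone step looked roughly like this (a sketch; depending on the state of the repository, the wsrep API headers may live in a git submodule that has to be pulled in separately):

[openxs@fc23 ~]$ cd ~/git
[openxs@fc23 git]$ git clone https://github.com/codership/galera.git
[openxs@fc23 git]$ cd galera
[openxs@fc23 galera]$ git submodule update --init    # only needed if the checkout uses submodules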

In that directory, assuming that all prerequisites are installed, it's enough to execute one simple script, ./scripts/build.sh, to build the current Galera library version. I ended up with the following:
[openxs@fc23 galera]$ ls -l libgalera_smm.so
-rwxrwxr-x. 1 openxs openxs 40204824 Mar 31 12:21 libgalera_smm.so
[openxs@fc23 galera]$ file libgalera_smm.so
libgalera_smm.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=11457fa9fd69dabe617708c0dd288b218255a886, not stripped

[openxs@fc23 galera]$ pwd
/home/openxs/git/galera
[openxs@fc23 galera]$ cp libgalera_smm.so ~/galera/
As shown above, I copied the library to the target directory for my testing setup (a directory that should NOT conflict with whatever software I may later install from packages).

Now, time to build MariaDB properly to let it use Galera if needed. I already had the then-current 10.1.13 in the server subdirectory of my git repository.
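If you do not have the source tree yet, it can be obtained roughly like this (a sketch; the branch or tag you need may differ):

[openxs@fc23 git]$ git clone https://github.com/MariaDB/server.git
[openxs@fc23 git]$ cd server
[openxs@fc23 server]$ git checkout 10.1    # or a specific tag like mariadb-10.1.13

With the source tree in place, I executed the following commands: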

[openxs@fc23 server]$ cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SSL=system -DWITH_ZLIB=bundled -DMYSQL_MAINTAINER_MODE=0 -DENABLED_LOCAL_INFILE=1 -DWITH_JEMALLOC=system -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/maria10.1
-- Running cmake version 3.4.1
-- MariaDB 10.1.13
...

[openxs@fc23 server]$ time make -j 4
...
real    9m28.164s
user    32m43.960s
sys     2m45.637s
This was my usual command line to build MariaDB 10.x, with only 2 extra options added: -DWITH_WSREP=ON and -DWITH_INNODB_DISALLOW_WRITES=ON. After make completed, I executed make install && make clean and was ready to use my shiny new Galera-ready MariaDB 10.1.13.
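If you want to double-check that the resulting binaries are really wsrep-enabled before going any further, grepping the server's list of options should work (just one possible quick sanity check):

[openxs@fc23 server]$ /home/openxs/dbs/maria10.1/bin/mysqld --verbose --help 2>/dev/null | grep -c wsrep

Any non-zero count means the wsrep options are compiled in.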

To take into account the directories I am going to use for my cluster nodes, and to make sure they can start and communicate as separate mysqld instances, I had to create configuration files for them. I changed the working directory to /home/openxs/dbs/maria10.1 and started with this configuration file for the first node:

[openxs@fc23 maria10.1]$ cat /home/openxs/galera/mynode1.cnf                    
[mysqld]
datadir=/home/openxs/galera/node1
port=3306
socket=/tmp/mysql-node1.sock
pid-file=/tmp/mysql-node1.pid
log-error=/tmp/mysql-node1.err
binlog_format=ROW
innodb_autoinc_lock_mode=2

wsrep_on=ON # this is important for 10.1!
wsrep_provider=/home/openxs/galera/libgalera_smm.so
wsrep_cluster_name = singlebox
wsrep_node_name = node1
# wsrep_cluster_address=gcomm://
wsrep_cluster_address=gcomm://127.0.0.1:4567,127.0.0.1:5020?pc.wait_prim=no
It's one of the shortest configuration files possible. I had to specify a unique datadir, error log location, pid file, port and socket for the instance, set the binlog format, point out the Galera library location, and set the cluster name and node name. With proper planning I could have specified a wsrep_cluster_address referring to all the other nodes right away, but for the initial setup of the first node it can be left "empty", as in the commented-out line above, so that the instance starts as the first node of a new cluster. There is one essential setting for MariaDB 10.1.x that is not needed for "cluster-specific" builds like Percona XtraDB Cluster or the older 10.0.x Galera packages from MariaDB (where it is ON by default): wsrep_on=ON. Without it MariaDB works as a normal, non-cluster instance and ignores anything cluster-related. You can save a lot of time in case of an upgrade to 10.1.x if you put it into your configuration file explicitly right now, no matter what version you currently use.
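Once a node is started (see below), a quick way to verify that it really has Galera enabled is something like this (a sketch; the socket path is the one from the configuration file above):

[openxs@fc23 maria10.1]$ bin/mysql -uroot --socket=/tmp/mysql-node1.sock -e "show variables like 'wsrep_on'; show status like 'wsrep_ready'"

If wsrep_on comes back as OFF, the instance runs as a plain, non-cluster server no matter what other wsrep settings are present.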

Then I copied and modified configuration file for the second node:
[openxs@fc23 maria10.1]$ cp /home/openxs/galera/mynode1.cnf /home/openxs/galera/mynode2.cnf
[openxs@fc23 maria10.1]$ vi /home/openxs/galera/mynode2.cnf                     

[openxs@fc23 maria10.1]$ cat /home/openxs/galera/mynode2.cnf                   
[mysqld]
datadir=/home/openxs/galera/node2
port=3307
socket=/tmp/mysql-node2.sock
pid-file=/tmp/mysql-node2.pid
log-error=/tmp/mysql-node2.err
binlog_format=ROW
innodb_autoinc_lock_mode=2

wsrep_on=ON # this is important for 10.1!
wsrep_provider=/home/openxs/galera/libgalera_smm.so
wsrep_cluster_name = singlebox
wsrep_node_name = node2
wsrep_cluster_address=gcomm://127.0.0.1:4567,127.0.0.1:5020?pc.wait_prim=no
wsrep_provider_options = "base_port=5020;"
Note that while a Galera node uses 4 ports, I explicitly specified only 2 unique ones: the port for MySQL clients, and the base port for Galera-related communication (like IST and SST), via the base_port setting. Note also how I referred to all cluster nodes in wsrep_cluster_address - the same value can actually be used in the configuration file of the first node as well. We can just start it as the first node of a new cluster (see below).
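Once both nodes are up, you can check which ports they actually listen on with something like this (a sketch; the exact list depends on the defaults and on the base_port setting):

[openxs@fc23 maria10.1]$ ss -tlnp | grep mysqld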

Now we have configuration files for 2 nodes ready (we can always add node3 later in the same way). But before starting the new cluster, we have to install the system databases. For node1 this was performed in the following way:
[openxs@fc23 maria10.1]$ scripts/mysql_install_db --defaults-file=/home/openxs/galera/mynode1.cnf
Installing MariaDB/MySQL system tables in '/home/openxs/galera/node1' ...
2016-03-31 12:51:34 139766046820480 [Note] ./bin/mysqld (mysqld 10.1.13-MariaDB) starting as process 28297 ...
...

[openxs@fc23 maria10.1]$ ls -l /home/openxs/galera/node1
-rw-rw----. 1 openxs openxs    16384 Mar 31 12:51 aria_log.00000001
-rw-rw----. 1 openxs openxs       52 Mar 31 12:51 aria_log_control
-rw-rw----. 1 openxs openxs 12582912 Mar 31 12:51 ibdata1
-rw-rw----. 1 openxs openxs 50331648 Mar 31 12:51 ib_logfile0
-rw-rw----. 1 openxs openxs 50331648 Mar 31 12:51 ib_logfile1
drwx------. 2 openxs openxs     4096 Mar 31 12:51 mysql
drwx------. 2 openxs openxs     4096 Mar 31 12:51 performance_schema
drwx------. 2 openxs openxs     4096 Mar 31 12:51 test
Then I started node1 as a new cluster:
[openxs@fc23 maria10.1]$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode1.cnf --wsrep-new-cluster &
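With node1 up, I created a small test table with a couple of rows in the test database. A minimal sketch that matches the data shown later in this post would be:

[openxs@fc23 maria10.1]$ bin/mysql -uroot --socket=/tmp/mysql-node1.sock test
MariaDB [test]> create table t1 (id int primary key, c1 int) engine=InnoDB;
MariaDB [test]> insert into t1 (id, c1) values (1, 1), (2, 2);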
After creating the table t1 with some data in it on node1, I repeated the installation of system tables for node2, just referencing the proper configuration file, and started node2, which was supposed to join the cluster:
[openxs@fc23 maria10.1]$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode2.cnf &
Let's check whether we really have both instances running and communicating in a Galera cluster:
[openxs@fc23 maria10.1]$ tail /tmp/mysql-node2.err                             
2016-03-31 13:40:29 139627414767744 [Note] WSREP: Signalling provider to continue.
2016-03-31 13:40:29 139627414767744 [Note] WSREP: SST received: c91d17b6-f72b-11e5-95de-96e95167f593:0
2016-03-31 13:40:29 139627117668096 [Note] WSREP: 1.0 (node2): State transfer from 0.0 (node1) complete.
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Shifting JOINER -> JOINED (TO: 0)
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Member 1.0 (node2) synced with group.
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2016-03-31 13:40:29 139627414452992 [Note] WSREP: Synchronized with group, ready for connections
2016-03-31 13:40:29 139627414452992 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-31 13:40:29 139627414767744 [Note] /home/openxs/dbs/maria10.1/bin/mysqld: ready for connections.
Version: '10.1.13-MariaDB'  socket: '/tmp/mysql-node2.sock'  port: 3307  Source distribution

[openxs@fc23 maria10.1]$ tail /tmp/mysql-node1.err
2016-03-31 13:40:27 140071390934784 [Note] WSREP: Provider resumed.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: 0.0 (node1): State transfer to 1.0 (node2) complete.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 0)
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Member 0.0 (node1) synced with group.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2016-03-31 13:40:27 140072429247232 [Note] WSREP: Synchronized with group, ready for connections
2016-03-31 13:40:27 140072429247232 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-31 13:40:27 140072141715200 [Note] WSREP: (c91c99ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
2016-03-31 13:40:29 140072133322496 [Note] WSREP: 1.0 (node2): State transfer from 0.0 (node1) complete.
2016-03-31 13:40:29 140072133322496 [Note] WSREP: Member 1.0 (node2) synced with group.
These are familiar messages (unfortunately...) that prove the second node joined and performed a state transfer from the first one. Now it's time to connect and test how the cluster works. This is what I had after node1 was started and the table with some data was created there, but before node2 was started:
[openxs@fc23 maria10.1]$ bin/mysql -uroot --socket=/tmp/mysql-node1.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.1.13-MariaDB Source distribution

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show variables like 'wsrep_cluster%';
+-----------------------+-------------------------------------------------------+
| Variable_name         | Value                                                 |
+-----------------------+-------------------------------------------------------+
| wsrep_cluster_address | gcomm://127.0.0.1:4567,127.0.0.1:5020?pc.wait_prim=no |
| wsrep_cluster_name    | singlebox                                             |
+-----------------------+-------------------------------------------------------+
2 rows in set (0.00 sec)

MariaDB [(none)]> show status like 'wsrep_cluster%';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 1                                    |
| wsrep_cluster_size       | 1                                    |
| wsrep_cluster_state_uuid | c91d17b6-f72b-11e5-95de-96e95167f593 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
4 rows in set (0.00 sec)

MariaDB [(none)]> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [test]> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)
Then, when node2 joined the cluster, I checked that the data added on node1 was there:

[openxs@fc23 maria10.1]$ bin/mysql -uroot --socket=/tmp/mysql-node2.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.13-MariaDB Source distribution

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [test]> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)

MariaDB [test]> show status like 'wsrep_cluster%';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 2                                    |
| wsrep_cluster_size       | 2                                    |
| wsrep_cluster_state_uuid | c91d17b6-f72b-11e5-95de-96e95167f593 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
4 rows in set (0.01 sec)
So, the first basic test of a Galera cluster of 2 nodes (both running on the same box), built from the current sources of Galera and MariaDB 10.1.x on Fedora 23, completed successfully. I plan to play with it more in the future, use current xtrabackup built from source for SST and so on, and create blog posts about these steps and any interesting tests in this setup. Stay tuned.

From the dates above you can conclude that it took me 3 weeks to publish this post. That's because I was busy with the company meeting in Berlin and the usual Support work, and was not sure whether it was really a good idea for me to write any post with the words "Galera" or "MariaDB" used in it even once...



5 comments:

  1. I'm getting this error when I try to run the script:


    root@tungsten-node-1:~/galera/galera# ./scripts/build.sh
    ~/galera/galera ~/galera/galera
    ~/galera/galera
    scons: Reading SConscript files ...
    Host: linux x86_64 64bit
    Signature: version: 3.15, revision: 3545
    Checking for C library pthread... yes
    Checking for C library rt... yes
    Checking for C header file sys/epoll.h... yes
    Checking for C header file byteswap.h... yes
    Checking for C header file endian.h... yes
    Checking for C header file execinfo.h... yes
    Checking for C++ header file boost/shared_ptr.hpp... no
    boost/shared_ptr.hpp not found or not usable

  2. Based on the error message, you are missing the boost headers. I have them on my Fedora 23 box:

    [openxs@fc23 build]$ sudo find / -name shared_ptr.hpp 2>/dev/null
    [sudo] password for openxs:
    /usr/include/boost/asio/detail/shared_ptr.hpp
    /usr/include/boost/smart_ptr/shared_ptr.hpp
    /usr/include/boost/thread/csbl/memory/shared_ptr.hpp
    /usr/include/boost/interprocess/smart_ptr/shared_ptr.hpp
    /usr/include/boost/serialization/shared_ptr.hpp
    /usr/include/boost/shared_ptr.hpp
    ...
    [openxs@fc23 build]$ rpm -qa | grep -i boost
    boost-program-options-1.58.0-11.fc23.x86_64
    boost-filesystem-1.58.0-11.fc23.x86_64
    boost-1.58.0-11.fc23.x86_64
    ...
    boost-devel-1.58.0-11.fc23.x86_64
    ...

    So, probably you just have to install boost-devel (and/or boost) package.

  3. Is that like

    yum install boost.

    I tried searching on Google but couldn't find it exactly. I'm using Ubuntu.

  4. On Ubuntu I had to execute the following:

    sudo apt-get install libboost-all-dev
    sudo apt-get install scons
    sudo apt-get install check

    before running ./scripts/build.sh. Galera is building there now (slow VM).

  5. Now that 10.1 has Galera included, should I expect to find the same SERIALIZABLE transaction bug (https://github.com/codership/galera/issues/336#issuecomment-138805890) that plagued other Codership bundled software? To note, this can be reproduced on a single node.
