In the process I had found out that two of my most often used Linux distributions, CentOS 6.x and Ubuntu 14.04, actually use Upstart, so good old System V's init tricks and scripts work there only partially and only because somebody cared to integrate them into this "new" approach to starting of tasks and services during boot, stopping them during shutdown and supervising them while the system is running. I had to do this years ago, but customer's question finally forced me to check and study some details on how this system actually works.
So, unfortunately, there is no script like mysqld_safe to start and restart MaxScale after installing official RPM from MariaDB (in this case it was maxscale-2.0.1-2.x86_64). My first idea was to write one, but then I asked myself why it is not there yet, decided to check what's already provided and take a closer look at /etc/init.d/maxscale I have on my CentOS 6.8 VM (the closest to what customer used). It's a typical classic shell script to start service and it starts MaxScale like this:
start() {Basically, it runs /usr/bin/maxscale --user=maxscale, the rest are details (like location of PID file the rest of the script then relies on). There is no advanced script to monitor status of this process with this PID (like mysqld_safe) or anything to care about automatic restarts. It is supposed that I just execute chkconfig maxscale on and then service starts when system enters proper runlevel.
echo -n $"Starting MaxScale: "
...
ulimit -HSn 65535
daemon --pidfile $MAXSCALE_PIDFILE /usr/bin/maxscale --user=maxscale $MAXSCALE_OPTIONS >& /dev/null
RETVAL=$?
[ $RETVAL -eq 0 ] && touch /var/lock/subsys/$servicename
...
# Return right code
if [ $RETVAL -ne 0 ]; then
failure
RETVAL=$_RETVAL_NOT_RUNNING
fi
echo
return $RETVAL
}
Simple test shows that when maxscale process is killed it's gone and is not restarted. At good old times I'd just add something like this:
mx:2345:respawn:/usr/bin/maxscale --user=maxscale
to /etc/inittab (as nice articles like this suggests), but quick check and then further reading proves that it's not going to work on CentOS 6.8, as it uses Upstart.
So, either I had to write something similar to mysqld_safe for MaxScale, or (having in mind size of code and number of bugs we had in the past in that script) I had better to find out what is the supposed way to respawn processes in Upstart. Basic ideas are, again, simple. One has to create /etc/init/service_name.conf file and put something like this there (real code quote from one of Ubuntu files for MySQL):
start on runlevel [2345] stop on starting rc RUNLEVEL=[016] respawn respawn limit 2 5 ... pre-start script ... end script exec /usr/sbin/mysqld post-start script ... end script
The file is easy to understand even without reading the manual. One has to set when service starts and stops, add respawn clause if we want to restart it in case of unexpected crashes or kills of the process, optionally limit the number of restarts and intervals between restarts etc, and, optionally, do something before and after start.
I quickly created /etc/init/maxscale.conf based on the above and it did a great job in starting it automatically upon system startup. I've just used exec /usr/bin/maxscale --user=maxscale basically and decided to deal with options and other details later if needed. But what was strange for the very beginning is that in /var/log/messages I've seen what looked like repeated attempts to start maxscale process, that failed:
Nov 26 17:44:45 centos maxscale[20229]: MaxScale started with 1 server threads.
Nov 26 17:44:45 centos init: maxscale main process ended, respawning
Nov 26 17:44:45 centos maxscale[20229]: Started MaxScale log flusher.
Nov 26 17:44:45 centos maxscale[20235]: Working directory: /var/log/maxscale
Nov 26 17:44:45 centos maxscale[20235]: MariaDB MaxScale 2.0.1 started
Nov 26 17:44:45 centos maxscale[20235]: MaxScale is running in process 20235
...
Nov 26 17:44:45 centos maxscale[20235]: Loaded module qc_sqlite: V1.0.0 from /usr/lib64/maxscale/libqc_sqlite.soMoreover, when proper maxscale process was killed, it was NOT respawned as expected.
Nov 26 17:44:45 centos maxscale[20235]: MaxScale is already running. Process id: 20229. Use another location for the PID file to run multiple instances of MaxScale on the same machine.
Nov 26 17:44:45 centos init: maxscale main process (20234) terminated with status 4
Nov 26 17:44:45 centos init: maxscale main process ended, respawning
It was just a proper time to read the manual more carefully, this part about expect stanza (that I noted in some of official Upstart scripts):
"To allow Upstart to determine the final process ID for a job, it needs to know how many times that process will call fork(2). Upstart itself cannot know the answer to this question since once a daemon is running, it could then fork a number of "worker" processes which could themselves fork any number of times. Upstart cannot be expected to know which PID is the "master" in this case, considering it does not know if worker processes will be created at all, let alone how many times, or how many times the process will fork initially. As such, it is necessary to tell Upstart which PID is the "master" or parent PID. This is achieved using the expect stanza.Let's check quickly how many times fork() is called in maxscale (I'd know it better if I ever cared to study the source code in details, but I had not checked most part of it yet). Test based on that cookbook gives unexpected results:
The syntax is simple, but you do need to know how many times your service forks."
[root@centos ~]# strace -o /tmp/strace.log -fFv maxscale &How comes we have 5 fork calls? Here they are:
[1] 2369
[root@centos ~]# sleep 10
[root@centos ~]# ps aux | grep strace
root 2369 2.6 0.0 4476 888 pts/0 S 20:28 0:00 strace -o /tmp/strace.log -fFv maxscale
root 2382 0.0 0.0 103312 868 pts/0 S+ 20:28 0:00 grep strace
[root@centos ~]# pkill -9 strace
[1]+ Killed strace -o /tmp/strace.log -fFv maxscale
[root@centos ~]# ps aux | grep maxscale
root 2375 1.3 0.2 276168 3896 ? Ssl 20:28 0:00 maxscale
root 2385 0.0 0.0 103312 868 pts/0 S+ 20:28 0:00 grep maxscale
[root@centos ~]# egrep "\<(fork|clone)\>\(" /tmp/strace.log | wc | awk '{print $1}'
5
[root@centos ~]# egrep "\<(fork|clone)\>\(" /tmp/strace.logIt seems process with PID 2374 actually forked just once, to produce PID 2375, and then several threads were started by that process. We can see them:
2374 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb024c08ab0) = 2375
2375 clone(child_stack=0x7fb01f819f10, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb01f81a9d0, tls=0x7fb01f81a700, child_tidptr=0x7fb01f81a9d0) = 2376
2375 clone(child_stack=0x7fb01e118f10, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb01e1199d0, tls=0x7fb01e119700, child_tidptr=0x7fb01e1199d0) = 2377
2375 clone(child_stack=0x7fb01d10af10, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb01d10b9d0, tls=0x7fb01d10b700, child_tidptr=0x7fb01d10b9d0) = 2378
2375 clone(child_stack=0x7fb017ffef10, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb017fff9d0, tls=0x7fb017fff700, child_tidptr=0x7fb017fff9d0) = 2379
[root@centos ~]# ps -T -p `pidof maxscale`So, it was really one fork() (and I could notify that even from studying /var/log/messages before) and I have to add expect fork stanza to my Upstart configuration file:
PID SPID TTY TIME CMD
2375 2375 ? 00:00:00 maxscale
2375 2376 ? 00:00:00 maxscale
2375 2377 ? 00:00:00 maxscale
2375 2378 ? 00:00:00 maxscale
2375 2379 ? 00:00:00 maxscale
[root@centos ~]#
[root@centos ~]# cat /etc/init/maxscale.confThis way it works as expected, as one may easily check:
# MaxScale service
description "MaxScale"
start on stopped rc RUNLEVEL=[2345]
stop on starting rc runlevel [!2345]
respawn
respawn limit 2 5
expect fork
exec /usr/bin/maxscale --user=maxscale
[root@centos ~]# initctl status maxscale
maxscale start/running, process 6600
[root@centos ~]# maxadmin
MaxScale> show servers
Server 0x19e47a0 (server1)
Server: 127.0.0.1
Status: Master, Running
Protocol: MySQLBackend
Port: 3306
Server Version: 5.7.15-9-log
Node Id: 1
Master Id: -1
Slave Ids:
Repl Depth: 0
Number of connections: 0
Current no. of conns: 0
Current no. of operations: 0
MaxScale> quit
[root@centos ~]# kill -9 6600
[root@centos ~]# ps aux | grep maxscale
maxscale 6627 2.0 0.2 276168 3884 ? Ssl 20:41 0:00 /usr/bin/maxscale --user=maxscale
root 6633 0.0 0.0 103312 872 pts/0 S+ 20:41 0:00 grep maxscale
[root@centos ~]# initctl status maxscale
maxscale start/running, process 6627
In the /var/log/messages we clearly see that the process is respawned by init:
...After few more checks I asked to implement this officially in packages for the Linux distributions that use Upstart, see MXS-1027.
Nov 26 20:38:15 centos maxscale[6600]: Started MaxScale log flusher.
Nov 26 20:40:33 centos maxscale[6600]: Loaded module MaxAdminAuth: V2.0.0 from /usr/lib64/maxscale/libMaxAdminAuth.so
Nov 26 20:41:52 centos init: maxscale main process (6600) killed by KILL signal
Nov 26 20:41:52 centos init: maxscale main process ended, respawning
Nov 26 20:41:52 centos maxscale[6627]: Working directory: /var/log/maxscale
Nov 26 20:42:07 centos maxscale[6627]: Loaded module MaxAdminAuth: V2.0.0 from /usr/lib64/maxscale/libMaxAdminAuth.so
...
To summarize, I wish I care more to find out how Upstart works long time ago. Now it's time to study systemd probably :) Anyway, after some reading and testing one can use it efficiently to provide automated service starts and restarts for MySQL server and services used with it.
No comments:
Post a Comment