In this third post of my "Linux /proc Filesystem for MySQL DBAs" series (see also Part I and Part II for the context and details) I am going to present a useful set of programs for accessing, summarizing and recording /proc details, created and recently shared by the famous Tanel Poder: 0x.tools. I'll try to build them and apply them to Percona Server 5.7.32-35 running on my Ubuntu 16.04 netbook and fighting with some read-write sysbench load.
To build the tools I cloned them from GitHub and just ran make as usual:
openxs@ao756:~/git$ git clone https://github.com/tanelpoder/0xtools
Cloning into '0xtools'...
remote: Enumerating objects: 103, done.
remote: Counting objects: 100% (103/103), done.
remote: Compressing objects: 100% (67/67), done.
remote: Total 186 (delta 53), reused 77 (delta 32), pack-reused 83
Receiving objects: 100% (186/186), 108.43 KiB | 0 bytes/s, done.
Resolving deltas: 100% (94/94), done.
Checking connectivity... done.
openxs@ao756:~/git$ cd 0xtools/
openxs@ao756:~/git/0xtools$ make
gcc -I include -Wall -o bin/xcapture src/xcapture.c
openxs@ao756:~/git/0xtools$ sudo make install
# for now the temporary "install" method is with symlinks
ln -s `pwd`/bin/xcapture /usr/bin/xcapture
ln -s `pwd`/bin/psn /usr/bin/psn
ln -s `pwd`/bin/schedlat /usr/bin/schedlat
openxs@ao756:~/git/0xtools$ file bin/xcapture
bin/xcapture: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=ad05c149b0379a29fef765ec4706e912843846ff, not stripped
openxs@ao756:~/git/0xtools$ file bin/psn
bin/psn: Python script, ASCII text executable
openxs@ao756:~/git/0xtools$ file bin/schedlat
bin/schedlat: Python script, ASCII text executable
openxs@ao756:~/git/0xtools$ python --version
Python 2.7.12
Nothing more than that for now: one executable and two Python scripts. This installation process should work well on any Linux where you can build non-ancient MySQL from source and that has some Python installed, starting from RHEL 5.
We can get some details about the programs' usage as follows:
openxs@ao756:~/git/0xtools$ xcapture -h
0x.Tools xcapture v1.0 by Tanel Poder [https://0x.tools]
Usage:
xcapture [options]
By default, sample all /proc tasks in states R, D every second and print to stdout
Options:
-a capture tasks in additional states, even the ones Sleeping (S)
-A capture tasks in All states, including Zombie (Z), Exiting (X), Idle (I)
-c <c1,c2> print additional columns (for example: -c exe,cmdline,kstack)
-d <N> seconds to sleep between samples (default: 1)
-E <string> custom task state Exclusion filter (default: XZIS)
-h display this help message
-o <dirname> write wide output into hourly CSV files in this directory instead of stdout
openxs@ao756:~/git/0xtools$ psn -h
usage: psn [-h] [-d seconds] [-p [pid]] [-t [tid]] [-r] [-a]
[--sample-hz SAMPLE_HZ] [--ps-hz PS_HZ] [-o filename] [-i filename]
[-s csv-columns] [-g csv-columns] [-G csv-columns] [--list]
optional arguments:
-h, --help show this help message and exit
-d seconds number of seconds to sample for
-p [pid], --pid [pid]
process id to sample (including all its threads), or
process name regex, or omit for system-wide sampling
-t [tid], --thread [tid]
thread/task id to sample (not implemented yet)
-r, --recursive also sample and report for descendant processes
-a, --all-states display threads in all states, including idle ones
--sample-hz SAMPLE_HZ
sample rate in Hz (default: 20)
--ps-hz PS_HZ sample rate of new processes in Hz (default: 2)
-o filename, --output-sample-db filename
path of sqlite3 database to persist samples to,
defaults to in-memory/transient
-i filename, --input-sample-db filename
path of sqlite3 database to read samples from instead
of actively sampling
-s csv-columns, --select csv-columns
additional columns to report
-g csv-columns, --group-by csv-columns
columns to aggregate by in reports
-G csv-columns, --append-group-by csv-columns
default + additional columns to aggregate by in
reports
--list list all available columns
The last one is also a Python script; it does not bother to provide a -h option, but shows the usage when called without arguments:
openxs@ao756:~/git/0xtools$ bin/schedlat
usage: bin/schedlat PID
There are also two shell scripts in bin/:
openxs@ao756:~/git/0xtools$ ls -F bin
psn* run_xcapture.sh* run_xcpu.sh* schedlat* xcapture*
The purposes of the tools are the following:
- xcapture - low-overhead thread state sampler based on reading /proc files
- psn - shows current top thread activity by sampling /proc files
- schedlat - shows CPU scheduling latency for the given PID as a % of its runtime
- run_xcapture.sh - a simple “daemon” script for keeping xcapture running
- run_xcpu.sh - low-frequency continuous stack sampling for threads on CPU (using perf)
The last script, run_xcpu.sh, is out of scope of this series. We may get back to it in some future perf-related posts. It shows proper usage of perf for low-frequency (1 Hz), low-impact on-CPU profiling.
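For reference, low-frequency system-wide on-CPU sampling with perf looks roughly like this (a sketch of the general idea only, not necessarily the exact command line the script uses; the output file name is made up):
sudo perf record -F 1 -g -a -o /tmp/xcpu.perf.data -- sleep 600
sudo perf report -i /tmp/xcpu.perf.data --stdio | head -30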
Let's see what we can get with xcapture:
openxs@ao756:~/git/0xtools$ mkdir /tmp/xcap
openxs@ao756:~/git/0xtools$ xcapture -a -o /tmp/xcap
0xTools xcapture v1.0 by Tanel Poder [https://0x.tools]
Sampling /proc...
^C
openxs@ao756:~/git/0xtools$
while the sysbench test below was running against my Percona Server:
openxs@ao756:~$ sysbench oltp_read_write --db-driver=mysql --tables=5 --table-size=100000 --mysql-user=root --mysql-socket=/var/run/mysqld/mysqld.sock --mysql-db=sbtest --threads=16 --time=1200 --report-interval=10 run
...
[ 1200s ] thds: 16 tps: 70.99 qps: 1420.02 (r/w/o: 993.87/284.16/141.98) lat (ms,95%): 344.08 err/s: 0.00 reconn/s: 0.00
SQL statistics:
queries performed:
read: 1214556
write: 347016
other: 173508
total: 1735080
transactions: 86754 (72.28 per sec.)
queries: 1735080 (1445.54 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
Throughput:
events/s (eps): 72.2771
time elapsed: 1200.2964s
total number of events: 86754
Latency (ms):
min: 89.13
avg: 221.33
max: 855.31
95th percentile: 356.70
sum: 19200998.85
Threads fairness:
events (avg/stddev): 5422.1250/12.10
execution time (avg/stddev): 1200.0624/0.11
To estimate the impact of capturing the information, here are the results of the same test run earlier, without xcapture running:
...
transactions: 87376 (72.80 per sec.)
queries: 1747537 (1456.07 per sec.)
ignored errors: 1 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
Throughput:
events/s (eps): 72.8027
time elapsed: 1200.1747s
total number of events: 87376
Latency (ms):
min: 87.54
avg: 219.75
max: 843.19
95th percentile: 356.70
...
So we had 72.80 TPS and 1456 QPS over 1200 seconds without monitoring and 72.28 TPS and 1446 QPS with it. Do not mind the absolute numbers (no tuning was performed, everything is at defaults, and this is an old netbook with a slow HDD), but the difference is really negligible.
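For the record, the difference can be quantified with a quick awk one-liner (using the TPS and QPS numbers from the two runs above):
awk 'BEGIN { printf("TPS overhead: %.2f%%, QPS overhead: %.2f%%\n", (72.80-72.28)/72.80*100, (1456.07-1445.54)/1456.07*100) }'
TPS overhead: 0.71%, QPS overhead: 0.72%
Well under 1% on this workload.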
Now let's see what was captured (with -a, so all threads of all processes in the system, in all states, even S - sleeping):
openxs@ao756:~/git/0xtools$ ls -l /tmp/xcap/
total 56720
-rw-rw-r-- 1 openxs openxs 36310058 Jan 8 16:59 2021-01-08.16.csv
-rw-rw-r-- 1 openxs openxs 21768834 Jan 8 17:07 2021-01-08.17.csv
openxs@ao756:~/git/0xtools$ cat /tmp/xcap/2021-01-08.16.csv | head -30
TS,PID,TID,USERNAME,ST,COMMAND,SYSCALL,WCHAN,EXE,KSTACK
2021-01-08 16:47:24.414,1,1,root,S,(systemd),-,0,-,-
2021-01-08 16:47:24.414,2,2,root,S,(kthreadd),-,0,-,-
2021-01-08 16:47:24.414,3,3,root,S,(ksoftirqd/0),-,0,-,-
2021-01-08 16:47:24.414,5,5,root,S,(kworker/0:0H),-,0,-,-
2021-01-08 16:47:24.414,7,7,root,S,(rcu_sched),-,0,-,-
2021-01-08 16:47:24.414,8,8,root,S,(rcu_bh),-,0,-,-
2021-01-08 16:47:24.414,9,9,root,S,(migration/0),-,0,-,-
2021-01-08 16:47:24.414,10,10,root,S,(watchdog/0),-,0,-,-
2021-01-08 16:47:24.414,11,11,root,S,(watchdog/1),-,0,-,-
2021-01-08 16:47:24.414,12,12,root,S,(migration/1),-,0,-,-
2021-01-08 16:47:24.414,13,13,root,S,(ksoftirqd/1),-,0,-,-
2021-01-08 16:47:24.414,15,15,root,S,(kworker/1:0H),-,0,-,-
2021-01-08 16:47:24.414,16,16,root,S,(kdevtmpfs),-,0,-,-
2021-01-08 16:47:24.414,17,17,root,S,(netns),-,0,-,-
2021-01-08 16:47:24.414,18,18,root,S,(perf),-,0,-,-
2021-01-08 16:47:24.414,19,19,root,S,(khungtaskd),-,0,-,-
2021-01-08 16:47:24.414,20,20,root,S,(writeback),-,0,-,-
2021-01-08 16:47:24.414,21,21,root,S,(ksmd),-,0,-,-
2021-01-08 16:47:24.414,22,22,root,S,(khugepaged),-,0,-,-
2021-01-08 16:47:24.414,23,23,root,S,(crypto),-,0,-,-
2021-01-08 16:47:24.414,24,24,root,S,(kintegrityd),-,0,-,-
2021-01-08 16:47:24.414,25,25,root,S,(bioset),-,0,-,-
2021-01-08 16:47:24.414,26,26,root,S,(kblockd),-,0,-,-
2021-01-08 16:47:24.414,27,27,root,S,(ata_sff),-,0,-,-
2021-01-08 16:47:24.414,28,28,root,S,(md),-,0,-,-
2021-01-08 16:47:24.414,29,29,root,S,(devfreq_wq),-,0,-,-
2021-01-08 16:47:24.414,34,34,root,S,(kswapd0),-,0,-,-
2021-01-08 16:47:24.414,35,35,root,S,(vmstat),-,0,-,-
2021-01-08 16:47:24.414,36,36,root,S,(fsnotify_mark),-,0,-,-
openxs@ao756:~/git/0xtools$
Nothing that fancy in the first 30 lines. But you can then load the CSV files into a database for processing, or query them with standard Linux text processing tools. It's like SQL but with different keywords: grep for filtering, awk or sed for column projection, uniq for GROUP BY, sort for ordering and head/tail for LIMIT. Let's count the number of threads per command, state and syscall, for example, and get the top 10:
openxs@ao756:~/git/0xtools$ cat /tmp/xcap/*.csv | awk -F, '{ printf("%-20s %-20s %s\n",$6,$5,$7) }' | sort | uniq -c | sort -nbr | head -10
71442 (console-kit-dae) S -
69174 (gmain) S read
69173 (gdbus) S read
48762 (dconf S read
47628 (dockerd) S -
46455 (mysqld) S -
19196 (sysbench) S read
15876 (gmain) S -
15876 (bioset) S -
14742 (containerd) S -
No wonder MySQL threads were mostly sleeping, with 16 of them and just 2 slow CPUs.
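Since sleeping threads dominate the overall counts, we could filter them out and check, for example, what mysqld threads caught in the D (uninterruptible) state were doing. A sketch assuming the same column layout as in the CSV header above (ST in column 5, COMMAND in column 6, SYSCALL in 7, WCHAN in 8):
cat /tmp/xcap/*.csv | awk -F, '$5 == "D" && $6 == "(mysqld)" { printf("%-20s %s\n", $7, $8) }' | sort | uniq -c | sort -nbr | head -10
The same CSV files could just as well be loaded into MySQL with LOAD DATA INFILE and queried with real SQL.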
The next tool is psn (Linux Process Snapper), a Python script for troubleshooting ongoing issues (no historical capture). It currently reports more fields directly from /proc than xcapture captures (like the filenames accessed by I/O system calls). Let me try to apply psn to the mysqld process:
openxs@ao756:~/git/0xtools$ psn -p `pidof mysqld`
Linux Process Snapper v0.18 by Tanel Poder [https://0x.tools]
Sampling /proc/stat for 5 seconds... finished.
=== Active Threads ========================================
samples | avg_threads | comm | state
-----------------------------------------------------------
205 | 2.05 | (mysqld) | Disk (Uninterruptible)
114 | 1.14 | (mysqld) | Running (ON CPU)
samples: 100 (expected: 100)
total processes: 1, threads: 44
runtime: 5.01, measure time: 1.29
or to the entire system while the sysbench test is running:
openxs@ao756:~/git/0xtools$ psn
Linux Process Snapper v0.18 by Tanel Poder [https://0x.tools]
Sampling /proc/stat for 5 seconds... finished.
=== Active Threads ================================================
samples | avg_threads | comm | state
-------------------------------------------------------------------
94 | 2.24 | (mysqld) | Disk (Uninterruptible)
41 | 0.98 | (jbd*/dm-*-*) | Disk (Uninterruptible)
31 | 0.74 | (mysqld) | Running (ON CPU)
31 | 0.74 | (sysbench) | Running (ON CPU)
14 | 0.33 | (update-manager) | Running (ON CPU)
5 | 0.12 | (kworker/u*:*) | Running (ON CPU)
2 | 0.05 | (Xorg) | Running (ON CPU)
1 | 0.02 | (compiz) | Running (ON CPU)
1 | 0.02 | (dockerd) | Running (ON CPU)
1 | 0.02 | (rcu_sched) | Running (ON CPU)
samples: 42 (expected: 100)
total processes: 221, threads: 650
runtime: 5.06, measure time: 5.01
These are useful summaries of the number of threads per state per process for the monitoring period, but nothing really fancy yet. That said, we can add other columns from this long list, with items that should look familiar to those studying /proc content:
openxs@ao756:~/git/0xtools$ psn --list
stat
====================================
pid int
comm str
comm2 str
state_id str
state str
ppid int
pgrp int
session int
tty_nr int
tpgid int
minflt int
cminflt int
majflt int
cmajflt int
utime long
stime long
cutime long
cstime long
utime_sec long
stime_sec long
cutime_sec long
cstime_sec long
priority int
nice int
num_threads int
starttime long
vsize long
rss long
rsslim str
exit_signal int
processor int
rt_priority int
delayacct_blkio_ticks long
guest_time int
cgust_time int
exit_code int
status
====================================
name str
umask str
state str
tgid int
ngid int
pid int
ppid int
tracerpid int
uid int
gid int
fdsize int
groups str
nstgid str
nspid str
nspgid str
nssid str
vmpeak_kb int
vmsize_kb int
vmlck_kb int
vmpin_kb int
vmhwm_kb int
vmrss_kb int
rssanon_kb int
rssfile_kb int
rssshmem_kb int
vmdata_kb int
vmstk_kb int
vmexe_kb int
vmlib_kb int
vmpte_kb int
vmpmd_kb int
vmswap_kb int
hugetlbpages_kb int
threads int
sigq str
sigpnd str
shdpnd str
sigblk str
sigign str
sigcgt str
capinh str
capprm str
capeff str
capbnd str
capamb str
seccomp int
cpus_allowed str
cpus_allowed_list str
mems_allowed str
mems_allowed_list str
voluntary_ctxt_switches int
nonvoluntary_ctxt_switches int
syscall
====================================
syscall_id int
syscall str
arg0 str
arg1 str
arg2 str
arg3 str
arg4 str
arg5 str
filename str
filename2 str
filenamesum str
basename str
dirname str
wchan
====================================
wchan str
io
====================================
rchar int
wchar int
syscr int
syscw int
read_bytes int
write_bytes int
cancelled_write_bytes int
smaps
====================================
address_range str
perms str
offset str
dev str
inode int
pathname str
size_kb int
rss_kb int
pss_kb int
shared_clean_kb int
shared_dirty_kb int
private_clean_kb int
private_dirty_kb int
referenced_kb int
anonymous_kb int
anonhugepages_kb int
shmempmdmapped_kb int
shared_hugetld_kb int
private_hugetld_kb int
swap_kb int
swappss_kb int
kernelpagesize_kb int
mmupagesize_kb int
locked_kb int
vmflags str
stack
====================================
kstack str
cmdline
====================================
cmdline str
openxs@ao756:~/git/0xtools$
We can add syscall to find out what the threads were waiting on (not just the syscall number as in /proc, but the decoded syscall name, which is exactly what requires some tricks or a good tool):
openxs@ao756:~/git/0xtools$ sudo psn -p `pidof mysqld` -G syscall
Linux Process Snapper v0.18 by Tanel Poder [https://0x.tools]
Sampling /proc/stat, syscall for 5 seconds... finished.
=== Active Threads ====================================================
samples | avg_threads | comm | state | syscall
-----------------------------------------------------------------------
136 | 1.36 | (mysqld) | Disk (Uninterruptible) | fsync
97 | 0.97 | (mysqld) | Disk (Uninterruptible) | fdatasync
41 | 0.41 | (mysqld) | Running (ON CPU) | [running]
5 | 0.05 | (mysqld) | Running (ON CPU) | poll
4 | 0.04 | (mysqld) | Disk (Uninterruptible) | pwrite64
3 | 0.03 | (mysqld) | Running (ON CPU) | futex
2 | 0.02 | (mysqld) | Disk (Uninterruptible) | write
samples: 100 (expected: 100)
total processes: 1, threads: 44
runtime: 5.02, measure time: 1.67
We can further group by different kernel stacks:
openxs@ao756:~/git/0xtools$ sudo psn -p `pidof mysqld` -G kstack
Linux Process Snapper v0.18 by Tanel Poder [https://0x.tools]
Sampling /proc/stat, stack for 5 seconds... finished.
=== Active Threads =========================================================================================================================================================================================================================================================================================
samples | avg_threads | comm | state | kstack
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
101 | 1.01 | (mysqld) | Disk (Uninterruptible) | entry_SYSCALL_64_fastpath()->SyS_fsync()->do_fsync()->vfs_fsync_range()->ext4_sync_file()->jbd2_complete_transaction()->jbd2_log_wait_commit()
89 | 0.89 | (mysqld) | Disk (Uninterruptible) | entry_SYSCALL_64_fastpath()->SyS_fdatasync()->do_fsync()->vfs_fsync_range()->ext4_sync_file()->jbd2_complete_transaction()->jbd2_log_wait_commit()
48 | 0.48 | (mysqld) | Running (ON CPU) | entry_SYSCALL_64_fastpath()->SyS_poll()->do_sys_poll()->poll_schedule_timeout()
15 | 0.15 | (mysqld) | Running (ON CPU) | int_ret_from_sys_call()->syscall_return_slowpath()->exit_to_usermode_loop()
11 | 0.11 | (mysqld) | Disk (Uninterruptible) | entry_SYSCALL_64_fastpath()->SyS_fsync()->do_fsync()->vfs_fsync_range()->ext4_sync_file()->blkdev_issue_flush()->submit_bio_wait()
11 | 0.11 | (mysqld) | Running (ON CPU) | -
9 | 0.09 | (mysqld) | Disk (Uninterruptible) | entry_SYSCALL_64_fastpath()->SyS_fsync()->do_fsync()->vfs_fsync_range()->ext4_sync_file()->filemap_write_and_wait_range()->filemap_fdatawait_range()->__filemap_fdatawait_range()->wait_on_page_bit()
7 | 0.07 | (mysqld) | Running (ON CPU) | entry_SYSCALL_64_fastpath()->SyS_futex()->do_futex()->futex_wait()->futex_wait_queue_me()
4 | 0.04 | (mysqld) | Running (ON CPU) | retint_user()->prepare_exit_to_usermode()->exit_to_usermode_loop()
3 | 0.03 | (mysqld) | Disk (Uninterruptible) | entry_SYSCALL_64_fastpath()->SyS_pwrite64()->vfs_write()->__vfs_write()->new_sync_write()->ext4_file_write_iter()->__generic_file_write_iter()->generic_file_direct_write()->ext4_direct_IO()->__blockdev_direct_IO()->do_blockdev_direct_IO()
2 | 0.02 | (mysqld) | Disk (Uninterruptible) | entry_SYSCALL_64_fastpath()->SyS_fdatasync()->do_fsync()->vfs_fsync_range()->ext4_sync_file()->filemap_write_and_wait_range()->filemap_fdatawait_range()->__filemap_fdatawait_range()->wait_on_page_bit()
2 | 0.02 | (mysqld) | Disk (Uninterruptible) | entry_SYSCALL_64_fastpath()->SyS_pread64()->vfs_read()->__vfs_read()->new_sync_read()->generic_file_read_iter()->wait_on_page_bit_killable()
2 | 0.02 | (mysqld) | Running (ON CPU) | entry_SYSCALL_64_fastpath()->SyS_io_getevents()->read_events()
1 | 0.01 | (mysqld) | Running (ON CPU) | entry_SYSCALL_64_fastpath()->SyS_fsync()->do_fsync()->vfs_fsync_range()->ext4_sync_file()->filemap_write_and_wait_range()->filemap_fdatawait_range()->__filemap_fdatawait_range()->wait_on_page_bit()
1 | 0.01 | (mysqld) | Running (ON CPU) | entry_SYSCALL_64_fastpath()->SyS_fsync()->do_fsync()->vfs_fsync_range()->ext4_sync_file()->jbd2_complete_transaction()->jbd2_log_wait_commit()
samples: 100 (expected: 100)
total processes: 1, threads: 44
runtime: 5.01, measure time: 1.78
Even better, we can see the wait channel and, if it's disk I/O, what file it is related to:
openxs@ao756:~/git/0xtools$ sudo psn -p `pidof mysqld` -G wchan,filename
Linux Process Snapper v0.18 by Tanel Poder [https://0x.tools]
Sampling /proc/syscall, wchan, stat for 5 seconds... finished.
=== Active Threads =====================================================================================================
samples | avg_threads | comm | state | wchan | filename
------------------------------------------------------------------------------------------------------------------------
91 | 0.91 | (mysqld) | Disk (Uninterruptible) | jbd2_log_wait_commit | /var/lib/mysql/ao756-bin.000092
89 | 0.89 | (mysqld) | Disk (Uninterruptible) | jbd2_log_wait_commit | /var/lib/mysql/ib_logfile1
49 | 0.49 | (mysqld) | Running (ON CPU) | 0 |
9 | 0.09 | (mysqld) | Disk (Uninterruptible) | wait_on_page_bit | /var/lib/mysql/ib_logfile1
7 | 0.07 | (mysqld) | Disk (Uninterruptible) | jbd2_log_wait_commit | /var/lib/mysql/xb_doublewrite
4 | 0.04 | (mysqld) | Disk (Uninterruptible) | jbd2_log_wait_commit | /var/lib/mysql/sbtest/sbtest5.ibd
3 | 0.03 | (mysqld) | Disk (Uninterruptible) | jbd2_log_wait_commit | /var/lib/mysql/sbtest/sbtest4.ibd
3 | 0.03 | (mysqld) | Running (ON CPU) | poll_schedule_timeout |
2 | 0.02 | (mysqld) | Disk (Uninterruptible) | wait_on_page_bit | /var/lib/mysql/ao756-bin.000092
1 | 0.01 | (mysqld) | Disk (Uninterruptible) | do_blockdev_direct_IO | /var/lib/mysql/xb_doublewrite
1 | 0.01 | (mysqld) | Disk (Uninterruptible) | submit_bio_wait | /var/lib/mysql/sbtest/sbtest2.ibd
1 | 0.01 | (mysqld) | Disk (Uninterruptible) | submit_bio_wait | /var/lib/mysql/sbtest/sbtest3.ibd
1 | 0.01 | (mysqld) | Disk (Uninterruptible) | wait_on_page_bit |
1 | 0.01 | (mysqld) | Disk (Uninterruptible) | wait_on_page_bit | /var/lib/mysql/sbtest/sbtest4.ibd
1 | 0.01 | (mysqld) | Disk (Uninterruptible) | wait_on_page_bit | /var/lib/mysql/sbtest/sbtest5.ibd
1 | 0.01 | (mysqld) | Running (ON CPU) | 0 | /var/lib/mysql/ib_logfile1
samples: 100 (expected: 100)
total processes: 1, threads: 44
runtime: 5.03, measure time: 2.06
This is really cool and easy to use! This way I was reminded that I actually have binary logging enabled, and it surely slows things down :)
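Based on the -o and -i options in the help output above, it should also be possible to persist samples into an SQLite database during a longer run and then generate different reports from it offline, along these lines (a sketch I have not tried here; the database file name is made up):
sudo psn -p mysqld -d 30 -o /tmp/psn_samples.db
sudo psn -i /tmp/psn_samples.db -G syscall,filename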
The last tool to check today is schedlat. It did not provide me with any insights:
openxs@ao756:~/git/0xtools$ schedlat `pidof mysqld`
SchedLat by Tanel Poder [https://0x.tools]
PID=19951 COMM=mysqld
TIMESTAMP %CPU %LAT %SLP
2021-01-08 18:52:38 0.0 0.0 100.0
2021-01-08 18:52:39 0.0 0.0 100.0
2021-01-08 18:52:40 0.0 0.0 100.0
2021-01-08 18:52:41 0.0 0.0 100.0
2021-01-08 18:52:42 0.0 0.4 99.6
2021-01-08 18:52:43 0.0 0.0 100.0
2021-01-08 18:52:44 0.0 0.0 100.0
2021-01-08 18:52:45 0.0 0.0 100.0
2021-01-08 18:52:46 0.0 0.0 100.0
2021-01-08 18:52:47 0.0 0.0 100.0
2021-01-08 18:52:48 0.0 0.0 100.0
2021-01-08 18:52:49 0.0 0.0 100.0
2021-01-08 18:52:50 0.0 0.0 100.0
2021-01-08 18:52:51 0.0 0.0 100.0
2021-01-08 18:52:52 0.0 0.0 100.0
2021-01-08 18:52:53 0.0 0.0 100.0
2021-01-08 18:52:54 0.0 0.0 100.0
2021-01-08 18:52:55 0.0 0.0 100.0
2021-01-08 18:52:56 0.0 0.0 100.0
2021-01-08 18:52:57 0.0 0.0 100.0
2021-01-08 18:52:58 0.0 0.0 100.0
2021-01-08 18:52:59 0.0 0.0 100.0
2021-01-08 18:53:00 0.0 0.0 100.0
2021-01-08 18:53:01 0.0 0.0 100.0
2021-01-08 18:53:02 0.0 0.0 100.0
2021-01-08 18:53:03 0.0 0.0 100.0
2021-01-08 18:53:04 0.0 0.0 100.0
2021-01-08 18:53:05 0.0 0.0 100.0
2021-01-08 18:53:06 0.0 0.0 100.0
2021-01-08 18:53:07 0.0 0.0 100.0
^CTraceback (most recent call last):
File "/usr/bin/schedlat", line 36, in <module>
time.sleep(1)
KeyboardInterrupt
The mysqld process was not CPU-bound; it was mostly sleeping.
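One way to get more out of schedlat (not tried here) might be to point it at an individual busy mysqld thread rather than at the whole process, since /proc/<tid>/schedstat should be readable for individual tasks as well. The TID below is made up, just for illustration:
top -H -b -n 1 -p `pidof mysqld` | head -20
schedlat 20017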
I suggest you get inspired by the thread states chart below and the article it was copied from, and read other articles by Tanel Poder here. This is what I am going to do myself before trying to apply the tools in real life. The idea of this blog post was to show what is possible for a creative person while sampling thread states from /proc.
Nice illustration of the generic thread states we care about for off-CPU sampling, provided by Brendan Gregg at http://www.brendangregg.com/offcpuanalysis.html
* * *
To summarize, here is why anyone may prefer or need to use /proc sampling tools instead of other approaches:
- Unlike perf or eBPF tools, /proc is always there. It is present on old Linux versions, and poor man's sampling with shell scripts does not require installing anything else.
- Some information about the processes is visible to all users, so there may be no need for the root/sudo privileges that MySQL production DBAs often lack.
- It is an easy way to do low-overhead off-CPU profiling. While it is possible to enable tracing of off-CPU events in perf, it comes with a higher tracing overhead (and then the overhead of post-processing these high-frequency events). eBPF tools can be used to reduce both, but they still add extra overhead on top of what is already there in /proc.
So I decided to add these tools and ad hoc /proc sampling scripts (plus checking sar outputs, if any) to my emergency toolset for studying complex MySQL performance problems. They fit well into the overall system of application-level tracing based on performance_schema, on-CPU sampling with perf and quick bpftrace scripts. Use the right tools for the job!
In the next, probably final post in this series, I'll try to present a good example of a MySQL performance problem where /proc-based sampling tools really help to get useful insight.