From: Robert Pang <robertpang@google.com>
To: colyli@suse.de
Cc: dongsheng.yang@easystack.cn, linux-bcache@vger.kernel.org
Subject: Re: [PATCH v2] bcache: allow allocator to invalidate bucket in gc
Date: Fri, 15 Mar 2024 15:45:27 -0700
Message-ID: <20240315224527.694458-1-robertpang@google.com>
In-Reply-To: <1ddde040-9bde-515a-1d4d-b41de472a702@suse.de>
Hi all,
We found this patch via Google.
We have a setup that uses bcache to cache network-attached storage on a local SSD. Under heavy traffic, IO on the cached device stalls every hour or so for tens of seconds. When we track latency continuously with the "fio" utility, we see the max IO latency shoot up whenever a stall happens:
latency_test: (groupid=0, jobs=1): err= 0: pid=50416: Fri Mar 15 21:14:18 2024
read: IOPS=62.3k, BW=486MiB/s (510MB/s)(11.4GiB/24000msec)
slat (nsec): min=1377, max=98964, avg=4567.31, stdev=1330.69
clat (nsec): min=367, max=43682, avg=429.77, stdev=234.70
lat (nsec): min=1866, max=105301, avg=5068.60, stdev=1383.14
clat percentiles (nsec):
| 1.00th=[ 386], 5.00th=[ 406], 10.00th=[ 406], 20.00th=[ 410],
| 30.00th=[ 414], 40.00th=[ 414], 50.00th=[ 414], 60.00th=[ 418],
| 70.00th=[ 418], 80.00th=[ 422], 90.00th=[ 426], 95.00th=[ 462],
| 99.00th=[ 652], 99.50th=[ 708], 99.90th=[ 3088], 99.95th=[ 5600],
| 99.99th=[11328]
bw ( KiB/s): min=318192, max=627591, per=99.97%, avg=497939.04, stdev=81923.63, samples=47
iops : min=39774, max=78448, avg=62242.15, stdev=10240.39, samples=47
...
<IO stall>
latency_test: (groupid=0, jobs=1): err= 0: pid=50416: Fri Mar 15 21:21:23 2024
read: IOPS=26.0k, BW=203MiB/s (213MB/s)(89.1GiB/448867msec)
slat (nsec): min=958, max=40745M, avg=15596.66, stdev=13650543.09
clat (nsec): min=364, max=104599, avg=435.81, stdev=302.81
lat (nsec): min=1416, max=40745M, avg=16104.06, stdev=13650546.77
clat percentiles (nsec):
| 1.00th=[ 378], 5.00th=[ 390], 10.00th=[ 406], 20.00th=[ 410],
| 30.00th=[ 414], 40.00th=[ 414], 50.00th=[ 418], 60.00th=[ 418],
| 70.00th=[ 418], 80.00th=[ 422], 90.00th=[ 426], 95.00th=[ 494],
| 99.00th=[ 772], 99.50th=[ 916], 99.90th=[ 3856], 99.95th=[ 5920],
| 99.99th=[10816]
bw ( KiB/s): min= 1, max=627591, per=100.00%, avg=244393.77, stdev=103534.74, samples=765
iops : min= 0, max=78448, avg=30549.06, stdev=12941.82, samples=765
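To put the two summaries side by side: the worst-case submission latency (slat max) is 98964 ns in the first report and 40745M ns in the second. A minimal sketch of the unit conversion follows. It assumes that the "M" suffix in fio's summary is a decimal multiplier (10^6); the helper name is made up for illustration and is not part of fio.

```python
# Hypothetical helper (not part of fio): convert a latency value as printed in
# fio's summary, e.g. "98964" or "40745M", from nanoseconds to seconds.
# Assumption: the k/M/G suffixes are decimal multipliers (10^3, 10^6, 10^9).
SUFFIXES = {"k": 10**3, "K": 10**3, "M": 10**6, "G": 10**9}


def fio_nsec_to_seconds(text: str) -> float:
    text = text.strip()
    multiplier = 1
    if text and text[-1] in SUFFIXES:
        multiplier = SUFFIXES[text[-1]]
        text = text[:-1]
    return float(text) * multiplier / 1e9


before = fio_nsec_to_seconds("98964")   # slat max in the first report
after = fio_nsec_to_seconds("40745M")   # slat max in the report covering the stall
print(f"before: {before * 1e6:.1f} us, after: {after:.1f} s")
```

Under that assumption the worst case grows from roughly 99 microseconds to roughly 40.7 seconds, which is consistent with the "tens of seconds" stall described above.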
When we track per-second max latency in fio, we see something like this:
<time-ms>,<max-latency-ns>,,,
...
777000, 5155548, 0, 0, 0
778000, 105551, 1, 0, 0
802615, 24276019570, 0, 0, 0
802615, 82134, 1, 0, 0
804000, 9944554, 0, 0, 0
805000, 7424638, 1, 0, 0
The fio command line:
fio --time_based --runtime=3600s --ramp_time=2s --ioengine=libaio --name=latency_test --filename=fio --bs=8k --iodepth=1 --size=900G --readwrite=randrw --verify=0 --write_lat_log=lat --log_avg_msec=1000 --log_max_value=1
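The per-second log above can be scanned for stalls automatically. The following is a minimal sketch, not something we ran in production: the function name and the 1-second threshold are assumptions chosen for illustration, and the sample rows are the ones quoted above. It reads only the first two columns (time in ms, max latency in ns) and skips any line that does not parse, such as the header or "...".

```python
import csv
import io


def find_stalls(log_text, threshold_ns=1_000_000_000):
    """Return (time_ms, latency_seconds) for samples above threshold_ns.

    threshold_ns defaults to 1 second, an arbitrary cut-off for this sketch.
    """
    stalls = []
    for row in csv.reader(io.StringIO(log_text)):
        if len(row) < 2:
            continue
        try:
            time_ms = int(row[0].strip())
            latency_ns = int(row[1].strip())
        except ValueError:
            continue  # header line, "...", or other non-numeric content
        if latency_ns > threshold_ns:
            stalls.append((time_ms, latency_ns / 1e9))
    return stalls


# Sample rows copied from the log excerpt in this message.
SAMPLE = """\
777000, 5155548, 0, 0, 0
778000, 105551, 1, 0, 0
802615, 24276019570, 0, 0, 0
802615, 82134, 1, 0, 0
804000, 9944554, 0, 0, 0
805000, 7424638, 1, 0, 0
"""

for time_ms, seconds in find_stalls(SAMPLE):
    print(f"t={time_ms} ms: max latency {seconds:.2f} s")
```

On the excerpt this reports a single sample, at 802615 ms, with a max latency of about 24.28 seconds. To scan a full log, pass the contents of the file written by --write_lat_log instead of SAMPLE.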
We saw a similar issue reported in https://www.spinics.net/lists/linux-bcache/msg09578.html, which suggests a problem in garbage collection. When we trigger GC manually via "echo 1 > /sys/fs/bcache/a356bdb0-...-64f794387488/internal/trigger_gc", the stall reproduces every time. That thread points to this patch (https://www.spinics.net/lists/linux-bcache/msg08870.html); we tested it, and with it applied the stall no longer happens.
AFAIK, this patch marks buckets as reclaimable at the beginning of GC, so the allocator is unblocked and does not need to wait for GC to finish. This periodic stall is a serious issue. Could the community look at this issue and this patch, if possible?
We are running Linux kernel versions 5.10 and 6.1.
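For anyone who wants to reproduce this while fio is running, the manual trigger can be scripted so that the trigger time is recorded and can be compared against the latency log. This is a rough sketch only: the function name is made up, the cache set UUID must be supplied by the caller (ours is abbreviated above), and writing to sysfs requires root.

```python
import os
import time


def trigger_gc(cset_uuid, sysfs_root="/sys/fs/bcache"):
    """Write "1" to <sysfs_root>/<cset_uuid>/internal/trigger_gc.

    Returns the path written and the wall-clock time just before the write,
    so the trigger can be lined up with fio's latency log afterwards.
    cset_uuid is the cache set UUID and must be supplied by the caller.
    """
    path = os.path.join(sysfs_root, cset_uuid, "internal", "trigger_gc")
    started = time.time()
    with open(path, "w") as f:
        f.write("1\n")
    return path, started
```

Calling trigger_gc() with the cache set UUID is equivalent to the echo command above. The sysfs_root parameter exists only so the function can be exercised against a scratch directory.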
Thank you.
Thread overview: 32+ messages
2020-09-10 11:21 [PATCH] bcache: allow allocator to invalidate bucket in gc Dongsheng Yang
2020-09-10 11:28 ` [PATCH v2] " Dongsheng Yang
2020-09-18 9:53 ` Coly Li
2024-03-15 22:45 ` Robert Pang [this message]
2024-03-16 2:48 ` Coly Li
2024-03-17 5:41 ` Robert Pang
2024-03-17 13:59 ` Coly Li
2024-03-18 6:16 ` Robert Pang
2024-03-28 18:05 ` Robert Pang
2024-03-29 13:00 ` Coly Li
2024-04-11 6:44 ` Robert Pang
2024-05-03 18:23 ` Coly Li
2024-05-03 18:28 ` Coly Li
2024-05-04 2:04 ` Robert Pang
2024-05-04 3:08 ` Coly Li
2024-05-08 2:34 ` Dongsheng Yang
2024-05-12 5:43 ` Robert Pang
2024-05-12 9:41 ` Kernel error with 6.8.9 Pierre Juhen (IMAP)
2024-05-13 7:57 ` Coly Li
2024-05-17 0:34 ` Eric Wheeler
2024-05-17 15:57 ` Coly Li
2024-05-13 7:43 ` [PATCH v2] bcache: allow allocator to invalidate bucket in gc Coly Li
2024-05-14 5:15 ` Robert Pang
2024-05-14 23:39 ` Coly Li
2024-05-17 0:30 ` Eric Wheeler
2024-05-17 16:06 ` Coly Li
2024-05-17 21:47 ` Eric Wheeler
2024-05-24 7:14 ` Robert Pang
2024-05-27 18:14 ` Coly Li
2024-05-28 5:50 ` Robert Pang
2024-05-29 16:24 ` Coly Li
2024-06-03 7:04 ` Robert Pang