From: Ahmad Draidi <a.r.draidi@redscript.org>
To: Kent Overstreet <kent.overstreet@linux.dev>,
linux-bcachefs@vger.kernel.org
Cc: syzbot+7bf808f7fe4a6549f36e@syzkaller.appspotmail.com
Subject: Re: [PATCH] bcachefs: Allocator now directly wakes up copygc when necessary
Date: Thu, 24 Oct 2024 07:46:07 +0400
Message-ID: <92dce846-d110-4c97-afd1-0b198c1fdf4d@redscript.org>
In-Reply-To: <20241019215605.160125-1-kent.overstreet@linux.dev>
Greetings,
On 10/20/24 01:56, Kent Overstreet wrote:
> copygc tries to wait in a way that balances waiting for work to
> accumulate with running before we run out of free space - but for a
> variety of reasons (multiple devices, io clock slop, the vagaries of
> fragmentation) this isn't completely reliable.
>
> So to avoid getting stuck, add direct wakeups from the allocator to the
> copygc thread when we start to notice we're low on free buckets.
Since I switched from 6.10.x to 6.11.x, I've been getting "Allocator
stuck? Waited for 30 seconds" messages, and I/O to the FS stops. Reads,
for example, don't time out; they just hang for hours, until I reboot.
I'm able to quickly and reliably trigger this with my workload.

I applied this patch on top of 6.11.4 but can still see "Allocator
stuck" in dmesg. I see the following both before and after the patch:
"BUG: unable to handle page fault for address: fffffffffffff81b
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page"
...
"RIP: 0010:bch2_btree_path_peek_slot+0x64/0x210 [bcachefs]"
A longer log snippet covering both "Allocator stuck" and the above is at:
https://pastebin.com/ptuzaryi
I ran fsck after the FS got stuck, and errors were found and fixed, but
the issue still happens, both before and after the patch.
Some info that might be needed: I'm using ECC RAM, 2x SAS SSDs, 2x SATA
HDDs, LUKS, and the following opts:
starting version 1.12: rebalance_work_acct_fix
opts=metadata_replicas=2,data_replicas=2,metadata_replicas_required=2,data_replicas_required=2,
metadata_checksum=xxhash,data_checksum=xxhash,compression=lz4,background_compression=gzip,metadata_target=ssd,foreground_target=ssd,
background_target=hdd,promote_target=ssd
Let me know if I can help.
Thanks!
Ahmad
Thread overview: 3+ messages
2024-10-19 21:56 [PATCH] bcachefs: Allocator now directly wakes up copygc when necessary Kent Overstreet
2024-10-24 3:46 ` Ahmad Draidi [this message]
2024-12-03 6:06 ` Ahmad Draidi