From: Ahmad Draidi <a.r.draidi@redscript.org>
To: linux-bcachefs@vger.kernel.org
Cc: syzbot+7bf808f7fe4a6549f36e@syzkaller.appspotmail.com
Subject: Re: [PATCH] bcachefs: Allocator now directly wakes up copygc when necessary
Date: Tue, 3 Dec 2024 10:06:29 +0400 [thread overview]
Message-ID: <eaa30a2b-3d96-4249-983b-79cb0348d16d@redscript.org> (raw)
In-Reply-To: <92dce846-d110-4c97-afd1-0b198c1fdf4d@redscript.org>
Hello,
On 10/24/24 07:46, Ahmad Draidi wrote:
> Greetings,
>
>
> On 10/20/24 01:56, Kent Overstreet wrote:
>> copygc tries to wait in a way that balances waiting for work to
>> accumulate with running before we run out of free space - but for a
>> variety of reasons (multiple devices, io clock slop, the vagaries of
>> fragmentation) this isn't completely reliable.
>>
>> So to avoid getting stuck, add direct wakeups from the allocator to the
>> copygc thread when we start to notice we're low on free buckets.
>
> Since I switched to 6.11.x from 6.10.x, I've had "Allocator stuck?
> Waited for 30 seconds" messages and I/O would stop to the FS. No
> timeout on read, for example, but it just stops for hours, until I
> reboot. I'm able to quickly and reliably trigger this with my workload.
>
>
> I applied this patch on top of 6.11.4 but can still see "Allocator
> stuck" in dmesg. I see the following before and after the patch:-
>
> "BUG: unable to handle page fault for address: fffffffffffff81b
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page"
>
> ...
>
> "RIP: 0010:bch2_btree_path_peek_slot+0x64/0x210 [bcachefs]"
>
>
> A longer log snippet of "allocator stuck" and the above are at:
> https://pastebin.com/ptuzaryi
Just a quick update for anyone reading this. The issue is solved for me
after upgrading to 6.12.1.
>
>
> I did fsck after FS got stuck, and errors were found and fixed, but
> issue happens again, before and after the patch.
>
> Some info that might be needed: I'm using ECC RAM, 2x SAS SSDs, 2x
> SATA HDDs, LUKS, and the following opts:
>
> starting version 1.12: rebalance_work_acct_fix
> opts=metadata_replicas=2,data_replicas=2,metadata_replicas_required=2,data_replicas_required=2,
>
> metadata_checksum=xxhash,data_checksum=xxhash,compression=lz4,background_compression=gzip,metadata_target=ssd,foreground_target=ssd,
>
>
> background_target=hdd,promote_target=ssd
>
>
> Let me know if I can help.
>
>
> Thanks!
>
> Ahmad
>
>
prev parent reply other threads:[~2024-12-03 6:13 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-19 21:56 [PATCH] bcachefs: Allocator now directly wakes up copygc when necessary Kent Overstreet
2024-10-24 3:46 ` Ahmad Draidi
2024-12-03 6:06 ` Ahmad Draidi [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=eaa30a2b-3d96-4249-983b-79cb0348d16d@redscript.org \
--to=a.r.draidi@redscript.org \
--cc=linux-bcachefs@vger.kernel.org \
--cc=syzbot+7bf808f7fe4a6549f36e@syzkaller.appspotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox