[PATCH] bcachefs: Allocator now directly wakes up copygc when necessary

public inbox for linux-bcachefs@vger.kernel.org

* [PATCH] bcachefs: Allocator now directly wakes up copygc when necessary
@ 2024-10-19 21:56 Kent Overstreet
  2024-10-24  3:46 ` Ahmad Draidi
  0 siblings, 1 reply; 3+ messages in thread
From: Kent Overstreet @ 2024-10-19 21:56 UTC
  To: linux-bcachefs; +Cc: Kent Overstreet, syzbot+7bf808f7fe4a6549f36e

copygc tries to wait in a way that balances waiting for work to
accumulate with running before we run out of free space - but for a
variety of reasons (multiple devices, io clock slop, the vagaries of
fragmentation) this isn't completely reliable.

So to avoid getting stuck, add direct wakeups from the allocator to the
copygc thread when we start to notice we're low on free buckets.

Reported-by: syzbot+7bf808f7fe4a6549f36e@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/alloc_foreground.c |  8 ++++++++
 fs/bcachefs/bcachefs.h         |  2 +-
 fs/bcachefs/movinggc.c         | 22 +++++++++++-----------
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/fs/bcachefs/alloc_foreground.c b/fs/bcachefs/alloc_foreground.c
index 5836870ab882..c7848672796d 100644
--- a/fs/bcachefs/alloc_foreground.c
+++ b/fs/bcachefs/alloc_foreground.c
@@ -822,6 +822,14 @@ int bch2_bucket_alloc_set_trans(struct btree_trans *trans,
 		}
 	}
 
+	if (bch2_err_matches(ret, BCH_ERR_freelist_empty)) {
+		rcu_read_lock();
+		struct task_struct *t = rcu_dereference(c->copygc_thread);
+		if (t)
+			wake_up_process(t);
+		rcu_read_unlock();
+	}
+
 	return ret;
 }
 
diff --git a/fs/bcachefs/bcachefs.h b/fs/bcachefs/bcachefs.h
index f4151ee51b03..7cc81fbc4c3a 100644
--- a/fs/bcachefs/bcachefs.h
+++ b/fs/bcachefs/bcachefs.h
@@ -986,7 +986,7 @@ struct bch_fs {
 	struct bch_fs_rebalance	rebalance;
 
 	/* COPYGC */
-	struct task_struct	*copygc_thread;
+	struct task_struct __rcu *copygc_thread;
 	struct write_point	copygc_write_point;
 	s64			copygc_wait_at;
 	s64			copygc_wait;
diff --git a/fs/bcachefs/movinggc.c b/fs/bcachefs/movinggc.c
index d658be90f737..80b18b4b04b7 100644
--- a/fs/bcachefs/movinggc.c
+++ b/fs/bcachefs/movinggc.c
@@ -363,19 +363,18 @@ static int bch2_copygc_thread(void *arg)
 		}
 
 		last = atomic64_read(&clock->now);
-		wait = bch2_copygc_wait_amount(c);
+		wait = max_t(long, 0, bch2_copygc_wait_amount(c) - clock->max_slop);
 
-		if (wait > clock->max_slop) {
+		if (wait > 0) {
 			c->copygc_wait_at = last;
 			c->copygc_wait = last + wait;
 			move_buckets_wait(&ctxt, buckets, true);
-			trace_and_count(c, copygc_wait, c, wait, last + wait);
-			bch2_kthread_io_clock_wait(clock, last + wait,
-					MAX_SCHEDULE_TIMEOUT);
+			trace_and_count(c, copygc_wait, c, wait, c->copygc_wait);
+			bch2_io_clock_schedule_timeout(clock, c->copygc_wait);
 			continue;
 		}
 
-		c->copygc_wait = 0;
+		c->copygc_wait = c->copygc_wait_at = 0;
 
 		c->copygc_running = true;
 		ret = bch2_copygc(&ctxt, buckets, &did_work);
@@ -407,9 +406,10 @@ static int bch2_copygc_thread(void *arg)
 
 void bch2_copygc_stop(struct bch_fs *c)
 {
-	if (c->copygc_thread) {
-		kthread_stop(c->copygc_thread);
-		put_task_struct(c->copygc_thread);
+	struct task_struct *t = rcu_dereference_protected(c->copygc_thread, true);
+	if (t) {
+		kthread_stop(t);
+		put_task_struct(t);
 	}
 	c->copygc_thread = NULL;
 }
@@ -436,8 +436,8 @@ int bch2_copygc_start(struct bch_fs *c)
 
 	get_task_struct(t);
 
-	c->copygc_thread = t;
-	wake_up_process(c->copygc_thread);
+	rcu_assign_pointer(c->copygc_thread, t);
+	wake_up_process(t);
 
 	return 0;
 }
-- 
2.45.2
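
The change to the wait computation in bch2_copygc_thread() can be
illustrated with a small userspace sketch. This is only a model of the
arithmetic, not kernel code: the function names copygc_wait_before() and
copygc_wait_after() are hypothetical, wait_amount stands in for the
result of bch2_copygc_wait_amount(c), max_slop stands in for
clock->max_slop, and plain long is assumed for both. A return value of 0
means "do not sleep, run copygc now".

```c
#include <assert.h>

/*
 * Hypothetical userspace model of the wait decision in
 * bch2_copygc_thread(). Names and types are assumptions made for
 * illustration; they are not part of the bcachefs source.
 */

/* Before the patch: sleep for the full amount, but only if it exceeds the slop. */
static long copygc_wait_before(long wait_amount, long max_slop)
{
	assert(max_slop >= 0);
	return wait_amount > max_slop ? wait_amount : 0;
}

/*
 * After the patch: subtract the slop up front and clamp at zero,
 * mirroring max_t(long, 0, wait_amount - max_slop).
 */
static long copygc_wait_after(long wait_amount, long max_slop)
{
	long wait;

	assert(max_slop >= 0);
	wait = wait_amount - max_slop;
	return wait > 0 ? wait : 0;
}
```

In this model both versions skip the sleep under the same condition
(wait_amount <= max_slop), but the patched version sleeps for a shorter
time, so the thread wakes max_slop units earlier on the io clock.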



* Re: [PATCH] bcachefs: Allocator now directly wakes up copygc when necessary
  2024-10-19 21:56 [PATCH] bcachefs: Allocator now directly wakes up copygc when necessary Kent Overstreet
@ 2024-10-24  3:46 ` Ahmad Draidi
  2024-12-03  6:06   ` Ahmad Draidi
  0 siblings, 1 reply; 3+ messages in thread
From: Ahmad Draidi @ 2024-10-24  3:46 UTC
  To: Kent Overstreet, linux-bcachefs; +Cc: syzbot+7bf808f7fe4a6549f36e

Greetings,


On 10/20/24 01:56, Kent Overstreet wrote:
> copygc tries to wait in a way that balances waiting for work to
> accumulate with running before we run out of free space - but for a
> variety of reasons (multiple devices, io clock slop, the vagaries of
> fragmentation) this isn't completely reliable.
>
> So to avoid getting stuck, add direct wakeups from the allocator to the
> copygc thread when we start to notice we're low on free buckets.

Since I switched from 6.10.x to 6.11.x, I've been getting "Allocator
stuck? Waited for 30 seconds" messages, and I/O to the FS stops. Reads,
for example, don't time out; they just hang for hours until I reboot.
I can trigger this quickly and reliably with my workload.


I applied this patch on top of 6.11.4 but still see "Allocator stuck"
in dmesg. I see the following both before and after the patch:

"BUG: unable to handle page fault for address: fffffffffffff81b
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page"

...

"RIP: 0010:bch2_btree_path_peek_slot+0x64/0x210 [bcachefs]"


A longer log snippet covering both "Allocator stuck" and the above is at:
https://pastebin.com/ptuzaryi


I ran fsck after the FS got stuck; errors were found and fixed, but the
issue still recurs, both before and after the patch.

Some info that might be needed: I'm using ECC RAM, 2x SAS SSDs, 2x SATA 
HDDs, LUKS, and the following opts:

starting version 1.12: rebalance_work_acct_fix opts=metadata_replicas=2,data_replicas=2,metadata_replicas_required=2,data_replicas_required=2,metadata_checksum=xxhash,data_checksum=xxhash,compression=lz4,background_compression=gzip,metadata_target=ssd,foreground_target=ssd,background_target=hdd,promote_target=ssd


Let me know if I can help.


Thanks!

Ahmad




* Re: [PATCH] bcachefs: Allocator now directly wakes up copygc when necessary
  2024-10-24  3:46 ` Ahmad Draidi
@ 2024-12-03  6:06   ` Ahmad Draidi
  0 siblings, 0 replies; 3+ messages in thread
From: Ahmad Draidi @ 2024-12-03  6:06 UTC
  To: linux-bcachefs; +Cc: syzbot+7bf808f7fe4a6549f36e

Hello,


On 10/24/24 07:46, Ahmad Draidi wrote:
> Greetings,
>
>
> On 10/20/24 01:56, Kent Overstreet wrote:
>> copygc tries to wait in a way that balances waiting for work to
>> accumulate with running before we run out of free space - but for a
>> variety of reasons (multiple devices, io clock slop, the vagaries of
>> fragmentation) this isn't completely reliable.
>>
>> So to avoid getting stuck, add direct wakeups from the allocator to the
>> copygc thread when we start to notice we're low on free buckets.
>
> Since I switched to 6.11.x from 6.10.x, I've had "Allocator stuck? 
> Waited for 30 seconds" messages and I/O would stop to the FS. No 
> timeout on read, for example, but it just stops for hours, until I 
> reboot. I'm able to quickly and reliably trigger this with my workload.
>
>
> I applied this patch on top of 6.11.4 but can still see "Allocator 
> stuck" in dmesg. I see the following before and after the patch:-
>
> "BUG: unable to handle page fault for address: fffffffffffff81b
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page"
>
> ...
>
> "RIP: 0010:bch2_btree_path_peek_slot+0x64/0x210 [bcachefs]"
>
>
> A longer log snippet of "allocator stuck" and the above are at: 
> https://pastebin.com/ptuzaryi

Just a quick update for anyone reading this. The issue is solved for me 
after upgrading to 6.12.1.


>
>
> I did fsck after FS got stuck, and errors were found and fixed, but 
> issue happens again, before and after the patch.
>
> Some info that might be needed: I'm using ECC RAM, 2x SAS SSDs, 2x 
> SATA HDDs, LUKS, and the following opts:
>
> starting version 1.12: rebalance_work_acct_fix 
> opts=metadata_replicas=2,data_replicas=2,metadata_replicas_required=2,data_replicas_required=2,
>
> metadata_checksum=xxhash,data_checksum=xxhash,compression=lz4,background_compression=gzip,metadata_target=ssd,foreground_target=ssd, 
>
>
> background_target=hdd,promote_target=ssd
>
>
> Let me know if I can help.
>
>
> Thanks!
>
> Ahmad
>
>
