linux-bcachefs.vger.kernel.org archive mirror
* [PATCH] bcachefs: Return EROFS for BCH_IOCTL_DATA ioctl requests in ro mount mode.
From: Julian Sun @ 2025-06-17 13:41 UTC
  To: linux-bcachefs; +Cc: kent.overstreet, Julian Sun, syzbot+56edda805363e0a093b8

Recently, syzkaller reported the following issue:

BUG: kernel NULL pointer dereference, address: 0000000000000000
Call Trace:
 <TASK>
 mempool_alloc_noprof+0x1a7/0x510 mm/mempool.c:402
 bch2_btree_update_start+0x549/0x1480 fs/bcachefs/btree_update_interior.c:1194
 bch2_btree_node_rewrite+0x17e/0x1120 fs/bcachefs/btree_update_interior.c:2208
 bch2_move_btree+0x6f0/0xc70 fs/bcachefs/move.c:1093
 bch2_scan_old_btree_nodes+0x95/0x240 fs/bcachefs/move.c:1215
 bch2_data_job+0x646/0x910 fs/bcachefs/move.c:1354
 bch2_data_thread+0x8f/0x1d0 fs/bcachefs/chardev.c:315
 kthread+0x711/0x8a0 kernel/kthread.c:464
 ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

This is because after commit d4d71b58e513 ("bcachefs: RO mounts now use less memory"),
read-only mounts no longer initialize btree_interior_update_pool, which is required for
processing BCH_IOCTL_DATA requests.

Since all BCH_IOCTL_DATA requests involve writing data, EROFS should be returned in this scenario.

Reported-by: syzbot+56edda805363e0a093b8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/683bede4.a00a0220.d8eae.002a.GAE@google.com
Fixes: d4d71b58e513 ("bcachefs: RO mounts now use less memory")
Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
---
 fs/bcachefs/chardev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/bcachefs/chardev.c b/fs/bcachefs/chardev.c
index fde3c2380e28..ba9859fc9f24 100644
--- a/fs/bcachefs/chardev.c
+++ b/fs/bcachefs/chardev.c
@@ -384,6 +384,9 @@ static long bch2_ioctl_data(struct bch_fs *c,
 	if (arg.op >= BCH_DATA_OP_NR || arg.flags)
 		return -EINVAL;
 
+	if (c->vfs_sb->s_flags & SB_RDONLY)
+		return -EROFS;
+
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 	if (!ctx)
 		return -ENOMEM;
-- 
2.39.5



* Re: [PATCH] bcachefs: Return EROFS for BCH_IOCTL_DATA ioctl requests in ro mount mode.
From: Kent Overstreet @ 2025-06-17 14:04 UTC
  To: Julian Sun; +Cc: linux-bcachefs, syzbot+56edda805363e0a093b8

On Tue, Jun 17, 2025 at 09:41:20PM +0800, Julian Sun wrote:
> Recently, syzkaller reported the following issue:
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> Call Trace:
>  <TASK>
>  mempool_alloc_noprof+0x1a7/0x510 mm/mempool.c:402
>  bch2_btree_update_start+0x549/0x1480 fs/bcachefs/btree_update_interior.c:1194
>  bch2_btree_node_rewrite+0x17e/0x1120 fs/bcachefs/btree_update_interior.c:2208
>  bch2_move_btree+0x6f0/0xc70 fs/bcachefs/move.c:1093
>  bch2_scan_old_btree_nodes+0x95/0x240 fs/bcachefs/move.c:1215
>  bch2_data_job+0x646/0x910 fs/bcachefs/move.c:1354
>  bch2_data_thread+0x8f/0x1d0 fs/bcachefs/chardev.c:315
>  kthread+0x711/0x8a0 kernel/kthread.c:464
>  ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> 
> This is because after commit d4d71b58e513 ("bcachefs: RO mounts now use less memory"),
> read-only mounts no longer initialize btree_interior_update_pool, which is required for
> processing BCH_IOCTL_DATA requests.

Alan already gave me a better fix for this. You pretty much never want
to just check if the filesystem is ro or rw - that would be racy, that
can change at any time. If you need the filesystem to be rw, you do it
by getting a write ref (which may fail).
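
To make the difference between "check a flag" and "get a write ref" concrete,
here is a minimal userspace sketch. It is a model only, not the bcachefs
implementation: struct fs_model, write_ref_tryget(), write_ref_put(), go_ro()
and data_op() are hypothetical names, and a single atomic counter stands in
for c->writes. The assumption is that a negative counter means "read-only" and
a non-negative counter is the number of write refs currently held.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; the kernel uses a percpu refcount instead. */
struct fs_model {
	atomic_long writes;	/* >= 0: refs held on an RW fs, < 0: read-only */
};

/* Try to pin the filesystem in RW state; fails once it has gone RO. */
static bool write_ref_tryget(struct fs_model *c)
{
	long v = atomic_load(&c->writes);

	while (v >= 0) {
		/* On failure the CAS reloads v, so the RO check is repeated. */
		if (atomic_compare_exchange_weak(&c->writes, &v, v + 1))
			return true;
	}
	return false;
}

static void write_ref_put(struct fs_model *c)
{
	long old = atomic_fetch_sub(&c->writes, 1);

	assert(old > 0);
	(void)old;
}

/* Going RO only succeeds when no write refs are held (the kernel would wait). */
static int go_ro(struct fs_model *c)
{
	long expected = 0;

	return atomic_compare_exchange_strong(&c->writes, &expected, -1L)
		? 0 : -EBUSY;
}

/* An operation that needs RW: the ref, not a flag test, is what makes it safe. */
static int data_op(struct fs_model *c)
{
	if (!write_ref_tryget(c))
		return -EROFS;

	/* ... work that requires the filesystem to stay RW ... */

	write_ref_put(c);
	return 0;
}
```

In this model the check and the "stay RW" guarantee are one atomic step, so
there is no window between testing the state and acting on it. A plain
SB_RDONLY test gives no such guarantee, because the state can change right
after the test.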

Just checking SB_RDONLY here would be "technically" correct since we
only need the mempool, which is never deallocated until filesystem
teardown, and the interior update path should get its own ref on
c->writes before doing anything serious.

But it's bad form, because then other code changes might go "ok, we've
checked that we're RW, we're safe" - but we're actually not.

And, I'm just now noticing that bch2_btree_update_start() actually does
not get a ref on c->writes, so we might want to fix that - or move.c
needs to be getting a write ref, or both.

c->writes is a percpu refcount, so it's dirt cheap, there's generally
zero downside to taking a ref even if an upper layer already has one.
The only exception is if it's an internal operation that needs to run
when we're going RO - but we have a flag for that,
BCH_TRANS_COMMIT_no_check_rw, which bch2_btree_update_start() can check.

The other consideration with write refs is that we don't want to be
holding them for an unbounded duration, because that will block going RO
- so I think bch2_ioctl_data() actually wasn't the best place for this,
we should be checking if we're RW in move.c, every time we kick off an
op.
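
The last two points (a flag that skips the RW check for internal operations,
and taking the ref per op rather than for a whole job) can be sketched in
userspace as follows. This is an illustration under stated assumptions, not
bcachefs code: struct fs_model, rewrite_one_node(), move_nodes() and
OP_NO_CHECK_RW are hypothetical, with OP_NO_CHECK_RW standing in for
BCH_TRANS_COMMIT_no_check_rw, and the ro_after parameter exists only to
simulate a remount to RO in the middle of a job.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for BCH_TRANS_COMMIT_no_check_rw. */
#define OP_NO_CHECK_RW	(1u << 0)

/* Userspace model: writes >= 0 counts refs on an RW fs, < 0 means RO. */
struct fs_model {
	atomic_long writes;
	long nodes_rewritten;
};

static bool write_ref_tryget(struct fs_model *c)
{
	long v = atomic_load(&c->writes);

	while (v >= 0) {
		if (atomic_compare_exchange_weak(&c->writes, &v, v + 1))
			return true;
	}
	return false;
}

static void write_ref_put(struct fs_model *c)
{
	long old = atomic_fetch_sub(&c->writes, 1);

	assert(old > 0);
	(void)old;
}

/* One bounded unit of work: a ref is held only while this op runs. */
static int rewrite_one_node(struct fs_model *c, unsigned flags)
{
	bool need_ref = !(flags & OP_NO_CHECK_RW);

	if (need_ref && !write_ref_tryget(c))
		return -EROFS;

	c->nodes_rewritten++;	/* placeholder for the real update */

	if (need_ref)
		write_ref_put(c);
	return 0;
}

/* Long-running job: no ref is held between ops, so going RO is not blocked. */
static int move_nodes(struct fs_model *c, int nr, int ro_after)
{
	for (int i = 0; i < nr; i++) {
		if (i == ro_after)
			atomic_store(&c->writes, -1L);	/* simulated remount RO */

		int ret = rewrite_one_node(c, 0);
		if (ret)
			return ret;
	}
	return 0;
}
```

With this shape, a job that is interrupted by a transition to RO stops at the
next op with -EROFS instead of holding the filesystem RW for its whole
duration, while a caller passing the flag still runs during the transition.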


* Re: [PATCH] bcachefs: Return EROFS for BCH_IOCTL_DATA ioctl requests in ro mount mode.
From: Julian Sun @ 2025-06-17 14:56 UTC
  To: Kent Overstreet; +Cc: linux-bcachefs, syzbot+56edda805363e0a093b8

Thanks for your detailed explanation, this makes sense to me.

-- 
Julian Sun <sunjunchao2870@gmail.com>
