linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] bcachefs: don't return early from __btree_err for bad or incompatible node read errors
@ 2025-06-14 18:57 Bharadwaj Raju
  2025-06-15  0:19 ` Kent Overstreet
  0 siblings, 1 reply; 5+ messages in thread
From: Bharadwaj Raju @ 2025-06-14 18:57 UTC (permalink / raw)
  To: kent.overstreet
  Cc: Bharadwaj Raju, linux-bcachefs, shuah, linux-kernel,
	linux-kernel-mentees, syzbot+cfd994b9cdf00446fd54

After cd3cdb1ef706 ("Single err message for btree node reads"),
all errors caused __btree_err to return BCH_ERR_fsck_fix no matter what
the actual error type was if the recovery pass was scanning for btree
nodes. This lead to the code continuing despite things like bad node
formats when they earlier would have caused a jump to fsck_err, because
btree_err only jumps when the return from __btree_err does not match
fsck_fix. Ultimately this lead to undefined behavior by attempting to
unpack a key based on an invalid format.

Make errors of type BCH_ERR_btree_node_read_err_bad_node (only if
__bch2_topology_error) or BCH_ERR_btree_node_read_err_incompatible go
through the full __btree_err function instead of returning fsck_fix even
when we are in that recovery phase.

Reported-by: syzbot+cfd994b9cdf00446fd54@syzkaller.appspotmail.com
Fixes: cd3cdb1ef706 ("bcachefs: Single err message for btree node reads")
Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com>
---
 fs/bcachefs/btree_io.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/bcachefs/btree_io.c b/fs/bcachefs/btree_io.c
index d8f3c4c65e90..e010ae94f1e1 100644
--- a/fs/bcachefs/btree_io.c
+++ b/fs/bcachefs/btree_io.c
@@ -556,7 +556,10 @@ static int __btree_err(int ret,
 		       struct printbuf *err_msg,
 		       const char *fmt, ...)
 {
-	if (c->recovery.curr_pass == BCH_RECOVERY_PASS_scan_for_btree_nodes)
+	if (c->recovery.curr_pass == BCH_RECOVERY_PASS_scan_for_btree_nodes &&
+	    !(ret == -BCH_ERR_btree_node_read_err_bad_node &&
+	      __bch2_topology_error(c, err_msg)) &&
+	    ret != -BCH_ERR_btree_node_read_err_incompatible)
 		return bch_err_throw(c, fsck_fix);
 
 	bool have_retry = false;
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] bcachefs: don't return early from __btree_err for bad or incompatible node read errors
  2025-06-14 18:57 [PATCH] bcachefs: don't return early from __btree_err for bad or incompatible node read errors Bharadwaj Raju
@ 2025-06-15  0:19 ` Kent Overstreet
  2025-06-15  8:12   ` Bharadwaj Raju
  0 siblings, 1 reply; 5+ messages in thread
From: Kent Overstreet @ 2025-06-15  0:19 UTC (permalink / raw)
  To: Bharadwaj Raju
  Cc: linux-bcachefs, shuah, linux-kernel, linux-kernel-mentees,
	syzbot+cfd994b9cdf00446fd54

On Sun, Jun 15, 2025 at 12:27:40AM +0530, Bharadwaj Raju wrote:
> After cd3cdb1ef706 ("Single err message for btree node reads"),
> all errors caused __btree_err to return BCH_ERR_fsck_fix no matter what
> the actual error type was if the recovery pass was scanning for btree
> nodes. This lead to the code continuing despite things like bad node
> formats when they earlier would have caused a jump to fsck_err, because
> btree_err only jumps when the return from __btree_err does not match
> fsck_fix. Ultimately this lead to undefined behavior by attempting to
> unpack a key based on an invalid format.

Hang on, -BCH_ERR_fsck_fix should've caused us to fix fixable errors,
not cause undefined behaviour.

Or is the issue that we're returning -BCH_ERR_fsck_fix for non-fixable
errors?

Glancing at the code, I think the bug might not be limited to btree node
scan; we now seem to be passing FSCK_CAN_FIX for all errors in the
non-btree-node-scan case, and I don't think that's right for
BCH_ERR_btree_node_read_err_must_retry cases.

But I'll have to go digging through the git history to confirm that, and
it sounds like you've already looked - does that sound like it?

> 
> Make errors of type BCH_ERR_btree_node_read_err_bad_node (only if
> __bch2_topology_error) or BCH_ERR_btree_node_read_err_incompatible go
> through the full __btree_err function instead of returning fsck_fix even
> when we are in that recovery phase.
> 
> Reported-by: syzbot+cfd994b9cdf00446fd54@syzkaller.appspotmail.com
> Fixes: cd3cdb1ef706 ("bcachefs: Single err message for btree node reads")
> Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com>
> ---
>  fs/bcachefs/btree_io.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/bcachefs/btree_io.c b/fs/bcachefs/btree_io.c
> index d8f3c4c65e90..e010ae94f1e1 100644
> --- a/fs/bcachefs/btree_io.c
> +++ b/fs/bcachefs/btree_io.c
> @@ -556,7 +556,10 @@ static int __btree_err(int ret,
>  		       struct printbuf *err_msg,
>  		       const char *fmt, ...)
>  {
> -	if (c->recovery.curr_pass == BCH_RECOVERY_PASS_scan_for_btree_nodes)
> +	if (c->recovery.curr_pass == BCH_RECOVERY_PASS_scan_for_btree_nodes &&
> +	    !(ret == -BCH_ERR_btree_node_read_err_bad_node &&
> +	      __bch2_topology_error(c, err_msg)) &&
> +	    ret != -BCH_ERR_btree_node_read_err_incompatible)
>  		return bch_err_throw(c, fsck_fix);
>  
>  	bool have_retry = false;
> -- 
> 2.49.0
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] bcachefs: don't return early from __btree_err for bad or incompatible node read errors
  2025-06-15  0:19 ` Kent Overstreet
@ 2025-06-15  8:12   ` Bharadwaj Raju
  2025-06-15 16:05     ` Kent Overstreet
  0 siblings, 1 reply; 5+ messages in thread
From: Bharadwaj Raju @ 2025-06-15  8:12 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcachefs, shuah, linux-kernel, linux-kernel-mentees,
	syzbot+cfd994b9cdf00446fd54

On Sun, Jun 15, 2025 at 5:49 AM Kent Overstreet
<kent.overstreet@linux.dev> wrote:
>
> On Sun, Jun 15, 2025 at 12:27:40AM +0530, Bharadwaj Raju wrote:
> > After cd3cdb1ef706 ("Single err message for btree node reads"),
> > all errors caused __btree_err to return BCH_ERR_fsck_fix no matter what
> > the actual error type was if the recovery pass was scanning for btree
> > nodes. This lead to the code continuing despite things like bad node
> > formats when they earlier would have caused a jump to fsck_err, because
> > btree_err only jumps when the return from __btree_err does not match
> > fsck_fix. Ultimately this lead to undefined behavior by attempting to
> > unpack a key based on an invalid format.
>
> Hang on, -BCH_ERR_fsck_fix should've caused us to fix fixable errors,
> not cause undefined behaviour.

-BCH_ERR_fsck_fix in btree_err (as _ret from __btree_err) prevents a jump
to fsck_err, so if that is the case, the code afterwards continues
as normal. That's where the UB in this bug comes from.

I think I should explain the path of the bug:
1. bch2_btree_node_read_done calls validate_bset, with a jump to
fsck_err if it returns an error.
2. validate_bset has btree_err_on(bch2_bkey_format_invalid(...), ...),
but in the bug case we were in btree-node-scan so __btree_err returns
fsck_fix, which means ret isn't modified and we don't jump to
fsck_err.
3. validate_bset doesn't return an error code, so
bch2_btree_node_read_done goes ahead and sets the invalid format --
and then UB happens when trying to unpack keys based on it.

> Or is the issue that we're returning -BCH_ERR_fsck_fix for non-fixable
> errors?
>
> Glancing at the code, I think the bug might not be limited to btree node
> scan; we now seem to be passing FSCK_CAN_FIX for all errors in the
> non-btree-node-scan case, and I don't think that's right for
> BCH_ERR_btree_node_read_err_must_retry cases.
>
> But I'll have to go digging through the git history to confirm that, and
> it sounds like you've already looked - does that sound like it?

Isn't the bch2_fsck_err_opt(c, FSCK_CAN_FIX, err_type) call only in
the node_read_err_fixable case?
In the must_retry case we return bad_node. But I'm not really familiar
with all this, so I think it'll be best if you do
take a look.

Thanks!

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] bcachefs: don't return early from __btree_err for bad or incompatible node read errors
  2025-06-15  8:12   ` Bharadwaj Raju
@ 2025-06-15 16:05     ` Kent Overstreet
  2025-06-15 16:46       ` Bharadwaj Raju
  0 siblings, 1 reply; 5+ messages in thread
From: Kent Overstreet @ 2025-06-15 16:05 UTC (permalink / raw)
  To: Bharadwaj Raju
  Cc: linux-bcachefs, shuah, linux-kernel, linux-kernel-mentees,
	syzbot+cfd994b9cdf00446fd54

On Sun, Jun 15, 2025 at 01:42:22PM +0530, Bharadwaj Raju wrote:
> On Sun, Jun 15, 2025 at 5:49 AM Kent Overstreet
> <kent.overstreet@linux.dev> wrote:
> >
> > On Sun, Jun 15, 2025 at 12:27:40AM +0530, Bharadwaj Raju wrote:
> > > After cd3cdb1ef706 ("Single err message for btree node reads"),
> > > all errors caused __btree_err to return BCH_ERR_fsck_fix no matter what
> > > the actual error type was if the recovery pass was scanning for btree
> > > nodes. This lead to the code continuing despite things like bad node
> > > formats when they earlier would have caused a jump to fsck_err, because
> > > btree_err only jumps when the return from __btree_err does not match
> > > fsck_fix. Ultimately this lead to undefined behavior by attempting to
> > > unpack a key based on an invalid format.
> >
> > Hang on, -BCH_ERR_fsck_fix should've caused us to fix fixable errors,
> > not cause undefined behaviour.
> 
> -BCH_ERR_fsck_fix in btree_err (as _ret from __btree_err) prevents a jump
> to fsck_err, so if that is the case, the code afterwards continues
> as normal. That's where the UB in this bug comes from.
> 
> I think I should explain the path of the bug:
> 1. bch2_btree_node_read_done calls validate_bset, with a jump to
> fsck_err if it returns an error.
> 2. validate_bset has btree_err_on(bch2_bkey_format_invalid(...), ...),
> but in the bug case we were in btree-node-scan so __btree_err returns
> fsck_fix, which means ret isn't modified and we don't jump to
> fsck_err.
> 3. validate_bset doesn't return an error code, so
> bch2_btree_node_read_done goes ahead and sets the invalid format --
> and then UB happens when trying to unpack keys based on it.
> 
> > Or is the issue that we're returning -BCH_ERR_fsck_fix for non-fixable
> > errors?
> >
> > Glancing at the code, I think the bug might not be limited to btree node
> > scan; we now seem to be passing FSCK_CAN_FIX for all errors in the
> > non-btree-node-scan case, and I don't think that's right for
> > BCH_ERR_btree_node_read_err_must_retry cases.
> >
> > But I'll have to go digging through the git history to confirm that, and
> > it sounds like you've already looked - does that sound like it?
> 
> Isn't the bch2_fsck_err_opt(c, FSCK_CAN_FIX, err_type) call only in
> the node_read_err_fixable case?

You're right, it is. Ok, so it's just the scan_for_btree_nodes path
that's busted.

But then you're calling __bch2_topology_error()? When we're in
scan_for_btree_nodes we're just checking if the node is readable, we
shouldn't be doing anything that flags errors/recovery passes elsewhere.

I think the correct fix would be more like

if (scan_for_btree_nodes)
	return ret == -BCH_ERR_btree_node_read_err_fixable
		? bch_err_throw(c, fsck_fix)
		: ret; /* or -BCH_ERR_btree_node_read_err_bad_node? */

The error codes feel pretty awkward here, we're converting between
different classes of error codes more than we should be which makes
things hard to follow - this is code that predates modern errcode.h
errcodes and was converted from a private enum, I think I could've done
that better - but I think either way works.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] bcachefs: don't return early from __btree_err for bad or incompatible node read errors
  2025-06-15 16:05     ` Kent Overstreet
@ 2025-06-15 16:46       ` Bharadwaj Raju
  0 siblings, 0 replies; 5+ messages in thread
From: Bharadwaj Raju @ 2025-06-15 16:46 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcachefs, shuah, linux-kernel, linux-kernel-mentees,
	syzbot+cfd994b9cdf00446fd54

On Sun, Jun 15, 2025 at 9:35 PM Kent Overstreet
<kent.overstreet@linux.dev> wrote:
> But then you're calling __bch2_topology_error()? When we're in
> scan_for_btree_nodes we're just checking if the node is readable, we
> shouldn't be doing anything that flags errors/recovery passes elsewhere.

Right, that makes sense. I originally put that in because the old
codepath would unset silent (which was set when in
scan_for_btree_nodes) only if __bch2_topology_error returned nonzero
in the case of bad_node.

> I think the correct fix would be more like
>
> if (scan_for_btree_nodes)
>         return ret == -BCH_ERR_btree_node_read_err_fixable
>                 ? bch_err_throw(c, fsck_fix)
>                 : ret; /* or -BCH_ERR_btree_node_read_err_bad_node? */
>

Got it. I'll send the PATCH v2 then, thanks for the guidance!

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-06-15 16:46 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-14 18:57 [PATCH] bcachefs: don't return early from __btree_err for bad or incompatible node read errors Bharadwaj Raju
2025-06-15  0:19 ` Kent Overstreet
2025-06-15  8:12   ` Bharadwaj Raju
2025-06-15 16:05     ` Kent Overstreet
2025-06-15 16:46       ` Bharadwaj Raju

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).