All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: linux-bcachefs@vger.kernel.org
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Subject: bcachefs backpointers errors in no write buf mode
Date: Fri, 7 Jul 2023 13:32:09 -0400	[thread overview]
Message-ID: <ZKhMGZBokKMq5p1P@bfoster> (raw)

Hi Kent,

I'm doing some analysis of these "existing backpointer found when
inserting" errors observed when running generic/388 in
backpointers_no_use_write_buffer=1 mode. To start with context, here's
what I see in one particular variant of the issue...

We're in read bio completion path doing a CRC narrow. The commit for
that transaction runs trans triggers on the extent key, which then falls
into bch2_trans_mark_extent() and attempts backpointer updates from
there. I stuck a warning in the backpointer error checking code just to
get a stack trace for the error context, which is appended below for
reference.

On IRC Kent mentioned this was likely a trigger ordering issue and that
something similar was addressed in the normal backpointer write buffer
update path. I think I follow what's going on there from commit
0a1b82456400 ("bcachefs: btree write buffer update ordering"), but the
context for the no write buf path seems a bit different..

WRT ordering, if I follow the code correctly, run_btree_triggers() in
the commit path basically implements an overwrite followed by !overwrite
pass, except if the old and new key type/triggers match we call into
bch2_trans_mark_key() with both the old and new keys. This calls into
bch2_trans_mark_extent(), which then implements an explicit insert ->
overwrite order. If that falls into the backpointer write buffer path,
that then has to reorder the updates in the write buffer by putting the
overwrite update at the head, presumably so it won't remove the
backpointer insert that is about to be added. (I assume there are
legitimate design reasons for the update ordering at each level here,
but the reordering back and forth seems like unfortunate complexity.)

This raises a few questions wrt the non write buffer path. First, the
existing bp error check lives in
bch2_bucket_backpointer_mod_nowritebuffer() before we even call into
bch2_trans_update(), so changing the order of trans updates doesn't seem
like it would avoid this particular error. I don't think we want to
change the high level order of the extent key trigger, so that means we
probably need to do something with the error check itself.

It also seems like this is possibly different from whatever issue lead
to the ordering fix in the write buf path, though I'm not quite sure how
that one manifests or is reproduced. A missing backpointer due to
unexpected bp removal, perhaps? If so, I suppose I could see why the
updates should be reordered in the non-buf trans, assuming the overwrite
would delete the just inserted bp key from the trans (and then btree on
commit), and perhaps I'm just not seeing that because I hit the existing
bp error check first.

So I'm still trying to piece high level things together such that an
appropriate fix is not yet clear to me. I'm wondering if perhaps the
high level (narrow crc) trans needs a flag to modify the bp checking
behavior to indicate we aren't actually inserting a new bp here (so
seeing an existing one is fine/expected), and then subsequently
bch2_bucket_backpointer_mod_nowritebuffer() needs to do something to
turn the "insert" into a bp key overwrite, and perhaps ignore the higher
level overwrite, or something *handwavy* along those lines.

I need to think about this some more, but wanted to put some thoughts
down in case I'm misunderstanding things or there's an obvious fix I'm
just not seeing yet..

Brian

--- 8< ---

WARNING: CPU: 1 PID: 31479 at fs/bcachefs/backpointers.c:128 backpointer_mod_err.isra.0+0x193/0x1b0 [bcachefs]
...
CPU: 1 PID: 31479 Comm: kworker/u16:4 Tainted: G        W  OE      6.4.0+ #68
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc36 04/01/2014
Workqueue: events_unbound __bch2_read_endio [bcachefs]

RIP: 0010:backpointer_mod_err.isra.0+0x193/0x1b0 [bcachefs]
...
Call Trace:
 <TASK>
 ? backpointer_mod_err.isra.0+0x193/0x1b0 [bcachefs]
 ? __warn+0x7d/0x130
 ? backpointer_mod_err.isra.0+0x193/0x1b0 [bcachefs]
 ? report_bug+0x18d/0x1c0
 ? handle_bug+0x41/0x70
 ? exc_invalid_op+0x13/0x60
 ? asm_exc_invalid_op+0x16/0x20
 ? backpointer_mod_err.isra.0+0x193/0x1b0 [bcachefs]
 bch2_bucket_backpointer_mod_nowritebuffer+0x15b/0x210 [bcachefs]
 ? bch2_bucket_backpointer_mod_nowritebuffer+0x5/0x210 [bcachefs]
 bch2_bucket_backpointer_mod+0x2ad/0x2c0 [bcachefs]
 bch2_trans_mark_pointer.constprop.0+0x34d/0x360 [bcachefs]
 ? bch2_trans_start_alloc_update+0x5/0x140 [bcachefs]
 __trans_mark_extent+0x1c5/0x3e0 [bcachefs]
 ? __trace_bprintk+0x54/0x60
 bch2_trans_mark_extent+0x9f/0xe0 [bcachefs]
 run_btree_triggers+0x1ae/0x330 [bcachefs]
 __bch2_trans_commit+0x92/0x1e70 [bcachefs]
 ? __bch2_rbio_narrow_crcs+0xbb/0x380 [bcachefs]
 bch2_rbio_narrow_crcs+0xbf/0x100 [bcachefs]
 ? bch2_rbio_narrow_crcs+0x45/0x100 [bcachefs]
 __bch2_read_endio+0x7c7/0x8f0 [bcachefs]
 ? process_one_work+0x1c8/0x3c0
 process_one_work+0x1c8/0x3c0
 worker_thread+0x4d/0x380
 ? _raw_spin_lock_irqsave+0x23/0x50
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xf3/0x120
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2c/0x50
 </TASK>


             reply	other threads:[~2023-07-07 17:30 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-07 17:32 Brian Foster [this message]
2023-07-07 17:58 ` bcachefs backpointers errors in no write buf mode Kent Overstreet
2023-07-10 13:37   ` Brian Foster
2023-07-10 14:56     ` Kent Overstreet
2023-07-12 15:01       ` Brian Foster
2023-07-12 15:40         ` Kent Overstreet
2023-07-12 16:52           ` Brian Foster
2023-07-12 17:10             ` Kent Overstreet
2023-07-07 20:37 ` Kent Overstreet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZKhMGZBokKMq5p1P@bfoster \
    --to=bfoster@redhat.com \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-bcachefs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.