From: Brian Foster <bfoster@redhat.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org
Subject: Re: [PATCH 3/3] bcachefs: fix race between journal entry close and pin set
Date: Tue, 12 Sep 2023 14:54:34 -0400 [thread overview]
Message-ID: <ZQCz6qYUdWdgVsGH@bfoster> (raw)
In-Reply-To: <20230910020304.bhoqfrsq2ez3ngri@moria.home.lan>
On Sat, Sep 09, 2023 at 10:03:04PM -0400, Kent Overstreet wrote:
> On Wed, Sep 06, 2023 at 03:07:58PM -0400, Brian Foster wrote:
> > On Tue, Sep 05, 2023 at 08:59:02AM -0400, Brian Foster wrote:
> > > On Sun, Sep 03, 2023 at 10:29:28PM -0400, Kent Overstreet wrote:
> > > > On Sun, Sep 03, 2023 at 05:18:12PM -0400, Kent Overstreet wrote:
> > > > > On Thu, Aug 31, 2023 at 07:07:34AM -0400, Brian Foster wrote:
> > > > > > bcachefs freeze testing via fstests generic/390 occasionally
> > > > > > reproduces the following BUG from bch2_fs_read_only():
> > > > > >
> > > > > > BUG_ON(atomic_long_read(&c->btree_key_cache.nr_dirty));
> > > > >
> > > > > Your fix makes sense, but I wonder if it might be simpler and better to
> > > > > fix this in bch2_journal_reclaim_fast().
> > ...
> > >
> > > FWIW, one thing I disliked about the original patch was the need to
> > > mostly duplicate the buf put helper due to the locking context quirk. I
> > > was trying to avoid that, but I ran out of ideas before I wanted to move
> > > on actually fixing the bug. My preference would be to address the
> > > reference counting issue as is (to preserve design simplicity), and then
> > > maybe think a bit harder about cleaning up the res put implementation if
> > > the primary concern is that we feel like this starts to make things a
> > > bit convoluted..
> > >
> >
> > Another thought that comes to mind is to perhaps allow the journal_res
> > to hold a reference to the pin fifo for the associated seq.. The idea
> > would be we could continue to hold a reference during the open/close
> > journal buffer lifecycle, but a res of the same seq would acquire an
> > additional reference as well to keep the tail from popping before a
> > transaction can actually set a pin (i.e. essentially parallel to the
> > buffer reference).
>
> Yeah that would probably be cleanest - but it would be much too heavy.
>
I'm curious how so? Additional atomics?
If so, we could also consider something that allows a pin list ref on
the reservation to simply transfer to a pin set via the transaction.
That might not add any new overhead over the current code, but would
require some plumbing.
> > In a sense the behavior in this patch is kind of an optimization of that
> > approach, but I think implementing it directly would eliminate the need
> > to mostly duplicate the put helper, perhaps making things a bit more
> > natural. We could still fall into bch2_journal_reclaim_fast() from
> > res_put(), but now only for cases where the reservation outlives the
> > active journal buffer (due to a journal flush or whatever). That also
> > still addresses the race by properly using the reference count. I'd
> > probably have to try it to be sure I'm not missing something, but...
> > thoughts?
>
> We can do your approach if that's what you feel is going to be cleanest,
> but if we go that way here's my two main concerns:
>
Ok..
FWIW, it's not so much that I think it's the cleanest (I agree that the
variant discussed above probably ends up cleaner code-wise), but rather
that it provides sufficient confidence we won't just have to revisit the
same underlying problem the next time some bit of code assumes that an
outstanding reservation guarantees a usable pin list.
> - we can't put more slowpath code into the inline fastpath, remember
> this is all inlined into the main __bch2_trans_commit() path
>
Sure. I missed that closure_call() was itself inlined. I think that is
what initially confused me wrt to the feedback on patch 2. I'll try to
keep that helper external and perhaps just rename it.
> - your approach (the optimized version) means there's a new state
> transition where we do work, i.e. when journal_state_count() hits 0.
>
Indeed.
> So we have to make sure that that refcount isn't hitting 0 multiple
> times. The appropriate assertion already exists in
> journal_res_get_fast() - for a refcount to hit 0 multiple times it
> must leave 0 while in use, and we check for that.
>
> Let's just add a comment by that EBUG_ON(!journal_state_count(new,
> new.idx)) line indicating why it's now important.
>
Sounds reasonable. I'll try to come up with something. Thanks.
Brian
next prev parent reply other threads:[~2023-09-12 18:55 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-31 11:07 [PATCH 0/3] bcachefs: journal bug fixes Brian Foster
2023-08-31 11:07 ` [PATCH 1/3] bcachefs: restart journal reclaim thread on ro->rw transitions Brian Foster
2023-08-31 11:07 ` [PATCH 2/3] bcachefs: prepare journal buf put to handle pin put Brian Foster
2023-09-03 21:09 ` Kent Overstreet
2023-09-05 12:41 ` Brian Foster
2023-09-10 1:55 ` Kent Overstreet
2023-08-31 11:07 ` [PATCH 3/3] bcachefs: fix race between journal entry close and pin set Brian Foster
2023-09-03 21:18 ` Kent Overstreet
2023-09-04 2:29 ` Kent Overstreet
2023-09-05 12:59 ` Brian Foster
2023-09-06 19:07 ` Brian Foster
2023-09-10 2:03 ` Kent Overstreet
2023-09-12 18:54 ` Brian Foster [this message]
2023-09-12 19:15 ` Kent Overstreet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZQCz6qYUdWdgVsGH@bfoster \
--to=bfoster@redhat.com \
--cc=kent.overstreet@linux.dev \
--cc=linux-bcachefs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.