From: Hugh Dickins <hughd@google.com>
To: Keith Busch <kbusch@kernel.org>
Cc: Hugh Dickins <hughd@google.com>, Jan Kara <jack@suse.cz>,
Jens Axboe <axboe@kernel.dk>, Yu Kuai <yukuai1@huaweicloud.com>,
Liu Song <liusong@linux.alibaba.com>,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH next] sbitmap: fix lockup while swapping
Date: Fri, 23 Sep 2022 16:15:29 -0700 (PDT) [thread overview]
Message-ID: <929a3aba-72b0-5e-5b80-824a2b7f5dc7@google.com> (raw)
In-Reply-To: <391b1763-7146-857-e3b6-dc2a8e797162@google.com>
On Fri, 23 Sep 2022, Hugh Dickins wrote:
> On Fri, 23 Sep 2022, Keith Busch wrote:
>
> > Does the following fix the observation? Rational being that there's no reason
> > to spin on the current wait state that is already under handling; let
> > subsequent clearings proceed to the next inevitable wait state immediately.
>
> It's running fine without lockup so far; but doesn't this change merely
> narrow the window? If this is interrupted in between atomic_try_cmpxchg()
> setting wait_cnt to 0 and sbq_index_atomic_inc() advancing wake_index,
> don't we run the same risk as before, of sbitmap_queue_wake_up() from
> the interrupt handler getting stuck on that wait_cnt 0?
Yes, it ran successfully for 50 minutes, then an interrupt came in
immediately after the cmpxchg, and it locked up just as before.
Easily dealt with by disabling interrupts, no doubt, but I assume it's a
badge of honour not to disable interrupts here (except perhaps in waking).
Some clever way to make the wait_cnt and wake_index adjustments atomic?
Or is this sbitmap_queue_wake_up() interrupting sbitmap_queue_wake_up()
just supposed never to happen, the counts preventing it: but some
misaccounting letting it happen by mistake?
>
> >
> > ---
> > diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> > index 624fa7f118d1..47bf7882210b 100644
> > --- a/lib/sbitmap.c
> > +++ b/lib/sbitmap.c
> > @@ -634,6 +634,13 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq, int *nr)
> >
> > *nr -= sub;
> >
> > + /*
> > + * Increase wake_index before updating wait_cnt, otherwise concurrent
> > + * callers can see valid wait_cnt in old waitqueue, which can cause
> > + * invalid wakeup on the old waitqueue.
> > + */
> > + sbq_index_atomic_inc(&sbq->wake_index);
> > +
> > /*
> > * When wait_cnt == 0, we have to be particularly careful as we are
> > * responsible to reset wait_cnt regardless whether we've actually
> > @@ -660,13 +667,6 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq, int *nr)
> > * of atomic_set().
> > */
> > smp_mb__before_atomic();
> > -
> > - /*
> > - * Increase wake_index before updating wait_cnt, otherwise concurrent
> > - * callers can see valid wait_cnt in old waitqueue, which can cause
> > - * invalid wakeup on the old waitqueue.
> > - */
> > - sbq_index_atomic_inc(&sbq->wake_index);
> > atomic_set(&ws->wait_cnt, wake_batch);
> >
> > return ret || *nr;
> > --
>
next prev parent reply other threads:[~2022-09-23 23:15 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-18 21:10 [PATCH next] sbitmap: fix lockup while swapping Hugh Dickins
2022-09-19 21:22 ` Keith Busch
2022-09-19 23:01 ` Hugh Dickins
2022-09-21 16:40 ` Jan Kara
2022-09-23 14:43 ` Jan Kara
2022-09-23 15:13 ` Keith Busch
2022-09-23 16:16 ` Hugh Dickins
2022-09-23 19:07 ` Keith Busch
2022-09-23 21:29 ` Hugh Dickins
2022-09-23 23:15 ` Hugh Dickins [this message]
2022-09-26 11:44 ` Jan Kara
2022-09-26 14:08 ` Yu Kuai
2022-09-27 3:39 ` Hugh Dickins
2022-09-27 10:31 ` Jan Kara
2022-09-28 3:56 ` Hugh Dickins
2022-09-28 3:59 ` [PATCH next v2] " Hugh Dickins
2022-09-28 4:07 ` Hugh Dickins
2022-09-29 8:39 ` Jan Kara
2022-09-29 19:50 ` [PATCH next v3] " Hugh Dickins
2022-09-29 19:56 ` Keith Busch
2022-09-29 23:58 ` Jens Axboe
[not found] ` <20220924023047.1410-1-hdanton@sina.com>
2022-09-27 4:02 ` [PATCH next] " Hugh Dickins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=929a3aba-72b0-5e-5b80-824a2b7f5dc7@google.com \
--to=hughd@google.com \
--cc=axboe@kernel.dk \
--cc=jack@suse.cz \
--cc=kbusch@kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=liusong@linux.alibaba.com \
--cc=yukuai1@huaweicloud.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox