From: Ming Lei <ming.lei@redhat.com>
To: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, David Jeffery <djeffery@redhat.com>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	Gabriel Krisman Bertazi <krisman@suse.de>,
	Chengming Zhou <zhouchengming@bytedance.com>,
	ming.lei@redhat.com
Subject: Re: [RFC PATCH] sbitmap: fix batching wakeup
Date: Tue, 8 Aug 2023 16:18:50 +0800
Message-ID: <ZNH6as/wUkbCMAcN@fedora>
In-Reply-To: <20230802160553.uv5wn6nfjseniyxx@quack3>

On Wed, Aug 02, 2023 at 06:05:53PM +0200, Jan Kara wrote:
> On Fri 21-07-23 17:57:15, Ming Lei wrote:
> > From: David Jeffery <djeffery@redhat.com>
> > 
> > Current code supposes that it is enough to provide forward progress by just
> > waking up one wait queue after one completion batch is done.
> > 
> > Unfortunately this isn't enough, because a waiter can be added to the
> > wait queue just after it is woken up.
> > 
> > Here is one example (64 depth, wake_batch is 8):
> > 
> > 1) all 64 tags are active
> > 
> > 2) in each wait queue, there is only one single waiter
> > 
> > 3) each time, one completion batch (8 completions) wakes up just one waiter in each wait
> > queue, then immediately one new sleeper is added to this wait queue
> > 
> > 4) after 64 completions, 8 waiters are woken up, and there are still 8 waiters in each
> > wait queue
> > 
> > 5) after another 8 active tags are completed, only one waiter can be woken up, and the other 7
> > can't be woken up anymore.
> > 
> > It turns out this problem isn't easy to fix, so simply wake up enough waiters for a
> > single batch.
> > 
> > Cc: David Jeffery <djeffery@redhat.com>
> > Cc: Kemeng Shi <shikemeng@huaweicloud.com>
> > Cc: Gabriel Krisman Bertazi <krisman@suse.de>
> > Cc: Chengming Zhou <zhouchengming@bytedance.com>
> > Cc: Jan Kara <jack@suse.cz>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> 
> I'm sorry for the delay - I was on vacation. I can see the patch got
> already merged and I'm not strictly against that (although I think Gabriel
> was experimenting with this exact wakeup scheme and as far as I remember
> the more eager waking up was causing performance decrease for some
> configurations). But let me challenge the analysis above a bit. For the
> sleeper to be added to a waitqueue in step 3), blk_mq_mark_tag_wait() must
> fail the blk_mq_get_driver_tag() call. Which means that all tags were used

Only request allocation via blk_mq_get_tag() is involved here; getting a
driver tag is not.

> at that moment. To summarize, anytime we add any new waiter to the
> waitqueue, all tags are used and thus we should eventually receive enough
> wakeups to wake all of them. What am I missing?

When the final retry (__blk_mq_get_tag()) runs before sleeping
(io_schedule()) in blk_mq_get_tag(), the sleeper has already been added to
the wait queue.
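
As a minimal sketch of that ordering (a plain Python model, not kernel
code; get_tag_slow_path, try_get and the list used as a wait queue are
made-up names for illustration), the task is already on the wait queue
when the last allocation attempt runs:

```python
def get_tag_slow_path(try_get, wait_queue, task):
    """Toy model of the allocation slow path described above.

    try_get stands in for one allocation attempt and returns a tag or
    None; wait_queue is a plain list. Both are assumptions for this
    sketch. Returns a tag, or None where the real code would sleep.
    """
    tag = try_get()               # first attempt
    if tag is not None:
        return tag
    wait_queue.append(task)       # queued as a sleeper before the retry
    tag = try_get()               # final retry: task is visible to wakers
    if tag is not None:
        wait_queue.remove(task)   # got a tag after all, leave the queue
        return tag
    return None                   # would sleep here, still on the queue
```

A completion that runs between the append and the final retry can
therefore find and wake this task even though it never went to sleep.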

So when two completion batches arrive, both may wake up the same wq,
because both completion paths can observe the same ->wake_index. Both
wake_up_nr() calls can return > 0, because adding a sleeper to the wq can
interleave with wake_up_nr(). The 16 completions then wake up just 2
sleepers, both added to the same wq.
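
The interleaving can be replayed with a small deterministic model. This
is only a sketch in Python, not the sbitmap code: the Model class,
make_model() and interleaved_batches() are hypothetical names, and the
8 wait queues with one sleeper each follow the example in the patch
description.

```python
from collections import deque

SBQ_WAIT_QUEUES = 8   # assumed number of wait queues in this model
WAKE_BATCH = 8        # wake_batch from the example in the patch description


class Model:
    """Toy stand-in for the wait queues and the shared wake index."""

    def __init__(self):
        self.wqs = [deque() for _ in range(SBQ_WAIT_QUEUES)]
        self.wake_index = 0
        self.woken = []

    def add_sleeper(self, wq, name):
        self.wqs[wq].append(name)

    def wake_up_nr(self, wq, nr):
        """Wake up to nr sleepers on wq and return how many were woken."""
        n = 0
        while n < nr and self.wqs[wq]:
            self.woken.append(self.wqs[wq].popleft())
            n += 1
        return n


def make_model():
    """One sleeper on every wait queue, as in the example."""
    m = Model()
    for i in range(SBQ_WAIT_QUEUES):
        m.add_sleeper(i, f"s{i}")
    return m


def interleaved_batches(m):
    """Two completion batches that both sample the same wake index."""
    idx_a = m.wake_index                        # batch A reads the index
    idx_b = m.wake_index                        # batch B reads it too
    woken_a = m.wake_up_nr(idx_a, WAKE_BATCH)   # only one sleeper is there
    m.add_sleeper(idx_a, "late")                # new sleeper slips in
    woken_b = m.wake_up_nr(idx_b, WAKE_BATCH)   # same wq, wakes "late"
    m.wake_index = (idx_a + 1) % SBQ_WAIT_QUEUES
    return idx_a, idx_b, woken_a, woken_b
```

Calling interleaved_batches(make_model()) in this model has both batches
return a non-zero count for wq 0, so 16 completions wake only 2 sleepers
while the sleepers on the other 7 wqs stay queued.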

If this happens on one wq with >= 8 sleepers, an IO hang will be
triggered if there are another two pending wqs.


Thanks, 
Ming


Thread overview: 13+ messages
2023-07-21  9:57 [RFC PATCH] sbitmap: fix batching wakeup Ming Lei
2023-07-21 10:40 ` Keith Busch
2023-07-21 10:50   ` Ming Lei
2023-07-21 17:38     ` David Jeffery
2023-07-21 11:51 ` Keith Busch
2023-07-21 16:35 ` Gabriel Krisman Bertazi
2023-07-22  2:42   ` Ming Lei
2023-07-21 17:29 ` Jens Axboe
2023-07-21 17:40 ` Jens Axboe
2023-08-02 16:05 ` Jan Kara
2023-08-08  8:18   ` Ming Lei [this message]
2023-08-08 10:30     ` Jan Kara
2024-01-15  9:51 ` Kemeng Shi