From: Omar Sandoval <osandov@osandov.com>
To: David Jeffery <djeffery@redhat.com>
Cc: linux-block@vger.kernel.org, John Pittman <jpittman@redhat.com>
Subject: Re: [PATCH] sbitmap: only queue kyber's wait callback if not already active
Date: Thu, 19 Dec 2019 13:41:42 -0800 [thread overview]
Message-ID: <20191219214142.GB826140@vader> (raw)
In-Reply-To: <20191217160024.GA23066@redhat>
On Tue, Dec 17, 2019 at 11:00:24AM -0500, David Jeffery wrote:
> Under heavy loads where the kyber I/O scheduler hits the token limits for
> its scheduling domains, kyber can become stuck. When active requests
> complete, kyber may not be woken up leaving the I/O requests in kyber
> stuck.
>
> This stuck state is due to a race condition with kyber and the sbitmap
> functions it uses to run a callback when enough requests have completed.
> The running of a sbt_wait callback can race with the attempt to insert the
> sbt_wait. Since sbitmap_del_wait_queue removes the sbt_wait from the list
> first then sets the sbq field to NULL, kyber can see the item as not on a
> list but the call to sbitmap_add_wait_queue will see sbq as non-NULL. This
> results in the sbt_wait being inserted onto the wait list but ws_active
> doesn't get incremented. So the sbitmap queue does not know there is a
> waiter on a wait list.
>
> Since sbitmap doesn't think there is a waiter, kyber may never be
> informed that there are domain tokens available and the I/O never advances.
> With the sbt_wait on a wait list, kyber believes it has an active waiter
> so cannot insert a new waiter when reaching the domain's full state.
>
> This race can be fixed by only adding the sbt_wait to the queue if the
> sbq field is NULL. If sbq is not NULL, there is already an action active
> which will trigger the re-running of kyber. Let it run and add the
> sbt_wait to the wait list if still needing to wait.
So the race here is:
Thread 1 Thread 2
kyber_domain_wake
sbitmap_del_wait_queue
list_del_init
atomic_dec sbq->ws_active
kyber_get_domain_token
list_empty_careful
sbitmap_add_wait_queue
if (!sqb_wait->sb) // false
add_wait_queue
sbq_wait->sbq = NULL
Now sbq_wait->sbq == NULL, sbq->ws_active = 0, and
!list_empty(domain_wait), so sbq_wake_ptr returns NULL and
sbitmap_queue_wake_up does nothing.
I get the feeling that sbitmap_{add,del}_wait_queue need some memory
barriers... But ignoring that, this fix seems right.
Reviewed-by: Omar Sandoval <osandov@fb.com>
P.S. s/sbt_wait/sbq_wait/g in the commit message.
prev parent reply other threads:[~2019-12-19 21:41 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-17 16:00 [PATCH] sbitmap: only queue kyber's wait callback if not already active David Jeffery
2019-12-18 1:14 ` Jens Axboe
2019-12-19 21:41 ` Omar Sandoval [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191219214142.GB826140@vader \
--to=osandov@osandov.com \
--cc=djeffery@redhat.com \
--cc=jpittman@redhat.com \
--cc=linux-block@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.