All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: linux-block@vger.kernel.org
Cc: osandov@fb.com, efault@gmx.de, paolo.valente@linaro.org,
	Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 9/9] sbitmap: fix race in wait batch accounting
Date: Thu, 10 May 2018 10:24:27 -0600	[thread overview]
Message-ID: <1525969467-12476-10-git-send-email-axboe@kernel.dk> (raw)
In-Reply-To: <1525969467-12476-1-git-send-email-axboe@kernel.dk>

If we have multiple callers of sbq_wake_up(), we can end up in a
situation where the wait_cnt will continually go more and more
negative. Consider the case where our wake batch is 1, hence
wait_cnt will start out as 1.

wait_cnt == 1

CPU0				CPU1
atomic_dec_return(), cnt == 0
				atomic_dec_return(), cnt == -1
				cmpxchg(-1, 0) (succeeds)
				[wait_cnt now 0]
cmpxchg(0, 1) (fails)

This ends up with wait_cnt being 0, we'll wakeup immediately
next time. Going through the same loop as above again, and
we'll have wait_cnt -1.

For the case where we have a larger wake batch, the only
difference is that the starting point will be higher. We'll
still end up with continually smaller batch wakeups, which
defeats the purpose of the rolling wakeups.

Always reset the wait_cnt to the batch value. Then it doesn't
matter who wins the race. But ensure that whomever does win
the race is the one that increments the ws index as well.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 lib/sbitmap.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 8f0950fbaa5c..ffc36e695f50 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -478,22 +478,26 @@ static void sbq_wake_up(struct sbitmap_queue *sbq)
 
 	wait_cnt = atomic_dec_return(&ws->wait_cnt);
 	if (wait_cnt <= 0) {
+		int ret;
+
 		wake_batch = READ_ONCE(sbq->wake_batch);
+
 		/*
 		 * Pairs with the memory barrier in sbitmap_queue_resize() to
 		 * ensure that we see the batch size update before the wait
 		 * count is reset.
 		 */
 		smp_mb__before_atomic();
+
 		/*
-		 * If there are concurrent callers to sbq_wake_up(), the last
-		 * one to decrement the wait count below zero will bump it back
-		 * up. If there is a concurrent resize, the count reset will
-		 * either cause the cmpxchg to fail or overwrite after the
-		 * cmpxchg.
+		 * For concurrent callers (either resize, or just multiple
+		 * concurrent wakeups of waiters), the one that wins the
+		 * cmpxchg() race increments the waiter index.
 		 */
-		atomic_cmpxchg(&ws->wait_cnt, wait_cnt, wait_cnt + wake_batch);
-		sbq_index_atomic_inc(&sbq->wake_index);
+		ret = atomic_cmpxchg(&ws->wait_cnt, wait_cnt, wake_batch);
+		if (ret == wait_cnt)
+			sbq_index_atomic_inc(&sbq->wake_index);
+
 		wake_up_nr(&ws->wait, wake_batch);
 	}
 }
-- 
2.7.4

      parent reply	other threads:[~2018-05-10 16:24 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-10 16:24 [PATCHSET v3 0/9] blk-mq-sched and sbitmap shallow depth Jens Axboe
2018-05-10 16:24 ` [PATCH 1/9] blk-mq: don't call into depth limiting for reserved tags Jens Axboe
2018-05-10 16:59   ` Omar Sandoval
2018-05-10 16:24 ` [PATCH 2/9] bfq-iosched: don't worry about reserved tags in limit_depth Jens Axboe
2018-05-10 16:59   ` Omar Sandoval
2018-05-10 16:24 ` [PATCH 3/9] bfq: calculate shallow depths at init time Jens Axboe
2018-05-10 17:00   ` Omar Sandoval
2018-05-10 16:24 ` [PATCH 4/9] bfq-iosched: remove unused variable Jens Axboe
2018-05-10 17:00   ` Omar Sandoval
2018-05-10 16:24 ` [PATCH 5/9] sbitmap: fix missed wakeups caused by sbitmap_queue_get_shallow() Jens Axboe
2018-05-10 17:01   ` Omar Sandoval
2018-05-10 17:09     ` Jens Axboe
2018-05-10 16:24 ` [PATCH 6/9] sbitmap: warn if using smaller shallow depth than was setup Jens Axboe
2018-05-10 17:02   ` Omar Sandoval
2018-05-10 17:09     ` Jens Axboe
2018-05-10 16:24 ` [PATCH 7/9] bfq-iosched: update shallow depth to smallest one used Jens Axboe
2018-05-10 17:03   ` Omar Sandoval
2018-05-10 16:24 ` [PATCH 8/9] kyber-iosched: update shallow depth when setting up hardware queue Jens Axboe
2018-05-10 17:03   ` Omar Sandoval
2018-05-10 16:24 ` Jens Axboe [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1525969467-12476-10-git-send-email-axboe@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=efault@gmx.de \
    --cc=linux-block@vger.kernel.org \
    --cc=osandov@fb.com \
    --cc=paolo.valente@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.