public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH RESEND v2 1/2] block: introduce init_wait_func()
@ 2025-02-08  9:04 Muchun Song
  2025-02-08  9:04 ` [PATCH RESEND v2 2/2] block: refactor rq_qos_wait() Muchun Song
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Muchun Song @ 2025-02-08  9:04 UTC (permalink / raw)
  To: axboe, tj, yukuai1
  Cc: chengming.zhou, muchun.song, linux-block, linux-kernel,
	Muchun Song, Ingo Molnar, Peter Zijlstra

There is already a macro DEFINE_WAIT_FUNC() to declare a wait_queue_entry
with a specified waking function. But there is not a counterpart for
initializing one wait_queue_entry with a specified waking function. So
introducing init_wait_func() for this, which also could be used in iocost
and rq-qos. Using default_wake_function() in rq_qos_wait() to wake up
waiters, which could remove ->task field from rq_qos_wait_data.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 block/blk-iocost.c   |  3 +--
 block/blk-rq-qos.c   | 14 +++++++-------
 include/linux/wait.h |  6 ++++--
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index a5894ec9696e7..2f611649a2edb 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -2718,8 +2718,7 @@ static void ioc_rqos_throttle(struct rq_qos *rqos, struct bio *bio)
 	 * All waiters are on iocg->waitq and the wait states are
 	 * synchronized using waitq.lock.
 	 */
-	init_waitqueue_func_entry(&wait.wait, iocg_wake_fn);
-	wait.wait.private = current;
+	init_wait_func(&wait.wait, iocg_wake_fn);
 	wait.bio = bio;
 	wait.abs_cost = abs_cost;
 	wait.committed = false;	/* will be set true by waker */
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index eb9618cd68adf..0b1245d368cd1 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -196,7 +196,6 @@ bool rq_depth_scale_down(struct rq_depth *rqd, bool hard_throttle)
 
 struct rq_qos_wait_data {
 	struct wait_queue_entry wq;
-	struct task_struct *task;
 	struct rq_wait *rqw;
 	acquire_inflight_cb_t *cb;
 	void *private_data;
@@ -218,7 +217,12 @@ static int rq_qos_wake_function(struct wait_queue_entry *curr,
 		return -1;
 
 	data->got_token = true;
-	wake_up_process(data->task);
+	/*
+	 * autoremove_wake_function() removes the wait entry only when it
+	 * actually changed the task state. We want the wait always removed.
+	 * Remove explicitly and use default_wake_function().
+	 */
+	default_wake_function(curr, mode, wake_flags, key);
 	list_del_init_careful(&curr->entry);
 	return 1;
 }
@@ -244,11 +248,6 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 		 cleanup_cb_t *cleanup_cb)
 {
 	struct rq_qos_wait_data data = {
-		.wq = {
-			.func	= rq_qos_wake_function,
-			.entry	= LIST_HEAD_INIT(data.wq.entry),
-		},
-		.task = current,
 		.rqw = rqw,
 		.cb = acquire_inflight_cb,
 		.private_data = private_data,
@@ -259,6 +258,7 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 	if (!has_sleeper && acquire_inflight_cb(rqw, private_data))
 		return;
 
+	init_wait_func(&data.wq, rq_qos_wake_function);
 	has_sleeper = !prepare_to_wait_exclusive(&rqw->wait, &data.wq,
 						 TASK_UNINTERRUPTIBLE);
 	do {
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 6d90ad9744087..2bdc8f47963bf 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -1207,14 +1207,16 @@ int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, i
 
 #define DEFINE_WAIT(name) DEFINE_WAIT_FUNC(name, autoremove_wake_function)
 
-#define init_wait(wait)								\
+#define init_wait_func(wait, function)						\
 	do {									\
 		(wait)->private = current;					\
-		(wait)->func = autoremove_wake_function;			\
+		(wait)->func = function;					\
 		INIT_LIST_HEAD(&(wait)->entry);					\
 		(wait)->flags = 0;						\
 	} while (0)
 
+#define init_wait(wait)	init_wait_func(wait, autoremove_wake_function)
+
 typedef int (*task_call_f)(struct task_struct *p, void *arg);
 extern int task_call_func(struct task_struct *p, task_call_f func, void *arg);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH RESEND v2 2/2] block: refactor rq_qos_wait()
  2025-02-08  9:04 [PATCH RESEND v2 1/2] block: introduce init_wait_func() Muchun Song
@ 2025-02-08  9:04 ` Muchun Song
  2025-02-10 21:07 ` [PATCH RESEND v2 1/2] block: introduce init_wait_func() Tejun Heo
  2025-02-11 20:04 ` Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Muchun Song @ 2025-02-08  9:04 UTC (permalink / raw)
  To: axboe, tj, yukuai1
  Cc: chengming.zhou, muchun.song, linux-block, linux-kernel,
	Muchun Song, Yu Kuai

When rq_qos_wait() is first introduced, it is easy to understand. But
with some bug fixes applied, it is not easy for newcomers to understand
the whole logic under those fixes. In this patch, rq_qos_wait() is
refactored and more comments are added for better understanding. There
are 3 points for the improvement:

1) Use waitqueue_active() instead of wq_has_sleeper() to eliminate
   unnecessary memory barrier in wq_has_sleeper() which is supposed
   to be used in waker side. In this case, we do need the barrier.
   So use the cheaper one to locklessly test for waiters on the queue.

2) Remove acquire_inflight_cb() logic for the first waiter out of the
   while loop to make the code clear.

3) Add more comments to explain how to sync with different waiters and
   the waker.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-rq-qos.c | 68 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 47 insertions(+), 21 deletions(-)

diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index 0b1245d368cd1..5d995d389eaf5 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -223,6 +223,14 @@ static int rq_qos_wake_function(struct wait_queue_entry *curr,
 	 * Remove explicitly and use default_wake_function().
 	 */
 	default_wake_function(curr, mode, wake_flags, key);
+	/*
+	 * Note that the order of operations is important as finish_wait()
+	 * tests whether @curr is removed without grabbing the lock. This
+	 * should be the last thing to do to make sure we will not have a
+	 * UAF access to @data. And the semantics of memory barrier in it
+	 * also make sure the waiter will see the latest @data->got_token
+	 * once list_empty_careful() in finish_wait() returns true.
+	 */
 	list_del_init_careful(&curr->entry);
 	return 1;
 }
@@ -248,37 +256,55 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 		 cleanup_cb_t *cleanup_cb)
 {
 	struct rq_qos_wait_data data = {
-		.rqw = rqw,
-		.cb = acquire_inflight_cb,
-		.private_data = private_data,
+		.rqw		= rqw,
+		.cb		= acquire_inflight_cb,
+		.private_data	= private_data,
+		.got_token	= false,
 	};
-	bool has_sleeper;
+	bool first_waiter;
 
-	has_sleeper = wq_has_sleeper(&rqw->wait);
-	if (!has_sleeper && acquire_inflight_cb(rqw, private_data))
+	/*
+	 * If there are no waiters in the waiting queue, try to increase the
+	 * inflight counter if we can. Otherwise, prepare for adding ourselves
+	 * to the waiting queue.
+	 */
+	if (!waitqueue_active(&rqw->wait) && acquire_inflight_cb(rqw, private_data))
 		return;
 
 	init_wait_func(&data.wq, rq_qos_wake_function);
-	has_sleeper = !prepare_to_wait_exclusive(&rqw->wait, &data.wq,
+	first_waiter = prepare_to_wait_exclusive(&rqw->wait, &data.wq,
 						 TASK_UNINTERRUPTIBLE);
+	/*
+	 * Make sure there is at least one inflight process; otherwise, waiters
+	 * will never be woken up. Since there may be no inflight process before
+	 * adding ourselves to the waiting queue above, we need to try to
+	 * increase the inflight counter for ourselves. And it is sufficient to
+	 * guarantee that at least the first waiter to enter the waiting queue
+	 * will re-check the waiting condition before going to sleep, thus
+	 * ensuring forward progress.
+	 */
+	if (!data.got_token && first_waiter && acquire_inflight_cb(rqw, private_data)) {
+		finish_wait(&rqw->wait, &data.wq);
+		/*
+		 * We raced with rq_qos_wake_function() getting a token,
+		 * which means we now have two. Put our local token
+		 * and wake anyone else potentially waiting for one.
+		 *
+		 * Enough memory barrier in list_empty_careful() in
+		 * finish_wait() is paired with list_del_init_careful()
+		 * in rq_qos_wake_function() to make sure we will see
+		 * the latest @data->got_token.
+		 */
+		if (data.got_token)
+			cleanup_cb(rqw, private_data);
+		return;
+	}
+
+	/* we are now relying on the waker to increase our inflight counter. */
 	do {
-		/* The memory barrier in set_current_state saves us here. */
 		if (data.got_token)
 			break;
-		if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) {
-			finish_wait(&rqw->wait, &data.wq);
-
-			/*
-			 * We raced with rq_qos_wake_function() getting a token,
-			 * which means we now have two. Put our local token
-			 * and wake anyone else potentially waiting for one.
-			 */
-			if (data.got_token)
-				cleanup_cb(rqw, private_data);
-			return;
-		}
 		io_schedule();
-		has_sleeper = true;
 		set_current_state(TASK_UNINTERRUPTIBLE);
 	} while (1);
 	finish_wait(&rqw->wait, &data.wq);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH RESEND v2 1/2] block: introduce init_wait_func()
  2025-02-08  9:04 [PATCH RESEND v2 1/2] block: introduce init_wait_func() Muchun Song
  2025-02-08  9:04 ` [PATCH RESEND v2 2/2] block: refactor rq_qos_wait() Muchun Song
@ 2025-02-10 21:07 ` Tejun Heo
  2025-02-11 20:04 ` Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2025-02-10 21:07 UTC (permalink / raw)
  To: Muchun Song
  Cc: axboe, yukuai1, chengming.zhou, muchun.song, linux-block,
	linux-kernel, Ingo Molnar, Peter Zijlstra

On Sat, Feb 08, 2025 at 05:04:15PM +0800, Muchun Song wrote:
> There is already a macro DEFINE_WAIT_FUNC() to declare a wait_queue_entry
> with a specified waking function. But there is not a counterpart for
> initializing one wait_queue_entry with a specified waking function. So
> introducing init_wait_func() for this, which also could be used in iocost
> and rq-qos. Using default_wake_function() in rq_qos_wait() to wake up
> waiters, which could remove ->task field from rq_qos_wait_data.
> 
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

For rq-qos / blk-iocost part:

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH RESEND v2 1/2] block: introduce init_wait_func()
  2025-02-08  9:04 [PATCH RESEND v2 1/2] block: introduce init_wait_func() Muchun Song
  2025-02-08  9:04 ` [PATCH RESEND v2 2/2] block: refactor rq_qos_wait() Muchun Song
  2025-02-10 21:07 ` [PATCH RESEND v2 1/2] block: introduce init_wait_func() Tejun Heo
@ 2025-02-11 20:04 ` Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2025-02-11 20:04 UTC (permalink / raw)
  To: tj, yukuai1, Muchun Song
  Cc: chengming.zhou, muchun.song, linux-block, linux-kernel,
	Ingo Molnar, Peter Zijlstra


On Sat, 08 Feb 2025 17:04:15 +0800, Muchun Song wrote:
> There is already a macro DEFINE_WAIT_FUNC() to declare a wait_queue_entry
> with a specified waking function. But there is not a counterpart for
> initializing one wait_queue_entry with a specified waking function. So
> introducing init_wait_func() for this, which also could be used in iocost
> and rq-qos. Using default_wake_function() in rq_qos_wait() to wake up
> waiters, which could remove ->task field from rq_qos_wait_data.
> 
> [...]

Applied, thanks!

[1/2] block: introduce init_wait_func()
      commit: 36d03cb3277e29beedb87b8efb1e4da02b26e0c0
[2/2] block: refactor rq_qos_wait()
      commit: a052bfa636bb763786b9dc13a301a59afb03787a

Best regards,
-- 
Jens Axboe




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-02-11 20:04 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-08  9:04 [PATCH RESEND v2 1/2] block: introduce init_wait_func() Muchun Song
2025-02-08  9:04 ` [PATCH RESEND v2 2/2] block: refactor rq_qos_wait() Muchun Song
2025-02-10 21:07 ` [PATCH RESEND v2 1/2] block: introduce init_wait_func() Tejun Heo
2025-02-11 20:04 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox