From: Hao Xu <hao.xu@linux.dev>
To: io-uring@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>,
	Pavel Begunkov <asml.silence@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	Wanpeng Li <wanpengli@tencent.com>
Subject: [PATCH 13/19] io-wq: add wq->owner for uringlet mode
Date: Fri, 19 Aug 2022 23:27:32 +0800
Message-ID: <20220819152738.1111255-14-hao.xu@linux.dev>
In-Reply-To: <20220819152738.1111255-1-hao.xu@linux.dev>

From: Hao Xu <howeyxu@tencent.com>

In uringlet mode, exactly one worker is allowed to submit sqes at any
given time. nr_running is not a good fit for enforcing that. Add a
member wq->owner and a lock protecting it; this avoids race conditions
between workers.

Signed-off-by: Hao Xu <howeyxu@tencent.com>
---
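For readers following the hand-off protocol, here is a minimal userspace
sketch of the three owner states (NULL, a "transmit" sentinel, or a
concrete worker). It is not the kernel code: pthread_mutex_t stands in
for raw_spinlock_t, and struct wq, struct worker, wq_init(),
wq_offload(), wq_try_claim() and wq_handoff() are hypothetical names
chosen for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct io_worker. */
struct worker { int id; };

/* Sentinel mirroring IO_WQ_OWNER_TRANSMIT: ownership is in flight. */
#define OWNER_TRANSMIT ((struct worker *)-1)

/* Hypothetical stand-in for the two fields added to struct io_wq. */
struct wq {
	pthread_mutex_t lock;	/* assumption: replaces raw_spinlock_t */
	struct worker *owner;	/* NULL, OWNER_TRANSMIT, or a worker */
};

static void wq_init(struct wq *wq)
{
	int rc = pthread_mutex_init(&wq->lock, NULL);

	assert(rc == 0);
	(void)rc;
	wq->owner = NULL;
}

/* Models the check added to io_uringlet_offload(): start only if unowned. */
static bool wq_offload(struct wq *wq)
{
	bool started = false;

	pthread_mutex_lock(&wq->lock);
	if (!wq->owner) {
		wq->owner = OWNER_TRANSMIT;
		started = true;
	}
	pthread_mutex_unlock(&wq->lock);
	return started;
}

/* Models the check at the top of the io_wqe_worker_let() loop. */
static bool wq_try_claim(struct wq *wq, struct worker *self)
{
	bool claimed = false;

	pthread_mutex_lock(&wq->lock);
	if (!wq->owner || wq->owner == OWNER_TRANSMIT || wq->owner == self) {
		wq->owner = self;
		claimed = true;
	}
	pthread_mutex_unlock(&wq->lock);
	return claimed;	/* false: another worker owns submission, go sleep */
}

/* Models io_wqe_dec_running(): the owner is about to block, pass it on. */
static void wq_handoff(struct wq *wq)
{
	pthread_mutex_lock(&wq->lock);
	wq->owner = OWNER_TRANSMIT;
	pthread_mutex_unlock(&wq->lock);
}
```

In this model a worker that fails wq_try_claim() corresponds to the
"goto sleep" path in the patch, and wq_handoff() followed by another
worker's successful wq_try_claim() corresponds to the owner being
scheduled out and a freshly activated worker taking over.
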
 io_uring/io-wq.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 00a1cdefb787..9fcaeea7a478 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -96,6 +96,9 @@ struct io_wq {
 
 	void *private;
 
+	raw_spinlock_t lock;
+	struct io_worker *owner;
+
 	struct io_wqe *wqes[];
 };
 
@@ -381,6 +384,8 @@ static inline bool io_worker_test_submit(struct io_worker *worker)
 	return worker->flags & IO_WORKER_F_SUBMIT;
 }
 
+#define IO_WQ_OWNER_TRANSMIT	((struct io_worker *)-1)
+
 static void io_wqe_dec_running(struct io_worker *worker)
 {
 	struct io_wqe_acct *acct = io_wqe_get_acct(worker);
@@ -401,6 +406,10 @@ static void io_wqe_dec_running(struct io_worker *worker)
 
 		io_uringlet_end(wq->private);
 		io_worker_set_scheduled(worker);
+		raw_spin_lock(&wq->lock);
+		wq->owner = IO_WQ_OWNER_TRANSMIT;
+		raw_spin_unlock(&wq->lock);
+
 		raw_spin_lock(&wqe->lock);
 		rcu_read_lock();
 		activated = io_wqe_activate_free_worker(wqe, acct);
@@ -674,6 +683,17 @@ static void io_wqe_worker_let(struct io_worker *worker)
 
 	while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
 		unsigned int empty_count = 0;
+		struct io_worker *owner;
+
+		raw_spin_lock(&wq->lock);
+		owner = wq->owner;
+		if (owner && owner != IO_WQ_OWNER_TRANSMIT && owner != worker) {
+			raw_spin_unlock(&wq->lock);
+			set_current_state(TASK_INTERRUPTIBLE);
+			goto sleep;
+		}
+		wq->owner = worker;
+		raw_spin_unlock(&wq->lock);
 
 		__io_worker_busy(wqe, worker);
 		set_current_state(TASK_INTERRUPTIBLE);
@@ -697,6 +717,7 @@ static void io_wqe_worker_let(struct io_worker *worker)
 			cond_resched();
 		} while (1);
 
+sleep:
 		raw_spin_lock(&wqe->lock);
 		__io_worker_idle(wqe, worker);
 		raw_spin_unlock(&wqe->lock);
@@ -780,6 +801,14 @@ int io_uringlet_offload(struct io_wq *wq)
 	struct io_wqe_acct *acct = io_get_acct(wqe, true);
 	bool waken;
 
+	raw_spin_lock(&wq->lock);
+	if (wq->owner) {
+		raw_spin_unlock(&wq->lock);
+		return 0;
+	}
+	wq->owner = IO_WQ_OWNER_TRANSMIT;
+	raw_spin_unlock(&wq->lock);
+
 	raw_spin_lock(&wqe->lock);
 	rcu_read_lock();
 	waken = io_wqe_activate_free_worker(wqe, acct);
@@ -1248,6 +1277,7 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 	wq->free_work = data->free_work;
 	wq->do_work = data->do_work;
 	wq->private = data->private;
+	raw_spin_lock_init(&wq->lock);
 
 	ret = -ENOMEM;
 	for_each_node(node) {
-- 
2.25.1

