From: Ming Lei <ming.lei@redhat.com>
To: Uday Shankar <ushankar@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	Caleb Sander Mateos <csander@purestorage.com>
Subject: Re: [PATCH v3 2/2] ublk: require unique task per io instead of unique task per hctx
Date: Fri, 11 Apr 2025 16:53:19 +0800
Message-ID: <Z_jYfwFN_AYkUNJK@fedora>
In-Reply-To: <20250410-ublk_task_per_io-v3-2-b811e8f4554a@purestorage.com>

On Thu, Apr 10, 2025 at 06:17:51PM -0600, Uday Shankar wrote:
> Currently, ublk_drv associates to each hardware queue (hctx) a unique
> task (called the queue's ubq_daemon) which is allowed to issue
> COMMIT_AND_FETCH commands against the hctx. If any other task attempts
> to do so, the command fails immediately with EINVAL. When considered
> together with the block layer architecture, the result is that for each
> CPU C on the system, there is a unique ublk server thread which is
> allowed to handle I/O submitted on CPU C. This can lead to suboptimal
> performance under imbalanced load generation. For an extreme example,
> suppose all the load is generated on CPUs mapping to a single ublk
> server thread. Then that thread may be fully utilized and become the
> bottleneck in the system, while other ublk server threads are totally
> idle.
> 
> This issue can also be addressed directly in the ublk server without
> kernel support by having threads dequeue I/Os and pass them around to
> ensure even load. But this solution requires inter-thread communication
> at least twice for each I/O (submission and completion), which is
> generally a bad pattern for performance. The problem gets even worse
> with zero copy, as more inter-thread communication would be required to
> have the buffer register/unregister calls to come from the correct
> thread.

Agree.

The limitation actually comes from the current implementation: both
REGISTER_IO_BUF and UNREGISTER_IO_BUF should be fine to run from another
pthread, because the request buffer 'meta' is actually read-only.

> 
> Therefore, address this issue in ublk_drv by requiring a unique task per
> I/O instead of per queue/hctx. Imbalanced load can then be balanced
> across all ublk server threads by having threads issue FETCH_REQs in a
> round-robin manner. As a small toy example, consider a system with a
> single ublk device having 2 queues, each of queue depth 4. A ublk server
> having 4 threads could issue its FETCH_REQs against this device as
> follows (where each entry is the qid,tag pair that the FETCH_REQ
> targets):
> 
> poller thread:	T0	T1	T2	T3
> 		0,0	0,1	0,2	0,3
> 		1,3	1,0	1,1	1,2
> 
> Since tags appear to be allocated in sequential chunks, this setup
> provides a rough approximation to distributing I/Os round-robin across
> all ublk server threads, while letting I/Os stay fully thread-local.
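
As a rough illustration of the table above (a sketch only, not code from
the patch): the quoted assignment is consistent with handing each
(qid, tag) pair to thread (qid + tag) % nr_threads. The function name and
its parameters are hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helper: which server thread issues the FETCH_REQ for
 * (qid, tag) in the toy layout above. Shifting each queue's starting
 * thread by qid reproduces the table; this rule is an assumption read
 * off the example, not something ublk_drv mandates.
 */
static unsigned int toy_fetch_thread(unsigned int qid, unsigned int tag,
				     unsigned int nr_threads)
{
	assert(nr_threads > 0);
	return (qid + tag) % nr_threads;
}
```

With nr_threads = 4, toy_fetch_thread(1, 3, 4) gives 0 and
toy_fetch_thread(1, 0, 4) gives 1, matching the second row of the table.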

BLK_MQ_F_TAG_RR can be set for this purpose, so is it possible to make
this a feature, and set BLK_MQ_F_TAG_RR for that feature?

Also, can you share what the preferred implementation is for the ublk server?

I think a pthread per io may not be good; maybe partition the tag space
into a fixed range per pthread?
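
To make the fixed-range idea concrete, here is a minimal sketch assuming
each pthread owns one contiguous chunk of the queue's tags. The helper
name, its parameters, and the ceil-division chunking are all assumptions
for illustration, not an existing ublk interface.

```c
#include <assert.h>

/*
 * Hypothetical helper: map a tag to the thread owning its range.
 * Each thread gets a contiguous chunk of ceil(depth / nr_threads) tags;
 * when depth is not a multiple of nr_threads, the last threads get
 * fewer tags (possibly none).
 */
static unsigned int tag_range_owner(unsigned int tag, unsigned int depth,
				    unsigned int nr_threads)
{
	unsigned int chunk;

	assert(nr_threads > 0 && tag < depth);
	chunk = (depth + nr_threads - 1) / nr_threads;
	return tag / chunk;
}
```

For example, with depth 128 and 4 threads, tags 0-31 map to thread 0 and
tags 96-127 map to thread 3.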

The `ublk_queue' reference is basically read-only in the IO code path, so
I think it needs to be declared explicitly as a 'const' pointer in the IO
code/uring code path first. Otherwise, it is easy to trigger a data race
with a per-io task, since that path is lockless.
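
A minimal sketch of what the 'const' declaration buys, using a
hypothetical stand-in struct (not the real ublk_queue layout) and a
hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in; not the real struct ublk_queue layout. */
struct demo_queue {
	int q_id;
	int q_depth;
};

/*
 * IO-path helper taking the queue as a pointer to const: it can read
 * fields, but a store such as "q->q_depth = 0;" in this function would
 * be rejected by the compiler.
 */
static bool demo_tag_is_valid(const struct demo_queue *q, int tag)
{
	assert(q);
	return tag >= 0 && tag < q->q_depth;
}
```

This only documents and enforces the read-only intent at compile time; it
does not by itself add any synchronization.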


Thanks, 
Ming


