From: Uday Shankar <ushankar@purestorage.com>
To: Ming Lei <ming.lei@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	 Caleb Sander Mateos <csander@purestorage.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Shuah Khan <shuah@kernel.org>,  Jonathan Corbet <corbet@lwn.net>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	 linux-kselftest@vger.kernel.org, linux-doc@vger.kernel.org,
	 Uday Shankar <ushankar@purestorage.com>
Subject: [PATCH v8 0/9] ublk: decouple server threads from ublk_queues/hctxs
Date: Thu, 29 May 2025 17:47:09 -0600
Message-ID: <20250529-ublk_task_per_io-v8-0-e9d3b119336a@purestorage.com>

This patch set aims to allow ublk server threads to better balance load
amongst themselves by decoupling server threads from ublk_queues/hctxs,
so that multiple threads can service I/Os that are issued from a single
CPU. This can improve performance for workloads in which ublk server CPU
is a bottleneck, and for which load is issued from CPUs which are not
balanced across ublk_queues/hctxs.

Performance
-----------

First create two ublk devices with:

ublkb0: ./kublk add -t null -q 2 --nthreads 2
ublkb1: ./kublk add -t null -q 2 --nthreads 2 --per_io_tasks

Then run load with:

taskset -c 1 fio/t/io_uring -r5 -p0 /dev/ublkb0    -> 1.90M IOPS
taskset -c 1 fio/t/io_uring -r5 -p0 /dev/ublkb1    -> 2.18M IOPS

Since ublkb1 was created with --per_io_tasks, the second command is able
to make use of both ublk server worker threads and therefore achieves
higher maximum throughput.

Caveats:
- This testing was done on a system with 2 NUMA nodes. The penalty of
  having I/O cross a NUMA (or LLC) boundary in the per_io_tasks case is
  quite high, so these numbers were obtained after moving all ublk
  server threads and the application threads to CPUs on the same NUMA
  node/LLC.
- One might expect the scaling to be linear: because ublkb1 can make
  use of twice as many ublk server threads, it should be able to drive
  twice the throughput. However, this is not the case (the improvement
  is ~15%), and this needs further investigation.
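
As a rough check of the figures above, the short Python sketch below
recomputes the relative improvement and compares it with the naive
"twice the threads, twice the throughput" expectation. The variable
names are illustrative only, and the 2x ideal is the assumption stated
in the second caveat, not a measured value.

```python
# Throughput figures quoted above, in millions of IOPS.
baseline_miops = 1.90   # ublkb0: per-queue daemons
per_io_miops = 2.18     # ublkb1: created with --per_io_tasks
server_threads = 2      # threads ublkb1 can use for load from one CPU

# Relative improvement of ublkb1 over ublkb0.
speedup = per_io_miops / baseline_miops
gain_pct = (speedup - 1.0) * 100.0

# Naive linear-scaling expectation (assumption: 2 threads => 2x).
ideal_miops = baseline_miops * server_threads
efficiency_pct = per_io_miops / ideal_miops * 100.0

print(f"speedup:    {speedup:.3f}x (+{gain_pct:.1f}%)")
print(f"ideal:      {ideal_miops:.2f}M IOPS")
print(f"efficiency: {efficiency_pct:.1f}% of linear scaling")
```

The gain works out to roughly 14.7%, consistent with the ~15% quoted
above, and the measured throughput is a little under 60% of what linear
scaling would give.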

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
---
Changes in v8:
- Fix queue_rqs batch dispatch OOPS when dispatching a list of requests
  associated to > 1 ublk_queue (Ming Lei, Caleb Sander Mateos)
- Simplify queue_rqs (Caleb Sander Mateos)
- Narrow a couple of types (Ming Lei)
- Add stress test for per io daemons (Ming Lei)
- Link to v7: https://lore.kernel.org/r/20250527-ublk_task_per_io-v7-0-cbdbaf283baa@purestorage.com

Changes in v7:
- Fix queue_rqs batch dispatch for per-io daemons
- Kick round-robin tag allocation changes to a followup
- Add explicit feature flag for per-task daemons (Ming Lei, Caleb Sander
  Mateos)
- Move some variable assignments to avoid redundant computation (Caleb
  Sander Mateos)
- Switch from storing pointers in ublk_io to computing based on address
  with container_of in a couple places (Ming Lei)
- Link to v6: https://lore.kernel.org/r/20250507-ublk_task_per_io-v6-0-a2a298783c01@purestorage.com

Changes in v6:
- Add a feature flag for this feature, called UBLK_F_RR_TAGS (Ming Lei)
- Add test for this feature (Ming Lei)
- Add documentation for this feature (Ming Lei)
- Link to v5: https://lore.kernel.org/r/20250416-ublk_task_per_io-v5-0-9261ad7bff20@purestorage.com

Changes in v5:
- Set io->task before ublk_mark_io_ready (Caleb Sander Mateos)
- Set io->task atomically, read it atomically when needed
- Return 0 on success from command-specific helpers in
  __ublk_ch_uring_cmd (Caleb Sander Mateos)
- Rename ublk_handle_need_get_data to ublk_get_data (Caleb Sander
  Mateos)
- Link to v4: https://lore.kernel.org/r/20250415-ublk_task_per_io-v4-0-54210b91a46f@purestorage.com

Changes in v4:
- Drop "ublk: properly serialize all FETCH_REQs" since Ming is taking it
  in another set
- Prevent data races by marking data structures which should be
  read-only in the I/O path as const (Ming Lei)
- Link to v3: https://lore.kernel.org/r/20250410-ublk_task_per_io-v3-0-b811e8f4554a@purestorage.com

Changes in v3:
- Check for UBLK_IO_FLAG_ACTIVE on I/O again after taking lock to ensure
  that two concurrent FETCH_REQs on the same I/O can't succeed (Caleb
  Sander Mateos)
- Link to v2: https://lore.kernel.org/r/20250408-ublk_task_per_io-v2-0-b97877e6fd50@purestorage.com

Changes in v2:
- Remove changes split into other patches
- To ease error handling/synchronization, associate each I/O (instead of
  each queue) to the last task that issues a FETCH_REQ against it. Only
  that task is allowed to operate on the I/O.
- Link to v1: https://lore.kernel.org/r/20241002224437.3088981-1-ushankar@purestorage.com

---
Uday Shankar (9):
      ublk: have a per-io daemon instead of a per-queue daemon
      selftests: ublk: kublk: plumb q_id in io_uring user_data
      selftests: ublk: kublk: tie sqe allocation to io instead of queue
      selftests: ublk: kublk: lift queue initialization out of thread
      selftests: ublk: kublk: move per-thread data out of ublk_queue
      selftests: ublk: kublk: decouple ublk_queues from ublk server threads
      selftests: ublk: add functional test for per io daemons
      selftests: ublk: add stress test for per io daemons
      Documentation: ublk: document UBLK_F_PER_IO_DAEMON

 Documentation/block/ublk.rst                       |  35 ++-
 drivers/block/ublk_drv.c                           | 111 +++----
 include/uapi/linux/ublk_cmd.h                      |   9 +
 tools/testing/selftests/ublk/Makefile              |   2 +
 tools/testing/selftests/ublk/fault_inject.c        |   4 +-
 tools/testing/selftests/ublk/file_backed.c         |  20 +-
 tools/testing/selftests/ublk/kublk.c               | 344 ++++++++++++++-------
 tools/testing/selftests/ublk/kublk.h               |  73 +++--
 tools/testing/selftests/ublk/null.c                |  22 +-
 tools/testing/selftests/ublk/stripe.c              |  17 +-
 tools/testing/selftests/ublk/test_common.sh        |   5 +
 tools/testing/selftests/ublk/test_generic_12.sh    |  55 ++++
 tools/testing/selftests/ublk/test_stress_06.sh     |  36 +++
 .../selftests/ublk/trace/count_ios_per_tid.bt      |  11 +
 14 files changed, 512 insertions(+), 232 deletions(-)
---
base-commit: 533c87e2ed742454957f14d7bef9f48d5a72e72d
change-id: 20250408-ublk_task_per_io-c693cf608d7a

Best regards,
-- 
Uday Shankar <ushankar@purestorage.com>

