public inbox for linux-block@vger.kernel.org
* [PATCH V3 0/5] ublk: NUMA-aware memory allocation
@ 2025-10-29  3:10 Ming Lei
  2025-10-29  3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Hi Jens,

The first two patches implement NUMA-aware memory allocation in the ublk
driver.

The last two patches implement the same idea in the ublk selftest utility.

`taskset -c 0-31 ~/git/fio/t/io_uring -p0 -n16 -r 40 /dev/ublkb0` shows
a 5%~10% IOPS improvement on an AMD Zen4 dual-socket machine when creating
ublk/null with 16 queues and AUTO_BUF_REG (zero copy).

V3:
	- don't use DECLARE_FLEX_ARRAY()
	- annotate flexible array by __counted_by()

V2:
	- use a flexible array member for the queues field, saving one
	  indirection when retrieving a ublk queue
	- rename __queues to queues
	- remove the queue_size field from struct ublk_device
	- move queue allocation and deallocation into ublk_init_queue() and
	  ublk_deinit_queue()
	- use a flexible array for ublk_queue.ios
	- convert ublk_thread_set_sched_affinity() to use pthread_setaffinity_np()

Ming Lei (5):
  ublk: reorder tag_set initialization before queue allocation
  ublk: implement NUMA-aware memory allocation
  ublk: use struct_size() for allocation
  selftests: ublk: set CPU affinity before thread initialization
  selftests: ublk: make ublk_thread thread-local variable

 drivers/block/ublk_drv.c             | 98 +++++++++++++++++-----------
 tools/testing/selftests/ublk/kublk.c | 70 ++++++++++++--------
 tools/testing/selftests/ublk/kublk.h |  9 +--
 3 files changed, 105 insertions(+), 72 deletions(-)

-- 
2.47.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation
  2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
@ 2025-10-29  3:10 ` Ming Lei
  2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Move ublk_add_tag_set() before ublk_init_queues() in the device
initialization path. This allows us to use the blk-mq CPU-to-queue
mapping established by the tag_set to determine the appropriate
NUMA node for each queue allocation.

The error handling paths are also reordered accordingly.

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 0c74a41a6753..2569566bf5e6 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -3178,17 +3178,17 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
 			ub->dev_info.nr_hw_queues, nr_cpu_ids);
 	ublk_align_max_io_size(ub);
 
-	ret = ublk_init_queues(ub);
+	ret = ublk_add_tag_set(ub);
 	if (ret)
 		goto out_free_dev_number;
 
-	ret = ublk_add_tag_set(ub);
+	ret = ublk_init_queues(ub);
 	if (ret)
-		goto out_deinit_queues;
+		goto out_free_tag_set;
 
 	ret = -EFAULT;
 	if (copy_to_user(argp, &ub->dev_info, sizeof(info)))
-		goto out_free_tag_set;
+		goto out_deinit_queues;
 
 	/*
 	 * Add the char dev so that ublksrv daemon can be setup.
@@ -3197,10 +3197,10 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
 	ret = ublk_add_chdev(ub);
 	goto out_unlock;
 
-out_free_tag_set:
-	blk_mq_free_tag_set(&ub->tag_set);
 out_deinit_queues:
 	ublk_deinit_queues(ub);
+out_free_tag_set:
+	blk_mq_free_tag_set(&ub->tag_set);
 out_free_dev_number:
 	ublk_free_dev_number(ub);
 out_free_ub:
-- 
2.47.0



* [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
  2025-10-29  3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
@ 2025-10-29  3:10 ` Ming Lei
  2025-10-29 16:00   ` Caleb Sander Mateos
                     ` (2 more replies)
  2025-10-29  3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Implement NUMA-friendly memory allocation for ublk driver to improve
performance on multi-socket systems.

This commit includes the following changes:

1. Convert struct ublk_device to use a flexible array member for the
   queues field instead of a separate pointer array allocation. This
   eliminates one level of indirection and simplifies memory management.
   The queues array is now allocated as part of struct ublk_device using
   struct_size().

2. Rename __queues to queues, dropping the __ prefix since the field is
   now accessed directly throughout the codebase rather than only through
   the ublk_get_queue() helper.

3. Remove the queue_size field from struct ublk_device as it is no longer
   needed.

4. Move queue allocation and deallocation into ublk_init_queue() and
   ublk_deinit_queue() respectively, improving encapsulation. This
   simplifies ublk_init_queues() and ublk_deinit_queues() to just
   iterate and call the per-queue functions.

5. Add ublk_get_queue_numa_node() helper function to determine the
   appropriate NUMA node for a queue by finding the first CPU mapped
   to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
   converting it to a NUMA node using cpu_to_node(). This function is
   called internally by ublk_init_queue() to determine the allocation
   node.

6. Allocate each queue structure on its local NUMA node using
   kvzalloc_node() in ublk_init_queue().

7. Allocate the I/O command buffer on the same NUMA node using
   alloc_pages_node().

This reduces memory access latency on multi-socket NUMA systems by
ensuring each queue's data structures are local to the CPUs that
access them.

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
 1 file changed, 53 insertions(+), 31 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 2569566bf5e6..ed77b4527b33 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -209,9 +209,6 @@ struct ublk_queue {
 struct ublk_device {
 	struct gendisk		*ub_disk;
 
-	char	*__queues;
-
-	unsigned int	queue_size;
 	struct ublksrv_ctrl_dev_info	dev_info;
 
 	struct blk_mq_tag_set	tag_set;
@@ -239,6 +236,8 @@ struct ublk_device {
 	bool canceling;
 	pid_t 	ublksrv_tgid;
 	struct delayed_work	exit_work;
+
+	struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
 };
 
 /* header of ublk_params */
@@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
 static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
 		int qid)
 {
-       return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
+	return dev->queues[qid];
 }
 
 static inline bool ublk_rq_has_data(const struct request *rq)
@@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
 
 static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
 {
-	int size = ublk_queue_cmd_buf_size(ub);
-	struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
-	int i;
+	struct ublk_queue *ubq = ub->queues[q_id];
+	int size, i;
+
+	if (!ubq)
+		return;
+
+	size = ublk_queue_cmd_buf_size(ub);
 
 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];
@@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
 
 	if (ubq->io_cmd_buf)
 		free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
+
+	kvfree(ubq);
+	ub->queues[q_id] = NULL;
+}
+
+static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
+{
+	unsigned int cpu;
+
+	/* Find first CPU mapped to this queue */
+	for_each_possible_cpu(cpu) {
+		if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
+			return cpu_to_node(cpu);
+	}
+
+	return NUMA_NO_NODE;
 }
 
 static int ublk_init_queue(struct ublk_device *ub, int q_id)
 {
-	struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
+	int depth = ub->dev_info.queue_depth;
+	int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
 	gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
-	void *ptr;
+	struct ublk_queue *ubq;
+	struct page *page;
+	int numa_node;
 	int size;
 
+	/* Determine NUMA node based on queue's CPU affinity */
+	numa_node = ublk_get_queue_numa_node(ub, q_id);
+
+	/* Allocate queue structure on local NUMA node */
+	ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
+	if (!ubq)
+		return -ENOMEM;
+
 	spin_lock_init(&ubq->cancel_lock);
 	ubq->flags = ub->dev_info.flags;
 	ubq->q_id = q_id;
-	ubq->q_depth = ub->dev_info.queue_depth;
+	ubq->q_depth = depth;
 	size = ublk_queue_cmd_buf_size(ub);
 
-	ptr = (void *) __get_free_pages(gfp_flags, get_order(size));
-	if (!ptr)
+	/* Allocate I/O command buffer on local NUMA node */
+	page = alloc_pages_node(numa_node, gfp_flags, get_order(size));
+	if (!page) {
+		kvfree(ubq);
 		return -ENOMEM;
+	}
+	ubq->io_cmd_buf = page_address(page);
 
-	ubq->io_cmd_buf = ptr;
+	ub->queues[q_id] = ubq;
 	ubq->dev = ub;
 	return 0;
 }
 
 static void ublk_deinit_queues(struct ublk_device *ub)
 {
-	int nr_queues = ub->dev_info.nr_hw_queues;
 	int i;
 
-	if (!ub->__queues)
-		return;
-
-	for (i = 0; i < nr_queues; i++)
+	for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
 		ublk_deinit_queue(ub, i);
-	kvfree(ub->__queues);
 }
 
 static int ublk_init_queues(struct ublk_device *ub)
 {
-	int nr_queues = ub->dev_info.nr_hw_queues;
-	int depth = ub->dev_info.queue_depth;
-	int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
-	int i, ret = -ENOMEM;
+	int i, ret;
 
-	ub->queue_size = ubq_size;
-	ub->__queues = kvcalloc(nr_queues, ubq_size, GFP_KERNEL);
-	if (!ub->__queues)
-		return ret;
-
-	for (i = 0; i < nr_queues; i++) {
-		if (ublk_init_queue(ub, i))
+	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
+		ret = ublk_init_queue(ub, i);
+		if (ret)
 			goto fail;
 	}
 
@@ -3128,7 +3150,7 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
 		goto out_unlock;
 
 	ret = -ENOMEM;
-	ub = kzalloc(sizeof(*ub), GFP_KERNEL);
+	ub = kzalloc(struct_size(ub, queues, info.nr_hw_queues), GFP_KERNEL);
 	if (!ub)
 		goto out_unlock;
 	mutex_init(&ub->mutex);
-- 
2.47.0



* [PATCH V3 3/5] ublk: use struct_size() for allocation
  2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
  2025-10-29  3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
  2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
@ 2025-10-29  3:10 ` Ming Lei
  2025-10-29 16:00   ` Caleb Sander Mateos
  2025-10-29  3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
  2025-10-29  3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei
  4 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Convert ublk_queue to use struct_size() for allocation.

Changes in this commit:

1. Update ublk_init_queue() to use struct_size(ubq, ios, depth)
   instead of manual size calculation (sizeof(struct ublk_queue) +
   depth * sizeof(struct ublk_io)).

This provides better type safety and makes the code more maintainable
by using the standard kernel macro for flexible-array handling.

Meanwhile, annotate ublk_queue.ios with __counted_by().

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index ed77b4527b33..409874714c62 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -203,7 +203,7 @@ struct ublk_queue {
 	bool fail_io; /* copy of dev->state == UBLK_S_DEV_FAIL_IO */
 	spinlock_t		cancel_lock;
 	struct ublk_device *dev;
-	struct ublk_io ios[];
+	struct ublk_io ios[] __counted_by(q_depth);
 };
 
 struct ublk_device {
@@ -2700,7 +2700,6 @@ static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
 static int ublk_init_queue(struct ublk_device *ub, int q_id)
 {
 	int depth = ub->dev_info.queue_depth;
-	int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
 	gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
 	struct ublk_queue *ubq;
 	struct page *page;
@@ -2711,7 +2710,8 @@ static int ublk_init_queue(struct ublk_device *ub, int q_id)
 	numa_node = ublk_get_queue_numa_node(ub, q_id);
 
 	/* Allocate queue structure on local NUMA node */
-	ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
+	ubq = kvzalloc_node(struct_size(ubq, ios, depth), GFP_KERNEL,
+			    numa_node);
 	if (!ubq)
 		return -ENOMEM;
 
-- 
2.47.0



* [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization
  2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
                   ` (2 preceding siblings ...)
  2025-10-29  3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
@ 2025-10-29  3:10 ` Ming Lei
  2025-10-29  3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei
  4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Move the ublk_thread_set_sched_affinity() call before ublk_thread_init()
so that memory allocations performed during thread initialization occur
on the correct NUMA node. This leverages Linux's first-touch memory
policy for better NUMA locality.

Also convert ublk_thread_set_sched_affinity() to use
pthread_setaffinity_np() instead of sched_setaffinity(), as the
pthread API is the proper interface for setting thread affinity in
multithreaded programs.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 tools/testing/selftests/ublk/kublk.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c
index 6b8123c12a7a..062537ab8976 100644
--- a/tools/testing/selftests/ublk/kublk.c
+++ b/tools/testing/selftests/ublk/kublk.c
@@ -839,7 +839,7 @@ static int ublk_process_io(struct ublk_thread *t)
 static void ublk_thread_set_sched_affinity(const struct ublk_thread *t,
 		cpu_set_t *cpuset)
 {
-        if (sched_setaffinity(0, sizeof(*cpuset), cpuset) < 0)
+	if (pthread_setaffinity_np(pthread_self(), sizeof(*cpuset), cpuset) < 0)
 		ublk_err("ublk dev %u thread %u set affinity failed",
 				t->dev->dev_info.dev_id, t->idx);
 }
@@ -862,15 +862,21 @@ static void *ublk_io_handler_fn(void *data)
 	t->dev = info->dev;
 	t->idx = info->idx;
 
+	/*
+	 * IO perf is sensitive with queue pthread affinity on NUMA machine
+	 *
+	 * Set sched_affinity at beginning, so following allocated memory/pages
+	 * could be CPU/NUMA aware.
+	 */
+	if (info->affinity)
+		ublk_thread_set_sched_affinity(t, info->affinity);
+
 	ret = ublk_thread_init(t, info->extra_flags);
 	if (ret) {
 		ublk_err("ublk dev %d thread %u init failed\n",
 				dev_id, t->idx);
 		return NULL;
 	}
-	/* IO perf is sensitive with queue pthread affinity on NUMA machine*/
-	if (info->affinity)
-		ublk_thread_set_sched_affinity(t, info->affinity);
 	sem_post(info->ready);
 
 	ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %u started\n",
-- 
2.47.0



* [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable
  2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
                   ` (3 preceding siblings ...)
  2025-10-29  3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
@ 2025-10-29  3:10 ` Ming Lei
  4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Refactor ublk_thread to be a thread-local variable instead of storing
it in ublk_dev:

- Remove pthread_t thread field from struct ublk_thread and move it to
  struct ublk_thread_info

- Remove struct ublk_thread array from struct ublk_dev, reducing memory
  footprint

- Define struct ublk_thread as local variable in __ublk_io_handler_fn()
  instead of accessing it from dev->threads[]

- Extract main IO handling logic into __ublk_io_handler_fn() which is
  marked as noinline

- Move CPU affinity setup to ublk_io_handler_fn() before calling
  __ublk_io_handler_fn()

- Update ublk_thread_set_sched_affinity() to take struct ublk_thread_info *
  instead of struct ublk_thread *, and use pthread_setaffinity_np()
  instead of sched_setaffinity()

- Reorder struct ublk_thread fields to group related state together

This change makes each thread's ublk_thread structure truly local to
the thread, improving cache locality and reducing memory usage.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 tools/testing/selftests/ublk/kublk.c | 76 +++++++++++++++-------------
 tools/testing/selftests/ublk/kublk.h |  9 ++--
 2 files changed, 45 insertions(+), 40 deletions(-)

diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c
index 062537ab8976..f8fa102a627f 100644
--- a/tools/testing/selftests/ublk/kublk.c
+++ b/tools/testing/selftests/ublk/kublk.c
@@ -836,62 +836,70 @@ static int ublk_process_io(struct ublk_thread *t)
 	return reapped;
 }
 
-static void ublk_thread_set_sched_affinity(const struct ublk_thread *t,
-		cpu_set_t *cpuset)
-{
-	if (pthread_setaffinity_np(pthread_self(), sizeof(*cpuset), cpuset) < 0)
-		ublk_err("ublk dev %u thread %u set affinity failed",
-				t->dev->dev_info.dev_id, t->idx);
-}
-
 struct ublk_thread_info {
 	struct ublk_dev 	*dev;
+	pthread_t		thread;
 	unsigned		idx;
 	sem_t 			*ready;
 	cpu_set_t 		*affinity;
 	unsigned long long	extra_flags;
 };
 
-static void *ublk_io_handler_fn(void *data)
+static void ublk_thread_set_sched_affinity(const struct ublk_thread_info *info)
 {
-	struct ublk_thread_info *info = data;
-	struct ublk_thread *t = &info->dev->threads[info->idx];
+	if (pthread_setaffinity_np(pthread_self(), sizeof(*info->affinity), info->affinity) < 0)
+		ublk_err("ublk dev %u thread %u set affinity failed",
+				info->dev->dev_info.dev_id, info->idx);
+}
+
+static __attribute__((noinline)) int __ublk_io_handler_fn(struct ublk_thread_info *info)
+{
+	struct ublk_thread t = {
+		.dev = info->dev,
+		.idx = info->idx,
+	};
 	int dev_id = info->dev->dev_info.dev_id;
 	int ret;
 
-	t->dev = info->dev;
-	t->idx = info->idx;
-
-	/*
-	 * IO perf is sensitive with queue pthread affinity on NUMA machine
-	 *
-	 * Set sched_affinity at beginning, so following allocated memory/pages
-	 * could be CPU/NUMA aware.
-	 */
-	if (info->affinity)
-		ublk_thread_set_sched_affinity(t, info->affinity);
-
-	ret = ublk_thread_init(t, info->extra_flags);
+	ret = ublk_thread_init(&t, info->extra_flags);
 	if (ret) {
 		ublk_err("ublk dev %d thread %u init failed\n",
-				dev_id, t->idx);
-		return NULL;
+				dev_id, t.idx);
+		return ret;
 	}
 	sem_post(info->ready);
 
 	ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %u started\n",
-			gettid(), dev_id, t->idx);
+			gettid(), dev_id, t.idx);
 
 	/* submit all io commands to ublk driver */
-	ublk_submit_fetch_commands(t);
+	ublk_submit_fetch_commands(&t);
 	do {
-		if (ublk_process_io(t) < 0)
+		if (ublk_process_io(&t) < 0)
 			break;
 	} while (1);
 
 	ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %d exiting\n",
-		 gettid(), dev_id, t->idx);
-	ublk_thread_deinit(t);
+		 gettid(), dev_id, t.idx);
+	ublk_thread_deinit(&t);
+	return 0;
+}
+
+static void *ublk_io_handler_fn(void *data)
+{
+	struct ublk_thread_info *info = data;
+
+	/*
+	 * IO perf is sensitive with queue pthread affinity on NUMA machine
+	 *
+	 * Set sched_affinity at beginning, so following allocated memory/pages
+	 * could be CPU/NUMA aware.
+	 */
+	if (info->affinity)
+		ublk_thread_set_sched_affinity(info);
+
+	__ublk_io_handler_fn(info);
+
 	return NULL;
 }
 
@@ -989,14 +997,13 @@ static int ublk_start_daemon(const struct dev_ctx *ctx, struct ublk_dev *dev)
 		 */
 		if (dev->nthreads == dinfo->nr_hw_queues)
 			tinfo[i].affinity = &affinity_buf[i];
-		pthread_create(&dev->threads[i].thread, NULL,
+		pthread_create(&tinfo[i].thread, NULL,
 				ublk_io_handler_fn,
 				&tinfo[i]);
 	}
 
 	for (i = 0; i < dev->nthreads; i++)
 		sem_wait(&ready);
-	free(tinfo);
 	free(affinity_buf);
 
 	/* everything is fine now, start us */
@@ -1019,7 +1026,8 @@ static int ublk_start_daemon(const struct dev_ctx *ctx, struct ublk_dev *dev)
 
 	/* wait until we are terminated */
 	for (i = 0; i < dev->nthreads; i++)
-		pthread_join(dev->threads[i].thread, &thread_ret);
+		pthread_join(tinfo[i].thread, &thread_ret);
+	free(tinfo);
  fail:
 	for (i = 0; i < dinfo->nr_hw_queues; i++)
 		ublk_queue_deinit(&dev->q[i]);
diff --git a/tools/testing/selftests/ublk/kublk.h b/tools/testing/selftests/ublk/kublk.h
index 5e55484fb0aa..fe42705c6d42 100644
--- a/tools/testing/selftests/ublk/kublk.h
+++ b/tools/testing/selftests/ublk/kublk.h
@@ -175,23 +175,20 @@ struct ublk_queue {
 
 struct ublk_thread {
 	struct ublk_dev *dev;
-	struct io_uring ring;
-	unsigned int cmd_inflight;
-	unsigned int io_inflight;
-
-	pthread_t thread;
 	unsigned idx;
 
 #define UBLKS_T_STOPPING	(1U << 0)
 #define UBLKS_T_IDLE	(1U << 1)
 	unsigned state;
+	unsigned int cmd_inflight;
+	unsigned int io_inflight;
+	struct io_uring ring;
 };
 
 struct ublk_dev {
 	struct ublk_tgt tgt;
 	struct ublksrv_ctrl_dev_info  dev_info;
 	struct ublk_queue q[UBLK_MAX_QUEUES];
-	struct ublk_thread threads[UBLK_MAX_THREADS];
 	unsigned nthreads;
 	unsigned per_io_tasks;
 
-- 
2.47.0



* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
@ 2025-10-29 16:00   ` Caleb Sander Mateos
  2025-10-29 22:53     ` Ming Lei
  2025-10-30  4:04   ` kernel test robot
  2025-10-30  8:00   ` kernel test robot
  2 siblings, 1 reply; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-29 16:00 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Uday Shankar

On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> Implement NUMA-friendly memory allocation for ublk driver to improve
> performance on multi-socket systems.
>
> This commit includes the following changes:
>
> 1. Convert struct ublk_device to use a flexible array member for the
>    queues field instead of a separate pointer array allocation. This
>    eliminates one level of indirection and simplifies memory management.
>    The queues array is now allocated as part of struct ublk_device using
>    struct_size().

Technically it ends up being the same number of indirections as
before, since changing queues from a single allocation to an array of
separate allocations adds another indirection.

>
> 2. Rename __queues to queues, dropping the __ prefix since the field is
>    now accessed directly throughout the codebase rather than only through
>    the ublk_get_queue() helper.
>
> 3. Remove the queue_size field from struct ublk_device as it is no longer
>    needed.
>
> 4. Move queue allocation and deallocation into ublk_init_queue() and
>    ublk_deinit_queue() respectively, improving encapsulation. This
>    simplifies ublk_init_queues() and ublk_deinit_queues() to just
>    iterate and call the per-queue functions.
>
> 5. Add ublk_get_queue_numa_node() helper function to determine the
>    appropriate NUMA node for a queue by finding the first CPU mapped
>    to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
>    converting it to a NUMA node using cpu_to_node(). This function is
>    called internally by ublk_init_queue() to determine the allocation
>    node.
>
> 6. Allocate each queue structure on its local NUMA node using
>    kvzalloc_node() in ublk_init_queue().
>
> 7. Allocate the I/O command buffer on the same NUMA node using
>    alloc_pages_node().
>
> This reduces memory access latency on multi-socket NUMA systems by
> ensuring each queue's data structures are local to the CPUs that
> access them.
>
> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
>  1 file changed, 53 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index 2569566bf5e6..ed77b4527b33 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -209,9 +209,6 @@ struct ublk_queue {
>  struct ublk_device {
>         struct gendisk          *ub_disk;
>
> -       char    *__queues;
> -
> -       unsigned int    queue_size;
>         struct ublksrv_ctrl_dev_info    dev_info;
>
>         struct blk_mq_tag_set   tag_set;
> @@ -239,6 +236,8 @@ struct ublk_device {
>         bool canceling;
>         pid_t   ublksrv_tgid;
>         struct delayed_work     exit_work;
> +
> +       struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
>  };
>
>  /* header of ublk_params */
> @@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
>  static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
>                 int qid)
>  {
> -       return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
> +       return dev->queues[qid];
>  }
>
>  static inline bool ublk_rq_has_data(const struct request *rq)
> @@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
>
>  static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
>  {
> -       int size = ublk_queue_cmd_buf_size(ub);
> -       struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> -       int i;
> +       struct ublk_queue *ubq = ub->queues[q_id];
> +       int size, i;
> +
> +       if (!ubq)
> +               return;
> +
> +       size = ublk_queue_cmd_buf_size(ub);
>
>         for (i = 0; i < ubq->q_depth; i++) {
>                 struct ublk_io *io = &ubq->ios[i];
> @@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
>
>         if (ubq->io_cmd_buf)
>                 free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
> +
> +       kvfree(ubq);
> +       ub->queues[q_id] = NULL;
> +}
> +
> +static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
> +{
> +       unsigned int cpu;
> +
> +       /* Find first CPU mapped to this queue */
> +       for_each_possible_cpu(cpu) {
> +               if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
> +                       return cpu_to_node(cpu);
> +       }

I think you could avoid this quadratic lookup by using blk_mq_hw_ctx's
numa_node field. The initialization code would probably have to move
to ublk_init_hctx() in order to have access to the blk_mq_hw_ctx. But
may not be worth the effort just to save some time at ublk creation
time. What you have seems fine.

Best,
Caleb

> +
> +       return NUMA_NO_NODE;
>  }
>
>  static int ublk_init_queue(struct ublk_device *ub, int q_id)
>  {
> -       struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> +       int depth = ub->dev_info.queue_depth;
> +       int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
>         gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
> -       void *ptr;
> +       struct ublk_queue *ubq;
> +       struct page *page;
> +       int numa_node;
>         int size;
>
> +       /* Determine NUMA node based on queue's CPU affinity */
> +       numa_node = ublk_get_queue_numa_node(ub, q_id);
> +
> +       /* Allocate queue structure on local NUMA node */
> +       ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
> +       if (!ubq)
> +               return -ENOMEM;
> +
>         spin_lock_init(&ubq->cancel_lock);
>         ubq->flags = ub->dev_info.flags;
>         ubq->q_id = q_id;
> -       ubq->q_depth = ub->dev_info.queue_depth;
> +       ubq->q_depth = depth;
>         size = ublk_queue_cmd_buf_size(ub);
>
> -       ptr = (void *) __get_free_pages(gfp_flags, get_order(size));
> -       if (!ptr)
> +       /* Allocate I/O command buffer on local NUMA node */
> +       page = alloc_pages_node(numa_node, gfp_flags, get_order(size));
> +       if (!page) {
> +               kvfree(ubq);
>                 return -ENOMEM;
> +       }
> +       ubq->io_cmd_buf = page_address(page);
>
> -       ubq->io_cmd_buf = ptr;
> +       ub->queues[q_id] = ubq;
>         ubq->dev = ub;
>         return 0;
>  }
>
>  static void ublk_deinit_queues(struct ublk_device *ub)
>  {
> -       int nr_queues = ub->dev_info.nr_hw_queues;
>         int i;
>
> -       if (!ub->__queues)
> -               return;
> -
> -       for (i = 0; i < nr_queues; i++)
> +       for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
>                 ublk_deinit_queue(ub, i);
> -       kvfree(ub->__queues);
>  }
>
>  static int ublk_init_queues(struct ublk_device *ub)
>  {
> -       int nr_queues = ub->dev_info.nr_hw_queues;
> -       int depth = ub->dev_info.queue_depth;
> -       int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
> -       int i, ret = -ENOMEM;
> +       int i, ret;
>
> -       ub->queue_size = ubq_size;
> -       ub->__queues = kvcalloc(nr_queues, ubq_size, GFP_KERNEL);
> -       if (!ub->__queues)
> -               return ret;
> -
> -       for (i = 0; i < nr_queues; i++) {
> -               if (ublk_init_queue(ub, i))
> +       for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
> +               ret = ublk_init_queue(ub, i);
> +               if (ret)
>                         goto fail;
>         }
>
> @@ -3128,7 +3150,7 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
>                 goto out_unlock;
>
>         ret = -ENOMEM;
> -       ub = kzalloc(sizeof(*ub), GFP_KERNEL);
> +       ub = kzalloc(struct_size(ub, queues, info.nr_hw_queues), GFP_KERNEL);
>         if (!ub)
>                 goto out_unlock;
>         mutex_init(&ub->mutex);
> --
> 2.47.0
>


* Re: [PATCH V3 3/5] ublk: use struct_size() for allocation
  2025-10-29  3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
@ 2025-10-29 16:00   ` Caleb Sander Mateos
  0 siblings, 0 replies; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-29 16:00 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Uday Shankar

On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> Convert ublk_queue to use struct_size() for allocation.
>
> Changes in this commit:
>
> 1. Update ublk_init_queue() to use struct_size(ubq, ios, depth)
>    instead of manual size calculation (sizeof(struct ublk_queue) +
>    depth * sizeof(struct ublk_io)).
>
> This provides better type safety and makes the code more maintainable
> by using standard kernel macro for flexible array handling.
>
> Meantime annotate ublk_queue.ios by __counted_by().
>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>

> ---
>  drivers/block/ublk_drv.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index ed77b4527b33..409874714c62 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -203,7 +203,7 @@ struct ublk_queue {
>         bool fail_io; /* copy of dev->state == UBLK_S_DEV_FAIL_IO */
>         spinlock_t              cancel_lock;
>         struct ublk_device *dev;
> -       struct ublk_io ios[];
> +       struct ublk_io ios[] __counted_by(q_depth);
>  };
>
>  struct ublk_device {
> @@ -2700,7 +2700,6 @@ static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
>  static int ublk_init_queue(struct ublk_device *ub, int q_id)
>  {
>         int depth = ub->dev_info.queue_depth;
> -       int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
>         gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
>         struct ublk_queue *ubq;
>         struct page *page;
> @@ -2711,7 +2710,8 @@ static int ublk_init_queue(struct ublk_device *ub, int q_id)
>         numa_node = ublk_get_queue_numa_node(ub, q_id);
>
>         /* Allocate queue structure on local NUMA node */
> -       ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
> +       ubq = kvzalloc_node(struct_size(ubq, ios, depth), GFP_KERNEL,
> +                           numa_node);
>         if (!ubq)
>                 return -ENOMEM;
>
> --
> 2.47.0
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-29 16:00   ` Caleb Sander Mateos
@ 2025-10-29 22:53     ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 22:53 UTC (permalink / raw)
  To: Caleb Sander Mateos; +Cc: Jens Axboe, linux-block, Uday Shankar

On Wed, Oct 29, 2025 at 09:00:12AM -0700, Caleb Sander Mateos wrote:
> On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > Implement NUMA-friendly memory allocation for ublk driver to improve
> > performance on multi-socket systems.
> >
> > This commit includes the following changes:
> >
> > 1. Convert struct ublk_device to use a flexible array member for the
> >    queues field instead of a separate pointer array allocation. This
> >    eliminates one level of indirection and simplifies memory management.
> >    The queues array is now allocated as part of struct ublk_device using
> >    struct_size().
> 
> Technically it ends up being the same number of indirections as
> before, since changing queues from a single allocation to an array of
> separate allocations adds another indirection.

I think it is fine, because the extra indirection is the precondition for
NUMA-aware allocation of each ublk_queue.

> 
> >
> > 2. Rename __queues to queues, dropping the __ prefix since the field is
> >    now accessed directly throughout the codebase rather than only through
> >    the ublk_get_queue() helper.
> >
> > 3. Remove the queue_size field from struct ublk_device as it is no longer
> >    needed.
> >
> > 4. Move queue allocation and deallocation into ublk_init_queue() and
> >    ublk_deinit_queue() respectively, improving encapsulation. This
> >    simplifies ublk_init_queues() and ublk_deinit_queues() to just
> >    iterate and call the per-queue functions.
> >
> > 5. Add ublk_get_queue_numa_node() helper function to determine the
> >    appropriate NUMA node for a queue by finding the first CPU mapped
> >    to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
> >    converting it to a NUMA node using cpu_to_node(). This function is
> >    called internally by ublk_init_queue() to determine the allocation
> >    node.
> >
> > 6. Allocate each queue structure on its local NUMA node using
> >    kvzalloc_node() in ublk_init_queue().
> >
> > 7. Allocate the I/O command buffer on the same NUMA node using
> >    alloc_pages_node().
> >
> > This reduces memory access latency on multi-socket NUMA systems by
> > ensuring each queue's data structures are local to the CPUs that
> > access them.
> >
> > Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
> >  1 file changed, 53 insertions(+), 31 deletions(-)
> >
> > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > index 2569566bf5e6..ed77b4527b33 100644
> > --- a/drivers/block/ublk_drv.c
> > +++ b/drivers/block/ublk_drv.c
> > @@ -209,9 +209,6 @@ struct ublk_queue {
> >  struct ublk_device {
> >         struct gendisk          *ub_disk;
> >
> > -       char    *__queues;
> > -
> > -       unsigned int    queue_size;
> >         struct ublksrv_ctrl_dev_info    dev_info;
> >
> >         struct blk_mq_tag_set   tag_set;
> > @@ -239,6 +236,8 @@ struct ublk_device {
> >         bool canceling;
> >         pid_t   ublksrv_tgid;
> >         struct delayed_work     exit_work;
> > +
> > +       struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
> >  };
> >
> >  /* header of ublk_params */
> > @@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
> >  static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
> >                 int qid)
> >  {
> > -       return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
> > +       return dev->queues[qid];
> >  }
> >
> >  static inline bool ublk_rq_has_data(const struct request *rq)
> > @@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
> >
> >  static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
> >  {
> > -       int size = ublk_queue_cmd_buf_size(ub);
> > -       struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> > -       int i;
> > +       struct ublk_queue *ubq = ub->queues[q_id];
> > +       int size, i;
> > +
> > +       if (!ubq)
> > +               return;
> > +
> > +       size = ublk_queue_cmd_buf_size(ub);
> >
> >         for (i = 0; i < ubq->q_depth; i++) {
> >                 struct ublk_io *io = &ubq->ios[i];
> > @@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
> >
> >         if (ubq->io_cmd_buf)
> >                 free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
> > +
> > +       kvfree(ubq);
> > +       ub->queues[q_id] = NULL;
> > +}
> > +
> > +static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
> > +{
> > +       unsigned int cpu;
> > +
> > +       /* Find first CPU mapped to this queue */
> > +       for_each_possible_cpu(cpu) {
> > +               if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
> > +                       return cpu_to_node(cpu);
> > +       }
> 
> I think you could avoid this quadratic lookup by using blk_mq_hw_ctx's
> numa_node field. The initialization code would probably have to move
> to ublk_init_hctx() in order to have access to the blk_mq_hw_ctx. But
> may not be worth the effort just to save some time at ublk creation
> time. What you have seems fine.

It isn't doable, and it isn't necessary either.

The disk and hw queues are created & initialized when handling
UBLK_CMD_START_DEV, but the backing ublk_queue needs to be allocated when
handling UBLK_CMD_ADD_DEV, which happens before UBLK_CMD_START_DEV.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
  2025-10-29 16:00   ` Caleb Sander Mateos
@ 2025-10-30  4:04   ` kernel test robot
  2025-10-30  8:00   ` kernel test robot
  2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2025-10-30  4:04 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block
  Cc: oe-kbuild-all, Uday Shankar, Caleb Sander Mateos, Ming Lei

Hi Ming,

kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master v6.18-rc3 next-20251029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
config: csky-randconfig-r054-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301107.wrn8eeW8-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301107.wrn8eeW8-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510301107.wrn8eeW8-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from <command-line>:
>> drivers/block/ublk_drv.c:240:56: error: 'dev_info' undeclared here (not in a function); did you mean '_dev_info'?
     240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
         |                                                        ^~~~~~~~
   include/linux/compiler_types.h:346:71: note: in definition of macro '__counted_by'
     346 | # define __counted_by(member)           __attribute__((__counted_by__(member)))
         |                                                                       ^~~~~~
>> drivers/block/ublk_drv.c:240:34: error: 'counted_by' argument is not an identifier
     240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
         |                                  ^~~~~~


vim +240 drivers/block/ublk_drv.c

   208	
   209	struct ublk_device {
   210		struct gendisk		*ub_disk;
   211	
   212		struct ublksrv_ctrl_dev_info	dev_info;
   213	
   214		struct blk_mq_tag_set	tag_set;
   215	
   216		struct cdev		cdev;
   217		struct device		cdev_dev;
   218	
   219	#define UB_STATE_OPEN		0
   220	#define UB_STATE_USED		1
   221	#define UB_STATE_DELETED	2
   222		unsigned long		state;
   223		int			ub_number;
   224	
   225		struct mutex		mutex;
   226	
   227		spinlock_t		lock;
   228		struct mm_struct	*mm;
   229	
   230		struct ublk_params	params;
   231	
   232		struct completion	completion;
   233		u32			nr_io_ready;
   234		bool 			unprivileged_daemons;
   235		struct mutex cancel_mutex;
   236		bool canceling;
   237		pid_t 	ublksrv_tgid;
   238		struct delayed_work	exit_work;
   239	
 > 240		struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
   241	};
   242	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
  2025-10-29 16:00   ` Caleb Sander Mateos
  2025-10-30  4:04   ` kernel test robot
@ 2025-10-30  8:00   ` kernel test robot
  2025-10-30 14:07     ` Caleb Sander Mateos
  2 siblings, 1 reply; 14+ messages in thread
From: kernel test robot @ 2025-10-30  8:00 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block
  Cc: llvm, oe-kbuild-all, Uday Shankar, Caleb Sander Mateos, Ming Lei

Hi Ming,

kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on shuah-kselftest/next shuah-kselftest/fixes linus/master v6.18-rc3 next-20251029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
     240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
         |                                                        ^~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler_types.h:346:62: note: expanded from macro '__counted_by'
     346 | # define __counted_by(member)           __attribute__((__counted_by__(member)))
         |                                                                       ^~~~~~
   1 error generated.


vim +/counted_by +240 drivers/block/ublk_drv.c

   208	
   209	struct ublk_device {
   210		struct gendisk		*ub_disk;
   211	
   212		struct ublksrv_ctrl_dev_info	dev_info;
   213	
   214		struct blk_mq_tag_set	tag_set;
   215	
   216		struct cdev		cdev;
   217		struct device		cdev_dev;
   218	
   219	#define UB_STATE_OPEN		0
   220	#define UB_STATE_USED		1
   221	#define UB_STATE_DELETED	2
   222		unsigned long		state;
   223		int			ub_number;
   224	
   225		struct mutex		mutex;
   226	
   227		spinlock_t		lock;
   228		struct mm_struct	*mm;
   229	
   230		struct ublk_params	params;
   231	
   232		struct completion	completion;
   233		u32			nr_io_ready;
   234		bool 			unprivileged_daemons;
   235		struct mutex cancel_mutex;
   236		bool canceling;
   237		pid_t 	ublksrv_tgid;
   238		struct delayed_work	exit_work;
   239	
 > 240		struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
   241	};
   242	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-30  8:00   ` kernel test robot
@ 2025-10-30 14:07     ` Caleb Sander Mateos
  2025-10-30 17:56       ` Nathan Chancellor
  0 siblings, 1 reply; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-30 14:07 UTC (permalink / raw)
  To: kernel test robot
  Cc: Ming Lei, Jens Axboe, linux-block, llvm, oe-kbuild-all,
	Uday Shankar

On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
>
> Hi Ming,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on axboe-block/for-next]
> [also build test ERROR on shuah-kselftest/next shuah-kselftest/fixes linus/master v6.18-rc3 next-20251029]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> patch link:    https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
> >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
>      240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
>          |                                                        ^~~~~~~~~~~~~~~~~~~~~

Hmm, guess it doesn't support nested fields?

>    include/linux/compiler_types.h:346:62: note: expanded from macro '__counted_by'
>      346 | # define __counted_by(member)           __attribute__((__counted_by__(member)))
>          |                                                                       ^~~~~~
>    1 error generated.
>
>
> vim +/counted_by +240 drivers/block/ublk_drv.c
>
>    208
>    209  struct ublk_device {
>    210          struct gendisk          *ub_disk;
>    211
>    212          struct ublksrv_ctrl_dev_info    dev_info;
>    213
>    214          struct blk_mq_tag_set   tag_set;
>    215
>    216          struct cdev             cdev;
>    217          struct device           cdev_dev;
>    218
>    219  #define UB_STATE_OPEN           0
>    220  #define UB_STATE_USED           1
>    221  #define UB_STATE_DELETED        2
>    222          unsigned long           state;
>    223          int                     ub_number;
>    224
>    225          struct mutex            mutex;
>    226
>    227          spinlock_t              lock;
>    228          struct mm_struct        *mm;
>    229
>    230          struct ublk_params      params;
>    231
>    232          struct completion       completion;
>    233          u32                     nr_io_ready;
>    234          bool                    unprivileged_daemons;
>    235          struct mutex cancel_mutex;
>    236          bool canceling;
>    237          pid_t   ublksrv_tgid;
>    238          struct delayed_work     exit_work;
>    239
>  > 240          struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
>    241  };
>    242
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-30 14:07     ` Caleb Sander Mateos
@ 2025-10-30 17:56       ` Nathan Chancellor
  2025-10-31  3:28         ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Nathan Chancellor @ 2025-10-30 17:56 UTC (permalink / raw)
  To: Caleb Sander Mateos
  Cc: kernel test robot, Ming Lei, Jens Axboe, linux-block, llvm,
	oe-kbuild-all, Uday Shankar, Kees Cook

On Thu, Oct 30, 2025 at 07:07:25AM -0700, Caleb Sander Mateos wrote:
> On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
...
> > patch link:    https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> > patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> > config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> > compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <lkp@intel.com>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
> >
> > All errors (new ones prefixed by >>):
> >
> > >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
> >      240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
> >          |                                                        ^~~~~~~~~~~~~~~~~~~~~
> 
> Hmm, guess it doesn't support nested fields?

Correct. If I remember correctly, this is something that we want to support
at some point, but there was a lot of discussion between GCC and clang on
how to actually do it. Kees is free to correct me if that is wrong.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-30 17:56       ` Nathan Chancellor
@ 2025-10-31  3:28         ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-31  3:28 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Caleb Sander Mateos, kernel test robot, Jens Axboe, linux-block,
	llvm, oe-kbuild-all, Uday Shankar, Kees Cook

On Thu, Oct 30, 2025 at 10:56:31AM -0700, Nathan Chancellor wrote:
> On Thu, Oct 30, 2025 at 07:07:25AM -0700, Caleb Sander Mateos wrote:
> > On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
> ...
> > > patch link:    https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> > > patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> > > config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> > > compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
> > >
> > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > the same patch/commit), kindly add following tags
> > > | Reported-by: kernel test robot <lkp@intel.com>
> > > | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
> > >
> > > All errors (new ones prefixed by >>):
> > >
> > > >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
> > >      240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
> > >          |                                                        ^~~~~~~~~~~~~~~~~~~~~
> > 
> > Hmm, guess it doesn't support nested fields?
> 
> Correct. I think this is something that we want to support at some point
> if I remember correctly but I think there was a lot of discussion
> between GCC and clang on how to actually do it but Kees is free to
> correct me if that is wrong.

Thanks for the confirmation.

Will remove this __counted_by() in next version.

Thanks,
Ming


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-10-31  3:29 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
2025-10-29  3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
2025-10-29 16:00   ` Caleb Sander Mateos
2025-10-29 22:53     ` Ming Lei
2025-10-30  4:04   ` kernel test robot
2025-10-30  8:00   ` kernel test robot
2025-10-30 14:07     ` Caleb Sander Mateos
2025-10-30 17:56       ` Nathan Chancellor
2025-10-31  3:28         ` Ming Lei
2025-10-29  3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
2025-10-29 16:00   ` Caleb Sander Mateos
2025-10-29  3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
2025-10-29  3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox