[PATCH V3 0/5] ublk: NUMA-aware memory allocation

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH V3 0/5] ublk: NUMA-aware memory allocation
@ 2025-10-29  3:10 Ming Lei
  2025-10-29  3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Hi Jens,

The 1st two patches implement ublk driver NUMA aware memory allocation.

The last two patches implement it for ublk selftest utility.

`taskset -c 0-31 ~/git/fio/t/io_uring -p0 -n16 -r 40 /dev/ublkb0` shows
5%~10% IOPS improvement on one AMD zen4 dual socket machine when creating
ublk/null with 16 queues and AUTO_BUF_REG(zero copy).

V3:
	- don't use DECLARE_FLEX_ARRAY()
	- annotate flexible array by __counted_by()

V2:
	- use a flexible array member for queues field, save one indirection
	  for retrieving ublk queue
	- rename __queues into queues 
	- remove the queue_size field from struct ublk_device
	- Move queue allocation and deallocation into ublk_init_queue() and
	ublk_deinit_queue() 
	- use flexible array for ublk_queue.ios
	- convert ublk_thread_set_sched_affinity() to use pthread_setaffinity_np()

Ming Lei (5):
  ublk: reorder tag_set initialization before queue allocation
  ublk: implement NUMA-aware memory allocation
  ublk: use struct_size() for allocation
  selftests: ublk: set CPU affinity before thread initialization
  selftests: ublk: make ublk_thread thread-local variable

 drivers/block/ublk_drv.c             | 98 +++++++++++++++++-----------
 tools/testing/selftests/ublk/kublk.c | 70 ++++++++++++--------
 tools/testing/selftests/ublk/kublk.h |  9 +--
 3 files changed, 105 insertions(+), 72 deletions(-)

-- 
2.47.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation
  2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
@ 2025-10-29  3:10 ` Ming Lei
  2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Move ublk_add_tag_set() before ublk_init_queues() in the device
initialization path. This allows us to use the blk-mq CPU-to-queue
mapping established by the tag_set to determine the appropriate
NUMA node for each queue allocation.

The error handling paths are also reordered accordingly.

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 0c74a41a6753..2569566bf5e6 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -3178,17 +3178,17 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
 			ub->dev_info.nr_hw_queues, nr_cpu_ids);
 	ublk_align_max_io_size(ub);
 
-	ret = ublk_init_queues(ub);
+	ret = ublk_add_tag_set(ub);
 	if (ret)
 		goto out_free_dev_number;
 
-	ret = ublk_add_tag_set(ub);
+	ret = ublk_init_queues(ub);
 	if (ret)
-		goto out_deinit_queues;
+		goto out_free_tag_set;
 
 	ret = -EFAULT;
 	if (copy_to_user(argp, &ub->dev_info, sizeof(info)))
-		goto out_free_tag_set;
+		goto out_deinit_queues;
 
 	/*
 	 * Add the char dev so that ublksrv daemon can be setup.
@@ -3197,10 +3197,10 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
 	ret = ublk_add_chdev(ub);
 	goto out_unlock;
 
-out_free_tag_set:
-	blk_mq_free_tag_set(&ub->tag_set);
 out_deinit_queues:
 	ublk_deinit_queues(ub);
+out_free_tag_set:
+	blk_mq_free_tag_set(&ub->tag_set);
 out_free_dev_number:
 	ublk_free_dev_number(ub);
 out_free_ub:
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
  2025-10-29  3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
@ 2025-10-29  3:10 ` Ming Lei
  2025-10-29 16:00   ` Caleb Sander Mateos
                     ` (2 more replies)
  2025-10-29  3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Implement NUMA-friendly memory allocation for ublk driver to improve
performance on multi-socket systems.

This commit includes the following changes:

1. Convert struct ublk_device to use a flexible array member for the
   queues field instead of a separate pointer array allocation. This
   eliminates one level of indirection and simplifies memory management.
   The queues array is now allocated as part of struct ublk_device using
   struct_size().

2. Rename __queues to queues, dropping the __ prefix since the field is
   now accessed directly throughout the codebase rather than only through
   the ublk_get_queue() helper.

3. Remove the queue_size field from struct ublk_device as it is no longer
   needed.

4. Move queue allocation and deallocation into ublk_init_queue() and
   ublk_deinit_queue() respectively, improving encapsulation. This
   simplifies ublk_init_queues() and ublk_deinit_queues() to just
   iterate and call the per-queue functions.

5. Add ublk_get_queue_numa_node() helper function to determine the
   appropriate NUMA node for a queue by finding the first CPU mapped
   to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
   converting it to a NUMA node using cpu_to_node(). This function is
   called internally by ublk_init_queue() to determine the allocation
   node.

6. Allocate each queue structure on its local NUMA node using
   kvzalloc_node() in ublk_init_queue().

7. Allocate the I/O command buffer on the same NUMA node using
   alloc_pages_node().

This reduces memory access latency on multi-socket NUMA systems by
ensuring each queue's data structures are local to the CPUs that
access them.

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
 1 file changed, 53 insertions(+), 31 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 2569566bf5e6..ed77b4527b33 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -209,9 +209,6 @@ struct ublk_queue {
 struct ublk_device {
 	struct gendisk		*ub_disk;
 
-	char	*__queues;
-
-	unsigned int	queue_size;
 	struct ublksrv_ctrl_dev_info	dev_info;
 
 	struct blk_mq_tag_set	tag_set;
@@ -239,6 +236,8 @@ struct ublk_device {
 	bool canceling;
 	pid_t 	ublksrv_tgid;
 	struct delayed_work	exit_work;
+
+	struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
 };
 
 /* header of ublk_params */
@@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
 static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
 		int qid)
 {
-       return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
+	return dev->queues[qid];
 }
 
 static inline bool ublk_rq_has_data(const struct request *rq)
@@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
 
 static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
 {
-	int size = ublk_queue_cmd_buf_size(ub);
-	struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
-	int i;
+	struct ublk_queue *ubq = ub->queues[q_id];
+	int size, i;
+
+	if (!ubq)
+		return;
+
+	size = ublk_queue_cmd_buf_size(ub);
 
 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];
@@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
 
 	if (ubq->io_cmd_buf)
 		free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
+
+	kvfree(ubq);
+	ub->queues[q_id] = NULL;
+}
+
+static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
+{
+	unsigned int cpu;
+
+	/* Find first CPU mapped to this queue */
+	for_each_possible_cpu(cpu) {
+		if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
+			return cpu_to_node(cpu);
+	}
+
+	return NUMA_NO_NODE;
 }
 
 static int ublk_init_queue(struct ublk_device *ub, int q_id)
 {
-	struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
+	int depth = ub->dev_info.queue_depth;
+	int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
 	gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
-	void *ptr;
+	struct ublk_queue *ubq;
+	struct page *page;
+	int numa_node;
 	int size;
 
+	/* Determine NUMA node based on queue's CPU affinity */
+	numa_node = ublk_get_queue_numa_node(ub, q_id);
+
+	/* Allocate queue structure on local NUMA node */
+	ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
+	if (!ubq)
+		return -ENOMEM;
+
 	spin_lock_init(&ubq->cancel_lock);
 	ubq->flags = ub->dev_info.flags;
 	ubq->q_id = q_id;
-	ubq->q_depth = ub->dev_info.queue_depth;
+	ubq->q_depth = depth;
 	size = ublk_queue_cmd_buf_size(ub);
 
-	ptr = (void *) __get_free_pages(gfp_flags, get_order(size));
-	if (!ptr)
+	/* Allocate I/O command buffer on local NUMA node */
+	page = alloc_pages_node(numa_node, gfp_flags, get_order(size));
+	if (!page) {
+		kvfree(ubq);
 		return -ENOMEM;
+	}
+	ubq->io_cmd_buf = page_address(page);
 
-	ubq->io_cmd_buf = ptr;
+	ub->queues[q_id] = ubq;
 	ubq->dev = ub;
 	return 0;
 }
 
 static void ublk_deinit_queues(struct ublk_device *ub)
 {
-	int nr_queues = ub->dev_info.nr_hw_queues;
 	int i;
 
-	if (!ub->__queues)
-		return;
-
-	for (i = 0; i < nr_queues; i++)
+	for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
 		ublk_deinit_queue(ub, i);
-	kvfree(ub->__queues);
 }
 
 static int ublk_init_queues(struct ublk_device *ub)
 {
-	int nr_queues = ub->dev_info.nr_hw_queues;
-	int depth = ub->dev_info.queue_depth;
-	int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
-	int i, ret = -ENOMEM;
+	int i, ret;
 
-	ub->queue_size = ubq_size;
-	ub->__queues = kvcalloc(nr_queues, ubq_size, GFP_KERNEL);
-	if (!ub->__queues)
-		return ret;
-
-	for (i = 0; i < nr_queues; i++) {
-		if (ublk_init_queue(ub, i))
+	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
+		ret = ublk_init_queue(ub, i);
+		if (ret)
 			goto fail;
 	}
 
@@ -3128,7 +3150,7 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
 		goto out_unlock;
 
 	ret = -ENOMEM;
-	ub = kzalloc(sizeof(*ub), GFP_KERNEL);
+	ub = kzalloc(struct_size(ub, queues, info.nr_hw_queues), GFP_KERNEL);
 	if (!ub)
 		goto out_unlock;
 	mutex_init(&ub->mutex);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V3 3/5] ublk: use struct_size() for allocation
  2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
  2025-10-29  3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
  2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
@ 2025-10-29  3:10 ` Ming Lei
  2025-10-29 16:00   ` Caleb Sander Mateos
  2025-10-29  3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
  2025-10-29  3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei
  4 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Convert ublk_queue to use struct_size() for allocation.

Changes in this commit:

1. Update ublk_init_queue() to use struct_size(ubq, ios, depth)
   instead of manual size calculation (sizeof(struct ublk_queue) +
   depth * sizeof(struct ublk_io)).

This provides better type safety and makes the code more maintainable
by using standard kernel macro for flexible array handling.

Meantime annotate ublk_queue.ios by __counted_by().

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index ed77b4527b33..409874714c62 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -203,7 +203,7 @@ struct ublk_queue {
 	bool fail_io; /* copy of dev->state == UBLK_S_DEV_FAIL_IO */
 	spinlock_t		cancel_lock;
 	struct ublk_device *dev;
-	struct ublk_io ios[];
+	struct ublk_io ios[] __counted_by(q_depth);
 };
 
 struct ublk_device {
@@ -2700,7 +2700,6 @@ static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
 static int ublk_init_queue(struct ublk_device *ub, int q_id)
 {
 	int depth = ub->dev_info.queue_depth;
-	int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
 	gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
 	struct ublk_queue *ubq;
 	struct page *page;
@@ -2711,7 +2710,8 @@ static int ublk_init_queue(struct ublk_device *ub, int q_id)
 	numa_node = ublk_get_queue_numa_node(ub, q_id);
 
 	/* Allocate queue structure on local NUMA node */
-	ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
+	ubq = kvzalloc_node(struct_size(ubq, ios, depth), GFP_KERNEL,
+			    numa_node);
 	if (!ubq)
 		return -ENOMEM;
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization
  2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
                   ` (2 preceding siblings ...)
  2025-10-29  3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
@ 2025-10-29  3:10 ` Ming Lei
  2025-10-29  3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei
  4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Move ublk_thread_set_sched_affinity() call before ublk_thread_init()
to ensure memory allocations during thread initialization occur on
the correct NUMA node. This leverages Linux's first-touch memory
policy for better NUMA locality.

Also convert ublk_thread_set_sched_affinity() to use
pthread_setaffinity_np() instead of sched_setaffinity(), as the
pthread API is the proper interface for setting thread affinity in
multithreaded programs.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 tools/testing/selftests/ublk/kublk.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c
index 6b8123c12a7a..062537ab8976 100644
--- a/tools/testing/selftests/ublk/kublk.c
+++ b/tools/testing/selftests/ublk/kublk.c
@@ -839,7 +839,7 @@ static int ublk_process_io(struct ublk_thread *t)
 static void ublk_thread_set_sched_affinity(const struct ublk_thread *t,
 		cpu_set_t *cpuset)
 {
-        if (sched_setaffinity(0, sizeof(*cpuset), cpuset) < 0)
+	if (pthread_setaffinity_np(pthread_self(), sizeof(*cpuset), cpuset) < 0)
 		ublk_err("ublk dev %u thread %u set affinity failed",
 				t->dev->dev_info.dev_id, t->idx);
 }
@@ -862,15 +862,21 @@ static void *ublk_io_handler_fn(void *data)
 	t->dev = info->dev;
 	t->idx = info->idx;
 
+	/*
+	 * IO perf is sensitive with queue pthread affinity on NUMA machine
+	 *
+	 * Set sched_affinity at beginning, so following allocated memory/pages
+	 * could be CPU/NUMA aware.
+	 */
+	if (info->affinity)
+		ublk_thread_set_sched_affinity(t, info->affinity);
+
 	ret = ublk_thread_init(t, info->extra_flags);
 	if (ret) {
 		ublk_err("ublk dev %d thread %u init failed\n",
 				dev_id, t->idx);
 		return NULL;
 	}
-	/* IO perf is sensitive with queue pthread affinity on NUMA machine*/
-	if (info->affinity)
-		ublk_thread_set_sched_affinity(t, info->affinity);
 	sem_post(info->ready);
 
 	ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %u started\n",
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable
  2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
                   ` (3 preceding siblings ...)
  2025-10-29  3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
@ 2025-10-29  3:10 ` Ming Lei
  4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29  3:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei

Refactor ublk_thread to be a thread-local variable instead of storing
it in ublk_dev:

- Remove pthread_t thread field from struct ublk_thread and move it to
  struct ublk_thread_info

- Remove struct ublk_thread array from struct ublk_dev, reducing memory
  footprint

- Define struct ublk_thread as local variable in __ublk_io_handler_fn()
  instead of accessing it from dev->threads[]

- Extract main IO handling logic into __ublk_io_handler_fn() which is
  marked as noinline

- Move CPU affinity setup to ublk_io_handler_fn() before calling
  __ublk_io_handler_fn()

- Update ublk_thread_set_sched_affinity() to take struct ublk_thread_info *
  instead of struct ublk_thread *, and use pthread_setaffinity_np()
  instead of sched_setaffinity()

- Reorder struct ublk_thread fields to group related state together

This change makes each thread's ublk_thread structure truly local to
the thread, improving cache locality and reducing memory usage.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 tools/testing/selftests/ublk/kublk.c | 76 +++++++++++++++-------------
 tools/testing/selftests/ublk/kublk.h |  9 ++--
 2 files changed, 45 insertions(+), 40 deletions(-)

diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c
index 062537ab8976..f8fa102a627f 100644
--- a/tools/testing/selftests/ublk/kublk.c
+++ b/tools/testing/selftests/ublk/kublk.c
@@ -836,62 +836,70 @@ static int ublk_process_io(struct ublk_thread *t)
 	return reapped;
 }
 
-static void ublk_thread_set_sched_affinity(const struct ublk_thread *t,
-		cpu_set_t *cpuset)
-{
-	if (pthread_setaffinity_np(pthread_self(), sizeof(*cpuset), cpuset) < 0)
-		ublk_err("ublk dev %u thread %u set affinity failed",
-				t->dev->dev_info.dev_id, t->idx);
-}
-
 struct ublk_thread_info {
 	struct ublk_dev 	*dev;
+	pthread_t		thread;
 	unsigned		idx;
 	sem_t 			*ready;
 	cpu_set_t 		*affinity;
 	unsigned long long	extra_flags;
 };
 
-static void *ublk_io_handler_fn(void *data)
+static void ublk_thread_set_sched_affinity(const struct ublk_thread_info *info)
 {
-	struct ublk_thread_info *info = data;
-	struct ublk_thread *t = &info->dev->threads[info->idx];
+	if (pthread_setaffinity_np(pthread_self(), sizeof(*info->affinity), info->affinity) < 0)
+		ublk_err("ublk dev %u thread %u set affinity failed",
+				info->dev->dev_info.dev_id, info->idx);
+}
+
+static __attribute__((noinline)) int __ublk_io_handler_fn(struct ublk_thread_info *info)
+{
+	struct ublk_thread t = {
+		.dev = info->dev,
+		.idx = info->idx,
+	};
 	int dev_id = info->dev->dev_info.dev_id;
 	int ret;
 
-	t->dev = info->dev;
-	t->idx = info->idx;
-
-	/*
-	 * IO perf is sensitive with queue pthread affinity on NUMA machine
-	 *
-	 * Set sched_affinity at beginning, so following allocated memory/pages
-	 * could be CPU/NUMA aware.
-	 */
-	if (info->affinity)
-		ublk_thread_set_sched_affinity(t, info->affinity);
-
-	ret = ublk_thread_init(t, info->extra_flags);
+	ret = ublk_thread_init(&t, info->extra_flags);
 	if (ret) {
 		ublk_err("ublk dev %d thread %u init failed\n",
-				dev_id, t->idx);
-		return NULL;
+				dev_id, t.idx);
+		return ret;
 	}
 	sem_post(info->ready);
 
 	ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %u started\n",
-			gettid(), dev_id, t->idx);
+			gettid(), dev_id, t.idx);
 
 	/* submit all io commands to ublk driver */
-	ublk_submit_fetch_commands(t);
+	ublk_submit_fetch_commands(&t);
 	do {
-		if (ublk_process_io(t) < 0)
+		if (ublk_process_io(&t) < 0)
 			break;
 	} while (1);
 
 	ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %d exiting\n",
-		 gettid(), dev_id, t->idx);
-	ublk_thread_deinit(t);
+		 gettid(), dev_id, t.idx);
+	ublk_thread_deinit(&t);
+	return 0;
+}
+
+static void *ublk_io_handler_fn(void *data)
+{
+	struct ublk_thread_info *info = data;
+
+	/*
+	 * IO perf is sensitive with queue pthread affinity on NUMA machine
+	 *
+	 * Set sched_affinity at beginning, so following allocated memory/pages
+	 * could be CPU/NUMA aware.
+	 */
+	if (info->affinity)
+		ublk_thread_set_sched_affinity(info);
+
+	__ublk_io_handler_fn(info);
+
 	return NULL;
 }
 
@@ -989,14 +997,13 @@ static int ublk_start_daemon(const struct dev_ctx *ctx, struct ublk_dev *dev)
 		 */
 		if (dev->nthreads == dinfo->nr_hw_queues)
 			tinfo[i].affinity = &affinity_buf[i];
-		pthread_create(&dev->threads[i].thread, NULL,
+		pthread_create(&tinfo[i].thread, NULL,
 				ublk_io_handler_fn,
 				&tinfo[i]);
 	}
 
 	for (i = 0; i < dev->nthreads; i++)
 		sem_wait(&ready);
-	free(tinfo);
 	free(affinity_buf);
 
 	/* everything is fine now, start us */
@@ -1019,7 +1026,8 @@ static int ublk_start_daemon(const struct dev_ctx *ctx, struct ublk_dev *dev)
 
 	/* wait until we are terminated */
 	for (i = 0; i < dev->nthreads; i++)
-		pthread_join(dev->threads[i].thread, &thread_ret);
+		pthread_join(tinfo[i].thread, &thread_ret);
+	free(tinfo);
  fail:
 	for (i = 0; i < dinfo->nr_hw_queues; i++)
 		ublk_queue_deinit(&dev->q[i]);
diff --git a/tools/testing/selftests/ublk/kublk.h b/tools/testing/selftests/ublk/kublk.h
index 5e55484fb0aa..fe42705c6d42 100644
--- a/tools/testing/selftests/ublk/kublk.h
+++ b/tools/testing/selftests/ublk/kublk.h
@@ -175,23 +175,20 @@ struct ublk_queue {
 
 struct ublk_thread {
 	struct ublk_dev *dev;
-	struct io_uring ring;
-	unsigned int cmd_inflight;
-	unsigned int io_inflight;
-
-	pthread_t thread;
 	unsigned idx;
 
 #define UBLKS_T_STOPPING	(1U << 0)
 #define UBLKS_T_IDLE	(1U << 1)
 	unsigned state;
+	unsigned int cmd_inflight;
+	unsigned int io_inflight;
+	struct io_uring ring;
 };
 
 struct ublk_dev {
 	struct ublk_tgt tgt;
 	struct ublksrv_ctrl_dev_info  dev_info;
 	struct ublk_queue q[UBLK_MAX_QUEUES];
-	struct ublk_thread threads[UBLK_MAX_THREADS];
 	unsigned nthreads;
 	unsigned per_io_tasks;
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
@ 2025-10-29 16:00   ` Caleb Sander Mateos
  2025-10-29 22:53     ` Ming Lei
  2025-10-30  4:04   ` kernel test robot
  2025-10-30  8:00   ` kernel test robot
  2 siblings, 1 reply; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-29 16:00 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Uday Shankar

On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> Implement NUMA-friendly memory allocation for ublk driver to improve
> performance on multi-socket systems.
>
> This commit includes the following changes:
>
> 1. Convert struct ublk_device to use a flexible array member for the
>    queues field instead of a separate pointer array allocation. This
>    eliminates one level of indirection and simplifies memory management.
>    The queues array is now allocated as part of struct ublk_device using
>    struct_size().

Technically it ends up being the same number of indirections as
before, since changing queues from a single allocation to an array of
separate allocations adds another indirection.

>
> 2. Rename __queues to queues, dropping the __ prefix since the field is
>    now accessed directly throughout the codebase rather than only through
>    the ublk_get_queue() helper.
>
> 3. Remove the queue_size field from struct ublk_device as it is no longer
>    needed.
>
> 4. Move queue allocation and deallocation into ublk_init_queue() and
>    ublk_deinit_queue() respectively, improving encapsulation. This
>    simplifies ublk_init_queues() and ublk_deinit_queues() to just
>    iterate and call the per-queue functions.
>
> 5. Add ublk_get_queue_numa_node() helper function to determine the
>    appropriate NUMA node for a queue by finding the first CPU mapped
>    to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
>    converting it to a NUMA node using cpu_to_node(). This function is
>    called internally by ublk_init_queue() to determine the allocation
>    node.
>
> 6. Allocate each queue structure on its local NUMA node using
>    kvzalloc_node() in ublk_init_queue().
>
> 7. Allocate the I/O command buffer on the same NUMA node using
>    alloc_pages_node().
>
> This reduces memory access latency on multi-socket NUMA systems by
> ensuring each queue's data structures are local to the CPUs that
> access them.
>
> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
>  1 file changed, 53 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index 2569566bf5e6..ed77b4527b33 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -209,9 +209,6 @@ struct ublk_queue {
>  struct ublk_device {
>         struct gendisk          *ub_disk;
>
> -       char    *__queues;
> -
> -       unsigned int    queue_size;
>         struct ublksrv_ctrl_dev_info    dev_info;
>
>         struct blk_mq_tag_set   tag_set;
> @@ -239,6 +236,8 @@ struct ublk_device {
>         bool canceling;
>         pid_t   ublksrv_tgid;
>         struct delayed_work     exit_work;
> +
> +       struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
>  };
>
>  /* header of ublk_params */
> @@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
>  static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
>                 int qid)
>  {
> -       return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
> +       return dev->queues[qid];
>  }
>
>  static inline bool ublk_rq_has_data(const struct request *rq)
> @@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
>
>  static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
>  {
> -       int size = ublk_queue_cmd_buf_size(ub);
> -       struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> -       int i;
> +       struct ublk_queue *ubq = ub->queues[q_id];
> +       int size, i;
> +
> +       if (!ubq)
> +               return;
> +
> +       size = ublk_queue_cmd_buf_size(ub);
>
>         for (i = 0; i < ubq->q_depth; i++) {
>                 struct ublk_io *io = &ubq->ios[i];
> @@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
>
>         if (ubq->io_cmd_buf)
>                 free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
> +
> +       kvfree(ubq);
> +       ub->queues[q_id] = NULL;
> +}
> +
> +static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
> +{
> +       unsigned int cpu;
> +
> +       /* Find first CPU mapped to this queue */
> +       for_each_possible_cpu(cpu) {
> +               if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
> +                       return cpu_to_node(cpu);
> +       }

I think you could avoid this quadratic lookup by using blk_mq_hw_ctx's
numa_node field. The initialization code would probably have to move
to ublk_init_hctx() in order to have access to the blk_mq_hw_ctx. But
may not be worth the effort just to save some time at ublk creation
time. What you have seems fine.

Best,
Caleb

> +
> +       return NUMA_NO_NODE;
>  }
>
>  static int ublk_init_queue(struct ublk_device *ub, int q_id)
>  {
> -       struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> +       int depth = ub->dev_info.queue_depth;
> +       int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
>         gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
> -       void *ptr;
> +       struct ublk_queue *ubq;
> +       struct page *page;
> +       int numa_node;
>         int size;
>
> +       /* Determine NUMA node based on queue's CPU affinity */
> +       numa_node = ublk_get_queue_numa_node(ub, q_id);
> +
> +       /* Allocate queue structure on local NUMA node */
> +       ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
> +       if (!ubq)
> +               return -ENOMEM;
> +
>         spin_lock_init(&ubq->cancel_lock);
>         ubq->flags = ub->dev_info.flags;
>         ubq->q_id = q_id;
> -       ubq->q_depth = ub->dev_info.queue_depth;
> +       ubq->q_depth = depth;
>         size = ublk_queue_cmd_buf_size(ub);
>
> -       ptr = (void *) __get_free_pages(gfp_flags, get_order(size));
> -       if (!ptr)
> +       /* Allocate I/O command buffer on local NUMA node */
> +       page = alloc_pages_node(numa_node, gfp_flags, get_order(size));
> +       if (!page) {
> +               kvfree(ubq);
>                 return -ENOMEM;
> +       }
> +       ubq->io_cmd_buf = page_address(page);
>
> -       ubq->io_cmd_buf = ptr;
> +       ub->queues[q_id] = ubq;
>         ubq->dev = ub;
>         return 0;
>  }
>
>  static void ublk_deinit_queues(struct ublk_device *ub)
>  {
> -       int nr_queues = ub->dev_info.nr_hw_queues;
>         int i;
>
> -       if (!ub->__queues)
> -               return;
> -
> -       for (i = 0; i < nr_queues; i++)
> +       for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
>                 ublk_deinit_queue(ub, i);
> -       kvfree(ub->__queues);
>  }
>
>  static int ublk_init_queues(struct ublk_device *ub)
>  {
> -       int nr_queues = ub->dev_info.nr_hw_queues;
> -       int depth = ub->dev_info.queue_depth;
> -       int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
> -       int i, ret = -ENOMEM;
> +       int i, ret;
>
> -       ub->queue_size = ubq_size;
> -       ub->__queues = kvcalloc(nr_queues, ubq_size, GFP_KERNEL);
> -       if (!ub->__queues)
> -               return ret;
> -
> -       for (i = 0; i < nr_queues; i++) {
> -               if (ublk_init_queue(ub, i))
> +       for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
> +               ret = ublk_init_queue(ub, i);
> +               if (ret)
>                         goto fail;
>         }
>
> @@ -3128,7 +3150,7 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
>                 goto out_unlock;
>
>         ret = -ENOMEM;
> -       ub = kzalloc(sizeof(*ub), GFP_KERNEL);
> +       ub = kzalloc(struct_size(ub, queues, info.nr_hw_queues), GFP_KERNEL);
>         if (!ub)
>                 goto out_unlock;
>         mutex_init(&ub->mutex);
> --
> 2.47.0
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 3/5] ublk: use struct_size() for allocation
  2025-10-29  3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
@ 2025-10-29 16:00   ` Caleb Sander Mateos
  0 siblings, 0 replies; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-29 16:00 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Uday Shankar

On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> Convert ublk_queue to use struct_size() for allocation.
>
> Changes in this commit:
>
> 1. Update ublk_init_queue() to use struct_size(ubq, ios, depth)
>    instead of manual size calculation (sizeof(struct ublk_queue) +
>    depth * sizeof(struct ublk_io)).
>
> This provides better type safety and makes the code more maintainable
> by using standard kernel macro for flexible array handling.
>
> Meantime annotate ublk_queue.ios by __counted_by().
>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>

> ---
>  drivers/block/ublk_drv.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index ed77b4527b33..409874714c62 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -203,7 +203,7 @@ struct ublk_queue {
>         bool fail_io; /* copy of dev->state == UBLK_S_DEV_FAIL_IO */
>         spinlock_t              cancel_lock;
>         struct ublk_device *dev;
> -       struct ublk_io ios[];
> +       struct ublk_io ios[] __counted_by(q_depth);
>  };
>
>  struct ublk_device {
> @@ -2700,7 +2700,6 @@ static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
>  static int ublk_init_queue(struct ublk_device *ub, int q_id)
>  {
>         int depth = ub->dev_info.queue_depth;
> -       int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
>         gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
>         struct ublk_queue *ubq;
>         struct page *page;
> @@ -2711,7 +2710,8 @@ static int ublk_init_queue(struct ublk_device *ub, int q_id)
>         numa_node = ublk_get_queue_numa_node(ub, q_id);
>
>         /* Allocate queue structure on local NUMA node */
> -       ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
> +       ubq = kvzalloc_node(struct_size(ubq, ios, depth), GFP_KERNEL,
> +                           numa_node);
>         if (!ubq)
>                 return -ENOMEM;
>
> --
> 2.47.0
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-29 16:00   ` Caleb Sander Mateos
@ 2025-10-29 22:53     ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 22:53 UTC (permalink / raw)
  To: Caleb Sander Mateos; +Cc: Jens Axboe, linux-block, Uday Shankar

On Wed, Oct 29, 2025 at 09:00:12AM -0700, Caleb Sander Mateos wrote:
> On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > Implement NUMA-friendly memory allocation for ublk driver to improve
> > performance on multi-socket systems.
> >
> > This commit includes the following changes:
> >
> > 1. Convert struct ublk_device to use a flexible array member for the
> >    queues field instead of a separate pointer array allocation. This
> >    eliminates one level of indirection and simplifies memory management.
> >    The queues array is now allocated as part of struct ublk_device using
> >    struct_size().
> 
> Technically it ends up being the same number of indirections as
> before, since changing queues from a single allocation to an array of
> separate allocations adds another indirection.

I think it is fine, because the pre-condition is NUMA aware allocation for
ublk_queue.

> 
> >
> > 2. Rename __queues to queues, dropping the __ prefix since the field is
> >    now accessed directly throughout the codebase rather than only through
> >    the ublk_get_queue() helper.
> >
> > 3. Remove the queue_size field from struct ublk_device as it is no longer
> >    needed.
> >
> > 4. Move queue allocation and deallocation into ublk_init_queue() and
> >    ublk_deinit_queue() respectively, improving encapsulation. This
> >    simplifies ublk_init_queues() and ublk_deinit_queues() to just
> >    iterate and call the per-queue functions.
> >
> > 5. Add ublk_get_queue_numa_node() helper function to determine the
> >    appropriate NUMA node for a queue by finding the first CPU mapped
> >    to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
> >    converting it to a NUMA node using cpu_to_node(). This function is
> >    called internally by ublk_init_queue() to determine the allocation
> >    node.
> >
> > 6. Allocate each queue structure on its local NUMA node using
> >    kvzalloc_node() in ublk_init_queue().
> >
> > 7. Allocate the I/O command buffer on the same NUMA node using
> >    alloc_pages_node().
> >
> > This reduces memory access latency on multi-socket NUMA systems by
> > ensuring each queue's data structures are local to the CPUs that
> > access them.
> >
> > Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
> >  1 file changed, 53 insertions(+), 31 deletions(-)
> >
> > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > index 2569566bf5e6..ed77b4527b33 100644
> > --- a/drivers/block/ublk_drv.c
> > +++ b/drivers/block/ublk_drv.c
> > @@ -209,9 +209,6 @@ struct ublk_queue {
> >  struct ublk_device {
> >         struct gendisk          *ub_disk;
> >
> > -       char    *__queues;
> > -
> > -       unsigned int    queue_size;
> >         struct ublksrv_ctrl_dev_info    dev_info;
> >
> >         struct blk_mq_tag_set   tag_set;
> > @@ -239,6 +236,8 @@ struct ublk_device {
> >         bool canceling;
> >         pid_t   ublksrv_tgid;
> >         struct delayed_work     exit_work;
> > +
> > +       struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
> >  };
> >
> >  /* header of ublk_params */
> > @@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
> >  static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
> >                 int qid)
> >  {
> > -       return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
> > +       return dev->queues[qid];
> >  }
> >
> >  static inline bool ublk_rq_has_data(const struct request *rq)
> > @@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
> >
> >  static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
> >  {
> > -       int size = ublk_queue_cmd_buf_size(ub);
> > -       struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> > -       int i;
> > +       struct ublk_queue *ubq = ub->queues[q_id];
> > +       int size, i;
> > +
> > +       if (!ubq)
> > +               return;
> > +
> > +       size = ublk_queue_cmd_buf_size(ub);
> >
> >         for (i = 0; i < ubq->q_depth; i++) {
> >                 struct ublk_io *io = &ubq->ios[i];
> > @@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
> >
> >         if (ubq->io_cmd_buf)
> >                 free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
> > +
> > +       kvfree(ubq);
> > +       ub->queues[q_id] = NULL;
> > +}
> > +
> > +static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
> > +{
> > +       unsigned int cpu;
> > +
> > +       /* Find first CPU mapped to this queue */
> > +       for_each_possible_cpu(cpu) {
> > +               if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
> > +                       return cpu_to_node(cpu);
> > +       }
> 
> I think you could avoid this quadratic lookup by using blk_mq_hw_ctx's
> numa_node field. The initialization code would probably have to move
> to ublk_init_hctx() in order to have access to the blk_mq_hw_ctx. But
> may not be worth the effort just to save some time at ublk creation
> time. What you have seems fine.

It isn't doable and not necessary.

disk/hw queues are created & initialized when handling UBLK_CMD_START_DEV, but the
backed ublk_queue need to be allocated when handling UBLK_CMD_ADD_DEV, which happens
before dealing with UBLK_CMD_START_DEV.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
  2025-10-29 16:00   ` Caleb Sander Mateos
@ 2025-10-30  4:04   ` kernel test robot
  2025-10-30  8:00   ` kernel test robot
  2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2025-10-30  4:04 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block
  Cc: oe-kbuild-all, Uday Shankar, Caleb Sander Mateos, Ming Lei

Hi Ming,

kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master v6.18-rc3 next-20251029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
config: csky-randconfig-r054-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301107.wrn8eeW8-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301107.wrn8eeW8-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510301107.wrn8eeW8-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from <command-line>:
>> drivers/block/ublk_drv.c:240:56: error: 'dev_info' undeclared here (not in a function); did you mean '_dev_info'?
     240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
         |                                                        ^~~~~~~~
   include/linux/compiler_types.h:346:71: note: in definition of macro '__counted_by'
     346 | # define __counted_by(member)           __attribute__((__counted_by__(member)))
         |                                                                       ^~~~~~
>> drivers/block/ublk_drv.c:240:34: error: 'counted_by' argument is not an identifier
     240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
         |                                  ^~~~~~


vim +240 drivers/block/ublk_drv.c

   208	
   209	struct ublk_device {
   210		struct gendisk		*ub_disk;
   211	
   212		struct ublksrv_ctrl_dev_info	dev_info;
   213	
   214		struct blk_mq_tag_set	tag_set;
   215	
   216		struct cdev		cdev;
   217		struct device		cdev_dev;
   218	
   219	#define UB_STATE_OPEN		0
   220	#define UB_STATE_USED		1
   221	#define UB_STATE_DELETED	2
   222		unsigned long		state;
   223		int			ub_number;
   224	
   225		struct mutex		mutex;
   226	
   227		spinlock_t		lock;
   228		struct mm_struct	*mm;
   229	
   230		struct ublk_params	params;
   231	
   232		struct completion	completion;
   233		u32			nr_io_ready;
   234		bool 			unprivileged_daemons;
   235		struct mutex cancel_mutex;
   236		bool canceling;
   237		pid_t 	ublksrv_tgid;
   238		struct delayed_work	exit_work;
   239	
 > 240		struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
   241	};
   242	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
  2025-10-29 16:00   ` Caleb Sander Mateos
  2025-10-30  4:04   ` kernel test robot
@ 2025-10-30  8:00   ` kernel test robot
  2025-10-30 14:07     ` Caleb Sander Mateos
  2 siblings, 1 reply; 14+ messages in thread
From: kernel test robot @ 2025-10-30  8:00 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block
  Cc: llvm, oe-kbuild-all, Uday Shankar, Caleb Sander Mateos, Ming Lei

Hi Ming,

kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on shuah-kselftest/next shuah-kselftest/fixes linus/master v6.18-rc3 next-20251029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
     240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
         |                                                        ^~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler_types.h:346:62: note: expanded from macro '__counted_by'
     346 | # define __counted_by(member)           __attribute__((__counted_by__(member)))
         |                                                                       ^~~~~~
   1 error generated.


vim +/counted_by +240 drivers/block/ublk_drv.c

   208	
   209	struct ublk_device {
   210		struct gendisk		*ub_disk;
   211	
   212		struct ublksrv_ctrl_dev_info	dev_info;
   213	
   214		struct blk_mq_tag_set	tag_set;
   215	
   216		struct cdev		cdev;
   217		struct device		cdev_dev;
   218	
   219	#define UB_STATE_OPEN		0
   220	#define UB_STATE_USED		1
   221	#define UB_STATE_DELETED	2
   222		unsigned long		state;
   223		int			ub_number;
   224	
   225		struct mutex		mutex;
   226	
   227		spinlock_t		lock;
   228		struct mm_struct	*mm;
   229	
   230		struct ublk_params	params;
   231	
   232		struct completion	completion;
   233		u32			nr_io_ready;
   234		bool 			unprivileged_daemons;
   235		struct mutex cancel_mutex;
   236		bool canceling;
   237		pid_t 	ublksrv_tgid;
   238		struct delayed_work	exit_work;
   239	
 > 240		struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
   241	};
   242	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-30  8:00   ` kernel test robot
@ 2025-10-30 14:07     ` Caleb Sander Mateos
  2025-10-30 17:56       ` Nathan Chancellor
  0 siblings, 1 reply; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-30 14:07 UTC (permalink / raw)
  To: kernel test robot
  Cc: Ming Lei, Jens Axboe, linux-block, llvm, oe-kbuild-all,
	Uday Shankar

On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
>
> Hi Ming,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on axboe-block/for-next]
> [also build test ERROR on shuah-kselftest/next shuah-kselftest/fixes linus/master v6.18-rc3 next-20251029]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> patch link:    https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
> >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
>      240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
>          |                                                        ^~~~~~~~~~~~~~~~~~~~~

Hmm, guess it doesn't support nested fields?

>    include/linux/compiler_types.h:346:62: note: expanded from macro '__counted_by'
>      346 | # define __counted_by(member)           __attribute__((__counted_by__(member)))
>          |                                                                       ^~~~~~
>    1 error generated.
>
>
> vim +/counted_by +240 drivers/block/ublk_drv.c
>
>    208
>    209  struct ublk_device {
>    210          struct gendisk          *ub_disk;
>    211
>    212          struct ublksrv_ctrl_dev_info    dev_info;
>    213
>    214          struct blk_mq_tag_set   tag_set;
>    215
>    216          struct cdev             cdev;
>    217          struct device           cdev_dev;
>    218
>    219  #define UB_STATE_OPEN           0
>    220  #define UB_STATE_USED           1
>    221  #define UB_STATE_DELETED        2
>    222          unsigned long           state;
>    223          int                     ub_number;
>    224
>    225          struct mutex            mutex;
>    226
>    227          spinlock_t              lock;
>    228          struct mm_struct        *mm;
>    229
>    230          struct ublk_params      params;
>    231
>    232          struct completion       completion;
>    233          u32                     nr_io_ready;
>    234          bool                    unprivileged_daemons;
>    235          struct mutex cancel_mutex;
>    236          bool canceling;
>    237          pid_t   ublksrv_tgid;
>    238          struct delayed_work     exit_work;
>    239
>  > 240          struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
>    241  };
>    242
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-30 14:07     ` Caleb Sander Mateos
@ 2025-10-30 17:56       ` Nathan Chancellor
  2025-10-31  3:28         ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Nathan Chancellor @ 2025-10-30 17:56 UTC (permalink / raw)
  To: Caleb Sander Mateos
  Cc: kernel test robot, Ming Lei, Jens Axboe, linux-block, llvm,
	oe-kbuild-all, Uday Shankar, Kees Cook

On Thu, Oct 30, 2025 at 07:07:25AM -0700, Caleb Sander Mateos wrote:
> On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
...
> > patch link:    https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> > patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> > config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> > compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <lkp@intel.com>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
> >
> > All errors (new ones prefixed by >>):
> >
> > >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
> >      240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
> >          |                                                        ^~~~~~~~~~~~~~~~~~~~~
> 
> Hmm, guess it doesn't support nested fields?

Correct. I think this is something that we want to support at some point
if I remember correctly but I think there was a lot of discussion
between GCC and clang on how to actually do it but Kees is free to
correct me if that is wrong.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
  2025-10-30 17:56       ` Nathan Chancellor
@ 2025-10-31  3:28         ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-31  3:28 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Caleb Sander Mateos, kernel test robot, Jens Axboe, linux-block,
	llvm, oe-kbuild-all, Uday Shankar, Kees Cook

On Thu, Oct 30, 2025 at 10:56:31AM -0700, Nathan Chancellor wrote:
> On Thu, Oct 30, 2025 at 07:07:25AM -0700, Caleb Sander Mateos wrote:
> > On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
> ...
> > > patch link:    https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> > > patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> > > config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> > > compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
> > >
> > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > the same patch/commit), kindly add following tags
> > > | Reported-by: kernel test robot <lkp@intel.com>
> > > | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
> > >
> > > All errors (new ones prefixed by >>):
> > >
> > > >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
> > >      240 |         struct ublk_queue       *queues[] __counted_by(dev_info.nr_hw_queues);
> > >          |                                                        ^~~~~~~~~~~~~~~~~~~~~
> > 
> > Hmm, guess it doesn't support nested fields?
> 
> Correct. I think this is something that we want to support at some point
> if I remember correctly but I think there was a lot of discussion
> between GCC and clang on how to actually do it but Kees is free to
> correct me if that is wrong.

Thanks for the confirmation.

Will remove this __counted_by() in next version.

Thanks,
Ming


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-10-31  3:29 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-29  3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
2025-10-29  3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
2025-10-29  3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
2025-10-29 16:00   ` Caleb Sander Mateos
2025-10-29 22:53     ` Ming Lei
2025-10-30  4:04   ` kernel test robot
2025-10-30  8:00   ` kernel test robot
2025-10-30 14:07     ` Caleb Sander Mateos
2025-10-30 17:56       ` Nathan Chancellor
2025-10-31  3:28         ` Ming Lei
2025-10-29  3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
2025-10-29 16:00   ` Caleb Sander Mateos
2025-10-29  3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
2025-10-29  3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.