* [PATCH V3 0/5] ublk: NUMA-aware memory allocation
@ 2025-10-29 3:10 Ming Lei
2025-10-29 3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Hi Jens,
The first two patches implement NUMA-aware memory allocation for the ublk
driver, and the third converts the queue allocation to struct_size().
The last two patches implement it for the ublk selftest utility.
`taskset -c 0-31 ~/git/fio/t/io_uring -p0 -n16 -r 40 /dev/ublkb0` shows a
5%~10% IOPS improvement on a dual-socket AMD Zen 4 machine when creating
ublk/null with 16 queues and AUTO_BUF_REG (zero copy).
V3:
- don't use DECLARE_FLEX_ARRAY()
- annotate the flexible arrays with __counted_by()
V2:
- use a flexible array member for the queues field, saving one indirection
when retrieving a ublk queue
- rename __queues into queues
- remove the queue_size field from struct ublk_device
- Move queue allocation and deallocation into ublk_init_queue() and
ublk_deinit_queue()
- use flexible array for ublk_queue.ios
- convert ublk_thread_set_sched_affinity() to use pthread_setaffinity_np()
Ming Lei (5):
ublk: reorder tag_set initialization before queue allocation
ublk: implement NUMA-aware memory allocation
ublk: use struct_size() for allocation
selftests: ublk: set CPU affinity before thread initialization
selftests: ublk: make ublk_thread thread-local variable
drivers/block/ublk_drv.c | 98 +++++++++++++++++-----------
tools/testing/selftests/ublk/kublk.c | 70 ++++++++++++--------
tools/testing/selftests/ublk/kublk.h | 9 +--
3 files changed, 105 insertions(+), 72 deletions(-)
--
2.47.0
* [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
@ 2025-10-29 3:10 ` Ming Lei
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
` (3 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Move ublk_add_tag_set() before ublk_init_queues() in the device
initialization path. This allows us to use the blk-mq CPU-to-queue
mapping established by the tag_set to determine the appropriate
NUMA node for each queue allocation.
The error handling paths are also reordered accordingly.
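For illustration only, the label reordering follows the usual rule that the
unwind path releases resources in the reverse order of their setup. Below is
a minimal user-space sketch of that pattern. The struct and the helper names
(dev_state, add_tag_set(), init_queues(), add_dev(), fail_at) are hypothetical
stand-ins, not the driver's functions:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device; tracks what has been set up. */
struct dev_state {
	bool tag_set_ready;
	bool queues_ready;
};

static int add_tag_set(struct dev_state *d)
{
	d->tag_set_ready = true;
	return 0;
}

static void free_tag_set(struct dev_state *d)
{
	d->tag_set_ready = false;
}

static int init_queues(struct dev_state *d, bool fail)
{
	if (fail)
		return -1;
	d->queues_ready = true;
	return 0;
}

static void deinit_queues(struct dev_state *d)
{
	d->queues_ready = false;
}

/* fail_at: 0 = no failure, 1 = queue init fails, 2 = a later step fails */
static int add_dev(struct dev_state *d, int fail_at)
{
	int ret;

	ret = add_tag_set(d);		/* set up first ... */
	if (ret)
		goto out;

	ret = init_queues(d, fail_at == 1);
	if (ret)
		goto out_free_tag_set;

	if (fail_at == 2) {
		ret = -2;
		goto out_deinit_queues;
	}
	return 0;

out_deinit_queues:
	deinit_queues(d);
out_free_tag_set:
	free_tag_set(d);		/* ... so torn down last */
out:
	return ret;
}
```

Whichever step fails, only the steps that already succeeded are undone, and
the tag set outlives the queues that depend on it.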
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 0c74a41a6753..2569566bf5e6 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -3178,17 +3178,17 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
ub->dev_info.nr_hw_queues, nr_cpu_ids);
ublk_align_max_io_size(ub);
- ret = ublk_init_queues(ub);
+ ret = ublk_add_tag_set(ub);
if (ret)
goto out_free_dev_number;
- ret = ublk_add_tag_set(ub);
+ ret = ublk_init_queues(ub);
if (ret)
- goto out_deinit_queues;
+ goto out_free_tag_set;
ret = -EFAULT;
if (copy_to_user(argp, &ub->dev_info, sizeof(info)))
- goto out_free_tag_set;
+ goto out_deinit_queues;
/*
* Add the char dev so that ublksrv daemon can be setup.
@@ -3197,10 +3197,10 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
ret = ublk_add_chdev(ub);
goto out_unlock;
-out_free_tag_set:
- blk_mq_free_tag_set(&ub->tag_set);
out_deinit_queues:
ublk_deinit_queues(ub);
+out_free_tag_set:
+ blk_mq_free_tag_set(&ub->tag_set);
out_free_dev_number:
ublk_free_dev_number(ub);
out_free_ub:
--
2.47.0
* [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
2025-10-29 3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
@ 2025-10-29 3:10 ` Ming Lei
2025-10-29 16:00 ` Caleb Sander Mateos
` (2 more replies)
2025-10-29 3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
` (2 subsequent siblings)
4 siblings, 3 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Implement NUMA-friendly memory allocation for the ublk driver to improve
performance on multi-socket systems.
This commit includes the following changes:
1. Convert struct ublk_device to use a flexible array member for the
queues field instead of a separate pointer array allocation. This
eliminates one level of indirection and simplifies memory management.
The queues array is now allocated as part of struct ublk_device using
struct_size().
2. Rename __queues to queues, dropping the __ prefix since the field is
now accessed directly throughout the codebase rather than only through
the ublk_get_queue() helper.
3. Remove the queue_size field from struct ublk_device as it is no longer
needed.
4. Move queue allocation and deallocation into ublk_init_queue() and
ublk_deinit_queue() respectively, improving encapsulation. This
simplifies ublk_init_queues() and ublk_deinit_queues() to just
iterate and call the per-queue functions.
5. Add ublk_get_queue_numa_node() helper function to determine the
appropriate NUMA node for a queue by finding the first CPU mapped
to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
converting it to a NUMA node using cpu_to_node(). This function is
called internally by ublk_init_queue() to determine the allocation
node.
6. Allocate each queue structure on its local NUMA node using
kvzalloc_node() in ublk_init_queue().
7. Allocate the I/O command buffer on the same NUMA node using
alloc_pages_node().
This reduces memory access latency on multi-socket NUMA systems by
ensuring each queue's data structures are local to the CPUs that
access them.
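As a rough user-space model of the lookup in item 5, assuming the CPU-to-queue
map and the CPU-to-node map are plain arrays: queue_numa_node(), mq_map,
cpu_node and NO_NODE are names chosen for this sketch, with NO_NODE standing
in for NUMA_NO_NODE.

```c
#include <stddef.h>

#define NO_NODE (-1)	/* stands in for NUMA_NO_NODE */

/*
 * mq_map[cpu] is the queue a CPU submits to; cpu_node[cpu] is that CPU's
 * NUMA node. Return the node of the first CPU mapped to q_id, or NO_NODE
 * if no CPU maps to it.
 */
static int queue_numa_node(const unsigned int *mq_map, const int *cpu_node,
			   size_t nr_cpus, unsigned int q_id)
{
	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		if (mq_map[cpu] == q_id)
			return cpu_node[cpu];
	}
	return NO_NODE;
}
```

Falling back to NO_NODE lets the allocator choose any node when a queue has
no CPU mapped to it.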
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
1 file changed, 53 insertions(+), 31 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 2569566bf5e6..ed77b4527b33 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -209,9 +209,6 @@ struct ublk_queue {
struct ublk_device {
struct gendisk *ub_disk;
- char *__queues;
-
- unsigned int queue_size;
struct ublksrv_ctrl_dev_info dev_info;
struct blk_mq_tag_set tag_set;
@@ -239,6 +236,8 @@ struct ublk_device {
bool canceling;
pid_t ublksrv_tgid;
struct delayed_work exit_work;
+
+ struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
};
/* header of ublk_params */
@@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
int qid)
{
- return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
+ return dev->queues[qid];
}
static inline bool ublk_rq_has_data(const struct request *rq)
@@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
{
- int size = ublk_queue_cmd_buf_size(ub);
- struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
- int i;
+ struct ublk_queue *ubq = ub->queues[q_id];
+ int size, i;
+
+ if (!ubq)
+ return;
+
+ size = ublk_queue_cmd_buf_size(ub);
for (i = 0; i < ubq->q_depth; i++) {
struct ublk_io *io = &ubq->ios[i];
@@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
if (ubq->io_cmd_buf)
free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
+
+ kvfree(ubq);
+ ub->queues[q_id] = NULL;
+}
+
+static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
+{
+ unsigned int cpu;
+
+ /* Find first CPU mapped to this queue */
+ for_each_possible_cpu(cpu) {
+ if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
+ return cpu_to_node(cpu);
+ }
+
+ return NUMA_NO_NODE;
}
static int ublk_init_queue(struct ublk_device *ub, int q_id)
{
- struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
+ int depth = ub->dev_info.queue_depth;
+ int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
- void *ptr;
+ struct ublk_queue *ubq;
+ struct page *page;
+ int numa_node;
int size;
+ /* Determine NUMA node based on queue's CPU affinity */
+ numa_node = ublk_get_queue_numa_node(ub, q_id);
+
+ /* Allocate queue structure on local NUMA node */
+ ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
+ if (!ubq)
+ return -ENOMEM;
+
spin_lock_init(&ubq->cancel_lock);
ubq->flags = ub->dev_info.flags;
ubq->q_id = q_id;
- ubq->q_depth = ub->dev_info.queue_depth;
+ ubq->q_depth = depth;
size = ublk_queue_cmd_buf_size(ub);
- ptr = (void *) __get_free_pages(gfp_flags, get_order(size));
- if (!ptr)
+ /* Allocate I/O command buffer on local NUMA node */
+ page = alloc_pages_node(numa_node, gfp_flags, get_order(size));
+ if (!page) {
+ kvfree(ubq);
return -ENOMEM;
+ }
+ ubq->io_cmd_buf = page_address(page);
- ubq->io_cmd_buf = ptr;
+ ub->queues[q_id] = ubq;
ubq->dev = ub;
return 0;
}
static void ublk_deinit_queues(struct ublk_device *ub)
{
- int nr_queues = ub->dev_info.nr_hw_queues;
int i;
- if (!ub->__queues)
- return;
-
- for (i = 0; i < nr_queues; i++)
+ for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
ublk_deinit_queue(ub, i);
- kvfree(ub->__queues);
}
static int ublk_init_queues(struct ublk_device *ub)
{
- int nr_queues = ub->dev_info.nr_hw_queues;
- int depth = ub->dev_info.queue_depth;
- int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
- int i, ret = -ENOMEM;
+ int i, ret;
- ub->queue_size = ubq_size;
- ub->__queues = kvcalloc(nr_queues, ubq_size, GFP_KERNEL);
- if (!ub->__queues)
- return ret;
-
- for (i = 0; i < nr_queues; i++) {
- if (ublk_init_queue(ub, i))
+ for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
+ ret = ublk_init_queue(ub, i);
+ if (ret)
goto fail;
}
@@ -3128,7 +3150,7 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
goto out_unlock;
ret = -ENOMEM;
- ub = kzalloc(sizeof(*ub), GFP_KERNEL);
+ ub = kzalloc(struct_size(ub, queues, info.nr_hw_queues), GFP_KERNEL);
if (!ub)
goto out_unlock;
mutex_init(&ub->mutex);
--
2.47.0
* [PATCH V3 3/5] ublk: use struct_size() for allocation
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
2025-10-29 3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
@ 2025-10-29 3:10 ` Ming Lei
2025-10-29 16:00 ` Caleb Sander Mateos
2025-10-29 3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
2025-10-29 3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei
4 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Convert ublk_queue to use struct_size() for allocation.
Changes in this commit:
1. Update ublk_init_queue() to use struct_size(ubq, ios, depth)
instead of manual size calculation (sizeof(struct ublk_queue) +
depth * sizeof(struct ublk_io)).
This provides better type safety and makes the code more maintainable
by using the standard kernel macro for flexible array handling.
Also annotate ublk_queue.ios with __counted_by().
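For readers outside the kernel tree, here is a user-space approximation of
what the size computation buys: the header-plus-array size is derived from the
types, and an overflowing count saturates to SIZE_MAX instead of wrapping.
queue_size() and queue_alloc() are hypothetical helpers for this sketch, not
the kernel macro itself:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct io {
	int flags;
};

struct queue {
	int q_depth;
	struct io ios[];	/* flexible array member */
};

/*
 * Size of struct queue with n trailing ios, or SIZE_MAX if the
 * multiplication or addition would overflow.
 */
static size_t queue_size(size_t n)
{
	if (n > (SIZE_MAX - sizeof(struct queue)) / sizeof(struct io))
		return SIZE_MAX;
	return sizeof(struct queue) + n * sizeof(struct io);
}

static struct queue *queue_alloc(size_t depth)
{
	size_t sz = queue_size(depth);
	struct queue *q;

	if (sz == SIZE_MAX)
		return NULL;	/* refuse instead of under-allocating */
	q = calloc(1, sz);
	if (q)
		q->q_depth = (int)depth;	/* set the count before using ios[] */
	return q;
}
```

A saturated size makes the allocation fail cleanly, where a wrapped size
would produce a buffer that is too small.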
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index ed77b4527b33..409874714c62 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -203,7 +203,7 @@ struct ublk_queue {
bool fail_io; /* copy of dev->state == UBLK_S_DEV_FAIL_IO */
spinlock_t cancel_lock;
struct ublk_device *dev;
- struct ublk_io ios[];
+ struct ublk_io ios[] __counted_by(q_depth);
};
struct ublk_device {
@@ -2700,7 +2700,6 @@ static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
static int ublk_init_queue(struct ublk_device *ub, int q_id)
{
int depth = ub->dev_info.queue_depth;
- int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
struct ublk_queue *ubq;
struct page *page;
@@ -2711,7 +2710,8 @@ static int ublk_init_queue(struct ublk_device *ub, int q_id)
numa_node = ublk_get_queue_numa_node(ub, q_id);
/* Allocate queue structure on local NUMA node */
- ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
+ ubq = kvzalloc_node(struct_size(ubq, ios, depth), GFP_KERNEL,
+ numa_node);
if (!ubq)
return -ENOMEM;
--
2.47.0
* [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
` (2 preceding siblings ...)
2025-10-29 3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
@ 2025-10-29 3:10 ` Ming Lei
2025-10-29 3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei
4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Move ublk_thread_set_sched_affinity() call before ublk_thread_init()
to ensure memory allocations during thread initialization occur on
the correct NUMA node. This leverages Linux's first-touch memory
policy for better NUMA locality.
Also convert ublk_thread_set_sched_affinity() to use
pthread_setaffinity_np() instead of sched_setaffinity(), as the
pthread API is the proper interface for setting thread affinity in
multithreaded programs.
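A minimal stand-alone sketch of the ordering described above: pin first, then
allocate and touch. The helper names are hypothetical. Note that
pthread_setaffinity_np() reports failure by returning a nonzero error number
rather than -1, so this sketch compares the result against 0:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for pthread_setaffinity_np() and the CPU_* macros */
#endif
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>

/*
 * Pin the calling thread to a single CPU. Returns 0 on success or a
 * positive error number on failure.
 */
static int pin_self_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

/* First CPU the calling thread is currently allowed to run on, or -1. */
static int first_allowed_cpu(void)
{
	cpu_set_t set;

	if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
		return -1;
	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, &set))
			return cpu;
	}
	return -1;
}

/*
 * Pin first, then allocate and touch the buffer, so that under the
 * default local allocation policy its pages are faulted in from the
 * pinned CPU.
 */
static void *alloc_after_pinning(int cpu, size_t len)
{
	void *buf;

	if (pin_self_to_cpu(cpu) != 0)
		return NULL;
	buf = malloc(len);
	if (buf)
		memset(buf, 0, len);	/* first touch happens here */
	return buf;
}
```

Done in the opposite order, the pages could be faulted in from whichever CPU
the thread happened to start on.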
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
tools/testing/selftests/ublk/kublk.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c
index 6b8123c12a7a..062537ab8976 100644
--- a/tools/testing/selftests/ublk/kublk.c
+++ b/tools/testing/selftests/ublk/kublk.c
@@ -839,7 +839,7 @@ static int ublk_process_io(struct ublk_thread *t)
static void ublk_thread_set_sched_affinity(const struct ublk_thread *t,
cpu_set_t *cpuset)
{
- if (sched_setaffinity(0, sizeof(*cpuset), cpuset) < 0)
+ if (pthread_setaffinity_np(pthread_self(), sizeof(*cpuset), cpuset) < 0)
ublk_err("ublk dev %u thread %u set affinity failed",
t->dev->dev_info.dev_id, t->idx);
}
@@ -862,15 +862,21 @@ static void *ublk_io_handler_fn(void *data)
t->dev = info->dev;
t->idx = info->idx;
+ /*
+ * IO perf is sensitive with queue pthread affinity on NUMA machine
+ *
+ * Set sched_affinity at beginning, so following allocated memory/pages
+ * could be CPU/NUMA aware.
+ */
+ if (info->affinity)
+ ublk_thread_set_sched_affinity(t, info->affinity);
+
ret = ublk_thread_init(t, info->extra_flags);
if (ret) {
ublk_err("ublk dev %d thread %u init failed\n",
dev_id, t->idx);
return NULL;
}
- /* IO perf is sensitive with queue pthread affinity on NUMA machine*/
- if (info->affinity)
- ublk_thread_set_sched_affinity(t, info->affinity);
sem_post(info->ready);
ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %u started\n",
--
2.47.0
* [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
` (3 preceding siblings ...)
2025-10-29 3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
@ 2025-10-29 3:10 ` Ming Lei
4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Refactor ublk_thread to be a thread-local variable instead of storing
it in ublk_dev:
- Remove pthread_t thread field from struct ublk_thread and move it to
struct ublk_thread_info
- Remove struct ublk_thread array from struct ublk_dev, reducing memory
footprint
- Define struct ublk_thread as local variable in __ublk_io_handler_fn()
instead of accessing it from dev->threads[]
- Extract main IO handling logic into __ublk_io_handler_fn() which is
marked as noinline
- Move CPU affinity setup to ublk_io_handler_fn() before calling
__ublk_io_handler_fn()
- Update ublk_thread_set_sched_affinity() to take struct ublk_thread_info *
instead of struct ublk_thread *, and use pthread_setaffinity_np()
instead of sched_setaffinity()
- Reorder struct ublk_thread fields to group related state together
This change makes each thread's ublk_thread structure truly local to
the thread, improving cache locality and reducing memory usage.
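The shape of this refactor can be sketched in a few lines, using trimmed-down,
hypothetical structures (thread_info, thread_state, handler_fn(),
run_threads()) in place of the selftest's own. The join handle lives in the
per-thread info owned by the creator, the working state lives on the worker's
stack, and the info array is only released after the joins:

```c
#include <pthread.h>
#include <stddef.h>

/* Owned by the creating thread; must outlive the worker. */
struct thread_info {
	pthread_t thread;	/* join handle */
	unsigned idx;
	unsigned result;	/* written by the worker, read after join */
};

/* Working state; only ever touched by the worker itself. */
struct thread_state {
	unsigned idx;
	unsigned io_done;
};

static void *handler_fn(void *data)
{
	struct thread_info *info = data;
	/* Per-thread state is a local variable on this thread's stack. */
	struct thread_state t = { .idx = info->idx };

	for (int i = 0; i < 100; i++)
		t.io_done++;

	info->result = t.io_done + t.idx;
	return NULL;
}

/* Start n workers, wait for them, and return the sum of their results. */
static unsigned run_threads(struct thread_info *tinfo, unsigned n)
{
	unsigned sum = 0;

	for (unsigned i = 0; i < n; i++) {
		tinfo[i].idx = i;
		/* Sketch only: a real caller would join the started threads here. */
		if (pthread_create(&tinfo[i].thread, NULL, handler_fn, &tinfo[i]))
			return 0;
	}
	/* tinfo must stay alive until every thread has been joined. */
	for (unsigned i = 0; i < n; i++) {
		pthread_join(tinfo[i].thread, NULL);
		sum += tinfo[i].result;
	}
	return sum;
}
```

This is also why free(tinfo) has to move after the pthread_join() loop once
the pthread_t is stored there.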
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
tools/testing/selftests/ublk/kublk.c | 76 +++++++++++++++-------------
tools/testing/selftests/ublk/kublk.h | 9 ++--
2 files changed, 45 insertions(+), 40 deletions(-)
diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c
index 062537ab8976..f8fa102a627f 100644
--- a/tools/testing/selftests/ublk/kublk.c
+++ b/tools/testing/selftests/ublk/kublk.c
@@ -836,62 +836,70 @@ static int ublk_process_io(struct ublk_thread *t)
return reapped;
}
-static void ublk_thread_set_sched_affinity(const struct ublk_thread *t,
- cpu_set_t *cpuset)
-{
- if (pthread_setaffinity_np(pthread_self(), sizeof(*cpuset), cpuset) < 0)
- ublk_err("ublk dev %u thread %u set affinity failed",
- t->dev->dev_info.dev_id, t->idx);
-}
-
struct ublk_thread_info {
struct ublk_dev *dev;
+ pthread_t thread;
unsigned idx;
sem_t *ready;
cpu_set_t *affinity;
unsigned long long extra_flags;
};
-static void *ublk_io_handler_fn(void *data)
+static void ublk_thread_set_sched_affinity(const struct ublk_thread_info *info)
{
- struct ublk_thread_info *info = data;
- struct ublk_thread *t = &info->dev->threads[info->idx];
+ if (pthread_setaffinity_np(pthread_self(), sizeof(*info->affinity), info->affinity) < 0)
+ ublk_err("ublk dev %u thread %u set affinity failed",
+ info->dev->dev_info.dev_id, info->idx);
+}
+
+static __attribute__((noinline)) int __ublk_io_handler_fn(struct ublk_thread_info *info)
+{
+ struct ublk_thread t = {
+ .dev = info->dev,
+ .idx = info->idx,
+ };
int dev_id = info->dev->dev_info.dev_id;
int ret;
- t->dev = info->dev;
- t->idx = info->idx;
-
- /*
- * IO perf is sensitive with queue pthread affinity on NUMA machine
- *
- * Set sched_affinity at beginning, so following allocated memory/pages
- * could be CPU/NUMA aware.
- */
- if (info->affinity)
- ublk_thread_set_sched_affinity(t, info->affinity);
-
- ret = ublk_thread_init(t, info->extra_flags);
+ ret = ublk_thread_init(&t, info->extra_flags);
if (ret) {
ublk_err("ublk dev %d thread %u init failed\n",
- dev_id, t->idx);
- return NULL;
+ dev_id, t.idx);
+ return ret;
}
sem_post(info->ready);
ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %u started\n",
- gettid(), dev_id, t->idx);
+ gettid(), dev_id, t.idx);
/* submit all io commands to ublk driver */
- ublk_submit_fetch_commands(t);
+ ublk_submit_fetch_commands(&t);
do {
- if (ublk_process_io(t) < 0)
+ if (ublk_process_io(&t) < 0)
break;
} while (1);
ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %d exiting\n",
- gettid(), dev_id, t->idx);
- ublk_thread_deinit(t);
+ gettid(), dev_id, t.idx);
+ ublk_thread_deinit(&t);
+ return 0;
+}
+
+static void *ublk_io_handler_fn(void *data)
+{
+ struct ublk_thread_info *info = data;
+
+ /*
+ * IO perf is sensitive with queue pthread affinity on NUMA machine
+ *
+ * Set sched_affinity at beginning, so following allocated memory/pages
+ * could be CPU/NUMA aware.
+ */
+ if (info->affinity)
+ ublk_thread_set_sched_affinity(info);
+
+ __ublk_io_handler_fn(info);
+
return NULL;
}
@@ -989,14 +997,13 @@ static int ublk_start_daemon(const struct dev_ctx *ctx, struct ublk_dev *dev)
*/
if (dev->nthreads == dinfo->nr_hw_queues)
tinfo[i].affinity = &affinity_buf[i];
- pthread_create(&dev->threads[i].thread, NULL,
+ pthread_create(&tinfo[i].thread, NULL,
ublk_io_handler_fn,
&tinfo[i]);
}
for (i = 0; i < dev->nthreads; i++)
sem_wait(&ready);
- free(tinfo);
free(affinity_buf);
/* everything is fine now, start us */
@@ -1019,7 +1026,8 @@ static int ublk_start_daemon(const struct dev_ctx *ctx, struct ublk_dev *dev)
/* wait until we are terminated */
for (i = 0; i < dev->nthreads; i++)
- pthread_join(dev->threads[i].thread, &thread_ret);
+ pthread_join(tinfo[i].thread, &thread_ret);
+ free(tinfo);
fail:
for (i = 0; i < dinfo->nr_hw_queues; i++)
ublk_queue_deinit(&dev->q[i]);
diff --git a/tools/testing/selftests/ublk/kublk.h b/tools/testing/selftests/ublk/kublk.h
index 5e55484fb0aa..fe42705c6d42 100644
--- a/tools/testing/selftests/ublk/kublk.h
+++ b/tools/testing/selftests/ublk/kublk.h
@@ -175,23 +175,20 @@ struct ublk_queue {
struct ublk_thread {
struct ublk_dev *dev;
- struct io_uring ring;
- unsigned int cmd_inflight;
- unsigned int io_inflight;
-
- pthread_t thread;
unsigned idx;
#define UBLKS_T_STOPPING (1U << 0)
#define UBLKS_T_IDLE (1U << 1)
unsigned state;
+ unsigned int cmd_inflight;
+ unsigned int io_inflight;
+ struct io_uring ring;
};
struct ublk_dev {
struct ublk_tgt tgt;
struct ublksrv_ctrl_dev_info dev_info;
struct ublk_queue q[UBLK_MAX_QUEUES];
- struct ublk_thread threads[UBLK_MAX_THREADS];
unsigned nthreads;
unsigned per_io_tasks;
--
2.47.0
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
@ 2025-10-29 16:00 ` Caleb Sander Mateos
2025-10-29 22:53 ` Ming Lei
2025-10-30 4:04 ` kernel test robot
2025-10-30 8:00 ` kernel test robot
2 siblings, 1 reply; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-29 16:00 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, linux-block, Uday Shankar
On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> Implement NUMA-friendly memory allocation for ublk driver to improve
> performance on multi-socket systems.
>
> This commit includes the following changes:
>
> 1. Convert struct ublk_device to use a flexible array member for the
> queues field instead of a separate pointer array allocation. This
> eliminates one level of indirection and simplifies memory management.
> The queues array is now allocated as part of struct ublk_device using
> struct_size().
Technically it ends up being the same number of indirections as
before, since changing queues from a single allocation to an array of
separate allocations adds another indirection.
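To make the comparison concrete, here is a reduced sketch of the two layouts,
with hypothetical struct names (dev_flat, dev_ptrs). In both, reaching a queue
from the device costs one dependent load: the base pointer in the old layout,
the per-queue pointer in the new one.

```c
#include <stddef.h>

struct queue {
	int q_id;
};

/* Old layout: one flat buffer, queues found by stride arithmetic. */
struct dev_flat {
	char *queues;		/* load: the base pointer */
	size_t queue_size;
};

/* New layout: pointers embedded in the device, queues allocated apart. */
struct dev_ptrs {
	int nr_queues;
	struct queue *queues[];	/* load: the per-queue pointer */
};

static struct queue *get_queue_flat(struct dev_flat *d, int qid)
{
	return (struct queue *)&d->queues[qid * d->queue_size];
}

static struct queue *get_queue_ptrs(struct dev_ptrs *d, int qid)
{
	return d->queues[qid];
}
```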
>
> 2. Rename __queues to queues, dropping the __ prefix since the field is
> now accessed directly throughout the codebase rather than only through
> the ublk_get_queue() helper.
>
> 3. Remove the queue_size field from struct ublk_device as it is no longer
> needed.
>
> 4. Move queue allocation and deallocation into ublk_init_queue() and
> ublk_deinit_queue() respectively, improving encapsulation. This
> simplifies ublk_init_queues() and ublk_deinit_queues() to just
> iterate and call the per-queue functions.
>
> 5. Add ublk_get_queue_numa_node() helper function to determine the
> appropriate NUMA node for a queue by finding the first CPU mapped
> to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
> converting it to a NUMA node using cpu_to_node(). This function is
> called internally by ublk_init_queue() to determine the allocation
> node.
>
> 6. Allocate each queue structure on its local NUMA node using
> kvzalloc_node() in ublk_init_queue().
>
> 7. Allocate the I/O command buffer on the same NUMA node using
> alloc_pages_node().
>
> This reduces memory access latency on multi-socket NUMA systems by
> ensuring each queue's data structures are local to the CPUs that
> access them.
>
> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
> drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
> 1 file changed, 53 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index 2569566bf5e6..ed77b4527b33 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -209,9 +209,6 @@ struct ublk_queue {
> struct ublk_device {
> struct gendisk *ub_disk;
>
> - char *__queues;
> -
> - unsigned int queue_size;
> struct ublksrv_ctrl_dev_info dev_info;
>
> struct blk_mq_tag_set tag_set;
> @@ -239,6 +236,8 @@ struct ublk_device {
> bool canceling;
> pid_t ublksrv_tgid;
> struct delayed_work exit_work;
> +
> + struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> };
>
> /* header of ublk_params */
> @@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
> static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
> int qid)
> {
> - return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
> + return dev->queues[qid];
> }
>
> static inline bool ublk_rq_has_data(const struct request *rq)
> @@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
>
> static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
> {
> - int size = ublk_queue_cmd_buf_size(ub);
> - struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> - int i;
> + struct ublk_queue *ubq = ub->queues[q_id];
> + int size, i;
> +
> + if (!ubq)
> + return;
> +
> + size = ublk_queue_cmd_buf_size(ub);
>
> for (i = 0; i < ubq->q_depth; i++) {
> struct ublk_io *io = &ubq->ios[i];
> @@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
>
> if (ubq->io_cmd_buf)
> free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
> +
> + kvfree(ubq);
> + ub->queues[q_id] = NULL;
> +}
> +
> +static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
> +{
> + unsigned int cpu;
> +
> + /* Find first CPU mapped to this queue */
> + for_each_possible_cpu(cpu) {
> + if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
> + return cpu_to_node(cpu);
> + }
I think you could avoid this quadratic lookup by using blk_mq_hw_ctx's
numa_node field. The initialization code would probably have to move
to ublk_init_hctx() in order to have access to the blk_mq_hw_ctx. But
it may not be worth the effort just to save some time at ublk creation
time. What you have seems fine.
Best,
Caleb
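For reference, the lookup is quadratic because the CPU scan runs once per
queue. The sketch below shows a different way to make the cost linear than
the blk_mq_hw_ctx approach suggested above: a single pass over the CPUs that
fills a per-queue node table. It is a user-space model with hypothetical
names (build_queue_nodes(), NO_NODE), not proposed driver code:

```c
#include <stddef.h>

#define NO_NODE (-1)	/* stands in for NUMA_NO_NODE */

/*
 * Fill queue_node[q] with the node of the first CPU mapped to queue q,
 * visiting each CPU exactly once. Queues with no CPU keep NO_NODE.
 */
static void build_queue_nodes(const unsigned int *mq_map, const int *cpu_node,
			      size_t nr_cpus, int *queue_node, size_t nr_queues)
{
	for (size_t q = 0; q < nr_queues; q++)
		queue_node[q] = NO_NODE;

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		unsigned int q = mq_map[cpu];

		if (q < nr_queues && queue_node[q] == NO_NODE)
			queue_node[q] = cpu_node[cpu];
	}
}
```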
> +
> + return NUMA_NO_NODE;
> }
>
> static int ublk_init_queue(struct ublk_device *ub, int q_id)
> {
> - struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> + int depth = ub->dev_info.queue_depth;
> + int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
> gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
> - void *ptr;
> + struct ublk_queue *ubq;
> + struct page *page;
> + int numa_node;
> int size;
>
> + /* Determine NUMA node based on queue's CPU affinity */
> + numa_node = ublk_get_queue_numa_node(ub, q_id);
> +
> + /* Allocate queue structure on local NUMA node */
> + ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
> + if (!ubq)
> + return -ENOMEM;
> +
> spin_lock_init(&ubq->cancel_lock);
> ubq->flags = ub->dev_info.flags;
> ubq->q_id = q_id;
> - ubq->q_depth = ub->dev_info.queue_depth;
> + ubq->q_depth = depth;
> size = ublk_queue_cmd_buf_size(ub);
>
> - ptr = (void *) __get_free_pages(gfp_flags, get_order(size));
> - if (!ptr)
> + /* Allocate I/O command buffer on local NUMA node */
> + page = alloc_pages_node(numa_node, gfp_flags, get_order(size));
> + if (!page) {
> + kvfree(ubq);
> return -ENOMEM;
> + }
> + ubq->io_cmd_buf = page_address(page);
>
> - ubq->io_cmd_buf = ptr;
> + ub->queues[q_id] = ubq;
> ubq->dev = ub;
> return 0;
> }
>
> static void ublk_deinit_queues(struct ublk_device *ub)
> {
> - int nr_queues = ub->dev_info.nr_hw_queues;
> int i;
>
> - if (!ub->__queues)
> - return;
> -
> - for (i = 0; i < nr_queues; i++)
> + for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
> ublk_deinit_queue(ub, i);
> - kvfree(ub->__queues);
> }
>
> static int ublk_init_queues(struct ublk_device *ub)
> {
> - int nr_queues = ub->dev_info.nr_hw_queues;
> - int depth = ub->dev_info.queue_depth;
> - int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
> - int i, ret = -ENOMEM;
> + int i, ret;
>
> - ub->queue_size = ubq_size;
> - ub->__queues = kvcalloc(nr_queues, ubq_size, GFP_KERNEL);
> - if (!ub->__queues)
> - return ret;
> -
> - for (i = 0; i < nr_queues; i++) {
> - if (ublk_init_queue(ub, i))
> + for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
> + ret = ublk_init_queue(ub, i);
> + if (ret)
> goto fail;
> }
>
> @@ -3128,7 +3150,7 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
> goto out_unlock;
>
> ret = -ENOMEM;
> - ub = kzalloc(sizeof(*ub), GFP_KERNEL);
> + ub = kzalloc(struct_size(ub, queues, info.nr_hw_queues), GFP_KERNEL);
> if (!ub)
> goto out_unlock;
> mutex_init(&ub->mutex);
> --
> 2.47.0
>
* Re: [PATCH V3 3/5] ublk: use struct_size() for allocation
2025-10-29 3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
@ 2025-10-29 16:00 ` Caleb Sander Mateos
0 siblings, 0 replies; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-29 16:00 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, linux-block, Uday Shankar
On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> Convert ublk_queue to use struct_size() for allocation.
>
> Changes in this commit:
>
> 1. Update ublk_init_queue() to use struct_size(ubq, ios, depth)
> instead of manual size calculation (sizeof(struct ublk_queue) +
> depth * sizeof(struct ublk_io)).
>
> This provides better type safety and makes the code more maintainable
> by using standard kernel macro for flexible array handling.
>
> Meantime annotate ublk_queue.ios by __counted_by().
>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
> ---
> drivers/block/ublk_drv.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index ed77b4527b33..409874714c62 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -203,7 +203,7 @@ struct ublk_queue {
> bool fail_io; /* copy of dev->state == UBLK_S_DEV_FAIL_IO */
> spinlock_t cancel_lock;
> struct ublk_device *dev;
> - struct ublk_io ios[];
> + struct ublk_io ios[] __counted_by(q_depth);
> };
>
> struct ublk_device {
> @@ -2700,7 +2700,6 @@ static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
> static int ublk_init_queue(struct ublk_device *ub, int q_id)
> {
> int depth = ub->dev_info.queue_depth;
> - int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
> gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
> struct ublk_queue *ubq;
> struct page *page;
> @@ -2711,7 +2710,8 @@ static int ublk_init_queue(struct ublk_device *ub, int q_id)
> numa_node = ublk_get_queue_numa_node(ub, q_id);
>
> /* Allocate queue structure on local NUMA node */
> - ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
> + ubq = kvzalloc_node(struct_size(ubq, ios, depth), GFP_KERNEL,
> + numa_node);
> if (!ubq)
> return -ENOMEM;
>
> --
> 2.47.0
>
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-29 16:00 ` Caleb Sander Mateos
@ 2025-10-29 22:53 ` Ming Lei
0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 22:53 UTC (permalink / raw)
To: Caleb Sander Mateos; +Cc: Jens Axboe, linux-block, Uday Shankar
On Wed, Oct 29, 2025 at 09:00:12AM -0700, Caleb Sander Mateos wrote:
> On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > Implement NUMA-friendly memory allocation for ublk driver to improve
> > performance on multi-socket systems.
> >
> > This commit includes the following changes:
> >
> > 1. Convert struct ublk_device to use a flexible array member for the
> > queues field instead of a separate pointer array allocation. This
> > eliminates one level of indirection and simplifies memory management.
> > The queues array is now allocated as part of struct ublk_device using
> > struct_size().
>
> Technically it ends up being the same number of indirections as
> before, since changing queues from a single allocation to an array of
> separate allocations adds another indirection.
I think it is fine, because separate per-queue allocations are the
precondition for NUMA-aware allocation of ublk_queue.
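The indirection point can be sketched in userspace as follows. This is only
an illustration under assumed names (struct dev_old, struct dev_new,
get_queue_old(), get_queue_new() and the reduced struct queue are all
hypothetical); it mirrors the before/after shape of ublk_get_queue() in the
quoted diff.

```c
#include <assert.h>
#include <stdlib.h>

struct queue {
	int q_id;
	int q_depth;
};

/* Old layout: one contiguous blob, indexed by a byte stride. */
struct dev_old {
	char *queues_blob;		/* one pointer load: the blob */
	unsigned int queue_size;	/* stride between queues */
};

static struct queue *get_queue_old(struct dev_old *d, int qid)
{
	assert(qid >= 0);
	return (struct queue *)&d->queues_blob[(size_t)qid * d->queue_size];
}

/* New layout: pointer array stored inline at the end of the device. */
struct dev_new {
	unsigned int nr_queues;
	struct queue *queues[];		/* one pointer load: the queue */
};

static struct queue *get_queue_new(struct dev_new *d, int qid)
{
	assert(qid >= 0 && (unsigned int)qid < d->nr_queues);
	return d->queues[qid];
}
```

Both getters perform a single pointer load beyond the device pointer. What
the new layout buys is that every queue is its own allocation, so each one
can be placed on a different NUMA node.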
>
> >
> > 2. Rename __queues to queues, dropping the __ prefix since the field is
> > now accessed directly throughout the codebase rather than only through
> > the ublk_get_queue() helper.
> >
> > 3. Remove the queue_size field from struct ublk_device as it is no longer
> > needed.
> >
> > 4. Move queue allocation and deallocation into ublk_init_queue() and
> > ublk_deinit_queue() respectively, improving encapsulation. This
> > simplifies ublk_init_queues() and ublk_deinit_queues() to just
> > iterate and call the per-queue functions.
> >
> > 5. Add ublk_get_queue_numa_node() helper function to determine the
> > appropriate NUMA node for a queue by finding the first CPU mapped
> > to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
> > converting it to a NUMA node using cpu_to_node(). This function is
> > called internally by ublk_init_queue() to determine the allocation
> > node.
> >
> > 6. Allocate each queue structure on its local NUMA node using
> > kvzalloc_node() in ublk_init_queue().
> >
> > 7. Allocate the I/O command buffer on the same NUMA node using
> > alloc_pages_node().
> >
> > This reduces memory access latency on multi-socket NUMA systems by
> > ensuring each queue's data structures are local to the CPUs that
> > access them.
> >
> > Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> > drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
> > 1 file changed, 53 insertions(+), 31 deletions(-)
> >
> > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > index 2569566bf5e6..ed77b4527b33 100644
> > --- a/drivers/block/ublk_drv.c
> > +++ b/drivers/block/ublk_drv.c
> > @@ -209,9 +209,6 @@ struct ublk_queue {
> > struct ublk_device {
> > struct gendisk *ub_disk;
> >
> > - char *__queues;
> > -
> > - unsigned int queue_size;
> > struct ublksrv_ctrl_dev_info dev_info;
> >
> > struct blk_mq_tag_set tag_set;
> > @@ -239,6 +236,8 @@ struct ublk_device {
> > bool canceling;
> > pid_t ublksrv_tgid;
> > struct delayed_work exit_work;
> > +
> > + struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> > };
> >
> > /* header of ublk_params */
> > @@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
> > static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
> > int qid)
> > {
> > - return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
> > + return dev->queues[qid];
> > }
> >
> > static inline bool ublk_rq_has_data(const struct request *rq)
> > @@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
> >
> > static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
> > {
> > - int size = ublk_queue_cmd_buf_size(ub);
> > - struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> > - int i;
> > + struct ublk_queue *ubq = ub->queues[q_id];
> > + int size, i;
> > +
> > + if (!ubq)
> > + return;
> > +
> > + size = ublk_queue_cmd_buf_size(ub);
> >
> > for (i = 0; i < ubq->q_depth; i++) {
> > struct ublk_io *io = &ubq->ios[i];
> > @@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
> >
> > if (ubq->io_cmd_buf)
> > free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
> > +
> > + kvfree(ubq);
> > + ub->queues[q_id] = NULL;
> > +}
> > +
> > +static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
> > +{
> > + unsigned int cpu;
> > +
> > + /* Find first CPU mapped to this queue */
> > + for_each_possible_cpu(cpu) {
> > + if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
> > + return cpu_to_node(cpu);
> > + }
>
> I think you could avoid this quadratic lookup by using blk_mq_hw_ctx's
> numa_node field. The initialization code would probably have to move
> to ublk_init_hctx() in order to have access to the blk_mq_hw_ctx. But
> may not be worth the effort just to save some time at ublk creation
> time. What you have seems fine.
It isn't doable, and it isn't necessary.
The disk and hw queues are created and initialized when handling
UBLK_CMD_START_DEV, but the backing ublk_queue needs to be allocated when
handling UBLK_CMD_ADD_DEV, which happens before UBLK_CMD_START_DEV.
Thanks,
Ming
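The lookup under discussion can be sketched in userspace as below. The tables
are hypothetical stand-ins for tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
cpu_to_node(), and NR_CPUS, NO_NODE and queue_numa_node() are made-up names.
The quoted hunk is cut off before the function's fallback, so returning
NO_NODE when no CPU maps to the queue is an assumption of this sketch.

```c
#include <assert.h>

#define NR_CPUS 8	/* hypothetical CPU count for the sketch */
#define NO_NODE (-1)	/* assumed fallback when no CPU maps to the queue */

/* Hypothetical topology: CPU -> queue id, and CPU -> NUMA node. */
static const unsigned int mq_map[NR_CPUS] = { 0, 0, 1, 1, 2, 2, 3, 3 };
static const int cpu_node[NR_CPUS]        = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Return the NUMA node of the first CPU mapped to q_id. */
static int queue_numa_node(unsigned int q_id)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (mq_map[cpu] == q_id)
			return cpu_node[cpu];
	}
	return NO_NODE;
}
```

Each call scans the CPU list once, so initializing every queue costs roughly
nr_queues * nr_cpus comparisons. That is the "quadratic lookup" raised in the
review, and it is paid only at device creation time.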
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
2025-10-29 16:00 ` Caleb Sander Mateos
@ 2025-10-30 4:04 ` kernel test robot
2025-10-30 8:00 ` kernel test robot
2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2025-10-30 4:04 UTC (permalink / raw)
To: Ming Lei, Jens Axboe, linux-block
Cc: oe-kbuild-all, Uday Shankar, Caleb Sander Mateos, Ming Lei
Hi Ming,
kernel test robot noticed the following build errors:
[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master v6.18-rc3 next-20251029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
config: csky-randconfig-r054-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301107.wrn8eeW8-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301107.wrn8eeW8-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510301107.wrn8eeW8-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from <command-line>:
>> drivers/block/ublk_drv.c:240:56: error: 'dev_info' undeclared here (not in a function); did you mean '_dev_info'?
240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
| ^~~~~~~~
include/linux/compiler_types.h:346:71: note: in definition of macro '__counted_by'
346 | # define __counted_by(member) __attribute__((__counted_by__(member)))
| ^~~~~~
>> drivers/block/ublk_drv.c:240:34: error: 'counted_by' argument is not an identifier
240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
| ^~~~~~
vim +240 drivers/block/ublk_drv.c
208
209 struct ublk_device {
210 struct gendisk *ub_disk;
211
212 struct ublksrv_ctrl_dev_info dev_info;
213
214 struct blk_mq_tag_set tag_set;
215
216 struct cdev cdev;
217 struct device cdev_dev;
218
219 #define UB_STATE_OPEN 0
220 #define UB_STATE_USED 1
221 #define UB_STATE_DELETED 2
222 unsigned long state;
223 int ub_number;
224
225 struct mutex mutex;
226
227 spinlock_t lock;
228 struct mm_struct *mm;
229
230 struct ublk_params params;
231
232 struct completion completion;
233 u32 nr_io_ready;
234 bool unprivileged_daemons;
235 struct mutex cancel_mutex;
236 bool canceling;
237 pid_t ublksrv_tgid;
238 struct delayed_work exit_work;
239
> 240 struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
241 };
242
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
2025-10-29 16:00 ` Caleb Sander Mateos
2025-10-30 4:04 ` kernel test robot
@ 2025-10-30 8:00 ` kernel test robot
2025-10-30 14:07 ` Caleb Sander Mateos
2 siblings, 1 reply; 14+ messages in thread
From: kernel test robot @ 2025-10-30 8:00 UTC (permalink / raw)
To: Ming Lei, Jens Axboe, linux-block
Cc: llvm, oe-kbuild-all, Uday Shankar, Caleb Sander Mateos, Ming Lei
Hi Ming,
kernel test robot noticed the following build errors:
[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on shuah-kselftest/next shuah-kselftest/fixes linus/master v6.18-rc3 next-20251029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
All errors (new ones prefixed by >>):
>> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
| ^~~~~~~~~~~~~~~~~~~~~
include/linux/compiler_types.h:346:62: note: expanded from macro '__counted_by'
346 | # define __counted_by(member) __attribute__((__counted_by__(member)))
| ^~~~~~
1 error generated.
vim +/counted_by +240 drivers/block/ublk_drv.c
208
209 struct ublk_device {
210 struct gendisk *ub_disk;
211
212 struct ublksrv_ctrl_dev_info dev_info;
213
214 struct blk_mq_tag_set tag_set;
215
216 struct cdev cdev;
217 struct device cdev_dev;
218
219 #define UB_STATE_OPEN 0
220 #define UB_STATE_USED 1
221 #define UB_STATE_DELETED 2
222 unsigned long state;
223 int ub_number;
224
225 struct mutex mutex;
226
227 spinlock_t lock;
228 struct mm_struct *mm;
229
230 struct ublk_params params;
231
232 struct completion completion;
233 u32 nr_io_ready;
234 bool unprivileged_daemons;
235 struct mutex cancel_mutex;
236 bool canceling;
237 pid_t ublksrv_tgid;
238 struct delayed_work exit_work;
239
> 240 struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
241 };
242
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-30 8:00 ` kernel test robot
@ 2025-10-30 14:07 ` Caleb Sander Mateos
2025-10-30 17:56 ` Nathan Chancellor
0 siblings, 1 reply; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-30 14:07 UTC (permalink / raw)
To: kernel test robot
Cc: Ming Lei, Jens Axboe, linux-block, llvm, oe-kbuild-all,
Uday Shankar
On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
>
> Hi Ming,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on axboe-block/for-next]
> [also build test ERROR on shuah-kselftest/next shuah-kselftest/fixes linus/master v6.18-rc3 next-20251029]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
> base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> patch link: https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
> >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
> 240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> | ^~~~~~~~~~~~~~~~~~~~~
Hmm, guess it doesn't support nested fields?
> include/linux/compiler_types.h:346:62: note: expanded from macro '__counted_by'
> 346 | # define __counted_by(member) __attribute__((__counted_by__(member)))
> | ^~~~~~
> 1 error generated.
>
>
> vim +/counted_by +240 drivers/block/ublk_drv.c
>
> 208
> 209 struct ublk_device {
> 210 struct gendisk *ub_disk;
> 211
> 212 struct ublksrv_ctrl_dev_info dev_info;
> 213
> 214 struct blk_mq_tag_set tag_set;
> 215
> 216 struct cdev cdev;
> 217 struct device cdev_dev;
> 218
> 219 #define UB_STATE_OPEN 0
> 220 #define UB_STATE_USED 1
> 221 #define UB_STATE_DELETED 2
> 222 unsigned long state;
> 223 int ub_number;
> 224
> 225 struct mutex mutex;
> 226
> 227 spinlock_t lock;
> 228 struct mm_struct *mm;
> 229
> 230 struct ublk_params params;
> 231
> 232 struct completion completion;
> 233 u32 nr_io_ready;
> 234 bool unprivileged_daemons;
> 235 struct mutex cancel_mutex;
> 236 bool canceling;
> 237 pid_t ublksrv_tgid;
> 238 struct delayed_work exit_work;
> 239
> > 240 struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> 241 };
> 242
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-30 14:07 ` Caleb Sander Mateos
@ 2025-10-30 17:56 ` Nathan Chancellor
2025-10-31 3:28 ` Ming Lei
0 siblings, 1 reply; 14+ messages in thread
From: Nathan Chancellor @ 2025-10-30 17:56 UTC (permalink / raw)
To: Caleb Sander Mateos
Cc: kernel test robot, Ming Lei, Jens Axboe, linux-block, llvm,
oe-kbuild-all, Uday Shankar, Kees Cook
On Thu, Oct 30, 2025 at 07:07:25AM -0700, Caleb Sander Mateos wrote:
> On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
...
> > patch link: https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> > patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> > config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> > compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <lkp@intel.com>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
> >
> > All errors (new ones prefixed by >>):
> >
> > >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
> > 240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> > | ^~~~~~~~~~~~~~~~~~~~~
>
> Hmm, guess it doesn't support nested fields?
Correct. If I remember correctly, this is something we want to support at
some point, but there was a lot of discussion between GCC and clang on how
to actually do it. Kees is free to correct me if that is wrong.
Cheers,
Nathan
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-30 17:56 ` Nathan Chancellor
@ 2025-10-31 3:28 ` Ming Lei
0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-31 3:28 UTC (permalink / raw)
To: Nathan Chancellor
Cc: Caleb Sander Mateos, kernel test robot, Jens Axboe, linux-block,
llvm, oe-kbuild-all, Uday Shankar, Kees Cook
On Thu, Oct 30, 2025 at 10:56:31AM -0700, Nathan Chancellor wrote:
> On Thu, Oct 30, 2025 at 07:07:25AM -0700, Caleb Sander Mateos wrote:
> > On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
> ...
> > > patch link: https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> > > patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> > > config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> > > compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
> > >
> > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > the same patch/commit), kindly add following tags
> > > | Reported-by: kernel test robot <lkp@intel.com>
> > > | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
> > >
> > > All errors (new ones prefixed by >>):
> > >
> > > >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
> > > 240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> > > | ^~~~~~~~~~~~~~~~~~~~~
> >
> > Hmm, guess it doesn't support nested fields?
>
> Correct. I think this is something that we want to support at some point
> if I remember correctly but I think there was a lot of discussion
> between GCC and clang on how to actually do it but Kees is free to
> correct me if that is wrong.
Thanks for the confirmation.
I will remove this __counted_by() in the next version.
Thanks,
Ming
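As an illustration of the restriction confirmed above, here is a minimal
userspace sketch. The thread's resolution is simply to drop the annotation;
the mirrored nr_queues field below is a hypothetical alternative, not
something proposed in this series. counted_by(), struct dev_info,
struct device and device_alloc() are made-up names, and the macro falls back
to a no-op on compilers without the attribute.

```c
#include <assert.h>
#include <stdlib.h>

/* Expand to the attribute only when the compiler advertises it. */
#if defined(__has_attribute)
# if __has_attribute(__counted_by__)
#  define counted_by(member) __attribute__((__counted_by__(member)))
# endif
#endif
#ifndef counted_by
# define counted_by(member)
#endif

struct queue;

struct dev_info {
	unsigned short nr_hw_queues;
};

struct device {
	struct dev_info dev_info;
	/*
	 * Rejected by the compilers in the reports above:
	 *   struct queue *queues[] counted_by(dev_info.nr_hw_queues);
	 * The count has to be a direct member of the same struct, so this
	 * sketch mirrors it into nr_queues (a hypothetical field).
	 */
	unsigned short nr_queues;
	struct queue *queues[] counted_by(nr_queues);
};

static struct device *device_alloc(unsigned short nr)
{
	struct device *d = calloc(1, sizeof(*d) + nr * sizeof(d->queues[0]));

	if (!d)
		return NULL;
	d->dev_info.nr_hw_queues = nr;
	d->nr_queues = nr;	/* set the count before touching queues[] */
	return d;
}
```

A mirrored count has to be kept in sync with the nested one, which is a
reasonable argument for just removing the annotation as planned.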