* [PATCH V3 0/5] ublk: NUMA-aware memory allocation
@ 2025-10-29 3:10 Ming Lei
2025-10-29 3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Hi Jens,
The first two patches implement NUMA-aware memory allocation in the ublk
driver. The last two patches implement it for the ublk selftest utility.
`taskset -c 0-31 ~/git/fio/t/io_uring -p0 -n16 -r 40 /dev/ublkb0` shows a
5%-10% IOPS improvement on an AMD Zen 4 dual-socket machine when creating
ublk/null with 16 queues and AUTO_BUF_REG (zero copy).
V3:
- don't use DECLARE_FLEX_ARRAY()
- annotate flexible array by __counted_by()
V2:
- use a flexible array member for queues field, save one indirection
for retrieving ublk queue
- rename __queues into queues
- remove the queue_size field from struct ublk_device
- Move queue allocation and deallocation into ublk_init_queue() and
ublk_deinit_queue()
- use flexible array for ublk_queue.ios
- convert ublk_thread_set_sched_affinity() to use pthread_setaffinity_np()
Ming Lei (5):
ublk: reorder tag_set initialization before queue allocation
ublk: implement NUMA-aware memory allocation
ublk: use struct_size() for allocation
selftests: ublk: set CPU affinity before thread initialization
selftests: ublk: make ublk_thread thread-local variable
drivers/block/ublk_drv.c | 98 +++++++++++++++++-----------
tools/testing/selftests/ublk/kublk.c | 70 ++++++++++++--------
tools/testing/selftests/ublk/kublk.h | 9 +--
3 files changed, 105 insertions(+), 72 deletions(-)
--
2.47.0
* [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
@ 2025-10-29 3:10 ` Ming Lei
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
` (3 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Move ublk_add_tag_set() before ublk_init_queues() in the device
initialization path. This allows us to use the blk-mq CPU-to-queue
mapping established by the tag_set to determine the appropriate
NUMA node for each queue allocation.
The error handling paths are also reordered accordingly.
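The unwind ordering can be sketched in plain userspace C. This is only an illustration of the goto-based pattern, not driver code: step(), trace and add_dev_sketch() are hypothetical names, and the "A"/"B"/"C" stages merely stand in for the tag set, the queues and the copy to userspace.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical call log so the ordering can be inspected. */
static char trace[64];

/* Record a setup step; fail when asked to. */
static int step(const char *name, int fail)
{
	strcat(trace, name);
	return fail ? -1 : 0;
}

/*
 * Acquire A (tag set) before B (queues), then release in the reverse
 * order on failure. fail_at selects the failing step (0 = none).
 */
static int add_dev_sketch(int fail_at)
{
	int ret;

	trace[0] = '\0';

	ret = step("A+", fail_at == 1);	/* role of ublk_add_tag_set() */
	if (ret)
		goto out;

	ret = step("B+", fail_at == 2);	/* role of ublk_init_queues() */
	if (ret)
		goto out_free_a;

	ret = step("C+", fail_at == 3);	/* role of copy_to_user() */
	if (ret)
		goto out_free_b;

	return 0;

out_free_b:
	strcat(trace, "B-");		/* role of ublk_deinit_queues() */
out_free_a:
	strcat(trace, "A-");		/* role of blk_mq_free_tag_set() */
out:
	return ret;
}
```

Because the labels fall through, swapping the order of two setup steps means swapping both the goto targets and the order of the labels, which is what the diff below does.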
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 0c74a41a6753..2569566bf5e6 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -3178,17 +3178,17 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
ub->dev_info.nr_hw_queues, nr_cpu_ids);
ublk_align_max_io_size(ub);
- ret = ublk_init_queues(ub);
+ ret = ublk_add_tag_set(ub);
if (ret)
goto out_free_dev_number;
- ret = ublk_add_tag_set(ub);
+ ret = ublk_init_queues(ub);
if (ret)
- goto out_deinit_queues;
+ goto out_free_tag_set;
ret = -EFAULT;
if (copy_to_user(argp, &ub->dev_info, sizeof(info)))
- goto out_free_tag_set;
+ goto out_deinit_queues;
/*
* Add the char dev so that ublksrv daemon can be setup.
@@ -3197,10 +3197,10 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
ret = ublk_add_chdev(ub);
goto out_unlock;
-out_free_tag_set:
- blk_mq_free_tag_set(&ub->tag_set);
out_deinit_queues:
ublk_deinit_queues(ub);
+out_free_tag_set:
+ blk_mq_free_tag_set(&ub->tag_set);
out_free_dev_number:
ublk_free_dev_number(ub);
out_free_ub:
--
2.47.0
* [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
2025-10-29 3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
@ 2025-10-29 3:10 ` Ming Lei
2025-10-29 16:00 ` Caleb Sander Mateos
` (2 more replies)
2025-10-29 3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
` (2 subsequent siblings)
4 siblings, 3 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Implement NUMA-friendly memory allocation for the ublk driver to improve
performance on multi-socket systems.
This commit includes the following changes:
1. Convert struct ublk_device to use a flexible array member for the
queues field instead of a separate pointer array allocation. This
eliminates one level of indirection and simplifies memory management.
The queues array is now allocated as part of struct ublk_device using
struct_size().
2. Rename __queues to queues, dropping the __ prefix since the field is
now accessed directly throughout the codebase rather than only through
the ublk_get_queue() helper.
3. Remove the queue_size field from struct ublk_device as it is no longer
needed.
4. Move queue allocation and deallocation into ublk_init_queue() and
ublk_deinit_queue() respectively, improving encapsulation. This
simplifies ublk_init_queues() and ublk_deinit_queues() to just
iterate and call the per-queue functions.
5. Add ublk_get_queue_numa_node() helper function to determine the
appropriate NUMA node for a queue by finding the first CPU mapped
to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
converting it to a NUMA node using cpu_to_node(). This function is
called internally by ublk_init_queue() to determine the allocation
node.
6. Allocate each queue structure on its local NUMA node using
kvzalloc_node() in ublk_init_queue().
7. Allocate the I/O command buffer on the same NUMA node using
alloc_pages_node().
This reduces memory access latency on multi-socket NUMA systems by
ensuring each queue's data structures are local to the CPUs that
access them.
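The node lookup in item 5 can be sketched in userspace as follows. The two tables are made-up data standing in for tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and cpu_to_node(); all sketch_* names and SKETCH_* constants are assumptions for illustration only.

```c
#include <assert.h>

#define SKETCH_NR_CPUS	8
#define SKETCH_NO_NODE	(-1)	/* stands in for NUMA_NO_NODE */

/* Hypothetical CPU -> hardware queue map (the role of mq_map[]). */
static const unsigned int sketch_mq_map[SKETCH_NR_CPUS] = {
	0, 0, 1, 1, 2, 2, 3, 3
};

/* Hypothetical CPU -> NUMA node map (the role of cpu_to_node()). */
static const int sketch_cpu_node[SKETCH_NR_CPUS] = {
	0, 0, 0, 0, 1, 1, 1, 1
};

/* Return the node of the first CPU mapped to q_id, if any. */
static int sketch_queue_numa_node(int q_id)
{
	int cpu;

	assert(q_id >= 0);

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (sketch_mq_map[cpu] == (unsigned int)q_id)
			return sketch_cpu_node[cpu];
	}

	/* No CPU maps to this queue: let the allocator choose. */
	return SKETCH_NO_NODE;
}
```

With this layout, queues 0 and 1 resolve to node 0 and queues 2 and 3 to node 1. A queue with no mapped CPU yields the "no node" value, which the node-aware allocators treat as having no node preference.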
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
1 file changed, 53 insertions(+), 31 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 2569566bf5e6..ed77b4527b33 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -209,9 +209,6 @@ struct ublk_queue {
struct ublk_device {
struct gendisk *ub_disk;
- char *__queues;
-
- unsigned int queue_size;
struct ublksrv_ctrl_dev_info dev_info;
struct blk_mq_tag_set tag_set;
@@ -239,6 +236,8 @@ struct ublk_device {
bool canceling;
pid_t ublksrv_tgid;
struct delayed_work exit_work;
+
+ struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
};
/* header of ublk_params */
@@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
int qid)
{
- return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
+ return dev->queues[qid];
}
static inline bool ublk_rq_has_data(const struct request *rq)
@@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
{
- int size = ublk_queue_cmd_buf_size(ub);
- struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
- int i;
+ struct ublk_queue *ubq = ub->queues[q_id];
+ int size, i;
+
+ if (!ubq)
+ return;
+
+ size = ublk_queue_cmd_buf_size(ub);
for (i = 0; i < ubq->q_depth; i++) {
struct ublk_io *io = &ubq->ios[i];
@@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
if (ubq->io_cmd_buf)
free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
+
+ kvfree(ubq);
+ ub->queues[q_id] = NULL;
+}
+
+static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
+{
+ unsigned int cpu;
+
+ /* Find first CPU mapped to this queue */
+ for_each_possible_cpu(cpu) {
+ if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
+ return cpu_to_node(cpu);
+ }
+
+ return NUMA_NO_NODE;
}
static int ublk_init_queue(struct ublk_device *ub, int q_id)
{
- struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
+ int depth = ub->dev_info.queue_depth;
+ int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
- void *ptr;
+ struct ublk_queue *ubq;
+ struct page *page;
+ int numa_node;
int size;
+ /* Determine NUMA node based on queue's CPU affinity */
+ numa_node = ublk_get_queue_numa_node(ub, q_id);
+
+ /* Allocate queue structure on local NUMA node */
+ ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
+ if (!ubq)
+ return -ENOMEM;
+
spin_lock_init(&ubq->cancel_lock);
ubq->flags = ub->dev_info.flags;
ubq->q_id = q_id;
- ubq->q_depth = ub->dev_info.queue_depth;
+ ubq->q_depth = depth;
size = ublk_queue_cmd_buf_size(ub);
- ptr = (void *) __get_free_pages(gfp_flags, get_order(size));
- if (!ptr)
+ /* Allocate I/O command buffer on local NUMA node */
+ page = alloc_pages_node(numa_node, gfp_flags, get_order(size));
+ if (!page) {
+ kvfree(ubq);
return -ENOMEM;
+ }
+ ubq->io_cmd_buf = page_address(page);
- ubq->io_cmd_buf = ptr;
+ ub->queues[q_id] = ubq;
ubq->dev = ub;
return 0;
}
static void ublk_deinit_queues(struct ublk_device *ub)
{
- int nr_queues = ub->dev_info.nr_hw_queues;
int i;
- if (!ub->__queues)
- return;
-
- for (i = 0; i < nr_queues; i++)
+ for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
ublk_deinit_queue(ub, i);
- kvfree(ub->__queues);
}
static int ublk_init_queues(struct ublk_device *ub)
{
- int nr_queues = ub->dev_info.nr_hw_queues;
- int depth = ub->dev_info.queue_depth;
- int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
- int i, ret = -ENOMEM;
+ int i, ret;
- ub->queue_size = ubq_size;
- ub->__queues = kvcalloc(nr_queues, ubq_size, GFP_KERNEL);
- if (!ub->__queues)
- return ret;
-
- for (i = 0; i < nr_queues; i++) {
- if (ublk_init_queue(ub, i))
+ for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
+ ret = ublk_init_queue(ub, i);
+ if (ret)
goto fail;
}
@@ -3128,7 +3150,7 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
goto out_unlock;
ret = -ENOMEM;
- ub = kzalloc(sizeof(*ub), GFP_KERNEL);
+ ub = kzalloc(struct_size(ub, queues, info.nr_hw_queues), GFP_KERNEL);
if (!ub)
goto out_unlock;
mutex_init(&ub->mutex);
--
2.47.0
* [PATCH V3 3/5] ublk: use struct_size() for allocation
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
2025-10-29 3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
@ 2025-10-29 3:10 ` Ming Lei
2025-10-29 16:00 ` Caleb Sander Mateos
2025-10-29 3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
2025-10-29 3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei
4 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Convert ublk_queue to use struct_size() for allocation.
Changes in this commit:
1. Update ublk_init_queue() to use struct_size(ubq, ios, depth)
instead of manual size calculation (sizeof(struct ublk_queue) +
depth * sizeof(struct ublk_io)).
This provides better type safety and makes the code more maintainable
by using the standard kernel macro for flexible array handling.
Meanwhile, annotate ublk_queue.ios with __counted_by().
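For readers less familiar with the macro, here is a rough userspace approximation of what struct_size() buys over the open-coded sum: the multiplication and addition are overflow-checked and saturate to SIZE_MAX, so an oversized request fails in the allocator instead of wrapping to a small size. The sketch_* names are hypothetical, and the GCC/Clang overflow builtins are assumed to be available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type standing in for struct ublk_io. */
struct sketch_io {
	unsigned long addr;
	unsigned int flags;
	int res;
};

/* Hypothetical header type standing in for struct ublk_queue. */
struct sketch_queue {
	int q_id;
	int q_depth;
	struct sketch_io ios[];	/* flexible array member */
};

/* header + n * elem, saturating to SIZE_MAX on overflow. */
static size_t sketch_struct_size(size_t header, size_t elem, size_t n)
{
	size_t bytes;

	if (__builtin_mul_overflow(elem, n, &bytes) ||
	    __builtin_add_overflow(bytes, header, &bytes))
		return SIZE_MAX;

	return bytes;
}

/* Allocation size for a queue with 'depth' trailing ios[] entries. */
static size_t sketch_queue_size(size_t depth)
{
	return sketch_struct_size(sizeof(struct sketch_queue),
				  sizeof(struct sketch_io), depth);
}
```

For ordinary depths the result equals the manual calculation; only the overflow case differs.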
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index ed77b4527b33..409874714c62 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -203,7 +203,7 @@ struct ublk_queue {
bool fail_io; /* copy of dev->state == UBLK_S_DEV_FAIL_IO */
spinlock_t cancel_lock;
struct ublk_device *dev;
- struct ublk_io ios[];
+ struct ublk_io ios[] __counted_by(q_depth);
};
struct ublk_device {
@@ -2700,7 +2700,6 @@ static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
static int ublk_init_queue(struct ublk_device *ub, int q_id)
{
int depth = ub->dev_info.queue_depth;
- int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
struct ublk_queue *ubq;
struct page *page;
@@ -2711,7 +2710,8 @@ static int ublk_init_queue(struct ublk_device *ub, int q_id)
numa_node = ublk_get_queue_numa_node(ub, q_id);
/* Allocate queue structure on local NUMA node */
- ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
+ ubq = kvzalloc_node(struct_size(ubq, ios, depth), GFP_KERNEL,
+ numa_node);
if (!ubq)
return -ENOMEM;
--
2.47.0
* [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
` (2 preceding siblings ...)
2025-10-29 3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
@ 2025-10-29 3:10 ` Ming Lei
2025-10-29 3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei
4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Move ublk_thread_set_sched_affinity() call before ublk_thread_init()
to ensure memory allocations during thread initialization occur on
the correct NUMA node. This leverages Linux's first-touch memory
policy for better NUMA locality.
Also convert ublk_thread_set_sched_affinity() to use
pthread_setaffinity_np() instead of sched_setaffinity(), as the
pthread API is the proper interface for setting thread affinity in
multithreaded programs.
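A minimal sketch of the "pin first, allocate afterwards" ordering, assuming glibc on Linux. The sketch_* names and the buffer size are made up for illustration and are not part of kublk. Note that pthread_setaffinity_np() returns 0 on success or a positive error number on failure (it does not return -1 and set errno), so the sketch stores the return value and treats any nonzero value as failure.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for pthread_setaffinity_np() and CPU_* macros */
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BUF_SIZE	(1 << 20)

struct sketch_worker {
	int cpu;	/* CPU to pin the thread to */
	int pin_err;	/* 0, or a positive error number */
	void *buf;	/* memory allocated after pinning */
};

/* Pick the first CPU this process is currently allowed to run on. */
static int sketch_first_allowed_cpu(void)
{
	cpu_set_t set;
	int cpu;

	if (sched_getaffinity(0, sizeof(set), &set) < 0)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, &set))
			return cpu;
	}
	return -1;
}

static void *sketch_worker_fn(void *data)
{
	struct sketch_worker *w = data;
	cpu_set_t set;

	/* 1. Set affinity before any per-thread allocation. */
	CPU_ZERO(&set);
	CPU_SET(w->cpu, &set);
	w->pin_err = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

	/* 2. Allocate and touch memory while running on the target CPU. */
	w->buf = malloc(SKETCH_BUF_SIZE);
	if (w->buf)
		memset(w->buf, 0, SKETCH_BUF_SIZE);

	return NULL;
}

/* Run one worker to completion; returns 0 or a pthread error number. */
static int sketch_run_worker(struct sketch_worker *w)
{
	pthread_t thread;
	int err;

	err = pthread_create(&thread, NULL, sketch_worker_fn, w);
	if (err)
		return err;

	return pthread_join(thread, NULL);
}
```

The memset() is what actually faults the pages in; under the default local allocation policy they are then placed on the node of the CPU doing the touching, which is why the pinning has to come first.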
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
tools/testing/selftests/ublk/kublk.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c
index 6b8123c12a7a..062537ab8976 100644
--- a/tools/testing/selftests/ublk/kublk.c
+++ b/tools/testing/selftests/ublk/kublk.c
@@ -839,7 +839,7 @@ static int ublk_process_io(struct ublk_thread *t)
static void ublk_thread_set_sched_affinity(const struct ublk_thread *t,
cpu_set_t *cpuset)
{
- if (sched_setaffinity(0, sizeof(*cpuset), cpuset) < 0)
+ if (pthread_setaffinity_np(pthread_self(), sizeof(*cpuset), cpuset) < 0)
ublk_err("ublk dev %u thread %u set affinity failed",
t->dev->dev_info.dev_id, t->idx);
}
@@ -862,15 +862,21 @@ static void *ublk_io_handler_fn(void *data)
t->dev = info->dev;
t->idx = info->idx;
+ /*
+ * IO perf is sensitive with queue pthread affinity on NUMA machine
+ *
+ * Set sched_affinity at beginning, so following allocated memory/pages
+ * could be CPU/NUMA aware.
+ */
+ if (info->affinity)
+ ublk_thread_set_sched_affinity(t, info->affinity);
+
ret = ublk_thread_init(t, info->extra_flags);
if (ret) {
ublk_err("ublk dev %d thread %u init failed\n",
dev_id, t->idx);
return NULL;
}
- /* IO perf is sensitive with queue pthread affinity on NUMA machine*/
- if (info->affinity)
- ublk_thread_set_sched_affinity(t, info->affinity);
sem_post(info->ready);
ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %u started\n",
--
2.47.0
* [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
` (3 preceding siblings ...)
2025-10-29 3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
@ 2025-10-29 3:10 ` Ming Lei
4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 3:10 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Uday Shankar, Caleb Sander Mateos, Ming Lei
Refactor ublk_thread to be a thread-local variable instead of storing
it in ublk_dev:
- Remove pthread_t thread field from struct ublk_thread and move it to
struct ublk_thread_info
- Remove struct ublk_thread array from struct ublk_dev, reducing memory
footprint
- Define struct ublk_thread as local variable in __ublk_io_handler_fn()
instead of accessing it from dev->threads[]
- Extract main IO handling logic into __ublk_io_handler_fn() which is
marked as noinline
- Move CPU affinity setup to ublk_io_handler_fn() before calling
__ublk_io_handler_fn()
- Update ublk_thread_set_sched_affinity() to take struct ublk_thread_info *
instead of struct ublk_thread *, and use pthread_setaffinity_np()
instead of sched_setaffinity()
- Reorder struct ublk_thread fields to group related state together
This change makes each thread's ublk_thread structure truly local to
the thread, improving cache locality and reducing memory usage.
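The resulting ownership split can be sketched as follows: the creator keeps the pthread_t handle in the per-thread info struct, while the working state exists only on the worker's own stack. All sketch_* names are hypothetical, and the loop of 100 increments merely stands in for IO handling.

```c
#include <assert.h>
#include <pthread.h>

/* Owned by the creator; must stay valid until the thread is joined. */
struct sketch_thread_info {
	pthread_t thread;
	unsigned idx;
	unsigned result;	/* written by the worker before it exits */
};

/* Per-thread working state; only ever lives on the worker's stack. */
struct sketch_thread {
	unsigned idx;
	unsigned io_done;
};

static void *sketch_handler(void *data)
{
	struct sketch_thread_info *info = data;
	struct sketch_thread t = {
		.idx = info->idx,
	};
	int i;

	/* Stand-in for the IO loop. */
	for (i = 0; i < 100; i++)
		t.io_done++;

	info->result = t.io_done + t.idx;
	return NULL;
}

/* Start n workers and wait for them; returns 0 if all were created. */
static int sketch_run(struct sketch_thread_info *tinfo, unsigned n)
{
	unsigned i, created;

	for (i = 0; i < n; i++) {
		tinfo[i].idx = i;
		if (pthread_create(&tinfo[i].thread, NULL,
				   sketch_handler, &tinfo[i]))
			break;
	}
	created = i;

	/* The info array is released by the caller only after the joins. */
	for (i = 0; i < created; i++)
		pthread_join(tinfo[i].thread, NULL);

	return created == n ? 0 : -1;
}
```

Since the join handle now lives in the info array, that array has to outlive the threads, which matches the patch moving free(tinfo) to after the pthread_join() loop.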
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
tools/testing/selftests/ublk/kublk.c | 76 +++++++++++++++-------------
tools/testing/selftests/ublk/kublk.h | 9 ++--
2 files changed, 45 insertions(+), 40 deletions(-)
diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c
index 062537ab8976..f8fa102a627f 100644
--- a/tools/testing/selftests/ublk/kublk.c
+++ b/tools/testing/selftests/ublk/kublk.c
@@ -836,62 +836,70 @@ static int ublk_process_io(struct ublk_thread *t)
return reapped;
}
-static void ublk_thread_set_sched_affinity(const struct ublk_thread *t,
- cpu_set_t *cpuset)
-{
- if (pthread_setaffinity_np(pthread_self(), sizeof(*cpuset), cpuset) < 0)
- ublk_err("ublk dev %u thread %u set affinity failed",
- t->dev->dev_info.dev_id, t->idx);
-}
-
struct ublk_thread_info {
struct ublk_dev *dev;
+ pthread_t thread;
unsigned idx;
sem_t *ready;
cpu_set_t *affinity;
unsigned long long extra_flags;
};
-static void *ublk_io_handler_fn(void *data)
+static void ublk_thread_set_sched_affinity(const struct ublk_thread_info *info)
{
- struct ublk_thread_info *info = data;
- struct ublk_thread *t = &info->dev->threads[info->idx];
+ if (pthread_setaffinity_np(pthread_self(), sizeof(*info->affinity), info->affinity) < 0)
+ ublk_err("ublk dev %u thread %u set affinity failed",
+ info->dev->dev_info.dev_id, info->idx);
+}
+
+static __attribute__((noinline)) int __ublk_io_handler_fn(struct ublk_thread_info *info)
+{
+ struct ublk_thread t = {
+ .dev = info->dev,
+ .idx = info->idx,
+ };
int dev_id = info->dev->dev_info.dev_id;
int ret;
- t->dev = info->dev;
- t->idx = info->idx;
-
- /*
- * IO perf is sensitive with queue pthread affinity on NUMA machine
- *
- * Set sched_affinity at beginning, so following allocated memory/pages
- * could be CPU/NUMA aware.
- */
- if (info->affinity)
- ublk_thread_set_sched_affinity(t, info->affinity);
-
- ret = ublk_thread_init(t, info->extra_flags);
+ ret = ublk_thread_init(&t, info->extra_flags);
if (ret) {
ublk_err("ublk dev %d thread %u init failed\n",
- dev_id, t->idx);
- return NULL;
+ dev_id, t.idx);
+ return ret;
}
sem_post(info->ready);
ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %u started\n",
- gettid(), dev_id, t->idx);
+ gettid(), dev_id, t.idx);
/* submit all io commands to ublk driver */
- ublk_submit_fetch_commands(t);
+ ublk_submit_fetch_commands(&t);
do {
- if (ublk_process_io(t) < 0)
+ if (ublk_process_io(&t) < 0)
break;
} while (1);
ublk_dbg(UBLK_DBG_THREAD, "tid %d: ublk dev %d thread %d exiting\n",
- gettid(), dev_id, t->idx);
- ublk_thread_deinit(t);
+ gettid(), dev_id, t.idx);
+ ublk_thread_deinit(&t);
+ return 0;
+}
+
+static void *ublk_io_handler_fn(void *data)
+{
+ struct ublk_thread_info *info = data;
+
+ /*
+ * IO perf is sensitive with queue pthread affinity on NUMA machine
+ *
+ * Set sched_affinity at beginning, so following allocated memory/pages
+ * could be CPU/NUMA aware.
+ */
+ if (info->affinity)
+ ublk_thread_set_sched_affinity(info);
+
+ __ublk_io_handler_fn(info);
+
return NULL;
}
@@ -989,14 +997,13 @@ static int ublk_start_daemon(const struct dev_ctx *ctx, struct ublk_dev *dev)
*/
if (dev->nthreads == dinfo->nr_hw_queues)
tinfo[i].affinity = &affinity_buf[i];
- pthread_create(&dev->threads[i].thread, NULL,
+ pthread_create(&tinfo[i].thread, NULL,
ublk_io_handler_fn,
&tinfo[i]);
}
for (i = 0; i < dev->nthreads; i++)
sem_wait(&ready);
- free(tinfo);
free(affinity_buf);
/* everything is fine now, start us */
@@ -1019,7 +1026,8 @@ static int ublk_start_daemon(const struct dev_ctx *ctx, struct ublk_dev *dev)
/* wait until we are terminated */
for (i = 0; i < dev->nthreads; i++)
- pthread_join(dev->threads[i].thread, &thread_ret);
+ pthread_join(tinfo[i].thread, &thread_ret);
+ free(tinfo);
fail:
for (i = 0; i < dinfo->nr_hw_queues; i++)
ublk_queue_deinit(&dev->q[i]);
diff --git a/tools/testing/selftests/ublk/kublk.h b/tools/testing/selftests/ublk/kublk.h
index 5e55484fb0aa..fe42705c6d42 100644
--- a/tools/testing/selftests/ublk/kublk.h
+++ b/tools/testing/selftests/ublk/kublk.h
@@ -175,23 +175,20 @@ struct ublk_queue {
struct ublk_thread {
struct ublk_dev *dev;
- struct io_uring ring;
- unsigned int cmd_inflight;
- unsigned int io_inflight;
-
- pthread_t thread;
unsigned idx;
#define UBLKS_T_STOPPING (1U << 0)
#define UBLKS_T_IDLE (1U << 1)
unsigned state;
+ unsigned int cmd_inflight;
+ unsigned int io_inflight;
+ struct io_uring ring;
};
struct ublk_dev {
struct ublk_tgt tgt;
struct ublksrv_ctrl_dev_info dev_info;
struct ublk_queue q[UBLK_MAX_QUEUES];
- struct ublk_thread threads[UBLK_MAX_THREADS];
unsigned nthreads;
unsigned per_io_tasks;
--
2.47.0
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
@ 2025-10-29 16:00 ` Caleb Sander Mateos
2025-10-29 22:53 ` Ming Lei
2025-10-30 4:04 ` kernel test robot
2025-10-30 8:00 ` kernel test robot
2 siblings, 1 reply; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-29 16:00 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, linux-block, Uday Shankar
On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> Implement NUMA-friendly memory allocation for ublk driver to improve
> performance on multi-socket systems.
>
> This commit includes the following changes:
>
> 1. Convert struct ublk_device to use a flexible array member for the
> queues field instead of a separate pointer array allocation. This
> eliminates one level of indirection and simplifies memory management.
> The queues array is now allocated as part of struct ublk_device using
> struct_size().
Technically it ends up being the same number of indirections as
before, since changing queues from a single allocation to an array of
separate allocations adds another indirection.
>
> 2. Rename __queues to queues, dropping the __ prefix since the field is
> now accessed directly throughout the codebase rather than only through
> the ublk_get_queue() helper.
>
> 3. Remove the queue_size field from struct ublk_device as it is no longer
> needed.
>
> 4. Move queue allocation and deallocation into ublk_init_queue() and
> ublk_deinit_queue() respectively, improving encapsulation. This
> simplifies ublk_init_queues() and ublk_deinit_queues() to just
> iterate and call the per-queue functions.
>
> 5. Add ublk_get_queue_numa_node() helper function to determine the
> appropriate NUMA node for a queue by finding the first CPU mapped
> to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
> converting it to a NUMA node using cpu_to_node(). This function is
> called internally by ublk_init_queue() to determine the allocation
> node.
>
> 6. Allocate each queue structure on its local NUMA node using
> kvzalloc_node() in ublk_init_queue().
>
> 7. Allocate the I/O command buffer on the same NUMA node using
> alloc_pages_node().
>
> This reduces memory access latency on multi-socket NUMA systems by
> ensuring each queue's data structures are local to the CPUs that
> access them.
>
> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
> drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
> 1 file changed, 53 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index 2569566bf5e6..ed77b4527b33 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -209,9 +209,6 @@ struct ublk_queue {
> struct ublk_device {
> struct gendisk *ub_disk;
>
> - char *__queues;
> -
> - unsigned int queue_size;
> struct ublksrv_ctrl_dev_info dev_info;
>
> struct blk_mq_tag_set tag_set;
> @@ -239,6 +236,8 @@ struct ublk_device {
> bool canceling;
> pid_t ublksrv_tgid;
> struct delayed_work exit_work;
> +
> + struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> };
>
> /* header of ublk_params */
> @@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
> static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
> int qid)
> {
> - return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
> + return dev->queues[qid];
> }
>
> static inline bool ublk_rq_has_data(const struct request *rq)
> @@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
>
> static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
> {
> - int size = ublk_queue_cmd_buf_size(ub);
> - struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> - int i;
> + struct ublk_queue *ubq = ub->queues[q_id];
> + int size, i;
> +
> + if (!ubq)
> + return;
> +
> + size = ublk_queue_cmd_buf_size(ub);
>
> for (i = 0; i < ubq->q_depth; i++) {
> struct ublk_io *io = &ubq->ios[i];
> @@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
>
> if (ubq->io_cmd_buf)
> free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
> +
> + kvfree(ubq);
> + ub->queues[q_id] = NULL;
> +}
> +
> +static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
> +{
> + unsigned int cpu;
> +
> + /* Find first CPU mapped to this queue */
> + for_each_possible_cpu(cpu) {
> + if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
> + return cpu_to_node(cpu);
> + }
I think you could avoid this quadratic lookup by using blk_mq_hw_ctx's
numa_node field. The initialization code would probably have to move
to ublk_init_hctx() in order to have access to the blk_mq_hw_ctx. But
it may not be worth the effort just to save some time at ublk creation
time. What you have seems fine.
Best,
Caleb
> +
> + return NUMA_NO_NODE;
> }
>
> static int ublk_init_queue(struct ublk_device *ub, int q_id)
> {
> - struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> + int depth = ub->dev_info.queue_depth;
> + int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
> gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
> - void *ptr;
> + struct ublk_queue *ubq;
> + struct page *page;
> + int numa_node;
> int size;
>
> + /* Determine NUMA node based on queue's CPU affinity */
> + numa_node = ublk_get_queue_numa_node(ub, q_id);
> +
> + /* Allocate queue structure on local NUMA node */
> + ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
> + if (!ubq)
> + return -ENOMEM;
> +
> spin_lock_init(&ubq->cancel_lock);
> ubq->flags = ub->dev_info.flags;
> ubq->q_id = q_id;
> - ubq->q_depth = ub->dev_info.queue_depth;
> + ubq->q_depth = depth;
> size = ublk_queue_cmd_buf_size(ub);
>
> - ptr = (void *) __get_free_pages(gfp_flags, get_order(size));
> - if (!ptr)
> + /* Allocate I/O command buffer on local NUMA node */
> + page = alloc_pages_node(numa_node, gfp_flags, get_order(size));
> + if (!page) {
> + kvfree(ubq);
> return -ENOMEM;
> + }
> + ubq->io_cmd_buf = page_address(page);
>
> - ubq->io_cmd_buf = ptr;
> + ub->queues[q_id] = ubq;
> ubq->dev = ub;
> return 0;
> }
>
> static void ublk_deinit_queues(struct ublk_device *ub)
> {
> - int nr_queues = ub->dev_info.nr_hw_queues;
> int i;
>
> - if (!ub->__queues)
> - return;
> -
> - for (i = 0; i < nr_queues; i++)
> + for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
> ublk_deinit_queue(ub, i);
> - kvfree(ub->__queues);
> }
>
> static int ublk_init_queues(struct ublk_device *ub)
> {
> - int nr_queues = ub->dev_info.nr_hw_queues;
> - int depth = ub->dev_info.queue_depth;
> - int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
> - int i, ret = -ENOMEM;
> + int i, ret;
>
> - ub->queue_size = ubq_size;
> - ub->__queues = kvcalloc(nr_queues, ubq_size, GFP_KERNEL);
> - if (!ub->__queues)
> - return ret;
> -
> - for (i = 0; i < nr_queues; i++) {
> - if (ublk_init_queue(ub, i))
> + for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
> + ret = ublk_init_queue(ub, i);
> + if (ret)
> goto fail;
> }
>
> @@ -3128,7 +3150,7 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
> goto out_unlock;
>
> ret = -ENOMEM;
> - ub = kzalloc(sizeof(*ub), GFP_KERNEL);
> + ub = kzalloc(struct_size(ub, queues, info.nr_hw_queues), GFP_KERNEL);
> if (!ub)
> goto out_unlock;
> mutex_init(&ub->mutex);
> --
> 2.47.0
>
* Re: [PATCH V3 3/5] ublk: use struct_size() for allocation
2025-10-29 3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
@ 2025-10-29 16:00 ` Caleb Sander Mateos
0 siblings, 0 replies; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-29 16:00 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, linux-block, Uday Shankar
On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> Convert ublk_queue to use struct_size() for allocation.
>
> Changes in this commit:
>
> 1. Update ublk_init_queue() to use struct_size(ubq, ios, depth)
> instead of manual size calculation (sizeof(struct ublk_queue) +
> depth * sizeof(struct ublk_io)).
>
> This provides better type safety and makes the code more maintainable
> by using standard kernel macro for flexible array handling.
>
> Meantime annotate ublk_queue.ios by __counted_by().
>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
> ---
> drivers/block/ublk_drv.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index ed77b4527b33..409874714c62 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -203,7 +203,7 @@ struct ublk_queue {
> bool fail_io; /* copy of dev->state == UBLK_S_DEV_FAIL_IO */
> spinlock_t cancel_lock;
> struct ublk_device *dev;
> - struct ublk_io ios[];
> + struct ublk_io ios[] __counted_by(q_depth);
> };
>
> struct ublk_device {
> @@ -2700,7 +2700,6 @@ static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
> static int ublk_init_queue(struct ublk_device *ub, int q_id)
> {
> int depth = ub->dev_info.queue_depth;
> - int ubq_size = sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io);
> gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO;
> struct ublk_queue *ubq;
> struct page *page;
> @@ -2711,7 +2710,8 @@ static int ublk_init_queue(struct ublk_device *ub, int q_id)
> numa_node = ublk_get_queue_numa_node(ub, q_id);
>
> /* Allocate queue structure on local NUMA node */
> - ubq = kvzalloc_node(ubq_size, GFP_KERNEL, numa_node);
> + ubq = kvzalloc_node(struct_size(ubq, ios, depth), GFP_KERNEL,
> + numa_node);
> if (!ubq)
> return -ENOMEM;
>
> --
> 2.47.0
>
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-29 16:00 ` Caleb Sander Mateos
@ 2025-10-29 22:53 ` Ming Lei
0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-29 22:53 UTC (permalink / raw)
To: Caleb Sander Mateos; +Cc: Jens Axboe, linux-block, Uday Shankar
On Wed, Oct 29, 2025 at 09:00:12AM -0700, Caleb Sander Mateos wrote:
> On Tue, Oct 28, 2025 at 8:11 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > Implement NUMA-friendly memory allocation for ublk driver to improve
> > performance on multi-socket systems.
> >
> > This commit includes the following changes:
> >
> > 1. Convert struct ublk_device to use a flexible array member for the
> > queues field instead of a separate pointer array allocation. This
> > eliminates one level of indirection and simplifies memory management.
> > The queues array is now allocated as part of struct ublk_device using
> > struct_size().
>
> Technically it ends up being the same number of indirections as
> before, since changing queues from a single allocation to an array of
> separate allocations adds another indirection.
I think that is fine: separate per-queue allocations are the precondition
for NUMA-aware allocation of ublk_queue.
>
> >
> > 2. Rename __queues to queues, dropping the __ prefix since the field is
> > now accessed directly throughout the codebase rather than only through
> > the ublk_get_queue() helper.
> >
> > 3. Remove the queue_size field from struct ublk_device as it is no longer
> > needed.
> >
> > 4. Move queue allocation and deallocation into ublk_init_queue() and
> > ublk_deinit_queue() respectively, improving encapsulation. This
> > simplifies ublk_init_queues() and ublk_deinit_queues() to just
> > iterate and call the per-queue functions.
> >
> > 5. Add ublk_get_queue_numa_node() helper function to determine the
> > appropriate NUMA node for a queue by finding the first CPU mapped
> > to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
> > converting it to a NUMA node using cpu_to_node(). This function is
> > called internally by ublk_init_queue() to determine the allocation
> > node.
> >
> > 6. Allocate each queue structure on its local NUMA node using
> > kvzalloc_node() in ublk_init_queue().
> >
> > 7. Allocate the I/O command buffer on the same NUMA node using
> > alloc_pages_node().
> >
> > This reduces memory access latency on multi-socket NUMA systems by
> > ensuring each queue's data structures are local to the CPUs that
> > access them.
> >
> > Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> > drivers/block/ublk_drv.c | 84 +++++++++++++++++++++++++---------------
> > 1 file changed, 53 insertions(+), 31 deletions(-)
> >
> > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > index 2569566bf5e6..ed77b4527b33 100644
> > --- a/drivers/block/ublk_drv.c
> > +++ b/drivers/block/ublk_drv.c
> > @@ -209,9 +209,6 @@ struct ublk_queue {
> > struct ublk_device {
> > struct gendisk *ub_disk;
> >
> > - char *__queues;
> > -
> > - unsigned int queue_size;
> > struct ublksrv_ctrl_dev_info dev_info;
> >
> > struct blk_mq_tag_set tag_set;
> > @@ -239,6 +236,8 @@ struct ublk_device {
> > bool canceling;
> > pid_t ublksrv_tgid;
> > struct delayed_work exit_work;
> > +
> > + struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> > };
> >
> > /* header of ublk_params */
> > @@ -781,7 +780,7 @@ static noinline void ublk_put_device(struct ublk_device *ub)
> > static inline struct ublk_queue *ublk_get_queue(struct ublk_device *dev,
> > int qid)
> > {
> > - return (struct ublk_queue *)&(dev->__queues[qid * dev->queue_size]);
> > + return dev->queues[qid];
> > }
> >
> > static inline bool ublk_rq_has_data(const struct request *rq)
> > @@ -2662,9 +2661,13 @@ static const struct file_operations ublk_ch_fops = {
> >
> > static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
> > {
> > - int size = ublk_queue_cmd_buf_size(ub);
> > - struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
> > - int i;
> > + struct ublk_queue *ubq = ub->queues[q_id];
> > + int size, i;
> > +
> > + if (!ubq)
> > + return;
> > +
> > + size = ublk_queue_cmd_buf_size(ub);
> >
> > for (i = 0; i < ubq->q_depth; i++) {
> > struct ublk_io *io = &ubq->ios[i];
> > @@ -2676,57 +2679,76 @@ static void ublk_deinit_queue(struct ublk_device *ub, int q_id)
> >
> > if (ubq->io_cmd_buf)
> > free_pages((unsigned long)ubq->io_cmd_buf, get_order(size));
> > +
> > + kvfree(ubq);
> > + ub->queues[q_id] = NULL;
> > +}
> > +
> > +static int ublk_get_queue_numa_node(struct ublk_device *ub, int q_id)
> > +{
> > + unsigned int cpu;
> > +
> > + /* Find first CPU mapped to this queue */
> > + for_each_possible_cpu(cpu) {
> > + if (ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu] == q_id)
> > + return cpu_to_node(cpu);
> > + }
>
> I think you could avoid this quadratic lookup by using blk_mq_hw_ctx's
> numa_node field. The initialization code would probably have to move
> to ublk_init_hctx() in order to have access to the blk_mq_hw_ctx. But
> may not be worth the effort just to save some time at ublk creation
> time. What you have seems fine.
That isn't doable, and it isn't necessary.
The disk and hw queues are created and initialized when handling
UBLK_CMD_START_DEV, but the backing ublk_queue needs to be allocated when
handling UBLK_CMD_ADD_DEV, which happens before UBLK_CMD_START_DEV is handled.
Thanks,
Ming
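The lookup discussed above can be sketched in userspace as follows. This is
only an illustration under stated assumptions: mq_map and cpu_node are plain
arrays standing in for tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
cpu_to_node(), and queue_numa_node() and NO_NODE are hypothetical names. The
fallback value returned when no CPU maps to the queue is an assumption, since
that part of the patch is not quoted here.

```c
#include <assert.h>
#include <stddef.h>

#define NO_NODE (-1)	/* hypothetical fallback when no CPU maps to the queue */

/*
 * mq_map[cpu] is the queue index that CPU submits to; cpu_node[cpu] is the
 * NUMA node of that CPU. Return the node of the first CPU mapped to q_id,
 * or NO_NODE if no CPU is mapped to it.
 */
static int queue_numa_node(const unsigned int *mq_map, const int *cpu_node,
			   size_t nr_cpus, unsigned int q_id)
{
	size_t cpu;

	assert(mq_map && cpu_node);
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (mq_map[cpu] == q_id)
			return cpu_node[cpu];
	}
	return NO_NODE;
}
```

Each call scans the CPU map once, so calling it for every queue costs on the
order of nr_queues * nr_cpus comparisons. That is the quadratic cost raised
in review, and it is paid only once, when the device is added.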
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
2025-10-29 16:00 ` Caleb Sander Mateos
@ 2025-10-30 4:04 ` kernel test robot
2025-10-30 8:00 ` kernel test robot
2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2025-10-30 4:04 UTC (permalink / raw)
To: Ming Lei, Jens Axboe, linux-block
Cc: oe-kbuild-all, Uday Shankar, Caleb Sander Mateos, Ming Lei
Hi Ming,
kernel test robot noticed the following build errors:
[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master v6.18-rc3 next-20251029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
config: csky-randconfig-r054-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301107.wrn8eeW8-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301107.wrn8eeW8-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510301107.wrn8eeW8-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from <command-line>:
>> drivers/block/ublk_drv.c:240:56: error: 'dev_info' undeclared here (not in a function); did you mean '_dev_info'?
240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
| ^~~~~~~~
include/linux/compiler_types.h:346:71: note: in definition of macro '__counted_by'
346 | # define __counted_by(member) __attribute__((__counted_by__(member)))
| ^~~~~~
>> drivers/block/ublk_drv.c:240:34: error: 'counted_by' argument is not an identifier
240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
| ^~~~~~
vim +240 drivers/block/ublk_drv.c
208
209 struct ublk_device {
210 struct gendisk *ub_disk;
211
212 struct ublksrv_ctrl_dev_info dev_info;
213
214 struct blk_mq_tag_set tag_set;
215
216 struct cdev cdev;
217 struct device cdev_dev;
218
219 #define UB_STATE_OPEN 0
220 #define UB_STATE_USED 1
221 #define UB_STATE_DELETED 2
222 unsigned long state;
223 int ub_number;
224
225 struct mutex mutex;
226
227 spinlock_t lock;
228 struct mm_struct *mm;
229
230 struct ublk_params params;
231
232 struct completion completion;
233 u32 nr_io_ready;
234 bool unprivileged_daemons;
235 struct mutex cancel_mutex;
236 bool canceling;
237 pid_t ublksrv_tgid;
238 struct delayed_work exit_work;
239
> 240 struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
241 };
242
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
2025-10-29 16:00 ` Caleb Sander Mateos
2025-10-30 4:04 ` kernel test robot
@ 2025-10-30 8:00 ` kernel test robot
2025-10-30 14:07 ` Caleb Sander Mateos
2 siblings, 1 reply; 14+ messages in thread
From: kernel test robot @ 2025-10-30 8:00 UTC (permalink / raw)
To: Ming Lei, Jens Axboe, linux-block
Cc: llvm, oe-kbuild-all, Uday Shankar, Caleb Sander Mateos, Ming Lei
Hi Ming,
kernel test robot noticed the following build errors:
[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on shuah-kselftest/next shuah-kselftest/fixes linus/master v6.18-rc3 next-20251029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
All errors (new ones prefixed by >>):
>> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
| ^~~~~~~~~~~~~~~~~~~~~
include/linux/compiler_types.h:346:62: note: expanded from macro '__counted_by'
346 | # define __counted_by(member) __attribute__((__counted_by__(member)))
| ^~~~~~
1 error generated.
vim +/counted_by +240 drivers/block/ublk_drv.c
208
209 struct ublk_device {
210 struct gendisk *ub_disk;
211
212 struct ublksrv_ctrl_dev_info dev_info;
213
214 struct blk_mq_tag_set tag_set;
215
216 struct cdev cdev;
217 struct device cdev_dev;
218
219 #define UB_STATE_OPEN 0
220 #define UB_STATE_USED 1
221 #define UB_STATE_DELETED 2
222 unsigned long state;
223 int ub_number;
224
225 struct mutex mutex;
226
227 spinlock_t lock;
228 struct mm_struct *mm;
229
230 struct ublk_params params;
231
232 struct completion completion;
233 u32 nr_io_ready;
234 bool unprivileged_daemons;
235 struct mutex cancel_mutex;
236 bool canceling;
237 pid_t ublksrv_tgid;
238 struct delayed_work exit_work;
239
> 240 struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
241 };
242
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-30 8:00 ` kernel test robot
@ 2025-10-30 14:07 ` Caleb Sander Mateos
2025-10-30 17:56 ` Nathan Chancellor
0 siblings, 1 reply; 14+ messages in thread
From: Caleb Sander Mateos @ 2025-10-30 14:07 UTC (permalink / raw)
To: kernel test robot
Cc: Ming Lei, Jens Axboe, linux-block, llvm, oe-kbuild-all,
Uday Shankar
On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
>
> Hi Ming,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on axboe-block/for-next]
> [also build test ERROR on shuah-kselftest/next shuah-kselftest/fixes linus/master v6.18-rc3 next-20251029]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Ming-Lei/ublk-reorder-tag_set-initialization-before-queue-allocation/20251029-111323
> base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> patch link: https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
> >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
> 240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> | ^~~~~~~~~~~~~~~~~~~~~
Hmm, guess it doesn't support nested fields?
> include/linux/compiler_types.h:346:62: note: expanded from macro '__counted_by'
> 346 | # define __counted_by(member) __attribute__((__counted_by__(member)))
> | ^~~~~~
> 1 error generated.
>
>
> vim +/counted_by +240 drivers/block/ublk_drv.c
>
> 208
> 209 struct ublk_device {
> 210 struct gendisk *ub_disk;
> 211
> 212 struct ublksrv_ctrl_dev_info dev_info;
> 213
> 214 struct blk_mq_tag_set tag_set;
> 215
> 216 struct cdev cdev;
> 217 struct device cdev_dev;
> 218
> 219 #define UB_STATE_OPEN 0
> 220 #define UB_STATE_USED 1
> 221 #define UB_STATE_DELETED 2
> 222 unsigned long state;
> 223 int ub_number;
> 224
> 225 struct mutex mutex;
> 226
> 227 spinlock_t lock;
> 228 struct mm_struct *mm;
> 229
> 230 struct ublk_params params;
> 231
> 232 struct completion completion;
> 233 u32 nr_io_ready;
> 234 bool unprivileged_daemons;
> 235 struct mutex cancel_mutex;
> 236 bool canceling;
> 237 pid_t ublksrv_tgid;
> 238 struct delayed_work exit_work;
> 239
> > 240 struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> 241 };
> 242
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-30 14:07 ` Caleb Sander Mateos
@ 2025-10-30 17:56 ` Nathan Chancellor
2025-10-31 3:28 ` Ming Lei
0 siblings, 1 reply; 14+ messages in thread
From: Nathan Chancellor @ 2025-10-30 17:56 UTC (permalink / raw)
To: Caleb Sander Mateos
Cc: kernel test robot, Ming Lei, Jens Axboe, linux-block, llvm,
oe-kbuild-all, Uday Shankar, Kees Cook
On Thu, Oct 30, 2025 at 07:07:25AM -0700, Caleb Sander Mateos wrote:
> On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
...
> > patch link: https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> > patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> > config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> > compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <lkp@intel.com>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
> >
> > All errors (new ones prefixed by >>):
> >
> > >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
> > 240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> > | ^~~~~~~~~~~~~~~~~~~~~
>
> Hmm, guess it doesn't support nested fields?
Correct. If I remember correctly, this is something we want to support at
some point, but there was a lot of discussion between GCC and clang on how
to actually do it. Kees is free to correct me if that is wrong.
Cheers,
Nathan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
2025-10-30 17:56 ` Nathan Chancellor
@ 2025-10-31 3:28 ` Ming Lei
0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2025-10-31 3:28 UTC (permalink / raw)
To: Nathan Chancellor
Cc: Caleb Sander Mateos, kernel test robot, Jens Axboe, linux-block,
llvm, oe-kbuild-all, Uday Shankar, Kees Cook
On Thu, Oct 30, 2025 at 10:56:31AM -0700, Nathan Chancellor wrote:
> On Thu, Oct 30, 2025 at 07:07:25AM -0700, Caleb Sander Mateos wrote:
> > On Thu, Oct 30, 2025 at 1:01 AM kernel test robot <lkp@intel.com> wrote:
> ...
> > > patch link: https://lore.kernel.org/r/20251029031035.258766-3-ming.lei%40redhat.com
> > > patch subject: [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation
> > > config: x86_64-randconfig-074-20251030 (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/config)
> > > compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510301522.i47z9R95-lkp@intel.com/reproduce)
> > >
> > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > the same patch/commit), kindly add following tags
> > > | Reported-by: kernel test robot <lkp@intel.com>
> > > | Closes: https://lore.kernel.org/oe-kbuild-all/202510301522.i47z9R95-lkp@intel.com/
> > >
> > > All errors (new ones prefixed by >>):
> > >
> > > >> drivers/block/ublk_drv.c:240:49: error: 'counted_by' argument must be a simple declaration reference
> > > 240 | struct ublk_queue *queues[] __counted_by(dev_info.nr_hw_queues);
> > > | ^~~~~~~~~~~~~~~~~~~~~
> >
> > Hmm, guess it doesn't support nested fields?
>
> Correct. I think this is something that we want to support at some point
> if I remember correctly but I think there was a lot of discussion
> between GCC and clang on how to actually do it but Kees is free to
> correct me if that is wrong.
Thanks for the confirmation.
I will remove this __counted_by() in the next version.
Thanks,
Ming
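To illustrate the restriction reported by both compilers: the counted_by
argument has to name a member of the same struct directly, so a nested
member such as dev_info.nr_hw_queues is rejected. The userspace sketch below
shows the accepted form, using a top-level counter. It is not what this
series does (the plan stated above is to drop the annotation); struct
dev_info, struct device, nr_queues, COUNTED_BY, device_alloc() and
device_get_queue() are all hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Expand to nothing on compilers without the attribute. */
#if __has_attribute(__counted_by__)
#define COUNTED_BY(member) __attribute__((__counted_by__(member)))
#else
#define COUNTED_BY(member)
#endif

struct dev_info {		/* hypothetical nested info block */
	unsigned short nr_hw_queues;
};

struct device {
	struct dev_info info;
	/* Top-level copy of info.nr_hw_queues so the attribute can name it. */
	unsigned int nr_queues;
	void *queues[] COUNTED_BY(nr_queues);
};

/* Allocate a zeroed device with nr_hw_queues empty queue slots. */
static struct device *device_alloc(unsigned short nr_hw_queues)
{
	struct device *d;

	d = calloc(1, sizeof(*d) + nr_hw_queues * sizeof(d->queues[0]));
	if (!d)
		return NULL;
	/* Set the counter before touching the array it bounds. */
	d->info.nr_hw_queues = nr_hw_queues;
	d->nr_queues = nr_hw_queues;
	return d;
}

static void *device_get_queue(const struct device *d, unsigned int qid)
{
	assert(qid < d->nr_queues);
	return d->queues[qid];
}
```

The cost of this form is a duplicated count that must be kept in sync with
the nested field, which is one reason simply dropping the annotation is a
reasonable choice.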
^ permalink raw reply [flat|nested] 14+ messages in thread
Thread overview: 14+ messages
2025-10-29 3:10 [PATCH V3 0/5] ublk: NUMA-aware memory allocation Ming Lei
2025-10-29 3:10 ` [PATCH V3 1/5] ublk: reorder tag_set initialization before queue allocation Ming Lei
2025-10-29 3:10 ` [PATCH V3 2/5] ublk: implement NUMA-aware memory allocation Ming Lei
2025-10-29 16:00 ` Caleb Sander Mateos
2025-10-29 22:53 ` Ming Lei
2025-10-30 4:04 ` kernel test robot
2025-10-30 8:00 ` kernel test robot
2025-10-30 14:07 ` Caleb Sander Mateos
2025-10-30 17:56 ` Nathan Chancellor
2025-10-31 3:28 ` Ming Lei
2025-10-29 3:10 ` [PATCH V3 3/5] ublk: use struct_size() for allocation Ming Lei
2025-10-29 16:00 ` Caleb Sander Mateos
2025-10-29 3:10 ` [PATCH V3 4/5] selftests: ublk: set CPU affinity before thread initialization Ming Lei
2025-10-29 3:10 ` [PATCH V3 5/5] selftests: ublk: make ublk_thread thread-local variable Ming Lei